Query 009790
Match_columns 525
No_of_seqs 129 out of 136
Neff 4.4
Searched_HMMs 46136
Date Thu Mar 28 17:21:54 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/009790.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/009790hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF10633 NPCBM_assoc: NPCBM-as 94.7 0.21 4.5E-06 41.2 8.2 75 81-185 2-76 (78)
2 PF00150 Cellulase: Cellulase 90.3 4.5 9.8E-05 39.4 12.1 100 310-439 57-172 (281)
3 PF15418 DUF4625: Domain of un 90.2 4.5 9.8E-05 37.6 11.2 89 77-189 29-120 (132)
4 COG1470 Predicted membrane pro 90.1 4 8.7E-05 45.4 12.4 38 150-187 324-361 (513)
5 PF01229 Glyco_hydro_39: Glyco 89.4 0.82 1.8E-05 50.1 6.8 107 314-441 83-205 (486)
6 PF06030 DUF916: Bacterial pro 78.2 30 0.00065 31.5 10.6 107 63-186 5-120 (121)
7 COG1470 Predicted membrane pro 69.1 6.7 0.00015 43.7 4.8 33 154-186 437-469 (513)
8 PF13731 WxL: WxL domain surfa 67.5 20 0.00043 35.3 7.3 79 106-185 105-210 (215)
9 PF14352 DUF4402: Domain of un 66.1 6.5 0.00014 35.5 3.4 32 155-186 95-128 (130)
10 PF10003 DUF2244: Integral mem 65.7 5.6 0.00012 36.9 3.0 53 159-213 88-140 (140)
11 PF01835 A2M_N: MG2 domain; I 61.0 40 0.00086 28.5 7.1 29 157-185 58-86 (99)
12 PF06280 DUF1034: Fn3-like dom 58.6 10 0.00022 33.3 3.1 37 151-187 62-101 (112)
13 cd00917 PG-PI_TP The phosphati 58.2 19 0.0004 32.5 4.8 34 152-186 76-109 (122)
14 PF02221 E1_DerP2_DerF2: ML do 58.0 20 0.00043 31.7 5.0 35 153-187 86-120 (134)
15 PF13204 DUF4038: Protein of u 53.2 38 0.00082 35.0 6.7 104 304-439 77-184 (289)
16 KOG1579 Homocysteine S-methylt 45.3 1.7E+02 0.0036 31.5 10.0 138 296-458 78-215 (317)
17 smart00633 Glyco_10 Glycosyl h 41.6 52 0.0011 32.9 5.5 98 314-441 15-127 (254)
18 smart00737 ML Domain involved 38.6 60 0.0013 28.4 4.8 33 154-186 73-105 (118)
19 PF12891 Glyco_hydro_44: Glyco 31.9 67 0.0014 33.1 4.5 57 385-441 105-181 (239)
20 PF08428 Rib: Rib/alpha-like r 29.2 1.5E+02 0.0033 24.1 5.4 31 153-186 19-49 (65)
21 PF00868 Transglut_N: Transglu 23.3 94 0.002 28.1 3.5 31 155-185 87-117 (118)
22 COG4012 Uncharacterized protei 22.8 2.1E+02 0.0045 30.5 6.2 115 388-518 165-299 (342)
23 PF09099 Qn_am_d_aIII: Quinohe 21.9 91 0.002 27.0 2.9 23 160-182 47-69 (81)
24 PF14734 DUF4469: Domain of un 21.8 92 0.002 27.9 3.0 24 163-186 64-87 (102)
25 PF14874 PapD-like: Flagellar- 21.6 4.7E+02 0.01 22.0 7.3 31 154-186 58-88 (102)
26 PF12245 Big_3_2: Bacterial Ig 21.0 3.2E+02 0.007 21.7 5.7 28 161-189 9-36 (60)
27 PF13304 AAA_21: AAA domain; P 20.0 1.3E+02 0.0028 27.2 3.7 37 403-439 259-296 (303)
No 1
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=94.67 E-value=0.21 Score=41.20 Aligned_cols=75 Identities=21% Similarity=0.317 Sum_probs=43.0
Q ss_pred ecCceEEEEEEEecCcccCCCCCCCceEEEEcccccCCCCccccccceEEEEEEecCCCCcccccCCCCcceeeecCCCe
Q 009790 81 ARNERESVQIALRPKVSWSSSSTAGVVQVQCSDLCSASGDRLVVGQSLMLRRVVPMLGVPDALVPLDLPVCQISLIPGET 160 (525)
Q Consensus 81 aRGE~vSFQlVLrs~~~~~~~~~~~~V~Vs~SdL~s~sG~~i~~g~~Itlr~V~yVg~yPD~LvP~d~~~~~v~V~ag~~ 160 (525)
-.||.+.+.+-++.. ......+++++++ .++|..... .| .....|++|+.
T Consensus 2 ~~G~~~~~~~tv~N~----g~~~~~~v~~~l~---~P~GW~~~~-------------------~~----~~~~~l~pG~s 51 (78)
T PF10633_consen 2 TPGETVTVTLTVTNT----GTAPLTNVSLSLS---LPEGWTVSA-------------------SP----ASVPSLPPGES 51 (78)
T ss_dssp -TTEEEEEEEEEE------SSS-BSS-EEEEE-----TTSE----------------------EE----EEE--B-TTSE
T ss_pred CCCCEEEEEEEEEEC----CCCceeeEEEEEe---CCCCccccC-------------------Cc----cccccCCCCCE
Confidence 368888889888753 1233445666554 244432000 01 11226889999
Q ss_pred eEEEEEEEcCCCCCCceeEEEEEEE
Q 009790 161 TAVWVSIDAPYAQPPGLYEGEIIIT 185 (525)
Q Consensus 161 Q~LWIdV~VP~dA~PG~Y~GtVtVt 185 (525)
+.+=++|.+|++++||.|..+++++
T Consensus 52 ~~~~~~V~vp~~a~~G~y~v~~~a~ 76 (78)
T PF10633_consen 52 VTVTFTVTVPADAAPGTYTVTVTAR 76 (78)
T ss_dssp EEEEEEEEE-TT--SEEEEEEEEEE
T ss_pred EEEEEEEECCCCCCCceEEEEEEEE
Confidence 9999999999999999999999886
No 2
>PF00150 Cellulase: Cellulase (glycosyl hydrolase family 5); InterPro: IPR001547 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 5 GH5 from CAZY comprises enzymes with several known activities; endoglucanase (3.2.1.4 from EC); beta-mannanase (3.2.1.78 from EC); exo-1,3-glucanase (3.2.1.58 from EC); endo-1,6-glucanase (3.2.1.75 from EC); xylanase (3.2.1.8 from EC); endoglycoceramidase (3.2.1.123 from EC). The microbial degradation of cellulose and xylans requires several types of enzymes. Fungi and bacteria produce a spectrum of cellulolytic enzymes (cellulases) and xylanases which, on the basis of sequence similarities, can be classified into families. One of these families is known as the cellulase family A [] or as the glycosyl hydrolases family 5 []. One of the conserved regions in this family contains a conserved glutamic acid residue which is potentially involved [] in the catalytic mechanism.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3NDY_A 3NDZ_B 1LF1_A 1TVP_B 1TVN_A 3AYR_A 3AYS_A 1QI0_A 1W3K_A 1OCQ_A ....
Probab=90.31 E-value=4.5 Score=39.38 Aligned_cols=100 Identities=15% Similarity=0.188 Sum_probs=66.3
Q ss_pred CHHHHHHHHHHHHHHHhCCcCCCCcCccccceeeeccCCCCCCCCCCccccccccceeeeecCccCCCCc----hHHHHH
Q 009790 310 SDEWYEALDQHFKWLLQYRISPFFCRWGESMRVLTYTCPWPADHPKSDEYFSDPRLAAYAVPYSPVLSSN----DGAKDY 385 (525)
Q Consensus 310 s~e~~~~L~~~~~~ll~~risp~f~rW~~~mrv~~y~~~W~ad~~~~d~~~sd~~i~aY~vP~~~~~~g~----~a~~~~ 385 (525)
.+..++.|++.++++.+++|....+ +|....|..+... .... +.++++
T Consensus 57 ~~~~~~~ld~~v~~a~~~gi~vild--------~h~~~~w~~~~~~--------------------~~~~~~~~~~~~~~ 108 (281)
T PF00150_consen 57 DETYLARLDRIVDAAQAYGIYVILD--------LHNAPGWANGGDG--------------------YGNNDTAQAWFKSF 108 (281)
T ss_dssp THHHHHHHHHHHHHHHHTT-EEEEE--------EEESTTCSSSTST--------------------TTTHHHHHHHHHHH
T ss_pred cHHHHHHHHHHHHHHHhCCCeEEEE--------eccCccccccccc--------------------cccchhhHHHHHhh
Confidence 4567899999999999999997643 2222334111110 1121 224557
Q ss_pred HHHHHHHHHhccchhhhheeccCCCCCcc-----------c-HHHHHHHHHHHHHhCCCCcEEEee
Q 009790 386 VRKEIELLRTKAHWKKAYFYLWDEPLNME-----------H-YSSVRNMASELHAYAPDARVLTTY 439 (525)
Q Consensus 386 l~~~~e~Lr~kgw~~kayfyl~DEP~~~e-----------~-y~~~r~~a~~ir~~aPd~ril~T~ 439 (525)
++.++++++. .-....|=|+.||.... . -+.++++++.||+.+|+..|+..-
T Consensus 109 ~~~la~~y~~--~~~v~~~el~NEP~~~~~~~~w~~~~~~~~~~~~~~~~~~Ir~~~~~~~i~~~~ 172 (281)
T PF00150_consen 109 WRALAKRYKD--NPPVVGWELWNEPNGGNDDANWNAQNPADWQDWYQRAIDAIRAADPNHLIIVGG 172 (281)
T ss_dssp HHHHHHHHTT--TTTTEEEESSSSGCSTTSTTTTSHHHTHHHHHHHHHHHHHHHHTTSSSEEEEEE
T ss_pred hhhhccccCC--CCcEEEEEecCCccccCCccccccccchhhhhHHHHHHHHHHhcCCcceeecCC
Confidence 7778888864 33466778999999632 1 257899999999999998777766
No 3
>PF15418 DUF4625: Domain of unknown function (DUF4625)
Probab=90.15 E-value=4.5 Score=37.60 Aligned_cols=89 Identities=18% Similarity=0.228 Sum_probs=53.4
Q ss_pred eeeeecCceEEEEEEEecCcccCCCCCCCceEEEEcccccCCCCc--cccccceEEEEEEecCCCCcccccCCCCcceee
Q 009790 77 NLLAARNERESVQIALRPKVSWSSSSTAGVVQVQCSDLCSASGDR--LVVGQSLMLRRVVPMLGVPDALVPLDLPVCQIS 154 (525)
Q Consensus 77 ~LsAaRGE~vSFQlVLrs~~~~~~~~~~~~V~Vs~SdL~s~sG~~--i~~g~~Itlr~V~yVg~yPD~LvP~d~~~~~v~ 154 (525)
.-.+-||+.+.|.+.+.. ...++.++|++-. +.++.. ..+++. -.|... .+.+.
T Consensus 29 ~~~~~~G~~ihfe~~i~d------~~~i~si~VeIH~--nfd~H~h~~~~~~~---------------~~~~~~-~~~~~ 84 (132)
T PF15418_consen 29 CKVATRGDDIHFEADISD------NSAIKSIKVEIHN--NFDHHTHSTEAGEC---------------EKPWVF-EQDYD 84 (132)
T ss_pred CeEEecCCcEEEEEEEEc------ccceeEEEEEEec--CcCccccccccccc---------------ccCcEE-EEEEc
Confidence 456779999999999974 3677888887721 111100 001000 011110 01222
Q ss_pred ecCC-CeeEEEEEEEcCCCCCCceeEEEEEEEecCC
Q 009790 155 LIPG-ETTAVWVSIDAPYAQPPGLYEGEIIITSKAD 189 (525)
Q Consensus 155 V~ag-~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~~~~ 189 (525)
+..| .+.-+=..|.||++++||.|...|+|+.+.+
T Consensus 85 ~~~g~~~~~~h~~i~IPa~a~~G~YH~~i~VtD~~G 120 (132)
T PF15418_consen 85 IYGGKKNYDFHEHIDIPADAPAGDYHFMITVTDAAG 120 (132)
T ss_pred ccCCcccEeEEEeeeCCCCCCCcceEEEEEEEECCC
Confidence 2222 3456678999999999999999999997444
No 4
>COG1470 Predicted membrane protein [Function unknown]
Probab=90.09 E-value=4 Score=45.35 Aligned_cols=38 Identities=29% Similarity=0.464 Sum_probs=35.5
Q ss_pred cceeeecCCCeeEEEEEEEcCCCCCCceeEEEEEEEec
Q 009790 150 VCQISLIPGETTAVWVSIDAPYAQPPGLYEGEIIITSK 187 (525)
Q Consensus 150 ~~~v~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~~ 187 (525)
...+.|.||+...+-+.|+.|++|.||.|..+|++++.
T Consensus 324 vt~vkL~~gE~kdvtleV~ps~na~pG~Ynv~I~A~s~ 361 (513)
T COG1470 324 VTSVKLKPGEEKDVTLEVYPSLNATPGTYNVTITASSS 361 (513)
T ss_pred EEEEEecCCCceEEEEEEecCCCCCCCceeEEEEEecc
Confidence 56889999999999999999999999999999999863
No 5
>PF01229 Glyco_hydro_39: Glycosyl hydrolases family 39; InterPro: IPR000514 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 39 GH39 from CAZY comprises enzymes with several known activities; alpha-L-iduronidase (3.2.1.76 from EC); beta-xylosidase (3.2.1.37 from EC). The most highly conserved regions in these enzymes are located in their N-terminal sections. These contain a glutamic acid residue which, on the basis of similarities with other families of glycosyl hydrolases [], probably acts as the proton donor in their catalytic mechanism.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 2BS9_D 2BFG_E 1W91_B 1UHV_D 1PX8_A.
Probab=89.41 E-value=0.82 Score=50.15 Aligned_cols=107 Identities=21% Similarity=0.355 Sum_probs=64.3
Q ss_pred HHHHHHHHHHHHhCCcCCC----CcCccccceeeeccCCCCCCCCCCccccccccceeeeecCccCCCCchHHHHHHHHH
Q 009790 314 YEALDQHFKWLLQYRISPF----FCRWGESMRVLTYTCPWPADHPKSDEYFSDPRLAAYAVPYSPVLSSNDGAKDYVRKE 389 (525)
Q Consensus 314 ~~~L~~~~~~ll~~risp~----f~rW~~~mrv~~y~~~W~ad~~~~d~~~sd~~i~aY~vP~~~~~~g~~a~~~~l~~~ 389 (525)
|..||+.++.|++.+|.|+ |.+ +.+. .... ..+. . ..+.-| ...-.+..++++++
T Consensus 83 f~~lD~i~D~l~~~g~~P~vel~f~p----~~~~--------~~~~-~~~~-~---~~~~~p----p~~~~~W~~lv~~~ 141 (486)
T PF01229_consen 83 FTYLDQILDFLLENGLKPFVELGFMP----MALA--------SGYQ-TVFW-Y---KGNISP----PKDYEKWRDLVRAF 141 (486)
T ss_dssp -HHHHHHHHHHHHCT-EEEEEE-SB-----GGGB--------SS---EETT-T---TEE-S-----BS-HHHHHHHHHHH
T ss_pred hHHHHHHHHHHHHcCCEEEEEEEech----hhhc--------CCCC-cccc-c---cCCcCC----cccHHHHHHHHHHH
Confidence 6899999999999999995 442 0000 0000 0000 0 000001 12445678899999
Q ss_pred HHHHHhc-c--chhhhheeccCCCCCc---------ccHHHHHHHHHHHHHhCCCCcEEEeeec
Q 009790 390 IELLRTK-A--HWKKAYFYLWDEPLNM---------EHYSSVRNMASELHAYAPDARVLTTYYC 441 (525)
Q Consensus 390 ~e~Lr~k-g--w~~kayfyl~DEP~~~---------e~y~~~r~~a~~ir~~aPd~ril~T~~~ 441 (525)
++|+..+ | ..++.||=+|.||... +=++.++..++.||+++|++||-..-.|
T Consensus 142 ~~h~~~RYG~~ev~~W~fEiWNEPd~~~f~~~~~~~ey~~ly~~~~~~iK~~~p~~~vGGp~~~ 205 (486)
T PF01229_consen 142 ARHYIDRYGIEEVSTWYFEIWNEPDLKDFWWDGTPEEYFELYDATARAIKAVDPELKVGGPAFA 205 (486)
T ss_dssp HHHHHHHHHHHHHTTSEEEESS-TTSTTTSGGG-HHHHHHHHHHHHHHHHHH-TTSEEEEEEEE
T ss_pred HHHHHhhcCCccccceeEEeCcCCCcccccCCCCHHHHHHHHHHHHHHHHHhCCCCcccCcccc
Confidence 9999764 3 2445578799999751 3345788999999999999998766444
No 6
>PF06030 DUF916: Bacterial protein of unknown function (DUF916); InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function.
Probab=78.24 E-value=30 Score=31.55 Aligned_cols=107 Identities=16% Similarity=0.214 Sum_probs=67.0
Q ss_pred ccCCCCCC-CCCCceeeeeecCceEEEEEEEecCcccCCCCCCCceEEEEcc-cccCCCCccccccceEEEEEEecC---
Q 009790 63 NVGPQEMP-RPLEPINLLAARNERESVQIALRPKVSWSSSSTAGVVQVQCSD-LCSASGDRLVVGQSLMLRRVVPML--- 137 (525)
Q Consensus 63 KVfpde~P-~~~~~i~LsAaRGE~vSFQlVLrs~~~~~~~~~~~~V~Vs~Sd-L~s~sG~~i~~g~~Itlr~V~yVg--- 137 (525)
-|.|+..- .....+.|...-|+...+|+.+... +.....++|++.+ ..+.+|. +.|..
T Consensus 5 p~~p~~Q~~~~~~YFdL~~~P~q~~~l~v~i~N~-----s~~~~tv~v~~~~A~Tn~nG~------------I~Y~~~~~ 67 (121)
T PF06030_consen 5 PVLPENQIDKNVSYFDLKVKPGQKQTLEVRITNN-----SDKEITVKVSANTATTNDNGV------------IDYSQNNP 67 (121)
T ss_pred ecCCccccCCCCCeEEEEeCCCCEEEEEEEEEeC-----CCCCEEEEEEEeeeEecCCEE------------EEECCCCc
Confidence 45565443 2357899999999999999999742 1222234443321 1222221 12211
Q ss_pred CC-CcccccCCC---CcceeeecCCCeeEEEEEEEcCCCCCCceeEEEEEEEe
Q 009790 138 GV-PDALVPLDL---PVCQISLIPGETTAVWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 138 ~y-PD~LvP~d~---~~~~v~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
.. +++-.++.+ ....+.|+|++++-+=++|.+|+..-.|..-|-|.|+.
T Consensus 68 ~~d~sl~~~~~~~v~~~~~Vtl~~~~sk~V~~~i~~P~~~f~G~ilGGi~~~e 120 (121)
T PF06030_consen 68 KKDKSLKYPFSDLVKIPKEVTLPPNESKTVTFTIKMPKKAFDGIILGGIYFSE 120 (121)
T ss_pred ccCcccCcchHHhccCCcEEEECCCCEEEEEEEEEcCCCCcCCEEEeeEEEEe
Confidence 01 111112210 11349999999999999999999999999999999985
No 7
>COG1470 Predicted membrane protein [Function unknown]
Probab=69.08 E-value=6.7 Score=43.67 Aligned_cols=33 Identities=36% Similarity=0.457 Sum_probs=30.7
Q ss_pred eecCCCeeEEEEEEEcCCCCCCceeEEEEEEEe
Q 009790 154 SLIPGETTAVWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 154 ~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
.|.||+.-.+=++|.||++|.||.|+.+|+.++
T Consensus 437 sL~pge~~tV~ltI~vP~~a~aGdY~i~i~~ks 469 (513)
T COG1470 437 SLEPGESKTVSLTITVPEDAGAGDYRITITAKS 469 (513)
T ss_pred ccCCCCcceEEEEEEcCCCCCCCcEEEEEEEee
Confidence 478899999999999999999999999999987
No 8
>PF13731 WxL: WxL domain surface cell wall-binding
Probab=67.48 E-value=20 Score=35.25 Aligned_cols=79 Identities=25% Similarity=0.400 Sum_probs=46.5
Q ss_pred ceEEEEcccccCCCCccccccceEEEEEEec---CC--CCc------ccccCCCCcceeeecCCCeeEEE----------
Q 009790 106 VVQVQCSDLCSASGDRLVVGQSLMLRRVVPM---LG--VPD------ALVPLDLPVCQISLIPGETTAVW---------- 164 (525)
Q Consensus 106 ~V~Vs~SdL~s~sG~~i~~g~~Itlr~V~yV---g~--yPD------~LvP~d~~~~~v~V~ag~~Q~LW---------- 164 (525)
.+.|+.++|++.+|..+ .+..+.+...... .. -|- .|.+......-+.-.+++.+..|
T Consensus 105 ~L~v~~s~F~~~~~~~L-~ga~l~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~v~~A~~~~g~G~~~~~~~~~~~~ 183 (215)
T PF13731_consen 105 TLTVKLSPFTNADGDTL-PGATLTFNNGKVQSTANNTNTPTTVSSNITLTPGGQAQTVMSAAKGQGQGTWSYSFGDQDAT 183 (215)
T ss_pred EEEEEeccccccCCcCc-ccceEEecCceeEeecccccCCcccccceEeccCCcceeeEeecccccceEEEEEeCCcccc
Confidence 57888889999887653 3344554432221 00 111 12222211122233456666666
Q ss_pred ----EEEEcCCCCC--CceeEEEEEEE
Q 009790 165 ----VSIDAPYAQP--PGLYEGEIIIT 185 (525)
Q Consensus 165 ----IdV~VP~dA~--PG~Y~GtVtVt 185 (525)
|.+.||.++. +|.|+++|+=+
T Consensus 184 ~~~~v~L~VP~~~~~~ag~Yt~tlTWt 210 (215)
T PF13731_consen 184 ADTGVSLSVPANTAKQAGTYTATLTWT 210 (215)
T ss_pred cccceEEEeCCCCcccCCcEEEEEEEE
Confidence 8899999998 79999999865
No 9
>PF14352 DUF4402: Domain of unknown function (DUF4402)
Probab=66.07 E-value=6.5 Score=35.46 Aligned_cols=32 Identities=28% Similarity=0.506 Sum_probs=25.0
Q ss_pred ecCCCeeEEEE--EEEcCCCCCCceeEEEEEEEe
Q 009790 155 LIPGETTAVWV--SIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 155 V~ag~~Q~LWI--dV~VP~dA~PG~Y~GtVtVt~ 186 (525)
+..+....+.| ++.|+.++++|.|+|+++|+.
T Consensus 95 ~~~~g~~~~~VGGtL~v~~~~~~G~YsGt~~VtV 128 (130)
T PF14352_consen 95 LDTGGSATFNVGGTLNVPANQAAGTYSGTFTVTV 128 (130)
T ss_pred ecCCCcEEEEEEEEEEcCCCCCCeEEEEEEEEEE
Confidence 33444556665 589999999999999999985
No 10
>PF10003 DUF2244: Integral membrane protein (DUF2244); InterPro: IPR019253 This entry consists of various bacterial putative membrane proteins with no known function.
Probab=65.70 E-value=5.6 Score=36.89 Aligned_cols=53 Identities=23% Similarity=0.324 Sum_probs=40.1
Q ss_pred CeeEEEEEEEcCCCCCCceeEEEEEEEecCCccccccccccchhhhHHHhhhhcc
Q 009790 159 ETTAVWVSIDAPYAQPPGLYEGEIIITSKADTELSSQCLGKGEKHRLFMELRNCL 213 (525)
Q Consensus 159 ~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 213 (525)
+..+.|+.|.+..+..+ ..-.|++++++..-.-...|+++||..|+.||+..|
T Consensus 88 ~~~~~w~rv~~~~~~~~--~~~~l~L~~~g~~veiG~fL~~~eR~~la~~L~~aL 140 (140)
T PF10003_consen 88 EFNPYWVRVELEEDPGP--GPPRLTLRSRGREVEIGRFLNPEEREELARELRRAL 140 (140)
T ss_pred EEcCCeEEEEEEcCCCC--CCcEEEEEECCEEEEEccCCCHHHHHHHHHHHHhhC
Confidence 45689999999998887 555666665444223346899999999999999764
No 11
>PF01835 A2M_N: MG2 domain; InterPro: IPR002890 The proteinase-binding alpha-macroglobulins (A2M) [] are large glycoproteins found in the plasma of vertebrates, in the hemolymph of some invertebrates and in reptilian and avian egg white. A2M-like proteins are able to inhibit all four classes of proteinases by a 'trapping' mechanism. They have a peptide stretch, called the 'bait region', which contains specific cleavage sites for different proteinases. When a proteinase cleaves the bait region, a conformational change is induced in the protein, thus trapping the proteinase. The entrapped enzyme remains active against low molecular weight substrates, whilst its activity toward larger substrates is greatly reduced, due to steric hindrance. Following cleavage in the bait region, a thiol ester bond, formed between the side chains of a cysteine and a glutamine, is cleaved and mediates the covalent binding of the A2M-like protein to the proteinase. This family includes the N-terminal region of the alpha-2-macroglobulin family. The inhibitor domains belong to MEROPS inhibitor family I39.; GO: 0004866 endopeptidase inhibitor activity; PDB: 2B39_B 3KLS_B 3PRX_C 3KM9_B 3PVM_C 3CU7_A 4E0S_A 4A5W_A 4ACQ_C 2P9R_B ....
Probab=61.03 E-value=40 Score=28.49 Aligned_cols=29 Identities=21% Similarity=0.209 Sum_probs=19.1
Q ss_pred CCCeeEEEEEEEcCCCCCCceeEEEEEEE
Q 009790 157 PGETTAVWVSIDAPYAQPPGLYEGEIIIT 185 (525)
Q Consensus 157 ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt 185 (525)
....-.+-.++.+|+++..|.|+.++...
T Consensus 58 ~~~~G~~~~~~~lp~~~~~G~y~i~~~~~ 86 (99)
T PF01835_consen 58 TNENGIFSGSFQLPDDAPLGTYTIRVKTD 86 (99)
T ss_dssp TTCTTEEEEEEE--SS---EEEEEEEEET
T ss_pred eCCCCEEEEEEECCCCCCCEeEEEEEEEc
Confidence 34455677899999999999999999885
No 12
>PF06280 DUF1034: Fn3-like domain (DUF1034); InterPro: IPR010435 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 3EIF_A 1XF1_B.
Probab=58.64 E-value=10 Score=33.29 Aligned_cols=37 Identities=27% Similarity=0.416 Sum_probs=31.2
Q ss_pred ceeeecCCCeeEEEEEEEcCCCCCC---ceeEEEEEEEec
Q 009790 151 CQISLIPGETTAVWVSIDAPYAQPP---GLYEGEIIITSK 187 (525)
Q Consensus 151 ~~v~V~ag~~Q~LWIdV~VP~dA~P---G~Y~GtVtVt~~ 187 (525)
..+.|+||+++-+=|++.+|++..+ ..|.|-|.+++.
T Consensus 62 ~~vTV~ag~s~~v~vti~~p~~~~~~~~~~~eG~I~~~~~ 101 (112)
T PF06280_consen 62 DTVTVPAGQSKTVTVTITPPSGLDASNGPFYEGFITFKSS 101 (112)
T ss_dssp EEEEE-TTEEEEEEEEEE--GGGHHTT-EEEEEEEEEESS
T ss_pred CeEEECCCCEEEEEEEEEehhcCCcccCCEEEEEEEEEcC
Confidence 5899999999999999999998887 899999999973
No 13
>cd00917 PG-PI_TP The phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP) has been shown to bind phosphatidylglycerol and phosphatidylinositol, but the biological significance of this is still obscure. These proteins belong to the ML domain family.
Probab=58.18 E-value=19 Score=32.45 Aligned_cols=34 Identities=24% Similarity=0.365 Sum_probs=29.7
Q ss_pred eeeecCCCeeEEEEEEEcCCCCCCceeEEEEEEEe
Q 009790 152 QISLIPGETTAVWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 152 ~v~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
.=++.+|+.. +=.++.||...++|.|+++.++.+
T Consensus 76 ~CPi~~G~~~-~~~~~~ip~~~P~g~y~v~~~l~d 109 (122)
T cd00917 76 SCPIEPGDKF-LTKLVDLPGEIPPGKYTVSARAYT 109 (122)
T ss_pred cCCcCCCcEE-EEEEeeCCCCCCCceEEEEEEEEC
Confidence 3457789887 888899999999999999999986
No 14
>PF02221 E1_DerP2_DerF2: ML domain; InterPro: IPR003172 The MD-2-related lipid-recognition (ML) domain is implicated in lipid recognition, particularly in the recognition of pathogen related products. It has an immunoglobulin-like beta-sandwich fold similar to that of E-set Ig domains. This domain is present in the following proteins: Epididymal secretory protein E1 (also known as Niemann-Pick C2 protein), which is known to bind cholesterol. Niemann-Pick disease type C2 is a fatal hereditary disease characterised by accumulation of low-density lipoprotein-derived cholesterol in lysosomes []. House-dust mite allergen proteins such as Der f 2 from Dermatophagoides farinae and Der p 2 from Dermatophagoides pteronyssinus []. ; PDB: 2AG9_B 1G13_B 2AG2_B 2AG4_A 1TJJ_C 1PU5_C 1PUB_A 2AF9_A 3T6Q_D 3M7O_B ....
Probab=58.03 E-value=20 Score=31.65 Aligned_cols=35 Identities=26% Similarity=0.350 Sum_probs=32.0
Q ss_pred eeecCCCeeEEEEEEEcCCCCCCceeEEEEEEEec
Q 009790 153 ISLIPGETTAVWVSIDAPYAQPPGLYEGEIIITSK 187 (525)
Q Consensus 153 v~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~~ 187 (525)
=++.+|+..-.-+++.||...++|.|++++++++.
T Consensus 86 CPi~~G~~~~~~~~~~i~~~~p~~~~~i~~~l~d~ 120 (134)
T PF02221_consen 86 CPIKAGEYYTYTYTIPIPKIYPPGKYTIQWKLTDQ 120 (134)
T ss_dssp STBTTTEEEEEEEEEEESTTSSSEEEEEEEEEEET
T ss_pred CccCCCcEEEEEEEEEcccceeeEEEEEEEEEEeC
Confidence 36789999999999999999999999999999974
No 15
>PF13204 DUF4038: Protein of unknown function (DUF4038); PDB: 3KZS_D.
Probab=53.23 E-value=38 Score=34.97 Aligned_cols=104 Identities=17% Similarity=0.246 Sum_probs=57.4
Q ss_pred cCCCCCCHHHHHHHHHHHHHHHhCCcCCCCc-Cccccceeeec-cCCCCCCCCCCccccccccceeeeecCccCCCCchH
Q 009790 304 FGVRHGSDEWYEALDQHFKWLLQYRISPFFC-RWGESMRVLTY-TCPWPADHPKSDEYFSDPRLAAYAVPYSPVLSSNDG 381 (525)
Q Consensus 304 ~~v~~~s~e~~~~L~~~~~~ll~~risp~f~-rW~~~mrv~~y-~~~W~ad~~~~d~~~sd~~i~aY~vP~~~~~~g~~a 381 (525)
+.....-+++|+.|++.++.|.+++|.+-.. =|+.. | .+.|++..... +.+.
T Consensus 77 ~d~~~~N~~YF~~~d~~i~~a~~~Gi~~~lv~~wg~~-----~~~~~Wg~~~~~m---------------------~~e~ 130 (289)
T PF13204_consen 77 FDFTRPNPAYFDHLDRRIEKANELGIEAALVPFWGCP-----YVPGTWGFGPNIM---------------------PPEN 130 (289)
T ss_dssp ---TT----HHHHHHHHHHHHHHTT-EEEEESS-HHH-----HH-------TTSS----------------------HHH
T ss_pred cCCCCCCHHHHHHHHHHHHHHHHCCCeEEEEEEECCc-----cccccccccccCC---------------------CHHH
Confidence 4444555899999999999999999996522 23332 2 13455442221 4456
Q ss_pred HHHHHHHHHHHHHhcc--chhhhheeccCCCCCcccHHHHHHHHHHHHHhCCCCcEEEee
Q 009790 382 AKDYVRKEIELLRTKA--HWKKAYFYLWDEPLNMEHYSSVRNMASELHAYAPDARVLTTY 439 (525)
Q Consensus 382 ~~~~l~~~~e~Lr~kg--w~~kayfyl~DEP~~~e~y~~~r~~a~~ir~~aPd~ril~T~ 439 (525)
++.|++-+++.+++.. ||-.+==+ -.....-+.++++++.||+.+|.- |.|+
T Consensus 131 ~~~Y~~yv~~Ry~~~~NviW~l~gd~----~~~~~~~~~w~~~~~~i~~~dp~~--L~T~ 184 (289)
T PF13204_consen 131 AERYGRYVVARYGAYPNVIWILGGDY----FDTEKTRADWDAMARGIKENDPYQ--LITI 184 (289)
T ss_dssp HHHHHHHHHHHHTT-SSEEEEEESSS------TTSSHHHHHHHHHHHHHH--SS---EEE
T ss_pred HHHHHHHHHHHHhcCCCCEEEecCcc----CCCCcCHHHHHHHHHHHHhhCCCC--cEEE
Confidence 8899999999999884 33322222 122345589999999999999977 5555
No 16
>KOG1579 consensus Homocysteine S-methyltransferase [Amino acid transport and metabolism]
Probab=45.32 E-value=1.7e+02 Score=31.48 Aligned_cols=138 Identities=12% Similarity=0.072 Sum_probs=88.1
Q ss_pred ChhHHhhhcCCCCCCHHHHHHHHHHHHHHHhCCcCCCCcCccccceeeeccCCCCCCCCCCccccccccceeeeecCccC
Q 009790 296 SDTVIEDRFGVRHGSDEWYEALDQHFKWLLQYRISPFFCRWGESMRVLTYTCPWPADHPKSDEYFSDPRLAAYAVPYSPV 375 (525)
Q Consensus 296 ~~~~i~d~~~v~~~s~e~~~~L~~~~~~ll~~risp~f~rW~~~mrv~~y~~~W~ad~~~~d~~~sd~~i~aY~vP~~~~ 375 (525)
+.+..++| .-++-+.++++....-.+.++++-.++.- -+....+||.|..... ..|.-+|..-
T Consensus 78 s~~~~~~~-~~~~~~~el~~~s~~~a~~Are~~~~~~~-------~v~gsiGp~~A~l~~g---------~eytg~Y~~~ 140 (317)
T KOG1579|consen 78 SSDGFEEY-VEEEELIELYEKSVELADLARERLGEETG-------YVAGSIGPYGATLADG---------SEYTGIYGDN 140 (317)
T ss_pred cchHHhhh-hhhHHHHHHHHHHHHHHHHHHHHhccccc-------eeeeecccccceecCC---------cccccccccc
Confidence 34445555 44555666666555555555543333221 1334556776654432 2355566442
Q ss_pred CCCchHHHHHHHHHHHHHHhccchhhhheeccCCCCCcccHHHHHHHHHHHHHhCCCCcEEEeeecCCCCCCCCCCCccc
Q 009790 376 LSSNDGAKDYVRKEIELLRTKAHWKKAYFYLWDEPLNMEHYSSVRNMASELHAYAPDARVLTTYYCGPSDAPLGPTPFES 455 (525)
Q Consensus 376 ~~g~~a~~~~l~~~~e~Lr~kgw~~kayfyl~DEP~~~e~y~~~r~~a~~ir~~aPd~ril~T~~~~Ps~~~~~~~~~~~ 455 (525)
. ..+..++|.|.-+|.+-++|. |..-| |= -.+...-.++.+.+++..|+.++-.+..|.++.--...+++|.
T Consensus 141 ~-~~~el~~~~k~qle~~~~~gv-D~L~f----ET--ip~~~EA~a~l~~l~~~~~~~p~~is~t~~d~g~l~~G~t~e~ 212 (317)
T KOG1579|consen 141 V-EFEELYDFFKQQLEVFLEAGV-DLLAF----ET--IPNVAEAKAALELLQELGPSKPFWISFTIKDEGRLRSGETGEE 212 (317)
T ss_pred c-CHHHHHHHHHHHHHHHHhCCC-CEEEE----ee--cCCHHHHHHHHHHHHhcCCCCcEEEEEEecCCCcccCCCcHHH
Confidence 2 234478899999999999993 33333 21 1244677788889999999999999999999888888888887
Q ss_pred ccc
Q 009790 456 FVK 458 (525)
Q Consensus 456 ~v~ 458 (525)
++.
T Consensus 213 ~~~ 215 (317)
T KOG1579|consen 213 AAQ 215 (317)
T ss_pred HHH
Confidence 766
No 17
>smart00633 Glyco_10 Glycosyl hydrolase family 10.
Probab=41.60 E-value=52 Score=32.94 Aligned_cols=98 Identities=11% Similarity=0.170 Sum_probs=61.9
Q ss_pred HHHHHHHHHHHHhCCcCCC--CcCccccceeeeccCCCCCCCCCCccccccccceeeeecCccCCCCchHHHHHHHHHHH
Q 009790 314 YEALDQHFKWLLQYRISPF--FCRWGESMRVLTYTCPWPADHPKSDEYFSDPRLAAYAVPYSPVLSSNDGAKDYVRKEIE 391 (525)
Q Consensus 314 ~~~L~~~~~~ll~~risp~--f~rW~~~mrv~~y~~~W~ad~~~~d~~~sd~~i~aY~vP~~~~~~g~~a~~~~l~~~~e 391 (525)
|+.+++.++|+.+++|... .+=|..+ .| .|+.+. + + -.-..+.++|+++.++
T Consensus 15 ~~~~D~~~~~a~~~gi~v~gH~l~W~~~-------------~P---~W~~~~-------~--~-~~~~~~~~~~i~~v~~ 68 (254)
T smart00633 15 FSGADAIVNFAKENGIKVRGHTLVWHSQ-------------TP---DWVFNL-------S--K-ETLLARLENHIKTVVG 68 (254)
T ss_pred hHHHHHHHHHHHHCCCEEEEEEEeeccc-------------CC---HhhhcC-------C--H-HHHHHHHHHHHHHHHH
Confidence 6888999999999998843 1123222 11 122100 0 0 0012335668888888
Q ss_pred HHHhccchhhhheeccCCCCCcc-------cH------HHHHHHHHHHHHhCCCCcEEEeeec
Q 009790 392 LLRTKAHWKKAYFYLWDEPLNME-------HY------SSVRNMASELHAYAPDARVLTTYYC 441 (525)
Q Consensus 392 ~Lr~kgw~~kayfyl~DEP~~~e-------~y------~~~r~~a~~ir~~aPd~ril~T~~~ 441 (525)
|++.+. .+.-++.||.+.. .| +-++.+.+.+|+++|++|++...|.
T Consensus 69 ry~g~i----~~wdV~NE~~~~~~~~~~~~~w~~~~G~~~i~~af~~ar~~~P~a~l~~Ndy~ 127 (254)
T smart00633 69 RYKGKI----YAWDVVNEALHDNGSGLRRSVWYQILGEDYIEKAFRYAREADPDAKLFYNDYN 127 (254)
T ss_pred HhCCcc----eEEEEeeecccCCCcccccchHHHhcChHHHHHHHHHHHHhCCCCEEEEeccC
Confidence 887652 2244788987532 12 6788999999999999999998754
No 18
>smart00737 ML Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids.
Probab=38.56 E-value=60 Score=28.42 Aligned_cols=33 Identities=30% Similarity=0.379 Sum_probs=27.5
Q ss_pred eecCCCeeEEEEEEEcCCCCCCceeEEEEEEEe
Q 009790 154 SLIPGETTAVWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 154 ~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
++.+|+..-.=..+.||...++|.|++++++++
T Consensus 73 Pl~~G~~~~~~~~~~v~~~~P~~~~~v~~~l~d 105 (118)
T smart00737 73 PIEKGETVNYTNSLTVPGIFPPGKYTVKWELTD 105 (118)
T ss_pred CCCCCeeEEEEEeeEccccCCCeEEEEEEEEEc
Confidence 467788655556779999999999999999986
No 19
>PF12891 Glyco_hydro_44: Glycoside hydrolase family 44; InterPro: IPR024745 This is a family of putative bacterial glycoside hydrolases.; PDB: 3IK2_A 3ZQ9_A 2YJQ_B 2YKK_A 2YIH_A 2EEX_A 2EQD_A 2E0P_A 2E4T_A 2EO7_A ....
Probab=31.89 E-value=67 Score=33.05 Aligned_cols=57 Identities=25% Similarity=0.341 Sum_probs=33.8
Q ss_pred HHHHHHHHHHhc-cch---hhhheeccC-C--------------CCCc-ccHHHHHHHHHHHHHhCCCCcEEEeeec
Q 009790 385 YVRKEIELLRTK-AHW---KKAYFYLWD-E--------------PLNM-EHYSSVRNMASELHAYAPDARVLTTYYC 441 (525)
Q Consensus 385 ~l~~~~e~Lr~k-gw~---~kayfyl~D-E--------------P~~~-e~y~~~r~~a~~ir~~aPd~ril~T~~~ 441 (525)
|+..++.+|+.| |=. ...-||.+| | |.+. |-.+.+-++|+.||+.+|+++|+--.-|
T Consensus 105 y~~ewV~~l~~~~g~a~~~~gvk~y~lDNEP~LW~~TH~dVHP~~~t~~El~~r~i~~AkaiK~~DP~a~v~GP~~w 181 (239)
T PF12891_consen 105 YMDEWVNYLVNKYGNASTNGGVKYYSLDNEPDLWHSTHRDVHPEPVTYDELRDRSIEYAKAIKAADPDAKVFGPVEW 181 (239)
T ss_dssp EHHHHHHHHHHHH--TTSTTS--EEEESS-GGGHHHHTTTT--S---HHHHHHHHHHHHHHHHHH-TTSEEEEEEE-
T ss_pred HHHHHHHHHHHHHhccccCCCceEEEecCchHhhcccccccCCCCCCHHHHHHHHHHHHHHHHhhCCCCeEeechhh
Confidence 677778888776 111 112345555 3 3321 4456777899999999999999977644
No 20
>PF08428 Rib: Rib/alpha-like repeat; InterPro: IPR012706 This entry represents a region of about 79 amino acids found tandemly repeated up to fourteen times within the proteins that contain it. The repeats lack cysteines and are highly conserved, even at the DNA level, within and between proteins. Proteins containing these repeats include the Rib and alpha surface antigens of group B Streptococcus, Esp of Enterococcus faecalis (Streptococcus faecalis), and related proteins of Lactobacillus. Most members of this protein family also have the cell wall anchor motif, LPXTG, shared by many staphylococcal and streptococcal surface antigens. These repeats are thought to define protective epitopes and may play a role in generating phenotypic and genotypic variation.
Probab=29.18 E-value=1.5e+02 Score=24.11 Aligned_cols=31 Identities=29% Similarity=0.489 Sum_probs=23.9
Q ss_pred eeecCCCeeEEEEEEEcCCCCCCceeEEEEEEEe
Q 009790 153 ISLIPGETTAVWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 153 v~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
-.|++|+. --|.+ .|....||.|.+.|+|+=
T Consensus 19 ~~lP~gt~-~~w~~--~pdt~~~G~~~~~V~Vty 49 (65)
T PF08428_consen 19 DNLPAGTT-YSWKD--KPDTSKPGTKTGKVKVTY 49 (65)
T ss_pred ccCCCCcc-eeecc--CCccccCccEEEEEEEEc
Confidence 34555554 46776 899999999999999984
No 21
>PF00868 Transglut_N: Transglutaminase family; InterPro: IPR001102 Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase Protein-glutamine gamma-glutamyltransferases (2.3.2.13 from EC) (TGase) are calcium-dependent enzymes that catalyse the cross-linking of proteins by promoting the formation of isopeptide bonds between the gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group of a lysine in a second polypeptide chain. TGases also catalyse the conjugation of polyamines to proteins. Transglutaminases are widely distributed in various organs, tissues and body fluids. The best known transglutaminase is blood coagulation factor XIII, a plasma tetrameric protein composed of two catalytic A subunits and two non-catalytic B subunits. Factor XIII is responsible for cross-linking fibrin chains, thus stabilising the fibrin clot. There are commonly three domains: N-terminal, middle (IPR013808 from INTERPRO) and C-terminal (IPR013807 from INTERPRO). This entry represents the N-terminal domain found in transglutaminases.; GO: 0018149 peptide cross-linking; PDB: 1L9N_B 1NUF_A 1NUD_A 1NUG_B 1L9M_A 1KV3_C 3S3S_A 2Q3Z_A 3LY6_A 3S3P_A ....
Probab=23.31 E-value=94 Score=28.11 Aligned_cols=31 Identities=23% Similarity=0.337 Sum_probs=20.9
Q ss_pred ecCCCeeEEEEEEEcCCCCCCceeEEEEEEE
Q 009790 155 LIPGETTAVWVSIDAPYAQPPGLYEGEIIIT 185 (525)
Q Consensus 155 V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt 185 (525)
+...+...+=|.|.+|++|+-|.|+-.|.++
T Consensus 87 v~~~~~~~~tv~V~spa~A~VG~y~l~v~~~ 117 (118)
T PF00868_consen 87 VESQDGNSVTVSVTSPANAPVGRYKLSVETK 117 (118)
T ss_dssp EEEEETTEEEEEEE--TTS--EEEEEEEEEE
T ss_pred EEecCCCEEEEEEECCCCCceEEEEEEEEEe
Confidence 3334444578899999999999999999886
No 22
>COG4012 Uncharacterized protein conserved in archaea [Function unknown]
Probab=22.85 E-value=2.1e+02 Score=30.49 Aligned_cols=115 Identities=18% Similarity=0.140 Sum_probs=72.1
Q ss_pred HHHHHHHhccchhhhheeccCCCCCcccHHHHHHHHHHHHHhCCCCcEEEeeecCCCCCCCCCCCcccccc-------cc
Q 009790 388 KEIELLRTKAHWKKAYFYLWDEPLNMEHYSSVRNMASELHAYAPDARVLTTYYCGPSDAPLGPTPFESFVK-------VP 460 (525)
Q Consensus 388 ~~~e~Lr~kgw~~kayfyl~DEP~~~e~y~~~r~~a~~ir~~aPd~ril~T~~~~Ps~~~~~~~~~~~~v~-------vp 460 (525)
.|.+.|+..|-....-| +-||++ +.|..||+++..+-++--++-+|+|-+ ..-+.+++| |.
T Consensus 165 lfrelL~~ng~~e~f~f-l~~eiP--e~FtRMraaa~sal~~~t~av~mDskf---------aav~gal~dpaa~palvV 232 (342)
T COG4012 165 LFRELLGANGCREDFMF-LDDEIP--ESFTRMRAAAMSALSAGTDAVAMDSKF---------AAVMGALVDPAADPALVV 232 (342)
T ss_pred HHHHHHcCCCCccccee-cCCcCc--hhHHHHHHHHHHHHhcCceEEEEcchh---------HhhhhcccCcccCceEEE
Confidence 56677777776665444 778876 478999999998888887788888873 222333343 56
Q ss_pred cccCCccccccccccccCC-------------chhhHHHHHHhcCCCCCcccEEEEecCCCCCCCccccee
Q 009790 461 KFLRPHTQIYCTSEWVLGN-------------REDLVKDIVTELQPENGEEIYSLSLMVLPTSYSSVSMWS 518 (525)
Q Consensus 461 ~~~~~~~~~~~~s~wv~g~-------------~~~~~~~~~~~~q~~~g~e~WtY~C~~~~~~~~~~~~~~ 518 (525)
++.+.|+--+ =|.+| +.+..++.|.++-....+.=--|.=-+|||-|-.+-.|.
T Consensus 233 d~GngHttaa----lvdedRI~gv~EHHT~~Lspekled~I~rf~~GeL~neeV~~DgGHGch~~evvd~e 299 (342)
T COG4012 233 DYGNGHTTAA----LVDEDRIVGVYEHHTIRLSPEKLEDQIIRFVEGELENEEVYRDGGHGCHNVEVVDWE 299 (342)
T ss_pred EccCCceEEE----EecCCeEEEEeecccccCCHHHHHHHHHHHHhcccccceeecCCCCceeeecccCch
Confidence 7777774332 11122 345667766665322222334566678999887776663
No 23
>PF09099 Qn_am_d_aIII: Quinohemoprotein amine dehydrogenase, alpha subunit domain III; InterPro: IPR015183 This domain is predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopting an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined.; PDB: 1JMZ_A 1JMX_A 1PBY_A 1JJU_A.
Probab=21.93 E-value=91 Score=26.96 Aligned_cols=23 Identities=22% Similarity=0.339 Sum_probs=19.5
Q ss_pred eeEEEEEEEcCCCCCCceeEEEE
Q 009790 160 TTAVWVSIDAPYAQPPGLYEGEI 182 (525)
Q Consensus 160 ~Q~LWIdV~VP~dA~PG~Y~GtV 182 (525)
.--++++|.+.++++||.|+..+
T Consensus 47 ~~~v~v~V~~aa~a~~G~~~v~v 69 (81)
T PF09099_consen 47 PDEVVVRVKAAADAAPGIRTVRV 69 (81)
T ss_dssp STCEEEEEEEECTSSSEEEEEEE
T ss_pred CCEEEEEEEEcCCCCCccEEEEe
Confidence 33689999999999999998654
No 24
>PF14734 DUF4469: Domain of unknown function (DUF4469) with IG-like fold
Probab=21.79 E-value=92 Score=27.92 Aligned_cols=24 Identities=17% Similarity=0.091 Sum_probs=20.8
Q ss_pred EEEEEEcCCCCCCceeEEEEEEEe
Q 009790 163 VWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 163 LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
==+.+.||++.++|.|+.+|+=+-
T Consensus 64 s~l~~~lPa~L~~G~Y~l~V~Tq~ 87 (102)
T PF14734_consen 64 SRLIFILPADLAAGEYTLEVRTQY 87 (102)
T ss_pred cEEEEECcCccCceEEEEEEEEEe
Confidence 347889999999999999998875
No 25
>PF14874 PapD-like: Flagellar-associated PapD-like
Probab=21.58 E-value=4.7e+02 Score=21.97 Aligned_cols=31 Identities=39% Similarity=0.710 Sum_probs=26.0
Q ss_pred eecCCCeeEEEEEEEcCCCCCCceeEEEEEEEe
Q 009790 154 SLIPGETTAVWVSIDAPYAQPPGLYEGEIIITS 186 (525)
Q Consensus 154 ~V~ag~~Q~LWIdV~VP~dA~PG~Y~GtVtVt~ 186 (525)
.|.||.++.+=|.+.- ..+.|.|++.|.|..
T Consensus 58 ~l~PG~~~~~~V~~~~--~~~~g~~~~~l~i~~ 88 (102)
T PF14874_consen 58 FLAPGESVELEVTFSP--TKPLGDYEGSLVITT 88 (102)
T ss_pred EECCCCEEEEEEEEEe--CCCCceEEEEEEEEE
Confidence 5889999888888773 466899999999987
No 26
>PF12245 Big_3_2: Bacterial Ig-like domain (group 3); InterPro: IPR022038 This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT.
Probab=20.95 E-value=3.2e+02 Score=21.74 Aligned_cols=28 Identities=29% Similarity=0.464 Sum_probs=21.6
Q ss_pred eEEEEEEEcCCCCCCceeEEEEEEEecCC
Q 009790 161 TAVWVSIDAPYAQPPGLYEGEIIITSKAD 189 (525)
Q Consensus 161 Q~LWIdV~VP~dA~PG~Y~GtVtVt~~~~ 189 (525)
+..|.. -+|.+...|.|+.+++++++++
T Consensus 9 ~~~~~~-~~P~~~~dg~yt~~v~a~D~AG 36 (60)
T PF12245_consen 9 SGVWST-VIPENDADGEYTLTVTATDKAG 36 (60)
T ss_pred ccceec-cccCccCCccEEEEEEEEECCC
Confidence 344543 3699988999999999998555
No 27
>PF13304 AAA_21: AAA domain; PDB: 3QKS_B 1US8_B 1F2U_B 1F2T_B 3QKT_A 1II8_B 3QKR_B 3QKU_A.
Probab=20.04 E-value=1.3e+02 Score=27.19 Aligned_cols=37 Identities=30% Similarity=0.303 Sum_probs=32.3
Q ss_pred heeccCCCCCcccHHHHHHHHHHHHHhCC-CCcEEEee
Q 009790 403 YFYLWDEPLNMEHYSSVRNMASELHAYAP-DARVLTTY 439 (525)
Q Consensus 403 yfyl~DEP~~~e~y~~~r~~a~~ir~~aP-d~ril~T~ 439 (525)
.+.+.|||..-=|.+..+.+++.+++..+ +..++.|.
T Consensus 259 ~illiDEpE~~LHp~~q~~l~~~l~~~~~~~~QviitT 296 (303)
T PF13304_consen 259 SILLIDEPENHLHPSWQRKLIELLKELSKKNIQVIITT 296 (303)
T ss_dssp SEEEEESSSTTSSHHHHHHHHHHHHHTGGGSSEEEEEE
T ss_pred eEEEecCCcCCCCHHHHHHHHHHHHhhCccCCEEEEeC
Confidence 44589999876688999999999999988 89999988
Done!