Query 024627
Match_columns 265
No_of_seqs 132 out of 366
Neff 6.2
Searched_HMMs 46136
Date Fri Mar 29 06:10:41 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/024627.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/024627hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF02018 CBM_4_9: Carbohydrate 99.4 1.9E-11 4.1E-16 97.1 16.8 124 71-220 1-126 (131)
2 PF04862 DUF642: Protein of un 98.6 3.2E-06 7E-11 71.7 15.8 136 72-233 1-158 (159)
3 PLN03089 hypothetical protein; 98.4 9.1E-06 2E-10 77.6 14.5 144 70-236 193-365 (373)
4 PLN03089 hypothetical protein; 98.1 0.0001 2.3E-09 70.4 14.4 136 72-234 28-184 (373)
5 COG3534 AbfA Alpha-L-arabinofu 98.0 3.3E-06 7.1E-11 81.8 3.8 38 226-264 31-70 (501)
6 PF15425 DUF4627: Domain of un 96.6 0.085 1.9E-06 45.9 13.4 149 68-234 2-211 (212)
7 PF03422 CBM_6: Carbohydrate b 96.4 0.084 1.8E-06 41.8 11.8 83 147-233 38-123 (125)
8 smart00137 MAM Domain in mepri 96.2 0.41 8.8E-06 40.0 15.4 105 116-236 47-159 (161)
9 smart00606 CBD_IV Cellulose Bi 95.9 0.13 2.8E-06 41.1 10.3 81 145-232 44-128 (129)
10 cd06263 MAM Meprin, A5 protein 95.8 0.57 1.2E-05 38.4 14.3 77 154-236 71-155 (157)
11 PF03425 CBM_11: Carbohydrate 95.6 0.6 1.3E-05 39.9 13.9 79 154-237 72-171 (178)
12 PF00629 MAM: MAM domain; Int 94.3 0.55 1.2E-05 37.7 9.9 80 151-236 69-156 (160)
13 PF10648 Gmad2: Immunoglobulin 89.3 2.7 5.8E-05 32.3 7.7 63 155-220 17-80 (88)
14 COG3534 AbfA Alpha-L-arabinofu 86.8 0.16 3.4E-06 49.9 -0.6 37 36-72 3-40 (501)
15 PF04620 FlaA: Flagellar filam 86.3 9.8 0.00021 34.1 10.5 57 147-208 100-157 (217)
16 PF15432 Sec-ASP3: Accessory S 82.9 25 0.00054 28.9 11.1 75 150-235 50-126 (128)
17 PF01835 A2M_N: MG2 domain; I 80.3 23 0.00049 26.7 10.3 67 150-222 11-85 (99)
18 PF06030 DUF916: Bacterial pro 72.4 37 0.0008 27.4 8.8 89 143-241 16-110 (121)
19 PF14299 PP2: Phloem protein 2 71.0 61 0.0013 27.1 11.6 71 149-220 55-132 (154)
20 PF07172 GRP: Glycine rich pro 70.7 1.8 4E-05 33.8 0.7 9 46-54 37-45 (95)
21 PF04300 FBA: F-box associated 68.8 73 0.0016 27.7 10.4 70 138-208 71-148 (184)
22 PF11141 DUF2914: Protein of u 65.2 31 0.00068 24.8 6.2 41 148-190 23-65 (66)
23 PF09212 CBM27: Carbohydrate b 60.6 26 0.00057 30.2 6.0 110 114-235 41-169 (170)
24 PF13313 DUF4082: Domain of un 56.6 40 0.00086 28.5 6.2 46 156-204 34-79 (149)
25 PF07705 CARDB: CARDB; InterP 54.5 82 0.0018 23.0 9.4 69 149-223 14-86 (101)
26 cd00918 Der-p2_like Several gr 49.4 25 0.00054 28.4 3.8 33 147-179 72-109 (120)
27 TIGR03711 acc_sec_asp3 accesso 49.2 1.6E+02 0.0034 24.6 9.9 37 150-187 61-99 (135)
28 PF14109 GldH_lipo: GldH lipop 48.5 1.5E+02 0.0032 24.2 11.5 84 154-245 33-130 (131)
29 PF14683 CBM-like: Polysacchar 48.3 51 0.0011 28.1 5.8 79 153-233 77-167 (167)
30 cd06480 ACD_HspB8_like Alpha-c 46.4 1.1E+02 0.0025 23.4 6.9 79 148-229 9-88 (91)
31 PF13201 Xylanase: Putative gl 44.6 48 0.001 31.6 5.5 86 146-233 207-341 (342)
32 PF10670 DUF4198: Domain of un 44.6 95 0.0021 26.3 7.0 63 169-235 152-214 (215)
33 PF08770 SoxZ: Sulphur oxidati 43.0 1E+02 0.0022 24.0 6.3 33 146-180 60-92 (100)
34 KOG1392 Acetyl-CoA acetyltrans 41.5 11 0.00024 35.6 0.6 25 237-261 265-289 (465)
35 PF03944 Endotoxin_C: delta en 37.2 2.3E+02 0.005 23.1 8.0 82 152-237 50-143 (143)
36 COG4724 Endo-beta-N-acetylgluc 33.7 2.5E+02 0.0054 27.9 8.4 110 100-224 426-535 (553)
37 PF10633 NPCBM_assoc: NPCBM-as 33.3 1.8E+02 0.004 20.8 9.1 68 150-220 1-73 (78)
38 PF10836 DUF2574: Protein of u 31.2 33 0.00071 26.5 1.7 32 19-53 8-39 (93)
39 COG0852 NuoC NADH:ubiquinone o 31.2 19 0.00041 31.2 0.4 41 195-248 107-147 (176)
40 PF07521 RMMBL: RNA-metabolisi 30.3 48 0.001 21.6 2.2 22 239-260 14-35 (43)
41 cd00916 Npc2_like Niemann-Pick 29.7 1.2E+02 0.0026 24.3 4.9 33 147-179 76-113 (123)
42 cd05755 Ig2_ICAM-1_like Second 29.6 2.7E+02 0.0059 21.6 7.0 66 150-218 13-79 (100)
43 PF10365 DUF2436: Domain of un 28.8 1.1E+02 0.0023 26.0 4.4 32 134-165 118-155 (161)
44 smart00737 ML Domain involved 28.6 98 0.0021 24.0 4.1 33 147-179 71-108 (118)
45 PF14785 MalF_P2: Maltose tran 27.7 1.1E+02 0.0024 26.3 4.5 37 152-191 17-53 (164)
46 PF12988 DUF3872: Domain of un 27.0 3.6E+02 0.0079 22.5 7.3 39 139-179 79-124 (137)
47 COG2373 Large extracellular al 25.1 2.7E+02 0.0058 32.3 8.1 39 151-190 406-450 (1621)
48 PF04428 Choline_kin_N: Cholin 25.0 52 0.0011 23.0 1.7 21 243-263 27-48 (53)
49 PF02221 E1_DerP2_DerF2: ML do 25.0 3.3E+02 0.0071 21.2 6.7 36 148-184 86-126 (134)
50 PF09092 Lyase_N: Lyase, N ter 25.0 4.6E+02 0.01 22.8 13.9 128 90-236 17-164 (178)
51 PF04744 Monooxygenase_B: Mono 24.9 2.8E+02 0.0061 27.0 7.1 57 147-207 80-139 (381)
52 PF04151 PPC: Bacterial pre-pe 23.8 2.6E+02 0.0057 19.5 5.7 64 145-229 4-67 (70)
53 COG3906 Uncharacterized protei 22.8 2.7E+02 0.0059 22.2 5.5 67 171-240 15-86 (105)
54 PF12273 RCR: Chitin synthesis 21.6 42 0.00091 27.1 0.8 15 14-28 9-23 (130)
55 COG3126 Uncharacterized protei 21.6 2.5E+02 0.0054 24.0 5.4 76 152-232 41-128 (158)
56 PF14524 Wzt_C: Wzt C-terminal 21.6 3.1E+02 0.0068 21.1 5.9 65 150-218 31-103 (142)
57 TIGR03079 CH4_NH3mon_ox_B meth 20.8 1.3E+02 0.0029 29.2 4.1 38 147-184 100-139 (399)
No 1
>PF02018 CBM_4_9: Carbohydrate binding domain; InterPro: IPR003305 The 1,4-beta-glucanase CenC from Cellulomonas fimi contains two cellulose-binding domains, CBD(N1) and CBD(N2), arranged in tandem at its N terminus. These homologous CBDs are distinct in their selectivity for binding amorphous and not crystalline cellulose []. Multidimensional heteronuclear nuclear magnetic resonance (NMR) spectroscopy was used to determine the tertiary structure of the 152 amino acid N-terminal cellulose-binding domain from C. fimi 1,4-beta-glucanase CenC (CBDN1) []. The tertiary structure of CBDN1 is strikingly similar to that of the bacterial 1,3-1,4-beta-glucanases, as well as other sugar-binding proteins with jelly-roll folds.; GO: 0016798 hydrolase activity, acting on glycosyl bonds; PDB: 3OEA_B 2ZEX_B 3OEB_A 2ZEY_A 2ZEW_A 1GUI_A 2W5F_A 2WZE_A 2WYS_A 2ZEZ_B ....
Probab=99.42 E-value=1.9e-11 Score=97.11 Aligned_cols=124 Identities=23% Similarity=0.414 Sum_probs=85.3
Q ss_pred hhhccCCCCCCCCCCCCCCCCCceEecCCceeEEecCCCccccCCcceEEEEEeecCCCcccccCcceeEEccCceeeee
Q 024627 71 AELVSNRGFEAGGQNIPSNIDPWAIIGNDSSLIVSTDRSSCFERNKVALRMEVLCDSQGTNICPVGGVGVYNPGYWGMGI 150 (265)
Q Consensus 71 AEll~NRsFe~~~~~~~~~~~~W~~~g~~~~~~~~~d~~~~~~~n~~~l~i~v~~~~~~~~~~~~g~~gi~N~Gy~Gi~v 150 (265)
+|||+|.+||.. .+.+|...+.... ....+.. .+.++|+|.-... ...++... .++.|
T Consensus 1 ~nli~N~~Fe~~------~~~~W~~~~~~~~-~~~~~~~----~g~~~l~v~~~~~---------~~~~~~~~--~~~~l 58 (131)
T PF02018_consen 1 GNLIKNGGFEDG------GLSGWSFWGNSGA-SASVDNA----SGNYSLKVSNRSA---------TWDGQSQQ--QTISL 58 (131)
T ss_dssp GBSSSSTTSTTT------STTTEEEESSTTE-EEEEEEC----SSSEEEEEECCSS---------GCGEEEEE--EEEEE
T ss_pred CCEEECCCccCC------CCCCCEEccCCCE-EEEEEcC----CCeEEEEEECCCC---------Ccccccee--cceEe
Confidence 489999999973 4789999877632 2222211 3446666543211 11233222 25999
Q ss_pred ccCCeEEEEEEEEeCCCeeEEEEEEeCCC-Ce-eEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEE
Q 024627 151 KQGKTYKVVFYIRSLGSVNILVSLTSSNG-LQ-TLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTT 220 (265)
Q Consensus 151 ~~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g-~~-~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~ 220 (265)
++|++|++|||+|.+....+.+++...++ .. .+....+. ..++|++|+++|+++.+....+|+|..
T Consensus 59 ~~G~~Y~~s~~vk~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~W~~~s~~ft~~~~~~~~~l~~~~ 126 (131)
T PF02018_consen 59 KPGKTYTVSFWVKADSGGTVSVSLRDEDGSPYNWYTGQTVT----ITGEWTKYSGTFTAPSDDDTVRLYFEI 126 (131)
T ss_dssp -TTSEEEEEEEEEESSSEEEEEEEEESSTTTEEEEEEEEEE----ETSSEEEEEEEEEEESSCEEEEEEEEE
T ss_pred cCCCEEEEEEEEEeCCCCEEEEEEEEcCCCCcEEEEEEEEE----CCCCcEEEEEEEEECCCCceEEEEEEe
Confidence 99999999999999877889999988776 22 33332333 258999999999999888888888877
No 2
>PF04862 DUF642: Protein of unknown function (DUF642); InterPro: IPR006946 This family contains a conserved region found in a number of uncharacterised plant proteins.
Probab=98.59 E-value=3.2e-06 Score=71.74 Aligned_cols=136 Identities=18% Similarity=0.243 Sum_probs=81.2
Q ss_pred hhccCCCCCCCCCC----------CCCCCCCceEecCCceeEEecCCCc-----cccCCcceEEEEEeecCCCcccccCc
Q 024627 72 ELVSNRGFEAGGQN----------IPSNIDPWAIIGNDSSLIVSTDRSS-----CFERNKVALRMEVLCDSQGTNICPVG 136 (265)
Q Consensus 72 Ell~NRsFe~~~~~----------~~~~~~~W~~~g~~~~~~~~~d~~~-----~~~~n~~~l~i~v~~~~~~~~~~~~g 136 (265)
.||+|.+||.++.. ....+.+|...+. ...+...... ..+.+.+++++ ++
T Consensus 1 nLl~NG~FE~~p~~~~~~~~~~~~~~s~ipGWtv~g~--Ve~i~~~~~~g~~~~~~p~G~~aveL----g~--------- 65 (159)
T PF04862_consen 1 NLLVNGSFEEGPYNSNMNGTSLSDGSSSIPGWTVSGS--VEYIDSGHFQGGMYFAVPEGKQAVEL----GN--------- 65 (159)
T ss_pred CCccCCCCCCCCccCCCCcceEccCCCcCCCcEEcCE--EEEEecCCccCceeeeCCCCceEEEc----CC---------
Confidence 48999999986432 1235788987554 1122222111 12455677666 22
Q ss_pred ceeEEccCceeeeeccCCeEEEEEEEEeC--CCeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcc
Q 024627 137 GVGVYNPGYWGMGIKQGKTYKVVFYIRSL--GSVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNA 214 (265)
Q Consensus 137 ~~gi~N~Gy~Gi~v~~G~~Y~~Sf~ar~~--~~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a 214 (265)
...|.|. +...+|++|.++|.+... ....+.|+.... ....+.-.+. .....|++|++.|+| .. ...
T Consensus 66 ~~~I~Q~----~~t~~G~~Y~LtF~~~~~~~~~~~l~V~v~~~-~~~~~~~~~~----~~~~~w~~~s~~F~A-~~-t~~ 134 (159)
T PF04862_consen 66 EGSISQT----FTTVPGSTYTLTFSLARNCAQSESLSVSVGGQ-FSFVVTIQTS----YGSGGWDTYSFTFTA-SS-TRI 134 (159)
T ss_pred CceEEEE----EEccCCCEEEEEEEecCCCCCCccEEEEEecc-cceEEEeecc----CCCCCcEEEEEEEEe-CC-CEE
Confidence 2248877 999999999999999742 233577776653 2122221111 123569999999999 44 556
Q ss_pred eEEEEEcc-----CeEEEEeEEee
Q 024627 215 RLQLTTSR-----KGVIWFDQVSA 233 (265)
Q Consensus 215 ~l~I~~~~-----~G~v~lD~VSL 233 (265)
+|.+...+ ..=-.||.|++
T Consensus 135 ~l~f~~~~~~~d~~cGp~iDnV~v 158 (159)
T PF04862_consen 135 TLTFHNPGMESDSACGPVIDNVSV 158 (159)
T ss_pred EEEEECCCccCCCCceeEEEEEEe
Confidence 66664432 11245788875
No 3
>PLN03089 hypothetical protein; Provisional
Probab=98.38 E-value=9.1e-06 Score=77.57 Aligned_cols=144 Identities=18% Similarity=0.298 Sum_probs=88.9
Q ss_pred hhhhccCCCCCCCCC---CC-------------CCCCCCceEecCCceeEEecCCCccccCCcceEEEEEeecCCCcccc
Q 024627 70 WAELVSNRGFEAGGQ---NI-------------PSNIDPWAIIGNDSSLIVSTDRSSCFERNKVALRMEVLCDSQGTNIC 133 (265)
Q Consensus 70 yAEll~NRsFe~~~~---~~-------------~~~~~~W~~~g~~~~~~~~~d~~~~~~~n~~~l~i~v~~~~~~~~~~ 133 (265)
=+.||+|.+||.++. ++ -+++.+|.+......-.++... ...++..+++++ .++
T Consensus 193 ~~Nll~NG~FE~Gp~~~~n~~~gvllp~~~~~~~s~LpgW~i~s~~~V~yids~h-~~vp~G~~aveL--~~g------- 262 (373)
T PLN03089 193 KDNLLKNGGFEEGPYVFPNSSWGVLLPPNIEDDTSPLPGWMIESLKAVKYIDSAH-FSVPEGKRAVEL--VSG------- 262 (373)
T ss_pred ccceeecCCcccCCcccCCCCceEEeCCccccCCCCCCCcEEecCccEEEEecCc-ccCCCCceEEEe--ccC-------
Confidence 358999999998641 11 1357899974433222333332 222456677665 322
Q ss_pred cCcceeEEccCceeeeeccCCeEEEEEEEEeC---CCeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCC
Q 024627 134 PVGGVGVYNPGYWGMGIKQGKTYKVVFYIRSL---GSVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKET 210 (265)
Q Consensus 134 ~~g~~gi~N~Gy~Gi~v~~G~~Y~~Sf~ar~~---~~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t 210 (265)
...+|.|. +...+|++|+++|.+-.. -.+.+.|.....+. .+++...+...+.|++++++|+|..+
T Consensus 263 --~e~aI~Q~----v~T~~G~~Y~LsFs~g~a~~~c~gs~~V~a~ag~~-----~~~v~~~s~g~gg~~~~s~~F~A~s~ 331 (373)
T PLN03089 263 --KESAIAQV----VRTVPGKSYNLSFTVGDANNGCHGSMMVEAFAGKD-----TQKVPYESQGKGGFKRASLRFKAVSN 331 (373)
T ss_pred --CcceEEEE----EEccCCCEEEEEEEEccCCCCCCCcEEEEEEeecc-----cceEEEecCCCcceEEEEEEEEeccC
Confidence 23478877 999999999999997432 23445565443322 12222222356789999999998865
Q ss_pred CCcceEEEEEc-------cCeEEE---EeEEeeecc
Q 024627 211 NPNARLQLTTS-------RKGVIW---FDQVSAMPL 236 (265)
Q Consensus 211 ~~~a~l~I~~~-------~~G~v~---lD~VSLfP~ 236 (265)
. .+|.+.-. ..+.++ ||.|++.+.
T Consensus 332 ~--Trl~F~s~~y~~~~d~~~~~cGPvlDdV~v~~~ 365 (373)
T PLN03089 332 R--TRITFYSSFYHTKSDDFGSLCGPVVDDVRVVPV 365 (373)
T ss_pred C--EEEEEEEeecccccCcCCCcccceeeeEEEEEc
Confidence 3 36666331 136677 999999985
No 4
>PLN03089 hypothetical protein; Provisional
Probab=98.07 E-value=0.0001 Score=70.43 Aligned_cols=136 Identities=17% Similarity=0.276 Sum_probs=82.1
Q ss_pred hhccCCCCCCCCCCC---------CCCCCCceEecCCceeEEecCCC-----ccccCCcceEEEEEeecCCCcccccCcc
Q 024627 72 ELVSNRGFEAGGQNI---------PSNIDPWAIIGNDSSLIVSTDRS-----SCFERNKVALRMEVLCDSQGTNICPVGG 137 (265)
Q Consensus 72 Ell~NRsFe~~~~~~---------~~~~~~W~~~g~~~~~~~~~d~~-----~~~~~n~~~l~i~v~~~~~~~~~~~~g~ 137 (265)
.||+|.+||.++... .+.+.+|.+-+. ...+..... -..+++.+++++ ++ .
T Consensus 28 nLL~NG~FE~gP~~~~~n~t~~~g~s~LPgW~i~g~--VeyI~s~~~~~~m~~~vP~G~~Av~L----G~---------e 92 (373)
T PLN03089 28 GLLPNGDFETPPKKSQMNGTVVIGKNAIPGWEISGF--VEYISSGQKQGGMLLVVPEGAHAVRL----GN---------E 92 (373)
T ss_pred CeecCCCccCCCCcCCCCcccccCCCCCCCCEecCc--EEEEeCCCccCceeEECCCCchhhhc----CC---------C
Confidence 699999999874322 235789996431 112222210 112455677665 22 3
Q ss_pred eeEEccCceeeeeccCCeEEEEEEEEeC--CCeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcce
Q 024627 138 VGVYNPGYWGMGIKQGKTYKVVFYIRSL--GSVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNAR 215 (265)
Q Consensus 138 ~gi~N~Gy~Gi~v~~G~~Y~~Sf~ar~~--~~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~ 215 (265)
..|.|. |.+.+|..|.++|.+... ....+.|+.... . .++.-++.. ..++|++|...|+|..+. .+
T Consensus 93 ~sI~Q~----i~t~~G~~Y~LTFs~ar~c~~~~~v~vsv~~~-~-~~~~~qt~~----~~~gw~~~s~~F~A~s~~--t~ 160 (373)
T PLN03089 93 ASISQT----LTVTKGSYYSLTFSAARTCAQDESLNVSVPPE-S-GVLPLQTLY----SSSGWDSYAWAFKAESDV--VN 160 (373)
T ss_pred ceEEEE----EEccCCCEEEEEEEecCCCCCCceEEEEecCC-C-cEEeeEEec----cCCCcEEEEEEEEEeccc--EE
Confidence 468877 999999999999999632 234466665543 2 344322221 357899999999997543 56
Q ss_pred EEEEEcc---CeEE--EEeEEeee
Q 024627 216 LQLTTSR---KGVI--WFDQVSAM 234 (265)
Q Consensus 216 l~I~~~~---~G~v--~lD~VSLf 234 (265)
|.|...+ +... .||.|++-
T Consensus 161 l~F~~~~~~~D~~CGPviD~VaIk 184 (373)
T PLN03089 161 LVFHNPGVEEDPACGPLIDAVAIK 184 (373)
T ss_pred EEEECcccCCCCcccceeeeEEEe
Confidence 6664222 3222 38888875
No 5
>COG3534 AbfA Alpha-L-arabinofuranosidase [Carbohydrate transport and metabolism]
Probab=98.05 E-value=3.3e-06 Score=81.77 Aligned_cols=38 Identities=34% Similarity=0.466 Sum_probs=35.6
Q ss_pred EEEeEEeeeccCCCCC--CCchHHHHHHHHcCCCCeEecCC
Q 024627 226 IWFDQVSAMPLDTYKG--HGFRNVLFQMLADLKPRFLRFPG 264 (265)
Q Consensus 226 v~lD~VSLfP~dT~kg--~GlR~DL~e~L~dL~P~FlRfPG 264 (265)
-.|++.+++| |++.. +|||+|++++|+|||++++||||
T Consensus 31 r~vY~Giyep-d~p~~d~~G~RkDVle~lk~Lk~P~lR~PG 70 (501)
T COG3534 31 RAVYEGIYEP-DSPIADERGFRKDVLEALKDLKIPVLRWPG 70 (501)
T ss_pred cceeeeeecC-CCCCcchhhhHHHHHHHHHhcCCceeecCC
Confidence 5689999999 89986 79999999999999999999999
No 6
>PF15425 DUF4627: Domain of unknown function (DUF4627); PDB: 3SEE_A.
Probab=96.60 E-value=0.085 Score=45.93 Aligned_cols=149 Identities=15% Similarity=0.281 Sum_probs=61.3
Q ss_pred chhh-hhccCCCCCCCCC---CCCC--CCCCceEecCC----ceeE-EecCCCccccCCcceEEEEEeecCCCcccccCc
Q 024627 68 GLWA-ELVSNRGFEAGGQ---NIPS--NIDPWAIIGND----SSLI-VSTDRSSCFERNKVALRMEVLCDSQGTNICPVG 136 (265)
Q Consensus 68 GLyA-Ell~NRsFe~~~~---~~~~--~~~~W~~~g~~----~~~~-~~~d~~~~~~~n~~~l~i~v~~~~~~~~~~~~g 136 (265)
|+.| |||+|..|..+-. -+++ ...-|-++.+. +.+. ..+++ ..-+++++|++.. +... .=
T Consensus 2 g~~AQnLIkN~~F~t~Lt~e~~~as~~T~~~Wfavnde~~G~Tt~a~~~tnD----~k~~na~~is~~~---~~ts--Wy 72 (212)
T PF15425_consen 2 GISAQNLIKNGDFDTPLTNENTTASNTTFGKWFAVNDEWDGATTIAWINTND----QKTGNAWGISSWD---KQTS--WY 72 (212)
T ss_dssp --------SSTT--S----B-SSGGGS-TTSEEEEE-S-TTS-EEEEEE-S-----TTS-EEEEETT-S---S-----TT
T ss_pred ccchhhhhhcCccCcchhccccCcCcccccceEEEecccCCceEeeeeccCc----ccccceEEEeecc---cCcH--HH
Confidence 4556 8999999985421 1121 35679887543 2222 22222 2335677763221 1000 00
Q ss_pred ceeEEccCceeeeec---cCCeEEEEEEEEeCCCe---eEEEEEEeCCCCeeEEE-------------------EEEEee
Q 024627 137 GVGVYNPGYWGMGIK---QGKTYKVVFYIRSLGSV---NILVSLTSSNGLQTLAT-------------------SNIIAS 191 (265)
Q Consensus 137 ~~gi~N~Gy~Gi~v~---~G~~Y~~Sf~ar~~~~~---~vtV~L~~~~g~~~las-------------------~~i~~~ 191 (265)
-.=+ |=.+. ...-|+++||+|++..+ .|-|.|.+.+| +.... -...+.
T Consensus 73 kafL------aQr~~~gae~~mYtLsF~AkA~t~g~qv~V~Irl~~~ng-K~~~~Ffmr~~~d~~sqpn~s~a~y~~~ik 145 (212)
T PF15425_consen 73 KAFL------AQRYTNGAEKGMYTLSFDAKADTNGTQVHVFIRLHNDNG-KDNQRFFMRRDYDAQSQPNQSDAQYNFKIK 145 (212)
T ss_dssp TEEE------EEEE-S---SSEEEEEEEEEESSTT-EEEEEEE-B-TTS--B---EEEETT--TTT-TTSBSS-EEEE--
T ss_pred HHHH------HHHHhcccccceEEEEEEeecccCCCcEEEEEEEecCCC-ccceeEEEEeccccccCccchhhhhhhccc
Confidence 1112 22221 23469999999997543 45555666655 22110 012222
Q ss_pred ecCCCCcEEEEEEEEec------------------CCC-C-----cceEEEEE-ccCeEEEEeEEeee
Q 024627 192 ASDVSNWTRVETLLEAK------------------ETN-P-----NARLQLTT-SRKGVIWFDQVSAM 234 (265)
Q Consensus 192 ~~~~~~W~ky~~~lta~------------------~t~-~-----~a~l~I~~-~~~G~v~lD~VSLf 234 (265)
..+.|++|.+.+.=. .++ . +..+.|.. +++|.+.||.|||-
T Consensus 146 --kAgkWtkv~~~fdfgkvvNai~s~k~n~~~~vt~td~~~a~Lkdf~i~iq~q~k~s~vlId~VsLk 211 (212)
T PF15425_consen 146 --KAGKWTKVSVYFDFGKVVNAISSFKMNPAEEVTDTDDDAAILKDFYICIQSQNKPSSVLIDDVSLK 211 (212)
T ss_dssp --STT--EEEEEEEEEEEEES-SSBTTT-TT--EEE--TT-HHHHSEEEEEE--STT-EEEEEEEEEE
T ss_pred --cCCceEEEEEEeehhHHhHHHhhhccCCCCccccCccchhhhcceEEEEEEcCCCceEEecccEec
Confidence 358899999986421 111 1 23344443 45899999999983
No 7
>PF03422 CBM_6: Carbohydrate binding module (family 6); InterPro: IPR005084 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classification of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This entry represents CBM6 from CAZY which was previously known as cellulose-binding domain family VI (CBD VI). CBM6 bind to amorphous cellulose, xylan, mixed beta-(1,3)(1,4)glucan and beta-1,3-glucan[, , ]. CBM6 adopts a classic lectin-like beta-jelly roll fold, predominantly consisting of five antiparallel beta-strands on one face and four antiparallel beta-strands on the other face. It contains two potential ligand binding sites, named respectively cleft A and B. These clefts include aromatic residues which are probably involved in the substrate binding. The cleft B is located on the concave surface of one beta-sheet, and the cleft A on one edge of the protein between the loop that connects the inner and outer beta-sheets of the jellyroll fold []. The multiple binding clefts confer the extensive range of specificities displayed by the domain [, , ].; GO: 0030246 carbohydrate binding; PDB: 1UY1_A 1UY3_A 1UY4_A 1UY2_A 1UYY_A 1UXZ_B 1UYZ_A 1UY0_B 1UYX_A 1UZ0_A ....
Probab=96.42 E-value=0.084 Score=41.81 Aligned_cols=83 Identities=12% Similarity=0.216 Sum_probs=57.7
Q ss_pred eeeeccCCeEEEEEEEEeCCC-eeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEccCe-
Q 024627 147 GMGIKQGKTYKVVFYIRSLGS-VNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTSRKG- 224 (265)
Q Consensus 147 Gi~v~~G~~Y~~Sf~ar~~~~-~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~G- 224 (265)
.+.+.++.+|.+.+.+..... ..++|.+-+.+| +.+++..++..+ .-..|+..+..+...+ ....|.|.+...+
T Consensus 38 ~Vd~~~~g~y~~~~~~a~~~~~~~~~l~id~~~g-~~~~~~~~~~tg-~w~~~~~~~~~v~l~~--G~h~i~l~~~~~~~ 113 (125)
T PF03422_consen 38 NVDVPEAGTYTLTIRYANGGGGGTIELRIDGPDG-TLIGTVSLPPTG-GWDTWQTVSVSVKLPA--GKHTIYLVFNGGDG 113 (125)
T ss_dssp EEEESSSEEEEEEEEEEESSSSEEEEEEETTTTS-EEEEEEEEE-ES-STTEEEEEEEEEEEES--EEEEEEEEESSSSS
T ss_pred EEeeCCCceEEEEEEEECCCCCcEEEEEECCCCC-cEEEEEEEcCCC-CccccEEEEEEEeeCC--CeeEEEEEEECCCC
Confidence 488889999999988877654 577777766566 888887776541 2234555655565554 5567888887653
Q ss_pred -EEEEeEEee
Q 024627 225 -VIWFDQVSA 233 (265)
Q Consensus 225 -~v~lD~VSL 233 (265)
.++||-+.+
T Consensus 114 ~~~niD~~~f 123 (125)
T PF03422_consen 114 WAFNIDYFQF 123 (125)
T ss_dssp B-EEEEEEEE
T ss_pred ceEEeEEEEE
Confidence 699999876
No 8
>smart00137 MAM Domain in meprin, A5, receptor protein tyrosine phosphatase mu (and others). Likely to have an adhesive function. Mutations in the meprin MAM domain affect noncovalent associations within meprin oligomers. In receptor tyrosine phosphatase mu-like molecules the MAM domain is important for homophilic cell-cell interactions.
Probab=96.21 E-value=0.41 Score=40.04 Aligned_cols=105 Identities=18% Similarity=0.188 Sum_probs=60.6
Q ss_pred cceEEEEEeecCCCcccccCcceeEEccCceeeeeccCCeEEEEEEEEeC--CCeeEEEEEEeCCCCe--eEEEEEEEee
Q 024627 116 KVALRMEVLCDSQGTNICPVGGVGVYNPGYWGMGIKQGKTYKVVFYIRSL--GSVNILVSLTSSNGLQ--TLATSNIIAS 191 (265)
Q Consensus 116 ~~~l~i~v~~~~~~~~~~~~g~~gi~N~Gy~Gi~v~~G~~Y~~Sf~ar~~--~~~~vtV~L~~~~g~~--~las~~i~~~ 191 (265)
.++|.++..... +...+-|.-+ =|.. +...+-++||..-. ..+.+.|.+.+.++.. .+- ...
T Consensus 47 G~y~~v~~~~~~------~g~~A~L~SP---~~~~-~~~~~cl~F~Y~m~G~~~g~L~V~~~~~~~~~~~~lw----~~~ 112 (161)
T smart00137 47 GHFMFFETSSGA------PGQTARLLSP---PLYE-NRSTHCLTFWYYMYGSGSGTLNVYVRENNGSQDTLLW----SRS 112 (161)
T ss_pred eeEEEEECCCCC------CCCEEEEECC---cccC-CCCCeEEEEEEEecCCCCCEEEEEEEeCCCCCceEeE----EEc
Confidence 478887765322 1223344433 1222 22456788888653 4566888877544422 122 222
Q ss_pred ecCCCCcEEEEEEEEecCCCCcceEEEEEc----cCeEEEEeEEeeecc
Q 024627 192 ASDVSNWTRVETLLEAKETNPNARLQLTTS----RKGVIWFDQVSAMPL 236 (265)
Q Consensus 192 ~~~~~~W~ky~~~lta~~t~~~a~l~I~~~----~~G~v~lD~VSLfP~ 236 (265)
+..++.|++-++.|.+.. .+-++.|... ..|.|.||.|++.|.
T Consensus 113 g~~~~~W~~~~v~l~~~~--~~fqi~fe~~~g~~~~g~IAiDDI~i~~g 159 (161)
T smart00137 113 GTQGGQWLQAEVALSKWQ--QPFQVVFEGTRGKGHSGYIALDDILLSNG 159 (161)
T ss_pred CCCCCceEEEEEEecCCC--CcEEEEEEEEEcCCccceEEEeEEEeecc
Confidence 234688999999999732 2233444332 258999999999873
No 9
>smart00606 CBD_IV Cellulose Binding Domain Type IV.
Probab=95.85 E-value=0.13 Score=41.15 Aligned_cols=81 Identities=14% Similarity=0.168 Sum_probs=51.6
Q ss_pred ceeeeeccCCeEEEEEEEEeCC-CeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEE---EEEEEecCCCCcceEEEEE
Q 024627 145 YWGMGIKQGKTYKVVFYIRSLG-SVNILVSLTSSNGLQTLATSNIIASASDVSNWTRV---ETLLEAKETNPNARLQLTT 220 (265)
Q Consensus 145 y~Gi~v~~G~~Y~~Sf~ar~~~-~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky---~~~lta~~t~~~a~l~I~~ 220 (265)
|.++.+.+...|++++.+.+.. .+.++|.+-+.+| +.+++..++.. ++|..| +..+... .....|.|.+
T Consensus 44 y~~vd~~~~g~~~i~~~~as~~~~~~i~v~~d~~~G-~~~~~~~~p~t----g~~~~~~~~~~~v~~~--~G~~~l~~~~ 116 (129)
T smart00606 44 YKDVDFGSSGAYTFTARVASGNAGGSIELRLDSPTG-TLVGTVDVPST----GGWQTYQTVSATVTLP--AGVHDVYLVF 116 (129)
T ss_pred EEeEecCCCCceEEEEEEeCCCCCceEEEEECCCCC-cEEEEEEeCCC----CCCccCEEEEEEEccC--CceEEEEEEE
Confidence 3457776668999999887653 3467888766667 78887777643 445544 4333322 3445677777
Q ss_pred ccCeEEEEeEEe
Q 024627 221 SRKGVIWFDQVS 232 (265)
Q Consensus 221 ~~~G~v~lD~VS 232 (265)
.+...+.||-+.
T Consensus 117 ~~~~~~~ld~~~ 128 (129)
T smart00606 117 KGGNYFNIDWFR 128 (129)
T ss_pred ECCCcEEEEEEE
Confidence 554348888764
No 10
>cd06263 MAM Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region.
Probab=95.79 E-value=0.57 Score=38.41 Aligned_cols=77 Identities=18% Similarity=0.251 Sum_probs=49.4
Q ss_pred CeEEEEEEEEeC--CCeeEEEEEEeCCCC--eeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEc----cCeE
Q 024627 154 KTYKVVFYIRSL--GSVNILVSLTSSNGL--QTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTS----RKGV 225 (265)
Q Consensus 154 ~~Y~~Sf~ar~~--~~~~vtV~L~~~~g~--~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~----~~G~ 225 (265)
...-++||..-. ..+.|.|.+....+. ..+-+. . +..++.|++.+++|.+.. ..-++.|... ..|.
T Consensus 71 ~~~Cl~F~y~~~g~~~g~L~V~v~~~~~~~~~~lw~~--~--~~~~~~W~~~~v~l~~~~--~~fqi~fe~~~~~~~~g~ 144 (157)
T cd06263 71 SSHCLSFWYHMYGSGVGTLNVYVREEGGGLGTLLWSA--S--GGQGNQWQEAEVTLSASS--KPFQVVFEGVRGSGSRGD 144 (157)
T ss_pred CCeEEEEEEEecCCCCCeEEEEEEeCCCCcceEEEEE--E--CCCCCeeEEEEEEECCCC--CceEEEEEEEECCCcccc
Confidence 345588877643 367888888876551 223221 2 123589999999999875 2233333331 2589
Q ss_pred EEEeEEeeecc
Q 024627 226 IWFDQVSAMPL 236 (265)
Q Consensus 226 v~lD~VSLfP~ 236 (265)
|.||.|+|.|.
T Consensus 145 IAIDdI~l~~g 155 (157)
T cd06263 145 IALDDISLSPG 155 (157)
T ss_pred EEEeEEEEecc
Confidence 99999999873
No 11
>PF03425 CBM_11: Carbohydrate binding domain (family 11); InterPro: IPR005087 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classification of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This entry represents CBM11 from CAZY which binds both beta-1,4-glucan and beta-1,3-1,4-mixed linked glucans.; GO: 0008810 cellulase activity, 0030245 cellulose catabolic process; PDB: 1V0A_A.
Probab=95.56 E-value=0.6 Score=39.92 Aligned_cols=79 Identities=15% Similarity=0.223 Sum_probs=43.1
Q ss_pred CeEEEEEEEEeCCC-eeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEec--------CC-------C--Ccce
Q 024627 154 KTYKVVFYIRSLGS-VNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAK--------ET-------N--PNAR 215 (265)
Q Consensus 154 ~~Y~~Sf~ar~~~~-~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~--------~t-------~--~~a~ 215 (265)
...-++||+|++.. .+++|++.+... ..+-...+... .+||++++-|..= .. + .-..
T Consensus 72 ~~~gl~Fw~k~dgs~~~l~vqi~d~~~-~e~~~~~~~~~----~~W~~V~IPF~~f~~~~~~~p~g~~~~~~ldl~~v~~ 146 (178)
T PF03425_consen 72 GYGGLSFWIKGDGSGNKLRVQIKDGGD-YEYWEASFTDS----STWKTVEIPFSDFTQRPDYQPGGWGADGTLDLTNVWE 146 (178)
T ss_dssp T--EEEEEEEE------EEEEEEEE-E-EEEEEEEE-------SS-EEEEEEGGG-EE--S---TT----SS--TTSEEE
T ss_pred cCCcEEEEEEcCCCCcEEEEEEecCCc-ceeeEeecCCC----CcCEEEEEEHHHcccccccCCCCCCcccccChHHcEE
Confidence 34578999998754 468888887541 23444556653 4599999986321 10 1 1124
Q ss_pred EEEEEccC---eEEEEeEEeeeccC
Q 024627 216 LQLTTSRK---GVIWFDQVSAMPLD 237 (265)
Q Consensus 216 l~I~~~~~---G~v~lD~VSLfP~d 237 (265)
++|.+.+. |+++||.|.|.+..
T Consensus 147 ~~~~~~~~~~~~~~~iDdI~l~~~~ 171 (178)
T PF03425_consen 147 FAFYVNGGGGAGTFYIDDIRLYGAA 171 (178)
T ss_dssp EEEEESSS---EEEEEEEEEEE-B-
T ss_pred EEEEEcCCCceeEEEEEeEEEEeCc
Confidence 66666543 89999999999864
No 12
>PF00629 MAM: MAM domain; InterPro: IPR000998 MAM is an acronym derived from meprin, A-5 protein, and receptor protein-tyrosine phosphatase mu. The MAM domain consists of approximately 170 amino acids. It occurs in several cell surface proteins, including Meprins, and is thought to function as an interaction or adhesion domain []. The domain has been shown to play a role in homodimerization of protein-tyrosine phosphatase mu [] and appears to help determine the specificity of these interactions. It has been reported that certain cysteine mutations in the MAM domain of murine meprin A result in the formation of monomeric meprin, which has altered stability and activity []. This indicates that these domain-domain interactions are critical for structure and function of the enzyme. It has also been shown that the MAM domain of meprins is necessary for correct folding and transport through the secretory pathway []. ; GO: 0016020 membrane; PDB: 2C9A_A 2V5Y_A.
Probab=94.30 E-value=0.55 Score=37.71 Aligned_cols=80 Identities=15% Similarity=0.162 Sum_probs=43.4
Q ss_pred ccCCeEEEEEEEEe--CCCeeEEEEEEeCCC--CeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEc----c
Q 024627 151 KQGKTYKVVFYIRS--LGSVNILVSLTSSNG--LQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTS----R 222 (265)
Q Consensus 151 ~~G~~Y~~Sf~ar~--~~~~~vtV~L~~~~g--~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~----~ 222 (265)
.+...+-++||... ...+.+.|.+..... ...+.+. . +...+.|++.++.|.+. ...-++.|... .
T Consensus 69 ~~~~~~cl~F~y~~~g~~~~~L~V~v~~~~~~~~~~l~~~--~--~~~~~~W~~~~v~l~~~--~~~~~i~f~~~~~~~~ 142 (160)
T PF00629_consen 69 PASGNSCLSFWYYMYGSSVGTLRVYVREESTGNSTPLWSI--T--GSQGNSWQRAQVNLPPI--SSPFQIIFEAIRGSSY 142 (160)
T ss_dssp --SS--EEEEEEEEE-SSSEEEEEEEEETT----S-SEEE---------SSEEEEEEEE-----TS-EEEEEEEEE--SS
T ss_pred cccccceeEEEEeeccccceeeEEEEEecCCccceeeeee--c--CCCcCCccceEEEcccc--cccceEEEEEEEcCCC
Confidence 44456668888864 344678898887722 1223322 2 12468999999999987 33445555442 2
Q ss_pred CeEEEEeEEeeecc
Q 024627 223 KGVIWFDQVSAMPL 236 (265)
Q Consensus 223 ~G~v~lD~VSLfP~ 236 (265)
.|.|.||.|+|-|.
T Consensus 143 ~~~iaiDdi~~~~~ 156 (160)
T PF00629_consen 143 RGDIAIDDISLSPG 156 (160)
T ss_dssp --EEEEEEEEEESS
T ss_pred ceEEEEEEEEEeCC
Confidence 58999999999863
No 13
>PF10648 Gmad2: Immunoglobulin-like domain of bacterial spore germination; InterPro: IPR018911 This domain is found linked to IPR019606 from INTERPRO in some bacterial proteins. It is predicted to contain an immunoglobulin-like all-beta fold.
Probab=89.31 E-value=2.7 Score=32.27 Aligned_cols=63 Identities=16% Similarity=0.189 Sum_probs=45.1
Q ss_pred eEEEEEEEEeCCCeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCC-CCcceEEEEE
Q 024627 155 TYKVVFYIRSLGSVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKET-NPNARLQLTT 220 (265)
Q Consensus 155 ~Y~~Sf~ar~~~~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t-~~~a~l~I~~ 220 (265)
.++++-.+|. -++.+.++|.|.+| +++++..+... .....|-.|+.++.-+.. ...++|++..
T Consensus 17 p~~V~G~A~~-FEgtv~~rv~D~~g-~vl~e~~~~a~-~g~~~~g~F~~tv~~~~~~~~~g~l~v~~ 80 (88)
T PF10648_consen 17 PVKVSGKARV-FEGTVNIRVRDGHG-EVLAEGFVTAT-GGAPSWGPFEGTVSFPPPPPGKGTLEVFE 80 (88)
T ss_pred CEEEEEEEEE-eeeEEEEEEEcCCC-cEEEEeeEEec-cCCCcccceEEEEEeCCCCCCceEEEEEE
Confidence 4555555663 36789999999998 78887777653 367899999999865533 5566776653
No 14
>COG3534 AbfA Alpha-L-arabinofuranosidase [Carbohydrate transport and metabolism]
Probab=86.84 E-value=0.16 Score=49.95 Aligned_cols=37 Identities=24% Similarity=0.444 Sum_probs=32.8
Q ss_pred eEEEEEEcCC-CCCCCcceeeeeeeccccccccchhhh
Q 024627 36 TARLLVDASQ-GRPMPETLFGIFFEEINHAGAGGLWAE 72 (265)
Q Consensus 36 ~~~ltVd~~~-~~~is~~LyG~FfEdIn~s~dGGLyAE 72 (265)
+++++|+.+. ..+|+..+||.|.|++..+.++|||-.
T Consensus 3 ~a~~~v~~d~~ig~I~k~iYG~F~EHlGr~vY~Giyep 40 (501)
T COG3534 3 KARAVVDTDYTIGKIDKRIYGHFIEHLGRAVYEGIYEP 40 (501)
T ss_pred ccceeechhhccCcchhhhhhHHHHhhccceeeeeecC
Confidence 3567888888 889999999999999999999999964
No 15
>PF04620 FlaA: Flagellar filament outer layer protein Flaa; InterPro: IPR006714 Periplasmic flagella are the organelles of spirochete mobility, and are structurally different from the flagella of other motile bacteria. They reside inside the cell within the periplasmic space, and confer mobility in viscous gel-like media such as connective tissue []. The flagella are composed of an outer sheath of FlaA proteins and a core filament of FlaB proteins. Each species usually has several FlaA protein species [].; GO: 0001539 ciliary or flagellar motility, 0030288 outer membrane-bounded periplasmic space
Probab=86.25 E-value=9.8 Score=34.10 Aligned_cols=57 Identities=11% Similarity=0.199 Sum_probs=41.3
Q ss_pred eeeeccCCeEEEEEEEEeCC-CeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEec
Q 024627 147 GMGIKQGKTYKVVFYIRSLG-SVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAK 208 (265)
Q Consensus 147 Gi~v~~G~~Y~~Sf~ar~~~-~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~ 208 (265)
-|++ +|....+++|+-+.+ ...+.+.|+|.+| .+.. +..-.-.-.+|++.++.+.+.
T Consensus 100 ~Ipi-~g~~k~I~vWV~G~n~~h~L~v~lrD~~G-~~~~---l~~G~L~f~GWK~L~~~iP~~ 157 (217)
T PF04620_consen 100 PIPI-PGVIKSISVWVYGDNYPHWLEVLLRDAKG-EVHQ---LPLGSLNFDGWKNLTVNIPPY 157 (217)
T ss_pred ceec-cceeEEEEEEEECCCCCceEEEEEEcCCC-CEEE---EEeeeecCCceeEEEEECCCC
Confidence 4777 689999999999975 5679999999999 4432 211101247899999987554
No 16
>PF15432 Sec-ASP3: Accessory Sec secretory system ASP3
Probab=82.90 E-value=25 Score=28.94 Aligned_cols=75 Identities=17% Similarity=0.232 Sum_probs=52.0
Q ss_pred eccCCeEEEEEEEEeCC--CeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEccCeEEE
Q 024627 150 IKQGKTYKVVFYIRSLG--SVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTSRKGVIW 227 (265)
Q Consensus 150 v~~G~~Y~~Sf~ar~~~--~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~G~v~ 227 (265)
+++|++|.+.+-+.... +.-++|.+.|+.+ +.+....+.- -+.+|+-+..+-.-++++.-.+-.++.
T Consensus 50 Lk~G~~Y~l~~~~~~~P~~svylki~F~dr~~-e~i~~~i~k~----------~~~~F~yP~~aysY~I~LinaG~~~l~ 118 (128)
T PF15432_consen 50 LKRGHTYQLKFNIDVVPENSVYLKIIFFDRQG-EEIEEQIIKN----------DSFEFTYPEEAYSYTISLINAGCQSLT 118 (128)
T ss_pred ecCCCEEEEEEEEEEccCCeEEEEEEEEccCC-CEeeEEEEec----------CceEEeCCCCceEEEEEEeeCCCCeeE
Confidence 48999999999998753 5568999999988 6666544431 236666665555556666555666777
Q ss_pred EeEEeeec
Q 024627 228 FDQVSAMP 235 (265)
Q Consensus 228 lD~VSLfP 235 (265)
|.-+++-+
T Consensus 119 F~~i~I~e 126 (128)
T PF15432_consen 119 FHSIEISE 126 (128)
T ss_pred EeEEEEEE
Confidence 77777655
No 17
>PF01835 A2M_N: MG2 domain; InterPro: IPR002890 The proteinase-binding alpha-macroglobulins (A2M) [] are large glycoproteins found in the plasma of vertebrates, in the hemolymph of some invertebrates and in reptilian and avian egg white. A2M-like proteins are able to inhibit all four classes of proteinases by a 'trapping' mechanism. They have a peptide stretch, called the 'bait region', which contains specific cleavage sites for different proteinases. When a proteinase cleaves the bait region, a conformational change is induced in the protein, thus trapping the proteinase. The entrapped enzyme remains active against low molecular weight substrates, whilst its activity toward larger substrates is greatly reduced, due to steric hindrance. Following cleavage in the bait region, a thiol ester bond, formed between the side chains of a cysteine and a glutamine, is cleaved and mediates the covalent binding of the A2M-like protein to the proteinase. This family includes the N-terminal region of the alpha-2-macroglobulin family. The inhibitor domains belong to MEROPS inhibitor family I39.; GO: 0004866 endopeptidase inhibitor activity; PDB: 2B39_B 3KLS_B 3PRX_C 3KM9_B 3PVM_C 3CU7_A 4E0S_A 4A5W_A 4ACQ_C 2P9R_B ....
Probab=80.32 E-value=23 Score=26.70 Aligned_cols=67 Identities=10% Similarity=0.199 Sum_probs=42.6
Q ss_pred eccCCeEEEEEEEEeCC-------CeeEEEEEEeCCCCeeEEEEEE-EeeecCCCCcEEEEEEEEecCCCCcceEEEEEc
Q 024627 150 IKQGKTYKVVFYIRSLG-------SVNILVSLTSSNGLQTLATSNI-IASASDVSNWTRVETLLEAKETNPNARLQLTTS 221 (265)
Q Consensus 150 v~~G~~Y~~Sf~ar~~~-------~~~vtV~L~~~~g~~~las~~i-~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~ 221 (265)
.++|++=.++.++|... ...++|.|.+.+| +.+..... . ..+.-.++.+|+-++....+...|.+.
T Consensus 11 YrPGetV~~~~~~~~~~~~~~~~~~~~~~v~i~dp~g-~~v~~~~~~~-----~~~~G~~~~~~~lp~~~~~G~y~i~~~ 84 (99)
T PF01835_consen 11 YRPGETVHFRAIVRDLDNDFKPPANSPVTVTIKDPSG-NEVFRWSVNT-----TNENGIFSGSFQLPDDAPLGTYTIRVK 84 (99)
T ss_dssp E-TTSEEEEEEEEEEECTTCSCESSEEEEEEEEETTS-EEEEEEEEEE-----TTCTTEEEEEEE--SS---EEEEEEEE
T ss_pred cCCCCEEEEEEEEeccccccccccCCceEEEEECCCC-CEEEEEEeee-----eCCCCEEEEEEECCCCCCCEeEEEEEE
Confidence 36899999999988643 2579999999988 77776665 3 244556666666666556666666654
Q ss_pred c
Q 024627 222 R 222 (265)
Q Consensus 222 ~ 222 (265)
.
T Consensus 85 ~ 85 (99)
T PF01835_consen 85 T 85 (99)
T ss_dssp E
T ss_pred E
Confidence 3
No 18
>PF06030 DUF916: Bacterial protein of unknown function (DUF916); InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function.
Probab=72.41 E-value=37 Score=27.40 Aligned_cols=89 Identities=17% Similarity=0.229 Sum_probs=48.9
Q ss_pred cCceeeeeccCCeEEEEEEEEeCC--CeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEec----CCCCcceE
Q 024627 143 PGYWGMGIKQGKTYKVVFYIRSLG--SVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAK----ETNPNARL 216 (265)
Q Consensus 143 ~Gy~Gi~v~~G~~Y~~Sf~ar~~~--~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~----~t~~~a~l 216 (265)
.||+-+.+.+|++.++.+-+.... ...+.|.+.++-. +.-+ .|.-. . .+ .+...++... ....+.
T Consensus 16 ~~YFdL~~~P~q~~~l~v~i~N~s~~~~tv~v~~~~A~T-n~nG--~I~Y~--~-~~-~~~d~sl~~~~~~~v~~~~~-- 86 (121)
T PF06030_consen 16 VSYFDLKVKPGQKQTLEVRITNNSDKEITVKVSANTATT-NDNG--VIDYS--Q-NN-PKKDKSLKYPFSDLVKIPKE-- 86 (121)
T ss_pred CCeEEEEeCCCCEEEEEEEEEeCCCCCEEEEEEEeeeEe-cCCE--EEEEC--C-CC-cccCcccCcchHHhccCCcE--
Confidence 589999999999999999998654 4455666554421 0000 11111 0 11 1111111111 111111
Q ss_pred EEEEccCeEEEEeEEeeeccCCCCC
Q 024627 217 QLTTSRKGVIWFDQVSAMPLDTYKG 241 (265)
Q Consensus 217 ~I~~~~~G~v~lD~VSLfP~dT~kg 241 (265)
|++....+-.+.+-==||+..|+|
T Consensus 87 -Vtl~~~~sk~V~~~i~~P~~~f~G 110 (121)
T PF06030_consen 87 -VTLPPNESKTVTFTIKMPKKAFDG 110 (121)
T ss_pred -EEECCCCEEEEEEEEEcCCCCcCC
Confidence 667666666666666788777887
No 19
>PF14299 PP2: Phloem protein 2
Probab=70.99 E-value=61 Score=27.08 Aligned_cols=71 Identities=14% Similarity=0.241 Sum_probs=41.3
Q ss_pred eeccCCeEEEEEEEEeCC------CeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEE-EEEecCCCCcceEEEEE
Q 024627 149 GIKQGKTYKVVFYIRSLG------SVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVET-LLEAKETNPNARLQLTT 220 (265)
Q Consensus 149 ~v~~G~~Y~~Sf~ar~~~------~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~-~lta~~t~~~a~l~I~~ 220 (265)
.+-+|.+|.++|.+|-.. ..++++++.-.++.+.-....+.......++|-++++ .|..... .++.+.+.+
T Consensus 55 ~Lsp~t~Y~vy~v~kl~~~~~Gw~~~pv~~~v~~~~~~~~~~~~~~~~~~~r~dgW~Eie~GeF~~~~~-~~~ev~f~~ 132 (154)
T PF14299_consen 55 MLSPGTTYAVYFVFKLKDDAYGWDSPPVEFSVKVPDGEKYEQERKVCLPKERGDGWMEIELGEFFNEGG-DDGEVEFSM 132 (154)
T ss_pred EcCCCCEEEEEEEEEecCCCCCCCcCCEEEEEEeCCCccccceeeEEcCCCCCCCEEEEEcceEEecCC-CCcEEEEEE
Confidence 366899999999998531 2245555554555332222333332235789999987 6666532 444444443
No 20
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=70.74 E-value=1.8 Score=33.76 Aligned_cols=9 Identities=0% Similarity=0.228 Sum_probs=4.6
Q ss_pred CCCCCccee
Q 024627 46 GRPMPETLF 54 (265)
Q Consensus 46 ~~~is~~Ly 54 (265)
.+++.+.-|
T Consensus 37 ~~~v~~~~~ 45 (95)
T PF07172_consen 37 ENEVQDDKY 45 (95)
T ss_pred CCCCCcccc
Confidence 455555544
No 21
>PF04300 FBA: F-box associated region; InterPro: IPR007397 Proteins containing this domain are associated with F-box domains (IPR001810 from INTERPRO), hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. Q9UK22 from SWISSPROT is involved in binding to N-glycosylated proteins.; GO: 0030163 protein catabolic process; PDB: 1UMI_A 2RJ2_A 2E33_A 1UMH_A 2E32_A 2E31_A.
Probab=68.79 E-value=73 Score=27.66 Aligned_cols=70 Identities=13% Similarity=0.158 Sum_probs=39.3
Q ss_pred eeEEccCceeeeeccC-CeEEEEEEEEe--CC--CeeEEEEEEeCCCCeeEEEEEEE---eeecCCCCcEEEEEEEEec
Q 024627 138 VGVYNPGYWGMGIKQG-KTYKVVFYIRS--LG--SVNILVSLTSSNGLQTLATSNII---ASASDVSNWTRVETLLEAK 208 (265)
Q Consensus 138 ~gi~N~Gy~Gi~v~~G-~~Y~~Sf~ar~--~~--~~~vtV~L~~~~g~~~las~~i~---~~~~~~~~W~ky~~~lta~ 208 (265)
+.|..+|||-=-+... -.=.+|-|.-+ +- .-.+.|+|.+++. +++++-... +.+..+..|++++.+|+-=
T Consensus 71 IDL~~eG~~~~lLD~~qP~I~isdWy~~r~dc~~~Y~l~V~Lld~~~-~vi~~f~~~~~~~~~~~~~~W~qvsh~F~~Y 148 (184)
T PF04300_consen 71 IDLQAEGYWPELLDSFQPEITISDWYAGRFDCGCVYELHVQLLDANK-NVIAEFKPGPVPIPQWTDNPWKQVSHTFSNY 148 (184)
T ss_dssp EETTTTT--HHHHHHT--EEEEEEEEE--SSS-EEEEEEEEEEETTT-EEEEEEEEESEEE-T--T--EEEEEEEE-S-
T ss_pred EehhhccCCHHHhcCCCCCEEEEEEEeccCCcCcEEEEEEEECcCCC-cEEEEEecccccccccCCCCcEEEEEEEeCC
Confidence 5677788876555433 23455666533 22 3579999999985 888775432 2112368899999999764
No 22
>PF11141 DUF2914: Protein of unknown function (DUF2914); InterPro: IPR022606 This bacterial family of proteins has no known function.
Probab=65.24 E-value=31 Score=24.85 Aligned_cols=41 Identities=15% Similarity=0.228 Sum_probs=30.9
Q ss_pred eeeccCCeEEEEEEEEeC--CCeeEEEEEEeCCCCeeEEEEEEEe
Q 024627 148 MGIKQGKTYKVVFYIRSL--GSVNILVSLTSSNGLQTLATSNIIA 190 (265)
Q Consensus 148 i~v~~G~~Y~~Sf~ar~~--~~~~vtV~L~~~~g~~~las~~i~~ 190 (265)
++|. |..|...=+-+-. ..+.-+|.+.+++| ++|++..+.+
T Consensus 23 l~i~-g~r~Rt~S~k~~~~~~~G~WrV~V~~~~G-~~l~~~~F~V 65 (66)
T PF11141_consen 23 LPIS-GGRWRTWSSKQNFPDQPGDWRVEVVDEDG-QVLGSLRFSV 65 (66)
T ss_pred Eecc-CCCEEEEEEeecCCCCCcCEEEEEEcCCC-CEEEEEEEEE
Confidence 5565 6667666655543 57889999999999 8999988875
No 23
>PF09212 CBM27: Carbohydrate binding module 27; InterPro: IPR015295 This domain is found in carbohydrate binding proteins that bind to beta-1, 4-mannooligosaccharides, carob galactomannan, and konjac glucomannan, but not to cellulose (insoluble and soluble) or soluble birchwood xylan. The region adopts a beta sandwich structure comprising 13 beta strands with a single, small alpha-helix and a single metal atom []. ; PDB: 1OF3_A 1OF4_A 1OH4_A 1PMJ_X 1PMH_X.
Probab=60.57 E-value=26 Score=30.22 Aligned_cols=110 Identities=15% Similarity=0.233 Sum_probs=59.3
Q ss_pred CCcceEEEEEeecCCCcccccCcceeEEccCceeeeeccCCeEEEEEEEEe-C-CCeeEE--EEEEeCCCCeeEEE----
Q 024627 114 RNKVALRMEVLCDSQGTNICPVGGVGVYNPGYWGMGIKQGKTYKVVFYIRS-L-GSVNIL--VSLTSSNGLQTLAT---- 185 (265)
Q Consensus 114 ~n~~~l~i~v~~~~~~~~~~~~g~~gi~N~Gy~Gi~v~~G~~Y~~Sf~ar~-~-~~~~vt--V~L~~~~g~~~las---- 185 (265)
.+..+||+++.-+.+ ....++-| ...+. .+-...+-++-+|+=. + ..+.++ +.| .+|-.-+..
T Consensus 41 ~g~gaLklnv~~~~~----~~W~E~ki-~~~~~--dls~~~~l~fDv~iP~~~~~~G~l~~~a~l--~~gW~k~g~~~~~ 111 (170)
T PF09212_consen 41 GGSGALKLNVDFDGN----NDWDELKI-FKNFE--DLSEYNRLEFDVYIPKNEKYSGSLKPYAAL--NPGWTKIGMDTTE 111 (170)
T ss_dssp GGGSEEEEEEEE-TT----STTEEEEE-CCEEC--CGCC--EEEEEEEEEHHCCSSSEE-EEEEE--CTTTEEECCCSCE
T ss_pred CCCccEEEEeecCCC----CCcchhhh-hhhhh--hcCCccEEEEEEEeCCCCCCCccEEEEEEc--CCChHHhcccccc
Confidence 345789998875431 01333444 22222 3344556666677743 2 244443 444 233111111
Q ss_pred ------EEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEcc-----CeEEEEeEEeeec
Q 024627 186 ------SNIIASASDVSNWTRVETLLEAKETNPNARLQLTTSR-----KGVIWFDQVSAMP 235 (265)
Q Consensus 186 ------~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~-----~G~v~lD~VSLfP 235 (265)
..+++ .+.+++++++++.-..+.....|.|.+-+ .|.++||-|.|.-
T Consensus 112 ~~v~dle~v~i---~Gk~Y~k~~v~i~~~~~~~~~~lvl~ivG~~~~Y~GpIYIDNV~L~k 169 (170)
T PF09212_consen 112 INVKDLETVTI---DGKGYKKIHVSIEFDSSKKATQLVLQIVGSNLDYNGPIYIDNVKLIK 169 (170)
T ss_dssp EECCCSEEEEE---TTEEEEEEEEEEE--SSCCE-EEEEEEEEES--EEEEEEEEEEEEEE
T ss_pred ccccccceEEE---CCeEEEEEEEEEEccccCCCCcEEEEEccccccccCCEEEEeEEEec
Confidence 22343 36889999999877665556677777654 6999999999863
No 24
>PF13313 DUF4082: Domain of unknown function (DUF4082)
Probab=56.56 E-value=40 Score=28.55 Aligned_cols=46 Identities=28% Similarity=0.285 Sum_probs=33.7
Q ss_pred EEEEEEEEeCCCeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEE
Q 024627 156 YKVVFYIRSLGSVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETL 204 (265)
Q Consensus 156 Y~~Sf~ar~~~~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~ 204 (265)
=-++||-.....+.-+.+|.+.+| +.||+.+++-+ ....||+.++.
T Consensus 34 tgvrfYk~~~ntgthtgsLWsa~G-~lLAt~tft~e--tasGWQt~~f~ 79 (149)
T PF13313_consen 34 TGVRFYKGAGNTGTHTGSLWSADG-TLLATATFTNE--TASGWQTVTFS 79 (149)
T ss_pred EEEEEEeCCCCCCceEEEEECCCC-CEEEEEEEcCC--CCCceEEEecc
Confidence 344555334456777999999999 89999988754 45789998765
No 25
>PF07705 CARDB: CARDB; InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins.; PDB: 2KUT_A 2L0D_A 3IDU_A 2KL6_A.
Probab=54.48 E-value=82 Score=22.96 Aligned_cols=69 Identities=13% Similarity=0.205 Sum_probs=43.1
Q ss_pred eeccCCeEEEEEEEEeCC---CeeEEEEEEeCCCCeeEEEEEE-EeeecCCCCcEEEEEEEEecCCCCcceEEEEEccC
Q 024627 149 GIKQGKTYKVVFYIRSLG---SVNILVSLTSSNGLQTLATSNI-IASASDVSNWTRVETLLEAKETNPNARLQLTTSRK 223 (265)
Q Consensus 149 ~v~~G~~Y~~Sf~ar~~~---~~~vtV~L~~~~g~~~las~~i-~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~ 223 (265)
.+..|+.+++.+-++..+ ...++|.+... + ....+..| .+. .++.+.+.+++++. ....-.+.+.++..
T Consensus 14 ~~~~g~~~~i~~~V~N~G~~~~~~~~v~~~~~-~-~~~~~~~i~~L~---~g~~~~v~~~~~~~-~~G~~~i~~~iD~~ 86 (101)
T PF07705_consen 14 NVVPGEPVTITVTVKNNGTADAENVTVRLYLD-G-NSVSTVTIPSLA---PGESETVTFTWTPP-SPGSYTIRVVIDPD 86 (101)
T ss_dssp EEETTSEEEEEEEEEE-SSS-BEEEEEEEEET-T-EEEEEEEESEB----TTEEEEEEEEEE-S-S-CEEEEEEEESTT
T ss_pred cccCCCEEEEEEEEEECCCCCCCCEEEEEEEC-C-ceeccEEECCcC---CCcEEEEEEEEEeC-CCCeEEEEEEEeeC
Confidence 445789999999998643 34688888764 4 34466666 433 56778888888887 33444556555543
No 26
>cd00918 Der-p2_like Several group 2 allergen proteins belong to the ML domain family. They include Dermatophagoides pteronyssinus, group 2 (Der p 2) and D. farinae, group 2 (Der f 2) allergens. These house dust mites cause heavy atopic diseases such as asthma and dermatitis. Although the allergenic properties of these proteins have been well characterized, their biological function in mites is unknown.
Probab=49.36 E-value=25 Score=28.41 Aligned_cols=33 Identities=18% Similarity=0.288 Sum_probs=24.3
Q ss_pred eeeeccCCe--EEEEEEEEeCC---CeeEEEEEEeCCC
Q 024627 147 GMGIKQGKT--YKVVFYIRSLG---SVNILVSLTSSNG 179 (265)
Q Consensus 147 Gi~v~~G~~--Y~~Sf~ar~~~---~~~vtV~L~~~~g 179 (265)
.=||++|++ |+.++.+.... +..++++|.+.+|
T Consensus 72 ~CPl~~G~~~~y~~~~~V~~~~P~v~~~V~~~L~d~~g 109 (120)
T cd00918 72 KCPIKKGQHYDIKYTWNVPAILPKIKAVVKAVLIGDHG 109 (120)
T ss_pred eCCCcCCcEEEEEEeeeccccCCCeEEEEEEEEEcCCC
Confidence 578999999 56667776643 3568888988766
No 27
>TIGR03711 acc_sec_asp3 accessory Sec system protein Asp3. This protein is designated Asp3 because, along with SecY2, SecA2, and other proteins it is part of the accessory Sec system. The system is involved in the export of serine-rich glycoproteins important for virulence in a number of Gram-positive species, including Streptococcus gordonii and Staphylococcus aureus. This protein family is assigned to transport rather than glycosylation function, but the specific molecular role is unknown.
Probab=49.21 E-value=1.6e+02 Score=24.60 Aligned_cols=37 Identities=24% Similarity=0.438 Sum_probs=28.7
Q ss_pred eccCCeEEEEEEEEeCC--CeeEEEEEEeCCCCeeEEEEE
Q 024627 150 IKQGKTYKVVFYIRSLG--SVNILVSLTSSNGLQTLATSN 187 (265)
Q Consensus 150 v~~G~~Y~~Sf~ar~~~--~~~vtV~L~~~~g~~~las~~ 187 (265)
+++|++|.+.+-+.... +.-++|.+.|+.+ +.+....
T Consensus 61 Lk~g~~Y~i~~n~~~~P~~s~~~ki~F~dr~~-~ei~~~i 99 (135)
T TIGR03711 61 LKRGQTYKLSLNADASPEGSVYLKITFFDRQG-EEIGTEI 99 (135)
T ss_pred EcCCCEEEEEEeeeeCCCceEEEEEEEeccCC-ceeceEE
Confidence 47899999999998753 4568888999988 5665443
No 28
>PF14109 GldH_lipo: GldH lipoprotein
Probab=48.52 E-value=1.5e+02 Score=24.16 Aligned_cols=84 Identities=25% Similarity=0.340 Sum_probs=45.3
Q ss_pred CeEEEEEEEEeCC-----CeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcE------EEEEEEEecCCCCcceEEEEEcc
Q 024627 154 KTYKVVFYIRSLG-----SVNILVSLTSSNGLQTLATSNIIASASDVSNWT------RVETLLEAKETNPNARLQLTTSR 222 (265)
Q Consensus 154 ~~Y~~Sf~ar~~~-----~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~------ky~~~lta~~t~~~a~l~I~~~~ 222 (265)
..|++.+.+|... ...+.|.+...+|...-....+.+. ...+.|. -|+..+.-.+ .+.+..
T Consensus 33 ~~Y~l~l~lR~~~~Ypy~NL~l~v~~~~p~~~~~~dtl~~~La-d~~G~w~G~G~~~~~e~~~~~~~-------~~~f~~ 104 (131)
T PF14109_consen 33 GPYNLYLNLRNNNDYPYSNLWLIVELTDPDGKKVTDTLECELA-DPDGKWLGKGIGDLYEYKLPYKE-------NVRFPR 104 (131)
T ss_pred CeeEEEEEEEcCCCCCcCCEEEEEEEEcCCCCEEeeeEEEEEE-CCCCcEeeeeEeEeEEEEEEeec-------ceecCC
Confidence 5677777777542 2345556665555221111122222 2345553 3334443322 345677
Q ss_pred CeEEEEeEEeeeccCCCCC---CCch
Q 024627 223 KGVIWFDQVSAMPLDTYKG---HGFR 245 (265)
Q Consensus 223 ~G~v~lD~VSLfP~dT~kg---~GlR 245 (265)
+|+..|-....|.++.-+| -|+|
T Consensus 105 ~G~Y~~~i~q~Mr~~~L~GI~dVGi~ 130 (131)
T PF14109_consen 105 KGSYTFTIEQAMRRNPLPGISDVGIR 130 (131)
T ss_pred CCcEEEEEEeccccccCCCceeeeEE
Confidence 8888888888888777766 3554
No 29
>PF14683 CBM-like: Polysaccharide lyase family 4, domain III; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=48.27 E-value=51 Score=28.10 Aligned_cols=79 Identities=18% Similarity=0.211 Sum_probs=39.4
Q ss_pred CCeEEEEEEEEeC-CCeeEEEEEEeCCCCeeEE----E-EEEEeeecCCCCcEEEEEEEEecCC-CCcceEEEEEccCeE
Q 024627 153 GKTYKVVFYIRSL-GSVNILVSLTSSNGLQTLA----T-SNIIASASDVSNWTRVETLLEAKET-NPNARLQLTTSRKGV 225 (265)
Q Consensus 153 G~~Y~~Sf~ar~~-~~~~vtV~L~~~~g~~~la----s-~~i~~~~~~~~~W~ky~~~lta~~t-~~~a~l~I~~~~~G~ 225 (265)
+..|++.+.+=+. ....++|.+.+..+ .... . ..+...+.-.+.|+.|++++.+..= ...+.+.|+.. .|+
T Consensus 77 ~~~~tL~i~la~a~~~~~~~V~vNg~~~-~~~~~~~~~d~~~~r~g~~~G~~~~~~~~ipa~~L~~G~Nti~lt~~-~gs 154 (167)
T PF14683_consen 77 AGTYTLRIALAGASAGGRLQVSVNGWSG-PFPSAPFGNDNAIYRSGIHRGNYRLYEFDIPASLLKAGENTITLTVP-SGS 154 (167)
T ss_dssp S--EEEEEEEEEEETT-EEEEEETTEE------------S--GGGT---S---EEEEEE-TTSS-SEEEEEEEEEE--S-
T ss_pred CCcEEEEEEeccccCCCCEEEEEcCccC-CccccccCCCCceeeCceecccEEEEEEEEcHHHEEeccEEEEEEEc-cCC
Confidence 3688888887655 56677888766333 2111 1 1111111123789999999988742 23567777664 455
Q ss_pred -----EEEeEEee
Q 024627 226 -----IWFDQVSA 233 (265)
Q Consensus 226 -----v~lD~VSL 233 (265)
|-.|.|.|
T Consensus 155 ~~~~gvmyD~I~L 167 (167)
T PF14683_consen 155 GLSPGVMYDYIRL 167 (167)
T ss_dssp GGSSEEEEEEEEE
T ss_pred CccCeEEEEEEEC
Confidence 88899887
No 30
>cd06480 ACD_HspB8_like Alpha-crystallin domain (ACD) found in mammalian 21.6 KDa small heat shock protein (sHsp) HspB8, also denoted as Hsp22 in humans, and similar proteins. sHsps are molecular chaperones that suppress protein aggregation and protect against cell stress, and are generally active as large oligomers consisting of multiple subunits. A chaperone complex formed of HspB8 and Bag3 stimulates degradation of protein complexes by macroautophagy. HspB8 also forms complexes with Hsp27 (HspB1), MKBP (HspB2), HspB3, alphaB-crystallin (HspB5), Hsp20 (HspB6), and cvHsp (HspB7). These latter interactions may depend on phosphorylation of the respective partner sHsp. HspB8 may participate in the regulation of cell proliferation, cardiac hypertrophy, apoptosis, and carcinogenesis. Point mutations in HspB8 have been correlated with the development of several congenital neurological diseases, including Charcot Marie tooth disease and distal motor neuropathy type II.
Probab=46.40 E-value=1.1e+02 Score=23.42 Aligned_cols=79 Identities=8% Similarity=0.040 Sum_probs=46.0
Q ss_pred eeeccCCeEEEEEEEEeCCCeeEEEEEEeCCCCeeEEEEEEEee-ecCCCCcEEEEEEEEecCCCCcceEEEEEccCeEE
Q 024627 148 MGIKQGKTYKVVFYIRSLGSVNILVSLTSSNGLQTLATSNIIAS-ASDVSNWTRVETLLEAKETNPNARLQLTTSRKGVI 226 (265)
Q Consensus 148 i~v~~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g~~~las~~i~~~-~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~G~v 226 (265)
-..+.+++|.+++-+++=.+..++|.+.+. ...-+..-... ...+--+++|+-.+.-........+.=.+..+|.|
T Consensus 9 ~~~~~~~~f~v~ldv~gF~pEDL~Vkv~~~---~L~V~Gkh~~~~~e~g~~~r~F~R~~~LP~~Vd~~~v~s~l~~dGvL 85 (91)
T cd06480 9 PPPNSSEPWKVCVNVHSFKPEELTVKTKDG---FVEVSGKHEEQQKEGGIVSKNFTKKIQLPPEVDPVTVFASLSPEGLL 85 (91)
T ss_pred CCCCCCCcEEEEEEeCCCCHHHcEEEEECC---EEEEEEEECcccCCCCEEEEEEEEEEECCCCCCchhEEEEeCCCCeE
Confidence 345567899999999886667788887652 22222211111 01112255666666666555555555556667777
Q ss_pred EEe
Q 024627 227 WFD 229 (265)
Q Consensus 227 ~lD 229 (265)
.|.
T Consensus 86 ~Ie 88 (91)
T cd06480 86 IIE 88 (91)
T ss_pred EEE
Confidence 664
No 31
>PF13201 Xylanase: Putative glycoside hydrolase xylanase; PDB: 3S30_B 3HBZ_A.
Probab=44.62 E-value=48 Score=31.62 Aligned_cols=86 Identities=19% Similarity=0.349 Sum_probs=46.2
Q ss_pred eeeeeccCCeEEEEEEEEeC-C---------------CeeEEEEEEeCCC-------------CeeEEEEEEEeeecCCC
Q 024627 146 WGMGIKQGKTYKVVFYIRSL-G---------------SVNILVSLTSSNG-------------LQTLATSNIIASASDVS 196 (265)
Q Consensus 146 ~Gi~v~~G~~Y~~Sf~ar~~-~---------------~~~vtV~L~~~~g-------------~~~las~~i~~~~~~~~ 196 (265)
+|+|..+ +.=.++.|.|-. + ...+.+.|.+.+. ..++|-..+.-. ...+
T Consensus 207 FG~pf~~-rP~~l~G~YKY~~G~~~~~~~~~~~~~~D~~~Iyavly~~~~~~~~l~g~~~~t~~~iia~a~~~~~-~~~~ 284 (342)
T PF13201_consen 207 FGRPFTK-RPTALKGYYKYTPGEVFYDNGKVVKGKKDECSIYAVLYEWSDDEEYLDGTNILTSPNIIAYAELTDG-TETD 284 (342)
T ss_dssp E-EE--S--EEEEEEEEEEE--SSEEETTEEESS-----EEEEEEEE-BTTBS-EECCTTTT-TTEEEEEE-SS----EE
T ss_pred cCCcccc-eecEEEEEEEEeEccEEecCCcccCCCCccEEEEEEEEeccCCcceecccccCCCcCEEEEEEecCC-CccC
Confidence 5677654 666677777621 0 2235555655422 245566665421 2457
Q ss_pred CcEEEEEEEEecCC---------CCcceEEEEEcc-----------CeEEEEeEEee
Q 024627 197 NWTRVETLLEAKET---------NPNARLQLTTSR-----------KGVIWFDQVSA 233 (265)
Q Consensus 197 ~W~ky~~~lta~~t---------~~~a~l~I~~~~-----------~G~v~lD~VSL 233 (265)
+|+++++.|+.... ..+-+|+|.++. +.+||||.|.|
T Consensus 285 ~~t~F~i~~~~~~~k~~d~~~l~~~~Y~laIV~SSSk~Gd~F~Ga~GStL~iDd~el 341 (342)
T PF13201_consen 285 EWTEFEIPFEYRYGKEYDYDKLENKKYKLAIVFSSSKYGDYFTGAVGSTLWIDDVEL 341 (342)
T ss_dssp EEEEEEEE-ECTTT----HHHHHCT-EEEEEEEESSTCGGGTEEETT-EEEEEEEEE
T ss_pred CCEEEEEEeEeecCcccChhhccCCCeEEEEEEecccCCCeeEcCCCCEEEEeeEEE
Confidence 89999999974431 245778888853 24999999987
No 32
>PF10670 DUF4198: Domain of unknown function (DUF4198)
Probab=44.58 E-value=95 Score=26.31 Aligned_cols=63 Identities=22% Similarity=0.321 Sum_probs=39.0
Q ss_pred eEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEccCeEEEEeEEeeec
Q 024627 169 NILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTSRKGVIWFDQVSAMP 235 (265)
Q Consensus 169 ~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~G~v~lD~VSLfP 235 (265)
.+++++. -+| +.++.+.|.+. ..+.|.+....-..-.|+.++++.|.+..+|.-.|-..-..|
T Consensus 152 ~~~~~vl-~~G-kPl~~a~V~~~--~~~~~~~~~~~~~~~~TD~~G~~~~~~~~~G~wli~a~~~~p 214 (215)
T PF10670_consen 152 PLPFQVL-FDG-KPLAGAEVEAF--SPGGWYDVEHEAKTLKTDANGRATFTLPRPGLWLIRASHKDP 214 (215)
T ss_pred EEEEEEE-ECC-eEcccEEEEEE--ECCCccccccceEEEEECCCCEEEEecCCCEEEEEEEEEecC
Confidence 4556655 356 78887777765 245564332111122356789999999999977766554443
No 33
>PF08770 SoxZ: Sulphur oxidation protein SoxZ; InterPro: IPR014880 SoxZ forms an anti parallel beta structure and forms a complex with SoxY. Sulphur oxidation occurs at the thiol of a conserved cysteine residue of the SoxY subunit []. ; PDB: 1V8H_B 2OX5_E 2OXG_E 2OXH_C.
Probab=42.96 E-value=1e+02 Score=24.03 Aligned_cols=33 Identities=24% Similarity=0.395 Sum_probs=22.7
Q ss_pred eeeeeccCCeEEEEEEEEeCCCeeEEEEEEeCCCC
Q 024627 146 WGMGIKQGKTYKVVFYIRSLGSVNILVSLTSSNGL 180 (265)
Q Consensus 146 ~Gi~v~~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g~ 180 (265)
||.+|-++ =.++|.+++...+.++|...|.+|.
T Consensus 60 ~~~siS~N--P~l~F~~~~~~~g~l~v~~~Dn~G~ 92 (100)
T PF08770_consen 60 WGPSISEN--PYLRFSFKGKKSGTLTVTWTDNKGN 92 (100)
T ss_dssp E-TTB-SS---EEEEEEEESSSEEEEEEEEETTS-
T ss_pred eCCcccCC--CcEEEEEecCCCcEEEEEEEECCCC
Confidence 56677433 3466677887778999999999993
No 34
>KOG1392 consensus Acetyl-CoA acetyltransferase [Lipid transport and metabolism]
Probab=41.49 E-value=11 Score=35.63 Aligned_cols=25 Identities=40% Similarity=0.650 Sum_probs=22.8
Q ss_pred CCCCCCCchHHHHHHHHcCCCCeEe
Q 024627 237 DTYKGHGFRNVLFQMLADLKPRFLR 261 (265)
Q Consensus 237 dT~kg~GlR~DL~e~L~dL~P~FlR 261 (265)
.|.|.||+|....|.|+.|||.|+.
T Consensus 265 ~t~kdngirvssle~laklkpafvk 289 (465)
T KOG1392|consen 265 KTIKDNGIRVSSLEKLAKLKPAFVK 289 (465)
T ss_pred ceecCCCcccCCHHHHhhcCchhcc
Confidence 4668899999999999999999986
No 35
>PF03944 Endotoxin_C: delta endotoxin; InterPro: IPR005638 This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain (IPR005639 from INTERPRO) involved in membrane insertion and pore formation; a beta-sheet central domain (IPR001178 from INTERPRO) involved in receptor binding; and a C-terminal beta-sandwich domain that interacts with the N-terminal domain to form a channel [, ]. This entry represents the conserved C-terminal domain.; PDB: 1DLC_A 1JI6_A 1W99_A 1CIY_A 1I5P_A 2C9K_A 3EB7_A.
Probab=37.21 E-value=2.3e+02 Score=23.11 Aligned_cols=82 Identities=13% Similarity=0.321 Sum_probs=38.8
Q ss_pred cCCeEEEEEEEEeCCCeeEEEEEEeCCCCeeEEEEEEEeeecC--CCC--cEEEEEE-----EEecCCCCcceEEEEEc-
Q 024627 152 QGKTYKVVFYIRSLGSVNILVSLTSSNGLQTLATSNIIASASD--VSN--WTRVETL-----LEAKETNPNARLQLTTS- 221 (265)
Q Consensus 152 ~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g~~~las~~i~~~~~~--~~~--W~ky~~~-----lta~~t~~~a~l~I~~~- 221 (265)
..++|++++..-+.....+.+......+ ....++..+.+. .-+ +..|... ++... .....+.|.+.
T Consensus 50 ~~~~YrIRiRYAs~~~~~~~i~~~~~~~---~~~~~~~~T~~~~~~~~~~y~~F~y~~~~~~~~~~~-~~~~~~~i~i~~ 125 (143)
T PF03944_consen 50 SSQKYRIRIRYASNSNGTLSISINNSSG---NLSFNFPSTMSNGDNLTLNYESFQYVEFPTPFTFSS-NQSITITISIQN 125 (143)
T ss_dssp STEEEEEEEEEEESS-EEEEEEETTEEE---ECEEEE--SSSTTGGCCETGGG-EEEEESSEEEEST-SEEEEEEEEEES
T ss_pred CCceEEEEEEEEECCCcEEEEEECCccc---eeeeeccccccCCCccccccceeEeeecCceEEecC-CCceEEEEEEEe
Confidence 6788998888777655555555432211 113333322101 011 3333332 22221 11234555443
Q ss_pred -c-CeEEEEeEEeeeccC
Q 024627 222 -R-KGVIWFDQVSAMPLD 237 (265)
Q Consensus 222 -~-~G~v~lD~VSLfP~d 237 (265)
. .+.|+||-+-+-|.|
T Consensus 126 ~~~~~~v~IDkIEFIPv~ 143 (143)
T PF03944_consen 126 ISSNGNVYIDKIEFIPVN 143 (143)
T ss_dssp STTTS-EEEEEEEEEECT
T ss_pred cCCCCeEEEEeEEEEeCC
Confidence 2 389999999999953
No 36
>COG4724 Endo-beta-N-acetylglucosaminidase D [Carbohydrate transport and metabolism]
Probab=33.71 E-value=2.5e+02 Score=27.90 Aligned_cols=110 Identities=13% Similarity=0.137 Sum_probs=63.0
Q ss_pred ceeEEecCCCccccCCcceEEEEEeecCCCcccccCcceeEEccCceeeeeccCCeEEEEEEEEeCCCeeEEEEEEeCCC
Q 024627 100 SSLIVSTDRSSCFERNKVALRMEVLCDSQGTNICPVGGVGVYNPGYWGMGIKQGKTYKVVFYIRSLGSVNILVSLTSSNG 179 (265)
Q Consensus 100 ~~~~~~~d~~~~~~~n~~~l~i~v~~~~~~~~~~~~g~~gi~N~Gy~Gi~v~~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g 179 (265)
..+..+-|-...+ +..++|++.-.-+..+ ..-|.++-. -+-|..+.+ +++-.|+.....|.+.+.+..+
T Consensus 426 ~kl~~~fDf~~ay-nGGnSLKfsgdl~~~~-----~~nv~Ly~t---~L~i~~~tk--~~v~~k~~~glKV~~~f~~~pd 494 (553)
T COG4724 426 EKLRAEFDFTDAY-NGGNSLKFSGDLAGKT-----DQNVRLYST---KLEITEKTK--LRVAHKGGKGLKVYMAFSTTPD 494 (553)
T ss_pred ceeecccchhhhc-CCCcceeeeeccccCC-----ccceEEEee---ceeeecCce--EEEEeecCCceEEEEEEecCCc
Confidence 3445544433333 4457888644322110 123455443 456665554 4555688766778888887655
Q ss_pred CeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEccCe
Q 024627 180 LQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTSRKG 224 (265)
Q Consensus 180 ~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~G 224 (265)
..+.+ ..-- ..+++|++=++.|...+...-+.+.+.++.+|
T Consensus 495 ~f~~~--d~~K--~l~~nW~~e~~~l~~~~g~~i~av~l~~e~~~ 535 (553)
T COG4724 495 KFDDA--DAWK--ELSDNWTNEEFDLSSLAGKTIYAVKLFFEHEG 535 (553)
T ss_pred cccch--hhhh--hhcccchhhheehhhccCceEEEEEEEEeccC
Confidence 22222 2221 25689999999998887666666677776665
No 37
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=33.30 E-value=1.8e+02 Score=20.84 Aligned_cols=68 Identities=16% Similarity=0.236 Sum_probs=30.1
Q ss_pred eccCCeEEEEEEEEeCCC---eeEEEEEEeCCCCee--EEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEE
Q 024627 150 IKQGKTYKVVFYIRSLGS---VNILVSLTSSNGLQT--LATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTT 220 (265)
Q Consensus 150 v~~G~~Y~~Sf~ar~~~~---~~vtV~L~~~~g~~~--las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~ 220 (265)
+.+|++..+++-++.... ..++++|.--+| .. .....+.. -..++=+.+++++++.++...+...|++
T Consensus 1 v~~G~~~~~~~tv~N~g~~~~~~v~~~l~~P~G-W~~~~~~~~~~~--l~pG~s~~~~~~V~vp~~a~~G~y~v~~ 73 (78)
T PF10633_consen 1 VTPGETVTVTLTVTNTGTAPLTNVSLSLSLPEG-WTVSASPASVPS--LPPGESVTVTFTVTVPADAAPGTYTVTV 73 (78)
T ss_dssp --TTEEEEEEEEEE--SSS-BSS-EEEEE--TT-SE---EEEEE----B-TTSEEEEEEEEEE-TT--SEEEEEEE
T ss_pred CCCCCEEEEEEEEEECCCCceeeEEEEEeCCCC-ccccCCcccccc--CCCCCEEEEEEEEECCCCCCCceEEEEE
Confidence 357778888887776432 235555554455 33 11222221 1245556667777776665555555544
No 38
>PF10836 DUF2574: Protein of unknown function (DUF2574) ; InterPro: IPR020386 This entry contains proteins with no known function.
Probab=31.19 E-value=33 Score=26.55 Aligned_cols=32 Identities=28% Similarity=0.335 Sum_probs=21.8
Q ss_pred HhhhheeeccccccccceEEEEEEcCCCCCCCcce
Q 024627 19 GTCFLFQCFAAEVEVNQTARLLVDASQGRPMPETL 53 (265)
Q Consensus 19 ~~~~~~~~~~~~~~~~~~~~ltVd~~~~~~is~~L 53 (265)
.+.++-|-+|+..+++.+++|+|. |+-.+|+-
T Consensus 8 Gii~laYGls~P~faSdTATLtIs---Grv~~PTC 39 (93)
T PF10836_consen 8 GIIVLAYGLSSPAFASDTATLTIS---GRVSPPTC 39 (93)
T ss_pred hhhHhhhhcccccccccceEEEEc---ceEcCCcc
Confidence 334445566666788899999998 45666653
No 39
>COG0852 NuoC NADH:ubiquinone oxidoreductase 27 kD subunit [Energy production and conversion]
Probab=31.17 E-value=19 Score=31.23 Aligned_cols=41 Identities=22% Similarity=0.412 Sum_probs=26.4
Q ss_pred CCCcEEEEEEEEecCCCCcceEEEEEccCeEEEEeEEeeeccCCCCCCCchHHH
Q 024627 195 VSNWTRVETLLEAKETNPNARLQLTTSRKGVIWFDQVSAMPLDTYKGHGFRNVL 248 (265)
Q Consensus 195 ~~~W~ky~~~lta~~t~~~a~l~I~~~~~G~v~lD~VSLfP~dT~kg~GlR~DL 248 (265)
+.+|..-+ .---+-|.|++.= ++-. =||| ++|.||.+|||-
T Consensus 107 ~A~W~ERE---------~yDmfGI~FeGHP--~LrR-ilm~-~~~~GhPLRKDf 147 (176)
T COG0852 107 AANWYERE---------AYDMFGIVFEGHP--DLRR-ILMP-DDWEGHPLRKDF 147 (176)
T ss_pred cCchhhhh---------hheeeeeEEcCCc--cccc-ccCC-CCCCCCCccCCc
Confidence 57887654 3346778886632 1111 1677 689999999983
No 40
>PF07521 RMMBL: RNA-metabolising metallo-beta-lactamase; InterPro: IPR011108 The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in IPR001279 from INTERPRO and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in RNA metabolism [].; PDB: 3ZQ4_D 2I7T_A 2I7V_A 2YCB_B 3BK1_A 3T3N_A 3BK2_A 3T3O_A 3AF5_A 3AF6_A ....
Probab=30.28 E-value=48 Score=21.63 Aligned_cols=22 Identities=32% Similarity=0.699 Sum_probs=18.6
Q ss_pred CCCCCchHHHHHHHHcCCCCeE
Q 024627 239 YKGHGFRNVLFQMLADLKPRFL 260 (265)
Q Consensus 239 ~kg~GlR~DL~e~L~dL~P~Fl 260 (265)
|-||.-|.||.++++.++|+.+
T Consensus 14 fSgHad~~~L~~~i~~~~p~~v 35 (43)
T PF07521_consen 14 FSGHADREELLEFIEQLNPRKV 35 (43)
T ss_dssp CSSS-BHHHHHHHHHHHCSSEE
T ss_pred ecCCCCHHHHHHHHHhcCCCEE
Confidence 6689999999999999999754
No 41
>cd00916 Npc2_like Niemann-Pick type C2 (Npc2) is a lysosomal protein in which a mutation in the gene causes a rare form of Niemann-Pick type C disease, an autosomal recessive lipid storage disorder characterized by accumulation of low-density lipoprotein-derived cholesterol in lysosomes. Although Npc2 is known to bind cholesterol, the function of this protein is unknown. These proteins belong to the ML domain family.
Probab=29.65 E-value=1.2e+02 Score=24.27 Aligned_cols=33 Identities=18% Similarity=0.216 Sum_probs=22.3
Q ss_pred eeeeccCCeEEEEE--EEEeCC---CeeEEEEEEeCCC
Q 024627 147 GMGIKQGKTYKVVF--YIRSLG---SVNILVSLTSSNG 179 (265)
Q Consensus 147 Gi~v~~G~~Y~~Sf--~ar~~~---~~~vtV~L~~~~g 179 (265)
.=||++|++|+... .+.... +..++++|.|.++
T Consensus 76 ~CPl~~G~~~~y~~~~~v~~~~P~i~~~v~~~L~d~~~ 113 (123)
T cd00916 76 SCPLSAGEDVTYTLSLPVLAPYPGISVTVEWELTDDDG 113 (123)
T ss_pred CCCCcCCcEEEEEEeeeccccCCCeEEEEEEEEEcCCC
Confidence 45889998876554 554332 3578888988766
No 42
>cd05755 Ig2_ICAM-1_like Second immunoglobulin (Ig)-like domain of intercellular cell adhesion molecule-1 (ICAM-1, CD54) and similar proteins. Ig2_ICAM-1_like: domain similar to the second immunoglobulin (Ig)-like domain of intercellular cell adhesion molecule-1 (ICAM-1, CD54). During the inflammation process, these molecules recruit leukocytes onto the vascular endothelium before extravasation to the injured tissues. ICAM-1 may be involved in organ targeted tumor metastasis. The interaction of ICAM-1 with leukocyte function-associated antigen-1 (LFA-1) plays a part in leukocyte-endothelial cell recognition. This group also contains ICAM-2, which also interacts with LFA-1. Transmigration of immature dendritic cells across resting endothelium is dependent on the interaction of ICAM-2 with as yet unidentified ligand(s) on the dendritic cells. ICAM-1 has five Ig-like domains and ICAM-2 has two. ICAM-1 may also act as host receptor for viruses and parasites.
Probab=29.65 E-value=2.7e+02 Score=21.64 Aligned_cols=66 Identities=14% Similarity=0.152 Sum_probs=39.3
Q ss_pred eccCCeEEEEEEEEeCCC-eeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEE
Q 024627 150 IKQGKTYKVVFYIRSLGS-VNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQL 218 (265)
Q Consensus 150 v~~G~~Y~~Sf~ar~~~~-~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I 218 (265)
+..|+.|++.-.+.+..+ ..++|.+... ++.+-.+.+... .....=+..+++++++..|..+.++=
T Consensus 13 ~~eG~~~tL~C~v~g~~P~a~L~i~W~rG--~~~l~~~~~~~~-~~~~~~~~stlt~~~~r~D~g~~~sC 79 (100)
T cd05755 13 QPVGKNYTLQCDVPGVAPRQNLTVVLLRG--NETLSRQPFGDN-TKSPVNAPATITITVDREDHGANFSC 79 (100)
T ss_pred ccCCCcEEEEEEEcCcCCCCcEEEEEeeC--CEEcccceeccc-cCCCceeEEEEEEecchhhCCcEEEE
Confidence 468999999999988654 4577887753 355544433321 01223345666677766555555443
No 43
>PF10365 DUF2436: Domain of unknown function (DUF2436); InterPro: IPR018832 Gingipains R and K are endopeptidases with specificity for arginyl and lysyl bonds, respectively. Like other cysteine peptidases, they require reducing conditions for activity. They are maximally active at approximately neutral pH. Gingipains R and K are secreted by the bacterium Porphyromonas gingivalis (Bacteroides gingivalis). The bacterium is a major pathogen in periodontal disease, and the many ways in which the activities of the gingipains may contribute to the disease processes have been reviewed []. These enzymes are also involved in the hemagglutinating activity of the organisms. This entry represents a central region found in gingipain K peptidases, active on lysyl bonds; they belong to the MEROPS peptidase family C25 (gingipain family, clan CD).
Probab=28.78 E-value=1.1e+02 Score=25.99 Aligned_cols=32 Identities=22% Similarity=0.229 Sum_probs=21.4
Q ss_pred cCcceeEEccC------ceeeeeccCCeEEEEEEEEeC
Q 024627 134 PVGGVGVYNPG------YWGMGIKQGKTYKVVFYIRSL 165 (265)
Q Consensus 134 ~~g~~gi~N~G------y~Gi~v~~G~~Y~~Sf~ar~~ 165 (265)
|.+.+-|.-.| +....+++|++|+|.+..-+.
T Consensus 118 ~~~kiwIaGd~g~~~tr~dDy~fEAGKtY~ftm~~~g~ 155 (161)
T PF10365_consen 118 PGGKIWIAGDGGDGPTRGDDYVFEAGKTYRFTMKRVGS 155 (161)
T ss_pred CCCeEEEecCCCCCCccccceEEecCCEEEEEEEeccC
Confidence 33445554444 345788999999999886554
No 44
>smart00737 ML Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids.
Probab=28.56 E-value=98 Score=24.02 Aligned_cols=33 Identities=18% Similarity=0.240 Sum_probs=21.7
Q ss_pred eeeeccCCeE--EEEEEEEeCC---CeeEEEEEEeCCC
Q 024627 147 GMGIKQGKTY--KVVFYIRSLG---SVNILVSLTSSNG 179 (265)
Q Consensus 147 Gi~v~~G~~Y--~~Sf~ar~~~---~~~vtV~L~~~~g 179 (265)
.=|+++|++| +.++.+.... ...++++|.++++
T Consensus 71 ~CPl~~G~~~~~~~~~~v~~~~P~~~~~v~~~l~d~~~ 108 (118)
T smart00737 71 KCPIEKGETVNYTNSLTVPGIFPPGKYTVKWELTDEDG 108 (118)
T ss_pred CCCCCCCeeEEEEEeeEccccCCCeEEEEEEEEEcCCC
Confidence 3588999975 4555554432 3467778888776
No 45
>PF14785 MalF_P2: Maltose transport system permease protein MalF P2 domain; PDB: 3RLF_F 3PUX_F 3PV0_F 3PUY_F 3PUV_F 2R6G_F 3PUZ_F 3PUW_F.
Probab=27.69 E-value=1.1e+02 Score=26.28 Aligned_cols=37 Identities=16% Similarity=0.200 Sum_probs=26.4
Q ss_pred cCCeEEEEEEEEeCCCeeEEEEEEeCCCCeeEEEEEEEee
Q 024627 152 QGKTYKVVFYIRSLGSVNILVSLTSSNGLQTLATSNIIAS 191 (265)
Q Consensus 152 ~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g~~~las~~i~~~ 191 (265)
+|++|+|.+|--++ . ..+.|.+.++++.+.|..+...
T Consensus 17 ~g~~y~F~Ly~~~d--~-~~L~l~~~~~~~~~~S~p~~l~ 53 (164)
T PF14785_consen 17 SGESYPFTLYPTGD--G-YRLALTDGESGQLYVSEPFSLD 53 (164)
T ss_dssp EEEEEEEEEEEETT--E-EEEEEEETTTTEEEEE--B---
T ss_pred CCCceeeEEEecCC--e-EEEEEeCCCcCceEEeCCcccc
Confidence 48999999995443 3 8899999887789999988774
No 46
>PF12988 DUF3872: Domain of unknown function, B. Theta Gene description (DUF3872); InterPro: IPR024355 This entry represents proteins of unknown function found primarily in Bacteroides species. The Bacteroides thetaiotaomicron gene coding for this protein is located in a conjugate transposon and appears to be upregulated in the presence of host or other bacterial species compared to growth in pure culture [, ].; PDB: 2L3B_A 2L7Q_A.
Probab=27.03 E-value=3.6e+02 Score=22.53 Aligned_cols=39 Identities=18% Similarity=0.135 Sum_probs=25.7
Q ss_pred eEEccCceeeeeccCCeE-----EEEEEEEeCC--CeeEEEEEEeCCC
Q 024627 139 GVYNPGYWGMGIKQGKTY-----KVVFYIRSLG--SVNILVSLTSSNG 179 (265)
Q Consensus 139 gi~N~Gy~Gi~v~~G~~Y-----~~Sf~ar~~~--~~~vtV~L~~~~g 179 (265)
-+... .|.++.+++.| .|++|.++.. ...+.|-++|+-|
T Consensus 79 ~L~~~--~g~~~~pND~Y~L~~~~FRLYYTS~s~~~q~idv~veDnfG 124 (137)
T PF12988_consen 79 TLRMD--DGTVLLPNDRYPLEKEVFRLYYTSRSDDQQTIDVYVEDNFG 124 (137)
T ss_dssp EEEET--TS-EE-TTSEEE-S-SEEEEEEEE-SSS-EEEEEEEEETTT
T ss_pred EEEec--CCcEeccccceecCcCEEEEEEecCCCCCceeEEEEEeCCC
Confidence 34444 57777777777 6788888744 5578888898887
No 47
>COG2373 Large extracellular alpha-helical protein [General function prediction only]
Probab=25.14 E-value=2.7e+02 Score=32.33 Aligned_cols=39 Identities=21% Similarity=0.335 Sum_probs=30.8
Q ss_pred ccCCeEEEEEEEEeC------CCeeEEEEEEeCCCCeeEEEEEEEe
Q 024627 151 KQGKTYKVVFYIRSL------GSVNILVSLTSSNGLQTLATSNIIA 190 (265)
Q Consensus 151 ~~G~~Y~~Sf~ar~~------~~~~vtV~L~~~~g~~~las~~i~~ 190 (265)
++|++..+.+.+|.- ...++++.+.+.+| +++...++..
T Consensus 406 RpGE~v~~~~~~R~~~~~~a~~~~p~~l~v~~PdG-~~~~~~~~~~ 450 (1621)
T COG2373 406 RPGETVHVNALLRDFDGKTALDNQPLKLRVLDPDG-SVLRTLTITL 450 (1621)
T ss_pred CCCceeeeeeeehhhcccccccCCCeEEEEECCCC-cEEEEEEEec
Confidence 689999999999852 23578999999999 7777766654
No 48
>PF04428 Choline_kin_N: Choline kinase N terminus; InterPro: IPR007521 This domain is found N-terminal to choline/ethanolamine kinase regions (IPR002573 from INTERPRO) in some plant and fungal choline kinase enzymes (2.7.1.32 from EC). This region is only found in some members of the choline kinase family, and is therefore unlikely to contribute to catalysis.; GO: 0016773 phosphotransferase activity, alcohol group as acceptor
Probab=25.03 E-value=52 Score=23.01 Aligned_cols=21 Identities=24% Similarity=0.646 Sum_probs=18.0
Q ss_pred CchHHHHHHHHcCC-CCeEecC
Q 024627 243 GFRNVLFQMLADLK-PRFLRFP 263 (265)
Q Consensus 243 GlR~DL~e~L~dL~-P~FlRfP 263 (265)
-||.|++.++..|+ |++-|-|
T Consensus 27 ~fk~di~~l~htL~i~~W~~v~ 48 (53)
T PF04428_consen 27 RFKQDILRLIHTLKIKKWRRVP 48 (53)
T ss_pred ccHHHHHHHHHHhcccccccCc
Confidence 38999999999999 8877765
No 49
>PF02221 E1_DerP2_DerF2: ML domain; InterPro: IPR003172 The MD-2-related lipid-recognition (ML) domain is implicated in lipid recognition, particularly in the recognition of pathogen related products. It has an immunoglobulin-like beta-sandwich fold similar to that of E-set Ig domains. This domain is present in the following proteins: Epididymal secretory protein E1 (also known as Niemann-Pick C2 protein), which is known to bind cholesterol. Niemann-Pick disease type C2 is a fatal hereditary disease characterised by accumulation of low-density lipoprotein-derived cholesterol in lysosomes []. House-dust mite allergen proteins such as Der f 2 from Dermatophagoides farinae and Der p 2 from Dermatophagoides pteronyssinus []. ; PDB: 2AG9_B 1G13_B 2AG2_B 2AG4_A 1TJJ_C 1PU5_C 1PUB_A 2AF9_A 3T6Q_D 3M7O_B ....
Probab=25.00 E-value=3.3e+02 Score=21.18 Aligned_cols=36 Identities=28% Similarity=0.349 Sum_probs=21.9
Q ss_pred eeeccCCeEEEEEEEEeC--C---CeeEEEEEEeCCCCeeEE
Q 024627 148 MGIKQGKTYKVVFYIRSL--G---SVNILVSLTSSNGLQTLA 184 (265)
Q Consensus 148 i~v~~G~~Y~~Sf~ar~~--~---~~~vtV~L~~~~g~~~la 184 (265)
=|+++|+.|++.+=+.-. . ...++++|.+.++ +.++
T Consensus 86 CPi~~G~~~~~~~~~~i~~~~p~~~~~i~~~l~d~~~-~~i~ 126 (134)
T PF02221_consen 86 CPIKAGEYYTYTYTIPIPKIYPPGKYTIQWKLTDQDG-EEIA 126 (134)
T ss_dssp STBTTTEEEEEEEEEEESTTSSSEEEEEEEEEEETTT-EEEE
T ss_pred CccCCCcEEEEEEEEEcccceeeEEEEEEEEEEeCCC-CEEE
Confidence 378899865554444322 1 3457777888876 5553
No 50
>PF09092 Lyase_N: Lyase, N terminal; InterPro: IPR015176 This entry represents a domain predominantly found in chondroitin ABC lyase I, adopting a jelly-roll fold topology consisting of a two-layered bent beta-sheet sandwich with one short alpha-helix. The convex beta sheet is composed of five antiparallel strands, whilst the concave beta-sheet contains five antiparallel beta-strands with a loop between two consecutive strands folding back onto the concave surface. This domain is required for binding of the protein to long glycosaminoglycan chains []. ; PDB: 2Q1F_A 1HN0_A.
Probab=24.98 E-value=4.6e+02 Score=22.77 Aligned_cols=128 Identities=15% Similarity=0.208 Sum_probs=69.1
Q ss_pred CCCceEecCCceeEEecCCCccccCCcceEEEEEeecCCCcccccCcceeEEccCceeeeec--cCCeE---EEEEEEEe
Q 024627 90 IDPWAIIGNDSSLIVSTDRSSCFERNKVALRMEVLCDSQGTNICPVGGVGVYNPGYWGMGIK--QGKTY---KVVFYIRS 164 (265)
Q Consensus 90 ~~~W~~~g~~~~~~~~~d~~~~~~~n~~~l~i~v~~~~~~~~~~~~g~~gi~N~Gy~Gi~v~--~G~~Y---~~Sf~ar~ 164 (265)
...|....+ +.+.++... +..+.++|+-+-.. .+...|.++. ++..+ .++.+ .+.||+=.
T Consensus 17 p~~~~~~~~-s~LslS~~h---yK~G~~SL~W~w~~---------gs~l~i~~~~--~~~~~~~~~k~~g~~~~~~WIYN 81 (178)
T PF09092_consen 17 PDAFTTSQG-STLSLSDEH---YKDGKQSLKWNWQP---------GSTLTISKPL--GFEPDAPTSKDGGRSAFIFWIYN 81 (178)
T ss_dssp TTCTEEECC-EEEEEESSS----SSTT-EEEEEEEC---------CEEEEEES-B------HHCCCCHHTCCEEEEEEEE
T ss_pred CcceEecCC-ceEEeCHhH---hhCCccccEEEcCC---------CCEEEEeccc--ccccccccccccCcceEEEEEEC
Confidence 356665433 345666543 46778999988863 3345666662 22111 22233 39999976
Q ss_pred CC--CeeEEEEEEeCC---CCeeEEEEEEEeeecCCCCcEEEEEEEEe----cCC---CCcceEEEEEc---cCeEEEEe
Q 024627 165 LG--SVNILVSLTSSN---GLQTLATSNIIASASDVSNWTRVETLLEA----KET---NPNARLQLTTS---RKGVIWFD 229 (265)
Q Consensus 165 ~~--~~~vtV~L~~~~---g~~~las~~i~~~~~~~~~W~ky~~~lta----~~t---~~~a~l~I~~~---~~G~v~lD 229 (265)
+. ...+++++.+.. | .+-..=.+.+. -.+|+-.=+.+.- ... ..-.+|+|+.. ..|+|+||
T Consensus 82 e~p~~~~l~f~F~~~~~~t~-~~~~~F~~~LN---FtGWR~~WV~y~~Dm~g~~~~g~~~md~l~i~AP~~~~~G~lf~D 157 (178)
T PF09092_consen 82 EKPQDDKLRFEFGKGLINTG-KPCYWFPFNLN---FTGWRAAWVSYERDMQGRPEEGSKDMDSLRITAPANDPSGTLFFD 157 (178)
T ss_dssp SS--SSEEEEEEECT--TTT-EECEEEEEE------SEEEEEEEETTTTSEE---TT-----EEEEE--TTSSEEEEEEE
T ss_pred CCCcCCeEEEEecCCcccCC-ccceEEEEEee---cccceeeeeeehhhccCCcccCcceeeEEEEEccccCCCccEEEE
Confidence 54 457889988763 4 45444444443 4678766655542 111 23568888885 47999999
Q ss_pred EEeeecc
Q 024627 230 QVSAMPL 236 (265)
Q Consensus 230 ~VSLfP~ 236 (265)
.+-+-..
T Consensus 158 ~l~~~~~ 164 (178)
T PF09092_consen 158 RLIFSVK 164 (178)
T ss_dssp EEEEEEE
T ss_pred EEeeccc
Confidence 9987664
No 51
>PF04744 Monooxygenase_B: Monooxygenase subunit B protein; InterPro: IPR006833 Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related []. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules []. These enzymes are composed of 3 subunits - A (IPR003393 from INTERPRO), B (IPR006833 from INTERPRO) and C (IPR006980 from INTERPRO) - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain []. The soluble regions of these enzymes derive primarily from the B subunit. This subunit forms two antiparallel beta-barrel-like structures and contains the mono- and dinuclear copper metal centres [].; PDB: 3CHX_E 3RFR_A 3RGB_A 1YEW_A.
Probab=24.89 E-value=2.8e+02 Score=27.00 Aligned_cols=57 Identities=7% Similarity=0.124 Sum_probs=31.3
Q ss_pred eeeeccCCeEEEEEEEEeCCCee--EEEEEEeCCCCeeEEEE-EEEeeecCCCCcEEEEEEEEe
Q 024627 147 GMGIKQGKTYKVVFYIRSLGSVN--ILVSLTSSNGLQTLATS-NIIASASDVSNWTRVETLLEA 207 (265)
Q Consensus 147 Gi~v~~G~~Y~~Sf~ar~~~~~~--vtV~L~~~~g~~~las~-~i~~~~~~~~~W~ky~~~lta 207 (265)
-+.++.|.+|++++.+|+...+. +-.++.=++++..++-. .+.++ ++|..++-..+.
T Consensus 80 S~~le~G~~y~fki~lkar~pG~~hvh~~~nv~~~Gp~~Gpg~~v~i~----g~~~dFtnpVtt 139 (381)
T PF04744_consen 80 SVSLELGGTYEFKIVLKARRPGTWHVHPMLNVEDAGPIVGPGQWVTIE----GSMGDFTNPVTT 139 (381)
T ss_dssp -B---TT-EEEEEEEEEE-S-EEEEEEEEEEETTTEEEEEEEEEEEEE----S-GGG---EEEB
T ss_pred eEEeecCCeeeEEEEEecccCccccceeeEeeccCCCCcCCceEEEEe----ccccccCcceEe
Confidence 58899999999999999976654 55666666665566543 34543 567766655543
No 52
>PF04151 PPC: Bacterial pre-peptidase C-terminal domain; InterPro: IPR007280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. 
This domain is normally found at the C terminus of secreted archaeal and bacterial peptidases, the majority of which belong to MEROPS peptidase families M4 (vibriolysin, IPR001570 from INTERPRO), M9A and M9B (microbial collagenase, IPR002169 from INTERPRO), M28 (aminopeptidase Ap1, IPR007484 from INTERPRO) and S8 (subtilisin family peptidases, IPR000209 from INTERPRO).; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 4DY5_B 4DXZ_A 4DY3_B 3JQW_A 3JQX_C 1NQJ_B 1NQD_A 2O8O_A 1WMF_A 1WME_A ....
Probab=23.79 E-value=2.6e+02 Score=19.50 Aligned_cols=64 Identities=19% Similarity=0.320 Sum_probs=33.5
Q ss_pred ceeeeeccCCeEEEEEEEEeCCCeeEEEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCCCCcceEEEEEccCe
Q 024627 145 YWGMGIKQGKTYKVVFYIRSLGSVNILVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKETNPNARLQLTTSRKG 224 (265)
Q Consensus 145 y~Gi~v~~G~~Y~~Sf~ar~~~~~~vtV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I~~~~~G 224 (265)
|+-+.+.+|.++++++ .+. ...+.+.|.+.+| ..+++.. ... .....+..+.++....|
T Consensus 4 ~y~f~v~ag~~l~i~l--~~~-~~d~dl~l~~~~g-~~~~~~d--------~~~---------~~~~~~~~i~~~~~~~G 62 (70)
T PF04151_consen 4 YYSFTVPAGGTLTIDL--SGG-SGDADLYLYDSNG-NSLASYD--------DSS---------QSGGNDESITFTAPAAG 62 (70)
T ss_dssp EEEEEESTTEEEEEEE--CET-TSSEEEEEEETTS-SSCEECC--------CCT---------CETTSEEEEEEEESSSE
T ss_pred EEEEEEcCCCEEEEEE--cCC-CCCeEEEEEcCCC-Cchhhhe--------ecC---------CCCCCccEEEEEcCCCE
Confidence 3456777777766554 232 2245577777776 3333210 000 11224456666667777
Q ss_pred EEEEe
Q 024627 225 VIWFD 229 (265)
Q Consensus 225 ~v~lD 229 (265)
+.+|-
T Consensus 63 tYyi~ 67 (70)
T PF04151_consen 63 TYYIR 67 (70)
T ss_dssp EEEEE
T ss_pred EEEEE
Confidence 76653
No 53
>COG3906 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=22.83 E-value=2.7e+02 Score=22.21 Aligned_cols=67 Identities=10% Similarity=0.165 Sum_probs=42.2
Q ss_pred EEEEEeCCCCeeEEEEEEEeeecCCCCcEEEEEEEEecCC--CCcceEEE---EEccCeEEEEeEEeeeccCCCC
Q 024627 171 LVSLTSSNGLQTLATSNIIASASDVSNWTRVETLLEAKET--NPNARLQL---TTSRKGVIWFDQVSAMPLDTYK 240 (265)
Q Consensus 171 tV~L~~~~g~~~las~~i~~~~~~~~~W~ky~~~lta~~t--~~~a~l~I---~~~~~G~v~lD~VSLfP~dT~k 240 (265)
.+.|.|.+|..++...-+++. ...|.|=-+.|.|... +..+...| .++.++.-.=..+.|+|.+|..
T Consensus 15 ~itL~DE~GnE~lf~~L~~~d---~~ef~KeYVll~p~~~e~~e~~eiei~a~~~~~d~dG~eg~~~l~p~etde 86 (105)
T COG3906 15 VITLIDEDGNEVLFEILFTFD---GEEFGKEYVLLVPAGSEEDEDGEIEIFAYSFTPDEDGTEGDLQLVPIETDE 86 (105)
T ss_pred EEEEECCCCceehhheeeeee---chhcceeEEEEecccccccCCCcEEEEEeecCcccccccCceeeecccchH
Confidence 578999999888887766765 3689655567776544 44444443 3343332222456789988764
No 54
>PF12273 RCR: Chitin synthesis regulation, resistance to Congo red; InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall []. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5 [].
Probab=21.64 E-value=42 Score=27.13 Aligned_cols=15 Identities=27% Similarity=0.565 Sum_probs=8.9
Q ss_pred HHHHHHhhhheeecc
Q 024627 14 LLFFIGTCFLFQCFA 28 (265)
Q Consensus 14 ~~~~~~~~~~~~~~~ 28 (265)
+++|||++|++.|++
T Consensus 9 i~~i~l~~~~~~~~~ 23 (130)
T PF12273_consen 9 IVAILLFLFLFYCHN 23 (130)
T ss_pred HHHHHHHHHHHHHHH
Confidence 334555566777766
No 55
>COG3126 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=21.57 E-value=2.5e+02 Score=24.03 Aligned_cols=76 Identities=17% Similarity=0.233 Sum_probs=41.5
Q ss_pred cCCeEEEEEEEEeC----CCeeEEEEEEeCC----CCeeEEEEEEEeeecCCCCcEEEEEEE--EecCCCCcceEEEEE-
Q 024627 152 QGKTYKVVFYIRSL----GSVNILVSLTSSN----GLQTLATSNIIASASDVSNWTRVETLL--EAKETNPNARLQLTT- 220 (265)
Q Consensus 152 ~G~~Y~~Sf~ar~~----~~~~vtV~L~~~~----g~~~las~~i~~~~~~~~~W~ky~~~l--ta~~t~~~a~l~I~~- 220 (265)
+-.+-.+++|+|-. ....++|+|.|-+ -.+++|+++|... +-..+++.| .+..=-.+.+.++..
T Consensus 41 a~~sv~G~V~yReriALPp~AvltV~L~DvSlADaPsrvla~~tvr~~-----Gq~P~~F~L~fdp~~i~p~~ryalsAr 115 (158)
T COG3126 41 AQKSVSGTVLYRERIALPPGAVLTVTLSDVSLADAPSRVLAEQTVRTE-----GQVPFPFVLPFDPSDIQPNHRYALSAR 115 (158)
T ss_pred cccccccceEEEEEecCCCCCEEEEEEEecccccChhHhhhhheeecc-----CccceeEEeccChhhCCCCcEEEEEEE
Confidence 34566677777743 2345777777642 1368999988743 233444444 443323445554433
Q ss_pred -ccCeEEEEeEEe
Q 024627 221 -SRKGVIWFDQVS 232 (265)
Q Consensus 221 -~~~G~v~lD~VS 232 (265)
+.+|+++|=.=.
T Consensus 116 I~~~gkL~Fitd~ 128 (158)
T COG3126 116 ITVNGKLLFITDT 128 (158)
T ss_pred EEECCEEEEEecc
Confidence 457777664333
No 56
>PF14524 Wzt_C: Wzt C-terminal domain; PDB: 2R5O_B.
Probab=21.57 E-value=3.1e+02 Score=21.10 Aligned_cols=65 Identities=11% Similarity=0.109 Sum_probs=36.2
Q ss_pred eccCCeEEEEEEEEeCCC---eeEEEEEEeCCCCeeEEEEE-----EEeeecCCCCcEEEEEEEEecCCCCcceEEE
Q 024627 150 IKQGKTYKVVFYIRSLGS---VNILVSLTSSNGLQTLATSN-----IIASASDVSNWTRVETLLEAKETNPNARLQL 218 (265)
Q Consensus 150 v~~G~~Y~~Sf~ar~~~~---~~vtV~L~~~~g~~~las~~-----i~~~~~~~~~W~ky~~~lta~~t~~~a~l~I 218 (265)
+..|+++++.+.++.... ..+.+.+.+.+| +.+...+ ..+. .....+++++++-+..-..++..|
T Consensus 31 ~~~ge~~~i~i~~~~~~~i~~~~~~~~i~~~~g-~~v~~~~t~~~~~~~~---~~~~g~~~~~~~i~~~L~~G~Y~i 103 (142)
T PF14524_consen 31 FESGEPIRIRIDYEVNEDIDDPVFGFAIRDSDG-QRVFGTNTYDSGFPIP---LSEGGTYEVTFTIPKPLNPGEYSI 103 (142)
T ss_dssp EETTSEEEEEEEEEESS-EEEEEEEEEEEETT---EEEEEEHHHHT--EE---E-TT-EEEEEEEEE--B-SEEEEE
T ss_pred EeCCCEEEEEEEEEECCCCCccEEEEEEEcCCC-CEEEEECccccCcccc---ccCCCEEEEEEEEcCccCCCeEEE
Confidence 568999999999998543 357888999999 4444322 1211 012677777776554344444333
No 57
>TIGR03079 CH4_NH3mon_ox_B methane monooxygenase/ammonia monooxygenase, subunit B. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit B of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.
Probab=20.83 E-value=1.3e+02 Score=29.18 Aligned_cols=38 Identities=8% Similarity=0.080 Sum_probs=26.9
Q ss_pred eeeeccCCeEEEEEEEEeCCCe--eEEEEEEeCCCCeeEE
Q 024627 147 GMGIKQGKTYKVVFYIRSLGSV--NILVSLTSSNGLQTLA 184 (265)
Q Consensus 147 Gi~v~~G~~Y~~Sf~ar~~~~~--~vtV~L~~~~g~~~la 184 (265)
-++++.|++|+|.+.+|+...+ .+-.++.=++++-+++
T Consensus 100 S~~LelG~dYefkv~lkaR~pG~~hvh~m~Nv~~~GpiiG 139 (399)
T TIGR03079 100 SGPLEIGRDYEFEVTLQARIPGRHHMHAMLNVKDAGPIAG 139 (399)
T ss_pred eeEeecCCceeEEEEEeeccCCcccceeEEEeccCCCCcC
Confidence 5889999999999999986544 3555555555544443
Done!