Query 044785
Match_columns 346
No_of_seqs 253 out of 2728
Neff 8.0
Searched_HMMs 46136
Date Fri Mar 29 06:44:06 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/044785.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/044785hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF13947 GUB_WAK_bind: Wall-as 100.0 1.4E-28 2.9E-33 196.9 10.5 104 29-134 1-106 (106)
2 PF08488 WAK: Wall-associated 99.6 2.4E-15 5.2E-20 118.8 8.4 73 163-238 1-75 (103)
3 PF07645 EGF_CA: Calcium-bindi 98.6 5.7E-08 1.2E-12 64.0 3.4 37 288-325 1-37 (42)
4 KOG1214 Nidogen and related ba 98.3 6.7E-07 1.5E-11 91.0 5.9 79 242-335 785-866 (1289)
5 KOG1214 Nidogen and related ba 97.8 2.2E-05 4.8E-10 80.2 5.6 80 247-337 700-779 (1289)
6 KOG1219 Uncharacterized conser 97.8 2.9E-05 6.3E-10 86.2 6.1 67 247-325 3909-3975(4289)
7 KOG4289 Cadherin EGF LAG seven 97.5 0.0001 2.2E-09 79.2 4.7 53 266-325 1220-1272(2531)
8 KOG1219 Uncharacterized conser 97.5 0.00023 4.9E-09 79.6 6.8 72 241-325 3865-3936(4289)
9 smart00179 EGF_CA Calcium-bind 97.4 0.00022 4.7E-09 45.4 4.0 33 288-322 1-33 (39)
10 KOG4260 Uncharacterized conser 97.4 9.2E-05 2E-09 67.0 2.1 52 269-324 218-270 (350)
11 PF14670 FXa_inhibition: Coagu 97.3 0.00015 3.2E-09 45.9 2.2 32 297-332 5-36 (36)
12 PF12947 EGF_3: EGF domain; I 97.3 0.00015 3.2E-09 45.9 2.0 31 296-326 4-34 (36)
13 PF12662 cEGF: Complement Clr- 97.2 0.00029 6.2E-09 40.2 2.3 24 267-291 1-24 (24)
14 cd00054 EGF_CA Calcium-binding 96.8 0.0017 3.7E-08 40.6 3.7 35 288-324 1-35 (38)
15 PF00008 EGF: EGF-like domain 96.8 0.00092 2E-08 41.1 2.2 28 297-324 3-31 (32)
16 KOG4260 Uncharacterized conser 96.6 0.0014 3E-08 59.5 2.8 80 229-325 220-307 (350)
17 cd01475 vWA_Matrilin VWA_Matri 96.2 0.0041 8.9E-08 56.1 3.7 38 285-325 183-220 (224)
18 cd00053 EGF Epidermal growth f 96.0 0.011 2.5E-07 36.1 3.8 28 297-324 5-32 (36)
19 PF12947 EGF_3: EGF domain; I 95.9 0.006 1.3E-07 38.5 2.2 30 247-282 6-35 (36)
20 KOG4289 Cadherin EGF LAG seven 95.8 0.018 3.8E-07 62.9 6.4 60 247-320 1245-1308(2531)
21 PF12662 cEGF: Complement Clr- 95.8 0.0099 2.1E-07 33.9 2.5 22 312-335 1-22 (24)
22 smart00181 EGF Epidermal growt 95.6 0.018 4E-07 35.5 3.6 26 298-324 6-31 (35)
23 PF07645 EGF_CA: Calcium-bindi 94.1 0.045 9.8E-07 35.6 2.4 32 242-279 4-36 (42)
24 KOG1217 Fibrillins and related 93.9 0.095 2.1E-06 51.7 5.4 63 253-325 243-305 (487)
25 PF00008 EGF: EGF-like domain 93.2 0.044 9.5E-07 33.5 1.2 28 247-279 4-31 (32)
26 cd00053 EGF Epidermal growth f 91.0 0.39 8.5E-06 28.9 3.7 27 247-279 6-32 (36)
27 smart00179 EGF_CA Calcium-bind 89.7 0.58 1.2E-05 29.1 3.6 30 242-277 4-33 (39)
28 KOG1217 Fibrillins and related 89.3 0.51 1.1E-05 46.5 4.7 54 267-325 151-204 (487)
29 PF06247 Plasmod_Pvs28: Plasmo 88.6 0.2 4.3E-06 43.5 1.0 62 267-334 19-87 (197)
30 smart00181 EGF Epidermal growt 87.5 0.78 1.7E-05 27.9 3.1 25 248-279 7-31 (35)
31 PF12946 EGF_MSP1_1: MSP1 EGF 86.3 0.42 9E-06 30.2 1.3 28 298-325 5-33 (37)
32 cd00054 EGF_CA Calcium-binding 86.0 0.98 2.1E-05 27.6 3.0 32 242-279 4-35 (38)
33 PF07974 EGF_2: EGF-like domai 80.8 2.9 6.3E-05 25.5 3.4 25 298-324 6-30 (32)
34 PF12661 hEGF: Human growth fa 76.1 1.3 2.9E-05 21.3 0.7 12 314-325 1-12 (13)
35 PF14670 FXa_inhibition: Coagu 75.4 1.9 4E-05 27.1 1.4 22 253-280 10-31 (36)
36 KOG1225 Teneurin-1 and related 74.3 5.9 0.00013 40.2 5.4 24 298-325 316-339 (525)
37 PF00954 S_locus_glycop: S-loc 73.2 5.9 0.00013 31.3 4.2 41 230-278 68-108 (110)
38 PF07172 GRP: Glycine rich pro 68.0 5.1 0.00011 31.1 2.7 12 1-12 1-12 (95)
39 KOG0994 Extracellular matrix g 65.1 8.8 0.00019 42.1 4.5 64 267-334 840-908 (1758)
40 PHA03099 epidermal growth fact 63.6 7.3 0.00016 31.7 2.8 37 288-325 41-79 (139)
41 PF00954 S_locus_glycop: S-loc 61.1 9.1 0.0002 30.2 3.0 32 290-324 78-109 (110)
42 KOG1225 Teneurin-1 and related 57.7 14 0.0003 37.6 4.3 22 299-325 344-365 (525)
43 cd01475 vWA_Matrilin VWA_Matri 48.1 14 0.0003 33.0 2.3 22 253-280 199-220 (224)
44 PF05887 Trypan_PARP: Procycli 44.0 7.6 0.00016 31.9 0.0 18 1-18 1-18 (143)
45 PF09064 Tme5_EGF_like: Thromb 43.8 19 0.0004 22.3 1.7 13 313-325 18-30 (34)
46 PF08261 Carcinustatin: Carcin 42.1 13 0.00027 15.3 0.5 6 42-47 3-8 (8)
47 PF14380 WAK_assoc: Wall-assoc 41.8 40 0.00088 25.7 3.8 39 229-275 54-93 (94)
48 PHA02887 EGF-like protein; Pro 40.9 23 0.00051 28.4 2.3 35 290-325 84-120 (126)
49 KOG0994 Extracellular matrix g 40.3 29 0.00064 38.4 3.6 17 267-283 884-901 (1758)
50 PF08685 GON: GON domain; Int 34.7 1.1E+02 0.0024 27.1 5.8 59 96-155 126-185 (201)
51 COG2991 Uncharacterized protei 32.6 73 0.0016 23.3 3.5 16 34-52 31-46 (77)
52 PHA02887 EGF-like protein; Pro 31.1 42 0.00091 27.0 2.3 35 241-280 84-120 (126)
53 PF10916 DUF2712: Protein of u 24.3 77 0.0017 26.4 2.8 16 38-53 32-47 (146)
54 PRK02710 plastocyanin; Provisi 23.5 87 0.0019 25.0 3.0 17 1-17 1-17 (119)
55 smart00051 DSL delta serrate l 22.1 1.3E+02 0.0027 21.4 3.2 46 267-324 16-61 (63)
56 KOG1836 Extracellular matrix g 20.6 90 0.0019 36.6 3.3 52 267-325 755-810 (1705)
No 1
>PF13947 GUB_WAK_bind: Wall-associated receptor kinase galacturonan-binding
Probab=99.95 E-value=1.4e-28 Score=196.90 Aligned_cols=104 Identities=39% Similarity=0.860 Sum_probs=91.0
Q ss_pred CCCCCCCCCCccccCCCCCCCCCCCCCCCeEecCCCCCCCCeecccCCccEEEEEee-cCeEeEeeeeecccccC-Cccc
Q 044785 29 KPGCPSRCGDVEIPYPFGTRPGCFLNKYFVITCNKTHYNPPKPFLRKSNIEVVNITI-DGRMNVMQFVAKECYRK-GNSV 106 (346)
Q Consensus 29 ~~~C~~~CG~v~IpYPFgig~~C~~~~~F~l~C~~~~~~~p~l~l~~~~~~V~~is~-~~~~~v~~~~~~~c~~~-~~~~ 106 (346)
+++||++||||+||||||||++|+++|+|+|+|+++ +++|+|++.+.+|+|++|+| +++++|..++.+.|+.. ....
T Consensus 1 ~~~C~~~CGnv~IpYPFgi~~~C~~~~~F~L~C~~~-~~~~~l~l~~~~~~V~~I~~~~~~i~v~~~~~~~~~~~~~~~~ 79 (106)
T PF13947_consen 1 KPGCPSSCGNVSIPYPFGIGPGCGRDPGFELTCNNN-TSPPKLLLSSGNYEVLSISYENGTIRVSDPISSNCYSSSSSNS 79 (106)
T ss_pred CCCCCCccCCEeecCCCccCCCCCCCCCcEEECCCC-CCCceeEecCCcEEEEEEecCCCEEEEEeccccceecCCCCcc
Confidence 478999999999999999999999999999999988 67899999889999999999 99999999999998876 2222
Q ss_pred ccccceeCCCceEeccCCCEEEEEcccc
Q 044785 107 DSYSPTFSLSKFTVSNTENRFVVIGCDS 134 (346)
Q Consensus 107 ~~~~~~l~~~pf~~s~~~n~~~~~gC~~ 134 (346)
....|++.+ ||.+|+.+|+|+++||++
T Consensus 80 ~~~~~~~~~-~~~~s~~~N~~~~~GC~t 106 (106)
T PF13947_consen 80 SNSNLSLNG-PFFFSSSSNKFTVVGCNT 106 (106)
T ss_pred cccEEeecC-CceEccCCcEEEEECCCC
Confidence 234566666 899998899999999985
No 2
>PF08488 WAK: Wall-associated kinase; InterPro: IPR013695 This domain is found together with the eukaryotic protein kinase domain IPR000719 from INTERPRO in plant wall-associated kinases (WAKs) and related proteins. WAKs are serine-threonine kinases which might be involved in signalling to the cytoplasm and are required for cell expansion. ; GO: 0004674 protein serine/threonine kinase activity, 0016021 integral to membrane
Probab=99.61 E-value=2.4e-15 Score=118.84 Aligned_cols=73 Identities=36% Similarity=0.590 Sum_probs=57.8
Q ss_pred CCCCCCccceeeecCCC-CcceEEEEeeeccCCCcCCCCCcceEEEecCCceeecccccccCCCCC-ccceEEeeeec
Q 044785 163 NGSCVGTGCCQIEIPRG-LKELEVEAFSFNNHTNVSPFNPCTYASVVDKSQFHFSSNYLAWEGTPE-KFPLVLDWEIT 238 (346)
Q Consensus 163 ~~~CsG~gCCq~~ip~~-l~~~~~~~~~~~~~~~~~~~~~c~~a~~~~~~~~~f~~~~~~~~~~~~-~~p~~l~W~i~ 238 (346)
+++|+|++|||++||.+ ++.|.+++...+.... ..++|++|||+|+.||.+...+.. .+.+. .+||+|+|.|+
T Consensus 1 n~~CsG~~CCQt~iP~~~~qv~~~~i~~~~~~~~--~~~~Ck~AFL~d~~~~~~n~t~p~-~~~~~~y~pv~L~W~i~ 75 (103)
T PF08488_consen 1 NSSCSGIGCCQTSIPSGLLQVFNVTIESTDGNNQ--TSNGCKVAFLVDEDWFSSNITDPE-DFHAMGYVPVVLDWFID 75 (103)
T ss_pred CCccCCcceeccCCCCCCceEEEEEeEecCCCcc--cCCCceEEEEeccccccccCCChH-HhccCCeeEEEEEEEEe
Confidence 36899999999999996 5789999987765332 458999999999999987655432 34443 69999999997
No 3
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, with disulphide bonds C1-C3, C2-C4 and C5-C6; 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=98.56 E-value=5.7e-08 Score=64.00 Aligned_cols=37 Identities=35% Similarity=0.919 Sum_probs=32.6
Q ss_pred eCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 288 DVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 288 dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
|||||.. ..+.|...+.|+|+.|+|.|.|++||+.+.
T Consensus 1 DidEC~~-~~~~C~~~~~C~N~~Gsy~C~C~~Gy~~~~ 37 (42)
T PF07645_consen 1 DIDECAE-GPHNCPENGTCVNTEGSYSCSCPPGYELND 37 (42)
T ss_dssp ESSTTTT-TSSSSSTTSEEEEETTEEEEEESTTEEECT
T ss_pred CccccCC-CCCcCCCCCEEEcCCCCEEeeCCCCcEECC
Confidence 7999976 558998889999999999999999999443
No 4
>KOG1214 consensus Nidogen and related basement membrane proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=98.33 E-value=6.7e-07 Score=90.97 Aligned_cols=79 Identities=41% Similarity=1.053 Sum_probs=61.4
Q ss_pred cCccC-CCCCCC--CeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecC
Q 044785 242 TCEEA-KICGLN--ASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCR 318 (346)
Q Consensus 242 ~C~~~-~~C~~~--s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~ 318 (346)
.|++. +.|..+ +.|... + +..|.|.|.+||.|++.. |.|+||| +...|++...|.|++|++.|+|.
T Consensus 785 ~Ce~g~h~C~i~g~a~c~~h-G----gs~y~C~CLPGfsGDG~~---c~dvDeC---~psrChp~A~CyntpgsfsC~C~ 853 (1289)
T KOG1214|consen 785 PCEDGSHTCAIAGQARCVHH-G----GSTYSCACLPGFSGDGHQ---CTDVDEC---SPSRCHPAATCYNTPGSFSCRCQ 853 (1289)
T ss_pred ccccCccccCcCCceEEEec-C----CceEEEeecCCccCCccc---ccccccc---CccccCCCceEecCCCcceeecc
Confidence 34443 456633 344333 2 347999999999999766 9999999 45789999999999999999999
Q ss_pred CCCccCCCCCCCCceeC
Q 044785 319 KGFHGDGTKDGRGCIPN 335 (346)
Q Consensus 319 ~G~~~~~~~~~~~C~~~ 335 (346)
+||.+++.. |+|.
T Consensus 854 pGy~GDGf~----CVP~ 866 (1289)
T KOG1214|consen 854 PGYYGDGFQ----CVPD 866 (1289)
T ss_pred cCccCCCce----ecCC
Confidence 999999855 7664
No 5
>KOG1214 consensus Nidogen and related basement membrane proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=97.84 E-value=2.2e-05 Score=80.25 Aligned_cols=80 Identities=35% Similarity=0.897 Sum_probs=66.4
Q ss_pred CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCCC
Q 044785 247 KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDGT 326 (346)
Q Consensus 247 ~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~~ 326 (346)
..|..++.|....+ ..|+|.|..||.|.+.. |.|++||+. ..+.|.++..|+|.+|+|+|+|..||...+
T Consensus 700 h~cdt~a~C~pg~~-----~~~tcecs~g~~gdgr~---c~d~~eca~-~~~~CGp~s~Cin~pg~~rceC~~gy~F~d- 769 (1289)
T KOG1214|consen 700 HMCDTTARCHPGTG-----VDYTCECSSGYQGDGRN---CVDENECAT-GFHRCGPNSVCINLPGSYRCECRSGYEFAD- 769 (1289)
T ss_pred cccCCCccccCCCC-----cceEEEEeeccCCCCCC---CCChhhhcc-CCCCCCCCceeecCCCceeEEEeecceecc-
Confidence 34666677876554 37999999999999866 999999976 678899999999999999999999999877
Q ss_pred CCCCCceeCCC
Q 044785 327 KDGRGCIPNQN 337 (346)
Q Consensus 327 ~~~~~C~~~~~ 337 (346)
++..|++..+
T Consensus 770 -d~~tCV~i~~ 779 (1289)
T KOG1214|consen 770 -DRHTCVLITP 779 (1289)
T ss_pred -CCcceEEecC
Confidence 4567887655
No 6
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=97.80 E-value=2.9e-05 Score=86.21 Aligned_cols=67 Identities=34% Similarity=0.796 Sum_probs=55.2
Q ss_pred CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 247 KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 247 ~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
.+|.....|....+ +|.|.|+.||.|..+... +|+|| +.++|..++.|.|++|+|+|.|-+|+.+..
T Consensus 3909 nPC~~GgtCip~~n------~f~CnC~~gyTG~~Ce~~---Gi~eC---s~n~C~~gg~C~n~~gsf~CncT~g~~gr~ 3975 (4289)
T KOG1219|consen 3909 NPCLTGGTCIPFYN------GFLCNCPNGYTGKRCEAR---GISEC---SKNVCGTGGQCINIPGSFHCNCTPGILGRT 3975 (4289)
T ss_pred CCCCCCCEEEecCC------CeeEeCCCCccCceeecc---ccccc---ccccccCCceeeccCCceEeccChhHhccc
Confidence 56777778876655 899999999998765432 38999 578999999999999999999999998654
No 7
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=97.50 E-value=0.0001 Score=79.20 Aligned_cols=53 Identities=32% Similarity=0.887 Sum_probs=46.6
Q ss_pred CCeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 266 SGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 266 ~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
.+.+|+|++||.|+-.. +.||+| -..||.+++.|....|+|.|.|++||+|..
T Consensus 1220 nglrCrCPpGFTgd~Ce----TeiDlC---Ys~pC~nng~C~srEggYtCeCrpg~tGeh 1272 (2531)
T KOG4289|consen 1220 NGLRCRCPPGFTGDYCE----TEIDLC---YSGPCGNNGRCRSREGGYTCECRPGFTGEH 1272 (2531)
T ss_pred CceeEeCCCCCCccccc----chhHhh---hcCCCCCCCceEEecCceeEEecCCccccc
Confidence 58999999999997332 679999 468999999999999999999999999865
No 8
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=97.46 E-value=0.00023 Score=79.56 Aligned_cols=72 Identities=28% Similarity=0.766 Sum_probs=59.7
Q ss_pred ccCccCCCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCC
Q 044785 241 ETCEEAKICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKG 320 (346)
Q Consensus 241 ~~C~~~~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G 320 (346)
+.|.+ .+|.....|..... +||.|.|+..|.|+-+. .+++.| ..+||..++.|.-..++|.|.||.|
T Consensus 3865 d~C~~-npCqhgG~C~~~~~-----ggy~CkCpsqysG~~CE----i~~epC---~snPC~~GgtCip~~n~f~CnC~~g 3931 (4289)
T KOG1219|consen 3865 DPCND-NPCQHGGTCISQPK-----GGYKCKCPSQYSGNHCE----IDLEPC---ASNPCLTGGTCIPFYNGFLCNCPNG 3931 (4289)
T ss_pred ccccc-CcccCCCEecCCCC-----CceEEeCcccccCcccc----cccccc---cCCCCCCCCEEEecCCCeeEeCCCC
Confidence 34554 57887788876544 49999999999998544 688999 4689999999999999999999999
Q ss_pred CccCC
Q 044785 321 FHGDG 325 (346)
Q Consensus 321 ~~~~~ 325 (346)
|+|..
T Consensus 3932 yTG~~ 3936 (4289)
T KOG1219|consen 3932 YTGKR 3936 (4289)
T ss_pred ccCce
Confidence 99865
No 9
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.43 E-value=0.00022 Score=45.40 Aligned_cols=33 Identities=39% Similarity=0.982 Sum_probs=28.1
Q ss_pred eCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCc
Q 044785 288 DVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFH 322 (346)
Q Consensus 288 dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~ 322 (346)
|++||.. . .+|..++.|.|+.|+|.|.|++||.
T Consensus 1 d~~~C~~-~-~~C~~~~~C~~~~g~~~C~C~~g~~ 33 (39)
T smart00179 1 DIDECAS-G-NPCQNGGTCVNTVGSYRCECPPGYT 33 (39)
T ss_pred CcccCcC-C-CCcCCCCEeECCCCCeEeECCCCCc
Confidence 4788943 2 6898778999999999999999998
No 10
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=97.36 E-value=9.2e-05 Score=66.96 Aligned_cols=52 Identities=35% Similarity=0.933 Sum_probs=45.2
Q ss_pred ee-ecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 269 HC-KCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 269 ~C-~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
.| +|+.||... ..+|.|||||.. ...+|..+..|+|+.|+|.|.+++||...
T Consensus 218 ~C~kCkkGW~ld---e~gCvDvnEC~~-ep~~c~~~qfCvNteGSf~C~dk~Gy~~g 270 (350)
T KOG4260|consen 218 GCSKCKKGWKLD---EEGCVDVNECQN-EPAPCKAHQFCVNTEGSFKCEDKEGYKKG 270 (350)
T ss_pred Chhhhcccceec---ccccccHHHHhc-CCCCCChhheeecCCCceEecccccccCC
Confidence 35 689999876 458999999976 67899999999999999999999999873
No 11
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=97.33 E-value=0.00015 Score=45.87 Aligned_cols=32 Identities=50% Similarity=1.156 Sum_probs=24.7
Q ss_pred CCCCCCCCeeEecCCCeEEecCCCCccCCCCCCCCc
Q 044785 297 LNNCTRTHICDNIPGSYTCRCRKGFHGDGTKDGRGC 332 (346)
Q Consensus 297 ~~~C~~~~~C~nt~G~y~C~C~~G~~~~~~~~~~~C 332 (346)
...| .+.|+|++|+|+|.|++||..+. |++.|
T Consensus 5 NGgC--~h~C~~~~g~~~C~C~~Gy~L~~--D~~tC 36 (36)
T PF14670_consen 5 NGGC--SHICVNTPGSYRCSCPPGYKLAE--DGRTC 36 (36)
T ss_dssp GGGS--SSEEEEETTSEEEE-STTEEE-T--TSSSE
T ss_pred CCCc--CCCCccCCCceEeECCCCCEECc--CCCCC
Confidence 3567 79999999999999999999887 44544
No 12
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=97.31 E-value=0.00015 Score=45.89 Aligned_cols=31 Identities=42% Similarity=0.999 Sum_probs=25.0
Q ss_pred CCCCCCCCCeeEecCCCeEEecCCCCccCCC
Q 044785 296 SLNNCTRTHICDNIPGSYTCRCRKGFHGDGT 326 (346)
Q Consensus 296 ~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~~ 326 (346)
..+.|+.+..|.|+.++|.|.|++||.+++.
T Consensus 4 ~~~~C~~nA~C~~~~~~~~C~C~~Gy~GdG~ 34 (36)
T PF12947_consen 4 NNGGCHPNATCTNTGGSYTCTCKPGYEGDGF 34 (36)
T ss_dssp GGGGS-TTCEEEE-TTSEEEEE-CEEECCST
T ss_pred CCCCCCCCcEeecCCCCEEeECCCCCccCCc
Confidence 4567888999999999999999999999884
No 13
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=97.21 E-value=0.00029 Score=40.20 Aligned_cols=24 Identities=38% Similarity=0.964 Sum_probs=20.1
Q ss_pred CeeeecCCCcccCCCCCCCceeCcc
Q 044785 267 GYHCKCNEGYEGNPYLSDGCQDVNE 291 (346)
Q Consensus 267 gy~C~C~~Gy~gnp~~~~gC~dide 291 (346)
+|+|.|++||..++.. ..|+||||
T Consensus 1 sy~C~C~~Gy~l~~d~-~~C~DIdE 24 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDG-RSCEDIDE 24 (24)
T ss_pred CEEeeCCCCCcCCCCC-CccccCCC
Confidence 5899999999977643 57999997
No 14
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=96.83 E-value=0.0017 Score=40.61 Aligned_cols=35 Identities=40% Similarity=0.980 Sum_probs=28.4
Q ss_pred eCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 288 DVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 288 dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
++++|.. . .+|..++.|.++.++|.|.|++||.+.
T Consensus 1 ~~~~C~~-~-~~C~~~~~C~~~~~~~~C~C~~g~~g~ 35 (38)
T cd00054 1 DIDECAS-G-NPCQNGGTCVNTVGSYRCSCPPGYTGR 35 (38)
T ss_pred CcccCCC-C-CCcCCCCEeECCCCCeEeECCCCCcCC
Confidence 3678842 1 688778899999999999999999863
No 15
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=96.81 E-value=0.00092 Score=41.07 Aligned_cols=28 Identities=39% Similarity=1.041 Sum_probs=25.2
Q ss_pred CCCCCCCCeeEecC-CCeEEecCCCCccC
Q 044785 297 LNNCTRTHICDNIP-GSYTCRCRKGFHGD 324 (346)
Q Consensus 297 ~~~C~~~~~C~nt~-G~y~C~C~~G~~~~ 324 (346)
.++|.++++|++.. ++|.|.|++||.+.
T Consensus 3 ~~~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 3 SNPCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp TTSSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred CCcCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 45898899999999 99999999999875
No 16
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=96.62 E-value=0.0014 Score=59.48 Aligned_cols=80 Identities=26% Similarity=0.637 Sum_probs=57.5
Q ss_pred cceEEeeeecccccCccC-------CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeCccCCCCCCCCC-
Q 044785 229 FPLVLDWEITTKETCEEA-------KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNC- 300 (346)
Q Consensus 229 ~p~~l~W~i~~~~~C~~~-------~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C- 300 (346)
....-+|.++ ...|.+. ..|..+..|+|..+ +|.|.+++||++. +|||.. -...|
T Consensus 220 ~kCkkGW~ld-e~gCvDvnEC~~ep~~c~~~qfCvNteG------Sf~C~dk~Gy~~g---------~d~C~~-~~d~~~ 282 (350)
T KOG4260|consen 220 SKCKKGWKLD-EEGCVDVNECQNEPAPCKAHQFCVNTEG------SFKCEDKEGYKKG---------VDECQF-CADVCA 282 (350)
T ss_pred hhhcccceec-ccccccHHHHhcCCCCCChhheeecCCC------ceEecccccccCC---------hHHhhh-hhhhcc
Confidence 3456789988 6777654 45777788998877 8999999999863 455532 11223
Q ss_pred CCCCeeEecCCCeEEecCCCCccCC
Q 044785 301 TRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 301 ~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
..+..|.|+.|.|+|.|..|+..-.
T Consensus 283 ~kn~~c~ni~~~~r~v~f~~~~~~~ 307 (350)
T KOG4260|consen 283 SKNRPCMNIDGQYRCVCFSGLIIIE 307 (350)
T ss_pred cCCCCcccCCccEEEEecccceeee
Confidence 2357889999999999999876543
No 17
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=96.25 E-value=0.0041 Score=56.06 Aligned_cols=38 Identities=29% Similarity=0.687 Sum_probs=33.1
Q ss_pred CceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 285 GCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 285 gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
-|.+++||.. ..+.| .+.|.|+.|+|.|.|+.||..++
T Consensus 183 ~C~~~~~C~~-~~~~c--~~~C~~~~g~~~c~c~~g~~~~~ 220 (224)
T cd01475 183 ICVVPDLCAT-LSHVC--QQVCISTPGSYLCACTEGYALLE 220 (224)
T ss_pred cCcCchhhcC-CCCCc--cceEEcCCCCEEeECCCCccCCC
Confidence 5889999965 56788 67999999999999999998765
No 18
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.00 E-value=0.011 Score=36.09 Aligned_cols=28 Identities=46% Similarity=1.058 Sum_probs=24.6
Q ss_pred CCCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 297 LNNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 297 ~~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
..+|.+++.|+++.++|.|.|+.||.++
T Consensus 5 ~~~C~~~~~C~~~~~~~~C~C~~g~~g~ 32 (36)
T cd00053 5 SNPCSNGGTCVNTPGSYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCCEEecCCCCeEeECCCCCccc
Confidence 4578767999999999999999999876
No 19
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=95.92 E-value=0.006 Score=38.53 Aligned_cols=30 Identities=37% Similarity=0.939 Sum_probs=23.1
Q ss_pred CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCCCC
Q 044785 247 KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNPYL 282 (346)
Q Consensus 247 ~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~ 282 (346)
..|..++.|.+..+ .|.|.|++||.|+++.
T Consensus 6 ~~C~~nA~C~~~~~------~~~C~C~~Gy~GdG~~ 35 (36)
T PF12947_consen 6 GGCHPNATCTNTGG------SYTCTCKPGYEGDGFF 35 (36)
T ss_dssp GGS-TTCEEEE-TT------SEEEEE-CEEECCSTC
T ss_pred CCCCCCcEeecCCC------CEEeECCCCCccCCcC
Confidence 35778999998876 8999999999998764
No 20
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=95.83 E-value=0.018 Score=62.88 Aligned_cols=60 Identities=32% Similarity=0.903 Sum_probs=48.4
Q ss_pred CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeC---ccCCCCCCCCCCCCCeeEec-CCCeEEecCCC
Q 044785 247 KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDV---NECEDPSLNNCTRTHICDNI-PGSYTCRCRKG 320 (346)
Q Consensus 247 ~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~di---deC~~~~~~~C~~~~~C~nt-~G~y~C~C~~G 320 (346)
.+|.+|+.|....+ +|+|.|++||.|.- |+-- ..| .+..|.+++.|+|. .|++.|.||.|
T Consensus 1245 ~pC~nng~C~srEg------gYtCeCrpg~tGeh-----CEvs~~agrC---vpGvC~nggtC~~~~nggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1245 GPCGNNGRCRSREG------GYTCECRPGFTGEH-----CEVSARAGRC---VPGVCKNGGTCVNLLNGGFCCHCPYG 1308 (2531)
T ss_pred CCCCCCCceEEecC------ceeEEecCCccccc-----eeeecccCcc---ccceecCCCEEeecCCCceeccCCCc
Confidence 57899999988777 99999999999863 4321 235 45688889999996 67899999999
No 21
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=95.78 E-value=0.0099 Score=33.91 Aligned_cols=22 Identities=45% Similarity=0.952 Sum_probs=17.9
Q ss_pred CeEEecCCCCccCCCCCCCCceeC
Q 044785 312 SYTCRCRKGFHGDGTKDGRGCIPN 335 (346)
Q Consensus 312 ~y~C~C~~G~~~~~~~~~~~C~~~ 335 (346)
||+|.|++||+.++ +++.|+.+
T Consensus 1 sy~C~C~~Gy~l~~--d~~~C~DI 22 (24)
T PF12662_consen 1 SYTCSCPPGYQLSP--DGRSCEDI 22 (24)
T ss_pred CEEeeCCCCCcCCC--CCCccccC
Confidence 69999999999877 45778754
No 22
>smart00181 EGF Epidermal growth factor-like domain.
Probab=95.61 E-value=0.018 Score=35.51 Aligned_cols=26 Identities=50% Similarity=1.187 Sum_probs=22.8
Q ss_pred CCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 298 NNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 298 ~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
.+|..+ .|.++.++|.|.|++||.++
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~ 31 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTGD 31 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCccC
Confidence 678655 99999999999999999874
No 23
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, with disulphide bonds C1-C3, C2-C4 and C5-C6; 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=94.06 E-value=0.045 Score=35.65 Aligned_cols=32 Identities=38% Similarity=0.885 Sum_probs=26.1
Q ss_pred cCccC-CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccC
Q 044785 242 TCEEA-KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGN 279 (346)
Q Consensus 242 ~C~~~-~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gn 279 (346)
.|... ..|..++.|+|..+ +|+|.|++||..+
T Consensus 4 EC~~~~~~C~~~~~C~N~~G------sy~C~C~~Gy~~~ 36 (42)
T PF07645_consen 4 ECAEGPHNCPENGTCVNTEG------SYSCSCPPGYELN 36 (42)
T ss_dssp TTTTTSSSSSTTSEEEEETT------EEEEEESTTEEEC
T ss_pred ccCCCCCcCCCCCEEEcCCC------CEEeeCCCCcEEC
Confidence 46554 57888899999987 9999999999843
No 24
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=93.86 E-value=0.095 Score=51.71 Aligned_cols=63 Identities=38% Similarity=0.934 Sum_probs=51.5
Q ss_pred CeeeCCCCCCCCCCCeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 253 ASCHKPKDNTTTSSGYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 253 s~C~~~~~~~~~~~gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
..|.+..+ +|+|.|++||.+... ..|.++++|.. ... |.+++.|.+..+.|.|.|++||.+..
T Consensus 243 ~~c~~~~~------~~~C~~~~g~~~~~~--~~~~~~~~C~~-~~~-c~~~~~C~~~~~~~~C~C~~g~~g~~ 305 (487)
T KOG1217|consen 243 GTCVNTVG------SYTCRCPEGYTGDAC--VTCVDVDSCAL-IAS-CPNGGTCVNVPGSYRCTCPPGFTGRL 305 (487)
T ss_pred CcccccCC------ceeeeCCCCcccccc--ceeeeccccCC-CCc-cCCCCeeecCCCcceeeCCCCCCCCC
Confidence 66766655 799999999998752 24789999964 323 88889999999999999999999876
No 25
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=93.25 E-value=0.044 Score=33.52 Aligned_cols=28 Identities=32% Similarity=0.882 Sum_probs=22.9
Q ss_pred CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccC
Q 044785 247 KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGN 279 (346)
Q Consensus 247 ~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gn 279 (346)
.+|.+++.|++... .+|.|.|++||.|.
T Consensus 4 ~~C~n~g~C~~~~~-----~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 4 NPCQNGGTCIDLPG-----GGYTCECPPGYTGK 31 (32)
T ss_dssp TSSTTTEEEEEEST-----SEEEEEEBTTEEST
T ss_pred CcCCCCeEEEeCCC-----CCEEeECCCCCccC
Confidence 37888899998762 28999999999874
No 26
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=91.04 E-value=0.39 Score=28.91 Aligned_cols=27 Identities=30% Similarity=0.879 Sum_probs=22.5
Q ss_pred CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccC
Q 044785 247 KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGN 279 (346)
Q Consensus 247 ~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gn 279 (346)
..|..++.|.+..+ +|.|.|+.||.++
T Consensus 6 ~~C~~~~~C~~~~~------~~~C~C~~g~~g~ 32 (36)
T cd00053 6 NPCSNGGTCVNTPG------SYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCEEecCCC------CeEeECCCCCccc
Confidence 46777789987766 8999999999886
No 27
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=89.71 E-value=0.58 Score=29.09 Aligned_cols=30 Identities=27% Similarity=0.832 Sum_probs=22.9
Q ss_pred cCccCCCCCCCCeeeCCCCCCCCCCCeeeecCCCcc
Q 044785 242 TCEEAKICGLNASCHKPKDNTTTSSGYHCKCNEGYE 277 (346)
Q Consensus 242 ~C~~~~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~ 277 (346)
.|.....|..++.|.+..+ +|.|.|+.||.
T Consensus 4 ~C~~~~~C~~~~~C~~~~g------~~~C~C~~g~~ 33 (39)
T smart00179 4 ECASGNPCQNGGTCVNTVG------SYRCECPPGYT 33 (39)
T ss_pred cCcCCCCcCCCCEeECCCC------CeEeECCCCCc
Confidence 4543236777778988766 89999999998
No 28
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=89.26 E-value=0.51 Score=46.48 Aligned_cols=54 Identities=39% Similarity=0.938 Sum_probs=44.2
Q ss_pred CeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 267 GYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 267 gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
.++|.|..||.+.+.. .+.++|.. ..++|.+++.|.|..++|.|.|+++|.+..
T Consensus 151 ~~~c~C~~g~~~~~~~----~~~~~C~~-~~~~c~~~~~C~~~~~~~~C~c~~~~~~~~ 204 (487)
T KOG1217|consen 151 PFRCSCTEGYEGEPCE----TDLDECIQ-YSSPCQNGGTCVNTGGSYLCSCPPGYTGST 204 (487)
T ss_pred ceeeeeCCCccccccc----cccccccc-CCCCcCCCcccccCCCCeeEeCCCCccCCc
Confidence 6899999999987644 23378853 456788889999999999999999998865
No 29
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=88.59 E-value=0.2 Score=43.48 Aligned_cols=62 Identities=29% Similarity=0.723 Sum_probs=43.0
Q ss_pred CeeeecCCCcccCCCCCCCceeCccCCCC--CCCCCCCCCeeEecC-----CCeEEecCCCCccCCCCCCCCcee
Q 044785 267 GYHCKCNEGYEGNPYLSDGCQDVNECEDP--SLNNCTRTHICDNIP-----GSYTCRCRKGFHGDGTKDGRGCIP 334 (346)
Q Consensus 267 gy~C~C~~Gy~gnp~~~~gC~dideC~~~--~~~~C~~~~~C~nt~-----G~y~C~C~~G~~~~~~~~~~~C~~ 334 (346)
.|.|+|.+||..- ..+.|+...+|..+ ...+|...+.|.+.. ..|.|.|.+||...... |.|
T Consensus 19 HfEC~Cnegfvl~--~EntCE~kv~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~~~~~v----Cvp 87 (197)
T PF06247_consen 19 HFECKCNEGFVLK--NENTCEEKVECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGYILKQGV----CVP 87 (197)
T ss_dssp EEEEEESTTEEEE--ETTEEEE----SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTEEESSSS----EEE
T ss_pred ceEEEcCCCcEEc--cccccccceecCcccccCccccchhhhhcCCCcccceeEEEecccCceeeCCe----Ech
Confidence 7999999999743 24579998899642 346788789999876 68999999999987644 765
No 30
>smart00181 EGF Epidermal growth factor-like domain.
Probab=87.54 E-value=0.78 Score=27.92 Aligned_cols=25 Identities=32% Similarity=0.941 Sum_probs=20.4
Q ss_pred CCCCCCeeeCCCCCCCCCCCeeeecCCCcccC
Q 044785 248 ICGLNASCHKPKDNTTTSSGYHCKCNEGYEGN 279 (346)
Q Consensus 248 ~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gn 279 (346)
.|..+ .|.+..+ +|+|.|+.||.++
T Consensus 7 ~C~~~-~C~~~~~------~~~C~C~~g~~g~ 31 (35)
T smart00181 7 PCSNG-TCINTPG------SYTCSCPPGYTGD 31 (35)
T ss_pred CCCCC-EEECCCC------CeEeECCCCCccC
Confidence 56666 8887755 8999999999985
No 31
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=86.26 E-value=0.42 Score=30.20 Aligned_cols=28 Identities=32% Similarity=0.687 Sum_probs=21.1
Q ss_pred CCCCCCCeeEecC-CCeEEecCCCCccCC
Q 044785 298 NNCTRTHICDNIP-GSYTCRCRKGFHGDG 325 (346)
Q Consensus 298 ~~C~~~~~C~nt~-G~y~C~C~~G~~~~~ 325 (346)
..|..+..|.+.. |++.|+|..||..++
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk~~~ 33 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYKKVG 33 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEEEET
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCccccC
Confidence 4566689999987 999999999998876
No 32
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=86.04 E-value=0.98 Score=27.56 Aligned_cols=32 Identities=28% Similarity=0.840 Sum_probs=23.5
Q ss_pred cCccCCCCCCCCeeeCCCCCCCCCCCeeeecCCCcccC
Q 044785 242 TCEEAKICGLNASCHKPKDNTTTSSGYHCKCNEGYEGN 279 (346)
Q Consensus 242 ~C~~~~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gn 279 (346)
.|.....|..++.|.+..+ +|.|.|..||.|.
T Consensus 4 ~C~~~~~C~~~~~C~~~~~------~~~C~C~~g~~g~ 35 (38)
T cd00054 4 ECASGNPCQNGGTCVNTVG------SYRCSCPPGYTGR 35 (38)
T ss_pred cCCCCCCcCCCCEeECCCC------CeEeECCCCCcCC
Confidence 3543235777778987766 8999999999873
No 33
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=80.80 E-value=2.9 Score=25.48 Aligned_cols=25 Identities=28% Similarity=0.688 Sum_probs=20.3
Q ss_pred CCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 298 NNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 298 ~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
..|.++++|++. ..+|.|.+||.+.
T Consensus 6 ~~C~~~G~C~~~--~g~C~C~~g~~G~ 30 (32)
T PF07974_consen 6 NICSGHGTCVSP--CGRCVCDSGYTGP 30 (32)
T ss_pred CccCCCCEEeCC--CCEEECCCCCcCC
Confidence 357788999876 4689999999875
No 34
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=76.08 E-value=1.3 Score=21.30 Aligned_cols=12 Identities=42% Similarity=1.157 Sum_probs=8.8
Q ss_pred EEecCCCCccCC
Q 044785 314 TCRCRKGFHGDG 325 (346)
Q Consensus 314 ~C~C~~G~~~~~ 325 (346)
.|.|++||.|..
T Consensus 1 ~C~C~~G~~G~~ 12 (13)
T PF12661_consen 1 TCQCPPGWTGPN 12 (13)
T ss_dssp EEEE-TTEETTT
T ss_pred CccCcCCCcCCC
Confidence 489999998753
No 35
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=75.37 E-value=1.9 Score=27.15 Aligned_cols=22 Identities=27% Similarity=0.730 Sum_probs=16.0
Q ss_pred CeeeCCCCCCCCCCCeeeecCCCcccCC
Q 044785 253 ASCHKPKDNTTTSSGYHCKCNEGYEGNP 280 (346)
Q Consensus 253 s~C~~~~~~~~~~~gy~C~C~~Gy~gnp 280 (346)
..|.+..+ +|+|.|++||...+
T Consensus 10 h~C~~~~g------~~~C~C~~Gy~L~~ 31 (36)
T PF14670_consen 10 HICVNTPG------SYRCSCPPGYKLAE 31 (36)
T ss_dssp SEEEEETT------SEEEE-STTEEE-T
T ss_pred CCCccCCC------ceEeECCCCCEECc
Confidence 46777765 89999999998764
No 36
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=74.33 E-value=5.9 Score=40.23 Aligned_cols=24 Identities=29% Similarity=0.855 Sum_probs=13.1
Q ss_pred CCCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 298 NNCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 298 ~~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
..|+++++|+ .| +|.|.+||.++.
T Consensus 316 adC~g~G~Ci--~G--~C~C~~Gy~G~~ 339 (525)
T KOG1225|consen 316 ADCSGHGKCI--DG--ECLCDEGYTGEL 339 (525)
T ss_pred ccCCCCCccc--CC--ceEeCCCCcCCc
Confidence 3555566665 22 566666665543
No 37
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=73.24 E-value=5.9 Score=31.27 Aligned_cols=41 Identities=27% Similarity=0.725 Sum_probs=30.6
Q ss_pred ceEEeeeecccccCccCCCCCCCCeeeCCCCCCCCCCCeeeecCCCccc
Q 044785 230 PLVLDWEITTKETCEEAKICGLNASCHKPKDNTTTSSGYHCKCNEGYEG 278 (346)
Q Consensus 230 p~~l~W~i~~~~~C~~~~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~g 278 (346)
...+-|... .+.|+.-..|..++.|... . ...|.|.+||+.
T Consensus 68 ~W~~~~~~p-~d~Cd~y~~CG~~g~C~~~-~------~~~C~Cl~GF~P 108 (110)
T PF00954_consen 68 SWSVFWSAP-KDQCDVYGFCGPNGICNSN-N------SPKCSCLPGFEP 108 (110)
T ss_pred cEEEEEEec-ccCCCCccccCCccEeCCC-C------CCceECCCCcCC
Confidence 344567777 6789887789999999432 2 456999999974
No 38
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=68.02 E-value=5.1 Score=31.08 Aligned_cols=12 Identities=25% Similarity=0.119 Sum_probs=5.5
Q ss_pred CcchhHHHHHHH
Q 044785 1 MPFRLTTKLVVL 12 (346)
Q Consensus 1 M~~~~~~~l~~~ 12 (346)
|....+++|.++
T Consensus 1 MaSK~~llL~l~ 12 (95)
T PF07172_consen 1 MASKAFLLLGLL 12 (95)
T ss_pred CchhHHHHHHHH
Confidence 654444444333
No 39
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=65.08 E-value=8.8 Score=42.14 Aligned_cols=64 Identities=36% Similarity=0.880 Sum_probs=41.6
Q ss_pred Ceee-ecCCCcccCCCCCC-Ccee-CccCCCCCCCCCCCCCeeEecCCCeEE-ecCCCCccCCCC-CCCCcee
Q 044785 267 GYHC-KCNEGYEGNPYLSD-GCQD-VNECEDPSLNNCTRTHICDNIPGSYTC-RCRKGFHGDGTK-DGRGCIP 334 (346)
Q Consensus 267 gy~C-~C~~Gy~gnp~~~~-gC~d-ideC~~~~~~~C~~~~~C~nt~G~y~C-~C~~G~~~~~~~-~~~~C~~ 334 (346)
+.+| +|.+||.|.|-..- .|.+ -|+| ++....| -.|.+..+++.| +|-.||.|++.. .+.+|.|
T Consensus 840 grqCnqCqpG~WgFPeCr~CqCNgHA~~C-d~~tGaC---i~CqD~T~G~~CdrCl~GyyGdP~lg~g~~CrP 908 (1758)
T KOG0994|consen 840 GRQCNQCQPGYWGFPECRPCQCNGHADTC-DPITGAC---IDCQDSTTGHSCDRCLDGYYGDPRLGSGIGCRP 908 (1758)
T ss_pred hhhccccCCCccCCCcCccccccCccccc-Ccccccc---ccccccccccchhhhhccccCCcccCCCCCCCC
Confidence 5566 68888888774421 2222 3566 3344555 357777888999 899999999976 2344543
No 40
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=63.62 E-value=7.3 Score=31.75 Aligned_cols=37 Identities=22% Similarity=0.521 Sum_probs=26.0
Q ss_pred eCccCCCCCCCCCCCCCeeEecC--CCeEEecCCCCccCC
Q 044785 288 DVNECEDPSLNNCTRTHICDNIP--GSYTCRCRKGFHGDG 325 (346)
Q Consensus 288 dideC~~~~~~~C~~~~~C~nt~--G~y~C~C~~G~~~~~ 325 (346)
+|.+|.....+.|.+ +.|.-.. ..+.|+|+.||.|..
T Consensus 41 ~i~~Cp~ey~~YClH-G~C~yI~dl~~~~CrC~~GYtGeR 79 (139)
T PHA03099 41 AIRLCGPEGDGYCLH-GDCIHARDIDGMYCRCSHGYTGIR 79 (139)
T ss_pred ccccCChhhCCEeEC-CEEEeeccCCCceeECCCCccccc
Confidence 455665434456763 4887654 788999999999865
No 41
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=61.08 E-value=9.1 Score=30.16 Aligned_cols=32 Identities=31% Similarity=0.751 Sum_probs=23.2
Q ss_pred ccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 290 NECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 290 deC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
|+|. .-..|++.+.| +......|.|.+||+..
T Consensus 78 d~Cd--~y~~CG~~g~C-~~~~~~~C~Cl~GF~P~ 109 (110)
T PF00954_consen 78 DQCD--VYGFCGPNGIC-NSNNSPKCSCLPGFEPK 109 (110)
T ss_pred cCCC--CccccCCccEe-CCCCCCceECCCCcCCC
Confidence 4663 24578888999 44556779999999753
No 42
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=57.66 E-value=14 Score=37.60 Aligned_cols=22 Identities=41% Similarity=1.131 Sum_probs=16.8
Q ss_pred CCCCCCeeEecCCCeEEecCCCCccCC
Q 044785 299 NCTRTHICDNIPGSYTCRCRKGFHGDG 325 (346)
Q Consensus 299 ~C~~~~~C~nt~G~y~C~C~~G~~~~~ 325 (346)
.|.+++.|+| | |.|..||++..
T Consensus 344 ~C~~~g~cv~--g---C~C~~Gw~G~d 365 (525)
T KOG1225|consen 344 ACSGGGQCVN--G---CKCKKGWRGPD 365 (525)
T ss_pred ccCCCceecc--C---ceeccCccCCC
Confidence 3777788876 2 99999999755
No 43
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=48.10 E-value=14 Score=33.03 Aligned_cols=22 Identities=32% Similarity=0.717 Sum_probs=17.3
Q ss_pred CeeeCCCCCCCCCCCeeeecCCCcccCC
Q 044785 253 ASCHKPKDNTTTSSGYHCKCNEGYEGNP 280 (346)
Q Consensus 253 s~C~~~~~~~~~~~gy~C~C~~Gy~gnp 280 (346)
..|.+..+ .|.|.|+.||..++
T Consensus 199 ~~C~~~~g------~~~c~c~~g~~~~~ 220 (224)
T cd01475 199 QVCISTPG------SYLCACTEGYALLE 220 (224)
T ss_pred ceEEcCCC------CEEeECCCCccCCC
Confidence 35776655 89999999998765
No 44
>PF05887 Trypan_PARP: Procyclic acidic repetitive protein (PARP); InterPro: IPR008882 This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. The procyclic acidic repetitive protein (parp) genes of T. brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated [].; GO: 0016020 membrane; PDB: 2X34_B 2X32_B.
Probab=43.98 E-value=7.6 Score=31.86 Aligned_cols=18 Identities=33% Similarity=0.170 Sum_probs=0.0
Q ss_pred CcchhHHHHHHHHHHHHH
Q 044785 1 MPFRLTTKLVVLLLVLLR 18 (346)
Q Consensus 1 M~~~~~~~l~~~ll~l~~ 18 (346)
|.++.+.+|.|||+..++
T Consensus 1 m~pr~l~~LavLL~~A~L 18 (143)
T PF05887_consen 1 MTPRHLCLLAVLLFGAAL 18 (143)
T ss_dssp ------------------
T ss_pred Cccccccccccccccccc
Confidence 788888888777776443
No 45
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=43.76 E-value=19 Score=22.32 Aligned_cols=13 Identities=31% Similarity=0.644 Sum_probs=11.4
Q ss_pred eEEecCCCCccCC
Q 044785 313 YTCRCRKGFHGDG 325 (346)
Q Consensus 313 y~C~C~~G~~~~~ 325 (346)
..|.||+||..+.
T Consensus 18 ~~C~CPeGyIlde 30 (34)
T PF09064_consen 18 GQCFCPEGYILDE 30 (34)
T ss_pred CceeCCCceEecC
Confidence 4899999999876
No 46
>PF08261 Carcinustatin: Carcinustatin peptide
Probab=42.08 E-value=13 Score=15.27 Aligned_cols=6 Identities=67% Similarity=1.558 Sum_probs=4.1
Q ss_pred cCCCCC
Q 044785 42 PYPFGT 47 (346)
Q Consensus 42 pYPFgi 47 (346)
||-||+
T Consensus 3 py~fgl 8 (8)
T PF08261_consen 3 PYSFGL 8 (8)
T ss_pred cccccC
Confidence 677764
No 47
>PF14380 WAK_assoc: Wall-associated receptor kinase C-terminal
Probab=41.82 E-value=40 Score=25.71 Aligned_cols=39 Identities=23% Similarity=0.681 Sum_probs=27.5
Q ss_pred cceEEeeeecccccCccCCCCC-CCCeeeCCCCCCCCCCCeeeecCCC
Q 044785 229 FPLVLDWEITTKETCEEAKICG-LNASCHKPKDNTTTSSGYHCKCNEG 275 (346)
Q Consensus 229 ~p~~l~W~i~~~~~C~~~~~C~-~~s~C~~~~~~~~~~~gy~C~C~~G 275 (346)
-.++|+|.+. ...|. .|. ....|..... ...+.|.|+.|
T Consensus 54 ~GF~L~w~~~-~~~C~---~C~~SgG~Cgy~~~----~~~f~C~C~dg 93 (94)
T PF14380_consen 54 KGFELEWNAD-SGDCR---ECEASGGRCGYDSN----SEQFTCFCSDG 93 (94)
T ss_pred cCcEEEEeCC-CCcCc---ChhcCCCEeCCCCC----CceEEEECCCC
Confidence 6889999976 67885 577 5677864433 23678988875
No 48
>PHA02887 EGF-like protein; Provisional
Probab=40.95 E-value=23 Score=28.40 Aligned_cols=35 Identities=29% Similarity=0.612 Sum_probs=23.3
Q ss_pred ccCCCCCCCCCCCCCeeEec--CCCeEEecCCCCccCC
Q 044785 290 NECEDPSLNNCTRTHICDNI--PGSYTCRCRKGFHGDG 325 (346)
Q Consensus 290 deC~~~~~~~C~~~~~C~nt--~G~y~C~C~~G~~~~~ 325 (346)
++|...-.+.|. ++.|.-. .....|.|+.||.|.-
T Consensus 84 ~pC~~eyk~YCi-HG~C~yI~dL~epsCrC~~GYtG~R 120 (126)
T PHA02887 84 EKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGYTGIR 120 (126)
T ss_pred cccChHhhCEee-CCEEEccccCCCceeECCCCcccCC
Confidence 345432345665 5788665 4458899999998864
No 49
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=40.34 E-value=29 Score=38.39 Aligned_cols=17 Identities=47% Similarity=1.181 Sum_probs=15.0
Q ss_pred Ceee-ecCCCcccCCCCC
Q 044785 267 GYHC-KCNEGYEGNPYLS 283 (346)
Q Consensus 267 gy~C-~C~~Gy~gnp~~~ 283 (346)
|+.| +|..||.|+|.+.
T Consensus 884 G~~CdrCl~GyyGdP~lg 901 (1758)
T KOG0994|consen 884 GHSCDRCLDGYYGDPRLG 901 (1758)
T ss_pred ccchhhhhccccCCcccC
Confidence 7888 8999999999874
No 50
>PF08685 GON: GON domain; InterPro: IPR012314 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. The ADAMTSs (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) are a family of zinc dependent metalloproteinases that play important roles in a variety of normal and pathological conditions. These enzymes show a complex domain organisation including signal sequence, propeptide, metalloproteinase domain (see PDOC50215 from PROSITEDOC), disintegrin-like domain (see PDOC00351 from PROSITEDOC), central TS-1 motif (see PDOC50092 from PROSITEDOC), cysteine-rich region, and a variable number of TS-like repeats at the C-terminal region. The GON domain is an approximately 200-residue module, whose presence is the hallmark of a subfamily of structurally and evolutionarily related ADAMTSs, called GON-ADAMTSs. The GON domain is characterised by the presence of several conserved cysteine residues and is likely to be globular. Some proteins known to contain a GON domain are: mammalian ADAMTS-9; mammalian ADAMTS-20; and Caenorhabditis elegans gon-1, a protease required for gonadal morphogenesis. Proteins containing the GON domain belong to MEROPS peptidase subfamily M12B (adamalysin, clan MA).; GO: 0004222 metalloendopeptidase activity, 0008270 zinc ion binding
Probab=34.68 E-value=1.1e+02 Score=27.14 Aligned_cols=59 Identities=19% Similarity=0.365 Sum_probs=36.0
Q ss_pred ecccccCC-cccccccceeCCCceEeccCCCEEEEEcccceeeeeeccCCcceeeceEeec
Q 044785 96 AKECYRKG-NSVDSYSPTFSLSKFTVSNTENRFVVIGCDSYAYVRGYLGENRYRAGCMSMC 155 (346)
Q Consensus 96 ~~~c~~~~-~~~~~~~~~l~~~pf~~s~~~n~~~~~gC~~~a~l~~~~~~~~~~~gC~s~C 155 (346)
.-+|+... -.....+++|.+++|.|++ .-+++..|......+.-...+.....-|.=+|
T Consensus 126 AGDCyS~~~CpqG~FsIdL~GTgf~vs~-~~~W~~~G~~a~~~i~~s~~~q~v~g~CGGyC 185 (201)
T PF08685_consen 126 AGDCYSAARCPQGRFSIDLRGTGFRVSP-DTKWVTQGNYAVGKINRSPDGQKVSGRCGGYC 185 (201)
T ss_pred cccccccCCCCCceEEEeeCCCceEecC-CCEEEeCCcEeEEEEEEcCCCcEEEEEeCccC
Confidence 34566541 1222346799999999998 56788889876666542211244444465554
No 51
>COG2991 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=32.64 E-value=73 Score=23.29 Aligned_cols=16 Identities=25% Similarity=0.671 Sum_probs=9.3
Q ss_pred CCCCCccccCCCCCCCCCC
Q 044785 34 SRCGDVEIPYPFGTRPGCF 52 (346)
Q Consensus 34 ~~CG~v~IpYPFgig~~C~ 52 (346)
-+||.+.- .||..-|-
T Consensus 31 GSCGGi~a---lGi~K~Cd 46 (77)
T COG2991 31 GSCGGIAA---LGIEKVCD 46 (77)
T ss_pred cccccHHh---hccchhcC
Confidence 38997742 36655443
No 52
>PHA02887 EGF-like protein; Provisional
Probab=31.14 E-value=42 Score=26.98 Aligned_cols=35 Identities=34% Similarity=0.757 Sum_probs=24.9
Q ss_pred ccCccC--CCCCCCCeeeCCCCCCCCCCCeeeecCCCcccCC
Q 044785 241 ETCEEA--KICGLNASCHKPKDNTTTSSGYHCKCNEGYEGNP 280 (346)
Q Consensus 241 ~~C~~~--~~C~~~s~C~~~~~~~~~~~gy~C~C~~Gy~gnp 280 (346)
..|++. ..|. |..|..... .....|.|..||.|..
T Consensus 84 ~pC~~eyk~YCi-HG~C~yI~d----L~epsCrC~~GYtG~R 120 (126)
T PHA02887 84 EKCKNDFNDFCI-NGECMNIID----LDEKFCICNKGYTGIR 120 (126)
T ss_pred cccChHhhCEee-CCEEEcccc----CCCceeECCCCcccCC
Confidence 357654 4565 678876655 4468999999999863
No 53
>PF10916 DUF2712: Protein of unknown function (DUF2712); InterPro: IPR020208 This entry represents a group of uncharacterised proteins.
Probab=24.31 E-value=77 Score=26.43 Aligned_cols=16 Identities=31% Similarity=0.600 Sum_probs=13.7
Q ss_pred CccccCCCCCCCCCCC
Q 044785 38 DVEIPYPFGTRPGCFL 53 (346)
Q Consensus 38 ~v~IpYPFgig~~C~~ 53 (346)
+-+|+|=|-|++.|.-
T Consensus 32 dn~i~F~F~i~~~~an 47 (146)
T PF10916_consen 32 DNNIPFSFTIKPNQAN 47 (146)
T ss_pred ccCCceEEEeCCcccc
Confidence 6789999999998875
No 54
>PRK02710 plastocyanin; Provisional
Probab=23.49 E-value=87 Score=25.00 Aligned_cols=17 Identities=29% Similarity=0.313 Sum_probs=10.0
Q ss_pred CcchhHHHHHHHHHHHH
Q 044785 1 MPFRLTTKLVVLLLVLL 17 (346)
Q Consensus 1 M~~~~~~~l~~~ll~l~ 17 (346)
|+.++.+++..+|++++
T Consensus 1 ~~~~~~~~~~~~~~~~~ 17 (119)
T PRK02710 1 MAKRLRSIAAALVAVVS 17 (119)
T ss_pred CchhHHHHHHHHHHHHH
Confidence 66766666555555443
No 55
>smart00051 DSL delta serrate ligand.
Probab=22.15 E-value=1.3e+02 Score=21.36 Aligned_cols=46 Identities=24% Similarity=0.524 Sum_probs=25.3
Q ss_pred CeeeecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEecCCCeEEecCCCCccC
Q 044785 267 GYHCKCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNIPGSYTCRCRKGFHGD 324 (346)
Q Consensus 267 gy~C~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt~G~y~C~C~~G~~~~ 324 (346)
.++-.|.++|.|.... ..|.. .+.+..+..|.. .| .|.|.+||.+.
T Consensus 16 ~~rv~C~~~~yG~~C~-------~~C~~--~~d~~~~~~Cd~-~G--~~~C~~Gw~G~ 61 (63)
T smart00051 16 QIRVTCDENYYGEGCN-------KFCRP--RDDFFGHYTCDE-NG--NKGCLEGWMGP 61 (63)
T ss_pred EEEeeCCCCCcCCccC-------CEeCc--CccccCCccCCc-CC--CEecCCCCcCC
Confidence 3556788888886432 12311 112333556632 33 46789999864
No 56
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=20.60 E-value=90 Score=36.63 Aligned_cols=52 Identities=29% Similarity=0.701 Sum_probs=30.5
Q ss_pred Ceee-ecCCCcccCCCCCCCceeCccCCCCCCCCCCCCCeeEec--CCCeEEe-cCCCCccCC
Q 044785 267 GYHC-KCNEGYEGNPYLSDGCQDVNECEDPSLNNCTRTHICDNI--PGSYTCR-CRKGFHGDG 325 (346)
Q Consensus 267 gy~C-~C~~Gy~gnp~~~~gC~dideC~~~~~~~C~~~~~C~nt--~G~y~C~-C~~G~~~~~ 325 (346)
|-+| +|.+||.|++.. +.-.| | ..=+|...+.|..+ .....|. ||+||+|..
T Consensus 755 G~~C~~C~~GfYg~~~~-~~~~d---C---~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~r 810 (1705)
T KOG1836|consen 755 GGQCAQCVDGFYGLPDL-GTSGD---C---QPCPCPNGGACGQTPEILEVVCKNCPPGYTGLR 810 (1705)
T ss_pred CCchhhhcCCCCCcccc-CCCCC---C---ccCCCCCChhhcCcCcccceecCCCCCCCcccc
Confidence 4456 789999998865 11111 4 22345545555544 3456677 888877643
Done!