Query 044268
Match_columns 138
No_of_seqs 197 out of 1380
Neff 9.8
Searched_HMMs 46136
Date Fri Mar 29 13:01:26 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/044268.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/044268hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1219 Uncharacterized conser 99.6 4.6E-16 1E-20 129.2 5.8 82 1-89 3896-3978(4289)
2 KOG1219 Uncharacterized conser 99.6 7.5E-15 1.6E-19 122.3 7.0 77 8-90 3864-3940(4289)
3 KOG4289 Cadherin EGF LAG seven 99.5 6.9E-14 1.5E-18 113.4 7.6 93 1-99 1232-1327(2531)
4 KOG4289 Cadherin EGF LAG seven 99.1 3.1E-11 6.7E-16 98.5 3.3 80 8-94 1179-1280(2531)
5 PF00008 EGF: EGF-like domain 98.5 6.8E-08 1.5E-12 46.4 1.7 31 11-46 1-32 (32)
6 smart00179 EGF_CA Calcium-bind 98.4 5.5E-07 1.2E-11 44.9 4.1 36 8-48 2-39 (39)
7 PF00008 EGF: EGF-like domain 98.4 1.7E-07 3.7E-12 45.0 1.6 31 54-84 1-31 (32)
8 KOG1214 Nidogen and related ba 98.3 1.9E-06 4.1E-11 68.4 6.5 59 24-85 800-860 (1289)
9 PF07645 EGF_CA: Calcium-bindi 98.2 1.1E-06 2.4E-11 44.9 2.3 34 7-43 1-34 (42)
10 cd00054 EGF_CA Calcium-binding 98.2 3.4E-06 7.4E-11 41.5 4.1 36 8-48 2-38 (38)
11 smart00179 EGF_CA Calcium-bind 98.2 4E-06 8.6E-11 41.7 4.2 36 51-88 2-39 (39)
12 cd00054 EGF_CA Calcium-binding 97.9 2.6E-05 5.6E-10 38.3 4.1 36 51-88 2-38 (38)
13 cd00053 EGF Epidermal growth f 97.8 5E-05 1.1E-09 36.6 4.0 28 18-47 7-35 (36)
14 KOG1214 Nidogen and related ba 97.8 5.8E-05 1.3E-09 60.3 6.2 81 1-85 728-821 (1289)
15 smart00181 EGF Epidermal growt 97.8 5.9E-05 1.3E-09 36.5 3.8 27 18-47 7-34 (35)
16 KOG1217 Fibrillins and related 97.7 0.00011 2.5E-09 55.4 6.6 79 7-91 270-356 (487)
17 smart00181 EGF Epidermal growt 97.6 0.00019 4.2E-09 34.7 3.8 32 54-88 2-35 (35)
18 PF07645 EGF_CA: Calcium-bindi 97.5 7E-05 1.5E-09 38.1 2.1 31 51-82 2-34 (42)
19 KOG1225 Teneurin-1 and related 97.5 0.00032 6.9E-09 54.1 6.4 28 57-90 316-343 (525)
20 KOG1217 Fibrillins and related 97.3 0.00076 1.6E-08 51.0 6.2 61 23-87 243-306 (487)
21 KOG1225 Teneurin-1 and related 97.3 0.0009 1.9E-08 51.7 6.1 46 35-89 266-311 (525)
22 cd00053 EGF Epidermal growth f 97.2 0.00074 1.6E-08 32.3 3.7 28 56-84 5-32 (36)
23 PF12661 hEGF: Human growth fa 97.0 0.00024 5.2E-09 26.9 0.4 10 75-84 2-11 (13)
24 PF12947 EGF_3: EGF domain; I 97.0 0.00062 1.3E-08 33.4 1.9 26 18-45 7-32 (36)
25 KOG4260 Uncharacterized conser 96.9 0.00066 1.4E-08 47.9 2.4 69 6-82 234-304 (350)
26 PF07974 EGF_2: EGF-like domai 96.8 0.0022 4.7E-08 30.6 2.9 26 18-47 7-32 (32)
27 PF12662 cEGF: Complement Clr- 96.5 0.003 6.6E-08 28.0 2.2 18 34-52 2-23 (24)
28 KOG4260 Uncharacterized conser 96.5 0.0047 1E-07 43.7 4.1 64 18-84 151-270 (350)
29 PF07974 EGF_2: EGF-like domai 96.4 0.0051 1.1E-07 29.3 2.9 27 57-87 6-32 (32)
30 KOG1226 Integrin beta subunit 96.4 0.012 2.7E-07 47.0 6.3 80 1-92 539-624 (783)
31 smart00051 DSL delta serrate l 96.3 0.012 2.6E-07 32.6 4.1 46 34-87 17-63 (63)
32 PF12947 EGF_3: EGF domain; I 96.1 0.0035 7.6E-08 30.7 1.4 27 57-84 6-32 (36)
33 KOG1226 Integrin beta subunit 95.6 0.036 7.7E-07 44.5 5.7 61 18-90 515-582 (783)
34 PF14670 FXa_inhibition: Coagu 95.5 0.015 3.2E-07 28.5 2.2 22 18-43 7-28 (36)
35 PHA03099 epidermal growth fact 94.8 0.035 7.5E-07 34.9 2.8 39 51-91 42-84 (139)
36 PHA03099 epidermal growth fact 94.8 0.043 9.3E-07 34.5 3.2 31 18-49 52-82 (139)
37 KOG3516 Neurexin IV [Signal tr 94.2 0.049 1.1E-06 45.8 3.1 40 8-52 545-585 (1306)
38 PHA02887 EGF-like protein; Pro 94.1 0.07 1.5E-06 33.0 3.1 38 51-90 83-124 (126)
39 PHA02887 EGF-like protein; Pro 93.8 0.081 1.8E-06 32.7 2.9 31 18-49 93-123 (126)
40 KOG3516 Neurexin IV [Signal tr 93.7 0.075 1.6E-06 44.7 3.4 38 9-51 956-994 (1306)
41 KOG3514 Neurexin III-alpha [Si 93.1 0.072 1.6E-06 44.6 2.4 36 10-50 625-661 (1591)
42 PF06247 Plasmod_Pvs28: Plasmo 91.4 0.067 1.5E-06 36.0 0.3 63 18-84 7-81 (197)
43 KOG3514 Neurexin III-alpha [Si 91.3 0.18 3.9E-06 42.4 2.7 38 53-92 625-663 (1591)
44 PF12946 EGF_MSP1_1: MSP1 EGF 91.1 0.12 2.5E-06 25.4 0.9 24 18-43 6-30 (37)
45 cd01475 vWA_Matrilin VWA_Matri 89.8 0.52 1.1E-05 32.5 3.7 37 44-84 181-219 (224)
46 PF01414 DSL: Delta serrate li 89.1 0.14 3E-06 28.4 0.3 46 34-87 17-63 (63)
47 KOG1836 Extracellular matrix g 86.5 0.99 2.2E-05 40.2 3.9 54 37-91 760-815 (1705)
48 KOG0994 Extracellular matrix g 85.7 1.2 2.6E-05 38.2 3.8 56 34-90 885-950 (1758)
49 PF12955 DUF3844: Domain of un 82.0 0.95 2.1E-05 27.6 1.5 34 8-42 5-41 (103)
50 PF00954 S_locus_glycop: S-loc 81.3 2.3 4.9E-05 26.0 3.0 32 51-84 77-109 (110)
51 KOG0994 Extracellular matrix g 81.0 2.8 6E-05 36.1 4.2 15 35-49 1085-1099(1758)
52 KOG3509 Basement membrane-spec 80.8 4.7 0.0001 34.1 5.4 69 9-84 407-476 (964)
53 cd01475 vWA_Matrilin VWA_Matri 80.7 1.8 4E-05 29.8 2.8 20 23-44 199-218 (224)
54 cd00055 EGF_Lam Laminin-type e 73.7 3.5 7.6E-05 21.3 2.0 15 34-48 19-33 (50)
55 PF00053 Laminin_EGF: Laminin 73.3 1.6 3.4E-05 22.5 0.6 22 23-48 11-32 (49)
56 PF01683 EB: EB module; Inter 65.2 16 0.00034 18.9 3.4 20 18-43 27-46 (52)
57 KOG1836 Extracellular matrix g 64.2 19 0.00042 32.7 5.5 37 11-50 777-814 (1705)
58 smart00180 EGF_Lam Laminin-typ 61.7 7.5 0.00016 19.7 1.7 16 73-89 18-33 (46)
59 PF09064 Tme5_EGF_like: Thromb 56.9 11 0.00023 18.1 1.6 13 72-84 17-29 (34)
60 PF04863 EGF_alliinase: Alliin 52.0 6.2 0.00014 21.1 0.4 32 18-51 18-53 (56)
61 KOG0196 Tyrosine kinase, EPH ( 45.1 40 0.00086 28.5 4.1 58 19-84 248-319 (996)
62 KOG1218 Proteins containing Ca 43.5 73 0.0016 22.8 5.1 35 34-68 162-197 (316)
63 cd01328 FSL_SPARC Follistatin- 38.4 57 0.0012 19.2 3.1 25 54-78 2-26 (86)
64 KOG3607 Meltrins, fertilins an 31.1 46 0.001 27.6 2.5 29 58-91 631-659 (716)
65 KOG3509 Basement membrane-spec 29.8 54 0.0012 28.2 2.7 36 52-89 407-442 (964)
66 KOG3607 Meltrins, fertilins an 24.4 60 0.0013 27.0 2.1 27 18-49 631-657 (716)
No 1
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=99.62 E-value=4.6e-16 Score=129.25 Aligned_cols=82 Identities=30% Similarity=0.841 Sum_probs=62.1
Q ss_pred CCCCCCCccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCcCCCCccC-CCCCCCCCCCCCCeeeeCCCCCeeeeCCC
Q 044268 1 GKFCDEEMTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEHS-GTPCGQIFCFHEAQCLALSQVHNACDCPP 79 (138)
Q Consensus 1 g~~C~~~~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~~-~~~C~~~~C~~~g~C~~~~~~~~~C~C~~ 79 (138)
|++||+++++|.++ ||.+|++|+...++ |.|.|+.||+|..|+.. ++.|..++|.++|.|.+.. |.|.|.|.+
T Consensus 3896 G~~CEi~~epC~sn---PC~~GgtCip~~n~--f~CnC~~gyTG~~Ce~~Gi~eCs~n~C~~gg~C~n~~-gsf~CncT~ 3969 (4289)
T KOG1219|consen 3896 GNHCEIDLEPCASN---PCLTGGTCIPFYNG--FLCNCPNGYTGKRCEARGISECSKNVCGTGGQCINIP-GSFHCNCTP 3969 (4289)
T ss_pred CcccccccccccCC---CCCCCCEEEecCCC--eeEeCCCCccCceeecccccccccccccCCceeeccC-CceEeccCh
Confidence 56777777777777 77777777777777 77777777777777765 6777777777777777776 577777777
Q ss_pred CCCCCCCCCC
Q 044268 80 DWKGSADCSL 89 (138)
Q Consensus 80 g~~g~~~C~~ 89 (138)
|+.|. .|..
T Consensus 3970 g~~gr-~c~~ 3978 (4289)
T KOG1219|consen 3970 GILGR-TCCA 3978 (4289)
T ss_pred hHhcc-cCcc
Confidence 77777 6643
No 2
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=99.56 E-value=7.5e-15 Score=122.33 Aligned_cols=77 Identities=34% Similarity=0.897 Sum_probs=71.8
Q ss_pred ccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCcCCCCccCCCCCCCCCCCCCCeeeeCCCCCeeeeCCCCCCCCCCC
Q 044268 8 MTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEHSGTPCGQIFCFHEAQCLALSQVHNACDCPPDWKGSADC 87 (138)
Q Consensus 8 ~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~~~~~C~~~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C 87 (138)
.++|..+ ||+|+|.|...+.+ +|.|.|++.|.|++|+++.++|.++||.+||+|.... ++|.|.|+.||+|. .|
T Consensus 3864 ~d~C~~n---pCqhgG~C~~~~~g-gy~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~~-n~f~CnC~~gyTG~-~C 3937 (4289)
T KOG1219|consen 3864 TDPCNDN---PCQHGGTCISQPKG-GYKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPFY-NGFLCNCPNGYTGK-RC 3937 (4289)
T ss_pred ccccccC---cccCCCEecCCCCC-ceEEeCcccccCcccccccccccCCCCCCCCEEEecC-CCeeEeCCCCccCc-ee
Confidence 3889999 99999999988766 6999999999999999999999999999999999988 59999999999999 99
Q ss_pred CCC
Q 044268 88 SLP 90 (138)
Q Consensus 88 ~~~ 90 (138)
+..
T Consensus 3938 e~~ 3940 (4289)
T KOG1219|consen 3938 EAR 3940 (4289)
T ss_pred ecc
Confidence 876
No 3
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=99.49 E-value=6.9e-14 Score=113.37 Aligned_cols=93 Identities=30% Similarity=0.774 Sum_probs=80.6
Q ss_pred CCCCCCCccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCcCCCCccCCC--CCCCCCCCCCCeeeeCCCCCeeeeCC
Q 044268 1 GKFCDEEMTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEHSGT--PCGQIFCFHEAQCLALSQVHNACDCP 78 (138)
Q Consensus 1 g~~C~~~~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~~~~--~C~~~~C~~~g~C~~~~~~~~~C~C~ 78 (138)
|..||+++|.|.+. ||.|++.|....++ |+|.|.+||+|.+|+++.. .|.+..|.|+|+|.+...++|.|.|+
T Consensus 1232 gd~CeTeiDlCYs~---pC~nng~C~srEgg--YtCeCrpg~tGehCEvs~~agrCvpGvC~nggtC~~~~nggf~c~Cp 1306 (2531)
T KOG4289|consen 1232 GDYCETEIDLCYSG---PCGNNGRCRSREGG--YTCECRPGFTGEHCEVSARAGRCVPGVCKNGGTCVNLLNGGFCCHCP 1306 (2531)
T ss_pred cccccchhHhhhcC---CCCCCCceEEecCc--eeEEecCCccccceeeecccCccccceecCCCEEeecCCCceeccCC
Confidence 56899999999999 99999999999999 9999999999999997643 58888999999999988789999999
Q ss_pred CC-CCCCCCCCCCCCCCCCCce
Q 044268 79 PD-WKGSADCSLPTLSQTAGAV 99 (138)
Q Consensus 79 ~g-~~g~~~C~~~~~~~~~~~~ 99 (138)
.| |.+. +|+.....+...++
T Consensus 1307 ~ge~e~p-rC~v~trSFp~~sf 1327 (2531)
T KOG4289|consen 1307 YGEFEDP-RCEVTTRSFPPESF 1327 (2531)
T ss_pred CcccCCC-ceEEEeeccCchhe
Confidence 87 4555 89987777664443
No 4
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=99.13 E-value=3.1e-11 Score=98.48 Aligned_cols=80 Identities=26% Similarity=0.651 Sum_probs=68.4
Q ss_pred ccCccCCCCccCCCCCEEeec----------------------CCCcceEeeCCCCCcCCCCccCCCCCCCCCCCCCCee
Q 044268 8 MTMCDGTNEFWCEHGGKCEEI----------------------VQGEMYDCKCPAGYAGEHCEHSGTPCGQIFCFHEAQC 65 (138)
Q Consensus 8 ~~~C~~~~~~pC~~~~~C~~~----------------------~~~~~~~C~C~~g~~g~~C~~~~~~C~~~~C~~~g~C 65 (138)
-+.|..- ||.|..+|+.. .+ ++.|.|++||+|..|+..++.|-+.||.++|.|
T Consensus 1179 DniClrE---PCenymkCvsvlrFdssapf~~s~s~lfRpi~pvn--glrCrCPpGFTgd~CeTeiDlCYs~pC~nng~C 1253 (2531)
T KOG4289|consen 1179 DNICLRE---PCENYMKCVSVLRFDSSAPFLASDSVLFRPIHPVN--GLRCRCPPGFTGDYCETEIDLCYSGPCGNNGRC 1253 (2531)
T ss_pred Cchhhcc---hhHHHHhhhhheeecccCccccccceeeeeccccC--ceeEeCCCCCCcccccchhHhhhcCCCCCCCce
Confidence 4567777 89988888532 23 499999999999999999999999999999999
Q ss_pred eeCCCCCeeeeCCCCCCCCCCCCCCCCCC
Q 044268 66 LALSQVHNACDCPPDWKGSADCSLPTLSQ 94 (138)
Q Consensus 66 ~~~~~~~~~C~C~~g~~g~~~C~~~~~~~ 94 (138)
.... ++|.|.|.+||+|. .||......
T Consensus 1254 ~srE-ggYtCeCrpg~tGe-hCEvs~~ag 1280 (2531)
T KOG4289|consen 1254 RSRE-GGYTCECRPGFTGE-HCEVSARAG 1280 (2531)
T ss_pred EEec-CceeEEecCCcccc-ceeeecccC
Confidence 9998 69999999999999 999765444
No 5
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=98.49 E-value=6.8e-08 Score=46.45 Aligned_cols=31 Identities=42% Similarity=1.256 Sum_probs=26.4
Q ss_pred ccCCCCccCCCCCEEeecC-CCcceEeeCCCCCcCCC
Q 044268 11 CDGTNEFWCEHGGKCEEIV-QGEMYDCKCPAGYAGEH 46 (138)
Q Consensus 11 C~~~~~~pC~~~~~C~~~~-~~~~~~C~C~~g~~g~~ 46 (138)
|.++ ||.|+++|++.. .. |.|.|++||.|.+
T Consensus 1 C~~~---~C~n~g~C~~~~~~~--y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 1 CSSN---PCQNGGTCIDLPGGG--YTCECPPGYTGKR 32 (32)
T ss_dssp TTTT---SSTTTEEEEEESTSE--EEEEEBTTEESTT
T ss_pred CCCC---cCCCCeEEEeCCCCC--EEeECCCCCccCC
Confidence 4455 999999999988 66 9999999999863
No 6
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=98.42 E-value=5.5e-07 Score=44.93 Aligned_cols=36 Identities=42% Similarity=1.164 Sum_probs=31.1
Q ss_pred ccCccC-CCCccCCCCCEEeecCCCcceEeeCCCCCc-CCCCc
Q 044268 8 MTMCDG-TNEFWCEHGGKCEEIVQGEMYDCKCPAGYA-GEHCE 48 (138)
Q Consensus 8 ~~~C~~-~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~-g~~C~ 48 (138)
+++|.. . +|.+++.|++..++ |.|.|++||. |..|+
T Consensus 2 ~~~C~~~~---~C~~~~~C~~~~g~--~~C~C~~g~~~g~~C~ 39 (39)
T smart00179 2 IDECASGN---PCQNGGTCVNTVGS--YRCECPPGYTDGRNCE 39 (39)
T ss_pred cccCcCCC---CcCCCCEeECCCCC--eEeECCCCCccCCcCC
Confidence 577876 6 89999999999988 9999999998 87763
No 7
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=98.39 E-value=1.7e-07 Score=45.02 Aligned_cols=31 Identities=29% Similarity=0.850 Sum_probs=27.0
Q ss_pred CCCCCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 54 CGQIFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 54 C~~~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
|.+.||.++|+|+....+.|.|.|++||+|+
T Consensus 1 C~~~~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 1 CSSNPCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp TTTTSSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred CCCCcCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 4567999999999887448999999999997
No 8
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=98.31 E-value=1.9e-06 Score=68.40 Aligned_cols=59 Identities=25% Similarity=0.723 Sum_probs=49.1
Q ss_pred EEeecCCCcceEeeCCCCCcC--CCCccCCCCCCCCCCCCCCeeeeCCCCCeeeeCCCCCCCCC
Q 044268 24 KCEEIVQGEMYDCKCPAGYAG--EHCEHSGTPCGQIFCFHEAQCLALSQVHNACDCPPDWKGSA 85 (138)
Q Consensus 24 ~C~~~~~~~~~~C~C~~g~~g--~~C~~~~~~C~~~~C~~~g~C~~~~~~~~~C~C~~g~~g~~ 85 (138)
.|+...++ .|+|.|.+||.| ..|. +.+.|.++-|...++|.++. +.+.|.|.+||.|++
T Consensus 800 ~c~~hGgs-~y~C~CLPGfsGDG~~c~-dvDeC~psrChp~A~Cyntp-gsfsC~C~pGy~GDG 860 (1289)
T KOG1214|consen 800 RCVHHGGS-TYSCACLPGFSGDGHQCT-DVDECSPSRCHPAATCYNTP-GSFSCRCQPGYYGDG 860 (1289)
T ss_pred EEEecCCc-eEEEeecCCccCCccccc-cccccCccccCCCceEecCC-CcceeecccCccCCC
Confidence 45555544 499999999986 4563 66999999999999999999 699999999999984
No 9
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=98.21 E-value=1.1e-06 Score=44.88 Aligned_cols=34 Identities=29% Similarity=0.861 Sum_probs=29.8
Q ss_pred CccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCc
Q 044268 7 EMTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYA 43 (138)
Q Consensus 7 ~~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~ 43 (138)
|+|+|... ++.|..++.|+++.++ |.|.|++||.
T Consensus 1 DidEC~~~-~~~C~~~~~C~N~~Gs--y~C~C~~Gy~ 34 (42)
T PF07645_consen 1 DIDECAEG-PHNCPENGTCVNTEGS--YSCSCPPGYE 34 (42)
T ss_dssp ESSTTTTT-SSSSSTTSEEEEETTE--EEEEESTTEE
T ss_pred CccccCCC-CCcCCCCCEEEcCCCC--EEeeCCCCcE
Confidence 47888876 4579889999999999 9999999997
No 10
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=98.20 E-value=3.4e-06 Score=41.53 Aligned_cols=36 Identities=42% Similarity=1.154 Sum_probs=30.6
Q ss_pred ccCccC-CCCccCCCCCEEeecCCCcceEeeCCCCCcCCCCc
Q 044268 8 MTMCDG-TNEFWCEHGGKCEEIVQGEMYDCKCPAGYAGEHCE 48 (138)
Q Consensus 8 ~~~C~~-~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~ 48 (138)
+++|.. . +|.+++.|.+..+. |.|.|+.+|.|..|+
T Consensus 2 ~~~C~~~~---~C~~~~~C~~~~~~--~~C~C~~g~~g~~C~ 38 (38)
T cd00054 2 IDECASGN---PCQNGGTCVNTVGS--YRCSCPPGYTGRNCE 38 (38)
T ss_pred cccCCCCC---CcCCCCEeECCCCC--eEeECCCCCcCCcCC
Confidence 466766 6 89999999998888 999999999997773
No 11
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=98.19 E-value=4e-06 Score=41.70 Aligned_cols=36 Identities=25% Similarity=0.751 Sum_probs=30.0
Q ss_pred CCCCCC-CCCCCCCeeeeCCCCCeeeeCCCCCC-CCCCCC
Q 044268 51 GTPCGQ-IFCFHEAQCLALSQVHNACDCPPDWK-GSADCS 88 (138)
Q Consensus 51 ~~~C~~-~~C~~~g~C~~~~~~~~~C~C~~g~~-g~~~C~ 88 (138)
+++|.. .+|.++++|.+.. +.|.|.|+.||. |. .|+
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~-g~~~C~C~~g~~~g~-~C~ 39 (39)
T smart00179 2 IDECASGNPCQNGGTCVNTV-GSYRCECPPGYTDGR-NCE 39 (39)
T ss_pred cccCcCCCCcCCCCEeECCC-CCeEeECCCCCccCC-cCC
Confidence 456766 7899989999887 689999999999 88 774
No 12
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.93 E-value=2.6e-05 Score=38.27 Aligned_cols=36 Identities=25% Similarity=0.731 Sum_probs=29.4
Q ss_pred CCCCCC-CCCCCCCeeeeCCCCCeeeeCCCCCCCCCCCC
Q 044268 51 GTPCGQ-IFCFHEAQCLALSQVHNACDCPPDWKGSADCS 88 (138)
Q Consensus 51 ~~~C~~-~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C~ 88 (138)
+++|.. .+|.+++.|.+.. +.|.|.|+.||.|. .|+
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~-~~~~C~C~~g~~g~-~C~ 38 (38)
T cd00054 2 IDECASGNPCQNGGTCVNTV-GSYRCSCPPGYTGR-NCE 38 (38)
T ss_pred cccCCCCCCcCCCCEeECCC-CCeEeECCCCCcCC-cCC
Confidence 355665 6898888999887 58999999999998 774
No 13
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) presents in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=97.81 E-value=5e-05 Score=36.61 Aligned_cols=28 Identities=43% Similarity=1.201 Sum_probs=25.1
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcCC-CC
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAGE-HC 47 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g~-~C 47 (138)
+|.+++.|++..+. |.|.|+.||.|. .|
T Consensus 7 ~C~~~~~C~~~~~~--~~C~C~~g~~g~~~C 35 (36)
T cd00053 7 PCSNGGTCVNTPGS--YRCVCPPGYTGDRSC 35 (36)
T ss_pred CCCCCCEEecCCCC--eEeECCCCCcccCCc
Confidence 89999999998887 999999999987 55
No 14
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=97.81 E-value=5.8e-05 Score=60.32 Aligned_cols=81 Identities=25% Similarity=0.676 Sum_probs=56.0
Q ss_pred CCCCCCCccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCc----CCCCcc-----CCCCCCC--CCCCCC--Ceeee
Q 044268 1 GKFCDEEMTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYA----GEHCEH-----SGTPCGQ--IFCFHE--AQCLA 67 (138)
Q Consensus 1 g~~C~~~~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~----g~~C~~-----~~~~C~~--~~C~~~--g~C~~ 67 (138)
|++|. +.++|+.. +.-|-.+..|++.++. |.|.|..+|. +-.|-. .++.|.. ..|... ..|+.
T Consensus 728 gr~c~-d~~eca~~-~~~CGp~s~Cin~pg~--~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~ 803 (1289)
T KOG1214|consen 728 GRNCV-DENECATG-FHRCGPNSVCINLPGS--YRCECRSGYEFADDRHTCVLITPPAPANPCEDGSHTCAIAGQARCVH 803 (1289)
T ss_pred CCCCC-ChhhhccC-CCCCCCCceeecCCCc--eeEEEeecceeccCCcceEEecCCCCCCccccCccccCcCCceEEEe
Confidence 46775 67788766 3348888999999999 9999998875 345632 2234432 234332 35556
Q ss_pred CCCCCeeeeCCCCCCCCC
Q 044268 68 LSQVHNACDCPPDWKGSA 85 (138)
Q Consensus 68 ~~~~~~~C~C~~g~~g~~ 85 (138)
...+.|.|.|.+||.|++
T Consensus 804 hGgs~y~C~CLPGfsGDG 821 (1289)
T KOG1214|consen 804 HGGSTYSCACLPGFSGDG 821 (1289)
T ss_pred cCCceEEEeecCCccCCc
Confidence 554589999999999983
No 15
>smart00181 EGF Epidermal growth factor-like domain.
Probab=97.77 E-value=5.9e-05 Score=36.54 Aligned_cols=27 Identities=41% Similarity=1.171 Sum_probs=24.1
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcC-CCC
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAG-EHC 47 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g-~~C 47 (138)
+|.++ .|++..++ |.|.|+.||.| ..|
T Consensus 7 ~C~~~-~C~~~~~~--~~C~C~~g~~g~~~C 34 (35)
T smart00181 7 PCSNG-TCINTPGS--YTCSCPPGYTGDKRC 34 (35)
T ss_pred CCCCC-EEECCCCC--eEeECCCCCccCCcc
Confidence 89988 99998888 99999999998 665
No 16
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.73 E-value=0.00011 Score=55.43 Aligned_cols=79 Identities=27% Similarity=0.663 Sum_probs=61.0
Q ss_pred CccCccCCCCcc-CCCCCEEeecCCCcceEeeCCCCCcCCCC--ccCCCCCC----CCCCCCCCeeee-CCCCCeeeeCC
Q 044268 7 EMTMCDGTNEFW-CEHGGKCEEIVQGEMYDCKCPAGYAGEHC--EHSGTPCG----QIFCFHEAQCLA-LSQVHNACDCP 78 (138)
Q Consensus 7 ~~~~C~~~~~~p-C~~~~~C~~~~~~~~~~C~C~~g~~g~~C--~~~~~~C~----~~~C~~~g~C~~-~~~~~~~C~C~ 78 (138)
+++.|... + |.++++|++.... |.|.|++||.|..| ..+...|. ..+|.+++.|.. .....+.|.|.
T Consensus 270 ~~~~C~~~---~~c~~~~~C~~~~~~--~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c~ 344 (487)
T KOG1217|consen 270 DVDSCALI---ASCPNGGTCVNVPGS--YRCTCPPGFTGRLCTECVDVDECSPRNAGGPCANGGTCNTLGSFGGFRCACG 344 (487)
T ss_pred eccccCCC---CccCCCCeeecCCCc--ceeeCCCCCCCCCCccccccccccccccCCcCCCCcccccCCCCCCCCcCCC
Confidence 67888877 4 8999999999988 99999999999987 22335663 466888888822 22136789999
Q ss_pred CCCCCCCCCCCCC
Q 044268 79 PDWKGSADCSLPT 91 (138)
Q Consensus 79 ~g~~g~~~C~~~~ 91 (138)
.+|.|. .|+...
T Consensus 345 ~~~~g~-~C~~~~ 356 (487)
T KOG1217|consen 345 PGFTGR-RCEDSN 356 (487)
T ss_pred CCCCCC-ccccCC
Confidence 999999 998663
No 17
>smart00181 EGF Epidermal growth factor-like domain.
Probab=97.56 E-value=0.00019 Score=34.67 Aligned_cols=32 Identities=28% Similarity=0.863 Sum_probs=25.8
Q ss_pred CCC-CCCCCCCeeeeCCCCCeeeeCCCCCCC-CCCCC
Q 044268 54 CGQ-IFCFHEAQCLALSQVHNACDCPPDWKG-SADCS 88 (138)
Q Consensus 54 C~~-~~C~~~g~C~~~~~~~~~C~C~~g~~g-~~~C~ 88 (138)
|.. .+|.++ .|.+.. +.|.|.|+.||.| . .|+
T Consensus 2 C~~~~~C~~~-~C~~~~-~~~~C~C~~g~~g~~-~C~ 35 (35)
T smart00181 2 CASGGPCSNG-TCINTP-GSYTCSCPPGYTGDK-RCE 35 (35)
T ss_pred CCCcCCCCCC-EEECCC-CCeEeECCCCCccCC-ccC
Confidence 444 578888 899886 6999999999999 7 664
No 18
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=97.54 E-value=7e-05 Score=38.09 Aligned_cols=31 Identities=23% Similarity=0.700 Sum_probs=26.3
Q ss_pred CCCCCC--CCCCCCCeeeeCCCCCeeeeCCCCCC
Q 044268 51 GTPCGQ--IFCFHEAQCLALSQVHNACDCPPDWK 82 (138)
Q Consensus 51 ~~~C~~--~~C~~~g~C~~~~~~~~~C~C~~g~~ 82 (138)
+++|.. ..|...+.|+++. |+|.|.|++||.
T Consensus 2 idEC~~~~~~C~~~~~C~N~~-Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 2 IDECAEGPHNCPENGTCVNTE-GSYSCSCPPGYE 34 (42)
T ss_dssp SSTTTTTSSSSSTTSEEEEET-TEEEEEESTTEE
T ss_pred ccccCCCCCcCCCCCEEEcCC-CCEEeeCCCCcE
Confidence 567764 4688889999999 799999999997
No 19
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=97.53 E-value=0.00032 Score=54.08 Aligned_cols=28 Identities=21% Similarity=0.562 Sum_probs=15.1
Q ss_pred CCCCCCCeeeeCCCCCeeeeCCCCCCCCCCCCCC
Q 044268 57 IFCFHEAQCLALSQVHNACDCPPDWKGSADCSLP 90 (138)
Q Consensus 57 ~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C~~~ 90 (138)
..|.++|.|+ . -.|.|.+||+|. .|+..
T Consensus 316 adC~g~G~Ci-~----G~C~C~~Gy~G~-~C~~~ 343 (525)
T KOG1225|consen 316 ADCSGHGKCI-D----GECLCDEGYTGE-LCIQR 343 (525)
T ss_pred ccCCCCCccc-C----CceEeCCCCcCC-ccccc
Confidence 3455555554 1 136666666666 66554
No 20
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.30 E-value=0.00076 Score=50.97 Aligned_cols=61 Identities=34% Similarity=0.952 Sum_probs=51.4
Q ss_pred CEEeecCCCcceEeeCCCCCcCCCC--ccCCCCCCCCC-CCCCCeeeeCCCCCeeeeCCCCCCCCCCC
Q 044268 23 GKCEEIVQGEMYDCKCPAGYAGEHC--EHSGTPCGQIF-CFHEAQCLALSQVHNACDCPPDWKGSADC 87 (138)
Q Consensus 23 ~~C~~~~~~~~~~C~C~~g~~g~~C--~~~~~~C~~~~-C~~~g~C~~~~~~~~~C~C~~g~~g~~~C 87 (138)
+.|.+..+. |.|.|++||.+..+ ..+++.|.... |.++++|.... +.|.|.|++||.|. .|
T Consensus 243 ~~c~~~~~~--~~C~~~~g~~~~~~~~~~~~~~C~~~~~c~~~~~C~~~~-~~~~C~C~~g~~g~-~~ 306 (487)
T KOG1217|consen 243 GTCVNTVGS--YTCRCPEGYTGDACVTCVDVDSCALIASCPNGGTCVNVP-GSYRCTCPPGFTGR-LC 306 (487)
T ss_pred CcccccCCc--eeeeCCCCccccccceeeeccccCCCCccCCCCeeecCC-CcceeeCCCCCCCC-CC
Confidence 788888888 99999999998762 34677888753 99999999887 36999999999999 77
No 21
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=97.27 E-value=0.0009 Score=51.68 Aligned_cols=46 Identities=35% Similarity=1.027 Sum_probs=24.8
Q ss_pred EeeCCCCCcCCCCccCCCCCCCCCCCCCCeeeeCCCCCeeeeCCCCCCCCCCCCC
Q 044268 35 DCKCPAGYAGEHCEHSGTPCGQIFCFHEAQCLALSQVHNACDCPPDWKGSADCSL 89 (138)
Q Consensus 35 ~C~C~~g~~g~~C~~~~~~C~~~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C~~ 89 (138)
+|.|++||+|..|+. ..|... |..++.+++ + .|.|++||.|. .|++
T Consensus 266 ~CIC~~Gf~G~dC~e--~~Cp~~-cs~~g~~~~----g-~CiC~~g~~G~-dCs~ 311 (525)
T KOG1225|consen 266 RCICPPGFTGDDCDE--LVCPVD-CSGGGVCVD----G-ECICNPGYSGK-DCSI 311 (525)
T ss_pred eEeCCCCCcCCCCCc--ccCCcc-cCCCceecC----C-EeecCCCcccc-cccc
Confidence 466666666666642 334333 555544432 2 46666666666 5654
No 22
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) presents in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=97.23 E-value=0.00074 Score=32.32 Aligned_cols=28 Identities=25% Similarity=0.673 Sum_probs=24.0
Q ss_pred CCCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 56 QIFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 56 ~~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
..+|.+++.|.+.. +.+.|.|+.||.|.
T Consensus 5 ~~~C~~~~~C~~~~-~~~~C~C~~g~~g~ 32 (36)
T cd00053 5 SNPCSNGGTCVNTP-GSYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCCEEecCC-CCeEeECCCCCccc
Confidence 56788888999887 58999999999875
No 23
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.98 E-value=0.00024 Score=26.86 Aligned_cols=10 Identities=60% Similarity=2.012 Sum_probs=5.3
Q ss_pred eeCCCCCCCC
Q 044268 75 CDCPPDWKGS 84 (138)
Q Consensus 75 C~C~~g~~g~ 84 (138)
|.|++||+|.
T Consensus 2 C~C~~G~~G~ 11 (13)
T PF12661_consen 2 CQCPPGWTGP 11 (13)
T ss_dssp EEE-TTEETT
T ss_pred ccCcCCCcCC
Confidence 5556666555
No 24
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=96.97 E-value=0.00062 Score=33.41 Aligned_cols=26 Identities=31% Similarity=0.896 Sum_probs=20.6
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcCC
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAGE 45 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g~ 45 (138)
.|..++.|.++.+. |.|.|.+||.|.
T Consensus 7 ~C~~nA~C~~~~~~--~~C~C~~Gy~Gd 32 (36)
T PF12947_consen 7 GCHPNATCTNTGGS--YTCTCKPGYEGD 32 (36)
T ss_dssp GS-TTCEEEE-TTS--EEEEE-CEEECC
T ss_pred CCCCCcEeecCCCC--EEeECCCCCccC
Confidence 68888999999998 999999999875
No 25
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=96.93 E-value=0.00066 Score=47.91 Aligned_cols=69 Identities=16% Similarity=0.493 Sum_probs=49.4
Q ss_pred CCccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCcC--CCCccCCCCCCCCCCCCCCeeeeCCCCCeeeeCCCCCC
Q 044268 6 EEMTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYAG--EHCEHSGTPCGQIFCFHEAQCLALSQVHNACDCPPDWK 82 (138)
Q Consensus 6 ~~~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g--~~C~~~~~~C~~~~C~~~g~C~~~~~~~~~C~C~~g~~ 82 (138)
.|+|+|... +.||.....|+|+.++ |.|.+.+||.+ ..|+...+.|.. ....|.+.. +.|+|+|..++.
T Consensus 234 vDvnEC~~e-p~~c~~~qfCvNteGS--f~C~dk~Gy~~g~d~C~~~~d~~~~----kn~~c~ni~-~~~r~v~f~~~~ 304 (350)
T KOG4260|consen 234 VDVNECQNE-PAPCKAHQFCVNTEGS--FKCEDKEGYKKGVDECQFCADVCAS----KNRPCMNID-GQYRCVCFSGLI 304 (350)
T ss_pred ccHHHHhcC-CCCCChhheeecCCCc--eEecccccccCChHHhhhhhhhccc----CCCCcccCC-ccEEEEecccce
Confidence 378899765 5589999999999999 99999999975 234432233321 233566676 599999987754
No 26
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=96.78 E-value=0.0022 Score=30.58 Aligned_cols=26 Identities=35% Similarity=0.915 Sum_probs=21.7
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcCCCC
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAGEHC 47 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C 47 (138)
.|.++|+|+... ..|.|.+||.|..|
T Consensus 7 ~C~~~G~C~~~~----g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 7 ICSGHGTCVSPC----GRCVCDSGYTGPDC 32 (32)
T ss_pred ccCCCCEEeCCC----CEEECCCCCcCCCC
Confidence 699999998652 56999999999876
No 27
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=96.51 E-value=0.003 Score=27.99 Aligned_cols=18 Identities=50% Similarity=1.311 Sum_probs=13.1
Q ss_pred eEeeCCCCCc----CCCCccCCC
Q 044268 34 YDCKCPAGYA----GEHCEHSGT 52 (138)
Q Consensus 34 ~~C~C~~g~~----g~~C~~~~~ 52 (138)
|.|.|++||. |..|. +++
T Consensus 2 y~C~C~~Gy~l~~d~~~C~-DId 23 (24)
T PF12662_consen 2 YTCSCPPGYQLSPDGRSCE-DID 23 (24)
T ss_pred EEeeCCCCCcCCCCCCccc-cCC
Confidence 8999999997 45663 443
No 28
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=96.48 E-value=0.0047 Score=43.74 Aligned_cols=64 Identities=23% Similarity=0.491 Sum_probs=45.9
Q ss_pred cCCCCCEEeec---CCCcceEeeCCCCCcCCCCcc---------------------------------------------
Q 044268 18 WCEHGGKCEEI---VQGEMYDCKCPAGYAGEHCEH--------------------------------------------- 49 (138)
Q Consensus 18 pC~~~~~C~~~---~~~~~~~C~C~~g~~g~~C~~--------------------------------------------- 49 (138)
||..+|.|... .+. -.|.|.+||+|..|..
T Consensus 151 ~C~GnG~C~GdGsR~Gs--GkCkC~~GY~Gp~C~~Cg~eyfes~Rne~~lvCt~Ch~~C~~~Csg~~~k~C~kCkkGW~l 228 (350)
T KOG4260|consen 151 PCFGNGSCHGDGSREGS--GKCKCETGYTGPLCRYCGIEYFESSRNEQHLVCTACHEGCLGVCSGESSKGCSKCKKGWKL 228 (350)
T ss_pred CcCCCCcccCCCCCCCC--CcccccCCCCCccccccchHHHHhhcccccchhhhhhhhhhcccCCCCCCChhhhccccee
Confidence 89888999632 233 6799999999886641
Q ss_pred ------CCCCCC--CCCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 50 ------SGTPCG--QIFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 50 ------~~~~C~--~~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
++++|. +.||...-.|+++. |+|.|.+.+||.+.
T Consensus 229 de~gCvDvnEC~~ep~~c~~~qfCvNte-GSf~C~dk~Gy~~g 270 (350)
T KOG4260|consen 229 DEEGCVDVNECQNEPAPCKAHQFCVNTE-GSFKCEDKEGYKKG 270 (350)
T ss_pred cccccccHHHHhcCCCCCChhheeecCC-CceEecccccccCC
Confidence 112222 34677777899988 69999999888864
No 29
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=96.44 E-value=0.0051 Score=29.27 Aligned_cols=27 Identities=26% Similarity=0.794 Sum_probs=21.9
Q ss_pred CCCCCCCeeeeCCCCCeeeeCCCCCCCCCCC
Q 044268 57 IFCFHEAQCLALSQVHNACDCPPDWKGSADC 87 (138)
Q Consensus 57 ~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C 87 (138)
..|.++|+|... ...|.|.+||+|. .|
T Consensus 6 ~~C~~~G~C~~~---~g~C~C~~g~~G~-~C 32 (32)
T PF07974_consen 6 NICSGHGTCVSP---CGRCVCDSGYTGP-DC 32 (32)
T ss_pred CccCCCCEEeCC---CCEEECCCCCcCC-CC
Confidence 358889999854 3579999999998 65
No 30
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=96.38 E-value=0.012 Score=47.03 Aligned_cols=80 Identities=28% Similarity=0.716 Sum_probs=49.6
Q ss_pred CCCCCCCccCccCCCCccCCCCCEEeecCCCcceEeeCCCCCcCCCCcc--CCCCCCC---CCCCCCCeeeeCCCCCeee
Q 044268 1 GKFCDEEMTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEH--SGTPCGQ---IFCFHEAQCLALSQVHNAC 75 (138)
Q Consensus 1 g~~C~~~~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~--~~~~C~~---~~C~~~g~C~~~~~~~~~C 75 (138)
|.+||-+--.|....-..|..+|.|.- -.|.|.+||+|..|+- +.+.|.. ..|...|+|. -..|
T Consensus 539 G~fCECDnfsC~r~~g~lC~g~G~C~C------G~CvC~~GwtG~~C~C~~std~C~~~~G~iCSGrG~C~-----Cg~C 607 (783)
T KOG1226|consen 539 GKFCECDNFSCERHKGVLCGGHGRCEC------GRCVCNPGWTGSACNCPLSTDTCESSDGQICSGRGTCE-----CGRC 607 (783)
T ss_pred eeeeeccCcccccccCcccCCCCeEeC------CcEEcCCCCccCCCCCCCCCccccCCCCceeCCCceee-----CCce
Confidence 456764443454332235777777742 3599999999998863 3455553 2355555553 2357
Q ss_pred eCCCC-CCCCCCCCCCCC
Q 044268 76 DCPPD-WKGSADCSLPTL 92 (138)
Q Consensus 76 ~C~~g-~~g~~~C~~~~~ 92 (138)
.|... |+|. .||.-..
T Consensus 608 ~C~~~~~sG~-~CE~cpt 624 (783)
T KOG1226|consen 608 KCTDPPYSGE-FCEKCPT 624 (783)
T ss_pred EcCCCCcCcc-hhhcCCC
Confidence 77665 8999 8876443
No 31
>smart00051 DSL delta serrate ligand.
Probab=96.26 E-value=0.012 Score=32.62 Aligned_cols=46 Identities=26% Similarity=0.563 Sum_probs=32.8
Q ss_pred eEeeCCCCCcCCCCccCCCCCCC-CCCCCCCeeeeCCCCCeeeeCCCCCCCCCCC
Q 044268 34 YDCKCPAGYAGEHCEHSGTPCGQ-IFCFHEAQCLALSQVHNACDCPPDWKGSADC 87 (138)
Q Consensus 34 ~~C~C~~g~~g~~C~~~~~~C~~-~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C 87 (138)
+.-.|.++|.|..|. ..|.+ .-+..+..|... -.|.|.+||+|. .|
T Consensus 17 ~rv~C~~~~yG~~C~---~~C~~~~d~~~~~~Cd~~----G~~~C~~Gw~G~-~C 63 (63)
T smart00051 17 IRVTCDENYYGEGCN---KFCRPRDDFFGHYTCDEN----GNKGCLEGWMGP-YC 63 (63)
T ss_pred EEeeCCCCCcCCccC---CEeCcCccccCCccCCcC----CCEecCCCCcCC-CC
Confidence 556799999999996 44543 234566778532 348899999998 66
No 32
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=96.13 E-value=0.0035 Score=30.70 Aligned_cols=27 Identities=26% Similarity=0.631 Sum_probs=21.0
Q ss_pred CCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 57 IFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 57 ~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
..|...+.|++.. +.+.|.|.+||.|+
T Consensus 6 ~~C~~nA~C~~~~-~~~~C~C~~Gy~Gd 32 (36)
T PF12947_consen 6 GGCHPNATCTNTG-GSYTCTCKPGYEGD 32 (36)
T ss_dssp GGS-TTCEEEE-T-TSEEEEE-CEEECC
T ss_pred CCCCCCcEeecCC-CCEEeECCCCCccC
Confidence 4577788999998 59999999999998
No 33
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=95.62 E-value=0.036 Score=44.53 Aligned_cols=61 Identities=31% Similarity=0.836 Sum_probs=43.5
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCc----CCCCccCCCCCCC---CCCCCCCeeeeCCCCCeeeeCCCCCCCCCCCCCC
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYA----GEHCEHSGTPCGQ---IFCFHEAQCLALSQVHNACDCPPDWKGSADCSLP 90 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~----g~~C~~~~~~C~~---~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C~~~ 90 (138)
+|.+.|.|.= -.|.|.+... |..|+-+--.|.. ..|..+|.|. -..|+|.+||+|. .|+-.
T Consensus 515 vCSgrG~C~C------GqC~C~~~~~~~i~G~fCECDnfsC~r~~g~lC~g~G~C~-----CG~CvC~~GwtG~-~C~C~ 582 (783)
T KOG1226|consen 515 VCSGRGDCVC------GQCVCHKPDNGKIYGKFCECDNFSCERHKGVLCGGHGRCE-----CGRCVCNPGWTGS-ACNCP 582 (783)
T ss_pred CcCCCCcEeC------CceEecCCCCCceeeeeeeccCcccccccCcccCCCCeEe-----CCcEEcCCCCccC-CCCCC
Confidence 5888787753 2478877665 8888765545543 4588888884 3469999999999 87643
No 34
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=95.52 E-value=0.015 Score=28.46 Aligned_cols=22 Identities=41% Similarity=1.084 Sum_probs=17.8
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCc
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYA 43 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~ 43 (138)
-|.+ .|++.+++ |.|.|++||.
T Consensus 7 gC~h--~C~~~~g~--~~C~C~~Gy~ 28 (36)
T PF14670_consen 7 GCSH--ICVNTPGS--YRCSCPPGYK 28 (36)
T ss_dssp GSSS--EEEEETTS--EEEE-STTEE
T ss_pred CcCC--CCccCCCc--eEeECCCCCE
Confidence 3554 89999988 9999999996
No 35
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=94.78 E-value=0.035 Score=34.90 Aligned_cols=39 Identities=23% Similarity=0.635 Sum_probs=28.1
Q ss_pred CCCCCC---CCCCCCCeeeeCCC-CCeeeeCCCCCCCCCCCCCCC
Q 044268 51 GTPCGQ---IFCFHEAQCLALSQ-VHNACDCPPDWKGSADCSLPT 91 (138)
Q Consensus 51 ~~~C~~---~~C~~~g~C~~~~~-~~~~C~C~~g~~g~~~C~~~~ 91 (138)
+..|.. +.|.+| .|.-..+ ..+.|+|..||+|. +||...
T Consensus 42 i~~Cp~ey~~YClHG-~C~yI~dl~~~~CrC~~GYtGe-RCEh~d 84 (139)
T PHA03099 42 IRLCGPEGDGYCLHG-DCIHARDIDGMYCRCSHGYTGI-RCQHVV 84 (139)
T ss_pred cccCChhhCCEeECC-EEEeeccCCCceeECCCCcccc-ccccee
Confidence 345543 568886 8875431 37899999999999 998644
No 36
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=94.77 E-value=0.043 Score=34.49 Aligned_cols=31 Identities=35% Similarity=0.955 Sum_probs=23.7
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcCCCCcc
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEH 49 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~ 49 (138)
-|.|| .|.-......+.|.|..||+|..|+.
T Consensus 52 YClHG-~C~yI~dl~~~~CrC~~GYtGeRCEh 82 (139)
T PHA03099 52 YCLHG-DCIHARDIDGMYCRCSHGYTGIRCQH 82 (139)
T ss_pred EeECC-EEEeeccCCCceeECCCCcccccccc
Confidence 47775 88755433338999999999999974
No 37
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=94.17 E-value=0.049 Score=45.77 Aligned_cols=40 Identities=38% Similarity=0.876 Sum_probs=34.9
Q ss_pred ccCccCCCCccCCCCCEEeecCCCcceEeeCC-CCCcCCCCccCCC
Q 044268 8 MTMCDGTNEFWCEHGGKCEEIVQGEMYDCKCP-AGYAGEHCEHSGT 52 (138)
Q Consensus 8 ~~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~-~g~~g~~C~~~~~ 52 (138)
.+.|.++ +|+++|.|..+... |.|.|. .||.|..|...+.
T Consensus 545 ~drClPN---~CehgG~C~Qs~~~--f~C~C~~TGY~GatCHtsi~ 585 (1306)
T KOG3516|consen 545 SDRCLPN---PCEHGGKCSQSWDD--FECNCELTGYKGATCHTSIY 585 (1306)
T ss_pred ccccCCc---cccCCCcccccccc--eeEeccccccccccccCCCc
Confidence 5789888 99999999998887 999998 9999999976543
No 38
>PHA02887 EGF-like protein; Provisional
Probab=94.13 E-value=0.07 Score=33.01 Aligned_cols=38 Identities=21% Similarity=0.615 Sum_probs=27.7
Q ss_pred CCCCCC---CCCCCCCeeeeCCC-CCeeeeCCCCCCCCCCCCCC
Q 044268 51 GTPCGQ---IFCFHEAQCLALSQ-VHNACDCPPDWKGSADCSLP 90 (138)
Q Consensus 51 ~~~C~~---~~C~~~g~C~~~~~-~~~~C~C~~g~~g~~~C~~~ 90 (138)
+.+|.. +.|.+ |+|.-..+ ....|.|+.||+|. +|+..
T Consensus 83 f~pC~~eyk~YCiH-G~C~yI~dL~epsCrC~~GYtG~-RCE~v 124 (126)
T PHA02887 83 FEKCKNDFNDFCIN-GECMNIIDLDEKFCICNKGYTGI-RCDEV 124 (126)
T ss_pred ccccChHhhCEeeC-CEEEccccCCCceeECCCCcccC-CCCcc
Confidence 456653 56875 58975432 36899999999999 99853
No 39
>PHA02887 EGF-like protein; Provisional
Probab=93.78 E-value=0.081 Score=32.75 Aligned_cols=31 Identities=32% Similarity=0.894 Sum_probs=23.2
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcCCCCcc
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEH 49 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~ 49 (138)
-|.+ |.|.-...-..+.|.|..||+|..|+.
T Consensus 93 YCiH-G~C~yI~dL~epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 93 FCIN-GECMNIIDLDEKFCICNKGYTGIRCDE 123 (126)
T ss_pred EeeC-CEEEccccCCCceeECCCCcccCCCCc
Confidence 3774 688754433348999999999999973
No 40
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=93.73 E-value=0.075 Score=44.74 Aligned_cols=38 Identities=34% Similarity=0.862 Sum_probs=31.9
Q ss_pred cCccCCCCccCCCCCEEeecCCCcceEeeCC-CCCcCCCCccCC
Q 044268 9 TMCDGTNEFWCEHGGKCEEIVQGEMYDCKCP-AGYAGEHCEHSG 51 (138)
Q Consensus 9 ~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~-~g~~g~~C~~~~ 51 (138)
-.|++. +|.|||.|+....+ |+|.|. ..|.|..|..++
T Consensus 956 GhCss~---~C~NGG~Cvery~g--ytCDCs~Tay~Gp~Cs~ei 994 (1306)
T KOG3516|consen 956 GHCSSY---PCLNGGHCVERYDG--YTCDCSRTAYDGPFCSKEI 994 (1306)
T ss_pred cccccc---cccCCCEEEEecCc--eeeccccCcCCCCcccccc
Confidence 358888 99999999999988 999996 468899997543
No 41
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=93.11 E-value=0.072 Score=44.56 Aligned_cols=36 Identities=36% Similarity=1.031 Sum_probs=32.1
Q ss_pred CccCCCCccCCCCCEEeecCCCcceEeeCC-CCCcCCCCccC
Q 044268 10 MCDGTNEFWCEHGGKCEEIVQGEMYDCKCP-AGYAGEHCEHS 50 (138)
Q Consensus 10 ~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~-~g~~g~~C~~~ 50 (138)
.|.++ ||+|+|.|...++. |.|.|. .+|.|+.|+..
T Consensus 625 ~C~~n---PC~N~g~C~egwNr--fiCDCs~T~~~G~~CerE 661 (1591)
T KOG3514|consen 625 ICESN---PCQNGGKCSEGWNR--FICDCSGTGFEGRTCERE 661 (1591)
T ss_pred ccCCC---cccCCCCccccccc--cccccccCcccCccccce
Confidence 68899 99999999999998 999996 57999999854
No 42
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface protein (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=91.39 E-value=0.067 Score=35.97 Aligned_cols=63 Identities=24% Similarity=0.636 Sum_probs=42.9
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCc---CCCCccCCCCCCC-----CCCCCCCeeeeCCC----CCeeeeCCCCCCCC
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYA---GEHCEHSGTPCGQ-----IFCFHEAQCLALSQ----VHNACDCPPDWKGS 84 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~---g~~C~~~~~~C~~-----~~C~~~g~C~~~~~----~~~~C~C~~g~~g~ 84 (138)
.|.| |..+...++ |.|.|..||. -..|+. ...|.. .+|...+.|..... ..|.|.|.+||.-.
T Consensus 7 ~CKN-G~LiQMSNH--fEC~Cnegfvl~~EntCE~-kv~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~~~ 81 (197)
T PF06247_consen 7 ICKN-GYLIQMSNH--FECKCNEGFVLKNENTCEE-KVECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGYILK 81 (197)
T ss_dssp --BT-EEEEEESSE--EEEEESTTEEEEETTEEEE-----SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTEEES
T ss_pred cccC-CEEEEccCc--eEEEcCCCcEEcccccccc-ceecCcccccCccccchhhhhcCCCcccceeEEEecccCceee
Confidence 5775 577788888 9999999997 356753 345542 57888889987653 37999999999865
No 43
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=91.34 E-value=0.18 Score=42.38 Aligned_cols=38 Identities=21% Similarity=0.501 Sum_probs=32.2
Q ss_pred CCCCCCCCCCCeeeeCCCCCeeeeCC-CCCCCCCCCCCCCC
Q 044268 53 PCGQIFCFHEAQCLALSQVHNACDCP-PDWKGSADCSLPTL 92 (138)
Q Consensus 53 ~C~~~~C~~~g~C~~~~~~~~~C~C~-~g~~g~~~C~~~~~ 92 (138)
.|.++||.|+|.|...- ..|.|.|. .+|.|. .|+....
T Consensus 625 ~C~~nPC~N~g~C~egw-NrfiCDCs~T~~~G~-~CerE~t 663 (1591)
T KOG3514|consen 625 ICESNPCQNGGKCSEGW-NRFICDCSGTGFEGR-TCEREAT 663 (1591)
T ss_pred ccCCCcccCCCCccccc-cccccccccCcccCc-cccceee
Confidence 78899999999999876 48999996 689999 9986543
No 44
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=91.11 E-value=0.12 Score=25.37 Aligned_cols=24 Identities=25% Similarity=0.646 Sum_probs=17.3
Q ss_pred cCCCCCEEeecC-CCcceEeeCCCCCc
Q 044268 18 WCEHGGKCEEIV-QGEMYDCKCPAGYA 43 (138)
Q Consensus 18 pC~~~~~C~~~~-~~~~~~C~C~~g~~ 43 (138)
.|..++.|++.. +. +.|.|..||.
T Consensus 6 ~cP~NA~C~~~~dG~--eecrCllgyk 30 (37)
T PF12946_consen 6 KCPANAGCFRYDDGS--EECRCLLGYK 30 (37)
T ss_dssp ---TTEEEEEETTSE--EEEEE-TTEE
T ss_pred cCCCCcccEEcCCCC--EEEEeeCCcc
Confidence 788889999877 55 9999999996
No 45
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=89.79 E-value=0.52 Score=32.55 Aligned_cols=37 Identities=19% Similarity=0.513 Sum_probs=27.2
Q ss_pred CCCCccCCCCCCC--CCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 44 GEHCEHSGTPCGQ--IFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 44 g~~C~~~~~~C~~--~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
+..|+ +.++|.. .+|.. .|.+.. |.|.|.|+.||+..
T Consensus 181 ~~~C~-~~~~C~~~~~~c~~--~C~~~~-g~~~c~c~~g~~~~ 219 (224)
T cd01475 181 GKICV-VPDLCATLSHVCQQ--VCISTP-GSYLCACTEGYALL 219 (224)
T ss_pred cccCc-CchhhcCCCCCccc--eEEcCC-CCEEeECCCCccCC
Confidence 55674 5567753 44653 799888 69999999999865
No 46
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=89.11 E-value=0.14 Score=28.37 Aligned_cols=46 Identities=26% Similarity=0.615 Sum_probs=20.3
Q ss_pred eEeeCCCCCcCCCCccCCCCCCCC-CCCCCCeeeeCCCCCeeeeCCCCCCCCCCC
Q 044268 34 YDCKCPAGYAGEHCEHSGTPCGQI-FCFHEAQCLALSQVHNACDCPPDWKGSADC 87 (138)
Q Consensus 34 ~~C~C~~g~~g~~C~~~~~~C~~~-~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C 87 (138)
+.-.|...|.|..|. ..|.+. --..+.+|... | .-.|.+||+|. .|
T Consensus 17 ~rv~C~~nyyG~~C~---~~C~~~~d~~ghy~Cd~~--G--~~~C~~Gw~G~-~C 63 (63)
T PF01414_consen 17 IRVVCDENYYGPNCS---KFCKPRDDSFGHYTCDSN--G--NKVCLPGWTGP-NC 63 (63)
T ss_dssp ------TTEETTTT----EE---EEETTEEEEE-SS------EEE-TTEEST-TS
T ss_pred EEEECCCCCCCcccc---CCcCCCcCCcCCcccCCC--C--CCCCCCCCcCC-CC
Confidence 677899999999996 345431 01223355422 1 24689999998 66
No 47
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=86.46 E-value=0.99 Score=40.21 Aligned_cols=54 Identities=22% Similarity=0.540 Sum_probs=35.9
Q ss_pred eCCCCCcCCCCccCCCCCCCCCCCCCCeeeeCC-CCCeeee-CCCCCCCCCCCCCCC
Q 044268 37 KCPAGYAGEHCEHSGTPCGQIFCFHEAQCLALS-QVHNACD-CPPDWKGSADCSLPT 91 (138)
Q Consensus 37 ~C~~g~~g~~C~~~~~~C~~~~C~~~g~C~~~~-~~~~~C~-C~~g~~g~~~C~~~~ 91 (138)
.|..||.|..=.-....|.+=+|.+++.|.... .....|. |++||+|. .|+...
T Consensus 760 ~C~~GfYg~~~~~~~~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~-rCe~c~ 815 (1705)
T KOG1836|consen 760 QCVDGFYGLPDLGTSGDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGL-RCEECA 815 (1705)
T ss_pred hhcCCCCCccccCCCCCCccCCCCCChhhcCcCcccceecCCCCCCCccc-ccccCC
Confidence 466666655332122337777788887776654 2367898 99999999 987543
No 48
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=85.66 E-value=1.2 Score=38.19 Aligned_cols=56 Identities=27% Similarity=0.637 Sum_probs=33.4
Q ss_pred eEe-eCCCCCcCCCCccCCCCCCCCCCCCCC--------eeeeCC-CCCeeeeCCCCCCCCCCCCCC
Q 044268 34 YDC-KCPAGYAGEHCEHSGTPCGQIFCFHEA--------QCLALS-QVHNACDCPPDWKGSADCSLP 90 (138)
Q Consensus 34 ~~C-~C~~g~~g~~C~~~~~~C~~~~C~~~g--------~C~~~~-~~~~~C~C~~g~~g~~~C~~~ 90 (138)
+.| .|..||.|..---....|.+=||..+- .|.... .....|+|.+||+|. +|+.-
T Consensus 885 ~~CdrCl~GyyGdP~lg~g~~CrPCpCP~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~-RCe~C 950 (1758)
T KOG0994|consen 885 HSCDRCLDGYYGDPRLGSGIGCRPCPCPDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGS-RCEIC 950 (1758)
T ss_pred cchhhhhccccCCcccCCCCCCCCCCCCCCCccchhccccccccccccceeeecccCcccc-chhhh
Confidence 778 689999875321122345444453321 232211 125789999999999 98753
No 49
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=81.99 E-value=0.95 Score=27.58 Aligned_cols=34 Identities=21% Similarity=0.573 Sum_probs=20.1
Q ss_pred ccCccCCCCccCCCCCEEeecCCC---cceEeeCCCCC
Q 044268 8 MTMCDGTNEFWCEHGGKCEEIVQG---EMYDCKCPAGY 42 (138)
Q Consensus 8 ~~~C~~~~~~pC~~~~~C~~~~~~---~~~~C~C~~g~ 42 (138)
.+.|... .+.|..+|.|...... .=|.|.|.+.+
T Consensus 5 ~~aC~~~-Tn~CsgHG~C~~~~~~~~~~C~~C~C~~T~ 41 (103)
T PF12955_consen 5 NDACENA-TNNCSGHGSCVKKYGSGGGDCFACKCKPTV 41 (103)
T ss_pred HHHHHHh-ccCCCCCceEeeccCCCccceEEEEeeccc
Confidence 3445443 3469999999876432 11667766544
No 50
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=81.27 E-value=2.3 Score=25.98 Aligned_cols=32 Identities=22% Similarity=0.565 Sum_probs=24.0
Q ss_pred CCCCCC-CCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 51 GTPCGQ-IFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 51 ~~~C~~-~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
.+.|.. ..|...+.|.... ...|.|.+||..+
T Consensus 77 ~d~Cd~y~~CG~~g~C~~~~--~~~C~Cl~GF~P~ 109 (110)
T PF00954_consen 77 KDQCDVYGFCGPNGICNSNN--SPKCSCLPGFEPK 109 (110)
T ss_pred ccCCCCccccCCccEeCCCC--CCceECCCCcCCC
Confidence 456764 7899999996542 5679999999754
No 51
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=80.99 E-value=2.8 Score=36.13 Aligned_cols=15 Identities=33% Similarity=1.263 Sum_probs=12.6
Q ss_pred EeeCCCCCcCCCCcc
Q 044268 35 DCKCPAGYAGEHCEH 49 (138)
Q Consensus 35 ~C~C~~g~~g~~C~~ 49 (138)
.|.|.+||.|+.|..
T Consensus 1085 QCqCkpGfGGR~C~q 1099 (1758)
T KOG0994|consen 1085 QCQCKPGFGGRTCSQ 1099 (1758)
T ss_pred ceeccCCCCCcchhH
Confidence 499999999998853
No 52
>KOG3509 consensus Basement membrane-specific heparan sulfate proteoglycan (HSPG) core protein [Posttranslational modification, protein turnover, chaperones]
Probab=80.84 E-value=4.7 Score=34.10 Aligned_cols=69 Identities=28% Similarity=0.683 Sum_probs=46.5
Q ss_pred cCccCCCCccCCCCCEEeecCCCcceEeeCCCCCcCCCCccCCCCCCCCCC-CCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 9 TMCDGTNEFWCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEHSGTPCGQIFC-FHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 9 ~~C~~~~~~pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~~~~~C~~~~C-~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
+.|... ||...+.|...... ..|.|+++|+|..|....+.|...+- ...++|.... ....+.|.++ .|.
T Consensus 407 ~~c~~~---p~~~~g~c~p~~~~--~~c~c~~g~~G~~c~d~~~~~~~~~~g~y~~t~~~~~-~~~~~~c~pg-~g~ 476 (964)
T KOG3509|consen 407 DVCWRI---PCQHDGPCLQTLEG--KQCLCPPGYTGDSCEDCMNGCDRSPNGSYLGTCVPIQ-GKRCEYCGPG-AGA 476 (964)
T ss_pred Cccccc---cCCCCccccccccc--cceeccccccCchhhccCccccccCCccccceEeccC-CCcceeecCC-CCC
Confidence 457777 89999999888877 89999999999999765555543221 1223554443 2455667777 444
No 53
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=80.74 E-value=1.8 Score=29.82 Aligned_cols=20 Identities=35% Similarity=0.703 Sum_probs=17.5
Q ss_pred CEEeecCCCcceEeeCCCCCcC
Q 044268 23 GKCEEIVQGEMYDCKCPAGYAG 44 (138)
Q Consensus 23 ~~C~~~~~~~~~~C~C~~g~~g 44 (138)
..|.++.++ |.|.|+.||+.
T Consensus 199 ~~C~~~~g~--~~c~c~~g~~~ 218 (224)
T cd01475 199 QVCISTPGS--YLCACTEGYAL 218 (224)
T ss_pred ceEEcCCCC--EEeECCCCccC
Confidence 369999999 99999999974
No 54
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=73.72 E-value=3.5 Score=21.34 Aligned_cols=15 Identities=27% Similarity=0.926 Sum_probs=12.3
Q ss_pred eEeeCCCCCcCCCCc
Q 044268 34 YDCKCPAGYAGEHCE 48 (138)
Q Consensus 34 ~~C~C~~g~~g~~C~ 48 (138)
-.|.|.+++.|..|+
T Consensus 19 G~C~C~~~~~G~~C~ 33 (50)
T cd00055 19 GQCECKPNTTGRRCD 33 (50)
T ss_pred CEEeCCCcCCCCCCC
Confidence 358899999998885
No 55
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=73.25 E-value=1.6 Score=22.49 Aligned_cols=22 Identities=27% Similarity=0.835 Sum_probs=16.3
Q ss_pred CEEeecCCCcceEeeCCCCCcCCCCc
Q 044268 23 GKCEEIVQGEMYDCKCPAGYAGEHCE 48 (138)
Q Consensus 23 ~~C~~~~~~~~~~C~C~~g~~g~~C~ 48 (138)
..|.... ..|.|.++|+|+.|+
T Consensus 11 ~~C~~~~----G~C~C~~~~~G~~C~ 32 (49)
T PF00053_consen 11 QTCDPST----GQCVCKPGTTGPRCD 32 (49)
T ss_dssp SSEEETC----EEESBSTTEESTTS-
T ss_pred CcccCCC----CEEeccccccCCcCc
Confidence 4566533 569999999999996
No 56
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=65.16 E-value=16 Score=18.86 Aligned_cols=20 Identities=40% Similarity=1.051 Sum_probs=12.2
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCc
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYA 43 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~ 43 (138)
.|..+..|++ -.|.|+.||.
T Consensus 27 qC~~~s~C~~------g~C~C~~g~~ 46 (52)
T PF01683_consen 27 QCIGGSVCVN------GRCQCPPGYV 46 (52)
T ss_pred CCCCcCEEcC------CEeECCCCCE
Confidence 4555566633 3577777764
No 57
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=64.18 E-value=19 Score=32.68 Aligned_cols=37 Identities=38% Similarity=0.881 Sum_probs=28.0
Q ss_pred ccCCCCccCCCCCEEeecCCCcceEee-CCCCCcCCCCccC
Q 044268 11 CDGTNEFWCEHGGKCEEIVQGEMYDCK-CPAGYAGEHCEHS 50 (138)
Q Consensus 11 C~~~~~~pC~~~~~C~~~~~~~~~~C~-C~~g~~g~~C~~~ 50 (138)
|.+= +|.+++.|..........|. |+++|+|..|+..
T Consensus 777 C~~C---~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~c 814 (1705)
T KOG1836|consen 777 CQPC---PCPNGGACGQTPEILEVVCKNCPPGYTGLRCEEC 814 (1705)
T ss_pred CccC---CCCCChhhcCcCcccceecCCCCCCCcccccccC
Confidence 6666 78888888766533337898 9999999999753
No 58
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=61.70 E-value=7.5 Score=19.73 Aligned_cols=16 Identities=31% Similarity=1.009 Sum_probs=12.8
Q ss_pred eeeeCCCCCCCCCCCCC
Q 044268 73 NACDCPPDWKGSADCSL 89 (138)
Q Consensus 73 ~~C~C~~g~~g~~~C~~ 89 (138)
..|.|.++++|. .|+.
T Consensus 18 G~C~C~~~~~G~-~C~~ 33 (46)
T smart00180 18 GQCECKPNVTGR-RCDR 33 (46)
T ss_pred CEEECCCCCCCC-CCCc
Confidence 468899999998 8774
No 59
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=56.95 E-value=11 Score=18.10 Aligned_cols=13 Identities=23% Similarity=0.588 Sum_probs=10.5
Q ss_pred CeeeeCCCCCCCC
Q 044268 72 HNACDCPPDWKGS 84 (138)
Q Consensus 72 ~~~C~C~~g~~g~ 84 (138)
.+.|.||.||.-+
T Consensus 17 ~~~C~CPeGyIld 29 (34)
T PF09064_consen 17 PGQCFCPEGYILD 29 (34)
T ss_pred CCceeCCCceEec
Confidence 5689999999755
No 60
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system.; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=52.04 E-value=6.2 Score=21.09 Aligned_cols=32 Identities=22% Similarity=0.423 Sum_probs=15.7
Q ss_pred cCCCCCEEe----ecCCCcceEeeCCCCCcCCCCccCC
Q 044268 18 WCEHGGKCE----EIVQGEMYDCKCPAGYAGEHCEHSG 51 (138)
Q Consensus 18 pC~~~~~C~----~~~~~~~~~C~C~~g~~g~~C~~~~ 51 (138)
+|..+|... ...+. ..|.|..-|.|..|+..+
T Consensus 18 ~CSGHGr~flDg~~~dG~--p~CECn~Cy~GpdCS~~~ 53 (56)
T PF04863_consen 18 SCSGHGRAFLDGLIADGS--PVCECNSCYGGPDCSTLI 53 (56)
T ss_dssp --TTSEE--TTS-EETTE--E--EE-TTEESTTS-EE-
T ss_pred CcCCCCeeeeccccccCC--ccccccCCcCCCCcccCC
Confidence 576666653 12344 679999999999987543
No 61
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=45.06 E-value=40 Score=28.49 Aligned_cols=58 Identities=24% Similarity=0.647 Sum_probs=32.0
Q ss_pred CCCCCEEeecCCCcceEeeCCCCCc----CCCCccC----------CCCCCCCCCCCCCeeeeCCCCCeeeeCCCCCCCC
Q 044268 19 CEHGGKCEEIVQGEMYDCKCPAGYA----GEHCEHS----------GTPCGQIFCFHEAQCLALSQVHNACDCPPDWKGS 84 (138)
Q Consensus 19 C~~~~~C~~~~~~~~~~C~C~~g~~----g~~C~~~----------~~~C~~~~C~~~g~C~~~~~~~~~C~C~~g~~g~ 84 (138)
|...|..+-..+. |.|.+||. +..|+.- ...| .+|..+..-. .. +...|.|..||.-.
T Consensus 248 C~~dGeWlvpiG~----C~C~aGye~~~~~~~C~aCp~G~yK~~~~~~~C--~~CP~~S~s~-~e-ga~~C~C~~gyyRA 319 (996)
T KOG0196|consen 248 CSGDGEWLVPIGG----CVCKAGYEEAENGKACQACPPGTYKASQGDSLC--LPCPPNSHSS-SE-GATSCTCENGYYRA 319 (996)
T ss_pred EcCCCcEEEEcCc----eeecCCCCcccCCCcceeCCCCcccCCCCCCCC--CCCCCCCCCC-CC-CCCcccccCCcccC
Confidence 4444444444444 99999996 4556421 1122 2455443321 22 47789999998754
No 62
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=43.49 E-value=73 Score=22.82 Aligned_cols=35 Identities=29% Similarity=0.652 Sum_probs=23.9
Q ss_pred eEeeCCCCCcCCCCccCCCCCCC-CCCCCCCeeeeC
Q 044268 34 YDCKCPAGYAGEHCEHSGTPCGQ-IFCFHEAQCLAL 68 (138)
Q Consensus 34 ~~C~C~~g~~g~~C~~~~~~C~~-~~C~~~g~C~~~ 68 (138)
-.|.|.+||.|..|......|.. ..|.+++.|...
T Consensus 162 ~~c~c~~g~~g~~~~~~~~~c~~~~~~~~g~~C~~~ 197 (316)
T KOG1218|consen 162 GICTCQPGFVGVFCVESCSGCSPLTACENGAKCNRS 197 (316)
T ss_pred CceeccCCcccccccccCCCcCCCcccCCCCeeecc
Confidence 45889999999888654433543 456676677654
No 63
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=38.41 E-value=57 Score=19.19 Aligned_cols=25 Identities=20% Similarity=0.436 Sum_probs=15.0
Q ss_pred CCCCCCCCCCeeeeCCCCCeeeeCC
Q 044268 54 CGQIFCFHEAQCLALSQVHNACDCP 78 (138)
Q Consensus 54 C~~~~C~~~g~C~~~~~~~~~C~C~ 78 (138)
|....|..|.+|.....+...|+|.
T Consensus 2 C~~v~C~~G~~C~~d~~~~p~CvC~ 26 (86)
T cd01328 2 CENHHCGAGKVCEVDDENTPKCVCI 26 (86)
T ss_pred CCCcCCCCCCEeeECCCCCeEEecC
Confidence 4455666666776543346667665
No 64
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=31.06 E-value=46 Score=27.60 Aligned_cols=29 Identities=24% Similarity=0.630 Sum_probs=22.9
Q ss_pred CCCCCCeeeeCCCCCeeeeCCCCCCCCCCCCCCC
Q 044268 58 FCFHEAQCLALSQVHNACDCPPDWKGSADCSLPT 91 (138)
Q Consensus 58 ~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C~~~~ 91 (138)
.|...|+|.+ ...|+|.+||.+. .|+...
T Consensus 631 ~C~g~GVCnn----~~~ChC~~gwapp-~C~~~~ 659 (716)
T KOG3607|consen 631 TCNGHGVCNN----ELNCHCEPGWAPP-FCFIFG 659 (716)
T ss_pred ccCCCcccCC----CcceeeCCCCCCC-cccccc
Confidence 4777788864 5579999999999 998644
No 65
>KOG3509 consensus Basement membrane-specific heparan sulfate proteoglycan (HSPG) core protein [Posttranslational modification, protein turnover, chaperones]
Probab=29.79 E-value=54 Score=28.18 Aligned_cols=36 Identities=33% Similarity=0.817 Sum_probs=28.4
Q ss_pred CCCCCCCCCCCCeeeeCCCCCeeeeCCCCCCCCCCCCC
Q 044268 52 TPCGQIFCFHEAQCLALSQVHNACDCPPDWKGSADCSL 89 (138)
Q Consensus 52 ~~C~~~~C~~~g~C~~~~~~~~~C~C~~g~~g~~~C~~ 89 (138)
..|...+|...+.|.... .+..|.|+++|+|. .|+.
T Consensus 407 ~~c~~~p~~~~g~c~p~~-~~~~c~c~~g~~G~-~c~d 442 (964)
T KOG3509|consen 407 DVCWRIPCQHDGPCLQTL-EGKQCLCPPGYTGD-SCED 442 (964)
T ss_pred CccccccCCCCccccccc-cccceeccccccCc-hhhc
Confidence 456667888888887777 38899999999999 7654
No 66
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=24.42 E-value=60 Score=26.98 Aligned_cols=27 Identities=30% Similarity=0.784 Sum_probs=20.5
Q ss_pred cCCCCCEEeecCCCcceEeeCCCCCcCCCCcc
Q 044268 18 WCEHGGKCEEIVQGEMYDCKCPAGYAGEHCEH 49 (138)
Q Consensus 18 pC~~~~~C~~~~~~~~~~C~C~~g~~g~~C~~ 49 (138)
.|..+|.|.+. +.|.|.+||.+..|+.
T Consensus 631 ~C~g~GVCnn~-----~~ChC~~gwapp~C~~ 657 (716)
T KOG3607|consen 631 TCNGHGVCNNE-----LNCHCEPGWAPPFCFI 657 (716)
T ss_pred ccCCCcccCCC-----cceeeCCCCCCCcccc
Confidence 37777778543 6799999999988864
Done!