Query 027951
Match_columns 216
No_of_seqs 134 out of 194
Neff 4.7
Searched_HMMs 46136
Date Fri Mar 29 03:59:15 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/027951.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/027951hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1219 Uncharacterized conser 99.3 1.7E-12 3.6E-17 138.7 6.5 103 40-150 3864-3975(4289)
2 KOG4289 Cadherin EGF LAG seven 98.8 1.7E-09 3.7E-14 113.2 3.8 101 37-175 1176-1301(2531)
3 KOG4289 Cadherin EGF LAG seven 98.3 6.6E-07 1.4E-11 94.4 4.6 66 42-145 1241-1308(2531)
4 KOG1219 Uncharacterized conser 98.1 3E-06 6.5E-11 92.6 5.3 85 80-170 3865-3959(4289)
5 KOG1214 Nidogen and related ba 98.1 5.4E-06 1.2E-10 84.3 6.7 111 41-160 735-876 (1289)
6 PF00008 EGF: EGF-like domain 98.0 4.3E-06 9.3E-11 52.3 1.8 31 120-150 1-32 (32)
7 PF00008 EGF: EGF-like domain 97.9 5.2E-06 1.1E-10 52.0 2.1 31 43-75 1-32 (32)
8 PF07645 EGF_CA: Calcium-bindi 97.8 2E-05 4.2E-10 51.8 2.7 30 117-147 2-34 (42)
9 KOG1217 Fibrillins and related 97.7 0.00025 5.3E-09 63.8 9.3 102 40-149 271-389 (487)
10 KOG1217 Fibrillins and related 97.6 0.00024 5.3E-09 63.8 8.7 121 41-167 170-330 (487)
11 KOG1225 Teneurin-1 and related 97.6 0.00016 3.5E-09 70.7 7.5 94 39-154 243-343 (525)
12 smart00179 EGF_CA Calcium-bind 97.6 9E-05 2E-09 46.0 3.7 30 117-147 2-33 (39)
13 smart00179 EGF_CA Calcium-bind 97.4 0.00019 4.2E-09 44.5 3.4 33 41-76 3-38 (39)
14 KOG1225 Teneurin-1 and related 97.4 0.00034 7.4E-09 68.5 6.5 67 64-150 297-365 (525)
15 KOG4260 Uncharacterized conser 97.3 0.00025 5.5E-09 64.9 4.3 99 46-148 150-269 (350)
16 cd00054 EGF_CA Calcium-binding 97.3 0.00033 7.2E-09 42.6 3.5 32 117-149 2-35 (38)
17 smart00181 EGF Epidermal growt 97.2 0.00052 1.1E-08 42.1 3.4 31 43-76 2-34 (35)
18 cd00054 EGF_CA Calcium-binding 97.1 0.00061 1.3E-08 41.4 3.3 33 41-76 3-37 (38)
19 cd00053 EGF Epidermal growth f 96.9 0.0013 2.8E-08 39.2 3.5 29 120-149 2-32 (36)
20 cd00053 EGF Epidermal growth f 96.9 0.0014 3E-08 39.1 3.4 29 43-74 2-32 (36)
21 smart00181 EGF Epidermal growt 96.9 0.0015 3.2E-08 40.1 3.4 30 119-149 1-31 (35)
22 PF07974 EGF_2: EGF-like domai 96.7 0.0016 3.5E-08 41.1 2.9 25 47-76 7-32 (32)
23 KOG1214 Nidogen and related ba 96.5 0.01 2.2E-07 61.3 8.1 123 42-169 696-843 (1289)
24 KOG1226 Integrin beta subunit 96.3 0.0075 1.6E-07 61.3 6.1 40 123-167 594-635 (783)
25 KOG1226 Integrin beta subunit 96.3 0.0077 1.7E-07 61.2 6.1 89 64-158 479-589 (783)
26 PF12661 hEGF: Human growth fa 96.2 0.0018 4E-08 33.4 0.7 13 64-76 1-13 (13)
27 PF07974 EGF_2: EGF-like domai 95.8 0.01 2.3E-07 37.4 3.0 25 123-150 6-31 (32)
28 PF07645 EGF_CA: Calcium-bindi 95.8 0.0073 1.6E-07 39.5 2.3 29 41-72 3-34 (42)
29 PHA03099 epidermal growth fact 95.7 0.018 4E-07 47.4 4.8 38 42-80 44-84 (139)
30 PF12947 EGF_3: EGF domain; I 95.4 0.0077 1.7E-07 38.9 1.3 26 124-150 7-33 (36)
31 PF12662 cEGF: Complement Clr- 95.3 0.011 2.5E-07 35.2 1.8 13 62-74 1-13 (24)
32 PF06247 Plasmod_Pvs28: Plasmo 94.7 0.0083 1.8E-07 52.2 -0.0 31 119-151 133-164 (197)
33 PHA02887 EGF-like protein; Pro 94.0 0.046 1E-06 44.5 2.8 36 42-78 85-123 (126)
34 KOG4260 Uncharacterized conser 93.5 0.058 1.3E-06 49.8 2.8 74 47-166 245-319 (350)
35 PHA02887 EGF-like protein; Pro 92.2 0.14 3.1E-06 41.7 3.2 36 115-150 81-120 (126)
36 PF12662 cEGF: Complement Clr- 91.3 0.17 3.8E-06 30.2 2.0 14 137-150 1-14 (24)
37 PF12947 EGF_3: EGF domain; I 90.4 0.14 3.1E-06 32.9 1.2 24 48-74 8-32 (36)
38 PF14670 FXa_inhibition: Coagu 89.0 0.28 6.1E-06 31.7 1.8 22 129-151 11-32 (36)
39 cd01475 vWA_Matrilin VWA_Matri 89.0 0.46 1E-05 40.5 3.6 33 117-151 187-221 (224)
40 PF14670 FXa_inhibition: Coagu 86.0 0.41 9E-06 30.9 1.2 21 52-75 11-31 (36)
41 PHA03099 epidermal growth fact 84.7 0.96 2.1E-05 37.5 3.0 34 117-150 42-79 (139)
42 smart00051 DSL delta serrate l 84.2 0.73 1.6E-05 33.1 1.9 13 64-76 51-63 (63)
43 PF00954 S_locus_glycop: S-loc 84.2 0.93 2E-05 34.8 2.6 30 116-147 76-107 (110)
44 PF06247 Plasmod_Pvs28: Plasmo 83.1 0.47 1E-05 41.5 0.6 86 48-170 52-147 (197)
45 cd01475 vWA_Matrilin VWA_Matri 78.7 1.3 2.8E-05 37.7 1.9 40 33-75 174-220 (224)
46 smart00051 DSL delta serrate l 77.3 3.3 7.2E-05 29.7 3.3 16 62-77 16-31 (63)
47 PF12946 EGF_MSP1_1: MSP1 EGF 76.6 1.4 3.1E-05 29.0 1.2 31 120-150 2-33 (37)
48 KOG1836 Extracellular matrix g 76.6 5.3 0.00011 44.8 6.1 41 43-84 777-820 (1705)
49 KOG3516 Neurexin IV [Signal tr 70.6 3.1 6.6E-05 45.0 2.5 35 116-151 544-580 (1306)
50 KOG3516 Neurexin IV [Signal tr 70.3 3.2 6.9E-05 44.8 2.5 36 40-78 545-582 (1306)
51 PF00954 S_locus_glycop: S-loc 68.0 4.7 0.0001 30.8 2.5 31 40-74 77-109 (110)
52 PF12946 EGF_MSP1_1: MSP1 EGF 66.3 3.5 7.7E-05 27.1 1.2 30 43-74 2-32 (37)
53 KOG3514 Neurexin III-alpha [Si 65.3 4.1 8.9E-05 43.9 2.1 35 40-77 623-659 (1591)
54 PF00053 Laminin_EGF: Laminin 62.6 5.1 0.00011 26.5 1.5 22 51-77 11-32 (49)
55 KOG1218 Proteins containing Ca 62.1 27 0.00059 30.5 6.4 36 138-173 162-200 (316)
56 PF12955 DUF3844: Domain of un 60.3 8.3 0.00018 30.6 2.6 31 41-71 6-41 (103)
57 PF01414 DSL: Delta serrate li 59.4 2.5 5.3E-05 30.4 -0.5 11 66-76 53-63 (63)
58 PF12955 DUF3844: Domain of un 58.2 13 0.00029 29.4 3.4 49 123-174 13-66 (103)
59 PF07172 GRP: Glycine rich pro 57.6 7.3 0.00016 30.2 1.8 14 13-26 14-27 (95)
60 KOG3509 Basement membrane-spec 54.0 23 0.0005 37.8 5.3 36 39-77 405-441 (964)
61 smart00180 EGF_Lam Laminin-typ 52.3 9.5 0.0002 25.3 1.5 16 62-77 17-32 (46)
62 cd00055 EGF_Lam Laminin-type e 51.5 9.5 0.00021 25.5 1.4 16 62-77 18-33 (50)
63 KOG1836 Extracellular matrix g 49.5 35 0.00076 38.6 6.0 31 120-150 777-810 (1705)
64 KOG3607 Meltrins, fertilins an 48.6 29 0.00063 35.8 5.0 22 125-150 632-654 (716)
65 KOG3607 Meltrins, fertilins an 48.0 11 0.00024 38.8 1.8 17 61-77 640-656 (716)
66 PF04863 EGF_alliinase: Alliin 39.1 12 0.00025 26.8 0.3 29 49-77 21-50 (56)
67 PF09064 Tme5_EGF_like: Thromb 38.9 22 0.00049 23.0 1.6 14 61-74 16-29 (34)
68 PF01683 EB: EB module; Inter 36.4 40 0.00087 22.3 2.6 27 118-149 20-48 (52)
69 KOG0196 Tyrosine kinase, EPH ( 34.8 79 0.0017 33.7 5.6 25 135-159 305-330 (996)
70 PLN03148 Blue copper-like prot 32.1 41 0.00088 28.8 2.6 19 160-181 106-124 (167)
71 cd01328 FSL_SPARC Follistatin- 31.0 49 0.0011 25.3 2.6 26 42-69 1-27 (86)
72 KOG3512 Netrin, axonal chemotr 29.8 70 0.0015 32.1 4.0 91 40-167 391-493 (592)
73 KOG4004 Matricellular protein 29.8 39 0.00084 30.4 2.1 53 41-98 51-104 (259)
74 cd01328 FSL_SPARC Follistatin- 28.1 65 0.0014 24.6 2.9 30 119-148 1-31 (86)
75 PF06607 Prokineticin: Prokine 27.6 34 0.00073 26.8 1.3 39 38-77 21-61 (97)
76 PF07359 LEAP-2: Liver-express 25.7 39 0.00084 25.6 1.2 18 3-20 1-18 (77)
77 KOG3509 Basement membrane-spec 23.9 1.3E+02 0.0029 32.3 5.1 69 62-150 717-793 (964)
78 KOG3658 Tumor necrosis factor- 23.5 1.1E+02 0.0025 31.8 4.4 28 118-145 564-595 (764)
79 KOG3512 Netrin, axonal chemotr 23.5 1.4E+02 0.0029 30.1 4.8 24 52-77 286-309 (592)
80 KOG3653 Transforming growth fa 23.1 2.8E+02 0.0061 27.9 6.8 13 137-149 115-127 (534)
81 KOG0994 Extracellular matrix g 20.8 1.9E+02 0.0041 32.4 5.5 19 60-78 931-949 (1758)
82 PF01826 TIL: Trypsin Inhibito 20.6 41 0.00089 22.5 0.5 17 140-159 35-51 (55)
No 1
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=99.32 E-value=1.7e-12 Score=138.66 Aligned_cols=103 Identities=23% Similarity=0.475 Sum_probs=90.0
Q ss_pred cCCCccCCCCCC-eeeccCCCCCcceeecCCCCccCccc-ccCCCccCCccCC-CCCC---CCCCCCCCCCcccccccCC
Q 027951 40 ENVCDKVTCGKG-KCKASQNSTFFYECECDLGWKQNTMA-VDQNLKFLPCIAP-DCTL---NQDCAPSPSPAQEKAAKKN 113 (216)
Q Consensus 40 ~~~C~~~~Cg~G-tC~~~~~~~~~Y~C~C~pGwtg~~c~-~~d~~~~lPC~ip-nCt~---n~sC~~~~~~~~~~~~~~n 113 (216)
.++|+..+|+|| +|..+. ..+|.|+|.+-|+|.+|+ ++..|.+.||-.+ .|.. ++.|.|..++ |++.
T Consensus 3864 ~d~C~~npCqhgG~C~~~~--~ggy~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC~~gy-----TG~~ 3936 (4289)
T KOG1219|consen 3864 TDPCNDNPCQHGGTCISQP--KGGYKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNCPNGY-----TGKR 3936 (4289)
T ss_pred ccccccCcccCCCEecCCC--CCceEEeCcccccCcccccccccccCCCCCCCCEEEecCCCeeEeCCCCc-----cCce
Confidence 489999999985 999874 569999999999999999 5889999999987 7985 5899998888 5665
Q ss_pred -C-CcCCCCCCCCcC-CCeeecCCCCceeeecCCCCcCCC
Q 027951 114 -E-SIFDLCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 114 -~-s~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nll 150 (216)
| +-+++|..+.|+ +|+|++.. ++|+|.|.+||.|++
T Consensus 3937 Ce~~Gi~eCs~n~C~~gg~C~n~~-gsf~CncT~g~~gr~ 3975 (4289)
T KOG1219|consen 3937 CEARGISECSKNVCGTGGQCINIP-GSFHCNCTPGILGRT 3975 (4289)
T ss_pred eecccccccccccccCCceeeccC-CceEeccChhHhccc
Confidence 3 337999999999 88999975 799999999999987
No 2
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.84 E-value=1.7e-09 Score=113.19 Aligned_cols=101 Identities=24% Similarity=0.488 Sum_probs=84.5
Q ss_pred ccccCCCccCCCCCC-eeeccCC-------------------CCCcceeecCCCCccCcccccCCCccCCccCCCCCCCC
Q 027951 37 PAFENVCDKVTCGKG-KCKASQN-------------------STFFYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQ 96 (216)
Q Consensus 37 ~~~~~~C~~~~Cg~G-tC~~~~~-------------------~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~ 96 (216)
|+.|++|..+||-|= .|+...- ...+++|+|.|||||..|+
T Consensus 1176 pfdDniClrEPCenymkCvsvlrFdssapf~~s~s~lfRpi~pvnglrCrCPpGFTgd~Ce------------------- 1236 (2531)
T KOG4289|consen 1176 PFDDNICLREPCENYMKCVSVLRFDSSAPFLASDSVLFRPIHPVNGLRCRCPPGFTGDYCE------------------- 1236 (2531)
T ss_pred eccCchhhcchhHHHHhhhhheeecccCccccccceeeeeccccCceeEeCCCCCCccccc-------------------
Confidence 778999999999974 8865321 1348999999999999864
Q ss_pred CCCCCCCCcccccccCCCCcCCCCCCCCcC-CCeeecCCCCceeeecCCCCcCCC---CCCccccc-ccCCcccccccCC
Q 027951 97 DCAPSPSPAQEKAAKKNESIFDLCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLL---NTSTFPCY-KECSIGMDCKNMG 171 (216)
Q Consensus 97 sC~~~~~~~~~~~~~~n~s~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nll---n~t~~pC~-~~CslG~dC~~lg 171 (216)
..+|.|+..+|+ +|+|+.-. ++|+|+|++||.|.. ...++-|+ .-|.-|.-|.+++
T Consensus 1237 ------------------TeiDlCYs~pC~nng~C~srE-ggYtCeCrpg~tGehCEvs~~agrCvpGvC~nggtC~~~~ 1297 (2531)
T KOG4289|consen 1237 ------------------TEIDLCYSGPCGNNGRCRSRE-GGYTCECRPGFTGEHCEVSARAGRCVPGVCKNGGTCVNLL 1297 (2531)
T ss_pred ------------------chhHhhhcCCCCCCCceEEec-CceeEEecCCccccceeeecccCccccceecCCCEEeecC
Confidence 235899999999 99999865 799999999999996 47789999 7799999999998
Q ss_pred CcCC
Q 027951 172 ISVP 175 (216)
Q Consensus 172 i~~~ 175 (216)
+...
T Consensus 1298 nggf 1301 (2531)
T KOG4289|consen 1298 NGGF 1301 (2531)
T ss_pred CCce
Confidence 8753
No 3
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.28 E-value=6.6e-07 Score=94.45 Aligned_cols=66 Identities=29% Similarity=0.666 Sum_probs=53.6
Q ss_pred CCccCCCC-CCeeeccCCCCCcceeecCCCCccCcccccCCCccCCccCCCCCCCCCCCCCCCCcccccccCCCCcCCCC
Q 027951 42 VCDKVTCG-KGKCKASQNSTFFYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQDCAPSPSPAQEKAAKKNESIFDLC 120 (216)
Q Consensus 42 ~C~~~~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~sC~~~~~~~~~~~~~~n~s~~DPC 120 (216)
.|-+.+|+ ||+|... .++|.|+|.|||+|.+|++.- ...-|
T Consensus 1241 lCYs~pC~nng~C~sr---EggYtCeCrpg~tGehCEvs~-----------------------------------~agrC 1282 (2531)
T KOG4289|consen 1241 LCYSGPCGNNGRCRSR---EGGYTCECRPGFTGEHCEVSA-----------------------------------RAGRC 1282 (2531)
T ss_pred hhhcCCCCCCCceEEe---cCceeEEecCCccccceeeec-----------------------------------ccCcc
Confidence 56677899 6899987 568999999999999976310 12357
Q ss_pred CCCCcC-CCeeecCCCCceeeecCCC
Q 027951 121 RWIDCG-GGSCKNTSMFSYSCQCAVD 145 (216)
Q Consensus 121 ~~~~Cg-gGtCv~~~~~sY~C~C~~G 145 (216)
...+|. +|+|++...++++|.|+.|
T Consensus 1283 vpGvC~nggtC~~~~nggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1283 VPGVCKNGGTCVNLLNGGFCCHCPYG 1308 (2531)
T ss_pred ccceecCCCEEeecCCCceeccCCCc
Confidence 778888 8899998888999999998
No 4
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.11 E-value=3e-06 Score=92.56 Aligned_cols=85 Identities=20% Similarity=0.397 Sum_probs=71.9
Q ss_pred CCCccCCccCC-CCCC----CCCCCCCCCCcccccccCC-CCcCCCCCCCCcC-CCeeecCCCCceeeecCCCCcCCCCC
Q 027951 80 QNLKFLPCIAP-DCTL----NQDCAPSPSPAQEKAAKKN-ESIFDLCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLLNT 152 (216)
Q Consensus 80 d~~~~lPC~ip-nCt~----n~sC~~~~~~~~~~~~~~n-~s~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nlln~ 152 (216)
+.|+.+||+|+ .|+- .|.|.|.+-+ ++++ |....||..++|. ||+|+... ++|.|.|+.||+|....
T Consensus 3865 d~C~~npCqhgG~C~~~~~ggy~CkCpsqy-----sG~~CEi~~epC~snPC~~GgtCip~~-n~f~CnC~~gyTG~~Ce 3938 (4289)
T KOG1219|consen 3865 DPCNDNPCQHGGTCISQPKGGYKCKCPSQY-----SGNHCEIDLEPCASNPCLTGGTCIPFY-NGFLCNCPNGYTGKRCE 3938 (4289)
T ss_pred cccccCcccCCCEecCCCCCceEEeCcccc-----cCcccccccccccCCCCCCCCEEEecC-CCeeEeCCCCccCceee
Confidence 67999999988 8986 5899997766 5777 8788999999999 99999864 78999999999999753
Q ss_pred -C-ccccc-ccCCcccccccC
Q 027951 153 -S-TFPCY-KECSIGMDCKNM 170 (216)
Q Consensus 153 -t-~~pC~-~~CslG~dC~~l 170 (216)
. ...|- ..|.-|+-|.++
T Consensus 3939 ~~Gi~eCs~n~C~~gg~C~n~ 3959 (4289)
T KOG1219|consen 3939 ARGISECSKNVCGTGGQCINI 3959 (4289)
T ss_pred cccccccccccccCCceeecc
Confidence 3 56788 668899999886
No 5
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=98.10 E-value=5.4e-06 Score=84.35 Aligned_cols=111 Identities=29% Similarity=0.571 Sum_probs=80.1
Q ss_pred CCCcc--CCCC-CCeeeccCCCCCcceeecCCCCc----cCccc-ccCCCccCCccCC--CCCC------------CCCC
Q 027951 41 NVCDK--VTCG-KGKCKASQNSTFFYECECDLGWK----QNTMA-VDQNLKFLPCIAP--DCTL------------NQDC 98 (216)
Q Consensus 41 ~~C~~--~~Cg-~GtC~~~~~~~~~Y~C~C~pGwt----g~~c~-~~d~~~~lPC~ip--nCt~------------n~sC 98 (216)
+.|++ ..|| +-.|+.. +.+|+|+|..||. |..|. ..+.-+..||+.+ +|.+ .|+|
T Consensus 735 ~eca~~~~~CGp~s~Cin~---pg~~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C 811 (1289)
T KOG1214|consen 735 NECATGFHRCGPNSVCINL---PGSYRCECRSGYEFADDRHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSC 811 (1289)
T ss_pred hhhccCCCCCCCCceeecC---CCceeEEEeecceeccCCcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEE
Confidence 34544 3477 5688765 7789999999985 33454 2444566788865 6765 2889
Q ss_pred CCCCCCcccccccCC--CCcCCCCCCCCcC-CCeeecCCCCceeeecCCCCcCCC------CCCccccccc
Q 027951 99 APSPSPAQEKAAKKN--ESIFDLCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLL------NTSTFPCYKE 160 (216)
Q Consensus 99 ~~~~~~~~~~~~~~n--~s~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nll------n~t~~pC~~~ 160 (216)
++-|++ .++. -.+.|+|....|. ..+|+++. ++|+|+|++||.|.. ....-+|-+|
T Consensus 812 ~CLPGf-----sGDG~~c~dvDeC~psrChp~A~Cyntp-gsfsC~C~pGy~GDGf~CVP~~~~~T~C~~e 876 (1289)
T KOG1214|consen 812 ACLPGF-----SGDGHQCTDVDECSPSRCHPAATCYNTP-GSFSCRCQPGYYGDGFQCVPDTSSLTPCEQE 876 (1289)
T ss_pred eecCCc-----cCCccccccccccCccccCCCceEecCC-CcceeecccCccCCCceecCCCccCCccccc
Confidence 998888 3333 2356999999999 88999986 899999999999985 2455667655
No 6
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.95 E-value=4.3e-06 Score=52.35 Aligned_cols=31 Identities=26% Similarity=0.610 Sum_probs=26.0
Q ss_pred CCCCCcC-CCeeecCCCCceeeecCCCCcCCC
Q 027951 120 CRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 120 C~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nll 150 (216)
|..++|. +|+|+.....+|+|+|++||+|..
T Consensus 1 C~~~~C~n~g~C~~~~~~~y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 1 CSSNPCQNGGTCIDLPGGGYTCECPPGYTGKR 32 (32)
T ss_dssp TTTTSSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred CCCCcCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence 5667888 789999765799999999999963
No 7
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.94 E-value=5.2e-06 Score=51.95 Aligned_cols=31 Identities=29% Similarity=0.699 Sum_probs=26.1
Q ss_pred CccCCCCC-CeeeccCCCCCcceeecCCCCccCc
Q 027951 43 CDKVTCGK-GKCKASQNSTFFYECECDLGWKQNT 75 (216)
Q Consensus 43 C~~~~Cg~-GtC~~~~~~~~~Y~C~C~pGwtg~~ 75 (216)
|...+|.| |+|++.. ..+|+|+|.+||+|.+
T Consensus 1 C~~~~C~n~g~C~~~~--~~~y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 1 CSSNPCQNGGTCIDLP--GGGYTCECPPGYTGKR 32 (32)
T ss_dssp TTTTSSTTTEEEEEES--TSEEEEEEBTTEESTT
T ss_pred CCCCcCCCCeEEEeCC--CCCEEeECCCCCccCC
Confidence 66779997 6999984 3699999999999963
No 8
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=97.77 E-value=2e-05 Score=51.78 Aligned_cols=30 Identities=40% Similarity=0.757 Sum_probs=26.1
Q ss_pred CCCCCC--CCcC-CCeeecCCCCceeeecCCCCc
Q 027951 117 FDLCRW--IDCG-GGSCKNTSMFSYSCQCAVDHY 147 (216)
Q Consensus 117 ~DPC~~--~~Cg-gGtCv~~~~~sY~C~C~~Gy~ 147 (216)
+|+|.. +.|. +++|+++. ++|+|.|++||.
T Consensus 2 idEC~~~~~~C~~~~~C~N~~-Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 2 IDECAEGPHNCPENGTCVNTE-GSYSCSCPPGYE 34 (42)
T ss_dssp SSTTTTTSSSSSTTSEEEEET-TEEEEEESTTEE
T ss_pred ccccCCCCCcCCCCCEEEcCC-CCEEeeCCCCcE
Confidence 578876 3698 88999986 899999999999
No 9
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.68 E-value=0.00025 Score=63.77 Aligned_cols=102 Identities=25% Similarity=0.522 Sum_probs=71.6
Q ss_pred cCCCccCC-CCC-CeeeccCCCCCcceeecCCCCccCcc-c--ccCCC----ccCCccCC-CCCC-----CCCCCCCCCC
Q 027951 40 ENVCDKVT-CGK-GKCKASQNSTFFYECECDLGWKQNTM-A--VDQNL----KFLPCIAP-DCTL-----NQDCAPSPSP 104 (216)
Q Consensus 40 ~~~C~~~~-Cg~-GtC~~~~~~~~~Y~C~C~pGwtg~~c-~--~~d~~----~~lPC~ip-nCt~-----n~sC~~~~~~ 104 (216)
.+.|.... |.+ |+|+.. ...|+|+|.+||+|..+ . +.++| .-.+|..+ .|.. .+.|.+..++
T Consensus 271 ~~~C~~~~~c~~~~~C~~~---~~~~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c~~~~ 347 (487)
T KOG1217|consen 271 VDSCALIASCPNGGTCVNV---PGSYRCTCPPGFTGRLCTECVDVDECSPRNAGGPCANGGTCNTLGSFGGFRCACGPGF 347 (487)
T ss_pred ccccCCCCccCCCCeeecC---CCcceeeCCCCCCCCCCccccccccccccccCCcCCCCcccccCCCCCCCCcCCCCCC
Confidence 56788764 885 799987 33599999999999997 2 23455 34557665 6622 2456666653
Q ss_pred cccccccCC-CCcCCCCCCCCcC-CCeeecCCCCceeeecCCCCcCC
Q 027951 105 AQEKAAKKN-ESIFDLCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNL 149 (216)
Q Consensus 105 ~~~~~~~~n-~s~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nl 149 (216)
+++. +...|.|....|. +|+|++...++|+|.|+.||.+.
T Consensus 348 -----~g~~C~~~~~~C~~~~~~~~~~c~~~~~~~~~c~~~~~~~~~ 389 (487)
T KOG1217|consen 348 -----TGRRCEDSNDECASSPCCPGGTCVNETPGSYRCACPAGFAGK 389 (487)
T ss_pred -----CCCccccCCccccCCccccCCEeccCCCCCeEecCCCccccC
Confidence 3444 3222589988877 88999843378999999999985
No 10
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.64 E-value=0.00024 Score=63.79 Aligned_cols=121 Identities=24% Similarity=0.479 Sum_probs=72.3
Q ss_pred CCCc--cCCCCC-CeeeccCCCCCcceeecCCCCccCccccc---CCCc------------cCCccC---------CCCC
Q 027951 41 NVCD--KVTCGK-GKCKASQNSTFFYECECDLGWKQNTMAVD---QNLK------------FLPCIA---------PDCT 93 (216)
Q Consensus 41 ~~C~--~~~Cg~-GtC~~~~~~~~~Y~C~C~pGwtg~~c~~~---d~~~------------~lPC~i---------pnCt 93 (216)
+.|. ...|.+ ++|+.. ..+|.|.|.+||++..++.. ..+. ...|.+ ..|.
T Consensus 170 ~~C~~~~~~c~~~~~C~~~---~~~~~C~c~~~~~~~~~~~~~~~~~c~~~~~~~~~~g~~~~~c~~~~~~~~~~~~~c~ 246 (487)
T KOG1217|consen 170 DECIQYSSPCQNGGTCVNT---GGSYLCSCPPGYTGSTCETTGNGGTCVDSVACSCPPGARGPECEVSIVECASGDGTCV 246 (487)
T ss_pred cccccCCCCcCCCcccccC---CCCeeEeCCCCccCCcCcCCCCCceEecceeccCCCCCCCCCcccccccccCCCCccc
Confidence 4676 335885 599887 44699999999999987532 0111 112222 1222
Q ss_pred C---CCCCCCCCCCcccccccCCCCcCCCCCCCC-cC-CCeeecCCCCceeeecCCCCcCCCC--C-Cccccc-----cc
Q 027951 94 L---NQDCAPSPSPAQEKAAKKNESIFDLCRWID-CG-GGSCKNTSMFSYSCQCAVDHYNLLN--T-STFPCY-----KE 160 (216)
Q Consensus 94 ~---n~sC~~~~~~~~~~~~~~n~s~~DPC~~~~-Cg-gGtCv~~~~~sY~C~C~~Gy~nlln--~-t~~pC~-----~~ 160 (216)
. ++.|.+.+++..... ....+.|.|.... |. +|+|++.. ++|.|+|++||.+... . ....|. ..
T Consensus 247 ~~~~~~~C~~~~g~~~~~~--~~~~~~~~C~~~~~c~~~~~C~~~~-~~~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~ 323 (487)
T KOG1217|consen 247 NTVGSYTCRCPEGYTGDAC--VTCVDVDSCALIASCPNGGTCVNVP-GSYRCTCPPGFTGRLCTECVDVDECSPRNAGGP 323 (487)
T ss_pred ccCCceeeeCCCCcccccc--ceeeeccccCCCCccCCCCeeecCC-CcceeeCCCCCCCCCCccccccccccccccCCc
Confidence 2 234554444421100 0112468899864 88 78999975 5699999999999996 1 224451 23
Q ss_pred CCccccc
Q 027951 161 CSIGMDC 167 (216)
Q Consensus 161 CslG~dC 167 (216)
|..|..|
T Consensus 324 c~~g~~C 330 (487)
T KOG1217|consen 324 CANGGTC 330 (487)
T ss_pred CCCCccc
Confidence 5566566
No 11
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=97.61 E-value=0.00016 Score=70.72 Aligned_cols=94 Identities=24% Similarity=0.555 Sum_probs=60.6
Q ss_pred ccCCCccCCCCCC-----eeeccCCCCCcceeecCCCCccCcccccCCCccCCccC-CCCCCCCCCCCCCCCcccccccC
Q 027951 39 FENVCDKVTCGKG-----KCKASQNSTFFYECECDLGWKQNTMAVDQNLKFLPCIA-PDCTLNQDCAPSPSPAQEKAAKK 112 (216)
Q Consensus 39 ~~~~C~~~~Cg~G-----tC~~~~~~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~i-pnCt~n~sC~~~~~~~~~~~~~~ 112 (216)
+...|....|.+| .|++. +|.|++||+|..|+. -.|+.. |.- ..|.++ .|.+.++| .+.
T Consensus 243 ~g~~c~~~~C~~~c~~~g~c~~G-------~CIC~~Gf~G~dC~e-~~Cp~~-cs~~g~~~~g-~CiC~~g~-----~G~ 307 (525)
T KOG1225|consen 243 FGPLCSTIYCPGGCTGRGQCVEG-------RCICPPGFTGDDCDE-LVCPVD-CSGGGVCVDG-ECICNPGY-----SGK 307 (525)
T ss_pred eCCccccccCCCCCcccceEeCC-------eEeCCCCCcCCCCCc-ccCCcc-cCCCceecCC-EeecCCCc-----ccc
Confidence 3445666666644 68775 699999999998763 234444 542 244444 78888877 222
Q ss_pred CCCcCCCCCCCCcC-CCeeecCCCCceeeecCCCCcCCCCCCc
Q 027951 113 NESIFDLCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLLNTST 154 (216)
Q Consensus 113 n~s~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nlln~t~ 154 (216)
.=+. --|- ..|. +|.|+++ +|+|.+||+|.+-.+.
T Consensus 308 dCs~-~~cp-adC~g~G~Ci~G-----~C~C~~Gy~G~~C~~~ 343 (525)
T KOG1225|consen 308 DCSI-RRCP-ADCSGHGKCIDG-----ECLCDEGYTGELCIQR 343 (525)
T ss_pred cccc-ccCC-ccCCCCCcccCC-----ceEeCCCCcCCccccc
Confidence 2010 1132 6676 8999943 6999999999997665
No 12
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.59 E-value=9e-05 Score=46.02 Aligned_cols=30 Identities=40% Similarity=0.790 Sum_probs=25.6
Q ss_pred CCCCCC-CCcC-CCeeecCCCCceeeecCCCCc
Q 027951 117 FDLCRW-IDCG-GGSCKNTSMFSYSCQCAVDHY 147 (216)
Q Consensus 117 ~DPC~~-~~Cg-gGtCv~~~~~sY~C~C~~Gy~ 147 (216)
+|+|.. .+|. +|+|++.. ++|+|.|++||.
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~-g~~~C~C~~g~~ 33 (39)
T smart00179 2 IDECASGNPCQNGGTCVNTV-GSYRCECPPGYT 33 (39)
T ss_pred cccCcCCCCcCCCCEeECCC-CCeEeECCCCCc
Confidence 478887 7898 67999875 789999999998
No 13
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.40 E-value=0.00019 Score=44.49 Aligned_cols=33 Identities=27% Similarity=0.727 Sum_probs=27.0
Q ss_pred CCCcc-CCCCC-CeeeccCCCCCcceeecCCCCc-cCcc
Q 027951 41 NVCDK-VTCGK-GKCKASQNSTFFYECECDLGWK-QNTM 76 (216)
Q Consensus 41 ~~C~~-~~Cg~-GtC~~~~~~~~~Y~C~C~pGwt-g~~c 76 (216)
+.|.. .+|.+ |+|+.. ..+|+|.|.+||+ |.+|
T Consensus 3 ~~C~~~~~C~~~~~C~~~---~g~~~C~C~~g~~~g~~C 38 (39)
T smart00179 3 DECASGNPCQNGGTCVNT---VGSYRCECPPGYTDGRNC 38 (39)
T ss_pred ccCcCCCCcCCCCEeECC---CCCeEeECCCCCccCCcC
Confidence 56877 68986 499987 5689999999999 8764
No 14
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=97.38 E-value=0.00034 Score=68.47 Aligned_cols=67 Identities=24% Similarity=0.572 Sum_probs=50.0
Q ss_pred eeecCCCCccCcccccCCCccCCcc-CCCCCCCCCCCCCCCCcccccccCCCCcCCCCCCCCcC-CCeeecCCCCceeee
Q 027951 64 ECECDLGWKQNTMAVDQNLKFLPCI-APDCTLNQDCAPSPSPAQEKAAKKNESIFDLCRWIDCG-GGSCKNTSMFSYSCQ 141 (216)
Q Consensus 64 ~C~C~pGwtg~~c~~~d~~~~lPC~-ipnCt~n~sC~~~~~~~~~~~~~~n~s~~DPC~~~~Cg-gGtCv~~~~~sY~C~ 141 (216)
+|.|++||+|..|+ +..|+ -.|. +++|. +..|.+.++|. -+.|....|. +|+|++ + |.
T Consensus 297 ~CiC~~g~~G~dCs-~~~cp-adC~g~G~Ci-~G~C~C~~Gy~-----------G~~C~~~~C~~~g~cv~----g--C~ 356 (525)
T KOG1225|consen 297 ECICNPGYSGKDCS-IRRCP-ADCSGHGKCI-DGECLCDEGYT-----------GELCIQRACSGGGQCVN----G--CK 356 (525)
T ss_pred EeecCCCccccccc-cccCC-ccCCCCCccc-CCceEeCCCCc-----------CCcccccccCCCceecc----C--ce
Confidence 89999999999985 22233 3465 56787 67899999882 1456655587 778887 2 99
Q ss_pred cCCCCcCCC
Q 027951 142 CAVDHYNLL 150 (216)
Q Consensus 142 C~~Gy~nll 150 (216)
|++||+|.-
T Consensus 357 C~~Gw~G~d 365 (525)
T KOG1225|consen 357 CKKGWRGPD 365 (525)
T ss_pred eccCccCCC
Confidence 999999876
No 15
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=97.31 E-value=0.00025 Score=64.90 Aligned_cols=99 Identities=17% Similarity=0.335 Sum_probs=59.6
Q ss_pred CCCC-CCeeeccCCCCCcceeecCCCCccCcccc--cCCC------ccCCcc---C---CCCCC--CCCCCC-CCCCccc
Q 027951 46 VTCG-KGKCKASQNSTFFYECECDLGWKQNTMAV--DQNL------KFLPCI---A---PDCTL--NQDCAP-SPSPAQE 107 (216)
Q Consensus 46 ~~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~~c~~--~d~~------~~lPC~---i---pnCt~--n~sC~~-~~~~~~~ 107 (216)
.+|. +|.|.-...-.++-.|+|++||+|+.|.. +.-+ +-+=|. - ..|+- +.+|.. --+|+-
T Consensus 150 r~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg~eyfes~Rne~~lvCt~Ch~~C~~~Csg~~~k~C~kCkkGW~l- 228 (350)
T KOG4260|consen 150 RPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCGIEYFESSRNEQHLVCTACHEGCLGVCSGESSKGCSKCKKGWKL- 228 (350)
T ss_pred CCcCCCCcccCCCCCCCCCcccccCCCCCccccccchHHHHhhcccccchhhhhhhhhhcccCCCCCCChhhhccccee-
Confidence 3565 89998644335678999999999999763 1111 111111 0 02221 122221 223311
Q ss_pred ccccCCCCcCCCCCC--CCcC-CCeeecCCCCceeeecCCCCcC
Q 027951 108 KAAKKNESIFDLCRW--IDCG-GGSCKNTSMFSYSCQCAVDHYN 148 (216)
Q Consensus 108 ~~~~~n~s~~DPC~~--~~Cg-gGtCv~~~~~sY~C~C~~Gy~n 148 (216)
....=.++|+|.. .+|+ +--|+|+. +||+|++++||.+
T Consensus 229 --de~gCvDvnEC~~ep~~c~~~qfCvNte-GSf~C~dk~Gy~~ 269 (350)
T KOG4260|consen 229 --DEEGCVDVNECQNEPAPCKAHQFCVNTE-GSFKCEDKEGYKK 269 (350)
T ss_pred --cccccccHHHHhcCCCCCChhheeecCC-CceEecccccccC
Confidence 1111135689975 5798 77899986 8999999999998
No 16
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.29 E-value=0.00033 Score=42.62 Aligned_cols=32 Identities=38% Similarity=0.706 Sum_probs=26.6
Q ss_pred CCCCCC-CCcC-CCeeecCCCCceeeecCCCCcCC
Q 027951 117 FDLCRW-IDCG-GGSCKNTSMFSYSCQCAVDHYNL 149 (216)
Q Consensus 117 ~DPC~~-~~Cg-gGtCv~~~~~sY~C~C~~Gy~nl 149 (216)
.|+|.. .+|. +|+|++.. ++|+|.|.+||.|.
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~-~~~~C~C~~g~~g~ 35 (38)
T cd00054 2 IDECASGNPCQNGGTCVNTV-GSYRCSCPPGYTGR 35 (38)
T ss_pred cccCCCCCCcCCCCEeECCC-CCeEeECCCCCcCC
Confidence 477887 6898 77999865 68999999999883
No 17
>smart00181 EGF Epidermal growth factor-like domain.
Probab=97.16 E-value=0.00052 Score=42.14 Aligned_cols=31 Identities=26% Similarity=0.727 Sum_probs=25.3
Q ss_pred Ccc-CCCCCCeeeccCCCCCcceeecCCCCcc-Ccc
Q 027951 43 CDK-VTCGKGKCKASQNSTFFYECECDLGWKQ-NTM 76 (216)
Q Consensus 43 C~~-~~Cg~GtC~~~~~~~~~Y~C~C~pGwtg-~~c 76 (216)
|.. .+|.+|+|+.. ..+|+|.|.+||+| ..|
T Consensus 2 C~~~~~C~~~~C~~~---~~~~~C~C~~g~~g~~~C 34 (35)
T smart00181 2 CASGGPCSNGTCINT---PGSYTCSCPPGYTGDKRC 34 (35)
T ss_pred CCCcCCCCCCEEECC---CCCeEeECCCCCccCCcc
Confidence 555 57888899987 56899999999999 653
No 18
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.11 E-value=0.00061 Score=41.41 Aligned_cols=33 Identities=24% Similarity=0.674 Sum_probs=26.7
Q ss_pred CCCcc-CCCC-CCeeeccCCCCCcceeecCCCCccCcc
Q 027951 41 NVCDK-VTCG-KGKCKASQNSTFFYECECDLGWKQNTM 76 (216)
Q Consensus 41 ~~C~~-~~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~~c 76 (216)
+.|.. .+|. +|+|+.. ..+|+|+|.+||.|..|
T Consensus 3 ~~C~~~~~C~~~~~C~~~---~~~~~C~C~~g~~g~~C 37 (38)
T cd00054 3 DECASGNPCQNGGTCVNT---VGSYRCSCPPGYTGRNC 37 (38)
T ss_pred ccCCCCCCcCCCCEeECC---CCCeEeECCCCCcCCcC
Confidence 56776 6898 4699977 56799999999999764
No 19
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.92 E-value=0.0013 Score=39.22 Aligned_cols=29 Identities=38% Similarity=0.738 Sum_probs=23.8
Q ss_pred CC-CCCcC-CCeeecCCCCceeeecCCCCcCC
Q 027951 120 CR-WIDCG-GGSCKNTSMFSYSCQCAVDHYNL 149 (216)
Q Consensus 120 C~-~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nl 149 (216)
|. ...|. +++|++.. ++|+|.|++||.+.
T Consensus 2 C~~~~~C~~~~~C~~~~-~~~~C~C~~g~~g~ 32 (36)
T cd00053 2 CAASNPCSNGGTCVNTP-GSYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCCCCCEEecCC-CCeEeECCCCCccc
Confidence 44 56677 68999876 68999999999987
No 20
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.88 E-value=0.0014 Score=39.12 Aligned_cols=29 Identities=28% Similarity=0.804 Sum_probs=23.6
Q ss_pred Cc-cCCCCC-CeeeccCCCCCcceeecCCCCccC
Q 027951 43 CD-KVTCGK-GKCKASQNSTFFYECECDLGWKQN 74 (216)
Q Consensus 43 C~-~~~Cg~-GtC~~~~~~~~~Y~C~C~pGwtg~ 74 (216)
|. ..+|.+ |+|+.. ..+|+|+|.+||.|.
T Consensus 2 C~~~~~C~~~~~C~~~---~~~~~C~C~~g~~g~ 32 (36)
T cd00053 2 CAASNPCSNGGTCVNT---PGSYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCCCCCEEecC---CCCeEeECCCCCccc
Confidence 44 567875 899987 468999999999997
No 21
>smart00181 EGF Epidermal growth factor-like domain.
Probab=96.85 E-value=0.0015 Score=40.07 Aligned_cols=30 Identities=33% Similarity=0.597 Sum_probs=23.9
Q ss_pred CCCC-CCcCCCeeecCCCCceeeecCCCCcCC
Q 027951 119 LCRW-IDCGGGSCKNTSMFSYSCQCAVDHYNL 149 (216)
Q Consensus 119 PC~~-~~CggGtCv~~~~~sY~C~C~~Gy~nl 149 (216)
+|.. .+|.+++|++. .++|+|.|++||.+.
T Consensus 1 ~C~~~~~C~~~~C~~~-~~~~~C~C~~g~~g~ 31 (35)
T smart00181 1 ECASGGPCSNGTCINT-PGSYTCSCPPGYTGD 31 (35)
T ss_pred CCCCcCCCCCCEEECC-CCCeEeECCCCCccC
Confidence 3555 57885599987 479999999999985
No 22
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=96.73 E-value=0.0016 Score=41.09 Aligned_cols=25 Identities=28% Similarity=0.763 Sum_probs=20.4
Q ss_pred CCC-CCeeeccCCCCCcceeecCCCCccCcc
Q 027951 47 TCG-KGKCKASQNSTFFYECECDLGWKQNTM 76 (216)
Q Consensus 47 ~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~~c 76 (216)
.|. ||+|+.. ..+|+|++||+|+.|
T Consensus 7 ~C~~~G~C~~~-----~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 7 ICSGHGTCVSP-----CGRCVCDSGYTGPDC 32 (32)
T ss_pred ccCCCCEEeCC-----CCEEECCCCCcCCCC
Confidence 466 8999864 478999999999863
No 23
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=96.47 E-value=0.01 Score=61.33 Aligned_cols=123 Identities=20% Similarity=0.408 Sum_probs=74.3
Q ss_pred CCccCCCCCC-eeeccCCCCCcceeecCCCCccC--cccccCCCccCCccCCCCCCCCCCCCCCCCcc-c-------ccc
Q 027951 42 VCDKVTCGKG-KCKASQNSTFFYECECDLGWKQN--TMAVDQNLKFLPCIAPDCTLNQDCAPSPSPAQ-E-------KAA 110 (216)
Q Consensus 42 ~C~~~~Cg~G-tC~~~~~~~~~Y~C~C~pGwtg~--~c~~~d~~~~lPC~ipnCt~n~sC~~~~~~~~-~-------~~~ 110 (216)
+|.+..|+-+ .|.++. ...|.|+|..||.|. .|.++++|..-+ ++|--+.-|.+.++--+ + ...
T Consensus 696 y~gsh~cdt~a~C~pg~--~~~~tcecs~g~~gdgr~c~d~~eca~~~---~~CGp~s~Cin~pg~~rceC~~gy~F~dd 770 (1289)
T KOG1214|consen 696 YDGSHMCDTTARCHPGT--GVDYTCECSSGYQGDGRNCVDENECATGF---HRCGPNSVCINLPGSYRCECRSGYEFADD 770 (1289)
T ss_pred eecCcccCCCccccCCC--CcceEEEEeeccCCCCCCCCChhhhccCC---CCCCCCceeecCCCceeEEEeecceeccC
Confidence 4566778865 888863 347999999999865 455566666522 34444333333222110 0 000
Q ss_pred cCC-C-----CcCCCCCC--CCcC-CC--eeecCCCCceeeecCCCCcCCC--CCCccccc-ccCCccccccc
Q 027951 111 KKN-E-----SIFDLCRW--IDCG-GG--SCKNTSMFSYSCQCAVDHYNLL--NTSTFPCY-KECSIGMDCKN 169 (216)
Q Consensus 111 ~~n-~-----s~~DPC~~--~~Cg-gG--tCv~~~~~sY~C~C~~Gy~nll--n~t~~pC~-~~CslG~dC~~ 169 (216)
+.+ . -..++|.. +.|. .| +|+-+.+++|+|.|-+||+|.. |...+.|- +-|.--+-|-+
T Consensus 771 ~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvDeC~psrChp~A~Cyn 843 (1289)
T KOG1214|consen 771 RHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVDECSPSRCHPAATCYN 843 (1289)
T ss_pred CcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccCCccccccccccCccccCCCceEec
Confidence 001 0 01357764 5676 44 6777777789999999999875 66667776 55666666654
No 24
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=96.30 E-value=0.0075 Score=61.31 Aligned_cols=40 Identities=20% Similarity=0.527 Sum_probs=24.9
Q ss_pred CCcC-CCeeecCCCCceeeecCCC-CcCCCCCCcccccccCCccccc
Q 027951 123 IDCG-GGSCKNTSMFSYSCQCAVD-HYNLLNTSTFPCYKECSIGMDC 167 (216)
Q Consensus 123 ~~Cg-gGtCv~~~~~sY~C~C~~G-y~nlln~t~~pC~~~CslG~dC 167 (216)
..|. +|+|+=+ +|+|.+. |+|.+-.--.-|...|..-.+|
T Consensus 594 ~iCSGrG~C~Cg-----~C~C~~~~~sG~~CE~cptc~~~C~~~~~C 635 (783)
T KOG1226|consen 594 QICSGRGTCECG-----RCKCTDPPYSGEFCEKCPTCPDPCAENKSC 635 (783)
T ss_pred ceeCCCceeeCC-----ceEcCCCCcCcchhhcCCCCCCcccccccc
Confidence 3465 8899874 6999888 9998754433344334443333
No 25
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=96.30 E-value=0.0077 Score=61.24 Aligned_cols=89 Identities=27% Similarity=0.481 Sum_probs=50.1
Q ss_pred eeecCCCCccCcccc-cCCCcc----CCcc----CCCCCCCCCCC-----CCCCCcccccccCC-CCcCCCCCC---CCc
Q 027951 64 ECECDLGWKQNTMAV-DQNLKF----LPCI----APDCTLNQDCA-----PSPSPAQEKAAKKN-ESIFDLCRW---IDC 125 (216)
Q Consensus 64 ~C~C~pGwtg~~c~~-~d~~~~----lPC~----ipnCt~n~sC~-----~~~~~~~~~~~~~n-~s~~DPC~~---~~C 125 (216)
.|+|++||.|.+|+- .++... --|. .|.|+-.-.|. +.+...+ +-+++- |-+.--|.. ..|
T Consensus 479 ~C~C~~G~~G~~CEC~~~~~ss~~~~~~Cr~~~~~~vCSgrG~C~CGqC~C~~~~~~-~i~G~fCECDnfsC~r~~g~lC 557 (783)
T KOG1226|consen 479 QCRCDEGWLGKKCECSTDELSSSEEEDKCRENSDSPVCSGRGDCVCGQCVCHKPDNG-KIYGKFCECDNFSCERHKGVLC 557 (783)
T ss_pred ceecCCCCCCCcccCCccccCcHhHHhhccCCCCCCCcCCCCcEeCCceEecCCCCC-ceeeeeeeccCcccccccCccc
Confidence 699999999999983 333333 1232 12344433333 3333211 223333 322123443 358
Q ss_pred C-CCeeecCCCCceeeecCCCCcCCCC---CCccccc
Q 027951 126 G-GGSCKNTSMFSYSCQCAVDHYNLLN---TSTFPCY 158 (216)
Q Consensus 126 g-gGtCv~~~~~sY~C~C~~Gy~nlln---~t~~pC~ 158 (216)
+ +|+|.=+ +|.|++||+|..- .++.-|.
T Consensus 558 ~g~G~C~CG-----~CvC~~GwtG~~C~C~~std~C~ 589 (783)
T KOG1226|consen 558 GGHGRCECG-----RCVCNPGWTGSACNCPLSTDTCE 589 (783)
T ss_pred CCCCeEeCC-----cEEcCCCCccCCCCCCCCCcccc
Confidence 7 8999874 6999999999973 4445555
No 26
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.21 E-value=0.0018 Score=33.43 Aligned_cols=13 Identities=31% Similarity=1.030 Sum_probs=10.4
Q ss_pred eeecCCCCccCcc
Q 027951 64 ECECDLGWKQNTM 76 (216)
Q Consensus 64 ~C~C~pGwtg~~c 76 (216)
+|+|++||+|.+|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 5999999999874
No 27
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=95.83 E-value=0.01 Score=37.39 Aligned_cols=25 Identities=20% Similarity=0.388 Sum_probs=20.6
Q ss_pred CCcC-CCeeecCCCCceeeecCCCCcCCC
Q 027951 123 IDCG-GGSCKNTSMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 123 ~~Cg-gGtCv~~~~~sY~C~C~~Gy~nll 150 (216)
..|. +|+|+.. .++|+|++||+|..
T Consensus 6 ~~C~~~G~C~~~---~g~C~C~~g~~G~~ 31 (32)
T PF07974_consen 6 NICSGHGTCVSP---CGRCVCDSGYTGPD 31 (32)
T ss_pred CccCCCCEEeCC---CCEEECCCCCcCCC
Confidence 4576 9999985 47999999999864
No 28
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=95.79 E-value=0.0073 Score=39.47 Aligned_cols=29 Identities=28% Similarity=0.808 Sum_probs=24.0
Q ss_pred CCCccC--CCC-CCeeeccCCCCCcceeecCCCCc
Q 027951 41 NVCDKV--TCG-KGKCKASQNSTFFYECECDLGWK 72 (216)
Q Consensus 41 ~~C~~~--~Cg-~GtC~~~~~~~~~Y~C~C~pGwt 72 (216)
+.|+.. .|. +++|+.+ .++|+|+|.+||+
T Consensus 3 dEC~~~~~~C~~~~~C~N~---~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 3 DECAEGPHNCPENGTCVNT---EGSYSCSCPPGYE 34 (42)
T ss_dssp STTTTTSSSSSTTSEEEEE---TTEEEEEESTTEE
T ss_pred cccCCCCCcCCCCCEEEcC---CCCEEeeCCCCcE
Confidence 466654 587 6799998 7799999999999
No 29
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=95.73 E-value=0.018 Score=47.44 Aligned_cols=38 Identities=18% Similarity=0.398 Sum_probs=29.2
Q ss_pred CCccC---CCCCCeeeccCCCCCcceeecCCCCccCcccccC
Q 027951 42 VCDKV---TCGKGKCKASQNSTFFYECECDLGWKQNTMAVDQ 80 (216)
Q Consensus 42 ~C~~~---~Cg~GtC~~~~~~~~~Y~C~C~pGwtg~~c~~~d 80 (216)
.|.+. -|-||+|.--.+ ...+.|.|+.||+|.+|++.+
T Consensus 44 ~Cp~ey~~YClHG~C~yI~d-l~~~~CrC~~GYtGeRCEh~d 84 (139)
T PHA03099 44 LCGPEGDGYCLHGDCIHARD-IDGMYCRCSHGYTGIRCQHVV 84 (139)
T ss_pred cCChhhCCEeECCEEEeecc-CCCceeECCCCccccccccee
Confidence 46543 499999976533 457999999999999988543
No 30
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=95.44 E-value=0.0077 Score=38.88 Aligned_cols=26 Identities=31% Similarity=0.580 Sum_probs=19.6
Q ss_pred CcC-CCeeecCCCCceeeecCCCCcCCC
Q 027951 124 DCG-GGSCKNTSMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 124 ~Cg-gGtCv~~~~~sY~C~C~~Gy~nll 150 (216)
.|. +.+|++.. ++|+|+|++||.|..
T Consensus 7 ~C~~nA~C~~~~-~~~~C~C~~Gy~GdG 33 (36)
T PF12947_consen 7 GCHPNATCTNTG-GSYTCTCKPGYEGDG 33 (36)
T ss_dssp GS-TTCEEEE-T-TSEEEEE-CEEECCS
T ss_pred CCCCCcEeecCC-CCEEeECCCCCccCC
Confidence 466 78999986 599999999999864
No 31
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=95.34 E-value=0.011 Score=35.24 Aligned_cols=13 Identities=31% Similarity=0.938 Sum_probs=11.1
Q ss_pred cceeecCCCCccC
Q 027951 62 FYECECDLGWKQN 74 (216)
Q Consensus 62 ~Y~C~C~pGwtg~ 74 (216)
+|+|+|.+||+-.
T Consensus 1 sy~C~C~~Gy~l~ 13 (24)
T PF12662_consen 1 SYTCSCPPGYQLS 13 (24)
T ss_pred CEEeeCCCCCcCC
Confidence 5999999999843
No 32
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=94.74 E-value=0.0083 Score=52.18 Aligned_cols=31 Identities=23% Similarity=0.481 Sum_probs=18.4
Q ss_pred CCCCCCcC-CCeeecCCCCceeeecCCCCcCCCC
Q 027951 119 LCRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLLN 151 (216)
Q Consensus 119 PC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nlln 151 (216)
+|. ..|. +-.|.+.+ +-|+|.|++||.+-..
T Consensus 133 ~C~-LKCk~nE~CK~~~-~~Y~C~~~~~~~~~~~ 164 (197)
T PF06247_consen 133 KCS-LKCKENEECKLVD-GYYKCVCKEGFPGDGE 164 (197)
T ss_dssp -------TTTEEEEEET-TEEEEEE-TT-EEETT
T ss_pred cee-eecCCCcceeeeC-cEEEeecCCCCCCCCC
Confidence 465 4566 55899876 6899999999987653
No 33
>PHA02887 EGF-like protein; Provisional
Probab=93.99 E-value=0.046 Score=44.47 Aligned_cols=36 Identities=19% Similarity=0.356 Sum_probs=28.6
Q ss_pred CCccC---CCCCCeeeccCCCCCcceeecCCCCccCcccc
Q 027951 42 VCDKV---TCGKGKCKASQNSTFFYECECDLGWKQNTMAV 78 (216)
Q Consensus 42 ~C~~~---~Cg~GtC~~~~~~~~~Y~C~C~pGwtg~~c~~ 78 (216)
+|.+. -|-||+|.-..+ ...+.|.|+.||+|.+|+.
T Consensus 85 pC~~eyk~YCiHG~C~yI~d-L~epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 85 KCKNDFNDFCINGECMNIID-LDEKFCICNKGYTGIRCDE 123 (126)
T ss_pred ccChHhhCEeeCCEEEcccc-CCCceeECCCCcccCCCCc
Confidence 67653 499999986543 4578999999999999874
No 34
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=93.51 E-value=0.058 Score=49.79 Aligned_cols=74 Identities=23% Similarity=0.422 Sum_probs=48.9
Q ss_pred CCC-CCeeeccCCCCCcceeecCCCCccCcccccCCCccCCccCCCCCCCCCCCCCCCCcccccccCCCCcCCCCCCCCc
Q 027951 47 TCG-KGKCKASQNSTFFYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQDCAPSPSPAQEKAAKKNESIFDLCRWIDC 125 (216)
Q Consensus 47 ~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~sC~~~~~~~~~~~~~~n~s~~DPC~~~~C 125 (216)
+|. +--|+.+ .++|+|++.+||++. +|+|.+ |. |-|..
T Consensus 245 ~c~~~qfCvNt---eGSf~C~dk~Gy~~g----~d~C~~--~~-----------------------------d~~~~--- 283 (350)
T KOG4260|consen 245 PCKAHQFCVNT---EGSFKCEDKEGYKKG----VDECQF--CA-----------------------------DVCAS--- 283 (350)
T ss_pred CCChhheeecC---CCceEecccccccCC----hHHhhh--hh-----------------------------hhccc---
Confidence 465 4578876 668999999999983 233332 10 12220
Q ss_pred CCCeeecCCCCceeeecCCCCcCCCCCCcccccccCCcccc
Q 027951 126 GGGSCKNTSMFSYSCQCAVDHYNLLNTSTFPCYKECSIGMD 166 (216)
Q Consensus 126 ggGtCv~~~~~sY~C~C~~Gy~nlln~t~~pC~~~CslG~d 166 (216)
.++.|.+.+ ++|+|.|..|+ -...+.|+..+++-.-
T Consensus 284 kn~~c~ni~-~~~r~v~f~~~----~~~~g~cV~~~~p~~a 319 (350)
T KOG4260|consen 284 KNRPCMNID-GQYRCVCFSGL----IIIEGFCVWHGSPVLA 319 (350)
T ss_pred CCCCcccCC-ccEEEEecccc----eeeeeeeeccCCchhh
Confidence 156788876 79999997763 3667889988875443
No 35
>PHA02887 EGF-like protein; Provisional
Probab=92.24 E-value=0.14 Score=41.65 Aligned_cols=36 Identities=22% Similarity=0.417 Sum_probs=28.9
Q ss_pred CcCCCCCC---CCcCCCeeecCC-CCceeeecCCCCcCCC
Q 027951 115 SIFDLCRW---IDCGGGSCKNTS-MFSYSCQCAVDHYNLL 150 (216)
Q Consensus 115 s~~DPC~~---~~CggGtCv~~~-~~sY~C~C~~Gy~nll 150 (216)
-.++||.. ++|-||+|.--. ...+.|.|.+||+|.-
T Consensus 81 ~hf~pC~~eyk~YCiHG~C~yI~dL~epsCrC~~GYtG~R 120 (126)
T PHA02887 81 MFFEKCKNDFNDFCINGECMNIIDLDEKFCICNKGYTGIR 120 (126)
T ss_pred cCccccChHhhCEeeCCEEEccccCCCceeECCCCcccCC
Confidence 35789975 679999997643 4579999999999974
No 36
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=91.27 E-value=0.17 Score=30.15 Aligned_cols=14 Identities=29% Similarity=0.524 Sum_probs=11.8
Q ss_pred ceeeecCCCCcCCC
Q 027951 137 SYSCQCAVDHYNLL 150 (216)
Q Consensus 137 sY~C~C~~Gy~nll 150 (216)
||+|+|++||.-..
T Consensus 1 sy~C~C~~Gy~l~~ 14 (24)
T PF12662_consen 1 SYTCSCPPGYQLSP 14 (24)
T ss_pred CEEeeCCCCCcCCC
Confidence 69999999998543
No 37
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=90.41 E-value=0.14 Score=32.93 Aligned_cols=24 Identities=25% Similarity=0.793 Sum_probs=17.2
Q ss_pred CC-CCeeeccCCCCCcceeecCCCCccC
Q 027951 48 CG-KGKCKASQNSTFFYECECDLGWKQN 74 (216)
Q Consensus 48 Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~ 74 (216)
|. +-+|+.+ ..+|+|+|++||+|.
T Consensus 8 C~~nA~C~~~---~~~~~C~C~~Gy~Gd 32 (36)
T PF12947_consen 8 CHPNATCTNT---GGSYTCTCKPGYEGD 32 (36)
T ss_dssp S-TTCEEEE----TTSEEEEE-CEEECC
T ss_pred CCCCcEeecC---CCCEEeECCCCCccC
Confidence 44 4589887 448999999999985
No 38
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=89.03 E-value=0.28 Score=31.66 Aligned_cols=22 Identities=32% Similarity=0.486 Sum_probs=16.6
Q ss_pred eeecCCCCceeeecCCCCcCCCC
Q 027951 129 SCKNTSMFSYSCQCAVDHYNLLN 151 (216)
Q Consensus 129 tCv~~~~~sY~C~C~~Gy~nlln 151 (216)
.|++.. ++|+|.|++||.-.-|
T Consensus 11 ~C~~~~-g~~~C~C~~Gy~L~~D 32 (36)
T PF14670_consen 11 ICVNTP-GSYRCSCPPGYKLAED 32 (36)
T ss_dssp EEEEET-TSEEEE-STTEEE-TT
T ss_pred CCccCC-CceEeECCCCCEECcC
Confidence 799975 7899999999985543
No 39
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=89.02 E-value=0.46 Score=40.51 Aligned_cols=33 Identities=33% Similarity=0.498 Sum_probs=24.1
Q ss_pred CCCCCC--CCcCCCeeecCCCCceeeecCCCCcCCCC
Q 027951 117 FDLCRW--IDCGGGSCKNTSMFSYSCQCAVDHYNLLN 151 (216)
Q Consensus 117 ~DPC~~--~~CggGtCv~~~~~sY~C~C~~Gy~nlln 151 (216)
.|+|.. +.|.+ .|.++. ++|.|.|.+||....+
T Consensus 187 ~~~C~~~~~~c~~-~C~~~~-g~~~c~c~~g~~~~~~ 221 (224)
T cd01475 187 PDLCATLSHVCQQ-VCISTP-GSYLCACTEGYALLED 221 (224)
T ss_pred chhhcCCCCCccc-eEEcCC-CCEEeECCCCccCCCC
Confidence 467753 34553 799875 8999999999986543
No 40
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=85.97 E-value=0.41 Score=30.89 Aligned_cols=21 Identities=29% Similarity=0.656 Sum_probs=16.2
Q ss_pred eeeccCCCCCcceeecCCCCccCc
Q 027951 52 KCKASQNSTFFYECECDLGWKQNT 75 (216)
Q Consensus 52 tC~~~~~~~~~Y~C~C~pGwtg~~ 75 (216)
.|++. +.+|+|.|.+||+-..
T Consensus 11 ~C~~~---~g~~~C~C~~Gy~L~~ 31 (36)
T PF14670_consen 11 ICVNT---PGSYRCSCPPGYKLAE 31 (36)
T ss_dssp EEEEE---TTSEEEE-STTEEE-T
T ss_pred CCccC---CCceEeECCCCCEECc
Confidence 78887 6689999999998753
No 41
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=84.71 E-value=0.96 Score=37.49 Aligned_cols=34 Identities=21% Similarity=0.435 Sum_probs=25.9
Q ss_pred CCCCCC---CCcCCCeeecC-CCCceeeecCCCCcCCC
Q 027951 117 FDLCRW---IDCGGGSCKNT-SMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 117 ~DPC~~---~~CggGtCv~~-~~~sY~C~C~~Gy~nll 150 (216)
+-+|.. ++|-||+|.-- +...|.|+|..||.|.-
T Consensus 42 i~~Cp~ey~~YClHG~C~yI~dl~~~~CrC~~GYtGeR 79 (139)
T PHA03099 42 IRLCGPEGDGYCLHGDCIHARDIDGMYCRCSHGYTGIR 79 (139)
T ss_pred cccCChhhCCEeECCEEEeeccCCCceeECCCCccccc
Confidence 446653 67999999753 24689999999999875
No 42
>smart00051 DSL delta serrate ligand.
Probab=84.20 E-value=0.73 Score=33.09 Aligned_cols=13 Identities=23% Similarity=0.429 Sum_probs=10.1
Q ss_pred eeecCCCCccCcc
Q 027951 64 ECECDLGWKQNTM 76 (216)
Q Consensus 64 ~C~C~pGwtg~~c 76 (216)
.|.|.|||+|..|
T Consensus 51 ~~~C~~Gw~G~~C 63 (63)
T smart00051 51 NKGCLEGWMGPYC 63 (63)
T ss_pred CEecCCCCcCCCC
Confidence 3559999999863
No 43
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=84.19 E-value=0.93 Score=34.80 Aligned_cols=30 Identities=30% Similarity=0.527 Sum_probs=24.5
Q ss_pred cCCCCCC-CCcC-CCeeecCCCCceeeecCCCCc
Q 027951 116 IFDLCRW-IDCG-GGSCKNTSMFSYSCQCAVDHY 147 (216)
Q Consensus 116 ~~DPC~~-~~Cg-gGtCv~~~~~sY~C~C~~Gy~ 147 (216)
..|+|+. ..|| .|.|..+ .+..|+|-+||.
T Consensus 76 p~d~Cd~y~~CG~~g~C~~~--~~~~C~Cl~GF~ 107 (110)
T PF00954_consen 76 PKDQCDVYGFCGPNGICNSN--NSPKCSCLPGFE 107 (110)
T ss_pred cccCCCCccccCCccEeCCC--CCCceECCCCcC
Confidence 3589996 6899 8999764 467899999996
No 44
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=83.13 E-value=0.47 Score=41.50 Aligned_cols=86 Identities=27% Similarity=0.633 Sum_probs=48.0
Q ss_pred CCC-CeeeccCC--CCCcceeecCCCCccCcccccCCCccCCccCCCCCCCCCCCCCCCCcccccccCCCCcCCCCCCCC
Q 027951 48 CGK-GKCKASQN--STFFYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQDCAPSPSPAQEKAAKKNESIFDLCRWID 124 (216)
Q Consensus 48 Cg~-GtC~~~~~--~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~sC~~~~~~~~~~~~~~n~s~~DPC~~~~ 124 (216)
|+. ++|....+ ....|+|+|-+||.-..-. |+ .+.|....
T Consensus 52 Cgdya~C~~~~~~~~~~~~~C~C~~gY~~~~~v---------Cv----------------------------p~~C~~~~ 94 (197)
T PF06247_consen 52 CGDYAKCINQANKGEERAYKCDCINGYILKQGV---------CV----------------------------PNKCNNKD 94 (197)
T ss_dssp EETTEEEEE-SSTTSSTSEEEEE-TTEEESSSS---------EE----------------------------EGGGSS--
T ss_pred ccchhhhhcCCCcccceeEEEecccCceeeCCe---------Ec----------------------------hhhcCcee
Confidence 665 79987653 3468999999999877521 32 14688889
Q ss_pred cCCCeeecCC--CCceeeecCCCCc-CCCC----CCcccccccCCcccccccC
Q 027951 125 CGGGSCKNTS--MFSYSCQCAVDHY-NLLN----TSTFPCYKECSIGMDCKNM 170 (216)
Q Consensus 125 CggGtCv~~~--~~sY~C~C~~Gy~-nlln----~t~~pC~~~CslG~dC~~l 170 (216)
||.|.|+-.. ...+.|.|+=|+. ...+ .-.-+|.=.|.-...|...
T Consensus 95 Cg~GKCI~d~~~~~~~~CSC~IGkV~~dn~kCtk~G~T~C~LKCk~nE~CK~~ 147 (197)
T PF06247_consen 95 CGSGKCILDPDNPNNPTCSCNIGKVPDDNKKCTKTGETKCSLKCKENEECKLV 147 (197)
T ss_dssp -TTEEEEEEEGGGSEEEEEE-TEEETTTTTESEEEE--------TTTEEEEEE
T ss_pred cCCCeEEecCCCCCCceeEeeeceEeccCCcccCCCccceeeecCCCcceeee
Confidence 9999998643 2367999999999 1111 1113455455556666554
No 45
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=78.66 E-value=1.3 Score=37.74 Aligned_cols=40 Identities=20% Similarity=0.388 Sum_probs=26.5
Q ss_pred CCCCccccCCCcc-CCCCC------CeeeccCCCCCcceeecCCCCccCc
Q 027951 33 PLLAPAFENVCDK-VTCGK------GKCKASQNSTFFYECECDLGWKQNT 75 (216)
Q Consensus 33 p~~~~~~~~~C~~-~~Cg~------GtC~~~~~~~~~Y~C~C~pGwtg~~ 75 (216)
.+...+....|.. .+|.. .+|... .++|.|.|.+||+...
T Consensus 174 ~~~~~l~~~~C~~~~~C~~~~~~c~~~C~~~---~g~~~c~c~~g~~~~~ 220 (224)
T cd01475 174 ELTKKFQGKICVVPDLCATLSHVCQQVCIST---PGSYLCACTEGYALLE 220 (224)
T ss_pred HHhhhcccccCcCchhhcCCCCCccceEEcC---CCCEEeECCCCccCCC
Confidence 4445566777753 24432 257765 6789999999998753
No 46
>smart00051 DSL delta serrate ligand.
Probab=77.32 E-value=3.3 Score=29.67 Aligned_cols=16 Identities=13% Similarity=0.218 Sum_probs=13.1
Q ss_pred cceeecCCCCccCccc
Q 027951 62 FYECECDLGWKQNTMA 77 (216)
Q Consensus 62 ~Y~C~C~pGwtg~~c~ 77 (216)
.|+=.|+++|-|..|+
T Consensus 16 ~~rv~C~~~~yG~~C~ 31 (63)
T smart00051 16 QIRVTCDENYYGEGCN 31 (63)
T ss_pred EEEeeCCCCCcCCccC
Confidence 5677899999999864
No 47
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=76.64 E-value=1.4 Score=28.96 Aligned_cols=31 Identities=19% Similarity=0.353 Sum_probs=20.6
Q ss_pred CCCCCcC-CCeeecCCCCceeeecCCCCcCCC
Q 027951 120 CRWIDCG-GGSCKNTSMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 120 C~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~nll 150 (216)
|....|- +..|.+...+++.|+|..||.-..
T Consensus 2 C~~~~cP~NA~C~~~~dG~eecrCllgyk~~~ 33 (37)
T PF12946_consen 2 CIDTKCPANAGCFRYDDGSEECRCLLGYKKVG 33 (37)
T ss_dssp -SSS---TTEEEEEETTSEEEEEE-TTEEEET
T ss_pred ccCccCCCCcccEEcCCCCEEEEeeCCccccC
Confidence 3444554 678998877899999999997543
No 48
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=76.59 E-value=5.3 Score=44.78 Aligned_cols=41 Identities=17% Similarity=0.433 Sum_probs=31.0
Q ss_pred CccCCCCC-CeeeccCCCCCcceee-cCCCCccCcccc-cCCCcc
Q 027951 43 CDKVTCGK-GKCKASQNSTFFYECE-CDLGWKQNTMAV-DQNLKF 84 (216)
Q Consensus 43 C~~~~Cg~-GtC~~~~~~~~~Y~C~-C~pGwtg~~c~~-~d~~~~ 84 (216)
|+.=+|-. |.|....+ ...+.|+ |.+||+|.+|++ .+...+
T Consensus 777 C~~C~Cp~~~~~~~~~~-~~~~iCk~Cp~gytG~rCe~c~dgyfg 820 (1705)
T KOG1836|consen 777 CQPCPCPNGGACGQTPE-ILEVVCKNCPPGYTGLRCEECADGYFG 820 (1705)
T ss_pred CccCCCCCChhhcCcCc-ccceecCCCCCCCcccccccCCCcccc
Confidence 87777774 58887755 5689999 999999999885 333333
No 49
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=70.56 E-value=3.1 Score=44.96 Aligned_cols=35 Identities=26% Similarity=0.555 Sum_probs=29.9
Q ss_pred cCCCCCCCCcC-CCeeecCCCCceeeecC-CCCcCCCC
Q 027951 116 IFDLCRWIDCG-GGSCKNTSMFSYSCQCA-VDHYNLLN 151 (216)
Q Consensus 116 ~~DPC~~~~Cg-gGtCv~~~~~sY~C~C~-~Gy~nlln 151 (216)
..|+|..++|. +|.|..+ ...|.|.|. .||.|..-
T Consensus 544 i~drClPN~CehgG~C~Qs-~~~f~C~C~~TGY~GatC 580 (1306)
T KOG3516|consen 544 ISDRCLPNPCEHGGKCSQS-WDDFECNCELTGYKGATC 580 (1306)
T ss_pred cccccCCccccCCCccccc-ccceeEeccccccccccc
Confidence 35899999999 7899984 478999999 99999864
No 50
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=70.27 E-value=3.2 Score=44.82 Aligned_cols=36 Identities=33% Similarity=0.693 Sum_probs=30.4
Q ss_pred cCCCccCCCCCC-eeeccCCCCCcceeecC-CCCccCcccc
Q 027951 40 ENVCDKVTCGKG-KCKASQNSTFFYECECD-LGWKQNTMAV 78 (216)
Q Consensus 40 ~~~C~~~~Cg~G-tC~~~~~~~~~Y~C~C~-pGwtg~~c~~ 78 (216)
.+.|.-++|+|| .|.-+ -..|+|.|+ .||+|..|++
T Consensus 545 ~drClPN~CehgG~C~Qs---~~~f~C~C~~TGY~GatCHt 582 (1306)
T KOG3516|consen 545 SDRCLPNPCEHGGKCSQS---WDDFECNCELTGYKGATCHT 582 (1306)
T ss_pred ccccCCccccCCCccccc---ccceeEeccccccccccccC
Confidence 457888999975 99885 457999999 9999999984
No 51
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=68.00 E-value=4.7 Score=30.85 Aligned_cols=31 Identities=29% Similarity=0.749 Sum_probs=23.4
Q ss_pred cCCCcc-CCCC-CCeeeccCCCCCcceeecCCCCccC
Q 027951 40 ENVCDK-VTCG-KGKCKASQNSTFFYECECDLGWKQN 74 (216)
Q Consensus 40 ~~~C~~-~~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~ 74 (216)
.+.|+. ..|| .|.|... ....|+|.+||+-.
T Consensus 77 ~d~Cd~y~~CG~~g~C~~~----~~~~C~Cl~GF~P~ 109 (110)
T PF00954_consen 77 KDQCDVYGFCGPNGICNSN----NSPKCSCLPGFEPK 109 (110)
T ss_pred ccCCCCccccCCccEeCCC----CCCceECCCCcCCC
Confidence 568986 5799 6999653 24569999999753
No 52
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=66.27 E-value=3.5 Score=27.11 Aligned_cols=30 Identities=30% Similarity=0.606 Sum_probs=19.4
Q ss_pred CccCCCC-CCeeeccCCCCCcceeecCCCCccC
Q 027951 43 CDKVTCG-KGKCKASQNSTFFYECECDLGWKQN 74 (216)
Q Consensus 43 C~~~~Cg-~GtC~~~~~~~~~Y~C~C~pGwtg~ 74 (216)
|.+..|- |-.|+... .+.++|+|..||+..
T Consensus 2 C~~~~cP~NA~C~~~~--dG~eecrCllgyk~~ 32 (37)
T PF12946_consen 2 CIDTKCPANAGCFRYD--DGSEECRCLLGYKKV 32 (37)
T ss_dssp -SSS---TTEEEEEET--TSEEEEEE-TTEEEE
T ss_pred ccCccCCCCcccEEcC--CCCEEEEeeCCcccc
Confidence 5555665 55888774 468999999999974
No 53
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=65.30 E-value=4.1 Score=43.95 Aligned_cols=35 Identities=29% Similarity=0.787 Sum_probs=29.7
Q ss_pred cCCCccCCCCC-CeeeccCCCCCcceeecC-CCCccCccc
Q 027951 40 ENVCDKVTCGK-GKCKASQNSTFFYECECD-LGWKQNTMA 77 (216)
Q Consensus 40 ~~~C~~~~Cg~-GtC~~~~~~~~~Y~C~C~-pGwtg~~c~ 77 (216)
..+|+.+||+| |+|.+. -+.|.|+|. -||.|..|+
T Consensus 623 ~~~C~~nPC~N~g~C~eg---wNrfiCDCs~T~~~G~~Ce 659 (1591)
T KOG3514|consen 623 EKICESNPCQNGGKCSEG---WNRFICDCSGTGFEGRTCE 659 (1591)
T ss_pred ccccCCCcccCCCCcccc---ccccccccccCcccCcccc
Confidence 45999999997 599998 568999996 589999886
No 54
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains, each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines. The tertiary structure of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. The consensus pattern of the LE domain is xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx, with four disulphide bonds (C1-C3, C2-C4, C5-C6, C7-C8); 'C': conserved cysteine involved in a disulphide bond; 'a': conserved aromatic residue; 'G': conserved glycine (lower case = less conserved); the N-terminal part of the pattern is the region similar to the EGF-like domain. In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen. The binding-sites are located on the surface within the loops C1-C3 and C5-C6. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility, which determine the spacing in the formation of laminin networks of basement membranes.; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=62.64 E-value=5.1 Score=26.51 Aligned_cols=22 Identities=23% Similarity=0.430 Sum_probs=17.2
Q ss_pred CeeeccCCCCCcceeecCCCCccCccc
Q 027951 51 GKCKASQNSTFFYECECDLGWKQNTMA 77 (216)
Q Consensus 51 GtC~~~~~~~~~Y~C~C~pGwtg~~c~ 77 (216)
.+|.+. ..+|.|.++|+|.+|+
T Consensus 11 ~~C~~~-----~G~C~C~~~~~G~~C~ 32 (49)
T PF00053_consen 11 QTCDPS-----TGQCVCKPGTTGPRCD 32 (49)
T ss_dssp SSEEET-----CEEESBSTTEESTTS-
T ss_pred CcccCC-----CCEEeccccccCCcCc
Confidence 366664 5799999999999976
No 55
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=62.10 E-value=27 Score=30.45 Aligned_cols=36 Identities=19% Similarity=0.437 Sum_probs=22.9
Q ss_pred eeeecCCCCcCCCCCCccc-cccc--CCcccccccCCCc
Q 027951 138 YSCQCAVDHYNLLNTSTFP-CYKE--CSIGMDCKNMGIS 173 (216)
Q Consensus 138 Y~C~C~~Gy~nlln~t~~p-C~~~--CslG~dC~~lgi~ 173 (216)
-.|+|.+||.|........ |... |..|+.|....-.
T Consensus 162 ~~c~c~~g~~g~~~~~~~~~c~~~~~~~~g~~C~~~~~~ 200 (316)
T KOG1218|consen 162 GICTCQPGFVGVFCVESCSGCSPLTACENGAKCNRSTGS 200 (316)
T ss_pred CceeccCCcccccccccCCCcCCCcccCCCCeeeccccc
Confidence 4688999999888766544 6643 4455556555333
No 56
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=60.29 E-value=8.3 Score=30.60 Aligned_cols=31 Identities=23% Similarity=0.673 Sum_probs=20.7
Q ss_pred CCCcc--CCCC-CCeeeccCCC--CCcceeecCCCC
Q 027951 41 NVCDK--VTCG-KGKCKASQNS--TFFYECECDLGW 71 (216)
Q Consensus 41 ~~C~~--~~Cg-~GtC~~~~~~--~~~Y~C~C~pGw 71 (216)
+.|++ +.|. ||.|+..... ..=|.|+|.+.+
T Consensus 6 ~aC~~~Tn~CsgHG~C~~~~~~~~~~C~~C~C~~T~ 41 (103)
T PF12955_consen 6 DACENATNNCSGHGSCVKKYGSGGGDCFACKCKPTV 41 (103)
T ss_pred HHHHHhccCCCCCceEeeccCCCccceEEEEeeccc
Confidence 34543 4565 9999987432 246999999943
No 57
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=59.42 E-value=2.5 Score=30.38 Aligned_cols=11 Identities=27% Similarity=0.664 Sum_probs=8.1
Q ss_pred ecCCCCccCcc
Q 027951 66 ECDLGWKQNTM 76 (216)
Q Consensus 66 ~C~pGwtg~~c 76 (216)
.|.+||+|+.|
T Consensus 53 ~C~~Gw~G~~C 63 (63)
T PF01414_consen 53 VCLPGWTGPNC 63 (63)
T ss_dssp EE-TTEESTTS
T ss_pred CCCCCCcCCCC
Confidence 58999999864
No 58
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=58.15 E-value=13 Score=29.44 Aligned_cols=49 Identities=20% Similarity=0.360 Sum_probs=30.2
Q ss_pred CCcC-CCeeecCCC----CceeeecCCCCcCCCCCCcccccccCCcccccccCCCcC
Q 027951 123 IDCG-GGSCKNTSM----FSYSCQCAVDHYNLLNTSTFPCYKECSIGMDCKNMGISV 174 (216)
Q Consensus 123 ~~Cg-gGtCv~~~~----~sY~C~C~~Gy~nlln~t~~pC~~~CslG~dC~~lgi~~ 174 (216)
+.|. ||+|++... .=|.|+|.+.+......+. -..===|.+|++..|++
T Consensus 13 n~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~---ktt~W~G~aCqKkDvS~ 66 (103)
T PF12955_consen 13 NNCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKG---KTTHWGGPACQKKDVSV 66 (103)
T ss_pred cCCCCCceEeeccCCCccceEEEEeeccccccccccC---ceeeecccccccccccc
Confidence 4564 999999732 3499999998765432111 00001477888887763
No 59
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress.
Probab=57.63 E-value=7.3 Score=30.22 Aligned_cols=14 Identities=14% Similarity=0.273 Sum_probs=5.8
Q ss_pred HHHHHHhhcccCcc
Q 027951 13 AIFFVLQPLTAPSN 26 (216)
Q Consensus 13 ~~~~~~~~~~a~~~ 26 (216)
|+|||+.+-+|+.|
T Consensus 14 A~lLlisSevaa~~ 27 (95)
T PF07172_consen 14 AALLLISSEVAARE 27 (95)
T ss_pred HHHHHHHhhhhhHH
Confidence 34444444444433
No 60
>KOG3509 consensus Basement membrane-specific heparan sulfate proteoglycan (HSPG) core protein [Posttranslational modification, protein turnover, chaperones]
Probab=53.99 E-value=23 Score=37.76 Aligned_cols=36 Identities=19% Similarity=0.550 Sum_probs=28.1
Q ss_pred ccCCCccCCCCCC-eeeccCCCCCcceeecCCCCccCccc
Q 027951 39 FENVCDKVTCGKG-KCKASQNSTFFYECECDLGWKQNTMA 77 (216)
Q Consensus 39 ~~~~C~~~~Cg~G-tC~~~~~~~~~Y~C~C~pGwtg~~c~ 77 (216)
..+.|..++|++. -|-+. .....|.|.+||+|..|+
T Consensus 405 ~g~~c~~~p~~~~g~c~p~---~~~~~c~c~~g~~G~~c~ 441 (964)
T KOG3509|consen 405 LGDVCWRIPCQHDGPCLQT---LEGKQCLCPPGYTGDSCE 441 (964)
T ss_pred CCCccccccCCCCcccccc---ccccceeccccccCchhh
Confidence 3457888889985 44444 678999999999999865
No 61
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=52.28 E-value=9.5 Score=25.27 Aligned_cols=16 Identities=19% Similarity=0.337 Sum_probs=13.7
Q ss_pred cceeecCCCCccCccc
Q 027951 62 FYECECDLGWKQNTMA 77 (216)
Q Consensus 62 ~Y~C~C~pGwtg~~c~ 77 (216)
.-+|+|.+||+|.+|+
T Consensus 17 ~G~C~C~~~~~G~~C~ 32 (46)
T smart00180 17 TGQCECKPNVTGRRCD 32 (46)
T ss_pred CCEEECCCCCCCCCCC
Confidence 4589999999999865
No 62
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=51.52 E-value=9.5 Score=25.49 Aligned_cols=16 Identities=19% Similarity=0.318 Sum_probs=13.8
Q ss_pred cceeecCCCCccCccc
Q 027951 62 FYECECDLGWKQNTMA 77 (216)
Q Consensus 62 ~Y~C~C~pGwtg~~c~ 77 (216)
.-+|.|.+||+|.+|+
T Consensus 18 ~G~C~C~~~~~G~~C~ 33 (50)
T cd00055 18 TGQCECKPNTTGRRCD 33 (50)
T ss_pred CCEEeCCCcCCCCCCC
Confidence 4689999999999965
No 63
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=49.46 E-value=35 Score=38.59 Aligned_cols=31 Identities=29% Similarity=0.733 Sum_probs=25.8
Q ss_pred CCCCCcC-CCeeecCC-CCceeee-cCCCCcCCC
Q 027951 120 CRWIDCG-GGSCKNTS-MFSYSCQ-CAVDHYNLL 150 (216)
Q Consensus 120 C~~~~Cg-gGtCv~~~-~~sY~C~-C~~Gy~nll 150 (216)
|..-+|- +|.|.... ..++.|+ |++||.|+-
T Consensus 777 C~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~r 810 (1705)
T KOG1836|consen 777 CQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLR 810 (1705)
T ss_pred CccCCCCCChhhcCcCcccceecCCCCCCCcccc
Confidence 8888887 77888765 5689999 999999875
No 64
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=48.58 E-value=29 Score=35.79 Aligned_cols=22 Identities=27% Similarity=0.782 Sum_probs=16.8
Q ss_pred cC-CCeeecCCCCceeeecCCCCcCCC
Q 027951 125 CG-GGSCKNTSMFSYSCQCAVDHYNLL 150 (216)
Q Consensus 125 Cg-gGtCv~~~~~sY~C~C~~Gy~nll 150 (216)
|. +|.|.+ .++|.|.+||....
T Consensus 632 C~g~GVCnn----~~~ChC~~gwapp~ 654 (716)
T KOG3607|consen 632 CNGHGVCNN----ELNCHCEPGWAPPF 654 (716)
T ss_pred cCCCcccCC----CcceeeCCCCCCCc
Confidence 54 666665 47999999999875
No 65
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=48.05 E-value=11 Score=38.82 Aligned_cols=17 Identities=24% Similarity=0.620 Sum_probs=13.7
Q ss_pred CcceeecCCCCccCccc
Q 027951 61 FFYECECDLGWKQNTMA 77 (216)
Q Consensus 61 ~~Y~C~C~pGwtg~~c~ 77 (216)
..+.|+|++||.++.|+
T Consensus 640 n~~~ChC~~gwapp~C~ 656 (716)
T KOG3607|consen 640 NELNCHCEPGWAPPFCF 656 (716)
T ss_pred CCcceeeCCCCCCCccc
Confidence 36789999999998764
No 66
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system.; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=39.08 E-value=12 Score=26.84 Aligned_cols=29 Identities=17% Similarity=0.272 Sum_probs=15.9
Q ss_pred CCCeee-ccCCCCCcceeecCCCCccCccc
Q 027951 49 GKGKCK-ASQNSTFFYECECDLGWKQNTMA 77 (216)
Q Consensus 49 g~GtC~-~~~~~~~~Y~C~C~pGwtg~~c~ 77 (216)
|||+.. +.-...+.-.|||+.-|+|+.|.
T Consensus 21 GHGr~flDg~~~dG~p~CECn~Cy~GpdCS 50 (56)
T PF04863_consen 21 GHGRAFLDGLIADGSPVCECNSCYGGPDCS 50 (56)
T ss_dssp TSEE--TTS-EETTEE--EE-TTEESTTS-
T ss_pred CCCeeeeccccccCCccccccCCcCCCCcc
Confidence 488876 32222446899999999999875
No 67
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=38.94 E-value=22 Score=23.00 Aligned_cols=14 Identities=21% Similarity=0.380 Sum_probs=11.1
Q ss_pred CcceeecCCCCccC
Q 027951 61 FFYECECDLGWKQN 74 (216)
Q Consensus 61 ~~Y~C~C~pGwtg~ 74 (216)
..+.|.|.+||--.
T Consensus 16 ~~~~C~CPeGyIld 29 (34)
T PF09064_consen 16 SPGQCFCPEGYILD 29 (34)
T ss_pred CCCceeCCCceEec
Confidence 35799999999654
No 68
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=36.40 E-value=40 Score=22.33 Aligned_cols=27 Identities=30% Similarity=0.696 Sum_probs=18.9
Q ss_pred CCCCC-CCcC-CCeeecCCCCceeeecCCCCcCC
Q 027951 118 DLCRW-IDCG-GGSCKNTSMFSYSCQCAVDHYNL 149 (216)
Q Consensus 118 DPC~~-~~Cg-gGtCv~~~~~sY~C~C~~Gy~nl 149 (216)
++|.. ..|. +..|+++ +|+|++||.-.
T Consensus 20 ~~C~~~~qC~~~s~C~~g-----~C~C~~g~~~~ 48 (52)
T PF01683_consen 20 ESCESDEQCIGGSVCVNG-----RCQCPPGYVEV 48 (52)
T ss_pred CCCCCcCCCCCcCEEcCC-----EeECCCCCEec
Confidence 45664 3576 5689763 79999998644
No 69
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=34.80 E-value=79 Score=33.71 Aligned_cols=25 Identities=24% Similarity=0.537 Sum_probs=16.9
Q ss_pred CCceeeecCCCCcCCCC-CCcccccc
Q 027951 135 MFSYSCQCAVDHYNLLN-TSTFPCYK 159 (216)
Q Consensus 135 ~~sY~C~C~~Gy~nlln-~t~~pC~~ 159 (216)
.++-.|+|..||.=.-+ -...||.+
T Consensus 305 ega~~C~C~~gyyRA~~Dp~~mpCT~ 330 (996)
T KOG0196|consen 305 EGATSCTCENGYYRADSDPPSMPCTR 330 (996)
T ss_pred CCCCcccccCCcccCCCCCCCCCCCC
Confidence 35678999999976654 34456543
No 70
>PLN03148 Blue copper-like protein; Provisional
Probab=32.13 E-value=41 Score=28.76 Aligned_cols=19 Identities=42% Similarity=0.773 Sum_probs=9.6
Q ss_pred cCCcccccccCCCcCCCCCCCC
Q 027951 160 ECSIGMDCKNMGISVPLPPPPP 181 (216)
Q Consensus 160 ~CslG~dC~~lgi~~~~~s~~~ 181 (216)
.|.-| .+|-|.+.+.++||
T Consensus 106 hC~~G---mKl~I~V~~~~~pp 124 (167)
T PLN03148 106 QCFNG---MKVTILVHPLPPPP 124 (167)
T ss_pred ccccC---CEEEEEEcCCCCCC
Confidence 45544 24556665544433
No 71
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=30.96 E-value=49 Score=25.31 Aligned_cols=26 Identities=27% Similarity=0.689 Sum_probs=20.1
Q ss_pred CCccCCCCCC-eeeccCCCCCcceeecCC
Q 027951 42 VCDKVTCGKG-KCKASQNSTFFYECECDL 69 (216)
Q Consensus 42 ~C~~~~Cg~G-tC~~~~~~~~~Y~C~C~p 69 (216)
+|+++.|+.| +|+... .+.-+|.|.+
T Consensus 1 pC~~v~C~~G~~C~~d~--~~~p~CvC~~ 27 (86)
T cd01328 1 PCENHHCGAGKVCEVDD--ENTPKCVCID 27 (86)
T ss_pred CCCCcCCCCCCEeeECC--CCCeEEecCC
Confidence 5899999999 998753 3467888864
No 72
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=29.77 E-value=70 Score=32.09 Aligned_cols=91 Identities=18% Similarity=0.261 Sum_probs=53.4
Q ss_pred cCCCccCCCC-----CCeeeccCCCCCcceeecCCCCccCcccccCCCccCCccCCCCCCCCCCCCCCCCcccccccCCC
Q 027951 40 ENVCDKVTCG-----KGKCKASQNSTFFYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQDCAPSPSPAQEKAAKKNE 114 (216)
Q Consensus 40 ~~~C~~~~Cg-----~GtC~~~~~~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~sC~~~~~~~~~~~~~~n~ 114 (216)
.++|+.-.|. +-+|..+ .-.|+|.+|-+|..|+. +++++++ ..
T Consensus 391 rkaCk~CdChpVGs~gktCNq~-----tGqCpCkeGvtG~tCnr---------------------Ca~gyqq------sr 438 (592)
T KOG3512|consen 391 RKACKACDCHPVGSAGKTCNQT-----TGQCPCKEGVTGLTCNR---------------------CAPGYQQ------SR 438 (592)
T ss_pred hhhhhhcCCccccccccccccc-----CCcccCCCCCccccccc---------------------ccchhhc------cc
Confidence 4466655554 2377643 34699999999998641 2333322 12
Q ss_pred CcCCCCCC------CCcC-CCeeecCCCCceeeecCCCCcCCCCCCcccccccCCccccc
Q 027951 115 SIFDLCRW------IDCG-GGSCKNTSMFSYSCQCAVDHYNLLNTSTFPCYKECSIGMDC 167 (216)
Q Consensus 115 s~~DPC~~------~~Cg-gGtCv~~~~~sY~C~C~~Gy~nlln~t~~pC~~~CslG~dC 167 (216)
+.+-||.. ..++ +++ ...+.+.|+.++.++-=...-.|-++=+++.|+
T Consensus 439 s~vapcik~p~~~~~~~~s~ve-----~qd~~s~Ck~~~~~~r~n~kkfc~~Dyav~~~v 493 (592)
T KOG3512|consen 439 SPVAPCIKIPTDAPTLGSSGVE-----PQDQCSKCKASPGGKRLNQKKFCKKDYAVQLDV 493 (592)
T ss_pred CCCcCceecCCCCccccCCCCc-----chhccccCCCCCcceeccccccCccccceeeEe
Confidence 22334432 1233 333 235888999999876433344588888888888
No 73
>KOG4004 consensus Matricellular protein Osteonectin/SPARC/BM-40 [Extracellular structures]
Probab=29.77 E-value=39 Score=30.41 Aligned_cols=53 Identities=26% Similarity=0.461 Sum_probs=0.0
Q ss_pred CCCccCCCCCC-eeeccCCCCCcceeecCCCCccCcccccCCCccCCccCCCCCCCCCC
Q 027951 41 NVCDKVTCGKG-KCKASQNSTFFYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQDC 98 (216)
Q Consensus 41 ~~C~~~~Cg~G-tC~~~~~~~~~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~sC 98 (216)
++|+++.|++| .|.... ....+|.|.. |+=+.-+...--+. |.+.|=|+++.|
T Consensus 51 npC~dh~Cg~gk~C~vd~--~~~P~Cvc~~-~kCP~~~~~p~~KV--C~nnNqTf~S~C 104 (259)
T KOG4004|consen 51 NPCADHKCGPGKNCLVDL--QTQPRCVCCR-YKCPRKQQRPVHKV--CGNNNQTFNSWC 104 (259)
T ss_pred CccccccCCCCceeeecC--CCCceeEEec-CCCCcccCCchhhh--hcCCCcchhHHH
No 74
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=28.10 E-value=65 Score=24.62 Aligned_cols=30 Identities=23% Similarity=0.499 Sum_probs=21.5
Q ss_pred CCCCCCcC-CCeeecCCCCceeeecCCCCcC
Q 027951 119 LCRWIDCG-GGSCKNTSMFSYSCQCAVDHYN 148 (216)
Q Consensus 119 PC~~~~Cg-gGtCv~~~~~sY~C~C~~Gy~n 148 (216)
||....|+ |-+|+-...+.-+|.|.+-=..
T Consensus 1 pC~~v~C~~G~~C~~d~~~~p~CvC~~~Cp~ 31 (86)
T cd01328 1 PCENHHCGAGKVCEVDDENTPKCVCIDPCPE 31 (86)
T ss_pred CCCCcCCCCCCEeeECCCCCeEEecCCcCCC
Confidence 68888999 4489865456788998764333
No 75
>PF06607 Prokineticin: Prokineticin; InterPro: IPR023569 The prokineticin family includes prokineticin itself and related proteins such as BM8 and the AVIToxins. The suprachiasmatic nucleus (SCN) controls the circadian rhythm of physiological and behavioural processes in mammals. It has been shown that prokineticin 2 (PK2), a cysteine-rich secreted protein, functions as an output molecule from the SCN circadian clock. PK2 messenger RNA is rhythmically expressed in the SCN, and the phase of PK2 rhythm is responsive to light entrainment. Molecular and genetic studies have revealed that PK2 is a gene that is controlled by a circadian clock. The prokineticin domain is found in the prokineticin family and the hainantoxins, where it comprises the whole length of the protein. This domain is also found at the C terminus of some members of the Dickkopf family.; PDB: 1IMT_A 2KRA_A.
Probab=27.64 E-value=34 Score=26.81 Aligned_cols=39 Identities=26% Similarity=0.561 Sum_probs=17.2
Q ss_pred cccCCCcc-CCCCCCeeeccCCCC-CcceeecCCCCccCccc
Q 027951 38 AFENVCDK-VTCGKGKCKASQNST-FFYECECDLGWKQNTMA 77 (216)
Q Consensus 38 ~~~~~C~~-~~Cg~GtC~~~~~~~-~~Y~C~C~pGwtg~~c~ 77 (216)
+....|++ .+|+.|.|=...+.. .-..|+ .-|-.|..|.
T Consensus 21 vitg~C~~d~dCg~G~CCA~~~~~~~~~vCk-PlG~~Ge~Ch 61 (97)
T PF06607_consen 21 VITGACESDADCGPGTCCAVSNWRRSLRVCK-PLGQEGEPCH 61 (97)
T ss_dssp --SSC-SSGGGT-TTEEECE-SS-TT-ECCE-E-B-TT-EE-
T ss_pred EEeccccCcCCCCCCceeCcccccCCCccee-CCCcCcCccc
Confidence 34779986 589999987654211 112354 3455565543
No 76
>PF07359 LEAP-2: Liver-expressed antimicrobial peptide 2 precursor (LEAP-2); InterPro: IPR009955 This family consists of several mammalian liver-expressed antimicrobial peptide 2 (LEAP-2) sequences. LEAP-2 is a cysteine-rich, cationic protein. LEAP-2 contains a core structure with two disulphide bonds formed by cysteine residues in relative 1-3 and 2-4 positions. LEAP-2 is synthesised as a 77-residue precursor, which is predominantly expressed in the liver and highly conserved among mammals. The largest native LEAP-2 form of 40 amino acid residues is generated from the precursor at a putative cleavage site for a furin-like endoprotease. In contrast to smaller LEAP-2 variants, this peptide exhibits dose-dependent antimicrobial activity against selected microbial model organisms. The exact function of this family is unclear.; GO: 0042742 defense response to bacterium; PDB: 2L1Q_A.
Probab=25.68 E-value=39 Score=25.63 Aligned_cols=18 Identities=28% Similarity=0.366 Sum_probs=0.0
Q ss_pred cchhhHHHHHHHHHHHhh
Q 027951 3 MASVSVIAFLAIFFVLQP 20 (216)
Q Consensus 3 m~~~~~~~~~~~~~~~~~ 20 (216)
|-..+++|++.++||++.
T Consensus 1 m~~lkl~A~lli~lLL~~ 18 (77)
T PF07359_consen 1 MWHLKLFAVLLICLLLLQ 18 (77)
T ss_dssp ------------------
T ss_pred ChHHHHHHHHHHHHHHHH
Confidence 344567775555444443
No 77
>KOG3509 consensus Basement membrane-specific heparan sulfate proteoglycan (HSPG) core protein [Posttranslational modification, protein turnover, chaperones]
Probab=23.88 E-value=1.3e+02 Score=32.30 Aligned_cols=69 Identities=16% Similarity=0.241 Sum_probs=39.1
Q ss_pred cceeecCCCCccCcccccCCCccCCccCCCCCCCCCCCCCCCCcccccccCCCCcCCCCCCC--CcCC-----CeeecCC
Q 027951 62 FYECECDLGWKQNTMAVDQNLKFLPCIAPDCTLNQDCAPSPSPAQEKAAKKNESIFDLCRWI--DCGG-----GSCKNTS 134 (216)
Q Consensus 62 ~Y~C~C~pGwtg~~c~~~d~~~~lPC~ipnCt~n~sC~~~~~~~~~~~~~~n~s~~DPC~~~--~Cgg-----GtCv~~~ 134 (216)
.-.|.|.+|++|.+|++-.+...+++. ..|.... ...-+|.++ .|.. -+|.++.
T Consensus 717 ~~~C~c~~g~~G~~ce~c~e~~~ls~t-~~~~~~~------------------~~~c~~~~h~~~c~~~~~~nt~~q~~~ 777 (964)
T KOG3509|consen 717 VEQCQCPKGLVGTSCEDCAEGYTLSTT-GGLYPGL------------------CEDCECNSHISQCEDDLGYNTDCQNNT 777 (964)
T ss_pred ccccccCccccCccccccccccccccc-CCcCccc------------------CcccccCCCcccccccccccccccccC
Confidence 458999999999998753344444432 1111110 001133332 2442 2566654
Q ss_pred CCceeee-cCCCCcCCC
Q 027951 135 MFSYSCQ-CAVDHYNLL 150 (216)
Q Consensus 135 ~~sY~C~-C~~Gy~nll 150 (216)
.+|.|+ |.+||.++-
T Consensus 778 -~~~~~~~~~~g~~~da 793 (964)
T KOG3509|consen 778 -EGDRCELCSPGTYGDA 793 (964)
T ss_pred -ccceeeecCCCccccC
Confidence 578886 999998875
No 78
>KOG3658 consensus Tumor necrosis factor-alpha-converting enzyme (TACE/ADAM17) and related metalloproteases [Extracellular structures]
Probab=23.53 E-value=1.1e+02 Score=31.76 Aligned_cols=28 Identities=25% Similarity=0.450 Sum_probs=15.2
Q ss_pred CCCCC-CCcCCCeeecCC---CCceeeecCCC
Q 027951 118 DLCRW-IDCGGGSCKNTS---MFSYSCQCAVD 145 (216)
Q Consensus 118 DPC~~-~~CggGtCv~~~---~~sY~C~C~~G 145 (216)
-+|.. ..|..|+|+..- -+-=+|.|.++
T Consensus 564 t~C~~~~~C~~G~C~gs~c~~~glesC~c~~~ 595 (764)
T KOG3658|consen 564 TVCNETGVCINGKCIGSCCLMQGLESCFCTET 595 (764)
T ss_pred CcccccceEeCCcCccHHHHhhCcceeeeccC
Confidence 35653 456666665421 12346878766
No 79
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=23.47 E-value=1.4e+02 Score=30.13 Aligned_cols=24 Identities=13% Similarity=0.462 Sum_probs=18.9
Q ss_pred eeeccCCCCCcceeecCCCCccCccc
Q 027951 52 KCKASQNSTFFYECECDLGWKQNTMA 77 (216)
Q Consensus 52 tC~~~~~~~~~Y~C~C~pGwtg~~c~ 77 (216)
.|+-.. ...+.|+|+.+-+|+.|+
T Consensus 286 ~Cv~d~--~~~ltCdC~HNTaGPdCg 309 (592)
T KOG3512|consen 286 RCVMDE--SSHLTCDCEHNTAGPDCG 309 (592)
T ss_pred eeeecc--CCceEEecccCCCCCCcc
Confidence 787652 335999999999999875
No 80
>KOG3653 consensus Transforming growth factor beta/activin receptor subfamily of serine/threonine kinases [Signal transduction mechanisms]
Probab=23.08 E-value=2.8e+02 Score=27.93 Aligned_cols=13 Identities=38% Similarity=0.802 Sum_probs=10.0
Q ss_pred ceeeecCCCCcCC
Q 027951 137 SYSCQCAVDHYNL 149 (216)
Q Consensus 137 sY~C~C~~Gy~nl 149 (216)
-|.|-|..++=|.
T Consensus 115 ~~~CcCs~~~CN~ 127 (534)
T KOG3653|consen 115 LYFCCCSTDFCNA 127 (534)
T ss_pred EEEEecCCCcccC
Confidence 4889998877665
No 81
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=20.81 E-value=1.9e+02 Score=32.38 Aligned_cols=19 Identities=21% Similarity=0.508 Sum_probs=16.3
Q ss_pred CCcceeecCCCCccCcccc
Q 027951 60 TFFYECECDLGWKQNTMAV 78 (216)
Q Consensus 60 ~~~Y~C~C~pGwtg~~c~~ 78 (216)
...-.|.|++||+|.+|+.
T Consensus 931 t~~ivC~C~~GY~G~RCe~ 949 (1758)
T KOG0994|consen 931 TQQIVCHCQEGYSGSRCEI 949 (1758)
T ss_pred ccceeeecccCccccchhh
Confidence 4578999999999999873
No 82
>PF01826 TIL: Trypsin Inhibitor like cysteine rich domain; InterPro: IPR002919 This domain is found in proteinase inhibitors as well as in many extracellular proteins. The domain typically contains ten cysteine residues that form five disulphide bonds. The cysteine residues that form the disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9. This inhibitor domain belongs to MEROPS inhibitor family I8 (clan IA). Proteins containing this domain inhibit peptidases belonging to families S1 (IPR001254 from INTERPRO), S8 (IPR000209 from INTERPRO), and M4 (IPR001570 from INTERPRO) and are restricted to the chordata, nematoda, arthropoda and echinodermata. Examples of proteins containing this domain are: chymotrypsin/elastase inhibitor from Ascaris suum (pig roundworm); Acp62F protein from Drosophila melanogaster; Bombina trypsin inhibitor from Bombina maxima (large-webbed bell toad); Bombyx subtilisin inhibitor from Bombyx mori (silk moth); von Willebrand factor.; PDB: 2P3F_N 1HX2_A 1CCV_A 1EAI_D 2H9E_C 1COU_A 1ATE_A 1ATB_A 1ATD_A 1ATA_A ....
Probab=20.60 E-value=41 Score=22.49 Aligned_cols=17 Identities=24% Similarity=0.430 Sum_probs=12.5
Q ss_pred eecCCCCcCCCCCCcccccc
Q 027951 140 CQCAVDHYNLLNTSTFPCYK 159 (216)
Q Consensus 140 C~C~~Gy~nlln~t~~pC~~ 159 (216)
|.|++||. +|.+ ..|+.
T Consensus 35 C~C~~G~v--~~~~-~~CV~ 51 (55)
T PF01826_consen 35 CFCPPGYV--RNDN-GRCVP 51 (55)
T ss_dssp EEETTTEE--EETT-SEEEE
T ss_pred CCCCCCee--EcCC-CCEEc
Confidence 89999998 4444 77773
Done!