Query psy5750
Match_columns 286
No_of_seqs 300 out of 2187
Neff 9.3
Searched_HMMs 46136
Date Fri Aug 16 17:05:35 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy5750.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/5750hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1214|consensus 100.0 2.1E-29 4.6E-34 231.2 15.5 244 1-268 541-876 (1289)
2 KOG1214|consensus 99.7 1.8E-16 4E-21 146.8 14.2 144 115-275 692-845 (1289)
3 KOG1219|consensus 99.5 2.3E-14 5E-19 143.4 8.3 112 68-201 3864-3976(4289)
4 KOG4289|consensus 99.5 4.8E-14 1E-18 136.7 8.6 108 67-195 1178-1308(2531)
5 KOG1219|consensus 99.5 8E-14 1.7E-18 139.6 8.8 114 111-251 3859-3976(4289)
6 smart00682 G2F G2 nidogen doma 99.3 2.8E-12 6.1E-17 104.7 7.6 63 1-63 86-148 (227)
7 PF07474 G2F: G2F domain; Int 99.3 6E-12 1.3E-16 100.9 6.5 63 1-63 86-148 (192)
8 cd00255 nidG2 Nidogen, G2 doma 99.3 1.2E-11 2.5E-16 101.3 7.4 63 1-63 84-147 (224)
9 KOG1217|consensus 99.1 8.5E-10 1.8E-14 102.4 13.8 183 69-273 170-372 (487)
10 KOG1217|consensus 99.1 1.3E-09 2.9E-14 101.1 12.4 163 77-258 137-316 (487)
11 KOG4289|consensus 99.0 6.5E-10 1.4E-14 108.8 6.7 98 134-252 1218-1316(2531)
12 KOG4260|consensus 98.9 9.5E-10 2.1E-14 90.8 3.7 146 74-245 150-304 (350)
13 KOG4260|consensus 98.9 1.5E-09 3.3E-14 89.6 4.8 137 92-253 131-276 (350)
14 PF07645 EGF_CA: Calcium-bindi 98.9 1.2E-09 2.6E-14 66.2 2.8 34 212-245 1-34 (42)
15 KOG1225|consensus 98.8 2.9E-08 6.4E-13 91.2 10.9 126 90-281 235-360 (525)
16 PF12947 EGF_3: EGF domain; I 98.8 1.9E-09 4.2E-14 62.5 1.7 36 216-251 1-36 (36)
17 KOG1225|consensus 98.7 7.2E-08 1.6E-12 88.7 9.6 110 76-248 256-365 (525)
18 PF06247 Plasmod_Pvs28: Plasmo 98.4 7.5E-08 1.6E-12 75.7 0.6 144 75-248 7-163 (197)
19 KOG1226|consensus 98.4 1.6E-05 3.4E-10 75.2 15.3 163 74-273 467-637 (783)
20 PF07645 EGF_CA: Calcium-bindi 98.4 4E-07 8.7E-12 55.1 3.2 34 114-147 1-34 (42)
21 smart00179 EGF_CA Calcium-bind 98.3 9.7E-07 2.1E-11 52.3 4.2 38 212-251 1-38 (39)
22 PF12947 EGF_3: EGF domain; I 98.1 2.1E-06 4.6E-11 49.7 1.9 32 122-153 5-36 (36)
23 PF00008 EGF: EGF-like domain 98.0 3E-06 6.6E-11 47.9 1.7 26 222-247 5-31 (32)
24 KOG1226|consensus 97.8 0.00014 2.9E-09 69.0 9.9 131 139-285 479-616 (783)
25 PF00008 EGF: EGF-like domain 97.8 1.2E-05 2.5E-10 45.5 1.7 30 71-100 1-31 (32)
26 cd00054 EGF_CA Calcium-binding 97.8 3.6E-05 7.9E-10 44.9 3.9 36 213-251 2-37 (38)
27 smart00179 EGF_CA Calcium-bind 97.7 4.4E-05 9.5E-10 45.0 3.8 35 67-101 1-37 (39)
28 PF12662 cEGF: Complement Clr- 97.6 4.6E-05 9.9E-10 39.6 2.3 24 187-215 1-24 (24)
29 cd00053 EGF Epidermal growth f 97.6 0.00014 2.9E-09 41.7 4.1 30 221-251 6-35 (36)
30 PF14670 FXa_inhibition: Coagu 97.6 4.6E-05 1E-09 44.0 1.9 29 221-251 6-36 (36)
31 PF12662 cEGF: Complement Clr- 97.5 7.1E-05 1.5E-09 38.9 2.2 21 235-255 1-23 (24)
32 smart00181 EGF Epidermal growt 97.5 0.0002 4.4E-09 41.1 3.9 29 221-251 6-34 (35)
33 cd00054 EGF_CA Calcium-binding 97.4 0.00023 5.1E-09 41.3 3.7 35 68-102 2-37 (38)
34 cd00053 EGF Epidermal growth f 96.9 0.0017 3.7E-08 36.9 3.6 28 73-100 5-32 (36)
35 PF06247 Plasmod_Pvs28: Plasmo 96.8 0.0006 1.3E-08 54.1 1.6 120 124-269 7-138 (197)
36 smart00181 EGF Epidermal growt 96.8 0.002 4.4E-08 36.8 3.4 28 71-99 2-30 (35)
37 cd01475 vWA_Matrilin VWA_Matri 96.6 0.0024 5.3E-08 53.5 3.9 39 206-246 180-218 (224)
38 PF07974 EGF_2: EGF-like domai 96.2 0.0077 1.7E-07 33.8 3.4 24 222-247 7-30 (32)
39 KOG0994|consensus 96.1 0.02 4.3E-07 57.0 7.3 137 131-283 878-1046(1758)
40 PF14670 FXa_inhibition: Coagu 95.9 0.0077 1.7E-07 34.8 2.3 24 123-148 6-29 (36)
41 PF12661 hEGF: Human growth fa 95.8 0.0045 9.7E-08 27.2 0.9 11 237-247 1-11 (13)
42 PF12946 EGF_MSP1_1: MSP1 EGF 95.7 0.0041 8.9E-08 35.7 0.7 30 222-251 6-36 (37)
43 PF12946 EGF_MSP1_1: MSP1 EGF 94.8 0.019 4.1E-07 33.0 1.5 31 123-153 5-36 (37)
44 PF07974 EGF_2: EGF-like domai 94.4 0.056 1.2E-06 30.3 2.9 27 74-102 6-32 (32)
45 KOG0994|consensus 93.5 0.61 1.3E-05 47.1 9.8 67 180-251 877-947 (1758)
46 cd01475 vWA_Matrilin VWA_Matri 93.5 0.081 1.7E-06 44.3 3.6 39 108-148 180-218 (224)
47 KOG1836|consensus 92.2 0.86 1.9E-05 48.7 9.5 33 167-202 777-812 (1705)
48 PF01683 EB: EB module; Inter 88.6 0.87 1.9E-05 28.4 3.8 26 222-251 27-52 (52)
49 smart00051 DSL delta serrate l 87.9 1.1 2.3E-05 29.4 4.0 46 187-247 16-61 (63)
50 KOG1836|consensus 86.9 1.2 2.7E-05 47.6 5.8 60 181-252 749-812 (1705)
51 PHA03099 epidermal growth fact 86.6 0.64 1.4E-05 34.7 2.6 32 221-255 51-84 (139)
52 PF12955 DUF3844: Domain of un 85.2 0.47 1E-05 34.3 1.2 33 213-245 5-42 (103)
53 PF00954 S_locus_glycop: S-loc 79.5 2.1 4.5E-05 31.4 2.9 34 212-247 76-109 (110)
54 PHA02887 EGF-like protein; Pro 79.0 2.4 5.1E-05 31.2 2.9 36 67-103 82-122 (126)
55 KOG3516|consensus 78.4 1.6 3.5E-05 44.4 2.6 40 64-103 541-581 (1306)
56 PF01683 EB: EB module; Inter 75.6 7.3 0.00016 24.1 4.3 23 172-199 26-48 (52)
57 PHA02887 EGF-like protein; Pro 75.6 3.2 6.9E-05 30.5 2.8 31 221-254 92-124 (126)
58 KOG1218|consensus 72.1 65 0.0014 27.9 11.0 11 237-247 163-173 (316)
59 PHA03099 epidermal growth fact 71.3 4.4 9.6E-05 30.3 2.8 36 67-103 41-81 (139)
60 PF00954 S_locus_glycop: S-loc 70.5 4.8 0.0001 29.4 2.9 31 67-98 76-107 (110)
61 KOG3514|consensus 69.4 25 0.00053 36.1 8.0 37 69-105 624-661 (1591)
62 PF09064 Tme5_EGF_like: Thromb 61.7 5.1 0.00011 22.6 1.1 12 236-247 18-29 (34)
63 PF00053 Laminin_EGF: Laminin 55.1 11 0.00023 23.0 2.0 20 228-251 12-31 (49)
64 KOG3512|consensus 55.1 53 0.0012 30.5 7.0 25 79-103 284-309 (592)
65 cd00055 EGF_Lam Laminin-type e 52.1 18 0.00039 22.2 2.7 13 237-251 20-32 (50)
66 KOG3516|consensus 51.9 12 0.00027 38.4 2.8 41 111-155 541-582 (1306)
67 KOG1215|consensus 43.4 52 0.0011 33.5 5.8 89 180-282 338-427 (877)
68 PF01414 DSL: Delta serrate li 43.1 10 0.00022 24.8 0.5 15 89-103 17-31 (63)
69 KOG1218|consensus 38.3 2.6E+02 0.0057 24.1 10.8 13 88-100 14-26 (316)
70 KOG3514|consensus 35.1 25 0.00055 36.0 2.0 35 117-155 625-660 (1591)
71 KOG3512|consensus 26.9 1E+02 0.0022 28.8 4.2 25 178-202 284-309 (592)
72 smart00180 EGF_Lam Laminin-typ 22.7 69 0.0015 19.2 1.8 13 237-251 19-31 (46)
73 KOG3509|consensus 21.9 1.7E+02 0.0037 30.1 5.2 36 68-103 406-441 (964)
No 1
>KOG1214|consensus
Probab=99.96 E-value=2.1e-29 Score=231.23 Aligned_cols=244 Identities=31% Similarity=0.698 Sum_probs=200.9
Q ss_pred CccceEEEEEEC-CCCeEEEeEeeeccCccCCeeeeEEEeeccCCcccCCcceeecceeeeeec----------------
Q psy5750 1 GVFNYSAELIFS-TGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEFTHS---------------- 63 (286)
Q Consensus 1 g~~~~~~~~~~~-~~~~~~~~~~~~g~~~~~~~~~~~~~~g~~p~~~~~~~~~~~~~~~~~~~~---------------- 63 (286)
+.|+|.+++.|. +...++|+|++.|+|.+.+|++++.++|.+|.++.+...++.+|++.|++.
T Consensus 541 ~~ftr~~evtf~g~~~~~vi~q~~~g~d~~~~l~ikt~~~G~vp~~p~~~~~hi~py~elyHys~s~vtstssr~y~~t~ 620 (1289)
T KOG1214|consen 541 AAFTRDMEVTFYGGEETVVITQTAEGLDPENYLSIKTNIQGQVPYVPANFTAHISPYKELYHYSDSTVTSTSSRDYSLTF 620 (1289)
T ss_pred cccccCceEEecCCcceeeeeeecCCCCCCceEEEecccccccceeccccccccCcchhhhhcccceeecccccceeeec
Confidence 468999999999 678899999999999999999999999999999999999999999988773
Q ss_pred -------------------------------------------------------------------CcCCCCCCC--CC
Q psy5750 64 -------------------------------------------------------------------STVNDDPCK--NF 74 (286)
Q Consensus 64 -------------------------------------------------------------------~~~~~~~C~--~~ 74 (286)
.....++|- ++
T Consensus 621 ga~~S~~~sy~~hq~ityq~C~h~~~~p~~p~tqql~vd~vfalyn~ee~~lr~a~Sn~igpV~E~S~~~~~npCy~gsh 700 (1289)
T KOG1214|consen 621 GAINSQTWSYRIHQNITYQVCRHAPRHPSFPTTQQLNVDRVFALYNDEERVLRFAVSNQIGPVKEDSDPTPVNPCYDGSH 700 (1289)
T ss_pred CcccccceeEEEeecceeEEeecCCCCCCCCCceEeecccceeccCccccchhhhhhhcccceecCCCCcccccceecCc
Confidence 011223332 25
Q ss_pred CCCCCCeeeeCC-CCCeEecCCCcccccccccCCCCCCCccCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCcc--cCCC
Q psy5750 75 FCVANSSCIVED-DKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFT--GNGH 151 (286)
Q Consensus 75 ~C~~~~~C~~~~-g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~~--g~~~ 151 (286)
.|..++.|.... -.|+|.|..||.+ +++.|.++++|+.+.+.|..+++|++.+++|+|.|..||. +++.
T Consensus 701 ~cdt~a~C~pg~~~~~tcecs~g~~g--------dgr~c~d~~eca~~~~~CGp~s~Cin~pg~~rceC~~gy~F~dd~~ 772 (1289)
T KOG1214|consen 701 MCDTTARCHPGTGVDYTCECSSGYQG--------DGRNCVDENECATGFHRCGPNSVCINLPGSYRCECRSGYEFADDRH 772 (1289)
T ss_pred ccCCCccccCCCCcceEEEEeeccCC--------CCCCCCChhhhccCCCCCCCCceeecCCCceeEEEeecceeccCCc
Confidence 577778888773 4789999999985 4889999999999889999999999999999999999876 6778
Q ss_pred ceecccCCCCCCCCCCCCCCCCCCCCCCcc--cc-CCCCceeeCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCe
Q psy5750 152 QCTEITVPQTGPTSPCESDPRACNPPHSTC--TN-LTDYRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNAD 228 (286)
Q Consensus 152 ~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C--~~-~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~ 228 (286)
+|..+..+ ..++.|..+.+.| ...+.| +. ..+.|.|.|.+||.|++. .|.|+|||..+ .|++++.
T Consensus 773 tCV~i~~p--ap~n~Ce~g~h~C-~i~g~a~c~~hGgs~y~C~CLPGfsGDG~-------~c~dvDeC~ps--rChp~A~ 840 (1289)
T KOG1214|consen 773 TCVLITPP--APANPCEDGSHTC-AIAGQARCVHHGGSTYSCACLPGFSGDGH-------QCTDVDECSPS--RCHPAAT 840 (1289)
T ss_pred ceEEecCC--CCCCccccCcccc-CcCCceEEEecCCceEEEeecCCccCCcc-------ccccccccCcc--ccCCCce
Confidence 89765432 3467888877888 666554 44 446899999999999966 89999999965 5999999
Q ss_pred eeeCCCCeeeecCCCCcCCCCCcCCCCeeecccccccccc
Q psy5750 229 CINRPGTYQCQCKRGFSGDGFNCEEGKYCLVVGITLCKMY 268 (286)
Q Consensus 229 C~n~~g~y~C~C~~Gy~gdg~~C~~~~~c~~~~~~~c~~~ 268 (286)
|+|++|+|.|+|.+||+|||+.|--. . ..++.|+..
T Consensus 841 CyntpgsfsC~C~pGy~GDGf~CVP~-~---~~~T~C~~e 876 (1289)
T KOG1214|consen 841 CYNTPGSFSCRCQPGYYGDGFQCVPD-T---SSLTPCEQE 876 (1289)
T ss_pred EecCCCcceeecccCccCCCceecCC-C---ccCCccccc
Confidence 99999999999999999999999322 1 124556654
No 2
>KOG1214|consensus
Probab=99.71 E-value=1.8e-16 Score=146.82 Aligned_cols=144 Identities=30% Similarity=0.785 Sum_probs=116.9
Q ss_pred CCCcCCCCCCCCCCCeeecCCC-CeeEeCCCCcccCCCceecccCCCCCCCCCCCCCCCCCCCCCCccccCCCCceeeCC
Q psy5750 115 INECNAGTDLCHKNAMCFNEIG-SYSCQCRPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCD 193 (286)
Q Consensus 115 ~~~C~~~~~~C~~~~~C~~~~g-~~~C~C~~G~~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~~C~C~ 193 (286)
.+.|..+.+.|..++.|....+ .|+|.|..||.|++..|.+ +++|+...+.| ..+..|++.+++|+|+|.
T Consensus 692 ~npCy~gsh~cdt~a~C~pg~~~~~tcecs~g~~gdgr~c~d--------~~eca~~~~~C-Gp~s~Cin~pg~~rceC~ 762 (1289)
T KOG1214|consen 692 VNPCYDGSHMCDTTARCHPGTGVDYTCECSSGYQGDGRNCVD--------ENECATGFHRC-GPNSVCINLPGSYRCECR 762 (1289)
T ss_pred cccceecCcccCCCccccCCCCcceEEEEeeccCCCCCCCCC--------hhhhccCCCCC-CCCceeecCCCceeEEEe
Confidence 4556666677888889987754 7999999999999999998 89999988999 999999999999999999
Q ss_pred CCCCCCCCCCcccCCccc------cCCCcCCCCCCCCCCCe--eeeC-CCCeeeecCCCCcCCCCCcCCCCeeecccccc
Q psy5750 194 PGYQKDYLDDRRVAFVCT------DVDECMNYPPICNNNAD--CINR-PGTYQCQCKRGFSGDGFNCEEGKYCLVVGITL 264 (286)
Q Consensus 194 ~G~~g~~~~~~~~~~~C~------d~deC~~~~~~C~~~~~--C~n~-~g~y~C~C~~Gy~gdg~~C~~~~~c~~~~~~~ 264 (286)
.||.-... +..|. .++.|....|.|+..+. |+.. .+.|+|+|.+||.|||..|.++++|. +..
T Consensus 763 ~gy~F~dd-----~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvDeC~---psr 834 (1289)
T KOG1214|consen 763 SGYEFADD-----RHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVDECS---PSR 834 (1289)
T ss_pred ecceeccC-----CcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccCCccccccccccC---ccc
Confidence 99864322 22453 35678888888987654 5444 46799999999999999999999996 566
Q ss_pred cccceeeeecc
Q psy5750 265 CKMYLEVVNIQ 275 (286)
Q Consensus 265 c~~~~~~~~~~ 275 (286)
|...+.|++..
T Consensus 835 Chp~A~Cyntp 845 (1289)
T KOG1214|consen 835 CHPAATCYNTP 845 (1289)
T ss_pred cCCCceEecCC
Confidence 88888887754
No 3
>KOG1219|consensus
Probab=99.52 E-value=2.3e-14 Score=143.36 Aligned_cols=112 Identities=26% Similarity=0.680 Sum_probs=103.0
Q ss_pred CCCCCCCCCCCCCeeeeC-CCCCeEecCCCcccccccccCCCCCCCccCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCc
Q psy5750 68 DDPCKNFFCVANSSCIVE-DDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGF 146 (286)
Q Consensus 68 ~~~C~~~~C~~~~~C~~~-~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~ 146 (286)
.++|..+||+++|+|... .++|.|.|++.|.|..|+ .++++|.. +||..+++|+...++|.|.|+.||
T Consensus 3864 ~d~C~~npCqhgG~C~~~~~ggy~CkCpsqysG~~CE---------i~~epC~s--nPC~~GgtCip~~n~f~CnC~~gy 3932 (4289)
T KOG1219|consen 3864 TDPCNDNPCQHGGTCISQPKGGYKCKCPSQYSGNHCE---------IDLEPCAS--NPCLTGGTCIPFYNGFLCNCPNGY 3932 (4289)
T ss_pred ccccccCcccCCCEecCCCCCceEEeCcccccCcccc---------cccccccC--CCCCCCCEEEecCCCeeEeCCCCc
Confidence 389999999999999988 678999999999999998 88999998 999999999999999999999999
Q ss_pred ccCCCceecccCCCCCCCCCCCCCCCCCCCCCCccccCCCCceeeCCCCCCCCCC
Q psy5750 147 TGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYL 201 (286)
Q Consensus 147 ~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~ 201 (286)
+| .+|+.. -+++|.. .+| .+++.|++..|+|.|.|.+||.|.++
T Consensus 3933 TG--~~Ce~~------Gi~eCs~--n~C-~~gg~C~n~~gsf~CncT~g~~gr~c 3976 (4289)
T KOG1219|consen 3933 TG--KRCEAR------GISECSK--NVC-GTGGQCINIPGSFHCNCTPGILGRTC 3976 (4289)
T ss_pred cC--ceeecc------ccccccc--ccc-cCCceeeccCCceEeccChhHhcccC
Confidence 99 889762 1789987 799 99999999999999999999998864
No 4
>KOG4289|consensus
Probab=99.50 E-value=4.8e-14 Score=136.73 Aligned_cols=108 Identities=32% Similarity=0.743 Sum_probs=90.8
Q ss_pred CCCCCCCCCCCCCCeeee----------------------CCCCCeEecCCCcccccccccCCCCCCCccCCCcCCCCCC
Q psy5750 67 NDDPCKNFFCVANSSCIV----------------------EDDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDL 124 (286)
Q Consensus 67 ~~~~C~~~~C~~~~~C~~----------------------~~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~ 124 (286)
+.+.|...||.+...|+. ..++++|.|++||+|.+|+ ..+|.|-+ .+
T Consensus 1178 dDniClrEPCenymkCvsvlrFdssapf~~s~s~lfRpi~pvnglrCrCPpGFTgd~Ce---------TeiDlCYs--~p 1246 (2531)
T KOG4289|consen 1178 DDNICLREPCENYMKCVSVLRFDSSAPFLASDSVLFRPIHPVNGLRCRCPPGFTGDYCE---------TEIDLCYS--GP 1246 (2531)
T ss_pred cCchhhcchhHHHHhhhhheeecccCccccccceeeeeccccCceeEeCCCCCCccccc---------chhHhhhc--CC
Confidence 456688889988888863 3456899999999999999 88999998 89
Q ss_pred CCCCCeeecCCCCeeEeCCCCcccCCCceecccCCCCCCCCCCCCCCCCCCCCCCcccc-CCCCceeeCCCC
Q psy5750 125 CHKNAMCFNEIGSYSCQCRPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTN-LTDYRTCNCDPG 195 (286)
Q Consensus 125 C~~~~~C~~~~g~~~C~C~~G~~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~-~~g~~~C~C~~G 195 (286)
|.+++.|....|+|+|.|.+||+| ..|+.. ...-.|.. ..| +++++|++ ..+++.|.|+.|
T Consensus 1247 C~nng~C~srEggYtCeCrpg~tG--ehCEvs-----~~agrCvp--GvC-~nggtC~~~~nggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1247 CGNNGRCRSREGGYTCECRPGFTG--EHCEVS-----ARAGRCVP--GVC-KNGGTCVNLLNGGFCCHCPYG 1308 (2531)
T ss_pred CCCCCceEEecCceeEEecCCccc--cceeee-----cccCcccc--cee-cCCCEEeecCCCceeccCCCc
Confidence 999999999999999999999999 888741 12345666 789 99999998 458899999987
No 5
>KOG1219|consensus
Probab=99.48 E-value=8e-14 Score=139.60 Aligned_cols=114 Identities=32% Similarity=0.942 Sum_probs=101.0
Q ss_pred CCccC-CCcCCCCCCCCCCCeeecCC-CCeeEeCCCCcccCCCceecccCCCCCCCCCCCCCCCCCCCCCCccccCCCCc
Q psy5750 111 GCFDI-NECNAGTDLCHKNAMCFNEI-GSYSCQCRPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYR 188 (286)
Q Consensus 111 ~C~~~-~~C~~~~~~C~~~~~C~~~~-g~~~C~C~~G~~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~ 188 (286)
.|... +.|.. +||++++.|...+ ++|.|.|++-|+| ..|+. ++.+|.. .|| ..+++|+...++|
T Consensus 3859 gC~l~~d~C~~--npCqhgG~C~~~~~ggy~CkCpsqysG--~~CEi-------~~epC~s--nPC-~~GgtCip~~n~f 3924 (4289)
T KOG1219|consen 3859 GCSLLTDPCND--NPCQHGGTCISQPKGGYKCKCPSQYSG--NHCEI-------DLEPCAS--NPC-LTGGTCIPFYNGF 3924 (4289)
T ss_pred ccccccccccc--CcccCCCEecCCCCCceEEeCcccccC--ccccc-------ccccccC--CCC-CCCCEEEecCCCe
Confidence 45433 88988 9999999998876 6899999999999 99986 4789998 799 9999999999999
Q ss_pred eeeCCCCCCCCCCCCcccCCcc-c-cCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 189 TCNCDPGYQKDYLDDRRVAFVC-T-DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 189 ~C~C~~G~~g~~~~~~~~~~~C-~-d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
.|.|+.||+|. .| . .++||+.+ .|.++|.|+|.+|+|.|.|.+||.|. .|
T Consensus 3925 ~CnC~~gyTG~---------~Ce~~Gi~eCs~n--~C~~gg~C~n~~gsf~CncT~g~~gr--~c 3976 (4289)
T KOG1219|consen 3925 LCNCPNGYTGK---------RCEARGISECSKN--VCGTGGQCINIPGSFHCNCTPGILGR--TC 3976 (4289)
T ss_pred eEeCCCCccCc---------eeecccccccccc--cccCCceeeccCCceEeccChhHhcc--cC
Confidence 99999999999 56 2 38999876 49999999999999999999999988 66
No 6
>smart00682 G2F G2 nidogen domain and fibulin.
Probab=99.34 E-value=2.8e-12 Score=104.74 Aligned_cols=63 Identities=30% Similarity=0.564 Sum_probs=61.8
Q ss_pred CccceEEEEEECCCCeEEEeEeeeccCccCCeeeeEEEeeccCCcccCCcceeecceeeeeec
Q psy5750 1 GVFNYSAELIFSTGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEFTHS 63 (286)
Q Consensus 1 g~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~g~~p~~~~~~~~~~~~~~~~~~~~ 63 (286)
|.|+|+++|+|.+|+.|+|+|+|+|+|.+++|++++.+.|.+|.+++++.+++.||.|.|+++
T Consensus 86 G~F~~~s~v~F~~ge~l~i~Q~~~GlD~~~~L~v~~~i~G~vP~ip~~a~v~i~dY~E~Y~~t 148 (227)
T smart00682 86 GVFTRETEVTFAGGEILRIKQTFSGLDEHGYLKVKIEVSGRVPQVAAGAEVTIPDYTEEYTYT 148 (227)
T ss_pred eEEEEEEEEEECCCCEEEEEEEEeccCccccEEEEEEEEeecCCCCCCCeEEeCCceeEEEEe
Confidence 789999999999999999999999999999999999999999999999999999999999986
No 7
>PF07474 G2F: G2F domain; InterPro: IPR006605 Basement membranes are sheet-like extracellular matrices found at the basal surfaces of epithelia and condensed mesenchyma. By preventing cell mixing and providing a cell-adhesive substrate, they play crucial roles in tissue development and function. Basement membranes are composed of an evolutionarily ancient set of large glycoproteins, which includes members of the laminin family, collagen IV, perlecan and nidogen/entactin. Nidogen/entactin is an important basement membrane component, which promotes cell attachment, neutrophil chemotaxis, trophoblast outgrowth, and angiogenesis. It consists of three globular regions, G1-G3. G1 and G2 are connected by a thread-like structure, whereas the link between G2 and G3 is rod-like. The nidogen G2 region binds to collagen IV and perlecan. The nidogen G2 structure is composed of two domains, an N-terminal EGF-like domain and a much larger beta-barrel domain of ~230 residues. The nidogen G2 beta-barrel domain consists of an 11-stranded beta-barrel of complex topology, the interior of which is traversed by the hydrophobic, predominantly alpha-helical segment connecting strands C and D. The N-terminal half of the barrel comprises two beta-meanders (strands A-C and D-F) linked by the buried alpha-helical segment. The polypeptide chain then crosses the bottom of the barrel and forms a five-stranded Greek key motif in the C-terminal half of the domain. Helix alpha3 caps the top of the barrel and forms the interface to the EGF-like domain. The nidogen G2 beta-barrel domain has unexpected structural similarity to green fluorescent protein, suggesting that they derive from a common ancestor. A large surface patch on the barrel surface is strikingly conserved in all metazoan nidogens. Site-directed mutagenesis demonstrates that the conserved residues in the conserved patch are involved in the binding of perlecan, and possibly also of collagen IV.; PDB: 1GL4_A 1H4U_A.
Probab=99.29 E-value=6e-12 Score=100.90 Aligned_cols=63 Identities=30% Similarity=0.532 Sum_probs=49.8
Q ss_pred CccceEEEEEECCCCeEEEeEeeeccCccCCeeeeEEEeeccCCcccCCcceeecceeeeeec
Q psy5750 1 GVFNYSAELIFSTGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEFTHS 63 (286)
Q Consensus 1 g~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~g~~p~~~~~~~~~~~~~~~~~~~~ 63 (286)
|.|+|+++|+|.+|+.|+|+|+|.|+|.++.|.+++.+.|.+|.++.++.+++.||.|.|+++
T Consensus 86 G~F~~~s~v~F~tGe~l~itq~~~GlD~~~~L~~d~~i~G~vP~i~~~a~v~i~dy~E~Y~~t 148 (192)
T PF07474_consen 86 GEFNRESEVEFATGERLTITQTARGLDSDGYLLLDTVISGQVPQIPAGADVHIQDYTEEYVQT 148 (192)
T ss_dssp TEEEEEEEEEESSS--EEEEEEEEEE-TTS-EEEEEEEEEEE----TT-EEE---EEEEEEEE
T ss_pred cEEEEEEEEEEeCCCEEEEEEEecccCCCCcEEEEEEEeccCCCCCCCCeEEeCChhheeEEe
Confidence 789999999999999999999999999999999999999999999999999999999999986
No 8
>cd00255 nidG2 Nidogen, G2 domain; Nidogen is an important component of the basement membrane, an extracellular sheet-like matrix. Nidogen is a multifunctional protein that interacts with many other basement membrane proteins, such as collagen, perlecan, and laminin, and has a potential role in the assembly and connection of networks. Nidogen consists of 3 globular domains (G1-G3); G3 is the laminin-binding domain, while G2 binds collagen IV and perlecan. Also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions. Nidogen G2 consists of an N-terminal EGF-like domain (excluded from this alignment model) and an 11-stranded beta-barrel with a central helix, a topology that exhibits high structural similarity to the green fluorescent proteins of Cnidaria.
Probab=99.27 E-value=1.2e-11 Score=101.25 Aligned_cols=63 Identities=27% Similarity=0.477 Sum_probs=61.4
Q ss_pred CccceEEEEEECC-CCeEEEeEeeeccCccCCeeeeEEEeeccCCcccCCcceeecceeeeeec
Q psy5750 1 GVFNYSAELIFST-GQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEFTHS 63 (286)
Q Consensus 1 g~~~~~~~~~~~~-~~~~~~~~~~~g~~~~~~~~~~~~~~g~~p~~~~~~~~~~~~~~~~~~~~ 63 (286)
|.|+|+++|+|.+ |+.|+|+|+|.|+|.+++|++++.+.|.+|.+++++.+++.||.|.|++.
T Consensus 84 G~F~~~~~v~F~~~ge~l~I~Q~~~GlD~~~~L~~~~~i~G~vP~i~~~a~v~i~dY~E~Y~~t 147 (224)
T cd00255 84 GEFTRQAEVTFYTGGEKLRITQVARGLDSHGHLLLDTVISGRVPQVPAGATVHIEDYTELYHYT 147 (224)
T ss_pred ceEEEEEEEEEcCCCEEEEEEEEEeccCccCeEEEEEEEEeecCCCCCCCeEEeCCCeeeEEEc
Confidence 7899999999995 99999999999999999999999999999999999999999999999986
No 9
>KOG1217|consensus
Probab=99.13 E-value=8.5e-10 Score=102.39 Aligned_cols=183 Identities=30% Similarity=0.716 Sum_probs=131.2
Q ss_pred CCCCC--CCCCCCCeeeeCCCCCeEecCCCcccccccccCCCCCCCcc-----------CCCcCCCCCCCCCC-CeeecC
Q psy5750 69 DPCKN--FFCVANSSCIVEDDKPTCICNRGFQQLYSEDRLQDDFGCFD-----------INECNAGTDLCHKN-AMCFNE 134 (286)
Q Consensus 69 ~~C~~--~~C~~~~~C~~~~g~~~C~C~~g~~~~~c~~~~~~~~~C~~-----------~~~C~~~~~~C~~~-~~C~~~ 134 (286)
++|.. .+|.+.+.|.+..++|.|.|+++|.+..++.. ..+..|.+ ...|......|... +.|++.
T Consensus 170 ~~C~~~~~~c~~~~~C~~~~~~~~C~c~~~~~~~~~~~~-~~~~~c~~~~~~~~~~g~~~~~c~~~~~~~~~~~~~c~~~ 248 (487)
T KOG1217|consen 170 DECIQYSSPCQNGGTCVNTGGSYLCSCPPGYTGSTCETT-GNGGTCVDSVACSCPPGARGPECEVSIVECASGDGTCVNT 248 (487)
T ss_pred cccccCCCCcCCCcccccCCCCeeEeCCCCccCCcCcCC-CCCceEecceeccCCCCCCCCCcccccccccCCCCccccc
Confidence 68874 57999999999999999999999998876521 00112221 22333322334333 789999
Q ss_pred CCCeeEeCCCCcccCC-CceecccCCCCCCCCCCCCCCCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCccccC
Q psy5750 135 IGSYSCQCRPGFTGNG-HQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDV 213 (286)
Q Consensus 135 ~g~~~C~C~~G~~g~~-~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~ 213 (286)
.++|.|.|++||.++. ..|.+ ++.|..... | .++++|++..+.|.|.|++||.+... ..+.+.
T Consensus 249 ~~~~~C~~~~g~~~~~~~~~~~--------~~~C~~~~~-c-~~~~~C~~~~~~~~C~C~~g~~g~~~------~~~~~~ 312 (487)
T KOG1217|consen 249 VGSYTCRCPEGYTGDACVTCVD--------VDSCALIAS-C-PNGGTCVNVPGSYRCTCPPGFTGRLC------TECVDV 312 (487)
T ss_pred CCceeeeCCCCccccccceeee--------ccccCCCCc-c-CCCCeeecCCCcceeeCCCCCCCCCC------cccccc
Confidence 9999999999999865 45666 889988644 8 88899999999899999999999843 145667
Q ss_pred CCcCC--CCCCCCCCCee--eeCCCCeeeecCCCCcCCCCCcCCCC-eeecccccccccceeeee
Q psy5750 214 DECMN--YPPICNNNADC--INRPGTYQCQCKRGFSGDGFNCEEGK-YCLVVGITLCKMYLEVVN 273 (286)
Q Consensus 214 deC~~--~~~~C~~~~~C--~n~~g~y~C~C~~Gy~gdg~~C~~~~-~c~~~~~~~c~~~~~~~~ 273 (286)
++|.. ....|.+++.| .+..+.+.|.|..||.|. .|+... .|.... +...+.+++
T Consensus 313 ~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c~~~~~g~--~C~~~~~~C~~~~---~~~~~~c~~ 372 (487)
T KOG1217|consen 313 DECSPRNAGGPCANGGTCNTLGSFGGFRCACGPGFTGR--RCEDSNDECASSP---CCPGGTCVN 372 (487)
T ss_pred ccccccccCCcCCCCcccccCCCCCCCCcCCCCCCCCC--ccccCCccccCCc---cccCCEecc
Confidence 78852 33558888888 344567889999997666 887664 675543 555555554
No 10
>KOG1217|consensus
Probab=99.08 E-value=1.3e-09 Score=101.06 Aligned_cols=163 Identities=33% Similarity=0.765 Sum_probs=115.8
Q ss_pred CCCCeeeeC---CCCCeEecCCCcccccccccCCCCCCCccCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCcccCCCce
Q psy5750 77 VANSSCIVE---DDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQC 153 (286)
Q Consensus 77 ~~~~~C~~~---~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~~g~~~~C 153 (286)
...+.|.+. ...+.|.|..||.+..+. ...++|.....+|.+.+.|.+..++|.|.|+++|.+ ..|
T Consensus 137 ~~~~~c~~~~~~~~~~~c~C~~g~~~~~~~---------~~~~~C~~~~~~c~~~~~C~~~~~~~~C~c~~~~~~--~~~ 205 (487)
T KOG1217|consen 137 CIDGSCSNGPGSVGPFRCSCTEGYEGEPCE---------TDLDECIQYSSPCQNGGTCVNTGGSYLCSCPPGYTG--STC 205 (487)
T ss_pred eCchhhcCCCCCCCceeeeeCCCccccccc---------ccccccccCCCCcCCCcccccCCCCeeEeCCCCccC--CcC
Confidence 345566654 347899999999977555 233678754577999999999999999999999987 333
Q ss_pred ecc------------cCCCCCCCCCCCCCCCCCCCCC-CccccCCCCceeeCCCCCCCCCCCCcccCCccccCCCcCCCC
Q psy5750 154 TEI------------TVPQTGPTSPCESDPRACNPPH-STCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYP 220 (286)
Q Consensus 154 ~~~------------~~~~~~~~~~C~~~~~~C~~~~-~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~ 220 (286)
... ..+.....+.|......| ... +.|++..++|.|.|++||.+... ..|.++++|....
T Consensus 206 ~~~~~~~~c~~~~~~~~~~g~~~~~c~~~~~~~-~~~~~~c~~~~~~~~C~~~~g~~~~~~------~~~~~~~~C~~~~ 278 (487)
T KOG1217|consen 206 ETTGNGGTCVDSVACSCPPGARGPECEVSIVEC-ASGDGTCVNTVGSYTCRCPEGYTGDAC------VTCVDVDSCALIA 278 (487)
T ss_pred cCCCCCceEecceeccCCCCCCCCCcccccccc-cCCCCcccccCCceeeeCCCCcccccc------ceeeeccccCCCC
Confidence 220 000000122333332334 333 78999889999999999999841 1578999999876
Q ss_pred CCCCCCCeeeeCCCCeeeecCCCCcCCCC-CcCCCCeee
Q psy5750 221 PICNNNADCINRPGTYQCQCKRGFSGDGF-NCEEGKYCL 258 (286)
Q Consensus 221 ~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~-~C~~~~~c~ 258 (286)
+ |.++++|++..+.|.|.|++||.|..- .|.....|.
T Consensus 279 ~-c~~~~~C~~~~~~~~C~C~~g~~g~~~~~~~~~~~C~ 316 (487)
T KOG1217|consen 279 S-CPNGGTCVNVPGSYRCTCPPGFTGRLCTECVDVDECS 316 (487)
T ss_pred c-cCCCCeeecCCCcceeeCCCCCCCCCCcccccccccc
Confidence 5 999999999999999999999999853 343445553
No 11
>KOG4289|consensus
Probab=98.99 E-value=6.5e-10 Score=108.82 Aligned_cols=98 Identities=29% Similarity=0.765 Sum_probs=77.5
Q ss_pred CCCCeeEeCCCCcccCCCceecccCCCCCCCCCCCCCCCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCccccC
Q psy5750 134 EIGSYSCQCRPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDV 213 (286)
Q Consensus 134 ~~g~~~C~C~~G~~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~ 213 (286)
..++++|+|++||+| ..|+. .+|.|-. .+| .+++.|....|+|+|+|.+||+|..+... ...
T Consensus 1218 pvnglrCrCPpGFTg--d~CeT-------eiDlCYs--~pC-~nng~C~srEggYtCeCrpg~tGehCEvs------~~a 1279 (2531)
T KOG4289|consen 1218 PVNGLRCRCPPGFTG--DYCET-------EIDLCYS--GPC-GNNGRCRSREGGYTCECRPGFTGEHCEVS------ARA 1279 (2531)
T ss_pred ccCceeEeCCCCCCc--ccccc-------hhHhhhc--CCC-CCCCceEEecCceeEEecCCccccceeee------ccc
Confidence 345689999999999 68876 4899988 799 99999999999999999999999944210 122
Q ss_pred CCcCCCCCCCCCCCeeeeC-CCCeeeecCCCCcCCCCCcC
Q psy5750 214 DECMNYPPICNNNADCINR-PGTYQCQCKRGFSGDGFNCE 252 (286)
Q Consensus 214 deC~~~~~~C~~~~~C~n~-~g~y~C~C~~Gy~gdg~~C~ 252 (286)
-.|.. ..|.++++|+|. .|.|.|.|+.| .-.+..|+
T Consensus 1280 grCvp--GvC~nggtC~~~~nggf~c~Cp~g-e~e~prC~ 1316 (2531)
T KOG4289|consen 1280 GRCVP--GVCKNGGTCVNLLNGGFCCHCPYG-EFEDPRCE 1316 (2531)
T ss_pred Ccccc--ceecCCCEEeecCCCceeccCCCc-ccCCCceE
Confidence 34654 579999999987 57899999998 43444673
No 12
>KOG4260|consensus
Probab=98.90 E-value=9.5e-10 Score=90.78 Aligned_cols=146 Identities=29% Similarity=0.669 Sum_probs=99.5
Q ss_pred CCCCCCCeeeeC---CCCCeEecCCCcccccccccCCCCCCCccCCC----cCCCCCCCCCCCeeecCCCCeeE-eCCCC
Q psy5750 74 FFCVANSSCIVE---DDKPTCICNRGFQQLYSEDRLQDDFGCFDINE----CNAGTDLCHKNAMCFNEIGSYSC-QCRPG 145 (286)
Q Consensus 74 ~~C~~~~~C~~~---~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~----C~~~~~~C~~~~~C~~~~g~~~C-~C~~G 145 (286)
.||..++.|... .|+-+|.|.+||.|+.|..-.. +..=...++ |..-..+|.. .|... ++-.| .|+.|
T Consensus 150 r~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg~-eyfes~Rne~~lvCt~Ch~~C~~--~Csg~-~~k~C~kCkkG 225 (350)
T KOG4260|consen 150 RPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCGI-EYFESSRNEQHLVCTACHEGCLG--VCSGE-SSKGCSKCKKG 225 (350)
T ss_pred CCcCCCCcccCCCCCCCCCcccccCCCCCccccccch-HHHHhhcccccchhhhhhhhhhc--ccCCC-CCCChhhhccc
Confidence 689889999754 5677999999999987752111 000000011 1110022322 34322 22344 59999
Q ss_pred cccCCCceecccCCCCCCCCCCCCCCCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCccccCCCcCCCCCCCC-
Q psy5750 146 FTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICN- 224 (286)
Q Consensus 146 ~~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~- 224 (286)
|..+...|.+ +++|...+.+| .....|+|+.|+|.|++.+||.+. +|+|..--..|.
T Consensus 226 W~lde~gCvD--------vnEC~~ep~~c-~~~qfCvNteGSf~C~dk~Gy~~g-------------~d~C~~~~d~~~~ 283 (350)
T KOG4260|consen 226 WKLDEEGCVD--------VNECQNEPAPC-KAHQFCVNTEGSFKCEDKEGYKKG-------------VDECQFCADVCAS 283 (350)
T ss_pred ceeccccccc--------HHHHhcCCCCC-ChhheeecCCCceEecccccccCC-------------hHHhhhhhhhccc
Confidence 9987788998 99999888899 888999999999999999999763 455543112233
Q ss_pred CCCeeeeCCCCeeeecCCCCc
Q psy5750 225 NNADCINRPGTYQCQCKRGFS 245 (286)
Q Consensus 225 ~~~~C~n~~g~y~C~C~~Gy~ 245 (286)
.+..|.|++++|+|+|..|+.
T Consensus 284 kn~~c~ni~~~~r~v~f~~~~ 304 (350)
T KOG4260|consen 284 KNRPCMNIDGQYRCVCFSGLI 304 (350)
T ss_pred CCCCcccCCccEEEEecccce
Confidence 466899999999999999976
No 13
>KOG4260|consensus
Probab=98.90 E-value=1.5e-09 Score=89.60 Aligned_cols=137 Identities=26% Similarity=0.620 Sum_probs=89.9
Q ss_pred ecCCCcccccccccCCCCCCCccCCCcCCCCCCCCCCCeeecC---CCCeeEeCCCCcccCCCceecccCC-----CCCC
Q psy5750 92 ICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHKNAMCFNE---IGSYSCQCRPGFTGNGHQCTEITVP-----QTGP 163 (286)
Q Consensus 92 ~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~C~~~~~C~~~---~g~~~C~C~~G~~g~~~~C~~~~~~-----~~~~ 163 (286)
-|++|..|+.|. .|.... ..+|..++.|... .|+-.|.|.+||.| ..|....+. +...
T Consensus 131 CCp~gtyGpdCl-------~Cpggs-----er~C~GnG~C~GdGsR~GsGkCkC~~GY~G--p~C~~Cg~eyfes~Rne~ 196 (350)
T KOG4260|consen 131 CCPDGTYGPDCL-------QCPGGS-----ERPCFGNGSCHGDGSREGSGKCKCETGYTG--PLCRYCGIEYFESSRNEQ 196 (350)
T ss_pred ccCCCCcCCccc-------cCCCCC-----cCCcCCCCcccCCCCCCCCCcccccCCCCC--ccccccchHHHHhhcccc
Confidence 377887777554 232211 1667777888633 46779999999999 777542110 0001
Q ss_pred CCCCCCCCCCCCCCCCccccCCCCcee-eCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeeeeCCCCeeeecCC
Q psy5750 164 TSPCESDPRACNPPHSTCTNLTDYRTC-NCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCINRPGTYQCQCKR 242 (286)
Q Consensus 164 ~~~C~~~~~~C~~~~~~C~~~~g~~~C-~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~ 242 (286)
--.|..-...| .+.|.. ..+..| +|..||..+.- .|.|||||...+..|..+..|+|+.|||+|.+++
T Consensus 197 ~lvCt~Ch~~C---~~~Csg-~~~k~C~kCkkGW~lde~-------gCvDvnEC~~ep~~c~~~qfCvNteGSf~C~dk~ 265 (350)
T KOG4260|consen 197 HLVCTACHEGC---LGVCSG-ESSKGCSKCKKGWKLDEE-------GCVDVNECQNEPAPCKAHQFCVNTEGSFKCEDKE 265 (350)
T ss_pred cchhhhhhhhh---hcccCC-CCCCChhhhcccceeccc-------ccccHHHHhcCCCCCChhheeecCCCceEecccc
Confidence 11222111223 224432 334456 79999988744 7999999999988999999999999999999999
Q ss_pred CCcCCCCCcCC
Q psy5750 243 GFSGDGFNCEE 253 (286)
Q Consensus 243 Gy~gdg~~C~~ 253 (286)
||.++--.|+.
T Consensus 266 Gy~~g~d~C~~ 276 (350)
T KOG4260|consen 266 GYKKGVDECQF 276 (350)
T ss_pred cccCChHHhhh
Confidence 99984334544
No 14
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=98.89 E-value=1.2e-09 Score=66.19 Aligned_cols=34 Identities=41% Similarity=1.084 Sum_probs=32.6
Q ss_pred cCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCc
Q psy5750 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFS 245 (286)
Q Consensus 212 d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~ 245 (286)
|||||...++.|..++.|+|+.|+|.|.|++||.
T Consensus 1 DidEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 1 DIDECAEGPHNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp ESSTTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred CccccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 7899999889999899999999999999999999
No 15
>KOG1225|consensus
Probab=98.83 E-value=2.9e-08 Score=91.21 Aligned_cols=126 Identities=30% Similarity=0.807 Sum_probs=84.1
Q ss_pred eEecCCCcccccccccCCCCCCCccCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCcccCCCceecccCCCCCCCCCCCC
Q psy5750 90 TCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQCTEITVPQTGPTSPCES 169 (286)
Q Consensus 90 ~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~~g~~~~C~~~~~~~~~~~~~C~~ 169 (286)
.|.|..+|.++.|. ...| . +.|..++.|++. +|.|++||+| ..|.. -.|..
T Consensus 235 ic~c~~~~~g~~c~-----~~~C------~---~~c~~~g~c~~G----~CIC~~Gf~G--~dC~e---------~~Cp~ 285 (525)
T KOG1225|consen 235 ICECPEGYFGPLCS-----TIYC------P---GGCTGRGQCVEG----RCICPPGFTG--DDCDE---------LVCPV 285 (525)
T ss_pred eeecCCceeCCccc-----cccC------C---CCCcccceEeCC----eEeCCCCCcC--CCCCc---------ccCCc
Confidence 68888888877554 2222 2 455556778776 8999999998 66654 22332
Q ss_pred CCCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcCCCC
Q psy5750 170 DPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGF 249 (286)
Q Consensus 170 ~~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~ 249 (286)
.| ..++.+++. .|.|.+||.|. .|+ +-+|. ..|.+++.|+ .| +|.|.+||+|+
T Consensus 286 ---~c-s~~g~~~~g----~CiC~~g~~G~---------dCs-~~~cp---adC~g~G~Ci--~G--~C~C~~Gy~G~-- 338 (525)
T KOG1225|consen 286 ---DC-SGGGVCVDG----ECICNPGYSGK---------DCS-IRRCP---ADCSGHGKCI--DG--ECLCDEGYTGE-- 338 (525)
T ss_pred ---cc-CCCceecCC----EeecCCCcccc---------ccc-cccCC---ccCCCCCccc--CC--ceEeCCCCcCC--
Confidence 25 445566554 89999999998 443 33354 3499999998 23 79999999988
Q ss_pred CcCCCCeeecccccccccceeeeecccccCCc
Q psy5750 250 NCEEGKYCLVVGITLCKMYLEVVNIQEICGEN 281 (286)
Q Consensus 250 ~C~~~~~c~~~~~~~c~~~~~~~~~~~~~~~~ 281 (286)
.|+.. . |.+..-+++. .+|.++
T Consensus 339 ~C~~~--------~-C~~~g~cv~g-C~C~~G 360 (525)
T KOG1225|consen 339 LCIQR--------A-CSGGGQCVNG-CKCKKG 360 (525)
T ss_pred ccccc--------c-cCCCceeccC-ceeccC
Confidence 77443 2 5566666666 555554
No 16
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=98.81 E-value=1.9e-09 Score=62.49 Aligned_cols=36 Identities=47% Similarity=1.148 Sum_probs=28.8
Q ss_pred cCCCCCCCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 216 CMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 216 C~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
|...++.|+.+|+|++++++|.|+|++||.|||+.|
T Consensus 1 C~~~~~~C~~nA~C~~~~~~~~C~C~~Gy~GdG~~C 36 (36)
T PF12947_consen 1 CLENNGGCHPNATCTNTGGSYTCTCKPGYEGDGFFC 36 (36)
T ss_dssp TTTGGGGS-TTCEEEE-TTSEEEEE-CEEECCSTCE
T ss_pred CCCCCCCCCCCcEeecCCCCEEeECCCCCccCCcCC
Confidence 344456799999999999999999999999999876
No 17
>KOG1225|consensus
Probab=98.71 E-value=7.2e-08 Score=88.69 Aligned_cols=110 Identities=31% Similarity=0.954 Sum_probs=81.5
Q ss_pred CCCCCeeeeCCCCCeEecCCCcccccccccCCCCCCCccCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCcccCCCceec
Q psy5750 76 CVANSSCIVEDDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQCTE 155 (286)
Q Consensus 76 C~~~~~C~~~~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~~g~~~~C~~ 155 (286)
|...+.|++. +|.|++||.|..|. ... |. ..|..++.+++. .|.|++||.| ..|+.
T Consensus 256 c~~~g~c~~G----~CIC~~Gf~G~dC~-----e~~------Cp---~~cs~~g~~~~g----~CiC~~g~~G--~dCs~ 311 (525)
T KOG1225|consen 256 CTGRGQCVEG----RCICPPGFTGDDCD-----ELV------CP---VDCSGGGVCVDG----ECICNPGYSG--KDCSI 311 (525)
T ss_pred CcccceEeCC----eEeCCCCCcCCCCC-----ccc------CC---cccCCCceecCC----EeecCCCccc--ccccc
Confidence 4444667765 79999999988776 233 33 237666777665 8999999999 77764
Q ss_pred ccCCCCCCCCCCCCCCCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeeeeCCCC
Q psy5750 156 ITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCINRPGT 235 (286)
Q Consensus 156 ~~~~~~~~~~~C~~~~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~n~~g~ 235 (286)
..|. ..| ..++.|++. +|.|.+||+|. .|... . |.+++.|+|.
T Consensus 312 ---------~~cp---adC-~g~G~Ci~G----~C~C~~Gy~G~---------~C~~~-------~-C~~~g~cv~g--- 354 (525)
T KOG1225|consen 312 ---------RRCP---ADC-SGHGKCIDG----ECLCDEGYTGE---------LCIQR-------A-CSGGGQCVNG--- 354 (525)
T ss_pred ---------ccCC---ccC-CCCCcccCC----ceEeCCCCcCC---------ccccc-------c-cCCCceeccC---
Confidence 3343 478 888999844 99999999998 55432 2 7888889852
Q ss_pred eeeecCCCCcCCC
Q psy5750 236 YQCQCKRGFSGDG 248 (286)
Q Consensus 236 y~C~C~~Gy~gdg 248 (286)
|+|..||+|..
T Consensus 355 --C~C~~Gw~G~d 365 (525)
T KOG1225|consen 355 --CKCKKGWRGPD 365 (525)
T ss_pred --ceeccCccCCC
Confidence 99999999874
No 18
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=98.39 E-value=7.5e-08 Score=75.71 Aligned_cols=144 Identities=31% Similarity=0.700 Sum_probs=92.4
Q ss_pred CCCCCCeeeeCCCCCeEecCCCcccccccccCCCCCCCccCCCcCC---CCCCCCCCCeeecCC-----CCeeEeCCCCc
Q psy5750 75 FCVANSSCIVEDDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNA---GTDLCHKNAMCFNEI-----GSYSCQCRPGF 146 (286)
Q Consensus 75 ~C~~~~~C~~~~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~---~~~~C~~~~~C~~~~-----g~~~C~C~~G~ 146 (286)
.|. +|..++..+.|.|.|.+||... ...+|....+|.. -..+|...+.|++.. ..|.|.|..||
T Consensus 7 ~CK-NG~LiQMSNHfEC~Cnegfvl~-------~EntCE~kv~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY 78 (197)
T PF06247_consen 7 ICK-NGYLIQMSNHFECKCNEGFVLK-------NENTCEEKVECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGY 78 (197)
T ss_dssp --B-TEEEEEESSEEEEEESTTEEEE-------ETTEEEE----SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTE
T ss_pred ccc-CCEEEEccCceEEEcCCCcEEc-------cccccccceecCcccccCccccchhhhhcCCCcccceeEEEecccCc
Confidence 454 5677778889999999999876 2456666666654 127899999998775 57999999999
Q ss_pred ccCCCceecccCCCCCCCCCCCCCCCCCCCCCCccccC---CCCceeeCCCCCCCCCCCCcccCCcc--ccCCCcCCCCC
Q psy5750 147 TGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNL---TDYRTCNCDPGYQKDYLDDRRVAFVC--TDVDECMNYPP 221 (286)
Q Consensus 147 ~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~~~~C~~~---~g~~~C~C~~G~~g~~~~~~~~~~~C--~d~deC~~~~~ 221 (286)
+.....|.+ +.|.. ..| . .+.|+.. +....|+|.-|++.. +...| +...+|...
T Consensus 79 ~~~~~vCvp---------~~C~~--~~C-g-~GKCI~d~~~~~~~~CSC~IGkV~~------dn~kCtk~G~T~C~LK-- 137 (197)
T PF06247_consen 79 ILKQGVCVP---------NKCNN--KDC-G-SGKCILDPDNPNNPTCSCNIGKVPD------DNKKCTKTGETKCSLK-- 137 (197)
T ss_dssp EESSSSEEE---------GGGSS------T-TEEEEEEEGGGSEEEEEE-TEEETT------TTTESEEEE---------
T ss_pred eeeCCeEch---------hhcCc--eec-C-CCeEEecCCCCCCceeEeeeceEec------cCCcccCCCccceeee--
Confidence 976678875 56766 567 4 6788752 234499999999822 12267 344578764
Q ss_pred CCCCCCeeeeCCCCeeeecCCCCcCCC
Q psy5750 222 ICNNNADCINRPGTYQCQCKRGFSGDG 248 (286)
Q Consensus 222 ~C~~~~~C~n~~g~y~C~C~~Gy~gdg 248 (286)
|..+..|....+-|.|.|.+||.+++
T Consensus 138 -Ck~nE~CK~~~~~Y~C~~~~~~~~~~ 163 (197)
T PF06247_consen 138 -CKENEECKLVDGYYKCVCKEGFPGDG 163 (197)
T ss_dssp --TTTEEEEEETTEEEEEE-TT-EEET
T ss_pred -cCCCcceeeeCcEEEeecCCCCCCCC
Confidence 88889999999999999999998764
No 19
>KOG1226|consensus
Probab=98.37 E-value=1.6e-05 Score=75.18 Aligned_cols=163 Identities=23% Similarity=0.574 Sum_probs=101.3
Q ss_pred CCCCCCCeeeeCCCCCeEecCCCcccccccccCCCCCCCccCCCcCCC--CCCCCCCCeeecCCCCeeEeCCCCccc--C
Q psy5750 74 FFCVANSSCIVEDDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAG--TDLCHKNAMCFNEIGSYSCQCRPGFTG--N 149 (286)
Q Consensus 74 ~~C~~~~~C~~~~g~~~C~C~~g~~~~~c~~~~~~~~~C~~~~~C~~~--~~~C~~~~~C~~~~g~~~C~C~~G~~g--~ 149 (286)
..|..+|+++-. .|.|.+||.|..|+-......+-...+.|... ..+|...+.|.=. +|.|.+...+ .
T Consensus 467 ~~C~g~G~~~CG----~C~C~~G~~G~~CEC~~~~~ss~~~~~~Cr~~~~~~vCSgrG~C~CG----qC~C~~~~~~~i~ 538 (783)
T KOG1226|consen 467 ALCHGNGTFVCG----QCRCDEGWLGKKCECSTDELSSSEEEDKCRENSDSPVCSGRGDCVCG----QCVCHKPDNGKIY 538 (783)
T ss_pred cccCCCCcEEec----ceecCCCCCCCcccCCccccCcHhHHhhccCCCCCCCcCCCCcEeCC----ceEecCCCCCcee
Confidence 445555555544 58999999999887211111110123455432 2478888888766 6888776551 1
Q ss_pred CCceecccCCCCCCCCCCCCC-CCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCcc-ccCCCcCCCC-CCCCCC
Q psy5750 150 GHQCTEITVPQTGPTSPCESD-PRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVC-TDVDECMNYP-PICNNN 226 (286)
Q Consensus 150 ~~~C~~~~~~~~~~~~~C~~~-~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C-~d~deC~~~~-~~C~~~ 226 (286)
|..|+- +.-.|... ...| ..++.|.=. +|.|.+||+|..+ .| .+.|.|.... ..|...
T Consensus 539 G~fCEC-------DnfsC~r~~g~lC-~g~G~C~CG----~CvC~~GwtG~~C-------~C~~std~C~~~~G~iCSGr 599 (783)
T KOG1226|consen 539 GKFCEC-------DNFSCERHKGVLC-GGHGRCECG----RCVCNPGWTGSAC-------NCPLSTDTCESSDGQICSGR 599 (783)
T ss_pred eeeeec-------cCcccccccCccc-CCCCeEeCC----cEEcCCCCccCCC-------CCCCCCccccCCCCceeCCC
Confidence 266654 12223321 1346 556666543 8999999999977 56 6777886543 357777
Q ss_pred CeeeeCCCCeeeecCCC-CcCCCCCcCCCCeeecccccccccceeeee
Q psy5750 227 ADCINRPGTYQCQCKRG-FSGDGFNCEEGKYCLVVGITLCKMYLEVVN 273 (286)
Q Consensus 227 ~~C~n~~g~y~C~C~~G-y~gdg~~C~~~~~c~~~~~~~c~~~~~~~~ 273 (286)
++|.=. +|+|... |.|. .||.-..| ..+|..+..||.
T Consensus 600 G~C~Cg----~C~C~~~~~sG~--~CE~cptc----~~~C~~~~~Cve 637 (783)
T KOG1226|consen 600 GTCECG----RCKCTDPPYSGE--FCEKCPTC----PDPCAENKSCVE 637 (783)
T ss_pred ceeeCC----ceEcCCCCcCcc--hhhcCCCC----CCcccccccchh
Confidence 777522 6888865 8887 88877777 334777776654
No 20
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx (disulphide bonds: C1-C3, C2-C4, C5-C6), where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=98.36 E-value=4e-07 Score=55.06 Aligned_cols=34 Identities=50% Similarity=1.155 Sum_probs=32.0
Q ss_pred cCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCcc
Q psy5750 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFT 147 (286)
Q Consensus 114 ~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~~ 147 (286)
|||||..+.+.|..++.|+|+.|+|.|.|++||.
T Consensus 1 DidEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 1 DIDECAEGPHNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp ESSTTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred CccccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 6899998888999899999999999999999998
No 21
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=98.32 E-value=9.7e-07 Score=52.28 Aligned_cols=38 Identities=45% Similarity=1.146 Sum_probs=31.6
Q ss_pred cCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 212 d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
++|+|... ..|.+++.|+|..++|.|.|++||. +|..|
T Consensus 1 d~~~C~~~-~~C~~~~~C~~~~g~~~C~C~~g~~-~g~~C 38 (39)
T smart00179 1 DIDECASG-NPCQNGGTCVNTVGSYRCECPPGYT-DGRNC 38 (39)
T ss_pred CcccCcCC-CCcCCCCEeECCCCCeEeECCCCCc-cCCcC
Confidence 46888763 4599999999999999999999999 55576
No 22
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=98.07 E-value=2.1e-06 Score=49.71 Aligned_cols=32 Identities=50% Similarity=1.190 Sum_probs=26.3
Q ss_pred CCCCCCCCeeecCCCCeeEeCCCCcccCCCce
Q psy5750 122 TDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQC 153 (286)
Q Consensus 122 ~~~C~~~~~C~~~~g~~~C~C~~G~~g~~~~C 153 (286)
.+.|+.+|.|+++.++|.|.|++||.|+|..|
T Consensus 5 ~~~C~~nA~C~~~~~~~~C~C~~Gy~GdG~~C 36 (36)
T PF12947_consen 5 NGGCHPNATCTNTGGSYTCTCKPGYEGDGFFC 36 (36)
T ss_dssp GGGS-TTCEEEE-TTSEEEEE-CEEECCSTCE
T ss_pred CCCCCCCcEeecCCCCEEeECCCCCccCCcCC
Confidence 46799999999999999999999999998765
No 23
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=98.01 E-value=3e-06 Score=47.86 Aligned_cols=26 Identities=42% Similarity=1.206 Sum_probs=24.3
Q ss_pred CCCCCCeeeeCC-CCeeeecCCCCcCC
Q psy5750 222 ICNNNADCINRP-GTYQCQCKRGFSGD 247 (286)
Q Consensus 222 ~C~~~~~C~n~~-g~y~C~C~~Gy~gd 247 (286)
.|.++|+|++.. ++|.|+|++||.|+
T Consensus 5 ~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 5 PCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp SSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred cCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 599999999998 99999999999986
No 24
>KOG1226|consensus
Probab=97.81 E-value=0.00014 Score=69.03 Aligned_cols=131 Identities=20% Similarity=0.464 Sum_probs=82.2
Q ss_pred eEeCCCCcccCCCceecccCCCCC--CCCCCCCC--CCCCCCCCCccccCCCCceeeCCCCCCCCCCCCcccCCccccC-
Q psy5750 139 SCQCRPGFTGNGHQCTEITVPQTG--PTSPCESD--PRACNPPHSTCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDV- 213 (286)
Q Consensus 139 ~C~C~~G~~g~~~~C~~~~~~~~~--~~~~C~~~--~~~C~~~~~~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~- 213 (286)
.|.|.+||.| ..|+-....... ..+.|... ..+| ...+.|+=. +|.|.+...+. ..+..|+.-
T Consensus 479 ~C~C~~G~~G--~~CEC~~~~~ss~~~~~~Cr~~~~~~vC-SgrG~C~CG----qC~C~~~~~~~-----i~G~fCECDn 546 (783)
T KOG1226|consen 479 QCRCDEGWLG--KKCECSTDELSSSEEEDKCRENSDSPVC-SGRGDCVCG----QCVCHKPDNGK-----IYGKFCECDN 546 (783)
T ss_pred ceecCCCCCC--CcccCCccccCcHhHHhhccCCCCCCCc-CCCCcEeCC----ceEecCCCCCc-----eeeeeeeccC
Confidence 6799999999 665421111000 12344321 1256 556666543 78888776632 111244321
Q ss_pred CCcCCC-CCCCCCCCeeeeCCCCeeeecCCCCcCCCCCcCC-CCeeecccccccccceeeeecccccCCccccC
Q psy5750 214 DECMNY-PPICNNNADCINRPGTYQCQCKRGFSGDGFNCEE-GKYCLVVGITLCKMYLEVVNIQEICGENSISG 285 (286)
Q Consensus 214 deC~~~-~~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C~~-~~~c~~~~~~~c~~~~~~~~~~~~~~~~~~~~ 285 (286)
-.|... ..+|..+++|.=. +|+|.+||+|+-=.|.. .+.|..++...|++++++.-++.+|+...++|
T Consensus 547 fsC~r~~g~lC~g~G~C~CG----~CvC~~GwtG~~C~C~~std~C~~~~G~iCSGrG~C~Cg~C~C~~~~~sG 616 (783)
T KOG1226|consen 547 FSCERHKGVLCGGHGRCECG----RCVCNPGWTGSACNCPLSTDTCESSDGQICSGRGTCECGRCKCTDPPYSG 616 (783)
T ss_pred cccccccCcccCCCCeEeCC----cEEcCCCCccCCCCCCCCCccccCCCCceeCCCceeeCCceEcCCCCcCc
Confidence 124322 2358888888532 79999999998444433 45787788889999999999999999987665
No 25
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.80 E-value=1.2e-05 Score=45.45 Aligned_cols=30 Identities=27% Similarity=0.581 Sum_probs=26.9
Q ss_pred CCCCCCCCCCeeeeCC-CCCeEecCCCcccc
Q psy5750 71 CKNFFCVANSSCIVED-DKPTCICNRGFQQL 100 (286)
Q Consensus 71 C~~~~C~~~~~C~~~~-g~~~C~C~~g~~~~ 100 (286)
|.++||.++++|+... +.|.|.|++||.|.
T Consensus 1 C~~~~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 1 CSSNPCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp TTTTSSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred CCCCcCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 5567999999999998 99999999999875
No 26
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.79 E-value=3.6e-05 Score=44.88 Aligned_cols=36 Identities=42% Similarity=1.136 Sum_probs=29.9
Q ss_pred CCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 213 VDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 213 ~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
+++|... ..|.+++.|++..++|.|.|++||.|. .|
T Consensus 2 ~~~C~~~-~~C~~~~~C~~~~~~~~C~C~~g~~g~--~C 37 (38)
T cd00054 2 IDECASG-NPCQNGGTCVNTVGSYRCSCPPGYTGR--NC 37 (38)
T ss_pred cccCCCC-CCcCCCCEeECCCCCeEeECCCCCcCC--cC
Confidence 5778652 359888999999999999999999985 55
No 27
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.75 E-value=4.4e-05 Score=44.98 Aligned_cols=35 Identities=20% Similarity=0.492 Sum_probs=30.2
Q ss_pred CCCCCCC-CCCCCCCeeeeCCCCCeEecCCCcc-ccc
Q psy5750 67 NDDPCKN-FFCVANSSCIVEDDKPTCICNRGFQ-QLY 101 (286)
Q Consensus 67 ~~~~C~~-~~C~~~~~C~~~~g~~~C~C~~g~~-~~~ 101 (286)
++++|.. .+|.++++|++..++|.|.|++||. +..
T Consensus 1 d~~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~~g~~ 37 (39)
T smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGYTDGRN 37 (39)
T ss_pred CcccCcCCCCcCCCCEeECCCCCeEeECCCCCccCCc
Confidence 3678887 7999999999999999999999998 543
No 28
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=97.65 E-value=4.6e-05 Score=39.61 Aligned_cols=24 Identities=46% Similarity=0.873 Sum_probs=19.6
Q ss_pred CceeeCCCCCCCCCCCCcccCCccccCCC
Q psy5750 187 YRTCNCDPGYQKDYLDDRRVAFVCTDVDE 215 (286)
Q Consensus 187 ~~~C~C~~G~~g~~~~~~~~~~~C~d~de 215 (286)
+|+|.|++||.... ++..|+||||
T Consensus 1 sy~C~C~~Gy~l~~-----d~~~C~DIdE 24 (24)
T PF12662_consen 1 SYTCSCPPGYQLSP-----DGRSCEDIDE 24 (24)
T ss_pred CEEeeCCCCCcCCC-----CCCccccCCC
Confidence 58999999999763 3458999987
No 29
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=97.57 E-value=0.00014 Score=41.72 Aligned_cols=30 Identities=43% Similarity=1.226 Sum_probs=26.1
Q ss_pred CCCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 221 PICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 221 ~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
..|.+++.|++..++|.|.|+.||.|+ ..|
T Consensus 6 ~~C~~~~~C~~~~~~~~C~C~~g~~g~-~~C 35 (36)
T cd00053 6 NPCSNGGTCVNTPGSYRCVCPPGYTGD-RSC 35 (36)
T ss_pred CCCCCCCEEecCCCCeEeECCCCCccc-CCc
Confidence 458888999999999999999999987 445
No 30
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=97.56 E-value=4.6e-05 Score=44.04 Aligned_cols=29 Identities=41% Similarity=1.103 Sum_probs=22.1
Q ss_pred CCCCCCCeeeeCCCCeeeecCCCCcC--CCCCc
Q psy5750 221 PICNNNADCINRPGTYQCQCKRGFSG--DGFNC 251 (286)
Q Consensus 221 ~~C~~~~~C~n~~g~y~C~C~~Gy~g--dg~~C 251 (286)
..|.+ .|+|++++|+|.|++||.- |+++|
T Consensus 6 GgC~h--~C~~~~g~~~C~C~~Gy~L~~D~~tC 36 (36)
T PF14670_consen 6 GGCSH--ICVNTPGSYRCSCPPGYKLAEDGRTC 36 (36)
T ss_dssp GGSSS--EEEEETTSEEEE-STTEEE-TTSSSE
T ss_pred CCcCC--CCccCCCceEeECCCCCEECcCCCCC
Confidence 34654 8999999999999999984 55554
No 31
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=97.55 E-value=7.1e-05 Score=38.88 Aligned_cols=21 Identities=38% Similarity=1.016 Sum_probs=18.0
Q ss_pred CeeeecCCCCc--CCCCCcCCCC
Q psy5750 235 TYQCQCKRGFS--GDGFNCEEGK 255 (286)
Q Consensus 235 ~y~C~C~~Gy~--gdg~~C~~~~ 255 (286)
||.|+|++||+ .++++|++++
T Consensus 1 sy~C~C~~Gy~l~~d~~~C~DId 23 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDGRSCEDID 23 (24)
T ss_pred CEEeeCCCCCcCCCCCCccccCC
Confidence 69999999999 6888997664
No 32
>smart00181 EGF Epidermal growth factor-like domain.
Probab=97.48 E-value=0.0002 Score=41.07 Aligned_cols=29 Identities=48% Similarity=1.259 Sum_probs=24.8
Q ss_pred CCCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 221 PICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 221 ~~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
..|.++ +|++..++|.|.|++||.|+ ..|
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~-~~C 34 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTGD-KRC 34 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCccC-Ccc
Confidence 358888 99999999999999999985 355
No 33
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.42 E-value=0.00023 Score=41.32 Aligned_cols=35 Identities=20% Similarity=0.481 Sum_probs=29.8
Q ss_pred CCCCCC-CCCCCCCeeeeCCCCCeEecCCCcccccc
Q psy5750 68 DDPCKN-FFCVANSSCIVEDDKPTCICNRGFQQLYS 102 (286)
Q Consensus 68 ~~~C~~-~~C~~~~~C~~~~g~~~C~C~~g~~~~~c 102 (286)
+++|.. .+|.+++.|++..+.|.|.|++||.|..|
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g~~C 37 (38)
T cd00054 2 IDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37 (38)
T ss_pred cccCCCCCCcCCCCEeECCCCCeEeECCCCCcCCcC
Confidence 567877 78988899999999999999999987543
No 34
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.86 E-value=0.0017 Score=36.91 Aligned_cols=28 Identities=18% Similarity=0.548 Sum_probs=24.9
Q ss_pred CCCCCCCCeeeeCCCCCeEecCCCcccc
Q psy5750 73 NFFCVANSSCIVEDDKPTCICNRGFQQL 100 (286)
Q Consensus 73 ~~~C~~~~~C~~~~g~~~C~C~~g~~~~ 100 (286)
..+|.+++.|++..+.|.|.|+.||.+.
T Consensus 5 ~~~C~~~~~C~~~~~~~~C~C~~g~~g~ 32 (36)
T cd00053 5 SNPCSNGGTCVNTPGSYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCCEEecCCCCeEeECCCCCccc
Confidence 4688888999999999999999999865
No 35
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=96.81 E-value=0.0006 Score=54.06 Aligned_cols=120 Identities=28% Similarity=0.655 Sum_probs=69.2
Q ss_pred CCCCCCeeecCCCCeeEeCCCCccc-CCCceecccCCCCCCCCCCCC---CCCCCCCCCCccccCC-----CCceeeCCC
Q psy5750 124 LCHKNAMCFNEIGSYSCQCRPGFTG-NGHQCTEITVPQTGPTSPCES---DPRACNPPHSTCTNLT-----DYRTCNCDP 194 (286)
Q Consensus 124 ~C~~~~~C~~~~g~~~C~C~~G~~g-~~~~C~~~~~~~~~~~~~C~~---~~~~C~~~~~~C~~~~-----g~~~C~C~~ 194 (286)
.|.+ +..+...+.|.|.|.+||.. +..+|+. ..+|.. ...+| ..-++|++.. ..|.|.|.+
T Consensus 7 ~CKN-G~LiQMSNHfEC~Cnegfvl~~EntCE~--------kv~C~~~e~~~K~C-gdya~C~~~~~~~~~~~~~C~C~~ 76 (197)
T PF06247_consen 7 ICKN-GYLIQMSNHFECKCNEGFVLKNENTCEE--------KVECDKLENVNKPC-GDYAKCINQANKGEERAYKCDCIN 76 (197)
T ss_dssp --BT-EEEEEESSEEEEEESTTEEEEETTEEEE------------SG-GGTTSEE-ETTEEEEE-SSTTSSTSEEEEE-T
T ss_pred cccC-CEEEEccCceEEEcCCCcEEcccccccc--------ceecCcccccCccc-cchhhhhcCCCcccceeEEEeccc
Confidence 4543 56777788999999999983 4467877 556643 12457 5667887643 589999999
Q ss_pred CCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeeeeC---CCCeeeecCCCCcCCCCCcCCCCeeecccccccccce
Q psy5750 195 GYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCINR---PGTYQCQCKRGFSGDGFNCEEGKYCLVVGITLCKMYL 269 (286)
Q Consensus 195 G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~n~---~g~y~C~C~~Gy~gdg~~C~~~~~c~~~~~~~c~~~~ 269 (286)
||+.... .|. .++|... .|. .|.|+-. +....|.|.-|+..+ +...|...|.+.|....
T Consensus 77 gY~~~~~-------vCv-p~~C~~~--~Cg-~GKCI~d~~~~~~~~CSC~IGkV~~-----dn~kCtk~G~T~C~LKC 138 (197)
T PF06247_consen 77 GYILKQG-------VCV-PNKCNNK--DCG-SGKCILDPDNPNNPTCSCNIGKVPD-----DNKKCTKTGETKCSLKC 138 (197)
T ss_dssp TEEESSS-------SEE-EGGGSS-----T-TEEEEEEEGGGSEEEEEE-TEEETT-----TTTESEEEE--------
T ss_pred CceeeCC-------eEc-hhhcCce--ecC-CCeEEecCCCCCCceeEeeeceEec-----cCCcccCCCccceeeec
Confidence 9988744 564 3567653 487 6789743 345699999999932 23344555566555543
No 36
>smart00181 EGF Epidermal growth factor-like domain.
Probab=96.79 E-value=0.002 Score=36.77 Aligned_cols=28 Identities=29% Similarity=0.682 Sum_probs=24.3
Q ss_pred CCC-CCCCCCCeeeeCCCCCeEecCCCccc
Q psy5750 71 CKN-FFCVANSSCIVEDDKPTCICNRGFQQ 99 (286)
Q Consensus 71 C~~-~~C~~~~~C~~~~g~~~C~C~~g~~~ 99 (286)
|.. .+|.++ .|++..++|.|.|++||.+
T Consensus 2 C~~~~~C~~~-~C~~~~~~~~C~C~~g~~g 30 (35)
T smart00181 2 CASGGPCSNG-TCINTPGSYTCSCPPGYTG 30 (35)
T ss_pred CCCcCCCCCC-EEECCCCCeEeECCCCCcc
Confidence 445 588888 9999999999999999986
No 37
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=96.57 E-value=0.0024 Score=53.47 Aligned_cols=39 Identities=31% Similarity=0.863 Sum_probs=33.6
Q ss_pred cCCccccCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcC
Q psy5750 206 VAFVCTDVDECMNYPPICNNNADCINRPGTYQCQCKRGFSG 246 (286)
Q Consensus 206 ~~~~C~d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~g 246 (286)
....|.++++|...++.|.. .|.|+.|+|.|.|++||..
T Consensus 180 ~~~~C~~~~~C~~~~~~c~~--~C~~~~g~~~c~c~~g~~~ 218 (224)
T cd01475 180 QGKICVVPDLCATLSHVCQQ--VCISTPGSYLCACTEGYAL 218 (224)
T ss_pred ccccCcCchhhcCCCCCccc--eEEcCCCCEEeECCCCccC
Confidence 34478899999988888874 8999999999999999975
No 38
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=96.25 E-value=0.0077 Score=33.76 Aligned_cols=24 Identities=29% Similarity=0.985 Sum_probs=20.8
Q ss_pred CCCCCCeeeeCCCCeeeecCCCCcCC
Q psy5750 222 ICNNNADCINRPGTYQCQCKRGFSGD 247 (286)
Q Consensus 222 ~C~~~~~C~n~~g~y~C~C~~Gy~gd 247 (286)
.|.++++|++. ..+|+|.+||.|+
T Consensus 7 ~C~~~G~C~~~--~g~C~C~~g~~G~ 30 (32)
T PF07974_consen 7 ICSGHGTCVSP--CGRCVCDSGYTGP 30 (32)
T ss_pred ccCCCCEEeCC--CCEEECCCCCcCC
Confidence 59999999866 4599999999997
No 39
>KOG0994|consensus
Probab=96.06 E-value=0.02 Score=57.04 Aligned_cols=137 Identities=25% Similarity=0.549 Sum_probs=69.1
Q ss_pred eecCCCCeeE-eCCCCcccCCCceecccCCCCCCCCCCCCCCCCCCCC--------CCccc--cCCCCceeeCCCCCCCC
Q psy5750 131 CFNEIGSYSC-QCRPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPP--------HSTCT--NLTDYRTCNCDPGYQKD 199 (286)
Q Consensus 131 C~~~~g~~~C-~C~~G~~g~~~~C~~~~~~~~~~~~~C~~~~~~C~~~--------~~~C~--~~~g~~~C~C~~G~~g~ 199 (286)
|.+...++.| +|..||.|+...=.. ..|.. -+| +. ...|. +......|.|.+||.|.
T Consensus 878 CqD~T~G~~CdrCl~GyyGdP~lg~g---------~~CrP--CpC-P~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~ 945 (1758)
T KOG0994|consen 878 CQDSTTGHSCDRCLDGYYGDPRLGSG---------IGCRP--CPC-PDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGS 945 (1758)
T ss_pred ccccccccchhhhhccccCCcccCCC---------CCCCC--CCC-CCCCccchhccccccccccccceeeecccCcccc
Confidence 5556667778 699999975432211 11111 112 11 11221 12234478888998887
Q ss_pred CCCC--------cccCCccc------cCCCcCCCCCCCCC-CC---eeeeCCCCeee-ecCCCCcCCCC--CcCCCCeee
Q psy5750 200 YLDD--------RRVAFVCT------DVDECMNYPPICNN-NA---DCINRPGTYQC-QCKRGFSGDGF--NCEEGKYCL 258 (286)
Q Consensus 200 ~~~~--------~~~~~~C~------d~deC~~~~~~C~~-~~---~C~n~~g~y~C-~C~~Gy~gdg~--~C~~~~~c~ 258 (286)
.+.. +..+..|+ .||.=. +..|+. .+ .|....-+-.| .|.+||.||-. +| ..=.|.
T Consensus 946 RCe~CA~~~fGnP~~GGtCq~CeC~~NiD~~d--~~aCD~~TG~CLkCL~hTeG~hCe~Ck~Gf~GdA~~q~C-qrC~Cn 1022 (1758)
T KOG0994|consen 946 RCEICADNHFGNPSEGGTCQKCECSNNIDLYD--PGACDVATGACLKCLYHTEGDHCEHCKDGFYGDALRQNC-QRCVCN 1022 (1758)
T ss_pred chhhhcccccCCcccCCccccccccCCcCccC--CCccchhhchhhhhhhcccccchhhccccchhHHHHhhh-hhhecc
Confidence 6542 22344443 122111 223442 12 23322233456 69999999853 45 333455
Q ss_pred cccccccccceeeeecccccCCccc
Q psy5750 259 VVGITLCKMYLEVVNIQEICGENSI 283 (286)
Q Consensus 259 ~~~~~~c~~~~~~~~~~~~~~~~~~ 283 (286)
+.|. .-..+..-+.++..|.+|-|
T Consensus 1023 ~LGT-n~~~~CDr~tGQCpClpNv~ 1046 (1758)
T KOG0994|consen 1023 FLGT-NSTCHCDRFTGQCPCLPNVQ 1046 (1758)
T ss_pred cccc-CCccccccccCcCCCCcccc
Confidence 5432 22355566777778888765
No 40
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=95.86 E-value=0.0077 Score=34.76 Aligned_cols=24 Identities=42% Similarity=1.061 Sum_probs=19.3
Q ss_pred CCCCCCCeeecCCCCeeEeCCCCccc
Q psy5750 123 DLCHKNAMCFNEIGSYSCQCRPGFTG 148 (286)
Q Consensus 123 ~~C~~~~~C~~~~g~~~C~C~~G~~g 148 (286)
..|.. .|++.+++|+|.|++||.-
T Consensus 6 GgC~h--~C~~~~g~~~C~C~~Gy~L 29 (36)
T PF14670_consen 6 GGCSH--ICVNTPGSYRCSCPPGYKL 29 (36)
T ss_dssp GGSSS--EEEEETTSEEEE-STTEEE
T ss_pred CCcCC--CCccCCCceEeECCCCCEE
Confidence 44654 8999999999999999984
No 41
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=95.83 E-value=0.0045 Score=27.20 Aligned_cols=11 Identities=45% Similarity=1.265 Sum_probs=9.0
Q ss_pred eeecCCCCcCC
Q psy5750 237 QCQCKRGFSGD 247 (286)
Q Consensus 237 ~C~C~~Gy~gd 247 (286)
.|+|++||+|+
T Consensus 1 ~C~C~~G~~G~ 11 (13)
T PF12661_consen 1 TCQCPPGWTGP 11 (13)
T ss_dssp EEEE-TTEETT
T ss_pred CccCcCCCcCC
Confidence 58999999997
No 42
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=95.71 E-value=0.0041 Score=35.74 Aligned_cols=30 Identities=33% Similarity=0.817 Sum_probs=22.4
Q ss_pred CCCCCCeeeeCC-CCeeeecCCCCcCCCCCc
Q psy5750 222 ICNNNADCINRP-GTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 222 ~C~~~~~C~n~~-g~y~C~C~~Gy~gdg~~C 251 (286)
.|..++.|++.. |++.|.|..||+.++..|
T Consensus 6 ~cP~NA~C~~~~dG~eecrCllgyk~~~~~C 36 (37)
T PF12946_consen 6 KCPANAGCFRYDDGSEECRCLLGYKKVGGKC 36 (37)
T ss_dssp ---TTEEEEEETTSEEEEEE-TTEEEETTEE
T ss_pred cCCCCcccEEcCCCCEEEEeeCCccccCCCc
Confidence 488899998775 999999999999876555
No 43
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite.; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=94.80 E-value=0.019 Score=33.04 Aligned_cols=31 Identities=39% Similarity=0.816 Sum_probs=23.0
Q ss_pred CCCCCCCeeecCC-CCeeEeCCCCcccCCCce
Q psy5750 123 DLCHKNAMCFNEI-GSYSCQCRPGFTGNGHQC 153 (286)
Q Consensus 123 ~~C~~~~~C~~~~-g~~~C~C~~G~~g~~~~C 153 (286)
..|..++.|++.. |++.|+|..||..++..|
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk~~~~~C 36 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYKKVGGKC 36 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEEEETTEE
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCccccCCCc
Confidence 6788899999887 999999999999765655
No 44
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=94.44 E-value=0.056 Score=30.26 Aligned_cols=27 Identities=19% Similarity=0.671 Sum_probs=21.5
Q ss_pred CCCCCCCeeeeCCCCCeEecCCCcccccc
Q psy5750 74 FFCVANSSCIVEDDKPTCICNRGFQQLYS 102 (286)
Q Consensus 74 ~~C~~~~~C~~~~g~~~C~C~~g~~~~~c 102 (286)
..|.++++|+.. ..+|.|.+||.|+.|
T Consensus 6 ~~C~~~G~C~~~--~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 6 NICSGHGTCVSP--CGRCVCDSGYTGPDC 32 (32)
T ss_pred CccCCCCEEeCC--CCEEECCCCCcCCCC
Confidence 368899999966 348999999998643
No 45
>KOG0994|consensus
Probab=93.55 E-value=0.61 Score=47.11 Aligned_cols=67 Identities=27% Similarity=0.624 Sum_probs=36.0
Q ss_pred ccccCCCCcee-eCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCC-Ceee--eCCCCeeeecCCCCcCCCCCc
Q psy5750 180 TCTNLTDYRTC-NCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNN-ADCI--NRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 180 ~C~~~~g~~~C-~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~-~~C~--n~~g~y~C~C~~Gy~gdg~~C 251 (286)
.|.+...++.| .|..||.|+..- ..+..|. .=+|-..|..=..+ -.|. +......|.|.+||.|. .|
T Consensus 877 ~CqD~T~G~~CdrCl~GyyGdP~l--g~g~~Cr-PCpCP~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~--RC 947 (1758)
T KOG0994|consen 877 DCQDSTTGHSCDRCLDGYYGDPRL--GSGIGCR-PCPCPDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGS--RC 947 (1758)
T ss_pred cccccccccchhhhhccccCCccc--CCCCCCC-CCCCCCCCccchhccccccccccccceeeecccCcccc--ch
Confidence 35556778889 799999998331 1112221 11222222111111 1232 23345689999999987 55
No 46
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=93.53 E-value=0.081 Score=44.25 Aligned_cols=39 Identities=26% Similarity=0.691 Sum_probs=33.1
Q ss_pred CCCCCccCCCcCCCCCCCCCCCeeecCCCCeeEeCCCCccc
Q psy5750 108 DDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTG 148 (286)
Q Consensus 108 ~~~~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~~G~~g 148 (286)
.+..|.++++|....+.|.. .|.++.|+|.|.|++||+.
T Consensus 180 ~~~~C~~~~~C~~~~~~c~~--~C~~~~g~~~c~c~~g~~~ 218 (224)
T cd01475 180 QGKICVVPDLCATLSHVCQQ--VCISTPGSYLCACTEGYAL 218 (224)
T ss_pred ccccCcCchhhcCCCCCccc--eEEcCCCCEEeECCCCccC
Confidence 35678899999876677874 8999999999999999985
No 47
>KOG1836|consensus
Probab=92.25 E-value=0.86 Score=48.71 Aligned_cols=33 Identities=27% Similarity=0.708 Sum_probs=23.0
Q ss_pred CCCCCCCCCCCCCccccCC--CCceee-CCCCCCCCCCC
Q psy5750 167 CESDPRACNPPHSTCTNLT--DYRTCN-CDPGYQKDYLD 202 (286)
Q Consensus 167 C~~~~~~C~~~~~~C~~~~--g~~~C~-C~~G~~g~~~~ 202 (286)
|.. -+| ++++.|.... ....|+ |++||+|..+.
T Consensus 777 C~~--C~C-p~~~~~~~~~~~~~~iCk~Cp~gytG~rCe 812 (1705)
T KOG1836|consen 777 CQP--CPC-PNGGACGQTPEILEVVCKNCPPGYTGLRCE 812 (1705)
T ss_pred Ccc--CCC-CCChhhcCcCcccceecCCCCCCCcccccc
Confidence 554 466 6666676543 566887 99999998765
No 48
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=88.57 E-value=0.87 Score=28.37 Aligned_cols=26 Identities=35% Similarity=0.885 Sum_probs=18.6
Q ss_pred CCCCCCeeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 222 ICNNNADCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 222 ~C~~~~~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
.|..++.|++. .|+|++||.-.+..|
T Consensus 27 qC~~~s~C~~g----~C~C~~g~~~~~~~C 52 (52)
T PF01683_consen 27 QCIGGSVCVNG----RCQCPPGYVEVGGRC 52 (52)
T ss_pred CCCCcCEEcCC----EeECCCCCEecCCCC
Confidence 35567788643 899999998765554
No 49
>smart00051 DSL delta serrate ligand.
Probab=87.92 E-value=1.1 Score=29.44 Aligned_cols=46 Identities=20% Similarity=0.330 Sum_probs=28.2
Q ss_pred CceeeCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcCC
Q psy5750 187 YRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGD 247 (286)
Q Consensus 187 ~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gd 247 (286)
.++-.|.++|.|..+ . ..|... +.+..+..|.. .| .|.|.+||+|.
T Consensus 16 ~~rv~C~~~~yG~~C---------~--~~C~~~-~d~~~~~~Cd~-~G--~~~C~~Gw~G~ 61 (63)
T smart00051 16 QIRVTCDENYYGEGC---------N--KFCRPR-DDFFGHYTCDE-NG--NKGCLEGWMGP 61 (63)
T ss_pred EEEeeCCCCCcCCcc---------C--CEeCcC-ccccCCccCCc-CC--CEecCCCCcCC
Confidence 345579999999844 2 223221 12445566742 33 57899999987
No 50
>KOG1836|consensus
Probab=86.88 E-value=1.2 Score=47.58 Aligned_cols=60 Identities=30% Similarity=0.572 Sum_probs=35.7
Q ss_pred cccCCCCcee-eCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeeeeC--CCCeeee-cCCCCcCCCCCcC
Q psy5750 181 CTNLTDYRTC-NCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCINR--PGTYQCQ-CKRGFSGDGFNCE 252 (286)
Q Consensus 181 C~~~~g~~~C-~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~n~--~g~y~C~-C~~Gy~gdg~~C~ 252 (286)
|+....+.+| .|..||.+.... =..-| |..=+ |..++.|... .....|. |++||+|. +|+
T Consensus 749 C~~~t~G~~C~~C~~GfYg~~~~-------~~~~d-C~~C~--Cp~~~~~~~~~~~~~~iCk~Cp~gytG~--rCe 812 (1705)
T KOG1836|consen 749 CKHNTFGGQCAQCVDGFYGLPDL-------GTSGD-CQPCP--CPNGGACGQTPEILEVVCKNCPPGYTGL--RCE 812 (1705)
T ss_pred cccCCCCCchhhhcCCCCCcccc-------CCCCC-CccCC--CCCChhhcCcCcccceecCCCCCCCccc--ccc
Confidence 3333334455 688888876331 01112 54332 6666677654 4567898 99999987 664
No 51
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=86.58 E-value=0.64 Score=34.67 Aligned_cols=32 Identities=28% Similarity=0.762 Sum_probs=25.3
Q ss_pred CCCCCCCeeeeC--CCCeeeecCCCCcCCCCCcCCCC
Q psy5750 221 PICNNNADCINR--PGTYQCQCKRGFSGDGFNCEEGK 255 (286)
Q Consensus 221 ~~C~~~~~C~n~--~g~y~C~C~~Gy~gdg~~C~~~~ 255 (286)
+-|-++ +|.-. ...+.|.|..||.|. +||..+
T Consensus 51 ~YClHG-~C~yI~dl~~~~CrC~~GYtGe--RCEh~d 84 (139)
T PHA03099 51 GYCLHG-DCIHARDIDGMYCRCSHGYTGI--RCQHVV 84 (139)
T ss_pred CEeECC-EEEeeccCCCceeECCCCcccc--ccccee
Confidence 457765 89755 478999999999999 997665
No 52
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=85.15 E-value=0.47 Score=34.28 Aligned_cols=33 Identities=24% Similarity=0.700 Sum_probs=25.4
Q ss_pred CCCcCCCCCCCCCCCeeeeCC-----CCeeeecCCCCc
Q psy5750 213 VDECMNYPPICNNNADCINRP-----GTYQCQCKRGFS 245 (286)
Q Consensus 213 ~deC~~~~~~C~~~~~C~n~~-----g~y~C~C~~Gy~ 245 (286)
.++|...++.|..|+.|++.- .=|.|.|.+.+.
T Consensus 5 ~~aC~~~Tn~CsgHG~C~~~~~~~~~~C~~C~C~~T~~ 42 (103)
T PF12955_consen 5 NDACENATNNCSGHGSCVKKYGSGGGDCFACKCKPTVV 42 (103)
T ss_pred HHHHHHhccCCCCCceEeeccCCCccceEEEEeecccc
Confidence 466777777899999999872 348999998544
No 53
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=79.52 E-value=2.1 Score=31.37 Aligned_cols=34 Identities=29% Similarity=0.684 Sum_probs=25.5
Q ss_pred cCCCcCCCCCCCCCCCeeeeCCCCeeeecCCCCcCC
Q psy5750 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGD 247 (286)
Q Consensus 212 d~deC~~~~~~C~~~~~C~n~~g~y~C~C~~Gy~gd 247 (286)
..|.|... ..|+..+.|. ...+..|.|.+||+..
T Consensus 76 p~d~Cd~y-~~CG~~g~C~-~~~~~~C~Cl~GF~P~ 109 (110)
T PF00954_consen 76 PKDQCDVY-GFCGPNGICN-SNNSPKCSCLPGFEPK 109 (110)
T ss_pred cccCCCCc-cccCCccEeC-CCCCCceECCCCcCCC
Confidence 34677664 5799999994 4456689999999853
No 54
>PHA02887 EGF-like protein; Provisional
Probab=79.01 E-value=2.4 Score=31.20 Aligned_cols=36 Identities=36% Similarity=0.839 Sum_probs=26.7
Q ss_pred CCCCCCC---CCCCCCCeeeeC--CCCCeEecCCCccccccc
Q psy5750 67 NDDPCKN---FFCVANSSCIVE--DDKPTCICNRGFQQLYSE 103 (286)
Q Consensus 67 ~~~~C~~---~~C~~~~~C~~~--~g~~~C~C~~g~~~~~c~ 103 (286)
...+|.. +-|. +|+|.-. .....|.|..||.|..|+
T Consensus 82 hf~pC~~eyk~YCi-HG~C~yI~dL~epsCrC~~GYtG~RCE 122 (126)
T PHA02887 82 FFEKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGYTGIRCD 122 (126)
T ss_pred CccccChHhhCEee-CCEEEccccCCCceeECCCCcccCCCC
Confidence 3456655 5677 6799865 456799999999988776
No 55
>KOG3516|consensus
Probab=78.38 E-value=1.6 Score=44.36 Aligned_cols=40 Identities=20% Similarity=0.356 Sum_probs=35.9
Q ss_pred CcCCCCCCCCCCCCCCCeeeeCCCCCeEecC-CCccccccc
Q psy5750 64 STVNDDPCKNFFCVANSSCIVEDDKPTCICN-RGFQQLYSE 103 (286)
Q Consensus 64 ~~~~~~~C~~~~C~~~~~C~~~~g~~~C~C~-~g~~~~~c~ 103 (286)
-|.-.+.|.+++|.+++.|...+..|.|.|. .||.|..|.
T Consensus 541 ~C~i~drClPN~CehgG~C~Qs~~~f~C~C~~TGY~GatCH 581 (1306)
T KOG3516|consen 541 MCGISDRCLPNPCEHGGKCSQSWDDFECNCELTGYKGATCH 581 (1306)
T ss_pred ccccccccCCccccCCCcccccccceeEecccccccccccc
Confidence 5667888999999999999999999999999 899988765
No 56
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=75.64 E-value=7.3 Score=24.09 Aligned_cols=23 Identities=39% Similarity=0.881 Sum_probs=16.5
Q ss_pred CCCCCCCCccccCCCCceeeCCCCCCCC
Q psy5750 172 RACNPPHSTCTNLTDYRTCNCDPGYQKD 199 (286)
Q Consensus 172 ~~C~~~~~~C~~~~g~~~C~C~~G~~g~ 199 (286)
..| ..++.|++. +|.|++||...
T Consensus 26 ~qC-~~~s~C~~g----~C~C~~g~~~~ 48 (52)
T PF01683_consen 26 EQC-IGGSVCVNG----RCQCPPGYVEV 48 (52)
T ss_pred CCC-CCcCEEcCC----EeECCCCCEec
Confidence 456 456778664 89999998654
No 57
>PHA02887 EGF-like protein; Provisional
Probab=75.57 E-value=3.2 Score=30.54 Aligned_cols=31 Identities=32% Similarity=0.858 Sum_probs=23.3
Q ss_pred CCCCCCCeeeeC--CCCeeeecCCCCcCCCCCcCCC
Q psy5750 221 PICNNNADCINR--PGTYQCQCKRGFSGDGFNCEEG 254 (286)
Q Consensus 221 ~~C~~~~~C~n~--~g~y~C~C~~Gy~gdg~~C~~~ 254 (286)
+-|- +|+|.-. .....|.|+.||.|. .|+..
T Consensus 92 ~YCi-HG~C~yI~dL~epsCrC~~GYtG~--RCE~v 124 (126)
T PHA02887 92 DFCI-NGECMNIIDLDEKFCICNKGYTGI--RCDEV 124 (126)
T ss_pred CEee-CCEEEccccCCCceeECCCCcccC--CCCcc
Confidence 3476 4688654 456899999999999 88653
No 58
>KOG1218|consensus
Probab=72.11 E-value=65 Score=27.92 Aligned_cols=11 Identities=45% Similarity=1.150 Sum_probs=7.1
Q ss_pred eeecCCCCcCC
Q psy5750 237 QCQCKRGFSGD 247 (286)
Q Consensus 237 ~C~C~~Gy~gd 247 (286)
.|.|++||.|.
T Consensus 163 ~c~c~~g~~g~ 173 (316)
T KOG1218|consen 163 ICTCQPGFVGV 173 (316)
T ss_pred ceeccCCcccc
Confidence 45677777665
No 59
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=71.34 E-value=4.4 Score=30.33 Aligned_cols=36 Identities=22% Similarity=0.563 Sum_probs=26.2
Q ss_pred CCCCCCC---CCCCCCCeeeeC--CCCCeEecCCCccccccc
Q psy5750 67 NDDPCKN---FFCVANSSCIVE--DDKPTCICNRGFQQLYSE 103 (286)
Q Consensus 67 ~~~~C~~---~~C~~~~~C~~~--~g~~~C~C~~g~~~~~c~ 103 (286)
++.+|.. +-|.+ |+|.-. ...+.|.|..||.|..|+
T Consensus 41 ~i~~Cp~ey~~YClH-G~C~yI~dl~~~~CrC~~GYtGeRCE 81 (139)
T PHA03099 41 AIRLCGPEGDGYCLH-GDCIHARDIDGMYCRCSHGYTGIRCQ 81 (139)
T ss_pred ccccCChhhCCEeEC-CEEEeeccCCCceeECCCCccccccc
Confidence 3445554 56765 489865 467899999999988776
No 60
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=70.48 E-value=4.8 Score=29.39 Aligned_cols=31 Identities=35% Similarity=0.850 Sum_probs=24.9
Q ss_pred CCCCCCC-CCCCCCCeeeeCCCCCeEecCCCcc
Q psy5750 67 NDDPCKN-FFCVANSSCIVEDDKPTCICNRGFQ 98 (286)
Q Consensus 67 ~~~~C~~-~~C~~~~~C~~~~g~~~C~C~~g~~ 98 (286)
..+.|.. ..|..++.|.. .....|.|.+||.
T Consensus 76 p~d~Cd~y~~CG~~g~C~~-~~~~~C~Cl~GF~ 107 (110)
T PF00954_consen 76 PKDQCDVYGFCGPNGICNS-NNSPKCSCLPGFE 107 (110)
T ss_pred cccCCCCccccCCccEeCC-CCCCceECCCCcC
Confidence 4668887 88999999954 3455799999997
No 61
>KOG3514|consensus
Probab=69.40 E-value=25 Score=36.05 Aligned_cols=37 Identities=22% Similarity=0.464 Sum_probs=32.0
Q ss_pred CCCCCCCCCCCCeeeeCCCCCeEecC-CCccccccccc
Q psy5750 69 DPCKNFFCVANSSCIVEDDKPTCICN-RGFQQLYSEDR 105 (286)
Q Consensus 69 ~~C~~~~C~~~~~C~~~~g~~~C~C~-~g~~~~~c~~~ 105 (286)
..|.++||++++.|...++.|.|.|. .+|.|..|+..
T Consensus 624 ~~C~~nPC~N~g~C~egwNrfiCDCs~T~~~G~~CerE 661 (1591)
T KOG3514|consen 624 KICESNPCQNGGKCSEGWNRFICDCSGTGFEGRTCERE 661 (1591)
T ss_pred cccCCCcccCCCCccccccccccccccCcccCccccce
Confidence 37999999999999999999999985 58888888743
No 62
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=61.72 E-value=5.1 Score=22.56 Aligned_cols=12 Identities=42% Similarity=0.808 Sum_probs=10.5
Q ss_pred eeeecCCCCcCC
Q psy5750 236 YQCQCKRGFSGD 247 (286)
Q Consensus 236 y~C~C~~Gy~gd 247 (286)
++|.||+||.-|
T Consensus 18 ~~C~CPeGyIld 29 (34)
T PF09064_consen 18 GQCFCPEGYILD 29 (34)
T ss_pred CceeCCCceEec
Confidence 489999999976
No 63
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains, each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines. The tertiary structure of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic of the LE domain, which has four disulphide bonds, is: xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx, where 'C' is a conserved cysteine involved in a disulphide bond, 'a' is a conserved aromatic residue, and 'G' is a conserved glycine (lower case = less conserved). In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen. The binding-sites are located on the surface within the loops C1-C3 and C5-C6. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility, which determine the spacing in the formation of laminin networks of basement membranes.; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=55.13 E-value=11 Score=22.99 Aligned_cols=20 Identities=45% Similarity=0.929 Sum_probs=15.3
Q ss_pred eeeeCCCCeeeecCCCCcCCCCCc
Q psy5750 228 DCINRPGTYQCQCKRGFSGDGFNC 251 (286)
Q Consensus 228 ~C~n~~g~y~C~C~~Gy~gdg~~C 251 (286)
.|.. ...+|.|+++|.|. .|
T Consensus 12 ~C~~--~~G~C~C~~~~~G~--~C 31 (49)
T PF00053_consen 12 TCDP--STGQCVCKPGTTGP--RC 31 (49)
T ss_dssp SEEE--TCEEESBSTTEEST--TS
T ss_pred cccC--CCCEEeccccccCC--cC
Confidence 5654 34589999999998 77
No 64
>KOG3512|consensus
Probab=55.07 E-value=53 Score=30.52 Aligned_cols=25 Identities=20% Similarity=0.481 Sum_probs=19.4
Q ss_pred CCeeeeCCCC-CeEecCCCccccccc
Q psy5750 79 NSSCIVEDDK-PTCICNRGFQQLYSE 103 (286)
Q Consensus 79 ~~~C~~~~g~-~~C~C~~g~~~~~c~ 103 (286)
.+.|+-...+ ++|.|...-.|+.|+
T Consensus 284 As~Cv~d~~~~ltCdC~HNTaGPdCg 309 (592)
T KOG3512|consen 284 ASRCVMDESSHLTCDCEHNTAGPDCG 309 (592)
T ss_pred cceeeeccCCceEEecccCCCCCCcc
Confidence 3578877554 899999999988766
No 65
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=52.05 E-value=18 Score=22.16 Aligned_cols=13 Identities=46% Similarity=1.216 Sum_probs=10.5
Q ss_pred eeecCCCCcCCCCCc
Q psy5750 237 QCQCKRGFSGDGFNC 251 (286)
Q Consensus 237 ~C~C~~Gy~gdg~~C 251 (286)
+|.|+++|.|. .|
T Consensus 20 ~C~C~~~~~G~--~C 32 (50)
T cd00055 20 QCECKPNTTGR--RC 32 (50)
T ss_pred EEeCCCcCCCC--CC
Confidence 68888998887 66
No 66
>KOG3516|consensus
Probab=51.89 E-value=12 Score=38.39 Aligned_cols=41 Identities=22% Similarity=0.708 Sum_probs=35.1
Q ss_pred CCccCCCcCCCCCCCCCCCeeecCCCCeeEeCC-CCcccCCCceec
Q psy5750 111 GCFDINECNAGTDLCHKNAMCFNEIGSYSCQCR-PGFTGNGHQCTE 155 (286)
Q Consensus 111 ~C~~~~~C~~~~~~C~~~~~C~~~~g~~~C~C~-~G~~g~~~~C~~ 155 (286)
.|.-++.|.+ ++|.+++.|..+...|.|.|. .||.| ..|..
T Consensus 541 ~C~i~drClP--N~CehgG~C~Qs~~~f~C~C~~TGY~G--atCHt 582 (1306)
T KOG3516|consen 541 MCGISDRCLP--NPCEHGGKCSQSWDDFECNCELTGYKG--ATCHT 582 (1306)
T ss_pred ccccccccCC--ccccCCCcccccccceeEecccccccc--ccccC
Confidence 4666788887 999999999999999999998 79988 77754
No 67
>KOG1215|consensus
Probab=43.44 E-value=52 Score=33.53 Aligned_cols=89 Identities=22% Similarity=0.448 Sum_probs=49.5
Q ss_pred ccccCCCCceeeCCCCCCCCCCCCcccCCccccCCCcCCCCCCCCCCCeee-eCCCCeeeecCCCCcCCCCCcCCCCeee
Q psy5750 180 TCTNLTDYRTCNCDPGYQKDYLDDRRVAFVCTDVDECMNYPPICNNNADCI-NRPGTYQCQCKRGFSGDGFNCEEGKYCL 258 (286)
Q Consensus 180 ~C~~~~g~~~C~C~~G~~g~~~~~~~~~~~C~d~deC~~~~~~C~~~~~C~-n~~g~y~C~C~~Gy~gdg~~C~~~~~c~ 258 (286)
.+.+......|.|..++..... .-.+...|...+..|.+ .|. +.++.|.|.|..||......|+...
T Consensus 338 ~~~~~~v~~~~~~~~~~~~~~~-------~~~~~~~~~~~~g~Csq--~C~~~~p~~~~c~c~~g~~~~~~~c~~~~--- 405 (877)
T KOG1215|consen 338 KCPDVSVGPRCDCMGAKVLPLG-------ARTDSNPCESDNGGCSQ--LCVPNSPGTFKCACSPGYELRLDKCEASD--- 405 (877)
T ss_pred CCCccccCCcccCCccceeccc-------ccccCCcccccCCccce--eccCCCCCceeEecCCCcEeccCCceecC---
Confidence 4444555566777766655422 11122445454566764 787 5699999999999996533342211
Q ss_pred cccccccccceeeeecccccCCcc
Q psy5750 259 VVGITLCKMYLEVVNIQEICGENS 282 (286)
Q Consensus 259 ~~~~~~c~~~~~~~~~~~~~~~~~ 282 (286)
....+.......+++.+..+++
T Consensus 406 --~~~~~l~~s~~~~ir~~~~~~~ 427 (877)
T KOG1215|consen 406 --QPEAFLLFSNRHDIRRISLDCS 427 (877)
T ss_pred --CCCcEEEEecCccceecccCCC
Confidence 1223444445555555555443
No 68
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=43.12 E-value=10 Score=24.76 Aligned_cols=15 Identities=7% Similarity=0.308 Sum_probs=6.3
Q ss_pred CeEecCCCccccccc
Q psy5750 89 PTCICNRGFQQLYSE 103 (286)
Q Consensus 89 ~~C~C~~g~~~~~c~ 103 (286)
++-.|...|.|..|.
T Consensus 17 ~rv~C~~nyyG~~C~ 31 (63)
T PF01414_consen 17 IRVVCDENYYGPNCS 31 (63)
T ss_dssp ------TTEETTTT-
T ss_pred EEEECCCCCCCcccc
Confidence 456788888877665
No 69
>KOG1218|consensus
Probab=38.30 E-value=2.6e+02 Score=24.07 Aligned_cols=13 Identities=23% Similarity=0.828 Sum_probs=9.6
Q ss_pred CCeEecCCCcccc
Q psy5750 88 KPTCICNRGFQQL 100 (286)
Q Consensus 88 ~~~C~C~~g~~~~ 100 (286)
...|.|.++|.+.
T Consensus 14 ~~~c~c~~~~~g~ 26 (316)
T KOG1218|consen 14 SGQCFCDPGYTGR 26 (316)
T ss_pred CCceecCCCcccc
Confidence 3468888888863
No 70
>KOG3514|consensus
Probab=35.12 E-value=25 Score=35.98 Aligned_cols=35 Identities=26% Similarity=0.861 Sum_probs=30.0
Q ss_pred CcCCCCCCCCCCCeeecCCCCeeEeCC-CCcccCCCceec
Q psy5750 117 ECNAGTDLCHKNAMCFNEIGSYSCQCR-PGFTGNGHQCTE 155 (286)
Q Consensus 117 ~C~~~~~~C~~~~~C~~~~g~~~C~C~-~G~~g~~~~C~~ 155 (286)
.|.. +||.+++.|......|.|-|. .+|.| +.|+.
T Consensus 625 ~C~~--nPC~N~g~C~egwNrfiCDCs~T~~~G--~~Cer 660 (1591)
T KOG3514|consen 625 ICES--NPCQNGGKCSEGWNRFICDCSGTGFEG--RTCER 660 (1591)
T ss_pred ccCC--CcccCCCCccccccccccccccCcccC--ccccc
Confidence 5776 999999999999999999996 47888 77754
No 71
>KOG3512|consensus
Probab=26.94 E-value=1e+02 Score=28.77 Aligned_cols=25 Identities=20% Similarity=0.360 Sum_probs=17.2
Q ss_pred CCcccc-CCCCceeeCCCCCCCCCCC
Q psy5750 178 HSTCTN-LTDYRTCNCDPGYQKDYLD 202 (286)
Q Consensus 178 ~~~C~~-~~g~~~C~C~~G~~g~~~~ 202 (286)
...|+- ..+.++|.|.++-.|+.+.
T Consensus 284 As~Cv~d~~~~ltCdC~HNTaGPdCg 309 (592)
T KOG3512|consen 284 ASRCVMDESSHLTCDCEHNTAGPDCG 309 (592)
T ss_pred cceeeeccCCceEEecccCCCCCCcc
Confidence 345665 3445899998888887654
No 72
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=22.74 E-value=69 Score=19.16 Aligned_cols=13 Identities=46% Similarity=1.275 Sum_probs=10.5
Q ss_pred eeecCCCCcCCCCCc
Q psy5750 237 QCQCKRGFSGDGFNC 251 (286)
Q Consensus 237 ~C~C~~Gy~gdg~~C 251 (286)
+|.|+++|.|. .|
T Consensus 19 ~C~C~~~~~G~--~C 31 (46)
T smart00180 19 QCECKPNVTGR--RC 31 (46)
T ss_pred EEECCCCCCCC--CC
Confidence 78888888886 66
No 73
>KOG3509|consensus
Probab=21.86 E-value=1.7e+02 Score=30.07 Aligned_cols=36 Identities=22% Similarity=0.583 Sum_probs=30.4
Q ss_pred CCCCCCCCCCCCCeeeeCCCCCeEecCCCccccccc
Q psy5750 68 DDPCKNFFCVANSSCIVEDDKPTCICNRGFQQLYSE 103 (286)
Q Consensus 68 ~~~C~~~~C~~~~~C~~~~g~~~C~C~~g~~~~~c~ 103 (286)
.+.|...+|...+.|....-...|.|++||.|..|.
T Consensus 406 g~~c~~~p~~~~g~c~p~~~~~~c~c~~g~~G~~c~ 441 (964)
T KOG3509|consen 406 GDVCWRIPCQHDGPCLQTLEGKQCLCPPGYTGDSCE 441 (964)
T ss_pred CCccccccCCCCccccccccccceeccccccCchhh
Confidence 456777889888899988888899999999998766
Done!