Query psy8484
Match_columns 132
No_of_seqs 174 out of 1156
Neff 6.6
Searched_HMMs 46136
Date Fri Aug 16 20:04:11 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy8484.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/8484hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF07546 EMI: EMI domain; Int 98.9 5.5E-09 1.2E-13 68.1 5.7 65 3-68 3-71 (72)
2 KOG1225|consensus 98.8 6.6E-09 1.4E-13 89.6 5.6 71 56-132 264-335 (525)
3 KOG1225|consensus 98.5 2.4E-07 5.3E-12 80.0 5.6 65 56-132 295-361 (525)
4 KOG4289|consensus 97.8 2.1E-05 4.5E-10 74.3 3.7 73 56-131 1221-1308(2531)
5 PF07974 EGF_2: EGF-like domai 97.2 0.0004 8.6E-09 38.5 2.8 23 79-101 7-31 (32)
6 KOG1226|consensus 97.2 0.00073 1.6E-08 60.6 5.7 73 57-132 478-574 (783)
7 KOG1219|consensus 97.0 0.00069 1.5E-08 66.8 4.4 74 56-132 3885-3971(4289)
8 KOG1214|consensus 96.7 0.0029 6.2E-08 57.8 5.2 75 56-132 756-856 (1289)
9 KOG1219|consensus 95.9 0.0096 2.1E-07 59.4 4.3 52 78-132 3870-3932(4289)
10 PF00008 EGF: EGF-like domain 95.9 0.0059 1.3E-07 33.5 1.8 23 78-100 4-31 (32)
11 smart00051 DSL delta serrate l 95.6 0.011 2.3E-07 37.6 2.4 42 58-101 18-62 (63)
12 PF12661 hEGF: Human growth fa 95.5 0.0079 1.7E-07 26.9 1.1 12 90-101 1-12 (13)
13 KOG1226|consensus 95.3 0.037 8.1E-07 50.0 5.7 22 79-101 556-578 (783)
14 PF07974 EGF_2: EGF-like domai 95.3 0.026 5.6E-07 31.1 3.0 17 116-132 10-27 (32)
15 KOG4289|consensus 94.8 0.019 4.2E-07 55.2 2.5 41 90-132 1223-1268(2531)
16 smart00051 DSL delta serrate l 93.3 0.075 1.6E-06 33.7 2.3 38 92-132 20-58 (63)
17 PF01414 DSL: Delta serrate li 90.7 0.13 2.7E-06 32.6 1.1 44 56-101 16-62 (63)
18 KOG1214|consensus 88.1 0.52 1.1E-05 43.7 3.4 46 56-101 808-860 (1289)
19 KOG1217|consensus 87.4 1.5 3.2E-05 35.7 5.5 75 57-131 252-346 (487)
20 PF06247 Plasmod_Pvs28: Plasmo 86.4 0.31 6.8E-06 37.5 0.9 76 54-131 17-117 (197)
21 PF06247 Plasmod_Pvs28: Plasmo 86.2 0.87 1.9E-05 35.1 3.2 74 56-132 69-159 (197)
22 smart00179 EGF_CA Calcium-bind 83.3 1.5 3.3E-05 23.5 2.6 19 114-132 10-32 (39)
23 cd00053 EGF Epidermal growth f 82.2 1.6 3.4E-05 22.6 2.3 23 78-100 6-32 (36)
24 cd00054 EGF_CA Calcium-binding 80.1 2 4.4E-05 22.6 2.4 23 79-101 10-36 (38)
25 KOG1217|consensus 80.0 3.9 8.5E-05 33.3 5.0 76 56-132 109-200 (487)
26 PHA02887 EGF-like protein; Pro 79.4 1.7 3.7E-05 31.1 2.3 25 79-104 93-122 (126)
27 smart00181 EGF Epidermal growt 77.8 2.4 5.2E-05 22.5 2.2 21 79-99 7-30 (35)
28 KOG1218|consensus 77.4 4.8 0.0001 31.6 4.7 70 59-130 126-205 (316)
29 PF02363 C_tripleX: Cysteine r 77.0 1.1 2.4E-05 20.9 0.5 15 71-86 2-16 (17)
30 KOG4260|consensus 76.3 1.8 3.9E-05 35.4 1.9 81 51-132 162-267 (350)
31 PF09064 Tme5_EGF_like: Thromb 74.9 2.5 5.5E-05 23.7 1.7 10 123-132 17-26 (34)
32 PF07645 EGF_CA: Calcium-bindi 74.6 1.3 2.9E-05 25.2 0.5 19 114-132 11-33 (42)
33 PF01683 EB: EB module; Inter 73.3 6.7 0.00015 23.1 3.5 17 82-99 31-47 (52)
34 PF12947 EGF_3: EGF domain; I 69.3 2.5 5.4E-05 23.7 0.9 17 56-72 20-36 (36)
35 KOG4260|consensus 68.5 4.1 8.8E-05 33.4 2.3 42 60-101 131-180 (350)
36 PHA03099 epidermal growth fact 68.4 3.9 8.5E-05 29.7 2.0 23 79-101 52-79 (139)
37 KOG0994|consensus 63.9 5.7 0.00012 38.4 2.5 73 60-132 1002-1092(1758)
38 KOG1218|consensus 62.7 14 0.00031 28.9 4.4 48 84-132 118-170 (316)
39 PF00954 S_locus_glycop: S-loc 59.3 9 0.0002 26.0 2.4 7 91-97 100-106 (110)
40 KOG3607|consensus 56.1 7.8 0.00017 35.4 2.0 28 76-104 628-656 (716)
41 PF12662 cEGF: Complement Clr- 53.5 8.7 0.00019 19.7 1.1 9 58-66 3-11 (24)
42 KOG1388|consensus 53.3 6.2 0.00013 30.9 0.8 45 57-101 76-124 (217)
43 PF12946 EGF_MSP1_1: MSP1 EGF 46.1 27 0.00059 19.8 2.5 31 35-72 6-36 (37)
44 smart00180 EGF_Lam Laminin-typ 41.5 29 0.00064 20.0 2.3 18 84-101 12-30 (46)
45 PF00053 Laminin_EGF: Laminin 40.2 14 0.00031 21.3 0.8 18 84-101 12-30 (49)
46 KOG0994|consensus 39.6 18 0.00038 35.3 1.6 44 88-131 1083-1139(1758)
47 cd00055 EGF_Lam Laminin-type e 39.6 25 0.00055 20.5 1.8 17 85-101 14-31 (50)
48 PF09402 MSC: Man1-Src1p-C-ter 37.9 11 0.00023 30.5 0.0 27 73-100 49-76 (334)
49 KOG0196|consensus 32.1 69 0.0015 30.2 4.1 12 56-67 258-269 (996)
50 KOG3607|consensus 30.9 46 0.00099 30.5 2.8 19 114-132 631-650 (716)
51 PTZ00214 high cysteine membran 29.6 76 0.0016 29.5 4.0 36 60-98 619-656 (800)
52 PF12955 DUF3844: Domain of un 24.8 56 0.0012 22.7 1.8 19 78-96 13-40 (103)
53 PF05092 PIF: Per os infectivi 24.5 78 0.0017 28.0 3.0 41 57-100 132-182 (522)
54 PF03302 VSP: Giardia variant- 23.9 85 0.0018 26.5 3.1 37 60-99 95-134 (397)
No 1
>PF07546 EMI: EMI domain; InterPro: IPR011489 The EMI domain, first named after its presence in proteins of the EMILIN family, is a small cysteine-rich module of around 75 amino acids. The EMI domain is most often found at the N terminus of metazoan extracellular proteins that form multimers or are compatible with multimer formation. It is found in association with other domains, such as C1q, laminin-type EGF-like, collagen-like, FN3, WAP, ZP or FAS1. It has been suggested that the EMI domain could be a protein-protein interaction module, as the EMI domain of EMILIN-1 was found to interact with the C1q domain of EMILIN-2. The EMI domain possesses six highly conserved cysteine residues, which likely form disulphide bonds. Other key features of the EMI domain are the C-C-x-G-[WYFH] pattern, a hydrophobic position just preceding the first cysteine (Cys1) of the domain and a cluster of hydrophobic residues between Cys3 and Cys4. The EMI domain could be made of two sub-domains, the fold of the second one sharing similarities with the C-terminal sub-module characteristic of EGF-like domains. Proteins known to contain an EMI domain include: Vertebrate Emilins, extracellular matrix glycoproteins. Vertebrate Multimerins, extracellular matrix glycoproteins. Vertebrate Emu proteins, which could interact with several different extracellular matrix components and serve to connect and integrate the function of multiple partner molecules. Vertebrate beta-IG-H3. Vertebrate osteoblast-specific factor 2 (OSF-2). Mammalian NEU1/NG3 proteins. Drosophila midline fasciclin. Caenorhabditis elegans ced-1, a transmembrane receptor that mediates cell corpse engulfment. The Pfam alignment for this domain is truncated at the C terminus and does not include the final cysteine. This is to stop the family overlapping with other domains.; GO: 0005515 protein binding
Probab=98.87 E-value=5.5e-09 Score=68.05 Aligned_cols=65 Identities=22% Similarity=0.355 Sum_probs=50.0
Q ss_pred cCCCCCeeeEEece-eeeeeeeeeee-eeeee-e-ecCCcceeEEEEEeeeeeeecCCCceEeCCCceeC
Q psy8484 3 GVPRTRTKTIPIPY-TETYMDEYCMR-QSWYF-T-YHCQKTRTAYSYKYKTEEYMEDTHVRICCEGYEDD 68 (132)
Q Consensus 3 ~vc~~~~~~~~v~~-~esy~qp~~~~-~~~~~-~-~~C~~~r~~~~~~~~~~~~~~~~~~c~C~~Gy~~~ 68 (132)
+||+.++ +..|+. +|+|.|||..+ +.|++ . ++|+.||+.|+..+|...+...+.+-.|||||.|.
T Consensus 3 nvC~~~~-~~~v~~~~~~~~q~~~~~~~~~C~~~~~~C~~yrt~yr~~Yr~~~k~~t~~~~~CCpGy~~~ 71 (72)
T PF07546_consen 3 NVCAYVV-TRTVSCKVESYVQPYVQPYYTPCWWGPPRCSRYRTVYRPAYRQVYKTVTRLEWRCCPGYSGT 71 (72)
T ss_pred ccCCcEe-EECccEEEEeCcEEEEecccccCCCCCCcCCceEEEeEEEEEEEEEEEccEeeeeCcCcccC
Confidence 6899875 333332 77777777665 44552 2 68999999999999998888888889999999986
No 2
>KOG1225|consensus
Probab=98.80 E-value=6.6e-09 Score=89.58 Aligned_cols=71 Identities=30% Similarity=0.702 Sum_probs=58.5
Q ss_pred CCceEeCCCceeCCCceecccCCCCCC-cEEcCCCeEEccCCceeccCCCCCccccCCCCCCCCEEcCCCceeCCCCC
Q psy8484 56 THVRICCEGYEDDHGSCRPVCERECVF-GSCTSPNQCTCSPGYVVINEASPNICEPHCAECVNGVCSAPNTCDCLDVL 132 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C~~~C~~~C~n-G~C~~p~~C~C~~G~~G~~~c~~~~C~~~C~~C~nG~C~~p~~C~C~~G~ 132 (132)
.++|+|.+||+|.+.. +++|...|.. |.++. ++|.|++||+|.+ |+..+|..+|+ .+|.|+ +++|.|.+||
T Consensus 264 ~G~CIC~~Gf~G~dC~-e~~Cp~~cs~~g~~~~-g~CiC~~g~~G~d-Cs~~~cpadC~--g~G~Ci-~G~C~C~~Gy 335 (525)
T KOG1225|consen 264 EGRCICPPGFTGDDCD-ELVCPVDCSGGGVCVD-GECICNPGYSGKD-CSIRRCPADCS--GHGKCI-DGECLCDEGY 335 (525)
T ss_pred CCeEeCCCCCcCCCCC-cccCCcccCCCceecC-CEeecCCCccccc-cccccCCccCC--CCCccc-CCceEeCCCC
Confidence 6789999999987544 7788888874 66665 7999999999997 77777776665 689999 7999999997
No 3
>KOG1225|consensus
Probab=98.47 E-value=2.4e-07 Score=80.01 Aligned_cols=65 Identities=28% Similarity=0.673 Sum_probs=52.5
Q ss_pred CCceEeCCCceeCCCceecccCCCCC-CcEEcCCCeEEccCCceeccCCCCCccccCCCCCCC-CEEcCCCceeCCCCC
Q psy8484 56 THVRICCEGYEDDHGSCRPVCERECV-FGSCTSPNQCTCSPGYVVINEASPNICEPHCAECVN-GVCSAPNTCDCLDVL 132 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C~~~C~~~C~-nG~C~~p~~C~C~~G~~G~~~c~~~~C~~~C~~C~n-G~C~~p~~C~C~~G~ 132 (132)
..+|+|.+||+|...+ +..|...|. ||.|+ +++|.|++||+|.. |... .|.| |.|++ + |.|.+||
T Consensus 295 ~g~CiC~~g~~G~dCs-~~~cpadC~g~G~Ci-~G~C~C~~Gy~G~~-C~~~-------~C~~~g~cv~-g-C~C~~Gw 361 (525)
T KOG1225|consen 295 DGECICNPGYSGKDCS-IRRCPADCSGHGKCI-DGECLCDEGYTGEL-CIQR-------ACSGGGQCVN-G-CKCKKGW 361 (525)
T ss_pred CCEeecCCCccccccc-cccCCccCCCCCccc-CCceEeCCCCcCCc-cccc-------ccCCCceecc-C-ceeccCc
Confidence 5689999999998654 677888998 59999 79999999999997 5533 2555 56654 6 9999997
No 4
>KOG4289|consensus
Probab=97.77 E-value=2.1e-05 Score=74.27 Aligned_cols=73 Identities=27% Similarity=0.590 Sum_probs=52.6
Q ss_pred CCceEeCCCceeCCCce-eccc-CCCCC-CcEEcC---CCeEEccCCceeccCCCCC----ccccCCCCCCC-CEEcCC-
Q psy8484 56 THVRICCEGYEDDHGSC-RPVC-ERECV-FGSCTS---PNQCTCSPGYVVINEASPN----ICEPHCAECVN-GVCSAP- 123 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C-~~~C-~~~C~-nG~C~~---p~~C~C~~G~~G~~~c~~~----~C~~~C~~C~n-G~C~~p- 123 (132)
..+|.|+|||+|+.+.= +..| ..+|. ||+|.+ ..+|.|.+||+|.+ |+.. .|.|- -|.| |+|++.
T Consensus 1221 glrCrCPpGFTgd~CeTeiDlCYs~pC~nng~C~srEggYtCeCrpg~tGeh-CEvs~~agrCvpG--vC~nggtC~~~~ 1297 (2531)
T KOG4289|consen 1221 GLRCRCPPGFTGDYCETEIDLCYSGPCGNNGRCRSREGGYTCECRPGFTGEH-CEVSARAGRCVPG--VCKNGGTCVNLL 1297 (2531)
T ss_pred ceeEeCCCCCCcccccchhHhhhcCCCCCCCceEEecCceeEEecCCccccc-eeeecccCccccc--eecCCCEEeecC
Confidence 46799999999984321 3456 46898 599974 56899999999997 5433 34443 6777 588863
Q ss_pred ---CceeCCCC
Q psy8484 124 ---NTCDCLDV 131 (132)
Q Consensus 124 ---~~C~C~~G 131 (132)
..|.|+.|
T Consensus 1298 nggf~c~Cp~g 1308 (2531)
T KOG4289|consen 1298 NGGFCCHCPYG 1308 (2531)
T ss_pred CCceeccCCCc
Confidence 37888876
No 5
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal, proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=97.21 E-value=0.0004 Score=38.51 Aligned_cols=23 Identities=39% Similarity=1.066 Sum_probs=19.6
Q ss_pred CCC-CcEEcCC-CeEEccCCceecc
Q psy8484 79 ECV-FGSCTSP-NQCTCSPGYVVIN 101 (132)
Q Consensus 79 ~C~-nG~C~~p-~~C~C~~G~~G~~ 101 (132)
.|. ||+|+.+ ++|.|++||.|++
T Consensus 7 ~C~~~G~C~~~~g~C~C~~g~~G~~ 31 (32)
T PF07974_consen 7 ICSGHGTCVSPCGRCVCDSGYTGPD 31 (32)
T ss_pred ccCCCCEEeCCCCEEECCCCCcCCC
Confidence 477 6999987 8999999999975
No 6
>KOG1226|consensus
Probab=97.19 E-value=0.00073 Score=60.63 Aligned_cols=73 Identities=27% Similarity=0.551 Sum_probs=45.2
Q ss_pred CceEeCCCceeCCCcee----------cccC-----CCCC-CcEEcCCCeEEccCCce----ecc-CCCCCccccCCC--
Q psy8484 57 HVRICCEGYEDDHGSCR----------PVCE-----RECV-FGSCTSPNQCTCSPGYV----VIN-EASPNICEPHCA-- 113 (132)
Q Consensus 57 ~~c~C~~Gy~~~~~~C~----------~~C~-----~~C~-nG~C~~p~~C~C~~G~~----G~~-~c~~~~C~~~C~-- 113 (132)
++|.|.+||.|..++|. ..|. +.|. +|.|+= ++|.|.+... |+. +||...|..+ .
T Consensus 478 G~C~C~~G~~G~~CEC~~~~~ss~~~~~~Cr~~~~~~vCSgrG~C~C-GqC~C~~~~~~~i~G~fCECDnfsC~r~-~g~ 555 (783)
T KOG1226|consen 478 GQCRCDEGWLGKKCECSTDELSSSEEEDKCRENSDSPVCSGRGDCVC-GQCVCHKPDNGKIYGKFCECDNFSCERH-KGV 555 (783)
T ss_pred cceecCCCCCCCcccCCccccCcHhHHhhccCCCCCCCcCCCCcEeC-CceEecCCCCCceeeeeeeccCcccccc-cCc
Confidence 46899999999865552 1121 2455 366654 6777776655 544 5666666544 1
Q ss_pred CC-CCCEEcCCCceeCCCCC
Q psy8484 114 EC-VNGVCSAPNTCDCLDVL 132 (132)
Q Consensus 114 ~C-~nG~C~~p~~C~C~~G~ 132 (132)
-| .||.|.- |+|+|.+||
T Consensus 556 lC~g~G~C~C-G~CvC~~Gw 574 (783)
T KOG1226|consen 556 LCGGHGRCEC-GRCVCNPGW 574 (783)
T ss_pred ccCCCCeEeC-CcEEcCCCC
Confidence 24 3677643 788888887
No 7
>KOG1219|consensus
Probab=97.04 E-value=0.00069 Score=66.84 Aligned_cols=74 Identities=24% Similarity=0.621 Sum_probs=53.5
Q ss_pred CCceEeCCCceeCCCce-eccc-CCCCCC-cEEcC---CCeEEccCCceeccCCCCC---ccccCCCCCCC-CEEcC-CC
Q psy8484 56 THVRICCEGYEDDHGSC-RPVC-ERECVF-GSCTS---PNQCTCSPGYVVINEASPN---ICEPHCAECVN-GVCSA-PN 124 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C-~~~C-~~~C~n-G~C~~---p~~C~C~~G~~G~~~c~~~---~C~~~C~~C~n-G~C~~-p~ 124 (132)
.+.|.|.+-|.|.+++= ...| ..||.+ |+|+. ...|.|+.||+|.. |+.. .|..+ .|.| |.|++ ++
T Consensus 3885 gy~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC~~gyTG~~-Ce~~Gi~eCs~n--~C~~gg~C~n~~g 3961 (4289)
T KOG1219|consen 3885 GYKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNCPNGYTGKR-CEARGISECSKN--VCGTGGQCINIPG 3961 (4289)
T ss_pred ceEEeCcccccCcccccccccccCCCCCCCCEEEecCCCeeEeCCCCccCce-eecccccccccc--cccCCceeeccCC
Confidence 45789999999875431 2335 468996 89984 34799999999987 4432 24333 8998 69998 55
Q ss_pred --ceeCCCCC
Q psy8484 125 --TCDCLDVL 132 (132)
Q Consensus 125 --~C~C~~G~ 132 (132)
.|.|.+|+
T Consensus 3962 sf~CncT~g~ 3971 (4289)
T KOG1219|consen 3962 SFHCNCTPGI 3971 (4289)
T ss_pred ceEeccChhH
Confidence 99998875
No 8
>KOG1214|consensus
Probab=96.68 E-value=0.0029 Score=57.80 Aligned_cols=75 Identities=28% Similarity=0.661 Sum_probs=52.5
Q ss_pred CCceEeCCCcee--CCCceecccC----CC-------CC-Cc--EEc----CCCeEEccCCceecc--CCCCCccccCCC
Q psy8484 56 THVRICCEGYED--DHGSCRPVCE----RE-------CV-FG--SCT----SPNQCTCSPGYVVIN--EASPNICEPHCA 113 (132)
Q Consensus 56 ~~~c~C~~Gy~~--~~~~C~~~C~----~~-------C~-nG--~C~----~p~~C~C~~G~~G~~--~c~~~~C~~~C~ 113 (132)
..+|.|..||.= ...+|+++-. +. |. +| +|+ +...|.|-+||.|.. .++.++|+|+
T Consensus 756 ~~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvDeC~ps-- 833 (1289)
T KOG1214|consen 756 SYRCECRSGYEFADDRHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVDECSPS-- 833 (1289)
T ss_pred ceeEEEeecceeccCCcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccCCccccccccccCcc--
Confidence 346778888754 3468865532 23 43 34 444 345799999999976 5677788877
Q ss_pred CCC-CCEEcC-C--CceeCCCCC
Q psy8484 114 ECV-NGVCSA-P--NTCDCLDVL 132 (132)
Q Consensus 114 ~C~-nG~C~~-p--~~C~C~~G~ 132 (132)
.|. +..|.+ | ..|.|.+||
T Consensus 834 rChp~A~CyntpgsfsC~C~pGy 856 (1289)
T KOG1214|consen 834 RCHPAATCYNTPGSFSCRCQPGY 856 (1289)
T ss_pred ccCCCceEecCCCcceeecccCc
Confidence 887 789997 4 589999997
No 9
>KOG1219|consensus
Probab=95.90 E-value=0.0096 Score=59.37 Aligned_cols=52 Identities=25% Similarity=0.696 Sum_probs=40.9
Q ss_pred CCCCC-cEEcC----CCeEEccCCceeccCCCCC--ccccCCCCCCC-CEEcCC---CceeCCCCC
Q psy8484 78 RECVF-GSCTS----PNQCTCSPGYVVINEASPN--ICEPHCAECVN-GVCSAP---NTCDCLDVL 132 (132)
Q Consensus 78 ~~C~n-G~C~~----p~~C~C~~G~~G~~~c~~~--~C~~~C~~C~n-G~C~~p---~~C~C~~G~ 132 (132)
.+|+| |.|+. ...|.|++.|+|.. |+.. .|.+. +|.+ |+|+.. ..|.|+.||
T Consensus 3870 npCqhgG~C~~~~~ggy~CkCpsqysG~~-CEi~~epC~sn--PC~~GgtCip~~n~f~CnC~~gy 3932 (4289)
T KOG1219|consen 3870 NPCQHGGTCISQPKGGYKCKCPSQYSGNH-CEIDLEPCASN--PCLTGGTCIPFYNGFLCNCPNGY 3932 (4289)
T ss_pred CcccCCCEecCCCCCceEEeCcccccCcc-cccccccccCC--CCCCCCEEEecCCCeeEeCCCCc
Confidence 58998 89984 45899999999998 5443 35544 9997 699973 589999887
No 10
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal, proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=95.88 E-value=0.0059 Score=33.46 Aligned_cols=23 Identities=35% Similarity=0.904 Sum_probs=18.4
Q ss_pred CCCCC-cEEcCC----CeEEccCCceec
Q psy8484 78 RECVF-GSCTSP----NQCTCSPGYVVI 100 (132)
Q Consensus 78 ~~C~n-G~C~~p----~~C~C~~G~~G~ 100 (132)
.+|.| |+|+.. ..|.|.+||+|.
T Consensus 4 ~~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 4 NPCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp TSSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred CcCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 47886 899743 479999999996
No 11
>smart00051 DSL delta serrate ligand.
Probab=95.60 E-value=0.011 Score=37.59 Aligned_cols=42 Identities=19% Similarity=0.460 Sum_probs=30.8
Q ss_pred ceEeCCCceeCCCceecccCC--CCC-CcEEcCCCeEEccCCceecc
Q psy8484 58 VRICCEGYEDDHGSCRPVCER--ECV-FGSCTSPNQCTCSPGYVVIN 101 (132)
Q Consensus 58 ~c~C~~Gy~~~~~~C~~~C~~--~C~-nG~C~~p~~C~C~~G~~G~~ 101 (132)
+-.|.++|.|.. |.-.|.. .+. |.+|...+.+.|.+||.|.+
T Consensus 18 rv~C~~~~yG~~--C~~~C~~~~d~~~~~~Cd~~G~~~C~~Gw~G~~ 62 (63)
T smart00051 18 RVTCDENYYGEG--CNKFCRPRDDFFGHYTCDENGNKGCLEGWMGPY 62 (63)
T ss_pred EeeCCCCCcCCc--cCCEeCcCccccCCccCCcCCCEecCCCCcCCC
Confidence 456899999874 4444432 233 57898888999999999975
No 12
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=95.50 E-value=0.0079 Score=26.85 Aligned_cols=12 Identities=42% Similarity=1.099 Sum_probs=6.9
Q ss_pred eEEccCCceecc
Q psy8484 90 QCTCSPGYVVIN 101 (132)
Q Consensus 90 ~C~C~~G~~G~~ 101 (132)
+|.|++||+|.+
T Consensus 1 ~C~C~~G~~G~~ 12 (13)
T PF12661_consen 1 TCQCPPGWTGPN 12 (13)
T ss_dssp EEEE-TTEETTT
T ss_pred CccCcCCCcCCC
Confidence 366677776654
No 13
>KOG1226|consensus
Probab=95.33 E-value=0.037 Score=50.00 Aligned_cols=22 Identities=32% Similarity=0.937 Sum_probs=18.6
Q ss_pred CCC-CcEEcCCCeEEccCCceecc
Q psy8484 79 ECV-FGSCTSPNQCTCSPGYVVIN 101 (132)
Q Consensus 79 ~C~-nG~C~~p~~C~C~~G~~G~~ 101 (132)
.|. ||.|.- ++|.|.+||+|..
T Consensus 556 lC~g~G~C~C-G~CvC~~GwtG~~ 578 (783)
T KOG1226|consen 556 LCGGHGRCEC-GRCVCNPGWTGSA 578 (783)
T ss_pred ccCCCCeEeC-CcEEcCCCCccCC
Confidence 476 688865 8999999999987
No 14
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal, proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=95.28 E-value=0.026 Score=31.13 Aligned_cols=17 Identities=29% Similarity=0.729 Sum_probs=15.5
Q ss_pred CCCEEcCC-CceeCCCCC
Q psy8484 116 VNGVCSAP-NTCDCLDVL 132 (132)
Q Consensus 116 ~nG~C~~p-~~C~C~~G~ 132 (132)
.||+|+.+ ++|.|.+||
T Consensus 10 ~~G~C~~~~g~C~C~~g~ 27 (32)
T PF07974_consen 10 GHGTCVSPCGRCVCDSGY 27 (32)
T ss_pred CCCEEeCCCCEEECCCCC
Confidence 47999987 999999998
No 15
>KOG4289|consensus
Probab=94.79 E-value=0.019 Score=55.24 Aligned_cols=41 Identities=32% Similarity=0.810 Sum_probs=32.0
Q ss_pred eEEccCCceecc-CCCCCccccCCCCCC-CCEEcC---CCceeCCCCC
Q psy8484 90 QCTCSPGYVVIN-EASPNICEPHCAECV-NGVCSA---PNTCDCLDVL 132 (132)
Q Consensus 90 ~C~C~~G~~G~~-~c~~~~C~~~C~~C~-nG~C~~---p~~C~C~~G~ 132 (132)
+|+|++||+|.. +-+.+.|-.. +|. ||.|.. .++|+|.+||
T Consensus 1223 rCrCPpGFTgd~CeTeiDlCYs~--pC~nng~C~srEggYtCeCrpg~ 1268 (2531)
T KOG4289|consen 1223 RCRCPPGFTGDYCETEIDLCYSG--PCGNNGRCRSREGGYTCECRPGF 1268 (2531)
T ss_pred eEeCCCCCCcccccchhHhhhcC--CCCCCCceEEecCceeEEecCCc
Confidence 699999999987 2234556554 887 589987 3799999998
No 16
>smart00051 DSL delta serrate ligand.
Probab=93.26 E-value=0.075 Score=33.66 Aligned_cols=38 Identities=21% Similarity=0.538 Sum_probs=26.8
Q ss_pred EccCCceeccCCCCCccccCCCCCC-CCEEcCCCceeCCCCC
Q psy8484 92 TCSPGYVVINEASPNICEPHCAECV-NGVCSAPNTCDCLDVL 132 (132)
Q Consensus 92 ~C~~G~~G~~~c~~~~C~~~C~~C~-nG~C~~p~~C~C~~G~ 132 (132)
.|+++|.|.. |+ ..|.+. ..+. +..|...|.++|.+||
T Consensus 20 ~C~~~~yG~~-C~-~~C~~~-~d~~~~~~Cd~~G~~~C~~Gw 58 (63)
T smart00051 20 TCDENYYGEG-CN-KFCRPR-DDFFGHYTCDENGNKGCLEGW 58 (63)
T ss_pred eCCCCCcCCc-cC-CEeCcC-ccccCCccCCcCCCEecCCCC
Confidence 6789999987 53 344432 0233 4699888999999998
No 17
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown, and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=90.68 E-value=0.13 Score=32.65 Aligned_cols=44 Identities=25% Similarity=0.546 Sum_probs=21.4
Q ss_pred CCceEeCCCceeCCCceecccCCC--CC-CcEEcCCCeEEccCCceecc
Q psy8484 56 THVRICCEGYEDDHGSCRPVCERE--CV-FGSCTSPNQCTCSPGYVVIN 101 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C~~~C~~~--C~-nG~C~~p~~C~C~~G~~G~~ 101 (132)
+.+-.|.+.|.|.. |.-.|.+. -. |-+|.+.+.=.|.+||.|++
T Consensus 16 ~~rv~C~~nyyG~~--C~~~C~~~~d~~ghy~Cd~~G~~~C~~Gw~G~~ 62 (63)
T PF01414_consen 16 RIRVVCDENYYGPN--CSKFCKPRDDSFGHYTCDSNGNKVCLPGWTGPN 62 (63)
T ss_dssp -------TTEETTT--T-EE---EEETTEEEEE-SS--EEE-TTEESTT
T ss_pred EEEEECCCCCCCcc--ccCCcCCCcCCcCCcccCCCCCCCCCCCCcCCC
Confidence 45678999999984 44444322 11 45888888899999999985
No 18
>KOG1214|consensus
Probab=88.13 E-value=0.52 Score=43.71 Aligned_cols=46 Identities=28% Similarity=0.683 Sum_probs=33.6
Q ss_pred CCceEeCCCceeCCCcee--cccCC-CCC-CcEEcC---CCeEEccCCceecc
Q psy8484 56 THVRICCEGYEDDHGSCR--PVCER-ECV-FGSCTS---PNQCTCSPGYVVIN 101 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C~--~~C~~-~C~-nG~C~~---p~~C~C~~G~~G~~ 101 (132)
.+.|.|.|||.|.+..|. ..|.+ .|. +..|+. ...|+|.+||.|..
T Consensus 808 ~y~C~CLPGfsGDG~~c~dvDeC~psrChp~A~CyntpgsfsC~C~pGy~GDG 860 (1289)
T KOG1214|consen 808 TYSCACLPGFSGDGHQCTDVDECSPSRCHPAATCYNTPGSFSCRCQPGYYGDG 860 (1289)
T ss_pred eEEEeecCCccCCccccccccccCccccCCCceEecCCCcceeecccCccCCC
Confidence 567999999999976653 23432 454 688874 35799999999975
No 19
>KOG1217|consensus
Probab=87.37 E-value=1.5 Score=35.73 Aligned_cols=75 Identities=32% Similarity=0.713 Sum_probs=46.2
Q ss_pred CceEeCCCceeCC-Cce--ecccCC--CCCC-cEEcCC---CeEEccCCceeccC--C-CCCccccCCC--CCCCC-EEc
Q psy8484 57 HVRICCEGYEDDH-GSC--RPVCER--ECVF-GSCTSP---NQCTCSPGYVVINE--A-SPNICEPHCA--ECVNG-VCS 121 (132)
Q Consensus 57 ~~c~C~~Gy~~~~-~~C--~~~C~~--~C~n-G~C~~p---~~C~C~~G~~G~~~--c-~~~~C~~~C~--~C~nG-~C~ 121 (132)
+.|.|.+||.+.. ..| ...|.. .|.| |.|+.. ..|.|++||.|... + +...|.+.-. .|.+| .|.
T Consensus 252 ~~C~~~~g~~~~~~~~~~~~~~C~~~~~c~~~~~C~~~~~~~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~ 331 (487)
T KOG1217|consen 252 YTCRCPEGYTGDACVTCVDVDSCALIASCPNGGTCVNVPGSYRCTCPPGFTGRLCTECVDVDECSPRNAGGPCANGGTCN 331 (487)
T ss_pred eeeeCCCCccccccceeeeccccCCCCccCCCCeeecCCCcceeeCCCCCCCCCCccccccccccccccCCcCCCCcccc
Confidence 4688899998875 222 233433 3775 899864 67999999999872 1 2234532112 57764 773
Q ss_pred CC-----CceeCCCC
Q psy8484 122 AP-----NTCDCLDV 131 (132)
Q Consensus 122 ~p-----~~C~C~~G 131 (132)
.. ..|.|..|
T Consensus 332 ~~~~~~~~~C~c~~~ 346 (487)
T KOG1217|consen 332 TLGSFGGFRCACGPG 346 (487)
T ss_pred cCCCCCCCCcCCCCC
Confidence 32 34777665
No 20
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=86.39 E-value=0.31 Score=37.47 Aligned_cols=76 Identities=28% Similarity=0.722 Sum_probs=45.6
Q ss_pred cCCCceEeCCCceeC-CCceec--ccC------CCCCC-cEEcCC--------CeEEccCCceecc-CCCCCccccCCCC
Q psy8484 54 EDTHVRICCEGYEDD-HGSCRP--VCE------RECVF-GSCTSP--------NQCTCSPGYVVIN-EASPNICEPHCAE 114 (132)
Q Consensus 54 ~~~~~c~C~~Gy~~~-~~~C~~--~C~------~~C~n-G~C~~p--------~~C~C~~G~~G~~-~c~~~~C~~~C~~ 114 (132)
..-+.|.|.+||.-. ..+|++ .|. ++|.+ +.|+.. ..|.|.+||.-.. .|-+..|... .
T Consensus 17 SNHfEC~Cnegfvl~~EntCE~kv~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~~~~~vCvp~~C~~~--~ 94 (197)
T PF06247_consen 17 SNHFECKCNEGFVLKNENTCEEKVECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGYILKQGVCVPNKCNNK--D 94 (197)
T ss_dssp SSEEEEEESTTEEEEETTEEEE----SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTEEESSSSEEEGGGSS----
T ss_pred cCceEEEcCCCcEEccccccccceecCcccccCccccchhhhhcCCCcccceeEEEecccCceeeCCeEchhhcCce--e
Confidence 334679999999876 567743 343 36775 889732 2699999998875 2333334333 7
Q ss_pred CCCCEEcC-C-----CceeCCCC
Q psy8484 115 CVNGVCSA-P-----NTCDCLDV 131 (132)
Q Consensus 115 C~nG~C~~-p-----~~C~C~~G 131 (132)
|.+|.|+- | -.|+|.-|
T Consensus 95 Cg~GKCI~d~~~~~~~~CSC~IG 117 (197)
T PF06247_consen 95 CGSGKCILDPDNPNNPTCSCNIG 117 (197)
T ss_dssp -TTEEEEEEEGGGSEEEEEE-TE
T ss_pred cCCCeEEecCCCCCCceeEeeec
Confidence 88899984 2 27888654
No 21
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=86.20 E-value=0.87 Score=35.06 Aligned_cols=74 Identities=28% Similarity=0.710 Sum_probs=43.2
Q ss_pred CCceEeCCCceeCCCcee-cccCC-CCCCcEEcC----C--CeEEccCCceeccCCCCCccc----cCCC-CCC-CCEEc
Q psy8484 56 THVRICCEGYEDDHGSCR-PVCER-ECVFGSCTS----P--NQCTCSPGYVVINEASPNICE----PHCA-ECV-NGVCS 121 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C~-~~C~~-~C~nG~C~~----p--~~C~C~~G~~G~~~c~~~~C~----~~C~-~C~-nG~C~ 121 (132)
.+.|.|-+||......|+ ..|.. .|.+|.|+. + ..|+|+-|.. ++ +...|. ..|+ .|. |-.|-
T Consensus 69 ~~~C~C~~gY~~~~~vCvp~~C~~~~Cg~GKCI~d~~~~~~~~CSC~IGkV-~~--dn~kCtk~G~T~C~LKCk~nE~CK 145 (197)
T PF06247_consen 69 AYKCDCINGYILKQGVCVPNKCNNKDCGSGKCILDPDNPNNPTCSCNIGKV-PD--DNKKCTKTGETKCSLKCKENEECK 145 (197)
T ss_dssp SEEEEE-TTEEESSSSEEEGGGSS---TTEEEEEEEGGGSEEEEEE-TEEE-TT--TTTESEEEE--------TTTEEEE
T ss_pred eEEEecccCceeeCCeEchhhcCceecCCCeEEecCCCCCCceeEeeeceE-ec--cCCcccCCCccceeeecCCCccee
Confidence 467999999999988996 56753 688999972 1 2799999998 32 122331 2233 343 56776
Q ss_pred C---CCceeCCCCC
Q psy8484 122 A---PNTCDCLDVL 132 (132)
Q Consensus 122 ~---p~~C~C~~G~ 132 (132)
. -++|.|.+||
T Consensus 146 ~~~~~Y~C~~~~~~ 159 (197)
T PF06247_consen 146 LVDGYYKCVCKEGF 159 (197)
T ss_dssp EETTEEEEEE-TT-
T ss_pred eeCcEEEeecCCCC
Confidence 4 2688888775
No 22
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=83.30 E-value=1.5 Score=23.55 Aligned_cols=19 Identities=32% Similarity=0.842 Sum_probs=10.1
Q ss_pred CCCC-CEEcCC---CceeCCCCC
Q psy8484 114 ECVN-GVCSAP---NTCDCLDVL 132 (132)
Q Consensus 114 ~C~n-G~C~~p---~~C~C~~G~ 132 (132)
+|.+ |.|++. ..|.|++||
T Consensus 10 ~C~~~~~C~~~~g~~~C~C~~g~ 32 (39)
T smart00179 10 PCQNGGTCVNTVGSYRCECPPGY 32 (39)
T ss_pred CcCCCCEeECCCCCeEeECCCCC
Confidence 3443 466542 356666665
No 23
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=82.23 E-value=1.6 Score=22.63 Aligned_cols=23 Identities=35% Similarity=0.963 Sum_probs=17.4
Q ss_pred CCCCC-cEEcC---CCeEEccCCceec
Q psy8484 78 RECVF-GSCTS---PNQCTCSPGYVVI 100 (132)
Q Consensus 78 ~~C~n-G~C~~---p~~C~C~~G~~G~ 100 (132)
.+|.+ +.|+. ...|.|+.||.|.
T Consensus 6 ~~C~~~~~C~~~~~~~~C~C~~g~~g~ 32 (36)
T cd00053 6 NPCSNGGTCVNTPGSYRCVCPPGYTGD 32 (36)
T ss_pred CCCCCCCEEecCCCCeEeECCCCCccc
Confidence 46664 78874 4679999999886
No 24
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=80.05 E-value=2 Score=22.57 Aligned_cols=23 Identities=39% Similarity=1.048 Sum_probs=16.9
Q ss_pred CCCC-cEEcC---CCeEEccCCceecc
Q psy8484 79 ECVF-GSCTS---PNQCTCSPGYVVIN 101 (132)
Q Consensus 79 ~C~n-G~C~~---p~~C~C~~G~~G~~ 101 (132)
+|.+ |.|+. ...|.|.+||.|..
T Consensus 10 ~C~~~~~C~~~~~~~~C~C~~g~~g~~ 36 (38)
T cd00054 10 PCQNGGTCVNTVGSYRCSCPPGYTGRN 36 (38)
T ss_pred CcCCCCEeECCCCCeEeECCCCCcCCc
Confidence 5764 68863 34699999999864
No 25
>KOG1217|consensus
Probab=80.01 E-value=3.9 Score=33.27 Aligned_cols=76 Identities=26% Similarity=0.596 Sum_probs=47.4
Q ss_pred CCceEeCCCceeCCC----ceecccCCCCCCcEEcC------CCeEEccCCceeccCCCC--CccccCCCCCCC-CEEcC
Q psy8484 56 THVRICCEGYEDDHG----SCRPVCERECVFGSCTS------PNQCTCSPGYVVINEASP--NICEPHCAECVN-GVCSA 122 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~----~C~~~C~~~C~nG~C~~------p~~C~C~~G~~G~~~c~~--~~C~~~C~~C~n-G~C~~ 122 (132)
...|.|.+||.+... .|...-..-+.++.|.. ...|.|..||.|.. +.. +.|...=..|.| +.|.+
T Consensus 109 ~~~c~c~~g~~~~~~~~~~~C~~~~~~~~~~~~c~~~~~~~~~~~c~C~~g~~~~~-~~~~~~~C~~~~~~c~~~~~C~~ 187 (487)
T KOG1217|consen 109 SYECTCPPGYQGTPCEGECECVTGPGVCCIDGSCSNGPGSVGPFRCSCTEGYEGEP-CETDLDECIQYSSPCQNGGTCVN 187 (487)
T ss_pred CceeeCCCccccCcCCcceeecCCCCCeeCchhhcCCCCCCCceeeeeCCCccccc-ccccccccccCCCCcCCCccccc
Confidence 556889999988732 23221110123577763 46799999999986 332 355522115776 68887
Q ss_pred C---CceeCCCCC
Q psy8484 123 P---NTCDCLDVL 132 (132)
Q Consensus 123 p---~~C~C~~G~ 132 (132)
. ..|.|.+||
T Consensus 188 ~~~~~~C~c~~~~ 200 (487)
T KOG1217|consen 188 TGGSYLCSCPPGY 200 (487)
T ss_pred CCCCeeEeCCCCc
Confidence 4 469998886
No 26
>PHA02887 EGF-like protein; Provisional
Probab=79.42 E-value=1.7 Score=31.07 Aligned_cols=25 Identities=32% Similarity=0.834 Sum_probs=19.2
Q ss_pred CCCCcEEc-----CCCeEEccCCceeccCCC
Q psy8484 79 ECVFGSCT-----SPNQCTCSPGYVVINEAS 104 (132)
Q Consensus 79 ~C~nG~C~-----~p~~C~C~~G~~G~~~c~ 104 (132)
=|.||.|. ....|.|+.||.|.. |+
T Consensus 93 YCiHG~C~yI~dL~epsCrC~~GYtG~R-CE 122 (126)
T PHA02887 93 FCINGECMNIIDLDEKFCICNKGYTGIR-CD 122 (126)
T ss_pred EeeCCEEEccccCCCceeECCCCcccCC-CC
Confidence 46688885 234699999999986 54
No 27
>smart00181 EGF Epidermal growth factor-like domain.
Probab=77.78 E-value=2.4 Score=22.47 Aligned_cols=21 Identities=38% Similarity=1.100 Sum_probs=15.6
Q ss_pred CCCCcEEcC---CCeEEccCCcee
Q psy8484 79 ECVFGSCTS---PNQCTCSPGYVV 99 (132)
Q Consensus 79 ~C~nG~C~~---p~~C~C~~G~~G 99 (132)
+|.++.|+. ...|+|.+||.|
T Consensus 7 ~C~~~~C~~~~~~~~C~C~~g~~g 30 (35)
T smart00181 7 PCSNGTCINTPGSYTCSCPPGYTG 30 (35)
T ss_pred CCCCCEEECCCCCeEeECCCCCcc
Confidence 566557763 467999999998
No 28
>KOG1218|consensus
Probab=77.36 E-value=4.8 Score=31.60 Aligned_cols=70 Identities=30% Similarity=0.688 Sum_probs=37.9
Q ss_pred eEeCCCceeCCCce----ecccCCCCC--CcEEcCCCeEEccCCceeccCCCCC--ccccCCCCCCCC-EEcCC-CceeC
Q psy8484 59 RICCEGYEDDHGSC----RPVCERECV--FGSCTSPNQCTCSPGYVVINEASPN--ICEPHCAECVNG-VCSAP-NTCDC 128 (132)
Q Consensus 59 c~C~~Gy~~~~~~C----~~~C~~~C~--nG~C~~p~~C~C~~G~~G~~~c~~~--~C~~~C~~C~nG-~C~~p-~~C~C 128 (132)
|.+..+|.+..... .+.|...|. .+.....+.|.|.+||.|.. +... .|.+.+ .|.+| .|+.. +.+.+
T Consensus 126 c~~~~~~~~~~C~~~~~~g~~C~~~c~~~~~~~~~~~~c~c~~g~~g~~-~~~~~~~c~~~~-~~~~g~~C~~~~~~~~~ 203 (316)
T KOG1218|consen 126 CRCGGGYIGEQCGEENLVGLKCQRDCQCTGGCDCKNGICTCQPGFVGVF-CVESCSGCSPLT-ACENGAKCNRSTGSCLC 203 (316)
T ss_pred eecCCcCccccccccCCCCCCccCCCCCccccCCCCCceeccCCccccc-ccccCCCcCCCc-ccCCCCeeecccccccc
Confidence 55555655542211 123444442 35556678899999999987 2211 133332 56664 88753 44444
Q ss_pred CC
Q psy8484 129 LD 130 (132)
Q Consensus 129 ~~ 130 (132)
.+
T Consensus 204 ~~ 205 (316)
T KOG1218|consen 204 YP 205 (316)
T ss_pred CC
Confidence 43
No 29
>PF02363 C_tripleX: Cysteine rich repeat; InterPro: IPR003341 This signature describes a cysteine repeat, C-X3-C-X3-C, the function of which is unknown, as is the function of the proteins in which it occurs.
Probab=76.99 E-value=1.1 Score=20.87 Aligned_cols=15 Identities=47% Similarity=1.119 Sum_probs=9.3
Q ss_pred ceecccCCCCCCcEEc
Q psy8484 71 SCRPVCERECVFGSCT 86 (132)
Q Consensus 71 ~C~~~C~~~C~nG~C~ 86 (132)
.|+|.|...|.+ .|+
T Consensus 2 ~C~p~C~~~C~~-~C~ 16 (17)
T PF02363_consen 2 QCVPQCEPSCEN-SCV 16 (17)
T ss_pred cchhhccCcccc-cCC
Confidence 356667666666 554
No 30
>KOG4260|consensus
Probab=76.34 E-value=1.8 Score=35.40 Aligned_cols=81 Identities=23% Similarity=0.534 Sum_probs=46.4
Q ss_pred eeecCCCceEeCCCceeCCC-cee------------cccC---CCCCCcEEcC--CCeE-EccCCceecc--CCCCCccc
Q psy8484 51 EYMEDTHVRICCEGYEDDHG-SCR------------PVCE---RECVFGSCTS--PNQC-TCSPGYVVIN--EASPNICE 109 (132)
Q Consensus 51 ~~~~~~~~c~C~~Gy~~~~~-~C~------------~~C~---~~C~nG~C~~--p~~C-~C~~G~~G~~--~c~~~~C~ 109 (132)
+.++-.+.|.|.+||.|... .|- .+|. .+|. |.|.. +..| .|..||.-.. .-|.+.|+
T Consensus 162 GsR~GsGkCkC~~GY~Gp~C~~Cg~eyfes~Rne~~lvCt~Ch~~C~-~~Csg~~~k~C~kCkkGW~lde~gCvDvnEC~ 240 (350)
T KOG4260|consen 162 GSREGSGKCKCETGYTGPLCRYCGIEYFESSRNEQHLVCTACHEGCL-GVCSGESSKGCSKCKKGWKLDEEGCVDVNECQ 240 (350)
T ss_pred CCCCCCCcccccCCCCCccccccchHHHHhhcccccchhhhhhhhhh-cccCCCCCCChhhhcccceecccccccHHHHh
Confidence 44455678999999999732 221 2332 2332 36664 3346 6899998764 23555665
Q ss_pred cCCCCCC-CCEEcC---CCceeCCCCC
Q psy8484 110 PHCAECV-NGVCSA---PNTCDCLDVL 132 (132)
Q Consensus 110 ~~C~~C~-nG~C~~---p~~C~C~~G~ 132 (132)
..=.+|. +-.|++ .+.|.+.+||
T Consensus 241 ~ep~~c~~~qfCvNteGSf~C~dk~Gy 267 (350)
T KOG4260|consen 241 NEPAPCKAHQFCVNTEGSFKCEDKEGY 267 (350)
T ss_pred cCCCCCChhheeecCCCceEecccccc
Confidence 4311454 456665 2567766664
No 31
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=74.94 E-value=2.5 Score=23.67 Aligned_cols=10 Identities=30% Similarity=0.786 Sum_probs=7.8
Q ss_pred CCceeCCCCC
Q psy8484 123 PNTCDCLDVL 132 (132)
Q Consensus 123 p~~C~C~~G~ 132 (132)
++.|.|++||
T Consensus 17 ~~~C~CPeGy 26 (34)
T PF09064_consen 17 PGQCFCPEGY 26 (34)
T ss_pred CCceeCCCce
Confidence 3688888887
No 32
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=74.55 E-value=1.3 Score=25.22 Aligned_cols=19 Identities=32% Similarity=0.858 Sum_probs=12.8
Q ss_pred CCC-CCEEcC---CCceeCCCCC
Q psy8484 114 ECV-NGVCSA---PNTCDCLDVL 132 (132)
Q Consensus 114 ~C~-nG~C~~---p~~C~C~~G~ 132 (132)
.|. ++.|++ .+.|.|++||
T Consensus 11 ~C~~~~~C~N~~Gsy~C~C~~Gy 33 (42)
T PF07645_consen 11 NCPENGTCVNTEGSYSCSCPPGY 33 (42)
T ss_dssp SSSTTSEEEEETTEEEEEESTTE
T ss_pred cCCCCCEEEcCCCCEEeeCCCCc
Confidence 454 577776 2578888876
No 33
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=73.33 E-value=6.7 Score=23.07 Aligned_cols=17 Identities=41% Similarity=1.085 Sum_probs=9.6
Q ss_pred CcEEcCCCeEEccCCcee
Q psy8484 82 FGSCTSPNQCTCSPGYVV 99 (132)
Q Consensus 82 nG~C~~p~~C~C~~G~~G 99 (132)
+..|+. +.|.|++||.-
T Consensus 31 ~s~C~~-g~C~C~~g~~~ 47 (52)
T PF01683_consen 31 GSVCVN-GRCQCPPGYVE 47 (52)
T ss_pred cCEEcC-CEeECCCCCEe
Confidence 355644 66666666543
No 34
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=69.28 E-value=2.5 Score=23.67 Aligned_cols=17 Identities=35% Similarity=0.581 Sum_probs=12.2
Q ss_pred CCceEeCCCceeCCCce
Q psy8484 56 THVRICCEGYEDDHGSC 72 (132)
Q Consensus 56 ~~~c~C~~Gy~~~~~~C 72 (132)
...|.|.+||.|.+..|
T Consensus 20 ~~~C~C~~Gy~GdG~~C 36 (36)
T PF12947_consen 20 SYTCTCKPGYEGDGFFC 36 (36)
T ss_dssp SEEEEE-CEEECCSTCE
T ss_pred CEEeECCCCCccCCcCC
Confidence 56789999999886554
No 35
>KOG4260|consensus
Probab=68.54 E-value=4.1 Score=33.40 Aligned_cols=42 Identities=31% Similarity=0.665 Sum_probs=27.1
Q ss_pred EeCCCceeCCC-ceecccCCCCC-CcEEc------CCCeEEccCCceecc
Q psy8484 60 ICCEGYEDDHG-SCRPVCERECV-FGSCT------SPNQCTCSPGYVVIN 101 (132)
Q Consensus 60 ~C~~Gy~~~~~-~C~~~C~~~C~-nG~C~------~p~~C~C~~G~~G~~ 101 (132)
-|.+|-.|.+. .|...=..+|. ||.|. ..+.|.|..||+|+.
T Consensus 131 CCp~gtyGpdCl~Cpggser~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~ 180 (350)
T KOG4260|consen 131 CCPDGTYGPDCLQCPGGSERPCFGNGSCHGDGSREGSGKCKCETGYTGPL 180 (350)
T ss_pred ccCCCCcCCccccCCCCCcCCcCCCCcccCCCCCCCCCcccccCCCCCcc
Confidence 36677777642 33221123565 57775 467899999999986
No 36
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=68.45 E-value=3.9 Score=29.71 Aligned_cols=23 Identities=39% Similarity=0.910 Sum_probs=17.6
Q ss_pred CCCCcEEc-----CCCeEEccCCceecc
Q psy8484 79 ECVFGSCT-----SPNQCTCSPGYVVIN 101 (132)
Q Consensus 79 ~C~nG~C~-----~p~~C~C~~G~~G~~ 101 (132)
=|.||.|. ....|.|..||.|..
T Consensus 52 YClHG~C~yI~dl~~~~CrC~~GYtGeR 79 (139)
T PHA03099 52 YCLHGDCIHARDIDGMYCRCSHGYTGIR 79 (139)
T ss_pred EeECCEEEeeccCCCceeECCCCccccc
Confidence 36678885 235699999999986
No 37
>KOG0994|consensus
Probab=63.89 E-value=5.7 Score=38.42 Aligned_cols=73 Identities=26% Similarity=0.634 Sum_probs=41.9
Q ss_pred EeCCCceeCC--Cce-ecccCCCCCC--cEEc-CCCeEEccCCceecc--CCCCCccc----cCCCCCC-C---C-EEcC
Q psy8484 60 ICCEGYEDDH--GSC-RPVCERECVF--GSCT-SPNQCTCSPGYVVIN--EASPNICE----PHCAECV-N---G-VCSA 122 (132)
Q Consensus 60 ~C~~Gy~~~~--~~C-~~~C~~~C~n--G~C~-~p~~C~C~~G~~G~~--~c~~~~C~----~~C~~C~-n---G-~C~~ 122 (132)
.|-+||.|.- -+| .=+|...=.| +.|- ..++|-|-+.-.|.. .|.++.|. .-|.+|+ + | .|..
T Consensus 1002 ~Ck~Gf~GdA~~q~CqrC~Cn~LGTn~~~~CDr~tGQCpClpNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~~~~pqCN~ 1081 (1758)
T KOG0994|consen 1002 HCKDGFYGDALRQNCQRCVCNFLGTNSTCHCDRFTGQCPCLPNVQGVRCDQCAENHWNLASGEGCEPCNCDPIGGPQCNE 1081 (1758)
T ss_pred hccccchhHHHHhhhhhheccccccCCccccccccCcCCCCcccccccccccccchhccccCCCCCccCCCccCCccccc
Confidence 4889998862 223 2222211112 3332 357888888888876 35555554 3344333 2 3 5654
Q ss_pred -CCceeCCCCC
Q psy8484 123 -PNTCDCLDVL 132 (132)
Q Consensus 123 -p~~C~C~~G~ 132 (132)
.|+|.|.+||
T Consensus 1082 ftGQCqCkpGf 1092 (1758)
T KOG0994|consen 1082 FTGQCQCKPGF 1092 (1758)
T ss_pred cccceeccCCC
Confidence 4899999997
No 38
>KOG1218|consensus
Probab=62.73 E-value=14 Score=28.89 Aligned_cols=48 Identities=25% Similarity=0.762 Sum_probs=30.8
Q ss_pred EEcCCC-eEEccCCceeccCCCC-CccccCCC-CC--CCCEEcCCCceeCCCCC
Q psy8484 84 SCTSPN-QCTCSPGYVVINEASP-NICEPHCA-EC--VNGVCSAPNTCDCLDVL 132 (132)
Q Consensus 84 ~C~~p~-~C~C~~G~~G~~~c~~-~~C~~~C~-~C--~nG~C~~p~~C~C~~G~ 132 (132)
+|..+. .|.+..+|.+.. |.. ..-+..|. .| ..+.....+.|.|.+||
T Consensus 118 ~C~~~~~~c~~~~~~~~~~-C~~~~~~g~~C~~~c~~~~~~~~~~~~c~c~~g~ 170 (316)
T KOG1218|consen 118 TCANPRRECRCGGGYIGEQ-CGEENLVGLKCQRDCQCTGGCDCKNGICTCQPGF 170 (316)
T ss_pred ccCCCccceecCCcCcccc-ccccCCCCCCccCCCCCccccCCCCCceeccCCc
Confidence 666666 488889998886 544 33344444 33 23444456888898886
No 39
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=59.34 E-value=9 Score=26.00 Aligned_cols=7 Identities=57% Similarity=1.981 Sum_probs=3.3
Q ss_pred EEccCCc
Q psy8484 91 CTCSPGY 97 (132)
Q Consensus 91 C~C~~G~ 97 (132)
|.|.+||
T Consensus 100 C~Cl~GF 106 (110)
T PF00954_consen 100 CSCLPGF 106 (110)
T ss_pred eECCCCc
Confidence 4444443
No 40
>KOG3607|consensus
Probab=56.12 E-value=7.8 Score=35.40 Aligned_cols=28 Identities=29% Similarity=0.799 Sum_probs=22.0
Q ss_pred cCCCCC-CcEEcCCCeEEccCCceeccCCC
Q psy8484 76 CERECV-FGSCTSPNQCTCSPGYVVINEAS 104 (132)
Q Consensus 76 C~~~C~-nG~C~~p~~C~C~~G~~G~~~c~ 104 (132)
|...|. ||+|.+-..|.|.+||.+++ |+
T Consensus 628 ~~~~C~g~GVCnn~~~ChC~~gwapp~-C~ 656 (716)
T KOG3607|consen 628 CPTTCNGHGVCNNELNCHCEPGWAPPF-CF 656 (716)
T ss_pred cccccCCCcccCCCcceeeCCCCCCCc-cc
Confidence 334576 69998888999999999987 44
No 41
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=53.49 E-value=8.7 Score=19.71 Aligned_cols=9 Identities=33% Similarity=0.741 Sum_probs=4.3
Q ss_pred ceEeCCCce
Q psy8484 58 VRICCEGYE 66 (132)
Q Consensus 58 ~c~C~~Gy~ 66 (132)
.|.|.+||+
T Consensus 3 ~C~C~~Gy~ 11 (24)
T PF12662_consen 3 TCSCPPGYQ 11 (24)
T ss_pred EeeCCCCCc
Confidence 344555554
No 42
>KOG1388|consensus
Probab=53.30 E-value=6.2 Score=30.94 Aligned_cols=45 Identities=24% Similarity=0.391 Sum_probs=34.4
Q ss_pred CceE-eCCCceeC--CCceecccCCCCCCcEEcCCCeEEc-cCCceecc
Q psy8484 57 HVRI-CCEGYEDD--HGSCRPVCERECVFGSCTSPNQCTC-SPGYVVIN 101 (132)
Q Consensus 57 ~~c~-C~~Gy~~~--~~~C~~~C~~~C~nG~C~~p~~C~C-~~G~~G~~ 101 (132)
.+|. |-.||.|. +..|+|..-.+..++.+..+++|.| ..|..|..
T Consensus 76 ~~c~kc~~g~~GdtN~g~c~~~~~~g~~~~~~~~~~~c~c~~kgvvgd~ 124 (217)
T KOG1388|consen 76 AHCEKCIVGFYGDTNGGKCQPCDCNGGASACVTLTGKCFCTTKGVVGDL 124 (217)
T ss_pred ccCCceEEEEEecCCCCccCHhhhcCCeeeeeccCCccccccceEeccc
Confidence 3454 88999994 4577777666666778888999999 58888876
No 43
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=46.13 E-value=27 Score=19.85 Aligned_cols=31 Identities=23% Similarity=0.567 Sum_probs=18.5
Q ss_pred cCCcceeEEEEEeeeeeeecCCCceEeCCCceeCCCce
Q psy8484 35 HCQKTRTAYSYKYKTEEYMEDTHVRICCEGYEDDHGSC 72 (132)
Q Consensus 35 ~C~~~r~~~~~~~~~~~~~~~~~~c~C~~Gy~~~~~~C 72 (132)
.||....++++..- +..+.|..||+..+..|
T Consensus 6 ~cP~NA~C~~~~dG-------~eecrCllgyk~~~~~C 36 (37)
T PF12946_consen 6 KCPANAGCFRYDDG-------SEECRCLLGYKKVGGKC 36 (37)
T ss_dssp ---TTEEEEEETTS-------EEEEEE-TTEEEETTEE
T ss_pred cCCCCcccEEcCCC-------CEEEEeeCCccccCCCc
Confidence 56777777766421 34588999998876665
No 44
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=41.52 E-value=29 Score=19.99 Aligned_cols=18 Identities=28% Similarity=0.761 Sum_probs=13.4
Q ss_pred EEcC-CCeEEccCCceecc
Q psy8484 84 SCTS-PNQCTCSPGYVVIN 101 (132)
Q Consensus 84 ~C~~-p~~C~C~~G~~G~~ 101 (132)
.|.. .++|.|.++|+|..
T Consensus 12 ~C~~~~G~C~C~~~~~G~~ 30 (46)
T smart00180 12 TCDPDTGQCECKPNVTGRR 30 (46)
T ss_pred cccCCCCEEECCCCCCCCC
Confidence 4443 47889999999976
No 45
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=40.16 E-value=14 Score=21.34 Aligned_cols=18 Identities=33% Similarity=0.859 Sum_probs=13.5
Q ss_pred EEcC-CCeEEccCCceecc
Q psy8484 84 SCTS-PNQCTCSPGYVVIN 101 (132)
Q Consensus 84 ~C~~-p~~C~C~~G~~G~~ 101 (132)
.|.. .++|.|.++|.|..
T Consensus 12 ~C~~~~G~C~C~~~~~G~~ 30 (49)
T PF00053_consen 12 TCDPSTGQCVCKPGTTGPR 30 (49)
T ss_dssp SEEETCEEESBSTTEESTT
T ss_pred cccCCCCEEeccccccCCc
Confidence 4432 57788999999987
No 46
>KOG0994|consensus
Probab=39.61 E-value=18 Score=35.30 Aligned_cols=44 Identities=25% Similarity=0.656 Sum_probs=25.8
Q ss_pred CCeEEccCCceecc--CCCCCcccc---CCC--CCC-CC----EEcC-CCceeCCCC
Q psy8484 88 PNQCTCSPGYVVIN--EASPNICEP---HCA--ECV-NG----VCSA-PNTCDCLDV 131 (132)
Q Consensus 88 p~~C~C~~G~~G~~--~c~~~~C~~---~C~--~C~-nG----~C~~-p~~C~C~~G 131 (132)
.++|.|.+||+|.. +|....|+. .|. .|. .| .|.. .|+|+|.+|
T Consensus 1083 tGQCqCkpGfGGR~C~qCqel~WGdP~~~C~aCdCd~rG~~tpQCdr~tG~C~C~~G 1139 (1758)
T KOG0994|consen 1083 TGQCQCKPGFGGRTCSQCQELYWGDPNEKCRACDCDPRGIETPQCDRATGRCVCRPG 1139 (1758)
T ss_pred ccceeccCCCCCcchhHHHHhhcCCCCCCceecCCCCCCCCCCCccccCCceeecCC
Confidence 57899999999976 333333431 122 333 23 3443 477888776
No 47
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=39.59 E-value=25 Score=20.51 Aligned_cols=17 Identities=29% Similarity=0.761 Sum_probs=13.3
Q ss_pred EcC-CCeEEccCCceecc
Q psy8484 85 CTS-PNQCTCSPGYVVIN 101 (132)
Q Consensus 85 C~~-p~~C~C~~G~~G~~ 101 (132)
|.. .++|.|.+||.|..
T Consensus 14 C~~~~G~C~C~~~~~G~~ 31 (50)
T cd00055 14 CDPGTGQCECKPNTTGRR 31 (50)
T ss_pred ccCCCCEEeCCCcCCCCC
Confidence 543 57789999999987
No 48
>PF09402 MSC: Man1-Src1p-C-terminal domain; InterPro: IPR018996 This entry represents the Inner nuclear membrane proteins MAN1 (also known as LEM domain-containing protein 3) and LEM domain-containing protein 2 (or LEM protein 2). Emerin and MAN1 are LEM domain-containing integral membrane proteins of the vertebrate nuclear envelope []. MAN1 is an integral protein of the inner nuclear membrane which binds to chromatin associated proteins and plays a role in nuclear organisation. The C-terminal nucleoplasmic region forms a DNA binding winged helix and binds to Smad []. LEM protein 2 is an essential protein involved in chromosome segregation and cell division, probably via its interaction with lmn-1, the main component of nuclear lamina. It has some overlapping function with emr-1.; GO: 0005639 integral to nuclear inner membrane; PDB: 2CH0_A.
Probab=37.88 E-value=11 Score=30.53 Aligned_cols=27 Identities=37% Similarity=0.917 Sum_probs=0.0
Q ss_pred ecccCCCCC-CcEEcCCCeEEccCCceec
Q psy8484 73 RPVCERECV-FGSCTSPNQCTCSPGYVVI 100 (132)
Q Consensus 73 ~~~C~~~C~-nG~C~~p~~C~C~~G~~G~ 100 (132)
.|.|. +|+ ||.|...-.-.|.+||.-.
T Consensus 49 ~P~C~-pCP~~a~C~~~~~~~C~~~y~~~ 76 (334)
T PF09402_consen 49 KPSCE-PCPEHAICYPGLKLECEPGYVLK 76 (334)
T ss_dssp -----------------------------
T ss_pred ccccc-ccccccccccccccccccccccc
Confidence 46676 798 9999986689999999765
No 49
>KOG0196|consensus
Probab=32.13 E-value=69 Score=30.22 Aligned_cols=12 Identities=33% Similarity=0.695 Sum_probs=5.7
Q ss_pred CCceEeCCCcee
Q psy8484 56 THVRICCEGYED 67 (132)
Q Consensus 56 ~~~c~C~~Gy~~ 67 (132)
-+.|.|.+||..
T Consensus 258 iG~C~C~aGye~ 269 (996)
T KOG0196|consen 258 IGGCVCKAGYEE 269 (996)
T ss_pred cCceeecCCCCc
Confidence 344555555543
No 50
>KOG3607|consensus
Probab=30.94 E-value=46 Score=30.54 Aligned_cols=19 Identities=32% Similarity=0.796 Sum_probs=16.4
Q ss_pred CCC-CCEEcCCCceeCCCCC
Q psy8484 114 ECV-NGVCSAPNTCDCLDVL 132 (132)
Q Consensus 114 ~C~-nG~C~~p~~C~C~~G~ 132 (132)
.|. ||.|.+...|.|.+||
T Consensus 631 ~C~g~GVCnn~~~ChC~~gw 650 (716)
T KOG3607|consen 631 TCNGHGVCNNELNCHCEPGW 650 (716)
T ss_pred ccCCCcccCCCcceeeCCCC
Confidence 354 8999999999999998
No 51
>PTZ00214 high cysteine membrane protein Group 4; Provisional
Probab=29.55 E-value=76 Score=29.51 Aligned_cols=36 Identities=31% Similarity=0.911 Sum_probs=17.8
Q ss_pred EeCCCceeCCCceecccC-CCCCCcEEcCCCeE-EccCCce
Q psy8484 60 ICCEGYEDDHGSCRPVCE-RECVFGSCTSPNQC-TCSPGYV 98 (132)
Q Consensus 60 ~C~~Gy~~~~~~C~~~C~-~~C~nG~C~~p~~C-~C~~G~~ 98 (132)
.|.+||...+..|.+ |. ..| ..|...+.| +|..+|.
T Consensus 619 ~C~~GYY~d~~~C~~-C~~~~C--~tC~~~~~C~~C~~~~~ 656 (800)
T PTZ00214 619 ACVDGYYADGDACLP-CATPGC--KTCGHASFCTECAGELF 656 (800)
T ss_pred cCCCCcccCCCcccc-CCcccc--ccccCCCCcCcCCCCce
Confidence 467777665544432 22 122 234445555 5666644
No 52
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=24.78 E-value=56 Score=22.69 Aligned_cols=19 Identities=37% Similarity=1.013 Sum_probs=12.8
Q ss_pred CCCC-CcEEcCC--------CeEEccCC
Q psy8484 78 RECV-FGSCTSP--------NQCTCSPG 96 (132)
Q Consensus 78 ~~C~-nG~C~~p--------~~C~C~~G 96 (132)
+.|. ||.|+.. ..|.|.+.
T Consensus 13 n~CsgHG~C~~~~~~~~~~C~~C~C~~T 40 (103)
T PF12955_consen 13 NNCSGHGSCVKKYGSGGGDCFACKCKPT 40 (103)
T ss_pred cCCCCCceEeeccCCCccceEEEEeecc
Confidence 4676 7888753 25888773
No 53
>PF05092 PIF: Per os infectivity; InterPro: IPR007784 This entry represents a group of dsDNA Baculovirus proteins. It is required for the infectivity of the OBs or occlusion bodies. It is a structural protein of the ODV envelope required only in the first steps of per os larva infection, as viruses being produced in cells expressing the gene for this protein but not containing it in their genomes are able to produce successful infections. Baculoviruses are large DNA viruses that infect arthropods, mainly members of the order Lepidoptera. In their life cycle, they produce two kinds of particles, a budded, non-occluded virus (BV), which buds out of the infected cell and is responsible for the cell-to-cell transmission of the virus, and an occluded form, the occlusion body (OB), which is responsible for protecting the virus between encounters with larvae. A variable number of virions are included in the para-crystalline structure of the OB, mainly constituted by the virus-encoded polyhedrin protein; these virions are called occlusion body-derived virions or ODVs [].
Probab=24.54 E-value=78 Score=28.04 Aligned_cols=41 Identities=34% Similarity=0.712 Sum_probs=24.8
Q ss_pred CceEe-CCCceeC---CCce-ecccCCCCC-CcEEc----CCCeEEccCCceec
Q psy8484 57 HVRIC-CEGYEDD---HGSC-RPVCERECV-FGSCT----SPNQCTCSPGYVVI 100 (132)
Q Consensus 57 ~~c~C-~~Gy~~~---~~~C-~~~C~~~C~-nG~C~----~p~~C~C~~G~~G~ 100 (132)
..|.| -||+.+. -..| .|+ +|+ ||.=. .|=+|.|+.||...
T Consensus 132 LlCsC~~PGlVtqlniy~DC~vpV---GC~PhG~I~din~~pi~C~Cd~GyVsd 182 (522)
T PF05092_consen 132 LLCSCLRPGLVTQLNIYEDCDVPV---GCQPHGRIADINESPIRCVCDDGYVSD 182 (522)
T ss_pred EEEEcCCCCeEeeeehhccCCCcE---ecCCCCEEeeecCCceEeECCCCcccc
Confidence 35666 5777664 2345 454 676 67543 35568888887765
No 54
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=23.91 E-value=85 Score=26.49 Aligned_cols=37 Identities=43% Similarity=1.212 Sum_probs=22.2
Q ss_pred EeCCCceeCCCceecccCCCCCCcEEc--CCCeE-EccCCcee
Q psy8484 60 ICCEGYEDDHGSCRPVCERECVFGSCT--SPNQC-TCSPGYVV 99 (132)
Q Consensus 60 ~C~~Gy~~~~~~C~~~C~~~C~nG~C~--~p~~C-~C~~G~~G 99 (132)
.|.+||...+..|.+ |...|. .|. .++.| .|.+||.-
T Consensus 95 ~C~~G~y~~~~~C~~-C~~~C~--~C~~~~~~~Ct~C~~g~~L 134 (397)
T PF03302_consen 95 ECPDGYYKNGNKCVP-CHESCA--TCSGGAPNQCTSCKPGKVL 134 (397)
T ss_pred CCCCCccccCCCCCC-CCcccc--ccCCCCCCCCcccCCCccc
Confidence 688999877666543 333332 232 24566 68888754
Done!