Query 017103
Match_columns 377
No_of_seqs 436 out of 2549
Neff 7.4
Searched_HMMs 46136
Date Fri Mar 29 05:38:16 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/017103.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/017103hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1225 Teneurin-1 and related 99.5 3E-14 6.4E-19 144.8 9.6 130 97-252 236-365 (525)
2 KOG1225 Teneurin-1 and related 99.4 2.3E-13 4.9E-18 138.4 7.8 102 94-217 264-365 (525)
3 PTZ00337 surface protease GP63 99.2 1.8E-11 4E-16 126.6 8.1 110 2-120 415-528 (567)
4 PTZ00257 Glycoprotein GP63 (le 99.1 1E-10 2.3E-15 120.8 8.1 118 2-120 441-568 (622)
5 PF01457 Peptidase_M8: Leishma 99.0 2.3E-11 5E-16 126.2 -1.8 116 2-117 395-519 (521)
6 KOG1219 Uncharacterized conser 99.0 4.7E-10 1E-14 125.1 6.0 102 68-186 3865-3977(4289)
7 KOG1226 Integrin beta subunit 99.0 1E-09 2.3E-14 114.0 8.2 124 88-222 474-623 (783)
8 KOG1219 Uncharacterized conser 98.8 5.7E-09 1.2E-13 116.9 7.2 92 128-219 3869-3977(4289)
9 KOG2556 Leishmanolysin-like pe 98.8 1.9E-09 4.2E-14 106.1 1.1 112 3-117 491-624 (666)
10 KOG4289 Cadherin EGF LAG seven 98.7 8.6E-09 1.9E-13 111.8 2.8 112 66-188 1178-1318(2531)
11 KOG1226 Integrin beta subunit 98.6 9.2E-08 2E-12 99.8 9.4 98 95-202 525-636 (783)
12 KOG4289 Cadherin EGF LAG seven 98.6 3.4E-08 7.5E-13 107.3 4.5 81 141-221 1223-1318(2531)
13 KOG1217 Fibrillins and related 98.2 3.2E-05 6.9E-10 78.2 15.4 233 56-311 115-370 (487)
14 KOG1217 Fibrillins and related 97.9 0.00037 8E-09 70.4 15.0 124 54-185 155-306 (487)
15 PF07974 EGF_2: EGF-like domai 97.4 0.00017 3.8E-09 45.9 2.8 24 130-153 7-32 (32)
16 PF00008 EGF: EGF-like domain 97.1 0.00038 8.2E-09 44.3 2.4 31 70-106 1-31 (32)
17 PF07974 EGF_2: EGF-like domai 96.9 0.0007 1.5E-08 43.1 2.5 26 160-185 6-32 (32)
18 KOG1214 Nidogen and related ba 96.9 0.0041 9E-08 66.0 8.7 131 42-186 701-864 (1289)
19 smart00051 DSL delta serrate l 96.7 0.0018 3.8E-08 48.0 3.2 42 143-185 20-63 (63)
20 KOG1214 Nidogen and related ba 96.2 0.015 3.2E-07 62.1 7.8 132 73-218 700-858 (1289)
21 KOG4260 Uncharacterized conser 96.1 0.0082 1.8E-07 56.3 4.8 105 99-217 132-268 (350)
22 PF00008 EGF: EGF-like domain 96.0 0.0022 4.7E-08 40.8 0.5 23 129-151 4-31 (32)
23 KOG0994 Extracellular matrix g 96.0 0.033 7.1E-07 61.3 9.2 30 265-294 1069-1098(1758)
24 smart00051 DSL delta serrate l 95.8 0.0086 1.9E-07 44.3 2.9 42 99-153 21-63 (63)
25 PF12661 hEGF: Human growth fa 95.3 0.0059 1.3E-07 30.8 0.4 12 142-153 2-13 (13)
26 KOG0994 Extracellular matrix g 95.1 0.11 2.3E-06 57.5 9.4 120 57-187 892-1052(1758)
27 smart00179 EGF_CA Calcium-bind 94.7 0.04 8.6E-07 35.7 3.3 33 67-106 2-36 (39)
28 PF12661 hEGF: Human growth fa 94.7 0.0095 2.1E-07 30.1 0.1 13 173-185 1-13 (13)
29 smart00179 EGF_CA Calcium-bind 93.6 0.078 1.7E-06 34.3 3.0 13 141-153 25-38 (39)
30 KOG4260 Uncharacterized conser 93.6 0.11 2.4E-06 49.0 4.8 42 144-187 132-183 (350)
31 cd00054 EGF_CA Calcium-binding 92.9 0.14 3.1E-06 32.5 3.3 35 67-109 2-37 (38)
32 cd00054 EGF_CA Calcium-binding 91.6 0.21 4.6E-06 31.7 2.9 24 130-153 10-37 (38)
33 PF01414 DSL: Delta serrate li 90.8 0.048 1E-06 40.3 -0.9 44 141-185 18-63 (63)
34 cd00053 EGF Epidermal growth f 89.9 0.35 7.6E-06 30.0 2.8 25 129-153 6-35 (36)
35 KOG1218 Proteins containing Ca 89.7 1.2 2.5E-05 42.9 7.5 39 170-208 160-201 (316)
36 cd00053 EGF Epidermal growth f 89.7 0.33 7.1E-06 30.2 2.5 26 160-185 6-35 (36)
37 PF07645 EGF_CA: Calcium-bindi 88.8 0.54 1.2E-05 31.5 3.1 32 66-104 1-34 (42)
38 KOG1218 Proteins containing Ca 87.3 17 0.00038 34.7 13.9 102 131-234 81-191 (316)
39 smart00181 EGF Epidermal growt 86.7 0.69 1.5E-05 29.1 2.6 10 141-150 21-30 (35)
40 PF07645 EGF_CA: Calcium-bindi 84.7 0.54 1.2E-05 31.5 1.4 29 113-149 2-34 (42)
41 PF01414 DSL: Delta serrate li 83.6 0.22 4.8E-06 36.8 -1.0 42 99-153 21-63 (63)
42 KOG1836 Extracellular matrix g 81.2 1.9 4.1E-05 50.9 4.7 86 97-187 697-813 (1705)
43 cd00055 EGF_Lam Laminin-type e 80.7 1.3 2.8E-05 30.9 2.1 16 139-154 18-33 (50)
44 smart00181 EGF Epidermal growt 78.5 2.9 6.2E-05 26.2 3.1 29 70-106 2-32 (35)
45 PF00053 Laminin_EGF: Laminin 77.5 1.4 3E-05 30.4 1.4 20 136-155 12-33 (49)
46 PHA02887 EGF-like protein; Pro 76.4 1.9 4.1E-05 35.7 2.1 25 130-155 93-123 (126)
47 PHA02887 EGF-like protein; Pro 75.3 2 4.3E-05 35.5 1.9 25 162-187 94-123 (126)
48 KOG3607 Meltrins, fertilins an 70.2 4.3 9.3E-05 44.1 3.6 53 131-188 606-658 (716)
49 PF12947 EGF_3: EGF domain; I 67.3 4.1 8.9E-05 26.5 1.7 27 73-106 6-32 (36)
50 cd00055 EGF_Lam Laminin-type e 67.2 4.7 0.0001 28.0 2.1 19 168-186 14-33 (50)
51 KOG1836 Extracellular matrix g 67.1 81 0.0018 37.9 13.1 52 56-114 762-816 (1705)
52 PF00053 Laminin_EGF: Laminin 64.6 2.9 6.2E-05 28.8 0.6 21 167-187 12-33 (49)
53 PHA03099 epidermal growth fact 64.5 4.6 0.0001 34.0 1.9 17 171-187 66-82 (139)
54 PF01683 EB: EB module; Inter 63.7 7.8 0.00017 26.9 2.8 20 130-149 27-46 (52)
55 PHA03099 epidermal growth fact 61.0 6.6 0.00014 33.1 2.2 26 130-156 52-83 (139)
56 KOG3607 Meltrins, fertilins an 57.1 6.6 0.00014 42.7 2.1 32 125-156 626-658 (716)
57 PF06247 Plasmod_Pvs28: Plasmo 56.3 2 4.3E-05 38.6 -1.7 103 62-183 34-162 (197)
58 PF04863 EGF_alliinase: Alliin 56.2 5.7 0.00012 28.4 0.9 26 130-155 18-51 (56)
59 KOG3516 Neurexin IV [Signal tr 55.2 7.4 0.00016 44.0 2.0 42 62-111 540-582 (1306)
60 smart00180 EGF_Lam Laminin-typ 53.9 8.5 0.00018 26.3 1.5 17 139-155 17-33 (46)
61 PF09064 Tme5_EGF_like: Thromb 46.5 19 0.0004 23.2 2.0 25 153-181 3-27 (34)
62 PF12955 DUF3844: Domain of un 39.0 18 0.0004 29.4 1.5 19 128-146 12-39 (103)
63 PF12662 cEGF: Complement Clr- 31.8 21 0.00045 21.1 0.5 10 172-181 2-11 (24)
64 PF08247 ENOD40: ENOD40 protei 29.5 9.5 0.00021 18.6 -0.9 12 350-361 1-12 (12)
65 KOG3516 Neurexin IV [Signal tr 28.4 37 0.0008 38.7 2.1 33 157-189 547-584 (1306)
66 KOG3514 Neurexin III-alpha [Si 27.4 34 0.00073 38.7 1.6 34 70-111 626-660 (1591)
67 PF12946 EGF_MSP1_1: MSP1 EGF 23.8 77 0.0017 20.9 2.2 31 70-106 2-32 (37)
68 PF00954 S_locus_glycop: S-loc 20.9 96 0.0021 24.9 2.8 26 125-150 78-108 (110)
No 1
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=99.52 E-value=3e-14 Score=144.83 Aligned_cols=130 Identities=39% Similarity=0.855 Sum_probs=107.6
Q ss_pred CccCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeeeCCccccCCCCcCCCCCCCCCCCCCCCCceecCCCcccc
Q 017103 97 PVQFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCVDGKCHCFLGFHGHDCSKRSCPDNCNGHGKCLSNGACEC 176 (377)
Q Consensus 97 c~C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~C~~~~C~~~C~~~G~C~~~~~C~C 176 (377)
+.|+.+|+|+ .|.. ..||..|.++|.|++|.|+|++||+|.+|++..|+..|++|+.++ .++|+|
T Consensus 236 c~c~~~~~g~--------~c~~------~~C~~~c~~~g~c~~G~CIC~~Gf~G~dC~e~~Cp~~cs~~g~~~-~g~CiC 300 (525)
T KOG1225|consen 236 CECPEGYFGP--------LCST------IYCPGGCTGRGQCVEGRCICPPGFTGDDCDELVCPVDCSGGGVCV-DGECIC 300 (525)
T ss_pred eecCCceeCC--------cccc------ccCCCCCcccceEeCCeEeCCCCCcCCCCCcccCCcccCCCceec-CCEeec
Confidence 3455777777 5663 688899999999999999999999999999989988899999998 459999
Q ss_pred CCCCCCCCCCCCCCCcccCCCCCeeecCCCceEecCCCCCCCCCCCcCcCCCCccccCcccCCCCcccCCCCCccc
Q 017103 177 ENGYTGIDCSTAVCDEQCSLHGGVCDNGVCEFRCSDYAGYTCQNSSKLISSLSVCKYVLEKDAGGQHCAPSESSIL 252 (377)
Q Consensus 177 ~~G~~G~~C~~~~C~~~c~~~Gg~C~~~~~~~~C~~~~G~~C~~~~~~c~~~~~C~~~~~~~~~~~~C~~~~~~~~ 252 (377)
++||+|.+|++..|...|..|| .|+++.+.+. ++|+|..|+.. . |.+.+-|.+ ++.|..+|.+..
T Consensus 301 ~~g~~G~dCs~~~cpadC~g~G-~Ci~G~C~C~-~Gy~G~~C~~~-~-C~~~g~cv~-------gC~C~~Gw~G~d 365 (525)
T KOG1225|consen 301 NPGYSGKDCSIRRCPADCSGHG-KCIDGECLCD-EGYTGELCIQR-A-CSGGGQCVN-------GCKCKKGWRGPD 365 (525)
T ss_pred CCCccccccccccCCccCCCCC-cccCCceEeC-CCCcCCccccc-c-cCCCceecc-------CceeccCccCCC
Confidence 9999999999999999998887 9996665444 78999999987 2 655555554 377888887665
No 2
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=99.43 E-value=2.3e-13 Score=138.45 Aligned_cols=102 Identities=45% Similarity=1.087 Sum_probs=90.2
Q ss_pred CCCCccCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeeeCCccccCCCCcCCCCCCCCCCCCCCCCceecCCCc
Q 017103 94 AGGPVQFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCVDGKCHCFLGFHGHDCSKRSCPDNCNGHGKCLSNGA 173 (377)
Q Consensus 94 ~g~c~C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~C~~~~C~~~C~~~G~C~~~~~ 173 (377)
+|+|||++||+|. .|+. ..||..|++++.+++++|+|++||+|.+|++..|+..|.+||.|+ .++
T Consensus 264 ~G~CIC~~Gf~G~--------dC~e------~~Cp~~cs~~g~~~~g~CiC~~g~~G~dCs~~~cpadC~g~G~Ci-~G~ 328 (525)
T KOG1225|consen 264 EGRCICPPGFTGD--------DCDE------LVCPVDCSGGGVCVDGECICNPGYSGKDCSIRRCPADCSGHGKCI-DGE 328 (525)
T ss_pred CCeEeCCCCCcCC--------CCCc------ccCCcccCCCceecCCEeecCCCccccccccccCCccCCCCCccc-CCc
Confidence 5778899999999 6762 578888999999999999999999999999999999999999999 999
Q ss_pred cccCCCCCCCCCCCCCCCcccCCCCCeeecCCCceEecCCCCCC
Q 017103 174 CECENGYTGIDCSTAVCDEQCSLHGGVCDNGVCEFRCSDYAGYT 217 (377)
Q Consensus 174 C~C~~G~~G~~C~~~~C~~~c~~~Gg~C~~~~~~~~C~~~~G~~ 217 (377)
|+|.+||+|..|++. .| .|++.|+++ +.+. .+|.|.+
T Consensus 329 C~C~~Gy~G~~C~~~----~C-~~~g~cv~g-C~C~-~Gw~G~d 365 (525)
T KOG1225|consen 329 CLCDEGYTGELCIQR----AC-SGGGQCVNG-CKCK-KGWRGPD 365 (525)
T ss_pred eEeCCCCcCCccccc----cc-CCCceeccC-ceec-cCccCCC
Confidence 999999999999987 24 566699988 6555 7899988
No 3
>PTZ00337 surface protease GP63; Provisional
Probab=99.23 E-value=1.8e-11 Score=126.58 Aligned_cols=110 Identities=34% Similarity=0.717 Sum_probs=88.6
Q ss_pred CCCCCCCCCCCCCCCceeecCCCcccCCCCCCCccccCCceeCCCCcccCCcccccceecCCCCCCCCCcccCcCCCCEE
Q 017103 2 LAIAGGQSSLADYCTYFVAYSDGSCTDTNSARAPDRMLGEVRGSNSRCMASSLVRTGFVRGSMTQGNGCYQHRCVNNSLE 81 (377)
Q Consensus 2 ~~~~Gg~~~l~D~Cp~~~~~~~~~C~d~~~~~~~~~~~g~~~G~nsrC~~~~l~~~g~~g~~c~~~d~C~~~pC~ngg~C 81 (377)
+|++||.+.|||||||+++|+++.|.|.+.. .+.|+++|.+||||++.+.+. . .-.+++.|+++.|.++.+-
T Consensus 415 ~p~~GG~~~~~D~CP~~~~~~~~~C~~~~~~----~~~gs~~G~~SrC~~g~~~~~--~--~~~~~~~C~~v~C~~~~~~ 486 (567)
T PTZ00337 415 DPRVGGDDLYMSRCPYVKAYSNGGCTNGDPS----VMPGSVVGPNSRCVKGQDLQF--D--DEYIGDVCVDTRCGDGTLS 486 (567)
T ss_pred CcccCCCccccccCceEeecCCCcCCCCCcc----cCCCceeCCCCCCcCCCCCcc--c--CcccCCEEEEEEcCCCeEE
Confidence 6899999999999999999999999986543 567999999999999876431 1 1235678999999998777
Q ss_pred Ee--eCCceeecCCCCCCccCCC--ceeeecCCCCcccCCCCC
Q 017103 82 VA--VDGIWKVCPEAGGPVQFPG--FNGELICPAYHELCSTGP 120 (377)
Q Consensus 82 v~--~~~~~~~C~~~g~c~C~~G--~~G~i~C~~~~~~C~~~~ 120 (377)
|. .++.|+.|++ |+.+.+++ |.|.|+||.+.+.|...+
T Consensus 487 V~~~g~~~w~~C~~-g~~i~~~~~~~~G~I~CP~y~evC~~~~ 528 (567)
T PTZ00337 487 VRFLDDDAWHECQE-GETVTPPSGPWRGSIVCPQYADVCTAFP 528 (567)
T ss_pred EEEEcCCceEECCC-CCEEeccCCccceEEECcCcccccccCC
Confidence 75 4678999985 66666544 679999999999998554
No 4
>PTZ00257 Glycoprotein GP63 (leishmanolysin); Provisional
Probab=99.13 E-value=1e-10 Score=120.78 Aligned_cols=118 Identities=25% Similarity=0.508 Sum_probs=88.3
Q ss_pred CCCCCCCCCCCCCCCceeecCCCcccCCCCCCCccccCCceeCCCCcccCCcccccceecCCCCCCCCCcccCcCCC--C
Q 017103 2 LAIAGGQSSLADYCTYFVAYSDGSCTDTNSARAPDRMLGEVRGSNSRCMASSLVRTGFVRGSMTQGNGCYQHRCVNN--S 79 (377)
Q Consensus 2 ~~~~Gg~~~l~D~Cp~~~~~~~~~C~d~~~~~~~~~~~g~~~G~nsrC~~~~l~~~g~~g~~c~~~d~C~~~pC~ng--g 79 (377)
+|++||.+.|||||||+++|+++.|.+..+...+....+++||.+||||.+.+.+.......-..++.|+++.|..+ .
T Consensus 441 ~~~~GG~~~~~DyCP~i~~ysn~~C~~d~s~~~~~~~~~s~~G~~SrC~~g~~~~~~~~~~~~~~~~~C~~v~C~~~~~t 520 (622)
T PTZ00257 441 NAFLGGFSAFLDYCPFIVGYSNGACNQDPSTASPSLKEFNVFSDAARCLDGVFQPRNSNARSEPNNALCANVMCDTAART 520 (622)
T ss_pred CCCcCCCcccCCcCCEEeecCCCcccCCcccCCcccccccccCCCceeecCcccccCCcCCCCccCCEEEEEECCCCCCE
Confidence 58899999999999999999999998633333334456689999999999876554444333455789999999984 3
Q ss_pred EEEe--eCCceeecCCCCCCccCC----Cc-e-eeecCCCCcccCCCCC
Q 017103 80 LEVA--VDGIWKVCPEAGGPVQFP----GF-N-GELICPAYHELCSTGP 120 (377)
Q Consensus 80 ~Cv~--~~~~~~~C~~~g~c~C~~----G~-~-G~i~C~~~~~~C~~~~ 120 (377)
.-|+ .+..|+.|| +|+.+.+. +| . |.|+||.|.+.|...+
T Consensus 521 ~sV~v~G~~~w~~Cp-~G~~I~~~~~~~~f~~gg~I~CP~y~eVC~~~~ 568 (622)
T PTZ00257 521 YSVQVRGSSGYVACT-PGESIDLATLSAAFVEGSYITCPPYVEVCQANI 568 (622)
T ss_pred EEEEEEeCCCEEECC-CCCeEccCCCCccccCCCEEECcCcccccccCc
Confidence 4454 356789998 66777753 45 3 3699999999998543
No 5
>PF01457 Peptidase_M8: Leishmanolysin This Prosite motif covers only the active site. This is family M8 in the peptidase classification. ; InterPro: IPR001577 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belong to the MEROPS peptidase family M8 (leishmanolysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA. Leishmanolysin is an enzyme found in the eukaryotes including Leishmania and related parasitic protozoa. The endopeptidase is the most abundant protein on the cell surface during the promastigote stage of the parasite, and is attached to the membrane by a glycosylphosphatidylinositol anchor. In the amastigote form, the parasite lives in lysosomes of host macrophages, producing a form of the protease that has an acidic pH optimum. This differs from most other metalloproteases and may be an adaptation to the environment in which the organism survives.; GO: 0004222 metalloendopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0007155 cell adhesion, 0016020 membrane; PDB: 1LML_A.
Probab=99.03 E-value=2.3e-11 Score=126.21 Aligned_cols=116 Identities=41% Similarity=0.842 Sum_probs=69.1
Q ss_pred CCCCCCCCCCCCCCCceeecCCCcccCCCCCCCccccCCceeCCCCcccCCcccccceecCCCCCCCCCcccCcCCCC--
Q 017103 2 LAIAGGQSSLADYCTYFVAYSDGSCTDTNSARAPDRMLGEVRGSNSRCMASSLVRTGFVRGSMTQGNGCYQHRCVNNS-- 79 (377)
Q Consensus 2 ~~~~Gg~~~l~D~Cp~~~~~~~~~C~d~~~~~~~~~~~g~~~G~nsrC~~~~l~~~g~~g~~c~~~d~C~~~pC~ngg-- 79 (377)
++++||.+.+|||||++++|+++.|.+..+...+..+.+++||.+||||++.+......+..-..+..|+++.|.++.
T Consensus 395 ~~~~gG~~~~~DYCP~i~~~sn~~C~~~~s~~~~~~~~ge~fG~~SrCf~~~~~~~~~~~~~~~~~~~C~~~~C~~~~~~ 474 (521)
T PF01457_consen 395 DPSYGGSSEFMDYCPYIQPYSNSYCTDPSSNADPNNMYGEVFGPNSRCFDSTLLKKNSSGPISSYGAGCYQVKCSNDTST 474 (521)
T ss_dssp STTEE-S-TTTTT---EEEEEEEETTS-GGGS-TTTGGG---STTEEEEEEEE-B---------EEEEEEEEEEETTTTE
T ss_pred ccccCCCCcccccceEEeccccCccCCCcccCchhhccccccCCCceeeecccccccCCCcccccCCeEeccCCCCCCcE
Confidence 578999999999999999999999999855555667889999999999998765443332222345789999999998
Q ss_pred EEEeeC--CceeecCCCCCC---ccCCCce--eeecCCCCcccCC
Q 017103 80 LEVAVD--GIWKVCPEAGGP---VQFPGFN--GELICPAYHELCS 117 (377)
Q Consensus 80 ~Cv~~~--~~~~~C~~~g~c---~C~~G~~--G~i~C~~~~~~C~ 117 (377)
+-|.+. +.|+.|++.++- .-..+|. |.|+||.+.+.|.
T Consensus 475 L~V~v~g~~~~~~C~~G~~I~~~~~~~~~~~~G~I~CP~~~evC~ 519 (521)
T PF01457_consen 475 LQVQVLGSSSWVPCPPGGQIEVSTTSNGFSYGGSIICPPYEEVCQ 519 (521)
T ss_dssp EEEE-TT-SS-EE--TT-EEEGGGT-SSB-TT-EEE---HHHHHT
T ss_pred EEEEEeccceEEECCCCCEEEeeecCCCcccCCEEECcCcHHHhc
Confidence 777764 468899976541 2236775 8999999887776
No 6
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.98 E-value=4.7e-10 Score=125.12 Aligned_cols=102 Identities=25% Similarity=0.660 Sum_probs=86.7
Q ss_pred CCCcccCcCCCCEEEeeCCceeecCCCCCCccCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeeeC----Cccc
Q 017103 68 NGCYQHRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCVD----GKCH 143 (377)
Q Consensus 68 d~C~~~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~~----g~C~ 143 (377)
+.|..+||+|+|+|+..+...+.| .|++-|.|. .||...+.|. +++|..+|+|+. ..|.
T Consensus 3865 d~C~~npCqhgG~C~~~~~ggy~C------kCpsqysG~-~CEi~~epC~----------snPC~~GgtCip~~n~f~Cn 3927 (4289)
T KOG1219|consen 3865 DPCNDNPCQHGGTCISQPKGGYKC------KCPSQYSGN-HCEIDLEPCA----------SNPCLTGGTCIPFYNGFLCN 3927 (4289)
T ss_pred cccccCcccCCCEecCCCCCceEE------eCcccccCc-cccccccccc----------CCCCCCCCEEEecCCCeeEe
Confidence 899999999999998755444444 455999999 8999888998 567999999983 3999
Q ss_pred cCCCCcCCCCCCC---CC-CCCCCCCceecC---CCccccCCCCCCCCCC
Q 017103 144 CFLGFHGHDCSKR---SC-PDNCNGHGKCLS---NGACECENGYTGIDCS 186 (377)
Q Consensus 144 C~~G~~G~~C~~~---~C-~~~C~~~G~C~~---~~~C~C~~G~~G~~C~ 186 (377)
|+.||+|..|+.. +| .++|.++|.|++ .+.|.|.+||.|..|.
T Consensus 3928 C~~gyTG~~Ce~~Gi~eCs~n~C~~gg~C~n~~gsf~CncT~g~~gr~c~ 3977 (4289)
T KOG1219|consen 3928 CPNGYTGKRCEARGISECSKNVCGTGGQCINIPGSFHCNCTPGILGRTCC 3977 (4289)
T ss_pred CCCCccCceeecccccccccccccCCceeeccCCceEeccChhHhcccCc
Confidence 9999999999964 59 578999999983 5799999999999985
No 7
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=98.98 E-value=1e-09 Score=113.98 Aligned_cols=124 Identities=35% Similarity=0.805 Sum_probs=92.5
Q ss_pred eeecCCCCCCccCCCceeeecCCCCcccCCCCCc---ccccCC-----CCCCCCCCeeeCCccccCCCCc----CCCCCC
Q 017103 88 WKVCPEAGGPVQFPGFNGELICPAYHELCSTGPI---AVFGQC-----PNSCTFNGDCVDGKCHCFLGFH----GHDCSK 155 (377)
Q Consensus 88 ~~~C~~~g~c~C~~G~~G~i~C~~~~~~C~~~~~---~~~~~C-----~~~C~~~G~C~~g~C~C~~G~~----G~~C~~ 155 (377)
...| |.|.|.+||.|. .|+ |..... .....| ..+|+++|.|+-|+|.|.+... |.+|+-
T Consensus 474 ~~~C---G~C~C~~G~~G~-~CE-----C~~~~~ss~~~~~~Cr~~~~~~vCSgrG~C~CGqC~C~~~~~~~i~G~fCEC 544 (783)
T KOG1226|consen 474 TFVC---GQCRCDEGWLGK-KCE-----CSTDELSSSEEEDKCRENSDSPVCSGRGDCVCGQCVCHKPDNGKIYGKFCEC 544 (783)
T ss_pred cEEe---cceecCCCCCCC-ccc-----CCccccCcHhHHhhccCCCCCCCcCCCCcEeCCceEecCCCCCceeeeeeec
Confidence 3456 678899999999 774 332221 112444 2389999999999999988766 999874
Q ss_pred --CCCCC----CCCCCceecCCCccccCCCCCCCCCCCC----CCC----cccCCCCCeeecCCCceEecCCCCCCCCCC
Q 017103 156 --RSCPD----NCNGHGKCLSNGACECENGYTGIDCSTA----VCD----EQCSLHGGVCDNGVCEFRCSDYAGYTCQNS 221 (377)
Q Consensus 156 --~~C~~----~C~~~G~C~~~~~C~C~~G~~G~~C~~~----~C~----~~c~~~Gg~C~~~~~~~~C~~~~G~~C~~~ 221 (377)
..|+. .|.+||+|. -++|.|.+||+|..|+-+ .|. ..|+++| +|.-+.+.|.=+.|.|..||.+
T Consensus 545 DnfsC~r~~g~lC~g~G~C~-CG~CvC~~GwtG~~C~C~~std~C~~~~G~iCSGrG-~C~Cg~C~C~~~~~sG~~CE~c 622 (783)
T KOG1226|consen 545 DNFSCERHKGVLCGGHGRCE-CGRCVCNPGWTGSACNCPLSTDTCESSDGQICSGRG-TCECGRCKCTDPPYSGEFCEKC 622 (783)
T ss_pred cCcccccccCcccCCCCeEe-CCcEEcCCCCccCCCCCCCCCccccCCCCceeCCCc-eeeCCceEcCCCCcCcchhhcC
Confidence 45753 399999997 899999999999999743 463 3577886 8888776554234899999987
Q ss_pred C
Q 017103 222 S 222 (377)
Q Consensus 222 ~ 222 (377)
.
T Consensus 623 p 623 (783)
T KOG1226|consen 623 P 623 (783)
T ss_pred C
Confidence 4
No 8
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.82 E-value=5.7e-09 Score=116.85 Aligned_cols=92 Identities=34% Similarity=0.813 Sum_probs=79.1
Q ss_pred CCCCCCCCeeeC-----CccccCCCCcCCCCCCC--CC-CCCCCCCceec---CCCccccCCCCCCCCCCCC---CCCcc
Q 017103 128 PNSCTFNGDCVD-----GKCHCFLGFHGHDCSKR--SC-PDNCNGHGKCL---SNGACECENGYTGIDCSTA---VCDEQ 193 (377)
Q Consensus 128 ~~~C~~~G~C~~-----g~C~C~~G~~G~~C~~~--~C-~~~C~~~G~C~---~~~~C~C~~G~~G~~C~~~---~C~~~ 193 (377)
.++|+++|+|+. ..|.|++.|+|.+||+. .| ++||..+|+|+ +.+.|.|+.||+|.+|+.. .|..+
T Consensus 3869 ~npCqhgG~C~~~~~ggy~CkCpsqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC~~gyTG~~Ce~~Gi~eCs~n 3948 (4289)
T KOG1219|consen 3869 DNPCQHGGTCISQPKGGYKCKCPSQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNCPNGYTGKRCEARGISECSKN 3948 (4289)
T ss_pred cCcccCCCEecCCCCCceEEeCcccccCcccccccccccCCCCCCCCEEEecCCCeeEeCCCCccCceeecccccccccc
Confidence 457999999983 39999999999999974 58 89999999998 5789999999999999965 59988
Q ss_pred cCCCCCeeecCCCceEecC---CCCCCCC
Q 017103 194 CSLHGGVCDNGVCEFRCSD---YAGYTCQ 219 (377)
Q Consensus 194 c~~~Gg~C~~~~~~~~C~~---~~G~~C~ 219 (377)
.|++||.|++..++|.|.| +.|.+|.
T Consensus 3949 ~C~~gg~C~n~~gsf~CncT~g~~gr~c~ 3977 (4289)
T KOG1219|consen 3949 VCGTGGQCINIPGSFHCNCTPGILGRTCC 3977 (4289)
T ss_pred cccCCceeeccCCceEeccChhHhcccCc
Confidence 8899999999999898876 5555553
No 9
>KOG2556 consensus Leishmanolysin-like peptidase (Peptidase M8 family) [Cell wall/membrane/envelope biogenesis; Defense mechanisms]
Probab=98.76 E-value=1.9e-09 Score=106.10 Aligned_cols=112 Identities=41% Similarity=0.758 Sum_probs=89.4
Q ss_pred CCCCCCCCCCCCCCceeecCCC---------cccCCCCCCCccccCCceeCCCCcccCCcccccceecCCCC-------C
Q 017103 3 AIAGGQSSLADYCTYFVAYSDG---------SCTDTNSARAPDRMLGEVRGSNSRCMASSLVRTGFVRGSMT-------Q 66 (377)
Q Consensus 3 ~~~Gg~~~l~D~Cp~~~~~~~~---------~C~d~~~~~~~~~~~g~~~G~nsrC~~~~l~~~g~~g~~c~-------~ 66 (377)
+.+||+.++||||||+++|+-+ .|++++|++.+..|.+|+||.+|+||+-.+- |+-..|. -
T Consensus 491 kYYGGsvelADYCpf~qefsw~id~t~hkdS~C~~~~n~~e~~~~~~EvyG~~S~cf~~~~~---~~~~kC~r~~v~~~y 567 (666)
T KOG2556|consen 491 KYYGGSVELADYCPFFQEFSWGIDKTQHKDSSCTDINNAREPDRMLGEVYGSESRCFNLTLP---WSLNKCKRIRVLSHY 567 (666)
T ss_pred cccCCceehhhcchHHHHhhcccCcccccCCcceecccCcchhhhhhhccccccceeeccch---HHHHhhcceeccccc
Confidence 3569999999999999999966 9999999999999999999999999985431 2222232 3
Q ss_pred CCCCcccCcCCCCEEEeeCCceeecCCCCCCcc----CCCc--eeeecCCCCcccCC
Q 017103 67 GNGCYQHRCVNNSLEVAVDGIWKVCPEAGGPVQ----FPGF--NGELICPAYHELCS 117 (377)
Q Consensus 67 ~d~C~~~pC~ngg~Cv~~~~~~~~C~~~g~c~C----~~G~--~G~i~C~~~~~~C~ 117 (377)
.+.|+.+.|.+.++-|-.-...+.|..+++.+- ..|| .|.+.||.--++|.
T Consensus 568 ~~Gcy~~~c~~~~l~Vw~~~atY~C~~e~Q~i~ik~~vdGWl~eGsLiCPkC~DyCt 624 (666)
T KOG2556|consen 568 PNGCYLMKCFDLYLGVWPHLATYPCYAENQKIHIKKVVDGWLREGSLICPKCEDYCT 624 (666)
T ss_pred CCceeEEEecCCceeeeeccCcccccccCCEeeEEEeecceeecCceeCCcHHHHHh
Confidence 578999999999888765555688999888875 4788 58888987555565
No 10
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.67 E-value=8.6e-09 Score=111.81 Aligned_cols=112 Identities=31% Similarity=0.698 Sum_probs=85.2
Q ss_pred CCCCCcccCcCCCCEEEee---CCce---------e--ecCC-CCCCccCCCceeeecCCCCcccCCCCCcccccCCCCC
Q 017103 66 QGNGCYQHRCVNNSLEVAV---DGIW---------K--VCPE-AGGPVQFPGFNGELICPAYHELCSTGPIAVFGQCPNS 130 (377)
Q Consensus 66 ~~d~C~~~pC~ngg~Cv~~---~~~~---------~--~C~~-~g~c~C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~ 130 (377)
+.+.|..-||.|...|+.+ ++.. . .=|- .-.|.|+|||+|. .|++.+++|... +
T Consensus 1178 dDniClrEPCenymkCvsvlrFdssapf~~s~s~lfRpi~pvnglrCrCPpGFTgd-~CeTeiDlCYs~----------p 1246 (2531)
T KOG4289|consen 1178 DDNICLREPCENYMKCVSVLRFDSSAPFLASDSVLFRPIHPVNGLRCRCPPGFTGD-YCETEIDLCYSG----------P 1246 (2531)
T ss_pred cCchhhcchhHHHHhhhhheeecccCccccccceeeeeccccCceeEeCCCCCCcc-cccchhHhhhcC----------C
Confidence 3468999999998878643 1110 0 0000 1223566999999 899999999965 5
Q ss_pred CCCCCeeeC----CccccCCCCcCCCCCCC----CC-CCCCCCCceec----CCCccccCCC-CCCCCCCCC
Q 017103 131 CTFNGDCVD----GKCHCFLGFHGHDCSKR----SC-PDNCNGHGKCL----SNGACECENG-YTGIDCSTA 188 (377)
Q Consensus 131 C~~~G~C~~----g~C~C~~G~~G~~C~~~----~C-~~~C~~~G~C~----~~~~C~C~~G-~~G~~C~~~ 188 (377)
|.++|+|.. .+|.|.+||+|.+||+. .| +..|.++|+|+ ..+.|.|+.| |+++.|+..
T Consensus 1247 C~nng~C~srEggYtCeCrpg~tGehCEvs~~agrCvpGvC~nggtC~~~~nggf~c~Cp~ge~e~prC~v~ 1318 (2531)
T KOG4289|consen 1247 CGNNGRCRSREGGYTCECRPGFTGEHCEVSARAGRCVPGVCKNGGTCVNLLNGGFCCHCPYGEFEDPRCEVT 1318 (2531)
T ss_pred CCCCCceEEecCceeEEecCCccccceeeecccCccccceecCCCEEeecCCCceeccCCCcccCCCceEEE
Confidence 999999972 49999999999999964 48 88899999998 3678999987 688899864
No 11
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=98.65 E-value=9.2e-08 Score=99.75 Aligned_cols=98 Identities=30% Similarity=0.722 Sum_probs=71.1
Q ss_pred CCCccCCCce----eeecCCCCcccCCCCCcccccCCCCCCCCCCeeeCCccccCCCCcCCCCCC----CCCC----CCC
Q 017103 95 GGPVQFPGFN----GELICPAYHELCSTGPIAVFGQCPNSCTFNGDCVDGKCHCFLGFHGHDCSK----RSCP----DNC 162 (377)
Q Consensus 95 g~c~C~~G~~----G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~C~~----~~C~----~~C 162 (377)
|+|+|.+... |+ .||-+.-.|.... -..|.+||.|.-|+|+|.+||+|..|+- +.|. ..|
T Consensus 525 GqC~C~~~~~~~i~G~-fCECDnfsC~r~~-------g~lC~g~G~C~CG~CvC~~GwtG~~C~C~~std~C~~~~G~iC 596 (783)
T KOG1226|consen 525 GQCVCHKPDNGKIYGK-FCECDNFSCERHK-------GVLCGGHGRCECGRCVCNPGWTGSACNCPLSTDTCESSDGQIC 596 (783)
T ss_pred CceEecCCCCCceeee-eeeccCccccccc-------CcccCCCCeEeCCcEEcCCCCccCCCCCCCCCccccCCCCcee
Confidence 6777777655 77 5643333333211 2259999999999999999999999974 3472 249
Q ss_pred CCCceecCCCccccCCC-CCCCCCCCC-CCCcccCCCCCeee
Q 017103 163 NGHGKCLSNGACECENG-YTGIDCSTA-VCDEQCSLHGGVCD 202 (377)
Q Consensus 163 ~~~G~C~~~~~C~C~~G-~~G~~C~~~-~C~~~c~~~Gg~C~ 202 (377)
+++|+|. -++|+|... |+|..|+.. .|...|..+. .|+
T Consensus 597 SGrG~C~-Cg~C~C~~~~~sG~~CE~cptc~~~C~~~~-~Cv 636 (783)
T KOG1226|consen 597 SGRGTCE-CGRCKCTDPPYSGEFCEKCPTCPDPCAENK-SCV 636 (783)
T ss_pred CCCceee-CCceEcCCCCcCcchhhcCCCCCCcccccc-cch
Confidence 9999997 889999766 999999974 5766665554 565
No 12
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.59 E-value=3.4e-08 Score=107.31 Aligned_cols=81 Identities=31% Similarity=0.802 Sum_probs=66.7
Q ss_pred ccccCCCCcCCCCCC--CCC-CCCCCCCceec---CCCccccCCCCCCCCCCCCC----CCcccCCCCCeeec-CCCceE
Q 017103 141 KCHCFLGFHGHDCSK--RSC-PDNCNGHGKCL---SNGACECENGYTGIDCSTAV----CDEQCSLHGGVCDN-GVCEFR 209 (377)
Q Consensus 141 ~C~C~~G~~G~~C~~--~~C-~~~C~~~G~C~---~~~~C~C~~G~~G~~C~~~~----C~~~c~~~Gg~C~~-~~~~~~ 209 (377)
.|.||+||+|++|+. +.| ..+|.++|+|. ..|+|+|.+||+|.+||++. |.+.=|.+||+|++ ....+.
T Consensus 1223 rCrCPpGFTgd~CeTeiDlCYs~pC~nng~C~srEggYtCeCrpg~tGehCEvs~~agrCvpGvC~nggtC~~~~nggf~ 1302 (2531)
T KOG4289|consen 1223 RCRCPPGFTGDYCETEIDLCYSGPCGNNGRCRSREGGYTCECRPGFTGEHCEVSARAGRCVPGVCKNGGTCVNLLNGGFC 1302 (2531)
T ss_pred eEeCCCCCCcccccchhHhhhcCCCCCCCceEEecCceeEEecCCccccceeeecccCccccceecCCCEEeecCCCcee
Confidence 799999999999985 569 68999999998 58999999999999999863 76654588899998 667777
Q ss_pred ecC----CCCCCCCCC
Q 017103 210 CSD----YAGYTCQNS 221 (377)
Q Consensus 210 C~~----~~G~~C~~~ 221 (377)
|.| |.+..|+..
T Consensus 1303 c~Cp~ge~e~prC~v~ 1318 (2531)
T KOG4289|consen 1303 CHCPYGEFEDPRCEVT 1318 (2531)
T ss_pred ccCCCcccCCCceEEE
Confidence 777 445556544
No 13
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=98.22 E-value=3.2e-05 Score=78.15 Aligned_cols=233 Identities=20% Similarity=0.351 Sum_probs=123.1
Q ss_pred ccceecCCCCCCCCCcccCc--CCCCEEEeeCCceeecCCCCCCccCCCceeeecCCCCcccCCCCCcccccCCCCCCCC
Q 017103 56 RTGFVRGSMTQGNGCYQHRC--VNNSLEVAVDGIWKVCPEAGGPVQFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTF 133 (377)
Q Consensus 56 ~~g~~g~~c~~~d~C~~~pC--~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~ 133 (377)
..||.+..+.....|...+. ...+.|+.... -.....+.|..||.|. .|....+.|... +..|.+
T Consensus 115 ~~g~~~~~~~~~~~C~~~~~~~~~~~~c~~~~~----~~~~~~c~C~~g~~~~-~~~~~~~~C~~~--------~~~c~~ 181 (487)
T KOG1217|consen 115 PPGYQGTPCEGECECVTGPGVCCIDGSCSNGPG----SVGPFRCSCTEGYEGE-PCETDLDECIQY--------SSPCQN 181 (487)
T ss_pred CCccccCcCCcceeecCCCCCeeCchhhcCCCC----CCCceeeeeCCCcccc-cccccccccccC--------CCCcCC
Confidence 34666555443324666652 44455543321 0011223556999999 676654566632 235889
Q ss_pred CCeeeC----CccccCCCCcCCCCCCCCCCCCCCCCceecCCCccccCCCCCCCCCCCCC--CCcccCCCCCeeecCCCc
Q 017103 134 NGDCVD----GKCHCFLGFHGHDCSKRSCPDNCNGHGKCLSNGACECENGYTGIDCSTAV--CDEQCSLHGGVCDNGVCE 207 (377)
Q Consensus 134 ~G~C~~----g~C~C~~G~~G~~C~~~~C~~~C~~~G~C~~~~~C~C~~G~~G~~C~~~~--C~~~c~~~Gg~C~~~~~~ 207 (377)
++.|.+ ..|.|+++|.|..|+.. .+++.|+....|.+.++|.+..|++.. |... + ++|++....
T Consensus 182 ~~~C~~~~~~~~C~c~~~~~~~~~~~~------~~~~~c~~~~~~~~~~g~~~~~c~~~~~~~~~~---~-~~c~~~~~~ 251 (487)
T KOG1217|consen 182 GGTCVNTGGSYLCSCPPGYTGSTCETT------GNGGTCVDSVACSCPPGARGPECEVSIVECASG---D-GTCVNTVGS 251 (487)
T ss_pred CcccccCCCCeeEeCCCCccCCcCcCC------CCCceEecceeccCCCCCCCCCcccccccccCC---C-CcccccCCc
Confidence 999974 37999999999999864 344555544566777777777776543 3222 2 466665555
Q ss_pred eEecCCCCCCCCC-----CCcCcCCCCccccCccc----CCCCcccCCCCCcccc-CCC--CccccC-CccccCCCCccc
Q 017103 208 FRCSDYAGYTCQN-----SSKLISSLSVCKYVLEK----DAGGQHCAPSESSILQ-QLE--EVVVTP-NYHRLFPGGARK 274 (377)
Q Consensus 208 ~~C~~~~G~~C~~-----~~~~c~~~~~C~~~~~~----~~~~~~C~~~~~~~~c-~~~--~~c~~~-~~c~c~~gg~~~ 274 (377)
+.|.+..|+.... ....|....+|.++..+ +.+.+.|..++.+..+ .+. ..|... ....|.+++.+.
T Consensus 252 ~~C~~~~g~~~~~~~~~~~~~~C~~~~~c~~~~~C~~~~~~~~C~C~~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~ 331 (487)
T KOG1217|consen 252 YTCRCPEGYTGDACVTCVDVDSCALIASCPNGGTCVNVPGSYRCTCPPGFTGRLCTECVDVDECSPRNAGGPCANGGTCN 331 (487)
T ss_pred eeeeCCCCccccccceeeeccccCCCCccCCCCeeecCCCcceeeCCCCCCCCCCccccccccccccccCCcCCCCcccc
Confidence 5565433333332 12222222123332222 1144566677666665 221 122110 111344444432
Q ss_pred ccCcc--cCcccCCCCcccccccccccccccCCCCCCee
Q 017103 275 LFNIF--GTSYCDEAAKRLACWISIQKCDKDGDNRLRVC 311 (377)
Q Consensus 275 ~~~~~--~~~~C~~~~~g~~C~~~~~~C~~~~~~~~~vC 311 (377)
..+.+ ..+.|..+|+|..|+...++|...+.....+|
T Consensus 332 ~~~~~~~~~C~c~~~~~g~~C~~~~~~C~~~~~~~~~~c 370 (487)
T KOG1217|consen 332 TLGSFGGFRCACGPGFTGRRCEDSNDECASSPCCPGGTC 370 (487)
T ss_pred cCCCCCCCCcCCCCCCCCCccccCCccccCCccccCCEe
Confidence 22222 23677777888888766656766553333444
No 14
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.85 E-value=0.00037 Score=70.38 Aligned_cols=124 Identities=28% Similarity=0.652 Sum_probs=70.1
Q ss_pred ccccceecCCCCCC-CCCc--ccCcCCCCEEEeeCCceeecCCCCCCccCCCceeeecCCCC--cccCCC------CCcc
Q 017103 54 LVRTGFVRGSMTQG-NGCY--QHRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGELICPAY--HELCST------GPIA 122 (377)
Q Consensus 54 l~~~g~~g~~c~~~-d~C~--~~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~i~C~~~--~~~C~~------~~~~ 122 (377)
.+..||.+..+... +.|. ..+|+++++|++..+. +.| .|++||.|. .|+.. ...|.. .+-.
T Consensus 155 ~C~~g~~~~~~~~~~~~C~~~~~~c~~~~~C~~~~~~-~~C------~c~~~~~~~-~~~~~~~~~~c~~~~~~~~~~g~ 226 (487)
T KOG1217|consen 155 SCTEGYEGEPCETDLDECIQYSSPCQNGGTCVNTGGS-YLC------SCPPGYTGS-TCETTGNGGTCVDSVACSCPPGA 226 (487)
T ss_pred eeCCCcccccccccccccccCCCCcCCCcccccCCCC-eeE------eCCCCccCC-cCcCCCCCceEecceeccCCCCC
Confidence 44678888877654 7997 4469999999886554 323 345889888 55432 001110 0000
Q ss_pred cccCC---CCCCCCC-CeeeC----CccccCCCCcCCCC----CCCCC-CC-CCCCCceecC---CCccccCCCCCCCCC
Q 017103 123 VFGQC---PNSCTFN-GDCVD----GKCHCFLGFHGHDC----SKRSC-PD-NCNGHGKCLS---NGACECENGYTGIDC 185 (377)
Q Consensus 123 ~~~~C---~~~C~~~-G~C~~----g~C~C~~G~~G~~C----~~~~C-~~-~C~~~G~C~~---~~~C~C~~G~~G~~C 185 (377)
....| ...|... ++|++ .+|.|++||.+..+ +...| .. .|.++++|+. .+.|.|++||+|..|
T Consensus 227 ~~~~c~~~~~~~~~~~~~c~~~~~~~~C~~~~g~~~~~~~~~~~~~~C~~~~~c~~~~~C~~~~~~~~C~C~~g~~g~~~ 306 (487)
T KOG1217|consen 227 RGPECEVSIVECASGDGTCVNTVGSYTCRCPEGYTGDACVTCVDVDSCALIASCPNGGTCVNVPGSYRCTCPPGFTGRLC 306 (487)
T ss_pred CCCCcccccccccCCCCcccccCCceeeeCCCCccccccceeeeccccCCCCccCCCCeeecCCCcceeeCCCCCCCCCC
Confidence 00111 0122222 66653 26777777777764 33455 22 2667777763 267777777777776
No 15
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=97.36 E-value=0.00017 Score=45.92 Aligned_cols=24 Identities=46% Similarity=1.142 Sum_probs=19.1
Q ss_pred CCCCCCeee--CCccccCCCCcCCCC
Q 017103 130 SCTFNGDCV--DGKCHCFLGFHGHDC 153 (377)
Q Consensus 130 ~C~~~G~C~--~g~C~C~~G~~G~~C 153 (377)
.|+++|+|+ .++|+|++||+|++|
T Consensus 7 ~C~~~G~C~~~~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 7 ICSGHGTCVSPCGRCVCDSGYTGPDC 32 (32)
T ss_pred ccCCCCEEeCCCCEEECCCCCcCCCC
Confidence 478888888 578888888888776
No 16
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.11 E-value=0.00038 Score=44.35 Aligned_cols=31 Identities=26% Similarity=0.552 Sum_probs=24.2
Q ss_pred CcccCcCCCCEEEeeCCceeecCCCCCCccCCCceee
Q 017103 70 CYQHRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGE 106 (377)
Q Consensus 70 C~~~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~ 106 (377)
|..+||+|+|+|+.+....+.| +|++||+|+
T Consensus 1 C~~~~C~n~g~C~~~~~~~y~C------~C~~G~~G~ 31 (32)
T PF00008_consen 1 CSSNPCQNGGTCIDLPGGGYTC------ECPPGYTGK 31 (32)
T ss_dssp TTTTSSTTTEEEEEESTSEEEE------EEBTTEEST
T ss_pred CCCCcCCCCeEEEeCCCCCEEe------ECCCCCccC
Confidence 6678999999999877444545 566999996
No 17
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=96.94 E-value=0.0007 Score=43.13 Aligned_cols=26 Identities=58% Similarity=1.453 Sum_probs=23.2
Q ss_pred CCCCCCceecCC-CccccCCCCCCCCC
Q 017103 160 DNCNGHGKCLSN-GACECENGYTGIDC 185 (377)
Q Consensus 160 ~~C~~~G~C~~~-~~C~C~~G~~G~~C 185 (377)
..|++||+|+.. ++|+|.+||+|++|
T Consensus 6 ~~C~~~G~C~~~~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 6 NICSGHGTCVSPCGRCVCDSGYTGPDC 32 (32)
T ss_pred CccCCCCEEeCCCCEEECCCCCcCCCC
Confidence 369999999965 99999999999987
No 18
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=96.86 E-value=0.0041 Score=66.04 Aligned_cols=131 Identities=26% Similarity=0.671 Sum_probs=85.3
Q ss_pred eeCCCCcccCCc------cccccee--cCCCCCCCCCccc--CcCCCCEEEeeCCcee-ecCCCCCCccCCCceee-ecC
Q 017103 42 VRGSNSRCMASS------LVRTGFV--RGSMTQGNGCYQH--RCVNNSLEVAVDGIWK-VCPEAGGPVQFPGFNGE-LIC 109 (377)
Q Consensus 42 ~~G~nsrC~~~~------l~~~g~~--g~~c~~~d~C~~~--pC~ngg~Cv~~~~~~~-~C~~~g~c~C~~G~~G~-i~C 109 (377)
..+.+.+|+.++ .+..||. |..|.+.++|..- .|..+..||+.++.+. .|. +-.-|.+. -+|
T Consensus 701 ~cdt~a~C~pg~~~~~tcecs~g~~gdgr~c~d~~eca~~~~~CGp~s~Cin~pg~~rceC~------~gy~F~dd~~tC 774 (1289)
T KOG1214|consen 701 MCDTTARCHPGTGVDYTCECSSGYQGDGRNCVDENECATGFHRCGPNSVCINLPGSYRCECR------SGYEFADDRHTC 774 (1289)
T ss_pred ccCCCccccCCCCcceEEEEeeccCCCCCCCCChhhhccCCCCCCCCceeecCCCceeEEEe------ecceeccCCcce
Confidence 335667787762 2344554 4557888899764 7999999999887753 563 22334443 133
Q ss_pred CC-----CcccCCCCCcccccCCCCCCCCCCe--ee-C----CccccCCCCcCCC---CCCCCC-CCCCCCCceec---C
Q 017103 110 PA-----YHELCSTGPIAVFGQCPNSCTFNGD--CV-D----GKCHCFLGFHGHD---CSKRSC-PDNCNGHGKCL---S 170 (377)
Q Consensus 110 ~~-----~~~~C~~~~~~~~~~C~~~C~~~G~--C~-~----g~C~C~~G~~G~~---C~~~~C-~~~C~~~G~C~---~ 170 (377)
-. -.+.|... .+.|.-.|. |+ . .+|.|.|||.|+- ++.++| +..|...+.|. .
T Consensus 775 V~i~~pap~n~Ce~g--------~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvDeC~psrChp~A~Cyntpg 846 (1289)
T KOG1214|consen 775 VLITPPAPANPCEDG--------SHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVDECSPSRCHPAATCYNTPG 846 (1289)
T ss_pred EEecCCCCCCccccC--------ccccCcCCceEEEecCCceEEEeecCCccCCccccccccccCccccCCCceEecCCC
Confidence 21 23344432 245654444 44 2 3999999999874 445678 77799999998 4
Q ss_pred CCccccCCCCCCC--CCC
Q 017103 171 NGACECENGYTGI--DCS 186 (377)
Q Consensus 171 ~~~C~C~~G~~G~--~C~ 186 (377)
.+.|.|.+||.|+ .|-
T Consensus 847 sfsC~C~pGy~GDGf~CV 864 (1289)
T KOG1214|consen 847 SFSCRCQPGYYGDGFQCV 864 (1289)
T ss_pred cceeecccCccCCCceec
Confidence 6789999999875 554
No 19
>smart00051 DSL delta serrate ligand.
Probab=96.67 E-value=0.0018 Score=48.02 Aligned_cols=42 Identities=36% Similarity=0.762 Sum_probs=34.1
Q ss_pred ccCCCCcCCCCCCCCC--CCCCCCCceecCCCccccCCCCCCCCC
Q 017103 143 HCFLGFHGHDCSKRSC--PDNCNGHGKCLSNGACECENGYTGIDC 185 (377)
Q Consensus 143 ~C~~G~~G~~C~~~~C--~~~C~~~G~C~~~~~C~C~~G~~G~~C 185 (377)
.|+++|.|..|+. .| .+...+|.+|...+.+.|.+||+|++|
T Consensus 20 ~C~~~~yG~~C~~-~C~~~~d~~~~~~Cd~~G~~~C~~Gw~G~~C 63 (63)
T smart00051 20 TCDENYYGEGCNK-FCRPRDDFFGHYTCDENGNKGCLEGWMGPYC 63 (63)
T ss_pred eCCCCCcCCccCC-EeCcCccccCCccCCcCCCEecCCCCcCCCC
Confidence 5778999999975 45 234678888987888999999999886
No 20
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=96.20 E-value=0.015 Score=62.06 Aligned_cols=132 Identities=23% Similarity=0.505 Sum_probs=81.7
Q ss_pred cCcCCCCEEEeeCCceeecCCCCCCccCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeeeC----CccccCCCC
Q 017103 73 HRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCVD----GKCHCFLGF 148 (377)
Q Consensus 73 ~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~~----g~C~C~~G~ 148 (377)
+-|..++.|....+..++| .|..||.|.-+=+.+.++|++ +++.|..+..|++ .+|.|..||
T Consensus 700 h~cdt~a~C~pg~~~~~tc------ecs~g~~gdgr~c~d~~eca~--------~~~~CGp~s~Cin~pg~~rceC~~gy 765 (1289)
T KOG1214|consen 700 HMCDTTARCHPGTGVDYTC------ECSSGYQGDGRNCVDENECAT--------GFHRCGPNSVCINLPGSYRCECRSGY 765 (1289)
T ss_pred cccCCCccccCCCCcceEE------EEeeccCCCCCCCCChhhhcc--------CCCCCCCCceeecCCCceeEEEeecc
Confidence 3555566665544444444 455888776222223345653 4678999999984 378887776
Q ss_pred --cCC--CCCC-------CCCC---CCCCCCc--eec----CCCccccCCCCCCC---CCCCCCCCcccCCCCCeeecCC
Q 017103 149 --HGH--DCSK-------RSCP---DNCNGHG--KCL----SNGACECENGYTGI---DCSTAVCDEQCSLHGGVCDNGV 205 (377)
Q Consensus 149 --~G~--~C~~-------~~C~---~~C~~~G--~C~----~~~~C~C~~G~~G~---~C~~~~C~~~c~~~Gg~C~~~~ 205 (377)
.++ .|-. ..|. ..|.-.| .|+ +.|.|.|.+||.|+ .+..+.|.+.-|.-.+.|.+..
T Consensus 766 ~F~dd~~tCV~i~~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsGDG~~c~dvDeC~psrChp~A~Cyntp 845 (1289)
T KOG1214|consen 766 EFADDRHTCVLITPPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSGDGHQCTDVDECSPSRCHPAATCYNTP 845 (1289)
T ss_pred eeccCCcceEEecCCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccCCccccccccccCccccCCCceEecCC
Confidence 343 4531 1352 3355334 444 47899999999886 3456778754333446999988
Q ss_pred CceEecCCCCCCC
Q 017103 206 CEFRCSDYAGYTC 218 (377)
Q Consensus 206 ~~~~C~~~~G~~C 218 (377)
+++.|.|..|+..
T Consensus 846 gsfsC~C~pGy~G 858 (1289)
T KOG1214|consen 846 GSFSCRCQPGYYG 858 (1289)
T ss_pred CcceeecccCccC
Confidence 8888877555543
No 21
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=96.09 E-value=0.0082 Score=56.32 Aligned_cols=105 Identities=31% Similarity=0.721 Sum_probs=65.0
Q ss_pred cCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeee-C------CccccCCCCcCCCCCC--------------CC
Q 017103 99 QFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCV-D------GKCHCFLGFHGHDCSK--------------RS 157 (377)
Q Consensus 99 C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~-~------g~C~C~~G~~G~~C~~--------------~~ 157 (377)
|++|.+|+ .|. .|.-.. ..+|.++|.|. + |.|.|.+||+|+.|.. .+
T Consensus 132 Cp~gtyGp-dCl----~Cpggs-------er~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg~eyfes~Rne~~lv 199 (350)
T KOG4260|consen 132 CPDGTYGP-DCL----QCPGGS-------ERPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCGIEYFESSRNEQHLV 199 (350)
T ss_pred cCCCCcCC-ccc----cCCCCC-------cCCcCCCCcccCCCCCCCCCcccccCCCCCccccccchHHHHhhcccccch
Confidence 55899999 332 232111 23688999996 2 5999999999999852 01
Q ss_pred ---CCCCCCCCceecC--CCcc-ccCCCCCCC--CC-CCCCCCc--ccCCCCCeeecCCCceEecCCCCCC
Q 017103 158 ---CPDNCNGHGKCLS--NGAC-ECENGYTGI--DC-STAVCDE--QCSLHGGVCDNGVCEFRCSDYAGYT 217 (377)
Q Consensus 158 ---C~~~C~~~G~C~~--~~~C-~C~~G~~G~--~C-~~~~C~~--~c~~~Gg~C~~~~~~~~C~~~~G~~ 217 (377)
|...|. |.|.. .-.| .|..||.-+ -| ++++|.. .++.-...|++..++|.|....|+.
T Consensus 200 Ct~Ch~~C~--~~Csg~~~k~C~kCkkGW~lde~gCvDvnEC~~ep~~c~~~qfCvNteGSf~C~dk~Gy~ 268 (350)
T KOG4260|consen 200 CTACHEGCL--GVCSGESSKGCSKCKKGWKLDEEGCVDVNECQNEPAPCKAHQFCVNTEGSFKCEDKEGYK 268 (350)
T ss_pred hhhhhhhhh--cccCCCCCCChhhhcccceecccccccHHHHhcCCCCCChhheeecCCCceEeccccccc
Confidence 223343 24542 2234 688999644 22 2445743 3444335899999999997765553
No 22
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=96.03 E-value=0.0022 Score=40.79 Aligned_cols=23 Identities=39% Similarity=1.027 Sum_probs=13.1
Q ss_pred CCCCCCCeeeC-----CccccCCCCcCC
Q 017103 129 NSCTFNGDCVD-----GKCHCFLGFHGH 151 (377)
Q Consensus 129 ~~C~~~G~C~~-----g~C~C~~G~~G~ 151 (377)
++|+++|+|++ .+|.|++||+|+
T Consensus 4 ~~C~n~g~C~~~~~~~y~C~C~~G~~G~ 31 (32)
T PF00008_consen 4 NPCQNGGTCIDLPGGGYTCECPPGYTGK 31 (32)
T ss_dssp TSSTTTEEEEEESTSEEEEEEBTTEEST
T ss_pred CcCCCCeEEEeCCCCCEEeECCCCCccC
Confidence 35666666652 156666666664
No 23
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=95.96 E-value=0.033 Score=61.30 Aligned_cols=30 Identities=17% Similarity=0.203 Sum_probs=22.4
Q ss_pred cccCCCCcccccCcccCcccCCCCcccccc
Q 017103 265 HRLFPGGARKLFNIFGTSYCDEAAKRLACW 294 (377)
Q Consensus 265 c~c~~gg~~~~~~~~~~~~C~~~~~g~~C~ 294 (377)
|-|.+.+.-.+..+.|.|.|.+||.|..|+
T Consensus 1069 C~Cd~~~~pqCN~ftGQCqCkpGfGGR~C~ 1098 (1758)
T KOG0994|consen 1069 CNCDPIGGPQCNEFTGQCQCKPGFGGRTCS 1098 (1758)
T ss_pred cCCCccCCccccccccceeccCCCCCcchh
Confidence 455565545566667889999999999885
No 24
>smart00051 DSL delta serrate ligand.
Probab=95.80 E-value=0.0086 Score=44.34 Aligned_cols=42 Identities=24% Similarity=0.513 Sum_probs=31.5
Q ss_pred cCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeee-CCccccCCCCcCCCC
Q 017103 99 QFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCV-DGKCHCFLGFHGHDC 153 (377)
Q Consensus 99 C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~-~g~C~C~~G~~G~~C 153 (377)
|.++|.|. .|.. +|.. .+.+.++.+|. +|.+.|.+||+|++|
T Consensus 21 C~~~~yG~-~C~~---~C~~---------~~d~~~~~~Cd~~G~~~C~~Gw~G~~C 63 (63)
T smart00051 21 CDENYYGE-GCNK---FCRP---------RDDFFGHYTCDENGNKGCLEGWMGPYC 63 (63)
T ss_pred CCCCCcCC-ccCC---EeCc---------CccccCCccCCcCCCEecCCCCcCCCC
Confidence 34899999 6644 4541 22467888997 689999999999987
No 25
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=95.32 E-value=0.0059 Score=30.82 Aligned_cols=12 Identities=42% Similarity=1.309 Sum_probs=6.1
Q ss_pred cccCCCCcCCCC
Q 017103 142 CHCFLGFHGHDC 153 (377)
Q Consensus 142 C~C~~G~~G~~C 153 (377)
|+|++||+|.+|
T Consensus 2 C~C~~G~~G~~C 13 (13)
T PF12661_consen 2 CQCPPGWTGPNC 13 (13)
T ss_dssp EEE-TTEETTTT
T ss_pred ccCcCCCcCCCC
Confidence 555555555554
No 26
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=95.12 E-value=0.11 Score=57.48 Aligned_cols=120 Identities=28% Similarity=0.627 Sum_probs=62.1
Q ss_pred cceecCCC-CCCCCCcccCcCCCC--------EEEeeCCc-eeecCCCCCCccCCCceeeecCCCCcccCCCCCc---cc
Q 017103 57 TGFVRGSM-TQGNGCYQHRCVNNS--------LEVAVDGI-WKVCPEAGGPVQFPGFNGELICPAYHELCSTGPI---AV 123 (377)
Q Consensus 57 ~g~~g~~c-~~~d~C~~~pC~ngg--------~Cv~~~~~-~~~C~~~g~c~C~~G~~G~i~C~~~~~~C~~~~~---~~ 123 (377)
.||.|.-. -.+..|..-||-.+- +|...... ..+| +|.+||+|. +|+. |....+ ..
T Consensus 892 ~GyyGdP~lg~g~~CrPCpCP~gp~Sg~~~A~sC~~d~~t~~ivC------~C~~GY~G~-RCe~----CA~~~fGnP~~ 960 (1758)
T KOG0994|consen 892 DGYYGDPRLGSGIGCRPCPCPDGPASGRQHADSCYLDTRTQQIVC------HCQEGYSGS-RCEI----CADNHFGNPSE 960 (1758)
T ss_pred ccccCCcccCCCCCCCCCCCCCCCccchhccccccccccccceee------ecccCcccc-chhh----hcccccCCccc
Confidence 35555442 356778877876542 22211111 1133 677999999 8875 332221 00
Q ss_pred ccCC-CCCCCCC------Ceee--CC------------cc-ccCCCCcCCC----CCCCCCC-CCCCCCceec-CCCccc
Q 017103 124 FGQC-PNSCTFN------GDCV--DG------------KC-HCFLGFHGHD----CSKRSCP-DNCNGHGKCL-SNGACE 175 (377)
Q Consensus 124 ~~~C-~~~C~~~------G~C~--~g------------~C-~C~~G~~G~~----C~~~~C~-~~C~~~G~C~-~~~~C~ 175 (377)
...| +-.|+|+ |.|. +| .| +|.+||.|+. |+.-+|. ..=.+.+.|. ..++|.
T Consensus 961 GGtCq~CeC~~NiD~~d~~aCD~~TG~CLkCL~hTeG~hCe~Ck~Gf~GdA~~q~CqrC~Cn~LGTn~~~~CDr~tGQCp 1040 (1758)
T KOG0994|consen 961 GGTCQKCECSNNIDLYDPGACDVATGACLKCLYHTEGDHCEHCKDGFYGDALRQNCQRCVCNFLGTNSTCHCDRFTGQCP 1040 (1758)
T ss_pred CCccccccccCCcCccCCCccchhhchhhhhhhcccccchhhccccchhHHHHhhhhhheccccccCCccccccccCcCC
Confidence 2222 1134443 2221 12 45 4778888873 4443441 1111224454 468888
Q ss_pred cCCCCCCCCCCC
Q 017103 176 CENGYTGIDCST 187 (377)
Q Consensus 176 C~~G~~G~~C~~ 187 (377)
|.+...|.+|+.
T Consensus 1041 ClpNv~G~~CDq 1052 (1758)
T KOG0994|consen 1041 CLPNVQGVRCDQ 1052 (1758)
T ss_pred CCcccccccccc
Confidence 888888888863
No 27
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=94.74 E-value=0.04 Score=35.70 Aligned_cols=33 Identities=27% Similarity=0.549 Sum_probs=24.9
Q ss_pred CCCCcc-cCcCCCCEEEeeCCceeecCCCCCCccCCCce-ee
Q 017103 67 GNGCYQ-HRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFN-GE 106 (377)
Q Consensus 67 ~d~C~~-~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~-G~ 106 (377)
+++|.. .+|.++++|++..+.+ .| .|++||. |.
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~g~~-~C------~C~~g~~~g~ 36 (39)
T smart00179 2 IDECASGNPCQNGGTCVNTVGSY-RC------ECPPGYTDGR 36 (39)
T ss_pred cccCcCCCCcCCCCEeECCCCCe-Ee------ECCCCCccCC
Confidence 577887 7999999999876653 23 4559998 87
No 28
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=94.67 E-value=0.0095 Score=30.07 Aligned_cols=13 Identities=46% Similarity=1.403 Sum_probs=10.4
Q ss_pred ccccCCCCCCCCC
Q 017103 173 ACECENGYTGIDC 185 (377)
Q Consensus 173 ~C~C~~G~~G~~C 185 (377)
.|+|++||+|.+|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 4889999999987
No 29
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=93.62 E-value=0.078 Score=34.27 Aligned_cols=13 Identities=38% Similarity=1.339 Sum_probs=5.8
Q ss_pred ccccCCCCc-CCCC
Q 017103 141 KCHCFLGFH-GHDC 153 (377)
Q Consensus 141 ~C~C~~G~~-G~~C 153 (377)
.|.|++||. |..|
T Consensus 25 ~C~C~~g~~~g~~C 38 (39)
T smart00179 25 RCECPPGYTDGRNC 38 (39)
T ss_pred EeECCCCCccCCcC
Confidence 344444444 4433
No 30
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=93.57 E-value=0.11 Score=49.01 Aligned_cols=42 Identities=48% Similarity=1.106 Sum_probs=34.9
Q ss_pred cCCCCcCCCCCCCCCC----CCCCCCceec------CCCccccCCCCCCCCCCC
Q 017103 144 CFLGFHGHDCSKRSCP----DNCNGHGKCL------SNGACECENGYTGIDCST 187 (377)
Q Consensus 144 C~~G~~G~~C~~~~C~----~~C~~~G~C~------~~~~C~C~~G~~G~~C~~ 187 (377)
|++|-.|++|.. |+ .+|.++|.|. ..+.|.|.+||+|+.|..
T Consensus 132 Cp~gtyGpdCl~--Cpggser~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~ 183 (350)
T KOG4260|consen 132 CPDGTYGPDCLQ--CPGGSERPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRY 183 (350)
T ss_pred cCCCCcCCcccc--CCCCCcCCcCCCCcccCCCCCCCCCcccccCCCCCccccc
Confidence 789999999974 53 4688888887 368899999999999963
No 31
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=92.87 E-value=0.14 Score=32.49 Aligned_cols=35 Identities=29% Similarity=0.550 Sum_probs=24.9
Q ss_pred CCCCcc-cCcCCCCEEEeeCCceeecCCCCCCccCCCceeeecC
Q 017103 67 GNGCYQ-HRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGELIC 109 (377)
Q Consensus 67 ~d~C~~-~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~i~C 109 (377)
.++|.. .+|.++++|++..+. +.| .|++||.|. .|
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~~~-~~C------~C~~g~~g~-~C 37 (38)
T cd00054 2 IDECASGNPCQNGGTCVNTVGS-YRC------SCPPGYTGR-NC 37 (38)
T ss_pred cccCCCCCCcCCCCEeECCCCC-eEe------ECCCCCcCC-cC
Confidence 467877 799999999876554 223 345899997 44
No 32
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=91.55 E-value=0.21 Score=31.68 Aligned_cols=24 Identities=38% Similarity=1.118 Sum_probs=10.8
Q ss_pred CCCCCCeeeC----CccccCCCCcCCCC
Q 017103 130 SCTFNGDCVD----GKCHCFLGFHGHDC 153 (377)
Q Consensus 130 ~C~~~G~C~~----g~C~C~~G~~G~~C 153 (377)
+|.+++.|++ ..|.|++||.|..|
T Consensus 10 ~C~~~~~C~~~~~~~~C~C~~g~~g~~C 37 (38)
T cd00054 10 PCQNGGTCVNTVGSYRCSCPPGYTGRNC 37 (38)
T ss_pred CcCCCCEeECCCCCeEeECCCCCcCCcC
Confidence 3444445542 14555555554443
No 33
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=90.83 E-value=0.048 Score=40.34 Aligned_cols=44 Identities=41% Similarity=0.837 Sum_probs=24.0
Q ss_pred ccccCCCCcCCCCCCCCC-CC-CCCCCceecCCCccccCCCCCCCCC
Q 017103 141 KCHCFLGFHGHDCSKRSC-PD-NCNGHGKCLSNGACECENGYTGIDC 185 (377)
Q Consensus 141 ~C~C~~G~~G~~C~~~~C-~~-~C~~~G~C~~~~~C~C~~G~~G~~C 185 (377)
.-.|.+.|.|+.|+. .| +. .-.+|-+|...+.=+|.+||+|+.|
T Consensus 18 rv~C~~nyyG~~C~~-~C~~~~d~~ghy~Cd~~G~~~C~~Gw~G~~C 63 (63)
T PF01414_consen 18 RVVCDENYYGPNCSK-FCKPRDDSFGHYTCDSNGNKVCLPGWTGPNC 63 (63)
T ss_dssp -----TTEETTTT-E-E---EEETTEEEEE-SS--EEE-TTEESTTS
T ss_pred EEECCCCCCCccccC-CcCCCcCCcCCcccCCCCCCCCCCCCcCCCC
Confidence 556888999999986 45 22 1345667777778889999999876
No 34
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=89.90 E-value=0.35 Score=30.05 Aligned_cols=25 Identities=40% Similarity=1.095 Sum_probs=17.1
Q ss_pred CCCCCCCeeeC----CccccCCCCcCC-CC
Q 017103 129 NSCTFNGDCVD----GKCHCFLGFHGH-DC 153 (377)
Q Consensus 129 ~~C~~~G~C~~----g~C~C~~G~~G~-~C 153 (377)
.+|.+++.|++ ..|.|+.||.|. .|
T Consensus 6 ~~C~~~~~C~~~~~~~~C~C~~g~~g~~~C 35 (36)
T cd00053 6 NPCSNGGTCVNTPGSYRCVCPPGYTGDRSC 35 (36)
T ss_pred CCCCCCCEEecCCCCeEeECCCCCcccCCc
Confidence 35777777763 278888888777 44
No 35
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=89.74 E-value=1.2 Score=42.91 Aligned_cols=39 Identities=26% Similarity=0.743 Sum_probs=21.9
Q ss_pred CCCccccCCCCCCCCCCCCC--CCccc-CCCCCeeecCCCce
Q 017103 170 SNGACECENGYTGIDCSTAV--CDEQC-SLHGGVCDNGVCEF 208 (377)
Q Consensus 170 ~~~~C~C~~G~~G~~C~~~~--C~~~c-~~~Gg~C~~~~~~~ 208 (377)
..+.|.|.+||.|.++.... |...+ +.+++.|+......
T Consensus 160 ~~~~c~c~~g~~g~~~~~~~~~c~~~~~~~~g~~C~~~~~~~ 201 (316)
T KOG1218|consen 160 KNGICTCQPGFVGVFCVESCSGCSPLTACENGAKCNRSTGSC 201 (316)
T ss_pred CCCceeccCCcccccccccCCCcCCCcccCCCCeeecccccc
Confidence 35666777777777776543 44322 34445666544443
No 36
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=89.66 E-value=0.33 Score=30.21 Aligned_cols=26 Identities=38% Similarity=1.041 Sum_probs=20.2
Q ss_pred CCCCCCceecC---CCccccCCCCCCC-CC
Q 017103 160 DNCNGHGKCLS---NGACECENGYTGI-DC 185 (377)
Q Consensus 160 ~~C~~~G~C~~---~~~C~C~~G~~G~-~C 185 (377)
.+|.++++|+. .+.|.|+.||.|. .|
T Consensus 6 ~~C~~~~~C~~~~~~~~C~C~~g~~g~~~C 35 (36)
T cd00053 6 NPCSNGGTCVNTPGSYRCVCPPGYTGDRSC 35 (36)
T ss_pred CCCCCCCEEecCCCCeEeECCCCCcccCCc
Confidence 45777788873 6889999999988 55
No 37
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx (disulphide bonds C1-C3, C2-C4, C5-C6), where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=88.80 E-value=0.54 Score=31.52 Aligned_cols=32 Identities=25% Similarity=0.600 Sum_probs=24.0
Q ss_pred CCCCCccc--CcCCCCEEEeeCCceeecCCCCCCccCCCce
Q 017103 66 QGNGCYQH--RCVNNSLEVAVDGIWKVCPEAGGPVQFPGFN 104 (377)
Q Consensus 66 ~~d~C~~~--pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~ 104 (377)
|+|+|... +|..++.|++..|.+. | +|++||.
T Consensus 1 DidEC~~~~~~C~~~~~C~N~~Gsy~-C------~C~~Gy~ 34 (42)
T PF07645_consen 1 DIDECAEGPHNCPENGTCVNTEGSYS-C------SCPPGYE 34 (42)
T ss_dssp ESSTTTTTSSSSSTTSEEEEETTEEE-E------EESTTEE
T ss_pred CccccCCCCCcCCCCCEEEcCCCCEE-e------eCCCCcE
Confidence 46788764 6988999999887764 3 3449997
No 38
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=87.35 E-value=17 Score=34.66 Aligned_cols=102 Identities=29% Similarity=0.743 Sum_probs=58.5
Q ss_pred CCCCCeeeCCcccc-CCCCcCCCCCCC-CCCCCCCCCceecCCC-ccccCCCCCCCCCCCC-----CCCcccCCCCCeee
Q 017103 131 CTFNGDCVDGKCHC-FLGFHGHDCSKR-SCPDNCNGHGKCLSNG-ACECENGYTGIDCSTA-----VCDEQCSLHGGVCD 202 (377)
Q Consensus 131 C~~~G~C~~g~C~C-~~G~~G~~C~~~-~C~~~C~~~G~C~~~~-~C~C~~G~~G~~C~~~-----~C~~~c~~~Gg~C~ 202 (377)
|..++........| ..+|.|..|+.+ .|...|.. -+|.... .|.+..+|.+..|... .|...+ .+...+.
T Consensus 81 c~~~~~~~~~~~~~~~~~~~g~~C~~~~~~~~~c~~-~~C~~~~~~c~~~~~~~~~~C~~~~~~g~~C~~~c-~~~~~~~ 158 (316)
T KOG1218|consen 81 CKNGGTCVSSTGYCHLNGYEGPQCESPCPCGDGCAE-KTCANPRRECRCGGGYIGEQCGEENLVGLKCQRDC-QCTGGCD 158 (316)
T ss_pred cCCCCcccCCCCcccCCCCCcccccCCCCcCCcccc-cccCCCccceecCCcCccccccccCCCCCCccCCC-CCccccC
Confidence 55555555334444 789999999864 23222333 4555444 4788888888888761 365555 1111233
Q ss_pred cCCCceEe-cCCCCCCCCCCCcCcCCCCccccC
Q 017103 203 NGVCEFRC-SDYAGYTCQNSSKLISSLSVCKYV 234 (377)
Q Consensus 203 ~~~~~~~C-~~~~G~~C~~~~~~c~~~~~C~~~ 234 (377)
.....+.| ++|.|.+++.....|.....|.++
T Consensus 159 ~~~~~c~c~~g~~g~~~~~~~~~c~~~~~~~~g 191 (316)
T KOG1218|consen 159 CKNGICTCQPGFVGVFCVESCSGCSPLTACENG 191 (316)
T ss_pred CCCCceeccCCcccccccccCCCcCCCcccCCC
Confidence 33333344 578888887765445445555554
No 39
>smart00181 EGF Epidermal growth factor-like domain.
Probab=86.75 E-value=0.69 Score=29.13 Aligned_cols=10 Identities=40% Similarity=1.158 Sum_probs=5.0
Q ss_pred ccccCCCCcC
Q 017103 141 KCHCFLGFHG 150 (377)
Q Consensus 141 ~C~C~~G~~G 150 (377)
+|.|++||.|
T Consensus 21 ~C~C~~g~~g 30 (35)
T smart00181 21 TCSCPPGYTG 30 (35)
T ss_pred EeECCCCCcc
Confidence 4455555544
No 40
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx (disulphide bonds C1-C3, C2-C4, C5-C6), where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=84.71 E-value=0.54 Score=31.52 Aligned_cols=29 Identities=41% Similarity=1.116 Sum_probs=21.5
Q ss_pred cccCCCCCcccccCCCCCCCCCCeeeC--C--ccccCCCCc
Q 017103 113 HELCSTGPIAVFGQCPNSCTFNGDCVD--G--KCHCFLGFH 149 (377)
Q Consensus 113 ~~~C~~~~~~~~~~C~~~C~~~G~C~~--g--~C~C~~G~~ 149 (377)
+++|... ++.|..++.|++ | .|.|++||.
T Consensus 2 idEC~~~--------~~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 2 IDECAEG--------PHNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp SSTTTTT--------SSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred ccccCCC--------CCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 4567654 356888899984 3 899999997
No 41
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=83.65 E-value=0.22 Score=36.78 Aligned_cols=42 Identities=21% Similarity=0.560 Sum_probs=23.3
Q ss_pred cCCCceeeecCCCCcccCCCCCcccccCCCCCCCCCCeee-CCccccCCCCcCCCC
Q 017103 99 QFPGFNGELICPAYHELCSTGPIAVFGQCPNSCTFNGDCV-DGKCHCFLGFHGHDC 153 (377)
Q Consensus 99 C~~G~~G~i~C~~~~~~C~~~~~~~~~~C~~~C~~~G~C~-~g~C~C~~G~~G~~C 153 (377)
|.+.|.|. .|.. +|... ..=..+-+|. +|.=.|.+||+|++|
T Consensus 21 C~~nyyG~-~C~~---~C~~~---------~d~~ghy~Cd~~G~~~C~~Gw~G~~C 63 (63)
T PF01414_consen 21 CDENYYGP-NCSK---FCKPR---------DDSFGHYTCDSNGNKVCLPGWTGPNC 63 (63)
T ss_dssp --TTEETT-TT-E---E---E---------EETTEEEEE-SS--EEE-TTEESTTS
T ss_pred CCCCCCCc-cccC---CcCCC---------cCCcCCcccCCCCCCCCCCCCcCCCC
Confidence 34889999 6654 55421 0124567886 689999999999987
No 42
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=81.23 E-value=1.9 Score=50.94 Aligned_cols=86 Identities=29% Similarity=0.731 Sum_probs=50.9
Q ss_pred CccCCCceeeecCCCCcccCCCCCc-------ccccCCCCCCCCC-Ceee--CCccccCCCCcCCCCCC-----------
Q 017103 97 PVQFPGFNGELICPAYHELCSTGPI-------AVFGQCPNSCTFN-GDCV--DGKCHCFLGFHGHDCSK----------- 155 (377)
Q Consensus 97 c~C~~G~~G~i~C~~~~~~C~~~~~-------~~~~~C~~~C~~~-G~C~--~g~C~C~~G~~G~~C~~----------- 155 (377)
|.|++||+|. .|+. |...-. ....-+|-.|.++ .+|. +|.|.|.+--.|..|+.
T Consensus 697 c~C~~g~tG~-~Ce~----C~~gfrr~~~~~~~~~~c~~C~cngh~~~Cd~~tG~C~C~~~t~G~~C~~C~~GfYg~~~~ 771 (1705)
T KOG1836|consen 697 CTCPVGYTGQ-FCES----CAPGFRRLSPQLGPFCPCIPCDCNGHSNICDPRTGQCKCKHNTFGGQCAQCVDGFYGLPDL 771 (1705)
T ss_pred ccCCCCcccc-hhhh----cchhhhcccccCCCCCcccccccCCccccccCCCCceecccCCCCCchhhhcCCCCCcccc
Confidence 4778999999 5764 322110 0011123346555 5664 46666666555555542
Q ss_pred --C-CC-CCCCCCCceec-----CCCccc-cCCCCCCCCCCC
Q 017103 156 --R-SC-PDNCNGHGKCL-----SNGACE-CENGYTGIDCST 187 (377)
Q Consensus 156 --~-~C-~~~C~~~G~C~-----~~~~C~-C~~G~~G~~C~~ 187 (377)
+ .| +-+|.+++.|. ....|. |++||+|.+|+.
T Consensus 772 ~~~~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~ 813 (1705)
T KOG1836|consen 772 GTSGDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRCEE 813 (1705)
T ss_pred CCCCCCccCCCCCChhhcCcCcccceecCCCCCCCccccccc
Confidence 1 14 34566666665 356798 999999999985
No 43
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=80.73 E-value=1.3 Score=30.86 Aligned_cols=16 Identities=31% Similarity=0.889 Sum_probs=12.8
Q ss_pred CCccccCCCCcCCCCC
Q 017103 139 DGKCHCFLGFHGHDCS 154 (377)
Q Consensus 139 ~g~C~C~~G~~G~~C~ 154 (377)
+|+|.|.++|+|..|+
T Consensus 18 ~G~C~C~~~~~G~~C~ 33 (50)
T cd00055 18 TGQCECKPNTTGRRCD 33 (50)
T ss_pred CCEEeCCCcCCCCCCC
Confidence 5788888888888886
No 44
>smart00181 EGF Epidermal growth factor-like domain.
Probab=78.52 E-value=2.9 Score=26.16 Aligned_cols=29 Identities=28% Similarity=0.625 Sum_probs=19.8
Q ss_pred Ccc-cCcCCCCEEEeeCCceeecCCCCCCccCCCcee-e
Q 017103 70 CYQ-HRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNG-E 106 (377)
Q Consensus 70 C~~-~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G-~ 106 (377)
|.. .+|.++ +|++..+. +.| .|++||.| .
T Consensus 2 C~~~~~C~~~-~C~~~~~~-~~C------~C~~g~~g~~ 32 (35)
T smart00181 2 CASGGPCSNG-TCINTPGS-YTC------SCPPGYTGDK 32 (35)
T ss_pred CCCcCCCCCC-EEECCCCC-eEe------ECCCCCccCC
Confidence 555 689988 99876433 333 45599999 6
No 45
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=77.54 E-value=1.4 Score=30.44 Aligned_cols=20 Identities=35% Similarity=0.914 Sum_probs=14.0
Q ss_pred eee--CCccccCCCCcCCCCCC
Q 017103 136 DCV--DGKCHCFLGFHGHDCSK 155 (377)
Q Consensus 136 ~C~--~g~C~C~~G~~G~~C~~ 155 (377)
+|. +++|.|.++|+|..|++
T Consensus 12 ~C~~~~G~C~C~~~~~G~~C~~ 33 (49)
T PF00053_consen 12 TCDPSTGQCVCKPGTTGPRCDQ 33 (49)
T ss_dssp SEEETCEEESBSTTEESTTS-E
T ss_pred cccCCCCEEeccccccCCcCcC
Confidence 564 46888888888888863
No 46
>PHA02887 EGF-like protein; Provisional
Probab=76.38 E-value=1.9 Score=35.66 Aligned_cols=25 Identities=36% Similarity=0.938 Sum_probs=19.9
Q ss_pred CCCCCCeee--C----CccccCCCCcCCCCCC
Q 017103 130 SCTFNGDCV--D----GKCHCFLGFHGHDCSK 155 (377)
Q Consensus 130 ~C~~~G~C~--~----g~C~C~~G~~G~~C~~ 155 (377)
.|. ||+|. . -.|.|++||+|..|+.
T Consensus 93 YCi-HG~C~yI~dL~epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 93 FCI-NGECMNIIDLDEKFCICNKGYTGIRCDE 123 (126)
T ss_pred Eee-CCEEEccccCCCceeECCCCcccCCCCc
Confidence 455 67995 1 3899999999999986
No 47
>PHA02887 EGF-like protein; Provisional
Probab=75.30 E-value=2 Score=35.54 Aligned_cols=25 Identities=44% Similarity=1.213 Sum_probs=20.2
Q ss_pred CCCCceec-----CCCccccCCCCCCCCCCC
Q 017103 162 CNGHGKCL-----SNGACECENGYTGIDCST 187 (377)
Q Consensus 162 C~~~G~C~-----~~~~C~C~~G~~G~~C~~ 187 (377)
|. ||+|. +...|.|++||+|.+|+.
T Consensus 94 Ci-HG~C~yI~dL~epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 94 CI-NGECMNIIDLDEKFCICNKGYTGIRCDE 123 (126)
T ss_pred ee-CCEEEccccCCCceeECCCCcccCCCCc
Confidence 55 56775 467899999999999985
No 48
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=70.17 E-value=4.3 Score=44.14 Aligned_cols=53 Identities=32% Similarity=0.792 Sum_probs=36.2
Q ss_pred CCCCCeeeCCccccCCCCcCCCCCCCCCCCCCCCCceecCCCccccCCCCCCCCCCCC
Q 017103 131 CTFNGDCVDGKCHCFLGFHGHDCSKRSCPDNCNGHGKCLSNGACECENGYTGIDCSTA 188 (377)
Q Consensus 131 C~~~G~C~~g~C~C~~G~~G~~C~~~~C~~~C~~~G~C~~~~~C~C~~G~~G~~C~~~ 188 (377)
|..+-.|++.+|.=.. ..+..+ |+..|++||.|....+|.|.+||.+++|++.
T Consensus 606 Cg~~~vC~~~~C~~~~-v~~~~~----~~~~C~g~GVCnn~~~ChC~~gwapp~C~~~ 658 (716)
T KOG3607|consen 606 CGPGMICINHRCLSAS-VLNSSC----CPTTCNGHGVCNNELNCHCEPGWAPPFCFIF 658 (716)
T ss_pred cCCCceecCCcchhhh-hhcccc----cccccCCCcccCCCcceeeCCCCCCCccccc
Confidence 4444455555554333 333332 4556899999998999999999999999864
No 49
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=67.29 E-value=4.1 Score=26.50 Aligned_cols=27 Identities=26% Similarity=0.533 Sum_probs=17.9
Q ss_pred cCcCCCCEEEeeCCceeecCCCCCCccCCCceee
Q 017103 73 HRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGE 106 (377)
Q Consensus 73 ~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~ 106 (377)
..|..+++|+++.+ .++| +|.+||.|.
T Consensus 6 ~~C~~nA~C~~~~~-~~~C------~C~~Gy~Gd 32 (36)
T PF12947_consen 6 GGCHPNATCTNTGG-SYTC------TCKPGYEGD 32 (36)
T ss_dssp GGS-TTCEEEE-TT-SEEE------EE-CEEECC
T ss_pred CCCCCCcEeecCCC-CEEe------ECCCCCccC
Confidence 36788899999877 4444 456999986
No 50
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=67.21 E-value=4.7 Score=28.01 Aligned_cols=19 Identities=42% Similarity=1.027 Sum_probs=15.3
Q ss_pred ec-CCCccccCCCCCCCCCC
Q 017103 168 CL-SNGACECENGYTGIDCS 186 (377)
Q Consensus 168 C~-~~~~C~C~~G~~G~~C~ 186 (377)
|. ..++|.|.++|+|..|+
T Consensus 14 C~~~~G~C~C~~~~~G~~C~ 33 (50)
T cd00055 14 CDPGTGQCECKPNTTGRRCD 33 (50)
T ss_pred ccCCCCEEeCCCcCCCCCCC
Confidence 54 26778899999999987
No 51
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=67.10 E-value=81 Score=37.93 Aligned_cols=52 Identities=25% Similarity=0.404 Sum_probs=35.0
Q ss_pred ccceecCCC--CCCCCCcccCcCCCCEEEeeCC-ceeecCCCCCCccCCCceeeecCCCCcc
Q 017103 56 RTGFVRGSM--TQGNGCYQHRCVNNSLEVAVDG-IWKVCPEAGGPVQFPGFNGELICPAYHE 114 (377)
Q Consensus 56 ~~g~~g~~c--~~~d~C~~~pC~ngg~Cv~~~~-~~~~C~~~g~c~C~~G~~G~i~C~~~~~ 114 (377)
..||.|..- +..| |..-||-+++.|..+.. .+.+|+ .|++||+|. +|+.-.+
T Consensus 762 ~~GfYg~~~~~~~~d-C~~C~Cp~~~~~~~~~~~~~~iCk-----~Cp~gytG~-rCe~c~d 816 (1705)
T KOG1836|consen 762 VDGFYGLPDLGTSGD-CQPCPCPNGGACGQTPEILEVVCK-----NCPPGYTGL-RCEECAD 816 (1705)
T ss_pred cCCCCCccccCCCCC-CccCCCCCChhhcCcCcccceecC-----CCCCCCccc-ccccCCC
Confidence 346655442 3344 99999999988877653 334664 378999999 8876333
No 52
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=64.64 E-value=2.9 Score=28.82 Aligned_cols=21 Identities=43% Similarity=0.967 Sum_probs=16.0
Q ss_pred eec-CCCccccCCCCCCCCCCC
Q 017103 167 KCL-SNGACECENGYTGIDCST 187 (377)
Q Consensus 167 ~C~-~~~~C~C~~G~~G~~C~~ 187 (377)
.|. ..++|.|.++|+|..|++
T Consensus 12 ~C~~~~G~C~C~~~~~G~~C~~ 33 (49)
T PF00053_consen 12 TCDPSTGQCVCKPGTTGPRCDQ 33 (49)
T ss_dssp SEEETCEEESBSTTEESTTS-E
T ss_pred cccCCCCEEeccccccCCcCcC
Confidence 555 477889999999999874
No 53
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=64.50 E-value=4.6 Score=33.99 Aligned_cols=17 Identities=47% Similarity=1.138 Sum_probs=9.5
Q ss_pred CCccccCCCCCCCCCCC
Q 017103 171 NGACECENGYTGIDCST 187 (377)
Q Consensus 171 ~~~C~C~~G~~G~~C~~ 187 (377)
...|.|..||+|.+||.
T Consensus 66 ~~~CrC~~GYtGeRCEh 82 (139)
T PHA03099 66 GMYCRCSHGYTGIRCQH 82 (139)
T ss_pred CceeECCCCcccccccc
Confidence 34455666666666553
No 54
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=63.70 E-value=7.8 Score=26.93 Aligned_cols=20 Identities=35% Similarity=0.966 Sum_probs=16.0
Q ss_pred CCCCCCeeeCCccccCCCCc
Q 017103 130 SCTFNGDCVDGKCHCFLGFH 149 (377)
Q Consensus 130 ~C~~~G~C~~g~C~C~~G~~ 149 (377)
.|..+..|++++|.|++||.
T Consensus 27 qC~~~s~C~~g~C~C~~g~~ 46 (52)
T PF01683_consen 27 QCIGGSVCVNGRCQCPPGYV 46 (52)
T ss_pred CCCCcCEEcCCEeECCCCCE
Confidence 46677888888999988875
No 55
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=61.03 E-value=6.6 Score=33.08 Aligned_cols=26 Identities=35% Similarity=0.934 Sum_probs=20.2
Q ss_pred CCCCCCeee--C----CccccCCCCcCCCCCCC
Q 017103 130 SCTFNGDCV--D----GKCHCFLGFHGHDCSKR 156 (377)
Q Consensus 130 ~C~~~G~C~--~----g~C~C~~G~~G~~C~~~ 156 (377)
.|.+ |+|. . -.|.|+.||+|.+||..
T Consensus 52 YClH-G~C~yI~dl~~~~CrC~~GYtGeRCEh~ 83 (139)
T PHA03099 52 YCLH-GDCIHARDIDGMYCRCSHGYTGIRCQHV 83 (139)
T ss_pred EeEC-CEEEeeccCCCceeECCCCcccccccce
Confidence 4655 5885 1 38999999999999863
No 56
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=57.08 E-value=6.6 Score=42.71 Aligned_cols=32 Identities=31% Similarity=0.699 Sum_probs=27.8
Q ss_pred cCCCCCCCCCCeeeC-CccccCCCCcCCCCCCC
Q 017103 125 GQCPNSCTFNGDCVD-GKCHCFLGFHGHDCSKR 156 (377)
Q Consensus 125 ~~C~~~C~~~G~C~~-g~C~C~~G~~G~~C~~~ 156 (377)
..||..|+++|+|.+ ..|+|.+||.+++|+++
T Consensus 626 ~~~~~~C~g~GVCnn~~~ChC~~gwapp~C~~~ 658 (716)
T KOG3607|consen 626 SCCPTTCNGHGVCNNELNCHCEPGWAPPFCFIF 658 (716)
T ss_pred cccccccCCCcccCCCcceeeCCCCCCCccccc
Confidence 455778999999985 69999999999999874
No 57
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface protein (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=56.28 E-value=2 Score=38.60 Aligned_cols=103 Identities=21% Similarity=0.587 Sum_probs=49.8
Q ss_pred CCCCCCCCCcc-----cCcCCCCEEEeeCC----ceeecCCCCCCccCCCceeee-cCCCCcccCCCCCcccccCCCCCC
Q 017103 62 GSMTQGNGCYQ-----HRCVNNSLEVAVDG----IWKVCPEAGGPVQFPGFNGEL-ICPAYHELCSTGPIAVFGQCPNSC 131 (377)
Q Consensus 62 ~~c~~~d~C~~-----~pC~ngg~Cv~~~~----~~~~C~~~g~c~C~~G~~G~i-~C~~~~~~C~~~~~~~~~~C~~~C 131 (377)
..|+...+|.. .+|.+-+.|+.... ..+.| .|.+||+=.. +|- ...|.. ..|
T Consensus 34 ntCE~kv~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C------~C~~gY~~~~~vCv--p~~C~~----------~~C 95 (197)
T PF06247_consen 34 NTCEEKVECDKLENVNKPCGDYAKCINQANKGEERAYKC------DCINGYILKQGVCV--PNKCNN----------KDC 95 (197)
T ss_dssp TEEEE----SG-GGTTSEEETTEEEEE-SSTTSSTSEEE------EE-TTEEESSSSEE--EGGGSS-------------
T ss_pred cccccceecCcccccCccccchhhhhcCCCcccceeEEE------ecccCceeeCCeEc--hhhcCc----------eec
Confidence 34666677865 38999999997553 23344 5668986441 120 113332 135
Q ss_pred CCCCeee-C------CccccCCCCc---CCCCCC---CCCCCCCCCCceec---CCCccccCCCCCCC
Q 017103 132 TFNGDCV-D------GKCHCFLGFH---GHDCSK---RSCPDNCNGHGKCL---SNGACECENGYTGI 183 (377)
Q Consensus 132 ~~~G~C~-~------g~C~C~~G~~---G~~C~~---~~C~~~C~~~G~C~---~~~~C~C~~G~~G~ 183 (377)
. .|.|+ + ..|.|.-|+. ...|.. ..|...|..+-.|. ..|.|.|.+++.+.
T Consensus 96 g-~GKCI~d~~~~~~~~CSC~IGkV~~dn~kCtk~G~T~C~LKCk~nE~CK~~~~~Y~C~~~~~~~~~ 162 (197)
T PF06247_consen 96 G-SGKCILDPDNPNNPTCSCNIGKVPDDNKKCTKTGETKCSLKCKENEECKLVDGYYKCVCKEGFPGD 162 (197)
T ss_dssp T-TEEEEEEEGGGSEEEEEE-TEEETTTTTESEEEE--------TTTEEEEEETTEEEEEE-TT-EEE
T ss_pred C-CCeEEecCCCCCCceeEeeeceEeccCCcccCCCccceeeecCCCcceeeeCcEEEeecCCCCCCC
Confidence 3 78887 2 1899988887 222432 24555666667775 35778887777543
No 58
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=56.18 E-value=5.7 Score=28.37 Aligned_cols=26 Identities=42% Similarity=0.917 Sum_probs=14.5
Q ss_pred CCCCCCeee-C-----C--ccccCCCCcCCCCCC
Q 017103 130 SCTFNGDCV-D-----G--KCHCFLGFHGHDCSK 155 (377)
Q Consensus 130 ~C~~~G~C~-~-----g--~C~C~~G~~G~~C~~ 155 (377)
.|++||+.. + | .|.|..-|.|++|++
T Consensus 18 ~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~ 51 (56)
T PF04863_consen 18 SCSGHGRAFLDGLIADGSPVCECNSCYGGPDCST 51 (56)
T ss_dssp --TTSEE--TTS-EETTEE--EE-TTEESTTS-E
T ss_pred CcCCCCeeeeccccccCCccccccCCcCCCCccc
Confidence 588888874 3 2 788999999999875
No 59
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=55.23 E-value=7.4 Score=43.95 Aligned_cols=42 Identities=17% Similarity=0.409 Sum_probs=31.7
Q ss_pred CCCCCCCCCcccCcCCCCEEEee-CCceeecCCCCCCccCCCceeeecCCC
Q 017103 62 GSMTQGNGCYQHRCVNNSLEVAV-DGIWKVCPEAGGPVQFPGFNGELICPA 111 (377)
Q Consensus 62 ~~c~~~d~C~~~pC~ngg~Cv~~-~~~~~~C~~~g~c~C~~G~~G~i~C~~ 111 (377)
..|...|+|..+||+++|.|... +.+.+.|. ..||.|. +|+.
T Consensus 540 d~C~i~drClPN~CehgG~C~Qs~~~f~C~C~-------~TGY~Ga-tCHt 582 (1306)
T KOG3516|consen 540 DMCGISDRCLPNPCEHGGKCSQSWDDFECNCE-------LTGYKGA-TCHT 582 (1306)
T ss_pred cccccccccCCccccCCCcccccccceeEecc-------ccccccc-cccC
Confidence 35778899999999999999872 23333553 2799999 7876
No 60
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=53.90 E-value=8.5 Score=26.26 Aligned_cols=17 Identities=29% Similarity=0.935 Sum_probs=14.9
Q ss_pred CCccccCCCCcCCCCCC
Q 017103 139 DGKCHCFLGFHGHDCSK 155 (377)
Q Consensus 139 ~g~C~C~~G~~G~~C~~ 155 (377)
+|+|.|+++|+|..|+.
T Consensus 17 ~G~C~C~~~~~G~~C~~ 33 (46)
T smart00180 17 TGQCECKPNVTGRRCDR 33 (46)
T ss_pred CCEEECCCCCCCCCCCc
Confidence 57999999999999973
No 61
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=46.51 E-value=19 Score=23.21 Aligned_cols=25 Identities=40% Similarity=1.188 Sum_probs=13.0
Q ss_pred CCCCCCCCCCCCCceecCCCccccCCCCC
Q 017103 153 CSKRSCPDNCNGHGKCLSNGACECENGYT 181 (377)
Q Consensus 153 C~~~~C~~~C~~~G~C~~~~~C~C~~G~~ 181 (377)
|.+..|+..|..+ ..++|.|++||.
T Consensus 3 Cn~t~CpA~CDpn----~~~~C~CPeGyI 27 (34)
T PF09064_consen 3 CNQTECPADCDPN----SPGQCFCPEGYI 27 (34)
T ss_pred cccccCCCccCCC----CCCceeCCCceE
Confidence 4444454444322 245677777774
No 62
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=38.95 E-value=18 Score=29.38 Aligned_cols=19 Identities=37% Similarity=0.926 Sum_probs=13.4
Q ss_pred CCCCCCCCeeeCC---------ccccCC
Q 017103 128 PNSCTFNGDCVDG---------KCHCFL 146 (377)
Q Consensus 128 ~~~C~~~G~C~~g---------~C~C~~ 146 (377)
.+.|++||.|+.. .|.|.+
T Consensus 12 Tn~CsgHG~C~~~~~~~~~~C~~C~C~~ 39 (103)
T PF12955_consen 12 TNNCSGHGSCVKKYGSGGGDCFACKCKP 39 (103)
T ss_pred ccCCCCCceEeeccCCCccceEEEEeec
Confidence 3579999999732 566765
No 63
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=31.79 E-value=21 Score=21.11 Aligned_cols=10 Identities=40% Similarity=1.092 Sum_probs=6.9
Q ss_pred CccccCCCCC
Q 017103 172 GACECENGYT 181 (377)
Q Consensus 172 ~~C~C~~G~~ 181 (377)
|.|.|++||.
T Consensus 2 y~C~C~~Gy~ 11 (24)
T PF12662_consen 2 YTCSCPPGYQ 11 (24)
T ss_pred EEeeCCCCCc
Confidence 5677777775
No 64
>PF08247 ENOD40: ENOD40 protein; InterPro: IPR013186 The soybean early nodulin 40 (ENOD40) mRNA contains two short overlapping ORFs; in vitro translation yields two peptides of 12 and 24 amino acids []. The putative role of the ENOD40 genes has been in favour of organogenesis, such as induction of the cortical cell divisions that lead to initiation of nodule primordia, in developing lateral roots and embryonic tissues. This supports the hypothesis for a role of ENOD40 in lateral organ development [].
Probab=29.45 E-value=9.5 Score=18.55 Aligned_cols=12 Identities=25% Similarity=0.841 Sum_probs=8.2
Q ss_pred hhhhhhHhhccc
Q 017103 350 IRLSWLDRLRGG 361 (377)
Q Consensus 350 ~~~~~~~~~~~~ 361 (377)
|++.|+.+++++
T Consensus 1 m~l~wqksihgs 12 (12)
T PF08247_consen 1 MELCWQKSIHGS 12 (12)
T ss_pred CceeEeeeecCC
Confidence 677887776653
No 65
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=28.44 E-value=37 Score=38.69 Aligned_cols=33 Identities=42% Similarity=1.138 Sum_probs=25.7
Q ss_pred CC-CCCCCCCceec---CCCccccC-CCCCCCCCCCCC
Q 017103 157 SC-PDNCNGHGKCL---SNGACECE-NGYTGIDCSTAV 189 (377)
Q Consensus 157 ~C-~~~C~~~G~C~---~~~~C~C~-~G~~G~~C~~~~ 189 (377)
.| |++|.++|.|. +.+.|.|. .||.|..|...+
T Consensus 547 rClPN~CehgG~C~Qs~~~f~C~C~~TGY~GatCHtsi 584 (1306)
T KOG3516|consen 547 RCLPNPCEHGGKCSQSWDDFECNCELTGYKGATCHTSI 584 (1306)
T ss_pred ccCCccccCCCcccccccceeEeccccccccccccCCC
Confidence 46 77888888887 46788887 888888888654
No 66
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=27.39 E-value=34 Score=38.65 Aligned_cols=34 Identities=24% Similarity=0.515 Sum_probs=0.0
Q ss_pred CcccCcCCCCEEEe-eCCceeecCCCCCCccCCCceeeecCCC
Q 017103 70 CYQHRCVNNSLEVA-VDGIWKVCPEAGGPVQFPGFNGELICPA 111 (377)
Q Consensus 70 C~~~pC~ngg~Cv~-~~~~~~~C~~~g~c~C~~G~~G~i~C~~ 111 (377)
|..+||+|+|+|.. -+.+-+.|.. .||.|+ +|+.
T Consensus 626 C~~nPC~N~g~C~egwNrfiCDCs~-------T~~~G~-~Cer 660 (1591)
T KOG3514|consen 626 CESNPCQNGGKCSEGWNRFICDCSG-------TGFEGR-TCER 660 (1591)
T ss_pred cCCCcccCCCCcccccccccccccc-------CcccCc-cccc
No 67
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=23.84 E-value=77 Score=20.86 Aligned_cols=31 Identities=19% Similarity=0.345 Sum_probs=18.5
Q ss_pred CcccCcCCCCEEEeeCCceeecCCCCCCccCCCceee
Q 017103 70 CYQHRCVNNSLEVAVDGIWKVCPEAGGPVQFPGFNGE 106 (377)
Q Consensus 70 C~~~pC~ngg~Cv~~~~~~~~C~~~g~c~C~~G~~G~ 106 (377)
|...+|-.++-|++.++...+| .|.+||.-.
T Consensus 2 C~~~~cP~NA~C~~~~dG~eec------rCllgyk~~ 32 (37)
T PF12946_consen 2 CIDTKCPANAGCFRYDDGSEEC------RCLLGYKKV 32 (37)
T ss_dssp -SSS---TTEEEEEETTSEEEE------EE-TTEEEE
T ss_pred ccCccCCCCcccEEcCCCCEEE------EeeCCcccc
Confidence 5667777888999877555555 566998643
No 68
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=20.89 E-value=96 Score=24.92 Aligned_cols=26 Identities=42% Similarity=0.906 Sum_probs=18.0
Q ss_pred cCCC--CCCCCCCeeeC---CccccCCCCcC
Q 017103 125 GQCP--NSCTFNGDCVD---GKCHCFLGFHG 150 (377)
Q Consensus 125 ~~C~--~~C~~~G~C~~---g~C~C~~G~~G 150 (377)
.+|. ..|...|.|.. ..|.|.+||.-
T Consensus 78 d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~P 108 (110)
T PF00954_consen 78 DQCDVYGFCGPNGICNSNNSPKCSCLPGFEP 108 (110)
T ss_pred cCCCCccccCCccEeCCCCCCceECCCCcCC
Confidence 5663 46777888863 27888888864
Done!