Query 005509
Match_columns 693
No_of_seqs 457 out of 2553
Neff 7.8
Searched_HMMs 46136
Date Fri Mar 29 00:35:38 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/005509.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/005509hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF03016 Exostosin: Exostosin 100.0 1E-46 2.2E-51 399.9 18.8 283 348-687 2-290 (302)
2 KOG1021 Acetylglucosaminyltran 100.0 2E-40 4.4E-45 368.2 18.8 296 349-688 71-398 (464)
3 KOG2264 Exostosin EXT1L [Signa 99.7 1.1E-16 2.3E-21 170.3 12.9 229 398-685 218-474 (907)
4 KOG1225 Teneurin-1 and related 99.6 6.7E-16 1.4E-20 169.8 10.6 123 114-305 243-365 (525)
5 KOG1225 Teneurin-1 and related 99.5 7.8E-14 1.7E-18 153.6 11.2 171 111-309 167-343 (525)
6 KOG1226 Integrin beta subunit 99.4 6.9E-13 1.5E-17 148.0 10.1 143 126-308 468-621 (783)
7 KOG1022 Acetylglucosaminyltran 99.1 7.3E-10 1.6E-14 118.7 11.3 225 398-687 126-358 (691)
8 KOG0994 Extracellular matrix g 99.0 7.1E-10 1.5E-14 126.7 11.0 200 96-308 881-1147(1758)
9 KOG1226 Integrin beta subunit 99.0 5E-10 1.1E-14 125.4 7.9 134 126-309 515-653 (783)
10 KOG1219 Uncharacterized conser 99.0 5.9E-10 1.3E-14 133.5 7.5 107 121-310 3865-3980(4289)
11 KOG0994 Extracellular matrix g 98.9 3.1E-09 6.7E-14 121.6 8.7 170 121-308 854-1099(1758)
12 KOG1836 Extracellular matrix g 98.4 1.4E-06 3.1E-11 108.4 13.3 105 93-202 699-814 (1705)
13 KOG4289 Cadherin EGF LAG seven 98.2 2.6E-06 5.6E-11 100.1 8.4 73 120-195 1716-1799(2531)
14 KOG4289 Cadherin EGF LAG seven 98.2 2.2E-06 4.7E-11 100.7 6.6 94 137-275 1220-1318(2531)
15 KOG1219 Uncharacterized conser 98.1 3.1E-06 6.6E-11 103.0 5.1 68 233-309 3869-3940(4289)
16 KOG1836 Extracellular matrix g 98.0 5E-05 1.1E-09 95.0 13.2 198 95-309 751-1023(1705)
17 PF07974 EGF_2: EGF-like domai 97.9 9.5E-06 2.1E-10 55.4 3.4 27 125-151 6-32 (32)
18 KOG1217 Fibrillins and related 97.8 0.00015 3.3E-09 81.4 12.8 65 236-310 280-356 (487)
19 KOG4260 Uncharacterized conser 97.6 9.3E-05 2E-09 73.9 6.6 132 139-301 130-303 (350)
20 PF07974 EGF_2: EGF-like domai 97.6 5.2E-05 1.1E-09 51.8 3.0 25 282-306 6-32 (32)
21 KOG3512 Netrin, axonal chemotr 97.4 0.00096 2.1E-08 71.6 10.8 153 126-312 279-483 (592)
22 KOG1214 Nidogen and related ba 97.4 0.00049 1.1E-08 78.0 9.0 142 124-305 699-860 (1289)
23 KOG1217 Fibrillins and related 97.3 0.0023 4.9E-08 71.9 12.8 150 127-306 136-306 (487)
24 smart00051 DSL delta serrate l 97.1 0.00049 1.1E-08 54.9 3.3 46 258-306 17-63 (63)
25 PF00008 EGF: EGF-like domain 97.0 0.00064 1.4E-08 46.6 2.7 27 124-150 3-32 (32)
26 KOG1214 Nidogen and related ba 96.9 0.0028 6E-08 72.2 8.7 138 121-303 738-908 (1289)
27 PF12661 hEGF: Human growth fa 96.2 0.0027 5.8E-08 34.1 1.3 13 139-151 1-13 (13)
28 PF00852 Glyco_transf_10: Glyc 96.2 0.0087 1.9E-07 65.0 6.5 129 536-681 141-283 (349)
29 PF12661 hEGF: Human growth fa 96.1 0.0026 5.6E-08 34.2 0.9 13 294-306 1-13 (13)
30 KOG1218 Proteins containing Ca 96.0 0.081 1.8E-06 56.3 12.8 153 134-303 45-209 (316)
31 cd00055 EGF_Lam Laminin-type e 96.0 0.0074 1.6E-07 45.9 3.3 28 126-153 3-34 (50)
32 smart00179 EGF_CA Calcium-bind 95.9 0.011 2.4E-07 41.8 3.7 32 121-152 3-39 (39)
33 PF00008 EGF: EGF-like domain 95.9 0.0041 8.8E-08 42.6 1.3 24 282-305 4-32 (32)
34 PF00053 Laminin_EGF: Laminin 95.8 0.0063 1.4E-07 46.0 2.4 23 131-153 11-33 (49)
35 smart00051 DSL delta serrate l 95.6 0.01 2.3E-07 47.3 3.0 30 121-151 32-63 (63)
36 KOG3512 Netrin, axonal chemotr 95.6 0.047 1E-06 59.1 8.6 103 95-202 301-430 (592)
37 cd00054 EGF_CA Calcium-binding 95.2 0.026 5.6E-07 39.3 3.7 32 121-152 3-38 (38)
38 KOG4260 Uncharacterized conser 94.9 0.019 4.1E-07 57.9 2.9 45 262-308 132-183 (350)
39 smart00180 EGF_Lam Laminin-typ 94.7 0.033 7.1E-07 41.5 3.2 23 131-153 11-33 (46)
40 PF01414 DSL: Delta serrate li 94.3 0.014 3.1E-07 46.6 0.5 45 257-306 16-63 (63)
41 cd00053 EGF Epidermal growth f 94.0 0.071 1.5E-06 36.4 3.4 29 124-152 5-36 (36)
42 smart00181 EGF Epidermal growt 94.0 0.076 1.6E-06 36.6 3.5 27 125-152 6-35 (35)
43 PHA02887 EGF-like protein; Pro 93.9 0.07 1.5E-06 47.0 3.9 33 121-154 84-124 (126)
44 smart00179 EGF_CA Calcium-bind 92.9 0.11 2.4E-06 36.5 3.1 26 282-307 9-39 (39)
45 KOG1218 Proteins containing Ca 92.8 1.1 2.4E-05 47.6 11.9 27 127-154 81-107 (316)
46 cd00054 EGF_CA Calcium-binding 92.4 0.14 3.1E-06 35.4 3.0 26 282-307 9-38 (38)
47 PF04863 EGF_alliinase: Alliin 91.9 0.09 2E-06 40.0 1.5 29 126-154 18-52 (56)
48 PF07645 EGF_CA: Calcium-bindi 91.0 0.22 4.8E-06 36.2 2.8 27 121-147 3-34 (42)
49 cd00053 EGF Epidermal growth f 90.5 0.26 5.7E-06 33.5 2.8 26 282-307 6-36 (36)
50 cd00055 EGF_Lam Laminin-type e 90.1 0.35 7.5E-06 36.7 3.3 20 289-308 13-34 (50)
51 KOG2619 Fucosyltransferase [Ca 89.7 0.63 1.4E-05 50.3 6.0 126 536-676 162-293 (372)
52 PF01414 DSL: Delta serrate li 89.6 0.17 3.6E-06 40.5 1.2 46 139-199 18-63 (63)
53 PHA02887 EGF-like protein; Pro 87.8 0.4 8.6E-06 42.4 2.4 26 283-309 93-124 (126)
54 PF00053 Laminin_EGF: Laminin 87.5 0.31 6.8E-06 36.7 1.4 28 288-317 11-40 (49)
55 smart00181 EGF Epidermal growt 86.5 0.66 1.4E-05 31.8 2.6 25 282-307 6-35 (35)
56 PF12947 EGF_3: EGF domain; I 86.2 0.48 1E-05 33.3 1.7 26 125-150 6-33 (36)
57 PF12955 DUF3844: Domain of un 84.9 0.55 1.2E-05 41.2 1.8 32 282-313 13-66 (103)
58 PF12955 DUF3844: Domain of un 84.8 0.76 1.6E-05 40.3 2.6 31 124-154 12-62 (103)
59 KOG3607 Meltrins, fertilins an 84.6 0.62 1.4E-05 54.9 2.7 33 121-154 626-658 (716)
60 KOG3607 Meltrins, fertilins an 83.1 1.1 2.3E-05 53.1 3.7 35 278-312 626-661 (716)
61 smart00180 EGF_Lam Laminin-typ 82.1 1.5 3.3E-05 32.6 3.0 17 292-308 17-33 (46)
62 PF04863 EGF_alliinase: Alliin 81.0 0.74 1.6E-05 35.1 0.9 29 282-310 17-53 (56)
63 PHA03099 epidermal growth fact 78.6 1.5 3.2E-05 39.6 2.2 29 125-154 51-83 (139)
64 PHA03099 epidermal growth fact 77.8 1.8 3.9E-05 39.1 2.5 26 284-310 53-84 (139)
65 PF09064 Tme5_EGF_like: Thromb 77.4 1.7 3.7E-05 29.9 1.7 22 175-196 6-28 (34)
66 PF01683 EB: EB module; Inter 76.2 2.5 5.5E-05 32.1 2.7 31 266-302 16-46 (52)
67 PF06247 Plasmod_Pvs28: Plasmo 75.3 1.6 3.5E-05 42.2 1.6 133 130-305 10-163 (197)
68 PF12947 EGF_3: EGF domain; I 72.6 1.9 4E-05 30.3 1.0 23 283-305 7-33 (36)
69 PF00534 Glycos_transf_1: Glyc 72.3 4.4 9.6E-05 38.4 4.0 41 628-669 83-124 (172)
70 PF07645 EGF_CA: Calcium-bindi 70.1 2.5 5.4E-05 30.7 1.3 21 282-302 10-34 (42)
71 PF05686 Glyco_transf_90: Glyc 69.5 8.1 0.00018 42.7 5.7 110 564-686 153-266 (395)
72 cd03814 GT1_like_2 This family 60.6 9 0.0002 40.4 3.9 43 627-670 256-299 (364)
73 cd03802 GT1_AviGT4_like This f 59.2 14 0.00031 38.8 5.1 41 629-670 235-277 (335)
74 cd03808 GT1_cap1E_like This fa 58.5 13 0.00028 38.7 4.6 42 628-670 254-296 (359)
75 cd03823 GT1_ExpE7_like This fa 58.4 10 0.00022 39.8 3.8 41 628-669 253-295 (359)
76 PF01683 EB: EB module; Inter 58.3 11 0.00023 28.6 2.9 22 124-147 25-46 (52)
77 cd03822 GT1_ecORF704_like This 52.0 15 0.00032 38.8 3.8 41 628-669 258-301 (366)
78 KOG3516 Neurexin IV [Signal tr 51.3 9.4 0.0002 46.6 2.2 35 120-155 545-584 (1306)
79 cd03798 GT1_wlbH_like This fam 49.1 17 0.00036 38.0 3.6 41 628-669 269-310 (377)
80 cd03807 GT1_WbnK_like This fam 47.3 19 0.0004 37.7 3.6 40 629-669 260-300 (365)
81 PF00954 S_locus_glycop: S-loc 47.2 18 0.00039 32.0 3.0 29 121-149 78-109 (110)
82 cd03819 GT1_WavL_like This fam 46.0 28 0.0006 36.8 4.8 41 628-669 254-296 (355)
83 KOG1388 Attractin and platelet 45.6 15 0.00031 36.6 2.2 73 126-204 53-130 (217)
84 cd03821 GT1_Bme6_like This fam 45.1 22 0.00047 37.3 3.7 41 629-670 273-314 (375)
85 cd04951 GT1_WbdM_like This fam 44.3 24 0.00051 37.4 3.9 40 629-669 254-294 (360)
86 KOG3514 Neurexin III-alpha [Si 43.3 15 0.00033 44.3 2.3 41 271-311 617-663 (1591)
87 PF12662 cEGF: Complement Clr- 42.9 16 0.00034 23.3 1.3 10 293-302 2-11 (24)
88 cd03801 GT1_YqgM_like This fam 41.8 27 0.00058 36.3 3.8 41 628-669 266-307 (374)
89 cd03818 GT1_ExpC_like This fam 41.2 81 0.0018 34.4 7.6 40 629-669 292-332 (396)
90 PF13692 Glyco_trans_1_4: Glyc 40.1 44 0.00095 29.9 4.5 40 629-669 62-103 (135)
91 cd03816 GT1_ALG1_like This fam 37.9 77 0.0017 35.1 6.8 42 628-670 305-350 (415)
92 TIGR03087 stp1 sugar transfera 37.1 44 0.00096 36.6 4.7 39 630-669 290-330 (397)
93 KOG0196 Tyrosine kinase, EPH ( 36.7 75 0.0016 37.9 6.4 65 127-198 248-320 (996)
94 cd03805 GT1_ALG2_like This fam 36.4 51 0.0011 35.6 5.1 40 629-669 291-331 (392)
95 PLN02871 UDP-sulfoquinovose:DA 36.1 34 0.00074 38.5 3.7 40 629-669 323-363 (465)
96 cd03794 GT1_wbuB_like This fam 36.0 36 0.00078 35.8 3.7 43 628-671 285-333 (394)
97 cd03804 GT1_wbaZ_like This fam 35.4 41 0.0009 35.8 4.1 41 628-669 252-292 (351)
98 PRK15427 colanic acid biosynth 34.3 1.5E+02 0.0032 32.8 8.3 45 628-673 289-340 (406)
99 PRK15484 lipopolysaccharide 1, 34.1 42 0.00091 36.7 3.9 43 628-671 267-311 (380)
100 cd03809 GT1_mtfB_like This fam 34.1 28 0.00061 36.6 2.5 41 628-669 263-304 (365)
101 cd03800 GT1_Sucrose_synthase T 33.4 39 0.00085 36.3 3.5 40 629-669 294-334 (398)
102 PF00919 UPF0004: Uncharacteri 32.8 40 0.00086 29.4 2.7 32 392-424 12-44 (98)
103 KOG3516 Neurexin IV [Signal tr 32.7 26 0.00057 43.0 2.0 40 278-317 546-591 (1306)
104 cd05844 GT1_like_7 Glycosyltra 29.7 78 0.0017 33.6 5.1 42 629-671 256-304 (367)
105 cd04962 GT1_like_5 This family 28.8 57 0.0012 34.7 3.8 41 629-670 262-303 (371)
106 TIGR03088 stp2 sugar transfera 28.7 53 0.0012 35.3 3.6 41 629-670 264-305 (374)
107 cd03806 GT1_ALG11_like This fa 28.5 2.1E+02 0.0045 31.8 8.3 40 628-668 315-355 (419)
108 cd04955 GT1_like_6 This family 28.4 86 0.0019 33.1 5.1 41 628-669 258-300 (363)
109 smart00672 CAP10 Putative lipo 27.9 84 0.0018 32.6 4.6 105 564-680 79-192 (256)
110 cd03792 GT1_Trehalose_phosphor 27.1 64 0.0014 34.8 3.9 42 628-670 264-306 (372)
111 KOG3514 Neurexin III-alpha [Si 26.0 42 0.00092 40.8 2.2 34 233-275 628-661 (1591)
112 PHA01633 putative glycosyl tra 24.7 83 0.0018 34.0 4.0 40 629-669 215-255 (335)
113 PRK09922 UDP-D-galactose:(gluc 23.5 87 0.0019 33.7 4.0 38 630-668 250-288 (359)
114 cd03795 GT1_like_4 This family 22.8 86 0.0019 32.9 3.8 40 629-669 255-297 (357)
115 PF12946 EGF_MSP1_1: MSP1 EGF 22.4 71 0.0015 22.7 1.9 24 125-148 5-31 (37)
116 PF00954 S_locus_glycop: S-loc 20.5 79 0.0017 27.9 2.4 22 282-303 84-108 (110)
117 PF14670 FXa_inhibition: Coagu 20.4 67 0.0015 22.6 1.5 16 132-147 11-28 (36)
118 cd03820 GT1_amsD_like This fam 20.4 1E+02 0.0023 31.6 3.8 41 628-669 243-284 (348)
No 1
>PF03016 Exostosin: Exostosin family; InterPro: IPR004263 Hereditary multiple exostoses (EXT) is an autosomal dominant disorder that is characterised by the appearance of multiple outgrowths of the long bones (exostoses) at their epiphyses. Mutations in two homologous genes, EXT1 and EXT2, are responsible for the EXT syndrome. The human and mouse EXT genes have at least two homologs in the invertebrate Caenorhabditis elegans, indicating that they do not function exclusively as regulators of bone growth. EXT1 and EXT2 have both been shown to encode glycosyltransferases involved in the chain elongation step of heparan sulphate biosynthesis.; GO: 0016020 membrane
Probab=100.00 E-value=1e-46 Score=399.89 Aligned_cols=283 Identities=32% Similarity=0.559 Sum_probs=207.1
Q ss_pred ccCceEEeEcCChhhhHHHhhccccccccccccccCcCccccccccchhHHHHHHHHhcCCCcCCCcCCCceEEEeccce
Q 005509 348 KKRPLLYVYDLPPEFNSLLLEGRHYKLECVNRIYNEKNETLWTDMLYGSQMAFYESILASPHRTLNGEEADFFFVPVLDS 427 (693)
Q Consensus 348 ~~~p~IYvYdlP~~fn~~ll~~~~~~~~c~~~~~~~~~~~~w~~~~y~~E~~~~~~L~~s~~rT~dP~eAdlF~VP~~~~ 427 (693)
.++++||||++|++||.+++... .......+.+.+|++|.+||++|++|++||.||+|||+||||++.+
T Consensus 2 ~~~lkVYVY~lp~~~~~~~~~~~-----------~~~~~~~~~~~~~~~e~~l~~~l~~s~~~T~dp~eAdlF~vP~~~~ 70 (302)
T PF03016_consen 2 HRGLKVYVYPLPPKFNKDLLDPR-----------EDEQCSWYETSQYALEVILHEALLNSPFRTDDPEEADLFFVPFYSS 70 (302)
T ss_pred CCCCEEEEEeCCccccccceecc-----------ccccCCCcccccchHHHHHHHHHHhCCcEeCCHHHCeEEEEEcccc
Confidence 35789999999999999888321 1122333456799999999999999999999999999999999998
Q ss_pred eeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCccEEEEeccCCCCccCCcc--ccCceEEeecc
Q 005509 428 CIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRDHIWFFSWDEGACYAPKE--IWNSMMLVHWG 505 (693)
Q Consensus 428 ~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~GrdH~~v~~~d~g~~~~~~~--~~~~~~l~~~g 505 (693)
+....... ........+.+..++.++++++|||||++|+||||+.++|+|.+..... +.+...++...
T Consensus 71 ~~~~~~~~----------~~~~~~~~~~~~~~~~~~~~~~p~w~r~~G~dH~~~~~~~~g~~~~~~~~~~~~~~~~~~~~ 140 (302)
T PF03016_consen 71 CYFHHWWG----------SPNSGADRDSLSDALRHLLASYPYWNRSGGRDHFFVNSHDRGGCSFDRNPRLMNNSIRAVVA 140 (302)
T ss_pred cccccccC----------CccchhhHHHHHHHHHHHHhcCchhhccCCCCeEEEeccccccccccccHhhhccchhheec
Confidence 87411100 0011123445567788888899999999999999999999888864321 11111111100
Q ss_pred CCCcCCCcceeeeecCCCcccCcCCCCCCccccCCCceeecCccCCchhhhhccccCCCCCCCceeEEecccCCCCCCCC
Q 005509 506 NTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPEKDLVLPAWKAPDAFVLRSKLWASPREKRKTLFYFNGNLGSAYPNG 585 (693)
Q Consensus 506 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~kDvviP~~~~~~~~~~~~~~~~~~~~~R~~L~~F~G~~~~~~~~~ 585 (693)
... ....+|+|++||++|++............+..+..+|++|++|+|++...
T Consensus 141 --------------~~~---------~~~~~~~~~~Di~~P~~~~~~~~~~~~~~~~~~~~~R~~l~~f~g~~~~~---- 193 (302)
T PF03016_consen 141 --------------FSS---------FSSSCFRPGFDIVIPPFVPPSSLPDWRPWPQRPPARRPYLLFFAGTIRPS---- 193 (302)
T ss_pred --------------cCC---------CCcCcccCCCCeeccccccccccCCccccccCCccCCceEEEEeeecccc----
Confidence 000 02358999999999998876543322222345678999999999998642
Q ss_pred CCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEecCCchhHHHHhhcCceeeccCCCC-CchhHHHHHhcCceeE
Q 005509 586 RPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVTSLRSENYHEDLSSSVFCGVLPGDG-WSGRMEDSILQGCIPV 664 (693)
Q Consensus 586 r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~p~Gd~-~s~Rl~dAi~~GCIPV 664 (693)
...|++++|+.|+++|++.++.....+ ........+|.+.|++|||||+|+|++ ++.||+|||++|||||
T Consensus 194 --~~~~~~~~r~~l~~~~~~~~~~~~~~~-------~~~~~~~~~~~~~l~~S~FCL~p~G~~~~s~Rl~eal~~GcIPV 264 (302)
T PF03016_consen 194 --SNDYSGGVRQRLLDECKSDPDFRCSDG-------SETCPSPSEYMELLRNSKFCLCPRGDGPWSRRLYEALAAGCIPV 264 (302)
T ss_pred --ccccchhhhhHHHHhcccCCcceeeec-------ccccccchHHHHhcccCeEEEECCCCCcccchHHHHhhhceeeE
Confidence 111678999999999987654321100 011245567999999999999999997 6899999999999999
Q ss_pred EEeCCeeec---eecCCCccEEEece
Q 005509 665 VIQVVISSF---LLLCQNGSLKIRNK 687 (693)
Q Consensus 665 iisd~~~~p---~l~~~~fsv~v~~~ 687 (693)
||+|++.+| +|||++|||+|+++
T Consensus 265 ii~d~~~lPf~~~ldw~~fsv~v~~~ 290 (302)
T PF03016_consen 265 IISDDYVLPFEDVLDWSRFSVRVPEA 290 (302)
T ss_pred EecCcccCCcccccCHHHEEEEECHH
Confidence 999999999 79999999999975
No 2
>KOG1021 consensus Acetylglucosaminyltransferase EXT1/exostosin 1 [Carbohydrate transport and metabolism; Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=100.00 E-value=2e-40 Score=368.20 Aligned_cols=296 Identities=28% Similarity=0.379 Sum_probs=208.4
Q ss_pred cCceEEeEcCChhhhHHHhhcccccc--------ccccc---------cccCc-----Ccccc-ccccchhHHHHHHHHh
Q 005509 349 KRPLLYVYDLPPEFNSLLLEGRHYKL--------ECVNR---------IYNEK-----NETLW-TDMLYGSQMAFYESIL 405 (693)
Q Consensus 349 ~~p~IYvYdlP~~fn~~ll~~~~~~~--------~c~~~---------~~~~~-----~~~~w-~~~~y~~E~~~~~~L~ 405 (693)
....||+|++|+.|+..++..+.... .|..- .+..+ ....| .++||++|.+||.+|+
T Consensus 71 ~~~~v~~~~~~~~F~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~~~E~~~~~~~~ 150 (464)
T KOG1021|consen 71 AGASVYVYNLPSGFDVSLLLFHKQIPTSPNNKKFMCSYKLNEKRGKVYVYHEGNKPLFHTPSWCLTDQYASEGIFHNRML 150 (464)
T ss_pred cCcceeeeccchhhhhhhhccCccccccCcchhhhhhhhhhcccCceEEecCCCCccccCCCcccccchhHHHHHHHHHh
Confidence 34578999999999999888754332 22210 11111 12244 5689999999999995
Q ss_pred --cCCCcCCCcCCCceEEEeccceeeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCccEEEEec
Q 005509 406 --ASPHRTLNGEEADFFFVPVLDSCIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRDHIWFFS 483 (693)
Q Consensus 406 --~s~~rT~dP~eAdlF~VP~~~~~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~GrdH~~v~~ 483 (693)
.+++||.||++||+||||||+++..+++...+.-. .+ ....+++++.+..+++++|||||+.|+|||||+.
T Consensus 151 ~~~~~~Rt~dp~~Ad~f~vPf~~~~~~~~~~~~~~~~------~~-~~~~~~~~~~i~~~~~~~p~W~Rs~G~DH~~v~~ 223 (464)
T KOG1021|consen 151 RRESAFRTLDPLEADAFYVPFYASLDYNRALLWPDER------VN-AILRSILQDYIVALLSKQPYWNRSSGRDHFFVAC 223 (464)
T ss_pred cccCceecCChhhCcEEEEcceeeEehhhhcccCCcc------cc-hHHHHHHHHHHHHHHhcCchhhccCCCceEEEeC
Confidence 77999999999999999999999987764433210 01 1123444555556678999999999999999999
Q ss_pred cCCCCccCCccccCceEEeeccCCCcCCCcceeeeecCCCcccCcCCCCCCccccCC-CceeecCccCCchhhhh--ccc
Q 005509 484 WDEGACYAPKEIWNSMMLVHWGNTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPE-KDLVLPAWKAPDAFVLR--SKL 560 (693)
Q Consensus 484 ~d~g~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~-kDvviP~~~~~~~~~~~--~~~ 560 (693)
+|++...... .++++++..+..-+ ++.. ...|.+. +||+||++...++.... .-.
T Consensus 224 ~~~~~~~~~~-~~~~~~~~i~~~~n----~a~l-----------------s~~~~~~~~dv~iP~~~~~~~~~~~~~~~~ 281 (464)
T KOG1021|consen 224 HDWGDFRRRS-DWGASISLIPEFCN----GALL-----------------SLEFFPWNKDVAIPYPTIPHPLSPPENSWQ 281 (464)
T ss_pred Ccchheeecc-chhhHHHHHHhhCC----ccee-----------------ecccccCCCcccCCCccCcCccCccccccc
Confidence 9998875431 22222211111110 0000 1246777 99999998765543321 123
Q ss_pred cCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEecCCchhHHHHhhcCce
Q 005509 561 WASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVTSLRSENYHEDLSSSVF 640 (693)
Q Consensus 561 ~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~y~~~l~~S~F 640 (693)
...+..+|++|+||+|+. ..+.||+.|+++|+++++.+..+.+. +....+.+...|.+.|++|+|
T Consensus 282 ~~~~~~~R~~L~~F~G~~------------~~~~iR~~L~~~~~~~~~~~~~~~~~---~g~~~~~~~~~y~~~m~~S~F 346 (464)
T KOG1021|consen 282 GGVPFSNRPILAFFAGAP------------AGGQIRSILLDLWKKDPDTEVFVNCP---RGKVSCDRPLNYMEGMQDSKF 346 (464)
T ss_pred cCCCCCCCceEEEEeccc------------cCCcHHHHHHHHhhcCcCccccccCC---CCccccCCcchHHHHhhcCeE
Confidence 345568999999999983 13569999999999844433222221 111234667899999999999
Q ss_pred eeccCCCCC-chhHHHHHhcCceeEEEeCCeeec---eecCCCccEEEecee
Q 005509 641 CGVLPGDGW-SGRMEDSILQGCIPVVIQVVISSF---LLLCQNGSLKIRNKF 688 (693)
Q Consensus 641 CL~p~Gd~~-s~Rl~dAi~~GCIPViisd~~~~p---~l~~~~fsv~v~~~~ 688 (693)
||+|+||++ ++|+||||++|||||||+|++++| .+||++|||+|++|.
T Consensus 347 CL~p~Gd~~ts~R~fdai~~gCvPViisd~~~lpf~~~~d~~~fSV~v~~~~ 398 (464)
T KOG1021|consen 347 CLCPPGDTPTSPRLFDAIVSGCVPVIISDGIQLPFGDVLDWTEFSVFVPEKD 398 (464)
T ss_pred EECCCCCCcccHhHHHHHHhCCccEEEcCCcccCcCCCccceEEEEEEEHHH
Confidence 999999975 789999999999999999999998 799999999999653
No 3
>KOG2264 consensus Exostosin EXT1L [Signal transduction mechanisms]
Probab=99.69 E-value=1.1e-16 Score=170.33 Aligned_cols=229 Identities=21% Similarity=0.231 Sum_probs=142.4
Q ss_pred HHHHHHHhcCCCcCCCcCCCceEEEeccceeeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCcc
Q 005509 398 MAFYESILASPHRTLNGEEADFFFVPVLDSCIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRD 477 (693)
Q Consensus 398 ~~~~~~L~~s~~rT~dP~eAdlF~VP~~~~~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~Grd 477 (693)
..|.+.+.+..|.|+||+.|+++++-+= -. ..| ..++. .+ ++.| -++||| |++|+|
T Consensus 218 ~~fq~t~~~n~~~ve~pd~ACiyi~lvg------e~-q~P-------~~l~p---~e-----lekl-yslp~w-~~dg~N 273 (907)
T KOG2264|consen 218 QVFQETIPNNVYLVETPDKACIYIHLVG------EI-QSP-------VVLTP---AE-----LEKL-YSLPHW-RTDGFN 273 (907)
T ss_pred HHHHHhcccceeEeeCCCccEEEEEEec------cc-cCC-------CcCCh---Hh-----hhhh-hcCccc-cCCCcc
Confidence 4677778888999999999999999771 11 111 11221 11 2233 478999 799999
Q ss_pred EEEEeccCCCCccCCccccCceEEeeccCCCcCCCcceeeeecCCCcccCcCCCCCCccccCCCceeecCccCCchhhhh
Q 005509 478 HIWFFSWDEGACYAPKEIWNSMMLVHWGNTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPEKDLVLPAWKAPDAFVLR 557 (693)
Q Consensus 478 H~~v~~~d~g~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~kDvviP~~~~~~~~~~~ 557 (693)
|+++...... +..+.++| +..|++... .+.. ...+|||++|+++++...+.....+
T Consensus 274 hvl~Nl~r~s--~~~n~lyn----~~t~raivv---------Qssf---------~~~q~RpgfDl~V~pv~h~~~e~~~ 329 (907)
T KOG2264|consen 274 HVLFNLGRPS--DTQNLLYN----FQTGRAIVV---------QSSF---------YTVQIRPGFDLPVDPVNHIAVEKNF 329 (907)
T ss_pred eEEEEccCcc--ccccceeE----eccCceEEE---------eecc---------eeeeeccCCCcccCcccccccCccc
Confidence 9999543321 11122222 222222100 0000 0127999999999988776655445
Q ss_pred ccccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEe-------------
Q 005509 558 SKLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVT------------- 624 (693)
Q Consensus 558 ~~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~------------- 624 (693)
.++....+.+|++|+.|+|++.+. +.. -...+... ++...++... ..+..+-+++.
T Consensus 330 ~e~~p~vP~~RkyL~t~qgki~~~----~ss---Ln~~~aF~-~e~~adp~~~---a~qds~i~qv~c~~t~k~Qe~~SL 398 (907)
T KOG2264|consen 330 VELTPLVPFQRKYLITLQGKIESD----NSS---LNEFSAFS-EELSADPSRR---AVQDSPIVQVKCSFTCKNQENCSL 398 (907)
T ss_pred eecCcccchhhheeEEEEeeeccc----ccc---cchhhhhH-HHhccCCccc---ccccCceEEEEEeeccccCCCCCc
Confidence 556566788999999999988652 110 11233322 2233332211 01111111221
Q ss_pred -----cCCchhHHHHhhcCceeec-cCCCCC------chhHHHHHhcCceeEEEeCCeeec---eecCCCccEEEe
Q 005509 625 -----SLRSENYHEDLSSSVFCGV-LPGDGW------SGRMEDSILQGCIPVVIQVVISSF---LLLCQNGSLKIR 685 (693)
Q Consensus 625 -----~~~~~~y~~~l~~S~FCL~-p~Gd~~------s~Rl~dAi~~GCIPViisd~~~~p---~l~~~~fsv~v~ 685 (693)
|...+.-.++|++|+|||+ ||||+- -.|++||+..||||||+++...+| .|||.+..+.++
T Consensus 399 pewalcg~~~~RrqLlk~STF~lilpp~d~rv~S~~~~~r~~eaL~~GavPviLg~~~~LPyqd~idWrraal~lP 474 (907)
T KOG2264|consen 399 PEWALCGERERRRQLLKSSTFCLILPPGDPRVISEMFFQRFLEALQLGAVPVILGNSQLLPYQDLIDWRRAALRLP 474 (907)
T ss_pred chhhhccchHHHHHHhccceeEEEecCCCcchhhHHHHHHHHHHHhcCCeeEEeccccccchHHHHHHHHHhhhCC
Confidence 2223466799999999995 889865 378999999999999999999888 799999888775
No 4
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=99.63 E-value=6.7e-16 Score=169.77 Aligned_cols=123 Identities=33% Similarity=0.824 Sum_probs=100.0
Q ss_pred ccCccCCCCCCCCCCCCCEEeccCCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCC
Q 005509 114 LVEMIGGKSCKSDCSGQGVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEG 193 (693)
Q Consensus 114 ~~~~~~~~~C~~~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G 193 (693)
.++.+....|+..|+++|.|+ .|.|+|++||+|.+|++. .|+. .|+++-.+..|+|.|++|
T Consensus 243 ~g~~c~~~~C~~~c~~~g~c~--~G~CIC~~Gf~G~dC~e~---~Cp~--------------~cs~~g~~~~g~CiC~~g 303 (525)
T KOG1225|consen 243 FGPLCSTIYCPGGCTGRGQCV--EGRCICPPGFTGDDCDEL---VCPV--------------DCSGGGVCVDGECICNPG 303 (525)
T ss_pred eCCccccccCCCCCcccceEe--CCeEeCCCCCcCCCCCcc---cCCc--------------ccCCCceecCCEeecCCC
Confidence 345565678888899999998 899999999999999874 2443 355656667789999999
Q ss_pred cccCCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccc
Q 005509 194 TKYPNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCE 273 (693)
Q Consensus 194 ~~G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~ 273 (693)
|+|..|+... | +.+|+++|.| +.++|.|+ +||+|..|+
T Consensus 304 ~~G~dCs~~~-c---------------------------padC~g~G~C-------------i~G~C~C~-~Gy~G~~C~ 341 (525)
T KOG1225|consen 304 YSGKDCSIRR-C---------------------------PADCSGHGKC-------------IDGECLCD-EGYTGELCI 341 (525)
T ss_pred cccccccccc-C---------------------------CccCCCCCcc-------------cCCceEeC-CCCcCCccc
Confidence 9988885431 1 5678899999 47899999 999999999
Q ss_pred cccCCccCCCCCCCceeeCCeeecCCCcccCC
Q 005509 274 VPVSSTCVNQCSGHGHCRGGFCQCDSGWYGVD 305 (693)
Q Consensus 274 ~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~ 305 (693)
+. . |+++|.|++| |+|+.||.|.+
T Consensus 342 ~~------~-C~~~g~cv~g-C~C~~Gw~G~d 365 (525)
T KOG1225|consen 342 QR------A-CSGGGQCVNG-CKCKKGWRGPD 365 (525)
T ss_pred cc------c-cCCCceeccC-ceeccCccCCC
Confidence 74 3 9999999999 99999999999
No 5
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=99.50 E-value=7.8e-14 Score=153.60 Aligned_cols=171 Identities=26% Similarity=0.469 Sum_probs=123.7
Q ss_pred cccccCccCCCCCCCCCCCCCEEeccCCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCC--CCcee
Q 005509 111 EVDLVEMIGGKSCKSDCSGQGVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDT--TRAMC 188 (693)
Q Consensus 111 ~~~~~~~~~~~~C~~~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~--~~g~C 188 (693)
..+.++.++...|++.|+.||.+. .+.|.+..+++|..|... .|... ++....-..++..|.. ..+.|
T Consensus 167 ~~~~~~~~g~~~~~~~~~~hg~~~--~~~~l~~~~~s~~~~~~~---~~~~~-----~~~~~r~~~~~~~~~~~~~~~ic 236 (525)
T KOG1225|consen 167 PNPFGAECGQYKCPNDGSGHGRYY--FGNCLSGISASGETCNQL---GCNDD-----CFRTGRCREGRCFCTAGFFDGIC 236 (525)
T ss_pred CCccccccceecCCcCCCCCccce--ecccccccCcchhhhhcc---cCCcc-----ceeccccccCcccccccccCcee
Confidence 445566677778889999999998 899999999999999763 12210 0110000111122221 23489
Q ss_pred eeCCCcccCCCCCCCCCCCc-ccCCCCC--CCCCCCCCcCCCCCCc-cCCCCCCCceecCCcccccccccccccccccCC
Q 005509 189 FCGEGTKYPNRPVAEACGFQ-VNLPSQP--GAPKSTDWAKADLDNI-FTTNGSKPGWCNVDPEEAYALKVQFKEECDCKY 264 (693)
Q Consensus 189 ~C~~G~~G~~C~~~~~C~~~-~~~~~~~--~~~C~~gw~g~~c~~~-~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~ 264 (693)
.|..+|+|+.|.. ..|... .....|. .|.|++||+|.+|+.. .+.+|++++.| ++++|+|.
T Consensus 237 ~c~~~~~g~~c~~-~~C~~~c~~~g~c~~G~CIC~~Gf~G~dC~e~~Cp~~cs~~g~~-------------~~g~CiC~- 301 (525)
T KOG1225|consen 237 ECPEGYFGPLCST-IYCPGGCTGRGQCVEGRCICPPGFTGDDCDELVCPVDCSGGGVC-------------VDGECICN- 301 (525)
T ss_pred ecCCceeCCcccc-ccCCCCCcccceEeCCeEeCCCCCcCCCCCcccCCcccCCCcee-------------cCCEeecC-
Confidence 9999999999863 233221 1112233 3458999999999963 35558777777 47899999
Q ss_pred CCCcCcccccccCCccCCCCCCCceeeCCeeecCCCcccCCCCCC
Q 005509 265 DGLLGQFCEVPVSSTCVNQCSGHGHCRGGFCQCDSGWYGVDCSIP 309 (693)
Q Consensus 265 ~G~~G~~C~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~G~~C~~~ 309 (693)
+||+|..|++. .|+.+|+++|.|++|+|+|.+||+|..|+++
T Consensus 302 ~g~~G~dCs~~---~cpadC~g~G~Ci~G~C~C~~Gy~G~~C~~~ 343 (525)
T KOG1225|consen 302 PGYSGKDCSIR---RCPADCSGHGKCIDGECLCDEGYTGELCIQR 343 (525)
T ss_pred CCccccccccc---cCCccCCCCCcccCCceEeCCCCcCCccccc
Confidence 99999999986 7999999999999999999999999999998
No 6
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=99.40 E-value=6.9e-13 Score=147.96 Aligned_cols=143 Identities=25% Similarity=0.517 Sum_probs=102.1
Q ss_pred CCCCCCEEeccCCeEEeCCCccCCCCCccccCcCCCCCC-CCCCCCCccccccCCCCCCCCceeeeCCCcc----cCCCC
Q 005509 126 DCSGQGVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKT-PELPYGRWVVSICPTHCDTTRAMCFCGEGTK----YPNRP 200 (693)
Q Consensus 126 ~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~-~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~----G~~C~ 200 (693)
.|++||+++ .|+|.|.+||.|..||-.. .+.+... ...|...-....|+|..+|.-|+|.|.+... |+.|+
T Consensus 468 ~C~g~G~~~--CG~C~C~~G~~G~~CEC~~--~~~ss~~~~~~Cr~~~~~~vCSgrG~C~CGqC~C~~~~~~~i~G~fCE 543 (783)
T KOG1226|consen 468 LCHGNGTFV--CGQCRCDEGWLGKKCECST--DELSSSEEEDKCRENSDSPVCSGRGDCVCGQCVCHKPDNGKIYGKFCE 543 (783)
T ss_pred ccCCCCcEE--ecceecCCCCCCCcccCCc--cccCcHhHHhhccCCCCCCCcCCCCcEeCCceEecCCCCCceeeeeee
Confidence 599999998 9999999999999999532 1211100 0001111122389999888999999998877 77774
Q ss_pred CCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCccccccc-CCc
Q 005509 201 VAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPV-SST 279 (693)
Q Consensus 201 ~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~-~~~ 279 (693)
-.+. .|+...+.-|.++|.|. .|+|.|. +||+|..|+.+. .+.
T Consensus 544 CDnf----------------------sC~r~~g~lC~g~G~C~-------------CG~CvC~-~GwtG~~C~C~~std~ 587 (783)
T KOG1226|consen 544 CDNF----------------------SCERHKGVLCGGHGRCE-------------CGRCVCN-PGWTGSACNCPLSTDT 587 (783)
T ss_pred ccCc----------------------ccccccCcccCCCCeEe-------------CCcEEcC-CCCccCCCCCCCCCcc
Confidence 3211 12222245577888883 7899999 999999998763 345
Q ss_pred cCC----CCCCCceeeCCeeecCCC-cccCCCCC
Q 005509 280 CVN----QCSGHGHCRGGFCQCDSG-WYGVDCSI 308 (693)
Q Consensus 280 C~~----~C~~~G~C~~g~C~C~~G-~~G~~C~~ 308 (693)
|.+ .|+++|+|.-|+|+|... |+|..|+.
T Consensus 588 C~~~~G~iCSGrG~C~Cg~C~C~~~~~sG~~CE~ 621 (783)
T KOG1226|consen 588 CESSDGQICSGRGTCECGRCKCTDPPYSGEFCEK 621 (783)
T ss_pred ccCCCCceeCCCceeeCCceEcCCCCcCcchhhc
Confidence 652 599999999999999766 99999997
No 7
>KOG1022 consensus Acetylglucosaminyltransferase EXT2/exostosin 2 [Carbohydrate transport and metabolism; Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=99.06 E-value=7.3e-10 Score=118.67 Aligned_cols=225 Identities=16% Similarity=0.100 Sum_probs=138.3
Q ss_pred HHHHHHHhcCCCcCCCcCCCceEEEeccceeeeeccCCCCcccccccccccchhHHHHHHHHHHHHHHcCcccccCCCcc
Q 005509 398 MAFYESILASPHRTLNGEEADFFFVPVLDSCIITRADDAPHLSAQEHRGLRSSLTLEFYKKAYEHIIEHYPYWNRTSGRD 477 (693)
Q Consensus 398 ~~~~~~L~~s~~rT~dP~eAdlF~VP~~~~~~~~~~~~~p~~~~~~~~~~r~~~~~~~~~~~~~~l~~~~PyWnR~~Grd 477 (693)
..+.|+...|.+.|.|+.+|++|.--. .-+ + + .. +..++ -..++++.-.|.| |.+
T Consensus 126 ~~lleA~~~S~yyt~n~N~aclf~Ps~-d~l--n----------Q--n~----l~~kl----~~~ala~l~~wdr--g~n 180 (691)
T KOG1022|consen 126 IALLEAWHLSFYYTFNYNGACLFMPSS-DEL--N----------Q--NP----LSWKL----EKVALAKLLVWDR--GVN 180 (691)
T ss_pred HHHHHHHHhccceecCCCceEEEecch-hhh--c----------c--Cc----chHHH----HHHHHhcccchhc--ccc
Confidence 467778888999999999999986433 111 1 1 11 22222 1234456779987 999
Q ss_pred EEEEeccCCCCccCCccccCceEEeeccCCCcCCCcceeeeecCCCcccCcCCCCCCccccCCCceeecCccCCchhhhh
Q 005509 478 HIWFFSWDEGACYAPKEIWNSMMLVHWGNTNSKHNHSTTAYWADNWDRISSSRRGNHSCFDPEKDLVLPAWKAPDAFVLR 557 (693)
Q Consensus 478 H~~v~~~d~g~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~p~kDvviP~~~~~~~~~~~ 557 (693)
|..+.-=.-|.- .+|..+ +.|+-+ .-...-..+.| .||++.||.||.|......
T Consensus 181 H~~fnmLpGg~p-----~yntal--dv~~d~----a~~~gggf~tW------------~yr~g~dv~ipv~Sp~~v~--- 234 (691)
T KOG1022|consen 181 HEGFNMLPGGDP-----TYNTAL--DVGQDE----AWYSGGGFGTW------------KYRKGNDVYIPVRSPGNVG--- 234 (691)
T ss_pred eeeEeeccCCCC-----Cccccc--cCCcce----eEEecCCcCcc------------cccCCCccccccccccccC---
Confidence 999932222221 112211 111110 00000112345 6899999999999865221
Q ss_pred ccccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEe--c--CCchhHHH
Q 005509 558 SKLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVT--S--LRSENYHE 633 (693)
Q Consensus 558 ~~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~--~--~~~~~y~~ 633 (693)
.......+|..++.-.|. .|...+|..|.++..........++.+...+.... + +....|.+
T Consensus 235 --~~~~~~g~r~~~l~~~q~------------n~~pr~r~~l~el~~kh~e~~l~l~~c~nlsl~~r~~~qhH~~~~yp~ 300 (691)
T KOG1022|consen 235 --RAFLYDGSRYRVLQDCQE------------NYGPRIRVSLIELLSKHEERELELPFCLNLSLNSRGVRQHHFDVKYPS 300 (691)
T ss_pred --ccccCCccceeeeecccc------------ccchHhHHhHHHHHhhccceEEecchhccccccccchhhccccccccc
Confidence 112334566655544441 34566888888876555443333333322121111 1 22357999
Q ss_pred HhhcCceeeccCCCCC-chhHHHHHhcCceeEEEeCCeeec---eecCCCccEEEece
Q 005509 634 DLSSSVFCGVLPGDGW-SGRMEDSILQGCIPVVIQVVISSF---LLLCQNGSLKIRNK 687 (693)
Q Consensus 634 ~l~~S~FCL~p~Gd~~-s~Rl~dAi~~GCIPViisd~~~~p---~l~~~~fsv~v~~~ 687 (693)
.+...+||+.-++..- ..-+.+-+.++|||||+.|.+.+| ++||.-.||.++|-
T Consensus 301 ~l~~~~fc~~~R~~r~gq~~lv~~~~a~c~pvi~vd~y~lpf~~Vvdw~~aSv~~~e~ 358 (691)
T KOG1022|consen 301 SLEFIGFCDGDRVTRGGQFHLVILGYASCAPVISVDIYLLPFLGVVDWIVASVWCMEY 358 (691)
T ss_pred ccceeeeEeccccccCCccceehhhhcccceeeeeehhhhhhhhhhhceeeeEEeehh
Confidence 9999999998888544 456999999999999999999999 89999999999885
No 8
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=99.05 E-value=7.1e-10 Score=126.73 Aligned_cols=200 Identities=24% Similarity=0.462 Sum_probs=117.9
Q ss_pred cccCcccccCCCCcccccccCccCCCCC-CCCCCC--------CCEEeccCC----eEEeCCCccCCCCCccccCcCCCC
Q 005509 96 AEIGRWLSGCDSVAKEVDLVEMIGGKSC-KSDCSG--------QGVCNHELG----QCRCFHGFRGKGCSERIHFQCNFP 162 (693)
Q Consensus 96 ~~~g~~~~~c~~~~~~~~~~~~~~~~~C-~~~C~~--------~G~C~~~~G----~C~C~~G~~G~~Ce~~~~~~C~~~ 162 (693)
.+.|.+|++|..++.+.+.-.. +..| |.+|-. --.|...+- .|.|.+||+|..|+. |.++
T Consensus 881 ~T~G~~CdrCl~GyyGdP~lg~--g~~CrPCpCP~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~RCe~-----CA~~ 953 (1758)
T KOG0994|consen 881 STTGHSCDRCLDGYYGDPRLGS--GIGCRPCPCPDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGSRCEI-----CADN 953 (1758)
T ss_pred cccccchhhhhccccCCcccCC--CCCCCCCCCCCCCccchhccccccccccccceeeecccCccccchhh-----hccc
Confidence 3578889999777765444221 2344 444422 125632222 799999999999997 6654
Q ss_pred --CCCCCCCCCccccccC--------CCCCCCCceee-------------eCCCcccC----CCCCCCCC------CCcc
Q 005509 163 --KTPELPYGRWVVSICP--------THCDTTRAMCF-------------CGEGTKYP----NRPVAEAC------GFQV 209 (693)
Q Consensus 163 --~~~~~~~g~~~~~~C~--------g~C~~~~g~C~-------------C~~G~~G~----~C~~~~~C------~~~~ 209 (693)
+.|.. .|.|....|+ +.|+-.+|.|. |..||.|. +|..+ .| ..+.
T Consensus 954 ~fGnP~~-GGtCq~CeC~~NiD~~d~~aCD~~TG~CLkCL~hTeG~hCe~Ck~Gf~GdA~~q~CqrC-~Cn~LGTn~~~~ 1031 (1758)
T KOG0994|consen 954 HFGNPSE-GGTCQKCECSNNIDLYDPGACDVATGACLKCLYHTEGDHCEHCKDGFYGDALRQNCQRC-VCNFLGTNSTCH 1031 (1758)
T ss_pred ccCCccc-CCccccccccCCcCccCCCccchhhchhhhhhhcccccchhhccccchhHHHHhhhhhh-eccccccCCccc
Confidence 33333 4566666775 55777777664 55555553 22111 01 1122
Q ss_pred cCCCCCCCCCCCCCcCCCCCCccCCC---CCCCc--eecCCcccccccccc--cccccccCCCCCcCcccccccC-----
Q 005509 210 NLPSQPGAPKSTDWAKADLDNIFTTN---GSKPG--WCNVDPEEAYALKVQ--FKEECDCKYDGLLGQFCEVPVS----- 277 (693)
Q Consensus 210 ~~~~~~~~~C~~gw~g~~c~~~~~~~---C~~~G--~C~~~~~~~~~~~~c--~~g~C~C~~~G~~G~~C~~~~~----- 277 (693)
++.....|+|.++--|..|+.+-.+. =+++| .|+.++. .+-+| .+|+|.|+ +||.|..|++.-+
T Consensus 1032 CDr~tGQCpClpNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~~---~~pqCN~ftGQCqCk-pGfGGR~C~qCqel~WGd 1107 (1758)
T KOG0994|consen 1032 CDRFTGQCPCLPNVQGVRCDQCAENHWNLASGEGCEPCNCDPI---GGPQCNEFTGQCQCK-PGFGGRTCSQCQELYWGD 1107 (1758)
T ss_pred cccccCcCCCCcccccccccccccchhccccCCCCCccCCCcc---CCccccccccceecc-CCCCCcchhHHHHhhcCC
Confidence 34445577889999999888654221 01111 1222210 11122 58899999 9999999987521
Q ss_pred --CccC-CCCCCCc----eee--CCeeecCCCcccCCCCC
Q 005509 278 --STCV-NQCSGHG----HCR--GGFCQCDSGWYGVDCSI 308 (693)
Q Consensus 278 --~~C~-~~C~~~G----~C~--~g~C~C~~G~~G~~C~~ 308 (693)
..|. -.|...| .|. +|+|.|.+|..|..|..
T Consensus 1108 P~~~C~aCdCd~rG~~tpQCdr~tG~C~C~~Gv~G~rCdq 1147 (1758)
T KOG0994|consen 1108 PNEKCRACDCDPRGIETPQCDRATGRCVCRPGVGGPRCDQ 1147 (1758)
T ss_pred CCCCceecCCCCCCCCCCCccccCCceeecCCCCCcchhh
Confidence 1121 1343333 475 89999999999999987
No 9
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=99.01 E-value=5e-10 Score=125.43 Aligned_cols=134 Identities=27% Similarity=0.507 Sum_probs=97.3
Q ss_pred CCCCCCEEeccCCeEEeCCCcc----CCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCCcccCCCCC
Q 005509 126 DCSGQGVCNHELGQCRCFHGFR----GKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEGTKYPNRPV 201 (693)
Q Consensus 126 ~C~~~G~C~~~~G~C~C~~G~~----G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~G~~C~~ 201 (693)
.|+|+|.|. .|+|.|.+... |+.||-. ...|... .+..|+|+..|.-|.|+|.+||+|..|.-
T Consensus 515 vCSgrG~C~--CGqC~C~~~~~~~i~G~fCECD-nfsC~r~----------~g~lC~g~G~C~CG~CvC~~GwtG~~C~C 581 (783)
T KOG1226|consen 515 VCSGRGDCV--CGQCVCHKPDNGKIYGKFCECD-NFSCERH----------KGVLCGGHGRCECGRCVCNPGWTGSACNC 581 (783)
T ss_pred CcCCCCcEe--CCceEecCCCCCceeeeeeecc-Ccccccc----------cCcccCCCCeEeCCcEEcCCCCccCCCCC
Confidence 699999999 99999999887 9999853 2334322 23478877777889999999999998843
Q ss_pred CCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccccccCCccC
Q 005509 202 AEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSSTCV 281 (693)
Q Consensus 202 ~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~~C~ 281 (693)
.. ... .|....+..|+++|.| .-|+|.|.-++|.|.+||.. ..|+
T Consensus 582 ~~------std--------------~C~~~~G~iCSGrG~C-------------~Cg~C~C~~~~~sG~~CE~c--ptc~ 626 (783)
T KOG1226|consen 582 PL------STD--------------TCESSDGQICSGRGTC-------------ECGRCKCTDPPYSGEFCEKC--PTCP 626 (783)
T ss_pred CC------CCc--------------cccCCCCceeCCCcee-------------eCCceEcCCCCcCcchhhcC--CCCC
Confidence 21 111 2222334557778877 47899998334999999985 6888
Q ss_pred CCCCCCceeeCCee-ecCCCcccCCCCCC
Q 005509 282 NQCSGHGHCRGGFC-QCDSGWYGVDCSIP 309 (693)
Q Consensus 282 ~~C~~~G~C~~g~C-~C~~G~~G~~C~~~ 309 (693)
.+|..+..|+ +| .+..|+.+..|.+.
T Consensus 627 ~~C~~~~~Cv--eC~~~~~g~~~~~C~~~ 653 (783)
T KOG1226|consen 627 DPCAENKSCV--ECQAFETGPVGDTCVEE 653 (783)
T ss_pred Ccccccccch--hhcccccccccchHHHH
Confidence 8999988886 22 24556888887764
No 10
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.99 E-value=5.9e-10 Score=133.52 Aligned_cols=107 Identities=27% Similarity=0.654 Sum_probs=83.9
Q ss_pred CCC-CCCCCCCCEEeccCC---eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCCccc
Q 005509 121 KSC-KSDCSGQGVCNHELG---QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEGTKY 196 (693)
Q Consensus 121 ~~C-~~~C~~~G~C~~~~G---~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~G 196 (693)
..| .++|+++|+|+...+ .|.|++-|.|..||.. ..+|.
T Consensus 3865 d~C~~npCqhgG~C~~~~~ggy~CkCpsqysG~~CEi~-~epC~------------------------------------ 3907 (4289)
T KOG1219|consen 3865 DPCNDNPCQHGGTCISQPKGGYKCKCPSQYSGNHCEID-LEPCA------------------------------------ 3907 (4289)
T ss_pred cccccCcccCCCEecCCCCCceEEeCcccccCcccccc-ccccc------------------------------------
Confidence 678 788999999986443 7999999999999874 22222
Q ss_pred CCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCccccccc
Q 005509 197 PNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPV 276 (693)
Q Consensus 197 ~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~ 276 (693)
+++|..+|+|....+ ...|.|+ .||+|..||...
T Consensus 3908 ------------------------------------snPC~~GgtCip~~n---------~f~CnC~-~gyTG~~Ce~~G 3941 (4289)
T KOG1219|consen 3908 ------------------------------------SNPCLTGGTCIPFYN---------GFLCNCP-NGYTGKRCEARG 3941 (4289)
T ss_pred ------------------------------------CCCCCCCCEEEecCC---------CeeEeCC-CCccCceeeccc
Confidence 556777888866533 5689999 999999999874
Q ss_pred CCccC-CCCCCCceee--CC--eeecCCCcccCCCCCCc
Q 005509 277 SSTCV-NQCSGHGHCR--GG--FCQCDSGWYGVDCSIPS 310 (693)
Q Consensus 277 ~~~C~-~~C~~~G~C~--~g--~C~C~~G~~G~~C~~~~ 310 (693)
...|. +.|.++|.|+ .| .|.|.+||.|..|...+
T Consensus 3942 i~eCs~n~C~~gg~C~n~~gsf~CncT~g~~gr~c~~~~ 3980 (4289)
T KOG1219|consen 3942 ISECSKNVCGTGGQCINIPGSFHCNCTPGILGRTCCAEK 3980 (4289)
T ss_pred ccccccccccCCceeeccCCceEeccChhHhcccCcccc
Confidence 56687 7899999997 34 89999999999996544
No 11
>KOG0994 consensus Extracellular matrix glycoprotein Laminin subunit beta [Extracellular structures]
Probab=98.90 E-value=3.1e-09 Score=121.65 Aligned_cols=170 Identities=23% Similarity=0.505 Sum_probs=101.8
Q ss_pred CCC-CCCCCCCC-EEeccCCeEE-eCCCccCCCCCccccCcCCCCCC--CCCCC-CCccccccCC----------CCCCC
Q 005509 121 KSC-KSDCSGQG-VCNHELGQCR-CFHGFRGKGCSERIHFQCNFPKT--PELPY-GRWVVSICPT----------HCDTT 184 (693)
Q Consensus 121 ~~C-~~~C~~~G-~C~~~~G~C~-C~~G~~G~~Ce~~~~~~C~~~~~--~~~~~-g~~~~~~C~g----------~C~~~ 184 (693)
.+| +..|++|. +|+..+|.|+ |..-.+|..|+. |..+-. |.-.+ +.|.+.+||. .|.-.
T Consensus 854 PeCr~CqCNgHA~~Cd~~tGaCi~CqD~T~G~~Cdr-----Cl~GyyGdP~lg~g~~CrPCpCP~gp~Sg~~~A~sC~~d 928 (1758)
T KOG0994|consen 854 PECRPCQCNGHADTCDPITGACIDCQDSTTGHSCDR-----CLDGYYGDPRLGSGIGCRPCPCPDGPASGRQHADSCYLD 928 (1758)
T ss_pred CcCccccccCcccccCccccccccccccccccchhh-----hhccccCCcccCCCCCCCCCCCCCCCccchhcccccccc
Confidence 566 66788886 9999999996 999999999987 555422 21122 2566667762 23322
Q ss_pred ----CceeeeCCCcccCCCCCCCCCC--CcccCCCCCCC----------------------------------CCCCCCc
Q 005509 185 ----RAMCFCGEGTKYPNRPVAEACG--FQVNLPSQPGA----------------------------------PKSTDWA 224 (693)
Q Consensus 185 ----~g~C~C~~G~~G~~C~~~~~C~--~~~~~~~~~~~----------------------------------~C~~gw~ 224 (693)
.-.|.|.+||.|.+|+.+.+-. ......+|..| .|..||+
T Consensus 929 ~~t~~ivC~C~~GY~G~RCe~CA~~~fGnP~~GGtCq~CeC~~NiD~~d~~aCD~~TG~CLkCL~hTeG~hCe~Ck~Gf~ 1008 (1758)
T KOG0994|consen 929 TRTQQIVCHCQEGYSGSRCEICADNHFGNPSEGGTCQKCECSNNIDLYDPGACDVATGACLKCLYHTEGDHCEHCKDGFY 1008 (1758)
T ss_pred ccccceeeecccCccccchhhhcccccCCcccCCccccccccCCcCccCCCccchhhchhhhhhhcccccchhhccccch
Confidence 2279999999999986432110 00001111111 1456666
Q ss_pred CC----CCCCccCCCCCCCce---ecCCcccccccccccccccccCCCCCcCcccccccCC--------ccC-CCC--CC
Q 005509 225 KA----DLDNIFTTNGSKPGW---CNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSS--------TCV-NQC--SG 286 (693)
Q Consensus 225 g~----~c~~~~~~~C~~~G~---C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~--------~C~-~~C--~~ 286 (693)
|. +|.. ..|.-.|+ |..+ .++|+|.|. +...|..|+...+. .|. -+| .+
T Consensus 1009 GdA~~q~Cqr---C~Cn~LGTn~~~~CD---------r~tGQCpCl-pNv~G~~CDqCA~N~w~laSG~GCe~C~Cd~~~ 1075 (1758)
T KOG0994|consen 1009 GDALRQNCQR---CVCNFLGTNSTCHCD---------RFTGQCPCL-PNVQGVRCDQCAENHWNLASGEGCEPCNCDPIG 1075 (1758)
T ss_pred hHHHHhhhhh---heccccccCCccccc---------cccCcCCCC-cccccccccccccchhccccCCCCCccCCCccC
Confidence 64 2221 11211111 1111 147899998 99999999875321 111 012 34
Q ss_pred Cceee--CCeeecCCCcccCCCCC
Q 005509 287 HGHCR--GGFCQCDSGWYGVDCSI 308 (693)
Q Consensus 287 ~G~C~--~g~C~C~~G~~G~~C~~ 308 (693)
+-+|+ +|+|+|+|||-|..|++
T Consensus 1076 ~pqCN~ftGQCqCkpGfGGR~C~q 1099 (1758)
T KOG0994|consen 1076 GPQCNEFTGQCQCKPGFGGRTCSQ 1099 (1758)
T ss_pred CccccccccceeccCCCCCcchhH
Confidence 55787 89999999999999986
No 12
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=98.43 E-value=1.4e-06 Score=108.42 Aligned_cols=105 Identities=19% Similarity=0.318 Sum_probs=74.5
Q ss_pred CcccccCcccccCCCCcccccccCccCCCCCCCCCCCC-CEEeccCCeEEeCCCccCCCCCccccCcCCCCCCCCC---C
Q 005509 93 PWKAEIGRWLSGCDSVAKEVDLVEMIGGKSCKSDCSGQ-GVCNHELGQCRCFHGFRGKGCSERIHFQCNFPKTPEL---P 168 (693)
Q Consensus 93 ~~~~~~g~~~~~c~~~~~~~~~~~~~~~~~C~~~C~~~-G~C~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~---~ 168 (693)
.....+|++++.|..+...+.-...-...-|+.+|++| .+|+..+|.|.|.+.-.|..|+. |.++..... .
T Consensus 699 C~~g~tG~~Ce~C~~gfrr~~~~~~~~~~c~~C~cngh~~~Cd~~tG~C~C~~~t~G~~C~~-----C~~GfYg~~~~~~ 773 (1705)
T KOG1836|consen 699 CPVGYTGQFCESCAPGFRRLSPQLGPFCPCIPCDCNGHSNICDPRTGQCKCKHNTFGGQCAQ-----CVDGFYGLPDLGT 773 (1705)
T ss_pred CCCCcccchhhhcchhhhcccccCCCCCcccccccCCccccccCCCCceecccCCCCCchhh-----hcCCCCCccccCC
Confidence 33567999999998777655443222123338889997 79999999999999999999997 666543322 2
Q ss_pred CCCccccccCC------CCCCCCceee-eCCCcccCCCCCC
Q 005509 169 YGRWVVSICPT------HCDTTRAMCF-CGEGTKYPNRPVA 202 (693)
Q Consensus 169 ~g~~~~~~C~g------~C~~~~g~C~-C~~G~~G~~C~~~ 202 (693)
+++|....|++ .++...+.|. |++||+|..|+.+
T Consensus 774 ~~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~c 814 (1705)
T KOG1836|consen 774 SGDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRCEEC 814 (1705)
T ss_pred CCCCccCCCCCChhhcCcCcccceecCCCCCCCcccccccC
Confidence 23456666663 3444467999 9999999999654
No 13
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.23 E-value=2.6e-06 Score=100.07 Aligned_cols=73 Identities=26% Similarity=0.584 Sum_probs=53.2
Q ss_pred CCCC-CCCCCCCCEEeccCC----eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC------CCCCCCCcee
Q 005509 120 GKSC-KSDCSGQGVCNHELG----QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP------THCDTTRAMC 188 (693)
Q Consensus 120 ~~~C-~~~C~~~G~C~~~~G----~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~------g~C~~~~g~C 188 (693)
.+.| -++|.+.|+|....| +|+|++||+|++||.....+|+.+= +.+..|....|. ..|+-.+|+|
T Consensus 1716 ~~vC~lnpc~~~g~Cv~sp~a~GY~C~C~~g~~G~~Ce~~~dq~CPrGW---WG~P~CgpC~CavsKgfdp~CnKt~G~C 1792 (2531)
T KOG4289|consen 1716 VDVCSLNPCENQGTCVRSPGAHGYTCECPPGYTGPYCELRADQPCPRGW---WGFPTCGPCNCAVSKGFDPDCNKTNGQC 1792 (2531)
T ss_pred cchhcccccccCceeecCCCCCceeEECCCcccCcchhhhccCCCCCcc---cCCCCccCccccccCCCCCCccccCcce
Confidence 3566 678999999987665 8999999999999998777787531 112234444442 4577788999
Q ss_pred eeCCCcc
Q 005509 189 FCGEGTK 195 (693)
Q Consensus 189 ~C~~G~~ 195 (693)
.|.+.+.
T Consensus 1793 qCKe~hy 1799 (2531)
T KOG4289|consen 1793 QCKENHY 1799 (2531)
T ss_pred eeccccc
Confidence 9988765
No 14
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=98.18 E-value=2.2e-06 Score=100.67 Aligned_cols=94 Identities=26% Similarity=0.450 Sum_probs=66.0
Q ss_pred CC-eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCC--CceeeeCCCcccCCCCCCCCCCCcccC
Q 005509 137 LG-QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTT--RAMCFCGEGTKYPNRPVAEACGFQVNL 211 (693)
Q Consensus 137 ~G-~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~--~g~C~C~~G~~G~~C~~~~~C~~~~~~ 211 (693)
.| .|+|++||+|++||.. .+.|-.+ +|. +.|-.. ..+|.|.+||+|..|+....-+.
T Consensus 1220 nglrCrCPpGFTgd~CeTe-iDlCYs~-------------pC~nng~C~srEggYtCeCrpg~tGehCEvs~~agr---- 1281 (2531)
T KOG4289|consen 1220 NGLRCRCPPGFTGDYCETE-IDLCYSG-------------PCGNNGRCRSREGGYTCECRPGFTGEHCEVSARAGR---- 1281 (2531)
T ss_pred CceeEeCCCCCCcccccch-hHhhhcC-------------CCCCCCceEEecCceeEEecCCccccceeeecccCc----
Confidence 44 8999999999999986 4556543 454 334333 34899999999998876533211
Q ss_pred CCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccccc
Q 005509 212 PSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVP 275 (693)
Q Consensus 212 ~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~ 275 (693)
+.+..|.|+|+|.+..+. ...|.|++..|+++.|+..
T Consensus 1282 -------------------CvpGvC~nggtC~~~~ng--------gf~c~Cp~ge~e~prC~v~ 1318 (2531)
T KOG4289|consen 1282 -------------------CVPGVCKNGGTCVNLLNG--------GFCCHCPYGEFEDPRCEVT 1318 (2531)
T ss_pred -------------------cccceecCCCEEeecCCC--------ceeccCCCcccCCCceEEE
Confidence 125668899999776541 3489999767889999864
No 15
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=98.08 E-value=3.1e-06 Score=103.00 Aligned_cols=68 Identities=26% Similarity=0.671 Sum_probs=58.2
Q ss_pred CCCCCCCceecCCcccccccccccccccccCCCCCcCcccccccCCccCCCCCCCceee----CCeeecCCCcccCCCCC
Q 005509 233 TTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSSTCVNQCSGHGHCR----GGFCQCDSGWYGVDCSI 308 (693)
Q Consensus 233 ~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~~C~~~C~~~G~C~----~g~C~C~~G~~G~~C~~ 308 (693)
.++|+++|.|...+.. .++|.|+ .-|+|..||+..+.+-+++|..+|+|+ +..|.|+.||+|.+|+.
T Consensus 3869 ~npCqhgG~C~~~~~g--------gy~CkCp-sqysG~~CEi~~epC~snPC~~GgtCip~~n~f~CnC~~gyTG~~Ce~ 3939 (4289)
T KOG1219|consen 3869 DNPCQHGGTCISQPKG--------GYKCKCP-SQYSGNHCEIDLEPCASNPCLTGGTCIPFYNGFLCNCPNGYTGKRCEA 3939 (4289)
T ss_pred cCcccCCCEecCCCCC--------ceEEeCc-ccccCcccccccccccCCCCCCCCEEEecCCCeeEeCCCCccCceeec
Confidence 6789999999876542 5699999 999999999985544468999999998 44899999999999997
Q ss_pred C
Q 005509 309 P 309 (693)
Q Consensus 309 ~ 309 (693)
.
T Consensus 3940 ~ 3940 (4289)
T KOG1219|consen 3940 R 3940 (4289)
T ss_pred c
Confidence 7
No 16
>KOG1836 consensus Extracellular matrix glycoprotein Laminin subunits alpha and gamma [Extracellular structures]
Probab=97.98 E-value=5e-05 Score=95.02 Aligned_cols=198 Identities=20% Similarity=0.375 Sum_probs=113.3
Q ss_pred ccccCcccccCCCCcccccccCccCCCCC-CCCCCCCCEEec----cCCeEE-eCCCccCCCCCccccCcCCCC-----C
Q 005509 95 KAEIGRWLSGCDSVAKEVDLVEMIGGKSC-KSDCSGQGVCNH----ELGQCR-CFHGFRGKGCSERIHFQCNFP-----K 163 (693)
Q Consensus 95 ~~~~g~~~~~c~~~~~~~~~~~~~~~~~C-~~~C~~~G~C~~----~~G~C~-C~~G~~G~~Ce~~~~~~C~~~-----~ 163 (693)
..+-|..+++|..+.........- .+| +.+|-+.|.|.. ..+.|. |++||+|..|+. |..+ .
T Consensus 751 ~~t~G~~C~~C~~GfYg~~~~~~~--~dC~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~-----c~dgyfg~p~ 823 (1705)
T KOG1836|consen 751 HNTFGGQCAQCVDGFYGLPDLGTS--GDCQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRCEE-----CADGYFGNPL 823 (1705)
T ss_pred cCCCCCchhhhcCCCCCccccCCC--CCCccCCCCCChhhcCcCcccceecCCCCCCCccccccc-----CCCccccCCC
Confidence 345777888887776655443332 237 667777777743 346899 999999999998 4432 1
Q ss_pred CCCCCCCCccccccC--------CCCCCCCcee-eeCCCcccCCCCCCC--------------CCCCc------------
Q 005509 164 TPELPYGRWVVSICP--------THCDTTRAMC-FCGEGTKYPNRPVAE--------------ACGFQ------------ 208 (693)
Q Consensus 164 ~~~~~~g~~~~~~C~--------g~C~~~~g~C-~C~~G~~G~~C~~~~--------------~C~~~------------ 208 (693)
........+....|. ++|+-..|.| .|.....|..|+.+. .|..+
T Consensus 824 ~~~~~~~~c~~c~c~~n~dp~~~g~c~~~tg~c~~ci~nT~g~~cd~c~~g~~gd~l~~~p~~~c~~c~c~p~gs~~~~~ 903 (1705)
T KOG1836|consen 824 GHDGDVRPCQSCQCNFNVDPNAFGNCNRLTGECLKCIHNTAGEYCDLCKEGYFGDPLAPNPEDKCFACGCVPAGSELPSL 903 (1705)
T ss_pred CCCCCcccCccceeccccCccccccccccccceeeccCCcccccccccccCccccccCCCcCCccccccCccCCcccccc
Confidence 111122244444553 6788888888 577777777664321 11111
Q ss_pred ccCCCCCCCCCCCCCcCCCCCCcc-------------CCCCCCCceecCCcccccccccc--cccccccCCCCCcCcccc
Q 005509 209 VNLPSQPGAPKSTDWAKADLDNIF-------------TTNGSKPGWCNVDPEEAYALKVQ--FKEECDCKYDGLLGQFCE 273 (693)
Q Consensus 209 ~~~~~~~~~~C~~gw~g~~c~~~~-------------~~~C~~~G~C~~~~~~~~~~~~c--~~g~C~C~~~G~~G~~C~ 273 (693)
.+++....+.|.+.-.|.+|..+. ..+|...|.=+ ..| .+|+|.|. +|-+|..|+
T Consensus 904 ~c~~~tGQcec~~~v~g~~c~~c~~g~fnl~s~~gC~~c~c~~~gs~~---------~~c~~~tGqc~c~-~gVtgqrc~ 973 (1705)
T KOG1836|consen 904 TCNPVTGQCECKPNVEGRDCLYCFKGFFNLNSGVGCEPCNCDPTGSES---------SDCDVGTGQCYCR-PGVTGQRCD 973 (1705)
T ss_pred cCCCcccceeccCCCCccccccccccccccCCCCCccccccccccccc---------ccccccCCceeee-cCccccccC
Confidence 112222233444444455444222 11222222110 112 37899998 999999998
Q ss_pred cccC-------CccC-CCCCCCc----eee--CCeeecCCCcccCCCCCC
Q 005509 274 VPVS-------STCV-NQCSGHG----HCR--GGFCQCDSGWYGVDCSIP 309 (693)
Q Consensus 274 ~~~~-------~~C~-~~C~~~G----~C~--~g~C~C~~G~~G~~C~~~ 309 (693)
.... ..|- -.|...| .|+ +|+|.|.+++.|..|..-
T Consensus 974 qc~~~~~~~~~~gc~~c~c~~~Gs~~~qc~~~~G~c~c~~~~~g~~c~~c 1023 (1705)
T KOG1836|consen 974 QCETYHFGFQTEGCGLCECDPLGSRGFQCDPEDGQCPCRPGFEGRRCDQC 1023 (1705)
T ss_pred ccccCcccccccCCcceecccCCcccceecccCCeeeecCCCCCcccccc
Confidence 6421 1111 1344455 586 899999999999777653
No 17
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=97.94 E-value=9.5e-06 Score=55.37 Aligned_cols=27 Identities=41% Similarity=1.058 Sum_probs=24.6
Q ss_pred CCCCCCCEEeccCCeEEeCCCccCCCC
Q 005509 125 SDCSGQGVCNHELGQCRCFHGFRGKGC 151 (693)
Q Consensus 125 ~~C~~~G~C~~~~G~C~C~~G~~G~~C 151 (693)
..|++||+|+...|+|.|.+||+|++|
T Consensus 6 ~~C~~~G~C~~~~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 6 NICSGHGTCVSPCGRCVCDSGYTGPDC 32 (32)
T ss_pred CccCCCCEEeCCCCEEECCCCCcCCCC
Confidence 359999999977799999999999987
No 18
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.82 E-value=0.00015 Score=81.43 Aligned_cols=65 Identities=29% Similarity=0.729 Sum_probs=47.7
Q ss_pred CCCCceecCCcccccccccccccccccCCCCCcCccc-ccccCCccC-----CCCCCCceee------CCeeecCCCccc
Q 005509 236 GSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFC-EVPVSSTCV-----NQCSGHGHCR------GGFCQCDSGWYG 303 (693)
Q Consensus 236 C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C-~~~~~~~C~-----~~C~~~G~C~------~g~C~C~~G~~G 303 (693)
|.++++|..... .+.|.|+ +||+|..| .......|. ..|.++++|. ...|.|..||.|
T Consensus 280 c~~~~~C~~~~~---------~~~C~C~-~g~~g~~~~~~~~~~~C~~~~~~~~c~~g~~C~~~~~~~~~~C~c~~~~~g 349 (487)
T KOG1217|consen 280 CPNGGTCVNVPG---------SYRCTCP-PGFTGRLCTECVDVDECSPRNAGGPCANGGTCNTLGSFGGFRCACGPGFTG 349 (487)
T ss_pred cCCCCeeecCCC---------cceeeCC-CCCCCCCCccccccccccccccCCcCCCCcccccCCCCCCCCcCCCCCCCC
Confidence 777888865432 3789999 99999998 221123552 4588888993 236999999999
Q ss_pred CCCCCCc
Q 005509 304 VDCSIPS 310 (693)
Q Consensus 304 ~~C~~~~ 310 (693)
..|+.+.
T Consensus 350 ~~C~~~~ 356 (487)
T KOG1217|consen 350 RRCEDSN 356 (487)
T ss_pred CccccCC
Confidence 9999874
No 19
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=97.65 E-value=9.3e-05 Score=73.92 Aligned_cols=132 Identities=20% Similarity=0.379 Sum_probs=75.4
Q ss_pred eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCC-------CceeeeCCCcccCCCCCCC--------
Q 005509 139 QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTT-------RAMCFCGEGTKYPNRPVAE-------- 203 (693)
Q Consensus 139 ~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~-------~g~C~C~~G~~G~~C~~~~-------- 203 (693)
.| |++|..|++|.. |+.+.- ..|.|+..|. .|.|.|.+||.|+.|..+.
T Consensus 130 vC-Cp~gtyGpdCl~-----Cpggse----------r~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~Cg~eyfes~R 193 (350)
T KOG4260|consen 130 VC-CPDGTYGPDCLQ-----CPGGSE----------RPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRYCGIEYFESSR 193 (350)
T ss_pred ec-cCCCCcCCcccc-----CCCCCc----------CCcCCCCcccCCCCCCCCCcccccCCCCCccccccchHHHHhhc
Confidence 44 999999999986 543211 2354332221 4699999999999984321
Q ss_pred ------------CCCCcccCCCCCCC-CCCCCCcCC--CCCC---cc--CCCCCCCceecCCcccccccccccccccccC
Q 005509 204 ------------ACGFQVNLPSQPGA-PKSTDWAKA--DLDN---IF--TTNGSKPGWCNVDPEEAYALKVQFKEECDCK 263 (693)
Q Consensus 204 ------------~C~~~~~~~~~~~~-~C~~gw~g~--~c~~---~~--~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~ 263 (693)
.|...++.....+| .|..||.-. .|-+ +. +.+|..+..|.+..+ .++|.++
T Consensus 194 ne~~lvCt~Ch~~C~~~Csg~~~k~C~kCkkGW~lde~gCvDvnEC~~ep~~c~~~qfCvNteG---------Sf~C~dk 264 (350)
T KOG4260|consen 194 NEQHLVCTACHEGCLGVCSGESSKGCSKCKKGWKLDEEGCVDVNECQNEPAPCKAHQFCVNTEG---------SFKCEDK 264 (350)
T ss_pred ccccchhhhhhhhhhcccCCCCCCChhhhcccceecccccccHHHHhcCCCCCChhheeecCCC---------ceEeccc
Confidence 12211112222222 277888643 2211 11 566777777765432 5689998
Q ss_pred CCCCcCc--ccccccCCccCCCCC-CCceee----CCeeecCCCc
Q 005509 264 YDGLLGQ--FCEVPVSSTCVNQCS-GHGHCR----GGFCQCDSGW 301 (693)
Q Consensus 264 ~~G~~G~--~C~~~~~~~C~~~C~-~~G~C~----~g~C~C~~G~ 301 (693)
+||.+. .|+ .|...|. .++.|. ..+|+|..|.
T Consensus 265 -~Gy~~g~d~C~-----~~~d~~~~kn~~c~ni~~~~r~v~f~~~ 303 (350)
T KOG4260|consen 265 -EGYKKGVDECQ-----FCADVCASKNRPCMNIDGQYRCVCFSGL 303 (350)
T ss_pred -ccccCChHHhh-----hhhhhcccCCCCcccCCccEEEEecccc
Confidence 999862 222 2334443 355564 3478887765
No 20
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=97.61 E-value=5.2e-05 Score=51.77 Aligned_cols=25 Identities=64% Similarity=1.379 Sum_probs=23.0
Q ss_pred CCCCCCceee--CCeeecCCCcccCCC
Q 005509 282 NQCSGHGHCR--GGFCQCDSGWYGVDC 306 (693)
Q Consensus 282 ~~C~~~G~C~--~g~C~C~~G~~G~~C 306 (693)
..|++||+|+ .++|+|.+||+|.+|
T Consensus 6 ~~C~~~G~C~~~~g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 6 NICSGHGTCVSPCGRCVCDSGYTGPDC 32 (32)
T ss_pred CccCCCCEEeCCCCEEECCCCCcCCCC
Confidence 4699999999 799999999999987
No 21
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=97.42 E-value=0.00096 Score=71.60 Aligned_cols=153 Identities=20% Similarity=0.402 Sum_probs=87.3
Q ss_pred CCCCCC-EEeccCC---eEEeCCCccCCCCCccccCcCCCCCCC-------CCCCCCccccccCC-------CCCC----
Q 005509 126 DCSGQG-VCNHELG---QCRCFHGFRGKGCSERIHFQCNFPKTP-------ELPYGRWVVSICPT-------HCDT---- 183 (693)
Q Consensus 126 ~C~~~G-~C~~~~G---~C~C~~G~~G~~Ce~~~~~~C~~~~~~-------~~~~g~~~~~~C~g-------~C~~---- 183 (693)
.|++|. .|+...+ +|.|.++.+|++|+. |...-.. ..+-..+....|.+ ++..
T Consensus 279 KCNgHAs~Cv~d~~~~ltCdC~HNTaGPdCgr-----CKpfy~dRPW~raT~~~a~~c~ac~Cn~harrcrfn~Ely~lS 353 (592)
T KOG3512|consen 279 KCNGHASRCVMDESSHLTCDCEHNTAGPDCGR-----CKPFYYDRPWGRATALPANECVACNCNGHARRCRFNMELYRLS 353 (592)
T ss_pred eecCccceeeeccCCceEEecccCCCCCCccc-----ccccccCCCccccccCCCccccccccchhhhhcccchhhhccc
Confidence 478876 7865444 899999999999997 4432111 00111222333321 1111
Q ss_pred ---CCceee-eCCCcccCCCCCCCCCCCcccCCCCCCCCCCCCCcCCC---CC---CccCCCCCCCc----eecCCcccc
Q 005509 184 ---TRAMCF-CGEGTKYPNRPVAEACGFQVNLPSQPGAPKSTDWAKAD---LD---NIFTTNGSKPG----WCNVDPEEA 249 (693)
Q Consensus 184 ---~~g~C~-C~~G~~G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~---c~---~~~~~~C~~~G----~C~~~~~~~ 249 (693)
..|.|. |...+.|..|.. |..||+-.. .+ .+...+|+.-| +|+.
T Consensus 354 gr~SggvClnCrHnTaGrhChy-----------------CreGyyRd~s~pl~hrkaCk~CdChpVGs~gktCNq----- 411 (592)
T KOG3512|consen 354 GRRSGGVCLNCRHNTAGRHCHY-----------------CREGYYRDGSKPLTHRKACKACDCHPVGSAGKTCNQ----- 411 (592)
T ss_pred CccccceEeecccCCCCccccc-----------------ccCccccCCCCCCchhhhhhhcCCcccccccccccc-----
Confidence 134554 666667666632 233333211 00 01133454433 3543
Q ss_pred cccccccccccccCCCCCcCccccccc---------CCccC-------CCCCCCceeeCCeeecCCCcccCCCCCCccC
Q 005509 250 YALKVQFKEECDCKYDGLLGQFCEVPV---------SSTCV-------NQCSGHGHCRGGFCQCDSGWYGVDCSIPSVM 312 (693)
Q Consensus 250 ~~~~~c~~g~C~C~~~G~~G~~C~~~~---------~~~C~-------~~C~~~G~C~~g~C~C~~G~~G~~C~~~~~~ 312 (693)
.+|+|.|+ +|-+|..|+... ...|. ..|+++++=.+..|.|+.++.|..|+++..-
T Consensus 412 ------~tGqCpCk-eGvtG~tCnrCa~gyqqsrs~vapcik~p~~~~~~~~s~ve~qd~~s~Ck~~~~~~r~n~kkfc 483 (592)
T KOG3512|consen 412 ------TTGQCPCK-EGVTGLTCNRCAPGYQQSRSPVAPCIKIPTDAPTLGSSGVEPQDQCSKCKASPGGKRLNQKKFC 483 (592)
T ss_pred ------cCCcccCC-CCCcccccccccchhhcccCCCcCceecCCCCccccCCCCcchhccccCCCCCcceeccccccC
Confidence 37899999 999999998642 11221 2366666633556799999999999998765
No 22
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=97.41 E-value=0.00049 Score=78.05 Aligned_cols=142 Identities=20% Similarity=0.486 Sum_probs=83.4
Q ss_pred CCCCCCCCEEeccCC---eEEeCCCccC--CCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCCCc--eeeeCCCc
Q 005509 124 KSDCSGQGVCNHELG---QCRCFHGFRG--KGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTTRA--MCFCGEGT 194 (693)
Q Consensus 124 ~~~C~~~G~C~~~~G---~C~C~~G~~G--~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~~g--~C~C~~G~ 194 (693)
...|.-+..|...+| +|.|..||.| .+|... ++|.... ..|. ..|.+..+ .|.|..||
T Consensus 699 sh~cdt~a~C~pg~~~~~tcecs~g~~gdgr~c~d~--~eca~~~-----------~~CGp~s~Cin~pg~~rceC~~gy 765 (1289)
T KOG1214|consen 699 SHMCDTTARCHPGTGVDYTCECSSGYQGDGRNCVDE--NECATGF-----------HRCGPNSVCINLPGSYRCECRSGY 765 (1289)
T ss_pred CcccCCCccccCCCCcceEEEEeeccCCCCCCCCCh--hhhccCC-----------CCCCCCceeecCCCceeEEEeecc
Confidence 445777888885555 8999999986 467763 4665432 2554 44666554 67777776
Q ss_pred c--cC--CCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCcC-
Q 005509 195 K--YP--NRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLG- 269 (693)
Q Consensus 195 ~--G~--~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G- 269 (693)
. +. +|-.-.. +.-. ..|++ -..+|.-.|.|..... ++ ..+.|.|. +||.|
T Consensus 766 ~F~dd~~tCV~i~~---pap~--------------n~Ce~-g~h~C~i~g~a~c~~h----Gg--s~y~C~CL-PGfsGD 820 (1289)
T KOG1214|consen 766 EFADDRHTCVLITP---PAPA--------------NPCED-GSHTCAIAGQARCVHH----GG--STYSCACL-PGFSGD 820 (1289)
T ss_pred eeccCCcceEEecC---CCCC--------------Ccccc-CccccCcCCceEEEec----CC--ceEEEeec-CCccCC
Confidence 4 22 2310000 0000 01211 0234555554432110 00 25799998 99996
Q ss_pred -cccccccCCcc-CCCCCCCceee----CCeeecCCCcccCC
Q 005509 270 -QFCEVPVSSTC-VNQCSGHGHCR----GGFCQCDSGWYGVD 305 (693)
Q Consensus 270 -~~C~~~~~~~C-~~~C~~~G~C~----~g~C~C~~G~~G~~ 305 (693)
..|... +.| ++.|.-..+|. ...|+|++||.|+.
T Consensus 821 G~~c~dv--DeC~psrChp~A~CyntpgsfsC~C~pGy~GDG 860 (1289)
T KOG1214|consen 821 GHQCTDV--DECSPSRCHPAATCYNTPGSFSCRCQPGYYGDG 860 (1289)
T ss_pred ccccccc--cccCccccCCCceEecCCCcceeecccCccCCC
Confidence 445442 466 47899999997 34899999999974
No 23
>KOG1217 consensus Fibrillins and related proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=97.28 E-value=0.0023 Score=71.88 Aligned_cols=150 Identities=22% Similarity=0.500 Sum_probs=97.4
Q ss_pred CCCCCEEecc-----CCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCCC--ceeeeCCCcccC
Q 005509 127 CSGQGVCNHE-----LGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTTR--AMCFCGEGTKYP 197 (693)
Q Consensus 127 C~~~G~C~~~-----~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~~--g~C~C~~G~~G~ 197 (693)
+..++.|... .-.|.|..||.|..|+.. ...|..... .|. +.|.... ..|.|.+||.|.
T Consensus 136 ~~~~~~c~~~~~~~~~~~c~C~~g~~~~~~~~~-~~~C~~~~~-----------~c~~~~~C~~~~~~~~C~c~~~~~~~ 203 (487)
T KOG1217|consen 136 CCIDGSCSNGPGSVGPFRCSCTEGYEGEPCETD-LDECIQYSS-----------PCQNGGTCVNTGGSYLCSCPPGYTGS 203 (487)
T ss_pred eeCchhhcCCCCCCCceeeeeCCCccccccccc-ccccccCCC-----------CcCCCcccccCCCCeeEeCCCCccCC
Confidence 4567777643 237999999999999974 245653221 233 4455544 479999999999
Q ss_pred CCCCC---CCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCC-ceecCCcccccccccccccccccCCCCCcCccc-
Q 005509 198 NRPVA---EACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKP-GWCNVDPEEAYALKVQFKEECDCKYDGLLGQFC- 272 (693)
Q Consensus 198 ~C~~~---~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~-G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C- 272 (693)
.|... ..|.. . ..+.+..++.+..|+... ..|... +.|..... ..+|.|. +||.+..+
T Consensus 204 ~~~~~~~~~~c~~---~---~~~~~~~g~~~~~c~~~~-~~~~~~~~~c~~~~~---------~~~C~~~-~g~~~~~~~ 266 (487)
T KOG1217|consen 204 TCETTGNGGTCVD---S---VACSCPPGARGPECEVSI-VECASGDGTCVNTVG---------SYTCRCP-EGYTGDACV 266 (487)
T ss_pred cCcCCCCCceEec---c---eeccCCCCCCCCCccccc-ccccCCCCcccccCC---------ceeeeCC-CCccccccc
Confidence 88644 11110 0 234566777777776322 223322 77765432 4689998 99999874
Q ss_pred -ccccCCccCC--CCCCCceee----CCeeecCCCcccCCC
Q 005509 273 -EVPVSSTCVN--QCSGHGHCR----GGFCQCDSGWYGVDC 306 (693)
Q Consensus 273 -~~~~~~~C~~--~C~~~G~C~----~g~C~C~~G~~G~~C 306 (693)
.+. .+.|.. .|.++++|. ...|.|++||+|..|
T Consensus 267 ~~~~-~~~C~~~~~c~~~~~C~~~~~~~~C~C~~g~~g~~~ 306 (487)
T KOG1217|consen 267 TCVD-VDSCALIASCPNGGTCVNVPGSYRCTCPPGFTGRLC 306 (487)
T ss_pred eeee-ccccCCCCccCCCCeeecCCCcceeeCCCCCCCCCC
Confidence 111 235652 399999997 268999999999999
No 24
>smart00051 DSL delta serrate ligand.
Probab=97.08 E-value=0.00049 Score=54.94 Aligned_cols=46 Identities=26% Similarity=0.569 Sum_probs=37.5
Q ss_pred cccccCCCCCcCcccccccCCccCCCCCCCceee-CCeeecCCCcccCCC
Q 005509 258 EECDCKYDGLLGQFCEVPVSSTCVNQCSGHGHCR-GGFCQCDSGWYGVDC 306 (693)
Q Consensus 258 g~C~C~~~G~~G~~C~~~~~~~C~~~C~~~G~C~-~g~C~C~~G~~G~~C 306 (693)
..-.|+ ++|.|..|+.. +.+.+.+.++.+|+ .|.|.|.+||+|.+|
T Consensus 17 ~rv~C~-~~~yG~~C~~~--C~~~~d~~~~~~Cd~~G~~~C~~Gw~G~~C 63 (63)
T smart00051 17 IRVTCD-ENYYGEGCNKF--CRPRDDFFGHYTCDENGNKGCLEGWMGPYC 63 (63)
T ss_pred EEeeCC-CCCcCCccCCE--eCcCccccCCccCCcCCCEecCCCCcCCCC
Confidence 355788 99999999862 23335688999997 889999999999988
No 25
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=96.97 E-value=0.00064 Score=46.55 Aligned_cols=27 Identities=33% Similarity=0.832 Sum_probs=23.1
Q ss_pred CCCCCCCCEEeccC-C--eEEeCCCccCCC
Q 005509 124 KSDCSGQGVCNHEL-G--QCRCFHGFRGKG 150 (693)
Q Consensus 124 ~~~C~~~G~C~~~~-G--~C~C~~G~~G~~ 150 (693)
+++|.|+|+|.... + .|.|++||+|++
T Consensus 3 ~~~C~n~g~C~~~~~~~y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 3 SNPCQNGGTCIDLPGGGYTCECPPGYTGKR 32 (32)
T ss_dssp TTSSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred CCcCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence 56899999998766 4 899999999974
No 26
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=96.93 E-value=0.0028 Score=72.22 Aligned_cols=138 Identities=20% Similarity=0.391 Sum_probs=81.0
Q ss_pred CCCCCCCCCCCEEeccCC--eEEeCCCc--cCC--CCCccc----cCcCCCCCCCCCCCCCccccccC--CCCCCC----
Q 005509 121 KSCKSDCSGQGVCNHELG--QCRCFHGF--RGK--GCSERI----HFQCNFPKTPELPYGRWVVSICP--THCDTT---- 184 (693)
Q Consensus 121 ~~C~~~C~~~G~C~~~~G--~C~C~~G~--~G~--~Ce~~~----~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~---- 184 (693)
.+|+..|..+.+|++..| +|+|..|| .|+ +|-... ..+|..+. ..|. +.|.+.
T Consensus 738 a~~~~~CGp~s~Cin~pg~~rceC~~gy~F~dd~~tCV~i~~pap~n~Ce~g~-----------h~C~i~g~a~c~~hGg 806 (1289)
T KOG1214|consen 738 ATGFHRCGPNSVCINLPGSYRCECRSGYEFADDRHTCVLITPPAPANPCEDGS-----------HTCAIAGQARCVHHGG 806 (1289)
T ss_pred ccCCCCCCCCceeecCCCceeEEEeecceeccCCcceEEecCCCCCCccccCc-----------cccCcCCceEEEecCC
Confidence 334667999999998888 68877776 343 565432 23343321 2343 444433
Q ss_pred -CceeeeCCCcccCCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccC
Q 005509 185 -RAMCFCGEGTKYPNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCK 263 (693)
Q Consensus 185 -~g~C~C~~G~~G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~ 263 (693)
...|.|.+||.|..- .|. -+..|+ ++-|...+.|.++++ ...|.|.
T Consensus 807 s~y~C~CLPGfsGDG~----~c~-----------------dvDeC~---psrChp~A~Cyntpg---------sfsC~C~ 853 (1289)
T KOG1214|consen 807 STYSCACLPGFSGDGH----QCT-----------------DVDECS---PSRCHPAATCYNTPG---------SFSCRCQ 853 (1289)
T ss_pred ceEEEeecCCccCCcc----ccc-----------------cccccC---ccccCCCceEecCCC---------cceeecc
Confidence 238999999998531 010 011222 667888899987764 5689999
Q ss_pred CCCCcCc--cccccc--CCcc------CCCCCCCceee------CCeeecCCCccc
Q 005509 264 YDGLLGQ--FCEVPV--SSTC------VNQCSGHGHCR------GGFCQCDSGWYG 303 (693)
Q Consensus 264 ~~G~~G~--~C~~~~--~~~C------~~~C~~~G~C~------~g~C~C~~G~~G 303 (693)
+||.|. .|--.. ...| +..|.+...|. ..+|.|+++-.|
T Consensus 854 -pGy~GDGf~CVP~~~~~T~C~~er~hpl~chg~t~~~~~~Dp~~~e~p~~~~ppG 908 (1289)
T KOG1214|consen 854 -PGYYGDGFQCVPDTSSLTPCEQERFHPLQCHGSTGFCWCVDPDGHEVPGTQTPPG 908 (1289)
T ss_pred -cCccCCCceecCCCccCCccccccccceeeccccceeEeeCCCcccCCCCCCCCC
Confidence 999964 443210 1223 23465544332 337887776666
No 27
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.21 E-value=0.0027 Score=34.14 Aligned_cols=13 Identities=38% Similarity=1.216 Sum_probs=11.1
Q ss_pred eEEeCCCccCCCC
Q 005509 139 QCRCFHGFRGKGC 151 (693)
Q Consensus 139 ~C~C~~G~~G~~C 151 (693)
+|.|++||+|++|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 5999999999998
No 28
>PF00852 Glyco_transf_10: Glycosyltransferase family 10 (fucosyltransferase); InterPro: IPR001503 The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (2.4.1.- from EC) and related proteins into distinct sequence based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 10 GT10 from CAZY comprises enzymes with two known activities; galactoside 3(4)-L-fucosyltransferase (2.4.1.65 from EC) and galactoside 3-fucosyltransferase (2.4.1.152 from EC). The galactoside 3-fucosyltransferases display similarities with the alpha-2 and alpha-6-fucosyltransferases. The biosynthesis of the carbohydrate antigen sialyl Lewis X (sLe(x)) is dependent on the activity of a galactoside 3-fucosyltransferase. This enzyme catalyses the transfer of fucose from GDP-beta-fucose to the 3-OH of N-acetylglucosamine present in lactosamine acceptors. Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 3(4)-L-fucosyltransferase (2.4.1.65 from EC) belongs to the Lewis blood group system and is associated with Le(a/b) antigen. ; GO: 0008417 fucosyltransferase activity, 0006486 protein glycosylation, 0016020 membrane; PDB: 2NZX_B 2NZW_C 2NZY_C.
Probab=96.19 E-value=0.0087 Score=64.98 Aligned_cols=129 Identities=14% Similarity=0.121 Sum_probs=56.7
Q ss_pred cccCCCceeecCccCCchhhhhc--cccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCcccc
Q 005509 536 CFDPEKDLVLPAWKAPDAFVLRS--KLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKL 613 (693)
Q Consensus 536 ~f~p~kDvviP~~~~~~~~~~~~--~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~ 613 (693)
.||...||.+|+........... .+......+++..++++.+.. ....|..+++++...- .....
T Consensus 141 TYr~dSDi~~py~~~~~~~~~~~~~~~~~~~~~K~~~~~w~~Snc~------------~~~~R~~~~~~L~~~~-~vd~y 207 (349)
T PF00852_consen 141 TYRRDSDIPLPYGYFSPRESPSEKDDLPNILKKKTKLAAWIVSNCN------------PHSGREEYVRELSKYI-PVDSY 207 (349)
T ss_dssp --------------------------------TSSEEEEE--S-S--------------H-HHHHHHHHHHTTS--EEE-
T ss_pred ccccccccccccccccccccccccccccccccCCCceEEEEeeCcC------------CcccHHHHHHHHHhhc-CeEcc
Confidence 68899999999754322111111 111112233455666666543 2334999999887752 23344
Q ss_pred CcccCcceEEecCCchhHHHHhhcCceeeccCCC---CC-chhHHHHHhcCceeEEEe--C-Ce---eec--eecCCCcc
Q 005509 614 GKQHAEDVIVTSLRSENYHEDLSSSVFCGVLPGD---GW-SGRMEDSILQGCIPVVIQ--V-VI---SSF--LLLCQNGS 681 (693)
Q Consensus 614 g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~p~Gd---~~-s~Rl~dAi~~GCIPViis--d-~~---~~p--~l~~~~fs 681 (693)
|++... .......+.+.|++-||-|+.--. .. |-++++|+.+|+|||+++ . ++ .+| +|+.++|+
T Consensus 208 G~c~~~----~~~~~~~~~~~~~~ykF~lafENs~c~dYiTEK~~~al~~g~VPI~~G~~~~~~~~~~P~~SfI~~~df~ 283 (349)
T PF00852_consen 208 GKCGNN----NPCPRDCKLELLSKYKFYLAFENSNCPDYITEKFWNALLAGTVPIYWGPPRPNYEEFAPPNSFIHVDDFK 283 (349)
T ss_dssp SSTT------SSS--S-HHHHHHTEEEEEEE-SS--TT---HHHHHHHHTTSEEEEES---TTHHHHS-GGGSEEGGGSS
T ss_pred CCCCCC----CCcccccccccccCcEEEEEecCCCCCCCCCHHHHHHHHCCeEEEEECCEecccccCCCCCCccchhcCC
Confidence 544100 012224488999999999986542 22 788999999999999999 3 33 233 77777773
No 29
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.10 E-value=0.0026 Score=34.18 Aligned_cols=13 Identities=54% Similarity=1.551 Sum_probs=9.0
Q ss_pred eeecCCCcccCCC
Q 005509 294 FCQCDSGWYGVDC 306 (693)
Q Consensus 294 ~C~C~~G~~G~~C 306 (693)
.|+|++||+|.+|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 4778888888776
No 30
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=95.99 E-value=0.081 Score=56.29 Aligned_cols=153 Identities=20% Similarity=0.364 Sum_probs=82.6
Q ss_pred eccCCeEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC--CCCCCCCceeeeCCCcccCCCCCCCCCCCc---
Q 005509 134 NHELGQCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP--THCDTTRAMCFCGEGTKYPNRPVAEACGFQ--- 208 (693)
Q Consensus 134 ~~~~G~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C~~~~g~C~C~~G~~G~~C~~~~~C~~~--- 208 (693)
....+.|.+..+|.|..|+...........+. .. ..|. ..++...+.|. ..+|.|..|.....|+..
T Consensus 45 ~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~c~----~~---~~c~~~~~~~~~~~~~~-~~~~~g~~C~~~~~~~~~c~~ 116 (316)
T KOG1218|consen 45 EVNSGECGLGYGFVGSVCRIECVCGNAGGGCS----QP---CRCKNGGTCVSSTGYCH-LNGYEGPQCESPCPCGDGCAE 116 (316)
T ss_pred cCCceeEecccccCCCccccccccCCCCCccc----Cc---cccCCCCcccCCCCccc-CCCCCcccccCCCCcCCcccc
Confidence 44578999999999999987532222111110 00 0121 22222333444 688889888766555432
Q ss_pred -ccCCCCCCCCCCCCCcCCCCCC--ccCCCCCCCceecCCcccccccccccccccccCCCCCcCcccccccCCccC--CC
Q 005509 209 -VNLPSQPGAPKSTDWAKADLDN--IFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVPVSSTCV--NQ 283 (693)
Q Consensus 209 -~~~~~~~~~~C~~gw~g~~c~~--~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~~~~~C~--~~ 283 (693)
.+.+....+.+..+|.+..|.. ..... |.... .+..+..+.++.|.|. +||.|.+|.... ..|. ..
T Consensus 117 ~~C~~~~~~c~~~~~~~~~~C~~~~~~g~~------C~~~c-~~~~~~~~~~~~c~c~-~g~~g~~~~~~~-~~c~~~~~ 187 (316)
T KOG1218|consen 117 KTCANPRRECRCGGGYIGEQCGEENLVGLK------CQRDC-QCTGGCDCKNGICTCQ-PGFVGVFCVESC-SGCSPLTA 187 (316)
T ss_pred cccCCCccceecCCcCccccccccCCCCCC------ccCCC-CCccccCCCCCceecc-CCcccccccccC-CCcCCCcc
Confidence 1111111233445555555553 11111 21111 0111112247789999 999999998752 2255 45
Q ss_pred CCCCceee--CCeeecCCCccc
Q 005509 284 CSGHGHCR--GGFCQCDSGWYG 303 (693)
Q Consensus 284 C~~~G~C~--~g~C~C~~G~~G 303 (693)
|.+++.|. .+.|.|.+++.+
T Consensus 188 ~~~g~~C~~~~~~~~~~~~~~~ 209 (316)
T KOG1218|consen 188 CENGAKCNRSTGSCLCYPGPSG 209 (316)
T ss_pred cCCCCeeeccccccccCCCCcc
Confidence 66778997 678888888865
No 31
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=95.96 E-value=0.0074 Score=45.88 Aligned_cols=28 Identities=39% Similarity=1.047 Sum_probs=23.5
Q ss_pred CCCCCCE----EeccCCeEEeCCCccCCCCCc
Q 005509 126 DCSGQGV----CNHELGQCRCFHGFRGKGCSE 153 (693)
Q Consensus 126 ~C~~~G~----C~~~~G~C~C~~G~~G~~Ce~ 153 (693)
.|+++|. |+..+|+|.|.+|++|..|+.
T Consensus 3 ~C~~~g~~~~~C~~~~G~C~C~~~~~G~~C~~ 34 (50)
T cd00055 3 DCNGHGSLSGQCDPGTGQCECKPNTTGRRCDR 34 (50)
T ss_pred cCcCCCCCCccccCCCCEEeCCCcCCCCCCCC
Confidence 3555554 988899999999999999985
No 32
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=95.87 E-value=0.011 Score=41.78 Aligned_cols=32 Identities=31% Similarity=1.060 Sum_probs=26.7
Q ss_pred CCC-C-CCCCCCCEEeccCC--eEEeCCCcc-CCCCC
Q 005509 121 KSC-K-SDCSGQGVCNHELG--QCRCFHGFR-GKGCS 152 (693)
Q Consensus 121 ~~C-~-~~C~~~G~C~~~~G--~C~C~~G~~-G~~Ce 152 (693)
++| . .+|.++|+|....| .|.|++||. |..|+
T Consensus 3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~~g~~C~ 39 (39)
T smart00179 3 DECASGNPCQNGGTCVNTVGSYRCECPPGYTDGRNCE 39 (39)
T ss_pred ccCcCCCCcCCCCEeECCCCCeEeECCCCCccCCcCC
Confidence 567 3 57999999987666 799999999 98885
No 33
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=95.85 E-value=0.0041 Score=42.56 Aligned_cols=24 Identities=38% Similarity=0.942 Sum_probs=20.6
Q ss_pred CCCCCCceee-----CCeeecCCCcccCC
Q 005509 282 NQCSGHGHCR-----GGFCQCDSGWYGVD 305 (693)
Q Consensus 282 ~~C~~~G~C~-----~g~C~C~~G~~G~~ 305 (693)
++|.++|+|+ +..|+|++||+|.+
T Consensus 4 ~~C~n~g~C~~~~~~~y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 4 NPCQNGGTCIDLPGGGYTCECPPGYTGKR 32 (32)
T ss_dssp TSSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred CcCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence 5899999997 34899999999974
No 34
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines. The tertiary structure of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. The consensus pattern of the LE domain, whose eight conserved cysteines form four disulphide bonds, is xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx, where 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved). In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen. The binding-sites are located on the surface within the loops C1-C3 and C5-C6. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility, which determine the spacing in the formation of laminin networks of basement membranes.; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=95.82 E-value=0.0063 Score=46.00 Aligned_cols=23 Identities=35% Similarity=0.859 Sum_probs=19.9
Q ss_pred CEEeccCCeEEeCCCccCCCCCc
Q 005509 131 GVCNHELGQCRCFHGFRGKGCSE 153 (693)
Q Consensus 131 G~C~~~~G~C~C~~G~~G~~Ce~ 153 (693)
.+|+..+|+|.|.++|+|..|++
T Consensus 11 ~~C~~~~G~C~C~~~~~G~~C~~ 33 (49)
T PF00053_consen 11 QTCDPSTGQCVCKPGTTGPRCDQ 33 (49)
T ss_dssp SSEEETCEEESBSTTEESTTS-E
T ss_pred CcccCCCCEEeccccccCCcCcC
Confidence 38998899999999999999986
No 35
>smart00051 DSL delta serrate ligand.
Probab=95.63 E-value=0.01 Score=47.35 Aligned_cols=30 Identities=33% Similarity=0.783 Sum_probs=25.5
Q ss_pred CCC--CCCCCCCCEEeccCCeEEeCCCccCCCC
Q 005509 121 KSC--KSDCSGQGVCNHELGQCRCFHGFRGKGC 151 (693)
Q Consensus 121 ~~C--~~~C~~~G~C~~~~G~C~C~~G~~G~~C 151 (693)
+.| .+++.+|.+|+. .|.|.|.+||+|++|
T Consensus 32 ~~C~~~~d~~~~~~Cd~-~G~~~C~~Gw~G~~C 63 (63)
T smart00051 32 KFCRPRDDFFGHYTCDE-NGNKGCLEGWMGPYC 63 (63)
T ss_pred CEeCcCccccCCccCCc-CCCEecCCCCcCCCC
Confidence 455 346889999986 799999999999988
No 36
>KOG3512 consensus Netrin, axonal chemotropic factor [Signal transduction mechanisms]
Probab=95.59 E-value=0.047 Score=59.12 Aligned_cols=103 Identities=20% Similarity=0.419 Sum_probs=64.6
Q ss_pred ccccCcccccCCCCcccccccCc--cCCCCC-CCCCCCCCE-Eec------c-----CCeE-EeCCCccCCCCCccccCc
Q 005509 95 KAEIGRWLSGCDSVAKEVDLVEM--IGGKSC-KSDCSGQGV-CNH------E-----LGQC-RCFHGFRGKGCSERIHFQ 158 (693)
Q Consensus 95 ~~~~g~~~~~c~~~~~~~~~~~~--~~~~~C-~~~C~~~G~-C~~------~-----~G~C-~C~~G~~G~~Ce~~~~~~ 158 (693)
+.+.|.=|..|...+..-+++.. ...++| .+.|++|+. |-. . -|+| .|.+...|.+|.-
T Consensus 301 HNTaGPdCgrCKpfy~dRPW~raT~~~a~~c~ac~Cn~harrcrfn~Ely~lSgr~SggvClnCrHnTaGrhChy----- 375 (592)
T KOG3512|consen 301 HNTAGPDCGRCKPFYYDRPWGRATALPANECVACNCNGHARRCRFNMELYRLSGRRSGGVCLNCRHNTAGRHCHY----- 375 (592)
T ss_pred cCCCCCCcccccccccCCCccccccCCCccccccccchhhhhcccchhhhcccCccccceEeecccCCCCccccc-----
Confidence 34466666666555555555432 345788 778887764 411 1 2366 4999999999986
Q ss_pred CCCC-----CCCCCCCCCccccccC------CCCCCCCceeeeCCCcccCCCCCC
Q 005509 159 CNFP-----KTPELPYGRWVVSICP------THCDTTRAMCFCGEGTKYPNRPVA 202 (693)
Q Consensus 159 C~~~-----~~~~~~~g~~~~~~C~------g~C~~~~g~C~C~~G~~G~~C~~~ 202 (693)
|.-+ +.+......|....|. .+|+..+|+|.|.+|-+|..|..+
T Consensus 376 CreGyyRd~s~pl~hrkaCk~CdChpVGs~gktCNq~tGqCpCkeGvtG~tCnrC 430 (592)
T KOG3512|consen 376 CREGYYRDGSKPLTHRKACKACDCHPVGSAGKTCNQTTGQCPCKEGVTGLTCNRC 430 (592)
T ss_pred ccCccccCCCCCCchhhhhhhcCCcccccccccccccCCcccCCCCCcccccccc
Confidence 4322 1111122234444553 578999999999999999998544
No 37
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=95.21 E-value=0.026 Score=39.34 Aligned_cols=32 Identities=31% Similarity=1.049 Sum_probs=26.1
Q ss_pred CCCC--CCCCCCCEEeccCC--eEEeCCCccCCCCC
Q 005509 121 KSCK--SDCSGQGVCNHELG--QCRCFHGFRGKGCS 152 (693)
Q Consensus 121 ~~C~--~~C~~~G~C~~~~G--~C~C~~G~~G~~Ce 152 (693)
++|. .+|.++|.|....| .|.|..||.|..|+
T Consensus 3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g~~C~ 38 (38)
T cd00054 3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38 (38)
T ss_pred ccCCCCCCcCCCCEeECCCCCeEeECCCCCcCCcCC
Confidence 5663 57999999986666 79999999998885
No 38
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=94.91 E-value=0.019 Score=57.90 Aligned_cols=45 Identities=33% Similarity=0.833 Sum_probs=35.4
Q ss_pred cCCCCCcCcccccccCCccCCCCCCCceee-------CCeeecCCCcccCCCCC
Q 005509 262 CKYDGLLGQFCEVPVSSTCVNQCSGHGHCR-------GGFCQCDSGWYGVDCSI 308 (693)
Q Consensus 262 C~~~G~~G~~C~~~~~~~C~~~C~~~G~C~-------~g~C~C~~G~~G~~C~~ 308 (693)
|+ +|-.|++|... +..-..+|.++|.|. +|.|.|.+||+|+.|..
T Consensus 132 Cp-~gtyGpdCl~C-pggser~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~ 183 (350)
T KOG4260|consen 132 CP-DGTYGPDCLQC-PGGSERPCFGNGSCHGDGSREGSGKCKCETGYTGPLCRY 183 (350)
T ss_pred cC-CCCcCCccccC-CCCCcCCcCCCCcccCCCCCCCCCcccccCCCCCccccc
Confidence 77 89999999842 111125799999996 67999999999999875
No 39
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=94.73 E-value=0.033 Score=41.53 Aligned_cols=23 Identities=35% Similarity=1.012 Sum_probs=20.8
Q ss_pred CEEeccCCeEEeCCCccCCCCCc
Q 005509 131 GVCNHELGQCRCFHGFRGKGCSE 153 (693)
Q Consensus 131 G~C~~~~G~C~C~~G~~G~~Ce~ 153 (693)
..|+..+|+|.|.++++|..|+.
T Consensus 11 ~~C~~~~G~C~C~~~~~G~~C~~ 33 (46)
T smart00180 11 GTCDPDTGQCECKPNVTGRRCDR 33 (46)
T ss_pred CcccCCCCEEECCCCCCCCCCCc
Confidence 57888889999999999999985
No 40
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain.; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=94.34 E-value=0.014 Score=46.58 Aligned_cols=45 Identities=29% Similarity=0.705 Sum_probs=25.6
Q ss_pred ccccccCCCCCcCcccccccCCccCC--CCCCCceee-CCeeecCCCcccCCC
Q 005509 257 KEECDCKYDGLLGQFCEVPVSSTCVN--QCSGHGHCR-GGFCQCDSGWYGVDC 306 (693)
Q Consensus 257 ~g~C~C~~~G~~G~~C~~~~~~~C~~--~C~~~G~C~-~g~C~C~~G~~G~~C 306 (693)
..+-.|. ..|.|..|+. .|.+ .-.+|-+|+ +|.=.|.+||+|++|
T Consensus 16 ~~rv~C~-~nyyG~~C~~----~C~~~~d~~ghy~Cd~~G~~~C~~Gw~G~~C 63 (63)
T PF01414_consen 16 RIRVVCD-ENYYGPNCSK----FCKPRDDSFGHYTCDSNGNKVCLPGWTGPNC 63 (63)
T ss_dssp --------TTEETTTT-E----E---EEETTEEEEE-SS--EEE-TTEESTTS
T ss_pred EEEEECC-CCCCCccccC----CcCCCcCCcCCcccCCCCCCCCCCCCcCCCC
Confidence 4577898 9999999997 5643 245677787 889999999999998
No 41
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=93.96 E-value=0.071 Score=36.44 Aligned_cols=29 Identities=34% Similarity=0.953 Sum_probs=23.4
Q ss_pred CCCCCCCCEEeccCC--eEEeCCCccCC-CCC
Q 005509 124 KSDCSGQGVCNHELG--QCRCFHGFRGK-GCS 152 (693)
Q Consensus 124 ~~~C~~~G~C~~~~G--~C~C~~G~~G~-~Ce 152 (693)
...|.+++.|....+ .|.|+.||.|. .|+
T Consensus 5 ~~~C~~~~~C~~~~~~~~C~C~~g~~g~~~C~ 36 (36)
T cd00053 5 SNPCSNGGTCVNTPGSYRCVCPPGYTGDRSCE 36 (36)
T ss_pred CCCCCCCCEEecCCCCeEeECCCCCcccCCcC
Confidence 356888999986544 89999999998 664
No 42
>smart00181 EGF Epidermal growth factor-like domain.
Probab=93.96 E-value=0.076 Score=36.62 Aligned_cols=27 Identities=37% Similarity=0.947 Sum_probs=21.9
Q ss_pred CCCCCCCEEeccCC--eEEeCCCccC-CCCC
Q 005509 125 SDCSGQGVCNHELG--QCRCFHGFRG-KGCS 152 (693)
Q Consensus 125 ~~C~~~G~C~~~~G--~C~C~~G~~G-~~Ce 152 (693)
..|.++ +|....+ .|.|+.||.| ..|+
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~~~C~ 35 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTGDKRCE 35 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCccCCccC
Confidence 468888 9986544 8999999999 7774
No 43
>PHA02887 EGF-like protein; Provisional
Probab=93.92 E-value=0.07 Score=47.05 Aligned_cols=33 Identities=33% Similarity=0.827 Sum_probs=25.2
Q ss_pred CCCC----CCCCCCCEEeccCC----eEEeCCCccCCCCCcc
Q 005509 121 KSCK----SDCSGQGVCNHELG----QCRCFHGFRGKGCSER 154 (693)
Q Consensus 121 ~~C~----~~C~~~G~C~~~~G----~C~C~~G~~G~~Ce~~ 154 (693)
.+|+ +-|- ||+|..... .|.|+.||+|..|+..
T Consensus 84 ~pC~~eyk~YCi-HG~C~yI~dL~epsCrC~~GYtG~RCE~v 124 (126)
T PHA02887 84 EKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGYTGIRCDEV 124 (126)
T ss_pred cccChHhhCEee-CCEEEccccCCCceeECCCCcccCCCCcc
Confidence 5663 3487 789964333 7999999999999974
No 44
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=92.93 E-value=0.11 Score=36.52 Aligned_cols=26 Identities=35% Similarity=0.961 Sum_probs=21.7
Q ss_pred CCCCCCceee----CCeeecCCCcc-cCCCC
Q 005509 282 NQCSGHGHCR----GGFCQCDSGWY-GVDCS 307 (693)
Q Consensus 282 ~~C~~~G~C~----~g~C~C~~G~~-G~~C~ 307 (693)
.+|.++|+|. ...|.|++||. |..|+
T Consensus 9 ~~C~~~~~C~~~~g~~~C~C~~g~~~g~~C~ 39 (39)
T smart00179 9 NPCQNGGTCVNTVGSYRCECPPGYTDGRNCE 39 (39)
T ss_pred CCcCCCCEeECCCCCeEeECCCCCccCCcCC
Confidence 4788889997 34799999999 98885
No 45
>KOG1218 consensus Proteins containing Ca2+-binding EGF-like domains [Signal transduction mechanisms]
Probab=92.79 E-value=1.1 Score=47.56 Aligned_cols=27 Identities=30% Similarity=0.912 Sum_probs=19.8
Q ss_pred CCCCCEEeccCCeEEeCCCccCCCCCcc
Q 005509 127 CSGQGVCNHELGQCRCFHGFRGKGCSER 154 (693)
Q Consensus 127 C~~~G~C~~~~G~C~C~~G~~G~~Ce~~ 154 (693)
|..++.+...++.|. ..+|.|..|+..
T Consensus 81 c~~~~~~~~~~~~~~-~~~~~g~~C~~~ 107 (316)
T KOG1218|consen 81 CKNGGTCVSSTGYCH-LNGYEGPQCESP 107 (316)
T ss_pred cCCCCcccCCCCccc-CCCCCcccccCC
Confidence 667777775566666 788888888874
No 46
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=92.38 E-value=0.14 Score=35.45 Aligned_cols=26 Identities=35% Similarity=0.942 Sum_probs=21.4
Q ss_pred CCCCCCceee----CCeeecCCCcccCCCC
Q 005509 282 NQCSGHGHCR----GGFCQCDSGWYGVDCS 307 (693)
Q Consensus 282 ~~C~~~G~C~----~g~C~C~~G~~G~~C~ 307 (693)
.+|.+++.|. ...|.|.+||.|..|+
T Consensus 9 ~~C~~~~~C~~~~~~~~C~C~~g~~g~~C~ 38 (38)
T cd00054 9 NPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38 (38)
T ss_pred CCcCCCCEeECCCCCeEeECCCCCcCCcCC
Confidence 4688888997 3479999999998885
No 47
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system.; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=91.88 E-value=0.09 Score=39.96 Aligned_cols=29 Identities=34% Similarity=0.723 Sum_probs=17.3
Q ss_pred CCCCCCEEec----cCC--eEEeCCCccCCCCCcc
Q 005509 126 DCSGQGVCNH----ELG--QCRCFHGFRGKGCSER 154 (693)
Q Consensus 126 ~C~~~G~C~~----~~G--~C~C~~G~~G~~Ce~~ 154 (693)
.|++||..-. ..| .|.|+.-|.|++|++.
T Consensus 18 ~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~ 52 (56)
T PF04863_consen 18 SCSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTL 52 (56)
T ss_dssp --TTSEE--TTS-EETTEE--EE-TTEESTTS-EE
T ss_pred CcCCCCeeeeccccccCCccccccCCcCCCCcccC
Confidence 6999998842 234 7999999999999985
No 48
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=90.97 E-value=0.22 Score=36.19 Aligned_cols=27 Identities=30% Similarity=0.943 Sum_probs=23.0
Q ss_pred CCC---CCCCCCCCEEeccCC--eEEeCCCcc
Q 005509 121 KSC---KSDCSGQGVCNHELG--QCRCFHGFR 147 (693)
Q Consensus 121 ~~C---~~~C~~~G~C~~~~G--~C~C~~G~~ 147 (693)
++| +..|..++.|....| .|.|++||.
T Consensus 3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp STTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 677 346988999998888 899999998
No 49
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=90.52 E-value=0.26 Score=33.48 Aligned_cols=26 Identities=38% Similarity=0.912 Sum_probs=21.1
Q ss_pred CCCCCCceee----CCeeecCCCcccC-CCC
Q 005509 282 NQCSGHGHCR----GGFCQCDSGWYGV-DCS 307 (693)
Q Consensus 282 ~~C~~~G~C~----~g~C~C~~G~~G~-~C~ 307 (693)
.+|.+++.|. ...|.|+.||.|. .|+
T Consensus 6 ~~C~~~~~C~~~~~~~~C~C~~g~~g~~~C~ 36 (36)
T cd00053 6 NPCSNGGTCVNTPGSYRCVCPPGYTGDRSCE 36 (36)
T ss_pred CCCCCCCEEecCCCCeEeECCCCCcccCCcC
Confidence 5688888897 4589999999998 664
No 50
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=90.10 E-value=0.35 Score=36.66 Aligned_cols=20 Identities=30% Similarity=0.830 Sum_probs=17.2
Q ss_pred eee--CCeeecCCCcccCCCCC
Q 005509 289 HCR--GGFCQCDSGWYGVDCSI 308 (693)
Q Consensus 289 ~C~--~g~C~C~~G~~G~~C~~ 308 (693)
.|+ +|+|.|+++|+|.+|+.
T Consensus 13 ~C~~~~G~C~C~~~~~G~~C~~ 34 (50)
T cd00055 13 QCDPGTGQCECKPNTTGRRCDR 34 (50)
T ss_pred cccCCCCEEeCCCcCCCCCCCC
Confidence 464 78999999999999995
No 51
>KOG2619 consensus Fucosyltransferase [Carbohydrate transport and metabolism; Amino acid transport and metabolism]
Probab=89.66 E-value=0.63 Score=50.35 Aligned_cols=126 Identities=14% Similarity=0.078 Sum_probs=72.1
Q ss_pred cccCCCceeecCccCC-ch-hhhhccccCCCCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCcccc
Q 005509 536 CFDPEKDLVLPAWKAP-DA-FVLRSKLWASPREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKL 613 (693)
Q Consensus 536 ~f~p~kDvviP~~~~~-~~-~~~~~~~~~~~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~ 613 (693)
.||-+.|+.+|+-... .. ..+..++...-..+++.++++..+... ..-|.++++++... -.....
T Consensus 162 Tyr~dSd~~~pygy~~~~~~~~~~~p~~~~~~~k~~~~aw~vSnc~~------------~~~R~~~~~~L~k~-l~iD~Y 228 (372)
T KOG2619|consen 162 TYRRDSDLFVPYGYLEKPEANPVLVPVNSILSAKTKLAAWLVSNCIP------------RSARLDYYKELMKH-LEIDSY 228 (372)
T ss_pred EEeccCCCCCccceEeecccCceecccccccccccceeeeeccccCc------------chHHHHHHHHHHhh-Cceeec
Confidence 5677777777762211 11 111111111124566777788776542 33566666666543 122233
Q ss_pred CcccCcceEEecCCchhHHHHhhcCceeeccCCC----CCchhHHHHHhcCceeEEEeCCeeeceec
Q 005509 614 GKQHAEDVIVTSLRSENYHEDLSSSVFCGVLPGD----GWSGRMEDSILQGCIPVVIQVVISSFLLL 676 (693)
Q Consensus 614 g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~p~Gd----~~s~Rl~dAi~~GCIPViisd~~~~p~l~ 676 (693)
|.+..+. .........++.++.=||=|+.--. =-|-+|+-|+.+|.|||+++....+.++|
T Consensus 229 G~c~~~~--~~~~~~~~~~~~~s~YKFyLAfENS~c~DYVTEKfw~al~~gsVPVvlg~~n~e~fvP 293 (372)
T KOG2619|consen 229 GECLRKN--ANRDPSDCLLETLSHYKFYLAFENSNCEDYVTEKFWNALDAGSVPVVLGPPNYENFVP 293 (372)
T ss_pred ccccccc--ccCCCCCcceeecccceEEEEecccCCcccccHHHHhhhhcCcccEEECCccccccCC
Confidence 3333211 0112234556788899999986642 22889999999999999999865554555
No 52
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=89.56 E-value=0.17 Score=40.50 Aligned_cols=46 Identities=24% Similarity=0.447 Sum_probs=21.9
Q ss_pred eEEeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccCCCCCCCCceeeeCCCcccCCC
Q 005509 139 QCRCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICPTHCDTTRAMCFCGEGTKYPNR 199 (693)
Q Consensus 139 ~C~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~g~C~~~~g~C~C~~G~~G~~C 199 (693)
.-.|...|.|+.|+.. |...... .=.-.|+ ..|.=.|.+||+|++|
T Consensus 18 rv~C~~nyyG~~C~~~----C~~~~d~----------~ghy~Cd-~~G~~~C~~Gw~G~~C 63 (63)
T PF01414_consen 18 RVVCDENYYGPNCSKF----CKPRDDS----------FGHYTCD-SNGNKVCLPGWTGPNC 63 (63)
T ss_dssp -----TTEETTTT-EE-------EEET----------TEEEEE--SS--EEE-TTEESTTS
T ss_pred EEECCCCCCCccccCC----cCCCcCC----------cCCcccC-CCCCCCCCCCCcCCCC
Confidence 5689999999999985 4321000 0012466 4688889999999876
No 53
>PHA02887 EGF-like protein; Provisional
Probab=87.79 E-value=0.4 Score=42.45 Aligned_cols=26 Identities=35% Similarity=1.016 Sum_probs=21.6
Q ss_pred CCCCCceee------CCeeecCCCcccCCCCCC
Q 005509 283 QCSGHGHCR------GGFCQCDSGWYGVDCSIP 309 (693)
Q Consensus 283 ~C~~~G~C~------~g~C~C~~G~~G~~C~~~ 309 (693)
-|- ||+|. ...|.|..||+|..|+.-
T Consensus 93 YCi-HG~C~yI~dL~epsCrC~~GYtG~RCE~v 124 (126)
T PHA02887 93 FCI-NGECMNIIDLDEKFCICNKGYTGIRCDEV 124 (126)
T ss_pred Eee-CCEEEccccCCCceeECCCCcccCCCCcc
Confidence 466 68996 459999999999999874
No 54
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=87.47 E-value=0.31 Score=36.66 Aligned_cols=28 Identities=25% Similarity=0.555 Sum_probs=19.3
Q ss_pred ceee--CCeeecCCCcccCCCCCCccCCCCCC
Q 005509 288 GHCR--GGFCQCDSGWYGVDCSIPSVMSSMSE 317 (693)
Q Consensus 288 G~C~--~g~C~C~~G~~G~~C~~~~~~~~~~~ 317 (693)
..|. +|+|.|+++|+|..|++ +..+..+
T Consensus 11 ~~C~~~~G~C~C~~~~~G~~C~~--C~~g~~~ 40 (49)
T PF00053_consen 11 QTCDPSTGQCVCKPGTTGPRCDQ--CKPGYFG 40 (49)
T ss_dssp SSEEETCEEESBSTTEESTTS-E--E-TTEEC
T ss_pred CcccCCCCEEeccccccCCcCcC--CCCcccc
Confidence 3675 78999999999999996 4444333
No 55
>smart00181 EGF Epidermal growth factor-like domain.
Probab=86.51 E-value=0.66 Score=31.79 Aligned_cols=25 Identities=32% Similarity=0.852 Sum_probs=19.2
Q ss_pred CCCCCCceee----CCeeecCCCccc-CCCC
Q 005509 282 NQCSGHGHCR----GGFCQCDSGWYG-VDCS 307 (693)
Q Consensus 282 ~~C~~~G~C~----~g~C~C~~G~~G-~~C~ 307 (693)
.+|.++ .|. ...|.|++||.| ..|+
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g~~~C~ 35 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTGDKRCE 35 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCccCCccC
Confidence 357777 786 458999999999 7764
No 56
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=86.17 E-value=0.48 Score=33.32 Aligned_cols=26 Identities=31% Similarity=0.908 Sum_probs=19.1
Q ss_pred CCCCCCCEEeccCC--eEEeCCCccCCC
Q 005509 125 SDCSGQGVCNHELG--QCRCFHGFRGKG 150 (693)
Q Consensus 125 ~~C~~~G~C~~~~G--~C~C~~G~~G~~ 150 (693)
..|+.+.+|....+ +|.|++||.|+-
T Consensus 6 ~~C~~nA~C~~~~~~~~C~C~~Gy~GdG 33 (36)
T PF12947_consen 6 GGCHPNATCTNTGGSYTCTCKPGYEGDG 33 (36)
T ss_dssp GGS-TTCEEEE-TTSEEEEE-CEEECCS
T ss_pred CCCCCCcEeecCCCCEEeECCCCCccCC
Confidence 35889999987666 899999999863
No 57
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=84.89 E-value=0.55 Score=41.17 Aligned_cols=32 Identities=41% Similarity=0.909 Sum_probs=25.0
Q ss_pred CCCCCCceeeC---------CeeecCC-------------CcccCCCCCCccCC
Q 005509 282 NQCSGHGHCRG---------GFCQCDS-------------GWYGVDCSIPSVMS 313 (693)
Q Consensus 282 ~~C~~~G~C~~---------g~C~C~~-------------G~~G~~C~~~~~~~ 313 (693)
++|++||.|.. ..|+|.+ .|.|..|+......
T Consensus 13 n~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKkDvS~ 66 (103)
T PF12955_consen 13 NNCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKGKTTHWGGPACQKKDVSV 66 (103)
T ss_pred cCCCCCceEeeccCCCccceEEEEeeccccccccccCceeeecccccccccccc
Confidence 78999999972 1689987 68888888876554
No 58
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=84.85 E-value=0.76 Score=40.28 Aligned_cols=31 Identities=32% Similarity=0.993 Sum_probs=24.1
Q ss_pred CCCCCCCCEEeccC----C---eEEeCC-------------CccCCCCCcc
Q 005509 124 KSDCSGQGVCNHEL----G---QCRCFH-------------GFRGKGCSER 154 (693)
Q Consensus 124 ~~~C~~~G~C~~~~----G---~C~C~~-------------G~~G~~Ce~~ 154 (693)
.++|++||.|.... + .|.|.+ .|.|..|+..
T Consensus 12 Tn~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKk 62 (103)
T PF12955_consen 12 TNNCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKGKTTHWGGPACQKK 62 (103)
T ss_pred ccCCCCCceEeeccCCCccceEEEEeeccccccccccCceeeecccccccc
Confidence 46899999997642 1 699998 5778888875
No 59
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=84.58 E-value=0.62 Score=54.93 Aligned_cols=33 Identities=33% Similarity=0.772 Sum_probs=29.0
Q ss_pred CCCCCCCCCCCEEeccCCeEEeCCCccCCCCCcc
Q 005509 121 KSCKSDCSGQGVCNHELGQCRCFHGFRGKGCSER 154 (693)
Q Consensus 121 ~~C~~~C~~~G~C~~~~G~C~C~~G~~G~~Ce~~ 154 (693)
..|+..|++||+|+. ...|+|.+||.+++|+..
T Consensus 626 ~~~~~~C~g~GVCnn-~~~ChC~~gwapp~C~~~ 658 (716)
T KOG3607|consen 626 SCCPTTCNGHGVCNN-ELNCHCEPGWAPPFCFIF 658 (716)
T ss_pred cccccccCCCcccCC-CcceeeCCCCCCCccccc
Confidence 445778999999994 789999999999999985
No 60
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=83.06 E-value=1.1 Score=53.09 Aligned_cols=35 Identities=37% Similarity=0.835 Sum_probs=30.5
Q ss_pred CccCCCCCCCceee-CCeeecCCCcccCCCCCCccC
Q 005509 278 STCVNQCSGHGHCR-GGFCQCDSGWYGVDCSIPSVM 312 (693)
Q Consensus 278 ~~C~~~C~~~G~C~-~g~C~C~~G~~G~~C~~~~~~ 312 (693)
..|+..|+++|.|+ ...|+|.+||.+++|++....
T Consensus 626 ~~~~~~C~g~GVCnn~~~ChC~~gwapp~C~~~~~~ 661 (716)
T KOG3607|consen 626 SCCPTTCNGHGVCNNELNCHCEPGWAPPFCFIFGYG 661 (716)
T ss_pred cccccccCCCcccCCCcceeeCCCCCCCccccccCC
Confidence 35677899999998 679999999999999998755
No 61
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=82.14 E-value=1.5 Score=32.57 Aligned_cols=17 Identities=29% Similarity=0.751 Sum_probs=14.1
Q ss_pred CCeeecCCCcccCCCCC
Q 005509 292 GGFCQCDSGWYGVDCSI 308 (693)
Q Consensus 292 ~g~C~C~~G~~G~~C~~ 308 (693)
+|+|.|+++|+|.+|+.
T Consensus 17 ~G~C~C~~~~~G~~C~~ 33 (46)
T smart00180 17 TGQCECKPNVTGRRCDR 33 (46)
T ss_pred CCEEECCCCCCCCCCCc
Confidence 67888888888888884
No 62
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=80.99 E-value=0.74 Score=35.14 Aligned_cols=29 Identities=45% Similarity=0.927 Sum_probs=16.7
Q ss_pred CCCCCCceee------CC--eeecCCCcccCCCCCCc
Q 005509 282 NQCSGHGHCR------GG--FCQCDSGWYGVDCSIPS 310 (693)
Q Consensus 282 ~~C~~~G~C~------~g--~C~C~~G~~G~~C~~~~ 310 (693)
-.|++||..- +| .|.|..-|.|++|++..
T Consensus 17 i~CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~~ 53 (56)
T PF04863_consen 17 ISCSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTLI 53 (56)
T ss_dssp S--TTSEE--TTS-EETTEE--EE-TTEESTTS-EE-
T ss_pred CCcCCCCeeeeccccccCCccccccCCcCCCCcccCC
Confidence 4689999874 33 79999999999999765
No 63
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=78.59 E-value=1.5 Score=39.63 Aligned_cols=29 Identities=38% Similarity=0.834 Sum_probs=22.1
Q ss_pred CCCCCCCEEeccC----CeEEeCCCccCCCCCcc
Q 005509 125 SDCSGQGVCNHEL----GQCRCFHGFRGKGCSER 154 (693)
Q Consensus 125 ~~C~~~G~C~~~~----G~C~C~~G~~G~~Ce~~ 154 (693)
+-|-+ |+|.... -.|.|..||+|..||..
T Consensus 51 ~YClH-G~C~yI~dl~~~~CrC~~GYtGeRCEh~ 83 (139)
T PHA03099 51 GYCLH-GDCIHARDIDGMYCRCSHGYTGIRCQHV 83 (139)
T ss_pred CEeEC-CEEEeeccCCCceeECCCCcccccccce
Confidence 34765 5995433 27999999999999985
No 64
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=77.80 E-value=1.8 Score=39.08 Aligned_cols=26 Identities=35% Similarity=1.050 Sum_probs=21.4
Q ss_pred CCCCceee------CCeeecCCCcccCCCCCCc
Q 005509 284 CSGHGHCR------GGFCQCDSGWYGVDCSIPS 310 (693)
Q Consensus 284 C~~~G~C~------~g~C~C~~G~~G~~C~~~~ 310 (693)
|-+ |+|. ...|.|..||+|..||.-.
T Consensus 53 ClH-G~C~yI~dl~~~~CrC~~GYtGeRCEh~d 84 (139)
T PHA03099 53 CLH-GDCIHARDIDGMYCRCSHGYTGIRCQHVV 84 (139)
T ss_pred eEC-CEEEeeccCCCceeECCCCccccccccee
Confidence 555 4886 5689999999999999865
No 65
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=77.38 E-value=1.7 Score=29.86 Aligned_cols=22 Identities=41% Similarity=1.043 Sum_probs=18.2
Q ss_pred cccCCCCCCC-CceeeeCCCccc
Q 005509 175 SICPTHCDTT-RAMCFCGEGTKY 196 (693)
Q Consensus 175 ~~C~g~C~~~-~g~C~C~~G~~G 196 (693)
..|+..|+.. .++|.|++||.-
T Consensus 6 t~CpA~CDpn~~~~C~CPeGyIl 28 (34)
T PF09064_consen 6 TECPADCDPNSPGQCFCPEGYIL 28 (34)
T ss_pred ccCCCccCCCCCCceeCCCceEe
Confidence 4788999885 679999999973
No 66
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=76.24 E-value=2.5 Score=32.06 Aligned_cols=31 Identities=39% Similarity=0.827 Sum_probs=23.0
Q ss_pred CCcCcccccccCCccCCCCCCCceeeCCeeecCCCcc
Q 005509 266 GLLGQFCEVPVSSTCVNQCSGHGHCRGGFCQCDSGWY 302 (693)
Q Consensus 266 G~~G~~C~~~~~~~C~~~C~~~G~C~~g~C~C~~G~~ 302 (693)
-..|..|+.. .+|..+..|++|+|.|++||.
T Consensus 16 ~~~g~~C~~~------~qC~~~s~C~~g~C~C~~g~~ 46 (52)
T PF01683_consen 16 VQPGESCESD------EQCIGGSVCVNGRCQCPPGYV 46 (52)
T ss_pred CCCCCCCCCc------CCCCCcCEEcCCEeECCCCCE
Confidence 3446667653 345688999999999999984
No 67
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=75.34 E-value=1.6 Score=42.25 Aligned_cols=133 Identities=23% Similarity=0.565 Sum_probs=66.6
Q ss_pred CCEEeccCC--eEEeCCCcc---CCCCCccccCcCCCCCCCCCCCCCccccccC--CCC-------CCCCceeeeCCCcc
Q 005509 130 QGVCNHELG--QCRCFHGFR---GKGCSERIHFQCNFPKTPELPYGRWVVSICP--THC-------DTTRAMCFCGEGTK 195 (693)
Q Consensus 130 ~G~C~~~~G--~C~C~~G~~---G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~--g~C-------~~~~g~C~C~~G~~ 195 (693)
+|....-.+ .|.|.+||. -..||.. ..|..... ....|. +.| ......|.|..||.
T Consensus 10 NG~LiQMSNHfEC~Cnegfvl~~EntCE~k--v~C~~~e~--------~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~ 79 (197)
T PF06247_consen 10 NGYLIQMSNHFECKCNEGFVLKNENTCEEK--VECDKLEN--------VNKPCGDYAKCINQANKGEERAYKCDCINGYI 79 (197)
T ss_dssp TEEEEEESSEEEEEESTTEEEEETTEEEE------SG-GG--------TTSEEETTEEEEE-SSTTSSTSEEEEE-TTEE
T ss_pred CCEEEEccCceEEEcCCCcEEccccccccc--eecCcccc--------cCccccchhhhhcCCCcccceeEEEecccCce
Confidence 455554444 899999995 4567763 34543110 011332 122 22345899999998
Q ss_pred cCCCCCCCCCCCcccCCCCCCCCCCCCCcCCCCCCccCCCCCCCceecCCcccccccccccccccccCCCCCc---Cccc
Q 005509 196 YPNRPVAEACGFQVNLPSQPGAPKSTDWAKADLDNIFTTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLL---GQFC 272 (693)
Q Consensus 196 G~~C~~~~~C~~~~~~~~~~~~~C~~gw~g~~c~~~~~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~---G~~C 272 (693)
-.. ..|... .|. .-.|. .|.|..++.. -....|.|. .|+. +..|
T Consensus 80 ~~~----~vCvp~------------------~C~---~~~Cg-~GKCI~d~~~------~~~~~CSC~-IGkV~~dn~kC 126 (197)
T PF06247_consen 80 LKQ----GVCVPN------------------KCN---NKDCG-SGKCILDPDN------PNNPTCSCN-IGKVPDDNKKC 126 (197)
T ss_dssp ESS----SSEEEG------------------GGS---S---T-TEEEEEEEGG------GSEEEEEE--TEEETTTTTES
T ss_pred eeC----CeEchh------------------hcC---ceecC-CCeEEecCCC------CCCceeEee-eceEeccCCcc
Confidence 432 111100 111 22343 6888665431 013489998 8987 5667
Q ss_pred ccccCCccCCCCCCCceee----CCeeecCCCcccCC
Q 005509 273 EVPVSSTCVNQCSGHGHCR----GGFCQCDSGWYGVD 305 (693)
Q Consensus 273 ~~~~~~~C~~~C~~~G~C~----~g~C~C~~G~~G~~ 305 (693)
...-+..|.-.|..+-.|. -++|.|+.|+.|..
T Consensus 127 tk~G~T~C~LKCk~nE~CK~~~~~Y~C~~~~~~~~~~ 163 (197)
T PF06247_consen 127 TKTGETKCSLKCKENEECKLVDGYYKCVCKEGFPGDG 163 (197)
T ss_dssp EEEE--------TTTEEEEEETTEEEEEE-TT-EEET
T ss_pred cCCCccceeeecCCCcceeeeCcEEEeecCCCCCCCC
Confidence 7766678888998889996 34999999997654
No 68
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=72.60 E-value=1.9 Score=30.34 Aligned_cols=23 Identities=26% Similarity=0.743 Sum_probs=16.7
Q ss_pred CCCCCceee----CCeeecCCCcccCC
Q 005509 283 QCSGHGHCR----GGFCQCDSGWYGVD 305 (693)
Q Consensus 283 ~C~~~G~C~----~g~C~C~~G~~G~~ 305 (693)
.|..+++|. ...|+|++||.|+.
T Consensus 7 ~C~~nA~C~~~~~~~~C~C~~Gy~GdG 33 (36)
T PF12947_consen 7 GCHPNATCTNTGGSYTCTCKPGYEGDG 33 (36)
T ss_dssp GS-TTCEEEE-TTSEEEEE-CEEECCS
T ss_pred CCCCCcEeecCCCCEEeECCCCCccCC
Confidence 577788886 45899999999863
No 69
>PF00534 Glycos_transf_1: Glycosyl transferases group 1; InterPro: IPR001296 The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (2.4.1.- from EC) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Proteins containing this domain transfer UDP, ADP, GDP or CMP linked sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. The bacterial enzymes are involved in various biosynthetic processes that include exopolysaccharide biosynthesis, lipopolysaccharide core biosynthesis and the biosynthesis of the slime polysaccharide colanic acid. Mutations in this domain of the human N-acetylglucosaminyl-phosphatidylinositol biosynthetic protein are the cause of paroxysmal nocturnal hemoglobinuria (PNH), an acquired hemolytic blood disorder characterised by venous thrombosis, erythrocyte hemolysis, infections and defective hematopoiesis.; GO: 0009058 biosynthetic process; PDB: 2L7C_A 2IV3_B 2IUY_B 2XA9_A 2XA1_B 2X6R_A 2XMP_B 2XA2_B 2X6Q_A 3QHP_B ....
Probab=72.30 E-value=4.4 Score=38.38 Aligned_cols=41 Identities=20% Similarity=0.342 Sum_probs=32.1
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..++.+.|+.|.+-+.|.- ++++.-++|||.+|| |||+++.
T Consensus 83 ~~~l~~~~~~~di~v~~s~~e~~~~~~~Ea~~~g~-pvI~~~~ 124 (172)
T PF00534_consen 83 DDELDELYKSSDIFVSPSRNEGFGLSLLEAMACGC-PVIASDI 124 (172)
T ss_dssp HHHHHHHHHHTSEEEE-BSSBSS-HHHHHHHHTT--EEEEESS
T ss_pred ccccccccccceecccccccccccccccccccccc-ceeeccc
Confidence 4578899999999999877 477888999999999 7777774
No 70
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=70.13 E-value=2.5 Score=30.65 Aligned_cols=21 Identities=29% Similarity=0.865 Sum_probs=17.8
Q ss_pred CCCCCCceee----CCeeecCCCcc
Q 005509 282 NQCSGHGHCR----GGFCQCDSGWY 302 (693)
Q Consensus 282 ~~C~~~G~C~----~g~C~C~~G~~ 302 (693)
+.|..++.|+ ++.|.|++||.
T Consensus 10 ~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 10 HNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp SSSSTTSEEEEETTEEEEEESTTEE
T ss_pred CcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 4688899997 45999999997
No 71
>PF05686 Glyco_transf_90: Glycosyl transferase family 90; InterPro: IPR006598 Cryptococcus neoformans is a pathogenic fungus which most commonly affects the central nervous system and causes fatal meningoencephalitis primarily in patients with AIDS. This fungus produces a thick extracellular polysaccharide capsule which is well recognised as a virulence factor. CAP10 is required for capsule formation and virulence [].
Probab=69.48 E-value=8.1 Score=42.75 Aligned_cols=110 Identities=17% Similarity=0.189 Sum_probs=64.7
Q ss_pred CCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCccccCcccCcceEEecCCchhHHHHhhcCceeec
Q 005509 564 PREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKEGKLGKQHAEDVIVTSLRSENYHEDLSSSVFCGV 643 (693)
Q Consensus 564 ~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~y~~~l~~S~FCL~ 643 (693)
+=.+|.-.+||+|+... +.+|+.|+..-.+.+.......................=++..-+-||=+.
T Consensus 153 pW~~K~p~afWRG~~~~------------~~~R~~L~~~~~~~~~~~~a~i~~~d~~~~~~~~~~~~~l~~~~~yKYli~ 220 (395)
T PF05686_consen 153 PWEDKKPKAFWRGSPTV------------AETRQRLVRCSRSHPDLWDARITKQDWDKEYKPGFKHVPLEDQCKYKYLIY 220 (395)
T ss_pred ChhhcccceEECCCcCC------------CcchhHHHHHhccCCccceeeechhhhhhhccccccccCHHHHhhhheeec
Confidence 34567788999997642 237988887544432210000000000000000111122466778889888
Q ss_pred cCCCCCchhHHHHHhcCceeEEEeCCeeec----eecCCCccEEEec
Q 005509 644 LPGDGWSGRMEDSILQGCIPVVIQVVISSF----LLLCQNGSLKIRN 686 (693)
Q Consensus 644 p~Gd~~s~Rl~dAi~~GCIPViisd~~~~p----~l~~~~fsv~v~~ 686 (693)
.-|.+||.||.=-|..|.|.+.+...+.++ +.||..+ |-|+.
T Consensus 221 idG~~~S~RlkylL~c~SvVl~~~~~~~e~f~~~L~P~vHY-VPV~~ 266 (395)
T PF05686_consen 221 IDGNAWSGRLKYLLACNSVVLKVKSPYYEFFYRALKPWVHY-VPVKR 266 (395)
T ss_pred CCCceeehhHHHHHcCCceEEEeCCcHHHHHHhhhcccccE-EEecc
Confidence 999999999988899999988886665444 5677665 44444
No 72
>cd03814 GT1_like_2 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=60.56 E-value=9 Score=40.37 Aligned_cols=43 Identities=14% Similarity=0.080 Sum_probs=35.6
Q ss_pred CchhHHHHhhcCceeeccCCC-CCchhHHHHHhcCceeEEEeCCe
Q 005509 627 RSENYHEDLSSSVFCGVLPGD-GWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 627 ~~~~y~~~l~~S~FCL~p~Gd-~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
...++.+.|+.|.+++.|... +++..++|||.+|+ |||.+|.-
T Consensus 256 ~~~~~~~~~~~~d~~l~~s~~e~~~~~~lEa~a~g~-PvI~~~~~ 299 (364)
T cd03814 256 DGEELAAAYASADVFVFPSRTETFGLVVLEAMASGL-PVVAPDAG 299 (364)
T ss_pred CHHHHHHHHHhCCEEEECcccccCCcHHHHHHHcCC-CEEEcCCC
Confidence 346678999999999998774 66788999999998 88888753
No 73
>cd03802 GT1_AviGT4_like This family is most closely related to the GT1 family of glycosyltransferases. aviGT4 in Streptomyces viridochromogenes has been shown to be involved in biosynthesis of oligosaccharide antibiotic avilamycin A. Inactivation of aviGT4 resulted in a mutant that accumulated a novel avilamycin derivative lacking the terminal eurekanate residue.
Probab=59.20 E-value=14 Score=38.75 Aligned_cols=41 Identities=15% Similarity=-0.031 Sum_probs=33.9
Q ss_pred hhHHHHhhcCceeeccCC--CCCchhHHHHHhcCceeEEEeCCe
Q 005509 629 ENYHEDLSSSVFCGVLPG--DGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G--d~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
....+.|+.+.+.+.|.- .+++.-++|||.+|+ |||.+|.-
T Consensus 235 ~~~~~~~~~~d~~v~ps~~~E~~~~~~lEAma~G~-PvI~~~~~ 277 (335)
T cd03802 235 AEKAELLGNARALLFPILWEEPFGLVMIEAMACGT-PVIAFRRG 277 (335)
T ss_pred HHHHHHHHhCcEEEeCCcccCCcchHHHHHHhcCC-CEEEeCCC
Confidence 456789999999999863 467777999999996 99999853
No 74
>cd03808 GT1_cap1E_like This family is most closely related to the GT1 family of glycosyltransferases. cap1E in Streptococcus pneumoniae is required for the synthesis of type 1 capsular polysaccharides.
Probab=58.45 E-value=13 Score=38.72 Aligned_cols=42 Identities=17% Similarity=0.192 Sum_probs=34.4
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
..+..+.|+.|.+.+.|.. .+.+..++|||.+| +|||.+|.-
T Consensus 254 ~~~~~~~~~~adi~i~ps~~e~~~~~~~Ea~~~G-~Pvi~s~~~ 296 (359)
T cd03808 254 RDDVPELLAAADVFVLPSYREGLPRVLLEAMAMG-RPVIATDVP 296 (359)
T ss_pred cccHHHHHHhccEEEecCcccCcchHHHHHHHcC-CCEEEecCC
Confidence 4567899999999998875 46677899999999 588888753
No 75
>cd03823 GT1_ExpE7_like This family is most closely related to the GT1 family of glycosyltransferases. ExpE7 in Sinorhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucans (exopolysaccharide II).
Probab=58.40 E-value=10 Score=39.84 Aligned_cols=41 Identities=12% Similarity=0.218 Sum_probs=34.7
Q ss_pred chhHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..++.+.|+.|.+.+.|. +.+++..++|||.+| +|||.++.
T Consensus 253 ~~~~~~~~~~ad~~i~ps~~~e~~~~~~~Ea~a~G-~Pvi~~~~ 295 (359)
T cd03823 253 QEEIDDFYAEIDVLVVPSIWPENFPLVIREALAAG-VPVIASDI 295 (359)
T ss_pred HHHHHHHHHhCCEEEEcCcccCCCChHHHHHHHCC-CCEEECCC
Confidence 367889999999999886 467778899999999 88888874
No 76
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=58.29 E-value=11 Score=28.60 Aligned_cols=22 Identities=36% Similarity=0.975 Sum_probs=18.5
Q ss_pred CCCCCCCCEEeccCCeEEeCCCcc
Q 005509 124 KSDCSGQGVCNHELGQCRCFHGFR 147 (693)
Q Consensus 124 ~~~C~~~G~C~~~~G~C~C~~G~~ 147 (693)
...|.++..|. .|+|.|++||.
T Consensus 25 ~~qC~~~s~C~--~g~C~C~~g~~ 46 (52)
T PF01683_consen 25 DEQCIGGSVCV--NGRCQCPPGYV 46 (52)
T ss_pred cCCCCCcCEEc--CCEeECCCCCE
Confidence 44688899997 89999999984
No 77
>cd03822 GT1_ecORF704_like This family is most closely related to the GT1 family of glycosyltransferases. ORF704 in E. coli has been shown to be involved in the biosynthesis of O-specific mannose homopolysaccharides.
Probab=51.95 E-value=15 Score=38.80 Aligned_cols=41 Identities=24% Similarity=0.146 Sum_probs=34.6
Q ss_pred chhHHHHhhcCceeeccCC-C--CCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG-D--GWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d--~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..++.+.|+.|.+.+.|.- . +++.-+.|||.+|+ |||.+|.
T Consensus 258 ~~~~~~~~~~ad~~v~ps~~e~~~~~~~~~Ea~a~G~-PvI~~~~ 301 (366)
T cd03822 258 DEELPELFSAADVVVLPYRSADQTQSGVLAYAIGFGK-PVISTPV 301 (366)
T ss_pred HHHHHHHHhhcCEEEecccccccccchHHHHHHHcCC-CEEecCC
Confidence 4678899999999998765 4 56778999999999 9999885
No 78
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=51.30 E-value=9.4 Score=46.64 Aligned_cols=35 Identities=29% Similarity=0.845 Sum_probs=30.0
Q ss_pred CCCC-CCCCCCCCEEeccCC---eEEeC-CCccCCCCCccc
Q 005509 120 GKSC-KSDCSGQGVCNHELG---QCRCF-HGFRGKGCSERI 155 (693)
Q Consensus 120 ~~~C-~~~C~~~G~C~~~~G---~C~C~-~G~~G~~Ce~~~ 155 (693)
...| |+.|.++|.|.. .+ .|.|. .||+|..|+..+
T Consensus 545 ~drClPN~CehgG~C~Q-s~~~f~C~C~~TGY~GatCHtsi 584 (1306)
T KOG3516|consen 545 SDRCLPNPCEHGGKCSQ-SWDDFECNCELTGYKGATCHTSI 584 (1306)
T ss_pred ccccCCccccCCCcccc-cccceeEeccccccccccccCCC
Confidence 3678 999999999985 44 89999 999999999754
No 79
>cd03798 GT1_wlbH_like This family is most closely related to the GT1 family of glycosyltransferases. wlbH in Bordetella parapertussis has been shown to be required for the biosynthesis of a trisaccharide that, when attached to the B. pertussis lipopolysaccharide (LPS) core (band B), generates band A LPS.
Probab=49.08 E-value=17 Score=38.05 Aligned_cols=41 Identities=17% Similarity=0.165 Sum_probs=34.2
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..++.+.|++|.+.+.|.- ++++..++|||.+|+ |||.++.
T Consensus 269 ~~~~~~~~~~ad~~i~~~~~~~~~~~~~Ea~~~G~-pvI~~~~ 310 (377)
T cd03798 269 HEEVPAYYAAADVFVLPSLREGFGLVLLEAMACGL-PVVATDV 310 (377)
T ss_pred HHHHHHHHHhcCeeecchhhccCChHHHHHHhcCC-CEEEecC
Confidence 3567899999999998876 577888999999998 7777764
No 80
>cd03807 GT1_WbnK_like This family is most closely related to the GT1 family of glycosyltransferases. WbnK in Shigella dysenteriae has been shown to be involved in the type 7 O-antigen biosynthesis.
Probab=47.25 E-value=19 Score=37.68 Aligned_cols=40 Identities=18% Similarity=0.225 Sum_probs=33.5
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.+..+.|+.+.+.+.|.. .+++.-++|||.+| +|||.+|.
T Consensus 260 ~~~~~~~~~adi~v~ps~~e~~~~~~~Ea~a~g-~PvI~~~~ 300 (365)
T cd03807 260 SDVPALLNALDVFVLSSLSEGFPNVLLEAMACG-LPVVATDV 300 (365)
T ss_pred ccHHHHHHhCCEEEeCCccccCCcHHHHHHhcC-CCEEEcCC
Confidence 457799999999998876 47778899999999 58888875
No 81
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=47.24 E-value=18 Score=32.04 Aligned_cols=29 Identities=34% Similarity=0.929 Sum_probs=22.2
Q ss_pred CCC--CCCCCCCCEEeccCC-eEEeCCCccCC
Q 005509 121 KSC--KSDCSGQGVCNHELG-QCRCFHGFRGK 149 (693)
Q Consensus 121 ~~C--~~~C~~~G~C~~~~G-~C~C~~G~~G~ 149 (693)
++| ...|..+|.|+.... .|.|.+||.-.
T Consensus 78 d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~P~ 109 (110)
T PF00954_consen 78 DQCDVYGFCGPNGICNSNNSPKCSCLPGFEPK 109 (110)
T ss_pred cCCCCccccCCccEeCCCCCCceECCCCcCCC
Confidence 567 467999999975433 89999999643
No 82
>cd03819 GT1_WavL_like This family is most closely related to the GT1 family of glycosyltransferases. WavL in Vibrio cholerae has been shown to be involved in the biosynthesis of the lipopolysaccharide core.
Probab=45.96 E-value=28 Score=36.83 Aligned_cols=41 Identities=7% Similarity=0.071 Sum_probs=33.5
Q ss_pred chhHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..++.+.|+.|...+.|. ..+++.-++|||.+|+ |||++|.
T Consensus 254 ~~~~~~~l~~ad~~i~ps~~~e~~~~~l~EA~a~G~-PvI~~~~ 296 (355)
T cd03819 254 CSDMPAAYALADIVVSASTEPEAFGRTAVEAQAMGR-PVIASDH 296 (355)
T ss_pred cccHHHHHHhCCEEEecCCCCCCCchHHHHHHhcCC-CEEEcCC
Confidence 456789999999988875 3566778999999998 8888874
No 83
>KOG1388 consensus Attractin and platelet-activating factor acetylhydrolase [Signal transduction mechanisms; Defense mechanisms]
Probab=45.61 E-value=15 Score=36.61 Aligned_cols=73 Identities=29% Similarity=0.591 Sum_probs=38.5
Q ss_pred CCCCCCEEeccCCeE-EeCCCccCCCCCccccCcCCCCCCCCCCCCCccccccC---CCCCCCCceeee-CCCcccCCCC
Q 005509 126 DCSGQGVCNHELGQC-RCFHGFRGKGCSERIHFQCNFPKTPELPYGRWVVSICP---THCDTTRAMCFC-GEGTKYPNRP 200 (693)
Q Consensus 126 ~C~~~G~C~~~~G~C-~C~~G~~G~~Ce~~~~~~C~~~~~~~~~~g~~~~~~C~---g~C~~~~g~C~C-~~G~~G~~C~ 200 (693)
.|++|+.|+. .-.| .|..|-+|..|+. |..+-..+...|.+....|. ..|....++|.| .-|..|..|+
T Consensus 53 ~cNGh~~c~t-~~v~~~~~N~~~g~~c~k-----c~~g~~GdtN~g~c~~~~~~g~~~~~~~~~~~c~c~~kgvvgd~c~ 126 (217)
T KOG1388|consen 53 QCNGHSDCNT-QHVCWRCENGTTGAHCEK-----CIVGFYGDTNGGKCQPCDCNGGASACVTLTGKCFCTTKGVVGDLCP 126 (217)
T ss_pred HhcCCCCccc-ceeeeeccCccccccCCc-----eEEEEEecCCCCccCHhhhcCCeeeeeccCCccccccceEecccCc
Confidence 3667777763 2233 3555666666654 21110000011222223343 336667889999 5789998887
Q ss_pred CCCC
Q 005509 201 VAEA 204 (693)
Q Consensus 201 ~~~~ 204 (693)
.++.
T Consensus 127 ~~e~ 130 (217)
T KOG1388|consen 127 KCEV 130 (217)
T ss_pred cccc
Confidence 6644
No 84
>cd03821 GT1_Bme6_like This family is most closely related to the GT1 family of glycosyltransferases. Bme6 in Brucella melitensis has been shown to be involved in the biosynthesis of a polysaccharide.
Probab=45.11 E-value=22 Score=37.34 Aligned_cols=41 Identities=15% Similarity=0.216 Sum_probs=33.9
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
.++.+.|+.+.+.+.|.- .+++.-++|||.+|+ |||.++..
T Consensus 273 ~~~~~~~~~adv~v~ps~~e~~~~~~~Eama~G~-PvI~~~~~ 314 (375)
T cd03821 273 EDKAAALADADLFVLPSHSENFGIVVAEALACGT-PVVTTDKV 314 (375)
T ss_pred HHHHHHHhhCCEEEeccccCCCCcHHHHHHhcCC-CEEEcCCC
Confidence 467789999999988776 577788999999995 88888754
No 85
>cd04951 GT1_WbdM_like This family is most closely related to the GT1 family of glycosyltransferases and is named after WbdM in Escherichia coli. In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have
Probab=44.32 E-value=24 Score=37.35 Aligned_cols=40 Identities=10% Similarity=0.160 Sum_probs=32.9
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.+..+.|+.+.+-+.|.. .+++.-++|||.+|+ |||.+|.
T Consensus 254 ~~~~~~~~~ad~~v~~s~~e~~~~~~~Ea~a~G~-PvI~~~~ 294 (360)
T cd04951 254 DDIAAYYNAADLFVLSSAWEGFGLVVAEAMACEL-PVVATDA 294 (360)
T ss_pred ccHHHHHHhhceEEecccccCCChHHHHHHHcCC-CEEEecC
Confidence 456788999999888766 467778999999999 8888875
No 86
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=43.28 E-value=15 Score=44.33 Aligned_cols=41 Identities=29% Similarity=0.856 Sum_probs=30.2
Q ss_pred ccccccCCccC-CCCCCCceeeCC----eeec-CCCcccCCCCCCcc
Q 005509 271 FCEVPVSSTCV-NQCSGHGHCRGG----FCQC-DSGWYGVDCSIPSV 311 (693)
Q Consensus 271 ~C~~~~~~~C~-~~C~~~G~C~~g----~C~C-~~G~~G~~C~~~~~ 311 (693)
.|....+..|. ++|.|+|.|..| .|.| ..||.|..|+....
T Consensus 617 sCs~~~~~~C~~nPC~N~g~C~egwNrfiCDCs~T~~~G~~CerE~t 663 (1591)
T KOG3514|consen 617 SCSLSNEKICESNPCQNGGKCSEGWNRFICDCSGTGFEGRTCEREAT 663 (1591)
T ss_pred ccchhhccccCCCcccCCCCccccccccccccccCcccCccccceee
Confidence 34433233554 799999999855 7999 57999999998654
No 87
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=42.86 E-value=16 Score=23.30 Aligned_cols=10 Identities=30% Similarity=0.856 Sum_probs=6.2
Q ss_pred CeeecCCCcc
Q 005509 293 GFCQCDSGWY 302 (693)
Q Consensus 293 g~C~C~~G~~ 302 (693)
++|.|++||.
T Consensus 2 y~C~C~~Gy~ 11 (24)
T PF12662_consen 2 YTCSCPPGYQ 11 (24)
T ss_pred EEeeCCCCCc
Confidence 3566666664
No 88
>cd03801 GT1_YqgM_like This family is most closely related to the GT1 family of glycosyltransferases and named after YqgM in Bacillus licheniformis about which little is known. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold.
Probab=41.81 E-value=27 Score=36.26 Aligned_cols=41 Identities=17% Similarity=0.199 Sum_probs=33.9
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..++.+.|++|.+-+.|.- ++.+..++|||.+|+ |||.++.
T Consensus 266 ~~~~~~~~~~~di~i~~~~~~~~~~~~~Ea~~~g~-pvI~~~~ 307 (374)
T cd03801 266 DEDLPALYAAADVFVLPSLYEGFGLVLLEAMAAGL-PVVASDV 307 (374)
T ss_pred hhhHHHHHHhcCEEEecchhccccchHHHHHHcCC-cEEEeCC
Confidence 4678899999999988765 466788999999996 7887774
No 89
>cd03818 GT1_ExpC_like This family is most closely related to the GT1 family of glycosyltransferases. ExpC in Rhizobium meliloti has been shown to be involved in the biosynthesis of galactoglucan (exopolysaccharide II).
Probab=41.18 E-value=81 Score=34.44 Aligned_cols=40 Identities=23% Similarity=0.085 Sum_probs=31.3
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.++.+.|+.|...+.|.- .+.+.-++|||.+|+ |||.+|.
T Consensus 292 ~~~~~~l~~adv~v~~s~~e~~~~~llEAmA~G~-PVIas~~ 332 (396)
T cd03818 292 DQYLALLQVSDVHVYLTYPFVLSWSLLEAMACGC-LVVGSDT 332 (396)
T ss_pred HHHHHHHHhCcEEEEcCcccccchHHHHHHHCCC-CEEEcCC
Confidence 567789999998887654 344556999999998 8888874
No 90
>PF13692 Glyco_trans_1_4: Glycosyl transferases group 1; PDB: 3OY2_A 3OY7_B 2Q6V_A 2HY7_A 3CV3_A 3CUY_A.
Probab=40.14 E-value=44 Score=29.95 Aligned_cols=40 Identities=18% Similarity=0.305 Sum_probs=28.2
Q ss_pred hhHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~ 669 (693)
+++.+.|+++.+.+.|. +.+.+.+++|+|.+|+ |||.++.
T Consensus 62 ~e~~~~l~~~dv~l~p~~~~~~~~~k~~e~~~~G~-pvi~~~~ 103 (135)
T PF13692_consen 62 EELPEILAAADVGLIPSRFNEGFPNKLLEAMAAGK-PVIASDN 103 (135)
T ss_dssp HHHHHHHHC-SEEEE-BSS-SCC-HHHHHHHCTT---EEEEHH
T ss_pred HHHHHHHHhCCEEEEEeeCCCcCcHHHHHHHHhCC-CEEECCc
Confidence 57899999999999986 4456788999999997 5555654
No 91
>cd03816 GT1_ALG1_like This family is most closely related to the GT1 family of glycosyltransferases. The yeast gene ALG1 has been shown to function as a mannosyltransferase that catalyzes the formation of dolichol pyrophosphate (Dol-PP)-GlcNAc2Man from GDP-Man and Dol-PP-Glc-NAc2, and participates in the formation of the lipid-linked precursor oligosaccharide for N-glycosylation. In humans ALG1 has been associated with the congenital disorders of glycosylation (CDG) designated as subtype CDG-Ik.
Probab=37.93 E-value=77 Score=35.09 Aligned_cols=42 Identities=24% Similarity=0.181 Sum_probs=32.5
Q ss_pred chhHHHHhhcCceeeccC----CCCCchhHHHHHhcCceeEEEeCCe
Q 005509 628 SENYHEDLSSSVFCGVLP----GDGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~----Gd~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
.+++.+.|+.|...+.|. |.+....++|||.+|. |||.++.-
T Consensus 305 ~~~~~~~l~~aDv~v~~~~~~~~~~~p~~~~Eama~G~-PVI~s~~~ 350 (415)
T cd03816 305 AEDYPKLLASADLGVSLHTSSSGLDLPMKVVDMFGCGL-PVCALDFK 350 (415)
T ss_pred HHHHHHHHHhCCEEEEccccccccCCcHHHHHHHHcCC-CEEEeCCC
Confidence 467888999999887532 3345667999999998 99998853
No 92
>TIGR03087 stp1 sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate.
Probab=37.08 E-value=44 Score=36.57 Aligned_cols=39 Identities=13% Similarity=0.141 Sum_probs=32.0
Q ss_pred hHHHHhhcCceeeccC--CCCCchhHHHHHhcCceeEEEeCC
Q 005509 630 NYHEDLSSSVFCGVLP--GDGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 630 ~y~~~l~~S~FCL~p~--Gd~~s~Rl~dAi~~GCIPViisd~ 669 (693)
+..+.|+.+...+.|. +.|....++|||.+|+ |||.++.
T Consensus 290 ~~~~~~~~adv~v~Ps~~~eG~~~~~lEAma~G~-PVV~t~~ 330 (397)
T TIGR03087 290 DVRPYLAHAAVAVAPLRIARGIQNKVLEAMAMAK-PVVASPE 330 (397)
T ss_pred CHHHHHHhCCEEEecccccCCcccHHHHHHHcCC-CEEecCc
Confidence 5678899999988874 5566678999999997 9999874
No 93
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=36.66 E-value=75 Score=37.94 Aligned_cols=65 Identities=26% Similarity=0.594 Sum_probs=35.5
Q ss_pred CCCCCEEeccCCeEEeCCCcc----CCCCCccccCcCCCCCCCC-CCCCCccccccCCCCCC-CCc--eeeeCCCcccCC
Q 005509 127 CSGQGVCNHELGQCRCFHGFR----GKGCSERIHFQCNFPKTPE-LPYGRWVVSICPTHCDT-TRA--MCFCGEGTKYPN 198 (693)
Q Consensus 127 C~~~G~C~~~~G~C~C~~G~~----G~~Ce~~~~~~C~~~~~~~-~~~g~~~~~~C~g~C~~-~~g--~C~C~~G~~G~~ 198 (693)
|++-|.=.--.|.|.|.+||. |..|+. |+.+.-.. .....| ..||.+-.. ..| .|.|..||+-..
T Consensus 248 C~~dGeWlvpiG~C~C~aGye~~~~~~~C~a-----Cp~G~yK~~~~~~~C--~~CP~~S~s~~ega~~C~C~~gyyRA~ 320 (996)
T KOG0196|consen 248 CSGDGEWLVPIGGCVCKAGYEEAENGKACQA-----CPPGTYKASQGDSLC--LPCPPNSHSSSEGATSCTCENGYYRAD 320 (996)
T ss_pred EcCCCcEEEEcCceeecCCCCcccCCCccee-----CCCCcccCCCCCCCC--CCCCCCCCCCCCCCCcccccCCcccCC
Confidence 666665544468999999994 566764 65431100 000011 245533222 222 789999987543
No 94
>cd03805 GT1_ALG2_like This family is most closely related to the GT1 family of glycosyltransferases. ALG2, a 1,3-mannosyltransferase, in yeast catalyzes the mannosylation of Man(2)GlcNAc(2)-dolichol diphosphate and Man(1)GlcNAc(2)-dolichol diphosphate to form Man(3)GlcNAc(2)-dolichol diphosphate. A deficiency of this enzyme causes an abnormal accumulation of Man1GlcNAc2-PP-dolichol and Man2GlcNAc2-PP-dolichol, which is associated with a type of congenital disorders of glycosylation (CDG), designated CDG-Ii, in humans.
Probab=36.43 E-value=51 Score=35.60 Aligned_cols=40 Identities=18% Similarity=0.079 Sum_probs=32.0
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
....+.|+.|.+.+.|.. .+++.-++|||.+| +|||.+|.
T Consensus 291 ~~~~~~l~~ad~~l~~s~~E~~g~~~lEAma~G-~PvI~s~~ 331 (392)
T cd03805 291 SQKELLLSSARALLYTPSNEHFGIVPLEAMYAG-KPVIACNS 331 (392)
T ss_pred HHHHHHHhhCeEEEECCCcCCCCchHHHHHHcC-CCEEEECC
Confidence 455688999999998766 35666689999999 78888875
No 95
>PLN02871 UDP-sulfoquinovose:DAG sulfoquinovosyltransferase
Probab=36.12 E-value=34 Score=38.52 Aligned_cols=40 Identities=13% Similarity=0.150 Sum_probs=34.3
Q ss_pred hhHHHHhhcCceeeccCCC-CCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPGD-GWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~Gd-~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.++.+.|+.+...+.|... +++.-++|||.+| +|||.++.
T Consensus 323 ~ev~~~~~~aDv~V~pS~~E~~g~~vlEAmA~G-~PVI~s~~ 363 (465)
T PLN02871 323 DELSQAYASGDVFVMPSESETLGFVVLEAMASG-VPVVAARA 363 (465)
T ss_pred HHHHHHHHHCCEEEECCcccccCcHHHHHHHcC-CCEEEcCC
Confidence 5788999999999988763 6677799999999 99999874
No 96
>cd03794 GT1_wbuB_like This family is most closely related to the GT1 family of glycosyltransferases. wbuB in E. coli is involved in the biosynthesis of the O26 O-antigen. It has been proposed to function as an N-acetyl-L-fucosamine (L-FucNAc) transferase.
Probab=36.05 E-value=36 Score=35.79 Aligned_cols=43 Identities=19% Similarity=0.076 Sum_probs=33.3
Q ss_pred chhHHHHhhcCceeeccCCC-CC-----chhHHHHHhcCceeEEEeCCee
Q 005509 628 SENYHEDLSSSVFCGVLPGD-GW-----SGRMEDSILQGCIPVVIQVVIS 671 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~Gd-~~-----s~Rl~dAi~~GCIPViisd~~~ 671 (693)
..++.+.|+.+.+.+.|... ++ ..+++|||.+|+ |||.++.-.
T Consensus 285 ~~~~~~~~~~~di~i~~~~~~~~~~~~~p~~~~Ea~~~G~-pvi~~~~~~ 333 (394)
T cd03794 285 KEELPELLAAADVGLVPLKPGPAFEGVSPSKLFEYMAAGK-PVLASVDGE 333 (394)
T ss_pred hHHHHHHHHhhCeeEEeccCcccccccCchHHHHHHHCCC-cEEEecCCC
Confidence 35778999999999988763 22 456999999995 888887543
No 97
>cd03804 GT1_wbaZ_like This family is most closely related to the GT1 family of glycosyltransferases. wbaZ in Salmonella enterica has been shown to possess the mannosyl transferase activity. The members of this family are found in certain bacteria and Archaea.
Probab=35.45 E-value=41 Score=35.81 Aligned_cols=41 Identities=10% Similarity=0.068 Sum_probs=33.4
Q ss_pred chhHHHHhhcCceeeccCCCCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPGDGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~Gd~~s~Rl~dAi~~GCIPViisd~ 669 (693)
...+.+.|+.+...+.|.=.+++.-++|||.+|+ |||.++.
T Consensus 252 ~~~~~~~~~~ad~~v~ps~e~~g~~~~Eama~G~-Pvi~~~~ 292 (351)
T cd03804 252 DEELRDLYARARAFLFPAEEDFGIVPVEAMASGT-PVIAYGK 292 (351)
T ss_pred HHHHHHHHHhCCEEEECCcCCCCchHHHHHHcCC-CEEEeCC
Confidence 3557899999999888754667777899999997 9998874
No 98
>PRK15427 colanic acid biosynthesis glycosyltransferase WcaL; Provisional
Probab=34.34 E-value=1.5e+02 Score=32.79 Aligned_cols=45 Identities=20% Similarity=0.217 Sum_probs=34.2
Q ss_pred chhHHHHhhcCceeeccC-----C--CCCchhHHHHHhcCceeEEEeCCeeec
Q 005509 628 SENYHEDLSSSVFCGVLP-----G--DGWSGRMEDSILQGCIPVVIQVVISSF 673 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~-----G--d~~s~Rl~dAi~~GCIPViisd~~~~p 673 (693)
..+..+.|+.+...+.|. | +|...-++|||.+| +|||.++.-..+
T Consensus 289 ~~el~~~l~~aDv~v~pS~~~~~g~~Eg~p~~llEAma~G-~PVI~t~~~g~~ 340 (406)
T PRK15427 289 SHEVKAMLDDADVFLLPSVTGADGDMEGIPVALMEAMAVG-IPVVSTLHSGIP 340 (406)
T ss_pred HHHHHHHHHhCCEEEECCccCCCCCccCccHHHHHHHhCC-CCEEEeCCCCch
Confidence 356789999999988874 2 35566799999999 599998753333
No 99
>PRK15484 lipopolysaccharide 1,2-N-acetylglucosaminetransferase; Provisional
Probab=34.14 E-value=42 Score=36.70 Aligned_cols=43 Identities=14% Similarity=0.149 Sum_probs=34.5
Q ss_pred chhHHHHhhcCceeeccCC--CCCchhHHHHHhcCceeEEEeCCee
Q 005509 628 SENYHEDLSSSVFCGVLPG--DGWSGRMEDSILQGCIPVVIQVVIS 671 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G--d~~s~Rl~dAi~~GCIPViisd~~~ 671 (693)
..+..+.|+.|...+.|.. .+++.-++|||.+| +|||.++.-.
T Consensus 267 ~~~l~~~~~~aDv~v~pS~~~E~f~~~~lEAma~G-~PVI~s~~gg 311 (380)
T PRK15484 267 PEKMHNYYPLADLVVVPSQVEEAFCMVAVEAMAAG-KPVLASTKGG 311 (380)
T ss_pred HHHHHHHHHhCCEEEeCCCCccccccHHHHHHHcC-CCEEEeCCCC
Confidence 3567789999999998864 45666799999999 8999998533
No 100
>cd03809 GT1_mtfB_like This family is most closely related to the GT1 family of glycosyltransferases. mtfB (mannosyltransferase B) in E. coli has been shown to direct the growth of the O9-specific polysaccharide chain. It transfers two mannoses into the position 3 of the previously synthesized polysaccharide.
Probab=34.07 E-value=28 Score=36.62 Aligned_cols=41 Identities=12% Similarity=0.139 Sum_probs=32.9
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
...+.+.|+.+.+.+.|.- ++++.-++|||.+|+ |||.++.
T Consensus 263 ~~~~~~~~~~~d~~l~ps~~e~~~~~~~Ea~a~G~-pvI~~~~ 304 (365)
T cd03809 263 DEELAALYRGARAFVFPSLYEGFGLPVLEAMACGT-PVIASNI 304 (365)
T ss_pred hhHHHHHHhhhhhhcccchhccCCCCHHHHhcCCC-cEEecCC
Confidence 3567889999999888754 466777999999995 8888875
No 101
>cd03800 GT1_Sucrose_synthase This family is most closely related to the GT1 family of glycosyltransferases. The sucrose-phosphate synthases in this family may be unique to plants and photosynthetic bacteria. This enzyme catalyzes the synthesis of sucrose 6-phosphate from fructose 6-phosphate and uridine 5'-diphosphate-glucose, a key regulatory step of sucrose metabolism. The activity of this enzyme is regulated by phosphorylation and moderated by the concentration of various metabolites and light.
Probab=33.37 E-value=39 Score=36.35 Aligned_cols=40 Identities=15% Similarity=0.114 Sum_probs=32.7
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.++.+.|+.|...+.|.- ++++.-++|||.+| +|||.++.
T Consensus 294 ~~~~~~~~~adi~l~ps~~e~~~~~l~Ea~a~G-~Pvi~s~~ 334 (398)
T cd03800 294 EDLPALYRAADVFVNPALYEPFGLTALEAMACG-LPVVATAV 334 (398)
T ss_pred HHHHHHHHhCCEEEecccccccCcHHHHHHhcC-CCEEECCC
Confidence 567788999999988865 35566799999999 69999874
No 102
>PF00919 UPF0004: Uncharacterized protein family UPF0004; InterPro: IPR013848 The methylthiotransferase (MTTase) or miaB-like family is named after the (dimethylallyl)adenosine tRNA MTTase miaB protein, which catalyses a C-H to C-S bond conversion in the methylthiolation of tRNA. A related bacterial enzyme rimO performs a similar methylthiolation, but on a protein substrate. RimO acts on the ribosomal protein S12 and forms a separate MTTase subfamily. The miaB-subfamily includes mammalian CDK5 regulatory subunit-associated proteins and similar proteins in other eukaryotes. Two other subfamilies, yqeV and CDKAL1, are named after a Bacillus subtilis and a human protein, respectively. While yqeV-like proteins are found in bacteria, CDKAL1 subfamily members occur in eukaryotes and in archaebacteria. The likely MTTases from these 4 subfamilies contain an N-terminal MTTase domain, a central radical generating fold and a C-terminal TRAM domain (see PDOC50926 from PROSITEDOC). The core forms a radical SAM fold (or AdoMet radical), containing a cysteine motif CxxxCxxC that binds a [4Fe-4S] cluster. A reducing equivalent from the [4Fe-4S]+ cluster is used to cleave S-adenosylmethionine (SAM) to generate methionine and a 5'-deoxyadenosyl radical. The latter is thought to produce a reactive substrate radical that is amenable to sulphur insertion. The N-terminal MTTase domain contains 3 cysteines that bind a second [4Fe-4S] cluster, in addition to the radical-generating [4Fe-4S] cluster, which could be involved in the thiolation reaction. The C-terminal TRAM domain is not shared with other radical SAM proteins outside the MTTase family. The TRAM domain can bind to RNA substrate and seems to be important for substrate recognition. The tertiary structure of the central radical SAM fold has six beta/alpha motifs resembling a three-quarter TIM barrel core (see PDOC00155 from PROSITEDOC). The N-terminal MTTase domain might form an additional [beta/alpha]2 TIM barrel unit.; GO: 0003824 catalytic activity, 0051539 4 iron, 4 sulfur cluster binding, 0009451 RNA modification
Probab=32.81 E-value=40 Score=29.44 Aligned_cols=32 Identities=22% Similarity=0.162 Sum_probs=21.6
Q ss_pred ccchhHHHHHHHHhcCCC-cCCCcCCCceEEEec
Q 005509 392 MLYGSQMAFYESILASPH-RTLNGEEADFFFVPV 424 (693)
Q Consensus 392 ~~y~~E~~~~~~L~~s~~-rT~dP~eAdlF~VP~ 424 (693)
++|.+|.+ ...|.+..+ .|.+|++||+++|--
T Consensus 12 N~~Dse~i-~~~l~~~G~~~~~~~e~AD~iiiNT 44 (98)
T PF00919_consen 12 NQYDSERI-ASILQAAGYEIVDDPEEADVIIINT 44 (98)
T ss_pred cHHHHHHH-HHHHHhcCCeeecccccCCEEEEEc
Confidence 34555643 344555544 799999999998854
No 103
>KOG3516 consensus Neurexin IV [Signal transduction mechanisms]
Probab=32.67 E-value=26 Score=43.05 Aligned_cols=40 Identities=28% Similarity=0.674 Sum_probs=31.8
Q ss_pred Ccc-CCCCCCCceee----CCeeecC-CCcccCCCCCCccCCCCCC
Q 005509 278 STC-VNQCSGHGHCR----GGFCQCD-SGWYGVDCSIPSVMSSMSE 317 (693)
Q Consensus 278 ~~C-~~~C~~~G~C~----~g~C~C~-~G~~G~~C~~~~~~~~~~~ 317 (693)
+.| ||+|.++|.|. +..|.|. .||.|..|..+......++
T Consensus 546 drClPN~CehgG~C~Qs~~~f~C~C~~TGY~GatCHtsi~e~SCea 591 (1306)
T KOG3516|consen 546 DRCLPNPCEHGGKCSQSWDDFECNCELTGYKGATCHTSIYELSCEA 591 (1306)
T ss_pred cccCCccccCCCcccccccceeEeccccccccccccCCCcchhhHH
Confidence 456 48999999997 5699998 9999999998776544433
No 104
>cd05844 GT1_like_7 Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center
Probab=29.74 E-value=78 Score=33.64 Aligned_cols=42 Identities=14% Similarity=0.060 Sum_probs=32.2
Q ss_pred hhHHHHhhcCceeeccCC-------CCCchhHHHHHhcCceeEEEeCCee
Q 005509 629 ENYHEDLSSSVFCGVLPG-------DGWSGRMEDSILQGCIPVVIQVVIS 671 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-------d~~s~Rl~dAi~~GCIPViisd~~~ 671 (693)
.+..+.|+.|...+.|.- .+++..++|||.+|+ |||.+|.-.
T Consensus 256 ~~l~~~~~~ad~~v~ps~~~~~~~~E~~~~~~~EA~a~G~-PvI~s~~~~ 304 (367)
T cd05844 256 AEVRELMRRARIFLQPSVTAPSGDAEGLPVVLLEAQASGV-PVVATRHGG 304 (367)
T ss_pred HHHHHHHHhCCEEEECcccCCCCCccCCchHHHHHHHcCC-CEEEeCCCC
Confidence 567788999998776642 345778999999995 999998643
No 105
>cd04962 GT1_like_5 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=28.76 E-value=57 Score=34.75 Aligned_cols=41 Identities=17% Similarity=0.161 Sum_probs=33.3
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
.+..+.|+.+...+.|.- .+++.-++|||.+| +|||.+|.-
T Consensus 262 ~~~~~~~~~~d~~v~ps~~E~~~~~~~EAma~g-~PvI~s~~~ 303 (371)
T cd04962 262 DHVEELLSIADLFLLPSEKESFGLAALEAMACG-VPVVASNAG 303 (371)
T ss_pred ccHHHHHHhcCEEEeCCCcCCCccHHHHHHHcC-CCEEEeCCC
Confidence 457789999999998864 35566799999999 899998753
No 106
>TIGR03088 stp2 sugar transferase, PEP-CTERM/EpsH1 system associated. Members of this family include a match to the pfam00534 Glycosyl transferases group 1 domain. Nearly all are found in species that encode the PEP-CTERM/exosortase system predicted to act in protein sorting in a number of Gram-negative bacteria. In particular, these transferases are found proximal to a particular variant of exosortase, EpsH1, which appears to travel with a conserved group of genes summarized by Genome Property GenProp0652. The nature of the sugar transferase reaction catalyzed by members of this clade is unknown and may conceivably be variable with respect to substrate by species, but we hypothesize a conserved substrate.
Probab=28.67 E-value=53 Score=35.32 Aligned_cols=41 Identities=15% Similarity=0.228 Sum_probs=32.3
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
.+..+.|+.|.+.+.|.- .|.+.-++|||.+| +|||.+|.-
T Consensus 264 ~~~~~~~~~adi~v~pS~~Eg~~~~~lEAma~G-~Pvv~s~~~ 305 (374)
T TIGR03088 264 DDVPALMQALDLFVLPSLAEGISNTILEAMASG-LPVIATAVG 305 (374)
T ss_pred CCHHHHHHhcCEEEeccccccCchHHHHHHHcC-CCEEEcCCC
Confidence 467788999998877643 35666799999999 599999853
No 107
>cd03806 GT1_ALG11_like This family is most closely related to the GT1 family of glycosyltransferases. ALG11 in yeast is involved in adding the final 1,2-linked Man to the Man5GlcNAc2-PP-Dol synthesized on the cytosolic face of the ER. The deletion analysis of ALG11 was shown to block the early steps of core biosynthesis that takes place on the cytoplasmic face of the ER and lead to a defect in the assembly of lipid-linked oligosaccharides.
Probab=28.51 E-value=2.1e+02 Score=31.79 Aligned_cols=40 Identities=18% Similarity=0.169 Sum_probs=31.3
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeC
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQV 668 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd 668 (693)
..++.+.|+.|...+.|.= .+++.=++|||.+||+||. ++
T Consensus 315 ~~~l~~~l~~adv~v~~s~~E~Fgi~~lEAMa~G~pvIa-~~ 355 (419)
T cd03806 315 FEELLEELSTASIGLHTMWNEHFGIGVVEYMAAGLIPLA-HA 355 (419)
T ss_pred HHHHHHHHHhCeEEEECCccCCcccHHHHHHHcCCcEEE-Ec
Confidence 4677899999999988764 4666679999999996664 44
No 108
>cd04955 GT1_like_6 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=28.36 E-value=86 Score=33.11 Aligned_cols=41 Identities=17% Similarity=0.237 Sum_probs=30.9
Q ss_pred chhHHHHhhcCceeeccCC--CCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG--DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G--d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.....+.++.+...+.|.- .+++.-++|||.+|+ |||.++.
T Consensus 258 ~~~~~~~~~~ad~~v~ps~~~e~~~~~~~EAma~G~-PvI~s~~ 300 (363)
T cd04955 258 DQELLELLRYAALFYLHGHSVGGTNPSLLEAMAYGC-PVLASDN 300 (363)
T ss_pred hHHHHHHHHhCCEEEeCCccCCCCChHHHHHHHcCC-CEEEecC
Confidence 3556788888888877653 355667999999999 7887764
No 109
>smart00672 CAP10 Putative lipopolysaccharide-modifying enzyme.
Probab=27.86 E-value=84 Score=32.57 Aligned_cols=105 Identities=14% Similarity=0.096 Sum_probs=64.7
Q ss_pred CCCCCceeEEecccCCCCCCCCCCCCCccHHHHHHHHHHhcCCCCCc--cccCcccCcce--EEec-CCchhHHHHhhcC
Q 005509 564 PREKRKTLFYFNGNLGSAYPNGRPESSYSMGVRQKLAEEYGSSPNKE--GKLGKQHAEDV--IVTS-LRSENYHEDLSSS 638 (693)
Q Consensus 564 ~~~~R~~L~~F~G~~~~~~~~~r~~~~ys~~iR~~L~~~~~~~~~~~--~~~g~~~~~~~--~~~~-~~~~~y~~~l~~S 638 (693)
+=++|.-.++|+|+... +..|++|++...+.+... +........+. .... .....=.+...+-
T Consensus 79 pW~~K~~~a~WRG~~~~------------~~~R~~Lv~~~~~~p~~~da~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y 146 (256)
T smart00672 79 KWSDKNAYAYWRGNPTV------------ASERLDLIKCNQSSPELVNARITIQDWPGKCDGEEDAPGFKKSPLEEQCKH 146 (256)
T ss_pred CccccCcCccccCCCCC------------CcchHHHHHHhcCCcccceeEEEEecCCCCChHHhcccCcCCCCHHHHhhc
Confidence 44667888999997632 127999998766654211 00000000000 0000 0011224666788
Q ss_pred ceeeccCCCCCchhHHHHHhcCceeEEEeCCeeec----eecCCCc
Q 005509 639 VFCGVLPGDGWSGRMEDSILQGCIPVVIQVVISSF----LLLCQNG 680 (693)
Q Consensus 639 ~FCL~p~Gd~~s~Rl~dAi~~GCIPViisd~~~~p----~l~~~~f 680 (693)
||=+..-|.++|.||.=-|.++.|++.....+..+ +.||..+
T Consensus 147 Kyli~~dG~~~S~rl~~~l~~~Svvl~~~~~~~~~~~~~L~P~~HY 192 (256)
T smart00672 147 KYKINIEGVAWSVRLKYILACDSVVLKVKPEYYEFFSRGLQPWVHY 192 (256)
T ss_pred ceEEecCCccchhhHHHHHhcCceEEEeCCchhHHHHhcccCccce
Confidence 99888999999999999999999999988666554 5666554
No 110
>cd03792 GT1_Trehalose_phosphorylase Trehalose phosphorylase (TP) reversibly catalyzes trehalose synthesis and degradation from alpha-glucose-1-phosphate (alpha-Glc-1-P) and glucose. The catalytic activity includes the phosphorolysis of trehalose, which produces alpha-Glc-1-P and glucose, and the subsequent synthesis of trehalose. This family is most closely related to the GT1 family of glycosyltransferases.
Probab=27.10 E-value=64 Score=34.81 Aligned_cols=42 Identities=14% Similarity=0.127 Sum_probs=34.2
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCCe
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVVI 670 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~~ 670 (693)
.....+.|+.+...+.|.- .+++.-++|||.+| +|||.++.-
T Consensus 264 ~~~~~~~~~~ad~~v~~s~~Eg~g~~~lEA~a~G-~Pvv~s~~~ 306 (372)
T cd03792 264 DLEVNALQRASTVVLQKSIREGFGLTVTEALWKG-KPVIAGPVG 306 (372)
T ss_pred HHHHHHHHHhCeEEEeCCCccCCCHHHHHHHHcC-CCEEEcCCC
Confidence 3566788999999888765 57777899999999 799999853
No 111
>KOG3514 consensus Neurexin III-alpha [Signal transduction mechanisms]
Probab=26.03 E-value=42 Score=40.82 Aligned_cols=34 Identities=26% Similarity=0.514 Sum_probs=25.5
Q ss_pred CCCCCCCceecCCcccccccccccccccccCCCCCcCcccccc
Q 005509 233 TTNGSKPGWCNVDPEEAYALKVQFKEECDCKYDGLLGQFCEVP 275 (693)
Q Consensus 233 ~~~C~~~G~C~~~~~~~~~~~~c~~g~C~C~~~G~~G~~C~~~ 275 (693)
+++|.|+|.|...-+ ...|.|.-.+|.|..|+..
T Consensus 628 ~nPC~N~g~C~egwN---------rfiCDCs~T~~~G~~CerE 661 (1591)
T KOG3514|consen 628 SNPCQNGGKCSEGWN---------RFICDCSGTGFEGRTCERE 661 (1591)
T ss_pred CCcccCCCCcccccc---------ccccccccCcccCccccce
Confidence 677888888854321 4689998679999999864
No 112
>PHA01633 putative glycosyl transferase group 1
Probab=24.73 E-value=83 Score=34.02 Aligned_cols=40 Identities=25% Similarity=0.345 Sum_probs=32.4
Q ss_pred hhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
.+..+.++.|.+-+.|.- .+++.=+.|||.+|+ |||.+|-
T Consensus 215 ~dl~~~y~~aDifV~PS~~EgfGlvlLEAMA~G~-PVVas~~ 255 (335)
T PHA01633 215 EYIFAFYGAMDFTIVPSGTEGFGMPVLESMAMGT-PVIHQLM 255 (335)
T ss_pred HHHHHHHHhCCEEEECCccccCCHHHHHHHHcCC-CEEEccC
Confidence 556788999998877754 466777999999999 9999865
No 113
>PRK09922 UDP-D-galactose:(glucosyl)lipopolysaccharide-1,6-D-galactosyltransferase; Provisional
Probab=23.47 E-value=87 Score=33.69 Aligned_cols=38 Identities=11% Similarity=0.198 Sum_probs=30.0
Q ss_pred hHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeC
Q 005509 630 NYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQV 668 (693)
Q Consensus 630 ~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd 668 (693)
.+.+.|+.+...+.|.- .+++.-++|||.+| +|||.+|
T Consensus 250 ~~~~~~~~~d~~v~~s~~Egf~~~~lEAma~G-~Pvv~s~ 288 (359)
T PRK09922 250 VVQQKIKNVSALLLTSKFEGFPMTLLEAMSYG-IPCISSD 288 (359)
T ss_pred HHHHHHhcCcEEEECCcccCcChHHHHHHHcC-CCEEEeC
Confidence 44566778888877765 46777799999999 7898888
No 114
>cd03795 GT1_like_4 This family is most closely related to the GT1 family of glycosyltransferases. Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP-linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homolog
Probab=22.78 E-value=86 Score=32.94 Aligned_cols=40 Identities=13% Similarity=0.142 Sum_probs=31.6
Q ss_pred hhHHHHhhcCceeeccC---CCCCchhHHHHHhcCceeEEEeCC
Q 005509 629 ENYHEDLSSSVFCGVLP---GDGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 629 ~~y~~~l~~S~FCL~p~---Gd~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..+.+.++.+...+.|. +.+++.-+.|||.+| +|||.+|.
T Consensus 255 ~~~~~~~~~ad~~i~ps~~~~e~~g~~~~Ea~~~g-~Pvi~~~~ 297 (357)
T cd03795 255 EEKAALLAACDVFVFPSVERSEAFGIVLLEAMAFG-KPVISTEI 297 (357)
T ss_pred HHHHHHHHhCCEEEeCCcccccccchHHHHHHHcC-CCEEecCC
Confidence 55778999999998874 356677799999998 68888774
No 115
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=22.40 E-value=71 Score=22.69 Aligned_cols=24 Identities=29% Similarity=0.888 Sum_probs=15.4
Q ss_pred CCCCCCCEEeccC-C--eEEeCCCccC
Q 005509 125 SDCSGQGVCNHEL-G--QCRCFHGFRG 148 (693)
Q Consensus 125 ~~C~~~G~C~~~~-G--~C~C~~G~~G 148 (693)
..|--|..|.... | +|+|..||..
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk~ 31 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYKK 31 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEEE
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCccc
Confidence 4566788896555 6 8999999964
No 116
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=20.46 E-value=79 Score=27.90 Aligned_cols=22 Identities=27% Similarity=0.705 Sum_probs=18.1
Q ss_pred CCCCCCceee---CCeeecCCCccc
Q 005509 282 NQCSGHGHCR---GGFCQCDSGWYG 303 (693)
Q Consensus 282 ~~C~~~G~C~---~g~C~C~~G~~G 303 (693)
..|...|.|+ ...|.|.+||.-
T Consensus 84 ~~CG~~g~C~~~~~~~C~Cl~GF~P 108 (110)
T PF00954_consen 84 GFCGPNGICNSNNSPKCSCLPGFEP 108 (110)
T ss_pred cccCCccEeCCCCCCceECCCCcCC
Confidence 6799999997 347999999963
No 117
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=20.42 E-value=67 Score=22.55 Aligned_cols=16 Identities=31% Similarity=1.074 Sum_probs=11.3
Q ss_pred EEeccCC--eEEeCCCcc
Q 005509 132 VCNHELG--QCRCFHGFR 147 (693)
Q Consensus 132 ~C~~~~G--~C~C~~G~~ 147 (693)
.|....| +|.|++||.
T Consensus 11 ~C~~~~g~~~C~C~~Gy~ 28 (36)
T PF14670_consen 11 ICVNTPGSYRCSCPPGYK 28 (36)
T ss_dssp EEEEETTSEEEE-STTEE
T ss_pred CCccCCCceEeECCCCCE
Confidence 5655555 899999996
No 118
>cd03820 GT1_amsD_like This family is most closely related to the GT1 family of glycosyltransferases. AmsD in Erwinia amylovora has been shown to be involved in the biosynthesis of amylovoran, the acidic exopolysaccharide acting as a virulence factor. This enzyme may be responsible for the formation of galactose alpha-1,6 linkages in amylovoran.
Probab=20.39 E-value=1e+02 Score=31.55 Aligned_cols=41 Identities=12% Similarity=0.141 Sum_probs=32.9
Q ss_pred chhHHHHhhcCceeeccCC-CCCchhHHHHHhcCceeEEEeCC
Q 005509 628 SENYHEDLSSSVFCGVLPG-DGWSGRMEDSILQGCIPVVIQVV 669 (693)
Q Consensus 628 ~~~y~~~l~~S~FCL~p~G-d~~s~Rl~dAi~~GCIPViisd~ 669 (693)
..+..+.|+++.+.+.|.. ++++..++|||.+|+. ||.+|.
T Consensus 243 ~~~~~~~~~~ad~~i~ps~~e~~~~~~~Ea~a~G~P-vi~~~~ 284 (348)
T cd03820 243 TKNIEEYYAKASIFVLTSRFEGFPMVLLEAMAFGLP-VISFDC 284 (348)
T ss_pred cchHHHHHHhCCEEEeCccccccCHHHHHHHHcCCC-EEEecC
Confidence 4667899999999998875 4677889999999985 556653
Done!