Query psy951
Match_columns 428
No_of_seqs 297 out of 3008
Neff 8.6
Searched_HMMs 46136
Date Fri Aug 16 21:22:03 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy951.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/951hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1214|consensus 99.9 2E-25 4.2E-30 225.1 17.2 126 293-425 988-1120(1289)
2 KOG1215|consensus 99.9 2.9E-24 6.3E-29 236.2 25.2 361 1-424 13-531 (877)
3 KOG1214|consensus 99.9 2.9E-24 6.2E-29 216.8 10.9 157 1-163 1125-1285(1289)
4 KOG1215|consensus 99.9 5.1E-22 1.1E-26 218.4 21.5 270 1-401 537-809 (877)
5 KOG1219|consensus 99.4 3.2E-13 7E-18 148.1 7.2 76 135-222 3863-3941(4289)
6 PF00058 Ldl_recept_b: Low-den 99.2 9.9E-12 2.1E-16 82.9 4.9 40 1-40 3-42 (42)
7 PF08450 SGL: SMP-30/Gluconola 99.2 1.5E-09 3.2E-14 102.1 21.6 99 319-423 129-232 (246)
8 KOG1219|consensus 99.2 6E-12 1.3E-16 138.5 5.7 112 139-262 3906-4022(4289)
9 PLN02919 haloacid dehalogenase 99.2 3.6E-09 7.8E-14 118.0 27.2 103 1-105 582-713 (1057)
10 PF00058 Ldl_recept_b: Low-den 99.1 2.3E-10 5E-15 76.3 6.1 40 336-378 1-42 (42)
11 PLN02919 haloacid dehalogenase 99.0 5.9E-08 1.3E-12 108.4 23.3 80 26-106 563-655 (1057)
12 KOG4289|consensus 98.7 6.4E-09 1.4E-13 111.7 3.8 83 135-229 1178-1284(2531)
13 smart00135 LY Low-density lipo 98.7 2.5E-08 5.5E-13 66.6 5.4 40 364-403 4-43 (43)
14 smart00135 LY Low-density lipo 98.6 7.4E-08 1.6E-12 64.3 5.9 42 24-66 2-43 (43)
15 KOG4289|consensus 98.5 7.7E-08 1.7E-12 103.7 5.2 85 131-224 1234-1321(2531)
16 PF08450 SGL: SMP-30/Gluconola 98.3 7.8E-06 1.7E-10 76.7 13.1 121 1-125 99-232 (246)
17 PHA03099 epidermal growth fact 98.1 5.1E-06 1.1E-10 67.1 5.5 33 190-223 52-85 (139)
18 COG3386 Gluconolactonase [Carb 97.9 0.0029 6.3E-08 61.2 21.9 83 323-407 162-251 (307)
19 PF03088 Str_synth: Strictosid 97.9 8.2E-05 1.8E-09 57.9 9.1 70 328-400 2-88 (89)
20 PF14670 FXa_inhibition: Coagu 97.9 3.2E-06 7E-11 53.8 0.8 28 141-169 5-32 (36)
21 PF00008 EGF: EGF-like domain 97.9 6.3E-06 1.4E-10 51.3 1.8 28 190-217 5-32 (32)
22 PF10282 Lactonase: Lactonase, 97.9 0.013 2.9E-07 57.8 26.0 104 321-425 189-301 (345)
23 PF10282 Lactonase: Lactonase, 97.9 0.032 6.9E-07 55.1 28.6 77 324-401 245-324 (345)
24 KOG4659|consensus 97.7 0.0016 3.4E-08 71.2 17.5 92 9-104 334-436 (1899)
25 PRK11028 6-phosphogluconolacto 97.6 0.091 2E-06 51.3 27.4 97 325-423 229-327 (330)
26 PF12662 cEGF: Complement Clr- 97.3 0.00017 3.7E-09 41.3 2.1 24 154-180 1-24 (24)
27 smart00179 EGF_CA Calcium-bind 97.3 0.00027 5.9E-09 45.7 3.2 29 190-219 10-39 (39)
28 PRK11028 6-phosphogluconolacto 97.1 0.3 6.5E-06 47.6 27.8 75 324-399 175-258 (330)
29 COG3391 Uncharacterized conser 97.1 0.36 7.9E-06 48.4 25.1 97 321-422 204-305 (381)
30 cd00054 EGF_CA Calcium-binding 97.0 0.00071 1.5E-08 43.2 3.1 29 190-219 10-38 (38)
31 cd00053 EGF Epidermal growth f 96.8 0.0015 3.2E-08 41.0 3.2 29 190-219 7-36 (36)
32 smart00179 EGF_CA Calcium-bind 96.7 0.0015 3.4E-08 42.1 2.9 28 136-163 2-32 (39)
33 COG3386 Gluconolactonase [Carb 96.7 0.018 4E-07 55.7 11.3 96 10-107 143-245 (307)
34 PF07645 EGF_CA: Calcium-bindi 96.7 0.00076 1.7E-08 44.7 1.1 28 136-163 2-33 (42)
35 PF12661 hEGF: Human growth fa 96.6 0.00084 1.8E-08 32.6 0.6 13 206-218 1-13 (13)
36 smart00181 EGF Epidermal growt 96.5 0.0031 6.7E-08 39.7 3.1 28 190-219 7-35 (35)
37 PF00008 EGF: EGF-like domain 96.5 0.00094 2E-08 41.4 0.6 25 139-163 1-28 (32)
38 PF01436 NHL: NHL repeat; Int 96.5 0.0065 1.4E-07 36.4 4.2 27 368-395 1-27 (28)
39 KOG1520|consensus 96.4 0.0097 2.1E-07 58.1 7.6 91 31-126 115-227 (376)
40 PF06977 SdiA-regulated: SdiA- 96.3 0.083 1.8E-06 49.5 13.1 73 30-105 21-94 (248)
41 smart00181 EGF Epidermal growt 96.2 0.0048 1E-07 38.8 2.6 26 138-163 1-28 (35)
42 PF07645 EGF_CA: Calcium-bindi 96.1 0.0063 1.4E-07 40.3 3.0 24 190-214 11-34 (42)
43 COG2706 3-carboxymuconate cycl 96.1 1.5 3.3E-05 42.4 26.1 103 323-426 190-301 (346)
44 PF07974 EGF_2: EGF-like domai 96.0 0.0066 1.4E-07 37.6 2.5 26 190-218 7-32 (32)
45 PF03088 Str_synth: Strictosid 96.0 0.048 1E-06 42.5 7.9 69 34-106 1-88 (89)
46 PHA02887 EGF-like protein; Pro 95.9 0.0076 1.6E-07 48.3 3.0 31 190-221 93-124 (126)
47 cd00054 EGF_CA Calcium-binding 95.8 0.0092 2E-07 37.9 2.9 28 136-163 2-32 (38)
48 KOG1225|consensus 95.8 0.011 2.3E-07 60.7 4.7 27 190-221 317-343 (525)
49 COG3391 Uncharacterized conser 95.7 0.22 4.7E-06 50.0 13.5 121 303-426 95-217 (381)
50 TIGR03866 PQQ_ABC_repeats PQQ- 95.6 2 4.3E-05 40.3 29.6 93 324-424 207-299 (300)
51 KOG1225|consensus 95.5 0.024 5.3E-07 58.2 5.8 68 139-225 247-316 (525)
52 PF01436 NHL: NHL repeat; Int 95.4 0.035 7.6E-07 33.1 4.2 27 30-58 1-27 (28)
53 cd00053 EGF Epidermal growth f 95.1 0.021 4.6E-07 35.5 2.6 25 139-163 2-29 (36)
54 KOG4659|consensus 95.0 0.2 4.3E-06 55.7 11.1 89 321-412 404-515 (1899)
55 KOG1520|consensus 95.0 0.047 1E-06 53.4 5.8 75 328-403 165-253 (376)
56 KOG1217|consensus 94.9 0.041 8.9E-07 56.2 5.7 80 135-222 270-356 (487)
57 TIGR02604 Piru_Ver_Nterm putat 94.8 0.16 3.4E-06 50.7 9.3 66 324-394 124-208 (367)
58 PRK04792 tolB translocation pr 94.7 5.8 0.00013 40.7 25.4 93 10-105 198-294 (448)
59 KOG4499|consensus 94.7 0.35 7.5E-06 44.3 10.1 96 12-109 140-245 (310)
60 PF12947 EGF_3: EGF domain; I 94.7 0.022 4.9E-07 36.2 1.9 26 190-216 7-32 (36)
61 PF12662 cEGF: Complement Clr- 94.0 0.036 7.7E-07 31.8 1.5 17 204-220 1-21 (24)
62 PRK04043 tolB translocation pr 93.7 9 0.00019 38.9 26.9 93 9-105 168-265 (419)
63 KOG0994|consensus 93.7 0.12 2.6E-06 56.6 5.9 68 148-226 878-955 (1758)
64 PF12955 DUF3844: Domain of un 93.5 0.16 3.5E-06 40.3 5.0 34 190-223 14-64 (103)
65 PF01731 Arylesterase: Arylest 93.4 0.26 5.5E-06 38.2 5.8 38 360-397 45-82 (86)
66 KOG4499|consensus 93.2 0.32 6.9E-06 44.6 7.0 81 321-402 155-244 (310)
67 PF14670 FXa_inhibition: Coagu 93.1 0.053 1.2E-06 34.5 1.4 19 195-214 10-28 (36)
68 KOG1217|consensus 93.0 0.13 2.8E-06 52.6 4.9 64 146-218 243-306 (487)
69 PTZ00382 Variant-specific surf 92.9 0.17 3.7E-06 40.0 4.4 14 208-221 41-56 (96)
70 PF06247 Plasmod_Pvs28: Plasmo 92.8 0.025 5.4E-07 49.6 -0.7 67 144-214 8-79 (197)
71 KOG1836|consensus 92.5 0.18 3.9E-06 58.9 5.5 40 186-225 777-818 (1705)
72 TIGR02604 Piru_Ver_Nterm putat 92.4 1.1 2.5E-05 44.6 10.6 97 322-425 12-133 (367)
73 PF01731 Arylesterase: Arylest 92.2 0.45 9.8E-06 36.8 5.7 43 17-61 41-83 (86)
74 PF12947 EGF_3: EGF domain; I 92.2 0.044 9.6E-07 34.9 0.1 21 143-163 7-29 (36)
75 cd01475 vWA_Matrilin VWA_Matri 92.0 0.1 2.3E-06 48.1 2.4 32 135-167 186-219 (224)
76 TIGR02658 TTQ_MADH_Hv methylam 92.0 14 0.0003 36.6 28.5 94 11-107 28-138 (352)
77 PF09064 Tme5_EGF_like: Thromb 91.7 0.15 3.4E-06 31.5 2.1 28 139-168 3-30 (34)
78 PHA03099 epidermal growth fact 91.7 0.35 7.7E-06 39.6 4.8 37 136-179 42-84 (139)
79 KOG3514|consensus 91.4 2.8 6.1E-05 46.2 12.4 35 189-224 629-664 (1591)
80 PF06977 SdiA-regulated: SdiA- 90.7 3 6.6E-05 39.1 10.8 83 324-407 118-209 (248)
81 PRK04043 tolB translocation pr 90.3 4.8 0.0001 40.9 12.8 106 2-113 293-408 (419)
82 TIGR03866 PQQ_ABC_repeats PQQ- 90.2 7.4 0.00016 36.4 13.5 93 306-401 13-105 (300)
83 PF07995 GSDH: Glucose / Sorbo 90.2 1.2 2.7E-05 43.6 8.2 74 323-397 113-209 (331)
84 KOG3516|consensus 90.1 6.4 0.00014 44.1 13.8 33 190-223 962-995 (1306)
85 PF03022 MRJP: Major royal jel 89.6 3.7 7.9E-05 39.5 10.7 82 326-408 130-228 (287)
86 smart00051 DSL delta serrate l 88.9 0.43 9.3E-06 34.6 2.8 46 156-218 18-63 (63)
87 KOG4260|consensus 88.2 0.24 5.2E-06 45.9 1.4 66 135-214 235-304 (350)
88 PRK05137 tolB translocation pr 88.1 32 0.0007 34.9 26.5 95 9-106 181-279 (435)
89 PF03022 MRJP: Major royal jel 87.8 3.2 6.9E-05 39.9 9.0 62 31-94 186-254 (287)
90 PF07995 GSDH: Glucose / Sorbo 87.2 2.6 5.5E-05 41.4 8.1 83 323-409 1-105 (331)
91 PF06247 Plasmod_Pvs28: Plasmo 87.1 0.15 3.3E-06 44.7 -0.5 71 136-217 87-163 (197)
92 PF12946 EGF_MSP1_1: MSP1 EGF 86.6 0.36 7.8E-06 30.7 1.1 29 190-218 6-36 (37)
93 TIGR03606 non_repeat_PQQ dehyd 85.8 13 0.00029 38.0 12.5 80 318-400 24-125 (454)
94 cd00055 EGF_Lam Laminin-type e 85.7 0.77 1.7E-05 31.4 2.5 18 205-222 19-36 (50)
95 KOG3514|consensus 85.3 3.6 7.9E-05 45.4 8.3 33 138-177 625-660 (1591)
96 PRK05137 tolB translocation pr 85.0 18 0.00039 36.8 13.3 105 2-110 306-417 (435)
97 PRK04792 tolB translocation pr 84.9 17 0.00036 37.3 13.0 98 8-109 328-429 (448)
98 KOG3516|consensus 84.9 0.67 1.5E-05 51.4 2.8 35 189-224 551-586 (1306)
99 PF00053 Laminin_EGF: Laminin 84.0 0.93 2E-05 30.8 2.3 24 196-222 12-35 (49)
100 PRK04922 tolB translocation pr 83.1 57 0.0012 33.1 24.7 77 327-407 339-419 (433)
101 TIGR03032 conserved hypothetic 82.4 12 0.00027 36.1 9.8 61 27-93 199-259 (335)
102 PHA02887 EGF-like protein; Pro 81.5 1 2.3E-05 36.3 2.0 36 135-177 82-123 (126)
103 PRK04922 tolB translocation pr 81.1 27 0.0006 35.4 12.9 101 5-109 311-415 (433)
104 PRK00178 tolB translocation pr 81.0 66 0.0014 32.5 24.9 76 328-407 335-414 (430)
105 TIGR02800 propeller_TolB tol-p 80.4 66 0.0014 32.1 25.5 78 325-406 323-404 (417)
106 PRK00178 tolB translocation pr 79.6 37 0.0008 34.3 13.2 101 5-109 306-410 (430)
107 KOG3607|consensus 79.4 2.3 5.1E-05 45.9 4.4 30 191-224 632-661 (716)
108 PRK02889 tolB translocation pr 79.3 41 0.00089 34.1 13.4 101 4-108 302-406 (427)
109 COG3204 Uncharacterized protei 79.0 18 0.00038 34.6 9.5 101 323-424 180-292 (316)
110 KOG4260|consensus 79.0 2.2 4.8E-05 39.8 3.5 31 190-220 151-183 (350)
111 PRK02889 tolB translocation pr 78.8 79 0.0017 32.1 26.9 76 328-407 332-411 (427)
112 TIGR02658 TTQ_MADH_Hv methylam 78.5 68 0.0015 31.8 14.0 100 2-106 209-331 (352)
113 cd01475 vWA_Matrilin VWA_Matri 78.4 2 4.4E-05 39.4 3.3 20 195-215 199-218 (224)
114 PF15102 TMEM154: TMEM154 prot 76.8 1.9 4.2E-05 36.5 2.3 27 234-260 60-87 (146)
115 TIGR02800 propeller_TolB tol-p 75.8 50 0.0011 33.0 12.9 95 8-106 300-398 (417)
116 PF12273 RCR: Chitin synthesis 75.8 1.2 2.6E-05 37.3 0.9 6 256-261 24-29 (130)
117 PRK01742 tolB translocation pr 74.8 61 0.0013 32.9 13.2 95 10-107 184-282 (429)
118 COG3204 Uncharacterized protei 73.7 88 0.0019 30.1 17.1 72 31-105 86-158 (316)
119 TIGR03606 non_repeat_PQQ dehyd 72.5 6.6 0.00014 40.2 5.4 43 10-54 199-251 (454)
120 PF08693 SKG6: Transmembrane a 72.4 3.8 8.3E-05 26.6 2.3 13 247-259 28-40 (40)
121 PF14991 MLANA: Protein melan- 72.3 0.62 1.3E-05 37.3 -1.6 27 233-259 25-51 (118)
122 PRK03629 tolB translocation pr 72.3 1.2E+02 0.0025 30.9 27.2 93 10-105 179-275 (429)
123 PRK03629 tolB translocation pr 71.6 79 0.0017 32.1 13.1 101 304-407 267-371 (429)
124 PRK01029 tolB translocation pr 71.2 1.2E+02 0.0026 30.9 14.2 105 5-111 300-409 (428)
125 PF02239 Cytochrom_D1: Cytochr 70.9 81 0.0018 31.4 12.7 94 301-398 13-107 (369)
126 PF01299 Lamp: Lysosome-associ 70.0 5 0.00011 38.9 3.7 31 232-262 272-302 (306)
127 PRK01742 tolB translocation pr 68.0 87 0.0019 31.7 12.5 100 304-406 228-331 (429)
128 PF01102 Glycophorin_A: Glycop 67.1 4.6 9.9E-05 33.4 2.3 29 233-261 67-95 (122)
129 KOG1226|consensus 66.8 8.7 0.00019 41.2 4.8 23 206-228 567-589 (783)
130 COG2133 Glucose/sorbosone dehy 65.9 16 0.00035 36.6 6.4 69 324-393 177-263 (399)
131 KOG0994|consensus 65.7 6.6 0.00014 43.8 3.8 56 155-220 1084-1147(1758)
132 PF00954 S_locus_glycop: S-loc 65.1 5.3 0.00012 32.2 2.4 24 190-215 85-108 (110)
133 smart00180 EGF_Lam Laminin-typ 64.1 7.5 0.00016 26.0 2.6 16 206-221 19-34 (46)
134 PF00930 DPPIV_N: Dipeptidyl p 64.0 73 0.0016 31.4 10.8 106 3-109 202-320 (353)
135 KOG1226|consensus 63.3 14 0.00031 39.6 5.7 19 206-224 606-625 (783)
136 PRK01029 tolB translocation pr 62.7 1.4E+02 0.0031 30.3 12.8 82 324-408 327-412 (428)
137 PF13449 Phytase-like: Esteras 61.0 38 0.00083 33.0 8.0 81 324-406 20-127 (326)
138 PF02009 Rifin_STEVOR: Rifin/s 60.2 5.4 0.00012 38.4 1.8 15 245-259 271-285 (299)
139 PF04863 EGF_alliinase: Alliin 59.1 4.2 9E-05 28.2 0.6 32 191-222 19-53 (56)
140 KOG0196|consensus 58.8 16 0.00036 39.6 5.2 55 155-214 259-317 (996)
141 KOG3509|consensus 58.6 16 0.00035 40.6 5.3 72 135-218 405-478 (964)
142 PF14946 DUF4501: Domain of un 57.3 1.3E+02 0.0028 26.1 9.3 28 232-259 90-117 (180)
143 PF05787 DUF839: Bacterial pro 56.5 40 0.00087 35.3 7.7 72 321-392 347-459 (524)
144 PF12191 stn_TNFRSF12A: Tumour 56.5 3.5 7.7E-05 33.8 -0.1 17 245-261 91-107 (129)
145 PF01683 EB: EB module; Inter 55.6 18 0.0004 24.6 3.5 20 190-214 27-46 (52)
146 COG0823 TolB Periplasmic compo 55.4 1.2E+02 0.0027 30.8 10.8 101 10-113 218-322 (425)
147 TIGR03118 PEPCTERM_chp_1 conse 55.1 1.7E+02 0.0037 28.4 10.8 98 1-105 154-279 (336)
148 COG2133 Glucose/sorbosone dehy 54.1 59 0.0013 32.7 8.0 40 12-53 221-260 (399)
149 PF05568 ASFV_J13L: African sw 53.7 9.2 0.0002 32.1 1.9 39 235-273 34-74 (189)
150 PF01414 DSL: Delta serrate li 52.9 5.3 0.00011 28.9 0.3 12 207-218 52-63 (63)
151 PF01034 Syndecan: Syndecan do 52.3 4.9 0.00011 28.9 0.1 25 238-262 17-41 (64)
152 TIGR01478 STEVOR variant surfa 51.9 9.4 0.0002 36.1 1.9 17 245-261 272-288 (295)
153 PF04478 Mid2: Mid2 like cell 51.7 6.6 0.00014 33.6 0.8 15 248-262 67-81 (154)
154 PF13449 Phytase-like: Esteras 49.8 37 0.00079 33.2 5.9 61 325-388 86-166 (326)
155 PTZ00046 rifin; Provisional 49.7 13 0.00028 36.5 2.5 15 245-259 330-344 (358)
156 PTZ00370 STEVOR; Provisional 49.5 10 0.00022 35.9 1.8 17 245-261 268-284 (296)
157 PF02239 Cytochrom_D1: Cytochr 49.4 2.8E+02 0.0061 27.6 12.8 96 2-106 9-109 (369)
158 TIGR01477 RIFIN variant surfac 48.7 14 0.0003 36.3 2.5 15 245-259 325-339 (353)
159 PF02439 Adeno_E3_CR2: Adenovi 48.3 22 0.00049 22.7 2.6 7 234-240 7-13 (38)
160 PF05694 SBP56: 56kDa selenium 48.1 57 0.0012 33.2 6.8 62 325-388 313-393 (461)
161 TIGR02276 beta_rpt_yvtn 40-res 47.3 69 0.0015 20.0 5.9 42 333-377 1-42 (42)
162 PF02333 Phytase: Phytase; In 45.8 3E+02 0.0064 27.6 11.4 84 321-407 153-247 (381)
163 PF11403 Yeast_MT: Yeast metal 43.6 8.9 0.00019 23.6 0.2 9 206-214 23-31 (40)
164 PF06433 Me-amine-dh_H: Methyl 43.2 2.8E+02 0.006 27.4 10.5 62 2-66 199-281 (342)
165 PF11770 GAPT: GRB2-binding ad 42.6 26 0.00057 29.8 3.0 24 232-255 9-32 (158)
166 TIGR02276 beta_rpt_yvtn 40-res 41.5 87 0.0019 19.5 6.1 21 42-63 3-23 (42)
167 TIGR03032 conserved hypothetic 40.7 92 0.002 30.3 6.7 56 67-126 196-251 (335)
168 PF14251 DUF4346: Domain of un 39.1 31 0.00068 28.1 2.8 72 326-409 9-80 (119)
169 PF12273 RCR: Chitin synthesis 39.0 7.4 0.00016 32.5 -0.8 10 253-262 18-27 (130)
170 COG3823 Glutamine cyclotransfe 38.5 1.8E+02 0.0039 26.7 7.7 65 333-400 183-260 (262)
171 PF06739 SBBP: Beta-propeller 37.3 29 0.00063 22.1 2.0 19 369-388 13-31 (38)
172 PF14610 DUF4448: Protein of u 36.9 14 0.0003 33.1 0.5 23 232-254 159-181 (189)
173 PF14781 BBS2_N: Ciliary BBSom 35.8 86 0.0019 26.4 5.0 55 36-92 76-132 (136)
174 TIGR02976 phageshock_pspB phag 34.0 35 0.00075 25.6 2.2 27 235-261 6-32 (75)
175 PF11118 DUF2627: Protein of u 33.8 67 0.0014 24.1 3.6 31 232-262 40-70 (77)
176 PF14759 Reductase_C: Reductas 33.3 1.1E+02 0.0023 23.3 5.0 27 2-29 2-28 (85)
177 PF00930 DPPIV_N: Dipeptidyl p 33.2 1.7E+02 0.0036 28.7 7.7 74 333-409 245-326 (353)
178 PRK02888 nitrous-oxide reducta 33.0 3.1E+02 0.0067 29.5 9.6 33 368-400 320-352 (635)
179 PF05454 DAG1: Dystroglycan (D 32.3 15 0.00033 35.2 0.0 18 245-262 160-177 (290)
180 PF05345 He_PIG: Putative Ig d 30.1 56 0.0012 22.1 2.5 24 29-53 9-32 (49)
181 PRK09458 pspB phage shock prot 29.9 65 0.0014 24.1 3.0 27 235-261 6-32 (75)
182 KOG3653|consensus 29.6 1.1E+02 0.0024 31.4 5.5 21 241-261 164-184 (534)
183 PF02191 OLF: Olfactomedin-lik 29.4 4.8E+02 0.01 24.4 12.0 100 297-397 80-200 (250)
184 COG4257 Vgb Streptogramin lyas 29.2 2.8E+02 0.0062 26.6 7.7 100 31-133 189-291 (353)
185 PF02333 Phytase: Phytase; In 29.1 3.8E+02 0.0083 26.9 9.2 74 31-106 208-291 (381)
186 PF15330 SIT: SHP2-interacting 29.1 34 0.00074 27.6 1.6 18 245-262 12-29 (107)
187 KOG3512|consensus 28.6 49 0.0011 33.6 2.8 25 195-222 407-431 (592)
188 PF06667 PspB: Phage shock pro 28.5 50 0.0011 24.8 2.3 25 236-260 7-31 (75)
189 COG4257 Vgb Streptogramin lyas 27.5 5.6E+02 0.012 24.6 9.7 69 34-106 236-306 (353)
190 COG2706 3-carboxymuconate cycl 26.6 6.3E+02 0.014 24.9 24.9 80 320-400 240-322 (346)
191 PF06084 Cytomega_TRL10: Cytom 26.3 1.3E+02 0.0028 24.3 4.3 8 155-162 20-27 (150)
192 KOG0291|consensus 26.1 6.6E+02 0.014 27.6 10.5 85 11-96 406-500 (893)
193 PF14759 Reductase_C: Reductas 25.0 97 0.0021 23.5 3.4 29 384-412 2-30 (85)
194 PF05337 CSF-1: Macrophage col 24.7 25 0.00053 33.2 0.0 19 244-262 237-255 (285)
195 KOG4649|consensus 23.8 1.8E+02 0.0039 27.6 5.4 54 30-97 30-83 (354)
196 PTZ00214 high cysteine membran 22.8 17 0.00036 40.2 -1.7 17 205-221 751-769 (800)
197 KOG0273|consensus 22.0 7.7E+02 0.017 25.4 9.6 43 367-409 315-357 (524)
198 PF05545 FixQ: Cbb3-type cytoc 21.9 1.1E+02 0.0024 20.5 2.9 7 253-259 28-34 (49)
199 KOG3512|consensus 21.7 75 0.0016 32.4 2.7 35 188-222 277-312 (592)
200 PF06024 DUF912: Nucleopolyhed 21.1 1.1E+02 0.0024 24.3 3.1 9 249-257 80-88 (101)
201 cd01328 FSL_SPARC Follistatin- 20.6 1.1E+02 0.0024 23.6 2.9 22 190-211 6-27 (86)
202 PF11857 DUF3377: Domain of un 20.4 72 0.0016 23.8 1.7 28 232-259 31-58 (74)
203 PF14380 WAK_assoc: Wall-assoc 20.0 1E+02 0.0022 23.9 2.7 20 193-212 73-93 (94)
No 1
>KOG1214|consensus
Probab=99.93 E-value=2e-25 Score=225.10 Aligned_cols=126 Identities=18% Similarity=0.336 Sum_probs=107.3
Q ss_pred CCceeeeecCCcceeeeecCccceee------ccccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEe
Q psy951 293 SLGLTIVYSNGPEIRAYETHKRRFRD------VISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVIS 366 (428)
Q Consensus 293 ~~~~~~~~s~~~~~~~~~~~~~~~~~------~i~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~ 366 (428)
.++..|+|.-..+|....+...++.. +.---+-++|||||.+++||||||....+|.||.|.| .+.+.++.
T Consensus 988 ~~gt~LL~aqg~~I~~lplng~~~~K~~ak~~l~~p~~IiVGidfDC~e~mvyWtDv~g~SI~rasL~G---~Ep~ti~n 1064 (1289)
T KOG1214|consen 988 SVGTFLLYAQGQQIGYLPLNGTRLQKDAAKTLLSLPGSIIVGIDFDCRERMVYWTDVAGRSISRASLEG---AEPETIVN 1064 (1289)
T ss_pred CCcceEEEeccceEEEeecCcchhchhhhhceEecccceeeeeecccccceEEEeecCCCccccccccC---CCCceeec
Confidence 33457888888888877777666665 1233456899999999999999999999999999999 45677775
Q ss_pred -ccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcceeEEEecccCCCCCCCccccccc
Q psy951 367 -DERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAGASRHQNGVPASSQRNL 425 (428)
Q Consensus 367 -~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~~~~l~~~~~~~~~~~p~~~~~~~ 425 (428)
+|.+|+||||||+++|+||||+.+++|+++.|||+.||+|+ .+.+-.||+|++|+
T Consensus 1065 ~~L~SPEGiAVDh~~Rn~ywtDS~lD~IevA~LdG~~rkvLf----~tdLVNPR~iv~D~ 1120 (1289)
T KOG1214|consen 1065 SGLISPEGIAVDHIRRNMYWTDSVLDKIEVALLDGSERKVLF----YTDLVNPRAIVVDP 1120 (1289)
T ss_pred ccCCCccceeeeeccceeeeeccccchhheeecCCceeeEEE----eecccCcceEEeec
Confidence 99999999999999999999999999999999999999999 55578888888876
No 2
>KOG1215|consensus
Probab=99.93 E-value=2.9e-24 Score=236.18 Aligned_cols=361 Identities=25% Similarity=0.413 Sum_probs=234.9
Q ss_pred CEEEeeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCcccee
Q psy951 1 MFWAETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPI 80 (428)
Q Consensus 1 lyWtd~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~ 80 (428)
||||||+. . |+ ||+.+..+....+.+|+|++||.... ++||+|...+.|+++++||+.|+
T Consensus 13 l~wtd~~~-~-i~----dg~~~~~~~~~~~~~~ng~~id~~~~-~~y~~d~~~~~i~~~~~dg~~r~------------- 72 (877)
T KOG1215|consen 13 LFWTDWGA-N-IE----DGGERKILEKEEFEWPNGLTIDLAWQ-RIYWADAKNDLIESANYDGSGRR------------- 72 (877)
T ss_pred EEEecCCc-c-cc----cCcceEEeeccceeCCCcceecchhh-eeeeccccCCceEEeccCCccce-------------
Confidence 79999999 5 88 88888888888999999999999999 99999999999999999999987
Q ss_pred eeeccccEEEEEeCCCCceeeeccCCCCceeeeccCCCCCcccccccccccCCCCCCCCCCC---CCCCccccCCC--cc
Q psy951 81 SLDVFENNIYWLARDTGSLYKQDKFGRGVPVLISKDLVNPSGVKAYHAQRYNTSAPNPCSQS---PCSHLCLVIPG--GY 155 (428)
Q Consensus 81 ~l~~~~~~lYwtD~~~~~I~~~~~~g~~~~~~~~~~~~~p~~I~v~~~~~~~~~~~n~C~~~---~C~~~C~~~~~--~~ 155 (428)
+|++|++.+||+| ..|..+++..+.....+......|+.+.+++...++. ..++|... +|++.|..... .+
T Consensus 73 ~l~~~~~~~y~~d---~~v~~~~~~sg~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~c~~~ 148 (877)
T KOG1215|consen 73 ALTLFEDGLYWTD---KSVSAANKKTGKDVTRLSQDSHFPLDIHAYHPSSQPL-APDPCAESGNGPCSHCCLDKFSCRTG 148 (877)
T ss_pred eeeeeccceeecc---chhhhhccCCCCcceeehhcCCCCcceeEEecCCCCC-CCCcccccCCCCCccccCCCCCCcCc
Confidence 7899999999999 7788888886655555555444499999998888876 67788764 77877776663 33
Q ss_pred eecCCCCCCCCCCCCCCcccccCccC------------------------------------------------------
Q psy951 156 QCACPENATPKLPGVAEIRCSAAVER------------------------------------------------------ 181 (428)
Q Consensus 156 ~C~C~~g~~~~~~~~~g~~C~~~~~~------------------------------------------------------ 181 (428)
.|.|+++. .+.++ ...|.+....
T Consensus 149 ~~~Cip~~-~~cd~--~~~C~dg~de~~~~~~~~~~~~~~~~~~~~~~~~d~~~~~~~~~d~~~~~~~~~~~~~~~~~~c 225 (877)
T KOG1215|consen 149 SCKCIPGD-WLCDG--EADCPDGSDELNCAVRRCEPRGASLDCIVAIKVCDIQHDCADDYDESEGRIYWTDDSRIEVTRC 225 (877)
T ss_pred cccCCCCc-eeCCC--CCccccchhhhcccccccCccccccccceeeeecCcccccccccccccCcccccCCcceeEEEe
Confidence 78888877 43332 1222211100
Q ss_pred ------------C------CCCCC--------------CCCCCCCCEEeeCC---CCc----------------------
Q psy951 182 ------------P------RPLPR--------------VCQCQNGGMCAESE---TGD---------------------- 204 (428)
Q Consensus 182 ------------~------~~~~~--------------~c~C~ngg~C~~~~---~g~---------------------- 204 (428)
| ...+. .+.|.++ .|.... .|.
T Consensus 226 ~g~~~~i~~~~~~Dg~~dc~~~~de~~~~~~~~~~~~~e~~~~~~-~~~~~~~~~~g~~d~pdg~de~~~~~~~~~~~~~ 304 (877)
T KOG1215|consen 226 DGSSRCILISEVCDGPRDCVDGPDEGVMNCSDATCEAPEIECADG-DCSDRQKLCDGDLDCPDGLDEDYCKKKLYWSMNV 304 (877)
T ss_pred cCCCcEEeehhccCCCcccccCCcCceeEeeccccCCcceeecCC-CCccceEEecCccCCCCcccccccccceeeeeec
Confidence 0 00000 0011110 010000 000
Q ss_pred -----------------------------------------eEecCCCCCccCccCCCCCCCCCCCchhhHHHHHHHHHH
Q psy951 205 -----------------------------------------LTCNCRQDFAGTFCENYTGIGQGLTLGRSLLYIPTLLLL 243 (428)
Q Consensus 205 -----------------------------------------~~C~C~~gy~G~~Ce~~~~~~~~~~~~~~~~~~~i~~~~ 243 (428)
..|.|..++............-....+ .-
T Consensus 305 d~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~~~~g-~C--------- 374 (877)
T KOG1215|consen 305 DGSGRRILLSKLCHGYWTDGLNECAERVLKCSHKCPDVSVGPRCDCMGAKVLPLGARTDSNPCESDNG-GC--------- 374 (877)
T ss_pred ccCCceeeecccCccccccccccchhhcccccCCCCccccCCcccCCccceecccccccCCcccccCC-cc---------
Confidence 111111111100000000000000011 00
Q ss_pred HHHHHHHhhheee-cCCCCCCCCCCCCCCceEeecCCeEEecCCCcCCCCCCceeeeecCCcceeeeecCccceeecccc
Q psy951 244 LALVSATVYYVWR-KRPFGKTMGSALSTQSVSFRQGTNVEFGAPAFNNGASLGLTIVYSNGPEIRAYETHKRRFRDVISD 322 (428)
Q Consensus 244 ~~~~~~~~~~~~r-~~~~~~~~~~~~~~~~v~~~~~~~~~~~~p~~~~~~~~~~~~~~s~~~~~~~~~~~~~~~~~~i~~ 322 (428)
--+.... ...++= .-..+.+.+.. . ..+-......+.+++..+++.+.+...++...+++
T Consensus 375 ------sq~C~~~~p~~~~c-----~c~~g~~~~~~-~-------c~~~~~~~~~l~~s~~~~ir~~~~~~~~~~~p~~~ 435 (877)
T KOG1215|consen 375 ------SQLCVPNSPGTFKC-----ACSPGYELRLD-K-------CEASDQPEAFLLFSNRHDIRRISLDCSDVSRPLEG 435 (877)
T ss_pred ------ceeccCCCCCceeE-----ecCCCcEeccC-C-------ceecCCCCcEEEEecCccceecccCCCcceEEccC
Confidence 0000000 000000 00000000000 0 00111122457779999999999988877777787
Q ss_pred ccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcc
Q psy951 323 ERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQ 402 (428)
Q Consensus 323 ~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~ 402 (428)
..++.++++|+.++.+||+|....+|.++.++|. ....+.-.++..|.|||+||+++++||+|.+...|++..++|+.
T Consensus 436 ~~~~~~~d~d~~~~~i~~~d~~~~~i~~~~~~~~--~~~~~~~~g~~~~~~lavD~~~~~~y~tDe~~~~i~v~~~~g~~ 513 (877)
T KOG1215|consen 436 IKNAVALDFDVLNNRIYWADLSDEKICRASQDGS--SECELCGDGLCIPEGLAVDWIGDNIYWTDEGNCLIEVADLDGSS 513 (877)
T ss_pred CccceEEEEEecCCEEEEEeccCCeEeeeccCCC--ccceEeccCccccCcEEEEeccCCceecccCCceeEEEEccCCc
Confidence 8999999999999999999999999999999994 33333335999999999999999999999999999999999999
Q ss_pred eeEEEecccCCCCCCCcccccc
Q psy951 403 KGQVQAGASRHQNGVPASSQRN 424 (428)
Q Consensus 403 ~~~l~~~~~~~~~~~p~~~~~~ 424 (428)
|++|+... .+.|++++++
T Consensus 514 ~~vl~~~~----l~~~r~~~v~ 531 (877)
T KOG1215|consen 514 RKVLVSKD----LDLPRSIAVD 531 (877)
T ss_pred eeEEEecC----CCCccceeec
Confidence 99999555 5788888876
No 3
>KOG1214|consensus
Probab=99.91 E-value=2.9e-24 Score=216.77 Aligned_cols=157 Identities=29% Similarity=0.589 Sum_probs=138.5
Q ss_pred CEEEeeCC-CCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCccce
Q psy951 1 MFWAETGA-SPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHP 79 (428)
Q Consensus 1 lyWtd~~~-~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p 79 (428)
||||||.. +|+|++++|||++|++++.+++..|+||++|+.++ .|.|+|+++++.+.+..||..|+++.++ |.+|
T Consensus 1125 LYwtDWnRenPkIets~mDG~NrRilin~DigLPNGLtfdpfs~-~LCWvDAGt~rleC~~p~g~gRR~i~~~---LqYP 1200 (1289)
T KOG1214|consen 1125 LYWTDWNRENPKIETSSMDGENRRILINTDIGLPNGLTFDPFSK-LLCWVDAGTKRLECTLPDGTGRRVIQNN---LQYP 1200 (1289)
T ss_pred eeeccccccCCcceeeccCCccceEEeecccCCCCCceeCcccc-eeeEEecCCcceeEecCCCCcchhhhhc---ccCc
Confidence 79999985 79999999999999999999999999999999999 9999999999999999999999999996 9999
Q ss_pred eeeeccccEEEEEeCCCCceeeeccCCCCceeee-ccCCCCCcccccccccccCCCCCCCCCC--CCCCCccccCCCcce
Q psy951 80 ISLDVFENNIYWLARDTGSLYKQDKFGRGVPVLI-SKDLVNPSGVKAYHAQRYNTSAPNPCSQ--SPCSHLCLVIPGGYQ 156 (428)
Q Consensus 80 ~~l~~~~~~lYwtD~~~~~I~~~~~~g~~~~~~~-~~~~~~p~~I~v~~~~~~~~~~~n~C~~--~~C~~~C~~~~~~~~ 156 (428)
|+|.-+++++|||||...+|.+.++.++..+... ...-....||+... .|+....++|.. ++|+|+|++.-++..
T Consensus 1201 F~itsy~~~fY~TDWk~n~vvsv~~~~~~~td~~~p~~~s~lyGItav~--~~Cp~gstpCSedNGGCqHLCLpgqngav 1278 (1289)
T KOG1214|consen 1201 FSITSYADHFYHTDWKRNGVVSVNKHSGQFTDEYLPEQRSHLYGITAVY--PYCPTGSTPCSEDNGGCQHLCLPGQNGAV 1278 (1289)
T ss_pred eeeeeccccceeeccccCceEEeeccccccccccccccccceEEEEecc--ccCCCCCCcccccCCcceeecccCcCCcc
Confidence 9999999999999999999999999976655433 33445577777653 356668899985 489999999889999
Q ss_pred ecCCCCC
Q psy951 157 CACPENA 163 (428)
Q Consensus 157 C~C~~g~ 163 (428)
|.||...
T Consensus 1279 cecpdnv 1285 (1289)
T KOG1214|consen 1279 CECPDNV 1285 (1289)
T ss_pred ccCCccc
Confidence 9998754
No 4
>KOG1215|consensus
Probab=99.89 E-value=5.1e-22 Score=218.44 Aligned_cols=270 Identities=32% Similarity=0.515 Sum_probs=202.1
Q ss_pred CEEEeeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCC-eEEEEEcCCCCeEEEEeCCCCccce
Q psy951 1 MFWAETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLN-TIESVRHDGRNRQTILSGSDKLQHP 79 (428)
Q Consensus 1 lyWtd~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~-~I~~~~ldG~~~~~i~~~~~~~~~p 79 (428)
|||+||+..++|+|+.|||+.+++++..++.+|+||++|..++ ++||+|.... .|++++++|.+|+++... .+.+|
T Consensus 537 ~~wtd~~~~~~i~ra~~dg~~~~~l~~~~~~~p~glt~d~~~~-~~yw~d~~~~~~i~~~~~~g~~r~~~~~~--~~~~p 613 (877)
T KOG1215|consen 537 MFWTDWGQPPRIERASLDGSERAVLVTNGILWPNGLTIDYETD-RLYWADAKLDYTIESANMDGQNRRVVDSE--DLPHP 613 (877)
T ss_pred eEEecCCCCchhhhhcCCCCCceEEEeCCccCCCcceEEeecc-eeEEEcccCCcceeeeecCCCceEEeccc--cCCCc
Confidence 7999999877999999999999999998899999999999999 9999999999 899999999999944444 69999
Q ss_pred eeeeccccEEEEEeCCCCceeeeccCCCCceeeeccCCCCCcccccccccccCCCCCCCCCCC--CCCCccccCCCccee
Q psy951 80 ISLDVFENNIYWLARDTGSLYKQDKFGRGVPVLISKDLVNPSGVKAYHAQRYNTSAPNPCSQS--PCSHLCLVIPGGYQC 157 (428)
Q Consensus 80 ~~l~~~~~~lYwtD~~~~~I~~~~~~g~~~~~~~~~~~~~p~~I~v~~~~~~~~~~~n~C~~~--~C~~~C~~~~~~~~C 157 (428)
++++++++++||+||....+.+.++..+.....+......|..+..++...+...+.|+|..+ .|++.|++.|.+.+|
T Consensus 614 ~~~~~~~~~iyw~d~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~n~C~~~n~~c~~KOG~~p~~~~c 693 (877)
T KOG1215|consen 614 FGLSVFEDYIYWTDWSNRAISRAEKHKGSDSRTSRSNLAQPLDIILVHHSSSRPTGVNPCESSNGGCSQLCLPRPQGSTC 693 (877)
T ss_pred eEEEEecceeEEeeccccceEeeecccCCcceeeecccCcccceEEEeccccCCCCCCcccccCCCCCeeeecCCCCCee
Confidence 999999999999999999888888875433235566677888888885554444589999975 899999999977799
Q ss_pred cCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCCCCCccCccCCCCCCCCCCCchhhHHHH
Q psy951 158 ACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCRQDFAGTFCENYTGIGQGLTLGRSLLYI 237 (428)
Q Consensus 158 ~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~gy~G~~Ce~~~~~~~~~~~~~~~~~~ 237 (428)
.|+.|+ .+... +..|. ++.+|....=+.
T Consensus 694 ~c~~~~-~l~~~--~~~C~--------------------------------~~~~~~~~~~~~----------------- 721 (877)
T KOG1215|consen 694 ACPEGY-RLSPD--GKSCS--------------------------------SPEGYLLITSRT----------------- 721 (877)
T ss_pred eCCCCC-eecCC--CCeec--------------------------------Cccccccccccc-----------------
Confidence 999998 44433 23333 222221100000
Q ss_pred HHHHHHHHHHHHHhhheeecCCCCCCCCCCCCCCceEeecCCeEEecCCCcCCCCCCceeeeecCCcceeeeecCcccee
Q psy951 238 PTLLLLLALVSATVYYVWRKRPFGKTMGSALSTQSVSFRQGTNVEFGAPAFNNGASLGLTIVYSNGPEIRAYETHKRRFR 317 (428)
Q Consensus 238 ~i~~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~v~~~~~~~~~~~~p~~~~~~~~~~~~~~s~~~~~~~~~~~~~~~~ 317 (428)
.+..++.....
T Consensus 722 -------------------------------------------------------------------~~~~~~~~~~~-- 732 (877)
T KOG1215|consen 722 -------------------------------------------------------------------GIPCISLDSEL-- 732 (877)
T ss_pred -------------------------------------------------------------------ccceeecCccc--
Confidence 00000000000
Q ss_pred eccccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeCCCCcEEEEE
Q psy951 318 DVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSF 397 (428)
Q Consensus 318 ~~i~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~ 397 (428)
....+...- +..++..+|++.......+...+++... .++-.+...|.++++|+.-..|||+......|.+..
T Consensus 733 ----~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~ 805 (877)
T KOG1215|consen 733 ----SPDQPLEDG-DTIDRLEYWTDVRVGVAAVSSQNCAPGY--DLVGEGEPPPEGSAVDEAEDTLYWTCSATSFIEVSG 805 (877)
T ss_pred ----CCCcccCCC-cccccceecccccceeeEEEecCCCCcc--ccccccCCCCCCceeehhhcceEEEeecccEEEEEE
Confidence 000000000 8888999999988887777777775322 233358889999999999999999999999999999
Q ss_pred ccCc
Q psy951 398 MLEA 401 (428)
Q Consensus 398 ~~g~ 401 (428)
+++.
T Consensus 806 ~~~~ 809 (877)
T KOG1215|consen 806 LDGE 809 (877)
T ss_pred Eeee
Confidence 9995
No 5
>KOG1219|consensus
Probab=99.40 E-value=3.2e-13 Score=148.13 Aligned_cols=76 Identities=33% Similarity=0.880 Sum_probs=69.2
Q ss_pred CCCCCCCCCCCC--ccccCC-CcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCCC
Q psy951 135 APNPCSQSPCSH--LCLVIP-GGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCRQ 211 (428)
Q Consensus 135 ~~n~C~~~~C~~--~C~~~~-~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~ 211 (428)
..++|..+||+| .|...+ +||.|.||+.| +|+.|+...++|.++ ||++||+|+...+ .|.|.||.
T Consensus 3863 ~~d~C~~npCqhgG~C~~~~~ggy~CkCpsqy-------sG~~CEi~~epC~sn----PC~~GgtCip~~n-~f~CnC~~ 3930 (4289)
T KOG1219|consen 3863 LTDPCNDNPCQHGGTCISQPKGGYKCKCPSQY-------SGNHCEIDLEPCASN----PCLTGGTCIPFYN-GFLCNCPN 3930 (4289)
T ss_pred cccccccCcccCCCEecCCCCCceEEeCcccc-------cCcccccccccccCC----CCCCCCEEEecCC-CeeEeCCC
Confidence 349999999999 999999 78999999999 789999999999998 8999999999884 59999999
Q ss_pred CCccCccCCCC
Q psy951 212 DFAGTFCENYT 222 (428)
Q Consensus 212 gy~G~~Ce~~~ 222 (428)
||+|.+||.+.
T Consensus 3931 gyTG~~Ce~~G 3941 (4289)
T KOG1219|consen 3931 GYTGKRCEARG 3941 (4289)
T ss_pred CccCceeeccc
Confidence 99999998873
No 6
>PF00058 Ldl_recept_b: Low-density lipoprotein receptor repeat class B; InterPro: IPR000033 The low-density lipoprotein receptor (LDLR) binds LDL, the major cholesterol-carrying lipoprotein of plasma, and transports it into cells by acidic endocytosis, thereby regulating cholesterol homeostasis in mammalian cells. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein with five regions. The first, the ligand-binding domain, contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor2); these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins []. This entry represents the LDLR classB (YWTD) repeat, the structure of which has been solved []. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 24 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 24 from the first sequence repeat. The repeat is found in a variety of proteins that include, vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor [], preproepidermal growth factor, and nidogen (entactin).; PDB: 3S2K_A 3S8Z_A 3S8V_B 4A0P_A 3SOB_B 3S94_B 4DG6_A 3SOV_A 3SOQ_A 1NPE_A ....
Probab=99.25 E-value=9.9e-12 Score=82.89 Aligned_cols=40 Identities=38% Similarity=0.831 Sum_probs=38.3
Q ss_pred CEEEeeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcC
Q psy951 1 MFWAETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDA 40 (428)
Q Consensus 1 lyWtd~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~ 40 (428)
|||||++..++|++++|||+++++++++++.+|+|||||+
T Consensus 3 iYWtD~~~~~~I~~a~~dGs~~~~vi~~~l~~P~giaVD~ 42 (42)
T PF00058_consen 3 IYWTDWSQDPSIERANLDGSNRRTVISDDLQHPEGIAVDW 42 (42)
T ss_dssp EEEEETTTTEEEEEEETTSTSEEEEEESSTSSEEEEEEET
T ss_pred EEEEECCCCcEEEEEECCCCCeEEEEECCCCCcCEEEECC
Confidence 7999999887999999999999999999999999999996
No 7
>PF08450 SGL: SMP-30/Gluconolactonase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE.; PDB: 2GHS_A 2DG0_L 2DG1_D 2DSO_D 3E5Z_A 2IAT_A 2IAV_A 2GVV_A 3HLI_A 2GVU_A ....
Probab=99.25 E-value=1.5e-09 Score=102.09 Aligned_cols=99 Identities=14% Similarity=0.083 Sum_probs=75.7
Q ss_pred ccccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccC--CceEEEEe--c-cccceeeeeeccCCeEEEEeCCCCcE
Q psy951 319 VISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQK--GQVQAVIS--D-ERRIEALDIDPVDEIIYWVDSYDRNI 393 (428)
Q Consensus 319 ~i~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~--~~~~~i~~--~-~~~p~glavD~~~~~lYwtd~~~~~I 393 (428)
+.+++..|-||++++..+.||++|...++|+|..++.... ...++++. . ...|.||++|. .++||.++.+...|
T Consensus 129 ~~~~~~~pNGi~~s~dg~~lyv~ds~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~g~pDG~~vD~-~G~l~va~~~~~~I 207 (246)
T PF08450_consen 129 VADGLGFPNGIAFSPDGKTLYVADSFNGRIWRFDLDADGGELSNRRVFIDFPGGPGYPDGLAVDS-DGNLWVADWGGGRI 207 (246)
T ss_dssp EEEEESSEEEEEEETTSSEEEEEETTTTEEEEEEEETTTCCEEEEEEEEE-SSSSCEEEEEEEBT-TS-EEEEEETTTEE
T ss_pred EecCcccccceEECCcchheeecccccceeEEEeccccccceeeeeeEEEcCCCCcCCCcceEcC-CCCEEEEEcCCCEE
Confidence 4566788999999999999999999999999999975311 22344443 2 34699999998 56899999999999
Q ss_pred EEEEccCcceeEEEecccCCCCCCCccccc
Q psy951 394 RRSFMLEAQKGQVQAGASRHQNGVPASSQR 423 (428)
Q Consensus 394 ~~~~~~g~~~~~l~~~~~~~~~~~p~~~~~ 423 (428)
.+.+.+|+..+++. -|...|.+++.
T Consensus 208 ~~~~p~G~~~~~i~-----~p~~~~t~~~f 232 (246)
T PF08450_consen 208 VVFDPDGKLLREIE-----LPVPRPTNCAF 232 (246)
T ss_dssp EEEETTSCEEEEEE------SSSSEEEEEE
T ss_pred EEECCCccEEEEEc-----CCCCCEEEEEE
Confidence 99999998766654 44446666665
No 8
>KOG1219|consensus
Probab=99.24 E-value=6e-12 Score=138.53 Aligned_cols=112 Identities=27% Similarity=0.596 Sum_probs=79.0
Q ss_pred CCCCCCCC--ccccCCCcceecCCCCCCCCCCCCCCcccccC-ccCCCCCCCCCCCCCCCEEeeCCCCceEecCCCCCcc
Q psy951 139 CSQSPCSH--LCLVIPGGYQCACPENATPKLPGVAEIRCSAA-VERPRPLPRVCQCQNGGMCAESETGDLTCNCRQDFAG 215 (428)
Q Consensus 139 C~~~~C~~--~C~~~~~~~~C~C~~g~~~~~~~~~g~~C~~~-~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~gy~G 215 (428)
|..+||.. .|.+.+++|.|.||.|| +|.+|+.. ...|... +|.+||.|++.+ |+|.|.|.+||.|
T Consensus 3906 C~snPC~~GgtCip~~n~f~CnC~~gy-------TG~~Ce~~Gi~eCs~n----~C~~gg~C~n~~-gsf~CncT~g~~g 3973 (4289)
T KOG1219|consen 3906 CASNPCLTGGTCIPFYNGFLCNCPNGY-------TGKRCEARGISECSKN----VCGTGGQCINIP-GSFHCNCTPGILG 3973 (4289)
T ss_pred ccCCCCCCCCEEEecCCCeeEeCCCCc-------cCceeecccccccccc----cccCCceeeccC-CceEeccChhHhc
Confidence 45667877 89999999999999999 78999876 4434333 899999999999 7899999999999
Q ss_pred CccCCCCCCCC-CCCc-hhhHHHHHHHHHHHHHHHHHhhheeecCCCCC
Q psy951 216 TFCENYTGIGQ-GLTL-GRSLLYIPTLLLLLALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 216 ~~Ce~~~~~~~-~~~~-~~~~~~~~i~~~~~~~~~~~~~~~~r~~~~~~ 262 (428)
..|+...+... ...+ |...+++.+++++++++++++|+.+||+.++|
T Consensus 3974 r~c~~~~pni~~~~~~~gkaEli~I~V~l~~ifilvvlf~~crKk~~rk 4022 (4289)
T KOG1219|consen 3974 RTCCAEKPNILSTVLWLGKAELIIIIVLLALIFILVVLFWKCRKKNSRK 4022 (4289)
T ss_pred ccCccccCccccccchhcccceeehhHHHHHHHHHHHHHHhhhhhccCC
Confidence 99977665431 1222 21222222233333444455777777777555
No 9
>PLN02919 haloacid dehalogenase-like hydrolase family protein
Probab=99.23 E-value=3.6e-09 Score=118.03 Aligned_cols=103 Identities=17% Similarity=0.291 Sum_probs=79.6
Q ss_pred CEEEeeCCCCeEEEEEcCCCCcEEEEeC-------------CCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeE
Q psy951 1 MFWAETGASPRIESAWMDGSHRRSLVMT-------------GVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQ 67 (428)
Q Consensus 1 lyWtd~~~~~~I~~a~~DG~~~~~l~~~-------------~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~ 67 (428)
||++|.+.. +|.+.+.+|.....+-.. .+..|.||++|..++ .||++|...++|.++++.+...+
T Consensus 582 lyVaDs~n~-rI~v~d~~G~~i~~ig~~g~~G~~dG~~~~a~f~~P~GIavd~~gn-~LYVaDt~n~~Ir~id~~~~~V~ 659 (1057)
T PLN02919 582 LFISDSNHN-RIVVTDLDGNFIVQIGSTGEEGLRDGSFEDATFNRPQGLAYNAKKN-LLYVADTENHALREIDFVNETVR 659 (1057)
T ss_pred EEEEECCCC-eEEEEeCCCCEEEEEccCCCcCCCCCchhccccCCCcEEEEeCCCC-EEEEEeCCCceEEEEecCCCEEE
Confidence 689998877 899999998754443321 256899999998877 89999999999999999887666
Q ss_pred EEEeCC--------------CCccceeeeecc--ccEEEEEeCCCCceeeeccC
Q psy951 68 TILSGS--------------DKLQHPISLDVF--ENNIYWLARDTGSLYKQDKF 105 (428)
Q Consensus 68 ~i~~~~--------------~~~~~p~~l~~~--~~~lYwtD~~~~~I~~~~~~ 105 (428)
++.... ..+.+|.+|++. ++.||++|..++.|++.+..
T Consensus 660 tlag~G~~g~~~~gg~~~~~~~ln~P~gVa~dp~~g~LyVad~~~~~I~v~d~~ 713 (1057)
T PLN02919 660 TLAGNGTKGSDYQGGKKGTSQVLNSPWDVCFEPVNEKVYIAMAGQHQIWEYNIS 713 (1057)
T ss_pred EEeccCcccCCCCCChhhhHhhcCCCeEEEEecCCCeEEEEECCCCeEEEEECC
Confidence 664310 014578777765 68999999999999888765
No 10
>PF00058 Ldl_recept_b: Low-density lipoprotein receptor repeat class B; InterPro: IPR000033 The low-density lipoprotein receptor (LDLR) is the receptor for LDL, the major cholesterol-carrying lipoprotein of plasma, and acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class B (YWTD) repeat, the structure of which has been solved. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins that include the vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor, preproepidermal growth factor, and nidogen (entactin).; PDB: 3S2K_A 3S8Z_A 3S8V_B 4A0P_A 3SOB_B 3S94_B 4DG6_A 3SOV_A 3SOQ_A 1NPE_A ....
Probab=99.10 E-value=2.3e-10 Score=76.25 Aligned_cols=40 Identities=30% Similarity=0.430 Sum_probs=35.4
Q ss_pred CEEEEEecCCC-eEEEeeeccccCCceEEEEe-ccccceeeeeec
Q psy951 336 EIIYWVDSYDR-NIRRSFMLEAQKGQVQAVIS-DERRIEALDIDP 378 (428)
Q Consensus 336 ~~lyWtd~~~~-~I~ra~l~g~~~~~~~~i~~-~~~~p~glavD~ 378 (428)
++|||||...+ +|.++.|||+ ++++++. ++..|.||||||
T Consensus 1 ~~iYWtD~~~~~~I~~a~~dGs---~~~~vi~~~l~~P~giaVD~ 42 (42)
T PF00058_consen 1 GKIYWTDWSQDPSIERANLDGS---NRRTVISDDLQHPEGIAVDW 42 (42)
T ss_dssp TEEEEEETTTTEEEEEEETTST---SEEEEEESSTSSEEEEEEET
T ss_pred CEEEEEECCCCcEEEEEECCCC---CeEEEEECCCCCcCEEEECC
Confidence 58999999999 9999999994 4666664 999999999997
No 11
>PLN02919 haloacid dehalogenase-like hydrolase family protein
Probab=98.99 E-value=5.9e-08 Score=108.43 Aligned_cols=80 Identities=24% Similarity=0.303 Sum_probs=63.4
Q ss_pred EeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEe-CC----------CCccceeeeecc--ccEEEEE
Q psy951 26 VMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILS-GS----------DKLQHPISLDVF--ENNIYWL 92 (428)
Q Consensus 26 ~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~-~~----------~~~~~p~~l~~~--~~~lYwt 92 (428)
+.+.+..|.|+++|..++ +||++|...++|.+.+++|.-...+.. +. ..+.+|.+|++. ++.||++
T Consensus 563 ~~s~l~~P~gvavd~~~g-~lyVaDs~n~rI~v~d~~G~~i~~ig~~g~~G~~dG~~~~a~f~~P~GIavd~~gn~LYVa 641 (1057)
T PLN02919 563 LTSPLKFPGKLAIDLLNN-RLFISDSNHNRIVVTDLDGNFIVQIGSTGEEGLRDGSFEDATFNRPQGLAYNAKKNLLYVA 641 (1057)
T ss_pred ccccCCCCceEEEECCCC-eEEEEECCCCeEEEEeCCCCEEEEEccCCCcCCCCCchhccccCCCcEEEEeCCCCEEEEE
Confidence 345789999999998888 899999999999999999976554443 11 125578888875 5679999
Q ss_pred eCCCCceeeeccCC
Q psy951 93 ARDTGSLYKQDKFG 106 (428)
Q Consensus 93 D~~~~~I~~~~~~g 106 (428)
|..++.|.+.+..+
T Consensus 642 Dt~n~~Ir~id~~~ 655 (1057)
T PLN02919 642 DTENHALREIDFVN 655 (1057)
T ss_pred eCCCceEEEEecCC
Confidence 99998888887654
No 12
>KOG4289|consensus
Probab=98.73 E-value=6.4e-09 Score=111.69 Aligned_cols=83 Identities=31% Similarity=0.777 Sum_probs=67.6
Q ss_pred CCCCCCCCCCCC--ccccC---------------------C-CcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCC
Q psy951 135 APNPCSQSPCSH--LCLVI---------------------P-GGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQ 190 (428)
Q Consensus 135 ~~n~C~~~~C~~--~C~~~---------------------~-~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~ 190 (428)
.-|.|...||+| .|+.. | ++..|.||+|| +|..|+...+.|... +
T Consensus 1178 dDniClrEPCenymkCvsvlrFdssapf~~s~s~lfRpi~pvnglrCrCPpGF-------Tgd~CeTeiDlCYs~----p 1246 (2531)
T KOG4289|consen 1178 DDNICLREPCENYMKCVSVLRFDSSAPFLASDSVLFRPIHPVNGLRCRCPPGF-------TGDYCETEIDLCYSG----P 1246 (2531)
T ss_pred cCchhhcchhHHHHhhhhheeecccCccccccceeeeeccccCceeEeCCCCC-------CcccccchhHhhhcC----C
Confidence 357899999999 77543 1 58999999999 678898877655544 8
Q ss_pred CCCCCEEeeCCCCceEecCCCCCccCccCCCCCCCCCCC
Q psy951 191 CQNGGMCAESETGDLTCNCRQDFAGTFCENYTGIGQGLT 229 (428)
Q Consensus 191 C~ngg~C~~~~~g~~~C~C~~gy~G~~Ce~~~~~~~~~~ 229 (428)
|.|+|.|...+ |.|+|.|.+||+|.+||.......+.+
T Consensus 1247 C~nng~C~srE-ggYtCeCrpg~tGehCEvs~~agrCvp 1284 (2531)
T KOG4289|consen 1247 CGNNGRCRSRE-GGYTCECRPGFTGEHCEVSARAGRCVP 1284 (2531)
T ss_pred CCCCCceEEec-CceeEEecCCccccceeeecccCcccc
Confidence 99999999998 569999999999999999876655444
No 13
>smart00135 LY Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Probab=98.72 E-value=2.5e-08 Score=66.64 Aligned_cols=40 Identities=15% Similarity=0.194 Sum_probs=36.5
Q ss_pred EEeccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcce
Q psy951 364 VISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQK 403 (428)
Q Consensus 364 i~~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~~ 403 (428)
+..+++.|+|||+||.+++|||+|.....|++++++|..+
T Consensus 4 ~~~~~~~~~~la~d~~~~~lYw~D~~~~~I~~~~~~g~~~ 43 (43)
T smart00135 4 LSEGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLDGTNR 43 (43)
T ss_pred EECCCCCcCEEEEeecCCEEEEEeCCCCEEEEEeCCCCCC
Confidence 3458999999999999999999999999999999999753
No 14
>smart00135 LY Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Probab=98.65 E-value=7.4e-08 Score=64.32 Aligned_cols=42 Identities=45% Similarity=0.776 Sum_probs=38.3
Q ss_pred EEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCe
Q psy951 24 SLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNR 66 (428)
Q Consensus 24 ~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~ 66 (428)
+++.+++..|+|||+|+.++ +|||+|.....|+++++||+++
T Consensus 2 ~~~~~~~~~~~~la~d~~~~-~lYw~D~~~~~I~~~~~~g~~~ 43 (43)
T smart00135 2 TLLSEGLGHPNGLAVDWIEG-RLYWTDWGLDVIEVANLDGTNR 43 (43)
T ss_pred EEEECCCCCcCEEEEeecCC-EEEEEeCCCCEEEEEeCCCCCC
Confidence 45667899999999999999 9999999999999999999864
No 15
>KOG4289|consensus
Probab=98.53 E-value=7.7e-08 Score=103.69 Aligned_cols=85 Identities=29% Similarity=0.702 Sum_probs=69.6
Q ss_pred cCCCCCCCCCCCCCCC--ccccCCCcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEec
Q psy951 131 YNTSAPNPCSQSPCSH--LCLVIPGGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCN 208 (428)
Q Consensus 131 ~~~~~~n~C~~~~C~~--~C~~~~~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~ 208 (428)
+|+..++.|.+++|.+ .|....++|+|.|.+|| .|..|+.....-.|.+. .|.|||+|++..+|.+.|.
T Consensus 1234 ~CeTeiDlCYs~pC~nng~C~srEggYtCeCrpg~-------tGehCEvs~~agrCvpG--vC~nggtC~~~~nggf~c~ 1304 (2531)
T KOG4289|consen 1234 YCETEIDLCYSGPCGNNGRCRSREGGYTCECRPGF-------TGEHCEVSARAGRCVPG--VCKNGGTCVNLLNGGFCCH 1304 (2531)
T ss_pred cccchhHhhhcCCCCCCCceEEecCceeEEecCCc-------cccceeeecccCccccc--eecCCCEEeecCCCceecc
Confidence 5666788999999988 99999999999999999 68899865443222222 6889999999999999999
Q ss_pred CCCC-CccCccCCCCCC
Q psy951 209 CRQD-FAGTFCENYTGI 224 (428)
Q Consensus 209 C~~g-y~G~~Ce~~~~~ 224 (428)
||.| |++.+||....+
T Consensus 1305 Cp~ge~e~prC~v~trS 1321 (2531)
T KOG4289|consen 1305 CPYGEFEDPRCEVTTRS 1321 (2531)
T ss_pred CCCcccCCCceEEEeec
Confidence 9997 679999986544
No 16
>PF08450 SGL: SMP-30/Gluconolactonase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE.; PDB: 2GHS_A 2DG0_L 2DG1_D 2DSO_D 3E5Z_A 2IAT_A 2IAV_A 2GVV_A 3HLI_A 2GVU_A ....
Probab=98.34 E-value=7.8e-06 Score=76.73 Aligned_cols=121 Identities=20% Similarity=0.244 Sum_probs=87.8
Q ss_pred CEEEeeCCC-------CeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCC-----CeEE
Q psy951 1 MFWAETGAS-------PRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGR-----NRQT 68 (428)
Q Consensus 1 lyWtd~~~~-------~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~-----~~~~ 68 (428)
||.||.+.. ++|.|.+.+|+- +.+ ..++..|+||++++.++ .||++|...++|.+.+++.. ++++
T Consensus 99 ly~t~~~~~~~~~~~~g~v~~~~~~~~~-~~~-~~~~~~pNGi~~s~dg~-~lyv~ds~~~~i~~~~~~~~~~~~~~~~~ 175 (246)
T PF08450_consen 99 LYVTDSGGGGASGIDPGSVYRIDPDGKV-TVV-ADGLGFPNGIAFSPDGK-TLYVADSFNGRIWRFDLDADGGELSNRRV 175 (246)
T ss_dssp EEEEEECCBCTTCGGSEEEEEEETTSEE-EEE-EEEESSEEEEEEETTSS-EEEEEETTTTEEEEEEEETTTCCEEEEEE
T ss_pred EEEEecCCCccccccccceEEECCCCeE-EEE-ecCcccccceEECCcch-heeecccccceeEEEeccccccceeeeee
Confidence 477776541 469999999543 333 44689999999999988 99999999999999999743 3466
Q ss_pred EEeCCCCccceeeeecc-ccEEEEEeCCCCceeeeccCCCCceeeeccCCCCCccccc
Q psy951 69 ILSGSDKLQHPISLDVF-ENNIYWLARDTGSLYKQDKFGRGVPVLISKDLVNPSGVKA 125 (428)
Q Consensus 69 i~~~~~~~~~p~~l~~~-~~~lYwtD~~~~~I~~~~~~g~~~~~~~~~~~~~p~~I~v 125 (428)
+.........|-++++. +++||.+++..++|.+.+.+|. ....+......|+.++.
T Consensus 176 ~~~~~~~~g~pDG~~vD~~G~l~va~~~~~~I~~~~p~G~-~~~~i~~p~~~~t~~~f 232 (246)
T PF08450_consen 176 FIDFPGGPGYPDGLAVDSDGNLWVADWGGGRIVVFDPDGK-LLREIELPVPRPTNCAF 232 (246)
T ss_dssp EEE-SSSSCEEEEEEEBTTS-EEEEEETTTEEEEEETTSC-EEEEEE-SSSSEEEEEE
T ss_pred EEEcCCCCcCCCcceEcCCCCEEEEEcCCCEEEEECCCcc-EEEEEcCCCCCEEEEEE
Confidence 65543233468888876 7899999999999999999954 34444444556666665
No 17
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=98.12 E-value=5.1e-06 Score=67.09 Aligned_cols=33 Identities=21% Similarity=0.686 Sum_probs=26.8
Q ss_pred CCCCCCEEeeCC-CCceEecCCCCCccCccCCCCC
Q psy951 190 QCQNGGMCAESE-TGDLTCNCRQDFAGTFCENYTG 223 (428)
Q Consensus 190 ~C~ngg~C~~~~-~g~~~C~C~~gy~G~~Ce~~~~ 223 (428)
.|.|| .|.... -..+.|.|+.||+|.+||....
T Consensus 52 YClHG-~C~yI~dl~~~~CrC~~GYtGeRCEh~dL 85 (139)
T PHA03099 52 YCLHG-DCIHARDIDGMYCRCSHGYTGIRCQHVVL 85 (139)
T ss_pred EeECC-EEEeeccCCCceeECCCCcccccccceee
Confidence 58886 897655 2569999999999999998663
No 18
>COG3386 Gluconolactonase [Carbohydrate transport and metabolism]
Probab=97.93 E-value=0.0029 Score=61.24 Aligned_cols=83 Identities=16% Similarity=0.135 Sum_probs=62.5
Q ss_pred ccceeEEeeccCCCEEEEEecCCCeEEEeeec---cccCCceEEEEe--ccccceeeeeeccCCeEEEEeC--CCCcEEE
Q psy951 323 ERRIEALDIDPVDEIIYWVDSYDRNIRRSFML---EAQKGQVQAVIS--DERRIEALDIDPVDEIIYWVDS--YDRNIRR 395 (428)
Q Consensus 323 ~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~---g~~~~~~~~i~~--~~~~p~glavD~~~~~lYwtd~--~~~~I~~ 395 (428)
+..+=+|+++|.+..||++|...++|+|..++ |.-.+....+.. .-+.|.|+++|..+ .||.-. +...|.+
T Consensus 162 ~~~~NGla~SpDg~tly~aDT~~~~i~r~~~d~~~g~~~~~~~~~~~~~~~G~PDG~~vDadG--~lw~~a~~~g~~v~~ 239 (307)
T COG3386 162 LTIPNGLAFSPDGKTLYVADTPANRIHRYDLDPATGPIGGRRGFVDFDEEPGLPDGMAVDADG--NLWVAAVWGGGRVVR 239 (307)
T ss_pred EEecCceEECCCCCEEEEEeCCCCeEEEEecCcccCccCCcceEEEccCCCCCCCceEEeCCC--CEEEecccCCceEEE
Confidence 66777999999999999999999999999998 543233223333 45899999999654 666432 3448999
Q ss_pred EEccCcceeEEE
Q psy951 396 SFMLEAQKGQVQ 407 (428)
Q Consensus 396 ~~~~g~~~~~l~ 407 (428)
-+.+|++.+++.
T Consensus 240 ~~pdG~l~~~i~ 251 (307)
T COG3386 240 FNPDGKLLGEIK 251 (307)
T ss_pred ECCCCcEEEEEE
Confidence 999988777664
No 19
>PF03088 Str_synth: Strictosidine synthase; InterPro: IPR018119 This entry represents a conserved region found in strictosidine synthase (EC 4.3.3.2), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom.; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2FPB_A 2V91_B 2FP8_A 3V1S_B 2FPC_A 2VAQ_A 2FP9_B.
Probab=97.93 E-value=8.2e-05 Score=57.90 Aligned_cols=70 Identities=16% Similarity=0.232 Sum_probs=55.3
Q ss_pred EEeeccCCCEEEEEecCCC-----------------eEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeCCC
Q psy951 328 ALDIDPVDEIIYWVDSYDR-----------------NIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDSYD 390 (428)
Q Consensus 328 ~l~~d~~~~~lyWtd~~~~-----------------~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~ 390 (428)
++|++..++.||+||..++ ++.|-.... .+.+++..+|..|+|+|+.....-|..+....
T Consensus 2 dldv~~~~g~vYfTdsS~~~~~~~~~~~~le~~~~GRll~ydp~t---~~~~vl~~~L~fpNGVals~d~~~vlv~Et~~ 78 (89)
T PF03088_consen 2 DLDVDQDTGTVYFTDSSSRYDRRDWVYDLLEGRPTGRLLRYDPST---KETTVLLDGLYFPNGVALSPDESFVLVAETGR 78 (89)
T ss_dssp EEEE-TTT--EEEEES-SS--TTGHHHHHHHT---EEEEEEETTT---TEEEEEEEEESSEEEEEE-TTSSEEEEEEGGG
T ss_pred ceeEecCCCEEEEEeCccccCccceeeeeecCCCCcCEEEEECCC---CeEEEehhCCCccCeEEEcCCCCEEEEEeccC
Confidence 6899999999999998654 555655544 55677888999999999999999999999999
Q ss_pred CcEEEEEccC
Q psy951 391 RNIRRSFMLE 400 (428)
Q Consensus 391 ~~I~~~~~~g 400 (428)
.+|.+-.+.|
T Consensus 79 ~Ri~rywl~G 88 (89)
T PF03088_consen 79 YRILRYWLKG 88 (89)
T ss_dssp TEEEEEESSS
T ss_pred ceEEEEEEeC
Confidence 9999988877
No 20
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=97.92 E-value=3.2e-06 Score=53.82 Aligned_cols=28 Identities=36% Similarity=0.969 Sum_probs=23.0
Q ss_pred CCCCCCccccCCCcceecCCCCCCCCCCC
Q psy951 141 QSPCSHLCLVIPGGYQCACPENATPKLPG 169 (428)
Q Consensus 141 ~~~C~~~C~~~~~~~~C~C~~g~~~~~~~ 169 (428)
+++|+|+|++.+++|+|.|++|| .|.++
T Consensus 5 NGgC~h~C~~~~g~~~C~C~~Gy-~L~~D 32 (36)
T PF14670_consen 5 NGGCSHICVNTPGSYRCSCPPGY-KLAED 32 (36)
T ss_dssp GGGSSSEEEEETTSEEEE-STTE-EE-TT
T ss_pred CCCcCCCCccCCCceEeECCCCC-EECcC
Confidence 35899999999999999999999 77665
No 21
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=97.89 E-value=6.3e-06 Score=51.28 Aligned_cols=28 Identities=39% Similarity=0.968 Sum_probs=25.2
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCccCc
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAGTF 217 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G~~ 217 (428)
+|+|+|+|+....+.|.|.|++||+|++
T Consensus 5 ~C~n~g~C~~~~~~~y~C~C~~G~~G~~ 32 (32)
T PF00008_consen 5 PCQNGGTCIDLPGGGYTCECPPGYTGKR 32 (32)
T ss_dssp SSTTTEEEEEESTSEEEEEEBTTEESTT
T ss_pred cCCCCeEEEeCCCCCEEeECCCCCccCC
Confidence 7999999998886789999999999964
No 22
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonase (6PGL; EC 3.1.1.31), which hydrolyses 6-phosphogluconolactone to 6-phosphogluconate, is one of the enzymes in the pentose phosphate pathway. Two families of structurally dissimilar 6PGLs are known to exist: the Escherichia coli (strain K12) YbhE (InterPro: IPR022528) and the Pseudomonas aeruginosa DevB (InterPro: IPR005900) types. This entry contains bacterial YbhE-type 6-phosphogluconolactonases (6PGL; EC 3.1.1.31), which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonizing enzyme carboxy-cis,cis-muconate cyclase (EC 5.5.1.5) and muconate cycloisomerase (EC 5.5.1.1), which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures have been reported for the E. coli 6-phosphogluconolactonase and Neurospora crassa muconate cycloisomerase. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold.; PDB: 3SCY_A 1L0Q_A 3HFQ_B 3FGB_A 1RI6_A 3U4Y_A 3BWS_A 1JOF_H.
Probab=97.87 E-value=0.013 Score=57.82 Aligned_cols=104 Identities=13% Similarity=0.100 Sum_probs=72.3
Q ss_pred ccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEE--Eec-------cccceeeeeeccCCeEEEEeCCCC
Q psy951 321 SDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAV--ISD-------ERRIEALDIDPVDEIIYWVDSYDR 391 (428)
Q Consensus 321 ~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i--~~~-------~~~p~glavD~~~~~lYwtd~~~~ 391 (428)
+.-..|+.++|+|..+++|.+...+.+|....++.. ++..+.+ ++- ...|.+|+++..++.||.+..+.+
T Consensus 189 ~~G~GPRh~~f~pdg~~~Yv~~e~s~~v~v~~~~~~-~g~~~~~~~~~~~~~~~~~~~~~~~i~ispdg~~lyvsnr~~~ 267 (345)
T PF10282_consen 189 PPGSGPRHLAFSPDGKYAYVVNELSNTVSVFDYDPS-DGSLTEIQTISTLPEGFTGENAPAEIAISPDGRFLYVSNRGSN 267 (345)
T ss_dssp STTSSEEEEEE-TTSSEEEEEETTTTEEEEEEEETT-TTEEEEEEEEESCETTSCSSSSEEEEEE-TTSSEEEEEECTTT
T ss_pred ccCCCCcEEEEcCCcCEEEEecCCCCcEEEEeeccc-CCceeEEEEeeeccccccccCCceeEEEecCCCEEEEEeccCC
Confidence 455679999999999999999988898888888732 1323332 111 127999999999999999999999
Q ss_pred cEEEEEccCcceeEEEecccCCCCCCCccccccc
Q psy951 392 NIRRSFMLEAQKGQVQAGASRHQNGVPASSQRNL 425 (428)
Q Consensus 392 ~I~~~~~~g~~~~~l~~~~~~~~~~~p~~~~~~~ 425 (428)
.|.+-.+|....+.-....+...-..||.++++-
T Consensus 268 sI~vf~~d~~~g~l~~~~~~~~~G~~Pr~~~~s~ 301 (345)
T PF10282_consen 268 SISVFDLDPATGTLTLVQTVPTGGKFPRHFAFSP 301 (345)
T ss_dssp EEEEEEECTTTTTEEEEEEEEESSSSEEEEEE-T
T ss_pred EEEEEEEecCCCceEEEEEEeCCCCCccEEEEeC
Confidence 9999999654221111223344455688888753
No 23
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonase (6PGL; EC 3.1.1.31), which hydrolyses 6-phosphogluconolactone to 6-phosphogluconate, is one of the enzymes in the pentose phosphate pathway. Two families of structurally dissimilar 6PGLs are known to exist: the Escherichia coli (strain K12) YbhE (InterPro: IPR022528) and the Pseudomonas aeruginosa DevB (InterPro: IPR005900) types. This entry contains bacterial YbhE-type 6-phosphogluconolactonases (6PGL; EC 3.1.1.31), which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonizing enzyme carboxy-cis,cis-muconate cyclase (EC 5.5.1.5) and muconate cycloisomerase (EC 5.5.1.1), which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures have been reported for the E. coli 6-phosphogluconolactonase and Neurospora crassa muconate cycloisomerase. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold.; PDB: 3SCY_A 1L0Q_A 3HFQ_B 3FGB_A 1RI6_A 3U4Y_A 3BWS_A 1JOF_H.
Probab=97.86 E-value=0.032 Score=55.14 Aligned_cols=77 Identities=12% Similarity=0.133 Sum_probs=60.8
Q ss_pred cceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEE--Ee-ccccceeeeeeccCCeEEEEeCCCCcEEEEEccC
Q psy951 324 RRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAV--IS-DERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLE 400 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i--~~-~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g 400 (428)
..+.+|.++|..++||-++++.+.|--..+++. ++..+.+ +. .-..|.++++|..++.||.+....+.|.+-..|.
T Consensus 245 ~~~~~i~ispdg~~lyvsnr~~~sI~vf~~d~~-~g~l~~~~~~~~~G~~Pr~~~~s~~g~~l~Va~~~s~~v~vf~~d~ 323 (345)
T PF10282_consen 245 NAPAEIAISPDGRFLYVSNRGSNSISVFDLDPA-TGTLTLVQTVPTGGKFPRHFAFSPDGRYLYVANQDSNTVSVFDIDP 323 (345)
T ss_dssp SSEEEEEE-TTSSEEEEEECTTTEEEEEEECTT-TTTEEEEEEEEESSSSEEEEEE-TTSSEEEEEETTTTEEEEEEEET
T ss_pred CCceeEEEecCCCEEEEEeccCCEEEEEEEecC-CCceEEEEEEeCCCCCccEEEEeCCCCEEEEEecCCCeEEEEEEeC
Confidence 478899999999999999999998888888653 2334433 22 5667999999999999999999999999988764
Q ss_pred c
Q psy951 401 A 401 (428)
Q Consensus 401 ~ 401 (428)
.
T Consensus 324 ~ 324 (345)
T PF10282_consen 324 D 324 (345)
T ss_dssp T
T ss_pred C
Confidence 3
No 24
>KOG4659|consensus
Probab=97.73 E-value=0.0016 Score=71.22 Aligned_cols=92 Identities=18% Similarity=0.266 Sum_probs=66.7
Q ss_pred CCeEEEEEc-CCCCcEEEEe--------CCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCc--c
Q psy951 9 SPRIESAWM-DGSHRRSLVM--------TGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKL--Q 77 (428)
Q Consensus 9 ~~~I~~a~~-DG~~~~~l~~--------~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~--~ 77 (428)
.|+|....| ||-.|.+-.. -.+..|..||.-+... ||.-| -+.|.++..||.-+.++--+.... .
T Consensus 334 ~Prvitt~mgdG~qR~veC~~C~G~a~~~~L~aPvala~a~DGS--l~VGD--fNyIRRI~~dg~v~tIl~L~~t~~sh~ 409 (1899)
T KOG4659|consen 334 EPRVITTAMGDGHQRDVECPKCEGKADSISLFAPVALAYAPDGS--LIVGD--FNYIRRISQDGQVSTILTLGLTDTSHS 409 (1899)
T ss_pred CCceEEEeccCcccccccCCCCCCccccceeeceeeEEEcCCCc--EEEcc--chheeeecCCCceEEEEEecCCCccce
Confidence 356665555 6766654321 2477999999987654 99988 688999999999777664332122 2
Q ss_pred ceeeeeccccEEEEEeCCCCceeeecc
Q psy951 78 HPISLDVFENNIYWLARDTGSLYKQDK 104 (428)
Q Consensus 78 ~p~~l~~~~~~lYwtD~~~~~I~~~~~ 104 (428)
+-++++...+.||.+|.....|+|...
T Consensus 410 Yy~AvsPvdgtlyvSdp~s~qv~rv~s 436 (1899)
T KOG4659|consen 410 YYIAVSPVDGTLYVSDPLSKQVWRVSS 436 (1899)
T ss_pred eEEEecCcCceEEecCCCcceEEEecc
Confidence 336778889999999999999998764
No 25
>PRK11028 6-phosphogluconolactonase; Provisional
Probab=97.57 E-value=0.091 Score=51.33 Aligned_cols=97 Identities=12% Similarity=0.014 Sum_probs=66.6
Q ss_pred ceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEE--eccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcc
Q psy951 325 RIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVI--SDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQ 402 (428)
Q Consensus 325 ~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~--~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~ 402 (428)
.+.++.++|..++||=++...+.|.-..++.. ....+++- .-...|.++++++.++.||-+..+.+.|.+-.+|...
T Consensus 229 ~~~~i~~~pdg~~lyv~~~~~~~I~v~~i~~~-~~~~~~~~~~~~~~~p~~~~~~~dg~~l~va~~~~~~v~v~~~~~~~ 307 (330)
T PRK11028 229 WAADIHITPDGRHLYACDRTASLISVFSVSED-GSVLSFEGHQPTETQPRGFNIDHSGKYLIAAGQKSHHISVYEIDGET 307 (330)
T ss_pred cceeEEECCCCCEEEEecCCCCeEEEEEEeCC-CCeEEEeEEEeccccCCceEECCCCCEEEEEEccCCcEEEEEEcCCC
Confidence 45679999999999999888887766666431 11122221 1234799999999999999999988899998876433
Q ss_pred eeEEEecccCCCCCCCccccc
Q psy951 403 KGQVQAGASRHQNGVPASSQR 423 (428)
Q Consensus 403 ~~~l~~~~~~~~~~~p~~~~~ 423 (428)
..-...+.+.. -.-|..|++
T Consensus 308 g~l~~~~~~~~-g~~P~~~~~ 327 (330)
T PRK11028 308 GLLTELGRYAV-GQGPMWVSV 327 (330)
T ss_pred CcEEEcccccc-CCCceEEEE
Confidence 22222233334 456888887
No 26
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=97.32 E-value=0.00017 Score=41.29 Aligned_cols=24 Identities=29% Similarity=0.678 Sum_probs=19.3
Q ss_pred cceecCCCCCCCCCCCCCCcccccCcc
Q psy951 154 GYQCACPENATPKLPGVAEIRCSAAVE 180 (428)
Q Consensus 154 ~~~C~C~~g~~~~~~~~~g~~C~~~~~ 180 (428)
+|+|.|++|| .+.++ +++|.+++|
T Consensus 1 sy~C~C~~Gy-~l~~d--~~~C~DIdE 24 (24)
T PF12662_consen 1 SYTCSCPPGY-QLSPD--GRSCEDIDE 24 (24)
T ss_pred CEEeeCCCCC-cCCCC--CCccccCCC
Confidence 5899999999 77664 689988764
No 27
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=97.28 E-value=0.00027 Score=45.73 Aligned_cols=29 Identities=41% Similarity=1.107 Sum_probs=25.6
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCc-cCccC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFA-GTFCE 219 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~-G~~Ce 219 (428)
+|.++++|.+.. |.|.|.|+.||. |..|+
T Consensus 10 ~C~~~~~C~~~~-g~~~C~C~~g~~~g~~C~ 39 (39)
T smart00179 10 PCQNGGTCVNTV-GSYRCECPPGYTDGRNCE 39 (39)
T ss_pred CcCCCCEeECCC-CCeEeECCCCCccCCcCC
Confidence 588899999887 579999999999 98885
No 28
>PRK11028 6-phosphogluconolactonase; Provisional
Probab=97.12 E-value=0.3 Score=47.63 Aligned_cols=75 Identities=16% Similarity=0.238 Sum_probs=55.5
Q ss_pred cceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEe---------ccccceeeeeeccCCeEEEEeCCCCcEE
Q psy951 324 RRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVIS---------DERRIEALDIDPVDEIIYWVDSYDRNIR 394 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~---------~~~~p~glavD~~~~~lYwtd~~~~~I~ 394 (428)
..|++++|+|..+++|=++...++|..-.++.. +...+.+.. +-.+|.+++++..++.||-++.+.+.|.
T Consensus 175 ~~p~~~~~~pdg~~lyv~~~~~~~v~v~~~~~~-~~~~~~~~~~~~~p~~~~~~~~~~~i~~~pdg~~lyv~~~~~~~I~ 253 (330)
T PRK11028 175 AGPRHMVFHPNQQYAYCVNELNSSVDVWQLKDP-HGEIECVQTLDMMPADFSDTRWAADIHITPDGRHLYACDRTASLIS 253 (330)
T ss_pred CCCceEEECCCCCEEEEEecCCCEEEEEEEeCC-CCCEEEEEEEecCCCcCCCCccceeEEECCCCCEEEEecCCCCeEE
Confidence 458899999999999999988888877777631 112222211 1124567999999999999999888999
Q ss_pred EEEcc
Q psy951 395 RSFML 399 (428)
Q Consensus 395 ~~~~~ 399 (428)
+.+++
T Consensus 254 v~~i~ 258 (330)
T PRK11028 254 VFSVS 258 (330)
T ss_pred EEEEe
Confidence 98874
No 29
>COG3391 Uncharacterized conserved protein [Function unknown]
Probab=97.10 E-value=0.36 Score=48.38 Aligned_cols=97 Identities=14% Similarity=0.169 Sum_probs=67.1
Q ss_pred ccccceeEEeeccCCCEEEEEecCC--CeEEEeeeccccCCceEEE---EeccccceeeeeeccCCeEEEEeCCCCcEEE
Q psy951 321 SDERRIEALDIDPVDEIIYWVDSYD--RNIRRSFMLEAQKGQVQAV---ISDERRIEALDIDPVDEIIYWVDSYDRNIRR 395 (428)
Q Consensus 321 ~~~~~~~~l~~d~~~~~lyWtd~~~--~~I~ra~l~g~~~~~~~~i---~~~~~~p~glavD~~~~~lYwtd~~~~~I~~ 395 (428)
.-...|.++++++...++|=++..+ .++.+..... ...... +..+ .|.++++++.++.+|.++.....+.+
T Consensus 204 ~~~~~P~~i~v~~~g~~~yV~~~~~~~~~v~~id~~~---~~v~~~~~~~~~~-~~~~v~~~p~g~~~yv~~~~~~~V~v 279 (381)
T COG3391 204 GVGTGPAGIAVDPDGNRVYVANDGSGSNNVLKIDTAT---GNVTATDLPVGSG-APRGVAVDPAGKAAYVANSQGGTVSV 279 (381)
T ss_pred ccCCCCceEEECCCCCEEEEEeccCCCceEEEEeCCC---ceEEEeccccccC-CCCceeECCCCCEEEEEecCCCeEEE
Confidence 4455788999999999999999887 4666665554 222222 2356 89999999999999999999888888
Q ss_pred EEccCcceeEEEecccCCCCCCCcccc
Q psy951 396 SFMLEAQKGQVQAGASRHQNGVPASSQ 422 (428)
Q Consensus 396 ~~~~g~~~~~l~~~~~~~~~~~p~~~~ 422 (428)
.+.........+.. ..+....|.+++
T Consensus 280 id~~~~~v~~~~~~-~~~~~~~~~~~~ 305 (381)
T COG3391 280 IDGATDRVVKTGPT-GNEALGEPVSIA 305 (381)
T ss_pred EeCCCCceeeeecc-cccccccceecc
Confidence 87665544444322 344344444443
No 30
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=97.01 E-value=0.00071 Score=43.20 Aligned_cols=29 Identities=41% Similarity=1.107 Sum_probs=25.4
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCccCccC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAGTFCE 219 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G~~Ce 219 (428)
+|.+++.|.+.. +.|.|.|+.||.|..|+
T Consensus 10 ~C~~~~~C~~~~-~~~~C~C~~g~~g~~C~ 38 (38)
T cd00054 10 PCQNGGTCVNTV-GSYRCSCPPGYTGRNCE 38 (38)
T ss_pred CcCCCCEeECCC-CCeEeECCCCCcCCcCC
Confidence 578889999887 56999999999998885
No 31
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=96.80 E-value=0.0015 Score=41.02 Aligned_cols=29 Identities=38% Similarity=1.064 Sum_probs=25.2
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCccC-ccC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAGT-FCE 219 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G~-~Ce 219 (428)
+|.+++.|.+.. +.|.|.|+.||.|. .|+
T Consensus 7 ~C~~~~~C~~~~-~~~~C~C~~g~~g~~~C~ 36 (36)
T cd00053 7 PCSNGGTCVNTP-GSYRCVCPPGYTGDRSCE 36 (36)
T ss_pred CCCCCCEEecCC-CCeEeECCCCCcccCCcC
Confidence 688899999887 56999999999998 764
No 32
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=96.72 E-value=0.0015 Score=42.06 Aligned_cols=28 Identities=32% Similarity=1.014 Sum_probs=24.8
Q ss_pred CCCCCC-CCCCC--ccccCCCcceecCCCCC
Q psy951 136 PNPCSQ-SPCSH--LCLVIPGGYQCACPENA 163 (428)
Q Consensus 136 ~n~C~~-~~C~~--~C~~~~~~~~C~C~~g~ 163 (428)
+++|.. .+|.+ .|.+.+++|.|.|++||
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~ 32 (39)
T smart00179 2 IDECASGNPCQNGGTCVNTVGSYRCECPPGY 32 (39)
T ss_pred cccCcCCCCcCCCCEeECCCCCeEeECCCCC
Confidence 477877 68987 89999999999999999
No 33
>COG3386 Gluconolactonase [Carbohydrate transport and metabolism]
Probab=96.69 E-value=0.018 Score=55.73 Aligned_cols=96 Identities=23% Similarity=0.319 Sum_probs=68.8
Q ss_pred CeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcC---C--CCeEEEEeCCCCccceeeeec
Q psy951 10 PRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHD---G--RNRQTILSGSDKLQHPISLDV 84 (428)
Q Consensus 10 ~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ld---G--~~~~~i~~~~~~~~~p~~l~~ 84 (428)
+.++|.+.+|+..+ ++..++..|+||++++.+. .||++|+..++|.+..+| | .+++..+.....-..|=++.+
T Consensus 143 G~lyr~~p~g~~~~-l~~~~~~~~NGla~SpDg~-tly~aDT~~~~i~r~~~d~~~g~~~~~~~~~~~~~~~G~PDG~~v 220 (307)
T COG3386 143 GSLYRVDPDGGVVR-LLDDDLTIPNGLAFSPDGK-TLYVADTPANRIHRYDLDPATGPIGGRRGFVDFDEEPGLPDGMAV 220 (307)
T ss_pred ceEEEEcCCCCEEE-eecCcEEecCceEECCCCC-EEEEEeCCCCeEEEEecCcccCccCCcceEEEccCCCCCCCceEE
Confidence 46888888666544 4454699999999999999 999999999999999998 3 222322222114567888888
Q ss_pred cccEEEEEe--CCCCceeeeccCCC
Q psy951 85 FENNIYWLA--RDTGSLYKQDKFGR 107 (428)
Q Consensus 85 ~~~~lYwtD--~~~~~I~~~~~~g~ 107 (428)
..+-.||+- +....|.+.+.+|.
T Consensus 221 DadG~lw~~a~~~g~~v~~~~pdG~ 245 (307)
T COG3386 221 DADGNLWVAAVWGGGRVVRFNPDGK 245 (307)
T ss_pred eCCCCEEEecccCCceEEEECCCCc
Confidence 866666643 33448888888854
No 34
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=96.66 E-value=0.00076 Score=44.72 Aligned_cols=28 Identities=29% Similarity=0.910 Sum_probs=24.2
Q ss_pred CCCCCCC--CCC--CccccCCCcceecCCCCC
Q psy951 136 PNPCSQS--PCS--HLCLVIPGGYQCACPENA 163 (428)
Q Consensus 136 ~n~C~~~--~C~--~~C~~~~~~~~C~C~~g~ 163 (428)
+|+|... .|. ..|+++.++|+|.|++||
T Consensus 2 idEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy 33 (42)
T PF07645_consen 2 IDECAEGPHNCPENGTCVNTEGSYSCSCPPGY 33 (42)
T ss_dssp SSTTTTTSSSSSTTSEEEEETTEEEEEESTTE
T ss_pred ccccCCCCCcCCCCCEEEcCCCCEEeeCCCCc
Confidence 5788764 687 399999999999999999
No 35
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=96.58 E-value=0.00084 Score=32.58 Aligned_cols=13 Identities=38% Similarity=1.140 Sum_probs=10.9
Q ss_pred EecCCCCCccCcc
Q psy951 206 TCNCRQDFAGTFC 218 (428)
Q Consensus 206 ~C~C~~gy~G~~C 218 (428)
+|.|++||+|.+|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 5999999999987
No 36
>smart00181 EGF Epidermal growth factor-like domain.
Probab=96.48 E-value=0.0031 Score=39.70 Aligned_cols=28 Identities=39% Similarity=1.079 Sum_probs=23.9
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCcc-CccC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAG-TFCE 219 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G-~~Ce 219 (428)
+|.++ .|.+.. +.|.|.|++||.| ..|+
T Consensus 7 ~C~~~-~C~~~~-~~~~C~C~~g~~g~~~C~ 35 (35)
T smart00181 7 PCSNG-TCINTP-GSYTCSCPPGYTGDKRCE 35 (35)
T ss_pred CCCCC-EEECCC-CCeEeECCCCCccCCccC
Confidence 57788 999885 6799999999999 7774
No 37
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=96.47 E-value=0.00094 Score=41.43 Aligned_cols=25 Identities=48% Similarity=1.399 Sum_probs=22.7
Q ss_pred CCCCCCCC--ccccCC-CcceecCCCCC
Q psy951 139 CSQSPCSH--LCLVIP-GGYQCACPENA 163 (428)
Q Consensus 139 C~~~~C~~--~C~~~~-~~~~C~C~~g~ 163 (428)
|..++|+| +|++.. .+|.|.|++||
T Consensus 1 C~~~~C~n~g~C~~~~~~~y~C~C~~G~ 28 (32)
T PF00008_consen 1 CSSNPCQNGGTCIDLPGGGYTCECPPGY 28 (32)
T ss_dssp TTTTSSTTTEEEEEESTSEEEEEEBTTE
T ss_pred CCCCcCCCCeEEEeCCCCCEEeECCCCC
Confidence 66778988 999999 99999999999
No 38
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=96.47 E-value=0.0065 Score=36.37 Aligned_cols=27 Identities=22% Similarity=0.251 Sum_probs=23.8
Q ss_pred cccceeeeeeccCCeEEEEeCCCCcEEE
Q psy951 368 ERRIEALDIDPVDEIIYWVDSYDRNIRR 395 (428)
Q Consensus 368 ~~~p~glavD~~~~~lYwtd~~~~~I~~ 395 (428)
|..|.|||+| ..++||.+|.+..+|.+
T Consensus 1 f~~P~gvav~-~~g~i~VaD~~n~rV~v 27 (28)
T PF01436_consen 1 FNYPHGVAVD-SDGNIYVADSGNHRVQV 27 (28)
T ss_dssp BSSEEEEEEE-TTSEEEEEECCCTEEEE
T ss_pred CcCCcEEEEe-CCCCEEEEECCCCEEEE
Confidence 4689999999 88999999999988764
No 39
>KOG1520|consensus
Probab=96.44 E-value=0.0097 Score=58.07 Aligned_cols=91 Identities=23% Similarity=0.391 Sum_probs=59.6
Q ss_pred CCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCcccee----eeecc-ccEEEEEeCCC---------
Q psy951 31 RHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPI----SLDVF-ENNIYWLARDT--------- 96 (428)
Q Consensus 31 ~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~----~l~~~-~~~lYwtD~~~--------- 96 (428)
.+|-||+.|..++ .||.+|++.+ +..++..|...+.+.... ...|+ ++++. ++.|||||...
T Consensus 115 GRPLGl~f~~~gg-dL~VaDAYlG-L~~V~p~g~~a~~l~~~~--~G~~~kf~N~ldI~~~g~vyFTDSSsk~~~rd~~~ 190 (376)
T KOG1520|consen 115 GRPLGIRFDKKGG-DLYVADAYLG-LLKVGPEGGLAELLADEA--EGKPFKFLNDLDIDPEGVVYFTDSSSKYDRRDFVF 190 (376)
T ss_pred CCcceEEeccCCC-eEEEEeccee-eEEECCCCCcceeccccc--cCeeeeecCceeEcCCCeEEEeccccccchhheEE
Confidence 6999999999998 8999999888 567788887766665542 33443 34433 89999999643
Q ss_pred --------CceeeeccCCCCceeeeccCCCCCcccccc
Q psy951 97 --------GSLYKQDKFGRGVPVLISKDLVNPSGVKAY 126 (428)
Q Consensus 97 --------~~I~~~~~~g~~~~~~~~~~~~~p~~I~v~ 126 (428)
|++++.++. .....++..++.-|+|+++.
T Consensus 191 a~l~g~~~GRl~~YD~~-tK~~~VLld~L~F~NGlaLS 227 (376)
T KOG1520|consen 191 AALEGDPTGRLFRYDPS-TKVTKVLLDGLYFPNGLALS 227 (376)
T ss_pred eeecCCCccceEEecCc-ccchhhhhhcccccccccCC
Confidence 344444443 22233455555556665543
No 40
>PF06977 SdiA-regulated: SdiA-regulated; InterPro: IPR009722 This entry represents a conserved region approximately 100 residues long within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators []. Some proteins contain the IPR001258 from INTERPRO repeat.; PDB: 3QQZ_A.
Probab=96.35 E-value=0.083 Score=49.52 Aligned_cols=73 Identities=19% Similarity=0.211 Sum_probs=49.8
Q ss_pred CCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCccceeeeeccccEEEEE-eCCCCceeeeccC
Q psy951 30 VRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPISLDVFENNIYWL-ARDTGSLYKQDKF 105 (428)
Q Consensus 30 ~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~~lYwt-D~~~~~I~~~~~~ 105 (428)
...+.||+.|+.++ .||-+.-..+.|...+++|.-.+.+.-. +..-|.+|++.++..|.. +-..+.|.....+
T Consensus 21 ~~e~SGLTy~pd~~-tLfaV~d~~~~i~els~~G~vlr~i~l~--g~~D~EgI~y~g~~~~vl~~Er~~~L~~~~~~ 94 (248)
T PF06977_consen 21 LDELSGLTYNPDTG-TLFAVQDEPGEIYELSLDGKVLRRIPLD--GFGDYEGITYLGNGRYVLSEERDQRLYIFTID 94 (248)
T ss_dssp -S-EEEEEEETTTT-EEEEEETTTTEEEEEETT--EEEEEE-S--S-SSEEEEEE-STTEEEEEETTTTEEEEEEE-
T ss_pred cCCccccEEcCCCC-eEEEEECCCCEEEEEcCCCCEEEEEeCC--CCCCceeEEEECCCEEEEEEcCCCcEEEEEEe
Confidence 34599999999989 9999988899999999999866665433 467789999887655554 4345666665553
No 41
>smart00181 EGF Epidermal growth factor-like domain.
Probab=96.16 E-value=0.0048 Score=38.81 Aligned_cols=26 Identities=42% Similarity=1.244 Sum_probs=22.4
Q ss_pred CCCC-CCCCC-ccccCCCcceecCCCCC
Q psy951 138 PCSQ-SPCSH-LCLVIPGGYQCACPENA 163 (428)
Q Consensus 138 ~C~~-~~C~~-~C~~~~~~~~C~C~~g~ 163 (428)
+|.. .+|.+ .|++..++|.|.|++||
T Consensus 1 ~C~~~~~C~~~~C~~~~~~~~C~C~~g~ 28 (35)
T smart00181 1 ECASGGPCSNGTCINTPGSYTCSCPPGY 28 (35)
T ss_pred CCCCcCCCCCCEEECCCCCeEeECCCCC
Confidence 3555 57888 89999999999999999
No 42
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=96.09 E-value=0.0063 Score=40.28 Aligned_cols=24 Identities=29% Similarity=1.020 Sum_probs=21.7
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCc
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFA 214 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~ 214 (428)
.|..++.|++.. |+|.|.|++||.
T Consensus 11 ~C~~~~~C~N~~-Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 11 NCPENGTCVNTE-GSYSCSCPPGYE 34 (42)
T ss_dssp SSSTTSEEEEET-TEEEEEESTTEE
T ss_pred cCCCCCEEEcCC-CCEEeeCCCCcE
Confidence 577889999998 789999999998
No 43
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
Probab=96.06 E-value=1.5 Score=42.41 Aligned_cols=103 Identities=15% Similarity=0.131 Sum_probs=72.1
Q ss_pred ccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCc---eEEEE------eccccceeeeeeccCCeEEEEeCCCCcE
Q psy951 323 ERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQ---VQAVI------SDERRIEALDIDPVDEIIYWVDSYDRNI 393 (428)
Q Consensus 323 ~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~---~~~i~------~~~~~p~glavD~~~~~lYwtd~~~~~I 393 (428)
-..||-|+|||..++.|-+.--+.+|.-...++. ..+ .|.+. .+-.+...|.|...++-||-.+.+.+.|
T Consensus 190 G~GPRHi~FHpn~k~aY~v~EL~stV~v~~y~~~-~g~~~~lQ~i~tlP~dF~g~~~~aaIhis~dGrFLYasNRg~dsI 268 (346)
T COG2706 190 GAGPRHIVFHPNGKYAYLVNELNSTVDVLEYNPA-VGKFEELQTIDTLPEDFTGTNWAAAIHISPDGRFLYASNRGHDSI 268 (346)
T ss_pred CCCcceEEEcCCCcEEEEEeccCCEEEEEEEcCC-CceEEEeeeeccCccccCCCCceeEEEECCCCCEEEEecCCCCeE
Confidence 3468899999999999999888888888888773 122 23332 1455778899999999999999999999
Q ss_pred EEEEccCcceeEEEecccCCCCCCCcccccccc
Q psy951 394 RRSFMLEAQKGQVQAGASRHQNGVPASSQRNLS 426 (428)
Q Consensus 394 ~~~~~~g~~~~~l~~~~~~~~~~~p~~~~~~~~ 426 (428)
.+-.+|-...+--........--.||+..+|-+
T Consensus 269 ~~f~V~~~~g~L~~~~~~~teg~~PR~F~i~~~ 301 (346)
T COG2706 269 AVFSVDPDGGKLELVGITPTEGQFPRDFNINPS 301 (346)
T ss_pred EEEEEcCCCCEEEEEEEeccCCcCCccceeCCC
Confidence 887766543332222232333334777666643
No 44
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=95.98 E-value=0.0066 Score=37.56 Aligned_cols=26 Identities=27% Similarity=0.764 Sum_probs=22.2
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCccCcc
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAGTFC 218 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G~~C 218 (428)
.|.++|+|+... .+|.|.+||.|..|
T Consensus 7 ~C~~~G~C~~~~---g~C~C~~g~~G~~C 32 (32)
T PF07974_consen 7 ICSGHGTCVSPC---GRCVCDSGYTGPDC 32 (32)
T ss_pred ccCCCCEEeCCC---CEEECCCCCcCCCC
Confidence 588999998663 47999999999887
No 45
>PF03088 Str_synth: Strictosidine synthase; InterPro: IPR018119 This entry represents a conserved region found in strictosidine synthase (4.3.3.2 from EC), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine []. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom [].; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2FPB_A 2V91_B 2FP8_A 3V1S_B 2FPC_A 2VAQ_A 2FP9_B.
Probab=95.98 E-value=0.048 Score=42.49 Aligned_cols=69 Identities=19% Similarity=0.226 Sum_probs=51.6
Q ss_pred eeEEEcCCCCCEEEEEECC-----------------CCeEEEEEcCCCCeEEEEeCCCCccceeeee--ccccEEEEEeC
Q psy951 34 TGLSVDAAMDHTLYWVDSK-----------------LNTIESVRHDGRNRQTILSGSDKLQHPISLD--VFENNIYWLAR 94 (428)
Q Consensus 34 ~glavD~~~~~~lYW~d~~-----------------~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~--~~~~~lYwtD~ 94 (428)
++|+|+..++ .|||+|+. .+++.+.++.....++++.+ +..|-||+ ..+++|..+..
T Consensus 1 ndldv~~~~g-~vYfTdsS~~~~~~~~~~~~le~~~~GRll~ydp~t~~~~vl~~~---L~fpNGVals~d~~~vlv~Et 76 (89)
T PF03088_consen 1 NDLDVDQDTG-TVYFTDSSSRYDRRDWVYDLLEGRPTGRLLRYDPSTKETTVLLDG---LYFPNGVALSPDESFVLVAET 76 (89)
T ss_dssp -EEEE-TTT---EEEEES-SS--TTGHHHHHHHT---EEEEEEETTTTEEEEEEEE---ESSEEEEEE-TTSSEEEEEEG
T ss_pred CceeEecCCC-EEEEEeCccccCccceeeeeecCCCCcCEEEEECCCCeEEEehhC---CCccCeEEEcCCCCEEEEEec
Confidence 5789998878 89999985 35899999998888888886 77776555 55779999998
Q ss_pred CCCceeeeccCC
Q psy951 95 DTGSLYKQDKFG 106 (428)
Q Consensus 95 ~~~~I~~~~~~g 106 (428)
...+|.|.-..|
T Consensus 77 ~~~Ri~rywl~G 88 (89)
T PF03088_consen 77 GRYRILRYWLKG 88 (89)
T ss_dssp GGTEEEEEESSS
T ss_pred cCceEEEEEEeC
Confidence 888888876554
No 46
>PHA02887 EGF-like protein; Provisional
Probab=95.87 E-value=0.0076 Score=48.26 Aligned_cols=31 Identities=26% Similarity=0.707 Sum_probs=24.8
Q ss_pred CCCCCCEEeeCCC-CceEecCCCCCccCccCCC
Q psy951 190 QCQNGGMCAESET-GDLTCNCRQDFAGTFCENY 221 (428)
Q Consensus 190 ~C~ngg~C~~~~~-g~~~C~C~~gy~G~~Ce~~ 221 (428)
.|.| |+|..... ..+.|.|+.||+|.+||..
T Consensus 93 YCiH-G~C~yI~dL~epsCrC~~GYtG~RCE~v 124 (126)
T PHA02887 93 FCIN-GECMNIIDLDEKFCICNKGYTGIRCDEV 124 (126)
T ss_pred EeeC-CEEEccccCCCceeECCCCcccCCCCcc
Confidence 4776 69976552 5689999999999999864
No 47
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=95.83 E-value=0.0092 Score=37.87 Aligned_cols=28 Identities=32% Similarity=1.027 Sum_probs=23.5
Q ss_pred CCCCCC-CCCCC--ccccCCCcceecCCCCC
Q psy951 136 PNPCSQ-SPCSH--LCLVIPGGYQCACPENA 163 (428)
Q Consensus 136 ~n~C~~-~~C~~--~C~~~~~~~~C~C~~g~ 163 (428)
.++|.. .+|.+ .|.+..++|.|.|++||
T Consensus 2 ~~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~ 32 (38)
T cd00054 2 IDECASGNPCQNGGTCVNTVGSYRCSCPPGY 32 (38)
T ss_pred cccCCCCCCcCCCCEeECCCCCeEeECCCCC
Confidence 467776 67864 79999999999999999
No 48
>KOG1225|consensus
Probab=95.82 E-value=0.011 Score=60.70 Aligned_cols=27 Identities=26% Similarity=0.853 Sum_probs=22.6
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCccCccCCC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAGTFCENY 221 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G~~Ce~~ 221 (428)
.|.++|.|+. -+|.|.+||+|..|+..
T Consensus 317 dC~g~G~Ci~-----G~C~C~~Gy~G~~C~~~ 343 (525)
T KOG1225|consen 317 DCSGHGKCID-----GECLCDEGYTGELCIQR 343 (525)
T ss_pred cCCCCCcccC-----CceEeCCCCcCCccccc
Confidence 5888899982 25999999999999886
No 49
>COG3391 Uncharacterized conserved protein [Function unknown]
Probab=95.67 E-value=0.22 Score=50.00 Aligned_cols=121 Identities=14% Similarity=0.111 Sum_probs=82.0
Q ss_pred CcceeeeecCccceeeccccccceeEEeeccCCCEEEEEecC--CCeEEEeeeccccCCceEEEEeccccceeeeeeccC
Q psy951 303 GPEIRAYETHKRRFRDVISDERRIEALDIDPVDEIIYWVDSY--DRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVD 380 (428)
Q Consensus 303 ~~~~~~~~~~~~~~~~~i~~~~~~~~l~~d~~~~~lyWtd~~--~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~ 380 (428)
...+..++.........+.--..|.++++|+..+++|-++.+ +.++....-.. ......+.--..|.|+++|..+
T Consensus 95 ~~~v~vid~~~~~~~~~~~vG~~P~~~~~~~~~~~vYV~n~~~~~~~vsvid~~t---~~~~~~~~vG~~P~~~a~~p~g 171 (381)
T COG3391 95 SNTVSVIDTATNTVLGSIPVGLGPVGLAVDPDGKYVYVANAGNGNNTVSVIDAAT---NKVTATIPVGNTPTGVAVDPDG 171 (381)
T ss_pred CCeEEEEcCcccceeeEeeeccCCceEEECCCCCEEEEEecccCCceEEEEeCCC---CeEEEEEecCCCcceEEECCCC
Confidence 445555565555555544334499999999999999999995 56776665543 2111122222268999999999
Q ss_pred CeEEEEeCCCCcEEEEEccCcceeEEEecccCCCCCCCcccccccc
Q psy951 381 EIIYWVDSYDRNIRRSFMLEAQKGQVQAGASRHQNGVPASSQRNLS 426 (428)
Q Consensus 381 ~~lYwtd~~~~~I~~~~~~g~~~~~l~~~~~~~~~~~p~~~~~~~~ 426 (428)
..+|-+|...+.|.+.+..+.....-..........-|..++++..
T Consensus 172 ~~vyv~~~~~~~v~vi~~~~~~v~~~~~~~~~~~~~~P~~i~v~~~ 217 (381)
T COG3391 172 NKVYVTNSDDNTVSVIDTSGNSVVRGSVGSLVGVGTGPAGIAVDPD 217 (381)
T ss_pred CeEEEEecCCCeEEEEeCCCcceeccccccccccCCCCceEEECCC
Confidence 9999999999999999977665443111113455677888887654
No 50
>TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol.
Probab=95.62 E-value=2 Score=40.30 Aligned_cols=93 Identities=5% Similarity=-0.045 Sum_probs=64.5
Q ss_pred cceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcce
Q psy951 324 RRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQK 403 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~~ 403 (428)
..+.++.+++..+++|.+.....+|..-.+.. .+....+..-+.|.+++++..++.||-+......|.+-++++...
T Consensus 207 ~~~~~i~~s~dg~~~~~~~~~~~~i~v~d~~~---~~~~~~~~~~~~~~~~~~~~~g~~l~~~~~~~~~i~v~d~~~~~~ 283 (300)
T TIGR03866 207 VQPVGIKLTKDGKTAFVALGPANRVAVVDAKT---YEVLDYLLVGQRVWQLAFTPDEKYLLTTNGVSNDVSVIDVAALKV 283 (300)
T ss_pred CCccceEECCCCCEEEEEcCCCCeEEEEECCC---CcEEEEEEeCCCcceEEECCCCCEEEEEcCCCCeEEEEECCCCcE
Confidence 35778999999999999876666665554432 222222232346889999999999998877667899999998776
Q ss_pred eEEEecccCCCCCCCcccccc
Q psy951 404 GQVQAGASRHQNGVPASSQRN 424 (428)
Q Consensus 404 ~~l~~~~~~~~~~~p~~~~~~ 424 (428)
..-+.- + ..|.+||+.
T Consensus 284 ~~~~~~----~-~~~~~~~~~ 299 (300)
T TIGR03866 284 IKSIKV----G-RLPWGVVVR 299 (300)
T ss_pred EEEEEc----c-cccceeEeC
Confidence 444432 2 566777764
No 51
>KOG1225|consensus
Probab=95.47 E-value=0.024 Score=58.16 Aligned_cols=68 Identities=25% Similarity=0.581 Sum_probs=43.7
Q ss_pred CCCCCCCCccccCC--CcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCCCCCccC
Q psy951 139 CSQSPCSHLCLVIP--GGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCRQDFAGT 216 (428)
Q Consensus 139 C~~~~C~~~C~~~~--~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~gy~G~ 216 (428)
|....|.+.|.... ..-+|.|++|| .|..|.... -++.|..++.|++. +|.|++||+|.
T Consensus 247 c~~~~C~~~c~~~g~c~~G~CIC~~Gf-------~G~dC~e~~-------Cp~~cs~~g~~~~g-----~CiC~~g~~G~ 307 (525)
T KOG1225|consen 247 CSTIYCPGGCTGRGQCVEGRCICPPGF-------TGDDCDELV-------CPVDCSGGGVCVDG-----ECICNPGYSGK 307 (525)
T ss_pred cccccCCCCCcccceEeCCeEeCCCCC-------cCCCCCccc-------CCcccCCCceecCC-----EeecCCCcccc
Confidence 44445555554332 23478999999 567776421 12246666777643 69999999999
Q ss_pred ccCCCCCCC
Q psy951 217 FCENYTGIG 225 (428)
Q Consensus 217 ~Ce~~~~~~ 225 (428)
.|+.....+
T Consensus 308 dCs~~~cpa 316 (525)
T KOG1225|consen 308 DCSIRRCPA 316 (525)
T ss_pred ccccccCCc
Confidence 998765443
No 52
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=95.44 E-value=0.035 Score=33.13 Aligned_cols=27 Identities=26% Similarity=0.464 Sum_probs=22.8
Q ss_pred CCCceeEEEcCCCCCEEEEEECCCCeEEE
Q psy951 30 VRHPTGLSVDAAMDHTLYWVDSKLNTIES 58 (428)
Q Consensus 30 ~~~P~glavD~~~~~~lYW~d~~~~~I~~ 58 (428)
+..|.||++| .++ .||-+|....+|.+
T Consensus 1 f~~P~gvav~-~~g-~i~VaD~~n~rV~v 27 (28)
T PF01436_consen 1 FNYPHGVAVD-SDG-NIYVADSGNHRVQV 27 (28)
T ss_dssp BSSEEEEEEE-TTS-EEEEEECCCTEEEE
T ss_pred CcCCcEEEEe-CCC-CEEEEECCCCEEEE
Confidence 3589999999 567 89999999888864
No 53
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=95.07 E-value=0.021 Score=35.53 Aligned_cols=25 Identities=44% Similarity=1.301 Sum_probs=20.5
Q ss_pred CC-CCCCCC--ccccCCCcceecCCCCC
Q psy951 139 CS-QSPCSH--LCLVIPGGYQCACPENA 163 (428)
Q Consensus 139 C~-~~~C~~--~C~~~~~~~~C~C~~g~ 163 (428)
|. ..+|.+ .|.+.+++|.|.|+.||
T Consensus 2 C~~~~~C~~~~~C~~~~~~~~C~C~~g~ 29 (36)
T cd00053 2 CAASNPCSNGGTCVNTPGSYRCVCPPGY 29 (36)
T ss_pred CCCCCCCCCCCEEecCCCCeEeECCCCC
Confidence 44 456763 89988899999999999
No 54
>KOG4659|consensus
Probab=95.04 E-value=0.2 Score=55.73 Aligned_cols=89 Identities=15% Similarity=0.197 Sum_probs=64.3
Q ss_pred ccccceeEEeeccCCCEEEEEecCCCeEEEeee-ccc-cCCceEEEEe---------------------ccccceeeeee
Q psy951 321 SDERRIEALDIDPVDEIIYWVDSYDRNIRRSFM-LEA-QKGQVQAVIS---------------------DERRIEALDID 377 (428)
Q Consensus 321 ~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l-~g~-~~~~~~~i~~---------------------~~~~p~glavD 377 (428)
.+..+---||+||+.+.||-+|-.+++|+|..- .+. ...+.++++- +|..|.|||+|
T Consensus 404 t~~sh~Yy~AvsPvdgtlyvSdp~s~qv~rv~sl~~~d~~~N~evvaG~Ge~Clp~desCGDGalA~dA~L~~PkGIa~d 483 (1899)
T KOG4659|consen 404 TDTSHSYYIAVSPVDGTLYVSDPLSKQVWRVSSLEPQDSRNNYEVVAGDGEVCLPADESCGDGALAQDAQLIFPKGIAFD 483 (1899)
T ss_pred CCccceeEEEecCcCceEEecCCCcceEEEeccCCccccccCeeEEeccCcCccccccccCcchhcccceeccCCceeEc
Confidence 344455569999999999999999999998763 221 1123444431 36799999999
Q ss_pred ccCCeEEEEeCCCCcEEEEEccCcceeEEEecccC
Q psy951 378 PVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAGASR 412 (428)
Q Consensus 378 ~~~~~lYwtd~~~~~I~~~~~~g~~~~~l~~~~~~ 412 (428)
.. ++||++|.-. |.+.+-+|--+..+=....+
T Consensus 484 k~-g~lYfaD~t~--IR~iD~~giIstlig~~~~~ 515 (1899)
T KOG4659|consen 484 KM-GNLYFADGTR--IRVIDTTGIISTLIGTTPDQ 515 (1899)
T ss_pred cC-CcEEEecccE--EEEeccCceEEEeccCCCCc
Confidence 64 7999999874 99999888766655444444
No 55
>KOG1520|consensus
Probab=94.96 E-value=0.047 Score=53.36 Aligned_cols=75 Identities=16% Similarity=0.325 Sum_probs=60.9
Q ss_pred EEeeccCCCEEEEEecCCC----eEEEeeeccccC----------CceEEEEeccccceeeeeeccCCeEEEEeCCCCcE
Q psy951 328 ALDIDPVDEIIYWVDSYDR----NIRRSFMLEAQK----------GQVQAVISDERRIEALDIDPVDEIIYWVDSYDRNI 393 (428)
Q Consensus 328 ~l~~d~~~~~lyWtd~~~~----~I~ra~l~g~~~----------~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~I 393 (428)
++|+|+ ++.|||||+.++ ...-+.+.|... -..+++.+++.-|+|||+-.....+-++...+.+|
T Consensus 165 ~ldI~~-~g~vyFTDSSsk~~~rd~~~a~l~g~~~GRl~~YD~~tK~~~VLld~L~F~NGlaLS~d~sfvl~~Et~~~ri 243 (376)
T KOG1520|consen 165 DLDIDP-EGVVYFTDSSSKYDRRDFVFAALEGDPTGRLFRYDPSTKVTKVLLDGLYFPNGLALSPDGSFVLVAETTTARI 243 (376)
T ss_pred ceeEcC-CCeEEEeccccccchhheEEeeecCCCccceEEecCcccchhhhhhcccccccccCCCCCCEEEEEeecccee
Confidence 799999 999999999884 455555555211 12334557899999999999999999999999999
Q ss_pred EEEEccCcce
Q psy951 394 RRSFMLEAQK 403 (428)
Q Consensus 394 ~~~~~~g~~~ 403 (428)
.+..+.|.+-
T Consensus 244 ~rywi~g~k~ 253 (376)
T KOG1520|consen 244 KRYWIKGPKA 253 (376)
T ss_pred eeeEecCCcc
Confidence 9999999876
No 56
>KOG1217|consensus
Probab=94.90 E-value=0.041 Score=56.22 Aligned_cols=80 Identities=29% Similarity=0.765 Sum_probs=54.8
Q ss_pred CCCCCCCCC-CCC--ccccCCCcceecCCCCCCCCCCCCCCccc---ccCccCCCCCCCCCCCCCCCEEe-eCCCCceEe
Q psy951 135 APNPCSQSP-CSH--LCLVIPGGYQCACPENATPKLPGVAEIRC---SAAVERPRPLPRVCQCQNGGMCA-ESETGDLTC 207 (428)
Q Consensus 135 ~~n~C~~~~-C~~--~C~~~~~~~~C~C~~g~~~~~~~~~g~~C---~~~~~~~~~~~~~c~C~ngg~C~-~~~~g~~~C 207 (428)
..+.|.... |.+ .|++.++.|.|.|++|| .+..| .+. ..|......-.|.+++.|. ....+.+.|
T Consensus 270 ~~~~C~~~~~c~~~~~C~~~~~~~~C~C~~g~-------~g~~~~~~~~~-~~C~~~~~~~~c~~g~~C~~~~~~~~~~C 341 (487)
T KOG1217|consen 270 DVDSCALIASCPNGGTCVNVPGSYRCTCPPGF-------TGRLCTECVDV-DECSPRNAGGPCANGGTCNTLGSFGGFRC 341 (487)
T ss_pred eccccCCCCccCCCCeeecCCCcceeeCCCCC-------CCCCCcccccc-ccccccccCCcCCCCcccccCCCCCCCCc
Confidence 457787753 776 99999988999999999 44555 111 1121111222588999993 222346889
Q ss_pred cCCCCCccCccCCCC
Q psy951 208 NCRQDFAGTFCENYT 222 (428)
Q Consensus 208 ~C~~gy~G~~Ce~~~ 222 (428)
.|..+|.|..|+...
T Consensus 342 ~c~~~~~g~~C~~~~ 356 (487)
T KOG1217|consen 342 ACGPGFTGRRCEDSN 356 (487)
T ss_pred CCCCCCCCCccccCC
Confidence 999999999998763
No 57
>TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.
Probab=94.81 E-value=0.16 Score=50.73 Aligned_cols=66 Identities=15% Similarity=0.214 Sum_probs=50.7
Q ss_pred cceeEEeeccCCCEEEEEecCC-------------------CeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEE
Q psy951 324 RRIEALDIDPVDEIIYWVDSYD-------------------RNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIY 384 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~-------------------~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lY 384 (428)
..+.+++++| +++||+++... ..|+|...+| +..+++..++.+|.||++|. .++||
T Consensus 124 ~~~~~l~~gp-DG~LYv~~G~~~~~~~~~~~~~~~~~~~~~g~i~r~~pdg---~~~e~~a~G~rnp~Gl~~d~-~G~l~ 198 (367)
T TIGR02604 124 HSLNSLAWGP-DGWLYFNHGNTLASKVTRPGTSDESRQGLGGGLFRYNPDG---GKLRVVAHGFQNPYGHSVDS-WGDVF 198 (367)
T ss_pred ccccCceECC-CCCEEEecccCCCceeccCCCccCcccccCceEEEEecCC---CeEEEEecCcCCCccceECC-CCCEE
Confidence 3477899998 57999987621 3688988888 45666667999999999998 57899
Q ss_pred EEeCCCCcEE
Q psy951 385 WVDSYDRNIR 394 (428)
Q Consensus 385 wtd~~~~~I~ 394 (428)
.+|.......
T Consensus 199 ~tdn~~~~~~ 208 (367)
T TIGR02604 199 FCDNDDPPLC 208 (367)
T ss_pred EEccCCCcee
Confidence 9998544333
No 58
>PRK04792 tolB translocation protein TolB; Provisional
Probab=94.72 E-value=5.8 Score=40.66 Aligned_cols=93 Identities=13% Similarity=0.045 Sum_probs=58.9
Q ss_pred CeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECC--CCeEEEEEcCCCCeEEEEeCCCCccceeeeecccc
Q psy951 10 PRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSK--LNTIESVRHDGRNRQTILSGSDKLQHPISLDVFEN 87 (428)
Q Consensus 10 ~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~--~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~ 87 (428)
.+|..++.||.+.+.+.... ..-...+..+.++ +|+|+... ...|+..++++...+.+.... ........+..++
T Consensus 198 ~~l~i~d~dG~~~~~l~~~~-~~~~~p~wSPDG~-~La~~s~~~g~~~L~~~dl~tg~~~~lt~~~-g~~~~~~wSPDG~ 274 (448)
T PRK04792 198 YQLMIADYDGYNEQMLLRSP-EPLMSPAWSPDGR-KLAYVSFENRKAEIFVQDIYTQVREKVTSFP-GINGAPRFSPDGK 274 (448)
T ss_pred eEEEEEeCCCCCceEeecCC-CcccCceECCCCC-EEEEEEecCCCcEEEEEECCCCCeEEecCCC-CCcCCeeECCCCC
Confidence 46888899999887765532 2233567777777 89998543 457999999987766665432 2223345566677
Q ss_pred EEEEEeCCCC--ceeeeccC
Q psy951 88 NIYWLARDTG--SLYKQDKF 105 (428)
Q Consensus 88 ~lYwtD~~~~--~I~~~~~~ 105 (428)
+|+++....+ .|+..+..
T Consensus 275 ~La~~~~~~g~~~Iy~~dl~ 294 (448)
T PRK04792 275 KLALVLSKDGQPEIYVVDIA 294 (448)
T ss_pred EEEEEEeCCCCeEEEEEECC
Confidence 7887643322 35555544
No 59
>KOG4499|consensus
Probab=94.69 E-value=0.35 Score=44.32 Aligned_cols=96 Identities=21% Similarity=0.328 Sum_probs=69.9
Q ss_pred EEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCC-----CCeEEEEeCCC----Cccceeee
Q psy951 12 IESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDG-----RNRQTILSGSD----KLQHPISL 82 (428)
Q Consensus 12 I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG-----~~~~~i~~~~~----~~~~p~~l 82 (428)
-.++++-|-+...+ ...+.-|+||+-|.... ..|++|+..-.|..-+.|- ++|++++.-.. .-..|=++
T Consensus 140 ~Ly~~~~~h~v~~i-~~~v~IsNgl~Wd~d~K-~fY~iDsln~~V~a~dyd~~tG~~snr~~i~dlrk~~~~e~~~PDGm 217 (310)
T KOG4499|consen 140 ELYSWLAGHQVELI-WNCVGISNGLAWDSDAK-KFYYIDSLNYEVDAYDYDCPTGDLSNRKVIFDLRKSQPFESLEPDGM 217 (310)
T ss_pred EEEEeccCCCceee-ehhccCCccccccccCc-EEEEEccCceEEeeeecCCCcccccCcceeEEeccCCCcCCCCCCcc
Confidence 34555655555544 44788999999997777 8999999888997777542 67888876431 23334455
Q ss_pred ecc-ccEEEEEeCCCCceeeeccCCCCc
Q psy951 83 DVF-ENNIYWLARDTGSLYKQDKFGRGV 109 (428)
Q Consensus 83 ~~~-~~~lYwtD~~~~~I~~~~~~g~~~ 109 (428)
++. +++||.+-|++++|.+.+...+..
T Consensus 218 ~ID~eG~L~Va~~ng~~V~~~dp~tGK~ 245 (310)
T KOG4499|consen 218 TIDTEGNLYVATFNGGTVQKVDPTTGKI 245 (310)
T ss_pred eEccCCcEEEEEecCcEEEEECCCCCcE
Confidence 554 889999999999999999875544
No 60
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=94.68 E-value=0.022 Score=36.24 Aligned_cols=26 Identities=27% Similarity=0.925 Sum_probs=19.4
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCccC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAGT 216 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G~ 216 (428)
.|....+|.+.. ++|.|.|++||.|+
T Consensus 7 ~C~~nA~C~~~~-~~~~C~C~~Gy~Gd 32 (36)
T PF12947_consen 7 GCHPNATCTNTG-GSYTCTCKPGYEGD 32 (36)
T ss_dssp GS-TTCEEEE-T-TSEEEEE-CEEECC
T ss_pred CCCCCcEeecCC-CCEEeECCCCCccC
Confidence 466678999998 48999999999975
No 61
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=94.01 E-value=0.036 Score=31.78 Aligned_cols=17 Identities=35% Similarity=1.013 Sum_probs=13.6
Q ss_pred ceEecCCCCCc----cCccCC
Q psy951 204 DLTCNCRQDFA----GTFCEN 220 (428)
Q Consensus 204 ~~~C~C~~gy~----G~~Ce~ 220 (428)
+|+|.|++||. |..|+.
T Consensus 1 sy~C~C~~Gy~l~~d~~~C~D 21 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDGRSCED 21 (24)
T ss_pred CEEeeCCCCCcCCCCCCcccc
Confidence 48999999998 556764
No 62
>PRK04043 tolB translocation protein TolB; Provisional
Probab=93.71 E-value=9 Score=38.95 Aligned_cols=93 Identities=12% Similarity=0.024 Sum_probs=58.3
Q ss_pred CCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCE-EEEEECC--CCeEEEEEcCCCCeEEEEeCCCCccceeeeecc
Q psy951 9 SPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHT-LYWVDSK--LNTIESVRHDGRNRQTILSGSDKLQHPISLDVF 85 (428)
Q Consensus 9 ~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~-lYW~d~~--~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~ 85 (428)
..+|..++.||.+.+++...+ .-..-...+..+ + +|.+... ...|+..++.+..++.+.... +.......+..
T Consensus 168 ~~~l~~~d~dg~~~~~~~~~~--~~~~p~wSpDG~-~~i~y~s~~~~~~~Iyv~dl~tg~~~~lt~~~-g~~~~~~~SPD 243 (419)
T PRK04043 168 KSNIVLADYTLTYQKVIVKGG--LNIFPKWANKEQ-TAFYYTSYGERKPTLYKYNLYTGKKEKIASSQ-GMLVVSDVSKD 243 (419)
T ss_pred cceEEEECCCCCceeEEccCC--CeEeEEECCCCC-cEEEEEEccCCCCEEEEEECCCCcEEEEecCC-CcEEeeEECCC
Confidence 458999999999988876643 112334455555 5 7765443 468999999888777776542 33333445666
Q ss_pred ccEEEEEeCCC--CceeeeccC
Q psy951 86 ENNIYWLARDT--GSLYKQDKF 105 (428)
Q Consensus 86 ~~~lYwtD~~~--~~I~~~~~~ 105 (428)
+.+|.++.... ..|+..+..
T Consensus 244 G~~la~~~~~~g~~~Iy~~dl~ 265 (419)
T PRK04043 244 GSKLLLTMAPKGQPDIYLYDTN 265 (419)
T ss_pred CCEEEEEEccCCCcEEEEEECC
Confidence 77787775332 244444444
No 63
>KOG0994|consensus
Probab=93.67 E-value=0.12 Score=56.61 Aligned_cols=68 Identities=22% Similarity=0.485 Sum_probs=44.3
Q ss_pred cccCCCccee-cCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCC--------EEeeCCC-CceEecCCCCCccCc
Q psy951 148 CLVIPGGYQC-ACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGG--------MCAESET-GDLTCNCRQDFAGTF 217 (428)
Q Consensus 148 C~~~~~~~~C-~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg--------~C~~~~~-g~~~C~C~~gy~G~~ 217 (428)
|.+..+|+.| .|..|| ...|- ......+.||+|..|- .|...+. ....|.|.+||+|.+
T Consensus 878 CqD~T~G~~CdrCl~Gy-yGdP~----------lg~g~~CrPCpCP~gp~Sg~~~A~sC~~d~~t~~ivC~C~~GY~G~R 946 (1758)
T KOG0994|consen 878 CQDSTTGHSCDRCLDGY-YGDPR----------LGSGIGCRPCPCPDGPASGRQHADSCYLDTRTQQIVCHCQEGYSGSR 946 (1758)
T ss_pred ccccccccchhhhhccc-cCCcc----------cCCCCCCCCCCCCCCCccchhccccccccccccceeeecccCccccc
Confidence 4444567888 488888 33221 1122234578887663 5876652 458899999999999
Q ss_pred cCCCCCCCC
Q psy951 218 CENYTGIGQ 226 (428)
Q Consensus 218 Ce~~~~~~~ 226 (428)
|+.-....+
T Consensus 947 Ce~CA~~~f 955 (1758)
T KOG0994|consen 947 CEICADNHF 955 (1758)
T ss_pred hhhhccccc
Confidence 998665543
No 64
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=93.54 E-value=0.16 Score=40.34 Aligned_cols=34 Identities=21% Similarity=0.677 Sum_probs=25.3
Q ss_pred CCCCCCEEeeCCC----CceEecCCC-------------CCccCccCCCCC
Q psy951 190 QCQNGGMCAESET----GDLTCNCRQ-------------DFAGTFCENYTG 223 (428)
Q Consensus 190 ~C~ngg~C~~~~~----g~~~C~C~~-------------gy~G~~Ce~~~~ 223 (428)
.|..+|.|+.... .=|.|.|.+ .|.|..|+...-
T Consensus 14 ~CsgHG~C~~~~~~~~~~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKkDv 64 (103)
T PF12955_consen 14 NCSGHGSCVKKYGSGGGDCFACKCKPTVVKTGSGKGKTTHWGGPACQKKDV 64 (103)
T ss_pred CCCCCceEeeccCCCccceEEEEeeccccccccccCceeeecccccccccc
Confidence 5888999987742 339999998 466888887553
No 65
>PF01731 Arylesterase: Arylesterase; InterPro: IPR002640 The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity. Mammals have 3 distinct paraoxonase types, termed PON1-3. In mice and humans, the PON genes are found on the same chromosome in close proximity. PON activity has been found in a variety of tissues, with highest levels in liver and serum - the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low density lipoproteins (LDL). Although PON2 has oxidative properties, the enzyme does not associate with HDL. Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1, 2 or 3) share 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo. This family consists of arylesterases (also known as serum paraoxonases; EC 3.1.1.2). These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood.
They confer resistance to organophosphate toxicity. Human arylesterase (PON1; Swiss-Prot P27169) is associated with HDL and may protect against LDL oxidation.; GO: 0004064 arylesterase activity
Probab=93.37 E-value=0.26 Score=38.16 Aligned_cols=38 Identities=21% Similarity=0.239 Sum_probs=33.7
Q ss_pred ceEEEEeccccceeeeeeccCCeEEEEeCCCCcEEEEE
Q psy951 360 QVQAVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSF 397 (428)
Q Consensus 360 ~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~ 397 (428)
+.+++++++..|+||++|..++.||.++.....|.+-.
T Consensus 45 ~~~~va~g~~~aNGI~~s~~~k~lyVa~~~~~~I~vy~ 82 (86)
T PF01731_consen 45 EVKVVASGFSFANGIAISPDKKYLYVASSLAHSIHVYK 82 (86)
T ss_pred EeEEeeccCCCCceEEEcCCCCEEEEEeccCCeEEEEE
Confidence 46677779999999999999999999999998888765
No 66
>KOG4499|consensus
Probab=93.19 E-value=0.32 Score=44.55 Aligned_cols=81 Identities=10% Similarity=0.134 Sum_probs=59.1
Q ss_pred ccccceeEEeeccCCCEEEEEecCCCeEEEee--eccccCCceEEEEe-------ccccceeeeeeccCCeEEEEeCCCC
Q psy951 321 SDERRIEALDIDPVDEIIYWVDSYDRNIRRSF--MLEAQKGQVQAVIS-------DERRIEALDIDPVDEIIYWVDSYDR 391 (428)
Q Consensus 321 ~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~--l~g~~~~~~~~i~~-------~~~~p~glavD~~~~~lYwtd~~~~ 391 (428)
..+.-+-||+.|.....+|.+|...-.|..-. ..+.+-+++.+++. +-..|.|++||- .++||.+--+..
T Consensus 155 ~~v~IsNgl~Wd~d~K~fY~iDsln~~V~a~dyd~~tG~~snr~~i~dlrk~~~~e~~~PDGm~ID~-eG~L~Va~~ng~ 233 (310)
T KOG4499|consen 155 NCVGISNGLAWDSDAKKFYYIDSLNYEVDAYDYDCPTGDLSNRKVIFDLRKSQPFESLEPDGMTIDT-EGNLYVATFNGG 233 (310)
T ss_pred hhccCCccccccccCcEEEEEccCceEEeeeecCCCcccccCcceeEEeccCCCcCCCCCCcceEcc-CCcEEEEEecCc
Confidence 45556679999999999999999888884433 33333455666653 345799999998 689999888777
Q ss_pred cEEEEEccCcc
Q psy951 392 NIRRSFMLEAQ 402 (428)
Q Consensus 392 ~I~~~~~~g~~ 402 (428)
++...+....+
T Consensus 234 ~V~~~dp~tGK 244 (310)
T KOG4499|consen 234 TVQKVDPTTGK 244 (310)
T ss_pred EEEEECCCCCc
Confidence 77777765443
No 67
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=93.14 E-value=0.053 Score=34.48 Aligned_cols=19 Identities=21% Similarity=0.861 Sum_probs=15.7
Q ss_pred CEEeeCCCCceEecCCCCCc
Q psy951 195 GMCAESETGDLTCNCRQDFA 214 (428)
Q Consensus 195 g~C~~~~~g~~~C~C~~gy~ 214 (428)
..|++.+ ++|+|.|++||.
T Consensus 10 h~C~~~~-g~~~C~C~~Gy~ 28 (36)
T PF14670_consen 10 HICVNTP-GSYRCSCPPGYK 28 (36)
T ss_dssp SEEEEET-TSEEEE-STTEE
T ss_pred CCCccCC-CceEeECCCCCE
Confidence 4899987 579999999998
No 68
>KOG1217|consensus
Probab=92.99 E-value=0.13 Score=52.59 Aligned_cols=64 Identities=30% Similarity=0.749 Sum_probs=44.6
Q ss_pred CccccCCCcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCCCCCccCcc
Q psy951 146 HLCLVIPGGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCRQDFAGTFC 218 (428)
Q Consensus 146 ~~C~~~~~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~gy~G~~C 218 (428)
..|.+..+++.|.|++|| .... ...|.+..++ .. .. .|.++++|.+.. +.|.|.|++||.|..|
T Consensus 243 ~~c~~~~~~~~C~~~~g~-~~~~---~~~~~~~~~C-~~--~~-~c~~~~~C~~~~-~~~~C~C~~g~~g~~~ 306 (487)
T KOG1217|consen 243 GTCVNTVGSYTCRCPEGY-TGDA---CVTCVDVDSC-AL--IA-SCPNGGTCVNVP-GSYRCTCPPGFTGRLC 306 (487)
T ss_pred CcccccCCceeeeCCCCc-cccc---cceeeecccc-CC--CC-ccCCCCeeecCC-CcceeeCCCCCCCCCC
Confidence 567777888999999999 2211 0123333322 11 12 288999999988 4599999999999998
No 69
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=92.93 E-value=0.17 Score=40.02 Aligned_cols=14 Identities=21% Similarity=0.612 Sum_probs=8.2
Q ss_pred cCCCCCc--cCccCCC
Q psy951 208 NCRQDFA--GTFCENY 221 (428)
Q Consensus 208 ~C~~gy~--G~~Ce~~ 221 (428)
.|..||. +..|...
T Consensus 41 ~C~~GY~~~~~~Cv~~ 56 (96)
T PTZ00382 41 ECNSGFSLDNGKCVSS 56 (96)
T ss_pred cCcCCcccCCCccccc
Confidence 4777776 4556543
No 70
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=92.76 E-value=0.025 Score=49.57 Aligned_cols=67 Identities=16% Similarity=0.427 Sum_probs=42.0
Q ss_pred CCC-ccccCCCcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCC----CceEecCCCCCc
Q psy951 144 CSH-LCLVIPGGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESET----GDLTCNCRQDFA 214 (428)
Q Consensus 144 C~~-~C~~~~~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~----g~~~C~C~~gy~ 214 (428)
|.+ ......+.|+|.|.+|| .+.+. .+|+...++.....-.-+|.+.++|..... ..|.|.|-+||.
T Consensus 8 CKNG~LiQMSNHfEC~Cnegf-vl~~E---ntCE~kv~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~ 79 (197)
T PF06247_consen 8 CKNGYLIQMSNHFECKCNEGF-VLKNE---NTCEEKVECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGYI 79 (197)
T ss_dssp -BTEEEEEESSEEEEEESTTE-EEEET---TEEEE----SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTEE
T ss_pred ccCCEEEEccCceEEEcCCCc-EEccc---cccccceecCcccccCccccchhhhhcCCCcccceeEEEecccCce
Confidence 444 34455678999999999 55543 688877654332212226888899987764 569999999998
No 71
>KOG1836|consensus
Probab=92.51 E-value=0.18 Score=58.92 Aligned_cols=40 Identities=28% Similarity=0.733 Sum_probs=33.0
Q ss_pred CCCCCCCCCCEEeeCC-CCceEec-CCCCCccCccCCCCCCC
Q psy951 186 PRVCQCQNGGMCAESE-TGDLTCN-CRQDFAGTFCENYTGIG 225 (428)
Q Consensus 186 ~~~c~C~ngg~C~~~~-~g~~~C~-C~~gy~G~~Ce~~~~~~ 225 (428)
+.+|+|.+++.|.... .....|. ||+||+|.+|+......
T Consensus 777 C~~C~Cp~~~~~~~~~~~~~~iCk~Cp~gytG~rCe~c~dgy 818 (1705)
T KOG1836|consen 777 CQPCPCPNGGACGQTPEILEVVCKNCPPGYTGLRCEECADGY 818 (1705)
T ss_pred CccCCCCCChhhcCcCcccceecCCCCCCCcccccccCCCcc
Confidence 5688999999998766 4568899 99999999999866543
No 72
>TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.
Probab=92.41 E-value=1.1 Score=44.57 Aligned_cols=97 Identities=8% Similarity=-0.106 Sum_probs=61.3
Q ss_pred cccceeEEeeccCCCEEEEEecC-----------CC-eEEEeee---ccccCCceEEEEeccccceeeeeeccCCeEEEE
Q psy951 322 DERRIEALDIDPVDEIIYWVDSY-----------DR-NIRRSFM---LEAQKGQVQAVISDERRIEALDIDPVDEIIYWV 386 (428)
Q Consensus 322 ~~~~~~~l~~d~~~~~lyWtd~~-----------~~-~I~ra~l---~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwt 386 (428)
.+.+|++|++|+. +.||-++.. .. +|.+..- ||. ....+++.+++..|.||+++.-+ ||.+
T Consensus 12 ~~~~P~~ia~d~~-G~l~V~e~~~y~~~~~~~~~~~~rI~~l~d~dgdG~-~d~~~vfa~~l~~p~Gi~~~~~G--lyV~ 87 (367)
T TIGR02604 12 LLRNPIAVCFDER-GRLWVAEGITYSRPAGRQGPLGDRILILEDADGDGK-YDKSNVFAEELSMVTGLAVAVGG--VYVA 87 (367)
T ss_pred ccCCCceeeECCC-CCEEEEeCCcCCCCCCCCCCCCCEEEEEEcCCCCCC-cceeEEeecCCCCccceeEecCC--EEEe
Confidence 4789999999986 678888742 22 7877765 331 12234555689999999997544 9998
Q ss_pred eCCCCcEEEE-EccCc-----ceeEEEecccCC----CCCCCccccccc
Q psy951 387 DSYDRNIRRS-FMLEA-----QKGQVQAGASRH----QNGVPASSQRNL 425 (428)
Q Consensus 387 d~~~~~I~~~-~~~g~-----~~~~l~~~~~~~----~~~~p~~~~~~~ 425 (428)
+.. .|.+. +-+|. .+++|+ +++.. +...|.+++++.
T Consensus 88 ~~~--~i~~~~d~~gdg~ad~~~~~l~-~~~~~~~~~~~~~~~~l~~gp 133 (367)
T TIGR02604 88 TPP--DILFLRDKDGDDKADGEREVLL-SGFGGQINNHHHSLNSLAWGP 133 (367)
T ss_pred CCC--eEEEEeCCCCCCCCCCccEEEE-EccCCCCCcccccccCceECC
Confidence 755 36544 54442 334444 44333 345566766643
No 73
>PF01731 Arylesterase: Arylesterase; InterPro: IPR002640 The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity. Mammals have 3 distinct paraoxonase types, termed PON1-3. In mice and humans, the PON genes are found on the same chromosome in close proximity. PON activity has been found in a variety of tissues, with highest levels in liver and serum - the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low density lipoproteins (LDL). Although PON2 has oxidative properties, the enzyme does not associate with HDL. Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1, 2 or 3) share 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo. This family consists of arylesterases (also known as serum paraoxonases; EC 3.1.1.2). These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood.
They confer resistance to organophosphate toxicity. Human arylesterase (PON1; Swiss-Prot P27169) is associated with HDL and may protect against LDL oxidation.; GO: 0004064 arylesterase activity
Probab=92.19 E-value=0.45 Score=36.79 Aligned_cols=43 Identities=19% Similarity=0.384 Sum_probs=34.6
Q ss_pred cCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEc
Q psy951 17 MDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRH 61 (428)
Q Consensus 17 ~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~l 61 (428)
-||+..+++ .+++..|+||++|+..+ .||-++.....|.....
T Consensus 41 yd~~~~~~v-a~g~~~aNGI~~s~~~k-~lyVa~~~~~~I~vy~~ 83 (86)
T PF01731_consen 41 YDGKEVKVV-ASGFSFANGIAISPDKK-YLYVASSLAHSIHVYKR 83 (86)
T ss_pred EeCCEeEEe-eccCCCCceEEEcCCCC-EEEEEeccCCeEEEEEe
Confidence 467665554 45899999999999888 99999999888876654
No 74
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=92.19 E-value=0.044 Score=34.88 Aligned_cols=21 Identities=29% Similarity=0.771 Sum_probs=16.1
Q ss_pred CCCC--ccccCCCcceecCCCCC
Q psy951 143 PCSH--LCLVIPGGYQCACPENA 163 (428)
Q Consensus 143 ~C~~--~C~~~~~~~~C~C~~g~ 163 (428)
+|.. .|.+.++.|.|.|++||
T Consensus 7 ~C~~nA~C~~~~~~~~C~C~~Gy 29 (36)
T PF12947_consen 7 GCHPNATCTNTGGSYTCTCKPGY 29 (36)
T ss_dssp GS-TTCEEEE-TTSEEEEE-CEE
T ss_pred CCCCCcEeecCCCCEEeECCCCC
Confidence 5655 89999999999999999
No 75
>cd01475 vWA_Matrilin VWA_Matrilin: In the cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=92.00 E-value=0.1 Score=48.11 Aligned_cols=32 Identities=38% Similarity=0.998 Sum_probs=26.8
Q ss_pred CCCCCCC--CCCCCccccCCCcceecCCCCCCCCC
Q psy951 135 APNPCSQ--SPCSHLCLVIPGGYQCACPENATPKL 167 (428)
Q Consensus 135 ~~n~C~~--~~C~~~C~~~~~~~~C~C~~g~~~~~ 167 (428)
..++|.. ++|.+.|.+.+++|.|.|++|| .+.
T Consensus 186 ~~~~C~~~~~~c~~~C~~~~g~~~c~c~~g~-~~~ 219 (224)
T cd01475 186 VPDLCATLSHVCQQVCISTPGSYLCACTEGY-ALL 219 (224)
T ss_pred CchhhcCCCCCccceEEcCCCCEEeECCCCc-cCC
Confidence 5678864 4799999999999999999999 443
No 76
>TIGR02658 TTQ_MADH_Hv methylamine dehydrogenase heavy chain. This family consists of the heavy chain of methylamine dehydrogenase, a periplasmic enzyme. The enzyme contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from two Trp residues in the light subunit. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome.
Probab=91.98 E-value=14 Score=36.61 Aligned_cols=94 Identities=9% Similarity=0.007 Sum_probs=61.4
Q ss_pred eEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEEC---------CCCeEEEEEcCCCCe-EEEEeCC------C
Q psy951 11 RIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDS---------KLNTIESVRHDGRNR-QTILSGS------D 74 (428)
Q Consensus 11 ~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~---------~~~~I~~~~ldG~~~-~~i~~~~------~ 74 (428)
+|...+.+-....--+.. -++|+|+ +.+..+ .||-+.. ..+.|..++...... ..|..+. .
T Consensus 28 ~v~ViD~~~~~v~g~i~~-G~~P~~~-~spDg~-~lyva~~~~~R~~~G~~~d~V~v~D~~t~~~~~~i~~p~~p~~~~~ 104 (352)
T TIGR02658 28 QVYTIDGEAGRVLGMTDG-GFLPNPV-VASDGS-FFAHASTVYSRIARGKRTDYVEVIDPQTHLPIADIELPEGPRFLVG 104 (352)
T ss_pred eEEEEECCCCEEEEEEEc-cCCCcee-ECCCCC-EEEEEeccccccccCCCCCEEEEEECccCcEEeEEccCCCchhhcc
Confidence 565555544322222333 3599997 999888 9999999 788999988865433 3333221 1
Q ss_pred CccceeeeeccccEEEEEeCC-CCceeeeccCCC
Q psy951 75 KLQHPISLDVFENNIYWLARD-TGSLYKQDKFGR 107 (428)
Q Consensus 75 ~~~~p~~l~~~~~~lYwtD~~-~~~I~~~~~~g~ 107 (428)
..+.-++++..+.+||..... ...+..++....
T Consensus 105 ~~~~~~~ls~dgk~l~V~n~~p~~~V~VvD~~~~ 138 (352)
T TIGR02658 105 TYPWMTSLTPDNKTLLFYQFSPSPAVGVVDLEGK 138 (352)
T ss_pred CccceEEECCCCCEEEEecCCCCCEEEEEECCCC
Confidence 223357788888899998866 667777777643
No 77
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=91.72 E-value=0.15 Score=31.49 Aligned_cols=28 Identities=32% Similarity=0.530 Sum_probs=20.7
Q ss_pred CCCCCCCCccccCCCcceecCCCCCCCCCC
Q psy951 139 CSQSPCSHLCLVIPGGYQCACPENATPKLP 168 (428)
Q Consensus 139 C~~~~C~~~C~~~~~~~~C~C~~g~~~~~~ 168 (428)
|....|...|.+... ..|.||+|| .+.+
T Consensus 3 Cn~t~CpA~CDpn~~-~~C~CPeGy-Ilde 30 (34)
T PF09064_consen 3 CNQTECPADCDPNSP-GQCFCPEGY-ILDE 30 (34)
T ss_pred cccccCCCccCCCCC-CceeCCCce-EecC
Confidence 556678888887653 499999999 5543
No 78
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=91.68 E-value=0.35 Score=39.59 Aligned_cols=37 Identities=30% Similarity=0.680 Sum_probs=25.7
Q ss_pred CCCCCC---CCCCC-ccccCC--CcceecCCCCCCCCCCCCCCcccccCc
Q psy951 136 PNPCSQ---SPCSH-LCLVIP--GGYQCACPENATPKLPGVAEIRCSAAV 179 (428)
Q Consensus 136 ~n~C~~---~~C~~-~C~~~~--~~~~C~C~~g~~~~~~~~~g~~C~~~~ 179 (428)
..+|.. +-|-| .|.-.+ ..+.|.|+.|| .|.+|+...
T Consensus 42 i~~Cp~ey~~YClHG~C~yI~dl~~~~CrC~~GY-------tGeRCEh~d 84 (139)
T PHA03099 42 IRLCGPEGDGYCLHGDCIHARDIDGMYCRCSHGY-------TGIRCQHVV 84 (139)
T ss_pred cccCChhhCCEeECCEEEeeccCCCceeECCCCc-------cccccccee
Confidence 345543 34666 676555 78999999999 678887543
No 79
>KOG3514|consensus
Probab=91.41 E-value=2.8 Score=46.16 Aligned_cols=35 Identities=37% Similarity=0.841 Sum_probs=29.7
Q ss_pred CCCCCCCEEeeCCCCceEecCCC-CCccCccCCCCCC
Q psy951 189 CQCQNGGMCAESETGDLTCNCRQ-DFAGTFCENYTGI 224 (428)
Q Consensus 189 c~C~ngg~C~~~~~g~~~C~C~~-gy~G~~Ce~~~~~ 224 (428)
.||+|||+|...- ..|.|.|.. ||.|..||.+...
T Consensus 629 nPC~N~g~C~egw-NrfiCDCs~T~~~G~~CerE~t~ 664 (1591)
T KOG3514|consen 629 NPCQNGGKCSEGW-NRFICDCSGTGFEGRTCEREATA 664 (1591)
T ss_pred CcccCCCCccccc-cccccccccCcccCccccceeee
Confidence 3899999998877 479999986 9999999987653
No 80
>PF06977 SdiA-regulated: SdiA-regulated; InterPro: IPR009722 This entry represents a conserved region approximately 100 residues long within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some proteins contain the NHL repeat (InterPro: IPR001258).; PDB: 3QQZ_A.
Probab=90.71 E-value=3 Score=39.10 Aligned_cols=83 Identities=11% Similarity=0.071 Sum_probs=53.9
Q ss_pred cceeEEeeccCCCEEEEEecCC-CeEEEeeeccccCCceEEEEe--------ccccceeeeeeccCCeEEEEeCCCCcEE
Q psy951 324 RRIEALDIDPVDEIIYWVDSYD-RNIRRSFMLEAQKGQVQAVIS--------DERRIEALDIDPVDEIIYWVDSYDRNIR 394 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~-~~I~ra~l~g~~~~~~~~i~~--------~~~~p~glavD~~~~~lYwtd~~~~~I~ 394 (428)
+...||+||+.++.+|-+.-.. ..|+.....-. .....+... .+..|.||++|..+++||-.......|-
T Consensus 118 ~G~EGla~D~~~~~L~v~kE~~P~~l~~~~~~~~-~~~~~~~~~~~~~~~~~~~~d~S~l~~~p~t~~lliLS~es~~l~ 196 (248)
T PF06977_consen 118 KGFEGLAYDPKTNRLFVAKERKPKRLYEVNGFPG-GFDLFVSDDQDLDDDKLFVRDLSGLSYDPRTGHLLILSDESRLLL 196 (248)
T ss_dssp S--EEEEEETTTTEEEEEEESSSEEEEEEESTT--SS--EEEE-HHHH-HT--SS---EEEEETTTTEEEEEETTTTEEE
T ss_pred cceEEEEEcCCCCEEEEEeCCCChhhEEEccccC-ccceeeccccccccccceeccccceEEcCCCCeEEEEECCCCeEE
Confidence 4578999999999999884333 37888876211 122222211 3567999999999999999999988999
Q ss_pred EEEccCcceeEEE
Q psy951 395 RSFMLEAQKGQVQ 407 (428)
Q Consensus 395 ~~~~~g~~~~~l~ 407 (428)
..+.+|+-...+-
T Consensus 197 ~~d~~G~~~~~~~ 209 (248)
T PF06977_consen 197 ELDRQGRVVSSLS 209 (248)
T ss_dssp EE-TT--EEEEEE
T ss_pred EECCCCCEEEEEE
Confidence 9999998766554
No 81
>PRK04043 tolB translocation protein TolB; Provisional
Probab=90.27 E-value=4.8 Score=40.90 Aligned_cols=106 Identities=8% Similarity=0.031 Sum_probs=72.0
Q ss_pred EEEeeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCC--------CeEEEEEcCCCCeEEEEeCC
Q psy951 2 FWAETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKL--------NTIESVRHDGRNRQTILSGS 73 (428)
Q Consensus 2 yWtd~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~--------~~I~~~~ldG~~~~~i~~~~ 73 (428)
|-+|.+..+.|.+.+++|...+.+...+... .++++..+ .|.++-... ..|..++++|...+.|...
T Consensus 293 F~Sdr~g~~~Iy~~dl~~g~~~rlt~~g~~~---~~~SPDG~-~Ia~~~~~~~~~~~~~~~~I~v~d~~~g~~~~LT~~- 367 (419)
T PRK04043 293 FVSDRLGYPNIFMKKLNSGSVEQVVFHGKNN---SSVSTYKN-YIVYSSRETNNEFGKNTFNLYLISTNSDYIRRLTAN- 367 (419)
T ss_pred EEECCCCCceEEEEECCCCCeEeCccCCCcC---ceECCCCC-EEEEEEcCCCcccCCCCcEEEEEECCCCCeEECCCC-
Confidence 3444555568999999987765554433322 37787777 888876543 4899999999887776654
Q ss_pred CCccceeeeeccccEEEEEeCCCC--ceeeeccCCCCceeee
Q psy951 74 DKLQHPISLDVFENNIYWLARDTG--SLYKQDKFGRGVPVLI 113 (428)
Q Consensus 74 ~~~~~p~~l~~~~~~lYwtD~~~~--~I~~~~~~g~~~~~~~ 113 (428)
........+..+..|+++....+ .|...+.+|.....+.
T Consensus 368 -~~~~~p~~SPDG~~I~f~~~~~~~~~L~~~~l~g~~~~~l~ 408 (419)
T PRK04043 368 -GVNQFPRFSSDGGSIMFIKYLGNQSALGIIRLNYNKSFLFP 408 (419)
T ss_pred -CCcCCeEECCCCCEEEEEEccCCcEEEEEEecCCCeeEEee
Confidence 23334556778888999875433 5888888876555443
No 82
>TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol.
Probab=90.22 E-value=7.4 Score=36.37 Aligned_cols=93 Identities=14% Similarity=0.190 Sum_probs=61.0
Q ss_pred eeeeecCccceeeccccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEE
Q psy951 306 IRAYETHKRRFRDVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYW 385 (428)
Q Consensus 306 ~~~~~~~~~~~~~~i~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYw 385 (428)
+..++.........++....++++++++..+.+|=+....++|..-.+.. .+....+..-..|..++++..++.||-
T Consensus 13 v~~~d~~t~~~~~~~~~~~~~~~l~~~~dg~~l~~~~~~~~~v~~~d~~~---~~~~~~~~~~~~~~~~~~~~~g~~l~~ 89 (300)
T TIGR03866 13 ISVIDTATLEVTRTFPVGQRPRGITLSKDGKLLYVCASDSDTIQVIDLAT---GEVIGTLPSGPDPELFALHPNGKILYI 89 (300)
T ss_pred EEEEECCCCceEEEEECCCCCCceEECCCCCEEEEEECCCCeEEEEECCC---CcEEEeccCCCCccEEEECCCCCEEEE
Confidence 33334433333334444456788999999889998887777777766654 222222222345778899888888888
Q ss_pred EeCCCCcEEEEEccCc
Q psy951 386 VDSYDRNIRRSFMLEA 401 (428)
Q Consensus 386 td~~~~~I~~~~~~g~ 401 (428)
+....+.|.+.++...
T Consensus 90 ~~~~~~~l~~~d~~~~ 105 (300)
T TIGR03866 90 ANEDDNLVTVIDIETR 105 (300)
T ss_pred EcCCCCeEEEEECCCC
Confidence 8776668888887654
No 83
>PF07995 GSDH: Glucose / Sorbosone dehydrogenase; InterPro: IPR012938 Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase (P13650 from SWISSPROT) from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor [].; GO: 0016901 oxidoreductase activity, acting on the CH-OH group of donors, quinone or similar compound as acceptor, 0048038 quinone binding, 0005975 carbohydrate metabolic process; PDB: 2ISM_A 2WG3_D 3HO5_A 3HO4_A 3HO3_A 2WFT_A 2WG4_B 2WFX_B 1CRU_A 1CQ1_B ....
Probab=90.20 E-value=1.2 Score=43.64 Aligned_cols=74 Identities=16% Similarity=0.073 Sum_probs=50.9
Q ss_pred ccceeEEeeccCCCEEEEE--ecCC-----------CeEEEeeecccc----------CCceEEEEeccccceeeeeecc
Q psy951 323 ERRIEALDIDPVDEIIYWV--DSYD-----------RNIRRSFMLEAQ----------KGQVQAVISDERRIEALDIDPV 379 (428)
Q Consensus 323 ~~~~~~l~~d~~~~~lyWt--d~~~-----------~~I~ra~l~g~~----------~~~~~~i~~~~~~p~glavD~~ 379 (428)
......|+|+| +++||++ |... .+|.|...+|+- ....++...++.+|.|||+|..
T Consensus 113 ~H~g~~l~fgp-DG~LYvs~G~~~~~~~~~~~~~~~G~ilri~~dG~~p~dnP~~~~~~~~~~i~A~GlRN~~~~~~d~~ 191 (331)
T PF07995_consen 113 NHNGGGLAFGP-DGKLYVSVGDGGNDDNAQDPNSLRGKILRIDPDGSIPADNPFVGDDGADSEIYAYGLRNPFGLAFDPN 191 (331)
T ss_dssp SS-EEEEEE-T-TSEEEEEEB-TTTGGGGCSTTSSTTEEEEEETTSSB-TTSTTTTSTTSTTTEEEE--SEEEEEEEETT
T ss_pred CCCCccccCCC-CCcEEEEeCCCCCcccccccccccceEEEecccCcCCCCCccccCCCceEEEEEeCCCccccEEEECC
Confidence 44566799999 4599998 3332 189999999841 1134556669999999999999
Q ss_pred CCeEEEEeCCCCcEEEEE
Q psy951 380 DEIIYWVDSYDRNIRRSF 397 (428)
Q Consensus 380 ~~~lYwtd~~~~~I~~~~ 397 (428)
+++||.+|.+.+..+..+
T Consensus 192 tg~l~~~d~G~~~~dein 209 (331)
T PF07995_consen 192 TGRLWAADNGPDGWDEIN 209 (331)
T ss_dssp TTEEEEEEE-SSSSEEEE
T ss_pred CCcEEEEccCCCCCcEEE
Confidence 999999999877665544
No 84
>KOG3516|consensus
Probab=90.09 E-value=6.4 Score=44.07 Aligned_cols=33 Identities=39% Similarity=1.017 Sum_probs=28.2
Q ss_pred CCCCCCEEeeCCCCceEecCCC-CCccCccCCCCC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQ-DFAGTFCENYTG 223 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~-gy~G~~Ce~~~~ 223 (428)
+|.|||.|+.... .|.|.|.- -|.|.+|..+..
T Consensus 962 ~C~NGG~Cvery~-gytCDCs~Tay~Gp~Cs~eig 995 (1306)
T KOG3516|consen 962 PCLNGGHCVERYD-GYTCDCSRTAYDGPFCSKEIG 995 (1306)
T ss_pred cccCCCEEEEecC-ceeeccccCcCCCCccccccc
Confidence 8999999998885 49999996 699999977553
No 85
>PF03022 MRJP: Major royal jelly protein; InterPro: IPR003534 The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and 82-90% of the protein content [], of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees [] and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content []. Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that results from a high genetic variability. This polymorphism may be useful for genotyping individual bees [].; PDB: 3Q6P_B 3Q6K_A 3Q6T_A 2QE8_B.
Probab=89.60 E-value=3.7 Score=39.47 Aligned_cols=82 Identities=17% Similarity=0.184 Sum_probs=53.7
Q ss_pred eeEEeecc---CCCEEEEEecCCCeEEEeeec----cccCC------ceEEEEeccccceeeeeeccCCeEEEEeCCCCc
Q psy951 326 IEALDIDP---VDEIIYWVDSYDRNIRRSFML----EAQKG------QVQAVISDERRIEALDIDPVDEIIYWVDSYDRN 392 (428)
Q Consensus 326 ~~~l~~d~---~~~~lyWtd~~~~~I~ra~l~----g~~~~------~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~ 392 (428)
+.|++..+ ..+.|||.--.+.++||.... ++... ..+.+-.+.....|+++|. +++||+++.....
T Consensus 130 ~~gial~~~~~d~r~LYf~~lss~~ly~v~T~~L~~~~~~~~~~~~~~v~~lG~k~~~s~g~~~D~-~G~ly~~~~~~~a 208 (287)
T PF03022_consen 130 IFGIALSPISPDGRWLYFHPLSSRKLYRVPTSVLRDPSLSDAQALASQVQDLGDKGSQSDGMAIDP-NGNLYFTDVEQNA 208 (287)
T ss_dssp EEEEEE-TTSTTS-EEEEEETT-SEEEEEEHHHHCSTT--HHH-HHHT-EEEEE---SECEEEEET-TTEEEEEECCCTE
T ss_pred ccccccCCCCCCccEEEEEeCCCCcEEEEEHHHhhCccccccccccccceeccccCCCCceEEECC-CCcEEEecCCCCe
Confidence 55677755 557999999888888887632 21111 1222323345779999999 8999999999999
Q ss_pred EEEEEccC----cceeEEEe
Q psy951 393 IRRSFMLE----AQKGQVQA 408 (428)
Q Consensus 393 I~~~~~~g----~~~~~l~~ 408 (428)
|.+.+.++ ..-++|..
T Consensus 209 I~~w~~~~~~~~~~~~~l~~ 228 (287)
T PF03022_consen 209 IGCWDPDGPYTPENFEILAQ 228 (287)
T ss_dssp EEEEETTTSB-GCCEEEEEE
T ss_pred EEEEeCCCCcCccchheeEE
Confidence 99999998 33445553
No 86
>smart00051 DSL delta serrate ligand.
Probab=88.87 E-value=0.43 Score=34.56 Aligned_cols=46 Identities=24% Similarity=0.532 Sum_probs=27.4
Q ss_pred eecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCCCCCccCcc
Q psy951 156 QCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCRQDFAGTFC 218 (428)
Q Consensus 156 ~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~gy~G~~C 218 (428)
.-.|+++| .|..|..... +. ..+..+.+|... | .|.|++||.|..|
T Consensus 18 rv~C~~~~-------yG~~C~~~C~---~~---~d~~~~~~Cd~~--G--~~~C~~Gw~G~~C 63 (63)
T smart00051 18 RVTCDENY-------YGEGCNKFCR---PR---DDFFGHYTCDEN--G--NKGCLEGWMGPYC 63 (63)
T ss_pred EeeCCCCC-------cCCccCCEeC---cC---ccccCCccCCcC--C--CEecCCCCcCCCC
Confidence 44577777 5666643211 00 013455677542 2 4889999999887
No 87
>KOG4260|consensus
Probab=88.24 E-value=0.24 Score=45.91 Aligned_cols=66 Identities=24% Similarity=0.634 Sum_probs=43.1
Q ss_pred CCCCCCCC--CCCC--ccccCCCcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCC
Q psy951 135 APNPCSQS--PCSH--LCLVIPGGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCR 210 (428)
Q Consensus 135 ~~n~C~~~--~C~~--~C~~~~~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~ 210 (428)
.+|+|.+. +|.- .|+++.++|+|.+.+||. .-. ..|+.-.+ .|. ...+.|.+.. +.|+|.|+
T Consensus 235 DvnEC~~ep~~c~~~qfCvNteGSf~C~dk~Gy~-~g~----d~C~~~~d-------~~~-~kn~~c~ni~-~~~r~v~f 300 (350)
T KOG4260|consen 235 DVNECQNEPAPCKAHQFCVNTEGSFKCEDKEGYK-KGV----DECQFCAD-------VCA-SKNRPCMNID-GQYRCVCF 300 (350)
T ss_pred cHHHHhcCCCCCChhheeecCCCceEeccccccc-CCh----HHhhhhhh-------hcc-cCCCCcccCC-ccEEEEec
Confidence 56788753 6754 999999999999999992 211 22322111 111 1225677776 67999999
Q ss_pred CCCc
Q psy951 211 QDFA 214 (428)
Q Consensus 211 ~gy~ 214 (428)
.|+.
T Consensus 301 ~~~~ 304 (350)
T KOG4260|consen 301 SGLI 304 (350)
T ss_pred ccce
Confidence 9876
No 88
>PRK05137 tolB translocation protein TolB; Provisional
Probab=88.14 E-value=32 Score=34.93 Aligned_cols=95 Identities=8% Similarity=0.016 Sum_probs=59.1
Q ss_pred CCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEEC--CCCeEEEEEcCCCCeEEEEeCCCCccceeeeeccc
Q psy951 9 SPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDS--KLNTIESVRHDGRNRQTILSGSDKLQHPISLDVFE 86 (428)
Q Consensus 9 ~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~--~~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~ 86 (428)
...|..++.||.+.+.+... -..-...+..+..+ +|+++.. +...|...++++..++.+.... +.......+..+
T Consensus 181 ~~~l~~~d~dg~~~~~lt~~-~~~v~~p~wSpDG~-~lay~s~~~g~~~i~~~dl~~g~~~~l~~~~-g~~~~~~~SPDG 257 (435)
T PRK05137 181 IKRLAIMDQDGANVRYLTDG-SSLVLTPRFSPNRQ-EITYMSYANGRPRVYLLDLETGQRELVGNFP-GMTFAPRFSPDG 257 (435)
T ss_pred ceEEEEECCCCCCcEEEecC-CCCeEeeEECCCCC-EEEEEEecCCCCEEEEEECCCCcEEEeecCC-CcccCcEECCCC
Confidence 34799999999998777542 22344566777667 8888754 3468999999887766554332 222233455566
Q ss_pred cEEEEEeCC--CCceeeeccCC
Q psy951 87 NNIYWLARD--TGSLYKQDKFG 106 (428)
Q Consensus 87 ~~lYwtD~~--~~~I~~~~~~g 106 (428)
++|+++-.. ...|+..+..+
T Consensus 258 ~~la~~~~~~g~~~Iy~~d~~~ 279 (435)
T PRK05137 258 RKVVMSLSQGGNTDIYTMDLRS 279 (435)
T ss_pred CEEEEEEecCCCceEEEEECCC
Confidence 777766432 22455555543
No 89
>PF03022 MRJP: Major royal jelly protein; InterPro: IPR003534 The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and 82-90% of the protein content [], of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees [] and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content []. Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that results from a high genetic variability. This polymorphism may be useful for genotyping individual bees [].; PDB: 3Q6P_B 3Q6K_A 3Q6T_A 2QE8_B.
Probab=87.84 E-value=3.2 Score=39.91 Aligned_cols=62 Identities=24% Similarity=0.419 Sum_probs=48.2
Q ss_pred CCceeEEEcCCCCCEEEEEECCCCeEEEEEcCC----CCeEEEEeCCCCccceeeeeccc---cEEEEEeC
Q psy951 31 RHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDG----RNRQTILSGSDKLQHPISLDVFE---NNIYWLAR 94 (428)
Q Consensus 31 ~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG----~~~~~i~~~~~~~~~p~~l~~~~---~~lYwtD~ 94 (428)
....|+++|. .+ .||+++...+.|.+.+.++ .+.++++.....+..|-++.+.. ++||....
T Consensus 186 ~~s~g~~~D~-~G-~ly~~~~~~~aI~~w~~~~~~~~~~~~~l~~d~~~l~~pd~~~i~~~~~g~L~v~sn 254 (287)
T PF03022_consen 186 SQSDGMAIDP-NG-NLYFTDVEQNAIGCWDPDGPYTPENFEILAQDPRTLQWPDGLKIDPEGDGYLWVLSN 254 (287)
T ss_dssp -SECEEEEET-TT-EEEEEECCCTEEEEEETTTSB-GCCEEEEEE-CC-GSSEEEEEE-T--TS-EEEEE-
T ss_pred CCCceEEECC-CC-cEEEecCCCCeEEEEeCCCCcCccchheeEEcCceeeccceeeeccccCceEEEEEC
Confidence 3568999997 57 7999999999999999999 56677777653588999999887 89999863
No 90
>PF07995 GSDH: Glucose / Sorbosone dehydrogenase; InterPro: IPR012938 Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase (P13650 from SWISSPROT) from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor [].; GO: 0016901 oxidoreductase activity, acting on the CH-OH group of donors, quinone or similar compound as acceptor, 0048038 quinone binding, 0005975 carbohydrate metabolic process; PDB: 2ISM_A 2WG3_D 3HO5_A 3HO4_A 3HO3_A 2WFT_A 2WG4_B 2WFX_B 1CRU_A 1CQ1_B ....
Probab=87.24 E-value=2.6 Score=41.42 Aligned_cols=83 Identities=10% Similarity=0.115 Sum_probs=56.5
Q ss_pred ccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEE-E-----eccccceeeeeec---cCCeEEEEeCCC---
Q psy951 323 ERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAV-I-----SDERRIEALDIDP---VDEIIYWVDSYD--- 390 (428)
Q Consensus 323 ~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i-~-----~~~~~p~glavD~---~~~~lYwtd~~~--- 390 (428)
|++|++|++.|. +.||-++. ..+|.+...+|.. ...+. + .+..-..|||+|. .++.||.+.+..
T Consensus 1 L~~P~~~a~~pd-G~l~v~e~-~G~i~~~~~~g~~--~~~v~~~~~v~~~~~~gllgia~~p~f~~n~~lYv~~t~~~~~ 76 (331)
T PF07995_consen 1 LNNPRSMAFLPD-GRLLVAER-SGRIWVVDKDGSL--KTPVADLPEVFADGERGLLGIAFHPDFASNGYLYVYYTNADED 76 (331)
T ss_dssp ESSEEEEEEETT-SCEEEEET-TTEEEEEETTTEE--CEEEEE-TTTBTSTTBSEEEEEE-TTCCCC-EEEEEEEEE-TS
T ss_pred CCCceEEEEeCC-CcEEEEeC-CceEEEEeCCCcC--cceecccccccccccCCcccceeccccCCCCEEEEEEEcccCC
Confidence 568999999997 78899998 8899988877731 12221 1 1456778999998 468898877732
Q ss_pred -----CcEEEEEccCc-----ceeEEEec
Q psy951 391 -----RNIRRSFMLEA-----QKGQVQAG 409 (428)
Q Consensus 391 -----~~I~~~~~~g~-----~~~~l~~~ 409 (428)
.+|.+..++.. ..++|+.+
T Consensus 77 ~~~~~~~v~r~~~~~~~~~~~~~~~l~~~ 105 (331)
T PF07995_consen 77 GGDNDNRVVRFTLSDGDGDLSSEEVLVTG 105 (331)
T ss_dssp SSSEEEEEEEEEEETTSCEEEEEEEEEEE
T ss_pred CCCcceeeEEEeccCCccccccceEEEEE
Confidence 46888777665 34555544
No 91
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface protein (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=87.15 E-value=0.15 Score=44.71 Aligned_cols=71 Identities=25% Similarity=0.702 Sum_probs=40.1
Q ss_pred CCCCCCCCCCC-ccccCC---CcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCC--CCCCCCEEeeCCCCceEecC
Q psy951 136 PNPCSQSPCSH-LCLVIP---GGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVC--QCQNGGMCAESETGDLTCNC 209 (428)
Q Consensus 136 ~n~C~~~~C~~-~C~~~~---~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c--~C~ngg~C~~~~~g~~~C~C 209 (428)
++.|....|.. .|+..+ ....|+|.-|+ ...+ ...|...-+ .+| .|..+-.|.... +-|+|.|
T Consensus 87 p~~C~~~~Cg~GKCI~d~~~~~~~~CSC~IGk-V~~d---n~kCtk~G~------T~C~LKCk~nE~CK~~~-~~Y~C~~ 155 (197)
T PF06247_consen 87 PNKCNNKDCGSGKCILDPDNPNNPTCSCNIGK-VPDD---NKKCTKTGE------TKCSLKCKENEECKLVD-GYYKCVC 155 (197)
T ss_dssp EGGGSS---TTEEEEEEEGGGSEEEEEE-TEE-ETTT---TTESEEEE--------------TTTEEEEEET-TEEEEEE
T ss_pred hhhcCceecCCCeEEecCCCCCCceeEeeece-Eecc---CCcccCCCc------cceeeecCCCcceeeeC-cEEEeec
Confidence 35555556655 787666 45699999999 4222 356653321 233 476667898776 6799999
Q ss_pred CCCCccCc
Q psy951 210 RQDFAGTF 217 (428)
Q Consensus 210 ~~gy~G~~ 217 (428)
.+||.|.-
T Consensus 156 ~~~~~~~~ 163 (197)
T PF06247_consen 156 KEGFPGDG 163 (197)
T ss_dssp -TT-EEET
T ss_pred CCCCCCCC
Confidence 99998643
No 92
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=86.58 E-value=0.36 Score=30.69 Aligned_cols=29 Identities=24% Similarity=0.718 Sum_probs=20.9
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCc--cCcc
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFA--GTFC 218 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~--G~~C 218 (428)
.|+.+..|++...|++.|.|..||. |..|
T Consensus 6 ~cP~NA~C~~~~dG~eecrCllgyk~~~~~C 36 (37)
T PF12946_consen 6 KCPANAGCFRYDDGSEECRCLLGYKKVGGKC 36 (37)
T ss_dssp ---TTEEEEEETTSEEEEEE-TTEEEETTEE
T ss_pred cCCCCcccEEcCCCCEEEEeeCCccccCCCc
Confidence 4556678999888999999999998 4445
No 93
>TIGR03606 non_repeat_PQQ dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.
Probab=85.81 E-value=13 Score=38.02 Aligned_cols=80 Identities=14% Similarity=0.104 Sum_probs=56.0
Q ss_pred eccccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEE-----EE-e-ccccceeeeeec------cCCeEE
Q psy951 318 DVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQA-----VI-S-DERRIEALDIDP------VDEIIY 384 (428)
Q Consensus 318 ~~i~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~-----i~-~-~~~~p~glavD~------~~~~lY 384 (428)
.+.++|..|.+|++.|. +.||-+++...+|++..-++. ....+ ++ . +..-+.|||+|. .++.||
T Consensus 24 ~va~GL~~Pw~maflPD-G~llVtER~~G~I~~v~~~~~--~~~~~~~l~~v~~~~ge~GLlglal~PdF~~~~~n~~lY 100 (454)
T TIGR03606 24 VLLSGLNKPWALLWGPD-NQLWVTERATGKILRVNPETG--EVKVVFTLPEIVNDAQHNGLLGLALHPDFMQEKGNPYVY 100 (454)
T ss_pred EEECCCCCceEEEEcCC-CeEEEEEecCCEEEEEeCCCC--ceeeeecCCceeccCCCCceeeEEECCCccccCCCcEEE
Confidence 35689999999999985 688889987788888765442 11111 12 2 456688999994 457899
Q ss_pred EEeC---------CCCcEEEEEccC
Q psy951 385 WVDS---------YDRNIRRSFMLE 400 (428)
Q Consensus 385 wtd~---------~~~~I~~~~~~g 400 (428)
++.+ ...+|.+..++.
T Consensus 101 vsyt~~~~~~~~~~~~~I~R~~l~~ 125 (454)
T TIGR03606 101 ISYTYKNGDKELPNHTKIVRYTYDK 125 (454)
T ss_pred EEEeccCCCCCccCCcEEEEEEecC
Confidence 9853 145788888863
No 94
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=85.71 E-value=0.77 Score=31.44 Aligned_cols=18 Identities=22% Similarity=0.748 Sum_probs=14.8
Q ss_pred eEecCCCCCccCccCCCC
Q psy951 205 LTCNCRQDFAGTFCENYT 222 (428)
Q Consensus 205 ~~C~C~~gy~G~~Ce~~~ 222 (428)
-+|.|++++.|..|+.-.
T Consensus 19 G~C~C~~~~~G~~C~~C~ 36 (50)
T cd00055 19 GQCECKPNTTGRRCDRCA 36 (50)
T ss_pred CEEeCCCcCCCCCCCCCC
Confidence 379999999999998543
No 95
>KOG3514|consensus
Probab=85.31 E-value=3.6 Score=45.35 Aligned_cols=33 Identities=21% Similarity=0.579 Sum_probs=28.6
Q ss_pred CCCCCCCCC--ccccCCCcceecCCC-CCCCCCCCCCCccccc
Q psy951 138 PCSQSPCSH--LCLVIPGGYQCACPE-NATPKLPGVAEIRCSA 177 (428)
Q Consensus 138 ~C~~~~C~~--~C~~~~~~~~C~C~~-g~~~~~~~~~g~~C~~ 177 (428)
.|..+||+| .|...++.|.|.|.. || .|++|+-
T Consensus 625 ~C~~nPC~N~g~C~egwNrfiCDCs~T~~-------~G~~Cer 660 (1591)
T KOG3514|consen 625 ICESNPCQNGGKCSEGWNRFICDCSGTGF-------EGRTCER 660 (1591)
T ss_pred ccCCCcccCCCCccccccccccccccCcc-------cCccccc
Confidence 799999999 999999999999976 66 5788874
No 96
>PRK05137 tolB translocation protein TolB; Provisional
Probab=85.03 E-value=18 Score=36.82 Aligned_cols=105 Identities=20% Similarity=0.237 Sum_probs=65.7
Q ss_pred EEEeeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCC--CeEEEEEcCCCCeEEEEeCCCCccce
Q psy951 2 FWAETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKL--NTIESVRHDGRNRQTILSGSDKLQHP 79 (428)
Q Consensus 2 yWtd~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~--~~I~~~~ldG~~~~~i~~~~~~~~~p 79 (428)
|.++....+.|+..+++|...+.+.... ..-...+..+..+ .|+++.... ..|..++++|...+.+..+ .....
T Consensus 306 f~s~~~g~~~Iy~~d~~g~~~~~lt~~~-~~~~~~~~SpdG~-~ia~~~~~~~~~~i~~~d~~~~~~~~lt~~--~~~~~ 381 (435)
T PRK05137 306 FESDRSGSPQLYVMNADGSNPRRISFGG-GRYSTPVWSPRGD-LIAFTKQGGGQFSIGVMKPDGSGERILTSG--FLVEG 381 (435)
T ss_pred EEECCCCCCeEEEEECCCCCeEEeecCC-CcccCeEECCCCC-EEEEEEcCCCceEEEEEECCCCceEeccCC--CCCCC
Confidence 3344444457888888887766654321 1223456777777 888876543 4789999988877665543 22233
Q ss_pred eeeeccccEEEEEeCCC-----CceeeeccCCCCce
Q psy951 80 ISLDVFENNIYWLARDT-----GSLYKQDKFGRGVP 110 (428)
Q Consensus 80 ~~l~~~~~~lYwtD~~~-----~~I~~~~~~g~~~~ 110 (428)
...+..+..||++-... ..|+..+.+|+...
T Consensus 382 p~~spDG~~i~~~~~~~~~~~~~~L~~~dl~g~~~~ 417 (435)
T PRK05137 382 PTWAPNGRVIMFFRQTPGSGGAPKLYTVDLTGRNER 417 (435)
T ss_pred CeECCCCCEEEEEEccCCCCCcceEEEEECCCCceE
Confidence 44566677888875432 35888888865443
No 97
>PRK04792 tolB translocation protein TolB; Provisional
Probab=84.91 E-value=17 Score=37.31 Aligned_cols=98 Identities=16% Similarity=0.145 Sum_probs=62.2
Q ss_pred CCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCC--eEEEEEcCCCCeEEEEeCCCCccceeeeecc
Q psy951 8 ASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLN--TIESVRHDGRNRQTILSGSDKLQHPISLDVF 85 (428)
Q Consensus 8 ~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~--~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~ 85 (428)
..+.|.+.++++...+.+.... ....+.++.+..+ .||.+....+ .|..+++++...+.+... ......+.+..
T Consensus 328 g~~~Iy~~dl~~g~~~~Lt~~g-~~~~~~~~SpDG~-~l~~~~~~~g~~~I~~~dl~~g~~~~lt~~--~~d~~ps~spd 403 (448)
T PRK04792 328 GKPQIYRVNLASGKVSRLTFEG-EQNLGGSITPDGR-SMIMVNRTNGKFNIARQDLETGAMQVLTST--RLDESPSVAPN 403 (448)
T ss_pred CCceEEEEECCCCCEEEEecCC-CCCcCeeECCCCC-EEEEEEecCCceEEEEEECCCCCeEEccCC--CCCCCceECCC
Confidence 3457888888765554443222 2344567887777 8988865443 788899988876655443 22222256777
Q ss_pred ccEEEEEeCCCC--ceeeeccCCCCc
Q psy951 86 ENNIYWLARDTG--SLYKQDKFGRGV 109 (428)
Q Consensus 86 ~~~lYwtD~~~~--~I~~~~~~g~~~ 109 (428)
+..|+++....+ .|+..+.+|...
T Consensus 404 G~~I~~~~~~~g~~~l~~~~~~G~~~ 429 (448)
T PRK04792 404 GTMVIYSTTYQGKQVLAAVSIDGRFK 429 (448)
T ss_pred CCEEEEEEecCCceEEEEEECCCCce
Confidence 888988876544 366677766543
No 98
>KOG3516|consensus
Probab=84.85 E-value=0.67 Score=51.37 Aligned_cols=35 Identities=31% Similarity=0.793 Sum_probs=29.2
Q ss_pred CCCCCCCEEeeCCCCceEecCC-CCCccCccCCCCCC
Q psy951 189 CQCQNGGMCAESETGDLTCNCR-QDFAGTFCENYTGI 224 (428)
Q Consensus 189 c~C~ngg~C~~~~~g~~~C~C~-~gy~G~~Ce~~~~~ 224 (428)
.+|++||.|...- ..|.|.|. .||.|..|+.....
T Consensus 551 N~CehgG~C~Qs~-~~f~C~C~~TGY~GatCHtsi~e 586 (1306)
T KOG3516|consen 551 NPCEHGGKCSQSW-DDFECNCELTGYKGATCHTSIYE 586 (1306)
T ss_pred ccccCCCcccccc-cceeEeccccccccccccCCCcc
Confidence 3799999999844 57999999 79999999876544
No 99
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=83.98 E-value=0.93 Score=30.81 Aligned_cols=24 Identities=25% Similarity=0.661 Sum_probs=17.4
Q ss_pred EEeeCCCCceEecCCCCCccCccCCCC
Q psy951 196 MCAESETGDLTCNCRQDFAGTFCENYT 222 (428)
Q Consensus 196 ~C~~~~~g~~~C~C~~gy~G~~Ce~~~ 222 (428)
.|.... .+|.|+++|.|..|+.=.
T Consensus 12 ~C~~~~---G~C~C~~~~~G~~C~~C~ 35 (49)
T PF00053_consen 12 TCDPST---GQCVCKPGTTGPRCDQCK 35 (49)
T ss_dssp SEEETC---EEESBSTTEESTTS-EE-
T ss_pred cccCCC---CEEeccccccCCcCcCCC
Confidence 566533 589999999999998643
No 100
>PRK04922 tolB translocation protein TolB; Provisional
Probab=83.06 E-value=57 Score=33.14 Aligned_cols=77 Identities=10% Similarity=0.079 Sum_probs=52.5
Q ss_pred eEEeeccCCCEEEEEecCCC--eEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeC--CCCcEEEEEccCcc
Q psy951 327 EALDIDPVDEIIYWVDSYDR--NIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDS--YDRNIRRSFMLEAQ 402 (428)
Q Consensus 327 ~~l~~d~~~~~lyWtd~~~~--~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~--~~~~I~~~~~~g~~ 402 (428)
...++.|..++|+++..... .|+...+++ +....+..+ ......+....++.|+|+.. +...|...+++|+.
T Consensus 339 ~~~~~SpDG~~Ia~~~~~~~~~~I~v~d~~~---g~~~~Lt~~-~~~~~p~~spdG~~i~~~s~~~g~~~L~~~~~~g~~ 414 (433)
T PRK04922 339 ARASVSPDGKKIAMVHGSGGQYRIAVMDLST---GSVRTLTPG-SLDESPSFAPNGSMVLYATREGGRGVLAAVSTDGRV 414 (433)
T ss_pred cCEEECCCCCEEEEEECCCCceeEEEEECCC---CCeEECCCC-CCCCCceECCCCCEEEEEEecCCceEEEEEECCCCc
Confidence 35788898899988765332 677777765 334433332 23445567777888988665 55679999999988
Q ss_pred eeEEE
Q psy951 403 KGQVQ 407 (428)
Q Consensus 403 ~~~l~ 407 (428)
++.|.
T Consensus 415 ~~~l~ 419 (433)
T PRK04922 415 RQRLV 419 (433)
T ss_pred eEEcc
Confidence 77775
No 101
>TIGR03032 conserved hypothetical protein TIGR03032. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown.
Probab=82.38 E-value=12 Score=36.12 Aligned_cols=61 Identities=11% Similarity=0.032 Sum_probs=47.1
Q ss_pred eCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCccceeeeeccccEEEEEe
Q psy951 27 MTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPISLDVFENNIYWLA 93 (428)
Q Consensus 27 ~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~~lYwtD 93 (428)
.+++..|++-... .+ +||..|++.+.|.+++.+....+++..- -..|.||+..++.+|..-
T Consensus 199 ~~GLsmPhSPRWh--dg-rLwvldsgtGev~~vD~~~G~~e~Va~v---pG~~rGL~f~G~llvVgm 259 (335)
T TIGR03032 199 ASGLSMPHSPRWY--QG-KLWLLNSGRGELGYVDPQAGKFQPVAFL---PGFTRGLAFAGDFAFVGL 259 (335)
T ss_pred EcCccCCcCCcEe--CC-eEEEEECCCCEEEEEcCCCCcEEEEEEC---CCCCcccceeCCEEEEEe
Confidence 4577888887776 46 8999999999999999985555566554 347889998877777764
No 102
>PHA02887 EGF-like protein; Provisional
Probab=81.47 E-value=1 Score=36.30 Aligned_cols=36 Identities=25% Similarity=0.562 Sum_probs=27.3
Q ss_pred CCCCCCC---CCCCC-ccccCC--CcceecCCCCCCCCCCCCCCccccc
Q psy951 135 APNPCSQ---SPCSH-LCLVIP--GGYQCACPENATPKLPGVAEIRCSA 177 (428)
Q Consensus 135 ~~n~C~~---~~C~~-~C~~~~--~~~~C~C~~g~~~~~~~~~g~~C~~ 177 (428)
...||.. +-|-| .|.-.+ ..+.|.|+.|| .|.+|+.
T Consensus 82 hf~pC~~eyk~YCiHG~C~yI~dL~epsCrC~~GY-------tG~RCE~ 123 (126)
T PHA02887 82 FFEKCKNDFNDFCINGECMNIIDLDEKFCICNKGY-------TGIRCDE 123 (126)
T ss_pred CccccChHhhCEeeCCEEEccccCCCceeECCCCc-------ccCCCCc
Confidence 5678875 36777 777655 67899999999 6788864
No 103
>PRK04922 tolB translocation protein TolB; Provisional
Probab=81.14 E-value=27 Score=35.44 Aligned_cols=101 Identities=19% Similarity=0.216 Sum_probs=62.8
Q ss_pred eeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCC--CeEEEEEcCCCCeEEEEeCCCCccceeee
Q psy951 5 ETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKL--NTIESVRHDGRNRQTILSGSDKLQHPISL 82 (428)
Q Consensus 5 d~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~--~~I~~~~ldG~~~~~i~~~~~~~~~p~~l 82 (428)
|....+.|...++++...+.+...+ ......++.+..+ .|+++.... ..|...++++...+.+..+ ....-...
T Consensus 311 d~~g~~~iy~~dl~~g~~~~lt~~g-~~~~~~~~SpDG~-~Ia~~~~~~~~~~I~v~d~~~g~~~~Lt~~--~~~~~p~~ 386 (433)
T PRK04922 311 DRGGRPQIYRVAASGGSAERLTFQG-NYNARASVSPDGK-KIAMVHGSGGQYRIAVMDLSTGSVRTLTPG--SLDESPSF 386 (433)
T ss_pred CCCCCceEEEEECCCCCeEEeecCC-CCccCEEECCCCC-EEEEEECCCCceeEEEEECCCCCeEECCCC--CCCCCceE
Confidence 3333457888888766555543322 3445678888777 898876533 3789999987766655433 12222355
Q ss_pred eccccEEEEEeCC--CCceeeeccCCCCc
Q psy951 83 DVFENNIYWLARD--TGSLYKQDKFGRGV 109 (428)
Q Consensus 83 ~~~~~~lYwtD~~--~~~I~~~~~~g~~~ 109 (428)
+..+..|+++... ...|+..+.+|...
T Consensus 387 spdG~~i~~~s~~~g~~~L~~~~~~g~~~ 415 (433)
T PRK04922 387 APNGSMVLYATREGGRGVLAAVSTDGRVR 415 (433)
T ss_pred CCCCCEEEEEEecCCceEEEEEECCCCce
Confidence 6667788887653 33577777776543
No 104
>PRK00178 tolB translocation protein TolB; Provisional
Probab=81.02 E-value=66 Score=32.49 Aligned_cols=76 Identities=8% Similarity=0.068 Sum_probs=51.3
Q ss_pred EEeeccCCCEEEEEecCCC--eEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeC--CCCcEEEEEccCcce
Q psy951 328 ALDIDPVDEIIYWVDSYDR--NIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDS--YDRNIRRSFMLEAQK 403 (428)
Q Consensus 328 ~l~~d~~~~~lyWtd~~~~--~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~--~~~~I~~~~~~g~~~ 403 (428)
...+.|..++|+++....+ .|+...+++ ++.+.+... ...+..++...++.|+++.. +...|...+++|...
T Consensus 335 ~~~~Spdg~~i~~~~~~~~~~~l~~~dl~t---g~~~~lt~~-~~~~~p~~spdg~~i~~~~~~~g~~~l~~~~~~g~~~ 410 (430)
T PRK00178 335 RPRLSADGKTLVMVHRQDGNFHVAAQDLQR---GSVRILTDT-SLDESPSVAPNGTMLIYATRQQGRGVLMLVSINGRVR 410 (430)
T ss_pred ceEECCCCCEEEEEEccCCceEEEEEECCC---CCEEEccCC-CCCCCceECCCCCEEEEEEecCCceEEEEEECCCCce
Confidence 3567888889988865433 677777766 334444332 22334567777999999765 456799999999877
Q ss_pred eEEE
Q psy951 404 GQVQ 407 (428)
Q Consensus 404 ~~l~ 407 (428)
+.|.
T Consensus 411 ~~l~ 414 (430)
T PRK00178 411 LPLP 414 (430)
T ss_pred EECc
Confidence 6664
No 105
>TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. The Tol-PAL system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggests that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear.
Probab=80.39 E-value=66 Score=32.11 Aligned_cols=78 Identities=12% Similarity=0.063 Sum_probs=51.2
Q ss_pred ceeEEeeccCCCEEEEEecCCC--eEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeC--CCCcEEEEEccC
Q psy951 325 RIEALDIDPVDEIIYWVDSYDR--NIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDS--YDRNIRRSFMLE 400 (428)
Q Consensus 325 ~~~~l~~d~~~~~lyWtd~~~~--~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~--~~~~I~~~~~~g 400 (428)
.....++.|..++|+++..... +|+...+++ ...+.+... ......++...++.|+|+.. +...+.+.+.+|
T Consensus 323 ~~~~~~~spdg~~i~~~~~~~~~~~i~~~d~~~---~~~~~l~~~-~~~~~p~~spdg~~l~~~~~~~~~~~l~~~~~~g 398 (417)
T TIGR02800 323 YNASPSWSPDGDLIAFVHREGGGFNIAVMDLDG---GGERVLTDT-GLDESPSFAPNGRMILYATTRGGRGVLGLVSTDG 398 (417)
T ss_pred CccCeEECCCCCEEEEEEccCCceEEEEEeCCC---CCeEEccCC-CCCCCceECCCCCEEEEEEeCCCcEEEEEEECCC
Confidence 3456778888899999976543 677777765 334444322 22344466666899999776 345688888888
Q ss_pred cceeEE
Q psy951 401 AQKGQV 406 (428)
Q Consensus 401 ~~~~~l 406 (428)
+..+.+
T Consensus 399 ~~~~~~ 404 (417)
T TIGR02800 399 RFRARL 404 (417)
T ss_pred ceeeEC
Confidence 766554
No 106
>PRK00178 tolB translocation protein TolB; Provisional
Probab=79.61 E-value=37 Score=34.34 Aligned_cols=101 Identities=17% Similarity=0.196 Sum_probs=62.7
Q ss_pred eeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCC--eEEEEEcCCCCeEEEEeCCCCccceeee
Q psy951 5 ETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLN--TIESVRHDGRNRQTILSGSDKLQHPISL 82 (428)
Q Consensus 5 d~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~--~I~~~~ldG~~~~~i~~~~~~~~~p~~l 82 (428)
+....+.|+..++++...+.+.... ......++.+..+ .|+++....+ .|..+++++...+.+.... ....| .+
T Consensus 306 ~~~g~~~iy~~d~~~g~~~~lt~~~-~~~~~~~~Spdg~-~i~~~~~~~~~~~l~~~dl~tg~~~~lt~~~-~~~~p-~~ 381 (430)
T PRK00178 306 DRGGKPQIYKVNVNGGRAERVTFVG-NYNARPRLSADGK-TLVMVHRQDGNFHVAAQDLQRGSVRILTDTS-LDESP-SV 381 (430)
T ss_pred CCCCCceEEEEECCCCCEEEeecCC-CCccceEECCCCC-EEEEEEccCCceEEEEEECCCCCEEEccCCC-CCCCc-eE
Confidence 3333457888888766555443322 2233456777777 8998875433 6888999887766665432 12233 56
Q ss_pred eccccEEEEEeCCC--CceeeeccCCCCc
Q psy951 83 DVFENNIYWLARDT--GSLYKQDKFGRGV 109 (428)
Q Consensus 83 ~~~~~~lYwtD~~~--~~I~~~~~~g~~~ 109 (428)
+..+..|+++.... ..|+..+.+|...
T Consensus 382 spdg~~i~~~~~~~g~~~l~~~~~~g~~~ 410 (430)
T PRK00178 382 APNGTMLIYATRQQGRGVLMLVSINGRVR 410 (430)
T ss_pred CCCCCEEEEEEecCCceEEEEEECCCCce
Confidence 67788899987654 3577777765443
No 107
>KOG3607|consensus
Probab=79.41 E-value=2.3 Score=45.92 Aligned_cols=30 Identities=30% Similarity=0.833 Sum_probs=22.5
Q ss_pred CCCCCEEeeCCCCceEecCCCCCccCccCCCCCC
Q psy951 191 CQNGGMCAESETGDLTCNCRQDFAGTFCENYTGI 224 (428)
Q Consensus 191 C~ngg~C~~~~~g~~~C~C~~gy~G~~Ce~~~~~ 224 (428)
|...|.|-+ ...|.|.+||.+..|+.....
T Consensus 632 C~g~GVCnn----~~~ChC~~gwapp~C~~~~~~ 661 (716)
T KOG3607|consen 632 CNGHGVCNN----ELNCHCEPGWAPPFCFIFGYG 661 (716)
T ss_pred cCCCcccCC----CcceeeCCCCCCCccccccCC
Confidence 444555543 467999999999999987655
No 108
>PRK02889 tolB translocation protein TolB; Provisional
Probab=79.34 E-value=41 Score=34.14 Aligned_cols=101 Identities=14% Similarity=0.134 Sum_probs=61.1
Q ss_pred EeeCCCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCC--eEEEEEcCCCCeEEEEeCCCCccceee
Q psy951 4 AETGASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLN--TIESVRHDGRNRQTILSGSDKLQHPIS 81 (428)
Q Consensus 4 td~~~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~--~I~~~~ldG~~~~~i~~~~~~~~~p~~ 81 (428)
+|.+..+.|...+++|...+.+...+ ......++.+..+ .|+++....+ .|...++++...+.+..+ .......
T Consensus 302 s~~~g~~~Iy~~~~~~g~~~~lt~~g-~~~~~~~~SpDG~-~Ia~~s~~~g~~~I~v~d~~~g~~~~lt~~--~~~~~p~ 377 (427)
T PRK02889 302 SDRGGAPQIYRMPASGGAAQRVTFTG-SYNTSPRISPDGK-LLAYISRVGGAFKLYVQDLATGQVTALTDT--TRDESPS 377 (427)
T ss_pred ecCCCCcEEEEEECCCCceEEEecCC-CCcCceEECCCCC-EEEEEEccCCcEEEEEEECCCCCeEEccCC--CCccCce
Confidence 33334457888888776554443322 2233467777777 8888865433 788899987766665544 2223345
Q ss_pred eeccccEEEEEeCCCC--ceeeeccCCCC
Q psy951 82 LDVFENNIYWLARDTG--SLYKQDKFGRG 108 (428)
Q Consensus 82 l~~~~~~lYwtD~~~~--~I~~~~~~g~~ 108 (428)
.+..+..||++....+ .|+..+.+|..
T Consensus 378 ~spdg~~l~~~~~~~g~~~l~~~~~~g~~ 406 (427)
T PRK02889 378 FAPNGRYILYATQQGGRSVLAAVSSDGRI 406 (427)
T ss_pred ECCCCCEEEEEEecCCCEEEEEEECCCCc
Confidence 5666778888764433 36666666543
No 109
>COG3204 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=79.02 E-value=18 Score=34.65 Aligned_cols=101 Identities=14% Similarity=0.190 Sum_probs=66.8
Q ss_pred ccceeEEeeccCCCEEEEEecCC-CeEEEeeeccccCCceEEEE---ec----cccceeeeeeccCCeEEEEeCCCCcEE
Q psy951 323 ERRIEALDIDPVDEIIYWVDSYD-RNIRRSFMLEAQKGQVQAVI---SD----ERRIEALDIDPVDEIIYWVDSYDRNIR 394 (428)
Q Consensus 323 ~~~~~~l~~d~~~~~lyWtd~~~-~~I~ra~l~g~~~~~~~~i~---~~----~~~p~glavD~~~~~lYwtd~~~~~I~ 394 (428)
-+..+|+++|+.++++|-+.-.. .+|+.....-+. -...+.. .. +.-..||..|..+++|+.--...+.+-
T Consensus 180 N~GfEGlA~d~~~~~l~~aKEr~P~~I~~~~~~~~~-l~~~~~~~~~~~~~~f~~DvSgl~~~~~~~~LLVLS~ESr~l~ 258 (316)
T COG3204 180 NKGFEGLAWDPVDHRLFVAKERNPIGIFEVTQSPSS-LSVHASLDPTADRDLFVLDVSGLEFNAITNSLLVLSDESRRLL 258 (316)
T ss_pred CcCceeeecCCCCceEEEEEccCCcEEEEEecCCcc-cccccccCcccccceEeeccccceecCCCCcEEEEecCCceEE
Confidence 34578999999999999885444 488877744311 1111111 12 456789999999999999776666677
Q ss_pred EEEccCcceeEEEecccCCCC--CCCc--ccccc
Q psy951 395 RSFMLEAQKGQVQAGASRHQN--GVPA--SSQRN 424 (428)
Q Consensus 395 ~~~~~g~~~~~l~~~~~~~~~--~~p~--~~~~~ 424 (428)
..+.+|.-+..+.=..-.||+ .+|+ +||.|
T Consensus 259 Evd~~G~~~~~lsL~~g~~gL~~dipqaEGiamD 292 (316)
T COG3204 259 EVDLSGEVIELLSLTKGNHGLSSDIPQAEGIAMD 292 (316)
T ss_pred EEecCCCeeeeEEeccCCCCCcccCCCcceeEEC
Confidence 778899877777655545555 4454 34443
No 110
>KOG4260|consensus
Probab=79.01 E-value=2.2 Score=39.78 Aligned_cols=31 Identities=26% Similarity=0.819 Sum_probs=23.3
Q ss_pred CCCCCCEEeeCC--CCceEecCCCCCccCccCC
Q psy951 190 QCQNGGMCAESE--TGDLTCNCRQDFAGTFCEN 220 (428)
Q Consensus 190 ~C~ngg~C~~~~--~g~~~C~C~~gy~G~~Ce~ 220 (428)
+|.-.|.|.-.. .|+-+|.|..||.|..|..
T Consensus 151 ~C~GnG~C~GdGsR~GsGkCkC~~GY~Gp~C~~ 183 (350)
T KOG4260|consen 151 PCFGNGSCHGDGSREGSGKCKCETGYTGPLCRY 183 (350)
T ss_pred CcCCCCcccCCCCCCCCCcccccCCCCCccccc
Confidence 466667886443 3667999999999998854
No 111
>PRK02889 tolB translocation protein TolB; Provisional
Probab=78.80 E-value=79 Score=32.07 Aligned_cols=76 Identities=14% Similarity=0.168 Sum_probs=51.3
Q ss_pred EEeeccCCCEEEEEecCCC--eEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEe--CCCCcEEEEEccCcce
Q psy951 328 ALDIDPVDEIIYWVDSYDR--NIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVD--SYDRNIRRSFMLEAQK 403 (428)
Q Consensus 328 ~l~~d~~~~~lyWtd~~~~--~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd--~~~~~I~~~~~~g~~~ 403 (428)
..++.|..++|+++..... .|+...+++ +....+.++ ...+..+....++.||++- .+...|.+.+.+|+.+
T Consensus 332 ~~~~SpDG~~Ia~~s~~~g~~~I~v~d~~~---g~~~~lt~~-~~~~~p~~spdg~~l~~~~~~~g~~~l~~~~~~g~~~ 407 (427)
T PRK02889 332 SPRISPDGKLLAYISRVGGAFKLYVQDLAT---GQVTALTDT-TRDESPSFAPNGRYILYATQQGGRSVLAAVSSDGRIK 407 (427)
T ss_pred ceEECCCCCEEEEEEccCCcEEEEEEECCC---CCeEEccCC-CCccCceECCCCCEEEEEEecCCCEEEEEEECCCCce
Confidence 4578888888888765433 677777765 334444332 2335667777789998865 3555688999999888
Q ss_pred eEEE
Q psy951 404 GQVQ 407 (428)
Q Consensus 404 ~~l~ 407 (428)
+.+.
T Consensus 408 ~~l~ 411 (427)
T PRK02889 408 QRLS 411 (427)
T ss_pred EEee
Confidence 7775
No 112
>TIGR02658 TTQ_MADH_Hv methylamine dehydrogenase heavy chain. This family consists of the heavy chain of methylamine dehydrogenase, a periplasmic enzyme. The enzyme contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from two Trp residues in the light subunit. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome.
Probab=78.50 E-value=68 Score=31.79 Aligned_cols=100 Identities=18% Similarity=0.231 Sum_probs=63.3
Q ss_pred EEEeeCCCCeEEEEEcCCCCcEEEE-----eCC----CCCcee---EEEcCCCCCEEEEE-ECCC--------CeEEEEE
Q psy951 2 FWAETGASPRIESAWMDGSHRRSLV-----MTG----VRHPTG---LSVDAAMDHTLYWV-DSKL--------NTIESVR 60 (428)
Q Consensus 2 yWtd~~~~~~I~~a~~DG~~~~~l~-----~~~----~~~P~g---lavD~~~~~~lYW~-d~~~--------~~I~~~~ 60 (428)
+|.... ..|...++.|....... ... --+|.| +++++..+ +||-+ ..+. +.|..++
T Consensus 209 ~~vs~e--G~V~~id~~~~~~~~~~~~~~~~~~~~~~~wrP~g~q~ia~~~dg~-~lyV~~~~~~~~thk~~~~~V~ViD 285 (352)
T TIGR02658 209 VWPTYT--GKIFQIDLSSGDAKFLPAIEAFTEAEKADGWRPGGWQQVAYHRARD-RIYLLADQRAKWTHKTASRFLFVVD 285 (352)
T ss_pred EEEecC--CeEEEEecCCCcceecceeeeccccccccccCCCcceeEEEcCCCC-EEEEEecCCccccccCCCCEEEEEE
Confidence 455543 46888887665433321 111 126777 99999988 99994 3322 5888888
Q ss_pred cCCCCeEEEEe-CCCCccceeeeecccc-EEEEEeCCCCceeeeccCC
Q psy951 61 HDGRNRQTILS-GSDKLQHPISLDVFEN-NIYWLARDTGSLYKQDKFG 106 (428)
Q Consensus 61 ldG~~~~~i~~-~~~~~~~p~~l~~~~~-~lYwtD~~~~~I~~~~~~g 106 (428)
.....+.--+. + .....+.++-.+. .||-+....+.|..++...
T Consensus 286 ~~t~kvi~~i~vG--~~~~~iavS~Dgkp~lyvtn~~s~~VsViD~~t 331 (352)
T TIGR02658 286 AKTGKRLRKIELG--HEIDSINVSQDAKPLLYALSTGDKTLYIFDAET 331 (352)
T ss_pred CCCCeEEEEEeCC--CceeeEEECCCCCeEEEEeCCCCCcEEEEECcC
Confidence 75554432223 3 3444556666677 8999988888888888653
No 113
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=78.40 E-value=2 Score=39.44 Aligned_cols=20 Identities=25% Similarity=0.762 Sum_probs=17.0
Q ss_pred CEEeeCCCCceEecCCCCCcc
Q psy951 195 GMCAESETGDLTCNCRQDFAG 215 (428)
Q Consensus 195 g~C~~~~~g~~~C~C~~gy~G 215 (428)
..|.+.. |+|.|.|+.||+.
T Consensus 199 ~~C~~~~-g~~~c~c~~g~~~ 218 (224)
T cd01475 199 QVCISTP-GSYLCACTEGYAL 218 (224)
T ss_pred ceEEcCC-CCEEeECCCCccC
Confidence 4788877 7899999999984
No 114
>PF15102 TMEM154: TMEM154 protein family
Probab=76.78 E-value=1.9 Score=36.50 Aligned_cols=27 Identities=26% Similarity=0.520 Sum_probs=11.8
Q ss_pred HHHHHHHHH-HHHHHHHHhhheeecCCC
Q psy951 234 LLYIPTLLL-LLALVSATVYYVWRKRPF 260 (428)
Q Consensus 234 ~~~~~i~~~-~~~~~~~~~~~~~r~~~~ 260 (428)
.++++.+++ +++++++++++++|||+.
T Consensus 60 mIlIP~VLLvlLLl~vV~lv~~~kRkr~ 87 (146)
T PF15102_consen 60 MILIPLVLLVLLLLSVVCLVIYYKRKRT 87 (146)
T ss_pred EEeHHHHHHHHHHHHHHHheeEEeeccc
Confidence 345553333 333444444444444443
No 115
>TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. The Tol-PAL system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggests that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear.
Probab=75.83 E-value=50 Score=32.98 Aligned_cols=95 Identities=19% Similarity=0.120 Sum_probs=60.2
Q ss_pred CCCeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCC--CeEEEEEcCCCCeEEEEeCCCCccceeeeecc
Q psy951 8 ASPRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKL--NTIESVRHDGRNRQTILSGSDKLQHPISLDVF 85 (428)
Q Consensus 8 ~~~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~--~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~ 85 (428)
..+.|...++++...+.+.. .-......++.+..+ .|+++.... ..|..+++++...+.+.... .... ...+..
T Consensus 300 g~~~iy~~d~~~~~~~~l~~-~~~~~~~~~~spdg~-~i~~~~~~~~~~~i~~~d~~~~~~~~l~~~~-~~~~-p~~spd 375 (417)
T TIGR02800 300 GSPQIYMMDADGGEVRRLTF-RGGYNASPSWSPDGD-LIAFVHREGGGFNIAVMDLDGGGERVLTDTG-LDES-PSFAPN 375 (417)
T ss_pred CCceEEEEECCCCCEEEeec-CCCCccCeEECCCCC-EEEEEEccCCceEEEEEeCCCCCeEEccCCC-CCCC-ceECCC
Confidence 34578888888776555543 234556778888777 899987654 37889999886665554431 2222 245667
Q ss_pred ccEEEEEeCCCC--ceeeeccCC
Q psy951 86 ENNIYWLARDTG--SLYKQDKFG 106 (428)
Q Consensus 86 ~~~lYwtD~~~~--~I~~~~~~g 106 (428)
+++|+++....+ .++..+.+|
T Consensus 376 g~~l~~~~~~~~~~~l~~~~~~g 398 (417)
T TIGR02800 376 GRMILYATTRGGRGVLGLVSTDG 398 (417)
T ss_pred CCEEEEEEeCCCcEEEEEEECCC
Confidence 788988876443 344444444
No 116
>PF12273 RCR: Chitin synthesis regulation, resistance to Congo red; InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall []. RCR proteins, which probably regulate chitin synthase III, interact directly with ubiquitin ligase Rsp5, and the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5 [].
Probab=75.77 E-value=1.2 Score=37.32 Aligned_cols=6 Identities=33% Similarity=0.390 Sum_probs=2.2
Q ss_pred ecCCCC
Q psy951 256 RKRPFG 261 (428)
Q Consensus 256 r~~~~~ 261 (428)
|||+++
T Consensus 24 rRR~r~ 29 (130)
T PF12273_consen 24 RRRRRR 29 (130)
T ss_pred HHHhhc
Confidence 333333
No 117
>PRK01742 tolB translocation protein TolB; Provisional
Probab=74.83 E-value=61 Score=32.89 Aligned_cols=95 Identities=11% Similarity=0.049 Sum_probs=60.8
Q ss_pred CeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECC--CCeEEEEEcCCCCeEEEEeCCCCccceeeeecccc
Q psy951 10 PRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSK--LNTIESVRHDGRNRQTILSGSDKLQHPISLDVFEN 87 (428)
Q Consensus 10 ~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~--~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~ 87 (428)
..|..++.||.+++.+... -..-...+..+..+ +|.|+... ...|...++++..++.+.... +.......+..+.
T Consensus 184 ~~i~i~d~dg~~~~~lt~~-~~~v~~p~wSPDG~-~la~~s~~~~~~~i~i~dl~tg~~~~l~~~~-g~~~~~~wSPDG~ 260 (429)
T PRK01742 184 YEVRVADYDGFNQFIVNRS-SQPLMSPAWSPDGS-KLAYVSFENKKSQLVVHDLRSGARKVVASFR-GHNGAPAFSPDGS 260 (429)
T ss_pred EEEEEECCCCCCceEeccC-CCccccceEcCCCC-EEEEEEecCCCcEEEEEeCCCCceEEEecCC-CccCceeECCCCC
Confidence 4788889999997766442 22446678888888 89887543 357999999877666554432 2223345556677
Q ss_pred EEEEEeCCCC--ceeeeccCCC
Q psy951 88 NIYWLARDTG--SLYKQDKFGR 107 (428)
Q Consensus 88 ~lYwtD~~~~--~I~~~~~~g~ 107 (428)
+|+++-...+ .|+..+..++
T Consensus 261 ~La~~~~~~g~~~Iy~~d~~~~ 282 (429)
T PRK01742 261 RLAFASSKDGVLNIYVMGANGG 282 (429)
T ss_pred EEEEEEecCCcEEEEEEECCCC
Confidence 8888643222 4666665543
No 118
>COG3204 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=73.68 E-value=88 Score=30.07 Aligned_cols=72 Identities=18% Similarity=0.242 Sum_probs=55.7
Q ss_pred CCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCccceeeeeccccEEEEEe-CCCCceeeeccC
Q psy951 31 RHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPISLDVFENNIYWLA-RDTGSLYKQDKF 105 (428)
Q Consensus 31 ~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~~lYwtD-~~~~~I~~~~~~ 105 (428)
..-.+|+.++.++ .||=+-.....|-.+..+|.-.+++.-. +...|.+|.+.++..|-.- -....++....+
T Consensus 86 ~nvS~LTynp~~r-tLFav~n~p~~iVElt~~GdlirtiPL~--g~~DpE~Ieyig~n~fvi~dER~~~l~~~~vd 158 (316)
T COG3204 86 ANVSSLTYNPDTR-TLFAVTNKPAAIVELTKEGDLIRTIPLT--GFSDPETIEYIGGNQFVIVDERDRALYLFTVD 158 (316)
T ss_pred ccccceeeCCCcc-eEEEecCCCceEEEEecCCceEEEeccc--ccCChhHeEEecCCEEEEEehhcceEEEEEEc
Confidence 3578999999999 9999888888999999999988887765 5888999998877766654 334445544443
No 119
>TIGR03606 non_repeat_PQQ dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.
Probab=72.54 E-value=6.6 Score=40.20 Aligned_cols=43 Identities=28% Similarity=0.380 Sum_probs=36.5
Q ss_pred CeEEEEEcCCCC----------cEEEEeCCCCCceeEEEcCCCCCEEEEEECCCC
Q psy951 10 PRIESAWMDGSH----------RRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLN 54 (428)
Q Consensus 10 ~~I~~a~~DG~~----------~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~ 54 (428)
.||.|.+.||+- +..|...++..|-||++|+ ++ +||.+|.+.+
T Consensus 199 GkILRin~DGsiP~dNPf~~g~~~eIyA~G~RNp~Gla~dp-~G-~Lw~~e~Gp~ 251 (454)
T TIGR03606 199 GKVLRLNLDGSIPKDNPSINGVVSHIFTYGHRNPQGLAFTP-DG-TLYASEQGPN 251 (454)
T ss_pred eEEEEEcCCCCCCCCCCccCCCcceEEEEeccccceeEECC-CC-CEEEEecCCC
Confidence 589999999973 3457778999999999998 67 8999998865
No 120
>PF08693 SKG6: Transmembrane alpha-helix domain; InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation [, ]. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins [, ]. The full-length AXL2 protein has a negative regulatory function in cytokinesis [].
Probab=72.39 E-value=3.8 Score=26.59 Aligned_cols=13 Identities=15% Similarity=0.651 Sum_probs=6.5
Q ss_pred HHHHhhheeecCC
Q psy951 247 VSATVYYVWRKRP 259 (428)
Q Consensus 247 ~~~~~~~~~r~~~ 259 (428)
+++++|++|||++
T Consensus 28 l~~~l~~~~rR~k 40 (40)
T PF08693_consen 28 LGAFLFFWYRRKK 40 (40)
T ss_pred HHHHhheEEeccC
Confidence 3344455666653
No 121
>PF14991 MLANA: Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=72.35 E-value=0.62 Score=37.35 Aligned_cols=27 Identities=15% Similarity=0.260 Sum_probs=2.2
Q ss_pred hHHHHHHHHHHHHHHHHHhhheeecCC
Q psy951 233 SLLYIPTLLLLLALVSATVYYVWRKRP 259 (428)
Q Consensus 233 ~~~~~~i~~~~~~~~~~~~~~~~r~~~ 259 (428)
.+++++|++++|++++++++.++|||.
T Consensus 25 EAaGIGiL~VILgiLLliGCWYckRRS 51 (118)
T PF14991_consen 25 EAAGIGILIVILGILLLIGCWYCKRRS 51 (118)
T ss_dssp ---SSS---------------------
T ss_pred HhccceeHHHHHHHHHHHhheeeeecc
Confidence 456677777777666666776666665
No 122
>PRK03629 tolB translocation protein TolB; Provisional
Probab=72.31 E-value=1.2e+02 Score=30.88 Aligned_cols=93 Identities=11% Similarity=0.031 Sum_probs=61.3
Q ss_pred CeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEEC--CCCeEEEEEcCCCCeEEEEeCCCCccceeeeecccc
Q psy951 10 PRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDS--KLNTIESVRHDGRNRQTILSGSDKLQHPISLDVFEN 87 (428)
Q Consensus 10 ~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~--~~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~ 87 (428)
++|..++.||.+++.+... -..-...+..+..+ +|.|+-. +...|...++++...+.+.... ........+..+.
T Consensus 179 ~~l~~~d~dg~~~~~lt~~-~~~~~~p~wSPDG~-~la~~s~~~g~~~i~i~dl~~G~~~~l~~~~-~~~~~~~~SPDG~ 255 (429)
T PRK03629 179 YELRVSDYDGYNQFVVHRS-PQPLMSPAWSPDGS-KLAYVTFESGRSALVIQTLANGAVRQVASFP-RHNGAPAFSPDGS 255 (429)
T ss_pred eeEEEEcCCCCCCEEeecC-CCceeeeEEcCCCC-EEEEEEecCCCcEEEEEECCCCCeEEccCCC-CCcCCeEECCCCC
Confidence 4799999999998887553 23455778888887 8887643 3457999999887766665432 2222345566777
Q ss_pred EEEEEeCCCC--ceeeeccC
Q psy951 88 NIYWLARDTG--SLYKQDKF 105 (428)
Q Consensus 88 ~lYwtD~~~~--~I~~~~~~ 105 (428)
+|+++....+ .|+..+..
T Consensus 256 ~La~~~~~~g~~~I~~~d~~ 275 (429)
T PRK03629 256 KLAFALSKTGSLNLYVMDLA 275 (429)
T ss_pred EEEEEEcCCCCcEEEEEECC
Confidence 8888754322 35555554
No 123
>PRK03629 tolB translocation protein TolB; Provisional
Probab=71.64 E-value=79 Score=32.12 Aligned_cols=101 Identities=10% Similarity=0.103 Sum_probs=65.2
Q ss_pred cceeeeecCccceeeccccccceeEEeeccCCCEEEEE-ecC-CCeEEEeeeccccCCceEEEEeccccceeeeeeccCC
Q psy951 304 PEIRAYETHKRRFRDVISDERRIEALDIDPVDEIIYWV-DSY-DRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDE 381 (428)
Q Consensus 304 ~~~~~~~~~~~~~~~~i~~~~~~~~l~~d~~~~~lyWt-d~~-~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~ 381 (428)
.++..++........+...........+.|..+.|+.+ +.. ...|++..+++ +..+.+..........++...++
T Consensus 267 ~~I~~~d~~tg~~~~lt~~~~~~~~~~wSPDG~~I~f~s~~~g~~~Iy~~d~~~---g~~~~lt~~~~~~~~~~~SpDG~ 343 (429)
T PRK03629 267 LNLYVMDLASGQIRQVTDGRSNNTEPTWFPDSQNLAYTSDQAGRPQVYKVNING---GAPQRITWEGSQNQDADVSSDGK 343 (429)
T ss_pred cEEEEEECCCCCEEEccCCCCCcCceEECCCCCEEEEEeCCCCCceEEEEECCC---CCeEEeecCCCCccCEEECCCCC
Confidence 34555666655555555555566788999998878665 433 34899988887 33444433333344566777788
Q ss_pred eEEEEeC--CCCcEEEEEccCcceeEEE
Q psy951 382 IIYWVDS--YDRNIRRSFMLEAQKGQVQ 407 (428)
Q Consensus 382 ~lYwtd~--~~~~I~~~~~~g~~~~~l~ 407 (428)
.|+++.. +...|.+.++++...+.|-
T Consensus 344 ~Ia~~~~~~g~~~I~~~dl~~g~~~~Lt 371 (429)
T PRK03629 344 FMVMVSSNGGQQHIAKQDLATGGVQVLT 371 (429)
T ss_pred EEEEEEccCCCceEEEEECCCCCeEEeC
Confidence 8988654 3346888888877666554
No 124
>PRK01029 tolB translocation protein TolB; Provisional
Probab=71.16 E-value=1.2e+02 Score=30.90 Aligned_cols=105 Identities=10% Similarity=0.081 Sum_probs=61.6
Q ss_pred eeCCCCeEEEEEcCCCC-cEEEEeCCCCCceeEEEcCCCCCEEEEEECC--CCeEEEEEcCCCCeEEEEeCCCCccceee
Q psy951 5 ETGASPRIESAWMDGSH-RRSLVMTGVRHPTGLSVDAAMDHTLYWVDSK--LNTIESVRHDGRNRQTILSGSDKLQHPIS 81 (428)
Q Consensus 5 d~~~~~~I~~a~~DG~~-~~~l~~~~~~~P~glavD~~~~~~lYW~d~~--~~~I~~~~ldG~~~~~i~~~~~~~~~p~~ 81 (428)
+.+..++|...++++.. ....++..-......+..+..+ +|+++... ...|...++++...+.+.... .......
T Consensus 300 ~~~g~~~ly~~~~~~~g~~~~~lt~~~~~~~~p~wSPDG~-~Laf~~~~~g~~~I~v~dl~~g~~~~Lt~~~-~~~~~p~ 377 (428)
T PRK01029 300 NKDGRPRIYIMQIDPEGQSPRLLTKKYRNSSCPAWSPDGK-KIAFCSVIKGVRQICVYDLATGRDYQLTTSP-ENKESPS 377 (428)
T ss_pred CCCCCceEEEEECcccccceEEeccCCCCccceeECCCCC-EEEEEEcCCCCcEEEEEECCCCCeEEccCCC-CCccceE
Confidence 33334578887776432 2222332222334567777777 89888654 347999999888777665442 1222233
Q ss_pred eeccccEEEEEeCC--CCceeeeccCCCCcee
Q psy951 82 LDVFENNIYWLARD--TGSLYKQDKFGRGVPV 111 (428)
Q Consensus 82 l~~~~~~lYwtD~~--~~~I~~~~~~g~~~~~ 111 (428)
.+..+..||++... ...|+..+.+++....
T Consensus 378 wSpDG~~L~f~~~~~g~~~L~~vdl~~g~~~~ 409 (428)
T PRK01029 378 WAIDSLHLVYSAGNSNESELYLISLITKKTRK 409 (428)
T ss_pred ECCCCCEEEEEECCCCCceEEEEECCCCCEEE
Confidence 45566788877543 3468888887654433
No 125
>PF02239 Cytochrom_D1: Cytochrome D1 heme domain; PDB: 1NNO_B 1HZU_A 1N15_B 1N50_A 1GJQ_A 1BL9_B 1NIR_B 1N90_B 1HZV_A 1AOQ_A ....
Probab=70.87 E-value=81 Score=31.43 Aligned_cols=94 Identities=12% Similarity=0.153 Sum_probs=59.4
Q ss_pred cCCcceeeeecCccceeecccccccee-EEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeeecc
Q psy951 301 SNGPEIRAYETHKRRFRDVISDERRIE-ALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPV 379 (428)
Q Consensus 301 s~~~~~~~~~~~~~~~~~~i~~~~~~~-~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~ 379 (428)
....++..++....+....++....++ ++.+.+..+++|=++. +..|....+.- .+..--+.--..|.|+++...
T Consensus 13 ~~~~~v~viD~~t~~~~~~i~~~~~~h~~~~~s~Dgr~~yv~~r-dg~vsviD~~~---~~~v~~i~~G~~~~~i~~s~D 88 (369)
T PF02239_consen 13 RGSGSVAVIDGATNKVVARIPTGGAPHAGLKFSPDGRYLYVANR-DGTVSVIDLAT---GKVVATIKVGGNPRGIAVSPD 88 (369)
T ss_dssp GGGTEEEEEETTT-SEEEEEE-STTEEEEEE-TT-SSEEEEEET-TSEEEEEETTS---SSEEEEEE-SSEEEEEEE--T
T ss_pred cCCCEEEEEECCCCeEEEEEcCCCCceeEEEecCCCCEEEEEcC-CCeEEEEECCc---ccEEEEEecCCCcceEEEcCC
Confidence 345566666766666556666555554 4778888999999874 56776666644 222222333446999999999
Q ss_pred CCeEEEEeCCCCcEEEEEc
Q psy951 380 DEIIYWVDSYDRNIRRSFM 398 (428)
Q Consensus 380 ~~~lYwtd~~~~~I~~~~~ 398 (428)
++.||-+......+.+.+.
T Consensus 89 G~~~~v~n~~~~~v~v~D~ 107 (369)
T PF02239_consen 89 GKYVYVANYEPGTVSVIDA 107 (369)
T ss_dssp TTEEEEEEEETTEEEEEET
T ss_pred CCEEEEEecCCCceeEecc
Confidence 9999999987777887664
No 126
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) [] are integral membrane proteins, specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below. +-----+ +-----+ +-----+ +-----+ | | | | | | | | xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx +--------------------------++Hinge++--------------------------++TM++C+ In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin) [] is a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge; a single lamp-like domain; a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region []. In a lamp-family protein from nematodes [] only the part C-terminal to the hinge is conserved. ; GO: 0016020 membrane
Probab=69.99 E-value=5 Score=38.94 Aligned_cols=31 Identities=19% Similarity=0.201 Sum_probs=22.4
Q ss_pred hhHHHHHHHHHHHHHHHHHhhheeecCCCCC
Q psy951 232 RSLLYIPTLLLLLALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 232 ~~~~~~~i~~~~~~~~~~~~~~~~r~~~~~~ 262 (428)
.+.+++|++++.|++++++.|++.|||.++.
T Consensus 272 ~vPIaVG~~La~lvlivLiaYli~Rrr~~~g 302 (306)
T PF01299_consen 272 LVPIAVGAALAGLVLIVLIAYLIGRRRSRAG 302 (306)
T ss_pred hHHHHHHHHHHHHHHHHHHhheeEecccccc
Confidence 4566677777766777777888888887654
No 127
>PRK01742 tolB translocation protein TolB; Provisional
Probab=67.96 E-value=87 Score=31.75 Aligned_cols=100 Identities=13% Similarity=0.009 Sum_probs=59.8
Q ss_pred cceeeeecCccceeeccccccceeEEeeccCCCEEEEEecC--CCeEEEeeeccccCCceEEEEeccccceeeeeeccCC
Q psy951 304 PEIRAYETHKRRFRDVISDERRIEALDIDPVDEIIYWVDSY--DRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDE 381 (428)
Q Consensus 304 ~~~~~~~~~~~~~~~~i~~~~~~~~l~~d~~~~~lyWtd~~--~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~ 381 (428)
+++..++........+........+.++.|...+|+++-.. ...|+...+++ .....+...-......+....++
T Consensus 228 ~~i~i~dl~tg~~~~l~~~~g~~~~~~wSPDG~~La~~~~~~g~~~Iy~~d~~~---~~~~~lt~~~~~~~~~~wSpDG~ 304 (429)
T PRK01742 228 SQLVVHDLRSGARKVVASFRGHNGAPAFSPDGSRLAFASSKDGVLNIYVMGANG---GTPSQLTSGAGNNTEPSWSPDGQ 304 (429)
T ss_pred cEEEEEeCCCCceEEEecCCCccCceeECCCCCEEEEEEecCCcEEEEEEECCC---CCeEeeccCCCCcCCEEECCCCC
Confidence 45666666544333332222334568899999999987432 23788888776 33333433333455667777788
Q ss_pred eEEEEeC--CCCcEEEEEccCcceeEE
Q psy951 382 IIYWVDS--YDRNIRRSFMLEAQKGQV 406 (428)
Q Consensus 382 ~lYwtd~--~~~~I~~~~~~g~~~~~l 406 (428)
.|+++-. +...|...+.+|...+.+
T Consensus 305 ~i~f~s~~~g~~~I~~~~~~~~~~~~l 331 (429)
T PRK01742 305 SILFTSDRSGSPQVYRMSASGGGASLV 331 (429)
T ss_pred EEEEEECCCCCceEEEEECCCCCeEEe
Confidence 8888643 455677777777655433
No 128
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=67.07 E-value=4.6 Score=33.38 Aligned_cols=29 Identities=10% Similarity=0.131 Sum_probs=12.7
Q ss_pred hHHHHHHHHHHHHHHHHHhhheeecCCCC
Q psy951 233 SLLYIPTLLLLLALVSATVYYVWRKRPFG 261 (428)
Q Consensus 233 ~~~~~~i~~~~~~~~~~~~~~~~r~~~~~ 261 (428)
..+++++++-++++++++.|+..|+|++.
T Consensus 67 ~~Ii~gv~aGvIg~Illi~y~irR~~Kk~ 95 (122)
T PF01102_consen 67 IGIIFGVMAGVIGIILLISYCIRRLRKKS 95 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHHHHHHHHS---
T ss_pred eehhHHHHHHHHHHHHHHHHHHHHHhccC
Confidence 34444444444444445555555555544
No 129
>KOG1226|consensus
Probab=66.83 E-value=8.7 Score=41.16 Aligned_cols=23 Identities=17% Similarity=0.466 Sum_probs=16.9
Q ss_pred EecCCCCCccCccCCCCCCCCCC
Q psy951 206 TCNCRQDFAGTFCENYTGIGQGL 228 (428)
Q Consensus 206 ~C~C~~gy~G~~Ce~~~~~~~~~ 228 (428)
.|.|.+||+|..|+-.....+++
T Consensus 567 ~CvC~~GwtG~~C~C~~std~C~ 589 (783)
T KOG1226|consen 567 RCVCNPGWTGSACNCPLSTDTCE 589 (783)
T ss_pred cEEcCCCCccCCCCCCCCCcccc
Confidence 58999999999987665544333
No 130
>COG2133 Glucose/sorbosone dehydrogenases [Carbohydrate transport and metabolism]
Probab=65.86 E-value=16 Score=36.64 Aligned_cols=69 Identities=13% Similarity=0.080 Sum_probs=45.8
Q ss_pred cceeEEeeccCCCEEEEEecCC-------------CeEEEeeecc---cc-C-CceEEEEeccccceeeeeeccCCeEEE
Q psy951 324 RRIEALDIDPVDEIIYWVDSYD-------------RNIRRSFMLE---AQ-K-GQVQAVISDERRIEALDIDPVDEIIYW 385 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~-------------~~I~ra~l~g---~~-~-~~~~~i~~~~~~p~glavD~~~~~lYw 385 (428)
..-..|.|+|.. +||-+=... .+|.|..-++ .+ . ...++...++.+|.||+.|.+++.||.
T Consensus 177 H~g~~l~f~pDG-~Lyvs~G~~~~~~~aq~~~~~~Gk~~r~~~a~~~~~d~p~~~~~i~s~G~RN~qGl~w~P~tg~Lw~ 255 (399)
T COG2133 177 HFGGRLVFGPDG-KLYVTTGSNGDPALAQDNVSLAGKVLRIDRAGIIPADNPFPNSEIWSYGHRNPQGLAWHPVTGALWT 255 (399)
T ss_pred cCcccEEECCCC-cEEEEeCCCCCcccccCccccccceeeeccCcccccCCCCCCcceEEeccCCccceeecCCCCcEEE
Confidence 345578999987 999883322 1333333222 11 1 123344459999999999999999999
Q ss_pred EeCCCCcE
Q psy951 386 VDSYDRNI 393 (428)
Q Consensus 386 td~~~~~I 393 (428)
+|.+.+.+
T Consensus 256 ~e~g~d~~ 263 (399)
T COG2133 256 TEHGPDAL 263 (399)
T ss_pred EecCCCcc
Confidence 99988544
No 131
>KOG0994|consensus
Probab=65.73 E-value=6.6 Score=43.83 Aligned_cols=56 Identities=30% Similarity=0.718 Sum_probs=34.2
Q ss_pred ceecCCCCCCCCCCCCCCcccccCccC----CCCCCCCCCCCCCC----EEeeCCCCceEecCCCCCccCccCC
Q psy951 155 YQCACPENATPKLPGVAEIRCSAAVER----PRPLPRVCQCQNGG----MCAESETGDLTCNCRQDFAGTFCEN 220 (428)
Q Consensus 155 ~~C~C~~g~~~~~~~~~g~~C~~~~~~----~~~~~~~c~C~ngg----~C~~~~~g~~~C~C~~gy~G~~Ce~ 220 (428)
-+|.|.+|| .|++|..-.+- +...+.+|.|...| .|...++ .|.|.+|-.|..|..
T Consensus 1084 GQCqCkpGf-------GGR~C~qCqel~WGdP~~~C~aCdCd~rG~~tpQCdr~tG---~C~C~~Gv~G~rCdq 1147 (1758)
T KOG0994|consen 1084 GQCQCKPGF-------GGRTCSQCQELYWGDPNEKCRACDCDPRGIETPQCDRATG---RCVCRPGVGGPRCDQ 1147 (1758)
T ss_pred cceeccCCC-------CCcchhHHHHhhcCCCCCCceecCCCCCCCCCCCccccCC---ceeecCCCCCcchhh
Confidence 478999999 57888642210 12234566776555 3655542 577777777766644
No 132
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=65.12 E-value=5.3 Score=32.20 Aligned_cols=24 Identities=25% Similarity=0.757 Sum_probs=18.7
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCcc
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFAG 215 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~G 215 (428)
.|...|.|... ....|.|.+||.-
T Consensus 85 ~CG~~g~C~~~--~~~~C~Cl~GF~P 108 (110)
T PF00954_consen 85 FCGPNGICNSN--NSPKCSCLPGFEP 108 (110)
T ss_pred ccCCccEeCCC--CCCceECCCCcCC
Confidence 58888999543 3578999999974
No 133
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domain.
Probab=64.08 E-value=7.5 Score=26.00 Aligned_cols=16 Identities=25% Similarity=0.968 Sum_probs=13.7
Q ss_pred EecCCCCCccCccCCC
Q psy951 206 TCNCRQDFAGTFCENY 221 (428)
Q Consensus 206 ~C~C~~gy~G~~Ce~~ 221 (428)
+|.|++++.|..|+.-
T Consensus 19 ~C~C~~~~~G~~C~~C 34 (46)
T smart00180 19 QCECKPNVTGRRCDRC 34 (46)
T ss_pred EEECCCCCCCCCCCcC
Confidence 7999999999999853
No 134
>PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis.
Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction: Dipeptidyl-Polypeptide + H(2)O = Dipeptide + Polypeptide It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0006508 proteolysis, 0016020 membrane; PDB: 2RIP_A 3Q8W_B 2AJL_I 1TKR_B 1TK3_B 3C45_A 2G5P_A 3G0C_D 1R9M_C 1RWQ_A ....
Probab=64.00 E-value=73 Score=31.35 Aligned_cols=106 Identities=16% Similarity=0.204 Sum_probs=62.2
Q ss_pred EEeeCCC-CeEEEEEcCCCCcEEEEe-CC---CCCceeEEEc-CCCCCEEEEEECCCC--eEEEEEcCCCCeEEEEeCCC
Q psy951 3 WAETGAS-PRIESAWMDGSHRRSLVM-TG---VRHPTGLSVD-AAMDHTLYWVDSKLN--TIESVRHDGRNRQTILSGSD 74 (428)
Q Consensus 3 Wtd~~~~-~~I~~a~~DG~~~~~l~~-~~---~~~P~glavD-~~~~~~lYW~d~~~~--~I~~~~ldG~~~~~i~~~~~ 74 (428)
|.+..+. ..|..++......+++.. +. +.....+..- ..++ .++|.-...+ .|..++++|...+.|-.+.-
T Consensus 202 ~~nR~q~~~~l~~~d~~tg~~~~~~~e~~~~Wv~~~~~~~~~~~~~~-~~l~~s~~~G~~hly~~~~~~~~~~~lT~G~~ 280 (353)
T PF00930_consen 202 WLNRDQNRLDLVLCDASTGETRVVLEETSDGWVDVYDPPHFLGPDGN-EFLWISERDGYRHLYLYDLDGGKPRQLTSGDW 280 (353)
T ss_dssp EEETTSTEEEEEEEEECTTTCEEEEEEESSSSSSSSSEEEE-TTTSS-EEEEEEETTSSEEEEEEETTSSEEEESS-SSS
T ss_pred EcccCCCEEEEEEEECCCCceeEEEEecCCcceeeecccccccCCCC-EEEEEEEcCCCcEEEEEcccccceeccccCce
Confidence 4444442 245555554333333332 11 3233344432 3445 6666655554 89999999998776666643
Q ss_pred CccceeeeeccccEEEEEeCC----CCceeeeccC-CCCc
Q psy951 75 KLQHPISLDVFENNIYWLARD----TGSLYKQDKF-GRGV 109 (428)
Q Consensus 75 ~~~~p~~l~~~~~~lYwtD~~----~~~I~~~~~~-g~~~ 109 (428)
......+++..++.||++-.. ...|++++.+ ++..
T Consensus 281 ~V~~i~~~d~~~~~iyf~a~~~~p~~r~lY~v~~~~~~~~ 320 (353)
T PF00930_consen 281 EVTSILGWDEDNNRIYFTANGDNPGERHLYRVSLDSGGEP 320 (353)
T ss_dssp -EEEEEEEECTSSEEEEEESSGGTTSBEEEEEETTETTEE
T ss_pred eecccceEcCCCCEEEEEecCCCCCceEEEEEEeCCCCCe
Confidence 444578888889999999864 3468888887 4433
No 135
>KOG1226|consensus
Probab=63.30 E-value=14 Score=39.55 Aligned_cols=19 Identities=32% Similarity=0.992 Sum_probs=14.7
Q ss_pred EecCCCC-CccCccCCCCCC
Q psy951 206 TCNCRQD-FAGTFCENYTGI 224 (428)
Q Consensus 206 ~C~C~~g-y~G~~Ce~~~~~ 224 (428)
+|.|... |+|.+||..+.-
T Consensus 606 ~C~C~~~~~sG~~CE~cptc 625 (783)
T KOG1226|consen 606 RCKCTDPPYSGEFCEKCPTC 625 (783)
T ss_pred ceEcCCCCcCcchhhcCCCC
Confidence 4788876 999999986643
No 136
>PRK01029 tolB translocation protein TolB; Provisional
Probab=62.74 E-value=1.4e+02 Score=30.32 Aligned_cols=82 Identities=9% Similarity=0.057 Sum_probs=56.9
Q ss_pred cceeEEeeccCCCEEEEEecCC--CeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeC--CCCcEEEEEcc
Q psy951 324 RRIEALDIDPVDEIIYWVDSYD--RNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDS--YDRNIRRSFML 399 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWtd~~~--~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~--~~~~I~~~~~~ 399 (428)
.......+.|..++|+++.... ..|+...+++ ++.+.+..+-......+....++.||++-. +...|.+.+++
T Consensus 327 ~~~~~p~wSPDG~~Laf~~~~~g~~~I~v~dl~~---g~~~~Lt~~~~~~~~p~wSpDG~~L~f~~~~~g~~~L~~vdl~ 403 (428)
T PRK01029 327 RNSSCPAWSPDGKKIAFCSVIKGVRQICVYDLAT---GRDYQLTTSPENKESPSWAIDSLHLVYSAGNSNESELYLISLI 403 (428)
T ss_pred CCccceeECCCCCEEEEEEcCCCCcEEEEEECCC---CCeEEccCCCCCccceEECCCCCEEEEEECCCCCceEEEEECC
Confidence 4456778899999999885433 3788877776 344444443334556666667888988644 45679999999
Q ss_pred CcceeEEEe
Q psy951 400 EAQKGQVQA 408 (428)
Q Consensus 400 g~~~~~l~~ 408 (428)
|...+.|..
T Consensus 404 ~g~~~~Lt~ 412 (428)
T PRK01029 404 TKKTRKIVI 412 (428)
T ss_pred CCCEEEeec
Confidence 988887764
No 137
>PF13449 Phytase-like: Esterase-like activity of phytase
Probab=61.01 E-value=38 Score=33.04 Aligned_cols=81 Identities=19% Similarity=0.163 Sum_probs=53.9
Q ss_pred cceeEEeeccCCCEEEEE--ecCC----CeEEEeeeccccC--CceEEE----E-ec----c----ccceeeeeeccCCe
Q psy951 324 RRIEALDIDPVDEIIYWV--DSYD----RNIRRSFMLEAQK--GQVQAV----I-SD----E----RRIEALDIDPVDEI 382 (428)
Q Consensus 324 ~~~~~l~~d~~~~~lyWt--d~~~----~~I~ra~l~g~~~--~~~~~i----~-~~----~----~~p~glavD~~~~~ 382 (428)
...-||++|+..+ .||+ |.+. .++++..++.... ....+. + .. + .-++||++ ...+.
T Consensus 20 GGlSgl~~~~~~~-~~~avSD~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~L~~~~G~~~~~~~~D~Egi~~-~~~g~ 97 (326)
T PF13449_consen 20 GGLSGLDYDPDDG-RFYAVSDRGPNKGPPRFYTFRIDYDQGGIGGVTILDMIPLRDPDGQPFPKNGLDPEGIAV-PPDGS 97 (326)
T ss_pred CcEeeEEEeCCCC-EEEEEECCCCCCCCCcEEEEEeeccCCCccceEeccceeccCCCCCcCCcCCCChhHeEE-ecCCC
Confidence 3455899997444 4665 6555 2688887765211 111211 1 11 1 16779999 67899
Q ss_pred EEEEeCCC------CcEEEEEccCcceeEE
Q psy951 383 IYWVDSYD------RNIRRSFMLEAQKGQV 406 (428)
Q Consensus 383 lYwtd~~~------~~I~~~~~~g~~~~~l 406 (428)
+||++.+. ..|.+.+.+|+..+.+
T Consensus 98 ~~is~E~~~~~~~~p~I~~~~~~G~~~~~~ 127 (326)
T PF13449_consen 98 FWISSEGGRTGGIPPRIRRFDLDGRVIRRF 127 (326)
T ss_pred EEEEeCCccCCCCCCEEEEECCCCcccceE
Confidence 99999999 9999999999876665
No 138
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=60.22 E-value=5.4 Score=38.43 Aligned_cols=15 Identities=13% Similarity=0.304 Sum_probs=6.5
Q ss_pred HHHHHHhhheeecCC
Q psy951 245 ALVSATVYYVWRKRP 259 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~ 259 (428)
|++++-+++.|||++
T Consensus 271 IMvIIYLILRYRRKK 285 (299)
T PF02009_consen 271 IMVIIYLILRYRRKK 285 (299)
T ss_pred HHHHHHHHHHHHHHh
Confidence 444444444455433
No 139
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system [].; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=59.08 E-value=4.2 Score=28.24 Aligned_cols=32 Identities=22% Similarity=0.593 Sum_probs=17.1
Q ss_pred CCCCCEEeeC---CCCceEecCCCCCccCccCCCC
Q psy951 191 CQNGGMCAES---ETGDLTCNCRQDFAGTFCENYT 222 (428)
Q Consensus 191 C~ngg~C~~~---~~g~~~C~C~~gy~G~~Ce~~~ 222 (428)
|.-+|..... ..|.+.|.|..-|.|..|+...
T Consensus 19 CSGHGr~flDg~~~dG~p~CECn~Cy~GpdCS~~~ 53 (56)
T PF04863_consen 19 CSGHGRAFLDGLIADGSPVCECNSCYGGPDCSTLI 53 (56)
T ss_dssp -TTSEE--TTS-EETTEE--EE-TTEESTTS-EE-
T ss_pred cCCCCeeeeccccccCCccccccCCcCCCCcccCC
Confidence 5445555322 1367999999999999998643
No 140
>KOG0196|consensus
Probab=58.76 E-value=16 Score=39.57 Aligned_cols=55 Identities=18% Similarity=0.456 Sum_probs=29.2
Q ss_pred ceecCCCCCCCCCCCCCCcccccCc---cCCCCCCCCC-CCCCCCEEeeCCCCceEecCCCCCc
Q psy951 155 YQCACPENATPKLPGVAEIRCSAAV---ERPRPLPRVC-QCQNGGMCAESETGDLTCNCRQDFA 214 (428)
Q Consensus 155 ~~C~C~~g~~~~~~~~~g~~C~~~~---~~~~~~~~~c-~C~ngg~C~~~~~g~~~C~C~~gy~ 214 (428)
-.|.|.+||.... .+..|+.-. ........+| +|..+.. ....|+..|.|..||.
T Consensus 259 G~C~C~aGye~~~---~~~~C~aCp~G~yK~~~~~~~C~~CP~~S~--s~~ega~~C~C~~gyy 317 (996)
T KOG0196|consen 259 GGCVCKAGYEEAE---NGKACQACPPGTYKASQGDSLCLPCPPNSH--SSSEGATSCTCENGYY 317 (996)
T ss_pred CceeecCCCCccc---CCCcceeCCCCcccCCCCCCCCCCCCCCCC--CCCCCCCcccccCCcc
Confidence 4799999993222 235665321 1111122334 3433221 1123678999999997
No 141
>KOG3509|consensus
Probab=58.64 E-value=16 Score=40.63 Aligned_cols=72 Identities=26% Similarity=0.499 Sum_probs=46.7
Q ss_pred CCCCCCCCCCCC--ccccCCCcceecCCCCCCCCCCCCCCcccccCccCCCCCCCCCCCCCCCEEeeCCCCceEecCCCC
Q psy951 135 APNPCSQSPCSH--LCLVIPGGYQCACPENATPKLPGVAEIRCSAAVERPRPLPRVCQCQNGGMCAESETGDLTCNCRQD 212 (428)
Q Consensus 135 ~~n~C~~~~C~~--~C~~~~~~~~C~C~~g~~~~~~~~~g~~C~~~~~~~~~~~~~c~C~ngg~C~~~~~g~~~C~C~~g 212 (428)
....|...+|++ .|.+.+.+..|.|++|| .|..|.+..+.+...++.+ -.++|.... +...+.|.+|
T Consensus 405 ~g~~c~~~p~~~~g~c~p~~~~~~c~c~~g~-------~G~~c~d~~~~~~~~~~g~---y~~t~~~~~-~~~~~~c~pg 473 (964)
T KOG3509|consen 405 LGDVCWRIPCQHDGPCLQTLEGKQCLCPPGY-------TGDSCEDCMNGCDRSPNGS---YLGTCVPIQ-GKRCEYCGPG 473 (964)
T ss_pred CCCccccccCCCCccccccccccceeccccc-------cCchhhccCccccccCCcc---ccceEeccC-CCcceeecCC
Confidence 345677778888 88888899999999999 6778877665544433222 124555444 2355667777
Q ss_pred CccCcc
Q psy951 213 FAGTFC 218 (428)
Q Consensus 213 y~G~~C 218 (428)
.|..+
T Consensus 474 -~g~~~ 478 (964)
T KOG3509|consen 474 -AGAPT 478 (964)
T ss_pred -CCCcc
Confidence 44433
No 142
>PF14946 DUF4501: Domain of unknown function (DUF4501)
Probab=57.28 E-value=1.3e+02 Score=26.08 Aligned_cols=28 Identities=21% Similarity=0.478 Sum_probs=15.1
Q ss_pred hhHHHHHHHHHHHHHHHHHhhheeecCC
Q psy951 232 RSLLYIPTLLLLLALVSATVYYVWRKRP 259 (428)
Q Consensus 232 ~~~~~~~i~~~~~~~~~~~~~~~~r~~~ 259 (428)
+++++++++++.+.+++-+..++|-||-
T Consensus 90 AASL~LgTffIS~~LilSvA~FFYLKrs 117 (180)
T PF14946_consen 90 AASLFLGTFFISLGLILSVASFFYLKRS 117 (180)
T ss_pred HHHHHHHHHHHHHHHHHHHhhheeeccc
Confidence 5666777777765444443334444433
No 143
>PF05787 DUF839: Bacterial protein of unknown function (DUF839); InterPro: IPR008557 This family consists of bacterial proteins of unknown function.
Probab=56.52 E-value=40 Score=35.34 Aligned_cols=72 Identities=17% Similarity=0.139 Sum_probs=49.7
Q ss_pred ccccceeEEeeccCCCEEEEEecCCC-------------------eEEEeeecccc----CCceEEEEe-----------
Q psy951 321 SDERRIEALDIDPVDEIIYWVDSYDR-------------------NIRRSFMLEAQ----KGQVQAVIS----------- 366 (428)
Q Consensus 321 ~~~~~~~~l~~d~~~~~lyWtd~~~~-------------------~I~ra~l~g~~----~~~~~~i~~----------- 366 (428)
..+..|+++.+||.++.||.+-.... .|+|...++.+ .-..++++.
T Consensus 347 T~f~RpEgi~~~p~~g~vY~a~T~~~~r~~~~~~~~n~~~~n~~G~I~r~~~~~~d~~~~~f~~~~~~~~g~~~~~~~~~ 426 (524)
T PF05787_consen 347 TPFDRPEGITVNPDDGEVYFALTNNSGRGESDVDAANPRAGNGYGQIYRYDPDGNDHAATTFTWELFLVGGDPTDASGNG 426 (524)
T ss_pred ccccCccCeeEeCCCCEEEEEEecCCCCcccccccCCcccCCcccEEEEecccCCccccceeEEEEEEEecCcccccccc
Confidence 56889999999999999999854322 78888776521 113333321
Q ss_pred -------ccccceeeeeeccCCeEEEEeCCCCc
Q psy951 367 -------DERRIEALDIDPVDEIIYWVDSYDRN 392 (428)
Q Consensus 367 -------~~~~p~glavD~~~~~lYwtd~~~~~ 392 (428)
.+..|.+|++|..++.+-=+|.+...
T Consensus 427 ~~~~~~~~f~sPDNL~~d~~G~LwI~eD~~~~~ 459 (524)
T PF05787_consen 427 SNKCDDNGFASPDNLAFDPDGNLWIQEDGGGSN 459 (524)
T ss_pred cCcccCCCcCCCCceEECCCCCEEEEeCCCCCC
Confidence 27799999999865544447776654
No 144
>PF12191 stn_TNFRSF12A: Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain; InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved []. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration []. The receptor may also play a role in cell-matrix interactions [].; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=56.46 E-value=3.5 Score=33.84 Aligned_cols=17 Identities=24% Similarity=0.430 Sum_probs=0.0
Q ss_pred HHHHHHhhheeecCCCC
Q psy951 245 ALVSATVYYVWRKRPFG 261 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~~~ 261 (428)
++.++.++++|||.+++
T Consensus 91 Vl~llsg~lv~rrcrrr 107 (129)
T PF12191_consen 91 VLALLSGFLVWRRCRRR 107 (129)
T ss_dssp -----------------
T ss_pred HHHHHHHHHHHhhhhcc
Confidence 33334455566555533
No 145
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=55.59 E-value=18 Score=24.61 Aligned_cols=20 Identities=30% Similarity=0.951 Sum_probs=14.7
Q ss_pred CCCCCCEEeeCCCCceEecCCCCCc
Q psy951 190 QCQNGGMCAESETGDLTCNCRQDFA 214 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~gy~ 214 (428)
+|..+..|.+ -+|.|++||.
T Consensus 27 qC~~~s~C~~-----g~C~C~~g~~ 46 (52)
T PF01683_consen 27 QCIGGSVCVN-----GRCQCPPGYV 46 (52)
T ss_pred CCCCcCEEcC-----CEeECCCCCE
Confidence 4666678843 2699999986
No 146
>COG0823 TolB Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking and secretion]
Probab=55.40 E-value=1.2e+02 Score=30.82 Aligned_cols=101 Identities=18% Similarity=0.169 Sum_probs=63.4
Q ss_pred CeEEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCCC--eEEEEEcCCCCeEEEEeCCCCccceeeeecccc
Q psy951 10 PRIESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKLN--TIESVRHDGRNRQTILSGSDKLQHPISLDVFEN 87 (428)
Q Consensus 10 ~~I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~~--~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~ 87 (428)
++|...+++...+..+++.. ..=...+.-+.++ +|-++-.+.+ .|+.++++|++.+.+.... +....-++...+.
T Consensus 218 ~~i~~~~l~~g~~~~i~~~~-g~~~~P~fspDG~-~l~f~~~rdg~~~iy~~dl~~~~~~~Lt~~~-gi~~~Ps~spdG~ 294 (425)
T COG0823 218 PRIYYLDLNTGKRPVILNFN-GNNGAPAFSPDGS-KLAFSSSRDGSPDIYLMDLDGKNLPRLTNGF-GINTSPSWSPDGS 294 (425)
T ss_pred ceEEEEeccCCccceeeccC-CccCCccCCCCCC-EEEEEECCCCCccEEEEcCCCCcceecccCC-ccccCccCCCCCC
Confidence 67888888887777776521 1111223344445 6666655444 7999999999876655543 2222345667788
Q ss_pred EEEEEeCCC--CceeeeccCCCCceeee
Q psy951 88 NIYWLARDT--GSLYKQDKFGRGVPVLI 113 (428)
Q Consensus 88 ~lYwtD~~~--~~I~~~~~~g~~~~~~~ 113 (428)
+||++.-.. ..|++.+.+|+....+.
T Consensus 295 ~ivf~Sdr~G~p~I~~~~~~g~~~~riT 322 (425)
T COG0823 295 KIVFTSDRGGRPQIYLYDLEGSQVTRLT 322 (425)
T ss_pred EEEEEeCCCCCcceEEECCCCCceeEee
Confidence 877775433 37888888887664433
No 147
>TIGR03118 PEPCTERM_chp_1 conserved hypothetical protein TIGR03118. This model describes an uncharacterized conserved hypothetical protein. Members are found with the C-terminal putative exosortase interaction domain, PEP-CTERM, in Nitrosospira multiformis, Rhodoferax ferrireducens, Solibacter usitatus Ellin6076, and Acidobacteria bacterium Ellin345. It is found without the PEP-CTERM domain in several other species, including Burkholderia ambifaria, Gloeobacter violaceus PCC 7421, and three copies in the Acanthamoeba polyphaga mimivirus.
Probab=55.08 E-value=1.7e+02 Score=28.36 Aligned_cols=98 Identities=18% Similarity=0.214 Sum_probs=65.4
Q ss_pred CEEEeeCCCCeEEEEEcCCCCcEEEEeC--------CCCCceeEEEcCCCCCEEEEEECC-------------CCeEEEE
Q psy951 1 MFWAETGASPRIESAWMDGSHRRSLVMT--------GVRHPTGLSVDAAMDHTLYWVDSK-------------LNTIESV 59 (428)
Q Consensus 1 lyWtd~~~~~~I~~a~~DG~~~~~l~~~--------~~~~P~glavD~~~~~~lYW~d~~-------------~~~I~~~ 59 (428)
||=+|..+ .||... |++.+.+-+.. .-+.|-+|.-- .+ +||-+-+. .+.|...
T Consensus 154 LYaadF~~-g~IDVF--d~~f~~~~~~g~F~DP~iPagyAPFnIqni--g~-~lyVtYA~qd~~~~d~v~G~G~G~VdvF 227 (336)
T TIGR03118 154 LYAANFRQ-GRIDVF--KGSFRPPPLPGSFIDPALPAGYAPFNVQNL--GG-TLYVTYAQQDADRNDEVAGAGLGYVNVF 227 (336)
T ss_pred EEEeccCC-CceEEe--cCccccccCCCCccCCCCCCCCCCcceEEE--CC-eEEEEEEecCCcccccccCCCcceEEEE
Confidence 46677754 478765 44433322211 12466666543 46 89987443 3589999
Q ss_pred EcCCCCeEEEEeCCCCccceeeeec-------cccEEEEEeCCCCceeeeccC
Q psy951 60 RHDGRNRQTILSGSDKLQHPISLDV-------FENNIYWLARDTGSLYKQDKF 105 (428)
Q Consensus 60 ~ldG~~~~~i~~~~~~~~~p~~l~~-------~~~~lYwtD~~~~~I~~~~~~ 105 (428)
+++|...+.+.++. .|..|.+|++ +.+.|-.-....++|...+..
T Consensus 228 d~~G~l~~r~as~g-~LNaPWG~a~APa~FG~~sg~lLVGNFGDG~InaFD~~ 279 (336)
T TIGR03118 228 TLNGQLLRRVASSG-RLNAPWGLAIAPESFGSLSGALLVGNFGDGTINAYDPQ 279 (336)
T ss_pred cCCCcEEEEeccCC-cccCCceeeeChhhhCCCCCCeEEeecCCceeEEecCC
Confidence 99999988887654 7999999886 356777777777788777765
No 148
>COG2133 Glucose/sorbosone dehydrogenases [Carbohydrate transport and metabolism]
Probab=54.14 E-value=59 Score=32.72 Aligned_cols=40 Identities=20% Similarity=0.180 Sum_probs=31.3
Q ss_pred EEEEEcCCCCcEEEEeCCCCCceeEEEcCCCCCEEEEEECCC
Q psy951 12 IESAWMDGSHRRSLVMTGVRHPTGLSVDAAMDHTLYWVDSKL 53 (428)
Q Consensus 12 I~~a~~DG~~~~~l~~~~~~~P~glavD~~~~~~lYW~d~~~ 53 (428)
|...+.++.+.+ |+..++..|.||+.|+.++ .||-++.+.
T Consensus 221 ~~~~d~p~~~~~-i~s~G~RN~qGl~w~P~tg-~Lw~~e~g~ 260 (399)
T COG2133 221 IIPADNPFPNSE-IWSYGHRNPQGLAWHPVTG-ALWTTEHGP 260 (399)
T ss_pred ccccCCCCCCcc-eEEeccCCccceeecCCCC-cEEEEecCC
Confidence 334455666644 4566899999999999999 999999887
No 149
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins [, , ].
Probab=53.66 E-value=9.2 Score=32.14 Aligned_cols=39 Identities=10% Similarity=0.142 Sum_probs=16.4
Q ss_pred HHHHHHHHHHHHHHHHhhheeecCCCCCC--CCCCCCCCce
Q psy951 235 LYIPTLLLLLALVSATVYYVWRKRPFGKT--MGSALSTQSV 273 (428)
Q Consensus 235 ~~~~i~~~~~~~~~~~~~~~~r~~~~~~~--~~~~~~~~~v 273 (428)
++++|+++++++++...++-.|||++... ..++.++++.
T Consensus 34 ILiaIvVliiiiivli~lcssRKkKaaAAi~eediQfinpy 74 (189)
T PF05568_consen 34 ILIAIVVLIIIIIVLIYLCSSRKKKAAAAIEEEDIQFINPY 74 (189)
T ss_pred HHHHHHHHHHHHHHHHHHHhhhhHHHHhhhhhhcccccCcc
Confidence 33444444333333333334444442222 4455666654
No 150
>PF01414 DSL: Delta serrate ligand; InterPro: IPR001774 Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and vertebrates. In Caenorhabditis elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions during development []. Molecular interaction between Notch and Serrate, another EGF-homologous transmembrane protein containing a region of striking similarity to Delta, has been shown and the same two EGF repeats of Notch may also constitute a Serrate binding domain [, ].; GO: 0007154 cell communication, 0016020 membrane; PDB: 2VJ2_A.
Probab=52.86 E-value=5.3 Score=28.87 Aligned_cols=12 Identities=25% Similarity=0.653 Sum_probs=9.1
Q ss_pred ecCCCCCccCcc
Q psy951 207 CNCRQDFAGTFC 218 (428)
Q Consensus 207 C~C~~gy~G~~C 218 (428)
-.|.+||.|..|
T Consensus 52 ~~C~~Gw~G~~C 63 (63)
T PF01414_consen 52 KVCLPGWTGPNC 63 (63)
T ss_dssp EEE-TTEESTTS
T ss_pred CCCCCCCcCCCC
Confidence 368999999887
No 151
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionary conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=52.28 E-value=4.9 Score=28.93 Aligned_cols=25 Identities=16% Similarity=0.264 Sum_probs=0.6
Q ss_pred HHHHHHHHHHHHHhhheeecCCCCC
Q psy951 238 PTLLLLLALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 238 ~i~~~~~~~~~~~~~~~~r~~~~~~ 262 (428)
++++.++.++++++|++||.|++..
T Consensus 17 G~Vvgll~ailLIlf~iyR~rkkdE 41 (64)
T PF01034_consen 17 GGVVGLLFAILLILFLIYRMRKKDE 41 (64)
T ss_dssp --------------------S----
T ss_pred HHHHHHHHHHHHHHHHHHHHHhcCC
Confidence 3333333344455666776655443
No 152
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=51.90 E-value=9.4 Score=36.06 Aligned_cols=17 Identities=18% Similarity=0.163 Sum_probs=8.7
Q ss_pred HHHHHHhhheeecCCCC
Q psy951 245 ALVSATVYYVWRKRPFG 261 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~~~ 261 (428)
+++++++|+|.+|||++
T Consensus 272 ~vvliiLYiWlyrrRK~ 288 (295)
T TIGR01478 272 TVVLIILYIWLYRRRKK 288 (295)
T ss_pred HHHHHHHHHHHHHhhcc
Confidence 44555555555555433
No 153
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway [].
Probab=51.67 E-value=6.6 Score=33.57 Aligned_cols=15 Identities=20% Similarity=0.144 Sum_probs=6.9
Q ss_pred HHHhhheeecCCCCC
Q psy951 248 SATVYYVWRKRPFGK 262 (428)
Q Consensus 248 ~~~~~~~~r~~~~~~ 262 (428)
++++|++++|+++..
T Consensus 67 l~lvf~~c~r~kktd 81 (154)
T PF04478_consen 67 LALVFIFCIRRKKTD 81 (154)
T ss_pred HHhheeEEEecccCc
Confidence 344455555544443
No 154
>PF13449 Phytase-like: Esterase-like activity of phytase
Probab=49.84 E-value=37 Score=33.16 Aligned_cols=61 Identities=16% Similarity=0.194 Sum_probs=46.7
Q ss_pred ceeEEeeccCCCEEEEEecCC------CeEEEeeeccccCCceEEEE--------------eccccceeeeeeccCCeEE
Q psy951 325 RIEALDIDPVDEIIYWVDSYD------RNIRRSFMLEAQKGQVQAVI--------------SDERRIEALDIDPVDEIIY 384 (428)
Q Consensus 325 ~~~~l~~d~~~~~lyWtd~~~------~~I~ra~l~g~~~~~~~~i~--------------~~~~~p~glavD~~~~~lY 384 (428)
.+++|++ +.++.+||++-+. .+|++...+|. ....+-+ ..-.-.+|||+...++.||
T Consensus 86 D~Egi~~-~~~g~~~is~E~~~~~~~~p~I~~~~~~G~--~~~~~~vP~~~~~~~~~~~~~~~N~G~E~la~~~dG~~l~ 162 (326)
T PF13449_consen 86 DPEGIAV-PPDGSFWISSEGGRTGGIPPRIRRFDLDGR--VIRRFPVPAAFLPDANGTSGRRNNRGFEGLAVSPDGRTLF 162 (326)
T ss_pred ChhHeEE-ecCCCEEEEeCCccCCCCCCEEEEECCCCc--ccceEccccccccccCccccccCCCCeEEEEECCCCCEEE
Confidence 7889999 8899999999999 89999999984 2122211 1344678999999998888
Q ss_pred EEeC
Q psy951 385 WVDS 388 (428)
Q Consensus 385 wtd~ 388 (428)
-+-.
T Consensus 163 ~~~E 166 (326)
T PF13449_consen 163 AAME 166 (326)
T ss_pred EEEC
Confidence 7554
No 155
>PTZ00046 rifin; Provisional
Probab=49.71 E-value=13 Score=36.55 Aligned_cols=15 Identities=13% Similarity=0.304 Sum_probs=6.4
Q ss_pred HHHHHHhhheeecCC
Q psy951 245 ALVSATVYYVWRKRP 259 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~ 259 (428)
+++++-+++.|||++
T Consensus 330 IMvIIYLILRYRRKK 344 (358)
T PTZ00046 330 IMVIIYLILRYRRKK 344 (358)
T ss_pred HHHHHHHHHHhhhcc
Confidence 333333344455444
No 156
>PTZ00370 STEVOR; Provisional
Probab=49.46 E-value=10 Score=35.86 Aligned_cols=17 Identities=24% Similarity=0.227 Sum_probs=8.8
Q ss_pred HHHHHHhhheeecCCCC
Q psy951 245 ALVSATVYYVWRKRPFG 261 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~~~ 261 (428)
+++++++|+|.+|||++
T Consensus 268 ~vvliilYiwlyrrRK~ 284 (296)
T PTZ00370 268 AVVLIILYIWLYRRRKN 284 (296)
T ss_pred HHHHHHHHHHHHHhhcc
Confidence 44555555555555433
No 157
>PF02239 Cytochrom_D1: Cytochrome D1 heme domain; PDB: 1NNO_B 1HZU_A 1N15_B 1N50_A 1GJQ_A 1BL9_B 1NIR_B 1N90_B 1HZV_A 1AOQ_A ....
Probab=49.37 E-value=2.8e+02 Score=27.59 Aligned_cols=96 Identities=19% Similarity=0.264 Sum_probs=55.4
Q ss_pred EEEeeCCCCeEEEEEcCCCCcEEE--EeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCe-EEEEeCCCCccc
Q psy951 2 FWAETGASPRIESAWMDGSHRRSL--VMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNR-QTILSGSDKLQH 78 (428)
Q Consensus 2 yWtd~~~~~~I~~a~~DG~~~~~l--~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~-~~i~~~~~~~~~ 78 (428)
|.++.+.. ++... |+...+++ +...-.-+.++++.+..+ .+|-++ ..+.|..+++.-... ..+..+ ..
T Consensus 9 ~V~~~~~~-~v~vi--D~~t~~~~~~i~~~~~~h~~~~~s~Dgr-~~yv~~-rdg~vsviD~~~~~~v~~i~~G----~~ 79 (369)
T PF02239_consen 9 YVVERGSG-SVAVI--DGATNKVVARIPTGGAPHAGLKFSPDGR-YLYVAN-RDGTVSVIDLATGKVVATIKVG----GN 79 (369)
T ss_dssp EEEEGGGT-EEEEE--ETTT-SEEEEEE-STTEEEEEE-TT-SS-EEEEEE-TTSEEEEEETTSSSEEEEEE-S----SE
T ss_pred EEEecCCC-EEEEE--ECCCCeEEEEEcCCCCceeEEEecCCCC-EEEEEc-CCCeEEEEECCcccEEEEEecC----CC
Confidence 34555444 45554 44433332 332222245677777777 899987 468999999966543 344444 34
Q ss_pred eeeee--ccccEEEEEeCCCCceeeeccCC
Q psy951 79 PISLD--VFENNIYWLARDTGSLYKQDKFG 106 (428)
Q Consensus 79 p~~l~--~~~~~lYwtD~~~~~I~~~~~~g 106 (428)
|.+++ ..+.++|-+.+..+.+...+...
T Consensus 80 ~~~i~~s~DG~~~~v~n~~~~~v~v~D~~t 109 (369)
T PF02239_consen 80 PRGIAVSPDGKYVYVANYEPGTVSVIDAET 109 (369)
T ss_dssp EEEEEE--TTTEEEEEEEETTEEEEEETTT
T ss_pred cceEEEcCCCCEEEEEecCCCceeEecccc
Confidence 55555 46788999988777777766543
No 158
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=48.69 E-value=14 Score=36.28 Aligned_cols=15 Identities=13% Similarity=0.304 Sum_probs=6.5
Q ss_pred HHHHHHhhheeecCC
Q psy951 245 ALVSATVYYVWRKRP 259 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~ 259 (428)
+++++-+++.|||++
T Consensus 325 IMvIIYLILRYRRKK 339 (353)
T TIGR01477 325 IMVIIYLILRYRRKK 339 (353)
T ss_pred HHHHHHHHHHhhhcc
Confidence 333333344555544
No 159
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region, called CR1 (conserved region 1), is found three times in the 49 kDa protein in the E3 region of Human adenovirus 19 (a subgroup D adenovirus). CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=48.28 E-value=22 Score=22.74 Aligned_cols=7 Identities=0% Similarity=0.292 Sum_probs=2.7
Q ss_pred HHHHHHH
Q psy951 234 LLYIPTL 240 (428)
Q Consensus 234 ~~~~~i~ 240 (428)
++..+++
T Consensus 7 aIIv~V~ 13 (38)
T PF02439_consen 7 AIIVAVV 13 (38)
T ss_pred hHHHHHH
Confidence 3333333
No 160
>PF05694 SBP56: 56kDa selenium binding protein (SBP56); InterPro: IPR008826 This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown, although it is thought that SBP56 participates in late stages of intra-Golgi protein transport. The Lotus japonicus homologue of SBP56, LjSBP, is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins in vesicular Golgi transport.; GO: 0008430 selenium binding; PDB: 2ECE_A.
Probab=48.13 E-value=57 Score=33.17 Aligned_cols=62 Identities=13% Similarity=0.173 Sum_probs=33.6
Q ss_pred ceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEE----Eec---------------cccceeeeeeccCCeEEE
Q psy951 325 RIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAV----ISD---------------ERRIEALDIDPVDEIIYW 385 (428)
Q Consensus 325 ~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i----~~~---------------~~~p~glavD~~~~~lYw 385 (428)
-+.+|.+-...++||.+.+....|..-.+....+. +.+ +-+ .+-|.=|.+-+.++||||
T Consensus 313 LitDI~iSlDDrfLYvs~W~~GdvrqYDISDP~~P--kl~gqv~lGG~~~~~~~~~v~g~~l~GgPqMvqlS~DGkRlYv 390 (461)
T PF05694_consen 313 LITDILISLDDRFLYVSNWLHGDVRQYDISDPFNP--KLVGQVFLGGSIRKGDHPVVKGKRLRGGPQMVQLSLDGKRLYV 390 (461)
T ss_dssp ----EEE-TTS-EEEEEETTTTEEEEEE-SSTTS---EEEEEEE-BTTTT-B--TTS------S----EEE-TTSSEEEE
T ss_pred ceEeEEEccCCCEEEEEcccCCcEEEEecCCCCCC--cEEeEEEECcEeccCCCccccccccCCCCCeEEEccCCeEEEE
Confidence 46778888889999999999998888877653212 211 111 124566667777999999
Q ss_pred EeC
Q psy951 386 VDS 388 (428)
Q Consensus 386 td~ 388 (428)
|.+
T Consensus 391 TnS 393 (461)
T PF05694_consen 391 TNS 393 (461)
T ss_dssp E--
T ss_pred Eee
Confidence 987
No 161
>TIGR02276 beta_rpt_yvtn 40-residue YVTN family beta-propeller repeat. This repeat of about 40 amino acids is found in up to 14 copies per protein. Archaea Methanosarcina mazei and Methanosarcina acetivorans each have over 10 genes that encode tandem copies of this repeat, which is also found in other species. PSIPRED predicts with high confidence that each 40-residue repeat contains four beta strands. This model overlaps somewhat with the NHL repeat (Pfam pfam01436) and also shows sequence similarity to the WD domain, G-beta repeat (Pfam pfam00400).
Probab=47.33 E-value=69 Score=20.00 Aligned_cols=42 Identities=14% Similarity=0.064 Sum_probs=25.3
Q ss_pred cCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeee
Q psy951 333 PVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDID 377 (428)
Q Consensus 333 ~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD 377 (428)
|..++||=++...++|....... ......+.--..|.+|+++
T Consensus 1 pd~~~lyv~~~~~~~v~~id~~~---~~~~~~i~vg~~P~~i~~~ 42 (42)
T TIGR02276 1 PDGTKLYVTNSGSNTVSVIDTAT---NKVIATIPVGGYPFGVAVS 42 (42)
T ss_pred CCCCEEEEEeCCCCEEEEEECCC---CeEEEEEECCCCCceEEeC
Confidence 35678888888887776644422 2222223334678888875
No 162
>PF02333 Phytase: Phytase; InterPro: IPR003431 Phytase (3.1.3.8 from EC) (phytate 3-phosphatase) is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to consist of a single structural unit with a six-bladed propeller folding architecture.; GO: 0016158 3-phytase activity; PDB: 3AMS_A 3AMR_A 1QLG_A 2POO_A 1H6L_A 1CVM_A 1POO_A.
Probab=45.79 E-value=3e+02 Score=27.64 Aligned_cols=84 Identities=14% Similarity=0.103 Sum_probs=46.0
Q ss_pred ccccceeEEee--ccCCCEEEEEe-cCCCeEEEeee--ccccCCceE-EEEe---ccccceeeeeeccCCeEEEEeCCCC
Q psy951 321 SDERRIEALDI--DPVDEIIYWVD-SYDRNIRRSFM--LEAQKGQVQ-AVIS---DERRIEALDIDPVDEIIYWVDSYDR 391 (428)
Q Consensus 321 ~~~~~~~~l~~--d~~~~~lyWtd-~~~~~I~ra~l--~g~~~~~~~-~i~~---~~~~p~glavD~~~~~lYwtd~~~~ 391 (428)
.++.++.|+.+ ++.++.+|-.- ..+..+..-.| ++ ++... .++. --.+|+|+++|-.++.||..+...
T Consensus 153 ~~~~e~yGlcly~~~~~g~~ya~v~~k~G~~~Qy~L~~~~--~g~v~~~lVR~f~~~sQ~EGCVVDDe~g~LYvgEE~~- 229 (381)
T PF02333_consen 153 TDLSEPYGLCLYRSPSTGALYAFVNGKDGRVEQYELTDDG--DGKVSATLVREFKVGSQPEGCVVDDETGRLYVGEEDV- 229 (381)
T ss_dssp -SSSSEEEEEEEE-TTT--EEEEEEETTSEEEEEEEEE-T--TSSEEEEEEEEEE-SS-EEEEEEETTTTEEEEEETTT-
T ss_pred cccccceeeEEeecCCCCcEEEEEecCCceEEEEEEEeCC--CCcEeeEEEEEecCCCcceEEEEecccCCEEEecCcc-
Confidence 44556777766 45666555442 22344544444 34 22221 2222 335899999999999999999986
Q ss_pred cEEEEEcc--CcceeEEE
Q psy951 392 NIRRSFML--EAQKGQVQ 407 (428)
Q Consensus 392 ~I~~~~~~--g~~~~~l~ 407 (428)
-|.+-..+ +...++++
T Consensus 230 GIW~y~Aep~~~~~~~~v 247 (381)
T PF02333_consen 230 GIWRYDAEPEGGNDRTLV 247 (381)
T ss_dssp EEEEEESSCCC-S--EEE
T ss_pred EEEEEecCCCCCCcceee
Confidence 48877776 33334454
No 163
>PF11403 Yeast_MT: Yeast metallothionein; InterPro: IPR022710 Metallothioneins are characterised by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification []. For the first 40 residues in the protein the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster []. ; PDB: 1AQS_A 1AQR_A 1RJU_V 1FMY_A 1AOO_A 1AQQ_A.
Probab=43.59 E-value=8.9 Score=23.57 Aligned_cols=9 Identities=22% Similarity=0.870 Sum_probs=3.6
Q ss_pred EecCCCCCc
Q psy951 206 TCNCRQDFA 214 (428)
Q Consensus 206 ~C~C~~gy~ 214 (428)
.|+||.|-.
T Consensus 23 scscptgcn 31 (40)
T PF11403_consen 23 SCSCPTGCN 31 (40)
T ss_dssp S-SS-TTTT
T ss_pred cCCCCCCCC
Confidence 366666543
No 164
>PF06433 Me-amine-dh_H: Methylamine dehydrogenase heavy chain (MADH); InterPro: IPR009451 Methylamine dehydrogenase (1.4.99.3 from EC) is a periplasmic quinoprotein found in several methylotrophic bacteria. It is induced when the bacteria are grown on methylamine as a carbon source. MADH catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin (IPR002386 from INTERPRO). The reaction is: RCH2NH2 + H2O + acceptor = RCHO + NH3 + reduced acceptor. MADH is a hetero-tetramer comprising two heavy subunits and two light subunits. The heavy subunit forms a seven-bladed beta-propeller-like structure.; GO: 0030058 amine dehydrogenase activity, 0030416 methylamine metabolic process, 0055114 oxidation-reduction process, 0042597 periplasmic space; PDB: 3RN1_F 3SVW_F 3PXT_F 3L4O_F 3L4M_D 3SJL_F 3PXS_D 3ORV_F 3RMZ_F 3RLM_F ....
Probab=43.23 E-value=2.8e+02 Score=27.38 Aligned_cols=62 Identities=16% Similarity=0.138 Sum_probs=37.4
Q ss_pred EEEeeCCCCeEEEEEcCCCCcEEEEeCCCC---------Ccee---EEEcCCCCCEEEEEEC-C--------CCeEEEEE
Q psy951 2 FWAETGASPRIESAWMDGSHRRSLVMTGVR---------HPTG---LSVDAAMDHTLYWVDS-K--------LNTIESVR 60 (428)
Q Consensus 2 yWtd~~~~~~I~~a~~DG~~~~~l~~~~~~---------~P~g---lavD~~~~~~lYW~d~-~--------~~~I~~~~ 60 (428)
||+... +.|..+++.|...+..-.-.+. +|-| +|++..++ +||-.=. + ...|+..+
T Consensus 199 ~F~Sy~--G~v~~~dlsg~~~~~~~~~~~~t~~e~~~~WrPGG~Q~~A~~~~~~-rlyvLMh~g~~gsHKdpgteVWv~D 275 (342)
T PF06433_consen 199 YFVSYE--GNVYSADLSGDSAKFGKPWSLLTDAEKADGWRPGGWQLIAYHAASG-RLYVLMHQGGEGSHKDPGTEVWVYD 275 (342)
T ss_dssp EEEBTT--SEEEEEEETTSSEEEEEEEESS-HHHHHTTEEE-SSS-EEEETTTT-EEEEEEEE--TT-TTS-EEEEEEEE
T ss_pred EEEecC--CEEEEEeccCCcccccCcccccCccccccCcCCcceeeeeeccccC-eEEEEecCCCCCCccCCceEEEEEE
Confidence 455543 3799999998876554322221 3444 88999988 9996511 1 11577777
Q ss_pred cCCCCe
Q psy951 61 HDGRNR 66 (428)
Q Consensus 61 ldG~~~ 66 (428)
+.-..|
T Consensus 276 ~~t~kr 281 (342)
T PF06433_consen 276 LKTHKR 281 (342)
T ss_dssp TTTTEE
T ss_pred CCCCeE
Confidence 766544
No 165
>PF11770 GAPT: GRB2-binding adapter (GAPT); InterPro: IPR021082 This entry represents a family of transmembrane proteins which bind the growth factor receptor-bound protein 2 (GRB2) in B cells. In contrast to other transmembrane adaptor proteins, GAPT, which this entry represents, is not phosphorylated upon BCR ligation. It associates with GRB2 constitutively through its proline-rich region.
Probab=42.55 E-value=26 Score=29.76 Aligned_cols=24 Identities=25% Similarity=0.148 Sum_probs=12.6
Q ss_pred hhHHHHHHHHHHHHHHHHHhhhee
Q psy951 232 RSLLYIPTLLLLLALVSATVYYVW 255 (428)
Q Consensus 232 ~~~~~~~i~~~~~~~~~~~~~~~~ 255 (428)
.+++++++.+++++++..++++|+
T Consensus 9 sv~i~igi~Ll~lLl~cgiGcvwh 32 (158)
T PF11770_consen 9 SVAISIGISLLLLLLLCGIGCVWH 32 (158)
T ss_pred hHHHHHHHHHHHHHHHHhcceEEE
Confidence 355556665555544444455444
No 166
>TIGR02276 beta_rpt_yvtn 40-residue YVTN family beta-propeller repeat. This repeat of about 40 amino acids is found in up to 14 copies per protein. Archaea Methanosarcina mazei and Methanosarcina acetivorans each have over 10 genes that encode tandem copies of this repeat, which is also found in other species. PSIPRED predicts with high confidence that each 40-residue repeat contains four beta strands. This model overlaps somewhat with the NHL repeat (Pfam pfam01436) and also shows sequence similarity to the WD domain, G-beta repeat (Pfam pfam00400).
Probab=41.46 E-value=87 Score=19.50 Aligned_cols=21 Identities=24% Similarity=0.267 Sum_probs=17.2
Q ss_pred CCCEEEEEECCCCeEEEEEcCC
Q psy951 42 MDHTLYWVDSKLNTIESVRHDG 63 (428)
Q Consensus 42 ~~~~lYW~d~~~~~I~~~~ldG 63 (428)
++ +||-++...+.|..++...
T Consensus 3 ~~-~lyv~~~~~~~v~~id~~~ 23 (42)
T TIGR02276 3 GT-KLYVTNSGSNTVSVIDTAT 23 (42)
T ss_pred CC-EEEEEeCCCCEEEEEECCC
Confidence 45 8999999999999988743
No 167
>TIGR03032 conserved hypothetical protein TIGR03032. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown.
Probab=40.74 E-value=92 Score=30.31 Aligned_cols=56 Identities=18% Similarity=0.135 Sum_probs=41.2
Q ss_pred EEEEeCCCCccceeeeeccccEEEEEeCCCCceeeeccCCCCceeeeccCCCCCcccccc
Q psy951 67 QTILSGSDKLQHPISLDVFENNIYWLARDTGSLYKQDKFGRGVPVLISKDLVNPSGVKAY 126 (428)
Q Consensus 67 ~~i~~~~~~~~~p~~l~~~~~~lYwtD~~~~~I~~~~~~g~~~~~~~~~~~~~p~~I~v~ 126 (428)
++++++ +..|.+--.++++||..|...+++.+++.+.+. .+.+..--..|.||...
T Consensus 196 evl~~G---LsmPhSPRWhdgrLwvldsgtGev~~vD~~~G~-~e~Va~vpG~~rGL~f~ 251 (335)
T TIGR03032 196 EVVASG---LSMPHSPRWYQGKLWLLNSGRGELGYVDPQAGK-FQPVAFLPGFTRGLAFA 251 (335)
T ss_pred CEEEcC---ccCCcCCcEeCCeEEEEECCCCEEEEEcCCCCc-EEEEEECCCCCccccee
Confidence 456664 888888889999999999999999999987333 33334433456666655
No 168
>PF14251 DUF4346: Domain of unknown function (DUF4346)
Probab=39.06 E-value=31 Score=28.10 Aligned_cols=72 Identities=14% Similarity=0.149 Sum_probs=49.8
Q ss_pred eeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEEEeccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcceeE
Q psy951 326 IEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQ 405 (428)
Q Consensus 326 ~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~~~~ 405 (428)
-+-|++||..-+|-..|....-|.-.... +.-.-.|+|+|++++-+.=++....+-....+.|+.-+.
T Consensus 9 ~R~i~LDp~GYfiI~~d~~~~~i~a~h~~------------n~I~~~Gla~Dpetge~i~~~g~~~r~~~~~~~GrTAKe 76 (119)
T PF14251_consen 9 QRFIDLDPAGYFIIYVDREAGEICAEHYT------------NDIDDKGLAVDPETGEVIPCRGKVKRTPSIVFKGRTAKE 76 (119)
T ss_pred cCccccCCCccEEEEEeCCCCeeeHhhcc------------CccCcccceeCCCCCCEEEEecCCCCceeEEEecCCHHH
Confidence 35689999998887777665444322221 223445999999999998888777677777777777666
Q ss_pred EEec
Q psy951 406 VQAG 409 (428)
Q Consensus 406 l~~~ 409 (428)
|...
T Consensus 77 L~~~ 80 (119)
T PF14251_consen 77 LYIT 80 (119)
T ss_pred HHHH
Confidence 6433
No 169
>PF12273 RCR: Chitin synthesis regulation, resistance to Congo red; InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this interaction, which occurs via the WW domains of Rsp5.
Probab=39.00 E-value=7.4 Score=32.52 Aligned_cols=10 Identities=0% Similarity=0.019 Sum_probs=4.1
Q ss_pred heeecCCCCC
Q psy951 253 YVWRKRPFGK 262 (428)
Q Consensus 253 ~~~r~~~~~~ 262 (428)
+++++++|++
T Consensus 18 ~~~~~~rRR~ 27 (130)
T PF12273_consen 18 LFYCHNRRRR 27 (130)
T ss_pred HHHHHHHHHh
Confidence 3334444443
No 170
>COG3823 Glutamine cyclotransferase [Posttranslational modification, protein turnover, chaperones]
Probab=38.50 E-value=1.8e+02 Score=26.69 Aligned_cols=65 Identities=20% Similarity=0.086 Sum_probs=44.4
Q ss_pred cCCCEEEEEecCCCeEEEeeeccccCCceEEEE--e-----------ccccceeeeeeccCCeEEEEeCCCCcEEEEEcc
Q psy951 333 PVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAVI--S-----------DERRIEALDIDPVDEIIYWVDSYDRNIRRSFML 399 (428)
Q Consensus 333 ~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i~--~-----------~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~ 399 (428)
..+|+||=-=+.+.+|.|..-+- +++...+ + +...++|||-|...+++|-|-..-..+--+.++
T Consensus 183 ~VdG~lyANVw~t~~I~rI~p~s---GrV~~widlS~L~~~~~~~~~~~nvlNGIA~~~~~~r~~iTGK~wp~lfEVk~~ 259 (262)
T COG3823 183 WVDGELYANVWQTTRIARIDPDS---GRVVAWIDLSGLLKELNLDKSNDNVLNGIAHDPQQDRFLITGKLWPLLFEVKLD 259 (262)
T ss_pred eeccEEEEeeeeecceEEEcCCC---CcEEEEEEccCCchhcCccccccccccceeecCcCCeEEEecCcCceeEEEEec
Confidence 35677877777777887776654 3333322 1 244789999999999999998766666555554
Q ss_pred C
Q psy951 400 E 400 (428)
Q Consensus 400 g 400 (428)
+
T Consensus 260 ~ 260 (262)
T COG3823 260 E 260 (262)
T ss_pred C
Confidence 4
No 171
>PF06739 SBBP: Beta-propeller repeat; InterPro: IPR010620 This family is related to IPR001680 from INTERPRO and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller.
Probab=37.33 E-value=29 Score=22.12 Aligned_cols=19 Identities=16% Similarity=0.127 Sum_probs=15.0
Q ss_pred ccceeeeeeccCCeEEEEeC
Q psy951 369 RRIEALDIDPVDEIIYWVDS 388 (428)
Q Consensus 369 ~~p~glavD~~~~~lYwtd~ 388 (428)
..+.+||||.. +|+|.+=.
T Consensus 13 ~~~~~IavD~~-GNiYv~G~ 31 (38)
T PF06739_consen 13 DYGNGIAVDSN-GNIYVTGY 31 (38)
T ss_pred eeEEEEEECCC-CCEEEEEe
Confidence 46999999955 77998754
No 172
>PF14610 DUF4448: Protein of unknown function (DUF4448)
Probab=36.87 E-value=14 Score=33.08 Aligned_cols=23 Identities=17% Similarity=0.447 Sum_probs=11.9
Q ss_pred hhHHHHHHHHHHHHHHHHHhhhe
Q psy951 232 RSLLYIPTLLLLLALVSATVYYV 254 (428)
Q Consensus 232 ~~~~~~~i~~~~~~~~~~~~~~~ 254 (428)
.++++++++++++++++.+.+++
T Consensus 159 ~laI~lPvvv~~~~~~~~~~~~~ 181 (189)
T PF14610_consen 159 ALAIALPVVVVVLALIMYGFFFW 181 (189)
T ss_pred eEEEEccHHHHHHHHHHHhhhee
Confidence 45566676666654443333333
No 173
>PF14781 BBS2_N: Ciliary BBSome complex subunit 2, N-terminal
Probab=35.83 E-value=86 Score=26.43 Aligned_cols=55 Identities=16% Similarity=0.241 Sum_probs=33.8
Q ss_pred EEEcCCCCCEEEEEECCCC--eEEEEEcCCCCeEEEEeCCCCccceeeeeccccEEEEE
Q psy951 36 LSVDAAMDHTLYWVDSKLN--TIESVRHDGRNRQTILSGSDKLQHPISLDVFENNIYWL 92 (428)
Q Consensus 36 lavD~~~~~~lYW~d~~~~--~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~~lYwt 92 (428)
||.|..++..+||-|...+ .|.--.+.+.....++-+ +.-.-.+.+..++.+|||
T Consensus 76 laYDV~~N~d~Fyke~~DGvn~i~~g~~~~~~~~l~ivG--Gncsi~Gfd~~G~e~fWt 132 (136)
T PF14781_consen 76 LAYDVENNSDLFYKEVPDGVNAIVIGKLGDIPSPLVIVG--GNCSIQGFDYEGNEIFWT 132 (136)
T ss_pred EEEEcccCchhhhhhCccceeEEEEEecCCCCCcEEEEC--ceEEEEEeCCCCcEEEEE
Confidence 4556555545777665433 454455555455555554 334446788889999998
No 174
>TIGR02976 phageshock_pspB phage shock protein B. This model describes the PspB protein of the psp (phage shock protein) operon, as found in Escherichia coli and many related species. Expression of a phage protein called secretin protein IV, and a number of other stresses including ethanol, heat shock, and defects in protein secretion trigger sigma-54-dependent expression of the phage shock regulon. PspB is both a regulator and an effector protein of the phage shock response.
Probab=34.00 E-value=35 Score=25.61 Aligned_cols=27 Identities=15% Similarity=0.310 Sum_probs=17.0
Q ss_pred HHHHHHHHHHHHHHHHhhheeecCCCC
Q psy951 235 LYIPTLLLLLALVSATVYYVWRKRPFG 261 (428)
Q Consensus 235 ~~~~i~~~~~~~~~~~~~~~~r~~~~~ 261 (428)
+++++++++++++.+..+++|+++++.
T Consensus 6 l~~Pliif~ifVap~wl~lHY~~k~~~ 32 (75)
T TIGR02976 6 LAIPLIIFVIFVAPLWLILHYRSKRKT 32 (75)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHhhhcc
Confidence 455555555566667777888765533
No 175
>PF11118 DUF2627: Protein of unknown function (DUF2627); InterPro: IPR020138 This entry represents uncharacterised membrane proteins with no known function.
Probab=33.78 E-value=67 Score=24.08 Aligned_cols=31 Identities=16% Similarity=0.135 Sum_probs=23.8
Q ss_pred hhHHHHHHHHHHHHHHHHHhhheeecCCCCC
Q psy951 232 RSLLYIPTLLLLLALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 232 ~~~~~~~i~~~~~~~~~~~~~~~~r~~~~~~ 262 (428)
+.....+++++++.+-.+.+++++|-|++.+
T Consensus 40 wlqfl~G~~lf~~G~~Fi~GfI~~RDRKrnk 70 (77)
T PF11118_consen 40 WLQFLAGLLLFAIGVGFIAGFILHRDRKRNK 70 (77)
T ss_pred HHHHHHHHHHHHHHHHHHHhHhheeeccccc
Confidence 4556666666677788888999999988777
No 176
>PF14759 Reductase_C: Reductase C-terminal; PDB: 3FG2_P 3LXD_A 2YVG_A 2GR1_A 2GQW_A 2GR3_A 2YVF_A 1F3P_A 2GR0_A 2GR2_A ....
Probab=33.33 E-value=1.1e+02 Score=23.28 Aligned_cols=27 Identities=26% Similarity=0.463 Sum_probs=19.2
Q ss_pred EEEeeCCCCeEEEEEcCCCCcEEEEeCC
Q psy951 2 FWAETGASPRIESAWMDGSHRRSLVMTG 29 (428)
Q Consensus 2 yWtd~~~~~~I~~a~~DG~~~~~l~~~~ 29 (428)
||||.... +|+.+..-+..-++++..+
T Consensus 2 FWSdQ~~~-~iq~~G~~~~~~~~v~rg~ 28 (85)
T PF14759_consen 2 FWSDQYGV-RIQIAGLPGGADEVVVRGD 28 (85)
T ss_dssp EEEEETTE-EEEEEE-STTSSEEEEEEE
T ss_pred eecccCCC-eEEEEECCCCCCEEEEEcc
Confidence 99999887 8999987655545555544
No 177
>PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis. 
Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction: Dipeptidyl-Polypeptide + H(2)O = Dipeptide + Polypeptide. It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0006508 proteolysis, 0016020 membrane; PDB: 2RIP_A 3Q8W_B 2AJL_I 1TKR_B 1TK3_B 3C45_A 2G5P_A 3G0C_D 1R9M_C 1RWQ_A ....
Probab=33.20 E-value=1.7e+02 Score=28.75 Aligned_cols=74 Identities=15% Similarity=0.249 Sum_probs=48.7
Q ss_pred cCCCEEEEEecCCC--eEEEeeeccccCCceEEEEeccccc-eeeeeeccCCeEEEEeCC----CCcEEEEEcc-Cccee
Q psy951 333 PVDEIIYWVDSYDR--NIRRSFMLEAQKGQVQAVISDERRI-EALDIDPVDEIIYWVDSY----DRNIRRSFML-EAQKG 404 (428)
Q Consensus 333 ~~~~~lyWtd~~~~--~I~ra~l~g~~~~~~~~i~~~~~~p-~glavD~~~~~lYwtd~~----~~~I~~~~~~-g~~~~ 404 (428)
+..+.++|.-..++ .|+...++| +...-+.++--.. .-+++|..++.||++-.. ...+.+++++ |+..+
T Consensus 245 ~~~~~~l~~s~~~G~~hly~~~~~~---~~~~~lT~G~~~V~~i~~~d~~~~~iyf~a~~~~p~~r~lY~v~~~~~~~~~ 321 (353)
T PF00930_consen 245 PDGNEFLWISERDGYRHLYLYDLDG---GKPRQLTSGDWEVTSILGWDEDNNRIYFTANGDNPGERHLYRVSLDSGGEPK 321 (353)
T ss_dssp TTSSEEEEEEETTSSEEEEEEETTS---SEEEESS-SSS-EEEEEEEECTSSEEEEEESSGGTTSBEEEEEETTETTEEE
T ss_pred CCCCEEEEEEEcCCCcEEEEEcccc---cceeccccCceeecccceEcCCCCEEEEEecCCCCCceEEEEEEeCCCCCeE
Confidence 44555555533443 899999988 3333333332233 468999999999998874 5689999999 77766
Q ss_pred EEEec
Q psy951 405 QVQAG 409 (428)
Q Consensus 405 ~l~~~ 409 (428)
.|-..
T Consensus 322 ~LT~~ 326 (353)
T PF00930_consen 322 CLTCE 326 (353)
T ss_dssp ESSTT
T ss_pred eccCC
Confidence 66433
No 178
>PRK02888 nitrous-oxide reductase; Validated
Probab=32.99 E-value=3.1e+02 Score=29.47 Aligned_cols=33 Identities=3% Similarity=0.003 Sum_probs=29.1
Q ss_pred cccceeeeeeccCCeEEEEeCCCCcEEEEEccC
Q psy951 368 ERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLE 400 (428)
Q Consensus 368 ~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g 400 (428)
-..|.|+++...++.+|-+-.....+++.++.-
T Consensus 320 GKsPHGV~vSPDGkylyVanklS~tVSVIDv~k 352 (635)
T PRK02888 320 PKNPHGVNTSPDGKYFIANGKLSPTVTVIDVRK 352 (635)
T ss_pred CCCccceEECCCCCEEEEeCCCCCcEEEEEChh
Confidence 458999999999999999999888899988754
No 179
>PF05454 DAG1: Dystroglycan (Dystrophin-associated glycoprotein 1); InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. 
Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear.; PDB: 1EG4_P.
Probab=32.26 E-value=15 Score=35.19 Aligned_cols=18 Identities=17% Similarity=0.361 Sum_probs=0.0
Q ss_pred HHHHHHhhheeecCCCCC
Q psy951 245 ALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~~~~ 262 (428)
+++++++++.||||+..|
T Consensus 160 LIA~iIa~icyrrkR~GK 177 (290)
T PF05454_consen 160 LIAGIIACICYRRKRKGK 177 (290)
T ss_dssp ------------------
T ss_pred HHHHHHHHHhhhhhhccc
Confidence 334444455666655554
No 180
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=30.12 E-value=56 Score=22.15 Aligned_cols=24 Identities=29% Similarity=0.389 Sum_probs=19.9
Q ss_pred CCCCceeEEEcCCCCCEEEEEECCC
Q psy951 29 GVRHPTGLSVDAAMDHTLYWVDSKL 53 (428)
Q Consensus 29 ~~~~P~glavD~~~~~~lYW~d~~~ 53 (428)
.-..|.||.+|..++ .|.|+-...
T Consensus 9 ~~~LP~gLs~d~~tG-~isGtp~~~ 32 (49)
T PF05345_consen 9 GGGLPSGLSLDPSTG-TISGTPTSS 32 (49)
T ss_pred CCCCCCcEEEeCCCC-EEEeecCCC
Confidence 456899999999999 999995443
No 181
>PRK09458 pspB phage shock protein B; Provisional
Probab=29.89 E-value=65 Score=24.09 Aligned_cols=27 Identities=22% Similarity=0.442 Sum_probs=17.4
Q ss_pred HHHHHHHHHHHHHHHHhhheeecCCCC
Q psy951 235 LYIPTLLLLLALVSATVYYVWRKRPFG 261 (428)
Q Consensus 235 ~~~~i~~~~~~~~~~~~~~~~r~~~~~ 261 (428)
+++++++++++++.+-++++|+.+++.
T Consensus 6 l~~PliiF~ifVaPiWL~LHY~sk~~~ 32 (75)
T PRK09458 6 LAIPLTIFVLFVAPIWLWLHYRSKRQG 32 (75)
T ss_pred HHHhHHHHHHHHHHHHHHHhhcccccC
Confidence 445555555566677777888876544
No 182
>KOG3653|consensus
Probab=29.56 E-value=1.1e+02 Score=31.41 Aligned_cols=21 Identities=24% Similarity=0.339 Sum_probs=13.3
Q ss_pred HHHHHHHHHHhhheeecCCCC
Q psy951 241 LLLLALVSATVYYVWRKRPFG 261 (428)
Q Consensus 241 ~~~~~~~~~~~~~~~r~~~~~ 261 (428)
+.+++++++.+|+.||.++-.
T Consensus 164 v~~l~~lvi~~~~~~r~~k~~ 184 (534)
T KOG3653|consen 164 VSLLAALVILAFLGYRQRKNA 184 (534)
T ss_pred HHHHHHHHHHHHHHHHHhhcc
Confidence 334466667777777777633
No 183
>PF02191 OLF: Olfactomedin-like domain; InterPro: IPR003112 The olfactomedin-domain was first identified in olfactomedin, an extracellular matrix protein of the olfactory neuroepithelium. Members of this extracellular domain-family have since been shown to be present in several metazoan proteins, such as latrophilins, myocilins, optimedins and noelins, the latter being involved in the generation of neural crest cells. Myocilin is of considerable interest, as mutations in its olfactomedin-domain can lead to glaucoma. The olfactomedin-domains in myocilin and optimedin are essential for the interaction between these two proteins.; GO: 0005515 protein binding
Probab=29.38 E-value=4.8e+02 Score=24.43 Aligned_cols=100 Identities=21% Similarity=0.217 Sum_probs=55.4
Q ss_pred eeee--cCCcceeeeecCccceee--ccccccc------------eeEEeeccCCC--EEEEEecCCCeEEEeeeccccC
Q psy951 297 TIVY--SNGPEIRAYETHKRRFRD--VISDERR------------IEALDIDPVDE--IIYWVDSYDRNIRRSFMLEAQK 358 (428)
Q Consensus 297 ~~~~--s~~~~~~~~~~~~~~~~~--~i~~~~~------------~~~l~~d~~~~--~lyWtd~~~~~I~ra~l~g~~~ 358 (428)
.+|| .+..+|.+|++......+ .+++... -+++++|. +| -||=+...+..|.-++||-..-
T Consensus 80 slYY~~~~s~~IvkydL~t~~v~~~~~L~~A~~~n~~~y~~~~~t~iD~AvDE-~GLWvIYat~~~~g~ivvskld~~tL 158 (250)
T PF02191_consen 80 SLYYNKYNSRNIVKYDLTTRSVVARRELPGAGYNNRFPYYWSGYTDIDFAVDE-NGLWVIYATEDNNGNIVVSKLDPETL 158 (250)
T ss_pred cEEEEecCCceEEEEECcCCcEEEEEECCccccccccceecCCCceEEEEEcC-CCEEEEEecCCCCCcEEEEeeCcccC
Confidence 4566 567888888888777662 2333333 36777773 33 3555555555799999987311
Q ss_pred CceEEEEeccccceeeeeeccCCeEEEEeCCC---CcEEEEE
Q psy951 359 GQVQAVISDERRIEALDIDPVDEIIYWVDSYD---RNIRRSF 397 (428)
Q Consensus 359 ~~~~~i~~~~~~p~glavD~~~~~lYwtd~~~---~~I~~~~ 397 (428)
.-.+..-..+..+..-.-=.+=+-||-+++.. ..|..+.
T Consensus 159 ~v~~tw~T~~~k~~~~naFmvCGvLY~~~s~~~~~~~I~yaf 200 (250)
T PF02191_consen 159 SVEQTWNTSYPKRSAGNAFMVCGVLYATDSYDTRDTEIFYAF 200 (250)
T ss_pred ceEEEEEeccCchhhcceeeEeeEEEEEEECCCCCcEEEEEE
Confidence 21222222333332222223347899999865 4455543
No 184
>COG4257 Vgb Streptogramin lyase [Defense mechanisms]
Probab=29.17 E-value=2.8e+02 Score=26.56 Aligned_cols=100 Identities=12% Similarity=0.059 Sum_probs=64.6
Q ss_pred CCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCccc-eeeeec-cccEEEEEeCCCCceeeeccCCCC
Q psy951 31 RHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQH-PISLDV-FENNIYWLARDTGSLYKQDKFGRG 108 (428)
Q Consensus 31 ~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~-p~~l~~-~~~~lYwtD~~~~~I~~~~~~g~~ 108 (428)
.-|.||.+.+.. .+|++....+.|-+++.-....+++..-+ .+.+ -..+-. ..+++.-|+|.++++++.+.....
T Consensus 189 ~gpyGi~atpdG--svwyaslagnaiaridp~~~~aev~p~P~-~~~~gsRriwsdpig~~wittwg~g~l~rfdPs~~s 265 (353)
T COG4257 189 GGPYGICATPDG--SVWYASLAGNAIARIDPFAGHAEVVPQPN-ALKAGSRRIWSDPIGRAWITTWGTGSLHRFDPSVTS 265 (353)
T ss_pred CCCcceEECCCC--cEEEEeccccceEEcccccCCcceecCCC-cccccccccccCccCcEEEeccCCceeeEeCccccc
Confidence 478999998764 49999888888888886444444444321 2111 123332 357888888999999999876443
Q ss_pred ceeeeccC-CCCCcccccccccccCC
Q psy951 109 VPVLISKD-LVNPSGVKAYHAQRYNT 133 (428)
Q Consensus 109 ~~~~~~~~-~~~p~~I~v~~~~~~~~ 133 (428)
-.+.-.-+ ...|.++.|.+..+.|.
T Consensus 266 W~eypLPgs~arpys~rVD~~grVW~ 291 (353)
T COG4257 266 WIEYPLPGSKARPYSMRVDRHGRVWL 291 (353)
T ss_pred ceeeeCCCCCCCcceeeeccCCcEEe
Confidence 33322223 45778888877776664
No 185
>PF02333 Phytase: Phytase; InterPro: IPR003431 Phytase (EC 3.1.3.8) (phytate 3-phosphatase) is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to consist of a single structural unit with a six-bladed propeller folding architecture.; GO: 0016158 3-phytase activity; PDB: 3AMS_A 3AMR_A 1QLG_A 2POO_A 1H6L_A 1CVM_A 1POO_A.
Probab=29.15 E-value=3.8e+02 Score=26.88 Aligned_cols=74 Identities=22% Similarity=0.316 Sum_probs=45.4
Q ss_pred CCceeEEEcCCCCCEEEEEECCCCeEEEEEcC--C-CCeEEEEeCC-CCcc-ceeeeecc-----ccEEEEEeCCCCcee
Q psy951 31 RHPTGLSVDAAMDHTLYWVDSKLNTIESVRHD--G-RNRQTILSGS-DKLQ-HPISLDVF-----ENNIYWLARDTGSLY 100 (428)
Q Consensus 31 ~~P~glavD~~~~~~lYW~d~~~~~I~~~~ld--G-~~~~~i~~~~-~~~~-~p~~l~~~-----~~~lYwtD~~~~~I~ 100 (428)
..|.|+++|...+ +||..+... =|++...+ + ..++.+.... ..+. -..+|+++ .++|.-++...++..
T Consensus 208 sQ~EGCVVDDe~g-~LYvgEE~~-GIW~y~Aep~~~~~~~~v~~~~g~~l~aDvEGlaly~~~~g~gYLivSsQG~~sf~ 285 (381)
T PF02333_consen 208 SQPEGCVVDDETG-RLYVGEEDV-GIWRYDAEPEGGNDRTLVASADGDGLVADVEGLALYYGSDGKGYLIVSSQGDNSFA 285 (381)
T ss_dssp S-EEEEEEETTTT-EEEEEETTT-EEEEEESSCCC-S--EEEEEBSSSSB-S-EEEEEEEE-CCC-EEEEEEEGGGTEEE
T ss_pred CcceEEEEecccC-CEEEecCcc-EEEEEecCCCCCCcceeeecccccccccCccceEEEecCCCCeEEEEEcCCCCeEE
Confidence 4899999999999 999999764 67887776 4 3344443321 1232 34577764 356777777666555
Q ss_pred eeccCC
Q psy951 101 KQDKFG 106 (428)
Q Consensus 101 ~~~~~g 106 (428)
..+..+
T Consensus 286 Vy~r~~ 291 (381)
T PF02333_consen 286 VYDREG 291 (381)
T ss_dssp EEESST
T ss_pred EEecCC
Confidence 555544
No 186
>PF15330 SIT: SHP2-interacting transmembrane adaptor protein, SIT
Probab=29.12 E-value=34 Score=27.59 Aligned_cols=18 Identities=28% Similarity=0.386 Sum_probs=8.5
Q ss_pred HHHHHHhhheeecCCCCC
Q psy951 245 ALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 245 ~~~~~~~~~~~r~~~~~~ 262 (428)
++++++.++.||.+++++
T Consensus 12 ll~l~asl~~wr~~~rq~ 29 (107)
T PF15330_consen 12 LLSLAASLLAWRMKQRQK 29 (107)
T ss_pred HHHHHHHHHHHHHHhhhc
Confidence 444444555555544333
No 187
>KOG3512|consensus
Probab=28.60 E-value=49 Score=33.62 Aligned_cols=25 Identities=20% Similarity=0.729 Sum_probs=18.0
Q ss_pred CEEeeCCCCceEecCCCCCccCccCCCC
Q psy951 195 GMCAESETGDLTCNCRQDFAGTFCENYT 222 (428)
Q Consensus 195 g~C~~~~~g~~~C~C~~gy~G~~Ce~~~ 222 (428)
.+|-..++ .|.|.+|-+|..|..-.
T Consensus 407 ktCNq~tG---qCpCkeGvtG~tCnrCa 431 (592)
T KOG3512|consen 407 KTCNQTTG---QCPCKEGVTGLTCNRCA 431 (592)
T ss_pred ccccccCC---cccCCCCCccccccccc
Confidence 36765552 69999999998886533
No 188
>PF06667 PspB: Phage shock protein B; InterPro: IPR009554 This family consists of several bacterial phage shock protein B (PspB) sequences. The phage shock protein (psp) operon is induced in response to heat, ethanol, osmotic shock and infection by filamentous bacteriophages. Expression of the operon requires the alternative sigma factor sigma54 and the transcriptional activator PspF. In addition, PspA plays a negative regulatory role, and the integral-membrane proteins PspB and PspC play a positive one.; GO: 0006355 regulation of transcription, DNA-dependent, 0009271 phage shock
Probab=28.55 E-value=50 Score=24.75 Aligned_cols=25 Identities=8% Similarity=0.332 Sum_probs=14.4
Q ss_pred HHHHHHHHHHHHHHHhhheeecCCC
Q psy951 236 YIPTLLLLLALVSATVYYVWRKRPF 260 (428)
Q Consensus 236 ~~~i~~~~~~~~~~~~~~~~r~~~~ 260 (428)
++++++++++++.+.++++|+.+++
T Consensus 7 ~~plivf~ifVap~WL~lHY~sk~~ 31 (75)
T PF06667_consen 7 FVPLIVFMIFVAPIWLILHYRSKWK 31 (75)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHhcc
Confidence 3344444445666666777777653
No 189
>COG4257 Vgb Streptogramin lyase [Defense mechanisms]
Probab=27.54 E-value=5.6e+02 Score=24.64 Aligned_cols=69 Identities=13% Similarity=0.137 Sum_probs=48.2
Q ss_pred eeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEE-EEeCCCCccceeeeeccc-cEEEEEeCCCCceeeeccCC
Q psy951 34 TGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQT-ILSGSDKLQHPISLDVFE-NNIYWLARDTGSLYKQDKFG 106 (428)
Q Consensus 34 ~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~-i~~~~~~~~~p~~l~~~~-~~lYwtD~~~~~I~~~~~~g 106 (428)
+.+-.|+. + ++.-++++.+++.+.+..-+.=+. -+.+ ...+|.++-+.. ++++.+|+..+.|.|.+.+.
T Consensus 236 Rriwsdpi-g-~~wittwg~g~l~rfdPs~~sW~eypLPg--s~arpys~rVD~~grVW~sea~agai~rfdpet 306 (353)
T COG4257 236 RRIWSDPI-G-RAWITTWGTGSLHRFDPSVTSWIEYPLPG--SKARPYSMRVDRHGRVWLSEADAGAIGRFDPET 306 (353)
T ss_pred cccccCcc-C-cEEEeccCCceeeEeCcccccceeeeCCC--CCCCcceeeeccCCcEEeeccccCceeecCccc
Confidence 33555655 3 588888888899888876554322 2333 467888888775 56666699999999999873
No 190
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
Probab=26.58 E-value=6.3e+02 Score=24.89 Aligned_cols=80 Identities=14% Similarity=0.154 Sum_probs=62.3
Q ss_pred cccccceeEEeeccCCCEEEEEecCCCeEEEeeeccccCCceEEE--Ee-ccccceeeeeeccCCeEEEEeCCCCcEEEE
Q psy951 320 ISDERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAV--IS-DERRIEALDIDPVDEIIYWVDSYDRNIRRS 396 (428)
Q Consensus 320 i~~~~~~~~l~~d~~~~~lyWtd~~~~~I~ra~l~g~~~~~~~~i--~~-~~~~p~glavD~~~~~lYwtd~~~~~I~~~ 396 (428)
..+....-+|.+.+..++||-++++...|--..++.. .+....+ .. .-+.|-++.++.-++.|+-+-...++|.+-
T Consensus 240 F~g~~~~aaIhis~dGrFLYasNRg~dsI~~f~V~~~-~g~L~~~~~~~teg~~PR~F~i~~~g~~Liaa~q~sd~i~vf 318 (346)
T COG2706 240 FTGTNWAAAIHISPDGRFLYASNRGHDSIAVFSVDPD-GGKLELVGITPTEGQFPRDFNINPSGRFLIAANQKSDNITVF 318 (346)
T ss_pred cCCCCceeEEEECCCCCEEEEecCCCCeEEEEEEcCC-CCEEEEEEEeccCCcCCccceeCCCCCEEEEEccCCCcEEEE
Confidence 3455567799999999999999999998877777763 1323333 22 667799999999999999999988888876
Q ss_pred EccC
Q psy951 397 FMLE 400 (428)
Q Consensus 397 ~~~g 400 (428)
..|.
T Consensus 319 ~~d~ 322 (346)
T COG2706 319 ERDK 322 (346)
T ss_pred EEcC
Confidence 6554
No 191
>PF06084 Cytomega_TRL10: Cytomegalovirus TRL10 protein; InterPro: IPR009284 This family consists of several Cytomegalovirus TRL10 proteins. TRL10 represents a structural component of the virus particle and like the other HCMV envelope glycoproteins, is present in a disulphide-linked complex.
Probab=26.26 E-value=1.3e+02 Score=24.34 Aligned_cols=8 Identities=25% Similarity=0.787 Sum_probs=4.6
Q ss_pred ceecCCCC
Q psy951 155 YQCACPEN 162 (428)
Q Consensus 155 ~~C~C~~g 162 (428)
-.|.|.+-
T Consensus 20 l~ckc~~~ 27 (150)
T PF06084_consen 20 LTCKCSPW 27 (150)
T ss_pred EEEecCCC
Confidence 46666653
No 192
>KOG0291|consensus
Probab=26.08 E-value=6.6e+02 Score=27.58 Aligned_cols=85 Identities=15% Similarity=0.014 Sum_probs=54.3
Q ss_pred eEEEEEcCCCCcEE----------EEeCCCCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCcccee
Q psy951 11 RIESAWMDGSHRRS----------LVMTGVRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPI 80 (428)
Q Consensus 11 ~I~~a~~DG~~~~~----------l~~~~~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~ 80 (428)
.+..+.|||+-|.= +....-..-.-+|+|+.+. .++-.+...-.|..-++....---++++.++-..-+
T Consensus 406 ~llssSLDGtVRAwDlkRYrNfRTft~P~p~QfscvavD~sGe-lV~AG~~d~F~IfvWS~qTGqllDiLsGHEgPVs~l 484 (893)
T KOG0291|consen 406 VLLSSSLDGTVRAWDLKRYRNFRTFTSPEPIQFSCVAVDPSGE-LVCAGAQDSFEIFVWSVQTGQLLDILSGHEGPVSGL 484 (893)
T ss_pred EEEEeecCCeEEeeeecccceeeeecCCCceeeeEEEEcCCCC-EEEeeccceEEEEEEEeecCeeeehhcCCCCcceee
Confidence 48889999986532 2222223445789999877 555555445567777776665556677643333334
Q ss_pred eeeccccEEEEEeCCC
Q psy951 81 SLDVFENNIYWLARDT 96 (428)
Q Consensus 81 ~l~~~~~~lYwtD~~~ 96 (428)
.++..+..|+=..|..
T Consensus 485 ~f~~~~~~LaS~SWDk 500 (893)
T KOG0291|consen 485 SFSPDGSLLASGSWDK 500 (893)
T ss_pred EEccccCeEEeccccc
Confidence 5777888888887763
No 193
>PF14759 Reductase_C: Reductase C-terminal; PDB: 3FG2_P 3LXD_A 2YVG_A 2GR1_A 2GQW_A 2GR3_A 2YVF_A 1F3P_A 2GR0_A 2GR2_A ....
Probab=25.00 E-value=97 Score=23.52 Aligned_cols=29 Identities=21% Similarity=0.404 Sum_probs=24.0
Q ss_pred EEEeCCCCcEEEEEccCcceeEEEecccC
Q psy951 384 YWVDSYDRNIRRSFMLEAQKGQVQAGASR 412 (428)
Q Consensus 384 Ywtd~~~~~I~~~~~~g~~~~~l~~~~~~ 412 (428)
||+|....+|..+-.-+...+++++++..
T Consensus 2 FWSdQ~~~~iq~~G~~~~~~~~v~rg~~~ 30 (85)
T PF14759_consen 2 FWSDQYGVRIQIAGLPGGADEVVVRGDPE 30 (85)
T ss_dssp EEEEETTEEEEEEE-STTSSEEEEEEETT
T ss_pred eecccCCCeEEEEECCCCCCEEEEEccCC
Confidence 89999999999999888888888877744
No 194
>PF05337 CSF-1: Macrophage colony stimulating factor-1 (CSF-1); InterPro: IPR008001 Colony stimulating factor 1 (CSF-1) is a homodimeric polypeptide growth factor whose primary function is to regulate the survival, proliferation, differentiation, and function of cells of the mononuclear phagocytic lineage. This lineage includes mononuclear phagocytic precursors, blood monocytes, tissue macrophages, osteoclasts, and microglia of the brain, all of which possess cell surface receptors for CSF-1. The protein has also been linked with male fertility and mutations in the Csf-1 gene have been found to cause osteopetrosis and failure of tooth eruption.; GO: 0005125 cytokine activity, 0008083 growth factor activity, 0016021 integral to membrane; PDB: 3EJJ_A.
Probab=24.68 E-value=25 Score=33.19 Aligned_cols=19 Identities=21% Similarity=0.350 Sum_probs=0.0
Q ss_pred HHHHHHHhhheeecCCCCC
Q psy951 244 LALVSATVYYVWRKRPFGK 262 (428)
Q Consensus 244 ~~~~~~~~~~~~r~~~~~~ 262 (428)
+|++++++++|||+|+|..
T Consensus 237 LVLLaVGGLLfYr~rrRs~ 255 (285)
T PF05337_consen 237 LVLLAVGGLLFYRRRRRSH 255 (285)
T ss_dssp -------------------
T ss_pred hhhhhccceeeeccccccc
Confidence 3677788888888877554
No 195
>KOG4649|consensus
Probab=23.75 E-value=1.8e+02 Score=27.55 Aligned_cols=54 Identities=20% Similarity=0.274 Sum_probs=36.3
Q ss_pred CCCceeEEEcCCCCCEEEEEECCCCeEEEEEcCCCCeEEEEeCCCCccceeeeeccccEEEEEeCCCC
Q psy951 30 VRHPTGLSVDAAMDHTLYWVDSKLNTIESVRHDGRNRQTILSGSDKLQHPISLDVFENNIYWLARDTG 97 (428)
Q Consensus 30 ~~~P~glavD~~~~~~lYW~d~~~~~I~~~~ldG~~~~~i~~~~~~~~~p~~l~~~~~~lYwtD~~~~ 97 (428)
...-.=+|||+.++ .|||-..-..+||..-+=-.+. +.|--+.+.||+.+..++
T Consensus 30 SHs~~~~avd~~sG-~~~We~ilg~RiE~sa~vvgdf-------------VV~GCy~g~lYfl~~~tG 83 (354)
T KOG4649|consen 30 SHSGIVIAVDPQSG-NLIWEAILGVRIECSAIVVGDF-------------VVLGCYSGGLYFLCVKTG 83 (354)
T ss_pred cCCceEEEecCCCC-cEEeehhhCceeeeeeEEECCE-------------EEEEEccCcEEEEEecch
Confidence 34566789999999 7999998888999865433333 222334556666665555
No 196
>PTZ00214 high cysteine membrane protein Group 4; Provisional
Probab=22.83 E-value=17 Score=40.16 Aligned_cols=17 Identities=24% Similarity=0.434 Sum_probs=12.2
Q ss_pred eEecCCCCCc--cCccCCC
Q psy951 205 LTCNCRQDFA--GTFCENY 221 (428)
Q Consensus 205 ~~C~C~~gy~--G~~Ce~~ 221 (428)
..|.|..||. +..|...
T Consensus 751 ~vC~C~~g~~l~~~~c~~~ 769 (800)
T PTZ00214 751 GVCMCELDAVLTKGVCVPA 769 (800)
T ss_pred CeEEeCCcceecCCeeEec
Confidence 5788999887 5667543
No 197
>KOG0273|consensus
Probab=22.00 E-value=7.7e+02 Score=25.42 Aligned_cols=43 Identities=21% Similarity=0.092 Sum_probs=36.0
Q ss_pred ccccceeeeeeccCCeEEEEeCCCCcEEEEEccCcceeEEEec
Q psy951 367 DERRIEALDIDPVDEIIYWVDSYDRNIRRSFMLEAQKGQVQAG 409 (428)
Q Consensus 367 ~~~~p~glavD~~~~~lYwtd~~~~~I~~~~~~g~~~~~l~~~ 409 (428)
.+++..+|.|||++..=|-+-.....|.|+.++++....-+.+
T Consensus 315 ~~~s~~~lDVdW~~~~~F~ts~td~~i~V~kv~~~~P~~t~~G 357 (524)
T KOG0273|consen 315 EFHSAPALDVDWQSNDEFATSSTDGCIHVCKVGEDRPVKTFIG 357 (524)
T ss_pred eeccCCccceEEecCceEeecCCCceEEEEEecCCCcceeeec
Confidence 3666679999999999999999888999999999877665544
No 198
>PF05545 FixQ: Cbb3-type cytochrome oxidase component FixQ; InterPro: IPR008621 This family consists of several Cbb3-type cytochrome oxidase components (FixQ/CcoQ). FixQ is found in nitrogen fixing bacteria. Since nitrogen fixation is an energy-consuming process, effective symbioses depend on operation of a respiratory chain with a high affinity for O2, closely coupled to ATP production. This requirement is fulfilled by a special three-subunit terminal oxidase (cytochrome terminal oxidase cbb3), which was first identified in Bradyrhizobium japonicum as the product of the fixNOQP operon.
Probab=21.90 E-value=1.1e+02 Score=20.51 Aligned_cols=7 Identities=29% Similarity=0.771 Sum_probs=2.7
Q ss_pred heeecCC
Q psy951 253 YVWRKRP 259 (428)
Q Consensus 253 ~~~r~~~ 259 (428)
..+++++
T Consensus 28 w~~~~~~ 34 (49)
T PF05545_consen 28 WAYRPRN 34 (49)
T ss_pred HHHcccc
Confidence 3344433
No 199
>KOG3512|consensus
Probab=21.73 E-value=75 Score=32.38 Aligned_cols=35 Identities=31% Similarity=0.813 Sum_probs=26.2
Q ss_pred CCCCCCC-CEEeeCCCCceEecCCCCCccCccCCCC
Q psy951 188 VCQCQNG-GMCAESETGDLTCNCRQDFAGTFCENYT 222 (428)
Q Consensus 188 ~c~C~ng-g~C~~~~~g~~~C~C~~gy~G~~Ce~~~ 222 (428)
-|.|.-+ ..|+....+.++|.|.++-.|..|+.-.
T Consensus 277 RCKCNgHAs~Cv~d~~~~ltCdC~HNTaGPdCgrCK 312 (592)
T KOG3512|consen 277 RCKCNGHASRCVMDESSHLTCDCEHNTAGPDCGRCK 312 (592)
T ss_pred eeeecCccceeeeccCCceEEecccCCCCCCccccc
Confidence 3456322 2688888777999999999999987644
No 200
>PF06024 DUF912: Nucleopolyhedrovirus protein of unknown function (DUF912); InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=21.14 E-value=1.1e+02 Score=24.28 Aligned_cols=9 Identities=33% Similarity=0.582 Sum_probs=3.4
Q ss_pred HHhhheeec
Q psy951 249 ATVYYVWRK 257 (428)
Q Consensus 249 ~~~~~~~r~ 257 (428)
++.|++..|
T Consensus 80 ~IyYFVILR 88 (101)
T PF06024_consen 80 AIYYFVILR 88 (101)
T ss_pred hheEEEEEe
Confidence 333444333
No 201
>cd01328 FSL_SPARC Follistatin-like SPARC (secreted protein, acidic, and rich in cysteines) domain; SPARC/BM-40/osteonectin is a multifunctional glycoprotein which modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. The protein is composed of an N-terminal acidic region, a follistatin (FS) domain and an EF-hand calcium binding domain. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a small hydrophobic core of alpha/beta structure (Kazal domain) and has five disulfide bonds and a conserved N-glycosylation site. The FSL_SPARC domain is a member of the superfamily of kazal-like proteinase inhibitors and follistatin-like proteins.
Probab=20.56 E-value=1.1e+02 Score=23.56 Aligned_cols=22 Identities=23% Similarity=0.738 Sum_probs=17.2
Q ss_pred CCCCCCEEeeCCCCceEecCCC
Q psy951 190 QCQNGGMCAESETGDLTCNCRQ 211 (428)
Q Consensus 190 ~C~ngg~C~~~~~g~~~C~C~~ 211 (428)
.|..|-.|.....|...|.|.+
T Consensus 6 ~C~~G~~C~~d~~~~p~CvC~~ 27 (86)
T cd01328 6 HCGAGKVCEVDDENTPKCVCID 27 (86)
T ss_pred CCCCCCEeeECCCCCeEEecCC
Confidence 5778888987666788998875
No 202
>PF11857 DUF3377: Domain of unknown function (DUF3377); InterPro: IPR021805 This domain is functionally uncharacterised and found at the C terminus of peptidases belonging to MEROPS peptidase family M10A, membrane-type matrix metallopeptidases (clan MA). ; GO: 0004222 metalloendopeptidase activity
Probab=20.41 E-value=72 Score=23.77 Aligned_cols=28 Identities=25% Similarity=0.422 Sum_probs=16.0
Q ss_pred hhHHHHHHHHHHHHHHHHHhhheeecCC
Q psy951 232 RSLLYIPTLLLLLALVSATVYYVWRKRP 259 (428)
Q Consensus 232 ~~~~~~~i~~~~~~~~~~~~~~~~r~~~ 259 (428)
.+++.++.++++.++.++..++.+|++.
T Consensus 31 avaVviPl~L~LCiLvl~yai~~fkrkG 58 (74)
T PF11857_consen 31 AVAVVIPLVLLLCILVLIYAIFQFKRKG 58 (74)
T ss_pred EEEEeHHHHHHHHHHHHHHHhheeeecC
Confidence 4556666666555555555555566554
No 203
>PF14380 WAK_assoc: Wall-associated receptor kinase C-terminal
Probab=20.01 E-value=1e+02 Score=23.92 Aligned_cols=20 Identities=30% Similarity=0.820 Sum_probs=14.8
Q ss_pred CCCEEeeCC-CCceEecCCCC
Q psy951 193 NGGMCAESE-TGDLTCNCRQD 212 (428)
Q Consensus 193 ngg~C~~~~-~g~~~C~C~~g 212 (428)
.||.|-... ...+.|-|+.|
T Consensus 73 SgG~Cgy~~~~~~f~C~C~dg 93 (94)
T PF14380_consen 73 SGGRCGYDSNSEQFTCFCSDG 93 (94)
T ss_pred CCCEeCCCCCCceEEEECCCC
Confidence 578996554 25699999976
Done!