Query psy4900
Match_columns 485
No_of_seqs 394 out of 2614
Neff 8.7
Searched_HMMs 46136
Date Fri Aug 16 20:44:52 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy4900.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/4900hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1215|consensus 100.0 1.7E-33 3.6E-38 315.5 23.1 425 10-477 136-603 (877)
2 KOG1214|consensus 100.0 6.7E-29 1.5E-33 253.5 19.4 113 368-484 1080-1198(1289)
3 KOG1214|consensus 99.9 2E-26 4.4E-31 235.5 16.4 117 363-483 1032-1154(1289)
4 KOG1215|consensus 99.9 4.6E-21 1E-25 215.3 22.8 410 29-482 113-564 (877)
5 PF00058 Ldl_recept_b: Low-den 99.2 1.6E-11 3.4E-16 83.0 5.6 39 413-451 1-42 (42)
6 smart00135 LY Low-density lipo 98.9 2.4E-09 5.2E-14 72.9 5.7 42 435-476 2-43 (43)
7 PF00057 Ldl_recept_a: Low-den 98.9 1E-09 2.2E-14 71.6 2.8 37 10-48 1-37 (37)
8 PF00057 Ldl_recept_a: Low-den 98.9 9.1E-10 2E-14 71.8 1.7 36 95-130 2-37 (37)
9 cd00112 LDLa Low Density Lipop 98.8 1.8E-09 3.8E-14 69.7 2.2 35 96-130 1-35 (35)
10 cd00112 LDLa Low Density Lipop 98.8 2.9E-09 6.3E-14 68.7 2.6 35 12-48 1-35 (35)
11 PF14670 FXa_inhibition: Coagu 98.6 1.3E-08 2.9E-13 65.6 0.7 36 262-302 1-36 (36)
12 smart00135 LY Low-density lipo 98.6 9.1E-08 2E-12 64.9 4.7 40 394-433 2-43 (43)
13 smart00192 LDLa Low-density li 98.5 8.2E-08 1.8E-12 61.1 2.4 32 96-127 2-33 (33)
14 smart00192 LDLa Low-density li 98.4 1.5E-07 3.3E-12 59.9 2.9 32 12-45 2-33 (33)
15 PF00058 Ldl_recept_b: Low-den 98.3 8.9E-07 1.9E-11 59.7 5.0 30 454-483 1-31 (42)
16 PF08450 SGL: SMP-30/Gluconola 98.3 2.1E-06 4.5E-11 82.3 8.9 106 368-480 97-221 (246)
17 PLN02919 haloacid dehalogenase 98.2 7.4E-06 1.6E-10 93.7 12.3 107 367-477 579-718 (1057)
18 PF08450 SGL: SMP-30/Gluconola 98.2 4.9E-06 1.1E-10 79.7 8.7 101 366-473 50-165 (246)
19 PF12999 PRKCSH-like: Glucosid 98.1 2.4E-06 5.3E-11 75.4 4.6 69 57-127 34-110 (176)
20 PF03088 Str_synth: Strictosid 98.1 1.3E-05 2.9E-10 63.2 6.8 68 405-473 2-88 (89)
21 PF12999 PRKCSH-like: Glucosid 98.0 1.1E-05 2.4E-10 71.3 5.5 71 13-88 34-110 (176)
22 PLN02919 haloacid dehalogenase 98.0 3.6E-05 7.7E-10 88.2 10.7 86 395-480 562-662 (1057)
23 PF14670 FXa_inhibition: Coagu 97.9 4.1E-06 8.9E-11 54.0 0.6 31 182-212 6-36 (36)
24 PF12662 cEGF: Complement Clr- 97.7 1.6E-05 3.4E-10 46.0 1.6 24 194-217 1-24 (24)
25 COG3386 Gluconolactonase [Carb 97.5 0.00051 1.1E-08 67.6 9.8 97 378-480 145-251 (307)
26 KOG4659|consensus 97.5 0.00024 5.2E-09 78.5 7.5 76 398-476 472-565 (1899)
27 PF12662 cEGF: Complement Clr- 97.4 9.8E-05 2.1E-09 42.8 1.8 21 284-304 1-21 (24)
28 PF07645 EGF_CA: Calcium-bindi 97.3 8.1E-05 1.8E-09 50.2 0.8 35 262-301 5-41 (42)
29 TIGR02604 Piru_Ver_Nterm putat 97.1 0.0035 7.6E-08 63.8 10.7 94 389-485 59-185 (367)
30 PF01436 NHL: NHL repeat; Int 97.0 0.0019 4.1E-08 39.3 4.6 27 441-468 1-27 (28)
31 KOG1520|consensus 96.8 0.0018 4E-08 64.0 5.6 113 365-482 124-262 (376)
32 PF07645 EGF_CA: Calcium-bindi 96.8 0.00042 9E-09 46.7 0.8 27 185-211 15-41 (42)
33 COG3386 Gluconolactonase [Carb 96.8 0.0053 1.1E-07 60.5 8.6 71 400-472 110-193 (307)
34 KOG1520|consensus 96.8 0.0029 6.4E-08 62.6 6.4 63 400-463 114-181 (376)
35 PF10282 Lactonase: Lactonase, 96.6 0.01 2.3E-07 59.8 9.5 109 362-474 150-277 (345)
36 KOG4659|consensus 96.4 0.017 3.8E-07 64.5 10.0 102 368-477 376-507 (1899)
37 TIGR02604 Piru_Ver_Nterm putat 96.3 0.033 7.1E-07 56.7 11.2 87 391-482 2-114 (367)
38 PF06977 SdiA-regulated: SdiA- 96.0 0.016 3.4E-07 55.3 6.6 68 399-468 169-246 (248)
39 KOG4499|consensus 95.9 0.05 1.1E-06 50.4 8.9 80 395-475 152-244 (310)
40 COG3391 Uncharacterized conser 95.6 0.1 2.3E-06 53.3 11.0 106 365-477 83-195 (381)
41 PF03022 MRJP: Major royal jel 95.6 0.06 1.3E-06 52.7 8.8 80 403-483 130-230 (287)
42 PF01731 Arylesterase: Arylest 95.5 0.057 1.2E-06 42.4 6.6 42 429-471 42-83 (86)
43 cd01475 vWA_Matrilin VWA_Matri 95.3 0.013 2.8E-07 55.3 3.1 35 262-301 190-224 (224)
44 TIGR03606 non_repeat_PQQ dehyd 95.3 0.2 4.4E-06 51.9 12.1 94 389-483 18-140 (454)
45 PF10282 Lactonase: Lactonase, 95.3 0.094 2E-06 52.9 9.5 112 362-474 198-324 (345)
46 PF07995 GSDH: Glucose / Sorbo 95.2 0.038 8.2E-07 55.4 6.2 67 399-466 112-205 (331)
47 PRK11028 6-phosphogluconolacto 95.2 0.15 3.2E-06 50.9 10.5 109 363-473 42-157 (330)
48 PF07995 GSDH: Glucose / Sorbo 95.1 0.061 1.3E-06 53.9 7.3 74 400-474 1-92 (331)
49 KOG1219|consensus 95.1 0.032 6.9E-07 65.7 5.6 95 178-330 3866-3969(4289)
50 PF03088 Str_synth: Strictosid 95.0 0.021 4.6E-07 45.1 2.8 42 390-431 46-89 (89)
51 PF03022 MRJP: Major royal jel 94.7 0.098 2.1E-06 51.2 7.3 65 401-466 186-259 (287)
52 KOG2397|consensus 94.6 0.031 6.6E-07 56.8 3.6 69 59-129 43-115 (480)
53 PRK11028 6-phosphogluconolacto 94.5 0.36 7.9E-06 48.1 11.2 73 400-472 34-110 (330)
54 PF06977 SdiA-regulated: SdiA- 94.4 0.2 4.3E-06 47.8 8.5 76 403-478 120-207 (248)
55 KOG2397|consensus 94.3 0.039 8.5E-07 56.1 3.7 71 100-174 44-115 (480)
56 KOG3509|consensus 94.3 0.045 9.8E-07 60.8 4.3 104 25-130 2-110 (964)
57 PF01436 NHL: NHL repeat; Int 94.0 0.064 1.4E-06 32.4 2.8 25 400-425 1-27 (28)
58 KOG3509|consensus 93.5 0.096 2.1E-06 58.3 5.0 102 69-175 3-110 (964)
59 TIGR03606 non_repeat_PQQ dehyd 93.1 0.43 9.4E-06 49.5 8.8 41 422-463 200-250 (454)
60 COG3391 Uncharacterized conser 93.0 0.91 2E-05 46.4 11.1 112 363-479 167-290 (381)
61 smart00179 EGF_CA Calcium-bind 93.0 0.11 2.3E-06 33.7 2.9 28 267-302 9-38 (39)
62 smart00179 EGF_CA Calcium-bind 92.9 0.11 2.3E-06 33.7 2.8 23 187-212 16-38 (39)
63 KOG1219|consensus 92.9 0.14 3E-06 60.8 5.2 67 179-304 3906-3978(4289)
64 PRK04043 tolB translocation pr 92.9 0.77 1.7E-05 47.6 10.4 113 360-481 281-409 (419)
65 PF12947 EGF_3: EGF domain; I 91.9 0.044 9.5E-07 35.4 -0.1 31 265-302 4-36 (36)
66 cd01475 vWA_Matrilin VWA_Matri 91.9 0.12 2.6E-06 48.7 2.7 43 157-210 181-223 (224)
67 PRK04922 tolB translocation pr 91.9 1.4 3.1E-05 45.8 11.1 115 361-481 253-377 (433)
68 PRK04043 tolB translocation pr 91.8 1.2 2.6E-05 46.1 10.3 115 360-482 237-367 (419)
69 PRK04792 tolB translocation pr 91.1 1.9 4.1E-05 45.1 11.0 114 361-480 267-390 (448)
70 TIGR03866 PQQ_ABC_repeats PQQ- 91.0 4.3 9.3E-05 38.9 12.9 75 402-477 208-284 (300)
71 smart00181 EGF Epidermal growt 90.9 0.22 4.8E-06 31.5 2.5 23 183-205 7-30 (35)
72 KOG4260|consensus 90.9 0.16 3.5E-06 47.7 2.4 47 198-295 221-269 (350)
73 PRK00178 tolB translocation pr 90.6 2.8 6E-05 43.5 11.7 115 361-481 248-372 (430)
74 PRK01029 tolB translocation pr 90.5 2.7 5.9E-05 43.7 11.5 117 361-482 286-413 (428)
75 smart00181 EGF Epidermal growt 90.5 0.26 5.7E-06 31.1 2.5 24 267-295 6-30 (35)
76 TIGR02800 propeller_TolB tol-p 90.5 3.1 6.7E-05 42.7 11.9 114 361-480 239-362 (417)
77 PRK04792 tolB translocation pr 90.4 2.1 4.6E-05 44.8 10.6 113 361-480 311-433 (448)
78 COG2706 3-carboxymuconate cycl 90.4 3.4 7.3E-05 40.8 11.0 107 362-472 151-274 (346)
79 TIGR03866 PQQ_ABC_repeats PQQ- 90.2 4 8.8E-05 39.1 11.9 92 377-474 12-105 (300)
80 PRK02889 tolB translocation pr 90.1 2.8 6E-05 43.6 11.2 115 361-481 245-369 (427)
81 PRK05137 tolB translocation pr 89.9 2.8 6E-05 43.6 11.0 115 360-480 250-374 (435)
82 KOG4260|consensus 89.9 0.64 1.4E-05 43.8 5.4 21 185-205 249-269 (350)
83 PRK04922 tolB translocation pr 89.6 2.5 5.5E-05 43.9 10.4 114 361-481 297-420 (433)
84 PRK03629 tolB translocation pr 89.0 4.9 0.00011 41.7 12.0 114 361-480 248-371 (429)
85 TIGR02658 TTQ_MADH_Hv methylam 89.0 4.8 0.0001 40.5 11.3 104 366-474 205-332 (352)
86 PRK05137 tolB translocation pr 88.6 3 6.4E-05 43.4 10.1 113 361-480 295-420 (435)
87 COG2706 3-carboxymuconate cycl 88.5 4.1 8.9E-05 40.2 10.1 101 377-478 17-127 (346)
88 PF09064 Tme5_EGF_like: Thromb 88.0 0.67 1.4E-05 29.1 2.8 26 180-206 4-29 (34)
89 COG3204 Uncharacterized protei 87.3 3.2 7E-05 40.1 8.4 76 399-476 84-162 (316)
90 TIGR02800 propeller_TolB tol-p 87.2 3.8 8.2E-05 42.1 9.9 112 362-480 284-405 (417)
91 PRK02889 tolB translocation pr 87.0 5.6 0.00012 41.3 11.0 113 361-480 289-411 (427)
92 PRK00178 tolB translocation pr 86.2 5.9 0.00013 41.0 10.7 113 361-480 292-414 (430)
93 PF01731 Arylesterase: Arylest 85.9 2.1 4.6E-05 33.6 5.4 39 389-428 43-83 (86)
94 cd00053 EGF Epidermal growth f 85.6 0.79 1.7E-05 28.5 2.4 20 186-205 12-31 (36)
95 PF13449 Phytase-like: Esteras 84.7 3.5 7.6E-05 41.1 7.8 78 400-479 19-127 (326)
96 PRK03629 tolB translocation pr 84.3 10 0.00023 39.3 11.4 99 377-481 180-284 (429)
97 PF12947 EGF_3: EGF domain; I 84.2 0.37 8E-06 31.1 0.4 26 185-212 11-36 (36)
98 PRK01029 tolB translocation pr 84.0 14 0.0003 38.4 12.2 120 361-481 236-368 (428)
99 cd00053 EGF Epidermal growth f 82.6 1.3 2.8E-05 27.5 2.5 20 271-295 12-31 (36)
100 KOG4289|consensus 82.0 1.7 3.6E-05 50.1 4.4 23 272-303 1252-1274(2531)
101 COG2133 Glucose/sorbosone dehy 81.2 4.2 9.1E-05 41.5 6.7 65 400-466 176-263 (399)
102 TIGR02658 TTQ_MADH_Hv methylam 81.1 19 0.00042 36.3 11.3 70 407-476 200-291 (352)
103 PF06247 Plasmod_Pvs28: Plasmo 80.5 0.25 5.5E-06 44.0 -2.0 47 272-322 57-103 (197)
104 PF00008 EGF: EGF-like domain 80.1 0.59 1.3E-05 29.2 0.2 19 186-204 10-29 (32)
105 PRK01742 tolB translocation pr 79.9 15 0.00032 38.2 10.6 115 358-478 206-330 (429)
106 PF13449 Phytase-like: Esteras 79.1 4 8.7E-05 40.7 5.8 57 402-459 86-164 (326)
107 COG3204 Uncharacterized protei 78.7 12 0.00025 36.4 8.3 78 403-480 183-271 (316)
108 PRK01742 tolB translocation pr 78.0 22 0.00048 36.9 11.2 98 377-480 185-288 (429)
109 PF02333 Phytase: Phytase; In 77.8 11 0.00024 38.2 8.5 76 401-476 208-294 (381)
110 KOG4649|consensus 76.7 7.9 0.00017 36.8 6.4 29 401-429 31-61 (354)
111 KOG1225|consensus 76.4 13 0.00028 39.4 8.7 8 287-294 355-362 (525)
112 cd00054 EGF_CA Calcium-binding 75.3 2.5 5.5E-05 26.6 2.2 19 186-204 15-33 (38)
113 COG3823 Glutamine cyclotransfe 75.2 16 0.00035 33.7 7.8 34 441-474 228-261 (262)
114 PF05787 DUF839: Bacterial pro 74.2 12 0.00025 40.1 7.9 66 398-464 347-458 (524)
115 cd00200 WD40 WD40 domain, foun 70.5 67 0.0015 29.3 11.7 74 399-473 134-208 (289)
116 KOG4499|consensus 69.6 19 0.00042 33.8 7.1 46 436-481 152-202 (310)
117 PF02333 Phytase: Phytase; In 69.5 33 0.00071 34.9 9.4 81 398-479 153-247 (381)
118 cd00200 WD40 WD40 domain, foun 68.8 59 0.0013 29.7 10.9 73 400-473 177-250 (289)
119 TIGR02276 beta_rpt_yvtn 40-res 67.8 21 0.00045 23.0 5.4 39 411-450 2-42 (42)
120 COG4946 Uncharacterized protei 66.8 81 0.0018 32.7 11.4 95 377-476 383-481 (668)
121 COG4257 Vgb Streptogramin lyas 65.9 27 0.00059 33.7 7.4 77 403-481 235-314 (353)
122 KOG4289|consensus 65.3 7 0.00015 45.4 4.0 31 179-213 1242-1274(2531)
123 PF02239 Cytochrom_D1: Cytochr 65.2 30 0.00064 35.2 8.4 69 403-472 39-108 (369)
124 PF14583 Pectate_lyase22: Olig 64.3 28 0.0006 35.4 7.7 68 413-481 49-119 (386)
125 TIGR03032 conserved hypothetic 59.0 16 0.00035 35.9 4.8 41 436-478 197-238 (335)
126 KOG1225|consensus 57.5 46 0.00099 35.3 8.2 9 197-205 355-363 (525)
127 PF00930 DPPIV_N: Dipeptidyl p 55.9 46 0.001 33.4 7.9 81 400-480 234-324 (353)
128 PRK02888 nitrous-oxide reducta 54.5 67 0.0015 34.8 8.9 32 442-473 321-352 (635)
129 COG0823 TolB Periplasmic compo 51.9 83 0.0018 32.7 9.1 98 377-481 219-323 (425)
130 PF02239 Cytochrom_D1: Cytochr 51.6 1E+02 0.0022 31.3 9.6 107 363-474 44-160 (369)
131 PF10313 DUF2415: Uncharacteri 51.5 44 0.00095 22.5 4.5 34 404-437 4-42 (43)
132 COG4257 Vgb Streptogramin lyas 51.3 72 0.0016 30.9 7.6 71 398-471 59-131 (353)
133 TIGR03118 PEPCTERM_chp_1 conse 51.0 70 0.0015 31.5 7.6 76 403-482 140-240 (336)
134 KOG1217|consensus 49.3 15 0.00033 38.1 3.3 61 186-297 243-305 (487)
135 PF04885 Stig1: Stigma-specifi 48.6 39 0.00084 29.0 5.0 32 38-72 76-108 (136)
136 COG4946 Uncharacterized protei 47.0 92 0.002 32.4 8.0 62 400-462 443-509 (668)
137 TIGR03075 PQQ_enz_alc_DH PQQ-d 46.8 69 0.0015 34.3 7.8 72 405-483 238-322 (527)
138 KOG0285|consensus 46.4 45 0.00098 33.3 5.6 63 395-458 146-210 (460)
139 PF14583 Pectate_lyase22: Olig 45.9 1.3E+02 0.0029 30.6 9.1 102 377-482 61-186 (386)
140 PF06739 SBBP: Beta-propeller 45.9 19 0.00041 23.4 2.1 19 442-461 13-31 (38)
141 COG2133 Glucose/sorbosone dehy 45.5 91 0.002 32.0 7.9 72 402-474 315-399 (399)
142 PF05096 Glu_cyclase_2: Glutam 43.1 1.2E+02 0.0026 29.2 7.9 67 408-474 180-263 (264)
143 COG0823 TolB Periplasmic compo 41.6 1.5E+02 0.0033 30.7 9.1 69 361-438 243-323 (425)
144 PF05345 He_PIG: Putative Ig d 40.4 27 0.00059 24.1 2.3 22 398-419 8-29 (49)
145 PF05345 He_PIG: Putative Ig d 40.1 36 0.00079 23.4 2.9 25 439-463 8-32 (49)
146 PF01683 EB: EB module; Inter 38.8 64 0.0014 22.2 4.1 11 195-205 37-47 (52)
147 PF12661 hEGF: Human growth fa 38.2 15 0.00033 17.9 0.6 9 196-204 1-9 (13)
148 KOG0266|consensus 37.1 4.6E+02 0.0099 27.4 12.1 85 389-475 236-321 (456)
149 TIGR03032 conserved hypothetic 36.5 1.3E+02 0.0027 29.9 7.0 60 396-461 198-260 (335)
150 PF06433 Me-amine-dh_H: Methyl 34.6 2.2E+02 0.0047 28.6 8.4 70 409-478 192-283 (342)
151 PF05694 SBP56: 56kDa selenium 33.8 1.4E+02 0.003 31.0 7.1 60 403-462 314-394 (461)
152 PF00930 DPPIV_N: Dipeptidyl p 33.7 57 0.0012 32.8 4.5 72 377-454 261-340 (353)
153 KOG4328|consensus 33.0 1.3E+02 0.0029 31.0 6.7 75 400-481 186-265 (498)
154 TIGR03118 PEPCTERM_chp_1 conse 33.0 3.9E+02 0.0086 26.4 9.6 78 400-479 76-175 (336)
155 TIGR03075 PQQ_enz_alc_DH PQQ-d 32.2 1.4E+02 0.0029 32.1 7.2 59 406-466 274-337 (527)
156 PF14339 DUF4394: Domain of un 30.0 2.3E+02 0.005 26.8 7.4 71 401-472 27-103 (236)
157 PF04706 Dickkopf_N: Dickkopf 28.8 81 0.0017 22.2 3.2 9 102-110 44-52 (52)
158 PF14251 DUF4346: Domain of un 28.7 1.5E+02 0.0032 24.7 5.1 64 403-476 9-74 (119)
159 KOG1274|consensus 28.4 3.3E+02 0.0072 30.8 9.2 63 420-483 75-137 (933)
160 KOG1217|consensus 27.8 73 0.0016 32.9 4.3 22 185-206 182-203 (487)
161 KOG0291|consensus 27.5 5.3E+02 0.012 28.7 10.3 64 420-483 404-477 (893)
162 KOG3567|consensus 27.4 89 0.0019 32.3 4.5 53 421-474 445-498 (501)
163 PF10042 DUF2278: Uncharacteri 27.4 40 0.00086 31.2 1.9 16 442-457 85-101 (206)
164 PF14759 Reductase_C: Reductas 26.5 1.2E+02 0.0025 23.5 4.2 27 416-442 2-30 (85)
165 KOG0285|consensus 25.8 3.7E+02 0.008 27.1 8.1 103 377-484 174-277 (460)
166 PF06433 Me-amine-dh_H: Methyl 24.0 7.2E+02 0.016 25.0 10.0 65 406-472 243-320 (342)
167 COG3211 PhoX Predicted phospha 22.8 3.4E+02 0.0073 29.2 7.7 64 398-461 414-519 (616)
168 PTZ00486 apyrase Superfamily; 22.8 4.1E+02 0.0089 26.7 8.0 24 451-474 122-145 (352)
169 KOG0650|consensus 20.9 1.2E+02 0.0025 32.5 4.0 69 403-472 569-637 (733)
170 KOG3658|consensus 20.8 2.6E+02 0.0057 30.5 6.5 23 42-64 498-520 (764)
171 PF04885 Stig1: Stigma-specifi 20.8 2.1E+02 0.0045 24.6 4.9 30 81-111 76-108 (136)
172 PF05096 Glu_cyclase_2: Glutam 20.5 3.5E+02 0.0077 26.1 6.9 67 404-473 48-119 (264)
No 1
>KOG1215|consensus
Probab=100.00 E-value=1.7e-33 Score=315.49 Aligned_cols=425 Identities=28% Similarity=0.536 Sum_probs=285.9
Q ss_pred CCCCCCcEEecCCCCceecCCcccCCcCCCCCCCcCCCCCCCCCCCCCCCCeecCCCCceecCcceeCCCCCCCCCCccc
Q psy4900 10 RKCSPGDFECDPPHGICIPKDKRCDGYYDCRNRKDEEGCPATTGLSCDLDQFRCANGQKCIDAKLKCNYHNDCGDNSDEE 89 (485)
Q Consensus 10 ~~C~~~~f~C~c~~g~ci~~~~~Cd~~~dC~d~sdE~~C~~~~~~~C~~~~f~C~~g~~Ci~~~~~Cd~~~dC~d~sDe~ 89 (485)
..|...+|+|...++.||+..|+||+..+|.+|+||..|... ...+....|+| |...++||...+|.+++|+.
T Consensus 136 ~~~~~~~~~c~~~~~~Cip~~~~cd~~~~C~dg~de~~~~~~-~~~~~~~~~~~------~~~~~~~d~~~~~~~~~d~~ 208 (877)
T KOG1215|consen 136 SHCCLDKFSCRTGSCKCIPGDWLCDGEADCPDGSDELNCAVR-RCEPRGASLDC------IVAIKVCDIQHDCADDYDES 208 (877)
T ss_pred ccccCCCCCCcCccccCCCCceeCCCCCccccchhhhccccc-ccCcccccccc------ceeeeecCcccccccccccc
Confidence 456778899943389999999999999999999999998731 12334445666 88889999999999999998
Q ss_pred CCCCCccC---CCcccCCC-CcccCccccCCCCCCCCCCCCCC--CCCCCCcCCCceecCCCCCCCCCcccCCCcccCCC
Q psy4900 90 KCNFTACH---VGQFKCAN-SLCIPVSYHCDGYRDCIDGSDET--NCTSIACPNNKFLCPMGAAGGKPKCIPKAQVCDGR 163 (485)
Q Consensus 90 ~C~~~~C~---~~~f~C~~-~~Ci~~~~~Cdg~~dC~dgsDe~--~C~~~~c~~~~~~C~~g~~~~~~~Ci~~~~~Cdg~ 163 (485)
.+....+. ...++|.. ..||..+|+||+..||.+++||. .|....|...++.|.++ .|++..++|+|.
T Consensus 209 ~~~~~~~~~~~~~~~~c~g~~~~i~~~~~~Dg~~dc~~~~de~~~~~~~~~~~~~e~~~~~~------~~~~~~~~~~g~ 282 (877)
T KOG1215|consen 209 EGRIYWTDDSRIEVTRCDGSSRCILISEVCDGPRDCVDGPDEGVMNCSDATCEAPEIECADG------DCSDRQKLCDGD 282 (877)
T ss_pred cCcccccCCcceeEEEecCCCcEEeehhccCCCcccccCCcCceeEeeccccCCcceeecCC------CCccceEEecCc
Confidence 87654444 46788986 49999999999999999999995 67777777778999888 999999999999
Q ss_pred CCCCCCccccCcccCCCCCCc---cccccCC--CCCCeeeCCCCceeeCCCCCCCCcchhhhccccccccccccceehee
Q psy4900 164 KDCEDNADEETVCCDCSLLNC---EFTCQAS--PTGGVCQCPEGQKVANDSRTCLLYMKNNLKQAVRSSTVSSHVKLVLL 238 (485)
Q Consensus 164 ~dC~d~sDe~~~C~~C~~~~C---~~~C~n~--~~~~~C~C~~G~~l~~~~~~C~d~~e~~~~~~~~~~~~~~~~~~~~~ 238 (485)
.||++++||.. |....+ .+.|... +....| .. + .....+.|....... ..+.+....
T Consensus 283 ~d~pdg~de~~----~~~~~~~~~~~d~~~~~i~~~~~~--~~-~-~~~~~~~~~~~~~~~----------~~~~~~~~v 344 (877)
T KOG1215|consen 283 LDCPDGLDEDY----CKKKLYWSMNVDGSGRRILLSKLC--HG-Y-WTDGLNECAERVLKC----------SHKCPDVSV 344 (877)
T ss_pred cCCCCcccccc----cccceeeeeecccCCceeeecccC--cc-c-cccccccchhhcccc----------cCCCCcccc
Confidence 99999999974 332100 1111110 000000 00 0 000111111111000 000000000
Q ss_pred heeeeeeecccCCCCCCCCCCCCCCCCCCCcccccc-ccccCCCCCCccccCCCCccccCCCCCCCC-CCCCCceEEcCC
Q psy4900 239 EVYVNVLKVRKLPTTAEPQSPNPCGSNNGGCEHMCI-ITRASGNALGYKCACDIGYRLSVNGNNCNQ-PTCAPGEFQCAS 316 (485)
Q Consensus 239 ~~~~~~~~~~~~~~~~e~~~~~~C~~~~g~C~~~C~-n~~~~~~~g~~~C~C~~Gy~l~~d~~~C~~-~~C~~~~~~c~~ 316 (485)
...........+...... ..+.|...++.|+|+|+ +.+ +.|+|.|..||.+..++ |.. ....+.++.+..
T Consensus 345 ~~~~~~~~~~~~~~~~~~-~~~~~~~~~g~Csq~C~~~~p-----~~~~c~c~~g~~~~~~~--c~~~~~~~~~l~~s~~ 416 (877)
T KOG1215|consen 345 GPRCDCMGAKVLPLGART-DSNPCESDNGGCSQLCVPNSP-----GTFKCACSPGYELRLDK--CEASDQPEAFLLFSNR 416 (877)
T ss_pred CCcccCCccceecccccc-cCCcccccCCccceeccCCCC-----CceeEecCCCcEeccCC--ceecCCCCcEEEEecC
Confidence 000000000000000000 11236657899999999 446 89999999999998876 644 224445555532
Q ss_pred CCcccCceeec-------CC--------------CCCCCCCcCCCCCCcc-ccCCcccCCCCceeccccccCC--CCccC
Q psy4900 317 GRCVPSTFKCD-------AE--------------NDCGDYSDETGCVNVT-CSLSQFACENGRCVPSTWKCDS--ENDCG 372 (485)
Q Consensus 317 g~ci~~~~~cd-------~~--------------~dc~d~sde~~c~~~~-~~~~~~~~~~~~~i~~~~~~~~--~~~y~ 372 (485)
........-+. .. ....|.+++..+.... ......++.+|.+++..+.+++ +++||
T Consensus 417 ~~ir~~~~~~~~~~~p~~~~~~~~~~d~d~~~~~i~~~d~~~~~i~~~~~~~~~~~~~~~~g~~~~~~lavD~~~~~~y~ 496 (877)
T KOG1215|consen 417 HDIRRISLDCSDVSRPLEGIKNAVALDFDVLNNRIYWADLSDEKICRASQDGSSECELCGDGLCIPEGLAVDWIGDNIYW 496 (877)
T ss_pred ccceecccCCCcceEEccCCccceEEEEEecCCEEEEEeccCCeEeeeccCCCccceEeccCccccCcEEEEeccCCcee
Confidence 21111111111 00 0011111111111000 0111124566777777777776 89999
Q ss_pred CCCC--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC---CCcEEEEEcCCCCcEEEEeCCCcccceE
Q psy4900 373 DGSD--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD---SNQILRTNMDGTMAMSIVSEAAYKASGV 447 (485)
Q Consensus 373 ~d~~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd---~~~I~r~~~dG~~~~~i~~~~~~~p~gl 447 (485)
+|.. .|.+..++ |+++.+|+..++..|+++++||..|+|||+| .++|+|+.|||..+.+++..++.||+||
T Consensus 497 tDe~~~~i~v~~~~----g~~~~vl~~~~l~~~r~~~v~p~~g~~~wtd~~~~~~i~ra~~dg~~~~~l~~~~~~~p~gl 572 (877)
T KOG1215|consen 497 TDEGNCLIEVADLD----GSSRKVLVSKDLDLPRSIAVDPEKGLMFWTDWGQPPRIERASLDGSERAVLVTNGILWPNGL 572 (877)
T ss_pred cccCCceeEEEEcc----CCceeEEEecCCCCccceeeccccCeeEEecCCCCchhhhhcCCCCCceEEEeCCccCCCcc
Confidence 9998 88889999 9999999999999999999999999999999 3589999999999999999999999999
Q ss_pred EEeCCCCeEEEEeCCCC-cEEEEEccCCCeE
Q psy4900 448 ALDINAKRLFWCDNLLD-YIETVDYEGKNRF 477 (485)
Q Consensus 448 avD~~~~~lYW~D~~~~-~I~~~~~dG~~r~ 477 (485)
++|...+++||+|.... .|+.++++|+.|+
T Consensus 573 t~d~~~~~~yw~d~~~~~~i~~~~~~g~~r~ 603 (877)
T KOG1215|consen 573 TIDYETDRLYWADAKLDYTIESANMDGQNRR 603 (877)
T ss_pred eEEeecceeEEEcccCCcceeeeecCCCceE
Confidence 99999999999999998 8999999999998
No 2
>KOG1214|consensus
Probab=99.96 E-value=6.7e-29 Score=253.54 Aligned_cols=113 Identities=21% Similarity=0.274 Sum_probs=109.9
Q ss_pred CCccCCCCC--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCCcEEEEeCCC
Q psy4900 368 ENDCGDGSD--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTMAMSIVSEAA 441 (485)
Q Consensus 368 ~~~y~~d~~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~~~~i~~~~~ 441 (485)
+++||+|+. +|+++.|+ |+++++|+..+|.+||+|++|+..|.||||| +++|+++.|||.+|++||.+++
T Consensus 1080 Rn~ywtDS~lD~IevA~Ld----G~~rkvLf~tdLVNPR~iv~D~~rgnLYwtDWnRenPkIets~mDG~NrRilin~Di 1155 (1289)
T KOG1214|consen 1080 RNMYWTDSVLDKIEVALLD----GSERKVLFYTDLVNPRAIVVDPIRGNLYWTDWNRENPKIETSSMDGENRRILINTDI 1155 (1289)
T ss_pred ceeeeeccccchhheeecC----CceeeEEEeecccCcceEEeecccCceeeccccccCCcceeeccCCccceEEeeccc
Confidence 899999987 99999999 9999999999999999999999999999999 6999999999999999999999
Q ss_pred cccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEEecCCC
Q psy4900 442 YKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLILRGSQ 484 (485)
Q Consensus 442 ~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~~~~~~ 484 (485)
..|+||++|+..+.|.|+|+++++++-..++|..|++|+.+++
T Consensus 1156 gLPNGLtfdpfs~~LCWvDAGt~rleC~~p~g~gRR~i~~~Lq 1198 (1289)
T KOG1214|consen 1156 GLPNGLTFDPFSKLLCWVDAGTKRLECTLPDGTGRRVIQNNLQ 1198 (1289)
T ss_pred CCCCCceeCcccceeeEEecCCcceeEecCCCCcchhhhhccc
Confidence 9999999999999999999999999999999999999998774
No 3
>KOG1214|consensus
Probab=99.94 E-value=2e-26 Score=235.53 Aligned_cols=117 Identities=19% Similarity=0.249 Sum_probs=110.2
Q ss_pred cccCCCCccCCCCC--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEe
Q psy4900 363 WKCDSENDCGDGSD--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVS 438 (485)
Q Consensus 363 ~~~~~~~~y~~d~~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~ 438 (485)
+.|..+++||+|-. .|.++.|. |...++++.++|..|.|||||+..++||||| ..+|+.|.|||+.|++|+.
T Consensus 1032 fDC~e~mvyWtDv~g~SI~rasL~----G~Ep~ti~n~~L~SPEGiAVDh~~Rn~ywtDS~lD~IevA~LdG~~rkvLf~ 1107 (1289)
T KOG1214|consen 1032 FDCRERMVYWTDVAGRSISRASLE----GAEPETIVNSGLISPEGIAVDHIRRNMYWTDSVLDKIEVALLDGSERKVLFY 1107 (1289)
T ss_pred cccccceEEEeecCCCcccccccc----CCCCceeecccCCCccceeeeeccceeeeeccccchhheeecCCceeeEEEe
Confidence 44555899999977 99999999 9999999999999999999999999999999 7899999999999999999
Q ss_pred CCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEecCC
Q psy4900 439 EAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLILRGS 483 (485)
Q Consensus 439 ~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~~~~ 483 (485)
++|.+|.+|+||++.++|||+||.. .+|++++|||++|++|+...
T Consensus 1108 tdLVNPR~iv~D~~rgnLYwtDWnRenPkIets~mDG~NrRilin~D 1154 (1289)
T KOG1214|consen 1108 TDLVNPRAIVVDPIRGNLYWTDWNRENPKIETSSMDGENRRILINTD 1154 (1289)
T ss_pred ecccCcceEEeecccCceeeccccccCCcceeeccCCccceEEeecc
Confidence 9999999999999999999999985 79999999999999999754
No 4
>KOG1215|consensus
Probab=99.87 E-value=4.6e-21 Score=215.27 Aligned_cols=410 Identities=23% Similarity=0.379 Sum_probs=266.1
Q ss_pred CCcccCCcCCCCCCCcCC-CCCCCCCCCCCCCCeecC--CCCceecCcceeCCCCCCCCCCcccCCCCCccC--CCcccC
Q psy4900 29 KDKRCDGYYDCRNRKDEE-GCPATTGLSCDLDQFRCA--NGQKCIDAKLKCNYHNDCGDNSDEEKCNFTACH--VGQFKC 103 (485)
Q Consensus 29 ~~~~Cd~~~dC~d~sdE~-~C~~~~~~~C~~~~f~C~--~g~~Ci~~~~~Cd~~~dC~d~sDe~~C~~~~C~--~~~f~C 103 (485)
..|.......+.+..++. +++. ..|....|.|. ++. |++..|.|++..+|.+|+||..|....+. ...|+|
T Consensus 113 ~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~c~~~~~~-Cip~~~~cd~~~~C~dg~de~~~~~~~~~~~~~~~~~ 188 (877)
T KOG1215|consen 113 HAYHPSSQPLAPDPCAESGNGPC---SHCCLDKFSCRTGSCK-CIPGDWLCDGEADCPDGSDELNCAVRRCEPRGASLDC 188 (877)
T ss_pred eEEecCCCCCCCCcccccCCCCC---ccccCCCCCCcCcccc-CCCCceeCCCCCccccchhhhcccccccCcccccccc
Confidence 567788888888888774 3332 45778899999 555 99999999999999999999998633332 244555
Q ss_pred CCCcccCccccCCCCCCCCCCCCCCCCCCCCcC---CCceecCCCCCCCCCcccCCCcccCCCCCCCCCcccc-CcccCC
Q psy4900 104 ANSLCIPVSYHCDGYRDCIDGSDETNCTSIACP---NNKFLCPMGAAGGKPKCIPKAQVCDGRKDCEDNADEE-TVCCDC 179 (485)
Q Consensus 104 ~~~~Ci~~~~~Cdg~~dC~dgsDe~~C~~~~c~---~~~~~C~~g~~~~~~~Ci~~~~~Cdg~~dC~d~sDe~-~~C~~C 179 (485)
|...++||+..+|.++.|+..+...-+. ...++|..+ .+||...+.|||..||.+++||. .. |
T Consensus 189 -----~~~~~~~d~~~~~~~~~d~~~~~~~~~~~~~~~~~~c~g~-----~~~i~~~~~~Dg~~dc~~~~de~~~~---~ 255 (877)
T KOG1215|consen 189 -----IVAIKVCDIQHDCADDYDESEGRIYWTDDSRIEVTRCDGS-----SRCILISEVCDGPRDCVDGPDEGVMN---C 255 (877)
T ss_pred -----ceeeeecCcccccccccccccCcccccCCcceeEEEecCC-----CcEEeehhccCCCcccccCCcCceeE---e
Confidence 8899999999999999999887543222 247888774 49999999999999999999984 11 2
Q ss_pred CCCCc---cccccCCCCCCeeeCCCCceeeCCCCCCCCcchhhhccccccc---cccccceeheeheeeeeeecccCCCC
Q psy4900 180 SLLNC---EFTCQASPTGGVCQCPEGQKVANDSRTCLLYMKNNLKQAVRSS---TVSSHVKLVLLEVYVNVLKVRKLPTT 253 (485)
Q Consensus 180 ~~~~C---~~~C~n~~~~~~C~C~~G~~l~~~~~~C~d~~e~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~ 253 (485)
....| ++.|.+. .|.+...+.++..+|.+..++..+...... .-....+ ++ ..++-..
T Consensus 256 ~~~~~~~~e~~~~~~------~~~~~~~~~~g~~d~pdg~de~~~~~~~~~~~~~d~~~~~-i~---------~~~~~~~ 319 (877)
T KOG1215|consen 256 SDATCEAPEIECADG------DCSDRQKLCDGDLDCPDGLDEDYCKKKLYWSMNVDGSGRR-IL---------LSKLCHG 319 (877)
T ss_pred eccccCCcceeecCC------CCccceEEecCccCCCCcccccccccceeeeeecccCCce-ee---------ecccCcc
Confidence 33223 3455433 234445556777778777664432211000 0000000 00 0000000
Q ss_pred CCCCCCCCCCCCCCCccccccccccCCCCCCccccCCCCccccCCCCCCCCCCCCCceEEcCCCCc---ccCceeecCCC
Q psy4900 254 AEPQSPNPCGSNNGGCEHMCIITRASGNALGYKCACDIGYRLSVNGNNCNQPTCAPGEFQCASGRC---VPSTFKCDAEN 330 (485)
Q Consensus 254 ~e~~~~~~C~~~~g~C~~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C~~~~C~~~~~~c~~g~c---i~~~~~cd~~~ 330 (485)
..-...+.|......+.+.+..+. ....|.|..++.+.....+.. ..|....-.|. ..| .+..+.|....
T Consensus 320 ~~~~~~~~~~~~~~~~~~~~~~~~-----v~~~~~~~~~~~~~~~~~~~~-~~~~~~~g~Cs-q~C~~~~p~~~~c~c~~ 392 (877)
T KOG1215|consen 320 YWTDGLNECAERVLKCSHKCPDVS-----VGPRCDCMGAKVLPLGARTDS-NPCESDNGGCS-QLCVPNSPGTFKCACSP 392 (877)
T ss_pred ccccccccchhhcccccCCCCccc-----cCCcccCCccceecccccccC-CcccccCCccc-eeccCCCCCceeEecCC
Confidence 000011223333455667777776 778889999888765544411 11211111110 012 25566665544
Q ss_pred CCCCCCcC-------------CCCCCcc-----ccCCcccCCC-CceeccccccCCCCccCCCCC--CccccccCccccc
Q psy4900 331 DCGDYSDE-------------TGCVNVT-----CSLSQFACEN-GRCVPSTWKCDSENDCGDGSD--EGDFCSEKTCAYF 389 (485)
Q Consensus 331 dc~d~sde-------------~~c~~~~-----~~~~~~~~~~-~~~i~~~~~~~~~~~y~~d~~--~I~~~~~~~c~~g 389 (485)
.....++. ..+.++. .....+.... ...++.........+||++.. .|..+.++ +
T Consensus 393 g~~~~~~~c~~~~~~~~~l~~s~~~~ir~~~~~~~~~~~p~~~~~~~~~~d~d~~~~~i~~~d~~~~~i~~~~~~----~ 468 (877)
T KOG1215|consen 393 GYELRLDKCEASDQPEAFLLFSNRHDIRRISLDCSDVSRPLEGIKNAVALDFDVLNNRIYWADLSDEKICRASQD----G 468 (877)
T ss_pred CcEeccCCceecCCCCcEEEEecCccceecccCCCcceEEccCCccceEEEEEecCCEEEEEeccCCeEeeeccC----C
Confidence 43333322 1110000 0000011111 011111112223589999877 78888888 8
Q ss_pred ceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCC-CCcE
Q psy4900 390 QFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNL-LDYI 466 (485)
Q Consensus 390 ~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~-~~~I 466 (485)
.....++..++-.|.+||+|+..+.+||+| ...|+.+.|+|+.+++|+...+..|.+++||+..+.+||+|+. ..+|
T Consensus 469 ~~~~~~~~~g~~~~~~lavD~~~~~~y~tDe~~~~i~v~~~~g~~~~vl~~~~l~~~r~~~v~p~~g~~~wtd~~~~~~i 548 (877)
T KOG1215|consen 469 SSECELCGDGLCIPEGLAVDWIGDNIYWTDEGNCLIEVADLDGSSRKVLVSKDLDLPRSIAVDPEKGLMFWTDWGQPPRI 548 (877)
T ss_pred CccceEeccCccccCcEEEEeccCCceecccCCceeEEEEccCCceeEEEecCCCCccceeeccccCeeEEecCCCCchh
Confidence 777778888999999999999999999999 6789999999999999999999999999999999999999998 5689
Q ss_pred EEEEccCCCeEEEecC
Q psy4900 467 ETVDYEGKNRFLILRG 482 (485)
Q Consensus 467 ~~~~~dG~~r~~~~~~ 482 (485)
+++.+||+.|.+++..
T Consensus 549 ~ra~~dg~~~~~l~~~ 564 (877)
T KOG1215|consen 549 ERASLDGSERAVLVTN 564 (877)
T ss_pred hhhcCCCCCceEEEeC
Confidence 9999999999998864
No 5
>PF00058 Ldl_recept_b: Low-density lipoprotein receptor repeat class B; InterPro: IPR000033 The low-density lipoprotein receptor (LDLR) regulates cholesterol homeostasis in mammalian cells; its ligand, LDL, is the major cholesterol-carrying lipoprotein of plasma. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class B (YWTD) repeat, the structure of which has been solved. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as the innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins, including the vitellogenin receptor from Drosophila melanogaster, the low-density lipoprotein (LDL) receptor, preproepidermal growth factor, and nidogen (entactin).; PDB: 3S2K_A 3S8Z_A 3S8V_B 4A0P_A 3SOB_B 3S94_B 4DG6_A 3SOV_A 3SOQ_A 1NPE_A ....
Probab=99.24 E-value=1.6e-11 Score=83.05 Aligned_cols=39 Identities=28% Similarity=0.566 Sum_probs=37.0
Q ss_pred CeEEEeC--CC-cEEEEEcCCCCcEEEEeCCCcccceEEEeC
Q psy4900 413 GLMFVAD--SN-QILRTNMDGTMAMSIVSEAAYKASGVALDI 451 (485)
Q Consensus 413 ~~lywtd--~~-~I~r~~~dG~~~~~i~~~~~~~p~glavD~ 451 (485)
++||||| .. +|++++|||+++++|+..++.+|.||||||
T Consensus 1 ~~iYWtD~~~~~~I~~a~~dGs~~~~vi~~~l~~P~giaVD~ 42 (42)
T PF00058_consen 1 GKIYWTDWSQDPSIERANLDGSNRRTVISDDLQHPEGIAVDW 42 (42)
T ss_dssp TEEEEEETTTTEEEEEEETTSTSEEEEEESSTSSEEEEEEET
T ss_pred CEEEEEECCCCcEEEEEECCCCCeEEEEECCCCCcCEEEECC
Confidence 6899999 56 999999999999999999999999999997
No 6
>smart00135 LY Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Probab=98.92 E-value=2.4e-09 Score=72.86 Aligned_cols=42 Identities=38% Similarity=0.637 Sum_probs=38.9
Q ss_pred EEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCe
Q psy4900 435 SIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNR 476 (485)
Q Consensus 435 ~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r 476 (485)
+++..++..|+|||+|+.+++|||+|+..+.|++++++|+.+
T Consensus 2 ~~~~~~~~~~~~la~d~~~~~lYw~D~~~~~I~~~~~~g~~~ 43 (43)
T smart00135 2 TLLSEGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLDGTNR 43 (43)
T ss_pred EEEECCCCCcCEEEEeecCCEEEEEeCCCCEEEEEeCCCCCC
Confidence 566778999999999999999999999999999999999864
No 7
>PF00057 Ldl_recept_a: Low-density lipoprotein receptor domain class A This prints entry is specific to LDL receptor; InterPro: IPR002172 The low-density lipoprotein receptor (LDLR) regulates cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL, the major cholesterol-carrying lipoprotein of plasma, and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein containing five regions. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class A (cysteine-rich) repeat, which contains 6 disulphide-bound cysteines and a highly conserved cluster of negatively charged amino acids, of which many are clustered on one face of the module. In LDL receptors, the class A domains form the binding site for LDL and calcium. The acidic residues between the fourth and sixth cysteines are important for high-affinity binding of positively charged sequences in LDLR's ligands. The repeat consists of a beta-hairpin structure followed by a series of beta turns. In the absence of calcium, LDL-A domains are unstructured; the bound calcium ion imparts structural integrity. Following these repeats is a 350 residue domain that resembles part of the epidermal growth factor (EGF) precursor. Numerous familial hypercholesterolemia mutations of the LDL receptor alter the calcium-coordinating residue of LDL-A domains or other crucial scaffolding residues.; GO: 0005515 protein binding; PDB: 2I1P_A 3OJY_A 4E0S_B 3T5O_A 4A5W_B 1JRF_A 1K7B_A 1V9U_5 3DPR_E 2KNY_A ....
Probab=98.89 E-value=1e-09 Score=71.62 Aligned_cols=37 Identities=49% Similarity=1.170 Sum_probs=34.4
Q ss_pred CCCCCCcEEecCCCCceecCCcccCCcCCCCCCCcCCCC
Q psy4900 10 RKCSPGDFECDPPHGICIPKDKRCDGYYDCRNRKDEEGC 48 (485)
Q Consensus 10 ~~C~~~~f~C~c~~g~ci~~~~~Cd~~~dC~d~sdE~~C 48 (485)
++|.+++|+| .++.||+..|+|||+.||.|++||.+|
T Consensus 1 ~~C~~~~f~C--~~~~CI~~~~~CDg~~DC~dgsDE~~C 37 (37)
T PF00057_consen 1 PTCPPGEFRC--GNGQCIPKSWVCDGIPDCPDGSDEQNC 37 (37)
T ss_dssp SSSSTTEEEE--TTSSEEEGGGTTSSSCSSSSSTTTSSH
T ss_pred CcCcCCeeEc--CCCCEEChHHcCCCCCCCCCCcccccC
Confidence 4689999999 888999999999999999999999876
No 8
>PF00057 Ldl_recept_a: Low-density lipoprotein receptor domain class A This prints entry is specific to LDL receptor; InterPro: IPR002172 The low-density lipoprotein receptor (LDLR) regulates cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL, the major cholesterol-carrying lipoprotein of plasma, and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein containing five regions. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class A (cysteine-rich) repeat, which contains 6 disulphide-bound cysteines and a highly conserved cluster of negatively charged amino acids, of which many are clustered on one face of the module. In LDL receptors, the class A domains form the binding site for LDL and calcium. The acidic residues between the fourth and sixth cysteines are important for high-affinity binding of positively charged sequences in LDLR's ligands. The repeat consists of a beta-hairpin structure followed by a series of beta turns. In the absence of calcium, LDL-A domains are unstructured; the bound calcium ion imparts structural integrity. Following these repeats is a 350 residue domain that resembles part of the epidermal growth factor (EGF) precursor. Numerous familial hypercholesterolemia mutations of the LDL receptor alter the calcium-coordinating residue of LDL-A domains or other crucial scaffolding residues.; GO: 0005515 protein binding; PDB: 2I1P_A 3OJY_A 4E0S_B 3T5O_A 4A5W_B 1JRF_A 1K7B_A 1V9U_5 3DPR_E 2KNY_A ....
Probab=98.86 E-value=9.1e-10 Score=71.80 Aligned_cols=36 Identities=58% Similarity=1.289 Sum_probs=23.7
Q ss_pred ccCCCcccCCCCcccCccccCCCCCCCCCCCCCCCC
Q psy4900 95 ACHVGQFKCANSLCIPVSYHCDGYRDCIDGSDETNC 130 (485)
Q Consensus 95 ~C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDe~~C 130 (485)
.|.+++|+|.++.||+..|+|||+.||.||+||.+|
T Consensus 2 ~C~~~~f~C~~~~CI~~~~~CDg~~DC~dgsDE~~C 37 (37)
T PF00057_consen 2 TCPPGEFRCGNGQCIPKSWVCDGIPDCPDGSDEQNC 37 (37)
T ss_dssp SSSTTEEEETTSSEEEGGGTTSSSCSSSSSTTTSSH
T ss_pred cCcCCeeEcCCCCEEChHHcCCCCCCCCCCcccccC
Confidence 455666666666666666666666666666666543
No 9
>cd00112 LDLa Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure
Probab=98.83 E-value=1.8e-09 Score=69.73 Aligned_cols=35 Identities=60% Similarity=1.248 Sum_probs=23.2
Q ss_pred cCCCcccCCCCcccCccccCCCCCCCCCCCCCCCC
Q psy4900 96 CHVGQFKCANSLCIPVSYHCDGYRDCIDGSDETNC 130 (485)
Q Consensus 96 C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDe~~C 130 (485)
|.+++|+|.++.||+..++|||+.||.|||||.+|
T Consensus 1 C~~~~f~C~~~~Ci~~~~~CDg~~DC~dgsDE~~C 35 (35)
T cd00112 1 CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEENC 35 (35)
T ss_pred CCCCeEEcCCCCeeCHHHcCCCccCCCCCcccccC
Confidence 34456677666777777777777777777776654
No 10
>cd00112 LDLa Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure
Probab=98.80 E-value=2.9e-09 Score=68.70 Aligned_cols=35 Identities=49% Similarity=1.170 Sum_probs=32.4
Q ss_pred CCCCcEEecCCCCceecCCcccCCcCCCCCCCcCCCC
Q psy4900 12 CSPGDFECDPPHGICIPKDKRCDGYYDCRNRKDEEGC 48 (485)
Q Consensus 12 C~~~~f~C~c~~g~ci~~~~~Cd~~~dC~d~sdE~~C 48 (485)
|.+++|+| .+|.||+..++|||+.||.|+|||.+|
T Consensus 1 C~~~~f~C--~~~~Ci~~~~~CDg~~DC~dgsDE~~C 35 (35)
T cd00112 1 CPPNEFRC--ANGRCIPSSWVCDGEDDCGDGSDEENC 35 (35)
T ss_pred CCCCeEEc--CCCCeeCHHHcCCCccCCCCCcccccC
Confidence 56789999 779999999999999999999999876
No 11
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=98.58 E-value=1.3e-08 Score=65.56 Aligned_cols=36 Identities=47% Similarity=1.239 Sum_probs=30.3
Q ss_pred CCCCCCCccccccccccCCCCCCccccCCCCccccCCCCCC
Q psy4900 262 CGSNNGGCEHMCIITRASGNALGYKCACDIGYRLSVNGNNC 302 (485)
Q Consensus 262 C~~~~g~C~~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C 302 (485)
|...+|+|+|+|++++ ++|+|.|+.||.|.+|+++|
T Consensus 1 C~~~NGgC~h~C~~~~-----g~~~C~C~~Gy~L~~D~~tC 36 (36)
T PF14670_consen 1 CSVNNGGCSHICVNTP-----GSYRCSCPPGYKLAEDGRTC 36 (36)
T ss_dssp CTTGGGGSSSEEEEET-----TSEEEE-STTEEE-TTSSSE
T ss_pred CCCCCCCcCCCCccCC-----CceEeECCCCCEECcCCCCC
Confidence 3346889999999999 99999999999999999876
No 12
>smart00135 LY Low-density lipoprotein-receptor YWTD domain. Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Probab=98.57 E-value=9.1e-08 Score=64.93 Aligned_cols=40 Identities=33% Similarity=0.476 Sum_probs=35.8
Q ss_pred EEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCc
Q psy4900 394 IVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMA 433 (485)
Q Consensus 394 ~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~ 433 (485)
+++..++..|.|||+||..++|||+| ...|+|++|+|+++
T Consensus 2 ~~~~~~~~~~~~la~d~~~~~lYw~D~~~~~I~~~~~~g~~~ 43 (43)
T smart00135 2 TLLSEGLGHPNGLAVDWIEGRLYWTDWGLDVIEVANLDGTNR 43 (43)
T ss_pred EEEECCCCCcCEEEEeecCCEEEEEeCCCCEEEEEeCCCCCC
Confidence 45567899999999999999999999 78999999999864
No 13
>smart00192 LDLa Low-density lipoprotein receptor domain class A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia.
Probab=98.49 E-value=8.2e-08 Score=61.11 Aligned_cols=32 Identities=59% Similarity=1.304 Sum_probs=22.0
Q ss_pred cCCCcccCCCCcccCccccCCCCCCCCCCCCC
Q psy4900 96 CHVGQFKCANSLCIPVSYHCDGYRDCIDGSDE 127 (485)
Q Consensus 96 C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDe 127 (485)
|...+|+|.++.||+..++|||++||.|++||
T Consensus 2 C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDE 33 (33)
T smart00192 2 CPPGEFQCDNGRCIPLSWVCDGVDDCSDGSDE 33 (33)
T ss_pred CCCCeEECCCCCEECchhhCCCcCcCcCCCCC
Confidence 44456777777777777777777777777765
No 14
>smart00192 LDLa Low-density lipoprotein receptor domain class A. Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia.
Probab=98.44 E-value=1.5e-07 Score=59.87 Aligned_cols=32 Identities=50% Similarity=1.188 Sum_probs=29.8
Q ss_pred CCCCcEEecCCCCceecCCcccCCcCCCCCCCcC
Q psy4900 12 CSPGDFECDPPHGICIPKDKRCDGYYDCRNRKDE 45 (485)
Q Consensus 12 C~~~~f~C~c~~g~ci~~~~~Cd~~~dC~d~sdE 45 (485)
|..++|+| .++.||+..++|||++||.|++||
T Consensus 2 C~~~~f~C--~~~~Ci~~~~~Cdg~~dC~dgsDE 33 (33)
T smart00192 2 CPPGEFQC--DNGRCIPLSWVCDGVDDCSDGSDE 33 (33)
T ss_pred CCCCeEEC--CCCCEECchhhCCCcCcCcCCCCC
Confidence 66779999 789999999999999999999997
No 15
>PF00058 Ldl_recept_b: Low-density lipoprotein receptor repeat class B; InterPro: IPR000033 The low-density lipoprotein receptor (LDLR) regulates cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL, the major cholesterol-carrying lipoprotein of plasma, and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein containing five regions. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation. Similar domains have been found in other extracellular and membrane proteins. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif and are predicted to form a beta-propeller structure. This region is critical for ligand release and recycling of the receptor. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits.
LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins. This entry represents the LDLR class B (YWTD) repeat, the structure of which has been solved. The six YWTD repeats together fold into a six-bladed beta-propeller. Each blade of the propeller consists of four antiparallel beta-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins that include vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor, preproepidermal growth factor, and nidogen (entactin).; PDB: 3S2K_A 3S8Z_A 3S8V_B 4A0P_A 3SOB_B 3S94_B 4DG6_A 3SOV_A 3SOQ_A 1NPE_A ....
Probab=98.35 E-value=8.9e-07 Score=59.74 Aligned_cols=30 Identities=27% Similarity=0.587 Sum_probs=27.7
Q ss_pred CeEEEEeCCCC-cEEEEEccCCCeEEEecCC
Q psy4900 454 KRLFWCDNLLD-YIETVDYEGKNRFLILRGS 483 (485)
Q Consensus 454 ~~lYW~D~~~~-~I~~~~~dG~~r~~~~~~~ 483 (485)
++|||+|+..+ +|+++++||++|++|++..
T Consensus 1 ~~iYWtD~~~~~~I~~a~~dGs~~~~vi~~~ 31 (42)
T PF00058_consen 1 GKIYWTDWSQDPSIERANLDGSNRRTVISDD 31 (42)
T ss_dssp TEEEEEETTTTEEEEEEETTSTSEEEEEESS
T ss_pred CEEEEEECCCCcEEEEEECCCCCeEEEEECC
Confidence 58999999999 9999999999999998754
No 16
>PF08450 SGL: SMP-30/Gluconolactonase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE.; PDB: 2GHS_A 2DG0_L 2DG1_D 2DSO_D 3E5Z_A 2IAT_A 2IAV_A 2GVV_A 3HLI_A 2GVU_A ....
Probab=98.32 E-value=2.1e-06 Score=82.29 Aligned_cols=106 Identities=22% Similarity=0.175 Sum_probs=78.6
Q ss_pred CCccCCCCC----------CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCC---
Q psy4900 368 ENDCGDGSD----------EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTM--- 432 (485)
Q Consensus 368 ~~~y~~d~~----------~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~--- 432 (485)
+++|+++.. .|.+.+.+ ++ ..++...+..|.||+++|..+.||++| .++|++..++...
T Consensus 97 G~ly~t~~~~~~~~~~~~g~v~~~~~~----~~--~~~~~~~~~~pNGi~~s~dg~~lyv~ds~~~~i~~~~~~~~~~~~ 170 (246)
T PF08450_consen 97 GNLYVTDSGGGGASGIDPGSVYRIDPD----GK--VTVVADGLGFPNGIAFSPDGKTLYVADSFNGRIWRFDLDADGGEL 170 (246)
T ss_dssp S-EEEEEECCBCTTCGGSEEEEEEETT----SE--EEEEEEEESSEEEEEEETTSSEEEEEETTTTEEEEEEEETTTCCE
T ss_pred CCEEEEecCCCccccccccceEEECCC----Ce--EEEEecCcccccceEECCcchheeecccccceeEEEeccccccce
Confidence 568887665 13444444 33 334446799999999999999999999 7899999998433
Q ss_pred --cEEEEeCCC--cccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEEe
Q psy4900 433 --AMSIVSEAA--YKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 --~~~i~~~~~--~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~~ 480 (485)
+++++.... ..|.||+||. .++||.+++..++|.+.+.+|+...+|.
T Consensus 171 ~~~~~~~~~~~~~g~pDG~~vD~-~G~l~va~~~~~~I~~~~p~G~~~~~i~ 221 (246)
T PF08450_consen 171 SNRRVFIDFPGGPGYPDGLAVDS-DGNLWVADWGGGRIVVFDPDGKLLREIE 221 (246)
T ss_dssp EEEEEEEE-SSSSCEEEEEEEBT-TS-EEEEEETTTEEEEEETTSCEEEEEE
T ss_pred eeeeeEEEcCCCCcCCCcceEcC-CCCEEEEEcCCCEEEEECCCccEEEEEc
Confidence 355544332 3699999995 8899999999999999999998766654
No 17
>PLN02919 haloacid dehalogenase-like hydrolase family protein
Probab=98.23 E-value=7.4e-06 Score=93.66 Aligned_cols=107 Identities=10% Similarity=0.089 Sum_probs=84.4
Q ss_pred CCCccCCCCC--CccccccCcccccceEEEEec-------------CCCCCCccEEecCCCCeEEEeC--CCcEEEEEcC
Q psy4900 367 SENDCGDGSD--EGDFCSEKTCAYFQFHAIVLG-------------SNLTNPTDLALDPTSGLMFVAD--SNQILRTNMD 429 (485)
Q Consensus 367 ~~~~y~~d~~--~I~~~~~~~c~~g~~~~~l~~-------------~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~d 429 (485)
++++|++|.. +|.+.+++ |.....+.. ..+..|.||++|+..+.||++| .++|.+.++.
T Consensus 579 ~g~lyVaDs~n~rI~v~d~~----G~~i~~ig~~g~~G~~dG~~~~a~f~~P~GIavd~~gn~LYVaDt~n~~Ir~id~~ 654 (1057)
T PLN02919 579 NNRLFISDSNHNRIVVTDLD----GNFIVQIGSTGEEGLRDGSFEDATFNRPQGLAYNAKKNLLYVADTENHALREIDFV 654 (1057)
T ss_pred CCeEEEEECCCCeEEEEeCC----CCEEEEEccCCCcCCCCCchhccccCCCcEEEEeCCCCEEEEEeCCCceEEEEecC
Confidence 4779999887 88888888 876555432 1267899999999988999999 6889999887
Q ss_pred CCCcEEEEeC----------------CCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeE
Q psy4900 430 GTMAMSIVSE----------------AAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRF 477 (485)
Q Consensus 430 G~~~~~i~~~----------------~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~ 477 (485)
+...++|... .+..|.+|+||...++||++|+..++|.+.+..+...+
T Consensus 655 ~~~V~tlag~G~~g~~~~gg~~~~~~~ln~P~gVa~dp~~g~LyVad~~~~~I~v~d~~~g~v~ 718 (1057)
T PLN02919 655 NETVRTLAGNGTKGSDYQGGKKGTSQVLNSPWDVCFEPVNEKVYIAMAGQHQIWEYNISDGVTR 718 (1057)
T ss_pred CCEEEEEeccCcccCCCCCChhhhHhhcCCCeEEEEecCCCeEEEEECCCCeEEEEECCCCeEE
Confidence 7655555321 15689999999989999999999999999887655443
No 18
>PF08450 SGL: SMP-30/Gluconolactonase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE.; PDB: 2GHS_A 2DG0_L 2DG1_D 2DSO_D 3E5Z_A 2IAT_A 2IAV_A 2GVV_A 3HLI_A 2GVU_A ....
Probab=98.20 E-value=4.9e-06 Score=79.67 Aligned_cols=101 Identities=20% Similarity=0.321 Sum_probs=76.2
Q ss_pred CCCCccCCCCCCccccccCcccccceEEEEecC-----CCCCCccEEecCCCCeEEEeC--C--------CcEEEEEcCC
Q psy4900 366 DSENDCGDGSDEGDFCSEKTCAYFQFHAIVLGS-----NLTNPTDLALDPTSGLMFVAD--S--------NQILRTNMDG 430 (485)
Q Consensus 366 ~~~~~y~~d~~~I~~~~~~~c~~g~~~~~l~~~-----~~~~p~~iavD~~~~~lywtd--~--------~~I~r~~~dG 430 (485)
..+++|.++...+.+.++. .....+++.. .+..|..+++|+. |.||+++ . .+|+|...+|
T Consensus 50 ~~g~l~v~~~~~~~~~d~~----~g~~~~~~~~~~~~~~~~~~ND~~vd~~-G~ly~t~~~~~~~~~~~~g~v~~~~~~~ 124 (246)
T PF08450_consen 50 PDGRLYVADSGGIAVVDPD----TGKVTVLADLPDGGVPFNRPNDVAVDPD-GNLYVTDSGGGGASGIDPGSVYRIDPDG 124 (246)
T ss_dssp TTSEEEEEETTCEEEEETT----TTEEEEEEEEETTCSCTEEEEEEEE-TT-S-EEEEEECCBCTTCGGSEEEEEEETTS
T ss_pred cCCEEEEEEcCceEEEecC----CCcEEEEeeccCCCcccCCCceEEEcCC-CCEEEEecCCCccccccccceEEECCCC
Confidence 3477888877766666655 4444444432 5778899999986 7899999 1 4599999994
Q ss_pred CCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccC
Q psy4900 431 TMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEG 473 (485)
Q Consensus 431 ~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG 473 (485)
+ ..++...+..|+||++++..+.||++|+..++|.+.+++.
T Consensus 125 ~--~~~~~~~~~~pNGi~~s~dg~~lyv~ds~~~~i~~~~~~~ 165 (246)
T PF08450_consen 125 K--VTVVADGLGFPNGIAFSPDGKTLYVADSFNGRIWRFDLDA 165 (246)
T ss_dssp E--EEEEEEEESSEEEEEEETTSSEEEEEETTTTEEEEEEEET
T ss_pred e--EEEEecCcccccceEECCcchheeecccccceeEEEeccc
Confidence 4 3444467999999999999999999999999999999984
No 19
>PF12999 PRKCSH-like: Glucosidase II beta subunit-like
Probab=98.14 E-value=2.4e-06 Score=75.42 Aligned_cols=69 Identities=42% Similarity=0.667 Sum_probs=59.4
Q ss_pred CCCCeecCCCCce-ecCcceeCCCCCCCCCCcccCCCCCccCCCcccCCCC----cccCccccCCCCCC---CCCCCCC
Q psy4900 57 DLDQFRCANGQKC-IDAKLKCNYHNDCGDNSDEEKCNFTACHVGQFKCANS----LCIPVSYHCDGYRD---CIDGSDE 127 (485)
Q Consensus 57 ~~~~f~C~~g~~C-i~~~~~Cd~~~dC~d~sDe~~C~~~~C~~~~f~C~~~----~Ci~~~~~Cdg~~d---C~dgsDe 127 (485)
..+.|.|-+|.+- |+.+.+.|++=||.|||||.+ -..|..+.|.|.|. .-||.+++=||+-| |=|||||
T Consensus 34 ~~~~f~Cl~~~~~~I~~~~iNDdyCDC~DGSDEPG--TsAC~~~~FyC~N~g~~p~~i~~s~VnDGICDy~~CCDGSDE 110 (176)
T PF12999_consen 34 ENGKFTCLDGSKIVIPFSQINDDYCDCPDGSDEPG--TSACSNGKFYCENKGHIPRYIPSSRVNDGICDYDICCDGSDE 110 (176)
T ss_pred CCCceEecCCCCceecHHHccCcceeCCCCCCccc--cccCcCceEeeccCCCCCceeehhhhcCCcCcccccCCCCCC
Confidence 3467999988766 899999999999999999964 23688889999874 56888999999999 9999999
No 20
>PF03088 Str_synth: Strictosidine synthase; InterPro: IPR018119 This entry represents a conserved region found in strictosidine synthase (4.3.3.2 from EC), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom.; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2FPB_A 2V91_B 2FP8_A 3V1S_B 2FPC_A 2VAQ_A 2FP9_B.
Probab=98.06 E-value=1.3e-05 Score=63.23 Aligned_cols=68 Identities=22% Similarity=0.323 Sum_probs=55.7
Q ss_pred cEEecCCCCeEEEeCC-------------------CcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCc
Q psy4900 405 DLALDPTSGLMFVADS-------------------NQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDY 465 (485)
Q Consensus 405 ~iavD~~~~~lywtd~-------------------~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~ 465 (485)
+|+|+...|.|||||. ++|.+.++.....++|+ .+|..|+||||......|+.+.....+
T Consensus 2 dldv~~~~g~vYfTdsS~~~~~~~~~~~~le~~~~GRll~ydp~t~~~~vl~-~~L~fpNGVals~d~~~vlv~Et~~~R 80 (89)
T PF03088_consen 2 DLDVDQDTGTVYFTDSSSRYDRRDWVYDLLEGRPTGRLLRYDPSTKETTVLL-DGLYFPNGVALSPDESFVLVAETGRYR 80 (89)
T ss_dssp EEEE-TTT--EEEEES-SS--TTGHHHHHHHT---EEEEEEETTTTEEEEEE-EEESSEEEEEE-TTSSEEEEEEGGGTE
T ss_pred ceeEecCCCEEEEEeCccccCccceeeeeecCCCCcCEEEEECCCCeEEEeh-hCCCccCeEEEcCCCCEEEEEeccCce
Confidence 6899999999999991 56999988877655555 689999999999999999999999999
Q ss_pred EEEEEccC
Q psy4900 466 IETVDYEG 473 (485)
Q Consensus 466 I~~~~~dG 473 (485)
|.+.-+.|
T Consensus 81 i~rywl~G 88 (89)
T PF03088_consen 81 ILRYWLKG 88 (89)
T ss_dssp EEEEESSS
T ss_pred EEEEEEeC
Confidence 99998887
No 21
>PF12999 PRKCSH-like: Glucosidase II beta subunit-like
Probab=97.98 E-value=1.1e-05 Score=71.33 Aligned_cols=71 Identities=34% Similarity=0.589 Sum_probs=57.1
Q ss_pred CCCcEEecCCCCceecCCcccCCcCCCCCCCcCCCCCCCCCCCCCCCCeecCCCC---ceecCcceeCCCCC---CCCCC
Q psy4900 13 SPGDFECDPPHGICIPKDKRCDGYYDCRNRKDEEGCPATTGLSCDLDQFRCANGQ---KCIDAKLKCNYHND---CGDNS 86 (485)
Q Consensus 13 ~~~~f~C~c~~g~ci~~~~~Cd~~~dC~d~sdE~~C~~~~~~~C~~~~f~C~~g~---~Ci~~~~~Cd~~~d---C~d~s 86 (485)
..+.|+|.-.+..-|+.+.+.|+.-||.|||||.+=. -|+.+.|.|.|.. .-|+.+++=||+=| |=|||
T Consensus 34 ~~~~f~Cl~~~~~~I~~~~iNDdyCDC~DGSDEPGTs-----AC~~~~FyC~N~g~~p~~i~~s~VnDGICDy~~CCDGS 108 (176)
T PF12999_consen 34 ENGKFTCLDGSKIVIPFSQINDDYCDCPDGSDEPGTS-----ACSNGKFYCENKGHIPRYIPSSRVNDGICDYDICCDGS 108 (176)
T ss_pred CCCceEecCCCCceecHHHccCcceeCCCCCCccccc-----cCcCceEeeccCCCCCceeehhhhcCCcCcccccCCCC
Confidence 5678999522223389999999999999999997543 3777899999752 46888899999999 99999
Q ss_pred cc
Q psy4900 87 DE 88 (485)
Q Consensus 87 De 88 (485)
||
T Consensus 109 DE 110 (176)
T PF12999_consen 109 DE 110 (176)
T ss_pred CC
Confidence 99
No 22
>PLN02919 haloacid dehalogenase-like hydrolase family protein
Probab=97.96 E-value=3.6e-05 Score=88.16 Aligned_cols=86 Identities=24% Similarity=0.400 Sum_probs=70.7
Q ss_pred EecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeC-------------CCcccceEEEeCCCCeEEEE
Q psy4900 395 VLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSE-------------AAYKASGVALDINAKRLFWC 459 (485)
Q Consensus 395 l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~-------------~~~~p~glavD~~~~~lYW~ 459 (485)
++...+..|.+||+|+..+.||++| .++|.+.+++|.....+... .+..|.|||||...+.|||+
T Consensus 562 ~~~s~l~~P~gvavd~~~g~lyVaDs~n~rI~v~d~~G~~i~~ig~~g~~G~~dG~~~~a~f~~P~GIavd~~gn~LYVa 641 (1057)
T PLN02919 562 LLTSPLKFPGKLAIDLLNNRLFISDSNHNRIVVTDLDGNFIVQIGSTGEEGLRDGSFEDATFNRPQGLAYNAKKNLLYVA 641 (1057)
T ss_pred cccccCCCCceEEEECCCCeEEEEECCCCeEEEEeCCCCEEEEEccCCCcCCCCCchhccccCCCcEEEEeCCCCEEEEE
Confidence 4456789999999999999999999 78999999998754444321 25679999999888899999
Q ss_pred eCCCCcEEEEEccCCCeEEEe
Q psy4900 460 DNLLDYIETVDYEGKNRFLIL 480 (485)
Q Consensus 460 D~~~~~I~~~~~dG~~r~~~~ 480 (485)
|...++|.+.++.+...++|.
T Consensus 642 Dt~n~~Ir~id~~~~~V~tla 662 (1057)
T PLN02919 642 DTENHALREIDFVNETVRTLA 662 (1057)
T ss_pred eCCCceEEEEecCCCEEEEEe
Confidence 999999999998876666553
No 23
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=97.86 E-value=4.1e-06 Score=54.02 Aligned_cols=31 Identities=42% Similarity=0.928 Sum_probs=26.7
Q ss_pred CCccccccCCCCCCeeeCCCCceeeCCCCCC
Q psy4900 182 LNCEFTCQASPTGGVCQCPEGQKVANDSRTC 212 (485)
Q Consensus 182 ~~C~~~C~n~~~~~~C~C~~G~~l~~~~~~C 212 (485)
..|++.|.+++++|+|.|++||.|..|+++|
T Consensus 6 GgC~h~C~~~~g~~~C~C~~Gy~L~~D~~tC 36 (36)
T PF14670_consen 6 GGCSHICVNTPGSYRCSCPPGYKLAEDGRTC 36 (36)
T ss_dssp GGSSSEEEEETTSEEEE-STTEEE-TTSSSE
T ss_pred CCcCCCCccCCCceEeECCCCCEECcCCCCC
Confidence 4689999999999999999999999998876
No 24
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=97.74 E-value=1.6e-05 Score=46.04 Aligned_cols=24 Identities=29% Similarity=0.675 Sum_probs=22.2
Q ss_pred CCeeeCCCCceeeCCCCCCCCcch
Q psy4900 194 GGVCQCPEGQKVANDSRTCLLYMK 217 (485)
Q Consensus 194 ~~~C~C~~G~~l~~~~~~C~d~~e 217 (485)
||+|.|++||.+..++++|.|++|
T Consensus 1 sy~C~C~~Gy~l~~d~~~C~DIdE 24 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDGRSCEDIDE 24 (24)
T ss_pred CEEeeCCCCCcCCCCCCccccCCC
Confidence 589999999999999999999986
No 25
>COG3386 Gluconolactonase [Carbohydrate transport and metabolism]
Probab=97.52 E-value=0.00051 Score=67.64 Aligned_cols=97 Identities=25% Similarity=0.220 Sum_probs=68.8
Q ss_pred ccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcC---CC--CcEEEEe--CCCcccceEE
Q psy4900 378 GDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMD---GT--MAMSIVS--EAAYKASGVA 448 (485)
Q Consensus 378 I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~d---G~--~~~~i~~--~~~~~p~gla 448 (485)
+++++.. |... .++...+..|.|||++|....||++| ..+|+|..++ |. +++..+. ..-..|.|++
T Consensus 145 lyr~~p~----g~~~-~l~~~~~~~~NGla~SpDg~tly~aDT~~~~i~r~~~d~~~g~~~~~~~~~~~~~~~G~PDG~~ 219 (307)
T COG3386 145 LYRVDPD----GGVV-RLLDDDLTIPNGLAFSPDGKTLYVADTPANRIHRYDLDPATGPIGGRRGFVDFDEEPGLPDGMA 219 (307)
T ss_pred EEEEcCC----CCEE-EeecCcEEecCceEECCCCCEEEEEeCCCCeEEEEecCcccCccCCcceEEEccCCCCCCCceE
Confidence 4444544 4444 34445689999999999999999999 6899999998 32 2332332 3347899999
Q ss_pred EeCCCCeEE-EEeCCCCcEEEEEccCCCeEEEe
Q psy4900 449 LDINAKRLF-WCDNLLDYIETVDYEGKNRFLIL 480 (485)
Q Consensus 449 vD~~~~~lY-W~D~~~~~I~~~~~dG~~r~~~~ 480 (485)
|| ..++|| ++-+....|.+.+.+|....++.
T Consensus 220 vD-adG~lw~~a~~~g~~v~~~~pdG~l~~~i~ 251 (307)
T COG3386 220 VD-ADGNLWVAAVWGGGRVVRFNPDGKLLGEIK 251 (307)
T ss_pred Ee-CCCCEEEecccCCceEEEECCCCcEEEEEE
Confidence 99 556665 44444559999999988766543
No 26
>KOG4659|consensus
Probab=97.49 E-value=0.00024 Score=78.49 Aligned_cols=76 Identities=21% Similarity=0.286 Sum_probs=62.0
Q ss_pred CCCCCCccEEecCCCCeEEEeCCCcEEEEEcCCCCcEEE------------------EeCCCcccceEEEeCCCCeEEEE
Q psy4900 398 SNLTNPTDLALDPTSGLMFVADSNQILRTNMDGTMAMSI------------------VSEAAYKASGVALDINAKRLFWC 459 (485)
Q Consensus 398 ~~~~~p~~iavD~~~~~lywtd~~~I~r~~~dG~~~~~i------------------~~~~~~~p~glavD~~~~~lYW~ 459 (485)
..|..|+|||||- .|.||++|...|..++-+|--+..| +.-.+.||+.||||+..+-||..
T Consensus 472 A~L~~PkGIa~dk-~g~lYfaD~t~IR~iD~~giIstlig~~~~~~~p~~C~~~~kl~~~~leWPT~LaV~Pmdnsl~Vl 550 (1899)
T KOG4659|consen 472 AQLIFPKGIAFDK-MGNLYFADGTRIRVIDTTGIISTLIGTTPDQHPPRTCAQITKLVDLQLEWPTSLAVDPMDNSLLVL 550 (1899)
T ss_pred ceeccCCceeEcc-CCcEEEecccEEEEeccCceEEEeccCCCCccCccccccccchhheeeecccceeecCCCCeEEEe
Confidence 3478899999994 5899999988888888886443222 22247899999999999999999
Q ss_pred eCCCCcEEEEEccCCCe
Q psy4900 460 DNLLDYIETVDYEGKNR 476 (485)
Q Consensus 460 D~~~~~I~~~~~dG~~r 476 (485)
| ++.|.+++.++..|
T Consensus 551 d--~nvvlrit~~~rV~ 565 (1899)
T KOG4659|consen 551 D--TNVVLRITVVHRVR 565 (1899)
T ss_pred e--cceEEEEccCccEE
Confidence 9 78899999888777
No 27
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=97.39 E-value=9.8e-05 Score=42.76 Aligned_cols=21 Identities=43% Similarity=1.153 Sum_probs=19.6
Q ss_pred CccccCCCCccccCCCCCCCC
Q psy4900 284 GYKCACDIGYRLSVNGNNCNQ 304 (485)
Q Consensus 284 ~~~C~C~~Gy~l~~d~~~C~~ 304 (485)
||+|.|+.||+|.+|+++|.+
T Consensus 1 sy~C~C~~Gy~l~~d~~~C~D 21 (24)
T PF12662_consen 1 SYTCSCPPGYQLSPDGRSCED 21 (24)
T ss_pred CEEeeCCCCCcCCCCCCcccc
Confidence 699999999999999999987
No 28
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=97.29 E-value=8.1e-05 Score=50.18 Aligned_cols=35 Identities=31% Similarity=0.816 Sum_probs=28.0
Q ss_pred CCCCCCCcc--ccccccccCCCCCCccccCCCCccccCCCCC
Q psy4900 262 CGSNNGGCE--HMCIITRASGNALGYKCACDIGYRLSVNGNN 301 (485)
Q Consensus 262 C~~~~g~C~--~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~ 301 (485)
|......|. +.|+|+. |+|+|.|++||++..++..
T Consensus 5 C~~~~~~C~~~~~C~N~~-----Gsy~C~C~~Gy~~~~~~~~ 41 (42)
T PF07645_consen 5 CAEGPHNCPENGTCVNTE-----GSYSCSCPPGYELNDDGTT 41 (42)
T ss_dssp TTTTSSSSSTTSEEEEET-----TEEEEEESTTEEECTTSSE
T ss_pred cCCCCCcCCCCCEEEcCC-----CCEEeeCCCCcEECCCCCc
Confidence 775556776 7899999 9999999999996665543
No 29
>TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.
Probab=97.09 E-value=0.0035 Score=63.77 Aligned_cols=94 Identities=20% Similarity=0.215 Sum_probs=70.0
Q ss_pred cce-EEEEecCCCCCCccEEecCCCCeEEEeCCCcEEEE-EcCCC-----CcEEEEeCC-------CcccceEEEeCCCC
Q psy4900 389 FQF-HAIVLGSNLTNPTDLALDPTSGLMFVADSNQILRT-NMDGT-----MAMSIVSEA-------AYKASGVALDINAK 454 (485)
Q Consensus 389 g~~-~~~l~~~~~~~p~~iavD~~~~~lywtd~~~I~r~-~~dG~-----~~~~i~~~~-------~~~p~glavD~~~~ 454 (485)
|.. +.+++..++..|.||++.+. | ||.++.++|.|. ..+|. .+++|++.- .+.|++|++++ .+
T Consensus 59 G~~d~~~vfa~~l~~p~Gi~~~~~-G-lyV~~~~~i~~~~d~~gdg~ad~~~~~l~~~~~~~~~~~~~~~~~l~~gp-DG 135 (367)
T TIGR02604 59 GKYDKSNVFAEELSMVTGLAVAVG-G-VYVATPPDILFLRDKDGDDKADGEREVLLSGFGGQINNHHHSLNSLAWGP-DG 135 (367)
T ss_pred CCcceeEEeecCCCCccceeEecC-C-EEEeCCCeEEEEeCCCCCCCCCCccEEEEEccCCCCCcccccccCceECC-CC
Confidence 443 34566688999999999865 5 999998889988 45442 345565421 23488999996 67
Q ss_pred eEEEEeCC-------------------CCcEEEEEccCCCeEEEecCCCC
Q psy4900 455 RLFWCDNL-------------------LDYIETVDYEGKNRFLILRGSQN 485 (485)
Q Consensus 455 ~lYW~D~~-------------------~~~I~~~~~dG~~r~~~~~~~~~ 485 (485)
+||+++.. .+.|.+.+.+|+..+++..+.+|
T Consensus 136 ~LYv~~G~~~~~~~~~~~~~~~~~~~~~g~i~r~~pdg~~~e~~a~G~rn 185 (367)
T TIGR02604 136 WLYFNHGNTLASKVTRPGTSDESRQGLGGGLFRYNPDGGKLRVVAHGFQN 185 (367)
T ss_pred CEEEecccCCCceeccCCCccCcccccCceEEEEecCCCeEEEEecCcCC
Confidence 99998872 15799999999998888888775
No 30
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C terminus alpha-amidation of biological peptides []. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators []. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller [, ]. The NHL repeats are also found in serine/threonine protein kinase (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=96.98 E-value=0.0019 Score=39.25 Aligned_cols=27 Identities=19% Similarity=0.254 Sum_probs=24.6
Q ss_pred CcccceEEEeCCCCeEEEEeCCCCcEEE
Q psy4900 441 AYKASGVALDINAKRLFWCDNLLDYIET 468 (485)
Q Consensus 441 ~~~p~glavD~~~~~lYW~D~~~~~I~~ 468 (485)
+..|.||||| ..++||.+|++.++|.+
T Consensus 1 f~~P~gvav~-~~g~i~VaD~~n~rV~v 27 (28)
T PF01436_consen 1 FNYPHGVAVD-SDGNIYVADSGNHRVQV 27 (28)
T ss_dssp BSSEEEEEEE-TTSEEEEEECCCTEEEE
T ss_pred CcCCcEEEEe-CCCCEEEEECCCCEEEE
Confidence 4689999999 99999999999999876
No 31
>KOG1520|consensus
Probab=96.83 E-value=0.0018 Score=64.01 Aligned_cols=113 Identities=13% Similarity=0.192 Sum_probs=77.9
Q ss_pred cCCCCccCCCCC-CccccccCcccccceEEEEecC----CCCCCccEEecCCCCeEEEeCC------CcEEEEEcCCC--
Q psy4900 365 CDSENDCGDGSD-EGDFCSEKTCAYFQFHAIVLGS----NLTNPTDLALDPTSGLMFVADS------NQILRTNMDGT-- 431 (485)
Q Consensus 365 ~~~~~~y~~d~~-~I~~~~~~~c~~g~~~~~l~~~----~~~~p~~iavD~~~~~lywtd~------~~I~r~~~dG~-- 431 (485)
..++++|..|.- .+..++.. |.....+..+ .+.-..+|+||+ +|.|||||. ..+.-+-|.|.
T Consensus 124 ~~ggdL~VaDAYlGL~~V~p~----g~~a~~l~~~~~G~~~kf~N~ldI~~-~g~vyFTDSSsk~~~rd~~~a~l~g~~~ 198 (376)
T KOG1520|consen 124 KKGGDLYVADAYLGLLKVGPE----GGLAELLADEAEGKPFKFLNDLDIDP-EGVVYFTDSSSKYDRRDFVFAALEGDPT 198 (376)
T ss_pred cCCCeEEEEecceeeEEECCC----CCcceeccccccCeeeeecCceeEcC-CCeEEEeccccccchhheEEeeecCCCc
Confidence 345688888776 55556655 5443332221 134456899999 899999991 23444444442
Q ss_pred ---------Cc-EEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCe---EEEecC
Q psy4900 432 ---------MA-MSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNR---FLILRG 482 (485)
Q Consensus 432 ---------~~-~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r---~~~~~~ 482 (485)
.+ ..++..+|..|+||||-+....|-.+.....+|.+.-+.|.+. .+++++
T Consensus 199 GRl~~YD~~tK~~~VLld~L~F~NGlaLS~d~sfvl~~Et~~~ri~rywi~g~k~gt~EvFa~~ 262 (376)
T KOG1520|consen 199 GRLFRYDPSTKVTKVLLDGLYFPNGLALSPDGSFVLVAETTTARIKRYWIKGPKAGTSEVFAEG 262 (376)
T ss_pred cceEEecCcccchhhhhhcccccccccCCCCCCEEEEEeeccceeeeeEecCCccCchhhHhhc
Confidence 11 1244478999999999999999999999999999999999876 555553
No 32
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=96.82 E-value=0.00042 Score=46.67 Aligned_cols=27 Identities=30% Similarity=0.626 Sum_probs=22.2
Q ss_pred cccccCCCCCCeeeCCCCceeeCCCCC
Q psy4900 185 EFTCQASPTGGVCQCPEGQKVANDSRT 211 (485)
Q Consensus 185 ~~~C~n~~~~~~C~C~~G~~l~~~~~~ 211 (485)
...|.|+.|+|+|.|++||.+..++..
T Consensus 15 ~~~C~N~~Gsy~C~C~~Gy~~~~~~~~ 41 (42)
T PF07645_consen 15 NGTCVNTEGSYSCSCPPGYELNDDGTT 41 (42)
T ss_dssp TSEEEEETTEEEEEESTTEEECTTSSE
T ss_pred CCEEEcCCCCEEeeCCCCcEECCCCCc
Confidence 378999999999999999986555443
No 33
>COG3386 Gluconolactonase [Carbohydrate transport and metabolism]
Probab=96.79 E-value=0.0053 Score=60.52 Aligned_cols=71 Identities=24% Similarity=0.490 Sum_probs=59.7
Q ss_pred CCCCccEEecCCCCeEEEeC-C------------CcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcE
Q psy4900 400 LTNPTDLALDPTSGLMFVAD-S------------NQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYI 466 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd-~------------~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I 466 (485)
+..|..+.+||. |.+|+++ . ..|+|++.+|.. +.++...+..|+|||+++..+.||++|+...+|
T Consensus 110 ~~r~ND~~v~pd-G~~wfgt~~~~~~~~~~~~~~G~lyr~~p~g~~-~~l~~~~~~~~NGla~SpDg~tly~aDT~~~~i 187 (307)
T COG3386 110 LNRPNDGVVDPD-GRIWFGDMGYFDLGKSEERPTGSLYRVDPDGGV-VRLLDDDLTIPNGLAFSPDGKTLYVADTPANRI 187 (307)
T ss_pred cCCCCceeEcCC-CCEEEeCCCccccCccccCCcceEEEEcCCCCE-EEeecCcEEecCceEECCCCCEEEEEeCCCCeE
Confidence 567888999988 8899988 3 248888875544 555556699999999999999999999999999
Q ss_pred EEEEcc
Q psy4900 467 ETVDYE 472 (485)
Q Consensus 467 ~~~~~d 472 (485)
++..++
T Consensus 188 ~r~~~d 193 (307)
T COG3386 188 HRYDLD 193 (307)
T ss_pred EEEecC
Confidence 999998
No 34
>KOG1520|consensus
Probab=96.75 E-value=0.0029 Score=62.58 Aligned_cols=63 Identities=21% Similarity=0.324 Sum_probs=52.6
Q ss_pred CCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCC----CcccceEEEeCCCCeEEEEeCCC
Q psy4900 400 LTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEA----AYKASGVALDINAKRLFWCDNLL 463 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~----~~~p~glavD~~~~~lYW~D~~~ 463 (485)
=.+|-||+++..+|.||.+| .--++.++..|...+.+.... +...++|+||. ++.|||||+..
T Consensus 114 CGRPLGl~f~~~ggdL~VaDAYlGL~~V~p~g~~a~~l~~~~~G~~~kf~N~ldI~~-~g~vyFTDSSs 181 (376)
T KOG1520|consen 114 CGRPLGIRFDKKGGDLYVADAYLGLLKVGPEGGLAELLADEAEGKPFKFLNDLDIDP-EGVVYFTDSSS 181 (376)
T ss_pred cCCcceEEeccCCCeEEEEecceeeEEECCCCCcceeccccccCeeeeecCceeEcC-CCeEEEecccc
Confidence 37899999999999999999 677999999988755555433 55678999998 99999999864
No 35
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonase (6PGL) 3.1.1.31 from EC, which hydrolyses 6-phosphogluconolactone to 6-phosphogluconate, is one of the enzymes in the pentose phosphate pathway. Two families of structurally dissimilar 6PGLs are known to exist: the Escherichia coli (strain K12) YbhE IPR022528 from INTERPRO [] and the Pseudomonas aeruginosa DevB IPR005900 from INTERPRO [] types. This entry contains bacterial 6-phosphogluconolactonases (6PGL) YbhE-type 3.1.1.31 from EC which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonizing enzyme carboxy-cis,cis-muconate cyclase 5.5.1.5 from EC and muconate cycloisomerase 5.5.1.1 from EC, which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures have been reported for the E. coli 6-phosphogluconolactonase and Neurospora crassa muconate cycloisomerase. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold [].; PDB: 3SCY_A 1L0Q_A 3HFQ_B 3FGB_A 1RI6_A 3U4Y_A 3BWS_A 1JOF_H.
Probab=96.61 E-value=0.01 Score=59.78 Aligned_cols=109 Identities=12% Similarity=0.122 Sum_probs=75.1
Q ss_pred ccccCCCCccCCCCC--CccccccCcccccc---e--EEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCC-C
Q psy4900 362 TWKCDSENDCGDGSD--EGDFCSEKTCAYFQ---F--HAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDG-T 431 (485)
Q Consensus 362 ~~~~~~~~~y~~d~~--~I~~~~~~~c~~g~---~--~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG-~ 431 (485)
.+.|+++.+|..|.+ +|.+.+++ .. . ...+-...-.-||.|+++|...+||.+. ...|....++. .
T Consensus 150 ~~~pdg~~v~v~dlG~D~v~~~~~~----~~~~~l~~~~~~~~~~G~GPRh~~f~pdg~~~Yv~~e~s~~v~v~~~~~~~ 225 (345)
T PF10282_consen 150 VFSPDGRFVYVPDLGADRVYVYDID----DDTGKLTPVDSIKVPPGSGPRHLAFSPDGKYAYVVNELSNTVSVFDYDPSD 225 (345)
T ss_dssp EE-TTSSEEEEEETTTTEEEEEEE-----TTS-TEEEEEEEECSTTSSEEEEEE-TTSSEEEEEETTTTEEEEEEEETTT
T ss_pred EECCCCCEEEEEecCCCEEEEEEEe----CCCceEEEeeccccccCCCCcEEEEcCCcCEEEEecCCCCcEEEEeecccC
Confidence 456777888887765 66666655 22 2 1222234456799999999999999998 78899999982 2
Q ss_pred CcEEEEe---C------CCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 432 MAMSIVS---E------AAYKASGVALDINAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 432 ~~~~i~~---~------~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
.....+. . ...+|.+|+|.+..++||.+..+.+.|.+.++|..
T Consensus 226 g~~~~~~~~~~~~~~~~~~~~~~~i~ispdg~~lyvsnr~~~sI~vf~~d~~ 277 (345)
T PF10282_consen 226 GSLTEIQTISTLPEGFTGENAPAEIAISPDGRFLYVSNRGSNSISVFDLDPA 277 (345)
T ss_dssp TEEEEEEEEESCETTSCSSSSEEEEEE-TTSSEEEEEECTTTEEEEEEECTT
T ss_pred CceeEEEEeeeccccccccCCceeEEEecCCCEEEEEeccCCEEEEEEEecC
Confidence 2222211 1 12378999999999999999999999999999654
No 36
>KOG4659|consensus
Probab=96.42 E-value=0.017 Score=64.51 Aligned_cols=102 Identities=21% Similarity=0.282 Sum_probs=72.3
Q ss_pred CCccCCCCCCccccccCcccccceEEEEecCCCCCCc---cEEecCCCCeEEEeC--CCcEEEEE-cCCC----CcEEEE
Q psy4900 368 ENDCGDGSDEGDFCSEKTCAYFQFHAIVLGSNLTNPT---DLALDPTSGLMFVAD--SNQILRTN-MDGT----MAMSIV 437 (485)
Q Consensus 368 ~~~y~~d~~~I~~~~~~~c~~g~~~~~l~~~~~~~p~---~iavD~~~~~lywtd--~~~I~r~~-~dG~----~~~~i~ 437 (485)
+.+|..|-.-|.++..+ |+...+|- -++..|. -|||+|..|.||.+| .++|+|+. +.++ +.+++.
T Consensus 376 GSl~VGDfNyIRRI~~d----g~v~tIl~-L~~t~~sh~Yy~AvsPvdgtlyvSdp~s~qv~rv~sl~~~d~~~N~evva 450 (1899)
T KOG4659|consen 376 GSLIVGDFNYIRRISQD----GQVSTILT-LGLTDTSHSYYIAVSPVDGTLYVSDPLSKQVWRVSSLEPQDSRNNYEVVA 450 (1899)
T ss_pred CcEEEccchheeeecCC----CceEEEEE-ecCCCccceeEEEecCcCceEEecCCCcceEEEeccCCccccccCeeEEe
Confidence 67787777777777778 77765543 3444443 499999999999999 67788764 3332 234443
Q ss_pred eC--------------------CCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeE
Q psy4900 438 SE--------------------AAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRF 477 (485)
Q Consensus 438 ~~--------------------~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~ 477 (485)
.. .|..|.||||| ..+.||++|. -.|..++-+|--+.
T Consensus 451 G~Ge~Clp~desCGDGalA~dA~L~~PkGIa~d-k~g~lYfaD~--t~IR~iD~~giIst 507 (1899)
T KOG4659|consen 451 GDGEVCLPADESCGDGALAQDAQLIFPKGIAFD-KMGNLYFADG--TRIRVIDTTGIIST 507 (1899)
T ss_pred ccCcCccccccccCcchhcccceeccCCceeEc-cCCcEEEecc--cEEEEeccCceEEE
Confidence 22 37789999999 8899999994 45777777775444
No 37
>TIGR02604 Piru_Ver_Nterm putative membrane-bound dehydrogenase domain. All proteins that score above the trusted cutoff score of 45 to this model are large proteins of either Pirellula sp. 1 or Verrucomicrobium spinosum. These proteins all contain, in addition to this domain, several hundred residues of highly variable sequence, and then a well-conserved C-terminal domain (TIGR02603) that features a putative cytochrome c-type heme binding motif CXXCH. The membrane-bound L-sorbosone dehydrogenase from Acetobacter liquefaciens (Gluconacetobacter liquefaciens) is homologous to this domain but lacks additional sequence regions shared by members of this family and belongs to a different clade of the larger family of homologs. It and its closely related homologs are excluded from this model by scoring between the trusted (45) and noise (18) cutoffs.
Probab=96.32 E-value=0.033 Score=56.69 Aligned_cols=87 Identities=24% Similarity=0.463 Sum_probs=60.0
Q ss_pred eEEEEecCC--CCCCccEEecCCCCeEEEeCC--------------CcEEEEEc---CCCC-cEEEEeCCCcccceEEEe
Q psy4900 391 FHAIVLGSN--LTNPTDLALDPTSGLMFVADS--------------NQILRTNM---DGTM-AMSIVSEAAYKASGVALD 450 (485)
Q Consensus 391 ~~~~l~~~~--~~~p~~iavD~~~~~lywtd~--------------~~I~r~~~---dG~~-~~~i~~~~~~~p~glavD 450 (485)
++..|++.. +.+|++|++|+. |+||.++. .+|.+..- ||.- +.+++..++..|+||++.
T Consensus 2 f~~~l~A~~p~~~~P~~ia~d~~-G~l~V~e~~~y~~~~~~~~~~~~rI~~l~d~dgdG~~d~~~vfa~~l~~p~Gi~~~ 80 (367)
T TIGR02604 2 FKVTLFAAEPLLRNPIAVCFDER-GRLWVAEGITYSRPAGRQGPLGDRILILEDADGDGKYDKSNVFAEELSMVTGLAVA 80 (367)
T ss_pred cEEEEEECCCccCCCceeeECCC-CCEEEEeCCcCCCCCCCCCCCCCEEEEEEcCCCCCCcceeEEeecCCCCccceeEe
Confidence 344566554 999999999987 88999861 27877754 4542 345666789999999997
Q ss_pred CCCCeEEEEeCCCCcEEEE-EccCC-----CeEEEecC
Q psy4900 451 INAKRLFWCDNLLDYIETV-DYEGK-----NRFLILRG 482 (485)
Q Consensus 451 ~~~~~lYW~D~~~~~I~~~-~~dG~-----~r~~~~~~ 482 (485)
. .+ ||.++ ...|.+. +.+|. .+++|+.+
T Consensus 81 ~-~G-lyV~~--~~~i~~~~d~~gdg~ad~~~~~l~~~ 114 (367)
T TIGR02604 81 V-GG-VYVAT--PPDILFLRDKDGDDKADGEREVLLSG 114 (367)
T ss_pred c-CC-EEEeC--CCeEEEEeCCCCCCCCCCccEEEEEc
Confidence 4 34 99986 4457766 44442 45556554
No 38
>PF06977 SdiA-regulated: SdiA-regulated; InterPro: IPR009722 This entry represents a conserved region approximately 100 residues long within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators []. Some proteins contain the IPR001258 from INTERPRO repeat.; PDB: 3QQZ_A.
Probab=96.04 E-value=0.016 Score=55.32 Aligned_cols=68 Identities=18% Similarity=0.343 Sum_probs=42.1
Q ss_pred CCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeC--------CCcccceEEEeCCCCeEEEEeCCCCcEEE
Q psy4900 399 NLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSE--------AAYKASGVALDINAKRLFWCDNLLDYIET 468 (485)
Q Consensus 399 ~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~--------~~~~p~glavD~~~~~lYW~D~~~~~I~~ 468 (485)
.+..|.+|++||.+|.||.-. .++|...+.+|.-...+--. .+.+|+|||+|. .++||.+.- -+..++
T Consensus 169 ~~~d~S~l~~~p~t~~lliLS~es~~l~~~d~~G~~~~~~~L~~g~~gl~~~~~QpEGIa~d~-~G~LYIvsE-pNlfy~ 246 (248)
T PF06977_consen 169 FVRDLSGLSYDPRTGHLLILSDESRLLLELDRQGRVVSSLSLDRGFHGLSKDIPQPEGIAFDP-DGNLYIVSE-PNLFYR 246 (248)
T ss_dssp -SS---EEEEETTTTEEEEEETTTTEEEEE-TT--EEEEEE-STTGGG-SS---SEEEEEE-T-T--EEEEET-TTEEEE
T ss_pred eeccccceEEcCCCCeEEEEECCCCeEEEECCCCCEEEEEEeCCcccCcccccCCccEEEECC-CCCEEEEcC-CceEEE
Confidence 467799999999999999877 78999999999865544322 367899999995 789999863 344444
No 39
>KOG4499|consensus
Probab=95.91 E-value=0.05 Score=50.37 Aligned_cols=80 Identities=15% Similarity=0.203 Sum_probs=62.9
Q ss_pred EecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcC---C--CCcEEEEeC------CCcccceEEEeCCCCeEEEEeC
Q psy4900 395 VLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMD---G--TMAMSIVSE------AAYKASGVALDINAKRLFWCDN 461 (485)
Q Consensus 395 l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~d---G--~~~~~i~~~------~~~~p~glavD~~~~~lYW~D~ 461 (485)
++-..+.-|.||+-|....++|++| ...|...+.| | ++|++|+.- +-..|.|+||| ..++||.+-+
T Consensus 152 ~i~~~v~IsNgl~Wd~d~K~fY~iDsln~~V~a~dyd~~tG~~snr~~i~dlrk~~~~e~~~PDGm~ID-~eG~L~Va~~ 230 (310)
T KOG4499|consen 152 LIWNCVGISNGLAWDSDAKKFYYIDSLNYEVDAYDYDCPTGDLSNRKVIFDLRKSQPFESLEPDGMTID-TEGNLYVATF 230 (310)
T ss_pred eeehhccCCccccccccCcEEEEEccCceEEeeeecCCCcccccCcceeEEeccCCCcCCCCCCcceEc-cCCcEEEEEe
Confidence 3335567789999999999999999 6777666655 2 568888763 23469999999 5999999999
Q ss_pred CCCcEEEEEccCCC
Q psy4900 462 LLDYIETVDYEGKN 475 (485)
Q Consensus 462 ~~~~I~~~~~dG~~ 475 (485)
..++|.+.++....
T Consensus 231 ng~~V~~~dp~tGK 244 (310)
T KOG4499|consen 231 NGGTVQKVDPTTGK 244 (310)
T ss_pred cCcEEEEECCCCCc
Confidence 99999999887544
No 40
>COG3391 Uncharacterized conserved protein [Function unknown]
Probab=95.59 E-value=0.1 Score=53.29 Aligned_cols=106 Identities=15% Similarity=0.198 Sum_probs=72.3
Q ss_pred cCCCCccCCCCC--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeCC----CcEEEEEcCCCCcEEEEe
Q psy4900 365 CDSENDCGDGSD--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVADS----NQILRTNMDGTMAMSIVS 438 (485)
Q Consensus 365 ~~~~~~y~~d~~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd~----~~I~r~~~dG~~~~~i~~ 438 (485)
..+.++|....+ .|.+.+.. .......+..+. .|.+|++++..++||.++. .+|..++-. ..+++..
T Consensus 83 ~~~~~vyv~~~~~~~v~vid~~----~~~~~~~~~vG~-~P~~~~~~~~~~~vYV~n~~~~~~~vsvid~~--t~~~~~~ 155 (381)
T COG3391 83 PAGNKVYVTTGDSNTVSVIDTA----TNTVLGSIPVGL-GPVGLAVDPDGKYVYVANAGNGNNTVSVIDAA--TNKVTAT 155 (381)
T ss_pred CCCCeEEEecCCCCeEEEEcCc----ccceeeEeeecc-CCceEEECCCCCEEEEEecccCCceEEEEeCC--CCeEEEE
Confidence 345668887643 67777644 222222222222 8999999999999999993 445554443 3333333
Q ss_pred CCCc-ccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeE
Q psy4900 439 EAAY-KASGVALDINAKRLFWCDNLLDYIETVDYEGKNRF 477 (485)
Q Consensus 439 ~~~~-~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~ 477 (485)
.... .|.++++|+...++|-++...+.|...+..+....
T Consensus 156 ~~vG~~P~~~a~~p~g~~vyv~~~~~~~v~vi~~~~~~v~ 195 (381)
T COG3391 156 IPVGNTPTGVAVDPDGNKVYVTNSDDNTVSVIDTSGNSVV 195 (381)
T ss_pred EecCCCcceEEECCCCCeEEEEecCCCeEEEEeCCCccee
Confidence 2222 58999999999999999999999999987776544
No 41
>PF03022 MRJP: Major royal jelly protein; InterPro: IPR003534 The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and 82-90% of the protein content [], of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees [] and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content []. Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that results from a high genetic variability. This polymorphism may be useful for genotyping individual bees [].; PDB: 3Q6P_B 3Q6K_A 3Q6T_A 2QE8_B.
Probab=95.57 E-value=0.06 Score=52.75 Aligned_cols=80 Identities=18% Similarity=0.314 Sum_probs=53.9
Q ss_pred CccEEecC---CCCeEEEeC--CCcEEEEEcC----CCC--------cEEEEeCCCcccceEEEeCCCCeEEEEeCCCCc
Q psy4900 403 PTDLALDP---TSGLMFVAD--SNQILRTNMD----GTM--------AMSIVSEAAYKASGVALDINAKRLFWCDNLLDY 465 (485)
Q Consensus 403 p~~iavD~---~~~~lywtd--~~~I~r~~~d----G~~--------~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~ 465 (485)
..|||+.+ ..++|||.- ..+++++... .+. ....+........|+++|. .+.||+++...+.
T Consensus 130 ~~gial~~~~~d~r~LYf~~lss~~ly~v~T~~L~~~~~~~~~~~~~~v~~lG~k~~~s~g~~~D~-~G~ly~~~~~~~a 208 (287)
T PF03022_consen 130 IFGIALSPISPDGRWLYFHPLSSRKLYRVPTSVLRDPSLSDAQALASQVQDLGDKGSQSDGMAIDP-NGNLYFTDVEQNA 208 (287)
T ss_dssp EEEEEE-TTSTTS-EEEEEETT-SEEEEEEHHHHCSTT--HHH-HHHT-EEEEE---SECEEEEET-TTEEEEEECCCTE
T ss_pred ccccccCCCCCCccEEEEEeCCCCcEEEEEHHHhhCccccccccccccceeccccCCCCceEEECC-CCcEEEecCCCCe
Confidence 45678876 446899998 5678888765 111 1123322235678999996 9999999999999
Q ss_pred EEEEEccC----CCeEEEecCC
Q psy4900 466 IETVDYEG----KNRFLILRGS 483 (485)
Q Consensus 466 I~~~~~dG----~~r~~~~~~~ 483 (485)
|.+.+.++ .+.++|++..
T Consensus 209 I~~w~~~~~~~~~~~~~l~~d~ 230 (287)
T PF03022_consen 209 IGCWDPDGPYTPENFEILAQDP 230 (287)
T ss_dssp EEEEETTTSB-GCCEEEEEE-C
T ss_pred EEEEeCCCCcCccchheeEEcC
Confidence 99999998 4566676654
No 42
>PF01731 Arylesterase: Arylesterase; InterPro: IPR002640 The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity []. Mammals have 3 distinct paraoxonase types, termed PON1-3 [, ]. In mice and humans, the PON genes are found on the same chromosome in close proximity. PON activity has been found in variety of tissues, with highest levels in liver and serum - the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)- associated proteins capable of preventing oxidative modification of low density lipoproteins (LDL) []. Although PON2 has oxidative properties, the enzyme does not associate with HDL. Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1,2 or 3) share 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo []. This family consists of arylesterases (also known as serum paraoxonase) 3.1.1.2 from EC. These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood.
They confer resistance to organophosphate toxicity []. Human arylesterase (PON1) P27169 from SWISSPROT is associated with HDL and may protect against LDL oxidation [].; GO: 0004064 arylesterase activity
Probab=95.47 E-value=0.057 Score=42.42 Aligned_cols=42 Identities=24% Similarity=0.318 Sum_probs=34.6
Q ss_pred CCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEc
Q psy4900 429 DGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDY 471 (485)
Q Consensus 429 dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~ 471 (485)
||+..+ ++..++..|+||++|+.++.||.++...+.|.+...
T Consensus 42 d~~~~~-~va~g~~~aNGI~~s~~~k~lyVa~~~~~~I~vy~~ 83 (86)
T PF01731_consen 42 DGKEVK-VVASGFSFANGIAISPDKKYLYVASSLAHSIHVYKR 83 (86)
T ss_pred eCCEeE-EeeccCCCCceEEEcCCCCEEEEEeccCCeEEEEEe
Confidence 444434 344789999999999999999999999999988765
No 43
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=95.35 E-value=0.013 Score=55.27 Aligned_cols=35 Identities=34% Similarity=0.763 Sum_probs=30.2
Q ss_pred CCCCCCCccccccccccCCCCCCccccCCCCccccCCCCC
Q psy4900 262 CGSNNGGCEHMCIITRASGNALGYKCACDIGYRLSVNGNN 301 (485)
Q Consensus 262 C~~~~g~C~~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~ 301 (485)
|......|.|.|.+++ |+|.|.|..||.|..|+++
T Consensus 190 C~~~~~~c~~~C~~~~-----g~~~c~c~~g~~~~~~~~~ 224 (224)
T cd01475 190 CATLSHVCQQVCISTP-----GSYLCACTEGYALLEDNKT 224 (224)
T ss_pred hcCCCCCccceEEcCC-----CCEEeECCCCccCCCCCCC
Confidence 7655667999999999 9999999999999887653
No 44
>TIGR03606 non_repeat_PQQ dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.
Probab=95.34 E-value=0.2 Score=51.91 Aligned_cols=94 Identities=21% Similarity=0.264 Sum_probs=64.9
Q ss_pred cceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEE------e-CCCcccceEEEeCC------C
Q psy4900 389 FQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIV------S-EAAYKASGVALDIN------A 453 (485)
Q Consensus 389 g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~------~-~~~~~p~glavD~~------~ 453 (485)
..+...++..+|..|.+|++.|. |.||.++ ..+|.++.-++...+++. . ....-+.|||+++. +
T Consensus 18 ~~f~~~~va~GL~~Pw~maflPD-G~llVtER~~G~I~~v~~~~~~~~~~~~l~~v~~~~ge~GLlglal~PdF~~~~~n 96 (454)
T TIGR03606 18 ENFDKKVLLSGLNKPWALLWGPD-NQLWVTERATGKILRVNPETGEVKVVFTLPEIVNDAQHNGLLGLALHPDFMQEKGN 96 (454)
T ss_pred CCcEEEEEECCCCCceEEEEcCC-CeEEEEEecCCEEEEEeCCCCceeeeecCCceeccCCCCceeeEEECCCccccCCC
Confidence 45666777789999999999986 7899998 489999876654433322 1 13445789999854 4
Q ss_pred CeEEEEeC---------CCCcEEEEEccCC-----CeEEEecCC
Q psy4900 454 KRLFWCDN---------LLDYIETVDYEGK-----NRFLILRGS 483 (485)
Q Consensus 454 ~~lYW~D~---------~~~~I~~~~~dG~-----~r~~~~~~~ 483 (485)
+.||++-+ ...+|.|..++.. ..++|+.+.
T Consensus 97 ~~lYvsyt~~~~~~~~~~~~~I~R~~l~~~~~~l~~~~~Il~~l 140 (454)
T TIGR03606 97 PYVYISYTYKNGDKELPNHTKIVRYTYDKSTQTLEKPVDLLAGL 140 (454)
T ss_pred cEEEEEEeccCCCCCccCCcEEEEEEecCCCCccccceEEEecC
Confidence 68998742 2468999988732 345666543
No 45
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonase (6PGL) 3.1.1.31 from EC, which hydrolyses 6-phosphogluconolactone to 6-phosphogluconate, is one of the enzymes in the pentose phosphate pathway. Two families of structurally dissimilar 6PGLs are known to exist: the Escherichia coli (strain K12) YbhE IPR022528 from INTERPRO [] and the Pseudomonas aeruginosa DevB IPR005900 from INTERPRO [] types. This entry contains bacterial 6-phosphogluconolactonases (6PGL) YbhE-type 3.1.1.31 from EC which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonizing enzyme carboxy-cis,cis-muconate cyclase 5.5.1.5 from EC and muconate cycloisomerase 5.5.1.1 from EC, which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures have been reported for the E. coli 6-phosphogluconolactonase and Neurospora crassa muconate cycloisomerase. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold [].; PDB: 3SCY_A 1L0Q_A 3HFQ_B 3FGB_A 1RI6_A 3U4Y_A 3BWS_A 1JOF_H.
Probab=95.31 E-value=0.094 Score=52.86 Aligned_cols=112 Identities=13% Similarity=0.200 Sum_probs=73.3
Q ss_pred ccccCCCCccCCCCC--CccccccCcccccceEEE--E--ecCC---CCCCccEEecCCCCeEEEeC--CCcEEEEEcCC
Q psy4900 362 TWKCDSENDCGDGSD--EGDFCSEKTCAYFQFHAI--V--LGSN---LTNPTDLALDPTSGLMFVAD--SNQILRTNMDG 430 (485)
Q Consensus 362 ~~~~~~~~~y~~d~~--~I~~~~~~~c~~g~~~~~--l--~~~~---~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG 430 (485)
.+.+++..+|..... .|.+.+++. ..|..... + +... ...|.+|++.|..++||.+. ...|....+|.
T Consensus 198 ~f~pdg~~~Yv~~e~s~~v~v~~~~~-~~g~~~~~~~~~~~~~~~~~~~~~~~i~ispdg~~lyvsnr~~~sI~vf~~d~ 276 (345)
T PF10282_consen 198 AFSPDGKYAYVVNELSNTVSVFDYDP-SDGSLTEIQTISTLPEGFTGENAPAEIAISPDGRFLYVSNRGSNSISVFDLDP 276 (345)
T ss_dssp EE-TTSSEEEEEETTTTEEEEEEEET-TTTEEEEEEEEESCETTSCSSSSEEEEEE-TTSSEEEEEECTTTEEEEEEECT
T ss_pred EEcCCcCEEEEecCCCCcEEEEeecc-cCCceeEEEEeeeccccccccCCceeEEEecCCCEEEEEeccCCEEEEEEEec
Confidence 455667778876544 454444330 01433221 1 1111 23688999999999999999 67898888865
Q ss_pred C-CcEE---EEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 431 T-MAMS---IVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 431 ~-~~~~---i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
. .+-. .+...-.+|.+|+||+..++||.+....+.|.+.++|..
T Consensus 277 ~~g~l~~~~~~~~~G~~Pr~~~~s~~g~~l~Va~~~s~~v~vf~~d~~ 324 (345)
T PF10282_consen 277 ATGTLTLVQTVPTGGKFPRHFAFSPDGRYLYVANQDSNTVSVFDIDPD 324 (345)
T ss_dssp TTTTEEEEEEEEESSSSEEEEEE-TTSSEEEEEETTTTEEEEEEEETT
T ss_pred CCCceEEEEEEeCCCCCccEEEEeCCCCEEEEEecCCCeEEEEEEeCC
Confidence 4 2222 233345679999999999999999999999999888743
No 46
>PF07995 GSDH: Glucose / Sorbosone dehydrogenase; InterPro: IPR012938 Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase (P13650 from SWISSPROT) from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor [].; GO: 0016901 oxidoreductase activity, acting on the CH-OH group of donors, quinone or similar compound as acceptor, 0048038 quinone binding, 0005975 carbohydrate metabolic process; PDB: 2ISM_A 2WG3_D 3HO5_A 3HO4_A 3HO3_A 2WFT_A 2WG4_B 2WFX_B 1CRU_A 1CQ1_B ....
Probab=95.21 E-value=0.038 Score=55.40 Aligned_cols=67 Identities=31% Similarity=0.404 Sum_probs=50.5
Q ss_pred CCCCCccEEecCCCCeEEEeC---------------CCcEEEEEcCCCC------------cEEEEeCCCcccceEEEeC
Q psy4900 399 NLTNPTDLALDPTSGLMFVAD---------------SNQILRTNMDGTM------------AMSIVSEAAYKASGVALDI 451 (485)
Q Consensus 399 ~~~~p~~iavD~~~~~lywtd---------------~~~I~r~~~dG~~------------~~~i~~~~~~~p~glavD~ 451 (485)
..+...+|+++|. |+||++- ..+|.|++.||+- ...|+..++.+|.+||+|+
T Consensus 112 ~~H~g~~l~fgpD-G~LYvs~G~~~~~~~~~~~~~~~G~ilri~~dG~~p~dnP~~~~~~~~~~i~A~GlRN~~~~~~d~ 190 (331)
T PF07995_consen 112 GNHNGGGLAFGPD-GKLYVSVGDGGNDDNAQDPNSLRGKILRIDPDGSIPADNPFVGDDGADSEIYAYGLRNPFGLAFDP 190 (331)
T ss_dssp SSS-EEEEEE-TT-SEEEEEEB-TTTGGGGCSTTSSTTEEEEEETTSSB-TTSTTTTSTTSTTTEEEE--SEEEEEEEET
T ss_pred CCCCCccccCCCC-CcEEEEeCCCCCcccccccccccceEEEecccCcCCCCCccccCCCceEEEEEeCCCccccEEEEC
Confidence 3556677999985 6999986 2579999999972 3467788999999999999
Q ss_pred CCCeEEEEeCCCCcE
Q psy4900 452 NAKRLFWCDNLLDYI 466 (485)
Q Consensus 452 ~~~~lYW~D~~~~~I 466 (485)
.+++||.+|.+....
T Consensus 191 ~tg~l~~~d~G~~~~ 205 (331)
T PF07995_consen 191 NTGRLWAADNGPDGW 205 (331)
T ss_dssp TTTEEEEEEE-SSSS
T ss_pred CCCcEEEEccCCCCC
Confidence 999999999776433
No 47
>PRK11028 6-phosphogluconolactonase; Provisional
Probab=95.19 E-value=0.15 Score=50.86 Aligned_cols=109 Identities=14% Similarity=0.028 Sum_probs=71.1
Q ss_pred cccCCCCccCCCCC--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcC--CCCcEEE
Q psy4900 363 WKCDSENDCGDGSD--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMD--GTMAMSI 436 (485)
Q Consensus 363 ~~~~~~~~y~~d~~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~d--G~~~~~i 436 (485)
..+++..+|..... .|.+.+.+ +.|+....-.......|..|+++|..++||-+. .++|...+++ |...+.+
T Consensus 42 ~spd~~~lyv~~~~~~~i~~~~~~--~~g~l~~~~~~~~~~~p~~i~~~~~g~~l~v~~~~~~~v~v~~~~~~g~~~~~~ 119 (330)
T PRK11028 42 ISPDKRHLYVGVRPEFRVLSYRIA--DDGALTFAAESPLPGSPTHISTDHQGRFLFSASYNANCVSVSPLDKDGIPVAPI 119 (330)
T ss_pred ECCCCCEEEEEECCCCcEEEEEEC--CCCceEEeeeecCCCCceEEEECCCCCEEEEEEcCCCeEEEEEECCCCCCCCce
Confidence 34566667765332 45444433 014332111112234799999999999999887 5777777775 4322222
Q ss_pred E-eCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccC
Q psy4900 437 V-SEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEG 473 (485)
Q Consensus 437 ~-~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG 473 (485)
- ......|.++++++..++||-++.+.++|.+.+++.
T Consensus 120 ~~~~~~~~~~~~~~~p~g~~l~v~~~~~~~v~v~d~~~ 157 (330)
T PRK11028 120 QIIEGLEGCHSANIDPDNRTLWVPCLKEDRIRLFTLSD 157 (330)
T ss_pred eeccCCCcccEeEeCCCCCEEEEeeCCCCEEEEEEECC
Confidence 1 134567999999999999999999999999998874
No 48
>PF07995 GSDH: Glucose / Sorbosone dehydrogenase; InterPro: IPR012938 Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase (P13650 from SWISSPROT) from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor [].; GO: 0016901 oxidoreductase activity, acting on the CH-OH group of donors, quinone or similar compound as acceptor, 0048038 quinone binding, 0005975 carbohydrate metabolic process; PDB: 2ISM_A 2WG3_D 3HO5_A 3HO4_A 3HO3_A 2WFT_A 2WG4_B 2WFX_B 1CRU_A 1CQ1_B ....
Probab=95.10 E-value=0.061 Score=53.88 Aligned_cols=74 Identities=20% Similarity=0.321 Sum_probs=54.3
Q ss_pred CCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeC------CCcccceEEEeC---CCCeEEEEeCCC------
Q psy4900 400 LTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSE------AAYKASGVALDI---NAKRLFWCDNLL------ 463 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~------~~~~p~glavD~---~~~~lYW~D~~~------ 463 (485)
|++|++|++.|. |.||.++ ..+|.+...+|.....+... ...-+.|||+|+ .++.||.+-...
T Consensus 1 L~~P~~~a~~pd-G~l~v~e~~G~i~~~~~~g~~~~~v~~~~~v~~~~~~gllgia~~p~f~~n~~lYv~~t~~~~~~~~ 79 (331)
T PF07995_consen 1 LNNPRSMAFLPD-GRLLVAERSGRIWVVDKDGSLKTPVADLPEVFADGERGLLGIAFHPDFASNGYLYVYYTNADEDGGD 79 (331)
T ss_dssp ESSEEEEEEETT-SCEEEEETTTEEEEEETTTEECEEEEE-TTTBTSTTBSEEEEEE-TTCCCC-EEEEEEEEE-TSSSS
T ss_pred CCCceEEEEeCC-CcEEEEeCCceEEEEeCCCcCcceecccccccccccCCcccceeccccCCCCEEEEEEEcccCCCCC
Confidence 578999999998 8999999 79999999888874444332 234578999998 357888766532
Q ss_pred --CcEEEEEccCC
Q psy4900 464 --DYIETVDYEGK 474 (485)
Q Consensus 464 --~~I~~~~~dG~ 474 (485)
.+|.+..++..
T Consensus 80 ~~~~v~r~~~~~~ 92 (331)
T PF07995_consen 80 NDNRVVRFTLSDG 92 (331)
T ss_dssp EEEEEEEEEEETT
T ss_pred cceeeEEEeccCC
Confidence 57888888765
No 49
>KOG1219|consensus
Probab=95.09 E-value=0.032 Score=65.73 Aligned_cols=95 Identities=32% Similarity=0.856 Sum_probs=60.9
Q ss_pred CCCCCCccc--cccCCC-CCCeeeCCCCceeeCCCCCCCCcchhhhccccccccccccceeheeheeeeeeecccCCCCC
Q psy4900 178 DCSLLNCEF--TCQASP-TGGVCQCPEGQKVANDSRTCLLYMKNNLKQAVRSSTVSSHVKLVLLEVYVNVLKVRKLPTTA 254 (485)
Q Consensus 178 ~C~~~~C~~--~C~n~~-~~~~C~C~~G~~l~~~~~~C~d~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 254 (485)
.|...+|++ .|..+| ++|+|.|++.|. ++.|.-.-+
T Consensus 3866 ~C~~npCqhgG~C~~~~~ggy~CkCpsqys----G~~CEi~~e------------------------------------- 3904 (4289)
T KOG1219|consen 3866 PCNDNPCQHGGTCISQPKGGYKCKCPSQYS----GNHCEIDLE------------------------------------- 3904 (4289)
T ss_pred ccccCcccCCCEecCCCCCceEEeCccccc----Ccccccccc-------------------------------------
Confidence 466667765 788877 678999999763 566742211
Q ss_pred CCCCCCCCCCCCCCccccccccccCCCCCCccccCCCCccccCCCCCCCC---CCCCCceEEcCCC-C--cccCceeecC
Q psy4900 255 EPQSPNPCGSNNGGCEHMCIITRASGNALGYKCACDIGYRLSVNGNNCNQ---PTCAPGEFQCASG-R--CVPSTFKCDA 328 (485)
Q Consensus 255 e~~~~~~C~~~~g~C~~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C~~---~~C~~~~~~c~~g-~--ci~~~~~cd~ 328 (485)
|=..++|. +| ..|+... ++|.|.|+.||. |++|+. ++|+.+ .|.+| + +++++|.|+.
T Consensus 3905 -pC~snPC~--~G---gtCip~~-----n~f~CnC~~gyT----G~~Ce~~Gi~eCs~n--~C~~gg~C~n~~gsf~Cnc 3967 (4289)
T KOG1219|consen 3905 -PCASNPCL--TG---GTCIPFY-----NGFLCNCPNGYT----GKRCEARGISECSKN--VCGTGGQCINIPGSFHCNC 3967 (4289)
T ss_pred -cccCCCCC--CC---CEEEecC-----CCeeEeCCCCcc----Cceeecccccccccc--cccCCceeeccCCceEecc
Confidence 10112243 22 3577777 899999999994 788877 346643 24333 3 6678888876
Q ss_pred CC
Q psy4900 329 EN 330 (485)
Q Consensus 329 ~~ 330 (485)
.+
T Consensus 3968 T~ 3969 (4289)
T KOG1219|consen 3968 TP 3969 (4289)
T ss_pred Ch
Confidence 54
No 50
>PF03088 Str_synth: Strictosidine synthase; InterPro: IPR018119 This entry represents a conserved region found in strictosidine synthase (4.3.3.2 from EC), a key enzyme in alkaloid biosynthesis. It catalyses the Pictet-Spengler stereospecific condensation of tryptamine with secologanin to form strictosidine []. The structure of the native enzyme from the Indian medicinal plant Rauvolfia serpentina (Serpentwood) (Devilpepper) represents the first example of a six-bladed four-stranded beta-propeller fold from the plant kingdom [].; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2FPB_A 2V91_B 2FP8_A 3V1S_B 2FPC_A 2VAQ_A 2FP9_B.
Probab=94.97 E-value=0.021 Score=45.14 Aligned_cols=42 Identities=31% Similarity=0.448 Sum_probs=34.9
Q ss_pred ceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCC
Q psy4900 390 QFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGT 431 (485)
Q Consensus 390 ~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~ 431 (485)
+....++..+|.-|.||||.+....|+.++ ..+|.|..|.|.
T Consensus 46 t~~~~vl~~~L~fpNGVals~d~~~vlv~Et~~~Ri~rywl~Gp 89 (89)
T PF03088_consen 46 TKETTVLLDGLYFPNGVALSPDESFVLVAETGRYRILRYWLKGP 89 (89)
T ss_dssp TTEEEEEEEEESSEEEEEE-TTSSEEEEEEGGGTEEEEEESSST
T ss_pred CCeEEEehhCCCccCeEEEcCCCCEEEEEeccCceEEEEEEeCC
Confidence 334455568899999999999999999999 799999999884
No 51
>PF03022 MRJP: Major royal jelly protein; InterPro: IPR003534 The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and 82-90% of the protein content [], of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees [] and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content []. Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that results from high genetic variability. This polymorphism may be useful for genotyping individual bees [].; PDB: 3Q6P_B 3Q6K_A 3Q6T_A 2QE8_B.
Probab=94.68 E-value=0.098 Score=51.24 Aligned_cols=65 Identities=22% Similarity=0.296 Sum_probs=47.5
Q ss_pred CCCccEEecCCCCeEEEeC--CCcEEEEEcCC----CCcEEEEe-CC-CcccceEEEeC-CCCeEEEEeCCCCcE
Q psy4900 401 TNPTDLALDPTSGLMFVAD--SNQILRTNMDG----TMAMSIVS-EA-AYKASGVALDI-NAKRLFWCDNLLDYI 466 (485)
Q Consensus 401 ~~p~~iavD~~~~~lywtd--~~~I~r~~~dG----~~~~~i~~-~~-~~~p~glavD~-~~~~lYW~D~~~~~I 466 (485)
....|+++|+ .|.||+++ ...|.+.+.++ .+.++|+. .. |.||.+|+|+. ..+.||++-...++.
T Consensus 186 ~~s~g~~~D~-~G~ly~~~~~~~aI~~w~~~~~~~~~~~~~l~~d~~~l~~pd~~~i~~~~~g~L~v~snrl~~~ 259 (287)
T PF03022_consen 186 SQSDGMAIDP-NGNLYFTDVEQNAIGCWDPDGPYTPENFEILAQDPRTLQWPDGLKIDPEGDGYLWVLSNRLQRF 259 (287)
T ss_dssp -SECEEEEET-TTEEEEEECCCTEEEEEETTTSB-GCCEEEEEE-CC-GSSEEEEEE-T--TS-EEEEE-S--SS
T ss_pred CCCceEEECC-CCcEEEecCCCCeEEEEeCCCCcCccchheeEEcCceeeccceeeeccccCceEEEEECcchHh
Confidence 4568899998 79999999 78999999998 34455554 34 89999999995 468999987766544
No 52
>KOG2397|consensus
Probab=94.61 E-value=0.031 Score=56.84 Aligned_cols=69 Identities=42% Similarity=0.651 Sum_probs=58.7
Q ss_pred CCeecCCCCceecCcceeCCCCCCCCCCcccCCCCCccCCCcccCCCC----cccCccccCCCCCCCCCCCCCCC
Q psy4900 59 DQFRCANGQKCIDAKLKCNYHNDCGDNSDEEKCNFTACHVGQFKCANS----LCIPVSYHCDGYRDCIDGSDETN 129 (485)
Q Consensus 59 ~~f~C~~g~~Ci~~~~~Cd~~~dC~d~sDe~~C~~~~C~~~~f~C~~~----~Ci~~~~~Cdg~~dC~dgsDe~~ 129 (485)
..|.|.+|..-|+...+=|+.-||.||+||.. ...|..+.|.|.|. .=|+.+.+=||+-||-|||||..
T Consensus 43 ~~~~CLdgs~~i~f~qlNDd~CDC~DGsDEPG--tsACpngkF~C~N~G~~p~~i~ssrV~DGICDCCDgSDE~~ 115 (480)
T KOG2397|consen 43 SMFKCLDGSKTISFSQLNDDSCDCLDGSDEPG--TSACPNGKFYCVNQGHQPKYIPSSRVNDGICDCCDGSDEYL 115 (480)
T ss_pred cceeeccCCcccCHHHhccccccCCCCCCCCc--cccCCCCceeeeecCCCceeeechhccCcccccccCCCCcc
Confidence 47899998888888888899999999999953 35688999999873 45678889999999999999975
No 53
>PRK11028 6-phosphogluconolactonase; Provisional
Probab=94.50 E-value=0.36 Score=48.05 Aligned_cols=73 Identities=11% Similarity=0.188 Sum_probs=56.3
Q ss_pred CCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeC--CCcccceEEEeCCCCeEEEEeCCCCcEEEEEcc
Q psy4900 400 LTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSE--AAYKASGVALDINAKRLFWCDNLLDYIETVDYE 472 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~--~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~d 472 (485)
...|..|+++|...+||.+. ...|....+++.....++.. ....|.+|++++..++||-+....+.|.+.+++
T Consensus 34 ~~~~~~l~~spd~~~lyv~~~~~~~i~~~~~~~~g~l~~~~~~~~~~~p~~i~~~~~g~~l~v~~~~~~~v~v~~~~ 110 (330)
T PRK11028 34 PGQVQPMVISPDKRHLYVGVRPEFRVLSYRIADDGALTFAAESPLPGSPTHISTDHQGRFLFSASYNANCVSVSPLD 110 (330)
T ss_pred CCCCccEEECCCCCEEEEEECCCCcEEEEEECCCCceEEeeeecCCCCceEEEECCCCCEEEEEEcCCCeEEEEEEC
Confidence 46799999999999999987 57787777774443333321 234799999999999999998888888888775
No 54
>PF06977 SdiA-regulated: SdiA-regulated; InterPro: IPR009722 This entry represents a conserved region approximately 100 residues long within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators []. Some proteins contain the IPR001258 from INTERPRO repeat.; PDB: 3QQZ_A.
Probab=94.41 E-value=0.2 Score=47.84 Aligned_cols=76 Identities=20% Similarity=0.207 Sum_probs=51.4
Q ss_pred CccEEecCCCCeEEEeC--C-CcEEEEEc--CCCCcEEEEe-------CCCcccceEEEeCCCCeEEEEeCCCCcEEEEE
Q psy4900 403 PTDLALDPTSGLMFVAD--S-NQILRTNM--DGTMAMSIVS-------EAAYKASGVALDINAKRLFWCDNLLDYIETVD 470 (485)
Q Consensus 403 p~~iavD~~~~~lywtd--~-~~I~r~~~--dG~~~~~i~~-------~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~ 470 (485)
-.|||.|+.++.||.+. . .+|+.+.. .+....+... ..+..|.||++|+.+++||........|...+
T Consensus 120 ~EGla~D~~~~~L~v~kE~~P~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~S~l~~~p~t~~lliLS~es~~l~~~d 199 (248)
T PF06977_consen 120 FEGLAYDPKTNRLFVAKERKPKRLYEVNGFPGGFDLFVSDDQDLDDDKLFVRDLSGLSYDPRTGHLLILSDESRLLLELD 199 (248)
T ss_dssp -EEEEEETTTTEEEEEEESSSEEEEEEESTT-SS--EEEE-HHHH-HT--SS---EEEEETTTTEEEEEETTTTEEEEE-
T ss_pred eEEEEEcCCCCEEEEEeCCCChhhEEEccccCccceeeccccccccccceeccccceEEcCCCCeEEEEECCCCeEEEEC
Confidence 57999999999999887 2 36788877 3333333322 13557999999999999999999999999999
Q ss_pred ccCCCeEE
Q psy4900 471 YEGKNRFL 478 (485)
Q Consensus 471 ~dG~~r~~ 478 (485)
.+|.-+..
T Consensus 200 ~~G~~~~~ 207 (248)
T PF06977_consen 200 RQGRVVSS 207 (248)
T ss_dssp TT--EEEE
T ss_pred CCCCEEEE
Confidence 99986544
No 55
>KOG2397|consensus
Probab=94.35 E-value=0.039 Score=56.08 Aligned_cols=71 Identities=46% Similarity=0.787 Sum_probs=57.7
Q ss_pred cccCCCC-cccCccccCCCCCCCCCCCCCCCCCCCCcCCCceecCCCCCCCCCcccCCCcccCCCCCCCCCccccC
Q psy4900 100 QFKCANS-LCIPVSYHCDGYRDCIDGSDETNCTSIACPNNKFLCPMGAAGGKPKCIPKAQVCDGRKDCEDNADEET 174 (485)
Q Consensus 100 ~f~C~~~-~Ci~~~~~Cdg~~dC~dgsDe~~C~~~~c~~~~~~C~~g~~~~~~~Ci~~~~~Cdg~~dC~d~sDe~~ 174 (485)
.|.|.++ .-|+...+=|+.-||.||+||.. ...|+.+.|.|.|. ++...=|+.+.+=||+-||=|++||..
T Consensus 44 ~~~CLdgs~~i~f~qlNDd~CDC~DGsDEPG--tsACpngkF~C~N~--G~~p~~i~ssrV~DGICDCCDgSDE~~ 115 (480)
T KOG2397|consen 44 MFKCLDGSKTISFSQLNDDSCDCLDGSDEPG--TSACPNGKFYCVNQ--GHQPKYIPSSRVNDGICDCCDGSDEYL 115 (480)
T ss_pred ceeeccCCcccCHHHhccccccCCCCCCCCc--cccCCCCceeeeec--CCCceeeechhccCcccccccCCCCcc
Confidence 6788765 45677788888899999999954 34677889999886 555577889999999999999999964
No 56
>KOG3509|consensus
Probab=94.27 E-value=0.045 Score=60.81 Aligned_cols=104 Identities=37% Similarity=0.844 Sum_probs=86.9
Q ss_pred ceecCCcccCCcCCCCCCCcCCCCCCCCCCCCCCCCeecCCCCceecCcceeCCCCCCCCCCcccCCC----CCccCCCc
Q psy4900 25 ICIPKDKRCDGYYDCRNRKDEEGCPATTGLSCDLDQFRCANGQKCIDAKLKCNYHNDCGDNSDEEKCN----FTACHVGQ 100 (485)
Q Consensus 25 ~ci~~~~~Cd~~~dC~d~sdE~~C~~~~~~~C~~~~f~C~~g~~Ci~~~~~Cd~~~dC~d~sDe~~C~----~~~C~~~~ 100 (485)
.|......|++..++.+.+|+.+++. ....+++..|.|.+++ +....+.|+....+..++++.+|. ...|.+..
T Consensus 2 ~c~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~p~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~ 79 (964)
T KOG3509|consen 2 ECVKNRYACDRQPDCRDRSDVANDPA-IGSACSPNEFKCNNPR-CVQPEALLDADSTCGPNSTPSGCNAKPSASDCKPTE 79 (964)
T ss_pred chhhhhhhhccchhhHhhcccCCCcc-ccccCCcchhccCCcc-ccCchhhhccccccCCCCCcCCccccccccccCCcc
Confidence 35556678999999999999887764 2245789999999998 999999999999999999777765 34678889
Q ss_pred ccCCCC-cccCccccCCCCCCCCCCCCCCCC
Q psy4900 101 FKCANS-LCIPVSYHCDGYRDCIDGSDETNC 130 (485)
Q Consensus 101 f~C~~~-~Ci~~~~~Cdg~~dC~dgsDe~~C 130 (485)
++|.+- ++.+.+..|+|.++|.++++|..+
T Consensus 80 ~~c~~~~~~~~~~~~~~g~~~~~~~~~~~~~ 110 (964)
T KOG3509|consen 80 TQCRDRLRCNPQSFQCDGTNDCKDGSDEVGC 110 (964)
T ss_pred cccccchhcCCccccccCCCCCCccchhccc
Confidence 999886 788999999999999999988654
No 57
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C terminus alpha-amidation of biological peptides []. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators []. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller [, ]. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=94.04 E-value=0.064 Score=32.44 Aligned_cols=25 Identities=28% Similarity=0.614 Sum_probs=20.6
Q ss_pred CCCCccEEecCCCCeEEEeC--CCcEEE
Q psy4900 400 LTNPTDLALDPTSGLMFVAD--SNQILR 425 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd--~~~I~r 425 (485)
+..|.|||+| ..|.||.+| .++|.+
T Consensus 1 f~~P~gvav~-~~g~i~VaD~~n~rV~v 27 (28)
T PF01436_consen 1 FNYPHGVAVD-SDGNIYVADSGNHRVQV 27 (28)
T ss_dssp BSSEEEEEEE-TTSEEEEEECCCTEEEE
T ss_pred CcCCcEEEEe-CCCCEEEEECCCCEEEE
Confidence 4679999999 779999999 566654
No 58
>KOG3509|consensus
Probab=93.50 E-value=0.096 Score=58.32 Aligned_cols=102 Identities=29% Similarity=0.646 Sum_probs=83.6
Q ss_pred eecCcceeCCCCCCCCCCcccCCC--CCccCCCcccCCCCcccCccccCCCCCCCCCCCCCCCCCC----CCcCCCceec
Q psy4900 69 CIDAKLKCNYHNDCGDNSDEEKCN--FTACHVGQFKCANSLCIPVSYHCDGYRDCIDGSDETNCTS----IACPNNKFLC 142 (485)
Q Consensus 69 Ci~~~~~Cd~~~dC~d~sDe~~C~--~~~C~~~~f~C~~~~Ci~~~~~Cdg~~dC~dgsDe~~C~~----~~c~~~~~~C 142 (485)
|......|++..++.+.+|+.++. ...+.+.++.|.++++...-+.|+....+..++++.+|.. ..|....+.|
T Consensus 3 c~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~~~c 82 (964)
T KOG3509|consen 3 CVKNRYACDRQPDCRDRSDVANDPAIGSACSPNEFKCNNPRCVQPEALLDADSTCGPNSTPSGCNAKPSASDCKPTETQC 82 (964)
T ss_pred hhhhhhhhccchhhHhhcccCCCccccccCCcchhccCCccccCchhhhccccccCCCCCcCCccccccccccCCccccc
Confidence 677788999999999999998765 5678899999999999999999999999999997777742 2344445666
Q ss_pred CCCCCCCCCcccCCCcccCCCCCCCCCccccCc
Q psy4900 143 PMGAAGGKPKCIPKAQVCDGRKDCEDNADEETV 175 (485)
Q Consensus 143 ~~g~~~~~~~Ci~~~~~Cdg~~dC~d~sDe~~~ 175 (485)
.+- .++.+.+..|+|.++|.+++++...
T Consensus 83 ~~~-----~~~~~~~~~~~g~~~~~~~~~~~~~ 110 (964)
T KOG3509|consen 83 RDR-----LRCNPQSFQCDGTNDCKDGSDEVGC 110 (964)
T ss_pred ccc-----hhcCCccccccCCCCCCccchhccc
Confidence 654 3678889999999999999998663
No 59
>TIGR03606 non_repeat_PQQ dehydrogenase, PQQ-dependent, s-GDH family. PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis.
Probab=93.09 E-value=0.43 Score=49.53 Aligned_cols=41 Identities=22% Similarity=0.370 Sum_probs=36.1
Q ss_pred cEEEEEcCCCC----------cEEEEeCCCcccceEEEeCCCCeEEEEeCCC
Q psy4900 422 QILRTNMDGTM----------AMSIVSEAAYKASGVALDINAKRLFWCDNLL 463 (485)
Q Consensus 422 ~I~r~~~dG~~----------~~~i~~~~~~~p~glavD~~~~~lYW~D~~~ 463 (485)
+|.|++.||+- +..|...++.+|.|||+|+ +++||.+|.+.
T Consensus 200 kILRin~DGsiP~dNPf~~g~~~eIyA~G~RNp~Gla~dp-~G~Lw~~e~Gp 250 (454)
T TIGR03606 200 KVLRLNLDGSIPKDNPSINGVVSHIFTYGHRNPQGLAFTP-DGTLYASEQGP 250 (454)
T ss_pred EEEEEcCCCCCCCCCCccCCCcceEEEEeccccceeEECC-CCCEEEEecCC
Confidence 79999999973 4578889999999999998 89999999875
No 60
>COG3391 Uncharacterized conserved protein [Function unknown]
Probab=93.04 E-value=0.91 Score=46.42 Aligned_cols=112 Identities=17% Similarity=0.156 Sum_probs=78.7
Q ss_pred cccCCCCccCCCCC--CccccccCcccccceEEE----EecCCCCCCccEEecCCCCeEEEeC-C---CcEEEEEcCCCC
Q psy4900 363 WKCDSENDCGDGSD--EGDFCSEKTCAYFQFHAI----VLGSNLTNPTDLALDPTSGLMFVAD-S---NQILRTNMDGTM 432 (485)
Q Consensus 363 ~~~~~~~~y~~d~~--~I~~~~~~~c~~g~~~~~----l~~~~~~~p~~iavD~~~~~lywtd-~---~~I~r~~~dG~~ 432 (485)
..+++.++|.++.. .|.+.+.. +..... ....-...|.++++++...++|-++ . ..|.+.+.....
T Consensus 167 ~~p~g~~vyv~~~~~~~v~vi~~~----~~~v~~~~~~~~~~~~~~P~~i~v~~~g~~~yV~~~~~~~~~v~~id~~~~~ 242 (381)
T COG3391 167 VDPDGNKVYVTNSDDNTVSVIDTS----GNSVVRGSVGSLVGVGTGPAGIAVDPDGNRVYVANDGSGSNNVLKIDTATGN 242 (381)
T ss_pred ECCCCCeEEEEecCCCeEEEEeCC----CcceeccccccccccCCCCceEEECCCCCEEEEEeccCCCceEEEEeCCCce
Confidence 45666778888844 66666655 443321 0124467899999999999999999 2 367777666544
Q ss_pred cEEE--EeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEE
Q psy4900 433 AMSI--VSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLI 479 (485)
Q Consensus 433 ~~~i--~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~ 479 (485)
.... ....+ +|.++++++..+.+|.++...+.+.+.+.........
T Consensus 243 v~~~~~~~~~~-~~~~v~~~p~g~~~yv~~~~~~~V~vid~~~~~v~~~ 290 (381)
T COG3391 243 VTATDLPVGSG-APRGVAVDPAGKAAYVANSQGGTVSVIDGATDRVVKT 290 (381)
T ss_pred EEEeccccccC-CCCceeECCCCCEEEEEecCCCeEEEEeCCCCceeee
Confidence 3332 22345 8999999999999999988888888887776554443
No 61
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=93.03 E-value=0.11 Score=33.70 Aligned_cols=28 Identities=39% Similarity=1.067 Sum_probs=21.8
Q ss_pred CCccc--cccccccCCCCCCccccCCCCccccCCCCCC
Q psy4900 267 GGCEH--MCIITRASGNALGYKCACDIGYRLSVNGNNC 302 (485)
Q Consensus 267 g~C~~--~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C 302 (485)
..|.+ .|+++. ++|+|.|+.||. +++.|
T Consensus 9 ~~C~~~~~C~~~~-----g~~~C~C~~g~~---~g~~C 38 (39)
T smart00179 9 NPCQNGGTCVNTV-----GSYRCECPPGYT---DGRNC 38 (39)
T ss_pred CCcCCCCEeECCC-----CCeEeECCCCCc---cCCcC
Confidence 45654 799988 999999999996 45555
No 62
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=92.94 E-value=0.11 Score=33.70 Aligned_cols=23 Identities=39% Similarity=0.924 Sum_probs=19.1
Q ss_pred cccCCCCCCeeeCCCCceeeCCCCCC
Q psy4900 187 TCQASPTGGVCQCPEGQKVANDSRTC 212 (485)
Q Consensus 187 ~C~n~~~~~~C~C~~G~~l~~~~~~C 212 (485)
.|.+++++|.|.|++||. +++.|
T Consensus 16 ~C~~~~g~~~C~C~~g~~---~g~~C 38 (39)
T smart00179 16 TCVNTVGSYRCECPPGYT---DGRNC 38 (39)
T ss_pred EeECCCCCeEeECCCCCc---cCCcC
Confidence 799999999999999986 34444
No 63
>KOG1219|consensus
Probab=92.91 E-value=0.14 Score=60.82 Aligned_cols=67 Identities=28% Similarity=0.700 Sum_probs=48.7
Q ss_pred CCCCCc--cccccCCCCCCeeeCCCCceeeCCCCCCCC--cchhhhccccccccccccceeheeheeeeeeecccCCCCC
Q psy4900 179 CSLLNC--EFTCQASPTGGVCQCPEGQKVANDSRTCLL--YMKNNLKQAVRSSTVSSHVKLVLLEVYVNVLKVRKLPTTA 254 (485)
Q Consensus 179 C~~~~C--~~~C~n~~~~~~C~C~~G~~l~~~~~~C~d--~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 254 (485)
|...+| ..+|...+++|.|.|+.||. ++.|.. +++
T Consensus 3906 C~snPC~~GgtCip~~n~f~CnC~~gyT----G~~Ce~~Gi~e------------------------------------- 3944 (4289)
T KOG1219|consen 3906 CASNPCLTGGTCIPFYNGFLCNCPNGYT----GKRCEARGISE------------------------------------- 3944 (4289)
T ss_pred ccCCCCCCCCEEEecCCCeeEeCCCCcc----Cceeecccccc-------------------------------------
Confidence 666666 35899999999999999984 456642 233
Q ss_pred CCCCCCCCCCCCCCcc--ccccccccCCCCCCccccCCCCccccCCCCCCCC
Q psy4900 255 EPQSPNPCGSNNGGCE--HMCIITRASGNALGYKCACDIGYRLSVNGNNCNQ 304 (485)
Q Consensus 255 e~~~~~~C~~~~g~C~--~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C~~ 304 (485)
|+ ...|. ..|+|++ |+|.|.|-+||. +++|.+
T Consensus 3945 -------Cs--~n~C~~gg~C~n~~-----gsf~CncT~g~~----gr~c~~ 3978 (4289)
T KOG1219|consen 3945 -------CS--KNVCGTGGQCINIP-----GSFHCNCTPGIL----GRTCCA 3978 (4289)
T ss_pred -------cc--cccccCCceeeccC-----CceEeccChhHh----cccCcc
Confidence 54 33443 4799999 999999999995 566643
No 64
>PRK04043 tolB translocation protein TolB; Provisional
Probab=92.89 E-value=0.77 Score=47.57 Aligned_cols=113 Identities=5% Similarity=-0.016 Sum_probs=78.6
Q ss_pred ccccccCCCCccCCCCC----CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--C--------CcEEE
Q psy4900 360 PSTWKCDSENDCGDGSD----EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--S--------NQILR 425 (485)
Q Consensus 360 ~~~~~~~~~~~y~~d~~----~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~--------~~I~r 425 (485)
...|.|++..+|++... .|++.+++ +...+.+...+..+ .++.|..++|.|+. . ..|+.
T Consensus 281 ~p~~SPDG~~I~F~Sdr~g~~~Iy~~dl~----~g~~~rlt~~g~~~---~~~SPDG~~Ia~~~~~~~~~~~~~~~~I~v 353 (419)
T PRK04043 281 NGNFVEDDKRIVFVSDRLGYPNIFMKKLN----SGSVEQVVFHGKNN---SSVSTYKNYIVYSSRETNNEFGKNTFNLYL 353 (419)
T ss_pred ccEECCCCCEEEEEECCCCCceEEEEECC----CCCeEeCccCCCcC---ceECCCCCEEEEEEcCCCcccCCCCcEEEE
Confidence 35688999888886432 78888887 65554444333222 37888888888886 2 37999
Q ss_pred EEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCC--CCcEEEEEccCCCeEEEec
Q psy4900 426 TNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNL--LDYIETVDYEGKNRFLILR 481 (485)
Q Consensus 426 ~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~--~~~I~~~~~dG~~r~~~~~ 481 (485)
+++++...+.|..... -...++.+..++||++... ...|..++++|..+..|..
T Consensus 354 ~d~~~g~~~~LT~~~~--~~~p~~SPDG~~I~f~~~~~~~~~L~~~~l~g~~~~~l~~ 409 (419)
T PRK04043 354 ISTNSDYIRRLTANGV--NQFPRFSSDGGSIMFIKYLGNQSALGIIRLNYNKSFLFPL 409 (419)
T ss_pred EECCCCCeEECCCCCC--cCCeEECCCCCEEEEEEccCCcEEEEEEecCCCeeEEeec
Confidence 9999887666654432 2245666778899988654 3469999999987776653
No 65
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=91.95 E-value=0.044 Score=35.38 Aligned_cols=31 Identities=45% Similarity=0.987 Sum_probs=20.9
Q ss_pred CCCCcc--ccccccccCCCCCCccccCCCCccccCCCCCC
Q psy4900 265 NNGGCE--HMCIITRASGNALGYKCACDIGYRLSVNGNNC 302 (485)
Q Consensus 265 ~~g~C~--~~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C 302 (485)
.++.|. ..|++++ ++|.|.|.+||. .|+..|
T Consensus 4 ~~~~C~~nA~C~~~~-----~~~~C~C~~Gy~--GdG~~C 36 (36)
T PF12947_consen 4 NNGGCHPNATCTNTG-----GSYTCTCKPGYE--GDGFFC 36 (36)
T ss_dssp GGGGS-TTCEEEE-T-----TSEEEEE-CEEE--CCSTCE
T ss_pred CCCCCCCCcEeecCC-----CCEEeECCCCCc--cCCcCC
Confidence 345564 5799999 899999999996 455543
No 66
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions, thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=91.89 E-value=0.12 Score=48.66 Aligned_cols=43 Identities=23% Similarity=0.619 Sum_probs=31.9
Q ss_pred CcccCCCCCCCCCccccCcccCCCCCCccccccCCCCCCeeeCCCCceeeCCCC
Q psy4900 157 AQVCDGRKDCEDNADEETVCCDCSLLNCEFTCQASPTGGVCQCPEGQKVANDSR 210 (485)
Q Consensus 157 ~~~Cdg~~dC~d~sDe~~~C~~C~~~~C~~~C~n~~~~~~C~C~~G~~l~~~~~ 210 (485)
...|.+...|.... +.|.+.|.+++|+|.|.|++||.+..+++
T Consensus 181 ~~~C~~~~~C~~~~-----------~~c~~~C~~~~g~~~c~c~~g~~~~~~~~ 223 (224)
T cd01475 181 GKICVVPDLCATLS-----------HVCQQVCISTPGSYLCACTEGYALLEDNK 223 (224)
T ss_pred cccCcCchhhcCCC-----------CCccceEEcCCCCEEeECCCCccCCCCCC
Confidence 35676666665431 24678899999999999999998866554
No 67
>PRK04922 tolB translocation protein TolB; Provisional
Probab=91.88 E-value=1.4 Score=45.75 Aligned_cols=115 Identities=10% Similarity=0.035 Sum_probs=74.5
Q ss_pred cccccCCCCccCCC-C-C--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDG-S-D--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~-~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.+++.+++++. . + .|.+.++. +.....+ ..........++.|...+|+++. ...|+..++++..
T Consensus 253 ~~~SpDG~~l~~~~s~~g~~~Iy~~d~~----~g~~~~l-t~~~~~~~~~~~spDG~~l~f~sd~~g~~~iy~~dl~~g~ 327 (433)
T PRK04922 253 PSFSPDGRRLALTLSRDGNPEIYVMDLG----SRQLTRL-TNHFGIDTEPTWAPDGKSIYFTSDRGGRPQIYRVAASGGS 327 (433)
T ss_pred ceECCCCCEEEEEEeCCCCceEEEEECC----CCCeEEC-ccCCCCccceEECCCCCEEEEEECCCCCceEEEEECCCCC
Confidence 45778888776542 1 1 67777777 5444333 23333345678888877777764 3579999998766
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEec
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLILR 481 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~~ 481 (485)
.+.|...+ .....+++.+..+.||++.... ..|...++++...+.|..
T Consensus 328 ~~~lt~~g-~~~~~~~~SpDG~~Ia~~~~~~~~~~I~v~d~~~g~~~~Lt~ 377 (433)
T PRK04922 328 AERLTFQG-NYNARASVSPDGKKIAMVHGSGGQYRIAVMDLSTGSVRTLTP 377 (433)
T ss_pred eEEeecCC-CCccCEEECCCCCEEEEEECCCCceeEEEEECCCCCeEECCC
Confidence 55554322 3344678888889999876533 368899988776555443
No 68
>PRK04043 tolB translocation protein TolB; Provisional
Probab=91.80 E-value=1.2 Score=46.10 Aligned_cols=115 Identities=3% Similarity=-0.082 Sum_probs=75.6
Q ss_pred ccccccCCCCccCCCC----CCccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCC
Q psy4900 360 PSTWKCDSENDCGDGS----DEGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGT 431 (485)
Q Consensus 360 ~~~~~~~~~~~y~~d~----~~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~ 431 (485)
...|.|++.++.++.. ..|.+.+++ +...+.|.... ..-....+.|...+|||+. .+.|++.+++|.
T Consensus 237 ~~~~SPDG~~la~~~~~~g~~~Iy~~dl~----~g~~~~LT~~~-~~d~~p~~SPDG~~I~F~Sdr~g~~~Iy~~dl~~g 311 (419)
T PRK04043 237 VSDVSKDGSKLLLTMAPKGQPDIYLYDTN----TKTLTQITNYP-GIDVNGNFVEDDKRIVFVSDRLGYPNIFMKKLNSG 311 (419)
T ss_pred eeEECCCCCEEEEEEccCCCcEEEEEECC----CCcEEEcccCC-CccCccEECCCCCEEEEEECCCCCceEEEEECCCC
Confidence 3457888877766432 277777877 55544433221 1122345777777888886 468999999988
Q ss_pred CcEEEEeCCCcccceEEEeCCCCeEEEEeCCC--------CcEEEEEccCCCeEEEecC
Q psy4900 432 MAMSIVSEAAYKASGVALDINAKRLFWCDNLL--------DYIETVDYEGKNRFLILRG 482 (485)
Q Consensus 432 ~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--------~~I~~~~~dG~~r~~~~~~ 482 (485)
..+.|...+... .++.+..++|.++-... ..|.+++++|...+.|..+
T Consensus 312 ~~~rlt~~g~~~---~~~SPDG~~Ia~~~~~~~~~~~~~~~~I~v~d~~~g~~~~LT~~ 367 (419)
T PRK04043 312 SVEQVVFHGKNN---SSVSTYKNYIVYSSRETNNEFGKNTFNLYLISTNSDYIRRLTAN 367 (419)
T ss_pred CeEeCccCCCcC---ceECCCCCEEEEEEcCCCcccCCCCcEEEEEECCCCCeEECCCC
Confidence 776666443322 36778889888886543 4799999998876665543
No 69
>PRK04792 tolB translocation protein TolB; Provisional
Probab=91.10 E-value=1.9 Score=45.14 Aligned_cols=114 Identities=11% Similarity=0.055 Sum_probs=74.2
Q ss_pred cccccCCCCccCCC-CC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDG-SD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.|++..++++. .+ .|.+.+++ +.....+. .........++.|...+|+++. ...|++.++++..
T Consensus 267 ~~wSPDG~~La~~~~~~g~~~Iy~~dl~----tg~~~~lt-~~~~~~~~p~wSpDG~~I~f~s~~~g~~~Iy~~dl~~g~ 341 (448)
T PRK04792 267 PRFSPDGKKLALVLSKDGQPEIYVVDIA----TKALTRIT-RHRAIDTEPSWHPDGKSLIFTSERGGKPQIYRVNLASGK 341 (448)
T ss_pred eeECCCCCEEEEEEeCCCCeEEEEEECC----CCCeEECc-cCCCCccceEECCCCCEEEEEECCCCCceEEEEECCCCC
Confidence 45778888787642 22 57777776 54443333 3333455677888888888765 4689999998766
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.+.|.. ...+..+.++.+..+.||.+.... ..|.+.++++...+.|.
T Consensus 342 ~~~Lt~-~g~~~~~~~~SpDG~~l~~~~~~~g~~~I~~~dl~~g~~~~lt 390 (448)
T PRK04792 342 VSRLTF-EGEQNLGGSITPDGRSMIMVNRTNGKFNIARQDLETGAMQVLT 390 (448)
T ss_pred EEEEec-CCCCCcCeeECCCCCEEEEEEecCCceEEEEEECCCCCeEEcc
Confidence 555542 223344567888888998876544 46888888887655543
No 70
>TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol.
Probab=91.04 E-value=4.3 Score=38.87 Aligned_cols=75 Identities=13% Similarity=0.191 Sum_probs=55.7
Q ss_pred CCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeE
Q psy4900 402 NPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRF 477 (485)
Q Consensus 402 ~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~ 477 (485)
.|.+|+++|...++|.+. ..+|...++........+. .-..|.+|++.+..++||-+....+.|.+.++.+....
T Consensus 208 ~~~~i~~s~dg~~~~~~~~~~~~i~v~d~~~~~~~~~~~-~~~~~~~~~~~~~g~~l~~~~~~~~~i~v~d~~~~~~~ 284 (300)
T TIGR03866 208 QPVGIKLTKDGKTAFVALGPANRVAVVDAKTYEVLDYLL-VGQRVWQLAFTPDEKYLLTTNGVSNDVSVIDVAALKVI 284 (300)
T ss_pred CccceEECCCCCEEEEEcCCCCeEEEEECCCCcEEEEEE-eCCCcceEEECCCCCEEEEEcCCCCeEEEEECCCCcEE
Confidence 578899999988889875 5678877776433222222 22468899999989999988777889999999977643
No 71
>smart00181 EGF Epidermal growth factor-like domain.
Probab=90.95 E-value=0.22 Score=31.47 Aligned_cols=23 Identities=35% Similarity=0.810 Sum_probs=19.6
Q ss_pred Cccc-cccCCCCCCeeeCCCCcee
Q psy4900 183 NCEF-TCQASPTGGVCQCPEGQKV 205 (485)
Q Consensus 183 ~C~~-~C~n~~~~~~C~C~~G~~l 205 (485)
.|.+ .|.++.++|.|.|++||.+
T Consensus 7 ~C~~~~C~~~~~~~~C~C~~g~~g 30 (35)
T smart00181 7 PCSNGTCINTPGSYTCSCPPGYTG 30 (35)
T ss_pred CCCCCEEECCCCCeEeECCCCCcc
Confidence 4555 7999999999999999965
No 72
>KOG4260|consensus
Probab=90.90 E-value=0.16 Score=47.73 Aligned_cols=47 Identities=26% Similarity=0.667 Sum_probs=37.3
Q ss_pred eCCCCceeeCCCCCCCCcchhhhccccccccccccceeheeheeeeeeecccCCCCCCCCCCCCCCCCCCCc--cccccc
Q psy4900 198 QCPEGQKVANDSRTCLLYMKNNLKQAVRSSTVSSHVKLVLLEVYVNVLKVRKLPTTAEPQSPNPCGSNNGGC--EHMCII 275 (485)
Q Consensus 198 ~C~~G~~l~~~~~~C~d~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~~~~C~~~~g~C--~~~C~n 275 (485)
.|..||.|. ..-|.|++| |......| +|+|+|
T Consensus 221 kCkkGW~ld--e~gCvDvnE--------------------------------------------C~~ep~~c~~~qfCvN 254 (350)
T KOG4260|consen 221 KCKKGWKLD--EEGCVDVNE--------------------------------------------CQNEPAPCKAHQFCVN 254 (350)
T ss_pred hhcccceec--ccccccHHH--------------------------------------------HhcCCCCCChhheeec
Confidence 488999983 567999999 55333345 489999
Q ss_pred cccCCCCCCccccCCCCccc
Q psy4900 276 TRASGNALGYKCACDIGYRL 295 (485)
Q Consensus 276 ~~~~~~~g~~~C~C~~Gy~l 295 (485)
+. |||.|.+.+||+-
T Consensus 255 te-----GSf~C~dk~Gy~~ 269 (350)
T KOG4260|consen 255 TE-----GSFKCEDKEGYKK 269 (350)
T ss_pred CC-----CceEecccccccC
Confidence 99 9999999999964
No 73
>PRK00178 tolB translocation protein TolB; Provisional
Probab=90.57 E-value=2.8 Score=43.46 Aligned_cols=115 Identities=14% Similarity=0.062 Sum_probs=73.0
Q ss_pred cccccCCCCccCCCC-C---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDGS-D---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d~-~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.|++..++++-. + .|.+.+++ +.....+. .........++.|..+.||++. ...|++.++++..
T Consensus 248 ~~~SpDG~~la~~~~~~g~~~Iy~~d~~----~~~~~~lt-~~~~~~~~~~~spDg~~i~f~s~~~g~~~iy~~d~~~g~ 322 (430)
T PRK00178 248 PAWSPDGSKLAFVLSKDGNPEIYVMDLA----SRQLSRVT-NHPAIDTEPFWGKDGRTLYFTSDRGGKPQIYKVNVNGGR 322 (430)
T ss_pred eEECCCCCEEEEEEccCCCceEEEEECC----CCCeEEcc-cCCCCcCCeEECCCCCEEEEEECCCCCceEEEEECCCCC
Confidence 457788887775322 1 67777777 55443332 3333344567777777787765 3689999998766
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEec
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLILR 481 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~~ 481 (485)
.+.|.... ......++.+..+.||++.... ..|.+.++++...+.|..
T Consensus 323 ~~~lt~~~-~~~~~~~~Spdg~~i~~~~~~~~~~~l~~~dl~tg~~~~lt~ 372 (430)
T PRK00178 323 AERVTFVG-NYNARPRLSADGKTLVMVHRQDGNFHVAAQDLQRGSVRILTD 372 (430)
T ss_pred EEEeecCC-CCccceEECCCCCEEEEEEccCCceEEEEEECCCCCEEEccC
Confidence 55554322 2223456777888999887543 368888888876665543
No 74
>PRK01029 tolB translocation protein TolB; Provisional
Probab=90.54 E-value=2.7 Score=43.66 Aligned_cols=117 Identities=11% Similarity=0.045 Sum_probs=75.4
Q ss_pred cccccCCCCccCCC-CC---CccccccCcccccc-eEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCC
Q psy4900 361 STWKCDSENDCGDG-SD---EGDFCSEKTCAYFQ-FHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGT 431 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~-~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~ 431 (485)
..|.|++..|+++. .+ .|.+.+++ +. .....+..........++.|..++|+|+. ...|+..++++.
T Consensus 286 p~wSPDG~~Laf~s~~~g~~~ly~~~~~----~~g~~~~~lt~~~~~~~~p~wSPDG~~Laf~~~~~g~~~I~v~dl~~g 361 (428)
T PRK01029 286 PSFSPDGTRLVFVSNKDGRPRIYIMQID----PEGQSPRLLTKKYRNSSCPAWSPDGKKIAFCSVIKGVRQICVYDLATG 361 (428)
T ss_pred eEECCCCCEEEEEECCCCCceEEEEECc----ccccceEEeccCCCCccceeECCCCCEEEEEEcCCCCcEEEEEECCCC
Confidence 35788887776642 22 56665554 32 12233333333445677888888888875 357999999888
Q ss_pred CcEEEEeCCCcccceEEEeCCCCeEEEEeCC--CCcEEEEEccCCCeEEEecC
Q psy4900 432 MAMSIVSEAAYKASGVALDINAKRLFWCDNL--LDYIETVDYEGKNRFLILRG 482 (485)
Q Consensus 432 ~~~~i~~~~~~~p~glavD~~~~~lYW~D~~--~~~I~~~~~dG~~r~~~~~~ 482 (485)
..+.|... .......+..+..+.||++-.. ...|+..+++|...+.|..+
T Consensus 362 ~~~~Lt~~-~~~~~~p~wSpDG~~L~f~~~~~g~~~L~~vdl~~g~~~~Lt~~ 413 (428)
T PRK01029 362 RDYQLTTS-PENKESPSWAIDSLHLVYSAGNSNESELYLISLITKKTRKIVIG 413 (428)
T ss_pred CeEEccCC-CCCccceEECCCCCEEEEEECCCCCceEEEEECCCCCEEEeecC
Confidence 76665533 2344566776777888886543 46799999999887776653
No 75
>smart00181 EGF Epidermal growth factor-like domain.
Probab=90.47 E-value=0.26 Score=31.12 Aligned_cols=24 Identities=42% Similarity=0.917 Sum_probs=19.9
Q ss_pred CCccc-cccccccCCCCCCccccCCCCccc
Q psy4900 267 GGCEH-MCIITRASGNALGYKCACDIGYRL 295 (485)
Q Consensus 267 g~C~~-~C~n~~~~~~~g~~~C~C~~Gy~l 295 (485)
..|.+ .|+++. ++|+|.|+.||.+
T Consensus 6 ~~C~~~~C~~~~-----~~~~C~C~~g~~g 30 (35)
T smart00181 6 GPCSNGTCINTP-----GSYTCSCPPGYTG 30 (35)
T ss_pred CCCCCCEEECCC-----CCeEeECCCCCcc
Confidence 45665 788887 9999999999975
No 76
>TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. The Tol-Pal system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Studies of its structure suggest that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism is unclear.
Probab=90.45 E-value=3.1 Score=42.72 Aligned_cols=114 Identities=11% Similarity=0.028 Sum_probs=73.2
Q ss_pred cccccCCCCccCCCC-C---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDGS-D---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d~-~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.+++..++++.. + .|.+.++. +.....+. .........++.|...+|+|+. ...|+..++++..
T Consensus 239 ~~~spDg~~l~~~~~~~~~~~i~~~d~~----~~~~~~l~-~~~~~~~~~~~s~dg~~l~~~s~~~g~~~iy~~d~~~~~ 313 (417)
T TIGR02800 239 PAFSPDGSKLAVSLSKDGNPDIYVMDLD----GKQLTRLT-NGPGIDTEPSWSPDGKSIAFTSDRGGSPQIYMMDADGGE 313 (417)
T ss_pred eEECCCCCEEEEEECCCCCccEEEEECC----CCCEEECC-CCCCCCCCEEECCCCCEEEEEECCCCCceEEEEECCCCC
Confidence 356777777766422 1 57777766 44433333 2222234557777777888865 3589999999876
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.+.|.. .......+++.+..++|+++.... .+|...++++...+.|.
T Consensus 314 ~~~l~~-~~~~~~~~~~spdg~~i~~~~~~~~~~~i~~~d~~~~~~~~l~ 362 (417)
T TIGR02800 314 VRRLTF-RGGYNASPSWSPDGDLIAFVHREGGGFNIAVMDLDGGGERVLT 362 (417)
T ss_pred EEEeec-CCCCccCeEECCCCCEEEEEEccCCceEEEEEeCCCCCeEEcc
Confidence 555543 334456778888889999987654 37888888876655543
No 77
>PRK04792 tolB translocation protein TolB; Provisional
Probab=90.44 E-value=2.1 Score=44.80 Aligned_cols=113 Identities=9% Similarity=0.088 Sum_probs=72.2
Q ss_pred cccccCCCCccCCCC-C---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDGS-D---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d~-~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.+++..++++.. . .|.+.+++ +.....|.... ....+.++.|..++||++. ...|++.++++..
T Consensus 311 p~wSpDG~~I~f~s~~~g~~~Iy~~dl~----~g~~~~Lt~~g-~~~~~~~~SpDG~~l~~~~~~~g~~~I~~~dl~~g~ 385 (448)
T PRK04792 311 PSWHPDGKSLIFTSERGGKPQIYRVNLA----SGKVSRLTFEG-EQNLGGSITPDGRSMIMVNRTNGKFNIARQDLETGA 385 (448)
T ss_pred eEECCCCCEEEEEECCCCCceEEEEECC----CCCEEEEecCC-CCCcCeeECCCCCEEEEEEecCCceEEEEEECCCCC
Confidence 356778777766432 2 67777766 44333332222 2234568888888998876 2478888988876
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.+.|..... ....++.+..++|||+.... ..|+.++++|..++.|-
T Consensus 386 ~~~lt~~~~--d~~ps~spdG~~I~~~~~~~g~~~l~~~~~~G~~~~~l~ 433 (448)
T PRK04792 386 MQVLTSTRL--DESPSVAPNGTMVIYSTTYQGKQVLAAVSIDGRFKARLP 433 (448)
T ss_pred eEEccCCCC--CCCceECCCCCEEEEEEecCCceEEEEEECCCCceEECc
Confidence 554443222 22336667789999877544 45888999999877764
No 78
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
Probab=90.43 E-value=3.4 Score=40.80 Aligned_cols=107 Identities=7% Similarity=0.079 Sum_probs=73.8
Q ss_pred ccccCCCCccCCCCC--CccccccCcccccceEE---EEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCC-c
Q psy4900 362 TWKCDSENDCGDGSD--EGDFCSEKTCAYFQFHA---IVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTM-A 433 (485)
Q Consensus 362 ~~~~~~~~~y~~d~~--~I~~~~~~~c~~g~~~~---~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~-~ 433 (485)
...|+++.++..|.+ +|.+.+++ .|.... ..+ ..-.-||-|+++|.....|-.. ..+|.....++.. +
T Consensus 151 ~~tP~~~~l~v~DLG~Dri~~y~~~---dg~L~~~~~~~v-~~G~GPRHi~FHpn~k~aY~v~EL~stV~v~~y~~~~g~ 226 (346)
T COG2706 151 NFTPDGRYLVVPDLGTDRIFLYDLD---DGKLTPADPAEV-KPGAGPRHIVFHPNGKYAYLVNELNSTVDVLEYNPAVGK 226 (346)
T ss_pred eeCCCCCEEEEeecCCceEEEEEcc---cCcccccccccc-CCCCCcceEEEcCCCcEEEEEeccCCEEEEEEEcCCCce
Confidence 345666666666655 55555544 133221 122 3344599999999999999988 8999999999852 1
Q ss_pred -E---EEEe-----CCCcccceEEEeCCCCeEEEEeCCCCcEEEEEcc
Q psy4900 434 -M---SIVS-----EAAYKASGVALDINAKRLFWCDNLLDYIETVDYE 472 (485)
Q Consensus 434 -~---~i~~-----~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~d 472 (485)
+ +|-. .+-.|..+|.|....+.||-++++.+.|.+..++
T Consensus 227 ~~~lQ~i~tlP~dF~g~~~~aaIhis~dGrFLYasNRg~dsI~~f~V~ 274 (346)
T COG2706 227 FEELQTIDTLPEDFTGTNWAAAIHISPDGRFLYASNRGHDSIAVFSVD 274 (346)
T ss_pred EEEeeeeccCccccCCCCceeEEEECCCCCEEEEecCCCCeEEEEEEc
Confidence 1 1211 2355778899999999999999999988777665
No 79
>TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol.
Probab=90.25 E-value=4 Score=39.05 Aligned_cols=92 Identities=15% Similarity=0.211 Sum_probs=59.4
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCC
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAK 454 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~ 454 (485)
.|.+.++.+ +....++ .....|++++++|..+.||-+. ...|...+++.......+. ....|..+++++..+
T Consensus 12 ~v~~~d~~t---~~~~~~~--~~~~~~~~l~~~~dg~~l~~~~~~~~~v~~~d~~~~~~~~~~~-~~~~~~~~~~~~~g~ 85 (300)
T TIGR03866 12 TISVIDTAT---LEVTRTF--PVGQRPRGITLSKDGKLLYVCASDSDTIQVIDLATGEVIGTLP-SGPDPELFALHPNGK 85 (300)
T ss_pred EEEEEECCC---CceEEEE--ECCCCCCceEECCCCCEEEEEECCCCeEEEEECCCCcEEEecc-CCCCccEEEECCCCC
Confidence 455555541 3333333 2345688999999888888775 5788888877543322222 223477888888888
Q ss_pred eEEEEeCCCCcEEEEEccCC
Q psy4900 455 RLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 455 ~lYW~D~~~~~I~~~~~dG~ 474 (485)
+||-+....+.|...++...
T Consensus 86 ~l~~~~~~~~~l~~~d~~~~ 105 (300)
T TIGR03866 86 ILYIANEDDNLVTVIDIETR 105 (300)
T ss_pred EEEEEcCCCCeEEEEECCCC
Confidence 88877666677888777653
No 80
>PRK02889 tolB translocation protein TolB; Provisional
Probab=90.08 E-value=2.8 Score=43.55 Aligned_cols=115 Identities=11% Similarity=0.050 Sum_probs=72.4
Q ss_pred cccccCCCCccCC-CCC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGD-GSD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~-d~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.|++.++.++ ..+ .|.+.+++ +.....+. .........++.|...+|+++. ...|+..++++..
T Consensus 245 ~~~SPDG~~la~~~~~~g~~~Iy~~d~~----~~~~~~lt-~~~~~~~~~~wSpDG~~l~f~s~~~g~~~Iy~~~~~~g~ 319 (427)
T PRK02889 245 PAWSPDGRTLAVALSRDGNSQIYTVNAD----GSGLRRLT-QSSGIDTEPFFSPDGRSIYFTSDRGGAPQIYRMPASGGA 319 (427)
T ss_pred eEECCCCCEEEEEEccCCCceEEEEECC----CCCcEECC-CCCCCCcCeEEcCCCCEEEEEecCCCCcEEEEEECCCCc
Confidence 4678888887663 222 56666766 55443332 2222334567888777787764 4679999988766
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEec
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLILR 481 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~~ 481 (485)
.+.+...+ ......++.+..+.|+++.... ..|.+.++++...+.|..
T Consensus 320 ~~~lt~~g-~~~~~~~~SpDG~~Ia~~s~~~g~~~I~v~d~~~g~~~~lt~ 369 (427)
T PRK02889 320 AQRVTFTG-SYNTSPRISPDGKLLAYISRVGGAFKLYVQDLATGQVTALTD 369 (427)
T ss_pred eEEEecCC-CCcCceEECCCCCEEEEEEccCCcEEEEEEECCCCCeEEccC
Confidence 55554322 2233567777888998876543 368888888776655543
No 81
>PRK05137 tolB translocation protein TolB; Provisional
Probab=89.93 E-value=2.8 Score=43.64 Aligned_cols=115 Identities=11% Similarity=0.016 Sum_probs=73.3
Q ss_pred ccccccCCCCccCCCC-C---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCC
Q psy4900 360 PSTWKCDSENDCGDGS-D---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGT 431 (485)
Q Consensus 360 ~~~~~~~~~~~y~~d~-~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~ 431 (485)
...|.|++..+.++.+ + .|.+.++. +.....|. .........++.|...+|+++. ...|++.+++|.
T Consensus 250 ~~~~SPDG~~la~~~~~~g~~~Iy~~d~~----~~~~~~Lt-~~~~~~~~~~~spDG~~i~f~s~~~g~~~Iy~~d~~g~ 324 (435)
T PRK05137 250 APRFSPDGRKVVMSLSQGGNTDIYTMDLR----SGTTTRLT-DSPAIDTSPSYSPDGSQIVFESDRSGSPQLYVMNADGS 324 (435)
T ss_pred CcEECCCCCEEEEEEecCCCceEEEEECC----CCceEEcc-CCCCccCceeEcCCCCEEEEEECCCCCCeEEEEECCCC
Confidence 3467888887765432 1 67777777 55444433 3333345567777777676654 368999999988
Q ss_pred CcEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 432 MAMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 432 ~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
..+.|.... ..-...++.+..++|+++.... ..|...+++|+..+.+.
T Consensus 325 ~~~~lt~~~-~~~~~~~~SpdG~~ia~~~~~~~~~~i~~~d~~~~~~~~lt 374 (435)
T PRK05137 325 NPRRISFGG-GRYSTPVWSPRGDLIAFTKQGGGQFSIGVMKPDGSGERILT 374 (435)
T ss_pred CeEEeecCC-CcccCeEECCCCCEEEEEEcCCCceEEEEEECCCCceEecc
Confidence 766665422 2223456667788888776433 47888888887765543
No 82
>KOG4260|consensus
Probab=89.93 E-value=0.64 Score=43.85 Aligned_cols=21 Identities=24% Similarity=0.289 Sum_probs=18.7
Q ss_pred cccccCCCCCCeeeCCCCcee
Q psy4900 185 EFTCQASPTGGVCQCPEGQKV 205 (485)
Q Consensus 185 ~~~C~n~~~~~~C~C~~G~~l 205 (485)
++.|.|+.|||.|.+.+||.-
T Consensus 249 ~qfCvNteGSf~C~dk~Gy~~ 269 (350)
T KOG4260|consen 249 HQFCVNTEGSFKCEDKEGYKK 269 (350)
T ss_pred hheeecCCCceEecccccccC
Confidence 578999999999999999864
No 83
>PRK04922 tolB translocation protein TolB; Provisional
Probab=89.58 E-value=2.5 Score=43.92 Aligned_cols=114 Identities=8% Similarity=0.061 Sum_probs=74.0
Q ss_pred cccccCCCCccCC-CCC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGD-GSD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~-d~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.+++..++++ +.. .|.+.+++ +.....+...+ .....+++.|...+|+++. ...|+..++++..
T Consensus 297 ~~~spDG~~l~f~sd~~g~~~iy~~dl~----~g~~~~lt~~g-~~~~~~~~SpDG~~Ia~~~~~~~~~~I~v~d~~~g~ 371 (433)
T PRK04922 297 PTWAPDGKSIYFTSDRGGRPQIYRVAAS----GGSAERLTFQG-NYNARASVSPDGKKIAMVHGSGGQYRIAVMDLSTGS 371 (433)
T ss_pred eEECCCCCEEEEEECCCCCceEEEEECC----CCCeEEeecCC-CCccCEEECCCCCEEEEEECCCCceeEEEEECCCCC
Confidence 4577787777654 222 57767766 44333332222 2345688999888998875 2468999988766
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCC--CCcEEEEEccCCCeEEEec
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNL--LDYIETVDYEGKNRFLILR 481 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~--~~~I~~~~~dG~~r~~~~~ 481 (485)
.+.|. .. .....+++.+..+.||++... ...|...+++|+.++.|..
T Consensus 372 ~~~Lt-~~-~~~~~p~~spdG~~i~~~s~~~g~~~L~~~~~~g~~~~~l~~ 420 (433)
T PRK04922 372 VRTLT-PG-SLDESPSFAPNGSMVLYATREGGRGVLAAVSTDGRVRQRLVS 420 (433)
T ss_pred eEECC-CC-CCCCCceECCCCCEEEEEEecCCceEEEEEECCCCceEEccc
Confidence 55443 22 133456777788889988653 4579999999988776643
No 84
>PRK03629 tolB translocation protein TolB; Provisional
Probab=88.99 E-value=4.9 Score=41.74 Aligned_cols=114 Identities=12% Similarity=0.026 Sum_probs=74.7
Q ss_pred cccccCCCCccCCCC--C--CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDGS--D--EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d~--~--~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.|++..++++.. + .|.+.+++ +.....+. ..-......++.|..++|+++. ..+|++.++++..
T Consensus 248 ~~~SPDG~~La~~~~~~g~~~I~~~d~~----tg~~~~lt-~~~~~~~~~~wSPDG~~I~f~s~~~g~~~Iy~~d~~~g~ 322 (429)
T PRK03629 248 PAFSPDGSKLAFALSKTGSLNLYVMDLA----SGQIRQVT-DGRSNNTEPTWFPDSQNLAYTSDQAGRPQVYKVNINGGA 322 (429)
T ss_pred eEECCCCCEEEEEEcCCCCcEEEEEECC----CCCEEEcc-CCCCCcCceEECCCCCEEEEEeCCCCCceEEEEECCCCC
Confidence 467889888887522 2 57777776 54444443 3333456678888877776654 3589999999876
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.+.|.. ........++.+..++|+++.... ..|.+.++++...+.|.
T Consensus 323 ~~~lt~-~~~~~~~~~~SpDG~~Ia~~~~~~g~~~I~~~dl~~g~~~~Lt 371 (429)
T PRK03629 323 PQRITW-EGSQNQDADVSSDGKFMVMVSSNGGQQHIAKQDLATGGVQVLT 371 (429)
T ss_pred eEEeec-CCCCccCEEECCCCCEEEEEEccCCCceEEEEECCCCCeEEeC
Confidence 665543 223344567777788888875443 46888888877666554
No 85
>TIGR02658 TTQ_MADH_Hv methylamine dehydrogenase heavy chain. This family consists of the heavy chain of methylamine dehydrogenase, a periplasmic enzyme. The enzyme contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from two Trp residues in the light subunit. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome.
Probab=88.98 E-value=4.8 Score=40.54 Aligned_cols=104 Identities=8% Similarity=-0.040 Sum_probs=65.9
Q ss_pred CCCCccCCCCCCccccccCcccccceEEE-----EecCC----CCCCcc---EEecCCCCeEEEeC-----------CCc
Q psy4900 366 DSENDCGDGSDEGDFCSEKTCAYFQFHAI-----VLGSN----LTNPTD---LALDPTSGLMFVAD-----------SNQ 422 (485)
Q Consensus 366 ~~~~~y~~d~~~I~~~~~~~c~~g~~~~~-----l~~~~----~~~p~~---iavD~~~~~lywtd-----------~~~ 422 (485)
++..+|.+..+.|.++++. +..... ++... --.|.| ||+++..++||... ...
T Consensus 205 dg~~~~vs~eG~V~~id~~----~~~~~~~~~~~~~~~~~~~~~wrP~g~q~ia~~~dg~~lyV~~~~~~~~thk~~~~~ 280 (352)
T TIGR02658 205 SGRLVWPTYTGKIFQIDLS----SGDAKFLPAIEAFTEAEKADGWRPGGWQQVAYHRARDRIYLLADQRAKWTHKTASRF 280 (352)
T ss_pred CCcEEEEecCCeEEEEecC----CCcceecceeeeccccccccccCCCcceeEEEcCCCCEEEEEecCCccccccCCCCE
Confidence 4444555555677777754 332221 11111 124666 99999999999942 245
Q ss_pred EEEEEcCCCCcEEEEeCCCcccceEEEeCCCC-eEEEEeCCCCcEEEEEccCC
Q psy4900 423 ILRTNMDGTMAMSIVSEAAYKASGVALDINAK-RLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 423 I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~-~lYW~D~~~~~I~~~~~dG~ 474 (485)
|...++....+..-+.. -..|.+|+|....+ +||-+....+.|.+.+....
T Consensus 281 V~ViD~~t~kvi~~i~v-G~~~~~iavS~Dgkp~lyvtn~~s~~VsViD~~t~ 332 (352)
T TIGR02658 281 LFVVDAKTGKRLRKIEL-GHEIDSINVSQDAKPLLYALSTGDKTLYIFDAETG 332 (352)
T ss_pred EEEEECCCCeEEEEEeC-CCceeeEEECCCCCeEEEEeCCCCCcEEEEECcCC
Confidence 66666544333333322 24789999999999 99999988888999887643
No 86
>PRK05137 tolB translocation protein TolB; Provisional
Probab=88.62 E-value=3 Score=43.43 Aligned_cols=113 Identities=12% Similarity=0.064 Sum_probs=73.8
Q ss_pred cccccCCCCccCCC-CC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDG-SD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.+++..+++.. .. .|.+.+++ +.....+... .......++.|..++|+++. ..+|...+++|..
T Consensus 295 ~~~spDG~~i~f~s~~~g~~~Iy~~d~~----g~~~~~lt~~-~~~~~~~~~SpdG~~ia~~~~~~~~~~i~~~d~~~~~ 369 (435)
T PRK05137 295 PSYSPDGSQIVFESDRSGSPQLYVMNAD----GSNPRRISFG-GGRYSTPVWSPRGDLIAFTKQGGGQFSIGVMKPDGSG 369 (435)
T ss_pred eeEcCCCCEEEEEECCCCCCeEEEEECC----CCCeEEeecC-CCcccCeEECCCCCEEEEEEcCCCceEEEEEECCCCc
Confidence 45778887776542 22 67777777 6554444322 22233467888888888765 2578888888876
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC-----CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL-----DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~-----~~I~~~~~dG~~r~~~~ 480 (485)
.+.+.. . .....+++.+..+.||++-... ..|++++++|...+.|.
T Consensus 370 ~~~lt~-~-~~~~~p~~spDG~~i~~~~~~~~~~~~~~L~~~dl~g~~~~~l~ 420 (435)
T PRK05137 370 ERILTS-G-FLVEGPTWAPNGRVIMFFRQTPGSGGAPKLYTVDLTGRNEREVP 420 (435)
T ss_pred eEeccC-C-CCCCCCeECCCCCEEEEEEccCCCCCcceEEEEECCCCceEEcc
Confidence 554432 2 2456677778888998875432 47999999998877654
No 87
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
Probab=88.55 E-value=4.1 Score=40.19 Aligned_cols=101 Identities=15% Similarity=0.190 Sum_probs=75.0
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCC-CcEEEEeCC---CcccceEE
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGT-MAMSIVSEA---AYKASGVA 448 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~-~~~~i~~~~---~~~p~gla 448 (485)
.|++..+++ ..|......+...+.+|.=|++++..+.||-.. ...|....+|.. ++-+++... ..-|.-|+
T Consensus 17 gI~v~~ld~-~~g~l~~~~~v~~~~nptyl~~~~~~~~LY~v~~~~~~ggvaay~iD~~~G~Lt~ln~~~~~g~~p~yvs 95 (346)
T COG2706 17 GIYVFNLDT-KTGELSLLQLVAELGNPTYLAVNPDQRHLYVVNEPGEEGGVAAYRIDPDDGRLTFLNRQTLPGSPPCYVS 95 (346)
T ss_pred ceEEEEEeC-cccccchhhhccccCCCceEEECCCCCEEEEEEecCCcCcEEEEEEcCCCCeEEEeeccccCCCCCeEEE
Confidence 566666651 003333333446789999999999999999887 478999999975 677777644 33468999
Q ss_pred EeCCCCeEEEEeCCCCcEEEEEc--cCCCeEE
Q psy4900 449 LDINAKRLFWCDNLLDYIETVDY--EGKNRFL 478 (485)
Q Consensus 449 vD~~~~~lYW~D~~~~~I~~~~~--dG~~r~~ 478 (485)
||...+.||-+-...+.|.+..+ ||.-..+
T Consensus 96 vd~~g~~vf~AnY~~g~v~v~p~~~dG~l~~~ 127 (346)
T COG2706 96 VDEDGRFVFVANYHSGSVSVYPLQADGSLQPV 127 (346)
T ss_pred ECCCCCEEEEEEccCceEEEEEcccCCccccc
Confidence 99999999999999999988866 4665443
No 88
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein C. GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=88.01 E-value=0.67 Score=29.09 Aligned_cols=26 Identities=31% Similarity=0.804 Sum_probs=17.8
Q ss_pred CCCCccccccCCCCCCeeeCCCCceee
Q psy4900 180 SLLNCEFTCQASPTGGVCQCPEGQKVA 206 (485)
Q Consensus 180 ~~~~C~~~C~n~~~~~~C~C~~G~~l~ 206 (485)
....|...|.+... +.|.||+||.+.
T Consensus 4 n~t~CpA~CDpn~~-~~C~CPeGyIld 29 (34)
T PF09064_consen 4 NQTECPADCDPNSP-GQCFCPEGYILD 29 (34)
T ss_pred ccccCCCccCCCCC-CceeCCCceEec
Confidence 33445667765433 489999999984
No 89
>COG3204 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=87.32 E-value=3.2 Score=40.07 Aligned_cols=76 Identities=12% Similarity=0.177 Sum_probs=61.7
Q ss_pred CCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEE-EeCCCCcEEEEEccCCC
Q psy4900 399 NLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFW-CDNLLDYIETVDYEGKN 475 (485)
Q Consensus 399 ~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW-~D~~~~~I~~~~~dG~~ 475 (485)
...+..+|+.+|.++.||-+- .+.|....+.|.-.++|-...++.|++|+ |+.+..|- +|-+..++....++-..
T Consensus 84 ~~~nvS~LTynp~~rtLFav~n~p~~iVElt~~GdlirtiPL~g~~DpE~Ie--yig~n~fvi~dER~~~l~~~~vd~~t 161 (316)
T COG3204 84 ETANVSSLTYNPDTRTLFAVTNKPAAIVELTKEGDLIRTIPLTGFSDPETIE--YIGGNQFVIVDERDRALYLFTVDADT 161 (316)
T ss_pred ccccccceeeCCCcceEEEecCCCceEEEEecCCceEEEecccccCChhHeE--EecCCEEEEEehhcceEEEEEEcCCc
Confidence 355688999999999999877 68899999999998888888899999999 78888885 45556777777776544
Q ss_pred e
Q psy4900 476 R 476 (485)
Q Consensus 476 r 476 (485)
+
T Consensus 162 ~ 162 (316)
T COG3204 162 T 162 (316)
T ss_pred c
Confidence 3
No 90
>TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. The Tol-Pal system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Studies of its structure suggest that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism is unclear.
Probab=87.23 E-value=3.8 Score=42.08 Aligned_cols=112 Identities=10% Similarity=0.070 Sum_probs=72.8
Q ss_pred ccccCCCCccCCC-CC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC-C---CcEEEEEcCCCCc
Q psy4900 362 TWKCDSENDCGDG-SD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD-S---NQILRTNMDGTMA 433 (485)
Q Consensus 362 ~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd-~---~~I~r~~~dG~~~ 433 (485)
.|.+++..+++.. .. .|.+.+++ +.....+. ........+++.|...+|+++. . .+|+..++++...
T Consensus 284 ~~s~dg~~l~~~s~~~g~~~iy~~d~~----~~~~~~l~-~~~~~~~~~~~spdg~~i~~~~~~~~~~~i~~~d~~~~~~ 358 (417)
T TIGR02800 284 SWSPDGKSIAFTSDRGGSPQIYMMDAD----GGEVRRLT-FRGGYNASPSWSPDGDLIAFVHREGGGFNIAVMDLDGGGE 358 (417)
T ss_pred EECCCCCEEEEEECCCCCceEEEEECC----CCCEEEee-cCCCCccCeEECCCCCEEEEEEccCCceEEEEEeCCCCCe
Confidence 4567777776542 22 57777776 54443333 2334456778899888999988 2 3788999887554
Q ss_pred EEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 434 MSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 434 ~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
+.+.... .....++.+..+.|+|+.... ..|...+++|..++.|.
T Consensus 359 ~~l~~~~--~~~~p~~spdg~~l~~~~~~~~~~~l~~~~~~g~~~~~~~ 405 (417)
T TIGR02800 359 RVLTDTG--LDESPSFAPNGRMILYATTRGGRGVLGLVSTDGRFRARLP 405 (417)
T ss_pred EEccCCC--CCCCceECCCCCEEEEEEeCCCcEEEEEEECCCceeeECC
Confidence 4443221 234456667788999987654 46788888998776553
No 91
>PRK02889 tolB translocation protein TolB; Provisional
Probab=87.03 E-value=5.6 Score=41.26 Aligned_cols=113 Identities=7% Similarity=0.035 Sum_probs=72.6
Q ss_pred cccccCCCCccCCC-CC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC-C---CcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDG-SD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD-S---NQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd-~---~~I~r~~~dG~~ 432 (485)
..|.+++..++++. .. .|...+++ +.....+.... ......++.|...+|+++. . ..|+..++++..
T Consensus 289 ~~wSpDG~~l~f~s~~~g~~~Iy~~~~~----~g~~~~lt~~g-~~~~~~~~SpDG~~Ia~~s~~~g~~~I~v~d~~~g~ 363 (427)
T PRK02889 289 PFFSPDGRSIYFTSDRGGAPQIYRMPAS----GGAAQRVTFTG-SYNTSPRISPDGKLLAYISRVGGAFKLYVQDLATGQ 363 (427)
T ss_pred eEEcCCCCEEEEEecCCCCcEEEEEECC----CCceEEEecCC-CCcCceEECCCCCEEEEEEccCCcEEEEEEECCCCC
Confidence 45788888777642 22 56666665 44333332222 2234567888888888876 2 369999988776
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.+.|... ......++.+..+.||++-... ..|..++++|..++.+.
T Consensus 364 ~~~lt~~--~~~~~p~~spdg~~l~~~~~~~g~~~l~~~~~~g~~~~~l~ 411 (427)
T PRK02889 364 VTALTDT--TRDESPSFAPNGRYILYATQQGGRSVLAAVSSDGRIKQRLS 411 (427)
T ss_pred eEEccCC--CCccCceECCCCCEEEEEEecCCCEEEEEEECCCCceEEee
Confidence 5555433 2335667778888888876433 46888999998877664
No 92
>PRK00178 tolB translocation protein TolB; Provisional
Probab=86.17 E-value=5.9 Score=40.99 Aligned_cols=113 Identities=8% Similarity=0.015 Sum_probs=72.2
Q ss_pred cccccCCCCccCCC-CC---CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCC
Q psy4900 361 STWKCDSENDCGDG-SD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTM 432 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~ 432 (485)
..|.+++..+++.. .. .|++.+++ +.....+.... ......++.|..++|+++. ...|+..++++..
T Consensus 292 ~~~spDg~~i~f~s~~~g~~~iy~~d~~----~g~~~~lt~~~-~~~~~~~~Spdg~~i~~~~~~~~~~~l~~~dl~tg~ 366 (430)
T PRK00178 292 PFWGKDGRTLYFTSDRGGKPQIYKVNVN----GGRAERVTFVG-NYNARPRLSADGKTLVMVHRQDGNFHVAAQDLQRGS 366 (430)
T ss_pred eEECCCCCEEEEEECCCCCceEEEEECC----CCCEEEeecCC-CCccceEECCCCCEEEEEEccCCceEEEEEECCCCC
Confidence 35777887776642 22 56666665 44333332221 2223467888888999887 2358889998776
Q ss_pred cEEEEeCCCcccceEEEeCCCCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 433 AMSIVSEAAYKASGVALDINAKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 433 ~~~i~~~~~~~p~glavD~~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.+.|... ......++.+..+.||++.... ..|..++++|..++.|.
T Consensus 367 ~~~lt~~--~~~~~p~~spdg~~i~~~~~~~g~~~l~~~~~~g~~~~~l~ 414 (430)
T PRK00178 367 VRILTDT--SLDESPSVAPNGTMLIYATRQQGRGVLMLVSINGRVRLPLP 414 (430)
T ss_pred EEEccCC--CCCCCceECCCCCEEEEEEecCCceEEEEEECCCCceEECc
Confidence 5555432 2233457778889999987543 57899999998776553
No 93
>PF01731 Arylesterase: Arylesterase; InterPro: IPR002640 The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity. Mammals have 3 distinct paraoxonase types, termed PON1-3. In mice and humans, the PON genes are found on the same chromosome in close proximity. PON activity has been found in a variety of tissues, with highest levels in liver and serum; the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low density lipoproteins (LDL). Although PON2 has oxidative properties, the enzyme does not associate with HDL. Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1, 2 or 3) share 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo. This family consists of arylesterases (also known as serum paraoxonases; EC 3.1.1.2). These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1; Swiss-Prot P27169) is associated with HDL and may protect against LDL oxidation. GO: 0004064 arylesterase activity
Probab=85.89 E-value=2.1 Score=33.57 Aligned_cols=39 Identities=18% Similarity=0.247 Sum_probs=30.2
Q ss_pred cceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEc
Q psy4900 389 FQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNM 428 (485)
Q Consensus 389 g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~ 428 (485)
++... .+..++..|.||+++|..++||.++ .+.|....+
T Consensus 43 ~~~~~-~va~g~~~aNGI~~s~~~k~lyVa~~~~~~I~vy~~ 83 (86)
T PF01731_consen 43 GKEVK-VVASGFSFANGIAISPDKKYLYVASSLAHSIHVYKR 83 (86)
T ss_pred CCEeE-EeeccCCCCceEEEcCCCCEEEEEeccCCeEEEEEe
Confidence 43333 3457899999999999999999999 677776654
No 94
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=85.64 E-value=0.79 Score=28.53 Aligned_cols=20 Identities=35% Similarity=0.662 Sum_probs=17.6
Q ss_pred ccccCCCCCCeeeCCCCcee
Q psy4900 186 FTCQASPTGGVCQCPEGQKV 205 (485)
Q Consensus 186 ~~C~n~~~~~~C~C~~G~~l 205 (485)
..|.+++++|.|.|+.||..
T Consensus 12 ~~C~~~~~~~~C~C~~g~~g 31 (36)
T cd00053 12 GTCVNTPGSYRCVCPPGYTG 31 (36)
T ss_pred CEEecCCCCeEeECCCCCcc
Confidence 67889899999999999964
No 95
>PF13449 Phytase-like: Esterase-like activity of phytase
Probab=84.75 E-value=3.5 Score=41.15 Aligned_cols=78 Identities=18% Similarity=0.241 Sum_probs=52.4
Q ss_pred CCCCccEEecCCCCeEEEe--CCC------cEEEEEcCCCC----cEEE------EeCC---C----cccceEEEeCCCC
Q psy4900 400 LTNPTDLALDPTSGLMFVA--DSN------QILRTNMDGTM----AMSI------VSEA---A----YKASGVALDINAK 454 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywt--d~~------~I~r~~~dG~~----~~~i------~~~~---~----~~p~glavD~~~~ 454 (485)
+..-.||++|+..+. ||+ |.. +++++.++... ...+ ...+ + ..++||++ ...+
T Consensus 19 ~GGlSgl~~~~~~~~-~~avSD~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~L~~~~G~~~~~~~~D~Egi~~-~~~g 96 (326)
T PF13449_consen 19 FGGLSGLDYDPDDGR-FYAVSDRGPNKGPPRFYTFRIDYDQGGIGGVTILDMIPLRDPDGQPFPKNGLDPEGIAV-PPDG 96 (326)
T ss_pred cCcEeeEEEeCCCCE-EEEEECCCCCCCCCcEEEEEeeccCCCccceEeccceeccCCCCCcCCcCCCChhHeEE-ecCC
Confidence 334467999975443 444 333 48888887621 1111 1111 1 15779999 6899
Q ss_pred eEEEEeCCC------CcEEEEEccCCCeEEE
Q psy4900 455 RLFWCDNLL------DYIETVDYEGKNRFLI 479 (485)
Q Consensus 455 ~lYW~D~~~------~~I~~~~~dG~~r~~~ 479 (485)
.+||++-+. .+|.+++.+|...+.+
T Consensus 97 ~~~is~E~~~~~~~~p~I~~~~~~G~~~~~~ 127 (326)
T PF13449_consen 97 SFWISSEGGRTGGIPPRIRRFDLDGRVIRRF 127 (326)
T ss_pred CEEEEeCCccCCCCCCEEEEECCCCcccceE
Confidence 999999999 9999999999886554
No 96
>PRK03629 tolB translocation protein TolB; Provisional
Probab=84.27 E-value=10 Score=39.30 Aligned_cols=99 Identities=8% Similarity=-0.012 Sum_probs=67.5
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCC
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTMAMSIVSEAAYKASGVALDIN 452 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~ 452 (485)
.|.+++.+ |.....+.. +-....+.++.|...+|.|+. .+.|+..++++...+.|.... .....+++.+.
T Consensus 180 ~l~~~d~d----g~~~~~lt~-~~~~~~~p~wSPDG~~la~~s~~~g~~~i~i~dl~~G~~~~l~~~~-~~~~~~~~SPD 253 (429)
T PRK03629 180 ELRVSDYD----GYNQFVVHR-SPQPLMSPAWSPDGSKLAYVTFESGRSALVIQTLANGAVRQVASFP-RHNGAPAFSPD 253 (429)
T ss_pred eEEEEcCC----CCCCEEeec-CCCceeeeEEcCCCCEEEEEEecCCCcEEEEEECCCCCeEEccCCC-CCcCCeEECCC
Confidence 67788888 887777653 223345678888887887764 467999999876655554321 22345778888
Q ss_pred CCeEEEEeCCC--CcEEEEEccCCCeEEEec
Q psy4900 453 AKRLFWCDNLL--DYIETVDYEGKNRFLILR 481 (485)
Q Consensus 453 ~~~lYW~D~~~--~~I~~~~~dG~~r~~~~~ 481 (485)
.++|+|+.... ..|+..++++...+.+..
T Consensus 254 G~~La~~~~~~g~~~I~~~d~~tg~~~~lt~ 284 (429)
T PRK03629 254 GSKLAFALSKTGSLNLYVMDLASGQIRQVTD 284 (429)
T ss_pred CCEEEEEEcCCCCcEEEEEECCCCCEEEccC
Confidence 88999985443 468888888776555543
No 97
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=84.25 E-value=0.37 Score=31.07 Aligned_cols=26 Identities=27% Similarity=0.557 Sum_probs=18.6
Q ss_pred cccccCCCCCCeeeCCCCceeeCCCCCC
Q psy4900 185 EFTCQASPTGGVCQCPEGQKVANDSRTC 212 (485)
Q Consensus 185 ~~~C~n~~~~~~C~C~~G~~l~~~~~~C 212 (485)
...|.++++++.|+|++||.. ++..|
T Consensus 11 nA~C~~~~~~~~C~C~~Gy~G--dG~~C 36 (36)
T PF12947_consen 11 NATCTNTGGSYTCTCKPGYEG--DGFFC 36 (36)
T ss_dssp TCEEEE-TTSEEEEE-CEEEC--CSTCE
T ss_pred CcEeecCCCCEEeECCCCCcc--CCcCC
Confidence 367999999999999999963 45443
No 98
>PRK01029 tolB translocation protein TolB; Provisional
Probab=84.04 E-value=14 Score=38.39 Aligned_cols=120 Identities=11% Similarity=0.091 Sum_probs=69.1
Q ss_pred cccccCCCCccCCCC--C--Ccccc--ccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCC
Q psy4900 361 STWKCDSENDCGDGS--D--EGDFC--SEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDG 430 (485)
Q Consensus 361 ~~~~~~~~~~y~~d~--~--~I~~~--~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG 430 (485)
..|.|+|..|.++.. + .|.+. ++.. ..+.....+...........++.|...+|+|+. ..+|+++++++
T Consensus 236 p~wSPDG~~Laf~s~~~g~~di~~~~~~~~~-g~~g~~~~lt~~~~~~~~~p~wSPDG~~Laf~s~~~g~~~ly~~~~~~ 314 (428)
T PRK01029 236 PTFSPRKKLLAFISDRYGNPDLFIQSFSLET-GAIGKPRRLLNEAFGTQGNPSFSPDGTRLVFVSNKDGRPRIYIMQIDP 314 (428)
T ss_pred eEECCCCCEEEEEECCCCCcceeEEEeeccc-CCCCcceEeecCCCCCcCCeEECCCCCEEEEEECCCCCceEEEEECcc
Confidence 357888887776532 1 44433 2220 001122233322223344568888877777764 35799998875
Q ss_pred CC-cEEEEeCCCcccceEEEeCCCCeEEEEeCC--CCcEEEEEccCCCeEEEec
Q psy4900 431 TM-AMSIVSEAAYKASGVALDINAKRLFWCDNL--LDYIETVDYEGKNRFLILR 481 (485)
Q Consensus 431 ~~-~~~i~~~~~~~p~glavD~~~~~lYW~D~~--~~~I~~~~~dG~~r~~~~~ 481 (485)
.. ....++.........++.+..++|+++... ...|.+.++++...+.|..
T Consensus 315 ~g~~~~~lt~~~~~~~~p~wSPDG~~Laf~~~~~g~~~I~v~dl~~g~~~~Lt~ 368 (428)
T PRK01029 315 EGQSPRLLTKKYRNSSCPAWSPDGKKIAFCSVIKGVRQICVYDLATGRDYQLTT 368 (428)
T ss_pred cccceEEeccCCCCccceeECCCCCEEEEEEcCCCCcEEEEEECCCCCeEEccC
Confidence 33 233333332334556777778888887543 3478888988877666554
No 99
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=82.56 E-value=1.3 Score=27.51 Aligned_cols=20 Identities=35% Similarity=0.791 Sum_probs=17.0
Q ss_pred ccccccccCCCCCCccccCCCCccc
Q psy4900 271 HMCIITRASGNALGYKCACDIGYRL 295 (485)
Q Consensus 271 ~~C~n~~~~~~~g~~~C~C~~Gy~l 295 (485)
..|+++. +.|+|.|+.||..
T Consensus 12 ~~C~~~~-----~~~~C~C~~g~~g 31 (36)
T cd00053 12 GTCVNTP-----GSYRCVCPPGYTG 31 (36)
T ss_pred CEEecCC-----CCeEeECCCCCcc
Confidence 5688887 8899999999964
No 100
>KOG4289|consensus
Probab=81.98 E-value=1.7 Score=50.11 Aligned_cols=23 Identities=35% Similarity=0.993 Sum_probs=17.8
Q ss_pred cccccccCCCCCCccccCCCCccccCCCCCCC
Q psy4900 272 MCIITRASGNALGYKCACDIGYRLSVNGNNCN 303 (485)
Q Consensus 272 ~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C~ 303 (485)
.|.... |+|+|.|.+||. ++.|.
T Consensus 1252 ~C~srE-----ggYtCeCrpg~t----GehCE 1274 (2531)
T KOG4289|consen 1252 RCRSRE-----GGYTCECRPGFT----GEHCE 1274 (2531)
T ss_pred ceEEec-----CceeEEecCCcc----cccee
Confidence 455566 999999999995 66664
No 101
>COG2133 Glucose/sorbosone dehydrogenases [Carbohydrate transport and metabolism]
Probab=81.18 E-value=4.2 Score=41.48 Aligned_cols=65 Identities=18% Similarity=0.229 Sum_probs=50.1
Q ss_pred CCCCccEEecCCCCeEEEeC---------------CCcEEEEEcCC--------CCcEEEEeCCCcccceEEEeCCCCeE
Q psy4900 400 LTNPTDLALDPTSGLMFVAD---------------SNQILRTNMDG--------TMAMSIVSEAAYKASGVALDINAKRL 456 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd---------------~~~I~r~~~dG--------~~~~~i~~~~~~~p~glavD~~~~~l 456 (485)
.+.-+.|+++|.. +||.+- ..+|.|...+| .+ ..|+..++.+|.||++++.++.|
T Consensus 176 ~H~g~~l~f~pDG-~Lyvs~G~~~~~~~aq~~~~~~Gk~~r~~~a~~~~~d~p~~~-~~i~s~G~RN~qGl~w~P~tg~L 253 (399)
T COG2133 176 HHFGGRLVFGPDG-KLYVTTGSNGDPALAQDNVSLAGKVLRIDRAGIIPADNPFPN-SEIWSYGHRNPQGLAWHPVTGAL 253 (399)
T ss_pred CcCcccEEECCCC-cEEEEeCCCCCcccccCccccccceeeeccCcccccCCCCCC-cceEEeccCCccceeecCCCCcE
Confidence 5666779999985 999885 13455555544 33 56788899999999999999999
Q ss_pred EEEeCCCCcE
Q psy4900 457 FWCDNLLDYI 466 (485)
Q Consensus 457 YW~D~~~~~I 466 (485)
|-++.+...+
T Consensus 254 w~~e~g~d~~ 263 (399)
T COG2133 254 WTTEHGPDAL 263 (399)
T ss_pred EEEecCCCcc
Confidence 9999887444
No 102
>TIGR02658 TTQ_MADH_Hv methylamine dehydrogenase heavy chain. This family consists of the heavy chain of methylamine dehydrogenase, a periplasmic enzyme. The enzyme contains a tryptophan tryptophylquinone (TTQ) prosthetic group derived from two Trp residues in the light subunit. The enzyme forms a complex with the type I blue copper protein amicyanin and a cytochrome. Electron transfer proceeds from TTQ to the copper and then to the heme group of the cytochrome.
Probab=81.10 E-value=19 Score=36.26 Aligned_cols=70 Identities=20% Similarity=0.321 Sum_probs=44.8
Q ss_pred EecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEe-----CC----Ccccce---EEEeCCCCeEEEEe-CCC--------C
Q psy4900 407 ALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVS-----EA----AYKASG---VALDINAKRLFWCD-NLL--------D 464 (485)
Q Consensus 407 avD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~-----~~----~~~p~g---lavD~~~~~lYW~D-~~~--------~ 464 (485)
++-+..|..+|.. .+.|+.+++.+........ .. --.|.| +|++...++||.+- ... +
T Consensus 200 ~~~~~dg~~~~vs~eG~V~~id~~~~~~~~~~~~~~~~~~~~~~~wrP~g~q~ia~~~dg~~lyV~~~~~~~~thk~~~~ 279 (352)
T TIGR02658 200 AYSNKSGRLVWPTYTGKIFQIDLSSGDAKFLPAIEAFTEAEKADGWRPGGWQQVAYHRARDRIYLLADQRAKWTHKTASR 279 (352)
T ss_pred ceEcCCCcEEEEecCCeEEEEecCCCcceecceeeeccccccccccCCCcceeEEEcCCCCEEEEEecCCccccccCCCC
Confidence 3344456777777 8999999988765332211 11 124666 99999999999943 212 5
Q ss_pred cEEEEEccCCCe
Q psy4900 465 YIETVDYEGKNR 476 (485)
Q Consensus 465 ~I~~~~~dG~~r 476 (485)
.|.+.+.....+
T Consensus 280 ~V~ViD~~t~kv 291 (352)
T TIGR02658 280 FLFVVDAKTGKR 291 (352)
T ss_pred EEEEEECCCCeE
Confidence 777777654443
No 103
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential vaccine candidates and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals.; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=80.50 E-value=0.25 Score=44.04 Aligned_cols=47 Identities=34% Similarity=0.784 Sum_probs=27.1
Q ss_pred cccccccCCCCCCccccCCCCccccCCCCCCCCCCCCCceEEcCCCCcccC
Q psy4900 272 MCIITRASGNALGYKCACDIGYRLSVNGNNCNQPTCAPGEFQCASGRCVPS 322 (485)
Q Consensus 272 ~C~n~~~~~~~g~~~C~C~~Gy~l~~d~~~C~~~~C~~~~~~c~~g~ci~~ 322 (485)
.|++.+.......|+|.|..||.|..+ .|....|. .+.|++|+||..
T Consensus 57 ~C~~~~~~~~~~~~~C~C~~gY~~~~~--vCvp~~C~--~~~Cg~GKCI~d 103 (197)
T PF06247_consen 57 KCINQANKGEERAYKCDCINGYILKQG--VCVPNKCN--NKDCGSGKCILD 103 (197)
T ss_dssp EEEE-SSTTSSTSEEEEE-TTEEESSS--SEEEGGGS--S---TTEEEEEE
T ss_pred hhhcCCCcccceeEEEecccCceeeCC--eEchhhcC--ceecCCCeEEec
Confidence 455554323337899999999998653 56554444 366777777754
No 104
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence about thirty to forty amino-acid residues long, found in the sequence of epidermal growth factor (EGF), has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=80.11 E-value=0.59 Score=29.17 Aligned_cols=19 Identities=42% Similarity=0.884 Sum_probs=16.1
Q ss_pred ccccCCC-CCCeeeCCCCce
Q psy4900 186 FTCQASP-TGGVCQCPEGQK 204 (485)
Q Consensus 186 ~~C~n~~-~~~~C~C~~G~~ 204 (485)
..|.+.. .+|.|.|++||.
T Consensus 10 g~C~~~~~~~y~C~C~~G~~ 29 (32)
T PF00008_consen 10 GTCIDLPGGGYTCECPPGYT 29 (32)
T ss_dssp EEEEEESTSEEEEEEBTTEE
T ss_pred eEEEeCCCCCEEeECCCCCc
Confidence 4787777 899999999985
No 105
>PRK01742 tolB translocation protein TolB; Provisional
Probab=79.87 E-value=15 Score=38.18 Aligned_cols=115 Identities=11% Similarity=0.080 Sum_probs=70.8
Q ss_pred eeccccccCCCCccCCCCC----CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcC
Q psy4900 358 CVPSTWKCDSENDCGDGSD----EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMD 429 (485)
Q Consensus 358 ~i~~~~~~~~~~~y~~d~~----~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~d 429 (485)
+....|.|++..+.++..+ .|.+.++. +.....+. ..-..-.++++.|...+|+++- ...|+..+++
T Consensus 206 v~~p~wSPDG~~la~~s~~~~~~~i~i~dl~----tg~~~~l~-~~~g~~~~~~wSPDG~~La~~~~~~g~~~Iy~~d~~ 280 (429)
T PRK01742 206 LMSPAWSPDGSKLAYVSFENKKSQLVVHDLR----SGARKVVA-SFRGHNGAPAFSPDGSRLAFASSKDGVLNIYVMGAN 280 (429)
T ss_pred cccceEcCCCCEEEEEEecCCCcEEEEEeCC----CCceEEEe-cCCCccCceeECCCCCEEEEEEecCCcEEEEEEECC
Confidence 3445678888777664221 57777776 44433332 1112234578888888888863 3468999998
Q ss_pred CCCcEEEEeCCCcccceEEEeCCCCeEEEEeC--CCCcEEEEEccCCCeEE
Q psy4900 430 GTMAMSIVSEAAYKASGVALDINAKRLFWCDN--LLDYIETVDYEGKNRFL 478 (485)
Q Consensus 430 G~~~~~i~~~~~~~p~glavD~~~~~lYW~D~--~~~~I~~~~~dG~~r~~ 478 (485)
+...+.|.. .-.....++..+..++|+.+-. +...|+.++.+|...+.
T Consensus 281 ~~~~~~lt~-~~~~~~~~~wSpDG~~i~f~s~~~g~~~I~~~~~~~~~~~~ 330 (429)
T PRK01742 281 GGTPSQLTS-GAGNNTEPSWSPDGQSILFTSDRSGSPQVYRMSASGGGASL 330 (429)
T ss_pred CCCeEeecc-CCCCcCCEEECCCCCEEEEEECCCCCceEEEEECCCCCeEE
Confidence 776555543 3333456777777777777643 34577777777765543
No 106
>PF13449 Phytase-like: Esterase-like activity of phytase
Probab=79.10 E-value=4 Score=40.72 Aligned_cols=57 Identities=25% Similarity=0.487 Sum_probs=44.0
Q ss_pred CCccEEecCCCCeEEEeC--C------CcEEEEEcCCCCcEEE-EeCCC-------------cccceEEEeCCCCeEEEE
Q psy4900 402 NPTDLALDPTSGLMFVAD--S------NQILRTNMDGTMAMSI-VSEAA-------------YKASGVALDINAKRLFWC 459 (485)
Q Consensus 402 ~p~~iavD~~~~~lywtd--~------~~I~r~~~dG~~~~~i-~~~~~-------------~~p~glavD~~~~~lYW~ 459 (485)
.+.||++ +..|.+||++ . +.|.+++++|...+.+ +-..+ .-.+|||+.+..+.||-+
T Consensus 86 D~Egi~~-~~~g~~~is~E~~~~~~~~p~I~~~~~~G~~~~~~~vP~~~~~~~~~~~~~~~N~G~E~la~~~dG~~l~~~ 164 (326)
T PF13449_consen 86 DPEGIAV-PPDGSFWISSEGGRTGGIPPRIRRFDLDGRVIRRFPVPAAFLPDANGTSGRRNNRGFEGLAVSPDGRTLFAA 164 (326)
T ss_pred ChhHeEE-ecCCCEEEEeCCccCCCCCCEEEEECCCCcccceEccccccccccCccccccCCCCeEEEEECCCCCEEEEE
Confidence 6779999 7889999999 7 8999999999886655 22211 125799999888877754
No 107
>COG3204 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=78.70 E-value=12 Score=36.41 Aligned_cols=78 Identities=21% Similarity=0.206 Sum_probs=57.5
Q ss_pred CccEEecCCCCeEEEeC---CCcEEEEEcCCCCcEEEEe----CC----CcccceEEEeCCCCeEEEEeCCCCcEEEEEc
Q psy4900 403 PTDLALDPTSGLMFVAD---SNQILRTNMDGTMAMSIVS----EA----AYKASGVALDINAKRLFWCDNLLDYIETVDY 471 (485)
Q Consensus 403 p~~iavD~~~~~lywtd---~~~I~r~~~dG~~~~~i~~----~~----~~~p~glavD~~~~~lYW~D~~~~~I~~~~~ 471 (485)
-.|||-|+..++||++- .-+|+...++-....+=+. .. +....||.+|..++.|+..-...+.+...++
T Consensus 183 fEGlA~d~~~~~l~~aKEr~P~~I~~~~~~~~~l~~~~~~~~~~~~~~f~~DvSgl~~~~~~~~LLVLS~ESr~l~Evd~ 262 (316)
T COG3204 183 FEGLAWDPVDHRLFVAKERNPIGIFEVTQSPSSLSVHASLDPTADRDLFVLDVSGLEFNAITNSLLVLSDESRRLLEVDL 262 (316)
T ss_pred ceeeecCCCCceEEEEEccCCcEEEEEecCCcccccccccCcccccceEeeccccceecCCCCcEEEEecCCceEEEEec
Confidence 35799999999999988 3468877765432211111 11 4457799999999999999988999999999
Q ss_pred cCCCeEEEe
Q psy4900 472 EGKNRFLIL 480 (485)
Q Consensus 472 dG~~r~~~~ 480 (485)
+|.-+..+.
T Consensus 263 ~G~~~~~ls 271 (316)
T COG3204 263 SGEVIELLS 271 (316)
T ss_pred CCCeeeeEE
Confidence 998765543
No 108
>PRK01742 tolB translocation protein TolB; Provisional
Probab=77.98 E-value=22 Score=36.86 Aligned_cols=98 Identities=12% Similarity=0.019 Sum_probs=64.6
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCC
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTMAMSIVSEAAYKASGVALDIN 452 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~ 452 (485)
.|.+++.+ |.....+. ..-....++++.|...+|+|+. .+.|+..++.+..++.+.... ..-..+++.+.
T Consensus 185 ~i~i~d~d----g~~~~~lt-~~~~~v~~p~wSPDG~~la~~s~~~~~~~i~i~dl~tg~~~~l~~~~-g~~~~~~wSPD 258 (429)
T PRK01742 185 EVRVADYD----GFNQFIVN-RSSQPLMSPAWSPDGSKLAYVSFENKKSQLVVHDLRSGARKVVASFR-GHNGAPAFSPD 258 (429)
T ss_pred EEEEECCC----CCCceEec-cCCCccccceEcCCCCEEEEEEecCCCcEEEEEeCCCCceEEEecCC-CccCceeECCC
Confidence 67777888 87766554 3333456788999888898875 367999999876655554321 12235677777
Q ss_pred CCeEEEEeCCC--CcEEEEEccCCCeEEEe
Q psy4900 453 AKRLFWCDNLL--DYIETVDYEGKNRFLIL 480 (485)
Q Consensus 453 ~~~lYW~D~~~--~~I~~~~~dG~~r~~~~ 480 (485)
.++|+++-... -.|+..++++...+.|.
T Consensus 259 G~~La~~~~~~g~~~Iy~~d~~~~~~~~lt 288 (429)
T PRK01742 259 GSRLAFASSKDGVLNIYVMGANGGTPSQLT 288 (429)
T ss_pred CCEEEEEEecCCcEEEEEEECCCCCeEeec
Confidence 88888874333 35777788776655543
No 109
>PF02333 Phytase: Phytase; InterPro: IPR003431 Phytase (EC 3.1.3.8) (phytate 3-phosphatase) is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to consist of a single structural unit with a six-bladed propeller folding architecture.; GO: 0016158 3-phytase activity; PDB: 3AMS_A 3AMR_A 1QLG_A 2POO_A 1H6L_A 1CVM_A 1POO_A.
Probab=77.81 E-value=11 Score=38.17 Aligned_cols=76 Identities=21% Similarity=0.290 Sum_probs=52.4
Q ss_pred CCCccEEecCCCCeEEEeC-CCcEEEEEcC---CCCcEEEEeC---CC-cccceEEEeC---CCCeEEEEeCCCCcEEEE
Q psy4900 401 TNPTDLALDPTSGLMFVAD-SNQILRTNMD---GTMAMSIVSE---AA-YKASGVALDI---NAKRLFWCDNLLDYIETV 469 (485)
Q Consensus 401 ~~p~~iavD~~~~~lywtd-~~~I~r~~~d---G~~~~~i~~~---~~-~~p~glavD~---~~~~lYW~D~~~~~I~~~ 469 (485)
.+|.|+++|-..|+||..+ ..-|++...+ +..++.|... .| .-.+||||=. -.+.|.-++-+.++..+.
T Consensus 208 sQ~EGCVVDDe~g~LYvgEE~~GIW~y~Aep~~~~~~~~v~~~~g~~l~aDvEGlaly~~~~g~gYLivSsQG~~sf~Vy 287 (381)
T PF02333_consen 208 SQPEGCVVDDETGRLYVGEEDVGIWRYDAEPEGGNDRTLVASADGDGLVADVEGLALYYGSDGKGYLIVSSQGDNSFAVY 287 (381)
T ss_dssp S-EEEEEEETTTTEEEEEETTTEEEEEESSCCC-S--EEEEEBSSSSB-S-EEEEEEEE-CCC-EEEEEEEGGGTEEEEE
T ss_pred CcceEEEEecccCCEEEecCccEEEEEecCCCCCCcceeeecccccccccCccceEEEecCCCCeEEEEEcCCCCeEEEE
Confidence 4799999999999999999 7789999998 3444444321 23 3688999932 235677777777888888
Q ss_pred EccCCCe
Q psy4900 470 DYEGKNR 476 (485)
Q Consensus 470 ~~dG~~r 476 (485)
+..|.++
T Consensus 288 ~r~~~~~ 294 (381)
T PF02333_consen 288 DREGPNA 294 (381)
T ss_dssp ESSTT--
T ss_pred ecCCCCc
Confidence 8887653
No 110
>KOG4649|consensus
Probab=76.65 E-value=7.9 Score=36.83 Aligned_cols=29 Identities=21% Similarity=0.243 Sum_probs=23.0
Q ss_pred CCCccEEecCCCCeEEEeC--CCcEEEEEcC
Q psy4900 401 TNPTDLALDPTSGLMFVAD--SNQILRTNMD 429 (485)
Q Consensus 401 ~~p~~iavD~~~~~lywtd--~~~I~r~~~d 429 (485)
+.-+=+|||+.+|+|||-. ..+||...|=
T Consensus 31 Hs~~~~avd~~sG~~~We~ilg~RiE~sa~v 61 (354)
T KOG4649|consen 31 HSGIVIAVDPQSGNLIWEAILGVRIECSAIV 61 (354)
T ss_pred CCceEEEecCCCCcEEeehhhCceeeeeeEE
Confidence 3344589999999999988 7888877654
No 111
>KOG1225|consensus
Probab=76.42 E-value=13 Score=39.40 Aligned_cols=8 Identities=50% Similarity=1.514 Sum_probs=5.9
Q ss_pred ccCCCCcc
Q psy4900 287 CACDIGYR 294 (485)
Q Consensus 287 C~C~~Gy~ 294 (485)
|.|..||+
T Consensus 355 C~C~~Gw~ 362 (525)
T KOG1225|consen 355 CKCKKGWR 362 (525)
T ss_pred ceeccCcc
Confidence 67777776
No 112
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=75.35 E-value=2.5 Score=26.56 Aligned_cols=19 Identities=32% Similarity=0.620 Sum_probs=16.7
Q ss_pred ccccCCCCCCeeeCCCCce
Q psy4900 186 FTCQASPTGGVCQCPEGQK 204 (485)
Q Consensus 186 ~~C~n~~~~~~C~C~~G~~ 204 (485)
..|.+++++|.|.|++||.
T Consensus 15 ~~C~~~~~~~~C~C~~g~~ 33 (38)
T cd00054 15 GTCVNTVGSYRCSCPPGYT 33 (38)
T ss_pred CEeECCCCCeEeECCCCCc
Confidence 5688999999999999985
No 113
>COG3823 Glutamine cyclotransferase [Posttranslational modification, protein turnover, chaperones]
Probab=75.20 E-value=16 Score=33.67 Aligned_cols=34 Identities=18% Similarity=0.234 Sum_probs=27.9
Q ss_pred CcccceEEEeCCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 441 AYKASGVALDINAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 441 ~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
...++|||.|+..+|+|-+-..-..+.-+.+++.
T Consensus 228 ~nvlNGIA~~~~~~r~~iTGK~wp~lfEVk~~~a 261 (262)
T COG3823 228 DNVLNGIAHDPQQDRFLITGKLWPLLFEVKLDEA 261 (262)
T ss_pred cccccceeecCcCCeEEEecCcCceeEEEEecCC
Confidence 5679999999999999998766677777777654
No 114
>PF05787 DUF839: Bacterial protein of unknown function (DUF839); InterPro: IPR008557 This family consists of bacterial proteins of unknown function.
Probab=74.18 E-value=12 Score=40.06 Aligned_cols=66 Identities=21% Similarity=0.294 Sum_probs=46.8
Q ss_pred CCCCCCccEEecCCCCeEEEeC---C------------------CcEEEEEcCCC-------CcEEEEeC----------
Q psy4900 398 SNLTNPTDLALDPTSGLMFVAD---S------------------NQILRTNMDGT-------MAMSIVSE---------- 439 (485)
Q Consensus 398 ~~~~~p~~iavD~~~~~lywtd---~------------------~~I~r~~~dG~-------~~~~i~~~---------- 439 (485)
..+.+|.+|+++|..+.||++- . ..|+|...++. ....++..
T Consensus 347 T~f~RpEgi~~~p~~g~vY~a~T~~~~r~~~~~~~~n~~~~n~~G~I~r~~~~~~d~~~~~f~~~~~~~~g~~~~~~~~~ 426 (524)
T PF05787_consen 347 TPFDRPEGITVNPDDGEVYFALTNNSGRGESDVDAANPRAGNGYGQIYRYDPDGNDHAATTFTWELFLVGGDPTDASGNG 426 (524)
T ss_pred ccccCccCeeEeCCCCEEEEEEecCCCCcccccccCCcccCCcccEEEEecccCCccccceeEEEEEEEecCcccccccc
Confidence 3488999999999999999985 1 36999888876 33333332
Q ss_pred -------CCcccceEEEeCCCCeEEE-EeCCCC
Q psy4900 440 -------AAYKASGVALDINAKRLFW-CDNLLD 464 (485)
Q Consensus 440 -------~~~~p~glavD~~~~~lYW-~D~~~~ 464 (485)
.+..|.+|++|.. ++||. +|...+
T Consensus 427 ~~~~~~~~f~sPDNL~~d~~-G~LwI~eD~~~~ 458 (524)
T PF05787_consen 427 SNKCDDNGFASPDNLAFDPD-GNLWIQEDGGGS 458 (524)
T ss_pred cCcccCCCcCCCCceEECCC-CCEEEEeCCCCC
Confidence 2668999999975 55544 444443
No 115
>cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and botto
Probab=70.53 E-value=67 Score=29.35 Aligned_cols=74 Identities=11% Similarity=0.130 Sum_probs=48.6
Q ss_pred CCCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccC
Q psy4900 399 NLTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEG 473 (485)
Q Consensus 399 ~~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG 473 (485)
....+..|++++...+|+-.. ...|...++........+......+..|++.+..+.|+-+.. .+.|...++..
T Consensus 134 ~~~~i~~~~~~~~~~~l~~~~~~~~i~i~d~~~~~~~~~~~~~~~~i~~~~~~~~~~~l~~~~~-~~~i~i~d~~~ 208 (289)
T cd00200 134 HTDWVNSVAFSPDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSS-DGTIKLWDLST 208 (289)
T ss_pred CCCcEEEEEEcCcCCEEEEEcCCCcEEEEEccccccceeEecCccccceEEECCCcCEEEEecC-CCcEEEEECCC
Confidence 344578899999877777666 778888888744433334333345677887766656665543 67777777764
No 116
>KOG4499|consensus
Probab=69.60 E-value=19 Score=33.79 Aligned_cols=46 Identities=26% Similarity=0.486 Sum_probs=36.9
Q ss_pred EEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEcc--C---CCeEEEec
Q psy4900 436 IVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYE--G---KNRFLILR 481 (485)
Q Consensus 436 i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~d--G---~~r~~~~~ 481 (485)
++...+..|+|||-|..++.+|.+|+....|...++| + ++|++|+.
T Consensus 152 ~i~~~v~IsNgl~Wd~d~K~fY~iDsln~~V~a~dyd~~tG~~snr~~i~d 202 (310)
T KOG4499|consen 152 LIWNCVGISNGLAWDSDAKKFYYIDSLNYEVDAYDYDCPTGDLSNRKVIFD 202 (310)
T ss_pred eeehhccCCccccccccCcEEEEEccCceEEeeeecCCCcccccCcceeEE
Confidence 3445688899999999999999999999999777765 3 35777764
No 117
>PF02333 Phytase: Phytase; InterPro: IPR003431 Phytase (EC 3.1.3.8) (phytate 3-phosphatase) is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity and has been shown to consist of a single structural unit with a six-bladed propeller folding architecture.; GO: 0016158 3-phytase activity; PDB: 3AMS_A 3AMR_A 1QLG_A 2POO_A 1H6L_A 1CVM_A 1POO_A.
Probab=69.53 E-value=33 Score=34.90 Aligned_cols=81 Identities=25% Similarity=0.447 Sum_probs=48.6
Q ss_pred CCCCCCccEEe--cCCCCeEEEeC---CCcEEEEEc--CCCCc--EEEEeC--CCcccceEEEeCCCCeEEEEeCCCCcE
Q psy4900 398 SNLTNPTDLAL--DPTSGLMFVAD---SNQILRTNM--DGTMA--MSIVSE--AAYKASGVALDINAKRLFWCDNLLDYI 466 (485)
Q Consensus 398 ~~~~~p~~iav--D~~~~~lywtd---~~~I~r~~~--dG~~~--~~i~~~--~~~~p~glavD~~~~~lYW~D~~~~~I 466 (485)
..+..|.||.+ ++..|.+|-.- ...++...| ++..+ .++|.+ --.+|+|+++|-..++||..+-.. -|
T Consensus 153 ~~~~e~yGlcly~~~~~g~~ya~v~~k~G~~~Qy~L~~~~~g~v~~~lVR~f~~~sQ~EGCVVDDe~g~LYvgEE~~-GI 231 (381)
T PF02333_consen 153 TDLSEPYGLCLYRSPSTGALYAFVNGKDGRVEQYELTDDGDGKVSATLVREFKVGSQPEGCVVDDETGRLYVGEEDV-GI 231 (381)
T ss_dssp -SSSSEEEEEEEE-TTT--EEEEEEETTSEEEEEEEEE-TTSSEEEEEEEEEE-SS-EEEEEEETTTTEEEEEETTT-EE
T ss_pred cccccceeeEEeecCCCCcEEEEEecCCceEEEEEEEeCCCCcEeeEEEEEecCCCcceEEEEecccCCEEEecCcc-EE
Confidence 45677888887 56777666554 455555555 34432 233332 134799999999999999999765 48
Q ss_pred EEEEcc---CCCeEEE
Q psy4900 467 ETVDYE---GKNRFLI 479 (485)
Q Consensus 467 ~~~~~d---G~~r~~~ 479 (485)
++...+ |..++.|
T Consensus 232 W~y~Aep~~~~~~~~v 247 (381)
T PF02333_consen 232 WRYDAEPEGGNDRTLV 247 (381)
T ss_dssp EEEESSCCC-S--EEE
T ss_pred EEEecCCCCCCcceee
Confidence 888876 3445544
No 118
>cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and botto
Probab=68.78 E-value=59 Score=29.73 Aligned_cols=73 Identities=11% Similarity=0.101 Sum_probs=46.1
Q ss_pred CCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccC
Q psy4900 400 LTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEG 473 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG 473 (485)
...+..|++.+....|+.+. ...|...++........+......+..|+++.. ++++.+-...+.|...++..
T Consensus 177 ~~~i~~~~~~~~~~~l~~~~~~~~i~i~d~~~~~~~~~~~~~~~~i~~~~~~~~-~~~~~~~~~~~~i~i~~~~~ 250 (289)
T cd00200 177 TGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSPD-GYLLASGSEDGTIRVWDLRT 250 (289)
T ss_pred ccccceEEECCCcCEEEEecCCCcEEEEECCCCceecchhhcCCceEEEEEcCC-CcEEEEEcCCCcEEEEEcCC
Confidence 34577889998876777776 677888887743333223222335677887765 55555544567777777663
No 119
>TIGR02276 beta_rpt_yvtn 40-residue YVTN family beta-propeller repeat. This repeat of about 40 amino acids is found in up to 14 copies per protein. The archaea Methanosarcina mazei and Methanosarcina acetivorans each have over 10 genes that encode tandem copies of this repeat, which is also found in other species. PSIPRED predicts with high confidence that each 40-residue repeat contains four beta strands. This model overlaps somewhat with the NHL repeat (Pfam pfam01436) and also shows sequence similarity to the WD domain, G-beta repeat (Pfam pfam00400).
Probab=67.81 E-value=21 Score=22.98 Aligned_cols=39 Identities=15% Similarity=0.227 Sum_probs=22.2
Q ss_pred CCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEe
Q psy4900 411 TSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALD 450 (485)
Q Consensus 411 ~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD 450 (485)
..++||-++ .+.|...+........-+.. -..|.+|+++
T Consensus 2 d~~~lyv~~~~~~~v~~id~~~~~~~~~i~v-g~~P~~i~~~ 42 (42)
T TIGR02276 2 DGTKLYVTNSGSNTVSVIDTATNKVIATIPV-GGYPFGVAVS 42 (42)
T ss_pred CCCEEEEEeCCCCEEEEEECCCCeEEEEEEC-CCCCceEEeC
Confidence 456788877 56777766532222222222 3568888764
No 120
>COG4946 Uncharacterized protein related to the periplasmic component of the Tol biopolymer transport system [Function unknown]
Probab=66.80 E-value=81 Score=32.74 Aligned_cols=95 Identities=12% Similarity=0.066 Sum_probs=63.9
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCe
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKR 455 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~ 455 (485)
.+.+.+.+ |...+.+. .++.+..++++++...++-.+. ...|+..++|-.+.++|-...-....++++.+..+.
T Consensus 383 ~l~iyd~~----~~e~kr~e-~~lg~I~av~vs~dGK~~vvaNdr~el~vididngnv~~idkS~~~lItdf~~~~nsr~ 457 (668)
T COG4946 383 KLGIYDKD----GGEVKRIE-KDLGNIEAVKVSPDGKKVVVANDRFELWVIDIDNGNVRLIDKSEYGLITDFDWHPNSRW 457 (668)
T ss_pred eEEEEecC----CceEEEee-CCccceEEEEEcCCCcEEEEEcCceEEEEEEecCCCeeEecccccceeEEEEEcCCcee
Confidence 55556666 66555544 7899999999999977788887 789999999977767666555555566665555554
Q ss_pred EEEEeC---CCCcEEEEEccCCCe
Q psy4900 456 LFWCDN---LLDYIETVDYEGKNR 476 (485)
Q Consensus 456 lYW~D~---~~~~I~~~~~dG~~r 476 (485)
|=.+=. .+..|...+++|...
T Consensus 458 iAYafP~gy~tq~Iklydm~~~Ki 481 (668)
T COG4946 458 IAYAFPEGYYTQSIKLYDMDGGKI 481 (668)
T ss_pred EEEecCcceeeeeEEEEecCCCeE
Confidence 433221 135667777776543
No 121
>COG4257 Vgb Streptogramin lyase [Defense mechanisms]
Probab=65.94 E-value=27 Score=33.67 Aligned_cols=77 Identities=13% Similarity=0.127 Sum_probs=53.3
Q ss_pred CccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCC-CcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEE
Q psy4900 403 PTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEA-AYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLI 479 (485)
Q Consensus 403 p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~-~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~ 479 (485)
-|.|-.|+. |.++-|+ ..+++|.+-.-+.....-..+ -.+|..|-|| .+++++..|+..+.|.+.+..-..-+++
T Consensus 235 sRriwsdpi-g~~wittwg~g~l~rfdPs~~sW~eypLPgs~arpys~rVD-~~grVW~sea~agai~rfdpeta~ftv~ 312 (353)
T COG4257 235 SRRIWSDPI-GRAWITTWGTGSLHRFDPSVTSWIEYPLPGSKARPYSMRVD-RHGRVWLSEADAGAIGRFDPETARFTVL 312 (353)
T ss_pred ccccccCcc-CcEEEeccCCceeeEeCcccccceeeeCCCCCCCcceeeec-cCCcEEeeccccCceeecCcccceEEEe
Confidence 455677776 5666666 677888776655544433222 4478999999 7888888899999999987766555554
Q ss_pred ec
Q psy4900 480 LR 481 (485)
Q Consensus 480 ~~ 481 (485)
..
T Consensus 313 p~ 314 (353)
T COG4257 313 PI 314 (353)
T ss_pred cC
Confidence 43
No 122
>KOG4289|consensus
Probab=65.26 E-value=7 Score=45.39 Aligned_cols=31 Identities=26% Similarity=0.653 Sum_probs=22.3
Q ss_pred CCCCCc--cccccCCCCCCeeeCCCCceeeCCCCCCC
Q psy4900 179 CSLLNC--EFTCQASPTGGVCQCPEGQKVANDSRTCL 213 (485)
Q Consensus 179 C~~~~C--~~~C~n~~~~~~C~C~~G~~l~~~~~~C~ 213 (485)
|-..+| ...|....|+|+|.|.+||. +..|.
T Consensus 1242 CYs~pC~nng~C~srEggYtCeCrpg~t----GehCE 1274 (2531)
T KOG4289|consen 1242 CYSGPCGNNGRCRSREGGYTCECRPGFT----GEHCE 1274 (2531)
T ss_pred hhcCCCCCCCceEEecCceeEEecCCcc----cccee
Confidence 444444 35788889999999999984 44564
No 123
>PF02239 Cytochrom_D1: Cytochrome D1 heme domain; PDB: 1NNO_B 1HZU_A 1N15_B 1N50_A 1GJQ_A 1BL9_B 1NIR_B 1N90_B 1HZV_A 1AOQ_A ....
Probab=65.15 E-value=30 Score=35.20 Aligned_cols=69 Identities=13% Similarity=0.277 Sum_probs=45.9
Q ss_pred CccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEcc
Q psy4900 403 PTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYE 472 (485)
Q Consensus 403 p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~d 472 (485)
+.++++-+..+++|-+. ...|...++.-.....-+.. -..|.|+|+.+..++||-+....+.|...+..
T Consensus 39 h~~~~~s~Dgr~~yv~~rdg~vsviD~~~~~~v~~i~~-G~~~~~i~~s~DG~~~~v~n~~~~~v~v~D~~ 108 (369)
T PF02239_consen 39 HAGLKFSPDGRYLYVANRDGTVSVIDLATGKVVATIKV-GGNPRGIAVSPDGKYVYVANYEPGTVSVIDAE 108 (369)
T ss_dssp EEEEE-TT-SSEEEEEETTSEEEEEETTSSSEEEEEE--SSEEEEEEE--TTTEEEEEEEETTEEEEEETT
T ss_pred eeEEEecCCCCEEEEEcCCCeEEEEECCcccEEEEEec-CCCcceEEEcCCCCEEEEEecCCCceeEeccc
Confidence 44567777778899988 67788888875543222322 34699999999999999888777777776543
No 124
>PF14583 Pectate_lyase22: Oligogalacturonate lyase; PDB: 3C5M_C 3PE7_A.
Probab=64.31 E-value=28 Score=35.40 Aligned_cols=68 Identities=13% Similarity=0.247 Sum_probs=41.0
Q ss_pred CeEEEeC---CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEEec
Q psy4900 413 GLMFVAD---SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLILR 481 (485)
Q Consensus 413 ~~lywtd---~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~~~ 481 (485)
.+||-++ ...++.++|.....+.|....-....|..+-...+.|||... ...+.+++|+....++|+.
T Consensus 49 kllF~s~~dg~~nly~lDL~t~~i~QLTdg~g~~~~g~~~s~~~~~~~Yv~~-~~~l~~vdL~T~e~~~vy~ 119 (386)
T PF14583_consen 49 KLLFASDFDGNRNLYLLDLATGEITQLTDGPGDNTFGGFLSPDDRALYYVKN-GRSLRRVDLDTLEERVVYE 119 (386)
T ss_dssp EEEEEE-TTSS-EEEEEETTT-EEEE---SS-B-TTT-EE-TTSSEEEEEET-TTEEEEEETTT--EEEEEE
T ss_pred EEEEEeccCCCcceEEEEcccCEEEECccCCCCCccceEEecCCCeEEEEEC-CCeEEEEECCcCcEEEEEE
Confidence 3555555 567888888888777777654444456777788899888753 3678999999988777664
No 125
>TIGR03032 conserved hypothetical protein TIGR03032. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown.
Probab=58.99 E-value=16 Score=35.93 Aligned_cols=41 Identities=15% Similarity=0.103 Sum_probs=33.8
Q ss_pred EEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEcc-CCCeEE
Q psy4900 436 IVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYE-GKNRFL 478 (485)
Q Consensus 436 i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~d-G~~r~~ 478 (485)
++.+++..|++-. |..++||.+|++++.|.+++++ |+...+
T Consensus 197 vl~~GLsmPhSPR--WhdgrLwvldsgtGev~~vD~~~G~~e~V 238 (335)
T TIGR03032 197 VVASGLSMPHSPR--WYQGKLWLLNSGRGELGYVDPQAGKFQPV 238 (335)
T ss_pred EEEcCccCCcCCc--EeCCeEEEEECCCCEEEEEcCCCCcEEEE
Confidence 4446888888888 8899999999999999999998 655443
No 126
>KOG1225|consensus
Probab=57.46 E-value=46 Score=35.34 Aligned_cols=9 Identities=33% Similarity=1.051 Sum_probs=7.4
Q ss_pred eeCCCCcee
Q psy4900 197 CQCPEGQKV 205 (485)
Q Consensus 197 C~C~~G~~l 205 (485)
|.|..||.-
T Consensus 355 C~C~~Gw~G 363 (525)
T KOG1225|consen 355 CKCKKGWRG 363 (525)
T ss_pred ceeccCccC
Confidence 889999864
No 127
>PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis.
Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction: Dipeptidyl-Polypeptide + H(2)O = Dipeptide + Polypeptide. It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0006508 proteolysis, 0016020 membrane; PDB: 2RIP_A 3Q8W_B 2AJL_I 1TKR_B 1TK3_B 3C45_A 2G5P_A 3G0C_D 1R9M_C 1RWQ_A ....
Probab=55.93 E-value=46 Score=33.43 Aligned_cols=81 Identities=12% Similarity=0.112 Sum_probs=52.9
Q ss_pred CCCCccEEec-CCCCeEEEeC----CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCC----CCcEEEEE
Q psy4900 400 LTNPTDLALD-PTSGLMFVAD----SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNL----LDYIETVD 470 (485)
Q Consensus 400 ~~~p~~iavD-~~~~~lywtd----~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~----~~~I~~~~ 470 (485)
+.....+.+- +..+.++|.- -..|+..+++|...+.|......--.-+++|..+++||++-.. ...|++++
T Consensus 234 v~~~~~~~~~~~~~~~~l~~s~~~G~~hly~~~~~~~~~~~lT~G~~~V~~i~~~d~~~~~iyf~a~~~~p~~r~lY~v~ 313 (353)
T PF00930_consen 234 VDVYDPPHFLGPDGNEFLWISERDGYRHLYLYDLDGGKPRQLTSGDWEVTSILGWDEDNNRIYFTANGDNPGERHLYRVS 313 (353)
T ss_dssp SSSSSEEEE-TTTSSEEEEEEETTSSEEEEEEETTSSEEEESS-SSS-EEEEEEEECTSSEEEEEESSGGTTSBEEEEEE
T ss_pred eeeecccccccCCCCEEEEEEEcCCCcEEEEEcccccceeccccCceeecccceEcCCCCEEEEEecCCCCCceEEEEEE
Confidence 4334445443 4444455544 3689999999988664443332222458889999999999876 35899999
Q ss_pred cc-CCCeEEEe
Q psy4900 471 YE-GKNRFLIL 480 (485)
Q Consensus 471 ~d-G~~r~~~~ 480 (485)
++ |...+.|-
T Consensus 314 ~~~~~~~~~LT 324 (353)
T PF00930_consen 314 LDSGGEPKCLT 324 (353)
T ss_dssp TTETTEEEESS
T ss_pred eCCCCCeEecc
Confidence 99 76665543
No 128
>PRK02888 nitrous-oxide reductase; Validated
Probab=54.49 E-value=67 Score=34.85 Aligned_cols=32 Identities=16% Similarity=0.169 Sum_probs=28.0
Q ss_pred cccceEEEeCCCCeEEEEeCCCCcEEEEEccC
Q psy4900 442 YKASGVALDINAKRLFWCDNLLDYIETVDYEG 473 (485)
Q Consensus 442 ~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG 473 (485)
..|.||++.+..++||-+-.....|.+.++.-
T Consensus 321 KsPHGV~vSPDGkylyVanklS~tVSVIDv~k 352 (635)
T PRK02888 321 KNPHGVNTSPDGKYFIANGKLSPTVTVIDVRK 352 (635)
T ss_pred CCccceEECCCCCEEEEeCCCCCcEEEEEChh
Confidence 36999999999999999988888888888765
No 129
>COG0823 TolB Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking and secretion]
Probab=51.90 E-value=83 Score=32.67 Aligned_cols=98 Identities=12% Similarity=0.092 Sum_probs=59.9
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC----CCcEEEEEcCCCCcEEEEeCC-CcccceEEEeC
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD----SNQILRTNMDGTMAMSIVSEA-AYKASGVALDI 451 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd----~~~I~r~~~dG~~~~~i~~~~-~~~p~glavD~ 451 (485)
+|.+.+++ ......++. .-.+-...++-|...+|.++- ...|+..+++|..+..|.... +.. .=.+-+
T Consensus 219 ~i~~~~l~----~g~~~~i~~-~~g~~~~P~fspDG~~l~f~~~rdg~~~iy~~dl~~~~~~~Lt~~~gi~~--~Ps~sp 291 (425)
T COG0823 219 RIYYLDLN----TGKRPVILN-FNGNNGAPAFSPDGSKLAFSSSRDGSPDIYLMDLDGKNLPRLTNGFGINT--SPSWSP 291 (425)
T ss_pred eEEEEecc----CCccceeec-cCCccCCccCCCCCCEEEEEECCCCCccEEEEcCCCCcceecccCCcccc--CccCCC
Confidence 46666666 454545543 122223345666667777776 578999999999866644321 222 222335
Q ss_pred CCCeEEEEeCCC--CcEEEEEccCCCeEEEec
Q psy4900 452 NAKRLFWCDNLL--DYIETVDYEGKNRFLILR 481 (485)
Q Consensus 452 ~~~~lYW~D~~~--~~I~~~~~dG~~r~~~~~ 481 (485)
..++||++-... ..|++++++|+..+.|..
T Consensus 292 dG~~ivf~Sdr~G~p~I~~~~~~g~~~~riT~ 323 (425)
T COG0823 292 DGSKIVFTSDRGGRPQIYLYDLEGSQVTRLTF 323 (425)
T ss_pred CCCEEEEEeCCCCCcceEEECCCCCceeEeec
Confidence 677777775443 589999999998655443
No 130
>PF02239 Cytochrom_D1: Cytochrome D1 heme domain; PDB: 1NNO_B 1HZU_A 1N15_B 1N50_A 1GJQ_A 1BL9_B 1NIR_B 1N90_B 1HZV_A 1AOQ_A ....
Probab=51.65 E-value=1e+02 Score=31.26 Aligned_cols=107 Identities=14% Similarity=0.099 Sum_probs=57.0
Q ss_pred cccCCCCccCCCCC-CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCc-EEEEe
Q psy4900 363 WKCDSENDCGDGSD-EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMA-MSIVS 438 (485)
Q Consensus 363 ~~~~~~~~y~~d~~-~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~-~~i~~ 438 (485)
..++++.+|....+ .|.+.++.+ +.....+-. -..|++|++.+..++||-+. .+.|...+...-.. +.|-.
T Consensus 44 ~s~Dgr~~yv~~rdg~vsviD~~~---~~~v~~i~~--G~~~~~i~~s~DG~~~~v~n~~~~~v~v~D~~tle~v~~I~~ 118 (369)
T PF02239_consen 44 FSPDGRYLYVANRDGTVSVIDLAT---GKVVATIKV--GGNPRGIAVSPDGKYVYVANYEPGTVSVIDAETLEPVKTIPT 118 (369)
T ss_dssp -TT-SSEEEEEETTSEEEEEETTS---SSEEEEEE---SSEEEEEEE--TTTEEEEEEEETTEEEEEETTT--EEEEEE-
T ss_pred ecCCCCEEEEEcCCCeEEEEECCc---ccEEEEEec--CCCcceEEEcCCCCEEEEEecCCCceeEeccccccceeeccc
Confidence 45677888887665 666677651 333334432 34699999999989999887 56666654432221 22211
Q ss_pred CCC----c--ccceEEEeCCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 439 EAA----Y--KASGVALDINAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 439 ~~~----~--~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
..+ . .+.||.-.+.+...+++-...++|...++...
T Consensus 119 ~~~~~~~~~~Rv~aIv~s~~~~~fVv~lkd~~~I~vVdy~d~ 160 (369)
T PF02239_consen 119 GGMPVDGPESRVAAIVASPGRPEFVVNLKDTGEIWVVDYSDP 160 (369)
T ss_dssp -EE-TTTS---EEEEEE-SSSSEEEEEETTTTEEEEEETTTS
T ss_pred ccccccccCCCceeEEecCCCCEEEEEEccCCeEEEEEeccc
Confidence 111 1 23345444444444444555678888876654
No 131
>PF10313 DUF2415: Uncharacterised protein domain (DUF2415); InterPro: IPR019417 This entry represents a short (30 residues) domain of unknown function found in a family of fungal proteins. It contains a characteristic DLL sequence motif.
Probab=51.45 E-value=44 Score=22.48 Aligned_cols=34 Identities=15% Similarity=0.271 Sum_probs=24.9
Q ss_pred ccEEecCCCC---eEEEeC-CCcEEEEEcC-CCCcEEEE
Q psy4900 404 TDLALDPTSG---LMFVAD-SNQILRTNMD-GTMAMSIV 437 (485)
Q Consensus 404 ~~iavD~~~~---~lywtd-~~~I~r~~~d-G~~~~~i~ 437 (485)
|.+.+-|..+ +|.|++ ..+|-.+++. +..++.||
T Consensus 4 R~~kFsP~~~~~DLL~~~E~~g~vhi~D~R~~f~~~QVi 42 (43)
T PF10313_consen 4 RCCKFSPEPGGNDLLAWAEHQGRVHIVDTRSNFMRRQVI 42 (43)
T ss_pred EEEEeCCCCCcccEEEEEccCCeEEEEEcccCccceEee
Confidence 5566666555 999999 8899999988 55555443
No 132
>COG4257 Vgb Streptogramin lyase [Defense mechanisms]
Probab=51.32 E-value=72 Score=30.87 Aligned_cols=71 Identities=17% Similarity=0.186 Sum_probs=50.9
Q ss_pred CCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEc
Q psy4900 398 SNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDY 471 (485)
Q Consensus 398 ~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~ 471 (485)
..-..|..+|.++. |.+++++ .+.|-+.+-.....+++-...-..|+||-|+ ..+.++.+|.++ .|.|.+-
T Consensus 59 p~G~ap~dvapapd-G~VWft~qg~gaiGhLdP~tGev~~ypLg~Ga~Phgiv~g-pdg~~Witd~~~-aI~R~dp 131 (353)
T COG4257 59 PNGSAPFDVAPAPD-GAVWFTAQGTGAIGHLDPATGEVETYPLGSGASPHGIVVG-PDGSAWITDTGL-AIGRLDP 131 (353)
T ss_pred CCCCCccccccCCC-CceEEecCccccceecCCCCCceEEEecCCCCCCceEEEC-CCCCeeEecCcc-eeEEecC
Confidence 34556888999977 7888888 6777777665444444444556689999999 566778888776 5766544
No 133
>TIGR03118 PEPCTERM_chp_1 conserved hypothetical protein TIGR03118. This model describes an uncharacterized conserved hypothetical protein. Members are found with the C-terminal putative exosortase interaction domain, PEP-CTERM, in Nitrosospira multiformis, Rhodoferax ferrireducens, Solibacter usitatus Ellin6076, and Acidobacteria bacterium Ellin345. It is found without the PEP-CTERM domain in several other species, including Burkholderia ambifaria, Gloeobacter violaceus PCC 7421, and three copies in the Acanthamoeba polyphaga mimivirus.
Probab=51.04 E-value=70 Score=31.46 Aligned_cols=76 Identities=17% Similarity=0.229 Sum_probs=50.7
Q ss_pred CccEEecC--CCCeEEEeC--CCcEEEEEcCCCCcEEEEeC-----C---CcccceEEEeCCCCeEEEEeCC--------
Q psy4900 403 PTDLALDP--TSGLMFVAD--SNQILRTNMDGTMAMSIVSE-----A---AYKASGVALDINAKRLFWCDNL-------- 462 (485)
Q Consensus 403 p~~iavD~--~~~~lywtd--~~~I~r~~~dG~~~~~i~~~-----~---~~~p~glavD~~~~~lYW~D~~-------- 462 (485)
-+||||-. ...+||-+| ..+|... |++-+++-+.. . -.-|.+|. -+.++||.+-++
T Consensus 140 YkGLAi~~~~~~~~LYaadF~~g~IDVF--d~~f~~~~~~g~F~DP~iPagyAPFnIq--nig~~lyVtYA~qd~~~~d~ 215 (336)
T TIGR03118 140 YKGLAVGPTGGGDYLYAANFRQGRIDVF--KGSFRPPPLPGSFIDPALPAGYAPFNVQ--NLGGTLYVTYAQQDADRNDE 215 (336)
T ss_pred eeeeEEeecCCCceEEEeccCCCceEEe--cCccccccCCCCccCCCCCCCCCCcceE--EECCeEEEEEEecCCccccc
Confidence 35666653 367999999 7888876 44444333221 1 12356665 678999988643
Q ss_pred -----CCcEEEEEccCCCeEEEecC
Q psy4900 463 -----LDYIETVDYEGKNRFLILRG 482 (485)
Q Consensus 463 -----~~~I~~~~~dG~~r~~~~~~ 482 (485)
.+.|.+.+++|...+.+.++
T Consensus 216 v~G~G~G~VdvFd~~G~l~~r~as~ 240 (336)
T TIGR03118 216 VAGAGLGYVNVFTLNGQLLRRVASS 240 (336)
T ss_pred ccCCCcceEEEEcCCCcEEEEeccC
Confidence 36899999999987776554
No 134
>KOG1217|consensus
Probab=49.30 E-value=15 Score=38.07 Aligned_cols=61 Identities=26% Similarity=0.587 Sum_probs=41.5
Q ss_pred ccccCCCCCCeeeCCCCceeeCCCCCCCCcchhhhccccccccccccceeheeheeeeeeecccCCCCCCCCCCCCCCCC
Q psy4900 186 FTCQASPTGGVCQCPEGQKVANDSRTCLLYMKNNLKQAVRSSTVSSHVKLVLLEVYVNVLKVRKLPTTAEPQSPNPCGSN 265 (485)
Q Consensus 186 ~~C~n~~~~~~C~C~~G~~l~~~~~~C~d~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~~~~C~~~ 265 (485)
..|.++.++++|.|++||.+.. ...|.++++ |...
T Consensus 243 ~~c~~~~~~~~C~~~~g~~~~~-~~~~~~~~~--------------------------------------------C~~~ 277 (487)
T KOG1217|consen 243 GTCVNTVGSYTCRCPEGYTGDA-CVTCVDVDS--------------------------------------------CALI 277 (487)
T ss_pred CcccccCCceeeeCCCCccccc-cceeeeccc--------------------------------------------cCCC
Confidence 5788888889999999987632 123444444 5432
Q ss_pred CCCcc--ccccccccCCCCCCccccCCCCccccC
Q psy4900 266 NGGCE--HMCIITRASGNALGYKCACDIGYRLSV 297 (485)
Q Consensus 266 ~g~C~--~~C~n~~~~~~~g~~~C~C~~Gy~l~~ 297 (485)
.. |. ..|++.+ +.|+|.|+.||....
T Consensus 278 ~~-c~~~~~C~~~~-----~~~~C~C~~g~~g~~ 305 (487)
T KOG1217|consen 278 AS-CPNGGTCVNVP-----GSYRCTCPPGFTGRL 305 (487)
T ss_pred Cc-cCCCCeeecCC-----CcceeeCCCCCCCCC
Confidence 22 43 4788877 779999999997544
No 135
>PF04885 Stig1: Stigma-specific protein, Stig1; InterPro: IPR006969 This family represents the Stig1 cysteine rich plant protein. The tobacco stigma-specific gene, STIG1, is developmentally regulated and expressed specifically in the stigmatic secretory zone. Pistils of transgenic STIG1-barnase tobacco plants undergo normal development, but lack the stigmatic secretory zone and are female sterile. Pollen grains are unable to penetrate the surface of the ablated pistils. Application of stigmatic exudate from wild-type pistils to the ablated surface increases the efficiency of pollen tube germination and growth and restores the capacity of pollen tubes to penetrate the style. The function of STIG1 is unknown.
Probab=48.56 E-value=39 Score=29.01 Aligned_cols=32 Identities=38% Similarity=0.889 Sum_probs=14.0
Q ss_pred CCCCC-CcCCCCCCCCCCCCCCCCeecCCCCceecC
Q psy4900 38 DCRNR-KDEEGCPATTGLSCDLDQFRCANGQKCIDA 72 (485)
Q Consensus 38 dC~d~-sdE~~C~~~~~~~C~~~~f~C~~g~~Ci~~ 72 (485)
.|.|- +|..+|.. -+..|+.++ .|-+|. |++.
T Consensus 76 ~Cvdv~~d~~nCG~-Cg~~C~~g~-~cC~G~-Cvd~ 108 (136)
T PF04885_consen 76 KCVDVSSDRNNCGA-CGNKCPYGQ-TCCGGQ-CVDL 108 (136)
T ss_pred cCCccCCCccccHh-hcCCCCCCc-eecCCE-eECC
Confidence 45544 35555532 113454444 333444 5543
No 136
>COG4946 Uncharacterized protein related to the periplasmic component of the Tol biopolymer transport system [Function unknown]
Probab=47.00 E-value=92 Score=32.37 Aligned_cols=62 Identities=19% Similarity=0.200 Sum_probs=44.5
Q ss_pred CCCCccEEecCCCCeEEEeC-----CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCC
Q psy4900 400 LTNPTDLALDPTSGLMFVAD-----SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNL 462 (485)
Q Consensus 400 ~~~p~~iavD~~~~~lywtd-----~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~ 462 (485)
..-..+++.+|..++|-++= ...|..++|+|.....+. +....-.+=|+|+..+.||+....
T Consensus 443 ~~lItdf~~~~nsr~iAYafP~gy~tq~Iklydm~~~Kiy~vT-T~ta~DfsPaFD~d~ryLYfLs~R 509 (668)
T COG4946 443 YGLITDFDWHPNSRWIAYAFPEGYYTQSIKLYDMDGGKIYDVT-TPTAYDFSPAFDPDGRYLYFLSAR 509 (668)
T ss_pred cceeEEEEEcCCceeEEEecCcceeeeeEEEEecCCCeEEEec-CCcccccCcccCCCCcEEEEEecc
Confidence 33455677778888777665 478999999987644333 444455567899999999998754
No 137
>TIGR03075 PQQ_enz_alc_DH PQQ-dependent dehydrogenase, methanol/ethanol family. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Genes in this family often are found adjacent to the PQQ biosynthesis genes themselves. An unusual, strained disulfide bond between adjacent Cys residues contributes to PQQ-binding, as does a Trp residue that is part of a PQQ enzyme repeat (see pfam01011). Characterized members include the dehydrogenase subunit of a membrane-anchored, three subunit alcohol (ethanol) dehydrogenase of Gluconobacter suboxydans, a homodimeric ethanol dehydrogenase in Pseudomonas aeruginosa, and the large subunit of an alpha2/beta2 heterotetrameric methanol dehydrogenase in Methylobacterium extorquens.
Probab=46.77 E-value=69 Score=34.31 Aligned_cols=72 Identities=18% Similarity=0.313 Sum_probs=45.4
Q ss_pred cEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCC-----------CC-cEEEEEc
Q psy4900 405 DLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNL-----------LD-YIETVDY 471 (485)
Q Consensus 405 ~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~-----------~~-~I~~~~~ 471 (485)
.+++|+..++|||-- ++.- .+|..|. ..++..-.=||||..++++=|.--. .. .+...+.
T Consensus 238 ~~s~D~~~~lvy~~tGnp~p----~~~~~r~---gdnl~~~s~vAld~~TG~~~W~~Q~~~~D~wD~d~~~~p~l~d~~~ 310 (527)
T TIGR03075 238 TGSYDPETNLIYFGTGNPSP----WNSHLRP---GDNLYTSSIVARDPDTGKIKWHYQTTPHDEWDYDGVNEMILFDLKK 310 (527)
T ss_pred ceeEcCCCCeEEEeCCCCCC----CCCCCCC---CCCccceeEEEEccccCCEEEeeeCCCCCCccccCCCCcEEEEecc
Confidence 479999999999988 3322 4444441 2234444558889999999886422 11 2233446
Q ss_pred cCCCeEEEecCC
Q psy4900 472 EGKNRFLILRGS 483 (485)
Q Consensus 472 dG~~r~~~~~~~ 483 (485)
+|+.|++|+...
T Consensus 311 ~G~~~~~v~~~~ 322 (527)
T TIGR03075 311 DGKPRKLLAHAD 322 (527)
T ss_pred CCcEEEEEEEeC
Confidence 888778877654
No 138
>KOG0285|consensus
Probab=46.45 E-value=45 Score=33.26 Aligned_cols=63 Identities=19% Similarity=0.208 Sum_probs=45.9
Q ss_pred EecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEE
Q psy4900 395 VLGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFW 458 (485)
Q Consensus 395 l~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW 458 (485)
++...+...+.|||||.+.+ |-|- ...|.--++.....+.-+...+....|+||...+-.||=
T Consensus 146 Vi~gHlgWVr~vavdP~n~w-f~tgs~DrtikIwDlatg~LkltltGhi~~vr~vavS~rHpYlFs 210 (460)
T KOG0285|consen 146 VISGHLGWVRSVAVDPGNEW-FATGSADRTIKIWDLATGQLKLTLTGHIETVRGVAVSKRHPYLFS 210 (460)
T ss_pred hhhhccceEEEEeeCCCcee-EEecCCCceeEEEEcccCeEEEeecchhheeeeeeecccCceEEE
Confidence 45577899999999998533 3343 567777788766656656566778899998877777764
No 139
>PF14583 Pectate_lyase22: Oligogalacturonate lyase; PDB: 3C5M_C 3PE7_A.
Probab=45.93 E-value=1.3e+02 Score=30.57 Aligned_cols=102 Identities=12% Similarity=0.081 Sum_probs=49.8
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEe-CCCcEEEEEcCCCCcEEEEeCCCccc-ceEE-EeCCC
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVA-DSNQILRTNMDGTMAMSIVSEAAYKA-SGVA-LDINA 453 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywt-d~~~I~r~~~dG~~~~~i~~~~~~~p-~gla-vD~~~ 453 (485)
.++..++. ......|-.....+..|..+-+..+.|||. +...+.+++|+....++|....-.+- .|-. ++...
T Consensus 61 nly~lDL~----t~~i~QLTdg~g~~~~g~~~s~~~~~~~Yv~~~~~l~~vdL~T~e~~~vy~~p~~~~g~gt~v~n~d~ 136 (386)
T PF14583_consen 61 NLYLLDLA----TGEITQLTDGPGDNTFGGFLSPDDRALYYVKNGRSLRRVDLDTLEERVVYEVPDDWKGYGTWVANSDC 136 (386)
T ss_dssp EEEEEETT----T-EEEE---SS-B-TTT-EE-TTSSEEEEEETTTEEEEEETTT--EEEEEE--TTEEEEEEEEE-TTS
T ss_pred ceEEEEcc----cCEEEECccCCCCCccceEEecCCCeEEEEECCCeEEEEECCcCcEEEEEECCcccccccceeeCCCc
Confidence 44555665 444444443323344466677888888765 47899999999877666654322221 2222 23222
Q ss_pred CeEE--------E---EeC----------CCCcEEEEEccCCCeEEEecC
Q psy4900 454 KRLF--------W---CDN----------LLDYIETVDYEGKNRFLILRG 482 (485)
Q Consensus 454 ~~lY--------W---~D~----------~~~~I~~~~~dG~~r~~~~~~ 482 (485)
.+|. | .++ -..+|.++++.+..+++|+..
T Consensus 137 t~~~g~e~~~~d~~~l~~~~~f~e~~~a~p~~~i~~idl~tG~~~~v~~~ 186 (386)
T PF14583_consen 137 TKLVGIEISREDWKPLTKWKGFREFYEARPHCRIFTIDLKTGERKVVFED 186 (386)
T ss_dssp SEEEEEEEEGGG-----SHHHHHHHHHC---EEEEEEETTT--EEEEEEE
T ss_pred cEEEEEEEeehhccCccccHHHHHHHhhCCCceEEEEECCCCceeEEEec
Confidence 2221 1 110 124788899998888888864
No 140
>PF06739 SBBP: Beta-propeller repeat; InterPro: IPR010620 This family is related to IPR001680 from INTERPRO and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller.
Probab=45.88 E-value=19 Score=23.36 Aligned_cols=19 Identities=16% Similarity=0.324 Sum_probs=15.2
Q ss_pred cccceEEEeCCCCeEEEEeC
Q psy4900 442 YKASGVALDINAKRLFWCDN 461 (485)
Q Consensus 442 ~~p~glavD~~~~~lYW~D~ 461 (485)
..|.+|||| ..++||.+=.
T Consensus 13 ~~~~~IavD-~~GNiYv~G~ 31 (38)
T PF06739_consen 13 DYGNGIAVD-SNGNIYVTGY 31 (38)
T ss_pred eeEEEEEEC-CCCCEEEEEe
Confidence 469999999 6688997654
No 141
>COG2133 Glucose/sorbosone dehydrogenases [Carbohydrate transport and metabolism]
Probab=45.47 E-value=91 Score=31.98 Aligned_cols=72 Identities=25% Similarity=0.292 Sum_probs=44.7
Q ss_pred CCccEEecCC------CCeEEEeC--CCcEEEEEcCCCC---cEEEEeCCC-cccceEEEeCCCCeEEEEeC-CCCcEEE
Q psy4900 402 NPTDLALDPT------SGLMFVAD--SNQILRTNMDGTM---AMSIVSEAA-YKASGVALDINAKRLFWCDN-LLDYIET 468 (485)
Q Consensus 402 ~p~~iavD~~------~~~lywtd--~~~I~r~~~dG~~---~~~i~~~~~-~~p~glavD~~~~~lYW~D~-~~~~I~~ 468 (485)
.|.||++=.- .+.||... .-.+.+...+|.. .+.++..++ ..|.++++.+ .+-||.+|- +.++|.|
T Consensus 315 ApsGmaFy~G~~fP~~r~~lfV~~hgsw~~~~~~~~g~~~~~~~~fl~~d~~gR~~dV~v~~-DGallv~~D~~~g~i~R 393 (399)
T COG2133 315 APSGMAFYTGDLFPAYRGDLFVGAHGSWPVLRLRPDGNYKVVLTGFLSGDLGGRPRDVAVAP-DGALLVLTDQGDGRILR 393 (399)
T ss_pred ccceeEEecCCcCccccCcEEEEeecceeEEEeccCCCcceEEEEEEecCCCCcccceEECC-CCeEEEeecCCCCeEEE
Confidence 3556666321 14566555 2347788888873 334444322 5899999995 455555554 4779999
Q ss_pred EEccCC
Q psy4900 469 VDYEGK 474 (485)
Q Consensus 469 ~~~dG~ 474 (485)
..+.+.
T Consensus 394 v~~~~~ 399 (399)
T COG2133 394 VSYAGT 399 (399)
T ss_pred ecCCCC
Confidence 988763
No 142
>PF05096 Glu_cyclase_2: Glutamine cyclotransferase; InterPro: IPR007788 This family of enzymes 2.3.2.5 from EC catalyse the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.; PDB: 3NOK_B 2FAW_A 2IWA_A 3NOM_A 3NOL_A 3MBR_X.
Probab=43.07 E-value=1.2e+02 Score=29.20 Aligned_cols=67 Identities=19% Similarity=0.301 Sum_probs=43.5
Q ss_pred ecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEe-C--------------CCcccceEEEeCCCCeEEEEeCCCCcEEEEE
Q psy4900 408 LDPTSGLMFVAD--SNQILRTNMDGTMAMSIVS-E--------------AAYKASGVALDINAKRLFWCDNLLDYIETVD 470 (485)
Q Consensus 408 vD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~-~--------------~~~~p~glavD~~~~~lYW~D~~~~~I~~~~ 470 (485)
+.+..|+||--- ...|.|++..-.....++. . .....+|||.|+.+++||.|-..=.+++.+.
T Consensus 180 LE~i~G~IyANVW~td~I~~Idp~tG~V~~~iDls~L~~~~~~~~~~~~~~dVLNGIAyd~~~~~l~vTGK~Wp~lyeV~ 259 (264)
T PF05096_consen 180 LEYINGKIYANVWQTDRIVRIDPETGKVVGWIDLSGLRPEVGRDKSRQPDDDVLNGIAYDPETDRLFVTGKLWPKLYEVK 259 (264)
T ss_dssp EEEETTEEEEEETTSSEEEEEETTT-BEEEEEE-HHHHHHHTSTTST--TTS-EEEEEEETTTTEEEEEETT-SEEEEEE
T ss_pred EEEEcCEEEEEeCCCCeEEEEeCCCCeEEEEEEhhHhhhcccccccccccCCeeEeEeEeCCCCEEEEEeCCCCceEEEE
Confidence 445567766333 6788888887544444432 1 1235789999999999999876668888877
Q ss_pred ccCC
Q psy4900 471 YEGK 474 (485)
Q Consensus 471 ~dG~ 474 (485)
+..+
T Consensus 260 l~e~ 263 (264)
T PF05096_consen 260 LVEK 263 (264)
T ss_dssp EEE-
T ss_pred EEec
Confidence 6543
No 143
>COG0823 TolB Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking and secretion]
Probab=41.60 E-value=1.5e+02 Score=30.72 Aligned_cols=69 Identities=13% Similarity=0.110 Sum_probs=44.8
Q ss_pred cccccCCCCccCCC-CC---CccccccCcccccceEEEEecCCCCCCccEEec----CCCCeEEEeC----CCcEEEEEc
Q psy4900 361 STWKCDSENDCGDG-SD---EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALD----PTSGLMFVAD----SNQILRTNM 428 (485)
Q Consensus 361 ~~~~~~~~~~y~~d-~~---~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD----~~~~~lywtd----~~~I~r~~~ 428 (485)
-.|.|++.++-+.- .+ .|++.+++ ++.... |.+..|+... |...+|+++. .+.|+++++
T Consensus 243 P~fspDG~~l~f~~~rdg~~~iy~~dl~----~~~~~~-----Lt~~~gi~~~Ps~spdG~~ivf~Sdr~G~p~I~~~~~ 313 (425)
T COG0823 243 PAFSPDGSKLAFSSSRDGSPDIYLMDLD----GKNLPR-----LTNGFGINTSPSWSPDGSKIVFTSDRGGRPQIYLYDL 313 (425)
T ss_pred ccCCCCCCEEEEEECCCCCccEEEEcCC----CCccee-----cccCCccccCccCCCCCCEEEEEeCCCCCcceEEECC
Confidence 35677776665532 22 78888888 666333 3444444443 4556677665 589999999
Q ss_pred CCCCcEEEEe
Q psy4900 429 DGTMAMSIVS 438 (485)
Q Consensus 429 dG~~~~~i~~ 438 (485)
+|...+.|..
T Consensus 314 ~g~~~~riT~ 323 (425)
T COG0823 314 EGSQVTRLTF 323 (425)
T ss_pred CCCceeEeec
Confidence 9998755554
No 144
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=40.43 E-value=27 Score=24.09 Aligned_cols=22 Identities=27% Similarity=0.434 Sum_probs=18.8
Q ss_pred CCCCCCccEEecCCCCeEEEeC
Q psy4900 398 SNLTNPTDLALDPTSGLMFVAD 419 (485)
Q Consensus 398 ~~~~~p~~iavD~~~~~lywtd 419 (485)
..-.-|.+|.||+.+|.|.|+-
T Consensus 8 ~~~~LP~gLs~d~~tG~isGtp 29 (49)
T PF05345_consen 8 TGGGLPSGLSLDPSTGTISGTP 29 (49)
T ss_pred CCCCCCCcEEEeCCCCEEEeec
Confidence 3445699999999999999986
No 145
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=40.14 E-value=36 Score=23.44 Aligned_cols=25 Identities=16% Similarity=0.058 Sum_probs=20.5
Q ss_pred CCCcccceEEEeCCCCeEEEEeCCC
Q psy4900 439 EAAYKASGVALDINAKRLFWCDNLL 463 (485)
Q Consensus 439 ~~~~~p~glavD~~~~~lYW~D~~~ 463 (485)
.....|.||.||...+.|.|+=...
T Consensus 8 ~~~~LP~gLs~d~~tG~isGtp~~~ 32 (49)
T PF05345_consen 8 TGGGLPSGLSLDPSTGTISGTPTSS 32 (49)
T ss_pred CCCCCCCcEEEeCCCCEEEeecCCC
Confidence 3466899999999999999985443
No 146
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=38.82 E-value=64 Score=22.23 Aligned_cols=11 Identities=55% Similarity=1.171 Sum_probs=9.5
Q ss_pred CeeeCCCCcee
Q psy4900 195 GVCQCPEGQKV 205 (485)
Q Consensus 195 ~~C~C~~G~~l 205 (485)
++|.|++||..
T Consensus 37 g~C~C~~g~~~ 47 (52)
T PF01683_consen 37 GRCQCPPGYVE 47 (52)
T ss_pred CEeECCCCCEe
Confidence 68999999876
No 147
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=38.17 E-value=15 Score=17.86 Aligned_cols=9 Identities=56% Similarity=1.361 Sum_probs=6.2
Q ss_pred eeeCCCCce
Q psy4900 196 VCQCPEGQK 204 (485)
Q Consensus 196 ~C~C~~G~~ 204 (485)
.|.|++||.
T Consensus 1 ~C~C~~G~~ 9 (13)
T PF12661_consen 1 TCQCPPGWT 9 (13)
T ss_dssp EEEE-TTEE
T ss_pred CccCcCCCc
Confidence 478999985
No 148
>KOG0266|consensus
Probab=37.09 E-value=4.6e+02 Score=27.37 Aligned_cols=85 Identities=14% Similarity=0.105 Sum_probs=58.1
Q ss_pred cceEEEEecCCCCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEE
Q psy4900 389 FQFHAIVLGSNLTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIE 467 (485)
Q Consensus 389 g~~~~~l~~~~~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~ 467 (485)
+...+++. .......++++.|...+|.=.. ...|.--++........+........++++. ..++++|+-...+.|.
T Consensus 236 ~~~~~~l~-gH~~~v~~~~f~p~g~~i~Sgs~D~tvriWd~~~~~~~~~l~~hs~~is~~~f~-~d~~~l~s~s~d~~i~ 313 (456)
T KOG0266|consen 236 GRNLKTLK-GHSTYVTSVAFSPDGNLLVSGSDDGTVRIWDVRTGECVRKLKGHSDGISGLAFS-PDGNLLVSASYDGTIR 313 (456)
T ss_pred CeEEEEec-CCCCceEEEEecCCCCEEEEecCCCcEEEEeccCCeEEEeeeccCCceEEEEEC-CCCCEEEEcCCCccEE
Confidence 34555554 5666778999999974444333 6777777777654455555555567788888 4556666667788888
Q ss_pred EEEccCCC
Q psy4900 468 TVDYEGKN 475 (485)
Q Consensus 468 ~~~~dG~~ 475 (485)
.-++.+..
T Consensus 314 vwd~~~~~ 321 (456)
T KOG0266|consen 314 VWDLETGS 321 (456)
T ss_pred EEECCCCc
Confidence 88888766
No 149
>TIGR03032 conserved hypothetical protein TIGR03032. This protein family is uncharacterized. A number of motifs are conserved perfectly among all member sequences. The function of this protein is unknown.
Probab=36.48 E-value=1.3e+02 Score=29.87 Aligned_cols=60 Identities=13% Similarity=0.181 Sum_probs=39.3
Q ss_pred ecCCCCCCccEEecCCCCeEEEeC--CCcEEEEEcC-CCCcEEEEeCCCcccceEEEeCCCCeEEEEeC
Q psy4900 396 LGSNLTNPTDLALDPTSGLMFVAD--SNQILRTNMD-GTMAMSIVSEAAYKASGVALDINAKRLFWCDN 461 (485)
Q Consensus 396 ~~~~~~~p~~iavD~~~~~lywtd--~~~I~r~~~d-G~~~~~i~~~~~~~p~glavD~~~~~lYW~D~ 461 (485)
+..++..|.+--. ..|+||+.| .+.|.+++++ |+. ++|.. --..|.||+.. .+.+|..=+
T Consensus 198 l~~GLsmPhSPRW--hdgrLwvldsgtGev~~vD~~~G~~-e~Va~-vpG~~rGL~f~--G~llvVgmS 260 (335)
T TIGR03032 198 VASGLSMPHSPRW--YQGKLWLLNSGRGELGYVDPQAGKF-QPVAF-LPGFTRGLAFA--GDFAFVGLS 260 (335)
T ss_pred EEcCccCCcCCcE--eCCeEEEEECCCCEEEEEcCCCCcE-EEEEE-CCCCCccccee--CCEEEEEec
Confidence 3367777766544 578999999 7889999987 544 33332 23477888854 666665443
No 150
>PF06433 Me-amine-dh_H: Methylamine dehydrogenase heavy chain (MADH); InterPro: IPR009451 Methylamine dehydrogenase (1.4.99.3 from EC) is a periplasmic quinoprotein found in several methylotrophic bacteria []. It is induced when the bacteria are grown on methylamine as a carbon source, and it catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin (IPR002386 from INTERPRO). The reaction is: RCH2NH2 + H2O + acceptor = RCHO + NH3 + reduced acceptor. MADH is a heterotetramer, comprised of two heavy subunits and two light subunits. The heavy subunit forms a seven-bladed beta-propeller-like structure [].; GO: 0030058 amine dehydrogenase activity, 0030416 methylamine metabolic process, 0055114 oxidation-reduction process, 0042597 periplasmic space; PDB: 3RN1_F 3SVW_F 3PXT_F 3L4O_F 3L4M_D 3SJL_F 3PXS_D 3ORV_F 3RMZ_F 3RLM_F ....
Probab=34.56 E-value=2.2e+02 Score=28.55 Aligned_cols=70 Identities=13% Similarity=0.136 Sum_probs=46.4
Q ss_pred cCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCC--------cc-cce---EEEeCCCCeEEEEeCC-------C--CcE
Q psy4900 409 DPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAA--------YK-ASG---VALDINAKRLFWCDNL-------L--DYI 466 (485)
Q Consensus 409 D~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~--------~~-p~g---lavD~~~~~lYW~D~~-------~--~~I 466 (485)
....+++||.. .+.|+.+++.|...+..-.-.+ .| |-| +|++...++||..=.. . ..|
T Consensus 192 ~~~~~~~~F~Sy~G~v~~~dlsg~~~~~~~~~~~~t~~e~~~~WrPGG~Q~~A~~~~~~rlyvLMh~g~~gsHKdpgteV 271 (342)
T PF06433_consen 192 SRDGGRLYFVSYEGNVYSADLSGDSAKFGKPWSLLTDAEKADGWRPGGWQLIAYHAASGRLYVLMHQGGEGSHKDPGTEV 271 (342)
T ss_dssp ETTTTEEEEEBTTSEEEEEEETTSSEEEEEEEESS-HHHHHTTEEE-SSS-EEEETTTTEEEEEEEE--TT-TTS-EEEE
T ss_pred ECCCCeEEEEecCCEEEEEeccCCcccccCcccccCccccccCcCCcceeeeeeccccCeEEEEecCCCCCCccCCceEE
Confidence 34567899988 9999999999987544432221 23 443 8999999999965321 1 257
Q ss_pred EEEEccCCCeEE
Q psy4900 467 ETVDYEGKNRFL 478 (485)
Q Consensus 467 ~~~~~dG~~r~~ 478 (485)
++.++.-..|..
T Consensus 272 Wv~D~~t~krv~ 283 (342)
T PF06433_consen 272 WVYDLKTHKRVA 283 (342)
T ss_dssp EEEETTTTEEEE
T ss_pred EEEECCCCeEEE
Confidence 776666555443
No 151
>PF05694 SBP56: 56kDa selenium binding protein (SBP56); InterPro: IPR008826 This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown although it is thought that SBP56 participates in late stages of intra-Golgi protein transport []. The Lotus japonicus homologue of SBP56, LjSBP is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins in vesicular Golgi transport [].; GO: 0008430 selenium binding; PDB: 2ECE_A.
Probab=33.76 E-value=1.4e+02 Score=30.98 Aligned_cols=60 Identities=17% Similarity=0.355 Sum_probs=31.0
Q ss_pred CccEEecCCCCeEEEeC--CCcEEEEEcCCCCcEEEEeC----C---------------CcccceEEEeCCCCeEEEEeC
Q psy4900 403 PTDLALDPTSGLMFVAD--SNQILRTNMDGTMAMSIVSE----A---------------AYKASGVALDINAKRLFWCDN 461 (485)
Q Consensus 403 p~~iavD~~~~~lywtd--~~~I~r~~~dG~~~~~i~~~----~---------------~~~p~glavD~~~~~lYW~D~ 461 (485)
++.|.|-.-.++||++. ...|...++.-...-.++.+ + ..-|.=|.|-+..+|||||-+
T Consensus 314 itDI~iSlDDrfLYvs~W~~GdvrqYDISDP~~Pkl~gqv~lGG~~~~~~~~~v~g~~l~GgPqMvqlS~DGkRlYvTnS 393 (461)
T PF05694_consen 314 ITDILISLDDRFLYVSNWLHGDVRQYDISDPFNPKLVGQVFLGGSIRKGDHPVVKGKRLRGGPQMVQLSLDGKRLYVTNS 393 (461)
T ss_dssp ---EEE-TTS-EEEEEETTTTEEEEEE-SSTTS-EEEEEEE-BTTTT-B--TTS------S----EEE-TTSSEEEEE--
T ss_pred eEeEEEccCCCEEEEEcccCCcEEEEecCCCCCCcEEeEEEECcEeccCCCccccccccCCCCCeEEEccCCeEEEEEee
Confidence 56666667789999998 56677777665444333321 0 124666788888999999987
Q ss_pred C
Q psy4900 462 L 462 (485)
Q Consensus 462 ~ 462 (485)
.
T Consensus 394 L 394 (461)
T PF05694_consen 394 L 394 (461)
T ss_dssp -
T ss_pred c
Confidence 4
No 152
>PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis. 
Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction: Dipeptidyl-Polypeptide + H(2)O = Dipeptide + Polypeptide It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0006508 proteolysis, 0016020 membrane; PDB: 2RIP_A 3Q8W_B 2AJL_I 1TKR_B 1TK3_B 3C45_A 2G5P_A 3G0C_D 1R9M_C 1RWQ_A ....
Probab=33.70 E-value=57 Score=32.76 Aligned_cols=72 Identities=11% Similarity=0.083 Sum_probs=43.3
Q ss_pred CccccccCcccccceEEEEecCCCCCC-ccEEecCCCCeEEEeC------CCcEEEEEcC-CCCcEEEEeCCCcccceEE
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNP-TDLALDPTSGLMFVAD------SNQILRTNMD-GTMAMSIVSEAAYKASGVA 448 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p-~~iavD~~~~~lywtd------~~~I~r~~~d-G~~~~~i~~~~~~~p~gla 448 (485)
.|.+.+.+ |...+.|- .+--.. .-+++|...+.||++- ...|++++++ |...+.|....... ..++
T Consensus 261 hly~~~~~----~~~~~~lT-~G~~~V~~i~~~d~~~~~iyf~a~~~~p~~r~lY~v~~~~~~~~~~LT~~~~~~-~~~~ 334 (353)
T PF00930_consen 261 HLYLYDLD----GGKPRQLT-SGDWEVTSILGWDEDNNRIYFTANGDNPGERHLYRVSLDSGGEPKCLTCEDGDH-YSAS 334 (353)
T ss_dssp EEEEEETT----SSEEEESS--SSS-EEEEEEEECTSSEEEEEESSGGTTSBEEEEEETTETTEEEESSTTSSTT-EEEE
T ss_pred EEEEEccc----ccceeccc-cCceeecccceEcCCCCEEEEEecCCCCCceEEEEEEeCCCCCeEeccCCCCCc-eEEE
Confidence 56666766 66654333 332233 2478999999999998 3579999999 66655444322222 3555
Q ss_pred EeCCCC
Q psy4900 449 LDINAK 454 (485)
Q Consensus 449 vD~~~~ 454 (485)
+.+..+
T Consensus 335 ~Spdg~ 340 (353)
T PF00930_consen 335 FSPDGK 340 (353)
T ss_dssp E-TTSS
T ss_pred ECCCCC
Confidence 554433
No 153
>KOG4328|consensus
Probab=33.01 E-value=1.3e+02 Score=30.99 Aligned_cols=75 Identities=16% Similarity=0.156 Sum_probs=45.0
Q ss_pred CCCCccEEecCCCC-eEEEeC--CCcEEEEEcCCC--CcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 400 LTNPTDLALDPTSG-LMFVAD--SNQILRTNMDGT--MAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 400 ~~~p~~iavD~~~~-~lywtd--~~~I~r~~~dG~--~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
.....+++++|... +|.-+- .+.|.--++++. ....++ +..|++.-| ..||+.-+...+|..+.|||+
T Consensus 186 ~~Rit~l~fHPt~~~~lva~GdK~G~VG~Wn~~~~~~d~d~v~---~f~~hs~~V----s~l~F~P~n~s~i~ssSyDGt 258 (498)
T KOG4328|consen 186 DRRITSLAFHPTENRKLVAVGDKGGQVGLWNFGTQEKDKDGVY---LFTPHSGPV----SGLKFSPANTSQIYSSSYDGT 258 (498)
T ss_pred ccceEEEEecccCcceEEEEccCCCcEEEEecCCCCCccCceE---EeccCCccc----cceEecCCChhheeeeccCce
Confidence 45566788888776 444443 577888787733 222222 334444332 356676677777777777777
Q ss_pred CeEEEec
Q psy4900 475 NRFLILR 481 (485)
Q Consensus 475 ~r~~~~~ 481 (485)
-|-+=+.
T Consensus 259 iR~~D~~ 265 (498)
T KOG4328|consen 259 IRLQDFE 265 (498)
T ss_pred eeeeeec
Confidence 7655443
No 154
>TIGR03118 PEPCTERM_chp_1 conserved hypothetical protein TIGR03118. This model describes an uncharacterized conserved hypothetical protein. Members are found with the C-terminal putative exosortase interaction domain, PEP-CTERM, in Nitrosospira multiformis, Rhodoferax ferrireducens, Solibacter usitatus Ellin6076, and Acidobacteria bacterium Ellin345. It is found without the PEP-CTERM domain in several other species, including Burkholderia ambifaria, Gloeobacter violaceus PCC 7421, and three copies in the Acanthamoeba polyphaga mimivirus.
Probab=32.99 E-value=3.9e+02 Score=26.42 Aligned_cols=78 Identities=17% Similarity=0.173 Sum_probs=47.9
Q ss_pred CCCCccEEecCCCC-------------eEEEeCCCcEEEEEcC--CC---CcEEEEeCC--CcccceEEEeCC--CCeEE
Q psy4900 400 LTNPTDLALDPTSG-------------LMFVADSNQILRTNMD--GT---MAMSIVSEA--AYKASGVALDIN--AKRLF 457 (485)
Q Consensus 400 ~~~p~~iavD~~~~-------------~lywtd~~~I~r~~~d--G~---~~~~i~~~~--~~~p~glavD~~--~~~lY 457 (485)
...|+||++....+ +||-|+..+|-.-+-. -+ ...+++... ...=.||||-.. ..+||
T Consensus 76 ~~~PTGiVfN~~~~F~vt~~g~~~~a~Fif~tEdGTisaW~p~v~~t~~~~~~~~~d~s~~gavYkGLAi~~~~~~~~LY 155 (336)
T TIGR03118 76 EGTPTGQVFNGSDTFVVSGEGITGPSRFLFVTEDGTLSGWAPALGTTRMTRAEIVVDASQQGNVYKGLAVGPTGGGDYLY 155 (336)
T ss_pred CCCccEEEEeCCCceEEcCCCcccceeEEEEeCCceEEeecCcCCcccccccEEEEccCCCcceeeeeEEeecCCCceEE
Confidence 45788888874333 4777776666554422 22 122334322 222358887644 68999
Q ss_pred EEeCCCCcEEEEEccCCCeEEE
Q psy4900 458 WCDNLLDYIETVDYEGKNRFLI 479 (485)
Q Consensus 458 W~D~~~~~I~~~~~dG~~r~~~ 479 (485)
=+|-..++|.+. |++.+++-
T Consensus 156 aadF~~g~IDVF--d~~f~~~~ 175 (336)
T TIGR03118 156 AANFRQGRIDVF--KGSFRPPP 175 (336)
T ss_pred EeccCCCceEEe--cCcccccc
Confidence 999999999995 66665543
No 155
>TIGR03075 PQQ_enz_alc_DH PQQ-dependent dehydrogenase, methanol/ethanol family. This protein family has a phylogenetic distribution very similar to that of coenzyme PQQ biosynthesis enzymes, as shown by partial phylogenetic profiling. Genes in this family often are found adjacent to the PQQ biosynthesis genes themselves. An unusual, strained disulfide bond between adjacent Cys residues contributes to PQQ-binding, as does a Trp residue that is part of a PQQ enzyme repeat (see pfam01011). Characterized members include the dehydrogenase subunit of a membrane-anchored, three-subunit alcohol (ethanol) dehydrogenase of Gluconobacter suboxydans, a homodimeric ethanol dehydrogenase in Pseudomonas aeruginosa, and the large subunit of an alpha2/beta2 heterotetrameric methanol dehydrogenase in Methylobacterium extorquens.
Probab=32.20 E-value=1.4e+02 Score=32.09 Aligned_cols=59 Identities=17% Similarity=0.107 Sum_probs=33.3
Q ss_pred EEecCCCCeEEEeC---CCcEEEEEcCCCCcEEEEeCC-Ccc-cceEEEeCCCCeEEEEeCCCCcE
Q psy4900 406 LALDPTSGLMFVAD---SNQILRTNMDGTMAMSIVSEA-AYK-ASGVALDINAKRLFWCDNLLDYI 466 (485)
Q Consensus 406 iavD~~~~~lywtd---~~~I~r~~~dG~~~~~i~~~~-~~~-p~glavD~~~~~lYW~D~~~~~I 466 (485)
||||..+|++-|.- .+-+. ++|.....+|++.. -.. -..|+.=-.++.+|..|+.+++.
T Consensus 274 vAld~~TG~~~W~~Q~~~~D~w--D~d~~~~p~l~d~~~~G~~~~~v~~~~K~G~~~vlDr~tG~~ 337 (527)
T TIGR03075 274 VARDPDTGKIKWHYQTTPHDEW--DYDGVNEMILFDLKKDGKPRKLLAHADRNGFFYVLDRTNGKL 337 (527)
T ss_pred EEEccccCCEEEeeeCCCCCCc--cccCCCCcEEEEeccCCcEEEEEEEeCCCceEEEEECCCCce
Confidence 89999999999986 23333 45554444444311 001 11222233666777777776554
No 156
>PF14339 DUF4394: Domain of unknown function (DUF4394)
Probab=29.97 E-value=2.3e+02 Score=26.78 Aligned_cols=71 Identities=17% Similarity=0.207 Sum_probs=48.7
Q ss_pred CCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEE----EeCCC-cccceEEEeCCCCeEEEEeCCCCcEEEEEcc
Q psy4900 401 TNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSI----VSEAA-YKASGVALDINAKRLFWCDNLLDYIETVDYE 472 (485)
Q Consensus 401 ~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i----~~~~~-~~p~glavD~~~~~lYW~D~~~~~I~~~~~d 472 (485)
.+-.||++-|.+|.||=-. ..+|+.++..-.....+ +...+ ..+.|+.+++...||..+-. .+.=.|.+.|
T Consensus 27 e~l~GID~Rpa~G~LYgl~~~g~lYtIn~~tG~aT~vg~s~~~~al~g~~~gvDFNP~aDRlRvvs~-~GqNlR~npd 103 (236)
T PF14339_consen 27 ESLVGIDFRPANGQLYGLGSTGRLYTINPATGAATPVGASPLTVALSGTAFGVDFNPAADRLRVVSN-TGQNLRLNPD 103 (236)
T ss_pred CeEEEEEeecCCCCEEEEeCCCcEEEEECCCCeEEEeecccccccccCceEEEecCcccCcEEEEcc-CCcEEEECCC
Confidence 4567899999999999875 89999998875443333 11111 23678888888899988733 4444555554
No 157
>PF04706 Dickkopf_N: Dickkopf N-terminal cysteine-rich region; InterPro: IPR006796 Dickkopf proteins are a class of Wnt antagonists. They possess two conserved cysteine-rich regions. This family represents the N-terminal conserved region []. The C-terminal region has been found to share significant sequence similarity to the colipase fold (IPR001981 from INTERPRO) [].; GO: 0007275 multicellular organismal development, 0030178 negative regulation of Wnt receptor signaling pathway, 0005576 extracellular region
Probab=28.81 E-value=81 Score=22.17 Aligned_cols=9 Identities=44% Similarity=1.257 Sum_probs=6.4
Q ss_pred cCCCCcccC
Q psy4900 102 KCANSLCIP 110 (485)
Q Consensus 102 ~C~~~~Ci~ 110 (485)
.|.+|+|+|
T Consensus 44 ~CvnG~C~~ 52 (52)
T PF04706_consen 44 LCVNGVCTP 52 (52)
T ss_pred eeeCCEecC
Confidence 577777765
No 158
>PF14251 DUF4346: Domain of unknown function (DUF4346)
Probab=28.66 E-value=1.5e+02 Score=24.70 Aligned_cols=64 Identities=16% Similarity=0.332 Sum_probs=39.1
Q ss_pred CccEEecCCCCeEEEeC-C-CcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCe
Q psy4900 403 PTDLALDPTSGLMFVAD-S-NQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNR 476 (485)
Q Consensus 403 p~~iavD~~~~~lywtd-~-~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r 476 (485)
-|-|++||..=+|...| . +.|..-.+ .+.-.-.|||+|+.++.+.=+..+..+-...-+.|+.-
T Consensus 9 ~R~i~LDp~GYfiI~~d~~~~~i~a~h~----------~n~I~~~Gla~Dpetge~i~~~g~~~r~~~~~~~GrTA 74 (119)
T PF14251_consen 9 QRFIDLDPAGYFIIYVDREAGEICAEHY----------TNDIDDKGLAVDPETGEVIPCRGKVKRTPSIVFKGRTA 74 (119)
T ss_pred cCccccCCCccEEEEEeCCCCeeeHhhc----------cCccCcccceeCCCCCCEEEEecCCCCceeEEEecCCH
Confidence 35688999865555555 2 22211111 11223459999999999988877666666666666653
No 159
>KOG1274|consensus
Probab=28.39 E-value=3.3e+02 Score=30.76 Aligned_cols=63 Identities=14% Similarity=0.194 Sum_probs=41.4
Q ss_pred CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEEecCC
Q psy4900 420 SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLILRGS 483 (485)
Q Consensus 420 ~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~~~~~ 483 (485)
.+.|.|+.++......|+.+-....+-++|+....++--.-. .-.|-..+++-.....+++++
T Consensus 75 ~~tv~~y~fps~~~~~iL~Rftlp~r~~~v~g~g~~iaagsd-D~~vK~~~~~D~s~~~~lrgh 137 (933)
T KOG1274|consen 75 QNTVLRYKFPSGEEDTILARFTLPIRDLAVSGSGKMIAAGSD-DTAVKLLNLDDSSQEKVLRGH 137 (933)
T ss_pred cceEEEeeCCCCCccceeeeeeccceEEEEecCCcEEEeecC-ceeEEEEeccccchheeeccc
Confidence 688999999988878888766666778898866555544332 234556666555544455543
No 160
>KOG1217|consensus
Probab=27.83 E-value=73 Score=32.87 Aligned_cols=22 Identities=27% Similarity=0.551 Sum_probs=18.6
Q ss_pred cccccCCCCCCeeeCCCCceee
Q psy4900 185 EFTCQASPTGGVCQCPEGQKVA 206 (485)
Q Consensus 185 ~~~C~n~~~~~~C~C~~G~~l~ 206 (485)
.+.|.+..++|.|.|++||...
T Consensus 182 ~~~C~~~~~~~~C~c~~~~~~~ 203 (487)
T KOG1217|consen 182 GGTCVNTGGSYLCSCPPGYTGS 203 (487)
T ss_pred CcccccCCCCeeEeCCCCccCC
Confidence 3678999999999999999754
No 161
>KOG0291|consensus
Probab=27.47 E-value=5.3e+02 Score=28.70 Aligned_cols=64 Identities=22% Similarity=0.161 Sum_probs=38.9
Q ss_pred CCcEEEEEcCCCCc----------EEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCCCeEEEecCC
Q psy4900 420 SNQILRTNMDGTMA----------MSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGKNRFLILRGS 483 (485)
Q Consensus 420 ~~~I~r~~~dG~~~----------~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~~r~~~~~~~ 483 (485)
.+.|..+.|||+-| +++....-.+-.-||||+....|.=.+-..-.|.+-++......-+++|+
T Consensus 404 g~~llssSLDGtVRAwDlkRYrNfRTft~P~p~QfscvavD~sGelV~AG~~d~F~IfvWS~qTGqllDiLsGH 477 (893)
T KOG0291|consen 404 GNVLLSSSLDGTVRAWDLKRYRNFRTFTSPEPIQFSCVAVDPSGELVCAGAQDSFEIFVWSVQTGQLLDILSGH 477 (893)
T ss_pred CCEEEEeecCCeEEeeeecccceeeeecCCCceeeeEEEEcCCCCEEEeeccceEEEEEEEeecCeeeehhcCC
Confidence 56677888888754 34443333334568999888877765544455666666555544455543
No 162
>KOG3567|consensus
Probab=27.42 E-value=89 Score=32.34 Aligned_cols=53 Identities=13% Similarity=0.164 Sum_probs=31.8
Q ss_pred CcEEEEEcCCCCcEE-EEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 421 NQILRTNMDGTMAMS-IVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 421 ~~I~r~~~dG~~~~~-i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
++|.+..+.-..+.. .-...+..|.||.+| ..+..|-+|...+.+..-..+++
T Consensus 445 ~~ilvi~~~n~~~l~~~g~~~fylphgl~~d-kdgf~~~tdvash~v~k~k~~~~ 498 (501)
T KOG3567|consen 445 DTILVIDPNNAAVLQSSGKNLFYLPHGLSID-KDGFYWVTDVASHQVFKLKPNNK 498 (501)
T ss_pred ceEEEEcCcchhhhhhccCCceecCCcceec-CCCcEEeecccchhhhhcccccc
Confidence 567777776322222 112236679999999 56666666666676665555544
No 163
>PF10042 DUF2278: Uncharacterized conserved protein (DUF2278); InterPro: IPR019268 This entry consists of hypothetical proteins with no known function.
Probab=27.42 E-value=40 Score=31.17 Aligned_cols=16 Identities=38% Similarity=0.405 Sum_probs=14.1
Q ss_pred cccceEEEeCCCCe-EE
Q psy4900 442 YKASGVALDINAKR-LF 457 (485)
Q Consensus 442 ~~p~glavD~~~~~-lY 457 (485)
..|.++|||++++. |+
T Consensus 85 ~~~~~~aLDYiR~~~Lf 101 (206)
T PF10042_consen 85 STPGGGALDYIRGNGLF 101 (206)
T ss_pred CCCCCeeeeEeeCCcCc
Confidence 57899999999998 66
No 164
>PF14759 Reductase_C: Reductase C-terminal; PDB: 3FG2_P 3LXD_A 2YVG_A 2GR1_A 2GQW_A 2GR3_A 2YVF_A 1F3P_A 2GR0_A 2GR2_A ....
Probab=26.54 E-value=1.2e+02 Score=23.47 Aligned_cols=27 Identities=19% Similarity=0.215 Sum_probs=19.3
Q ss_pred EEeC--CCcEEEEEcCCCCcEEEEeCCCc
Q psy4900 416 FVAD--SNQILRTNMDGTMAMSIVSEAAY 442 (485)
Q Consensus 416 ywtd--~~~I~r~~~dG~~~~~i~~~~~~ 442 (485)
|||| ..+|.-+..-+..-++++..+..
T Consensus 2 FWSdQ~~~~iq~~G~~~~~~~~v~rg~~~ 30 (85)
T PF14759_consen 2 FWSDQYGVRIQIAGLPGGADEVVVRGDPE 30 (85)
T ss_dssp EEEEETTEEEEEEE-STTSSEEEEEEETT
T ss_pred eecccCCCeEEEEECCCCCCEEEEEccCC
Confidence 8999 78899999876666666654433
No 165
>KOG0285|consensus
Probab=25.80 E-value=3.7e+02 Score=27.12 Aligned_cols=103 Identities=13% Similarity=0.091 Sum_probs=60.8
Q ss_pred CccccccCcccccceEEEEecCCCCCCccEEecCCCCeEEEeC-CCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCe
Q psy4900 377 EGDFCSEKTCAYFQFHAIVLGSNLTNPTDLALDPTSGLMFVAD-SNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKR 455 (485)
Q Consensus 377 ~I~~~~~~~c~~g~~~~~l~~~~~~~p~~iavD~~~~~lywtd-~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~ 455 (485)
.|...++.+ |..+ .-+.+.+...+++||-+..-+||=.- .+.|.--+|.-.....-....|.-...|++.+.-+.
T Consensus 174 tikIwDlat---g~Lk-ltltGhi~~vr~vavS~rHpYlFs~gedk~VKCwDLe~nkvIR~YhGHlS~V~~L~lhPTldv 249 (460)
T KOG0285|consen 174 TIKIWDLAT---GQLK-LTLTGHIETVRGVAVSKRHPYLFSAGEDKQVKCWDLEYNKVIRHYHGHLSGVYCLDLHPTLDV 249 (460)
T ss_pred eeEEEEccc---CeEE-EeecchhheeeeeeecccCceEEEecCCCeeEEEechhhhhHHHhccccceeEEEecccccee
Confidence 555566651 4444 33446788899999999998998665 566777666533211111234556667777655555
Q ss_pred EEEEeCCCCcEEEEEccCCCeEEEecCCC
Q psy4900 456 LFWCDNLLDYIETVDYEGKNRFLILRGSQ 484 (485)
Q Consensus 456 lYW~D~~~~~I~~~~~dG~~r~~~~~~~~ 484 (485)
|+ +-.....|.+-++..+....++.|+.
T Consensus 250 l~-t~grDst~RvWDiRtr~~V~~l~GH~ 277 (460)
T KOG0285|consen 250 LV-TGGRDSTIRVWDIRTRASVHVLSGHT 277 (460)
T ss_pred EE-ecCCcceEEEeeecccceEEEecCCC
Confidence 54 33344455555666655555555543
No 166
>PF06433 Me-amine-dh_H: Methylamine dehydrogenase heavy chain (MADH); InterPro: IPR009451 Methylamine dehydrogenase (1.4.99.3 from EC) is a periplasmic quinoprotein found in several methylotrophic bacteria []. It is induced when the bacteria are grown on methylamine as a carbon source, and it catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin (IPR002386 from INTERPRO). The reaction is: RCH2NH2 + H2O + acceptor = RCHO + NH3 + reduced acceptor. MADH is a heterotetramer, comprised of two heavy subunits and two light subunits. The heavy subunit forms a seven-bladed beta-propeller-like structure [].; GO: 0030058 amine dehydrogenase activity, 0030416 methylamine metabolic process, 0055114 oxidation-reduction process, 0042597 periplasmic space; PDB: 3RN1_F 3SVW_F 3PXT_F 3L4O_F 3L4M_D 3SJL_F 3PXS_D 3ORV_F 3RMZ_F 3RLM_F ....
Probab=24.04 E-value=7.2e+02 Score=24.95 Aligned_cols=65 Identities=12% Similarity=0.168 Sum_probs=35.6
Q ss_pred EEecCCCCeEEEeC-----------CCcEEEEEcCCCCcEEEEeCCCcccc-eEEEeCCCC-eEEEEeCCCCcEEEEEcc
Q psy4900 406 LALDPTSGLMFVAD-----------SNQILRTNMDGTMAMSIVSEAAYKAS-GVALDINAK-RLFWCDNLLDYIETVDYE 472 (485)
Q Consensus 406 iavD~~~~~lywtd-----------~~~I~r~~~dG~~~~~i~~~~~~~p~-glavD~~~~-~lYW~D~~~~~I~~~~~d 472 (485)
+|+++..++||.-= ...|...++.-..|..-+ .|..|. +|+|--..+ +||=++...+.|.+.+.-
T Consensus 243 ~A~~~~~~rlyvLMh~g~~gsHKdpgteVWv~D~~t~krv~Ri--~l~~~~~Si~Vsqd~~P~L~~~~~~~~~l~v~D~~ 320 (342)
T PF06433_consen 243 IAYHAASGRLYVLMHQGGEGSHKDPGTEVWVYDLKTHKRVARI--PLEHPIDSIAVSQDDKPLLYALSAGDGTLDVYDAA 320 (342)
T ss_dssp EEEETTTTEEEEEEEE--TT-TTS-EEEEEEEETTTTEEEEEE--EEEEEESEEEEESSSS-EEEEEETTTTEEEEEETT
T ss_pred eeeccccCeEEEEecCCCCCCccCCceEEEEEECCCCeEEEEE--eCCCccceEEEccCCCcEEEEEcCCCCeEEEEeCc
Confidence 89999999999653 123665555544332222 233333 566643333 555555555556655544
No 167
>COG3211 PhoX Predicted phosphatase [General function prediction only]
Probab=22.85 E-value=3.4e+02 Score=29.20 Aligned_cols=64 Identities=17% Similarity=0.182 Sum_probs=42.4
Q ss_pred CCCCCCccEEecCCCCeEEEeC---C---------------CcEEEEEcCCC-------CcEEEEeCC------------
Q psy4900 398 SNLTNPTDLALDPTSGLMFVAD---S---------------NQILRTNMDGT-------MAMSIVSEA------------ 440 (485)
Q Consensus 398 ~~~~~p~~iavD~~~~~lywtd---~---------------~~I~r~~~dG~-------~~~~i~~~~------------ 440 (485)
.-+.+|..|++.|.+|.+|++- . .+|+|.--.+. .+..++..+
T Consensus 414 T~mdRpE~i~~~p~~g~Vy~~lTNn~~r~~~~aNpr~~n~~G~I~r~~p~~~d~t~~~ftWdlF~~aG~~~~~~~~~~~~ 493 (616)
T COG3211 414 TPMDRPEWIAVNPGTGEVYFTLTNNGKRSDDAANPRAKNGYGQIVRWIPATGDHTDTKFTWDLFVEAGNPSVLEGGASAN 493 (616)
T ss_pred ccccCccceeecCCcceEEEEeCCCCccccccCCCcccccccceEEEecCCCCccCccceeeeeeecCCccccccccccC
Confidence 3478899999999999999987 1 24666555443 233333311
Q ss_pred -----CcccceEEEeCCCCeEEEEeC
Q psy4900 441 -----AYKASGVALDINAKRLFWCDN 461 (485)
Q Consensus 441 -----~~~p~glavD~~~~~lYW~D~ 461 (485)
+..|.+|++|...+..-=+|-
T Consensus 494 ~~~~~f~~PDnl~fD~~GrLWi~TDg 519 (616)
T COG3211 494 INANWFNSPDNLAFDPWGRLWIQTDG 519 (616)
T ss_pred cccccccCCCceEECCCCCEEEEecC
Confidence 345999999976665555554
No 168
>PTZ00486 apyrase Superfamily; Provisional
Probab=22.80 E-value=4.1e+02 Score=26.69 Aligned_cols=24 Identities=17% Similarity=0.220 Sum_probs=16.9
Q ss_pred CCCCeEEEEeCCCCcEEEEEccCC
Q psy4900 451 INAKRLFWCDNLLDYIETVDYEGK 474 (485)
Q Consensus 451 ~~~~~lYW~D~~~~~I~~~~~dG~ 474 (485)
..+++||=.|-+++.|+....++.
T Consensus 122 ~FngkLys~DDrTGiVy~i~~~~~ 145 (352)
T PTZ00486 122 SFNGKLYGFDDRTGIVYEIDIDKK 145 (352)
T ss_pred eeCCEEEEEeCCceEEEEEEcCCC
Confidence 567777777777777777765554
No 169
>KOG0650|consensus
Probab=20.91 E-value=1.2e+02 Score=32.55 Aligned_cols=69 Identities=12% Similarity=0.147 Sum_probs=45.9
Q ss_pred CccEEecCCCCeEEEeCCCcEEEEEcCCCCcEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEcc
Q psy4900 403 PTDLALDPTSGLMFVADSNQILRTNMDGTMAMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYE 472 (485)
Q Consensus 403 p~~iavD~~~~~lywtd~~~I~r~~~dG~~~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~d 472 (485)
|..+.++|..-+||.+....|.-.+|--.....-+..+..|...|+|++...+|.-. +..++|-..+||
T Consensus 569 vq~v~FHPs~p~lfVaTq~~vRiYdL~kqelvKkL~tg~kwiS~msihp~GDnli~g-s~d~k~~WfDld 637 (733)
T KOG0650|consen 569 VQRVKFHPSKPYLFVATQRSVRIYDLSKQELVKKLLTGSKWISSMSIHPNGDNLILG-SYDKKMCWFDLD 637 (733)
T ss_pred eeEEEecCCCceEEEEeccceEEEehhHHHHHHHHhcCCeeeeeeeecCCCCeEEEe-cCCCeeEEEEcc
Confidence 556788888888888887666666666433221222577899999999877776654 334555555555
No 170
>KOG3658|consensus
Probab=20.84 E-value=2.6e+02 Score=30.52 Aligned_cols=23 Identities=30% Similarity=0.596 Sum_probs=16.3
Q ss_pred CCcCCCCCCCCCCCCCCCCeecC
Q psy4900 42 RKDEEGCPATTGLSCDLDQFRCA 64 (485)
Q Consensus 42 ~sdE~~C~~~~~~~C~~~~f~C~ 64 (485)
..+|..|...++..|++.+-.|-
T Consensus 498 ~~~~k~C~lk~gaqCSpsqgpCC 520 (764)
T KOG3658|consen 498 NLDEKPCTLKPGAQCSPSQGPCC 520 (764)
T ss_pred CCCCCCceeCCCCccCCCCCCcc
Confidence 56777887666678877776664
No 171
>PF04885 Stig1: Stigma-specific protein, Stig1; InterPro: IPR006969 This family represents the Stig1 cysteine-rich plant protein. The tobacco stigma-specific gene, STIG1, is developmentally regulated and expressed specifically in the stigmatic secretory zone. Pistils of transgenic STIG1-barnase tobacco plants undergo normal development, but lack the stigmatic secretory zone and are female sterile. Pollen grains are unable to penetrate the surface of the ablated pistils. Application of stigmatic exudate from wild-type pistils to the ablated surface increases the efficiency of pollen tube germination and growth and restores the capacity of pollen tubes to penetrate the style []. The function of STIG1 is unknown.
Probab=20.81 E-value=2.1e+02 Score=24.60 Aligned_cols=30 Identities=33% Similarity=0.806 Sum_probs=16.7
Q ss_pred CCCCC-CcccCCC--CCccCCCcccCCCCcccCc
Q psy4900 81 DCGDN-SDEEKCN--FTACHVGQFKCANSLCIPV 111 (485)
Q Consensus 81 dC~d~-sDe~~C~--~~~C~~~~f~C~~~~Ci~~ 111 (485)
.|.|- +|..+|. ...|..++ .|=+|+|+..
T Consensus 76 ~Cvdv~~d~~nCG~Cg~~C~~g~-~cC~G~Cvd~ 108 (136)
T PF04885_consen 76 KCVDVSSDRNNCGACGNKCPYGQ-TCCGGQCVDL 108 (136)
T ss_pred cCCccCCCccccHhhcCCCCCCc-eecCCEeECC
Confidence 34443 4666665 34566665 4546666643
No 172
>PF05096 Glu_cyclase_2: Glutamine cyclotransferase; InterPro: IPR007788 This family of enzymes 2.3.2.5 from EC catalyse the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively []. This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.; PDB: 3NOK_B 2FAW_A 2IWA_A 3NOM_A 3NOL_A 3MBR_X.
Probab=20.47 E-value=3.5e+02 Score=26.06 Aligned_cols=67 Identities=10% Similarity=0.134 Sum_probs=43.4
Q ss_pred ccEEecCCCCeEEEeC----CCcEEEEEcCCCC-cEEEEeCCCcccceEEEeCCCCeEEEEeCCCCcEEEEEccC
Q psy4900 404 TDLALDPTSGLMFVAD----SNQILRTNMDGTM-AMSIVSEAAYKASGVALDINAKRLFWCDNLLDYIETVDYEG 473 (485)
Q Consensus 404 ~~iavD~~~~~lywtd----~~~I~r~~~dG~~-~~~i~~~~~~~p~glavD~~~~~lYW~D~~~~~I~~~~~dG 473 (485)
.||.++ ..|.||=+. ..+|.+.++.... .+..--..-...+|||+- .++||-.-|+.+...+.+.+.
T Consensus 48 QGL~~~-~~g~LyESTG~yG~S~l~~~d~~tg~~~~~~~l~~~~FgEGit~~--~d~l~qLTWk~~~~f~yd~~t 119 (264)
T PF05096_consen 48 QGLEFL-DDGTLYESTGLYGQSSLRKVDLETGKVLQSVPLPPRYFGEGITIL--GDKLYQLTWKEGTGFVYDPNT 119 (264)
T ss_dssp EEEEEE-ETTEEEEEECSTTEEEEEEEETTTSSEEEEEE-TTT--EEEEEEE--TTEEEEEESSSSEEEEEETTT
T ss_pred ccEEec-CCCEEEEeCCCCCcEEEEEEECCCCcEEEEEECCccccceeEEEE--CCEEEEEEecCCeEEEEcccc
Confidence 457774 247888777 4689999888543 222222334467899854 788888888887777777653
Done!