Query         gi|254780773|ref|YP_003065186.1| zinc metallopeptidase [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 349
No_of_seqs    200 out of 3136
Neff          8.3
Searched_HMMs 39220
Date          Sun May 29 21:25:13 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780773.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PRK10779 zinc metallopeptidase 100.0       0       0  712.6  28.7  341     1-347 1-449 (449)
  2 TIGR00054 TIGR00054 membrane-a 100.0       0       0  661.1  25.6  341     1-348 1-462 (463)
  3 COG0750 Predicted membrane-ass 100.0       0       0  436.1  24.1  346     4-349 2-375 (375)
  4 cd06163 S2P-M50_PDZ_RseP-like  100.0       0       0  428.1  16.0  174     7-346 1-182 (182)
  5 pfam02163 Peptidase_M50 Peptid 100.0 4.3E-42       0  310.1  14.8  205     9-343 1-205 (205)
  6 cd05709 S2P-M50 Site-2 proteas 100.0 1.3E-33 3.2E-38  251.7   9.4  179     9-342 2-180 (180)
  7 cd06159 S2P-M50_PDZ_Arch Uncha  99.9 2.1E-26 5.4E-31  201.8  11.6  133     6-324 109-241 (263)
  8 LOAD_S2Pmetalloprt consensus    99.9 2.6E-24 6.7E-29  187.4   7.4  147    10-314 2-148 (148)
  9 cd06161 S2P-M50_SpoIVFB SpoIVF  99.8 1.6E-18   4E-23  147.5  14.1  151     7-342 30-180 (208)
 10 cd06164 S2P-M50_SpoIVFB_CBS Sp  99.8 4.6E-18 1.2E-22  144.3  15.1  148     8-337 46-193 (227)
 11 KOG2921 consensus               99.8   4E-18   1E-22  144.8  11.0  137     5-163 121-266 (484)
 12 cd06162 S2P-M50_PDZ_SREBP Ster  99.7 1.5E-15 3.8E-20  127.0  11.5   85     5-111 125-209 (277)
 13 cd06160 S2P-M50_like_2 Unchara  99.6 2.9E-14 7.5E-19  118.1  13.0   78     7-108 33-111 (183)
 14 cd06158 S2P-M50_like_1 Unchara  99.6 1.1E-14 2.8E-19  121.1   7.7   94     8-110 2-102 (181)
 15 cd00989 PDZ_metalloprotease PD  99.4 1.1E-13 2.8E-18  114.2   5.4   75   111-186 5-79 (79)
 16 TIGR02037 degP_htrA_DO proteas  99.3 8.1E-13 2.1E-17  108.1   4.5   66   121-187 296-362 (484)
 17 cd00991 PDZ_archaeal_metallopr  99.2   1E-11 2.6E-16  100.5   4.9   61   120-180 12-73 (79)
 18 PRK10139 serine endoprotease;   99.1 8.9E-11 2.3E-15   94.1   6.3   65   121-186 293-358 (455)
 19 PRK10942 serine endoprotease;   99.1 8.2E-11 2.1E-15   94.3   5.3   65   121-186 315-380 (474)
 20 PRK10898 serine
endoprotease; 99.1 1.2E-10 3E-15 93.3 5.4 65 121-186 283-348 (355) 21 cd00988 PDZ_CTP_protease PDZ d 99.1 2.3E-10 5.8E-15 91.2 5.3 68 119-186 14-83 (85) 22 cd00986 PDZ_LON_protease PDZ d 99.0 4.3E-10 1.1E-14 89.3 4.8 64 120-185 10-74 (79) 23 cd00987 PDZ_serine_protease PD 99.0 5.4E-10 1.4E-14 88.7 4.9 63 120-183 26-89 (90) 24 cd00990 PDZ_glycyl_aminopeptid 99.0 5.3E-10 1.3E-14 88.7 4.7 63 120-185 14-76 (80) 25 TIGR02860 spore_IV_B stage IV 99.0 8.4E-10 2.1E-14 87.3 5.1 78 128-206 141-222 (423) 26 TIGR02038 protease_degS peripl 99.0 8.9E-10 2.3E-14 87.2 4.9 64 121-185 288-352 (358) 27 TIGR02037 degP_htrA_DO proteas 98.7 1.7E-08 4.3E-13 78.4 4.6 60 121-180 419-481 (484) 28 cd00136 PDZ PDZ domain, also c 98.6 3.2E-08 8.2E-13 76.4 4.1 54 119-172 14-69 (70) 29 COG0793 Prc Periplasmic protea 98.6 2.7E-07 6.8E-12 70.1 8.4 69 118-186 112-183 (406) 30 PRK10139 serine endoprotease; 98.6 7.6E-08 1.9E-12 73.8 4.7 59 120-179 392-450 (455) 31 PRK10942 serine endoprotease; 98.6 8.8E-08 2.3E-12 73.4 4.9 59 120-179 411-469 (474) 32 COG3480 SdrC Predicted secrete 98.6 2.3E-07 5.9E-12 70.5 6.9 153 121-277 133-294 (342) 33 TIGR03279 cyano_FeS_chp putati 98.4 3.2E-07 8.2E-12 69.5 5.0 62 121-185 1-62 (433) 34 COG1994 SpoIVFB Zn-dependent p 98.4 1.2E-06 3.1E-11 65.6 7.7 44 273-316 136-179 (230) 35 PRK11186 carboxy-terminal prot 98.4 2.4E-06 6.1E-11 63.5 8.3 66 120-185 259-334 (673) 36 smart00228 PDZ Domain present 98.4 4.2E-07 1.1E-11 68.7 4.4 57 119-176 27-85 (85) 37 PRK10779 zinc metallopeptidase 98.2 1.6E-06 4.2E-11 64.6 4.9 59 120-178 128-187 (449) 38 COG0265 DegQ Trypsin-like seri 98.2 2E-06 5.1E-11 64.1 5.1 66 120-186 272-338 (347) 39 KOG3129 consensus 98.2 2.5E-06 6.5E-11 63.3 5.6 78 120-198 141-221 (231) 40 pfam00595 PDZ PDZ domain (Also 98.2 1.5E-06 3.9E-11 64.8 3.4 52 120-172 26-79 (80) 41 cd00992 PDZ_signaling PDZ doma 98.1 2.4E-06 6E-11 63.6 3.6 54 118-172 26-81 (82) 42 TIGR00054 TIGR00054 membrane-a 97.8 1.6E-05 4.2E-10 57.8 3.4 83 120-202 137-229 (463) 43 COG3975 Predicted 
protease wit 97.6 8.2E-05 2.1E-09 52.9 4.6 59 121-187 465-523 (558) 44 KOG1320 consensus 97.6 7.7E-05 2E-09 53.1 4.2 95 94-189 369-469 (473) 45 KOG1421 consensus 97.6 9.4E-05 2.4E-09 52.5 4.4 64 121-186 306-369 (955) 46 KOG3209 consensus 97.5 0.00011 2.8E-09 52.1 3.7 62 115-177 775-839 (984) 47 pfam04495 GRASP55_65 GRASP55/6 97.0 0.0014 3.5E-08 44.5 5.0 68 121-188 46-115 (280) 48 PRK09681 putative type II secr 96.9 0.0011 2.8E-08 45.2 4.1 58 125-183 254-315 (319) 49 TIGR01713 typeII_sec_gspC gene 96.9 0.0011 2.8E-08 45.1 3.7 54 125-178 220-274 (281) 50 pfam01434 Peptidase_M41 Peptid 96.7 0.0017 4.3E-08 43.9 3.5 67 17-102 10-78 (192) 51 KOG3532 consensus 96.6 0.0023 5.9E-08 42.9 3.5 47 121-167 401-447 (1051) 52 TIGR00225 prc C-terminal proce 96.3 0.018 4.5E-07 36.8 6.7 58 119-176 67-131 (361) 53 COG3031 PulC Type II secretory 96.3 0.0059 1.5E-07 40.1 4.0 54 125-178 214-268 (275) 54 COG1994 SpoIVFB Zn-dependent p 96.2 0.0028 7.3E-08 42.3 2.1 72 14-107 51-123 (230) 55 KOG3553 consensus 96.2 0.0028 7.1E-08 42.4 2.0 55 120-176 61-117 (124) 56 KOG3580 consensus 96.0 0.0035 8.8E-08 41.7 1.9 53 121-173 432-487 (1027) 57 KOG3834 consensus 95.9 0.01 2.6E-07 38.4 3.9 69 121-189 112-182 (462) 58 KOG3552 consensus 95.9 0.0055 1.4E-07 40.3 2.4 55 119-175 75-132 (1298) 59 PRK10733 hflB ATP-dependent me 95.8 0.0084 2.1E-07 39.1 3.3 39 23-65 34-72 (644) 60 KOG3834 consensus 95.7 0.013 3.3E-07 37.8 3.7 71 118-189 15-88 (462) 61 KOG3209 consensus 95.7 0.0092 2.3E-07 38.8 2.9 54 121-174 374-431 (984) 62 KOG0606 consensus 94.7 0.014 3.6E-07 37.5 1.5 42 120-161 660-703 (1205) 63 KOG3542 consensus 94.3 0.022 5.7E-07 36.1 1.8 34 121-154 565-598 (1283) 64 KOG1421 consensus 94.3 0.062 1.6E-06 33.1 3.9 54 121-184 774-827 (955) 65 KOG3651 consensus 93.8 0.042 1.1E-06 34.2 2.4 53 121-174 33-88 (429) 66 KOG3580 consensus 93.5 0.056 1.4E-06 33.4 2.5 57 122-179 223-282 (1027) 67 KOG3549 consensus 93.4 0.066 1.7E-06 32.9 2.8 55 119-174 81-138 (505) 68 KOG3605 consensus 92.9 0.19 4.7E-06 29.8 4.5 
67 110-176 663-735 (829) 69 TIGR01241 FtsH_fam ATP-depende 92.6 0.041 1.1E-06 34.3 0.7 11 178-188 338-348 (505) 70 KOG1892 consensus 92.4 0.076 1.9E-06 32.4 1.9 57 120-177 962-1021(1629) 71 KOG3551 consensus 90.2 0.16 4.2E-06 30.2 1.8 55 119-174 111-168 (506) 72 COG0465 HflB ATP-dependent Zn 89.3 0.29 7.3E-06 28.5 2.5 27 29-55 32-58 (596) 73 KOG0609 consensus 88.3 0.32 8E-06 28.2 2.1 55 119-174 147-204 (542) 74 KOG1738 consensus 87.6 0.61 1.6E-05 26.2 3.2 44 121-164 228-274 (638) 75 KOG3550 consensus 86.4 0.38 9.8E-06 27.6 1.7 54 118-172 114-171 (207) 76 KOG3606 consensus 86.4 0.73 1.8E-05 25.7 3.1 55 121-176 197-254 (358) 77 KOG3571 consensus 86.1 0.61 1.6E-05 26.2 2.6 57 120-176 279-340 (626) 78 pfam11874 DUF3394 Domain of un 85.7 0.56 1.4E-05 26.5 2.2 26 120-145 124-149 (183) 79 KOG3605 consensus 83.0 0.67 1.7E-05 25.9 1.7 26 317-342 585-610 (829) 80 pfam01079 Hint Hint module. Th 80.8 1.8 4.5E-05 23.0 3.1 58 128-187 24-82 (214) 81 pfam07136 DUF1385 Protein of u 79.4 4.8 0.00012 20.0 5.6 16 95-110 43-58 (235) 82 pfam02128 Peptidase_M36 Fungal 77.7 0.18 4.7E-06 29.8 -2.7 17 304-320 315-331 (368) 83 cd04278 ZnMc_MMP Zinc-dependen 73.0 1.8 4.7E-05 22.9 1.4 12 53-64 4-15 (157) 84 TIGR02289 M3_not_pepF oligoend 72.4 2 5.1E-05 22.7 1.5 25 136-160 253-278 (553) 85 cd06457 M3A_MIP Peptidase M3 m 69.0 3.5 8.9E-05 21.0 2.1 17 23-39 138-154 (458) 86 KOG4407 consensus 69.0 5.2 0.00013 19.8 3.0 45 121-165 146-192 (1973) 87 COG5233 GRH1 Peripheral Golgi 68.2 3.3 8.4E-05 21.2 1.9 31 121-151 66-96 (417) 88 cd06456 M3A_DCP_Oligopeptidase 66.9 4.4 0.00011 20.3 2.3 22 300-321 395-416 (422) 89 pfam00413 Peptidase_M10 Matrix 66.2 2.1 5.2E-05 22.6 0.5 14 50-64 2-15 (158) 90 TIGR01700 PNPH purine nucleosi 66.0 2.9 7.4E-05 21.5 1.2 15 132-146 96-110 (259) 91 COG0339 Dcp Zn-dependent oligo 63.8 5.1 0.00013 19.8 2.2 32 225-257 462-493 (683) 92 pfam01432 Peptidase_M3 Peptida 63.5 5.6 0.00014 19.6 2.3 20 300-319 421-440 (448) 93 COG0260 PepB Leucyl aminopepti 62.1 3.4 8.8E-05 21.0 1.0 29 
122-151 302-330 (485) 94 cd06459 M3B_Oligoendopeptidase 61.8 4.2 0.00011 20.4 1.4 27 136-162 137-165 (427) 95 TIGR00181 pepF oligoendopeptid 61.7 3.9 9.9E-05 20.6 1.2 19 15-33 389-410 (611) 96 cd06455 M3A_TOP Peptidase M3 T 60.3 6 0.00015 19.3 2.0 22 299-320 446-467 (472) 97 pfam00883 Peptidase_M17 Cytoso 59.0 4.2 0.00011 20.4 1.0 29 122-151 135-163 (312) 98 PRK04860 hypothetical protein; 58.7 5 0.00013 19.9 1.4 18 18-36 66-83 (160) 99 TIGR01697 PNPH-PUNA-XAPA inosi 57.8 6.2 0.00016 19.3 1.7 12 133-144 103-114 (266) 100 pfam06838 Alum_res Aluminium r 54.6 8.3 0.00021 18.4 1.9 42 129-170 87-128 (405) 101 cd06258 Peptidase_M3_like The 54.1 8.7 0.00022 18.3 2.0 21 300-320 340-360 (365) 102 PRK00913 leucyl aminopeptidase 51.7 6.2 0.00016 19.2 0.9 27 124-151 308-334 (491) 103 KOG3714 consensus 51.6 4.8 0.00012 20.0 0.3 14 134-147 141-155 (411) 104 cd00433 Peptidase_M17 Cytosol 50.1 6.4 0.00016 19.2 0.8 27 124-151 291-317 (468) 105 KOG1565 consensus 48.5 6.2 0.00016 19.3 0.5 12 15-26 211-222 (469) 106 TIGR02386 rpoC_TIGR DNA-direct 48.0 5.7 0.00014 19.5 0.2 10 136-145 1295-1304(1552) 107 PRK05113 electron transport co 43.4 22 0.00057 15.4 3.6 34 4-37 1-34 (184) 108 KOG2597 consensus 43.1 11 0.00029 17.4 1.2 25 126-151 328-352 (513) 109 pfam04298 Zn_peptidase_2 Putat 43.0 15 0.00039 16.6 1.8 32 277-308 143-176 (222) 110 cd04282 ZnMc_meprin Zinc-depen 42.7 12 0.0003 17.3 1.2 11 135-145 104-114 (230) 111 TIGR03296 M6dom_TIGR03296 M6 f 41.4 4.2 0.00011 20.5 -1.3 11 142-152 140-150 (286) 112 COG2317 Zn-dependent carboxype 40.8 16 0.00041 16.4 1.6 11 166-176 243-253 (497) 113 COG4307 Uncharacterized protei 40.6 15 0.00039 16.5 1.5 17 56-72 56-72 (349) 114 KOG3984 consensus 40.5 9.4 0.00024 18.0 0.4 28 130-157 118-145 (286) 115 TIGR02290 M3_fam_3 oligoendope 40.1 16 0.00041 16.4 1.6 16 87-102 219-234 (600) 116 COG4956 Integral membrane prot 40.0 19 0.00048 16.0 1.9 34 145-178 272-307 (356) 117 COG3091 SprT Zn-dependent meta 39.9 21 0.00052 15.7 2.1 23 13-36 59-81 (156) 118 
cd06460 M32_Taq Peptidase fami 38.2 19 0.00049 15.9 1.7 16 305-320 371-386 (396) 119 COG4100 Cystathionine beta-lya 38.1 16 0.0004 16.4 1.3 40 131-170 99-138 (416) 120 pfam06167 MtfA Phosphoenolpyru 37.4 15 0.00039 16.5 1.1 18 145-162 128-145 (248) 121 PRK13267 archaemetzincin-like 36.8 14 0.00035 16.9 0.8 12 138-149 67-78 (177) 122 pfam08434 CLCA_N Calcium-activ 36.8 13 0.00033 17.0 0.6 16 129-144 91-106 (262) 123 pfam05547 Peptidase_M6 Immune 35.8 3.9 0.0001 20.6 -2.1 22 152-173 419-440 (646) 124 pfam02074 Peptidase_M32 Carbox 35.7 23 0.00058 15.4 1.7 16 305-320 470-485 (494) 125 COG3732 SrlE Phosphotransferas 34.2 31 0.00079 14.4 5.1 56 258-313 167-231 (328) 126 COG1164 Oligoendopeptidase F [ 34.1 29 0.00073 14.7 2.0 31 131-161 286-322 (598) 127 TIGR01592 holin_SPP1 holin, SP 33.7 31 0.0008 14.4 2.3 24 274-297 8-31 (82) 128 COG4219 MecR1 Antirepressor re 33.0 20 0.0005 15.8 1.1 16 289-304 269-284 (337) 129 PRK07118 ferredoxin; Validated 31.1 35 0.00088 14.1 3.5 34 4-37 1-34 (276) 130 cd03506 Delta6-FADS-like The D 31.0 35 0.00089 14.1 2.5 32 5-38 3-34 (204) 131 pfam10462 Peptidase_M66 Peptid 30.5 7.7 0.0002 18.6 -1.4 15 55-69 46-60 (304) 132 pfam11667 DUF3267 Protein of u 30.5 9 0.00023 18.1 -1.0 21 15-35 4-24 (107) 133 pfam01435 Peptidase_M48 Peptid 29.9 26 0.00065 15.0 1.2 21 130-150 40-60 (222) 134 COG2707 Predicted membrane pro 29.6 37 0.00093 13.9 2.8 23 288-310 125-147 (151) 135 PRK05015 aminopeptidase B; Pro 29.6 7.9 0.0002 18.5 -1.4 27 124-151 242-268 (424) 136 PRK09194 prolyl-tRNA synthetas 29.1 34 0.00086 14.2 1.7 21 42-62 88-108 (570) 137 cd03507 Delta12-FADS-like The 29.1 37 0.00095 13.9 2.4 27 6-32 37-63 (222) 138 TIGR00819 ydaH AbgT transporte 28.6 31 0.00078 14.5 1.4 30 273-302 403-441 (527) 139 TIGR01393 lepA GTP-binding pro 28.5 20 0.00052 15.7 0.5 35 33-72 50-85 (598) 140 pfam00262 Calreticulin Calreti 27.4 32 0.00082 14.3 1.4 25 64-88 87-111 (359) 141 TIGR02149 glgA_Coryne glycogen 27.3 40 0.001 13.7 2.0 72 129-206 152-226 (416) 142 COG4043 
Preprotein translocase 27.2 28 0.00072 14.7 1.1 37 131-168 29-72 (111) 143 PRK07740 hypothetical protein; 27.0 19 0.00048 15.9 0.1 21 132-152 65-87 (240) 144 TIGR01687 moaD_arch MoaD famil 26.7 37 0.00094 13.9 1.6 47 121-169 31-85 (93) 145 pfam09101 Exotox-A_bind Exotox 26.4 41 0.0011 13.6 2.4 13 138-150 31-43 (274) 146 COG0442 ProS Prolyl-tRNA synth 26.3 41 0.001 13.6 1.7 21 42-62 88-108 (500) 147 cd03510 Rhizobitoxine-FADS-lik 25.0 44 0.0011 13.4 2.8 25 12-38 31-55 (175) 148 PRK04897 heat shock protein Ht 24.9 35 0.0009 14.0 1.2 17 131-147 110-126 (298) 149 TIGR02737 caa3_CtaG cytochrome 24.5 45 0.0011 13.3 3.4 36 307-342 100-138 (286) 150 pfam02031 Peptidase_M7 Strepto 24.0 19 0.0005 15.8 -0.3 22 152-173 24-45 (132) 151 PRK01827 thyA thymidylate synt 23.8 46 0.0012 13.2 2.9 50 130-179 188-239 (264) 152 COG4273 Uncharacterized conser 23.6 47 0.0012 13.2 1.8 33 121-153 49-81 (135) 153 TIGR03284 thym_sym thymidylate 23.5 47 0.0012 13.2 3.1 50 130-179 220-271 (296) 154 PRK03982 heat shock protein Ht 23.4 39 0.001 13.7 1.2 31 131-161 98-130 (288) 155 pfam04228 Zn_peptidase Putativ 23.4 33 0.00084 14.3 0.8 14 164-177 113-126 (292) 156 KOG3638 consensus 23.1 48 0.0012 13.1 2.9 61 126-187 218-278 (414) 157 COG3127 Predicted ABC-type tra 22.9 47 0.0012 13.2 1.5 44 299-342 773-816 (829) 158 TIGR03616 RutG pyrimidine util 22.8 48 0.0012 13.1 4.1 84 252-335 307-396 (429) 159 KOG0662 consensus 22.1 38 0.00096 13.9 0.9 11 166-176 128-138 (292) 160 pfam00303 Thymidylat_synt Thym 22.1 50 0.0013 13.0 2.8 47 130-176 186-234 (262) 161 pfam03926 consensus 21.8 50 0.0013 13.0 1.9 16 17-32 59-74 (149) 162 COG2738 Predicted Zn-dependent 21.7 51 0.0013 13.0 1.7 35 276-310 144-180 (226) 163 KOG2090 consensus 21.6 48 0.0012 13.1 1.4 21 293-313 581-601 (704) 164 PRK02391 heat shock protein Ht 21.1 47 0.0012 13.2 1.2 31 131-161 107-139 (297) 165 PRK10720 uracil transporter; P 20.8 53 0.0013 12.8 3.9 83 251-333 287-375 (429) 166 PRK05973 replicative DNA helic 20.8 50 0.0013 13.0 1.3 22 
123-144 46-69 (237) 167 PRK11067 outer membrane protei 20.8 53 0.0013 12.8 2.6 11 55-65 263-273 (801) 168 PHA01511 coat protein 20.6 53 0.0014 12.8 3.5 67 132-205 263-329 (430) 169 PRK05457 heat shock protein Ht 20.6 45 0.0012 13.3 1.0 31 131-161 107-139 (289) 170 TIGR02299 HpaE 5-carboxymethyl 20.2 28 0.00073 14.7 -0.0 18 250-267 445-462 (494) No 1 >PRK10779 zinc metallopeptidase; Provisional Probab=100.00 E-value=0 Score=712.58 Aligned_cols=341 Identities=30% Similarity=0.520 Sum_probs=306.2 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCC----- Q ss_conf 936899999999999999997337799999849750045530683112786059807999977112111001244----- Q gi|254780773|r 1 MFWLDCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDE----- 75 (349) Q Consensus 1 m~~~~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e----- 75 (349) |.+++++++|+++|+++|++||+|||++||+|||||++|||||||+||++++|+||||+++|||+||||||.||+ T Consensus 1 m~~l~~i~~fi~~l~~~V~iHElGHfl~Ak~~gv~V~~FsIGfGp~l~~~~~k~gTey~i~~iPlGGYVkm~~e~~~~~~ 80 (449) T PRK10779 1 LSILWNLAAFIVALGVLITVHEFGHFWVARRCGVRVERFSIGFGKALWRRTDRLGTEYVIALIPLGGYVKMLDERAEPVA 80 (449) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCEECEEEECCCCHHEEEECCCCEEEEEEEEECCCEEECCCCCCCCCC T ss_conf 93799999999999999998857879999986985487975688152468367981999999503228835778866688 Q ss_pred --CCHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCC-CC------------------------------------ Q ss_conf --550665036842101342112333211221110111001-23------------------------------------ Q gi|254780773|r 76 --KDMRSFFCAAPWKKILTVLAGPLANCVMAILFFTFFFYN-TG------------------------------------ 116 (349) Q Consensus 76 --~~~~~f~~~~~~~R~~i~~AGp~~N~ilA~iif~~~~~~-~g------------------------------------ 116 (349) +++++|++||+|||+++++|||++||++|+++|++++.. 
.+ T Consensus 81 ~~~~~~~f~~k~~~~R~~i~~aGp~~N~ila~iif~~if~~G~~~~~~~i~~v~~~s~a~~agl~~GD~i~~idg~~~~~ 160 (449) T PRK10779 81 PELRHHAFNNKTVGQRAAIIAAGPVANFIFAIFAYWLVFIIGVPGVRPVVGEIAPNSIAAQAQIAPGTELKAVDGIETPD 160 (449) T ss_pred CCHHHHHHCCCCCCEEEEEECCCHHHHHHHHHHHHHHEEECCCCCCCCEECCCCCCCHHHHCCCCCCCEEEEECCEECCC T ss_conf 31223544138811389874276167778999986622441555545300431468888873888887899989998576 Q ss_pred -----------------------------------------------------------CCCCCCCCCCCCCHHHHHHCC Q ss_conf -----------------------------------------------------------333200024567556530014 Q gi|254780773|r 117 -----------------------------------------------------------VMKPVVSNVSPASPAAIAGVK 137 (349) Q Consensus 117 -----------------------------------------------------------~~~p~I~~V~~~spA~~AGL~ 137 (349) ..+|++++|.|+|||++|||| T Consensus 161 ~~~~~~~l~~~~g~~~~~i~v~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lgi~~~~p~~~~vi~~V~~~spA~~AGL~ 240 (449) T PRK10779 161 WDAVRLQLVSKIGDEQTTVTVAPFGSDQRRDKTLDLRHWAFEPDKQDPVSSLGIRPRGPQIEPVLEEVQPNSAASKAGLQ 240 (449) T ss_pred HHHHHHHHHHHCCCCCEEEEEECCCCCCEEEEEECCHHCCCCCCCCCCHHHCCCCCCCCCCCCEEEEECCCCHHHHCCCC T ss_conf 89889999885057760699940786411333202011024654345112215333677777435420799989974898 Q ss_pred CCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCC-----CC Q ss_conf 5662888878304540011111014678863169996587313101121100565431000012203444432-----12 Q gi|254780773|r 138 KGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSY-----DE 212 (349) Q Consensus 138 ~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~-----~~ 212 (349) +||+|++|||+++++|+|+...++++++++++++++|+++ ..+.+++|+.....+ ...+.+|+.+.. +. 
T Consensus 241 ~GD~I~~Ing~~i~s~~~l~~~i~~~~~~~i~l~v~R~g~-~~~~~v~p~~~~~~~-----~~~g~~gi~~~~~~~~~~~ 314 (449) T PRK10779 241 AGDRIVKVDGQPLTQWVTFVMLVRDNPGKPLALEIERQGS-PLSLTLIPDTKPVNG-----KAEGFAGVVPKVIPLPDEY 314 (449) T ss_pred CCCEEEEECCEECCCHHHHHHHHHHCCCCEEEEEEEECCC-EEEEEEEEEEECCCC-----CEEEEEEECCCCCCCCCCC T ss_conf 8877999999871659999999986899869999997895-899999640351588-----3367985224335676531 Q ss_pred CCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHH Q ss_conf 33345668887689888655422310000000255420022466730122333456530503478999999999999963 Q gi|254780773|r 213 TKLHSRTVLQSFSRGLDEISSITRGFLGVLSSAFGKDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNL 292 (349) Q Consensus 213 ~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~Nl 292 (349) .....+++.+|+..+.+++++++..++..++++++|+.+.+++||||||++++++++++|+..+|+|+|+||+|||+||| T Consensus 315 ~~~~~~~~~~a~~~~~~~t~~~~~~~~~~l~~l~tG~v~~~~lsGPVgIa~~~g~~a~~G~~~~l~~~A~iSi~Lgi~NL 394 (449) T PRK10779 315 KTVRQYGPFSAIYEATDKTWQLMKLTVSMLGKLITGDVKLNNLSGPISIAQGAGMSAELGVVYYLMFLALISVNLGIINL 394 (449) T ss_pred EEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCEEEEEHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 37885178999999999999999999999999823865633357862332011167887599999999999999999985 Q ss_pred HCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 0647224369999999888278699999999999999999999999999998887 Q gi|254780773|r 293 LPIPILDGGHLITFLLEMIRGKSLGVSVTRVITRMGLCIILFLFFLGIRNDIYGL 347 (349) Q Consensus 293 LPip~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~~g~~ll~~l~i~~~~nDi~rl 347 (349) ||||+|||||+++.+||+|||||+|+|++++++++|++++++||++++||||.|| T Consensus 395 LPIP~LDGG~i~~~~iE~I~gr~l~~k~~~~~~~iG~~~li~Lmi~~~~nDi~RL 449 (449) T PRK10779 395 FPLPVLDGGHLLFLAIEKLKGGPVSERVQDFSYRIGSILLVLLMGLALFNDFSRL 449 (449) T ss_pred CCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 7876767069899999998489999899999999999999999999998887539 No 2 >TIGR00054 TIGR00054 membrane-associated zinc metalloprotease, 
putative; InterPro: IPR004387 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. 
The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family contains putative zinc metallopeptidases belonging to MEROPS peptidase family M50 (S2P protease family, clan MM). The N-terminal region contains a perfectly conserved motif HEXGH, where the Glu is the active site and the His residues coordinate the metal cation. The family of bacterial and plant proteins also includes a region that hits the PDZ domain (IPR001478 from INTERPRO), found in a number of proteins targeted to the membrane by binding to a peptide ligand. The family includes EcfE, which is a homolog of human site-2 protease (S2P), a membrane-bound zinc metalloprotease involved in regulated intramembrane proteolysis. In Escherichia coli, EcfE activates the sigma(E) pathway of stress response through a site-2 cleavage of anti-sigma(E), RseA.; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis, 0016021 integral to membrane. 
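The HEXXH / HEXGH motif described above can be checked directly against the query. The following is a minimal sketch, not part of the HHsearch output: it assumes the first 75 query residues as printed in the alignment block for hit No 1, and the names `QUERY_N_TERM` and `find_hexxh` are hypothetical. It encodes only the loose pattern H-E-X-X-H, not the stricter 'abXHEbbHbc' definition.

```python
import re

# First 75 residues of the query (gi|254780773), copied from the
# alignment block for hit No 1. Illustrative constant, not tool output.
QUERY_N_TERM = (
    "MFWLDCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDE"
)

# Loose HEXXH pattern; the lookahead lets overlapping occurrences be reported.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(seq):
    """Return (1-based start position, matched residues) for each HEXXH hit."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_hexxh(QUERY_N_TERM))  # [(21, 'HEFGH')]
```

The single match, HEFGH at query positions 21-25, fits the HEXGH form described for family M50 and lies in the N-terminal region covered by the top-scoring hits.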
Probab=100.00 E-value=0 Score=661.09 Aligned_cols=341 Identities=34% Similarity=0.630 Sum_probs=311.6 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCC---CC-- Q ss_conf 9368999999999999999973377999998497500455306831127860598079999771121110012---44-- Q gi|254780773|r 1 MFWLDCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSE---DE-- 75 (349) Q Consensus 1 m~~~~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~---~e-- 75 (349) |.+|..+.+|+++|+++|++||||||++||+|||||++|||||||+||.+.|+.||||++|+|||||||||+| +| T Consensus 1 ~~~l~~l~sFi~~La~Li~vHElGHF~~Ar~~GvkV~~FsiGFGp~lW~~~~~~~TeY~is~IPLGGYVk~~Gfd~lD~e 80 (463) T TIGR00054 1 MSFLWILASFIIALAVLIFVHELGHFLAARLCGVKVERFSIGFGPKLWLKFKKNGTEYAISLIPLGGYVKMKGFDGLDKE 80 (463) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEECCCCHHHHHCCCCCCEEEEEEECCCCEEECCCCCCCCCC T ss_conf 93788899999999999999988899999877946999742027056520013674589875326543623788997664 Q ss_pred --------CCHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHCC-CCCCCCC------------------------------ Q ss_conf --------5506650368421013421123332112211101-1100123------------------------------ Q gi|254780773|r 76 --------KDMRSFFCAAPWKKILTVLAGPLANCVMAILFFT-FFFYNTG------------------------------ 116 (349) Q Consensus 76 --------~~~~~f~~~~~~~R~~i~~AGp~~N~ilA~iif~-~~~~~~g------------------------------ 116 (349) .++++|+++|..||++|++|||++|+++|++.++ ++.. 
.| T Consensus 81 ~~~~~~~e~~~~~f~~~s~~~k~~i~~aG~~~N~iFa~~~~~~~~~~-~G~~~~~~~~vi~~~~~~S~a~~a~~~~Gd~i 159 (463) T TIGR00054 81 EEEVKPPETDKDLFNNKSVLQKAIIIFAGPLANFIFAIFVYIDLVSL-IGVPGYEVGPVIEELDKNSIALEAGIEPGDEI 159 (463) T ss_pred CCCCCCCCCCHHHHHCCCHHHEEHEECCCHHHHHHHHHHHHHHHHHH-HCCCCCCCCCCCCCCCHHHHHHHHCCCCCCEE T ss_conf 10367652101365268712310100144143599999999989998-04244213664565544579987116898478 Q ss_pred ----------------------------------------------------------------------CCCCCCCCCC Q ss_conf ----------------------------------------------------------------------3332000245 Q gi|254780773|r 117 ----------------------------------------------------------------------VMKPVVSNVS 126 (349) Q Consensus 117 ----------------------------------------------------------------------~~~p~I~~V~ 126 (349) ..++++.+|+ T Consensus 160 l~~~~~~~~~f~~~~~~~~~~~~g~~~~~I~~~PF~S~~e~~~~L~L~~~~~~~~~~~~~~~lgl~~~~P~ie~vl~~~~ 239 (463) T TIGR00054 160 LSVNGKKIPGFKDVRKQIADIVAGEPMVEILAAPFNSDIEREVKLDLRNWTFEVEKEDAVEQLGLKPRGPKIEPVLSDVT 239 (463) T ss_pred EEECCCCCCCCHHHHHHHHHHHCCCCCCEEEECCCCCCCHHHCCCCCEEEEEECCCCCHHHHCCCCCCCCCCEEEECCCC T ss_conf 74077667880889999999751785415776577754112000033123862112552554144247872012331267 Q ss_pred CCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCC Q ss_conf 67556530014566288887830454001111101467886316999658731310112110056543100001220344 Q gi|254780773|r 127 PASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGI 206 (349) Q Consensus 127 ~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi 206 (349) ++|||++||||+||+|+++||++..+|.|+...++++|++++++.++||| ++++.+++|+..++.++ ..+.+|+ T Consensus 240 ~N~~A~~AGLk~GD~I~~i~g~~l~~w~d~v~~v~~np~~~~~i~v~R~G-~~l~~~l~p~~~~~~gK-----aIg~ig~ 313 (463) T TIGR00054 240 PNSPAEKAGLKEGDKIISIDGEKLKSWRDFVSLVKENPGKSLEIKVERNG-ETLSISLTPEAKKDKGK-----AIGFIGI 313 (463) T ss_pred CCCHHHHCCCCCCCEEEEECCCCCCCHHHHHHHHHHCCCCEEEEEEEECC-CEEEEEEEEEEECCCCC-----EEEEEEE T ss_conf 
88537753465688898556812344245899998689956999997278-14634787530079973-----6898775 Q ss_pred CCCCCCCC-----CCCCCHHHHHHHHHHHHHHHHHHHEECCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHH Q ss_conf 44321233-----3456688876898886554223100000002554200224667301223334565305034789999 Q gi|254780773|r 207 SFSYDETK-----LHSRTVLQSFSRGLDEISSITRGFLGVLSSAFGKDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLA 281 (349) Q Consensus 207 ~~~~~~~~-----~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a 281 (349) .|...... ...+++.+|+.++.+.+.++++.++..+.++++|+.+.++|||||||++.+++.|+.|..+++.|.| T Consensus 314 ~P~~~~~~d~~~~v~~~~~l~a~~~~~~~~~~~~~li~~~l~~Li~g~~~l~~lSGPVgIv~~~~~~A~~G~~~ll~F~A 393 (463) T TIGR00054 314 SPSLKKLEDEYKVVVSYGILEALAKAAEATKDIVKLILKLLGKLITGSLKLKNLSGPVGIVKGAGSSANLGIVYLLQFGA 393 (463) T ss_pred CCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCCCCCHHHHHHHHHHHHHHHHHHHHHH T ss_conf 68753031132100244267899999999989999999988875400022200567421545512466764999998999 Q ss_pred HHHHHHHHHHHHC--CCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9999999996306--472243699999998882786999999999999999999999999999988873 Q gi|254780773|r 282 MFSWAIGFMNLLP--IPILDGGHLITFLLEMIRGKSLGVSVTRVITRMGLCIILFLFFLGIRNDIYGLM 348 (349) Q Consensus 282 ~isi~Lg~~NlLP--ip~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~~g~~ll~~l~i~~~~nDi~rl~ 348 (349) +||+|||++|||| ||+|||||++|.++|.|+|||+|+|++..++.+|.+++++||++++|||+.||+ T Consensus 394 llSiNLgi~NLlPiviP~LDGG~llfl~iE~i~Gkp~~~~~q~~~~~~G~~lLl~L~~l~lfND~~rLl 462 (463) T TIGR00054 394 LLSINLGIINLLPIVIPALDGGQLLFLFIEAIRGKPLPEKVQAFVYRIGVALLLLLMVLGLFNDLLRLL 462 (463) T ss_pred HHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 999999999888778624242578788998735897766899999999999999999999988876511 No 3 >COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane] Probab=100.00 E-value=0 Score=436.10 Aligned_cols=346 Identities=33% Similarity=0.613 Sum_probs=298.5 Q ss_pred 
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCC-------- Q ss_conf 899999999999999997337799999849750045530683112786059807999977112111001244-------- Q gi|254780773|r 4 LDCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDE-------- 75 (349) Q Consensus 4 ~~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e-------- 75 (349) +..++.+++++.+++++||+|||+.||+|+++|++|++||||+++++++|.+|+|.++++|+||||+|.+++ T Consensus 2 ~~~~i~~i~~~~~lv~~he~gh~~~a~~~~~~v~~f~ig~g~~l~~~~~~~~~~~~i~~~plggyv~~~~~~~~~~~~~~ 81 (375) T COG0750 2 MLTIIAFIIALGVLVFVHELGHFWVARRCGVKVERFSIGFGPKLFSRKDKGGTEYVLSAIPLGGYVKMLGEDAEEVVLKG 81 (375) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEEECCEEEEEEECCCCEEEEEEECCCEEEEEEEECCCCCCCCCC T ss_conf 07888898888899999998899999863752589786212016998426880899984261139999873554423354 Q ss_pred --CCHHHHCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCC---CCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCC Q ss_conf --55066503684210134211233321122111011100123---3332000245675565300145662888878304 Q gi|254780773|r 76 --KDMRSFFCAAPWKKILTVLAGPLANCVMAILFFTFFFYNTG---VMKPVVSNVSPASPAAIAGVKKGDCIISLDGITV 150 (349) Q Consensus 76 --~~~~~f~~~~~~~R~~i~~AGp~~N~ilA~iif~~~~~~~g---~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V 150 (349) +.++.|..++.|+|..+.+|||.+|++++.+.+.......| ...|.++++.++|||+.||+++||+|+++|++++ T Consensus 82 ~~~~~~~f~~~~~~~~~~~~~~Gp~~n~i~~~~~~~~~~~~~G~~~~~~~~~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i 161 (375) T COG0750 82 PEPRPRAFNAKSVWQRIAIVFAGPLFNFILAIVLFVVLFFVIGLVPVASPVVGEVAPKSAAALAGLRPGDRIVAVDGEKV 161 (375) T ss_pred CCCCHHHHHCCCCCCCEEEEECCCCHHHHHHHHHHHHHHEEECEEECCCCCCCCCCCCCHHHHCCCCCCCEEEECCCEEC T ss_conf 34440333034544141699806542788999999886605210212366543445476788757888978995085204 Q ss_pred CCCHHHHHHCCCCCCCC---CEEEEEE-CCCCE-------EEECCCCCCCC-CCCCCEEEEEEEECCCCCCC---CCCCC Q ss_conf 54001111101467886---3169996-58731-------31011211005-65431000012203444432---12333 Q gi|254780773|r 151 SAFEEVAPYVRENPLHE---ISLVLYR-EHVGV-------LHLKVMPRLQD-TVDRFGIKRQVPSVGISFSY---DETKL 
215 (349) Q Consensus 151 ~s~~dl~~~i~~~~g~~---v~i~v~R-~~~~~-------~~~~v~p~~~~-~~~~~g~~~~~~~igi~~~~---~~~~~ 215 (349) .+|++.+..+..+++.. +++.+.| ++... ...++.|.... ..+........+.++..+.. ..... T Consensus 162 ~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~P~~~~~~~~~~~~~i~~~~i~~~p~~~~~~~~~~ 241 (375) T COG0750 162 ASWDDVRRLLVAAAGDVFNLLTILVIRLDGEAHAVAAEIIKSLGLTPVVIPLKPGDKIVAVDVGAIGLSPNGEPDVGKVL 241 (375) T ss_pred CCHHHHHHHHHHCCCCCCCEEEEEEEECCCEEECCCCCCCCEECCCCCCCCCCCCCEEEEEEEEEECCCCCCCCCCCCCC T ss_conf 56667779987533565550799998326544201012222103667645447896246740004245577664323012 Q ss_pred CCCCHHHHHHHHHHHHHHHHHHHEECCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCC Q ss_conf 45668887689888655422310000000255420022466730122333456530503478999999999999963064 Q gi|254780773|r 216 HSRTVLQSFSRGLDEISSITRGFLGVLSSAFGKDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPI 295 (349) Q Consensus 216 ~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPi 295 (349) .+.++.+++..+++++.+....+++.++++.++..+.++++||++|++..++.++.|+.++++|++|+|++||++||+|+ T Consensus 242 ~~~~~~~~i~~~v~~~~~~~~~~~~~l~~~~~~~~~~~~l~Gpi~i~~~~~~~~~~~~~~~l~~~~~lsi~lg~lNllP~ 321 (375) T COG0750 242 VKYGPLEAVGLAVEKTGRLVKLTLKMLKKLITGDLSLKNLSGPIGIAKIAGAAASLGLINLLFFLALLSINLGILNLLPI 321 (375) T ss_pred CCCCCHHHHHHHHHHHHHHHHHHHHHHHHCEEEECCCCCCCCEEEEEEECCHHHHHHHHHHHHHHHHHHHHHHHHHHCHH T ss_conf 24461677888876556688888877641414521565355406888834466665699999999999999999996425 Q ss_pred CCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 722436999999988827869999999999999999999999999999888739 Q gi|254780773|r 296 PILDGGHLITFLLEMIRGKSLGVSVTRVITRMGLCIILFLFFLGIRNDIYGLMQ 349 (349) Q Consensus 296 p~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~~g~~ll~~l~i~~~~nDi~rl~~ 349 (349) |+|||||+++.++|.++|||++++.+..++..|+++++.+|+++++||+.|++. 
T Consensus 322 p~LDGG~i~~~~~e~~~g~~~~~~~~~~~~~~g~~ll~~~~~~~~~~di~~~~~ 375 (375) T COG0750 322 PPLDGGHLLFYLLEALRGKPLSERVEAALYRIGLALLLLLMLLATFNDLLRLFR 375 (375) T ss_pred HHCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCC T ss_conf 418888999999999808988823512588899999999999998852311259 No 4 >cd06163 S2P-M50_PDZ_RseP-like RseP-like Site-2 proteases (S2P), zinc metalloproteases (MEROPS family M50A), cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses. Also included in this group are such homologs as Bacillus subtilis YluC, Mycobacterium tuberculosis Rv2869c S2P, and Bordetella bronchiseptica HurP. Rv2869c S2P appears to have a role in the regulation of prokaryotic lipid biosynthesis and membrane composition and YluC of Bacillus has a role in transducing membrane stress. This group includes bacterial and eukaryotic S2P/M50s homologs with either one or two PDZ domains present. PDZ domains are believed to have a regulatory role. The RseP PDZ domain is required for the inhibitory reaction that prevents cleavage of its substrate, RseA. 
Probab=100.00 E-value=0 Score=428.08 Aligned_cols=174 Identities=43% Similarity=0.841 Sum_probs=166.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCC--------CCH Q ss_conf 999999999999997337799999849750045530683112786059807999977112111001244--------550 Q gi|254780773|r 7 FLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDE--------KDM 78 (349) Q Consensus 7 ~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e--------~~~ 78 (349) ++.|+++++++|++||+|||++||+|||||++|||||||++++|+ |+||+|+++++|+||||||.||+ +++ T Consensus 1 il~fi~~l~vlV~vHElGH~~~Ar~~Gv~V~~FsIGfGp~l~~~~-~~~T~~~i~~iPlGGyV~~~g~~~~~~~~~~~~~ 79 (182) T cd06163 1 ILAFILVLGILIFVHELGHFLVAKLFGVKVEEFSIGFGPKLFSFK-KGETEYSISAIPLGGYVKMLGEDPEEEADPEDDP 79 (182) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEECCCCCEEEEE-CCCEEEEEEEEHHHHHHHHCCCCCCCCCCCCCCH T ss_conf 964999999999999899899999949856698863785115677-0883698540144356654267765558864232 Q ss_pred HHHCCCCCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHH Q ss_conf 66503684210134211233321122111011100123333200024567556530014566288887830454001111 Q gi|254780773|r 79 RSFFCAAPWKKILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAP 158 (349) Q Consensus 79 ~~f~~~~~~~R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~ 158 (349) ++|++||+|||++|++|||++|+++|+++|+.+ T Consensus 80 ~~f~~~~~~~R~~i~~AGp~~N~ilA~~i~~~l----------------------------------------------- 112 (182) T cd06163 80 RSFNSKPVWQRILIVFAGPLANFLLAIVLFAVL----------------------------------------------- 112 (182) T ss_pred HHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHH----------------------------------------------- T ss_conf 447439989999999858999999999999999----------------------------------------------- Q ss_pred HCCCCCCCCCEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 10146788631699965873131011211005654310000122034444321233345668887689888655422310 Q gi|254780773|r 159 
YVRENPLHEISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGF 238 (349) Q Consensus 159 ~i~~~~g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~ 238 (349) T Consensus 113 -------------------------------------------------------------------------------- 112 (182) T cd06163 113 -------------------------------------------------------------------------------- 112 (182) T ss_pred -------------------------------------------------------------------------------- T ss_conf -------------------------------------------------------------------------------- Q ss_pred EECCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCH Q ss_conf 00000025542002246673012233345653050347899999999999996306472243699999998882786999 Q gi|254780773|r 239 LGVLSSAFGKDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSLGV 318 (349) Q Consensus 239 ~~~l~~l~~g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i~~ 318 (349) +.++|.+|+|||++||||+|||||||+++.+||+++|||+|+ T Consensus 113 --------------------------------------l~~~a~is~~L~i~NLLPip~LDGG~il~~~~E~i~gr~~~~ 154 (182) T cd06163 113 --------------------------------------LSFLALLSINLGILNLLPIPALDGGHLLFLLIEAIRGRPLSE 154 (182) T ss_pred --------------------------------------HHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCH T ss_conf --------------------------------------999999999999997068876772375999999995899999 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9999999999999999999999999888 Q gi|254780773|r 319 SVTRVITRMGLCIILFLFFLGIRNDIYG 346 (349) Q Consensus 319 ~~~~~~~~~g~~ll~~l~i~~~~nDi~r 346 (349) |+++.++.+|+++++.+|+++++|||.| T Consensus 155 ~~~~~~~~~G~~~l~~lm~~~~~nDi~r 182 (182) T cd06163 155 KVEEIIQTIGFALLLGLMLFVTFNDIVR 182 (182) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 9999999999999999999999875149 No 5 >pfam02163 Peptidase_M50 Peptidase family M50. 
Probab=100.00 E-value=4.3e-42 Score=310.07 Aligned_cols=205 Identities=35% Similarity=0.609 Sum_probs=178.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCCCC Q ss_conf 99999999999973377999998497500455306831127860598079999771121110012445506650368421 Q gi|254780773|r 9 LYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAPWK 88 (349) Q Consensus 9 ~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~~~ 88 (349) .+++++.+++++||+||+++||++|++|++|++||||.+++++ +++|+|.++++| |+++ ++++++++++++|+ T Consensus 1 ~~~~~l~i~i~~HE~gH~~~Ar~~G~~v~~~~~~~g~~~~~~~-~~~~~~~~~~~~-g~~~-----~~~~~~~~~~~~~~ 73 (205) T pfam02163 1 AFILALLISVVVHELGHALVARRFGVKVERFAIGFGPLLFLHL-DGATEYTIRLMF-GAFG-----APINREFKKKSRKQ 73 (205) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHCCCCCCCEEECCCCCCHHEE-CCCEEEEEECCC-CCCC-----CCCCHHHCCCCHHH T ss_conf 9079999999999999999999929997424234677421016-687259861146-7677-----76765543577656 Q ss_pred CHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCC Q ss_conf 01342112333211221110111001233332000245675565300145662888878304540011111014678863 Q gi|254780773|r 89 KILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEI 168 (349) Q Consensus 89 R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v 168 (349) |+.|.+|||++|+++|++++.......+...+........++ T Consensus 74 ~~~V~~AGP~~Nlila~i~~~~~~~~~~~~~~~~~~~~~~~~-------------------------------------- 115 (205) T pfam02163 74 RLKISLAGPLANFILALLLLALLLLLPGIPVPPVIGGVVVGS-------------------------------------- 115 (205) T ss_pred HHHHHHCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCC-------------------------------------- T ss_conf 756533117888999999999999833666554433222331-------------------------------------- Q ss_pred EEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCCCCC Q ss_conf 16999658731310112110056543100001220344443212333456688876898886554223100000002554 Q 
gi|254780773|r 169 SLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSAFGK 248 (349) Q Consensus 169 ~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~g 248 (349) T Consensus 116 -------------------------------------------------------------------------------- 115 (205) T pfam02163 116 -------------------------------------------------------------------------------- 115 (205) T ss_pred -------------------------------------------------------------------------------- T ss_conf -------------------------------------------------------------------------------- Q ss_pred CCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHH Q ss_conf 20022466730122333456530503478999999999999963064722436999999988827869999999999999 Q gi|254780773|r 249 DTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSLGVSVTRVITRMG 328 (349) Q Consensus 249 ~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~~g 328 (349) .+++||+++.+..+++++.++..++.+++++|++||+|||||+|||||||+++.++ .+++|+.++|.+++.+.++ T Consensus 116 ----~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~inl~L~~fNLLPippLDGG~il~~ll-~~~~~~~~~~~~~~~~~~~ 190 (205) T pfam02163 116 ----SSLSGPVAIAKVGSTSALSGLIALLAFLALLNLNLGLFNLLPIPPLDGGHILRALL-AVRGRPLNERAENYIYLVG 190 (205) T ss_pred ----CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHH-HHHCCCCHHHHHHHHHHHH T ss_conf ----00253189998756667778999999999999999999648888878289999999-9846851799999999999 Q ss_pred HHHHHHHHHHHHHHH Q ss_conf 999999999999999 Q gi|254780773|r 329 LCIILFLFFLGIRND 343 (349) Q Consensus 329 ~~ll~~l~i~~~~nD 343 (349) +++++.+|+++++|| T Consensus 191 ~~ll~~l~~~~~~~d 205 (205) T pfam02163 191 LALLLLLILLLLFND 205 (205) T ss_pred HHHHHHHHHHHHCCC T ss_conf 999999999880159 No 6 >cd05709 S2P-M50 Site-2 protease (S2P) class of zinc metalloproteases (MEROPS family M50) cleaves transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal 
transduction mechanisms. Members of this family use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. The domain core structure appears to contain at least three transmembrane helices with a catalytic zinc atom coordinated by three conserved residues contained within the consensus sequence HExxH, together with a conserved aspartate residue. The S2P/M50 family of RIP proteases is widely distributed; in eukaryotic cells, they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum (ER) stress responses. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of sterol regulatory element-bindin Probab=100.00 E-value=1.3e-33 Score=251.66 Aligned_cols=179 Identities=36% Similarity=0.637 Sum_probs=148.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCCCC Q ss_conf 99999999999973377999998497500455306831127860598079999771121110012445506650368421 Q gi|254780773|r 9 LYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAPWK 88 (349) Q Consensus 9 ~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~~~ 88 (349) .+++++.+++++||+||+++||++|+++++|+.|| .++. +++++.|++.++|+|||+|+.+++++.. 
++++|+ T Consensus 2 ~~~~~l~i~i~iHE~gH~~~A~~~G~~~~~~~~~~---~~~p-~~~~~~~~~~~ip~gG~~~~~~~~~~~~---~~~~~~ 74 (180) T cd05709 2 AFILALLISVTVHELGHALVARRLGVKVARFSGGF---TLNP-LKHGDPYGIILIPLGGYAKPVGENPRAF---KKPRWQ 74 (180) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHCCCC---CCCC-CCCCCCCCHHHHHCEEEEEEECCCHHHC---CCCCCC T ss_conf 66999999999999999999999699768874753---3366-4699815086631327997755681420---488120 Q ss_pred CHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCC Q ss_conf 01342112333211221110111001233332000245675565300145662888878304540011111014678863 Q gi|254780773|r 89 KILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEI 168 (349) Q Consensus 89 R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v 168 (349) |++|.+|||++|+++|++.+.......+...+.. T Consensus 75 ~~~i~~AGP~~Nl~la~i~~~~~~~~~~~~~~~~---------------------------------------------- 108 (180) T cd05709 75 RLLVALAGPLANLLLALLLLLLLLLLGGLPPAPV---------------------------------------------- 108 (180) T ss_pred EEEEEEEHHHHHHHHHHHHHHHHHHHCCCCCCCH---------------------------------------------- T ss_conf 1210200688999999999999999703544320---------------------------------------------- Q ss_pred EEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCCCCC Q ss_conf 16999658731310112110056543100001220344443212333456688876898886554223100000002554 Q gi|254780773|r 169 SLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSAFGK 248 (349) Q Consensus 169 ~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~g 248 (349) T Consensus 109 -------------------------------------------------------------------------------- 108 (180) T cd05709 109 -------------------------------------------------------------------------------- 108 (180) T ss_pred -------------------------------------------------------------------------------- T ss_conf 
-------------------------------------------------------------------------------- Q ss_pred CCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHHH Q ss_conf 20022466730122333456530503478999999999999963064722436999999988827869999999999999 Q gi|254780773|r 249 DTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSLGVSVTRVITRMG 328 (349) Q Consensus 249 ~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~~g 328 (349) ++....++..++.+++++|+++++|||||+|||||||++..++|..++| .++..+..+ T Consensus 109 -----------------~~~~~~~~~~~l~~~~~in~~l~~fNLlPippLDGg~il~~~l~~~~~~-----~~~~~~~~~ 166 (180) T cd05709 109 -----------------GQAASSGLANLLAFLALINLNLAVFNLLPIPPLDGGRILRALLEAIRGR-----VEERLEAYG 166 (180) T ss_pred -----------------HHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHH-----HHHHHHHHH T ss_conf -----------------3667889999999999999999999808888888599999994799999-----999999999 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 99999999999999 Q gi|254780773|r 329 LCIILFLFFLGIRN 342 (349) Q Consensus 329 ~~ll~~l~i~~~~n 342 (349) +.+++.+++..+++ T Consensus 167 ~~~ll~l~~~~~~~ 180 (180) T cd05709 167 FAILLGLLLLLLLN 180 (180) T ss_pred HHHHHHHHHHHHHC T ss_conf 99999999999709 No 7 >cd06159 S2P-M50_PDZ_Arch Uncharacterized Archaeal homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. 
This group appears to be limited to Archaeal S2P/M50s homologs with additional putative N-terminal transmembrane spanning regions, relative to the core protein, and either one or two PDZ domains present. Probab=99.94 E-value=2.1e-26 Score=201.85 Aligned_cols=133 Identities=32% Similarity=0.510 Sum_probs=118.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCC Q ss_conf 99999999999999973377999998497500455306831127860598079999771121110012445506650368 Q gi|254780773|r 6 CFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAA 85 (349) Q Consensus 6 ~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~ 85 (349) .+.++++++.+.+++||+||.++||.+|+||+ |.|+ + +..+|+|+|| |||++++++++ T Consensus 109 p~~~~~~al~v~~vvHE~~Hgi~ar~~~i~vk--S~G~---l------------l~~ip~GAFv-----Epdeee~~~a~ 166 (263) T cd06159 109 PLPYGIIALVVGVVVHELSHGILARVEGIKVK--SGGL---L------------LLIIPPGAFV-----EPDEEELNKAD 166 (263) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEC--CCHH---H------------HHHHCCCCCC-----CCCHHHHHCCC T ss_conf 37899999999999998888999998186330--1156---7------------8661642235-----88979985288 Q ss_pred CCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCC Q ss_conf 42101342112333211221110111001233332000245675565300145662888878304540011111014678 Q gi|254780773|r 86 PWKKILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPL 165 (349) Q Consensus 86 ~~~R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g 165 (349) +.+|+++++|||.+|+++|++.+..++ T Consensus 167 ~~~r~ri~aAG~~~N~v~~~i~~~l~f----------------------------------------------------- 193 (263) T cd06159 167 RRIRLRIFAAGVTANFVVALIAFALFF----------------------------------------------------- 193 (263) T ss_pred HHHHHHHHHCCCHHHHHHHHHHHHHHH----------------------------------------------------- T ss_conf 154533200360998999999999999----------------------------------------------------- Q ss_pred 
CCCEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCC Q ss_conf 86316999658731310112110056543100001220344443212333456688876898886554223100000002 Q gi|254780773|r 166 HEISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSA 245 (349) Q Consensus 166 ~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l 245 (349) T Consensus 194 -------------------------------------------------------------------------------- 193 (263) T cd06159 194 -------------------------------------------------------------------------------- 193 (263) T ss_pred -------------------------------------------------------------------------------- T ss_conf -------------------------------------------------------------------------------- Q ss_pred CCCCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHH Q ss_conf 5542002246673012233345653050347899999999999996306472243699999998882786999999999 Q gi|254780773|r 246 FGKDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSLGVSVTRVI 324 (349) Q Consensus 246 ~~g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i~~~~~~~~ 324 (349) ++|+.|+|++||+|||||.-||||||++..+.|.+.+|..+++.++.. 
T Consensus 194 -------------------------------l~W~~~iNf~lglfN~lPa~plDGg~v~~~~~~~~~~r~~~~~~~~~~ 241 (263) T cd06159 194 -------------------------------LYWIFWINFLLGLFNCLPAIPLDGGHVFRDLLEALLRRFPSEKAERVV 241 (263) T ss_pred -------------------------------HHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHH T ss_conf -------------------------------999999999999983576665760679999899998538751178899 No 8 >LOAD_S2Pmetalloprt consensus Probab=99.90 E-value=2.6e-24 Score=187.37 Aligned_cols=147 Identities=36% Similarity=0.685 Sum_probs=113.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCCCCC Q ss_conf 99999999999733779999984975004553068311278605980799997711211100124455066503684210 Q gi|254780773|r 10 YTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAPWKK 89 (349) Q Consensus 10 ~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~~~R 89 (349) +++++.+.+++||+||+++||++|++++++++++++.+.. +|+|||++++++++ .++..++|+| T Consensus 2 ~~v~l~i~v~vHElgH~~~A~~~G~~~~~~~~~~~~~~~~-------------~~~Gg~~~~~~~~~---~~~~~~~~~~ 65 (148) T LOAD_S2Pmetall 2 FLIALGVSVVVHELGHALVARRFGVKVESFAIGLGLNLFK-------------IPPGGFVELKGEDP---DLKKKSRKAR 65 (148) T ss_pred CHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHCCHHHH-------------HHCEEEEECCCCCC---HHCCCCHHHH T ss_conf 6299999999999999999999699487787563026776-------------50315775687650---1014785666 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCE Q ss_conf 13421123332112211101110012333320002456755653001456628888783045400111110146788631 Q gi|254780773|r 90 ILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEIS 169 (349) Q Consensus 90 ~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~ 169 (349) ++|.+|||++|+++|++.+...... +...... .+..+.. 
T Consensus 66 ~~I~~AGP~~Nl~la~~~~~~~~~~-g~~~~~~---~~~~~~~------------------------------------- 104 (148) T LOAD_S2Pmetall 66 LLVSAAGPLANLLLALLLLLLLFLL-GVPLLLP---VPVVSVV------------------------------------- 104 (148) T ss_pred HHHHHHCHHHHHHHHHHHHHHHHHH-CCCCCCC---CCCCCHH------------------------------------- T ss_conf 0352313088899999999999995-2231012---2222116------------------------------------- Q ss_pred EEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCCCCCC Q ss_conf 69996587313101121100565431000012203444432123334566888768988865542231000000025542 Q gi|254780773|r 170 LVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSAFGKD 249 (349) Q Consensus 170 i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~g~ 249 (349) T Consensus 105 -------------------------------------------------------------------------------- 104 (148) T LOAD_S2Pmetall 105 -------------------------------------------------------------------------------- 104 (148) T ss_pred -------------------------------------------------------------------------------- T ss_conf -------------------------------------------------------------------------------- Q ss_pred CCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCC Q ss_conf 00224667301223334565305034789999999999999630647224369999999888278 Q gi|254780773|r 250 TRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGK 314 (349) Q Consensus 250 ~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr 314 (349) ..+..++.+.+++|+.+++|||+|+||||||||+..+.+..++| T Consensus 105 ---------------------~~~~~~l~~~~~in~~l~~fNLlPi~pLDGg~Il~~ll~~~~~~ 148 (148) T LOAD_S2Pmetall 105 ---------------------AGLLSFLFFLALLNLNLALFNLLPIPPLDGGKILRALLEELRGR 148 (148) T ss_pred ---------------------HHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHCC T ss_conf ---------------------79999999999999999999946887878499999999998388 No 9 >cd06161 S2P-M50_SpoIVFB SpoIVFB Site-2 protease (S2P), a zinc 
metalloprotease (MEROPS family M50B), regulates intramembrane proteolysis (RIP), and is involved in the pro-sigmaK pathway of bacterial spore formation. SpoIVFB (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. Probab=99.80 E-value=1.6e-18 Score=147.54 Aligned_cols=151 Identities=36% Similarity=0.639 Sum_probs=110.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCC Q ss_conf 99999999999999733779999984975004553068311278605980799997711211100124455066503684 Q gi|254780773|r 7 FLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAP 86 (349) Q Consensus 7 ~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~ 86 (349) -+...+.+..+|++||+||.++||++|++|++ +...|+||.++++++.+ ++ T Consensus 30 ~~~~~l~lf~sVl~HElgH~l~A~~~G~~v~~---------------------I~L~pfGG~a~~~~~~~--------~~ 80 (208) T cd06161 30 GLLEALLLFLSVLLHELGHALVARRYGIRVRS---------------------ITLLPFGGVAELEEEPE--------TP 80 (208) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCC---------------------EEEEEEEEEEECCCCCC--------CH T ss_conf 99999999999999999999999994998886---------------------58885012566367775--------78 Q ss_pred CCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCC Q ss_conf 21013421123332112211101110012333320002456755653001456628888783045400111110146788 Q gi|254780773|r 87 
WKKILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLH 166 (349) Q Consensus 87 ~~R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~ 166 (349) +|++.|..|||++|+++|.+.+...... T Consensus 81 ~~e~~IalAGPl~nl~l~~~~~~l~~~~---------------------------------------------------- 108 (208) T cd06161 81 KEEFVIALAGPLVSLLLAGLFYLLYLLL---------------------------------------------------- 108 (208) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHC---------------------------------------------------- T ss_conf 8888999833488999999999999975---------------------------------------------------- Q ss_pred CCEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCCC Q ss_conf 63169996587313101121100565431000012203444432123334566888768988865542231000000025 Q gi|254780773|r 167 EISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSAF 246 (349) Q Consensus 167 ~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~ 246 (349) +. T Consensus 109 -------------------~~----------------------------------------------------------- 110 (208) T cd06161 109 -------------------PG----------------------------------------------------------- 110 (208) T ss_pred -------------------CC----------------------------------------------------------- T ss_conf -------------------66----------------------------------------------------------- Q ss_pred CCCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHH Q ss_conf 54200224667301223334565305034789999999999999630647224369999999888278699999999999 Q gi|254780773|r 247 GKDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSLGVSVTRVITR 326 (349) Q Consensus 247 ~g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~ 326 (349) . ++ ...++.+++++|+.+++|||+|+.||||||++.++.-+.+++ .|....... 
T Consensus 111 ------~---~~--------------~~~~~~~~~~~Nl~l~~FNLlP~~PLDGGrilrall~~~~~~---~~at~~a~~ 164 (208) T cd06161 111 ------G---GP--------------LSSLLEFLAQVNLILGLFNLLPALPLDGGRVLRALLWRRTGY---RRATRIAAR 164 (208) T ss_pred ------C---CH--------------HHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCH---HHHHHHHHH T ss_conf ------4---16--------------999999999999999999828787886699999999986178---999999999 Q ss_pred HHHHHHHHHHHHHHHH Q ss_conf 9999999999999999 Q gi|254780773|r 327 MGLCIILFLFFLGIRN 342 (349) Q Consensus 327 ~g~~ll~~l~i~~~~n 342 (349) +|..+-..+++..++. T Consensus 165 ~g~~~~~~l~~~g~~~ 180 (208) T cd06161 165 IGQLFAILLVVLGLFL 180 (208) T ss_pred HHHHHHHHHHHHHHHH T ss_conf 9999999999999999 No 10 >cd06164 S2P-M50_SpoIVFB_CBS SpoIVFB Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50B), regulates intramembrane proteolysis (RIP), and is involved in the pro-sigmaK pathway of bacterial spore formation. In this subgroup, SpoIVFB (sporulation protein, stage IV cell wall formation, F locus, promoter-distal B) contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain. SpoIVFB is one of 4 proteins involved in endospore formation; the others are SpoIVFA (sporulation protein, stage IV cell wall formation, F locus, promoter-proximal A), BofA (bypass-of-forespore A), and SpoIVB (sporulation protein, stage IV cell wall formation, B locus). SpoIVFB is negatively regulated by SpoIVFA and BofA and activated by SpoIVB. It is thought that SpoIVFB, SpoIVFA, and BofA are located in the mother-cell membrane that surrounds the forespore and that SpoIVB is secreted from the forespore into the space between the two where it activates SpoIVFB. 
It has been proposed tha Probab=99.80 E-value=4.6e-18 Score=144.31 Aligned_cols=148 Identities=31% Similarity=0.560 Sum_probs=103.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCCC Q ss_conf 99999999999997337799999849750045530683112786059807999977112111001244550665036842 Q gi|254780773|r 8 LLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAPW 87 (349) Q Consensus 8 ~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~~ 87 (349) +.+.+.+.++|++||+||.++||++|++|++ +...|+||..+++++.+ +++ T Consensus 46 ~~~a~~lf~sVllHElgHal~Ar~~G~~v~~---------------------I~L~~fGG~a~~~~~~~--------~p~ 96 (227) T cd06164 46 LAAALLLFASVLLHELGHSLVARRYGIPVRS---------------------ITLFLFGGVARLEREPE--------TPG 96 (227) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHCCCCCCC---------------------EEEEEEEEEEECCCCCC--------CHH T ss_conf 9999999999999999999999992998471---------------------78896351465258999--------967 Q ss_pred CCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCC Q ss_conf 10134211233321122111011100123333200024567556530014566288887830454001111101467886 Q gi|254780773|r 88 KKILTVLAGPLANCVMAILFFTFFFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHE 167 (349) Q Consensus 88 ~R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~ 167 (349) +.+.|..|||++|++++.+.+.......+ T Consensus 97 ~e~~Ia~AGPl~n~~l~~~~~~l~~~~~~--------------------------------------------------- 125 (227) T cd06164 97 QEFVIAIAGPLVSLVLALLFLLLSLALPG--------------------------------------------------- 125 (227) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCC--------------------------------------------------- T ss_conf 77266440178999999999999987455--------------------------------------------------- Q ss_pred CEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCCCC Q ss_conf 31699965873131011211005654310000122034444321233345668887689888655422310000000255 Q gi|254780773|r 168 
ISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVGISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSAFG 247 (349) Q Consensus 168 v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~igi~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l~~ 247 (349) .. T Consensus 126 ---------~~--------------------------------------------------------------------- 127 (227) T cd06164 126 ---------SG--------------------------------------------------------------------- 127 (227) T ss_pred ---------CC--------------------------------------------------------------------- T ss_conf ---------64--------------------------------------------------------------------- Q ss_pred CCCCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHHHHH Q ss_conf 42002246673012233345653050347899999999999996306472243699999998882786999999999999 Q gi|254780773|r 248 KDTRLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSLGVSVTRVITRM 327 (349) Q Consensus 248 g~~~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i~~~~~~~~~~~ 327 (349) .++ ...++.+++++|+.+++|||||.+||||||++.+++-+.+|++ .|........ 
T Consensus 128 --------~~~--------------~~~~~~~l~~~Nl~l~~fNLLP~~PLDGGrilra~lw~~~g~~--~~at~~a~~~ 183 (227) T cd06164 128 --------AGP--------------LGVLLGYLALINLLLAVFNLLPAFPLDGGRVLRALLWRRTGDY--LKATRIAAWV 183 (227) T ss_pred --------CHH--------------HHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCH--HHHHHHHHHH T ss_conf --------407--------------9999999999999999997387888870799999999995779--8999999999 Q ss_pred HHHHHHHHHH Q ss_conf 9999999999 Q gi|254780773|r 328 GLCIILFLFF 337 (349) Q Consensus 328 g~~ll~~l~i 337 (349) |-++-+.+++ T Consensus 184 G~~~a~~l~~ 193 (227) T cd06164 184 GRGFAVLLII 193 (227) T ss_pred HHHHHHHHHH T ss_conf 9999999999 No 11 >KOG2921 consensus Probab=99.77 E-value=4e-18 Score=144.76 Aligned_cols=137 Identities=27% Similarity=0.438 Sum_probs=111.0 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCC Q ss_conf 99999999999999997337799999849750045530683112786059807999977112111001244550665036 Q gi|254780773|r 5 DCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCA 84 (349) Q Consensus 5 ~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~ 84 (349) ..+=+|+.++.+.+++||+||.++|-..||+|+-|.|.+ +...| |||| |-|.+..++. 
T Consensus 121 ~~I~yf~t~lvi~~vvHElGHalAA~segV~vngfgIfi----------------~aiyP-gafv-----dl~~dhLqsl 178 (484) T KOG2921 121 SGIAYFLTSLVITVVVHELGHALAAASEGVQVNGFGIFI----------------AAIYP-GAFV-----DLDNDHLQSL 178 (484) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEEEE----------------EEECC-CHHH-----HHHHHHHHHC T ss_conf 222156656777787887657999875486130058999----------------88737-5121-----0016677615 Q ss_pred CCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCC--------CCCCCCCCCHHH-HHHCCCCCEEEEECCCCCCCCHH Q ss_conf 842101342112333211221110111001233332--------000245675565-30014566288887830454001 Q gi|254780773|r 85 APWKKILTVLAGPLANCVMAILFFTFFFYNTGVMKP--------VVSNVSPASPAA-IAGVKKGDCIISLDGITVSAFEE 155 (349) Q Consensus 85 ~~~~R~~i~~AGp~~N~ilA~iif~~~~~~~g~~~p--------~I~~V~~~spA~-~AGL~~GD~Il~InG~~V~s~~d 155 (349) ++.+|++|+.||+.-||++|.+...+++...-...| +|.+|...||+. .-||++||+|+++||.+|++.+| T Consensus 179 ~~fr~LrIfcAGIWHNfvfallc~lal~~lpViLsPfya~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~d 258 (484) T KOG2921 179 PSFRALRIFCAGIWHNFVFALLCVLALFLLPVILSPFYAHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSD 258 (484) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCHHHHCCCEEEEEECCCCCCCCCCCCCCCCCEEEECCCCCCCCHHH T ss_conf 44777888765488889999999999986247423466638507999445557775765677665577537854588889 Q ss_pred HHHHCCCC Q ss_conf 11110146 Q gi|254780773|r 156 VAPYVREN 163 (349) Q Consensus 156 l~~~i~~~ 163 (349) ..+-++.+ T Consensus 259 W~ecl~ts 266 (484) T KOG2921 259 WLECLATS 266 (484) T ss_pred HHHHHHHH T ss_conf 99999864 No 12 >cd06162 S2P-M50_PDZ_SREBP Sterol regulatory element-binding protein (SREBP) Site-2 protease (S2P), a zinc metalloprotease (MEROPS family M50A), regulates intramembrane proteolysis (RIP) of SREBP and is part of a signal transduction mechanism involved in sterol and lipid metabolism. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of SREBPs from membranes of the endoplasmic reticulum (ER). 
These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. The first cleavage occurs at Site-1 within the ER lumen to generate an intermediate that is subsequently released from the membrane by cleavage at Site-2, which lies within the first transmembrane domain. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD family. This group appears to be limited to eumetazoan proteins and contains one PDZ domain. Probab=99.66 E-value=1.5e-15 Score=126.99 Aligned_cols=85 Identities=25% Similarity=0.378 Sum_probs=71.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCC Q ss_conf 99999999999999997337799999849750045530683112786059807999977112111001244550665036 Q gi|254780773|r 5 DCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCA 84 (349) Q Consensus 5 ~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~ 84 (349) ...-++++++.+..++||+||.++|.+++|+|+.|.+. + .-.+| |+|| |-+.+..++. 
T Consensus 125 s~l~Y~~~al~is~v~HElGHA~aA~~e~V~v~~~G~~----~------------~~i~P-gA~v-----~l~~~~L~~l 182 (277) T cd06162 125 SQLGYYFTALLISGVVHEMGHGVAAVREQVRVNGFGIF----F------------FIIYP-GAYV-----DLFTDHLNLI 182 (277) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEEEEE----E------------EEEEC-CCEE-----EECHHHHHCC T ss_conf 89899999999999999998999998628715222499----9------------99802-4047-----7179998348 Q ss_pred CCCCCHHHHHHHHHHHHHHHHHHCCCC Q ss_conf 842101342112333211221110111 Q gi|254780773|r 85 APWKKILTVLAGPLANCVMAILFFTFF 111 (349) Q Consensus 85 ~~~~R~~i~~AGp~~N~ilA~iif~~~ 111 (349) ++|||++|+.||+..|+++|.+.+..+ T Consensus 183 ~~~~~Lri~cAGiWHN~vl~~~a~lll 209 (277) T cd06162 183 SPVQQLRIFCAGVWHNFVLGLVGYLLL 209 (277) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 888877864413698999999999999 No 13 >cd06160 S2P-M50_like_2 Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group includes bacterial, eukaryotic, and Archaeal S2P/M50s homologs with additional putative N- and C-terminal transmembrane spanning regions, relative to the core protein, and no PDZ domains. 
Probab=99.60 E-value=2.9e-14 Score=118.06 Aligned_cols=78 Identities=26% Similarity=0.433 Sum_probs=58.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEE-EEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCC Q ss_conf 99999999999999733779999984975004-55306831127860598079999771121110012445506650368 Q gi|254780773|r 7 FLLYTVSLIIIVVIHEFGHYMVARLCNIRVLS-FSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAA 85 (349) Q Consensus 7 ~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~-FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~ 85 (349) -+.+.+++..++++||+||+++||++|+|+.. |.|=+.. .-.+||+.+++++.+ + T Consensus 33 gl~~al~l~~il~~HElGH~l~a~~~gv~~~~p~fiP~~~----------------lg~fGav~~~~~~~~--------~ 88 (183) T cd06160 33 GLPFALALLAILGIHEMGHYLAARRHGVKASLPYFIPFPF----------------IGTFGAFIRMRSPIP--------N 88 (183) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCEEEECCCC----------------HHHHHEEEECCCCCC--------C T ss_conf 9999999999999999999999999399777603566620----------------231010234567898--------9 Q ss_pred CCCCHHHHHHHHHHHHHHHHHHC Q ss_conf 42101342112333211221110 Q gi|254780773|r 86 PWKKILTVLAGPLANCVMAILFF 108 (349) Q Consensus 86 ~~~R~~i~~AGp~~N~ilA~iif 108 (349) +++.+.+..|||+++++++..++ T Consensus 89 ~~~~~~ia~aGPl~g~~~~~~~~ 111 (183) T cd06160 89 RKALFDIALAGPLAGLLLALPVL 111 (183) T ss_pred HHHHHHHHHHHHHHHHHHHHHHH T ss_conf 89988998750888999999999 No 14 >cd06158 S2P-M50_like_1 Uncharacterized homologs of Site-2 protease (S2P), zinc metalloproteases (MEROPS family M50) which cleave transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of the S2P/M50 family of RIP proteases use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. In eukaryotic cells they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum stress responses. 
In prokaryotes they regulate such processes as sporulation, cell division, stress response, and cell differentiation. This group includes bacterial, eukaryotic, and Archaeal S2P/M50s homologs with a minimal core protein and no PDZ domains. Probab=99.56 E-value=1.1e-14 Score=121.07 Aligned_cols=94 Identities=26% Similarity=0.408 Sum_probs=55.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEEE--EECCCHHEEEEECCCEEEEEEEEE-----ECCEEECCCCCCCHHH Q ss_conf 999999999999973377999998497500455--306831127860598079999771-----1211100124455066 Q gi|254780773|r 8 LLYTVSLIIIVVIHEFGHYMVARLCNIRVLSFS--VGFGPELIGITSRSGVRWKVSLIP-----LGGYVSFSEDEKDMRS 80 (349) Q Consensus 8 ~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~Fs--iGfgp~l~~~~~k~~t~y~i~~~P-----lGgyV~~~~~e~~~~~ 80 (349) +..++++.+.+++||++|.++|+++|-+-.+.. +-..|. ...|--|| -.+| .+|+.|- .+-+++. T Consensus 2 l~~~~~~~~si~~HE~aHa~~A~~~GD~t~~~~GrltLnPl--~hid~~G~----i~l~~~~~~~~Gwakp--v~~~~~~ 73 (181) T cd06158 2 LIVIIAVLLAITLHEFAHAYVAYRLGDPTARRAGRLTLNPL--AHIDPIGT----IILPLLLPFLFGWAKP--VPVNPRN 73 (181) T ss_pred EEHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHCCCEECCCH--HHCCCHHH----HHHHHHHHHHCCCCCC--CCCCCCC T ss_conf 22299999999999999999999849954887695435826--75050278----9999998763235677--4654001 Q ss_pred HCCCCCCCCHHHHHHHHHHHHHHHHHHCCC Q ss_conf 503684210134211233321122111011 Q gi|254780773|r 81 FFCAAPWKKILTVLAGPLANCVMAILFFTF 110 (349) Q Consensus 81 f~~~~~~~R~~i~~AGp~~N~ilA~iif~~ 110 (349) + +.++|++.+|.+|||++|+++|++.... T Consensus 74 ~-~~~r~~~~~valAGPl~Nl~la~~~~~~ 102 (181) T cd06158 74 F-KNPRRGMLLVSLAGPLSNLLLALLFALL 102 (181) T ss_pred C-CCCCCEEEEHHHHHHHHHHHHHHHHHHH T ss_conf 2-5776635303363589999999999999 No 15 >cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloprotases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. 
May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. Probab=99.45 E-value=1.1e-13 Score=114.16 Aligned_cols=75 Identities=33% Similarity=0.606 Sum_probs=66.7 Q ss_pred CCCCCCCCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCC Q ss_conf 1001233332000245675565300145662888878304540011111014678863169996587313101121 Q gi|254780773|r 111 FFYNTGVMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 111 ~~~~~g~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) +....+...|+|++|.++|||++|||++||+|++|||+++.+|+|+...++.+++++++++++|+++ ..+++++| T Consensus 5 ~~~~~p~~~~vV~~V~~~spA~~AGl~~GD~I~~ing~~v~~~~~~~~~i~~~~~~~i~l~v~R~g~-~~~~~vtP 79 (79) T cd00989 5 FVPGGPPIEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERNGE-TITLTLTP 79 (79) T ss_pred EEECCCCCCCEEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHHCCCCEEEEEEEECCE-EEEEEEEC T ss_conf 9948999999999989999899859999999999999995899999999985899889999999999-98999879 No 16 >TIGR02037 degP_htrA_DO protease Do; InterPro: IPR011782 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . 
Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown.
The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family consists serine peptidases belonging to MEROPS peptidase family S1, subfamily S1C (protease Do, clan PA(S)). They are variously designated DegP, DegQ, heat shock protein HtrA, MucD and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in Escherichia coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens . The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures .; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis. Probab=99.35 E-value=8.1e-13 Score=108.14 Aligned_cols=66 Identities=29% Similarity=0.517 Sum_probs=59.7 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCCCC Q ss_conf 000245675565300145662888878304540011111014-6788631699965873131011211 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVMPR 187 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~p~ 187 (349) .|.+|.++|||++||||+||+|+++||++|+++.||+..|.. .||+++++++.|+|++ .+++++-. 
T Consensus 296 LV~~V~~gSPA~kAGlk~GDvI~~~nGk~i~~~~~L~~~i~~~~pG~~~~L~i~R~Gk~-~~~~V~l~ 362 (484) T TIGR02037 296 LVAQVLPGSPAEKAGLKAGDVILSVNGKKIKSFADLRRAIGTLKPGKKVTLTILRKGKE-KTITVTLG 362 (484) T ss_pred EEEEECCCCCHHCCCCCCCCEEEEECCEEECCHHHHHHHHHCCCCCCEEEEEEEECCEE-EEEEEEEE T ss_conf 88854489701006753266899858864058799989874058987799999978868-88999981 No 17 >cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloprotases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. Probab=99.23 E-value=1e-11 Score=100.51 Aligned_cols=61 Identities=23% Similarity=0.370 Sum_probs=56.3 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEE Q ss_conf 2000245675565300145662888878304540011111014-678863169996587313 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVL 180 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~ 180 (349) -+|.+|.++|||++|||++||+|++|||++|++.+|+.+++.. +||+++++++.|++++.. 
T Consensus 12 v~V~~V~~gsPA~~AGL~~GDVI~~Ing~~I~~~~d~~~~l~~~~pG~~v~v~v~R~g~~lT 73 (79) T cd00991 12 VVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLT 73 (79) T ss_pred EEEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEEEEECCEEEE T ss_conf 79999678996998699988899998999987999999999618999989999998999977 No 18 >PRK10139 serine endoprotease; Provisional Probab=99.14 E-value=8.9e-11 Score=94.06 Aligned_cols=65 Identities=37% Similarity=0.501 Sum_probs=58.1 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCCC Q ss_conf 000245675565300145662888878304540011111014-678863169996587313101121 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) .|.+|.++|||++||||+||+|+++||++|++.+|+...+.. .||+++++++.|+++ ..+++++. T Consensus 293 lV~~V~~~sPA~kAGLk~GDVI~~vnG~~V~~~~dL~~~v~~~~pG~~v~l~v~R~Gk-~~~~~vtl 358 (455) T PRK10139 293 FVSEVLPNSGSAKAGVKSGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGLLRNGK-PLEVEVTL 358 (455) T ss_pred EEEEECCCCCHHHCCCCCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEEEEECCE-EEEEEEEE T ss_conf 5665447883687699999999998998968999999999608988889999999997-99999995 No 19 >PRK10942 serine endoprotease; Provisional Probab=99.12 E-value=8.2e-11 Score=94.30 Aligned_cols=65 Identities=34% Similarity=0.445 Sum_probs=58.3 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCCC Q ss_conf 000245675565300145662888878304540011111014-678863169996587313101121 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) .|.+|.++|||++|||++||+|+++||++|++.+|+...+.. .||+++++++.|+++ ..+++++. 
T Consensus 315 lV~~V~~~sPA~kAGL~~GDVI~~vdG~~I~~~~dL~~~v~~~~pG~~V~l~v~R~Gk-~~~v~Vtl 380 (474) T PRK10942 315 FVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKMTLGLLRDGK-PVNVNLEL 380 (474) T ss_pred EEEECCCCCCHHHCCCCCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEEEEECCE-EEEEEEEE T ss_conf 6520177993677699989999998998968999999999618988889999999998-99999996 No 20 >PRK10898 serine endoprotease; Provisional Probab=99.10 E-value=1.2e-10 Score=93.26 Aligned_cols=65 Identities=37% Similarity=0.449 Sum_probs=58.0 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCCC Q ss_conf 000245675565300145662888878304540011111014-678863169996587313101121 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) .|.+|.|+|||++||||+||+|+++||+++.+.+|+.+.+.. +||+++++++.|+++ ..+++++. T Consensus 283 ~V~~V~~~sPA~~AGL~~GDvI~~idg~~v~~~~~l~~~l~~~~pGd~v~l~v~R~G~-~~~~~VtL 348 (355) T PRK10898 283 VVNEVSPDGPAANAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDK-QLTLQVTI 348 (355) T ss_pred EEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHHCCCCCEEEEEEEECCE-EEEEEEEE T ss_conf 8988799995898599989999998998938999999999718997989999999999-99999997 No 21 >cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. 
Probab=99.06 E-value=2.3e-10 Score=91.25 Aligned_cols=68 Identities=32% Similarity=0.485 Sum_probs=59.5 Q ss_pred CCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCC Q ss_conf 3200024567556530014566288887830454--0011111014678863169996587313101121 Q gi|254780773|r 119 KPVVSNVSPASPAAIAGVKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 119 ~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) ..+|.+|.++|||++|||++||+|++|||+++.+ ++++.+.++..+|+++++++.|.+.+..+++++. T Consensus 14 ~~~V~~v~~gsPA~~aGl~~GD~I~~Vng~~v~~~~~~~~~~~lrg~~Gt~V~l~v~R~~~~~~~~~l~R 83 (85) T cd00988 14 GLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLTR 83 (85) T ss_pred EEEEEEECCCCHHHHHCCCCCCEEEEECCEECCCCCHHHHHHHHCCCCCCEEEEEEECCCCCEEEEEEEE T ss_conf 8999996899958980899999999999999789999999998659999889999990999899999998 No 22 >cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. Probab=99.00 E-value=4.3e-10 Score=89.32 Aligned_cols=64 Identities=30% Similarity=0.401 Sum_probs=56.2 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCC Q ss_conf 2000245675565300145662888878304540011111014-67886316999658731310112 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVM 185 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~ 185 (349) -+|.+|.++|||+.+ |++||+|+++||+++++.+|+..+++. 
+||+++++++.|++++ .+++++ T Consensus 10 v~V~~V~~gsPA~~~-Lk~GDvI~~vdGk~v~~~~~l~~~i~~~~~Gd~V~l~v~R~gk~-~~~~vt 74 (79) T cd00986 10 VYVTSVVEGMPAAGK-LKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKE-LPEDLI 74 (79) T ss_pred EEEEEECCCCCHHHC-CCCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEEEEECCEE-EEEEEE T ss_conf 899996799973770-77899999999989579999999996599999899999999999-999999 No 23 >cd00987 PDZ_serine_protease PDZ domain of tryspin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. Probab=98.99 E-value=5.4e-10 Score=88.68 Aligned_cols=63 Identities=32% Similarity=0.526 Sum_probs=56.1 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCC-CCCCCEEEEEECCCCEEEEC Q ss_conf 20002456755653001456628888783045400111110146-78863169996587313101 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVREN-PLHEISLVLYREHVGVLHLK 183 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~-~g~~v~i~v~R~~~~~~~~~ 183 (349) -.|.+|.++|||++|||++||+|+++||++|.+.+|+.+.++.. +++++.+++.|+|+. .+++ T Consensus 26 v~V~~V~~~spA~~aGl~~GDiI~~ing~~i~~~~~~~~~l~~~~~g~~v~~~v~R~g~~-~~~~ 89 (90) T cd00987 26 VLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDKVTLTVLRGGKE-LTVT 89 (90) T ss_pred EEEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHHCCCCCEEEEEEEECCEE-EEEE T ss_conf 999998999959982999998999999999389999999998269998799999999999-9978 No 24 >cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. 
May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins. Probab=98.98 E-value=5.3e-10 Score=88.72 Aligned_cols=63 Identities=22% Similarity=0.391 Sum_probs=54.8 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCC Q ss_conf 200024567556530014566288887830454001111101467886316999658731310112 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVM 185 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~ 185 (349) ..|..|.++|||++|||++||+|+++||.++++|++... ..++|+++++++.|+++ ..+++++ T Consensus 14 ~~V~~V~~~sPA~~AGl~~GD~IvaidG~~v~~~~~~~~--~~~~G~~v~l~v~R~g~-l~~~~vt 76 (80) T cd00990 14 GKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLK--EYQAGDPVELTVFRDDR-LIEVPLT 76 (80) T ss_pred EEEEEECCCCHHHHCCCCCCCEEEEECCEEEHHHHHHHH--HCCCCCEEEEEEEECCE-EEEEEEE T ss_conf 999998889969985999899999999999237899997--36998989999999999-9998999 No 25 >TIGR02860 spore_IV_B stage IV sporulation protein B; InterPro: IPR014219 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases . 
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule.
SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else. The members of this entry belong to MEROPS peptidase family S55 (SpoIVB peptidase, clan PA).. Probab=98.96 E-value=8.4e-10 Score=87.35 Aligned_cols=78 Identities=23% Similarity=0.443 Sum_probs=63.8 Q ss_pred CCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCC--CCCCEEEEEECCCCEEEECCCCCCCCCCCC--CEEEEEEEE Q ss_conf 7556530014566288887830454001111101467--886316999658731310112110056543--100001220 Q gi|254780773|r 128 ASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENP--LHEISLVLYREHVGVLHLKVMPRLQDTVDR--FGIKRQVPS 203 (349) Q Consensus 128 ~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~--g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~--~g~~~~~~~ 203 (349) .|||++||||.||.|++|||++|++-+|+..++++.. +++++++++|+++ ..+.++.|...+..++ +|....... T Consensus 141 ~sPg~~AGi~~GD~I~~iNg~~i~~~~d~~~~i~~~g~~g~~l~l~i~R~~~-~i~~~~~p~~~~~e~~YrIGLyiRDsa 219 (423) T TIGR02860 141 ESPGEEAGIQIGDIILKINGEKIKNMEDIAKLINKAGKTGEKLKLTIKRGGK-IIETKIKPVKDKEEGRYRIGLYIRDSA 219 (423) T ss_pred ECCHHHCCEEEEEEEEEECCCHHCCHHHHHHHHHHHHHCCCEEEEEEEECCC-EEEEEEEEEEECCCCCEEEEEEEEECC T ss_conf 4636547845610899988811035345688887543059548999985890-899866133337887538998994166 Q ss_pred CCC Q ss_conf 344 Q gi|254780773|r 204 VGI 206 (349) Q Consensus 204 igi 206 (349) .|+ T Consensus 220 AGi 222 (423) T TIGR02860 220 AGI 222 (423) T ss_pred CCC T ss_conf 433 No 26 >TIGR02038 protease_degS periplasmic serine peptidase DegS; InterPro: IPR011783 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins.
Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family consists of the periplasmic serine protease DegS (HhoB). Members belong to MEROPS peptidase family S1, subfamily S1C (protease Do, clan PA(S)). They are shorter paralogs of protease Do (HtrA, DegP) and DegQ (HhoA). They are found in Escherichia coli and several of the gammaproteobacteria. DegS contains a trypsin domain and a single copy of a PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress by detecting misfolded proteins in the periplasm. DegS then cleaves the periplasmic domain of RseA, a transmembrane protein and inhibitor of sigmaE, activating the sigmaE-driven expression of periplasmic proteases/chaperones.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis.
Probab=98.95 E-value=8.9e-10 Score=87.17 Aligned_cols=64 Identities=36% Similarity=0.488 Sum_probs=57.8 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCC Q ss_conf 000245675565300145662888878304540011111014-67886316999658731310112 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVM 185 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~ 185 (349) +|.+|.|++||++||+++.|.|+++||+++.+.+++.+.+.+ +||+.|.+|+.|+|++ +++.|+ T Consensus 288 vv~~vdPnGPAA~Ag~l~~Dvilk~dg~~~~g~~~~md~vA~~~PG~~v~~tvlR~Gk~-l~LpV~ 352 (358) T TIGR02038 288 VVTGVDPNGPAARAGILVRDVILKVDGKEVIGAEELMDRVAETRPGSKVLVTVLRKGKQ-LELPVT 352 (358) T ss_pred EEECCCCCCHHHHHCCCCCCEEEEECCCCCCCHHHHHHHHHCCCCCCEEEEEEECCCCE-EEEEEE T ss_conf 88534898767650677155789867953675655455543179997789999706967-870078 No 27 >TIGR02037 degP_htrA_DO protease Do; InterPro: IPR011782 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases . Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base . The geometric orientations of the catalytic residues are similar between families, despite different protein folds . 
The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family consists of serine peptidases belonging to MEROPS peptidase family S1, subfamily S1C (protease Do, clan PA(S)). They are variously designated DegP, DegQ, heat shock protein HtrA, MucD and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in Escherichia coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot.
Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis. Probab=98.71 E-value=1.7e-08 Score=78.35 Aligned_cols=60 Identities=30% Similarity=0.466 Sum_probs=52.9 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCC--CCCCC-CCEEEEEECCCCEE Q ss_conf 00024567556530014566288887830454001111101--46788-63169996587313 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVR--ENPLH-EISLVLYREHVGVL 180 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~--~~~g~-~v~i~v~R~~~~~~ 180 (349) +|..|.++|||+++|||+||+|++||+++|+|..|+.+++. ..++. ++.+.|+|++...+ T Consensus 419 ~V~~v~~~s~Aa~~Gl~~GDvI~~vN~~~V~s~~e~~~~l~~~~k~~~k~~~L~i~Rg~~~~~ 481 (484) T TIGR02037 419 VVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELNKVLARAKKGGRKKVALLIERGGATIF 481 (484) T ss_pred EEEEECCCCHHHHCCCCCCCEEEEECCCCCCCHHHHHHHHHHHCCCCCEEEEEEEEECCEEEE T ss_conf 999733888899717876618995088014678999999997328870479999998780689 No 28 >cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes.
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos Probab=98.63 E-value=3.2e-08 Score=76.42 Aligned_cols=54 Identities=33% Similarity=0.584 Sum_probs=49.2 Q ss_pred CCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC--HHHHHHCCCCCCCCCEEEE Q ss_conf 32000245675565300145662888878304540--0111110146788631699 Q gi|254780773|r 119 KPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAF--EEVAPYVRENPLHEISLVL 172 (349) Q Consensus 119 ~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~--~dl~~~i~~~~g~~v~i~v 172 (349) ..+|.+|.++|||++|||++||+|++|||+++.++ +++.+.++..+++++++++ T Consensus 14 ~i~V~~v~~~spA~~aGL~~GD~I~~ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70) T cd00136 14 GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70) T ss_pred CEEEEECCCCCHHHHCCCCCCCEEEEECCEECCCCCHHHHHHHHCCCCCCEEEEEE T ss_conf 89999809989799879998999999999996899899999996289879799998 No 29 >COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane] Probab=98.61 E-value=2.7e-07 Score=70.07 Aligned_cols=69 Identities=29% Similarity=0.417 Sum_probs=57.2 Q ss_pred CCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC--HHHHHHCCCCCCCCCEEEEEECC-CCEEEECCCC Q ss_conf 332000245675565300145662888878304540--01111101467886316999658-7313101121 Q gi|254780773|r 118 MKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAF--EEVAPYVRENPLHEISLVLYREH-VGVLHLKVMP 186 (349) Q Consensus 118 ~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~--~dl~~~i~~~~g~~v~i~v~R~~-~~~~~~~v~p 186 (349) -...|.++.+++||++||+++||+|++|||+++..- ++..+.++..+|..+++++.|.+ .+..+++++. 
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l~R 183 (406) T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTLTR 183 (406) T ss_pred CCCEEECCCCCCCHHHHCCCCCCEEEEECCEECCCCCHHHHHHHHCCCCCCEEEEEEEECCCCCEEEEEEEE T ss_conf 982695068899267608998888999899976677777899972689997568999966899536899998 No 30 >PRK10139 serine endoprotease; Provisional Probab=98.57 E-value=7.6e-08 Score=73.83 Aligned_cols=59 Identities=29% Similarity=0.421 Sum_probs=52.8 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCE Q ss_conf 200024567556530014566288887830454001111101467886316999658731 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGV 179 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~ 179 (349) -+|.+|.++|||+++||++||+|++||+++|+|.+|+.+++++.+ +.+.+.+.|++... T Consensus 392 VvV~~V~~gS~Aa~aGLr~GDVI~~VN~~~V~sv~d~~~~l~~~~-~~v~L~V~Rgg~~~ 450 (455) T PRK10139 392 IKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVRGNESI 450 (455) T ss_pred EEEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHCCC-CEEEEEEEECCEEE T ss_conf 799984789989986999999999779987399999999985589-72899999899689 No 31 >PRK10942 serine endoprotease; Provisional Probab=98.56 E-value=8.8e-08 Score=73.38 Aligned_cols=59 Identities=34% Similarity=0.532 Sum_probs=52.6 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCE Q ss_conf 200024567556530014566288887830454001111101467886316999658731 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGV 179 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~ 179 (349) -+|.+|.++|||+++||++||+|++||+++|+|.+|+.+++++.+ +.+.+.+.|++... 
T Consensus 411 VvV~~V~~~S~Aa~aGLr~GDVI~~VN~~~V~s~~dl~~~l~~~~-~~v~L~V~Rgg~~~ 469 (474) T PRK10942 411 VVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLALNIQRGDSSI 469 (474) T ss_pred EEEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHHCC-CEEEEEEEECCEEE T ss_conf 699994799979985999998899779988499999999996089-83899999899579 No 32 >COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms] Probab=98.56 E-value=2.3e-07 Score=70.52 Aligned_cols=153 Identities=19% Similarity=0.244 Sum_probs=87.1 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCCCCCCCCCCCCEEEE Q ss_conf 000245675565300145662888878304540011111014-6788631699965873131011211005654310000 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVMPRLQDTVDRFGIKR 199 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~ 199 (349) ++.+|.++|||..- |+.||.|+++||+++.+.+|+.+++++ .+|++++++++|.+......+.+-...+..++-|+.. T Consensus 133 yv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~~~~~~g~~giGI 211 (342) T COG3480 133 YVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITLIKNDDNGKAGIGI 211 (342) T ss_pred EEEECCCCCCHHCE-ECCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEEEEECCCCCCEEEEEEEEECCCCCCEEEE T ss_conf 99971478631022-32687688558944578899999985468897699999951698726899999604688641215 Q ss_pred EEEECC-----CCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHEECCCCC--CCCCCCCCCCCCCCH-HHHHHHHHHHC Q ss_conf 122034-----4443212333456688876898886554223100000002--554200224667301-22333456530 Q gi|254780773|r 200 QVPSVG-----ISFSYDETKLHSRTVLQSFSRGLDEISSITRGFLGVLSSA--FGKDTRLNQISGPVG-IARIAKNFFDH 271 (349) Q Consensus 200 ~~~~ig-----i~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~l~~l--~~g~~~~~~lsGPVg-Ia~~~~~~a~~ 271 (349) ....-. .....+.......+..-.++.++. .+++..-+.. +.. -||....+.--|||| |.|-...+++. 
T Consensus 212 sl~d~~~v~~~~~V~~~~~~IGGPSAGLMFSL~Iy--~qlt~~DL~~-g~~IAGTGTI~~DG~VG~IGGI~qKvvAA~~A 288 (342) T COG3480 212 SLVDAPEVWAPPDVDFNTENIGGPSAGLMFSLAIY--DQLTKGDLTG-GRFIAGTGTIEVDGKVGPIGGIDQKVVAAAKA 288 (342) T ss_pred EEECCCCCCCCCCEEEECCCCCCCCHHHEEEHHHH--HHCCCCCCCC-CEEEECCEEECCCCCCCCCCCHHHHHHHHHHC T ss_conf 86347654568726751244799754333529888--6405311358-66984111334688335745476776778765 Q ss_pred CHHHHH Q ss_conf 503478 Q gi|254780773|r 272 GFNAYI 277 (349) Q Consensus 272 G~~~~l 277 (349) |-..|| T Consensus 289 GA~vFf 294 (342) T COG3480 289 GADVFF 294 (342) T ss_pred CCCEEE T ss_conf 985998 No 33 >TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis. Probab=98.43 E-value=3.2e-07 Score=69.53 Aligned_cols=62 Identities=19% Similarity=0.313 Sum_probs=44.3 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCC Q ss_conf 00024567556530014566288887830454001111101467886316999658731310112 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVM 185 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~ 185 (349) +|.+|.|+|+|+++|+++||++++|||+++.+.-|.+-++. .+.+++.+.+.+.+..++.+.
T Consensus 1 ~I~~V~pgSiA~e~Gie~GD~llsING~~i~DiiDy~f~~~---de~~~L~v~~~~Ge~~~ieie 62 (433) T TIGR03279 1 LISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDANGESHQIEIE 62 (433) T ss_pred CEEEECCCCHHHHHCCCCCCEEEEECCCCCCCCEEEEECCC---CCEEEEEEECCCCCEEEEEEE T ss_conf 94157799978983899998899889945555143411215---855999999589979999984 No 34 >COG1994 SpoIVFB Zn-dependent proteases [General function prediction only] Probab=98.42 E-value=1.2e-06 Score=65.57 Aligned_cols=44 Identities=39% Similarity=0.686 Sum_probs=37.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCC Q ss_conf 03478999999999999963064722436999999988827869 Q gi|254780773|r 273 FNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIRGKSL 316 (349) Q Consensus 273 ~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~gr~i 316 (349) +..++...+.+|..|++|||+|+|||||||++....++-....+ T Consensus 136 ~~~~~~~la~~Nl~L~lFNLiPi~PLDGg~vlr~~~~~~~~~~~ 179 (230) T COG1994 136 LFAFLAALALVNLVLALFNLLPIPPLDGGRVLRALLPRRYGAAI 179 (230) T ss_pred HHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHHHH T ss_conf 99999999999999999973778876589999986389999999 No 35 >PRK11186 carboxy-terminal protease; Provisional Probab=98.37 E-value=2.4e-06 Score=63.52 Aligned_cols=66 Identities=18% Similarity=0.351 Sum_probs=53.6 Q ss_pred CCCCCCCCCCHHHHHH-CCCCCEEEEE--CCCCCCC-----CHHHHHHCCCCCCCCCEEEEEECCCC--EEEECCC Q ss_conf 2000245675565300-1456628888--7830454-----00111110146788631699965873--1310112 Q gi|254780773|r 120 PVVSNVSPASPAAIAG-VKKGDCIISL--DGITVSA-----FEEVAPYVRENPLHEISLVLYREHVG--VLHLKVM 185 (349) Q Consensus 120 p~I~~V~~~spA~~AG-L~~GD~Il~I--nG~~V~s-----~~dl~~~i~~~~g~~v~i~v~R~~~~--~~~~~v~ 185 (349) .+|.++.||+||+++| |++||+|++| +|+++.+ .+|+++.|+..+|.+|++++.|.+.+ ...++++ T Consensus 259 ~~Iv~~i~GgPA~k~g~L~~gD~Ii~V~q~~~~~~dviG~~lddvV~lIRG~kGT~V~L~I~r~~~~~~~~~v~i~ 334 (673) T PRK11186 259 TVIKSLVAGGPAAKSKKLSVGDKIVGVGQDGKEIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLT 334 (673) T ss_pred EEEEEECCCCHHHHHCCCCCCCEEEEECCCCCCCCCCCCCCHHHHHHHHCCCCCCEEEEEEEECCCCCCEEEEEEE T ss_conf 
9999706899588738999899999825789874202376599999985389988799999978888861699999 No 36 >smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities. Probab=98.37 E-value=4.2e-07 Score=68.71 Aligned_cols=57 Identities=35% Similarity=0.540 Sum_probs=47.8 Q ss_pred CCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC--HHHHHHCCCCCCCCCEEEEEECC Q ss_conf 32000245675565300145662888878304540--01111101467886316999658 Q gi|254780773|r 119 KPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAF--EEVAPYVRENPLHEISLVLYREH 176 (349) Q Consensus 119 ~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~--~dl~~~i~~~~g~~v~i~v~R~~ 176 (349) ..+|.+|.++|||+++||++||+|++|||+++.+. ++....++.. +.++++++.|++ T Consensus 27 gv~I~~v~~~s~A~~~Gl~~GD~I~~vng~~v~~~~~~~~~~~~~~~-~~~v~l~v~r~~ 85 (85) T smart00228 27 GVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKA-GGKVTLTVLRGG 85 (85) T ss_pred CEEEEEECCCCHHHHCCCCCCCEEEEECCEECCCCCHHHHHHHHHCC-CCEEEEEEEECC T ss_conf 89999987999478768989999999999998999899999998779-997999999496 No 37 >PRK10779 zinc metallopeptidase; Provisional Probab=98.24 E-value=1.6e-06 Score=64.62 Aligned_cols=59 Identities=22% Similarity=0.303 Sum_probs=47.9 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCC-CCCCCEEEEEECCCC Q ss_conf 20002456755653001456628888783045400111110146-788631699965873 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVREN-PLHEISLVLYREHVG 178 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~-~g~~v~i~v~R~~~~ 178 (349) |++++|.++|||++||+++||+|+++||+++.+|+|....+..+ ..++.++.+.+.+.+ T Consensus 128 ~~i~~v~~~s~a~~agl~~GD~i~~idg~~~~~~~~~~~~l~~~~g~~~~~i~v~~~~~~ 187 (449) T PRK10779 128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLQLVSKIGDEQTTVTVAPFGSD 187 (449) T ss_pred 
CEECCCCCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHHHCCCCCEEEEEECCCCC T ss_conf 300431468888873888887899989998576898899998850577606999407864 No 38 >COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones] Probab=98.22 E-value=2e-06 Score=64.06 Aligned_cols=66 Identities=36% Similarity=0.497 Sum_probs=57.0 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEECCCC Q ss_conf 2000245675565300145662888878304540011111014-678863169996587313101121 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) ..+.+|.++|||+++|+++||.|+++||+++.+..|+.+.+.. ++++++.+++.|+++ ..+.+++. T Consensus 272 ~~V~~v~~~spa~~agi~~Gdii~~~ng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~-~~~~~v~l 338 (347) T COG0265 272 AVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGK-ERELAVTL 338 (347) T ss_pred CEEEEECCCCHHHHCCCCCCCEEEEECCEECCCHHHHHHHHHCCCCCCEEEEEEEECCE-EEEEEEEC T ss_conf 68865179985787378778779978998855788888887326999768899997883-57776861 No 39 >KOG3129 consensus Probab=98.22 E-value=2.5e-06 Score=63.32 Aligned_cols=78 Identities=28% Similarity=0.388 Sum_probs=63.8 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHH---HHHCCCCCCCCCEEEEEECCCCEEEECCCCCCCCCCCCCE Q ss_conf 2000245675565300145662888878304540011---1110146788631699965873131011211005654310 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEV---APYVRENPLHEISLVLYREHVGVLHLKVMPRLQDTVDRFG 196 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl---~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g 196 (349) .+|++|.|+|||++|||+.||.|+++.+..--++..+ ...++.+.++.+.+++.|++. 
...+.++|+.....+..| T Consensus 141 a~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~-~v~L~ltP~~W~GrGLLG 219 (231) T KOG3129 141 AVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQ-KVVLSLTPKKWQGRGLLG 219 (231) T ss_pred EEEEECCCCCHHHHHCCCCCCEEEEECCCCCCCCHHHHHHHHHHHHCCCCCEEEEEECCCC-EEEEEECCCCCCCCCCEE T ss_conf 8875227898345407543765788533246552258898999874437623579961797-788996764355886010 Q ss_pred EE Q ss_conf 00 Q gi|254780773|r 197 IK 198 (349) Q Consensus 197 ~~ 198 (349) .. T Consensus 220 C~ 221 (231) T KOG3129 220 CN 221 (231) T ss_pred EE T ss_conf 01 No 40 >pfam00595 PDZ PDZ domain (Also known as DHR or GLGF). PDZ domains are found in diverse signaling proteins. Probab=98.16 E-value=1.5e-06 Score=64.82 Aligned_cols=52 Identities=27% Similarity=0.458 Sum_probs=43.8 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC--HHHHHHCCCCCCCCCEEEE Q ss_conf 2000245675565300145662888878304540--0111110146788631699 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAF--EEVAPYVRENPLHEISLVL 172 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~--~dl~~~i~~~~g~~v~i~v 172 (349) -+|.+|.++|||+.+||++||+|++|||+++.++ ++..+.++.. ++++++++ T Consensus 26 ~~V~~V~~~~~A~~~gL~~GD~Il~VNg~~v~~~~~~~~~~~l~~~-~~~v~L~V 79 (80) T pfam00595 26 IFVSEVLPGGAAEAGGLQVGDRILSINGQDLENMSHDEAVLALKGS-GGEVTLTI 79 (80) T ss_pred EEEEEECCCCCHHHCCCCCCCEEEEECCEECCCCCHHHHHHHHHCC-CCEEEEEE T ss_conf 8999977898055487999999999999998999899999999749-99299998 No 41 >cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases. 
Probab=98.12 E-value=2.4e-06 Score=63.55 Aligned_cols=54 Identities=30% Similarity=0.528 Sum_probs=46.1 Q ss_pred CCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCC--CCHHHHHHCCCCCCCCCEEEE Q ss_conf 3320002456755653001456628888783045--400111110146788631699 Q gi|254780773|r 118 MKPVVSNVSPASPAAIAGVKKGDCIISLDGITVS--AFEEVAPYVRENPLHEISLVL 172 (349) Q Consensus 118 ~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~--s~~dl~~~i~~~~g~~v~i~v 172 (349) ...+|.+|.|+|||+.+||++||+|++|||+++. +.+++.+.+++.+. .+++++ T Consensus 26 ~~~~I~~v~~~s~A~~~~L~~GD~Il~INg~~v~~~~~~~v~~~l~~~~~-~v~L~V 81 (82) T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82) T ss_pred CCEEEEEECCCCCHHHCCCCCCCEEEEECCEECCCCCHHHHHHHHHCCCC-EEEEEE T ss_conf 99999998689903434899999989899999999989999999984999-599998 No 42 >TIGR00054 TIGR00054 membrane-associated zinc metalloprotease, putative; InterPro: IPR004387 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry.
Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family contains putative zinc metallopeptidases belonging to MEROPS peptidase family M50 (S2P protease family, clan MM). The N-terminal region contains a perfectly conserved motif HEXGH, where the Glu is the active site and the His residues coordinate the metal cation. The family of bacterial and plant proteins also includes a region that hits the PDZ domain (IPR001478 from INTERPRO), found in a number of proteins targeted to the membrane by binding to a peptide ligand. The family includes EcfE, which is a homolog of human site-2 protease (S2P), a membrane-bound zinc metalloprotease involved in regulated intramembrane proteolysis. In Escherichia coli, EcfE activates the sigma(E) pathway of stress response through a site-2 cleavage of anti-sigma(E), RseA.; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis, 0016021 integral to membrane.
Probab=97.82 E-value=1.6e-05 Score=57.75 Aligned_cols=83 Identities=24% Similarity=0.406 Sum_probs=57.0 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEE---------CC-CCEEEECCCCCCC Q ss_conf 2000245675565300145662888878304540011111014678863169996---------58-7313101121100 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYR---------EH-VGVLHLKVMPRLQ 189 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R---------~~-~~~~~~~v~p~~~ 189 (349) |+|+++.++|.|.+|++++||+|+++||+++.+|+|+...+.+....++.+++.+ +. -+..+.++.++.. T Consensus 137 ~vi~~~~~~S~a~~a~~~~Gd~il~~~~~~~~~f~~~~~~~~~~~~g~~~~~I~~~PF~S~~e~~~~L~L~~~~~~~~~~ 216 (463) T TIGR00054 137 PVIEELDKNSIALEAGIEPGDEILSVNGKKIPGFKDVRKQIADIVAGEPMVEILAAPFNSDIEREVKLDLRNWTFEVEKE 216 (463) T ss_pred CCCCCCCHHHHHHHHCCCCCCEEEEECCCCCCCCHHHHHHHHHHHCCCCCCEEEECCCCCCCHHHCCCCCEEEEEECCCC T ss_conf 64565544579987116898478740776678808899999997517854157765777541120000331238621125 Q ss_pred CCCCCCEEEEEEE Q ss_conf 5654310000122 Q gi|254780773|r 190 DTVDRFGIKRQVP 202 (349) Q Consensus 190 ~~~~~~g~~~~~~ 202 (349) +.....|+....| T Consensus 217 ~~~~~lgl~~~~P 229 (463) T TIGR00054 217 DAVEQLGLKPRGP 229 (463) T ss_pred CHHHHCCCCCCCC T ss_conf 5255414424787 No 43 >COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only] Probab=97.62 E-value=8.2e-05 Score=52.91 Aligned_cols=59 Identities=27% Similarity=0.397 Sum_probs=44.8 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCCC Q ss_conf 0002456755653001456628888783045400111110146788631699965873131011211 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMPR 187 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p~ 187 (349) .|..|.++|||++|||.+||+|++|||.+ . +.-+...++.+++++.|++. ..+..+++. 
T Consensus 465 ~i~~V~~~gPA~~AGl~~Gd~ivai~G~s-~------~l~~~~~~d~i~v~~~~~~~-L~e~~v~~~ 523 (558) T COG3975 465 KITFVFPGGPAYKAGLSPGDKIVAINGIS-D------QLDRYKVNDKIQVHVFREGR-LREFLVKLG 523 (558) T ss_pred EEEECCCCCHHHHCCCCCCCEEEEECCCC-C------CCCCCCCCCCEEEEECCCCC-EEEEECCCC T ss_conf 99844789816751588756799976735-5------52214426624899825782-388521368 No 44 >KOG1320 consensus Probab=97.60 E-value=7.7e-05 Score=53.09 Aligned_cols=95 Identities=22% Similarity=0.321 Sum_probs=65.3 Q ss_pred HHHHHHHHHHHHHHCCCCCCCC----C-CCCCCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCC-CCCC Q ss_conf 1123332112211101110012----3-33320002456755653001456628888783045400111110146-7886 Q gi|254780773|r 94 LAGPLANCVMAILFFTFFFYNT----G-VMKPVVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVREN-PLHE 167 (349) Q Consensus 94 ~AGp~~N~ilA~iif~~~~~~~----g-~~~p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~-~g~~ 167 (349) .-|.-|+.+++-+.|....-.. + ..--++.+|.|++||..+|+++||+|++|||++|.+..|+.+.++.. +.+. T Consensus 369 ~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~~~ 448 (473) T KOG1320 369 YIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTEDK 448 (473) T ss_pred CCCCEEEEEECCEEEEECCCCCCCCCCCEEEEEEEEECCCCCCCCCCCCCCCEEEEECCEEEECHHHHHHHHHHCCCCCE T ss_conf 67840499965378852577866655633589998864699761002367878998889885256879999875276766 Q ss_pred CEEEEEECCCCEEEECCCCCCC Q ss_conf 3169996587313101121100 Q gi|254780773|r 168 ISLVLYREHVGVLHLKVMPRLQ 189 (349) Q Consensus 168 v~i~v~R~~~~~~~~~v~p~~~ 189 (349) +.+. .|.+.+..++.+.|+.. 
T Consensus 449 v~vl-~~~~~e~~tl~Il~~~~ 469 (473) T KOG1320 449 VAVL-DRRSAEDATLEILPEHK 469 (473) T ss_pred EEEE-EECCCCCEEEEEECCCC T ss_conf 9999-71476302589502335 No 45 >KOG1421 consensus Probab=97.58 E-value=9.4e-05 Score=52.51 Aligned_cols=64 Identities=25% Similarity=0.409 Sum_probs=57.2 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCC Q ss_conf 000245675565300145662888878304540011111014678863169996587313101121 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMP 186 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p 186 (349) ++..|.++|||++. |++||..+++|++-+.++.++-+.+.+.-|+.++++++|+++ ..+++++- T Consensus 306 vV~~vL~~gpa~k~-Le~GDillavN~t~l~df~~l~~iLDegvgk~l~LtI~Rggq-elel~vtv 369 (955) T KOG1421 306 VVETVLPEGPAEKK-LEPGDILLAVNSTCLNDFEALEQILDEGVGKNLELTIQRGGQ-ELELTVTV 369 (955) T ss_pred EEEEECCCCCHHHC-CCCCCEEEEECCEEHHHHHHHHHHHHHCCCCEEEEEEEECCE-EEEEEEEE T ss_conf 99873369803302-577867999833316889999987752358508999984888-99999974 No 46 >KOG3209 consensus Probab=97.49 E-value=0.00011 Score=52.07 Aligned_cols=62 Identities=16% Similarity=0.347 Sum_probs=50.1 Q ss_pred CCCCCCCCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEECCC Q ss_conf 233332000245675565300-14566288887830454--0011111014678863169996587 Q gi|254780773|r 115 TGVMKPVVSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYREHV 177 (349) Q Consensus 115 ~g~~~p~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~~~ 177 (349) ...+.+.|+.+.++|||+.-| |+.||+|++|||++|.+ ..|+++.|++ +|-+|++++.-.++ T Consensus 775 ~~kp~sgiGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd-aGlsVtLtIip~ee 839 (984) T KOG3209 775 QNKPESGIGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD-AGLSVTLTIIPPEE 839 (984) T ss_pred CCCCCCCCCCCCCCCHHHHHCCCCCCCEEEEECCEEEECCCCHHHHHHHHH-CCCEEEEEECCHHC T ss_conf 668987743125698167505543265688754703303672568888873-68558999748010 No 47 >pfam04495 GRASP55_65 GRASP55/65 family. 
GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system. Probab=97.00 E-value=0.0014 Score=44.49 Aligned_cols=68 Identities=28% Similarity=0.391 Sum_probs=56.6 Q ss_pred CCCCCCCCCHHHHHHCCC-CCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEE-CCCCEEEECCCCCC Q ss_conf 000245675565300145-662888878304540011111014678863169996-58731310112110 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKK-GDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYR-EHVGVLHLKVMPRL 188 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~-GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R-~~~~~~~~~v~p~~ 188 (349) -|-+|.++|||+.|||++ .|-|+.-+..-...-+|+.+.++.+.++++.+-|+- +.+.+..++++|+. T Consensus 46 hvl~v~~~SPA~~AgL~~~~DYIiG~~~~~l~~~~~l~~~v~~~~~~~l~lyVYN~~~d~~R~V~i~p~~ 115 (280) T pfam04495 46 HVLDVHPNSPAALAGLQPYSDYIIGTDSGLLRGEDDLFELVESHEGRPLKLYVYNSETDVVREVTITPNR 115 (280) T ss_pred EEEECCCCCHHHHCCCCCCCCEEEECCCCCCCCHHHHHHHHHHHCCCCEEEEEECCCCCCEEEEEEECCC T ss_conf 9984489997997488877786873684231456789999997369976999965788836789985277 No 48 >PRK09681 putative type II secretion protein GspC; Provisional Probab=96.94 E-value=0.0011 Score=45.17 Aligned_cols=58 Identities=21% Similarity=0.377 Sum_probs=45.1 Q ss_pred CCCCCH---HHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCCEEEEC Q ss_conf 456755---65300145662888878304540011111014-678863169996587313101 Q gi|254780773|r 125 VSPASP---AAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVGVLHLK 183 (349) Q Consensus 125 V~~~sp---A~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~~~~~~ 183 (349) +.||.. .+++|||+||.+++|||.+.++.....+.++. +.-++++++|+|||+. .++.
T Consensus 254 l~PGkd~~lF~~~Glq~gDlavsiNG~dLtDp~~a~~~~~~l~~ate~~ltVeRdGq~-~~I~ 315 (319) T PRK09681 254 VKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGAR-YDIS 315 (319) T ss_pred ECCCCCHHHHHHCCCCCCCEEEEECCCCCCCHHHHHHHHHHHHHCCEEEEEEEECCEE-EEEE T ss_conf 2799888999972999888889826966789899999999600071558999979968-9999 No 49 >TIGR01713 typeII_sec_gspC general secretion pathway protein C; InterPro: IPR001639 The general (type II) secretion pathway (GSP) within Gram-negative bacteria is a signal sequence-dependent process responsible for protein export. The process has two stages: exoproteins are first translocated across the inner membrane by the general signal-dependent export pathway (GEP), and then across the outer membrane by a species-specific accessory mechanism. A number of molecules are involved in the GSP; one of these is known as the 'C' protein, the most probable location of which is the inner membrane. This suggests that protein C is part of the GEP apparatus, aiding translocation of exoproteins from the cytoplasm to the periplasm, prior to transport across the outer membrane. The size of the 'C' protein is around 270 to 300 amino acids. It apparently contains a single transmembrane domain located in the N-terminal section. The short N-terminal domain is predicted to be cytoplasmic and the large C-terminal domain periplasmic. The gene encoding the 'C' protein has been sequenced in a variety of bacteria such as Aeromonas (exeC); Erwinia (outC); Escherichia coli (yheE or gspC); Klebsiella pneumoniae (pulC); or Vibrio cholerae (epsC).; GO: 0008565 protein transporter activity, 0015628 protein secretion by the type II secretion system, 0015627 type II protein secretion system complex.
Probab=96.87 E-value=0.0011 Score=45.10 Aligned_cols=54 Identities=20% Similarity=0.304 Sum_probs=46.8 Q ss_pred CCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCC Q ss_conf 45675565300145662888878304540011111014-6788631699965873 Q gi|254780773|r 125 VSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVG 178 (349) Q Consensus 125 V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~ 178 (349) .-+.+..++.|||.||.-+++||..+++-++..++++. ...++.++||+|||+. T Consensus 220 gK~~~lF~~~GLq~gD~AvalNgLdLrd~e~a~~~l~~l~~~~~~~ltv~RdG~~ 274 (281) T TIGR01713 220 GKDPSLFYKSGLQDGDIAVALNGLDLRDPEQAKQALQLLRELTELTLTVERDGQR 274 (281) T ss_pred CCCHHHHHHHCCCCCCEEEEECCCCCCCHHHHHHHHHHHCCCCCEEEEEEECCCC T ss_conf 8985453411685673246536888779899999999730486608999977942 No 50 >pfam01434 Peptidase_M41 Peptidase family M41. Probab=96.69 E-value=0.0017 Score=43.86 Aligned_cols=67 Identities=27% Similarity=0.325 Sum_probs=43.7 Q ss_pred HHHHHHHHHHHHHHHHC--CEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCCCCCHHHHH Q ss_conf 99997337799999849--7500455306831127860598079999771121110012445506650368421013421 Q gi|254780773|r 17 IVVIHEFGHYMVARLCN--IRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAPWKKILTVL 94 (349) Q Consensus 17 ~v~iHE~GH~~~Ar~~g--v~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~~~R~~i~~ 94 (349) .+.+||.||.++|.++. 
-+|.+=+| ..|++ -+||+.+.++| |...+.+....+|+.|++ T Consensus 10 ~vA~HEaGHAlva~~l~~~~~v~kvtI---------~prg~---------alG~t~~~p~e-d~~~~tk~~l~~~I~v~L 70 (192) T pfam01434 10 LVAYHEAGHALVGLLLPGADPVHKVTI---------IPRGQ---------ALGYTQFLPEE-DKLLYTKSQLLARIDVAL 70 (192) T ss_pred HHHHHHHHHHHHHHHCCCCCCCEEEEE---------EECCC---------CCEEEEECCCC-CCCCCCHHHHHHHHHHHH T ss_conf 999999999999998469998217998---------62788---------76357865730-112158999999999986 Q ss_pred HHHHHHHH Q ss_conf 12333211 Q gi|254780773|r 95 AGPLANCV 102 (349) Q Consensus 95 AGp~~N~i 102 (349) ||-.+--+ T Consensus 71 gGraAEei 78 (192) T pfam01434 71 GGRAAEEL 78 (192) T ss_pred HHHHHHHH T ss_conf 38999999 No 51 >KOG3532 consensus Probab=96.56 E-value=0.0023 Score=42.91 Aligned_cols=47 Identities=17% Similarity=0.297 Sum_probs=41.8 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCC Q ss_conf 00024567556530014566288887830454001111101467886 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHE 167 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~ 167 (349) .|-.|.+++||.+|.+++||+.++|||.||.+-++...+++...++. T Consensus 401 ~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~ 447 (1051) T KOG3532 401 KVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDL 447 (1051) T ss_pred EEEEECCCCHHHHHCCCCCCEEEEECCCCCHHHHHHHHHHHHCCCCE T ss_conf 99970689754675268655699855852315999999998614406 No 52 >TIGR00225 prc C-terminal processing peptidase; InterPro: IPR004447 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . 
Structures are known for members of the clans, and these structures indicate that some clans appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. 
The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of serine peptidases belong to MEROPS peptidase family S41 (clan SM), subfamily S41A (C-terminal processing peptidase). It is a family of C-terminal peptidases with different substrates in different species, including processing of D1 protein of the photosystem II reaction centre in higher plants, and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in Escherichia coli.; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis. Probab=96.30 E-value=0.018 Score=36.83 Aligned_cols=58 Identities=28% Similarity=0.445 Sum_probs=51.1 Q ss_pred CCCCCCCCCCCHHHHHHCCCCCEEEEECCC-----CCCCC--HHHHHHCCCCCCCCCEEEEEECC Q ss_conf 320002456755653001456628888783-----04540--01111101467886316999658 Q gi|254780773|r 119 KPVVSNVSPASPAAIAGVKKGDCIISLDGI-----TVSAF--EEVAPYVRENPLHEISLVLYREH 176 (349) Q Consensus 119 ~p~I~~V~~~spA~~AGL~~GD~Il~InG~-----~V~s~--~dl~~~i~~~~g~~v~i~v~R~~ 176 (349) ...+....+++||+++|+++||.|+++||+ ++..| ++....++..++..+.+++.|.+ T Consensus 67 ~~~~~~~~~g~p~~~~g~~~~d~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~g~~~~~~~~~~g 131 (361) T TIGR00225 67 ELVIVSPLEGSPAEKAGLKPGDKILKVNGKGGPLESVLGLSLDDAVALIRGKKGTKVSLEILRAG 131 (361) T ss_pred EEEEEECCCCCCHHHCCCCCCCEEEEECCCCCCCHHHHHCCHHHHHHHHCCCCCCEEEEEEECCC T ss_conf 37886214677311204666640686167666410222012578899750777861689984277 No 53 >COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion] Probab=96.25 E-value=0.0059 Score=40.13 Aligned_cols=54 Identities=13% Similarity=0.255 Sum_probs=46.8 Q ss_pred CCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCC-CCCCCCEEEEEECCCC Q ss_conf 45675565300145662888878304540011111014-6788631699965873 Q gi|254780773|r 125 
VSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRE-NPLHEISLVLYREHVG 178 (349) Q Consensus 125 V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~-~~g~~v~i~v~R~~~~ 178 (349) ..+.|..++.|||+||.-+++|+...++.+|+...++. ..-++..+|+.|+|.. T Consensus 214 gkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~r 268 (275) T COG3031 214 GKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKR 268 (275) T ss_pred CCCCCHHHHHCCCCCCEEEEECCCCCCCHHHHHHHHHHHHCCCCEEEEEEECCCC T ss_conf 9983244550688765689965866689899999999611386507999945853 No 54 >COG1994 SpoIVFB Zn-dependent proteases [General function prediction only] Probab=96.17 E-value=0.0028 Score=42.29 Aligned_cols=72 Identities=28% Similarity=0.341 Sum_probs=54.4 Q ss_pred HHHHHHHHHHHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEECCEEECCCCCCCHHHHCCCCCCCCHHHH Q ss_conf 99999997337799999849750045530683112786059807999977112111001244550665036842101342 Q gi|254780773|r 14 LIIIVVIHEFGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPLGGYVSFSEDEKDMRSFFCAAPWKKILTV 93 (349) Q Consensus 14 l~~~v~iHE~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~PlGgyV~~~~~e~~~~~f~~~~~~~R~~i~ 93 (349) +..-++.||+||...+|..++++. ++-+ .++||+..+++...+.+.++..+ +++.++. T Consensus 51 l~~rl~l~~~gh~~~~~~~~~~l~--~~~i-------------------~~~~g~~~~~~~~v~~~~~~~~~-~~g~lvs 108 (230) T COG1994 51 LAHRLVLHPLGHSDEAGRLGLKLL--LALL-------------------FGFGGFGFLKPVPVNPRGEFLIR-LAGPLVS 108 (230) T ss_pred HHHHHHHHHHCCHHHHHHHHHHHH--HHHH-------------------HHHHHHHEECCCCCCHHHHHHHC-CCCHHHH T ss_conf 999999986013999998787799--9999-------------------98658841044661658864111-3314999 Q ss_pred HH-HHHHHHHHHHHH Q ss_conf 11-233321122111 Q gi|254780773|r 94 LA-GPLANCVMAILF 107 (349) Q Consensus 94 ~A-Gp~~N~ilA~ii 107 (349) .| ||+.|+.+|++. 
T Consensus 109 ~algpl~ni~la~~~ 123 (230) T COG1994 109 LALGPLTNIALAVLG 123 (230) T ss_pred HHHHHHHHHHHHHHH T ss_conf 988899888999999 No 55 >KOG3553 consensus Probab=96.16 E-value=0.0028 Score=42.36 Aligned_cols=55 Identities=24% Similarity=0.436 Sum_probs=39.2 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCC--CCHHHHHHCCCCCCCCCEEEEEECC Q ss_conf 20002456755653001456628888783045--4001111101467886316999658 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVS--AFEEVAPYVRENPLHEISLVLYREH 176 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~--s~~dl~~~i~~~~g~~v~i~v~R~~ 176 (349) -+|..|.++|||+.|||+.+|+|+.+||-..+ +-+.-+..++++ +.+.+.|.|.+ T Consensus 61 iYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~k~--~vl~mLVaR~~ 117 (124) T KOG3553 61 IYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRITKE--EVLRMLVARQS 117 (124) T ss_pred EEEEEECCCCHHHHHCCEECCEEEEECCCEEEEEEHHHHHHHHHHH--HHHHHHHHHHC T ss_conf 7999704698366400220356888647405888768888786375--79999988603 No 56 >KOG3580 consensus Probab=96.02 E-value=0.0035 Score=41.69 Aligned_cols=53 Identities=23% Similarity=0.425 Sum_probs=31.3 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC--HHHHHH-CCCCCCCCCEEEEE Q ss_conf 000245675565300145662888878304540--011111-01467886316999 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAF--EEVAPY-VRENPLHEISLVLY 173 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~--~dl~~~-i~~~~g~~v~i~v~ 173 (349) .|.+|.++|||++-|||.||.|+.||..+..+. +|-+.+ +.--+|+++++... 
T Consensus 432 FVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ 487 (1027) T KOG3580 432 FVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQ 487 (1027) T ss_pred EEEECCCCCCHHHCCCCCCCEEEEECCCCCHHHHHHHHHHHHHCCCCCCEEEEHHH T ss_conf 87411268830111300036267753633010427888999862899767761343 No 57 >KOG3834 consensus Probab=95.89 E-value=0.01 Score=38.42 Aligned_cols=69 Identities=22% Similarity=0.351 Sum_probs=54.9 Q ss_pred CCCCCCCCCHHHHHHCC-CCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEEC-CCCEEEECCCCCCC Q ss_conf 00024567556530014-56628888783045400111110146788631699965-87313101121100 Q gi|254780773|r 121 VVSNVSPASPAAIAGVK-KGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYRE-HVGVLHLKVMPRLQ 189 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~-~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~-~~~~~~~~v~p~~~ 189 (349) -|-+|.++|||+.|||+ .+|-|+.+-+..-...+|+...|+.+.++...+-|+-- .+...+++++|+.. T Consensus 112 Hvl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~pn~a 182 (462) T KOG3834 112 HVLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTITPNSA 182 (462) T ss_pred EEEECCCCCHHHHCCCCCCCCEEECCHHHHCCCHHHHHHHHHHCCCCCCCEEEEECCCCCCCEEEEECCCC T ss_conf 44643899878850553365357435455234157899999860278741467644777520478611432 No 58 >KOG3552 consensus Probab=95.87 E-value=0.0055 Score=40.29 Aligned_cols=55 Identities=25% Similarity=0.459 Sum_probs=43.9 Q ss_pred CC-CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEEC Q ss_conf 32-00024567556530014566288887830454--00111110146788631699965 Q gi|254780773|r 119 KP-VVSNVSPASPAAIAGVKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYRE 175 (349) Q Consensus 119 ~p-~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~ 175 (349) .| +|-.|++|+|+. --|+|||.|+.|||++|++ |+.+.+.++... +.++++|.+. 
T Consensus 75 rPviVr~VT~GGps~-GKL~PGDQIl~vN~Epv~daprervIdlvRace-~sv~ltV~qP 132 (1298) T KOG3552 75 RPVIVRFVTEGGPSI-GKLQPGDQILAVNGEPVKDAPRERVIDLVRACE-SSVNLTVCQP 132 (1298) T ss_pred CCEEEEEECCCCCCC-CCCCCCCEEEEECCCCCCCCCHHHHHHHHHHHH-HHCCEEEECC T ss_conf 706999846898765-633677747874686321143889999999876-4103488604 No 59 >PRK10733 hflB ATP-dependent metalloprotease; Reviewed Probab=95.84 E-value=0.0084 Score=39.05 Aligned_cols=39 Identities=15% Similarity=0.346 Sum_probs=18.6 Q ss_pred HHHHHHHHHHCCEEEEEEEECCCHHEEEEECCCEEEEEEEEEE Q ss_conf 3779999984975004553068311278605980799997711 Q gi|254780773|r 23 FGHYMVARLCNIRVLSFSVGFGPELIGITSRSGVRWKVSLIPL 65 (349) Q Consensus 23 ~GH~~~Ar~~gv~V~~FsiGfgp~l~~~~~k~~t~y~i~~~Pl 65 (349) +..|+.. +..=+|++-.|. ..-...+++++++|... +|. T Consensus 34 ys~f~~~-~~~~~v~~v~i~--~~~i~~~~~~~~~~~~~-~p~ 72 (644) T PRK10733 34 YSTFLQE-VNQDQVREARIN--GREINVTKKDSNRYTTY-IPV 72 (644) T ss_pred HHHHHHH-HHCCCEEEEEEE--CCEEEEEECCCCEEEEE-CCC T ss_conf 9999999-984991599995--77999998689769985-788 No 60 >KOG3834 consensus Probab=95.67 E-value=0.013 Score=37.78 Aligned_cols=71 Identities=24% Similarity=0.374 Sum_probs=50.2 Q ss_pred CCCCCCCCCCCCHHHHHHCCC-CCEEEEECCCCCCCCHH-HHHHCCCCCCCCCEEEEEECCC-CEEEECCCCCCC Q ss_conf 332000245675565300145-66288887830454001-1111014678863169996587-313101121100 Q gi|254780773|r 118 MKPVVSNVSPASPAAIAGVKK-GDCIISLDGITVSAFEE-VAPYVRENPLHEISLVLYREHV-GVLHLKVMPRLQ 189 (349) Q Consensus 118 ~~p~I~~V~~~spA~~AGL~~-GD~Il~InG~~V~s~~d-l~~~i~~~~g~~v~i~v~R~~~-~~~~~~v~p~~~ 189 (349) ..-.+-.|++||||.+|||++ -|=|++|||..++.-+| +...++.+-.+ +++++.-... ++..+.+.|... 
T Consensus 15 eg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps~~ 88 (462) T KOG3834 15 EGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPSNN 88 (462) T ss_pred EEEEEEEEECCCHHHHCCCCHHHHHHHEECCCCCCCCHHHHHHHHHHCCCC-EEEEEEECCCCEEEEEEECCCCC T ss_conf 047899863378677567503444410017400267658999988742412-17998855542367898424312 No 61 >KOG3209 consensus Probab=95.67 E-value=0.0092 Score=38.78 Aligned_cols=54 Identities=30% Similarity=0.458 Sum_probs=32.3 Q ss_pred CCCCCCCCCHHHHHH-CCCCCEEEEECCCCCC--CCHHHHHHCCCCC-CCCCEEEEEE Q ss_conf 000245675565300-1456628888783045--4001111101467-8863169996 Q gi|254780773|r 121 VVSNVSPASPAAIAG-VKKGDCIISLDGITVS--AFEEVAPYVRENP-LHEISLVLYR 174 (349) Q Consensus 121 ~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~--s~~dl~~~i~~~~-g~~v~i~v~R 174 (349) .|.+|.+++||++-| |++||+|+.|||+.+- +-.|.++..+.-| |+.+++++-| T Consensus 374 qVKsvl~DGPAa~dGkle~GDviV~INg~cvlGhTHAqaV~~fqaiPvg~~V~L~lcR 431 (984) T KOG3209 374 QVKSVLKDGPAAQDGKLETGDVIVHINGECVLGHTHAQAVKRFQAIPVGQSVDLVLCR 431 (984) T ss_pred EEEECCCCCCHHHCCCCCCCCEEEEECCCEECCCCHHHHHHHHHCCCCCCEEEEEEEC T ss_conf 3410035884664385335758999778264361079888774113568706689854 No 62 >KOG0606 consensus Probab=94.73 E-value=0.014 Score=37.53 Aligned_cols=42 Identities=33% Similarity=0.445 Sum_probs=34.3 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC--HHHHHHCC Q ss_conf 2000245675565300145662888878304540--01111101 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISLDGITVSAF--EEVAPYVR 161 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~--~dl~~~i~ 161 (349) -.++.|.++|||..||++++|.|+.+||+++... 
.++.+.+- T Consensus 660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll 703 (1205) T KOG0606 660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLL 703 (1205) T ss_pred EEEEEECCCCCCCCCCCCCCCEEEECCCCCCCHHHHHHHHHHHH T ss_conf 24454237887334677723346741685430010899999997 No 63 >KOG3542 consensus Probab=94.33 E-value=0.022 Score=36.11 Aligned_cols=34 Identities=35% Similarity=0.577 Sum_probs=30.4 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCH Q ss_conf 0002456755653001456628888783045400 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFE 154 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~ 154 (349) .|.+|.|+|-|+.+|++.||.|+++||+.-.+.+ T Consensus 565 fV~~V~pgskAa~~GlKRgDqilEVNgQnfenis 598 (1283) T KOG3542 565 FVAEVFPGSKAAREGLKRGDQILEVNGQNFENIS 598 (1283) T ss_pred EEEEECCCCHHHHHHHHHHHHHHHCCCCCHHHHH T ss_conf 8863068846777654201143210452322202 No 64 >KOG1421 consensus Probab=94.25 E-value=0.062 Score=33.08 Aligned_cols=54 Identities=26% Similarity=0.540 Sum_probs=37.4 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECC Q ss_conf 0002456755653001456628888783045400111110146788631699965873131011 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKV 184 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v 184 (349) +|+-|.+.-+-- |..||+|+++||+.|...+|+.++. 
+++..+.|+|.+ .++++ T Consensus 774 ~ishv~~~~~ki---l~~gdiilsvngk~itr~~dl~d~~------eid~~ilrdg~~-~~iki 827 (955) T KOG1421 774 VISHVRPLLHKI---LGVGDIILSVNGKMITRLSDLHDFE------EIDAVILRDGIE-MEIKI 827 (955) T ss_pred EEEEECCCCCCC---CCCCCEEEEECCEEEEEEHHHHHHH------HHHEEEEECCCE-EEEEE T ss_conf 998422576501---1346489995676776502233455------312045415818-99982 No 65 >KOG3651 consensus Probab=93.84 E-value=0.042 Score=34.19 Aligned_cols=53 Identities=28% Similarity=0.464 Sum_probs=41.9 Q ss_pred CCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEE Q ss_conf 000245675565300-14566288887830454--0011111014678863169996 Q gi|254780773|r 121 VVSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYR 174 (349) Q Consensus 121 ~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R 174 (349) +|..|..++||+.-| ++.||.|++|||.+|.. -.++...++.+. .++.+.+.+ T Consensus 33 YiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~-~eV~IhyNK 88 (429) T KOG3651 33 YIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSL-NEVKIHYNK 88 (429) T ss_pred EEEEECCCCCHHCCCCCCCCCEEEEECCEEECCCCHHHHHHHHHHHC-CCEEEEEHH T ss_conf 99876169832104740237756876345652710799999999714-656998110 No 66 >KOG3580 consensus Probab=93.47 E-value=0.056 Score=33.37 Aligned_cols=57 Identities=19% Similarity=0.406 Sum_probs=42.8 Q ss_pred CCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEECCCCE Q ss_conf 00245675565300-14566288887830454--001111101467886316999658731 Q gi|254780773|r 122 VSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYREHVGV 179 (349) Q Consensus 122 I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~~~~~ 179 (349) |..+...+-|+.-| ||.||.|++|||+...+ ..|-...|.++.|+ ..+.|.|+..++ T Consensus 223 vKeit~~gLAardgnlqEGDiiLkINGtvteNmSLtDar~LIEkS~GK-L~lvVlRD~~qt 282 (1027) T KOG3580 223 VKEITRTGLAARDGNLQEGDIILKINGTVTENMSLTDARKLIEKSRGK-LQLVVLRDSQQT 282 (1027) T ss_pred HHHHCCCCHHHCCCCCCCCCEEEEECCEEECCCCCHHHHHHHHHCCCC-EEEEEEECCCCE T ss_conf 433302311112488655637999776740344405678898743673-689999327851 No 67 >KOG3549 consensus 
Probab=93.38 E-value=0.066 Score=32.85 Aligned_cols=55 Identities=33% Similarity=0.498 Sum_probs=45.5 Q ss_pred CCCCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEE Q ss_conf 32000245675565300-14566288887830454--0011111014678863169996 Q gi|254780773|r 119 KPVVSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYR 174 (349) Q Consensus 119 ~p~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R 174 (349) +-+|+.+.++..|+..| |=.||-|+.|||..|.. .+|+++.+++ +|+++++||.. T Consensus 81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN-AGdeVtlTV~~ 138 (505) T KOG3549 81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN-AGDEVTLTVKH 138 (505) T ss_pred CEEEEHHHHHHHHHHCCCEEEEEEEEEECCEEEECCCHHHHHHHHHH-CCCEEEEEEHH T ss_conf 47863032554420017467400557755577614880899999870-69879998276 No 68 >KOG3605 consensus Probab=92.95 E-value=0.19 Score=29.77 Aligned_cols=67 Identities=21% Similarity=0.344 Sum_probs=47.5 Q ss_pred CCCCCCCCCCC--CCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCC-CCCEEEEEECC Q ss_conf 11001233332--000245675565300-14566288887830454--0011111014678-86316999658 Q gi|254780773|r 110 FFFYNTGVMKP--VVSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPL-HEISLVLYREH 176 (349) Q Consensus 110 ~~~~~~g~~~p--~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g-~~v~i~v~R~~ 176 (349) ++-++||..-| +|.....++||+..| |.-||+|++|||.+.-. .+.-+.+|+..++ ..|+++|.+-- T Consensus 663 iVESGWGSmLPTVViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~cp 735 (829) T KOG3605 663 IVESGWGSILPTVVIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVSCP 735 (829) T ss_pred EEECCCCCCCHHHHHHHCCCCCHHHHCCCCCCCCEEEEECCCEECCCCHHHHHHHHHCCCCCCEEEEEEECCC T ss_conf 9834754311589987513677165438766322257644721106607999999861555405888776189 No 69 >TIGR01241 FtsH_fam ATP-dependent metallopeptidase HflB; InterPro: IPR005936 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. 
In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue, which may play an electrophilic role, is required for catalysis. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. 
In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases belong to MEROPS peptidase family M41 (FtsH endopeptidase family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH. FtsH is a membrane-anchored ATP-dependent protease that degrades misfolded or misassembled membrane proteins as well as a subset of cytoplasmic regulatory proteins. FtsH is a 647-residue protein of 70 kDa, with two putative transmembrane segments towards its N terminus which anchor the protein to the membrane, giving rise to a periplasmic domain of 70 residues and a cytoplasmic segment of 520 residues containing the ATPase and protease domains . ; GO: 0004222 metalloendopeptidase activity, 0030163 protein catabolic process, 0016020 membrane. Probab=92.57 E-value=0.041 Score=34.28 Aligned_cols=11 Identities=27% Similarity=0.483 Sum_probs=5.1 Q ss_pred CEEEECCCCCC Q ss_conf 31310112110 Q gi|254780773|r 178 GVLHLKVMPRL 188 (349) Q Consensus 178 ~~~~~~v~p~~ 188 (349) .+..+|+.|+- T Consensus 338 pV~KvTIIPRG 348 (505) T TIGR01241 338 PVHKVTIIPRG 348 (505) T ss_pred CCCCEEECCCC T ss_conf 52325631478 No 70 >KOG1892 consensus Probab=92.40 E-value=0.076 Score=32.43 Aligned_cols=57 Identities=19% Similarity=0.301 Sum_probs=42.6 Q ss_pred CCCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCCCHH--HHHHCCCCCCCCCEEEEEECCC Q ss_conf 2000245675565300-14566288887830454001--1111014678863169996587 Q gi|254780773|r 120 PVVSNVSPASPAAIAG-VKKGDCIISLDGITVSAFEE--VAPYVRENPLHEISLVLYREHV 177 (349) Q Consensus 120 p~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s~~d--l~~~i~~~~g~~v~i~v~R~~~ 177 (349) -+|.+|++|+||+.-| |++||..++|||.+.-...+ -.+++ .+.|..|.++|.+.+. 
T Consensus 962 IYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lm-trtg~vV~leVaKqgA 1021 (1629) T KOG1892 962 IYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLM-TRTGNVVHLEVAKQGA 1021 (1629) T ss_pred EEEEEECCCCCCCCCCCCCCCCEEEEECCCCCCCCCHHHHHHHH-HCCCCEEEEEHHHHHH T ss_conf 47987315875455564015763664558201142588899987-4248757875132356 No 71 >KOG3551 consensus Probab=90.23 E-value=0.16 Score=30.16 Aligned_cols=55 Identities=20% Similarity=0.285 Sum_probs=42.3 Q ss_pred CCCCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEE Q ss_conf 32000245675565300-14566288887830454--0011111014678863169996 Q gi|254780773|r 119 KPVVSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYR 174 (349) Q Consensus 119 ~p~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R 174 (349) +-.|+.+.++-.|++++ |..||.|++|||+...+ .++-++.++ +.|++|.+.|.. T Consensus 111 PIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLK-raGkeV~levKy 168 (506) T KOG3551 111 PILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALK-RAGKEVLLEVKY 168 (506) T ss_pred CEEHHHHHCCCCCCCCCCEEECCEEEEECCHHHHHCCHHHHHHHHH-HHCCEEEEEEEH T ss_conf 3566775152220323662314479973552333202699999998-617553213210 No 72 >COG0465 HflB ATP-dependent Zn proteases [Posttranslational modification, protein turnover, chaperones] Probab=89.31 E-value=0.29 Score=28.47 Aligned_cols=27 Identities=15% Similarity=0.190 Sum_probs=14.1 Q ss_pred HHHHCCEEEEEEEECCCHHEEEEECCC Q ss_conf 998497500455306831127860598 Q gi|254780773|r 29 ARLCNIRVLSFSVGFGPELIGITSRSG 55 (349) Q Consensus 29 Ar~~gv~V~~FsiGfgp~l~~~~~k~~ 55 (349) ....+-+|++.++...-.....+.+++ T Consensus 32 ~~~~~~~v~~~~~~~~~~~v~~~~~~~ 58 (596) T COG0465 32 QLVSGGKVSSVSIKGDSKTVNLKLKDG 58 (596) T ss_pred HHHHCCCCEEEEECCCCEEEEEEECCC T ss_conf 997267851899827732899886488 No 73 >KOG0609 consensus Probab=88.34 E-value=0.32 Score=28.18 Aligned_cols=55 Identities=27% Similarity=0.418 Sum_probs=43.3 Q ss_pred CCCCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEE Q ss_conf 
32000245675565300-14566288887830454--0011111014678863169996 Q gi|254780773|r 119 KPVVSNVSPASPAAIAG-VKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYR 174 (349) Q Consensus 119 ~p~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R 174 (349) ..+|..+..|+.|+..| |+.||+|+++||.++.+ .++++..+++.. ..+++.+.- T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G~itfkiiP 204 (542) T KOG0609 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR-GSITFKIIP 204 (542) T ss_pred CCEEEEECCCCCCHHCCCEEECCCHHEECCEECCCCCHHHHHHHHHHCC-CCEEEEECC T ss_conf 6078552027840003510206512103675215679799999998488-837999763 No 74 >KOG1738 consensus Probab=87.60 E-value=0.61 Score=26.22 Aligned_cols=44 Identities=30% Similarity=0.470 Sum_probs=34.8 Q ss_pred CCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCCCHH--HHHHCCCCC Q ss_conf 000245675565300-14566288887830454001--111101467 Q gi|254780773|r 121 VVSNVSPASPAAIAG-VKKGDCIISLDGITVSAFEE--VAPYVRENP 164 (349) Q Consensus 121 ~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s~~d--l~~~i~~~~ 164 (349) ++.++.++|||..-+ +.+||.++.||++.+-.|+- ++..++..+ T Consensus 228 ~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwqlk~vV~sL~~~~ 274 (638) T KOG1738 228 VTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQLKVVVSSLRETP 274 (638) T ss_pred ECCCCCCCCHHHHHHCCCCCCCEEEECCCCCCCCHHHHHHHHCCCCC T ss_conf 44655658867775224673310342364214601576875024576 No 75 >KOG3550 consensus Probab=86.42 E-value=0.38 Score=27.60 Aligned_cols=54 Identities=26% Similarity=0.525 Sum_probs=37.9 Q ss_pred CCC-CCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCCC--HHHHHHCCCCCCCCCEEEE Q ss_conf 332-000245675565300-145662888878304540--0111110146788631699 Q gi|254780773|r 118 MKP-VVSNVSPASPAAIAG-VKKGDCIISLDGITVSAF--EEVAPYVRENPLHEISLVL 172 (349) Q Consensus 118 ~~p-~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s~--~dl~~~i~~~~g~~v~i~v 172 (349) .+| +|+.+.|++.|+.-| |+.||..+++||.+|..- +.-++.+... 
...+.+.+ T Consensus 114 nspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa-~gsvklvv 171 (207) T KOG3550 114 NSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAA-VGSVKLVV 171 (207) T ss_pred CCCEEEEEECCCCCCCCCCCCCCCCEEEEECCEEECCHHHHHHHHHHHHH-CCCEEEEE T ss_conf 89647886247752001376445564676546420313169999999973-57678987 No 76 >KOG3606 consensus Probab=86.38 E-value=0.73 Score=25.69 Aligned_cols=55 Identities=20% Similarity=0.427 Sum_probs=40.0 Q ss_pred CCCCCCCCCHHHHHHCC-CCCEEEEECCCCCC--CCHHHHHHCCCCCCCCCEEEEEECC Q ss_conf 00024567556530014-56628888783045--4001111101467886316999658 Q gi|254780773|r 121 VVSNVSPASPAAIAGVK-KGDCIISLDGITVS--AFEEVAPYVRENPLHEISLVLYREH 176 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~-~GD~Il~InG~~V~--s~~dl~~~i~~~~g~~v~i~v~R~~ 176 (349) -|+..+|++-|+.-||. ..|.+++|||.+|. +.+++.+.+-.+. ...-+||...| T Consensus 197 FISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs-hNLIiTVkPAN 254 (358) T KOG3606 197 FISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS-HNLIITVKPAN 254 (358) T ss_pred EEEEECCCCCCCCCCEEEECCEEEEECCEEECCCCHHHHHHHHHHCC-CCEEEEECCCC T ss_conf 78850377520134405532416887577841523888788876344-64389961444 No 77 >KOG3571 consensus Probab=86.14 E-value=0.61 Score=26.20 Aligned_cols=57 Identities=18% Similarity=0.355 Sum_probs=41.6 Q ss_pred CCCCCCCCCCHHHHHH-CCCCCEEEEECCCCCCCC--HHHHHHCCC--CCCCCCEEEEEECC Q ss_conf 2000245675565300-145662888878304540--011111014--67886316999658 Q gi|254780773|r 120 PVVSNVSPASPAAIAG-VKKGDCIISLDGITVSAF--EEVAPYVRE--NPLHEISLVLYREH 176 (349) Q Consensus 120 p~I~~V~~~spA~~AG-L~~GD~Il~InG~~V~s~--~dl~~~i~~--~~g~~v~i~v~R~~ 176 (349) -+|+++.++++-+.-| +++||.|+.||..+..+. +|-+..+++ +...++++++.+-. 
T Consensus 279 IYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltvAk~~ 340 (626) T KOG3571 279 IYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTVAKCW 340 (626) T ss_pred EEEEEECCCCEEECCCCCCCCCEEEEEEECCHHHCCCHHHHHHHHHHHCCCCCEEEEEEECC T ss_conf 58864136860311476575533787400123104764999999998636787378886022 No 78 >pfam11874 DUF3394 Domain of unknown function (DUF3394). This domain is functionally uncharacterized. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with pfam06808. Probab=85.74 E-value=0.56 Score=26.47 Aligned_cols=26 Identities=31% Similarity=0.345 Sum_probs=16.2 Q ss_pred CCCCCCCCCCHHHHHHCCCCCEEEEE Q ss_conf 20002456755653001456628888 Q gi|254780773|r 120 PVVSNVSPASPAAIAGVKKGDCIISL 145 (349) Q Consensus 120 p~I~~V~~~spA~~AGL~~GD~Il~I 145 (349) -.|+.+..+|||+++|++-||.|+++ T Consensus 124 ~~vd~~~f~s~Aek~G~d~d~~I~~v 149 (183) T pfam11874 124 VIVDEVEFGSPAEKAGIDFDWEIVEV 149 (183) T ss_pred EEEEECCCCCHHHHHCCCCCCEEEEE T ss_conf 89995488986888268778689999 No 79 >KOG3605 consensus Probab=83.02 E-value=0.67 Score=25.92 Aligned_cols=26 Identities=8% Similarity=-0.005 Sum_probs=16.9 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99999999999999999999999999 Q gi|254780773|r 317 GVSVTRVITRMGLCIILFLFFLGIRN 342 (349) Q Consensus 317 ~~~~~~~~~~~g~~ll~~l~i~~~~n 342 (349) ++..+-++|.+|=++-+..|=|.--| T Consensus 585 SdeAQfIAQSIGQAFqVAY~EFLrAN 610 (829) T KOG3605 585 SDEAQFIAQSIGQAFQVAYMEFLRAN 610 (829) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 32578999988899999999999872 No 80 >pfam01079 Hint Hint module. This is an alignment of the Hint module in the Hedgehog proteins. It does not include any Inteins which also possess the Hint module. 
Probab=80.80 E-value=1.8 Score=23.04 Aligned_cols=58 Identities=14% Similarity=0.295 Sum_probs=36.8 Q ss_pred CCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEE-EEECCCCEEEECCCCC Q ss_conf 75565300145662888878304540011111014678863169-9965873131011211 Q gi|254780773|r 128 ASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLV-LYREHVGVLHLKVMPR 187 (349) Q Consensus 128 ~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~-v~R~~~~~~~~~v~p~ 187 (349) ++--...-|++||++++.|..---.++++..++...|.+.-.+. ++-++.+ .++++|. T Consensus 24 G~~k~m~~L~~GD~Vla~d~~G~~~yS~Vi~Fldr~~~~~~~F~~I~T~~g~--~l~LT~~ 82 (214) T pfam01079 24 GGTKPMSDLRIGDRVLAADSDGKLVYSPVILFLDRDPEQRREFYVIETENGR--KLTLTPA 82 (214) T ss_pred CCEEEHHHCCCCCEEEEECCCCCEEEEEEEEEEEECCCCCEEEEEEEECCCC--EEEECHH T ss_conf 9888937768888899865899998885799982176711899999958998--6997325 No 81 >pfam07136 DUF1385 Protein of unknown function (DUF1385). This family contains a number of hypothetical bacterial proteins of unknown function approximately 300 residues in length. Some family members are predicted to be metal-dependent. Probab=79.43 E-value=4.8 Score=20.02 Aligned_cols=16 Identities=19% Similarity=0.158 Sum_probs=6.6 Q ss_pred HHHHHHHHHHHHHCCC Q ss_conf 1233321122111011 Q gi|254780773|r 95 AGPLANCVMAILFFTF 110 (349) Q Consensus 95 AGp~~N~ilA~iif~~ 110 (349) ...+..+++|..+|.. T Consensus 43 ~t~~~s~~~ai~lF~~ 58 (235) T pfam07136 43 LTVILSLAFAIGLFFV 58 (235) T ss_pred HHHHHHHHHHHHHHHH T ss_conf 9999999999999999 No 82 >pfam02128 Peptidase_M36 Fungalysin metallopeptidase (M36). Probab=77.74 E-value=0.18 Score=29.82 Aligned_cols=17 Identities=0% Similarity=0.172 Sum_probs=8.1 Q ss_pred HHHHHHHHHCCCCCHHH Q ss_conf 99999888278699999 Q gi|254780773|r 304 ITFLLEMIRGKSLGVSV 320 (349) Q Consensus 304 ~~~~~E~i~gr~i~~~~ 320 (349) +.+++++..=-|.++.. 
T Consensus 315 m~lv~dgmkLqPcnPtF 331 (368) T pfam02128 315 MKLVMDGMKLQPCNPGF 331 (368) T ss_pred HHHHHHHHHCCCCCCCH T ss_conf 99999887428999973 No 83 >cd04278 ZnMc_MMP Zinc-dependent metalloprotease, matrix metalloproteinase (MMP) sub-family. MMPs are responsible for a great deal of pericellular proteolysis of extracellular matrix and cell surface molecules, playing crucial roles in morphogenesis, cell fate specification, cell migration, tissue repair, tumorigenesis, gain or loss of tissue-specific functions, and apoptosis. In many instances, they are anchored to cell membranes via trans-membrane domains, and their activity is controlled via TIMPs (tissue inhibitors of metalloproteinases). Probab=73.00 E-value=1.8 Score=22.90 Aligned_cols=12 Identities=8% Similarity=0.498 Sum_probs=5.8 Q ss_pred CCCEEEEEEEEE Q ss_conf 598079999771 Q gi|254780773|r 53 RSGVRWKVSLIP 64 (349) Q Consensus 53 k~~t~y~i~~~P 64 (349) |..-.|++.-.| T Consensus 4 k~~lty~i~~~~ 15 (157) T cd04278 4 KTNLTYRILNYP 15 (157) T ss_pred CCCEEEEEECCC T ss_conf 870589984279 No 84 >TIGR02289 M3_not_pepF oligoendopeptidase, M3 family; InterPro: IPR011976 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. 
Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family consists of probable oligoendopeptidases belonging to MEROPS peptidase family M3, subfamily M3B (clan MA(E)). The family is related to lactococcal PepF and group B streptococcal PepB (IPR004438 from INTERPRO) but in a distinct clade with considerable sequence differences not only to IPR004438 from INTERPRO but also to IPR011977 from INTERPRO. Likely substrates are small peptides and not whole proteins, as with PepF, but members are not characterised and the activity profile may differ. Several bacteria have both a member of this family and a member of the PepF family. 
Probab=72.36 E-value=2 Score=22.67 Aligned_cols=25 Identities=24% Similarity=0.321 Sum_probs=13.3 Q ss_pred CCCCCE-EEEECCCCCCCCHHHHHHC Q ss_conf 145662-8888783045400111110 Q gi|254780773|r 136 VKKGDC-IISLDGITVSAFEEVAPYV 160 (349) Q Consensus 136 L~~GD~-Il~InG~~V~s~~dl~~~i 160 (349) ++|+|. -.-.||++..-..+.-..+ T Consensus 253 l~pwD~s~~~~~gn~L~P~~~~~~~~ 278 (553) T TIGR02289 253 LRPWDESAVFLDGNVLKPFGNVDFLL 278 (553) T ss_pred CCCCCCCCCCCCCCCCCCCCCHHHHH T ss_conf 14201576788678668866778999 No 85 >cd06457 M3A_MIP Peptidase M3 mitochondrial intermediate peptidase (MIP; EC 3.4.24.59) belongs to the widespread subfamily M3A, that show similarity to the Thimet oligopeptidase (TOP). It is one of three peptidases responsible for the proteolytic processing of both, nuclear and mitochondrial encoded precursor polypeptides targeted to the various subcompartments of the mitochondria. It cleaves intermediate-size proteins initially processed by mitochondrial processing peptidase (MPP) to yield a processing intermediate with a typical N-terminal octapeptide that is sequentially cleaved by MIP to mature-size protein. MIP cleaves precursor proteins of respiratory components, including subunits of the electron transport chain and tri-carboxylic acid cycle enzymes, and components of the mitochondrial genetic machinery, including ribosomal proteins, translation factors, and proteins required for mitochondrial DNA metabolism. 
It has been suggested that the human MIP (HMIP polypeptide; gene symbo Probab=69.00 E-value=3.5 Score=20.98 Aligned_cols=17 Identities=24% Similarity=0.272 Sum_probs=10.1 Q ss_pred HHHHHHHHHHCCEEEEE Q ss_conf 37799999849750045 Q gi|254780773|r 23 FGHYMVARLCNIRVLSF 39 (349) Q Consensus 23 ~GH~~~Ar~~gv~V~~F 39 (349) .=..++.|++|++.++= T Consensus 138 Glf~i~~~Lfgi~~~~~ 154 (458) T cd06457 138 GLSRLFSRLYGIRLVPV 154 (458) T ss_pred HHHHHHHHHHCEEEEEC T ss_conf 99999999857456874 No 86 >KOG4407 consensus Probab=68.95 E-value=5.2 Score=19.78 Aligned_cols=45 Identities=16% Similarity=0.496 Sum_probs=35.9 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCC--CHHHHHHCCCCCC Q ss_conf 00024567556530014566288887830454--0011111014678 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSA--FEEVAPYVRENPL 165 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s--~~dl~~~i~~~~g 165 (349) .+.+|.++.||--|-||.||+++.+|.+++.. ++++...+..+|. T Consensus 146 f~~eV~~n~~a~~a~LQ~~d~V~mvn~q~~A~ia~sti~S~~kqt~~ 192 (1973) T KOG4407 146 FIKEVQANGPAHYANLQTGDRVLMVNNQPIAGIAYSTIVSMIKQTPA 192 (1973) T ss_pred HHHHHCCCCHHHHHHHHCCCEEEEEECCCCCCHHHHHHHHHHCCCCC T ss_conf 21123358745877652266157750675422103132220125888 No 87 >COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion] Probab=68.23 E-value=3.3 Score=21.16 Aligned_cols=31 Identities=35% Similarity=0.528 Sum_probs=23.7 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 0002456755653001456628888783045 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) .+-.|.+.+||++||.-.||-|+.+|+.++. T Consensus 66 ~~lrv~~~~~~e~~~~~~~dyilg~n~Dp~~ 96 (417) T COG5233 66 EVLRVNPESPAEKAGMVVGDYILGINEDPLR 96 (417) T ss_pred HHEECCCCCHHHHHCCCCCEEEEEECCCCHH T ss_conf 2200365676675101000058763588279 No 88 >cd06456 M3A_DCP_Oligopeptidase_A Peptidase family M3 dipeptidyl carboxypeptidase (DCP; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). 
This metal-binding M3A family also includes oligopeptidase A (OpdA; EC 3.4.24.70) enzyme. DCP cleaves dipeptides off the C-termini of various peptides and proteins, the smallest substrate being N-blocked tripeptides and unblocked tetrapeptides. DCP from E. coli is inhibited by the anti-hypertensive drug captopril, an inhibitor of the mammalian angiotensin converting enzyme (ACE, also called peptidyl dipeptidase A). Oligopeptidase A (OpdA) may play a specific role in the degradation of signal peptides after they are released from precursor forms of secreted proteins. It can also cleave N-acetyl-L-Ala. Probab=66.90 E-value=4.4 Score=20.28 Aligned_cols=22 Identities=18% Similarity=0.273 Sum_probs=15.2 Q ss_pred HHHHHHHHHHHHHCCCCCHHHH Q ss_conf 3699999998882786999999 Q gi|254780773|r 300 GGHLITFLLEMIRGKSLGVSVT 321 (349) Q Consensus 300 GG~i~~~~~E~i~gr~i~~~~~ 321 (349) |++=...+++..+||+++.+.. T Consensus 395 Gs~~~~e~~~~FlGR~P~~~al 416 (422) T cd06456 395 GSRDPMELFRAFRGRDPSIEAL 416 (422) T ss_pred CCCCHHHHHHHHCCCCCCHHHH T ss_conf 9949999999856999984789 No 89 >pfam00413 Peptidase_M10 Matrixin. The members of this family are enzymes that cleave peptides. These proteases require zinc for catalysis. Probab=66.18 E-value=2.1 Score=22.57 Aligned_cols=14 Identities=0% Similarity=0.202 Sum_probs=8.3 Q ss_pred EEECCCEEEEEEEEE Q ss_conf 860598079999771 Q gi|254780773|r 50 ITSRSGVRWKVSLIP 64 (349) Q Consensus 50 ~~~k~~t~y~i~~~P 64 (349) |+ |....|++.-.| T Consensus 2 W~-k~~lTy~i~~~~ 15 (158) T pfam00413 2 WK-KKNLTYRIVNYT 15 (158) T ss_pred CC-CCCEEEEEECCC T ss_conf 98-881579983579 No 90 >TIGR01700 PNPH purine nucleoside phosphorylase I, inosine and guanosine-specific; InterPro: IPR011270 This entry represents a family of bacterial and metazoan purine phosphorylases acting primarily on inosine and guanosine and not acting on adenosine. PNP-I refers to the nomenclature from Bacillus stearothermophilus where PHP-II refers to the nucleotidase acting on adenosine as the primary substrate. 
The bacterial enzymes (PUNA) are typified by the Bacillus PupG protein which is involved in the metabolism of nucleosides as a carbon source. Several metazoan enzymes (PNPH) are well characterised including the human and bovine enzymes which have been crystallised. ; GO: 0004731 purine-nucleoside phosphorylase activity, 0006139 nucleobase nucleoside nucleotide and nucleic acid metabolic process. Probab=66.01 E-value=2.9 Score=21.53 Aligned_cols=15 Identities=27% Similarity=0.304 Sum_probs=10.1 Q ss_pred HHHHCCCCCEEEEEC Q ss_conf 530014566288887 Q gi|254780773|r 132 AIAGVKKGDCIISLD 146 (349) Q Consensus 132 ~~AGL~~GD~Il~In 146 (349) =++-+|+||.++==| T Consensus 96 iN~~F~~GDlmlI~D 110 (259) T TIGR01700 96 INTEFKVGDLMLIRD 110 (259) T ss_pred CCCCCCCCCEEEECC T ss_conf 277766777789711 No 91 >COG0339 Dcp Zn-dependent oligopeptidases [Amino acid transport and metabolism] Probab=63.82 E-value=5.1 Score=19.83 Aligned_cols=32 Identities=19% Similarity=0.268 Sum_probs=13.8 Q ss_pred HHHHHHHHHHHHHHEECCCCCCCCCCCCCCCCC Q ss_conf 898886554223100000002554200224667 Q gi|254780773|r 225 SRGLDEISSITRGFLGVLSSAFGKDTRLNQISG 257 (349) Q Consensus 225 ~~~~~~~~~~~~~~~~~l~~l~~g~~~~~~lsG 257 (349) -.+.+++.++.-.+-..+-.+++. +...++|| T Consensus 462 Lls~dEV~TLFHEfGHgLH~mlt~-v~~~~vsG 493 (683) T COG0339 462 LLSHDEVTTLFHEFGHGLHHLLTR-VKYPGVSG 493 (683) T ss_pred EEEHHHHHHHHHHHHHHHHHHHHC-CCCCCCCC T ss_conf 310788999998753688887633-87456678 No 92 >pfam01432 Peptidase_M3 Peptidase family M3. This is the Thimet oligopeptidase family, large family of mammalian and bacterial oligopeptidases that cleave medium sized peptides. The group also contains mitochondrial intermediate peptidase which is encoded by nuclear DNA but functions within the mitochondria to remove the leader sequence. 
Probab=63.54 E-value=5.6 Score=19.59 Aligned_cols=20 Identities=20% Similarity=0.130 Sum_probs=9.4 Q ss_pred HHHHHHHHHHHHHCCCCCHH Q ss_conf 36999999988827869999 Q gi|254780773|r 300 GGHLITFLLEMIRGKSLGVS 319 (349) Q Consensus 300 GG~i~~~~~E~i~gr~i~~~ 319 (349) |++=-..+++...||..+.. T Consensus 421 gs~~p~ell~~f~gr~~~~~ 440 (448) T pfam01432 421 GSLDPLELLKKFGGRMPSAD 440 (448) T ss_pred CCCCHHHHHHHHCCCCCCHH T ss_conf 98269999998289999858 No 93 >COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism] Probab=62.08 E-value=3.4 Score=21.02 Aligned_cols=29 Identities=34% Similarity=0.456 Sum_probs=22.7 Q ss_pred CCCCCCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 002456755653001456628888783045 Q gi|254780773|r 122 VSNVSPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 122 I~~V~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) |-...+|.|...| .+|||+|++-||+.|+ T Consensus 302 vl~~~ENm~~g~A-~rPGDVits~~GkTVE 330 (485) T COG0260 302 VLPAVENMPSGNA-YRPGDVITSMNGKTVE 330 (485) T ss_pred EEEEECCCCCCCC-CCCCCEEEECCCEEEE T ss_conf 9761315878789-9998767805970899 No 94 >cd06459 M3B_Oligoendopeptidase_F Peptidase family M3B Oligopeptidase F (PepF; Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and includes oligoendopeptidase F from Lactococcus lactis. This enzyme hydrolyzes peptides containing between 7 and 17 amino acids with fairly broad specificity. The PepF gene is duplicated in L. lactis on the plasmid that bears it, while a shortened second copy is found in Bacillus subtilis. Most bacterial PepFs are cytoplasmic endopeptidases; however, the PepF Bacillus amyloliquefaciens oligopeptidase is a secreted protein and may facilitate the process of sporulation. Specifically, the yjbG gene encoding the homolog of the PepF1 and PepF2 oligoendopeptidases of Lactococcus lactis has been identified in Bacillus subtilis as an inhibitor of sporulation initiation when over expressed from a multicopy plasmid. 
Probab=61.83 E-value=4.2 Score=20.43 Aligned_cols=27 Identities=22% Similarity=0.155 Sum_probs=12.9 Q ss_pred CCCCCEEEEECCC--CCCCCHHHHHHCCC Q ss_conf 1456628888783--04540011111014 Q gi|254780773|r 136 VKKGDCIISLDGI--TVSAFEEVAPYVRE 162 (349) Q Consensus 136 L~~GD~Il~InG~--~V~s~~dl~~~i~~ 162 (349) |++.|.=..+... +.-+|+|..+.+.+ T Consensus 137 l~~~D~~~p~~~~~~~~~~~~ea~~~v~~ 165 (427) T cd06459 137 LRPYDLYAPLVSGNPPKYTYEEAKELVLE 165 (427) T ss_pred CHHHHHCCCCCCCCCCCCCHHHHHHHHHH T ss_conf 15988148788888886899999999999 No 95 >TIGR00181 pepF oligoendopeptidase F; InterPro: IPR004438 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases . Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). 
Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of metallopeptidases belong to MEROPS peptidase family M3 (clan MA(E)), the type example being oligoendopeptidase F from Lactococcus lactis. The enzyme hydrolyses peptides of 7 and 17 amino acids with fairly broad specificity. Differences in substrate specificity should be expected in other species. The gene is duplicated in L. lactis on the plasmid that bears it. A shortened second copy is found in Bacillus subtilis.; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis. Probab=61.70 E-value=3.9 Score=20.65 Aligned_cols=19 Identities=21% Similarity=0.417 Sum_probs=11.3 Q ss_pred HHHHHHHHHHH---HHHHHHHC Q ss_conf 99999973377---99999849 Q gi|254780773|r 15 IIIVVIHEFGH---YMVARLCN 33 (349) Q Consensus 15 ~~~v~iHE~GH---~~~Ar~~g 33 (349) ++...+||+|| ++.|.... 
T Consensus 389 ~~~TLaHE~GHS~Hs~~s~~~Q 410 (611) T TIGR00181 389 SVFTLAHELGHSMHSYFSSKHQ 410 (611) T ss_pred HHHHHHHHHHHHHHHHHHHHCC T ss_conf 2579999855679998654117 No 96 >cd06455 M3A_TOP Peptidase M3 Thimet oligopeptidase (TOP; PZ-peptidase; endo-oligopeptidase A; endopeptidase 24.15; soluble metallo-endopeptidase; EC 3.4.24.15) family also includes neurolysin (endopeptidase 24.16, microsomal endopeptidase, mitochondrial oligopeptidase M, neurotensin endopeptidase, soluble angiotensin II-binding protein, thimet oligopeptidase II) which hydrolyzes oligopeptides such as neurotensin, bradykinin and dynorphin A. TOP and neurolysin are neuropeptidases expressed abundantly in the testis, but also found in the liver, lung and kidney. They are involved in the metabolism of neuropeptides under 20 amino acid residues long and cleave most bioactive peptides at the same sites, but recognize different positions on some naturally occurring and synthetic peptides; they cleave at distinct sites on the 13-residue bioactive peptide neurotensin, which modulates central dopaminergic and cholinergic circuits. TOP has been shown to degrade peptides released by the proteasom Probab=60.28 E-value=6 Score=19.34 Aligned_cols=22 Identities=14% Similarity=0.135 Sum_probs=15.5 Q ss_pred CHHHHHHHHHHHHHCCCCCHHH Q ss_conf 4369999999888278699999 Q gi|254780773|r 299 DGGHLITFLLEMIRGKSLGVSV 320 (349) Q Consensus 299 DGG~i~~~~~E~i~gr~i~~~~ 320 (349) -|++=...+++..+||+++.+. T Consensus 446 Ggs~~~~e~~~~F~GR~P~~~a 467 (472) T cd06455 446 GGSKDAADMLKDFLGREPNNDA 467 (472) T ss_pred CCCCCHHHHHHHHCCCCCCCHH T ss_conf 8991899999984699998157 No 97 >pfam00883 Peptidase_M17 Cytosol aminopeptidase family, catalytic domain. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. 
Probab=59.01 E-value=4.2 Score=20.42 Aligned_cols=29 Identities=28% Similarity=0.425 Sum_probs=24.2 Q ss_pred CCCCCCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 002456755653001456628888783045 Q gi|254780773|r 122 VSNVSPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 122 I~~V~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) +-...+|+|...| .+|||+|.+-||+.|+ T Consensus 135 i~~l~EN~is~~A-~rPgDVi~s~~GkTVE 163 (312) T pfam00883 135 VLALTENMISGTA-MRPGDIITAMNGKTVE 163 (312) T ss_pred EEEEECCCCCCCC-CCCCCEEEECCCCEEE T ss_conf 9870103789988-9999778917997898 No 98 >PRK04860 hypothetical protein; Provisional Probab=58.73 E-value=5 Score=19.88 Aligned_cols=18 Identities=50% Similarity=0.731 Sum_probs=8.8 Q ss_pred HHHHHHHHHHHHHHHCCEE Q ss_conf 9997337799999849750 Q gi|254780773|r 18 VVIHEFGHYMVARLCNIRV 36 (349) Q Consensus 18 v~iHE~GH~~~Ar~~gv~V 36 (349) |+-||+.|+++=+++| || T Consensus 66 vVpHEvAHllv~~lfG-rV 83 (160) T PRK04860 66 VVPHELAHLLVYQLFG-RV 83 (160) T ss_pred CCHHHHHHHHHHHHHC-CC T ss_conf 3358999999999808-87 No 99 >TIGR01697 PNPH-PUNA-XAPA inosine guanosine and xanthosine phosphorylase family; InterPro: IPR011268 This entry describes a subset of the phosphorylase family. The entry excludes the methylthioadenosine phosphorylases (MTAP, IPR010044 from INTERPRO), which are believed to play a specific role in the recycling of methionine from methylthioadenosine. This entry consists of three clades of purine phosphorylases based on a neighbour-joining tree using the MTAP family as an out group. The highest-branching clade (IPR011269 from INTERPRO) consists of a group of sequences from both Gram-positive and Gram-negative bacteria which have been shown to act as purine nucleotide phosphorylases but whose physiological substrate and role in vivo remain unknown . 
Of the two remaining clades, one is xanthosine phosphorylase (XAPA, IPR010943 from INTERPRO); it is limited to certain gammaproteobacteria and constitutes a special purine phosphorylase found in a specialised operon for xanthosine catabolism . The enzyme also acts on the same purines (inosine and guanosine) as the other characterised members of this subfamily, but is only induced when xanthosine must be degraded. The remaining and largest clade consists of purine nucleotide phosphorylases (PNPH, IPR011270 from INTERPRO) from metazoa and bacteria which act primarily on guanosine and inosine, and do not act on adenosine. Sequences from Clostridium and Thermotoga fall between these last two clades and are uncharacterised with respect to substrate range. ; GO: 0004731 purine-nucleoside phosphorylase activity, 0006139 nucleobase nucleoside nucleotide and nucleic acid metabolic process. Probab=57.77 E-value=6.2 Score=19.27 Aligned_cols=12 Identities=42% Similarity=0.470 Sum_probs=6.3 Q ss_pred HHHCCCCCEEEE Q ss_conf 300145662888 Q gi|254780773|r 133 IAGVKKGDCIIS 144 (349) Q Consensus 133 ~AGL~~GD~Il~ 144 (349) ++.+++||.++= T Consensus 103 n~~~~~GdLm~i 114 (266) T TIGR01697 103 NADFKPGDLMII 114 (266) T ss_pred CCCCCCCCEEEE T ss_conf 678898537997 No 100 >pfam06838 Alum_res Aluminium resistance protein. This family represents the aluminium resistance protein, which confers resistance to aluminium in bacteria. Probab=54.59 E-value=8.3 Score=18.39 Aligned_cols=42 Identities=19% Similarity=0.218 Sum_probs=29.2 Q ss_pred CHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEE Q ss_conf 556530014566288887830454001111101467886316 Q gi|254780773|r 129 SPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISL 170 (349) Q Consensus 129 spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i 170 (349) +-|--+=|+|||+++++-|+|-++.+++.-.-.+.+|.-.+. 
T Consensus 87 ~~aLfg~LrPGD~ll~itG~PYDTL~eVIGi~g~~~GSL~e~ 128 (405) T pfam06838 87 ATALFGVLRPGDELLYITGKPYDTLEEVIGIRGEGQGSLKDF 128 (405) T ss_pred HHHHHHCCCCCCEEEECCCCCCHHHHHHHCCCCCCCCCHHHH T ss_conf 999983678998787448997365898718588998888995 No 101 >cd06258 Peptidase_M3_like The peptidase M3-like family, also called neurolysin-like family, is part of the "zincins" metallopeptidases, and includes M3, M2 and M32 families of metallopeptidases. The M3 family is subdivided into two subfamilies: the widespread M3A, which comprises a number of high-molecular mass endo- and exopeptidases from bacteria, archaea, protozoa, fungi, plants and animals, and the small M3B, whose members are enzymes primarily from bacteria. Well-known mammalian/eukaryotic M3A endopeptidases are the thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (alias endopeptidase 3.4.24.16), and the mitochondrial intermediate peptidase. The first two are intracellular oligopeptidases, which act only on relatively short substrates of less than 20 amino acid residues, while the latter cleaves N-terminal octapeptides from proteins during their import into the mitochondria. The M3A subfamily also contains several bacterial endopeptidases, collectively called olig Probab=54.11 E-value=8.7 Score=18.25 Aligned_cols=21 Identities=14% Similarity=0.112 Sum_probs=13.5 Q ss_pred HHHHHHHHHHHHHCCCCCHHH Q ss_conf 369999999888278699999 Q gi|254780773|r 300 GGHLITFLLEMIRGKSLGVSV 320 (349) Q Consensus 300 GG~i~~~~~E~i~gr~i~~~~ 320 (349) |++=...++|..+||+++.+. 
T Consensus 340 ~s~~~~el~~~f~Gr~p~~~a 360 (365) T cd06258 340 NSEPWKELLKRATGEDPNADA 360 (365) T ss_pred CCCCHHHHHHHHHCCCCCHHH T ss_conf 899999999997785998078 No 102 >PRK00913 leucyl aminopeptidase; Provisional Probab=51.70 E-value=6.2 Score=19.24 Aligned_cols=27 Identities=26% Similarity=0.455 Sum_probs=20.2 Q ss_pred CCCCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 2456755653001456628888783045 Q gi|254780773|r 124 NVSPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 124 ~V~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) ...+|.|...| .+|||+|.+-||+.|+ T Consensus 308 ~~~ENm~~g~a-~~pgDvi~~~~GktvE 334 (491) T PRK00913 308 AACENMPSGNA-YRPGDVLTSMSGKTIE 334 (491) T ss_pred EHHHCCCCCCC-CCCCCEEEECCCCEEE T ss_conf 61213888889-9985557806994798 No 103 >KOG3714 consensus Probab=51.58 E-value=4.8 Score=20.04 Aligned_cols=14 Identities=29% Similarity=0.539 Sum_probs=5.9 Q ss_pred HHCCCCC-EEEEECC Q ss_conf 0014566-2888878 Q gi|254780773|r 134 AGVKKGD-CIISLDG 147 (349) Q Consensus 134 AGL~~GD-~Il~InG 147 (349) -|-+.|. ..++++. T Consensus 141 VGr~gg~~q~~sl~~ 155 (411) T KOG3714 141 VGRRGGGQQLLSLGD 155 (411) T ss_pred CCCCCCCCCCCCCCC T ss_conf 277788653211178 No 104 >cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. 
The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants. Probab=50.13 E-value=6.4 Score=19.18 Aligned_cols=27 Identities=30% Similarity=0.313 Sum_probs=20.0 Q ss_pred CCCCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 2456755653001456628888783045 Q gi|254780773|r 124 NVSPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 124 ~V~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) ...+|.|...| .+|||+|++-||+.|+ T Consensus 291 ~~~EN~~~~~a-~rpgDvi~~~~GktvE 317 (468) T cd00433 291 PLAENMISGNA-YRPGDVITSRSGKTVE 317 (468) T ss_pred EHHHCCCCCCC-CCCCCEEECCCCCEEE T ss_conf 62314878889-8984658827996899 No 105 >KOG1565 consensus Probab=48.51 E-value=6.2 Score=19.25 Aligned_cols=12 Identities=42% Similarity=0.841 Sum_probs=7.7 Q ss_pred HHHHHHHHHHHH Q ss_conf 999999733779 Q gi|254780773|r 15 IIIVVIHEFGHY 26 (349) Q Consensus 15 ~~~v~iHE~GH~ 26 (349) ...|++||.||. T Consensus 211 l~~Va~HEiGH~ 222 (469) T KOG1565 211 LFLVAAHEIGHA 222 (469) T ss_pred HHHHHHHHCCCC T ss_conf 677754301100 No 106 >TIGR02386 rpoC_TIGR DNA-directed RNA polymerase, beta' subunit; InterPro: IPR012754 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme . 
The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This entry represents the beta-prime subunit, RpoC, found in most bacteria. It excludes some, mainly cyanobacterial, species where RpoC is replaced by two homologous proteins that include an additional domain. One arm of the "claw" is predominantly formed by this subunit, the other being predominantly formed by the beta subunit. The active site of the enzyme is defined by three invariant aspartate residues within the beta-prime subunit.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006350 transcription.
Probab=47.96 E-value=5.7 Score=19.52 Aligned_cols=10 Identities=50% Similarity=0.743 Sum_probs=5.1 Q ss_pred CCCCCEEEEE Q ss_conf 1456628888 Q gi|254780773|r 136 VKKGDCIISL 145 (349) Q Consensus 136 L~~GD~Il~I 145 (349) +++||.|-++ T Consensus 1295 v~~GDIlAkl 1304 (1552) T TIGR02386 1295 VKPGDILAKL 1304 (1552) T ss_pred CCCCCEEEEC T ss_conf 4747478860 No 107 >PRK05113 electron transport complex protein RnfB; Provisional Probab=43.36 E-value=22 Score=15.40 Aligned_cols=34 Identities=6% Similarity=0.103 Sum_probs=24.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEE Q ss_conf 8999999999999999973377999998497500 Q gi|254780773|r 4 LDCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVL 37 (349) Q Consensus 4 ~~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~ 37 (349) |.++++.++.++.+=++--+.-.+++|+|.|.++ T Consensus 1 m~~il~av~~l~~lGl~~g~~L~~Ask~f~Ve~D 34 (184) T PRK05113 1 MNTIWIAVIALSLLALVFGAILGFASRRFKVEGD 34 (184) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCC T ss_conf 9159999999999999999999998762046799 No 108 >KOG2597 consensus Probab=43.08 E-value=11 Score=17.42 Aligned_cols=25 Identities=36% Similarity=0.434 Sum_probs=18.7 Q ss_pred CCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 56755653001456628888783045 Q gi|254780773|r 126 SPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 126 ~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) .+|+|...| -+|||+|++-||+.|. T Consensus 328 cENm~sg~A-~kpgDVit~~nGKtve 352 (513) T KOG2597 328 CENMPSGNA-TKPGDVITLRNGKTVE 352 (513) T ss_pred ECCCCCCCC-CCCCCEEEECCCCEEE T ss_conf 005887557-8987478812786787 No 109 >pfam04298 Zn_peptidase_2 Putative neutral zinc metallopeptidase. Zinc metallopeptidase zinc binding regions have been predicted in some family members by a pattern match (Prosite:PS00142). 
Probab=43.02 E-value=15 Score=16.58 Aligned_cols=32 Identities=13% Similarity=0.468 Sum_probs=14.8 Q ss_pred HHHHHHHHHHHH-HHHHHCCCC-CCHHHHHHHHH Q ss_conf 899999999999-996306472-24369999999 Q gi|254780773|r 277 IAFLAMFSWAIG-FMNLLPIPI-LDGGHLITFLL 308 (349) Q Consensus 277 l~~~a~isi~Lg-~~NlLPip~-LDGG~i~~~~~ 308 (349) +.+++++.+.++ +|.++-+|. +|-.+=..... T Consensus 143 l~~igi~lf~~~vlf~lvTLPVEfDAS~RAl~~L 176 (222) T pfam04298 143 LLLIGIILFSAAVLFQLVTLPVEFDASRRALAIL 176 (222) T ss_pred HHHHHHHHHHHHHHHHHEECCCEECCHHHHHHHH T ss_conf 9999999999999997132230026256799999 No 110 >cd04282 ZnMc_meprin Zinc-dependent metalloprotease, meprin_like subfamily. Meprins are membrane-bound or secreted extracellular proteases, which cleave a variety of targets, including peptides such as parathyroid hormone, gastrin, and cholecystokinin, cytokines such as osteopontin, and proteins such as collagen IV, fibronectin, casein and gelatin. Meprins may also be able to release proteins from the cell surface. Closely related meprin alpha- and beta-subunits form homo- and hetero-oligomers; these complexes are found on epithelial cells of the intestine, for example, and are also expressed in certain cancer cells. Probab=42.71 E-value=12 Score=17.30 Aligned_cols=11 Identities=27% Similarity=0.486 Sum_probs=4.5 Q ss_pred HCCCCCEEEEE Q ss_conf 01456628888 Q gi|254780773|r 135 GVKKGDCIISL 145 (349) Q Consensus 135 GL~~GD~Il~I 145 (349) |-+.|-..+++ T Consensus 104 G~~gg~Q~vsl 114 (230) T cd04282 104 GDQQGGQNLSI 114 (230) T ss_pred CCCCCEEEEEE T ss_conf 65388056650 No 111 >TIGR03296 M6dom_TIGR03296 M6 family metalloprotease domain. This model describes a metalloproteinase domain, with a characteristic HExxH motif. Examples of this domain are found in proteins in the family of immune inhibitor A, which cleaves antibacterial peptides, and in other, only distantly related proteases. This model is built to be broader and more inclusive than pfam05547. 
Probab=41.35 E-value=4.2 Score=20.45 Aligned_cols=11 Identities=27% Similarity=0.685 Sum_probs=5.5 Q ss_pred EEEECCCCCCC Q ss_conf 88887830454 Q gi|254780773|r 142 IISLDGITVSA 152 (349) Q Consensus 142 Il~InG~~V~s 152 (349) ...+||+++.+ T Consensus 140 ~~~~~G~~i~~ 150 (286) T TIGR03296 140 TLPIDGTTIGG 150 (286) T ss_pred CCCCCCEEEEC T ss_conf 41109978725 No 112 >COG2317 Zn-dependent carboxypeptidase [Amino acid transport and metabolism] Probab=40.76 E-value=16 Score=16.40 Aligned_cols=11 Identities=0% Similarity=0.184 Sum_probs=4.5 Q ss_pred CCCEEEEEECC Q ss_conf 86316999658 Q gi|254780773|r 166 HEISLVLYREH 176 (349) Q Consensus 166 ~~v~i~v~R~~ 176 (349) ..|.+|..-++ T Consensus 243 ~DVRITTRy~~ 253 (497) T COG2317 243 NDVRITTRYNE 253 (497) T ss_pred CCEEEEEECCC T ss_conf 75368740477 No 113 >COG4307 Uncharacterized protein conserved in bacteria [Function unknown] Probab=40.58 E-value=15 Score=16.53 Aligned_cols=17 Identities=29% Similarity=0.507 Sum_probs=13.6 Q ss_pred EEEEEEEEEECCEEECC Q ss_conf 07999977112111001 Q gi|254780773|r 56 VRWKVSLIPLGGYVSFS 72 (349) Q Consensus 56 t~y~i~~~PlGgyV~~~ 72 (349) +.|.+.+=+.||+|+-+ T Consensus 56 CNw~V~ad~~~~~C~aC 72 (349) T COG4307 56 CNWLVPADQLGGLCRAC 72 (349) T ss_pred HCCCCCCCCCCCCCHHH T ss_conf 34742146677612555 No 114 >KOG3984 consensus Probab=40.54 E-value=9.4 Score=18.02 Aligned_cols=28 Identities=14% Similarity=0.169 Sum_probs=16.4 Q ss_pred HHHHHHCCCCCEEEEECCCCCCCCHHHH Q ss_conf 5653001456628888783045400111 Q gi|254780773|r 130 PAAIAGVKKGDCIISLDGITVSAFEEVA 157 (349) Q Consensus 130 pA~~AGL~~GD~Il~InG~~V~s~~dl~ 157 (349) .+-+++.+.||..+--|-..+..+.+.. 
T Consensus 118 ggin~~f~vgdiMli~DHin~~G~agq~ 145 (286) T KOG3984 118 GGINPKFAVGDIMLIKDHINLPGLAGQN 145 (286) T ss_pred CCCCCCCCCCCEEEEECCCCCCCCCCCC T ss_conf 5768554224389970103774326899 No 115 >TIGR02290 M3_fam_3 oligoendopeptidase, pepF/M3 family; InterPro: IPR011977 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site . The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases . Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. 
Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This family consists of probable oligoendopeptidases belonging to MEROPS peptidase family M3, subfamily M3B (clan MA(E)). It is related to lactococcal PepF and group B streptococcal PepB (IPR004438 from INTERPRO) but in a distinct clade with considerable sequence differences not only to IPR004438 from INTERPRO but also to IPR011976 from INTERPRO. Members are not characterised with respect to their substrates and the activity profile may differ.. Probab=40.10 E-value=16 Score=16.37 Aligned_cols=16 Identities=31% Similarity=0.441 Sum_probs=7.4 Q ss_pred CCCHHHHHHHHHHHHH Q ss_conf 2101342112333211 Q gi|254780773|r 87 WKKILTVLAGPLANCV 102 (349) Q Consensus 87 ~~R~~i~~AGp~~N~i 102 (349) |++-...+|=++.|+- T Consensus 219 ~~~~~~~~a~~LN~l~ 234 (600) T TIGR02290 219 WEKEAPTLAAILNALA 234 (600) T ss_pred HHHHHHHHHHHHHHHH T ss_conf 8750026999986555 No 116 >COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only] Probab=40.01 E-value=19 Score=15.95 Aligned_cols=34 Identities=21% Similarity=0.387 Sum_probs=18.9 Q ss_pred ECCCCCCCCHHHHHHCCC--CCCCCCEEEEEECCCC Q ss_conf 878304540011111014--6788631699965873 Q gi|254780773|r 145 LDGITVSAFEEVAPYVRE--NPLHEISLVLYREHVG 178 (349) Q Consensus 145 InG~~V~s~~dl~~~i~~--~~g~~v~i~v~R~~~~ 178 (349) +-|.++-+.+|+.++++. 
.||++.++.+.++|+| T Consensus 272 ~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE 307 (356) T COG4956 272 LQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKE 307 (356) T ss_pred HCCCCEECHHHHHHHHCCCCCCCCEEEEEEEECCCC T ss_conf 448846308888887377315787168998506765 No 117 >COG3091 SprT Zn-dependent metalloprotease, SprT family [General function prediction only] Probab=39.93 E-value=21 Score=15.67 Aligned_cols=23 Identities=30% Similarity=0.355 Sum_probs=16.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHCCEE Q ss_conf 999999997337799999849750 Q gi|254780773|r 13 SLIIIVVIHEFGHYMVARLCNIRV 36 (349) Q Consensus 13 ~l~~~v~iHE~GH~~~Ar~~gv~V 36 (349) .+..=|+-||+.|+++=+.+| ++ T Consensus 59 ~f~~~vV~HELaHl~ly~~~g-r~ 81 (156) T COG3091 59 DFIEQVVPHELAHLHLYQEFG-RY 81 (156) T ss_pred HHHHHHHHHHHHHHHHHHHCC-CC T ss_conf 999998899999999999827-87 No 118 >cd06460 M32_Taq Peptidase family M32 is a subclass of metallocarboxypeptidases which are distributed mainly in bacteria and archaea, and contain a HEXXH motif that coordinates a divalent cation such as Zn2+ or Co2+, so far only observed in the active site of neutral metallopeptidases but not in carboxypeptidases. M32 includes the thermostable carboxypeptidases (E.C. 3.4.17.19) from Thermus aquaticus (TaqCP) and Pyrococcus furiosus (PfuCP), which have broad specificities toward a wide range of C-terminal substrates that include basic, aromatic, neutral and polar amino acids. These enzymes have a similar fold to the M3 peptidases such as neurolysin and the M2 angiotensin converting enzyme (ACE). 
Novel peptidases from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, are the first eukaryotic M32 enzymes identified so far, thus making these enzymes an attractive potential target for drug development against these o Probab=38.22 E-value=19 Score=15.90 Aligned_cols=16 Identities=25% Similarity=0.268 Sum_probs=12.7 Q ss_pred HHHHHHHHCCCCCHHH Q ss_conf 9999888278699999 Q gi|254780773|r 305 TFLLEMIRGKSLGVSV 320 (349) Q Consensus 305 ~~~~E~i~gr~i~~~~ 320 (349) -.+++.++|++++++. T Consensus 371 ~eLl~~~TGe~l~~~~ 386 (396) T cd06460 371 DELLKKATGEPLNPEY 386 (396) T ss_pred HHHHHHHHCCCCCHHH T ss_conf 9999998689998799 No 119 >COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance [Inorganic ion transport and metabolism] Probab=38.13 E-value=16 Score=16.44 Aligned_cols=40 Identities=20% Similarity=0.217 Sum_probs=27.9 Q ss_pred HHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEE Q ss_conf 6530014566288887830454001111101467886316 Q gi|254780773|r 131 AAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISL 170 (349) Q Consensus 131 A~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i 170 (349) |--.=|+|||..+.|-|.|-++.+|+.-.-.+..|.-.+. T Consensus 99 aLfg~LRpgDell~i~G~PYDTLeevIG~rg~~~gSL~df 138 (416) T COG4100 99 ALFGILRPGDELLYITGSPYDTLEEVIGLRGEGQGSLKDF 138 (416) T ss_pred HHHHCCCCCCEEEEECCCCCHHHHHHHCCCCCCCCCHHHH T ss_conf 9984268887477752886056898756277785427873 No 120 >pfam06167 MtfA Phosphoenolpyruvate:glucose-phosphotransferase regulator. MtfA (earlier known as YeeI) is a transcription factor A that binds Mlc (make large colonies), itself a repressor of glucose and hence a protein important in regulation of the phosphoenolpyruvate:glucose-phosphotransferase (ptsG) system, the major glucose transporter in E.coli. Mlc is a repressor of ptsG, and MtfA is found to bind and inactivate Mlc with high affinity. 
The membrane-bound protein EIICBGlc encoded by the ptsG gene is the major glucose transporter in Escherichia coli. Probab=37.44 E-value=15 Score=16.54 Aligned_cols=18 Identities=11% Similarity=0.145 Sum_probs=12.1 Q ss_pred ECCCCCCCCHHHHHHCCC Q ss_conf 878304540011111014 Q gi|254780773|r 145 LDGITVSAFEEVAPYVRE 162 (349) Q Consensus 145 InG~~V~s~~dl~~~i~~ 162 (349) -+|.-|-||+|+..-.+. T Consensus 128 ~~G~vvLSW~dv~~g~~~ 145 (248) T pfam06167 128 EQGPVILSWQDVLAGGAN 145 (248) T ss_pred CCCCEEEEHHHHHHHCCC T ss_conf 589579779999744427 No 121 >PRK13267 archaemetzincin-like protein; Reviewed Probab=36.78 E-value=14 Score=16.88 Aligned_cols=12 Identities=25% Similarity=0.507 Sum_probs=4.9 Q ss_pred CCCEEEEECCCC Q ss_conf 566288887830 Q gi|254780773|r 138 KGDCIISLDGIT 149 (349) Q Consensus 138 ~GD~Il~InG~~ 149 (349) ++|.++.|-+.. T Consensus 67 ~~~~vlgvt~~D 78 (177) T PRK13267 67 NGDATIGITDCD 78 (177) T ss_pred CCCEEEEEECCC T ss_conf 997299995254 No 122 >pfam08434 CLCA_N Calcium-activated chloride channel. The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells and has four or five putative transmembrane regions. Additionally to their role as chloride channels some CLCA proteins function as adhesion molecules and may also have roles as tumour suppressors. The domain described here is found at the N-terminus of CLCAs. Probab=36.78 E-value=13 Score=17.03 Aligned_cols=16 Identities=25% Similarity=0.295 Sum_probs=7.3 Q ss_pred CHHHHHHCCCCCEEEE Q ss_conf 5565300145662888 Q gi|254780773|r 129 SPAAIAGVKKGDCIIS 144 (349) Q Consensus 129 spA~~AGL~~GD~Il~ 144 (349) .+|...-.+.-|++++ T Consensus 91 ~~a~~Ety~~ADV~Va 106 (262) T pfam08434 91 LRPKQESYKKADVIVA 106 (262) T ss_pred CCCCHHHCCCCCEEEC T ss_conf 4530101046677965 No 123 >pfam05547 Peptidase_M6 Immune inhibitor A peptidase M6. 
The insect pathogenic Gram-positive Bacillus thuringiensis secretes immune inhibitor A, a metallopeptidase, which specifically cleaves host antibacterial proteins. A homologue of immune inhibitor A, PrtV, has been identified in the Gram-negative human pathogen Vibrio cholerae. Probab=35.85 E-value=3.9 Score=20.63 Aligned_cols=22 Identities=5% Similarity=0.015 Sum_probs=13.7 Q ss_pred CCHHHHHHCCCCCCCCCEEEEE Q ss_conf 4001111101467886316999 Q gi|254780773|r 152 AFEEVAPYVRENPLHEISLVLY 173 (349) Q Consensus 152 s~~dl~~~i~~~~g~~v~i~v~ 173 (349) .|-++.--++..+|++|++... T Consensus 419 ~Wv~~~~dLsayaGk~V~l~f~ 440 (646) T pfam05547 419 KWIDASYDLSAYAGKKVKLRFD 440 (646) T ss_pred CEEEEEECCHHHCCCEEEEEEE T ss_conf 5078775452426965799999 No 124 >pfam02074 Peptidase_M32 Carboxypeptidase Taq (M32) metallopeptidase. Probab=35.66 E-value=23 Score=15.35 Aligned_cols=16 Identities=19% Similarity=0.266 Sum_probs=11.0 Q ss_pred HHHHHHHHCCCCCHHH Q ss_conf 9999888278699999 Q gi|254780773|r 305 TFLLEMIRGKSLGVSV 320 (349) Q Consensus 305 ~~~~E~i~gr~i~~~~ 320 (349) -.+++.++|+|++++. 
T Consensus 470 ~eLi~~~TGe~l~~~~ 485 (494) T pfam02074 470 KELLKKATGEDVNAEY 485 (494) T ss_pred HHHHHHHHCCCCCHHH T ss_conf 9999998689998799 No 125 >COG3732 SrlE Phosphotransferase system sorbitol-specific component IIBC [Carbohydrate transport and metabolism] Probab=34.15 E-value=31 Score=14.44 Aligned_cols=56 Identities=20% Similarity=0.268 Sum_probs=33.5 Q ss_pred CCHH-HHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHH--------CCCCCCHHHHHHHHHHHHHC Q ss_conf 3012-23334565305034789999999999999630--------64722436999999988827 Q gi|254780773|r 258 PVGI-ARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLL--------PIPILDGGHLITFLLEMIRG 313 (349) Q Consensus 258 PVgI-a~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlL--------Pip~LDGG~i~~~~~E~i~g 313 (349) .+++ .|...+..++-+.+.+-|+|++|..+|+++-- ++-||-+.-.=..++-.|-+ T Consensus 167 vv~vffqagR~tid~vlknilPFMAFvs~lIgii~~tglGd~iA~~l~PLA~~~~Gll~la~iC~ 231 (328) T COG3732 167 VVAVFFQAGRDTIDTVLKNILPFMAFVSALIGIILYTGLGDVIAHGLSPLANSIWGLLLLALICS 231 (328) T ss_pred HHHHHHHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHC T ss_conf 20345552346899999865679999999999998720677875346512257299999999966 No 126 >COG1164 Oligoendopeptidase F [Amino acid transport and metabolism] Probab=34.13 E-value=29 Score=14.68 Aligned_cols=31 Identities=26% Similarity=0.301 Sum_probs=14.4 Q ss_pred HHHHH---CCCCCEEEE-EC--CCCCCCCHHHHHHCC Q ss_conf 65300---145662888-87--830454001111101 Q gi|254780773|r 131 AAIAG---VKKGDCIIS-LD--GITVSAFEEVAPYVR 161 (349) Q Consensus 131 A~~AG---L~~GD~Il~-In--G~~V~s~~dl~~~i~ 161 (349) |+.-| +++.|+=.- .+ -.+-.+++|..+.+. T Consensus 286 ~k~Lgl~~l~~yD~~~p~~~~~~~~~~s~~ea~~~v~ 322 (598) T COG1164 286 AKVLGLEKLRPYDLYAPLLDKDPSPEYSYEEAKELVL 322 (598) T ss_pred HHHCCCCCCCHHHCCCCCCCCCCCCCCCHHHHHHHHH T ss_conf 9871998588655079844577775666999999999 No 127 >TIGR01592 holin_SPP1 holin, SPP1 family; InterPro: IPR006479 This group of sequences represent one of more than 30 families of phage proteins, all lacking detectable homology with each other, known or believed to act as holins. 
Holins act in cell lysis by bacteriophage. Members of this family are found in phage PBSX and phage SPP1, among others. . Probab=33.68 E-value=31 Score=14.39 Aligned_cols=24 Identities=17% Similarity=0.469 Sum_probs=21.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHCCCC Q ss_conf 347899999999999996306472 Q gi|254780773|r 274 NAYIAFLAMFSWAIGFMNLLPIPI 297 (349) Q Consensus 274 ~~~l~~~a~isi~Lg~~NlLPip~ 297 (349) ...|..+||+|=.|.+.|.=|||. T Consensus 8 RtiLL~iALvNQ~L~~~g~~~iPv 31 (82) T TIGR01592 8 RTILLIIALVNQFLAMKGISPIPV 31 (82) T ss_pred HHHHHHHHHHHHHHHHCCCCCCCC T ss_conf 899999999999999758974430 No 128 >COG4219 MecR1 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component [Transcription / Signal transduction mechanisms] Probab=32.98 E-value=20 Score=15.80 Aligned_cols=16 Identities=13% Similarity=-0.195 Sum_probs=7.9 Q ss_pred HHHHHCCCCCCHHHHH Q ss_conf 9963064722436999 Q gi|254780773|r 289 FMNLLPIPILDGGHLI 304 (349) Q Consensus 289 ~~NlLPip~LDGG~i~ 304 (349) ..|+.|=-+.-|-|-+ T Consensus 269 ~~~l~~~~~~g~nk~l 284 (337) T COG4219 269 NPPLACHWPAGGNKPL 284 (337) T ss_pred CCCCCCCCCCCCCCHH T ss_conf 8986544434566438 No 129 >PRK07118 ferredoxin; Validated Probab=31.09 E-value=35 Score=14.11 Aligned_cols=34 Identities=9% Similarity=0.070 Sum_probs=26.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEE Q ss_conf 8999999999999999973377999998497500 Q gi|254780773|r 4 LDCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVL 37 (349) Q Consensus 4 ~~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~ 37 (349) |..++..++.|+.+=++--+.-.+++|.|.|.++ T Consensus 1 m~~il~~~~~l~~lg~~~g~~L~~ask~f~Ve~D 34 (276) T PRK07118 1 MNLILFAVLSLGALGLVFGILLAFASKKFAVEED 34 (276) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCC T ss_conf 9129999999999999999999998650036789 No 130 >cd03506 Delta6-FADS-like The Delta6 Fatty Acid Desaturase (Delta6-FADS)-like CD includes the integral-membrane enzymes: delta-4, delta-5, delta-6, delta-8, delta-8-sphingolipid, and delta-11 desaturases found in vertebrates, higher plants, 
fungi, and bacteria. These desaturases are required for the synthesis of highly unsaturated fatty acids (HUFAs), which are mainly esterified into phospholipids and contribute to maintaining membrane fluidity. While HUFAs may be required for cold tolerance in bacteria, plants and fish, the primary role of HUFAs in mammals is cell signaling. These enzymes are described as front-end desaturases because they introduce a double bond between the pre-existing double bond and the carboxyl (front) end of the fatty acid. Various substrates are involved, with both acyl-coenzyme A (CoA) and acyl-lipid desaturases present in this CD. Acyl-lipid desaturases are localized in the membranes of cyanobacterial thylakoid, plant endoplasmic reticulum (ER), and plastid; an Probab=31.01 E-value=35 Score=14.10 Aligned_cols=32 Identities=19% Similarity=-0.040 Sum_probs=19.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEE Q ss_conf 9999999999999999733779999984975004 Q gi|254780773|r 5 DCFLLYTVSLIIIVVIHEFGHYMVARLCNIRVLS 38 (349) Q Consensus 5 ~~~~~~~~~l~~~v~iHE~GH~~~Ar~~gv~V~~ 38 (349) .++++....-......||.+|.-+.|.- ++++ T Consensus 3 ~~~~~G~~~~~~~~l~HEa~H~~~~~~~--~~N~ 34 (204) T cd03506 3 LAILLGLFWAQGGFLAHDAGHGQVFKNR--WLNK 34 (204) T ss_pred HHHHHHHHHHHHHHHHHHHHHHCCCCCC--HHHH T ss_conf 9999999999999999973111302880--9999 No 131 >pfam10462 Peptidase_M66 Peptidase M66. This family of metallopeptidases contains StcE, a virulence factor found in Shiga toxigenic Escherichia coli organisms. StcE peptidase cleaves C1 esterase inhibitor. Probab=30.54 E-value=7.7 Score=18.61 Aligned_cols=15 Identities=7% Similarity=0.093 Sum_probs=11.7 Q ss_pred CEEEEEEEEEECCEE Q ss_conf 807999977112111 Q gi|254780773|r 55 GVRWKVSLIPLGGYV 69 (349) Q Consensus 55 ~t~y~i~~~PlGgyV 69 (349) .+|..+.-||+|=|. T Consensus 46 p~~L~i~~i~~gmlt 60 (304) T pfam10462 46 PYELLIQTLDFGMLT 60 (304) T ss_pred CCEEEEEEECEEEEC T ss_conf 864588554448454 No 132 >pfam11667 DUF3267 Protein of unknown function (DUF3267). 
This family of proteins has no known function. Probab=30.49 E-value=9 Score=18.15 Aligned_cols=21 Identities=19% Similarity=0.471 Sum_probs=16.7 Q ss_pred HHHHHHHHHHHHHHHHHHCCE Q ss_conf 999999733779999984975 Q gi|254780773|r 15 IIIVVIHEFGHYMVARLCNIR 35 (349) Q Consensus 15 ~~~v~iHE~GH~~~Ar~~gv~ 35 (349) .+++.+||+=|.+..|.++=+ T Consensus 4 ~~~~~iHE~iH~i~f~~f~~~ 24 (107) T pfam11667 4 LLLLILHELLHGIFFKLFGKS 24 (107) T ss_pred EEEHHHHHHHHHHHHHHCCCC T ss_conf 984499999999688860156 No 133 >pfam01435 Peptidase_M48 Peptidase family M48. Probab=29.88 E-value=26 Score=15.02 Aligned_cols=21 Identities=19% Similarity=0.172 Sum_probs=11.1 Q ss_pred HHHHHHCCCCCEEEEECCCCC Q ss_conf 565300145662888878304 Q gi|254780773|r 130 PAAIAGVKKGDCIISLDGITV 150 (349) Q Consensus 130 pA~~AGL~~GD~Il~InG~~V 150 (349) -|+++|+....++.-+|+.++ T Consensus 40 l~~~~~~~~~~~v~v~~~~~~ 60 (222) T pfam01435 40 VADRLGLPAGPEVYVVDSPVR 60 (222) T ss_pred HHHHCCCCCCCEEEEECCCCC T ss_conf 999769999987888748998 No 134 >COG2707 Predicted membrane protein [Function unknown] Probab=29.57 E-value=37 Score=13.94 Aligned_cols=23 Identities=13% Similarity=0.304 Sum_probs=12.8 Q ss_pred HHHHHHCCCCCCHHHHHHHHHHH Q ss_conf 99963064722436999999988 Q gi|254780773|r 288 GFMNLLPIPILDGGHLITFLLEM 310 (349) Q Consensus 288 g~~NlLPip~LDGG~i~~~~~E~ 310 (349) |+|+=.|.=||++..++..+..+ T Consensus 125 a~fgGvpvGPlIaaGil~l~~~k 147 (151) T COG2707 125 ALFGGVPVGPLIAAGILSLLVGK 147 (151) T ss_pred EEECCCCCCHHHHHHHHHHHHHH T ss_conf 33678324247888799999988 No 135 >PRK05015 aminopeptidase B; Provisional Probab=29.56 E-value=7.9 Score=18.52 Aligned_cols=27 Identities=30% Similarity=0.231 Sum_probs=22.5 Q ss_pred CCCCCCHHHHHHCCCCCEEEEECCCCCC Q ss_conf 2456755653001456628888783045 Q gi|254780773|r 124 NVSPASPAAIAGVKKGDCIISLDGITVS 151 (349) Q Consensus 124 ~V~~~spA~~AGL~~GD~Il~InG~~V~ 151 (349) -..+|.|...| .+|||+|.+-||+.|+ T Consensus 242 ~~aENm~sg~A-~rPGDVit~~nGkTVE 268 (424) T PRK05015 242 CCAENLISGNA-FKLGDIITYKNGKTVE 268 (424) T ss_pred 
EEECCCCCCCC-CCCHHHHHHCCCCEEE T ss_conf 85505778778-8838889763997799 No 136 >PRK09194 prolyl-tRNA synthetase; Provisional Probab=29.15 E-value=34 Score=14.17 Aligned_cols=21 Identities=29% Similarity=0.661 Sum_probs=17.7 Q ss_pred ECCCHHEEEEECCCEEEEEEE Q ss_conf 068311278605980799997 Q gi|254780773|r 42 GFGPELIGITSRSGVRWKVSL 62 (349) Q Consensus 42 Gfgp~l~~~~~k~~t~y~i~~ 62 (349) .+|+.+|++++|++.+|.+++ T Consensus 88 ~~g~el~r~kDR~~~~~~L~P 108 (570) T PRK09194 88 KYGPELLRLKDRHGRDFVLGP 108 (570) T ss_pred HCCHHHEEEECCCCCEEEECC T ss_conf 236134698527898565378 No 137 >cd03507 Delta12-FADS-like The Delta12 Fatty Acid Desaturase (Delta12-FADS)-like CD includes the integral-membrane enzymes, delta-12 acyl-lipid desaturases, oleate 12-hydroxylases, omega3 and omega6 fatty acid desaturases, and other related proteins, found in a wide range of organisms including higher plants, green algae, diatoms, nematodes, fungi, and bacteria. The expression of these proteins appears to be temperature dependent: decreases in temperature result in increased levels of fatty acid desaturation within membrane lipids subsequently altering cell membrane fluidity. An important enzyme for the production of polyunsaturates in plants is the oleate delta-12 desaturase (Arabidopsis FAD2) of the endoplasmic reticulum. This enzyme accepts l-acyl-2-oleoyl-sn-glycero-3-phosphocholine as substrate and requires NADH:cytochrome b oxidoreductase, cytochrome b, and oxygen for activity. FAD2 converts oleate(18:1) to linoleate (18:2) and is closely related to oleate 12-hydroxylase which cat Probab=29.08 E-value=37 Score=13.88 Aligned_cols=27 Identities=15% Similarity=0.032 Sum_probs=18.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999999999999999733779999984 Q gi|254780773|r 6 CFLLYTVSLIIIVVIHEFGHYMVARLC 32 (349) Q Consensus 6 ~~~~~~~~l~~~v~iHE~GH~~~Ar~~ 32 (349) .++.+....++-+..||.||..+.|-. 
T Consensus 37 ~~~~G~~~~~lfvl~HDa~H~s~~k~r 63 (222) T cd03507 37 WIVQGLFLTGLFVLGHDCGHGSFSDNR 63 (222) T ss_pred HHHHHHHHHHHHHHHHHHHHHCCCCCC T ss_conf 999999999999999981323036881 No 138 >TIGR00819 ydaH AbgT transporter family; InterPro: IPR011540 The p-aminobenzoyl-glutamate transporter family includes two putative transporters, the AbgT protein of Escherichia coli and MtrF of Neisseria gonorrhoeae. AbgT expression is apparently cryptic in wild type cells, but when present on a high copy number plasmid, or when expressed at higher levels due to mutation, it allows utilization of p-aminobenzoyl-glutamate as a source of p-aminobenzoate for p-aminobenzoate auxotrophs. p-Aminobenzoate is a constituent of, and a precursor for, the biosynthesis of folic acid. It is not currently known if AbgT is naturally involved in transporting p-aminobenzoyl-glutamate, or if it only becomes involved when under altered regulation. MtrF is an inner membrane protein which, together with the MtrCDE efflux pump, is required for high-level resistance to hydrophobic antimicrobial agents in N. gonorrhoeae. Its role in this process is not known, but it has been suggested that it may be a component of the efflux pump which is dispensable for basal activity, but required for high-level activity. 
Probab=28.56 E-value=31 Score=14.48 Aligned_cols=30 Identities=23% Similarity=0.331 Sum_probs=18.8 Q ss_pred HHHHHHHHHHHHHHHHHHH---------HHCCCCCCHHH Q ss_conf 0347899999999999996---------30647224369 Q gi|254780773|r 273 FNAYIAFLAMFSWAIGFMN---------LLPIPILDGGH 302 (349) Q Consensus 273 ~~~~l~~~a~isi~Lg~~N---------lLPip~LDGG~ 302 (349) +..|..+.+++|+.+|=-- ..|+-.|-|-| T Consensus 403 F~Gl~L~~~F~~l~I~S~SA~W~~~APIFVPMlML~Gy~ 441 (527) T TIGR00819 403 FVGLILLSAFLNLFIASASAIWAVLAPIFVPMLMLLGYA 441 (527) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 999999999999998615679988613788999860465 No 139 >TIGR01393 lepA GTP-binding protein LepA; InterPro: IPR006297 LepA (GUF1 in Saccaromyces) is a GTP-binding membrane protein related to EF-G and EF-Tu. Two types of phylogenetic tree, rooted by other GTP-binding proteins, suggest that eukaryotic homologs (including GUF1 of yeast) originated within the bacterial LepA family. The function of the proteins in this family are unknown. ; GO: 0005525 GTP binding. Probab=28.51 E-value=20 Score=15.72 Aligned_cols=35 Identities=26% Similarity=0.410 Sum_probs=23.1 Q ss_pred CCEEEEEEEECCCHHEEEE-ECCCEEEEEEEEEECCEEECC Q ss_conf 9750045530683112786-059807999977112111001 Q gi|254780773|r 33 NIRVLSFSVGFGPELIGIT-SRSGVRWKVSLIPLGGYVSFS 72 (349) Q Consensus 33 gv~V~~FsiGfgp~l~~~~-~k~~t~y~i~~~PlGgyV~~~ 72 (349) |+.++.=|+ =+.|| .++|.+|.|.+|===|=|-|+ T Consensus 50 GITIK~qaV-----~l~Yk~~~DGe~Y~LNLIDTPGHVDFs 85 (598) T TIGR01393 50 GITIKAQAV-----RLKYKVAKDGETYVLNLIDTPGHVDFS 85 (598) T ss_pred CCEEECCCE-----EEEEEEECCCCEEEEEEECCCCCCCCC T ss_conf 820115634-----753375338878899645288972127 No 140 >pfam00262 Calreticulin Calreticulin family. Probab=27.40 E-value=32 Score=14.34 Aligned_cols=25 Identities=20% Similarity=0.424 Sum_probs=20.7 Q ss_pred EECCEEECCCCCCCHHHHCCCCCCC Q ss_conf 1121110012445506650368421 Q gi|254780773|r 64 PLGGYVSFSEDEKDMRSFFCAAPWK 88 (349) Q Consensus 64 PlGgyV~~~~~e~~~~~f~~~~~~~ 88 (349) ==|||+|+-.++.|.+.|..++++. 
T Consensus 87 CGGaYiKLl~~~~d~~~f~~~TpY~ 111 (359) T pfam00262 87 CGGAYIKLLSKDFDQKKFSGETPYT 111 (359) T ss_pred CCCEEEEECCCCCCHHHCCCCCCEE T ss_conf 3530467525768987858998605 No 141 >TIGR02149 glgA_Coryne glycogen synthase, Corynebacterium family; InterPro: IPR011875 This entry describes Corynebacterium glutamicum GlgA and closely related proteins in other species. This enzyme is required for glycogen biosynthesis and appears to replace the distantly related IPR011835 from INTERPRO, family of ADP-glucose type glycogen synthase in Corynebacterium glutamicum, Mycobacterium tuberculosis, Bifidobacterium longum, and Streptomyces coelicolor .. Probab=27.35 E-value=40 Score=13.68 Aligned_cols=72 Identities=21% Similarity=0.265 Sum_probs=44.7 Q ss_pred CHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCC-CCCEEEEEECCCCEEEECC--CCCCCCCCCCCEEEEEEEECC Q ss_conf 5565300145662888878304540011111014678-8631699965873131011--211005654310000122034 Q gi|254780773|r 129 SPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPL-HEISLVLYREHVGVLHLKV--MPRLQDTVDRFGIKRQVPSVG 205 (349) Q Consensus 129 spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g-~~v~i~v~R~~~~~~~~~v--~p~~~~~~~~~g~~~~~~~ig 205 (349) |=+|+-+++.-|+|++|.+-.= +| +++..|+ ++-.+.|.|||.++.+..- --...+..+++|+.++.|++= T Consensus 152 sW~EktA~~aAd~vIAVS~amr---~D---iL~~YP~lD~~kv~Vv~NGId~~~y~~~~~~~~~~v~~~~Gid~~rP~~l 225 (416) T TIGR02149 152 SWAEKTAIEAADRVIAVSGAMR---ED---ILKVYPDLDPEKVHVVYNGIDTKEYKPAADDDGNKVLDRYGIDRSRPYVL 225 (416) T ss_pred HHHHHHHHHHCCCEEEHHHHCH---HH---HHCCCCCCCCCCEEEEECCCCHHHHCCCCCCCHHHHHHHHCCCCCCCEEE T ss_conf 4788889985040653111033---55---83158688846468886476457606888874113466326799888789 Q ss_pred C Q ss_conf 4 Q gi|254780773|r 206 I 206 (349) Q Consensus 206 i 206 (349) + T Consensus 226 F 226 (416) T TIGR02149 226 F 226 (416) T ss_pred E T ss_conf 8 No 142 >COG4043 Preprotein translocase subunit Sec61beta [Intracellular trafficking, secretion, and vesicular transport] Probab=27.19 E-value=28 Score=14.70 Aligned_cols=37 Identities=30% Similarity=0.623 
Sum_probs=23.3 Q ss_pred HHHHHCCCCCEEEEECC-------CCCCCCHHHHHHCCCCCCCCC Q ss_conf 65300145662888878-------304540011111014678863 Q gi|254780773|r 131 AAIAGVKKGDCIISLDG-------ITVSAFEEVAPYVRENPLHEI 168 (349) Q Consensus 131 A~~AGL~~GD~Il~InG-------~~V~s~~dl~~~i~~~~g~~v 168 (349) ++..++++||+|+ .|| ..+.+++.+.+.+++-|-+.+ T Consensus 29 ~krr~ik~GD~Ii-F~~~~l~v~V~~vr~Y~tF~~mlreepiE~v 72 (111) T COG4043 29 PKRRQIKPGDKII-FNGDKLKVEVIDVRVYDTFEEMLREEPIENV 72 (111) T ss_pred HHHCCCCCCCEEE-ECCCEEEEEEEEEEEHHHHHHHHHHCCHHHH T ss_conf 7662789899899-8388467999987605689999985486651 No 143 >PRK07740 hypothetical protein; Provisional Probab=27.01 E-value=19 Score=15.92 Aligned_cols=21 Identities=24% Similarity=0.465 Sum_probs=13.5 Q ss_pred HHHHCCC--CCEEEEECCCCCCC Q ss_conf 5300145--66288887830454 Q gi|254780773|r 132 AIAGVKK--GDCIISLDGITVSA 152 (349) Q Consensus 132 ~~AGL~~--GD~Il~InG~~V~s 152 (349) |.-|+.| ||+|++|-..++.+ T Consensus 65 ETTGl~p~~gD~IIeIgAVkv~~ 87 (240) T PRK07740 65 ETTGFSPDQGDEILSIAAVKTVG 87 (240) T ss_pred ECCCCCCCCCCEEEEEEEEEEEC T ss_conf 58998988898789998999999 No 144 >TIGR01687 moaD_arch MoaD family protein; InterPro: IPR010038 Members of this family appear to be archaeal and bacterial (proteobacteria and Thermus) versions of MoaD, subunit 1 of molybdopterin converting factor.. Probab=26.71 E-value=37 Score=13.93 Aligned_cols=47 Identities=17% Similarity=0.304 Sum_probs=26.9 Q ss_pred CCCCCCCCCHHHHHHCCCC-------CEEEEECCC-CCCCCHHHHHHCCCCCCCCCE Q ss_conf 0002456755653001456-------628888783-045400111110146788631 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKG-------DCIISLDGI-TVSAFEEVAPYVRENPLHEIS 169 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~G-------D~Il~InG~-~V~s~~dl~~~i~~~~g~~v~ 169 (349) .+......+|-..+.+-.. +.++.+||+ .++..+++...+++ |++|. 
T Consensus 31 ll~~l~~~Yp~~~~e~~~et~~~~~~~v~ilvNGran~~~l~GL~~~Lkd--GD~va 85 (93) T TIGR01687 31 LLEELSSRYPKEFSELFKETGLGLVPNVIILVNGRANVDWLEGLETELKD--GDVVA 85 (93) T ss_pred HHHHHHHHCCHHHHHHHCCCCCCCCCEEEEEECCCCCCCCCCCCCCCCCC--CCEEE T ss_conf 89998861565566651477887646578985164143220365752327--87567 No 145 >pfam09101 Exotox-A_bind Exotoxin A binding. Members of this family are found in Pseudomonas aeruginosa exotoxin A, and are responsible for binding of the toxin to the alpha-2-macroglobulin receptor, with subsequent internalisation into endosomes. The domain adopts a thirteen-strand antiparallel beta jelly roll topology, which belongs to the Concanavalin A-like lectins/glucanases fold superfamily. Probab=26.41 E-value=41 Score=13.57 Aligned_cols=13 Identities=31% Similarity=0.399 Sum_probs=5.9 Q ss_pred CCCEEEEECCCCC Q ss_conf 5662888878304 Q gi|254780773|r 138 KGDCIISLDGITV 150 (349) Q Consensus 138 ~GD~Il~InG~~V 150 (349) |.|.|..-||+-| T Consensus 31 p~~~i~dtngegv 43 (274) T pfam09101 31 PDDAIADTNGEGV 43 (274) T ss_pred CCCCCCCCCCCCE T ss_conf 9840007888737 No 146 >COG0442 ProS Prolyl-tRNA synthetase [Translation, ribosomal structure and biogenesis] Probab=26.26 E-value=41 Score=13.61 Aligned_cols=21 Identities=33% Similarity=0.629 Sum_probs=18.2 Q ss_pred ECCCHHEEEEECCCEEEEEEE Q ss_conf 068311278605980799997 Q gi|254780773|r 42 GFGPELIGITSRSGVRWKVSL 62 (349) Q Consensus 42 Gfgp~l~~~~~k~~t~y~i~~ 62 (349) ||||.+++.|++++-+|.+++ T Consensus 88 ~f~~El~~v~drg~~~l~L~P 108 (500) T COG0442 88 GFGPELFRVKDRGDRPLALRP 108 (500) T ss_pred CCCHHHEEEECCCCCEEEECC T ss_conf 036444899716996343578 No 147 >cd03510 Rhizobitoxine-FADS-like This CD includes the dihydrorhizobitoxine fatty acid desaturase (RtxC) characterized in Bradyrhizobium japonicum USDA110, and other related proteins. Dihydrorhizobitoxine desaturase is reported to be involved in the final step of rhizobitoxine biosynthesis. 
This domain family appears to be structurally related to the membrane fatty acid desaturases and the alkane hydroxylases. They all share in common extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of sequences also reveals the existence of three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXXH, HXX(X)HH, and HXXHH. These histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within homologs, stearoyl CoA desaturase and alkane hydroxylase. Probab=24.96 E-value=44 Score=13.39 Aligned_cols=25 Identities=16% Similarity=0.259 Sum_probs=17.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCEEEE Q ss_conf 999999999733779999984975004 Q gi|254780773|r 12 VSLIIIVVIHEFGHYMVARLCNIRVLS 38 (349) Q Consensus 12 ~~l~~~v~iHE~GH~~~Ar~~gv~V~~ 38 (349) ..-.+.+..||..|..+.|.- +.++ T Consensus 31 ~~~~l~~~~Hea~H~~~~~~~--~~N~ 55 (175) T cd03510 31 RQRALAILMHDAAHGLLFRNR--RLND 55 (175) T ss_pred HHHHHHHHHHHHHHHHHCCCC--HHHH T ss_conf 999999999999987204885--2999 No 148 >PRK04897 heat shock protein HtpX; Provisional Probab=24.95 E-value=35 Score=14.04 Aligned_cols=17 Identities=12% Similarity=0.219 Sum_probs=7.3 Q ss_pred HHHHHCCCCCEEEEECC Q ss_conf 65300145662888878 Q gi|254780773|r 131 AAIAGVKKGDCIISLDG 147 (349) Q Consensus 131 A~~AGL~~GD~Il~InG 147 (349) |...|..+.+..+.+.. T Consensus 110 AFatG~~~~~~~V~vt~ 126 (298) T PRK04897 110 AFATGSDPKNAAVAVTT 126 (298) T ss_pred EEEEECCCCCEEEEECH T ss_conf 69950688887899617 No 149 >TIGR02737 caa3_CtaG cytochrome c oxidase assembly factor CtaG; InterPro: IPR014108 Members of this family are the CtaG protein required for assembly of active cytochrome c oxidase of the caa3 type, as found in Bacillus subtilis.. 
Probab=24.50 E-value=45 Score=13.33 Aligned_cols=36 Identities=19% Similarity=0.296 Sum_probs=29.8 Q ss_pred HHHHHHC---CCCCHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9988827---86999999999999999999999999999 Q gi|254780773|r 307 LLEMIRG---KSLGVSVTRVITRMGLCIILFLFFLGIRN 342 (349) Q Consensus 307 ~~E~i~g---r~i~~~~~~~~~~~g~~ll~~l~i~~~~n 342 (349) +||++.. ||-.+++....+.==++|+++..+|.+|+ T Consensus 100 l~~~i~~klr~p~v~~v~k~~t~PliALLLFNGLFSlYH 138 (286) T TIGR02737 100 LYEKIFEKLRRPFVKAVLKFLTKPLIALLLFNGLFSLYH 138 (286) T ss_pred HHHHHHHHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 899999885413899999997308999999845788754 No 150 >pfam02031 Peptidase_M7 Streptomyces extracellular neutral proteinase (M7) family. Probab=24.01 E-value=19 Score=15.84 Aligned_cols=22 Identities=14% Similarity=0.203 Sum_probs=8.4 Q ss_pred CCHHHHHHCCCCCCCCCEEEEE Q ss_conf 4001111101467886316999 Q gi|254780773|r 152 AFEEVAPYVRENPLHEISLVLY 173 (349) Q Consensus 152 s~~dl~~~i~~~~g~~v~i~v~ 173 (349) .|++-+.-++-.++...++++. T Consensus 24 iWN~sV~NV~L~~g~~a~~~~~ 45 (132) T pfam02031 24 IWNSSVSNVRLQEGSNADFTYY 45 (132) T ss_pred HHHCCCCCEEEECCCCCCEEEE T ss_conf 8745577467504888777999 No 151 >PRK01827 thyA thymidylate synthase; Reviewed Probab=23.83 E-value=46 Score=13.25 Aligned_cols=50 Identities=16% Similarity=0.256 Sum_probs=34.2 Q ss_pred HHHHHHCCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEECCCCE Q ss_conf 56530014566288887830454--001111101467886316999658731 Q gi|254780773|r 130 PAAIAGVKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYREHVGV 179 (349) Q Consensus 130 pA~~AGL~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~~~~~ 179 (349) -|...|++||+-+-.+.+.-|.. .+.+.+.++..|-..-++.+.++-... 
T Consensus 188 iA~~~gl~pg~l~~~~gD~HIY~nHie~vkeql~R~P~~~P~L~i~~~~~~i 239 (264) T PRK01827 188 IAQQTGLKVGEFVHTIGDAHIYSNHLEQARLQLSREPRPLPKLVINPDIKSI 239 (264) T ss_pred HHHHHCCCCCEEEEEEEEEEEHHHHHHHHHHHHCCCCCCCCEEEECCCCCCC T ss_conf 9998587477899992067865768999999856799999968968999971 No 152 >COG4273 Uncharacterized conserved protein [Function unknown] Probab=23.60 E-value=47 Score=13.22 Aligned_cols=33 Identities=24% Similarity=0.468 Sum_probs=25.6 Q ss_pred CCCCCCCCCHHHHHHCCCCCEEEEECCCCCCCC Q ss_conf 000245675565300145662888878304540 Q gi|254780773|r 121 VVSNVSPASPAAIAGVKKGDCIISLDGITVSAF 153 (349) Q Consensus 121 ~I~~V~~~spA~~AGL~~GD~Il~InG~~V~s~ 153 (349) -+.+|-.++|+..--=+.|++|+.+||-|..-. T Consensus 49 C~agvg~gv~~l~~~arsgrrIlalDGCp~~Ca 81 (135) T COG4273 49 CTAGVGAGVPALVDAARSGRRILALDGCPLRCA 81 (135) T ss_pred EEECCCCCCHHHHHHHHCCCCEEEECCCHHHHH T ss_conf 231015784889987643783697549737889 No 153 >TIGR03284 thym_sym thymidylate synthase. Members of this protein family are thymidylate synthase, an enzyme that produces dTMP from dUMP. In prokaryotes, its gene usually is found close to that for dihydrofolate reductase, and in some systems the two enzymes are found as a fusion protein. This model excludes a set of related proteins (TIGR03283) that appears to replace this family in archaeal methanogens, where tetrahydrofolate is replaced by tetrahydromethanopterin. Probab=23.53 E-value=47 Score=13.21 Aligned_cols=50 Identities=18% Similarity=0.235 Sum_probs=35.8 Q ss_pred HHHHHHCCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEECCCCE Q ss_conf 56530014566288887830454--001111101467886316999658731 Q gi|254780773|r 130 PAAIAGVKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYREHVGV 179 (349) Q Consensus 130 pA~~AGL~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~~~~~ 179 (349) -|...|++||+.|-.+.+.-|.. 
.+++.+.++..|-.--++.+.++-.++ T Consensus 220 iA~~~gl~pg~l~~~~gDaHIY~nHie~v~eql~R~P~~~P~L~i~~~~~~i 271 (296) T TIGR03284 220 IAQETGLEVGEFVHTLGDAHLYSNHLEQAKLQLTREPRPLPKLKLNPEKKDI 271 (296) T ss_pred HHHHHCCCCCEEEEECCEEEEHHHHHHHHHHHHCCCCCCCCEEEECCCCCCC T ss_conf 9998487577799991365767868999999966899999988967998863 No 154 >PRK03982 heat shock protein HtpX; Provisional Probab=23.43 E-value=39 Score=13.72 Aligned_cols=31 Identities=10% Similarity=0.118 Sum_probs=13.9 Q ss_pred HHHHHCCCCCEEEEECCCCC--CCCHHHHHHCC Q ss_conf 65300145662888878304--54001111101 Q gi|254780773|r 131 AAIAGVKKGDCIISLDGITV--SAFEEVAPYVR 161 (349) Q Consensus 131 A~~AGL~~GD~Il~InG~~V--~s~~dl~~~i~ 161 (349) |...|..+.+..+.+..-=+ -+-+|+...+. T Consensus 98 AFa~G~~~~~~~V~vt~GLL~~L~~dEl~aVlA 130 (288) T PRK03982 98 AFATGRDPKHAVVAVTEGILNLLNEDELEGVIA 130 (288) T ss_pred EEEECCCCCCEEEEEEHHHHHHCCHHHHHHHHH T ss_conf 687268999858996489987389999999999 No 155 >pfam04228 Zn_peptidase Putative neutral zinc metallopeptidase. Members of this family have a predicted zinc binding motif characteristic of neutral zinc metallopeptidases (Prosite:PDOC00129). Probab=23.37 E-value=33 Score=14.25 Aligned_cols=14 Identities=29% Similarity=0.380 Sum_probs=6.9 Q ss_pred CCCCCEEEEEECCC Q ss_conf 78863169996587 Q gi|254780773|r 164 PLHEISLVLYREHV 177 (349) Q Consensus 164 ~g~~v~i~v~R~~~ 177 (349) +-++-++.+.++.. 
T Consensus 113 ~Y~~P~lvlf~~~~ 126 (292) T pfam04228 113 TYQPPTLVLYSGVT 126 (292) T ss_pred CCCCCEEEEECCCC T ss_conf 88998589841886 No 156 >KOG3638 consensus Probab=23.06 E-value=48 Score=13.15 Aligned_cols=61 Identities=16% Similarity=0.337 Sum_probs=37.5 Q ss_pred CCCCHHHHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCCC Q ss_conf 56755653001456628888783045400111110146788631699965873131011211 Q gi|254780773|r 126 SPASPAAIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMPR 187 (349) Q Consensus 126 ~~~spA~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p~ 187 (349) .++..-....|.+||.+++-|-.+-..++.+...+...|.+..++.+..-... .++.++|+ T Consensus 218 ~~~~~k~m~el~iGD~Vla~~~~~~~~~spv~~~lhR~pe~~~~F~~i~t~~g-~~l~lT~~ 278 (414) T KOG3638 218 EQGGRKRMDELSIGDYVLAADQGGQTTYSPVALFLHREPEARAEFVVIETEQG-ETLQLTPQ 278 (414) T ss_pred EECCEEECCCCCCCCEEECCCCCCCCCCCCHHHHHCCCCCCCCCCEEEECCCC-CCCCCCCC T ss_conf 50642435778989847612668841137224221558544554168865777-64543420 No 157 >COG3127 Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component [Secondary metabolites biosynthesis, transport, and catabolism] Probab=22.90 E-value=47 Score=13.21 Aligned_cols=44 Identities=9% Similarity=-0.145 Sum_probs=20.8 Q ss_pred CHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 43699999998882786999999999999999999999999999 Q gi|254780773|r 299 DGGHLITFLLEMIRGKSLGVSVTRVITRMGLCIILFLFFLGIRN 342 (349) Q Consensus 299 DGG~i~~~~~E~i~gr~i~~~~~~~~~~~g~~ll~~l~i~~~~n 342 (349) -|+-....++-.++.=|..+.+.-..-.++++.-.++-+..+|. T Consensus 773 ~~~aa~w~v~~~vf~lp~~pd~al~~~v~~l~~~~g~~l~G~w~ 816 (829) T COG3127 773 GAEAAAWVLVAKVFDLPWSPDWALWTIVLALVGAVGLGLAGGWL 816 (829) T ss_pred HHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 54589999999985588987568999999999999999877899 No 158 >TIGR03616 RutG pyrimidine utilization transport protein G. This protein is observed in operons extremely similar to that characterized in E. 
coli K-12 responsible for the import and catabolism of pyrimidines, primarily uracil. This protein is a member of the uracil-xanthine permease family defined by TIGR00801, as well as the Nucleobase:Cation Symporter-2 (NCS2) Family (TC 2.A.40). Probab=22.83 E-value=48 Score=13.12 Aligned_cols=84 Identities=10% Similarity=0.227 Sum_probs=51.6 Q ss_pred CCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHH------CCCCCHHHHHHHH Q ss_conf 2246673012233345653050347899999999999996306472243699999998882------7869999999999 Q gi|254780773|r 252 LNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFLLEMIR------GKSLGVSVTRVIT 325 (349) Q Consensus 252 ~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~~E~i~------gr~i~~~~~~~~~ 325 (349) ..|=.|-+++-++.+.-.-+--..++-+++++.-.=+++..+|-|.|-|--++....=... .++++.+..+.+. T Consensus 307 fsqNiGvi~lT~V~SR~V~~~aa~ilillg~~pK~gal~~sIP~pVlGG~~lvmFGmIaa~Giril~~~~vd~~~~rNl~ 386 (429) T TIGR03616 307 YAENIGVMAVTKVYSTLVFVAAAVFAILLGFSPKFGALIHTIPVAVLGGASIVVFGLIAVAGARIWVQNKVDLTQNGNLI 386 (429) T ss_pred HHCCHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCHH T ss_conf 10122355422766409999999999999988999999985878788699999999999999999997046868875014 Q ss_pred HHHHHHHHHH Q ss_conf 9999999999 Q gi|254780773|r 326 RMGLCIILFL 335 (349) Q Consensus 326 ~~g~~ll~~l 335 (349) .+++.+.+++ T Consensus 387 iva~sl~~G~ 396 (429) T TIGR03616 387 MVAVTLVLGA 396 (429) T ss_pred HHHHHHHHHC T ss_conf 6899999856 No 159 >KOG0662 consensus Probab=22.13 E-value=38 Score=13.86 Aligned_cols=11 Identities=18% Similarity=0.320 Sum_probs=5.2 Q ss_pred CCCEEEEEECC Q ss_conf 86316999658 Q gi|254780773|r 166 HEISLVLYREH 176 (349) Q Consensus 166 ~~v~i~v~R~~ 176 (349) ++.++.+.|+| T Consensus 128 kpqnllin~ng 138 (292) T KOG0662 128 KPQNLLINRNG 138 (292) T ss_pred CCCEEEECCCC T ss_conf 71107753688 No 160 >pfam00303 Thymidylat_synt Thymidylate synthase. 
Probab=22.11 E-value=50 Score=13.02 Aligned_cols=47 Identities=17% Similarity=0.269 Sum_probs=31.5 Q ss_pred HHHHHHCCCCCEEEEECCCCCCC--CHHHHHHCCCCCCCCCEEEEEECC Q ss_conf 56530014566288887830454--001111101467886316999658 Q gi|254780773|r 130 PAAIAGVKKGDCIISLDGITVSA--FEEVAPYVRENPLHEISLVLYREH 176 (349) Q Consensus 130 pA~~AGL~~GD~Il~InG~~V~s--~~dl~~~i~~~~g~~v~i~v~R~~ 176 (349) -|...|+++|+.+..+++.-|.. .+.+.+.++..|...-++.+.++. T Consensus 186 iA~~~gl~~G~~~~~~gd~HIY~~h~~~v~~ql~r~p~~~P~l~i~~~~ 234 (262) T pfam00303 186 IAQVTGLEPGEFIHTIGDAHIYENHVDQVKEQLSREPRPFPKLKINPEV 234 (262) T ss_pred HHHHHCCCCCEEEEEEEEEEEHHHHHHHHHHHHCCCCCCCCEEEECCCC T ss_conf 9998578678899993355647869999999956799999978967998 No 161 >pfam03926 consensus Probab=21.79 E-value=50 Score=12.98 Aligned_cols=16 Identities=38% Similarity=0.488 Sum_probs=7.1 Q ss_pred HHHHHHHHHHHHHHHH Q ss_conf 9999733779999984 Q gi|254780773|r 17 IVVIHEFGHYMVARLC 32 (349) Q Consensus 17 ~v~iHE~GH~~~Ar~~ 32 (349) -|+-||+-||.+=... T Consensus 59 ~vI~HEL~Hy~lh~~~ 74 (149) T pfam03926 59 QVVPHELAHLHLYQLF 74 (149) T ss_pred HHHHHHHHHHHHHHHC T ss_conf 3179999999999986 No 162 >COG2738 Predicted Zn-dependent protease [General function prediction only] Probab=21.67 E-value=51 Score=12.96 Aligned_cols=35 Identities=17% Similarity=0.378 Sum_probs=23.4 Q ss_pred HHHHHHHHHHHHH-HHHHHCCCC-CCHHHHHHHHHHH Q ss_conf 7899999999999-996306472-2436999999988 Q gi|254780773|r 276 YIAFLAMFSWAIG-FMNLLPIPI-LDGGHLITFLLEM 310 (349) Q Consensus 276 ~l~~~a~isi~Lg-~~NlLPip~-LDGG~i~~~~~E~ 310 (349) .+.|++.+=++.+ +|+|.-+|. .|-.+=..-.+|. 
T Consensus 144 ~ll~lGiiLfs~aVLF~lvTLPVEFDAS~RA~~~l~~ 180 (226) T COG2738 144 GLLWLGIILFSAAVLFQLVTLPVEFDASSRALKQLES 180 (226) T ss_pred HHHHHHHHHHHHHHHHHEEECCEEECCHHHHHHHHHH T ss_conf 8999999999999998223123121511889999984 No 163 >KOG2090 consensus Probab=21.58 E-value=48 Score=13.12 Aligned_cols=21 Identities=19% Similarity=0.257 Sum_probs=8.3 Q ss_pred HCCCCCCHHHHHHHHHHHHHC Q ss_conf 064722436999999988827 Q gi|254780773|r 293 LPIPILDGGHLITFLLEMIRG 313 (349) Q Consensus 293 LPip~LDGG~i~~~~~E~i~g 313 (349) .|-+.-|--+++..+-++..| T Consensus 581 ~p~~~~~~td~~~~v~rk~~~ 601 (704) T KOG2090 581 CPLIAEDTTDLLSEVKRKFSG 601 (704) T ss_pred CCCCCCCHHHHHHHHHHHCCC T ss_conf 456554568999999986068 No 164 >PRK02391 heat shock protein HtpX; Provisional Probab=21.11 E-value=47 Score=13.20 Aligned_cols=31 Identities=13% Similarity=0.100 Sum_probs=14.5 Q ss_pred HHHHHCCCCCEEEEECCC--CCCCCHHHHHHCC Q ss_conf 653001456628888783--0454001111101 Q gi|254780773|r 131 AAIAGVKKGDCIISLDGI--TVSAFEEVAPYVR 161 (349) Q Consensus 131 A~~AGL~~GD~Il~InG~--~V~s~~dl~~~i~ 161 (349) |-..|..+.+.++.+.-- +.-+-+|+..++. T Consensus 107 AFAtG~~~~~a~V~vT~GLL~~L~~dEL~aVLA 139 (297) T PRK02391 107 AFATGRSPKNAVVCVTTGLLRRLDPEELEAVLA 139 (297) T ss_pred EEEECCCCCCEEEEECHHHHHHCCHHHHHHHHH T ss_conf 257458998769997579886399999999999 No 165 >PRK10720 uracil transporter; Provisional Probab=20.82 E-value=53 Score=12.85 Aligned_cols=83 Identities=12% Similarity=0.288 Sum_probs=50.7 Q ss_pred CCCCCCCCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHH-H-----HHHHCCCCCHHHHHHH Q ss_conf 022466730122333456530503478999999999999963064722436999999-9-----8882786999999999 Q gi|254780773|r 251 RLNQISGPVGIARIAKNFFDHGFNAYIAFLAMFSWAIGFMNLLPIPILDGGHLITFL-L-----EMIRGKSLGVSVTRVI 324 (349) Q Consensus 251 ~~~~lsGPVgIa~~~~~~a~~G~~~~l~~~a~isi~Lg~~NlLPip~LDGG~i~~~~-~-----E~i~gr~i~~~~~~~~ 324 (349) +..|=.|-+++-++.+.-.-+.-..++-.++++.-.=+++..+|-|.+-|-.++..- + ..+...+++....+.. 
T Consensus 287 TyseNiGvi~lT~V~SR~V~~~aavilIllg~~pK~galiasIP~pVlGG~~ivlFG~Iaa~Gir~L~~~~vd~~~~rNl 366 (429) T PRK10720 287 TYGENIGVMAITRVYSTWVIGGAAIFAILLSCVGKLAAAIQSIPLPVMGGVSLLLYGVIGASGIRVLIESKVDYNKAQNL 366 (429) T ss_pred CCCCCCCEEEEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCH T ss_conf 43565121445045510879999999999986699999998687878869999999999999999999714776888601 Q ss_pred HHHHHHHHH Q ss_conf 999999999 Q gi|254780773|r 325 TRMGLCIIL 333 (349) Q Consensus 325 ~~~g~~ll~ 333 (349) ..++..+.+ T Consensus 367 ~ivav~l~~ 375 (429) T PRK10720 367 ILTSVILII 375 (429) T ss_pred HHHHHHHHH T ss_conf 453999996 No 166 >PRK05973 replicative DNA helicase; Provisional Probab=20.81 E-value=50 Score=12.98 Aligned_cols=22 Identities=18% Similarity=0.453 Sum_probs=12.0 Q ss_pred CCCCCCCHHHH--HHCCCCCEEEE Q ss_conf 02456755653--00145662888 Q gi|254780773|r 123 SNVSPASPAAI--AGVKKGDCIIS 144 (349) Q Consensus 123 ~~V~~~spA~~--AGL~~GD~Il~ 144 (349) ......+||.+ .||+|||.|+= T Consensus 46 ~~~~~~~pa~~l~~gLqPGDLIIl 69 (237) T PRK05973 46 AKAAATTPAEELFGQLRPGDLVLL 69 (237) T ss_pred HHHCCCCCHHHHHCCCCCCCEEEE T ss_conf 763036958998568998677999 No 167 >PRK11067 outer membrane protein assembly factor YaeT; Provisional Probab=20.76 E-value=53 Score=12.84 Aligned_cols=11 Identities=27% Similarity=0.510 Sum_probs=3.9 Q ss_pred CEEEEEEEEEE Q ss_conf 80799997711 Q gi|254780773|r 55 GVRWKVSLIPL 65 (349) Q Consensus 55 ~t~y~i~~~Pl 65 (349) |-.|.++-+=+ T Consensus 263 G~~Y~~g~i~~ 273 (801) T PRK11067 263 GDQYKLSGVQV 273 (801) T ss_pred CCEEEEEEEEE T ss_conf 97498723799 No 168 >PHA01511 coat protein Probab=20.62 E-value=53 Score=12.82 Aligned_cols=67 Identities=18% Similarity=0.215 Sum_probs=37.5 Q ss_pred HHHHCCCCCEEEEECCCCCCCCHHHHHHCCCCCCCCCEEEEEECCCCEEEECCCCCCCCCCCCCEEEEEEEECC Q ss_conf 53001456628888783045400111110146788631699965873131011211005654310000122034 Q gi|254780773|r 132 AIAGVKKGDCIISLDGITVSAFEEVAPYVRENPLHEISLVLYREHVGVLHLKVMPRLQDTVDRFGIKRQVPSVG 205 (349) Q Consensus 132 
~~AGL~~GD~Il~InG~~V~s~~dl~~~i~~~~g~~v~i~v~R~~~~~~~~~v~p~~~~~~~~~g~~~~~~~ig 205 (349) +.+|++.||.+. +.....+..+-++.-++..++.|.|--.. .++++.|+.....+......+.++-. T Consensus 263 aTtg~~~Gd~ft------iaGV~~~h~itK~~tgq~~tFrVv~~~sg-tti~I~Pkpi~~d~~~~s~~~kaYaN 329 (430) T PHA01511 263 EECCCCCCCEEE------EECHHHHHHHHHHCCCCCCEEEEEEECCC-CEEEEECCCCCCCCCCCCHHHHCCCC T ss_conf 404742144578------81136887753520488625899996158-66797024422666776612201244 No 169 >PRK05457 heat shock protein HtpX; Provisional Probab=20.61 E-value=45 Score=13.31 Aligned_cols=31 Identities=16% Similarity=0.210 Sum_probs=13.9 Q ss_pred HHHHHCCCCCEEEEECCCC--CCCCHHHHHHCC Q ss_conf 6530014566288887830--454001111101 Q gi|254780773|r 131 AAIAGVKKGDCIISLDGIT--VSAFEEVAPYVR 161 (349) Q Consensus 131 A~~AGL~~GD~Il~InG~~--V~s~~dl~~~i~ 161 (349) |-.+|-.+.+-.+.+.--- .-|-+|+...+. T Consensus 107 AFAtG~~p~~a~VaVT~GLL~~L~~dELegVlA 139 (289) T PRK05457 107 AFATGASKNNSLVAVSTGLLQNMSRDEVEAVLA 139 (289) T ss_pred EEECCCCCCCEEEEECHHHHHHCCHHHHHHHHH T ss_conf 014268988879998579997679999999999 No 170 >TIGR02299 HpaE 5-carboxymethyl-2-hydroxymuconate semialdehyde dehydrogenase; InterPro: IPR011985 This entry represents the dehydrogenase responsible for the conversion of 5-carboxymethyl-2-hydroxymuconate semialdehyde to 5-carboxymethyl-2-hydroxymuconate (a tricarboxylic acid). This is the step in the degradation of 4-hydroxyphenylacetic acid via homoprotocatechuate following the oxidative opening of the aromatic ring. Probab=20.23 E-value=28 Score=14.69 Aligned_cols=18 Identities=11% Similarity=0.268 Sum_probs=7.4 Q ss_pred CCCCCCCCCCHHHHHHHH Q ss_conf 002246673012233345 Q gi|254780773|r 250 TRLNQISGPVGIARIAKN 267 (349) Q Consensus 250 ~~~~~lsGPVgIa~~~~~ 267 (349) .+..+|.-|=|=.+.+|. T Consensus 445 ~NVR~Lp~PFGG~K~SG~ 462 (494) T TIGR02299 445 QNVRDLPTPFGGVKASGI 462 (494) T ss_pred CCCCCCCCCCCCCCCCCC T ss_conf 455787357886178867 Done!