Query gi|254780383|ref|YP_003064796.1| endonuclease III [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 227
No_of_seqs 149 out of 3690
Neff 6.6
Searched_HMMs 39220
Date Sun May 29 15:37:26 2011
Command /home/congqian_1/programs/hhpred/hhsearch -i 254780383.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 TIGR01083 nth endonuclease III 100.0 0 0 498.5 11.1 191 23-213 2-192 (192)
2 COG0177 Nth Predicted EndoIII- 100.0 0 0 474.1 18.7 207 21-227 2-208 (211)
3 PRK10702 endonuclease III; Pro 100.0 0 0 461.5 19.5 207 21-227 2-208 (211)
4 PRK10880 adenine DNA glycosyla 100.0 0 0 352.5 16.4 201 26-227 4-213 (350)
5 TIGR01084 mutY A/G-specific ad 100.0 0 0 346.7 8.9 198 29-227 3-223 (297)
6 KOG1921 consensus 100.0 0 0 336.0 13.6 205 20-224 36-259 (286)
7 COG1194 MutY A/G-specific DNA 100.0 1.1E-44 0 321.5 14.2 204 23-227 6-218 (342)
8 PRK13910 DNA glycosylase MutY; 100.0 4.2E-44 0 317.7 10.0 168 56-225 1-170 (290)
9 smart00478 ENDO3c endonuclease 100.0 1.8E-39 4.5E-44 286.6 11.0 148 57-204 1-149 (149)
10 COG2231 Uncharacterized protei 100.0 4.3E-38 1.1E-42 277.2 17.7 207 19-227 2-215 (215)
11 PRK13913 3-methyladenine DNA g 100.0 1.6E-37 4.1E-42 273.4 11.6 182 27-209 6-217 (218)
12 KOG2457 consensus 100.0 2.1E-36 5.4E-41 265.8 13.2 203 23-226 88-309 (555)
13 cd00056 ENDO3c endonuclease II 100.0 1.5E-35 3.9E-40 260.0 11.9 153 49-202 1-158 (158)
14 pfam00730 HhH-GPD HhH-GPD supe 100.0 2.2E-29 5.5E-34 218.6 7.9 136 53-188 1-144 (144)
15 TIGR00588 ogg 8-oxoguanine DNA 99.8 1.1E-18 2.9E-23 146.4 6.5 127 45-173 145-321 (379)
16 COG0122 AlkA 3-methyladenine D 99.7 4.3E-15 1.1E-19 122.3 14.7 127 46-177 103-248 (285)
17 KOG2875 consensus 99.6 9.3E-16 2.4E-20 126.8 7.4 118 45-166 114-257 (323)
18 PRK10308 3-methyl-adenine DNA 99.4 1.4E-11 3.6E-16 98.6 14.8 164 23-196 89-271 (283)
19 KOG1918 consensus 99.2 2.3E-10 5.8E-15 90.5 10.6 143 39-188 66-228 (254)
20 PRK01229 N-glycosylase/DNA lya 99.0 9.8E-09 2.5E-13
79.5 11.7 131 46-184 35-182 (208) 21 COG1059 Thermostable 8-oxoguan 98.7 2.1E-07 5.4E-12 70.5 8.9 132 46-184 37-184 (210) 22 TIGR03252 uncharacterized HhH- 98.0 2.4E-05 6.1E-10 56.7 7.3 106 42-147 12-134 (177) 23 pfam00633 HHH Helix-hairpin-he 97.3 6.5E-05 1.7E-09 53.8 1.1 29 119-147 2-30 (30) 24 smart00525 FES FES domain. iro 96.5 0.00092 2.3E-08 46.0 1.3 22 205-226 1-22 (26) 25 pfam09674 DUF2400 Protein of u 95.2 0.012 3.1E-07 38.4 2.4 42 153-194 174-221 (230) 26 TIGR00575 dnlj DNA ligase, NAD 95.0 0.04 1E-06 35.0 4.3 27 128-155 564-590 (706) 27 TIGR02757 TIGR02757 conserved 94.5 0.018 4.6E-07 37.3 1.7 44 152-195 211-261 (269) 28 TIGR00084 ruvA Holliday juncti 94.5 0.04 1E-06 35.0 3.4 84 59-151 56-142 (217) 29 PRK10353 3-methyl-adenine DNA 94.2 0.33 8.5E-06 28.8 7.6 68 48-115 31-102 (189) 30 pfam10576 EndIII_4Fe-2S Iron-s 94.1 0.02 5.2E-07 37.0 1.2 20 206-225 1-20 (26) 31 PRK07956 ligA NAD-dependent DN 94.0 0.16 4E-06 31.0 5.6 79 67-148 455-563 (668) 32 TIGR00426 TIGR00426 competence 93.6 0.053 1.3E-06 34.2 2.5 55 83-145 11-65 (70) 33 PRK08097 ligB NAD-dependent DN 93.5 0.19 4.9E-06 30.4 5.3 23 125-147 518-540 (563) 34 smart00483 POLXc DNA polymeras 93.2 0.57 1.5E-05 27.2 7.3 147 23-204 4-160 (334) 35 TIGR00615 recR recombination p 92.6 0.06 1.5E-06 33.8 1.6 28 123-150 7-34 (205) 36 PRK00024 radC DNA repair prote 92.3 0.16 4.1E-06 30.9 3.5 66 50-119 27-92 (224) 37 PRK00116 ruvA Holliday junctio 91.8 0.33 8.5E-06 28.8 4.6 72 77-151 57-131 (198) 38 PRK00024 radC DNA repair prote 91.6 0.35 9E-06 28.6 4.5 66 83-151 21-90 (224) 39 COG2818 Tag 3-methyladenine DN 91.2 1.4 3.6E-05 24.6 8.4 74 48-121 32-112 (188) 40 pfam05559 DUF763 Protein of un 91.0 0.42 1.1E-05 28.1 4.5 29 124-152 265-296 (319) 41 KOG2841 consensus 91.0 0.55 1.4E-05 27.3 5.1 20 126-145 225-244 (254) 42 PRK00076 recR recombination pr 90.8 0.15 3.7E-06 31.2 2.0 26 127-152 10-35 (197) 43 PRK13844 recombination protein 90.8 0.14 3.5E-06 31.4 1.8 27 125-151 12-38 (200) 44 TIGR01259 comE comEA protein; 90.5 
0.15 3.7E-06 31.2 1.8 49 85-142 68-116 (124) 45 COG0353 RecR Recombinational D 90.4 0.15 3.9E-06 31.0 1.8 26 126-151 10-35 (198) 46 PRK08609 hypothetical protein; 90.3 0.34 8.7E-06 28.7 3.5 49 95-145 53-105 (570) 47 PRK13901 ruvA Holliday junctio 89.8 0.59 1.5E-05 27.1 4.4 71 78-151 57-130 (196) 48 pfam03352 Adenine_glyco Methyl 89.6 1.9 4.8E-05 23.7 7.6 73 48-120 26-105 (179) 49 COG0272 Lig NAD-dependent DNA 89.3 0.4 1E-05 28.2 3.2 70 90-167 506-575 (667) 50 TIGR01259 comE comEA protein; 89.2 0.19 5E-06 30.3 1.6 23 125-147 69-91 (124) 51 PRK13901 ruvA Holliday junctio 89.1 0.34 8.6E-06 28.7 2.7 75 127-202 71-155 (196) 52 PRK13266 Thf1-like protein; Re 88.5 2.2 5.7E-05 23.2 10.6 123 22-151 4-134 (224) 53 COG1555 ComEA DNA uptake prote 88.5 0.28 7.2E-06 29.2 2.0 23 126-148 95-117 (149) 54 cd00141 NT_POLXc Nucleotidyltr 88.0 0.88 2.2E-05 25.9 4.3 98 95-205 50-157 (307) 55 smart00278 HhH1 Helix-hairpin- 87.1 0.22 5.7E-06 29.9 0.8 23 128-150 1-23 (26) 56 COG0632 RuvA Holliday junction 86.9 0.38 9.6E-06 28.4 1.9 67 83-151 63-131 (201) 57 PRK01172 ski2-like helicase; P 86.4 1 2.6E-05 25.6 3.8 48 123-170 607-654 (674) 58 PRK00254 ski2-like helicase; P 86.4 0.62 1.6E-05 27.0 2.7 47 123-169 639-685 (717) 59 TIGR00593 pola DNA polymerase 85.8 0.47 1.2E-05 27.7 1.9 37 23-62 471-507 (1005) 60 PRK05929 consensus 85.2 0.74 1.9E-05 26.4 2.6 14 133-146 190-203 (870) 61 PRK07456 consensus 85.0 0.76 1.9E-05 26.4 2.6 13 49-61 473-485 (975) 62 PRK06887 consensus 84.9 0.73 1.9E-05 26.5 2.5 11 49-59 468-478 (954) 63 PRK07945 hypothetical protein; 84.9 1.1 2.9E-05 25.2 3.5 23 128-150 49-71 (335) 64 COG0632 RuvA Holliday junction 84.8 0.48 1.2E-05 27.7 1.5 57 127-183 72-129 (201) 65 PRK07625 consensus 84.5 0.81 2.1E-05 26.2 2.6 11 49-59 435-445 (922) 66 PRK07898 consensus 84.0 0.88 2.3E-05 25.9 2.6 12 134-145 208-219 (902) 67 PRK08434 consensus 83.6 0.95 2.4E-05 25.7 2.6 14 48-61 406-419 (887) 68 PRK08928 consensus 83.4 0.95 2.4E-05 25.7 2.6 14 133-146 191-204 (861) 69 PRK08835 consensus 
83.4 0.96 2.5E-05 25.7 2.6 24 194-219 661-684 (931) 70 PRK07556 consensus 83.3 0.98 2.5E-05 25.6 2.6 26 193-220 703-728 (977) 71 KOG2534 consensus 83.0 0.95 2.4E-05 25.7 2.5 20 126-145 54-73 (353) 72 PRK07300 consensus 83.0 1 2.6E-05 25.5 2.6 14 133-146 201-214 (880) 73 PRK08076 consensus 82.8 1.1 2.7E-05 25.4 2.6 14 133-146 193-206 (877) 74 PRK05797 consensus 82.8 1 2.6E-05 25.5 2.5 12 134-145 194-205 (869) 75 pfam11731 Cdd1 Pathogenicity l 82.5 1.3 3.2E-05 24.9 2.9 42 126-167 10-51 (92) 76 COG2003 RadC DNA repair protei 82.1 3.3 8.5E-05 22.0 4.9 61 50-113 27-87 (224) 77 COG1555 ComEA DNA uptake prote 82.0 1.1 2.7E-05 25.4 2.4 55 83-146 91-145 (149) 78 PRK00116 ruvA Holliday junctio 81.3 0.71 1.8E-05 26.6 1.3 55 127-182 72-128 (198) 79 pfam07834 RanGAP1_C RanGAP1 C- 80.8 5 0.00013 20.9 6.0 111 73-195 19-143 (169) 80 PRK05755 DNA polymerase I; Pro 80.2 1.5 3.9E-05 24.4 2.6 14 48-61 404-417 (889) 81 COG1796 POL4 DNA polymerase IV 79.6 2.7 7E-05 22.6 3.8 22 118-139 83-104 (326) 82 PRK05692 hydroxymethylglutaryl 79.0 2.4 6E-05 23.0 3.3 24 12-35 15-38 (287) 83 PRK07757 acetyltransferase; Pr 78.3 2 5.1E-05 23.5 2.8 50 134-193 80-132 (152) 84 pfam04919 DUF655 Protein of un 77.7 3 7.8E-05 22.3 3.6 62 77-147 72-135 (181) 85 COG1415 Uncharacterized conser 75.9 1.9 4.9E-05 23.6 2.1 29 124-152 274-305 (373) 86 pfam02371 Transposase_20 Trans 75.4 1.4 3.5E-05 24.6 1.3 43 128-173 2-44 (87) 87 TIGR03060 PS_II_psb29 photosys 74.1 7.6 0.00019 19.6 10.1 118 22-146 4-129 (214) 88 pfam01367 5_3_exonuc 5'-3' exo 72.4 1.5 3.9E-05 24.3 0.9 22 128-150 18-39 (100) 89 COG4277 Predicted DNA-binding 69.4 2.2 5.7E-05 23.2 1.2 22 126-147 328-349 (404) 90 COG1491 Predicted RNA-binding 68.0 3.1 7.9E-05 22.3 1.7 55 126-186 128-184 (202) 91 smart00475 53EXOc 5'-3' exonuc 68.0 4 0.0001 21.5 2.3 20 130-150 188-207 (259) 92 COG1948 MUS81 ERCC4-type nucle 66.9 9.5 0.00024 19.0 4.0 20 128-147 182-201 (254) 93 PRK02406 DNA polymerase IV; Va 66.2 9.6 0.00025 18.9 3.9 75 131-207 183-276 (355) 94 TIGR02236 
recomb_radA DNA repa 63.9 7 0.00018 19.9 2.9 45 70-117 13-58 (333) 95 cd00080 HhH2_motif Helix-hairp 61.6 7.5 0.00019 19.7 2.7 22 128-150 22-43 (75) 96 PRK00558 uvrC excinuclease ABC 60.8 8 0.0002 19.5 2.7 25 17-41 132-156 (609) 97 pfam07067 DUF1340 Protein of u 59.9 14 0.00037 17.8 4.2 100 63-165 105-209 (236) 98 PHA00439 exonuclease 59.0 3.9 0.0001 21.5 0.9 17 131-147 190-206 (288) 99 cd01972 Nitrogenase_VnfE_like 59.0 15 0.00038 17.7 6.2 28 73-100 166-193 (426) 100 PRK12308 bifunctional arginino 58.3 15 0.00039 17.6 6.2 136 63-217 442-608 (614) 101 PRK09482 xni exonuclease IX; P 58.1 4.3 0.00011 21.3 0.9 21 129-150 183-203 (256) 102 pfam03118 RNA_pol_A_CTD Bacter 56.4 8.5 0.00022 19.3 2.2 49 94-145 9-57 (62) 103 COG0258 Exo 5'-3' exonuclease 56.3 9.8 0.00025 18.9 2.5 18 132-150 202-219 (310) 104 TIGR03674 fen_arch flap struct 56.2 14 0.00035 17.9 3.3 17 131-147 239-255 (338) 105 KOG1201 consensus 55.1 17 0.00044 17.2 6.6 89 61-151 96-198 (300) 106 PRK05179 rpsM 30S ribosomal pr 53.8 7.8 0.0002 19.5 1.7 26 122-147 8-36 (122) 107 KOG2518 consensus 53.5 15 0.00039 17.6 3.1 23 128-151 225-247 (556) 108 smart00279 HhH2 Helix-hairpin- 52.6 6 0.00015 20.3 0.9 17 130-146 18-34 (36) 109 TIGR03631 bact_S13 30S ribosom 52.5 7.2 0.00018 19.8 1.3 26 122-147 6-34 (113) 110 cd00008 53EXOc 5'-3' exonuclea 51.9 6.2 0.00016 20.2 0.9 18 130-147 185-202 (240) 111 PRK03980 flap endonuclease-1; 51.0 16 0.00042 17.4 3.0 17 131-147 192-208 (295) 112 pfam10391 DNA_pol_lambd_f Fing 50.6 6.4 0.00016 20.1 0.8 20 128-147 2-21 (52) 113 PRK03352 DNA polymerase IV; Va 49.8 9.8 0.00025 18.9 1.7 61 105-168 146-217 (345) 114 TIGR02076 pyrH_arch uridylate 49.8 15 0.00039 17.6 2.7 53 47-99 124-179 (232) 115 TIGR01982 UbiB 2-polyprenylphe 48.9 20 0.0005 16.9 3.1 21 79-100 246-266 (452) 116 LOAD_Hrd consensus 47.4 7.7 0.0002 19.6 0.8 21 122-142 41-61 (77) 117 KOG3337 consensus 46.2 11 0.00028 18.5 1.5 30 69-98 18-47 (201) 118 PTZ00035 Rad51; Provisional 45.8 18 0.00047 17.1 2.5 45 70-117 48-93 (350) 
119 CHL00137 rps13 ribosomal prote 45.7 10 0.00026 18.8 1.2 26 122-147 8-36 (122) 120 PRK04301 radA DNA repair and r 45.1 24 0.00062 16.2 4.3 46 69-117 19-65 (318) 121 pfam11798 IMS_HHH IMS family H 44.1 8.9 0.00023 19.2 0.7 14 130-143 14-27 (33) 122 PRK01151 rps17E 30S ribosomal 44.0 10 0.00026 18.8 1.0 28 102-129 2-29 (58) 123 cd01700 Pol_V Pol V was discov 43.9 18 0.00046 17.1 2.3 19 130-148 177-195 (344) 124 COG1031 Uncharacterized Fe-S o 43.8 13 0.00034 18.0 1.6 109 22-142 260-379 (560) 125 PRK12278 50S ribosomal protein 43.5 9.3 0.00024 19.0 0.8 41 127-167 152-195 (216) 126 PTZ00217 flap endonuclease-1; 43.2 26 0.00067 16.0 3.3 17 130-146 237-253 (394) 127 pfam00416 Ribosomal_S13 Riboso 43.1 8.4 0.00021 19.3 0.5 21 127-147 14-34 (106) 128 KOG1338 consensus 43.0 26 0.00067 16.0 6.1 98 47-153 88-195 (466) 129 PRK03609 umuC DNA polymerase V 42.1 27 0.00069 15.9 3.8 19 130-148 181-199 (422) 130 pfam03564 DUF1759 Protein of u 41.4 28 0.00071 15.8 3.0 31 62-92 52-82 (146) 131 PRK04053 rps13p 30S ribosomal 41.4 14 0.00035 17.9 1.3 29 120-148 14-45 (149) 132 TIGR02789 nickel_nikB nickel A 41.1 9.9 0.00025 18.9 0.6 19 127-145 253-271 (315) 133 KOG3220 consensus 40.5 29 0.00073 15.7 5.3 28 157-188 137-164 (225) 134 TIGR01993 Pyr-5-nucltdase pyri 40.3 5.4 0.00014 20.6 -0.9 39 124-172 89-130 (205) 135 pfam00570 HRDC HRDC domain. 
Th 40.0 11 0.00029 18.5 0.7 21 122-142 38-58 (68) 136 pfam05087 Rota_VP2 Rotavirus V 39.6 17 0.00043 17.3 1.6 57 45-101 382-450 (887) 137 PRK07758 hypothetical protein; 39.5 14 0.00037 17.8 1.2 17 126-142 65-81 (95) 138 pfam10343 DUF2419 Protein of u 39.4 30 0.00076 15.6 5.7 64 47-123 44-114 (282) 139 PRK02362 ski2-like helicase; P 39.3 30 0.00076 15.6 7.1 45 123-167 647-691 (736) 140 cd00128 XPG Xeroderma pigmento 39.1 29 0.00075 15.7 2.7 17 131-147 226-242 (316) 141 TIGR01016 sucCoAbeta succinyl- 39.1 30 0.00075 15.7 2.8 30 71-100 9-40 (389) 142 PTZ00205 DNA polymerase kappa; 38.9 29 0.00073 15.8 2.7 39 107-146 277-327 (571) 143 TIGR01448 recD_rel helicase, R 38.7 15 0.00039 17.6 1.2 17 130-146 196-213 (769) 144 TIGR02924 ICDH_alpha isocitrat 38.7 15 0.00039 17.6 1.3 33 156-188 366-403 (481) 145 PTZ00134 40S ribosomal protein 37.7 10 0.00026 18.8 0.2 27 122-148 21-50 (154) 146 smart00341 HRDC Helicase and R 37.2 14 0.00035 17.9 0.8 21 122-142 41-61 (81) 147 cd01703 Pol_iota Pol iota is m 37.2 16 0.00041 17.4 1.2 18 130-147 212-229 (394) 148 pfam11264 ThylakoidFormat Thyl 36.8 33 0.00083 15.4 8.0 95 30-131 7-109 (216) 149 PRK03348 DNA polymerase IV; Pr 35.9 28 0.00071 15.8 2.2 48 98-148 143-199 (456) 150 PRK01216 DNA polymerase IV; Va 35.6 17 0.00044 17.3 1.1 42 106-148 148-198 (351) 151 COG1561 Uncharacterized stress 35.2 24 0.00062 16.2 1.8 23 156-178 181-204 (290) 152 PRK02794 DNA polymerase IV; Pr 34.8 26 0.00067 16.0 1.9 19 130-148 211-229 (417) 153 COG2183 Tex Transcriptional ac 34.7 35 0.00089 15.2 3.2 80 47-141 473-552 (780) 154 PRK07922 N-acetylglutamate syn 34.6 35 0.0009 15.1 3.2 46 125-171 69-125 (170) 155 KOG0221 consensus 34.6 35 0.0009 15.1 6.1 121 30-150 283-445 (849) 156 cd03586 Pol_IV_kappa Pol_IV_ka 34.4 36 0.00091 15.1 3.5 18 131-148 175-192 (337) 157 pfam03755 YicC_N YicC-like fam 34.0 33 0.00084 15.4 2.3 36 101-136 80-115 (159) 158 PRK11820 hypothetical protein; 33.1 35 0.00089 15.2 2.3 12 156-167 222-233 (288) 159 PRK07539 NADH dehydrogenase 
su 32.9 37 0.00095 15.0 4.0 65 101-166 17-82 (154) 160 pfam06782 UPF0236 Uncharacteri 32.8 38 0.00096 14.9 3.0 131 22-171 241-399 (482) 161 KOG0950 consensus 32.7 38 0.00096 14.9 5.6 25 14-38 721-745 (1008) 162 PRK10917 ATP-dependent DNA hel 32.5 20 0.00052 16.7 1.1 27 12-38 146-172 (677) 163 TIGR01129 secD protein-export 32.0 21 0.00053 16.7 1.0 18 130-147 135-155 (522) 164 TIGR03629 arch_S13P archaeal r 32.0 15 0.00037 17.7 0.2 29 120-148 10-41 (144) 165 PTZ00154 40S ribosomal protein 31.7 23 0.00059 16.4 1.2 26 102-127 3-28 (130) 166 PRK03103 DNA polymerase IV; Re 30.6 35 0.00089 15.2 2.0 18 130-147 184-201 (410) 167 pfam10440 WIYLD WIYLD domain. 30.0 42 0.0011 14.6 3.8 41 20-61 23-63 (65) 168 TIGR03515 GldC gliding motilit 29.5 6.9 0.00018 19.9 -1.7 83 121-226 14-101 (108) 169 PRK00182 tatB sec-independent 29.5 43 0.0011 14.6 4.7 63 82-144 17-83 (165) 170 pfam11372 DUF3173 Protein of u 29.4 43 0.0011 14.6 3.6 42 87-132 5-56 (59) 171 COG5071 RPN5 26S proteasome re 29.3 43 0.0011 14.6 5.0 41 127-167 299-350 (439) 172 TIGR01394 TypA_BipA GTP-bindin 29.2 38 0.00098 14.9 2.0 40 18-59 183-222 (609) 173 COG1468 CRISPR-associated prot 29.2 23 0.00059 16.4 0.9 19 206-224 170-188 (190) 174 PRK10840 transcriptional regul 29.2 43 0.0011 14.5 4.7 26 146-171 172-197 (216) 175 PRK05988 formate dehydrogenase 29.1 43 0.0011 14.5 3.4 65 101-166 18-83 (156) 176 TIGR00372 cas4 CRISPR-associat 29.0 20 0.00052 16.7 0.6 30 195-224 176-206 (206) 177 PRK06027 purU formyltetrahydro 28.6 4.8 0.00012 21.0 -2.7 54 87-140 152-207 (285) 178 cd04949 GT1_gtfA_like This fam 27.7 42 0.0011 14.6 2.0 173 20-203 164-361 (372) 179 PRK03858 DNA polymerase IV; Va 27.7 37 0.00095 15.0 1.7 18 130-147 175-192 (398) 180 TIGR02644 Y_phosphoryl pyrimid 27.5 18 0.00047 17.0 0.1 34 135-173 85-119 (425) 181 COG0099 RpsM Ribosomal protein 26.7 34 0.00088 15.2 1.4 26 122-147 8-36 (121) 182 COG1550 Uncharacterized protei 26.0 32 0.00082 15.4 1.2 58 25-82 25-85 (95) 183 TIGR02814 pfaD_fam PfaD family 25.9 42 
0.0011 14.6 1.7 49 62-111 262-316 (449) 184 COG3743 Uncharacterized conser 25.9 28 0.00072 15.8 0.8 18 128-145 67-84 (133) 185 pfam04891 NifQ NifQ. NifQ is i 25.4 50 0.0013 14.1 3.1 47 174-222 109-165 (167) 186 pfam00833 Ribosomal_S17e Ribos 25.3 33 0.00085 15.3 1.1 27 102-128 3-29 (122) 187 PRK01810 DNA polymerase IV; Va 25.2 43 0.0011 14.6 1.7 40 106-146 149-197 (410) 188 cd00424 Pol_Y Y-family of DNA 24.8 51 0.0013 14.0 2.2 19 130-148 175-193 (341) 189 PRK12373 NADH dehydrogenase su 24.3 41 0.0011 14.7 1.5 20 127-146 325-344 (403) 190 PRK13011 formyltetrahydrofolat 24.1 12 0.00031 18.2 -1.3 54 87-140 154-209 (287) 191 KOG3133 consensus 23.8 54 0.0014 13.9 2.3 42 46-87 141-182 (267) 192 PRK09430 djlA Dna-J like membr 23.6 54 0.0014 13.9 8.0 30 118-147 140-173 (269) 193 PRK02515 psbU photosystem II c 23.4 54 0.0014 13.9 2.7 33 114-146 56-91 (144) 194 PRK12311 rpsB 30S ribosomal pr 23.0 41 0.0011 14.7 1.2 18 127-144 268-285 (332) 195 PRK05477 gatB aspartyl/glutamy 22.9 56 0.0014 13.8 8.4 58 124-186 371-431 (479) 196 COG1379 PHP family phosphoeste 22.9 56 0.0014 13.8 6.4 13 85-97 213-225 (403) 197 KOG0628 consensus 22.9 56 0.0014 13.8 4.7 53 8-60 36-97 (511) 198 PRK00188 trpD anthranilate pho 22.8 56 0.0014 13.8 4.0 16 85-100 133-148 (339) 199 pfam01930 Cas_Cas4 Domain of u 22.7 39 0.00099 14.8 1.1 16 208-223 146-161 (162) 200 COG1383 RPS17A Ribosomal prote 22.0 52 0.0013 14.0 1.6 28 102-129 3-30 (74) 201 PRK11057 ATP-dependent DNA hel 22.0 34 0.00086 15.3 0.6 28 123-150 313-343 (607) 202 COG0322 UvrC Nuclease subunit 21.8 59 0.0015 13.6 2.8 20 22-41 133-152 (581) 203 pfam05166 YcgL YcgL domain. Th 21.6 53 0.0013 14.0 1.6 21 81-101 48-68 (74) 204 PRK12933 secD preprotein trans 21.6 42 0.0011 14.6 1.1 37 130-176 490-528 (604) 205 TIGR02203 MsbA_lipidA lipid A 21.3 60 0.0015 13.6 5.2 50 102-151 54-113 (603) 206 pfam00682 HMGL-like HMGL-like. 
21.1 61 0.0015 13.6 3.0 16 20-35 11-26 (237)
207 cd03022 DsbA_HCCA_Iso DsbA fam 21.0 61 0.0016 13.5 4.5 92 48-159 85-176 (192)
208 pfam05597 Phasin Poly(hydroxya 20.8 36 0.00092 15.1 0.6 36 148-183 84-119 (132)
209 TIGR02154 PhoB phosphate regul 20.4 17 0.00044 17.2 -1.1 33 145-177 180-216 (226)
210 PTZ00183 centrin; Provisional 20.3 63 0.0016 13.5 7.4 147 40-200 6-164 (168)
211 pfam08625 Utp13 Utp13 specific 20.3 63 0.0016 13.5 2.7 59 81-142 50-108 (138)
212 COG1623 Predicted nucleic-acid 20.1 63 0.0016 13.4 3.3 14 151-164 261-274 (349)

No 1
>TIGR01083 nth endonuclease III; InterPro: IPR005759 The spectrum of DNA damage caused by reactive oxygen species includes a wide variety of modifications of purine and pyrimidine bases. Among these modified bases, 7,8-dihydro-8-oxoguanine (8-oxoG) is an important mutagenic lesion. Base excision repair is a critical mechanism for preventing mutations by removing the oxidative lesion from the DNA. Escherichia coli Nth protein (endonuclease III) has an 8-oxoG DNA glycosylase/AP lyase activity which removes 8-oxoG preferentially from 8-oxoG/G mispairs. Human hNTH1 protein, a homolog of E. coli Nth protein, has similar DNA glycosylase/AP lyase activity that removes 8-oxoG from 8-oxoG/G mispairs.; GO: 0003906 DNA-(apurinic or apyrimidinic site) lyase activity, 0006284 base-excision repair, 0005622 intracellular.
Probab=100.00 E-value=0 Score=498.54 Aligned_cols=191 Identities=52% Similarity=0.936 Sum_probs=189.3 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHH Q ss_conf 99999999999976799987777868999999999633320356799899873176200010126899999999730037 Q gi|254780383|r 23 KELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYR 102 (227) Q Consensus 23 ~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~ 102 (227) +++.+|+.+|.+.||.|.+||+|.|||++|||+|||||+||++||+|+++||++|+||++|++++.+||+++||++|||| T Consensus 2 ~~~~~il~~L~~~yP~p~tEL~~~~PFeLLVAtiLSAQ~TD~~VNkaT~~LF~~Y~tp~~~a~a~~eel~~~Ik~iGlYr 81 (192) T TIGR01083 2 QKAQEILERLRKLYPHPTTELDYKNPFELLVATILSAQATDKSVNKATKKLFEVYPTPQALAAAGLEELEEYIKSIGLYR 81 (192) T ss_pred CHHHHHHHHHHHHCCCCEEEEEECCCHHHHHHHHHHHHHCCHHHHHCCHHHHHCCCCHHHHHCCCHHHHHHHHHCCCCCH T ss_conf 40789999999738997254333070789999999865313267631678651277868996089313477642258645 Q ss_pred HHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCCCHHHHHHHH Q ss_conf 99999975235544420010000145677764323588889999875421000121046787765654078889999999 Q gi|254780383|r 103 KKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGKTPNKVEQSL 182 (227) Q Consensus 103 ~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l 182 (227) +||++|+++|++|+|+|+|+||+++++|++|||||+||||+||+.+||.|+|+|||||+||++||||++++||.++|++| T Consensus 82 ~KAk~I~~~~~~LvE~y~GeVP~~~~eL~~LPGVGRKTANVVL~~aFg~P~iAVDTHv~Rv~~Rlgl~~~~dp~~vE~~L 161 (192) T TIGR01083 82 NKAKNIIALCRKLVERYGGEVPEDREELVKLPGVGRKTANVVLNVAFGIPAIAVDTHVFRVSNRLGLSKGKDPDKVEEEL 161 (192) T ss_pred HHHHHHHHHHHHHHHHHCCCCCCCHHHHHCCCCCCCHHHHHHHHHHHCCCEEEECCCHHHHHHHHCCCCCCCHHHHHHHH T ss_conf 68999999999999981898775537661789987114562433442687057414346554331357778989999999 Q ss_pred HHCCCHHHHHHHHHHHHHHHHHHCCCCCCCC Q ss_conf 6218842267899999999665164899894 Q gi|254780383|r 183 LRIIPPKHQYNAHYWLVLHGRYVCKARKPQC 213 (227) Q 
Consensus 183 ~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C 213 (227) ++++|.+.|..+|++||.|||.+|+||+|.| T Consensus 162 ~~l~P~~~w~~~hh~lIlHGRy~CkAr~P~C 192 (192) T TIGR01083 162 LKLIPKEFWTKLHHWLILHGRYTCKARKPRC 192 (192) T ss_pred HHHCCCCCHHHCCHHHHHHCCCCCCCCCCCC T ss_conf 8744850012223675431111116888889 No 2 >COG0177 Nth Predicted EndoIII-related endonuclease [DNA replication, recombination, and repair] Probab=100.00 E-value=0 Score=474.12 Aligned_cols=207 Identities=49% Similarity=0.883 Sum_probs=203.6 Q ss_pred CHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHH Q ss_conf 98999999999999767999877778689999999996333203567998998731762000101268999999997300 Q gi|254780383|r 21 TPKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGI 100 (227) Q Consensus 21 ~~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~ 100 (227) .++++.+|+++|.+.||.+++++.|.|||++||++||||||||++|++|+++||++|+||++++++++++|+++|+++|| T Consensus 2 ~~~~~~~i~~~l~~~~p~~~~~l~~~~pf~lLva~iLSaqttD~~vn~at~~Lf~~~~t~e~l~~a~~~~l~~~I~~iGl 81 (211) T COG0177 2 NKKKALEILDRLRELYPEPKTELDFKDPFELLVAVILSAQTTDEVVNKATPALFKRYPTPEDLLNADEEELEELIKSIGL 81 (211) T ss_pred CHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHCCCCHHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHHCCC T ss_conf 37669999999998788876755768838999999994467448899999999997599999974999999999986387 Q ss_pred HHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCCCHHHHHH Q ss_conf 37999999752355444200100001456777643235888899998754210001210467877656540788899999 Q gi|254780383|r 101 YRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGKTPNKVEQ 180 (227) Q Consensus 101 ~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~ 180 (227) ||+||++|+++|++|+|+|||+||.++++|++|||||+||||+||+++||.|+|+|||||+||++||||++++++++++. 
T Consensus 82 yr~KAk~I~~~~~~l~e~~~g~vP~~~~eL~~LPGVGrKTAnvVL~~a~g~p~i~VDTHV~Rvs~R~gl~~~~~p~~ve~ 161 (211) T COG0177 82 YRNKAKNIKELARILLEKFGGEVPDTREELLSLPGVGRKTANVVLSFAFGIPAIAVDTHVHRVSNRLGLVPGKTPEEVEE 161 (211) T ss_pred CHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHHCCCCCHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHCCCCCCCHHHHHH T ss_conf 18999999999999999749999815999974899665778989986559986521242999999847788999999999 Q ss_pred HHHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHCC Q ss_conf 99621884226789999999966516489989472840331768519 Q gi|254780383|r 181 SLLRIIPPKHQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIKQ 227 (227) Q Consensus 181 ~l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k~ 227 (227) +|++++|++.|.++|++||.|||.+|+|++|+|+.|||+++|+|+++ T Consensus 162 ~L~~~iP~~~~~~~h~~lI~~GR~iC~ar~P~C~~C~l~~~C~~~~~ 208 (211) T COG0177 162 ALMKLIPKELWTDLHHWLILHGRYICKARKPRCEECPLADLCPSAGK 208 (211) T ss_pred HHHHHCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCCHHCC T ss_conf 99997897889999999999605311689998675646554811100 No 3 >PRK10702 endonuclease III; Provisional Probab=100.00 E-value=0 Score=461.54 Aligned_cols=207 Identities=47% Similarity=0.863 Sum_probs=203.4 Q ss_pred CHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHH Q ss_conf 98999999999999767999877778689999999996333203567998998731762000101268999999997300 Q gi|254780383|r 21 TPKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGI 100 (227) Q Consensus 21 ~~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~ 100 (227) ++++..+|+.+|.+.||++.++|+|.|||++||++||||||+|++|++++++||++||||++++++++++|+++|+++|| T Consensus 2 ~~~~~~~i~~~l~~~~p~~~~~l~~~~P~~vLVs~ILsqqTtd~~v~~~~~~L~~~~~t~e~la~a~~~el~~~i~~~G~ 81 (211) T PRK10702 2 NKAKRLEILTRLRDNNPHPTTELNFSSPFELLIAVLLSAQATDVSVNKATAKLYPVANTPAAMLELGVEGVKTYIKTIGL 81 (211) T ss_pred CHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHCCCCHHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHHHHH T ss_conf 88999999999998786999985889858999999997418589999999999997799999870999999999998635 Q ss_pred 
HHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCCCHHHHHH Q ss_conf 37999999752355444200100001456777643235888899998754210001210467877656540788899999 Q gi|254780383|r 101 YRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGKTPNKVEQ 180 (227) Q Consensus 101 ~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~ 180 (227) |++||++|+++|++|+++|||++|.++++|++|||||+|||++||+++||+|+|+|||||+||++|+||++++++++++. T Consensus 82 y~~KA~~L~~~a~~i~~~~~G~vP~~~~~L~~LpGIG~kTA~aIl~~a~~~~~~~VDtnV~RV~~Rlg~~~~~~~~~~~~ 161 (211) T PRK10702 82 YNSKAENVIKTCRILLEKHNGEVPEDRAALEALPGVGRKTANVVLNTAFGWPTIAVDTHIFRVCNRTQFAPGKNVEQVEE 161 (211) T ss_pred HHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHCCCCCCCHHHHHH T ss_conf 99999999999999999909987666999998766358899999999849986525735999999976577899999999 Q ss_pred HHHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHCC Q ss_conf 99621884226789999999966516489989472840331768519 Q gi|254780383|r 181 SLLRIIPPKHQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIKQ 227 (227) Q Consensus 181 ~l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k~ 227 (227) .|++++|.+.|.++|++||+||+.||++++|+|+.|||++.|+|++| T Consensus 162 ~l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C~~Cpl~~~C~~~~K 208 (211) T PRK10702 162 KLLKVVPAEFKVDCHHWLILHGRYTCIARKPRCGSCIIEDLCEYKEK 208 (211) T ss_pred HHHHHCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCC T ss_conf 99982890237999999999950150699993999989144999666 No 4 >PRK10880 adenine DNA glycosylase; Provisional Probab=100.00 E-value=0 Score=352.51 Aligned_cols=201 Identities=18% Similarity=0.271 Sum_probs=179.8 Q ss_pred HHHHHHHHHHCCC-CCCCCCC---CCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHH Q ss_conf 9999999997679-9987777---86899999999963332035679989987317620001012689999999973003 Q gi|254780383|r 26 EEIFYLFSLKWPS-PKGELYY---VNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIY 101 (227) Q Consensus 26 ~~I~~~L~~~yp~-~~~~l~~---~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~ 101 (227) 
.+....|..||.. .+..|+| .|||.++||+||+|||+.++|...|.+++++|||+++||+|+++|+..+|.++||| T Consensus 4 ~~f~~~ll~Wy~~~~r~~lPWr~~~~PY~vwvSEiMLQQTqv~tV~~yy~rf~~~fP~~~~LA~A~~~~vl~~W~GLGYY 83 (350) T PRK10880 4 SQFSAQVLDWYDKYGRKTLPWQIDKTPYKVWLSEVMLQQTQVATVIPYFERFMARFPTVTDLANAPLDEVLHLWTGLGYY 83 (350) T ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCCCHHHHHHHHHHCCCCHHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHHCCCH T ss_conf 99999999999873998899899998226999999983487789999999999988399999779999999986406938 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH-HHHCCCCHHHHHH Q ss_conf 799999975235544420010000145677764323588889999875421000121046787765-6540788899999 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRI-GLAPGKTPNKVEQ 180 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl-gl~~~~~~~~~~~ 180 (227) +||++|+++|++|+++|+|.+|++.++|++|||||+|||.+|++++||+|+.+||+||.||+.|+ |+....+..++++ T Consensus 84 -~RArnLh~aA~~i~~~~~G~~P~~~~~L~~LPGIG~yTA~AI~siaf~~~~~ivDgNV~RVlsR~~~i~~~~~~~~~~~ 162 (350) T PRK10880 84 -ARARNLHKAAQQVATLHGGKFPETFEEVAALPGVGRSTAGAILSLSLGKHFPILDGNVKRVLARCYAVSGWPGKKEVEN 162 (350) T ss_pred -HHHHHHHHHHHHHHHHCCCCCCCHHHHHHHCCCCCHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHCCCCCCCCHHHHH T ss_conf -9999999999999997589898259998626688727999999997699535766440328888750336887559999 Q ss_pred H----HHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHCC Q ss_conf 9----9621884226789999999966516489989472840331768519 Q gi|254780383|r 181 S----LLRIIPPKHQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIKQ 227 (227) Q Consensus 181 ~----l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k~ 227 (227) . ...++|.+...+|+++||++|+.||++++|+|..|||++.|..|+. 
T Consensus 163 ~l~~~a~~~~p~~~~~~~nQAlMDLGA~vCtp~~P~C~~CPl~~~C~A~~~ 213 (350) T PRK10880 163 KLWSLSEQVTPAVGVERFNQAMMDLGAMVCTRSKPKCELCPLQNGCIAAAN 213 (350) T ss_pred HHHHHHHHHCCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCHHHHC T ss_conf 999999973884327799999999734102799998788988220646427 No 5 >TIGR01084 mutY A/G-specific adenine glycosylase; InterPro: IPR005760 The DNA repair enzyme MutY plays an important role in the prevention of DNA mutations resulting from the presence of the oxidatively damaged lesion 7,8-dihydro-8-oxo-2'-deoxyguanosine (8-OxoG). 8-OxoG can mispair with 2'-deoxycytidine 5'-triphosphate or with 2'-deoxyadenosine triphosphate during DNA replication, forming C*8-oxoG and A*8-oxoG mispairs. Human MutY is responsible for recognition and removal of the inappropriately inserted adenine in an A=8-oxoG mispair. If unrepaired, the A=8-oxoG mispairs can result in deleterious C:G to A:T transversions. Multiple forms of mammalian MutY are formed by alternative splicing and locate to the nucleus or mitochondrion, where they have been shown to interact with several other proteins involved in the repair of DNA damage . The HhH-GPD domain within the protein binds the phosphate backbone of the substrate. This family represents bacterial MutY and a limited number of murine homologs. In rat, the Escherichia coli MutY DNA glycosylase homolog (MYH) is induced in response to mitochondrial DNA damage . ; GO: 0019104 DNA N-glycosylase activity, 0006284 base-excision repair, 0005622 intracellular. 
Probab=100.00 E-value=0 Score=346.70 Aligned_cols=198 Identities=23% Similarity=0.359 Sum_probs=176.0 Q ss_pred HHHHHHHCCCCCC-CCCC---------------CCHHHHHHHHHHHHCCCHHHHHH-HHHHHHHCCCCCHHHCCCCHHHH Q ss_conf 9999997679998-7777---------------86899999999963332035679-98998731762000101268999 Q gi|254780383|r 29 FYLFSLKWPSPKG-ELYY---------------VNHFTLIVAVLLSAQSTDVNVNK-ATKHLFEIADTPQKMLAIGEKKL 91 (227) Q Consensus 29 ~~~L~~~yp~~~~-~l~~---------------~~p~~~LVa~iLs~qT~d~~v~~-~~~~L~~~ypt~e~l~~a~~~el 91 (227) .+.|..||..... .|+| ++||.|.||+||+|||+.++|.. .|.+|+++|||.++||+|+++|+ T Consensus 3 ~~~ll~Wy~~~gRk~LPWr~~~~~~cdeslkhi~~pY~VW~SEvMLQQTqV~tV~prYf~rFle~FPTv~~LA~A~~deV 82 (297) T TIGR01084 3 REDLLSWYDKEGRKTLPWRQNKNQRCDESLKHIDDPYRVWVSEVMLQQTQVATVIPRYFERFLERFPTVQALANAPQDEV 82 (297) T ss_pred HHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHCCCCCCEEEEEECCCCCEEEECCCCCCHHHHHHCCCHHHHHCCCHHHH T ss_conf 78899999871786788747875542457752235451440021100110011267100476642788578747796579 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHC Q ss_conf 99999730037999999752355444200100001456777643235888899998754210001210467877656540 Q gi|254780383|r 92 QNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAP 171 (227) Q Consensus 92 ~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~ 171 (227) ..+|.++||| .||+||+++|+.|.++|||++|.|.++|.+|||||+|||.+|++++||+|..+||.||.||++|+-=+. 
T Consensus 83 L~lW~GLGYY-aRARNL~kAA~~v~~~fGG~fP~d~~~~~~L~GVG~yTAgAils~a~~~~~p~~DGNV~RVLsR~fA~~ 161 (297) T TIGR01084 83 LKLWEGLGYY-ARARNLHKAAQEVVEEFGGEFPQDLEDLKALPGVGRYTAGAILSFAYNKPVPILDGNVKRVLSRLFAVE 161 (297) T ss_pred HHHHHCCCHH-HHHHHHHHHHHHHHHHHCCCCCCCHHHHHHCCCCCHHHHHHHHHHHHCCCCCCCCCCHHHHHHHHHHHC T ss_conf 9986257867-888999999999998718817723797851789762179999998726876201540788999998624 Q ss_pred C-CCHHHH----HHHHHHCCC-HHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHCC Q ss_conf 7-888999----999962188-4226789999999966516489989472840331768519 Q gi|254780383|r 172 G-KTPNKV----EQSLLRIIP-PKHQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIKQ 227 (227) Q Consensus 172 ~-~~~~~~----~~~l~~~~p-~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k~ 227 (227) + -+..++ ....+.++| .....+||++|||.|+.||++++|+|+.|||.++|.-|+. T Consensus 162 ~~~~~k~~e~~l~~~~~~llpe~~~~~~~nqalmDlGA~iC~rk~P~C~~CPl~~~C~A~~~ 223 (297) T TIGR01084 162 GWPGKKKVENRLWELAESLLPEKADPEAFNQALMDLGALICTRKKPKCDLCPLQDFCLAYKQ 223 (297) T ss_pred CCCCCCHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCCCCCCCCCCCCCCCCCHHHHHHHHHC T ss_conf 89887348899999999858865686588889986236103784785454870665556542 No 6 >KOG1921 consensus Probab=100.00 E-value=0 Score=335.98 Aligned_cols=205 Identities=30% Similarity=0.512 Sum_probs=189.9 Q ss_pred CCHHHHHHHHHHHHHHCCCCCCCCC----------CCC----HHHHHHHHHHHHCCCHHHHHHHHHHHHHCC-CCCHHHC Q ss_conf 9989999999999997679998777----------786----899999999963332035679989987317-6200010 Q gi|254780383|r 20 YTPKELEEIFYLFSLKWPSPKGELY----------YVN----HFTLIVAVLLSAQSTDVNVNKATKHLFEIA-DTPQKML 84 (227) Q Consensus 20 ~~~~~~~~I~~~L~~~yp~~~~~l~----------~~~----p~~~LVa~iLs~qT~d~~v~~~~~~L~~~y-pt~e~l~ 84 (227) ..++++.++|.++...-++...|.+ ..+ .|++||+.|||.||+|+++..|+.+|.+.. -|++++. 
T Consensus 36 ~ppe~w~~~y~~ir~mR~k~~APVD~mGc~~~~~~~~~pk~~RfqvLv~lmLSSQTKDevt~~Am~rL~~~~gLT~e~v~ 115 (286) T KOG1921 36 KPPENWLEVYERIRKMRSKIVAPVDTMGCSRIPSLKADPKERRFQVLVGLMLSSQTKDEVTAAAMLRLKEYGGLTLEAVL 115 (286) T ss_pred CCCCCHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHCCCCHHHHH T ss_conf 89946899999999886035688634355557655578056759999999970100788899999999985597899986 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHCCCHHHHH Q ss_conf 1268999999997300379999997523554442001000014567776432358888999987542-100012104678 Q gi|254780383|r 85 AIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGI-PTIGVDTHIFRI 163 (227) Q Consensus 85 ~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~-p~~~VDthv~Rv 163 (227) ++++..|.++|+|+|||++||+||+.+|+++.++|+|++|.+.++|++|||||||.|..+|..|||. .+|.|||||||+ T Consensus 116 ~~de~~l~~LI~~VgFy~rKA~ylkkta~IL~d~f~gDIP~~v~dLlsLPGVGPKMa~L~m~~AWn~i~GI~VDtHVHRi 195 (286) T KOG1921 116 KIDEPTLNELIYPVGFYTRKAKYLKKTAKILQDKFDGDIPDTVEDLLSLPGVGPKMAHLTMQVAWNKIVGICVDTHVHRI 195 (286) T ss_pred CCCHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHCCCCCCHHHHHHHHHHHHCCCEEEEEEHHHHHH T ss_conf 16757687650001215788899999999999870799755599885589976599999999985664057863289998 Q ss_pred HHHHHHHCCCC--HHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCC-HHHCHH Q ss_conf 77656540788--89999999621884226789999999966516489989472840-331768 Q gi|254780383|r 164 SNRIGLAPGKT--PNKVEQSLLRIIPPKHQYNAHYWLVLHGRYVCKARKPQCQSCII-SNLCKR 224 (227) Q Consensus 164 ~~Rlgl~~~~~--~~~~~~~l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l-~~~C~~ 224 (227) ++||||++.++ +++++..|+.++|++.|...|+.|+.|||.||+|++|+|+.|.+ ++.|+. 
T Consensus 196 ~nrlgWv~~ktkspE~TR~aLq~wLPk~lW~eIN~lLVGFGQ~iC~p~~prC~~C~~~~~~Cps 259 (286) T KOG1921 196 CNRLGWVDTKTKSPEQTRVALQQWLPKSLWVEINHLLVGFGQTICTPRRPRCGLCLLSRDLCPS 259 (286) T ss_pred HHHHCCCCCCCCCHHHHHHHHHHHCCHHHHHHHHCEEECCCCEEEECCCCCCCCCCCCCCCCCH T ss_conf 8875633566688789999999867687775620105514521453279986332357566842 No 7 >COG1194 MutY A/G-specific DNA glycosylase [DNA replication, recombination, and repair] Probab=100.00 E-value=1.1e-44 Score=321.51 Aligned_cols=204 Identities=19% Similarity=0.342 Sum_probs=179.8 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCC---CHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHH Q ss_conf 999999999999767999877778---68999999999633320356799899873176200010126899999999730 Q gi|254780383|r 23 KELEEIFYLFSLKWPSPKGELYYV---NHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIG 99 (227) Q Consensus 23 ~~~~~I~~~L~~~yp~~~~~l~~~---~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G 99 (227) .....+...|.+||......|+|+ +||.+|||+||+|||+.++|...+.++.++|||+++||+|+.+|+..+|.++| T Consensus 6 ~~~~~~~~~ll~Wy~~~~R~LPWR~~~~PY~VwvSEiMLQQT~v~~Vi~yy~~fl~rfPti~~LA~A~~~evl~~W~gLG 85 (342) T COG1194 6 GDIEKFQEALLDWYDKNGRDLPWRETKDPYRVWVSEIMLQQTQVATVIPYYERFLERFPTIKALAAAPEDEVLKAWEGLG 85 (342) T ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEHHHHHHHHHCCHHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHHCC T ss_conf 24578888999999974874887789986220268887600607455456999998689989986688889999987167 Q ss_pred HHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH-HHHCCC----C Q ss_conf 03799999975235544420010000145677764323588889999875421000121046787765-654078----8 Q gi|254780383|r 100 IYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRI-GLAPGK----T 174 (227) Q Consensus 100 ~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl-gl~~~~----~ 174 (227) || .||++|+++|+.++++|+|.+|++.++|.+|||||+|||.+||+++||+|...||+||.||+.|+ ++.... 
+ T Consensus 86 Yy-sRArnL~~~A~~v~~~~~G~~P~~~~~l~~LpGiG~yTa~Ail~~a~~~~~~~lDgNV~RVl~R~f~i~~~~~~~~~ 164 (342) T COG1194 86 YY-SRARNLHKAAQEVVERHGGEFPDDEEELAALPGVGPYTAGAILSFAFNQPEPVLDGNVKRVLSRLFAISGDIGKPKT 164 (342) T ss_pred HH-HHHHHHHHHHHHHHHHCCCCCCCCHHHHHHCCCCCHHHHHHHHHHHHCCCCCEEECCHHEEEHHHHCCCCCCCCCCH T ss_conf 37-89999999999999981997999999998678973889999999871898753305422112665314365565410 Q ss_pred HHHHHHHHHHCC-CHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHCC Q ss_conf 899999996218-84226789999999966516489989472840331768519 Q gi|254780383|r 175 PNKVEQSLLRII-PPKHQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIKQ 227 (227) Q Consensus 175 ~~~~~~~l~~~~-p~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k~ 227 (227) ...++..+..++ |.....+|+++||++|+.||++++|+|+.|||++.|..|+. T Consensus 165 ~~~~~~~~~~ll~p~~~~~~fnqammdlGA~ICt~~~P~C~~CPl~~~c~a~~~ 218 (342) T COG1194 165 KKELWELAEQLLTPDRRPGDFNQAMMDLGATICTAKKPKCSLCPLRDNCAAYRN 218 (342) T ss_pred HHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHC T ss_conf 589999999744898876799999998636761589998773946688999981 No 8 >PRK13910 DNA glycosylase MutY; Provisional Probab=100.00 E-value=4.2e-44 Score=317.68 Aligned_cols=168 Identities=23% Similarity=0.413 Sum_probs=157.7 Q ss_pred HHHHCCCHHHHHH-HHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHH Q ss_conf 9963332035679-989987317620001012689999999973003799999975235544420010000145677764 Q gi|254780383|r 56 LLSAQSTDVNVNK-ATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLP 134 (227) Q Consensus 56 iLs~qT~d~~v~~-~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~Lp 134 (227) ||+|||+.++|.. 
.+.+++++|||+++||+|+++|+..+|.++||| +||++|+++|++|+++|+|.+|++.++|++|| T Consensus 1 iMLQQTqv~tvip~y~~~f~~~fP~~~~la~a~~~~vl~~W~GLGYY-~RArnl~~~a~~i~~~~~g~~P~~~~~L~~LP 79 (290) T PRK13910 1 VMSQQTQINTVVERFYSPFLEAFPTLKDLANAQLEEVLLLWRGLGYY-SRAKNLKKSAEICVKEHHSQLPNDYQSLLKLP 79 (290) T ss_pred CCCCCCCCCCCCHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHHCCCH-HHHHHHHHHHHHHHHHHCCCCCCHHHHHHHCC T ss_conf 96885742100578999999988399999778999999998746848-99999999999999983898985299997588 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH-HHHCCCCHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCC Q ss_conf 323588889999875421000121046787765-6540788899999996218842267899999999665164899894 Q gi|254780383|r 135 GIGRKGANVILSMAFGIPTIGVDTHIFRISNRI-GLAPGKTPNKVEQSLLRIIPPKHQYNAHYWLVLHGRYVCKARKPQC 213 (227) Q Consensus 135 GVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl-gl~~~~~~~~~~~~l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C 213 (227) |||+|||++|++++||+|+.+||+||.||+.|+ |+.+..+..+++....+++|.+...+|+|+||++|+.||+++ |+| T Consensus 80 GIG~yTA~AI~siaf~~~~~~vDgNv~RVl~R~~~~~~~~~~k~~~~~~~~~~~~~~~~~~nqalMdlGa~iC~pk-P~C 158 (290) T PRK13910 80 GIGAYTANAILCFGFREKSACVDANIKRVLLRLFGLDPNIHAKDLQIKANDFLNLNESFNHNQALIDLGALICSPK-PKC 158 (290) T ss_pred CCCCHHHHHHHHHHCCCCCCCCCCCHHEEEEHHHCCCCCCCHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHCCCC-CCC T ss_conf 9982699999998647633200045201410232478994669999999982595422689999998633343799-998 Q ss_pred CCCCCHHHCHHH Q ss_conf 728403317685 Q gi|254780383|r 214 QSCIISNLCKRI 225 (227) Q Consensus 214 ~~C~l~~~C~~~ 225 (227) ..||+++.|... T Consensus 159 ~~CPl~~~C~a~ 170 (290) T PRK13910 159 AICPFNPYCLGK 170 (290) T ss_pred CCCCCHHHHHCC T ss_conf 779795552315 No 9 >smart00478 ENDO3c endonuclease III. 
includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases Probab=100.00 E-value=1.8e-39 Score=286.56 Aligned_cols=148 Identities=43% Similarity=0.733 Sum_probs=143.7 Q ss_pred HHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHH Q ss_conf 96333203567998998731762000101268999999997300379999997523554442001000014567776432 Q gi|254780383|r 57 LSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGI 136 (227) Q Consensus 57 Ls~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGV 136 (227) |||||+|++|++++++|+++||||++++.+++++|+++|+++|||++||++|+++++.|.++|+|.+|.++++|++|||| T Consensus 1 LSqqt~~~~v~~~~~~l~~~~pt~~~l~~a~~~~l~~~i~~~g~~~~ka~~i~~~a~~i~~~~~~~~p~~~~~L~~lpGV 80 (149) T smart00478 1 LSQQTSDEAVNKATERLFEKFPTPEDLAAADEEELEELIRPLGFYRRKAKYLIELARILVEEYGGEVPDDREELLKLPGV 80 (149) T ss_pred CCCCCCHHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHCCCCC T ss_conf 99865289999999999998859999986899999999998688999999999999999986655588559998758986 Q ss_pred HHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCC-CHHHHHHHHHHCCCHHHHHHHHHHHHHHHHH Q ss_conf 3588889999875421000121046787765654078-8899999996218842267899999999665 Q gi|254780383|r 137 GRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGK-TPNKVEQSLLRIIPPKHQYNAHYWLVLHGRY 204 (227) Q Consensus 137 G~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~-~~~~~~~~l~~~~p~~~~~~~~~~li~~G~~ 204 (227) |+|||++||+|+|++|+++||+||+||++|+|+++.+ ++++++..+++++|.+.|.++|++||+||+. 
T Consensus 81 G~~tA~~vl~~~~~~~~~~vD~~v~Rv~~R~~~~~~~~~~~~~~~~l~~~~p~~~~~~~~~~l~~~G~~ 149 (149) T smart00478 81 GRKTANAVLSFALGKPFIPVDTHVLRIAKRLGLVDKKSTPEEVEKLLEKLLPKEDWRELNLLLIDFGRT 149 (149) T ss_pred CHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHCCCCCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCCC T ss_conf 599999999998799835134139999999847888898999999999878934399999999981899 No 10 >COG2231 Uncharacterized protein related to Endonuclease III [DNA replication, recombination, and repair] Probab=100.00 E-value=4.3e-38 Score=277.23 Aligned_cols=207 Identities=25% Similarity=0.384 Sum_probs=186.9 Q ss_pred CCCHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCC-CCCHHHCCCCHHHHHHHHHH Q ss_conf 39989999999999997679998777786899999999963332035679989987317-62000101268999999997 Q gi|254780383|r 19 LYTPKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIA-DTPQKMLAIGEKKLQNYIRT 97 (227) Q Consensus 19 ~~~~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~y-pt~e~l~~a~~~el~~~ir~ 97 (227) |...+...+||..|...|+ +.+||+..+-++++|++||.|+|+|++|.++..+|.+.. -+++++.+.+.++|+++||| T Consensus 2 ~~~~~~~~~iy~~L~~~yg-~q~WWp~~~~~EiiigAILtQNT~WknvekAlenLk~~~~~~l~~I~~~~~~~L~elIrp 80 (215) T COG2231 2 MLNMENITKIYKELLRLYG-DQGWWPADNKDEIIIGAILTQNTSWKNVEKALENLKNEGILNLKKILKLDEEELAELIRP 80 (215) T ss_pred CCCHHHHHHHHHHHHHHCC-CCCCCCCCCCHHHHHHHHHHCCCCHHHHHHHHHHHHHCCCCCHHHHHCCCHHHHHHHHHC T ss_conf 6514889999999999758-866788987206899889852452999999999998815679999845899999998704 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHCC---CCHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCC Q ss_conf 300379999997523554442001---0000-145677764323588889999875421000121046787765654078 Q gi|254780383|r 98 IGIYRKKSENIISLSHILINEFDN---KIPQ-TLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGK 173 (227) Q Consensus 98 ~G~~~~KAk~I~~~a~~i~~~~~g---~vP~-~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~ 173 (227) .||||+||+||+++++.+...|.+ .-+. 
.+++|+++.|||.+|||+||+|++++|+|+||.|..|++.|+|...++ T Consensus 81 sGFYnqKa~rLk~l~k~l~~~~~~~~~~~~~~~R~~LL~iKGIG~ETaDsILlYa~~rp~FVvD~YtrR~l~rlg~i~~k 160 (215) T COG2231 81 SGFYNQKAKRLKALSKNLAKFFINLESFKSEVLREELLSIKGIGKETADSILLYALDRPVFVVDKYTRRLLSRLGGIEEK 160 (215) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHCCCCCC T ss_conf 24089999999999999999864231115188999987268866223999999980486446329999999994551025 Q ss_pred CHHHHHHHHHHCCCHH--HHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHCC Q ss_conf 8899999996218842--26789999999966516489989472840331768519 Q gi|254780383|r 174 TPNKVEQSLLRIIPPK--HQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIKQ 227 (227) Q Consensus 174 ~~~~~~~~l~~~~p~~--~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k~ 227 (227) +++++.+..++.+|++ .+..||.+++.||+.+|+- +|.|+.|||...|.++++ T Consensus 161 ~ydeik~~fe~~l~~~~~lyqe~HAlIv~~~K~f~~k-~~~~~~cpL~~~~~~~~~ 215 (215) T COG2231 161 KYDEIKELFEENLPENLRLYQEFHALIVEHAKHFCKK-KPLCEKCPLKEKCKKYRR 215 (215) T ss_pred CHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHCCC-CCCCCCCHHHHHHHHCCC T ss_conf 4999999998535067999999999999999998158-867787658998752269 No 11 >PRK13913 3-methyladenine DNA glycosylase; Provisional Probab=100.00 E-value=1.6e-37 Score=273.37 Aligned_cols=182 Identities=19% Similarity=0.225 Sum_probs=159.0 Q ss_pred HHHHHHHHH---CCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCC-------CCCHHHCCCCHHHHHHHHH Q ss_conf 999999997---679998777786899999999963332035679989987317-------6200010126899999999 Q gi|254780383|r 27 EIFYLFSLK---WPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIA-------DTPQKMLAIGEKKLQNYIR 96 (227) Q Consensus 27 ~I~~~L~~~---yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~y-------pt~e~l~~a~~~el~~~ir 96 (227) +++..|+.. 
+..|..||+.++||+++|++||.|||+|++|.++..+|.+.- -++++++.++.++|+++|| T Consensus 6 ~l~~~l~~~~~~~~~P~~WWPa~~~FEvivGAILtQNT~W~nVekAl~nLk~a~lL~~~~~~~l~~i~~l~~e~La~lIr 85 (218) T PRK13913 6 EILKALKSLDLLKNAPSWWWPNALKFEALLGAVLTQNTKFEAVLKSLENLKNAFILENDDEINLKKIAYIEFSKLAECVR 85 (218) T ss_pred HHHHHHHHCCCCCCCCCCCCCCCCCCEEEEEEEEHHCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHCCCHHHHHHHHH T ss_conf 99999997574658999998999976553441100418788999999999976677864415999997189999999950 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCC----CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCC Q ss_conf 7300379999997523554442001----000014567776432358888999987542100012104678776565407 Q gi|254780383|r 97 TIGIYRKKSENIISLSHILINEFDN----KIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPG 172 (227) Q Consensus 97 ~~G~~~~KAk~I~~~a~~i~~~~~g----~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~ 172 (227) |+||||+||++|+++++.+.++|++ ..+.+|++|++++|||++|||+||+|+|++|+|+||+|++|+++|+|+.. T Consensus 86 PaGFy~~KA~rLk~l~~~~~~d~~~~~~~~~~~~Re~LL~lkGIG~ETADsILlYa~~~p~FVVDaYT~Ri~~rlG~~~- 164 (218) T PRK13913 86 PSGFYNQKAKRLIDLSKNILKDFQSFENFKQEVTREWLLDQKGIGKESADAILCYVCAKEVMVVDKYSYLFLKKLGIEI- 164 (218) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHCCCCCCHHHHHHHHHHHCCCCEEECHHHHHHHHHHCCCCC- T ss_conf 4015899999999999999987525751453658999974898663339999999749984511188999999819985- Q ss_pred CCHHHHHHHHHHCCCHH----------------HHHHHHHHHHHHHHHHCCCC Q ss_conf 88899999996218842----------------26789999999966516489 Q gi|254780383|r 173 KTPNKVEQSLLRIIPPK----------------HQYNAHYWLVLHGRYVCKAR 209 (227) Q Consensus 173 ~~~~~~~~~l~~~~p~~----------------~~~~~~~~li~~G~~iC~~~ 209 (227) .++++++..++..+|.+ .+..||.++|.||++.|+.+ T Consensus 165 ~~Ydelq~~fe~~l~e~~~~~~~~~~~~~~l~~ly~efHaLIVeh~K~~~~~k 217 (218) T PRK13913 165 EDYDELQHFFEKGVQENLNSALALYENTISLAQLYARFHGKIVEFSKQKLELK 217 (218) T ss_pred CCHHHHHHHHHHCCHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHCCC T ss_conf 79999999999622777888876511334089999999999999999871047 No 12 >KOG2457 consensus Probab=100.00 E-value=2.1e-36 
Score=265.79 Aligned_cols=203 Identities=20% Similarity=0.352 Sum_probs=178.7 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCC------------HHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCH-H Q ss_conf 9999999999997679998777786------------89999999996333203567998998731762000101268-9 Q gi|254780383|r 23 KELEEIFYLFSLKWPSPKGELYYVN------------HFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGE-K 89 (227) Q Consensus 23 ~~~~~I~~~L~~~yp~~~~~l~~~~------------p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~-~ 89 (227) .+...+-.-|.+||...+..|+|++ .|+++|++||+|||+...|.+.|.+++++|||..+++.|+. + T Consensus 88 ~Ev~~fR~sLl~wYD~~KRdLPWR~r~sEde~DwerRaYeVwVSEiMLQQTrV~TV~~YYt~WMqkwPTl~dla~Asl~~ 167 (555) T KOG2457 88 NEVQKFRMSLLDWYDVNKRDLPWRNRRSEDEKDWERRAYEVWVSEIMLQQTRVQTVMKYYTRWMQKWPTLYDLAQASLEK 167 (555) T ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHHHH T ss_conf 89999999898876300223864468752135688889999999999989999999999999998375088888878988 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHH-H Q ss_conf 9999999730037999999752355444200100001456777-6432358888999987542100012104678776-5 Q gi|254780383|r 90 KLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTR-LPGIGRKGANVILSMAFGIPTIGVDTHIFRISNR-I 167 (227) Q Consensus 90 el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~-LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~R-l 167 (227) ++..+|.++|||+ |+++|.+-|+++++.++|.+|++-+.|.+ +||||+|||.+|++++||++.-+||.||.||+.| + T Consensus 168 eVn~lWaGlGyY~-R~rrL~ega~~vv~~~~ge~Prta~~l~kgvpGVG~YTAGAiaSIAf~q~tGiVDGNVirvlsRal 246 (555) T KOG2457 168 EVNELWAGLGYYR-RARRLLEGAKMVVAGTEGEFPRTASSLMKGVPGVGQYTAGAIASIAFNQVTGIVDGNVIRVLSRAL 246 (555) T ss_pred HHHHHHHHHHHHH-HHHHHHHHHHHHHHHCCCCCCCHHHHHHHHCCCCCCCCHHHHHHHHHCCCCCCCCCCHHHHHHHHH T ss_conf 9999984101898-889999999999975788788738999851888774231045542304764320461577767767 Q ss_pred HHHCCCCHHHH----HHHHHHCCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCHHHCHHHC Q ss_conf 65407888999----999962188422678999999996651648998947284033176851 Q gi|254780383|r 168 
GLAPGKTPNKV----EQSLLRIIPPKHQYNAHYWLVLHGRYVCKARKPQCQSCIISNLCKRIK 226 (227) Q Consensus 168 gl~~~~~~~~~----~~~l~~~~p~~~~~~~~~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k 226 (227) .+...-+..+. -+-...++.+.+.+|||+++|++|+.+|++.+|.|+.||+...|..|+ T Consensus 247 AIhsDcSkgk~~q~~wkLA~qLVDP~RPGDFNQalMELGAt~CTpq~P~CS~CPvss~CrA~q 309 (555) T KOG2457 247 AIHSDCSKGKFFQSSWKLAAQLVDPSRPGDFNQALMELGATLCTPQKPSCSSCPVSSQCRAFQ 309 (555) T ss_pred HHCCCCCHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHCCEECCCCCCCCCCCCCHHHHHHHH T ss_conf 613774300688999999998259889883779999836712157998767787288998876 No 13 >cd00056 ENDO3c endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases Probab=100.00 E-value=1.5e-35 Score=260.05 Aligned_cols=153 Identities=39% Similarity=0.629 Sum_probs=146.7 Q ss_pred HHHHHHHHHHHCCCHHHHHHHHHHHHHCC-CCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCC---H Q ss_conf 99999999963332035679989987317-6200010126899999999730037999999752355444200100---0 Q gi|254780383|r 49 FTLIVAVLLSAQSTDVNVNKATKHLFEIA-DTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKI---P 124 (227) Q Consensus 49 ~~~LVa~iLs~qT~d~~v~~~~~~L~~~y-pt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~v---P 124 (227) |++||++||||||++++++.++.+|+++| |||++++++++++|+++++++| |++||++|+++|+.+.++++|.. 
| T Consensus 1 f~~Li~~Il~qq~s~~~a~~~~~~l~~~~~pt~~~l~~~~~~~l~~~~~~~g-y~~Ka~~i~~~a~~i~~~~~~~~~~~~ 79 (158) T cd00056 1 FEVLVSEILSQQTTDKAVNKAYERLFERYGPTPEALAAADEEELRELIRSLG-YRRKAKYLKELARAIVEGFGGLVLDDP 79 (158) T ss_pred CHHHHHHHHHCCCHHHHHHHHHHHHHHHCCCCHHHHHCCCHHHHHHHHCCCC-HHHHHHHHHHHHHHHHHHCCCCCCCCH T ss_conf 9999999998145299999999999985499899998099999999973356-899999999988888986089578988 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHC-CCCHHHHHHHHHHCCCHHHHHHHHHHHHHHH Q ss_conf 01456777643235888899998754210001210467877656540-7888999999962188422678999999996 Q gi|254780383|r 125 QTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAP-GKTPNKVEQSLLRIIPPKHQYNAHYWLVLHG 202 (227) Q Consensus 125 ~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~-~~~~~~~~~~l~~~~p~~~~~~~~~~li~~G 202 (227) .++++|++|||||+|||++||+|+||.|+|+||+||+|+++|+|+.+ ..++++++..++.++|...|.++|++||+|| T Consensus 80 ~~~~~L~~l~GIG~~TA~~vl~~~~~~~~~~vD~~v~R~~~rl~~~~~~~~~~~~~~~~~~~~p~~~~~~~~~~L~~~g 158 (158) T cd00056 80 DAREELLALPGVGRKTANVVLLFALGPDAFPVDTHVRRVLKRLGLIPKKKTPEELEELLEELLPKPYWGEANQALMDLG 158 (158) T ss_pred HHHHHHHCCCCCCHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHCCCCCCCCHHHHHHHHHHHCCCCHHHHHHHHHHHCC T ss_conf 8999987589828999999999987998362259999999995787789999999999998589201999999998676 No 14 >pfam00730 HhH-GPD HhH-GPD superfamily base excision DNA repair protein. This family contains a diverse range of structurally related DNA repair proteins. The superfamily is called the HhH-GPD family after its hallmark Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This includes endonuclease III, EC:4.2.99.18 and MutY an A/G-specific adenine glycosylase, both have a C terminal 4Fe-4S cluster. The family also includes 8-oxoguanine DNA glycosylases. The methyl-CPG binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II EC:3.2.2.21 and other members of the AlkA family. 
Probab=99.96 E-value=2.2e-29 Score=218.60 Aligned_cols=136 Identities=40% Similarity=0.677 Sum_probs=125.4 Q ss_pred HHHHHHHCCCHHHHHHHHHHHHHCC--CCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHH--- Q ss_conf 9999963332035679989987317--62000101268999999997300379999997523554442001000014--- Q gi|254780383|r 53 VAVLLSAQSTDVNVNKATKHLFEIA--DTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTL--- 127 (227) Q Consensus 53 Va~iLs~qT~d~~v~~~~~~L~~~y--pt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~--- 127 (227) |++||||||++++++.++.+|+++| |||++++++++++|+++|+++||+++||++|+++|+.+.++|++.+|.+. T Consensus 1 V~~IlsQq~s~~~a~~i~~rl~~~~~~pt~~~l~~~~~~~l~~~i~~~G~~~~Ka~~I~~~a~~~~~~~~~~~~~~~~~~ 80 (144) T pfam00730 1 VSAILSQQTSDKAANKITKRLFERYGFPTPEDLAEADEEELRELIKGLGFYRRKAKYIKELARILVEGYLGLVPLDLEEL 80 (144) T ss_pred CEEEECCCCHHHHHHHHHHHHHHHHCCCCHHHHHCCCHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHCCCCCCCCHHHH T ss_conf 95542012349999999999999828989999985999999999870897699999999999888986289788611569 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH--HHHCCCHHHHHHHHHHHHCCC-CHHHHHHHHHHCCCH Q ss_conf 5677764323588889999875421--000121046787765654078-889999999621884 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFGIP--TIGVDTHIFRISNRIGLAPGK-TPNKVEQSLLRIIPP 188 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~~p--~~~VDthv~Rv~~Rlgl~~~~-~~~~~~~~l~~~~p~ 188 (227) ++|++|||||++||+++|+|+|++| .+++|+||+|+++|+|+.+.+ +++++++.+.+.+|. T Consensus 81 ~~L~~l~GIG~~ta~~~l~~~~~~~d~~~~~D~~v~r~~~rl~~~~~~~~~~~~~~~l~~~~~p 144 (144) T pfam00730 81 EALLALPGVGRWTAEAVLLFALGRPDVFPAVDTHVRRVAKRLGLIKEKPTPEEVERELEELWPP 144 (144) T ss_pred HHHHCCCCCCHHHHHHHHHHHCCCCCCEECCCHHHHHHHHHHCCCCCCCCHHHHHHHHHHHCCC T ss_conf 9986088976999999999986999873264499999999977998999999999998713788 No 15 >TIGR00588 ogg 8-oxoguanine DNA-glycosylase (ogg); InterPro: IPR004577 All proteins in this family for which functions are known are 8-oxo-guanaine DNA glycosylases that function in base excision repair. 
The enzyme incises DNA at 8-oxoG residues, and excises 7,8-dihydro-8-oxoguanine from damaged DNA. It has beta-lyase activity that nicks DNA 3' to the lesion.; GO: 0008534 oxidized purine base lesion DNA N-glycosylase activity, 0006281 DNA repair, 0005634 nucleus. Probab=99.76 E-value=1.1e-18 Score=146.41 Aligned_cols=127 Identities=21% Similarity=0.367 Sum_probs=104.7 Q ss_pred CCCHHHHHHHHHHHHCCCHHHHHHHHHHH---------------HHCCCCCHHHCC------CCHHHHHH---------H Q ss_conf 78689999999996333203567998998---------------731762000101------26899999---------9 Q gi|254780383|r 45 YVNHFTLIVAVLLSAQSTDVNVNKATKHL---------------FEIADTPQKMLA------IGEKKLQN---------Y 94 (227) Q Consensus 45 ~~~p~~~LVa~iLs~qT~d~~v~~~~~~L---------------~~~ypt~e~l~~------a~~~el~~---------~ 94 (227) .+|||+.|||-|-|.|++-.+.-+-.++| |.-||+++.|+. -+..+.|. - T Consensus 145 qkdP~EcliSfIcSsNnni~RiTrm~e~lc~~fG~~~~~~dgvtyH~FP~~~~LtgvaeGsledl~~~E~nlPsdfsfn~ 224 (379) T TIGR00588 145 QKDPFECLISFICSSNNNIARITRMVERLCQAFGPRLITLDGVTYHGFPSLHALTGVAEGSLEDLPEAEANLPSDFSFNH 224 (379) T ss_pred CCCCHHHHHHHHHHHCCCHHHHHHHHHHHHHHHCCHHHHHCCEEECCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCCHHH T ss_conf 27871566778875046521254338999986220321224621147987566521210016679998751564346265 Q ss_pred HHHHHH-HHHHHHHHHHHHHHHHHHHCCC-C----------------HHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHH Q ss_conf 997300-3799999975235544420010-0----------------0014567776432358888999987542-1000 Q gi|254780383|r 95 IRTIGI-YRKKSENIISLSHILINEFDNK-I----------------PQTLEGLTRLPGIGRKGANVILSMAFGI-PTIG 155 (227) Q Consensus 95 ir~~G~-~~~KAk~I~~~a~~i~~~~~g~-v----------------P~~~~~L~~LpGVG~ktA~~il~~~~~~-p~~~ 155 (227) +|.+|+ | ||+||.+.|+.|+|+-++. 
+ .+.++.|+.|||||+|.||||+++++++ .++| T Consensus 225 LR~lG~GY--RA~Yi~~tar~l~ee~~~~nitsdta~LQ~ic~~~~Yedar~~L~~l~GVG~KVADCicLmgl~k~~avP 302 (379) T TIGR00588 225 LRKLGLGY--RARYIRETARALLEEQGGRNITSDTAWLQQICKDADYEDAREALLELPGVGPKVADCICLMGLDKPQAVP 302 (379) T ss_pred HHHCCCCC--CCHHHHHHHHHHHHHCCCCCCHHHHHHHHHHCCCCCHHHHHHHHHCCCCCCCHHHHHHHHHHCCCCCCEE T ss_conf 75248875--4068999999988412664200235799986066886789999721699970488888865227897101 Q ss_pred CCCHHHHHHHHH-HHHCCC Q ss_conf 121046787765-654078 Q gi|254780383|r 156 VDTHIFRISNRI-GLAPGK 173 (227) Q Consensus 156 VDthv~Rv~~Rl-gl~~~~ 173 (227) ||.||.||.+|. ++...+ T Consensus 303 VDVh~~~Ia~rdy~~sank 321 (379) T TIGR00588 303 VDVHVRRIAKRDYQWSANK 321 (379) T ss_pred EHHHHHHHHHHCCCCCCCC T ss_conf 1156888864403531012 No 16 >COG0122 AlkA 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase [DNA replication, recombination, and repair] Probab=99.67 E-value=4.3e-15 Score=122.33 Aligned_cols=127 Identities=26% Similarity=0.334 Sum_probs=103.1 Q ss_pred CCHHHHHHHHHHHHCCCHHHHHHHHHHHHHC----------CCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 8689999999996333203567998998731----------762000101268999999997300379999997523554 Q gi|254780383|r 46 VNHFTLIVAVLLSAQSTDVNVNKATKHLFEI----------ADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHIL 115 (227) Q Consensus 46 ~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~----------ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i 115 (227) .|||+.||+.|++||-+.+...+...+|... ||||++++.++++.+. 
.+|+...||++|+++|+.+ T Consensus 103 ~d~fe~lv~aI~~QqvS~~~A~~i~~rl~~~~g~~~~~~~~fptpe~l~~~~~~~l~----~~g~s~~Ka~yi~~~A~~~ 178 (285) T COG0122 103 PDPFEALVRAILSQQVSVAAAAKIWARLVSLYGNALEIYHSFPTPEQLAAADEEALR----RCGLSGRKAEYIISLARAA 178 (285) T ss_pred CCHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHCCHHHHH----HHCCCHHHHHHHHHHHHHH T ss_conf 678999999999765059999999999999818766656679899999847999998----8378577899999999999 Q ss_pred HHHH-C-CC-----CHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHH-CCCHHHHHHHHHHHHCCCCHHH Q ss_conf 4420-0-10-----00014567776432358888999987542-1000-1210467877656540788899 Q gi|254780383|r 116 INEF-D-NK-----IPQTLEGLTRLPGIGRKGANVILSMAFGI-PTIG-VDTHIFRISNRIGLAPGKTPNK 177 (227) Q Consensus 116 ~~~~-~-g~-----vP~~~~~L~~LpGVG~ktA~~il~~~~~~-p~~~-VDthv~Rv~~Rlgl~~~~~~~~ 177 (227) .+.. + .. .-..+++|.+|+|||++||+++|+|++++ .+|| .|.++.+-++++. ..++.+.+ T Consensus 179 ~~g~~~~~~l~~~~~e~a~e~L~~i~GIG~WTAe~~llf~lgr~dvfP~~D~~lr~~~~~~~-~~~~~~~~ 248 (285) T COG0122 179 AEGELDLSELKPLSDEEAIEELTALKGIGPWTAEMFLLFGLGRPDVFPADDLGLRRAIKKLY-RLPTRPTE 248 (285) T ss_pred HCCCCCHHHHCCCCHHHHHHHHHCCCCCCHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHH-CCCCCCHH T ss_conf 85996567662588999999987378867999999999816889867715599999999972-57878508 No 17 >KOG2875 consensus Probab=99.63 E-value=9.3e-16 Score=126.80 Aligned_cols=118 Identities=19% Similarity=0.311 Sum_probs=100.3 Q ss_pred CCCHHHHHHHHHHHHCCCHHHHHHHHHHH---------------HHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 78689999999996333203567998998---------------731762000101268999999997300379999997 Q gi|254780383|r 45 YVNHFTLIVAVLLSAQSTDVNVNKATKHL---------------FEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENII 109 (227) Q Consensus 45 ~~~p~~~LVa~iLs~qT~d~~v~~~~~~L---------------~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~ 109 (227) -.|||+-|++-|.|++.+-.+.-+-.+.| |..|||.++|+. .++|.-+|.+||-- ||++|. 
T Consensus 114 rQdP~E~lfSFiCSSNNNIaRIT~Mve~fc~~fG~~i~~~dg~~~h~FPsl~~L~g---~~~Ea~LR~~gfGY-RAkYI~ 189 (323) T KOG2875 114 RQDPIECLFSFICSSNNNIARITGMVERFCQAFGPRIIQLDGVDYHGFPSLQALAG---PEVEAELRKLGFGY-RAKYIS 189 (323) T ss_pred HCCCHHHHHHHHHCCCCCHHHHHHHHHHHHHHHCCCEEEECCCCCCCCCCHHHHCC---HHHHHHHHHCCCCH-HHHHHH T ss_conf 51719788988725787599999999999986175037555712036854777647---47699999817644-689999 Q ss_pred HHHHHHHHHHCCC----------CHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHCCCHHHHHHHH Q ss_conf 5235544420010----------00014567776432358888999987542-100012104678776 Q gi|254780383|r 110 SLSHILINEFDNK----------IPQTLEGLTRLPGIGRKGANVILSMAFGI-PTIGVDTHIFRISNR 166 (227) Q Consensus 110 ~~a~~i~~~~~g~----------vP~~~~~L~~LpGVG~ktA~~il~~~~~~-p~~~VDthv~Rv~~R 166 (227) +.++.|.++++|. ..+.+++|.+|||||+|.||||++++++. .++|||+||.|+..- T Consensus 190 ~ta~~l~~~~g~~~wLqslr~~~yeear~~L~~lpGVG~KVADCI~Lm~l~~~~~VPVDvHi~ria~~ 257 (323) T KOG2875 190 ATARALQEKQGGLAWLQSLRKSSYEEAREALCSLPGVGPKVADCICLMSLDKLSAVPVDVHIWRIAQD 257 (323) T ss_pred HHHHHHHHHCCCCHHHHHHHCCCHHHHHHHHHCCCCCCCHHHHHHHHHCCCCCCCCCCHHHHHHHHHC T ss_conf 99999997235005999885452899999985288876147562231205887655622458887622 No 18 >PRK10308 3-methyl-adenine DNA glycosylase II; Provisional Probab=99.42 E-value=1.4e-11 Score=98.64 Aligned_cols=164 Identities=20% Similarity=0.257 Sum_probs=114.4 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHH-------------CCCCCHHHCCCCHH Q ss_conf 99999999999976799987777868999999999633320356799899873-------------17620001012689 Q gi|254780383|r 23 KELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFE-------------IADTPQKMLAIGEK 89 (227) Q Consensus 23 ~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~-------------~ypt~e~l~~a~~~ 89 (227) ..+.+.+..|....|.-+.+-.| |+|+.+|-+||-||-+-........+|.+ .||||+.++.++.+ T Consensus 89 ~~I~~~L~pl~~~~PGlRvPg~~-d~fE~~vrAIlGQQvSv~aA~tl~~Rlv~~~G~~~~~~~~~~~FPtp~~la~~~~~ 167 (283) T PRK10308 89 QIVAGALGKLGAARPGLRLPGSV-DAFEQGVRAILGQLVSVAMAAKLTAKVAQLYGERLDDFPDYVCFPTPQRLAAADPQ 167 
(283) T ss_pred HHHHHHHHHHHHCCCCCCCCCCC-CHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHCCCCCCCCCCEECCCHHHHHCCCHH T ss_conf 99999876776318997788879-88999999997140219999999999999948957889885347998998538975 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HC----CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHCCCHHHHH Q ss_conf 99999997300379999997523554442-00----10000145677764323588889999875421-00012104678 Q gi|254780383|r 90 KLQNYIRTIGIYRKKSENIISLSHILINE-FD----NKIPQTLEGLTRLPGIGRKGANVILSMAFGIP-TIGVDTHIFRI 163 (227) Q Consensus 90 el~~~ir~~G~~~~KAk~I~~~a~~i~~~-~~----g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p-~~~VDthv~Rv 163 (227) + ++.+|+.+.|++.|..+|+.+.+. .. ......++.|+.|||||+.||+.|++.+++.| +|+.+--+.| T Consensus 168 ~----L~~lg~p~~ra~tl~~lA~a~~~g~l~l~~~~d~~~~~~~L~~l~GIGpWTa~Yv~mR~lg~pD~fp~~Dl~l~- 242 (283) T PRK10308 168 A----LKALGMPLKRAEALIHLANAALEGTLPLTAPGDVEQAMKTLQTFPGIGRWTANYFALRGWQAKDVFLPDDYLIK- 242 (283) T ss_pred H----HHHCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHCCCCCHHHHHHHHHHHCCCCCCCCCCCHHHH- T ss_conf 6----64558966899999999999966986677789999999998736797889999999983789987876029999- Q ss_pred HHHHHHHCCCCHHHHHHHHHHCCCHHHHHHHHH Q ss_conf 776565407888999999962188422678999 Q gi|254780383|r 164 SNRIGLAPGKTPNKVEQSLLRIIPPKHQYNAHY 196 (227) Q Consensus 164 ~~Rlgl~~~~~~~~~~~~l~~~~p~~~~~~~~~ 196 (227) ++++ +.++.+++...+.+=|=.-+.-+|. 
T Consensus 243 -~~l~---~~~~~~~~~~a~~W~PWRsYA~~~L 271 (283) T PRK10308 243 -QRFP---GMTPAQIRRYAERWKPWRSYALLHI 271 (283) T ss_pred -HHHC---CCCHHHHHHHHHCCCCHHHHHHHHH T ss_conf -7613---5999999999753588999999999 No 19 >KOG1918 consensus Probab=99.20 E-value=2.3e-10 Score=90.52 Aligned_cols=143 Identities=23% Similarity=0.318 Sum_probs=109.8 Q ss_pred CCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHH------CCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9987777868999999999633320356799899873------1762000101268999999997300379999997523 Q gi|254780383|r 39 PKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFE------IADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLS 112 (227) Q Consensus 39 ~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~------~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a 112 (227) |.+....+.||.-|+.+|||||-.+..+|-++.++.. .+|+|+.+..++.++|. .+||..+|+.+|..+| T Consensus 66 p~~~~~~q~Pf~~LiraIlsQQLs~kAansI~~Rfvsl~~g~~~~~~pe~i~~~~~~~lr----kcG~S~rK~~yLh~lA 141 (254) T KOG1918 66 PLTFKETQTPFERLIRAILSQQLSGKAANSIYNRFVSLCGGAEKFPTPEFIDPLDCEELR----KCGFSKRKASYLHSLA 141 (254) T ss_pred CCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCHHCCCCCHHHHH----HHCCCHHHHHHHHHHH T ss_conf 997662347099999999999998999999999999984797677883212767899999----8573324679999999 Q ss_pred HHHHHHHCCCCHHH-----------HHHHHHHHHHHHHHHHHHHHHHHHH-HHHHCCCHHHH-HHHH-HHHHCCCCHHHH Q ss_conf 55444200100001-----------4567776432358888999987542-10001210467-8776-565407888999 Q gi|254780383|r 113 HILINEFDNKIPQT-----------LEGLTRLPGIGRKGANVILSMAFGI-PTIGVDTHIFR-ISNR-IGLAPGKTPNKV 178 (227) Q Consensus 113 ~~i~~~~~g~vP~~-----------~~~L~~LpGVG~ktA~~il~~~~~~-p~~~VDthv~R-v~~R-lgl~~~~~~~~~ 178 (227) ....+.| ||.+ ++-|..+.|||+.|+...|.|++++ .++|+|--..| =.+- +|+.+-..+.++ T Consensus 142 ~~~~ng~---I~s~~~i~~mseEeL~~~LT~VKGIg~Wtv~MflIfsL~R~DVmp~dDlgir~g~k~l~gl~~~p~~~ev 218 (254) T KOG1918 142 EAYTNGY---IPSKSGIEKMSEEELIERLTNVKGIGRWTVEMFLIFSLHRPDVMPADDLGIRNGVKKLLGLKPLPLPKEV 218 (254) T ss_pred HHHHCCC---CCCHHHHHHCCHHHHHHHHHHCCCCCCEEEEEEEEECCCCCCCCCCHHHHHHHHHHHHHCCCCCCCHHHH T ss_conf 
9986477---7766777615799999999860475511244321101477766685036688779998089998865999 Q ss_pred HHHHHHCCCH Q ss_conf 9999621884 Q gi|254780383|r 179 EQSLLRIIPP 188 (227) Q Consensus 179 ~~~l~~~~p~ 188 (227) ++.-+.+-|- T Consensus 219 ekl~e~~kpy 228 (254) T KOG1918 219 EKLCEKCKPY 228 (254) T ss_pred HHHHHHCCCH T ss_conf 9986203404 No 20 >PRK01229 N-glycosylase/DNA lyase; Provisional Probab=99.00 E-value=9.8e-09 Score=79.50 Aligned_cols=131 Identities=24% Similarity=0.333 Sum_probs=102.2 Q ss_pred CCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHHC--- Q ss_conf 868999999999633320356799899873176200010126899999999730--037999999752355444200--- Q gi|254780383|r 46 VNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIG--IYRKKSENIISLSHILINEFD--- 120 (227) Q Consensus 46 ~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G--~~~~KAk~I~~~a~~i~~~~~--- 120 (227) .+-|.-|+-|||.+||+.....++...+- +.+..++.+++.+.++.+| |+|+||++|.++=+.+ .+.. T Consensus 35 ~~iF~EL~FCILTanssA~~~~ka~~~l~------~g~~~~~~eel~~~l~~~g~RF~n~rAkyIv~aR~~~-~~lk~i~ 107 (208) T PRK01229 35 EDLFSELSFCILTANSSAEGGIKAQKEIG------DGFLYLSEEELREKLKEVGHRFPNLRAEYIVEARKLI-GKLKEII 107 (208) T ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHHH------HHCCCCCHHHHHHHHHHCCCCCCHHHHHHHHHHHHHH-HHHHHHH T ss_conf 89999999999388740778999999987------4002599999999999849678422689999999998-8999999 Q ss_pred ---CCCHHHHHHHHH-HHHHHHHHHHHHHHH-HHHHHHHHCCCHHHHHHHHHHHHCC-------CCHHHHHHHHHH Q ss_conf ---100001456777-643235888899998-7542100012104678776565407-------888999999962 Q gi|254780383|r 121 ---NKIPQTLEGLTR-LPGIGRKGANVILSM-AFGIPTIGVDTHIFRISNRIGLAPG-------KTPNKVEQSLLR 184 (227) Q Consensus 121 ---g~vP~~~~~L~~-LpGVG~ktA~~il~~-~~~~p~~~VDthv~Rv~~Rlgl~~~-------~~~~~~~~~l~~ 184 (227) +.....|++|.+ ++|+|.|-|+-.|-. +| .....+|.|+.|.+.++|+.+. 
+.+.++|..|.+ T Consensus 108 ~~~~~~~e~Re~Lv~nIKG~G~KEASHFLRNiG~-~dlAIlDrHILr~l~~~g~i~~~pk~~t~k~Yle~E~~l~~ 182 (208) T PRK01229 108 KADKDQFEAREFLVKNIKGIGYKEASHFLRNVGF-EDLAILDRHILRFLKRYGLIKEIPKSLSKKRYLEIESILRE 182 (208) T ss_pred HCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCC-CHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHH T ss_conf 7068778999999985647667999999996585-23577699999999982897656787789899999999999 No 21 >COG1059 Thermostable 8-oxoguanine DNA glycosylase [DNA replication, recombination, and repair] Probab=98.65 E-value=2.1e-07 Score=70.52 Aligned_cols=132 Identities=23% Similarity=0.271 Sum_probs=99.5 Q ss_pred CCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHHCC-- Q ss_conf 868999999999633320356799899873176200010126899999999730--0379999997523554442001-- Q gi|254780383|r 46 VNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIG--IYRKKSENIISLSHILINEFDN-- 121 (227) Q Consensus 46 ~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G--~~~~KAk~I~~~a~~i~~~~~g-- 121 (227) .+-|.-|.-|||.+|++.....++-..|= +.+..++.+||.+.++.+| |||+||++|...-+.+ .+-.. T Consensus 37 e~lf~ELsFCILTANsSA~~~~~~q~~lG------~gfly~~~eEL~e~Lk~~g~Rf~n~raeyIVeaR~~~-~~lk~~v 109 (210) T COG1059 37 EDLFKELSFCILTANSSATMGLRAQNELG------DGFLYLSEEELREKLKEVGYRFYNVRAEYIVEAREKF-DDLKIIV 109 (210) T ss_pred HHHHHHHHHHHCCCCCHHHHHHHHHHHHC------CCCCCCCHHHHHHHHHHHCCHHCCCCHHHHHHHHHHH-HHHHHHH T ss_conf 89999989986046611777999999863------3310288999999999816012242459999999988-7789988 Q ss_pred ---CCHH-HHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCC-------CHHHHHHHHHH Q ss_conf ---0000-145677-764323588889999875421000121046787765654078-------88999999962 Q gi|254780383|r 122 ---KIPQ-TLEGLT-RLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGK-------TPNKVEQSLLR 184 (227) Q Consensus 122 ---~vP~-~~~~L~-~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~-------~~~~~~~~l~~ 184 (227) ..+. .|+.|+ .+.|+|-|-|+-.|-.+--.....+|.|+.|.+.|.|+.... 
.+..++..|.+ T Consensus 110 ~~~~~~~vaRE~Lv~nikGiGyKEASHFLRNVG~~D~AIlDrHIlr~l~r~g~i~e~~kt~t~K~YLe~E~ilr~ 184 (210) T COG1059 110 KADENEKVARELLVENIKGIGYKEASHFLRNVGFEDLAILDRHILRWLVRYGLIDENPKTLTRKLYLEIEEILRS 184 (210) T ss_pred HCCCCHHHHHHHHHHHCCCCCHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHHHHHHHH T ss_conf 267426789999999701553799999999547168999999999999983652347454448889999999999 No 22 >TIGR03252 uncharacterized HhH-GPD family protein. This model describes a small, well-conserved bacterial protein family. Its sequence largely consists of a domain, HhH-GPD, found in a variety of related base excision DNA repair enzymes (see pfam00730). Probab=98.03 E-value=2.4e-05 Score=56.69 Aligned_cols=106 Identities=16% Similarity=0.300 Sum_probs=83.5 Q ss_pred CCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCC--CCCHHHCCCCHHHHHHHHHHH----HHHHHHHHHHHHHHHHH Q ss_conf 777786899999999963332035679989987317--620001012689999999973----00379999997523554 Q gi|254780383|r 42 ELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIA--DTPQKMLAIGEKKLQNYIRTI----GIYRKKSENIISLSHIL 115 (227) Q Consensus 42 ~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~y--pt~e~l~~a~~~el~~~ir~~----G~~~~KAk~I~~~a~~i 115 (227) +|--+|||-+||+.+|-||-.-+..-..-.++-+|. .++..++.++++++.++...- -|...=|+||+++|+.| T Consensus 12 ~LL~~~p~AlLiGMLLDQQvPmE~AF~GP~~l~~RlG~lD~~~IA~~Dpd~f~~l~~e~PAiHRfP~SMA~Riq~l~~~i 91 (177) T TIGR03252 12 ELLSSDPFALLTGMLLDQQVPMERAFAGPHKIARRMGSLDAEDIAKYDPQAFVALFSERPAVHRFPGSMAKRVQALAQYV 91 (177) T ss_pred HHHCCCCHHHHHHHHHHCCCHHHHHHCCHHHHHHHHCCCCHHHHHHCCHHHHHHHHCCCCCHHHCCHHHHHHHHHHHHHH T ss_conf 88713928999999982334299986185999988379998999817999999997679614217288999999999999 Q ss_pred HHHHCCCCHH-------H----HHHHHHHHHHHHHHHHHHHHH Q ss_conf 4420010000-------1----456777643235888899998 Q gi|254780383|r 116 INEFDNKIPQ-------T----LEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 116 ~~~~~g~vP~-------~----~~~L~~LpGVG~ktA~~il~~ 147 (227) +++|+|+.-. | .+-|..|||-|..-|...+.. 
T Consensus 92 v~~YdGda~~vW~~g~~dg~ell~Rl~~LPGFG~qKA~IFlAL 134 (177) T TIGR03252 92 VDTYDGDATAVWTEGDPDGKELLRRLKALPGFGKQKAKIFLAL 134 (177) T ss_pred HHHHCCCHHHHHCCCCCCHHHHHHHHHHCCCCCHHHHHHHHHH T ss_conf 9980894988740679989999999986799619999999999 No 23 >pfam00633 HHH Helix-hairpin-helix motif. The helix-hairpin-helix DNA-binding motif is found to be duplicated in the central domain of RuvA. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain. Probab=97.34 E-value=6.5e-05 Score=53.77 Aligned_cols=29 Identities=45% Similarity=0.702 Sum_probs=26.3 Q ss_pred HCCCCHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 00100001456777643235888899998 Q gi|254780383|r 119 FDNKIPQTLEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 119 ~~g~vP~~~~~L~~LpGVG~ktA~~il~~ 147 (227) +++..|.++++|++|||||+|||+.|+.+ T Consensus 2 ~~~~~~as~eeL~~lpGVG~~tA~~I~~~ 30 (30) T pfam00633 2 LEGLIPASREELLALPGVGPKTAEAILSY 30 (30) T ss_pred CCCCCCCCHHHHHHCCCCCHHHHHHHHCC T ss_conf 64435235999972889776889988539 No 24 >smart00525 FES FES domain. iron-sulpphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3) Probab=96.54 E-value=0.00092 Score=46.01 Aligned_cols=22 Identities=41% Similarity=1.124 Sum_probs=20.6 Q ss_pred HCCCCCCCCCCCCCHHHCHHHC Q ss_conf 1648998947284033176851 Q gi|254780383|r 205 VCKARKPQCQSCIISNLCKRIK 226 (227) Q Consensus 205 iC~~~~P~C~~C~l~~~C~~~k 226 (227) ||++++|+|+.|||+++|+++. T Consensus 1 iC~~~kP~C~~Cpl~~~C~~~~ 22 (26) T smart00525 1 ICTARKPRCDECPLKDLCPAYX 22 (26) T ss_pred CCCCCCCCCCCCCCHHHCCHHH T ss_conf 9656688767682121060333 No 25 >pfam09674 DUF2400 Protein of unknown function (DUF2400). Members of this uncharacterized protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighbourhoods show little conservation. 
Probab=95.25 E-value=0.012 Score=38.45 Aligned_cols=42 Identities=26% Similarity=0.450 Sum_probs=31.2 Q ss_pred HHHCCCHHHHHHHHHHHHCCCCH-----HHHHHHHHHCCCHH-HHHHH Q ss_conf 00012104678776565407888-----99999996218842-26789 Q gi|254780383|r 153 TIGVDTHIFRISNRIGLAPGKTP-----NKVEQSLLRIIPPK-HQYNA 194 (227) Q Consensus 153 ~~~VDthv~Rv~~Rlgl~~~~~~-----~~~~~~l~~~~p~~-~~~~~ 194 (227) .+|+||||.||+.+|||...++. .++...|.++-|.+ ..+|| T Consensus 174 ~iPLDtH~~rvar~LgL~~rk~~d~kaa~ElT~~lr~~dp~DPvKYDF 221 (230) T pfam09674 174 IIPLDTHTHRVARKLGLLKRKQYDLKAALEITAALRELDPDDPVKYDF 221 (230) T ss_pred EEECCHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHCCCCCCCHHHH T ss_conf 320207089999992882256112999999999998609889810456 No 26 >TIGR00575 dnlj DNA ligase, NAD-dependent; InterPro: IPR001679 DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two DNA fragments by catalyzing the formation of an internucleotide ester bond between phosphate and deoxyribose. It is active during DNA replication, DNA repair and DNA recombination. There are two forms of DNA ligase: one requires ATP (6.5.1.1 from EC), the other NAD (6.5.1.2 from EC). This family is predominantly composed of NAD-dependent bacterial DNA ligases. They are proteins of about 75 to 85 Kd whose sequence is well conserved , . They also show similarity to yicF, an Escherichia coli hypothetical protein of 63 Kd.; GO: 0003911 DNA ligase (NAD+) activity, 0006260 DNA replication, 0006281 DNA repair. 
Probab=94.96 E-value=0.04 Score=34.95 Aligned_cols=27 Identities=30% Similarity=0.394 Sum_probs=22.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 5677764323588889999875421000 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFGIPTIG 155 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~~p~~~ 155 (227) .+|++++|||+++|..|..| |..+.+- T Consensus 564 s~L~~~~g~G~~vA~~~~~~-F~~~~~~ 590 (706) T TIGR00575 564 SELLSVEGVGPKVAESIVNF-FHDPNNL 590 (706) T ss_pred HHHHHCCCCHHHHHHHHHHH-HHCCCCC T ss_conf 55641014027899999998-7120001 No 27 >TIGR02757 TIGR02757 conserved hypothetical protein TIGR02757; InterPro: IPR014127 Members of this uncharacterised protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighbourhoods show little conservation.. Probab=94.53 E-value=0.018 Score=37.30 Aligned_cols=44 Identities=25% Similarity=0.503 Sum_probs=33.3 Q ss_pred HHHHCCCHHHHHH-HHHHHHCCC-----CHHHHHHHHHHCCCHH-HHHHHH Q ss_conf 1000121046787-765654078-----8899999996218842-267899 Q gi|254780383|r 152 PTIGVDTHIFRIS-NRIGLAPGK-----TPNKVEQSLLRIIPPK-HQYNAH 195 (227) Q Consensus 152 p~~~VDthv~Rv~-~Rlgl~~~~-----~~~~~~~~l~~~~p~~-~~~~~~ 195 (227) =.+|+|||++||+ .-|||...+ ++.++-+.|.++.|.| -.+||. T Consensus 211 Li~PLDTH~~~~~s~~Lkl~~rk~~dlK~A~~iT~~L~~~~p~DP~kYDFA 261 (269) T TIGR02757 211 LILPLDTHVFRIASKKLKLLKRKSYDLKAAIEITEALKKLNPEDPIKYDFA 261 (269) T ss_pred CCCCCHHHHHHHHHHHHCHHHHHHCCHHHHHHHHHHHHHHCCCCCCCCCHH T ss_conf 403335899999877623256652366899999999876178859524411 No 28 >TIGR00084 ruvA Holliday junction DNA helicase RuvA; InterPro: IPR000085 In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure . 
In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and base pair rearrangements of Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB.; GO: 0003678 DNA helicase activity, 0006281 DNA repair, 0006310 DNA recombination. Probab=94.52 E-value=0.04 Score=34.99 Aligned_cols=84 Identities=24% Similarity=0.363 Sum_probs=53.9 Q ss_pred HCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCH-HHHHHHHHHHHH Q ss_conf 33320356799899873176200010126899-9999997300379999997523554442001000-014567776432 Q gi|254780383|r 59 AQSTDVNVNKATKHLFEIADTPQKMLAIGEKK-LQNYIRTIGIYRKKSENIISLSHILINEFDNKIP-QTLEGLTRLPGI 136 (227) Q Consensus 59 ~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~e-l~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP-~~~~~L~~LpGV 136 (227) +.-..+..|.+...||. +...++.+ ..++|+-.|---+=|-+|+.... .+.+..-|. .+.+.|.++||| T Consensus 56 ~~~~RedaNQi~~~LfG-------F~~~~Er~lF~~Li~~nGvGpk~ALaiL~~~~--~~~~~~ai~~~~~~~L~k~pGv 126 (217) T TIGR00084 56 HLVVREDANQILHLLFG-------FNTLEERELFKELIKVNGVGPKLALAILSNMS--PEEFVQAIETEEVKALVKIPGV 126 (217) T ss_pred EEEEECCHHHHHHHHHC-------CCCHHHHHHHHHHHHCCCCHHHHHHHHHCCCC--HHHHHHHHHHHHHHHHHCCCCC T ss_conf 77776046789999734-------79877899999985148802899999866788--7589888864104442045885 Q ss_pred HHHHHHHHH-HHHHHH Q ss_conf 358888999-987542 Q gi|254780383|r 137 GRKGANVIL-SMAFGI 151 (227) Q Consensus 137 G~ktA~~il-~~~~~~ 151 (227) |.|+|+.++ +.-.|+ T Consensus 127 GKK~A~~l~~leL~gk 142 (217) T TIGR00084 127 GKKTAERLLALELKGK 142 (217) T ss_pred CHHHHHHHHHHHHHHH T ss_conf 7378999987775454 No 29 >PRK10353 3-methyl-adenine DNA glycosylase I; Provisional Probab=94.21 E-value=0.33 Score=28.77 Aligned_cols=68 Identities=21% Similarity=0.239 Sum_probs=51.7 Q ss_pred HHHHHHHHHHHHCCCHHHHHHHHHHHHHCCC--CCHHHCCCCHHHHHHHHHHHHH--HHHHHHHHHHHHHHH Q ss_conf 
8999999999633320356799899873176--2000101268999999997300--379999997523554 Q gi|254780383|r 48 HFTLIVAVLLSAQSTDVNVNKATKHLFEIAD--TPQKMLAIGEKKLQNYIRTIGI--YRKKSENIISLSHIL 115 (227) Q Consensus 48 p~~~LVa~iLs~qT~d~~v~~~~~~L~~~yp--t~e~l~~a~~~el~~~ir~~G~--~~~KAk~I~~~a~~i 115 (227) -|+.|+-+.+-+--+|..+.+--+.+.+.|- +|+.++..++++++.++.--|. .+.|-..++.=|+.+ T Consensus 31 LFE~L~LE~~QaGLSW~tiL~KRe~fr~AF~~Fd~~~VA~~~e~die~Ll~n~~IIRNr~KI~AvI~NA~~~ 102 (189) T PRK10353 31 LFEMICLEGQQAGLSWITVLKKRENYRACFHQFDPVRVAAMQEEDVERLVQDAGIIRHRGKIQAIIGNARAY 102 (189) T ss_pred HHHHHHHHHHCCCCCHHHHHHHHHHHHHHHCCCCHHHHHCCCHHHHHHHHCCCCHHHHHHHHHHHHHHHHHH T ss_conf 999999998524167999999899999998089989996389999998854621246189999999999999 No 30 >pfam10576 EndIII_4Fe-2S Iron-sulfur binding domain of endonuclease III. Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micrococcus UV endonuclease. Probab=94.10 E-value=0.02 Score=36.96 Aligned_cols=20 Identities=45% Similarity=1.246 Sum_probs=18.5 Q ss_pred CCCCCCCCCCCCCHHHCHHH Q ss_conf 64899894728403317685 Q gi|254780383|r 206 CKARKPQCQSCIISNLCKRI 225 (227) Q Consensus 206 C~~~~P~C~~C~l~~~C~~~ 225 (227) |++++|+|+.|||++.|... 
T Consensus 1 Ct~rkP~C~~Cpl~~~C~~~ 20 (26) T pfam10576 1 CTARKPKCEECPLADLCXXX 20 (26) T ss_pred CCCCCCCCCCCCHHHHHHHC T ss_conf 98789876658689877410 No 31 >PRK07956 ligA NAD-dependent DNA ligase LigA; Validated Probab=94.01 E-value=0.16 Score=30.95 Aligned_cols=79 Identities=25% Similarity=0.419 Sum_probs=46.5 Q ss_pred HHHHHHHHHCC--CCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHH-------------------------HHHHHHHH Q ss_conf 79989987317--6200010126899999999730037999999752-------------------------35544420 Q gi|254780383|r 67 NKATKHLFEIA--DTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISL-------------------------SHILINEF 119 (227) Q Consensus 67 ~~~~~~L~~~y--pt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~-------------------------a~~i~~~~ 119 (227) .+....|++.. .++.++-.+..++|..+ =||-...|.+|++. |+.|...| T Consensus 455 ~~~i~~L~~~g~i~~~~Diy~L~~~~L~~l---~gfgeKsa~nLl~aIe~SK~~~l~r~L~ALGI~~VG~~~Ak~La~~f 531 (668) T PRK07956 455 EKIIEQLFEKGLIHTPADLFKLTEEDLLQL---EGFGEKSAQNLLDAIEKSKETPLARFLYALGIRHVGEKAAKALARHF 531 (668) T ss_pred HHHHHHHHHCCCCCCHHHHHCCCHHHHHCC---CCHHHHHHHHHHHHHHHHCCCCHHHHHHHCCCCCCCHHHHHHHHHHH T ss_conf 999999987687665899972885454021---23556699999999998547758889986278641299999999996 Q ss_pred CC--C-CHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 01--0-00014567776432358888999987 Q gi|254780383|r 120 DN--K-IPQTLEGLTRLPGIGRKGANVILSMA 148 (227) Q Consensus 120 ~g--~-vP~~~~~L~~LpGVG~ktA~~il~~~ 148 (227) +. . .-.+.++|.+++|||+++|.+|..|- T Consensus 532 ~sl~~l~~as~e~L~~I~giG~~~A~si~~ff 563 (668) T PRK07956 532 GSLEALEAASEEELAAVEGIGEEVAQSIVEFF 563 (668) T ss_pred CCHHHHHHCCHHHHHCCCCCCHHHHHHHHHHH T ss_conf 68999970899998576884499999999997 No 32 >TIGR00426 TIGR00426 competence protein ComEA helix-hairpin-helix repeat region; InterPro: IPR004509 This domain is found in competence protein ComEA and closely related proteins from a number of species that exhibit competence for transformation by exongenous DNA, including Streptococcus pneumoniae, Bacillus subtilis, Neisseria meningitidis, and Haemophilus influenzae. 
This domain represents a region of two tandem copies of a helix-hairpin-helix domain, each about 30 residues in length. Limited sequence similarity can be found among some members of this family N-terminal to this domain.. Probab=93.55 E-value=0.053 Score=34.17 Aligned_cols=55 Identities=27% Similarity=0.451 Sum_probs=43.5 Q ss_pred HCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 101268999999997300379999997523554442001000014567776432358888999 Q gi|254780383|r 83 MLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVIL 145 (227) Q Consensus 83 l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il 145 (227) +=.|+.+||+..|.++|. .||..|+.- +|+|| .. .+.++|++.||||.++-.-++ T Consensus 11 INtAtaeElq~~~~GvG~--kKAeAIv~Y----REe~G-~F-~t~Edl~~V~GiG~~~~Ek~~ 65 (70) T TIGR00426 11 INTATAEELQKALSGVGA--KKAEAIVAY----REEYG-RF-KTVEDLKKVSGIGEKLLEKNK 65 (70) T ss_pred CCHHCHHHHHHHHCCCCH--HHHHHHHHH----HHCCC-CC-CCHHHHHHCCCCCHHHHHHHH T ss_conf 011047888876428872--378999887----53277-95-762223214787624555564 No 33 >PRK08097 ligB NAD-dependent DNA ligase LigB; Reviewed Probab=93.49 E-value=0.19 Score=30.39 Aligned_cols=23 Identities=30% Similarity=0.570 Sum_probs=15.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 01456777643235888899998 Q gi|254780383|r 125 QTLEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 125 ~~~~~L~~LpGVG~ktA~~il~~ 147 (227) .+.++|..++|||+++|.+|..| T Consensus 518 as~e~l~~i~gIG~~~A~si~~f 540 (563) T PRK08097 518 RTEQQWQQLPGIGEGRARQLIAF 540 (563) T ss_pred CCHHHHHCCCCCCHHHHHHHHHH T ss_conf 99989955798489999999999 No 34 >smart00483 POLXc DNA polymerase X family. 
includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases Probab=93.17 E-value=0.57 Score=27.20 Aligned_cols=147 Identities=17% Similarity=0.270 Sum_probs=69.8 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHH Q ss_conf 99999999999976799987777868999999999633320356799899873176200010126899999999730037 Q gi|254780383|r 23 KELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYR 102 (227) Q Consensus 23 ~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~ 102 (227) +++-+++..|.+.|. +...|+|... +- .+|. .-.+.+|.+ +. +.+++.+ |.++| . T Consensus 4 ~~i~~~L~~la~~~e-----~~Gen~~k~~------ay------~~Aa-~~i~~l~~~--i~--~~~~l~~-ipGIG--~ 58 (334) T smart00483 4 RGIIDALEILAENYE-----VFGENKRKCS------YF------RKAA-SVLKSLPFP--IN--SMKDLKG-LPGIG--D 58 (334) T ss_pred HHHHHHHHHHHHHHH-----HCCCCHHHHH------HH------HHHH-HHHHHCCCC--CC--CHHHHCC-CCCCC--H T ss_conf 999999999999999-----8599777899------99------9999-999859835--58--9999727-99987--8 Q ss_pred HHHHHHHHHHHH--H--HHH-HCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHH---HHHH--HHHHHHCC Q ss_conf 999999752355--4--442-001000014567776432358888999987542100012104---6787--76565407 Q gi|254780383|r 103 KKSENIISLSHI--L--INE-FDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHI---FRIS--NRIGLAPG 172 (227) Q Consensus 103 ~KAk~I~~~a~~--i--~~~-~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv---~Rv~--~Rlgl~~~ 172 (227) .=|..|.++.+. + .++ -...+|....+|+++||||++||... +-.|...+. |--- .++. 
..+| T Consensus 59 ~ia~kI~Eil~TG~~~~~e~~~~~~~~~~l~el~~I~GvGpk~a~~l--~~~Gi~tl~-dL~~a~~~~l~~~q~~G---- 131 (334) T smart00483 59 KIKKKIEEIIETGKSSKVLEILNDEVYKSLKLFTNVFGVGPKTAAKW--YRKGIRTLE-ELKKNKELKLTKQQKAG---- 131 (334) T ss_pred HHHHHHHHHHHCCCCHHHHHHHCCCCCHHHHHHHCCCCCCHHHHHHH--HHCCCCCHH-HHHHHHHHHHHHHHHHH---- T ss_conf 99999999998499489999872865168999853888778999999--984988799-99987887678898876---- Q ss_pred CCHHHHHHHHHHCCCHHHHHHHHHHHHHHHHH Q ss_conf 88899999996218842267899999999665 Q gi|254780383|r 173 KTPNKVEQSLLRIIPPKHQYNAHYWLVLHGRY 204 (227) Q Consensus 173 ~~~~~~~~~l~~~~p~~~~~~~~~~li~~G~~ 204 (227) .+...++.+.+|.+....+...+...-+. T Consensus 132 ---lk~~ed~~~rIpr~e~~~~~~~i~~~l~~ 160 (334) T smart00483 132 ---LKYYEDILKKVSRAEAFAVEYIVKRAVRK 160 (334) T ss_pred ---HHHHHHHHCCCCHHHHHHHHHHHHHHHHH T ss_conf ---99999997268199999999999999985 No 35 >TIGR00615 recR recombination protein RecR; InterPro: IPR000093 The bacterial protein recR seems to play a role in a recombinational process of DNA repair . It may act with recF and recO. RecR is a protein of about 200 amino acid residues. This protein contains a putative C4-type zinc finger in the N-terminal section.; GO: 0006281 DNA repair, 0006310 DNA recombination. Probab=92.61 E-value=0.06 Score=33.81 Aligned_cols=28 Identities=36% Similarity=0.572 Sum_probs=20.7 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 0001456777643235888899998754 Q gi|254780383|r 123 IPQTLEGLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 123 vP~~~~~L~~LpGVG~ktA~~il~~~~~ 150 (227) +-+-.+.|.+|||||+|||.=++-+... 
T Consensus 7 ~~~Lie~L~kLPgiG~KsA~RlAf~LL~ 34 (205) T TIGR00615 7 ISKLIESLKKLPGIGPKSAQRLAFHLLK 34 (205) T ss_pred HHHHHHHHHHCCCCCHHHHHHHHHHHCC T ss_conf 9999998640789871478999998607 No 36 >PRK00024 radC DNA repair protein RadC; Reviewed Probab=92.31 E-value=0.16 Score=30.90 Aligned_cols=66 Identities=20% Similarity=0.305 Sum_probs=52.3 Q ss_pred HHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9999999963332035679989987317620001012689999999973003799999975235544420 Q gi|254780383|r 50 TLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEF 119 (227) Q Consensus 50 ~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~ 119 (227) .-|++.+|..-+....|....++|+++|.+...+.+++.++|.. ++++| ..||..|..+... ..++ T Consensus 27 ~ELLallL~~g~~~~d~~~lA~~ll~~~g~l~~l~~a~~~eL~~-i~GiG--~~kA~~l~a~~El-~rR~ 92 (224) T PRK00024 27 AELLAILLRTGTKGKSVLDLARELLERFGSLRGLLDASLEELQE-IKGIG--PAKAAQLKAALEL-ARRI 92 (224) T ss_pred HHHHHHHHHCCCCCCCHHHHHHHHHHHCCCHHHHHHCCHHHHHC-CCCCC--HHHHHHHHHHHHH-HHHH T ss_conf 99999998469999998999999999859999998708898844-78988--9999999999999-9999 No 37 >PRK00116 ruvA Holliday junction DNA helicase RuvA; Reviewed Probab=91.79 E-value=0.33 Score=28.76 Aligned_cols=72 Identities=24% Similarity=0.248 Sum_probs=44.7 Q ss_pred CCCCHHHCCCCHHHHHHHHHH-HHHHHHHHHHHHHHH--HHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 762000101268999999997-300379999997523--554442001000014567776432358888999987542 Q gi|254780383|r 77 ADTPQKMLAIGEKKLQNYIRT-IGIYRKKSENIISLS--HILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 77 ypt~e~l~~a~~~el~~~ir~-~G~~~~KAk~I~~~a--~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) .++...+.+..+.++-+.+-. -|.-...|-.|.... 
..+.+ -..-.|.+.|.++||||+|||.-|+...-++ T Consensus 57 ~~~LyGF~~~~Er~~F~~Li~V~GIGpK~AL~ILs~~~~~~l~~---aI~~~D~~~L~~vpGIG~KtA~rIi~ELk~K 131 (198) T PRK00116 57 AQLLYGFLTKEERELFRLLISVSGVGPKLALAILSGLSPEELAQ---AIANGDIKALTKVPGVGKKTAERIVLELKDK 131 (198) T ss_pred CCEEEEECCHHHHHHHHHHHCCCCCCHHHHHHHHCCCCHHHHHH---HHHHCCHHHHCCCCCCCHHHHHHHHHHHHHH T ss_conf 87578408889999999985668857899998870299999999---9985899997068897889999999999988 No 38 >PRK00024 radC DNA repair protein RadC; Reviewed Probab=91.56 E-value=0.35 Score=28.61 Aligned_cols=66 Identities=21% Similarity=0.390 Sum_probs=47.4 Q ss_pred HCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCC---CHHHHHHHHHHHHHHHHHHHHHHHH-HHHH Q ss_conf 1012689999999973003799999975235544420010---0001456777643235888899998-7542 Q gi|254780383|r 83 MLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNK---IPQTLEGLTRLPGIGRKGANVILSM-AFGI 151 (227) Q Consensus 83 l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~---vP~~~~~L~~LpGVG~ktA~~il~~-~~~~ 151 (227) ...++..||-+++=..|. +-+..+.+|+.++++|||- .-.+.++|.+++|||+-.|..++.. 
.+++ T Consensus 21 ~~~LsD~ELLallL~~g~---~~~d~~~lA~~ll~~~g~l~~l~~a~~~eL~~i~GiG~~kA~~l~a~~El~r 90 (224) T PRK00024 21 AAALSDAELLAILLRTGT---KGKSVLDLARELLERFGSLRGLLDASLEELQEIKGIGPAKAAQLKAALELAR 90 (224) T ss_pred CCCCCHHHHHHHHHHCCC---CCCCHHHHHHHHHHHCCCHHHHHHCCHHHHHCCCCCCHHHHHHHHHHHHHHH T ss_conf 320777999999984699---9999899999999985999999870889884478988999999999999999 No 39 >COG2818 Tag 3-methyladenine DNA glycosylase [DNA replication, recombination, and repair] Probab=91.17 E-value=1.4 Score=24.57 Aligned_cols=74 Identities=23% Similarity=0.281 Sum_probs=51.1 Q ss_pred HHHHHHHHHHHHCCCHHHHHHHHHHHHHCC--CCCHHHCCCCHHHHHHHHHHHHHHHH--HHHHHHHHHHHH---HHHHC Q ss_conf 899999999963332035679989987317--62000101268999999997300379--999997523554---44200 Q gi|254780383|r 48 HFTLIVAVLLSAQSTDVNVNKATKHLFEIA--DTPQKMLAIGEKKLQNYIRTIGIYRK--KSENIISLSHIL---INEFD 120 (227) Q Consensus 48 p~~~LVa~iLs~qT~d~~v~~~~~~L~~~y--pt~e~l~~a~~~el~~~ir~~G~~~~--KAk~I~~~a~~i---~~~~~ 120 (227) -|+.|.-++.-+--+|..|.+--+.+.+.| -+|+.++..+.++++.++.-.|.-|. |-+.++.=|+.. .++|| T Consensus 32 LFE~l~Le~fQAGLSW~tVL~KRe~freaF~~Fd~~kVA~~~~~dverLl~d~gIIR~r~KI~A~i~NA~~~l~l~~e~G 111 (188) T COG2818 32 LFELLCLEGFQAGLSWLTVLKKREAFREAFHGFDPEKVAAMTEEDVERLLADAGIIRNRGKIKATINNARAVLELQKEFG 111 (188) T ss_pred HHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHCC T ss_conf 99999999996565189999859999999956999999758898999997073145348989999999999999998708 Q ss_pred C Q ss_conf 1 Q gi|254780383|r 121 N 121 (227) Q Consensus 121 g 121 (227) + T Consensus 112 s 112 (188) T COG2818 112 S 112 (188) T ss_pred C T ss_conf 8 No 40 >pfam05559 DUF763 Protein of unknown function (DUF763). This family consists of several uncharacterized bacterial and archaeal proteins of unknown function. 
Probab=91.05 E-value=0.42 Score=28.11 Aligned_cols=29 Identities=31% Similarity=0.619 Sum_probs=22.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH---HHHH Q ss_conf 0014567776432358888999987---5421 Q gi|254780383|r 124 PQTLEGLTRLPGIGRKGANVILSMA---FGIP 152 (227) Q Consensus 124 P~~~~~L~~LpGVG~ktA~~il~~~---~~~p 152 (227) |.++++|+.+||||++|.-++.+.+ ||.| T Consensus 265 p~dfeeLLl~~GvGp~TlRALaLvaElIyg~p 296 (319) T pfam05559 265 PEDFEELLLLKGVGPSTLRALALVAEVIYGTP 296 (319) T ss_pred CCCHHHHHHCCCCCHHHHHHHHHHHHHHCCCC T ss_conf 01699997147988899999999999980899 No 41 >KOG2841 consensus Probab=90.98 E-value=0.55 Score=27.29 Aligned_cols=20 Identities=25% Similarity=0.406 Sum_probs=8.1 Q ss_pred HHHHHHHHHHHHHHHHHHHH Q ss_conf 14567776432358888999 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVIL 145 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il 145 (227) ++++|..+||+|+--|.-+. T Consensus 225 S~~ele~~~G~G~~kak~l~ 244 (254) T KOG2841 225 SEGELEQCPGLGPAKAKRLH 244 (254) T ss_pred CHHHHHHCCCCCHHHHHHHH T ss_conf 77679867573789999999 No 42 >PRK00076 recR recombination protein RecR; Reviewed Probab=90.81 E-value=0.15 Score=31.17 Aligned_cols=26 Identities=35% Similarity=0.516 Sum_probs=17.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 45677764323588889999875421 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILSMAFGIP 152 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~~~~~~p 152 (227) .+.|.+|||||+|||.=+..+....+ T Consensus 10 I~~l~kLPGIG~KsA~Rla~~LL~~~ 35 (197) T PRK00076 10 IEALRKLPGIGPKSAQRLAFHLLQRD 35 (197) T ss_pred HHHHHHCCCCCHHHHHHHHHHHHCCC T ss_conf 99998789998899999999998299 No 43 >PRK13844 recombination protein RecR; Provisional Probab=90.81 E-value=0.14 Score=31.39 Aligned_cols=27 Identities=26% Similarity=0.475 Sum_probs=19.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 014567776432358888999987542 Q gi|254780383|r 125 QTLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 125 ~~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) +-.+.|.+|||||+|||.=+..+.... 
T Consensus 12 ~LI~~l~kLPGIG~KsA~Rla~~Ll~~ 38 (200) T PRK13844 12 AVIESLRKLPTIGKKSSQRLALYLLDK 38 (200) T ss_pred HHHHHHHHCCCCCHHHHHHHHHHHHCC T ss_conf 999998168998788999999998649 No 44 >TIGR01259 comE comEA protein; InterPro: IPR004787 The comE locus is obligatory for bacterial cell competence - the process of internalizing exogenously added DNA. comEA and comEC are required for transformability, whereas the products of comEB and of the overlapping comER, which is transcribed in the reverse direction, are dispensable. ComEA has been shown to be an integral membrane protein, as predicted from hydropathy analysis, with its C-terminal domain outside the cytoplasmic membrane. This C-terminal domain possesses a sequence with similarity to those of several proteins known to be involved in nucleic acid transactions including UvrC and a human protein that binds to the replication origin of the Human herpesvirus 4 (Epstein-Barr virus). Probab=90.52 E-value=0.15 Score=31.18 Aligned_cols=49 Identities=27% Similarity=0.403 Sum_probs=24.7 Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 1268999999997300379999997523554442001000014567776432358888 Q gi|254780383|r 85 AIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGAN 142 (227) Q Consensus 85 ~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~ 142 (227) .|+.+||+.+ -|.-..||+.|++- ++ -+|.. .+.|+|.+..|||+|+-. 
T Consensus 68 ~As~~EL~~l---~GiGP~kA~aIi~Y----Re-~nG~F-~SvddL~kVsGIG~k~~e 116 (124) T TIGR01259 68 KASLEELQAL---PGIGPAKAKAIIEY----RE-ENGAF-KSVDDLTKVSGIGEKSLE 116 (124) T ss_pred HHHHHHHHHC---CCCCCHHHHHHHHH----HH-HCCCC-CCHHHHHCCCCCCHHHHH T ss_conf 6789998636---99981337999999----98-56997-775550035788546687 No 45 >COG0353 RecR Recombinational DNA repair protein (RecF pathway) [DNA replication, recombination, and repair] Probab=90.44 E-value=0.15 Score=31.04 Aligned_cols=26 Identities=27% Similarity=0.489 Sum_probs=16.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 14567776432358888999987542 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) -.+.|.+|||||+|||.-+..+..++ T Consensus 10 LI~~l~kLPGvG~KsA~R~AfhLL~~ 35 (198) T COG0353 10 LIDALKKLPGVGPKSAQRLAFHLLQR 35 (198) T ss_pred HHHHHHHCCCCCHHHHHHHHHHHHCC T ss_conf 99999768998832799999999735 No 46 >PRK08609 hypothetical protein; Provisional Probab=90.29 E-value=0.34 Score=28.69 Aligned_cols=49 Identities=29% Similarity=0.552 Sum_probs=30.8 Q ss_pred HHHHHHHHHHHHHHHHHHHH----HHHHHCCCCHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99730037999999752355----4442001000014567776432358888999 Q gi|254780383|r 95 IRTIGIYRKKSENIISLSHI----LINEFDNKIPQTLEGLTRLPGIGRKGANVIL 145 (227) Q Consensus 95 ir~~G~~~~KAk~I~~~a~~----i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il 145 (227) |.++| ..=|..|.++.+. 
..++...++|...-+|+++||||||+|..+- T Consensus 53 ipGIG--k~Ia~KI~Eil~TG~l~~le~L~~~~P~gl~eLl~IpGlGPKka~~L~ 105 (570) T PRK08609 53 IKGIG--KGTAEVIQEYRETGESSVLQELQKEVPEGLLPLLKLPGLGGKKIAKLY 105 (570) T ss_pred CCCCC--HHHHHHHHHHHHCCCHHHHHHHHHHCCHHHHHHHCCCCCCHHHHHHHH T ss_conf 99954--999999999997299089999985487779999778987789999999 No 47 >PRK13901 ruvA Holliday junction DNA helicase motor protein; Provisional Probab=89.82 E-value=0.59 Score=27.12 Aligned_cols=71 Identities=18% Similarity=0.138 Sum_probs=42.9 Q ss_pred CCCHHHCCCCHHHHHHHHHH-HHHHHHHHHHHHHHH--HHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 62000101268999999997-300379999997523--554442001000014567776432358888999987542 Q gi|254780383|r 78 DTPQKMLAIGEKKLQNYIRT-IGIYRKKSENIISLS--HILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 78 pt~e~l~~a~~~el~~~ir~-~G~~~~KAk~I~~~a--~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) .+...+.+..+.++-+.+-. .|.-...|-.|.... ..+.+ -..-.|.+.|.++||||+|||.-|+...-++ T Consensus 57 ~~LyGF~~~~Er~~F~~LisVsGIGpk~Al~iLs~~~~~~l~~---aI~~~D~~~L~~vpGIG~KtA~rIi~ELk~K 130 (196) T PRK13901 57 LKLFGFLNSSEREVFEELIGVDGIGPRAALRVLSGIKYNEFRD---AIDREDIELISKVKGIGNKMAGKIFLKLRGK 130 (196) T ss_pred CEEECCCCHHHHHHHHHHHCCCCCCHHHHHHHHCCCCHHHHHH---HHHHCCHHHHHHCCCCCHHHHHHHHHHHHHH T ss_conf 7133659889999999987658826899999975799999999---9992899998319995899999999999765 No 48 >pfam03352 Adenine_glyco Methyladenine glycosylase. The DNA-3-methyladenine glycosylase I is constitutively expressed and is specific for the alkylated 3-methyladenine DNA. 
Probab=89.64 E-value=1.9 Score=23.72 Aligned_cols=73 Identities=23% Similarity=0.303 Sum_probs=54.5 Q ss_pred HHHHHHHHHHHHCCCHHHHHHHHHHHHHCCC--CCHHHCCCCHHHHHHHHHHHHH--HHHHHHHHHHHHHHHH---HHHC Q ss_conf 8999999999633320356799899873176--2000101268999999997300--3799999975235544---4200 Q gi|254780383|r 48 HFTLIVAVLLSAQSTDVNVNKATKHLFEIAD--TPQKMLAIGEKKLQNYIRTIGI--YRKKSENIISLSHILI---NEFD 120 (227) Q Consensus 48 p~~~LVa~iLs~qT~d~~v~~~~~~L~~~yp--t~e~l~~a~~~el~~~ir~~G~--~~~KAk~I~~~a~~i~---~~~~ 120 (227) -|+.|+-+..-+==+|..|.+--+.+.+.|- +|+.++..++++|+.++.--|. .+.|-..++.=|+.++ +++| T Consensus 26 LFE~L~LE~~QaGLSW~tIL~KR~~fr~aF~~Fd~~~VA~~~e~~ie~Ll~d~~IIRnr~KI~Avi~NA~~~l~i~~e~g 105 (179) T pfam03352 26 LFELLCLEGFQAGLSWITILKKREAFREAFAGFDPEKVAAFTEADVERLLADPGIIRNRLKIEATINNARAILKLQEEFG 105 (179) T ss_pred HHHHHHHHHHCCCCCHHHHHHHHHHHHHHHCCCCHHHHHCCCHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHCC T ss_conf 99999998973647899999879999999828999999648999999985267878768999999999999999998379 No 49 >COG0272 Lig NAD-dependent DNA ligase (contains BRCT domain type II) [DNA replication, recombination, and repair] Probab=89.31 E-value=0.4 Score=28.22 Aligned_cols=70 Identities=20% Similarity=0.361 Sum_probs=37.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH Q ss_conf 999999973003799999975235544420010000145677764323588889999875421000121046787765 Q gi|254780383|r 90 KLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRI 167 (227) Q Consensus 90 el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl 167 (227) .+..+|-.+|..++-...-+.+|+.. ..+..-.-.+.++|.++||||.+.|.+|.-| |.. 
.+..-++.+| T Consensus 506 ~l~r~l~aLGIr~VG~~~Ak~La~~f-~sl~~l~~a~~e~l~~i~giG~~vA~si~~f-f~~------~~~~~li~~L 575 (667) T COG0272 506 PLARFLYALGIRHVGETTAKSLARHF-GTLEALLAASEEELASIPGIGEVVARSIIEF-FAN------EENRELIDEL 575 (667) T ss_pred CHHHHHHHCCCCHHHHHHHHHHHHHH-HHHHHHHHCCHHHHHHCCCHHHHHHHHHHHH-HCC------HHHHHHHHHH T ss_conf 89999998797114089999999876-0299998429999950666128999999999-727------7789999999 No 50 >TIGR01259 comE comEA protein; InterPro: IPR004787 The comE locus is obligatory for bacterial cell competence - the process of internalizing exogenously added DNA. comEA and comEC are required for transformability, whereas the products of comEB and of the overlapping comER, which is transcribed in the reverse direction, are dispensable. ComEA has been shown to be an integral membrane protein, as predicted from hydropathy analysis, with its C-terminal domain outside the cytoplasmic membrane. This C-terminal domain possesses a sequence with similarity to those of several proteins known to be involved in nucleic acid transactions including UvrC and a human protein that binds to the replication origin of the Human herpesvirus 4 (Epstein-Barr virus).
Probab=89.24 E-value=0.19 Score=30.35 Aligned_cols=23 Identities=43% Similarity=0.696 Sum_probs=20.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 01456777643235888899998 Q gi|254780383|r 125 QTLEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 125 ~~~~~L~~LpGVG~ktA~~il~~ 147 (227) .+.+||.+||||||.=|.+|..| T Consensus 69 As~~EL~~l~GiGP~kA~aIi~Y 91 (124) T TIGR01259 69 ASLEELQALPGIGPAKAKAIIEY 91 (124) T ss_pred HHHHHHHHCCCCCCHHHHHHHHH T ss_conf 78999863699981337999999 No 51 >PRK13901 ruvA Holliday junction DNA helicase motor protein; Provisional Probab=89.06 E-value=0.34 Score=28.73 Aligned_cols=75 Identities=20% Similarity=0.238 Sum_probs=38.8 Q ss_pred HHHHHHHHHHHHHHHHHHHH-HHHHHHHHHCCCHHHHHHHHH-HHHCCCCHHHHHHHHHHCCCH--------HHHHHHHH Q ss_conf 45677764323588889999-875421000121046787765-654078889999999621884--------22678999 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILS-MAFGIPTIGVDTHIFRISNRI-GLAPGKTPNKVEQSLLRIIPP--------KHQYNAHY 196 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~-~~~~~p~~~VDthv~Rv~~Rl-gl~~~~~~~~~~~~l~~~~p~--------~~~~~~~~ 196 (227) ++.|.+..|||||+|=.||+ +..+.=.-+|...=...+.++ |+ ..|+.+++--+|+.-+.. ....+.-. 
T Consensus 71 F~~LisVsGIGpk~Al~iLs~~~~~~l~~aI~~~D~~~L~~vpGI-G~KtA~rIi~ELk~Kl~~~~~~~~~~~~~~e~~~ 149 (196) T PRK13901 71 FEELIGVDGIGPRAALRVLSGIKYNEFRDAIDREDIELISKVKGI-GNKMAGKIFLKLRGKLVKNDELESSLFKFKELEQ 149 (196) T ss_pred HHHHHCCCCCCHHHHHHHHCCCCHHHHHHHHHHCCHHHHHHCCCC-CHHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHH T ss_conf 999876588268999999757999999999992899998319995-8999999999997653156655655344899999 Q ss_pred HHHHHH Q ss_conf 999996 Q gi|254780383|r 197 WLVLHG 202 (227) Q Consensus 197 ~li~~G 202 (227) +|+.+| T Consensus 150 AL~~LG 155 (196) T PRK13901 150 SIVNMG 155 (196) T ss_pred HHHHCC T ss_conf 999849 No 52 >PRK13266 Thf1-like protein; Reviewed Probab=88.50 E-value=2.2 Score=23.19 Aligned_cols=123 Identities=15% Similarity=0.122 Sum_probs=76.5 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHH-HHHCCC---H----HHHHHHHHHHHHCCCCCHHHCCCCHHHHHH Q ss_conf 89999999999997679998777786899999999-963332---0----356799899873176200010126899999 Q gi|254780383|r 22 PKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVL-LSAQST---D----VNVNKATKHLFEIADTPQKMLAIGEKKLQN 93 (227) Q Consensus 22 ~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~i-Ls~qT~---d----~~v~~~~~~L~~~ypt~e~l~~a~~~el~~ 93 (227) ...+.+--+.+.+.||.|..++.-+--=++||-.= |+.|+. | --+..+|+.|++.|+-.+.... =... T Consensus 4 ~~TVsDsKr~F~~~~p~pI~~lYrrvvdELLVElHLL~~n~~F~yD~lFAlGlvt~Fd~fm~GY~Pee~~~~----IF~A 79 (224) T PRK13266 4 RRTVSDSKRAFHAAFPRVIPSLYRRVVDELLVELHLLSVQSDFKYDPLFALGLVTVFDRFMQGYRPEEHKDA----LFEA 79 (224) T ss_pred CCCHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCCHHHHHH----HHHH T ss_conf 611488799999858988828999999999999999874146642737784499999999767998256999----9999 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9997300379999997523554442001000014567776432358888999987542 Q gi|254780383|r 94 YIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 94 ~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) ++..+||- +..+.+.|+.+.+...|.-+.+...+++-+|-|....-.++...-+. 
T Consensus 80 lc~s~~~d---p~~~r~dA~~l~~~a~~~s~~~i~~~l~~~~~~~~~l~~~~~~~~~~ 134 (224) T PRK13266 80 LCQAVGFD---PEQLREDAEQLLELAKGKSLDEILSWLTQKGGGANELLATLQAIANN 134 (224) T ss_pred HHHHCCCC---HHHHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCHHHHHHHHHHCC T ss_conf 99844899---99999999999999875899999999973545550688999987427 No 53 >COG1555 ComEA DNA uptake protein and related DNA-binding proteins [DNA replication, recombination, and repair] Probab=88.48 E-value=0.28 Score=29.25 Aligned_cols=23 Identities=43% Similarity=0.670 Sum_probs=16.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 14567776432358888999987 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVILSMA 148 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il~~~ 148 (227) +-++|..|||||++.|.+|..|- T Consensus 95 s~eeL~~lpgIG~~kA~aIi~yR 117 (149) T COG1555 95 SAEELQALPGIGPKKAQAIIDYR 117 (149) T ss_pred CHHHHHHCCCCCHHHHHHHHHHH T ss_conf 89999886798999999999999 No 54 >cd00141 NT_POLXc Nucleotidyltransferase (NT) domain of family X DNA Polymerases. X family polymerases fill in short gaps during DNA repair. They are relatively inaccurate enzymes and play roles in base excision repair, in non-homologous end joining (NHEJ) which acts mainly to repair damage due to ionizing radiation, and in V(D)J recombination. This family includes eukaryotic Pol beta, Pol lambda, Pol mu, and terminal deoxyribonucleotidyl transferase (TdT). Pol beta and Pol lambda are primarily DNA template-dependent polymerases. TdT is a DNA template-independent polymerase. Pol mu has both template dependent and template independent activities. This subgroup belongs to the Pol beta-like NT superfamily. In the majority of enzymes in this superfamily, two carboxylates, Dx[D/E], together with a third more distal carboxylate, coordinate two divalent metal cations involved in a two-metal ion mechanism of nucleotide addition. 
These three carboxylate residues are fairly well conserved in this Probab=88.04 E-value=0.88 Score=25.94 Aligned_cols=98 Identities=24% Similarity=0.362 Sum_probs=49.2 Q ss_pred HHHHHHHHHHHHHHHHHHHH----HHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHH Q ss_conf 99730037999999752355----44420010000145677764323588889999875421000121046787765654 Q gi|254780383|r 95 IRTIGIYRKKSENIISLSHI----LINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLA 170 (227) Q Consensus 95 ir~~G~~~~KAk~I~~~a~~----i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~ 170 (227) |.++| ..=|..|.++.+. ..++....+|....+|+++||||+++|.... ..|...+. . +-.-.| T Consensus 50 lpGIG--~~ia~kI~Eil~tG~~~~le~l~~~~p~~l~~l~~I~GiGpk~a~~l~--~~Gi~sl~--d----L~~a~g-- 117 (307) T cd00141 50 LPGIG--KKIAEKIEEILETGKLRKLEELREDVPPGLLLLLRVPGVGPKTARKLY--ELGIRTLE--D----LRKAAG-- 117 (307) T ss_pred CCCCC--HHHHHHHHHHHHHCCHHHHHHHHCCCCHHHHHHHCCCCCCHHHHHHHH--HCCCCCHH--H----HHHHHC-- T ss_conf 99964--899999999999798089999866563789999647887889999999--82999799--9----997501-- Q ss_pred CCCCHH------HHHHHHHHCCCHHHHHHHHHHHHHHHHHH Q ss_conf 078889------99999962188422678999999996651 Q gi|254780383|r 171 PGKTPN------KVEQSLLRIIPPKHQYNAHYWLVLHGRYV 205 (227) Q Consensus 171 ~~~~~~------~~~~~l~~~~p~~~~~~~~~~li~~G~~i 205 (227) .+... +-..++...+|......+...+..+-+.+ T Consensus 118 -~k~~~~~~~Gl~~~~~~~~ripr~e~~~~~~~i~~~l~~~ 157 (307) T cd00141 118 -AKLEQNILIGLEYYEDFQQRIPREEALAIAEIIKEALREV 157 (307) T ss_pred -HHHHHHHHHHHHHHHHHCCCEEHHHHHHHHHHHHHHHHHC T ss_conf -1038999999999998515677999999999999999838 No 55 >smart00278 HhH1 Helix-hairpin-helix DNA-binding motif class 1. Probab=87.12 E-value=0.22 Score=29.95 Aligned_cols=23 Identities=43% Similarity=0.654 Sum_probs=19.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 56777643235888899998754 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~ 150 (227) ++|+++||||+++|..++.+... 
T Consensus 1 ~~L~~v~GIG~k~A~~ll~~~~~ 23 (26) T smart00278 1 EELLKVPGIGPKTAEKILEAXXX 23 (26) T ss_pred CCCCCCCCCCCHHHHHHHHHHHC T ss_conf 92101799881159999997620 No 56 >COG0632 RuvA Holliday junction resolvasome, DNA-binding subunit [DNA replication, recombination, and repair] Probab=86.94 E-value=0.38 Score=28.42 Aligned_cols=67 Identities=30% Similarity=0.392 Sum_probs=35.9 Q ss_pred HCCCCHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCC-HHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 101268999-99999730037999999752355444200100-0014567776432358888999987542 Q gi|254780383|r 83 MLAIGEKKL-QNYIRTIGIYRKKSENIISLSHILINEFDNKI-PQTLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 83 l~~a~~~el-~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~v-P~~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) +...++.++ ..+|+-.|--..=|-.|...... +++-.-| -.|.+.|.++||||.|||.-+++.--++ T Consensus 63 F~~~~ER~lF~~LisVnGIGpK~ALaiLs~~~~--~~l~~aI~~~d~~~L~k~PGIGkKtAerivleLk~K 131 (201) T COG0632 63 FLTEEERELFRLLISVNGIGPKLALAILSNLDP--EELAQAIANEDVKALSKIPGIGKKTAERIVLELKGK 131 (201) T ss_pred CCCHHHHHHHHHHHCCCCCCHHHHHHHHCCCCH--HHHHHHHHHCCHHHHHCCCCCCHHHHHHHHHHHHHH T ss_conf 998899999999871188058999999848999--999999983286764418987788999999997605 No 57 >PRK01172 ski2-like helicase; Provisional Probab=86.40 E-value=1 Score=25.56 Aligned_cols=48 Identities=23% Similarity=0.271 Sum_probs=32.0 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHH Q ss_conf 000145677764323588889999875421000121046787765654 Q gi|254780383|r 123 IPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLA 170 (227) Q Consensus 123 vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~ 170 (227) ++.+.=+|++|||||+..|-..-.-+|.-+.-...+...++.+=.||. 
T Consensus 607 v~~eLl~L~~I~gigr~RAR~Ly~aG~~s~~dia~a~~~~L~~i~g~~ 654 (674) T PRK01172 607 IREDLIDLVLIPKVGRVRARRLYDAGFKTVDDIARSSPERIKKIYGFS 654 (674) T ss_pred CHHHHHHHCCCCCCCHHHHHHHHHCCCCCHHHHHHCCHHHHHHCCCCC T ss_conf 868889771889999899999998699999999709998987641989 No 58 >PRK00254 ski2-like helicase; Provisional Probab=86.37 E-value=0.62 Score=26.97 Aligned_cols=47 Identities=26% Similarity=0.262 Sum_probs=30.1 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHH Q ss_conf 00014567776432358888999987542100012104678776565 Q gi|254780383|r 123 IPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGL 169 (227) Q Consensus 123 vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl 169 (227) ++.+.=+|++|||||+..|-..-.-++.-+.-...++..++..=.|| T Consensus 639 v~~ELl~L~~I~gvgr~RAR~Ly~aGi~s~~~ia~A~p~~l~~i~g~ 685 (717) T PRK00254 639 IREELIPLMELPMIGRKRARALYNAGFRDLEDIMNAKPSELLAVEGI 685 (717) T ss_pred CCHHHHHHCCCCCCCHHHHHHHHHCCCCCHHHHHCCCHHHEECCCCC T ss_conf 97656835648998989999999869999999965999990302372 No 59 >TIGR00593 pola DNA polymerase I; InterPro: IPR002298 DNA carries the biological information that instructs cells how to exist in an ordered fashion. Accurate replication is thus one of the most important events in the cell life cycle. This function is mediated by DNA-directed DNA polymerases, which add nucleotide triphosphate (dNTP) residues to the 3'-end of the growing DNA chain, using a complementary DNA as template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used. DNA-dependent DNA polymerases have been grouped into families, denoted A, B and X, on the basis of sequence similarities. Members of family A, which includes bacterial and bacteriophage polymerases, share significant similarity to Escherichia coli polymerase I; hence family A is also known as the pol I family. The bacterial polymerases also contain an exonuclease activity, which is coded for in the N-terminal portion.
Three motifs, A, B and C, are seen to be conserved across all DNA polymerases, with motifs A and C also seen in RNA polymerases. They are centred on invariant residues, and their structural significance was implied from the Klenow (E. coli) structure. Motif A contains a strictly-conserved aspartate at the junction of a beta-strand and an alpha-helix; motif B contains an alpha-helix with positive charges; and motif C has a doublet of negative charges, located in a beta-turn-beta secondary structure.; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006260 DNA replication. Probab=85.84 E-value=0.47 Score=27.75 Aligned_cols=37 Identities=14% Similarity=0.018 Sum_probs=14.0 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCC Q ss_conf 9999999999997679998777786899999999963332 Q gi|254780383|r 23 KELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQST 62 (227) Q Consensus 23 ~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT~ 62 (227) .+++.++..|...+=.+..-.. -+..+++.-|++.+. T Consensus 471 ~~~K~~~~~L~~~g~~~~~~~~---~~D~~laaYll~~~~ 507 (1005) T TIGR00593 471 HDAKFLMHLLKRKGIELIEIGV---IFDTMLAAYLLDPAQ 507 (1005) T ss_pred HHHHHHHHHHHHCCCCCCCCCC---CCCHHHHHHHHCCCC T ss_conf 8999999999743773344211---454899999843035 No 60 >PRK05929 consensus Probab=85.16 E-value=0.74 Score=26.43 Aligned_cols=14 Identities=36% Similarity=0.586 Sum_probs=7.9 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 64323588889999 Q gi|254780383|r 133 LPGIGRKGANVILS 146 (227) Q Consensus 133 LpGVG~ktA~~il~ 146 (227) +||||+|||...+.
T Consensus 190 VpGIG~KTA~kLL~ 203 (870) T PRK05929 190 VSGCGPKKAAALLK 203 (870) T ss_pred CCCCCHHHHHHHHH T ss_conf 99760999999998 No 61 >PRK07456 consensus Probab=85.00 E-value=0.76 Score=26.38 Aligned_cols=13 Identities=23% Similarity=0.294 Sum_probs=7.1 Q ss_pred HHHHHHHHHHHCC Q ss_conf 9999999996333 Q gi|254780383|r 49 FTLIVAVLLSAQS 61 (227) Q Consensus 49 ~~~LVa~iLs~qT 61 (227) |..+++.-|+.-+ T Consensus 473 fDTmLAsYLLnP~ 485 (975) T PRK07456 473 FDTLLADYLLNPE 485 (975) T ss_pred CCHHHHHHHHCCC T ss_conf 1399999876876 No 62 >PRK06887 consensus Probab=84.88 E-value=0.73 Score=26.50 Aligned_cols=11 Identities=9% Similarity=0.163 Sum_probs=5.1 Q ss_pred HHHHHHHHHHH Q ss_conf 99999999963 Q gi|254780383|r 49 FTLIVAVLLSA 59 (227) Q Consensus 49 ~~~LVa~iLs~ 59 (227) |..+++.-|+. T Consensus 468 fDTmLAaYLLd 478 (954) T PRK06887 468 FDTMLESYTLN 478 (954) T ss_pred CCHHHHHHHHC T ss_conf 16989987518 No 63 >PRK07945 hypothetical protein; Provisional Probab=84.88 E-value=1.1 Score=25.15 Aligned_cols=23 Identities=43% Similarity=0.556 Sum_probs=16.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 56777643235888899998754 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~ 150 (227) ..|.+|||||++||.+|---.-| T Consensus 49 ~~~~~l~gig~~ta~vi~~a~~g 71 (335) T PRK07945 49 GSWQSLPGIGPKTAKVIAQAWAG 71 (335) T ss_pred CCCEECCCCCHHHHHHHHHHHCC T ss_conf 98112788780589999999658 No 64 >COG0632 RuvA Holliday junction resolvasome, DNA-binding subunit [DNA replication, recombination, and repair] Probab=84.84 E-value=0.48 Score=27.70 Aligned_cols=57 Identities=23% Similarity=0.138 Sum_probs=33.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHH-HHHHHHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHH Q ss_conf 456777643235888899998-754210001210467877656540788899999996 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILSM-AFGIPTIGVDTHIFRISNRIGLAPGKTPNKVEQSLL 183 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~~-~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~ 183 (227) +..|+++.|||+|+|=+||+. 
..+.=.-+|++.=...+.|+--+..|+.+++--+|+ T Consensus 72 F~~LisVnGIGpK~ALaiLs~~~~~~l~~aI~~~d~~~L~k~PGIGkKtAerivleLk 129 (201) T COG0632 72 FRLLISVNGIGPKLALAILSNLDPEELAQAIANEDVKALSKIPGIGKKTAERIVLELK 129 (201) T ss_pred HHHHHCCCCCCHHHHHHHHCCCCHHHHHHHHHHCCHHHHHCCCCCCHHHHHHHHHHHH T ss_conf 9998711880589999998489999999999832867644189877889999999976 No 65 >PRK07625 consensus Probab=84.51 E-value=0.81 Score=26.18 Aligned_cols=11 Identities=0% Similarity=0.060 Sum_probs=5.7 Q ss_pred HHHHHHHHHHH Q ss_conf 99999999963 Q gi|254780383|r 49 FTLIVAVLLSA 59 (227) Q Consensus 49 ~~~LVa~iLs~ 59 (227) |..+++.-|+. T Consensus 435 fDTmLAaYLL~ 445 (922) T PRK07625 435 HDTLLESYVLE 445 (922) T ss_pred HHHHHHHHHHC T ss_conf 03999988754 No 66 >PRK07898 consensus Probab=83.98 E-value=0.88 Score=25.92 Aligned_cols=12 Identities=42% Similarity=0.772 Sum_probs=4.8 Q ss_pred HHHHHHHHHHHH Q ss_conf 432358888999 Q gi|254780383|r 134 PGIGRKGANVIL 145 (227) Q Consensus 134 pGVG~ktA~~il 145 (227) ||||+|||...+ T Consensus 208 pGIG~KTA~kLL 219 (902) T PRK07898 208 PGVGEKTAAKWI 219 (902) T ss_pred CCCCHHHHHHHH T ss_conf 984478899999 No 67 >PRK08434 consensus Probab=83.58 E-value=0.95 Score=25.71 Aligned_cols=14 Identities=14% Similarity=0.093 Sum_probs=6.7 Q ss_pred HHHHHHHHHHHHCC Q ss_conf 89999999996333 Q gi|254780383|r 48 HFTLIVAVLLSAQS 61 (227) Q Consensus 48 p~~~LVa~iLs~qT 61 (227) -|..+++.-|..-. T Consensus 406 ~fDTmLAaYLLdp~ 419 (887) T PRK08434 406 YADTMILAWLKDPS 419 (887) T ss_pred HHHHHHHHHHCCCC T ss_conf 14899999866986 No 68 >PRK08928 consensus Probab=83.42 E-value=0.95 Score=25.72 Aligned_cols=14 Identities=36% Similarity=0.693 Sum_probs=9.0 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 64323588889999 Q gi|254780383|r 133 LPGIGRKGANVILS 146 (227) Q Consensus 133 LpGVG~ktA~~il~ 146 (227) +||||+|||...+. 
T Consensus 191 VpGIG~KTA~kLL~ 204 (861) T PRK08928 191 VPSIGPKTAAKLIT 204 (861) T ss_pred CCCCCHHHHHHHHH T ss_conf 98856289999999 No 69 >PRK08835 consensus Probab=83.38 E-value=0.96 Score=25.67 Aligned_cols=24 Identities=25% Similarity=0.279 Sum_probs=9.9 Q ss_pred HHHHHHHHHHHHCCCCCCCCCCCCCH Q ss_conf 99999999665164899894728403 Q gi|254780383|r 194 AHYWLVLHGRYVCKARKPQCQSCIIS 219 (227) Q Consensus 194 ~~~~li~~G~~iC~~~~P~C~~C~l~ 219 (227) |||..-.-||.=| .+|.=..-|++ T Consensus 661 f~Q~~t~TGRLSS--~~PNLQNIPiR 684 (931) T PRK08835 661 YHQAVTATGRLSS--TDPNLQNIPIR 684 (931) T ss_pred HHHCCCCCCCCCC--CCCCCCCCCCC T ss_conf 4310155245257--99630267888 No 70 >PRK07556 consensus Probab=83.33 E-value=0.98 Score=25.61 Aligned_cols=26 Identities=15% Similarity=0.160 Sum_probs=12.6 Q ss_pred HHHHHHHHHHHHHCCCCCCCCCCCCCHH Q ss_conf 8999999996651648998947284033 Q gi|254780383|r 193 NAHYWLVLHGRYVCKARKPQCQSCIISN 220 (227) Q Consensus 193 ~~~~~li~~G~~iC~~~~P~C~~C~l~~ 220 (227) .|||..-.-||.=| .+|.=...|++. T Consensus 703 ~fnq~~t~TGRlSS--~~PNLQNIPir~ 728 (977) T PRK07556 703 SYALAATTTGRLSS--SDPNLQNIPVRT 728 (977) T ss_pred CHHHHCEECCCCCC--CCCCCCCCCCCC T ss_conf 30232310576456--998646788876 No 71 >KOG2534 consensus Probab=83.05 E-value=0.95 Score=25.70 Aligned_cols=20 Identities=45% Similarity=0.532 Sum_probs=9.7 Q ss_pred HHHHHHHHHHHHHHHHHHHH Q ss_conf 14567776432358888999 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVIL 145 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il 145 (227) +.+++.+|||||+++|..|- T Consensus 54 S~~ea~~lP~iG~kia~ki~ 73 (353) T KOG2534 54 SGEEAEKLPGIGPKIAEKIQ 73 (353) T ss_pred CHHHHCCCCCCCHHHHHHHH T ss_conf 57885579997777999999 No 72 >PRK07300 consensus Probab=83.00 E-value=1 Score=25.47 Aligned_cols=14 Identities=29% Similarity=0.389 Sum_probs=9.5 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 64323588889999 Q gi|254780383|r 133 LPGIGRKGANVILS 146 (227) Q Consensus 133 LpGVG~ktA~~il~ 146 (227) .||||+|||...+. 
T Consensus 201 VpGIG~KTA~kLL~ 214 (880) T PRK07300 201 VTKIGEKTGLKLLH 214 (880) T ss_pred CCCCCHHHHHHHHH T ss_conf 89853699999999 No 73 >PRK08076 consensus Probab=82.76 E-value=1.1 Score=25.38 Aligned_cols=14 Identities=43% Similarity=0.786 Sum_probs=8.6 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 64323588889999 Q gi|254780383|r 133 LPGIGRKGANVILS 146 (227) Q Consensus 133 LpGVG~ktA~~il~ 146 (227) +||||+|||...+. T Consensus 193 VpGiG~KtA~~ll~ 206 (877) T PRK08076 193 VPGVGEKTAIKLLK 206 (877) T ss_pred CCCCCHHHHHHHHH T ss_conf 99863799999999 No 74 >PRK05797 consensus Probab=82.76 E-value=1 Score=25.49 Aligned_cols=12 Identities=50% Similarity=0.847 Sum_probs=4.9 Q ss_pred HHHHHHHHHHHH Q ss_conf 432358888999 Q gi|254780383|r 134 PGIGRKGANVIL 145 (227) Q Consensus 134 pGVG~ktA~~il 145 (227) ||||+|||...+ T Consensus 194 pGIG~KTA~kLL 205 (869) T PRK05797 194 PGIGEKTAFKLI 205 (869) T ss_pred CCCCHHHHHHHH T ss_conf 987818999999 No 75 >pfam11731 Cdd1 Pathogenicity locus. Cdd1 is expressed as part of the pathogenicity locus operon in several different orders of bacteria. Many members of the family are annotated as being putative mitomycin resistance proteins but this could not be confirmed. 
Probab=82.50 E-value=1.3 Score=24.87 Aligned_cols=42 Identities=24% Similarity=0.383 Sum_probs=28.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH Q ss_conf 145677764323588889999875421000121046787765 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRI 167 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl 167 (227) ...+|..|||||+.+|.-....++..+.=....+-.-+..|+ T Consensus 10 ~l~~L~~lPnIG~a~a~DL~~LGi~~~~~L~g~dp~elY~~l 51 (92) T pfam11731 10 ALKELTDLPNIGKATAKDLRLLGINSPAQLAGRDPLELYERL 51 (92) T ss_pred HHHHHHCCCCCCHHHHHHHHHHCCCCHHHHHCCCHHHHHHHH T ss_conf 999874189746999999999189989999179999999999 No 76 >COG2003 RadC DNA repair proteins [DNA replication, recombination, and repair] Probab=82.09 E-value=3.3 Score=22.03 Aligned_cols=61 Identities=20% Similarity=0.274 Sum_probs=48.3 Q ss_pred HHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9999999963332035679989987317620001012689999999973003799999975235 Q gi|254780383|r 50 TLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSH 113 (227) Q Consensus 50 ~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~ 113 (227) .-|++.+|-.-|+...|....+.|.++|.+...+..++.+++.. ++++| ..||-.|+++.. 
T Consensus 27 ~ELLailLrtG~~~~~~~~la~~lL~~fg~L~~l~~a~~~el~~-v~GiG--~aka~~l~a~~E 87 (224) T COG2003 27 AELLAILLRTGTKGESVLDLAKELLQEFGSLAELLKASVEELSS-VKGIG--LAKAIQIKAAIE 87 (224) T ss_pred HHHHHHHHHCCCCCCCHHHHHHHHHHHCCCHHHHHHCCHHHHHH-CCCCC--HHHHHHHHHHHH T ss_conf 89999999628999878999999999732588887379999951-78833--889999999999 No 77 >COG1555 ComEA DNA uptake protein and related DNA-binding proteins [DNA replication, recombination, and repair] Probab=82.01 E-value=1.1 Score=25.39 Aligned_cols=55 Identities=24% Similarity=0.399 Sum_probs=37.2 Q ss_pred HCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 1012689999999973003799999975235544420010000145677764323588889999 Q gi|254780383|r 83 MLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILS 146 (227) Q Consensus 83 l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~ 146 (227) +=.|+.+|| ..+.++| ..||..|++. ++++| .. .+.++|.+.+|||+++-.-... T Consensus 91 iNtAs~eeL-~~lpgIG--~~kA~aIi~y----Re~~G-~f-~sv~dL~~v~GiG~~~~ekl~~ 145 (149) T COG1555 91 INTASAEEL-QALPGIG--PKKAQAIIDY----REENG-PF-KSVDDLAKVKGIGPKTLEKLKD 145 (149) T ss_pred CCCCCHHHH-HHCCCCC--HHHHHHHHHH----HHHCC-CC-CCHHHHCCCCCCCHHHHHHHHH T ss_conf 661089999-8867989--9999999999----99739-97-6578871077778999998775 No 78 >PRK00116 ruvA Holliday junction DNA helicase RuvA; Reviewed Probab=81.34 E-value=0.71 Score=26.58 Aligned_cols=55 Identities=24% Similarity=0.207 Sum_probs=29.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHH-HHHH-HHHHCCCHHHHHHHHHHHHCCCCHHHHHHHH Q ss_conf 456777643235888899998-7542-1000121046787765654078889999999 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILSM-AFGI-PTIGVDTHIFRISNRIGLAPGKTPNKVEQSL 182 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~~-~~~~-p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l 182 (227) ++.|.+..|||+|+|=.||+. ..+. 
-..+...++..+.+==|+ ..|+.+++-.+| T Consensus 72 F~~Li~V~GIGpK~AL~ILs~~~~~~l~~aI~~~D~~~L~~vpGI-G~KtA~rIi~EL 128 (198) T PRK00116 72 FRLLISVSGVGPKLALAILSGLSPEELAQAIANGDIKALTKVPGV-GKKTAERIVLEL 128 (198) T ss_pred HHHHHCCCCCCHHHHHHHHCCCCHHHHHHHHHHCCHHHHCCCCCC-CHHHHHHHHHHH T ss_conf 999856688578999988702999999999985899997068897-889999999999 No 79 >pfam07834 RanGAP1_C RanGAP1 C-terminal domain. Ran-GTPase activating protein 1 (RanGAP1) is a GTPase activator for the nuclear Ras-related regulatory protein Ran, converting it to the putatively inactive GDP-bound state. Its C-terminal domain is required for RanGAP1 localisation at the vertebrate nuclear pore complex, and is sumoylated by the small ubiquitin-related modifier protein (SUMO-1). This domain is composed almost entirely of helical substructures that are organized into an alpha-alpha superhelix fold, with the exception of the peptide containing the lysine residue required for SUMO-1 conjugation. Probab=80.79 E-value=5 Score=20.87 Aligned_cols=111 Identities=15% Similarity=0.217 Sum_probs=69.2 Q ss_pred HHHCCCCCHHHCCCCHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 873176200010126899999999730037--999999752355444200100001456777643235888899998754 Q gi|254780383|r 73 LFEIADTPQKMLAIGEKKLQNYIRTIGIYR--KKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 73 L~~~ypt~e~l~~a~~~el~~~ir~~G~~~--~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~ 150 (227) =|--+|+|+.+..++.+.-+-+.+.++-.. .=+....+++....++ ++|-. ---.+.|++|--+|. 
T Consensus 19 ~FL~fPSpekLl~LG~krs~lI~qq~d~~D~~kvv~aflkvsSv~~dd--~eVK~----------AV~~~~DallkkaF~ 86 (169) T pfam07834 19 TFLVFPSPEKLIRLGPKRSQLIAQQVDVTDAEKVVEAFLKVSSVYKDE--PEVKQ----------AVFETTDALMRKAFS 86 (169) T ss_pred HHHCCCCHHHHHHHCHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHCCC--HHHHH----------HHHHHHHHHHHHHHC T ss_conf 882599989998856268999999843412878999999999986160--99999----------999999999999844 Q ss_pred HHHHHCCCHHHHHHHHHHHHCCCCH----------HHH--HHHHHHCCCHHHHHHHH Q ss_conf 2100012104678776565407888----------999--99996218842267899 Q gi|254780383|r 151 IPTIGVDTHIFRISNRIGLAPGKTP----------NKV--EQSLLRIIPPKHQYNAH 195 (227) Q Consensus 151 ~p~~~VDthv~Rv~~Rlgl~~~~~~----------~~~--~~~l~~~~p~~~~~~~~ 195 (227) -+.|--|.-+-+++-++|+.++.+. ..+ +..-++.||++...-+. T Consensus 87 s~~~~~~~fi~~LLVhmGLLKSEdKvk~v~~l~GpL~~L~H~vqQ~YFPk~~a~vl~ 143 (169) T pfam07834 87 NSPFQSNSFITSLLVNMGLLKSEDKVKKISVLPGPLLTLNHMVQQEYFPKDLAGVLL 143 (169) T ss_pred CCCCCHHHHHHHHHHHHHCCHHHHHCCCCCCCCCHHHHHHHHHHHHHCCHHHHHHHH T ss_conf 888871789999999984211032034588885059999999988627387899999 No 80 >PRK05755 DNA polymerase I; Provisional Probab=80.20 E-value=1.5 Score=24.35 Aligned_cols=14 Identities=21% Similarity=0.297 Sum_probs=9.1 Q ss_pred HHHHHHHHHHHHCC Q ss_conf 89999999996333 Q gi|254780383|r 48 HFTLIVAVLLSAQS 61 (227) Q Consensus 48 p~~~LVa~iLs~qT 61 (227) -|..+++.-|+.-+ T Consensus 404 ~fDTmLAaYLLdP~ 417 (889) T PRK05755 404 AFDTMLASYLLDPG 417 (889) T ss_pred CHHHHHHHHHHCCC T ss_conf 30199999874788 No 81 >COG1796 POL4 DNA polymerase IV (family X) [DNA replication, recombination, and repair] Probab=79.63 E-value=2.7 Score=22.61 Aligned_cols=22 Identities=36% Similarity=0.832 Sum_probs=11.8 Q ss_pred HHCCCCHHHHHHHHHHHHHHHH Q ss_conf 2001000014567776432358 Q gi|254780383|r 118 EFDNKIPQTLEGLTRLPGIGRK 139 (227) Q Consensus 118 ~~~g~vP~~~~~L~~LpGVG~k 139 (227) ...+.+|.....|+++||+|+| T Consensus 83 ~lk~~~P~gl~~Ll~v~GlGpk 104 (326) T COG1796 83 ALKKEVPEGLEPLLKVPGLGPK 104 (326) T ss_pred HHHHHCCCCHHHHHHCCCCCCH T ss_conf 
9988579555878607798928 No 82 >PRK05692 hydroxymethylglutaryl-CoA lyase; Provisional Probab=79.04 E-value=2.4 Score=23.03 Aligned_cols=24 Identities=8% Similarity=-0.036 Sum_probs=11.8 Q ss_pred CCCCCHHCCCHHHHHHHHHHHHHH Q ss_conf 778600239989999999999997 Q gi|254780383|r 12 GNSPLGCLYTPKELEEIFYLFSLK 35 (227) Q Consensus 12 ~~~p~~~~~~~~~~~~I~~~L~~~ 35 (227) |.-..+...+.++--+|.+.|.+. T Consensus 15 G~Q~~~~~~s~e~K~~ia~~L~~~ 38 (287) T PRK05692 15 GLQNEKRFIPTADKIALIDRLSAA 38 (287) T ss_pred CCCCCCCCCCHHHHHHHHHHHHHC T ss_conf 677989984999999999999984 No 83 >PRK07757 acetyltransferase; Provisional Probab=78.28 E-value=2 Score=23.55 Aligned_cols=50 Identities=20% Similarity=0.222 Sum_probs=33.8 Q ss_pred HHHHHHHHHHHHHHHHHH---HHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCHHHHHH Q ss_conf 432358888999987542---100012104678776565407888999999962188422678 Q gi|254780383|r 134 PGIGRKGANVILSMAFGI---PTIGVDTHIFRISNRIGLAPGKTPNKVEQSLLRIIPPKHQYN 193 (227) Q Consensus 134 pGVG~ktA~~il~~~~~~---p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~~~~p~~~~~~ 193 (227) .|||.+....++-.|-.. ..|+. |+....+.++|+.... . +.+|.+.|.+ T Consensus 80 ~GiG~~Ll~~l~~~Ar~~G~~~lf~L-Tt~~~fF~~~GF~~~~-~--------~~lP~kvw~d 132 (152) T PRK07757 80 KGIGRMLVEACLEEARELGVKRVFAL-TYQPEFFEKLGFREVD-K--------EALPQKIWAD 132 (152) T ss_pred CCHHHHHHHHHHHHHHHCCCCEEEEE-ECCHHHHHHCCCEECC-H--------HHCCHHHHHH T ss_conf 88899999999999998699999990-5866789878998888-3--------5599899998 No 84 >pfam04919 DUF655 Protein of unknown function, DUF655. This family includes several uncharacterized archaeal proteins. 
Probab=77.72 E-value=3 Score=22.31 Aligned_cols=62 Identities=23% Similarity=0.275 Sum_probs=37.5 Q ss_pred CCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHH--HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 7620001012689999999973003799999975235544420010000--1456777643235888899998 Q gi|254780383|r 77 ADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQ--TLEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 77 ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~--~~~~L~~LpGVG~ktA~~il~~ 147 (227) .-+.++|......+|..+|..+=-.| . +..++=||-.-|- -...|.-|||||.|++..|+-. T Consensus 72 ri~YedLT~tAk~eL~~vi~~iV~~n--E-------~~FV~FfN~A~pit~rlH~leLLPGIGkK~~~~ilee 135 (181) T pfam04919 72 RISYEDLTDTARSELPYVVEEIVKEN--E-------DRFVKFFNEAEPITTRLHQLELLPGIGKKMMWAILEE 135 (181) T ss_pred ECCHHHCCHHHHHHHHHHHHHHHHHC--H-------HHHHHHHHCCCCCHHHHHHHHHCCCCCHHHHHHHHHH T ss_conf 32778779999999999999999968--4-------6766765136874487888875334038999999999 No 85 >COG1415 Uncharacterized conserved protein [Function unknown] Probab=75.89 E-value=1.9 Score=23.64 Aligned_cols=29 Identities=34% Similarity=0.680 Sum_probs=23.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH---HHHH Q ss_conf 0014567776432358888999987---5421 Q gi|254780383|r 124 PQTLEGLTRLPGIGRKGANVILSMA---FGIP 152 (227) Q Consensus 124 P~~~~~L~~LpGVG~ktA~~il~~~---~~~p 152 (227) |.|+++|+-.||||++|.-+.-+.| ||.| T Consensus 274 p~Df~elLl~~GiGpstvRALalVAEvIyg~~ 305 (373) T COG1415 274 PDDFEELLLVPGIGPSTVRALALVAEVIYGEP 305 (373) T ss_pred CCCHHHHHHCCCCCHHHHHHHHHHHHHHHCCC T ss_conf 32499987406878899999999999980899 No 86 >pfam02371 Transposase_20 Transposase IS116/IS110/IS902 family. Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS116, IS110 and IS902. This region is often found with pfam01548. The exact function of this region is uncertain. This family contains a HHH motif suggesting a DNA-binding function. 
Probab=75.39 E-value=1.4 Score=24.59 Aligned_cols=43 Identities=30% Similarity=0.403 Sum_probs=30.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCC Q ss_conf 5677764323588889999875421000121046787765654078 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGK 173 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~ 173 (227) +-|.++||||+-||..++...-+..-| .+...+..-.|+++.. T Consensus 2 ~~L~sipGiG~~~a~~l~aeigd~~rF---~~~~~~~s~~Gl~P~~ 44 (87) T pfam02371 2 ELLLSIPGIGPITAAALLAEIGDISRF---KSARQLAAYAGLAPRE 44 (87) T ss_pred CHHHCCCCCCHHHHHHHHHHHCCHHHC---CCHHHHHHHCCCCCCC T ss_conf 234269995299999999992985327---8999999983999985 No 87 >TIGR03060 PS_II_psb29 photosystem II biogenesis protein Psp29. Psp29, originally designated sll1414 in Synechocystis 6803, is found universally in Cyanobacteria and in Arabidopsis. It was isolated and partially sequenced from purified photosystem II (PS II) in Synechocystis. While its function is unknown, mutant studies show an impairment in photosystem II biogenesis and/or stability, rather than in PS II core function. Probab=74.13 E-value=7.6 Score=19.63 Aligned_cols=118 Identities=15% Similarity=0.163 Sum_probs=70.4 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHH-HHHCCC---H----HHHHHHHHHHHHCCCCCHHHCCCCHHHHHH Q ss_conf 89999999999997679998777786899999999-963332---0----356799899873176200010126899999 Q gi|254780383|r 22 PKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVL-LSAQST---D----VNVNKATKHLFEIADTPQKMLAIGEKKLQN 93 (227) Q Consensus 22 ~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~i-Ls~qT~---d----~~v~~~~~~L~~~ypt~e~l~~a~~~el~~ 93 (227) ...+.+--+.+.+.||.|..++.-+--=++||-.= |+.|+. | --+..+|+.|++.|+-.++... =... 
T Consensus 4 ~~TVsDsKr~F~~~~p~pI~~lYrrvvdELLVElHLl~~n~~F~yD~lfAlGlvt~Fd~fm~GY~Pee~~~~----IF~A 79 (214) T TIGR03060 4 RRTVSDSKRAFHAAFPRVIPPLYRRVVDELLVELHLLSHQSDFKYDPLFALGLVTVFDRFMEGYRPEEHLDA----LFDA 79 (214) T ss_pred CCCHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCCHHHHHH----HHHH T ss_conf 611488799999858988818999999999999999885036543736784599999999767998256999----9999 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99973003799999975235544420010000145677764323588889999 Q gi|254780383|r 94 YIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILS 146 (227) Q Consensus 94 ~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~ 146 (227) ++..+||- +..+.+.|+.+.+...|.-+.+...+++=.|-|....+.+.. T Consensus 80 lc~s~~~d---p~~~r~dA~~l~~~a~~~s~~~i~~~l~~~~~~~~~~~~l~~ 129 (214) T TIGR03060 80 LCNSNGFD---PEQLREDAKQLLEQAKGKGLDEILSWLTQANLSNGGGDTLQG 129 (214) T ss_pred HHHHCCCC---HHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCCCCHHHHHHH T ss_conf 98734899---999999999999999748999999998504755411679998 No 88 >pfam01367 5_3_exonuc 5'-3' exonuclease, C-terminal SAM fold. Probab=72.43 E-value=1.5 Score=24.34 Aligned_cols=22 Identities=36% Similarity=0.774 Sum_probs=16.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 56777643235888899998754 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~ 150 (227) +.+-.+||||+|||..+|.. |+ T Consensus 18 DnIPGv~GiG~KtA~~Ll~~-~g 39 (100) T pfam01367 18 DNIPGVPGIGEKTAAKLLKE-YG 39 (100) T ss_pred CCCCCCCCCCCHHHHHHHHH-CC T ss_conf 58899999881689999998-19 No 89 >COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif [General function prediction only] Probab=69.39 E-value=2.2 Score=23.21 Aligned_cols=22 Identities=41% Similarity=0.616 Sum_probs=17.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 1456777643235888899998 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il~~ 147 (227) ++++|+..||||.|+|.-|+.. 
T Consensus 328 ~~~~llRVPGiG~ksa~rIv~~ 349 (404) T COG4277 328 PYKELLRVPGIGVKSARRIVMT 349 (404) T ss_pred CHHHHCCCCCCCHHHHHHHHHH T ss_conf 7778211688773788999887 No 90 >COG1491 Predicted RNA-binding protein [Translation, ribosomal structure and biogenesis] Probab=68.01 E-value=3.1 Score=22.25 Aligned_cols=55 Identities=31% Similarity=0.337 Sum_probs=32.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCCCHHH--HHHHHHHCC Q ss_conf 1456777643235888899998754210001210467877656540788899--999996218 Q gi|254780383|r 126 TLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGKTPNK--VEQSLLRII 186 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~--~~~~l~~~~ 186 (227) -...|.-|||||.|+...||-.--.+|. +...-|-+|++... ++.+ .++.+.++- T Consensus 128 RLH~LELLpGiGkK~m~~IleERkkkpF----eSFeDi~~Rv~~~~--~p~~~I~~RIl~El~ 184 (202) T COG1491 128 RLHQLELLPGIGKKTMWAILEERKKKPF----ESFEDIKERVKGLH--DPAKMIAERILDELK 184 (202) T ss_pred HHHHHHHCCCCCHHHHHHHHHHHHCCCC----CCHHHHHHHHCCCC--CHHHHHHHHHHHHHC T ss_conf 7888875312049999999998742888----66899998805677--789999999999960 No 91 >smart00475 53EXOc 5'-3' exonuclease. Probab=67.95 E-value=4 Score=21.50 Aligned_cols=20 Identities=45% Similarity=0.831 Sum_probs=14.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHH Q ss_conf 777643235888899998754 Q gi|254780383|r 130 LTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~~~~ 150 (227) +-.+||||+|||--+|.. || T Consensus 188 IpGV~GIG~KtA~kLL~~-yg 207 (259) T smart00475 188 IPGVPGIGEKTAAKLLKE-FG 207 (259) T ss_pred CCCCCCCCHHHHHHHHHH-CC T ss_conf 999998478999999998-39 No 92 >COG1948 MUS81 ERCC4-type nuclease [DNA replication, recombination, and repair] Probab=66.88 E-value=9.5 Score=18.97 Aligned_cols=20 Identities=40% Similarity=0.481 Sum_probs=15.3 Q ss_pred HHHHHHHHHHHHHHHHHHHH Q ss_conf 56777643235888899998 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~ 147 (227) .-|.++||||.+.|.-++.. 
T Consensus 182 ~il~s~pgig~~~a~~ll~~ 201 (254) T COG1948 182 YILESIPGIGPKLAERLLKK 201 (254) T ss_pred HHHHCCCCCCHHHHHHHHHH T ss_conf 99970899648999999998 No 93 >PRK02406 DNA polymerase IV; Validated Probab=66.17 E-value=9.6 Score=18.94 Aligned_cols=75 Identities=20% Similarity=0.237 Sum_probs=35.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH--CCCHHHHHHHHHHH--------HCCCCHHHH--H---HHH--HHCCCHH--HH Q ss_conf 7764323588889999875421000--12104678776565--------407888999--9---999--6218842--26 Q gi|254780383|r 131 TRLPGIGRKGANVILSMAFGIPTIG--VDTHIFRISNRIGL--------APGKTPNKV--E---QSL--LRIIPPK--HQ 191 (227) Q Consensus 131 ~~LpGVG~ktA~~il~~~~~~p~~~--VDthv~Rv~~Rlgl--------~~~~~~~~~--~---~~l--~~~~p~~--~~ 191 (227) .++||||++|+.-.-. +|...+. .+.....+..++|- ..+.+...+ + +.+ ..-|+.+ .. T Consensus 183 ~~l~GIG~~~~~~L~~--~GI~ti~DL~~~~~~~L~~~fG~~g~~l~~~a~Gid~~~V~~~~~~KSI~~~~Tf~~d~~~~ 260 (355) T PRK02406 183 EKIPGVGKVTAEKLHA--LGIRTCADLQKWDLATLLRRFGKFGRRLYERARGIDERPVKPDRERKSVGVERTYAKDLYDL 260 (355) T ss_pred CCCCCCCHHHHHHHHH--CCCCCHHHHHHCCHHHHHHHHCHHHHHHHHHHCCCCCCCCCCCCCCCEEEEEEECCCCCCCH T ss_conf 4058858899999998--29817999760999999999798999999996599874303366761055567879999999 Q ss_pred HHHHHHHHHHHHHHCC Q ss_conf 7899999999665164 Q gi|254780383|r 192 YNAHYWLVLHGRYVCK 207 (227) Q Consensus 192 ~~~~~~li~~G~~iC~ 207 (227) .++...|..+..++|. T Consensus 261 ~~i~~~l~~l~~~v~~ 276 (355) T PRK02406 261 EDCEAELERLYPELEA 276 (355) T ss_pred HHHHHHHHHHHHHHHH T ss_conf 9999999999999999 No 94 >TIGR02236 recomb_radA DNA repair and recombination protein RadA; InterPro: IPR011938 This family consists exclusively of archaeal RadA protein, a homolog of bacterial RecA, eukaryotic RAD51 (IPR011941 from INTERPRO), and archaeal RadB (IPR011939 from INTERPRO). This protein is involved in DNA repair and recombination. The member from Pyrococcus horikoshii contains an intein .; GO: 0003684 damaged DNA binding, 0005524 ATP binding, 0008094 DNA-dependent ATPase activity, 0006281 DNA repair, 0006310 DNA recombination. 
Probab=63.92 E-value=7 Score=19.87 Aligned_cols=45 Identities=22% Similarity=0.220 Sum_probs=38.5 Q ss_pred HHHHHH-CCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 899873-176200010126899999999730037999999752355444 Q gi|254780383|r 70 TKHLFE-IADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILIN 117 (227) Q Consensus 70 ~~~L~~-~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~ 117 (227) .++|.+ -|.|.+++|-|++.||.+. .|.....|..|+.+|+...+ T Consensus 13 A~KL~EaGy~t~~~iA~A~~~EL~~~---~gI~E~~A~kiI~AAR~a~~ 58 (333) T TIGR02236 13 AEKLREAGYDTLEAIAVASPKELSEI---AGIGEGTAAKIIQAARKAAD 58 (333) T ss_pred HHHHHHHHHHHHHHHHCCCHHHHHHH---CCCCHHHHHHHHHHHHHHHC T ss_conf 89988610788999844585795320---37877789999999999846 No 95 >cd00080 HhH2_motif Helix-hairpin-helix class 2 (Pol1 family) motif. HhH2 domains are found in Rad2 family of prokaryotic and eukaryotic replication and repair nucleases, i.e., DNA polymerase I, Taq DNA polymerase, DNA repair protein Rad2 endonuclease, flap endonuclease, exonuclease I and IX, 5'-3' exonuclease and also bacteriophage Rnase H. These nucleases degrade RNA-DNA or DNA-DNA duplexes, or both and play essential roles in DNA duplication, repair, and recombination. Probab=61.60 E-value=7.5 Score=19.66 Aligned_cols=22 Identities=36% Similarity=0.714 Sum_probs=16.1 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 56777643235888899998754 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~ 150 (227) +.+-.+||||+|||--++.. 
|+ T Consensus 22 DnipGV~GIG~ktA~~ll~~-~~ 43 (75) T cd00080 22 DNIPGVPGIGPKTALKLLKE-YG 43 (75) T ss_pred CCCCCCCCCCHHHHHHHHHH-CC T ss_conf 58877586379999999999-09 No 96 >PRK00558 uvrC excinuclease ABC subunit C; Validated Probab=60.81 E-value=8 Score=19.46 Aligned_cols=25 Identities=16% Similarity=0.204 Sum_probs=16.0 Q ss_pred HHCCCHHHHHHHHHHHHHHCCCCCC Q ss_conf 0239989999999999997679998 Q gi|254780383|r 17 GCLYTPKELEEIFYLFSLKWPSPKG 41 (227) Q Consensus 17 ~~~~~~~~~~~I~~~L~~~yp~~~~ 41 (227) ++......+++++..|...||-..+ T Consensus 132 GPf~~~~~~~~~l~~l~k~F~lR~C 156 (609) T PRK00558 132 GPYPSAGAVRETLNLLQKLFPLRTC 156 (609) T ss_pred CCCCCHHHHHHHHHHHHHHHCCCCC T ss_conf 7867499999999999998354667 No 97 >pfam07067 DUF1340 Protein of unknown function (DUF1340). This family consists of several hypothetical Streptococcus thermophilus bacteriophage proteins of around 235 residues in length. The function of this family is unknown. Probab=59.89 E-value=14 Score=17.76 Aligned_cols=100 Identities=18% Similarity=0.222 Sum_probs=51.9 Q ss_pred HHHHHHHHHHHHHCC-CCCHHHCCCCHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHH-HHH Q ss_conf 035679989987317-620001012689999999973-00379999997523554442001000014567776432-358 Q gi|254780383|r 63 DVNVNKATKHLFEIA-DTPQKMLAIGEKKLQNYIRTI-GIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGI-GRK 139 (227) Q Consensus 63 d~~v~~~~~~L~~~y-pt~e~l~~a~~~el~~~ir~~-G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGV-G~k 139 (227) |+.-....+.||..| ..++.|..+.+.-..+.|-.+ |-.+++|..|..+-..+-.+-|-.-|.. ..+|.|. 
.++ T Consensus 105 ~k~nagl~~elf~qy~~ei~~lra~~pn~~~~yimevkgc~~qqa~ti~taint~yteigiltprk---viqlegllsre 181 (236) T pfam07067 105 YKTNAGLTEELFKQYREEIKSLRAAHPNSFAAYIMEVKGCSYQQANTIRTAINTIYTEIGILTPRK---VIQLEGLLSRE 181 (236) T ss_pred CCCCCCCCHHHHHHHHHHHHHHHHHCCCHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHCCCHHH---HHHHHHHHHHH T ss_conf 535689859999999999999987487458899897014207767489999999999885356178---88887487899 Q ss_pred HHHHHHHHHHHHHHHH--CCCHHHHHHH Q ss_conf 8889999875421000--1210467877 Q gi|254780383|r 140 GANVILSMAFGIPTIG--VDTHIFRISN 165 (227) Q Consensus 140 tA~~il~~~~~~p~~~--VDthv~Rv~~ 165 (227) .-.-|.-++||+-..| +|.-|-||.- T Consensus 182 lfgkiakyvfnkyewpesldsevdriyl 209 (236) T pfam07067 182 LFGKIAKYVFNKYEWPESLDSEVDRIYL 209 (236) T ss_pred HHHHHHHHHHCCCCCCHHHHHHHHHHHE T ss_conf 9889999982114584346766566630 No 98 >PHA00439 exonuclease Probab=59.04 E-value=3.9 Score=21.54 Aligned_cols=17 Identities=29% Similarity=0.548 Sum_probs=13.9 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 77643235888899998 Q gi|254780383|r 131 TRLPGIGRKGANVILSM 147 (227) Q Consensus 131 ~~LpGVG~ktA~~il~~ 147 (227) -.+||||++||..+|-- T Consensus 190 ~GvpGiG~ktA~klL~~ 206 (288) T PHA00439 190 SGIPGWGPDTAEAFLNN 206 (288) T ss_pred CCCCCCCHHHHHHHHCC T ss_conf 89988488999998637 No 99 >cd01972 Nitrogenase_VnfE_like Nitrogenase_VnfE_like: VnfE subunit of the VnfEN complex_like. This group in addition to VnfE contains a subset of the alpha subunit of the nitrogenase MoFe protein and NifE-like proteins. The nitrogenase enzyme system catalyzes the ATP-dependent reduction of dinitrogen to ammonia. NifEN participates in the synthesis of the iron-molybdenum cofactor (FeMoco) of MoFe protein of the molybdenum(Mo)-nitrogenase. NifB-co (an iron and sulfur containing precursor of the FeMoco) from NifB is transferred to NifEN where it is further processed to FeMoco. VnfEN may similarly be a scaffolding protein for the iron-vanadium cofactor (FeVco) of the vanadium-dependent (V)-nitrogenase. 
NifE and NifN are essential for the Mo-nitrogenase, VnfE and VnfN are not essential for the V-nitrogenase. NifE and NifN can substitute when the vnfEN genes are inactivated. Probab=58.99 E-value=15 Score=17.66 Aligned_cols=28 Identities=7% Similarity=0.309 Sum_probs=16.9 Q ss_pred HHHCCCCCHHHCCCCHHHHHHHHHHHHH Q ss_conf 8731762000101268999999997300 Q gi|254780383|r 73 LFEIADTPQKMLAIGEKKLQNYIRTIGI 100 (227) Q Consensus 73 L~~~ypt~e~l~~a~~~el~~~ir~~G~ 100 (227) ++.-+..++.....+.++|+.++...|+ T Consensus 166 iig~~~~~~~~~~~d~~ei~~ll~~~Gi 193 (426) T cd01972 166 IIGLWGGPERTEQEDVDEFKRLLNELGL 193 (426) T ss_pred CCCCCCCCCCCCCCCHHHHHHHHHHCCC T ss_conf 2477678656750009999999998699 No 100 >PRK12308 bifunctional argininosuccinate lyase/N-acetylglutamate synthase; Provisional Probab=58.30 E-value=15 Score=17.58 Aligned_cols=136 Identities=16% Similarity=0.221 Sum_probs=76.8 Q ss_pred HHHHHHHHH----HHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHH--------HHHHHHHHHHHHHCCCC------- Q ss_conf 035679989----98731762000101268999999997300379999--------99752355444200100------- Q gi|254780383|r 63 DVNVNKATK----HLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSE--------NIISLSHILINEFDNKI------- 123 (227) Q Consensus 63 d~~v~~~~~----~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk--------~I~~~a~~i~~~~~g~v------- 123 (227) -++|..+.+ +|-.+--.--.+..|...+++.+-+=+++|-.+-. 
-+..+-...+.+-+|+| T Consensus 442 p~qV~~ai~~a~~rl~~~~~~~i~iR~Arl~Dv~~i~~lv~~~A~~G~~LpR~~~~l~~~i~~f~vaE~~g~v~g~~sl~ 521 (614) T PRK12308 442 PEQVAYAVKQAKKRLAARDTSGVKVRPARLTDIDAIEGMVAYWAGLGENLPRTRNELVRDIGSFAVAEHHGEVTGCASLY 521 (614) T ss_pred HHHHHHHHHHHHHHHHHCCCCCCEEECCCCCCHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHEEEEECCEEEEEEEEE T ss_conf 89999999999999872568873663256773899999999986412458887788999876642665578188887898 Q ss_pred --HHHHHHHHHH--------HHHHHHHHHHHHHHHHH--HHHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCHHHH Q ss_conf --0014567776--------43235888899998754--21000121046787765654078889999999621884226 Q gi|254780383|r 124 --PQTLEGLTRL--------PGIGRKGANVILSMAFG--IPTIGVDTHIFRISNRIGLAPGKTPNKVEQSLLRIIPPKHQ 191 (227) Q Consensus 124 --P~~~~~L~~L--------pGVG~ktA~~il~~~~~--~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~~~~p~~~~ 191 (227) -.+..|.-+| .|+|.-.....+-.+.. .+-+.|=|++--++..+|+.... ++.+|.+-| T Consensus 522 i~~~~LAEIrsl~v~~~~~~~G~G~~lV~~~l~~a~~~~~~rvfvLT~~p~fF~k~gf~~~~---------k~~lp~kv~ 592 (614) T PRK12308 522 IYDSGLAEIRSLGVEAGWQVQGQGKALVQYLVEKARQMAIKKVFVLTRVPEFFMKQGFSPTS---------KSLLPEKVL 592 (614) T ss_pred EEECCHHHHHHHHCCHHHHHCCCHHHHHHHHHHHHHHHCCCEEEEEECCCHHHHHCCCEECC---------HHHCHHHHH T ss_conf 86167199998616787774283289999999999983787589984371889975982078---------445759899 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCC Q ss_conf 78999999996651648998947284 Q gi|254780383|r 192 YNAHYWLVLHGRYVCKARKPQCQSCI 217 (227) Q Consensus 192 ~~~~~~li~~G~~iC~~~~P~C~~C~ 217 (227) .+- .-| ++.|.|++-. T Consensus 593 kdc---------~~c-p~~~~cdE~a 608 (614) T PRK12308 593 KDC---------DQC-PRQHACDEVA 608 (614) T ss_pred HHH---------HCC-CCCCCCCHHH T ss_conf 876---------548-8767722787 No 101 >PRK09482 xni exonuclease IX; Provisional Probab=58.06 E-value=4.3 Score=21.31 Aligned_cols=21 Identities=33% Similarity=0.567 Sum_probs=15.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 6777643235888899998754 Q gi|254780383|r 129 GLTRLPGIGRKGANVILSMAFG 150 (227) Q Consensus 129 ~L~~LpGVG~ktA~~il~~~~~ 150 (227) .+-.+||||+|||...|.. 
|| T Consensus 183 NIPGV~GIG~KtA~~LL~~-fg 203 (256) T PRK09482 183 KIPGVAGIGPKSAAELLNQ-FR 203 (256) T ss_pred CCCCCCCCCHHHHHHHHHH-HC T ss_conf 8999998588899999998-55 No 102 >pfam03118 RNA_pol_A_CTD Bacterial RNA polymerase, alpha chain C terminal domain. The alpha subunit of RNA polymerase consists of two independently folded domains, referred to as amino-terminal and carboxyl-terminal domains. The amino-terminal domain is involved in the interaction with the other subunits of the RNA polymerase. The carboxyl-terminal domain interacts with the DNA and activators. The amino acid sequence of the alpha subunit is conserved in prokaryotic and chloroplast RNA polymerases. There are three regions of particularly strong conservation, two in the amino-terminal and one in the carboxyl-terminal. Probab=56.41 E-value=8.5 Score=19.30 Aligned_cols=49 Identities=18% Similarity=0.257 Sum_probs=31.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9997300379999997523554442001000014567776432358888999 Q gi|254780383|r 94 YIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVIL 145 (227) Q Consensus 94 ~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il 145 (227) -|..++|. +|+.+..+-+.+- ..+.-+..+.++|+++|+.|+|+-+-|. 
T Consensus 9 ~I~~L~LS-~R~~N~Lk~~~I~--tv~dL~~~s~~dLl~i~N~G~kSl~EI~ 57 (62) T pfam03118 9 PIEELELS-VRSYNCLKRAGIN--TVGDLLSKSEEDLLKIKNFGKKSLEEIK 57 (62) T ss_pred CHHHHCCC-HHHHHHHHHCCCC--CHHHHHHCCHHHHHHCCCCCHHHHHHHH T ss_conf 89981686-8999999894996--7999985899999748898685799999 No 103 >COG0258 Exo 5'-3' exonuclease (including N-terminal domain of PolI) [DNA replication, recombination, and repair] Probab=56.25 E-value=9.8 Score=18.88 Aligned_cols=18 Identities=39% Similarity=0.612 Sum_probs=13.6 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 7643235888899998754 Q gi|254780383|r 132 RLPGIGRKGANVILSMAFG 150 (227) Q Consensus 132 ~LpGVG~ktA~~il~~~~~ 150 (227) .+||||+|||--++.- || T Consensus 202 GV~GIG~ktA~~Ll~~-~g 219 (310) T COG0258 202 GVKGIGPKTALKLLQE-YG 219 (310) T ss_pred CCCCCCHHHHHHHHHH-HC T ss_conf 9998389999999998-38 No 104 >TIGR03674 fen_arch flap structure-specific endonuclease. Endonuclease that cleaves the 5'-overhanging flap structure that is generated by displacement synthesis when DNA polymerase encounters the 5'-end of a downstream Okazaki fragment. Has 5'-endo-/exonuclease and 5'-pseudo-Y-endonuclease activities. 
Cleaves the junction between single and double-stranded regions of flap DNA Probab=56.20 E-value=14 Score=17.86 Aligned_cols=17 Identities=29% Similarity=0.392 Sum_probs=13.1 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 77643235888899998 Q gi|254780383|r 131 TRLPGIGRKGANVILSM 147 (227) Q Consensus 131 ~~LpGVG~ktA~~il~~ 147 (227) -++||||++||--++-- T Consensus 239 ~gI~GIG~k~A~klIkk 255 (338) T TIGR03674 239 EGVKGIGPKTALKLIKE 255 (338) T ss_pred CCCCCCCHHHHHHHHHH T ss_conf 89998568999999998 No 105 >KOG1201 consensus Probab=55.14 E-value=17 Score=17.24 Aligned_cols=89 Identities=15% Similarity=0.172 Sum_probs=59.1 Q ss_pred CCHHHHHHHHHHHHHCCC------------CCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCH--HH Q ss_conf 320356799899873176------------2000101268999999997300379999997523554442001000--01 Q gi|254780383|r 61 STDVNVNKATKHLFEIAD------------TPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIP--QT 126 (227) Q Consensus 61 T~d~~v~~~~~~L~~~yp------------t~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP--~~ 126 (227) |.++.+.+..+++.+.+. +...+.+.+.++|+..+.---+. 
--..+++++-...++-+|-+- .+ T Consensus 96 s~~eei~~~a~~Vk~e~G~V~ILVNNAGI~~~~~ll~~~d~ei~k~~~vN~~~--~f~t~kaFLP~M~~~~~GHIV~IaS 173 (300) T KOG1201 96 SDREEIYRLAKKVKKEVGDVDILVNNAGIVTGKKLLDCSDEEIQKTFDVNTIA--HFWTTKAFLPKMLENNNGHIVTIAS 173 (300) T ss_pred CCHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCCCCCHHHHHHHHHHHHHH--HHHHHHHHHHHHHHCCCCEEEEEHH T ss_conf 98899999999999861995499836642448875679989999999876689--9999998738887457963998355 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 4567776432358888999987542 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) ...+..++|+++|+|+---..+|.+ T Consensus 174 ~aG~~g~~gl~~YcaSK~a~vGfhe 198 (300) T KOG1201 174 VAGLFGPAGLADYCASKFAAVGFHE 198 (300) T ss_pred HHCCCCCCCCHHHHHHHHHHHHHHH T ss_conf 3313577653235651899999999 No 106 >PRK05179 rpsM 30S ribosomal protein S13; Validated Probab=53.77 E-value=7.8 Score=19.54 Aligned_cols=26 Identities=42% Similarity=0.626 Sum_probs=20.1 Q ss_pred CCHHH---HHHHHHHHHHHHHHHHHHHHH Q ss_conf 00001---456777643235888899998 Q gi|254780383|r 122 KIPQT---LEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 122 ~vP~~---~~~L~~LpGVG~ktA~~il~~ 147 (227) ++|.+ .-.|.++.|||+.+|..|+.- T Consensus 8 dl~~~K~v~~ALt~I~GIG~~~A~~Ic~~ 36 (122) T PRK05179 8 DIPRNKRVVIALTYIYGIGRTRAKEILAA 36 (122) T ss_pred CCCCCCCHHHHHHHHCCCCHHHHHHHHHH T ss_conf 48999786847730027589999999998 No 107 >KOG2518 consensus Probab=53.54 E-value=15 Score=17.61 Aligned_cols=23 Identities=35% Similarity=0.618 Sum_probs=17.0 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 567776432358888999987542 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) +-|-+|||||-.||..++.- |+. T Consensus 225 DYl~slpGvGl~tA~k~l~k-~~~ 247 (556) T KOG2518 225 DYLSSLPGVGLATAHKLLSK-YNT 247 (556) T ss_pred CCCCCCCCCCHHHHHHHHHH-CCC T ss_conf 31124765229999999985-375 No 108 >smart00279 HhH2 Helix-hairpin-helix class 2 (Pol1 family) motifs. 
Probab=52.57 E-value=6 Score=20.32 Aligned_cols=17 Identities=35% Similarity=0.507 Sum_probs=13.2 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 77764323588889999 Q gi|254780383|r 130 LTRLPGIGRKGANVILS 146 (227) Q Consensus 130 L~~LpGVG~ktA~~il~ 146 (227) +-.+||||++||--++. T Consensus 18 ipGV~GIG~ktA~~ll~ 34 (36) T smart00279 18 IPGVKGIGPKTALKLLR 34 (36) T ss_pred CCCCCCCCHHHHHHHHH T ss_conf 89999747899999998 No 109 >TIGR03631 bact_S13 30S ribosomal protein S13. This model describes bacterial ribosomal protein S13, to the exclusion of the homologous archaeal S13P and eukaryotic ribosomal protein S18. This model identifies some (but not all) instances of chloroplast and mitochondrial S13, which is of bacterial type. Probab=52.46 E-value=7.2 Score=19.80 Aligned_cols=26 Identities=42% Similarity=0.533 Sum_probs=19.4 Q ss_pred CCHHH---HHHHHHHHHHHHHHHHHHHHH Q ss_conf 00001---456777643235888899998 Q gi|254780383|r 122 KIPQT---LEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 122 ~vP~~---~~~L~~LpGVG~ktA~~il~~ 147 (227) ++|.+ .-.|.++.|||+.+|..|+.- T Consensus 6 di~~~K~v~~ALt~I~GIG~~~A~~Ic~~ 34 (113) T TIGR03631 6 DIPNNKRVEIALTYIYGIGRTRARKILEK 34 (113) T ss_pred CCCCCCCHHHHHHCEECCCHHHHHHHHHH T ss_conf 49999674606520027589999999999 No 110 >cd00008 53EXOc 5'-3' exonuclease; T5 type 5'-3' exonuclease domains may co-occur with DNA polymerase I (Pol I) domains, or be part of Pol I containing complexes. They digest dsDNA and ssDNA, releasing mono-,di- and tri-nucleotides, as well as oligonucleotides, and have also been reported to possess RNase H activity. Also called 5' nuclease family, involved in structure-specific cleavage of flaps formed by Pol I activity (similar to mammalian flap endonuclease I, FEN-1). A single nucleic acid strand may be threaded through the 5' nuclease enzyme before cleavage occurs. The domain binds two divalent metal ions which are necessary for activity. 
Probab=51.87 E-value=6.2 Score=20.20 Aligned_cols=18 Identities=39% Similarity=0.673 Sum_probs=13.2 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 777643235888899998 Q gi|254780383|r 130 LTRLPGIGRKGANVILSM 147 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~ 147 (227) +-.+||||+|||..++.- T Consensus 185 IpGV~GiG~KtA~kLl~~ 202 (240) T cd00008 185 IPGVPGIGEKTAAKLLKE 202 (240) T ss_pred CCCCCCCCHHHHHHHHHH T ss_conf 899998578999999998 No 111 >PRK03980 flap endonuclease-1; Provisional Probab=50.98 E-value=16 Score=17.38 Aligned_cols=17 Identities=29% Similarity=0.407 Sum_probs=13.2 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 77643235888899998 Q gi|254780383|r 131 TRLPGIGRKGANVILSM 147 (227) Q Consensus 131 ~~LpGVG~ktA~~il~~ 147 (227) -.+||||++||--++-- T Consensus 192 ~gI~gIGpk~Alklikk 208 (295) T PRK03980 192 PGVKGIGPKTALKLIKK 208 (295) T ss_pred CCCCCCCHHHHHHHHHH T ss_conf 99998429999999999 No 112 >pfam10391 DNA_pol_lambd_f Fingers domain of DNA polymerase lambda. DNA polymerases catalyse the addition of dNMPs onto the 3-prime ends of DNA chains. There is a general polymerase fold consisting of three subdomains that have been likened to the fingers, palm, and thumb of a right hand. DNA_pol_lambd_f is the central three-helical region of DNA polymerase lambda referred to as the F and G helices of the fingers domain. Contacts with DNA involve this conserved helix-hairpin-helix motif in the fingers region which interacts with the primer strand. This motif is common to several DNA binding proteins and confers a sequence-independent interaction with the DNA backbone. Probab=50.58 E-value=6.4 Score=20.12 Aligned_cols=20 Identities=25% Similarity=0.250 Sum_probs=15.8 Q ss_pred HHHHHHHHHHHHHHHHHHHH Q ss_conf 56777643235888899998 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~ 147 (227) +.+.++.|||+++|.-.... 
T Consensus 2 ~~f~~I~GvGp~~A~~~~~~ 21 (52) T pfam10391 2 KLFTNIWGVGPKTARKWYRQ 21 (52) T ss_pred HHHHHCCCCCHHHHHHHHHH T ss_conf 03663544069999999994 No 113 >PRK03352 DNA polymerase IV; Validated Probab=49.81 E-value=9.8 Score=18.87 Aligned_cols=61 Identities=16% Similarity=0.244 Sum_probs=32.4 Q ss_pred HHHHHHHHHHHHHHHCCC--C-HHHHHH------HHHHHHHHHHHHHHHHHHHHHHHHHH--CCCHHHHHHHHHH Q ss_conf 999975235544420010--0-001456------77764323588889999875421000--1210467877656 Q gi|254780383|r 105 SENIISLSHILINEFDNK--I-PQTLEG------LTRLPGIGRKGANVILSMAFGIPTIG--VDTHIFRISNRIG 168 (227) Q Consensus 105 Ak~I~~~a~~i~~~~~g~--v-P~~~~~------L~~LpGVG~ktA~~il~~~~~~p~~~--VDthv~Rv~~Rlg 168 (227) .+.|=++|..+ .+-+|. + |.+..+ +.++||||++|+.-.-. +|...+. .+.+...+..++| T Consensus 146 nk~LAKiAs~~-~KP~G~~~i~~~~~~~~l~~lpv~~i~GIG~~~~~kL~~--~GI~Ti~DL~~~~~~~L~~~fG 217 (345) T PRK03352 146 NKQRAKIATGF-AKPAGVFRLTDANWMAVMGDRPVDALWGVGPKTAKKLAA--LGITTVADLAATDPAVLTATFG 217 (345) T ss_pred HHHHHHHHHHH-HHCCCEEECCHHHHHHHHHCCCCHHCCCCCHHHHHHHHH--CCCCCHHHHHCCCHHHHHHHHC T ss_conf 99999999876-104874744888999998527821127828999999997--5997199986699999999978 No 114 >TIGR02076 pyrH_arch uridylate kinase, putative; InterPro: IPR011818 Uridylate kinases (also known as UMP kinases) are key enzymes in the synthesis of nucleoside triphosphates. They catalyse the reversible transfer of the gamma-phosphoryl group from an ATP donor to UMP, yielding UDP, which is the starting point for the synthesis of all other pyrimidine nucleotides. The eukaryotic enzyme has a dual specificity, phosphorylating both UMP and CMP, while the bacterial enzyme is specific to UMP. The bacterial enzyme shows no sequence similarity to the eukaryotic enzyme or other nucleoside monophosphate kinases, but rather appears to be part of the amino acid kinase family. It is dependent on magnesium for activity and is activated by GTP and repressed by UTP , . 
In many bacterial genomes, the gene tends to be located immediately downstream of elongation factor T and upstream of ribosome recycling factor. A related protein family, believed to be equivalent in function, is found in the archaea and in spirochetes. Structurally, the bacterial and archaeal proteins are homohexamers centred around a hollow nucleus and organised as a trimer of dimers. Each monomer within the protein forms the amino acid kinase fold and can be divided into an N-terminal region which binds UMP and mediates intersubunit interactions within the dimer, and a C-terminal region which binds ATP and contains a mobile loop covering the active site. Inhibition of enzyme activity by UTP appears to be due to competition for the binding site for UMP, not allosteric inhibition as was previously suspected. This entry represents the archaeal and spirochete proteins.; GO: 0009041 uridylate kinase activity, 0006221 pyrimidine nucleotide biosynthetic process. Probab=49.76 E-value=15 Score=17.57 Aligned_cols=53 Identities=15% Similarity=0.164 Sum_probs=34.9 Q ss_pred CHHHHHHHHHHHHCC--CHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHH-HHH Q ss_conf 689999999996333--20356799899873176200010126899999999-730 Q gi|254780383|r 47 NHFTLIVAVLLSAQS--TDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIR-TIG 99 (227) Q Consensus 47 ~p~~~LVa~iLs~qT--~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir-~~G 99 (227) |.--.|+|+.+.+.+ .-.+|+.||.+==++||+.+.+-+++.+||.+++. ..- T Consensus 124 DAVAA~lAE~~~ad~L~~~TnVDGVYd~DP~~~~~A~~~~~l~~~eL~~i~~G~~~ 179 (232) T TIGR02076 124 DAVAA~lAE~~~ad~L~~~TnVDGVYd~DP~~~~~A~~~~~l~~~eL~~i~~G~~~ 179 (232) T ss_pred HHHHHHHHHHHCCCCEEEEECCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHCCCCC T ss_conf 99999997662687269982268521777788878400025898899998604411 No 115 >TIGR01982 UbiB 2-polyprenylphenol 6-hydroxylase; InterPro: IPR010232 This entry represents the enzyme (UbiB) which catalyses the first hydroxylation step in the ubiquinone biosynthetic pathway in bacteria. The gene is also known as AarF in certain species. 
It is believed that the reaction converts 2-polyprenylphenol to 6-hydroxy-2-polyprenylphenol. Members are mainly proteobacteria.; GO: 0006744 ubiquinone biosynthetic process. Probab=48.88 E-value=20 Score=16.85 Aligned_cols=21 Identities=14% Similarity=0.143 Sum_probs=9.4 Q ss_pred CCHHHCCCCHHHHHHHHHHHHH Q ss_conf 2000101268999999997300 Q gi|254780383|r 79 TPQKMLAIGEKKLQNYIRTIGI 100 (227) Q Consensus 79 t~e~l~~a~~~el~~~ir~~G~ 100 (227) |.|=+--.+..|++ .++-.|+ T Consensus 246 TmEwIdGip~~D~~-~L~~~G~ 266 (452) T TIGR01982 246 TMEWIDGIPLSDIA-ALDEAGL 266 (452) T ss_pred EEHHCCCCCCHHHH-HHHHCCC T ss_conf 00103552533589-9964689 No 116 >LOAD_Hrd consensus Probab=47.43 E-value=7.7 Score=19.60 Aligned_cols=21 Identities=24% Similarity=0.515 Sum_probs=17.9 Q ss_pred CCHHHHHHHHHHHHHHHHHHH Q ss_conf 000014567776432358888 Q gi|254780383|r 122 KIPQTLEGLTRLPGIGRKGAN 142 (227) Q Consensus 122 ~vP~~~~~L~~LpGVG~ktA~ 142 (227) ..|.+.++|.+++|||++-+. T Consensus 41 ~~P~t~~eL~~I~Gig~~k~~ 61 (77) T LOAD_Hrd 41 LLPTTVSELLAIDGVGEAKVE 61 (77) T ss_pred HCCCCHHHHCCCCCCCHHHHH T ss_conf 789999998289996999999 No 117 >KOG3337 consensus Probab=46.23 E-value=11 Score=18.53 Aligned_cols=30 Identities=10% Similarity=0.160 Sum_probs=14.6 Q ss_pred HHHHHHHCCCCCHHHCCCCHHHHHHHHHHH Q ss_conf 989987317620001012689999999973 Q gi|254780383|r 69 ATKHLFEIADTPQKMLAIGEKKLQNYIRTI 98 (227) Q Consensus 69 ~~~~L~~~ypt~e~l~~a~~~el~~~ir~~ 98 (227) |+..++.|||+|.+..-.+++-|+..+.+- T Consensus 18 VssAfw~RYPNpySkHVlSeDvleR~Vt~d 47 (201) T KOG3337 18 VSSAFWQRYPNPYSKHVLSEDVLEREVTDD 47 (201) T ss_pred HHHHHHHHCCCCCCCCCCCHHHHHHEECCC T ss_conf 999999738997664203377786502877 No 118 >PTZ00035 Rad51; Provisional Probab=45.75 E-value=18 Score=17.06 Aligned_cols=45 Identities=20% Similarity=0.087 Sum_probs=37.5 Q ss_pred HHHHH-HCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 89987-3176200010126899999999730037999999752355444 Q gi|254780383|r 70 TKHLF-EIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILIN 117 (227) Q Consensus 70 
~~~L~-~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~ 117 (227) .++|. .-|.|.++++.++.++|.++ -|+...||..|++.|+.++. T Consensus 48 i~kl~~aG~~tv~~v~~~~~k~L~~i---kgise~k~~Ki~~~a~k~~~ 93 (350) T PTZ00035 48 LELLKEGGLQTVECVAYAPMRTLCAI---KGISEQKAEKLKKACKELCN 93 (350) T ss_pred HHHHHHCCCCHHHHHHHCCHHHHHHC---CCCCHHHHHHHHHHHHHHCC T ss_conf 99999849124899985099989773---79469999999999997557 No 119 >CHL00137 rps13 ribosomal protein S13; Validated Probab=45.66 E-value=10 Score=18.82 Aligned_cols=26 Identities=35% Similarity=0.464 Sum_probs=19.8 Q ss_pred CCHHH---HHHHHHHHHHHHHHHHHHHHH Q ss_conf 00001---456777643235888899998 Q gi|254780383|r 122 KIPQT---LEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 122 ~vP~~---~~~L~~LpGVG~ktA~~il~~ 147 (227) ++|.+ .-.|.++.|||+.+|..|+.- T Consensus 8 ~i~~~K~v~~aLt~I~GIG~~~A~~Ic~~ 36 (122) T CHL00137 8 DLPRNKRIEIALTYIYGIGLTSAKKILEK 36 (122) T ss_pred CCCCCCEEEEHHHHHCCCCHHHHHHHHHH T ss_conf 79999773131110006189999999998 No 120 >PRK04301 radA DNA repair and recombination protein RadA; Validated Probab=45.14 E-value=24 Score=16.21 Aligned_cols=46 Identities=22% Similarity=0.269 Sum_probs=37.4 Q ss_pred HHHHHHH-CCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9899873-176200010126899999999730037999999752355444 Q gi|254780383|r 69 ATKHLFE-IADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILIN 117 (227) Q Consensus 69 ~~~~L~~-~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~ 117 (227) ...+|.+ -|.|.++++.++.++|.++ .|+...||..|++.++.+.. T Consensus 19 ~~~kL~~aG~~tv~~l~~~~~~~L~~~---~gis~~~a~ki~~~a~~~~~ 65 (318) T PRK04301 19 TAEKLREAGYDTVEAIAVASPKELSEI---AGISESTAAKIIEAAREALD 65 (318) T ss_pred HHHHHHHCCCCCHHHHHCCCHHHHHHH---HCCCHHHHHHHHHHHHHHCC T ss_conf 999999869954999874899999985---09999999999999998536 No 121 >pfam11798 IMS_HHH IMS family HHH motif. These proteins are involved in UV protection, eg. 
Probab=44.07 E-value=8.9 Score=19.16 Aligned_cols=14 Identities=50% Similarity=1.023 Sum_probs=12.3 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 77764323588889 Q gi|254780383|r 130 LTRLPGIGRKGANV 143 (227) Q Consensus 130 L~~LpGVG~ktA~~ 143 (227) +.++||||++++.. T Consensus 14 i~~i~GIG~~~~~~ 27 (33) T pfam11798 14 ISKIPGIGRKTAEK 27 (33) T ss_pred CCCCCCCCHHHHHH T ss_conf 22168866678999 No 122 >PRK01151 rps17E 30S ribosomal protein S17e; Validated Probab=43.98 E-value=10 Score=18.82 Aligned_cols=28 Identities=14% Similarity=0.266 Sum_probs=22.6 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHHHH Q ss_conf 7999999752355444200100001456 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNKIPQTLEG 129 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~vP~~~~~ 129 (227) +.|.+.|+.+++.|+|+|.+.+-.|++. T Consensus 2 ~Ir~k~IKr~a~~lieky~~~ft~DF~~ 29 (58) T PRK01151 2 NIRPTFIKRTAEELLEKYPDKFTDDFEH 29 (58) T ss_pred CCCCHHHHHHHHHHHHHHHHHHCCCHHH T ss_conf 8362799999999999832052478888 No 123 >cd01700 Pol_V Pol V was discovered in Escherichia coli as the UmuC and UmuD proteins induced by UV. This branch of DNA polymerases is mostly found in bacteria. Pol V enables DNA replication to bypass covalently linked cis-syn T-T photo-dimers and 6-4 T-T or T-C photoproducts, which would otherwise stall the DNA replication fork. 
Probab=43.93 E-value=18 Score=17.12 Aligned_cols=19 Identities=26% Similarity=0.436 Sum_probs=13.9 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 7776432358888999987 Q gi|254780383|r 130 LTRLPGIGRKGANVILSMA 148 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~~ 148 (227) +.++||||++|+.-.-..+ T Consensus 177 v~~i~GIG~~t~~kL~~~G 195 (344) T cd01700 177 VGDVWGIGRRLSKRLAAMG 195 (344) T ss_pred HHHCCCCCHHHHHHHHHCC T ss_conf 4442783699999999869 No 124 >COG1031 Uncharacterized Fe-S oxidoreductase [Energy production and conversion] Probab=43.84 E-value=13 Score=18.01 Aligned_cols=109 Identities=19% Similarity=0.276 Sum_probs=48.6 Q ss_pred HHHHHHHHHHHHHHCCCCCCCCCCC--CHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHH-HHH Q ss_conf 8999999999999767999877778--6899999999963332035679989987317620001012689999999-973 Q gi|254780383|r 22 PKELEEIFYLFSLKWPSPKGELYYV--NHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYI-RTI 98 (227) Q Consensus 22 ~~~~~~I~~~L~~~yp~~~~~l~~~--~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~i-r~~ 98 (227) |+.+++.+.-....-|+.++ |+-+ ||- +| ++--++ ..++ -+..-+|.||-..+..+.+.-..-. +-- T Consensus 260 PealekL~~Gir~~AP~l~t-LHiDNaNP~-----tI--a~yp~e-Sr~i-~K~ivky~TpGnVaAfGlEsaDp~V~r~N 329 (560) T COG1031 260 PEALEKLFRGIRNVAPNLKT-LHIDNANPA-----TI--ARYPEE-SREI-AKVIVKYGTPGNVAAFGLESADPRVARKN 329 (560) T ss_pred HHHHHHHHHHHHHHCCCCEE-EEECCCCCH-----HH--HCCHHH-HHHH-HHHHHHHCCCCCEEEEECCCCCHHHHHHC T ss_conf 89999999999861898726-654589956-----44--158488-9999-99998647987554330454687787640 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCC-----CHH---HHHHHHHHHHHHHHHHH Q ss_conf 003799999975235544420010-----000---14567776432358888 Q gi|254780383|r 99 GIYRKKSENIISLSHILINEFDNK-----IPQ---TLEGLTRLPGIGRKGAN 142 (227) Q Consensus 99 G~~~~KAk~I~~~a~~i~~~~~g~-----vP~---~~~~L~~LpGVG~ktA~ 142 (227) ++ +.-+.-..++.+ |++++||. +|. -.+.+..|||=-.+|=. 
T Consensus 330 nL-~~spEEvl~AV~-ivn~vG~~rg~nGlP~lLPGINfv~GL~GEtkeT~~ 379 (560) T COG1031 330 NL-NASPEEVLEAVE-IVNEVGGGRGYNGLPYLLPGINFVFGLPGETKETYE 379 (560) T ss_pred CC-CCCHHHHHHHHH-HHHHHCCCCCCCCCCCCCCCCEEEECCCCCCHHHHH T ss_conf 56-699899999999-999864766768984204662067338876277788 No 125 >PRK12278 50S ribosomal protein L21/unknown domain fusion protein; Provisional Probab=43.54 E-value=9.3 Score=19.04 Aligned_cols=41 Identities=22% Similarity=0.336 Sum_probs=23.3 Q ss_pred HHHHHHHHHHHHHHHHHHHH---HHHHHHHHHCCCHHHHHHHHH Q ss_conf 45677764323588889999---875421000121046787765 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILS---MAFGIPTIGVDTHIFRISNRI 167 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~---~~~~~p~~~VDthv~Rv~~Rl 167 (227) -++|.+|.||||+.+...-. |.|.+=+---|..+.+|-..+ T Consensus 152 aDDLk~I~GIGP~~e~~Ln~~GI~~F~QIA~~t~~dia~id~~l 195 (216) T PRK12278 152 ADDLTKITGVGPALAKKLNEAGITTFAQIAALTDEDIAAIDEKL 195 (216) T ss_pred CCCCCEECCCCHHHHHHHHHHCCCHHHHHHCCCHHHHHHHHHHC T ss_conf 96543602658899999998187239998559999999987652 No 126 >PTZ00217 flap endonuclease-1; Provisional Probab=43.19 E-value=26 Score=16.02 Aligned_cols=17 Identities=29% Similarity=0.468 Sum_probs=13.1 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 77764323588889999 Q gi|254780383|r 130 LTRLPGIGRKGANVILS 146 (227) Q Consensus 130 L~~LpGVG~ktA~~il~ 146 (227) +-++||||+++|--++- T Consensus 237 ~~~I~GIGpk~A~klIk 253 (394) T PTZ00217 237 CDTIEGIGPKTAYELIK 253 (394) T ss_pred CCCCCCCCHHHHHHHHH T ss_conf 68998748899999999 No 127 >pfam00416 Ribosomal_S13 Ribosomal protein S13/S18. This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes. 
Probab=43.08 E-value=8.4 Score=19.35 Aligned_cols=21 Identities=48% Similarity=0.585 Sum_probs=17.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHH Q ss_conf 456777643235888899998 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~~ 147 (227) .-.|.++.|||+.+|..|+.- T Consensus 14 ~~ALt~I~GIG~~~A~~Ic~~ 34 (106) T pfam00416 14 EIALTYIKGIGRRKANQILKK 34 (106) T ss_pred EEEECCCCCCCHHHHHHHHHH T ss_conf 444112105289999999999 No 128 >KOG1338 consensus Probab=42.96 E-value=26 Score=15.99 Aligned_cols=98 Identities=17% Similarity=0.225 Sum_probs=53.6 Q ss_pred CHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCC---CHHHHHHHHHHHHHHHH-H-HHHHHHHHHHHHHHHCC Q ss_conf 6899999999963332035679989987317620001012---68999999997300379-9-99997523554442001 Q gi|254780383|r 47 NHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAI---GEKKLQNYIRTIGIYRK-K-SENIISLSHILINEFDN 121 (227) Q Consensus 47 ~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a---~~~el~~~ir~~G~~~~-K-Ak~I~~~a~~i~~~~~g 121 (227) .-|..|+.++|--.-.-. - --+.-.|++||+|..|-.. +++|+.-+..+.++.+. | +..|..--..+++-|++ T Consensus 88 gsw~~Lllvll~E~~~pq-~-SrWrPYfs~wp~p~rm~spifWdEnEl~~Ll~stvlee~~Kd~aeI~~~~i~~i~pf~~ 165 (466) T KOG1338 88 GSWGMLLLVLLREKKMPQ-K-SRWRPYFSRWPQPARMHSPIFWDENELSMLLCSTVLEETVKDKAEIEKDFIFVIQPFKQ 165 (466) T ss_pred CCHHHHHHHHHHHHHCCC-C-CCCCCHHHHCCCHHHCCCCCCCCCHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 749999999998750532-0-23440777379866527875578057777752345226676899999999999999987 Q ss_pred CCHHH-----HHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 00001-----456777643235888899998754210 Q gi|254780383|r 122 KIPQT-----LEGLTRLPGIGRKGANVILSMAFGIPT 153 (227) Q Consensus 122 ~vP~~-----~~~L~~LpGVG~ktA~~il~~~~~~p~ 153 (227) ..|.- ++... +.+.++|.|.|+.+. 
T Consensus 166 ~~p~vfs~~slEdF~-------y~~Al~laysfdve~ 195 (466) T KOG1338 166 HCPIVFSRPSLEDFM-------YAYALGLAYSFDVEF 195 (466) T ss_pred HCCCHHCCCCHHHHH-------HHHHHHHHHHEEEEH T ss_conf 685410353788899-------999999888321002 No 129 >PRK03609 umuC DNA polymerase V subunit UmuC; Reviewed Probab=42.06 E-value=27 Score=15.90 Aligned_cols=19 Identities=21% Similarity=0.422 Sum_probs=14.2 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 7776432358888999987 Q gi|254780383|r 130 LTRLPGIGRKGANVILSMA 148 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~~ 148 (227) +..++|||++|+.-.-..+ T Consensus 181 v~~lwGIG~~~~~kL~~~G 199 (422) T PRK03609 181 VEEVWGVGRRISKKLNAMG 199 (422) T ss_pred HHHHHHHHHHHHHHHHHCC T ss_conf 8898611399999999878 No 130 >pfam03564 DUF1759 Protein of unknown function (DUF1759). This is a family of proteins of unknown function. Probab=41.39 E-value=28 Score=15.84 Aligned_cols=31 Identities=29% Similarity=0.215 Sum_probs=15.1 Q ss_pred CHHHHHHHHHHHHHCCCCCHHHCCCCHHHHH Q ss_conf 2035679989987317620001012689999 Q gi|254780383|r 62 TDVNVNKATKHLFEIADTPQKMLAIGEKKLQ 92 (227) Q Consensus 62 ~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~ 92 (227) ++.+-..|.+.|.++|.++..++.+-.++|. 
T Consensus 52 t~~nY~~A~~~L~~Ry~n~rli~~s~~~~l~ 82 (146) T pfam03564 52 TAANYDVAWEALKERYDNPRVIIRSLLNKLM 82 (146) T ss_pred CCCCHHHHHHHHHHHHCCHHHHHHHHHHHHH T ss_conf 8779999999999871288899999999998 No 131 >PRK04053 rps13p 30S ribosomal protein S13P; Reviewed Probab=41.39 E-value=14 Score=17.93 Aligned_cols=29 Identities=31% Similarity=0.413 Sum_probs=21.3 Q ss_pred CCCCHHH---HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 0100001---4567776432358888999987 Q gi|254780383|r 120 DNKIPQT---LEGLTRLPGIGRKGANVILSMA 148 (227) Q Consensus 120 ~g~vP~~---~~~L~~LpGVG~ktA~~il~~~ 148 (227) |-++|.+ .-.|..+-|||+.+|..|+.-+ T Consensus 14 gvdIp~~K~i~~ALt~IyGIG~~~A~~Ic~~l 45 (149) T PRK04053 14 NTDLDGTKPVEYALTGIKGIGRRTARAIARKL 45 (149) T ss_pred CCCCCCCCEEEEECCCCCCCCHHHHHHHHHHH T ss_conf 86789996864441111484899999999991 No 132 >TIGR02789 nickel_nikB nickel ABC transporter, permease subunit NikB; InterPro: IPR014156 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. 
In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. 
In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This family consists of the NikB family of nickel ABC transporter permeases. Operons that contain this protein also contain a homologous permease subunit NikC. Nickel is used in cells as part of urease or certain hydrogenases or superoxide dismutases. Probab=41.15 E-value=9.9 Score=18.86 Aligned_cols=19 Identities=32% Similarity=0.697 Sum_probs=11.1 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 4567776432358888999 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVIL 145 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il 145 (227) .|..=++||||++.-++|. T Consensus 253 iE~iFswPG~GRy~i~Aif 271 (315) T TIGR02789 253 IENIFSWPGVGRYAISAIF 271 (315) T ss_pred EEEECCCCCHHHHHHHHHH T ss_conf 5410126661045743356 No 133 >KOG3220 consensus Probab=40.49 E-value=29 Score=15.74 Aligned_cols=28 Identities=21% Similarity=0.331 Sum_probs=14.0 Q ss_pred CCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCH Q ss_conf 21046787765654078889999999621884 Q gi|254780383|r 157 DTHIFRISNRIGLAPGKTPNKVEQSLLRIIPP 188 (227) Q Consensus 157 Dthv~Rv~~Rlgl~~~~~~~~~~~~l~~~~p~ 188 (227) |+-+.|+..|=++ +.+..+..+....|- T Consensus 137 ~~Ql~Rl~~Rd~l----se~dAe~Rl~sQmp~ 164 (225) T KOG3220 137 ELQLERLVERDEL----SEEDAENRLQSQMPL 164 (225) T ss_pred HHHHHHHHHHCCC----CHHHHHHHHHHCCCH T ss_conf 8999999874464----699999898732987 No 134 >TIGR01993 Pyr-5-nucltdase pyrimidine 5'-nucleotidase; InterPro: IPR010237 This family of proteins includes the SDT1/SSM1 gene from yeast, which has been shown to code for a pyrimidine (UMP/CMP) 5'nucleotidase. The family spans plants, fungi and bacteria. These enzymes are members of the haloacid dehalogenase (HAD) superfamily of hydrolases, specifically the IA subfamily. 
Probab=40.30 E-value=5.4 Score=20.61 Aligned_cols=39 Identities=36% Similarity=0.431 Sum_probs=27.2 Q ss_pred HHHHHHHHHHH---HHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCC Q ss_conf 00145677764---32358888999987542100012104678776565407 Q gi|254780383|r 124 PQTLEGLTRLP---GIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPG 172 (227) Q Consensus 124 P~~~~~L~~Lp---GVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~ 172 (227) |..++.|.+|| -+|+|. +|.-+- -.|+.|+++|||+.+. T Consensus 89 p~L~~~L~~LpqsGK~~Rk~-----iFTN~~-----~~Ha~r~l~~LGi~d~ 130 (205) T TIGR01993 89 PELRNLLLRLPQSGKKGRKI-----IFTNGD-----RAHARRALNRLGIEDC 130 (205) T ss_pred HHHHHHHHHHHHCCCCCCEE-----EEECCC-----HHHHHHHHHHCCHHHC T ss_conf 88999999734126555567-----761587-----8999999986472120 No 135 >pfam00570 HRDC HRDC domain. The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains. Probab=40.00 E-value=11 Score=18.49 Aligned_cols=21 Identities=48% Similarity=0.934 Sum_probs=18.2 Q ss_pred CCHHHHHHHHHHHHHHHHHHH Q ss_conf 000014567776432358888 Q gi|254780383|r 122 KIPQTLEGLTRLPGIGRKGAN 142 (227) Q Consensus 122 ~vP~~~~~L~~LpGVG~ktA~ 142 (227) ..|.+.++|.+++|+|+..+. T Consensus 38 ~~P~s~~~L~~i~g~~~~~~~ 58 (68) T pfam00570 38 KLPRTLEELLRIPGVGPRKLE 58 (68) T ss_pred HCCCCHHHHHCCCCCCHHHHH T ss_conf 784999998089999999999 No 136 >pfam05087 Rota_VP2 Rotavirus VP2 protein. Rotavirus particles consist of three concentric proteinaceous capsid layers. The innermost capsid (core) is made of VP2. The genomic RNA and the two minor proteins VP1 and VP3 are encapsidated within this layer. The N-terminus of rotavirus VP2 is necessary for the encapsidation of VP1 and VP3. 
Probab=39.61 E-value=17 Score=17.28 Aligned_cols=57 Identities=19% Similarity=0.250 Sum_probs=33.0 Q ss_pred CCCHHHHHHHHHHHHCCCH-HHHHHHHHHHH------HCCCC----CHHHCCCCHHHHHHHHHHH-HHH Q ss_conf 7868999999999633320-35679989987------31762----0001012689999999973-003 Q gi|254780383|r 45 YVNHFTLIVAVLLSAQSTD-VNVNKATKHLF------EIADT----PQKMLAIGEKKLQNYIRTI-GIY 101 (227) Q Consensus 45 ~~~p~~~LVa~iLs~qT~d-~~v~~~~~~L~------~~ypt----~e~l~~a~~~el~~~ir~~-G~~ 101 (227) ..|.|+.+|+++|||||-. +-|-.-+-.|. ..-|+ -|++...-..-+..+|.|+ |+. T Consensus 382 and~FKtiIa~mLsqRT~sl~FvTsnymSLiS~MwL~tivP~~mFiReslvAcQLAiiNTiiYPafGl~ 450 (887) T pfam05087 382 ANDAFKTIIATMLSQRTISLEFVTSNYMSLISCMWLMTIVPSEMFIRESLVACQLAVINTIIYPAFGLQ 450 (887) T ss_pred HHHHHHHHHHHHHHHCEEEEEEECCCHHHHHHHHHHHEECCHHHHHHHHHHHHHHHHHHHHHHHCCCHH T ss_conf 778999999999740235677632234778865252023632677898999999999876530032327 No 137 >PRK07758 hypothetical protein; Provisional Probab=39.51 E-value=14 Score=17.77 Aligned_cols=17 Identities=18% Similarity=0.372 Sum_probs=13.0 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 14567776432358888 Q gi|254780383|r 126 TLEGLTRLPGIGRKGAN 142 (227) Q Consensus 126 ~~~~L~~LpGVG~ktA~ 142 (227) +.++|++|.||||.+-. T Consensus 65 tEkELL~LHGmGP~ai~ 81 (95) T PRK07758 65 SEKEILKLHGMGPASLP 81 (95) T ss_pred HHHHHHHHHCCCHHHHH T ss_conf 19999998486888899 No 138 >pfam10343 DUF2419 Protein of unknown function (DUF2419). This is a family of conserved proteins found from plants to humans. The function is not known. A few members are annotated as being cobyrinic acid a,c-diamide synthetase but this could not be confirmed. 
Probab=39.42 E-value=30 Score=15.64 Aligned_cols=64 Identities=16% Similarity=0.228 Sum_probs=45.4 Q ss_pred CHHHHHHHHHHHHCCCHHHHHHHHHHHHHCC--CCCHHHCCCCHHHHHHHHHHH-----HHHHHHHHHHHHHHHHHHHHH Q ss_conf 6899999999963332035679989987317--620001012689999999973-----003799999975235544420 Q gi|254780383|r 47 NHFTLIVAVLLSAQSTDVNVNKATKHLFEIA--DTPQKMLAIGEKKLQNYIRTI-----GIYRKKSENIISLSHILINEF 119 (227) Q Consensus 47 ~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~y--pt~e~l~~a~~~el~~~ir~~-----G~~~~KAk~I~~~a~~i~~~~ 119 (227) +-|+.|++.|-- .|-+.- -+|+-+++.+.+++..+.++- -+-..|.+-|.++.+.+.++| T Consensus 44 ~gY~aL~a~l~r-------------Al~~gi~i~~~~~~~~~t~~~l~~iF~~~~~~e~Pll~ER~~~L~EvG~vL~~~f 110 (282) T pfam10343 44 TGYWSLCAALKR-------------ALDEGIPITDPEFWAKCTLEELRHIFRSATDEEIPLLEERLRCLREAGRVLLEKF 110 (282) T ss_pred CCHHHHHHHHHH-------------HHHCCCCCCCHHHHHHCCHHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHC T ss_conf 638999999999-------------9975998668899986799999998347998868788999999999999999872 Q ss_pred CCCC Q ss_conf 0100 Q gi|254780383|r 120 DNKI 123 (227) Q Consensus 120 ~g~v 123 (227) +|.. T Consensus 111 ~Gs~ 114 (282) T pfam10343 111 DGSF 114 (282) T ss_pred CCCH T ss_conf 9879 No 139 >PRK02362 ski2-like helicase; Provisional Probab=39.35 E-value=30 Score=15.63 Aligned_cols=45 Identities=18% Similarity=0.144 Sum_probs=30.1 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH Q ss_conf 000145677764323588889999875421000121046787765 Q gi|254780383|r 123 IPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRI 167 (227) Q Consensus 123 vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl 167 (227) ++.+.=+|++|||||+..|-..-.-+|.-+.-...++..+|..=+ T Consensus 647 v~~ELl~L~~I~gvgr~RAR~Ly~aGi~s~~dla~A~p~~l~~il 691 (736) T PRK02362 647 VREELLDLVGLRGIGRVRARRLYNAGITSRADLRAADKEVVAAIL 691 (736) T ss_pred CCHHHHHHCCCCCCCHHHHHHHHHCCCCCHHHHHHCCHHHHHHHH T ss_conf 877689770889999899999998799999999709999999997 No 140 >cd00128 XPG Xeroderma pigmentosum G N- and I-regions (XPGN, XPGI); contains the HhH2 motif; domain in nucleases. 
XPG is a eukaryotic enzyme that functions in nucleotide-excision repair and transcription-coupled repair of oxidative DNA damage. Functionally/structurally related to FEN-1; divalent metal ion-dependent exo- and endonuclease, and bacterial and bacteriophage 5'3' exonucleases. Probab=39.14 E-value=29 Score=15.68 Aligned_cols=17 Identities=29% Similarity=0.487 Sum_probs=12.8 Q ss_pred HHHHHHHHHHHHHHHHH Q ss_conf 77643235888899998 Q gi|254780383|r 131 TRLPGIGRKGANVILSM 147 (227) Q Consensus 131 ~~LpGVG~ktA~~il~~ 147 (227) -.+||||++||--++.- T Consensus 226 ~gi~giG~k~A~kli~~ 242 (316) T cd00128 226 EGIPGIGPVTALKLIKK 242 (316) T ss_pred CCCCCCCHHHHHHHHHH T ss_conf 99997359999999999 No 141 >TIGR01016 sucCoAbeta succinyl-CoA synthetase, beta subunit; InterPro: IPR005809 There are four different enzymes that share a similar catalytic mechanism which involves the phosphorylation by ATP (or GTP) of a specific histidine residue in the active site. These enzymes are: ATP citrate-lyase (4.1.3.8 from EC) , the primary enzyme responsible for the synthesis of cytosolic acetyl-CoA in many tissues, catalyzes the formation of acetyl-CoA and oxaloacetate from citrate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. ATP-citrate lyase is a tetramer of identical subunits; Succinyl-CoA ligase (GDP-forming) (6.2.1.4 from EC) is a mitochondrial enzyme that catalyzes the substrate level phosphorylation step of the tricarboxylic acid cycle: the formation of succinyl-CoA from succinate with a concomitant hydrolysis of GTP to GDP and phosphate. This enzyme is a dimer composed of an alpha and a beta subunits; Succinyl-CoA ligase (ADP-forming) (6.2.1.5 from EC) is a bacterial enzyme that during aerobic metabolism functions in the citric acid cycle, coupling the hydrolysis of succinyl-CoA to the synthesis of ATP. It can also function in the other direction for anabolic purposes. 
This enzyme is a tetramer composed of two alpha and two beta subunits; and Malate-CoA ligase (6.2.1.9 from EC) (malyl-CoA synthetase) , is a bacterial enzyme that forms malyl-CoA from malate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. Malate-CoA ligase is composed of two different subunits. This entry corresponds to a conserved region located in the first half of ATP citrate lyase and in the beta subunits of succinyl-CoA ligases and malate-CoA ligase. ; GO: 0003824 catalytic activity, 0008152 metabolic process. Probab=39.09 E-value=30 Score=15.66 Aligned_cols=30 Identities=17% Similarity=0.324 Sum_probs=19.2 Q ss_pred HHHHHCC--CCCHHHCCCCHHHHHHHHHHHHH Q ss_conf 9987317--62000101268999999997300 Q gi|254780383|r 71 KHLFEIA--DTPQKMLAIGEKKLQNYIRTIGI 100 (227) Q Consensus 71 ~~L~~~y--pt~e~l~~a~~~el~~~ir~~G~ 100 (227) +.||++| |+|+.....+++|++.++..+|. T Consensus 9 K~if~~YGiPvp~g~v~~s~~e~~~~~~~~g~ 40 (389) T TIGR01016 9 KEIFAKYGIPVPEGEVATSVEEVEEIAEELGE 40 (389) T ss_pred HHHHHHCCCCCCCCCEEECHHHHHHHHHHCCC T ss_conf 99998478967886004167899999997079 No 142 >PTZ00205 DNA polymerase kappa; Provisional Probab=38.88 E-value=29 Score=15.75 Aligned_cols=39 Identities=15% Similarity=0.388 Sum_probs=22.1 Q ss_pred HHHHHHHHHHHHHCCC--CH-HHHHH---------HHHHHHHHHHHHHHHHH Q ss_conf 9975235544420010--00-01456---------77764323588889999 Q gi|254780383|r 107 NIISLSHILINEFDNK--IP-QTLEG---------LTRLPGIGRKGANVILS 146 (227) Q Consensus 107 ~I~~~a~~i~~~~~g~--vP-~~~~~---------L~~LpGVG~ktA~~il~ 146 (227) .|=++|..+ ++-||+ +| .++++ +-++||||+=|+...-. 
T Consensus 277 ~LAKIASd~-nKPNGQfvl~~~~r~~V~~F~~~LPvRKIpGIGkVte~~L~a 327 (571) T PTZ00205 277 ALAKIASNI-NKPNGQHDLNLHTRGDVMTYVRDLGLRSVPGVGKVTEALLKG 327 (571) T ss_pred HHHHHHHCC-CCCCCCEEECCCCHHHHHHHHHHCCCCCCCCCCHHHHHHHHH T ss_conf 999987623-798985454589989999999838976689756888999987 No 143 >TIGR01448 recD_rel helicase, RecD/TraA family; InterPro: IPR006345 These sequences represent a family similar to RecD, the exodeoxyribonuclease V alpha chain of IPR006344 from INTERPRO. Members of this family, however, are not found in a context of RecB and RecC and are longer by about 200 amino acids at the amino end. Chlamydia muridarum has both a member of this family and a RecD. . Probab=38.69 E-value=15 Score=17.59 Aligned_cols=17 Identities=35% Similarity=0.534 Sum_probs=10.8 Q ss_pred HHH-HHHHHHHHHHHHHH Q ss_conf 777-64323588889999 Q gi|254780383|r 130 LTR-LPGIGRKGANVILS 146 (227) Q Consensus 130 L~~-LpGVG~ktA~~il~ 146 (227) |.. +.|||=.+||.+.. T Consensus 196 L~~d~~GiGF~~AD~lA~ 213 (769) T TIGR01448 196 LAEDVKGIGFKTADQLAE 213 (769) T ss_pred HHHHHCCCCHHHHHHHHH T ss_conf 367513740568999999 No 144 >TIGR02924 ICDH_alpha isocitrate dehydrogenase; InterPro: IPR014273 This entry represents a group of isocitrate dehydrogenases found mainly in the alphaproteobacteria. Many of the species containing these proteins appear to have a TCA cycle lacking only a determined isocitrate dehydrogenase. The precise identity of the cofactor (NADH -- 1.1.1.41 from EC vs. NADPH -- 1.1.1.42 from EC) is unclear.. Probab=38.66 E-value=15 Score=17.56 Aligned_cols=33 Identities=9% Similarity=0.021 Sum_probs=15.8 Q ss_pred CCCHHHHHHHHH----HHHCCC-CHHHHHHHHHHCCCH Q ss_conf 121046787765----654078-889999999621884 Q gi|254780383|r 156 VDTHIFRISNRI----GLAPGK-TPNKVEQSLLRIIPP 188 (227) Q Consensus 156 VDthv~Rv~~Rl----gl~~~~-~~~~~~~~l~~~~p~ 188 (227) ++++..|++-=. .|..+. 
+..++-..|++.-++ T Consensus 366 ~~~~~~k~LvG~Difi~wd~~~ld~~~l~~~L~~~~~~ 403 (481) T TIGR02924 366 EQSYKVKVLVGVDIFIKWDGSSLDLNQLVKLLEKIKLD 403 (481) T ss_pred CCCCCEEEEEEEEEEEEECCCCCCHHHHHHHHHHCCCC T ss_conf 36631148988778997367582489999999853899 No 145 >PTZ00134 40S ribosomal protein S18; Provisional Probab=37.70 E-value=10 Score=18.81 Aligned_cols=27 Identities=33% Similarity=0.438 Sum_probs=19.8 Q ss_pred CCHHH---HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 00001---4567776432358888999987 Q gi|254780383|r 122 KIPQT---LEGLTRLPGIGRKGANVILSMA 148 (227) Q Consensus 122 ~vP~~---~~~L~~LpGVG~ktA~~il~~~ 148 (227) ++|.+ .-.|..+.|||+.+|..|+--+ T Consensus 21 di~g~K~v~~aLt~I~GIG~~~A~~Ic~~~ 50 (154) T PTZ00134 21 NVDGREKVTIALTAIKGIGRRFATVVCKQA 50 (154) T ss_pred CCCCCCEEEEEEEEECCCCHHHHHHHHHHC T ss_conf 589995889985322064899999999980 No 146 >smart00341 HRDC Helicase and RNase D C-terminal. Hypothetical role in nucleic acid binding. Mutations in the HRDC domain cause human disease. Probab=37.22 E-value=14 Score=17.89 Aligned_cols=21 Identities=24% Similarity=0.563 Sum_probs=17.3 Q ss_pred CCHHHHHHHHHHHHHHHHHHH Q ss_conf 000014567776432358888 Q gi|254780383|r 122 KIPQTLEGLTRLPGIGRKGAN 142 (227) Q Consensus 122 ~vP~~~~~L~~LpGVG~ktA~ 142 (227) ..|.+.++|.+++|+|++... T Consensus 41 ~~P~~~~~L~~i~g~~~~~~~ 61 (81) T smart00341 41 ALPTNVSELLAIDGVGEEKAR 61 (81) T ss_pred HCCCCHHHHHCCCCCCHHHHH T ss_conf 887999998468999999999 No 147 >cd01703 Pol_iota Pol iota is a member of the DNA polymerase Y-family, and has also been called Rad30 homolog B. Unlike classic DNA polymerases, Y-family polymerases are induced by DNA damage. They can traverse normal replication-blocking DNA lesions. Unlike Pol eta, Pol iota is unable to replicate through a cis-syn T-T dimer. In human Pol iota, the base-pairing mode in the active site at the replicative end may be Hoogsteen instead of Watson-Crick. Human Pol iota can incorporate the correct nucleotide opposite a purine much more efficiently than opposite a pyrimidine. 
Pol iota prefers to insert Guanosine instead of Adenosine opposite Thymidine. Probab=37.22 E-value=16 Score=17.43 Aligned_cols=18 Identities=39% Similarity=0.650 Sum_probs=14.6 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 777643235888899998 Q gi|254780383|r 130 LTRLPGIGRKGANVILSM 147 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~ 147 (227) +.++||||.+|+.-+-.. T Consensus 212 i~ki~GIG~kt~~~L~~~ 229 (394) T cd01703 212 LRKIPGIGYKTAKKLEDH 229 (394) T ss_pred HHHHCCCCHHHHHHHHHC T ss_conf 055278179999999983 No 148 >pfam11264 ThylakoidFormat Thylakoid formation protein. THF1 is localized to the outer plastid membrane and the stroma. THF1 has a role in sugar signalling. THF1 is also thought to have a role in chloroplast and leaf development. THF1 has been shown to play a crucial role in vesicle-mediated thylakoid membrane biogenesis. Probab=36.76 E-value=33 Score=15.36 Aligned_cols=95 Identities=19% Similarity=0.194 Sum_probs=57.3 Q ss_pred HHHHHHCCCCCCCCCCCCHHHHHHHH-HHHHCCC---H----HHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHH Q ss_conf 99999767999877778689999999-9963332---0----35679989987317620001012689999999973003 Q gi|254780383|r 30 YLFSLKWPSPKGELYYVNHFTLIVAV-LLSAQST---D----VNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIY 101 (227) Q Consensus 30 ~~L~~~yp~~~~~l~~~~p~~~LVa~-iLs~qT~---d----~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~ 101 (227) +.+.+.||.|..++.-+--=++||-. +|+.|+. | --+..+|+.|++.|+..+.... 
=...++..+||- T Consensus 7 r~F~~~~p~pI~~lYrrvvdELLVElHLL~~n~~F~yD~lFAlGlvt~Fd~fm~GY~Pee~~~~----IF~Alc~s~~~d 82 (216) T pfam11264 7 RAFHAAYPRPIPSLYRRVVDELLVELHLLSHQSDFKYDPLFALGLVTVFDRFMEGYRPEEHKDA----IFNALCSALGFD 82 (216) T ss_pred HHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCCHHHHHH----HHHHHHHHCCCC T ss_conf 9999858988828999999999999999886036542737884599999999767998567999----999999854899 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHHHHHH Q ss_conf 799999975235544420010000145677 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNKIPQTLEGLT 131 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~ 131 (227) +..+.+.|+.+.+...|.-..+...++ T Consensus 83 ---p~~~r~dA~~l~~~a~~~s~~~i~~~l 109 (216) T pfam11264 83 ---PEQLRKDAEQLEEAAKSHSLEEIVSWL 109 (216) T ss_pred ---HHHHHHHHHHHHHHHHCCCHHHHHHHH T ss_conf ---999999999999998759999999999 No 149 >PRK03348 DNA polymerase IV; Provisional Probab=35.93 E-value=28 Score=15.82 Aligned_cols=48 Identities=21% Similarity=0.256 Sum_probs=27.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHCCCC---HHHHHH------HHHHHHHHHHHHHHHHHHH Q ss_conf 30037999999752355444200100---001456------7776432358888999987 Q gi|254780383|r 98 IGIYRKKSENIISLSHILINEFDNKI---PQTLEG------LTRLPGIGRKGANVILSMA 148 (227) Q Consensus 98 ~G~~~~KAk~I~~~a~~i~~~~~g~v---P~~~~~------L~~LpGVG~ktA~~il~~~ 148 (227) +|...+ |.|=++|..+ .+-+|.. 
|.+..+ +.+|||||++|+.-.-..+ T Consensus 143 IGIa~n--K~LAKiAS~~-aKP~G~~vi~p~~~~~~L~~lPV~~lwGIG~~t~~kL~~~G 199 (456) T PRK03348 143 VGAGSG--KQIAKIASGL-AKPDGIRVVPPGEERELLAPLPVRRLWGIGPVAEEKLHRLG 199 (456) T ss_pred EEECHH--HHHHHHHHHH-CCCCCEEEECHHHHHHHHHHCCHHHCCCCCHHHHHHHHHCC T ss_conf 887415--9999999872-48983799688899999875657433877599999999869 No 150 >PRK01216 DNA polymerase IV; Validated Probab=35.56 E-value=17 Score=17.26 Aligned_cols=42 Identities=14% Similarity=0.323 Sum_probs=22.2 Q ss_pred HHHHHHHHHHHHHHCCC--C-HHHHHH------HHHHHHHHHHHHHHHHHHH Q ss_conf 99975235544420010--0-001456------7776432358888999987 Q gi|254780383|r 106 ENIISLSHILINEFDNK--I-PQTLEG------LTRLPGIGRKGANVILSMA 148 (227) Q Consensus 106 k~I~~~a~~i~~~~~g~--v-P~~~~~------L~~LpGVG~ktA~~il~~~ 148 (227) +.|=++|..+ ++-+|. + |.+..+ +.++||||++|+.-....+ T Consensus 148 k~lAKiAs~~-~KP~G~~~v~~~~~~~fl~~lpv~~i~GIG~~t~~~L~~~G 198 (351) T PRK01216 148 KVFAKIIADM-AKPNGLGVISPEEVKEFLNNLDIDDVPGIGKVLSERLNELG 198 (351) T ss_pred HHHHHHHHHH-CCCCCEEEECHHHHHHHHHCCCCCEECCCCHHHHHHHHHCC T ss_conf 9999998872-39995799883689998760994385485799999999859 No 151 >COG1561 Uncharacterized stress-induced protein [Function unknown] Probab=35.20 E-value=24 Score=16.23 Aligned_cols=23 Identities=9% Similarity=0.095 Sum_probs=9.8 Q ss_pred CCCHHHHHHHHH-HHHCCCCHHHH Q ss_conf 121046787765-65407888999 Q gi|254780383|r 156 VDTHIFRISNRI-GLAPGKTPNKV 178 (227) Q Consensus 156 VDthv~Rv~~Rl-gl~~~~~~~~~ 178 (227) +..+-.|+..|+ .+...-++..+ T Consensus 181 ~~~~~~~l~~ri~~~~~~~d~~rl 204 (290) T COG1561 181 LEWYRERLVARLNEAQDQLDEDRL 204 (290) T ss_pred HHHHHHHHHHHHHHHHCCCCHHHH T ss_conf 999999999999987634676789 No 152 >PRK02794 DNA polymerase IV; Provisional Probab=34.78 E-value=26 Score=15.99 Aligned_cols=19 Identities=16% Similarity=0.169 Sum_probs=13.9 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 7776432358888999987 Q gi|254780383|r 130 LTRLPGIGRKGANVILSMA 148 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~~ 148 (227) +.+++|||+.|+.-.-.++ T Consensus 211 
v~diwGIG~rt~~kL~~~G 229 (417) T PRK02794 211 VGFIWGVGPATAARLAADG 229 (417) T ss_pred HHHHCCCCHHHHHHHHHCC T ss_conf 4551684689999999849 No 153 >COG2183 Tex Transcriptional accessory protein [Transcription] Probab=34.73 E-value=35 Score=15.15 Aligned_cols=80 Identities=19% Similarity=0.304 Sum_probs=41.3 Q ss_pred CHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHH Q ss_conf 68999999999633320356799899873176200010126899999999730037999999752355444200100001 Q gi|254780383|r 47 NHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQT 126 (227) Q Consensus 47 ~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~ 126 (227) .+||..|+---+.+.-|..|..+... ..-+.+. |+ ..+...+ .|+..+.|++|.+.- +++| .+ .+ T Consensus 473 gqyQHdv~q~~L~~~Ld~vved~VN~---VGVdvNt---As-a~lL~~V--sGL~kt~A~nIv~~r----~~~g-~f-~~ 537 (780) T COG2183 473 GQYQHDVSQKKLAESLDAVVEDCVNA---VGVDVNT---AS-ASLLSYV--SGLNKTLAKNIVAYR----DENG-AF-DN 537 (780) T ss_pred CCCCCCCCHHHHHHHHHHHHHHHHCC---CCCCCCC---CC-HHHHHHH--HHHCHHHHHHHHHHH----HHCC-CC-CC T ss_conf 53214688789999999999987310---2611342---78-9999877--402566789999977----5328-84-40 Q ss_pred HHHHHHHHHHHHHHH Q ss_conf 456777643235888 Q gi|254780383|r 127 LEGLTRLPGIGRKGA 141 (227) Q Consensus 127 ~~~L~~LpGVG~ktA 141 (227) |++|++.|+.|+|+= T Consensus 538 Rk~L~kv~rlg~k~F 552 (780) T COG2183 538 RKQLKKVPRLGPKAF 552 (780) T ss_pred HHHHHCCCCCCHHHH T ss_conf 898724887672236 No 154 >PRK07922 N-acetylglutamate synthase; Validated Probab=34.62 E-value=35 Score=15.14 Aligned_cols=46 Identities=17% Similarity=0.294 Sum_probs=30.8 Q ss_pred HHHHHHHHH--------HHHHHHHHHHHHHHHH--HH-HHHHCCCHHHHHHHHHHHHC Q ss_conf 014567776--------4323588889999875--42-10001210467877656540 Q gi|254780383|r 125 QTLEGLTRL--------PGIGRKGANVILSMAF--GI-PTIGVDTHIFRISNRIGLAP 171 (227) Q Consensus 125 ~~~~~L~~L--------pGVG~ktA~~il~~~~--~~-p~~~VDthv~Rv~~Rlgl~~ 171 (227) .+..|+-+| .|+|.+.-..++..|- |. ..|+.- +..-.+.++|+.. 
T Consensus 69 ~dlAEIrsLAV~p~~rg~G~G~~Lv~~l~~~Ar~lGi~~vFvLT-~~~~fF~k~GF~e 125 (170) T PRK07922 69 EDLAEVRTVAVDPAMRGHGVGHAIVERLLDVARELGLSRVFVLT-FEVEFFARHGFVE 125 (170) T ss_pred CCHHHHEEEEECHHHCCCCHHHHHHHHHHHHHHHCCCCEEEEEE-CCHHHHHHCCCEE T ss_conf 53213044588787818984999999999999985998699997-8368999769987 No 155 >KOG0221 consensus Probab=34.61 E-value=35 Score=15.14 Aligned_cols=121 Identities=16% Similarity=0.133 Sum_probs=70.6 Q ss_pred HHHHHHCCCCCCCCC-----------CCCHHHHHHHHHHHHCC-CHHHHHHHHHHHHH---CCCCCHHHCCCCH--HHHH Q ss_conf 999997679998777-----------78689999999996333-20356799899873---1762000101268--9999 Q gi|254780383|r 30 YLFSLKWPSPKGELY-----------YVNHFTLIVAVLLSAQS-TDVNVNKATKHLFE---IADTPQKMLAIGE--KKLQ 92 (227) Q Consensus 30 ~~L~~~yp~~~~~l~-----------~~~p~~~LVa~iLs~qT-~d~~v~~~~~~L~~---~ypt~e~l~~a~~--~el~ 92 (227) +.|+.|+-+|.+.+. +.+||..-|+.+||+-- +-+++.-+++++.. +..+|+.++..=. -+|. T Consensus 283 k~Lr~Wf~nPttd~~~l~sR~~~i~~fl~~qNa~~~~~Ls~~lgr~k~~~~~~~~~~sg~t~l~~W~~~~stv~~~~~i~ 362 (849) T KOG0221 283 KLLRLWFTNPTTDLGELSSRLDVIQFFLLPQNADMAQMLSRLLGRIKNVPLILKRMKSGHTKLSDWQVLYSTVYSALGIR 362 (849) T ss_pred HHHHHHHCCCCCCHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCEECHHHHHHHHHHHHHHHH T ss_conf 99999822888737878889999999846120699999999985243378999998547752113899999999999999 Q ss_pred HHHHHH----HHHHHHHHHHHHHHHHHHH------HHCCC---------------CHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 999973----0037999999752355444------20010---------------0001456777643235888899998 Q gi|254780383|r 93 NYIRTI----GIYRKKSENIISLSHILIN------EFDNK---------------IPQTLEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 93 ~~ir~~----G~~~~KAk~I~~~a~~i~~------~~~g~---------------vP~~~~~L~~LpGVG~ktA~~il~~ 147 (227) .+++.+ .++|.+++-.....+.+.. +|.|. 
+.+-++-+..||||=...|.--+.+ T Consensus 363 ~~~rslp~s~~~~~~~~~~~~~~l~eia~~~g~vIdF~~S~~~~r~Tv~~giD~elDE~r~~y~~lp~~Lt~vAr~e~~~ 442 (849) T KOG0221 363 DACRSLPQSIQLFRDIAQEFSDDLHEIASLIGKVIDFEGSLAENRFTVLPGIDPELDEKRRRYMGLPSFLTEVARKELEN 442 (849) T ss_pred HHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHEECCCCCCCCCEEEECCCCCHHHHHHHHHHCCCHHHHHHHHHHHHHH T ss_conf 99974743145566778888999999999861102133321136388548997678999999706508999999999986 Q ss_pred HHH Q ss_conf 754 Q gi|254780383|r 148 AFG 150 (227) Q Consensus 148 ~~~ 150 (227) .-+ T Consensus 443 L~~ 445 (849) T KOG0221 443 LDS 445 (849) T ss_pred HCC T ss_conf 179 No 156 >cd03586 Pol_IV_kappa Pol_IV_kappa, a member of the Y-family of DNA polymerases. Pol_Y's can traverse replication-blocking DNA lesions, such as cyclobutane pyrimidine dimers resulting from UV damage, at the cost of an elevated error rate. The Y-family has no 3'-5' exonuclease activity. In addition to possessing a topology akin to a right hand, with "thumb", "fingers" and "palm" motifs, like polymerases from the A-, B-, C- and X-families, the Y-family has a unique "little finger" motif. Expression of Y-family polymerases is often induced by DNA damage. These polymerases are phylogenetically unrelated to classical DNA polymerases. Originally called the DinB family, they belong to the recently described Y-family of DNA polymerases. Pol IV is mostly found in bacteria and archaea. Although the structure of Pol IV is similar to that of Pol eta, it shows markedly different efficiencies and fidelities in bypassing various DNA lesions. 
All Pol IV-like polymerases studied to date are able to b Probab=34.36 E-value=36 Score=15.11 Aligned_cols=18 Identities=22% Similarity=0.407 Sum_probs=13.3 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 776432358888999987 Q gi|254780383|r 131 TRLPGIGRKGANVILSMA 148 (227) Q Consensus 131 ~~LpGVG~ktA~~il~~~ 148 (227) .++||||++++.-.-..+ T Consensus 175 ~~l~GIG~~~~~~L~~~g 192 (337) T cd03586 175 RKIWGIGKVTAEKLNRLG 192 (337) T ss_pred HHCCCCCHHHHHHHHHCC T ss_conf 020784689999999808 No 157 >pfam03755 YicC_N YicC-like family, N-terminal region. Family of bacterial proteins. Although poorly characterized, the members of this protein family have been demonstrated to play a role in stationary phase survival. These proteins are not essential during stationary phase. Probab=33.97 E-value=33 Score=15.35 Aligned_cols=36 Identities=25% Similarity=0.385 Sum_probs=24.1 Q ss_pred HHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHH Q ss_conf 379999997523554442001000014567776432 Q gi|254780383|r 101 YRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGI 136 (227) Q Consensus 101 ~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGV 136 (227) -..+++.+....+.+.++++-.-|.+..+|+++||| T Consensus 80 n~~~~~~~~~~~~~l~~~~~~~~~~~~~~ll~~~~v 115 (159) T pfam03755 80 NEALLKAYLAALKELAEELGLAAPISLDDLLRLPGV 115 (159) T ss_pred CHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHCCCCC T ss_conf 999999999999999997299899999999659870 No 158 >PRK11820 hypothetical protein; Provisional Probab=33.06 E-value=35 Score=15.16 Aligned_cols=12 Identities=8% Similarity=0.396 Sum_probs=5.2 Q ss_pred CCCHHHHHHHHH Q ss_conf 121046787765 Q gi|254780383|r 156 VDTHIFRISNRI 167 (227) Q Consensus 156 VDthv~Rv~~Rl 167 (227) .+.|+..+-.-+ T Consensus 222 l~sHi~~f~~~l 233 (288) T PRK11820 222 LESHLKEFRELL 233 (288) T ss_pred HHHHHHHHHHHH T ss_conf 999999999998 No 159 >PRK07539 NADH dehydrogenase subunit E; Validated Probab=32.93 E-value=37 Score=14.96 Aligned_cols=65 Identities=14% Similarity=0.246 Sum_probs=27.6 Q ss_pred HHHHHHHHHHHHHHHHHHHCCCCHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHH Q ss_conf 
3799999975235544420010000-14567776432358888999987542100012104678776 Q gi|254780383|r 101 YRKKSENIISLSHILINEFDNKIPQ-TLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNR 166 (227) Q Consensus 101 ~~~KAk~I~~~a~~i~~~~~g~vP~-~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~R 166 (227) |..+...++.+...+.++|| -||. ..+++-..-||-+--...|.+|=-....-|+-.|+.|||.= T Consensus 17 yp~~~~allp~L~~iQ~~~G-~i~~ea~~~iA~~l~i~~~~V~~vatFY~~f~~~P~Gk~~i~VC~~ 82 (154) T PRK07539 17 YPRPRSAVIPALKIVQEQRG-WVPDEAIEAVADYLGMPAIDVEEVATFYSMIFRQPVGRHVIQVCTS 82 (154) T ss_pred CCCCHHHHHHHHHHHHHHCC-CCCHHHHHHHHHHHCCCHHHHHHHHHHHHHHCCCCCCCEEEEECCC T ss_conf 79977899999999999879-9399999999999797999999999986776267899648997898 No 160 >pfam06782 UPF0236 Uncharacterized protein family (UPF0236). Probab=32.81 E-value=38 Score=14.95 Aligned_cols=131 Identities=13% Similarity=0.091 Sum_probs=65.8 Q ss_pred HHHHHHHHHHHHHHCCCCCCC---C-------------------CCCCHHHHHHHHHH--HHCCCHHHHHHHHHHHHHCC Q ss_conf 899999999999976799987---7-------------------77868999999999--63332035679989987317 Q gi|254780383|r 22 PKELEEIFYLFSLKWPSPKGE---L-------------------YYVNHFTLIVAVLL--SAQSTDVNVNKATKHLFEIA 77 (227) Q Consensus 22 ~~~~~~I~~~L~~~yp~~~~~---l-------------------~~~~p~~~LVa~iL--s~qT~d~~v~~~~~~L~~~y 77 (227) .+-+.++..-|..+|.--... + ..-|+|.+.=..+. +.+-. 
..+.+ T Consensus 241 ~e~We~v~~yi~~~Ydld~ikkI~ingDGA~WIk~g~~~~~~a~~~LDrFHL~K~I~~~~~~~~~------~~e~i---- 310 (482) T pfam06782 241 GDVWERFEEYLENEYDYEPIRKIIINGDGASWIKEGREWGKKACYQLDRFHLAKELLKCLSHHPR------WREDA---- 310 (482) T ss_pred HHHHHHHHHHHHHHCCHHHCEEEEEECCCHHHHHHHHHHCCCCEEEECHHHHHHHHHHHHCCCCH------HHHHH---- T ss_conf 77999999999986484143499996786386887787557828995488999999998744945------79999---- Q ss_pred CCCHHHCCCCHHHHHHHHHHHH---HHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 6200010126899999999730---0379999997523554442001000014567776432358888999987542100 Q gi|254780383|r 78 DTPQKMLAIGEKKLQNYIRTIG---IYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGANVILSMAFGIPTI 154 (227) Q Consensus 78 pt~e~l~~a~~~el~~~ir~~G---~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~ 154 (227) ..++..-+.+++..++..+- --..+-+.|.++-+.|...|++. .++..-++..|++.+-.- +.+...- T Consensus 311 --~kai~~~d~~~~~~~l~~~~~~~~~e~~~k~I~~~~rYI~~nw~~I--~~yr~~l~~r~~~~~~~~-----g~saegh 381 (482) T pfam06782 311 --RKALAKGDKEGLLVILEEAVGTAKDEKKEKQIAEAIRYIENMPECI--RDYREWLSEQGVETEGVR-----GCGAAEH 381 (482) T ss_pred --HHHHHHCCHHHHHHHHHHHHHHHCCHHHHHHHHHHHHHHHHCHHHH--HHHHHHHHHHCCCCCCCC-----CCCHHHH T ss_conf --9999958998899999999987327578999999999998696989--887666555125777850-----1364553 Q ss_pred HCCCHHHHHHHH-HHHHC Q ss_conf 012104678776-56540 Q gi|254780383|r 155 GVDTHIFRISNR-IGLAP 171 (227) Q Consensus 155 ~VDthv~Rv~~R-lgl~~ 171 (227) +.-....|+.+| +||+. T Consensus 382 vshvlS~RmssRpm~WS~ 399 (482) T pfam06782 382 VSHRFSARLSSRARSWSR 399 (482) T ss_pred HHHHHHHHHCCCCCCHHH T ss_conf 789998886328975119 No 161 >KOG0950 consensus Probab=32.72 E-value=38 Score=14.94 Aligned_cols=25 Identities=12% Similarity=0.126 Sum_probs=15.6 Q ss_pred CCCHHCCCHHHHHHHHHHHHHHCCC Q ss_conf 8600239989999999999997679 Q gi|254780383|r 14 SPLGCLYTPKELEEIFYLFSLKWPS 38 (227) Q Consensus 14 ~p~~~~~~~~~~~~I~~~L~~~yp~ 38 (227) +-......++....+++.|..-|+. 
T Consensus 721 a~f~~~~~~~~a~~l~~~L~~~~~~ 745 (1008) T KOG0950 721 ACFNAGSDPEVANILFADLKKSLPQ 745 (1008) T ss_pred HHHCCCCCHHHHHHHHHHHHHHHHC T ss_conf 5642567856667899999875302 No 162 >PRK10917 ATP-dependent DNA helicase RecG; Provisional Probab=32.52 E-value=20 Score=16.74 Aligned_cols=27 Identities=19% Similarity=0.121 Sum_probs=13.1 Q ss_pred CCCCCHHCCCHHHHHHHHHHHHHHCCC Q ss_conf 778600239989999999999997679 Q gi|254780383|r 12 GNSPLGCLYTPKELEEIFYLFSLKWPS 38 (227) Q Consensus 12 ~~~p~~~~~~~~~~~~I~~~L~~~yp~ 38 (227) +-.|.....+.+.++.+.....+.++. T Consensus 146 PVYplT~GLsqk~irklI~~aL~~~~~ 172 (677) T PRK10917 146 PVYPLTEGLKQKTLRKLIKQALERLPA 172 (677) T ss_pred EECCCCCCCCHHHHHHHHHHHHHHHHC T ss_conf 535077667869999999999987431 No 163 >TIGR01129 secD protein-export membrane protein SecD; InterPro: IPR005791 Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome. The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. Together with SecY and SecG, SecE forms a multimeric channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA. 
The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices. This entry describes the SecD family of transport proteins. Members of this family are highly variable in length immediately after the well-conserved motif LGLGLXGG at the amino-terminal end of this model. Archaeal homologs are not included in the seed. SecD from Mycobacterium tuberculosis has a long Pro-rich insert.; GO: 0015450 P-P-bond-hydrolysis-driven protein transmembrane transporter activity, 0006886 intracellular protein transport, 0009276 1-2nm peptidoglycan-based cell wall. Probab=31.96 E-value=21 Score=16.69 Aligned_cols=18 Identities=33% Similarity=0.490 Sum_probs=8.1 Q ss_pred HHHHHHH--HHH-HHHHHHHH Q ss_conf 7776432--358-88899998 Q gi|254780383|r 130 LTRLPGI--GRK-GANVILSM 147 (227) Q Consensus 130 L~~LpGV--G~k-tA~~il~~ 147 (227) ++.|||+ .+. =|.-|+.- T Consensus 135 ~VelPG~T~~D~~rak~Ilg~ 155 (522) T TIGR01129 135 VVELPGVTLTDTSRAKDILGG 155 (522) T ss_pred EEEECCCCCCCHHHHHHHCCC T ss_conf 998478676548999986065 No 164 >TIGR03629 arch_S13P archaeal ribosomal protein S13P. This model describes exclusively the archaeal ribosomal protein S13P. It excludes the homologous eukaryotic 40S ribosomal protein S18 and bacterial 30S ribosomal protein S13. 
Probab=31.95 E-value=15 Score=17.73 Aligned_cols=29 Identities=31% Similarity=0.371 Sum_probs=21.0 Q ss_pred CCCCHHH---HHHHHHHHHHHHHHHHHHHHHH Q ss_conf 0100001---4567776432358888999987 Q gi|254780383|r 120 DNKIPQT---LEGLTRLPGIGRKGANVILSMA 148 (227) Q Consensus 120 ~g~vP~~---~~~L~~LpGVG~ktA~~il~~~ 148 (227) |-++|.+ .-.|..+.|||+.+|..|+.-+ T Consensus 10 g~di~~~K~v~~aLt~I~GIG~~~A~~Ic~~~ 41 (144) T TIGR03629 10 DTDLDGNKPVEYALTGIKGIGRRFARAIARKL 41 (144) T ss_pred CCCCCCCCEEEEEEEEECCCCHHHHHHHHHHH T ss_conf 75489996898872212372899999999990 No 165 >PTZ00154 40S ribosomal protein S17; Provisional Probab=31.71 E-value=23 Score=16.36 Aligned_cols=26 Identities=8% Similarity=0.319 Sum_probs=13.0 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHH Q ss_conf 79999997523554442001000014 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNKIPQTL 127 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~vP~~~ 127 (227) +.+.+.++.+|+.|+|+|-..+-.|+ T Consensus 3 rVRtktvKraAr~iiEkYy~~lt~DF 28 (130) T PTZ00154 3 RVRTKTVKRAARQIVEKYYAKLTLDF 28 (130) T ss_pred CCCHHHHHHHHHHHHHHHHHHHCCCH T ss_conf 72005999999999998031414547 No 166 >PRK03103 DNA polymerase IV; Reviewed Probab=30.57 E-value=35 Score=15.16 Aligned_cols=18 Identities=28% Similarity=0.532 Sum_probs=12.8 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 777643235888899998 Q gi|254780383|r 130 LTRLPGIGRKGANVILSM 147 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~ 147 (227) +..++|||++++.-.-.. T Consensus 184 v~~iwGIG~~~~~kL~~~ 201 (410) T PRK03103 184 IGKLFGVGRRMEHHLRRM 201 (410) T ss_pred HHHCCCCCHHHHHHHHHC T ss_conf 133068788999999985 No 167 >pfam10440 WIYLD WIYLD domain. This presumed domain has been predicted to contain three alpha helices. The domain was named the WIYLD domain based on the pattern of most conserved residues. The domain appears to be specific to plant SET-domain proteins. 
Probab=29.99 E-value=42 Score=14.64 Aligned_cols=41 Identities=7% Similarity=0.089 Sum_probs=26.4 Q ss_pred CCHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCC Q ss_conf 998999999999999767999877778689999999996333 Q gi|254780383|r 20 YTPKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSAQS 61 (227) Q Consensus 20 ~~~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~qT 61 (227) ..++.++.++..|.+.|++. =++-=.+-|++|+..||..|- T Consensus 23 ~~~~~v~~vlk~LL~~yg~n-W~~IEe~~Yr~l~dai~e~~e 63 (65) T pfam10440 23 IPDAVIRPVLKELLELYGGN-WFLIEEDNYRVLVDAIFEKQE 63 (65) T ss_pred CCHHHHHHHHHHHHHHHCCC-CHHHHCCCHHHHHHHHHHHHH T ss_conf 98799999999999996788-277760548999999999885 No 168 >TIGR03515 GldC gliding motility-associated protein GldC. Members of this protein family are exclusive to the Bacteroidetes phylum (previously Cytophaga-Flavobacteria-Bacteroides). GldC is a protein linked to a type of rapid surface gliding motility found in certain Bacteroidetes, such as Flavobacterium johnsoniae and Cytophaga hutchinsonii. Knockouts of GldC do not abolish the gliding phenotype but do impair it. Gliding motility appears closely linked to chitin utilization in the model species Flavobacterium johnsoniae. Bacteroidetes with members of this protein family appear to have all of the genes associated with gliding motility. Probab=29.54 E-value=6.9 Score=19.90 Aligned_cols=83 Identities=16% Similarity=0.093 Sum_probs=47.5 Q ss_pred CCCHHHHHHHHHHHHH-HHHHHHHHHHHHHHH---HHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCHH-HHHHHH Q ss_conf 1000014567776432-358888999987542---10001210467877656540788899999996218842-267899 Q gi|254780383|r 121 NKIPQTLEGLTRLPGI-GRKGANVILSMAFGI---PTIGVDTHIFRISNRIGLAPGKTPNKVEQSLLRIIPPK-HQYNAH 195 (227) Q Consensus 121 g~vP~~~~~L~~LpGV-G~ktA~~il~~~~~~---p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~~~~p~~-~~~~~~ 195 (227) .+||..+.+--.=-|| +...|.++++..++. .+.-+|-. 
++ -.|-+ -..-|| T Consensus 14 n~iPE~I~WsA~Dgg~~~~~~~kA~~LS~WD~~~ketlriDLW----------TK-------------dMpVdeMk~F~~ 70 (108) T TIGR03515 14 NNVPEQILWEATDGPSQGQNPTKAISLSIWDHDGKNTMRIDLW----------TK-------------DMPVDEMKRFFI 70 (108) T ss_pred CCCCCCCEEECCCCCCCCHHHHHHHHHHHHCCCCCCEEEEEEC----------CC-------------CCCHHHHHHHHH T ss_conf 8998654587887898634677789888736777743565630----------37-------------685899999999 Q ss_pred HHHHHHHHHHCCCCCCCCCCCCCHHHCHHHC Q ss_conf 9999996651648998947284033176851 Q gi|254780383|r 196 YWLVLHGRYVCKARKPQCQSCIISNLCKRIK 226 (227) Q Consensus 196 ~~li~~G~~iC~~~~P~C~~C~l~~~C~~~k 226 (227) |.|+..+..+=+|..-.=-.=-++++|.||- T Consensus 71 Qtl~~madt~~rAT~d~~m~~~mrdfc~~Fa 101 (108) T TIGR03515 71 ETLGGMADTFLNATGDEKMAEDIKDLCDRLA 101 (108) T ss_pred HHHHHHHHHHHHHHCCHHHHHHHHHHHHHHH T ss_conf 9999999999987083999999999999999 No 169 >PRK00182 tatB sec-independent translocase; Provisional Probab=29.50 E-value=43 Score=14.59 Aligned_cols=63 Identities=17% Similarity=0.292 Sum_probs=42.4 Q ss_pred HHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHHHCCCCH---HHHHHHHHHHHHHHHHHHHH Q ss_conf 01012689999999973003799999975235-54442001000---01456777643235888899 Q gi|254780383|r 82 KMLAIGEKKLQNYIRTIGIYRKKSENIISLSH-ILINEFDNKIP---QTLEGLTRLPGIGRKGANVI 144 (227) Q Consensus 82 ~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~-~i~~~~~g~vP---~~~~~L~~LpGVG~ktA~~i 144 (227) +|.-.+++.|=.+|+.+|-+-++++....-++ .|.++.|-++. ....+|.+|.+.|+++|-.= T Consensus 17 aLvVlGPeRLP~air~v~~~ir~~R~~a~~ak~eL~~ELGpEf~e~rkpl~e~~~lr~m~Pk~~itk 83 (165) T PRK00182 17 GLIVIGPERLPRLIKDVRAALLAARTAINNAKQQLDGDFGEEFDEFRKPLTQIAQYRRMGPKTAITK 83 (165) T ss_pred HHHHCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHH T ss_conf 9962580353799999999999999999999999999881889999999999998864586889999 No 170 >pfam11372 DUF3173 Protein of unknown function (DUF3173). This family of proteins with unknown function appears to be restricted to Firmicutes. 
Probab=29.45 E-value=43 Score=14.58 Aligned_cols=42 Identities=21% Similarity=0.462 Sum_probs=26.5 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-----HC----CCCHHH-HHHHHH Q ss_conf 68999999997300379999997523554442-----00----100001-456777 Q gi|254780383|r 87 GEKKLQNYIRTIGIYRKKSENIISLSHILINE-----FD----NKIPQT-LEGLTR 132 (227) Q Consensus 87 ~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~-----~~----g~vP~~-~~~L~~ 132 (227) +.+||.+ +||...-|+.|+.-|+.+.-+ |+ |.||.+ .++|+. T Consensus 5 t~~dLi~----lGf~~~~A~~IIrqAK~~lV~~G~~~Y~nkRlg~VP~~~VEeilG 56 (59) T pfam11372 5 TKKDLIK----LGFKPHTARDIIRQAKELLVERGYSFYNNKRLGRVPVSIVEEILG 56 (59) T ss_pred CHHHHHH----HCCCHHHHHHHHHHHHHHHHHCCCCHHCCCCCCCCCHHHHHHHHC T ss_conf 7999999----579887999999999999998188700177137265999999868 No 171 >COG5071 RPN5 26S proteasome regulatory complex component [Posttranslational modification, protein turnover, chaperones] Probab=29.26 E-value=43 Score=14.56 Aligned_cols=41 Identities=22% Similarity=0.235 Sum_probs=30.9 Q ss_pred HHHHHHHHHHHHHHHHHHHH--HHHHHHHH---------HCCCHHHHHHHHH Q ss_conf 45677764323588889999--87542100---------0121046787765 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILS--MAFGIPTI---------GVDTHIFRISNRI 167 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~--~~~~~p~~---------~VDthv~Rv~~Rl 167 (227) .++|++-|-|.+-++++++. |+|+-.+- .||-|-.||+.|. T Consensus 299 vNelmrwp~V~~~y~~~l~~~~faF~~e~~~~~w~DL~krviEHN~RvI~~y 350 (439) T COG5071 299 VNELMRWPKVAEIYGSALRSNVFAFNDEKGEKRWSDLRKRVIEHNIRVIANY 350 (439) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHH T ss_conf 9988731567766267887503220330015349999999987317699988 No 172 >TIGR01394 TypA_BipA GTP-binding protein TypA; InterPro: IPR006298 This bacterial (and Arabidopsis) protein, termed TypA or BipA, is a GTP-binding protein. It is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways, but the precise function is unknown.; GO: 0005525 GTP binding, 0005622 intracellular. 
Probab=29.20 E-value=38 Score=14.88 Aligned_cols=40 Identities=20% Similarity=0.300 Sum_probs=28.8 Q ss_pred HCCCHHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHH Q ss_conf 239989999999999997679998777786899999999963 Q gi|254780383|r 18 CLYTPKELEEIFYLFSLKWPSPKGELYYVNHFTLIVAVLLSA 59 (227) Q Consensus 18 ~~~~~~~~~~I~~~L~~~yp~~~~~l~~~~p~~~LVa~iLs~ 59 (227) .....++..-++..+.++-|.|....+ .|||+||+.+=.. T Consensus 183 ~~~~~~~m~PLFd~I~~hvPaP~~~~d--~PlQmlvt~ldy~ 222 (609) T TIGR01394 183 LDDDSEDMAPLFDAILRHVPAPKGDLD--EPLQMLVTNLDYD 222 (609) T ss_pred CCCCHHHHHHHHHHHHHCCCCCCCCCC--CCHHHEEEECCCC T ss_conf 887220178999898640688898887--6242100011014 No 173 >COG1468 CRISPR-associated protein Cas4 (RecB family exonuclease) [Defense mechanisms] Probab=29.20 E-value=23 Score=16.39 Aligned_cols=19 Identities=21% Similarity=0.642 Sum_probs=15.3 Q ss_pred CCCCCCCCCCCCCHHHCHH Q ss_conf 6489989472840331768 Q gi|254780383|r 206 CKARKPQCQSCIISNLCKR 224 (227) Q Consensus 206 C~~~~P~C~~C~l~~~C~~ 224 (227) =...+|+|..|+++.+|.. T Consensus 170 ~~~~~~~C~~C~y~~iC~~ 188 (190) T COG1468 170 PPKKKKKCKKCAYREICFP 188 (190) T ss_pred CCCCCCCCCCCCCCEECCC T ss_conf 9999886999986334267 No 174 >PRK10840 transcriptional regulator RcsB; Provisional Probab=29.16 E-value=43 Score=14.55 Aligned_cols=26 Identities=4% Similarity=0.069 Sum_probs=19.1 Q ss_pred HHHHHHHHHHCCCHHHHHHHHHHHHC Q ss_conf 98754210001210467877656540 Q gi|254780383|r 146 SMAFGIPTIGVDTHIFRISNRIGLAP 171 (227) Q Consensus 146 ~~~~~~p~~~VDthv~Rv~~Rlgl~~ 171 (227) ..-.+...--|.+|+.+++..||..+ T Consensus 172 A~~L~iS~~TV~~h~~~i~~KLgv~n 197 (216) T PRK10840 172 AKKLNRSIKTISSQKKSAMMKLGVEN 197 (216) T ss_pred HHHHCCCHHHHHHHHHHHHHHCCCCC T ss_conf 98969899999999999999829998 No 175 >PRK05988 formate dehydrogenase subunit gamma; Validated Probab=29.09 E-value=43 Score=14.54 Aligned_cols=65 Identities=12% Similarity=0.303 Sum_probs=27.5 Q ss_pred HHHHHHHHHHHHHHHHHHHCCCCHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHH Q ss_conf 
37999999752355444200100001-4567776432358888999987542100012104678776 Q gi|254780383|r 101 YRKKSENIISLSHILINEFDNKIPQT-LEGLTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNR 166 (227) Q Consensus 101 ~~~KAk~I~~~a~~i~~~~~g~vP~~-~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~R 166 (227) |..+...|+.+...+.++|| =||.+ ..++-..=||-+--...|.+|=-....-|+-.|+.|||.= T Consensus 18 y~~~~~alip~L~~iQ~~~G-yip~~a~~~vA~~l~i~~~~V~~vaTFY~~f~~~P~Gk~~i~VC~~ 83 (156) T PRK05988 18 LKHLEGALLPILHAIQEEFG-YIPEDAVPMIAEALNLSRAEVHGVITFYHDFRTEPPGRHVLKLCRA 83 (156) T ss_pred CCCCHHHHHHHHHHHHHHCC-CCCHHHHHHHHHHHCCCHHHHHHHHHHHHHHCCCCCCCEEEEECCC T ss_conf 79986599999999999829-9999999999999795999999997489886368998669997588 No 176 >TIGR00372 cas4 CRISPR-associated protein Cas4; InterPro: IPR013343 This entry consists of conserved proteins found in many prokaryotic genomes whose genes are associated with CRISPRs (Clustered, Regularly Interspaced Short Palindromic Repeats). The function of these proteins has not been experimentally determined, but computational analysis suggests that they may be nucleases like RecB (IPR004586 from INTERPRO), functioning as part of a hypothetical DNA repair system. Probab=29.03 E-value=20 Score=16.74 Aligned_cols=30 Identities=13% Similarity=0.265 Sum_probs=19.7 Q ss_pred HHHHHHHHHHHCCC-CCCCCCCCCCHHHCHH Q ss_conf 99999996651648-9989472840331768 Q gi|254780383|r 195 HYWLVLHGRYVCKA-RKPQCQSCIISNLCKR 224 (227) Q Consensus 195 ~~~li~~G~~iC~~-~~P~C~~C~l~~~C~~ 224 (227) -..++..|..-=.+ ..++|..|+.+..|.. 
T Consensus 176 i~~~~~~~~~P~~~~~~~kC~~C~y~~~C~~ 206 (206) T TIGR00372 176 IEKLLEGEKLPPPPKKSRKCKFCPYREICLP 206 (206) T ss_pred HHHHHHCCCCCCCCCCCCCCCCCCCCCCCCC T ss_conf 9999848815576888775777555212589 No 177 >PRK06027 purU formyltetrahydrofolate deformylase; Reviewed Probab=28.58 E-value=4.8 Score=20.97 Aligned_cols=54 Identities=17% Similarity=0.147 Sum_probs=39.9 Q ss_pred CHHHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHCCCCHHHHHH-HHHHHHHHHHH Q ss_conf 68999999997300-37999999752355444200100001456-77764323588 Q gi|254780383|r 87 GEKKLQNYIRTIGI-YRKKSENIISLSHILINEFDNKIPQTLEG-LTRLPGIGRKG 140 (227) Q Consensus 87 ~~~el~~~ir~~G~-~~~KAk~I~~~a~~i~~~~~g~vP~~~~~-L~~LpGVG~kt 140 (227) .++++.++++.-+. +=+=|++..-++..++++|.|++-+-.-. |=++||.++|- T Consensus 152 ~E~~i~~~~~~~~~d~ivla~ym~il~~~~~~~~~~~iiNiH~s~lp~f~G~~~~~ 207 (285) T PRK06027 152 AEAQLLELIDEYQPDLVVLARYMQILSPEFVARFPGRIINIHHSFLPAFKGAKPYH 207 (285) T ss_pred HHHHHHHHHHHCCCCEEEEHHHHHHCCHHHHHHHCCCEEEECHHHCCCCCCCCHHH T ss_conf 99999999873497199763368766888998721764784511225799987799 No 178 >cd04949 GT1_gtfA_like This family is most closely related to the GT1 family of glycosyltransferases and is named after gtfA in Streptococcus gordonii, where it plays a role in the O-linked glycosylation of GspB, a cell surface glycoprotein involved in platelet binding. In general glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. This group of glycosyltransferases is most closely related to the previously defined glycosyltransferase family 1 (GT1). The members of this family may transfer UDP, ADP, GDP, or CMP linked sugars. The diverse enzymatic activities among members of this family reflect a wide range of biological functions. 
The protein structure available for this family has the GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltra Probab=27.69 E-value=42 Score=14.59 Aligned_cols=173 Identities=14% Similarity=0.145 Sum_probs=92.7 Q ss_pred CCHHHHHHHHHHHHHHC-----CCC-----C--CCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCC Q ss_conf 99899999999999976-----799-----9--87777868999999999633320356799899873176200010126 Q gi|254780383|r 20 YTPKELEEIFYLFSLKW-----PSP-----K--GELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIG 87 (227) Q Consensus 20 ~~~~~~~~I~~~L~~~y-----p~~-----~--~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~ 87 (227) .|++..+.|..++.... |.. . .+..-++|++++-..-|+.+-.-....+|+..+.+++|++.-..--+ T Consensus 164 ~T~~Q~~di~~~f~~~~~i~~IP~~~~~~~~~~~~~~~r~~~~ii~vgRL~~eK~~d~LI~A~~~v~~~~P~~~L~I~G~ 243 (372) T cd04949 164 ATEQQKQDLQKQFGNYNPIYTIPVGSIDPLKLPAQFKQRKPHKIITVARLAPEKQLDQLIKAFAKVVKQVPDATLDIYGY 243 (372) T ss_pred CCHHHHHHHHHHHCCCCCEEEECCCCCCHHCCCCCCCCCCCCEEEEEECCCCCCCHHHHHHHHHHHHHHCCCCEEEEEEC T ss_conf 87999999999717888589967824203116666435898979999677740285999999999998789929999973 Q ss_pred ---HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCC----CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCH- Q ss_conf ---8999999997300379999997523554442001----00001456777643235888899998754210001210- Q gi|254780383|r 88 ---EKKLQNYIRTIGIYRKKSENIISLSHILINEFDN----KIPQTLEGLTRLPGIGRKGANVILSMAFGIPTIGVDTH- 159 (227) Q Consensus 88 ---~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g----~vP~~~~~L~~LpGVG~ktA~~il~~~~~~p~~~VDth- 159 (227) .++|+++++..|+.+ .-.+.+....+.+.|.. .+|... .|-|.-... ..++|.|++.-|.. 
T Consensus 244 G~~~~~L~~~i~~l~l~~--~V~f~G~~~~~~~~y~~a~~~v~~S~~------EGfgl~llE---Ama~GlPvIa~d~~y 312 (372) T cd04949 244 GDEEEKLKELIEELGLED--YVFLKGYTRDLDEVYQKAQLSLLTSQS------EGFGLSLME---ALSHGLPVISYDVNY 312 (372) T ss_pred CHHHHHHHHHHHHCCCCC--EEEECCCCCCHHHHHHHCCEEEECCCC------CCCCCHHHH---HHHCCCCEEEECCCC T ss_conf 477899999999829998--799889988989999757999980200------367658999---998599999805999 Q ss_pred HHH-HHHH--HH-HHCCCCHHHHHHHHHHCCCH-HHHHHHHHHHHHHHH Q ss_conf 467-8776--56-54078889999999621884-226789999999966 Q gi|254780383|r 160 IFR-ISNR--IG-LAPGKTPNKVEQSLLRIIPP-KHQYNAHYWLVLHGR 203 (227) Q Consensus 160 v~R-v~~R--lg-l~~~~~~~~~~~~l~~~~p~-~~~~~~~~~li~~G~ 203 (227) --+ ++.- -| +++..++++....+..++.. +.+..++..-....+ T Consensus 313 G~~eiI~~g~nG~Lv~~~d~~~la~~i~~ll~~~~~~~~~s~~a~~~a~ 361 (372) T cd04949 313 GPSEIIEDGENGYLVPKGDIEALAEAIIELLNDPKLLQKFSEAAYENAE 361 (372) T ss_pred CCHHHHCCCCCEEEECCCCHHHHHHHHHHHHCCHHHHHHHHHHHHHHHH T ss_conf 9688845898479968999999999999998699999999999999999 No 179 >PRK03858 DNA polymerase IV; Validated Probab=27.66 E-value=37 Score=14.98 Aligned_cols=18 Identities=28% Similarity=0.353 Sum_probs=12.7 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 777643235888899998 Q gi|254780383|r 130 LTRLPGIGRKGANVILSM 147 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~ 147 (227) +..++|||++++.-.-.+ T Consensus 175 v~~iwGIG~~~~~~L~~~ 192 (398) T PRK03858 175 VRRLWGVGAVTAAKLRAH 192 (398) T ss_pred CCCCCCCCHHHHHHHHHC T ss_conf 012158687999999984 No 180 >TIGR02644 Y_phosphoryl pyrimidine-nucleoside phosphorylase; InterPro: IPR000053 Two highly similar activities are represented in this group: thymidine phosphorylase (TP, gene deoA, 2.4.2.4 from EC) and pyrimidine-nucleoside phosphorylase (PyNP, gene pdp, 2.4.2.2 from EC). Both are dimeric enzymes that function in the salvage pathway to catalyse the reversible phosphorolysis of pyrimidine nucleosides to the free base and sugar moieties. 
In the case of thymidine phosphorylase, thymidine (and to a lesser extent, 2'-deoxyuridine) is lysed to produce thymine (or uracil) and 2'-deoxyribose-1-phosphate. Pyrimidine-nucleoside phosphorylase performs the analogous reaction on thymidine (to produce the same products) and uridine (to produce uracil and ribose-1-phosphate). PyNP is typically the only pyrimidine nucleoside phosphorylase encoded by Gram positive bacteria, while eukaryotes and proteobacteria encode two: TP, and the unrelated uridine phosphorylase. In humans, TP was originally characterised as platelet-derived endothelial cell growth factor and gliostatin. Structurally, the enzymes are homodimers, each composed of a rigid all alpha-helix lobe and a mixed alpha-helix/beta-sheet lobe, which are connected by a flexible hinge. Prior to substrate binding, the lobes are separated by a large cleft. A functional active site and subsequent catalysis occurs upon closing of the cleft. The active site, composed of a phosphate binding site and a (deoxy)ribonucleotide binding site within the cleft region, is highly conserved between the two enzymes of this group. Active site residues (Escherichia coli DeoA numbering) include the phosphate binding Lys84 and Ser86 (close to a glycine-rich loop), Ser113, and Thr123, and the pyrimidine nucleoside-binding Arg171, Ser186, and Lys190. Sequence comparison between the active site residues for both enzymes reveals only one difference, which has been proposed to partially mediate substrate specificity. In TP, position 111 is a methionine, while the analogous position in PyNP is lysine. It should be noted that the uncharacterised archaeal members of this family differ in a number of respects from either of the characterised activities. The residue at position 108 is lysine, indicating the activity might be PyNP-like (though the determinants of substrate specificity have not been fully elucidated). 
Position 171 is glutamate (negative charge side chain) rather than arginine (positive charge side chain). In addition, a large loop that may "lock in" the substrates within the active site is much smaller than in the characterised members. It is not clear what effect these and other differences have on activity and specificity.; GO: 0006206 pyrimidine base metabolic process. Probab=27.45 E-value=18 Score=17.03 Aligned_cols=34 Identities=24% Similarity=0.454 Sum_probs=22.8 Q ss_pred HHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHH-HHHHCCC Q ss_conf 32358888999987542100012104678776-5654078 Q gi|254780383|r 135 GIGRKGANVILSMAFGIPTIGVDTHIFRISNR-IGLAPGK 173 (227) Q Consensus 135 GVG~ktA~~il~~~~~~p~~~VDthv~Rv~~R-lgl~~~~ 173 (227) |||+||.= .++==+-..+..|-.++.| ||.+.++ T Consensus 85 GVGDK~SL-----~L~P~vAa~G~~Vak~SGRGLGhTGGT 119 (425) T TIGR02644 85 GVGDKVSL-----VLGPIVAALGVKVAKMSGRGLGHTGGT 119 (425) T ss_pred CCCHHHHH-----HHHHHHHHCCCCCCCCCCCCCCCCCCC T ss_conf 83402467-----779999974897455147756765662 No 181 >COG0099 RpsM Ribosomal protein S13 [Translation, ribosomal structure and biogenesis] Probab=26.71 E-value=34 Score=15.20 Aligned_cols=26 Identities=38% Similarity=0.535 Sum_probs=19.3 Q ss_pred CCHHH---HHHHHHHHHHHHHHHHHHHHH Q ss_conf 00001---456777643235888899998 Q gi|254780383|r 122 KIPQT---LEGLTRLPGIGRKGANVILSM 147 (227) Q Consensus 122 ~vP~~---~~~L~~LpGVG~ktA~~il~~ 147 (227) ++|.+ .=.|..+.|||..+|..|+.- T Consensus 8 dip~~K~v~iALt~IyGIG~~~a~~I~~~ 36 (121) T COG0099 8 DIPGNKRVVIALTYIYGIGRRRAKEICKK 36 (121) T ss_pred CCCCCCEEEEHHHHHCCCCHHHHHHHHHH T ss_conf 79998257650463035369999999999 No 182 >COG1550 Uncharacterized protein conserved in bacteria [Function unknown] Probab=26.03 E-value=32 Score=15.42 Aligned_cols=58 Identities=14% Similarity=0.142 Sum_probs=38.5 Q ss_pred HHHHHHHHHHHCCCCCCCCCCCCHHHHH---HHHHHHHCCCHHHHHHHHHHHHHCCCCCHH Q ss_conf 9999999999767999877778689999---999996333203567998998731762000 Q gi|254780383|r 25 LEEIFYLFSLKWPSPKGELYYVNHFTLI---VAVLLSAQSTDVNVNKATKHLFEIADTPQK 82 
(227) Q Consensus 25 ~~~I~~~L~~~yp~~~~~l~~~~p~~~L---Va~iLs~qT~d~~v~~~~~~L~~~ypt~e~ 82 (227) ++.|+.+|...|+-.-.+..|+|-|+-- ||+|-+-+...+.+..-...|...+|.+|- T Consensus 25 lr~iv~rLk~KFnvSvaE~~~qD~~qr~~IgiA~Vs~Dr~~~e~~l~~~~~~id~~p~~E~ 85 (95) T COG1550 25 LRPIVTRLKNKFNVSVAETGYQDLWQRAEIGIATVSSDRAVAERVLDRALDFIDAEPEFER 85 (95) T ss_pred HHHHHHHHHHHCCEEEEECCCHHHHHHHEEEEEEEECCHHHHHHHHHHHHHHHHCCCCHHH T ss_conf 9999999887565235412742466663255899846488899999999999870974402 No 183 >TIGR02814 pfaD_fam PfaD family protein; InterPro: IPR014179 The protein PfaD is part of a four-gene locus, similar to polyketide biosynthesis systems, which is responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. Several other members of the entry are found in loci presumed to act in polyketide biosyntheses per se. Probab=25.91 E-value=42 Score=14.62 Aligned_cols=49 Identities=10% Similarity=0.194 Sum_probs=24.3 Q ss_pred CHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHH------HHHHHHHHHHHHHHHHH Q ss_conf 203567998998731762000101268999999------99730037999999752 Q gi|254780383|r 62 TDVNVNKATKHLFEIADTPQKMLAIGEKKLQNY------IRTIGIYRKKSENIISL 111 (227) Q Consensus 62 ~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~------ir~~G~~~~KAk~I~~~ 111 (227) ...-+....++|.... 
+.++.+-|+-.|+-|+ +|.-.||-.||.+|+++ T Consensus 262 ~EAGtSd~Vk~lLa~~-~v~DtayAPAgDMFE~GvklQVLKrGtlFP~RANkLY~L 316 (449) T TIGR02814 262 VEAGTSDEVKKLLAKA-DVQDTAYAPAGDMFELGVKLQVLKRGTLFPARANKLYEL 316 (449) T ss_pred CCCCCCHHHHHHHHCC-CCCCCCCCCCHHHHHCCCEEEEEECCCCCHHHCCHHHHH T ss_conf 3458886799998158-976501365200544277688840252321011115798 No 184 >COG3743 Uncharacterized conserved protein [Function unknown] Probab=25.91 E-value=28 Score=15.81 Aligned_cols=18 Identities=44% Similarity=0.673 Sum_probs=15.3 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 567776432358888999 Q gi|254780383|r 128 EGLTRLPGIGRKGANVIL 145 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il 145 (227) ++|..|.|||++.+.+.- T Consensus 67 DDLt~I~GIGPk~e~~Ln 84 (133) T COG3743 67 DDLTRISGIGPKLEKVLN 84 (133) T ss_pred CCCHHHCCCCHHHHHHHH T ss_conf 542110043788998998 No 185 >pfam04891 NifQ NifQ. NifQ is involved in early stages of the biosynthesis of the iron-molybdenum cofactor (FeMo-co), which is an integral part of the active site of dinitrogenase. The conserved C-terminal cysteine residues may be involved in metal binding. Probab=25.36 E-value=50 Score=14.10 Aligned_cols=47 Identities=32% Similarity=0.559 Sum_probs=30.0 Q ss_pred CHHHHHHHHHHCCCH--------HHHHHHHHH-H-HHHHHHHCCCCCCCCCCCCCHHHC Q ss_conf 889999999621884--------226789999-9-999665164899894728403317 Q gi|254780383|r 174 TPNKVEQSLLRIIPP--------KHQYNAHYW-L-VLHGRYVCKARKPQCQSCIISNLC 222 (227) Q Consensus 174 ~~~~~~~~l~~~~p~--------~~~~~~~~~-l-i~~G~~iC~~~~P~C~~C~l~~~C 222 (227) +..+...-+...||. =+|+.|=+- + -.-|-.+|++ |.|+.|.=-..| T Consensus 109 ~R~eLs~Lm~r~FP~LAa~N~~nMrWKKFfYrqlCe~eG~~~C~a--PsC~~C~d~~~C 165 (167) T pfam04891 109 SRAELSALLARHFPPLAARNTRNMRWKKFFYRQLCEREGIVLCKA--PSCGECDDFALC 165 (167) T ss_pred CHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHCCCCCCCC--CCCCCCCCHHHC T ss_conf 789999999998599997353688199999999999769775799--999985657523 No 186 >pfam00833 Ribosomal_S17e Ribosomal S17. 
Probab=25.33 E-value=33 Score=15.29 Aligned_cols=27 Identities=11% Similarity=0.416 Sum_probs=16.5 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHHH Q ss_conf 799999975235544420010000145 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNKIPQTLE 128 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~vP~~~~ 128 (227) +.+.+.++.+|+.|+++|...+-.|++ T Consensus 3 ~Vrtk~vKr~a~~iiEkY~~~lt~DF~ 29 (122) T pfam00833 3 RVRTKTVKRAARVIIEKYYSKLTLDFQ 29 (122) T ss_pred CCCHHHHHHHHHHHHHHCHHHHCCCHH T ss_conf 723149999999999976335345788 No 187 >PRK01810 DNA polymerase IV; Validated Probab=25.24 E-value=43 Score=14.58 Aligned_cols=40 Identities=13% Similarity=0.194 Sum_probs=21.1 Q ss_pred HHHHHHHHHHHHHHCCC--C-HHHHHH------HHHHHHHHHHHHHHHHH Q ss_conf 99975235544420010--0-001456------77764323588889999 Q gi|254780383|r 106 ENIISLSHILINEFDNK--I-PQTLEG------LTRLPGIGRKGANVILS 146 (227) Q Consensus 106 k~I~~~a~~i~~~~~g~--v-P~~~~~------L~~LpGVG~ktA~~il~ 146 (227) +.|=++|..+. +-+|. + |.+..+ +..++|||++++...-. T Consensus 149 K~lAKiAs~~~-Kp~G~~~i~~~~~~~~l~~lpv~~iwGIG~~~~~~L~~ 197 (410) T PRK01810 149 KFLAKMASDMK-KPLGITVLRKRDVPEMLWPLPVEEMHGIGEKTAEKLKD 197 (410) T ss_pred HHHHHHHHHHC-CCCCCEECCHHHHHHHHHCCCCCCCCCCCHHHHHHHHH T ss_conf 99999988614-87673005789999998648851205867789999998 No 188 >cd00424 Pol_Y Y-family of DNA polymerases. Pol_Y's can transverse replication-blocking DNA lesions, such as cyclobutane pyrimidine dimers resulting from UV damage, at the cost of an elevated error rate. The Y-family has no 3'-5' exonuclease activity. In addition to possessing a topology akin to a right hand, with "thumb", "fingers" and "palm" motifs, like polymerases from the A-, B-, C- and X-families, the Y-family has a unique "little finger" motif. Expression of Y-family polymerases is often induced by DNA damage. These polymerases are phylogenetically unrelated to classical DNA polymerases. 
Probab=24.84 E-value=51 Score=14.04 Aligned_cols=19 Identities=32% Similarity=0.623 Sum_probs=14.2 Q ss_pred HHHHHHHHHHHHHHHHHHH Q ss_conf 7776432358888999987 Q gi|254780383|r 130 LTRLPGIGRKGANVILSMA 148 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~~ 148 (227) +.+|||||++|+.-....+ T Consensus 175 v~~l~GIG~~~~~~L~~~g 193 (341) T cd00424 175 LSDIPGIGSVTASRLEALG 193 (341) T ss_pred HHHHCCCCHHHHHHHHHCC T ss_conf 7887487889999999849 No 189 >PRK12373 NADH dehydrogenase subunit E; Provisional Probab=24.30 E-value=41 Score=14.67 Aligned_cols=20 Identities=20% Similarity=0.355 Sum_probs=15.7 Q ss_pred HHHHHHHHHHHHHHHHHHHH Q ss_conf 45677764323588889999 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVILS 146 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~il~ 146 (227) -++|..|.|||+|.-...-. T Consensus 325 ~DDLk~ikGvGPkle~~ln~ 344 (403) T PRK12373 325 ADDLKLISGVGPKIEGTLNE 344 (403) T ss_pred CCHHHHHCCCCHHHHHHHHH T ss_conf 63556641758789999886 No 190 >PRK13011 formyltetrahydrofolate deformylase; Reviewed Probab=24.15 E-value=12 Score=18.22 Aligned_cols=54 Identities=15% Similarity=0.086 Sum_probs=39.4 Q ss_pred CHHHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHCCCCHHHHH-HHHHHHHHHHHH Q ss_conf 68999999997300-3799999975235544420010000145-677764323588 Q gi|254780383|r 87 GEKKLQNYIRTIGI-YRKKSENIISLSHILINEFDNKIPQTLE-GLTRLPGIGRKG 140 (227) Q Consensus 87 ~~~el~~~ir~~G~-~~~KAk~I~~~a~~i~~~~~g~vP~~~~-~L~~LpGVG~kt 140 (227) .+.++.++++.-+- +=.=|++..-++..++++|.|++-+-.- -|=+++|-++|- T Consensus 154 ~E~~~~~~~~~~~~d~ivla~ym~il~~~~~~~~~~~iinih~s~lP~f~G~~~~~ 209 (287) T PRK13011 154 QEAQVLDVIEESGAELVVLARYMQVLSPELCRKLAGRAINIHHSFLPGFKGAKPYH 209 (287) T ss_pred HHHHHHHHHHHCCCCEEEHHHHHHHCCHHHHHHHCCCCCCCCHHHCCCCCCCCHHH T ss_conf 99999999873397399434488874989998403801021600115789986799 No 191 >KOG3133 consensus Probab=23.82 E-value=54 Score=13.91 Aligned_cols=42 Identities=7% Similarity=-0.089 Sum_probs=30.7 Q ss_pred CCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCC Q ss_conf 868999999999633320356799899873176200010126 Q gi|254780383|r 46 
VNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIG 87 (227) Q Consensus 46 ~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~ 87 (227) ..++.-.+..||.|-|+-++-..-++.|+.+||-|-.=..++ T Consensus 141 ~g~le~~m~~iMqqllSKEILyeplKEl~~~YPkwLeen~e~ 182 (267) T KOG3133 141 SGDLEPIMESIMQQLLSKEILYEPLKELGANYPKWLEENGES 182 (267) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCC T ss_conf 753799999999999888774016999998735999855355 No 192 >PRK09430 djlA Dna-J like membrane chaperone protein; Provisional Probab=23.65 E-value=54 Score=13.89 Aligned_cols=30 Identities=20% Similarity=0.168 Sum_probs=14.4 Q ss_pred HHCCCCHH-HHHHHHHH---HHHHHHHHHHHHHH Q ss_conf 20010000-14567776---43235888899998 Q gi|254780383|r 118 EFDNKIPQ-TLEGLTRL---PGIGRKGANVILSM 147 (227) Q Consensus 118 ~~~g~vP~-~~~~L~~L---pGVG~ktA~~il~~ 147 (227) .-||.+.. ..+-|.++ =|+.+..-+.++.. T Consensus 140 ~ADG~l~~~E~~~L~~Ia~~lg~s~~~f~~i~a~ 173 (269) T PRK09430 140 FADGSLHPNERQVLYVIAEELGFSRFQFDQLLRM 173 (269) T ss_pred HHCCCCCHHHHHHHHHHHHHHCCCHHHHHHHHHH T ss_conf 8558999999999999999939899999999999 No 193 >PRK02515 psbU photosystem II complex extrinsic protein precursor U; Provisional Probab=23.45 E-value=54 Score=13.87 Aligned_cols=33 Identities=24% Similarity=0.324 Sum_probs=22.9 Q ss_pred HHHHHHCCCCHH---HHHHHHHHHHHHHHHHHHHHH Q ss_conf 544420010000---145677764323588889999 Q gi|254780383|r 114 ILINEFDNKIPQ---TLEGLTRLPGIGRKGANVILS 146 (227) Q Consensus 114 ~i~~~~~g~vP~---~~~~L~~LpGVG~ktA~~il~ 146 (227) .+..+||++|.- +.....++||.+|..|..|.. 
T Consensus 56 kl~t~~g~KIDlNNa~vr~f~q~pGmYPtlA~kIv~ 91 (144) T PRK02515 56 KLATERGEKIDLNNSSVRAFRQFPGMYPTLAGKIVK 91 (144) T ss_pred HHHHHHCCCCCCCCHHHHHHHHCCCCCHHHHHHHHH T ss_conf 888871564025627499998688846799999984 No 194 >PRK12311 rpsB 30S ribosomal protein S2; Provisional Probab=22.97 E-value=41 Score=14.67 Aligned_cols=18 Identities=17% Similarity=0.326 Sum_probs=12.7 Q ss_pred HHHHHHHHHHHHHHHHHH Q ss_conf 456777643235888899 Q gi|254780383|r 127 LEGLTRLPGIGRKGANVI 144 (227) Q Consensus 127 ~~~L~~LpGVG~ktA~~i 144 (227) -++|.+|+|||++.+... T Consensus 268 ~ddl~ki~gvgp~~~~~l 285 (332) T PRK12311 268 PDDLKKLTGVSGAIEKKL 285 (332) T ss_pred CCHHHHHCCCCHHHHHHH T ss_conf 425777157678899985 No 195 >PRK05477 gatB aspartyl/glutamyl-tRNA amidotransferase subunit B; Validated Probab=22.93 E-value=56 Score=13.80 Aligned_cols=58 Identities=19% Similarity=0.170 Sum_probs=30.8 Q ss_pred HHHHHHHHHHH---HHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCC Q ss_conf 00145677764---3235888899998754210001210467877656540788899999996218 Q gi|254780383|r 124 PQTLEGLTRLP---GIGRKGANVILSMAFGIPTIGVDTHIFRISNRIGLAPGKTPNKVEQSLLRII 186 (227) Q Consensus 124 P~~~~~L~~Lp---GVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~~~~ 186 (227) |....+|+.+= =|-.+.|.-|+-..+.. +..+.-++.+.||..-.+...++....+++ T Consensus 371 ~~~laeLi~li~~g~Is~~~AK~il~~m~~~-----~~sp~~ii~~~gL~~isD~~eLe~ii~eVI 431 (479) T PRK05477 371 PEQLAELIKLIDDGTISGKIAKKVFEEMLEG-----GGDPDEIVEEKGLKQISDEGALEAIVDEVL 431 (479) T ss_pred HHHHHHHHHHHHCCCCCHHHHHHHHHHHHHC-----CCCHHHHHHHCCCCCCCCHHHHHHHHHHHH T ss_conf 9999999999985985689999999999966-----999999999739824489999999999999 No 196 >COG1379 PHP family phosphoesterase with a Zn ribbon [General function prediction only] Probab=22.89 E-value=56 Score=13.79 Aligned_cols=13 Identities=23% Similarity=0.325 Sum_probs=6.5 Q ss_pred CCCHHHHHHHHHH Q ss_conf 1268999999997 Q gi|254780383|r 85 AIGEKKLQNYIRT 97 (227) Q Consensus 85 ~a~~~el~~~ir~ 97 (227) +.+.+++..+|+. 
T Consensus 213 ~~sF~~~r~Ai~~ 225 (403) T COG1379 213 EISFEELRKAIKG 225 (403) T ss_pred CCCHHHHHHHHHC T ss_conf 6888999999715 No 197 >KOG0628 consensus Probab=22.85 E-value=56 Score=13.79 Aligned_cols=53 Identities=19% Similarity=0.120 Sum_probs=35.0 Q ss_pred CCCCCCCCCHHCCCHHHHHHHHHHHHHHC-CCCCC--------CCCCCCHHHHHHHHHHHHC Q ss_conf 77767786002399899999999999976-79998--------7777868999999999633 Q gi|254780383|r 8 DSYQGNSPLGCLYTPKELEEIFYLFSLKW-PSPKG--------ELYYVNHFTLIVAVLLSAQ 60 (227) Q Consensus 8 ~~~~~~~p~~~~~~~~~~~~I~~~L~~~y-p~~~~--------~l~~~~p~~~LVa~iLs~q 60 (227) +.....-|..+..+|+.+++|+..+++.- |.... ..+..+-|..+++.|||.- T Consensus 36 GYl~~llP~~aPe~pE~~~~Il~D~ekiI~PGitHw~hP~fhAyfpa~~s~~siladmLs~~ 97 (511) T KOG0628 36 GYLRDLLPSKAPEKPESWEDILGDLEKIIMPGITHWQHPHFHAYFPAGNSYPSILADMLSGG 97 (511) T ss_pred CHHHHHCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCEEEECCCCCCHHHHHHHHHHCC T ss_conf 22354378878998556999998799770689755689860367657665177999998500 No 198 >PRK00188 trpD anthranilate phosphoribosyltransferase; Provisional Probab=22.83 E-value=56 Score=13.79 Aligned_cols=16 Identities=13% Similarity=0.441 Sum_probs=8.6 Q ss_pred CCCHHHHHHHHHHHHH Q ss_conf 1268999999997300 Q gi|254780383|r 85 AIGEKKLQNYIRTIGI 100 (227) Q Consensus 85 ~a~~~el~~~ir~~G~ 100 (227) +.+.+++++.+...|| T Consensus 133 ~~s~~~~~~~~~~~g~ 148 (339) T PRK00188 133 DLTPEQVARCLDEVGI 148 (339) T ss_pred CCCHHHHHHHHHHCCC T ss_conf 8999999999998095 No 199 >pfam01930 Cas_Cas4 Domain of unknown function DUF83. This domain has no known function. The domain contains three conserved cysteines at its C terminus. Probab=22.73 E-value=39 Score=14.85 Aligned_cols=16 Identities=25% Similarity=0.779 Sum_probs=13.4 Q ss_pred CCCCCCCCCCCHHHCH Q ss_conf 8998947284033176 Q gi|254780383|r 208 ARKPQCQSCIISNLCK 223 (227) Q Consensus 208 ~~~P~C~~C~l~~~C~ 223 (227) ..+|+|..|.+.++|. 
T Consensus 146 ~~~~~C~~Csy~~~C~ 161 (162) T pfam01930 146 EKKKYCKKCAYREFCW 161 (162) T ss_pred CCCCCCCCCCCHHHCC T ss_conf 9899798999741038 No 200 >COG1383 RPS17A Ribosomal protein S17E [Translation, ribosomal structure and biogenesis] Probab=22.05 E-value=52 Score=13.98 Aligned_cols=28 Identities=18% Similarity=0.343 Sum_probs=18.4 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHHHH Q ss_conf 7999999752355444200100001456 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNKIPQTLEG 129 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~vP~~~~~ 129 (227) +.|-++++..|+.|.|+|.+.+-.|++. T Consensus 3 ~IR~~~vKR~a~el~ekY~~~ft~dFe~ 30 (74) T COG1383 3 RIRPKFVKRTARELIEKYPDKFTDDFET 30 (74) T ss_pred CCCHHHHHHHHHHHHHHHHHHHCCCHHH T ss_conf 7525799999999999857775232788 No 201 >PRK11057 ATP-dependent DNA helicase RecQ; Provisional Probab=22.02 E-value=34 Score=15.26 Aligned_cols=28 Identities=29% Similarity=0.406 Sum_probs=12.9 Q ss_pred CHHHHHHHHHHHH-HHH--HHHHHHHHHHHH Q ss_conf 0001456777643-235--888899998754 Q gi|254780383|r 123 IPQTLEGLTRLPG-IGR--KGANVILSMAFG 150 (227) Q Consensus 123 vP~~~~~L~~LpG-VG~--ktA~~il~~~~~ 150 (227) +|.+.+..-+=-| =|+ +-|.|+|.|..+ T Consensus 313 ~P~s~e~yyQE~GRAGRDG~~a~c~l~y~~~ 343 (607) T PRK11057 313 IPRNIESYYQETGRAGRDGLPAEAMLFYDPA 343 (607) T ss_pred CCCCHHHHHHHHHHCCCCCCCCEEEEEECHH T ss_conf 9999999999886352589854189985687 No 202 >COG0322 UvrC Nuclease subunit of the excinuclease complex [DNA replication, recombination, and repair] Probab=21.79 E-value=59 Score=13.65 Aligned_cols=20 Identities=15% Similarity=0.222 Sum_probs=13.6 Q ss_pred HHHHHHHHHHHHHHCCCCCC Q ss_conf 89999999999997679998 Q gi|254780383|r 22 PKELEEIFYLFSLKWPSPKG 41 (227) Q Consensus 22 ~~~~~~I~~~L~~~yp~~~~ 41 (227) ...+++++..|...||-..+ T Consensus 133 ~~a~~~~l~ll~rlfplR~C 152 (581) T COG0322 133 AGAVRETLNLLQRLFPLRTC 152 (581) T ss_pred HHHHHHHHHHHHHHHCCHHC T ss_conf 34689999999976113316 No 203 >pfam05166 YcgL YcgL domain. This family of proteins formerly called DUF709 includes the E. coli gene ycgL. 
Homologues of YcgL are found in gammaproteobacteria. The structure of this protein shows a novel alpha/beta/alpha sandwich structure. Probab=21.60 E-value=53 Score=13.96 Aligned_cols=21 Identities=19% Similarity=0.366 Sum_probs=16.8 Q ss_pred HHHCCCCHHHHHHHHHHHHHH Q ss_conf 001012689999999973003 Q gi|254780383|r 81 QKMLAIGEKKLQNYIRTIGIY 101 (227) Q Consensus 81 e~l~~a~~~el~~~ir~~G~~ 101 (227) ..|+.++.+++.+.|..-||| T Consensus 48 r~LA~~d~~~Vl~~l~~~Gfy 68 (74) T pfam05166 48 KKLARADVEKVLEALEEQGFY 68 (74) T ss_pred CHHHCCCHHHHHHHHHHCCEE T ss_conf 132139999999999868978 No 204 >PRK12933 secD preprotein translocase subunit SecD; Reviewed Probab=21.57 E-value=42 Score=14.61 Aligned_cols=37 Identities=38% Similarity=0.681 Sum_probs=20.9 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH--HHHCCCCHH Q ss_conf 77764323588889999875421000121046787765--654078889 Q gi|254780383|r 130 LTRLPGIGRKGANVILSMAFGIPTIGVDTHIFRISNRI--GLAPGKTPN 176 (227) Q Consensus 130 L~~LpGVG~ktA~~il~~~~~~p~~~VDthv~Rv~~Rl--gl~~~~~~~ 176 (227) -+.|||| |..||.. +++||+||. +..|+ -+-.++++. T Consensus 490 tLTLPGI----AGivLti-----GmAVDaNVl-ifERirEElr~G~~~~ 528 (604) T PRK12933 490 VLTLPGI----AGLVLTV-----GMAVDTNVL-IFERIKDKLKEGRSFA 528 (604) T ss_pred EECHHHH----HHHHHHH-----HHHEECCCH-HHHHHHHHHHCCCCHH T ss_conf 2736669----9999875-----110241513-7488899997699879 No 205 >TIGR02203 MsbA_lipidA lipid A export permease/ATP-binding protein MsbA; InterPro: IPR011917 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. 
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop.
The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This family consists of a single polypeptide chain transporter in the ATP-binding cassette (ABC) transporter family, MsbA, which exports lipid A. It may also act in multidrug resistance. Lipid A, a part of lipopolysaccharide, is found in the outer leaflet of the outer membrane of most Gram-negative bacteria. Members of this family are restricted to the Proteobacteria (although lipid A is more broadly distributed) and often are clustered with lipid A biosynthesis genes.; GO: 0005524 ATP binding, 0006869 lipid transport, 0009276 1-2nm peptidoglycan-based cell wall, 0016021 integral to membrane. Probab=21.29 E-value=60 Score=13.58 Aligned_cols=50 Identities=20% Similarity=0.228 Sum_probs=29.8 Q ss_pred HHHHHHHHHHHHHHHHHHCCC----------CHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 799999975235544420010----------00014567776432358888999987542 Q gi|254780383|r 102 RKKSENIISLSHILINEFDNK----------IPQTLEGLTRLPGIGRKGANVILSMAFGI 151 (227) Q Consensus 102 ~~KAk~I~~~a~~i~~~~~g~----------vP~~~~~L~~LpGVG~ktA~~il~~~~~~ 151 (227) +.-|.-|....+.+.|+|--- ||--.=-|.-+.||+.++++-.|+.+-+.
T Consensus 54 ~~aaaGilstlqnWreqfty~~~~~~~~~~~vPL~~vgl~~lRGi~~f~s~Y~l~~Vs~~ 113 (603) T TIGR02203 54 LVAAAGILSTLQNWREQFTYMVLRDREVLWWVPLLVVGLAVLRGIASFVSDYLLAWVSNK 113 (603) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 357778998887667752022201650521247999999999977778988999999999 No 206 >pfam00682 HMGL-like HMGL-like. This family contains a diverse set of enzymes. These include various aldolases and a region of pyruvate carboxylase. Probab=21.06 E-value=61 Score=13.55 Aligned_cols=16 Identities=13% Similarity=0.059 Sum_probs=9.3 Q ss_pred CCHHHHHHHHHHHHHH Q ss_conf 9989999999999997 Q gi|254780383|r 20 YTPKELEEIFYLFSLK 35 (227) Q Consensus 20 ~~~~~~~~I~~~L~~~ 35 (227) .+.++-.+|.+.|.+. T Consensus 11 ~~~e~K~~i~~~L~~~ 26 (237) T pfam00682 11 FSVEEKLAIARALDEA 26 (237) T ss_pred CCHHHHHHHHHHHHHC T ss_conf 8999999999999984 No 207 >cd03022 DsbA_HCCA_Iso DsbA family, 2-hydroxychromene-2-carboxylate (HCCA) isomerase subfamily; HCCA isomerase is a glutathione (GSH) dependent enzyme involved in the naphthalene catabolic pathway. It converts HCCA, a hemiketal formed spontaneously after ring cleavage of 1,2-dihydroxynapthalene by a dioxygenase, into cis-o-hydroxybenzylidenepyruvate (cHBPA). This is the fourth reaction in a six-step pathway that converts napthalene into salicylate. HCCA isomerase is unique to bacteria that degrade polycyclic aromatic compounds. It is closely related to the eukaryotic protein, GSH transferase kappa (GSTK). 
Probab=21.00 E-value=61 Score=13.54 Aligned_cols=92 Identities=13% Similarity=0.114 Sum_probs=47.3 Q ss_pred HHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHH Q ss_conf 89999999996333203567998998731762000101268999999997300379999997523554442001000014 Q gi|254780383|r 48 HFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTL 127 (227) Q Consensus 48 p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~ 127 (227) .....-..+..++ ...........||+.|=. +..--.+.+.|.+++..+|+-...+....+- +++...+- T Consensus 85 s~~a~~~~~~a~~-~~~~~~~~~~~lf~a~f~-~g~di~d~~vL~~ia~~~Gld~~~~~~~~~~-----~~~~~~l~--- 154 (192) T cd03022 85 TLRAMRAALAAQA-EGDAAEAFARAVFRALWG-EGLDIADPAVLAAVAAAAGLDADELLAAADD-----PAVKAALR--- 154 (192) T ss_pred HHHHHHHHHHHHH-CCCHHHHHHHHHHHHHHC-CCCCCCCHHHHHHHHHHCCCCHHHHHHHHCC-----HHHHHHHH--- T ss_conf 1889999999998-674599999999999862-9978798999999999839999999987526-----68999999--- Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCH Q ss_conf 56777643235888899998754210001210 Q gi|254780383|r 128 EGLTRLPGIGRKGANVILSMAFGIPTIGVDTH 159 (227) Q Consensus 128 ~~L~~LpGVG~ktA~~il~~~~~~p~~~VDth 159 (227) ..++.+.-.-++|.|.|.||.- T Consensus 155 ----------~~~~~A~~~Gi~GvPtfvi~~e 176 (192) T cd03022 155 ----------ANTEEAIARGVFGVPTFVVDGE 176 (192) T ss_pred ----------HHHHHHHHCCCEECCEEEECCE T ss_conf ----------9999999879947778999999 No 208 >pfam05597 Phasin Poly(hydroxyalcanoate) granule associated protein (phasin). Polyhydroxyalkanoates (PHAs) are storage polyesters synthesized by various bacteria as intracellular carbon and energy reserve material. PHAs are accumulated as water-insoluble inclusions within the cells. This family consists of the phasins PhaF and PhaI which act as a transcriptional regulator of PHA biosynthesis genes. PhaF has been proposed to repress expression of the phaC1 gene and the phaIF operon. 
Probab=20.77 E-value=36 Score=15.08 Aligned_cols=36 Identities=14% Similarity=0.234 Sum_probs=24.6 Q ss_pred HHHHHHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHH Q ss_conf 754210001210467877656540788899999996 Q gi|254780383|r 148 AFGIPTIGVDTHIFRISNRIGLAPGKTPNKVEQSLL 183 (227) Q Consensus 148 ~~~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~ 183 (227) .|++=.=.+|.-|.+.++|||+...++.+..+..+. T Consensus 84 ~wdklE~~fe~rV~~aL~rLGips~~di~~L~~rId 119 (132) T pfam05597 84 QWDKLEQAFDERVAKALNRLGVPSRKEVEALSARID 119 (132) T ss_pred HHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHH T ss_conf 588899999999999998569998999999999999 No 209 >TIGR02154 PhoB phosphate regulon transcriptional regulatory protein PhoB; InterPro: IPR011879 PhoB is a DNA-binding response regulator protein acting with PhoR in a 2-component system responding to phosphate ion. PhoB acts as a positive regulator of gene expression for phosphate-related genes such as phoA, phoS, phoE and ugpAB as well as itself . It is often found proximal to genes for the high-affinity phosphate ABC transporter (pstSCAB; GenProp0190) and presumably regulates these as well.; GO: 0000156 two-component response regulator activity, 0003677 DNA binding, 0000160 two-component signal transduction system (phosphorelay), 0006817 phosphate transport. 
Probab=20.35 E-value=17 Score=17.24 Aligned_cols=33 Identities=21% Similarity=0.212 Sum_probs=21.9 Q ss_pred HHHHHHHHHHH----CCCHHHHHHHHHHHHCCCCHHH Q ss_conf 99875421000----1210467877656540788899 Q gi|254780383|r 145 LSMAFGIPTIG----VDTHIFRISNRIGLAPGKTPNK 177 (227) Q Consensus 145 l~~~~~~p~~~----VDthv~Rv~~Rlgl~~~~~~~~ 177 (227) |-.+||..+++ ||+|+.|+=+-|......++.+ T Consensus 180 LD~VWG~dvyVE~RTVDVHIRRLRKaL~~~g~~~~vq 216 (226) T TIGR02154 180 LDRVWGRDVYVEERTVDVHIRRLRKALEPGGLEDLVQ 216 (226) T ss_pred HHHHCCCCEEEECCCEEEEECCHHHHCCCCCCCCCEE T ss_conf 0110589432513500032200054238788887156 No 210 >PTZ00183 centrin; Provisional Probab=20.34 E-value=63 Score=13.45 Aligned_cols=147 Identities=12% Similarity=0.084 Sum_probs=67.1 Q ss_pred CCCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHHHCCCCCHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 98777786899999999963332035679989987317620001012689999999973003799999975235544420 Q gi|254780383|r 40 KGELYYVNHFTLIVAVLLSAQSTDVNVNKATKHLFEIADTPQKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEF 119 (227) Q Consensus 40 ~~~l~~~~p~~~LVa~iLs~qT~d~~v~~~~~~L~~~ypt~e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~ 119 (227) +-|.-|.+|= .+-++=+++.... 
++..|..| +...=-..+.+|+..+++.+|+.- ....+..+...+-..- T Consensus 6 ~~~~~~~~~~------~~~~~ls~eq~~e-lke~F~~~-D~d~dG~Is~~El~~~L~~lG~~~-t~~el~~i~~~~D~d~ 76 (168) T PTZ00183 6 QMPIRYENPR------SIRPELNEEQKLE-IREAFDLF-DTDGTGYIDVKELKVAMRALGFEP-KKEEIKRMIADVDKDG 76 (168) T ss_pred CCCCCCCCCC------HHCCCCCHHHHHH-HHHHHHHH-CCCCCCCCCHHHHHHHHHHHCCCC-CHHHHHHHHHHCCCCC T ss_conf 7998888944------2002699999999-99999998-699869697999999999908999-9999999998628789 Q ss_pred CCCCHHHHHHHHHHH---HHHHHHHHHHHHHHH-----HHHHHHCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCH-H- Q ss_conf 010000145677764---323588889999875-----421000121046787765654078889999999621884-2- Q gi|254780383|r 120 DNKIPQTLEGLTRLP---GIGRKGANVILSMAF-----GIPTIGVDTHIFRISNRIGLAPGKTPNKVEQSLLRIIPP-K- 189 (227) Q Consensus 120 ~g~vP~~~~~L~~Lp---GVG~ktA~~il~~~~-----~~p~~~VDthv~Rv~~Rlgl~~~~~~~~~~~~l~~~~p~-~- 189 (227) +|.| ++++.+.+- --...+.+. +.-+| +..+.+-=....+++..+|. .-+.++++..+...=.. + T Consensus 77 ~g~I--~f~eF~~~~~~~~~~~~~~~~-l~~aF~~fD~d~~G~Is~~elk~~l~~lg~--~is~eei~~l~~~~D~d~DG 151 (168) T PTZ00183 77 SGSI--DFNEFLEIMTKKMGERDPREE-ILKAFRLFDDDDTGKISLKNLKRVAKELGE--NLTDEELQEMIDEADRDGDG 151 (168) T ss_pred CCCE--EHHHHHHHHHHHHCCCCCHHH-HHHHHHHHCCCCCCCCCHHHHHHHHHHHCC--CCCHHHHHHHHHHHCCCCCC T ss_conf 9845--199999999998524240899-999999858888896789999999999689--99999999999985989999 Q ss_pred --HHHHHHHHHHH Q ss_conf --26789999999 Q gi|254780383|r 190 --HQYNAHYWLVL 200 (227) Q Consensus 190 --~~~~~~~~li~ 200 (227) .+.+|-..|.. T Consensus 152 ~I~y~EF~~~m~~ 164 (168) T PTZ00183 152 EISEEEFKRIMKK 164 (168) T ss_pred CEEHHHHHHHHHH T ss_conf 6909999999986 No 211 >pfam08625 Utp13 Utp13 specific WD40 associated domain. Utp13 is a component of the five protein Pwp2 complex that forms part of a stable particle subunit independent of the U3 small nucleolar ribonucleoprotein that is essential for the initial assembly steps of the 90S pre-ribosome. Pwp2 is capable of interacting directly with the 35 S pre-rRNA 5' end. 
Probab=20.33 E-value=63 Score=13.45 Aligned_cols=59 Identities=10% Similarity=0.182 Sum_probs=35.1 Q ss_pred HHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 00101268999999997300379999997523554442001000014567776432358888 Q gi|254780383|r 81 QKMLAIGEKKLQNYIRTIGIYRKKSENIISLSHILINEFDNKIPQTLEGLTRLPGIGRKGAN 142 (227) Q Consensus 81 e~l~~a~~~el~~~ir~~G~~~~KAk~I~~~a~~i~~~~~g~vP~~~~~L~~LpGVG~ktA~ 142 (227) +.+..++.+++..+++-+--||..|+.-.- |+.++.--=... ..++|.++||++.-... T Consensus 50 ~~i~~L~~~ql~~LL~~~r~WNTNsr~~~v-AQ~vL~~ll~~~--~~~~l~~~~g~~~~lea 108 (138) T pfam08625 50 ETIGRLRKDQLELLLKFIREWNTNAKTCHV-AQRVLSVILKSF--PPEELLEVPGLKEILEA 108 (138) T ss_pred HHHHHCCHHHHHHHHHHHHHCCCCCCCCHH-HHHHHHHHHHHC--CHHHHHCCCCHHHHHHH T ss_conf 999857999999999999876467756099-999999999767--98998800148999999 No 212 >COG1623 Predicted nucleic-acid-binding protein (contains the HHH domain) [General function prediction only] Probab=20.12 E-value=63 Score=13.42 Aligned_cols=14 Identities=7% Similarity=-0.071 Sum_probs=6.2 Q ss_pred HHHHHCCCHHHHHH Q ss_conf 21000121046787 Q gi|254780383|r 151 IPTIGVDTHIFRIS 164 (227) Q Consensus 151 ~p~~~VDthv~Rv~ 164 (227) -..+.=+.++.|++ T Consensus 261 ~~~lL~~~~i~kvl 274 (349) T COG1623 261 DEELLDPENIAKVL 274 (349) T ss_pred CHHHCCHHHHHHHH T ss_conf 04017889999996 Done!