RPS-BLAST 2.2.22 [Sep-27-2009] Database: CddB 21,608 sequences; 5,994,473 total letters Searching..................................................done Query= gi|254780479|ref|YP_003064892.1| A/G-specific adenine glycosylase [Candidatus Liberibacter asiaticus str. psy62] (356 letters) >gnl|CDD|130156 TIGR01084, mutY, A/G-specific adenine glycosylase. This equivalog model identifies mutY members of the pfam00730 superfamily (HhH-GPD: Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate). The major members of the superfamily are nth and mutY. Length = 275 Score = 266 bits (683), Expect = 4e-72 Identities = 105/266 (39%), Positives = 150/266 (56%), Gaps = 13/266 (4%) Query: 9 QSKILDWYDTNHR-VLPWRTSPKTEKSSLPSPYKVWISEIMLQQTTVKTVEPYFKKFMQK 67 +L WYD R LPWR + +PY+VW+SE+MLQQT V TV PYF++F+++ Sbjct: 3 SEDLLSWYDKYGRKTLPWRQNK--------TPYRVWLSEVMLQQTQVATVIPYFERFLER 54 Query: 68 WPTIFCLSSAKDEEILSAWAGLGYYTRARNLKKCADIIVKKYEGNFPHKVEILKKLPGIG 127 +PT+ L++A +E+L W GLGYY RARNL K A +V+++ G FP E L LPG+G Sbjct: 55 FPTVQALANAPQDEVLKLWEGLGYYARARNLHKAAQEVVEEFGGEFPQDFEDLAALPGVG 114 Query: 128 DYTASAIVAIAFNHFAVVVDTNIERIISRYFDIIKPA--PLYHKTIKNYARKITSTSRPG 185 YTA AI++ A N ++D N++R++SR F + + A + + P Sbjct: 115 RYTAGAILSFALNKPYPILDGNVKRVLSRLFAVEGWPGKKKVENRLWTLAESLLPKADPE 174 Query: 186 DFVQAMMDLGALICTSNKPLCPLCPIQKNCLTFSEGKSHLLGINTIKKKRPMRTGAVFIA 245 F QA+MDLGA+ICT KP C LCP+Q CL + +G + K P RT F+ Sbjct: 175 AFNQALMDLGAMICTRKKPKCDLCPLQDFCLAYQQGTWEEYPVKKPKAAPPERTTY-FLV 233 Query: 246 ITN-DNRILLRKRTNTRLLEGMDELP 270 + N D +LL +R L G+ P Sbjct: 234 LQNYDGEVLLEQRPEKGLWGGLYCFP 259 >gnl|CDD|182805 PRK10880, PRK10880, adenine DNA glycosylase; Provisional. 
Length = 350 Score = 204 bits (522), Expect = 2e-53 Identities = 118/332 (35%), Positives = 176/332 (53%), Gaps = 45/332 (13%) Query: 12 ILDWYDTNHR-VLPWRTSPKTEKSSLPSPYKVWISEIMLQQTTVKTVEPYFKKFMQKWPT 70 +LDWYD R LPW+ KT PYKVW+SE+MLQQT V TV PYF++FM ++PT Sbjct: 10 VLDWYDKYGRKTLPWQI-DKT-------PYKVWLSEVMLQQTQVATVIPYFERFMARFPT 61 Query: 71 IFCLSSAKDEEILSAWAGLGYYTRARNLKKCADIIVKKYEGNFPHKVEILKKLPGIGDYT 130 + L++A +E+L W GLGYY RARNL K A + + G FP E + LPG+G T Sbjct: 62 VTDLANAPLDEVLHLWTGLGYYARARNLHKAAQQVATLHGGEFPETFEEVAALPGVGRST 121 Query: 131 ASAIVAIAF-NHFAVVVDTNIERIISRYFDIIK-PAPLYHKTIKNYARKITSTSRPGD-- 186 A AI++++ HF ++D N++R+++R + + P K ++N +++ P Sbjct: 122 AGAILSLSLGKHFP-ILDGNVKRVLARCYAVSGWPG---KKEVENRLWQLSEQVTPAVGV 177 Query: 187 --FVQAMMDLGALICTSNKPLCPLCPIQKNCLTFSEGKSHLLGINTIKKKRPMRTGAVFI 244 F QAMMDLGA++CT +KP C LCP+Q C+ ++ L K+ P RTG F+ Sbjct: 178 ERFNQAMMDLGAMVCTRSKPKCELCPLQNGCIAYANHSWALYPGKKPKQTLPERTG-YFL 236 Query: 245 AITNDNRILLRKRTNTRLLEGM---------DELPGSAWSSTKDGNIDTHSAPFTANWIL 295 + + + + L +R + L G+ +EL W + + D + TA Sbjct: 237 LLQHGDEVWLEQRPPSGLWGGLFCFPQFADEEEL--RQWLAQRGIAADNLT-QLTA---- 289 Query: 296 CNTITHTFTHFTLTLFVWKTIVPQIVIIPDST 327 HTF+HF L IVP + + T Sbjct: 290 ---FRHTFSHFHL------DIVPMWLPVSSFT 312 >gnl|CDD|172427 PRK13910, PRK13910, DNA glycosylase MutY; Provisional. 
Length = 289 Score = 145 bits (366), Expect = 2e-35 Identities = 92/264 (34%), Positives = 146/264 (55%), Gaps = 22/264 (8%) Query: 48 MLQQTTVKTV-EPYFKKFMQKWPTIFCLSSAKDEEILSAWAGLGYYTRARNLKKCADIIV 106 M QQT + TV E ++ F++ +PT+ L++A EE+L W GLGYY+RA+NLKK A+I V Sbjct: 1 MSQQTQINTVVERFYSPFLEAFPTLKDLANAPLEEVLLLWRGLGYYSRAKNLKKSAEICV 60 Query: 107 KKYEGNFPHKVEILKKLPGIGDYTASAIVAIAFNHFAVVVDTNIERIISRYFDIIKPAPL 166 K++ P+ + L KLPGIG YTA+AI+ F + VD NI+R++ R F + + Sbjct: 61 KEHHSQLPNDYQSLLKLPGIGAYTANAILCFGFREKSACVDANIKRVLLRLFGL--DPNI 118 Query: 167 YHKTIKNYARKITSTSRPGDFVQAMMDLGALICTSNKPLCPLCPIQKNCLTFSEGKSHLL 226 + K ++ A + + + QA++DLGALIC S KP C +CP+ CL GK++ Sbjct: 119 HAKDLQIKANDFLNLNESFNHNQALIDLGALIC-SPKPKCAICPLNPYCL----GKNNPE 173 Query: 227 GINTIKKKRPMRTGAVFIAITNDNRILLRKRTNTRLLEGMDELPGSAWSSTKDGNIDTHS 286 +T+KKK+ + ++ + N + ++ +L GM P + K+ N++ + Sbjct: 174 K-HTLKKKQEIVQEERYLGVVIQNNQIALEKIEQKLYLGMHHFP-----NLKE-NLE-YK 225 Query: 287 APFTANWILCNTITHTFTHFTLTL 310 PF I H+ T F L L Sbjct: 226 LPFLG------AIKHSHTKFKLNL 243 >gnl|CDD|128754 smart00478, ENDO3c, endonuclease III. includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases. Length = 149 Score = 134 bits (339), Expect = 4e-32 Identities = 43/151 (28%), Positives = 78/151 (51%), Gaps = 3/151 (1%) Query: 48 MLQQTTVKTVEPYFKKFMQKWPTIFCLSSAKDEEILSAWAGLGYYT-RARNLKKCADIIV 106 + QQT+ + V ++ +K+PT L++A +EE+ LG+Y +A+ L + A I+V Sbjct: 1 LSQQTSDEAVNKATERLFEKFPTPEDLAAADEEELEELIRPLGFYRRKAKYLIELARILV 60 Query: 107 KKYEGNFPHKVEILKKLPGIGDYTASAIVAIAFNHFAVVVDTNIERIISRYFDIIKPAPL 166 ++Y G P E L KLPG+G TA+A+++ A + VDT++ RI R + K + Sbjct: 61 EEYGGEVPDDREELLKLPGVGRKTANAVLSFALGKPFIPVDTHVLRIAKRLGLVDKKST- 119 Query: 167 YHKTIKNYARKITSTSRPGDFVQAMMDLGAL 197 + ++ K+ + ++D G Sbjct: 120 -PEEVEKLLEKLLPKEDWRELNLLLIDFGRT 149 >gnl|CDD|130155 TIGR01083, nth, endonuclease III. 
This equivalog model identifes nth members of the pfam00730 superfamily (HhH-GPD: Helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate). The major members of the superfamily are nth and mutY. Length = 191 Score = 85.1 bits (211), Expect = 3e-17 Identities = 49/175 (28%), Positives = 85/175 (48%), Gaps = 14/175 (8%) Query: 38 SPYKVWISEIMLQQTTVKTVEPYFKKFMQKWPTIFCLSSAKDEEILSAWAGLGYY-TRAR 96 +P+++ ++ I+ Q T K+V KK + +PT L+ A EE+ +G Y +A+ Sbjct: 25 NPFELLVATILSAQATDKSVNKATKKLFEVYPTPQALAQAGLEELEEYIKSIGLYRNKAK 84 Query: 97 NLKKCADIIVKKYEGNFPHKVEILKKLPGIGDYTASAIVAIAFNHFAVVVDTNIERI--- 153 N+ I+V++Y G P E L KLPG+G TA+ ++ +AF A+ VDT++ R+ Sbjct: 85 NIIALCRILVERYGGEVPEDREELVKLPGVGRKTANVVLNVAFGIPAIAVDTHVFRVSNR 144 Query: 154 --ISRYFDIIKPAPLYHKTIKNYARKITSTSRPGDFVQAMMDLGALICTSNKPLC 206 +S+ D K ++ K+ ++ G C + KPLC Sbjct: 145 LGLSKGKDPDK--------VEEELLKLIPREFWTKLHHWLILHGRYTCKARKPLC 191 >gnl|CDD|182661 PRK10702, PRK10702, endonuclease III; Provisional. Length = 211 Score = 50.4 bits (120), Expect = 8e-07 Identities = 47/189 (24%), Positives = 86/189 (45%), Gaps = 7/189 (3%) Query: 29 PKTEKSSLPSPYKVWISEIMLQQTTVKTVEPYFKKFMQKWPTIFCLSSAKDEEILSAWAG 88 P TE + SP+++ I+ ++ Q T +V K T + E + + Sbjct: 20 PTTELN-FSSPFELLIAVLLSAQATDVSVNKATAKLYPVANTPAAMLELGVEGVKTYIKT 78 Query: 89 LGYY-TRARNLKKCADIIVKKYEGNFPHKVEILKKLPGIGDYTASAIVAIAFNHFAVVVD 147 +G Y ++A N+ K I+++++ G P L+ LPG+G TA+ ++ AF + VD Sbjct: 79 IGLYNSKAENVIKTCRILLEQHNGEVPEDRAALEALPGVGRKTANVVLNTAFGWPTIAVD 138 Query: 148 TNIERIISRYFDIIKPAPLYH-KTIKNYARKITSTSRPGDFVQAMMDLGALICTSNKPLC 206 T+I R+ +R + AP + + ++ K+ D ++ G C + KP C Sbjct: 139 THIFRVCNR----TQFAPGKNVEQVEEKLLKVVPAEFKVDCHHWLILHGRYTCIARKPRC 194 Query: 207 PLCPIQKNC 215 C I+ C Sbjct: 195 GSCIIEDLC 203 >gnl|CDD|128798 smart00525, FES, FES domain. iron-sulpphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3). 
Length = 21 Score = 33.3 bits (77), Expect = 0.091 Identities = 9/18 (50%), Positives = 12/18 (66%) Query: 198 ICTSNKPLCPLCPIQKNC 215 ICT+ KP C CP++ C Sbjct: 1 ICTARKPRCDECPLKDLC 18 >gnl|CDD|151111 pfam10576, EndIII_4Fe-2S, Iron-sulfur binding domain of endonuclease III. Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micro-coccus UV endonuclease. Length = 17 Score = 32.0 bits (74), Expect = 0.25 Identities = 8/17 (47%), Positives = 10/17 (58%) Query: 199 CTSNKPLCPLCPIQKNC 215 CT+ KP C CP+ C Sbjct: 1 CTARKPKCEECPLADLC 17 >gnl|CDD|179200 PRK00994, PRK00994, F420-dependent methylenetetrahydromethanopterin dehydrogenase; Provisional. Length = 277 Score = 29.2 bits (66), Expect = 1.8 Identities = 18/61 (29%), Positives = 27/61 (44%), Gaps = 9/61 (14%) Query: 64 FMQKWPTIFCLSSAKDEEILSAWAGLGYYTRARNLKKCADIIVKKYEGNFPHKV--EILK 121 FM K + A E++ A A L AR ++K D +++K PH +IL Sbjct: 216 FMTKEMEKYIPIVASAHEMMRAAAKLAD--EAREIEKANDTVLRK-----PHAKDGKILS 268 Query: 122 K 122 K Sbjct: 269 K 269 >gnl|CDD|134035 PRK00020, truB, tRNA pseudouridine synthase B; Provisional. 
Length = 244 Score = 28.8 bits (64), Expect = 1.9 Identities = 20/65 (30%), Positives = 29/65 (44%), Gaps = 8/65 (12%) Query: 127 GDYTASAIVAIAFNHFAVVVDTNIERIISRYFDIIKPAPLYHKTIK-------NYARKIT 179 GD T IVA A + FA V + + ++SR+ I+ P + +K YAR Sbjct: 87 GDLTGH-IVARAPDGFAGVEEAALRDVLSRFVGTIEQIPPMYSALKRDGKPLYEYARAGI 145 Query: 180 STSRP 184 RP Sbjct: 146 ELDRP 150 >gnl|CDD|128712 smart00435, TOPEUc, DNA Topoisomerase I (eukaryota). DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccina virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomeras. Length = 391 Score = 28.9 bits (65), Expect = 2.3 Identities = 11/26 (42%), Positives = 13/26 (50%) Query: 92 YTRARNLKKCADIIVKKYEGNFPHKV 117 Y +AR LKK D I K Y + K Sbjct: 85 YEKARKLKKHIDKIRKDYTKDLKSKE 110 >gnl|CDD|129520 TIGR00426, TIGR00426, competence protein ComEA helix-hairpin-helix repeat region. Members of the subfamily recognized by this model include competence protein ComEA and closely related proteins from a number of species that exhibit competence for transformation by exongenous DNA, including Streptococcus pneumoniae, Bacillus subtilis, Neisseria meningitidis, and Haemophilus influenzae. This model represents a region of two tandem copies of a helix-hairpin-helix domain (pfam00633), each about 30 residues in length. Limited sequence similarity can be found among some members of this family N-terminal to the region covered by this model. Length = 69 Score = 28.0 bits (62), Expect = 3.6 Identities = 21/58 (36%), Positives = 31/58 (53%), Gaps = 9/58 (15%) Query: 74 LSSAKDEEILSAWAGLGYYTRARNLKKCADIIVKKYE-GNFPHKVEILKKLPGIGDYT 130 +++A EE+ A G+G LKK I+ + E G F VE LK++PGIG+ Sbjct: 10 INTATAEELQRAMNGVG-------LKKAEAIVSYREEYGPF-KTVEDLKQVPGIGNSL 59 >gnl|CDD|181223 PRK08074, PRK08074, bifunctional ATP-dependent DNA helicase/DNA polymerase III subunit epsilon; Validated. 
Length = 928 Score = 28.0 bits (63), Expect = 4.0 Identities = 12/39 (30%), Positives = 17/39 (43%), Gaps = 1/39 (2%) Query: 68 WPTIFCLSSAKDEEILSAWAGLGYYTRARNLKKCADIIV 106 W I D S W +Y RA+N K AD+++ Sbjct: 399 WNRI-ASDGESDGGKQSPWFSRCFYQRAKNRAKFADLVI 436 >gnl|CDD|137505 PRK09751, PRK09751, putative ATP-dependent helicase Lhr; Provisional. Length = 1490 Score = 27.6 bits (61), Expect = 5.0 Identities = 6/18 (33%), Positives = 9/18 (50%) Query: 14 DWYDTNHRVLPWRTSPKT 31 +WY R PW+ P+ Sbjct: 433 EWYSRVRRAAPWKDLPRR 450 >gnl|CDD|163497 TIGR03785, marine_sort_HK, proteobacterial dedicated sortase system histidine kinase. This histidine kinase protein is paired with an adjacent response regulator (TIGR03787) gene. It co-occurs with a variant sortase enzyme (TIGR03784), usually in the same gene neighborhood, in proteobacterial species most of which are marine, and with an LPXTG motif-containing sortase target conserved protein (TIGR03788). Sortases and LPXTG proteins are far more common in Gram-positive bacteria, where sortase systems mediate attachment to the cell wall or cross-linking of pilin structures. We give this predicted sensor histidine kinase the gene symbol psdS, for Proteobacterial Dedicated Sortase system Sensor histidine kinase. Length = 703 Score = 27.4 bits (61), Expect = 5.5 Identities = 11/34 (32%), Positives = 18/34 (52%), Gaps = 5/34 (14%) Query: 25 WRTSPKTEKSSLPSPYKVWISE-----IMLQQTT 53 WR SP ++ L + Y +WI ++ +QTT Sbjct: 361 WRLSPDSKAVILSAAYPIWIDTEVLGAVIAEQTT 394 >gnl|CDD|180664 PRK06705, PRK06705, argininosuccinate lyase; Provisional. 
Length = 502 Score = 27.3 bits (60), Expect = 5.7 Identities = 18/69 (26%), Positives = 31/69 (44%), Gaps = 10/69 (14%) Query: 53 TVKTVEPYFKKFMQKWPTIFCLSSA------KDEEILSAWAGLGYYTRARNLKKCADIIV 106 T ++PY K ++K +FC+ +A +E+ L + Y A + AD++ Sbjct: 328 TEDDLQPYLYKGIEKAIRVFCIMNAVIRTMKVEEDTLKRRS----YKHAITITDFADVLT 383 Query: 107 KKYEGNFPH 115 K Y F H Sbjct: 384 KNYGIPFRH 392 >gnl|CDD|163580 TIGR03868, F420-O_ABCperi, proposed F420-0 ABC transporter, periplasmic F420-0 binding protein. This small clade of ABC-type transporter periplasmic binding protein components is found as a three gene cassette along with a permease (TIGR03869) and an ATPase (TIGR03873). The organisms containing this cassette are all Actinobacteria and all contain numerous genes requiring the coenzyme F420. This model was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: TIGR01916). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with a F420-dependent glucose-6-phosphate dehydrogenase (TIGR03554). Based on these observations we propose that this periplasmic binding protein is a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter. Length = 287 Score = 27.1 bits (60), Expect = 6.3 Identities = 14/52 (26%), Positives = 25/52 (48%), Gaps = 11/52 (21%) Query: 305 HFTLTLFVWKTIV---PQIVIIPDSTWHDAQ--------NLANAALPTVMKK 345 H T T W+ +V P ++++ DS W+ A+ N A + L V ++ Sbjct: 206 HDTWTPMSWEAVVDADPDVIVLVDSAWNSAEKKIEVLESNPATSNLTAVQEQ 257 >gnl|CDD|129252 TIGR00148, TIGR00148, UbiD family decarboxylases. Found in bacteria, archaea, and yeast, with two members in A. fulgidus. No homologs were detected besides those classified as orthologs. The member from H. 
pylori has a C-terminal extension of just over 100 residues that is shared in part by the Aquifex aeolicus homolog. Length = 438 Score = 27.3 bits (61), Expect = 6.9 Identities = 23/95 (24%), Positives = 39/95 (41%), Gaps = 8/95 (8%) Query: 157 YFDIIKPAPLYH-KTIKNYARK--ITSTSRPGDFVQAMMDLGALICTSNKPLC--PLCPI 211 Y+DI++P P+ K + Y R+ I + PG +G P+ + I Sbjct: 275 YYDIVRPEPVITVKRM--YHREDPIYHATYPGGPPHEDALMGVPTEPVFYPILRNQVPEI 332 Query: 212 QKNCLTFSEGKSHLLGINTIKKKRPMRTGAVFIAI 246 + + G LL + +IKK+ P V +A Sbjct: 333 -IDAVLPEGGCHWLLAVVSIKKRYPGDAKNVIMAA 366 >gnl|CDD|177825 PLN02168, PLN02168, copper ion binding / pectinesterase. Length = 545 Score = 26.9 bits (59), Expect = 9.1 Identities = 13/38 (34%), Positives = 19/38 (50%), Gaps = 3/38 (7%) Query: 2 PQPEHIIQSKILDWYDTNHRVLPWRTSPKTEKSSLPSP 39 P+P+ I DW+ +H V+ R S SLP+P Sbjct: 154 PKPDEEYDILIGDWFYADHTVM--RASLDN-GHSLPNP 188 >gnl|CDD|161960 TIGR00615, recR, recombination protein RecR. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). Length = 195 Score = 26.9 bits (60), Expect = 9.3 Identities = 13/33 (39%), Positives = 18/33 (54%), Gaps = 3/33 (9%) Query: 108 KYEGNFPHKVEILKKLPGIGDYTASAIVAIAFN 140 +Y +E LKKLPGIG +A +AF+ Sbjct: 1 QYPPPISKLIESLKKLPGIGPKSAQ---RLAFH 30 >gnl|CDD|128574 smart00278, HhH1, Helix-hairpin-helix DNA-binding motif class 1. 
          Length = 20

 Score = 26.5 bits (60), Expect = 9.9
 Identities = 10/19 (52%), Positives = 12/19 (63%)

Query: 118 EILKKLPGIGDYTASAIVA 136
           E L K+PGIG  TA  I+
Sbjct: 1   EELLKVPGIGPKTAEKILE 19

  Database: CddB
    Posted date:  Feb 4, 2011  9:54 PM
  Number of letters in database: 5,994,473
  Number of sequences in database:  21,608

Lambda     K      H
   0.321    0.135    0.421

Gapped
Lambda     K      H
   0.267   0.0628    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 21608
Number of Hits to DB: 5,877,083
Number of extensions: 368360
Number of successful extensions: 719
Number of sequences better than 10.0: 1
Number of HSP's gapped: 711
Number of HSP's successfully gapped: 26
Length of query: 356
Length of database: 5,994,473
Length adjustment: 94
Effective length of query: 262
Effective length of database: 3,963,321
Effective search space: 1038390102
Effective search space used: 1038390102
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 58 (26.3 bits)
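
The effective lengths in the statistics footer can be checked by hand. The sketch below assumes the usual BLAST convention that the length adjustment is subtracted once from the query and once per database sequence; the function name `effective_search_space` is hypothetical and is not part of any BLAST tool. All input values are copied from the footer of this report.

```python
# Values copied from the statistics footer of this report.
QUERY_LENGTH = 356          # "Length of query"
DB_LETTERS = 5_994_473      # "Length of database"
DB_SEQUENCES = 21_608       # "Number of sequences in database"
LENGTH_ADJUSTMENT = 94      # "Length adjustment"


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective database length, search space).

    Assumption: the adjustment is removed once from the query and once
    from every database sequence, which reproduces the footer values.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


eff_query, eff_db, space = effective_search_space(
    QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
)
print(eff_query, eff_db, space)
```

Under that assumption the three computed values agree with the "Effective length of query", "Effective length of database", and "Effective search space" lines of the report.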