Query         gi|254780310|ref|YP_003064723.1| hypothetical protein CLIBASIA_00975 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 119
No_of_seqs    116 out of 730
Neff          7.9
Searched_HMMs 39220
Date          Sun May 29 15:45:28 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780310.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PRK11595 gluconate periplasmic  99.9 8.1E-24 2.1E-28  152.3   8.3   99    18-117 2-102 (227)
  2 COG1040 ComFC Predicted amidop  99.6 8.8E-17 2.2E-21  113.4   4.2  101    14-117 5-106 (225)
  3 TIGR00201 comF comF family pro  99.3   1E-12 2.6E-17   90.8   3.1   82    24-119 1-82 (207)
  4 TIGR00143 hypF [NiFe] hydrogen  91.0    0.27   7E-06   27.5   3.6   71    18-100 120-212 (799)
  5 pfam04981 NMD3 NMD3 family. Th  85.2    0.46 1.2E-05   26.3   1.6   36     24-59 1-44 (237)
  6 COG1499 NMD3 NMD protein affec  74.8     2.1 5.3E-05   22.7   2.1   37     23-59 8-52 (355)
  7 PRK11788 hypothetical protein;  73.1     3.2 8.2E-05   21.6   2.7   44      5-48 338-383 (389)
  8 pfam09889 DUF2116 Uncharacteri  68.0     2.2 5.7E-05   22.5   1.0   24     22-45 4-28 (59)
  9 TIGR00599 rad18 DNA repair pro  67.7     3.7 9.5E-05   21.3   2.1   59    23-100 29-93 (421)
 10 KOG2041 consensus               64.6     2.8 7.2E-05   21.9   1.0   45     23-72 1119-1165(1189)
 11 pfam10764 Gin Inhibitor of sig  64.4     3.6 9.1E-05   21.4   1.4   27     23-49 1-32 (46)
 12 pfam03660 PHF5 PHF5-like prote  61.8       4  0.0001   21.1   1.3   50     22-74 27-79 (105)
 13 PRK11032 hypothetical protein;  59.5     2.2 5.7E-05   22.5  -0.3   33     28-60 117-152 (160)
 14 pfam07295 DUF1451 Protein of u  57.4     7.5 0.00019   19.6   2.1   48     13-60 92-142 (148)
 15 pfam07191 DUF1407 Protein of u  56.9     1.3 3.4E-05   23.7  -1.8   38     23-62 3-42 (70)
 16 PRK05580 primosome assembly pr  55.9     2.8 7.1E-05   22.0  -0.3   43     24-71 407-452 (699)
 17 KOG0978 consensus               55.4     2.5 6.5E-05   22.2  -0.6   40     23-62 645-690 (698)
 18 COG4068 Uncharacterized protei  55.0     4.9 0.00013   20.6   0.8   18     23-40 10-27 (64)
 19 pfam10977 DUF2797 Protein of u  54.8     6.6 0.00017   19.9   1.4   25     23-47 13-39 (233)
 20 KOG4080 consensus               53.2     6.1 0.00015   20.1   1.0   26     22-48 94-119 (176)
 21 pfam04423 Rad50_zn_hook Rad50   52.7      14 0.00035   18.1   2.8   30      3-34 4-33 (54)
 22 TIGR00575 dnlj DNA ligase, NAD  50.8     9.6 0.00024   19.0   1.7   16     18-33 408-423 (706)
 23 TIGR00269 TIGR00269 conserved   49.8     8.4 0.00021   19.3   1.3   19     51-69 82-100 (106)
 24 PRK00420 hypothetical protein;  48.8      15 0.00039   17.9   2.5   37      5-42 4-43 (107)
 25 COG2888 Predicted Zn-ribbon RN  45.9     3.7 9.5E-05   21.3  -1.0   24     20-43 8-35 (61)
 26 pfam10217 DUF2039 Uncharacteri  45.6     4.7 0.00012   20.7  -0.5   34     21-59 55-90 (92)
 27 pfam04216 FdhE Protein involve  43.5      26 0.00066   16.6   3.0   56      4-60 139-215 (283)
 28 COG2956 Predicted N-acetylgluc  41.8      21 0.00053   17.1   2.3   42      5-46 338-381 (389)
 29 COG5432 RAD18 RING-finger-cont  40.6      13 0.00034   18.2   1.2   37     23-60 27-69 (391)
 30 KOG4265 consensus               39.7      11 0.00027   18.8   0.5   41     21-62 290-337 (349)
 31 pfam10571 UPF0547 Uncharacteri  39.1       9 0.00023   19.1   0.1   22     38-60 3-24 (26)
 32 PRK08359 transcription factor;  38.4      15 0.00039   17.9   1.2   26     21-46 6-41 (175)
 33 PRK10410 hypothetical protein;  38.3      20  0.0005   17.3   1.8   73      8-98 9-95 (114)
 34 COG0068 HypF Hydrogenase matur  38.1      29 0.00074   16.3   2.6   12     34-45 100-111 (750)
 35 COG2995 PqiA Uncharacterized p  37.9      13 0.00034   18.2   0.8   27     34-60 219-245 (418)
 36 KOG1512 consensus               37.5      15 0.00037   18.0   1.0   55     23-81 316-372 (381)
 37 pfam06906 DUF1272 Protein of u  37.4      25 0.00062   16.7   2.1   37     21-60 5-51 (57)
 38 pfam10083 DUF2321 Uncharacteri  36.5      12 0.00032   18.4   0.5   11     50-60 68-78 (158)
 39 pfam03854 zf-P11 P-11 zinc fin  36.0      13 0.00033   18.2   0.6   32     30-62 15-47 (50)
 40 cd00350 rubredoxin_like Rubred  35.0      21 0.00053   17.2   1.4   25     23-60 3-27 (33)
 41 KOG3002 consensus               34.3      25 0.00063   16.7   1.7   35     23-61 50-91 (299)
 42 pfam08772 NOB1_Zn_bind Nin one  33.8      13 0.00032   18.3   0.2   23     37-59 11-33 (73)
 43 KOG2660 consensus               33.7      23 0.00058   16.9   1.5   65     17-96 11-82 (331)
 44 KOG1813 consensus               32.7      28 0.00072   16.4   1.8   44     17-61 237-286 (313)
 45 pfam05810 NinF NinF protein. T  31.7      25 0.00063   16.7   1.4   22     24-45 20-42 (58)
 46 COG1198 PriA Primosomal protei  31.2     9.2 0.00024   19.1  -0.9   36     22-59 445-484 (730)
 47 PRK08620 DNA topoisomerase III  30.5      46  0.0012   15.2   3.2   58      4-61 580-659 (726)
 48 COG1592 Rubrerythrin [Energy p  29.8      45  0.0011   15.3   2.4   38      9-60 119-159 (166)
 49 pfam10146 zf-C4H2 Zinc finger-  29.1     9.5 0.00024   19.0  -1.1   21     37-58 186-206 (220)
 50 COG1439 Predicted nucleic acid  29.1      14 0.00035   18.1  -0.3   23     37-60 141-163 (177)
 51 TIGR00354 polC DNA polymerase   28.8      15 0.00038   18.0  -0.2   43     22-69 680-722 (1173)
 52 COG1997 RPL43A Ribosomal prote  27.1      53  0.0013   14.9   2.4   25      3-30 20-44 (89)
 53 TIGR00570 cdk7 CDK-activating   26.7      34 0.00086   16.0   1.4   39     22-61 9-62 (322)
 54 pfam05290 Baculo_IE-1 Baculovi  26.6      54  0.0014   14.8   2.9   45     18-62 75-133 (141)
 55 PRK01343 zinc-binding protein;  26.0      33 0.00085   16.0   1.2   12     21-32 9-20 (56)
 56 TIGR01405 polC_Gram_pos DNA po  25.5      26 0.00066   16.6   0.6   29     34-63 716-757 (1264)
 57 COG1645 Uncharacterized Zn-fin  24.9      23 0.00058   16.9   0.2   23     21-43 28-52 (131)
 58 KOG1842 consensus               24.7      15 0.00039   17.9  -0.7   11     36-46 205-215 (505)
 59 KOG2613 consensus               23.5      46  0.0012   15.2   1.6   71    24-100 17-90 (502)
 60 TIGR02688 TIGR02688 conserved   23.2      62  0.0016   14.5   2.2   19      5-23 417-435 (470)
 61 COG1644 RPB10 DNA-directed RNA  23.0      22 0.00057   17.0  -0.1   15     19-33 2-16 (63)
 62 PRK07220 DNA topoisomerase I;   23.0      59  0.0015   14.6   2.0   83    23-105 591-711 (740)
 63 TIGR01054 rgy reverse gyrase;   23.0      13 0.00034   18.2  -1.3   31     16-46 2-36 (1843)
 64 pfam08274 PhnA_Zn_Ribbon PhnA   22.8      29 0.00074   16.3   0.4   24     22-45 3-29 (30)
 65 COG5175 MOT2 Transcriptional r  22.7      32 0.00081   16.1   0.6   41     23-63 16-66 (480)
 66 COG0199 RpsN Ribosomal protein  22.7      31 0.00079   16.2   0.6   24     23-46 23-48 (61)
 67 COG1110 Reverse gyrase [DNA re  21.7      26 0.00066   16.6  -0.0   31     16-46 3-37 (1187)
 68 PRK04023 DNA polymerase II lar  21.7      25 0.00065   16.7  -0.1   52     19-75 631-682 (1128)
 69 pfam11290 DUF3090 Protein of u  21.2      45  0.0012   15.3   1.1   13     50-62 154-166 (171)
 70 COG5220 TFB3 Cdk activating ki  21.1      41
0.001 15.5 0.9 37 22-58 11-61 (314) 71 PRK03564 formate dehydrogenase 20.5 72 0.0018 14.1 5.7 56 4-60 160-236 (307) 72 cd00162 RING RING-finger (Real 20.4 54 0.0014 14.8 1.4 37 23-59 1-44 (45) 73 COG4306 Uncharacterized protei 20.1 46 0.0012 15.2 1.0 10 51-60 69-78 (160) No 1 >PRK11595 gluconate periplasmic binding protein; Provisional Probab=99.90 E-value=8.1e-24 Score=152.33 Aligned_cols=99 Identities=17% Similarity=0.237 Sum_probs=89.4 Q ss_pred HHCCCCCCCCCCCCC-CCCEECHHHHHHCCCCCCCCCCCCCCCCCCCC-CCHHHHCCCCCHHHHHHHHCCCHHHHHHHHH Q ss_conf 857874710375156-57702888995086767862012444653347-6089851678435301200021279999999 Q gi|254780310|r 18 CIYPSICPIYSRIIN-LRFCLCGHCWSKIHFITATEHILKNNKDNIDK-DPLKSMQKDLPLTQIRSVTLYCDMSCVLVRL 95 (119) Q Consensus 18 ~lfP~~C~~C~~~~~-~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~-~C~~C~~~~~~f~~~~a~~~Y~~~~r~lI~~ 95 (119) ++||++|++|+..+. ++..||.+|+++++++ ++.|++||.|..... .|++|...+|+|++.+++|.|+++++++||+ T Consensus 2 L~~P~~C~~C~~~l~~~~~~lC~~C~~~l~~~-~~~C~~Cg~p~~~~~~~C~~C~~~~p~~~~~~a~~~Y~~~~~~lI~~ 80 (227) T PRK11595 2 LTVPGLCWLCRMPLALSHWGICSVCSRALRTL-PTLCPQCGLPATHTHLPCGRCLQKPPPWQRLVFVSDYAPPLSGLIHQ 80 (227) T ss_pred CCCCCCCCCCCCCCCCCCCCCCHHHHCCCCCC-CCCCCCCCCCCCCCCCCCHHHHCCCCCHHHEEHHHHCCHHHHHHHHH T ss_conf 66898471369961138784466676008766-67575136967656777779881996566620343037799999999 Q ss_pred HHHCCCHHHHHHHHHHHHHHHH Q ss_conf 7447876699999999999974 Q gi|254780310|r 96 LKYHDRTDLAIMMAQWMFRVLE 117 (119) Q Consensus 96 ~Ky~~~~~la~~la~~m~r~~~ 117 (119) |||+|+.+++..||++|+..++ T Consensus 81 ~Ky~~~~~l~~~la~~l~~~~~ 102 (227) T PRK11595 81 LKFSRRSELASALARLLLLEVL 102 (227) T ss_pred HHHCCCHHHHHHHHHHHHHHHH T ss_conf 8738867899999999999999 No 2 >COG1040 ComFC Predicted amidophosphoribosyltransferases [General function prediction only] Probab=99.65 E-value=8.8e-17 Score=113.36 Aligned_cols=101 Identities=20% Similarity=0.274 Sum_probs=93.2 Q ss_pred HHHHHHCCCCCCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCCCCC-CCCHHHHCCCCCHHHHHHHHCCCHHHHHH Q ss_conf 
99998578747103751565770288899508676786201244465334-76089851678435301200021279999 Q gi|254780310|r 14 ELFHCIYPSICPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKDNID-KDPLKSMQKDLPLTQIRSVTLYCDMSCVL 92 (119) Q Consensus 14 ~ll~~lfP~~C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~-~~C~~C~~~~~~f~~~~a~~~Y~~~~r~l 92 (119) .. ..++|+.|..|...+..+ .+|..|++.++++.. .|+.||.+.... ..|+.|...+++|++.++++.|+++++++ T Consensus 5 ~~-~~~~~~~~~~~~~~l~~~-~~C~~C~~~~~~~~~-~C~~C~~~l~~~~~~~~~~~~~~~~~~~~~~~~~Y~~~l~~~ 81 (225) T COG1040 5 LC-RLLLPPRCWLCLLLLFFP-GLCSGCQADLPLIGN-LCPLCGLPLSSHACRCGECLAKPPPFERLRSLGSYNGPLREL 81 (225) T ss_pred HH-CCCCCCCHHHHHHHCCCC-CCCHHHHHHHHHHHH-HHHHHHCCCCCCCCCCHHHHCCCCCCCCEEEEEEECHHHHHH T ss_conf 52-546662019887653689-869455423467762-757875514443344888853588530258888731999999 Q ss_pred HHHHHHCCCHHHHHHHHHHHHHHHH Q ss_conf 9997447876699999999999974 Q gi|254780310|r 93 VRLLKYHDRTDLAIMMAQWMFRVLE 117 (119) Q Consensus 93 I~~~Ky~~~~~la~~la~~m~r~~~ 117 (119) |+++||+++..++..||.||++++. T Consensus 82 i~~~Kf~~~~~l~~~la~~l~~~~~ 106 (225) T COG1040 82 ISQLKFQGDLDLAKLLARLLAKALD 106 (225) T ss_pred HHHHHHCCCHHHHHHHHHHHHHHHH T ss_conf 9986526717789999999999865 No 3 >TIGR00201 comF comF family protein; InterPro: IPR005222 Proteins in this family are found in bacterial species which posses systems for natural transformation with exogenous DNA (eg Bacillus subtilis, Haemophilus influenzae), and also species without these systems (eg Escherichia coli). Competence protein F has been shown to be important for the uptake of exogenous DNA in naturally competent bacteria, though the precise role of this protein is not yet known , . GntX is a periplasmic gluconate binding protein thought to be part of a high-affinity gluconate transport system .. 
Probab=99.30 E-value=1e-12 Score=90.81 Aligned_cols=82 Identities=17% Similarity=0.286 Sum_probs=71.9 Q ss_pred CCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHCCCCCHHHHHHHHCCCHHHHHHHHHHHHCCCHH Q ss_conf 71037515657702888995086767862012444653347608985167843530120002127999999974478766 Q gi|254780310|r 24 CPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKDNIDKDPLKSMQKDLPLTQIRSVTLYCDMSCVLVRLLKYHDRTD 103 (119) Q Consensus 24 C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~~C~~C~~~~~~f~~~~a~~~Y~~~~r~lI~~~Ky~~~~~ 103 (119) |++||+.+.+...+|.+|......+ ...|+.|+..+|+|++.+.++.|.+++..+|++|||.++.+ T Consensus 1 C~~C~k~~~S~~a~C~~C~~~~t~~--------------~~~CG~~L~~~P~W~~l~~v~~Y~~~l~~li~~fKf~~~~~ 66 (207) T TIGR00201 1 CSLCGKRIKSSKALCDQCGSERTLF--------------RDSCGLCLKQNPSWDKLVSVFEYKEPLKELISRFKFDAQAE 66 (207) T ss_pred CCCCCCCCCCCCCCCCCCCCHHHHH--------------HCCCCCCCCCCCCCCCEEEEEECCHHHHHHHHHHCCCCHHH T ss_conf 9888887300776246666246664--------------11023013787581303777605434789999733450368 Q ss_pred HHHHHHHHHHHHHHCC Q ss_conf 9999999999997429 Q gi|254780310|r 104 LAIMMAQWMFRVLEKI 119 (119) Q Consensus 104 la~~la~~m~r~~~~i 119 (119) ++..|+..++.++.+| T Consensus 67 I~~~L~~~l~~~~~~~ 82 (207) T TIGR00201 67 IARALASLLALTVSKA 82 (207) T ss_pred HHHHHHHHHHHHHHHC T ss_conf 9999889999998603 No 4 >TIGR00143 hypF [NiFe] hydrogenase maturation protein HypF; InterPro: IPR004421 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesized as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins , , . Members of the HypF family are accessory proteins involved in hydrogenase maturation. They contain the following domains: acylphosphatase, zinc fingers (2 repeats), a YrdC-like domain, and a C-terminal domain with a putative O-carbamoyltransferase motif. The presence of CO and CN- ligands of the active site iron atoms is essential for [NiFe]-hydrogenase enzyme activity . 
Both ligands have been suggested to originate from carbamoylphosphate, which is required for maturation of [NiFe]-hydrogenases. Escherichia coli HypF interacts with carbamoylphosphate as a substrate and releases inorganic phosphate. In addition, HypF also cleaves ATP into AMP and pyrophosphate in the presence of carbamoylphosphate. This, and the fact that HypF catalyzes a carbamoylphosphate-dependent pyrophosphate ATP exchange reaction, suggest that the protein catalyzes the activation of carbamoylphosphate. The mechanism of action of HypF, as well as of its individual domains, is not yet clear. Mutations in any of the three major signature motifs, the acylphosphatase, the zinc fingers, and the O-carbamoyltransferase motif, can block carbamoylphosphate phosphatase activity. This indicates an integrated cooperativity between these domains in the cleavage reaction. The N-terminal acylphosphatase (ACP) domain is thought to support the conversion of carbamoylphosphate into CO and CN-. Biochemical results demonstrating its ACP activity are not available. ACPs are small enzymes that specifically catalyze the hydrolysis of carboxylphosphate bonds in acylphosphates, including carbamoylphosphate. Zinc fingers have been implicated in bivalent cation binding or as part of a chaperone domain interacting with the large subunit precursor, but experimental studies on such a function are lacking thus far. The YrdC-like domain is present in protein families with regulatory functions (IPR012200 from INTERPRO, IPR010923 from INTERPRO) and has been implicated in RNA binding. It is not clear what function it may have in members of the HypF family. A C-terminal domain is distantly related to peptidase M22, but contains a conserved O-carbamoyltransferase motif required for the carbamoylphosphate phosphatase activity. The function of this domain is not clear.
Nomenclature note: the following names are used as synonyms of HypF: HupY in Azotobacter chroococcum, HupN in Rhizobium leguminosarum, HydA in Escherichia coli. In other organisms, these names are used to designate various "hydrogenase cluster" proteins unrelated to the members of this family. ; GO: 0030528 transcription regulator activity. Probab=91.03 E-value=0.27 Score=27.54 Aligned_cols=71 Identities=20% Similarity=0.122 Sum_probs=41.7 Q ss_pred HHCCC-CCCCCCCC---C---C--------CCCEECHHHHHHCCCC-------CCCCCCCCCCCCCCCCCCHHHHCCCCC Q ss_conf 85787-47103751---5---6--------5770288899508676-------786201244465334760898516784 Q gi|254780310|r 18 CIYPS-ICPIYSRI---I---N--------LRFCLCGHCWSKIHFI-------TATEHILKNNKDNIDKDPLKSMQKDLP 75 (119) Q Consensus 18 ~lfP~-~C~~C~~~---~---~--------~~~~lC~~C~~~l~~i-------~~~~C~~Cg~~~~~~~~C~~C~~~~~~ 75 (119) ++||= .|.-||-. + + .+.+||++|.++-.-. .+..||+||-.+.... T Consensus 120 Y~YPF~~CT~CGPRfTi~~aLPYDRe~T~m~~FpLC~~C~~EY~dP~DRRFHAQ~~aCP~CGP~L~f~~----------- 188 (799) T TIGR00143 120 YLYPFISCTDCGPRFTIIEALPYDRENTSMADFPLCPDCEKEYKDPLDRRFHAQAIACPRCGPKLEFVS----------- 188 (799) T ss_pred CCCCCCCCCCCCCCHHHHHCCCCCCCCCCCCCCCCCHHHHHHCCCCCCCEEEECCCCCCCCCCCCCEEC----------- T ss_conf 127432435567525676427888874334578988468997078876304644627733578653021----------- Q ss_pred HHHHHHHHCCCHHHHHHHHHHHHCC Q ss_conf 3530120002127999999974478 Q gi|254780310|r 76 LTQIRSVTLYCDMSCVLVRLLKYHD 100 (119) Q Consensus 76 f~~~~a~~~Y~~~~r~lI~~~Ky~~ 100 (119) +.......-+++++..|.++|-++ T Consensus 189 -~~~~vIae~~~al~~a~~~L~~G~ 212 (799) T TIGR00143 189 -RGGEVIAEKDDALKEAAKLLKKGK 212 (799) T ss_pred -CCCEEEECCCHHHHHHHHHHCCCC T ss_conf -687167527736899999840797 No 5 >pfam04981 NMD3 NMD3 family. The NMD3 protein is involved in nonsense mediated mRNA decay. This amino terminal region contains four conserved CXXC motifs that could be metal binding. NMD3 is involved in export of the 60S ribosomal subunit is mediated by the adapter protein Nmd3p in a Crm1p-dependent pathway. 
Probab=85.15 E-value=0.46 Score=26.30 Aligned_cols=36 Identities=25% Similarity=0.253 Sum_probs=24.6 Q ss_pred CCCCCCCCCC-CCEECHHHHHHCCCCCC-------CCCCCCCCC Q ss_conf 7103751565-77028889950867678-------620124446 Q gi|254780310|r 24 CPIYSRIINL-RFCLCGHCWSKIHFITA-------TEHILKNNK 59 (119) Q Consensus 24 C~~C~~~~~~-~~~lC~~C~~~l~~i~~-------~~C~~Cg~~ 59 (119) |+.||.+.+. ...+|.+|..+-..+.. ..|+.||.- T Consensus 1 C~~CG~~~~~~~~~mC~~C~~~~~~i~~ip~~~~v~~C~~Cg~~ 44 (237) T pfam04981 1 CPRCGRPIEPLIDGLCPDCYRERVDITEIPEELTVVVCRDCGRY 44 (237) T ss_pred CCCCCCCCCCCCCCCCHHHHHCCCCEEECCCEEEEEECCCCCCE T ss_conf 98579989977356666787105765998985899999999879 No 6 >COG1499 NMD3 NMD protein affecting ribosome stability and mRNA decay [Translation, ribosomal structure and biogenesis] Probab=74.84 E-value=2.1 Score=22.66 Aligned_cols=37 Identities=22% Similarity=0.348 Sum_probs=25.1 Q ss_pred CCCCCCCCCC-CCCEECHHHHHHCCC-CC------CCCCCCCCCC Q ss_conf 4710375156-577028889950867-67------8620124446 Q gi|254780310|r 23 ICPIYSRIIN-LRFCLCGHCWSKIHF-IT------ATEHILKNNK 59 (119) Q Consensus 23 ~C~~C~~~~~-~~~~lC~~C~~~l~~-i~------~~~C~~Cg~~ 59 (119) .|+.||+..+ ....+|.+|.-+-.. +. --+|..||.. 
T Consensus 8 ~C~~CGr~~~~~~~~lC~dC~~~~~~~~~ip~~~~v~~C~~Cga~ 52 (355) T COG1499 8 LCVRCGRSVDPLIDGLCGDCYVETTPLIEIPDEVNVEVCRHCGAY 52 (355) T ss_pred EECCCCCCCCHHHHCCCHHHHHCCCCCCCCCCCEEEEECCCCCCC T ss_conf 715688968645505467787504622038873578987767870 No 7 >PRK11788 hypothetical protein; Provisional Probab=73.06 E-value=3.2 Score=21.62 Aligned_cols=44 Identities=16% Similarity=0.336 Sum_probs=26.7 Q ss_pred HHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEECHHH--HHHCCCC Q ss_conf 9999999999999857874710375156577028889--9508676 Q gi|254780310|r 5 IQTVKSIIIELFHCIYPSICPIYSRIINLRFCLCGHC--WSKIHFI 48 (119) Q Consensus 5 ~~~ik~~~~~ll~~lfP~~C~~C~~~~~~~~~lC~~C--~~~l~~i 48 (119) ++.++.++.+.+.--..=+|--||=....-.+-|+.| |..+.++ T Consensus 338 l~~l~~~v~~~~~~~~~Y~C~~CGF~~~~~~WqCPsC~~W~Si~P~ 383 (389) T PRK11788 338 LELLRDLVGEQLKRKPRYRCRNCGFTARTLYWHCPSCKAWETIKPI 383 (389) T ss_pred HHHHHHHHHHHHHCCCCEECCCCCCCCCCEEEECCCCCCCCCCCCC T ss_conf 9999999999971799976999999888314579099986784898 No 8 >pfam09889 DUF2116 Uncharacterized protein containing a Zn-ribbon (DUF2116). This domain, found in various hypothetical archaeal proteins, has no known function. Probab=67.99 E-value=2.2 Score=22.51 Aligned_cols=24 Identities=17% Similarity=0.408 Sum_probs=14.8 Q ss_pred CCCCCCCCCCCCCCEECHH-HHHHC Q ss_conf 7471037515657702888-99508 Q gi|254780310|r 22 SICPIYSRIINLRFCLCGH-CWSKI 45 (119) Q Consensus 22 ~~C~~C~~~~~~~~~lC~~-C~~~l 45 (119) .+|++||..++.+..+|++ |..++ T Consensus 4 kHC~vCG~~Ipp~e~fCS~kC~~~~ 28 (59) T pfam09889 4 KHCIVCGTAIPPDESFCSEKCQEEY 28 (59) T ss_pred CCCCCCCCCCCCCCCCCCHHHHHHH T ss_conf 5356579978943223148899999 No 9 >TIGR00599 rad18 DNA repair protein rad18; InterPro: IPR004580 During DNA replication, lesion bypass is an important cellular response to unrepaired damage in the genome. In the yeast Saccharomyces cerevisiae, Rad6 and Rad18 are required for both the error-free and error-prone lesion bypass mechanisms. 
The RAD18 gene encodes a RING-finger protein with single-stranded DNA binding activity that interacts with the ubiquitin-conjugating enzyme RAD6. ; GO: 0003684 damaged DNA binding, 0006281 DNA repair, 0005634 nucleus. Probab=67.72 E-value=3.7 Score=21.27 Aligned_cols=59 Identities=15% Similarity=0.154 Sum_probs=36.1 Q ss_pred CCCCCCCCCC-----C-CCEECHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHCCCCCHHHHHHHHCCCHHHHHHHHHH Q ss_conf 4710375156-----5-770288899508676786201244465334760898516784353012000212799999997 Q gi|254780310|r 23 ICPIYSRIIN-----L-RFCLCGHCWSKIHFITATEHILKNNKDNIDKDPLKSMQKDLPLTQIRSVTLYCDMSCVLVRLL 96 (119) Q Consensus 23 ~C~~C~~~~~-----~-~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~~C~~C~~~~~~f~~~~a~~~Y~~~~r~lI~~~ 96 (119) ||.+|..... + .+-+|.-|.+.- +-+++-||.|-.++-. +-.+-+-.++++|-.| T Consensus 29 RC~iCkdFf~~P~lTsC~HTFCSLCIR~~-L~~~p~CP~Cr~~~qE------------------s~LR~n~~l~E~vesF 89 (421) T TIGR00599 29 RCHICKDFFDAPVLTSCSHTFCSLCIRRC-LSEEPKCPLCRAEDQE------------------SKLRKNWVLEEIVESF 89 (421) T ss_pred HHHHHHHHHCCCCCCCCCCCCCHHHHHHH-HCCCCCCCCCCCCHHH------------------HHHHHHHHHHHHHHHH T ss_conf 35676898468830488632003688776-1478888736770456------------------6667889999999888 Q ss_pred HHCC Q ss_conf 4478 Q gi|254780310|r 97 KYHD 100 (119) Q Consensus 97 Ky~~ 100 (119) |--. T Consensus 90 k~~R 93 (421) T TIGR00599 90 KNLR 93 (421) T ss_pred HHHH T ss_conf 7466 No 10 >KOG2041 consensus Probab=64.64 E-value=2.8 Score=21.94 Aligned_cols=45 Identities=13% Similarity=0.130 Sum_probs=32.9 Q ss_pred CCCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCCCCCC--CCHHHHCC Q ss_conf 471037515657702888995086767862012444653347--60898516 Q gi|254780310|r 23 ICPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKDNIDK--DPLKSMQK 72 (119) Q Consensus 23 ~C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~--~C~~C~~~ 72 (119) .|..||..++....-|++|..++| .|..-|.|.+... .|..|... 
T Consensus 1119 dc~~cg~~i~~~~~~c~ec~~kfP-----~CiasG~pIt~~~fWlC~~CkH~ 1165 (1189) T KOG2041 1119 DCSVCGAKIDPYDLQCSECQTKFP-----VCIASGRPITDNIFWLCPRCKHR 1165 (1189) T ss_pred EEEECCCCCCCCCCCCHHHCCCCC-----EEECCCCCCCCCEEEECCCCCCC T ss_conf 243048847966777733337676-----36505975355538974633355 No 11 >pfam10764 Gin Inhibitor of sigma-G Gin. Gin allows sigma-F to delay late forespore transcription by preventing sigma-G to take over before the cell has reached a critical stage of development. Gin is also known as CsfB. Probab=64.36 E-value=3.6 Score=21.36 Aligned_cols=27 Identities=22% Similarity=0.314 Sum_probs=20.2 Q ss_pred CCCCCCCCCCC-----CCEECHHHHHHCCCCC Q ss_conf 47103751565-----7702888995086767 Q gi|254780310|r 23 ICPIYSRIINL-----RFCLCGHCWSKIHFIT 49 (119) Q Consensus 23 ~C~~C~~~~~~-----~~~lC~~C~~~l~~i~ 49 (119) .|++|+++... +..||.+|.++|.-++ T Consensus 1 ~CiIC~~~k~~GI~i~~~fIC~~CE~~iv~t~ 32 (46) T pfam10764 1 KCIICEKPKNEGIHLYGKFICTECEKKLINTE 32 (46) T ss_pred CEEECCCCCCCCEEEECCCCHHHHHHHHHCCC T ss_conf 93868983888789978791578899884289 No 12 >pfam03660 PHF5 PHF5-like protein. This family of proteins the superfamily of PHD-finger proteins. At least one example, from mouse, may act as a chromatin-associated protein. The S. pombe ini1 gene is essential, required for splicing. It is localized in the nucleus, but not detected in the nucleolus and can be complemented by human ini1. Probab=61.76 E-value=4 Score=21.06 Aligned_cols=50 Identities=14% Similarity=0.260 Sum_probs=34.2 Q ss_pred CCCCCCCCCCCC--CCEECHHHHHHCCCCCCCCCCCCCCCCCCC-CCCHHHHCCCC Q ss_conf 747103751565--770288899508676786201244465334-76089851678 Q gi|254780310|r 22 SICPIYSRIINL--RFCLCGHCWSKIHFITATEHILKNNKDNID-KDPLKSMQKDL 74 (119) Q Consensus 22 ~~C~~C~~~~~~--~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~-~~C~~C~~~~~ 74 (119) .+|++|+..+.. ...||.+|--. .....|..||.+...+ .+|.+|...+. 
T Consensus 27 GkCpiCDS~Vrp~~~vrICdeCs~G---~~~~rCIiCg~~g~sdAYYC~eC~~lEK 79 (105) T pfam03660 27 GKCPICDSYVRPTTKVRICDECSFG---SLGNKCIICGSPGVSDAYYCWECVRLEK 79 (105) T ss_pred CCCCCCCCCCCCCCEEEECCCCCCC---CCCCCEEEECCCCCCCCHHHHHHHHHHC T ss_conf 8466454645765447898878888---7798569808988760210688886312 No 13 >PRK11032 hypothetical protein; Provisional Probab=59.49 E-value=2.2 Score=22.48 Aligned_cols=33 Identities=15% Similarity=0.331 Sum_probs=24.5 Q ss_pred CCCCCCCCEECHHHHHHCCCCCC---CCCCCCCCCC Q ss_conf 75156577028889950867678---6201244465 Q gi|254780310|r 28 SRIINLRFCLCGHCWSKIHFITA---TEHILKNNKD 60 (119) Q Consensus 28 ~~~~~~~~~lC~~C~~~l~~i~~---~~C~~Cg~~~ 60 (119) |..++.+..+|..|-.++.+.++ |.|+.||... T Consensus 117 GEivg~G~LvC~~Cg~~~~~~~~~~ippCp~Cg~~~ 152 (160) T PRK11032 117 GEVVGLGNLVCEKCHHHLAFYTPEVLPLCPKCGHDQ 152 (160) T ss_pred CEEEECCEEEHHHCCCEEEEECCCCCCCCCCCCCCE T ss_conf 605325566574289877874687798887799976 No 14 >pfam07295 DUF1451 Protein of unknown function (DUF1451). This family consists of several hypothetical bacterial proteins of around 160 residues in length. Members of this family contain four highly conserved cysteine resides toward the C-terminal region of the protein. The function of this family is unknown. Probab=57.43 E-value=7.5 Score=19.59 Aligned_cols=48 Identities=17% Similarity=0.203 Sum_probs=30.7 Q ss_pred HHHHHHHCCCCCCCCCCCCCCCCEECHHHHHHCCCCCC---CCCCCCCCCC Q ss_conf 99999857874710375156577028889950867678---6201244465 Q gi|254780310|r 13 IELFHCIYPSICPIYSRIINLRFCLCGHCWSKIHFITA---TEHILKNNKD 60 (119) Q Consensus 13 ~~ll~~lfP~~C~~C~~~~~~~~~lC~~C~~~l~~i~~---~~C~~Cg~~~ 60 (119) ..+.+=+=..-.---|..++.+..+|..|-.++.+..+ |.|+.||... 
T Consensus 92 ~el~~dl~h~g~Y~sGEvvg~G~LvC~~Cg~~~~~~~p~~ip~Cp~Cg~~~ 142 (148) T pfam07295 92 HELFQDLEHHGVYQSGEIVGLGTLVCENCGHMLTFYHPSVIPPCPKCGHTE 142 (148) T ss_pred HHHHHHHHCCCEEECCCEEECCEEEECCCCCEEEEECCCCCCCCCCCCCCE T ss_conf 999987541572424605416457723689878874687688987799982 No 15 >pfam07191 DUF1407 Protein of unknown function (DUF1407). This family consists of several short, hypothetical bacterial proteins of around 70 residues in length. Members of this family have 8 highly conserved cysteine residues, which form two zinc ribbon domains. Probab=56.95 E-value=1.3 Score=23.72 Aligned_cols=38 Identities=13% Similarity=0.230 Sum_probs=26.6 Q ss_pred CCCCCCCCCC--CCCEECHHHHHHCCCCCCCCCCCCCCCCCC Q ss_conf 4710375156--577028889950867678620124446533 Q gi|254780310|r 23 ICPIYSRIIN--LRFCLCGHCWSKIHFITATEHILKNNKDNI 62 (119) Q Consensus 23 ~C~~C~~~~~--~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~ 62 (119) .|+.|+..++ ++..-|..|.+.+.. ..+||.|+.++.. T Consensus 3 ~CP~C~~~l~~~~~~~~C~~C~~~~~~--~a~CP~C~~~Lq~ 42 (70) T pfam07191 3 ICPQCQQELEWKGGHYHCDQCQKDFKK--QALCPDCHQELEV 42 (70) T ss_pred CCCCCCCCCEECCCCEECHHHCCEEEE--EEECCCCCCHHHH T ss_conf 288899952433997797033010147--8989762437899 No 16 >PRK05580 primosome assembly protein PriA; Validated Probab=55.88 E-value=2.8 Score=21.97 Aligned_cols=43 Identities=12% Similarity=0.169 Sum_probs=18.9 Q ss_pred CCCCCCCCCCCCEECHHHHHHCCCCC---CCCCCCCCCCCCCCCCCHHHHC Q ss_conf 71037515657702888995086767---8620124446533476089851 Q gi|254780310|r 24 CPIYSRIINLRFCLCGHCWSKIHFIT---ATEHILKNNKDNIDKDPLKSMQ 71 (119) Q Consensus 24 C~~C~~~~~~~~~lC~~C~~~l~~i~---~~~C~~Cg~~~~~~~~C~~C~~ 71 (119) |.-||.. .-|+.|-..|.+.. ...|..||........|..|-. 
T Consensus 407 C~~Cg~~-----~~C~~C~~~L~~h~~~~~l~Ch~Cg~~~~~~~~Cp~Cgs 452 (699) T PRK05580 407 CRDCGWV-----ARCPHCDGPLTLHRAGRRLRCHHCGYQEPIPRACPECGS 452 (699) T ss_pred CHHCCCE-----EECCCCCCEEEECCCCCCEECCCCCCCCCCCCCCCCCCC T ss_conf 4531994-----565678986342068983322646883657554656799 No 17 >KOG0978 consensus Probab=55.43 E-value=2.5 Score=22.19 Aligned_cols=40 Identities=13% Similarity=0.134 Sum_probs=28.9 Q ss_pred CCCCCCCC-----CC-CCCEECHHHHHHCCCCCCCCCCCCCCCCCC Q ss_conf 47103751-----56-577028889950867678620124446533 Q gi|254780310|r 23 ICPIYSRI-----IN-LRFCLCGHCWSKIHFITATEHILKNNKDNI 62 (119) Q Consensus 23 ~C~~C~~~-----~~-~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~ 62 (119) .|++|... +. =.+.+|..|....--+...-||.|+.+++. T Consensus 645 kCs~Cn~R~Kd~vI~kC~H~FC~~Cvq~r~etRqRKCP~Cn~aFga 690 (698) T KOG0978 645 KCSVCNTRWKDAVITKCGHVFCEECVQTRYETRQRKCPKCNAAFGA 690 (698) T ss_pred ECCCCCCCHHHHHHHHCCHHHHHHHHHHHHHHHCCCCCCCCCCCCC T ss_conf 2877667556689983206888998888998854879998888785 No 18 >COG4068 Uncharacterized protein containing a Zn-ribbon [Function unknown] Probab=54.96 E-value=4.9 Score=20.58 Aligned_cols=18 Identities=17% Similarity=0.385 Sum_probs=9.1 Q ss_pred CCCCCCCCCCCCCEECHH Q ss_conf 471037515657702888 Q gi|254780310|r 23 ICPIYSRIINLRFCLCGH 40 (119) Q Consensus 23 ~C~~C~~~~~~~~~lC~~ 40 (119) +|++||..++.+..+|++ T Consensus 10 HC~VCg~aIp~de~~CSe 27 (64) T COG4068 10 HCVVCGKAIPPDEQVCSE 27 (64) T ss_pred CCCCCCCCCCCCCCHHHH T ss_conf 566058868974036889 No 19 >pfam10977 DUF2797 Protein of unknown function (DUF2797). This family of proteins has no known function. Probab=54.76 E-value=6.6 Score=19.90 Aligned_cols=25 Identities=24% Similarity=0.376 Sum_probs=19.8 Q ss_pred CCCCCCCCCC--CCCEECHHHHHHCCC Q ss_conf 4710375156--577028889950867 Q gi|254780310|r 23 ICPIYSRIIN--LRFCLCGHCWSKIHF 47 (119) Q Consensus 23 ~C~~C~~~~~--~~~~lC~~C~~~l~~ 47 (119) .|+.||+.+. -....|..|+.+++. 
T Consensus 13 ~C~~CG~~t~ksf~qG~C~~Cf~~~p~ 39 (233) T pfam10977 13 GCLNCGRKTKKSFSQGYCYPCFSKLAQ 39 (233) T ss_pred EEECCCCCCCCCCCCCCCHHHHHCCCC T ss_conf 741168867656668578777522703 No 20 >KOG4080 consensus Probab=53.19 E-value=6.1 Score=20.09 Aligned_cols=26 Identities=31% Similarity=0.691 Sum_probs=18.4 Q ss_pred CCCCCCCCCCCCCCEECHHHHHHCCCC Q ss_conf 747103751565770288899508676 Q gi|254780310|r 22 SICPIYSRIINLRFCLCGHCWSKIHFI 48 (119) Q Consensus 22 ~~C~~C~~~~~~~~~lC~~C~~~l~~i 48 (119) .+|+.||..- ..+.||..|..++... T Consensus 94 ~~CP~CGh~k-~a~~LC~~Cy~kV~ke 119 (176) T KOG4080 94 NTCPACGHIK-PAHTLCDYCYAKVHKE 119 (176) T ss_pred CCCCCCCCCC-CCCCCHHHHHHHHHHH T ss_conf 4375457644-1452079999999999 No 21 >pfam04423 Rad50_zn_hook Rad50 zinc hook motif. The Mre11 complex (Mre11 Rad50 Nbs1) is central to chromosomal maintenance and functions in homologous recombination, telomere maintenance and sister chromatid association. The Rad50 coiled-coil region contains a dimer interface at the apex of the coiled coils in which pairs of conserved Cys-X-X-Cys motifs form interlocking hooks that bind one Zn ion. This alignment includes the zinc hook motif and a short stretch of coiled-coil on either side. Probab=52.67 E-value=14 Score=18.13 Aligned_cols=30 Identities=20% Similarity=0.339 Sum_probs=14.7 Q ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 89999999999999985787471037515657 Q gi|254780310|r 3 AIIQTVKSIIIELFHCIYPSICPIYSRIINLR 34 (119) Q Consensus 3 ~~~~~ik~~~~~ll~~lfP~~C~~C~~~~~~~ 34 (119) +-..-.+..+..+.+- ...|++||+.++.+ T Consensus 4 ~~~~~~~k~i~~l~~~--~~~CPvC~r~l~~e 33 (54) T pfam04423 4 SETEEYNKAIEELKEA--KGCCPVCGRPLDEE 33 (54) T ss_pred HHHHHHHHHHHHHHHH--CCCCCCCCCCCCHH T ss_conf 3789999999999984--58777669976678 No 22 >TIGR00575 dnlj DNA ligase, NAD-dependent; InterPro: IPR001679 DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two DNA fragments by catalyzing the formation of an internucleotide ester bond between phosphate and deoxyribose. 
It is active during DNA replication, DNA repair and DNA recombination. There are two forms of DNA ligase: one requires ATP (6.5.1.1 from EC), the other NAD (6.5.1.2 from EC). This family is predominantly composed of NAD-dependent bacterial DNA ligases. They are proteins of about 75 to 85 Kd whose sequence is well conserved , . They also show similarity to yicF, an Escherichia coli hypothetical protein of 63 Kd.; GO: 0003911 DNA ligase (NAD+) activity, 0006260 DNA replication, 0006281 DNA repair. Probab=50.77 E-value=9.6 Score=19.00 Aligned_cols=16 Identities=19% Similarity=0.451 Sum_probs=12.5 Q ss_pred HHCCCCCCCCCCCCCC Q ss_conf 8578747103751565 Q gi|254780310|r 18 CIYPSICPIYSRIINL 33 (119) Q Consensus 18 ~lfP~~C~~C~~~~~~ 33 (119) +.||.+||.|+..+.. T Consensus 408 ~~~P~~CP~C~s~lv~ 423 (706) T TIGR00575 408 IKFPTHCPSCGSPLVR 423 (706) T ss_pred EECCCCCCCCCCEEEC T ss_conf 3428718888833111 No 23 >TIGR00269 TIGR00269 conserved hypothetical protein TIGR00269; InterPro: IPR000541 The following uncharacterised proteins have been shown to share regions of similarities, yeast chromosome VII hypothetical protein YGL211w; Dictyostelium discoideum (Slime mold) protein veg136; and Methanococcus jannaschii hypothetical proteins MJ1157 and MJ1478.. 
Probab=49.79 E-value=8.4 Score=19.31 Aligned_cols=19 Identities=5% Similarity=-0.279 Sum_probs=9.1 Q ss_pred CCCCCCCCCCCCCCCCHHH Q ss_conf 6201244465334760898 Q gi|254780310|r 51 TEHILKNNKDNIDKDPLKS 69 (119) Q Consensus 51 ~~C~~Cg~~~~~~~~C~~C 69 (119) ..|.+||.|-+-+..|..| T Consensus 82 ~~C~~CGeP~SPG~~CkaC 100 (106) T TIGR00269 82 RRCERCGEPASPGKICKAC 100 (106) T ss_pred CCCCCCCCCCCCCCHHHHH T ss_conf 5000147888875443666 No 24 >PRK00420 hypothetical protein; Validated Probab=48.76 E-value=15 Score=17.88 Aligned_cols=37 Identities=8% Similarity=0.158 Sum_probs=18.9 Q ss_pred HHHHHHHHHHHHHHHCCCCCCCCCCCCC---CCCEECHHHH Q ss_conf 9999999999999857874710375156---5770288899 Q gi|254780310|r 5 IQTVKSIIIELFHCIYPSICPIYSRIIN---LRFCLCGHCW 42 (119) Q Consensus 5 ~~~ik~~~~~ll~~lfP~~C~~C~~~~~---~~~~lC~~C~ 42 (119) ++.+-..+.+... ..+.+|+.||.++- ++..+|+.|- T Consensus 4 vK~~a~ll~~Ga~-ml~~~C~~Cg~plf~~k~G~~~Cp~cg 43 (107) T PRK00420 4 VKKAAELLRSGAK-MLDKHCPVCGLPLFELKDGEVVCPNHG 43 (107) T ss_pred HHHHHHHHHHHHH-HHHHHCCCCCCCEEECCCCCEECCCCC T ss_conf 9999999995777-635137657984057489877689898 No 25 >COG2888 Predicted Zn-ribbon RNA-binding protein with a function in translation [Translation, ribosomal structure and biogenesis] Probab=45.89 E-value=3.7 Score=21.27 Aligned_cols=24 Identities=25% Similarity=0.462 Sum_probs=15.2 Q ss_pred CCCCCCCCCCCCC-CCC---EECHHHHH Q ss_conf 7874710375156-577---02888995 Q gi|254780310|r 20 YPSICPIYSRIIN-LRF---CLCGHCWS 43 (119) Q Consensus 20 fP~~C~~C~~~~~-~~~---~lC~~C~~ 43 (119) .|+.|..||..+. .+. +.|+.|-+ T Consensus 8 ~~~~CtSCg~~i~p~e~~v~F~CPnCGe 35 (61) T COG2888 8 DPPVCTSCGREIAPGETAVKFPCPNCGE 35 (61) T ss_pred CCCEECCCCCEECCCCCEEEEECCCCCC T ss_conf 8852123787704687500864899982 No 26 >pfam10217 DUF2039 Uncharacterized conserved protein (DUF2039). This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown. 
Probab=45.63 E-value=4.7 Score=20.72 Aligned_cols=34 Identities=12% Similarity=0.326 Sum_probs=15.5 Q ss_pred CCCCCCCCC-CCC-CCCEECHHHHHHCCCCCCCCCCCCCCC Q ss_conf 874710375-156-577028889950867678620124446 Q gi|254780310|r 21 PSICPIYSR-IIN-LRFCLCGHCWSKIHFITATEHILKNNK 59 (119) Q Consensus 21 P~~C~~C~~-~~~-~~~~lC~~C~~~l~~i~~~~C~~Cg~~ 59 (119) |..|.-|+. -+. +=..||.+|..++. .|..|+.| T Consensus 55 p~kC~kC~qktVk~AYH~iC~~Ca~~~~-----~CaKC~k~ 90 (92) T pfam10217 55 PKKCNKCQQKTVRHAYHHICDDCAKELK-----VCAKCQKP 90 (92) T ss_pred CCCCHHCCCCHHHHHHHHHHHHHHHHHH-----HCCCCCCC T ss_conf 7402221113699999998899998754-----37366999 No 27 >pfam04216 FdhE Protein involved in formate dehydrogenase formation. The function of these proteins is unknown. They may possibly be involved in the formation of formate dehydrogenase. Probab=43.55 E-value=26 Score=16.60 Aligned_cols=56 Identities=14% Similarity=0.339 Sum_probs=34.9 Q ss_pred HHHHHHHHHHHHHHHHC-----C-----CCCCCCCCC-C-----CC-----CCEECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 99999999999999857-----8-----747103751-5-----65-----770288899508676786201244465 Q gi|254780310|r 4 IIQTVKSIIIELFHCIY-----P-----SICPIYSRI-I-----NL-----RFCLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 4 ~~~~ik~~~~~ll~~lf-----P-----~~C~~C~~~-~-----~~-----~~~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) +...+.-.+..+...+. | ..|++||.. + .. ....|.-|..+.++ ....|+.||... 
T Consensus 139 i~AaLqv~~a~~A~~l~~~~~~~~~~~~~~CPvCGs~P~~s~~~~~~~~G~Ryl~Cs~C~teW~~-~R~~C~~Cg~~~ 215 (283) T pfam04216 139 LWAALQLYWAQLAQQLDARALPEAGWQRGLCPVCGSAPVASVIRGGGAQGLRYLHCSLCETEWHF-VRVKCTNCGSTK 215 (283) T ss_pred HHHHHHHHHHHHHHCCCCHHCCCCCCCCCCCCCCCCCCHHHEEECCCCCCCEEEECCCCCCCCCC-CCCCCCCCCCCC T ss_conf 99999999999996288010377776589699999810001131378788368865888783242-265479999999 No 28 >COG2956 Predicted N-acetylglucosaminyl transferase [Carbohydrate transport and metabolism] Probab=41.80 E-value=21 Score=17.11 Aligned_cols=42 Identities=14% Similarity=0.354 Sum_probs=27.2 Q ss_pred HHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCEECHHH--HHHCC Q ss_conf 9999999999999857874710375156577028889--95086 Q gi|254780310|r 5 IQTVKSIIIELFHCIYPSICPIYSRIINLRFCLCGHC--WSKIH 46 (119) Q Consensus 5 ~~~ik~~~~~ll~~lfP~~C~~C~~~~~~~~~lC~~C--~~~l~ 46 (119) +..+++++.+-+.-..+=+|--||=....-.+=|++| |..+. T Consensus 338 L~~lr~mvgeql~~~~~YRC~~CGF~a~~l~W~CPsC~~W~Tik 381 (389) T COG2956 338 LDLLRDMVGEQLRRKPRYRCQNCGFTAHTLYWHCPSCRAWETIK 381 (389) T ss_pred HHHHHHHHHHHHHHCCCCEECCCCCCHHEEEEECCCCCCCCCCC T ss_conf 99999999999731677210016863101354188756522417 No 29 >COG5432 RAD18 RING-finger-containing E3 ubiquitin ligase [Signal transduction mechanisms] Probab=40.58 E-value=13 Score=18.23 Aligned_cols=37 Identities=19% Similarity=0.287 Sum_probs=26.7 Q ss_pred CCCCCCCCCC------CCCEECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 4710375156------5770288899508676786201244465 Q gi|254780310|r 23 ICPIYSRIIN------LRFCLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 23 ~C~~C~~~~~------~~~~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) +|.+|+..+. =++.+|.-|... +.-+.|.||.|..+. 
T Consensus 27 rC~IC~~~i~ip~~TtCgHtFCslCIR~-hL~~qp~CP~Cr~~~ 69 (391) T COG5432 27 RCRICDCRISIPCETTCGHTFCSLCIRR-HLGTQPFCPVCREDP 69 (391) T ss_pred HHHHHHHEEECCEECCCCCCHHHHHHHH-HHCCCCCCCCCCCCH T ss_conf 7654011040412225665065889998-726799985102657 No 30 >KOG4265 consensus Probab=39.69 E-value=11 Score=18.77 Aligned_cols=41 Identities=15% Similarity=0.314 Sum_probs=27.5 Q ss_pred CCCCCCCCCCCC-------CCCEECHHHHHHCCCCCCCCCCCCCCCCCC Q ss_conf 874710375156-------577028889950867678620124446533 Q gi|254780310|r 21 PSICPIYSRIIN-------LRFCLCGHCWSKIHFITATEHILKNNKDNI 62 (119) Q Consensus 21 P~~C~~C~~~~~-------~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~ 62 (119) +..|++|-...- ..--+|.+|...|.+ ...-||.|..|... T Consensus 290 gkeCVIClse~rdt~vLPCRHLCLCs~Ca~~Lr~-q~n~CPICRqpi~~ 337 (349) T KOG4265 290 GKECVICLSESRDTVVLPCRHLCLCSGCAKSLRY-QTNNCPICRQPIEE 337 (349) T ss_pred CCEEEEEECCCCCEEEECCHHHEHHHHHHHHHHH-HHCCCCCCCCCHHH T ss_conf 8705997458865388504221301758999987-61599712364475 No 31 >pfam10571 UPF0547 Uncharacterized protein family UPF0547. This domain contains a zinc-ribbon motif. Probab=39.05 E-value=9 Score=19.14 Aligned_cols=22 Identities=14% Similarity=0.190 Sum_probs=11.8 Q ss_pred CHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 88899508676786201244465 Q gi|254780310|r 38 CGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 38 C~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) |++|.+.++ +....|+.||..+ T Consensus 3 CP~C~~~vp-~~~~~Cp~CG~~F 24 (26) T pfam10571 3 CPECGAEVP-LAAKICPHCGYEF 24 (26) T ss_pred CCCCCCCCC-HHCCCCCCCCCCC T ss_conf 875548364-0034477888555 No 32 >PRK08359 transcription factor; Validated Probab=38.36 E-value=15 Score=17.89 Aligned_cols=26 Identities=27% Similarity=0.673 Sum_probs=18.2 Q ss_pred CCCCCCCCCCCCCC----------CEECHHHHHHCC Q ss_conf 87471037515657----------702888995086 Q gi|254780310|r 21 PSICPIYSRIINLR----------FCLCGHCWSKIH 46 (119) Q Consensus 21 P~~C~~C~~~~~~~----------~~lC~~C~~~l~ 46 (119) |..|=+||..+... 
-.+|.+|..++- T Consensus 6 ~~yCEiCG~~i~g~~~~v~ieGael~VC~~C~~K~g 41 (175) T PRK08359 6 PKYCELCGREIRGPGHRIRIEGAELLVCDDCYRKYG 41 (175) T ss_pred CCEEECCCCCCCCCCEEEEECCEEEHHHHHHHHHHC T ss_conf 865447998034880699989988725767899868 No 33 >PRK10410 hypothetical protein; Provisional Probab=38.27 E-value=20 Score=17.26 Aligned_cols=73 Identities=10% Similarity=0.069 Sum_probs=38.7 Q ss_pred HHHHHHHHHHHHCCCCCCCCCCCCCCCCEECHHHHHHCCC-----------CCCCCCCCCCCCCCCCCCCHHHHCCCCCH Q ss_conf 9999999999857874710375156577028889950867-----------67862012444653347608985167843 Q gi|254780310|r 8 VKSIIIELFHCIYPSICPIYSRIINLRFCLCGHCWSKIHF-----------ITATEHILKNNKDNIDKDPLKSMQKDLPL 76 (119) Q Consensus 8 ik~~~~~ll~~lfP~~C~~C~~~~~~~~~lC~~C~~~l~~-----------i~~~~C~~Cg~~~~~~~~C~~C~~~~~~f 76 (119) =+.++..++.+ .|..-.. ...||++|.+-+.+ .+-|+|..|- ..|. .+... T Consensus 9 E~~ti~~MI~l-------YC~~~H~-~~~lC~eC~~L~~YA~~Rl~~Cp~ge~Kp~C~~C~------iHCY----~p~~r 70 (114) T PRK10410 9 EKKTIKKMIRL-------YCKKHHQ-ASALCPECEELLEYAQKRLDKCPFGEEKPTCKQCP------VHCY----KPAKR 70 (114) T ss_pred HHHHHHHHHHH-------HHHHCCC-CCCCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCC------CCCC----CHHHH T ss_conf 99999999999-------9982688-88888999999999999996399999998888899------7779----98999 Q ss_pred HHHHHHHCCCHH---HHHHHHHHHH Q ss_conf 530120002127---9999999744 Q gi|254780310|r 77 TQIRSVTLYCDM---SCVLVRLLKY 98 (119) Q Consensus 77 ~~~~a~~~Y~~~---~r~lI~~~Ky 98 (119) .+.+.+..|.|+ .++-|..+++ T Consensus 71 ~~ir~VMr~sGPRMl~~hPi~ai~H 95 (114) T PRK10410 71 EKIKQIMRWSGPRMLWRHPILAVRH 95 (114) T ss_pred HHHHHHHHHCCCCHHHHHHHHHHHH T ss_conf 9999999861740888679999999 No 34 >COG0068 HypF Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones] Probab=38.12 E-value=29 Score=16.32 Aligned_cols=12 Identities=25% Similarity=0.664 Sum_probs=5.3 Q ss_pred CCEECHHHHHHC Q ss_conf 770288899508 Q gi|254780310|r 34 RFCLCGHCWSKI 45 (119) Q Consensus 34 ~~~lC~~C~~~l 45 (119) +..+|++|..++ T Consensus 100 D~a~C~~Cl~Ei 111 (750) T COG0068 100 
DAATCEDCLEEI 111 (750) T ss_pred CHHHHHHHHHHH T ss_conf 502149999985 No 35 >COG2995 PqiA Uncharacterized paraquat-inducible protein A [Function unknown] Probab=37.87 E-value=13 Score=18.20 Aligned_cols=27 Identities=11% Similarity=-0.001 Sum_probs=18.3 Q ss_pred CCEECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 770288899508676786201244465 Q gi|254780310|r 34 RFCLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 34 ~~~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) ....|..|....+.-.++.|+|||.+. T Consensus 219 ~~~~C~~C~~~~~~~~~~~CpRC~~~L 245 (418) T COG2995 219 GLRSCLCCHYILPHDAEPRCPRCGSKL 245 (418) T ss_pred CCEECCCCCCCCCHHHCCCCCCCCCHH T ss_conf 213536343447875578888778802 No 36 >KOG1512 consensus Probab=37.47 E-value=15 Score=17.98 Aligned_cols=55 Identities=16% Similarity=0.097 Sum_probs=33.0 Q ss_pred CCCCCCCCCC-CCCEECHHHHHHCCCCCCCCCCCCCCCCCCCCCCH-HHHCCCCCHHHHHH Q ss_conf 4710375156-57702888995086767862012444653347608-98516784353012 Q gi|254780310|r 23 ICPIYSRIIN-LRFCLCGHCWSKIHFITATEHILKNNKDNIDKDPL-KSMQKDLPLTQIRS 81 (119) Q Consensus 23 ~C~~C~~~~~-~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~~C~-~C~~~~~~f~~~~a 81 (119) .|.+|+++.. .+..+|.-|-...+ .+|.--+.--.+...|+ .|....+++.+-.+ T Consensus 316 lC~IC~~P~~E~E~~FCD~CDRG~H----T~CVGL~~lP~G~WICD~~C~~~~~~t~R~~s 372 (381) T KOG1512 316 LCRICLGPVIESEHLFCDVCDRGPH----TLCVGLQDLPRGEWICDMRCREATLNTTRQSS 372 (381) T ss_pred HHHHCCCCCCHHHEEEEECCCCCCC----EEEEECCCCCCCCEEECCHHHHHCCCCCHHHH T ss_conf 1121388543012244200057662----35532264788766602377875689853666 No 37 >pfam06906 DUF1272 Protein of unknown function (DUF1272). This family consists of several hypothetical bacterial proteins of around 80 residues in length. This family contains a number of conserved cysteine residues and its function is unknown. 
Probab=37.36 E-value=25 Score=16.73 Aligned_cols=37 Identities=11% Similarity=0.059 Sum_probs=24.6 Q ss_pred CCCCCCCCCCCCCCC----------EECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 874710375156577----------0288899508676786201244465 Q gi|254780310|r 21 PSICPIYSRIINLRF----------CLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 21 P~~C~~C~~~~~~~~----------~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) -+.|--|++.+..+. -+|.+|...+- ...|+.||-.+ T Consensus 5 rpnCE~C~~dLppds~~A~ICsfECTFC~~C~~~~l---~~~CPNCgGel 51 (57) T pfam06906 5 RPNCECCDRDLPPDSPDARICSFECTFCADCAETRL---HGVCPNCGGEL 51 (57) T ss_pred CCCCCCCCCCCCCCCCCCEEEEEECEECHHHHHHHH---CCCCCCCCCCC T ss_conf 668655699899998887788785612678997886---68484998812 No 38 >pfam10083 DUF2321 Uncharacterized protein conserved in bacteria (DUF2321). Members of this family of hypothetical bacterial proteins have no known function. Probab=36.46 E-value=12 Score=18.37 Aligned_cols=11 Identities=0% Similarity=-0.423 Sum_probs=5.4 Q ss_pred CCCCCCCCCCC Q ss_conf 86201244465 Q gi|254780310|r 50 ATEHILKNNKD 60 (119) Q Consensus 50 ~~~C~~Cg~~~ 60 (119) +.+|..||.|+ T Consensus 68 PsyC~nCG~py 78 (158) T pfam10083 68 PSYCHNCGKPF 78 (158) T ss_pred CHHHHHCCCCC T ss_conf 54687479988 No 39 >pfam03854 zf-P11 P-11 zinc finger. Probab=36.00 E-value=13 Score=18.24 Aligned_cols=32 Identities=9% Similarity=0.159 Sum_probs=23.9 Q ss_pred CCC-CCCEECHHHHHHCCCCCCCCCCCCCCCCCC Q ss_conf 156-577028889950867678620124446533 Q gi|254780310|r 30 IIN-LRFCLCGHCWSKIHFITATEHILKNNKDNI 62 (119) Q Consensus 30 ~~~-~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~ 62 (119) .+. +++.||-.|...|-- ...+|+.|+.|+.. T Consensus 15 Li~C~dHYLCl~CLt~ml~-~s~~C~iC~~~LPt 47 (50) T pfam03854 15 LVTCSDHYLCLRCLQLLLS-VSERCPICKKPLPT 47 (50) T ss_pred EEEECCHHHHHHHHHHHHC-CCCCCCCCCCCCCC T ss_conf 2213420449999999973-05677624675765 No 40 >cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. 
The family includes rubredoxins, small electron transfer proteins, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc, and is believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity. Probab=35.03 E-value=21 Score=17.15 Aligned_cols=25 Identities=12% Similarity=0.157 Sum_probs=14.6 Q ss_pred CCCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 47103751565770288899508676786201244465 Q gi|254780310|r 23 ICPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 23 ~C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) .|.+||-....+. .+-.||.||.+. T Consensus 3 ~C~vCGyi~~~~~-------------p~~CP~Cg~~k 27 (33) T cd00350 3 VCPVCGYIYDGEE-------------APWVCPVCGAPK 27 (33) T ss_pred CCCCCCCEEECCC-------------CCCCCCCCCCCH T ss_conf 8886998875786-------------987287889978 No 41 >KOG3002 consensus Probab=34.25 E-value=25 Score=16.72 Aligned_cols=35 Identities=14% Similarity=0.248 Sum_probs=24.6 Q ss_pred CCCCCCCCCC-------CCCEECHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 4710375156-------57702888995086767862012444653 Q gi|254780310|r 23 ICPIYSRIIN-------LRFCLCGHCWSKIHFITATEHILKNNKDN 61 (119) Q Consensus 23 ~C~~C~~~~~-------~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~ 61 (119) .|++|...+. +++.+|.+|..++.
..||.|..+.+ T Consensus 50 eCPvC~~~l~~Pi~QC~nGHlaCssC~~~~~----~~CP~Cr~~~g 91 (299) T KOG3002 50 DCPVCFNPLSPPIFQCDNGHLACSSCRTKVS----NKCPTCRLPIG 91 (299) T ss_pred CCCHHHCCCCCCCEECCCCCEEHHHHHHHHC----CCCCCCCCCCC T ss_conf 6950316476653724888675654334540----55986545565 No 42 >pfam08772 NOB1_Zn_bind Nin one binding (NOB1) Zn-ribbon like. This domain corresponds to a zinc ribbon and is found on the RNA binding protein NOB1. Probab=33.84 E-value=13 Score=18.32 Aligned_cols=23 Identities=17% Similarity=0.174 Sum_probs=15.8 Q ss_pred ECHHHHHHCCCCCCCCCCCCCCC Q ss_conf 28889950867678620124446 Q gi|254780310|r 37 LCGHCWSKIHFITATEHILKNNK 59 (119) Q Consensus 37 lC~~C~~~l~~i~~~~C~~Cg~~ 59 (119) .|..|+.-.+-.+..+|+.||.. T Consensus 11 rC~aCf~~t~~~~k~FCpkCGn~ 33 (73) T pfam08772 11 RCHACFKTTPDMTKQFCPKCGNA 33 (73) T ss_pred EEEECCCCCCCCCCCCCCCCCCC T ss_conf 54232348489766127536999 No 43 >KOG2660 consensus Probab=33.73 E-value=23 Score=16.90 Aligned_cols=65 Identities=11% Similarity=-0.030 Sum_probs=34.0 Q ss_pred HHHCCCCCCCCCCCCCC-CC------EECHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHCCCCCHHHHHHHHCCCHHH Q ss_conf 98578747103751565-77------028889950867678620124446533476089851678435301200021279 Q gi|254780310|r 17 HCIYPSICPIYSRIINL-RF------CLCGHCWSKIHFITATEHILKNNKDNIDKDPLKSMQKDLPLTQIRSVTLYCDMS 89 (119) Q Consensus 17 ~~lfP~~C~~C~~~~~~-~~------~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~~C~~C~~~~~~f~~~~a~~~Y~~~~ 89 (119) ++-.=-.|.+|++.+-. .. .+|.+|.-+- +....+|+.|+.-...... ... 
-.++..+ T Consensus 11 ~~n~~itC~LC~GYliDATTI~eCLHTFCkSCIvk~-l~~~~~CP~C~i~ih~t~p----------l~n----i~~Drtl 75 (331) T KOG2660 11 ELNPHITCRLCGGYLIDATTITECLHTFCKSCIVKY-LEESKYCPTCDIVIHKTHP----------LLN----IRSDRTL 75 (331) T ss_pred HCCCCEECCCCCCEEECCHHHHHHHHHHHHHHHHHH-HHHCCCCCCCCEECCCCCC----------CCC----CCCCHHH T ss_conf 136410203154644443018999998889999999-9861678766325267554----------124----7700489 Q ss_pred HHHHHHH Q ss_conf 9999997 Q gi|254780310|r 90 CVLVRLL 96 (119) Q Consensus 90 r~lI~~~ 96 (119) |++|.+| T Consensus 76 qdiVyKL 82 (331) T KOG2660 76 QDIVYKL 82 (331) T ss_pred HHHHHHH T ss_conf 9999997 No 44 >KOG1813 consensus Probab=32.70 E-value=28 Score=16.38 Aligned_cols=44 Identities=14% Similarity=0.264 Sum_probs=30.1 Q ss_pred HHHCCCCCCCCCCCCC------CCCEECHHHHHHCCCCCCCCCCCCCCCCC Q ss_conf 9857874710375156------57702888995086767862012444653 Q gi|254780310|r 17 HCIYPSICPIYSRIIN------LRFCLCGHCWSKIHFITATEHILKNNKDN 61 (119) Q Consensus 17 ~~lfP~~C~~C~~~~~------~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~ 61 (119) ..++|-.|.+|++... -.+.+|..|..+ ++...+.|..|+.... T Consensus 237 ~~~~Pf~c~icr~~f~~pVvt~c~h~fc~~ca~~-~~qk~~~c~vC~~~t~ 286 (313) T KOG1813 237 IELLPFKCFICRKYFYRPVVTKCGHYFCEVCALK-PYQKGEKCYVCSQQTH 286 (313) T ss_pred CCCCCCCCCCCCCCCCCCHHHCCCCEEEHHHHCC-CCCCCCCCEECCCCCC T ss_conf 4347754341034334644332786455122034-2036983334160104 No 45 >pfam05810 NinF NinF protein. This family consists of several bacteriophage NinF proteins as well as related sequences from E. coli. Probab=31.69 E-value=25 Score=16.70 Aligned_cols=22 Identities=14% Similarity=0.463 Sum_probs=17.3 Q ss_pred CCCCCCCCC-CCCEECHHHHHHC Q ss_conf 710375156-5770288899508 Q gi|254780310|r 24 CPIYSRIIN-LRFCLCGHCWSKI 45 (119) Q Consensus 24 C~~C~~~~~-~~~~lC~~C~~~l 45 (119) |..|++.+. 
.+..+|.+|-.++ T Consensus 20 CA~C~kqL~~~Ev~~C~eC~~E~ 42 (58) T pfam05810 20 CAGCGKQLHPDEVHVCEECVAEA 42 (58) T ss_pred HHCCCCCCCHHHHHHHHHHHHHH T ss_conf 81714135704887899999999 No 46 >COG1198 PriA Primosomal protein N' (replication factor Y) - superfamily II helicase [DNA replication, recombination, and repair] Probab=31.16 E-value=9.2 Score=19.08 Aligned_cols=36 Identities=11% Similarity=0.101 Sum_probs=17.8 Q ss_pred CCCCCCCCCCC----CCCEECHHHHHHCCCCCCCCCCCCCCC Q ss_conf 74710375156----577028889950867678620124446 Q gi|254780310|r 22 SICPIYSRIIN----LRFCLCGHCWSKIHFITATEHILKNNK 59 (119) Q Consensus 22 ~~C~~C~~~~~----~~~~lC~~C~~~l~~i~~~~C~~Cg~~ 59 (119) ..|+-|+..+. .....|..|-.+-+ .+..|+.||.. T Consensus 445 ~~Cp~Cd~~lt~H~~~~~L~CH~Cg~~~~--~p~~Cp~Cgs~ 484 (730) T COG1198 445 AECPNCDSPLTLHKATGQLRCHYCGYQEP--IPQSCPECGSE 484 (730) T ss_pred CCCCCCCCCEEEECCCCEEEECCCCCCCC--CCCCCCCCCCC T ss_conf 24899995127864798067077999899--88779899997 No 47 >PRK08620 DNA topoisomerase III; Provisional Probab=30.48 E-value=46 Score=15.21 Aligned_cols=58 Identities=17% Similarity=0.273 Sum_probs=33.9 Q ss_pred HHHHHHHHHHHHHHHH------C------CCCCCCCCCCCC-----CC-CEECHH--HH--HHCCCCCCCCCCCCCCCCC Q ss_conf 9999999999999985------7------874710375156-----57-702888--99--5086767862012444653 Q gi|254780310|r 4 IIQTVKSIIIELFHCI------Y------PSICPIYSRIIN-----LR-FCLCGH--CW--SKIHFITATEHILKNNKDN 61 (119) Q Consensus 4 ~~~~ik~~~~~ll~~l------f------P~~C~~C~~~~~-----~~-~~lC~~--C~--~~l~~i~~~~C~~Cg~~~~ 61 (119) +++.++..+.++++-+ | -..|+-||..+- .+ ...|++ |. ..+...++..||.||.... 
T Consensus 580 fi~~~~~~~~~~v~~~k~~~~~~~~~~~t~~~Cp~Cg~~m~~~~gr~Gkf~~C~~peC~~~k~~~~~~~~~Cp~C~~~~~ 659 (726) T PRK08620 580 FINEMKNYTKKVVNEIKNSDKKYKHDNLTGTKCPDCGKFMLEVKGKNGKMLVCQDRECGHRKNVSRKTNARCPNCKKKLE 659 (726) T ss_pred HHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCEEEECCCCCEEECCCCCCCCCCCCHHHCCCCCCCCCCEEE T ss_conf 99999999999999997401442457778985642683211685898755746899899977710212894999998568 No 48 >COG1592 Rubrerythrin [Energy production and conversion] Probab=29.80 E-value=45 Score=15.28 Aligned_cols=38 Identities=8% Similarity=0.149 Sum_probs=23.5 Q ss_pred HHHHHHHHHHHCCC---CCCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 99999999985787---47103751565770288899508676786201244465 Q gi|254780310|r 9 KSIIIELFHCIYPS---ICPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 9 k~~~~~ll~~lfP~---~C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) ..+++.+|+.+.-. +|++||-....+ .+-.||.||.|. T Consensus 119 ~~~~~~~Le~~~~~~~~vC~vCGy~~~ge--------------P~~CPiCga~k 159 (166) T COG1592 119 AEMFRGLLERLEEGKVWVCPVCGYTHEGE--------------APEVCPICGAPK 159 (166) T ss_pred HHHHHHHHHHHHCCCEEECCCCCCCCCCC--------------CCCCCCCCCCHH T ss_conf 99999999866038778768788812689--------------987699999818 No 49 >pfam10146 zf-C4H2 Zinc finger-containing protein. This is a family of proteins which appears to have a highly conserved zinc finger domain at the C terminal end, described as -C-X2-CH-X3-H-X5-C-X2-C-. The structure is predicted to contain a coiled coil. Members are annotated as being tumour-associated antigen HCA127 in humans but this could not be confirmed. Probab=29.14 E-value=9.5 Score=19.01 Aligned_cols=21 Identities=29% Similarity=0.414 Sum_probs=15.7 Q ss_pred ECHHHHHHCCCCCCCCCCCCCC Q ss_conf 2888995086767862012444 Q gi|254780310|r 37 LCGHCWSKIHFITATEHILKNN 58 (119) Q Consensus 37 lC~~C~~~l~~i~~~~C~~Cg~ 58 (119) .|.+|..+++. +.|.|+.|-.
T Consensus 186 ~ClSChQQIHR-NAPICPlCKA 206 (220) T pfam10146 186 VCLSCHQQIHR-NAPICPLCKA 206 (220) T ss_pred HHHHHHHHHHC-CCCCCCCCCC T ss_conf 66618988855-7977742334 No 50 >COG1439 Predicted nucleic acid-binding protein, consists of a PIN domain and a Zn-ribbon module [General function prediction only] Probab=29.09 E-value=14 Score=18.14 Aligned_cols=23 Identities=9% Similarity=-0.026 Sum_probs=14.1 Q ss_pred ECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 288899508676786201244465 Q gi|254780310|r 37 LCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 37 lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) .|..|...++ .+..+|+.||.+. T Consensus 141 rC~GC~~~f~-~~~~~Cp~CG~~~ 163 (177) T COG1439 141 RCHGCKRIFP-EPKDFCPICGSPL 163 (177) T ss_pred EEECCCEECC-CCCCCCCCCCCCE T ss_conf 9845752508-9888077899911 No 51 >TIGR00354 polC DNA polymerase II, large subunit DP2; InterPro: IPR004475 This family represents the large subunit, DP2, of a two subunit novel archaebacterial replicative DNA polymerase first characterised for Pyrococcus furiosus. The structure of DP2 appears to be organised as a ~950 residue component separated from a ~300 residue component by a ~150 residue intein. The other subunit, DP1, has sequence similarity to the eukaryotic DNA polymerase delta small subunit.; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006260 DNA replication, 0006308 DNA catabolic process. Probab=28.82 E-value=15 Score=17.96 Aligned_cols=43 Identities=14% Similarity=0.148 Sum_probs=22.7 Q ss_pred CCCCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHH Q ss_conf 747103751565770288899508676786201244465334760898 Q gi|254780310|r 22 SICPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKDNIDKDPLKS 69 (119) Q Consensus 22 ~~C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~~C~~C 69 (119) .+||-||.. +-..+|+.|=+.... ..+||.|....+. 
..|..| T Consensus 680 ~~CP~Cgk~--s~~~~Cp~CG~~te~--~~~gPsCrmknts-svCesC 722 (1173) T TIGR00354 680 AKCPSCGKE--SLYRVCPVCGEKTEL--DEYGPSCRMKNTS-SVCESC 722 (1173) T ss_pred CCCCCCCCC--CEEEECCCCCCEEEE--CCCCCCCCCCCCC-CHHHCC T ss_conf 218876640--000145778854544--5778853003542-022105 No 52 >COG1997 RPL43A Ribosomal protein L37AE/L43A [Translation, ribosomal structure and biogenesis] Probab=27.08 E-value=53 Score=14.90 Aligned_cols=25 Identities=24% Similarity=0.416 Sum_probs=14.0 Q ss_pred HHHHHHHHHHHHHHHHHCCCCCCCCCCC Q ss_conf 8999999999999998578747103751 Q gi|254780310|r 3 AIIQTVKSIIIELFHCIYPSICPIYSRI 30 (119) Q Consensus 3 ~~~~~ik~~~~~ll~~lfP~~C~~C~~~ 30 (119) ++.+.++.+-...-+ +..|+.|++. T Consensus 20 ~~Rrrv~~ie~~~~~---~~~Cp~C~~~ 44 (89) T COG1997 20 KLRRRVKEIEAQQRA---KHVCPFCGRT 44 (89) T ss_pred HHHHHHHHHHHHHHC---CCCCCCCCCC T ss_conf 899999999999854---7769978974 No 53 >TIGR00570 cdk7 CDK-activating kinase assembly factor MAT1; InterPro: IPR004575 MAT1 (menage a trois 1) is a RING finger protein with a characteristic C3HC4 motif located in the N-terminal domain. MAT1 stabilises the cyclin H-CDK7 complex to form a functional CDK-activating kinase (CAK) enzymatic complex which then goes on to activate many of the CDK enzymes intimately involved in the cell cycle . CDK7 forms a stable complex with cyclin H and MAT1 in vivo only when phosphorylated on either one of two residues (Ser164 or Thr170) in its T-loop. The requirement for MAT1 for the activation of CAK can be by-passed by the phosphorylation of CDK7 on the T-loop. The two mechanisms for CDK7 complex stabilisation and activation (MAT1 addition and T-loop phosphorylation), which can operate independently in vitro, actually cooperate under physiological conditions to maintain complex integrity. With prolonged exposure to elevated temperature, dissociation to monomeric subunits occurs in vivo when CDK7 is dephosphorylated, even in the presence of MAT1 . 
The Cyclin H-MAT1-CDK7 complex also forms part of TFIIH, a multiprotein complex required for both transcription and DNA repair.; GO: 0007049 cell cycle, 0005634 nucleus. Probab=26.68 E-value=34 Score=15.97 Aligned_cols=39 Identities=15% Similarity=0.293 Sum_probs=26.1 Q ss_pred CCCCCCCC--C-------CCC---CCEECHHHHHHCCCCCC-CCCC--CCCCCCC Q ss_conf 74710375--1-------565---77028889950867678-6201--2444653 Q gi|254780310|r 22 SICPIYSR--I-------INL---RFCLCGHCWSKIHFITA-TEHI--LKNNKDN 61 (119) Q Consensus 22 ~~C~~C~~--~-------~~~---~~~lC~~C~~~l~~i~~-~~C~--~Cg~~~~ 61 (119) ..||.|.. . +-+ ++-||.+|-.-| |+.+ ..|| -|+.|+. T Consensus 9 d~CPrCKTtkYrnPslKLlVNPvCGHtLCESCVdlL-F~~Gsg~CPyk~C~~pLR 62 (322) T TIGR00570 9 DACPRCKTTKYRNPSLKLLVNPVCGHTLCESCVDLL-FVRGSGSCPYKECDTPLR 62 (322) T ss_pred CCCCCCCCCCCCCCCCCEEECCCCCCCCCHHHHHHH-HHCCCCCCCCCCCCCCCC T ss_conf 788887766666886422227866560104478888-734888888546787443 No 54 >pfam05290 Baculo_IE-1 Baculovirus immediate-early protein (IE-0). The Autographa californica multinucleocapsid nuclear polyhedrosis virus (AcMNPV) ie-1 gene product (IE-1) is thought to play a central role in stimulating early viral transcription. IE-1 has been demonstrated to activate several early viral gene promoters and to negatively regulate the promoters of two other AcMNPV regulatory genes, ie-0 and ie-2. It is thought that IE-1 negatively regulates the expression of certain genes by binding directly, or as part of a complex, to promoter regions containing a specific IE-1-binding motif (5'-ACBYGTAA-3') near their mRNA start sites.
Probab=26.58 E-value=54 Score=14.82 Aligned_cols=45 Identities=11% Similarity=0.210 Sum_probs=28.6 Q ss_pred HHCCC--CCCCCCCCCCCCC----------EECHHHHHHCCCC--CCCCCCCCCCCCCC Q ss_conf 85787--4710375156577----------0288899508676--78620124446533 Q gi|254780310|r 18 CIYPS--ICPIYSRIINLRF----------CLCGHCWSKIHFI--TATEHILKNNKDNI 62 (119) Q Consensus 18 ~lfP~--~C~~C~~~~~~~~----------~lC~~C~~~l~~i--~~~~C~~Cg~~~~~ 62 (119) |+=|+ .|-+|......++ .+|..|..+|... ..|.||.|...+.. T Consensus 75 F~d~~lYeCnIC~etS~e~~FLKPnECCgY~iCn~Cya~LWK~~~~ypvCPvCkTSFKs 133 (141) T pfam05290 75 FLEPKLYQCNICQDTSAEEHFLKPNECCGYKICNLCYANLWKFCTVYPVCPVCKTSFKS 133 (141) T ss_pred CCCCCCEEECCCCCCCCHHCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCCCCCCC T ss_conf 04887146037534330120379633424059999999999874558778866676567 No 55 >PRK01343 zinc-binding protein; Provisional Probab=26.02 E-value=33 Score=15.99 Aligned_cols=12 Identities=17% Similarity=0.346 Sum_probs=7.7 Q ss_pred CCCCCCCCCCCC Q ss_conf 874710375156 Q gi|254780310|r 21 PSICPIYSRIIN 32 (119) Q Consensus 21 P~~C~~C~~~~~ 32 (119) ++.|++|++... T Consensus 9 ~~~CPiC~k~~~ 20 (56) T PRK01343 9 TRPCPECGKPST 20 (56) T ss_pred CCCCCCCCCCCC T ss_conf 998988899774 No 56 >TIGR01405 polC_Gram_pos DNA polymerase III, alpha subunit, Gram-positive type; InterPro: IPR006308 These are the polypeptide chains of DNA polymerase III. Full-length homologs of this protein are restricted to the Gram-positive lineages, including the Mycoplasmas. This protein is designated alpha chain and given the gene symbol polC, but is not a full-length homolog of other polC genes. The N-terminal region of about 200 amino acids is rich in low-complexity sequence and poorly alignable. ; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006260 DNA replication, 0005737 cytoplasm. 
Probab=25.47 E-value=26 Score=16.61 Aligned_cols=29 Identities=21% Similarity=0.272 Sum_probs=16.2 Q ss_pred CCEECHHHHHHCCCCC-----------CCCCCCCC--CCCCCC Q ss_conf 7702888995086767-----------86201244--465334 Q gi|254780310|r 34 RFCLCGHCWSKIHFIT-----------ATEHILKN--NKDNID 63 (119) Q Consensus 34 ~~~lC~~C~~~l~~i~-----------~~~C~~Cg--~~~~~~ 63 (119) .+.+|+.|.. .+|++ +--||+|| .|+..+ T Consensus 716 PHY~Cp~Cky-~Ef~~D~~~~~GfDLp~K~CP~Cgak~pl~kD 757 (1264) T TIGR01405 716 PHYLCPNCKY-SEFVTDGSVGSGFDLPDKDCPKCGAKAPLKKD 757 (1264) T ss_pred CCCCCCCCCE-EEEECCCCCCCCCCCCCCCCCCCCCCCCCCCC T ss_conf 8750878735-53003787788776857888888877763457 No 57 >COG1645 Uncharacterized Zn-finger containing protein [General function prediction only] Probab=24.93 E-value=23 Score=16.90 Aligned_cols=23 Identities=17% Similarity=0.271 Sum_probs=15.0 Q ss_pred CCCCCCCCCCCC--CCCEECHHHHH Q ss_conf 874710375156--57702888995 Q gi|254780310|r 21 PSICPIYSRIIN--LRFCLCGHCWS 43 (119) Q Consensus 21 P~~C~~C~~~~~--~~~~lC~~C~~ 43 (119) =-+|+.||.++= .+..+|+.|.. 
T Consensus 28 ~~hCp~Cg~PLF~KdG~v~CPvC~~ 52 (131) T COG1645 28 AKHCPKCGTPLFRKDGEVFCPVCGY 52 (131) T ss_pred HHHCCCCCCCCEEECCEEECCCCCC T ss_conf 7448655883163089587777776 No 58 >KOG1842 consensus Probab=24.66 E-value=15 Score=17.86 Aligned_cols=11 Identities=27% Similarity=0.682 Sum_probs=7.0 Q ss_pred EECHHHHHHCC Q ss_conf 02888995086 Q gi|254780310|r 36 CLCGHCWSKIH 46 (119) Q Consensus 36 ~lC~~C~~~l~ 46 (119) .+|.+|...++ T Consensus 205 VmC~~C~k~iS 215 (505) T KOG1842 205 VMCRDCSKFIS 215 (505) T ss_pred HHHHHHHHHCC T ss_conf 77888887468 No 59 >KOG2613 consensus Probab=23.46 E-value=46 Score=15.19 Aligned_cols=71 Identities=17% Similarity=0.061 Sum_probs=35.7 Q ss_pred CCCCCCCCCC-CCEECHHHHHHCCCCCCCCCCCCCCCC-CCCCCCHHHHCC-CCCHHHHHHHHCCCHHHHHHHHHHHHCC Q ss_conf 7103751565-770288899508676786201244465-334760898516-7843530120002127999999974478 Q gi|254780310|r 24 CPIYSRIINL-RFCLCGHCWSKIHFITATEHILKNNKD-NIDKDPLKSMQK-DLPLTQIRSVTLYCDMSCVLVRLLKYHD 100 (119) Q Consensus 24 C~~C~~~~~~-~~~lC~~C~~~l~~i~~~~C~~Cg~~~-~~~~~C~~C~~~-~~~f~~~~a~~~Y~~~~r~lI~~~Ky~~ 100 (119) |--||-++++ ..-.|.+|...-.-|+.. .|- .....|..|-+- .||=.+.++.+.-.+-+.-.+.++|--+ T Consensus 17 CCeCGvpi~Pn~anMC~~Clrs~VDITeg------ipr~~~i~~Cr~CeRYlqPP~~Wi~a~leSrELLaiclkklK~L~ 90 (502) T KOG2613 17 CCECGVPIEPNPANMCVDCLRSEVDITEG------IPRQATISFCRECERYLQPPKTWIRAELESRELLAICLKKLKGLN 90 (502) T ss_pred EECCCCCCCCCHHHHHHHHHHEEEEHHCC------CCCHHHHHHCCCCCEECCCCHHHHHHHHCCHHHHHHHHHHHCCCC T ss_conf 94379867986678899876403015038------850001000216610047937776211111789999998615766 No 60 >TIGR02688 TIGR02688 conserved hypothetical protein TIGR02688; InterPro: IPR014061 Members of this entry are uncharacterised proteins sporadically distributed in bacteria and archaea, about 470 amino acids in length. 
Several members of this family appear in public databases with annotation as ATP-dependent protease La, despite the lack of similarity to families IPR004815 from INTERPRO (ATP-dependent protease La) or IPR003111 from INTERPRO (ATP-dependent protease La (LON) domain). The proteins in this entry are encoded by genes repeatedly found downstream of another gene that encodes an uncharacterised protein of about 880 amino acids in length (see IPR014060 from INTERPRO).. Probab=23.19 E-value=62 Score=14.48 Aligned_cols=19 Identities=21% Similarity=0.509 Sum_probs=13.7 Q ss_pred HHHHHHHHHHHHHHHCCCC Q ss_conf 9999999999999857874 Q gi|254780310|r 5 IQTVKSIIIELFHCIYPSI 23 (119) Q Consensus 5 ~~~ik~~~~~ll~~lfP~~ 23 (119) .++||.+.+.++.+|||.. T Consensus 417 ~~avk~~~SGl~KlLFPh~ 435 (470) T TIGR02688 417 VKAVKKLFSGLLKLLFPHG 435 (470) T ss_pred HHHHHHHHHHHHHHHCCCC T ss_conf 6789999877766416898 No 61 >COG1644 RPB10 DNA-directed RNA polymerase, subunit N (RpoN/RPB10) [Transcription] Probab=23.05 E-value=22 Score=16.96 Aligned_cols=15 Identities=27% Similarity=0.481 Sum_probs=12.1 Q ss_pred HCCCCCCCCCCCCCC Q ss_conf 578747103751565 Q gi|254780310|r 19 IYPSICPIYSRIINL 33 (119) Q Consensus 19 lfP~~C~~C~~~~~~ 33 (119) ++|-+|..||+.+++ T Consensus 2 iiPiRCFsCGkvi~~ 16 (63) T COG1644 2 IIPVRCFSCGKVIGH 16 (63) T ss_pred CCCEEEECCCCCHHH T ss_conf 885375238878788 No 62 >PRK07220 DNA topoisomerase I; Validated Probab=23.00 E-value=59 Score=14.62 Aligned_cols=83 Identities=10% Similarity=-0.017 Sum_probs=45.0 Q ss_pred CCCCCCCCCC-----CC-C-EECH---HHHHHCCCC-------CCCCCCCCCCCCC---------CCCCCHHHHCCC--- Q ss_conf 4710375156-----57-7-0288---899508676-------7862012444653---------347608985167--- Q gi|254780310|r 23 ICPIYSRIIN-----LR-F-CLCG---HCWSKIHFI-------TATEHILKNNKDN---------IDKDPLKSMQKD--- 73 (119) Q Consensus 23 ~C~~C~~~~~-----~~-~-~lC~---~C~~~l~~i-------~~~~C~~Cg~~~~---------~~~~C~~C~~~~--- 73 (119) .|+.||..+- .+ . .=|. +|...+|.- ++-.|+.||.+.- ....|..|.-.. 
T Consensus 591 ~CP~Cg~~l~~r~~k~g~~FigCs~YP~C~~t~pLp~~g~~~~~~~~C~~~~~~~~~~~~~~~~~~~~~cp~c~~~~~~~ 670 (740) T PRK07220 591 KCSLCGSELMVRRSKRGSRFIGCSGYPNCTFSLPLPKSGQIIVTDKVCEAHGLHHIKIINGGKRPWDLGCPQCNFIEWQK 670 (740) T ss_pred CCCCCCCCCEEEECCCCCEEEECCCCCCCCCCEECCCCCCEEECCCCCCCCCCEEEEEEECCCCCCCCCCCCCCCHHHHC T ss_conf 78889960047745879978858989999985226999846347886877998379998089875767899877601301 Q ss_pred ---------CCHHHHHHHHCCCHHHHHHHHHHHHCCCHHHH Q ss_conf ---------84353012000212799999997447876699 Q gi|254780310|r 74 ---------LPLTQIRSVTLYCDMSCVLVRLLKYHDRTDLA 105 (119) Q Consensus 74 ---------~~f~~~~a~~~Y~~~~r~lI~~~Ky~~~~~la 105 (119) |.-...-+.-.-+|......-+++-++-..+. T Consensus 671 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 711 (740) T PRK07220 671 TQKEEQAQQPKKEKPKSIKDIEGVGKATAGKLEEAGITTVE 711 (740) T ss_pred HHHHHHHCCCCCCCCCHHHCCCCCCHHHHHHHHHCCCCCHH T ss_conf 12455431544568520121556588999999976887599 No 63 >TIGR01054 rgy reverse gyrase; InterPro: IPR005736 DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis , . DNA topoisomerases are divided into two classes: type I enzymes (5.99.1.2 from EC; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (5.99.1.3 from EC; topoisomerases II, IV and VI) break double-strand DNA . Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). 
These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA. Reverse gyrase is a type IA topoisomerase that is unique among these enzymes in its requirement for ATP. Reverse gyrase is a hyperthermophile-specific enzyme that acts as a renaturase by positively supercoiling DNA, and by annealing complementary single-strand circles. Hyperthermophilic organisms must protect themselves against heat-induced degradation, and reverse gyrase acts to reduce the rate of double-strand DNA breakage, a function that does not require ATP hydrolysis and which is independent of its positive supercoiling abilities. Reverse gyrase achieves this by recognising nicked DNA and recruiting a protein coat to the site of damage. More information about this protein can be found at Protein of the Month: DNA Topoisomerase.; GO: 0003677 DNA binding, 0003916 DNA topoisomerase activity, 0006265 DNA topological change, 0006268 DNA unwinding during replication, 0005694 chromosome. Probab=22.95 E-value=13 Score=18.19 Aligned_cols=31 Identities=19% Similarity=0.450 Sum_probs=22.2 Q ss_pred HHHHCCCCCCCCCCCCCCCC---E-ECHHHHHHCC Q ss_conf 99857874710375156577---0-2888995086 Q gi|254780310|r 16 FHCIYPSICPIYSRIINLRF---C-LCGHCWSKIH 46 (119) Q Consensus 16 l~~lfP~~C~~C~~~~~~~~---~-lC~~C~~~l~ 46 (119) +..||-.-|+.||+.++++. + -|+.|..+.+ T Consensus 2 ~~~vy~~lCPNCGG~i~~eRL~kGLPC~kCLP~~~ 36 (1843) T TIGR01054 2 IPAVYKELCPNCGGEISSERLEKGLPCEKCLPEEP 36 (1843) T ss_pred CHHHHHCCCCCCCCCCCHHHHHCCCCCCCCCCCCC T ss_conf 40046348898989877898735788864578767 No 64 >pfam08274 PhnA_Zn_Ribbon PhnA Zinc-Ribbon. Probab=22.81 E-value=29 Score=16.31 Aligned_cols=24 Identities=17% Similarity=0.472 Sum_probs=14.4 Q ss_pred CCCCCCCCCC---CCCCEECHHHHHHC Q ss_conf 7471037515---65770288899508 Q gi|254780310|r 22 SICPIYSRII---NLRFCLCGHCWSKI 45 (119) Q Consensus 22 ~~C~~C~~~~---~~~~~lC~~C~~~l 45 (119) +.C+.C+...
..+..+|++|-.+. T Consensus 3 P~Cp~C~seytY~d~~~~vCpeC~hEw 29 (30) T pfam08274 3 PKCPLCNSEYTYEDGALLVCPECAHEW 29 (30) T ss_pred CCCCCCCCCCEECCCCEEECCCCCCCC T ss_conf 878878981247479997997545656 No 65 >COG5175 MOT2 Transcriptional repressor [Transcription] Probab=22.72 E-value=32 Score=16.11 Aligned_cols=41 Identities=15% Similarity=0.332 Sum_probs=25.4 Q ss_pred CCCCCCCCCC--C--------CCEECHHHHHHCCCCCCCCCCCCCCCCCCC Q ss_conf 4710375156--5--------770288899508676786201244465334 Q gi|254780310|r 23 ICPIYSRIIN--L--------RFCLCGHCWSKIHFITATEHILKNNKDNID 63 (119) Q Consensus 23 ~C~~C~~~~~--~--------~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~ 63 (119) .|++|=.++. . +..+|.-||..+...-...|+.|.+..+.+ T Consensus 16 ~cplcie~mditdknf~pc~cgy~ic~fc~~~irq~lngrcpacrr~y~de 66 (480) T COG5175 16 YCPLCIEPMDITDKNFFPCPCGYQICQFCYNNIRQNLNGRCPACRRKYDDE 66 (480) T ss_pred CCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCCHHHHHHCCCC T ss_conf 474115643245677565776218999999888765058883765423434 No 66 >COG0199 RpsN Ribosomal protein S14 [Translation, ribosomal structure and biogenesis] Probab=22.67 E-value=31 Score=16.16 Aligned_cols=24 Identities=25% Similarity=0.580 Sum_probs=12.9 Q ss_pred CCCCCCCCCC--CCCEECHHHHHHCC Q ss_conf 4710375156--57702888995086 Q gi|254780310|r 23 ICPIYSRIIN--LRFCLCGHCWSKIH 46 (119) Q Consensus 23 ~C~~C~~~~~--~~~~lC~~C~~~l~ 46 (119) +|..||++-+ ....||--|+.++- T Consensus 23 RC~~cGRprgv~Rkf~lcR~cfRE~A 48 (61) T COG0199 23 RCRRCGRPRGVIRKFGLCRICFRELA 48 (61) T ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHH T ss_conf 13036997323355543799999986 No 67 >COG1110 Reverse gyrase [DNA replication, recombination, and repair] Probab=21.73 E-value=26 Score=16.60 Aligned_cols=31 Identities=29% Similarity=0.447 Sum_probs=20.7 Q ss_pred HHHHCCCCCCCCCCCCCCCC---EE-CHHHHHHCC Q ss_conf 99857874710375156577---02-888995086 Q gi|254780310|r 16 FHCIYPSICPIYSRIINLRF---CL-CGHCWSKIH 46 (119) Q Consensus 16 l~~lfP~~C~~C~~~~~~~~---~l-C~~C~~~l~ 46 (119) ...+|-..|+.||..++++. 
.+ |..|..+-+ T Consensus 3 ~~~iY~~~CpNCGG~isseRL~~glpCe~CLp~~~ 37 (1187) T COG1110 3 PNAIYGSSCPNCGGDISSERLEKGLPCERCLPEDT 37 (1187) T ss_pred CHHHHHCCCCCCCCCCCHHHHHCCCCCHHCCCCCC T ss_conf 46566256998899675778745998332068864 No 68 >PRK04023 DNA polymerase II large subunit; Validated Probab=21.66 E-value=25 Score=16.65 Aligned_cols=52 Identities=10% Similarity=0.077 Sum_probs=28.5 Q ss_pred HCCCCCCCCCCCCCCCCEECHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHCCCCC Q ss_conf 578747103751565770288899508676786201244465334760898516784 Q gi|254780310|r 19 IYPSICPIYSRIINLRFCLCGHCWSKIHFITATEHILKNNKDNIDKDPLKSMQKDLP 75 (119) Q Consensus 19 lfP~~C~~C~~~~~~~~~lC~~C~~~l~~i~~~~C~~Cg~~~~~~~~C~~C~~~~~~ 75 (119) +..++|+-||... -...|+.|-..-. ....|+.||.... ...|..|...... T Consensus 631 vg~R~Cp~Cg~eT--~~~~C~~CG~~T~--~~~~c~~C~~~~~-~~~c~~c~~~~~~ 682 (1128) T PRK04023 631 VGNRKCPSCGKET--FYRRCPFCGTHTE--PVYRCPRCGIEVD-EEVCPKCGREPTG 682 (1128) T ss_pred EEEEECCCCCCCC--CCCCCCCCCCCCC--CCCCCCCCCCCCC-CCCCCCCCCCCCC T ss_conf 8202889999835--7557877799665--4324776666556-6535445776777 No 69 >pfam11290 DUF3090 Protein of unknown function (DUF3090). This family of proteins with unknown function appears to be restricted to Actinobacteria. 
Probab=21.20 E-value=45 Score=15.26 Aligned_cols=13 Identities=8% Similarity=-0.192 Sum_probs=8.2 Q ss_pred CCCCCCCCCCCCC Q ss_conf 8620124446533 Q gi|254780310|r 50 ATEHILKNNKDNI 62 (119) Q Consensus 50 ~~~C~~Cg~~~~~ 62 (119) .|.|+.||.|.+- T Consensus 154 Rp~Cp~Cg~Pidp 166 (171) T pfam11290 154 RPPCPLCGQPLDP 166 (171) T ss_pred CCCCCCCCCCCCC T ss_conf 9999988997699 No 70 >COG5220 TFB3 Cdk activating kinase (CAK)/RNA polymerase II transcription initiation/nucleotide excision repair factor TFIIH, subunit TFB3 [Cell division and chromosome partitioning / Transcription / DNA replication, recombination, and repair] Probab=21.07 E-value=41 Score=15.51 Aligned_cols=37 Identities=16% Similarity=0.333 Sum_probs=22.7 Q ss_pred CCCCCCCC--CCCCCC----------EECHHHHHHCCCCCCCCCC--CCCC Q ss_conf 74710375--156577----------0288899508676786201--2444 Q gi|254780310|r 22 SICPIYSR--IINLRF----------CLCGHCWSKIHFITATEHI--LKNN 58 (119) Q Consensus 22 ~~C~~C~~--~~~~~~----------~lC~~C~~~l~~i~~~~C~--~Cg~ 58 (119) .+|++|.. .+..+- -+|.+|...+--..+..|| -||. T Consensus 11 ~~CPvCksDrYLnPdik~linPECyHrmCESCvdRIFs~GpAqCP~~gC~k 61 (314) T COG5220 11 RRCPVCKSDRYLNPDIKILINPECYHRMCESCVDRIFSRGPAQCPYKGCGK 61 (314) T ss_pred CCCCCCCCCCCCCCCEEEEECHHHHHHHHHHHHHHHHCCCCCCCCCCCHHH T ss_conf 248854445414888479978999999999999998627988899843789 No 71 >PRK03564 formate dehydrogenase accessory protein FdhE; Provisional Probab=20.54 E-value=72 Score=14.14 Aligned_cols=56 Identities=14% Similarity=0.214 Sum_probs=34.3 Q ss_pred HHHHHHHHHHHHHHHHCCC----------CCCCCCC-CC-------CCCC---EECHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 9999999999999985787----------4710375-15-------6577---0288899508676786201244465 Q gi|254780310|r 4 IIQTVKSIIIELFHCIYPS----------ICPIYSR-II-------NLRF---CLCGHCWSKIHFITATEHILKNNKD 60 (119) Q Consensus 4 ~~~~ik~~~~~ll~~lfP~----------~C~~C~~-~~-------~~~~---~lC~~C~~~l~~i~~~~C~~Cg~~~ 60 (119) |.-++...+.++...|=+. .|++||. ++ +.++ ..|.-|..+.++. ...|..|+... 
T Consensus 160 i~AALqv~wa~lA~~l~~~~~~~~~~~~~~CPvCGs~Pvasvv~~g~~~G~RyL~CslC~teW~~~-R~~C~~C~~~~ 236 (307) T PRK03564 160 IWAALSLYWAQMAQLIPGKARAEYGEQRQYCPVCGSMPVSSVVQIGTTQGLRYLHCNLCESEWHVV-RVKCSNCEQSG 236 (307) T ss_pred HHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCEEEEEECCCCCEEEEECCCCCCCCCCC-CCCCCCCCCCC T ss_conf 999999999999854895224787777885998898751455750687870688648777740213-53468888988 No 72 >cd00162 RING RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H) Probab=20.39 E-value=54 Score=14.85 Aligned_cols=37 Identities=19% Similarity=0.238 Sum_probs=21.7 Q ss_pred CCCCCCCCCCC-------CCEECHHHHHHCCCCCCCCCCCCCCC Q ss_conf 47103751565-------77028889950867678620124446 Q gi|254780310|r 23 ICPIYSRIINL-------RFCLCGHCWSKIHFITATEHILKNNK 59 (119) Q Consensus 23 ~C~~C~~~~~~-------~~~lC~~C~~~l~~i~~~~C~~Cg~~ 59 (119) .C.+C......
++.+|..|..++.......|+.|..+ T Consensus 1 ~C~iC~~~~~~~~~~~~CgH~fC~~Ci~~~~~~~~~~CP~Cr~~ 44 (45) T cd00162 1 ECPICLEEFREPVVLLPCGHVFCRSCIDKWLKSGKNTCPLCRTP 44 (45) T ss_pred CCCCCCHHHCCEEEEECCCCCCCHHHHHHHHHHCCCCCCCCCCC T ss_conf 98408803278118818999106899999994791868382880 No 73 >COG4306 Uncharacterized protein conserved in bacteria [Function unknown] Probab=20.13 E-value=46 Score=15.22 Aligned_cols=10 Identities=0% Similarity=-0.403 Sum_probs=4.9 Q ss_pred CCCCCCCCCC Q ss_conf 6201244465 Q gi|254780310|r 51 TEHILKNNKD 60 (119) Q Consensus 51 ~~C~~Cg~~~ 60 (119) .+|..||.++ T Consensus 69 sfchncgs~f 78 (160) T COG4306 69 SFCHNCGSRF 78 (160) T ss_pred CHHHCCCCCC T ss_conf 0654179988 Done!