Query         T0641 3NYI, Eubacterium ventriosum, 296 residues
Match_columns 296
No_of_seqs    123 out of 1231
Neff          7.6
Searched_HMMs 11830
Date          Thu Jul 22 15:09:39 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0641.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0641.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF02645 DegV: Uncharacterised  100.0       0       0  357.2  22.9  210    75-292  1-211 (211)
  2 PF04084 ORC2: Origin recognit   78.5     1.7 0.00014   20.2  11.9  152    45-200  11-176 (326)
  3 PF02608 Bmp: Basic membrane p   74.7     1.4 0.00012   20.7   3.8   53    69-129  46-98 (306)
  4 PF04748 Polysacc_deac_2: Dive   44.7     7.2 0.00061   16.0   6.7   59    69-131  72-132 (213)
  5 PF10975 DUF2802: Protein of u   44.2     6.5 0.00055   16.3   2.7   22   134-155  33-54 (70)
  6 PF06745 KaiC: KaiC; InterPro    41.8       8 0.00067   15.7   8.9   84    79-174  42-127 (225)
  7 PF06180 CbiK: Cobalt chelatas   36.4     9.6 0.00082   15.2  13.2   32    74-105  61-93 (262)
  8 PF02481 DNA_processg_A: DNA r   36.1     9.7 0.00082   15.1   4.1   29      5-36  7-35 (212)
  9 PF11517 Nab2: Nuclear abundan   35.0     8.1 0.00068   15.7   2.0   59   132-190  31-96 (107)
 10 PF03652 UPF0081: Uncharacteri   31.1      12 0.00099   14.6   4.7   84    68-157  34-122 (135)
 11 PF08004 DUF1699: Protein of u   30.9      12 0.00099   14.6   7.4   84    65-158  25-117 (131)
 12 PF08784 RPA_C: Replication pr   28.3      13  0.0011   14.3   2.3   43   126-170  59-101 (102)
 13 PF10079 DUF2317: Uncharacteri   28.1      13  0.0011   14.3   3.6  229    11-264  109-373 (542)
 14 PF03102 NeuB: NeuB family; I    27.2      14  0.0012   14.2   6.7   26     68-93  53-78 (241)
 15 PF10483 Hap2_elong: Histone a   22.0      17  0.0014   13.6   6.6   91    65-192  21-117 (280)
 16 PF03796 DnaB_C: DnaB-like hel   21.1     6.4 0.00054   16.3  -0.6   20   212-231  150-169 (185)

No 1
>PF02645 DegV: Uncharacterised protein, DegV family COG1307; InterPro: IPR003797 This family of proteins is related to DegV of Bacillus subtilis and includes paralogous sets in several species (B.
subtilis, Deinococcus radiodurans, Mycoplasma pneumoniae) that are closer in percent identity to each than to most homologs from other species. This suggests both recent paralogy and diversity of function.; PDB: 1pzx_B 2dt8_A 2g7z_B 3egl_A 1vpv_B 1mgp_A 3fys_A.
Probab=100.00 E-value=0 Score=357.22 Aligned_cols=210 Identities=34% Similarity=0.518 Sum_probs=196.6

Q ss_pred HHHHHHHHCCCCEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEEEEECCHHHHHHHHHHHHHHHHHHCCCCHHHHHHH
Q ss_conf 99999986589499997145543578999999999997478980899826566799999999999999748999999999
Q T0641 75 DVFRSFVEQGFPVVCFTITTLFSGSYNSAINAKSLVLEDYPDANICVIDSKQNTVTQALLIDQFVRMLEDGLSFEQAMSK 154 (296)
Q Consensus 75 ~~~~~~~~~g~~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~ViDS~~~s~~~g~lv~~a~~l~~~G~s~~ei~~~ 154 (296)
|+|+++.++||+||++|+||+||||||+|++| .+++++.+|+|+||+++|+|++++|++|++|+++|++++||.++
T Consensus 1 e~~~~~~~~yd~vi~i~iSs~LSgty~~a~~a----~~~~~~~~v~ViDS~~~s~~~~~~v~~a~~~~~~G~s~~eI~~~ 76 (211)
T PF02645_consen 1 ELFEELLEGYDDVIVITISSGLSGTYNSAKQA----AEEEPDKNVHVIDSKSVSAGQGLLVLEAAKLIEQGKSFEEIVEK 76 (211)
T ss_dssp -HHHHHHTCTSEEEEEES-TTT-SHHHHHHHH----HHTTSTTCEEEEE-SS-HHHCHHHHHHHHHHHHTT--HHHHHHH
T ss_pred CHHHHHHCCCCEEEEEECCCCCCHHHHHHHHH----HHHCCCCEEEEEECCHHHHHHHHHHHHHHHHHHCCCCHHHHHHH
T ss_conf 96789766998699998786421179999978----97679985999918614799999999999999859999999999

Q ss_pred HHHHHHCCEEEEEECCHHHHHHCCCCHHHHHHHHHHHCCCEEEEEECCEEEEEEEECCHHHH-HHHHHHHHHHHHHCCCC
Q ss_conf 99887416699998375887526974178999988754752899987868997751567999-99999999999741489
Q T0641 155 LDALMASARIFFTVGSLDYLKMGGRIGKVATAATGKLGVKPVIIMKDGDIGLGGIGRNRNKL-KNSVLQVAKKYLDENNK 233 (296)
Q Consensus 155 l~~~~~~~~~~f~v~~L~~L~kgGRis~~~~~ig~lL~IkPIl~~~~G~i~~~~k~R~~kka-~~~~~~~~~~~~~~~~~ 233 (296)
++++++++++||+|+||+||+||||||++++++|++|||||||++++|.+.+.+|+|+.+++ ++++++.+.+.... .
T Consensus 77 l~~~~~~~~~~~v~~L~~L~kgGRis~~~~~lg~lL~IkPIl~~~~g~~~~~~k~r~~~k~~~~~~~~~~~~~~~~--~ 154 (211)
T PF02645_consen 77 LEELRDKTHTYFIVDDLDYLVKGGRISKAAAFLGSLLNIKPILSFDDGGIVPVAKVRGFKKAAIKKLIEQVKEFIDD--G 154 (211)
T ss_dssp HHHHHHTEEEEEEES-THHHHH----CHHHHHHCCCTTSEEEEEEETTEEEEEEEESSHHHH-HHHHHHHHHHHHTT--C
T ss_pred HHHHHHCCEEEEEECCHHHHHHCCCCCHHHHHHHHHCCCEEEEEEECCEEEEEEEECCCCCHHHHHHHHHHHHHHCC--C
T ss_conf 99998386899998987999768902178999985028689999989918999887477505899999999986267--9

Q ss_pred CCEEEEEEECCCHHHHHHHHHHHHHHCCCCCCEEEEEEECCEEEEEECCCEEEEEEEEC
Q ss_conf 82699998449989999999999987386222268986265567530545099999841
Q T0641 234 DNFIVSVGYGYDKEEGFEFMKEVESTLDVKLDSETNVAIGIVSAVHTGPYPIGLGVIRK 292 (296)
Q Consensus 234 ~~~~i~i~~~~~~e~~~~~~~~l~~~~~~~~~~~~~~~i~~vi~~H~Gpg~igi~~~~k 292 (296)
....+++.|+++++++.++.+.+.+.++. .++.+.++||++++|+|||++|++|++|
T Consensus 155 ~~~~~~i~~~~~~e~~~~l~~~l~~~~~~--~~i~~~~~~~vi~~H~Gpga~gv~~~~k 211 (211)
T PF02645_consen 155 KNYRIAISHANNEEEAEELKEELKEKFPK--VEIYISPIGPVIGVHTGPGAIGVAFIKK 211 (211)
T ss_dssp GEEEEEEEESSHHHHHHHHHHHHHHCSCE--EEEEEEE--HHHHCC----EEEEEEEC-
T ss_pred CCEEEEEEECCCHHHHHHHHHHHHHCCCC--CCEEEEEECCEEEEEECCCEEEEEEEEC
T ss_conf 75899999189989999999999841678--7299999894999996789299999969

No 2
>PF04084 ORC2: Origin recognition complex subunit 2 ; InterPro: IPR007220 All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in . This entry is subunit 2, which binds the origin of replication.
It plays a role in chromosome replication and mating type transcriptional silencing.; GO: 0006260 DNA replication, 0000808 origin recognition complex, 0005634 nucleus
Probab=78.50 E-value=1.7 Score=20.17 Aligned_cols=152 Identities=9% Similarity=0.147 Sum_probs=90.4

Q ss_pred CCCHHHHHHHHHHCCCC-CCEECCCCHHHHHHHHHHH---HHCCCCEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEE
Q ss_conf 68989999999835788-6232368989999999999---8658949999714554357899999999999747898089
Q T0641 45 DITRDECYQRMVDDPKL-FPKTSLPSVESYADVFRSF---VEQGFPVVCFTITTLFSGSYNSAINAKSLVLEDYPDANIC 120 (296)
Q Consensus 45 Di~~e~~y~~l~~~~~~-~pkTS~ps~~~~~~~~~~~---~~~g~~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~ 120 (296)
-++.+++.+.+..-.+. +.+...--...+...|.++ +.+|+-++...+.|+..=.-+-| .....++.+..+.
T Consensus 11 ~l~~~e~~~~~~~~~~~~h~~e~~~L~~~~~~~f~qW~feL~~GFnil~YG~GSKr~LL~~Fa----~~~l~~~~~~~vv 86 (326)
T PF04084_consen 11 LLDHEEYFELLQELSDNPHQKEKKFLLELHEKLFPQWMFELSQGFNILFYGYGSKRQLLEDFA----EELLSDNGSGPVV 86 (326)
T ss_pred CCCHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEECCCHHHHHHHHHH----HHHHHHCCCCCEE
T ss_conf 479999999998754201388999999999987599999995798489973571799999999----9997505788779

Q ss_pred EEECCHHHHHHHHHHHHHHHHHHCC-----CCHHHHHHHHHHHHH----CCEEEEEECCHHHHHH-CCCCHHHHHHHHHH
Q ss_conf 9826566799999999999999748-----999999999998874----1669999837588752-69741789999887
Q T0641 121 VIDSKQNTVTQALLIDQFVRMLEDG-----LSFEQAMSKLDALMA----SARIFFTVGSLDYLKM-GGRIGKVATAATGK 190 (296)
Q Consensus 121 ViDS~~~s~~~g~lv~~a~~l~~~G-----~s~~ei~~~l~~~~~----~~~~~f~v~~L~~L~k-gGRis~~~~~ig~l 190 (296)
||+-..-+.....+.....+.+... .+..+..+.+.+... ..+.|+++.|+|--.- +.+.-..-+.+++.
T Consensus 87 VVnGy~p~~~~k~il~~I~~~l~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~l~liIhNIDgp~LR~~~~q~~La~La~~ 166 (326)
T PF04084_consen 87 VVNGYFPSINIKDILNSIAEALLPPPSKWGKSPSEQLDFIVSYLKSRPSPPPLYLIIHNIDGPSLRNDKAQALLAQLASI 166 (326)
T ss_pred EEECCCCCCCHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHCCCCCCCEEEEEECCCCHHHCCHHHHHHHHHHHCC
T ss_conf 99688998859999999999985011343689899999999987447888867999978797354682899999999669

Q ss_pred HCCCEEEEEE
Q ss_conf 5475289998
Q T0641 191 LGVKPVIIMK 200 (296)
Q Consensus 191 L~IkPIl~~~ 200 (296)
=+|+-|-+++
T Consensus 167 p~I~liaSvD 176 (326)
T PF04084_consen 167 PNIHLIASVD 176 (326)
T ss_pred CCEEEEEEEC
T ss_conf 9879999716

No 3
>PF02608 Bmp: Basic membrane protein; InterPro: IPR003760 This is a family of basic membrane lipoproteins from Borrelia and various putative lipoproteins from other bacteria. All of these proteins are outer membrane proteins and are thus antigenic in nature when possessed by the pathogenic members of the family . The Bacillus subtilis degR, a positive regulator of the production of degradative enzymes, is also a member of this group .; GO: 0008289 lipid binding; PDB: 2hqb_A 2fqx_A 2fqy_A 2fqw_A.
Probab=74.66 E-value=1.4 Score=20.69 Aligned_cols=53 Identities=21% Similarity=0.390 Sum_probs=39.1

Q ss_pred CHHHHHHHHHHHHHCCCCEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEEEEECCHHHH
Q ss_conf 9899999999998658949999714554357899999999999747898089982656679
Q T0641 69 SVESYADVFRSFVEQGFPVVCFTITTLFSGSYNSAINAKSLVLEDYPDANICVIDSKQNTV 129 (296)
Q Consensus 69 s~~~~~~~~~~~~~~g~~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~ViDS~~~s~ 129 (296)
..+++.+.++++.++|+++|+.+ + ..|+.+ ...++++||+.++.++|......
T Consensus 46 ~~~~~~~~~~~~~~~g~~lIi~~-g----~~~~~~---~~~vA~~yPd~~F~~idg~~~~~ 98 (306)
T PF02608_consen 46 EPADYEQALREAADDGYDLIIGT-G----FQFSDA---IEKVAKEYPDVKFIIIDGYVDGP 98 (306)
T ss_dssp SHHHHHHHHHHHHHTT--EEEEE-------CCHHH---HHCCCCC-TTSEEEEESS---S-
T ss_pred CHHHHHHHHHHHHHCCCCEEEEE-C----HHHHHH---HHHHHHHCCCCEEEEEECCCCCC
T ss_conf 47779999999986799999996-7----788899---99999888998899996776788

No 4
>PF04748 Polysacc_deac_2: Divergent polysaccharide deacetylase; InterPro: IPR006837 This is a family of uncharacterised proteins that includes YibQ.; PDB: 2nly_A.
Probab=44.68 E-value=7.2 Score=16.00 Aligned_cols=59 Identities=10% Similarity=0.052 Sum_probs=29.4

Q ss_pred CHHHHHHHHHHHHHCCC--CEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEEEEECCHHHHHH
Q ss_conf 98999999999986589--4999971455435789999999999974789808998265667999
Q T0641 69 SVESYADVFRSFVEQGF--PVVCFTITTLFSGSYNSAINAKSLVLEDYPDANICVIDSKQNTVTQ 131 (296)
Q Consensus 69 s~~~~~~~~~~~~~~g~--~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~ViDS~~~s~~~ 131 (296)
+.+++...++...+.-- .-+-=|..|++......+....+.+.+ .....+||++.+...
T Consensus 72 ~~~~i~~~l~~~l~~~P~a~GvnNhmGS~~t~~~~~m~~v~~~l~~----~gl~fvDS~T~~~S~ 132 (213)
T PF04748_consen 72 SAEEIEKRLEWALARVPGAVGVNNHMGSRFTSDREAMRWVMEVLKK----RGLFFVDSRTSPRSV 132 (213)
T ss_dssp ----HHHHHHHHHHHSTT----EEE----GGG-HHHHHHHHHHHHH----TT--EEE----SS-S
T ss_pred CHHHHHHHHHHHHHHCCCEEEEECCCCCCHHCCHHHHHHHHHHHHH----CCCEEECCCCCCCCH
T ss_conf 9999999999999868980898336660443699999999999987----699899079986558

No 5
>PF10975 DUF2802: Protein of unknown function (DUF2802)
Probab=44.17 E-value=6.5 Score=16.31 Aligned_cols=22 Identities=23% Similarity=0.462 Sum_probs=18.2

Q ss_pred HHHHHHHHHHCCCCHHHHHHHH
Q ss_conf 9999999997489999999999
Q T0641 134 LIDQFVRMLEDGLSFEQAMSKL 155 (296)
Q Consensus 134 lv~~a~~l~~~G~s~~ei~~~l 155 (296)
.-..|++|++.|.+.+||.+.-
T Consensus 33 ~Y~~A~klv~~Ga~~~el~~~C 54 (70)
T PF10975_consen 33 LYSQAAKLVEQGADIDELMQEC 54 (70)
T ss_pred HHHHHHHHHHCCCCHHHHHHHH
T ss_conf 4999999999299999999880

No 6
>PF06745 KaiC: KaiC; InterPro: IPR014774 This entry represents a conserved region within bacterial and archaeal proteins, most of which are hypothetical. More than one copy is sometimes found in each protein in this entry. These include KaiC, which is one of the Kai proteins among which direct protein-protein association may be a critical process in the generation of circadian rhythms in cyanobacteria . The circadian clock protein KaiC, is encoded in the kaiABC operon that controls circadian rhythms and may be universal in Cyanobacteria. Each member contains two copies of this domain, which is also found in other proteins. KaiC performs autophosphorylation and acts as its own transcriptional repressor.; PDB: 2w0m_A 1u9i_A 2gbl_A 2zts_B 2dr3_B.
Probab=41.84 E-value=8 Score=15.72 Aligned_cols=84 Identities=13% Similarity=0.092 Sum_probs=34.5

Q ss_pred HHHHC-CCCEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEEEEECCHHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHH
Q ss_conf 99865-89499997145543578999999999997478980899826566799999999999999748999999999998
Q T0641 79 SFVEQ-GFPVVCFTITTLFSGSYNSAINAKSLVLEDYPDANICVIDSKQNTVTQALLIDQFVRMLEDGLSFEQAMSKLDA 157 (296)
Q Consensus 79 ~~~~~-g~~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~ViDS~~~s~~~g~lv~~a~~l~~~G~s~~ei~~~l~~ 157 (296)
+..++ |..+++++....-....+.+...--.+.+-....++.++|......... ..+.+++.+.+.+
T Consensus 42 ~~~~~~ge~~lyis~ee~~~~i~~~~~~~g~d~~~~~~~g~l~~~d~~~~~~~~~------------~~~~~~l~~~l~~ 109 (225)
T PF06745_consen 42 NGAKQEGEKVLYISFEESPEQIIRRMSSLGWDLEEYIDSGKLKIIDAFPERIEWS------------ELDLDELLDRLRE 109 (225)
T ss_dssp HHHTTTT--EEEEESSS-HHHHHHHHHTTT--HHHHHHTTSEEEEE-SGGG---T------------SSCHHHHHHHHHH
T ss_pred HHHHHCCCCEEEEEECCCHHHHHHHHHHCCCCHHHHHHCCCEEEEEECCCCCCCC------------CCCHHHHHHHHHH
T ss_conf 9998459946999943789999999998199658886468658998414100244------------3799999999999

Q ss_pred HHHCC-EEEEEECCHHHH
Q ss_conf 87416-699998375887
Q T0641 158 LMASA-RIFFTVGSLDYL 174 (296)
Q Consensus 158 ~~~~~-~~~f~v~~L~~L 174 (296)
..++. ...+++|+|..+
T Consensus 110 ~i~~~~~~~vVIDsl~~l 127 (225)
T PF06745_consen 110 AIEEYGPDRVVIDSLTAL 127 (225)
T ss_dssp HHHHHT-SEEEEETHHHH
T ss_pred HHHHCCCCEEEEECHHHH
T ss_conf 998529978999571775

No 7
>PF06180 CbiK: Cobalt chelatase (CbiK); InterPro: IPR010388 This group, typified by Salmonella typhimurium CbiK, contains anaerobic cobalt chelatases that act in the anaerobic cobalamin biosynthesis pathway , . Cobalamin (vitamin B12) can be complexed with metal via ATP-dependent reactions (aerobic pathway) (e.g., in Pseudomonas denitrificans) or via ATP-independent reactions (anaerobic pathway) (e.g., in Salmonella typhimurium) , . The corresponding cobalt chelatases are not homologous.
This group belongs to the class of ATP-independent, single-subunit chelatases that also includes distantly related protoporphyrin IX (PPIX) ferrochelatase (HemH) (Class II chelatases) . The structure of Salmonella typhimurium CbiK shows that it has a remarkably similar topology to Bacillus subtilis ferrochelatase despite only weak sequence conservation . Both enzymes contain a histidine residue identified as the metal ion ligand, but CbiK contains a second histidine in place of the glutamic acid residue identified as a general base in PPIX ferrochelatase . Site-directed mutagenesis has confirmed a role for this histidine and a nearby glutamic acid in cobalt binding, modulating metal ion specificity as well as catalytic efficiency . It should be noted that CysG and Met8p, which are multifunctional proteins associated with siroheme biosynthesis, include chelatase activity and can therefore be considered as the third class of chelatases . As with the class II chelatases, they do not require ATP for activity. However, they are not structurally similar to HemH or CbiK, and it is likely that they have arisen by the acquisition of a chelatase function within a dehydrogenase catalytic framework , . ; PDB: 1qgo_A.
Probab=36.36 E-value=9.6 Score=15.17 Aligned_cols=32 Identities=13% Similarity=0.379 Sum_probs=13.4

Q ss_pred HHHHHHHHHCCCC-EEEEEECCCCCHHHHHHHH
Q ss_conf 9999999865894-9999714554357899999
Q T0641 74 ADVFRSFVEQGFP-VVCFTITTLFSGSYNSAIN 105 (296)
Q Consensus 74 ~~~~~~~~~~g~~-vi~i~iSs~lSgty~~a~~ 105 (296)
.++++++.++|++ |++.++---=-.-|+..+.
T Consensus 61 ~eaL~~L~~~G~~~V~VQslhii~G~Ey~~l~~ 93 (262)
T PF06180_consen 61 EEALEKLADEGYTHVVVQSLHIIPGEEYEKLKK 93 (262)
T ss_dssp HHHHHHHHH----EEEEEE--SB---HHHHHHH
T ss_pred HHHHHHHHHCCCCEEEEEECCEECCHHHHHHHH
T ss_conf 999999998799889991263628576999999

No 8
>PF02481 DNA_processg_A: DNA recombination-mediator protein A; InterPro: IPR003488 The SMF family (DNA processing chain A, dprA) are a group of bacterial proteins.
In Helicobacter pylori, dprA is required for natural chromosomal and plasmid transformation .; GO: 0009294 DNA mediated transformation
Probab=36.09 E-value=9.7 Score=15.14 Aligned_cols=29 Identities=21% Similarity=0.276 Sum_probs=14.8

Q ss_pred EEEEECCCCCCCHHHHHHCCCEEEEEEEEECC
Q ss_conf 89995057899989998649879899999889
Q T0641 5 YKIVSDSACDLSKEYLEKHDVTIVPLSVSFDG 36 (296)
Q Consensus 5 i~IitDSt~dl~~~~~~~~~I~vvPl~I~~~g 36 (296)
|++++-.-.+.|..+.+ |.--|+-+..-|
T Consensus 7 i~~i~~~d~~YP~~L~~---i~~~P~~Lf~~G 35 (212)
T PF02481_consen 7 IRIITIGDPDYPKLLKE---IKDPPPVLFYKG 35 (212)
T ss_pred CEEECCCCHHHHHHHHH---CCCCCHHEEEEC
T ss_conf 89988283120387886---359994218877

No 9
>PF11517 Nab2: Nuclear abundant poly(A) RNA-bind protein 2 (Nab2); PDB: 2jps_A 2v75_A.
Probab=34.98 E-value=8.1 Score=15.68 Aligned_cols=59 Identities=19% Similarity=0.259 Sum_probs=33.3

Q ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHHHCCEEEEEE-------CCHHHHHHCCCCHHHHHHHHHH
Q ss_conf 9999999999974899999999999887416699998-------3758875269741789999887
Q T0641 132 ALLIDQFVRMLEDGLSFEQAMSKLDALMASARIFFTV-------GSLDYLKMGGRIGKVATAATGK 190 (296)
Q Consensus 132 g~lv~~a~~l~~~G~s~~ei~~~l~~~~~~~~~~f~v-------~~L~~L~kgGRis~~~~~ig~l 190 (296)
.+++....-|+-+|-+.++|++.|-.+-+....-+.. ..|++|+.|--+..+.+.+-.+
T Consensus 31 ~yVAEyIvLLmsNggs~esvvqELssLFDsvs~qa~~~VVqtaF~al~~Lq~g~~~~~iv~Ki~~~ 96 (107)
T PF11517_consen 31 NYVAEYIVLLMSNGGSPESVVQELSSLFDSVSQQALQDVVQTAFFALEALQQGDSLENIVSKIRGM 96 (107)
T ss_dssp HHHHHHHHHHHH----HHHHHHHHHHH-TTS-HHHHHHHHHHHHHHHHHHH----CHHHHHHHHHH
T ss_pred HHHHHHHHEEEECCCCHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHCC
T ss_conf 999979202335899768999999999833699999999999999999986777299999998734

No 10
>PF03652 UPF0081: Uncharacterised protein family (UPF0081); InterPro: IPR005227 Holliday junction resolvases (HJRs) are key enzymes of DNA recombination.
The principal HJRs are now known or confidently predicted for all bacteria and archaea whose genomes have been completely sequenced, with many species encoding multiple potential HJRs. Structural and evolutionary relationships of HJRs and related nucleases suggests that the HJR function has evolved independently from at least four distinct structural folds, namely RNase H, endonuclease, endonuclease VIIcolicin E and RusA (IPR008822 from INTERPRO): The endonuclease fold, whose structural prototypes are the phage exonuclease, the very short patch repair nuclease (Vsr) and type II restriction enzymes, is shown to encompass by far a greater diversity of nucleases than previously suspected. This fold unifies archaeal HJRs (IPR002732 from INTERPRO), repair nucleases such as RecB (IPR004586 from INTERPRO) and Vsr (IPR004603 from INTERPRO), restriction enzymes and a variety of predicted nucleases whose specific activities remain to be determined. The RNase H fold characterises the RuvC family (IPR002176 from INTERPRO), which is nearly ubiquitous in bacteria, and in addition the YqgF family (IPR005227 from INTERPRO). The proteins of this family, typified by Escherichia coli YqgF, are likely to function as an alternative to RuvC in most bacteria, but could be the principal HJRs in low-GC Gram-positive bacteria and Aquifex. Endonuclease VII of phage T4 (IPR004211 from INTERPRO) is shown to serve as a structural template for many nucleases, including McrA and other type II restriction enzymes. Together with colicin E7, endonuclease VII defines a distinct metal-dependent nuclease fold. Horizontal gene transfer, lineage-specific gene loss and gene family expansion, and non-orthologous gene displacement seem to have been major forces in the evolution of HJRs and related nucleases. 
A remarkable case of displacement is seen in the Lyme disease spirochete Borrelia burgdorferi, which does not possess any of the typical HJRs, but instead encodes, in its chromosome and each of the linear plasmids, members of the exonuclease family predicted to function as HJRs. The diversity of HJRs and related nucleases in bacteria and archaea contrasts with their near absence in eukaryotes. The few detected eukaryotic representatives of the endonuclease fold and the RNase H fold have probably been acquired from bacteria via horizontal gene transfer. The identity of the principal HJR(s) involved in recombination in eukaryotes remains uncertain; this function could be performed by topoisomerase IB or by a novel, so far undetected, class of enzymes. Likely HJRs and related nucleases were identified in the genomes of numerous bacterial and eukaryotic DNA viruses. Gene flow between viral and cellular genomes has probably played a major role in the evolution of this class of enzymes. This family represents the YqgF family of putative Holliday junction resolvases. With the exception of the spirochetes, the YqgF family is represented in all bacterial lineages, including the mycoplasmas with their highly degenerate genomes. The RuvC resolvases are conspicuously absent in the low-GC Gram-positive bacterial lineage, with the exception of Ureaplasma urealyticum (Q9PQY7 from SWISSPROT, ). Furthermore, loss of function ruvC mutants of E. coli show a residual HJR activity that cannot be ascribed to the prophage-encoded RusA resolvase . This suggests that the YqgF family proteins could be alternative HJRs whose function partially overlaps with that of RuvC . ; GO: 0000150 recombinase activity, 0003677 DNA binding, 0004518 nuclease activity, 0006281 DNA repair, 0006310 DNA recombination, 0006974 response to DNA damage stimulus; PDB: 1vhx_B 1iv0_A 1ovq_A 1nmn_B 1nu0_A. 
Probab=31.14 E-value=12 Score=14.62 Aligned_cols=84 Identities=13% Similarity=0.190 Sum_probs=51.5

Q ss_pred CCHHHHHHHHHHHHHCCC-CEEEEEECCCCCHHHH----HHHHHHHHHHHHCCCCEEEEEECCHHHHHHHHHHHHHHHHH
Q ss_conf 898999999999986589-4999971455435789----99999999997478980899826566799999999999999
Q T0641 68 PSVESYADVFRSFVEQGF-PVVCFTITTLFSGSYN----SAINAKSLVLEDYPDANICVIDSKQNTVTQALLIDQFVRML 142 (296)
Q Consensus 68 ps~~~~~~~~~~~~~~g~-~vi~i~iSs~lSgty~----~a~~a~~~~~~~~~~~~i~ViDS~~~s~~~g~lv~~a~~l~ 142 (296)
.+.....+.+.++.+++. +-|++.+...++|+.. .++.-++.+.+.+|+.+|+-+|=+..+..-... |.
T Consensus 34 ~~~~~~~~~l~~li~~~~~~~iVvGlP~~~~G~~~~~~~~v~~fa~~L~~~~~~lpv~~~DEr~TT~~A~~~------l~ 107 (135)
T PF03652_consen 34 KNDEKDWDELKKLIKEWQVDGIVVGLPLNMDGSEGEQAKKVRKFAERLKKRFPGLPVYLVDERLTTKEAERR------LK 107 (135)
T ss_dssp TTHHCHHHHHHHHHHHCEECEEEE-EEBB--SSC-TTHHHHHHHHHHHHHHH-TSEEEEEECSCSHHCCHHH------HH
T ss_pred CCCCHHHHHHHHHHHHHCCCEEEEECCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCEEEECCCCCHHHHHHH------HH
T ss_conf 888678999999999848998999578999988087999999999999998649965986001136999999------99

Q ss_pred HCCCCHHHHHHHHHH
Q ss_conf 748999999999998
Q T0641 143 EDGLSFEQAMSKLDA 157 (296)
Q Consensus 143 ~~G~s~~ei~~~l~~ 157 (296)
+.|.+-..-...+++
T Consensus 108 ~~~~~~~~~k~~iD~ 122 (135)
T PF03652_consen 108 EAGIKRKKRKGKIDS 122 (135)
T ss_dssp HTT--HHHHHHHHHH
T ss_pred HCCCCCHHCCCCHHH
T ss_conf 9859970015864559

No 11
>PF08004 DUF1699: Protein of unknown function (DUF1699); InterPro: IPR012546 This family contains many archaeal proteins which have very conserved sequences.
Probab=30.95 E-value=12 Score=14.60 Aligned_cols=84 Identities=17% Similarity=0.243 Sum_probs=54.3

Q ss_pred ECCCCHHHHHHHHHHHHHCCCCEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEEEE---------ECCHHHHHHHHHH
Q ss_conf 2368989999999999865894999971455435789999999999974789808998---------2656679999999
Q T0641 65 TSLPSVESYADVFRSFVEQGFPVVCFTITTLFSGSYNSAINAKSLVLEDYPDANICVI---------DSKQNTVTQALLI 135 (296)
Q Consensus 65 TS~ps~~~~~~~~~~~~~~g~~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~Vi---------DS~~~s~~~g~lv 135 (296)
.=-||-.++..+.+.+.. -+++-++ ++.+.+...+.+|+++-. +|+.+ |-..-..--....
T Consensus 25 AFRpSNkDif~Lv~tCP~--ieviQlP-----~SY~~TvSksi~mfL~mq---~I~LieGDVWGHRKDinEYy~ip~~vi 94 (131)
T PF08004_consen 25 AFRPSNKDIFSLVETCPK--IEVIQLP-----KSYYRTVSKSIEMFLEMQ---GIQLIEGDVWGHRKDINEYYTIPSSVI 94 (131)
T ss_pred EECCCCHHHHHHHHHCCC--CEEEECC-----HHHHHHHHHHHHHHHHHH---CEEEEECCCCCCCCCCHHHCCCCHHHH
T ss_conf 546875039999863987--5488678-----899999999999999870---713663244223123330502479999

Q ss_pred HHHHHHHHCCCCHHHHHHHHHHH
Q ss_conf 99999997489999999999988
Q T0641 136 DQFVRMLEDGLSFEQAMSKLDAL 158 (296)
Q Consensus 136 ~~a~~l~~~G~s~~ei~~~l~~~ 158 (296)
.+..+|..+|.|.++|.+.+..-
T Consensus 95 ekI~el~~eG~s~e~i~eki~re 117 (131)
T PF08004_consen 95 EKIKELKSEGISNEEIAEKISRE 117 (131)
T ss_pred HHHHHHHHCCCCHHHHHHHHHHH
T ss_conf 99999997599889999998765

No 12
>PF08784 RPA_C: Replication protein A C terminal; InterPro: IPR014892 This protein corresponds to the C terminal of the single stranded DNA binding protein RPA (replication protein A). RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints. ; PDB: 1quq_A 2pqa_C 2pi2_C 1l1o_B 1z1d_A 2z6k_B 1dpu_A.
Probab=28.28 E-value=13 Score=14.31 Aligned_cols=43 Identities=16% Similarity=0.330 Sum_probs=33.6

Q ss_pred HHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHCCEEEEEECC
Q ss_conf 667999999999999997489999999999988741669999837
Q T0641 126 QNTVTQALLIDQFVRMLEDGLSFEQAMSKLDALMASARIFFTVGS 170 (296)
Q Consensus 126 ~~s~~~g~lv~~a~~l~~~G~s~~ei~~~l~~~~~~~~~~f~v~~ 170 (296)
+.....|.-+.+.++.+ +.+.++|.+.++++.+.-++|=++|+
T Consensus 59 ~~~~~eGv~v~~i~~~l--~~~~~~v~~a~~~L~~eG~IYsTiDd 101 (102)
T PF08784_consen 59 SPQSEEGVHVDEIAQKL--GMPENEVRKAIDELSDEGLIYSTIDD 101 (102)
T ss_dssp --------BHHHHHHHS--TS-HHHHHHHHHHHHH---EEESSST
T ss_pred CCCCCCCCCHHHHHHHH--CCCHHHHHHHHHHHHHCCEEECCCCC
T ss_conf 46887871899999996--93999999999999859847266689

No 13
>PF10079 DUF2317: Uncharacterized protein conserved in bacteria (DUF2317)
Probab=28.12 E-value=13 Score=14.29 Aligned_cols=229 Identities=17% Similarity=0.143 Sum_probs=108.9

Q ss_pred CCCCCCHHHHHHCCCEEEEEEEEECCEEEECCCCCCCHHHHHH------HHHHC----CCCCCEECCCCHHHHHHHHHHH
Q ss_conf 5789998999864987989999988947860787689899999------99835----7886232368989999999999
Q T0641 11 SACDLSKEYLEKHDVTIVPLSVSFDGETYYRDGVDITRDECYQ------RMVDD----PKLFPKTSLPSVESYADVFRSF 80 (296)
Q Consensus 11 St~dl~~~~~~~~~I~vvPl~I~~~g~~~y~D~~di~~e~~y~------~l~~~----~~~~pkTS~ps~~~~~~~~~~~ 80 (296)
|+..+..+..++++..|||+.-+- |+.+ |-.+|+.--++. ++.=. .+.......++++.+.+.++++
T Consensus 109 s~I~LAk~l~~~~~~~vVPVFWiA-~EDH--DfeEInh~~~~~~~~~~~k~~~~~~~~~~~~~~~~~l~~~~~~~~l~~~ 185 (542)
T PF10079_consen 109 SAINLAKELEEKLGYPVVPVFWIA-GEDH--DFEEINHTYVFGKGGKLKKIKWHPPDPKKASVAVGRLDTEDLREWLEQF 185 (542)
T ss_pred HHHHHHHHHHHHHCCCCEEEEEEE-CCCC--CHHHHHHEEECCCCCCEEEEECCCCCCCCCCCCCCCCCHHHHHHHHHHH
T ss_conf 999999999987189803589850-4766--7777501122156773379973688876687560157888999999999

Q ss_pred HHCCCC-EEEEEECCCC------CHHHHH--HHHHHHHHHHHCCCCEEEEEECCHHHHHHHHHHHHHHHHHHCCCCH-HH
Q ss_conf 865894-9999714554------357899--9999999997478980899826566799999999999999748999-99
Q T0641 81 VEQGFP-VVCFTITTLF------SGSYNS--AINAKSLVLEDYPDANICVIDSKQNTVTQALLIDQFVRMLEDGLSF-EQ 150 (296)
Q Consensus 81 ~~~g~~-vi~i~iSs~l------Sgty~~--a~~a~~~~~~~~~~~~i~ViDS~~~s~~~g~lv~~a~~l~~~G~s~-~e 150 (296)
.+..-+ -..-.+-.-+ |.|+.- +..+.+.+ .+.-+.++|+.....-. +.+-...+.++...++ ..
T Consensus 186 ~~~l~et~~~~~l~~~~~~~y~~~~tl~daFa~l~~~LF----~~~GLv~lD~~d~~lk~-l~~p~f~~~l~~~~~~~~~ 260 (542)
T PF10079_consen 186 FKELGETEHTEELLELLEEAYRESNTLADAFARLMNELF----GDYGLVLLDPDDPELKK-LEAPFFKRELEEHPEVSKA 260 (542)
T ss_pred HHHCCCCCCHHHHHHHHHHHHHCCCCHHHHHHHHHHHHH----HHCCEEEECCCCHHHHH-HHHHHHHHHHHHCHHHHHH
T ss_conf 986578862899999999998548999999999999986----44797997799989999-8699999999737889999

Q ss_pred HHHHHHHHHHCCEE---EEEEC--CHHHHHHCCCCHHHHHHHHHHHCCCEEEEEECCEEEEEEEECCHHHHHHHHHHHHH
Q ss_conf 99999988741669---99983--75887526974178999988754752899987868997751567999999999999
Q T0641 151 AMSKLDALMASARI---FFTVG--SLDYLKMGGRIGKVATAATGKLGVKPVIIMKDGDIGLGGIGRNRNKLKNSVLQVAK 225 (296)
Q Consensus 151 i~~~l~~~~~~~~~---~f~v~--~L~~L~kgGRis~~~~~ig~lL~IkPIl~~~~G~i~~~~k~R~~kka~~~~~~~~~ 225 (296)
+.+..+.+.+.-.. -.-+. +|-|+..|.|. -|..++|.....+-.+++. .+++.+.++
T Consensus 261 v~~~~~~l~~~Gy~~~iq~~p~~~nLFy~~dg~R~---------------~l~~~~~~F~~~~~~~~fs--~~ELl~~le 323 (542)
T PF10079_consen 261 VSEQQERLEELGYSPQIQVNPRAINLFYLDDGQRE---------------LLEYEGGVFVVKDGEIRFS--KEELLEELE 323 (542)
T ss_pred HHHHHHHHHHCCCCCCEEECCCCEEEEEECCCEEE---------------EEEEECCEEEECCCCEEEC--HHHHHHHHH
T ss_conf 99999999976999870217983478897298899---------------9897599899779865577--999999998

Q ss_pred HHHHCCCCC---------CEEEEEEECCCHHHHHHHH--HHHHHHCCCCC
Q ss_conf 997414898---------2699998449989999999--99998738622
Q T0641 226 KYLDENNKD---------NFIVSVGYGYDKEEGFEFM--KEVESTLDVKL 264 (296)
Q Consensus 226 ~~~~~~~~~---------~~~i~i~~~~~~e~~~~~~--~~l~~~~~~~~ 264 (296)
+.=+..... ...-..+|.+.|-+..-|. +.+-+.++..+
T Consensus 324 ~~PERFSpNVvlRPl~Qe~llPtlAyIgGPGEIaYwaeLk~vfe~~g~~m 373 (542)
T PF10079_consen 324 DHPERFSPNVVLRPLMQEWLLPTLAYIGGPGEIAYWAELKQVFEHFGIPM 373 (542)
T ss_pred HCCCCCCCCHHHHHHHHHHHCCCCCEECCCHHHHHHHHHHHHHHHHCCCC
T ss_conf 59144786000104657663365226468489999999999999928999

No 14
>PF03102 NeuB: NeuB family; InterPro: IPR013132 NeuB is the prokaryotic N-acetylneuraminic acid synthase (Neu5Ac). It catalyses the direct formation of Neu5Ac (the most common sialic acid) by condensation of phosphoenolpyruvate (PEP) and N-acetylmannosamine (ManNAc). This reaction has only been observed in prokaryotes; eukaryotes synthesise the 9-phosphate form, Neu5Ac-9-P, and utilise ManNAc-6-P instead of ManNAc. Such eukaryotic enzymes are not present in this family . This family also contains SpsE spore coat polysaccharide biosynthesis proteins.; GO: 0016051 carbohydrate biosynthetic process; PDB: 3g8r_B 1xuz_A 3cm4_A 2zdr_A 1xuu_A 1vli_A.
Probab=27.20 E-value=14 Score=14.18 Aligned_cols=26 Identities=15% Similarity=0.120 Sum_probs=16.2

Q ss_pred CCHHHHHHHHHHHHHCCCCEEEEEEC
Q ss_conf 89899999999998658949999714
Q T0641 68 PSVESYADVFRSFVEQGFPVVCFTIT 93 (296)
Q Consensus 68 ps~~~~~~~~~~~~~~g~~vi~i~iS 93 (296)
-+.+++.++++.+.+.|-+.++-+..
T Consensus 53 l~~e~~~~L~~~c~~~gi~f~sTpfd 78 (241)
T PF03102_consen 53 LPEEWHKELFEYCRELGIDFFSTPFD 78 (241)
T ss_dssp S-HHHHHHHHHHHHHTT-EEEEEE-S
T ss_pred CCHHHHHHHHHHHHHCCCCEEECCCC
T ss_conf 99999999999999859928978898

No 15
>PF10483 Hap2_elong: Histone acetylation protein 2
Probab=22.01 E-value=17 Score=13.55 Aligned_cols=91 Identities=15% Similarity=0.190 Sum_probs=60.7

Q ss_pred ECCCCHHHHHHHHHHHHHCCCCEEEEEECCCCCHHHHHHHHHHHHHHHHCCCCEEEEEECCHHHHHHHHHHHHHHHHHHC
Q ss_conf 23689899999999998658949999714554357899999999999747898089982656679999999999999974
Q T0641 65 TSLPSVESYADVFRSFVEQGFPVVCFTITTLFSGSYNSAINAKSLVLEDYPDANICVIDSKQNTVTQALLIDQFVRMLED 144 (296)
Q Consensus 65 TS~ps~~~~~~~~~~~~~~g~~vi~i~iSs~lSgty~~a~~a~~~~~~~~~~~~i~ViDS~~~s~~~g~lv~~a~~l~~~ 144 (296)
-.|++..-+.+++.+.......|++++...-=-. +|.+ ..+|++
T Consensus 21 l~q~a~~ll~e~i~~a~~~~~~V~~vsfEt~~~p--------------~~~d---~fi~~~------------------- 64 (280)
T PF10483_consen 21 LEQSARPLLREFIRRAKLRNEHVIFVSFETLNKP--------------EYVD---SFIDAR------------------- 64 (280)
T ss_pred CCCCCHHHHHHHHHHHHCCCCEEEEEEEECCCCC--------------CCCC---CHHHCC-------------------
T ss_conf 3454748999999987368984999987647886--------------5566---022045-------------------

Q ss_pred CCCHHHHHHHHHHH------HHCCEEEEEECCHHHHHHCCCCHHHHHHHHHHHC
Q ss_conf 89999999999988------7416699998375887526974178999988754
Q T0641 145 GLSFEQAMSKLDAL------MASARIFFTVGSLDYLKMGGRIGKVATAATGKLG 192 (296)
Q Consensus 145 G~s~~ei~~~l~~~------~~~~~~~f~v~~L~~L~kgGRis~~~~~ig~lL~ 192 (296)
+++..+|.+.+... .......+++|+|.+|.+.-.. .+..++++++.
T Consensus 65 ~~s~~~i~~~i~~~~~~~~~~~~~~~lViIDSln~ll~~~~~-~l~~fLssl~~ 117 (280)
T PF10483_consen 65 SKSPADIIKEIKSALPSSQNKSKKRFLVIIDSLNPLLRHSPT-QLSSFLSSLLS 117 (280)
T ss_pred CCCHHHHHHHHHHHCCCCCCCCCCCEEEEEECCCHHHHHHHH-HHHHHHHHCCC
T ss_conf 799899999998731555677787538999638677752078-99999985067

No 16
>PF03796 DnaB_C: DnaB-like helicase C terminal domain; InterPro: IPR007694 The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis. ; GO: 0003678 DNA helicase activity, 0005524 ATP binding, 0006260 DNA replication; PDB: 2q6t_B 1q57_E 1e0j_F 1cr1_A 1e0k_F 1cr0_A 1cr2_A 1cr4_A 3bgw_E 3bh0_A ....
Probab=21.07 E-value=6.4 Score=16.35 Aligned_cols=20 Identities=15% Similarity=0.155 Sum_probs=9.1

Q ss_pred CHHHHHHHHHHHHHHHHHCC
Q ss_conf 67999999999999997414
Q T0641 212 NRNKLKNSVLQVAKKYLDEN 231 (296)
Q Consensus 212 ~~kka~~~~~~~~~~~~~~~ 231 (296)
.....+..+...++..+++.
T Consensus 150 ~~~~~i~~i~~~Lk~lA~~~ 169 (185)
T PF03796_consen 150 DRREEIGEISRRLKRLAKEL 169 (185)
T ss_dssp TCHHHHHHHHHHHHHHHHHH
T ss_pred CHHHHHHHHHHHHHHHHHHH
T ss_conf 89999999999999999982

Done!