Query         537021.9.peg.1079_1
Match_columns 251
No_of_seqs    131 out of 760
Neff          9.0
Searched_HMMs 39220
Date          Wed May 25 08:30:58 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i peg_1079.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 pfam04466 Terminase_3 Phage te 100.0 2.4E-42       0  248.0  21.8  239     1-250  85-340 (387)
  2 pfam03237 Terminase_6 Terminas 100.0 3.7E-34 9.5E-39  203.8  20.4  245     1-251  78-337 (380)
  3 COG1783 XtmB Phage terminase l  99.9 1.1E-22 2.9E-27  141.7  11.6  230     1-242  108-353 (414)
  4 TIGR01547 phage_term_2 phage t  99.7 3.4E-16 8.7E-21  106.8   8.9  229    16-250  120-409 (462)
  5 COG5565 Bacteriophage terminas  99.1 3.8E-11 9.6E-16   79.5   2.1   71      1-71  7-77 (79)
  6 COG5323 Uncharacterized conser  98.9 4.7E-09 1.2E-13   68.2   8.2  233     2-249  117-360 (410)
  7 pfam03354 Terminase_1 Phage Te  98.0 0.00069 1.8E-08   40.3  16.1  236     2-249  110-420 (473)
  8 COG4626 Phage terminase-like p  97.0   0.021 5.5E-07   32.2  13.4  235     5-249  174-479 (546)
  9 COG4373 Mu-like prophage FluMu  94.7    0.28 7.1E-06   26.2  10.2  156     2-158  122-339 (509)
 10 pfam02562 PhoH PhoH-like prote  94.2    0.15 3.9E-06   27.6   5.9   40   159-200  120-159 (205)
 11 PRK10536 hypothetical protein;  89.6     1.2 3.2E-05   22.7   5.9   41   159-201  177-217 (262)
 12 pfam12138 Spherulin4 Spherulat  82.1     2.8 7.1E-05   20.8   4.5   26     29-54  17-43 (243)
 13 pfam07652 Flavi_DEAD Flaviviru  71.8     5.7 0.00015   19.1   3.7   14     43-56  127-140 (146)
 14 pfam05876 Terminase_GpA Phage   60.6      12 0.00032   17.3   4.9  163     7-173  126-377 (552)
 15 pfam00176 SNF2_N SNF2 family N  58.5      14 0.00035   17.1   4.0   10     15-24  16-25 (295)
 16 COG1875 NYN ribonuclease and A  58.4      10 0.00026   17.7   2.9   13     39-51  76-88 (436)
 17 COG5362 Phage-related terminas  56.2      15 0.00038   16.9   4.2  107   134-250  43-155 (202)
 18 TIGR02529 EutJ ethanolamine ut  48.5      20 0.00051   16.2   4.3   30   136-169  110-139 (240)
 19 PRK11678 putative chaperone; P  40.2      27  0.0007   15.5   3.0   18   134-151  209-227 (450)
 20 TIGR02036 dsdC D-serine deamin  37.3      16  0.0004   16.8   1.1   41    68-108  92-136 (302)
 21 pfam04312 DUF460 Protein of un  36.4      32 0.00081   15.1   9.5   91   132-241  30-121 (138)
 22 COG3453 Uncharacterized protei  35.0      33 0.00085   15.0   5.5   82   161-245  4-85 (130)
 23 COG4098 comFA Superfamily II D  33.9      35 0.00089   14.9   4.1   39     14-52  201-245 (441)
 24 COG2410 Predicted nuclease (RN  33.6      35  0.0009   14.9   5.0   31   135-169  2-32 (178)
 25 COG3877 Uncharacterized protei  29.4      42  0.0011   14.5   3.0   38   205-250  76-113 (122)
 26 TIGR00108 eRF peptide chain re  28.4      44  0.0011   14.4   3.1  120    34-157  18-152 (425)
 27 KOG1015 consensus               28.3      44  0.0011   14.4   3.1   48      4-52  810-861 (1567)
 28 pfam04273 DUF442 Putative phos  28.0      44  0.0011   14.3   5.4   80   163-245  5-84 (110)
 29 PRK09401 reverse gyrase; Revie  24.9      51  0.0013   14.0   3.3   15      9-23  193-207 (1176)
 30 PRK05183 hscA chaperone protei  24.2      52  0.0013   13.9   2.9   11   135-145  203-213 (621)
 31 TIGR02350 prok_dnaK chaperone   24.0      53  0.0013   13.9   2.6   34    72-108  97-131 (598)
 32 PRK01433 hscA chaperone protei  22.9      56  0.0014   13.8   4.1   11   135-145  194-204 (595)
 33 PTZ00271 hypoxanthine-guanine   22.7      56  0.0014   13.8   5.9  106    80-186  26-146 (211)
 34 TIGR00956 3a01205 Pleiotropic   20.1      64  0.0016   13.5   2.7   25   196-220  314-338 (1466)

No 1
>pfam04466 Terminase_3 Phage terminase large subunit. Initiation of packaging of double-stranded viral DNA involves the specific interaction of the prohead with viral DNA in a process mediated by a phage-encoded terminase protein. The terminase enzymes are usually hetero-oligomers composed of a small and a large subunit. This region is found on the large subunit and possess an endonuclease and ATPase activity that require Mg2+ and a neutral or slightly basic reaction. This region is also found in bacterial sequences.
Probab=100.00 E-value=2.4e-42 Score=248.03 Aligned_cols=239 Identities=13% Similarity=0.086 Sum_probs=188.5 Q ss_pred CCCCCCCHHHHCCCC-CCEEEEECC--CCHHHHHHHHHHHCC--CCCEEEEEECCCCCCCHHHHHHCCCC----CCCEEE Q ss_conf 974336965503331-358998189--897499886311247--58739999248999875655640367----997599 Q 537021.9.peg.1 1 LKAYEQGRDKWQSNT-VHYVWFDEE--PPEDVYFEGLTRINA--TQGLVTLTLTPLKGRSPIIEHYLSAS----SSDRQV 71 (251) Q Consensus 1 f~~~~q~~~~~~G~~-~~~i~~DE~--~~~~~~~~~~~r~~~--~~g~i~~~~nP~~~~~~~~~~~~~~~----~~~~~~ 71 (251) |++.+ +++++.|.+ ++.+|+||+ .+.+.|++++.|++. +++.+++++||.++.||+|++|+... .+++.+ T Consensus 85 f~G~d-~~~~iks~~~i~~~~~eEa~~~~~~~~~~l~~~~r~~~~~~~i~~~~NP~~~~~w~~~~~~~~~~~~~~~~~~~ 163 (387) T pfam04466 85 FYGMD-DPAKIKSIKDVSDAWIEEAAEFKTEDFDQLIPTIRRPKPGSEIFMSFNPVNKLNWTYKRFFKNDKSELDDDTYI 163 (387) T ss_pred EEECC-CHHHHHCCCCCEEEEEECHHHCCHHHHHHHHHHHCCCCCCEEEEEECCCCCCCCHHHHHHHCCCCCCCCCCEEE T ss_conf 98578-96884163661499994124479989999998853178871999982899987748899741775467788599 Q ss_pred EEEEHHHCCCCCHHHHHHHHHC--CCHHHHHHHHCCCCEEECCCEEECCCCEEECCCCCCCCCCEEEEEECCCC-CCCCE Q ss_conf 9977133466889999999870--89735643300510010143000012034315654656743575200377-48719 Q 537021.9.peg.1 72 IRMTINETPHYNEQERKRIIDS--YPLHEREARTKGEPILGSGRIFPIVEEDIVINSLDIPEHWVQIGGMDFGW-HHPFA 148 (251) Q Consensus 72 ~~~t~~DNp~l~~~~~~~~~~~--~~~~~~~~~~~G~~~~~~g~v~~~~~~~~~~~~~~~~~~~~~~~g~D~G~-~~p~a 148 (251) +|+|+.||||||++++++++++ .++..|+++++|+|...+|+||+.|+...+....+.......++|+|||| ++|+| T Consensus 164 ~~~ty~DNp~L~~~~i~~~e~~k~~dp~~y~~~~lGe~~~~~g~V~~~~~~~~~~~~~~~~~~~~~~~G~D~G~~~dpta 243 (387) T pfam04466 164 HHSTYRDNPFLPEVDIREIEELKRRNPDYYRIEYLGEFGGLGTLVLPNFEIKPLWVEAAEDAHIKLGFKRDFGFDESATA 243 (387) T ss_pred EEEEECCCCCCCHHHHHHHHHHHHCCHHHHHHHHCCCEEECCCEEEECCEEEECCCCCHHHHHCCCCCCEECCCCCCCCE T ss_conf 99996169989999999999987039999899775715634877973544555232640220003342135464378877 Q ss_pred EEEEEEECCCCEEEEEEEHHHC--CCCHHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHHHHCCCEEEECCCCCC Q ss_conf 
9999997739989999431025--99879999999833248708972630011178888889999978974776577777 Q 537021.9.peg.1 149 AGHLVWNRDSDVIYVVKNYRCR--EQTPIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQYRRQGMKMLPECATFD 226 (251) Q Consensus 149 ~~~~~~~~~~~~~~i~de~~~~--~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~~~~~~~~~~~ 226 (251) +++++++..++.+|+++|+|.. +.++...++.++........ +..|++.++++.++-+..++++.+ ++ T Consensus 244 ~v~~~~~~~~~~lyi~~e~y~~~~~~~~~~~~~~~~~~~~~~~~------i~~dsa~p~~i~~~~~~~~~~~~~----~~ 313 (387) T pfam04466 244 SVRGALDVKKKVIYLYDEYYDPANQAIDDRTAEVLRDKNYTKEA------IKADSAEAKSIAALKSPGIFKIVG----AK 313 (387) T ss_pred EEEEEEECCCCEEEEEEEEECCCCCCCCHHHHHHHHHCCCCCCE------EECCCCCCCHHHHHHHHCCHHHHC----CC T ss_conf 99999977998999999998467876608999998863766650------223676852567998715434412----30 Q ss_pred CCCCCHHHHHHHHHH---HHCCCCCEE Q ss_conf 677758998999999---860799078 Q 537021.9.peg.1 227 DGSNGVEAGISDMLD---RMRSGRWKV 250 (251) Q Consensus 227 kg~~sV~~GI~~l~~---~~~~grl~V 250 (251) ||++||++||+.|+. .+..|++.| T Consensus 314 k~~~sv~~gi~~l~~~~~i~~~~~~~~ 340 (387) T pfam04466 314 KGPGSVLQKTRFLDTFRAVLIGEYLDP 340 (387) T ss_pred CCCCCHHHHHHHHHHHHHHHHCCCCEE T ss_conf 489727888889999999960576200 No 2 >pfam03237 Terminase_6 Terminase-like family. This family represents a group of terminase proteins. Probab=100.00 E-value=3.7e-34 Score=203.81 Aligned_cols=245 Identities=22% Similarity=0.225 Sum_probs=190.0 Q ss_pred CCCCC--CCHHHHCCCCCCEEEEECC--CCHHHHHHHHHHHCCC--CCEEEEEECCCCCCCHHHHHHCCCCCC------C Q ss_conf 97433--6965503331358998189--8974998863112475--873999924899987565564036799------7 Q 537021.9.peg.1 1 LKAYE--QGRDKWQSNTVHYVWFDEE--PPEDVYFEGLTRINAT--QGLVTLTLTPLKGRSPIIEHYLSASSS------D 68 (251) Q Consensus 1 f~~~~--q~~~~~~G~~~~~i~~DE~--~~~~~~~~~~~r~~~~--~g~i~~~~nP~~~~~~~~~~~~~~~~~------~ 68 (251) |++.+ .+++++||.+++++|+||+ .+.+.+.++++|++.. ..+.++.+||+++.||+|+.|.....+ . 
T Consensus 78 ~~~~~~~~~~~~~rG~~~~~i~~DE~a~~~~~~~~~~~~~~~~~~~~~~~~~~stp~~~~~~~~~~~~~~~~~~~~~~~~ 157 (380) T pfam03237 78 FLGLESETTAQGYRGASIAGIYFDEATWLPKFQESELVRRLRATKGKWRKTFFSTPPSPGHWVYDFWTGWLDDKGKRTFI 157 (380) T ss_pred EEECCCCCCHHHCCCCCCCEEEEEEHHHCCCHHHHHHHHHHHCCCCCCCEEEEECCCCCCCCHHHHHHHHCCCCCCCCEE T ss_conf 96257766431034854554998304536627899998644104799757999889899851989985540167775202 Q ss_pred EEEEEEEHHHCCCCCHHHHHHHHHCCCHHHHHHHHCCCCEEECCCEEECCCC-EEECCCCCCCCCCEEEEEECCCCC--C Q ss_conf 5999977133466889999999870897356433005100101430000120-343156546567435752003774--8 Q 537021.9.peg.1 69 RQVIRMTINETPHYNEQERKRIIDSYPLHEREARTKGEPILGSGRIFPIVEE-DIVINSLDIPEHWVQIGGMDFGWH--H 145 (251) Q Consensus 69 ~~~~~~t~~DNp~l~~~~~~~~~~~~~~~~~~~~~~G~~~~~~g~v~~~~~~-~~~~~~~~~~~~~~~~~g~D~G~~--~ 145 (251) +..+++|+.|||++++++++++++.+++..++|+++|+|...+|++|+.+.. .+.+....+|..+.+++|+|+|.+ + T Consensus 158 ~~~~~~t~~d~~~~~~~~~e~l~~~~~~~~~~qe~~g~f~~~~g~if~~~~~~~~~~~~~~~p~~~~~~~g~D~~~~~~~ 237 (380) T pfam03237 158 PADVEVTIEDARALGPEYKEELRALYSDEEFARLLMGEWVDTSGSIFKRFELERCDVDEERPPEHREVIGGVDPAASRGG 237 (380) T ss_pred EEEEECCHHHHCCCCHHHHHHHHHHCCHHHHHHHHCCCEECCCCCEECHHHHHHCCCCCCCCCCCCEEEEEECCCCCCCC T ss_conf 20674566871326889999998768999999985886665788571478885464675568888559999867888889 Q ss_pred CCEEEEEEEECCCCEEEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHHHHCCCEEEECCCCC Q ss_conf 71999999977399899994310259987999999983324870897263001117888888999997897477657777 Q 537021.9.peg.1 146 PFAAGHLVWNRDSDVIYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQYRRQGMKMLPECATF 225 (251) Q Consensus 146 p~a~~~~~~~~~~~~~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~~~~~~~~~~ 225 (251) +++++++..+.+++.+|+++++..++.++.++++.++++........ ..+..++.|.++++.+++.+-.. .... 
T Consensus 238 D~t~~~v~~~~~~~~~~~~~~~~~~~~~~~~~a~~i~~~~~~~~~~~---i~id~~~~G~~~~~~l~~~~~~~---~~~~ 311 (380) T pfam03237 238 DYAALVVIAEVDGDKFYVLAREHERGLSPAEQAAIIKKLAERYNVIY---IYIDDTGGGESVAQLLRRELPGA---AFTV 311 (380) T ss_pred CCEEEEEEEECCCCEEEEEEEEEECCCCHHHHHHHHHHHHHHCCCEE---EEEECCCCCCHHHHHHHHHCCCC---CCCE T ss_conf 96699999965798599999998368899999999999986438759---99837864307999999866446---7110 Q ss_pred CCCCCCHHHHHHHHHHHHCCCCCEEC Q ss_conf 76777589989999998607990789 Q 537021.9.peg.1 226 DDGSNGVEAGISDMLDRMRSGRWKVF 251 (251) Q Consensus 226 ~kg~~sV~~GI~~l~~~~~~grl~Vf 251 (251) ....+++.+||..|+.+|++||++++ T Consensus 312 ~~a~~~k~~~~~~v~~l~e~g~v~i~ 337 (380) T pfam03237 312 RPAPKGKNARVLKVSDLIESGRLKVP 337 (380) T ss_pred EECCCHHHHHHHHHHHHHHCCCEEEE T ss_conf 58875199999999999988989994 No 3 >COG1783 XtmB Phage terminase large subunit [General function prediction only] Probab=99.89 E-value=1.1e-22 Score=141.75 Aligned_cols=230 Identities=16% Similarity=0.176 Sum_probs=174.3 Q ss_pred CCCCCCCHHHHCCCCC---CEEEEECCCC--HHHHHHHHHHHC--CCCCEEEEEECCCCCCCHHHHHHCC-----CCCCC Q ss_conf 9743369655033313---5899818989--749988631124--7587399992489998756556403-----67997 Q 537021.9.peg.1 1 LKAYEQGRDKWQSNTV---HYVWFDEEPP--EDVYFEGLTRIN--ATQGLVTLTLTPLKGRSPIIEHYLS-----ASSSD 68 (251) Q Consensus 1 f~~~~q~~~~~~G~~~---~~i~~DE~~~--~~~~~~~~~r~~--~~~g~i~~~~nP~~~~~~~~~~~~~-----~~~~~ 68 (251) |+..+ +|.+|-+-+. ..+|++|+.. .+.+.+++-+|+ .-.+++.++.||.+-.+|.|+.|+. 
+..++ T Consensus 108 F~G~d-dp~klKSi~~~~~s~~WfEE~~e~s~e~~~e~l~~l~~~~~~~~~~~~snpv~~~pw~~~~w~~~~~Dek~~~d 186 (414) T COG1783 108 FKGLD-DPAKLKSIAVNWISDLWFEEASEFSYEDDIELLVELRRRELKGHIILSSNPVSFNPWTYKHWLEFAVDEKKKAD 186 (414) T ss_pred EECCC-CHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEECCCCCCCCCCHHHHHHHHHCCCCCCC T ss_conf 94589-87885230100433555787763303456888777641223777246326434588538888888744355786 Q ss_pred EEEEEEEHHHCCCCCHHHHHHHHH--CCCHHHHHHHHCCCCEEECCCEEECCCCEEEC-CCCCCCCCCEEEEEECCCCC- Q ss_conf 599997713346688999999987--08973564330051001014300001203431-56546567435752003774- Q 537021.9.peg.1 69 RQVIRMTINETPHYNEQERKRIID--SYPLHEREARTKGEPILGSGRIFPIVEEDIVI-NSLDIPEHWVQIGGMDFGWH- 144 (251) Q Consensus 69 ~~~~~~t~~DNp~l~~~~~~~~~~--~~~~~~~~~~~~G~~~~~~g~v~~~~~~~~~~-~~~~~~~~~~~~~g~D~G~~- 144 (251) .++.|+|+.||+||+.+++++.+. .++++.++....|+|.+.+|.|++.+...-+. -.+.+..--..-.|+|||+. T Consensus 187 t~~hhtT~~dn~fL~~~~v~~~ed~k~~d~d~yri~~~gev~v~~~~v~~~~e~~~~d~v~~~i~~i~~~s~gm~~Gf~~ 266 (414) T COG1783 187 TYIHHTTYRDNLFLGFDDVDELEDLKKNDPDLYRIVRDGEVGVKNGDVFDQFEVKPFDAVKFAIDNISRPSTGMDFGFTA 266 (414) T ss_pred EEEEEEECCCCCCCCHHHHHHHHHHHHCCCCCCEEEEEEEEEECCCEECCHHHCCCHHHHHHHHHHHCCCCCCCEEEEEE T ss_conf 68886201356667778999998764128541238997789860752546321577288776677615565540242574 Q ss_pred CCCEEEEEEEECCCCEEEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHHHHCCCEEEECCCC Q ss_conf 87199999997739989999431025998799999998332487089726300111788888899999789747765777 Q 537021.9.peg.1 145 HPFAAGHLVWNRDSDVIYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQYRRQGMKMLPECAT 224 (251) Q Consensus 145 ~p~a~~~~~~~~~~~~~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~~~~~~~~~ 224 (251) +++++++.+++.....++++-|||.++++....++.+++...... 
.+..+++.+++ ++.+++.|+.+.++ T Consensus 267 ~~n~l~~~~v~~~~k~l~~~~ey~~~~~~~d~~~~~i~e~n~~k~------~~~~dsAe~kl-i~yfk~~G~~~v~a--- 336 (414) T COG1783 267 KFNRLLKLAVDPGKKYLYIYVEYYANKMLDDKKTKDISEFNKTKS------VIALDSAEPKL-IQYFKDVGVGMVYA--- 336 (414) T ss_pred CCCEEEEEEEECCCCEEEEEEECHHHHHHHHHHHHHHHHHHHCCE------EEEECCCCHHH-HHHHHHCCCCCCCC--- T ss_conf 075778888704654047986112244343567899987541122------79704634778-89998628651135--- Q ss_pred CCCCCCCHHHHHHHHHHH Q ss_conf 776777589989999998 Q 537021.9.peg.1 225 FDDGSNGVEAGISDMLDR 242 (251) Q Consensus 225 ~~kg~~sV~~GI~~l~~~ 242 (251) +|.++|.+.+|..|+.. T Consensus 337 -~k~~~s~lq~~k~l~~f 353 (414) T COG1783 337 -KKFKGSRLQKIKKLKRF 353 (414) T ss_pred -CCCCCCHHHHHHHHHHH T ss_conf -56763077878988851 No 4 >TIGR01547 phage_term_2 phage terminase, large subunit, PBSX family; InterPro: IPR006437 This group of sequences represent a highly divergent family of the large subunit of phage terminase. All members are encoded by phage genomes or within prophage regions of bacterial genomes. This is a distinct family from the phage terminase family represented by IPR005021 from INTERPRO.; GO: 0006323 DNA packaging. Probab=99.67 E-value=3.4e-16 Score=106.78 Aligned_cols=229 Identities=15% Similarity=0.148 Sum_probs=149.1 Q ss_pred CCEEEEECC--CCHHHHHHHHHHHCCCCCE---EEEEECCCCCCCHHHHHHCCC---CCCCEE---------------EE Q ss_conf 358998189--8974998863112475873---999924899987565564036---799759---------------99 Q 537021.9.peg.1 16 VHYVWFDEE--PPEDVYFEGLTRINATQGL---VTLTLTPLKGRSPIIEHYLSA---SSSDRQ---------------VI 72 (251) Q Consensus 16 ~~~i~~DE~--~~~~~~~~~~~r~~~~~g~---i~~~~nP~~~~~~~~~~~~~~---~~~~~~---------------~~ 72 (251) +..+|++|+ .+.+.+.++..+++.+++. +++.+||.++.||+++.|... ...... 
.+ T Consensus 120 ~~~~~~ee~~~~~~~~~~~~~~~~r~~~~~~~~~~~~~nP~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 199 (462) T TIGR01547 120 LADLWFEEASELTKEDIKELIPRLREPGGKNKFIIFSSNPESPLHWVYKDFIENLGEDEFRICDKPPYEYGVVDKKLYIL 199 (462) T ss_pred HHHHHHHHHHHCCHHHHHHHHHHHCCCCCCEEEEEEEECCCCCCCHHHHHHHHCCCCCCHHHHCCCCHHHHHEEECEEEE T ss_conf 56565554443014468888876405777506999840889855305666543047760122101101322000000256 Q ss_pred EEEHHHCCCCC----HHHHHHHHHCC--CHHHHHHHHCCCCEE-------------ECCCEEECCCCEEECCCCCCCCCC Q ss_conf 97713346688----99999998708--973564330051001-------------014300001203431565465674 Q 537021.9.peg.1 73 RMTINETPHYN----EQERKRIIDSY--PLHEREARTKGEPIL-------------GSGRIFPIVEEDIVINSLDIPEHW 133 (251) Q Consensus 73 ~~t~~DNp~l~----~~~~~~~~~~~--~~~~~~~~~~G~~~~-------------~~g~v~~~~~~~~~~~~~~~~~~~ 133 (251) ++|+.|||+|+ +..+++++..+ .+..++..++|+|.. .++.++..+............... T Consensus 200 ~~~~~dn~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~G~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 279 (462) T TIGR01547 200 HSTYRDNPFLSGGDVEEYIQELEKLKDRDPALYRRILLGEWVSAADVWSLGLPVLTSEGILFKKLDVKAAYIKELPNDPS 279 (462) T ss_pred EEECCCCCCCCCHHHHHHHHHHHHHHCCCHHHHHHHHHCCCCHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHCCCCC T ss_conf 74203465676115899999999863156134334442351012233220444434441112111002212343123210 Q ss_pred EEEEEECCCCC--C-CCEEEEEEEECCCCE---E-EEEEEHHHCCCC-----HHHHHHHHHHCCCCCEEEECCCCCEECC Q ss_conf 35752003774--8-719999999773998---9-999431025998-----7999999983324870897263001117 Q 537021.9.peg.1 134 VQIGGMDFGWH--H-PFAAGHLVWNRDSDV---I-YVVKNYRCREQT-----PIFHVAALKSWGKWLPWAWPHDGLQHDK 201 (251) Q Consensus 134 ~~~~g~D~G~~--~-p~a~~~~~~~~~~~~---~-~i~de~~~~~~t-----~~~~~~~i~~~~~~~~~~~~~~~~~~~~ 201 (251) ..+.|+|+|.. + +++++++........ + |+..+++....+ ...+...+++........ .... 
T Consensus 280 ~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------~~~~ 353 (462) T TIGR01547 280 DFVGGIDAGGVDGNSPSAYVLLGIEHGKKYYDGLEYLAEYYYSNALEQVKDAVLEYANELKQFVGVKEGI------YADN 353 (462) T ss_pred HHEEEECCCCCCCCCCEEEEEEEEECCCHHHHHHHHHHHEEECCCHHHHHHHHHHHHHHHHHHHHHHHHH------CCCC T ss_conf 0013201476666530367887200000122211221010004630234554556555566566544321------0255 Q ss_pred CCCCCHHHHHH------HCCC-EEEECCCCCCCCCCCHHHHHHHHHHHHCCCCCEE Q ss_conf 88888899999------7897-4776577777677758998999999860799078 Q 537021.9.peg.1 202 RSGEQLSAQYR------RQGM-KMLPECATFDDGSNGVEAGISDMLDRMRSGRWKV 250 (251) Q Consensus 202 ~~~~s~~e~~r------~~G~-~~~~~~~~~~kg~~sV~~GI~~l~~~~~~grl~V 250 (251) +...+.....+ +.+. .........+..+..+..+|..++.++.+++++. T Consensus 354 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 409 (462) T TIGR01547 354 GDLKTFSDFLRLWLPGLEHGYNYFDVGAKKAPGAKLAVLDGIEVFSDLLAEGKLKF 409 (462) T ss_pred CHHHHHHHHHHHHCCCHHHCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCHHHH T ss_conf 21455544443201111210463102210000111234556777787751010222 No 5 >COG5565 Bacteriophage terminase large (ATPase) subunit and inactivated derivatives [General function prediction only] Probab=99.08 E-value=3.8e-11 Score=79.54 Aligned_cols=71 Identities=59% Similarity=1.137 Sum_probs=65.6 Q ss_pred CCCCCCCHHHHCCCCCCEEEEECCCCHHHHHHHHHHHCCCCCEEEEEECCCCCCCHHHHHHCCCCCCCEEE Q ss_conf 97433696550333135899818989749988631124758739999248999875655640367997599 Q 537021.9.peg.1 1 LKAYEQGRDKWQSNTVHYVWFDEEPPEDVYFEGLTRINATQGLVTLTLTPLKGRSPIIEHYLSASSSDRQV 71 (251) Q Consensus 1 f~~~~q~~~~~~G~~~~~i~~DE~~~~~~~~~~~~r~~~~~g~i~~~~nP~~~~~~~~~~~~~~~~~~~~~ 71 (251) |||+||+++|.|+.+.|++|+||.++.++|.+-+.|+....|-+++++||..+.+-+..+|.....|+..+ T Consensus 7 fksfeqgr~kwq~~~v~y~wfdeqpp~dvy~eGiTrtnrt~g~~~vtftPlkg~s~vva~fl~an~pdR~v 77 (79) T COG5565 7 FKSFEQGREKWQARTVDYVWFDEQPPEDVYFEGITRTNRTSGITIVTFTPLKGMSRVVARFLAANTPDRAV 77 (79) T ss_pred HHHHHHHHHHHHCCCCCCCCCCCCCHHHHHHCCCEEECCCCCEEEEEECCCCCHHHHHHHHHHCCCCCCCC T ss_conf 
23698877876067567776465873776650310202445417899532210789999998707975445 No 6 >COG5323 Uncharacterized conserved protein [Function unknown] Probab=98.94 E-value=4.7e-09 Score=68.20 Aligned_cols=233 Identities=14% Similarity=0.127 Sum_probs=138.8 Q ss_pred CCCCCCHHHHCCCCCCEEEEECC----CCHHHHHHHHHHHC-CCCCEEEEEECCCCCCCHHHHHHCCCCCCCEEEEEEEH Q ss_conf 74336965503331358998189----89749988631124-75873999924899987565564036799759999771 Q 537021.9.peg.1 2 KAYEQGRDKWQSNTVHYVWFDEE----PPEDVYFEGLTRIN-ATQGLVTLTLTPLKGRSPIIEHYLSASSSDRQVIRMTI 76 (251) Q Consensus 2 ~~~~q~~~~~~G~~~~~i~~DE~----~~~~~~~~~~~r~~-~~~g~i~~~~nP~~~~~~~~~~~~~~~~~~~~~~~~t~ 76 (251) +|.| ++++|||+.+|.+|.||. -+++.|.++...++ ....+..+|+||. .++..+....+ |...+-++.+ T Consensus 117 FSsE-DPeSLRGPQFh~AW~DEl~kWk~PqETw~MLqFGLRLGe~PRqvVTTTPr--p~plLKaLl~d--ptv~~trm~T 191 (410) T COG5323 117 FSSE-DPDSLRGPQFHLAWTDELLKWKEPQETWAMLQFGLRLGEDPRQVVTTTPR--PIPLLKALLAD--PTVALTRMGT 191 (410) T ss_pred ECCC-CHHHCCCCCCCHHHHHHHHCCCCHHHHHHHHHHHHHHCCCCHHHEECCCC--CCHHHHHHHCC--CCHHHHHCCC T ss_conf 0358-80320485200677788744798488999998766516784010205998--62789988428--6125450543 Q ss_pred HHC-CCCCHHHHHHHHHCCC-HHHHHHHHCCCCEEECCCEEECCCCEEECCCCCCCCCCEEEEEECCCC--C-CCCEEEE Q ss_conf 334-6688999999987089-735643300510010143000012034315654656743575200377--4-8719999 Q 537021.9.peg.1 77 NET-PHYNEQERKRIIDSYP-LHEREARTKGEPILGSGRIFPIVEEDIVINSLDIPEHWVQIGGMDFGW--H-HPFAAGH 151 (251) Q Consensus 77 ~DN-p~l~~~~~~~~~~~~~-~~~~~~~~~G~~~~~~g~v~~~~~~~~~~~~~~~~~~~~~~~g~D~G~--~-~p~a~~~ 151 (251) ..| ..|.+.+++.+-+.|. ...-+|++.|+-.--+|+.+...+-...... ..++...+.+.+|.-- . 
+...+++ T Consensus 192 a~NAgNLapgFl~t~a~rYgGTRLgrQEldGelvee~GaLw~r~dle~c~ea-~p~pL~RiVVAVDPPA~~g~~sCGIVV 270 (410) T COG5323 192 AANAGNLAPGFLRTLASRYGGTRLGRQELDGELVEEDGALWRREDLERCREA-RPAPLDRIVVAVDPPATAGGDSCGIVV 270 (410) T ss_pred CCCCCCCCHHHHHHHHHHHCCCCHHHHHCCCEEECCCCCCCCHHHHHHHHHC-CCCCCCEEEEEECCCCCCCCCCEEEEE T ss_conf 2244565779999999984544113555287786367730028789999861-898702379972698767787511699 Q ss_pred EEEECCCCEEEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEEC-CCCCCCHHHHHHHCCCEEEECCCCCCCCCC Q ss_conf 9997739989999431025998799999998332487089726300111-788888899999789747765777776777 Q 537021.9.peg.1 152 LVWNRDSDVIYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHD-KRSGEQLSAQYRRQGMKMLPECATFDDGSN 230 (251) Q Consensus 152 ~~~~~~~~~~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~-~~~~~s~~e~~r~~G~~~~~~~~~~~kg~~ 230 (251) ... .++ +.+|+-.-...+.++..++......... ...|..+.+ +..|.-....+....-++.-....+..||- T Consensus 271 aG~-~~g-r~~VLADes~~g~~PagWArravAa~r~----heADa~VAEvNQGGeMVravLa~~Dp~~pv~lvRASrGK~ 344 (410) T COG5323 271 AGR-RDG-RAFVLADESARGLSPAGWARRAVAAARA----HEADALVAEVNQGGEMVRAVLAQADPPCPVKLVRASRGKR 344 (410) T ss_pred EEE-ECC-CEEEEECCCCCCCCCCHHHHHHHHHHHH----HHHHHHHHHHHCCCHHHHHHHHHCCCCCCEEEEECCCCCH T ss_conf 885-058-1699612302478821268999999875----1134788877134179999996049998627666145631 Q ss_pred CHHHHHHHHHHHHCCCCCE Q ss_conf 5899899999986079907 Q 537021.9.peg.1 231 GVEAGISDMLDRMRSGRWK 249 (251) Q Consensus 231 sV~~GI~~l~~~~~~grl~ 249 (251) . .-.-+..+.++||+. T Consensus 345 ~---RAEPVAALYEQGRVr 360 (410) T COG5323 345 A---RAEPVAALYEQGRVR 360 (410) T ss_pred H---CCCHHHHHHHCCCEE T ss_conf 0---136169988556300 No 7 >pfam03354 Terminase_1 Phage Terminase. The majority of the members of this family are bacteriophage proteins, several of which are thought to be terminase large subunit proteins. There are also a number of bacterial proteins of unknown function. 
Probab=98.05 E-value=0.00069 Score=40.30 Aligned_cols=236 Identities=12% Similarity=0.079 Sum_probs=118.4 Q ss_pred CCCCCCHHHHCCCCCCEEEEECCC---CHHHHHHHHHHHCCCCCE-EEEEECCC-CCCCHHHHHHC-----CCC-----C Q ss_conf 743369655033313589981898---974998863112475873-99992489-99875655640-----367-----9 Q 537021.9.peg.1 2 KAYEQGRDKWQSNTVHYVWFDEEP---PEDVYFEGLTRINATQGL-VTLTLTPL-KGRSPIIEHYL-----SAS-----S 66 (251) Q Consensus 2 ~~~~q~~~~~~G~~~~~i~~DE~~---~~~~~~~~~~r~~~~~g~-i~~~~nP~-~~~~~~~~~~~-----~~~-----~ 66 (251) +....+.+++.|....++.+||.. ..+.|+.+.+.+.....+ +++++|.. ...+..++.+. .++ . T Consensus 110 ~~ls~~~~~~dG~~~~~~i~DE~h~~~~~~~~~~l~sg~~~r~~~l~~~ITTag~~~~~~~~~~~~~~~~vl~g~~~~~d 189 (473) T pfam03354 110 KALSNNGDQYDGGNPSLAIFDEMHEFKDRELVSTIVTGMRKQDNPQTIQITTAGPNRGGPYDEEREYIKRILEGDVERDD 189 (473) T ss_pred EEEECCCCCCCCCCCCEEEEEECCCCCCHHHHHHHHHHHHCCCCCEEEEECCCCCCCCCHHHHHHHHHHHHHCCCCCCCC T ss_conf 99857898766998728998722304882789999876316888719997677889887189999999999708777678 Q ss_pred CCEEEEEEEH-------------HHCCCC----CHHHH-HHHHHC--CC--HHHHHHHHCCCCEEE-CCCEEE--CCCCE Q ss_conf 9759999771-------------334668----89999-999870--89--735643300510010-143000--01203 Q 537021.9.peg.1 67 SDRQVIRMTI-------------NETPHY----NEQER-KRIIDS--YP--LHEREARTKGEPILG-SGRIFP--IVEED 121 (251) Q Consensus 67 ~~~~~~~~t~-------------~DNp~l----~~~~~-~~~~~~--~~--~~~~~~~~~G~~~~~-~g~v~~--~~~~~ 121 (251) +....+-+.. .-||.| +.+.+ ++++.. .+ ...++...+..|... +..... .|+.. 
T Consensus 190 ~~~f~~i~~~d~~dd~~D~~~W~kANP~lg~~~~~~~l~~~~~~a~~~p~~~~~f~~k~lN~w~~~~~~~wl~~~~w~~~ 269 (473) T pfam03354 190 DSYFGLIYELDNDDEVKDPAKWIKANPLLGSSLTRDNLLKGLIDAIGSPLKMNKFLTKNFNLWMGQDTDSWLTLQDWEQA 269 (473) T ss_pred CCEEEEEEECCCCCCCCCHHHHHHHCCCCCCCCCHHHHHHHHHHHHHCHHHHHHHHHHHCCCEECCCCCCCCCHHHHHHC T ss_conf 76699997169887768989999859676788799999999999864937689999972386344544657899999747 Q ss_pred EECCCCCCCCCCEEEEEECCCCCC-CCEEEEEEEECCCCEEEEEEEHHHCCCCH-------------------------- Q ss_conf 431565465674357520037748-71999999977399899994310259987-------------------------- Q 537021.9.peg.1 122 IVINSLDIPEHWVQIGGMDFGWHH-PFAAGHLVWNRDSDVIYVVKNYRCREQTP-------------------------- 174 (251) Q Consensus 122 ~~~~~~~~~~~~~~~~g~D~G~~~-p~a~~~~~~~~~~~~~~i~de~~~~~~t~-------------------------- 174 (251) ..+..+ ....+++.|+|+...+ -||+++ .+. ..+.+|++-.++...... T Consensus 270 -~~~~~~-~~g~~~~~G~DlS~~~Dlta~~~-~~~-~~~~~~~~~~~~ip~~~~~~~~~~~~~y~~w~~~G~~l~~~~g~ 345 (473) T pfam03354 270 -VFPPFD-INGRRVYIGVDLSMKGDVTALVF-VYP-LDGKFYLHAHSFIPESQAEQIKQDGINYREFIDRGECLTLHDGG 345 (473) T ss_pred -CCCHHH-HCCCEEEEEEEECCCCCCCEEEE-EEE-ECCEEEEEEEEECCHHHHHHHHCCCCCHHHHHHHCCEEEECCCC T ss_conf -898578-47996999984035776413799-999-78999999986568627666540155689999709879944898 Q ss_pred ----HHHHHHHHHCCC--CCEEEECCCCCEECCCCCCCHHHHHHHCCC--EEEECCCCCCCCCCCHHHHHHHHHHHHCCC Q ss_conf ----999999983324--870897263001117888888999997897--477657777767775899899999986079 Q 537021.9.peg.1 175 ----IFHVAALKSWGK--WLPWAWPHDGLQHDKRSGEQLSAQYRRQGM--KMLPECATFDDGSNGVEAGISDMLDRMRSG 246 (251) Q Consensus 175 ----~~~~~~i~~~~~--~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~--~~~~~~~~~~kg~~sV~~GI~~l~~~~~~g 246 (251) ....+.|.++.. ...+. .+..|......+.+.+.+.|+ .++.. 
..+..+.-..+..+..++.+| T Consensus 346 ~iD~~~v~~~i~~~~~~~~~~v~----~i~yD~~~a~~~~~~le~~g~~~~~v~~----~Q~~~~ls~~~k~le~~~~~g 417 (473) T pfam03354 346 MIDPNQIIPWILDFITKTGLDVQ----AIGYDPWQAKYFIDRLESTFLDWPLVEI----RQGFFSLSNPIKELQELVAEN 417 (473) T ss_pred EECHHHHHHHHHHHHHHCCCCEE----EEEECHHHHHHHHHHHHHCCCCCCEEEE----CCCCCCCCHHHHHHHHHHHCC T ss_conf 32699999999999996299725----9984667899999999961899866984----687630077999999999779 Q ss_pred CCE Q ss_conf 907 Q 537021.9.peg.1 247 RWK 249 (251) Q Consensus 247 rl~ 249 (251) ++. T Consensus 418 ~i~ 420 (473) T pfam03354 418 KLT 420 (473) T ss_pred CEE T ss_conf 989 No 8 >COG4626 Phage terminase-like protein, large subunit [General function prediction only] Probab=97.00 E-value=0.021 Score=32.24 Aligned_cols=235 Identities=12% Similarity=0.047 Sum_probs=120.8 Q ss_pred CCCHHHHCCCCCCEEEEECC---CCH-HHHHHHHHHH-CCCCCEEEEEECC-CCCCCHHHHHHCC------C--CCCCEE Q ss_conf 36965503331358998189---897-4998863112-4758739999248-9998756556403------6--799759 Q 537021.9.peg.1 5 EQGRDKWQSNTVHYVWFDEE---PPE-DVYFEGLTRI-NATQGLVTLTLTP-LKGRSPIIEHYLS------A--SSSDRQ 70 (251) Q Consensus 5 ~q~~~~~~G~~~~~i~~DE~---~~~-~~~~~~~~r~-~~~~g~i~~~~nP-~~~~~~~~~~~~~------~--~~~~~~ 70 (251) ..++.++-|...+++.+||. .+. +.++.....+ ..+.+.++.++|- ......+++.+.. . ..++.. 
T Consensus 174 aa~~~~~Dg~~~~~~I~DEih~f~~~~~~~~~~~~g~~ar~~~l~~~ITT~g~~~~g~~~q~~~y~k~vl~g~~~d~~~f 253 (546) T COG4626 174 AADPNTVDGLNSVGAIIDELHLFGKQEDMYSEAKGGLGARPEGLVVYITTSGDPPAGVFKQKLQYAKDVLDGKIKDPHFF 253 (546) T ss_pred HCCCCCCCCCCCCEEEEEHHHHHCCHHHHHHHHHHHHCCCCCCEEEEEECCCCCCCCHHHHHHHHHHHHHCCCCCCCCEE T ss_conf 04887556787654887637541678999999974201576763999966898874189999999999866987882107 Q ss_pred EEEEEHH-------------HCCCCCH----H-HHHHHHH-CCCHHHHHHHH----CCCCEEECCCEEE--CCCCEEEC- Q ss_conf 9997713-------------3466889----9-9999987-08973564330----0510010143000--01203431- Q 537021.9.peg.1 71 VIRMTIN-------------ETPHYNE----Q-ERKRIID-SYPLHEREART----KGEPILGSGRIFP--IVEEDIVI- 124 (251) Q Consensus 71 ~~~~t~~-------------DNp~l~~----~-~~~~~~~-~~~~~~~~~~~----~G~~~~~~g~v~~--~~~~~~~~- 124 (251) .+.+..+ -||.|.- + ...+.++ ...+... +.+ +..|..+....+. .|...... T Consensus 254 ~~i~e~Dd~~e~~dpe~w~kaNPnlg~sv~~~~l~s~~~ka~~~~q~~-~dF~tK~lNi~~~~~~~~~~~~~w~~~~~~~ 332 (546) T COG4626 254 PVIYELDEEGEHDDPENWAKANPNLGVSVDEAFLYSEYRKARNAPQEA-RDFMTKHLNIWIGASDAWFGADFWEQQGRTV 332 (546) T ss_pred EEEEECCCCCCCCCHHHHHHCCCCCCCEEEHHHHHHHHHHHHCCCHHC-HHHHHCCCCEEECHHHCCCCHHHHHHHCCCC T ss_conf 999976882000386777430887651420777776999873181112-2777512331510010246868999731445 Q ss_pred -CCCCCCCCCEEEEEECCCCC-CCCEEEEEEEECCCCEEEEEE--------------------EHHHCCC---------C Q ss_conf -56546567435752003774-871999999977399899994--------------------3102599---------8 Q 537021.9.peg.1 125 -NSLDIPEHWVQIGGMDFGWH-HPFAAGHLVWNRDSDVIYVVK--------------------NYRCREQ---------T 173 (251) Q Consensus 125 -~~~~~~~~~~~~~g~D~G~~-~p~a~~~~~~~~~~~~~~i~d--------------------e~~~~~~---------t 173 (251) +.+ .-....++.|+|++-. |-+++++..-......+++.- ||...|. . 
T Consensus 333 lde~-~~~gq~c~~G~Dls~~~D~ts~al~f~~~~~~~~i~~~h~wv~~~~~~~~k~~~~~~~ew~k~G~lTit~~~~id 411 (546) T COG4626 333 LDEI-LLRGQVCYGGIDLSGLDDLTSMALVGRYRETDEWIGWGHAWVHRAAVKRRKSEAPRLQEWVKAGDLTITRRDLID 411 (546) T ss_pred CCHH-HHCCCEEEEEECCCCCCCCCCEEEEEECCCCCEEEEECCCCCHHHHCCCHHHCCHHHHHHHHCCCEEEECCCCCC T ss_conf 6535-514866899733234555541068875378762788424551243113201024668999975927973787313 Q ss_pred HHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHHHHCCCEEEECCCCCCCCCCCHHHHHHHHHHHHCCCCCE Q ss_conf 7999999983324870897263001117888888999997897477657777767775899899999986079907 Q 537021.9.peg.1 174 PIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQYRRQGMKMLPECATFDDGSNGVEAGISDMLDRMRSGRWK 249 (251) Q Consensus 174 ~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~~~~~~~~~~~kg~~sV~~GI~~l~~~~~~grl~ 249 (251) ....++++.+.....++ .....|...+..+.+.|.+.|+++++.+.... .--.++..++..+.+|+|. T Consensus 412 ~~~I~ew~~~~~~~~~i----~~v~~D~~g~~~~~~~l~~~g~~lv~i~Q~~~----~l~~~~k~~e~~~~~g~i~ 479 (546) T COG4626 412 YAEIVEWFMEIREKFLI----KLVGFDPSGAGEFRDALAEAGIKVVGIPQGFK----KLSGAIKTIERKLAEGVLV 479 (546) T ss_pred HHHHHHHHHHHHHHCCC----CEEEECCCCHHHHHHHHHHCCCCEEECCCHHH----HHCCHHHHHHHHHHCCCEE T ss_conf 89999999999873785----18855644328899999847995434450055----5474267899998669589 No 9 >COG4373 Mu-like prophage FluMu protein gp28 [General function prediction only] Probab=94.71 E-value=0.28 Score=26.20 Aligned_cols=156 Identities=13% Similarity=0.007 Sum_probs=94.6 Q ss_pred CCCCCCHHHHCCCCCCEEEEECCCCHHHHHHHHH---HHCCCCCEEEEEECCCCCCCHHHHHHCC--CCCCCEEEEEEEH Q ss_conf 7433696550333135899818989749988631---1247587399992489998756556403--6799759999771 Q 537021.9.peg.1 2 KAYEQGRDKWQSNTVHYVWFDEEPPEDVYFEGLT---RINATQGLVTLTLTPLKGRSPIIEHYLS--ASSSDRQVIRMTI 76 (251) Q Consensus 2 ~~~~q~~~~~~G~~~~~i~~DE~~~~~~~~~~~~---r~~~~~g~i~~~~nP~~~~~~~~~~~~~--~~~~~~~~~~~t~ 76 (251) ++...+|+.|||..= -+++||+.-.+...+++. 
.++.=+.++-++.|-++..+-|++.-.+ ....++.+...|+ T Consensus 122 ~ALSSnPknlRg~qG-~VviDEaAFHE~ldEllkAA~altmWGa~vRviStHNGvDnlFnQ~iQear~grk~ysvH~iTl 200 (509) T COG4373 122 TALSSNPKNLRGKQG-KVVIDEAAFHEDLDELLKAAAALTMWGAPVRVISTHNGVDNLFNQMIQEARQGRKKYSVHSITL 200 (509) T ss_pred EECCCCCCCCCCCCC-CEEEEHHHHHHHHHHHHHHHHHHHHCCCCEEEEECCCCHHHHHHHHHHHHHCCCCCCEEEEEEH T ss_conf 660479745546788-4886236656539999998778765177248996268701789999999870255531788865 Q ss_pred HHC--------------C-CCCHH---HHHHHHHCCC-HHHHHHHHCCCCEEECCCEEECC--CCE-------------- Q ss_conf 334--------------6-68899---9999987089-73564330051001014300001--203-------------- Q 537021.9.peg.1 77 NET--------------P-HYNEQ---ERKRIIDSYP-LHEREARTKGEPILGSGRIFPIV--EED-------------- 121 (251) Q Consensus 77 ~DN--------------p-~l~~~---~~~~~~~~~~-~~~~~~~~~G~~~~~~g~v~~~~--~~~-------------- 121 (251) +|. | |-|+. ++.++++.-+ .+...+++++.+-...|+..+.- +.. T Consensus 201 dDAiadGLy~RIc~v~~~~w~pE~Ea~w~a~l~~~a~t~eda~eEy~C~Pk~s~gAYIphalie~a~~~~vp~l~fe~~~ 280 (509) T COG4373 201 DDAIADGLYERICNVDRPAWAPEVEAKWLAELRAIAGTDEDAQEEYMCNPKDSTGAYIPHALIEAAVAAEVPDLIFELGS 280 (509) T ss_pred HHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHHHHHCCCCHHHHHHHCCCCCCCCCCCCCHHHHHHHHHCCCCCEEEECCH T ss_conf 77777789999982677456756788999988865698144577628775667766466789998875479851797574 Q ss_pred --EECC-------------C------CCCCCCCEEEEEECCCCC-CCCEEEEEEEECCC Q ss_conf --4315-------------6------546567435752003774-87199999997739 Q 537021.9.peg.1 122 --IVIN-------------S------LDIPEHWVQIGGMDFGWH-HPFAAGHLVWNRDS 158 (251) Q Consensus 122 --~~~~-------------~------~~~~~~~~~~~g~D~G~~-~p~a~~~~~~~~~~ 158 (251) |.+. + ....+..+.++|+|||-. |-++.+++.+.++. T Consensus 281 ~f~~~~~~~r~~~~~~~cl~~l~P~Lqalnp~~r~~fGvDfaR~~DLsv~~v~e~~~dt 339 (509) T COG4373 281 EFHDIPAWLRESEVLTWCLPDLRPALQALNPGGRLYFGVDFARKRDLSVLWVWEKVGDT 339 (509) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEECCCCCCCCCCEEEEEEEECCCH T ss_conf 55423566424566665344356898732999716521220225673499987445523 No 10 >pfam02562 PhoH PhoH-like protein. 
PhoH is a cytoplasmic protein and predicted ATPase that is induced by phosphate starvation. Probab=94.24 E-value=0.15 Score=27.60 Aligned_cols=40 Identities=20% Similarity=0.076 Sum_probs=28.8 Q ss_pred CEEEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEEC Q ss_conf 989999431025998799999998332487089726300111 Q 537021.9.peg.1 159 DVIYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHD 200 (251) Q Consensus 159 ~~~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~ 200 (251) +++.|+||. .++|..++...|.+.|..-.++..-|....| T Consensus 120 n~~iIvDEa--QN~t~~~lk~ilTRiG~~SK~vi~GD~~Q~D 159 (205) T pfam02562 120 DAFIILDEA--QNTTPEQMKMFLTRIGFNSKMVVTGDITQID 159 (205) T ss_pred CCEEEEECH--HCCCHHHHHHHHHHCCCCCEEEEECCHHHCC T ss_conf 688999722--1399999999984217996899947866517 No 11 >PRK10536 hypothetical protein; Provisional Probab=89.62 E-value=1.2 Score=22.72 Aligned_cols=41 Identities=15% Similarity=0.056 Sum_probs=30.6 Q ss_pred CEEEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEECC Q ss_conf 9899994310259987999999983324870897263001117 Q 537021.9.peg.1 159 DVIYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHDK 201 (251) Q Consensus 159 ~~~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~ 201 (251) +.+.|+||. .+.|..++...|...|..-.++..-|....|. T Consensus 177 na~IIvDEa--QN~T~~qmk~iLTRiG~~SKiVi~GD~~Q~Dl 217 (262) T PRK10536 177 NAVVILDEA--QNVTAAQMKMFLTRLGENVTVIVNGDITQCDL 217 (262) T ss_pred CEEEEEEHH--HCCCHHHHHHHHHHCCCCCEEEEECCCCCCCC T ss_conf 428998412--12899999889854259968999688202269 No 12 >pfam12138 Spherulin4 Spherulation-specific family 4. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 250 and 398 amino acids in length. There is a conserved NPG sequence motif and there are two completely conserved G residues that may be functionally important. Starvation will often induce spherulation - the production of spores - and this process may involve DNA-methylation. Changes in the methylation of spherulin4 are associated with the formation of spherules, but these changes are probably transient. 
Methylation of the gene accompanies its transcriptional activation, and spherulin4 mRNA is only detectable in late spherulating cultures and mature spherules. It is a spherulation-specific protein. Probab=82.11 E-value=2.8 Score=20.81 Aligned_cols=26 Identities=15% Similarity=0.096 Sum_probs=15.2 Q ss_pred HHHHHHHHHC-CCCCEEEEEECCCCCC Q ss_conf 9988631124-7587399992489998 Q 537021.9.peg.1 29 VYFEGLTRIN-ATQGLVTLTLTPLKGR 54 (251) Q Consensus 29 ~~~~~~~r~~-~~~g~i~~~~nP~~~~ 54 (251) .|..+...+. .+.-..++..||.+|. T Consensus 17 ~W~~l~~~~~~~p~~~~~vIiNP~~GP 43 (243) T pfam12138 17 EWDPLYDAIAAYPDVPFTVIINPNNGP 43 (243) T ss_pred CCHHHHHHHHCCCCCCEEEEECCCCCC T ss_conf 437899887438998779998589999 No 13 >pfam07652 Flavi_DEAD Flavivirus DEAD domain. Probab=71.77 E-value=5.7 Score=19.13 Aligned_cols=14 Identities=29% Similarity=0.413 Sum_probs=6.5 Q ss_pred EEEEEECCCCCCCH Q ss_conf 39999248999875 Q 537021.9.peg.1 43 LVTLTLTPLKGRSP 56 (251) Q Consensus 43 ~i~~~~nP~~~~~~ 56 (251) -+++|.||++-.++ T Consensus 127 ~i~mTATPPG~~~~ 140 (146) T pfam07652 127 AIFMTATPPGTSDP 140 (146) T ss_pred EEEEECCCCCCCCC T ss_conf 99995689998998 No 14 >pfam05876 Terminase_GpA Phage terminase large subunit (GpA). This family consists of several phage terminase large subunit proteins as well as related sequences from several bacterial species. The DNA packaging enzyme of bacteriophage lambda, terminase, is a heteromultimer composed of a small subunit, gpNu1, and a large subunit, gpA, products of the Nu1 and A genes, respectively. Terminase is involved in the site-specific binding and cutting of the DNA in the initial stages of packaging. It is now known that gpA is actively involved in late stages of packaging, including DNA translocation, and that this enzyme contains separate functional domains for its early and late packaging activities. 
Probab=60.61 E-value=12 Score=17.30 Aligned_cols=163 Identities=17% Similarity=0.158 Sum_probs=88.1 Q ss_pred CHHHHCCCCCCEEEEECCC--CH------HHHHHHHHHHCC--CCCEEEEEECCCCC-CCHHHHHHCCCC---------- Q ss_conf 9655033313589981898--97------499886311247--58739999248999-875655640367---------- Q 537021.9.peg.1 7 GRDKWQSNTVHYVWFDEEP--PE------DVYFEGLTRINA--TQGLVTLTLTPLKG-RSPIIEHYLSAS---------- 65 (251) Q Consensus 7 ~~~~~~G~~~~~i~~DE~~--~~------~~~~~~~~r~~~--~~g~i~~~~nP~~~-~~~~~~~~~~~~---------- 65 (251) ++..|++.++.++++||.. +. +-......|+.. ..+.++..+||-.. .+.+...|.... T Consensus 126 S~~~L~s~~~r~l~~DEvD~~~~~~~~eGdP~~La~~R~~tf~~~~K~~~~STPt~~g~s~I~~~~~~sdqr~~~vpCPh 205 (552) T pfam05876 126 SPANLRSRPVRYVILDEVDAYPEDVDGEGDPISLAEKRTETFGSRRKILAGSTPTIKGTSRIEALYEESDQRRYYVPCPH 205 (552) T ss_pred CCHHHHCCCCCEEEECCHHHCCCCCCCCCCHHHHHHHHHHHHCCCCEEEEECCCCCCCCCHHHHHHHCCCEEEEEEECCC T ss_conf 96143048635588513443654567787989999999875302857999838999996689999866873899868989 Q ss_pred -------------------------------------------------------CCCEEEEEEEHHHCCCCCHHHHH-H Q ss_conf -------------------------------------------------------99759999771334668899999-9 Q 537021.9.peg.1 66 -------------------------------------------------------SSDRQVIRMTINETPHYNEQERK-R 89 (251) Q Consensus 66 -------------------------------------------------------~~~~~~~~~t~~DNp~l~~~~~~-~ 89 (251) .+....+|.+-.=.||.+-.++. 
+ T Consensus 206 Cg~~q~l~~~~l~w~~~~~p~~a~y~C~~Cg~~i~e~~k~~m~~~G~W~~~~~~~~~~~~gfhl~~lySp~~sw~~ia~~ 285 (552) T pfam05876 206 CGEEQELRWERLKWDKGEAPETARYVCPHCGCVIEEHHKRAMLRAGRWIATAPIRRPRHAGFHLNALYSPFRSWGELAAE 285 (552) T ss_pred CCCCCCCCCCCEEECCCCCCCCEEEECCCCCCCCCHHHHHHHHHCCEEEECCCCCCCCCEEEEECHHHCCCCCHHHHHHH T ss_conf 99865550054664799985534998988988889999998876899980588889981179963574552589999999 Q ss_pred HHH---CCCHHHHHHH---HCCC-CEEECCCEEECCCC-EEECCC---CCCCCCC-EEEEEECCCCCCCCEEEEEEEECC Q ss_conf 987---0897356433---0051-00101430000120-343156---5465674-357520037748719999999773 Q 537021.9.peg.1 90 IID---SYPLHEREAR---TKGE-PILGSGRIFPIVEE-DIVINS---LDIPEHW-VQIGGMDFGWHHPFAAGHLVWNRD 157 (251) Q Consensus 90 ~~~---~~~~~~~~~~---~~G~-~~~~~g~v~~~~~~-~~~~~~---~~~~~~~-~~~~g~D~G~~~p~a~~~~~~~~~ 157 (251) .+. ..++...+.. .+|+ |.. .|..-. ++. .-+.++ ..+|... .+..|+|.=-+. ..+.++++..+ T Consensus 286 ~l~A~~~gd~~~lk~f~Nt~Lge~w~~-~~~~~~-~~~L~~rre~y~~~~vP~g~~~LtagvDvQ~dr-le~~v~gwG~~ 362 (552) T pfam05876 286 FLKAERKGDPEKLKTFVNTDLGEPWEE-KGEAPD-WEELAARAEDYPRGTVPAGVLVLTAGVDVQGDR-LEVEVVGWGRG 362 (552) T ss_pred HHHHHHCCCHHHHHHEEECCCCCEECC-CCCCCC-HHHHHHHHHHCCCCCCCCCEEEEEEEEEECCCE-EEEEEEEECCC T ss_conf 998776199888635474024424213-466789-899999876446667898706999998614998-99999998799 Q ss_pred CCEEEEEEEHHHCCCC Q ss_conf 9989999431025998 Q 537021.9.peg.1 158 SDVIYVVKNYRCREQT 173 (251) Q Consensus 158 ~~~~~i~de~~~~~~t 173 (251) ... +++|....-|.+ T Consensus 363 ~es-W~id~~~i~Gdp 377 (552) T pfam05876 363 GES-WLIDRGVIWGDP 377 (552) T ss_pred CCE-EEEEEEEECCCC T ss_conf 855-999888861898 No 15 >pfam00176 SNF2_N SNF2 family N-terminal domain. This domain is found in proteins involved in a variety of processes including transcription regulation (e.g., SNF2, STH1, brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI) as well as a variety of other proteins with little functional information (e.g., lodestar, ETL1). 
Probab=58.52 E-value=14 Score=17.10 Aligned_cols=10 Identities=20% Similarity=0.059 Sum_probs=7.4 Q ss_pred CCCEEEEECC Q ss_conf 1358998189 Q 537021.9.peg.1 15 TVHYVWFDEE 24 (251) Q Consensus 15 ~~~~i~~DE~ 24 (251) ...++..||- T Consensus 16 ~~ggiLaDeM 25 (295) T pfam00176 16 GLGGILADEM 25 (295) T ss_pred CCCEEEECCC T ss_conf 9998972278 No 16 >COG1875 NYN ribonuclease and ATPase of PhoH family domains [General function prediction only] Probab=58.41 E-value=10 Score=17.74 Aligned_cols=13 Identities=15% Similarity=0.235 Sum_probs=8.2 Q ss_pred CCCCEEEEEECCC Q ss_conf 7587399992489 Q 537021.9.peg.1 39 ATQGLVTLTLTPL 51 (251) Q Consensus 39 ~~~g~i~~~~nP~ 51 (251) .++|...+..|+. T Consensus 76 ~~G~~l~iel~~~ 88 (436) T COG1875 76 NKGGTLHVELNHQ 88 (436) T ss_pred CCCCEEEEEEECC T ss_conf 8897589998326 No 17 >COG5362 Phage-related terminase [General function prediction only] Probab=56.25 E-value=15 Score=16.89 Aligned_cols=107 Identities=15% Similarity=-0.023 Sum_probs=60.2 Q ss_pred EEEEEECCCC-----CCCCEEEEEEEECCCCEEEEEEEHHHCCCCHHHHHHHHHHCCCC-CEEEECCCCCEECCCCCCCH Q ss_conf 3575200377-----48719999999773998999943102599879999999833248-70897263001117888888 Q 537021.9.peg.1 134 VQIGGMDFGW-----HHPFAAGHLVWNRDSDVIYVVKNYRCREQTPIFHVAALKSWGKW-LPWAWPHDGLQHDKRSGEQL 207 (251) Q Consensus 134 ~~~~g~D~G~-----~~p~a~~~~~~~~~~~~~~i~de~~~~~~t~~~~~~~i~~~~~~-~~~~~~~~~~~~~~~~~~s~ 207 (251) ....+||..+ .|-+|...+... .+++|++|-...+ ...+.+.+.-.+...+ .++ ...+...++|++. 
T Consensus 43 ~~vqsWDtA~ke~kdsD~tag~iwg~l--dgryy~~dlvhdr-agvpelln~taeldgk~~~i----~~~iEpkaaGk~~ 115 (202) T COG5362 43 YCVQSWDTAIKESKDSDYTAGNIWGIL--DGRYYLVDLVHDR-AGVPELLNLTAELDGKYQPI----FVLIEPKAAGKQL 115 (202) T ss_pred HEEEHHHHHCCCCCCCCCCHHHHEEEC--CCEEEEEEEEHHH-CCCHHHHHHHHHHCCCCCCE----EEECCCCCCCHHH T ss_conf 100002333024678874043201102--5608986640111-38299998899860134650----4562776442799 Q ss_pred HHHHHHCCCEEEECCCCCCCCCCCHHHHHHHHHHHHCCCCCEE Q ss_conf 9999978974776577777677758998999999860799078 Q 537021.9.peg.1 208 SAQYRRQGMKMLPECATFDDGSNGVEAGISDMLDRMRSGRWKV 250 (251) Q Consensus 208 ~e~~r~~G~~~~~~~~~~~kg~~sV~~GI~~l~~~~~~grl~V 250 (251) ++.+...|++.+....... |.+ .-.-..+..++++|-+.| T Consensus 116 ~q~Lk~~~~~g~~~r~eps-gdK--vTRf~pvsplfEsgnVyv 155 (202) T COG5362 116 IQDLKFLGGNGRVIRIEPS-GDK--VTRFAPVSPLFESGNVYV 155 (202) T ss_pred HHHHHHCCEEEEEEEECCC-CCC--EEECCCCCHHHHCCCEEE T ss_conf 9997633222589864037-883--452132565550782799 No 18 >TIGR02529 EutJ ethanolamine utilization protein EutJ family protein; InterPro: IPR013366 Salmonella typhimurium is capable of growth on ethanolamine as a sole source of carbon, nitrogen and energy. During growth on this compound the cells form a multimolecular structure known as a metabolosome, which is similar to the carboxysome used by some photosynthetic bacteria to fix CO2, and is thought to contain the enzymes needed to metabolise this compound to acetyl-CoA. The metabolosome is not directly involved in the biochemistry of ethanolamine utilization - instead its role is thought to be to concentrate the enzymes involved in this process, while also protecting the cell from the build-up of toxic intermediates. The genes involved in growth on ethanolamine are encoded in a 17-gene operon known as the ethanolamine utilization (eut) operon. EutJ shows similarity to chaperonins and may play a role in assembly of the metabolosome, though it is not necessary for growth on this compound.. 
Probab=48.48 E-value=20 Score=16.18 Aligned_cols=30 Identities=17% Similarity=0.028 Sum_probs=19.0 Q ss_pred EEEECCCCCCCCEEEEEEEECCCCEEEEEEEHHH Q ss_conf 7520037748719999999773998999943102 Q 537021.9.peg.1 136 IGGMDFGWHHPFAAGHLVWNRDSDVIYVVKNYRC 169 (251) Q Consensus 136 ~~g~D~G~~~p~a~~~~~~~~~~~~~~i~de~~~ 169 (251) -..+|.|.- |+=+ . +-++|+.+|.-||=-+ T Consensus 110 G~VVDvGGG--TTGi-S-I~K~GKViy~ADEpTG 139 (240) T TIGR02529 110 GAVVDVGGG--TTGI-S-ILKKGKVIYSADEPTG 139 (240) T ss_pred CEEEEECCC--CEEE-E-EEECCCEEEEEECCCC T ss_conf 279984788--0335-7-9975968998237999 No 19 >PRK11678 putative chaperone; Provisional Probab=40.16 E-value=27 Score=15.45 Aligned_cols=18 Identities=11% Similarity=-0.108 Sum_probs=9.3 Q ss_pred EEEEEECCCCC-CCCEEEE Q ss_conf 35752003774-8719999 Q 537021.9.peg.1 134 VQIGGMDFGWH-HPFAAGH 151 (251) Q Consensus 134 ~~~~g~D~G~~-~p~a~~~ 151 (251) ..+...|||.. -..+++. T Consensus 209 ~~vLV~DlGGGT~DvSlv~ 227 (450) T PRK11678 209 KTVLVVDIGGGTTDCSLLL 227 (450) T ss_pred CEEEEEEECCCEEEEEEEE T ss_conf 8799999089848887478 No 20 >TIGR02036 dsdC D-serine deaminase transcriptional activator; InterPro: IPR011781 This family, part of the LysR family of transcriptional regulators, activates transcription of the gene for D-serine deaminase, dsdA. Trusted members of this family so far are found adjacent to dsdA and only in Gammaproteobacteria, including Escherichia coli, Vibrio cholerae, and Colwellia psychrerythraea.. Probab=37.29 E-value=16 Score=16.77 Aligned_cols=41 Identities=17% Similarity=0.216 Sum_probs=22.1 Q ss_pred CEEEEEEEHHHCCCCCHH----HHHHHHHCCCHHHHHHHHCCCCE Q ss_conf 759999771334668899----99999870897356433005100 Q 537021.9.peg.1 68 DRQVIRMTINETPHYNEQ----ERKRIIDSYPLHEREARTKGEPI 108 (251) Q Consensus 68 ~~~~~~~t~~DNp~l~~~----~~~~~~~~~~~~~~~~~~~G~~~ 108 (251) +=.+=..|.+-=|-+.+= -+..+.+.||.-.......-+=. 
T Consensus 92 ~E~SG~LT~YSRPSfAQCWLVPri~~F~~~YPsIsL~~LTGNeNi 136 (302) T TIGR02036 92 QELSGELTVYSRPSFAQCWLVPRIADFKKRYPSISLKVLTGNENI 136 (302) T ss_pred CCCCCCEEECCCCCHHHHHHHHHHHHHHHCCCCEEEEECCCCCCE T ss_conf 751210200225533344332323212003871221100153532 No 21 >pfam04312 DUF460 Protein of unknown function (DUF460). Archaeal protein of unknown function. Probab=36.37 E-value=32 Score=15.11 Aligned_cols=91 Identities=19% Similarity=0.168 Sum_probs=54.9 Q ss_pred CCEEEEEECCCCCCCCEEEEEEEECCCCEEEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHH Q ss_conf 74357520037748719999999773998999943102599879999999833248708972630011178888889999 Q 537021.9.peg.1 132 HWVQIGGMDFGWHHPFAAGHLVWNRDSDVIYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQY 211 (251) Q Consensus 132 ~~~~~~g~D~G~~~p~a~~~~~~~~~~~~~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~ 211 (251) .-...+|+|.|.+-.-| +.|-+|+.++ -+..+++..++..+.|.++|. |++ ...|..-.+...+-+ T Consensus 30 r~~lIVGIDPG~ttgiA----ildLdG~~l~---~~S~R~~~~~evi~~I~~~G~--Pvi-----VAtDV~p~P~~V~Ki 95 (138) T pfam04312 30 RRYLIVGIDPGITTGIA----ILDLDGEVLD---LYSSRNMDRGEVIELIYELGK--PVI-----VATDVSPPPETVKKI 95 (138) T ss_pred CCCEEEEECCCCEEEEE----EEECCCCEEE---EEECCCCCHHHHHHHHHHCCC--EEE-----EEECCCCCCHHHHHH T ss_conf 67679997899143788----9825884998---770367998999999997497--699-----982698982899999 Q ss_pred HH-CCCEEEECCCCCCCCCCCHHHHHHHHHH Q ss_conf 97-8974776577777677758998999999 Q 537021.9.peg.1 212 RR-QGMKMLPECATFDDGSNGVEAGISDMLD 241 (251) Q Consensus 212 r~-~G~~~~~~~~~~~kg~~sV~~GI~~l~~ 241 (251) ++ .|-.+. .++-.-+|+.-.+-++. 
T Consensus 96 a~~f~A~ly-----~P~~dl~veEK~~l~~~ 121 (138) T pfam04312 96 ARSFGAVLY-----TPERDLSVEEKRELARK 121 (138) T ss_pred HHHHCCEEE-----CCCCCCCHHHHHHHHHH T ss_conf 997198115-----78876768999999987 No 22 >COG3453 Uncharacterized protein conserved in bacteria [Function unknown] Probab=35.02 E-value=33 Score=14.99 Aligned_cols=82 Identities=12% Similarity=0.052 Sum_probs=56.1 Q ss_pred EEEEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHHHHCCCEEEECCCCCCCCCCCHHHHHHHHH Q ss_conf 99994310259987999999983324870897263001117888888999997897477657777767775899899999 Q 537021.9.peg.1 161 IYVVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQYRRQGMKMLPECATFDDGSNGVEAGISDML 240 (251) Q Consensus 161 ~~i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~~~~~~~~~~~kg~~sV~~GI~~l~ 240 (251) ..|-|+|+-++.....-...|+..|.+--++--+|+.-..+....++.+-.+.+|+....-+ ..+..--++-++.++ T Consensus 4 ~~I~d~lsVsgQi~~~D~~~iaa~GFksiI~nRPDgEe~~QP~~~~i~~aa~~aGl~y~~iP---V~~~~iT~~dV~~f~ 80 (130) T COG3453 4 RRINDRLSVSGQISPADIASIAALGFKSIICNRPDGEEPGQPGFAAIAAAAEAAGLTYTHIP---VTGGGITEADVEAFQ 80 (130) T ss_pred EECCCCEEECCCCCHHHHHHHHHHCCCEECCCCCCCCCCCCCCHHHHHHHHHHCCCCEEEEE---CCCCCCCHHHHHHHH T ss_conf 66154342369898889999997042210016998778899974999999996698258763---479987999999999 Q ss_pred HHHCC Q ss_conf 98607 Q 537021.9.peg.1 241 DRMRS 245 (251) Q Consensus 241 ~~~~~ 245 (251) ..|.+ T Consensus 81 ~Al~e 85 (130) T COG3453 81 RALDE 85 (130) T ss_pred HHHHH T ss_conf 99997 No 23 >COG4098 comFA Superfamily II DNA/RNA helicase required for DNA uptake (late competence protein) [DNA replication, recombination, and repair] Probab=33.88 E-value=35 Score=14.89 Aligned_cols=39 Identities=21% Similarity=0.226 Sum_probs=24.2 Q ss_pred CCCCEEEEEC--CCCHH---HHHHHHHHHCCC-CCEEEEEECCCC Q ss_conf 3135899818--98974---998863112475-873999924899 Q 537021.9.peg.1 14 NTVHYVWFDE--EPPED---VYFEGLTRINAT-QGLVTLTLTPLK 52 (251) Q Consensus 14 ~~~~~i~~DE--~~~~~---~~~~~~~r~~~~-~g~i~~~~nP~~ 52 (251) .++|.+.+|| |-|++ .....+.-.+.+ +..|++|+||.. 
T Consensus 201 ~aFD~liIDEVDAFP~~~d~~L~~Av~~ark~~g~~IylTATp~k 245 (441) T COG4098 201 QAFDLLIIDEVDAFPFSDDQSLQYAVKKARKKEGATIYLTATPTK 245 (441) T ss_pred HHCCEEEEECCCCCCCCCCHHHHHHHHHHHCCCCCEEEEECCCHH T ss_conf 643389983024565667888999999751236736999648807 No 24 >COG2410 Predicted nuclease (RNAse H fold) [General function prediction only] Probab=33.58 E-value=35 Score=14.86 Aligned_cols=31 Identities=16% Similarity=0.320 Sum_probs=22.2 Q ss_pred EEEEECCCCCCCCEEEEEEEECCCCEEEEEEEHHH Q ss_conf 57520037748719999999773998999943102 Q 537021.9.peg.1 135 QIGGMDFGWHHPFAAGHLVWNRDSDVIYVVKNYRC 169 (251) Q Consensus 135 ~~~g~D~G~~~p~a~~~~~~~~~~~~~~i~de~~~ 169 (251) .|.|+|.|...++++..+. .+++.++.+|.. T Consensus 2 my~GIDla~k~~tavavl~----~~~~~~i~~~s~ 32 (178) T COG2410 2 MYAGIDLAVKRSTAVAVLI----EGRIEIISAWSS 32 (178) T ss_pred CCCCCCCCCCCCCEEEEEE----CCEEEEEECCCC T ss_conf 3023223467774389997----887999871366 No 25 >COG3877 Uncharacterized protein conserved in bacteria [Function unknown] Probab=29.40 E-value=42 Score=14.46 Aligned_cols=38 Identities=24% Similarity=0.443 Sum_probs=30.2 Q ss_pred CCHHHHHHHCCCEEEECCCCCCCCCCCHHHHHHHHHHHHCCCCCEE Q ss_conf 8889999978974776577777677758998999999860799078 Q 537021.9.peg.1 205 EQLSAQYRRQGMKMLPECATFDDGSNGVEAGISDMLDRMRSGRWKV 250 (251) Q Consensus 205 ~s~~e~~r~~G~~~~~~~~~~~kg~~sV~~GI~~l~~~~~~grl~V 250 (251) ..+.+++|.+|.+ +.+.+++.-|-..+.+.++.|.+.+ T Consensus 76 ~kld~vlramgy~--------p~~e~~~~i~~~~i~~qle~Gei~p 113 (122) T COG3877 76 TKLDEVLRAMGYN--------PDSENSVNIGKKKIIDQLEKGEISP 113 (122) T ss_pred HHHHHHHHHCCCC--------CCCCCHHHHHHHHHHHHHHCCCCCH T ss_conf 8999999980899--------8998704553899999998178799 No 26 >TIGR00108 eRF peptide chain release factor eRF/aRF, subunit 1; InterPro: IPR004403 These proteins are translation factors that have been characterised in eukaryotes as the non-GTP-binding subunit of a cytosolic heterodimer that acts as a translation release factor for all three stop codons. 
Members of this orthologous family are found in Eukarya and Archaea. The name used should be aRF1 for the Archaea and eRF1 for the Eukarya. Alternative names include eRF1, SUP45, omnipotent suppressor protein 1.; GO: 0016149 translation release factor activity codon specific, 0006415 translational termination, 0005737 cytoplasm. Probab=28.36 E-value=44 Score=14.36 Aligned_cols=120 Identities=13% Similarity=0.106 Sum_probs=52.8 Q ss_pred HHHHCCCCC---EEEEEECCCCCCCHHHHHHCCCCCCCEEEEEEEHHHCCCCCHHHHHHHHHCCCHHHHHHHHCCCCEEE Q ss_conf 311247587---39999248999875655640367997599997713346688999999987089735643300510010 Q 537021.9.peg.1 34 LTRINATQG---LVTLTLTPLKGRSPIIEHYLSASSSDRQVIRMTINETPHYNEQERKRIIDSYPLHEREARTKGEPILG 110 (251) Q Consensus 34 ~~r~~~~~g---~i~~~~nP~~~~~~~~~~~~~~~~~~~~~~~~t~~DNp~l~~~~~~~~~~~~~~~~~~~~~~G~~~~~ 110 (251) +--+.+..| .++--.=|+.++-.-+.....+.-.....|++...=.--+ +-|+.+...+- .|+.--+---++. T Consensus 18 l~eL~~~rG~GTeLiSLyIPP~~~I~dv~~~Lr~ElsqASNIKSk~tr~nVl--SAIe~ilqrLK--lf~~pPe~GlV~~ 93 (425) T TIGR00108 18 LQELEKKRGRGTELISLYIPPDRQISDVAKHLREELSQASNIKSKQTRKNVL--SAIEAILQRLK--LFKKPPENGLVIF 93 (425) T ss_pred HHHHHCCCCCCCEEEEEEECCCCCCHHHHHHHHHHHHHHCCCHHHHCCHHHH--HHHHHHHHHHH--HCCCCCCCCEEEE T ss_conf 9888604899724789986788951178888887521320202222113578--99999999864--2157730276899 Q ss_pred CCCEEECC--CCEEECCCCCCCCCCEEE-EEECCCC---------CCCCEEEEEEEECC Q ss_conf 14300001--203431565465674357-5200377---------48719999999773 Q 537021.9.peg.1 111 SGRIFPIV--EEDIVINSLDIPEHWVQI-GGMDFGW---------HHPFAAGHLVWNRD 157 (251) Q Consensus 111 ~g~v~~~~--~~~~~~~~~~~~~~~~~~-~g~D~G~---------~~p~a~~~~~~~~~ 157 (251) .|.|-..- .+.+....+++|..-+.| +-||-=| .+.-+..+...|+. 
T Consensus 94 ~G~v~~~gpG~EK~~T~~iEPpePi~ty~Y~Cds~FylepL~E~L~~k~~yGlivlDr~ 152 (425) T TIGR00108 94 CGMVPREGPGTEKMVTYVIEPPEPIKTYIYHCDSKFYLEPLKELLEEKDKYGLIVLDRK 152 (425) T ss_pred EEEEECCCCCCCEEEEEECCCCCCCCCCEECCCCCCCHHHHHHHHHCCCCCCEEEECCC T ss_conf 86761598884016888527898745433401871011068887610233125898088 No 27 >KOG1015 consensus Probab=28.34 E-value=44 Score=14.36 Aligned_cols=48 Identities=21% Similarity=0.178 Sum_probs=34.5 Q ss_pred CCCCHHHHCCCCCCEEEEECCC----CHHHHHHHHHHHCCCCCEEEEEECCCC Q ss_conf 3369655033313589981898----974998863112475873999924899 Q 537021.9.peg.1 4 YEQGRDKWQSNTVHYVWFDEEP----PEDVYFEGLTRINATQGLVTLTLTPLK 52 (251) Q Consensus 4 ~~q~~~~~~G~~~~~i~~DE~~----~~~~~~~~~~r~~~~~g~i~~~~nP~~ 52 (251) .++....|+-++=|.+++||+. ..+..++.+.+++++ -+|++|.||.. T Consensus 810 ke~f~k~lvdpGPD~vVCDE~HiLKNeksa~Skam~~irtk-RRI~LTGTPLQ 861 (1567) T KOG1015 810 KEIFNKALVDPGPDFVVCDEGHILKNEKSAVSKAMNSIRTK-RRIILTGTPLQ 861 (1567) T ss_pred HHHHHHHCCCCCCCEEEECCHHHHCCCHHHHHHHHHHHHHH-EEEEEECCCHH T ss_conf 99999860578997687242122135247899999987764-04775267113 No 28 >pfam04273 DUF442 Putative phosphatase (DUF442). Although this domain is uncharacterized it seems likely that it performs a phosphatase function. Probab=27.98 E-value=44 Score=14.32 Aligned_cols=80 Identities=16% Similarity=0.056 Sum_probs=50.6 Q ss_pred EEEEHHHCCCCHHHHHHHHHHCCCCCEEEECCCCCEECCCCCCCHHHHHHHCCCEEEECCCCCCCCCCCHHHHHHHHHHH Q ss_conf 99431025998799999998332487089726300111788888899999789747765777776777589989999998 Q 537021.9.peg.1 163 VVKNYRCREQTPIFHVAALKSWGKWLPWAWPHDGLQHDKRSGEQLSAQYRRQGMKMLPECATFDDGSNGVEAGISDMLDR 242 (251) Q Consensus 163 i~de~~~~~~t~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~s~~e~~r~~G~~~~~~~~~~~kg~~sV~~GI~~l~~~ 242 (251) |-+.++.++.......+.|.+.|.+.-++.-+|+...++.....+.+....+|+....-+... +..+ .+-|..+.+. 
T Consensus 5 i~~~~~vs~Qi~~~di~~la~~GfktIInnRPd~E~~~qp~~~~~~~~a~~~Gl~y~~iPv~~--~~~t-~~~v~~f~~~ 81 (110) T pfam04273 5 INEDLSVSPQIQPDDIAAAARAGFRSVINNRPDGEEPGQPSNAAEQAAARAAGLAYRFIPVIS--GQIT-EADVEAFQRA 81 (110) T ss_pred CCCCEEECCCCCHHHHHHHHHCCCCEEEECCCCCCCCCCCCHHHHHHHHHHCCCEEEEEECCC--CCCC-HHHHHHHHHH T ss_conf 478875759989999999998598388533888777899888999999998399799964477--8989-9999999999 Q ss_pred HCC Q ss_conf 607 Q 537021.9.peg.1 243 MRS 245 (251) Q Consensus 243 ~~~ 245 (251) |++ T Consensus 82 l~~ 84 (110) T pfam04273 82 LAA 84 (110) T ss_pred HHH T ss_conf 985 No 29 >PRK09401 reverse gyrase; Reviewed Probab=24.85 E-value=51 Score=14.00 Aligned_cols=15 Identities=20% Similarity=0.605 Sum_probs=6.9 Q ss_pred HHHCCCCCCEEEEEC Q ss_conf 550333135899818 Q 537021.9.peg.1 9 DKWQSNTVHYVWFDE 23 (251) Q Consensus 9 ~~~~G~~~~~i~~DE 23 (251) +.|-...+|.|++|- T Consensus 193 ~~l~~~~f~fifvDD 207 (1176) T PRK09401 193 DELPKDRFDFVFVDD 207 (1176) T ss_pred HHCCCCCCCEEEEEC T ss_conf 760356888899934 No 30 >PRK05183 hscA chaperone protein HscA; Provisional Probab=24.21 E-value=52 Score=13.93 Aligned_cols=11 Identities=27% Similarity=0.217 Sum_probs=6.5 Q ss_pred EEEEECCCCCC Q ss_conf 57520037748 Q 537021.9.peg.1 135 QIGGMDFGWHH 145 (251) Q Consensus 135 ~~~g~D~G~~~ 145 (251) .+...|+|..- T Consensus 203 ~vlVyDLGGGT 213 (621) T PRK05183 203 VIAVYDLGGGT 213 (621) T ss_pred EEEEEECCCCE T ss_conf 79999878865 No 31 >TIGR02350 prok_dnaK chaperone protein DnaK; InterPro: IPR012725 Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. 
In prokaryotes, the grpE protein is a co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle. Members of this family are the chaperone DnaK, of the DnaK-DnaJ-GrpE chaperone system. All members of the seed alignment were taken from completely sequenced bacterial or archaeal genomes and (except for the Mycoplasma sequence) found clustered with other genes of this system. This entry excludes DnaK homologues that are not DnaK itself, such as the heat shock cognate protein HscA (IPR010236 from INTERPRO). However, it is not designed to distinguish among DnaK paralogs in eukaryotes. Note that a number of DnaK genes have shadow ORFs in the same reverse (relative to dnaK) reading frame, a few of which have been assigned glutamate dehydrogenase activity. The significance of this observation is unclear; the lengths of such shadow ORFs are highly variable as if the presumptive protein product is not conserved.; GO: 0005524 ATP binding, 0051082 unfolded protein binding, 0006457 protein folding. Probab=23.96 E-value=53 Score=13.90 Aligned_cols=34 Identities=18% Similarity=0.285 Sum_probs=15.9 Q ss_pred EEEEHHHCCCCCHHHHHH-HHHCCCHHHHHHHHCCCCE Q ss_conf 997713346688999999-9870897356433005100 Q 537021.9.peg.1 72 IRMTINETPHYNEQERKR-IIDSYPLHEREARTKGEPI 108 (251) Q Consensus 72 ~~~t~~DNp~l~~~~~~~-~~~~~~~~~~~~~~~G~~~ 108 (251) +.+...|-.| ++++|.. ++..+- ...-.|+|+-. 
T Consensus 97 ~~~~~~~K~y-~P~eISA~iL~klk--~~AE~yLGe~v 131 (598) T TIGR02350 97 VRVEVRGKEY-TPQEISAMILQKLK--KDAEAYLGEKV 131 (598) T ss_pred EEEEECCCEE-CHHHHHHHHHHHHH--HHHHHHCCCCC T ss_conf 8999618610-71779999999999--99998628530 No 32 >PRK01433 hscA chaperone protein HscA; Provisional Probab=22.93 E-value=56 Score=13.79 Aligned_cols=11 Identities=18% Similarity=-0.100 Sum_probs=6.6 Q ss_pred EEEEECCCCCC Q ss_conf 57520037748 Q 537021.9.peg.1 135 QIGGMDFGWHH 145 (251) Q Consensus 135 ~~~g~D~G~~~ 145 (251) .+...|+|..- T Consensus 194 ~ilVyDLGGGT 204 (595) T PRK01433 194 CYLVYDLGGGT 204 (595) T ss_pred EEEEEECCCCE T ss_conf 59999888971 No 33 >PTZ00271 hypoxanthine-guanine phosphoribosyltransferase; Provisional Probab=22.73 E-value=56 Score=13.76 Aligned_cols=106 Identities=8% Similarity=0.039 Sum_probs=59.8 Q ss_pred CCCCHHHHHHHHHCCCHHHHHHHHCC------CC----EEECCCEEECCCCEEECCCCCCCCCCEEEEEECCCC-CCCCE Q ss_conf 66889999999870897356433005------10----010143000012034315654656743575200377-48719 Q 537021.9.peg.1 80 PHYNEQERKRIIDSYPLHEREARTKG------EP----ILGSGRIFPIVEEDIVINSLDIPEHWVQIGGMDFGW-HHPFA 148 (251) Q Consensus 80 p~l~~~~~~~~~~~~~~~~~~~~~~G------~~----~~~~g~v~~~~~~~~~~~~~~~~~~~~~~~g~D~G~-~~p~a 148 (251) -.+++++|.+..+.+.... .+.|.| .. .+.-|++.+..+-..++....+|-.......-=+|- ...+. T Consensus 26 vL~seeeI~~rV~elg~qI-s~dY~~~~l~~~~pL~vigVLkGs~~F~aDL~R~i~~~~ip~~iDFm~vSSYg~gt~SsG 104 (211) T PTZ00271 26 TLVTQEQVWAATAKCAKKI-AEDYRSFKLTTENPLYLLCVLKGSFIFTADLARFLADEGVPVKVEFICASSYGTGVETSG 104 (211) T ss_pred EECCHHHHHHHHHHHHHHH-HHHHCCCCCCCCCCEEEEEECCCCHHHHHHHHHHHCCCCCCEEEEEEEEEECCCCCEECC T ss_conf 6768999999999999999-998703554678976999984570999999999712369983889999620699971075 Q ss_pred EEEEEEEC----CCCEEEEEEEHHHCCCCHHHHHHHHHHCCC Q ss_conf 99999977----399899994310259987999999983324 Q 537021.9.peg.1 149 AGHLVWNR----DSDVIYVVKNYRCREQTPIFHVAALKSWGK 186 (251) Q Consensus 149 ~~~~~~~~----~~~~~~i~de~~~~~~t~~~~~~~i~~~~~ 186 (251) .+.+..|. 
.|+.+.|++..-.+|.|...+.+.|+.++- T Consensus 105 ~v~i~~dl~~~i~gk~VLIVEDIvDTG~TL~~l~~~l~~~~p 146 (211) T PTZ00271 105 QVRMLLDVRDSVENRHILIVEDIVDSAITLQYLMRFMLAKKP 146 (211) T ss_pred EEEEEECCCCCCCCCEEEEEECHHCCCHHHHHHHHHHHHCCC T ss_conf 289944588776898799994132125589999999985499 No 34 >TIGR00956 3a01205 Pleiotropic Drug Resistance (PDR) Family protein; InterPro: IPR005285 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. 
Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This family includes transporters whose physiological function is not yet established. These proteins are thought to confer resistance to the chemicals cycloheximide and sulphomethuron methyl, BFA, azole antifungal agents, other antifungal agents: amorolfine and terbinafine. 
Some of them could serve as an efflux pump of various antibiotics.. Probab=20.14 E-value=64 Score=13.46 Aligned_cols=25 Identities=16% Similarity=0.226 Sum_probs=11.2 Q ss_pred CCEECCCCCCCHHHHHHHCCCEEEE Q ss_conf 0011178888889999978974776 Q 537021.9.peg.1 196 GLQHDKRSGEQLSAQYRRQGMKMLP 220 (251) Q Consensus 196 ~~~~~~~~~~s~~e~~r~~G~~~~~ 220 (251) |...=.|......+-+..+|+.+-+ T Consensus 314 G~QIYfG~~~~Ak~YF~~MGf~Cp~ 338 (1466) T TIGR00956 314 GYQIYFGPADKAKQYFEKMGFKCPD 338 (1466) T ss_pred CCEEEECCHHHHHHHHHHCCCCCCC T ss_conf 7167668768999889747885968 Done!