Query         gi|254781073|ref|YP_003065486.1| coproporphyrinogen III oxidase [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 307
No_of_seqs    154 out of 1016
Neff          5.0
Searched_HMMs 39220
Date          Mon May 30 04:07:50 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254781073.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK05330 coproporphyrinogen II 100.0       0       0  929.7  21.6  270  14-307     3-273 (297)
  2 pfam01218 Coprogen_oxidas Copr 100.0       0       0  917.8  21.1  264  20-307     2-271 (296)
  3 COG0408 HemF Coproporphyrinoge 100.0       0       0  908.5  19.1  272  13-307     2-279 (303)
  4 KOG1518 consensus              100.0       0       0  806.8  18.2  268  15-306    76-356 (382)
  5 TIGR02169 SMC_prok_A chromosom  63.2    0.61 1.6E-05   26.6  -2.7   72  25-130 1035-1106(1202)
  6 KOG0793 consensus               58.3      12 0.00031   18.3   3.3   58 192-249   868-931 (1004)
  7 TIGR02529 EutJ ethanolamine ut  56.9     6.2 0.00016   20.1   1.6   46  71-123    73-125 (240)
  8 pfam02961 BAF Barrier to autoi  52.3      14 0.00035   17.9   2.8   32 167-198     55-89 (89)
  9 TIGR00759 aceE 2-oxo-acid dehy  51.6     3.5 8.9E-05   21.8  -0.4   36 156-207   668-703 (905)
 10 COG3642 Mn2+-dependent serine/  50.1      16 0.00041   17.4   2.8   36 235-274   166-201 (204)
 11 pfam12268 DUF3612 Protein of u  44.3      11 0.00029   18.4   1.3   34 148-185    92-125 (178)
 12 pfam06319 DUF1052 Protein of u  41.2      15 0.00037   17.7   1.5   23 187-209    79-101 (158)
 13 TIGR01362 KDO8P_synth 3-deoxy-  39.9     8.9 0.00023   19.1   0.2   12 277-288   162-173 (279)
 14 COG5132 BUD31 Cell cycle contr  37.0      11 0.00028   18.5   0.3   56 222-284     11-70 (146)
 15 TIGR02539 SepCysS Sep-tRNA:Cys  34.8      16  0.0004   17.6   0.8   57 172-252   320-380 (381)
 16 KOG0738 consensus               34.1      37 0.00093   15.2   3.3   17   27-43     64-80 (491)
 17 pfam12138 Spherulin4 Spherulat  33.4      34 0.00088   15.3   2.4   31 189-219    82-113 (243)
 18 PHA00028 rep RNA replicase, be  32.9      38 0.00097   15.0   3.4   57 144-204   380-439 (561)
 19 COG2425 Uncharacterized protei  32.8      21 0.00053   16.8   1.1   14 144-157   226-239 (437)
 20 KOG4320 consensus               32.7      17 0.00042   17.4   0.6   23 115-137    99-121 (253)
 21 COG3354 FlaG
Putative archaeal 32.3 37 0.00094 15.1 2.4 23 76-98 92-114 (154) 22 pfam01125 G10 G10 protein. 32.2 38 0.00098 15.0 2.4 52 231-284 16-71 (145) 23 TIGR02707 butyr_kinase butyrat 31.5 25 0.00064 16.2 1.4 67 25-108 152-219 (353) 24 KOG4233 consensus 30.8 41 0.001 14.8 2.4 32 167-198 55-89 (90) 25 COG1103 Archaea-specific pyrid 30.0 5.6 0.00014 20.4 -2.2 63 184-252 318-381 (382) 26 PHA02087 hypothetical protein 28.3 34 0.00086 15.4 1.6 13 205-217 60-72 (83) 27 TIGR01975 isoAsp_dipep beta-as 28.2 13 0.00032 18.1 -0.6 54 126-182 217-274 (391) 28 KOG3410 consensus 27.7 33 0.00084 15.5 1.5 53 242-307 67-119 (120) 29 PRK10997 yieM hypothetical pro 27.4 17 0.00043 17.3 -0.1 13 146-158 388-400 (484) 30 PRK09491 rimI ribosomal-protei 26.7 30 0.00077 15.7 1.1 62 235-302 76-144 (144) 31 KOG0175 consensus 26.1 34 0.00088 15.3 1.3 33 152-184 106-138 (285) 32 KOG3404 consensus 25.8 22 0.00055 16.6 0.2 14 268-282 55-68 (145) 33 TIGR01491 HAD-SF-IB-PSPlk Phos 25.3 45 0.0011 14.6 1.8 13 259-271 35-47 (203) 34 pfam09433 HRP1 THO complex sub 25.2 52 0.0013 14.2 2.5 14 203-216 594-607 (725) 35 PRK13272 treA trehalase; Provi 23.6 37 0.00095 15.1 1.1 64 145-220 316-392 (537) 36 pfam04437 RINT1_TIP1 RINT-1 / 23.6 56 0.0014 14.0 4.5 28 175-202 243-272 (485) 37 COG2075 RPL24A Ribosomal prote 22.6 56 0.0014 14.0 1.9 16 70-85 15-30 (66) 38 TIGR01263 4HPPD 4-hydroxypheny 22.3 26 0.00067 16.1 0.1 35 74-112 43-83 (379) 39 KOG3063 consensus 22.3 24 0.00061 16.4 -0.1 27 68-94 46-72 (301) 40 TIGR00505 ribA GTP cyclohydrol 21.0 31 0.00078 15.7 0.3 11 201-211 108-119 (227) 41 cd06182 CYPOR_like NADPH cytoc 20.6 64 0.0016 13.6 2.6 39 236-274 225-264 (267) 42 KOG2305 consensus 20.5 41 0.001 14.8 0.8 15 121-135 138-154 (313) 43 COG4829 CatC1 Muconolactone de 20.3 25 0.00063 16.2 -0.3 29 268-303 46-74 (98) No 1 >PRK05330 coproporphyrinogen III oxidase; Provisional Probab=100.00 E-value=0 Score=929.67 Aligned_cols=270 Identities=46% Similarity=0.866 Sum_probs=261.0 Q ss_pred 
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEEEEEECCCEEEEEEEEEEE Q ss_conf 69999999999999999999999996301022223445665125212317888887775048995059465368877763 Q gi|254781073|r 14 IEERKRISQRKFENLQSIICTEFEKLENEAHENSANRSPKIFTVKHWLRDKSQKEDLGGGRMATLCAGKVFEKAAVLVST 93 (307) Q Consensus 14 ie~kk~~~~~wF~~LQd~Ic~~fE~lE~~~~~~~~~~~~~~F~~d~W~R~~g~~~~~GGG~s~VL~~G~VFEKaGVNfS~ 93 (307) +..++.++++||++||++||++||+||+. ++|++|.|+|++| |||+||||++|+||||||||||+ T Consensus 3 ~~~~~~~~~~~f~~LQ~~Ic~~~e~le~~----------~~F~~d~W~r~~g-----GGG~s~vl~~G~vFEKaGVN~S~ 67 (297) T PRK05330 3 MTPDKARVKAWLLGLQDRICAALEALDGP----------ARFVEDSWQRPEG-----GGGRSRVLRNGRVFEKAGVNFSH 67 (297) T ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHCCC----------CCEEECCCCCCCC-----CCCEEEEEECCCEEEEEEEEEEE T ss_conf 76679999999999999999999975088----------8705046515899-----99738998489178850178998 Q ss_pred EECCCHHHHHHHHHHCCCCCCCEEECEEEEEECCCCCCCHHHCCHHHEEECC-CCCCCCCCCCCCCCHHHCCCHHHHHHH Q ss_conf 0035426776421110267651000002233136887440102302102024-312532001100000110432679999 Q gi|254781073|r 94 VYGDLSPDFKDQILGTTKNPYFWATGLSVIVHPYNPHVPAVHFNIRMIVTGA-YWFGGGIDLTPSLESRRHSYDPDVIFF 172 (307) Q Consensus 94 V~G~~~p~~a~~i~g~~~~~~F~AtGiS~V~Hp~nP~vPt~H~N~R~f~~~~-~WFGGG~DLTP~~~~~~y~~~eD~~~f 172 (307) |+|+++|+++.+++++.++++|||||||||+||+||+|||+|||+|||+|++ ||||||+||||+ |+++||+++| T Consensus 68 V~G~~~p~~a~~~~~~~~~~~F~AtGiSlV~HP~NP~vPt~H~N~R~f~~~~~~WFGGG~DLTP~-----y~~~eD~~~f 142 (297) T PRK05330 68 VHGEFLPPSATAHRPELAGRSFEATGVSLVAHPRNPYVPTVHMNVRFFKTGPVWWFGGGFDLTPY-----YPFEEDAVHF 142 (297) T ss_pred EECCCCHHHHHHCCCCCCCCCEEEECCEEEEECCCCCCCCCCCEEEEEECCCCCEECCCCCCCCC-----CCCHHHHHHH T ss_conf 63568989997464545788706510126630478888630010589960576243576357777-----6867899999 Q ss_pred HHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99998403656733428999976787434532521234301010147866658967799999999999999999999875 Q gi|254781073|r 173 HNTLKEMCSQHAVANYTHYKEWCDRYFYLPHRQESRGIGGIFFDHLHSSPEMGGIDADFSFISAVGDCFIKLYPSLVRRN 
252 (307) Q Consensus 173 H~~~k~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~RGvGGIFFD~l~~~~~~~~~e~~f~f~~~vg~~fl~~y~~Iv~~r 252 (307) |++||++||+|++++|++||||||+|||||||+|+|||||||||||++ ++||++|+|+++||++||++|+|||+|| T Consensus 143 H~~lK~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~RGiGGIFfD~l~~----~~~~~~f~f~~~vg~~fl~~Y~~Iv~kr 218 (297) T PRK05330 143 HQTAKDACDPFGPDYYPRFKKWCDEYFYLKHRNEPRGIGGIFFDDLNS----GDFERDFAFTQAVGDAFLDAYLPIVERR 218 (297) T ss_pred HHHHHHHHHHHCHHHHHHHHHHHHHHCCCHHCCCCCCCCCEEHHHHCC----CCHHHHHHHHHHHHHHHHHHHHHHHHHH T ss_conf 999999997639677799986404430010026554667143222046----2267779999999999999899999986 Q ss_pred CCCCCCHHHHHHHHCCCCCHHHHHHHHCCCCCCCCCCCCCHHHEEECCCCCCCCC Q ss_conf 4799998999885420240112211110255357357888111411288878897 Q gi|254781073|r 253 YHHLFSEQDRQEQLIRRGRYAEFNLLYDKGTNFGFKTGGNVESILASMPPLVAWP 307 (307) Q Consensus 253 ~~~~~~~~~k~~Ql~rRgRYvEFNL~yDRGT~FGL~t~gr~esIl~SlPp~~~Wp 307 (307) ++++||++||+|||+||||||||||||||||+|||+||||||||||||||+|+|+ T Consensus 219 ~~~~~t~~~k~~Ql~rRGRYvEFNLlyDRGT~FGL~tggr~esILmSlPp~a~W~ 273 (297) T PRK05330 219 KDTPYGEREREFQLYRRGRYVEFNLVYDRGTLFGLQTGGRTESILMSLPPLVRWE 273 (297) T ss_pred CCCCCCHHHHHHHHHCCCEEEEEEEEEECCCEECCCCCCCHHHHHCCCCCCCEEC T ss_conf 2587567888788751660389998864686003557996677420499877011 No 2 >pfam01218 Coprogen_oxidas Coproporphyrinogen III oxidase. 
Probab=100.00 E-value=0 Score=917.82 Aligned_cols=264 Identities=44% Similarity=0.858 Sum_probs=256.2 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEEEEEECCCEEEEEEEEEEEEECCCH Q ss_conf 99999999999999999996301022223445665125212317888887775048995059465368877763003542 Q gi|254781073|r 20 ISQRKFENLQSIICTEFEKLENEAHENSANRSPKIFTVKHWLRDKSQKEDLGGGRMATLCAGKVFEKAAVLVSTVYGDLS 99 (307) Q Consensus 20 ~~~~wF~~LQd~Ic~~fE~lE~~~~~~~~~~~~~~F~~d~W~R~~g~~~~~GGG~s~VL~~G~VFEKaGVNfS~V~G~~~ 99 (307) ++++||++||++||++||+||++ ++|++|.|+|++| |||+||||++|+||||||||||+|+|+++ T Consensus 2 ~~e~~f~~lQ~~Ic~~~e~le~~----------~~F~~d~W~r~~g-----GGG~s~vl~~G~vfEKagVN~S~V~G~~~ 66 (296) T pfam01218 2 RVEAYLLGLQDRICAALEAIDGG----------AKFVEDAWERPEG-----GGGRSRVLQDGNVFEKAGVNFSHVHGKLL 66 (296) T ss_pred HHHHHHHHHHHHHHHHHHHHCCC----------CCCEECCCCCCCC-----CCCEEEEEECCCEEEEEEEEEEEEEECCC T ss_conf 68999999999999999975288----------8715326615899-----99738998089378850278999530278 Q ss_pred HHHHHHHHHCCCCCCCEEECEEEEEECCCCCCCHHHCCHHHEEEC------CCCCCCCCCCCCCCCHHHCCCHHHHHHHH Q ss_conf 677642111026765100000223313688744010230210202------43125320011000001104326799999 Q gi|254781073|r 100 PDFKDQILGTTKNPYFWATGLSVIVHPYNPHVPAVHFNIRMIVTG------AYWFGGGIDLTPSLESRRHSYDPDVIFFH 173 (307) Q Consensus 100 p~~a~~i~g~~~~~~F~AtGiS~V~Hp~nP~vPt~H~N~R~f~~~------~~WFGGG~DLTP~~~~~~y~~~eD~~~fH 173 (307) |+++.++++..++++|||||||||+||+||+|||+|||+|||++. 
.||||||+||||| |+++||+++|| T Consensus 67 p~~a~~~~~~~~~~~F~AtGiSlV~HP~NP~vPt~H~N~R~f~~~~~~~~~~wWFGGG~DLTP~-----y~~~eD~~~fH 141 (296) T pfam01218 67 PASATAMRPELAGRPFQAMGVSLVIHPRNPYVPTVHANVRYFIAEKEGEEPVWWFGGGFDLTPY-----YGFEEDAVHFH 141 (296) T ss_pred HHHHHHCCCCCCCCCEEECCCEEEEECCCCCCCCEEEEEEEEEEECCCCCCCEEECCCCCCCCC-----CCCHHHHHHHH T ss_conf 8899746433367760321222666427988763020068999745888764255465357777-----68778999999 Q ss_pred HHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 99984036567334289999767874345325212343010101478666589677999999999999999999998754 Q gi|254781073|r 174 NTLKEMCSQHAVANYTHYKEWCDRYFYLPHRQESRGIGGIFFDHLHSSPEMGGIDADFSFISAVGDCFIKLYPSLVRRNY 253 (307) Q Consensus 174 ~~~k~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~RGvGGIFFD~l~~~~~~~~~e~~f~f~~~vg~~fl~~y~~Iv~~r~ 253 (307) ++||++||+|+++||++||||||+|||||||+|+|||||||||||++ +++|++|+|+++||++||++|.|||+||+ T Consensus 142 ~~lK~~Cd~~~~~~Y~~fKk~CD~YFyipHR~E~RGiGGIFfD~l~~----~~~~~~f~f~~~vg~~fl~~Y~~Iv~kr~ 217 (296) T pfam01218 142 QTAKDACDPFGPDYYPRFKKWCDEYFYLKHRNEPRGVGGIFFDDLNE----GDFERSFAFVQSVGDAFLPAYLPIVERRK 217 (296) T ss_pred HHHHHHHHHHCHHHHHHHHHHHHHHCCHHHCCCCCCCCCEEHHHCCC----CCHHHHHHHHHHHHHHHHHHHHHHHHHHC T ss_conf 99999987628677899997756631213215554667144122046----35778999999999998888999999852 Q ss_pred CCCCCHHHHHHHHCCCCCHHHHHHHHCCCCCCCCCCCCCHHHEEECCCCCCCCC Q ss_conf 799998999885420240112211110255357357888111411288878897 Q gi|254781073|r 254 HHLFSEQDRQEQLIRRGRYAEFNLLYDKGTNFGFKTGGNVESILASMPPLVAWP 307 (307) Q Consensus 254 ~~~~~~~~k~~Ql~rRgRYvEFNL~yDRGT~FGL~t~gr~esIl~SlPp~~~Wp 307 (307) +++||++||+|||+||||||||||||||||+|||+||||||||||||||+|+|. 
T Consensus 218 ~~~~t~~~k~~Ql~rRGRYvEFNLlyDRGT~FGL~t~Gr~esILmSlPp~~~W~ 271 (296) T pfam01218 218 DTPYGEREREFQLYRRGRYVEFNLVYDRGTLFGLQTGGRTESILMSLPPLARWE 271 (296) T ss_pred CCCCCHHHHHHHHHHCCEEEEEEEEEECCCHHCCCCCCCCCEEEECCCCCCCCC T ss_conf 587677888788760752479998863686110547996001344489877111 No 3 >COG0408 HemF Coproporphyrinogen III oxidase [Coenzyme metabolism] Probab=100.00 E-value=0 Score=908.49 Aligned_cols=272 Identities=47% Similarity=0.902 Sum_probs=264.2 Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEEEEEECCCEEEEEEEEEE Q ss_conf 66999999999999999999999999630102222344566512521231788888777504899505946536887776 Q gi|254781073|r 13 DIEERKRISQRKFENLQSIICTEFEKLENEAHENSANRSPKIFTVKHWLRDKSQKEDLGGGRMATLCAGKVFEKAAVLVS 92 (307) Q Consensus 13 ~ie~kk~~~~~wF~~LQd~Ic~~fE~lE~~~~~~~~~~~~~~F~~d~W~R~~g~~~~~GGG~s~VL~~G~VFEKaGVNfS 92 (307) +++.+++.+++||++|||+||++||+||+++ +|.+|.|+|++++ |||++|||++|.||||+||||| T Consensus 2 ~~~~~~~~~~~~~~~Lqd~Ic~~le~~dg~~----------~F~~d~W~r~~g~----GgG~~~vl~~G~vFEk~GVn~S 67 (303) T COG0408 2 DIEPDKQAVKAWLLNLQDEICAALEALDGEA----------KFVEDSWQREEGG----GGGRSRVLMDGAVFEKGGVNFS 67 (303) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHHHHHCCCE----------EEECCCCCCCCCC----CCCEEEEEECCCEEEECCCEEE T ss_conf 8522899999999999999999998526640----------6740322246898----8872588722734663372158 Q ss_pred EEECCCHHHHHHHHHHCCCCCCCEEECEEEEEECCCCCCCHHHCCHHHEEECC------CCCCCCCCCCCCCCHHHCCCH Q ss_conf 30035426776421110267651000002233136887440102302102024------312532001100000110432 Q gi|254781073|r 93 TVYGDLSPDFKDQILGTTKNPYFWATGLSVIVHPYNPHVPAVHFNIRMIVTGA------YWFGGGIDLTPSLESRRHSYD 166 (307) Q Consensus 93 ~V~G~~~p~~a~~i~g~~~~~~F~AtGiS~V~Hp~nP~vPt~H~N~R~f~~~~------~WFGGG~DLTP~~~~~~y~~~ 166 (307) +|+|+++|+++++++++.++++|||||||||+||+||+|||+|||+|||+|.+ ||||||+||||+ |+++ T Consensus 68 ~V~G~~~P~~~~~~rp~~~g~~F~A~GiSlV~HP~NP~vPt~H~N~R~f~a~~~~~~~vwWFGGG~DLTP~-----y~~~ 142 (303) T COG0408 68 TVFGEFSPESATAMRPELAGRRFFATGISLVAHPKNPYVPTVHLNVRYFEAEKPGAEPVWWFGGGADLTPY-----YGFE 142 
(303) T ss_pred EEECCCCCHHHHHCCCHHCCCCEEEEEEEEEEECCCCCCCCEEEEEEEEEEECCCCCCCEEECCCCCCCCC-----CCCH T ss_conf 87446680787638830028863653116997258999874231478999745888754565477567545-----6623 Q ss_pred HHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHH Q ss_conf 67999999998403656733428999976787434532521234301010147866658967799999999999999999 Q gi|254781073|r 167 PDVIFFHNTLKEMCSQHAVANYTHYKEWCDRYFYLPHRQESRGIGGIFFDHLHSSPEMGGIDADFSFISAVGDCFIKLYP 246 (307) Q Consensus 167 eD~~~fH~~~k~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~RGvGGIFFD~l~~~~~~~~~e~~f~f~~~vg~~fl~~y~ 246 (307) ||++|||+++|.+||+|+.++||+||+|||||||||||||+|||||||||||++. +|+.+|+|+++||++||++|+ T Consensus 143 eD~~hfH~~~k~aC~~~~~~~YprfK~WCDeYFyLkHR~E~RGiGGiFfD~l~~~----~~~~~Faf~qdvG~afl~aY~ 218 (303) T COG0408 143 EDAVHFHRAAKDACDPHGPEDYPRFKKWCDEYFYLKHRNEPRGIGGIFFDDLNEP----DFERDFAFTQDVGKAFLPAYL 218 (303) T ss_pred HHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHEEECCCCCCCCCCEEECCCCCCC----CHHHHHHHHHHHHHHHHHHHH T ss_conf 6779999999997531782340688876355433014677777650531335678----989989999998777766437 Q ss_pred HHHHHHCCCCCCHHHHHHHHCCCCCHHHHHHHHCCCCCCCCCCCCCHHHEEECCCCCCCCC Q ss_conf 9998754799998999885420240112211110255357357888111411288878897 Q gi|254781073|r 247 SLVRRNYHHLFSEQDRQEQLIRRGRYAEFNLLYDKGTNFGFKTGGNVESILASMPPLVAWP 307 (307) Q Consensus 247 ~Iv~~r~~~~~~~~~k~~Ql~rRgRYvEFNL~yDRGT~FGL~t~gr~esIl~SlPp~~~Wp 307 (307) |||++|++++||++||+||||||||||||||||||||+||||||||+|||||||||+|+|| T Consensus 219 pIV~~r~~~~~te~er~fQl~RRGRYVEFNLvyDRGT~FGLqTgGr~ESILmSlPP~vrW~ 279 (303) T COG0408 219 PIVERRKNMPWTEREREFQLYRRGRYVEFNLVYDRGTLFGLQTGGRVESILMSLPPLVRWE 279 (303) T ss_pred HHHHHHCCCCCCHHHHHHHHHHCCCEEEEEEEEECCCEEEECCCCCHHHHHHCCCCCCCCC T ss_conf 9998743899865677788886153379998874463675126973655764299534256 No 4 >KOG1518 consensus Probab=100.00 E-value=0 Score=806.83 Aligned_cols=268 Identities=43% Similarity=0.810 Sum_probs=253.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEEEEEECCCEEEEEEEEEEEE Q ss_conf 
99999999999999999999999963010222234456651252123178888877750489950594653688777630 Q gi|254781073|r 15 EERKRISQRKFENLQSIICTEFEKLENEAHENSANRSPKIFTVKHWLRDKSQKEDLGGGRMATLCAGKVFEKAAVLVSTV 94 (307) Q Consensus 15 e~kk~~~~~wF~~LQd~Ic~~fE~lE~~~~~~~~~~~~~~F~~d~W~R~~g~~~~~GGG~s~VL~~G~VFEKaGVNfS~V 94 (307) +.-+.+.+...+..|.+||+++|++|+. .+|.+|.|+|++| |||+||||+||+||||||||+|.| T Consensus 76 ~~ir~~mE~lI~~~Qaevc~aleaidgg----------~kF~~D~W~r~eG-----GgGiscVlQDG~vFEKaGVnvSVV 140 (382) T KOG1518 76 SSIRAQMETLIREAQAEVCQALEAIDGG----------QKFKVDRWTRGEG-----GGGISCVLQDGNVFEKAGVNVSVV 140 (382) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHCCC----------CCCCEEEEECCCC-----CCCEEEEECCCCEEECCCCEEEEE T ss_conf 7799999999999999999999874066----------5310222102789-----875279970687632177268999 Q ss_pred ECCCHHHHHHHHHHCCC------CCCCEEECEEEEEECCCCCCCHHHCCHHHEEEC------CCCCCCCCCCCCCCCHHH Q ss_conf 03542677642111026------765100000223313688744010230210202------431253200110000011 Q gi|254781073|r 95 YGDLSPDFKDQILGTTK------NPYFWATGLSVIVHPYNPHVPAVHFNIRMIVTG------AYWFGGGIDLTPSLESRR 162 (307) Q Consensus 95 ~G~~~p~~a~~i~g~~~------~~~F~AtGiS~V~Hp~nP~vPt~H~N~R~f~~~------~~WFGGG~DLTP~~~~~~ 162 (307) +|.++|++..++++..+ ..+|+|+|||.||||+||++||+|+|||||+|. 
.||||||+||||+ T Consensus 141 ~G~l~p~Av~~mra~~k~lk~~~~lpFfA~GvS~ViHP~NPhaPT~HfNYRYFE~~~~dg~kqWWFGGG~DlTPs----- 215 (382) T KOG1518 141 YGVLPPEAVQAMRARHKNLKPTGPLPFFAAGVSSVIHPKNPHAPTTHFNYRYFETENADGVKQWWFGGGADLTPS----- 215 (382) T ss_pred ECCCCHHHHHHHHHCCCCCCCCCCCCEEECCCEEEECCCCCCCCCEEEEEEEEEEECCCCCEEEEECCCCCCCHH----- T ss_conf 444899999998722357777898754631520222268999873332026788754888377776578668715----- Q ss_pred CCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEHHCCCCCCCCCCHHHHHHHHHHHHHHHH Q ss_conf 04326799999999840365673342899997678743453252123430101014786665896779999999999999 Q gi|254781073|r 163 HSYDPDVIFFHNTLKEMCSQHAVANYTHYKEWCDRYFYLPHRQESRGIGGIFFDHLHSSPEMGGIDADFSFISAVGDCFI 242 (307) Q Consensus 163 y~~~eD~~~fH~~~k~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~RGvGGIFFD~l~~~~~~~~~e~~f~f~~~vg~~fl 242 (307) |+++||++|||+.+|+|||+||+++||+||||||+||||+||+|+|||||||||||++. |.|..|+||++|+.+|+ T Consensus 216 yl~eeD~~hFH~~~K~AcD~hdp~~YPrFKKWcDdYF~IkHR~E~RGiGGIFFDDld~~----d~ee~f~fv~~Ca~avv 291 (382) T KOG1518 216 YLFEEDGKHFHQLHKEACDKHDPTFYPRFKKWCDDYFYIKHRKERRGIGGIFFDDLDEP----DPEELFSFVTDCARAVV 291 (382) T ss_pred HHHHHHHHHHHHHHHHHHHCCCCCCCHHHHHHHHHHEEEEECCCCCCCCCEECCCCCCC----CHHHHHHHHHHHHHHHC T ss_conf 65334579999999987632497533567765123256420465456663522557888----99999999999887534 Q ss_pred HHHHHHHHHHCCCCCCHHHHHHHHCCCCCHHHHHHHHCCCCCCCCCCCC-CHHHEEECCCCCCCC Q ss_conf 9999999875479999899988542024011221111025535735788-811141128887889 Q gi|254781073|r 243 KLYPSLVRRNYHHLFSEQDRQEQLIRRGRYAEFNLLYDKGTNFGFKTGG-NVESILASMPPLVAW 306 (307) Q Consensus 243 ~~y~~Iv~~r~~~~~~~~~k~~Ql~rRgRYvEFNL~yDRGT~FGL~t~g-r~esIl~SlPp~~~W 306 (307) |+|+|||+||++++||+++|+||++||||||||||+|||||+|||+|.| |+||||||||..|+| T Consensus 292 PsYipiv~krkdmeft~~ek~wQ~lRRGrYvEFNliYDRGT~FGL~tpgsRiESILmsLPlha~w 356 (382) T KOG1518 292 PSYIPIVEKRKDMEFTEQEKQWQQLRRGRYVEFNLIYDRGTKFGLKTPGSRIESILMSLPLHASW 356 (382) T ss_pred CCCCHHHHHHCCCCCCHHHHHHHHHHCCCEEEEEEEEECCCEEECCCCCCHHHHHHHCCCCHHHH T ss_conf 
42204666505777674689999985053378999873574331347861067676436101102

No 5
>TIGR02169 SMC_prok_A chromosome segregation protein SMC; InterPro: IPR011891 The SMC (structural maintenance of chromosomes) family of proteins exists in virtually all organisms, including both bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms and form three types of heterodimer (SMC1/SMC3, SMC2/SMC4, SMC5/SMC6), which are core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal (IPR003395 from INTERPRO) domains separated by a long (circa 100 nm or 900 residues) coiled coil segment in the centre of which is a globular 'hinge' domain, characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and an LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. All SMC proteins appear to form dimers, either forming homodimers with themselves, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers are arranged in an antiparallel alignment.
This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. SMCs share not only sequence similarity but also structural similarity with ABC proteins. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by IPR011890 from INTERPRO. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent.
Probab=63.23 E-value=0.61 Score=26.63 Aligned_cols=72 Identities=19% Similarity=0.320 Sum_probs=49.5

Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEEEEEECCCEEEEEEEEEEEEECCCHHHHHH
Q ss_conf 99999999999999630102222344566512521231788888777504899505946536887776300354267764
Q gi|254781073|r 25 FENLQSIICTEFEKLENEAHENSANRSPKIFTVKHWLRDKSQKEDLGGGRMATLCAGKVFEKAAVLVSTVYGDLSPDFKD 104 (307)
Q Consensus 25 F~~LQd~Ic~~fE~lE~~~~~~~~~~~~~~F~~d~W~R~~g~~~~~GGG~s~VL~~G~VFEKaGVNfS~V~G~~~p~~a~ 104 (307)
+..-+..|-..++.+|.... .|.+ .+|+..--||+.|++.++|+=.. 
T Consensus 1035 L~~Er~~i~~rI~~~e~~Kr-------------------------------~~F~--~aF~~IN~~f~~iF~~LSP~G~g 1081 (1202) T TIGR02169 1035 LEEEREEILERIEEYEKKKR-------------------------------EVFM--EAFEAINENFKEIFAELSPGGTG 1081 (1202) T ss_pred HHHHHHHHHHHHHHHHHHHH-------------------------------HHHH--HHHHHHHHHHHHHHHHHCCCCCE T ss_conf 99989999999999887889-------------------------------9999--99999999999999853889824 Q ss_pred HHHHCCCCCCCEEECEEEEEECCCCC Q ss_conf 21110267651000002233136887 Q gi|254781073|r 105 QILGTTKNPYFWATGLSVIVHPYNPH 130 (307) Q Consensus 105 ~i~g~~~~~~F~AtGiS~V~Hp~nP~ 130 (307) ..-=..-+-|| +.||-|++||++=- T Consensus 1082 ~L~Le~PdDPF-~GGl~l~a~P~~K~ 1106 (1202) T TIGR02169 1082 ELILENPDDPF-AGGLELKAKPKGKP 1106 (1202) T ss_pred EEECCCCCCCC-CCCCEEEEEECCCC T ss_conf 65435888743-68717888737885 No 6 >KOG0793 consensus Probab=58.26 E-value=12 Score=18.26 Aligned_cols=58 Identities=19% Similarity=0.431 Sum_probs=33.4 Q ss_pred HHHHHHH----HHHHHH--CCCCCCCCEEHHCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 9976787----434532--521234301010147866658967799999999999999999999 Q gi|254781073|r 192 KEWCDRY----FYLPHR--QESRGIGGIFFDHLHSSPEMGGIDADFSFISAVGDCFIKLYPSLV 249 (307) Q Consensus 192 Kk~CD~Y----FylpHR--~E~RGvGGIFFD~l~~~~~~~~~e~~f~f~~~vg~~fl~~y~~Iv 249 (307) --|||+| |||+.- +|+|-|--..|=-+....-...--..++|-+.|-++|-.---||| T Consensus 868 HIWceDfLVRSFYLKNlqtseTRTvTQFHfLSWp~egvPasarslLdFRRKVNK~YRGRScpIi 931 (1004) T KOG0793 868 HIWCEDFLVRSFYLKNLQTSETRTVTQFHFLSWPDEGVPASARSLLDFRRKVNKCYRGRSCPII 931 (1004) T ss_pred HHHHHHHHHHHHHHHHCCCCCCEEEEEEEEECCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCEE T ss_conf 4303467899999853133552135544541254568863137899999885242157778669 No 7 >TIGR02529 EutJ ethanolamine utilization protein EutJ family protein; InterPro: IPR013366 Salmonella typhimurium is capable of growth on ethanolamine as a sole source of carbon nitrogen and energy . 
During growth on this compound the cells form a multimolecular structure known as a metabolosome, which is similar to the carboxysome used by some photosynthetic bacteria to fix CO2, and is thought to contain the enzymes needed to metabolise this compound to acetyl-CoA. The metabolosome is not directly involved in the biochemistry of ethanolamine utilization; instead its role is thought to be to concentrate the enzymes involved in this process, while also protecting the cell from the build-up of toxic intermediates. The genes involved in growth on ethanolamine are encoded in a 17-gene operon known as the ethanolamine utilization (eut) operon. EutJ shows similarity to chaperonins and may play a role in assembly of the metabolosome, though it is not necessary for growth on this compound.
Probab=56.85 E-value=6.2 Score=20.13 Aligned_cols=46 Identities=24% Similarity=0.387 Sum_probs=31.6

Q ss_pred CCEEEEEEECCCEEEEEEEEEEEEECCCHHHHHHHHHHC-------CCCCCCEEECEEEE
Q ss_conf 750489950594653688777630035426776421110-------26765100000223
Q gi|254781073|r 71 GGGRMATLCAGKVFEKAAVLVSTVYGDLSPDFKDQILGT-------TKNPYFWATGLSVI 123 (307)
Q Consensus 71 GGG~s~VL~~G~VFEKaGVNfS~V~G~~~p~~a~~i~g~-------~~~~~F~AtGiS~V 123 (307)
++|-.+++- +|+|-||..+++|-= -|.+++..... ..+. -||||++
T Consensus 73 ~~~~~k~~v--NV~EsAG~eV~~V~D--EPTAAa~vL~i~nG~VVDvGGG---TTGiSI~ 125 (240)
T TIGR02529 73 ~~~~~k~~v--NV~EsAG~eV~~V~D--EPTAAa~vL~i~nG~VVDvGGG---TTGiSI~ 125 (240)
T ss_pred CCCCEEEEE--EEEECCCEEEEEEEC--CHHHHHHHHCCCCCEEEEECCC---CEEEEEE
T ss_conf 889758999--987225615766514--2789998728857279984788---0335799

No 8
>pfam02961 BAF Barrier to autointegration factor. The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration.
Probab=52.33 E-value=14 Score=17.88 Aligned_cols=32 Identities=34% Similarity=0.728 Sum_probs=25.3 Q ss_pred HHHHHHHHHHHHHCCCCC---HHHHHHHHHHHHHH Q ss_conf 679999999984036567---33428999976787 Q gi|254781073|r 167 PDVIFFHNTLKEMCSQHA---VANYTHYKEWCDRY 198 (307) Q Consensus 167 eD~~~fH~~~k~~Cd~~~---~~~Y~~fKk~CD~Y 198 (307) .|-..|-.-+|++|.... .+-|.-.|.|||+. T Consensus 55 kde~~F~~Wlk~~~gAn~kQa~dcy~cLkeWCd~F 89 (89) T pfam02961 55 KDEELFKEWLKDTCGANAKQARDCYNCLKEWCSCF 89 (89) T ss_pred CCHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHCC T ss_conf 64999999999980888889988999999998629 No 9 >TIGR00759 aceE 2-oxo-acid dehydrogenase E1 component, homodimeric type; InterPro: IPR004660 Most members of this family are pyruvate dehydrogenase complex, E1 component. It includes a counterexample from Pseudomonas putida, MdeB, that is active as an E1 component of an alpha-ketoglutarate dehydrogenase complex rather than a pyruvate dehydrogenase complex. The second pyruvate dehydrogenase complex E1 protein from Alcaligenes eutrophus, PdhE, complements an aceE mutant of Escherichia coli but is not part of a pyruvate dehydrogenase complex operon, is more similar to the Pseudomonas putida MdeB than to E. coli AceE, and may have also have a different primary specificity. ; GO: 0016491 oxidoreductase activity. Probab=51.60 E-value=3.5 Score=21.75 Aligned_cols=36 Identities=19% Similarity=0.520 Sum_probs=29.4 Q ss_pred CCCCHHHCCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHCCC Q ss_conf 0000011043267999999998403656733428999976787434532521 Q gi|254781073|r 156 PSLESRRHSYDPDVIFFHNTLKEMCSQHAVANYTHYKEWCDRYFYLPHRQES 207 (307) Q Consensus 156 P~~~~~~y~~~eD~~~fH~~~k~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~ 207 (307) |+ |-|+ =++-++..++.+|-.- .||.||||.=-||. 
T Consensus 668 Pa-----FAyE-vAVI~~~Gl~RMy~E~----------qed~FyY~Tv~NE~ 703 (905) T TIGR00759 668 PA-----FAYE-VAVIMEDGLRRMYGEK----------QEDVFYYVTVLNEN 703 (905) T ss_pred CC-----HHHH-HHHHHHHHHHHHCCCC----------CCCEEEEEEEECCC T ss_conf 51-----0233-6787877897323667----------44224766450578 No 10 >COG3642 Mn2+-dependent serine/threonine protein kinase [Signal transduction mechanisms] Probab=50.13 E-value=16 Score=17.44 Aligned_cols=36 Identities=25% Similarity=0.358 Sum_probs=19.7 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHCCCCCHHH Q ss_conf 9999999999999998754799998999885420240112 Q gi|254781073|r 235 SAVGDCFIKLYPSLVRRNYHHLFSEQDRQEQLIRRGRYAE 274 (307) Q Consensus 235 ~~vg~~fl~~y~~Iv~~r~~~~~~~~~k~~Ql~rRgRYvE 274 (307) ..+-..|+..|...+.... .-.++---.-+|||||| T Consensus 166 e~l~~~f~~gY~~~~~~~~----~Vl~~~~eIr~RgRYve 201 (204) T COG3642 166 EELFAAFLEGYREEFGEAK----EVLERLEEIRLRGRYVE 201 (204) T ss_pred HHHHHHHHHHHHHHHCCHH----HHHHHHHHHHHHCCCCC T ss_conf 9999999999998750289----99999999997233000 No 11 >pfam12268 DUF3612 Protein of unknown function (DUF3612). This domain family is found in bacteria, and is approximately 180 amino acids in length. The family is found in association with pfam01381. Probab=44.32 E-value=11 Score=18.41 Aligned_cols=34 Identities=29% Similarity=0.572 Sum_probs=25.9 Q ss_pred CCCCCCCCCCCCHHHCCCHHHHHHHHHHHHHHCCCCCH Q ss_conf 25320011000001104326799999999840365673 Q gi|254781073|r 148 FGGGIDLTPSLESRRHSYDPDVIFFHNTLKEMCSQHAV 185 (307) Q Consensus 148 FGGG~DLTP~~~~~~y~~~eD~~~fH~~~k~~Cd~~~~ 185 (307) .--|.||.|.++.+. -|+..--+.+|+.|-+.+. T Consensus 92 lcaGIDLnPAi~aQG----~Da~~ia~~lk~~Cv~~gG 125 (178) T pfam12268 92 LCAGIDLNPAIEAQG----GDALALAQELKSACVSNGG 125 (178) T ss_pred EEECCCCCHHHHHCC----CCHHHHHHHHHHHHHHCCC T ss_conf 884235534676448----8999999999999983799 No 12 >pfam06319 DUF1052 Protein of unknown function (DUF1052). This family consists of several bacterial proteins of unknown function. 
Probab=41.17 E-value=15 Score=17.72 Aligned_cols=23 Identities=26% Similarity=0.663 Sum_probs=17.8

Q ss_pred HHHHHHHHHHHHHHHHHHCCCCC
Q ss_conf 42899997678743453252123
Q gi|254781073|r 187 NYTHYKEWCDRYFYLPHRQESRG 209 (307)
Q Consensus 187 ~Y~~fKk~CD~YFylpHR~E~RG 209 (307)
-.+.|-.|||.|||--+-+=+..
T Consensus 79 KW~~Yl~~CDrfffAV~~~fP~d 101 (158)
T pfam06319 79 KW~~Yl~~CDrfffAV~~~fP~d 101 (158)
T ss_pred CCHHHHHHHHHHHHHCCCCCCCC
T ss_conf 61678999887875168899802

No 13
>TIGR01362 KDO8P_synth 3-deoxy-8-phosphooctulonate synthase; InterPro: IPR006269 These sequences describe 2-dehydro-3-deoxyphosphooctonate aldolase. Alternate names include 3-deoxy-D-manno-octulosonic acid 8-phosphate and KDO-8 phosphate synthetase. It catalyzes the aldol condensation of phosphoenolpyruvate with D-arabinose 5-phosphate: phosphoenolpyruvate + D-arabinose 5-phosphate + H2O = 2-dehydro-3-deoxy-D-octonate 8-phosphate + phosphate. In Gram-negative bacteria, this is the first step in the biosynthesis of 3-deoxy-D-manno-octulosonate, part of the oligosaccharide core of lipopolysaccharide; GO: 0008676 3-deoxy-8-phosphooctulonate synthase activity, 0008152 metabolic process, 0005737 cytoplasm.
Probab=39.85 E-value=8.9 Score=19.11 Aligned_cols=12 Identities=50% Similarity=1.176 Sum_probs=8.7

Q ss_pred HHHCCCCCCCCC
Q ss_conf 111025535735
Q gi|254781073|r 277 LLYDKGTNFGFK 288 (307)
Q Consensus 277 L~yDRGT~FGL~ 288 (307)
||.+|||.||-+
T Consensus 162 ~lcERG~~FGYn 173 (279)
T TIGR01362 162 LLCERGTSFGYN 173 (279)
T ss_pred EEEECCCCCCCC
T ss_conf 586178888887

No 14
>COG5132 BUD31 Cell cycle control protein, G10 family [Transcription / Cell division and chromosome partitioning]
Probab=37.01 E-value=11 Score=18.49 Aligned_cols=56 Identities=21% Similarity=0.297 Sum_probs=28.5

Q ss_pred CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHH---HHHH-CCCCCHHHHHHHHCCCCC
Q ss_conf 66589677999999999999999999998754799998999---8854-202401122111102553
Q gi|254781073|r 222 PEMGGIDADFSFISAVGDCFIKLYPSLVRRNYHHLFSEQDR---QEQL-IRRGRYAEFNLLYDKGTN 284 (307)
Q Consensus 222 ~~~~~~e~~f~f~~~vg~~fl~~y~~Iv~~r~~~~~~~~~k---~~Ql-~rRgRYvEFNL~yDRGT~ 284 (307)
+.++.||+.--++.+.-...-+ +++.+..+ +..|. -||| ..|.||+ +||-|-||..
T Consensus 11 p~PdgFeki~ptL~~fe~~mRq-----aen~~~~~-sk~E~lwpIfQLHHQRSRYI-Y~LyyKR~aI 70 (146)
T COG5132 11 p~PdgFeki~ptL~~fe~~mRq-----aen~~~~~-sk~E~lwpIfQLHHQRSRYI-Y~LyyKR~aI 70 (146)
T ss_pred CCCCCHHHHCCHHHHHHHHHHH-----HHCCCCCC-CCHHHHHHHHHHHHHHHHHH-HHHHHHHHHH
T ss_conf 9996354524359999999998-----75277789-97677669999987655788-8878655167

No 15
>TIGR02539 SepCysS Sep-tRNA:Cys-tRNA synthase; InterPro: IPR013375 The aminoacyl-tRNA synthetases (EC 6.1.1.-) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric.
Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases. Many archaeal species lack cysteinyl-tRNA synthetase, an essential enzyme that provides Cys-tRNA(Cys) in other organisms. Instead, in a two-step pathway, tRNA-Cys is acylated with O-phosphoserine (Sep) to form Sep-tRNA(Cys), which is subsequently converted to Cys-tRNA(Cys). This pathway is also thought to function as the sole route of cysteine biosynthesis in these organisms. Several other archaeal species use both this pathway and direct tRNA(Cys) aminoacylation to synthesize Cys-tRNA(Cys), but this pathway appears to be the only route for cysteine biosynthesis. Proteins in this entry catalyse the second step in this pathway using pyridoxal phosphate and a sulphur donor to synthesize Cys from Sep while attached to the tRNA.
Probab=34.81 E-value=16 Score=17.55 Aligned_cols=57 Identities=23% Similarity=0.343 Sum_probs=34.5 Q ss_pred HHHHHHHHCCCCCHH---HHHHHHHHHHHHHHHHHHCCCCCCCCEEHHCCCCCCCCCCH-HHHHHHHHHHHHHHHHHHHH Q ss_conf 999998403656733---42899997678743453252123430101014786665896-77999999999999999999 Q gi|254781073|r 172 FHNTLKEMCSQHAVA---NYTHYKEWCDRYFYLPHRQESRGIGGIFFDHLHSSPEMGGI-DADFSFISAVGDCFIKLYPS 247 (307) Q Consensus 172 fH~~~k~~Cd~~~~~---~Y~~fKk~CD~YFylpHR~E~RGvGGIFFD~l~~~~~~~~~-e~~f~f~~~vg~~fl~~y~~ 247 (307) ||..+++. .+. +|.+.|| |||.|| ....+.-+ -..|-+.++=-..+.+|+.+ T Consensus 320 f~Eia~~~----kr~GyFLY~ELKk--------------Rgi~GI------~~G~Tk~~K~SvYGL~~Eqv~~Vv~sf~e 375 (381) T TIGR02539 320 FHEIAKKH----KRRGYFLYEELKK--------------RGIVGI------RSGQTKEIKLSVYGLTKEQVEYVVDSFKE 375 (381) T ss_pred CHHHHCCC----CCCCCCHHHHHHH--------------CCCEEC------CCCCCEEEEEEEECCCCCHHHHHHHHHHH T ss_conf 01341448----7977640566531--------------583001------78874277643312752114444678999 Q ss_pred HHHHH Q ss_conf 99875 Q gi|254781073|r 248 LVRRN 252 (307) Q Consensus 248 Iv~~r 252 (307) |+++. T Consensus 376 I~e~~ 380 (381) T TIGR02539 376 IVEEY 380 (381) T ss_pred HHHHC T ss_conf 99854 No 16 >KOG0738 consensus Probab=34.07 E-value=37 Score=15.16 Aligned_cols=17 Identities=18% Similarity=0.358 Sum_probs=10.0 Q ss_pred HHHHHHHHHHHHHHHHC Q ss_conf 99999999999963010 Q gi|254781073|r 27 NLQSIICTEFEKLENEA 43 (307) Q Consensus 27 ~LQd~Ic~~fE~lE~~~ 43 (307) .+-++|...++.++-.. T Consensus 64 e~vk~i~~~~~~~~~a~ 80 (491) T KOG0738 64 ELVKQIVRDLRDLKEAS 80 (491) T ss_pred HHHHHHHHHHHHHCCCC T ss_conf 99999987788612102 No 17 >pfam12138 Spherulin4 Spherulation-specific family 4. This protein is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 250 and 398 amino acids in length. There is a conserved NPG sequence motif and there are two completely conserved G residues that may be functionally important. 
Starvation will often induce spherulation - the production of spores - and this process may involve DNA-methylation. Changes in the methylation of spherulin4 are associated with the formation of spherules, but these changes are probably transient. Methylation of the gene accompanies its transcriptional activation, and spherulin4 mRNA is only detectable in late spherulating cultures and mature spherules. It is a spherulation-specific protein. Probab=33.38 E-value=34 Score=15.33 Aligned_cols=31 Identities=26% Similarity=0.404 Sum_probs=15.1 Q ss_pred HHHHHHHHHHHHHHHHCC-CCCCCCEEHHCCC Q ss_conf 899997678743453252-1234301010147 Q gi|254781073|r 189 THYKEWCDRYFYLPHRQE-SRGIGGIFFDHLH 219 (307) Q Consensus 189 ~~fKk~CD~YFylpHR~E-~RGvGGIFFD~l~ 219 (307) ..-++.-|.|.-.+-+-+ .=+|+|||||..- T Consensus 82 ~~v~~di~~y~~W~~~~~~~~~v~GiF~DE~p 113 (243) T pfam12138 82 SEVLADIDTYAGWPSQSATGYGVDGIFLDETP 113 (243) T ss_pred HHHHHHHHHHHHHHHCCCCCCCCCEEEECCCC T ss_conf 99998799985155405756344438962687 No 18 >PHA00028 rep RNA replicase, beta subunit Probab=32.88 E-value=38 Score=15.04 Aligned_cols=57 Identities=30% Similarity=0.540 Sum_probs=41.1 Q ss_pred CCCCCCCCCCCCCCCCHHHCCCHHHHHHHHHHHHHHCCC---CCHHHHHHHHHHHHHHHHHHHH Q ss_conf 243125320011000001104326799999999840365---6733428999976787434532 Q gi|254781073|r 144 GAYWFGGGIDLTPSLESRRHSYDPDVIFFHNTLKEMCSQ---HAVANYTHYKEWCDRYFYLPHR 204 (307) Q Consensus 144 ~~~WFGGG~DLTP~~~~~~y~~~eD~~~fH~~~k~~Cd~---~~~~~Y~~fKk~CD~YFylpHR 204 (307) ++.||- |.|.||++..+.--...|....-+.++-.-+. -++-+|+-+.|-|| .||-. 
T Consensus 380 GaH~f~-gvDVtPFYik~pi~~l~dlililN~l~~W~~v~Gi~dPr~~~v~~ky~~---liP~~ 439 (561) T PHA00028 380 GAHYFA-GVDVTPFYIKRPLDNLPDLILILNSLRRWGTVTGISDPRLYPLYNKYRD---LIPKT 439 (561) T ss_pred HHHHCC-CCCCCCEEECCCCCCHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHH---HCCCC T ss_conf 032216-7566535862554688999999986313012466467304999999998---68874 No 19 >COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain [General function prediction only] Probab=32.81 E-value=21 Score=16.76 Aligned_cols=14 Identities=21% Similarity=0.126 Sum_probs=8.7 Q ss_pred CCCCCCCCCCCCCC Q ss_conf 24312532001100 Q gi|254781073|r 144 GAYWFGGGIDLTPS 157 (307) Q Consensus 144 ~~~WFGGG~DLTP~ 157 (307) +..|+++=.+|-|. T Consensus 226 ~i~~~~~l~~Llp~ 239 (437) T COG2425 226 GITQSDDLLRLLPI 239 (437) T ss_pred CHHHCCHHHCCCCH T ss_conf 20001044314907 No 20 >KOG4320 consensus Probab=32.70 E-value=17 Score=17.37 Aligned_cols=23 Identities=30% Similarity=0.205 Sum_probs=21.2 Q ss_pred CEEECEEEEEECCCCCCCHHHCC Q ss_conf 10000022331368874401023 Q gi|254781073|r 115 FWATGLSVIVHPYNPHVPAVHFN 137 (307) Q Consensus 115 F~AtGiS~V~Hp~nP~vPt~H~N 137 (307) +-|.|+|+|+|-.-+.++.+|+- T Consensus 99 ~aalgl~vVaNfQet~i~~VH~~ 121 (253) T KOG4320 99 AAALGLSVVANFQETAIRIVHDI 121 (253) T ss_pred HHHHHHEEEEECCCCCCHHHHHH T ss_conf 99853013450545430222323 No 21 >COG3354 FlaG Putative archaeal flagellar protein G [Cell motility and secretion] Probab=32.30 E-value=37 Score=15.14 Aligned_cols=23 Identities=26% Similarity=0.371 Sum_probs=20.4 Q ss_pred EEEECCCEEEEEEEEEEEEECCC Q ss_conf 99505946536887776300354 Q gi|254781073|r 76 ATLCAGKVFEKAAVLVSTVYGDL 98 (307) Q Consensus 76 ~VL~~G~VFEKaGVNfS~V~G~~ 98 (307) -||-||++.+++.|+|++|-|.- T Consensus 92 tVliDG~iv~~a~~~~~~~~gs~ 114 (154) T COG3354 92 TVLIDGNIVTPAYVTFTSVNGSS 114 (154) T ss_pred EEEECCCEECCCEEEEEECCCCE T ss_conf 99985858235237999439975 No 22 >pfam01125 G10 G10 protein. 
Probab=32.22 E-value=38 Score=15.02 Aligned_cols=52 Identities=17% Similarity=0.210 Sum_probs=25.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHH---HH-CCCCCHHHHHHHHCCCCC Q ss_conf 9999999999999999999875479999899988---54-202401122111102553 Q gi|254781073|r 231 FSFISAVGDCFIKLYPSLVRRNYHHLFSEQDRQE---QL-IRRGRYAEFNLLYDKGTN 284 (307) Q Consensus 231 f~f~~~vg~~fl~~y~~Iv~~r~~~~~~~~~k~~---Ql-~rRgRYvEFNL~yDRGT~ 284 (307) |+.++++-..|-.-...+...-.. .=...|-.| |+ ..|.||| |+|-|.|-.. T Consensus 16 ~~~Ie~tL~e~~~kmr~a~~~~~~-~k~k~e~lWpI~rI~hqrsRYI-ydlyYk~k~I 71 (145) T pfam01125 16 FDKIEPTLDEFEAKMRDAENEPHE-GKRKKEALWPIFRIHHQRSRYI-YDLYYKRKAI 71 (145) T ss_pred HHHHHHHHHHHHHHHHHHHCCCCC-CCCCCCCCCHHHHHHHHHHHHH-HHHHHHHHHH T ss_conf 888789999999999998618877-7775621006899876661899-9999888676 No 23 >TIGR02707 butyr_kinase butyrate kinase; InterPro: IPR011245 This group represents bacterial butyrate kinase, an enzyme that facilitates the formation of butyryl-CoA by phosphorylating butyrate in the presence of ATP to form butyryl phosphate . The final steps in butyrate synthesis by anaerobic bacteria can occur via butyrate kinase and phosphotransbutyrylase or via butyryl-CoA:acetate CoA-transferase, the latter providing the dominant route for butyrate formation in human colonic bacteria .; GO: 0005524 ATP binding, 0047761 butyrate kinase activity, 0016310 phosphorylation, 0005737 cytoplasm. Probab=31.46 E-value=25 Score=16.22 Aligned_cols=67 Identities=22% Similarity=0.202 Sum_probs=41.0 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCEEEEEECCCCCCCCCCCEEE-EEEECCCEEEEEEEEEEEEECCCHHHHH Q ss_conf 999999999999996301022223445665125212317888887775048-9950594653688777630035426776 Q gi|254781073|r 25 FENLQSIICTEFEKLENEAHENSANRSPKIFTVKHWLRDKSQKEDLGGGRM-ATLCAGKVFEKAAVLVSTVYGDLSPDFK 103 (307) Q Consensus 25 F~~LQd~Ic~~fE~lE~~~~~~~~~~~~~~F~~d~W~R~~g~~~~~GGG~s-~VL~~G~VFEKaGVNfS~V~G~~~p~~a 103 (307) |.-|-..-|+.=-+-|.... 
...-.|+.-+ .|||+| -+-++|+|++ -+|==.=-|.|+|+.+ T Consensus 152 fHALNqKAvARr~A~e~gK~-----YE~~N~ivaH----------lGGGISvaAH~~Gr~vD--VNNALdGeGPFSPERs 214 (353) T TIGR02707 152 FHALNQKAVARRIAKELGKR-----YEEMNLIVAH----------LGGGISVAAHRKGRVVD--VNNALDGEGPFSPERS 214 (353) T ss_pred HHHHHHHHHHHHHHHHCCCC-----CCCCCEEEEE----------CCCCCEEEEECCCCEEE--EECCCCCCCCCCCCCC T ss_conf 44433889999999972896-----0044548998----------28870341454861799--7347784329387545 Q ss_pred HHHHH Q ss_conf 42111 Q gi|254781073|r 104 DQILG 108 (307) Q Consensus 104 ~~i~g 108 (307) -..|- T Consensus 215 G~LP~ 219 (353) T TIGR02707 215 GTLPL 219 (353) T ss_pred CCCCH T ss_conf 65658 No 24 >KOG4233 consensus Probab=30.80 E-value=41 Score=14.84 Aligned_cols=32 Identities=34% Similarity=0.728 Sum_probs=25.0 Q ss_pred HHHHHHHHHHHHHCCC---CCHHHHHHHHHHHHHH Q ss_conf 6799999999840365---6733428999976787 Q gi|254781073|r 167 PDVIFFHNTLKEMCSQ---HAVANYTHYKEWCDRY 198 (307) Q Consensus 167 eD~~~fH~~~k~~Cd~---~~~~~Y~~fKk~CD~Y 198 (307) -|-..|..=+|.+|.. +..+-|.-.+.|||.+ T Consensus 55 KdE~lF~~Wlk~~~gat~~~a~~~~~CL~eWc~~F 89 (90) T KOG4233 55 KDEDLFQEWLKETCGATAKQAQDCFNCLNEWCDCF 89 (90) T ss_pred CCHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHH T ss_conf 66999999999982844888999998899999975 No 25 >COG1103 Archaea-specific pyridoxal phosphate-dependent enzymes [General function prediction only] Probab=29.98 E-value=5.6 Score=20.44 Aligned_cols=63 Identities=11% Similarity=0.110 Sum_probs=32.4 Q ss_pred CHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEHHCCCCCCCCCCH-HHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 73342899997678743453252123430101014786665896-7799999999999999999999875 Q gi|254781073|r 184 AVANYTHYKEWCDRYFYLPHRQESRGIGGIFFDHLHSSPEMGGI-DADFSFISAVGDCFIKLYPSLVRRN 252 (307) Q Consensus 184 ~~~~Y~~fKk~CD~YFylpHR~E~RGvGGIFFD~l~~~~~~~~~-e~~f~f~~~vg~~fl~~y~~Iv~~r 252 (307) .+.+|.--||---.-|||=+-=..|||.||==-- ...+ -..+-+..+-.+...+++.+|+++. 
T Consensus 318 tp~f~eIakk~~r~gyFlY~ELK~RgI~GI~~G~------Tk~~K~svyGl~~Eqve~V~~afkeI~eky 381 (382) T COG1103 318 TPVFHEIAKKHKRKGYFLYEELKKRGIHGIQPGQ------TKYFKLSVYGLSWEQVEYVVDAFKEIAEKY 381 (382) T ss_pred CCHHHHHHHHCCCCCEEEHHHHHHCCCCCCCCCC------EEEEEEEEECCCHHHHHHHHHHHHHHHHHC T ss_conf 7128999875767854668988765866506675------137888750577999999999999999861 No 26 >PHA02087 hypothetical protein Probab=28.26 E-value=34 Score=15.39 Aligned_cols=13 Identities=62% Similarity=0.902 Sum_probs=8.1 Q ss_pred CCCCCCCCEEHHC Q ss_conf 5212343010101 Q gi|254781073|r 205 QESRGIGGIFFDH 217 (307) Q Consensus 205 ~E~RGvGGIFFD~ 217 (307) -E.-|-|||-||+ T Consensus 60 pe~~ggggi~fdd 72 (83) T PHA02087 60 PESEGGGGITFDD 72 (83) T ss_pred CCCCCCCCEEECC T ss_conf 7346898603564 No 27 >TIGR01975 isoAsp_dipep beta-aspartyl peptidase; InterPro: IPR010229 Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. 
Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This group of proteins includes metallopeptidases belonging to the MEROPS peptidase family M38 (clan MJ, beta-aspartyl dipeptidase family). This entry includes the beta-aspartyl dipeptidase from Escherichia coli (3.4.19.5 from EC, IadA), which degrades isoaspartyl dipeptides and may unblock degradation of proteins that cannot be repaired. This entry also describes closely related proteins from other species (e.g. Clostridium perfringens, Thermoanaerobacter tengcongensis) that may be equivalent in function. This family shows homology to dihydroorotases. The L-isoaspartyl derivative of Asp arises non-enzymatically over time as a form of protein damage. In this isomerisation, the connectivity of the polypeptide changes to pass through the beta-carboxyl of the side chain. Much but not all of this damage can be repaired by protein-L-isoaspartate (D-aspartate) O-methyltransferase. 
Probab=28.18 E-value=13 Score=18.13 Aligned_cols=54 Identities=26% Similarity=0.374 Sum_probs=37.2 Q ss_pred CCCCCCCHHHCC--HHHEEECC--CCCCCCCCCCCCCCHHHCCCHHHHHHHHHHHHHHCCC Q ss_conf 368874401023--02102024--3125320011000001104326799999999840365 Q gi|254781073|r 126 PYNPHVPAVHFN--IRMIVTGA--YWFGGGIDLTPSLESRRHSYDPDVIFFHNTLKEMCSQ 182 (307) Q Consensus 126 p~nP~vPt~H~N--~R~f~~~~--~WFGGG~DLTP~~~~~~y~~~eD~~~fH~~~k~~Cd~ 182 (307) |-|-++|| ||| .-.||.+- .--||=.|||-+.... ..+|-.+.=-..+|.+.++ T Consensus 217 Pi~q~lPT-H~nR~~~LFE~g~~fa~~GG~iDlTss~~p~--~~~egev~p~eGlk~~l~~ 274 (391) T TIGR01975 217 PITQFLPT-HINRNRELFEAGLEFAKKGGTIDLTSSIDPQ--FRKEGEVKPAEGLKKLLEA 274 (391) T ss_pred CCCCCCCC-CCCCCHHHHHHHHHHHHCCCEEEEECCCCCC--CCCCCCCCHHHHHHHHHHC T ss_conf 70025577-6476756899999999739808760278887--5535543767899999963 No 28 >KOG3410 consensus Probab=27.69 E-value=33 Score=15.47 Aligned_cols=53 Identities=17% Similarity=0.132 Sum_probs=36.5 Q ss_pred HHHHHHHHHHHCCCCCCHHHHHHHHCCCCCHHHHHHHHCCCCCCCCCCCCCHHHEEECCCCCCCCC Q ss_conf 999999998754799998999885420240112211110255357357888111411288878897 Q gi|254781073|r 242 IKLYPSLVRRNYHHLFSEQDRQEQLIRRGRYAEFNLLYDKGTNFGFKTGGNVESILASMPPLVAWP 307 (307) Q Consensus 242 l~~y~~Iv~~r~~~~~~~~~k~~Ql~rRgRYvEFNL~yDRGT~FGL~t~gr~esIl~SlPp~~~Wp 307 (307) +++|+.........+.-.--++.+..-|-|--+||+..|-=|-+++. 
|-|+|| T Consensus 67 T~Ae~~~e~~qe~~~~kril~~a~ktHk~rve~~n~~Ld~~tEh~di-------------pKvsw~ 119 (120) T KOG3410 67 TKAELRFEKVQEKRPCKRILKEAGKTHKERVEKFNRHLDEMTEHFDI-------------PKVSWT 119 (120) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCC-------------CCCCCC T ss_conf 37788889999866999999998767899999999988612402476-------------678999 No 29 >PRK10997 yieM hypothetical protein; Provisional Probab=27.43 E-value=17 Score=17.34 Aligned_cols=13 Identities=54% Similarity=0.886 Sum_probs=9.6 Q ss_pred CCCCCCCCCCCCC Q ss_conf 3125320011000 Q gi|254781073|r 146 YWFGGGIDLTPSL 158 (307) Q Consensus 146 ~WFGGG~DLTP~~ 158 (307) .=|+||.|+.|++ T Consensus 388 ~sF~GGTD~~~~L 400 (484) T PRK10997 388 QSFRGGTDLAPCL 400 (484) T ss_pred CCCCCCCCHHHHH T ss_conf 8888984579999 No 30 >PRK09491 rimI ribosomal-protein-alanine N-acetyltransferase; Provisional Probab=26.66 E-value=30 Score=15.70 Aligned_cols=62 Identities=8% Similarity=0.170 Sum_probs=32.8 Q ss_pred HHHHHHHHHHHHHHHHHHCCC-----CCCHHHHHHHHCCCCCHHHHHHHHCCCCCCCC--CCCCCHHHEEECCCC Q ss_conf 999999999999999875479-----99989998854202401122111102553573--578881114112888 Q gi|254781073|r 235 SAVGDCFIKLYPSLVRRNYHH-----LFSEQDRQEQLIRRGRYAEFNLLYDKGTNFGF--KTGGNVESILASMPP 302 (307) Q Consensus 235 ~~vg~~fl~~y~~Iv~~r~~~-----~~~~~~k~~Ql~rRgRYvEFNL~yDRGT~FGL--~t~gr~esIl~SlPp 302 (307) +.+|...+.........+.-. --..++....||++==+. 
-+ |..=+- ...||-++|+||||- T Consensus 76 ~G~g~~Ll~~~~~~~~~~g~~~i~LEVr~sN~~A~~lY~k~GF~---~~---g~R~~YY~~~dg~EDAiiM~l~L 144 (144) T PRK09491 76 QGLGRALLEHLIDELEKRGVATLWLEVRASNAAAIALYESLGFN---EV---TIRRNYYPTADGREDAIIMALPL 144 (144) T ss_pred CCHHHHHHHHHHHHHHHCCCCEEEEEEECCCHHHHHHHHHCCCE---EE---CEECCCCCCCCCCCCCEEEECCC T ss_conf 89799999999999998799799999957878999999988998---91---78878568978990216886279 No 31 >KOG0175 consensus Probab=26.09 E-value=34 Score=15.33 Aligned_cols=33 Identities=24% Similarity=0.441 Sum_probs=27.6 Q ss_pred CCCCCCCCHHHCCCHHHHHHHHHHHHHHCCCCC Q ss_conf 001100000110432679999999984036567 Q gi|254781073|r 152 IDLTPSLESRRHSYDPDVIFFHNTLKEMCSQHA 184 (307) Q Consensus 152 ~DLTP~~~~~~y~~~eD~~~fH~~~k~~Cd~~~ 184 (307) .+++||+....-.-..||.+|++.|-.-|-.|. T Consensus 106 IeIn~ylLGTmAGgAADCqfWer~L~kecRL~e 138 (285) T KOG0175 106 IEINPYLLGTMAGGAADCQFWERVLAKECRLHE 138 (285) T ss_pred EEECHHHHHCCCCCCHHHHHHHHHHHHHHHHHH T ss_conf 640614310024751456899999998877898 No 32 >KOG3404 consensus Probab=25.83 E-value=22 Score=16.64 Aligned_cols=14 Identities=36% Similarity=0.940 Sum_probs=5.9 Q ss_pred CCCCHHHHHHHHCCC Q ss_conf 024011221111025 Q gi|254781073|r 268 RRGRYAEFNLLYDKG 282 (307) Q Consensus 268 rRgRYvEFNL~yDRG 282 (307) .|.||+ |+|-|-|+ T Consensus 55 QrsRYi-YdlyykR~ 68 (145) T KOG3404 55 QRSRYI-YDLYYKRK 68 (145) T ss_pred HHHHHH-HHHHHHHH T ss_conf 423668-88888788 No 33 >TIGR01491 HAD-SF-IB-PSPlk Phosphoserine phosphatase-like hydrolase, archaeal; InterPro: IPR006386 This group of sequences belong to the IB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate-nucleophile hydrolases. The sequences are all from archaeal species. The phylogenetically closest group of sequences to these are phosphoserine phosphatases. As there are no known archaeal phosphoserine phosphatases, it seems likely that this group of sequences represent the archaeal branch of PSPase.. 
Probab=25.35 E-value=45 Score=14.58 Aligned_cols=13 Identities=15% Similarity=0.365 Sum_probs=5.0 Q ss_pred HHHHHHHHCCCCC Q ss_conf 8999885420240 Q gi|254781073|r 259 EQDRQEQLIRRGR 271 (307) Q Consensus 259 ~~~k~~Ql~rRgR 271 (307) +.+|.+.|+++|+ T Consensus 35 ~A~kn~elf~~G~ 47 (203) T TIGR01491 35 LAKKNAELFESGS 47 (203) T ss_pred HHHHHHHHHHCCC T ss_conf 5678789873596 No 34 >pfam09433 HRP1 THO complex subunit HPR1. The THO complex plays a role in coupling transcription elongation to mRNA export. It is composed of subunits THP2, HPR1, THO2 and MFT1. Probab=25.21 E-value=52 Score=14.19 Aligned_cols=14 Identities=29% Similarity=0.655 Sum_probs=5.6 Q ss_pred HHCCCCCCCCEEHH Q ss_conf 32521234301010 Q gi|254781073|r 203 HRQESRGIGGIFFD 216 (307) Q Consensus 203 HR~E~RGvGGIFFD 216 (307) --+|.-||-|+|=+ T Consensus 594 kVdE~tGi~GLF~e 607 (725) T pfam09433 594 KVDEKTGLAGLFDE 607 (725) T ss_pred CCCCCCCCCCCCCH T ss_conf 76610143213684 No 35 >PRK13272 treA trehalase; Provisional Probab=23.63 E-value=37 Score=15.12 Aligned_cols=64 Identities=20% Similarity=0.246 Sum_probs=38.6 Q ss_pred CCCCCCCC-------------CCCCCCCHHHCCCHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHCCCCCCC Q ss_conf 43125320-------------01100000110432679999999984036567334289999767874345325212343 Q gi|254781073|r 145 AYWFGGGI-------------DLTPSLESRRHSYDPDVIFFHNTLKEMCSQHAVANYTHYKEWCDRYFYLPHRQESRGIG 211 (307) Q Consensus 145 ~~WFGGG~-------------DLTP~~~~~~y~~~eD~~~fH~~~k~~Cd~~~~~~Y~~fKk~CD~YFylpHR~E~RGvG 211 (307) .=||+.|. ||+-.+ |..+.+...++....+.|...=.....+.++.-++|++- |. 
T Consensus 316 SRW~~d~~~L~tI~T~~IiPVDLNalL----y~~E~~lA~~~~~~~~~~~~~y~~~A~~R~~aI~~~LWn----e~---- 383 (537) T PRK13272 316 SRWLADGRELATIRTTAIVPIDLNSLL----YHLERTLAQACASSGAACSQDYAALAQQRKQAIDAHLWN----PA---- 383 (537) T ss_pred CCCCCCCCCCCCCCCCEECEECHHHHH----HHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHHHHHHCCC----CC---- T ss_conf 233567865243330221436589999----999999999998658467999999999999999997628----87---- Q ss_pred CEEHHCCCC Q ss_conf 010101478 Q gi|254781073|r 212 GIFFDHLHS 220 (307) Q Consensus 212 GIFFD~l~~ 220 (307) |+||||=-. T Consensus 384 G~~~DYd~~ 392 (537) T PRK13272 384 GYYADYDWQ 392 (537) T ss_pred CEEEECCCC T ss_conf 804420046 No 36 >pfam04437 RINT1_TIP1 RINT-1 / TIP-1 family. This family includes RINT-1, a Rad50 interacting protein which participates in radiation induced checkpoint control, as well as the TIP-1 protein from yeast that seems to be involved in a complex with Sec20p that is required for golgi transport. Probab=23.58 E-value=56 Score=13.99 Aligned_cols=28 Identities=32% Similarity=0.718 Sum_probs=18.7 Q ss_pred HHHHHCCCCCHHH--HHHHHHHHHHHHHHH Q ss_conf 9984036567334--289999767874345 Q gi|254781073|r 175 TLKEMCSQHAVAN--YTHYKEWCDRYFYLP 202 (307) Q Consensus 175 ~~k~~Cd~~~~~~--Y~~fKk~CD~YFylp 202 (307) .+..+|-.++... 
....+.|||+.|||- T Consensus 243 ~l~~~~~~lnsa~yi~~~L~eWs~~v~Fle 272 (485) T pfam04437 243 ELERTCRKLNAANYLESKLKDWSDDVFFLE 272 (485) T ss_pred HHHHHHHHHHHHHHHHHHHHHHCCCCEEHH T ss_conf 399999998359999999998557763121 No 37 >COG2075 RPL24A Ribosomal protein L24E [Translation, ribosomal structure and biogenesis] Probab=22.63 E-value=56 Score=13.95 Aligned_cols=16 Identities=38% Similarity=0.434 Sum_probs=13.5 Q ss_pred CCCEEEEEEECCCEEE Q ss_conf 7750489950594653 Q gi|254781073|r 70 LGGGRMATLCAGKVFE 85 (307) Q Consensus 70 ~GGG~s~VL~~G~VFE 85 (307) -|-|+|.|..||.||= T Consensus 15 PGtG~m~Vr~Dg~v~~ 30 (66) T COG2075 15 PGTGIMYVRNDGKVLR 30 (66) T ss_pred CCCEEEEEECCCEEEE T ss_conf 9942799955983999 No 38 >TIGR01263 4HPPD 4-hydroxyphenylpyruvate dioxygenase; InterPro: IPR005956 4-hydroxyphenylpyruvate dioxygenase (1.13.11.27 from EC) oxidizes 4-hydroxyphenylpyruvate, a tyrosine and phenylalanine catabolite, to homogentisate. Homogentisate can undergo a further non-enzymatic oxidation and polymerization into brown pigments that protect some bacterial species from light. A similar process occurs spontaneously in blood and is hemolytic . In some bacterial species, this enzyme has been studied as a hemolysin.; GO: 0003868 4-hydroxyphenylpyruvate dioxygenase activity, 0009072 aromatic amino acid family metabolic process. Probab=22.31 E-value=26 Score=16.07 Aligned_cols=35 Identities=14% Similarity=0.204 Sum_probs=16.4 Q ss_pred EEEEEECCCEEEEEEEEEEEEECCC------HHHHHHHHHHCCCC Q ss_conf 4899505946536887776300354------26776421110267 Q gi|254781073|r 74 RMATLCAGKVFEKAAVLVSTVYGDL------SPDFKDQILGTTKN 112 (307) Q Consensus 74 ~s~VL~~G~VFEKaGVNfS~V~G~~------~p~~a~~i~g~~~~ 112 (307) .+-++++|+| -.=+|...... 
...++..+.+..++ T Consensus 43 ~~~~~rqG~i----~fv~~~~~~~~~~~~~~~~~f~~~HGdgv~d 83 (379) T TIGR01263 43 ASTVYRQGQI----NFVVTAELSSDTTTASEAADFAAKHGDGVKD 83 (379) T ss_pred EEEEEECCEE----EEEEECCCCCCCHHHHHHHHHHHHCCCCHHH T ss_conf 6999993759----9998457788851256999999757882312 No 39 >KOG3063 consensus Probab=22.25 E-value=24 Score=16.36 Aligned_cols=27 Identities=22% Similarity=0.327 Sum_probs=19.9 Q ss_pred CCCCCEEEEEEECCCEEEEEEEEEEEE Q ss_conf 877750489950594653688777630 Q gi|254781073|r 68 EDLGGGRMATLCAGKVFEKAAVLVSTV 94 (307) Q Consensus 68 ~~~GGG~s~VL~~G~VFEKaGVNfS~V 94 (307) ...+|-.+.-+.+|+-+|--||-++-| T Consensus 46 Etv~G~V~l~lk~gkkleH~Gikiefi 72 (301) T KOG3063 46 ETVSGKVNLRLKDGKKLEHQGIKIEFI 72 (301) T ss_pred CEEEEEEEEEECCCCCCCCCCEEEEEE T ss_conf 753018999974786111276489999 No 40 >TIGR00505 ribA GTP cyclohydrolase II; InterPro: IPR000926 GTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin. The enzyme converts GTP and water to formate, 2,5-diamino-6-hydroxy-4-(5-phosphoribosylamino)- pyrimidine and pyrophosphate, and requires magnesium as a cofactor. It is sometimes found as a bifunctional enzyme with 3,4-dihydroxy-2-butanone 4-phosphate synthase (DHBP_synthase) IPR000422 from INTERPRO. ; GO: 0003935 GTP cyclohydrolase II activity, 0009231 riboflavin biosynthetic process. Probab=20.99 E-value=31 Score=15.65 Aligned_cols=11 Identities=64% Similarity=0.942 Sum_probs=8.3 Q ss_pred HHHH-CCCCCCC Q ss_conf 4532-5212343 Q gi|254781073|r 201 LPHR-QESRGIG 211 (307) Q Consensus 201 lpHR-~E~RGvG 211 (307) |=|| +|=|||| T Consensus 108 iY~RG~EGRGIG 119 (227) T TIGR00505 108 IYLRGQEGRGIG 119 (227) T ss_pred EEECCCCCCCCC T ss_conf 970376677623 No 41 >cd06182 CYPOR_like NADPH cytochrome p450 reductase (CYPOR) serves as an electron donor in several oxygenase systems and is a component of nitric oxide synthases and methionine synthase reductases. CYPOR transfers two electrons from NADPH to the heme of cytochrome p450 via FAD and FMN. 
CYPOR has a C-terminal ferredoxin reductase (FNR)-like FAD and NAD binding module, an FMN-binding domain, and an additional connecting domain (inserted within the FAD binding region) that orients the FNR and FMN binding domains. Ferredoxin-NADP+ (oxido)reductase is an FAD-containing enzyme that catalyzes the reversible electron transfer between NADP(H) and electron carrier proteins such as ferredoxin and flavodoxin. Isoforms of these flavoproteins (i.e. having a non-covalently bound FAD as a prosthetic group) are present in chloroplasts, mitochondria, and bacteria and participate in a wide variety of redox metabolic pathways. The C-terminal domain contains most of the NADP(H) binding residues and the N-t Probab=20.62 E-value=64 Score=13.61 Aligned_cols=39 Identities=15% Similarity=0.146 Sum_probs=23.4 Q ss_pred HHHHHHHHHHHHHHHHHCCCCCCHHHHHHH-HCCCCCHHH Q ss_conf 999999999999998754799998999885-420240112 Q gi|254781073|r 236 AVGDCFIKLYPSLVRRNYHHLFSEQDRQEQ-LIRRGRYAE 274 (307) Q Consensus 236 ~vg~~fl~~y~~Iv~~r~~~~~~~~~k~~Q-l~rRgRYvE 274 (307) .++..+..+...|+.+.....-.+.+.-++ |.+.|||+| T Consensus 225 ~M~~~V~~~l~~il~~~g~~~~~~A~~~l~~l~~~gRY~~ 264 (267) T cd06182 225 SMAKDVEDALVKIIAKAGGVDESDAEEYLKELEDEGRYVE 264 (267) T ss_pred HHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHCCCEEE T ss_conf 1068999999999998479999999999999998698666 No 42 >KOG2305 consensus Probab=20.48 E-value=41 Score=14.85 Aligned_cols=15 Identities=40% Similarity=0.937 Sum_probs=10.7 Q ss_pred EEEEECCCC--CCCHHH Q ss_conf 223313688--744010 Q gi|254781073|r 121 SVIVHPYNP--HVPAVH 135 (307) Q Consensus 121 S~V~Hp~nP--~vPt~H 135 (307) -+|+||.|| ++|-+. 
T Consensus 138 ~lvaHPvNPPyfiPLvE 154 (313) T KOG2305 138 CLVAHPVNPPYFIPLVE 154 (313) T ss_pred EEEECCCCCCCCCCHHE T ss_conf 25745799974120110 No 43 >COG4829 CatC1 Muconolactone delta-isomerase [Secondary metabolites biosynthesis, transport, and catabolism] Probab=20.34 E-value=25 Score=16.25 Aligned_cols=29 Identities=41% Similarity=0.843 Sum_probs=19.9 Q ss_pred CCCCHHHHHHHHCCCCCCCCCCCCCHHHEEECCCCC Q ss_conf 024011221111025535735788811141128887 Q gi|254781073|r 268 RRGRYAEFNLLYDKGTNFGFKTGGNVESILASMPPL 303 (307) Q Consensus 268 rRgRYvEFNL~yDRGT~FGL~t~gr~esIl~SlPp~ 303 (307) +-|-|+-+. +|-+.++|-.+.||.||||- T Consensus 46 ~~Geyanys-------lFd~dd~~eLh~~L~~~P~f 74 (98) T COG4829 46 RPGEYANYS-------LFDADDNGELHQLLASMPPF 74 (98) T ss_pred CCCCCCCEE-------EECCCCHHHHHHHHHCCCCC T ss_conf 143102311-------23078617899999629986 Done!