Query gi|254780660|ref|YP_003065073.1| hypothetical protein CLIBASIA_02735 [Candidatus Liberibacter asiaticus str. psy62] Match_columns 244 No_of_seqs 153 out of 2227 Neff 6.6 Searched_HMMs 39220 Date Sun May 29 20:54:43 2011 Command /home/congqian_1/programs/hhpred/hhsearch -i 254780660.hhm -d /home/congqian_1/database/cdd/Cdd.hhm No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 PRK00110 hypothetical protein; 100.0 0 0 665.9 23.1 238 1-239 1-239 (239) 2 COG0217 Uncharacterized conser 100.0 0 0 655.6 21.5 239 1-240 1-240 (241) 3 pfam01709 DUF28 Domain of unkn 100.0 0 0 621.4 22.1 233 5-238 1-234 (234) 4 PRK12378 hypothetical protein; 100.0 0 0 610.7 22.5 231 5-238 3-235 (235) 5 KOG2972 consensus 100.0 0 0 458.3 17.9 236 1-239 29-275 (276) 6 TIGR01033 TIGR01033 conserved 100.0 0 0 398.1 17.6 238 1-238 1-246 (246) 7 PRK07562 ribonucleotide-diphos 94.5 0.22 5.7E-06 31.2 7.3 64 11-74 300-369 (1177) 8 PRK09250 fructose-bisphosphate 68.9 5.2 0.00013 21.5 3.0 62 100-163 40-108 (348) 9 pfam07592 Transposase_36 Rhodo 58.6 3.4 8.7E-05 22.8 0.5 44 24-67 49-97 (311) 10 PRK03881 hypothetical protein; 55.0 19 0.00047 17.5 8.0 189 37-231 189-457 (467) 11 pfam09186 DUF1949 Domain of un 54.0 19 0.00049 17.4 3.8 51 172-226 1-51 (55) 12 TIGR02645 ARCH_P_rylase putati 51.2 15 0.00038 18.2 2.8 84 45-129 122-241 (499) 13 pfam05423 Mycobact_memb Mycoba 49.7 22 0.00055 17.1 3.4 28 77-104 44-73 (140) 14 cd01061 RNase_T2_euk Ribonucle 49.1 4 0.0001 22.3 -0.4 27 97-123 151-177 (195) 15 PRK00881 purH bifunctional pho 48.8 19 0.00049 17.4 3.0 10 132-141 265-274 (514) 16 cd04876 ACT_RelA-SpoT ACT dom 44.3 27 0.00068 16.4 5.9 58 174-234 7-70 (71) 17 cd01421 IMPCH Inosine monophos 44.3 24 0.00062 16.7 3.0 42 148-190 113-158 (187) 18 pfam11588 DUF3243 Protein of u 44.1 26 0.00067 16.5 3.1 29 47-75 12-40 (81) 19 TIGR02717 AcCoA-syn-alpha acet 43.5 27 0.0007 16.3 5.0 166 63-234 167-380 (457) 20 PRK09466 metL bifunctional asp 42.2 29 0.00073 16.2 3.5 16 107-122 171-186 (810) 21 
pfam06144 DNA_pol3_delta DNA p 42.1 26 0.00066 16.5 2.9 14 167-180 90-103 (172) 22 pfam11692 DUF3289 Protein of u 41.4 22 0.00056 17.0 2.4 54 28-81 106-163 (277) 23 KOG2121 consensus 39.6 18 0.00046 17.6 1.7 63 52-117 222-288 (746) 24 cd03286 ABC_MSH6_euk MutS6 hom 37.3 18 0.00046 17.6 1.4 18 101-118 86-106 (218) 25 TIGR01660 narH nitrate reducta 36.1 33 0.00084 15.8 2.6 22 51-72 306-327 (495) 26 cd03287 ABC_MSH3_euk MutS3 hom 35.3 20 0.00052 17.3 1.4 16 104-119 90-108 (222) 27 pfam00488 MutS_V MutS domain V 35.1 34 0.00087 15.6 2.6 16 211-226 218-233 (234) 28 COG2605 Predicted kinase relat 33.8 38 0.00098 15.3 3.8 46 147-192 267-316 (333) 29 pfam01910 DUF77 Domain of unkn 33.8 23 0.00059 16.8 1.5 55 181-241 21-76 (92) 30 cd03285 ABC_MSH2_euk MutS2 hom 33.8 37 0.00094 15.4 2.6 19 100-118 85-106 (222) 31 PRK00441 argR arginine repress 32.0 41 0.001 15.1 8.3 123 98-224 10-147 (149) 32 TIGR01764 excise DNA binding d 31.0 40 0.001 15.2 2.3 23 52-74 4-26 (49) 33 cd00439 Transaldolase Transald 30.7 43 0.0011 14.9 3.7 79 39-120 20-114 (252) 34 KOG1290 consensus 30.2 44 0.0011 14.9 2.6 54 86-145 151-216 (590) 35 TIGR00063 folE GTP cyclohydrol 29.9 24 0.00061 16.8 1.1 10 197-206 85-95 (183) 36 pfam03927 NapD NapD protein. 
U 29.7 44 0.0011 14.8 3.0 65 169-238 6-76 (78) 37 pfam05902 4_1_CTD 4.1 protein 29.4 43 0.0011 14.9 2.3 10 147-156 82-91 (114) 38 KOG0893 consensus 28.8 46 0.0012 14.7 5.1 73 27-100 43-125 (125) 39 cd04877 ACT_TyrR N-terminal AC 28.8 46 0.0012 14.7 3.7 55 180-235 15-69 (74) 40 cd04887 ACT_MalLac-Enz ACT_Mal 28.4 47 0.0012 14.7 5.4 58 171-231 5-68 (74) 41 cd07373 2A5CPDO_A The alpha su 26.5 50 0.0013 14.5 3.6 93 42-137 85-185 (271) 42 KOG1572 consensus 26.4 15 0.00039 18.1 -0.4 62 59-122 129-190 (249) 43 COG3492 Uncharacterized protei 26.3 43 0.0011 14.9 1.9 20 214-233 16-35 (104) 44 PHA00099 minor capsid protein 26.1 23 0.00059 16.8 0.5 93 86-182 5-108 (147) 45 cd06139 DNA_polA_I_Ecoli_like_ 25.5 31 0.0008 15.9 1.0 25 216-240 162-186 (193) 46 TIGR02394 rpoS_proteo RNA poly 23.6 57 0.0014 14.1 3.3 30 31-62 33-72 (292) 47 pfam06798 PrkA PrkA serine pro 23.5 57 0.0014 14.1 4.2 50 175-225 189-238 (254) 48 TIGR02684 dnstrm_HI1420 probab 23.0 41 0.0011 15.0 1.3 16 46-61 71-86 (91) 49 PRK10507 bifunctional glutathi 22.6 45 0.0012 14.8 1.4 76 57-135 289-368 (619) 50 COG4838 Uncharacterized protei 22.4 60 0.0015 13.9 3.1 31 27-57 52-82 (92) 51 TIGR01969 minD_arch cell divis 22.0 46 0.0012 14.7 1.3 49 169-218 134-187 (258) 52 PHA00448 hypothetical protein 21.9 61 0.0016 13.9 4.0 47 9-63 21-67 (70) 53 COG1438 ArgR Arginine represso 21.8 61 0.0016 13.8 7.3 95 97-192 11-117 (150) 54 PRK00139 murE UDP-N-acetylmura 21.5 62 0.0016 13.8 7.8 61 49-115 298-358 (481) 55 PRK03803 murD UDP-N-acetylmura 21.3 63 0.0016 13.8 7.3 58 51-114 281-339 (448) 56 TIGR02768 TraA_Ti Ti-type conj 21.3 48 0.0012 14.6 1.3 71 49-120 370-458 (888) 57 COG2766 PrkA Putative Ser prot 21.2 63 0.0016 13.8 4.7 14 42-55 53-66 (649) 58 COG1033 Predicted exporters of 21.0 64 0.0016 13.7 4.4 17 106-122 418-434 (727) 59 COG3734 DgoK 2-keto-3-deoxy-ga 20.9 37 0.00095 15.4 0.7 15 23-37 170-184 (306) 60 TIGR02875 spore_0_A sporulatio 20.8 51 0.0013 14.4 1.4 28 56-83 205-234 (270) 61 PRK03341 arginine repressor; P 20.2 
66 0.0017 13.6 8.3 127 97-224 20-164 (168) 62 cd04883 ACT_AcuB C-terminal AC 20.2 66 0.0017 13.6 3.0 27 98-124 4-30 (72) 63 cd00374 RNase_T2 Ribonuclease 20.1 16 0.0004 18.1 -1.4 31 93-123 147-177 (195) 64 PRK08775 homoserine O-acetyltr 20.1 66 0.0017 13.6 2.3 25 82-106 46-72 (343) No 1 >PRK00110 hypothetical protein; Validated Probab=100.00 E-value=0 Score=665.86 Aligned_cols=238 Identities=50% Similarity=0.746 Sum_probs=232.3 Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCCC-CCC Q ss_conf 9774112655646775578889799999999999998189994148999999999996688878999999850377-666 Q gi|254780660|r 1 MAGHSQFKNIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGSD-DLG 79 (244) Q Consensus 1 maGHsKW~nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~-~~~ 79 (244) ||||||||||||+||++|++||++|+||+|+|++|||+|||||++|||||+||++||++||||++|||||+||.|. +.. T Consensus 1 MaGHsKWs~Ikh~K~~~D~~ksk~f~k~~reI~vA~k~GG~DP~~N~~L~~ai~~Ak~~nmPk~~IerAIkk~~g~~~~~ 80 (239) T PRK00110 1 MAGHSKWANIKHRKGAQDAKRGKIFTKLIREITVAAKLGGGDPDGNPRLRLAIDKAKAANMPKDNIERAIKKGTGELDGA 80 (239) T ss_pred CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCC T ss_conf 99751656530477776799899999999999999981799966099999999999983899889999998435787766 Q ss_pred CCCEEEEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCCCCHHHHHHHHCCC Q ss_conf 64023678650599399999944630235899898764256831478860456864472797067753013667751268 Q gi|254780660|r 80 NYTNIRYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIGDSNLAMEVAIESD 159 (244) Q Consensus 80 ~~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~d~d~~~e~aie~g 159 (244) +|++++|||||||||+|||||||||+|||+++||++|+|+||+||++|||+|||+|+|+|.++++..++|.+++++||+| T Consensus 81 ~~~ei~YEg~gP~GvaiiVe~lTDN~nRt~~~vR~~f~K~gG~lg~~GSV~~~F~~kG~i~~~~~~~~ed~l~e~aIe~G 160 (239) T PRK00110 81 NYEEIRYEGYGPGGVAVIVEALTDNRNRTAAEVRHAFSKNGGNLGETGSVSYMFDRKGVIVYAPGAVDEDALMEAALEAG 160 (239) T ss_pred 
CCEEEEEEEECCCCEEEEEEECCCCHHHHHHHHHHHHHHCCCEECCCCCCCEEEEEEEEEEECCCCCCHHHHHHHHHHCC T ss_conf 41577899985897499999907877669999999999739802788741110003589998689999899999987579 Q ss_pred CCCCCCCCCCEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCC Q ss_conf 75223468825999643201345665420256742221786037610248989999999999875238881311025626 Q gi|254780660|r 160 AFEVLFEDQEYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDDVQSVYSNLEI 239 (244) Q Consensus 160 a~Dv~~~d~~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DDVq~VytN~~i 239 (244) |+||+++++.++|+|+|++|.+|+++|+++++++.+++|+|+|+++|+|+ +|+++++.+|++.|||+||||+||||++| T Consensus 161 AeDve~~d~~~~i~~~~~~~~~v~~~Le~~g~~~~~sei~~~P~~~v~l~-~e~~~~~~klie~Lee~DDVq~Vy~N~ei 239 (239) T PRK00110 161 AEDVESDDGSFEVYTAPEDFEAVRDALEAAGFEAESAEVTMIPQNTVELD-EETAEKLLKLIDALEDLDDVQNVYHNAEI 239 (239) T ss_pred CCEEECCCCEEEEEECHHHHHHHHHHHHHCCCCHHHEEEEEECCCCEEEC-HHHHHHHHHHHHHHHCCCCCCEEEECCCC T ss_conf 84000257439999689999999999997427801204798059735209-99999999999987434481524226649 No 2 >COG0217 Uncharacterized conserved protein [Function unknown] Probab=100.00 E-value=0 Score=655.60 Aligned_cols=239 Identities=49% Similarity=0.723 Sum_probs=233.9 Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCC-CCCC Q ss_conf 977411265564677557888979999999999999818999414899999999999668887899999985037-7666 Q gi|254780660|r 1 MAGHSQFKNIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGS-DDLG 79 (244) Q Consensus 1 maGHsKW~nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~-~~~~ 79 (244) |||||||+||||+|+++|++|||+|+||+|||++|||+|||||++|||||.||++||++||||++|||||+||.| .+.. 
T Consensus 1 MaGHsKw~nIkhrK~a~Dakr~Kif~Kl~keI~vAaK~Gg~dP~~NprLr~aI~kAk~~nmPkd~IerAI~ka~G~~d~~ 80 (241) T COG0217 1 MAGHSKWANIKHRKAAQDAKRSKIFTKLIKEITVAAKQGGPDPESNPRLRTAIEKAKAANMPKDNIERAIKKASGGKDGA 80 (241) T ss_pred CCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCCCCCC T ss_conf 99651578888787787898877999999999999980699976298999999999881998789999997465888756 Q ss_pred CCCEEEEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCCCCHHHHHHHHCCC Q ss_conf 64023678650599399999944630235899898764256831478860456864472797067753013667751268 Q gi|254780660|r 80 NYTNIRYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIGDSNLAMEVAIESD 159 (244) Q Consensus 80 ~~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~d~d~~~e~aie~g 159 (244) +|++++||||||+||+|||||||||+|||+++||++|+|+||+||++|||+|||+|+|+|.+.++..++|.+||.+||+| T Consensus 81 ~~~ei~YEGygP~GvaiiVe~LTDN~NRTas~vR~~F~K~GG~lg~~GSV~~mF~~kGvi~~~~~~~~ed~l~e~~ieag 160 (241) T COG0217 81 NYEEIRYEGYGPGGVAIIVEALTDNRNRTASNVRSAFNKNGGNLGEPGSVSYMFDRKGVIVVEKNEIDEDELLEAAIEAG 160 (241) T ss_pred CEEEEEEEEECCCCEEEEEEECCCCCCHHHHHHHHHHHHCCCCCCCCCEEEEEEECCEEEEECCCCCCHHHHHHHHHHCC T ss_conf 54789998687984399998626885101899999997458751898617898755389998899899899999999778 Q ss_pred CCCCCCCCCCEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCC Q ss_conf 75223468825999643201345665420256742221786037610248989999999999875238881311025626 Q gi|254780660|r 160 AFEVLFEDQEYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDDVQSVYSNLEI 239 (244) Q Consensus 160 a~Dv~~~d~~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DDVq~VytN~~i 239 (244) |+|++.+++.++|+|+|++|+.|+.+|+++++++..+++.|+|+++|.+++ |+++++++||++|||+||||+||||+++ T Consensus 161 aeDv~~~~~~~~V~t~p~~~~~V~~~L~~~g~~~~~ael~~iP~~~v~~~~-e~a~k~~kLid~LEd~DDVQ~Vy~N~~~ 239 (241) T COG0217 161 AEDVEEDEGSIEVYTEPEDFNKVKEALEAAGYEIESAELTMIPQNTVELDD-EDAEKLEKLIDALEDDDDVQNVYHNAEI 239 (241) T ss_pred 
CHHHHCCCCEEEEEECHHHHHHHHHHHHHCCCCEEEEEEEEECCCCEECCH-HHHHHHHHHHHHHHCCCCHHHHHHCCCC T ss_conf 335433797299997857799999999976982001158995387651278-7899999999987441016787733755 Q ss_pred C Q ss_conf 6 Q gi|254780660|r 240 A 240 (244) Q Consensus 240 ~ 240 (244) + T Consensus 240 ~ 240 (241) T COG0217 240 S 240 (241) T ss_pred C T ss_conf 7 No 3 >pfam01709 DUF28 Domain of unknown function DUF28. This domain is found in bacterial and yeast proteins it compromises the entire length or central region of most of the proteins in the family, all of which are hypothetical with no known function. The average length of this domain is approximately 230 amino acids long. Probab=100.00 E-value=0 Score=621.43 Aligned_cols=233 Identities=47% Similarity=0.694 Sum_probs=227.5 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCCC-CCCCCCE Q ss_conf 112655646775578889799999999999998189994148999999999996688878999999850377-6666402 Q gi|254780660|r 5 SQFKNIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGSD-DLGNYTN 83 (244) Q Consensus 5 sKW~nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~-~~~~~~~ 83 (244) ||||||||+||++|++|||+|+||+|+|++|||+|||||++|||||+||++||++||||++|||||+||.|. 
+..+|++ T Consensus 1 skw~~Ikh~K~~~D~~ksk~f~k~~reI~~A~k~GG~Dp~~N~~L~~ai~~Ak~~nmPk~~IerAIkk~~g~~~~~~~~e 80 (234) T pfam01709 1 SKWANIKHRKAAQDAKRGKIFTKLIKEITVAAKMGGPDPEGNPRLRLAIEKAKAANMPKDNIERAIKKGSGGLDGENYEE 80 (234) T ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCCCCEE T ss_conf 98312311767878998899999999999999817999650999999999999818998899999985248887655158 Q ss_pred EEEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCCCCHHHHHHHHCCCCCCC Q ss_conf 36786505993999999446302358998987642568314788604568644727970677530136677512687522 Q gi|254780660|r 84 IRYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIGDSNLAMEVAIESDAFEV 163 (244) Q Consensus 84 ~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~d~d~~~e~aie~ga~Dv 163 (244) ++|||||||||+|||||||||+|||+++||++|+|+||+||++|||+|||+|+|+|.++++..++|.+++++||+||+|| T Consensus 81 ~~YEg~gp~Gvaiive~lTDN~nRt~~~vR~~f~K~gG~lg~~GSV~~~F~~kG~i~i~~~~~~ee~l~e~aIe~GAeDv 160 (234) T pfam01709 81 IRYEGYGPGGVAVIVECLTDNRNRTAADVRHAFSKNGGNLGESGSVSYMFDRKGVIVFEKEGVDEDELLEAALEAGAEDV 160 (234) T ss_pred EEEEEECCCCEEEEEEECCCCHHHHHHHHHHHHHHCCCEECCCCCEEEEEEEEEEEEEECCCCCHHHHHHHHHHCCCCEE T ss_conf 88999837974999998178765579999999997298347987502442267899980799997999999986696213 Q ss_pred CCCCCCEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 346882599964320134566542025674222178603761024898999999999987523888131102562 Q gi|254780660|r 164 LFEDQEYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDDVQSVYSNLE 238 (244) Q Consensus 164 ~~~d~~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DDVq~VytN~~ 238 (244) +.+++.++|+|+|++|.+|+++|+++++.+.+++|+|+|+++|+|+ +|+++++.+|+|.|||+||||+||||++ T Consensus 161 ~~~e~~~~i~~~~~~~~~v~~~Le~~g~~~~~sel~~~P~~~v~l~-~e~~~~~~klie~Lee~dDVq~Vy~N~e 234 (234) T pfam01709 161 EDEDGSIEVITDPTDFEAVKKALEEAGLEIESAEITMIPQNTVELD-EEDAEKLEKLIDALEDLDDVQNVYHNAE 234 (234) T ss_pred 
ECCCCCEEEEECHHHHHHHHHHHHHCCCCHHHCEEEEECCCCCCCC-HHHHHHHHHHHHHHHCCCCCCEEEECCC T ss_conf 3158818999689999999999997528802304699438865119-9999999999998745568240633789 No 4 >PRK12378 hypothetical protein; Provisional Probab=100.00 E-value=0 Score=610.66 Aligned_cols=231 Identities=41% Similarity=0.634 Sum_probs=223.1 Q ss_pred CCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCEE Q ss_conf 11265564677557888979999999999999818999414899999999999668887899999985037766664023 Q gi|254780660|r 5 SQFKNIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGSDDLGNYTNI 84 (244) Q Consensus 5 sKW~nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~~~~~ 84 (244) -||+||||+|+++|++||++|+||+|+|++|||+|||||++|||||+||++||++||||++||||||||.|.+..+|+++ T Consensus 3 rKw~~Ikh~K~~~D~~k~k~f~k~~r~I~vA~k~GG~dp~~N~~L~~ai~~Ak~~nmPkd~IerAIkk~~g~~~~~~ee~ 82 (235) T PRK12378 3 RAWENIKASKAKKDGAKSKIYAKLGKEIYVAAKQGGPDPESNPRLRVVIERAKKANVPKDVIERAIKKAKGGGGEDYEEA 82 (235) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHHCCCHHHHHHHHHHCCCCCCCCEEEE T ss_conf 32878888877658998899999999999999817999650999999999999849998999999984448887762799 Q ss_pred EEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCCCCHHHHHHHHCCCC--CC Q ss_conf 6786505993999999446302358998987642568314788604568644727970677530136677512687--52 Q gi|254780660|r 85 RYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIGDSNLAMEVAIESDA--FE 162 (244) Q Consensus 85 ~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~d~d~~~e~aie~ga--~D 162 (244) +|||||||||+|||||||||+|||+++|||+|+|+||+||++|||+|||+|+|+|.++++ ++|.+++++||+|| +| T Consensus 83 ~YEg~gP~GvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~GSV~~~F~rkG~I~~~~~--~~d~~~e~aie~ga~~~d 160 (235) T PRK12378 83 RYEGFGPNGVMVIVECLTDNVNRTVANVRSAFNKNGGNLGTSGSVAFMFDHKGVFEFAGD--DEDELLEALIDADVDVED 160 (235) T ss_pred EEEEECCCCEEEEEEECCCCHHHHHHHHHHHHHHCCCCCCCCCCEEEEEEEEEEEEECCC--CHHHHHHHHHHCCCCCEE T ss_conf 
999987898299999968987678999999999839914799754132115689998179--876898998757998057 Q ss_pred CCCCCCCEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCCCCCCCCCCC Q ss_conf 2346882599964320134566542025674222178603761024898999999999987523888131102562 Q gi|254780660|r 163 VLFEDQEYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDDVQSVYSNLE 238 (244) Q Consensus 163 v~~~d~~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DDVq~VytN~~ 238 (244) ++.+++.++|+|+|++|.+|+++|+++++++.+++|+|+|+++|+|++ |+++++.+|+|.|||+||||+||||++ T Consensus 161 ~~~ed~~~~i~t~~~~l~~v~~~Le~~g~~i~~aei~~~P~~~v~l~~-e~~~~v~kLid~Lee~DDVQ~VysN~e 235 (235) T PRK12378 161 VEEEEGTITVYTDPTDFHKVKKALEAAGFEFLVAELEFIPQNPVELSG-EDLEQFEKLLDALEDDDDVQNVYHNVE 235 (235) T ss_pred EECCCEEEEEEECHHHHHHHHHHHHHCCCCCEEEEEEEECCCCEECCH-HHHHHHHHHHHHHHCCCCCCCEEECCC T ss_conf 304670699997899999999999971388311157992598541099-999999999998744458150502789 No 5 >KOG2972 consensus Probab=100.00 E-value=0 Score=458.29 Aligned_cols=236 Identities=33% Similarity=0.501 Sum_probs=217.6 Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCC Q ss_conf 97741126556467755788897999999999999981899941489999999999966888789999998503776666 Q gi|254780660|r 1 MAGHSQFKNIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGSDDLGN 80 (244) Q Consensus 1 maGHsKW~nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~ 80 (244) |+||||||||||+||++|++|+|++.||+++|..|||.||+||++|.||++.++.||+.+|||++||+||+|+.+++... 
T Consensus 29 ~sgH~kwskIk~~Kg~nD~~rsk~~nkl~~~i~~aVk~gg~np~lN~~LAtlle~ak~~~vpkd~ien~i~ras~k~~~a 108 (276) T KOG2972 29 MSGHNKWSKIKHKKGANDQARSKQINKLSQGIILAVKQGGANPELNMRLATLLESAKKISVPKDGIENAINRASGKEGSA 108 (276) T ss_pred ECCCCHHHHHCCCCCCCHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCC T ss_conf 22630666530223536788988988999999999971599903556899999998863997789999999730378874 Q ss_pred CCEEEEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCC-EEEECCCCCCCHHHHHHHHCCC Q ss_conf 402367865059939999994463023589989876425683147886045686447-2797067753013667751268 Q gi|254780660|r 81 YTNIRYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIG-EIIYHSNIGDSNLAMEVAIESD 159 (244) Q Consensus 81 ~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G-~i~~~~~~~d~d~~~e~aie~g 159 (244) ++.+.||+|||+||++|||++|||+||+++.||++|+|+||.+. ++|.|||++|| ++.++++..|.+.+...+||+| T Consensus 109 ~e~~~ye~~gp~GV~liVealTdnknr~~~~iRs~~nk~GG~s~--~~~r~~FdkKG~Vv~V~~~~~dk~vL~ie~ie~~ 186 (276) T KOG2972 109 VEFIEYEAMGPSGVGLIVEALTDNKNRAASSIRSIFNKHGGASA--SGVRFLFDKKGVVVNVPPEKRDKDVLNIEAIEAG 186 (276) T ss_pred EEEEEEEEECCCCEEEEEEEEECCHHHHHHHHHHHHHHCCCCCC--CCCEEEEECCCEEEECCHHHCCHHHHHHHHHHHC T ss_conf 37898766557742899986505276779999999987487555--6613677336628962831143246518988713 Q ss_pred CCCCCCC---------CC-CEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCC Q ss_conf 7522346---------88-2599964320134566542025674222178603761024898999999999987523888 Q gi|254780660|r 160 AFEVLFE---------DQ-EYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDD 229 (244) Q Consensus 160 a~Dv~~~---------d~-~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DD 229 (244) |+|+..+ +. 
.|.++|+|+++++|...|.+.|+.+.+++++|+|..+|++.+++ ++++.+|+++|.|+|| T Consensus 187 A~d~~~~~~~e~d~eeer~~fkiv~e~ssl~qV~~~Lr~~G~~i~d~~le~~P~~~vev~~~~-lEk~qkL~q~L~e~ed 265 (276) T KOG2972 187 AEDIVAEPVLEIDEEEEREEFKIVTEPSSLNQVAHKLRSKGFEIKDSGLEFIPLEEVEVDVPA-LEKIQKLIQALYENED 265 (276) T ss_pred CCCCCCCCCCCCCCCCCCCEEEEEECCCHHHHHHHHHHCCCCEEECCCCCCCCCCCCCCCCCC-HHHHHHHHHHHHHCHH T ss_conf 300036741224541354216888360019999998412880354154310469853667512-6999999999860223 Q ss_pred CCCCCCCCCC Q ss_conf 1311025626 Q gi|254780660|r 230 VQSVYSNLEI 239 (244) Q Consensus 230 Vq~VytN~~i 239 (244) |..||+|+.- T Consensus 266 V~~iy~ni~~ 275 (276) T KOG2972 266 VMFIYDNILN 275 (276) T ss_pred HHHHHHCCCC T ss_conf 7888641458 No 6 >TIGR01033 TIGR01033 conserved hypothetical protein TIGR01033; InterPro: IPR002876 This domain is found in bacteria, plants, and yeast proteins. It compromises the entire length or central region of most of the proteins in the family, all of which are hypothetical with no known function. The average length of this domain is approximately 230 amino acids long. The crystal structure of a conserved hypothetical protein, Aq1575, from Aquifex aeolicus has been determined. A structural homology search reveals that this protein has a new fold with no obvious similarity to those of other proteins of known three-dimensional structure. The protein reveals a monomer consisting of three domains arranged along a pseudo threefold symmetry axis. There is a large cleft with approximate dimensions of 10 A x 10 A x 20 A in the centre of the three domains along the symmetry axis. Two possible active sites are suggested based on the structure and multiple sequence alignment. There are several highly conserved residues in these putative active sites .. 
Probab=100.00 E-value=0 Score=398.07 Aligned_cols=238 Identities=45% Similarity=0.689 Sum_probs=230.1 Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCC--CCC Q ss_conf 977411265564677557888979999999999999818999414899999999999668887899999985037--766 Q gi|254780660|r 1 MAGHSQFKNIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGS--DDL 78 (244) Q Consensus 1 maGHsKW~nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~--~~~ 78 (244) |+||+||++++|+|++.|++|+++|+++.|+|.+|+|.||+||+.||+|+.++++|+..|||+++|+|+|+++.+ .+. T Consensus 1 ~~g~~~w~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~~~~n~~l~~~~~~~~~~~~p~~~~~~~~~~~~~~~~~~ 80 (246) T TIGR01033 1 MAGHSKWANIKHRKGAQDAKRGKIFTKLIKEIIVAAKLGGGDPESNPRLRTAIEKAKAANLPKDNIERAIKKGAGYGLDG 80 (246) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHCCCCHHHHHHHHHHCCCCCCCC T ss_conf 98621356776666655555335789999999887651467743214789999988751686345777654202444454 Q ss_pred CCCCEEEEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCC--CCC-HHHHHHH Q ss_conf 6640236786505993999999446302358998987642568314788604568644727970677--530-1366775 Q gi|254780660|r 79 GNYTNIRYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNI--GDS-NLAMEVA 155 (244) Q Consensus 79 ~~~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~--~d~-d~~~e~a 155 (244) .+|++++||||||+|++++|+|+|||+|||++++|++|+|+||+|+.+|||.|+|.++|++.+.... 
.++ +.+++.+ T Consensus 81 ~~~~~~~y~g~~p~g~~~~~~~~~dn~~~~~~~~~~~~~~~gg~~~~~g~~~~~f~~~g~~~~~~~~~~~~~~~~~~~~~ 160 (246) T TIGR01033 81 SNYEEITYEGYGPGGVAVLVECLTDNKNRTASELRSAFNKNGGSLGEPGSVSYLFSRKGVIELPKNEVELDELEDLLEAA 160 (246) T ss_pred CCHHHEEEECCCCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCEEHHCCCCCEEEECCCCHHHHHHHHHHHHH T ss_conf 30111011012576505677530267303578888877532664456541000001363267514300145688999999 Q ss_pred HCCCCCCCCCCCCC---EEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCCCCC Q ss_conf 12687522346882---599964320134566542025674222178603761024898999999999987523888131 Q gi|254780660|r 156 IESDAFEVLFEDQE---YIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDDVQS 232 (244) Q Consensus 156 ie~ga~Dv~~~d~~---~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DDVq~ 232 (244) +++|++|+...++. +.++|.|.+|..++..|+..++.+..+++.|+|.+++++++.++.+++.+|++.|+++||||. T Consensus 161 ~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~ 240 (246) T TIGR01033 161 LEAGAEDLDDDDDEEGGFEVYTAPEELEEVKEALESKGFPIEEAELTLLPLTTVDLDDPETAEKLLKLLDALEDDDDVQE 240 (246) T ss_pred HHCCCHHHCCCCCCCCCEEEEECHHHHHHHHHHHHHCCCCCHHHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHCCHHHHH T ss_conf 84150010013455442378644456889998887606710010000001110012432357899999987403203577 Q ss_pred CCCCCC Q ss_conf 102562 Q gi|254780660|r 233 VYSNLE 238 (244) Q Consensus 233 VytN~~ 238 (244) ||+|++ T Consensus 241 ~~~n~~ 246 (246) T TIGR01033 241 VYHNFE 246 (246) T ss_pred HHHCCC T ss_conf 751379 No 7 >PRK07562 ribonucleotide-diphosphate reductase subunit alpha; Validated Probab=94.53 E-value=0.22 Score=31.23 Aligned_cols=64 Identities=30% Similarity=0.304 Sum_probs=50.6 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHC------CCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHC Q ss_conf 64677557888979999999999999818------99941489999999999966888789999998503 Q gi|254780660|r 11 MHRKERKDALKSKIFSKLSREITVSAKLS------GQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAG 74 (244) Q Consensus 11 
kh~K~~~D~~k~k~f~k~~keI~~A~k~g------G~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~ 74 (244) ...|.+.=..-+++-.++.+.|+-|+..+ |-||..|+.|+.+|..|++.++|-.-|.|.|.-+. T Consensus 300 EE~Kv~aLv~g~~~~~~~l~~i~~a~~~~~~~~~~~fDp~~n~~l~~~i~~A~~~~vp~~~i~rv~~~~~ 369 (1177) T PRK07562 300 EEQKVASLVTGSKINSKHLKAIMKACVNCEGSGDDCFDPAKNPALKREIKAAKKALVPENYIKRVIQLAR 369 (1177) T ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCHHHHHHHHHHHH T ss_conf 1122576640007789999999998741443311235731166788999987750572777777777653 No 8 >PRK09250 fructose-bisphosphate aldolase; Provisional Probab=68.91 E-value=5.2 Score=21.46 Aligned_cols=62 Identities=21% Similarity=0.274 Sum_probs=39.1 Q ss_pred EECCCCHHHHHHHHHHHHHCCCCCCCCCC-----CHHHHHHCCEEEECCC--CCCCHHHHHHHHCCCCCCC Q ss_conf 94463023589989876425683147886-----0456864472797067--7530136677512687522 Q gi|254780660|r 100 ALTDNRNRTASSIRSIFTKANGSLGSTGS-----TTRFFEQIGEIIYHSN--IGDSNLAMEVAIESDAFEV 163 (244) Q Consensus 100 ~lTDN~nRt~~~vr~~f~K~gG~lg~~Gs-----v~~~F~~~G~i~~~~~--~~d~d~~~e~aie~ga~Dv 163 (244) ..+|...++..++-.+|.- |+||.+|- |..=|+|-.--.|.++ .-|.+-.+++|||+|..-+ T Consensus 40 ~~s~r~~~v~~nl~~~~~~--Grl~gtG~l~ILpvDQG~EH~~~~sFa~Np~~~DP~~~~~LAie~g~~a~ 108 (348) T PRK09250 40 IDSDRNPGVLRNLQRLLNH--GRLAGTGYLSILPVDQGFEHGAGASFAPNPLYFDPENIVQLAIEAGCNAV 108 (348) T ss_pred CCCCCCHHHHHHHHHHHHC--CCCCCCCEEEEEECCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCHH T ss_conf 0468987899999999845--85578750899975555656876567889676684889998872474034 No 9 >pfam07592 Transposase_36 Rhodopirellula transposase. These transposases are found in the planctomycete Rhodopirellula baltica, the cyanobacterium Nostoc, and the Gram-positive bacterium Streptomyces. 
Probab=58.59 E-value=3.4 Score=22.79 Aligned_cols=44 Identities=14% Similarity=0.307 Sum_probs=25.7 Q ss_pred HHHHHHHHHHHHHH--HCCCCHHHCHHHHHHHHHHH---HCCCCHHHHH Q ss_conf 99999999999998--18999414899999999999---6688878999 Q gi|254780660|r 24 IFSKLSREITVSAK--LSGQNPLENPRLRLAIQNAK---NQSMPKENIE 67 (244) Q Consensus 24 ~f~k~~keI~~A~k--~gG~dp~~N~~L~~ai~~Ak---~~~mPk~~Ie 67 (244) +-..++-..+..+| +|..-|+.|.+.+.+=+.+| +.+-|--.|+ T Consensus 49 lL~~~GysLq~~~Kt~eg~~hpDRdaQF~~In~~~~~~~~~g~PvISVD 97 (311) T pfam07592 49 LLNELGYSLQANVKTKEGKKHPDRDAQFEQINERVKEFDNNGQPVISVD 97 (311) T ss_pred HHHHCCCCHHHHHCCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCEEEEE T ss_conf 9987492113223236788899816789999999999876699578874 No 10 >PRK03881 hypothetical protein; Provisional Probab=54.96 E-value=19 Score=17.54 Aligned_cols=189 Identities=15% Similarity=0.088 Sum_probs=98.4 Q ss_pred HHCCC---C---HHHCHHHHHHHHHH---HHCCCCHHHHHHHHHHH-------CCC-CCCC--CCEEEEEEECCCCEEEE Q ss_conf 81899---9---41489999999999---96688878999999850-------377-6666--40236786505993999 Q gi|254780660|r 37 KLSGQ---N---PLENPRLRLAIQNA---KNQSMPKENIERAIKKA-------GSD-DLGN--YTNIRYEGYGPEGVAII 97 (244) Q Consensus 37 k~gG~---d---p~~N~~L~~ai~~A---k~~~mPk~~Ie~aIkk~-------~~~-~~~~--~~~~~yEg~gP~gvaii 97 (244) +.+|| . |+.-.++..++++. .-.+|+.+.||+|=.=| .|. +.-. .+-+.|| ||.||+.. 
T Consensus 189 ~~dgPyGy~P~G~~FD~~i~~~l~~gd~~~ll~id~~l~e~AgECG~rs~~~~lGaldg~~~~~~vlSYE--GPFGVGY~ 266 (467) T PRK03881 189 TEDGPYGYHPDGEEFDKALVDLLRKGDVEGIINIDEDLIEEAGECGLRSVLMMLGALDGYEVKSEVLSYE--GPFGVGYG 266 (467) T ss_pred CCCCCCCCCCCHHHHHHHHHHHHHCCCHHHHHCCCHHHHHHHHHCCHHHHHHHHHHCCCCCCCCCEEEEE--CCCCCCEE T ss_conf 8899988897517789999999865798887318977756656406788999987313663356343201--58653579 Q ss_pred EEEECCCC---HHH--------------------------HHHHHHHHHHCCCCCCCCC-CCHHHH-HHCCEEEECCCC- Q ss_conf 99944630---235--------------------------8998987642568314788-604568-644727970677- Q gi|254780660|r 98 IEALTDNR---NRT--------------------------ASSIRSIFTKANGSLGSTG-STTRFF-EQIGEIIYHSNI- 145 (244) Q Consensus 98 Ve~lTDN~---nRt--------------------------~~~vr~~f~K~gG~lg~~G-sv~~~F-~~~G~i~~~~~~- 145 (244) |=.++.-- ..+ ...|.++++. |-.+..+. .-.-|| .+.|+|+.-++. T Consensus 267 Va~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~ed~~v~lAr~~le~y~~~-g~~~~~p~~~p~e~~~~~~GvFVTL~k~g 345 (467) T PRK03881 267 VARLTVGSAEDTSLLEKLEREREQRIEKRRAEESPYVRLARESLEYYLKT-GKVLKVPPDLPEEMLKGRAGVFVSLKKDG 345 (467) T ss_pred EEEEECCCCCCCCHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHC-CCCCCCCCCCCHHHHCCCCEEEEEEEECC T ss_conf 99985588765106889999999998765404467999999999999821-98688987788788647661999985899 Q ss_pred --------------CCCHHHHHHHHCCCCCCCCCC----C---C---CEEEEECCCCHHHHHHHHHHC--CCCCC--CCE Q ss_conf --------------530136677512687522346----8---8---259996432013456654202--56742--221 Q gi|254780660|r 146 --------------GDSNLAMEVAIESDAFEVLFE----D---Q---EYIFYCDFNNVGLTSKKLEEK--IGEAQ--SIK 197 (244) Q Consensus 146 --------------~d~d~~~e~aie~ga~Dv~~~----d---~---~~~i~~~~~~~~~v~~~Le~~--~~~~~--~se 197 (244) .-.+++...|+.|+-.|--.. + + ++.|+++|+...... .|+-. |.-+. .-. 
T Consensus 346 ~LRGCIGt~~p~~~~l~~~i~~nA~~Aa~~DPRF~pv~~~El~~l~i~VsvL~~pe~~~~~~-~l~p~~~Gliv~~g~~~ 424 (467) T PRK03881 346 ELRGCIGTIAPTRENIAEEIIRNAISAGFNDPRFYPVEEDELDDLVYSVDVLMEPEPVKSLE-ELDPKKYGVIVRSGRRR 424 (467) T ss_pred EECCCCCCCCCCCCHHHHHHHHHHHHHHCCCCCCCCCCHHHHHCCEEEEEECCCCEECCCHH-HCCCCCCEEEEEECCCC T ss_conf 61657232877753599999999999852799989999568608789999678981079877-76977406999878966 Q ss_pred EEECCCCCEECCC-HHHHHHHHHHHHHHHCCCCCC Q ss_conf 7860376102489-899999999998752388813 Q gi|254780660|r 198 VIWKPLNYIRLSN-ADKVKSIIKMIENLEDDDDVQ 231 (244) Q Consensus 198 i~~~P~~~V~l~d-~e~~~~~~klie~Led~DDVq 231 (244) =.+.|+-. .+++ ++++ ...+.-.-|..+++|+ T Consensus 425 gllLP~ve-g~~t~ee~l-~~~~~KAGi~~d~~~~ 457 (467) T PRK03881 425 GLLLPDLE-GVDTVEEQL-SIALQKAGISPDEPYK 457 (467) T ss_pred EEECCCCC-CCCCHHHHH-HHHHHHCCCCCCCCEE T ss_conf 54788777-999999999-9999956979899779 No 11 >pfam09186 DUF1949 Domain of unknown function (DUF1949). Members of this family pertain to a set of functionally uncharacterized hypothetical bacterial proteins. They adopt a ferredoxin-like fold, with a beta-alpha-beta-beta-alpha-beta arrangement. 
Probab=54.00 E-value=19 Score=17.43 Aligned_cols=51 Identities=18% Similarity=0.233 Sum_probs=34.0 Q ss_pred EEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHC Q ss_conf 9964320134566542025674222178603761024898999999999987523 Q gi|254780660|r 172 FYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLED 226 (244) Q Consensus 172 i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led 226 (244) +.|++..+..++..|++.++.+...+..-.-.-.+.++ +++.+ .|.+.|-| T Consensus 1 l~~dY~~~~~v~~~l~~~~~~i~d~~y~~~V~l~v~v~-~~~~~---~~~~~L~~ 51 (55) T pfam09186 1 LTCDYAQLGKVERLLEQFGAVILDEEYTDKVTLTLAVP-EEEVE---AFKAKLTD 51 (55) T ss_pred CEECCCCHHHHHHHHHHCCCEEEEEEEEEEEEEEEEEC-HHHHH---HHHHHHHH T ss_conf 96443355999999998798898403001599999976-88899---99999998 No 12 >TIGR02645 ARCH_P_rylase putative thymidine phosphorylase; InterPro: IPR013466 Proteins in this entry are closely related to characterised examples of thymidine phosphorylase (2.4.2.4 from EC) and pyrimidine nucleoside phosphorylase (2.4.2.2 from EC). Most examples are found in the archaea, but other examples are found in bacteria such as Legionella pneumophila (strain Paris) and Rhodopseudomonas palustris CGA009.; GO: 0009032 thymidine phosphorylase activity. Probab=51.22 E-value=15 Score=18.24 Aligned_cols=84 Identities=18% Similarity=0.230 Sum_probs=57.1 Q ss_pred HCHHHHHHHHHHHHCCCCHHHHHHHHHH-HCCCCCCCCC-EEEEE----------------------------------E Q ss_conf 4899999999999668887899999985-0377666640-23678----------------------------------6 Q gi|254780660|r 45 ENPRLRLAIQNAKNQSMPKENIERAIKK-AGSDDLGNYT-NIRYE----------------------------------G 88 (244) Q Consensus 45 ~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk-~~~~~~~~~~-~~~yE----------------------------------g 88 (244) +|..+++-|...-..+|.-|=|+.--.+ +..++.-+++ ++... 
+ T Consensus 122 ~d~EIsAF~ta~~~~gm~~DE~~aLT~AMa~tG~~l~wd~~~i~DkHsIGGvPGNk~s~~VVPIVAAaGL~IPKTSSRAI 201 (499) T TIGR02645 122 SDVEISAFLTASAINGMTMDEIEALTIAMADTGEMLEWDREPIMDKHSIGGVPGNKTSLIVVPIVAAAGLLIPKTSSRAI 201 (499) T ss_pred CHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHCCCEEECCCCEEEEEEECCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCC T ss_conf 31679999999871688889999999988850976643895678761018877200214425678836788887334100 Q ss_pred ECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCC Q ss_conf 50599399999944630235899898764256831478860 Q gi|254780660|r 89 YGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGST 129 (244) Q Consensus 89 ~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv 129 (244) -+|.|.|=++|+|| +.+=++.|||-+.+|-||.|.=.|.+ T Consensus 202 TSaAGTAD~mEVLt-rV~ls~~E~K~iV~~~ggcLvWGGAl 241 (499) T TIGR02645 202 TSAAGTADVMEVLT-RVELSVEEIKRIVEKVGGCLVWGGAL 241 (499) T ss_pred CCCCCCCCEEEEEC-CCCCCHHHHHHHHHHCCCEEEECCCC T ss_conf 37786622346721-65224788989986408645412201 No 13 >pfam05423 Mycobact_memb Mycobacterium membrane protein. This family contains several membrane proteins from Mycobacterium species. Probab=49.67 E-value=22 Score=17.08 Aligned_cols=28 Identities=29% Similarity=0.624 Sum_probs=17.9 Q ss_pred CCCCC--CEEEEEEECCCCEEEEEEEECCC Q ss_conf 66664--02367865059939999994463 Q gi|254780660|r 77 DLGNY--TNIRYEGYGPEGVAIIIEALTDN 104 (244) Q Consensus 77 ~~~~~--~~~~yEg~gP~gvaiiVe~lTDN 104 (244) +..+| +.++||.|||.|..--|.-+-.| T Consensus 44 ~~~~fnPK~V~YEVfG~pGt~a~inYlD~~ 73 (140) T pfam05423 44 DIKPFNPKHVTYEVFGPPGTVADINYLDAD 73 (140) T ss_pred CCCCCCCCEEEEEEECCCCCEEEEEEECCC T ss_conf 567778868999997799974788898699 No 14 >cd01061 RNase_T2_euk Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. 
S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Generally, RNases have two distinct binding sites: the primary site (B1 site) and the subsite (B2 site), for nucleotides located at the 5'- and 3'- terminal ends of the scissile bond, respectively. This CD includes the eukaryotic RNase T2 family members. Probab=49.15 E-value=4 Score=22.30 Aligned_cols=27 Identities=22% Similarity=0.343 Sum_probs=14.5 Q ss_pred EEEEECCCCHHHHHHHHHHHHHCCCCC Q ss_conf 999944630235899898764256831 Q gi|254780660|r 97 IIEALTDNRNRTASSIRSIFTKANGSL 123 (244) Q Consensus 97 iVe~lTDN~nRt~~~vr~~f~K~gG~l 123 (244) .|.|..++......+||.+|.|.|+.+ T Consensus 151 ~l~C~~~~~~~~L~EI~iC~dk~~~~~ 177 (195) T cd01061 151 VIKCSKDPGKGELNEIWICFDKKGGEF 177 (195) T ss_pred EEEEEECCCCCEEEEEEEEEECCCCEE T ss_conf 899754799988989999998899967 No 15 >PRK00881 purH bifunctional phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase; Provisional Probab=48.79 E-value=19 Score=17.42 Aligned_cols=10 Identities=20% Similarity=0.281 Sum_probs=3.9 Q ss_pred HHHHCCEEEE Q ss_conf 6864472797 Q gi|254780660|r 132 FFEQIGEIIY 141 (244) Q Consensus 132 ~F~~~G~i~~ 141 (244) -F++-.+.++ T Consensus 265 ef~~pa~vIi 274 (514) T PRK00881 265 EFDEPACVIV 274 (514) T ss_pred HCCCCEEEEE T ss_conf 3667689999 No 16 >cd04876 ACT_RelA-SpoT ACT domain found C-terminal of the RelA/SpoT domains. ACT_RelA-SpoT: the ACT domain found C-terminal of the RelA/SpoT domains. Enzymes of the Rel/Spo family enable bacteria to survive prolonged periods of nutrient limitation by controlling guanosine-3'-diphosphate-5'-(tri)diphosphate ((p)ppGpp) production and subsequent rRNA repression (stringent response). Both the synthesis of (p)ppGpp from ATP and GDP(GTP), and its hydrolysis to GDP(GTP) and pyrophosphate, are catalyzed by Rel/Spo proteins. In Escherichia coli and its close relatives, the metabolism of (p)ppGpp is governed by two homologous proteins, RelA and SpoT. 
The RelA protein catalyzes (p)ppGpp synthesis in a reaction requiring its binding to ribosomes bearing codon-specified uncharged tRNA. The major role of the SpoT protein is the breakdown of (p)ppGpp by a manganese-dependent (p)ppGpp pyrophosphohydrolase activity. Although the stringent response appears to be tightly regulated by these two enzymes i Probab=44.30 E-value=27 Score=16.41 Aligned_cols=58 Identities=9% Similarity=0.025 Sum_probs=40.8 Q ss_pred ECCCCHHHHHHHHHHCCCCCCCCEEEECCCC------CEECCCHHHHHHHHHHHHHHHCCCCCCCCC Q ss_conf 6432013456654202567422217860376------102489899999999998752388813110 Q gi|254780660|r 174 CDFNNVGLTSKKLEEKIGEAQSIKVIWKPLN------YIRLSNADKVKSIIKMIENLEDDDDVQSVY 234 (244) Q Consensus 174 ~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~------~V~l~d~e~~~~~~klie~Led~DDVq~Vy 234 (244) =.|--|..+...+.+.+..+.+....-.+.. .+++.+. +.+..++.+|...++|..|+ T Consensus 7 Dr~GlL~dI~~~is~~~~nI~~v~~~~~~~~~~~~~~~v~V~d~---~~L~~li~~l~~i~~V~~V~ 70 (71) T cd04876 7 DRPGLLADITTVIAEEKINILSVNTRTDDDGLATIRLTLEVRDL---EHLARIMRKLRQIPGVIDVR 70 (71) T ss_pred CCCCHHHHHHHHHHHCCCCEEEEEEEECCCCEEEEEEEEEECCH---HHHHHHHHHHHCCCCCEEEE T ss_conf 37787999999999879967999999758986999999998899---99999999987799915988 No 17 >cd01421 IMPCH Inosine monophosphate cyclohydrolase domain. This is the N-terminal domain in the purine biosynthesis pathway protein ATIC (purH). The bifunctional ATIC protein contains a C-terminal ATIC formylase domain that formylates 5-aminoimidazole-4-carboxamide-ribonucleotide. The IMPCH domain then converts the formyl-5-aminoimidazole-4-carboxamide-ribonucleotide to inosine monophosphate. This is the final step in de novo purine production. 
Probab=44.29 E-value=24 Score=16.71 Aligned_cols=42 Identities=14% Similarity=0.067 Sum_probs=25.5 Q ss_pred CHHHHHHHHCCCCCCCCC----CCCCEEEEECCCCHHHHHHHHHHCC Q ss_conf 013667751268752234----6882599964320134566542025 Q gi|254780660|r 148 SNLAMEVAIESDAFEVLF----EDQEYIFYCDFNNVGLTSKKLEEKI 190 (244) Q Consensus 148 ~d~~~e~aie~ga~Dv~~----~d~~~~i~~~~~~~~~v~~~Le~~~ 190 (244) .+.++|. ||.|.--+.- +-..+.+.|+|+++..+.+.|+..+ T Consensus 113 ~~~~IEn-IDIGGpsliRAAAKN~~~V~v~~dp~dY~~~i~~l~~~g 158 (187) T cd01421 113 LEEAIEN-IDIGGPSLLRAAAKNYKDVTVLVDPADYQKVLEELKSNG 158 (187) T ss_pred HHHHHHH-CCCCCHHHHHHHHHCCCCEEEECCHHHHHHHHHHHHHCC T ss_conf 8999984-467748999999826881287679999999999999769 No 18 >pfam11588 DUF3243 Protein of unknown function (DUF3243). This family of proteins with unknown function appears to be restricted to Firmicutes. Probab=44.14 E-value=26 Score=16.45 Aligned_cols=29 Identities=24% Similarity=0.255 Sum_probs=24.3 Q ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHCC Q ss_conf 99999999999668887899999985037 Q gi|254780660|r 47 PRLRLAIQNAKNQSMPKENIERAIKKAGS 75 (244) Q Consensus 47 ~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~ 75 (244) ..|...++.|++.+||..+|.....+-++ T Consensus 12 ~~Lg~rv~~a~~~Gm~ee~i~~~A~~iGd 40 (81) T pfam11588 12 DFLGDRVNLAKKIGMSEETISKLAYRIGD 40 (81) T ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHHH T ss_conf 99999999999859988999999999999 No 19 >TIGR02717 AcCoA-syn-alpha acetyl coenzyme A synthetase (ADP forming), alpha domain; InterPro: IPR014089 This group of ADP-dependent acetyl-CoA synthetases (ACS) act in the direction of acetate and ATP production in the organisms in which they have been characterised , , . In most species this protein is bifunctional, existing as fused alpha-beta domains. In Pyrococcus and related species, however, the domains exist as separate polypeptides. This entry represents the alpha (N-terminal) domain. In Pyrococcus and related species there appears to have been the development of a paralogous family such that four other proteins are close relatives. 
One of these (along with its beta-domain partner) was characterised as ACS-II showing specificity for phenylacetyl-CoA . This entry excludes non-ACS-I paralogs. . Probab=43.50 E-value=27 Score=16.32 Aligned_cols=166 Identities=16% Similarity=0.174 Sum_probs=83.6 Q ss_pred HHHHHHHHHHHCC-------C---CCCCCCEEEEEEECCCC--EEEEEEEECCCCHHHHHHHHHH-------HHH----- Q ss_conf 7899999985037-------7---66664023678650599--3999999446302358998987-------642----- Q gi|254780660|r 63 KENIERAIKKAGS-------D---DLGNYTNIRYEGYGPEG--VAIIIEALTDNRNRTASSIRSI-------FTK----- 118 (244) Q Consensus 63 k~~Ie~aIkk~~~-------~---~~~~~~~~~yEg~gP~g--vaiiVe~lTDN~nRt~~~vr~~-------f~K----- 118 (244) -..++.|.++.-| + |-...+=+.|=+-=|.- +.+-+|-+-|= +|.+...|.+ +=| T Consensus 167 ~a~Ldwa~~~~vGfS~~VS~GNkAD~~e~Dlley~~~D~~T~~I~~Y~Eg~~DG-~~Fl~~A~~~s~~KPiv~lKsG~s~ 245 (457) T TIGR02717 167 TALLDWAEKNGVGFSYFVSLGNKADIDESDLLEYLADDPDTKVILLYLEGIKDG-RKFLKTAKEISKKKPIVVLKSGTSE 245 (457) T ss_pred HHHHHHHHHCCCCEEEEEECCCCEECCHHHHHHHHHCCCCCCEEEEECCCCCCH-HHHHHHHHHHHHCCCEEEEECCCCH T ss_conf 999999987278134778267411116577888985398940899971787041-6899998886305988999368883 Q ss_pred --------CCCCCCCCCCCHH--HHHHCCEEEECCCCCCCHHHHHHHHCCCCCCCCCCCCCEEEEECCCCHHHH-HHHHH Q ss_conf --------5683147886045--686447279706775301366775126875223468825999643201345-66542 Q gi|254780660|r 119 --------ANGSLGSTGSTTR--FFEQIGEIIYHSNIGDSNLAMEVAIESDAFEVLFEDQEYIFYCDFNNVGLT-SKKLE 187 (244) Q Consensus 119 --------~gG~lg~~Gsv~~--~F~~~G~i~~~~~~~d~d~~~e~aie~ga~Dv~~~d~~~~i~~~~~~~~~v-~~~Le 187 (244) |-|+|+.+- ..| .|.+.|+|....- ++++++|-=..-...-..++.+.|+|.---.+-+ .++++ T Consensus 246 ~GakAA~SHTGaLAGs~-~~y~aaf~q~G~iRa~~~----~ELfd~A~~L~~~~~~~~g~~~~IiTN~GG~Gvia~D~~~ 320 (457) T TIGR02717 246 AGAKAASSHTGALAGSD-EAYDAAFKQAGVIRADSI----EELFDLARLLSNQPLPPKGNRVAIITNAGGPGVIATDACE 320 (457) T ss_pred HHHHHHHHCCCHHHHHH-HHHHHHHCCCCEEEEECH----HHHHHHHHHHHCCCCCCCCCEEEEEECCCCHHHHHHHHHH T ss_conf 45676521023133668-999987430143887017----7889999998358989988769999789616778765677 Q ss_pred 
HCCCCCC------------CCEEEECCCCCEECCC-HHHHHHHHHHHHHHHCCCCCCCCC Q ss_conf 0256742------------2217860376102489-899999999998752388813110 Q gi|254780660|r 188 EKIGEAQ------------SIKVIWKPLNYIRLSN-ADKVKSIIKMIENLEDDDDVQSVY 234 (244) Q Consensus 188 ~~~~~~~------------~sei~~~P~~~V~l~d-~e~~~~~~klie~Led~DDVq~Vy 234 (244) +.|..+. .-=-.|.+.|||++.- .=..+++.+-|+.+-+||-|.-|. T Consensus 321 ~~Gl~L~~~~~~t~~~L~~~LP~~as~~NPVD~~GsDA~~~~Y~~~l~~v~eD~nVd~~~ 380 (457) T TIGR02717 321 EVGLELAELSEKTKEKLRNILPPEASIKNPVDVLGSDATAERYAKALKIVAEDENVDGVV 380 (457) T ss_pred HCCCEEECCCHHHHHHHHHHCCCCCCCCCCCEEEECCCCHHHHHHHHHHHHCCCCCCEEE T ss_conf 749745558589999999747611477875125522789899999999983488888899 No 20 >PRK09466 metL bifunctional aspartate kinase II/homoserine dehydrogenase II; Provisional Probab=42.16 E-value=29 Score=16.18 Aligned_cols=16 Identities=6% Similarity=0.247 Sum_probs=7.8 Q ss_pred HHHHHHHHHHHHCCCC Q ss_conf 3589989876425683 Q gi|254780660|r 107 RTASSIRSIFTKANGS 122 (244) Q Consensus 107 Rt~~~vr~~f~K~gG~ 122 (244) ++...++..|..+.|. T Consensus 171 ~s~~~l~~~l~~~~~~ 186 (810) T PRK09466 171 LSYPLLQQLLAQHPGK 186 (810) T ss_pred HHHHHHHHHHHHCCCC T ss_conf 6599999999736897 No 21 >pfam06144 DNA_pol3_delta DNA polymerase III, delta subunit. DNA polymerase III, delta subunit (EC 2.7.7.7) is required for, along with delta' subunit, the assembly of the processivity factor beta(2) onto primed DNA in the DNA polymerase III holoenzyme-catalysed reaction. The delta subunit is also known as HolA. Probab=42.07 E-value=26 Score=16.49 Aligned_cols=14 Identities=0% Similarity=0.064 Sum_probs=6.5 Q ss_pred CCCEEEEECCCCHH Q ss_conf 88259996432013 Q gi|254780660|r 167 DQEYIFYCDFNNVG 180 (244) Q Consensus 167 d~~~~i~~~~~~~~ 180 (244) ++.+.|++.+.... T Consensus 90 ~~~~lv~~~~~~~~ 103 (172) T pfam06144 90 EDTLLIIEAPGKLD 103 (172) T ss_pred CCEEEEEEECCCHH T ss_conf 87289998367413 No 22 >pfam11692 DUF3289 Protein of unknown function (DUF3289). 
This family of proteins with unknown function appears to be restricted to Proteobacteria. Probab=41.38 E-value=22 Score=17.00 Aligned_cols=54 Identities=15% Similarity=0.079 Sum_probs=38.1 Q ss_pred HHHHHHHHHH--HCCC--CHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCCC Q ss_conf 9999999998--1899--9414899999999999668887899999985037766664 Q gi|254780660|r 28 LSREITVSAK--LSGQ--NPLENPRLRLAIQNAKNQSMPKENIERAIKKAGSDDLGNY 81 (244) Q Consensus 28 ~~keI~~A~k--~gG~--dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~~ 81 (244) |.++..---+ .|+| ||..|..|+-.|-.+|..|-+...|...|.++-+-+...| T Consensus 106 Li~~mI~hfq~~~G~~f~~~lLn~A~ke~Il~d~s~ns~~~~ik~~i~~~id~~~~~~ 163 (277) T pfam11692 106 LIGRLIDHMQYGNGAPFRDLLLNAALKEVILGDKTNNSSLLVIKAILDRGIDWDKKIF 163 (277) T ss_pred HHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHCCCCCCHHHHHHHHHHHCCCCCCCCCC T ss_conf 9999999986079986777788999999872155564599999999853778433457 No 23 >KOG2121 consensus Probab=39.62 E-value=18 Score=17.60 Aligned_cols=63 Identities=21% Similarity=0.257 Sum_probs=41.7 Q ss_pred HHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCEEEE-EEECC---CCEEEEEEEECCCCHHHHHHHHHHHH Q ss_conf 99999966888789999998503776666402367-86505---99399999944630235899898764 Q gi|254780660|r 52 AIQNAKNQSMPKENIERAIKKAGSDDLGNYTNIRY-EGYGP---EGVAIIIEALTDNRNRTASSIRSIFT 117 (244) Q Consensus 52 ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~~~~~~y-Eg~gP---~gvaiiVe~lTDN~nRt~~~vr~~f~ 117 (244) -+++|++.++|+.-+-..+++|....-.+-+-+++ |..|| +-+.+|++|-+++- +..+-.-++ T Consensus 222 ~~~kA~~lGvp~Gp~~~~L~~G~~vt~~~g~i~~~~ev~gp~~~~~~f~il~cp~e~~---l~~i~~~~~ 288 (746) T KOG2121 222 DVEKAKELGVPKGPLIGKLKSGESVTLDDGTIVVPSEVVGPSRPGASFLILDCPDESY---LNAILENIK 288 (746) T ss_pred CHHHHHHHCCCCCCCHHHHCCCCCEECCCCCEEEHHHHCCCCCCCCEEEEECCCCHHH---HHHHHHCCC T ss_conf 6898997089978614544179705516972884666048998765799965896788---999874242 No 24 >cd03286 ABC_MSH6_euk MutS6 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. 
Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding c Probab=37.33 E-value=18 Score=17.62 Aligned_cols=18 Identities=6% Similarity=0.124 Sum_probs=6.5 Q ss_pred ECCCCHHHHHH---HHHHHHH Q ss_conf 44630235899---8987642 Q gi|254780660|r 101 LTDNRNRTASS---IRSIFTK 118 (244) Q Consensus 101 lTDN~nRt~~~---vr~~f~K 118 (244) +..+......+ ++.+++. T Consensus 86 ~~~~~StF~~e~~~~~~il~~ 106 (218) T cd03286 86 IMKGESTFMVELSETANILRH 106 (218) T ss_pred HHHHHHHHHHHHHHHHHHHHH T ss_conf 431150699999999999986 No 25 >TIGR01660 narH nitrate reductase, beta subunit; InterPro: IPR006547 The nitrate reductase enzyme complex allows bacteria to use nitrate as an electron acceptor during anaerobic growth. The enzyme complex consists of a tetramer that has an alpha, beta and 2 gamma subunits. The alpha and beta subunits have catalytic activity and the gamma subunits attach the enzyme to the membrane and are b-type cytochromes that receive electrons from the quinone pool and transfers them to the beta subunit. The sequences in this family are the beta subunit for nitrate reductase I (narH) and nitrate reductase II (narY) for Gram-positive and Gram-negative bacteria. A few thermophiles and archaea also match the model. 
A number of the sequences in this set are experimentally characterised; these include: E. coli NarH (P11349 from SWISSPROT) and NarY (P19318 from SWISSPROT), P42176 from SWISSPROT from Bacillus subtilis, and related proteins from Pseudomonas fluorescens, Paracoccus denitrificans, and Halomonas halodenitrificans.; GO: 0008940 nitrate reductase activity, 0042126 nitrate metabolic process, 0009325 nitrate reductase complex. Probab=36.13 E-value=33 Score=15.78 Aligned_cols=22 Identities=36% Similarity=0.501 Sum_probs=19.0 Q ss_pred HHHHHHHHCCCCHHHHHHHHHH Q ss_conf 9999999668887899999985 Q gi|254780660|r 51 LAIQNAKNQSMPKENIERAIKK 72 (244) Q Consensus 51 ~ai~~Ak~~~mPk~~Ie~aIkk 72 (244) .+|..|++.++|.+.|+.|=+. T Consensus 306 ~Vi~~A~~~Gip~~~I~aAQ~S 327 (495) T TIGR01660 306 EVIKQAKKDGIPEEVIEAAQQS 327 (495) T ss_pred HHHHHHHHCCCCHHHHHHHHCC T ss_conf 9999998658966899875318 No 26 >cd03287 ABC_MSH3_euk MutS3 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. 
MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding c Probab=35.28 E-value=20 Score=17.26 Aligned_cols=16 Identities=6% Similarity=0.075 Sum_probs=6.8 Q ss_pred CCHHHHHH---HHHHHHHC Q ss_conf 30235899---89876425 Q gi|254780660|r 104 NRNRTASS---IRSIFTKA 119 (244) Q Consensus 104 N~nRt~~~---vr~~f~K~ 119 (244) +.+....+ ++.++++. T Consensus 90 ~~StF~~e~~~~~~il~~~ 108 (222) T cd03287 90 GMSTFMVELSETSHILSNC 108 (222) T ss_pred CCCHHHHHHHHHHHHHHHC T ss_conf 5227999999999999867 No 27 >pfam00488 MutS_V MutS domain V. This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam01624, pfam05188, pfam05192 and pfam05190. The mutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain V of Thermus aquaticus MutS, which contains a Walker A motif, and is structurally similar to the ATPase domain of ABC transporters. 
Probab=35.13 E-value=34 Score=15.64 Aligned_cols=16 Identities=25% Similarity=0.393 Sum_probs=8.9 Q ss_pred HHHHHHHHHHHHHHHC Q ss_conf 8999999999987523 Q gi|254780660|r 211 ADKVKSIIKMIENLED 226 (244) Q Consensus 211 ~e~~~~~~klie~Led 226 (244) ++-.++-..+++.||+ T Consensus 218 ~~ii~rA~~i~~~le~ 233 (234) T pfam00488 218 ESVVERAREVLAELED 233 (234) T ss_pred HHHHHHHHHHHHHHHC T ss_conf 9999999999999866 No 28 >COG2605 Predicted kinase related to galactokinase and mevalonate kinase [General function prediction only] Probab=33.82 E-value=38 Score=15.30 Aligned_cols=46 Identities=20% Similarity=0.327 Sum_probs=35.7 Q ss_pred CCHHHHHHHHCCCCCCCCC---C-CCCEEEEECCCCHHHHHHHHHHCCCC Q ss_conf 3013667751268752234---6-88259996432013456654202567 Q gi|254780660|r 147 DSNLAMEVAIESDAFEVLF---E-DQEYIFYCDFNNVGLTSKKLEEKIGE 192 (244) Q Consensus 147 d~d~~~e~aie~ga~Dv~~---~-d~~~~i~~~~~~~~~v~~~Le~~~~~ 192 (244) .-|.+.++|++.||.-=.. . .+...|+|+|+....+.++|+..... T Consensus 267 AIDRIYELALKNGAYGGKLSGAGGGGFLLFFCDPSKRNELARALEKEQGF 316 (333) T COG2605 267 AIDRIYELALKNGAYGGKLSGAGGGGFLLFFCDPSKRNELARALEKEQGF 316 (333) T ss_pred HHHHHHHHHHHCCCHHCEEECCCCCCEEEEEECCCCHHHHHHHHHHHCCC T ss_conf 78999999986673222230268862799996854248999999984497 No 29 >pfam01910 DUF77 Domain of unknown function DUF77. Domain of unknown function. The crystal structure of two of these members shows that this domain has a ferredoxin like fold and is likely to exist as at least homodimers. Sulphate ions are located at the dimer interfaces, which are thought to confer additional stability. Although the function of this domain remains to be identified, its structure suggests a role in protein-protein interactions possibly regulated by the binding of small-molecule ligands. 
Probab=33.82 E-value=23 Score=16.84 Aligned_cols=55 Identities=18% Similarity=0.119 Sum_probs=26.9 Q ss_pred HHHHHHHHCCCCCCCCEEEECCC-CCEECCCHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCC Q ss_conf 45665420256742221786037-61024898999999999987523888131102562662 Q gi|254780660|r 181 LTSKKLEEKIGEAQSIKVIWKPL-NYIRLSNADKVKSIIKMIENLEDDDDVQSVYSNLEIAD 241 (244) Q Consensus 181 ~v~~~Le~~~~~~~~sei~~~P~-~~V~l~d~e~~~~~~klie~Led~DDVq~VytN~~i~e 241 (244) .+.+.|++.|+.+ ...|. |.++-+=++-++-+.+..+.|. ..++.+||+++.|++ T Consensus 21 ~~i~vi~~sgl~y-----~l~pmgT~iEge~dev~~~v~~~~e~~~-~~G~~RV~t~iKID~ 76 (92) T pfam01910 21 AVIEVLKESGLKY-----ELGPMGTTIEGEWDEVMEVVKKAHEALF-EAGAPRVSTVIKIDD 76 (92) T ss_pred HHHHHHHHCCCCE-----EECCCEEEEECCHHHHHHHHHHHHHHHH-HCCCCEEEEEEEEEE T ss_conf 9999999749975-----8448700887789999999999999999-769987999999880 No 30 >cd03285 ABC_MSH2_euk MutS2 homolog in eukaryotes. The MutS protein initiates DNA mismatch repair by recognizing mispaired and unpaired bases embedded in duplex DNA and activating endo- and exonucleases to remove the mismatch. Members of the MutS family possess C-terminal domain with a conserved ATPase activity that belongs to the ATP binding cassette (ABC) superfamily. MutS homologs (MSH) have been identified in most prokaryotic and all eukaryotic organisms examined. Prokaryotes have two homologs (MutS1 and MutS2), whereas seven MSH proteins (MSH1 to MSH7) have been identified in eukaryotes. The homodimer MutS1 and heterodimers MSH2-MSH3 and MSH2-MSH6 are primarily involved in mitotic mismatch repair, whereas MSH4-MSH5 is involved in resolution of Holliday junctions during meiosis. All members of the MutS family contain the highly conserved Walker A/B ATPase domain, and many share a common mechanism of action. 
MutS1, MSH2-MSH3, MSH2-MSH6, and MSH4-MSH5 dimerize to form sliding c Probab=33.78 E-value=37 Score=15.42 Aligned_cols=19 Identities=5% Similarity=0.069 Sum_probs=7.8 Q ss_pred EECCCCHHHHHHHH---HHHHH Q ss_conf 94463023589989---87642 Q gi|254780660|r 100 ALTDNRNRTASSIR---SIFTK 118 (244) Q Consensus 100 ~lTDN~nRt~~~vr---~~f~K 118 (244) .+.++.+....+++ .+++. T Consensus 85 ~~~~~~StF~~e~~~~~~il~~ 106 (222) T cd03285 85 SQLKGVSTFMAEMLETAAILKS 106 (222) T ss_pred CCCCCHHHHHHHHHHHHHHHHH T ss_conf 1003352899999999999984 No 31 >PRK00441 argR arginine repressor; Provisional Probab=32.00 E-value=41 Score=15.09 Aligned_cols=123 Identities=11% Similarity=0.085 Sum_probs=69.9 Q ss_pred EEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCC---------CC---HHHHHHHHCCCCCCCCC Q ss_conf 9994463023589989876425683147886045686447279706775---------30---13667751268752234 Q gi|254780660|r 98 IEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIG---------DS---NLAMEVAIESDAFEVLF 165 (244) Q Consensus 98 Ve~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~---------d~---d~~~e~aie~ga~Dv~~ 165 (244) -+.++.|.-+|..++...+.+.|=. .+.-.+|-.....|.+-.+...+ .. ..-+.-.+.-...++.. T Consensus 10 ~~lI~~~~I~tQ~eL~~~L~~~Gi~-vTQATlSRDl~eL~~vKv~~~~G~~~Y~~~~~~~~~~~~~l~~~~~~~v~~v~~ 88 (149) T PRK00441 10 LEIINSKEIETQEELAEELKKMGFD-VTQATVSRDIKELKLIKVLGNNGKYKYATINKTENNLSDRLVNIFANTVISVEN 88 (149) T ss_pred HHHHHHCCCCCHHHHHHHHHHCCCC-EEHHHHHHHHHHCCCEEEECCCCCEEEEECCCCCCCHHHHHHHHHHHHEEEEEE T ss_conf 9999728967899999999986997-665898886998198896569997899964766641778999999987067750 Q ss_pred CCCCEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECC---CCCEECCCHHHHHHHHHHHHHH Q ss_conf 6882599964320134566542025674222178603---7610248989999999999875 Q gi|254780660|r 166 EDQEYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKP---LNYIRLSNADKVKSIIKMIENL 224 (244) Q Consensus 166 ~d~~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P---~~~V~l~d~e~~~~~~klie~L 224 (244) .+..+.+-|.|-.-..+...|....++-. 
+.-++ .-.|-..+++.++.+.+-+..| T Consensus 89 ~~~lvvIkT~pG~A~~va~~iD~~~~~~I---~GTIAGdDTilvi~~~~~~a~~l~~~i~~l 147 (149) T PRK00441 89 VDNMVVIKTISGSASAAAEAIDTLNFDGI---AGTIAGDNTIFILVRSLEKAQEIVEKLKKL 147 (149) T ss_pred CCCEEEEEECCCCHHHHHHHHHHCCCCCC---EEEEECCCEEEEEECCHHHHHHHHHHHHHH T ss_conf 37789999389958999999983799872---798605998999978889999999999998 No 32 >TIGR01764 excise DNA binding domain, excisionase family; InterPro: IPR010093 An excisionase, or Xis protein, is a small protein that binds and promotes excisive recombination; it is not enzymatically active. This entry represents a number of putative excisionases and related proteins from temperate phage, plasmids, and transposons, as well as DNA binding domains of other proteins, such as a DNA modification methylase. This entry identifies mostly small proteins and N-terminal regions of large proteins, but some proteins appear to have two copies. This domain appears similar, in both sequence and predicted secondary structure (PSIPRED) to the MerR family of transcriptional regulators (IPR000551 from INTERPRO).; GO: 0003677 DNA binding. Probab=30.97 E-value=40 Score=15.18 Aligned_cols=23 Identities=17% Similarity=0.229 Sum_probs=19.6 Q ss_pred HHHHHHHCCCCHHHHHHHHHHHC Q ss_conf 99999966888789999998503 Q gi|254780660|r 52 AIQNAKNQSMPKENIERAIKKAG 74 (244) Q Consensus 52 ai~~Ak~~~mPk~~Ie~aIkk~~ 74 (244) +=+.|+--|+|+++|.+.|..|. T Consensus 4 v~EaA~yLgv~~~t~~~l~~~g~ 26 (49) T TIGR01764 4 VEEAAEYLGVSKSTVYRLIEEGE 26 (49) T ss_pred HHHHHHHCCCCHHHHHHHHHCCC T ss_conf 78899771999057899997189 No 33 >cd00439 Transaldolase Transaldolase. Enzymes found in the non-oxidative branch of the pentose phosphate pathway, that catalyze the reversible transfer of a dihydroxyacetone group from fructose-6-phosphate to erythrose-4-phosphate yielding sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. 
They are members of the class I aldolases, who are characterized by using a Schiff-base mechanism for stabilization of the reaction intermediates. Probab=30.68 E-value=43 Score=14.95 Aligned_cols=79 Identities=20% Similarity=0.265 Sum_probs=52.0 Q ss_pred CCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCEE--------------EEEEE-CCCCEEEEEEE-EC Q ss_conf 8999414899999999999668887899999985037766664023--------------67865-05993999999-44 Q gi|254780660|r 39 SGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGSDDLGNYTNI--------------RYEGY-GPEGVAIIIEA-LT 102 (244) Q Consensus 39 gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~~~~~--------------~yEg~-gP~gvaiiVe~-lT 102 (244) |=.+..+||.| +.+|.+.+--.+-+.+++.+++....+-|+++ .||.. |||-|.+=|.. |. T Consensus 20 gv~GvTsNPsI---f~kAi~~~~~y~~~~~~l~~~~~~~~~~~~~L~~~di~~A~d~l~pv~~~~~gdG~VS~Ev~p~la 96 (252) T cd00439 20 GVRGVTTNPSI---IQAAISTSNAYNDQFRTLVESGKDIESAYWELVVKDIQDACKLFEPIYDQTEADGRVSVEVSARLA 96 (252) T ss_pred CCCCCCCCHHH---HHHHHCCCCCHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCEEEEEECCCCC T ss_conf 98652688899---999860940048999999972898789999999998999998605898703899728999780030 Q ss_pred CCCHHHHHHHHHHHHHCC Q ss_conf 630235899898764256 Q gi|254780660|r 103 DNRNRTASSIRSIFTKAN 120 (244) Q Consensus 103 DN~nRt~~~vr~~f~K~g 120 (244) ++...|+.+-|.+++..+ T Consensus 97 ~d~~~~i~~a~~l~~~~~ 114 (252) T cd00439 97 DDTQGMVEAAKYLSKVVN 114 (252) T ss_pred CCHHHHHHHHHHHHHHHC T ss_conf 497999999999999827 No 34 >KOG1290 consensus Probab=30.18 E-value=44 Score=14.89 Aligned_cols=54 Identities=24% Similarity=0.375 Sum_probs=37.0 Q ss_pred EEEECCCC--EEEEEEEECCCCHHH----------HHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCC Q ss_conf 78650599--399999944630235----------8998987642568314788604568644727970677 Q gi|254780660|r 86 YEGYGPEG--VAIIIEALTDNRNRT----------ASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNI 145 (244) Q Consensus 86 yEg~gP~g--vaiiVe~lTDN~nRt----------~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~ 145 (244) |..-||+| |.|+.|+|-||--+. +..||.+..- -| -...||-.++|+|...-+. 
T Consensus 151 FkhsGpNG~HVCMVfEvLGdnLLklI~~s~YrGlpl~~VK~I~~q---vL---~GLdYLH~ecgIIHTDlKP 216 (590) T KOG1290 151 FKHSGPNGQHVCMVFEVLGDNLLKLIKYSNYRGLPLSCVKEICRQ---VL---TGLDYLHRECGIIHTDLKP 216 (590) T ss_pred CEECCCCCCEEEEEEHHHHHHHHHHHHHHCCCCCCHHHHHHHHHH---HH---HHHHHHHHHCCCCCCCCCC T ss_conf 131378874799881653067999999817788768999999999---99---8777888751710047873 No 35 >TIGR00063 folE GTP cyclohydrolase I; InterPro: IPR001474 GTP cyclohydrolase I (3.5.4.16 from EC) catalyzes the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. The comparison of the sequence of the enzyme from bacterial and eukaryotic sources shows that the structure of this enzyme has been extremely well conserved throughout evolution .; GO: 0003934 GTP cyclohydrolase I activity, 0019438 aromatic compound biosynthetic process, 0005737 cytoplasm. Probab=29.89 E-value=24 Score=16.77 Aligned_cols=10 Identities=20% Similarity=0.464 Sum_probs=6.1 Q ss_pred EEEECCCC-CE Q ss_conf 17860376-10 Q gi|254780660|r 197 KVIWKPLN-YI 206 (244) Q Consensus 197 ei~~~P~~-~V 206 (244) -+.|+|+. .| T Consensus 85 ~vaYIPk~GkV 95 (183) T TIGR00063 85 HVAYIPKDGKV 95 (183) T ss_pred EEEEECCCCEE T ss_conf 89988189648 No 36 >pfam03927 NapD NapD protein. Uncharacterized protein involved in formation of periplasmic nitrate reductase. Probab=29.69 E-value=44 Score=14.83 Aligned_cols=65 Identities=15% Similarity=0.172 Sum_probs=44.6 Q ss_pred CEEEEECCCCHHHHHHHHHHCCCCCCCCEEEEC-C--CCCEECCCHHHHHHHHHHHHHHHCCCCCCC---CCCCCC Q ss_conf 259996432013456654202567422217860-3--761024898999999999987523888131---102562 Q gi|254780660|r 169 EYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWK-P--LNYIRLSNADKVKSIIKMIENLEDDDDVQS---VYSNLE 238 (244) Q Consensus 169 ~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~-P--~~~V~l~d~e~~~~~~klie~Led~DDVq~---VytN~~ 238 (244) .+.+.|.|+.+..|+.+|.... 
.+|+-.. | +--|.+..+. ...+...++.+++++.|.+ |||.++ T Consensus 6 SlVV~~~Pe~~~~V~~~l~~~p----g~Eih~~~~~GKiVVtiE~~~-~~~~~~~i~~i~~l~GVlsa~lVYh~~e 76 (78) T pfam03927 6 SLVVHVRPERLAEVKAAILALP----GAEIHAVSPEGKLVVVLEGES-QGAILDTIEAINALEGVLSASLVYHQIE 76 (78) T ss_pred EEEEEECHHHHHHHHHHHHCCC----CCEEECCCCCCEEEEEEEECC-HHHHHHHHHHHHCCCCEEEEEEEEEECC T ss_conf 8999968777999999997499----968863799942999997288-2799999999865998048987678516 No 37 >pfam05902 4_1_CTD 4.1 protein C-terminal domain (CTD). At the C-terminus of all known 4.1 proteins is a sequence domain unique to these proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24-kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates. 
Probab=29.36 E-value=43 Score=14.95 Aligned_cols=10 Identities=40% Similarity=0.508 Sum_probs=4.2 Q ss_pred CCHHHHHHHH Q ss_conf 3013667751 Q gi|254780660|r 147 DSNLAMEVAI 156 (244) Q Consensus 147 d~d~~~e~ai 156 (244) |-|.+|-.|| T Consensus 82 DHdqALAqAI 91 (114) T pfam05902 82 DHDQALAQAI 91 (114) T ss_pred CHHHHHHHHH T ss_conf 4689999999 No 38 >KOG0893 consensus Probab=28.78 E-value=46 Score=14.72 Aligned_cols=73 Identities=23% Similarity=0.368 Sum_probs=45.9 Q ss_pred HHHHHHHH-HHHH-CCCCHHHCHHHHHHHHHHHHCCCCHHHHHHHHHHHCC--CCCC--CCCEEEEE----EECCCCEEE Q ss_conf 99999999-9981-8999414899999999999668887899999985037--7666--64023678----650599399 Q gi|254780660|r 27 KLSREITV-SAKL-SGQNPLENPRLRLAIQNAKNQSMPKENIERAIKKAGS--DDLG--NYTNIRYE----GYGPEGVAI 96 (244) Q Consensus 27 k~~keI~~-A~k~-gG~dp~~N~~L~~ai~~Ak~~~mPk~~Ie~aIkk~~~--~~~~--~~~~~~yE----g~gP~gvai 96 (244) +..+||.- |.|+ |-+|.-.+++|--++-.=...|+|.- |---+.|-.. .+.. -|..++|= +++|+|..+ T Consensus 43 ~alkeI~kFA~keMgt~dv~~Dt~lnkavwakgirnv~~~-irvrlsrk~n~~e~~~~~l~t~~t~v~~~~~~~~~~~~~ 121 (125) T KOG0893 43 RALKEIWKFAMKEMGTPDVHVDTRLNKAVWEKGIRNVPYR-IRVRLSRKRNEDEDSPNKLYTQVTYVPVSHGKNPQGKTV 121 (125) T ss_pred HHHHHHHHHHHHHHCCCCCEECCHHHHHHHHHCCCCCCCH-HHCCCCCCCCCCCCCCHHHEEEEEEEEECCCCCCCCEEE T ss_conf 8899999999997298600424035178887325577613-200234234522113111244589986134346665377 Q ss_pred EEEE Q ss_conf 9999 Q gi|254780660|r 97 IIEA 100 (244) Q Consensus 97 iVe~ 100 (244) +|+| T Consensus 122 ~v~~ 125 (125) T KOG0893 122 IVEE 125 (125) T ss_pred EEEC T ss_conf 7309 No 39 >cd04877 ACT_TyrR N-terminal ACT domain of the TyrR protein. ACT_TyrR: N-terminal ACT domain of the TyrR protein. The TyrR protein of Escherichia coli controls the expression of a group of transcription units (TyrR regulon) whose gene products are involved in the biosynthesis or transport of the aromatic amino acids. Binding to specific DNA sequences known as TyrR boxes, the TyrR protein can either activate or repress transcription at different sigma70 promoters. 
Its regulatory activity occurs in response to intracellular levels of tyrosine, phenylalanine and tryptophan. The TyrR protein consists of an N-terminal region important for transcription activation with an ATP-independent aromatic amino acid binding site (contained within the ACT domain) and is involved in dimerization; a central region with an ATP binding site, an ATP-dependent aromatic amino acid binding site and is involved in hexamerization; and a helix turn helix DNA binding C-terminal region. In solution, in the absence Probab=28.76 E-value=46 Score=14.72 Aligned_cols=55 Identities=15% Similarity=0.166 Sum_probs=35.9 Q ss_pred HHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHHCCCCCCCCCC Q ss_conf 34566542025674222178603761024898999999999987523888131102 Q gi|254780660|r 180 GLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLEDDDDVQSVYS 235 (244) Q Consensus 180 ~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Led~DDVq~Vyt 235 (244) ..+...|...++....-|+.-.+.-++..++-+ .+.+..|+..|.-.+-|..|-+ T Consensus 15 ~eiL~lL~~~~IdL~giEi~~~g~IYl~~~~l~-f~~~~~Lm~~ir~I~GV~dVkt 69 (74) T cd04877 15 QEVLDLLVEHNIDLRGIEIDPKGRIYLNFPTIE-FEKLQTLMPEIRRIDGVEDVKT 69 (74) T ss_pred HHHHHHHHHCCCCCEEEEECCCCEEEEECCCCC-HHHHHHHHHHHHCCCCCCEEEE T ss_conf 999999997799832888726881899778789-8999999999847898316788 No 40 >cd04887 ACT_MalLac-Enz ACT_MalLac-Enz CD includes the N-terminal ACT domain of putative NAD-dependent malic enzyme 1, Bacillus subtilis YqkI and related domains. The ACT_MalLac-Enz CD includes the N-terminal ACT domain of putative NAD-dependent malic enzyme 1, Bacillus subtilis YqkI, a malolactic enzyme (MalLac-Enz) which converts malate to lactate, and other related ACT domains. The yqkJ product is predicted to convert malate directly to lactate, as opposed to related malic enzymes that convert malate to pyruvate. Members of this CD belong to the superfamily of ACT regulatory domains. 
Probab=28.36 E-value=47 Score=14.67 Aligned_cols=58 Identities=9% Similarity=0.095 Sum_probs=40.9 Q ss_pred EEEECCCCHHHHHHHHHHCCCCCCCCEEEECCC------CCEECCCHHHHHHHHHHHHHHHCCCCCC Q ss_conf 999643201345665420256742221786037------6102489899999999998752388813 Q gi|254780660|r 171 IFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPL------NYIRLSNADKVKSIIKMIENLEDDDDVQ 231 (244) Q Consensus 171 ~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~------~~V~l~d~e~~~~~~klie~Led~DDVq 231 (244) ++--.|-.|..+..++...|..+....+.-.-. -+|++.+.++ ...++++|+.++.|. T Consensus 5 ~~~~~pG~Lg~vataIg~~GGnI~~idvve~~~~~~v~Ditv~~~d~~h---~~~Iv~al~~l~gV~ 68 (74) T cd04887 5 ELPNRPGMLGRVTTAIGEAGGDIGAIDLVEQGRDYTVRDITVDAPSEEH---AETIVAAVRALPEVK 68 (74) T ss_pred EECCCCCHHHHHHHHHHHCCCCEEEEEEEEECCCEEEEEEEEECCCHHH---HHHHHHHHHCCCCEE T ss_conf 9549986499999999876985677899994499599999998697788---999999996199859 No 41 >cd07373 2A5CPDO_A The alpha subunit of the Class III extradiol dioxygenase, 2-amino-5-chlorophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol. 2-amino-5-chlorophenol 1,6-dioxygenase (2A5CPDO) catalyzes the oxidization and subsequent ring-opening of 2-amino-5-chlorophenol, which is an intermediate during p-chloronitrobenzene degradation. This enzyme is a member of the class III extradiol dioxygenase family, a group of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. The active enzyme is probably a heterotetramer, composed of two alpha and two beta subunits. The alpha and beta subunits share significant sequence similarity and may have evolved by gene duplication. This model describes the alpha subunit, which does not contain a potential metal binding site and may not possess catalytic activity. 
Probab=26.55 E-value=50 Score=14.46 Aligned_cols=93 Identities=11% Similarity=0.118 Sum_probs=54.6 Q ss_pred CHHHCHHHHHHHHH-HHHCCCCHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCEEEEEEEE--CCCCH---HHHHHHHHH Q ss_conf 94148999999999-9966888789999998503776666402367865059939999994--46302---358998987 Q gi|254780660|r 42 NPLENPRLRLAIQN-AKNQSMPKENIERAIKKAGSDDLGNYTNIRYEGYGPEGVAIIIEAL--TDNRN---RTASSIRSI 115 (244) Q Consensus 42 dp~~N~~L~~ai~~-Ak~~~mPk~~Ie~aIkk~~~~~~~~~~~~~yEg~gP~gvaiiVe~l--TDN~n---Rt~~~vr~~ 115 (244) |-..|+.|+.+|.. |++.+++--.++. .+-.-+-+....++|-..+.+...++|-+. ..|.. |.-.-++-+ T Consensus 85 d~~~D~eLa~ai~~~a~~~Gl~~~~~~~---~~~~iDyGTivpl~~ln~~~~~~pvvi~s~~~~~~~ee~~~lG~a~~~A 161 (271) T cd07373 85 DIRSDTALAEACVTACPEHGVHARGVDY---DGFPIDTGTITACTLMGIGTEALPLVVASNNLYHSGEITEKLGAIAADA 161 (271) T ss_pred CCCCCHHHHHHHHHHHHHCCCEEEEECC---CCCCCCCCCEEHHHHHCCCCCCCEEEEEECCCCCCHHHHHHHHHHHHHH T ss_conf 2138999999999999977974552059---9986542212078772878778508998557657989999999999999 Q ss_pred HHHCCCC--CCCCCCCHHHHHHCC Q ss_conf 6425683--147886045686447 Q gi|254780660|r 116 FTKANGS--LGSTGSTTRFFEQIG 137 (244) Q Consensus 116 f~K~gG~--lg~~Gsv~~~F~~~G 137 (244) +.++|.+ +-.+||.||-|=+-+ T Consensus 162 i~~s~krvvllASgsLSHr~f~~~ 185 (271) T cd07373 162 AKDQNKRVAVVGVGGLSGSLFREE 185 (271) T ss_pred HHHCCCCEEEEEECCHHHHHCCCC T ss_conf 998299789998353244312688 No 42 >KOG1572 consensus Probab=26.39 E-value=15 Score=18.11 Aligned_cols=62 Identities=24% Similarity=0.293 Sum_probs=38.4 Q ss_pred CCCCHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCC Q ss_conf 6888789999998503776666402367865059939999994463023589989876425683 Q gi|254780660|r 59 QSMPKENIERAIKKAGSDDLGNYTNIRYEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGS 122 (244) Q Consensus 59 ~~mPk~~Ie~aIkk~~~~~~~~~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~ 122 (244) .++|.+.|.+|++- --+..+|--+.-...|++-++.+|-|+--=.|=.++.|=.=+.|+.|. 
T Consensus 129 ~~~~~~~i~~~l~~--lld~~N~P~Lihc~rGkhRtg~lVgclRklq~W~lssil~Ey~~fa~s 190 (249) T KOG1572 129 VNIPDHSIRKALKV--LLDKRNYPILIHCKRGKHRTGCLVGCLRKLQNWSLSSILDEYLRFAGS 190 (249) T ss_pred CCCHHHHHHHHHHH--HHCCCCCCEEEECCCCCCCHHHHHHHHHHHHCCCHHHHHHHHHHHCCC T ss_conf 88747999999998--831267865776488871013169999998556156778899875052 No 43 >COG3492 Uncharacterized protein conserved in bacteria [Function unknown] Probab=26.35 E-value=43 Score=14.92 Aligned_cols=20 Identities=25% Similarity=0.604 Sum_probs=13.6 Q ss_pred HHHHHHHHHHHHCCCCCCCC Q ss_conf 99999999875238881311 Q gi|254780660|r 214 VKSIIKMIENLEDDDDVQSV 233 (244) Q Consensus 214 ~~~~~klie~Led~DDVq~V 233 (244) +..+-.|++-|.+.-||||| T Consensus 16 AAaFRrLv~HL~~rsdvQNI 35 (104) T COG3492 16 AAAFRRLVEHLQERSDVQNI 35 (104) T ss_pred HHHHHHHHHHHHHHCCCCHH T ss_conf 99999999999985300013 No 44 >PHA00099 minor capsid protein Probab=26.15 E-value=23 Score=16.84 Aligned_cols=93 Identities=19% Similarity=0.182 Sum_probs=51.3 Q ss_pred EEEECCCCEEEEEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCCCCHHHHHHHHCCCCC---- Q ss_conf 7865059939999994463023589989876425683147886045686447279706775301366775126875---- Q gi|254780660|r 86 YEGYGPEGVAIIIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIGDSNLAMEVAIESDAF---- 161 (244) Q Consensus 86 yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~d~d~~~e~aie~ga~---- 161 (244) |...-+.+.-.-.|.||..-...-++|++|++|+++. 
|.+.|.=-+.+.-.--..+.|--+.|++.+++..- T Consensus 5 y~~~~~~~~~~~~eSLTqQ~f~~EcDIn~IVKk~~~T----G~i~hv~~~~p~YgD~s~v~dyqeAmn~V~~AqE~F~~L 80 (147) T PHA00099 5 YSEKKSVKLKFTQKSLTQQHNKDECDINNIVKKLNAT----GVLEHVERRQPRYGDCMSPMDYQEALNVVIEAQEAFDSL 80 (147) T ss_pred CCCCCHHEEEECCCCHHHHHHHHHCCHHHHHHHHHCC----CCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHC T ss_conf 3641000010055207766537633899999996226----876443456875355466066999999999999999868 Q ss_pred --CC---CCCC-C-CEEEEECCCCHHHH Q ss_conf --22---3468-8-25999643201345 Q gi|254780660|r 162 --EV---LFED-Q-EYIFYCDFNNVGLT 182 (244) Q Consensus 162 --Dv---~~~d-~-~~~i~~~~~~~~~v 182 (244) +| ..+| + .+.|..+|.....+ T Consensus 81 Ps~vR~~F~NDP~~FleF~~dp~N~de~ 108 (147) T PHA00099 81 PAKIRERFGNDPEEMLDFLSDPENYDEA 108 (147) T ss_pred CHHHHHHHCCCHHHHHHHHCCCCCHHHH T ss_conf 7999998769999999996183349999 No 45 >cd06139 DNA_polA_I_Ecoli_like_exo The 3'-5' exonuclease domain of DNA polymerases has a fundamental role in reducing polymerase errors and is involved in proofreading activity. E.coli-like Polymerase I (pol I), a subgroup of family-A DNA polymerases, contains a DEDDy-type DnaQ-like 3'-5' exonuclease domain in the same polypeptide chain as the polymerase domain. The exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, with a specific YX(3)D pattern at ExoIII. These motifs are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. E. coli DNA pol I is involved in genome replication but is not the main replicating enzyme. It is also implicated in DNA repair. 
Probab=25.47 E-value=31 Score=15.93 Aligned_cols=25 Identities=12% Similarity=0.317 Sum_probs=17.9 Q ss_pred HHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 9999998752388813110256266 Q gi|254780660|r 216 SIIKMIENLEDDDDVQSVYSNLEIA 240 (244) Q Consensus 216 ~~~klie~Led~DDVq~VytN~~i~ 240 (244) -...|.+.|++......+|.++++| T Consensus 162 L~~~l~~~L~~~~~l~~ly~~iE~P 186 (193) T cd06139 162 LYELLKPKLKEEPGLLELYEEIEMP 186 (193) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHCC T ss_conf 9999999998422099999998456 No 46 >TIGR02394 rpoS_proteo RNA polymerase sigma factor RpoS; InterPro: IPR012761 The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma70 family members reveal four conserved regions that can be further divided into subregions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding 'helix-turn-helix' motif involved in binding the conserved -35 region of promoters recognized by the major sigma factors.
This entry represents the clade of sigma factors called RpoS (also called sigma-38, KatF, etc.), found only in proteobacteria. This sigma factor is induced in stationary phase (in response to the stress of nutrient limitation) and becomes the second principal sigma factor at that time.; GO: 0003677 DNA binding, 0003700 transcription factor activity, 0016987 sigma factor activity, 0006352 transcription initiation, 0006355 regulation of transcription DNA-dependent. Probab=23.57 E-value=57 Score=14.08 Aligned_cols=30 Identities=27% Similarity=0.393 Sum_probs=18.9 Q ss_pred HHHHHHHHCCCC-------HHHCHHHHHHHHHHHHC---CCC Q ss_conf 999999818999-------41489999999999966---888 Q gi|254780660|r 31 EITVSAKLSGQN-------PLENPRLRLAIQNAKNQ---SMP 62 (244) Q Consensus 31 eI~~A~k~gG~d-------p~~N~~L~~ai~~Ak~~---~mP 62 (244) |+..|.|.--+| .+.| ||.++.=||.+ ||| T Consensus 33 E~~~Arra~~GD~eAR~~MIE~N--LRLVV~IAk~Y~nRGlp 72 (292) T TIGR02394 33 ELAYARRALAGDFEARKKMIESN--LRLVVSIAKHYVNRGLP 72 (292) T ss_pred HHHHHHHHHCCCHHHHHHHHHHH--CHHHHHHHHHCCCCCCH T ss_conf 99998886507888988877640--12676786440476514 No 47 >pfam06798 PrkA PrkA serine protein kinase C-terminal domain. This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This family corresponds to the C-terminal domain. Probab=23.51 E-value=57 Score=14.07 Aligned_cols=50 Identities=12% Similarity=0.208 Sum_probs=30.2 Q ss_pred CCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHHH Q ss_conf 432013456654202567422217860376102489899999999998752 Q gi|254780660|r 175 DFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENLE 225 (244) Q Consensus 175 ~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~Le 225 (244) ++.++..++.+++++.+.-..--+-.+.. 
.-...|+|+.+++..+++.|- T Consensus 189 d~~s~e~LreaiEkkLf~d~kd~~~~is~-~sk~~d~e~qkk~~~~v~rmi 238 (254) T pfam06798 189 DYDSYERLREAIEKKLFSDVKDLLKLISL-VSKVTDEEQQKKIDEVVDRMI 238 (254) T ss_pred CCCCHHHHHHHHHHHHHHHHHHHHHHHHH-HCCCCCHHHHHHHHHHHHHHH T ss_conf 99856999999999999867747877531-125899999999999999999 No 48 >TIGR02684 dnstrm_HI1420 probable addiction module antidote protein; InterPro: IPR014057 Members of this strictly bacterial protein family are small, at roughly 100 amino acids. The gene is almost invariably the downstream member of a gene pair. It is a predicted DNA-binding protein from a clade within the helix-turn-helix family IPR001387 from INTERPRO. These gene pairs, when found on the bacterial chromosome, are located often with prophage regions, but also both in integrated plasmid regions and in housekeeping gene regions. Analysis suggests that the gene pair may serve as an addiction module. Probab=23.02 E-value=41 Score=15.05 Aligned_cols=16 Identities=19% Similarity=0.376 Sum_probs=7.6 Q ss_pred CHHHHHHHHHHHHCCC Q ss_conf 8999999999996688 Q gi|254780660|r 46 NPRLRLAIQNAKNQSM 61 (244) Q Consensus 46 N~~L~~ai~~Ak~~~m 61 (244) ||+|.+++.-.|+-|+ T Consensus 71 nP~f~T~lkV~~ALG~ 86 (91) T TIGR02684 71 NPTFDTILKVTKALGL 86 (91) T ss_pred CCCHHHHHHHHHHCCC T ss_conf 9566888999984086 No 49 >PRK10507 bifunctional glutathionylspermidine amidase/glutathionylspermidine synthetase; Provisional Probab=22.57 E-value=45 Score=14.76 Aligned_cols=76 Identities=12% Similarity=0.057 Sum_probs=37.4 Q ss_pred HHCCCCHHHHHHHHHHH-CCCCCCCCCEEEEE-EECCCCEEEEEEEECCCCHHH--HHHHHHHHHHCCCCCCCCCCCHHH Q ss_conf 96688878999999850-37766664023678-650599399999944630235--899898764256831478860456 Q gi|254780660|r 57 KNQSMPKENIERAIKKA-GSDDLGNYTNIRYE-GYGPEGVAIIIEALTDNRNRT--ASSIRSIFTKANGSLGSTGSTTRF 132 (244) Q Consensus 57 k~~~mPk~~Ie~aIkk~-~~~~~~~~~~~~yE-g~gP~gvaiiVe~lTDN~nRt--~~~vr~~f~K~gG~lg~~Gsv~~~ 132 (244) +..++|+....+ |.+. ..-...-+ .=||.
+|...| .-+.|--.|-.+-- .+.+--..-+++|-.....+-..+ T Consensus 289 ~~F~IP~~~wp~-iR~SWq~R~~~li-sGRfDFa~d~~g-lK~~EYNaDSpS~l~Eag~iQ~~Wae~~g~~~g~~~g~~L 365 (619) T PRK10507 289 ALFDIPKILWPR-LRLSWQRRRHHMI-TGRMDFCMDERG-LKVYEYNADSASCHTEAGLILEKWAEQGYKGNGFNPAEGL 365 (619) T ss_pred HHCCCCHHHHHH-HHHHHHHCCCCCE-EEEEEEEECCCC-CEEEEECCCCCHHHHHHHHHHHHHHHHCCCCCCCCHHHHH T ss_conf 763998789899-9998973587530-145666776988-5899955898157776489999999961876576928999 Q ss_pred HHH Q ss_conf 864 Q gi|254780660|r 133 FEQ 135 (244) Q Consensus 133 F~~ 135 (244) |++ T Consensus 366 ~~~ 368 (619) T PRK10507 366 INE 368 (619) T ss_pred HHH T ss_conf 999 No 50 >COG4838 Uncharacterized protein conserved in bacteria [Function unknown] Probab=22.37 E-value=60 Score=13.92 Aligned_cols=31 Identities=23% Similarity=0.280 Sum_probs=25.4 Q ss_pred HHHHHHHHHHHHCCCCHHHCHHHHHHHHHHH Q ss_conf 9999999999818999414899999999999 Q gi|254780660|r 27 KLSREITVSAKLSGQNPLENPRLRLAIQNAK 57 (244) Q Consensus 27 k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak 57 (244) -++|||--|+|.|--||+.--+|-+-+++-- T Consensus 52 GLsrEidFAvrLgLid~e~GKqll~~LEkeL 82 (92) T COG4838 52 GLSREIDFAVRLGLIDEEDGKQLLSTLEKEL 82 (92) T ss_pred HHHHHHHHHHHHCCCCHHHHHHHHHHHHHHH T ss_conf 0255553888863667888799999999999 No 51 >TIGR01969 minD_arch cell division ATPase MinD; InterPro: IPR010224 Proper placement of the bacterial cell division site requires the site-specific inactivation of other potential division sites. In Escherichia coli, selection of the correct mid-cell site is mediated by the MinC, MinD and MinE proteins. Several members of this family are found in archaeal genomes but their function is uncharacterised. More distantly related proteins include flagellar biosynthesis proteins and ParA chromosome partitioning proteins. This entry represents the archaeal MinD family. The exact roles of the various archaeal MinD homologs are unknown.
Probab=21.99 E-value=46 Score=14.71 Aligned_cols=49 Identities=8% Similarity=0.053 Sum_probs=26.9 Q ss_pred CEEEEECCC-----CHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHH Q ss_conf 259996432-----01345665420256742221786037610248989999999 Q gi|254780660|r 169 EYIFYCDFN-----NVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSII 218 (244) Q Consensus 169 ~~~i~~~~~-----~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~ 218 (244) +..+-+.|+ |--+++--.++.|..+...=+=++-....+++ .|..|.++ T Consensus 134 elLLVvNPEi~SItDaLK~k~va~~lGt~ilG~vlNRv~~~~tel~-~~eiE~iL 187 (258) T TIGR01969 134 ELLLVVNPEISSITDALKVKIVAEKLGTAILGVVLNRVTRDKTELG-REEIEAIL 187 (258) T ss_pred CCEEEECCCHHHHHHHHHHHHHHHHCCCCEEEEEEEECCCCCCCCC-HHHHHHHH T ss_conf 6648667654467778899999876088324689960236666378-88999884 No 52 >PHA00448 hypothetical protein Probab=21.86 E-value=61 Score=13.85 Aligned_cols=47 Identities=34% Similarity=0.457 Sum_probs=32.5 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCHHHCHHHHHHHHHHHHCCCCH Q ss_conf 5564677557888979999999999999818999414899999999999668887 Q gi|254780660|r 9 NIMHRKERKDALKSKIFSKLSREITVSAKLSGQNPLENPRLRLAIQNAKNQSMPK 63 (244) Q Consensus 9 nIkh~K~~~D~~k~k~f~k~~keI~~A~k~gG~dp~~N~~L~~ai~~Ak~~~mPk 63 (244) +--..|+.+|+.|+...++-+|+..-|+-.|-. 
.-.+.+ ||.+.|.| T Consensus 21 Krln~kArkda~~a~~LAk~arelsdaAs~gvt---~aA~iA-----akAq~lsK 67 (70) T PHA00448 21 KRLNDKARKDATRARRLAKQSRELSDAASAGVT---EAARIA-----AKAQQLSK 67 (70) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCH---HHHHHH-----HHHHHHHH T ss_conf 985288887489999999998886687863304---899999-----98878887 No 53 >COG1438 ArgR Arginine repressor [Transcription] Probab=21.79 E-value=61 Score=13.84 Aligned_cols=95 Identities=9% Similarity=0.152 Sum_probs=64.9 Q ss_pred EEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCC------------CCHHHHHHHHCCCCCCCC Q ss_conf 99994463023589989876425683147886045686447279706775------------301366775126875223 Q gi|254780660|r 97 IIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIG------------DSNLAMEVAIESDAFEVL 164 (244) Q Consensus 97 iVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~------------d~d~~~e~aie~ga~Dv~ 164 (244) |-+.+|.|+-.|..++...+.++|.. .++-.||-+-.+.|.+.++.... ......-..+..-..++. T Consensus 11 Ik~iI~~~~i~TQ~Elv~~L~~~Gi~-vTQaTvSRDlkelglvKv~~~~g~~~Y~l~~~~~~~~~~~~~~~~~~~v~~vd 89 (150) T COG1438 11 IKEIITEEKISTQEELVELLQEEGIE-VTQATVSRDLKELGLVKVRNEKGTYVYSLPAELGVPPTSKLKRYLKDLVLSID 89 (150) T ss_pred HHHHHHHCCCCCHHHHHHHHHHCCCE-EEHHHHHHHHHHCCCEEECCCCCCEEEEECCCCCCCCHHHHHHHHHHHHEEEC T ss_conf 99999867777899999999982975-86398787799859889337897389984876677704668889998810111 Q ss_pred CCCCCEEEEECCCCHHHHHHHHHHCCCC Q ss_conf 4688259996432013456654202567 Q gi|254780660|r 165 FEDQEYIFYCDFNNVGLTSKKLEEKIGE 192 (244) Q Consensus 165 ~~d~~~~i~~~~~~~~~v~~~Le~~~~~ 192 (244) -.+..+.+.|.|-.-..+...|...+++ T Consensus 90 ~~~~~ivlkT~PG~A~~ia~~lD~~~~~ 117 (150) T COG1438 90 RNGNLIVLKTSPGAAQLIARLLDSLAKD 117 (150) T ss_pred CCCCEEEEEECCCHHHHHHHHHHHCCCH T ss_conf 4786899980896589999998743743 No 54 >PRK00139 murE UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase; Provisional Probab=21.50 E-value=62 Score=13.80 Aligned_cols=61 Identities=18% Similarity=0.265 Sum_probs=38.6 Q ss_pred 
HHHHHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCEEEEEEECCCCEEEEEEEECCCCHHHHHHHHHH Q ss_conf 9999999996688878999999850377666640236786505993999999446302358998987 Q gi|254780660|r 49 LRLAIQNAKNQSMPKENIERAIKKAGSDDLGNYTNIRYEGYGPEGVAIIIEALTDNRNRTASSIRSI 115 (244) Q Consensus 49 L~~ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~~vr~~ 115 (244) +-+|+.-|...++|.+.|.++|++-.+-. +-+|-+ ..++|..+||+ +-.|+--..+-+.++ T Consensus 298 alaAiava~~lGi~~~~i~~~L~~~~~v~-GRmE~i----~~~~~~~vivD-YAHtP~sl~~~L~~l 358 (481) T PRK00139 298 LLAAIAALLALGVPLEDILKALAKLRPVP-GRMERV----GAGGGPLVIVD-YAHTPDALEKVLDAL 358 (481) T ss_pred HHHHHHHHHHCCCCHHHHHHHHHCCCCCC-CCCEEE----ECCCCCEEEEE-CCCCHHHHHHHHHHH T ss_conf 99999999983989899999983499999-986899----87999689997-689989999999999 No 55 >PRK03803 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase; Provisional Probab=21.31 E-value=63 Score=13.77 Aligned_cols=58 Identities=17% Similarity=0.192 Sum_probs=23.1 Q ss_pred HHHHHHHHCCCCHHHHHHHHHHHCCCCCCCCCEEEEEEE-CCCCEEEEEEEECCCCHHHHHHHHH Q ss_conf 999999966888789999998503776666402367865-0599399999944630235899898 Q gi|254780660|r 51 LAIQNAKNQSMPKENIERAIKKAGSDDLGNYTNIRYEGY-GPEGVAIIIEALTDNRNRTASSIRS 114 (244) Q Consensus 51 ~ai~~Ak~~~mPk~~Ie~aIkk~~~~~~~~~~~~~yEg~-gP~gvaiiVe~lTDN~nRt~~~vr~ 114 (244) +|+.-|+..++|.+.|.++|+.-.|-.- |+|-. -.+||-+|=+.-.-|..-|.+-++. T Consensus 281 AA~a~a~~~Gi~~e~I~~aL~~F~Gl~h------R~E~v~~~~Gv~fiNDSKaTN~~at~~Al~~ 339 (448) T PRK03803 281 AALALGEAAGLPMEAMLETLKEFKGLEH------RCQWVREVDGVRYYNDSKGTNVGATLAAIEG 339 (448) T ss_pred HHHHHHHHCCCCHHHHHHHHHHCCCCCE------EEEEEEECCCEEEEECCCCCCHHHHHHHHHH T ss_conf 9999999829987898998863557120------5799996188899968998997899999996 No 56 >TIGR02768 TraA_Ti Ti-type conjugative transfer relaxase TraA; InterPro: IPR014136 This entry represents the Ti-type conjugative transfer relaxase TraA. 
TraA contains domains distinctive of a single strand exonuclease (N-terminus, MobA/MobL, IPR005053 from INTERPRO) as well as a helicase domain (central region, homologous to the corresponding region of the F-type relaxase TraI, IPR014129 from INTERPRO). This protein likely fills the same role as TraI(F), nicking (at the oriT site) and unwinding the coiled plasmid prior to conjugative transfer. Probab=21.29 E-value=48 Score=14.58 Aligned_cols=71 Identities=11% Similarity=0.185 Sum_probs=44.9 Q ss_pred HHHHHHHHHHCCCCHHHHHHHH------------------HHHCCCCCCCCCEEEEEEECCCCEEEEEEEECCCCHHHHH Q ss_conf 9999999996688878999999------------------8503776666402367865059939999994463023589 Q gi|254780660|r 49 LRLAIQNAKNQSMPKENIERAI------------------KKAGSDDLGNYTNIRYEGYGPEGVAIIIEALTDNRNRTAS 110 (244) Q Consensus 49 L~~ai~~Ak~~~mPk~~Ie~aI------------------kk~~~~~~~~~~~~~yEg~gP~gvaiiVe~lTDN~nRt~~ 110 (244) =+.+|.+.+.-+||..+...|| +...+-+.+++.-|. +.-||+++|+||=--=-=|+=... T Consensus 370 ~a~~l~~~~~hgV~~~~~~~a~ER~~~~~~~~~GGmlaas~~~~rLs~EQ~~Av~-hvt~s~~iavVvG~AGtGKSt~L~ 448 (888) T TIGR02768 370 SAEALSQSQGHGVSPPIVDAAIERVDRILRRDSGGMLAASDQHERLSEEQKEAVR-HVTGSGDIAVVVGRAGTGKSTMLK 448 (888) T ss_pred HHHHHHHCCCCCCCHHHHHHHHHHHHHHHCCCCCCCEECCCCCCCCHHHHHHHHH-HHCCCCCEEEEECCCCCCHHHHHH T ss_conf 9999962068678877888888766665227886401116787774589999987-532899648997489987667899 Q ss_pred HHHHHHHHCC Q ss_conf 9898764256 Q gi|254780660|r 111 SIRSIFTKAN 120 (244) Q Consensus 111 ~vr~~f~K~g 120 (244) -.|.++...| T Consensus 449 aAR~AWe~~G 458 (888) T TIGR02768 449 AAREAWEAAG 458 (888) T ss_pred HHHHHHHHCC T ss_conf 9999998739 No 57 >COG2766 PrkA Putative Ser protein kinase [Signal transduction mechanisms] Probab=21.23 E-value=63 Score=13.76 Aligned_cols=14 Identities=21% Similarity=0.264 Sum_probs=6.9 Q ss_pred CHHHCHHHHHHHHH Q ss_conf 94148999999999 Q gi|254780660|r 42 NPLENPRLRLAIQN 55 (244) Q Consensus 42 dp~~N~~L~~ai~~ 55 (244) |++.+.|+..+..+ T Consensus 53 dt~~d~r~~~~f~n
66 (649) T ss_pred CCCCCCHHHHHCCC T ss_conf 72200024411000 No 58 >COG1033 Predicted exporters of the RND superfamily [General function prediction only] Probab=20.96 E-value=64 Score=13.72 Aligned_cols=17 Identities=18% Similarity=0.094 Sum_probs=11.1 Q ss_pred HHHHHHHHHHHHHCCCC Q ss_conf 23589989876425683 Q gi|254780660|r 106 NRTASSIRSIFTKANGS 122 (244) Q Consensus 106 nRt~~~vr~~f~K~gG~ 122 (244) .|...+.+.+-+++||+ T Consensus 418 ~p~~~~~~~i~~~~ggs 434 (727) T COG1033 418 LPALKALDFIEKEFGGS 434 (727) T ss_pred CHHHHHHHHHHHHCCCC T ss_conf 67888899999874998 No 59 >COG3734 DgoK 2-keto-3-deoxy-galactonokinase [Carbohydrate transport and metabolism] Probab=20.88 E-value=37 Score=15.39 Aligned_cols=15 Identities=33% Similarity=0.421 Sum_probs=5.6 Q ss_pred HHHHHHHHHHHHHHH Q ss_conf 799999999999998 Q gi|254780660|r 23 KIFSKLSREITVSAK 37 (244) Q Consensus 23 k~f~k~~keI~~A~k 37 (244) -+|.-+.+.=.++.. T Consensus 170 Elf~~l~~HS~l~~~ 184 (306) T COG3734 170 ELFALLINHSTLSAG 184 (306) T ss_pred HHHHHHHHHHHHHCC T ss_conf 999998741444136 No 60 >TIGR02875 spore_0_A sporulation transcription factor Spo0A; InterPro: IPR012052 Members of this group are response regulators/transcription factors that contain CheY-like receiver (phosphoacceptor) domain and a unique DNA-binding domain. Spo0A controls the entry of Bacillus subtilis into the developmental process of sporulation. Activation of the Spo0A transcription factor by phosphorylation serves as a developmental checkpoint and to integrate several physiological signals that control entry into the sporulation pathway. The signals are generated by conditions of nutrient deprivation, high cell density, the Krebs cycle, DNA replication, DNA damage, and some aspect of the chromosome partitioning machinery. Activated Spo0A has multiple functions. It represses promoters (such as the abrB promoter) by binding to sites (0A boxes) downstream from the transcription start site.
Spo0A also activates transcription from promoters used by two forms of RNA polymerase that differ by containing either the major sigma factor, sigma A (e.g. the spoIIE and spoIIG promoters) or the alternate sigma factor, sigma H (e.g. the spoIIA promoter). At promoters activated by Spo0A, the 0A boxes lie upstream of the transcription-initiation site.; GO: 0003677 DNA binding, 0005509 calcium ion binding, 0000160 two-component signal transduction system (phosphorelay), 0042173 regulation of sporulation, 0045449 regulation of transcription, 0050906 detection of stimulus during sensory perception. Probab=20.84 E-value=51 Score=14.40 Aligned_cols=28 Identities=32% Similarity=0.316 Sum_probs=17.7 Q ss_pred HHHCCCCHHHHHHHHHHHCCC--CCCCCCE Q ss_conf 996688878999999850377--6666402 Q gi|254780660|r 56 AKNQSMPKENIERAIKKAGSD--DLGNYTN 83 (244) Q Consensus 56 Ak~~~mPk~~Ie~aIkk~~~~--~~~~~~~ 83 (244) ||+.|-+-+.+||||.-|=+- +-++.|. T Consensus 205 A~kYnTTaSRVERAIRHAIEvAWsRG~~E~ 234 (270) T TIGR02875 205 AKKYNTTASRVERAIRHAIEVAWSRGNIEL 234 (270) T ss_pred HHHCCCCCCHHHHHHHHHHHHCCCCCCCCC T ss_conf 854386964466766646342025886210 No 61 >PRK03341 arginine repressor; Provisional Probab=20.25 E-value=66 Score=13.62 Aligned_cols=127 Identities=9% Similarity=0.060 Sum_probs=69.3 Q ss_pred EEEEECCCCHHHHHHHHHHHHHCCCCCCCCCCCHHHHHHCCEEEECCCCCC------------------CHHHHHHHHCC Q ss_conf 999944630235899898764256831478860456864472797067753------------------01366775126 Q gi|254780660|r 97 IIEALTDNRNRTASSIRSIFTKANGSLGSTGSTTRFFEQIGEIIYHSNIGD------------------SNLAMEVAIES 158 (244) Q Consensus 97 iVe~lTDN~nRt~~~vr~~f~K~gG~lg~~Gsv~~~F~~~G~i~~~~~~~d------------------~d~~~e~aie~ 158 (244) |-+.++.|.-+|..++...+.+.|=. .+...+|-.-...|++-....... 
...-+.-.+.- T Consensus 20 I~~lI~~~~I~tQeeL~~~L~~~Gi~-vTQATiSRDikEL~lvKv~~~~G~~~yy~lp~~~~~~~~~~~~~~~l~~~~~~ 98 (168) T PRK03341 20 IVAILSSQSVRSQSELAALLADEGID-VTQATLSRDLEELGAVKLRGADGGLGVYVVPEDGSPRRGVAGGTERLRRLLGE 98 (168) T ss_pred HHHHHHCCCCCCHHHHHHHHHHCCCC-EEHHHHHHHHHHHCCEEEECCCCCEEEEEECCCCCCCCCCCCHHHHHHHHHHH T ss_conf 99999608978899999999976986-54278775299838767325899889999457666444555578999999998 Q ss_pred CCCCCCCCCCCEEEEECCCCHHHHHHHHHHCCCCCCCCEEEECCCCCEECCCHHHHHHHHHHHHHH Q ss_conf 875223468825999643201345665420256742221786037610248989999999999875 Q gi|254780660|r 159 DAFEVLFEDQEYIFYCDFNNVGLTSKKLEEKIGEAQSIKVIWKPLNYIRLSNADKVKSIIKMIENL 224 (244) Q Consensus 159 ga~Dv~~~d~~~~i~~~~~~~~~v~~~Le~~~~~~~~sei~~~P~~~V~l~d~e~~~~~~klie~L 224 (244) ....++..+..+.+-|.|-.=..+...|....+.-.-.-|.=.-.-.|-..+++.++++..-++.| T Consensus 99 ~v~~v~~~~nlvVikT~pG~A~~va~~iD~~~~~eI~GTIAGdDTIlVi~~~~~~a~~l~~~L~~l 164 (168) T PRK03341 99 LLVSTDASGNLAVLRTPPGAAQYLASAIDRAALPYVVGTIAGDDTVLVVAREPMTGAELAARLENL 164 (168) T ss_pred HEEEEEECCCEEEEEECCCCHHHHHHHHHHCCCCCCEEEEECCCEEEEEECCHHHHHHHHHHHHHH T ss_conf 756775107689998189948999999984799873798604998999978888999999999998 No 62 >cd04883 ACT_AcuB C-terminal ACT domain of the Bacillus subtilis acetoin utilization protein, AcuB. This CD includes the C-terminal ACT domain of the Bacillus subtilis acetoin utilization protein, AcuB. AcuB is putatively involved in the anaerobic catabolism of acetoin, and related proteins. Studies report the induction of AcuB by nitrate respiration and also by fermentation. Since acetoin can be secreted and later serve as a source of carbon, it has been proposed that, during anaerobic growth when other carbon sources are exhausted, the induction of the AcuB protein results in acetoin catabolism. AcuB-like proteins have two N-terminal tandem CBS domains and a single C-terminal ACT domain. Members of this CD belong to the superfamily of ACT regulatory domains. 
Probab=20.21 E-value=66 Score=13.62 Aligned_cols=27 Identities=22% Similarity=0.288 Sum_probs=24.5 Q ss_pred EEEECCCCHHHHHHHHHHHHHCCCCCC Q ss_conf 999446302358998987642568314 Q gi|254780660|r 98 IEALTDNRNRTASSIRSIFTKANGSLG 124 (244) Q Consensus 98 Ve~lTDN~nRt~~~vr~~f~K~gG~lg 124 (244) +|+..+|+-...++|-.+|+++|-|+- T Consensus 4 Iev~V~Dr~G~La~va~i~~~~~iNI~ 30 (72) T cd04883 4 IEVRVPDRPGQLADIAAIFKDRGVNIV 30 (72) T ss_pred EEEEECCCCCCHHHHHHHHHHCCCCEE T ss_conf 999957986729999999997597589 No 63 >cd00374 RNase_T2 Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen. Probab=20.10 E-value=16 Score=18.06 Aligned_cols=31 Identities=16% Similarity=0.161 Sum_probs=16.7 Q ss_pred CEEEEEEEECCCCHHHHHHHHHHHHHCCCCC Q ss_conf 9399999944630235899898764256831 Q gi|254780660|r 93 GVAIIIEALTDNRNRTASSIRSIFTKANGSL 123 (244) Q Consensus 93 gvaiiVe~lTDN~nRt~~~vr~~f~K~gG~l 123 (244) |+.+.|.|..++....+.+|+.+|.|.++++ T Consensus 147 g~~~~l~C~~~~~~~~L~Ev~~C~~k~~~~~ 177 (195) T cd00374 147 GATPSLKCTKDPGKGLLTEIWICFDKDALKF 177 (195) T ss_pred CCCCEEEEEECCCCCEEEEEEEEEECCCCEE T ss_conf 9980899721798878989999997899967 No 64 >PRK08775 homoserine O-acetyltransferase; Provisional Probab=20.06 E-value=66 Score=13.60 Aligned_cols=25 Identities=28% Similarity=0.600 Sum_probs=16.4 Q ss_pred CEEEEEEECCCC--EEEEEEEECCCCH Q ss_conf 023678650599--3999999446302 Q gi|254780660|r 82 TNIRYEGYGPEG--VAIIIEALTDNRN 106 (244) Q Consensus 82 ~~~~yEg~gP~g--vaiiVe~lTDN~n 106 (244) ..+.||-|||.| +-+|+-+||-+.. 
T Consensus 46 ~~~~yet~G~~~~navlv~HaLtg~~H 72 (343) T PRK08775 46 LRLRYELIGPANAPVVFVAGGISAHRH 72 (343) T ss_pred EEEEEEEECCCCCCEEEEECCCCCCCC T ss_conf 377776524789988999078677300 Done!