Query gi|254780331|ref|YP_003064744.1| 50S ribosomal protein L9 [Candidatus Liberibacter asiaticus str. psy62] Match_columns 179 No_of_seqs 119 out of 1863 Neff 6.4 Searched_HMMs 39220 Date Sun May 29 16:30:09 2011 Command /home/congqian_1/programs/hhpred/hhsearch -i 254780331.hhm -d /home/congqian_1/database/cdd/Cdd.hhm No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 PRK00137 rplI 50S ribosomal pr 100.0 0 0 327.9 13.2 147 14-160 1-147 (147) 2 CHL00160 rpl9 ribosomal protei 100.0 0 0 318.6 13.6 150 11-161 2-153 (154) 3 COG0359 RplI Ribosomal protein 100.0 1.4E-45 0 307.8 12.1 148 14-161 1-148 (148) 4 TIGR00158 L9 ribosomal protein 99.9 3.5E-25 8.9E-30 179.4 10.1 147 14-160 1-151 (151) 5 KOG4607 consensus 99.9 4.4E-22 1.1E-26 159.8 9.5 155 7-163 42-198 (222) 6 pfam03948 Ribosomal_L9_C Ribos 99.9 9.4E-22 2.4E-26 157.8 8.3 86 75-160 1-86 (86) 7 pfam01281 Ribosomal_L9_N Ribos 99.7 3.7E-19 9.5E-24 141.4 2.8 48 14-61 1-48 (48) 8 pfam10045 DUF2280 Uncharacteri 37.2 21 0.00053 16.9 1.7 33 106-138 20-53 (104) 9 KOG3279 consensus 37.2 30 0.00077 15.9 2.6 68 41-108 197-277 (283) 10 pfam08461 HTH_12 Ribonuclease 36.2 30 0.00077 15.9 2.4 25 105-129 13-37 (66) 11 cd04491 SoSSB_OBF SoSSB_OBF: A 35.3 20 0.00052 16.9 1.4 19 25-43 48-66 (82) 12 TIGR02389 RNA_pol_rpoA2 DNA-di 31.8 39 0.00099 15.2 2.4 15 113-127 304-319 (397) 13 pfam06560 GPI Glucose-6-phosph 31.7 41 0.001 15.0 2.6 14 26-39 113-126 (181) 14 TIGR02169 SMC_prok_A chromosom 30.3 23 0.00059 16.6 1.0 51 85-135 112-169 (1202) 15 cd04606 CBS_pair_Mg_transporte 29.5 38 0.00097 15.2 2.0 17 97-113 91-107 (109) 16 cd04625 CBS_pair_12 The CBS do 28.6 35 0.00089 15.5 1.7 27 96-122 29-55 (112) 17 COG2047 Uncharacterized protei 28.0 35 0.00089 15.5 1.6 30 96-125 131-160 (258) 18 COG1438 ArgR Arginine represso 27.9 37 0.00094 15.3 1.7 34 107-142 22-55 (150) 19 TIGR01026 fliI_yscN ATPase Fli 26.3 25 0.00063 16.4 0.6 34 89-122 231-266 (455) 20 TIGR01088 aroQ 3-dehydroquinat 25.8 35 0.0009 15.4 1.3 28 99-126 17-50 
(144) 21 COG2239 MgtE Mg/Co/Ni transpor 25.5 53 0.0013 14.4 3.1 43 75-117 212-254 (451) 22 pfam07523 Big_3 Bacterial Ig-l 24.7 55 0.0014 14.3 4.4 41 116-157 26-68 (68) 23 PRK06461 single-stranded DNA-b 23.8 57 0.0014 14.2 3.2 28 15-42 54-81 (130) 24 PHA02119 hypothetical protein 23.7 48 0.0012 14.6 1.7 29 99-127 47-75 (87) 25 cd03064 TRX_Fd_NuoE TRX-like [ 23.7 51 0.0013 14.4 1.8 19 99-117 61-79 (80) 26 cd02980 TRX_Fd_family Thioredo 23.4 48 0.0012 14.6 1.6 22 96-117 55-76 (77) 27 cd03081 TRX_Fd_NuoE_FDH_gamma 23.4 50 0.0013 14.5 1.7 19 99-117 61-79 (80) 28 PRK02998 prsA peptidylprolyl i 22.8 59 0.0015 14.0 3.1 73 74-148 144-225 (283) 29 pfam03848 TehB Tellurite resis 22.7 60 0.0015 14.0 3.0 43 27-69 30-74 (192) 30 KOG1937 consensus 22.4 30 0.00077 15.9 0.5 22 20-47 72-93 (521) 31 PRK11207 tellurite resistance 22.2 61 0.0016 14.0 2.9 47 20-67 24-72 (198) 32 smart00116 CBS Domain in cysta 21.4 49 0.0012 14.6 1.3 20 96-115 29-48 (49) 33 TIGR00003 TIGR00003 copper ion 21.3 61 0.0016 14.0 1.8 18 106-123 49-66 (66) 34 TIGR01054 rgy reverse gyrase; 21.0 47 0.0012 14.7 1.2 18 22-39 1665-1682(1843) 35 cd06240 Peptidase_M14-like_1_3 20.3 67 0.0017 13.7 2.4 29 45-73 13-41 (273) 36 TIGR01734 D-ala-DACP-lig D-ala 20.2 44 0.0011 14.9 0.9 49 74-125 31-84 (513) No 1 >PRK00137 rplI 50S ribosomal protein L9; Reviewed Probab=100.00 E-value=0 Score=327.93 Aligned_cols=147 Identities=41% Similarity=0.712 Sum_probs=145.2 Q ss_pred EEEEEEEECCCCCCCCCEEEECCCCEEEEECCCCCEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 33465201117587783888277632110143787301220146888778999998422346789999887654321000 Q gi|254780331|r 14 MEVILLQNVTNLGPMGEVVKVKNGYARNYLLPKKKALRANKENKILFESQRSVLEAANLEKKAKYEGISKDLAKKNFSLI 93 (179) Q Consensus 14 mkVIL~~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~~A~~aT~~n~~~~~~~~~~~~~~~~~~~~~a~~~~~~L~~~~l~i~ 93 (179) |||||++||++||++||+|+|++|||||||||+|+|++||++|+++++.+++.++++.....+.|+.++++|++.+|+|. 
T Consensus 1 MkVIL~~dV~~lG~~GdvV~Vk~GYARNyLiP~~~A~~AT~~nl~~~~~~~~~~~~~~~~~~~~a~~~~~~L~~~~l~i~ 80 (147) T PRK00137 1 MKVILLEDVKNLGKLGDVVEVKDGYARNFLLPQGKAVRATKANLKQLEARRAELEAKAAEELAEAEALAEKLEGLTVTIA 80 (147) T ss_pred CEEEEECCCCCCCCCCCEEEECCHHHHHHCCCCCCHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEE T ss_conf 96999143203698899999875346673266670002857569999988999999999999999999998609869999 Q ss_pred CCCCCCCCCCCCCCHHHHHHHHHHCCCCCCHHHHCCCCCCCCEEEEEEEEEECCCEEEEEEEEEEEC Q ss_conf 2556543100341078999999860887686663026564320549999996397299999999725 Q gi|254780331|r 94 RAAGDTGYLYGSVSSRDIADLLIEEGFDVNRGQINLKSPIKSVGIHNIMISLHADVSTTITLNVARS 160 (179) Q Consensus 94 ~k~~e~gkLfGsVt~~dI~~~L~~~gi~I~k~~I~l~~pIk~~G~y~V~I~L~~~V~a~i~V~V~~~ 160 (179) +++|++|+||||||++||+++|.++|++|++++|.+++|||++|+|+|+|+||++|+++++|+|+++ T Consensus 81 ~k~~~~gkLfGSVt~~~I~~~l~~~gi~i~k~~I~l~~pIk~~G~~~V~i~l~~~v~~~i~v~V~~e 147 (147) T PRK00137 81 AKAGEDGKLFGSVTTKDIAEALKAAGIEIDKRKIRLPEPIKTLGEYEVEVKLHPEVTATVKVNVVAE 147 (147) T ss_pred EEECCCCEEECCCCHHHHHHHHHHCCCEECHHHEECCCCHHCCEEEEEEEEECCCEEEEEEEEEEEC T ss_conf 9706677050555889999999974974469994388702124778999996698399999999859 No 2 >CHL00160 rpl9 ribosomal protein L9; Provisional Probab=100.00 E-value=0 Score=318.62 Aligned_cols=150 Identities=33% Similarity=0.502 Sum_probs=144.9 Q ss_pred CCCEEEEEEEECCCCCCCCCEEEECCCCEEEEECCCCCEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-H Q ss_conf 430334652011175877838882776321101437873012201468887789999984223467899998876543-2 Q gi|254780331|r 11 KKIMEVILLQNVTNLGPMGEVVKVKNGYARNYLLPKKKALRANKENKILFESQRSVLEAANLEKKAKYEGISKDLAKK-N 89 (179) Q Consensus 11 ~~~mkVIL~~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~~A~~aT~~n~~~~~~~~~~~~~~~~~~~~~a~~~~~~L~~~-~ 89 (179) ++.|+|||++||++||++||+|+|+||||||||||+|+|++||++|+++++.+++..+++..+..+.|+.+++.|+++ . 
T Consensus 2 ~k~MkVILl~dV~~LGk~GdiV~Vk~GYARNfLiP~g~A~~at~~n~~~~~~~~~~~~~~~~~~~~~a~~~~~~l~~~~~ 81 (154) T CHL00160 2 KKKITVVLKENIQNLGKSGDVVKVASGYARNFLIPNKMAQVATQGILKQQKMYAAIKEEKLDEAKENAQKSAQLLEEIQK 81 (154) T ss_pred CCCEEEEECCCCCCCCCCCCEEEECCHHHHHHCCCCCCHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCE T ss_conf 86529998311012798899899876325465053685344799999999999999999999999999999998607866 Q ss_pred HHHCCCCCCCCCCCCCCCHHHHHHHHHH-CCCCCCHHHHCCCCCCCCEEEEEEEEEECCCEEEEEEEEEEECH Q ss_conf 1000255654310034107899999986-08876866630265643205499999963972999999997251 Q gi|254780331|r 90 FSLIRAAGDTGYLYGSVSSRDIADLLIE-EGFDVNRGQINLKSPIKSVGIHNIMISLHADVSTTITLNVARST 161 (179) Q Consensus 90 l~i~~k~~e~gkLfGsVt~~dI~~~L~~-~gi~I~k~~I~l~~pIk~~G~y~V~I~L~~~V~a~i~V~V~~~~ 161 (179) ++|.+++|++|+||||||++||+++|.+ .|++|++++|.+|+ ||++|+|.|+|+||++|+++++|+|++++ T Consensus 82 ~~i~~k~g~~gkLfGSVt~~dI~~~l~~~~~~~idk~~I~l~~-Ik~~G~~~V~v~L~~~V~a~l~v~Vv~Es 153 (154) T CHL00160 82 FSVKKKTGDGNQIFGSVTEKEISQIIKNTTNEKIDKQNIYLPE-IKTIGIYNLEIKLTSDVTANIKLQVLPES 153 (154) T ss_pred EEEEEEECCCCCEECCCCHHHHHHHHHHHCCCCCCHHHCCCCC-CCCCEEEEEEEEECCCEEEEEEEEEEECC T ss_conf 9999995899824888698999999998629921778871515-10168479999946983999999999468 No 3 >COG0359 RplI Ribosomal protein L9 [Translation, ribosomal structure and biogenesis] Probab=100.00 E-value=1.4e-45 Score=307.78 Aligned_cols=148 Identities=41% Similarity=0.681 Sum_probs=144.9 Q ss_pred EEEEEEEECCCCCCCCCEEEECCCCEEEEECCCCCEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC Q ss_conf 33465201117587783888277632110143787301220146888778999998422346789999887654321000 Q gi|254780331|r 14 MEVILLQNVTNLGPMGEVVKVKNGYARNYLLPKKKALRANKENKILFESQRSVLEAANLEKKAKYEGISKDLAKKNFSLI 93 (179) Q Consensus 14 mkVIL~~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~~A~~aT~~n~~~~~~~~~~~~~~~~~~~~~a~~~~~~L~~~~l~i~ 93 (179) |||||++||.+||+.||+|+|+||||||||||+|+|++||+.|++.++.+++..+++..+.+++|+.++..|++.++.|. 
T Consensus 1 MkVILl~dV~~lGk~Gdiv~VkdGYarNfLiPkglAv~At~~n~~~~~~~r~~~e~~~~~~~~~a~~lk~~Le~~~~~i~ 80 (148) T COG0359 1 MKVILLEDVKGLGKKGDIVEVKDGYARNFLIPKGLAVPATKGNLKLLEARRAKLEKKAAEELAEAEALKEKLEGKTVEIA 80 (148) T ss_pred CEEEEECCHHHCCCCCCEEEECCHHHHHHHCCCCCHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCEEEEE T ss_conf 93899340132587888899646126464030463000799899999999999999888999999999988507609999 Q ss_pred CCCCCCCCCCCCCCHHHHHHHHHHCCCCCCHHHHCCCCCCCCEEEEEEEEEECCCEEEEEEEEEEECH Q ss_conf 25565431003410789999998608876866630265643205499999963972999999997251 Q gi|254780331|r 94 RAAGDTGYLYGSVSSRDIADLLIEEGFDVNRGQINLKSPIKSVGIHNIMISLHADVSTTITLNVARST 161 (179) Q Consensus 94 ~k~~e~gkLfGsVt~~dI~~~L~~~gi~I~k~~I~l~~pIk~~G~y~V~I~L~~~V~a~i~V~V~~~~ 161 (179) +++|++|+||||||++||++++.++|+.|+++.|.+|++|+++|.|+|+++||++|+++++|.|.++. T Consensus 81 ~kag~~GklfGSVt~~dIa~~l~~~g~~idk~~i~l~~~ik~~G~~~V~vkLh~eV~a~v~v~V~~~~ 148 (148) T COG0359 81 VKAGEDGKLFGSVTSKDIAEALKAAGFKLDKRKIRLPNGIKTLGEHEVEVKLHEEVTATVKVNVVAEN 148 (148) T ss_pred EECCCCCCEECCCCHHHHHHHHHHCCCCCCHHEEECCCHHHHCCEEEEEEEECCCEEEEEEEEEEECC T ss_conf 77188886620423899999999708875031556674265361068899825863999999997279 No 4 >TIGR00158 L9 ribosomal protein L9; InterPro: IPR000244 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites , . 
About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities. The crystal structure of Bacillus stearothermophilus L9 shows that the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix.
The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends. ; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome. Probab=99.92 E-value=3.5e-25 Score=179.36 Aligned_cols=147 Identities=36% Similarity=0.623 Sum_probs=138.8 Q ss_pred EEEEEEEECCCCCCCCCEEEECCCCEEEEECCCCCEECCCCHHHHHHHHHHHHHHHHHHHH-HHHHHHHHHHHHHHHHHH Q ss_conf 3346520111758778388827763211014378730122014688877899999842234-678999988765432100 Q gi|254780331|r 14 MEVILLQNVTNLGPMGEVVKVKNGYARNYLLPKKKALRANKENKILFESQRSVLEAANLEK-KAKYEGISKDLAKKNFSL 92 (179) Q Consensus 14 mkVIL~~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~~A~~aT~~n~~~~~~~~~~~~~~~~~~-~~~a~~~~~~L~~~~l~i 92 (179) |+|+|++|+.++|+.||+++|++|||||||+|+++|+++|+.++..++.++.....+.... .+.+..+...++...+.+ T Consensus 1 ~~~~~~~~~~~~g~~g~~~~~~~g~~~~~l~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80 (151) T TIGR00158 1 MKVILLEDVKNLGKRGDVVEVKDGYARNFLIPKGLAVPATKKNIEKFEARRKKLEEKEAANLKAAAARLKEVLELGTLTI 80 (151) T ss_pred CCCHHHHHHHHCCCCCCEEECCCCCHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEEE T ss_conf 91001213431253245111144410110011331001105678877788888888888888988777877664431135 Q ss_pred CCCCCCCCCCCCCCCHHHHHHHHHH--CCCCCCHHHHCCCC-CCCCEEEEEEEEEECCCEEEEEEEEEEEC Q ss_conf 0255654310034107899999986--08876866630265-64320549999996397299999999725 Q gi|254780331|r 93 IRAAGDTGYLYGSVSSRDIADLLIE--EGFDVNRGQINLKS-PIKSVGIHNIMISLHADVSTTITLNVARS 160 (179) Q Consensus 93 ~~k~~e~gkLfGsVt~~dI~~~L~~--~gi~I~k~~I~l~~-pIk~~G~y~V~I~L~~~V~a~i~V~V~~~ 160 (179) ..+.+++|++||+|+..+|++.+.. .++.++++.+.++. 
+++.+|.|.+.+++|+++.+.+.+.|.++ T Consensus 81 ~~~~~~~g~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~ 151 (151) T TIGR00158 81 SLKSGDGGKLFGSITTKEIADALKADHAGLDLDKKKIELPDGVLRTFGDYEVTLKLHPEVTAVLKVEVVPE 151 (151) T ss_pred EECCCCCCCCCHHHHHHHHHHHHHHHHCCCCCHHCCCCCCCCCCCCCCCEEEEEEECCCEEEEEEEEEECC T ss_conf 42037654310002357889988875405410000220466630014631467876365256788776349 No 5 >KOG4607 consensus Probab=99.87 E-value=4.4e-22 Score=159.82 Aligned_cols=155 Identities=25% Similarity=0.294 Sum_probs=126.8 Q ss_pred CCCCCCCEEEEEEEECCCCCCCCCEEEECCCCEEEEECCCCCEECCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 21024303346520111758778388827763211014378730122014688877899999842234678999988765 Q gi|254780331|r 7 NKKGKKIMEVILLQNVTNLGPMGEVVKVKNGYARNYLLPKKKALRANKENKILFESQRSVLEAANLEKKAKYEGISKDLA 86 (179) Q Consensus 7 ~kk~~~~mkVIL~~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~~A~~aT~~n~~~~~~~~~~~~~~~~~~~~~a~~~~~~L~ 86 (179) .++.+...+|||++||++||+.||+|+|++||+||+|+|+|+|+|+||.+.+.+..+.++......+...+++.++ -|+ T Consensus 42 ~~k~k~~levIL~~~Ve~lG~qGdvVsVk~g~~RN~Llp~glAvy~tp~~~~~~k~~~~e~~~~k~~vk~e~k~V~-~lq 120 (222) T KOG4607 42 QKKPKPNLEVILKTDVEKLGKQGDVVSVKRGYFRNFLLPKGLAVYNTPLNLKKYKLREQEEEAEKIRVKEEAKVVA-VLQ 120 (222) T ss_pred HCCCCCCEEEEEEHHHHHHCCCCCEEEEECCHHHHHCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCHHHHHHHH-HHH T ss_conf 4057765154100013441646757986121233211654530017865689999998787766441577888888-888 Q ss_pred HHHHHHCCCCCCCCCC-CCCCCHHHHHHHH-HHCCCCCCHHHHCCCCCCCCEEEEEEEEEECCCEEEEEEEEEEECHHH Q ss_conf 4321000255654310-0341078999999-860887686663026564320549999996397299999999725145 Q gi|254780331|r 87 KKNFSLIRAAGDTGYL-YGSVSSRDIADLL-IEEGFDVNRGQINLKSPIKSVGIHNIMISLHADVSTTITLNVARSTEE 163 (179) Q Consensus 87 ~~~l~i~~k~~e~gkL-fGsVt~~dI~~~L-~~~gi~I~k~~I~l~~pIk~~G~y~V~I~L~~~V~a~i~V~V~~~~~~ 163 (179) ...+.+.++-+..+.| +++|+.+...... ++..+.++++.|..|. 
++.-|+|-..|+++++.++.++..|...+-+ T Consensus 121 t~v~~~~~~k~~kw~l~~~~V~~~l~~gv~~~~~t~~l~k~~vs~P~-~k~e~~~~~~V~in~~~~vr~~~~v~~~e~d 198 (222) T KOG4607 121 TVVLFKVMNKGGKWKLNPNLVKASLRKGVIVAELTIKLDKELVSGPI-TKEEGEYICEVKINPDVTVRVKIRVTHNEYD 198 (222) T ss_pred HHHHHHEECCCCCEEECHHHHHHHHHCCEEECCCCCCCCCCCCCCCC-CCCCCEEEEEEEECCCCEEEEEEEEECCCCC T ss_conf 66554202147862515878789874461761000357632257975-4422328999997876268763266405447 No 6 >pfam03948 Ribosomal_L9_C Ribosomal protein L9, C-terminal domain. Probab=99.86 E-value=9.4e-22 Score=157.75 Aligned_cols=86 Identities=33% Similarity=0.606 Sum_probs=83.1 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCHHHHCCCCCCCCEEEEEEEEEECCCEEEEEE Q ss_conf 67899998876543210002556543100341078999999860887686663026564320549999996397299999 Q gi|254780331|r 75 KAKYEGISKDLAKKNFSLIRAAGDTGYLYGSVSSRDIADLLIEEGFDVNRGQINLKSPIKSVGIHNIMISLHADVSTTIT 154 (179) Q Consensus 75 ~~~a~~~~~~L~~~~l~i~~k~~e~gkLfGsVt~~dI~~~L~~~gi~I~k~~I~l~~pIk~~G~y~V~I~L~~~V~a~i~ 154 (179) .++|++++++|++..|+|.++++++|+||||||++||++.|.++|+.|++++|.+++|||++|.|.|+|+||++|+|+++ T Consensus 1 i~~A~~l~~~l~~~~l~i~~~~~e~g~LfGSVt~~dI~~~l~~~g~~i~k~~I~l~~~IK~iG~~~V~I~Lh~~V~~~i~ 80 (86) T pfam03948 1 LAEAEALAEKLEGLTVTIKAKAGEDGKLFGSVTTKDIAEALKAQGIEIDKKKIELPEPIKTLGEYEVEVKLHPDVTATIK 80 (86) T ss_pred CHHHHHHHHHHCCCEEEEEEEECCCCEEECCCCHHHHHHHHHHCCCCCCHHHEECCCCCCCCEEEEEEEEECCCEEEEEE T ss_conf 97799999986598899999968998456135889999999977994158887659840044889999996599799999 Q ss_pred EEEEEC Q ss_conf 999725 Q gi|254780331|r 155 LNVARS 160 (179) Q Consensus 155 V~V~~~ 160 (179) |+|+++ T Consensus 81 i~V~~e 86 (86) T pfam03948 81 VEVVAE 86 (86) T ss_pred EEEEEC T ss_conf 999859 No 7 >pfam01281 Ribosomal_L9_N Ribosomal protein L9, N-terminal domain. 
Probab=99.75 E-value=3.7e-19 Score=141.37 Aligned_cols=48 Identities=54% Similarity=0.809 Sum_probs=46.3 Q ss_pred EEEEEEEECCCCCCCCCEEEECCCCEEEEECCCCCEECCCCHHHHHHH Q ss_conf 334652011175877838882776321101437873012201468887 Q gi|254780331|r 14 MEVILLQNVTNLGPMGEVVKVKNGYARNYLLPKKKALRANKENKILFE 61 (179) Q Consensus 14 mkVIL~~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~~A~~aT~~n~~~~~ 61 (179) |+|||++||++||++||+|+|++|||||||||+|+|++||++|+++++ T Consensus 1 mkViL~~dV~~lG~~Gdvv~V~~GyarN~Lip~~~A~~at~~~l~~~~ 48 (48) T pfam01281 1 MKVILLEDVEGLGKKGDIVEVKPGYARNFLLPKGLAVYATPENLKELE 48 (48) T ss_pred CEEEEECCCCCCCCCCCEEEECCCEEHHHHCCCCCCHHCCHHHHHHCC T ss_conf 989992142020766889998584114452567961337999998529 No 8 >pfam10045 DUF2280 Uncharacterized conserved protein (DUF2280). Members of this family of hypothetical bacterial proteins have no known function. Probab=37.20 E-value=21 Score=16.90 Aligned_cols=33 Identities=30% Similarity=0.528 Sum_probs=24.9 Q ss_pred CCHHHHHHHHHHC-CCCCCHHHHCCCCCCCCEEE Q ss_conf 1078999999860-88768666302656432054 Q gi|254780331|r 106 VSSRDIADLLIEE-GFDVNRGQINLKSPIKSVGI 138 (179) Q Consensus 106 Vt~~dI~~~L~~~-gi~I~k~~I~l~~pIk~~G~ 138 (179) =||.++++++++. |++|++.+++--+|-|.-|. T Consensus 20 dTPs~va~aVk~EFgi~vsrQqve~yDPTK~aG~ 53 (104) T pfam10045 20 DTPSEVAEAVKEEFGIEVTRQQVESYDPTKAAGK 53 (104) T ss_pred CCHHHHHHHHHHHHCCEECHHHHHHCCCHHHHHH T ss_conf 8899999999999684426999875295677767 No 9 >KOG3279 consensus Probab=37.18 E-value=30 Score=15.88 Aligned_cols=68 Identities=12% Similarity=0.070 Sum_probs=29.0 Q ss_pred EEECCCCCEECCCCHHHHHHHHHHHHHHHHHHH-------------HHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCC Q ss_conf 101437873012201468887789999984223-------------4678999988765432100025565431003410 Q gi|254780331|r 41 NYLLPKKKALRANKENKILFESQRSVLEAANLE-------------KKAKYEGISKDLAKKNFSLIRAAGDTGYLYGSVS 107 (179) Q Consensus 41 N~LiP~~~A~~aT~~n~~~~~~~~~~~~~~~~~-------------~~~~a~~~~~~L~~~~l~i~~k~~e~gkLfGsVt 107 (179) .|+||-.-|.-..-.-...++.++...+...-. 
..+.-.+.++.|+.-.......++..+++||.++ T Consensus 197 ~F~IPEeEAEW~GLtL~EAirKQ~~lEe~~~PvPLk~~f~~~LieqLrq~~~~~~Q~Le~Pea~~K~eS~~~~~~~~k~n 276 (283) T KOG3279 197 RFLIPEEEAEWYGLTLLEAIRKQKQLEEAEKPVPLKLEFRGKLIEQLRQAGISEAQKLEKPEALTKLESSPSSSWLSKIN 276 (283) T ss_pred HHCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHCCCCCCHHHHHHCC T ss_conf 75286666457510399999998877752588527999999999999860157887642832444413575403665048 Q ss_pred H Q ss_conf 7 Q gi|254780331|r 108 S 108 (179) Q Consensus 108 ~ 108 (179) + T Consensus 277 p 277 (283) T KOG3279 277 P 277 (283) T ss_pred C T ss_conf 7 No 10 >pfam08461 HTH_12 Ribonuclease R winged-helix domain. This domain is found at the amino terminus of Ribonuclease R and a number of presumed transcriptional regulatory proteins from archaebacteria. Probab=36.16 E-value=30 Score=15.88 Aligned_cols=25 Identities=20% Similarity=0.421 Sum_probs=21.8 Q ss_pred CCCHHHHHHHHHHCCCCCCHHHHCC Q ss_conf 4107899999986088768666302 Q gi|254780331|r 105 SVSSRDIADLLIEEGFDVNRGQINL 129 (179) Q Consensus 105 sVt~~dI~~~L~~~gi~I~k~~I~l 129 (179) +|+++.|++.|+..|++|..+.+.. T Consensus 13 Pigak~ia~~L~~rG~~i~eRaVRY 37 (66) T pfam08461 13 PIGAKIIAEELNLRGYDIGERAVRY 37 (66) T ss_pred CCCHHHHHHHHHHHCCCCCHHHHHH T ss_conf 8649999999998285840899999 No 11 >cd04491 SoSSB_OBF SoSSB_OBF: A subfamily of OB folds similar to the OB fold of the crenarchaeote Sulfolobus solfataricus single-stranded (ss) DNA-binding protein (SSoSSB). SSoSSB has a single OB fold, and it physically and functionally interacts with RNA polymerase. In vitro, SSoSSB can substitute for the basal transcription factor TBP, stimulating transcription from promoters under conditions in which TBP is limiting, and supporting transcription when TBP is absent. SSoSSB selectively melts the duplex DNA of promoter sequences. It also relieves transcriptional repression by the chromatin Alba. 
In addition, SSoSSB activates reverse gyrase activity, which involves DNA binding, DNA cleavage, strand passage and ligation. SSoSSB stimulates all these steps in the presence of the chromatin protein, Sul7d. SSoSSB antagonizes the inhibitory effect of Sul7d on reverse gyrase supercoiling activity. It also physically and functionally interacts with Mini-chromosome Maintenance (MCM), stimulating Probab=35.28 E-value=20 Score=16.94 Aligned_cols=19 Identities=32% Similarity=0.557 Sum_probs=16.2 Q ss_pred CCCCCCEEEECCCCEEEEE Q ss_conf 5877838882776321101 Q gi|254780331|r 25 LGPMGEVVKVKNGYARNYL 43 (179) Q Consensus 25 lG~~Gdiv~Vk~GyaRN~L 43 (179) -=+.||+|.+.+||+|+|- T Consensus 48 ~l~~Gd~v~i~~~~v~~~~ 66 (82) T cd04491 48 DLEPGDVVRIENAYVREFN 66 (82) T ss_pred CCCCCCEEEEEEEEEEEEC T ss_conf 5589999999689998889 No 12 >TIGR02389 RNA_pol_rpoA2 DNA-directed RNA polymerase, subunit A''; InterPro: IPR012757 DNA-directed RNA polymerases (EC 2.7.7.6; also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.
RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I, located in the nucleoli, synthesises precursors of most ribosomal RNAs; RNA polymerase II, found in the nucleoplasm, synthesises mRNA precursors; and RNA polymerase III, also found in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family consists of the archaeal A'' subunit of the DNA-directed RNA polymerase. The example from Methanococcus jannaschii contains an intein.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006350 transcription. Probab=31.79 E-value=39 Score=15.20 Aligned_cols=15 Identities=33% Similarity=0.545 Sum_probs=6.3 Q ss_pred HHHHHCCCC-CCHHHH Q ss_conf 999860887-686663 Q gi|254780331|r 113 DLLIEEGFD-VNRGQI 127 (179) Q Consensus 113 ~~L~~~gi~-I~k~~I 127 (179) .-|.+||++ ||-|.+ T Consensus 304 ~tL~EQGL~dVDiRHl 319 (397) T TIGR02389 304 RTLEEQGLDDVDIRHL 319 (397) T ss_pred HHHHHCCCCHHHHHHH T ss_conf 9986428860248989 No 13 >pfam06560 GPI Glucose-6-phosphate isomerase (GPI). This family consists of several bacterial and archaeal glucose-6-phosphate isomerase (GPI) proteins (EC:5.3.1.9).
Probab=31.65 E-value=41 Score=15.03 Aligned_cols=14 Identities=36% Similarity=0.287 Sum_probs=6.9 Q ss_pred CCCCCEEEECCCCE Q ss_conf 87783888277632 Q gi|254780331|r 26 GPMGEVVKVKNGYA 39 (179) Q Consensus 26 G~~Gdiv~Vk~Gya 39 (179) -..||+|-|.|||| T Consensus 113 ~~~G~~v~IPP~~a 126 (181) T pfam06560 113 MEKGTVVYVPPYYG 126 (181) T ss_pred ECCCCEEEECCCEE T ss_conf 54898899799815 No 14 >TIGR02169 SMC_prok_A chromosome segregation protein SMC; InterPro: IPR011891 The SMC (structural maintenance of chromosomes) family of proteins exists in virtually all organisms, including both bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms and form three types of heterodimer (SMC1SMC3, SMC2SMC4, SMC5SMC6), which are core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal (IPR003395 from INTERPRO) domains separated by a long (circa 100 nm or 900 residues) coiled-coil segment, in the centre of which is a globular 'hinge' domain, characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which by mutational studies has been shown to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and an LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases.
All SMC proteins appear to form dimers, either homodimers, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers are arranged in an antiparallel alignment. This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP-binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. The sequence homology within the carboxy-terminal domain is relatively high within the SMC1-SMC4 group, whereas SMC5 and SMC6 show some divergence in both of these sequences. SMCs share not only sequence similarity but also structural similarity with ABC proteins. SMC proteins function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by IPR011890 from INTERPRO. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent.
Probab=30.29 E-value=23 Score=16.62 Aligned_cols=51 Identities=20% Similarity=0.242 Sum_probs=23.8 Q ss_pred HHHHHHHHCCCCCCCCCCCCC-------CCHHHHHHHHHHCCCCCCHHHHCCCCCCCC Q ss_conf 654321000255654310034-------107899999986088768666302656432 Q gi|254780331|r 85 LAKKNFSLIRAAGDTGYLYGS-------VSSRDIADLLIEEGFDVNRGQINLKSPIKS 135 (179) Q Consensus 85 L~~~~l~i~~k~~e~gkLfGs-------Vt~~dI~~~L~~~gi~I~k~~I~l~~pIk~ 135 (179) ++..++...+++.++|+.|-+ +|..||.+.|...||.-+--+|.|=|-|+. T Consensus 112 vde~~v~Rr~kv~~~~~yySyY~lNG~~~~l~ei~d~L~~~gI~p~gYNvVlQGDvt~ 169 (1202) T TIGR02169 112 VDELEVSRRLKVTDDGKYYSYYYLNGKSVRLSEIHDFLAAAGIYPEGYNVVLQGDVTK 169 (1202) T ss_pred CCCEEEEEEEEECCCCCEEEEEEECCCCCCHHHHHHHHHHCCCCCCCCEEEEECCHHH T ss_conf 2435899888873798468888870820357668999986176889870674354123 No 15 >cd04606 CBS_pair_Mg_transporter This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain in the magnesium transporter, MgtE. MgtE and its homologs are found in eubacteria, archaebacteria, and eukaryota. Members of this family transport Mg2+ or other divalent cations into the cell via two highly conserved aspartates. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Probab=29.48 E-value=38 Score=15.25 Aligned_cols=17 Identities=29% Similarity=0.593 Sum_probs=7.7 Q ss_pred CCCCCCCCCCCHHHHHH Q ss_conf 65431003410789999 Q gi|254780331|r 97 GDTGYLYGSVSSRDIAD 113 (179) Q Consensus 97 ~e~gkLfGsVt~~dI~~ 113 (179) +++|+|-|-||..||.+ T Consensus 91 d~~~~lvGiIt~~Di~~ 107 (109) T cd04606 91 DEEGRLVGIITVDDVID 107 (109) T ss_pred CCCCEEEEEEEHHHHHH T ss_conf 88997999999689684 No 16 >cd04625 CBS_pair_12 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener Probab=28.55 E-value=35 Score=15.47 Aligned_cols=27 Identities=22% Similarity=0.394 Sum_probs=21.6 Q ss_pred CCCCCCCCCCCCHHHHHHHHHHCCCCC Q ss_conf 565431003410789999998608876 Q gi|254780331|r 96 AGDTGYLYGSVSSRDIADLLIEEGFDV 122 (179) Q Consensus 96 ~~e~gkLfGsVt~~dI~~~L~~~gi~I 122 (179) +.++|+|.|=||..||...+...|... 
T Consensus 29 V~~~g~lvGIiT~rDi~~~~~~~~~~~ 55 (112) T cd04625 29 VMERGELVGLLTFREVLQAMAQHGAGV 55 (112) T ss_pred EEECCEEEEEEEHHHHHHHHHHCCCCC T ss_conf 957999999998799999999709980 No 17 >COG2047 Uncharacterized protein (ATP-grasp superfamily) [General function prediction only] Probab=27.98 E-value=35 Score=15.48 Aligned_cols=30 Identities=23% Similarity=0.453 Sum_probs=24.5 Q ss_pred CCCCCCCCCCCCHHHHHHHHHHCCCCCCHH Q ss_conf 565431003410789999998608876866 Q gi|254780331|r 96 AGDTGYLYGSVSSRDIADLLIEEGFDVNRG 125 (179) Q Consensus 96 ~~e~gkLfGsVt~~dI~~~L~~~gi~I~k~ 125 (179) .-++-+.+|++|..++++.|+++|+...+. T Consensus 131 l~eep~VlGA~ts~eLi~~lke~gV~fr~~ 160 (258) T COG2047 131 LVEEPRVLGAVTSKELIEELKEHGVEFRSG 160 (258) T ss_pred CCCCCEEEEEECCHHHHHHHHHCCEEECCC T ss_conf 357763777408899999999729571358 No 18 >COG1438 ArgR Arginine repressor [Transcription] Probab=27.89 E-value=37 Score=15.34 Aligned_cols=34 Identities=24% Similarity=0.574 Sum_probs=16.5 Q ss_pred CHHHHHHHHHHCCCCCCHHHHCCCCCCCCEEEEEEE Q ss_conf 078999999860887686663026564320549999 Q gi|254780331|r 107 SSRDIADLLIEEGFDVNRGQINLKSPIKSVGIHNIM 142 (179) Q Consensus 107 t~~dI~~~L~~~gi~I~k~~I~l~~pIk~~G~y~V~ 142 (179) |+.+|++.|.+.|+++.-.. +.--||++|...|. T Consensus 22 TQ~Elv~~L~~~Gi~vTQaT--vSRDlkelglvKv~ 55 (150) T COG1438 22 TQEELVELLQEEGIEVTQAT--VSRDLKELGLVKVR 55 (150) T ss_pred CHHHHHHHHHHCCCEEEHHH--HHHHHHHCCCEEEC T ss_conf 89999999998297586398--78779985988933 No 19 >TIGR01026 fliI_yscN ATPase FliI/YscN family; InterPro: IPR005714 Proteins in this entry show extensive homology to the ATP synthase F1 beta subunit, and are involved in type III protein secretion. They fall into the two separate functional groups outlined below. The first group, exemplified by the Salmonella typhimurium FliI protein (P26465 from SWISSPROT), is needed for flagellar assembly. 
Most structural components of the bacterial flagellum are translocated through the central channel of the growing flagellar structure by the type III flagellar protein-export apparatus in an ATPase-driven manner, to be assembled at the growing end. FliI is the ATPase that couples ATP hydrolysis to the translocation reaction. The second group couples ATP hydrolysis to protein translocation in non-flagellar type III secretion systems. Often these systems are involved in virulence and pathogenicity. YscN (P40290 from SWISSPROT) from pathogenic Yersinia species, for example, energises the injection of antihost factors directly into eukaryotic cells, thus overcoming host defences.; GO: 0016887 ATPase activity, 0009058 biosynthetic process, 0015031 protein transport, 0005737 cytoplasm. Probab=26.27 E-value=25 Score=16.43 Aligned_cols=34 Identities=21% Similarity=0.316 Sum_probs=24.3 Q ss_pred HHHHCCCCCC--CCCCCCCCCHHHHHHHHHHCCCCC Q ss_conf 2100025565--431003410789999998608876 Q gi|254780331|r 89 NFSLIRAAGD--TGYLYGSVSSRDIADLLIEEGFDV 122 (179) Q Consensus 89 ~l~i~~k~~e--~gkLfGsVt~~dI~~~L~~~gi~I 122 (179) .+.+-.-+.+ ==|++|+.+..-||+++++||-+| T Consensus 231 SV~VVaTSD~SPl~R~~GAy~At~iAEYFrdqGk~V 266 (455) T TIGR01026 231 SVVVVATSDQSPLLRLKGAYVATAIAEYFRDQGKDV 266 (455) T ss_pred EEEEEECCCCCHHHHHHHHHEEHHHHHHHHHCCCEE T ss_conf 179983688638888732640025435465218705 No 20 >TIGR01088 aroQ 3-dehydroquinate dehydratase, type II; InterPro: IPR001874 3-dehydroquinate dehydratase (4.2.1.10 from EC), or dehydroquinase, catalyzes the conversion of 3-dehydroquinate into 3-dehydroshikimate. It is the third step in the shikimate pathway for the biosynthesis of aromatic amino acids from chorismate. Two classes of dehydroquinases exist, known as types I and II. Class-II enzymes are homododecameric enzymes of about 17 kDa.
They are found in some bacteria such as actinomycetales, and some fungi where they act in a catabolic pathway that allows the use of quinic acid as a carbon source.; GO: 0003855 3-dehydroquinate dehydratase activity, 0009073 aromatic amino acid family biosynthetic process. Probab=25.80 E-value=35 Score=15.44 Aligned_cols=28 Identities=25% Similarity=0.267 Sum_probs=19.4 Q ss_pred CCCCCCCCCHHHHHHHHHHC----C--CCCCHHH Q ss_conf 43100341078999999860----8--8768666 Q gi|254780331|r 99 TGYLYGSVSSRDIADLLIEE----G--FDVNRGQ 126 (179) Q Consensus 99 ~gkLfGsVt~~dI~~~L~~~----g--i~I~k~~ 126 (179) +-.+|||+|-.||.+.+++. + ++++-.| T Consensus 17 EP~~YG~~tle~i~~~~~~~a~~~~ld~e~~~fQ 50 (144) T TIGR01088 17 EPGVYGSQTLEEIEEILETFAAQLNLDVEVEFFQ 50 (144) T ss_pred CCCCCCCCCHHHHHHHHHHHHHHCCCEEEEEEEC T ss_conf 6532478687899999999998539827898730 No 21 >COG2239 MgtE Mg/Co/Ni transporter MgtE (contains CBS domain) [Inorganic ion transport and metabolism] Probab=25.49 E-value=53 Score=14.35 Aligned_cols=43 Identities=19% Similarity=0.321 Sum_probs=33.4 Q ss_pred HHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHH Q ss_conf 6789999887654321000255654310034107899999986 Q gi|254780331|r 75 KAKYEGISKDLAKKNFSLIRAAGDTGYLYGSVSSRDIADLLIE 117 (179) Q Consensus 75 ~~~a~~~~~~L~~~~l~i~~k~~e~gkLfGsVt~~dI~~~L~~ 117 (179) ....++.+..++...+..---+.++++|-|-||-.||++.+.+ T Consensus 212 ~~dqeevA~~~~~ydl~a~PVVd~~~~LiG~itiDDiidvi~e 254 (451) T COG2239 212 DDDQEEVARLFEKYDLLAVPVVDEDNRLIGIITIDDIIDVIEE 254 (451) T ss_pred CCCHHHHHHHHHHHCCEECCEECCCCCEEEEEEHHHHHHHHHH T ss_conf 5787999999998287015357789846325549999999999 No 22 >pfam07523 Big_3 Bacterial Ig-like domain (group 3). This family consists of bacterial domains with an Ig-like fold. Members of this family are found in a variety of bacterial surface proteins.
Probab=24.68 E-value=55 Score=14.26 Aligned_cols=41 Identities=22% Similarity=0.326 Sum_probs=31.2 Q ss_pred HHCCCCCCHHHHCCCC--CCCCEEEEEEEEEECCCEEEEEEEEE Q ss_conf 8608876866630265--64320549999996397299999999 Q gi|254780331|r 116 IEEGFDVNRGQINLKS--PIKSVGIHNIMISLHADVSTTITLNV 157 (179) Q Consensus 116 ~~~gi~I~k~~I~l~~--pIk~~G~y~V~I~L~~~V~a~i~V~V 157 (179) -+.|-.++...+.+.+ .-...|.|.|++.+ .+++.++.|.| T Consensus 26 d~~G~~v~~~dv~V~g~vdt~~~G~y~VTyty-~g~~~t~~VtV 68 (68) T pfam07523 26 DKDGKAVDFSDVTVSGTVDTTKAGTYEVTYTY-DGVSKTITVTV 68 (68) T ss_pred CCCCCCCCHHHCEEEEEECCCCCEEEEEEEEE-CCEEEEEEEEC T ss_conf 28999935548789847759997288999998-99899999989 No 23 >PRK06461 single-stranded DNA-binding protein; Reviewed Probab=23.79 E-value=57 Score=14.15 Aligned_cols=28 Identities=29% Similarity=0.432 Sum_probs=16.8 Q ss_pred EEEEEEECCCCCCCCCEEEECCCCEEEE Q ss_conf 3465201117587783888277632110 Q gi|254780331|r 15 EVILLQNVTNLGPMGEVVKVKNGYARNY 42 (179) Q Consensus 15 kVIL~~dv~~lG~~Gdiv~Vk~GyaRN~ 42 (179) .+.|--+-.+.=+.||+|.+.+||.+-| T Consensus 54 ~~tlWde~~~~i~~GD~V~I~nayv~~~ 81 (130) T PRK06461 54 KLTLWGDQAGTLKEGEVVKIENAWTTLY 81 (130) T ss_pred EEEEECCCCCCCCCCCEEEEECCEEEEE T ss_conf 9999456456468999999944798888 No 24 >PHA02119 hypothetical protein Probab=23.75 E-value=48 Score=14.63 Aligned_cols=29 Identities=21% Similarity=0.377 Sum_probs=23.0 Q ss_pred CCCCCCCCCHHHHHHHHHHCCCCCCHHHH Q ss_conf 43100341078999999860887686663 Q gi|254780331|r 99 TGYLYGSVSSRDIADLLIEEGFDVNRGQI 127 (179) Q Consensus 99 ~gkLfGsVt~~dI~~~L~~~gi~I~k~~I 127 (179) .|.-|-+|-++||+++|+..|.++.-... 
T Consensus 47 ~~~kfp~i~~~divdylr~lgy~~~~~s~ 75 (87) T PHA02119 47 DVAKFPAIMPKDIVDYLRSLGYDAKSDSF 75 (87) T ss_pred CCCCCCCCCCHHHHHHHHHCCCHHCCCCC T ss_conf 04447754617799999981632202000 No 25 >cd03064 TRX_Fd_NuoE TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily; Nuo, also called respiratory chain Complex 1, is the entry point for electrons into the respiratory chains of bacteria and the mitochondria of eukaryotes. It is a multisubunit complex with at least 14 core subunits. It catalyzes the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane, providing the proton motive force required for energy-consuming processes. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in NuoE core subunit, also called the 24 kD subunit of Complex 1. This subfamily also include formate dehydrogenases, NiFe hydrogenases and NAD-reducing hydrogenases, that contain a NuoE domain. A subset of these proteins contain both NuoE and NuoF in a single chain. NuoF, also called the 51 kD subunit of Complex 1, contains one [4Fe-4S] clu Probab=23.66 E-value=51 Score=14.43 Aligned_cols=19 Identities=16% Similarity=0.440 Sum_probs=15.1 Q ss_pred CCCCCCCCCHHHHHHHHHH Q ss_conf 4310034107899999986 Q gi|254780331|r 99 TGYLYGSVSSRDIADLLIE 117 (179) Q Consensus 99 ~gkLfGsVt~~dI~~~L~~ 117 (179) ||++|+.+|+.++.+.|.+ T Consensus 61 n~~~~~~vt~e~v~~ii~~ 79 (80) T cd03064 61 NDDVYGRLTPEKVDAILEA 79 (80) T ss_pred CCEEECCCCHHHHHHHHHH T ss_conf 9998778899999999973 No 26 >cd02980 TRX_Fd_family Thioredoxin (TRX)-like [2Fe-2S] Ferredoxin (Fd) family; composed of [2Fe-2S] Fds with a TRX fold (TRX-like Fds) and proteins containing domains similar to TRX-like Fd including formate dehydrogenases, NAD-reducing hydrogenases and the subunit E of NADH:ubiquinone oxidoreductase (NuoE). 
TRX-like Fds are soluble low-potential electron carriers containing a single [2Fe-2S] cluster. The exact role of TRX-like Fd is still unclear. It has been suggested that it may be involved in nitrogen fixation. Its homologous domains in large redox enzymes (such as Nuo and hydrogenases) function as electron carriers. Probab=23.42 E-value=48 Score=14.63 Aligned_cols=22 Identities=27% Similarity=0.548 Sum_probs=17.8 Q ss_pred CCCCCCCCCCCCHHHHHHHHHH Q ss_conf 5654310034107899999986 Q gi|254780331|r 96 AGDTGYLYGSVSSRDIADLLIE 117 (179) Q Consensus 96 ~~e~gkLfGsVt~~dI~~~L~~ 117 (179) +.++|..||-+|+.++.+.+.+ T Consensus 55 v~p~~~~y~~vt~~~v~~iv~~ 76 (77) T cd02980 55 VYPDGVWYGRVTPEDVEEIVEE 76 (77) T ss_pred EECCCEEECCCCHHHHHHHHHC T ss_conf 9478727858999999999971 No 27 >cd03081 TRX_Fd_NuoE_FDH_gamma TRX-like [2Fe-2S] Ferredoxin (Fd) family, NADH:ubiquinone oxidoreductase (Nuo) subunit E subfamily, NAD-dependent formate dehydrogenase (FDH) gamma subunit; composed of proteins similar to the gamma subunit of NAD-linked FDH of Ralstonia eutropha, a soluble enzyme that catalyzes the irreversible oxidation of formate to carbon dioxide accompanied by the reduction of NAD+ to NADH. FDH is a heteromeric enzyme composed of four nonidentical subunits (alpha, beta, gamma and delta). The FDH gamma subunit is closely related to NuoE, which is part of a multisubunit complex (Nuo) catalyzing the electron transfer of NADH to quinone coupled with the transfer of protons across the membrane. Electrons are transferred from NADH to quinone through a chain of iron-sulfur clusters in Nuo, including the [2Fe-2S] cluster present in NuoE. 
Similarly, the FDH gamma subunit is hypothesized to be involved in an electron transport chain involving other FDH subunits, upon the oxidat Probab=23.37 E-value=50 Score=14.48 Aligned_cols=19 Identities=32% Similarity=0.462 Sum_probs=14.4 Q ss_pred CCCCCCCCCHHHHHHHHHH Q ss_conf 4310034107899999986 Q gi|254780331|r 99 TGYLYGSVSSRDIADLLIE 117 (179) Q Consensus 99 ~gkLfGsVt~~dI~~~L~~ 117 (179) ||++||.+|+..+.+.|.+ T Consensus 61 n~~~y~~lt~ek~~~il~~ 79 (80) T cd03081 61 DGEVHGRVDPEKFDALLAE 79 (80) T ss_pred CCEEECCCCHHHHHHHHHC T ss_conf 9998568899999999970 No 28 >PRK02998 prsA peptidylprolyl isomerase; Reviewed Probab=22.83 E-value=59 Score=14.04 Aligned_cols=73 Identities=15% Similarity=0.306 Sum_probs=42.6 Q ss_pred HHHHHHHHHHHHHHH-HH-------HHCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCHHHHCCCCCCC-CEEEEEEEEE Q ss_conf 467899998876543-21-------00025565431003410789999998608876866630265643-2054999999 Q gi|254780331|r 74 KKAKYEGISKDLAKK-NF-------SLIRAAGDTGYLYGSVSSRDIADLLIEEGFDVNRGQINLKSPIK-SVGIHNIMIS 144 (179) Q Consensus 74 ~~~~a~~~~~~L~~~-~l-------~i~~k~~e~gkLfGsVt~~dI~~~L~~~gi~I~k~~I~l~~pIk-~~G~y~V~I~ 144 (179) ..+.|+.+.+.|+.. .| ....-+..+|--.|.+++.+.+..+.+.-+.++...| .+||+ .+|-|.|.+. T Consensus 144 ~e~~A~~v~~~L~~G~dF~~lAk~yS~D~~s~~~GG~Lg~~~~g~~~~~f~~Aaf~L~~G~v--S~PVkt~~GyHIIkv~ 221 (283) T PRK02998 144 DEKTAKEVKEKVNNGEDFAALAKQYSEDTGSKEQGGEISGFAPGQTVKEFEEAAYKLDAGQV--SEPVKTTYGYHIIKVT 221 (283) T ss_pred CHHHHHHHHHHHHCCCCHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHCCCCCCC--CCCEEECCEEEEEEEE T ss_conf 89999999999877998999999958996644358866767999807899999975999994--8877878867999980 Q ss_pred ECCC Q ss_conf 6397 Q gi|254780331|r 145 LHAD 148 (179) Q Consensus 145 L~~~ 148 (179) =.++ T Consensus 222 dk~~ 225 (283) T PRK02998 222 DKKE 225 (283) T ss_pred CCCC T ss_conf 1688 No 29 >pfam03848 TehB Tellurite resistance protein TehB. 
Probab=22.70 E-value=60 Score=14.02 Aligned_cols=43 Identities=19% Similarity=0.222 Sum_probs=32.6 Q ss_pred CCCCEEEECCCCEEE--EECCCCCEECCCCHHHHHHHHHHHHHHH Q ss_conf 778388827763211--0143787301220146888778999998 Q gi|254780331|r 27 PMGEVVKVKNGYARN--YLLPKKKALRANKENKILFESQRSVLEA 69 (179) Q Consensus 27 ~~Gdiv~Vk~GyaRN--~LiP~~~A~~aT~~n~~~~~~~~~~~~~ 69 (179) ..|.+.++--|-+|| ||--+|+.+-|..-|-..++..+...++ T Consensus 30 ~pgk~LDlgcG~GRNslyLa~~G~~VtavD~n~~aL~~l~~ia~~ 74 (192) T pfam03848 30 KPGKALDLGCGQGRNSLFLSLLGYDVTAVDHNENSIANLQDIKEK 74 (192) T ss_pred CCCCEEEECCCCCHHHHHHHHCCCEEEEEECCHHHHHHHHHHHHH T ss_conf 997466604789731899986899179997999999999999997 No 30 >KOG1937 consensus Probab=22.41 E-value=30 Score=15.88 Aligned_cols=22 Identities=36% Similarity=0.765 Sum_probs=18.0 Q ss_pred EECCCCCCCCCEEEECCCCEEEEECCCC Q ss_conf 0111758778388827763211014378 Q gi|254780331|r 20 QNVTNLGPMGEVVKVKNGYARNYLLPKK 47 (179) Q Consensus 20 ~dv~~lG~~Gdiv~Vk~GyaRN~LiP~~ 47 (179) +.+.++|..||+ || .|||.|+- T Consensus 72 q~ckdlgyrgD~-----gy-qtfLypn~ 93 (521) T KOG1937 72 QYCKDLGYRGDT-----GY-QTFLYPNI 93 (521) T ss_pred HHHHHCCCCCCC-----CH-HHEECCCC T ss_conf 999874987643-----20-22014885 No 31 >PRK11207 tellurite resistance protein TehB; Provisional Probab=22.22 E-value=61 Score=13.96 Aligned_cols=47 Identities=19% Similarity=0.280 Sum_probs=34.9 Q ss_pred EECCCCCCCCCEEEECCCCEEE--EECCCCCEECCCCHHHHHHHHHHHHH Q ss_conf 0111758778388827763211--01437873012201468887789999 Q gi|254780331|r 20 QNVTNLGPMGEVVKVKNGYARN--YLLPKKKALRANKENKILFESQRSVL 67 (179) Q Consensus 20 ~dv~~lG~~Gdiv~Vk~GyaRN--~LiP~~~A~~aT~~n~~~~~~~~~~~ 67 (179) +-++.+ ..|.+.++-.|.+|| ||-.+|+-+-|..-+-..++..++.. 
T Consensus 24 ~~~~~~-~~g~~LDlgcG~Grna~~La~~G~~VtavD~s~~al~~~~~~a 72 (198) T PRK11207 24 EAVKVV-KPGRTLDLGCGNGRNSLYLAANGYDVTAWDKNPMSIANLERIK 72 (198) T ss_pred HHHCCC-CCCCEEEECCCCCHHHHHHHHCCCEEEEEECCHHHHHHHHHHH T ss_conf 873358-9974777247887869999868985999979999999999999 No 32 >smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease. Probab=21.44 E-value=49 Score=14.58 Aligned_cols=20 Identities=40% Similarity=0.599 Sum_probs=17.1 Q ss_pred CCCCCCCCCCCCHHHHHHHH Q ss_conf 56543100341078999999 Q gi|254780331|r 96 AGDTGYLYGSVSSRDIADLL 115 (179) Q Consensus 96 ~~e~gkLfGsVt~~dI~~~L 115 (179) ++++|+|-|=||..||..++ T Consensus 29 Vd~~~~lvGiit~~Dil~~l 48 (49) T smart00116 29 VDEEGRLVGIVTRRDIIKAL 48 (49) T ss_pred ECCCCCEEEEEEHHHHHHHH T ss_conf 98999199998879999864 No 33 >TIGR00003 TIGR00003 copper ion binding protein; InterPro: IPR006122 Proteins that transport heavy metals in micro-organisms and eukaryotes share similarities in their sequences and structures. These proteins provide an important focus for research, some being involved in bacterial resistance to toxic metals, such as lead and cadmium, while others are involved in inherited human syndromes, such as Wilson's and Menke's diseases. A conserved 30-residue domain has been found in a number of these heavy metal transport or detoxification proteins. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding. This sub-domain is found in copper-binding proteins.; GO: 0005507 copper ion binding, 0006825 copper ion transport.
Probab=21.26 E-value=61 Score=13.96 Aligned_cols=18 Identities=28% Similarity=0.818 Sum_probs=15.6 Q ss_pred CCHHHHHHHHHHCCCCCC Q ss_conf 107899999986088768 Q gi|254780331|r 106 VSSRDIADLLIEEGFDVN 123 (179) Q Consensus 106 Vt~~dI~~~L~~~gi~I~ 123 (179) ++..+|.++|.+.|+++. T Consensus 49 v~~~~I~~Ai~d~GY~~~ 66 (66) T TIGR00003 49 VSAKEIKEAILDAGYEVE 66 (66) T ss_pred CCHHHHHHHHHHCCCCCC T ss_conf 446778889873665369 No 34 >TIGR01054 rgy reverse gyrase; InterPro: IPR005736 DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (5.99.1.2 from EC; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (5.99.1.3 from EC; topoisomerases II, IV and VI) break double-strand DNA. Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA. Reverse gyrase is a type IA topoisomerase that is unique among these enzymes in its requirement for ATP. Reverse gyrase is a hyperthermophile-specific enzyme that acts as a renaturase by positively supercoiling DNA, and by annealing complementary single-strand circles.
Hyperthermophilic organisms must protect themselves against heat-induced degradation, and reverse gyrase acts to reduce the rate of double-strand DNA breakage, a function that does not require ATP hydrolysis and which is independent of its positive supercoiling abilities. Reverse gyrase achieves this by recognising nicked DNA and recruiting a protein coat to the site of damage. More information about this protein can be found at Protein of the Month: DNA Topoisomerase.; GO: 0003677 DNA binding, 0003916 DNA topoisomerase activity, 0006265 DNA topological change, 0006268 DNA unwinding during replication, 0005694 chromosome. Probab=21.04 E-value=47 Score=14.67 Aligned_cols=18 Identities=17% Similarity=0.220 Sum_probs=5.7 Q ss_pred CCCCCCCCCEEEECCCCE Q ss_conf 117587783888277632 Q gi|254780331|r 22 VTNLGPMGEVVKVKNGYA 39 (179) Q Consensus 22 v~~lG~~Gdiv~Vk~Gya 39 (179) +.-.+..+.+=-+..||. T Consensus 1665 g~e~~~~~~v~~~~~Gf~ 1682 (1843) T TIGR01054 1665 GKEVEEEGVVEIKERGFE 1682 (1843) T ss_pred CCEEEEEEEEEEEECCHH T ss_conf 601311246887503202 No 35 >cd06240 Peptidase_M14-like_1_3 Peptidase M14-like domain of a functionally uncharacterized subgroup of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases.
The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. T Probab=20.31 E-value=67 Score=13.71 Aligned_cols=29 Identities=17% Similarity=0.185 Sum_probs=22.4 Q ss_pred CCCCEECCCCHHHHHHHHHHHHHHHHHHH Q ss_conf 37873012201468887789999984223 Q gi|254780331|r 45 PKKKALRANKENKILFESQRSVLEAANLE 73 (179) Q Consensus 45 P~~~A~~aT~~n~~~~~~~~~~~~~~~~~ 73 (179) |.-+++..+++|+++++..++...+...- T Consensus 13 pl~~~~IsS~eN~~~Ld~ir~~~~~Ladp 41 (273) T cd06240 13 PQIMAAISSPENLAKLDHYKAILRKLADP 41 (273) T ss_pred EEEEEEECCHHHHHHHHHHHHHHHHHHCC T ss_conf 56999974999998699999999986186 No 36 >TIGR01734 D-ala-DACP-lig D-alanine-activating enzyme; InterPro: IPR010072 This entry represents the enzyme (also called D-alanine-D-alanyl carrier protein ligase) which activates D-alanine as an adenylate via the reaction D-ala + ATP to D-ala-AMP + PPi, and further catalyses the condensation of the amino acid adenylate with the D-alanyl carrier protein (D-ala-ACP). The D-alanine is then further transferred to teichoic acid in the biosynthesis of lipoteichoic acid (LTA) and wall teichoic acid (WTA) in Gram-positive bacteria, both polysaccharides.; GO: 0016208 AMP binding, 0047473 D-alanine-poly(phosphoribitol) ligase activity, 0019350 teichoic acid biosynthetic process. Probab=20.23 E-value=44 Score=14.85 Aligned_cols=49 Identities=10% Similarity=0.188 Sum_probs=26.8 Q ss_pred HHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCHHHHHHHHHH--CC---CCCCHH Q ss_conf 46789999887654321000255654310034107899999986--08---876866 Q gi|254780331|r 74 KKAKYEGISKDLAKKNFSLIRAAGDTGYLYGSVSSRDIADLLIE--EG---FDVNRG 125 (179) Q Consensus 74 ~~~~a~~~~~~L~~~~l~i~~k~~e~gkLfGsVt~~dI~~~L~~--~g---i~I~k~ 125 (179) .+++...++..|+...|.=+.| .=-+||-=++..|+..|..
.| +.||-+ T Consensus 31 L~~~Sd~la~~i~~~~l~~k~k---PiivfG~~~~~Ml~~flg~~KsGhaYiPvD~s 84 (513) T TIGR01734 31 LKEQSDRLAAFIQERLLPEKEK---PIIVFGHMEPEMLVAFLGSIKSGHAYIPVDTS 84 (513) T ss_pred HHHHHHHHHHHHHHCCCCCCCC---CEEEECCCCHHHHHHHHHHHHCCCCCCCCCCC T ss_conf 9999999999998605766677---57886488689999999975168964343668 Done!