Query         gi|254780250|ref|YP_003064663.1| 50S ribosomal protein L24 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 102
No_of_seqs    104 out of 1662
Neff          5.0
Searched_HMMs 39220
Date          Tue May 24 04:46:32 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780250.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 TIGR01079 rplX_bact ribosomal  100.0       0       0  275.7  11.1  100     1-100     1-109 (109)
  2 PRK00004 rplX 50S ribosomal pr 100.0 1.2E-42       0  251.9  13.1  100     1-100     1-102 (102)
  3 COG0198 RplX Ribosomal protein 100.0 2.7E-36 6.9E-41  217.5  12.4   99     1-101     2-104 (104)
  4 CHL00141 rpl24 ribosomal prote 100.0 1.2E-33 3.1E-38  203.0   8.9   79      2-80     10-90 (90)
  5 PRK12281 rplX 50S ribosomal pr 100.0 4.9E-31 1.3E-35  188.9   8.8   72      2-73      5-77 (77)
  6 KOG1708 consensus              100.0 7.5E-29 1.9E-33  177.0   8.5   99     3-101    72-174 (236)
  7 PRK01191 rpl24p 50S ribosomal   99.5 1.5E-14 3.8E-19   99.3   7.1   63      2-74    45-107 (119)
  8 TIGR01080 rplX_A_E ribosomal p  99.3 1.9E-12 4.7E-17   88.0   5.3   62      2-73    40-101 (116)
  9 PTZ00194 60S ribosomal protein  99.2 1.7E-11 4.3E-16   82.8   6.6   61      2-72    45-105 (143)
 10 KOG3401 consensus               97.6 2.7E-05 6.8E-10   49.1   2.4   62      2-73    47-109 (145)
 11 TIGR00405 L26e_arch ribosomal   97.6 0.00013 3.3E-09   45.4   5.0   39      2-40    91-129 (151)
 12 pfam00467 KOW KOW motif. This   97.4 0.00013 3.3E-09   45.4   3.5   32      6-37      1-32 (32)
 13 PRK08559 nusG transcription an  96.8  0.0026 6.5E-08   38.3   4.9   38      3-40    94-131 (153)
 14 smart00739 KOW KOW (Kyprides,   96.5  0.0026 6.8E-08   38.3   3.3   28      3-30      1-28 (28)
 15 PRK05609 nusG transcription an  96.4   0.007 1.8E-07   36.0   5.0   35      3-37   128-162 (183)
 16 PRK04333 50S ribosomal protein  95.9   0.022 5.7E-07   33.2   5.5   38      1-39      1-38 (83)
 17 COG0250 NusG Transcription ant  95.8   0.019 4.8E-07   33.6   4.8   36      3-38   123-158 (178)
 18 KOG1999 consensus               94.3   0.069 1.8E-06   30.6   4.2   27      4-30   460-486 (1024)
 19 PRK09014 rfaH transcriptional   93.9    0.13 3.2E-06   29.1   4.9   39      4-44   110-148 (162)
 20 COG2163 RPL14A Ribosomal prote  89.7    0.73 1.9E-05   25.0   4.8   38      1-39      1-39 (125)
 21 PTZ00065 60S ribosomal protein  88.7    0.89 2.3E-05   24.5   4.7   35      4-39      8-42 (130)
 22 TIGR00922 nusG transcription t  88.4    0.84 2.1E-05   24.7   4.4   34      4-37   140-173 (193)
 23 COG5164 SPT5 Transcription elo  87.1    0.58 1.5E-05   25.5   2.9   29      3-31   139-167 (607)
 24 KOG1999 consensus               83.9    0.71 1.8E-05   25.1   2.1   25      4-28   408-432 (1024)
 25 TIGR01956 NusG_myco NusG famil  81.8     2.4 6.2E-05   22.2   4.1   36      2-37   278-316 (335)
 26 PRK04313 30S ribosomal protein  80.5       3 7.5E-05   21.7   4.2   28      3-30   171-198 (237)
 27 PRK06531 yajC preprotein trans  77.7       4  0.0001   21.0   4.2   37      2-42     35-73 (120)
 28 PTZ00189 60S ribosomal protein  66.6      11 0.00028   18.6   5.0   53      1-53    31-103 (159)
 29 TIGR00739 yajC preprotein tran  63.3      12 0.00031   18.3   4.0   33      2-38     36-68 (86)
 30 COG1193 Mismatch repair ATPase  62.0      10 0.00026   18.7   3.5   41      2-46   611-653 (753)
 31 TIGR01672 AphA HAD superfamily  61.7     4.1  0.0001   20.9   1.3   28      2-29   127-157 (248)
 32 pfam01157 Ribosomal_L21e Ribos  59.6      13 0.00034   18.2   3.6   31      1-31     30-70 (99)
 33 TIGR01553 formate-DH-alph form  59.6      15 0.00038   17.9   4.3   50      4-70  958-1007 (1043)
 34 PRK12289 ribosome-associated G  55.8      18 0.00045   17.5   4.5   33      3-35     50-82 (351)
 35 KOG3418 consensus               52.4      20 0.00051   17.2   4.4   41      1-41      1-48 (136)
 36 cd01733 LSm10 The eukaryotic S  51.3      16  0.0004   17.7   2.9   42     53-94     33-76 (78)
 37 PTZ00118 40S ribosomal protein  51.2      21 0.00053   17.1   3.9   28      4-31   175-202 (262)
 38 PRK04306 50S ribosomal protein  50.8      21 0.00054   17.0   3.9   35      1-37     31-75 (97)
 39 KOG4315 consensus               50.1     6.7 0.00017   19.8   0.9   24      7-30   232-255 (455)
 40 cd01721 Sm_D3 The eukaryotic S  47.4      24 0.00061   16.8   3.3   42     53-94     24-67 (70)
 41 TIGR02772 Ku_bact Ku protein;   46.1     9.1 0.00023   19.0   1.0   34    68-101     18-57 (271)
 42 PRK00409 recombination and DNA  45.6      26 0.00066   16.6   5.1   38      3-43   634-672 (780)
 43 PTZ00223 40S ribosomal protein  44.4      27 0.00069   16.5   4.7   28      3-30   171-198 (273)
 44 COG3700 AphA Acid phosphatase   41.2     9.1 0.00023   19.1   0.4   33      2-34   125-157 (237)
 45 TIGR01955 RfaH transcriptional  40.0      23 0.00058   16.9   2.3   28      3-30   111-138 (162)
 46 cd03691 BipA_TypA_II BipA_TypA  39.6      27 0.00069   16.5   2.6   28      3-30     26-53 (86)
 47 pfam03144 GTP_EFTU_D2 Elongati  37.8      35 0.00088   15.9   3.1   29      3-32     12-40 (70)
 48 PRK11009 aphA acid phosphatase  37.2     4.6 0.00012   20.7  -1.6   31      2-32   125-155 (235)
 49 PRK08515 flgA flagellar basal   36.6      36 0.00092   15.8   4.2   16    87-102   203-218 (229)
 50 pfam04452 Methyltrans_RNA RNA   36.3      37 0.00094   15.8   4.4   36      2-37     15-50 (225)
 51 PRK06804 flgA flagellar basal   35.6      38 0.00096   15.7   3.6   13      4-16   210-222 (272)
 52 TIGR01497 kdpB K+-transporting  35.0      22 0.00056   17.0   1.6   26      3-39   123-148 (675)
 53 cd01724 Sm_D1 The eukaryotic S  34.5      35 0.00089   15.9   2.5   44     53-96     25-70 (90)
 54 KOG1086 consensus               31.2      17 0.00044   17.5   0.5   45     41-85   523-568 (594)
 55 KOG3421 consensus               30.6      46  0.0012   15.2   3.1   34      5-39      8-41 (136)
 56 PRK12618 flgA flagellar basal   29.6      48  0.0012   15.1   3.2   15    87-101   110-125 (138)
 57 cd05705 S1_Rrp5_repeat_hs14 S1  28.9      43  0.0011   15.4   2.2   12     59-70     27-38 (74)
 58 cd04466 S1_YloQ_GTPase S1_YloQ  28.8      50  0.0013   15.0   4.3   12     21-32     53-64 (68)
 59 pfam02211 NHase_beta Nitrile h  28.5      50  0.0013   15.0   3.9   52      3-71   132-193 (220)
 60 TIGR01069 mutS2 MutS2 family p  28.2      51  0.0013   15.0   3.9   37      4-43   672-710 (834)
 61 cd05882 Ig1_Necl-1 First (N-te  25.9      56  0.0014   14.8   3.3   16     58-73     59-74 (95)
 62 COG1471 RPS4A Ribosomal protei  25.7      57  0.0014   14.7   4.7   26      4-29   174-199 (241)
 63 pfam11962 DUF3476 Domain of un  24.6      59  0.0015   14.6   2.6   28    73-100     64-92 (222)
 64 PRK05585 yajC preprotein trans  23.4      63  0.0016   14.5   4.0   29      3-37     53-81 (107)
 65 COG1862 YajC Preprotein transl  23.2      63  0.0016   14.5   3.3   26      2-31     42-67 (97)
 66 TIGR00459 aspS_bact aspartyl-t  23.1      60  0.0015   14.6   2.1   75      3-81    43-120 (653)
 67 TIGR00448 rpoE DNA-directed RN  22.9      53  0.0014   14.9   1.8   34     59-92   105-148 (184)
 68 pfam06431 Polyoma_lg_T_C Polyo  22.9      38 0.00096   15.7   1.0   28     39-66   243-272 (417)
 69 pfam11784 DUF3320 Protein of u  21.6      14 0.00036   18.0  -1.4   15     55-69     19-33 (52)
 70 pfam11910 NdhO Cyanobacterial   21.2      43  0.0011   15.4   1.0   16      4-19      1-16 (67)
 71 cd04090 eEF2_II_snRNP Loc2 eEF  21.1      70  0.0018   14.2   4.9   14      4-17     28-41 (94)
 72 cd05881 Ig1_Necl-2 First (N-te  20.3      28 0.00072   16.4  -0.1   16     58-73     59-74 (95)

No 1
>TIGR01079 rplX_bact ribosomal protein L24; InterPro: IPR003256 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L24 is one of the proteins from the large ribosomal subunit. In their mature form, these proteins have 103 to 150 amino-acid residues. This domain is found in L24 and L26 ribosomal proteins.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome.
Probab=100.00 E-value=0 Score=275.69 Aligned_cols=100 Identities=50% Similarity=0.860 Sum_probs=94.6 Q ss_pred CCCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCC-CC---CCEEEEEEECCCHHHEEEECC-CC Q ss_conf 97226586999984378886469999974699899906059743205777-65---651799970468665788978-99 Q gi|254780250|r 1 MEKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTP-NK---EAGIISKEASIHLSNLSLIDK-DG 75 (102) Q Consensus 1 M~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~-~~---~gGii~~E~pIh~SNV~lvd~-~~ 75 (102) |+|||+||+|+||||+||||+|+||+|+|+.++|+||||||+|||+||++ ++ +|||+++|||||+|||||+|| ++ T Consensus 1 k~kiKKGD~V~VIsGKdKGK~GkVl~v~~~~~kViVEGvN~vKKH~Kp~~~~~~a~~GGi~~~EaPIh~SNVm~~~~~~~ 80 (109) T TIGR01079 1 KMKIKKGDTVVVISGKDKGKRGKVLKVLPKKNKVIVEGVNMVKKHVKPKPTQRKAKEGGIIEKEAPIHISNVMLFDPKTG 80 (109) T ss_pred CCEEEECCEEEEEECCCCCCEEEEEEEECCCCEEEEECCCEEEEEECCCCCCCCCCCCCCCCEECCCCCCCEEEECCCCC T ss_conf 97133098888886889887238999525788388831533221566886778866788101622345342144435789 Q ss_pred CEEEEEEEEECC----EEEEEECCCCCEE Q ss_conf 622889999999----7999981568771 Q gi|254780250|r 76 KQVRVGFSFVDG----KKIRIAKRSGEPI 100 (102) Q Consensus 76 k~trv~~~~~dG----~kvRv~kksg~~i 100 (102) +||||+|+|+|+ +||||||+||+.| T Consensus 81 ~~tRvg~r~~~d~~~~kKVR~~Kk~Ge~I 109 (109) T TIGR01079 81 KATRVGYRFEEDGKTGKKVRVFKKNGEII 109 (109) T ss_pred CCCCEEEEEEECCCCEEEEEEEECCCCCC T ss_conf 83400588882899513898887138859 No 2 >PRK00004 rplX 50S ribosomal protein L24; Reviewed Probab=100.00 E-value=1.2e-42 Score=251.88 Aligned_cols=100 Identities=54% Similarity=0.885 Sum_probs=96.4 Q ss_pred CCCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCC-CCCCEEEEEEECCCHHHEEEECC-CCCEE Q ss_conf 97226586999984378886469999974699899906059743205777-65651799970468665788978-99622 Q gi|254780250|r 1 MEKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTP-NKEAGIISKEASIHLSNLSLIDK-DGKQV 78 (102) Q Consensus 1 M~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~-~~~gGii~~E~pIh~SNV~lvd~-~~k~t 78 (102) +|+|++||+|+||||+|||++|+|++|++++|+|+|||+||.+||+||++ 
+++|||+++|+|||+|||||+|| +++|| T Consensus 1 k~kikkGD~V~VisGkdKGk~G~V~~v~~~~~~viVeGvN~~kkh~Kp~~~~~~Ggi~~~E~pIh~SNV~lvd~~~~k~t 80 (102) T PRK00004 1 KMKIKKGDTVIVIAGKDKGKQGKVLKVLPKKDKVIVEGVNIVKKHQKPNQEGPQGGIVEKEAPIHISNVALFDPKTGKAT 80 (102) T ss_pred CCCEECCCEEEEEECCCCCCCEEEEEEECCCCEEEEECCEEEEEECCCCCCCCCCCEEEEECCEEEHHEEEECCCCCCCE T ss_conf 97206799999927799997368999998799999977478999717766787883899988899014889879889856 Q ss_pred EEEEEEECCEEEEEECCCCCEE Q ss_conf 8899999997999981568771 Q gi|254780250|r 79 RVGFSFVDGKKIRIAKRSGEPI 100 (102) Q Consensus 79 rv~~~~~dG~kvRv~kksg~~i 100 (102) |++|+++||+|+|+|++||++| T Consensus 81 rv~~k~~dG~kvRv~kktg~~I 102 (102) T PRK00004 81 RVGFKVEDGKKVRVAKKSGEVI 102 (102) T ss_pred EEEEEEECCEEEEEEECCCCCC T ss_conf 8999995997999994379899 No 3 >COG0198 RplX Ribosomal protein L24 [Translation, ribosomal structure and biogenesis] Probab=100.00 E-value=2.7e-36 Score=217.46 Aligned_cols=99 Identities=55% Similarity=0.892 Sum_probs=94.0 Q ss_pred CCCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCC-CCCCCEEEEEEECCCHHHEEEECCC--CCE Q ss_conf 9722658699998437888646999997469989990605974320577-7656517999704686657889789--962 Q gi|254780250|r 1 MEKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQT-PNKEAGIISKEASIHLSNLSLIDKD--GKQ 77 (102) Q Consensus 1 M~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~-~~~~gGii~~E~pIh~SNV~lvd~~--~k~ 77 (102) |++|++||+|.||+|+|||++|+|++++++. |+|||+|+.++|++|+ ++++|||+++|||||+|||||+|+. 
+++ T Consensus 2 ~~~IrkGD~V~Vi~GkdKGk~GkVl~v~~k~--V~VEGVnv~kkh~k~~~~~~~ggii~~EapIh~SnV~i~~~~~~~~~ 79 (104) T COG0198 2 KMKVKKGDTVKVIAGKDKGKEGKVLKVLPKK--VVVEGVNVVKKHIKPSQENPEGGIINKEAPIHISNVAIIDPNKTGKP 79 (104) T ss_pred CCCEECCCEEEEEECCCCCCCEEEEEEECCE--EEEECCEEEEECCCCCCCCCCCCEEEEEECCCHHHEEEECCCCCCCC T ss_conf 8524369999998668899614899991573--89977488980477777687886154562336799589644447883 Q ss_pred EEEEEEEE-CCEEEEEECCCCCEEC Q ss_conf 28899999-9979999815687715 Q gi|254780250|r 78 VRVGFSFV-DGKKIRIAKRSGEPID 101 (102) Q Consensus 78 trv~~~~~-dG~kvRv~kksg~~id 101 (102) +|++|++. ||+|+|+||+||+.|| T Consensus 80 ~Rv~~~~~~~~kkvr~~Kk~g~~i~ 104 (104) T COG0198 80 TRVGYKVEEDGKKVRVAKKSGEVID 104 (104) T ss_pred CEEEEEEECCCCEEEEEECCCCCCC T ss_conf 1678999317967998851472068 No 4 >CHL00141 rpl24 ribosomal protein L24; Validated Probab=100.00 E-value=1.2e-33 Score=203.02 Aligned_cols=79 Identities=37% Similarity=0.606 Sum_probs=74.9 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCC-CCCCEEEEEEECCCHHHEEEECC-CCCEEE Q ss_conf 7226586999984378886469999974699899906059743205777-65651799970468665788978-996228 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTP-NKEAGIISKEASIHLSNLSLIDK-DGKQVR 79 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~-~~~gGii~~E~pIh~SNV~lvd~-~~k~tr 79 (102) ++|++||+|+||||+|||++|+|++|++++++|+|||+|+++||++|++ +++|||+++|+|||+|||||+|| +++||| T Consensus 10 ~kIkkGD~V~VisGkdKGk~G~Vl~v~~~~~~viVeGvN~~kKH~Kp~~~~~~GgIi~~EaPIhiSNV~l~dp~~~k~tR 89 (90) T CHL00141 10 MHVKKGDTVQVISGKDKGKIGEVLKIIRKSNKVIVKGINIKTKHIKPQKEGEVGEIKQFEAPIHSSNVMLYNEENNIASR 89 (90) T ss_pred EEEECCCEEEEEECCCCCCCEEEEEEECCCCEEEEECCEEEEEECCCCCCCCCCCEEEEECCEEEEEEEEECCCCCCCCC T ss_conf 06708999999166789973579999867999999795888871589889998889898738322348888787797478 Q ss_pred E Q ss_conf 8 Q gi|254780250|r 80 V 80 (102) Q Consensus 80 v 80 (102) . 
T Consensus 90 ~ 90 (90) T CHL00141 90 S 90 (90) T ss_pred C T ss_conf 9 No 5 >PRK12281 rplX 50S ribosomal protein L24; Reviewed Probab=99.97 E-value=4.9e-31 Score=188.87 Aligned_cols=72 Identities=43% Similarity=0.721 Sum_probs=69.3 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCC-CCCCEEEEEEECCCHHHEEEECC Q ss_conf 7226586999984378886469999974699899906059743205777-65651799970468665788978 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTP-NKEAGIISKEASIHLSNLSLIDK 73 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~-~~~gGii~~E~pIh~SNV~lvd~ 73 (102) |+|++||+|+||||+|||++|+|++|++++|+|+|||+|+.+||+||++ +++|||+++|+|||+|||||+|+ T Consensus 5 ~kIkkGD~V~VisGkdKGk~G~Vl~v~~~~~rviVeGvN~~kkh~Kp~~~~~~Ggii~~E~pIh~SNV~lvdk 77 (77) T PRK12281 5 LHVKKGDMVKVIAGDDKGKTGKVLAVLPKKNRVIVEGVNIRKKAIKPSQKNPNGGFIEKEMPIHISNVKKVEK 77 (77) T ss_pred EEEECCCEEEEEECCCCCCCEEEEEEECCCCEEEEECCEEEEEECCCCCCCCCCCEEEEECCEEHHHCEEECC T ss_conf 4875899999946678997278999987799999948637987349988899988999884782440618069 No 6 >KOG1708 consensus Probab=99.96 E-value=7.5e-29 Score=177.04 Aligned_cols=99 Identities=37% Similarity=0.620 Sum_probs=92.6 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCCC-CCCEEEEEEECCCHHH-EEEECCC-CCEEE Q ss_conf 2265869999843788864699999746998999060597432057776-5651799970468665-7889789-96228 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPN-KEAGIISKEASIHLSN-LSLIDKD-GKQVR 79 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~-~~gGii~~E~pIh~SN-V~lvd~~-~k~tr 79 (102) -++-||+|+||+|+||||+|.|++|++.+|+|+|+|+|...+|+..... ..|.|+.+|||||+|| |||+||. 
..||+ T Consensus 72 ~ff~GDtVeVlvGkDkGkqG~Vtqv~r~~s~VvV~gln~k~r~~gsekeg~pgtivk~EaPlhvsk~VmLvdp~d~q~te 151 (236) T KOG1708 72 HFFFGDTVEVLVGKDKGKQGEVTQVIRHRSWVVVKGLNTKYRHMGSEKEGEPGTIVKSEAPLHVSKQVMLVDPEDDQPTE 151 (236) T ss_pred EEECCCEEEEEECCCCCCCCEEEEEEECCCEEEECCCCHHHHHHCCCCCCCCCEEEEECCCCEEECCEEEECCCCCCCCE T ss_conf 68349879997515677431389986047648972610344442640028886277503772340406987732367732 Q ss_pred EEEEEE-CCEEEEEECCCCCEEC Q ss_conf 899999-9979999815687715 Q gi|254780250|r 80 VGFSFV-DGKKIRIAKRSGEPID 101 (102) Q Consensus 80 v~~~~~-dG~kvRv~kksg~~id 101 (102) ++|++. +|+|||++.+||.+|+ T Consensus 152 ~~wr~~e~GekVRvstrSG~iIp 174 (236) T KOG1708 152 VEWRFTEDGEKVRVSTRSGRIIP 174 (236) T ss_pred EEEEECCCCCEEEEEECCCCCCC T ss_conf 01577478857999830551614 No 7 >PRK01191 rpl24p 50S ribosomal protein L24P; Validated Probab=99.54 E-value=1.5e-14 Score=99.35 Aligned_cols=63 Identities=37% Similarity=0.480 Sum_probs=54.4 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEEECCC Q ss_conf 7226586999984378886469999974699899906059743205777656517999704686657889789 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSLIDKD 74 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~lvd~~ 74 (102) +.|++||+|+|++|.+||++|+|++|++++++|+|||+|..+.. |. +...|||.|||++...+ T Consensus 45 ~~IrkgD~V~V~rG~~kG~~GkV~~V~~k~~~V~VEgv~~~K~~--------G~--~v~~pIhpSnvvItkL~ 107 (119) T PRK01191 45 LPVRKGDTVKVMRGDFKGEEGKVVEVDLKRYRIYVEGVTIKKAD--------GT--EVPYPIHPSNVMITKLD 107 (119) T ss_pred CCEECCCEEEEEECCCCCCCCEEEEEECCCCEEEEEEEEEECCC--------CC--EEEEEECCCCEEEEECC T ss_conf 43546999999552778962318999736889999436998479--------98--78642256317999746 No 8 >TIGR01080 rplX_A_E ribosomal protein L24; InterPro: IPR005756 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. 
This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L24 is one of the proteins from the large ribosomal subunit.
In their mature form, these proteins have 103 to 150 amino-acid residues. This entry represents the archaeal and eukaryotic branch of these proteins, known as the L26 family.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0015934 large ribosomal subunit. Probab=99.32 E-value=1.9e-12 Score=87.98 Aligned_cols=62 Identities=32% Similarity=0.450 Sum_probs=54.4 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEEECC Q ss_conf 722658699998437888646999997469989990605974320577765651799970468665788978 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSLIDK 73 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~lvd~ 73 (102) +.|++||.|.|+.|..+|.+|+|.+|+.++-+|.|||++..+- .| -+...|||.||||+++- T Consensus 40 lP~RkgD~V~i~RG~fkG~EGkv~~Vd~kr~~i~ve~~t~~k~--------~G--~~V~~~~hpSnv~I~~L 101 (116) T TIGR01080 40 LPVRKGDKVRIVRGDFKGHEGKVLEVDLKRYRIYVEGVTKEKV--------NG--TEVPVPIHPSNVMITKL 101 (116) T ss_pred CCCCCCCEEEEEECCCCCCCCCEEEEEEEEEEEEECCCCCCCC--------CC--CEEEECCCCCCEEEEEE T ss_conf 7612398789974662587551688730388898813101023--------88--56642467662689854 No 9 >PTZ00194 60S ribosomal protein L26; Provisional Probab=99.24 E-value=1.7e-11 Score=82.76 Aligned_cols=61 Identities=28% Similarity=0.365 Sum_probs=52.0 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEEEC Q ss_conf 72265869999843788864699999746998999060597432057776565179997046866578897 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSLID 72 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~lvd 72 (102) +.|++||+|+|+.|.++|++|+|.+|++++.+|.|||+...+. .|. 
+...|||.|||++.- T Consensus 45 ~pIRkgDeV~V~RG~fkG~eGKV~~V~~kk~~I~VEgvt~~K~--------nG~--~V~v~IhPSnVvITK 105 (143) T PTZ00194 45 MPVRKDDEVIVKRGAFKGREGKVTACYRLKWVIHIDKVNREKA--------NGS--TVAVGIHPSNVEITK 105 (143) T ss_pred CCCCCCCEEEEEECCCCCCCCEEEEEEEEEEEEEEEEEEEECC--------CCC--EEECCCCCCEEEEEE T ss_conf 0011599999985553687765999995000999951789848--------998--772243674279999 No 10 >KOG3401 consensus Probab=97.65 E-value=2.7e-05 Score=49.12 Aligned_cols=62 Identities=27% Similarity=0.381 Sum_probs=49.8 Q ss_pred CCCCCCCEEEEEECCCCC-CEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEEECC Q ss_conf 722658699998437888-646999997469989990605974320577765651799970468665788978 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKG-KAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSLIDK 73 (102) Q Consensus 2 ~kikkGD~V~VisGkdKG-k~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~lvd~ 73 (102) |-|+++|+|+|..|.++| +.|.|.++++++--+.+|.|-- .|.+ | .....|||+|++++.-+ T Consensus 47 ~pir~ddev~v~rg~~kG~q~G~v~~vyrKk~~iyie~v~~----eK~n----G--t~v~vgihPsK~~iTkl 109 (145) T KOG3401 47 MPIRKDDEVQVVRGHFKGFQIGKVSQVYRKKYVIYIERVQR----EKAN----G--TTVPVGIHPSKVVITKL 109 (145) T ss_pred CCEEECCEEEEEECCCCCCCCCEEHHHHHHHHEEEEEEEEE----EECC----C--CCCCCCCCCHHEEECCC T ss_conf 31464337999741334411030024345320022366778----5146----7--52245667003011240 No 11 >TIGR00405 L26e_arch ribosomal protein L24; InterPro: IPR011590 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. 
The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This protein contains a KOW domain, shared by bacterial NusG and the L24p/L26e family of ribosomal proteins. Although called archaeal NusG in several publications, it is the only close homolog of eukaryotic L26e in archaeal genomes, shares an operon with L11 in many genomes, and has been sequenced from purified ribosomes. It is here designated as a ribosomal protein for these reasons.
Probab=97.57 E-value=0.00013 Score=45.36 Aligned_cols=39 Identities=26% Similarity=0.399 Sum_probs=36.6 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEE Q ss_conf 722658699998437888646999997469989990605 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVN 40 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN 40 (102) ..|++||.|.+|||.+||-..+|.+|+..++-|++|=+| T Consensus 91 e~I~kGd~VEiisGPFKGErAkViRvDe~keEvtlEL~e 129 (151) T TIGR00405 91 ESIKKGDVVEIISGPFKGERAKVIRVDESKEEVTLELLE 129 (151) T ss_pred HCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEEEEC T ss_conf 313478888995389976446898630787626676321 No 12 >pfam00467 KOW KOW motif. This family has been extended to coincide with ref. The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG. Probab=97.43 E-value=0.00013 Score=45.37 Aligned_cols=32 Identities=41% Similarity=0.574 Sum_probs=29.9 Q ss_pred CCCEEEEEECCCCCCEEEEEEEECCCCEEEEE Q ss_conf 58699998437888646999997469989990 Q gi|254780250|r 6 TGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQ 37 (102) Q Consensus 6 kGD~V~VisGkdKGk~G~V~~V~~k~~~ViVe 37 (102) +||.|+|++|+++|..|.|+++++.+++|+|+ T Consensus 1 ~G~~V~V~~G~~~G~~g~I~~i~~~~~~v~v~ 32 (32) T pfam00467 1 KGDVVRVISGPFKGKKGKVVEVDDSKARVHVE 32 (32) T ss_pred CCCEEEEECCCCCCCCCCEEEECCCCEEEECC T ss_conf 98789993467568755489960752068509 No 13 >PRK08559 nusG transcription antitermination protein NusG; Validated Probab=96.77 E-value=0.0026 Score=38.33 Aligned_cols=38 Identities=29% Similarity=0.371 Sum_probs=34.8 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEE Q ss_conf 22658699998437888646999997469989990605 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVN 40 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN 40 (102) .|..||.|.|++|.++|-.|.|.+|+..+++|.|+=+. 
T Consensus 94 ~i~~G~~V~v~~Gpfkg~~a~V~~Vd~~k~~vtV~l~~ 131 (153) T PRK08559 94 GIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELLE 131 (153) T ss_pred CCCCCCEEEEECCCCCCCCEEEEEECCCCCEEEEEEEE T ss_conf 46899999991357699617999981668899999996 No 14 >smart00739 KOW KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54. Probab=96.46 E-value=0.0026 Score=38.25 Aligned_cols=28 Identities=46% Similarity=0.647 Sum_probs=24.9 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECC Q ss_conf 2265869999843788864699999746 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k 30 (102) .+..||.|.|++|+++|.+|.|++++.. T Consensus 1 ~~~~G~~V~V~~G~~~g~~g~V~~i~~~ 28 (28) T smart00739 1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28 (28) T ss_pred CCCCCCEEEEEECCCCCCCCEEEEECCC T ss_conf 9402768899722557741138982389 No 15 >PRK05609 nusG transcription antitermination protein NusG; Validated Probab=96.38 E-value=0.007 Score=35.96 Aligned_cols=35 Identities=23% Similarity=0.168 Sum_probs=31.6 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEE Q ss_conf 22658699998437888646999997469989990 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQ 37 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVe 37 (102) .++.||.|.|++|..+|-.|.|.+++..++|+.|. T Consensus 128 ~~~~Gd~V~I~~GPf~g~~g~v~~~d~~k~Rv~V~ 162 (183) T PRK05609 128 DFEVGEVVRVTDGPFADFNGTVEEVDYEKSKLKVL 162 (183) T ss_pred CCCCCCEEEEECCCCCCCEEEEEEECCCCCEEEEE T ss_conf 32279899993678999689999983878999999 No 16 >PRK04333 50S ribosomal protein L14e; Validated Probab=95.85 E-value=0.022 Score=33.21 Aligned_cols=38 Identities=24% Similarity=0.459 Sum_probs=34.2 Q ss_pred CCCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECE Q ss_conf 972265869999843788864699999746998999060 Q gi|254780250|r 1 MEKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGV 39 (102) Q Consensus 1 M~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGi 39 (102) |.-+-.|-.+.+.+|+|.||...|..|+-. 
|+|+|+|- T Consensus 1 m~~VEvGRV~~i~~G~~aGkl~vIVDiID~-nrvLVdGP 38 (83) T PRK04333 1 MAAIEVGRVCVKTAGREAGRKCVIVDIIDK-NFVLVTGP 38 (83) T ss_pred CCCEEECEEEEEECCCCCCCEEEEEEEECC-CEEEEECC T ss_conf 985670349999317767978999999738-87998899 No 17 >COG0250 NusG Transcription antiterminator [Transcription] Probab=95.76 E-value=0.019 Score=33.65 Aligned_cols=36 Identities=25% Similarity=0.312 Sum_probs=32.7 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEEC Q ss_conf 226586999984378886469999974699899906 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQG 38 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeG 38 (102) -+..||.|.|++|..+|-.|.|.+|+..++++.|+= T Consensus 123 ~~e~Gd~VrI~~GpFa~f~g~V~evd~ek~~~~v~v 158 (178) T COG0250 123 DFEPGDVVRIIDGPFAGFKAKVEEVDEEKGKLKVEV 158 (178) T ss_pred CCCCCCEEEEECCCCCCCCEEEEEECCCCCEEEEEE T ss_conf 678998899916678995178999847676899999 No 18 >KOG1999 consensus Probab=94.29 E-value=0.069 Score=30.57 Aligned_cols=27 Identities=37% Similarity=0.613 Sum_probs=24.5 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECC Q ss_conf 265869999843788864699999746 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k 30 (102) ++.||-|+||+|.+.|.+|.|++|... 
T Consensus 460 F~~GDhVKVi~G~~eG~tGlVvrVe~~ 486 (1024) T KOG1999 460 FEPGDHVKVIAGRYEGDTGLVVRVEQG 486 (1024) T ss_pred CCCCCEEEEEECCCCCCCCEEEEEECC T ss_conf 357875789731202775359998377 No 19 >PRK09014 rfaH transcriptional activator RfaH; Provisional Probab=93.92 E-value=0.13 Score=29.15 Aligned_cols=39 Identities=18% Similarity=0.362 Sum_probs=30.9 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEE Q ss_conf 26586999984378886469999974699899906059743 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKR 44 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kk 44 (102) .+.||.|.|++|..+|-.|.+.++..+ +|++|- +++.-+ T Consensus 110 ~~~Gd~V~I~~GPf~g~~g~v~~~~~~-~R~~vl-l~~lgr 148 (162) T PRK09014 110 PKPGDKVIITEGAFEGIQAIYTEPDGE-ARSILL-LNLLNK 148 (162) T ss_pred CCCCCEEEEEECCCCCCEEEEEEECCC-CCEEEE-EEECCC T ss_conf 999999999437999808999988687-878999-620399 No 20 >COG2163 RPL14A Ribosomal protein L14E/L6E/L27E [Translation, ribosomal structure and biogenesis] Probab=89.67 E-value=0.73 Score=24.99 Aligned_cols=38 Identities=24% Similarity=0.424 Sum_probs=33.4 Q ss_pred CCC-CCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECE Q ss_conf 972-265869999843788864699999746998999060 Q gi|254780250|r 1 MEK-IRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGV 39 (102) Q Consensus 1 M~k-ikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGi 39 (102) |++ +..|--|.+.+|.+.|+...|++++-++ .+++.|- T Consensus 1 ~~~~l~~GrVvvv~~GR~aGkk~VIv~~iDd~-~v~i~gp 39 (125) T COG2163 1 MRASLEVGRVVVVTAGRFAGKKVVIVKIIDDN-FVLITGP 39 (125) T ss_pred CCCCCCCCEEEEEECCEECCCEEEEEEECCCC-EEEEECC T ss_conf 97656387699996250279549999982278-7997378 No 21 >PTZ00065 60S ribosomal protein L14; Provisional Probab=88.73 E-value=0.89 Score=24.53 Aligned_cols=35 Identities=29% Similarity=0.504 Sum_probs=30.4 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECE Q ss_conf 265869999843788864699999746998999060 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGV 39 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGi 39 (102) +-.|-.|.|-.|.+.||...|..|+-. 
|||+|+|- T Consensus 8 VEvGRVv~i~~Gp~~GKL~~IVDIID~-nRvLVDGP 42 (130) T PTZ00065 8 VEPGRLCLITYGPDAGKLCFIVDIVTP-TRVLVDGA 42 (130) T ss_pred EECCEEEEEEECCCCCCEEEEEEEECC-CEEEECCC T ss_conf 342659999407888978999998617-64674087 No 22 >TIGR00922 nusG transcription termination/antitermination factor NusG; InterPro: IPR001062 Bacterial transcription antitermination protein, nusG, is a component of the transcription complex and interacts with the termination factor Rho and RNA polymerase. NusG is a bacterial transcriptional elongation factor involved in transcription termination and anti-termination.; GO: 0003711 transcription elongation regulator activity, 0006355 regulation of transcription DNA-dependent. Probab=88.44 E-value=0.84 Score=24.68 Aligned_cols=34 Identities=24% Similarity=0.215 Sum_probs=31.3 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEE Q ss_conf 2658699998437888646999997469989990 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQ 37 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVe 37 (102) +-.||.|.|+.|...-=+|+|..|+.+++++-|.
T Consensus 140 fE~Ge~Vrv~dGPF~~F~G~Veev~~Ek~kLkV~ 173 (193) T TIGR00922 140 FEVGEQVRVNDGPFANFTGTVEEVDYEKSKLKVS 173 (193) T ss_pred CCCCCEEEEECCCCCCCCEEEEEEEHHCCEEEEE T ss_conf 3579888980388888514798880213769999 No 23 >COG5164 SPT5 Transcription elongation factor [Transcription] Probab=87.14 E-value=0.58 Score=25.53 Aligned_cols=29 Identities=24% Similarity=0.290 Sum_probs=25.4 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCC Q ss_conf 22658699998437888646999997469 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKS 31 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~ 31 (102) .+.+||.|.||.|.+++-.|.|+.|...+ T Consensus 139 ~f~~gD~vkVI~g~~~~d~g~V~rI~~~~ 167 (607) T COG5164 139 GFYKGDLVKVIEGGEMVDIGTVPRIDGEK 167 (607) T ss_pred CCCCCCEEEEECCCCCCCCCEEEEECCCE T ss_conf 44468747884165201333378863862 No 24 >KOG1999 consensus Probab=83.87 E-value=0.71 Score=25.06 Aligned_cols=25 Identities=40% Similarity=0.534 Sum_probs=11.9 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEE Q ss_conf 2658699998437888646999997 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVV 28 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~ 28 (102) +-.||.|.|+.|..+|-+|.|.+|. T Consensus 408 F~~GD~VeV~~Gel~glkG~ve~vd 432 (1024) T KOG1999 408 FSPGDAVEVIVGELKGLKGKVESVD 432 (1024) T ss_pred CCCCCEEEEEEEEECCCEEEEEECC T ss_conf 3788838996212405256899616 No 25 >TIGR01956 NusG_myco NusG family protein; InterPro: IPR010216 This entry represents a family of Mycoplasma proteins orthologous to the bacterial transcription termination/antitermination factor NusG. These sequences from Mycoplasma are notably diverged from those in bacterial species, and although NusA and ribosomal protein S10 (NusE) appear to be present, NusB may be absent in Mycoplasmas calling into question whether these species have a functional Nus system, which includes this family as a member.; GO: 0003711 transcription elongation regulator activity, 0006355 regulation of transcription DNA-dependent. 
Probab=81.76 E-value=2.4 Score=22.16 Aligned_cols=36 Identities=31% Similarity=0.526 Sum_probs=32.7 Q ss_pred CCCCCCCEEEEEECCCCC--CEEEEEEEECC-CCEEEEE Q ss_conf 722658699998437888--64699999746-9989990 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKG--KAGQVMGVVRK-SGRAFVQ 37 (102) Q Consensus 2 ~kikkGD~V~VisGkdKG--k~G~V~~V~~k-~~~ViVe 37 (102) ..++.|..|.|++|...| -.|.|.+++.. ++..+|+ T Consensus 278 ~~F~VG~~V~I~~~~~~gde~~g~I~~i~~~tk~~a~Ve 316 (335) T TIGR01956 278 SKFKVGNLVEILAGSFKGDEIEGKIKKIDQETKDKAIVE 316 (335) T ss_pred CCCCCCCEEEEEECCCCCCCEEEEEEEHHCCCCCEEEEE T ss_conf 013358788997467558601133212004677568999 No 26 >PRK04313 30S ribosomal protein S4e; Validated Probab=80.53 E-value=3 Score=21.70 Aligned_cols=28 Identities=18% Similarity=0.474 Sum_probs=16.1 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECC Q ss_conf 2265869999843788864699999746 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k 30 (102) ++..|..+.|+.|+..|..|+|.++... T Consensus 171 ~fe~G~~~~itgG~n~G~vG~I~~I~~~ 198 (237) T PRK04313 171 PFEEGNLAMITGGKHVGEIGKIVEIQVT 198 (237) T ss_pred ECCCCCEEEEECCEEEEEEEEEEEEEEC T ss_conf 5279989999788052589999889960 No 27 >PRK06531 yajC preprotein translocase subunit YajC; Validated Probab=77.75 E-value=4 Score=20.98 Aligned_cols=37 Identities=27% Similarity=0.472 Sum_probs=30.4 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEE--ECEEEE Q ss_conf 72265869999843788864699999746998999--060597 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFV--QGVNIV 42 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViV--eGiN~~ 42 (102) ..+++||.|.-+.| --|+|.+|+...++|.+ +|+-+. 
T Consensus 35 ~~L~kGdeVvTiGG----l~G~V~~Vd~e~~tV~Ld~~Gv~l~ 73 (120) T PRK06531 35 NAIQKGDEVVTIGG----LFGTVDEVDTEAKKIVLDVDGVYLT 73 (120) T ss_pred HHCCCCCEEEECCC----CEEEEEEEECCCCEEEEEECCEEEE T ss_conf 72579998997898----2899999927898899982897999 No 28 >PTZ00189 60S ribosomal protein L21; Provisional Probab=66.59 E-value=11 Score=18.59 Aligned_cols=53 Identities=26% Similarity=0.414 Sum_probs=40.1 Q ss_pred CCCCCCCCEEEEEE-CC---------CCCCEEEEEEEECC----------CCEEEEECEEEEEEEECCCCCCC Q ss_conf 97226586999984-37---------88864699999746----------99899906059743205777656 Q gi|254780250|r 1 MEKIRTGDRVLVLA-GK---------DKGKAGQVMGVVRK----------SGRAFVQGVNIVKRHQRQTPNKE 53 (102) Q Consensus 1 M~kikkGD~V~Vis-Gk---------dKGk~G~V~~V~~k----------~~~ViVeGiN~~kkh~k~~~~~~ 53 (102) |.-++.||.|-|.. |. +-|++|.|-.|.+. .++++..-+|+..-|++++.-.+ T Consensus 31 l~~yk~GD~VdIk~~gavqKGMPhk~YHGkTGrV~nVt~~avGv~vnK~V~~ri~~KRinVriEHvk~Skcr~ 103 (159) T PTZ00189 31 LTNIKVGDYVDVVADSAVREGMPHKYYHGRTGIVWNVTPRGVGVIINKPVRTRTLRKRICVRFEHVRKSRCQE 103 (159) T ss_pred HHHHCCCCEEEEEECCCEECCCCCCCCCCCCCCEEEECCCEEEEEEEEEECCEEEEEEEEEEEEEECCCCCHH T ss_conf 8650389889997438400589861006875327764474479999877778575378899867501544689 No 29 >TIGR00739 yajC preprotein translocase, YajC subunit; InterPro: IPR003849 This entry describes proteins of unknown function.. 
Probab=63.28 E-value=12 Score=18.34 Aligned_cols=33 Identities=24% Similarity=0.529 Sum_probs=27.0 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEEC Q ss_conf 7226586999984378886469999974699899906 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQG 38 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeG 38 (102) .-++|||.|.-.+| -.|+|.+|.-..|.+.++= T Consensus 36 ~~L~KGd~V~T~gG----i~G~V~~i~e~~~~i~i~~ 68 (86) T TIGR00739 36 ESLKKGDKVLTIGG----IIGTVTKIAENTNNIVIEL 68 (86) T ss_pred HCCCCCCEEEECCC----EEEEEEEEECCCCEEEEEE T ss_conf 52799778998388----3899988523886789998 No 30 >COG1193 Mismatch repair ATPase (MutS family) [DNA replication, recombination, and repair] Probab=62.03 E-value=10 Score=18.73 Aligned_cols=41 Identities=29% Similarity=0.499 Sum_probs=30.4 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEE--CEEEEEEEE Q ss_conf 722658699998437888646999997469989990--605974320 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQ--GVNIVKRHQ 46 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVe--GiN~~kkh~ 46 (102) .+++.||.|.+++ |..|.++++..+...+.|+ .+.|.-.|. T Consensus 611 ~~l~~gDev~~~t----~e~G~v~~i~a~~~e~~v~~g~~kv~V~~~ 653 (753) T COG1193 611 RKLKLGDEVEVIT----GEPGAVVKIIAGILEALVQSGILKVIVSHL 653 (753) T ss_pred CCCEECCEEEEEC----CCCCCEEEEECCCCEEEEECCEEEEEEEHH T ss_conf 5751345357605----885103565326764687604069997526 No 31 >TIGR01672 AphA HAD superfamily (subfamily IIIB) phosphatase, TIGR01672; InterPro: IPR010025 This family of proteins is a member of the IIIB subfamily (IPR001001 from INTERPRO) of the haloacid dehalogenase (HAD) superfamily of hydrolases. All characterised members of subfamily III and most characterised members of the HAD superfamily are phosphatases. HAD superfamily phosphatases contain active site residues in several conserved catalytic motifs , all of which are found conserved in this family. 
AphA is a periplasmic acid phosphatase of Escherichia coli belonging to class B bacterial phosphatases , which is part of the DDDD superfamily of phosphohydrolases. The crystal structure of AphA has been determined at 2.2A and its resolution extended to 1.7A. Despite the lack of sequence homology, the AphA structure reveals a haloacid dehalogenase-like fold. This finding suggests that this fold could be conserved among members of the DDDD superfamily of phosphohydrolases. The active enzyme is a homotetramer built by using an extended N-terminal arm intertwining the four monomers. The active site of the native enzyme hosts a magnesium ion, which can be replaced by other metal ions. The structure explains the non-specific behaviour of AphA towards substrates, while a structure-based alignment with other phosphatases provides clues about the catalytic mechanism . ; GO: 0003993 acid phosphatase activity, 0030288 outer membrane-bounded periplasmic space. Probab=61.67 E-value=4.1 Score=20.93 Aligned_cols=28 Identities=29% Similarity=0.628 Sum_probs=22.7 Q ss_pred CCCCCCCEEEEEECCCCC---CEEEEEEEEC Q ss_conf 722658699998437888---6469999974 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKG---KAGQVMGVVR 29 (102) Q Consensus 2 ~kikkGD~V~VisGkdKG---k~G~V~~V~~ 29 (102) |..++||.|.-++|+-.| |.|+|-.|-+ T Consensus 127 MH~~RGD~i~F~TGRt~gsmykkGk~d~v~k 157 (248) T TIGR01672 127 MHQKRGDKIFFVTGRTAGSMYKKGKVDKVAK 157 (248) T ss_pred HHHHHCCEEEEEECCCCCCCCCCCCCCCCCH T ss_conf 8876098799984687644332562143330 No 32 >pfam01157 Ribosomal_L21e Ribosomal protein L21e. Probab=59.58 E-value=13 Score=18.17 Aligned_cols=31 Identities=26% Similarity=0.383 Sum_probs=23.6 Q ss_pred CCCCCCCCEEEEEECC----------CCCCEEEEEEEECCC Q ss_conf 9722658699998437----------888646999997469 Q gi|254780250|r 1 MEKIRTGDRVLVLAGK----------DKGKAGQVMGVVRKS 31 (102) Q Consensus 1 M~kikkGD~V~VisGk----------dKGk~G~V~~V~~k~ 31 (102) |..++.||.|-+..-. +-|++|+|..|.+.. 
T Consensus 30 l~~f~~GD~V~I~idpsv~kGMPh~~yhGkTG~V~nv~~~~ 70 (99) T pfam01157 30 LREYKVGDYVDIKINGSVQKGMPHKRFHGKTGRVYNVTPGA 70 (99) T ss_pred HHHCCCCCEEEEEECCCEECCCCCCEECCCCEEEEEECCCC T ss_conf 86436998899963486006997422158864799845860 No 33 >TIGR01553 formate-DH-alph formate dehydrogenase, alpha subunit; InterPro: IPR006443 This family of sequences describes a subset of formate dehydrogenase alpha chains found mainly in proteobacteria but also in Aquifex aeolicus. The alpha chain contains domains for molybdopterin dinucleotide binding and molybdopterin oxidoreductase. The holo-enzyme also contains beta and gamma subunits of 32 and 20 kDa. The enzyme catalyzes the oxidation of formate (produced from pyruvate during anaerobic growth) to carbon dioxide with the concomitant release of two electrons and two protons. The electrons are utilised mainly in nitrate respiration by nitrate reductase. In Escherichia coli and Salmonella typhi, there are two forms of the formate dehydrogenase, one induced by nitrate which is strictly anaerobic (fdn), and one induced during the transition from aerobic to anaerobic growth (fdo). This subunit is one of only three proteins in Escherichia coli which contain selenocysteine.; GO: 0008863 formate dehydrogenase activity, 0045333 cellular respiration, 0005737 cytoplasm. 
Probab=59.55 E-value=15 Score=17.86 Aligned_cols=50 Identities=22% Similarity=0.233 Sum_probs=35.2 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEE Q ss_conf 2658699998437888646999997469989990605974320577765651799970468665788 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSL 70 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~l 70 (102) |++||.|+|-| -|-.++-+-+++|..||=.=+--....+=.|||+.==.| T Consensus 958 I~nGD~V~~~s-----------------~Rg~i~A~A~vTKRiKpl~i~G~~Vh~iGiPiHwg~~~l 1007 (1043) T TIGR01553 958 IQNGDLVKVES-----------------VRGKIKAVAVVTKRIKPLKIDGKVVHTIGIPIHWGFEAL 1007 (1043) T ss_pred CCCCCEEEEEE-----------------CCEEEEEEEEEEECCCCCEECCEEEEEEECCCCCCCCCC T ss_conf 65478689972-----------------251489999973046751326827999824621076366 No 34 >PRK12289 ribosome-associated GTPase; Reviewed Probab=55.75 E-value=18 Score=17.50 Aligned_cols=33 Identities=24% Similarity=0.363 Sum_probs=27.1 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEE Q ss_conf 226586999984378886469999974699899 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAF 35 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~Vi 35 (102) ++-.||.|.|=.-...+.+|.|.+|+|.+|.++ T Consensus 50 ~v~VGD~V~ve~~d~~~~~G~I~~IlpRkn~L~ 82 (351) T PRK12289 50 QVMVGDRVVVEEPDWQGQRGAIAEVLPRRTELD 82 (351) T ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCCEE T ss_conf 876577799964178898188968706445151 No 35 >KOG3418 consensus Probab=52.39 E-value=20 Score=17.19 Aligned_cols=41 Identities=37% Similarity=0.637 Sum_probs=31.8 Q ss_pred CCC-CCCCCEEEEEECCCCCCEEEEEEEECCC------CEEEEECEEE Q ss_conf 972-2658699998437888646999997469------9899906059 Q gi|254780250|r 1 MEK-IRTGDRVLVLAGKDKGKAGQVMGVVRKS------GRAFVQGVNI 41 (102) Q Consensus 1 M~k-ikkGD~V~VisGkdKGk~G~V~~V~~k~------~~ViVeGiN~ 41 (102) |.+ ++-|--|.|+||.+.|+...|.+-+-.. 
.-++|+|+-- T Consensus 1 m~kflkPgkvv~v~sG~yAg~KaVivk~~Ddg~~d~p~~h~LvAgi~r 48 (136) T KOG3418 1 MAKFLKPGKVVLVLSGRYAGKKAVIVKNIDDGTEDKPYGHALVAGVDR 48 (136) T ss_pred CCCCCCCCCEEEEECCCCCCCCEEEEEECCCCCCCCCCCEEEEEEHHH T ss_conf 963356994787523543676379996302587568875335523223 No 36 >cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing. Probab=51.27 E-value=16 Score=17.75 Aligned_cols=42 Identities=36% Similarity=0.538 Sum_probs=32.7 Q ss_pred CCEEEEE--EECCCHHHEEEECCCCCEEEEEEEEECCEEEEEEC Q ss_conf 6517999--70468665788978996228899999997999981 Q gi|254780250|r 53 EAGIISK--EASIHLSNLSLIDKDGKQVRVGFSFVDGKKIRIAK 94 (102) Q Consensus 53 ~gGii~~--E~pIh~SNV~lvd~~~k~trv~~~~~dG~kvRv~k 94 (102) .|-+.+. -+-+|.++|.+.++++++.....-+.-|..+||.- T Consensus 33 ~G~L~~vD~~MN~~L~~v~~t~~~g~~~~l~~~~IRGs~IRyv~ 76 (78) T cd01733 33 TGRIASVDAFMNIRLAKVTIIDRNGKQVQVEEIMVTGRNIRYVH 76 (78) T ss_pred EEEEEEECCCCCCEEEEEEEECCCCCEEECCEEEECCCEEEEEE T ss_conf 99999874462669945999937998778777999176689998 No 37 >PTZ00118 40S ribosomal protein S4; Provisional Probab=51.20 E-value=21 Score=17.08 Aligned_cols=28 Identities=18% Similarity=0.541 Sum_probs=13.7 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECCC Q ss_conf 2658699998437888646999997469 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRKS 31 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k~ 31 (102) +..|..+.|..|+.-|..|+|..+-+.. 
T Consensus 175 fe~G~~~~vtgG~n~GrvG~I~~ie~~~ 202 (262) T PTZ00118 175 FEVGNLVMITGGHNVGRVGTIVSKEKHP 202 (262) T ss_pred CCCCCEEEEECCCCCCEEEEEEEEEECC T ss_conf 2899899998984552689999996138 No 38 >PRK04306 50S ribosomal protein L21e; Reviewed Probab=50.80 E-value=21 Score=17.04 Aligned_cols=35 Identities=26% Similarity=0.593 Sum_probs=22.7 Q ss_pred CCCCCCCCEEEEEEC----------CCCCCEEEEEEEECCCCEEEEE Q ss_conf 972265869999843----------7888646999997469989990 Q gi|254780250|r 1 MEKIRTGDRVLVLAG----------KDKGKAGQVMGVVRKSGRAFVQ 37 (102) Q Consensus 1 M~kikkGD~V~VisG----------kdKGk~G~V~~V~~k~~~ViVe 37 (102) |..++.||.|-+..- .+-|++|+|.. +....++|+ T Consensus 31 l~~f~~GD~V~I~idpsv~kGmPh~ryhGkTG~V~~--~~g~a~~V~ 75 (97) T PRK04306 31 LQEFEEGDKVHIVIDPSVHKGMPHPRFHGKTGTVVG--KRGRAYIVE 75 (97) T ss_pred HHHCCCCCEEEEEECCCCCCCCCCCEECCCCEEEEE--ECCEEEEEE T ss_conf 874779988999868862069985404687559993--145399999 No 39 >KOG4315 consensus Probab=50.13 E-value=6.7 Score=19.76 Aligned_cols=24 Identities=25% Similarity=0.338 Sum_probs=19.3 Q ss_pred CCEEEEEECCCCCCEEEEEEEECC Q ss_conf 869999843788864699999746 Q gi|254780250|r 7 GDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 7 GD~V~VisGkdKGk~G~V~~V~~k 30 (102) |=.|.+++|.++|.-|+|+..... T Consensus 232 g~~vr~~~g~~~~~~~ki~~~~~S 255 (455) T KOG4315 232 GVAVRMAIGGKKGLFGKIVCLPVS 255 (455) T ss_pred CEEEEEEECCCCCCCCEEEECCCC T ss_conf 536999646652110215741576 No 40 >cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. 
The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=47.43 E-value=24 Score=16.78 Aligned_cols=42 Identities=19% Similarity=0.432 Sum_probs=32.6 Q ss_pred CCEEEEE--EECCCHHHEEEECCCCCEEEEEEEEECCEEEEEEC Q ss_conf 6517999--70468665788978996228899999997999981 Q gi|254780250|r 53 EAGIISK--EASIHLSNLSLIDKDGKQVRVGFSFVDGKKIRIAK 94 (102) Q Consensus 53 ~gGii~~--E~pIh~SNV~lvd~~~k~trv~~~~~dG~kvRv~k 94 (102) .|-+.+. -+-+|.+||.+..++|++.+...-+.-|..+||.. T Consensus 24 ~G~L~~~d~~MN~~L~~v~~t~~~g~~~~l~~v~IRGs~Ir~i~ 67 (70) T cd01721 24 RGKLIEAEDNMNCQLKDVTVTARDGRVSQLEQVYIRGSKIRFFI 67 (70) T ss_pred EEEEEEEECCCCCEEEEEEEECCCCCEEECCEEEECCCEEEEEE T ss_conf 99998870236749989999988998975665999076589998 No 41 >TIGR02772 Ku_bact Ku protein; InterPro: IPR009187 This superfamily consists of prokaryotic Ku-domain-containing proteins. In the eukaryotes it has been shown that the Ku protein is involved in repairing DNA double-strand breaks by non-homologous end-joining. The Ku protein is a heterodimer of approximately 70 kDa and 80 kDa subunits. Both these subunits have strong sequence similarity and it has been suggested that they may have evolved by gene duplication from a homodimeric ancestor in eukaryotes. The prokaryotic Ku members are homodimers and they have been predicted to be involved in the DNA repair system, which is mechanistically similar to eukaryotic non-homologous end-joining. Recent findings have implicated yeast Ku in telomeric structure maintenance in addition to non-homologous end-joining. Some of the phenotypes of the Ku-knockout mice may indicate a similar role for Ku at mammalian telomeres. Evolutionary notes: With currently available phyletic information it is difficult to determine the correct evolutionary trajectory of the Ku domain. 
It is possible that the core Ku domain was present in bacteria and archaea even before the presence of the eukaryotes. Eukaryotes might have vertically inherited the Ku-core protein from a common ancestor shared with a certain archaeal lineage, or acquired it through horizontal transfer from bacteria. Alternatively, the core Ku domain could have evolved in the eukaryotic lineage and then been horizontally transferred to the prokaryotes. Sequencing of additional archaeal genomes and those of early-branching eukaryotes may help resolve the evolutionary history of the Ku domain. Structure notes: The eukaryotic Ku heterodimer comprises an alpha/beta N-terminal domain, a central beta-barrel domain and a helical C-terminal arm. Structural analysis of the Ku70/80 heterodimer bound to DNA indicates that subunit contacts lead to the formation of a highly charged channel through which the DNA passes without making any contacts with the DNA bases. Probab=46.14 E-value=9.1 Score=19.05 Aligned_cols=34 Identities=15% Similarity=0.541 Sum_probs=23.8 Q ss_pred EEEECCCCCEEEEEEEEE---CCEEE---EEECCCCCEEC Q ss_conf 788978996228899999---99799---99815687715 Q gi|254780250|r 68 LSLIDKDGKQVRVGFSFV---DGKKI---RIAKRSGEPID 101 (102) Q Consensus 68 V~lvd~~~k~trv~~~~~---dG~kv---Rv~kksg~~id 101 (102) |.|+..+....+|.|... 
+|.+| |||..||++++ T Consensus 18 V~Ly~AT~~~~~i~F~~l~~~~~~rv~y~~V~~~tG~~V~ 57 (271) T TIGR02772 18 VKLYPATESSEDISFHLLHREDGNRVRYRKVDEETGKEVE 57 (271) T ss_pred EEEEECCCCCCCCCCCCCCHHHCCCCCEEEECCCCCCCCC T ss_conf 5762051223455534230402895231630652375357 No 42 >PRK00409 recombination and DNA strand exchange inhibitor protein; Reviewed Probab=45.61 E-value=26 Score=16.58 Aligned_cols=38 Identities=18% Similarity=0.302 Sum_probs=26.4 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCC-EEEEECEEEEE Q ss_conf 226586999984378886469999974699-89990605974 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSG-RAFVQGVNIVK 43 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~-~ViVeGiN~~k 43 (102) .++.||.|.|.+- |+.|+|+++..+++ .|-+.++.+.- T Consensus 634 ~~~~Gd~V~v~~~---~~~G~V~~i~~~~~~~V~~g~~k~~v 672 (780) T PRK00409 634 ELKVGDEVKYLSL---GQKGEVLSIPDNKEAIVQAGIMKMKV 672 (780) T ss_pred CCCCCCEEEECCC---CCEEEEEEECCCCEEEEEECCEEEEE T ss_conf 9999998998579---96799999869981999979769997 No 43 >PTZ00223 40S ribosomal protein S4; Provisional Probab=44.40 E-value=27 Score=16.48 Aligned_cols=28 Identities=32% Similarity=0.703 Sum_probs=14.8 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECC Q ss_conf 2265869999843788864699999746 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k 30 (102) ++..|..+.|+.|+.-|..|+|..+-+. 
T Consensus 171 kfe~G~l~~itgG~n~GrvG~I~~ie~~ 198 (273) T PTZ00223 171 KNRNGKVVMVTGGANRGRIGEIVSIERH 198 (273) T ss_pred ECCCCCEEEEECCCCCCEEEEEEEEEEC T ss_conf 3389989999898325517899889864 No 44 >COG3700 AphA Acid phosphatase (class B) [General function prediction only] Probab=41.22 E-value=9.1 Score=19.06 Aligned_cols=33 Identities=24% Similarity=0.380 Sum_probs=27.9 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEE Q ss_conf 722658699998437888646999997469989 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRA 34 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~V 34 (102) |..++||.+.-++|+-.||+-+|.+.+.+.=.+ T Consensus 125 MHq~RGD~i~FvTGRt~gk~d~vsk~Lak~F~i 157 (237) T COG3700 125 MHQRRGDAIYFVTGRTPGKTDTVSKTLAKNFHI 157 (237) T ss_pred HHHHCCCEEEEEECCCCCCCCCCCHHHHHHCCC T ss_conf 998538848999367787543211667853465 No 45 >TIGR01955 RfaH transcriptional activator RfaH; InterPro: IPR010215 This entry represents the transcriptional activator protein, RfaH . This protein is most closely related to the transcriptional termination/antitermination protein NusG (IPR001062 from INTERPRO) and contains the KOW motif (IPR005824 from INTERPRO) . This protein appears to be limited to the proteobacteria. In Escherichia coli, this gene appears to control the expression of haemolysin, sex factor and lipopolysaccharide genes.; GO: 0003711 transcription elongation regulator activity, 0006355 regulation of transcription DNA-dependent. Probab=39.96 E-value=23 Score=16.87 Aligned_cols=28 Identities=18% Similarity=0.251 Sum_probs=22.2 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECC Q ss_conf 2265869999843788864699999746 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k 30 (102) .+.+||+|+|..|...|-++.=+.-+.. 
T Consensus 111 ~~~~G~~V~i~~G~fag~EAIF~~~dG~ 138 (162) T TIGR01955 111 LFKKGDKVRITDGSFAGLEAIFLEPDGE 138 (162) T ss_pred CCCCCCEEEEEECCCCCCCEEEECCCCC T ss_conf 7789887998628713600354078842 No 46 >cd03691 BipA_TypA_II BipA_TypA_II: domain II of BipA (also called TypA) having homology to domain II of the elongation factors (EFs) EF-G and EF-Tu. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. BipA is phosphorylated on a tyrosine residue under some cellular conditions. Mutants show altered regulation of some pathways. BipA functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. Probab=39.63 E-value=27 Score=16.47 Aligned_cols=28 Identities=14% Similarity=0.264 Sum_probs=19.5 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECC Q ss_conf 2265869999843788864699999746 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRK 30 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k 30 (102) ++++||.|.++.-..+.+.++|.+++.- T Consensus 26 ~lk~gd~v~~~~~~~~~~~~rv~~l~~~ 53 (86) T cd03691 26 TVKVGQQVAVVKRDGKIEKAKITKLFGF 53 (86) T ss_pred CCCCCCEEEEECCCCCEEECCCEEEEEE T ss_conf 5179998999616782676223076896 No 47 >pfam03144 GTP_EFTU_D2 Elongation factor Tu domain 2. Elongation factor Tu consists of three structural domains; this is the second domain. This domain adopts a beta barrel structure. This second domain is involved in binding to charged tRNA. This domain is also found in other proteins such as elongation factor G and translation initiation factor IF-2. 
This domain is structurally related to pfam03143, and in fact has weak sequence matches to this domain. Probab=37.76 E-value=35 Score=15.89 Aligned_cols=29 Identities=28% Similarity=0.518 Sum_probs=20.0 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCC Q ss_conf 226586999984378886469999974699 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSG 32 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~ 32 (102) .|++||.|.+. .....+..+|.+++...+ T Consensus 12 ~lk~gd~v~~~-~~~~~~~~kV~~l~~~~~ 40 (70) T pfam03144 12 TLKKGDKVVIG-PNGTGKKGRVTSLEMFHG 40 (70) T ss_pred EEECCCEEEEE-CCCCCCCEEEEEEEEECC T ss_conf 89659999993-699622137718999775 No 48 >PRK11009 aphA acid phosphatase/phosphotransferase; Provisional Probab=37.19 E-value=4.6 Score=20.65 Aligned_cols=31 Identities=19% Similarity=0.384 Sum_probs=25.4 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCC Q ss_conf 7226586999984378886469999974699 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSG 32 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~ 32 (102) |..++||.+.-|+|+..++.-++.+.+.+.= T Consensus 125 MH~~RGD~IyFITGRt~~~~e~~t~~L~~~F 155 (235) T PRK11009 125 MHVKRGDSIYFITGRTQTKTETVSKTLAKNF 155 (235) T ss_pred HHHHCCCEEEEEECCCCCCCCHHHHHHHHHH T ss_conf 9997299599995888887514889999871 No 49 >PRK08515 flgA flagellar basal body P-ring biosynthesis protein FlgA; Reviewed Probab=36.57 E-value=36 Score=15.78 Aligned_cols=16 Identities=19% Similarity=0.372 Sum_probs=9.5 Q ss_pred CEEEEEECCCCCEECC Q ss_conf 9799998156877159 Q gi|254780250|r 87 GKKIRIAKRSGEPIDG 102 (102) Q Consensus 87 G~kvRv~kksg~~id~ 102 (102) |+.+|+--.||+++.| T Consensus 203 Gd~IrVkN~S~Kvv~a 218 (229) T PRK08515 203 GDIIQAKNKSNKILKA 218 (229) T ss_pred CCEEEEECCCCCEEEE T ss_conf 9889999488999999 No 50 >pfam04452 Methyltrans_RNA RNA methyltransferase. RNA methyltransferases modify nucleotides during ribosomal RNA maturation in a site-specific manner. The Escherichia coli member is specific for U1498 methylation. 
Probab=36.25 E-value=37 Score=15.75 Aligned_cols=36 Identities=14% Similarity=0.270 Sum_probs=28.8 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEE Q ss_conf 722658699998437888646999997469989990 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQ 37 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVe 37 (102) +|++.||.+.|.-|...--.++|.++.++.-.+.+. T Consensus 15 lR~k~gd~i~v~dg~g~~~~~~I~~i~~~~~~~~i~ 50 (225) T pfam04452 15 LRLKEGDEIKLFDGDGGEYLAEIEEISKKSVLVKIL 50 (225) T ss_pred CCCCCCCEEEEEECCCCEEEEEEEEECCCCEEEEEE T ss_conf 858999999999798989999999951881899650 No 51 >PRK06804 flgA flagellar basal body P-ring biosynthesis protein FlgA; Reviewed Probab=35.60 E-value=38 Score=15.70 Aligned_cols=13 Identities=31% Similarity=0.565 Sum_probs=6.3 Q ss_pred CCCCCEEEEEECC Q ss_conf 2658699998437 Q gi|254780250|r 4 IRTGDRVLVLAGK 16 (102) Q Consensus 4 ikkGD~V~VisGk 16 (102) |++||.|.+++.. T Consensus 210 V~rGq~V~IiA~~ 222 (272) T PRK06804 210 VTRNQHVLMLAAQ 222 (272) T ss_pred EECCCEEEEEEEC T ss_conf 9269989999915 No 52 >TIGR01497 kdpB K+-transporting ATPase, B subunit; InterPro: IPR006391 These sequences describe the P-type ATPase subunit of the complex responsible for translocating potassium ions across biological membranes in microbes. In Escherichia coli and other species, this complex consists of the proteins KdpA, KdpB, KdpC and KdpF. KdpB is the ATPase subunit, while KdpA is the potassium-ion translocating subunit . The function of KdpC is unclear, although it has been suggested to couple the ATPase subunit to the ion-translocating subunit , while KdpF serves to stabilize the complex . The potassium P-type ATPases have been characterised as Type IA based on a phylogenetic analysis which places this clade closest to the heavy-metal translocating ATPases (Type IB) . Others place this clade closer to the Na+/K+ antiporter type (Type IIC) based on physical characteristics . 
; GO: 0008556 potassium-transporting ATPase activity, 0006813 potassium ion transport, 0016021 integral to membrane. Probab=35.00 E-value=22 Score=16.97 Aligned_cols=26 Identities=42% Similarity=0.758 Sum_probs=18.0 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECE Q ss_conf 2265869999843788864699999746998999060 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGV 39 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGi 39 (102) .+|+||.|.|.+|. |+|-.+.|+ +|+ T Consensus 123 ~LkkGD~VlVeaGD----------vIP~DGEVi-~Gv 148 (675) T TIGR01497 123 ELKKGDVVLVEAGD----------VIPADGEVI-EGV 148 (675) T ss_pred HCCCCCEEEEECCC----------EECCCCCEE-CCC T ss_conf 32578889996383----------725997476-451 No 53 >cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes. Probab=34.53 E-value=35 Score=15.88 Aligned_cols=44 Identities=23% Similarity=0.228 Sum_probs=30.7 Q ss_pred CCEEEEE--EECCCHHHEEEECCCCCEEEEEEEEECCEEEEEECCC Q ss_conf 6517999--7046866578897899622889999999799998156 Q gi|254780250|r 53 EAGIISK--EASIHLSNLSLIDKDGKQVRVGFSFVDGKKIRIAKRS 96 (102) Q Consensus 53 ~gGii~~--E~pIh~SNV~lvd~~~k~trv~~~~~dG~kvRv~kks 96 (102) .|-+++. 
-+-+|.+||.+.++.+.+..+...+.-|..+||.--- T Consensus 25 ~G~L~~vd~~MN~~L~~v~~t~~~~~~~~l~~~~IRGs~IRyi~lP 70 (90) T cd01724 25 HGTITGVDPSMNTHLKNVKLTLKGRNPVPLDTLSIRGNNIRYFILP 70 (90) T ss_pred EEEEEEECCCCEEEEEEEEEECCCCCEEECCEEEECCCCEEEEECC T ss_conf 9999881378201898899977999877877499957738999887 No 54 >KOG1086 consensus Probab=31.22 E-value=17 Score=17.55 Aligned_cols=45 Identities=9% Similarity=0.183 Sum_probs=29.7 Q ss_pred EEEEEECCCCCCCCEEEEEEECCCHHHEEEEC-CCCCEEEEEEEEE Q ss_conf 97432057776565179997046866578897-8996228899999 Q gi|254780250|r 41 IVKRHQRQTPNKEAGIISKEASIHLSNLSLID-KDGKQVRVGFSFV 85 (102) Q Consensus 41 ~~kkh~k~~~~~~gGii~~E~pIh~SNV~lvd-~~~k~trv~~~~~ 85 (102) |+.+-++++..+-.-+-.+--|-.++.|+|++ |...+.|.+|++. T Consensus 523 mkvkLQp~sgteL~~Fspi~ppaaitqvlllanp~ke~vrlryklt 568 (594) T KOG1086 523 MKVKLQPPSGTELPAFSPIMPPAAITQVLLLANPHKEKVRLRYKLT 568 (594) T ss_pred EEEECCCCCCCCCCCCCCCCCHHHHHHHHHHCCCCCCCEEEEEEEE T ss_conf 0454468876657777888886998899985385535436889988 No 55 >KOG3421 consensus Probab=30.57 E-value=46 Score=15.22 Aligned_cols=34 Identities=35% Similarity=0.533 Sum_probs=29.3 Q ss_pred CCCCEEEEEECCCCCCEEEEEEEECCCCEEEEECE Q ss_conf 65869999843788864699999746998999060 Q gi|254780250|r 5 RTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQGV 39 (102) Q Consensus 5 kkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVeGi 39 (102) -.|--+.|-.|.|.|+.-.|..|+-. |++.++|- T Consensus 8 eVGrva~v~~G~~~GkL~AIVdviDq-nr~lvDGp 41 (136) T KOG3421 8 EVGRVALVSFGPDAGKLVAIVDVIDQ-NRALVDGP 41 (136) T ss_pred EECEEEEEEECCCCCEEEEEEEEECC-HHHHCCCC T ss_conf 00349999706777608999986253-23530486 No 56 >PRK12618 flgA flagellar basal body P-ring biosynthesis protein FlgA; Reviewed Probab=29.64 E-value=48 Score=15.13 Aligned_cols=15 Identities=27% Similarity=0.432 Sum_probs=7.5 Q ss_pred CEEEEE-ECCCCCEEC Q ss_conf 979999-815687715 Q gi|254780250|r 87 GKKIRI-AKRSGEPID 101 (102) Q Consensus 87 G~kvRv-~kksg~~id 101 (102) |+-+|+ .-.|+.+|. 
T Consensus 110 Gd~IrV~N~~S~kiV~ 125 (138) T PRK12618 110 GDVIRVMNLSSRTTVS 125 (138) T ss_pred CCEEEEEECCCCCEEE T ss_conf 9989999889999999 No 57 >cd05705 S1_Rrp5_repeat_hs14 S1_Rrp5_repeat_hs14: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 14 (hs14). Rrp5 is found in eukaryotes but not in prokaryotes or archaea. Probab=28.94 E-value=43 Score=15.37 Aligned_cols=12 Identities=8% Similarity=0.133 Sum_probs=6.7 Q ss_pred EEECCCHHHEEE Q ss_conf 970468665788 Q gi|254780250|r 59 KEASIHLSNLSL 70 (102) Q Consensus 59 ~E~pIh~SNV~l 70 (102) .++-+++||++- T Consensus 27 v~grv~~~nls~ 38 (74) T cd05705 27 IVGRVLFQNVTK 38 (74) T ss_pred EEEEEEEECCCH T ss_conf 489999700362 No 58 >cd04466 S1_YloQ_GTPase S1_YloQ_GTPase: YloQ GTPase family (also known as YjeQ and CpgA), S1-like RNA-binding domain. Proteins in the YloQ GTPase family bind the ribosome and have GTPase activity. The precise role of this family is unknown. The protein structure is composed of three domains: an N-terminal S1 domain, a central GTPase domain, and a C-terminal zinc finger domain. This N-terminal S1 domain binds ssRNA. The central GTPase domain contains nucleotide-binding signature motifs: G1 (Walker A), G3 (Walker B) and G4 motifs. Experiments show that the bacterial YloQ and YjeQ proteins have low intrinsic GTPase activity. 
The C-terminal zinc-finger domain has structural similarity to a portion of the DNA-repair protein Rad51. This suggests a possible role for this GTPase as a regulator of translation, perhaps as a translation initiation factor. This family is classified based on the N-terminal S1 domain. Probab=28.78 E-value=50 Score=15.05 Aligned_cols=12 Identities=8% Similarity=0.352 Sum_probs=4.5 Q ss_pred EEEEEEEECCCC Q ss_conf 469999974699 Q gi|254780250|r 21 AGQVMGVVRKSG 32 (102) Q Consensus 21 ~G~V~~V~~k~~ 32 (102) +|.|.+|+|-+| T Consensus 53 ~g~I~~i~pR~n 64 (68) T cd04466 53 EGVIEEILPRKN 64 (68) T ss_pred EEEEEEECCEEE T ss_conf 699989915043 No 59 >pfam02211 NHase_beta Nitrile hydratase beta subunit. Nitrile hydratases EC:4.2.1.84 are unusual metalloenzymes that catalyse the hydration of nitriles to their corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and they contain one iron atom per alpha beta unit. Probab=28.50 E-value=50 Score=15.02 Aligned_cols=52 Identities=27% Similarity=0.339 Sum_probs=31.2 Q ss_pred CCCCCCEEEEEECCC----------CCCEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEEE Q ss_conf 226586999984378----------886469999974699899906059743205777656517999704686657889 Q gi|254780250|r 3 KIRTGDRVLVLAGKD----------KGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSLI 71 (102) Q Consensus 3 kikkGD~V~VisGkd----------KGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~lv 71 (102) ++..||.|.|..-.. +|++|+|..+. 
| -|+-|..+..| .-|.|-|+..|.+- T Consensus 132 ~F~vGd~Vrv~~~~~~gHtRlP~Y~rgk~G~I~~~~---------G-----~~v~Pd~~A~g---~ge~p~~lY~V~F~ 193 (220) T pfam02211 132 RFAVGDRVRTRNINPNGHTRLPRYVRGKTGTIVRVH---------G-----AHVFPDSNAHG---LGEAPQPLYTVRFD 193 (220) T ss_pred CCCCCCEEEEEECCCCCCCCCHHHHCCCEEEEEEEE---------C-----CCCCCCHHCCC---CCCCCEEEEEEEEE T ss_conf 779999899822799975235367678745899884---------6-----87896121238---89986035899872 No 60 >TIGR01069 mutS2 MutS2 family protein; InterPro: IPR005747 Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.
MutS is a modular protein with a complex structure, and is composed of: N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease. Connector domain, which is similar in structure to Holliday junction resolvase ruvC. Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA. Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a beta-sheet structure. ATPase domain (connected to the core domain), which has a classical Walker A motif. HTH (helix-turn-helix) domain, which is involved in dimer contacts. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts. This entry represents a family of MutS proteins.; GO: 0003684 damaged DNA binding, 0005524 ATP binding, 0045005 maintenance of fidelity during DNA-dependent DNA replication. Probab=28.24 E-value=51 Score=15.00 Aligned_cols=37 Identities=22% Similarity=0.373 Sum_probs=27.8 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEECCCCE--EEEECEEEEE Q ss_conf 265869999843788864699999746998--9990605974 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVRKSGR--AFVQGVNIVK 43 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~k~~~--ViVeGiN~~k 43 (102) ++.||.+.|.+ .|+.|+|++|..+-+.
|.|...+|.- T Consensus 672 Fk~Gd~~~~~~---~g~kg~~~~~~~~g~~~~V~~g~~~m~v 710 (834) T TIGR01069 672 FKVGDKVKVES---FGQKGKIVEIKGKGNKWNVTVGLLRMKV 710 (834) T ss_pred CCCCCCCEEEE---CCCEEEEEEEECCCCEEEEEEEEEEEEE T ss_conf 73574011121---5864799998056557776652104431 No 61 >cd05882 Ig1_Necl-1 First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule-1 (Necl-1, also known as cell adhesion molecule 3 (CADM3)). Ig1_Necl-1: domain similar to the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-1, Necl-1 (also known as cell adhesion molecule 3 (CADM3), SynCAM2, IGSF4). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-1 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-1 is specifically expressed in neural tissue, and is important to the format Probab=25.94 E-value=56 Score=14.76 Aligned_cols=16 Identities=38% Similarity=0.497 Sum_probs=13.9 Q ss_pred EEEECCCHHHEEEECC Q ss_conf 9970468665788978 Q gi|254780250|r 58 SKEASIHLSNLSLIDK 73 (102) Q Consensus 58 ~~E~pIh~SNV~lvd~ 73 (102) ..|..|++|||.|.|.
T Consensus 59 ~~elsI~isnV~l~DE 74 (95) T cd05882 59 PTELIISISNVQLSDE 74 (95) T ss_pred CCEEEEEEEECCEECC T ss_conf 6218999802238648 No 62 >COG1471 RPS4A Ribosomal protein S4E [Translation, ribosomal structure and biogenesis] Probab=25.65 E-value=57 Score=14.73 Aligned_cols=26 Identities=23% Similarity=0.489 Sum_probs=13.0 Q ss_pred CCCCCEEEEEECCCCCCEEEEEEEEC Q ss_conf 26586999984378886469999974 Q gi|254780250|r 4 IRTGDRVLVLAGKDKGKAGQVMGVVR 29 (102) Q Consensus 4 ikkGD~V~VisGkdKGk~G~V~~V~~ 29 (102) +-.|-.|.|..|+.-|..|+|.++.. T Consensus 174 fe~g~~~~vtgG~h~G~~G~I~~I~~ 199 (241) T COG1471 174 FEEGALVYVTGGRHVGRVGTIVEIEI 199 (241) T ss_pred CCCCCEEEEECCCCCCCEEEEEEEEE T ss_conf 58986899977701352278999997 No 63 >pfam11962 DUF3476 Domain of unknown function (DUF3476). This presumed domain is functionally uncharacterized. This domain is found in bacteria and viruses. This domain is typically between 213 to 236 amino acids in length. This domain has a conserved VGL sequence motif. Probab=24.64 E-value=59 Score=14.62 Aligned_cols=28 Identities=32% Similarity=0.644 Sum_probs=20.1 Q ss_pred CCCCEEEEEEEE-ECCEEEEEECCCCCEE Q ss_conf 899622889999-9997999981568771 Q gi|254780250|r 73 KDGKQVRVGFSF-VDGKKIRIAKRSGEPI 100 (102) Q Consensus 73 ~~~k~trv~~~~-~dG~kvRv~kksg~~i 100 (102) .++.|..+|+-+ .||.|+|.+..+..+| T Consensus 64 ~dG~~i~~G~~Vtl~g~KIr~A~~~d~ii 92 (222) T pfam11962 64 LDGQPIDTGYLVTLDGDKIRKAQEGDDIL 92 (222) T ss_pred CCCCCCCCCEEEEEECCEEEECCCCCCEE T ss_conf 68994436369997299898467899478 No 64 >PRK05585 yajC preprotein translocase subunit YajC; Validated Probab=23.44 E-value=63 Score=14.49 Aligned_cols=29 Identities=24% Similarity=0.414 Sum_probs=23.2 Q ss_pred CCCCCCEEEEEECCCCCCEEEEEEEECCCCEEEEE Q ss_conf 22658699998437888646999997469989990 Q gi|254780250|r 3 KIRTGDRVLVLAGKDKGKAGQVMGVVRKSGRAFVQ 37 (102) Q Consensus 3 kikkGD~V~VisGkdKGk~G~V~~V~~k~~~ViVe 37 (102) .+++||.|.-.+| --|+|.++.. 
+.|.+| T Consensus 53 L~~Gd~VvT~gG----i~G~I~~v~d--~~v~le 81 (107) T PRK05585 53 SLAKGDEVVTNGG----IIGKVTKVSE--DFVIIE 81 (107) T ss_pred HCCCCCEEEECCC----CEEEEEEEEC--CEEEEE T ss_conf 4589999998998----5899999979--989999 No 65 >COG1862 YajC Preprotein translocase subunit YajC [Intracellular trafficking and secretion] Probab=23.18 E-value=63 Score=14.46 Aligned_cols=26 Identities=27% Similarity=0.533 Sum_probs=20.8 Q ss_pred CCCCCCCEEEEEECCCCCCEEEEEEEECCC Q ss_conf 722658699998437888646999997469 Q gi|254780250|r 2 EKIRTGDRVLVLAGKDKGKAGQVMGVVRKS 31 (102) Q Consensus 2 ~kikkGD~V~VisGkdKGk~G~V~~V~~k~ 31 (102) .-+++||+|.-++| -.|+|.+|.... T Consensus 42 ~sL~kGD~VvT~gG----i~G~V~~v~d~~ 67 (97) T COG1862 42 NSLKKGDEVVTIGG----IVGTVTKVGDDT 67 (97) T ss_pred HHCCCCCEEEECCC----EEEEEEEEECCC T ss_conf 74568998997587----399999970681 No 66 >TIGR00459 aspS_bact aspartyl-tRNA synthetase; InterPro: IPR004524 The aminoacyl-tRNA synthetases (EC 6.1.1.-) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred.
The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases. Aspartyl-tRNA synthetase (EC 6.1.1.12) is an alpha2 dimer that belongs to class IIb. Structural analysis combined with mutagenesis and enzymology data on the yeast enzyme points to a tRNA binding process that starts by a recognition event between the tRNA anticodon loop and the synthetase anticodon binding module. This family represents aspartyl-tRNA synthetases from bacteria and from mitochondria. In some species, this enzyme aminoacylates tRNA for both Asp and Asn; Asp-tRNA(asn) is subsequently transamidated to Asn-tRNA(asn).; GO: 0000166 nucleotide binding, 0004815 aspartate-tRNA ligase activity, 0005524 ATP binding, 0006412 translation, 0006422 aspartyl-tRNA aminoacylation, 0005737 cytoplasm. Probab=23.13 E-value=60 Score=14.61 Aligned_cols=75 Identities=21% Similarity=0.274 Sum_probs=46.7 Q ss_pred CCCCCCEEEEEECCC-CCCEEEEEEEECCCCEEEEECEEEEEEEECCCCCCCCEEEEEEECCCHHHEEEEC-CC-CCEEE Q ss_conf 226586999984378-8864699999746998999060597432057776565179997046866578897-89-96228 Q gi|254780250|r 3 KIRTGDRVLVLAGKD-KGKAGQVMGVVRKSGRAFVQGVNIVKRHQRQTPNKEAGIISKEASIHLSNLSLID-KD-GKQVR 79 (102) Q Consensus 3 kikkGD~V~VisGkd-KGk~G~V~~V~~k~~~ViVeGiN~~kkh~k~~~~~~gGii~~E~pIh~SNV~lvd-~~-~k~tr 79 (102) +=|.||+|||.+-++ .-..=.+.+=++...=|.|.|.=..+..-.-+.|-..|-+| |+.+.+.|++ ..
..|-- T Consensus 43 RD~~GdivQv~~~p~~~~~a~~~a~~lr~E~vv~v~G~v~~R~~~~~~~~l~tg~~E----i~~~~i~~~NG~s~~~P~~ 118 (653) T TIGR00459 43 RDRSGDIVQVVCDPDVSKDALELAKGLRNEDVVQVKGKVSARPEGSINRNLDTGEIE----ILAEEITLLNGKSKTPPLI 118 (653) T ss_pred ECCCCCEEEEEECCCCCHHHHHHHHHCCCCEEEEEEEEEEECCCCCCCCCCCCCEEE----EECCCEEEEECCCCCCCCE T ss_conf 258888899986775678899999733552289999999865853446556763488----9818626861212687940 Q ss_pred EE Q ss_conf 89 Q gi|254780250|r 80 VG 81 (102) Q Consensus 80 v~ 81 (102) +. T Consensus 119 ~~ 120 (653) T TIGR00459 119 IE 120 (653) T ss_pred EE T ss_conf 32 No 67 >TIGR00448 rpoE DNA-directed RNA polymerase; InterPro: IPR004519 DNA-directed RNA polymerases (EC 2.7.7.6; also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached.
The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family seems to be confined to archaeal and eukaryotic taxa, and its members are quite dissimilar to Escherichia coli RpoE.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006350 transcription, 0005634 nucleus. Probab=22.94 E-value=53 Score=14.89 Aligned_cols=34 Identities=18% Similarity=0.293 Sum_probs=20.1 Q ss_pred EEECCCHHHEE----EECCCCCE-E----E-EEEEEECCEEEEE Q ss_conf 97046866578----89789962-2----8-8999999979999 Q gi|254780250|r 59 KEASIHLSNLS----LIDKDGKQ-V----R-VGFSFVDGKKIRI 92 (102) Q Consensus 59 ~E~pIh~SNV~----lvd~~~k~-t----r-v~~~~~dG~kvRv 92 (102) ..+=+|+|||+ .+||..++ + + .+..++.|.+||. T Consensus 105 ~D~l~h~sq~~ddy~~YdPk~~~liGPmD~Etk~~ld~gd~vRa 148 (184) T TIGR00448 105 FDGLLHVSQVLDDYVVYDPKESALIGPMDKETKKVLDVGDKVRA 148 (184) T ss_pred CCCEEEEEEEEECCEEECCCCCCEECCCCHHCCCEEECCCEEEE T ss_conf 13234410011356366265660456740121735101675667 No 68 >pfam06431 Polyoma_lg_T_C Polyomavirus large T antigen C-terminus.
Probab=22.85 E-value=38 Score=15.69 Aligned_cols=28 Identities=32% Similarity=0.446 Sum_probs=21.5 Q ss_pred EEEEEEEE-CCCCCCCCEEEEE-EECCCHH Q ss_conf 05974320-5777656517999-7046866 Q gi|254780250|r 39 VNIVKRHQ-RQTPNKEAGIISK-EASIHLS 66 (102) Q Consensus 39 iN~~kkh~-k~~~~~~gGii~~-E~pIh~S 66 (102) +|+-+||+ |.+|=-+.||++. |+.|... T Consensus 243 VNLEKKH~NKrsQIFPPgIVTmNeY~iP~T 272 (417) T pfam06431 243 VNLEKKHLNKRTQIFPPGIVTMNEYSVPKT 272 (417) T ss_pred ECHHHHHCCCCCCCCCCCEEEECCCCCCHH T ss_conf 034453036653148996663056666266 No 69 >pfam11784 DUF3320 Protein of unknown function (DUF3320). This family is conserved in Proteobacteria and Chlorobi families. Many members are annotated as being putative DNA helicase-related proteins. Probab=21.56 E-value=14 Score=18.03 Aligned_cols=15 Identities=33% Similarity=0.499 Sum_probs=12.0 Q ss_pred EEEEEEECCCHHHEE Q ss_conf 179997046866578 Q gi|254780250|r 55 GIISKEASIHLSNLS 69 (102) Q Consensus 55 Gii~~E~pIh~SNV~ 69 (102) -|++.|+|||.+-+. T Consensus 19 ~IV~~EgPI~~~~l~ 33 (52) T pfam11784 19 HIVEVEGPIHEDELA 33 (52) T ss_pred HHHHHCCCCHHHHHH T ss_conf 999870773099999 No 70 >pfam11910 NdhO Cyanobacterial and plant NDH-1 subunit O. The proton-pumping NADH:ubiquinone oxidoreductase catalyzes the electron transfer from NADH to ubiquinone linked with proton translocation across the membrane. It is the largest, most complex and least understood of the respiratory chain enzymes and is referred to as Complex I. The subunit composition of the enzyme varies between groups of organisms. Complex I originating from mammalian mitochondria contains 45 different proteins, whereas in bacteria, the corresponding complex NDH-1 consists of 14 different polypeptides. Homologues of these 14 proteins are found among subunits of the mitochondrial complex I, and therefore bacterial NDH-1 might be considered a model proton-pumping NADH dehydrogenase with a minimal set of subunits. 
Escherichia coli NDH-1 readily disintegrates into 3 subcomplexes: a water-soluble NADH dehydrogenase fragment (NuoE, -F, and -G), the connecting fragment (NuoB, -C, -D, and -I), and the membrane fragment Probab=21.21 E-value=43 Score=15.40 Aligned_cols=16 Identities=25% Similarity=0.266 Sum_probs=12.9 Q ss_pred CCCCCEEEEEECCCCC Q ss_conf 2658699998437888 Q gi|254780250|r 4 IRTGDRVLVLAGKDKG 19 (102) Q Consensus 4 ikkGD~V~VisGkdKG 19 (102) +|||++|.|+..+.-+ T Consensus 1 lKKG~lVrv~re~~~n 16 (67) T pfam11910 1 LKKGSLVRVNREKYEN 16 (67) T ss_pred CCCCCEEEEEHHHHHC T ss_conf 9866379977787403 No 71 >cd04090 eEF2_II_snRNP Loc2 eEF2_C_snRNP, cd01514/C terminal domain:eEF2_C_snRNP: This family includes C-terminal portion of the spliceosomal human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and its yeast counterpart Snu114p. This domain is homologous to domain II of the eukaryotic translational elongation factor EF-2. Yeast Snu114p is essential for cell viability and for splicing in vivo. U5-116 kD binds GTP. Experiments suggest that GTP binding and probably GTP hydrolysis are important for the function of the U5-116 kD/Snu114p. In complex with GTP, EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E site, and the mRNA is shifted one codon relative to the ribosome. Probab=21.10 E-value=70 Score=14.22 Aligned_cols=14 Identities=36% Similarity=0.558 Sum_probs=10.5 Q ss_pred CCCCCEEEEEECCC Q ss_conf 26586999984378 Q gi|254780250|r 4 IRTGDRVLVLAGKD 17 (102) Q Consensus 4 ikkGD~V~VisGkd 17 (102) |++||+|.|+.-++ T Consensus 28 l~~G~~V~Vlg~~y 41 (94) T cd04090 28 IKKGQKVKVLGENY 41 (94) T ss_pred ECCCCEEEEECCCC T ss_conf 84899999979998 No 72 >cd05881 Ig1_Necl-2 First (N-terminal) immunoglobulin (Ig)-like domain of nectin-like molecule 2 (also known as cell adhesion molecule 1 (CADM1)).
Ig1_Necl-2: domain similar to the N-terminal immunoglobulin (Ig)-like domain of nectin-like molecule-2, Necl-2 (also known as cell adhesion molecule 1 (CADM1), SynCAM1, IGSF4A, Tslc1, sgIGSF, and RA175). Nectin-like molecules have similar domain structures to those of nectins. At least five nectin-like molecules have been identified (Necl-1 - Necl-5). They all have an extracellular region containing three Ig-like domains, a transmembrane region, and a cytoplasmic region. The N-terminal Ig-like domain of the extracellular region belongs to the V-type subfamily of Ig domains, is essential to cell-cell adhesion, and plays a part in the interaction with the envelope glycoprotein D of various viruses. Necl-2 has Ca(2+)-independent homophilic and heterophilic cell-cell adhesion activity. Necl-2 is expressed in a wide variety of tissues, and is a Probab=20.34 E-value=28 Score=16.39 Aligned_cols=16 Identities=50% Similarity=0.630 Sum_probs=13.6 Q ss_pred EEEECCCHHHEEEECC Q ss_conf 9970468665788978 Q gi|254780250|r 58 SKEASIHLSNLSLIDK 73 (102) Q Consensus 58 ~~E~pIh~SNV~lvd~ 73 (102) ..|..|++|||.+-|. T Consensus 59 ~~elsIsisnV~l~DE 74 (95) T cd05881 59 SNELRVSLSNVSLSDE 74 (95) T ss_pred CCCEEEEEEEEEEECC T ss_conf 7526999842517338 Done!