Query         018594
Match_columns 353
No_of_seqs    219 out of 346
Neff          4.9 
Searched_HMMs 46136
Date          Fri Mar 29 02:24:39 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/018594.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/018594hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG2837 Protein containing a U 100.0 7.6E-77 1.7E-81  559.7   5.3  256    1-315    54-309 (309)
  2 PF10357 Kin17_mid:  Domain of  100.0 3.4E-66 7.4E-71  444.4   7.6  125    1-125     3-127 (127)
  3 KOG4315 G-patch nucleic acid b  99.7 2.8E-18 6.2E-23  170.8   2.3  112  237-352   331-445 (455)
  4 KOG1999 RNA polymerase II tran  98.2 1.1E-06 2.4E-11   96.0   4.3  108  237-353   915-1022(1024)
  5 PF00467 KOW:  KOW motif;  Inte  97.3 0.00042   9E-09   46.3   4.5   31  299-329     1-31  (32)
  6 smart00739 KOW KOW (Kyprides,   96.8  0.0022 4.8E-08   40.5   4.0   26  297-322     2-27  (28)
  7 PRK12281 rplX 50S ribosomal pr  96.2  0.0087 1.9E-07   47.9   5.0   33  298-330     8-40  (76)
  8 CHL00141 rpl24 ribosomal prote  96.2  0.0091   2E-07   48.5   5.0   32  298-329    10-41  (83)
  9 TIGR00405 L26e_arch ribosomal   95.9    0.02 4.2E-07   50.0   6.2   51  297-351    87-139 (145)
 10 PRK00004 rplX 50S ribosomal pr  95.6   0.022 4.7E-07   48.1   5.0   33  298-330     6-38  (105)
 11 PRK01191 rpl24p 50S ribosomal   95.0   0.043 9.4E-07   47.6   5.2   32  298-329    47-78  (120)
 12 TIGR01079 rplX_bact ribosomal   95.0   0.043 9.2E-07   46.4   4.9   33  298-330     5-37  (104)
 13 PTZ00194 60S ribosomal protein  95.0   0.039 8.5E-07   49.2   4.9   33  298-330    48-80  (143)
 14 PRK05609 nusG transcription an  94.9   0.069 1.5E-06   47.8   6.4   53  297-352   127-180 (181)
 15 TIGR01080 rplX_A_E ribosomal p  94.9   0.044 9.4E-07   47.2   4.8   32  298-329    43-74  (114)
 16 TIGR00922 nusG transcription t  94.9   0.072 1.6E-06   47.5   6.4   50  297-350   120-171 (172)
 17 PRK08559 nusG transcription an  94.5     0.1 2.2E-06   46.5   6.4   50  297-350    95-146 (153)
 18 COG0250 NusG Transcription ant  93.4    0.19 4.2E-06   46.1   6.1   56  295-352   122-177 (178)
 19 PRK04313 30S ribosomal protein  92.5       1 2.2E-05   43.4   9.9   73  245-324   119-199 (237)
 20 TIGR01955 RfaH transcriptional  92.2    0.29 6.3E-06   42.9   5.4   49  297-349   109-158 (159)
 21 PLN00036 40S ribosomal protein  91.9     1.2 2.7E-05   43.4   9.7   71  245-322   123-200 (261)
 22 PRK09014 rfaH transcriptional   91.8    0.35 7.6E-06   42.8   5.5   50  297-351   110-161 (162)
 23 PTZ00118 40S ribosomal protein  91.6     1.5 3.2E-05   42.9   9.9   80  245-332   123-211 (262)
 24 PF11623 DUF3252:  Protein of u  91.6    0.72 1.6E-05   34.6   6.0   51  240-293     2-53  (53)
 25 COG0198 RplX Ribosomal protein  90.7    0.43 9.3E-06   40.6   4.6   30  298-329     6-35  (104)
 26 TIGR01956 NusG_myco NusG famil  89.9    0.81 1.8E-05   44.6   6.4   51  297-349   206-256 (258)
 27 COG5164 SPT5 Transcription elo  89.9    0.79 1.7E-05   48.1   6.6   70  262-334   298-387 (607)
 28 PTZ00223 40S ribosomal protein  89.5     2.4 5.1E-05   41.7   9.3   72  245-323   120-198 (273)
 29 COG1471 RPS4A Ribosomal protei  89.4     2.5 5.3E-05   40.8   9.1   53  282-334   156-213 (241)
 30 KOG1999 RNA polymerase II tran  85.0     3.5 7.7E-05   46.6   8.4   78  240-324   408-487 (1024)
 31 PF15591 Imm17:  Immunity prote  84.5     1.9 4.1E-05   34.6   4.5   50  245-294    10-63  (74)
 32 PF15057 DUF4537:  Domain of un  83.7      13 0.00028   32.2   9.7   97  247-351     3-111 (124)
 33 smart00743 Agenet Tudor-like d  78.0     7.4 0.00016   28.7   5.6   54  241-298     4-59  (61)
 34 smart00333 TUDOR Tudor domain.  73.7      13 0.00027   26.8   5.7   53  240-297     3-55  (57)
 35 PLN00045 photosystem I reactio  71.8     6.5 0.00014   33.1   4.2   58  292-349    35-97  (101)
 36 PRK08559 nusG transcription an  71.2      15 0.00032   32.7   6.7   53  240-297    95-150 (153)
 37 TIGR00405 L26e_arch ribosomal   70.5      12 0.00026   32.5   5.9   55  238-297    85-142 (145)
 38 TIGR02760 TraI_TIGR conjugativ  65.8      43 0.00093   41.3  11.0   89  238-333   680-786 (1960)
 39 PTZ00065 60S ribosomal protein  62.3      20 0.00044   31.7   5.6   36  299-338    10-46  (130)
 40 PRK04333 50S ribosomal protein  59.1      17 0.00037   29.7   4.3   28  298-325     5-32  (84)
 41 cd04508 TUDOR Tudor domains ar  59.0      18 0.00038   25.1   3.9   37  254-292    10-46  (48)
 42 PF02427 PSI_PsaE:  Photosystem  57.7      37  0.0008   26.4   5.6   53  297-349     1-57  (61)
 43 PF10615 DUF2470:  Protein of u  56.9     4.8  0.0001   32.0   0.7   36   53-88     16-55  (83)
 44 PTZ00471 60S ribosomal protein  56.5      18 0.00038   32.2   4.3   26  298-323     6-31  (134)
 45 PF10771 DUF2582:  Protein of u  55.6      12 0.00026   29.3   2.7   23   66-88     43-65  (65)
 46 smart00333 TUDOR Tudor domain.  55.3      49  0.0011   23.6   5.8   52  296-352     2-53  (57)
 47 PF01455 HupF_HypC:  HupF/HypC   53.9      43 0.00094   26.1   5.6   42  263-308     7-49  (68)
 48 PF01176 eIF-1a:  Translation i  51.5      59  0.0013   24.8   6.0   57  262-318     6-63  (65)
 49 PF00567 TUDOR:  Tudor domain;   51.2      82  0.0018   24.8   7.1   57  236-298    50-106 (121)
 50 CHL00010 infA translation init  50.9 1.3E+02  0.0027   24.0   8.8   62  261-323     9-73  (78)
 51 cd01734 YlxS_C YxlS is a Bacil  48.5      41 0.00088   26.8   4.9   51  297-351    22-76  (83)
 52 PRK14639 hypothetical protein;  47.4      39 0.00084   29.9   5.0   72  271-352    63-134 (140)
 53 COG2163 RPL14A Ribosomal prote  44.4      35 0.00075   30.0   4.1   28  297-324     5-32  (125)
 54 KOG1708 Mitochondrial/chloropl  40.7      40 0.00086   32.3   4.2   33  298-330    74-106 (236)
 55 TIGR01955 RfaH transcriptional  40.6      57  0.0012   28.4   5.0   52  237-293   106-159 (159)
 56 PF12872 OST-HTH:  OST-HTH/LOTU  39.6      27  0.0006   26.3   2.5   68   17-86      4-74  (74)
 57 TIGR00922 nusG transcription t  39.3      78  0.0017   28.0   5.7   53  236-293   116-171 (172)
 58 PRK14630 hypothetical protein;  38.1      73  0.0016   28.3   5.3   69  271-352    72-140 (143)
 59 PRK14633 hypothetical protein;  38.1      66  0.0014   28.7   5.0   70  272-352    70-143 (150)
 60 CHL00125 psaE photosystem I su  38.1      56  0.0012   25.6   3.9   52  298-349     3-58  (64)
 61 PRK14638 hypothetical protein;  37.6      60  0.0013   29.0   4.7   69  271-351    75-143 (150)
 62 PRK13709 conjugal transfer nic  37.3 2.4E+02  0.0053   34.8  10.8   88  238-332   647-749 (1747)
 63 PRK14637 hypothetical protein;  37.1      78  0.0017   28.4   5.4   47  298-352    96-143 (151)
 64 COG2163 RPL14A Ribosomal prote  35.8      68  0.0015   28.2   4.6   33  240-277     5-37  (125)
 65 PRK14636 hypothetical protein;  35.6      87  0.0019   28.8   5.5   71  271-352    73-147 (176)
 66 PRK04914 ATP-dependent helicas  35.0 2.8E+02   0.006   32.1  10.4   63  262-329    18-82  (956)
 67 cd08768 Cdc6_C Winged-helix do  34.6      26 0.00057   27.2   1.8   65   18-83      6-70  (87)
 68 PRK09014 rfaH transcriptional   34.4   1E+02  0.0022   27.1   5.6   52  238-294   108-161 (162)
 69 COG1096 Predicted RNA-binding   33.9 4.1E+02  0.0089   25.1   9.9   91  239-332     7-105 (188)
 70 PRK05609 nusG transcription an  33.8 1.4E+02   0.003   26.5   6.5   53  237-294   124-179 (181)
 71 KOG3421 60S ribosomal protein   33.4      43 0.00092   29.9   3.0   36  299-338     9-44  (136)
 72 PRK04012 translation initiatio  32.5 3.1E+02  0.0066   23.1   8.1   60  261-320    23-83  (100)
 73 PF02736 Myosin_N:  Myosin N-te  31.6 1.1E+02  0.0024   21.4   4.4   28  261-290    14-41  (42)
 74 cd05793 S1_IF1A S1_IF1A: Trans  31.4 2.7E+02  0.0058   22.2   7.9   56  263-318     4-60  (77)
 75 TIGR00523 eIF-1A eukaryotic/ar  30.9 2.6E+02  0.0056   23.5   7.2   58  243-305     8-66  (99)
 76 PF14505 DUF4438:  Domain of un  30.7      51  0.0011   32.2   3.3   35  299-333    60-94  (258)
 77 PF07076 DUF1344:  Protein of u  30.1 2.3E+02   0.005   22.1   6.1   43  261-308     5-49  (61)
 78 PRK04950 ProP expression regul  30.0   1E+02  0.0022   29.5   5.1   35  298-335   168-202 (213)
 79 TIGR00739 yajC preprotein tran  29.7   1E+02  0.0022   25.0   4.4   30  298-333    39-68  (84)
 80 PRK14631 hypothetical protein;  29.5 1.2E+02  0.0026   27.8   5.4   73  271-352    92-168 (174)
 81 PF04717 Phage_base_V:  Phage-r  29.3 1.2E+02  0.0027   23.4   4.7   45  263-307     1-54  (79)
 82 PF06003 SMN:  Survival motor n  29.1 1.4E+02  0.0029   29.2   6.0   57  237-298    67-124 (264)
 83 PRK14640 hypothetical protein;  28.7 1.1E+02  0.0025   27.2   5.0   69  272-352    73-145 (152)
 84 PF08863 YolD:  YolD-like prote  28.6 1.3E+02  0.0028   23.5   4.9   41  307-352    52-92  (92)
 85 COG5164 SPT5 Transcription elo  27.7      98  0.0021   33.1   4.9   44  243-292   355-398 (607)
 86 cd04456 S1_IF1A_like S1_IF1A_l  27.5 3.2E+02   0.007   21.8   8.0   48  263-310     4-52  (78)
 87 PRK14647 hypothetical protein;  27.4      96  0.0021   27.9   4.3   71  271-352    74-153 (159)
 88 PRK05585 yajC preprotein trans  27.0 1.1E+02  0.0025   25.8   4.4   30  298-333    54-83  (106)
 89 PRK02749 photosystem I reactio  26.9 1.1E+02  0.0024   24.5   4.0   39  298-336     4-46  (71)
 90 cd04451 S1_IF1 S1_IF1: Transla  26.9 2.7E+02  0.0059   20.7   7.9   24  294-317    38-61  (64)
 91 PRK00276 infA translation init  26.9   3E+02  0.0065   21.3   8.4   56  262-318    10-68  (72)
 92 PLN00208 translation initiatio  26.8 4.8E+02    0.01   23.6   8.5   60  263-322    36-96  (145)
 93 COG0779 Uncharacterized protei  25.9 1.5E+02  0.0032   26.9   5.2   45  299-351    98-146 (153)
 94 PRK00092 ribosome maturation p  25.8 1.1E+02  0.0024   27.1   4.4   71  271-351    73-147 (154)
 95 cd05696 S1_Rrp5_repeat_hs4 S1_  25.6 2.3E+02  0.0051   21.4   5.6   57  262-329     6-69  (71)
 96 KOG4235 Mitochondrial thymidin  25.4   1E+02  0.0022   29.8   4.2   44   51-100   128-171 (244)
 97 PRK14645 hypothetical protein;  25.2 1.3E+02  0.0027   27.2   4.6   68  271-351    77-144 (154)
 98 PRK14712 conjugal transfer nic  24.8 5.2E+02   0.011   31.9  10.7   86  241-332   518-617 (1623)
 99 COG3041 Uncharacterized protei  24.4      41 0.00089   28.1   1.3   18   14-31      5-22  (91)
100 PTZ00065 60S ribosomal protein  23.9 2.1E+02  0.0046   25.4   5.7   35  240-279     8-42  (130)
101 PTZ00329 eukaryotic translatio  23.8 5.5E+02   0.012   23.5   8.4   60  263-322    36-96  (155)
102 PF01421 Reprolysin:  Reprolysi  23.5      97  0.0021   27.9   3.6   53   18-70     26-78  (199)
103 PF02576 DUF150:  Uncharacteris  23.4   2E+02  0.0044   24.9   5.5   74  271-352    62-139 (141)
104 PHA02104 hypothetical protein   23.3      54  0.0012   26.5   1.7   30  235-269    34-64  (89)
105 PF12122 DUF3582:  Protein of u  23.3 1.4E+02  0.0031   25.0   4.3   27   64-90     15-42  (101)
106 PRK13316 heme-degrading monoox  22.8      58  0.0013   28.5   1.9   59    4-70      3-76  (121)
107 TIGR02059 swm_rep_I cyanobacte  22.6 4.3E+02  0.0094   22.6   7.0   61  285-348    35-97  (101)
108 PRK04333 50S ribosomal protein  22.1 1.6E+02  0.0034   24.1   4.2   31  241-276     5-35  (84)
109 PRK14634 hypothetical protein;  21.7 1.6E+02  0.0034   26.5   4.6   70  271-352    75-148 (155)
110 PRK14646 hypothetical protein;  21.5 1.7E+02  0.0037   26.2   4.8   70  271-352    75-148 (155)
111 PF02214 BTB_2:  BTB/POZ domain  21.2      24 0.00053   27.9  -0.7   20   75-96     34-53  (94)
112 PRK00411 cdc6 cell division co  21.2 1.1E+02  0.0024   30.1   3.9   66   32-97    312-381 (394)
113 PTZ00471 60S ribosomal protein  21.1 1.8E+02   0.004   26.0   4.7   29  239-272     4-32  (134)
114 PF08141 SspH:  Small acid-solu  21.0 1.7E+02  0.0036   22.5   3.9   37  310-349    20-56  (58)
115 PF04986 Y2_Tnp:  Putative tran  20.6      89  0.0019   28.5   2.8   57   14-71     47-112 (183)
116 PRK14635 hypothetical protein;  20.2 1.4E+02  0.0031   26.9   4.0   74  272-352    75-156 (162)
117 PRK10413 hydrogenase 2 accesso  20.1 3.2E+02  0.0069   22.2   5.6   44  263-308     7-54  (82)

No 1  
>KOG2837 consensus Protein containing a U1-type Zn-finger and implicated in RNA splicing or processing  [RNA processing and modification]
Probab=100.00  E-value=7.6e-77  Score=559.68  Aligned_cols=256  Identities=50%  Similarity=0.759  Sum_probs=213.0

Q ss_pred             CcccccCchhhHhhhHHHHHHHHHHHHHhccCCcccccceeeeeecccccceeecccccccHHHHHHHhcccccEEEeec
Q 018594            1 MQIFGQNPDRIVEGYSEEFEAGFLELMRRSHRFSRIAATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLGRTGKCKVEET   80 (353)
Q Consensus         1 m~lf~enp~~~i~~fS~eF~~~Fl~lLr~~~g~krV~aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lgr~G~c~vdet   80 (353)
                      |++|++||+++++.||.+|+.+||+|||++||+|||+||+||||||+||+|||||||+|.|||+||+||||+|+|+||+|
T Consensus        54 l~~~~~Np~~~~~~fs~eF~~dFl~LLr~~~g~KrI~aN~VYnEYI~dR~HvHMNaT~w~SLtefvk~LGR~Gkc~vdet  133 (309)
T KOG2837|consen   54 LLLFALNPGRSLERFSNEFEKDFLSLLRQRHGTKRIGANKVYNEYIADRNHVHMNATRWRSLTEFVKYLGRTGKCKVDET  133 (309)
T ss_pred             HHHHHhCcchhHHHhHHHHHHHHHHHHHHHhccceechhHHHHHHHccccceeecchhhhhHHHHHHHhccCceeeeecC
Confidence            68999999999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CceeEEEeecCChHHHHHHHHHHHHhhhcCCHHHHHHHHHHHHHHHHHhcCcCCCCCCCCCCCcchhhhhhhhhccccee
Q 018594           81 PKGWFITYIDRDSETLFKEKMKNKRIKLDMVDEERQEREIQKQIEIAASSSVSSTNPLSNSEDNTTRELNLEAAAAVGKV  160 (353)
Q Consensus        81 ekGw~I~yId~~pe~~~r~~~~~k~~~~~~~dee~~~~~i~~qi~r~~~~~~~~~~~~~~~~~~~~~el~r~~~~~~~ki  160 (353)
                      |+||||+|||++|+++.|+.+..++++++++|||+.+++|+.||.||++.      ++.+.+.+..+||.|++......+
T Consensus       134 ekgw~i~yIdk~petl~r~~~d~~r~rqe~~dEe~~~~~id~Qi~Rake~------g~~e~e~e~~~El~~d~~~~~~~v  207 (309)
T KOG2837|consen  134 EKGWFITYIDKFPETLKRIEEDLKRERQEKDDEERGADLIDGQIKRAKEQ------GEKEYEPEMNTELSRDGDDERKSV  207 (309)
T ss_pred             CCceEEEEeccChhhhcchhhHHHHHhhhhhHHHHHHHHHHHHHHHHHhc------cccccccccccccccCCccccccc
Confidence            99999999999999999999999999999999999999999999999974      111223444688988765111001


Q ss_pred             eeecCCCCCCCCCCCCCCCCCCCCcchhhhhcccccCCCCCCCcccCCCCCCCCCCChHHHHHHHHHHHHhhcCCCCCcc
Q 018594          161 GFALGSSYKDNVTSNGSGNNGSSSTRLVFEELDKDNNNNNNNNRKIDKNGSKVSGNSALEELMREEEKVKEKMNRKDYWL  240 (353)
Q Consensus       161 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~saLdeime~ee~kk~~~~r~~~WL  240 (353)
                      .-+...+..             ..|+...              ..+.+.+  .+++   ||||++||.+|        | 
T Consensus       208 ~~~~~~sk~-------------~~p~~kk--------------~~~~~~~--~~~r---dEi~~~ee~kk--------w-  246 (309)
T KOG2837|consen  208 VVSSALSKR-------------VNPKAKK--------------LPPDKDG--GKKR---DEIMKMEERKK--------W-  246 (309)
T ss_pred             eeeeeccCc-------------CChhhhc--------------CCCCccc--ccch---HHHHHhhhcCc--------e-
Confidence            111100000             1111100              0011111  1122   99999999775        8 


Q ss_pred             cCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceE
Q 018594          241 CEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNA  315 (353)
Q Consensus       241 ~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g  315 (353)
                          +|+||+++++. + |++||||.+|+|.|++.|+ +|+|++|+|||+|||||||+     |||||+|||..|
T Consensus       247 ----~vk~~sk~l~~-k-~K~K~vv~~vid~y~~~~K-ld~g~~lk~dq~~lEtvip~-----~~vng~yRg~~~  309 (309)
T KOG2837|consen  247 ----VVKVISKSLGE-K-YKQKGVVKKVIDDYTGQIK-LDSGTVLKVDQEHLETVIPQ-----MIVNGAYRGSEA  309 (309)
T ss_pred             ----EEEeehhhhhH-H-hccccHHHHHHHhhhhhee-ccCCceecccHHHHHHHhHH-----HHhhhhhccCCC
Confidence                99999999999 6 9999999999999999999 78999999999999999999     899999999754


No 2  
>PF10357 Kin17_mid:  Domain of Kin17 curved DNA-binding protein;  InterPro: IPR019447  This entry represents the conserved central 169-residue region of the Kin17 DNA/RNA-binding proteins. The N-terminal region of Kin17 contains a zinc-finger domain, while in the human and mouse proteins there is a RecA-like domain found in the C-terminal region. In humans, Kin17 protein forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle. ; PDB: 2V1N_A.
Probab=100.00  E-value=3.4e-66  Score=444.43  Aligned_cols=125  Identities=58%  Similarity=1.015  Sum_probs=89.9

Q ss_pred             CcccccCchhhHhhhHHHHHHHHHHHHHhccCCcccccceeeeeecccccceeecccccccHHHHHHHhcccccEEEeec
Q 018594            1 MQIFGQNPDRIVEGYSEEFEAGFLELMRRSHRFSRIAATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLGRTGKCKVEET   80 (353)
Q Consensus         1 m~lf~enp~~~i~~fS~eF~~~Fl~lLr~~~g~krV~aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lgr~G~c~vdet   80 (353)
                      |++||+||++||++||++|+++||+|||++||+|||+||+||||||+||+|||||||+|+|||+||+||||+|+|+||+|
T Consensus         3 m~~~~~n~~k~i~~yS~eFe~~Fl~lLr~~hg~krV~AN~vYnEyI~Dk~HvHMNaT~W~sLT~FvkyLgr~G~~~Vdet   82 (127)
T PF10357_consen    3 MLLFAENPGKFIDEYSEEFEKDFLRLLRRRHGTKRVNANKVYNEYIQDKDHVHMNATRWTSLTEFVKYLGREGKCKVDET   82 (127)
T ss_dssp             -------GGG-HHHHHHHHHHHHHHHHHHHTSS-EEEHHHHHHHHTTSS----GGGSS-SSHHHHHHHHTTTTSEEEEEE
T ss_pred             hHHHhhChhhHHHHHHHHHHHHHHHHHHHhcCCCeechhHHHHHHhcCccceeecccccchHHHHHHHHhhCCeeEeecC
Confidence            89999999999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CceeEEEeecCChHHHHHHHHHHHHhhhcCCHHHHHHHHHHHHHH
Q 018594           81 PKGWFITYIDRDSETLFKEKMKNKRIKLDMVDEERQEREIQKQIE  125 (353)
Q Consensus        81 ekGw~I~yId~~pe~~~r~~~~~k~~~~~~~dee~~~~~i~~qi~  125 (353)
                      |+||||+|||+||++++|+++..++++++++||||++++|++||+
T Consensus        83 ekg~~I~yID~~pe~l~r~~~~~k~~~~~~~dee~~~~~i~~Qi~  127 (127)
T PF10357_consen   83 EKGWFISYIDRSPETLARQEELAKKEKAEKDDEERERKLIEKQIE  127 (127)
T ss_dssp             TTEEEEEE--SSHHHHHHHHHTGGGT-------------------
T ss_pred             CCceEEEeeCCCHHHHHHHHHHHHHHHhhhhHHHHHHHHHHHhhC
Confidence            999999999999999999999999999999999999999999995


No 3  
>KOG4315 consensus G-patch nucleic acid binding protein [General function prediction only]
Probab=99.70  E-value=2.8e-18  Score=170.85  Aligned_cols=112  Identities=25%  Similarity=0.414  Sum_probs=103.0

Q ss_pred             CCcccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEE--EEcCCceeeecCC-CCCeEEEEeCCCCCc
Q 018594          237 DYWLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVL--RVDQDELETVIPQ-IGGLVRIVNGAYRGS  313 (353)
Q Consensus       237 ~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l--~vdq~~LETVIP~-~G~~V~IV~G~~RG~  313 (353)
                      ..||+.+|.|||+++.+..|+||.+|++|.+|.+..+|.|.|++++..+  .|+|++|||++|. .|.+||||.|.|.|.
T Consensus       331 k~wlR~dl~VR~is~d~Kgg~ly~~K~~i~dv~gp~scd~r~Dedq~~~qg~irq~~lET~~pr~~Ge~vmvv~gkhkg~  410 (455)
T KOG4315|consen  331 KSWLRSDLKVRFISKDVKGGRLYEKKVRIVDVVGPTSCDIRMDEDQELVQGNIRQELLETALPRRGGEKVMVVSGKHKGV  410 (455)
T ss_pred             chhhhcceeEEeeccccccchhhhcccceecccCCCccceeccccccccccchHHHHHhhhcccccCceeEEEecccccc
Confidence            3899999999999999999999999999999999999999999877777  3999999999995 677899999999999


Q ss_pred             eEEEEeeeCCccEEEEEEeccccCCceeeeccccccccc
Q 018594          314 NARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICKL  352 (353)
Q Consensus       314 ~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicKl  352 (353)
                      +|.|++.|.++.+++|++...   ..++ .+.||+||.|
T Consensus       411 ~g~llskd~~Ke~~~v~~~a~---ndvv-~~~~D~v~ey  445 (455)
T KOG4315|consen  411 YGSLLSKDLDKETGVVRLVAT---NDVV-TVYLDQVCEY  445 (455)
T ss_pred             hhhhhhhhhhhhhcceecccc---cchh-hhhHHHHHHh
Confidence            999999999999999998863   4555 4999999987


No 4  
>KOG1999 consensus RNA polymerase II transcription elongation factor DSIF/SUPT5H/SPT5 [Transcription]
Probab=98.20  E-value=1.1e-06  Score=95.95  Aligned_cols=108  Identities=19%  Similarity=0.227  Sum_probs=92.2

Q ss_pred             CCcccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceEE
Q 018594          237 DYWLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNAR  316 (353)
Q Consensus       237 ~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~  316 (353)
                      ..| ..++.+.+.+-+ .++...++.++|++|.++ .|+|.+-+.|+.+.+...+|+++.|..|+.++|+.|.++|.+|+
T Consensus       915 ~~~-~~~~~~~~~d~~-~~~~~~G~~~~ir~v~~G-~~sv~~~de~~~~~~s~~~~a~~~p~~~d~~k~~~g~~~g~~~~  991 (1024)
T KOG1999|consen  915 GNG-GDGNSSWGPDTS-LDTQLVGQTGIIRSVADG-GCSVWLGDEGETISNSKPHLAPAPPCKGDDVKSIWGDDRGSTGK  991 (1024)
T ss_pred             CCC-CccceEeccccc-ccceecccccceeeccCC-ceeeecCCCCcccccccccCccCCCCCCCCcccccccccccccc
Confidence            356 678888888754 455889999999999887 99999999999999999999999999999999999999999999


Q ss_pred             EEeeeCCccEEEEEEeccccCCceeeecccccccccC
Q 018594          317 LLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICKLA  353 (353)
Q Consensus       317 L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicKl~  353 (353)
                      |++.|..++.+.+...+    +  ++.+.+--+||+.
T Consensus       992 ~~~~dg~~g~~~~d~~~----~--~k~l~~~~~~k~~ 1022 (1024)
T KOG1999|consen  992 LVGNDGWDGIVRIDETS----D--IKILNLGLLCKMV 1022 (1024)
T ss_pred             ccCCCcccceecccccc----c--chhhhhhhhhhcc
Confidence            99999999766655433    2  4558888899873


No 5  
>PF00467 KOW:  KOW motif;  InterPro: IPR005824 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.  Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and the bacterial transcription antitermination proteins NusG. ; PDB: 3BBO_W 2HGJ_X 2HGQ_X 2HGU_X 1NPP_B 1M1G_D 1NPR_A 2XHC_A 2KVQ_G 2JVV_A ....
Probab=97.33  E-value=0.00042  Score=46.32  Aligned_cols=31  Identities=23%  Similarity=0.646  Sum_probs=28.3

Q ss_pred             CCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          299 IGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       299 ~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      +|+.|+|+.|+|+|..|+++++|.++..+.|
T Consensus         1 ~Gd~V~V~~G~~~G~~G~I~~i~~~~~~V~v   31 (32)
T PF00467_consen    1 VGDTVKVISGPFKGKIGKIVEIDRSKVRVTV   31 (32)
T ss_dssp             TTSEEEESSSTTTTEEEEEEEEETTTTEEEE
T ss_pred             CCCEEEEeEcCCCCceEEEEEEECCCCEEEE
Confidence            5899999999999999999999999976654


No 6  
>smart00739 KOW KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
Probab=96.81  E-value=0.0022  Score=40.50  Aligned_cols=26  Identities=31%  Similarity=0.703  Sum_probs=24.3

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeC
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDT  322 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~  322 (353)
                      |.+|+.|+|+.|.|+|..|++++++.
T Consensus         2 ~~~G~~V~I~~G~~~g~~g~i~~i~~   27 (28)
T smart00739        2 FEVGDTVRVIAGPFKGKVGKVLEVDG   27 (28)
T ss_pred             CCCCCEEEEeECCCCCcEEEEEEEcC
Confidence            57899999999999999999999975


No 7  
>PRK12281 rplX 50S ribosomal protein L24; Reviewed
Probab=96.22  E-value=0.0087  Score=47.89  Aligned_cols=33  Identities=24%  Similarity=0.451  Sum_probs=30.5

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVK  330 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~  330 (353)
                      ..|++|.|+.|.++|..|+++.++.++..+.|+
T Consensus         8 ~kGD~V~Vi~G~dKGK~G~V~~V~~~~~~V~Ve   40 (76)
T PRK12281          8 KKGDMVKVIAGDDKGKTGKVLAVLPKKNRVIVE   40 (76)
T ss_pred             cCCCEEEEeEcCCCCcEEEEEEEEcCCCEEEEc
Confidence            679999999999999999999999999887773


No 8  
>CHL00141 rpl24 ribosomal protein L24; Validated
Probab=96.19  E-value=0.0091  Score=48.52  Aligned_cols=32  Identities=25%  Similarity=0.395  Sum_probs=30.1

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      ..|++|.|+.|.++|..|++++++.++..+.|
T Consensus        10 ~~GD~V~Vi~G~dKGK~G~V~~V~~~~~~V~V   41 (83)
T CHL00141         10 KIGDTVKIISGSDKGKIGEVLKIIKKSNKVIV   41 (83)
T ss_pred             cCCCEEEEeEcCCCCcEEEEEEEEcCCCEEEE
Confidence            57999999999999999999999999988877


No 9  
>TIGR00405 L26e_arch ribosomal protein L24p/L26e, archaeal. This protein contains a KOW domain, shared by bacterial NusG and the L24p/L26e family of ribosomal proteins. Although called archaeal NusG in several publications, it is the only close homolog of eukaryotic L26e in archaeal genomes, shares an operon with L11 in many genomes, and has been sequenced from purified ribosomes. It is here designated as a ribosomal protein for these reasons.
Probab=95.91  E-value=0.02  Score=50.02  Aligned_cols=51  Identities=18%  Similarity=0.384  Sum_probs=42.9

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCce--eeecccccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRV--LNAIDYEDICK  351 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~--v~~l~yedicK  351 (353)
                      +.+|+.|.|+.|++.|..|.+..+|..+..|.|.|..   .+..  ++ +++++|-+
T Consensus        87 ~~~Gd~V~I~~GPf~G~~g~v~~~d~~k~~v~v~l~~---~~~~~~v~-v~~~~l~~  139 (145)
T TIGR00405        87 IKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELIE---AAVPIPVT-VKGDQVRI  139 (145)
T ss_pred             cCCCCEEEEeecCCCCCeEEEEEEcCCCCEEEEEEEE---cCccceEE-EeeeEEEE
Confidence            5789999999999999999999999998899998885   3444  53 78877754


No 10 
>PRK00004 rplX 50S ribosomal protein L24; Reviewed
Probab=95.60  E-value=0.022  Score=48.13  Aligned_cols=33  Identities=24%  Similarity=0.333  Sum_probs=30.6

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVK  330 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~  330 (353)
                      ..|+.|+|+.|.++|..|++++++.++..+.|+
T Consensus         6 ~kGD~V~Vi~G~dKGk~G~V~~V~~~~~~V~Ve   38 (105)
T PRK00004          6 KKGDTVIVIAGKDKGKRGKVLKVLPKKNKVIVE   38 (105)
T ss_pred             cCCCEEEEeEcCCCCcEEEEEEEEcCCCEEEEc
Confidence            679999999999999999999999999888773


No 11 
>PRK01191 rpl24p 50S ribosomal protein L24P; Validated
Probab=95.04  E-value=0.043  Score=47.64  Aligned_cols=32  Identities=22%  Similarity=0.493  Sum_probs=29.7

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      ..|+.|.|+.|.++|..|++++++.++..+.|
T Consensus        47 kkGD~V~VisG~~KGk~GkV~~V~~~~~~V~V   78 (120)
T PRK01191         47 RKGDTVKVMRGDFKGEEGKVVEVDLKRGRIYV   78 (120)
T ss_pred             eCCCEEEEeecCCCCceEEEEEEEcCCCEEEE
Confidence            45999999999999999999999999988877


No 12 
>TIGR01079 rplX_bact ribosomal protein L24, bacterial/organelle. This model recognizes bacterial and organellar forms of ribosomal protein L24. It excludes eukaryotic and archaeal forms, designated L26 in eukaryotes.
Probab=94.98  E-value=0.043  Score=46.40  Aligned_cols=33  Identities=21%  Similarity=0.365  Sum_probs=30.4

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVK  330 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~  330 (353)
                      ..|++|.|+.|.++|..|++++++.++..+.|+
T Consensus         5 kkGD~V~Vi~G~dKGK~G~V~~V~~~~~~V~Ve   37 (104)
T TIGR01079         5 KKGDTVKVISGKDKGKRGKVLKVLPKTNKVIVE   37 (104)
T ss_pred             cCCCEEEEeEcCCCCcEEEEEEEEcCCCEEEEC
Confidence            579999999999999999999999999888773


No 13 
>PTZ00194 60S ribosomal protein L26; Provisional
Probab=94.98  E-value=0.039  Score=49.22  Aligned_cols=33  Identities=18%  Similarity=0.360  Sum_probs=30.1

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVK  330 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~  330 (353)
                      ..|+.|+|+.|.++|..|++++++..++.+.|.
T Consensus        48 kkGD~V~Vi~Gk~KGk~GkV~~V~~k~~~ViVE   80 (143)
T PTZ00194         48 RKDDEVMVVRGHHKGREGKVTAVYRKKWVIHIE   80 (143)
T ss_pred             ecCCEEEEecCCCCCCceEEEEEEcCCCEEEEe
Confidence            459999999999999999999999999888773


No 14 
>PRK05609 nusG transcription antitermination protein NusG; Validated
Probab=94.93  E-value=0.069  Score=47.83  Aligned_cols=53  Identities=23%  Similarity=0.407  Sum_probs=41.4

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCce-eeeccccccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRV-LNAIDYEDICKL  352 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~-v~~l~yedicKl  352 (353)
                      +.+|+.|.|+.|++.|..|++..+|..+..|.|.|.-   -|+. .-.++++++-++
T Consensus       127 ~~~Gd~VrI~~GPf~G~~g~v~~i~~~~~r~~v~l~~---~G~~~~v~l~~~~l~~~  180 (181)
T PRK05609        127 FEVGEMVRVIDGPFADFNGTVEEVDYEKSKLKVLVSI---FGRETPVELEFSQVEKI  180 (181)
T ss_pred             CCCCCEEEEeccCCCCCEEEEEEEeCCCCEEEEEEEE---CCCceEEEEchHHEEEc
Confidence            4689999999999999999999999888788887763   2432 224777777654


No 15 
>TIGR01080 rplX_A_E ribosomal protein L24p/L26e, archaeal/eukaryotic. This model represents the archaeal and eukaryotic branch of the ribosomal protein L24p/L26e family. Bacterial and organellar forms are represented by the related TIGR01079.
Probab=94.92  E-value=0.044  Score=47.19  Aligned_cols=32  Identities=28%  Similarity=0.574  Sum_probs=29.5

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      ..|+.|+|+.|.++|..|+++.|+..+..|.|
T Consensus        43 kkGD~V~Vi~Gk~KGk~GkV~~V~~~~~~V~V   74 (114)
T TIGR01080        43 RKGDKVRIMRGDFKGHEGKVSKVDLKRYRIYV   74 (114)
T ss_pred             ecCCEEEEecCCCCCCEEEEEEEEcCCCEEEE
Confidence            56999999999999999999999999987766


No 16 
>TIGR00922 nusG transcription termination/antitermination factor NusG. Archaeal proteins once termed NusG share the KOW domain but are actually a ribosomal protein corresponding to L24p in bacteria and L26e in eukaryotes (TIGR00405).
Probab=94.87  E-value=0.072  Score=47.48  Aligned_cols=50  Identities=20%  Similarity=0.365  Sum_probs=39.8

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCc--eeeeccccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGR--VLNAIDYEDIC  350 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~--~v~~l~yedic  350 (353)
                      +.+|++|.|+.|++.|..|.+..+|.++..|.|.|+-   .|+  .+ .+++++|-
T Consensus       120 ~~~G~~V~I~~Gpf~G~~g~v~~~~~~~~r~~V~v~~---~g~~~~v-~v~~~~l~  171 (172)
T TIGR00922       120 FEVGEQVRVNDGPFANFTGTVEEVDYEKSKLKVSVSI---FGRETPV-ELEFSQVE  171 (172)
T ss_pred             CCCCCEEEEeecCCCCcEEEEEEEcCCCCEEEEEEEE---CCCceEE-EEcHHHee
Confidence            4689999999999999999999999888888888774   243  23 36766653


No 17 
>PRK08559 nusG transcription antitermination protein NusG; Validated
Probab=94.55  E-value=0.1  Score=46.47  Aligned_cols=50  Identities=20%  Similarity=0.406  Sum_probs=41.1

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCce--eeeccccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRV--LNAIDYEDIC  350 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~--v~~l~yedic  350 (353)
                      ..+|+.|.|+.|.++|..|++.++|.++..|+|.+..-   ...  + .++.++|+
T Consensus        95 ~~~G~~V~I~~Gpf~g~~g~V~~vd~~k~~v~v~ll~~---~~~~pv-~v~~~~~~  146 (153)
T PRK08559         95 IKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELLEA---AVPIPV-TVRGDQVR  146 (153)
T ss_pred             CCCCCEEEEeccCCCCceEEEEEEcCCCCEEEEEEECC---cceeeE-EEeccEEE
Confidence            57899999999999999999999999999999988752   222  3 47777764


No 18 
>COG0250 NusG Transcription antiterminator [Transcription]
Probab=93.40  E-value=0.19  Score=46.11  Aligned_cols=56  Identities=25%  Similarity=0.372  Sum_probs=43.7

Q ss_pred             ecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccccc
Q 018594          295 VIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICKL  352 (353)
Q Consensus       295 VIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicKl  352 (353)
                      +-+.+|+.|+|+.|++.|..|++.++|.++..++|.+..= ..-..+ .++|++|-++
T Consensus       122 ~~~e~Gd~VrI~~GpFa~f~g~V~evd~ek~~~~v~v~if-gr~tPV-el~~~qVek~  177 (178)
T COG0250         122 VDFEPGDVVRIIDGPFAGFKAKVEEVDEEKGKLKVEVSIF-GRPTPV-ELEFDQVEKL  177 (178)
T ss_pred             ccCCCCCEEEEeccCCCCccEEEEEEcCcCcEEEEEEEEe-CCceEE-EEehhhEEEe
Confidence            4456899999999999999999999999998888887741 112234 3888887665


No 19 
>PRK04313 30S ribosomal protein S4e; Validated
Probab=92.54  E-value=1  Score=43.39  Aligned_cols=73  Identities=18%  Similarity=0.255  Sum_probs=49.0

Q ss_pred             EEEEeecccCCcccccceeEEEEecCCceEEEEecC-----CCeEEEEc--CCceeeecC-CCCCeEEEEeCCCCCceEE
Q 018594          245 IVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLE-----KKHVLRVD--QDELETVIP-QIGGLVRIVNGAYRGSNAR  316 (353)
Q Consensus       245 vVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d-----~g~~l~vd--q~~LETVIP-~~G~~V~IV~G~~RG~~g~  316 (353)
                      .+||.+|..      .++|++.=+...++..+. ++     .++.|.|+  ...+.-.+| ..|..++|+.|.+.|.+|+
T Consensus       119 L~KV~~k~~------~~gG~~ql~~hDGrni~~-~~~~~~k~~Dtv~i~l~~~kI~~~i~fe~G~l~~itgG~n~GriG~  191 (237)
T PRK04313        119 LCKIENKTT------VKGGKIQLNLHDGRNILV-DVEDDYKTGDSLLISLPEQEIVDHIPFEEGNLAIITGGKHVGEIGK  191 (237)
T ss_pred             EEEEEeEEE------ecCCEEEEEecCCceEEc-cCccccccCCEEEEECCCCceeEEEecCCCCEEEEECCeeeeeEEE
Confidence            367776543      445666555443343332 32     57776544  444666666 8999999999999999999


Q ss_pred             EEeeeCCc
Q 018594          317 LLGVDTDK  324 (353)
Q Consensus       317 L~siD~~~  324 (353)
                      +.++....
T Consensus       192 I~~i~~~~  199 (237)
T PRK04313        192 IKEIEVTK  199 (237)
T ss_pred             EEEEEEcc
Confidence            99997544


No 20 
>TIGR01955 RfaH transcriptional activator RfaH. This model represents the transcriptional activator protein, RfaH. This protein is most closely related to the transcriptional termination/antitermination protein NusG (TIGR00922) and contains the KOW motif (pfam00467). This protein appears to be limited to the gamma proteobacteria. In E. coli, this gene appears to control the expression of haemolysin, sex factor and lipopolysaccharide genes.
Probab=92.23  E-value=0.29  Score=42.90  Aligned_cols=49  Identities=24%  Similarity=0.381  Sum_probs=35.9

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCcee-eecccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVL-NAIDYEDI  349 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v-~~l~yedi  349 (353)
                      +.+|++|.|+.|++.|..|.+..+|. +..+.|.|+-   -|+.+ -.+++++|
T Consensus       109 ~~~G~~V~V~~GPf~g~~g~v~~~~~-~~r~~v~l~~---~gr~~~v~~~~~~~  158 (159)
T TIGR01955       109 PYKGDKVRITDGAFAGFEAIFLEPDG-EKRSMLLLNM---IGKQIKVSVPNTSV  158 (159)
T ss_pred             CCCCCEEEEeccCCCCcEEEEEEECC-CceEEEEEhh---hCCceEEEecHHHc
Confidence            46899999999999999999999984 4577777763   24432 12555554


No 21 
>PLN00036 40S ribosomal protein S4; Provisional
Probab=91.88  E-value=1.2  Score=43.40  Aligned_cols=71  Identities=18%  Similarity=0.287  Sum_probs=48.7

Q ss_pred             EEEEeecccCCcccccceeEEEEecCCceEEEEecC----CCeEEEE--cCCceeeecC-CCCCeEEEEeCCCCCceEEE
Q 018594          245 IVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLE----KKHVLRV--DQDELETVIP-QIGGLVRIVNGAYRGSNARL  317 (353)
Q Consensus       245 vVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d----~g~~l~v--dq~~LETVIP-~~G~~V~IV~G~~RG~~g~L  317 (353)
                      .+||.++.      ..++|++.=+....+. +...|    .++.|.|  +...+.-.+| ..|..++|+.|.+.|.+|++
T Consensus       123 LcKV~~k~------~~~gG~~ql~~hDGrn-i~~~d~~~k~~Dtv~i~l~~~kI~~~ikfe~G~l~~vtgG~n~GrvG~I  195 (261)
T PLN00036        123 LCKVRKIQ------FGQKGIPYLNTHDGRT-IRYPDPLIKANDTIKIDLETNKIVDFIKFDVGNLVMVTGGRNRGRVGVI  195 (261)
T ss_pred             EEEEEEEE------EecCCeEEEEecCCce-eccCCCccccCCEEEEeCCCCceeeEEecCCCCEEEEECCeeceeEEEE
Confidence            35777553      4456666665443333 43222    4777654  4555666777 89999999999999999999


Q ss_pred             EeeeC
Q 018594          318 LGVDT  322 (353)
Q Consensus       318 ~siD~  322 (353)
                      .++..
T Consensus       196 ~~i~~  200 (261)
T PLN00036        196 KNREK  200 (261)
T ss_pred             EEEEe
Confidence            99984


No 22 
>PRK09014 rfaH transcriptional activator RfaH; Provisional
Probab=91.83  E-value=0.35  Score=42.81  Aligned_cols=50  Identities=22%  Similarity=0.264  Sum_probs=37.4

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCc--eeeecccccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGR--VLNAIDYEDICK  351 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~--~v~~l~yedicK  351 (353)
                      +.+|++|.|+.|++.|..|.+..+|. +..+.|.++-   .|+  .+ .+++++|-+
T Consensus       110 ~~~G~~V~I~~Gp~~g~eg~v~~~~~-~~r~~v~v~~---~gr~~~v-~v~~~~~~~  161 (162)
T PRK09014        110 PKPGDKVIITEGAFEGLQAIYTEPDG-EARSILLLNL---LNKQVKH-SVDNTQFRK  161 (162)
T ss_pred             CCCCCEEEEecCCCCCcEEEEEEeCC-CeEEEEeehh---hCCcEEE-EECHHHeec
Confidence            46899999999999999999999994 4456676663   233  33 377777654


No 23 
>PTZ00118 40S ribosomal protein S4; Provisional
Probab=91.65  E-value=1.5  Score=42.90  Aligned_cols=80  Identities=18%  Similarity=0.289  Sum_probs=51.6

Q ss_pred             EEEEeecccCCcccccceeEEEEec-CCceEEEEec----CCCeEEEE--cCCceeeecC-CCCCeEEEEeCCCCCceEE
Q 018594          245 IVKVMSKALADKGYNKQKGVVRKVI-DKYVGEIEML----EKKHVLRV--DQDELETVIP-QIGGLVRIVNGAYRGSNAR  316 (353)
Q Consensus       245 vVKIi~K~l~dGkyYk~KgvV~~V~-d~~~c~V~l~----d~g~~l~v--dq~~LETVIP-~~G~~V~IV~G~~RG~~g~  316 (353)
                      .+||..+      +..++|++.=+. |+ .. +...    ..++.|.|  +...+.-.+| ..|..++|..|.+.|.+|+
T Consensus       123 LcKV~~k------~~~~gg~~~l~~hDG-rn-i~~~d~~ik~~Dtv~i~l~~~kI~~~ikfe~G~l~~vtgG~n~GriG~  194 (262)
T PTZ00118        123 LCRVKKT------FLGPKEVSIAVTHDG-RT-IRYVHPDVKVGDSLRLDLETGKVLEFLKFEVGNLVMITGGHNVGRVGT  194 (262)
T ss_pred             EEEEeEE------EECCCCeEEEEecCc-ce-eccCCCcccCCCEEEEECCCCceeeEEecCCCCEEEEECCeeceeEEE
Confidence            3577654      344566666543 44 33 4322    25777654  4555666677 8999999999999999999


Q ss_pred             EEeeeCCccE-EEEEEe
Q 018594          317 LLGVDTDKFC-AQVKIE  332 (353)
Q Consensus       317 L~siD~~~~~-a~V~l~  332 (353)
                      +.++.....+ -.|.|.
T Consensus       195 I~~~~~~~~~~~~V~i~  211 (262)
T PTZ00118        195 IVSKEKHPGSFDLIHVK  211 (262)
T ss_pred             EEEEEecCCCCcEEEEE
Confidence            9997654333 234444


No 24 
>PF11623 DUF3252:  Protein of unknown function (DUF3252);  InterPro: IPR021659  This family of proteins has no known function. Some members are annotated as Ssl0352; however, this cannot be confirmed. ; PDB: 3C4S_B 2JZ2_A.
Probab=91.64  E-value=0.72  Score=34.64  Aligned_cols=51  Identities=24%  Similarity=0.270  Sum_probs=34.0

Q ss_pred             ccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCC-CeEEEEcCCcee
Q 018594          240 LCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEK-KHVLRVDQDELE  293 (353)
Q Consensus       240 L~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~-g~~l~vdq~~LE  293 (353)
                      +.||-.|+|++-  .+ -||.=.|.|..|.|+..+++.=-.+ ++.+.+.-++||
T Consensus         2 ilPG~~V~V~n~--~~-~Y~~y~G~VQRvsdgkaaVLFEGGnWdKlvTf~l~eLe   53 (53)
T PF11623_consen    2 ILPGSTVRVKNP--ND-IYYGYEGFVQRVSDGKAAVLFEGGNWDKLVTFRLSELE   53 (53)
T ss_dssp             --TT-EEEE--T--TS-TTTT-EEEEEEEETTEEEEEEEETTEEEEEEEEGGGEE
T ss_pred             ccCCCEEEEeCC--CC-ccchheEEEEEeeCCeEEEEecCCCceEEEEEEhhhCC
Confidence            568999999865  34 7999999999999997776654333 345677777776


No 25 
>COG0198 RplX Ribosomal protein L24 [Translation, ribosomal structure and biogenesis]
Probab=90.71  E-value=0.43  Score=40.58  Aligned_cols=30  Identities=27%  Similarity=0.485  Sum_probs=27.3

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      ..|+.|.|+.|.++|..|.+++++...  +.|
T Consensus         6 rkGD~V~Vi~GkdKGk~GkVl~v~~k~--V~V   35 (104)
T COG0198           6 KKGDTVKVIAGKDKGKEGKVLKVLPKK--VVV   35 (104)
T ss_pred             ecCCEEEEEecCCCCcceEEEEEecCe--EEE
Confidence            469999999999999999999999998  554


No 26 
>TIGR01956 NusG_myco NusG family protein. This model represents a family of Mycoplasma proteins orthologous to the bacterial transcription termination/antitermination factor NusG. These sequences from Mycoplasma are notably diverged (long branches in a Neighbor-joining phylogenetic tree) from the bacterial species. Although NusA and ribosomal protein S10 (NusE) appear to be present, NusB may be absent in Mycoplasmas, calling into question whether these species have a functional Nus system including this family as a member.
Probab=89.94  E-value=0.81  Score=44.58  Aligned_cols=51  Identities=22%  Similarity=0.436  Sum_probs=39.0

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeecccccc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDI  349 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedi  349 (353)
                      +.+|+.|.|+.|++.|..|.+.++|.++..+.|.|.-- .....+ .|+|++|
T Consensus       206 f~vGd~VrI~dGPF~GfeG~I~eid~~k~Rv~VlV~If-GR~TpV-eL~~~qV  256 (258)
T TIGR01956       206 FRVGNFVKIVDGPFKGIVGKIKKIDQEKKKAIVEVEIL-GKSVDV-DLNFKHL  256 (258)
T ss_pred             CCCCCEEEEEecCCCCcEEEEEEEeCCCCEEEEEEEec-CCcEEE-EEchHHE
Confidence            47899999999999999999999998777787777641 122234 3777765


No 27 
>COG5164 SPT5 Transcription elongation factor [Transcription]
Probab=89.93  E-value=0.79  Score=48.07  Aligned_cols=70  Identities=36%  Similarity=0.505  Sum_probs=50.4

Q ss_pred             eeEEEEecCCceEEEEecCCCeEEEEcC----------------Cceeeec----CCCCCeEEEEeCCCCCceEEEEeee
Q 018594          262 KGVVRKVIDKYVGEIEMLEKKHVLRVDQ----------------DELETVI----PQIGGLVRIVNGAYRGSNARLLGVD  321 (353)
Q Consensus       262 KgvV~~V~d~~~c~V~l~d~g~~l~vdq----------------~~LETVI----P~~G~~V~IV~G~~RG~~g~L~siD  321 (353)
                      -||++. +.+..|.|..-|--.-+++|-                ..||--|    |++|..|.|-.|+|+|..|.+..+|
T Consensus       298 nGVfv~-~~~nv~~VAtkd~~~s~k~dl~kmnp~v~~~~~~p~~~~l~r~i~gRd~aigktVrIr~g~yKG~lGVVKdv~  376 (607)
T COG5164         298 NGVFVK-IEGNVCIVATKDFTESLKVDLDKMNPPVTVNLQNPKTNELERKIVGRDPAIGKTVRIRCGEYKGHLGVVKDVD  376 (607)
T ss_pred             CceEEE-ecCceeEEEeccchhhhcccHhhcCchhhcCCCCCcchhhhccccccccccCceEEEeecccccccceeeecc
Confidence            455544 345588888765322333433                3455555    7899999999999999999999998


Q ss_pred             CCccEEEEEEecc
Q 018594          322 TDKFCAQVKIEKG  334 (353)
Q Consensus       322 ~~~~~a~V~l~~g  334 (353)
                      .+.  |.|+|.++
T Consensus       377 ~~~--arVeLhs~  387 (607)
T COG5164         377 RNI--ARVELHSN  387 (607)
T ss_pred             Cce--EEEEEecC
Confidence            887  88888875


No 28 
>PTZ00223 40S ribosomal protein S4; Provisional
Probab=89.54  E-value=2.4  Score=41.75  Aligned_cols=72  Identities=17%  Similarity=0.254  Sum_probs=47.8

Q ss_pred             EEEEeecccCCcccccceeEEEEecCCceEEEEec----CCCeEEEE--cCCceeeecC-CCCCeEEEEeCCCCCceEEE
Q 018594          245 IVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEML----EKKHVLRV--DQDELETVIP-QIGGLVRIVNGAYRGSNARL  317 (353)
Q Consensus       245 vVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~----d~g~~l~v--dq~~LETVIP-~~G~~V~IV~G~~RG~~g~L  317 (353)
                      .+||.++..      .++|++.=+...++. +...    ..++.|.|  +...+.-.+| ..|..|+|..|.+.|.+|++
T Consensus       120 LcKV~~k~~------~~gG~~ql~~hDGrn-I~~~d~~~k~~Dtv~i~l~~~kI~~~ikfe~G~l~~vtgG~n~GriG~I  192 (273)
T PTZ00223        120 LMKVVNVYT------ATGRIPVAVTHDGHR-IRYPDPRTSRGDTLVYNVKEKKVVDLIKNRNGKVVMVTGGANRGRIGEI  192 (273)
T ss_pred             EEEEEEEEE------ecCCeeEEEecCCce-eccCCccccCCCEEEEECCCCeeeEEEecCCCCEEEEECCeeceeEEEE
Confidence            367776543      445665555433233 3322    25777754  4444555666 89999999999999999999


Q ss_pred             EeeeCC
Q 018594          318 LGVDTD  323 (353)
Q Consensus       318 ~siD~~  323 (353)
                      .++...
T Consensus       193 ~~i~~~  198 (273)
T PTZ00223        193 VSIERH  198 (273)
T ss_pred             EEEEec
Confidence            999544


No 29 
>COG1471 RPS4A Ribosomal protein S4E [Translation, ribosomal structure and biogenesis]
Probab=89.42  E-value=2.5  Score=40.80  Aligned_cols=53  Identities=15%  Similarity=0.267  Sum_probs=39.3

Q ss_pred             CeEE--EEcCCceeeecC-CCCCeEEEEeCCCCCceEEEEeeeCCc--cEEEEEEecc
Q 018594          282 KHVL--RVDQDELETVIP-QIGGLVRIVNGAYRGSNARLLGVDTDK--FCAQVKIEKG  334 (353)
Q Consensus       282 g~~l--~vdq~~LETVIP-~~G~~V~IV~G~~RG~~g~L~siD~~~--~~a~V~l~~g  334 (353)
                      |+++  .++...+.-.|| .+|..|+|+.|.|.|.+|++.+|....  ..=+|.+++.
T Consensus       156 ~Dtv~i~lp~~~I~~~i~fe~g~~~~vtgG~h~G~~G~I~~I~~~~~~~~~~v~~e~~  213 (241)
T COG1471         156 GDTVKISLPEQKIVEHIKFEEGALVYVTGGRHVGRVGTIVEIEIQESSKPNLVTVEDE  213 (241)
T ss_pred             ccEEEEeCCChhheeEeccCCCcEEEEECCccccceEEEEEEEEecCCCccEEEEecC
Confidence            6655  566666666666 899999999999999999999998763  2223555543


No 30 
>KOG1999 consensus RNA polymerase II transcription elongation factor DSIF/SUPT5H/SPT5 [Transcription]
Probab=84.98  E-value=3.5  Score=46.61  Aligned_cols=78  Identities=24%  Similarity=0.311  Sum_probs=58.0

Q ss_pred             ccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEec--CCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceEEE
Q 018594          240 LCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEML--EKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNARL  317 (353)
Q Consensus       240 L~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~--d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L  317 (353)
                      ++|+=.|.|+     .|.+-+-||+|..|.+. .++|...  +-..-|.+..+.|- =.-.+|+.|+|+.|.|.|.+|.+
T Consensus       408 F~~GD~VeV~-----~Gel~glkG~ve~vdg~-~vti~~~~e~l~~pl~~~~~eLr-KyF~~GDhVKVi~G~~eG~tGlV  480 (1024)
T KOG1999|consen  408 FSPGDAVEVI-----VGELKGLKGKVESVDGT-IVTIMSKHEDLKGPLEVPASELR-KYFEPGDHVKVIAGRYEGDTGLV  480 (1024)
T ss_pred             cCCCCeEEEe-----eeeeccceeEEEeccCc-eEEEeeccccCCCccccchHhhh-hhccCCCeEEEEeccccCCcceE
Confidence            5677777776     33566789999998554 5555543  23556778888872 22368999999999999999999


Q ss_pred             EeeeCCc
Q 018594          318 LGVDTDK  324 (353)
Q Consensus       318 ~siD~~~  324 (353)
                      +.|+...
T Consensus       481 vrVe~~~  487 (1024)
T KOG1999|consen  481 VRVEQGD  487 (1024)
T ss_pred             EEEeCCe
Confidence            9998766


No 31 
>PF15591 Imm17:  Immunity protein 17
Probab=84.54  E-value=1.9  Score=34.56  Aligned_cols=50  Identities=20%  Similarity=0.325  Sum_probs=37.9

Q ss_pred             EEEEeecccCCcccccceeEEEEecCC----ceEEEEecCCCeEEEEcCCceee
Q 018594          245 IVKVMSKALADKGYNKQKGVVRKVIDK----YVGEIEMLEKKHVLRVDQDELET  294 (353)
Q Consensus       245 vVKIi~K~l~dGkyYk~KgvV~~V~d~----~~c~V~l~d~g~~l~vdq~~LET  294 (353)
                      +|+|.+....+.++++++|||......    +.-.|.+-+......++.+.|++
T Consensus        10 ~V~v~~s~p~~~ei~Gk~GVVlG~SeeD~~~~gY~Vli~d~e~~~~~ee~~l~~   63 (74)
T PF15591_consen   10 EVEVVRSCPCDAEIWGKRGVVLGISEEDGGNFGYSVLIFDMECCWYIEEDELEA   63 (74)
T ss_pred             EEEEeccCcchhhhcCceeEEEEEecCCCcEEEEEEEEeeeeeEEEechHHeee
Confidence            688887666667899999999999642    33556666667777888888875


No 32 
>PF15057 DUF4537:  Domain of unknown function (DUF4537)
Probab=83.68  E-value=13  Score=32.17  Aligned_cols=97  Identities=15%  Similarity=0.153  Sum_probs=61.2

Q ss_pred             EEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecC------CCCCeEEEEe--CCCCCceEEEE
Q 018594          247 KVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIP------QIGGLVRIVN--GAYRGSNARLL  318 (353)
Q Consensus       247 KIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP------~~G~~V~IV~--G~~RG~~g~L~  318 (353)
                      +|+.++..||-||-  |+|++..+.....|+.. .++...++..++=.+-+      ++|++|+...  +.++=.-|+++
T Consensus         3 ~VlAR~~~DG~YY~--GtV~~~~~~~~~lV~f~-~~~~~~v~~~~iI~~~~~~~~~L~~GD~VLA~~~~~~~~Y~Pg~V~   79 (124)
T PF15057_consen    3 KVLARREEDGFYYP--GTVKKCVSSGQFLVEFD-DGDTQEVPISDIIALSDAMRHSLQVGDKVLAPWEPDDCRYGPGTVI   79 (124)
T ss_pred             eEEEeeCCCCcEEe--EEEEEccCCCEEEEEEC-CCCEEEeChHHeEEccCcccCcCCCCCEEEEecCcCCCEEeCEEEE
Confidence            46777888987776  89999888778888873 45555555544433222      4799999984  34555558888


Q ss_pred             ee----eCCccEEEEEEeccccCCceeeecccccccc
Q 018594          319 GV----DTDKFCAQVKIEKGVYDGRVLNAIDYEDICK  351 (353)
Q Consensus       319 si----D~~~~~a~V~l~~g~~~g~~v~~l~yedicK  351 (353)
                      +.    -..+...+|.+-+|    +.. .++...+.+
T Consensus        80 ~~~~~~~~~~~~~~V~f~ng----~~~-~vp~~~~~~  111 (124)
T PF15057_consen   80 AGPERRASEDKEYTVRFYNG----KTA-KVPRGEVIW  111 (124)
T ss_pred             ECccccccCCceEEEEEECC----CCC-ccchhhEEE
Confidence            62    22333466777764    333 356555544


No 33 
>smart00743 Agenet Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions.
Probab=78.01  E-value=7.4  Score=28.72  Aligned_cols=54  Identities=26%  Similarity=0.238  Sum_probs=41.3

Q ss_pred             cCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecC--CCeEEEEcCCceeeecCC
Q 018594          241 CEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLE--KKHVLRVDQDELETVIPQ  298 (353)
Q Consensus       241 ~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d--~g~~l~vdq~~LETVIP~  298 (353)
                      ..|=.|-+-...  +|.+|.  |+|..|.+...+.|...+  .+....++.+.|-+..|-
T Consensus         4 ~~G~~Ve~~~~~--~~~W~~--a~V~~~~~~~~~~V~~~~~~~~~~e~v~~~~LRp~~~w   59 (61)
T smart00743        4 KKGDRVEVFSKE--EDSWWE--AVVTKVLGDGKYLVRYLTESEPLKETVDWSDLRPHPPW   59 (61)
T ss_pred             CCCCEEEEEECC--CCEEEE--EEEEEECCCCEEEEEECCCCcccEEEEeHHHcccCCCC
Confidence            345566665443  556765  899999986689999999  888999999999887663


No 34 
>smart00333 TUDOR Tudor domain. Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished).
Probab=73.71  E-value=13  Score=26.76  Aligned_cols=53  Identities=15%  Similarity=0.089  Sum_probs=39.8

Q ss_pred             ccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecC
Q 018594          240 LCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIP  297 (353)
Q Consensus       240 L~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP  297 (353)
                      +.+|-.|.+.-   .+|.||.  |.|.++.+...+.|...|-|....|+.++|-+..|
T Consensus         3 ~~~G~~~~a~~---~d~~wyr--a~I~~~~~~~~~~V~f~D~G~~~~v~~~~l~~l~~   55 (57)
T smart00333        3 FKVGDKVAARW---EDGEWYR--ARIIKVDGEQLYEVFFIDYGNEEVVPPSDLRPLPE   55 (57)
T ss_pred             CCCCCEEEEEe---CCCCEEE--EEEEEECCCCEEEEEEECCCccEEEeHHHeecCCC
Confidence            34555554442   4778886  79999987568999999989998998888876655


No 35 
>PLN00045 photosystem I reaction center subunit IV; Provisional
Probab=71.78  E-value=6.5  Score=33.10  Aligned_cols=58  Identities=24%  Similarity=0.410  Sum_probs=42.7

Q ss_pred             eeeecCCCCCeEEEEeCC--CCCceEEEEeeeCC---ccEEEEEEeccccCCceeeecccccc
Q 018594          292 LETVIPQIGGLVRIVNGA--YRGSNARLLGVDTD---KFCAQVKIEKGVYDGRVLNAIDYEDI  349 (353)
Q Consensus       292 LETVIP~~G~~V~IV~G~--~RG~~g~L~siD~~---~~~a~V~l~~g~~~g~~v~~l~yedi  349 (353)
                      -.++-|+.|.+|+|+.-+  +-..+|++.++|.+   +|-++|+++.-.+.|-.-.++..+.|
T Consensus        35 pp~ig~~RGskVrIlR~ESYWyn~vGtVvsVDq~~girYPVvVRF~kvNY~gvnTNnfa~~El   97 (101)
T PLN00045         35 PPPIGPKRGSKVKILRPESYWFNDVGKVVAVDQDPGVRYPVVVRFEKVNYAGVSTNNYALDEI   97 (101)
T ss_pred             CCCcccCCCCEEEEccccceeecCcceEEEEeCCCCcccceEEEeeeeeccccccccccHhhh
Confidence            345667889999999765  46788999999988   78899999876666633333455444


No 36 
>PRK08559 nusG transcription antitermination protein NusG; Validated
Probab=71.20  E-value=15  Score=32.68  Aligned_cols=53  Identities=26%  Similarity=0.312  Sum_probs=43.9

Q ss_pred             ccCCcEEEEeecccCCcccccceeEEEEec-CCceEEEEecCCCeE--EEEcCCceeeecC
Q 018594          240 LCEGIIVKVMSKALADKGYNKQKGVVRKVI-DKYVGEIEMLEKKHV--LRVDQDELETVIP  297 (353)
Q Consensus       240 L~~~IvVKIi~K~l~dGkyYk~KgvV~~V~-d~~~c~V~l~d~g~~--l~vdq~~LETVIP  297 (353)
                      +.+|=.|+|+     +|-|-+..|.|.++. .+..+.|.+++....  ++|+.+.|.+|=+
T Consensus        95 ~~~G~~V~I~-----~Gpf~g~~g~V~~vd~~k~~v~v~ll~~~~~~pv~v~~~~~~~~~~  150 (153)
T PRK08559         95 IKEGDIVELI-----AGPFKGEKARVVRVDESKEEVTVELLEAAVPIPVTVRGDQVRVVKK  150 (153)
T ss_pred             CCCCCEEEEe-----ccCCCCceEEEEEEcCCCCEEEEEEECCcceeeEEEeccEEEEecc
Confidence            6789999998     567778899999996 366789999988777  8899999977754


No 37 
>TIGR00405 L26e_arch ribosomal protein L24p/L26e, archaeal. This protein contains a KOW domain, shared by bacterial NusG and the L24p/L26e family of ribosomal proteins. Although called archaeal NusG in several publications, it is the only close homolog of eukaryotic L26e in archaeal genomes, shares an operon with L11 in many genomes, and has been sequenced from purified ribosomes. It is here designated as a ribosomal protein for these reasons.
Probab=70.50  E-value=12  Score=32.51  Aligned_cols=55  Identities=18%  Similarity=0.251  Sum_probs=46.3

Q ss_pred             CcccCCcEEEEeecccCCcccccceeEEEEec-CCceEEEEecCCCeE--EEEcCCceeeecC
Q 018594          238 YWLCEGIIVKVMSKALADKGYNKQKGVVRKVI-DKYVGEIEMLEKKHV--LRVDQDELETVIP  297 (353)
Q Consensus       238 ~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~-d~~~c~V~l~d~g~~--l~vdq~~LETVIP  297 (353)
                      ..+.+|=.|+|+     +|-|-+-.|.|.++. .+..+.|.+++.+..  +.++.++|+.+=+
T Consensus        85 ~~~~~Gd~V~I~-----~GPf~G~~g~v~~~d~~k~~v~v~l~~~~~~~~v~v~~~~l~~~~~  142 (145)
T TIGR00405        85 ESIKKGDIVEII-----SGPFKGERAKVIRVDESKEEVTLELIEAAVPIPVTVKGDQVRIIQK  142 (145)
T ss_pred             cccCCCCEEEEe-----ecCCCCCeEEEEEEcCCCCEEEEEEEEcCccceEEEeeeEEEEecc
Confidence            458999999998     567888899999996 355899999988888  8999999988654


No 38 
>TIGR02760 TraI_TIGR conjugative transfer relaxase protein TraI. This protein is a component of the relaxosome complex. In the process of conjugative plasmid transfer the relaxosome binds to the plasmid at the oriT (origin of transfer) site. The relaxase protein TraI mediates the single-strand nicking and ATP-dependent unwinding (relaxation, helicase activity) of the plasmid molecule. These two activities reside in separate domains of the protein.
Probab=65.85  E-value=43  Score=41.29  Aligned_cols=89  Identities=9%  Similarity=0.148  Sum_probs=62.1

Q ss_pred             CcccCCcEEEEeecccCCcccccceeEEEEecC-CceEEEEecCCCeEEEEcCCce-----------eeecC-CCCCeEE
Q 018594          238 YWLCEGIIVKVMSKALADKGYNKQKGVVRKVID-KYVGEIEMLEKKHVLRVDQDEL-----------ETVIP-QIGGLVR  304 (353)
Q Consensus       238 ~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d-~~~c~V~l~d~g~~l~vdq~~L-----------ETVIP-~~G~~V~  304 (353)
                      ....+|-+|..    +.++.+.+.-++|..|.. ..+.+|.. +.|+.+.++.+.|           ..+|| +.|++++
T Consensus       680 ~~Yr~Gdvv~~----y~~~~~~~~~y~V~~V~~~~n~L~l~~-~dG~~~~~~p~~l~~~~~~~svy~~~~l~ia~Gdrl~  754 (1960)
T TIGR02760       680 AHYKQGMVIRF----WQKGKIPHDDYVVTNVNKHNNTLTLKD-AQGKTQKFKPSSLKDLERPFSVYRPEQLEVAAGERLQ  754 (1960)
T ss_pred             hhcCCCCEEEe----ecccCccCCcEEEEEEeCCCCEEEEEc-CCCCEEEECHHHhcccccceeeeccccccccCCCEEE
Confidence            34578888887    333455566678999875 33555544 3588888888887           34467 7899999


Q ss_pred             EEe-----CCCCCceEEEEeeeCCccEEEEEEec
Q 018594          305 IVN-----GAYRGSNARLLGVDTDKFCAQVKIEK  333 (353)
Q Consensus       305 IV~-----G~~RG~~g~L~siD~~~~~a~V~l~~  333 (353)
                      +..     |--+|..+++.+++...  ++|+...
T Consensus       755 ~trn~~~~gl~ng~~~tV~~i~~~~--i~l~~~~  786 (1960)
T TIGR02760       755 VTGNHFHSRVRNGELLTVSSINNEG--ITLITED  786 (1960)
T ss_pred             EccCCcccCccCCCEEEEEEEcCCe--EEEEeCC
Confidence            983     33578999999998876  5555543


No 39 
>PTZ00065 60S ribosomal protein L14; Provisional
Probab=62.32  E-value=20  Score=31.71  Aligned_cols=36  Identities=25%  Similarity=0.270  Sum_probs=27.6

Q ss_pred             CCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccc-cCC
Q 018594          299 IGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGV-YDG  338 (353)
Q Consensus       299 ~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~-~~g  338 (353)
                      +|.-|+|..|+|.|..+++++|-..+. |.|   .|| ..|
T Consensus        10 iGRVvli~~Gp~~GKL~vIVDIID~nR-vLV---DGP~~tg   46 (130)
T PTZ00065         10 PGRLCLIQYGPDAGKLCFIVDIVTPTR-VLV---DGAFITG   46 (130)
T ss_pred             eceEEEEecCCCCCCEEEEEEEEcCCe-EEE---eCCCcCC
Confidence            588888999999999999999966553 333   477 444


No 40 
>PRK04333 50S ribosomal protein L14e; Validated
Probab=59.13  E-value=17  Score=29.69  Aligned_cols=28  Identities=14%  Similarity=0.341  Sum_probs=24.3

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCcc
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKF  325 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~  325 (353)
                      .+|.-|+++.|.|+|..+.++++..+++
T Consensus         5 ~~GrvV~~~~Grd~gk~~vIv~i~d~~~   32 (84)
T PRK04333          5 EVGRVCVKTAGREAGRKCVIVDIIDKNF   32 (84)
T ss_pred             cccEEEEEeccCCCCCEEEEEEEecCCE
Confidence            4799999999999999999999965553


No 41 
>cd04508 TUDOR Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal.
Probab=59.03  E-value=18  Score=25.10  Aligned_cols=37  Identities=11%  Similarity=0.162  Sum_probs=28.6

Q ss_pred             CCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCce
Q 018594          254 ADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDEL  292 (353)
Q Consensus       254 ~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~L  292 (353)
                      .+|++|-  |.|.++.....+.|...|-|..-.|+.++|
T Consensus        10 ~d~~wyr--a~V~~~~~~~~~~V~f~DyG~~~~v~~~~l   46 (48)
T cd04508          10 DDGKWYR--AKITSILSDGKVEVFFVDYGNTEVVPLSDL   46 (48)
T ss_pred             CCCeEEE--EEEEEECCCCcEEEEEEcCCCcEEEeHHHc
Confidence            4578886  899999866689999998888776665554


No 42 
>PF02427 PSI_PsaE:  Photosystem I reaction centre subunit IV / PsaE;  InterPro: IPR003375 PsaE is a 69 amino acid polypeptide from photosystem I present on the stromal side of the thylakoid membrane. The structure is comprised of a well-defined five-stranded beta-sheet similar to SH3 domains. This subunit may form complexes with ferredoxin and ferredoxin-oxidoreductase in the photosystem I reaction centre.; GO: 0015979 photosynthesis, 0009522 photosystem I, 0009538 photosystem I reaction center; PDB: 1PSF_A 1PSE_A 2WSF_E 2WSC_E 2O01_E 2WSE_E 1GXI_E 1JB0_E 3PCQ_E 1QP2_A ....
Probab=57.73  E-value=37  Score=26.41  Aligned_cols=53  Identities=21%  Similarity=0.359  Sum_probs=35.7

Q ss_pred             CCCCCeEEEEeCC--CCCceEEEEeeeCC--ccEEEEEEeccccCCceeeecccccc
Q 018594          297 PQIGGLVRIVNGA--YRGSNARLLGVDTD--KFCAQVKIEKGVYDGRVLNAIDYEDI  349 (353)
Q Consensus       297 P~~G~~V~IV~G~--~RG~~g~L~siD~~--~~~a~V~l~~g~~~g~~v~~l~yedi  349 (353)
                      |+.|.+|+|+.-+  +-..+|++.++|.+  ++-++|+++.-.+.|-.-.++..+.+
T Consensus         1 i~rgskVrIlR~ESYWyn~vGtV~svdqs~i~YPV~VRF~kvNY~g~nTnnfa~~El   57 (61)
T PF02427_consen    1 IKRGSKVRILRKESYWYNEVGTVASVDQSGIRYPVVVRFDKVNYAGVNTNNFALDEL   57 (61)
T ss_dssp             S-TTSEEEE-SSSSTTTTSEEEEEEETTSSSSSSEEEE-SSS-SSSSSEEEE-GGGE
T ss_pred             CCCCCEEEEccccceeecccceEEEEccCCccccEEEEEEEecccCccccccchhhh
Confidence            4679999999765  47789999999988  67899999876665533334555544


No 43 
>PF10615 DUF2470:  Protein of unknown function (DUF2470);  InterPro: IPR019595  This entry represents a putative haem-iron utilisation family of proteins, as many members are annotated as being pyridoxamine 5'-phosphate oxidase-related, FMN-binding; however the function of this domain is not known. ; PDB: 3GAS_D 3SWJ_A 2ARZ_B.
Probab=56.91  E-value=4.8  Score=31.99  Aligned_cols=36  Identities=17%  Similarity=0.347  Sum_probs=23.3

Q ss_pred             eecccccccHHHHHHHhccc---ccEEEee-cCceeEEEe
Q 018594           53 HMNSTRWATLTEFVKYLGRT---GKCKVEE-TPKGWFITY   88 (353)
Q Consensus        53 HMNaT~W~tLt~Fvk~Lgr~---G~c~vde-tekGw~I~y   88 (353)
                      |||.-...+|..|++++|.-   +.|++.. +..|..|.|
T Consensus        16 HMN~DH~d~l~~~~~~~~~~~~~~~a~m~~id~~G~~l~~   55 (83)
T PF10615_consen   16 HMNDDHADDLLLYARHYGGVPDAASARMTDIDRDGFDLRV   55 (83)
T ss_dssp             HHHHH-HHHHHHHHHHHHT-SSSSS-EEEEEETTEEEEEE
T ss_pred             HHHHhHHHHHHHHHHhcCCCCCCCCEEEEEEeccccEEEE
Confidence            99999999999999988865   2233322 344555555


No 44 
>PTZ00471 60S ribosomal protein L27; Provisional
Probab=56.50  E-value=18  Score=32.24  Aligned_cols=26  Identities=23%  Similarity=0.372  Sum_probs=22.7

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCC
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTD  323 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~  323 (353)
                      ++|.-|+||+|.|.|..|.++....+
T Consensus         6 kpgkVVivL~GR~AGkKaVivk~~dd   31 (134)
T PTZ00471          6 KPGKVVIVTSGRYAGRKAVIVQNFDT   31 (134)
T ss_pred             cCCEEEEEEccccCCcEEEEEeecCC
Confidence            57889999999999999999987555


No 45 
>PF10771 DUF2582:  Protein of unknown function (DUF2582);  InterPro: IPR019707  This entry represents conserved proteins found in bacteria and archaea. The function is not known. ; PDB: 2L02_B 2L01_A.
Probab=55.57  E-value=12  Score=29.26  Aligned_cols=23  Identities=22%  Similarity=0.472  Sum_probs=20.8

Q ss_pred             HHHhcccccEEEeecCceeEEEe
Q 018594           66 VKYLGRTGKCKVEETPKGWFITY   88 (353)
Q Consensus        66 vk~Lgr~G~c~vdetekGw~I~y   88 (353)
                      +=||.|+|++.+++.+.-|||..
T Consensus        43 iGWLarE~KI~~~~~~~~~~v~L   65 (65)
T PF10771_consen   43 IGWLARENKIEFEEKNGELYVSL   65 (65)
T ss_dssp             HHHHHCTTSEEEEEETTEEEEEE
T ss_pred             HHHHhccCceeEEeeCCEEEEEC
Confidence            67999999999999999999863


No 46 
>smart00333 TUDOR Tudor domain. Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished).
Probab=55.26  E-value=49  Score=23.62  Aligned_cols=52  Identities=19%  Similarity=0.111  Sum_probs=35.9

Q ss_pred             cCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccccc
Q 018594          296 IPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICKL  352 (353)
Q Consensus       296 IP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicKl  352 (353)
                      .|++|..|++....-+=..|++++++.+ ..+.|....   .|. ...++.++|..+
T Consensus         2 ~~~~G~~~~a~~~d~~wyra~I~~~~~~-~~~~V~f~D---~G~-~~~v~~~~l~~l   53 (57)
T smart00333        2 TFKVGDKVAARWEDGEWYRARIIKVDGE-QLYEVFFID---YGN-EEVVPPSDLRPL   53 (57)
T ss_pred             CCCCCCEEEEEeCCCCEEEEEEEEECCC-CEEEEEEEC---CCc-cEEEeHHHeecC
Confidence            4688988888863345567899999986 456666664   144 335888888754


No 47 
>PF01455 HupF_HypC:  HupF/HypC family;  InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation.; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=53.92  E-value=43  Score=26.14  Aligned_cols=42  Identities=19%  Similarity=0.180  Sum_probs=30.7

Q ss_pred             eEEEEe-cCCceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeC
Q 018594          263 GVVRKV-IDKYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNG  308 (353)
Q Consensus       263 gvV~~V-~d~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G  308 (353)
                      +.|.+| .++..|.|...  |..-.|+-..|..  +++|+.|+|=.|
T Consensus         7 ~~Vv~v~~~~~~A~v~~~--G~~~~V~~~lv~~--v~~Gd~VLVHaG   49 (68)
T PF01455_consen    7 GRVVEVDEDGGMAVVDFG--GVRREVSLALVPD--VKVGDYVLVHAG   49 (68)
T ss_dssp             EEEEEEETTTTEEEEEET--TEEEEEEGTTCTS--B-TT-EEEEETT
T ss_pred             EEEEEEeCCCCEEEEEcC--CcEEEEEEEEeCC--CCCCCEEEEecC
Confidence            467777 34678888764  6777888888877  788999999776


No 48 
>PF01176 eIF-1a:  Translation initiation factor 1A / IF-1;  InterPro: IPR006196  The S1 domain of around 70 amino acids, originally identified in ribosomal protein S1, is found in a large number of RNA-associated proteins. It has been shown that S1 proteins bind RNA through their S1 domains with some degree of sequence specificity. This type of S1 domain is found in translation initiation factor 1.  The solution structure of one S1 RNA-binding domain from Escherichia coli polynucleotide phosphorylase has been determined. It displays some similarity with the cold shock domain (CSD) (IPR002059 from INTERPRO). Both the S1 and the CSD domain consist of an antiparallel beta barrel of the same topology with 5 beta strands. This fold is also shared by many other proteins of unrelated function and is known as the OB fold. However, the S1 and CSD fold can be distinguished from the other OB folds by the presence of a short 3(10) helix at the end of strand 3. This unique feature is likely to form a part of the DNA/RNA-binding site.  This entry is specific for bacterial, chloroplastic and eukaryotic IF-1 type S1 domains.; GO: 0003723 RNA binding, 0003743 translation initiation factor activity, 0006413 translational initiation; PDB: 1JT8_A 3I4O_A 1AH9_A 1ZO1_W 1D7Q_A 2OQK_A 2DGY_A 1HR0_W.
Probab=51.49  E-value=59  Score=24.80  Aligned_cols=57  Identities=18%  Similarity=0.273  Sum_probs=39.2

Q ss_pred             eeEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEEEeCCCCCceEEEE
Q 018594          262 KGVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRIVNGAYRGSNARLL  318 (353)
Q Consensus       262 KgvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~  318 (353)
                      -|+|..+.+.+...|.+.++...+ .++...--+|-=..|+.|+|-.-.|--..|.++
T Consensus         6 ~~~V~~~lG~~~~~V~~~dg~~~l~~i~gK~r~~iwI~~GD~V~V~~~~~d~~kG~Ii   63 (65)
T PF01176_consen    6 IGRVTEMLGNNLFEVECEDGEERLARIPGKFRKRIWIKRGDFVLVEPSPYDKVKGRII   63 (65)
T ss_dssp             EEEEEEEESSSEEEEEETTSEEEEEEE-HHHHTCC---TTEEEEEEESTTCTTEEEEE
T ss_pred             EEEEEEECCCCEEEEEeCCCCEEEEEeccceeeeEecCCCCEEEEEecccCCCeEEEE
Confidence            478899999889999988654444 688885555555788888887666666666654


No 49 
>PF00567 TUDOR:  Tudor domain;  InterPro: IPR008191 There are multiple copies of this domain in the Drosophila melanogaster tudor protein and it has been identified in several RNA-binding proteins. Although the function of this domain is unknown, in Drosophila melanogaster the tudor protein is required during oogenesis for the formation of primordial germ cells and for normal abdominal segmentation.; PDB: 3NTI_A 3NTK_B 3NTH_A 2DIQ_A 3FDR_A 3PNW_O 3S6W_A 3PMT_A 2WAC_A 2O4X_A ....
Probab=51.17  E-value=82  Score=24.79  Aligned_cols=57  Identities=18%  Similarity=0.256  Sum_probs=37.0

Q ss_pred             CCCcccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecCC
Q 018594          236 KDYWLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIPQ  298 (353)
Q Consensus       236 ~~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP~  298 (353)
                      ...++..+..+.+    ..+|.+|-  |+|....+...+.|++.|-|.+..|..++|-..-|.
T Consensus        50 ~~~~~~~~~~~~~----~~~~~w~R--a~I~~~~~~~~~~V~~iD~G~~~~v~~~~l~~l~~~  106 (121)
T PF00567_consen   50 PESNPGEGCLCVV----SEDGRWYR--AVITVDIDENQYKVFLIDYGNTEKVSASDLRPLPPE  106 (121)
T ss_dssp             ST--TTEEEEEEE----TTTSEEEE--EEEEEEECTTEEEEEETTTTEEEEEEGGGEEE--HH
T ss_pred             cccccCCEEEEEE----ecCCceee--EEEEEecccceeEEEEEecCceEEEcHHHhhhhCHH
Confidence            3444444444433    23557766  677555666799999999999999999998766553


No 50 
>CHL00010 infA translation initiation factor 1
Probab=50.91  E-value=1.3e+02  Score=24.01  Aligned_cols=62  Identities=10%  Similarity=0.105  Sum_probs=35.8

Q ss_pred             ceeEEEEecCCceEEEEecCCCeEE--EEcCCce-eeecCCCCCeEEEEeCCCCCceEEEEeeeCC
Q 018594          261 QKGVVRKVIDKYVGEIEMLEKKHVL--RVDQDEL-ETVIPQIGGLVRIVNGAYRGSNARLLGVDTD  323 (353)
Q Consensus       261 ~KgvV~~V~d~~~c~V~l~d~g~~l--~vdq~~L-ETVIP~~G~~V~IV~G~~RG~~g~L~siD~~  323 (353)
                      -+|+|..+.+.....|.+.+ |..+  .+....= ..+-|.+|+.|.|=-=.+-...|.++-+-..
T Consensus         9 ~~G~Vik~lg~~~y~V~~~~-g~~~~c~~rGklr~~~i~~~vGD~V~ve~~~~~~~~g~Ii~r~~~   73 (78)
T CHL00010          9 MEGLVTESLPNGMFRVRLDN-GCQVLGYISGKIRRNSIRILPGDRVKVELSPYDLTKGRIIYRLRN   73 (78)
T ss_pred             EEEEEEEEcCCCEEEEEeCC-CCEEEEEeccceecCCcccCCCCEEEEEEcccCCCeEEEEEEecC
Confidence            46777777743355555533 4333  3333222 2667899999988744455556666665443


No 51 
>cd01734 YlxS_C YlxS is a Bacillus subtilis gene of unknown function with two domains that each have an alpha/beta fold.  The N-terminal domain is composed of two alpha-helices and a three-stranded beta-sheet, while the C-terminal domain is composed of one alpha-helix and a five-stranded beta-sheet.  This CD represents the C-terminal domain which has a fold similar to the Sm fold of proteins like Sm-D3.
Probab=48.52  E-value=41  Score=26.78  Aligned_cols=51  Identities=24%  Similarity=0.429  Sum_probs=33.1

Q ss_pred             CCCCCeEEEEe----CCCCCceEEEEeeeCCccEEEEEEeccccCCceeeecccccccc
Q 018594          297 PQIGGLVRIVN----GAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICK  351 (353)
Q Consensus       297 P~~G~~V~IV~----G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicK  351 (353)
                      ...|..|.|..    |..+-..|+|.+++.+.  +++..+.. ..+..+ .++|++|.+
T Consensus        22 r~~G~~v~v~~~~~~~~~~~~~G~L~~~~~~~--v~l~~~~~-~~~~~~-~i~~~~I~k   76 (83)
T cd01734          22 RAVGKYVHVKLYQPIDGQKEFEGTLLGVDDDT--VTLEVDIK-TRGKTV-EIPLDKIAK   76 (83)
T ss_pred             HhCCCEEEEEEEcccCCeEEEEEEEEeEeCCE--EEEEEecC-CCCeEE-EEEhHHeeE
Confidence            34688888853    33455689999999876  55554421 113445 499999976


No 52 
>PRK14639 hypothetical protein; Provisional
Probab=47.39  E-value=39  Score=29.88  Aligned_cols=72  Identities=19%  Similarity=0.277  Sum_probs=45.3

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDIC  350 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedic  350 (353)
                      .|+-.|..++-++-|+-+.+    .....|..|.|-...-+-..|+|.+++.+.  +++....+   +..+ .++|++|.
T Consensus        63 ~Y~LEVSSPGl~RpL~~~~~----f~r~~G~~v~v~l~~~~~~~G~L~~~~~~~--i~l~~~~~---~~~~-~i~~~~I~  132 (140)
T PRK14639         63 EYFLEVSSPGLERKLSKIEH----FAKSIGELVKITTNEKEKFEGKIVSVDDEN--ITLENLEN---KEKT-TINFNDIK  132 (140)
T ss_pred             CeEEEEeCCCCCCcCCCHHH----HHHhCCCEEEEEECCCcEEEEEEEEEeCCE--EEEEEccC---CcEE-EEEhHHee
Confidence            35555555554555543222    335579999997545678889999999876  55533221   3445 49999997


Q ss_pred             cc
Q 018594          351 KL  352 (353)
Q Consensus       351 Kl  352 (353)
                      +.
T Consensus       133 ka  134 (140)
T PRK14639        133 KA  134 (140)
T ss_pred             eE
Confidence            63


No 53 
>COG2163 RPL14A Ribosomal protein L14E/L6E/L27E [Translation, ribosomal structure and biogenesis]
Probab=44.44  E-value=35  Score=30.04  Aligned_cols=28  Identities=18%  Similarity=0.459  Sum_probs=25.2

Q ss_pred             CCCCCeEEEEeCCCCCceEEEEeeeCCc
Q 018594          297 PQIGGLVRIVNGAYRGSNARLLGVDTDK  324 (353)
Q Consensus       297 P~~G~~V~IV~G~~RG~~g~L~siD~~~  324 (353)
                      +.+|.-|.|+.|.++|..+.++++-.++
T Consensus         5 l~~GrVvvv~~GR~aGkk~VIv~~iDd~   32 (125)
T COG2163           5 LEVGRVVVVTAGRFAGKKVVIVKIIDDN   32 (125)
T ss_pred             ccCCeEEEEecceeCCceEEEEEEccCC
Confidence            4578899999999999999999998887


No 54 
>KOG1708 consensus Mitochondrial/chloroplast ribosomal protein L24 [Translation, ribosomal structure and biogenesis]
Probab=40.72  E-value=40  Score=32.26  Aligned_cols=33  Identities=21%  Similarity=0.285  Sum_probs=29.0

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEE
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVK  330 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~  330 (353)
                      ..|+.|.|+-|..+|..|.+..+-..+.-++|.
T Consensus        74 f~GDtVeVlvGkDkGkqG~Vtqv~r~~s~VvV~  106 (236)
T KOG1708|consen   74 FFGDTVEVLVGKDKGKQGEVTQVIRHRSWVVVK  106 (236)
T ss_pred             ecCCEEEEEecccCCccceEEEEeecCceEEEc
Confidence            369999999999999999999998888777763


No 55 
>TIGR01955 RfaH transcriptional activator RfaH. This model represents the transcriptional activator protein, RfaH. This protein is most closely related to the transcriptional termination/antitermination protein NusG (TIGR00922) and contains the KOW motif (pfam00467). This protein appears to be limited to the gamma proteobacteria. In E. coli, this gene appears to control the expression of haemolysin, sex factor and lipopolysaccharide genes.
Probab=40.61  E-value=57  Score=28.37  Aligned_cols=52  Identities=10%  Similarity=0.056  Sum_probs=39.7

Q ss_pred             CCcccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCe--EEEEcCCcee
Q 018594          237 DYWLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKH--VLRVDQDELE  293 (353)
Q Consensus       237 ~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~--~l~vdq~~LE  293 (353)
                      ...+.+|=.|+|+     +|-|-+-.|+|.++.++..+.|.+.--|+  .+.|+.++||
T Consensus       106 ~~~~~~G~~V~V~-----~GPf~g~~g~v~~~~~~~r~~v~l~~~gr~~~v~~~~~~~~  159 (159)
T TIGR01955       106 TTLPYKGDKVRIT-----DGAFAGFEAIFLEPDGEKRSMLLLNMIGKQIKVSVPNTSVE  159 (159)
T ss_pred             ccCCCCCCEEEEe-----ccCCCCcEEEEEEECCCceEEEEEhhhCCceEEEecHHHcC
Confidence            4678899999998     56777889999999776677776644444  4678877775


No 56 
>PF12872 OST-HTH:  OST-HTH/LOTUS domain; PDB: 2KPM_A 3S93_B 3RCO_A 2KZV_A.
Probab=39.59  E-value=27  Score=26.29  Aligned_cols=68  Identities=15%  Similarity=0.217  Sum_probs=45.8

Q ss_pred             HHHHHHHHHHHHhccCC-cccccceeeeeecccccceeecccccccHHHHHHHhcccccEEEeec--CceeEE
Q 018594           17 EEFEAGFLELMRRSHRF-SRIAATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLGRTGKCKVEET--PKGWFI   86 (353)
Q Consensus        17 ~eF~~~Fl~lLr~~~g~-krV~aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lgr~G~c~vdet--ekGw~I   86 (353)
                      +.|.....++|...++. -+|....+.++|.+--.++.--.=-..+|++|++-  -.+.|.|.++  .+.|+|
T Consensus         4 ~~~~~~l~~ll~~~~~~~g~v~ls~l~~~~~~~~~~f~~~~yG~~~l~~ll~~--~~~~~~i~~~~~g~~~~v   74 (74)
T PF12872_consen    4 EELKKLLRELLESQKGEDGWVSLSQLGQEYKKKYPDFDPRDYGFSSLSELLES--LPDVVEIEERQHGGQVYV   74 (74)
T ss_dssp             HHHHHHHHHHHHHTCTTTSSEEHHHHHHHHHHHHTT--TCCTTSSSHHHHHHT---TTTEEEEEEECCCC---
T ss_pred             HHHHHHHHHHHHhCcCCCceEEHHHHHHHHHHHCCCCCccccCCCcHHHHHHh--CCCeEEEeeeCCCCcCCC
Confidence            45677788888777763 47999999988886555566555667899999975  4788888555  445654


No 57 
>TIGR00922 nusG transcription termination/antitermination factor NusG. Archaeal proteins once termed NusG share the KOW domain but are actually ribosomal proteins corresponding to L24p in bacteria and L26e in eukaryotes (TIGR00405).
Probab=39.31  E-value=78  Score=28.01  Aligned_cols=53  Identities=17%  Similarity=0.141  Sum_probs=39.6

Q ss_pred             CCCcccCCcEEEEeecccCCcccccceeEEEEec-CCceEEEEecCCCe--EEEEcCCcee
Q 018594          236 KDYWLCEGIIVKVMSKALADKGYNKQKGVVRKVI-DKYVGEIEMLEKKH--VLRVDQDELE  293 (353)
Q Consensus       236 ~~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~-d~~~c~V~l~d~g~--~l~vdq~~LE  293 (353)
                      ....+.+|=.|+|+     +|-|-+..|+|..+. ++..+.|.+.--|+  .+.|+.++|+
T Consensus       116 ~~~~~~~G~~V~I~-----~Gpf~G~~g~v~~~~~~~~r~~V~v~~~g~~~~v~v~~~~l~  171 (172)
T TIGR00922       116 PKIDFEVGEQVRVN-----DGPFANFTGTVEEVDYEKSKLKVSVSIFGRETPVELEFSQVE  171 (172)
T ss_pred             cccCCCCCCEEEEe-----ecCCCCcEEEEEEEcCCCCEEEEEEEECCCceEEEEcHHHee
Confidence            34678899999998     567888899999996 45577776644344  5678877775


No 58 
>PRK14630 hypothetical protein; Provisional
Probab=38.14  E-value=73  Score=28.29  Aligned_cols=69  Identities=14%  Similarity=0.188  Sum_probs=41.5

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDIC  350 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedic  350 (353)
                      .|+-.|..++-++.|+-+.+    ..-..|..|.|-...-. ..|+|.+++.+.  +++..+     +..+ .+||++|.
T Consensus        72 ~Y~LEVSSPGldRpL~~~~d----f~r~~G~~v~V~l~~~~-~~G~L~~~~d~~--i~l~~~-----~~~~-~i~~~~I~  138 (143)
T PRK14630         72 NFSLEISTPGINRKIKSDRE----FKIFEGKKIKLMLDNDF-EEGFILEAKADS--FIFKTD-----SKEV-NVLYSDVK  138 (143)
T ss_pred             CeEEEEeCCCCCCcCCCHHH----HHHhCCCEEEEEEcCcc-eEEEEEEEeCCE--EEEEEC-----CEEE-EEEhHhcc
Confidence            34555555544555543322    22346888888654332 289999998766  555433     3445 49999998


Q ss_pred             cc
Q 018594          351 KL  352 (353)
Q Consensus       351 Kl  352 (353)
                      |.
T Consensus       139 ka  140 (143)
T PRK14630        139 KA  140 (143)
T ss_pred             eE
Confidence            63


No 59 
>PRK14633 hypothetical protein; Provisional
Probab=38.11  E-value=66  Score=28.71  Aligned_cols=70  Identities=19%  Similarity=0.284  Sum_probs=42.7

Q ss_pred             ceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEe----CCCCCceEEEEeeeCCccEEEEEEeccccCCceeeecccc
Q 018594          272 YVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVN----GAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYE  347 (353)
Q Consensus       272 ~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~----G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~ye  347 (353)
                      |+-.|..++-++.|+-+.++    .-..|..|.|..    +..+-..|+|.+++.+.  +++.+..    +..+ .++|+
T Consensus        70 Y~LEVSSPGldRpL~~~~~f----~r~~G~~v~V~~~~~~~~~~~~~G~L~~v~~~~--i~l~~~~----~~~~-~i~~~  138 (150)
T PRK14633         70 YILEVSSPGMNRQIFNIIQA----QALVGFNVKAVTLAPVGSQTKFKGVLERVEGNN--VILNLED----GKEI-SFDFD  138 (150)
T ss_pred             eEEEEeCCCCCCCCCCHHHH----HHhCCCeEEEEEecccCCcEEEEEEEEEEeCCE--EEEEEcC----CcEE-EEEhH
Confidence            44455555445555433332    234688888853    34566789999998776  5554432    3445 49999


Q ss_pred             ccccc
Q 018594          348 DICKL  352 (353)
Q Consensus       348 dicKl  352 (353)
                      +|.+.
T Consensus       139 ~I~ka  143 (150)
T PRK14633        139 ELKKL  143 (150)
T ss_pred             HeeeE
Confidence            99863


No 60 
>CHL00125 psaE photosystem I subunit IV; Reviewed
Probab=38.06  E-value=56  Score=25.59  Aligned_cols=52  Identities=23%  Similarity=0.396  Sum_probs=36.1

Q ss_pred             CCCCeEEEEeCC--CCCceEEEEeeeCC--ccEEEEEEeccccCCceeeecccccc
Q 018594          298 QIGGLVRIVNGA--YRGSNARLLGVDTD--KFCAQVKIEKGVYDGRVLNAIDYEDI  349 (353)
Q Consensus       298 ~~G~~V~IV~G~--~RG~~g~L~siD~~--~~~a~V~l~~g~~~g~~v~~l~yedi  349 (353)
                      +.|++|+|+.-+  +-..+|++.++|.+  ++-++|+++.-.+.|-.-.++..+.+
T Consensus         3 ~rGskVrIlR~ESYWyn~vGtV~svd~~gi~YPV~VRF~kvNY~g~nTNnfa~~El   58 (64)
T CHL00125          3 KRGSKVRILRKESYWYNEIGTVATVDQSGIRYPVLVRFEKVNYSGTNTNNFSLDEL   58 (64)
T ss_pred             ccCCEEEEccccceeecCcceEEEEcCCCCCccEEEEEeeeeccccccccccHHHH
Confidence            468999999665  36778999999987  67788988865555532223444433


No 61 
>PRK14638 hypothetical protein; Provisional
Probab=37.62  E-value=60  Score=29.00  Aligned_cols=69  Identities=19%  Similarity=0.188  Sum_probs=44.7

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDIC  350 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedic  350 (353)
                      .|+-.|..++-++-|+-+.++.    -..|..|.|-...-+-.+|+|.+++.+.  +++...     +..+ .+||++|.
T Consensus        75 ~Y~LEVSSPGldRpL~~~~~f~----r~~G~~v~V~~~~~k~~~G~L~~~~~~~--i~l~~~-----~~~~-~i~~~~I~  142 (150)
T PRK14638         75 SYTLEVSSPGLDRPLRGPKDYV----RFTGKLAKIVTKDGKTFIGRIESFVDGT--ITISDE-----KEKY-EINIDDVK  142 (150)
T ss_pred             ceEEEEeCCCCCCCCCCHHHHH----HhCCCEEEEEECCCcEEEEEEEEEeCCE--EEEEEC-----CcEE-EEEhHHcc
Confidence            4555565555556665444333    3469999996534577899999998765  444422     3445 48999987


Q ss_pred             c
Q 018594          351 K  351 (353)
Q Consensus       351 K  351 (353)
                      +
T Consensus       143 ~  143 (150)
T PRK14638        143 R  143 (150)
T ss_pred             e
Confidence            6


No 62 
>PRK13709 conjugal transfer nickase/helicase TraI; Provisional
Probab=37.32  E-value=2.4e+02  Score=34.79  Aligned_cols=88  Identities=14%  Similarity=0.151  Sum_probs=56.3

Q ss_pred             CcccCCcEEEEeecccCCcccccceeEEEEecCCce-EEEEecCCCeEEEEcCCce----eeecC-----CCCCeEEEEe
Q 018594          238 YWLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYV-GEIEMLEKKHVLRVDQDEL----ETVIP-----QIGGLVRIVN  307 (353)
Q Consensus       238 ~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~-c~V~l~d~g~~l~vdq~~L----ETVIP-----~~G~~V~IV~  307 (353)
                      .+..+|-+|+.-......+..|    +|..|..... .+|. ...|..+.++...+    +-.-|     +.|+++++..
T Consensus       647 ~~Y~~G~vi~~~~~~~~~~~~y----~V~~v~~~~n~LtL~-~~~G~~~~~~p~~~~~~~~vy~~~~ieiA~GDrLr~T~  721 (1747)
T PRK13709        647 DMYRPGMVMEQWNPETRSHDRY----VIDRVTAQSHSLTLR-DAQGETQVVKISSLDSSWSLFRPEKMPVADGERLRVLG  721 (1747)
T ss_pred             hcCCCCcEEEeeccccccCccE----EEEEEcCCCCEEEEE-cCCCCEEEeChHHhcccceeccccccccCCCCEEEEcc
Confidence            4558999998754322222333    7888876433 3333 34588888885443    44444     5799999994


Q ss_pred             C-----CCCCceEEEEeeeCCccEEEEEEe
Q 018594          308 G-----AYRGSNARLLGVDTDKFCAQVKIE  332 (353)
Q Consensus       308 G-----~~RG~~g~L~siD~~~~~a~V~l~  332 (353)
                      .     -..|..+++.+++...  ++|...
T Consensus       722 nd~~~~l~Ngd~~tV~~i~~~~--i~l~~~  749 (1747)
T PRK13709        722 KIPGLRLKGGDRLQVTSVSEDG--LTVVVP  749 (1747)
T ss_pred             CCcccCccCCCEEEEEEecCCe--EEEEEC
Confidence            3     3578999999998755  666654


No 63 
>PRK14637 hypothetical protein; Provisional
Probab=37.09  E-value=78  Score=28.39  Aligned_cols=47  Identities=17%  Similarity=0.163  Sum_probs=31.4

Q ss_pred             CCCCeEEEEeCCCCCc-eEEEEeeeCCccEEEEEEeccccCCceeeeccccccccc
Q 018594          298 QIGGLVRIVNGAYRGS-NARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICKL  352 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~-~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicKl  352 (353)
                      ..|..|.|-....+.. .|+|.+++.+.  +++...     +..+ .+||++|.+.
T Consensus        96 ~~G~~V~V~l~~~~~~~~G~L~~~~d~~--v~l~~~-----~~~~-~i~~~~I~ka  143 (151)
T PRK14637         96 FVGETVKVWFECTGQWQVGTIAEADETC--LVLTSD-----GVPV-TIPYVQITKA  143 (151)
T ss_pred             hCCCEEEEEECCCCcEEEEEEEEEeCCE--EEEEEC-----CEEE-EEEHHHeeeE
Confidence            3688899865223345 59999998876  444432     4445 4999999763


No 64 
>COG2163 RPL14A Ribosomal protein L14E/L6E/L27E [Translation, ribosomal structure and biogenesis]
Probab=35.81  E-value=68  Score=28.20  Aligned_cols=33  Identities=30%  Similarity=0.306  Sum_probs=25.1

Q ss_pred             ccCCcEEEEeecccCCcccccceeEEEEecCCceEEEE
Q 018594          240 LCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIE  277 (353)
Q Consensus       240 L~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~  277 (353)
                      |-+|-+|-+.     .|+|-++++||.+++|.....|.
T Consensus         5 l~~GrVvvv~-----~GR~aGkk~VIv~~iDd~~v~i~   37 (125)
T COG2163           5 LEVGRVVVVT-----AGRFAGKKVVIVKIIDDNFVLIT   37 (125)
T ss_pred             ccCCeEEEEe-----cceeCCceEEEEEEccCCEEEEe
Confidence            4567777555     57999999999999998655444


No 65 
>PRK14636 hypothetical protein; Provisional
Probab=35.58  E-value=87  Score=28.81  Aligned_cols=71  Identities=18%  Similarity=0.186  Sum_probs=42.9

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEE-eCCC---CCceEEEEeeeCCccEEEEEEeccccCCceeeeccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIV-NGAY---RGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDY  346 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV-~G~~---RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~y  346 (353)
                      .|+-.|..++-.+.|+-+.++    .-..|..|.|- +.+.   +-.+|+|.+++.+.  +++.+..    +..+ .+||
T Consensus        73 ~Y~LEVSSPGldRpL~~~~df----~r~~G~~V~V~l~~~~~g~k~~~G~L~~v~~~~--v~l~~~~----~~~~-~i~~  141 (176)
T PRK14636         73 AYRLEVSSPGIDRPLTRPKDF----ADWAGHEARIALSEPLDGRKQFRGELKGIDGDT--VTIADNK----AGEV-ILPF  141 (176)
T ss_pred             CeEEEEeCCCCCCCCCCHHHH----HHhCCCeEEEEEecccCCeEEEEEEEEEEeCCE--EEEEEcC----CcEE-EEEh
Confidence            455555555545555433222    23469988885 4333   44589999998766  5554432    3345 4899


Q ss_pred             cccccc
Q 018594          347 EDICKL  352 (353)
Q Consensus       347 edicKl  352 (353)
                      ++|.+.
T Consensus       142 ~~I~kA  147 (176)
T PRK14636        142 AAIESA  147 (176)
T ss_pred             HHcceE
Confidence            999763


No 66 
>PRK04914 ATP-dependent helicase HepA; Validated
Probab=35.02  E-value=2.8e+02  Score=32.14  Aligned_cols=63  Identities=13%  Similarity=0.083  Sum_probs=48.5

Q ss_pred             eeEEEEecCCceEEEEecCCCe--EEEEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          262 KGVVRKVIDKYVGEIEMLEKKH--VLRVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       262 KgvV~~V~d~~~c~V~l~d~g~--~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      =|+|++|.++ +++|..+.+|.  +..++..-|--|++.+|+.|.+..|    ...++..+...++.++-
T Consensus        18 lG~v~~~d~r-~vtv~fpas~e~R~ya~~~apl~Rv~~~~g~~v~~~~~----~~~~v~~v~~~~gl~~y   82 (956)
T PRK04914         18 LGTVVAVDGR-TVTLLFPATGENRLYARNDAPLTRVMFNPGDTITSHEG----WQLTVEEVEEENGLLTY   82 (956)
T ss_pred             cEEEEEEeCC-EEEEEecCCCCceeeecCCCCceeeecCCCCEEEecCC----CEEEEEEEeccCCcEEE
Confidence            5899999766 99999988755  4579999999999999999986555    45666666666654443


No 67 
>cd08768 Cdc6_C Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding. This model characterizes the winged-helix, C-terminal domain of the Cell division control protein (Cdc6_C). Cdc6 (also known as Cell division cycle 6 or Cdc18) functions as a regulator at the early stages of DNA replication, by helping to recruit and load the Minichromosome Maintenance Complex (MCM) onto DNA and may have additional roles in the control of mitotic entry. Precise duplication of chromosomal DNA is required for genomic stability during replication. Cdc6 has an essential role in DNA replication and irregular expression of Cdc6 may lead to genomic instability. Cdc6 over-expression is observed in many cancerous lesions. DNA replication begins when an origin recognition complex (ORC) binds to a replication origin site on the chromatin. Studies indicate that Cdc6 interacts with ORC through the Orc1 subunit, and that this association increases
Probab=34.61  E-value=26  Score=27.18  Aligned_cols=65  Identities=17%  Similarity=0.201  Sum_probs=49.7

Q ss_pred             HHHHHHHHHHHhccCCcccccceeeeeecccccceeecccccccHHHHHHHhcccccEEEeecCce
Q 018594           18 EFEAGFLELMRRSHRFSRIAATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLGRTGKCKVEETPKG   83 (353)
Q Consensus        18 eF~~~Fl~lLr~~~g~krV~aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lgr~G~c~vdetekG   83 (353)
                      -|.-+-+.++++ -|...+..+.||+.|-.-=.+.+++.-.+....+++.-|.-.|++.++..-+|
T Consensus         6 l~L~Al~~~~~~-~~~~~~~~~~vy~~Y~~~c~~~~~~~l~~~~~~~~l~~L~~~gli~~~~~~~g   70 (87)
T cd08768           6 LVLLALLLLFKR-GGEEEATTGEVYEVYEELCEEIGVDPLTQRRISDLLSELEMLGLLETEVSSKG   70 (87)
T ss_pred             HHHHHHHHHHhc-CCCCCccHHHHHHHHHHHHHHcCCCCCcHHHHHHHHHHHHHcCCeEEEEecCC
Confidence            344444545533 35677899999999986666678888999999999999999999999876544


No 68 
>PRK09014 rfaH transcriptional activator RfaH; Provisional
Probab=34.39  E-value=1e+02  Score=27.14  Aligned_cols=52  Identities=8%  Similarity=-0.011  Sum_probs=39.7

Q ss_pred             CcccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCe--EEEEcCCceee
Q 018594          238 YWLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKH--VLRVDQDELET  294 (353)
Q Consensus       238 ~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~--~l~vdq~~LET  294 (353)
                      ..+.+|=.|+|+     +|-|-+-.|+|.++.++..+.|.+.--|+  .+.|+.++|+.
T Consensus       108 ~~~~~G~~V~I~-----~Gp~~g~eg~v~~~~~~~r~~v~v~~~gr~~~v~v~~~~~~~  161 (162)
T PRK09014        108 ETPKPGDKVIIT-----EGAFEGLQAIYTEPDGEARSILLLNLLNKQVKHSVDNTQFRK  161 (162)
T ss_pred             cCCCCCCEEEEe-----cCCCCCcEEEEEEeCCCeEEEEeehhhCCcEEEEECHHHeec
Confidence            347889999998     56788899999999877777776643344  55788888864


No 69 
>COG1096 Predicted RNA-binding protein (consists of S1 domain and a Zn-ribbon domain) [Translation, ribosomal structure and biogenesis]
Probab=33.92  E-value=4.1e+02  Score=25.06  Aligned_cols=91  Identities=18%  Similarity=0.335  Sum_probs=59.0

Q ss_pred             cccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecCCCCCeEE--EEeCCCCCceEE
Q 018594          239 WLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVR--IVNGAYRGSNAR  316 (353)
Q Consensus       239 WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~--IV~G~~RG~~g~  316 (353)
                      ...||=+|-..-..+....-|.+.|.|....-+   .+..++...++.|...-.++.+|+.|+.|.  |+...-+.....
T Consensus         7 ~v~PGd~~a~~EE~~~G~gt~~~~g~i~Aa~~G---~~~~d~~n~~~~V~p~~~~~~~~K~GdiV~grV~~v~~~~a~V~   83 (188)
T COG1096           7 FVLPGDVLAVIEEFLPGEGTYEEGGEIRAAATG---VVRRDDKNRVISVKPGKKTPPLPKGGDIVYGRVTDVREQRALVR   83 (188)
T ss_pred             EEcCcceeeeeeeeecCCCeEeECCEEEEeecc---cEEEcccceEEEeccCCCCCCCCCCCCEEEEEEeeccceEEEEE
Confidence            445665665554444433456667777776443   355556677888888888999999999875  445555666667


Q ss_pred             EEeeeCCc------cEEEEEEe
Q 018594          317 LLGVDTDK------FCAQVKIE  332 (353)
Q Consensus       317 L~siD~~~------~~a~V~l~  332 (353)
                      +.+++...      +.+.+.+.
T Consensus        84 i~~ve~~~r~~~~~~~~~ihvs  105 (188)
T COG1096          84 IVGVEGKERELATSGAADIHVS  105 (188)
T ss_pred             EEEEecccccCCCCceeeEEEE
Confidence            77777644      55655554


No 70 
>PRK05609 nusG transcription antitermination protein NusG; Validated
Probab=33.80  E-value=1.4e+02  Score=26.52  Aligned_cols=53  Identities=17%  Similarity=0.177  Sum_probs=39.8

Q ss_pred             CCcccCCcEEEEeecccCCcccccceeEEEEec-CCceEEEEecCCC--eEEEEcCCceee
Q 018594          237 DYWLCEGIIVKVMSKALADKGYNKQKGVVRKVI-DKYVGEIEMLEKK--HVLRVDQDELET  294 (353)
Q Consensus       237 ~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~-d~~~c~V~l~d~g--~~l~vdq~~LET  294 (353)
                      ...+.+|=.|+|+     +|-|-+..|+|..+. ++..+.|.+.--|  ..+.|+.+.||.
T Consensus       124 ~~~~~~Gd~VrI~-----~GPf~G~~g~v~~i~~~~~r~~v~l~~~G~~~~v~l~~~~l~~  179 (181)
T PRK05609        124 KVDFEVGEMVRVI-----DGPFADFNGTVEEVDYEKSKLKVLVSIFGRETPVELEFSQVEK  179 (181)
T ss_pred             ccCCCCCCEEEEe-----ccCCCCCEEEEEEEeCCCCEEEEEEEECCCceEEEEchHHEEE
Confidence            4778899999998     567888999999997 4546666654334  456788888765


No 71 
>KOG3421 consensus 60S ribosomal protein L14 [Translation, ribosomal structure and biogenesis]
Probab=33.43  E-value=43  Score=29.93  Aligned_cols=36  Identities=17%  Similarity=0.338  Sum_probs=28.0

Q ss_pred             CCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCC
Q 018594          299 IGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDG  338 (353)
Q Consensus       299 ~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g  338 (353)
                      +|.-++|..|+|.|...-+..++..+   .+.++ ||+++
T Consensus         9 VGrva~v~~G~~~GkL~AIVdviDqn---r~lvD-Gp~t~   44 (136)
T KOG3421|consen    9 VGRVALVSFGPDAGKLVAIVDVIDQN---RALVD-GPCTG   44 (136)
T ss_pred             cceEEEEEecCCCceEEEEEEeecch---hhhcc-Ccccc
Confidence            58889999999999999999998776   23334 67554


No 72 
>PRK04012 translation initiation factor IF-1A; Provisional
Probab=32.46  E-value=3.1e+02  Score=23.13  Aligned_cols=60  Identities=12%  Similarity=0.175  Sum_probs=35.4

Q ss_pred             ceeEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEEEeCCCCCceEEEEee
Q 018594          261 QKGVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGV  320 (353)
Q Consensus       261 ~KgvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~si  320 (353)
                      .-|+|..+.+.+...|.+.++..++ .++..+=-+|-=..|+.|+|-.=+|--..|.++-+
T Consensus        23 ~~g~V~~~lG~~~~~V~~~dG~~~la~i~GK~Rk~IwI~~GD~VlVe~~~~~~~kg~Iv~r   83 (100)
T PRK04012         23 VFGVVEQMLGANRVRVRCMDGVERMGRIPGKMKKRMWIREGDVVIVAPWDFQDEKADIIWR   83 (100)
T ss_pred             EEEEEEEEcCCCEEEEEeCCCCEEEEEEchhhcccEEecCCCEEEEEecccCCCEEEEEEE
Confidence            3467888888878888887644444 46555544444456666666544444444444433


No 73 
>PF02736 Myosin_N:  Myosin N-terminal SH3-like domain;  InterPro: IPR004009 This domain has an SH3-like fold. It is found at the N terminus of many but not all myosins. The function of this domain is unknown.; GO: 0003774 motor activity, 0005524 ATP binding, 0016459 myosin complex; PDB: 2EC6_A 2W4H_M 1O1E_P 1O1D_D 1O18_A 1O1C_P 1O1B_D 1O1F_A 2W4A_M 2W4G_M ....
Probab=31.62  E-value=1.1e+02  Score=21.40  Aligned_cols=28  Identities=14%  Similarity=0.146  Sum_probs=21.9

Q ss_pred             ceeEEEEecCCceEEEEecCCCeEEEEcCC
Q 018594          261 QKGVVRKVIDKYVGEIEMLEKKHVLRVDQD  290 (353)
Q Consensus       261 ~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~  290 (353)
                      -+|.|.+..++ .++|++.+ |..+.|..+
T Consensus        14 v~g~I~~~~g~-~vtV~~~~-G~~~tv~~d   41 (42)
T PF02736_consen   14 VKGEIIEEEGD-KVTVKTED-GKEVTVKKD   41 (42)
T ss_dssp             EEEEEEEEESS-EEEEEETT-TEEEEEEGG
T ss_pred             EEEEEEEEcCC-EEEEEECC-CCEEEeCCC
Confidence            46888887665 89999987 888877654


No 74 
>cd05793 S1_IF1A S1_IF1A: Translation initiation factor IF1A, also referred to as eIF1A in eukaryotes and aIF1A in archaea, S1-like RNA-binding domain. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. IF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1A also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. This protein family is only found in eukaryotes and archaea.
Probab=31.39  E-value=2.7e+02  Score=22.18  Aligned_cols=56  Identities=20%  Similarity=0.286  Sum_probs=31.2

Q ss_pred             eEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEEEeCCCCCceEEEE
Q 018594          263 GVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRIVNGAYRGSNARLL  318 (353)
Q Consensus       263 gvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~  318 (353)
                      |.|..+.+.....|.+.++..++ .++-.+=-.+-=.+|+.|+|-.=+|--..|.++
T Consensus         4 g~V~~~~g~~~~~V~~~~g~~~la~i~gK~rk~iwI~~GD~V~Ve~~~~d~~kg~Iv   60 (77)
T cd05793           4 GQVEKMLGNGRLEVRCFDGKKRLCRIRGKMRKRVWINEGDIVLVAPWDFQDDKADII   60 (77)
T ss_pred             EEEEEEcCCCEEEEEECCCCEEEEEEchhhcccEEEcCCCEEEEEeccccCCEEEEE
Confidence            67888888778888887644433 455444433333456666554333433344333


No 75 
>TIGR00523 eIF-1A eukaryotic/archaeal initiation factor 1A. Recommended nomenclature: eIF-1A for eukaryotes, aIF-1A for Archaea. Also called eIF-4C
Probab=30.90  E-value=2.6e+02  Score=23.49  Aligned_cols=58  Identities=19%  Similarity=0.248  Sum_probs=33.2

Q ss_pred             CcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEE
Q 018594          243 GIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRI  305 (353)
Q Consensus       243 ~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~I  305 (353)
                      .+.+++..+.  +|.   .-|.|..+.+.+...|.+.++..++ .|+..+=-.|-=..|+.|+|
T Consensus         8 ~~~~~~p~~~--e~e---~~g~V~~~lG~~~~~V~~~dG~~~la~i~GK~Rk~iwI~~GD~VlV   66 (99)
T TIGR00523         8 QIRVRLPRKE--EGE---ILGVIEQMLGAGRVKVRCLDGKTRLGRIPGKLKKRIWIREGDVVIV   66 (99)
T ss_pred             cceeeCCCCC--CCE---EEEEEEEEcCCCEEEEEeCCCCEEEEEEchhhcccEEecCCCEEEE
Confidence            3455666442  333   3467888888878888877644333 45544443333445555555


No 76 
>PF14505 DUF4438:  Domain of unknown function (DUF4438); PDB: 3N99_N 3DCL_A.
Probab=30.74  E-value=51  Score=32.24  Aligned_cols=35  Identities=20%  Similarity=0.317  Sum_probs=27.3

Q ss_pred             CCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEec
Q 018594          299 IGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEK  333 (353)
Q Consensus       299 ~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~  333 (353)
                      +|...+|+.|+-+|..|.+...+..-.-+.|.++.
T Consensus        60 iGN~A~VvSG~AKG~~G~VtGkHGGieHVlV~F~~   94 (258)
T PF14505_consen   60 IGNEAKVVSGDAKGAKGVVTGKHGGIEHVLVDFPD   94 (258)
T ss_dssp             BT-EEEE-SSTTTT-EEEEEEEETTTTEEEEE--H
T ss_pred             cCceeEEeecccCCCcCeEecccCCeeeEEEECCH
Confidence            69999999999999999999999988788887764


No 77 
>PF07076 DUF1344:  Protein of unknown function (DUF1344);  InterPro: IPR009780 This family consists of several short, hypothetical bacterial proteins of around 80 residues in length. Members of this family are found in Rhizobium, Agrobacterium and Brucella species. The function of this family is unknown.
Probab=30.09  E-value=2.3e+02  Score=22.09  Aligned_cols=43  Identities=9%  Similarity=0.110  Sum_probs=30.6

Q ss_pred             ceeEEEEecCCceEEEEecCCCeEEEEcCCceeeecC--CCCCeEEEEeC
Q 018594          261 QKGVVRKVIDKYVGEIEMLEKKHVLRVDQDELETVIP--QIGGLVRIVNG  308 (353)
Q Consensus       261 ~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~LETVIP--~~G~~V~IV~G  308 (353)
                      -.|+|.+|... +.+|.|. +|...+++.+.=   ++  ++|.+|+|..-
T Consensus         5 veG~I~~id~~-~~titLd-DGksy~lp~ef~---~~~L~~G~kV~V~yd   49 (61)
T PF07076_consen    5 VEGTIKSIDPE-TMTITLD-DGKSYKLPEEFD---FDGLKPGMKVVVFYD   49 (61)
T ss_pred             ceEEEEEEcCC-ceEEEec-CCCEEECCCccc---ccccCCCCEEEEEEE
Confidence            36899998554 8888775 588887765432   55  57899988754


No 78 
>PRK04950 ProP expression regulator; Provisional
Probab=29.95  E-value=1e+02  Score=29.52  Aligned_cols=35  Identities=23%  Similarity=0.538  Sum_probs=28.4

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccc
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGV  335 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~  335 (353)
                      +.|..|.|--| -.-.-|++++|+.++  |.|+|++|.
T Consensus       168 ~~gq~v~vk~g-~~~~~a~i~ei~kd~--v~vql~~Gl  202 (213)
T PRK04950        168 TVGQAVKVKAG-KSAMDATVLEITKDD--VRVQLDSGL  202 (213)
T ss_pred             ccCCEEEEecc-CCCCceEEEEEecCc--EEEEcCCCc
Confidence            34888888888 445779999999998  999999863


No 79 
>TIGR00739 yajC preprotein translocase, YajC subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas.
Probab=29.70  E-value=1e+02  Score=25.00  Aligned_cols=30  Identities=10%  Similarity=0.235  Sum_probs=23.1

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEec
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEK  333 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~  333 (353)
                      ++|++|+-..|    -.|++.+++.+.  +.|++..
T Consensus        39 ~~Gd~VvT~gG----i~G~V~~i~d~~--v~vei~~   68 (84)
T TIGR00739        39 KKGDKVLTIGG----IIGTVTKIAENT--IVIELND   68 (84)
T ss_pred             CCCCEEEECCC----eEEEEEEEeCCE--EEEEECC
Confidence            46899977766    889999999764  6777654


No 80 
>PRK14631 hypothetical protein; Provisional
Probab=29.52  E-value=1.2e+02  Score=27.84  Aligned_cols=73  Identities=29%  Similarity=0.239  Sum_probs=43.9

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEe----CCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVN----GAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDY  346 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~----G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~y  346 (353)
                      .|+-.|..++-.+-|+=.    .-.....|..|.|-.    +..+-.+|+|.+++-++..+++.+..    +..+ .++|
T Consensus        92 ~Y~LEVSSPGldRpL~~~----~df~r~~G~~V~V~l~~~~~~~k~~~G~L~~v~~~~~~v~l~~~~----~~~~-~i~~  162 (174)
T PRK14631         92 EYALEVSSPGWDRPFFQL----EQLQGYIGQQVALRLIAAVENRRKFQAKLLAVDLENEEIQVEVEG----KHVL-DIDS  162 (174)
T ss_pred             CeEEEEeCCCCCCcCCCH----HHHHHhCCCeEEEEEecccCCceEEEEEEEEeecCCCEEEEEEcC----CcEE-EEEh
Confidence            355555555445555322    223445688888863    33567889999998434446665542    3345 4899


Q ss_pred             cccccc
Q 018594          347 EDICKL  352 (353)
Q Consensus       347 edicKl  352 (353)
                      ++|.|.
T Consensus       163 ~~I~ka  168 (174)
T PRK14631        163 NNIDKA  168 (174)
T ss_pred             HHcceE
Confidence            999763


No 81 
>PF04717 Phage_base_V:  Phage-related baseplate assembly protein;  InterPro: IPR006531 This domain occurs in a family of phage (and bacteriocin) proteins related to the phage P2 V gene product, which forms the small spike at the tip of the tail. Homologs in general are annotated as baseplate assembly protein V. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain. It is also found in Vgr-related proteins. Genes encoding type VI secretion systems (T6SS) are widely distributed in pathogenic Gram-negative bacterial species. In Vibrio cholerae, T6SS have been found to secrete three related proteins extracellularly, VgrG-1, VgrG-2, and VgrG-3. VgrG-1 can covalently cross-link actin in vitro, and this activity was used to demonstrate that V. cholerae can translocate VgrG-1 into macrophages by a T6SS-dependent mechanism. VgrG-related proteins likely assemble into a trimeric complex that is analogous to that formed by the two trimeric proteins gp27 and gp5 that make up the baseplate "tail spike" of Escherichia coli bacteriophage T4. The VgrG components of the T6SS apparatus might assemble a "cell-puncturing device" analogous to phage tail spikes to deliver effector protein domains through membranes of target host cells. Gp5 is an integral component of the virion baseplate of bacteriophage T4. T4 Gp5 consists of 3 domains connected via long linkers: the N-terminal oligosaccharide/oligonucleotide-binding (OB)-fold domain, the middle lysozyme domain, and the C-terminal triple-stranded beta-helix. The equivalent of the Gp5 OB-fold domain in the structure of VgrG is the domain of unknown function comprising residues 380-470 and conserved in all known VgrGs. This entry represents the OB-fold domain which consists of a 5-stranded antiparallel beta-barrel with a Greek-key topology.; PDB: 3AQJ_C 3QR8_A 2P5Z_X.
Probab=29.35  E-value=1.2e+02  Score=23.42  Aligned_cols=45  Identities=22%  Similarity=0.177  Sum_probs=21.0

Q ss_pred             eEEEEecC-CceEEEEecCCCeEE----EEcCCc---e-eeecCCCCCeEEEEe
Q 018594          263 GVVRKVID-KYVGEIEMLEKKHVL----RVDQDE---L-ETVIPQIGGLVRIVN  307 (353)
Q Consensus       263 gvV~~V~d-~~~c~V~l~d~g~~l----~vdq~~---L-ETVIP~~G~~V~IV~  307 (353)
                      |+|.+|.. ..++.|++.+.+..+    .+-+-+   . --..|.+|+.|+|+.
T Consensus         1 G~V~~v~~~~grvrV~~~~~~~~~s~Wl~~~~~~ag~~g~~~~P~iGeqV~v~~   54 (79)
T PF04717_consen    1 GTVTAVDPDKGRVRVRFPDDGDIVSDWLPVLQPRAGGWGFWFPPEIGEQVLVLF   54 (79)
T ss_dssp             EEEEEEETTTTEEEEE-B-CTTEEEEEEEE--S-BSSSB------TT-EEEEEE
T ss_pred             CeEEEEECCCCEEEEEEecCCCccceEEEeeehhccCCeeEccCCCCcEEEEEc
Confidence            67777754 467888874444432    222211   2 234578999999984


No 82 
>PF06003 SMN:  Survival motor neuron protein (SMN);  InterPro: IPR010304 This family consists of several eukaryotic survival motor neuron (SMN) proteins. The Survival of Motor Neurons (SMN) protein, the product of the spinal muscular atrophy-determining gene, is part of a large macromolecular complex (SMN complex) that functions in the assembly of spliceosomal small nuclear ribonucleoproteins (snRNPs). The SMN complex functions as a specificity factor essential for the efficient assembly of Sm proteins on U snRNAs and likely protects cells from illicit, and potentially deleterious, non-specific binding of Sm proteins to RNAs.; GO: 0003723 RNA binding, 0006397 mRNA processing, 0005634 nucleus, 0005737 cytoplasm; PDB: 1MHN_A 4A4G_A 3S6N_M 4A4E_A 1G5V_A 4A4H_A 4A4F_A 2D9T_A.
Probab=29.06  E-value=1.4e+02  Score=29.17  Aligned_cols=57  Identities=11%  Similarity=0.066  Sum_probs=37.5

Q ss_pred             CCcccCCcEEEEeecccCCcccccceeEEEEecC-CceEEEEecCCCeEEEEcCCceeeecCC
Q 018594          237 DYWLCEGIIVKVMSKALADKGYNKQKGVVRKVID-KYVGEIEMLEKKHVLRVDQDELETVIPQ  298 (353)
Q Consensus       237 ~~WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d-~~~c~V~l~d~g~~l~vdq~~LETVIP~  298 (353)
                      ..| ..|=....  ..-.||.||.  ++|.+|.. +.+|+|...+-|..-.|.-.+|-+.-..
T Consensus        67 ~~W-kvGd~C~A--~~s~Dg~~Y~--A~I~~i~~~~~~~~V~f~gYgn~e~v~l~dL~~~~~~  124 (264)
T PF06003_consen   67 KKW-KVGDKCMA--VYSEDGQYYP--ATIESIDEEDGTCVVVFTGYGNEEEVNLSDLKPSEGD  124 (264)
T ss_dssp             T----TT-EEEE--E-TTTSSEEE--EEEEEEETTTTEEEEEETTTTEEEEEEGGGEEETT--
T ss_pred             cCC-CCCCEEEE--EECCCCCEEE--EEEEEEcCCCCEEEEEEcccCCeEeeehhhhcccccc
Confidence            467 44544444  4567999997  89999975 5699999988788777777777655443


No 83 
>PRK14640 hypothetical protein; Provisional
Probab=28.65  E-value=1.1e+02  Score=27.22  Aligned_cols=69  Identities=14%  Similarity=0.144  Sum_probs=41.8

Q ss_pred             ceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEe----CCCCCceEEEEeeeCCccEEEEEEeccccCCceeeecccc
Q 018594          272 YVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVN----GAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYE  347 (353)
Q Consensus       272 ~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~----G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~ye  347 (353)
                      |.-.|..++-++-|+-+..    ..-..|..|.|-.    +..+-..|+|.+++.+.  +++.+.     |+.+ .+||+
T Consensus        73 Y~LEVSSPGl~RpL~~~~~----f~r~~G~~v~V~l~~~~~~~k~~~G~L~~v~~~~--v~l~~~-----~~~~-~i~~~  140 (152)
T PRK14640         73 YYLEVSSPGLDRPLFKVAQ----FEKYVGQEAAVTLRMATNNRRKFKGVIKAVQGDM--ITLTVD-----GKDE-VLAFT  140 (152)
T ss_pred             eEEEEeCCCCCCcCCCHHH----HHHhCCCeEEEEEecccCCceEEEEEEEEEeCCE--EEEEEC-----CeEE-EEEhH
Confidence            4444544444454443322    2234688888863    33567789999998765  544433     3445 48999


Q ss_pred             ccccc
Q 018594          348 DICKL  352 (353)
Q Consensus       348 dicKl  352 (353)
                      +|.+.
T Consensus       141 ~I~ka  145 (152)
T PRK14640        141 NIQKA  145 (152)
T ss_pred             HeeeE
Confidence            98763


No 84 
>PF08863 YolD:  YolD-like protein;  InterPro: IPR014962 These proteins are functionally uncharacterised. However it has been predicted that these proteins are functionally equivalent to the UmuD subunit of polymerase V from Gram-negative bacteria.
Probab=28.62  E-value=1.3e+02  Score=23.53  Aligned_cols=41  Identities=22%  Similarity=0.394  Sum_probs=30.2

Q ss_pred             eCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccccc
Q 018594          307 NGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICKL  352 (353)
Q Consensus       307 ~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicKl  352 (353)
                      +|.|.-.+|++..+|.......+.-..    +... .++|+||+.+
T Consensus        52 ~g~~~~~~G~I~~id~~~~~l~~~~~~----~~~~-~I~~~~I~~I   92 (92)
T PF08863_consen   52 DGYYQSVTGTIHKIDEINRTLKLKDED----GETE-KIPFDDIIDI   92 (92)
T ss_pred             CCeeEEEEEEEEEEcCCCCEEEEEeCC----CCEE-EEEhhhEEEC
Confidence            677888999999999999766654321    3333 5999999853


No 85 
>COG5164 SPT5 Transcription elongation factor [Transcription]
Probab=27.71  E-value=98  Score=33.14  Aligned_cols=44  Identities=20%  Similarity=0.437  Sum_probs=35.3

Q ss_pred             CcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCce
Q 018594          243 GIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDEL  292 (353)
Q Consensus       243 ~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~L  292 (353)
                      +-.|||.     .|.|-++=|||+||.+. .|.|+|-.....+.|+-+.|
T Consensus       355 gktVrIr-----~g~yKG~lGVVKdv~~~-~arVeLhs~nK~VTI~K~~l  398 (607)
T COG5164         355 GKTVRIR-----CGEYKGHLGVVKDVDRN-IARVELHSNNKFVTIEKSRL  398 (607)
T ss_pred             CceEEEe-----ecccccccceeeeccCc-eEEEEEecCCceEEeehhhe
Confidence            5578886     34688899999999765 99999987777788887777


No 86 
>cd04456 S1_IF1A_like S1_IF1A_like: Translation initiation factor IF1A-like, S1-like RNA-binding domain. IF1A is also referred to as eIF1A in eukaryotes and aIF1A in archaea. S1-like RNA-binding domains are found in a wide variety of RNA-associated proteins. IF1A is essential for translation initiation. eIF1A acts synergistically with eIF1 to mediate assembly of ribosomal initiation complexes at the initiation codon and maintain the accuracy of this process by recognizing and destabilizing aberrant preinitiation complexes from the mRNA. Without eIF1A and eIF1, 43S ribosomal preinitiation complexes can bind to the cap-proximal region, but are unable to reach the initiation codon. eIF1A also enhances the formation of 5'-terminal complexes in the presence of other translation initiation factors. This protein family is only found in eukaryotes and archaea.
Probab=27.50  E-value=3.2e+02  Score=21.83  Aligned_cols=48  Identities=10%  Similarity=0.068  Sum_probs=31.3

Q ss_pred             eEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEEEeCCC
Q 018594          263 GVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRIVNGAY  310 (353)
Q Consensus       263 gvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~IV~G~~  310 (353)
                      |+|....+.+...|++.|+..++ .++..+=-+|-=..|+.|+|-.-+|
T Consensus         4 ~~V~~~lG~~~~~V~~~dg~~~l~~i~gK~Rk~iwI~~GD~VlV~~~~~   52 (78)
T cd04456           4 VRVLRMLGNNRHEVECADGQRRLVSIPGKLRKNIWIKRGDFLIVDPIEE   52 (78)
T ss_pred             EEEEEECCCCEEEEEECCCCEEEEEEchhhccCEEEcCCCEEEEEeccc
Confidence            67788877778888887654444 5766666664446677776644333


No 87 
>PRK14647 hypothetical protein; Provisional
Probab=27.36  E-value=96  Score=27.87  Aligned_cols=71  Identities=18%  Similarity=0.253  Sum_probs=42.9

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEe-C--------CCCCceEEEEeeeCCccEEEEEEeccccCCcee
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVN-G--------AYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVL  341 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~-G--------~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v  341 (353)
                      .|+-.|..++-++.|+-+..+    .-..|..|.|-. .        ..+-..|+|.+++.+.  +++.+..    ++.+
T Consensus        74 ~Y~LEVSSPG~~RpL~~~~~f----~r~~G~~v~V~l~~~~~~~~~~~~~~~~G~L~~~~~~~--v~l~~~~----~~~~  143 (159)
T PRK14647         74 RYTLEVSSPGLDRPLKKEADY----ERYAGRLVKVRTFELLADEAGNKRKTFLGELEGLADGV--VTIALKE----GQQA  143 (159)
T ss_pred             CeEEEEcCCCCCCcCCCHHHH----HHhCCcEEEEEEeccccccccCCceEEEEEEEeecCCE--EEEEEcC----CcEE
Confidence            345555555445555433322    245688888863 1        2466789999998655  5554432    3445


Q ss_pred             eeccccccccc
Q 018594          342 NAIDYEDICKL  352 (353)
Q Consensus       342 ~~l~yedicKl  352 (353)
                       .+||++|.+.
T Consensus       144 -~i~~~~I~ka  153 (159)
T PRK14647        144 -RIPLDKIAKA  153 (159)
T ss_pred             -EEEHHHCCEE
Confidence             4899999863


No 88 
>PRK05585 yajC preprotein translocase subunit YajC; Validated
Probab=26.99  E-value=1.1e+02  Score=25.85  Aligned_cols=30  Identities=17%  Similarity=0.220  Sum_probs=22.8

Q ss_pred             CCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEec
Q 018594          298 QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEK  333 (353)
Q Consensus       298 ~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~  333 (353)
                      ++|++|+-..|    -.|++.++|.+  .+.|++..
T Consensus        54 k~Gd~VvT~gG----i~G~Vv~i~~~--~v~lei~~   83 (106)
T PRK05585         54 AKGDEVVTNGG----IIGKVTKVSED--FVIIELND   83 (106)
T ss_pred             CCCCEEEECCC----eEEEEEEEeCC--EEEEEECC
Confidence            46888877766    88999999864  57777754


No 89 
>PRK02749 photosystem I reaction center subunit IV; Provisional
Probab=26.91  E-value=1.1e+02  Score=24.45  Aligned_cols=39  Identities=26%  Similarity=0.381  Sum_probs=29.9

Q ss_pred             CCCCeEEEEeCC--CCCceEEEEeeeCC--ccEEEEEEecccc
Q 018594          298 QIGGLVRIVNGA--YRGSNARLLGVDTD--KFCAQVKIEKGVY  336 (353)
Q Consensus       298 ~~G~~V~IV~G~--~RG~~g~L~siD~~--~~~a~V~l~~g~~  336 (353)
                      +.|++|+|++-+  +-..+|++.++|.+  ++-++|+++.-.+
T Consensus         4 ~rGskVrIlR~ESYWyn~vGtV~svD~sgi~YPV~VRF~kvNY   46 (71)
T PRK02749          4 SRGDKVRILRPESYWYNEVGTVASVDKSGIKYPVIVRFDKVNY   46 (71)
T ss_pred             ccCCEEEEccccceeecCcceEEEEccCCCeeeEEEEeeeeec
Confidence            468999999665  46788999999987  6778888875433


No 90 
>cd04451 S1_IF1 S1_IF1: Translation Initiation Factor IF1, S1-like RNA-binding domain. IF1 contains an S1-like RNA-binding domain, which is found in a wide variety of RNA-associated proteins. Translation initiation includes a number of interrelated steps preceding the formation of the first peptide bond. In Escherichia coli, the initiation mechanism requires, in addition to mRNA, fMet-tRNA, and ribosomal subunits,  the presence of three additional proteins (initiation factors IF1, IF2, and IF3) and at least one GTP molecule. The three initiation factors influence both the kinetics and the stability of ternary complex formation. IF1 is the smallest of the three factors. IF1 enhances the rate of 70S ribosome subunit association and dissociation and the interaction of 30S ribosomal subunit with IF2 and IF3. It stimulates 30S complex formation. In addition, by binding to the A-site of the 30S ribosomal subunit, IF1 may contribute to the fidelity of the selection of the initiation site of th
Probab=26.88  E-value=2.7e+02  Score=20.74  Aligned_cols=24  Identities=17%  Similarity=0.209  Sum_probs=14.1

Q ss_pred             eecCCCCCeEEEEeCCCCCceEEE
Q 018594          294 TVIPQIGGLVRIVNGAYRGSNARL  317 (353)
Q Consensus       294 TVIP~~G~~V~IV~G~~RG~~g~L  317 (353)
                      .+-|..|+.|.+-...+-+..|.+
T Consensus        38 ~~~~~vGD~V~~~~~~~~~~~g~I   61 (64)
T cd04451          38 YIRILPGDRVKVELSPYDLTKGRI   61 (64)
T ss_pred             CcccCCCCEEEEEEeecCCCEEEE
Confidence            445889999977643333333433


No 91 
>PRK00276 infA translation initiation factor IF-1; Validated
Probab=26.88  E-value=3e+02  Score=21.30  Aligned_cols=56  Identities=14%  Similarity=0.190  Sum_probs=29.1

Q ss_pred             eeEEEEecCCceEEEEecCCCeEE--EEcCCce-eeecCCCCCeEEEEeCCCCCceEEEE
Q 018594          262 KGVVRKVIDKYVGEIEMLEKKHVL--RVDQDEL-ETVIPQIGGLVRIVNGAYRGSNARLL  318 (353)
Q Consensus       262 KgvV~~V~d~~~c~V~l~d~g~~l--~vdq~~L-ETVIP~~G~~V~IV~G~~RG~~g~L~  318 (353)
                      +|+|..+..+..+.|.+.+ |..+  .+....= --+-|.+|+.|.|---.+-...|.++
T Consensus        10 ~G~Vi~~~~~~~y~V~~~~-g~~~~c~~~Gklr~~~i~i~vGD~V~ve~~~~~~~~g~Iv   68 (72)
T PRK00276         10 EGTVVEALPNAMFRVELEN-GHEVLAHISGKMRKNYIRILPGDKVTVELSPYDLTKGRIT   68 (72)
T ss_pred             EEEEEEEcCCCEEEEEeCC-CCEEEEEEccceeeCCcccCCCCEEEEEEcccCCCeEEEE
Confidence            4667766654344455432 3332  2333222 13448889999877444444445444


No 92 
>PLN00208 translation initiation factor (eIF); Provisional
Probab=26.82  E-value=4.8e+02  Score=23.60  Aligned_cols=60  Identities=12%  Similarity=0.103  Sum_probs=39.9

Q ss_pred             eEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeC
Q 018594          263 GVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDT  322 (353)
Q Consensus       263 gvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~  322 (353)
                      |+|..+.+...+.|...+...++ .|+-.+=-.|-=.+|+.|+|-.-+|--..|.++-+-.
T Consensus        36 g~V~~~lGn~~~~V~c~dG~~rLa~IpGKmRKrIWI~~GD~VlVel~~~d~~KgdIv~ry~   96 (145)
T PLN00208         36 AQVLRMLGNGRCEALCIDGTKRLCHIRGKMRKKVWIAAGDIILVGLRDYQDDKADVILKYM   96 (145)
T ss_pred             EEEEEEcCCCEEEEEECCCCEEEEEEeccceeeEEecCCCEEEEEccCCCCCEEEEEEEcC
Confidence            46777777777777776654444 5666555544456788888876667777777776544


No 93 
>COG0779 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=25.95  E-value=1.5e+02  Score=26.89  Aligned_cols=45  Identities=22%  Similarity=0.497  Sum_probs=32.3

Q ss_pred             CCCeEEEEe----CCCCCceEEEEeeeCCccEEEEEEeccccCCceeeecccccccc
Q 018594          299 IGGLVRIVN----GAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDICK  351 (353)
Q Consensus       299 ~G~~V~IV~----G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedicK  351 (353)
                      .|..|+|..    ..-+-..|+|.++|.+.  +++.++     ++.+ .+||.+|.|
T Consensus        98 ~G~~Vkv~l~~~~~~~k~~~G~i~~~d~~~--v~~~~~-----~k~v-~Ip~~~i~k  146 (153)
T COG0779          98 IGEKVKVKLRLPIEGRKKFEGKIVAVDGET--VTLEVD-----GKEV-EIPFSDIAK  146 (153)
T ss_pred             cCcEEEEEEecccCCceEEEEEEEEEcCCe--EEEEEC-----CEEE-EEEcccchh
Confidence            577777775    33566789999999888  555443     4446 499999876


No 94 
>PRK00092 ribosome maturation protein RimP; Reviewed
Probab=25.83  E-value=1.1e+02  Score=27.07  Aligned_cols=71  Identities=21%  Similarity=0.286  Sum_probs=43.0

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEE-e---CCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIV-N---GAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDY  346 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV-~---G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~y  346 (353)
                      .|+-.|..++-++-|+-+..    .....|..|.|- +   +.-+-..|+|.+++.+.  +++.+...   ++.+ .++|
T Consensus        73 ~Y~LEVSSPGi~RpL~~~~~----f~r~~G~~v~V~~~~~~~~~~~~~G~L~~~~~~~--i~l~~~~~---~~~~-~i~~  142 (154)
T PRK00092         73 AYTLEVSSPGLDRPLKKARD----FRRFIGREVKVKLYEPIDGRKKFQGILLAVDGET--VTLEVEGK---EKEV-EIPL  142 (154)
T ss_pred             CeEEEEeCCCCCCcCCCHHH----HHHhCCCeEEEEEEcccCCceEEEEEEEEeeCCE--EEEEECCC---eEEE-EEEH
Confidence            35555555554555543322    335679999996 2   33345589999998876  44444321   1245 4999


Q ss_pred             ccccc
Q 018594          347 EDICK  351 (353)
Q Consensus       347 edicK  351 (353)
                      ++|.+
T Consensus       143 ~~I~~  147 (154)
T PRK00092        143 DNIAK  147 (154)
T ss_pred             HHcce
Confidence            99876


No 95 
>cd05696 S1_Rrp5_repeat_hs4 S1_Rrp5_repeat_hs4: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 4 (hs4). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=25.58  E-value=2.3e+02  Score=21.39  Aligned_cols=57  Identities=19%  Similarity=0.095  Sum_probs=31.7

Q ss_pred             eeEEEEecCCceEEEEecCCCeEEEEcCCce-----eeecC--CCCCeEEEEeCCCCCceEEEEeeeCCccEEEE
Q 018594          262 KGVVRKVIDKYVGEIEMLEKKHVLRVDQDEL-----ETVIP--QIGGLVRIVNGAYRGSNARLLGVDTDKFCAQV  329 (353)
Q Consensus       262 KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~L-----ETVIP--~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V  329 (353)
                      .|.|++|...+-+.|.+.+ |-.--|..++|     +...-  ++|+.|          .++++++|..+..+.+
T Consensus         6 ~g~V~~v~~~~G~~V~l~~-gv~G~i~~s~l~~~~~~~~~~~~~vG~~v----------~~kV~~id~~~~~i~l   69 (71)
T cd05696           6 SVKVTKVEPDLGAVFELKD-GLLGFVHISHLSDDKVPSDTGPFKAGTTH----------KARIIGYSPMDGLLQL   69 (71)
T ss_pred             eeEEEEEccCceEEEEeCC-CCEEEEEHHHCCcchhcCcccccCCCCEE----------EEEEEEEeCCCCEEEE
Confidence            4888898655678888865 32223444444     22110  245544          3455677777655544


No 96 
>KOG4235 consensus Mitochondrial thymidine kinase 2/deoxyguanosine kinase [Nucleotide transport and metabolism]
Probab=25.37  E-value=1e+02  Score=29.80  Aligned_cols=44  Identities=20%  Similarity=0.355  Sum_probs=37.8

Q ss_pred             ceeecccccccHHHHHHHhcccccEEEeecCceeEEEeecCChHHHHHHH
Q 018594           51 HVHMNSTRWATLTEFVKYLGRTGKCKVEETPKGWFITYIDRDSETLFKEK  100 (353)
Q Consensus        51 HiHMNaT~W~tLt~Fvk~Lgr~G~c~vdetekGw~I~yId~~pe~~~r~~  100 (353)
                      -=-||-+-|+.+.+.-.|+-+++.+.+|      -|-|+--+||+....-
T Consensus       128 sg~m~e~e~~iy~eW~d~i~~~~~v~~d------giIYLrasPetc~~Ri  171 (244)
T KOG4235|consen  128 SGSMNEVEYVIYQEWFDWILRSMDVSLD------GIIYLRASPETCYKRI  171 (244)
T ss_pred             cCCcccchhhhHHHHHHHHHhccccccc------eEEEeecChHHHHHHH
Confidence            3469999999999999999999888888      4889999999986433


No 97 
>PRK14645 hypothetical protein; Provisional
Probab=25.18  E-value=1.3e+02  Score=27.17  Aligned_cols=68  Identities=15%  Similarity=0.094  Sum_probs=41.3

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDIC  350 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedic  350 (353)
                      .|+-.|..++-.+.|+-+.++    .-..|..|.|..| .+-..|+|.+++.+.  +++.+.     +..+ .+||++|.
T Consensus        77 ~Y~LEVSSPGldRpL~~~~df----~r~~G~~v~v~~~-~k~~~G~L~~~~d~~--i~l~~~-----~~~~-~i~~~~I~  143 (154)
T PRK14645         77 EYRLEVESPGPKRPLFTARHF----ERFAGLKAKVRGP-GENFTGRIKAVSGDQ--VTFDVG-----GEDR-TLRIGTFQ  143 (154)
T ss_pred             ceEEEEeCCCCCCCCCCHHHH----HHhCCCEEEEEcC-CeEEEEEEEEEeCCE--EEEEEC-----CeEE-EEEHHHhh
Confidence            345555554444555433222    2346888988653 456689999998765  545432     4445 49999985


Q ss_pred             c
Q 018594          351 K  351 (353)
Q Consensus       351 K  351 (353)
                      +
T Consensus       144 ~  144 (154)
T PRK14645        144 A  144 (154)
T ss_pred             h
Confidence            3


No 98 
>PRK14712 conjugal transfer nickase/helicase TraI; Provisional
Probab=24.79  E-value=5.2e+02  Score=31.87  Aligned_cols=86  Identities=14%  Similarity=0.137  Sum_probs=58.1

Q ss_pred             cCCcEEEEeecccCCcccccceeEEEEecCCceEEEEecCCCeEEEEcCCce----eeecC-----CCCCeEEEEeCC--
Q 018594          241 CEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEMLEKKHVLRVDQDEL----ETVIP-----QIGGLVRIVNGA--  309 (353)
Q Consensus       241 ~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~d~g~~l~vdq~~L----ETVIP-----~~G~~V~IV~G~--  309 (353)
                      .||-+|+--++.+..+..|    +|..|.....+.+-....|....++.+.+    +-.-|     +.|++|++....  
T Consensus       518 ~~GmVl~~~~r~~k~~~~y----~V~~V~~~~n~LtL~~~dG~~~~~~p~~~~~~~~vy~~e~lelA~GDrlr~t~nd~~  593 (1623)
T PRK14712        518 RPGMVMEQWNPETRSHDRY----VTERVTAQSHSLTLRNAQGETQVVRISSLDSSWSLFRPEKMPVADGERLRVTGKIPG  593 (1623)
T ss_pred             CCCCEEEecccCcCcCceE----EEEEEcCCCceEEEEcCCCcEEEechHHcccceeeecccccccCCCCEEEEccCCcc
Confidence            7899997333444333444    78888776555443455688888888775    33444     579999999553  


Q ss_pred             ---CCCceEEEEeeeCCccEEEEEEe
Q 018594          310 ---YRGSNARLLGVDTDKFCAQVKIE  332 (353)
Q Consensus       310 ---~RG~~g~L~siD~~~~~a~V~l~  332 (353)
                         -.|..+++.+++.+.  ++|...
T Consensus       594 ~~L~ngd~~tV~~i~~~~--itl~~~  617 (1623)
T PRK14712        594 LRVSGGDRLQVASVSEDA--MTVVVP  617 (1623)
T ss_pred             cCccCCCEEEEEEecCCe--EEEEEC
Confidence               366889999998777  555544


No 99 
>COG3041 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=24.40  E-value=41  Score=28.13  Aligned_cols=18  Identities=33%  Similarity=0.728  Sum_probs=16.1

Q ss_pred             hhHHHHHHHHHHHHHhcc
Q 018594           14 GYSEEFEAGFLELMRRSH   31 (353)
Q Consensus        14 ~fS~eF~~~Fl~lLr~~~   31 (353)
                      .||.+|.+||-.+.++.+
T Consensus         5 ~~skqF~kD~k~~~k~~~   22 (91)
T COG3041           5 EYSKQFKKDFKKLIKRGP   22 (91)
T ss_pred             ehhhhhhHHHHHHHhcCc
Confidence            589999999999999875


No 100
>PTZ00065 60S ribosomal protein L14; Provisional
Probab=23.88  E-value=2.1e+02  Score=25.37  Aligned_cols=35  Identities=3%  Similarity=-0.079  Sum_probs=27.4

Q ss_pred             ccCCcEEEEeecccCCcccccceeEEEEecCCceEEEEec
Q 018594          240 LCEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEIEML  279 (353)
Q Consensus       240 L~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V~l~  279 (353)
                      +.+|=+|.|.     .|.|+++-++|+||+|...|-|--+
T Consensus         8 VEiGRVvli~-----~Gp~~GKL~vIVDIID~nRvLVDGP   42 (130)
T PTZ00065          8 VEPGRLCLIQ-----YGPDAGKLCFIVDIVTPTRVLVDGA   42 (130)
T ss_pred             eeeceEEEEe-----cCCCCCCEEEEEEEEcCCeEEEeCC
Confidence            4456566554     4579999999999999989988766


No 101
>PTZ00329 eukaryotic translation initiation factor 1A; Provisional
Probab=23.84  E-value=5.5e+02  Score=23.47  Aligned_cols=60  Identities=15%  Similarity=0.130  Sum_probs=38.1

Q ss_pred             eEEEEecCCceEEEEecCCCeEE-EEcCCceeeecCCCCCeEEEEeCCCCCceEEEEeeeC
Q 018594          263 GVVRKVIDKYVGEIEMLEKKHVL-RVDQDELETVIPQIGGLVRIVNGAYRGSNARLLGVDT  322 (353)
Q Consensus       263 gvV~~V~d~~~c~V~l~d~g~~l-~vdq~~LETVIP~~G~~V~IV~G~~RG~~g~L~siD~  322 (353)
                      |+|..+.+...+.|.+.+...++ .|+-.+=--|-=.+|+.|+|-.-+|--..|.++-+-.
T Consensus        36 g~V~~~LGn~~f~V~c~dG~~rLa~I~GKmRK~IWI~~GD~VlVel~~yd~~KgdIi~Ry~   96 (155)
T PTZ00329         36 AQVLRMLGNGRLEAYCFDGVKRLCHIRGKMRKRVWINIGDIILVSLRDFQDSKADVILKYT   96 (155)
T ss_pred             EEEEEEcCCCEEEEEECCCCEEEEEeeccceeeEEecCCCEEEEeccCCCCCEEEEEEEcC
Confidence            56777777777777776644444 4555554444446678887766667666676665543


No 102
>PF01421 Reprolysin:  Reprolysin (M12B) family zinc metalloprotease  This Prosite motif covers only the active site.;  InterPro: IPR001590 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12B (adamalysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH []. The adamalysins are zinc dependent endopeptidases found in snake venom. There are some mammalian proteins such as P78325 from SWISSPROT, and fertilin Q28472 from SWISSPROT. Fertilin and closely related proteins appear to not have some active site residues and may not be active enzymes. CD156 (also called ADAM8 (3.4.24 from EC) or MS2 human) has been implicated in extravasation of leukocytes. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis; PDB: 2E3X_A 2W15_A 2W14_A 2W13_A 2W12_A 1ND1_A 3K7L_A 2DW2_A 2DW0_B 2DW1_A ....
Probab=23.48  E-value=97  Score=27.92  Aligned_cols=53  Identities=11%  Similarity=0.192  Sum_probs=36.5

Q ss_pred             HHHHHHHHHHHhccCCcccccceeeeeecccccceeecccccccHHHHHHHhc
Q 018594           18 EFEAGFLELMRRSHRFSRIAATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLG   70 (353)
Q Consensus        18 eF~~~Fl~lLr~~~g~krV~aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lg   70 (353)
                      +|.-..+.+.-.-|-.=.|..-.++=|.-.++++|+++.....||..|.+|--
T Consensus        26 ~~~~~i~n~v~~~y~~l~i~v~l~~leiw~~~d~i~~~~~~~~~L~~F~~w~~   78 (199)
T PF01421_consen   26 QYVLTIVNIVDSIYQQLNIRVVLVGLEIWTEEDKINISNDADSTLENFCNWQK   78 (199)
T ss_dssp             HHHHHHHHHHHHHHGGGTEEEEEEEEEEESSSTSS---SSHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHhhhcccCCeEEEEEEEEEcccCCceeeecchHHHHHHHHHHHH
Confidence            34445555555555444455566788888999999999999999999999954


No 103
>PF02576 DUF150:  Uncharacterised BCR, YhbC family COG0779;  InterPro: IPR003728 The RimP protein facilitates maturation of the 30S ribosomal subunit, and is required for the efficient production of translationally competent ribosomes [].; PDB: 1IB8_A.
Probab=23.40  E-value=2e+02  Score=24.87  Aligned_cols=74  Identities=23%  Similarity=0.276  Sum_probs=34.7

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEe-CCC---CCceEEEEeeeCCccEEEEEEeccccCCceeeeccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVN-GAY---RGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDY  346 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~-G~~---RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~y  346 (353)
                      .|.-.|..++-.+.|+-+..+.    =..|..|.|-. .+.   +-..|+|.+++.+.  +++++..+. ....+ .++|
T Consensus        62 ~y~LEVSSPG~~r~L~~~~~~~----~~iG~~v~v~~~~~~~~~~~~~G~L~~~~~~~--i~l~~~~~~-~~~~~-~I~~  133 (141)
T PF02576_consen   62 DYTLEVSSPGIDRPLKSPRDFE----RFIGRKVKVKLKQPVNGRKEFEGKLLEVDEDE--ITLEVEGKG-KKKEV-EIPF  133 (141)
T ss_dssp             -EEEEEE--SSSS--SSHHHHH----HH-SEEEEEE-SS-SSS-SEEEEEEEEEETTE--EEEEEE-SS--EEEE-EE-S
T ss_pred             ceEEEEeCCCCCCcCCCHHHHH----HhcCCeEEEEEeccCCCcEEEEEEEEEEeCCE--EEEEECCcc-ceEEE-EEEH
Confidence            3455555554344443222211    12588888874 223   33489999999866  666665421 11245 4999


Q ss_pred             cccccc
Q 018594          347 EDICKL  352 (353)
Q Consensus       347 edicKl  352 (353)
                      ++|.|.
T Consensus       134 ~~I~ka  139 (141)
T PF02576_consen  134 SDIKKA  139 (141)
T ss_dssp             S--SS-
T ss_pred             HHCceE
Confidence            999763


No 104
>PHA02104 hypothetical protein
Probab=23.28  E-value=54  Score=26.52  Aligned_cols=30  Identities=23%  Similarity=0.492  Sum_probs=23.2

Q ss_pred             CCCCcccCCc-EEEEeecccCCcccccceeEEEEec
Q 018594          235 RKDYWLCEGI-IVKVMSKALADKGYNKQKGVVRKVI  269 (353)
Q Consensus       235 r~~~WL~~~I-vVKIi~K~l~dGkyYk~KgvV~~V~  269 (353)
                      ..-+|-.||| .|+|     ++.+||-+.|.|-...
T Consensus        34 ~ti~w~fp~i~ev~i-----g~s~yfa~~gkiynat   64 (89)
T PHA02104         34 STIFWTFPGITEVRI-----GSSTYFAQNGKIYNAT   64 (89)
T ss_pred             eEEEEecCCcEEEEe-----cceeeehhCCeEEeeE
Confidence            4579999999 5665     4568999999886653


No 105
>PF12122 DUF3582:  Protein of unknown function (DUF3582);  InterPro: IPR022732 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ].  This entry represents the N-terminal domain of membrane-bound serine endopeptidases belonging to MEROPS peptidase family S54 (rhomboid-1, clan ST). This domain contains a conserved ASW sequence motif and a single completely conserved residue F that may be functionally important.  The tertiary structure of the GlpG protein from Escherichia coli has been determined []. The GlpG protein has six transmembrane domains (other members of the family are predicted to have seven), with the N- and C-terminal ends anchored in the cytoplasm. One transmembrane domain is shorter than the rest, creating an internal, aqueous cavity just below the membrane surface and it is here where proteolysis occurs. There is also a membrane-embedded loop between the first and second transmembrane domains which is postulated to act as a gate controlling substrate access to the active site. 
No other family of serine peptidases is known to have active site residues within transmembrane domains (although transmembrane active sites are known for aspartic peptidase and metallopeptidases), and the GlpG protein has the type structure for clan ST.; GO: 0004252 serine-type endopeptidase activity, 0016021 integral to membrane; PDB: 3UBB_A 3B45_A 3B44_A 2NRF_A 3TXT_A 2O7L_A 2XTU_A 2IRV_A 2XOW_A 2XTV_A ....
Probab=23.27  E-value=1.4e+02  Score=25.02  Aligned_cols=27  Identities=26%  Similarity=0.650  Sum_probs=17.4

Q ss_pred             HHHHHhcccc-cEEEeecCceeEEEeec
Q 018594           64 EFVKYLGRTG-KCKVEETPKGWFITYID   90 (353)
Q Consensus        64 ~Fvk~Lgr~G-~c~vdetekGw~I~yId   90 (353)
                      .|+.||-..| .|.|+..+.|=|--||.
T Consensus        15 aF~DYl~sqgI~~~i~~~~~~~~~lwl~   42 (101)
T PF12122_consen   15 AFIDYLASQGIELQIEPEGQGQFALWLH   42 (101)
T ss_dssp             HHHHHHHHTT--EEEE-SSSE--EEEES
T ss_pred             HHHHHHHHCCCeEEEEECCCCceEEEEe
Confidence            5899998888 57888778884444444


No 106
>PRK13316 heme-degrading monooxygenase IsdG; Provisional
Probab=22.76  E-value=58  Score=28.48  Aligned_cols=59  Identities=14%  Similarity=0.232  Sum_probs=42.4

Q ss_pred             cccCchhhHhhhHHHHHHHHHH---------HHHhccCCccccc------ceeeeeecccccceeecccccccHHHHHHH
Q 018594            4 FGQNPDRIVEGYSEEFEAGFLE---------LMRRSHRFSRIAA------TVVYNEYIHDRHHVHMNSTRWATLTEFVKY   68 (353)
Q Consensus         4 f~enp~~~i~~fS~eF~~~Fl~---------lLr~~~g~krV~a------N~vYneyI~dr~HiHMNaT~W~tLt~Fvk~   68 (353)
                      +..|-=++-.++.++|+.-|..         +|.+.=|.-+.+.      +.-|-||+.        -|+|.|-..|-.|
T Consensus         3 Iv~Nri~V~~g~a~~~~~rF~~r~~~g~~~~~ie~~pGFv~f~lL~~~~~~~~~~e~~V--------~T~WeSeeaF~aW   74 (121)
T PRK13316          3 IVTNTIKVEKGAAEHVIRQFTGANGDGHPTKDIAEVEGFLGFELWHSKPEDKDYEEVVV--------TSKWESEEAQRNW   74 (121)
T ss_pred             EEEEEEEeCCCcHHHHHHHHhccCcccccccchhcCCCceEEEEeeccCCCCCceEEEE--------EEEECCHHHHHHH
Confidence            4456666778899999999976         7777777665442      234444443        2899999999999


Q ss_pred             hc
Q 018594           69 LG   70 (353)
Q Consensus        69 Lg   70 (353)
                      .-
T Consensus        75 ~~   76 (121)
T PRK13316         75 VK   76 (121)
T ss_pred             hc
Confidence            75


No 107
>TIGR02059 swm_rep_I cyanobacterial long protein repeat. This domain appears in 29 copies in a large (10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.
Probab=22.59  E-value=4.3e+02  Score=22.58  Aligned_cols=61  Identities=16%  Similarity=0.246  Sum_probs=44.3

Q ss_pred             EEEcCCce--eeecCCCCCeEEEEeCCCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccccc
Q 018594          285 LRVDQDEL--ETVIPQIGGLVRIVNGAYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYED  348 (353)
Q Consensus       285 l~vdq~~L--ETVIP~~G~~V~IV~G~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yed  348 (353)
                      |..++.--  ..-.|..+.-..-|+|..+-..|  ++++...-.+++.|...-..|+.|. +.|-+
T Consensus        35 LtY~e~L~~~t~~~p~~~~FtVtVnG~~n~Vt~--VsV~~s~ktVTLTL~~~V~~Gq~VT-VsYt~   97 (101)
T TIGR02059        35 LTFNEPLADITNHAPTRDQFAVTVNGAPNTVTS--VSLGGSNTTITLTLAQVVEDGDEVT-LSYTK   97 (101)
T ss_pred             EEechhcCccccCCCCCCcEEEEeCCcEeeEEE--EEEcCcccEEEEEecccccCCCEEE-EEeeC
Confidence            45555533  34457777777677998888888  6777777779999987777788775 88754


No 108
>PRK04333 50S ribosomal protein L14e; Validated
Probab=22.06  E-value=1.6e+02  Score=24.06  Aligned_cols=31  Identities=19%  Similarity=0.113  Sum_probs=23.3

Q ss_pred             cCCcEEEEeecccCCcccccceeEEEEecCCceEEE
Q 018594          241 CEGIIVKVMSKALADKGYNKQKGVVRKVIDKYVGEI  276 (353)
Q Consensus       241 ~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~~c~V  276 (353)
                      .+|=+|.+.     .|++.++.++|.+++|...|-|
T Consensus         5 ~~GrvV~~~-----~Grd~gk~~vIv~i~d~~~vlV   35 (84)
T PRK04333          5 EVGRVCVKT-----AGREAGRKCVIVDIIDKNFVLV   35 (84)
T ss_pred             cccEEEEEe-----ccCCCCCEEEEEEEecCCEEEE
Confidence            456667654     4588899999999988766666


No 109
>PRK14634 hypothetical protein; Provisional
Probab=21.66  E-value=1.6e+02  Score=26.48  Aligned_cols=70  Identities=24%  Similarity=0.252  Sum_probs=42.4

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEE-eC---CCCCceEEEEeeeCCccEEEEEEeccccCCceeeeccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIV-NG---AYRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDY  346 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV-~G---~~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~y  346 (353)
                      .|+-.|..++-.+.|+-+.++    .-..|..|.|- .+   ..+-..|+|.+++.+.  +++.+.     +..+ .+||
T Consensus        75 ~Y~LEVSSPGldRpL~~~~~f----~r~~G~~V~V~l~~~~~~~k~~~G~L~~~~~~~--v~l~~~-----~~~~-~i~~  142 (155)
T PRK14634         75 AYVLEISSPGIGDQLSSDRDF----QTFRGFPVEVSHRDDDGSEQRLEGLLLERNEDH--LQINIR-----GRIK-RIPR  142 (155)
T ss_pred             CeEEEEeCCCCCCcCCCHHHH----HHhCCCeEEEEEecCCCCeEEEEEEEEEEeCCE--EEEEEC-----CEEE-EEEH
Confidence            355555555445555432222    23469989884 32   2366789999998766  545432     4445 4999


Q ss_pred             cccccc
Q 018594          347 EDICKL  352 (353)
Q Consensus       347 edicKl  352 (353)
                      ++|.+.
T Consensus       143 ~~I~ka  148 (155)
T PRK14634        143 DSVISV  148 (155)
T ss_pred             HHeeeE
Confidence            999864


No 110
>PRK14646 hypothetical protein; Provisional
Probab=21.52  E-value=1.7e+02  Score=26.25  Aligned_cols=70  Identities=20%  Similarity=0.223  Sum_probs=40.9

Q ss_pred             CceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEEe-CC---CCCceEEEEeeeCCccEEEEEEeccccCCceeeeccc
Q 018594          271 KYVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIVN-GA---YRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDY  346 (353)
Q Consensus       271 ~~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV~-G~---~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~y  346 (353)
                      .|+-.|..++-++.|+=+.    -..-..|..|.|-. ..   .+-..|+|.+++.+.  +++.+.     |+.+ .+||
T Consensus        75 ~Y~LEVSSPGldRpL~~~~----df~r~~G~~v~V~l~~~~~~~~~~~G~L~~~~~~~--v~l~~~-----g~~~-~i~~  142 (155)
T PRK14646         75 SYVLEISSQGVSDELTSER----DFKTFKGFPVNVELNQKNSKIKFLNGLLYEKSKDY--LAINIK-----GKIK-KIPF  142 (155)
T ss_pred             CeEEEEcCCCCCCcCCCHH----HHHHhCCCEEEEEEecCcCCeEEEEEEEEEEeCCE--EEEEEC-----CEEE-EEEH
Confidence            3455555544444443222    12334688888862 22   233469999998775  555432     4556 4999


Q ss_pred             cccccc
Q 018594          347 EDICKL  352 (353)
Q Consensus       347 edicKl  352 (353)
                      ++|.|.
T Consensus       143 ~~I~ka  148 (155)
T PRK14646        143 NEVLKI  148 (155)
T ss_pred             HHeeeE
Confidence            999864


No 111
>PF02214 BTB_2:  BTB/POZ domain;  InterPro: IPR003131 Potassium channels are the most diverse group of the ion channel family [, ]. They are important in shaping the action potential, and in neuronal excitability and plasticity []. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups []: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on their type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers []. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes []. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis [].  All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. 
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) []. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential []. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity []. When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions. 
The N-terminal, cytoplasmic tetramerization domain (T1) of voltage-gated potassium channels encodes molecular determinants for subfamily-specific assembly of alpha-subunits into functional tetrameric channels []. This domain is found in a subset of a larger group of proteins that contain the BTB/POZ domain.; GO: 0005249 voltage-gated potassium channel activity, 0006813 potassium ion transport, 0008076 voltage-gated potassium channel complex, 0016020 membrane; PDB: 1NN7_A 3KVT_A 1EXB_E 1QDV_A 1DSX_E 1QDW_F 3LUT_B 3LNM_B 2A79_B 3DRY_C ....
Probab=21.21  E-value=24  Score=27.92  Aligned_cols=20  Identities=30%  Similarity=0.548  Sum_probs=14.7

Q ss_pred             EEEeecCceeEEEeecCChHHH
Q 018594           75 CKVEETPKGWFITYIDRDSETL   96 (353)
Q Consensus        75 c~vdetekGw~I~yId~~pe~~   96 (353)
                      +.....+.|.|  ||||||+..
T Consensus        34 ~~~~~~~~~~~--fiDRdp~~F   53 (94)
T PF02214_consen   34 SDDYDDDDGEY--FIDRDPELF   53 (94)
T ss_dssp             GGGEETTTTEE--EESS-HHHH
T ss_pred             ccccCCccceE--EeccChhhh
Confidence            55566778887  899999876


No 112
>PRK00411 cdc6 cell division control protein 6; Reviewed
Probab=21.17  E-value=1.1e+02  Score=30.13  Aligned_cols=66  Identities=17%  Similarity=0.224  Sum_probs=45.8

Q ss_pred             CCcccccceeeeeecccccceeecccccccHHHHHHHhcccccEEEeecCce----eEEEeecCChHHHH
Q 018594           32 RFSRIAATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLGRTGKCKVEETPKG----WFITYIDRDSETLF   97 (353)
Q Consensus        32 g~krV~aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lgr~G~c~vdetekG----w~I~yId~~pe~~~   97 (353)
                      +...|....+|++|-.=-.-+.+..-.+.++.+++..|...|++......+|    +-+--+.-+|+.+.
T Consensus       312 ~~~~~~~~~i~~~y~~l~~~~~~~~~~~~~~~~~l~~L~~~glI~~~~~~~g~~g~~~~~~~~~~~~~~~  381 (394)
T PRK00411        312 GGDEVTTGEVYEEYKELCEELGYEPRTHTRFYEYINKLDMLGIINTRYSGKGGRGRTRLISLSYDPEDVL  381 (394)
T ss_pred             CCCcccHHHHHHHHHHHHHHcCCCcCcHHHHHHHHHHHHhcCCeEEEEecCCCCCCeEEEEecCCHHHHH
Confidence            4456888999988863323334555578899999999999999998765444    33334456776553


No 113
>PTZ00471 60S ribosomal protein L27; Provisional
Probab=21.06  E-value=1.8e+02  Score=25.96  Aligned_cols=29  Identities=28%  Similarity=0.385  Sum_probs=23.4

Q ss_pred             cccCCcEEEEeecccCCcccccceeEEEEecCCc
Q 018594          239 WLCEGIIVKVMSKALADKGYNKQKGVVRKVIDKY  272 (353)
Q Consensus       239 WL~~~IvVKIi~K~l~dGkyYk~KgvV~~V~d~~  272 (353)
                      .|.||-+|=|.     .|+|.++|+||....|.+
T Consensus         4 ~~kpgkVVivL-----~GR~AGkKaVivk~~ddg   32 (134)
T PTZ00471          4 FLKPGKVVIVT-----SGRYAGRKAVIVQNFDTA   32 (134)
T ss_pred             cccCCEEEEEE-----ccccCCcEEEEEeecCCC
Confidence            46788888554     569999999999998863


No 114
>PF08141 SspH:  Small acid-soluble spore protein H family;  InterPro: IPR012610 This family consists of the small acid-soluble spore proteins (SASP) of the H type (sspH). SspH are unique to spores of Bacillus subtilis and are expressed only in the forespore compartment during sporulation of this organism. The sspH genes are monocistronic and are recognised by the forespore-specific sigma factor for RNA polymerase - sigma-G. The specific role of this protein is unclear but is thought to play a role in sporulation under conditions different from that of the common laboratory tests of spore properties [].; GO: 0030436 asexual sporulation, 0042601 endospore-forming forespore
Probab=20.96  E-value=1.7e+02  Score=22.47  Aligned_cols=37  Identities=19%  Similarity=0.210  Sum_probs=29.0

Q ss_pred             CCCceEEEEeeeCCccEEEEEEeccccCCceeeecccccc
Q 018594          310 YRGSNARLLGVDTDKFCAQVKIEKGVYDGRVLNAIDYEDI  349 (353)
Q Consensus       310 ~RG~~g~L~siD~~~~~a~V~l~~g~~~g~~v~~l~yedi  349 (353)
                      |.|.---+.++|.++..|+|.....|  +... .+|..+|
T Consensus        20 y~G~pV~Ie~vde~~~tA~V~~l~~p--~~~~-~Vpv~~L   56 (58)
T PF08141_consen   20 YNGVPVWIEHVDEENGTARVHPLDNP--EEEQ-EVPVNDL   56 (58)
T ss_pred             ECCEEEEEEEEcCCCCeEEEEECCCC--CcEE-EEEHHHc
Confidence            78888999999999999999988655  3333 3777665


No 115
>PF04986 Y2_Tnp:  Putative transposase;  InterPro: IPR007069 Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases IS1294 and IS801 []. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=20.58  E-value=89  Score=28.47  Aligned_cols=57  Identities=19%  Similarity=0.247  Sum_probs=36.5

Q ss_pred             hhHHHHHHHHHHHHHhccCCcccc---------cceeeeeecccccceeecccccccHHHHHHHhcc
Q 018594           14 GYSEEFEAGFLELMRRSHRFSRIA---------ATVVYNEYIHDRHHVHMNSTRWATLTEFVKYLGR   71 (353)
Q Consensus        14 ~fS~eF~~~Fl~lLr~~~g~krV~---------aN~vYneyI~dr~HiHMNaT~W~tLt~Fvk~Lgr   71 (353)
                      .=++.|...||++|+.++..-.+.         .+.++++--++.=+||+...-. .-..=++||||
T Consensus        47 ~l~~~fr~k~l~~L~~~~~~~~l~~~~~~~~~~~~~~~~~~~~k~w~V~~~~~~~-~~~~~~~YL~R  112 (183)
T PF04986_consen   47 ALSKVFRGKFLQALRQRYDKGLLRRGPIENFQEWSNLLNKLYRKGWVVYCKKPVG-NGEQVLEYLGR  112 (183)
T ss_pred             hhhHHHHHHHHHHHHHHHHhcccccccccchhhhhhhccccccCccccccCcccc-cchHHHHHHHH
Confidence            347899999999999996544333         3444444445566777764433 44455777775


No 116
>PRK14635 hypothetical protein; Provisional
Probab=20.17  E-value=1.4e+02  Score=26.88  Aligned_cols=74  Identities=19%  Similarity=0.199  Sum_probs=41.8

Q ss_pred             ceEEEEecCCCeEEEEcCCceeeecCCCCCeEEEE--e-C--CCCCceEEEEeeeCCccEEEEEEecc---ccCCceeee
Q 018594          272 YVGEIEMLEKKHVLRVDQDELETVIPQIGGLVRIV--N-G--AYRGSNARLLGVDTDKFCAQVKIEKG---VYDGRVLNA  343 (353)
Q Consensus       272 ~~c~V~l~d~g~~l~vdq~~LETVIP~~G~~V~IV--~-G--~~RG~~g~L~siD~~~~~a~V~l~~g---~~~g~~v~~  343 (353)
                      |+-.|..++-++.|+-+.++.-    ..|..|.|-  . |  .+.|-+|+|.++|.+.  +++.+...   +..+..+ .
T Consensus        75 Y~LEVSSPGldRpL~~~~~~~r----~~G~~v~v~~~~~~~~~~~g~~g~L~~~~~~~--v~l~~~~k~~~~~~~~~~-~  147 (162)
T PRK14635         75 FTLKVSSAGAERKLRLPEDLDR----FRGIPVRLVFRSEESEKWQEGIFRLVNRDGDQ--VELEKFQKGKKSKVKKQT-T  147 (162)
T ss_pred             eEEEEcCCCCCCcCCCHHHHHH----hCCCEEEEEEecCCCcEEEecceEEEEEcCCE--EEEEEecccccccCCeEE-E
Confidence            4555555554556655444332    247766553  1 2  4567777999998776  55544211   0113445 4


Q ss_pred             ccccccccc
Q 018594          344 IDYEDICKL  352 (353)
Q Consensus       344 l~yedicKl  352 (353)
                      +||++|.+.
T Consensus       148 ip~~~I~ka  156 (162)
T PRK14635        148 LNLKDILKG  156 (162)
T ss_pred             EEhHHeeee
Confidence            999998763


No 117
>PRK10413 hydrogenase 2 accessory protein HypG; Provisional
Probab=20.10  E-value=3.2e+02  Score=22.24  Aligned_cols=44  Identities=20%  Similarity=0.185  Sum_probs=27.3

Q ss_pred             eEEEEecCC--ceEEEEecCCCeEEEEcCCceeeecC--CCCCeEEEEeC
Q 018594          263 GVVRKVIDK--YVGEIEMLEKKHVLRVDQDELETVIP--QIGGLVRIVNG  308 (353)
Q Consensus       263 gvV~~V~d~--~~c~V~l~d~g~~l~vdq~~LETVIP--~~G~~V~IV~G  308 (353)
                      |.|.++.+.  ..|.|...  |-.-.|+-..+..+-|  ++|++|+|=.|
T Consensus         7 ~kVi~i~~~~~~~A~vd~~--Gv~r~V~l~Lv~~~~~~~~vGDyVLVHaG   54 (82)
T PRK10413          7 GQVLAVGEDIHQLAQVEVC--GIKRDVNIALICEGNPADLLGQWVLVHVG   54 (82)
T ss_pred             eEEEEECCCCCcEEEEEcC--CeEEEEEeeeeccCCcccccCCEEEEecc
Confidence            345555543  35767654  4444566666654433  78999999776


Done!