Query         021931
Match_columns 305
No_of_seqs    120 out of 1166
Neff          3.8 
Searched_HMMs 46136
Date          Fri Mar 29 06:44:41 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/021931.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/021931hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 CHL00123 rps6 ribosomal protei 100.0 1.1E-29 2.4E-34  204.6  12.5   94  115-208     3-97  (97)
  2 PRK00453 rpsF 30S ribosomal pr 100.0 3.8E-29 8.1E-34  203.0  13.0  103  117-219     1-104 (108)
  3 COG0360 RpsF Ribosomal protein 100.0 1.3E-29 2.9E-34  210.6   9.6  103  118-220     1-104 (112)
  4 TIGR00166 S6 ribosomal protein 100.0 1.5E-28 3.3E-33  194.8  12.6   92  118-209     1-92  (93)
  5 PF01250 Ribosomal_S6:  Ribosom  99.9 2.3E-27   5E-32  186.3  11.3   91  118-208     1-92  (92)
  6 PRK14074 rpsF 30S ribosomal pr  99.8 1.5E-20 3.2E-25  173.9   8.0  111   98-222   140-251 (257)
  7 KOG4708 Mitochondrial ribosoma  99.6 4.7E-17   1E-21  140.1   3.1   96  116-211     2-101 (141)
  8 PRK14074 rpsF 30S ribosomal pr  97.2 0.00035 7.6E-09   66.0   3.8   46  117-162     1-47  (257)
  9 PF10446 DUF2457:  Protein of u  93.6   0.056 1.2E-06   55.1   3.1   20  259-278    98-117 (458)
 10 KOG1832 HIV-1 Vpr-binding prot  93.2   0.061 1.3E-06   59.2   2.9   57  134-191  1256-1320(1516)
 11 PF09026 CENP-B_dimeris:  Centr  89.2    0.11 2.4E-06   43.4   0.0   21  221-241     2-22  (101)
 12 PF06524 NOA36:  NOA36 protein;  84.9    0.76 1.6E-05   44.7   2.9   14  204-217   200-213 (314)
 13 PF04931 DNA_pol_phi:  DNA poly  83.2     1.3 2.7E-05   47.6   4.0   10  193-202   597-606 (784)
 14 PF04931 DNA_pol_phi:  DNA poly  75.7     2.1 4.5E-05   45.9   2.8   14  182-195   620-633 (784)
 15 PF06524 NOA36:  NOA36 protein;  75.1     2.7 5.9E-05   41.0   3.1    6   71-76    102-107 (314)
 16 cd04905 ACT_CM-PDT C-terminal   70.3      31 0.00067   26.0   7.4   54  136-196    13-68  (80)
 17 KOG1832 HIV-1 Vpr-binding prot  69.7     4.2   9E-05   45.7   3.3   11  230-240  1404-1414(1516)
 18 cd04880 ACT_AAAH-PDT-like ACT   67.3      40 0.00086   24.9   7.3   53  136-195    11-65  (75)
 19 cd04902 ACT_3PGDH-xct C-termin  60.4      35 0.00076   24.3   5.8   62  136-207    11-72  (73)
 20 KOG3130 Uncharacterized conser  51.8      11 0.00024   38.9   2.5   11  261-271   301-311 (514)
 21 PF04147 Nop14:  Nop14-like fam  50.2      12 0.00026   41.0   2.6   10   55-64    118-127 (840)
 22 cd04893 ACT_GcvR_1 ACT domains  49.3      76  0.0017   24.0   6.3   47  136-192    13-59  (77)
 23 PF05285 SDA1:  SDA1;  InterPro  45.1      14  0.0003   36.0   2.0   27  134-160    20-46  (324)
 24 PF02724 CDC45:  CDC45-like pro  42.4      19 0.00042   38.1   2.7   16  188-203    64-79  (622)
 25 KOG2038 CAATT-binding transcri  42.0      23  0.0005   39.3   3.2   54   89-142   695-753 (988)
 26 cd04931 ACT_PAH ACT domain of   39.3 1.9E+02  0.0041   23.3   7.4   55  136-197    26-85  (90)
 27 PF11705 RNA_pol_3_Rpc31:  DNA-  38.4      25 0.00054   32.4   2.5    7   68-74     33-39  (233)
 28 cd04869 ACT_GcvR_2 ACT domains  37.9 1.6E+02  0.0034   21.7   6.3   54  136-194    11-66  (81)
 29 cd04904 ACT_AAAH ACT domain of  37.4 1.8E+02  0.0039   21.9   7.3   53  137-196    13-65  (74)
 30 KOG0943 Predicted ubiquitin-pr  36.9      22 0.00048   41.7   2.1    7   88-94   1572-1578(3015)
 31 PTZ00415 transmission-blocking  35.3      22 0.00047   42.8   1.8   12  266-277   205-216 (2849)
 32 PTZ00415 transmission-blocking  34.5      29 0.00062   41.9   2.6    7  145-151    68-74  (2849)
 33 cd04879 ACT_3PGDH-like ACT_3PG  33.7 1.5E+02  0.0033   20.1   5.8   58  136-203    11-68  (71)
 34 cd04929 ACT_TPH ACT domain of   30.1 2.6E+02  0.0056   21.5   6.7   52  137-195    13-64  (74)
 35 KOG2023 Nuclear transport rece  27.6      35 0.00076   37.5   1.7   16  218-233   334-349 (885)
 36 KOG4032 Uncharacterized conser  27.5      73  0.0016   29.5   3.5   14  182-195   104-117 (184)
 37 PF11702 DUF3295:  Protein of u  26.0      44 0.00095   35.2   2.1   12  245-256   307-318 (507)
 38 KOG0943 Predicted ubiquitin-pr  25.7      51  0.0011   39.0   2.6    6  189-194  1682-1687(3015)
 39 KOG0526 Nucleosome-binding fac  24.7      59  0.0013   34.8   2.7   26   70-95    182-207 (615)
 40 PF11705 RNA_pol_3_Rpc31:  DNA-  22.9      87  0.0019   28.9   3.2    9  188-196   122-130 (233)
 41 PF09569 RE_ScaI:  ScaI restric  22.2 1.1E+02  0.0024   28.5   3.6   76  121-216    92-170 (191)
 42 PF14257 DUF4349:  Domain of un  22.0 5.7E+02   0.012   23.6   8.4   62  132-200    59-120 (262)
 43 PLN03075 nicotianamine synthas  21.8 1.4E+02   0.003   29.2   4.5   71  117-189   193-268 (296)
 44 PRK11898 prephenate dehydratas  21.3 3.6E+02  0.0079   25.7   7.1   55  122-185   197-251 (283)
 45 PF01842 ACT:  ACT domain;  Int  21.3 2.8E+02  0.0061   19.0   6.2   50  136-194    12-61  (66)
 46 PTZ00482 membrane-attack compl  21.0      74  0.0016   35.5   2.7   26  220-245    77-102 (844)
 47 PF12253 CAF1A:  Chromatin asse  20.0      84  0.0018   25.2   2.2   14  230-243    42-55  (77)

No 1  
>CHL00123 rps6 ribosomal protein S6; Validated
Probab=99.96  E-value=1.1e-29  Score=204.60  Aligned_cols=94  Identities=30%  Similarity=0.525  Sum_probs=90.4

Q ss_pred             ccCcCcceEEEecCCCh-HhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHH
Q 021931          115 ERRRHYEVVYLIHEKYE-EDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTM  193 (305)
Q Consensus       115 ~~MR~YEimlILrP~le-EEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~  193 (305)
                      .+||+||+|+|++|+++ +++++++++++++|.++||+|+++++||+|+|||+|+|+++|+|++++|.++|++|++|++.
T Consensus         3 ~~mr~YE~~~Il~p~l~e~~~~~~~~~~~~~i~~~gg~i~~~~~wG~r~LAY~I~k~~~G~Yv~~~f~~~~~~i~eler~   82 (97)
T CHL00123          3 SKLNKYETMYLLKPDLNEEELLKWIENYKKLLRKRGAKNISVQNRGKRKLSYKINKYEDGIYIQMNYSGNGKLVNSLEKA   82 (97)
T ss_pred             CcccceeEEEEECCCCCHHHHHHHHHHHHHHHHHCCCEEEEEEeecCeeeeEEcCCCCEEEEEEEEEEECHHHHHHHHHH
Confidence            36899999999999985 57999999999999999999999999999999999999999999999999999999999999


Q ss_pred             hCCCCCeeEEEEEee
Q 021931          194 LDKDEKVIRHLVIKR  208 (305)
Q Consensus       194 LrLDE~VLR~LIVK~  208 (305)
                      |+++++|||||++|.
T Consensus        83 lri~e~VlR~m~vk~   97 (97)
T CHL00123         83 LKLDENVLRYLTFKK   97 (97)
T ss_pred             hCCCCCeEEEEEEeC
Confidence            999999999999973


No 2  
>PRK00453 rpsF 30S ribosomal protein S6; Reviewed
Probab=99.96  E-value=3.8e-29  Score=203.04  Aligned_cols=103  Identities=35%  Similarity=0.682  Sum_probs=97.7

Q ss_pred             CcCcceEEEecCCC-hHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhC
Q 021931          117 RRHYEVVYLIHEKY-EEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLD  195 (305)
Q Consensus       117 MR~YEimlILrP~l-eEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~Lr  195 (305)
                      |++||+|+|++|.+ +++++++++++.++|.+.||+|+++++||.|+|||+|+|+.+|+|++|+|.++|+++++|++.|+
T Consensus         1 M~~YE~~~il~~~~~~~~~~~~~~~~~~~i~~~gg~i~~~~~~G~r~LAY~I~k~~~G~Y~~~~f~~~~~~i~el~~~l~   80 (108)
T PRK00453          1 MRKYEIVFILRPDLSEEQVKALVERFKGVITENGGTIHKVEDWGRRRLAYPINKLRKGHYVLLNFEAPPAAIAELERLFR   80 (108)
T ss_pred             CCceeEEEEECCCCCHHHHHHHHHHHHHHHHHCCCEEEEEecccccccceEcCCCcEEEEEEEEEEeCHHHHHHHHHHhC
Confidence            89999999999997 46899999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CCCCeeEEEEEeecCCCCCCCCCC
Q 021931          196 KDEKVIRHLVIKRDKAITEDCPPP  219 (305)
Q Consensus       196 LDE~VLR~LIVK~ek~i~e~~p~~  219 (305)
                      +|++|||||++|+++...+..+..
T Consensus        81 ~~~~VlR~~~vk~~~~~~~~~~~~  104 (108)
T PRK00453         81 INEDVLRFLTVKVEEAEEEPSPMM  104 (108)
T ss_pred             CCCCeEEEEEEEecccccccChhh
Confidence            999999999999998877666544


No 3  
>COG0360 RpsF Ribosomal protein S6 [Translation, ribosomal structure and biogenesis]
Probab=99.96  E-value=1.3e-29  Score=210.64  Aligned_cols=103  Identities=40%  Similarity=0.727  Sum_probs=98.4

Q ss_pred             cCcceEEEecCCCh-HhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCC
Q 021931          118 RHYEVVYLIHEKYE-EDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDK  196 (305)
Q Consensus       118 R~YEimlILrP~le-EEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrL  196 (305)
                      ++||+|||++|+++ ++++++++++.++|+++||+|.++++||+|+|||+|+|+++|||++|+|+|+|+++++|+|.|++
T Consensus         1 ~~YEi~~iv~p~~see~~~~~ve~~~~~l~~~gg~i~~~e~wG~R~LAY~IkK~~~g~Y~l~~f~~~~~~i~Eler~~ri   80 (112)
T COG0360           1 RKYEIVFIVRPDLSEEQVAALVEKYKGVLTNNGGEIHKVEDWGKRRLAYPIKKLREGHYVLMNFEAEPAAIAELERLLRI   80 (112)
T ss_pred             CceEEEEEECCCCCHHHHHHHHHHHHHHHHHCCCEEEEehhhhhhhhcceecccceEEEEEEEEEcCHHHHHHHHHHhcc
Confidence            46999999999986 57999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CCCeeEEEEEeecCCCCCCCCCCC
Q 021931          197 DEKVIRHLVIKRDKAITEDCPPPP  220 (305)
Q Consensus       197 DE~VLR~LIVK~ek~i~e~~p~~~  220 (305)
                      +++||||||||.+......+++.+
T Consensus        81 n~~VlR~liik~~~~~~~~~~~~~  104 (112)
T COG0360          81 NEDVLRHLIIKVEKAKEELSPMLK  104 (112)
T ss_pred             chhhheeeEEEechhhcccchhhh
Confidence            999999999999999888887753


No 4  
>TIGR00166 S6 ribosomal protein S6. MRP17 protein is a component of the small ribosomal subunit in mitochondria, and is shown here to be an ortholog of S6.
Probab=99.96  E-value=1.5e-28  Score=194.82  Aligned_cols=92  Identities=39%  Similarity=0.709  Sum_probs=89.0

Q ss_pred             cCcceEEEecCCChHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCCC
Q 021931          118 RHYEVVYLIHEKYEEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDKD  197 (305)
Q Consensus       118 R~YEimlILrP~leEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrLD  197 (305)
                      ++||+|+|++|+.+++++++++++.++|.++||+|+++++||.|+|||+|+|+++|+|++|+|+++|++|++|++.|++|
T Consensus         1 ~~YE~~~Il~p~~~~~~~~~~~~~~~~i~~~gg~i~~~~~~G~r~LaY~I~k~~~G~Y~~~~f~~~~~~i~el~~~lr~~   80 (93)
T TIGR00166         1 RHYEIIFLVRPTLSEEVKGQIERYKKVITLNGAEIVRSEDWGKRRLAYPIKKQLRAHYVLMNFSGEAQVIKEFERTARIN   80 (93)
T ss_pred             CceeEEEEECCCCcHHHHHHHHHHHHHHHhCCCEEEEEEeecceecceEcCCCceEEEEEEEEEeCHHHHHHHHHHhcCC
Confidence            57999999999987669999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CCeeEEEEEeec
Q 021931          198 EKVIRHLVIKRD  209 (305)
Q Consensus       198 E~VLR~LIVK~e  209 (305)
                      ++|||||++|++
T Consensus        81 ~~VlR~~~vk~~   92 (93)
T TIGR00166        81 DNVIRSLIIKLE   92 (93)
T ss_pred             cCeEEEEEEEec
Confidence            999999999975


No 5  
>PF01250 Ribosomal_S6:  Ribosomal protein S6;  InterPro: IPR000529 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein S6 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S6 is known to bind together with S18 to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, red algal chloroplast and cyanelle S6 ribosomal proteins.; GO: 0003735 structural constituent of ribosome, 0019843 rRNA binding, 0006412 translation, 0005840 ribosome; PDB: 3BBN_F 3R3T_B 3F1E_F 2QNH_g 2OW8_g 3PYQ_F 3PYS_F 3PYU_F 3MR8_F 3PYN_F ....
Probab=99.95  E-value=2.3e-27  Score=186.27  Aligned_cols=91  Identities=34%  Similarity=0.718  Sum_probs=86.6

Q ss_pred             cCcceEEEecCCCh-HhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCC
Q 021931          118 RHYEVVYLIHEKYE-EDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDK  196 (305)
Q Consensus       118 R~YEimlILrP~le-EEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrL  196 (305)
                      |.||+|+|++|+.+ ++++++++++.++|.++||+|+++++||.|+|||+|+|+.+|+|++|+|+++|+++++|++.|++
T Consensus         1 r~YE~~~il~~~~~~~~~~~~~~~~~~~i~~~gg~v~~~~~~G~r~LaY~i~k~~~G~Y~~~~f~~~~~~i~el~~~l~~   80 (92)
T PF01250_consen    1 RKYELMFILRPDLSEEEIKKLIERVKKIIEKNGGVVRSVENWGKRRLAYPIKKQKEGHYFLFNFDASPSAIKELERKLRL   80 (92)
T ss_dssp             EEEEEEEEE-TTSCHHHHHHHHHHHHHHHHHTTEEEEEEEEEEEEEESSEETTECEEEEEEEEEEESTTHHHHHHHHHHT
T ss_pred             CceeEEEEECCCCCHHHHHHHHHHHHHHHHHCCCEEEEEEEEeecccccCCCCCCEEEEEEEEEEeCHHHHHHHHHHhcC
Confidence            57999999999975 58999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CCCeeEEEEEee
Q 021931          197 DEKVIRHLVIKR  208 (305)
Q Consensus       197 DE~VLR~LIVK~  208 (305)
                      |++|||||++|.
T Consensus        81 ~~~VlR~~~vK~   92 (92)
T PF01250_consen   81 DEDVLRYLIVKK   92 (92)
T ss_dssp             STTEEEEEEEE-
T ss_pred             CCCeEEEEEEeC
Confidence            999999999984


No 6  
>PRK14074 rpsF 30S ribosomal protein S6; Provisional
Probab=99.82  E-value=1.5e-20  Score=173.91  Aligned_cols=111  Identities=22%  Similarity=0.337  Sum_probs=94.2

Q ss_pred             HHHHHHHhHHhhhhh-ccccCcCcceEEEecCCChHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEE
Q 021931           98 EKLYESLNIELESEL-NVERRRHYEVVYLIHEKYEEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYI  176 (305)
Q Consensus        98 ~~l~~~l~~~~~s~~-~~~~MR~YEimlILrP~leEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yv  176 (305)
                      +.+-+||---+|..+ +..         |++|   ++++++++++.++|+++|  +.++++||+|+|||||+|+++|+|+
T Consensus       140 ~~~~~~~~~~~~~~~~~~~---------Iv~P---DQveevvEkik~iIe~~G--iikvE~WGkRkLAYpIkK~~eGyYv  205 (257)
T PRK14074        140 ENISEHLIKIFQDILKDFR---------INGP---NQSNKTLEMLLKNIEASG--LIKYEYWGLLDFAYPINKMKSGHYC  205 (257)
T ss_pred             HHHHHHHHHHHHHHHHhcC---------CCCH---HHHHHHHHHHHHHHHhcC--eeehHhhcchhhccccCCCCeEEEE
Confidence            344456655555554 322         3344   577788999999999995  6799999999999999999999999


Q ss_pred             EEEEEeCcchHHHHHHHhCCCCCeeEEEEEeecCCCCCCCCCCCCc
Q 021931          177 LMNFELEAKWINDFKTMLDKDEKVIRHLVIKRDKAITEDCPPPPEF  222 (305)
Q Consensus       177 lm~Fea~psaIkELer~LrLDE~VLR~LIVK~ek~i~e~~p~~~ef  222 (305)
                      +|+|.++|++|++|+|.||++++|||||+||++++..+++|+.++.
T Consensus       206 L~nFeAep~aIaELER~lRInE~VIRfLtVKlDe~~~~pSpimkk~  251 (257)
T PRK14074        206 IMCISSTSSIMDEFVRRMKLNENIIRHLSVQVDKFFEGKSYMMNKQ  251 (257)
T ss_pred             EEEEEcCHHHHHHHHHHhcCccceeeEEEEeeccccccCChhhhhh
Confidence            9999999999999999999999999999999999999999988654


No 7  
>KOG4708 consensus Mitochondrial ribosomal protein MRP17 [Translation, ribosomal structure and biogenesis]
Probab=99.65  E-value=4.7e-17  Score=140.14  Aligned_cols=96  Identities=25%  Similarity=0.408  Sum_probs=89.6

Q ss_pred             cCcCcceEEEecCCChHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCC----CCeEEEEEEEEEeCcchHHHHH
Q 021931          116 RRRHYEVVYLIHEKYEEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQK----AKKAHYILMNFELEAKWINDFK  191 (305)
Q Consensus       116 ~MR~YEimlILrP~leEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK----~keG~Yvlm~Fea~psaIkELe  191 (305)
                      .|+.||+++|+++....+...++.+..+.+...||.|+.++++|.|.|+|+|+|    |..|+||+|.|.++|++..+|.
T Consensus         2 lmp~yelali~~~~~rpela~~l~rt~~~lid~ngVvrdveslG~r~Lpy~i~K~~~~h~~g~~f~m~f~ss~~v~~ei~   81 (141)
T KOG4708|consen    2 LMPLYELALITRSLSRPELAKLLARTGGHLIDRNGVVRDVESLGKRELPYKIKKLDQRHYRGQHFLMTFYSSPAVQSEIK   81 (141)
T ss_pred             cchHHHHHHHhcccCCHHHHHHHHHHhhHHhhcCCeEeechhcchhhhcchHHHhcCccccceEEEEeecCCHHHHHHHH
Confidence            699999999999998777778888888888888999999999999999999996    6789999999999999999999


Q ss_pred             HHhCCCCCeeEEEEEeecCC
Q 021931          192 TMLDKDEKVIRHLVIKRDKA  211 (305)
Q Consensus       192 r~LrLDE~VLR~LIVK~ek~  211 (305)
                      +.|+.|.+|||+++||++..
T Consensus        82 ~~l~~D~dviR~~IVKv~~~  101 (141)
T KOG4708|consen   82 RILKRDPDVIRWLIVKVDDI  101 (141)
T ss_pred             HHHhcChhhHHhhheecccc
Confidence            99999999999999999984


No 8  
>PRK14074 rpsF 30S ribosomal protein S6; Provisional
Probab=97.17  E-value=0.00035  Score=65.95  Aligned_cols=46  Identities=15%  Similarity=0.234  Sum_probs=43.3

Q ss_pred             CcCcceEEEecCCChH-hHHHHHHHHHHHHHhCCCEEEEEEeeeccc
Q 021931          117 RRHYEVVYLIHEKYEE-DVGSVNEKVQDFLREKKGRVWRLNDWGLRR  162 (305)
Q Consensus       117 MR~YEimlILrP~leE-EVkalvekv~~iL~e~GG~I~kvEdWG~Rr  162 (305)
                      |+.||.+||.+++++. ++..+++.+..+|.++||.|...+.||.+.
T Consensus         1 m~lYE~~fIa~q~ls~~q~e~l~e~~~~~l~~~~~~v~~~e~wG~~~   47 (257)
T PRK14074          1 MNLYEFTFIAQQGLLQQEVEEMVQELAVLLKNIKADVMFQQIKGILE   47 (257)
T ss_pred             CCccceeeeecccccHHHHHHHHHHHHHHHHhcCCeeehhhhhhhhh
Confidence            8999999999999865 899999999999999999999999999864


No 9  
>PF10446 DUF2457:  Protein of unknown function (DUF2457);  InterPro: IPR018853  This entry represents a family of uncharacterised proteins. 
Probab=93.56  E-value=0.056  Score=55.09  Aligned_cols=20  Identities=35%  Similarity=0.559  Sum_probs=11.9

Q ss_pred             CCCCCCCCCCceEEEeCCCC
Q 021931          259 DDNVDDDFEHGFSIVNVDDN  278 (305)
Q Consensus       259 ~~~~~~~~~~~~~~~~~~~~  278 (305)
                      .++|.-|.|.||---|+.++
T Consensus        98 ddG~~TDnE~GFAdSDDEdD  117 (458)
T PF10446_consen   98 DDGNETDNEAGFADSDDEDD  117 (458)
T ss_pred             ccCccCcccccccccccccc
Confidence            34666678888765443333


No 10 
>KOG1832 consensus HIV-1 Vpr-binding protein [Cell cycle control, cell division, chromosome partitioning]
Probab=93.21  E-value=0.061  Score=59.22  Aligned_cols=57  Identities=9%  Similarity=0.174  Sum_probs=30.7

Q ss_pred             HHHHHHHHHHHHHhCCC--------EEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHH
Q 021931          134 VGSVNEKVQDFLREKKG--------RVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFK  191 (305)
Q Consensus       134 Vkalvekv~~iL~e~GG--------~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELe  191 (305)
                      +++.+-++-.+-...||        .|++.+-|.+|.|--- ..--.=.-..+.|+..+..+-.+.
T Consensus      1256 ~~~aIh~FD~ft~~~~G~FHP~g~eVIINSEIwD~RTF~lL-h~VP~Ldqc~VtFNstG~VmYa~~ 1320 (1516)
T KOG1832|consen 1256 IPEAIHRFDQFTDYGGGGFHPSGNEVIINSEIWDMRTFKLL-HSVPSLDQCAVTFNSTGDVMYAML 1320 (1516)
T ss_pred             cHHHHhhhhhheecccccccCCCceEEeechhhhhHHHHHH-hcCccccceEEEeccCccchhhhh
Confidence            34555555554333333        5889999999876421 111122234566776666555444


No 11 
>PF09026 CENP-B_dimeris:  Centromere protein B dimerisation domain;  InterPro: IPR015115 Centromere protein B (CENP-B) interacts with centromeric heterochromatin in chromosomes and binds to a specific subset of alphoid satellite DNA, called the CENP-B box. CENP-B may organise arrays of centromere satellite DNA into a higher order structure, which then directs centromere formation and kinetochore assembly in mammalian chromosomes. The CENP-B dimerisation domain is composed of two alpha-helices, which are folded into an antiparallel configuration. Dimerisation of CENP-B is mediated by this domain, in which monomers dimerise to form a symmetrical, antiparallel, four-helix bundle structure with a large hydrophobic patch in which 23 residues of one monomer form van der Waals contacts with the other monomer. This CENP-B dimer configuration may be suitable for capturing two distant CENP-B boxes during centromeric heterochromatin formation.; GO: 0003677 DNA binding, 0003682 chromatin binding, 0006355 regulation of transcription, DNA-dependent, 0000775 chromosome, centromeric region, 0005634 nucleus; PDB: 1UFI_A.
Probab=89.24  E-value=0.11  Score=43.38  Aligned_cols=21  Identities=19%  Similarity=0.336  Sum_probs=0.0

Q ss_pred             CccchhhccCCCCCccCCCCC
Q 021931          221 EFHTLRAEMHGYDEEEDDIDY  241 (305)
Q Consensus       221 ef~s~~~~~d~~~~~~~~~~~  241 (305)
                      -||-+.+..|.+.|.++++++
T Consensus         2 ~~~~~eg~~dse~dsdEdeee   22 (101)
T PF09026_consen    2 TLHFLEGEEDSESDSDEDEEE   22 (101)
T ss_dssp             ---------------------
T ss_pred             ceeecccCcccccccccchhh
Confidence            455566666655554444333


No 12 
>PF06524 NOA36:  NOA36 protein;  InterPro: IPR010531 This family consists of several NOA36 proteins which contain 29 highly conserved cysteine residues. The function of this protein is unknown.; GO: 0008270 zinc ion binding, 0005634 nucleus
Probab=84.87  E-value=0.76  Score=44.69  Aligned_cols=14  Identities=29%  Similarity=0.408  Sum_probs=6.6

Q ss_pred             EEEeecCCCCCCCC
Q 021931          204 LVIKRDKAITEDCP  217 (305)
Q Consensus       204 LIVK~ek~i~e~~p  217 (305)
                      -.+|.++...-+||
T Consensus       200 Kg~ky~k~k~~PCP  213 (314)
T PF06524_consen  200 KGFKYEKGKPIPCP  213 (314)
T ss_pred             cccccccCCCCCCC
Confidence            33555554444444


No 13 
>PF04931 DNA_pol_phi:  DNA polymerase phi;  InterPro: IPR007015 Proteins of this family are predominantly nucleolar. The majority are described as transcription factor transactivators. The family also includes the fifth essential DNA polymerase (Pol5p) of Schizosaccharomyces pombe (Fission yeast) and Saccharomyces cerevisiae (Baker's yeast) (2.7.7.7 from EC). Pol5p is localized exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units.; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006351 transcription, DNA-dependent
Probab=83.19  E-value=1.3  Score=47.56  Aligned_cols=10  Identities=20%  Similarity=0.574  Sum_probs=4.0

Q ss_pred             HhCCCCCeeE
Q 021931          193 MLDKDEKVIR  202 (305)
Q Consensus       193 ~LrLDE~VLR  202 (305)
                      .|......||
T Consensus       597 lls~~s~llR  606 (784)
T PF04931_consen  597 LLSQPSALLR  606 (784)
T ss_pred             HHhCcchHHH
Confidence            3333344444


No 14 
>PF04931 DNA_pol_phi:  DNA polymerase phi;  InterPro: IPR007015 Proteins of this family are predominantly nucleolar. The majority are described as transcription factor transactivators. The family also includes the fifth essential DNA polymerase (Pol5p) of Schizosaccharomyces pombe (Fission yeast) and Saccharomyces cerevisiae (Baker's yeast) (2.7.7.7 from EC). Pol5p is localized exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units.; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006351 transcription, DNA-dependent
Probab=75.75  E-value=2.1  Score=45.95  Aligned_cols=14  Identities=21%  Similarity=0.233  Sum_probs=6.7

Q ss_pred             eCcchHHHHHHHhC
Q 021931          182 LEAKWINDFKTMLD  195 (305)
Q Consensus       182 a~psaIkELer~Lr  195 (305)
                      ..+..+..|-..|.
T Consensus       620 ~t~~~l~~ll~vl~  633 (784)
T PF04931_consen  620 LTESGLQLLLDVLD  633 (784)
T ss_pred             cCHHHHHHHHHHhc
Confidence            34445555544444


No 15 
>PF06524 NOA36:  NOA36 protein;  InterPro: IPR010531 This family consists of several NOA36 proteins which contain 29 highly conserved cysteine residues. The function of this protein is unknown.; GO: 0008270 zinc ion binding, 0005634 nucleus
Probab=75.12  E-value=2.7  Score=40.96  Aligned_cols=6  Identities=50%  Similarity=0.506  Sum_probs=2.4

Q ss_pred             cchhhh
Q 021931           71 FPEAVL   76 (305)
Q Consensus        71 ~~~~~~   76 (305)
                      |=||.+
T Consensus       102 fCEawv  107 (314)
T PF06524_consen  102 FCEAWV  107 (314)
T ss_pred             cchhhe
Confidence            334433


No 16 
>cd04905 ACT_CM-PDT C-terminal ACT domain of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme. The C-terminal ACT domain of the bifunctional chorismate mutase-prephenate dehydratase (CM-PDT) enzyme and the prephenate dehydratase (PDT) enzyme, found in plants, fungi, bacteria, and archaea. The P-protein of E. coli (CM-PDT, PheA) catalyzes the conversion of chorismate to prephenate and then the decarboxylation and dehydration to form phenylpyruvate. These are the first two steps in the biosynthesis of L-Phe and L-Tyr via the shikimate pathway in microorganisms and plants. The E. coli P-protein (CM-PDT) has three domains with an N-terminal domain with chorismate mutase activity, a middle domain with prephenate dehydratase activity, and an ACT regulatory C-terminal domain. The prephenate dehydratase enzyme has a PDT and ACT domain. The ACT domain is essential to bring about the negative allosteric regulation by L-Phe bindi
Probab=70.27  E-value=31  Score=25.98  Aligned_cols=54  Identities=9%  Similarity=0.060  Sum_probs=36.7

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeC--cchHHHHHHHhCC
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELE--AKWINDFKTMLDK  196 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~--psaIkELer~LrL  196 (305)
                      ..+.++.+++.++|..|.+++       .+|.++....++|.+.+++.  ...+..+-..|+.
T Consensus        13 G~L~~il~~f~~~~ini~~i~-------s~p~~~~~~~~~f~vd~~~~~~~~~~~~~l~~l~~   68 (80)
T cd04905          13 GALYDVLGVFAERGINLTKIE-------SRPSKGGLWEYVFFIDFEGHIEDPNVAEALEELKR   68 (80)
T ss_pred             CHHHHHHHHHHHCCcCEEEEE-------EEEcCCCCceEEEEEEEECCCCCHHHHHHHHHHHH
Confidence            357788899999999999997       34444455567777788875  4444444444443


No 17 
>KOG1832 consensus HIV-1 Vpr-binding protein [Cell cycle control, cell division, chromosome partitioning]
Probab=69.73  E-value=4.2  Score=45.69  Aligned_cols=11  Identities=27%  Similarity=0.438  Sum_probs=4.3

Q ss_pred             CCCCCccCCCC
Q 021931          230 HGYDEEEDDID  240 (305)
Q Consensus       230 d~~~~~~~~~~  240 (305)
                      +++.|||+|++
T Consensus      1404 ~dd~DeeeD~e 1414 (1516)
T KOG1832|consen 1404 DDDSDEEEDDE 1414 (1516)
T ss_pred             ccccCccccch
Confidence            33334443333


No 18 
>cd04880 ACT_AAAH-PDT-like ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH). ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH): Phenylalanine hydroxylases (PAH), tyrosine hydroxylases (TH) and tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. This family of enzymes shares a common catalytic mechanism, in which dioxygen is used by an active site containing a single, reduced iron atom to hydroxylate an unactivated aromatic substrate, concomitant with a two-electron oxidation of tetrahydropterin (BH4) cofactor to its quinonoid dihydropterin form. Eukaryotic AAAHs have an N-terminal  ACT (regulatory) domain, a middle catalytic domain and a C-terminal domain which is responsible for the oligomeric state of the enzyme forming a domain-swapped tetrameric coiled-coil. The PAH, TH, and TPH enzymes contain highly conserved catalytic domains but distinct N-terminal ACT domains and differ in their mech
Probab=67.30  E-value=40  Score=24.92  Aligned_cols=53  Identities=11%  Similarity=0.080  Sum_probs=38.9

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEe--CcchHHHHHHHhC
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFEL--EAKWINDFKTMLD  195 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea--~psaIkELer~Lr  195 (305)
                      ..+.++.+.+.++|..|..++..       |+++....+.|++.+.+  ....+..+-..|+
T Consensus        11 G~L~~vL~~f~~~~vni~~I~Sr-------p~~~~~~~~~f~id~~~~~~~~~~~~~l~~l~   65 (75)
T cd04880          11 GALAKALKVFAERGINLTKIESR-------PSRKGLWEYEFFVDFEGHIDDPDVKEALEELK   65 (75)
T ss_pred             CHHHHHHHHHHHCCCCEEEEEee-------ecCCCCceEEEEEEEECCCCCHHHHHHHHHHH
Confidence            35678889999999999999875       56666666778899988  3555555555544


No 19 
>cd04902 ACT_3PGDH-xct C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH). The C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH), with an extended C-terminal (xct) region from bacteria, archaea, fungi, and plants. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3- phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In bacteria, 3PGDH is feedback-controlled by the end product L-serine in an allosteric manner. Some 3PGDH enzymes have an additional domain formed by an extended C-terminal region. This additional domain introduces significant asymmetry to the homotetramer. Adjacent ACT (regulatory) domains interact, creating two serine-binding sites, however, this asymmetric arrangement results in the formation of two different and distinct domain interfaces between iden
Probab=60.43  E-value=35  Score=24.31  Aligned_cols=62  Identities=11%  Similarity=0.064  Sum_probs=41.8

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCCCCCeeEEEEEe
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDKDEKVIRHLVIK  207 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrLDE~VLR~LIVK  207 (305)
                      ..+.++..+|.++|..|..+..       ++- +.....++.+.++.  ....++.+.|+-.+.|++..+++
T Consensus        11 G~l~~i~~~l~~~~inI~~~~~-------~~~-~~~~~~~~~i~v~~--~~~~~~~~~l~~~~~v~~v~~~~   72 (73)
T cd04902          11 GVIGKVGTILGEAGINIAGMQV-------GRD-EPGGEALMVLSVDE--PVPDEVLEELRALPGILSAKVVE   72 (73)
T ss_pred             CHHHHHHHHHHHcCcChhheEe-------ecc-CCCCEEEEEEEeCC--CCCHHHHHHHHcCCCccEEEEEe
Confidence            3566788999999999977742       111 12233344555555  33458888999999999988875


No 20 
>KOG3130 consensus Uncharacterized conserved protein [Function unknown]
Probab=51.76  E-value=11  Score=38.89  Aligned_cols=11  Identities=18%  Similarity=0.069  Sum_probs=4.9

Q ss_pred             CCCCCCCCceE
Q 021931          261 NVDDDFEHGFS  271 (305)
Q Consensus       261 ~~~~~~~~~~~  271 (305)
                      .+++-..+.|.
T Consensus       301 ~v~dN~~p~i~  311 (514)
T KOG3130|consen  301 GVGDNSIPTIY  311 (514)
T ss_pred             ccCCCcCcccc
Confidence            44444444433


No 21 
>PF04147 Nop14:  Nop14-like family ;  InterPro: IPR007276 Emg1 and Nop14 are novel proteins whose interaction is required for the maturation of the 18S rRNA and for 40S ribosome production [].
Probab=50.22  E-value=12  Score=40.98  Aligned_cols=10  Identities=20%  Similarity=-0.090  Sum_probs=6.0

Q ss_pred             ccccCCccCC
Q 021931           55 EDSHSFFSKA   64 (305)
Q Consensus        55 ~~~~~~~~~~   64 (305)
                      .-||.=.+..
T Consensus       118 ~LTH~G~sL~  127 (840)
T PF04147_consen  118 TLTHGGQSLS  127 (840)
T ss_pred             hhccCCCCcc
Confidence            4677655554


No 22 
>cd04893 ACT_GcvR_1 ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. This CD includes the first of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. The glycine cleavage enzyme system in Escherichia coli provides one-carbon units for cellular methylation reactions. This enzyme system, encoded by the gcvTHP operon and lpd gene, catalyzes the cleavage of glycine into CO2 + NH3 and transfers a one-carbon unit to tetrahydrofolate, producing 5,10-methylenetetrahydrofolate. The gcvTHP operon is activated by the GcvA protein in response to glycine and repressed by a GcvA/GcvR interaction in the absence of glycine. It has been proposed that the co-activator glycine acts through a mechanism of de-repression by binding to GcvR and preventing GcvR from interacting with GcvA to block GcvA's activator function. Evidence also suggests that GcvR int
Probab=49.26  E-value=76  Score=24.03  Aligned_cols=47  Identities=9%  Similarity=-0.050  Sum_probs=32.7

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHH
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKT  192 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer  192 (305)
                      .++..+.++|.++||.|..+.-..          ...-.+..+.|..+......|+.
T Consensus        13 GiVa~vs~~la~~g~nI~d~~q~~----------~~~~F~m~~~~~~~~~~~~~l~~   59 (77)
T cd04893          13 GILNELTRAVSESGCNILDSRMAI----------LGTEFALTMLVEGSWDAIAKLEA   59 (77)
T ss_pred             hHHHHHHHHHHHcCCCEEEceeeE----------EcCEEEEEEEEEeccccHHHHHH
Confidence            578899999999999999988776          11222455667766444555553


No 23 
>PF05285 SDA1:  SDA1;  InterPro: IPR007949 This domain consists of several SDA1 protein homologues. SDA1 is a Saccharomyces cerevisiae protein which is involved in the control of the actin cytoskeleton. The protein is essential for cell viability and is localised in the nucleus [].
Probab=45.13  E-value=14  Score=36.00  Aligned_cols=27  Identities=19%  Similarity=0.236  Sum_probs=15.8

Q ss_pred             HHHHHHHHHHHHHhCCCEEEEEEeeec
Q 021931          134 VGSVNEKVQDFLREKKGRVWRLNDWGL  160 (305)
Q Consensus       134 Vkalvekv~~iL~e~GG~I~kvEdWG~  160 (305)
                      |..+...+-.++....-.+..-.++|+
T Consensus        20 V~~Aarsli~l~Rev~P~lL~kkdRGr   46 (324)
T PF05285_consen   20 VMMAARSLINLFREVNPELLHKKDRGR   46 (324)
T ss_pred             HHHHHHHHHHHHHHHCHHhcCchhcCC
Confidence            444445555555555556666667775


No 24 
>PF02724 CDC45:  CDC45-like protein;  InterPro: IPR003874 CDC45 is an essential gene required for initiation of DNA replication in Saccharomyces cerevisiae (cell division control protein 45), forming a complex with MCM5/CDC46. Homologs of CDC45 have been identified in human [], mouse and the smut fungus, Melampsora spp., (tsd2 protein) among others.; GO: 0006270 DNA-dependent DNA replication initiation
Probab=42.36  E-value=19  Score=38.10  Aligned_cols=16  Identities=31%  Similarity=0.393  Sum_probs=8.6

Q ss_pred             HHHHHHhCCCCCeeEE
Q 021931          188 NDFKTMLDKDEKVIRH  203 (305)
Q Consensus       188 kELer~LrLDE~VLR~  203 (305)
                      -.|...|.+++++.=|
T Consensus        64 ~dl~~~l~~~~~~~iy   79 (622)
T PF02724_consen   64 VDLEEFLELDEDVTIY   79 (622)
T ss_pred             hhHHHHhCCCCceEEE
Confidence            3455566666555433


No 25 
>KOG2038 consensus CAATT-binding transcription factor/60S ribosomal subunit biogenesis protein [Translation, ribosomal structure and biogenesis; Transcription]
Probab=41.98  E-value=23  Score=39.33  Aligned_cols=54  Identities=11%  Similarity=0.051  Sum_probs=30.4

Q ss_pred             CCCCCcchHHHHHH---HHhHHhhhhhcc-ccCcCcceEEEecCCChH-hHHHHHHHHH
Q 021931           89 LPEFADDEEEKLYE---SLNIELESELNV-ERRRHYEVVYLIHEKYEE-DVGSVNEKVQ  142 (305)
Q Consensus        89 ~p~~~~~~~~~l~~---~l~~~~~s~~~~-~~MR~YEimlILrP~leE-EVkalvekv~  142 (305)
                      =|-|..|+.--|.+   +++-.-||+--. +..-.=|.+.+-.|.+.+ ..-+.+.++.
T Consensus       695 ~P~f~nAd~tslWEl~~ls~HfHPSVa~~Akall~G~~i~y~g~~L~dfTL~~FLDrF~  753 (988)
T KOG2038|consen  695 NPLFCNADHTSLWELLLLSKHFHPSVATFAKALLEGEEIQYGGPPLNDFTLMAFLDRFA  753 (988)
T ss_pred             CccccCCccchHHHHHHHhhhcCchHHHHHHHHhcCceeecCCCchhHHHHHHHHHHHH
Confidence            48899999998888   444455665422 222223334444555555 4455555553


No 26 
>cd04931 ACT_PAH ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, phenylalanine hydroxylases (PAH). ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, phenylalanine hydroxylases (PAH). PAH catalyzes the hydroxylation of L-Phe to L-Tyr, the first step in the catabolic degradation of L-Phe. In PAH, an autoregulatory sequence, N-terminal of the ACT domain, extends across the catalytic domain active site and regulates the enzyme by intrasteric regulation. It appears that the activation by L-Phe induces a conformational change that converts the enzyme to a high-affinity and high-activity state. Modulation of activity is achieved through inhibition by BH4 and activation by phosphorylation of serine residues of the autoregulatory region. The molecular basis for the cooperative activation process is not fully understood yet. Members of this CD belong to the superfamily of ACT regulatory domains.
Probab=39.27  E-value=1.9e+02  Score=23.26  Aligned_cols=55  Identities=15%  Similarity=0.177  Sum_probs=38.6

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeC-c----chHHHHHHHhCCC
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELE-A----KWINDFKTMLDKD  197 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~-p----saIkELer~LrLD  197 (305)
                      ..+.++...+..+|-.+.+++.+=       +++..-.++|++.|++. .    .++.+|.+.++.+
T Consensus        26 GsL~~vL~~Fa~~~INLt~IeSRP-------~~~~~~~Y~FfVDieg~~~~~~~~~l~~L~~~~~~~   85 (90)
T cd04931          26 GALAKVLRLFEEKDINLTHIESRP-------SRLNKDEYEFFINLDKKSAPALDPIIKSLRNDIGAT   85 (90)
T ss_pred             cHHHHHHHHHHHCCCCEEEEEecc-------CCCCCceEEEEEEEEcCCCHHHHHHHHHHHHHhCCC
Confidence            346688889999999999999754       44445567889999985 2    2555666555543


No 27 
>PF11705 RNA_pol_3_Rpc31:  DNA-directed RNA polymerase III subunit Rpc31;  InterPro: IPR024661 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:  RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors.  RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.   
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. RNA polymerase III contains seventeen subunits in yeasts and in human cells. Twelve of these are akin to RNA polymerase I or II and the other five are RNA polymerase III-specific, and form the functionally distinct groups: (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31, Rpc34 and Rpc82 form a cluster of enzyme-specific subunits that contribute to transcription initiation in Saccharomyces cerevisiae and Homo sapiens. There is evidence that these subunits are anchored at or near the N-terminal Zn-fold of Rpc1, itself prolonged by a highly conserved but RNA polymerase III-specific domain []. This entry represents the Rpc31 subunit.
Probab=38.45  E-value=25  Score=32.37  Aligned_cols=7  Identities=29%  Similarity=0.557  Sum_probs=3.7

Q ss_pred             cCCcchh
Q 021931           68 NLFFPEA   74 (305)
Q Consensus        68 ~~~~~~~   74 (305)
                      +=+||..
T Consensus        33 ~~lfP~~   39 (233)
T PF11705_consen   33 PPLFPPL   39 (233)
T ss_pred             CCCCCCC
Confidence            3456654


No 28 
>cd04869 ACT_GcvR_2 ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. This CD includes the second of the two ACT domains that comprise the Glycine Cleavage System Transcriptional Repressor (GcvR) protein, and other related domains. The glycine cleavage enzyme system in Escherichia coli provides one-carbon units for cellular methylation reactions. This enzyme system, encoded by the gcvTHP operon and lpd gene, catalyzes the cleavage of glycine into CO2 + NH3 and transfers a one-carbon unit to tetrahydrofolate, producing 5,10-methylenetetrahydrofolate. The gcvTHP operon is activated by the GcvA protein in response to glycine and repressed by a GcvA/GcvR interaction in the absence of glycine. It has been proposed that the co-activator glycine acts through a mechanism of de-repression by binding to GcvR and preventing GcvR from interacting with GcvA to block GcvA's activator function. Evidence also suggests that GcvR in
Probab=37.90  E-value=1.6e+02  Score=21.69  Aligned_cols=54  Identities=15%  Similarity=0.224  Sum_probs=35.4

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEE-EEEEEEeCcc-hHHHHHHHh
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHY-ILMNFELEAK-WINDFKTML  194 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Y-vlm~Fea~ps-aIkELer~L  194 (305)
                      .++.++..+|.++|+.|..+...-     +.......+.| +.+.|..++. .+.+|+..|
T Consensus        11 Giv~~it~~l~~~~~nI~~~~~~~-----~~~~~~~~~~~~~~~~v~~p~~~~~~~l~~~l   66 (81)
T cd04869          11 GIVHEVTQFLAQRNINIEDLSTET-----YSAPMSGTPLFKAQATLALPAGTDLDALREEL   66 (81)
T ss_pred             CHHHHHHHHHHHcCCCeEEeEeee-----ecCCCCCcceEEEEEEEecCCCCCHHHHHHHH
Confidence            467889999999999999887633     33333333444 3567777654 466666443


No 29 
>cd04904 ACT_AAAH ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH). ACT domain of the nonheme iron-dependent, aromatic amino acid hydroxylases (AAAH): Phenylalanine hydroxylases (PAH), tyrosine hydroxylases (TH) and tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. This family of enzymes shares a common catalytic mechanism, in which dioxygen is used by an active site containing a single, reduced iron atom to hydroxylate an unactivated aromatic substrate, concomitant with a two-electron oxidation of tetrahydropterin (BH4) cofactor to its quinonoid dihydropterin form. PAH catalyzes the hydroxylation of L-Phe to L-Tyr, the first step in the catabolic degradation of L-Phe; TH catalyses the hydroxylation of L-Tyr to 3,4-dihydroxyphenylalanine, the rate limiting step in the biosynthesis of catecholamines; and TPH catalyses the hydroxylation of L-Trp to 5-hydroxytryptophan, the rate limiting step in the biosynthesis of 5-hydroxy
Probab=37.36  E-value=1.8e+02  Score=21.88  Aligned_cols=53  Identities=6%  Similarity=0.103  Sum_probs=38.3

Q ss_pred             HHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCC
Q 021931          137 VNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDK  196 (305)
Q Consensus       137 lvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrL  196 (305)
                      .+..+.+.+..+|-.+++++.+       |++...--++|++.|++....+..+-..|+.
T Consensus        13 ~L~~vL~~f~~~~iNlt~IeSR-------P~~~~~~~y~Ffvd~~~~~~~~~~~l~~L~~   65 (74)
T cd04904          13 ALARALKLFEEFGVNLTHIESR-------PSRRNGSEYEFFVDCEVDRGDLDQLISSLRR   65 (74)
T ss_pred             HHHHHHHHHHHCCCcEEEEECC-------CCCCCCceEEEEEEEEcChHHHHHHHHHHHH
Confidence            4667888899999999999975       4455555688899999866555555555543


No 30 
>KOG0943 consensus Predicted ubiquitin-protein ligase/hyperplastic discs protein, HECT superfamily [Posttranslational modification, protein turnover, chaperones]
Probab=36.88  E-value=22  Score=41.68  Aligned_cols=7  Identities=57%  Similarity=0.482  Sum_probs=3.7

Q ss_pred             CCCCCCc
Q 021931           88 ALPEFAD   94 (305)
Q Consensus        88 ~~p~~~~   94 (305)
                      .||-|+-
T Consensus      1572 aL~~fAV 1578 (3015)
T KOG0943|consen 1572 ALLPFAV 1578 (3015)
T ss_pred             HhhhHHH
Confidence            4565653


No 31 
>PTZ00415 transmission-blocking target antigen s230; Provisional
Probab=35.30  E-value=22  Score=42.80  Aligned_cols=12  Identities=8%  Similarity=0.196  Sum_probs=5.7

Q ss_pred             CCCceEEEeCCC
Q 021931          266 FEHGFSIVNVDD  277 (305)
Q Consensus       266 ~~~~~~~~~~~~  277 (305)
                      --+-|.++....
T Consensus       205 ~~d~f~~~e~g~  216 (2849)
T PTZ00415        205 KTDCFKFIEAGA  216 (2849)
T ss_pred             ccceeeeccCCC
Confidence            334555554433


No 32 
>PTZ00415 transmission-blocking target antigen s230; Provisional
Probab=34.46  E-value=29  Score=41.86  Aligned_cols=7  Identities=14%  Similarity=0.325  Sum_probs=3.1

Q ss_pred             HHhCCCE
Q 021931          145 LREKKGR  151 (305)
Q Consensus       145 L~e~GG~  151 (305)
                      ...+||.
T Consensus        68 ~~~~g~~   74 (2849)
T PTZ00415         68 FDKNGGI   74 (2849)
T ss_pred             cccCCCE
Confidence            3444543


No 33 
>cd04879 ACT_3PGDH-like ACT_3PGDH-like CD includes the C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH). ACT_3PGDH-like: The ACT_3PGDH-like CD includes the C-terminal ACT (regulatory) domain of D-3-phosphoglycerate dehydrogenase (3PGDH), with or without an extended C-terminal (xct) region found in various bacteria, archaea, fungi, and plants. 3PGDH is an enzyme that belongs to the D-isomer specific, 2-hydroxyacid dehydrogenase family and catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate, which is the first step in the biosynthesis of L-serine, using NAD+ as the oxidizing agent. In bacteria, 3PGDH is feedback-controlled by the end product L-serine in an allosteric manner. In the Escherichia coli homotetrameric enzyme, the interface at adjacent ACT (regulatory) domains couples to create an extended beta-sheet. Each regulatory interface forms two serine-binding sites. The mechanism by which serine transmits inhibition to the active
Probab=33.74  E-value=1.5e+02  Score=20.11  Aligned_cols=58  Identities=17%  Similarity=0.151  Sum_probs=38.2

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCCCCCeeEE
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDKDEKVIRH  203 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrLDE~VLR~  203 (305)
                      ..+..+..+|.+.|..|..+..          .....+.+..+.|......+.++-+.|+--+.|++.
T Consensus        11 g~l~~i~~~l~~~~~nI~~~~~----------~~~~~~~~~~~~~~v~~~~~~~l~~~l~~~~~V~~v   68 (71)
T cd04879          11 GVIGKVGTILGEHGINIAAMQV----------GRKEKGGIAYMVLDVDSPVPEEVLEELKALPGIIRV   68 (71)
T ss_pred             CHHHHHHHHHHhcCCCeeeEEE----------eccCCCCEEEEEEEcCCCCCHHHHHHHHcCCCeEEE
Confidence            3566788899999999988864          111112234444444334677888888888888874


No 34 
>cd04929 ACT_TPH ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. ACT domain of the nonheme iron-dependent aromatic amino acid hydroxylase, tryptophan hydroxylases (TPH), both peripheral (TPH1) and neuronal (TPH2) enzymes. TPH catalyses the hydroxylation of L-Trp to 5-hydroxytryptophan, the rate limiting step in the biosynthesis of 5-hydroxytryptamine (serotonin) and the first reaction in the synthesis of melatonin. Very little is known about the role of the ACT domain in TPH, which appears to be regulated by phosphorylation but not by its substrate or cofactor. Members of this CD belong to the superfamily of ACT regulatory domains.
Probab=30.14  E-value=2.6e+02  Score=21.54  Aligned_cols=52  Identities=12%  Similarity=0.192  Sum_probs=39.0

Q ss_pred             HHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhC
Q 021931          137 VNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLD  195 (305)
Q Consensus       137 lvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~Lr  195 (305)
                      .+.++.+++..+|..+.+++.+=       .++..--++|++.+++....+..+-..|+
T Consensus        13 ~L~~iL~~f~~~~inl~~IeSRP-------~~~~~~~y~F~id~e~~~~~i~~~l~~l~   64 (74)
T cd04929          13 GLAKALKLFQELGINVVHIESRK-------SKRRSSEFEIFVDCECDQRRLDELVQLLK   64 (74)
T ss_pred             HHHHHHHHHHHCCCCEEEEEecc-------CCCCCceEEEEEEEEcCHHHHHHHHHHHH
Confidence            46688889999999999999764       44455568889999988776666555554


No 35 
>KOG2023 consensus Nuclear transport receptor Karyopherin-beta2/Transportin (importin beta superfamily) [Nuclear structure; Intracellular trafficking, secretion, and vesicular transport]
Probab=27.57  E-value=35  Score=37.55  Aligned_cols=16  Identities=31%  Similarity=0.526  Sum_probs=11.0

Q ss_pred             CCCCccchhhccCCCC
Q 021931          218 PPPEFHTLRAEMHGYD  233 (305)
Q Consensus       218 ~~~ef~s~~~~~d~~~  233 (305)
                      .+|+||..|...-.++
T Consensus       334 IkPRfhksk~~~~~~~  349 (885)
T KOG2023|consen  334 IKPRFHKSKEHGNGED  349 (885)
T ss_pred             ccchhhhchhccCccc
Confidence            3488988877765444


No 36 
>KOG4032 consensus Uncharacterized conserved protein [Function unknown]
Probab=27.45  E-value=73  Score=29.54  Aligned_cols=14  Identities=14%  Similarity=0.074  Sum_probs=6.6

Q ss_pred             eCcchHHHHHHHhC
Q 021931          182 LEAKWINDFKTMLD  195 (305)
Q Consensus       182 a~psaIkELer~Lr  195 (305)
                      .+-+.|.+|-..|.
T Consensus       104 ~N~~~ieells~l~  117 (184)
T KOG4032|consen  104 GNYAIIEELLSKLP  117 (184)
T ss_pred             ccHHHHHHHHHHcc
Confidence            34445555554443


No 37 
>PF11702 DUF3295:  Protein of unknown function (DUF3295);  InterPro: IPR021711  This family is conserved in fungi but the function is not known. 
Probab=26.03  E-value=44  Score=35.20  Aligned_cols=12  Identities=33%  Similarity=0.432  Sum_probs=7.0

Q ss_pred             ccCCccCCCCCC
Q 021931          245 EYDDEYDDGEGD  256 (305)
Q Consensus       245 ~~~~~~~~~~~~  256 (305)
                      +||||+.|||+.
T Consensus       307 dDDDDssDWEDS  318 (507)
T PF11702_consen  307 DDDDDSSDWEDS  318 (507)
T ss_pred             cCCccchhhhhc
Confidence            455666666553


No 38 
>KOG0943 consensus Predicted ubiquitin-protein ligase/hyperplastic discs protein, HECT superfamily [Posttranslational modification, protein turnover, chaperones]
Probab=25.65  E-value=51  Score=38.96  Aligned_cols=6  Identities=17%  Similarity=0.429  Sum_probs=2.3

Q ss_pred             HHHHHh
Q 021931          189 DFKTML  194 (305)
Q Consensus       189 ELer~L  194 (305)
                      ++...+
T Consensus      1682 e~qesl 1687 (3015)
T KOG0943|consen 1682 EIQESL 1687 (3015)
T ss_pred             hhhhcc
Confidence            344333


No 39 
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=24.69  E-value=59  Score=34.80  Aligned_cols=26  Identities=31%  Similarity=0.325  Sum_probs=19.6

Q ss_pred             CcchhhhhhhhhhccccCCCCCCCcc
Q 021931           70 FFPEAVLLKEKKVQEDGRALPEFADD   95 (305)
Q Consensus        70 ~~~~~~~~~~~~~~~~g~~~p~~~~~   95 (305)
                      .|-++|+-|-.-.+..|...--|.++
T Consensus       182 ~F~~~v~~kAdv~~~~gdaI~~f~~i  207 (615)
T KOG0526|consen  182 AFYENVLAKADVSSAVGDAIVSFEEI  207 (615)
T ss_pred             HHHHHHHHhcCcccccCCceEEeeee
Confidence            37788888888888888876666543


No 40 
>PF11705 RNA_pol_3_Rpc31:  DNA-directed RNA polymerase III subunit Rpc31;  InterPro: IPR024661 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:  RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors.  RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.   
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. RNA polymerase III contains seventeen subunits in yeasts and in human cells. Twelve of these are akin to RNA polymerase I or II and the other five are RNA polymerase III-specific, and form the functionally distinct groups: (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31, Rpc34 and Rpc82 form a cluster of enzyme-specific subunits that contribute to transcription initiation in Saccharomyces cerevisiae and Homo sapiens. There is evidence that these subunits are anchored at or near the N-terminal Zn-fold of Rpc1, itself prolonged by a highly conserved but RNA polymerase III-specific domain []. This entry represents the Rpc31 subunit.
Probab=22.91  E-value=87  Score=28.87  Aligned_cols=9  Identities=22%  Similarity=0.313  Sum_probs=4.0

Q ss_pred             HHHHHHhCC
Q 021931          188 NDFKTMLDK  196 (305)
Q Consensus       188 kELer~LrL  196 (305)
                      .+|...++.
T Consensus       122 ~EL~~~~~~  130 (233)
T PF11705_consen  122 KELWPTLRK  130 (233)
T ss_pred             HHHHhhccc
Confidence            344444443


No 41 
>PF09569 RE_ScaI:  ScaI restriction endonuclease;  InterPro: IPR019069 There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements [, ], as summarised below:   Type I enzymes (3.1.21.3 from EC) cleave at sites remote from recognition site; require both ATP and S-adenosyl-L-methionine to function; multifunctional protein with both restriction and methylase (2.1.1.72 from EC) activities. Type II enzymes (3.1.21.4 from EC) cleave within or at short specific distances from recognition site; most require magnesium; single function (restriction) enzymes independent of methylase. Type III enzymes (3.1.21.5 from EC) cleave at sites a short distance from recognition site; require ATP (but do not hydrolyse it); S-adenosyl-L-methionine stimulates reaction but is not required; exist as part of a complex with a modification methylase (2.1.1.72 from EC). Type IV enzymes target methylated DNA.   Type II restriction endonucleases (3.1.21.4 from EC) are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. 
Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin []. However, there is still considerable diversity amongst restriction endonucleases [, ]. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone [].   This entry represents the restriction endonuclease ScaI, which recognises and cleaves the double-stranded sequence AGT^ACT. 
Probab=22.23  E-value=1.1e+02  Score=28.52  Aligned_cols=76  Identities=18%  Similarity=0.254  Sum_probs=48.5

Q ss_pred             ceEEEecCCChHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccC---CCCeEEEEEEEEEeCcchHHHHHHHhCCC
Q 021931          121 EVVYLIHEKYEEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQ---KAKKAHYILMNFELEAKWINDFKTMLDKD  197 (305)
Q Consensus       121 EimlILrP~leEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IK---K~keG~Yvlm~Fea~psaIkELer~LrLD  197 (305)
                      .+++|.++..+=|++..-         .     .-.-+|.|..||+=+   |.+.|+|+-++|+--...      .++--
T Consensus        92 Div~i~~~~ysiEiKTSS---------~-----~~~i~gNRSy~q~~~~~kKsK~GYYlaINfekf~~~------~~~p~  151 (191)
T PF09569_consen   92 DIVNIPNPFYSIEIKTSS---------S-----PKQIYGNRSYGQNSDNSKKSKDGYYLAINFEKFSET------DLRPK  151 (191)
T ss_pred             ceeeccCCceeEEEeccC---------C-----ccceecccccccCCCCCccccCceEEEEEeeccCcc------CCCcc
Confidence            467777776554332211         1     223467788887643   788999999999975554      23333


Q ss_pred             CCeeEEEEEeecCCCCCCC
Q 021931          198 EKVIRHLVIKRDKAITEDC  216 (305)
Q Consensus       198 E~VLR~LIVK~ek~i~e~~  216 (305)
                      -..|||--+-..+=+.+.+
T Consensus       152 I~~IRFGwld~tDWi~Q~a  170 (191)
T PF09569_consen  152 IRLIRFGWLDHTDWIGQKA  170 (191)
T ss_pred             eEEEEEecccchhhhcccc
Confidence            4578998887777555443


No 42 
>PF14257 DUF4349:  Domain of unknown function (DUF4349)
Probab=22.04  E-value=5.7e+02  Score=23.59  Aligned_cols=62  Identities=15%  Similarity=0.175  Sum_probs=50.0

Q ss_pred             HhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHhCCCCCe
Q 021931          132 EDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTMLDKDEKV  200 (305)
Q Consensus       132 EEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~LrLDE~V  200 (305)
                      +.+.+..+.+..++.+.||.|.....++.       ..........+.+..++.....+-..|.--..|
T Consensus        59 ~d~~~a~~~i~~~~~~~gG~i~~~~~~~~-------~~~~~~~~~~ltiRVP~~~~~~~l~~l~~~g~v  120 (262)
T PF14257_consen   59 KDVEKAVKKIENLVESYGGYIESSSSSSS-------GGSDDERSASLTIRVPADKFDSFLDELSELGKV  120 (262)
T ss_pred             CCHHHHHHHHHHHHHHcCCEEEEEeeecc-------cCCCCcceEEEEEEECHHHHHHHHHHHhccCce
Confidence            45778899999999999999999997654       445566777889999999999999888855433


No 43 
>PLN03075 nicotianamine synthase; Provisional
Probab=21.78  E-value=1.4e+02  Score=29.23  Aligned_cols=71  Identities=21%  Similarity=0.271  Sum_probs=51.2

Q ss_pred             CcCcceEEEecCCC---hHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccC--CCCeEEEEEEEEEeCcchHHH
Q 021931          117 RRHYEVVYLIHEKY---EEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQ--KAKKAHYILMNFELEAKWIND  189 (305)
Q Consensus       117 MR~YEimlILrP~l---eEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IK--K~keG~Yvlm~Fea~psaIkE  189 (305)
                      ...|.++|+. .-.   .+.-..+++.+...+...|--+.+. .||.|.|-|++-  ..-+|.-++..|.-.+..++.
T Consensus       193 l~~FDlVF~~-ALi~~dk~~k~~vL~~l~~~LkPGG~Lvlr~-~~G~r~~LYp~v~~~~~~gf~~~~~~~P~~~v~Ns  268 (296)
T PLN03075        193 LKEYDVVFLA-ALVGMDKEEKVKVIEHLGKHMAPGALLMLRS-AHGARAFLYPVVDPCDLRGFEVLSVFHPTDEVINS  268 (296)
T ss_pred             cCCcCEEEEe-cccccccccHHHHHHHHHHhcCCCcEEEEec-ccchHhhcCCCCChhhCCCeEEEEEECCCCCceee
Confidence            3579999998 432   2456789999999988755555555 799999999965  344488888887776665543


No 44 
>PRK11898 prephenate dehydratase; Provisional
Probab=21.31  E-value=3.6e+02  Score=25.74  Aligned_cols=55  Identities=5%  Similarity=-0.019  Sum_probs=39.4

Q ss_pred             eEEEecCCChHhHHHHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcc
Q 021931          122 VVYLIHEKYEEDVGSVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAK  185 (305)
Q Consensus       122 imlILrP~leEEVkalvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~ps  185 (305)
                      +.+++......  ...+-++.+++...|-.+++++.+       |+++..-.++|++.|++...
T Consensus       197 tslif~l~~~~--pGsL~~~L~~F~~~~INLt~IeSR-------P~~~~~~~y~F~vd~eg~~~  251 (283)
T PRK11898        197 TSLVLTLPNNL--PGALYKALSEFAWRGINLTRIESR-------PTKTGLGTYFFFIDVEGHID  251 (283)
T ss_pred             EEEEEEeCCCC--ccHHHHHHHHHHHCCCCeeeEecc-------cCCCCCccEEEEEEEEccCC
Confidence            56666643321  134667778889999999999987       55555567888999998666


No 45 
>PF01842 ACT:  ACT domain;  InterPro: IPR002912 The ACT domain is found in a variety of contexts and is proposed to be a conserved regulatory binding fold. ACT domains are linked to a wide range of metabolic enzymes that are regulated by amino acid concentration. The archetypical ACT domain is the C-terminal regulatory domain of 3-phosphoglycerate dehydrogenase (3PGDH), which folds with a ferredoxin-like topology. A pair of ACT domains form an eight-stranded antiparallel sheet with two molecules of allosteric inhibitor serine bound in the interface. Biochemical exploration of a few other proteins containing ACT domains supports the suggestions that these domains contain the archetypical ACT structure.; GO: 0016597 amino acid binding, 0008152 metabolic process; PDB: 3L76_B 2F06_B 3NRB_C 1Y7P_C 2QMX_A 2DT9_A 2ZHO_D 3K5P_A 3TVI_K 3C1M_C ....
Probab=21.27  E-value=2.8e+02  Score=18.96  Aligned_cols=50  Identities=6%  Similarity=0.082  Sum_probs=34.9

Q ss_pred             HHHHHHHHHHHhCCCEEEEEEeeeccccccccCCCCeEEEEEEEEEeCcchHHHHHHHh
Q 021931          136 SVNEKVQDFLREKKGRVWRLNDWGLRRLAYKIQKAKKAHYILMNFELEAKWINDFKTML  194 (305)
Q Consensus       136 alvekv~~iL~e~GG~I~kvEdWG~RrLAY~IKK~keG~Yvlm~Fea~psaIkELer~L  194 (305)
                      -++.++.++|.++|..|..+.......=         +.++.+.+..+......+...|
T Consensus        12 G~l~~v~~~la~~~inI~~~~~~~~~~~---------~~~~~~~~~~~~~~~~~~~~~l   61 (66)
T PF01842_consen   12 GILADVTEILADHGINIDSISQSSDKDG---------VGIVFIVIVVDEEDLEKLLEEL   61 (66)
T ss_dssp             THHHHHHHHHHHTTEEEEEEEEEEESST---------TEEEEEEEEEEGHGHHHHHHHH
T ss_pred             CHHHHHHHHHHHcCCCHHHeEEEecCCC---------ceEEEEEEECCCCCHHHHHHHH
Confidence            4778999999999999999988775432         5556666666555555554443


No 46 
>PTZ00482 membrane-attack complex/perforin (MACPF) Superfamily; Provisional
Probab=20.95  E-value=74  Score=35.50  Aligned_cols=26  Identities=31%  Similarity=0.538  Sum_probs=15.4

Q ss_pred             CCccchhhccCCCCCccCCCCCCCcc
Q 021931          220 PEFHTLRAEMHGYDEEEDDIDYDDEE  245 (305)
Q Consensus       220 ~ef~s~~~~~d~~~~~~~~~~~~~~~  245 (305)
                      .-|..-+++.++++|.|.|..|+||+
T Consensus        77 ~~~~~~~~~~~~~~~~~~~~~~~~~~  102 (844)
T PTZ00482         77 ASFLNQRKSLDDDDDDEFDFLYEDDE  102 (844)
T ss_pred             cchhhhhcccccCcchhhhhhccccc
Confidence            34555566667777666665555543


No 47 
>PF12253 CAF1A:  Chromatin assembly factor 1 subunit A;  InterPro: IPR022043  The CAF-1 or chromatin assembly factor-1 consists of three subunits, and this is the first, or A. The A domain is uniquely required for the progression of S phase in mouse cells, independent of its ability to promote histone deposition but dependent on its ability to interact with HP1 - heterochromatin protein 1-rich heterochromatin domains next to centromeres that are crucial for chromosome segregation during mitosis. This HP1-CAF-1 interaction module functions as a built-in replication control for heterochromatin, which, like a control barrier, has an impact on S-phase progression in addition to DNA-based checkpoints.
Probab=20.05  E-value=84  Score=25.15  Aligned_cols=14  Identities=7%  Similarity=0.020  Sum_probs=5.7

Q ss_pred             CCCCCccCCCCCCC
Q 021931          230 HGYDEEEDDIDYDD  243 (305)
Q Consensus       230 d~~~~~~~~~~~~~  243 (305)
                      +|++++|=++++++
T Consensus        42 dyDSd~EWeE~e~G   55 (77)
T PF12253_consen   42 DYDSDDEWEEEEEG   55 (77)
T ss_pred             ecCCccccccCCCC
Confidence            34444444333333


Done!