Query         015859
Match_columns 399
No_of_seqs    80 out of 82
Neff          3.9 
Searched_HMMs 46136
Date          Fri Mar 29 01:33:48 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/015859.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/015859hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 smart00317 SET SET (Su(var)3-9  98.8 2.3E-08 4.9E-13   80.4   8.2   32  167-198     9-40  (116)
  2 KOG1079 Transcriptional repres  98.7 3.7E-08   8E-13  105.6   8.4  125  156-388   596-728 (739)
  3 KOG1080 Histone H3 (Lys4) meth  96.1    0.02 4.2E-07   65.5   8.8   68  156-228   868-937 (1005)
  4 KOG1082 Histone H3 (Lys9) meth  94.0    0.14 3.1E-06   52.0   6.8  148  152-377   174-322 (364)
  5 PF00856 SET:  SET domain;  Int  92.6   0.097 2.1E-06   43.3   2.7   27  169-195     1-27  (162)
  6 KOG2461 Transcription factor B  92.0    0.18 3.8E-06   52.5   4.4   63  155-227    29-95  (396)
  7 KOG1085 Predicted methyltransf  88.1    0.35 7.6E-06   49.2   2.6   56  136-193   230-291 (392)
  8 KOG4442 Clathrin coat binding   81.6     6.1 0.00013   44.2   8.5   59  168-228   130-191 (729)
  9 COG2940 Proteins containing SE  71.7     1.4 2.9E-05   46.4   0.2   29  354-382   427-456 (480)
 10 PF00856 SET:  SET domain;  Int  64.4     4.5 9.7E-05   33.3   1.9   21  355-375   141-162 (162)
 11 KOG1083 Putative transcription  45.1      13 0.00029   43.6   2.0   23  356-378  1274-1297(1306)
 12 KOG0404 Thioredoxin reductase   44.5     8.7 0.00019   38.6   0.5   51  112-178   242-292 (322)
 13 PF14451 Ub-Mut7C:  Mut7-C ubiq  37.2      23  0.0005   29.3   1.8   26  162-188    43-75  (81)
 14 KOG3192 Mitochondrial J-type c  33.8 1.4E+02   0.003   28.3   6.5   36   56-91    106-141 (168)
 15 PF08638 Med14:  Mediator compl  30.2 3.1E+02  0.0068   25.9   8.4   85    4-96     16-108 (195)
 16 COG1791 Uncharacterized conser  28.9      44 0.00096   31.9   2.5   26  168-193   103-137 (181)
 17 KOG1337 N-methyltransferase [G  28.0      41 0.00089   35.4   2.3   21  356-376   257-278 (472)
 18 PF02311 AraC_binding:  AraC-li  26.6      41 0.00088   27.0   1.6   28  167-194    30-61  (136)
 19 PF08666 SAF:  SAF domain;  Int  25.0      49  0.0011   24.5   1.7   13  358-370     3-16  (63)
 20 COG1485 Predicted ATPase [Gene  22.8      55  0.0012   34.4   2.1   41  355-395   160-228 (367)
 21 KOG3710 EGL-Nine (EGLN) protei  22.8      78  0.0017   31.8   3.0   75  153-243   118-200 (280)
 22 PF08443 RimK:  RimK-like ATP-g  21.4      59  0.0013   29.7   1.8   41  133-176    18-60  (190)

No 1  
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=98.80  E-value=2.3e-08  Score=80.38  Aligned_cols=32  Identities=22%  Similarity=0.250  Sum_probs=28.7

Q ss_pred             CCCceEEEeEEecCCcEEEEecCeeecccccc
Q 015859          167 EAGQGLFLCGEANVGAVIAIYPGIIYSPAYYR  198 (399)
Q Consensus       167 ~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~  198 (399)
                      .+|.|||..-.+++|++|+-|+|.+..+....
T Consensus         9 ~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~   40 (116)
T smart00317        9 GKGWGVRATEDIPKGEFIGEYVGEIITSEEAE   40 (116)
T ss_pred             CCcEEEEECCccCCCCEEEEEEeEEECHHHHH
Confidence            79999999999999999999999999864443


No 2  
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=98.70  E-value=3.7e-08  Score=105.57  Aligned_cols=125  Identities=29%  Similarity=0.337  Sum_probs=91.5

Q ss_pred             EeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCC-CeeeeccCCeEEecCCCCCCCCCc
Q 015859          156 LDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQN-PYLITRYDGTVINAQPWGSGWDTR  234 (399)
Q Consensus       156 l~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N-~YLi~r~DG~vIDg~pwg~gg~sr  234 (399)
                      +-+.||.|    ||-|+|++-.|.++.+|+.|-|.+-+..-.-.- |- ..|..+ +|||-.+++.|||++-.|      
T Consensus       596 ~llapSdV----aGwGlFlKe~v~KnefisEY~GE~IS~dEADrR-Gk-iYDr~~cSflFnln~dyviDs~rkG------  663 (739)
T KOG1079|consen  596 VLLAPSDV----AGWGLFLKESVSKNEFISEYTGEIISHDEADRR-GK-IYDRYMCSFLFNLNNDYVIDSTRKG------  663 (739)
T ss_pred             eeechhhc----cccceeeccccCCCceeeeecceeccchhhhhc-cc-ccccccceeeeeccccceEeeeeec------
Confidence            77899988    899999999999999999999998773211100 00 013334 799999999999998864      


Q ss_pred             cccCCCCcCcCCCCCCCcCcCchhHHHhhcCCCCCCCCCCCcccccccChhhhhhhhcCCCCCCCCCeEEEeecCCCCcc
Q 015859          235 ELWDGLTLPEIMPNSKGAEKGSDQFWKLLSKPMDNKRGGSGSEMLERRNPLALAHFANHPAKGMVPNVMICPYDFPLTEK  314 (399)
Q Consensus       235 ~~~~g~~~~~~~~~~~~a~~~~d~~w~~ls~Pl~~s~~~~~~~~le~~NPLAlGH~aNHpp~g~~pNV~~~~yDfP~~~~  314 (399)
                                                                         .++|||||-+   .||..+.-+=++.   
T Consensus       664 ---------------------------------------------------nk~rFANHS~---nPNCYAkvm~V~G---  686 (739)
T KOG1079|consen  664 ---------------------------------------------------NKIRFANHSF---NPNCYAKVMMVAG---  686 (739)
T ss_pred             ---------------------------------------------------chhhhccCCC---CCCcEEEEEEecC---
Confidence                                                               1368999987   5887766553332   


Q ss_pred             cccccCCccccCCchhhhhhhccccceecCCCCCCCCCceEEEEEEEeccC-CCceeeeecccCCCC------CCCCccc
Q 015859          315 DMRPYIPNISFGNAEEVNMRRFGSFWFKWGSGSGSSTPVLKTLALVATRAI-CDEEVLLNYRLSNSK------RRPVWYS  387 (399)
Q Consensus       315 ~LR~YIPNv~~~~~~~~~m~r~g~~w~~~~~~~~~~~~vlr~vVLVAtRdI-~dEELflNYRls~~~------~~P~WY~  387 (399)
                                                             -+=+.+.|.|+| .|||||.+|||++..      .-++||.
T Consensus       687 ---------------------------------------dhRIGifAkRaIeagEELffDYrYs~~~~~k~~~~~~~s~k  727 (739)
T KOG1079|consen  687 ---------------------------------------DHRIGIFAKRAIEAGEELFFDYRYSPEHALKFVGIERESYK  727 (739)
T ss_pred             ---------------------------------------CcceeeeehhhcccCceeeeeeccCccccccccccCccccc
Confidence                                                   122447899999 889999999999865      2346665


Q ss_pred             c
Q 015859          388 P  388 (399)
Q Consensus       388 p  388 (399)
                      +
T Consensus       728 ~  728 (739)
T KOG1079|consen  728 V  728 (739)
T ss_pred             c
Confidence            4


No 3  
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=96.10  E-value=0.02  Score=65.46  Aligned_cols=68  Identities=26%  Similarity=0.409  Sum_probs=51.3

Q ss_pred             EeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccc--cccccCCCCCCCCCCCeeeeccCCeEEecCCCC
Q 015859          156 LDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPA--YYRYIPGYPRVDAQNPYLITRYDGTVINAQPWG  228 (399)
Q Consensus       156 l~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~--~~~~ipgyP~vd~~N~YLi~r~DG~vIDg~pwg  228 (399)
                      |.-.+|.|    -|-|||..+.+.+|.-|-=|=|-++.+.  =+|-+ .|-+.-..-.|||+-=|++||||...|
T Consensus       868 ~~F~~s~i----H~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~-~Y~~~gi~~sYlfrid~~~ViDAtk~g  937 (1005)
T KOG1080|consen  868 VKFGRSGI----HGWGLFAMENIAAGDMVIEYRGELVRSSIADLREA-RYERMGIGDSYLFRIDDEVVVDATKKG  937 (1005)
T ss_pred             hccccccc----cccceeeccCccccceEEEeeceehhhhHHHHHHH-HHhccCcccceeeecccceEEeccccC
Confidence            66778888    4899999999999999999999999841  12221 222223356799987789999999864


No 4  
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=93.95  E-value=0.14  Score=51.98  Aligned_cols=148  Identities=18%  Similarity=0.217  Sum_probs=88.4

Q ss_pred             hCeEEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCCCeeeeccCCeEEecCCCCCCC
Q 015859          152 IGYTLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQNPYLITRYDGTVINAQPWGSGW  231 (399)
Q Consensus       152 lGfsl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N~YLi~r~DG~vIDg~pwg~gg  231 (399)
                      +=|.|.|-.+   + ..|=||--.=.+++|+-|+=|.|-|-+-.-.+...      ..+.|++...|++.+.-..|..  
T Consensus       174 ~~~~leIfrt---~-~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~------~~~~~~~~~~~~~~~~~~~~~~--  241 (364)
T KOG1082|consen  174 LQFHLEVFRT---P-EKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRT------HLREYLDDDCDAYSIADREWVD--  241 (364)
T ss_pred             cccceEEEec---C-CceeeecccccccCCCeeEEEeeEecChHHhhhcc------ccccccccccccchhhhccccc--
Confidence            4455555555   2 36667666558999999999999998854433331      2456766555655555455521  


Q ss_pred             CCccccCCCCcCcCCCCCCCcCcCchhHHHhhcCCCCCCCCCCCcccccccChhhhhhhhcCCCCCCCCCeEEEeecCCC
Q 015859          232 DTRELWDGLTLPEIMPNSKGAEKGSDQFWKLLSKPMDNKRGGSGSEMLERRNPLALAHFANHPAKGMVPNVMICPYDFPL  311 (399)
Q Consensus       232 ~sr~~~~g~~~~~~~~~~~~a~~~~d~~w~~ls~Pl~~s~~~~~~~~le~~NPLAlGH~aNHpp~g~~pNV~~~~yDfP~  311 (399)
                       .  -+.+            ......+.+...+.+          ..+...+=-.+|+|+||-   -+|||+++.+-.  
T Consensus       242 -~--~~~~------------~~~~~~~~~~~~~~~----------~~ida~~~GNv~RfinHS---C~PN~~~~~v~~--  291 (364)
T KOG1082|consen  242 -E--SPVG------------NTFVAPSLPGGPGRE----------LLIDAKPHGNVARFINHS---CSPNLLYQAVFQ--  291 (364)
T ss_pred             -c--cccc------------ccccccccccCCCcc----------eEEchhhcccccccccCC---CCccceeeeeee--
Confidence             1  0111            001111111111110          223455556788999997   579999987711  


Q ss_pred             CcccccccCCccccCCchhhhhhhccccceecCCCCCCCCCceEEEEEEEeccC-CCceeeeecccC
Q 015859          312 TEKDMRPYIPNISFGNAEEVNMRRFGSFWFKWGSGSGSSTPVLKTLALVATRAI-CDEEVLLNYRLS  377 (399)
Q Consensus       312 ~~~~LR~YIPNv~~~~~~~~~m~r~g~~w~~~~~~~~~~~~vlr~vVLVAtRdI-~dEELflNYRls  377 (399)
                                    +  +                    ..+.+--++|.|+++| ..|||=++|-.+
T Consensus       292 --------------~--~--------------------~~~~~~~i~ffa~~~I~p~~ELT~dYg~~  322 (364)
T KOG1082|consen  292 --------------D--E--------------------FVLLYLRIGFFALRDISPGEELTLDYGKA  322 (364)
T ss_pred             --------------c--C--------------------CccchheeeeeeccccCCCcccchhhccc
Confidence                          1  0                    1233556789999999 889999999865


No 5  
>PF00856 SET:  SET domain;  InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and recently there were described three structures of very different proteins with distinct domain compositions: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9, human SET7 (also called SET9), which methylates H3 on lysine 4 and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. Strictly conserved in the C-terminal motif of the SET domain tyrosine could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities [].  The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=92.59  E-value=0.097  Score=43.27  Aligned_cols=27  Identities=37%  Similarity=0.605  Sum_probs=22.1

Q ss_pred             CceEEEeEEecCCcEEEEecCeeeccc
Q 015859          169 GQGLFLCGEANVGAVIAIYPGIIYSPA  195 (399)
Q Consensus       169 G~GVFl~G~v~~GtVVa~YPGvVY~p~  195 (399)
                      |+|||.+-.+++|+||.++.+.+..+.
T Consensus         1 GrGl~At~dI~~Ge~I~~p~~~~~~~~   27 (162)
T PF00856_consen    1 GRGLFATRDIKAGEVILIPRPAILTPD   27 (162)
T ss_dssp             SEEEEESS-B-TTEEEEEESEEEEEHH
T ss_pred             CEEEEECccCCCCCEEEEECcceEEeh
Confidence            899999999999999988888887753


No 6  
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=92.03  E-value=0.18  Score=52.54  Aligned_cols=63  Identities=24%  Similarity=0.399  Sum_probs=45.3

Q ss_pred             EEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCCCeeeeccC-C---eEEecCCC
Q 015859          155 TLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQNPYLITRYD-G---TVINAQPW  227 (399)
Q Consensus       155 sl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N~YLi~r~D-G---~vIDg~pw  227 (399)
                      .|.+++|+|+  .+|-||+=+..+++|+.-+=|=|-+        |+-+..-.++|.|+..-|. +   .+|||.+-
T Consensus        29 ~l~i~~Ssv~--~~~lgV~s~~~i~~G~~FGP~~G~~--------~~~~~~~~~n~~y~W~I~~~d~~~~~iDg~d~   95 (396)
T KOG2461|consen   29 ELRIKPSSVP--VTGLGVWSNASILPGTSFGPFEGEI--------IASIDSKSANNRYMWEIFSSDNGYEYIDGTDE   95 (396)
T ss_pred             ceEeeccccC--CccccccccccccCcccccCccCcc--------ccccccccccCcceEEEEeCCCceEEeccCCh
Confidence            3789999999  8999999999999988887777776        2222222345677665443 2   68888874


No 7  
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=88.09  E-value=0.35  Score=49.17  Aligned_cols=56  Identities=20%  Similarity=0.289  Sum_probs=34.9

Q ss_pred             cccccHHHHHH----HHHHHh--CeEEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeec
Q 015859          136 TRQLTRTELSQ----RLKDAI--GYTLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYS  193 (399)
Q Consensus       136 t~~l~~~~vs~----~l~~~l--Gfsl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~  193 (399)
                      ..+-+..+|++    .|.+.+  |=+-.++.--+.  ..|+||-.+-....|..|--|=|..-.
T Consensus       230 S~RKtk~~i~~E~~~~l~~~vl~g~~egl~~~~~d--gKGRGv~a~~~F~rgdFVVEY~Gdlie  291 (392)
T KOG1085|consen  230 SNRKTKKQISDEAKHALRDTVLKGTNEGLLEVYKD--GKGRGVRAKVNFERGDFVVEYRGDLIE  291 (392)
T ss_pred             cchhhHHHhhHHHHHHHHHHHHhccccceeEEeec--cccceeEeecccccCceEEEEecceee
Confidence            33334445544    444443  333333333444  589999888888999999999887644


No 8  
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=81.64  E-value=6.1  Score=44.22  Aligned_cols=59  Identities=22%  Similarity=0.308  Sum_probs=38.6

Q ss_pred             CCceEEEeEEecCCcEEEEecCeeeccccccc-cCCCCCCCCCC--CeeeeccCCeEEecCCCC
Q 015859          168 AGQGLFLCGEANVGAVIAIYPGIIYSPAYYRY-IPGYPRVDAQN--PYLITRYDGTVINAQPWG  228 (399)
Q Consensus       168 AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~-ipgyP~vd~~N--~YLi~r~DG~vIDg~pwg  228 (399)
                      .|-||=..-.+++|+.|-=|=|=|-+-.-|+. +.-|-+  ..|  -|.|+.--|.+|||.-.|
T Consensus       130 KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~--d~~kh~Yfm~L~~~e~IDAT~KG  191 (729)
T KOG4442|consen  130 KGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAK--DGIKHYYFMALQGGEYIDATKKG  191 (729)
T ss_pred             cccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHh--cCCceEEEEEecCCceecccccC
Confidence            45555555589999999999999988543331 111110  122  366666689999999975


No 9  
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=71.72  E-value=1.4  Score=46.39  Aligned_cols=29  Identities=28%  Similarity=0.407  Sum_probs=24.1

Q ss_pred             eEEEEEEEeccC-CCceeeeecccCCCCCC
Q 015859          354 LKTLALVATRAI-CDEEVLLNYRLSNSKRR  382 (399)
Q Consensus       354 lr~vVLVAtRdI-~dEELflNYRls~~~~~  382 (399)
                      ++-++..|.||| .+|||.++|-.......
T Consensus       427 ~~~~~~~~~rDI~~geEl~~dy~~~~~~~~  456 (480)
T COG2940         427 IFKISIYAIRDIKAGEELTYDYGPSLEDNR  456 (480)
T ss_pred             cceeeecccccchhhhhhccccccccccch
Confidence            667888999999 99999999987755543


No 10 
>PF00856 SET:  SET domain;  InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and recently there were described three structures of very different proteins with distinct domain compositions: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9, human SET7 (also called SET9), which methylates H3 on lysine 4 and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. Strictly conserved in the C-terminal motif of the SET domain tyrosine could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities [].  The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=64.35  E-value=4.5  Score=33.32  Aligned_cols=21  Identities=38%  Similarity=0.567  Sum_probs=17.9

Q ss_pred             EEEEEEEeccC-CCceeeeecc
Q 015859          355 KTLALVATRAI-CDEEVLLNYR  375 (399)
Q Consensus       355 r~vVLVAtRdI-~dEELflNYR  375 (399)
                      .+++++|+|+| .|||||++|.
T Consensus       141 ~~~~~~a~r~I~~GeEi~isYG  162 (162)
T PF00856_consen  141 GCLVVRATRDIKKGEEIFISYG  162 (162)
T ss_dssp             TEEEEEESS-B-TTSBEEEEST
T ss_pred             ceEEEEECCccCCCCEEEEEEC
Confidence            46889999999 8999999994


No 11 
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=45.07  E-value=13  Score=43.64  Aligned_cols=23  Identities=26%  Similarity=0.476  Sum_probs=19.8

Q ss_pred             EEEEEEeccC-CCceeeeecccCC
Q 015859          356 TLALVATRAI-CDEEVLLNYRLSN  378 (399)
Q Consensus       356 ~vVLVAtRdI-~dEELflNYRls~  378 (399)
                      -|+|+|+||| +||||..+|-+.-
T Consensus      1274 Rv~L~A~rDi~kGEELtYDYN~ks 1297 (1306)
T KOG1083|consen 1274 RVGLFALRDLPKGEELTYDYNFKS 1297 (1306)
T ss_pred             eeeeeecCCCCCCceEEEeccccc
Confidence            3678999999 9999999987653


No 12 
>KOG0404 consensus Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]
Probab=44.49  E-value=8.7  Score=38.65  Aligned_cols=51  Identities=29%  Similarity=0.486  Sum_probs=36.1

Q ss_pred             CCCCCcccccCCCCCCCCCCCCcccccccHHHHHHHHHHHhCeEEeecCCCCCcCCCCceEEEeEEe
Q 015859          112 PRRSGLSFAVGPTARPTDSPVVPQTRQLTRTELSQRLKDAIGYTLDLKPSQIPHEEAGQGLFLCGEA  178 (399)
Q Consensus       112 ~r~sgl~fa~~~~~~~~~~~~~~~t~~l~~~~vs~~l~~~lGfsl~vk~SsIph~~AG~GVFl~G~v  178 (399)
                      -..|||.|+.|++         |+|+.|+- +|   =.|.-||-+.+-.++.   -.=-|||..|-|
T Consensus       242 l~v~GlFf~IGH~---------Pat~~l~g-qv---e~d~~GYi~t~pgts~---TsvpG~FAAGDV  292 (322)
T KOG0404|consen  242 LPVSGLFFAIGHS---------PATKFLKG-QV---ELDEDGYIVTRPGTSL---TSVPGVFAAGDV  292 (322)
T ss_pred             cccceeEEEecCC---------chhhHhcC-ce---eeccCceEEeccCccc---ccccceeecccc
Confidence            4569999999995         88988876 55   5688899887744432   122288887644


No 13 
>PF14451 Ub-Mut7C:  Mut7-C ubiquitin
Probab=37.22  E-value=23  Score=29.31  Aligned_cols=26  Identities=35%  Similarity=0.747  Sum_probs=21.6

Q ss_pred             CCCcCCCCceEEEeE-------EecCCcEEEEec
Q 015859          162 QIPHEEAGQGLFLCG-------EANVGAVIAIYP  188 (399)
Q Consensus       162 sIph~~AG~GVFl~G-------~v~~GtVVa~YP  188 (399)
                      .|||.+-| -|+|+|       .+..|+.|++||
T Consensus        43 GVP~tEV~-~i~vNG~~v~~~~~~~~Gd~v~V~P   75 (81)
T PF14451_consen   43 GVPHTEVG-LILVNGRPVDFDYRLKDGDRVAVYP   75 (81)
T ss_pred             CCChHHeE-EEEECCEECCCcccCCCCCEEEEEe
Confidence            47888888 577887       678999999998


No 14 
>KOG3192 consensus Mitochondrial J-type chaperone [Posttranslational modification, protein turnover, chaperones]
Probab=33.83  E-value=1.4e+02  Score=28.35  Aligned_cols=36  Identities=17%  Similarity=0.243  Sum_probs=29.9

Q ss_pred             HHHHHHhccCCcHHHHHHHHHHHHHHHHHHHHhhhh
Q 015859           56 EEIIDMAGKASLSDQQQQVLDNIHSQIKRFCLSMDE   91 (399)
Q Consensus        56 eeii~~a~~~~~~~qq~qvq~nih~qi~~~c~~~~~   91 (399)
                      |+|-+|-....+..-+.|+|+-|..++..+-++|.+
T Consensus       106 E~IS~~~De~~l~~lk~q~q~ri~q~~~qlge~~es  141 (168)
T KOG3192|consen  106 EAISEMDDEEDLKQLKSQNQERIAQCKQQLGEAFES  141 (168)
T ss_pred             HHHHhccCcHHHHHHHHHHHHHHHHHHHHHHHHHhh
Confidence            566677777888888999999999999998877754


No 15 
>PF08638 Med14:  Mediator complex subunit MED14;  InterPro: IPR013947 The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.  The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11.  The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.   The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22.  The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4.  The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16.  
The CDK8 module contains: MED12, MED13, CCNC and CDK8.   Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.  Saccharomyces cerevisiae (Baker's yeast) RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes, and sporulation. It is required for glucose repression, HO repression, RME1 repression and sporulation [, ]. This subunit is also found in higher eukaryotes and MED14 is the agreed unified nomenclature for this subunit []. ; GO: 0001104 RNA polymerase II transcription cofactor activity, 0006357 regulation of transcription from RNA polymerase II promoter, 0016592 mediator complex
Probab=30.21  E-value=3.1e+02  Score=25.88  Aligned_cols=85  Identities=21%  Similarity=0.360  Sum_probs=60.0

Q ss_pred             HHHHHHHHHHHHhc----CCCCCCCccchhhhhcch----hhhhhhhhhhccCCcchhcHHHHHHHhccCCcHHHHHHHH
Q 015859            4 LFQKFQEAVKTLAK----SPTFARDPRQLQFEADMN----RLFLYTSYNRLGRDAEEADAEEIIDMAGKASLSDQQQQVL   75 (399)
Q Consensus         4 ~f~~~q~~v~~la~----~~~~~~~~r~~q~e~D~~----rlf~~tsy~~l~~~~~~~d~eeii~~a~~~~~~~qq~qvq   75 (399)
                      -|+.+++.+++|+.    .+...|--+=+||=....    ||.+++-+.+-   +  ++...+|+|..   +-++|.+..
T Consensus        16 sy~eL~~l~e~l~~~~~~~~d~~rK~~ll~~~~~~R~~fiKLlvL~kWs~~---~--~~v~k~idl~~---~l~~q~~~~   87 (195)
T PF08638_consen   16 SYNELQQLIETLPSDDTSQSDSERKRRLLQFAQSTRQRFIKLLVLVKWSRK---A--KDVSKCIDLLN---FLRQQNMCF   87 (195)
T ss_pred             HHHHHHHHHHHccccCCccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh---h--hHHHHHHHHHH---HHHHHHHHH
Confidence            57889999999998    443334444455555544    56667776654   3  25677777765   347888888


Q ss_pred             HHHHHHHHHHHHhhhhhhcCC
Q 015859           76 DNIHSQIKRFCLSMDEILLLP   96 (399)
Q Consensus        76 ~nih~qi~~~c~~~~~il~~~   96 (399)
                      ++..+++..++..|..+=+|.
T Consensus        88 ~~~~~~L~~~~~~l~~Ar~p~  108 (195)
T PF08638_consen   88 EDAADRLFRLKEQLQNARLPN  108 (195)
T ss_pred             HHHHHHHHHHHHHhhhccCCC
Confidence            999999999998888877765


No 16 
>COG1791 Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]
Probab=28.91  E-value=44  Score=31.93  Aligned_cols=26  Identities=38%  Similarity=0.689  Sum_probs=21.6

Q ss_pred             CCceEEE-e---E-----EecCCcEEEEecCeeec
Q 015859          168 AGQGLFL-C---G-----EANVGAVIAIYPGIIYS  193 (399)
Q Consensus       168 AG~GVFl-~---G-----~v~~GtVVa~YPGvVY~  193 (399)
                      ||.|+|. .   |     .|.+|++|++-||+=+.
T Consensus       103 aG~GiF~v~~~d~~~~~i~c~~gDLI~vP~gi~Hw  137 (181)
T COG1791         103 AGEGIFDVHSPDGKVYQIRCEKGDLISVPPGIYHW  137 (181)
T ss_pred             ecceEEEEECCCCcEEEEEEccCCEEecCCCceEE
Confidence            7999998 2   2     68999999999998654


No 17 
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=27.96  E-value=41  Score=35.38  Aligned_cols=21  Identities=43%  Similarity=0.697  Sum_probs=19.5

Q ss_pred             EEEEEEeccC-CCceeeeeccc
Q 015859          356 TLALVATRAI-CDEEVLLNYRL  376 (399)
Q Consensus       356 ~vVLVAtRdI-~dEELflNYRl  376 (399)
                      ++.++++++| +|||+|.||+-
T Consensus       257 ~~~l~~~~~v~~geevfi~YG~  278 (472)
T KOG1337|consen  257 AVELVAERDVSAGEEVFINYGP  278 (472)
T ss_pred             cEEEEEeeeecCCCeEEEecCC
Confidence            8899999999 99999999984


No 18 
>PF02311 AraC_binding:  AraC-like ligand binding domain;  InterPro: IPR003313 This entry defines the arabinose-binding and dimerisation domain of the bacterial gene regulatory protein AraC. The crystal structure of the arabinose-binding and dimerization domain of the Escherichia coli gene regulatory protein AraC was determined in the presence and absence of L-arabinose. The arabinose-bound molecule shows that the protein adopts an unusual fold, binding sugar within a beta barrel and completely burying the arabinose with the amino-terminal arm of the protein. Dimer contacts in the presence of arabinose are mediated by an antiparallel coiled-coil. In the uncomplexed protein, the amino-terminal arm is disordered, uncovering the sugar-binding pocket and allowing it to serve as an oligomerization interface [].; GO: 0006355 regulation of transcription, DNA-dependent; PDB: 1XJA_B 2ARA_A 2AAC_B 2ARC_A.
Probab=26.64  E-value=41  Score=27.00  Aligned_cols=28  Identities=29%  Similarity=0.336  Sum_probs=18.5

Q ss_pred             CCCceEEE-eE---EecCCcEEEEecCeeecc
Q 015859          167 EAGQGLFL-CG---EANVGAVIAIYPGIIYSP  194 (399)
Q Consensus       167 ~AG~GVFl-~G---~v~~GtVVa~YPGvVY~p  194 (399)
                      ..|+|.|. +|   .+.+|+++-+-||.++.-
T Consensus        30 ~~G~~~~~~~~~~~~l~~g~~~li~p~~~H~~   61 (136)
T PF02311_consen   30 LSGEGTLHIDGQEYPLKPGDLFLIPPGQPHSY   61 (136)
T ss_dssp             EEE-EEEEETTEEEEE-TT-EEEE-TTS-EEE
T ss_pred             eCCEEEEEECCEEEEEECCEEEEecCCccEEE
Confidence            46778777 33   899999999999999984


No 19 
>PF08666 SAF:  SAF domain;  InterPro: IPR013974  This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=25.01  E-value=49  Score=24.46  Aligned_cols=13  Identities=31%  Similarity=0.473  Sum_probs=10.3

Q ss_pred             EEEEeccC-CCcee
Q 015859          358 ALVATRAI-CDEEV  370 (399)
Q Consensus       358 VLVAtRdI-~dEEL  370 (399)
                      |+||+||| .|+.|
T Consensus         3 vvVA~~di~~G~~i   16 (63)
T PF08666_consen    3 VVVAARDIPAGTVI   16 (63)
T ss_dssp             EEEESSTB-TT-BE
T ss_pred             EEEEeCccCCCCEE
Confidence            68999999 78766


No 20 
>COG1485 Predicted ATPase [General function prediction only]
Probab=22.82  E-value=55  Score=34.43  Aligned_cols=41  Identities=34%  Similarity=0.655  Sum_probs=32.7

Q ss_pred             EEEEEEEeccCCCc---------eeee-------------------ecccCCCCCCCCccccCCHHHHh
Q 015859          355 KTLALVATRAICDE---------EVLL-------------------NYRLSNSKRRPVWYSPVDEEEDR  395 (399)
Q Consensus       355 r~vVLVAtRdI~dE---------ELfl-------------------NYRls~~~~~P~WY~pvD~eEd~  395 (399)
                      ++|+||||-.+.=+         |.||                   +||+-...+-|-|++|.|.|.+.
T Consensus       160 ~GV~lvaTSN~~P~~LY~dGlqR~~FLP~I~li~~~~~v~~vD~~~DYR~r~l~~a~~y~~Pl~~~~~~  228 (367)
T COG1485         160 RGVVLVATSNTAPDNLYKDGLQRERFLPAIDLIKSHFEVVNVDGPVDYRLRKLEQAPVYLTPLDAEAEA  228 (367)
T ss_pred             CCcEEEEeCCCChHHhcccchhHHhhHHHHHHHHHheEEEEecCCccccccccccCceeecCCcHHHHH
Confidence            68889999877333         3343                   89999999999999999998764


No 21 
>KOG3710 consensus EGL-Nine (EGLN) protein [Signal transduction mechanisms]
Probab=22.80  E-value=78  Score=31.85  Aligned_cols=75  Identities=31%  Similarity=0.363  Sum_probs=41.1

Q ss_pred             CeEEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCCCeeeeccCCeEEec-----CCC
Q 015859          153 GYTLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQNPYLITRYDGTVINA-----QPW  227 (399)
Q Consensus       153 Gfsl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N~YLi~r~DG~vIDg-----~pw  227 (399)
                      |+-...-.|.|-|...--|-++=|  ..-+.||.|||-=  --|+++.        +||-    -||-.|-.     +.|
T Consensus       118 ~~L~s~~d~~i~h~~~r~~~~~~g--RtkAMVAcYPGNG--tgYVrHV--------DNP~----gDGRcITcIYYlNqNW  181 (280)
T KOG3710|consen  118 MLLPSPIDSVILHCNGRLGSYIIG--RTKAMVACYPGNG--TGYVRHV--------DNPH----GDGRCITCIYYLNQNW  181 (280)
T ss_pred             eeecccchhhhhhhcccccccccc--ceeEEEEEecCCC--ceeeEec--------cCCC----CCceEEEEEEEcccCc
Confidence            333344444454544444445544  5668999999852  1233444        6664    34433311     223


Q ss_pred             ---CCCCCCccccCCCCcC
Q 015859          228 ---GSGWDTRELWDGLTLP  243 (399)
Q Consensus       228 ---g~gg~sr~~~~g~~~~  243 (399)
                         -.||+=|..|.|.+..
T Consensus       182 D~kv~Gg~Lri~pe~~~~~  200 (280)
T KOG3710|consen  182 DVKVHGGILRIFPEGSTTF  200 (280)
T ss_pred             ceeeccceeEeccCCCCcc
Confidence               2468888889887663


No 22 
>PF08443 RimK:  RimK-like ATP-grasp domain;  InterPro: IPR013651 This ATP-grasp domain is found in the ribosomal S6 modification enzyme RimK []. It has an unusual nucleotide-binding fold referred to as palmate, or ATP-grasp fold. This domain is found in a number of enzymes of known structure as well as in urea amidolyase, tubulin-tyrosine ligase, and three enzymes of purine biosynthesis.; PDB: 1UC8_B 1UC9_A.
Probab=21.36  E-value=59  Score=29.65  Aligned_cols=41  Identities=29%  Similarity=0.596  Sum_probs=19.6

Q ss_pred             CcccccccHHHHHHHHHHHh-CeEEeecCCCCCcCCCCceEEE-eE
Q 015859          133 VPQTRQLTRTELSQRLKDAI-GYTLDLKPSQIPHEEAGQGLFL-CG  176 (399)
Q Consensus       133 ~~~t~~l~~~~vs~~l~~~l-Gfsl~vk~SsIph~~AG~GVFl-~G  176 (399)
                      +|+|.-....+-.+.+.+.+ ||-+.+||+.=   ..|.|||+ +.
T Consensus        18 vP~t~~~~~~~~~~~~~~~~~~~p~ViKp~~g---~~G~gV~~i~~   60 (190)
T PF08443_consen   18 VPETRVTNSPEEAKEFIEELGGFPVVIKPLRG---SSGRGVFLINS   60 (190)
T ss_dssp             ---EEEESSHHHHHHHHHHH--SSEEEE-SB----------EEEES
T ss_pred             CCCEEEECCHHHHHHHHHHhcCCCEEEeeCCC---CCCCEEEEecC
Confidence            56766554444445566666 99999999753   67999998 54


Done!