Query         018186
Match_columns 359
No_of_seqs    246 out of 1373
Neff          7.5 
Searched_HMMs 46136
Date          Fri Mar 29 06:51:59 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/018186.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/018186hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG1338 Uncharacterized conser 100.0 8.8E-33 1.9E-37  261.3  16.0  274   76-357     9-314 (466)
  2 KOG1337 N-methyltransferase [G 100.0 3.2E-32   7E-37  275.6  16.7  268   74-358    47-326 (472)
  3 PF00856 SET:  SET domain;  Int  99.8 1.1E-18 2.4E-23  149.0   8.1   49  259-307   112-162 (162)
  4 smart00317 SET SET (Su(var)3-9  99.0 5.8E-10 1.3E-14   90.4   5.9   46  261-306    69-116 (116)
  5 KOG2589 Histone tail methylase  95.8  0.0073 1.6E-07   58.3   3.2   47  260-308   192-238 (453)
  6 KOG1085 Predicted methyltransf  95.6   0.011 2.4E-07   55.3   3.7   48  267-315   335-385 (392)
  7 KOG4442 Clathrin coat binding   94.6   0.034 7.4E-07   57.9   4.0   43  266-308   194-238 (729)
  8 KOG1079 Transcriptional repres  94.2   0.046 9.9E-07   56.7   3.8   44  265-308   665-710 (739)
  9 KOG1080 Histone H3 (Lys4) meth  94.0   0.045 9.7E-07   60.1   3.7   45  264-308   938-984 (1005)
 10 COG2940 Proteins containing SE  91.5     0.1 2.2E-06   53.6   1.9   43  266-308   406-450 (480)
 11 KOG1082 Histone H3 (Lys9) meth  85.5    0.55 1.2E-05   46.4   2.4   44  267-310   274-323 (364)
 12 KOG1083 Putative transcription  80.7     1.5 3.1E-05   48.2   3.3   44  266-309  1251-1296(1306)
 13 KOG1085 Predicted methyltransf  68.0     5.9 0.00013   37.6   3.5   35   91-125   256-290 (392)
 14 KOG1141 Predicted histone meth  67.0     2.8 6.2E-05   44.9   1.4   53  267-319  1191-1253(1262)
 15 KOG2084 Predicted histone tail  65.1     9.7 0.00021   38.2   4.9   60  259-320   199-265 (482)
 16 KOG1338 Uncharacterized conser  56.3     2.4 5.1E-05   41.9  -1.3   78  259-341   269-349 (466)
 17 COG1188 Ribosome-associated he  47.7      18  0.0004   28.9   2.7   55  233-309     8-62  (100)
 18 KOG2461 Transcription factor B  47.4      19 0.00041   36.1   3.4   35  285-319   121-155 (396)
 19 TIGR02059 swm_rep_I cyanobacte  46.2      36 0.00079   27.2   4.1   29  280-308    69-97  (101)
 20 smart00317 SET SET (Su(var)3-9  41.4      34 0.00073   26.5   3.5   28   91-118    83-113 (116)
 21 KOG4442 Clathrin coat binding   40.1      26 0.00057   37.2   3.2   29   91-119   120-148 (729)
 22 PF10281 Ish1:  Putative stress  34.0      39 0.00084   21.7   2.2   16   76-91      6-21  (38)
 23 KOG1080 Histone H3 (Lys4) meth  31.4      45 0.00097   37.4   3.4   45   74-119   850-894 (1005)
 24 PF08666 SAF:  SAF domain;  Int  29.4      33 0.00072   24.0   1.4   15  103-117     2-16  (63)
 25 PF11629 Mst1_SARAH:  C termina  24.1 2.5E+02  0.0054   19.4   5.5   40  182-223     5-44  (49)
 26 PF09652 Cas_VVA1548:  Putative  22.9      57  0.0012   25.7   1.7   42   74-124     4-46  (93)
 27 KOG1337 N-methyltransferase [G  21.4      35 0.00075   34.9   0.3   51   73-123     3-55  (472)
 28 PF00856 SET:  SET domain;  Int  21.3 1.2E+02  0.0026   24.5   3.6   27   92-118   129-158 (162)

No 1  
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=100.00  E-value=8.8e-33  Score=261.27  Aligned_cols=274  Identities=22%  Similarity=0.338  Sum_probs=218.5

Q ss_pred             HHHHHHHHHhCC-CCC-CCcEEEeeC----CCceEEEEcccCCCCCEEEEcCCCCcccccCCCC---CchhhhhhccCCC
Q 018186           76 ASTLQKWLSDSG-LPP-QKMAIQKVD----VGERGLVALKNIRKGEKLLFVPPSLVITADSKWS---CPEAGEVLKQCSV  146 (359)
Q Consensus        76 ~~~l~~Wl~~~G-~~~-~~v~i~~~~----~~GrGl~At~~I~~ge~ll~IP~~l~is~~~a~~---~~~~~~~l~~~~l  146 (359)
                      .+.|+.|++..+ ... ++|.+...+    ..|+|++|+++|++||.|+.+|++++++..+...   .|+..+++= +++
T Consensus         9 ~~~fl~w~k~t~eletSpKi~~ndl~~v~~~~G~g~vAtesIkkgE~Lf~~prdsvLsvtts~li~~lps~~rv~L-ne~   87 (466)
T KOG1338|consen    9 AKRFLLWGKLTLELETSPKIDNNDLPWVERIAGAGIVATESIKKGESLFAYPRDSVLSVTTSALITPLPSDIRVLL-NEV   87 (466)
T ss_pred             HHHHHHHHHHhhheeecccccccccchhhhhcccceeeehhhcCCceEEEecCccEEeeehHHhcccchHHHHHHh-hcC
Confidence            689999999987 443 778776543    2499999999999999999999999999876311   222222221 468


Q ss_pred             CChHHHHHHHHHHhccCCCCCcHHHHHhcCCC--CCCccccCHHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhh
Q 018186          147 PDWPLLATYLISEASFEKSSRWSNYISALPRQ--PYSLLYWTRAELDRYLEASQIRERAIERITNVIGTYNDLRLRIFSK  224 (359)
Q Consensus       147 ~~~~~Lal~Ll~E~~~g~~S~w~pYl~~LP~~--~~~pl~w~~~el~~lL~gt~l~~~~~~~~~~~~~~y~~~~~~l~~~  224 (359)
                      +.|..|++.|++|...+..|+|+||++.+|+.  .++|+||+++|++.+++|+-+.+ ..+..+.+.+.|....+++.+.
T Consensus        88 gsw~~Lllvll~E~~~pq~SrWrPYfs~wp~p~rm~spifWdEnEl~~Ll~stvlee-~~Kd~aeI~~~~i~~i~pf~~~  166 (466)
T KOG1338|consen   88 GSWGMLLLVLLREKKMPQKSRWRPYFSRWPQPARMHSPIFWDENELSMLLCSTVLEE-TVKDKAEIEKDFIFVIQPFKQH  166 (466)
T ss_pred             CcHHHHHHHHHHHhhcccccccccHHHhCCChhhcCCCccCCchHHHHHhhcccchh-hHhHHHHHHHHHHHHHHHHHHh
Confidence            89999999999999877779999999999975  57899999999997677776655 7788899999999999999999


Q ss_pred             CCCCCCccCCCHHHHHHHHhhhhhcceecCCCC-------------CceEeeeeeecccCCCC-cceeEEeeCCCCeEEE
Q 018186          225 YPDLFPEEVFNMETFKWSFGILFSRLVRLPSMD-------------GRVALVPWADMLNHSCE-VETFLDYDKSSQGVVF  290 (359)
Q Consensus       225 ~p~~f~~~~~t~~~f~WA~~~V~SRaf~~~~~~-------------~~~~LvP~~Dm~NH~~~-~~~~~~~d~~~~~~~l  290 (359)
                      +|..|..  +++++|..+++++.+.+|.++-..             ..-+|+|.+||+||+.. .|+...|+  ++|+.|
T Consensus       167 ~p~vfs~--~slEdF~y~~Al~laysfdve~~~s~~~~eee~e~e~ngk~m~p~ad~lNhd~~k~nanl~y~--~NcL~m  242 (466)
T KOG1338|consen  167 CPIVFSR--PSLEDFMYAYALGLAYSFDVEFLLSLDNLEEESEIECNGKLMTPIADFLNHDGLKANANLRYE--DNCLEM  242 (466)
T ss_pred             Ccchhcc--cCHHHHHHHHHHHHHHheeeehhcchhhhhhhhccccCcccccchhhhhccchhhcccceecc--Ccceee
Confidence            9998864  899999999999999999875320             13589999999999987 77778885  699999


Q ss_pred             EEcCcCCCCceEEecCCCCChHHHHhcCCcccCCCCC-------CCCeEEEeeccCCCCccHHHHHHHHHHCCC
Q 018186          291 TTDRQYQPGEQVFISYGKKSNGELLLSYGFVPREGTN-------PSDSVELPLSLKKSDKCYKEKLEALRKYGL  357 (359)
Q Consensus       291 ~a~r~i~~GeEv~isYG~~sN~~LL~~YGFv~~~~~N-------p~D~v~L~l~l~~~d~~~~~K~~~L~~~Gl  357 (359)
                      +|+|+|.+|+||+++||.++|.  |++||.+.-.+..       -.|.+++-.+++.+++....|.-++..+|.
T Consensus       243 va~r~iekgdev~n~dg~~p~~--l~~l~ka~c~gihm~~g~~~l~niv~~l~D~~~d~tm~~~R~il~ql~nt  314 (466)
T KOG1338|consen  243 VADRNIEKGDEVDNSDGLKPMG--LLKLTKALCVGIHMVWGILKLYNIVQILMDVPNDDTMRNMRLILLQLHNT  314 (466)
T ss_pred             eecCCCCCccccccccccCcch--hhhhhhhccceeeeecceeecchHHHHHhcCCCcchHHHHHHHHHHhccc
Confidence            9999999999999999998888  6666665442211       123344444667788888888776666653


No 2  
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=99.98  E-value=3.2e-32  Score=275.65  Aligned_cols=268  Identities=39%  Similarity=0.616  Sum_probs=207.6

Q ss_pred             hcHHHHHHHHHhCCCCCCCcEEEeeCCCceEEEEcccCCCCCEEEEcCCCCcccccCCCCCchhhhhhccCCCCCh-HHH
Q 018186           74 ENASTLQKWLSDSGLPPQKMAIQKVDVGERGLVALKNIRKGEKLLFVPPSLVITADSKWSCPEAGEVLKQCSVPDW-PLL  152 (359)
Q Consensus        74 ~~~~~l~~Wl~~~G~~~~~v~i~~~~~~GrGl~At~~I~~ge~ll~IP~~l~is~~~a~~~~~~~~~l~~~~l~~~-~~L  152 (359)
                      +....+..|.+..|....+..+......++++.+..++..++.+..+|....+..+.....            ... ..|
T Consensus        47 ~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------------~~~~~~l  114 (472)
T KOG1337|consen   47 ENIKSLKFWLTGNGLSSSKSSLPGNDIDEWPLLVSIRLIKGEKLLLVPPLLLLIAKRKPYN------------DLLPIAL  114 (472)
T ss_pred             cccccceeccccCCcchhhhccccccccccchhhhhhhhhhhhhccCCchhhhccccccCc------------cccHHHH
Confidence            3345566666666665433322222234556666666666665555555555444433211            111 578


Q ss_pred             HHHHHHHhccCCCCCcHHHHHhcCCCCCCccccCHHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhhCCCCCC--
Q 018186          153 ATYLISEASFEKSSRWSNYISALPRQPYSLLYWTRAELDRYLEASQIRERAIERITNVIGTYNDLRLRIFSKYPDLFP--  230 (359)
Q Consensus       153 al~Ll~E~~~g~~S~w~pYl~~LP~~~~~pl~w~~~el~~lL~gt~l~~~~~~~~~~~~~~y~~~~~~l~~~~p~~f~--  230 (359)
                      +++|+.|...+..|.|.+|+..||..+++|++|..+++. .|.++.....+..+.+.++..+..+.. +...++..+.  
T Consensus       115 ~~~l~~~~~~~~~s~w~~~i~~l~~~~~~p~~~~~~~v~-~l~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~  192 (472)
T KOG1337|consen  115 ALFLLLEWAHGEISKWKPYISTLPSQYNSPLLWSEDEVK-SLLSTPLFEIVASRRQNLVNKSAELLE-VLQSHPSLFGSD  192 (472)
T ss_pred             HHHHHHhhhccccccchhhhhhchhhcCCccccCHHHHH-HhhcchhhHHHHHHHHHhhhhHHHHHH-HHHhcccccccc
Confidence            999999998888899999999999999999999999998 589999888887777777665555543 3344554432  


Q ss_pred             -ccCCCHHHHHHHHhhhhhcceecCCC--------CCceEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCce
Q 018186          231 -EEVFNMETFKWSFGILFSRLVRLPSM--------DGRVALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGEQ  301 (359)
Q Consensus       231 -~~~~t~~~f~WA~~~V~SRaf~~~~~--------~~~~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~GeE  301 (359)
                       .+.++++.|.||+++|.||+|+.+..        +...+|+|++||+||+++. ..+.|+..++.+.+++.+++++|||
T Consensus       193 ~~d~~~~~~~~w~~~~~~sr~~~~~~~~~~~~~~~~~~~~L~P~~D~~NH~~~~-~~~~~~~~d~~~~l~~~~~v~~gee  271 (472)
T KOG1337|consen  193 LFDTFTFSAFKWAYSIVNSRAFYLPSLQRLTAGDPDDNEALAPLIDLLNHSPEV-IKAGYNQEDEAVELVAERDVSAGEE  271 (472)
T ss_pred             ccCccchHHHHHHHHHHhhhhhccccccccccCCCCcchhhhhhHHhhccCchh-ccccccCCCCcEEEEEeeeecCCCe
Confidence             23489999999999999999987643        2367999999999999998 5566777777999999999999999


Q ss_pred             EEecCCCCChHHHHhcCCcccCCCCCCCCeEEEeeccCCCCccHHHHHHHHHHCCCC
Q 018186          302 VFISYGKKSNGELLLSYGFVPREGTNPSDSVELPLSLKKSDKCYKEKLEALRKYGLS  358 (359)
Q Consensus       302 v~isYG~~sN~~LL~~YGFv~~~~~Np~D~v~L~l~l~~~d~~~~~K~~~L~~~Gl~  358 (359)
                      |||+||+++|++||++||||.+  +||+|.|.|.+.+...|+.+..|...++++|+.
T Consensus       272 vfi~YG~~~N~eLL~~YGFv~~--~N~~d~v~l~~~l~~~~~~~~~~~~~~~~~~~~  326 (472)
T KOG1337|consen  272 VFINYGPKSNAELLLHYGFVEE--DNPYDSVTLKLALPPEDVSYLDKSDVLKKNGLP  326 (472)
T ss_pred             EEEecCCCchHHHHHhcCCCCC--CCCcceEEEeecccccccchhHHHHHHhhcCCC
Confidence            9999999999999999999987  999999999999999999999999999999875


No 3  
>PF00856 SET:  SET domain;  InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities.  The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.76  E-value=1.1e-18  Score=148.98  Aligned_cols=49  Identities=43%  Similarity=0.802  Sum_probs=44.1

Q ss_pred             ceEeeeeeecccCCCCcceeEEee--CCCCeEEEEEcCcCCCCceEEecCC
Q 018186          259 RVALVPWADMLNHSCEVETFLDYD--KSSQGVVFTTDRQYQPGEQVFISYG  307 (359)
Q Consensus       259 ~~~LvP~~Dm~NH~~~~~~~~~~d--~~~~~~~l~a~r~i~~GeEv~isYG  307 (359)
                      ..+|+|++||+||++.+|+.+.++  ..+++++++|.|+|++||||||+||
T Consensus       112 ~~~l~p~~d~~NHsc~pn~~~~~~~~~~~~~~~~~a~r~I~~GeEi~isYG  162 (162)
T PF00856_consen  112 GIALYPFADMLNHSCDPNCEVSFDFDGDGGCLVVRATRDIKKGEEIFISYG  162 (162)
T ss_dssp             EEEEETGGGGSEEESSTSEEEEEEEETTTTEEEEEESS-B-TTSBEEEEST
T ss_pred             ccccCcHhHheccccccccceeeEeecccceEEEEECCccCCCCEEEEEEC
Confidence            579999999999999999998887  5789999999999999999999998


No 4  
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.00  E-value=5.8e-10  Score=90.43  Aligned_cols=46  Identities=30%  Similarity=0.440  Sum_probs=40.7

Q ss_pred             EeeeeeecccCCCCcceeEEeeCCCC--eEEEEEcCcCCCCceEEecC
Q 018186          261 ALVPWADMLNHSCEVETFLDYDKSSQ--GVVFTTDRQYQPGEQVFISY  306 (359)
Q Consensus       261 ~LvP~~Dm~NH~~~~~~~~~~d~~~~--~~~l~a~r~i~~GeEv~isY  306 (359)
                      .+.|+++++||++.+|+.+.+...++  .+.++|.|+|++||||+++|
T Consensus        69 ~~~~~~~~iNHsc~pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~i~Y  116 (116)
T smart00317       69 RKGNIARFINHSCEPNCELLFVEVNGDSRIVIFALRDIKPGEELTIDY  116 (116)
T ss_pred             ccCcHHHeeCCCCCCCEEEEEEEECCCcEEEEEECCCcCCCCEEeecC
Confidence            48999999999999999887764444  59999999999999999999


No 5  
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=95.79  E-value=0.0073  Score=58.27  Aligned_cols=47  Identities=23%  Similarity=0.396  Sum_probs=37.7

Q ss_pred             eEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCceEEecCCC
Q 018186          260 VALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGEQVFISYGK  308 (359)
Q Consensus       260 ~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~GeEv~isYG~  308 (359)
                      +-|=|. -++||++.+||.+.- .+.+...|++.|||++||||+--||.
T Consensus       192 LwLGPa-afINHDCrpnCkFvs-~g~~tacvkvlRDIePGeEITcFYgs  238 (453)
T KOG2589|consen  192 LWLGPA-AFINHDCRPNCKFVS-TGRDTACVKVLRDIEPGEEITCFYGS  238 (453)
T ss_pred             heeccH-HhhcCCCCCCceeec-CCCceeeeehhhcCCCCceeEEeecc
Confidence            345564 489999999986432 23378999999999999999999997


No 6  
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=95.63  E-value=0.011  Score=55.27  Aligned_cols=48  Identities=23%  Similarity=0.459  Sum_probs=38.8

Q ss_pred             ecccCCCCcceeE---EeeCCCCeEEEEEcCcCCCCceEEecCCCCChHHHH
Q 018186          267 DMLNHSCEVETFL---DYDKSSQGVVFTTDRQYQPGEQVFISYGKKSNGELL  315 (359)
Q Consensus       267 Dm~NH~~~~~~~~---~~d~~~~~~~l~a~r~i~~GeEv~isYG~~sN~~LL  315 (359)
                      -++||+-..|+..   ..| ....++++|.++|.+|||++..||+++-+-++
T Consensus       335 RLINHS~~gNl~TKvv~Id-g~pHLiLvA~rdIa~GEELlYDYGDRSkesi~  385 (392)
T KOG1085|consen  335 RLINHSVRGNLKTKVVEID-GSPHLILVARRDIAQGEELLYDYGDRSKESIA  385 (392)
T ss_pred             hhhcccccCcceeeEEEec-CCceEEEEeccccccchhhhhhccccchhHHh
Confidence            4899999887643   333 45679999999999999999999998876554


No 7  
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=94.62  E-value=0.034  Score=57.85  Aligned_cols=43  Identities=23%  Similarity=0.468  Sum_probs=33.3

Q ss_pred             eecccCCCCcceeEE-ee-CCCCeEEEEEcCcCCCCceEEecCCC
Q 018186          266 ADMLNHSCEVETFLD-YD-KSSQGVVFTTDRQYQPGEQVFISYGK  308 (359)
Q Consensus       266 ~Dm~NH~~~~~~~~~-~d-~~~~~~~l~a~r~i~~GeEv~isYG~  308 (359)
                      +=++||++++||... |. .+.-.+-+.|.+.|++||||+..|+-
T Consensus       194 aRFiNHSC~PNa~~~KWtV~~~lRvGiFakk~I~~GEEITFDYqf  238 (729)
T KOG4442|consen  194 ARFINHSCDPNAEVQKWTVPDELRVGIFAKKVIKPGEEITFDYQF  238 (729)
T ss_pred             HHhhcCCCCCCceeeeeeeCCeeEEEEeEecccCCCceeeEeccc
Confidence            357999999998642 22 23456778899999999999999873


No 8  
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=94.15  E-value=0.046  Score=56.69  Aligned_cols=44  Identities=18%  Similarity=0.311  Sum_probs=36.3

Q ss_pred             eeecccCCCCcceeEE--eeCCCCeEEEEEcCcCCCCceEEecCCC
Q 018186          265 WADMLNHSCEVETFLD--YDKSSQGVVFTTDRQYQPGEQVFISYGK  308 (359)
Q Consensus       265 ~~Dm~NH~~~~~~~~~--~d~~~~~~~l~a~r~i~~GeEv~isYG~  308 (359)
                      .+-++||+.++||...  +......+-+.|.|.|++|||||..|+=
T Consensus       665 k~rFANHS~nPNCYAkvm~V~GdhRIGifAkRaIeagEELffDYrY  710 (739)
T KOG1079|consen  665 KIRFANHSFNPNCYAKVMMVAGDHRIGIFAKRAIEAGEELFFDYRY  710 (739)
T ss_pred             hhhhccCCCCCCcEEEEEEecCCcceeeeehhhcccCceeeeeecc
Confidence            4568999999998754  3345678899999999999999999963


No 9  
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=94.04  E-value=0.045  Score=60.12  Aligned_cols=45  Identities=22%  Similarity=0.417  Sum_probs=36.2

Q ss_pred             eeeecccCCCCcceeEEee--CCCCeEEEEEcCcCCCCceEEecCCC
Q 018186          264 PWADMLNHSCEVETFLDYD--KSSQGVVFTTDRQYQPGEQVFISYGK  308 (359)
Q Consensus       264 P~~Dm~NH~~~~~~~~~~d--~~~~~~~l~a~r~i~~GeEv~isYG~  308 (359)
                      =++=++||++.+||+...-  .+...++++|.|+|.+||||+.+|--
T Consensus       938 niAr~InHsC~PNCyakvi~V~g~~~IvIyakr~I~~~EElTYDYkF  984 (1005)
T KOG1080|consen  938 NIARFINHSCNPNCYAKVITVEGDKRIVIYSKRDIAAGEELTYDYKF  984 (1005)
T ss_pred             chhheeecccCCCceeeEEEecCeeEEEEEEecccccCceeeeeccc
Confidence            3556899999999976432  24457999999999999999999953


No 10 
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=91.52  E-value=0.1  Score=53.55  Aligned_cols=43  Identities=23%  Similarity=0.356  Sum_probs=35.9

Q ss_pred             eecccCCCCcceeEEeeCCCC--eEEEEEcCcCCCCceEEecCCC
Q 018186          266 ADMLNHSCEVETFLDYDKSSQ--GVVFTTDRQYQPGEQVFISYGK  308 (359)
Q Consensus       266 ~Dm~NH~~~~~~~~~~d~~~~--~~~l~a~r~i~~GeEv~isYG~  308 (359)
                      .=++||++.+|+........+  .+.+++.+||.+||||+++||.
T Consensus       406 ~r~~nHS~~pN~~~~~~~~~g~~~~~~~~~rDI~~geEl~~dy~~  450 (480)
T COG2940         406 ARFINHSCTPNCEASPIEVNGIFKISIYAIRDIKAGEELTYDYGP  450 (480)
T ss_pred             cceeecCCCCCcceecccccccceeeecccccchhhhhhcccccc
Confidence            338999999999877655444  6778899999999999999987


No 11 
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=85.53  E-value=0.55  Score=46.41  Aligned_cols=44  Identities=30%  Similarity=0.528  Sum_probs=33.1

Q ss_pred             ecccCCCCcceeEEe---eC---CCCeEEEEEcCcCCCCceEEecCCCCC
Q 018186          267 DMLNHSCEVETFLDY---DK---SSQGVVFTTDRQYQPGEQVFISYGKKS  310 (359)
Q Consensus       267 Dm~NH~~~~~~~~~~---d~---~~~~~~l~a~r~i~~GeEv~isYG~~s  310 (359)
                      =++||++.+|..+..   +.   .--.+.+.|.++|.+|+|++..||...
T Consensus       274 RfinHSC~PN~~~~~v~~~~~~~~~~~i~ffa~~~I~p~~ELT~dYg~~~  323 (364)
T KOG1082|consen  274 RFINHSCSPNLLYQAVFQDEFVLLYLRIGFFALRDISPGEELTLDYGKAY  323 (364)
T ss_pred             ccccCCCCccceeeeeeecCCccchheeeeeeccccCCCcccchhhcccc
Confidence            578999999876532   21   112467889999999999999999743


No 12 
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=80.65  E-value=1.5  Score=48.21  Aligned_cols=44  Identities=25%  Similarity=0.391  Sum_probs=32.7

Q ss_pred             eecccCCCCcceeE-EeeCCC-CeEEEEEcCcCCCCceEEecCCCC
Q 018186          266 ADMLNHSCEVETFL-DYDKSS-QGVVFTTDRQYQPGEQVFISYGKK  309 (359)
Q Consensus       266 ~Dm~NH~~~~~~~~-~~d~~~-~~~~l~a~r~i~~GeEv~isYG~~  309 (359)
                      +-+.||++.+||.. .|.-.+ -.+.|.|.|||.+||||+..|..+
T Consensus      1251 ~RfinhscKPNc~~qkwSVNG~~Rv~L~A~rDi~kGEELtYDYN~k 1296 (1306)
T KOG1083|consen 1251 ARFINHSCKPNCEMQKWSVNGEYRVGLFALRDLPKGEELTYDYNFK 1296 (1306)
T ss_pred             ccccccccCCCCccccccccceeeeeeeecCCCCCCceEEEecccc
Confidence            44578999998853 233222 246789999999999999999764


No 13 
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=68.00  E-value=5.9  Score=37.56  Aligned_cols=35  Identities=20%  Similarity=0.202  Sum_probs=29.9

Q ss_pred             CCcEEEeeCCCceEEEEcccCCCCCEEEEcCCCCc
Q 018186           91 QKMAIQKVDVGERGLVALKNIRKGEKLLFVPPSLV  125 (359)
Q Consensus        91 ~~v~i~~~~~~GrGl~At~~I~~ge~ll~IP~~l~  125 (359)
                      .++.+....+.||||+|+++++.|+.|+.--=+++
T Consensus       256 egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdli  290 (392)
T KOG1085|consen  256 EGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLI  290 (392)
T ss_pred             cceeEEeeccccceeEeecccccCceEEEEeccee
Confidence            57888888899999999999999999987655443


No 14 
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=66.99  E-value=2.8  Score=44.94  Aligned_cols=53  Identities=26%  Similarity=0.384  Sum_probs=37.4

Q ss_pred             ecccCCCCcceeE---EeeCCCCe---EEEEEcCcCCCCceEEecCCC----CChHHHHhcCC
Q 018186          267 DMLNHSCEVETFL---DYDKSSQG---VVFTTDRQYQPGEQVFISYGK----KSNGELLLSYG  319 (359)
Q Consensus       267 Dm~NH~~~~~~~~---~~d~~~~~---~~l~a~r~i~~GeEv~isYG~----~sN~~LL~~YG  319 (359)
                      -++||++.+|..+   -+|..+-.   +.+.+.+-|++|.|++..|+=    -..-+|+..-|
T Consensus      1191 RfLNHSC~PNl~VQnVfvdTHdlrfPwVAFFt~kyVkAgtELTWDY~Ye~g~v~~keL~C~CG 1253 (1262)
T KOG1141|consen 1191 RFLNHSCDPNLHVQNVFVDTHDLRFPWVAFFTRKYVKAGTELTWDYQYEQGQVATKELTCHCG 1253 (1262)
T ss_pred             hhhccCCCccceeeeeeeeccccCCchhhhhhhhhhccCceeeeeccccccccccceEEEecC
Confidence            4799999998654   24433333   456788999999999999973    34456666655


No 15 
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=65.13  E-value=9.7  Score=38.16  Aligned_cols=60  Identities=28%  Similarity=0.416  Sum_probs=45.2

Q ss_pred             ceEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCc-eEEecCCCC--C----hHHHHhcCCc
Q 018186          259 RVALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGE-QVFISYGKK--S----NGELLLSYGF  320 (359)
Q Consensus       259 ~~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~Ge-Ev~isYG~~--s----N~~LL~~YGF  320 (359)
                      ..+|.|..=++||++.+|+...|+  +....+.+...+.+++ +++++|-..  +    ...|-..|.|
T Consensus       199 ~~~l~~~~~~~~hsC~pn~~~~~~--~~~~~~~~~~~~~~~~~~l~~~y~~~~~~~~~r~~~l~~~~~f  265 (482)
T KOG2084|consen  199 GRGLFPGSSLFNHSCFPNISVIFD--GRGLALLVPAGIDAGEEELTISYTDPLLSTASRQKQLRQSKLF  265 (482)
T ss_pred             eeeecccchhcccCCCCCeEEEEC--CceeEEEeecccCCCCCEEEEeecccccCHHHHHHHHhhccce
Confidence            468999999999999999987775  4556666666777766 999999762  2    3456666667


No 16 
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=56.32  E-value=2.4  Score=41.94  Aligned_cols=78  Identities=17%  Similarity=0.088  Sum_probs=59.2

Q ss_pred             ceEeeeeeecccCCCCcce--eEEeeCCCCeEEEEEcCcCCCCceEEecCCCCChHHHHhcCC-cccCCCCCCCCeEEEe
Q 018186          259 RVALVPWADMLNHSCEVET--FLDYDKSSQGVVFTTDRQYQPGEQVFISYGKKSNGELLLSYG-FVPREGTNPSDSVELP  335 (359)
Q Consensus       259 ~~~LvP~~Dm~NH~~~~~~--~~~~d~~~~~~~l~a~r~i~~GeEv~isYG~~sN~~LL~~YG-Fv~~~~~Np~D~v~L~  335 (359)
                      ..+++|+++|+|-.-.-..  .+-+| ..+...+++.|.|  +.|+.+.|+...+.++..+|| |+-. +--|++.+.+ 
T Consensus       269 ~ka~c~gihm~~g~~~l~niv~~l~D-~~~d~tm~~~R~i--l~ql~nt~teld~~e~~~syd~ftkk-E~~p~~g~lv-  343 (466)
T KOG1338|consen  269 TKALCVGIHMVWGILKLYNIVQILMD-VPNDDTMRNMRLI--LLQLHNTRTELDINEFHSSYDTFTKK-EVKPAIGKLV-  343 (466)
T ss_pred             hhhccceeeeecceeecchHHHHHhc-CCCcchHHHHHHH--HHHhccchhhhhhHHHHHhhhhhhhc-cccccceeee-
Confidence            4689999999988755322  12344 3566788899988  999999999999999999999 5544 5668887777 


Q ss_pred             eccCCC
Q 018186          336 LSLKKS  341 (359)
Q Consensus       336 l~l~~~  341 (359)
                      +.+++.
T Consensus       344 ~glpq~  349 (466)
T KOG1338|consen  344 IGLPQS  349 (466)
T ss_pred             eechhh
Confidence            466664


No 17 
>COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) [Translation, ribosomal structure and biogenesis]
Probab=47.69  E-value=18  Score=28.86  Aligned_cols=55  Identities=15%  Similarity=0.367  Sum_probs=39.1

Q ss_pred             CCCHHHHHHHHhhhhhcceecCCCCCceEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCceEEecCCCC
Q 018186          233 VFNMETFKWSFGILFSRLVRLPSMDGRVALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGEQVFISYGKK  309 (359)
Q Consensus       233 ~~t~~~f~WA~~~V~SRaf~~~~~~~~~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~GeEv~isYG~~  309 (359)
                      ..-++.|+|+..++-+|+..-             ||++-     ..+.++    +-...+.++++.|++|.|.||.+
T Consensus         8 ~mRLDKwL~~aR~~KrRslAk-------------~~~~~-----GrV~vN----G~~aKpS~~VK~GD~l~i~~~~~   62 (100)
T COG1188           8 RMRLDKWLWAARFIKRRSLAK-------------EMIEG-----GRVKVN----GQRAKPSKEVKVGDILTIRFGNK   62 (100)
T ss_pred             ceehHHHHHHHHHhhhHHHHH-------------HHHHC-----CeEEEC----CEEcccccccCCCCEEEEEeCCc
Confidence            356899999999999998752             23221     123342    23348889999999999999974


No 18 
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=47.41  E-value=19  Score=36.10  Aligned_cols=35  Identities=26%  Similarity=0.539  Sum_probs=31.4

Q ss_pred             CCeEEEEEcCcCCCCceEEecCCCCChHHHHhcCC
Q 018186          285 SQGVVFTTDRQYQPGEQVFISYGKKSNGELLLSYG  319 (359)
Q Consensus       285 ~~~~~l~a~r~i~~GeEv~isYG~~sN~~LL~~YG  319 (359)
                      +..+-+++.|+|++||||.+-||.--+.+|...+|
T Consensus       121 ~~~Ifyrt~r~I~p~eELlVWY~~e~~~~L~~~~~  155 (396)
T KOG2461|consen  121 GENIFYRTIRDIRPNEELLVWYGSEYAEELAYGHG  155 (396)
T ss_pred             cCceEEEecccCCCCCeEEEEeccchHhHhcccCC
Confidence            45688899999999999999999988888888888


No 19 
>TIGR02059 swm_rep_I cyanobacterial long protein repeat. This domain appears in 29 copies in a large (10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.
Probab=46.18  E-value=36  Score=27.18  Aligned_cols=29  Identities=24%  Similarity=0.398  Sum_probs=24.2

Q ss_pred             EeeCCCCeEEEEEcCcCCCCceEEecCCC
Q 018186          280 DYDKSSQGVVFTTDRQYQPGEQVFISYGK  308 (359)
Q Consensus       280 ~~d~~~~~~~l~a~r~i~~GeEv~isYG~  308 (359)
                      ..+.....+.+...+.|..||+|.++|-+
T Consensus        69 sV~~s~ktVTLTL~~~V~~Gq~VTVsYt~   97 (101)
T TIGR02059        69 SLGGSNTTITLTLAQVVEDGDEVTLSYTK   97 (101)
T ss_pred             EEcCcccEEEEEecccccCCCEEEEEeeC
Confidence            34545568999999999999999999965


No 20 
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=41.40  E-value=34  Score=26.55  Aligned_cols=28  Identities=29%  Similarity=0.437  Sum_probs=19.9

Q ss_pred             CCcEEEeeCCC---ceEEEEcccCCCCCEEE
Q 018186           91 QKMAIQKVDVG---ERGLVALKNIRKGEKLL  118 (359)
Q Consensus        91 ~~v~i~~~~~~---GrGl~At~~I~~ge~ll  118 (359)
                      +++.+......   ...++|+++|++||.|.
T Consensus        83 pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~  113 (116)
T smart00317       83 PNCELLFVEVNGDSRIVIFALRDIKPGEELT  113 (116)
T ss_pred             CCEEEEEEEECCCcEEEEEECCCcCCCCEEe
Confidence            45555543333   37889999999999985


No 21 
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=40.11  E-value=26  Score=37.19  Aligned_cols=29  Identities=28%  Similarity=0.297  Sum_probs=25.4

Q ss_pred             CCcEEEeeCCCceEEEEcccCCCCCEEEE
Q 018186           91 QKMAIQKVDVGERGLVALKNIRKGEKLLF  119 (359)
Q Consensus        91 ~~v~i~~~~~~GrGl~At~~I~~ge~ll~  119 (359)
                      -+|++-.+...|.||.|.++|++|+.|+.
T Consensus       120 A~vevF~Te~KG~GLRA~~dI~~g~FI~E  148 (729)
T KOG4442|consen  120 AKVEVFLTEKKGCGLRAEEDIPKGQFILE  148 (729)
T ss_pred             CceeEEEecCcccceeeccccCCCcEEee
Confidence            35777777889999999999999999986


No 22 
>PF10281 Ish1:  Putative stress-responsive nuclear envelope protein;  InterPro: IPR018803  This group of proteins, found primarily in fungi, consists of putative stress-responsive nuclear envelope protein Ish1 and homologues.
Probab=34.03  E-value=39  Score=21.71  Aligned_cols=16  Identities=38%  Similarity=0.856  Sum_probs=13.6

Q ss_pred             HHHHHHHHHhCCCCCC
Q 018186           76 ASTLQKWLSDSGLPPQ   91 (359)
Q Consensus        76 ~~~l~~Wl~~~G~~~~   91 (359)
                      ..+|.+||.++|+..+
T Consensus         6 ~~~L~~wL~~~gi~~~   21 (38)
T PF10281_consen    6 DSDLKSWLKSHGIPVP   21 (38)
T ss_pred             HHHHHHHHHHcCCCCC
Confidence            3689999999999864


No 23 
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=31.40  E-value=45  Score=37.41  Aligned_cols=45  Identities=13%  Similarity=0.201  Sum_probs=31.6

Q ss_pred             hcHHHHHHHHHhCCCCCCCcEEEeeCCCceEEEEcccCCCCCEEEE
Q 018186           74 ENASTLQKWLSDSGLPPQKMAIQKVDVGERGLVALKNIRKGEKLLF  119 (359)
Q Consensus        74 ~~~~~l~~Wl~~~G~~~~~v~i~~~~~~GrGl~At~~I~~ge~ll~  119 (359)
                      ....+++.|.+.+--. ..|.+....-.|.||||.++|.+||.||.
T Consensus       850 ~~~~~~~~~~~~~~rk-k~~~F~~s~iH~wglfa~~~i~~~dmViE  894 (1005)
T KOG1080|consen  850 LDEAEVLRYNQLKFRK-KYVKFGRSGIHGWGLFAMENIAAGDMVIE  894 (1005)
T ss_pred             cchHHHHHHHHHhhhh-hhhccccccccccceeeccCccccceEEE
Confidence            3345566665543111 23667766678999999999999999975


No 24 
>PF08666 SAF:  SAF domain;  InterPro: IPR013974  This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=29.35  E-value=33  Score=24.05  Aligned_cols=15  Identities=33%  Similarity=0.512  Sum_probs=11.2

Q ss_pred             eEEEEcccCCCCCEE
Q 018186          103 RGLVALKNIRKGEKL  117 (359)
Q Consensus       103 rGl~At~~I~~ge~l  117 (359)
                      +-++|+++|++|+.|
T Consensus         2 ~vvVA~~di~~G~~i   16 (63)
T PF08666_consen    2 RVVVAARDIPAGTVI   16 (63)
T ss_dssp             SEEEESSTB-TT-BE
T ss_pred             cEEEEeCccCCCCEE
Confidence            358999999999998


No 25 
>PF11629 Mst1_SARAH:  C terminal SARAH domain of Mst1;  InterPro: IPR024205 The SARAH (Sav/Rassf/Hpo) domain is found at the C terminus in three classes of eukaryotic tumour suppressors that give the domain its name. In the Sav (Salvador) and Hpo (Hippo) families, the SARAH domain mediates signal transduction from Hpo via the Sav scaffolding protein to the downstream component Wts (Warts); the phosphorylation of Wts by Hpo triggers cell cycle arrest and apoptosis by down-regulating cyclin E, Diap 1 and other targets. The SARAH domain is also involved in dimerisation, as in the human Hpo orthologue, Mst1, which homodimerises via its C-terminal SARAH domain. The SARAH domain is found associated with other domains, such as protein kinase domains, WW/rsp5/WWP domain (IPR001202 from INTERPRO), C1 domain (IPR002219 from INTERPRO), LIM domain (IPR001781 from INTERPRO), or the Ras-associating (RA) domain (IPR000159 from INTERPRO).; GO: 0004674 protein serine/threonine kinase activity; PDB: 2JO8_A.
Probab=24.12  E-value=2.5e+02  Score=19.42  Aligned_cols=40  Identities=20%  Similarity=0.321  Sum_probs=27.7

Q ss_pred             ccccCHHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHh
Q 018186          182 LLYWTRAELDRYLEASQIRERAIERITNVIGTYNDLRLRIFS  223 (359)
Q Consensus       182 pl~w~~~el~~lL~gt~l~~~~~~~~~~~~~~y~~~~~~l~~  223 (359)
                      .-.|+.+|+++.|  ..|...+.+-++.++..|..-+++|..
T Consensus         5 Lk~ls~~eL~~rl--~~LD~~ME~Eieelr~RY~~KRqPIld   44 (49)
T PF11629_consen    5 LKFLSYEELQQRL--ASLDPEMEQEIEELRQRYQAKRQPILD   44 (49)
T ss_dssp             GGGS-HHHHHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HhhCCHHHHHHHH--HhCCHHHHHHHHHHHHHHHHhhccHHH
Confidence            3468889988644  345555666677888899888888764


No 26 
>PF09652 Cas_VVA1548:  Putative CRISPR-associated protein (Cas_VVA1548);  InterPro: IPR013443 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.   This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPR repeats. In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas genes.
Probab=22.93  E-value=57  Score=25.72  Aligned_cols=42  Identities=17%  Similarity=0.400  Sum_probs=26.7

Q ss_pred             hcHHHHHHHHHhCCCCCCCcEEEeeCCCceEEEEcccCCCCCEEE-EcCCCC
Q 018186           74 ENASTLQKWLSDSGLPPQKMAIQKVDVGERGLVALKNIRKGEKLL-FVPPSL  124 (359)
Q Consensus        74 ~~~~~l~~Wl~~~G~~~~~v~i~~~~~~GrGl~At~~I~~ge~ll-~IP~~l  124 (359)
                      .+..-.++|++++|+.++.+.- ..+        ..+|.+|++|+ ++|..+
T Consensus         4 sRH~GAieW~~~qg~~iD~~v~-Hld--------~~~i~~GD~ViGtLPvhL   46 (93)
T PF09652_consen    4 SRHPGAIEWAKQQGIQIDHFVD-HLD--------PADIQPGDVVIGTLPVHL   46 (93)
T ss_pred             eecccHHHHHHHhCCCcceeec-cCC--------HHHccCCCEEEEeCcHHH
Confidence            3455678999999987654321 211        56788888775 445443


No 27 
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=21.36  E-value=35  Score=34.91  Aligned_cols=51  Identities=14%  Similarity=0.178  Sum_probs=34.7

Q ss_pred             hhcHHHHHHHHHhCCCCC-CCcEEEeeCCCceEEEEc-ccCCCCCEEEEcCCC
Q 018186           73 LENASTLQKWLSDSGLPP-QKMAIQKVDVGERGLVAL-KNIRKGEKLLFVPPS  123 (359)
Q Consensus        73 ~~~~~~l~~Wl~~~G~~~-~~v~i~~~~~~GrGl~At-~~I~~ge~ll~IP~~  123 (359)
                      +.+...|++|...+|+.. .++........|.+.++. +.+...+.+..+...
T Consensus         3 ~~~l~~~l~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~   55 (472)
T KOG1337|consen    3 VDVLSALLRWAQCNGISLSSSLDLRPDELKGLVRWAASESIASSENIKSLKFW   55 (472)
T ss_pred             hhHHHHhhhHHhccCccCCcccccCccccCcceeeeecccCCCccccccceec
Confidence            356789999999999986 456666655667777777 555555555444433


No 28 
>PF00856 SET:  SET domain;  InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9, human SET7 (also called SET9), which methylates H3 on lysine 4, and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although electron density maps revealed the location of the AdoMet or AdoHcy cofactor in all three studies, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities.  The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=21.28  E-value=1.2e+02  Score=24.49  Aligned_cols=27  Identities=26%  Similarity=0.352  Sum_probs=18.9

Q ss_pred             CcEEEee---CCCceEEEEcccCCCCCEEE
Q 018186           92 KMAIQKV---DVGERGLVALKNIRKGEKLL  118 (359)
Q Consensus        92 ~v~i~~~---~~~GrGl~At~~I~~ge~ll  118 (359)
                      ++.+...   .+...-++|+++|++||.|.
T Consensus       129 n~~~~~~~~~~~~~~~~~a~r~I~~GeEi~  158 (162)
T PF00856_consen  129 NCEVSFDFDGDGGCLVVRATRDIKKGEEIF  158 (162)
T ss_dssp             SEEEEEEEETTTTEEEEEESS-B-TTSBEE
T ss_pred             ccceeeEeecccceEEEEECCccCCCCEEE
Confidence            5666554   46678889999999999885


Done!