Query         013921
Match_columns 434
No_of_seqs    255 out of 1328
Neff          7.7 
Searched_HMMs 46136
Date          Fri Mar 29 08:43:33 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013921.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013921hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG1337 N-methyltransferase [G 100.0 2.6E-34 5.7E-39  299.5  20.9  346   62-428    47-413 (472)
  2 KOG1338 Uncharacterized conser 100.0 1.5E-32 3.2E-37  265.4  15.9  270   62-345     7-313 (466)
  3 PF00856 SET:  SET domain;  Int  99.7 1.5E-17 3.3E-22  146.5   6.6   53  239-297   110-162 (162)
  4 PF09273 Rubis-subs-bind:  Rubi  99.6 2.1E-15 4.5E-20  129.7   9.0  103  327-429     1-103 (128)
  5 smart00317 SET SET (Su(var)3-9  98.6 2.9E-08 6.4E-13   82.8   3.9   47  244-296    70-116 (116)
  6 KOG4442 Clathrin coat binding   94.9   0.025 5.4E-07   60.3   4.1   45  247-297   193-237 (729)
  7 KOG2589 Histone tail methylase  94.6   0.039 8.5E-07   54.6   4.1   47  242-298   192-238 (453)
  8 KOG1085 Predicted methyltransf  94.5   0.033 7.1E-07   53.4   3.3   52  248-306   334-385 (392)
  9 KOG1079 Transcriptional repres  88.3    0.42   9E-06   51.1   3.4   45  247-297   665-709 (739)
 10 KOG1080 Histone H3 (Lys4) meth  87.3    0.56 1.2E-05   53.2   3.8   45  247-297   939-983 (1005)
 11 KOG1083 Putative transcription  83.4     1.2 2.6E-05   50.0   4.0   44  249-298  1252-1295(1306)
 12 COG2940 Proteins containing SE  72.9     1.9 4.2E-05   45.5   1.7   45  247-298   405-450 (480)
 13 KOG1082 Histone H3 (Lys9) meth  72.9     3.3 7.2E-05   42.1   3.4   49  247-298   272-321 (364)
 14 COG1188 Ribosome-associated he  49.8      20 0.00044   29.3   3.3   54  216-298     8-61  (100)
 15 KOG2461 Transcription factor B  35.1      33 0.00072   35.3   3.0   35  275-310   121-155 (396)
 16 KOG1085 Predicted methyltransf  34.0      34 0.00074   33.3   2.7   29   85-113   260-289 (392)
 17 PF08666 SAF:  SAF domain;  Int  28.5      32  0.0007   24.8   1.2   15   92-106     2-16  (63)
 18 PF10281 Ish1:  Putative stress  24.8      69  0.0015   21.0   2.2   17   64-80      6-22  (38)
 19 PF09652 Cas_VVA1548:  Putative  24.8      62  0.0014   26.1   2.3   40   64-113     6-46  (93)
 20 KOG1337 N-methyltransferase [G  23.9      52  0.0011   34.6   2.3  121   62-186     4-136 (472)
 21 TIGR02059 swm_rep_I cyanobacte  23.6      88  0.0019   25.6   2.9   25  274-298    73-97  (101)
 22 TIGR02620 cas_VVA1548 putative  22.3      69  0.0015   25.8   2.1   39   65-113     7-46  (93)
 23 KOG1081 Transcription factor N  20.1      41  0.0009   35.4   0.6   45  248-298   372-416 (463)

No 1  
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=100.00  E-value=2.6e-34  Score=299.48  Aligned_cols=346  Identities=32%  Similarity=0.429  Sum_probs=259.0

Q ss_pred             hhHHHHHHHHHhCCCCCCCCCeeecccCCccEEEEccCCCCCCEEEEecCCCccCccccccchhhhhhcCCChh-HHHHH
Q 013921           62 AQVETFWQWLRDQKVVSPKSPIRPATFPEGLGLVAQRDIAKNEVVLEVPMKFWINPDTVAASEIGSLCSGLKPW-ISVAL  140 (434)
Q Consensus        62 ~~~~~f~~Wl~~~G~~~~~~~v~~~~~~~GrGl~A~~~I~~ge~ll~IP~~~~ls~~~~~~~~~~~~~~~l~~~-~~Lal  140 (434)
                      +..+.+.-|.+..|...... ...+....++++.+..++..++.+..+|....+..+....         .+.. ..+++
T Consensus        47 ~~~~~~~~~~~~~g~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~~~l~~  116 (472)
T KOG1337|consen   47 ENIKSLKFWLTGNGLSSSKS-SLPGNDIDEWPLLVSIRLIKGEKLLLVPPLLLLIAKRKPY---------NDLLPIALAL  116 (472)
T ss_pred             cccccceeccccCCcchhhh-ccccccccccchhhhhhhhhhhhhccCCchhhhccccccC---------ccccHHHHHH
Confidence            55666666666666654431 1122222355666666665666555555554444333221         1122 68899


Q ss_pred             HHHHHh-cCCCCCcHHHHhhcCCCCCCccccCHhHHhhcCCCchHHHHHHHHHHHHHHHHHHHHHHhccCCCCCC----C
Q 013921          141 FLIREK-KKEDSPWRVYLDILPECTDSTVFWSEEELVELQGTQLLSTTLGVKEYVQNEYLKVEEEIILPNKQLFP----R  215 (434)
Q Consensus       141 ~Ll~E~-~~~~S~w~pYl~~LP~~~~~pl~w~~~el~~L~gt~l~~~~~~~~~~~~~~~~~l~~~l~~~~~~~f~----~  215 (434)
                      ++++|+ .+..|.|++|+..||..+++|++|...++..|++++....+..++..++..+..+.+ +...++..++    +
T Consensus       117 ~l~~~~~~~~~s~w~~~i~~l~~~~~~p~~~~~~~v~~l~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~d  195 (472)
T KOG1337|consen  117 FLLLEWAHGEISKWKPYISTLPSQYNSPLLWSEDEVKSLLSTPLFEIVASRRQNLVNKSAELLE-VLQSHPSLFGSDLFD  195 (472)
T ss_pred             HHHHhhhccccccchhhhhhchhhcCCccccCHHHHHHhhcchhhHHHHHHHHHhhhhHHHHHH-HHHhccccccccccC
Confidence            999999 777799999999999999999999999999999999999988888777776555543 3334443332    2


Q ss_pred             CCCHHHHHHHHHHHHhccccccCC---------CcEEEeeccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCC
Q 013921          216 PITLDDFLWAFGILRSRAFSRLRG---------QNLVLIPLADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPV  286 (434)
Q Consensus       216 ~~t~e~f~WA~~~V~SRaf~~~~~---------~~~~LvP~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i  286 (434)
                      .+++++|.||+++|.||+|+...+         +..+|+|++||+||++...         ..+++..++.+.+++.++|
T Consensus       196 ~~~~~~~~w~~~~~~sr~~~~~~~~~~~~~~~~~~~~L~P~~D~~NH~~~~~---------~~~~~~~d~~~~l~~~~~v  266 (472)
T KOG1337|consen  196 TFTFSAFKWAYSIVNSRAFYLPSLQRLTAGDPDDNEALAPLIDLLNHSPEVI---------KAGYNQEDEAVELVAERDV  266 (472)
T ss_pred             ccchHHHHHHHHHHhhhhhccccccccccCCCCcchhhhhhHHhhccCchhc---------cccccCCCCcEEEEEeeee
Confidence            379999999999999999986422         3679999999999998752         1234445668999999999


Q ss_pred             CCCCeEEeccCCCCCcHHHHHhCCcCCCCCCCceEEEEeecCCCCcChhhHHHHHHHCCCCCccEEEEecCCCCCHHHHH
Q 013921          287 KAGEQVLIQYDLNKSNAELALDYGFIESKSDRNAYTLTLEISESDPFFGDKLDIAETNGLGESAYFDIVLGRTLPPAMLQ  366 (434)
Q Consensus       287 ~~GeEv~i~YG~~~sN~~LL~~YGFv~~~Np~D~v~l~l~i~~~d~~~~~K~~lL~~~gl~~~~~f~l~~~~~~~~~Ll~  366 (434)
                      ++||||||+||+ ++|++||++|||+.++||+|.|.+.+.++..|+.+..|...+.++++.....|.+...+....+++.
T Consensus       267 ~~geevfi~YG~-~~N~eLL~~YGFv~~~N~~d~v~l~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  345 (472)
T KOG1337|consen  267 SAGEEVFINYGP-KSNAELLLHYGFVEEDNPYDSVTLKLALPPEDVSYLDKSDVLKKNGLPSSGEFSILLTGEPVSEMLL  345 (472)
T ss_pred             cCCCeEEEecCC-CchHHHHHhcCCCCCCCCcceEEEeecccccccchhHHHHHHhhcCCCCCceEEEeecCCchhhhhh
Confidence            999999999997 9999999999999999999999999999999999999999999999999999988776555566655


Q ss_pred             HHHHHhcCCCcH--HhHH---HHhhcccccCCCCCCChHHHHHHHHHHHHH-HHHHHhcCCCchhhhc
Q 013921          367 YLRLVALGGTDA--FLLE---SIFRNTIWGHLDLPVSHANEELICRVVRDA-CKSALSGFHTTIEEVN  428 (434)
Q Consensus       367 ~lRl~~~~~~e~--~~~~---~~~~~~~~g~~~~~vS~~nE~~~~~~L~~~-~~~~L~~y~TTieeDe  428 (434)
                      ..+++.+.....  ..+.   ...+...+.....+++.++|...+..+... |...+..+.+++++|+
T Consensus       346 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~l~~~~~~~~~~~~~  413 (472)
T KOG1337|consen  346 LFLLLDALSERLESELVCEETSISRSCEEFLSGLPVSLDNEQKLLYGLQKLLCSLTLRVFKALIDEDE  413 (472)
T ss_pred             hhhhhccccccchhhhhhhhcccccccccccccCceeecchHHHHHHHhhccccchhcccchhhhhhh
Confidence            555443333322  1111   122334455567789999999999999999 9999999999995554


No 2  
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=100.00  E-value=1.5e-32  Score=265.38  Aligned_cols=270  Identities=22%  Similarity=0.351  Sum_probs=212.4

Q ss_pred             hhHHHHHHHHHhCC-CCCCCCCeeeccc---CC--ccEEEEccCCCCCCEEEEecCCCccCcccccc-----chhhhhhc
Q 013921           62 AQVETFWQWLRDQK-VVSPKSPIRPATF---PE--GLGLVAQRDIAKNEVVLEVPMKFWINPDTVAA-----SEIGSLCS  130 (434)
Q Consensus        62 ~~~~~f~~Wl~~~G-~~~~~~~v~~~~~---~~--GrGl~A~~~I~~ge~ll~IP~~~~ls~~~~~~-----~~~~~~~~  130 (434)
                      +..+.|+.|++..+ .+.++ +|.+.+.   .+  |+|++|+++|++||.++.+|++++++..+...     +....++.
T Consensus         7 d~~~~fl~w~k~t~eletSp-Ki~~ndl~~v~~~~G~g~vAtesIkkgE~Lf~~prdsvLsvtts~li~~lps~~rv~Ln   85 (466)
T KOG1338|consen    7 DLAKRFLLWGKLTLELETSP-KIDNNDLPWVERIAGAGIVATESIKKGESLFAYPRDSVLSVTTSALITPLPSDIRVLLN   85 (466)
T ss_pred             cHHHHHHHHHHHhhheeecc-cccccccchhhhhcccceeeehhhcCCceEEEecCccEEeeehHHhcccchHHHHHHhh
Confidence            45789999999987 77766 4555442   23  89999999999999999999999999877532     12333455


Q ss_pred             CCChhHHHHHHHHHHh-cCCCCCcHHHHhhcCCC--CCCccccCHhHHhhcCCCchHHHHHHHHHHHHHHHHHHHHHHhc
Q 013921          131 GLKPWISVALFLIREK-KKEDSPWRVYLDILPEC--TDSTVFWSEEELVELQGTQLLSTTLGVKEYVQNEYLKVEEEIIL  207 (434)
Q Consensus       131 ~l~~~~~Lal~Ll~E~-~~~~S~w~pYl~~LP~~--~~~pl~w~~~el~~L~gt~l~~~~~~~~~~~~~~~~~l~~~l~~  207 (434)
                      +.+.|..|++.|++|. .+.+|+|+||++.+|+.  .++|+||+++|++.|..+.+++++.+.++.++++|....+++.+
T Consensus        86 e~gsw~~Lllvll~E~~~pq~SrWrPYfs~wp~p~rm~spifWdEnEl~~Ll~stvlee~~Kd~aeI~~~~i~~i~pf~~  165 (466)
T KOG1338|consen   86 EVGSWGMLLLVLLREKKMPQKSRWRPYFSRWPQPARMHSPIFWDENELSMLLCSTVLEETVKDKAEIEKDFIFVIQPFKQ  165 (466)
T ss_pred             cCCcHHHHHHHHHHHhhcccccccccHHHhCCChhhcCCCccCCchHHHHHhhcccchhhHhHHHHHHHHHHHHHHHHHH
Confidence            7899999999999999 45559999999999984  78999999999998655556666888899999999999999999


Q ss_pred             cCCCCCCCCCCHHHHHHHHHHHHhccccccC--------------CCcEEEeeccccccCCCCCCCCCceEEecCCCccC
Q 013921          208 PNKQLFPRPITLDDFLWAFGILRSRAFSRLR--------------GQNLVLIPLADLINHSPGITTEDYAYEIKGAGLFS  273 (434)
Q Consensus       208 ~~~~~f~~~~t~e~f~WA~~~V~SRaf~~~~--------------~~~~~LvP~~Dm~NH~~~~~~~~~~~~~~~~g~~~  273 (434)
                      .+|..|.. +++|+|..+++++.+.+|.+.-              -...+|+|.+||+||+......+..+         
T Consensus       166 ~~p~vfs~-~slEdF~y~~Al~laysfdve~~~s~~~~eee~e~e~ngk~m~p~ad~lNhd~~k~nanl~y---------  235 (466)
T KOG1338|consen  166 HCPIVFSR-PSLEDFMYAYALGLAYSFDVEFLLSLDNLEEESEIECNGKLMTPIADFLNHDGLKANANLRY---------  235 (466)
T ss_pred             hCcchhcc-cCHHHHHHHHHHHHHHheeeehhcchhhhhhhhccccCcccccchhhhhccchhhcccceec---------
Confidence            89888844 7999999999999999997631              12569999999999997643223232         


Q ss_pred             CCceEEEEeCCCCCCCCeEEeccCCCCCcHHHHHhCCcCCCCCC---------CceEEEEeecCCCCcChhhHHHHHHHC
Q 013921          274 RDLLFSLRTPVPVKAGEQVLIQYDLNKSNAELALDYGFIESKSD---------RNAYTLTLEISESDPFFGDKLDIAETN  344 (434)
Q Consensus       274 ~~~~~~l~a~r~i~~GeEv~i~YG~~~sN~~LL~~YGFv~~~Np---------~D~v~l~l~i~~~d~~~~~K~~lL~~~  344 (434)
                      +++|+.|+|.|+|.+|+||+++||. ++|.  |++||.+.-.-.         +|.+.+-.+++.+++.+..|.-+++.+
T Consensus       236 ~~NcL~mva~r~iekgdev~n~dg~-~p~~--l~~l~ka~c~gihm~~g~~~l~niv~~l~D~~~d~tm~~~R~il~ql~  312 (466)
T KOG1338|consen  236 EDNCLEMVADRNIEKGDEVDNSDGL-KPMG--LLKLTKALCVGIHMVWGILKLYNIVQILMDVPNDDTMRNMRLILLQLH  312 (466)
T ss_pred             cCcceeeeecCCCCCcccccccccc-Ccch--hhhhhhhccceeeeecceeecchHHHHHhcCCCcchHHHHHHHHHHhc
Confidence            3679999999999999999999996 8888  888887765432         223333345667778777776655444


Q ss_pred             C
Q 013921          345 G  345 (434)
Q Consensus       345 g  345 (434)
                      +
T Consensus       313 n  313 (466)
T KOG1338|consen  313 N  313 (466)
T ss_pred             c
Confidence            3


No 3  
>PF00856 SET:  SET domain;  InterPro: IPR001214 The SET domain generally appears as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions have been described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control []. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities [].  The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [], []. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.70  E-value=1.5e-17  Score=146.50  Aligned_cols=53  Identities=25%  Similarity=0.421  Sum_probs=40.9

Q ss_pred             CCcEEEeeccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccC
Q 013921          239 GQNLVLIPLADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYD  297 (434)
Q Consensus       239 ~~~~~LvP~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG  297 (434)
                      .+..+|+|++||+||+..+   |+.+....+   ..++.+.++|.|+|++|||||++||
T Consensus       110 ~~~~~l~p~~d~~NHsc~p---n~~~~~~~~---~~~~~~~~~a~r~I~~GeEi~isYG  162 (162)
T PF00856_consen  110 RDGIALYPFADMLNHSCDP---NCEVSFDFD---GDGGCLVVRATRDIKKGEEIFISYG  162 (162)
T ss_dssp             EEEEEEETGGGGSEEESST---SEEEEEEEE---TTTTEEEEEESS-B-TTSBEEEEST
T ss_pred             ccccccCcHhHhecccccc---ccceeeEee---cccceEEEEECCccCCCCEEEEEEC
Confidence            3468999999999999976   455554310   1467999999999999999999999


No 4  
>PF09273 Rubis-subs-bind:  Rubisco LSMT substrate-binding;  InterPro: IPR015353 This domain adopts a multihelical structure, with an irregular array of long and short alpha-helices. It allows binding of the protein to substrate, such as the N-terminal tails of histones H3 and H4 and the large subunit of the Rubisco holoenzyme complex []. ; PDB: 3QXY_A 3RC0_A 1P0Y_A 2H2E_C 2H23_A 1MLV_C 2H2J_B 2H21_B 1OZV_C 3SMT_A.
Probab=99.61  E-value=2.1e-15  Score=129.72  Aligned_cols=103  Identities=31%  Similarity=0.515  Sum_probs=85.7

Q ss_pred             cCCCCcChhhHHHHHHHCCCCCccEEEEecCCCCCHHHHHHHHHHhcCCCcHHhHHHHhhcccccCCCCCCChHHHHHHH
Q 013921          327 ISESDPFFGDKLDIAETNGLGESAYFDIVLGRTLPPAMLQYLRLVALGGTDAFLLESIFRNTIWGHLDLPVSHANEELIC  406 (434)
Q Consensus       327 i~~~d~~~~~K~~lL~~~gl~~~~~f~l~~~~~~~~~Ll~~lRl~~~~~~e~~~~~~~~~~~~~g~~~~~vS~~nE~~~~  406 (434)
                      ++++||+++.|.++|+.+|+..+..|.+..++.+|++|++++||+++++++............++....|+|.+||.+++
T Consensus         1 l~~~D~l~~~K~~lL~~~gl~~~~~f~l~~~~~~~~~Ll~~lRv~~~~~~e~~~~~~~~~~~~~~~~~~~ls~~nE~~~l   80 (128)
T PF09273_consen    1 LSPSDPLFEEKKQLLEEHGLSGDQTFDLRADGPLPPELLAALRVLLMTEEELRALKSLADSSEWSDRSEPLSPENEIAAL   80 (128)
T ss_dssp             --TTSTTHHHHHHHHHHTTS-SEEEEEEECCSSSHHHHHHHHHHHHSCHHHHHHHHHCGTTTHCCHCCC-SBHHHHHHHH
T ss_pred             CCchhhhHHHHHHHHHHCCCCCCceeeeeCCCCCCHHHHHHHHHHHcChHHHHHHHHhhcccccccccCCCchhhHHHHH
Confidence            46789999999999999999988889999877789999999999999988877665544333333456789999999999


Q ss_pred             HHHHHHHHHHHhcCCCchhhhcc
Q 013921          407 RVVRDACKSALSGFHTTIEEVNV  429 (434)
Q Consensus       407 ~~L~~~~~~~L~~y~TTieeDe~  429 (434)
                      ++|.++|+.+|++|+||+|||+.
T Consensus        81 ~~L~~~~~~~L~~y~TtleeD~~  103 (128)
T PF09273_consen   81 QFLIDLCEARLSAYPTTLEEDEE  103 (128)
T ss_dssp             HHHHHHHHHHHTTSSS-HHHHHH
T ss_pred             HHHHHHHHHHHHhCCCcHHHHHH
Confidence            99999999999999999999975


No 5  
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=98.61  E-value=2.9e-08  Score=82.76  Aligned_cols=47  Identities=23%  Similarity=0.228  Sum_probs=36.3

Q ss_pred             EeeccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEecc
Q 013921          244 LIPLADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQY  296 (434)
Q Consensus       244 LvP~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~Y  296 (434)
                      +.|+++++||+..++   +.+.....   .....+.++|.|+|++||||+++|
T Consensus        70 ~~~~~~~iNHsc~pN---~~~~~~~~---~~~~~~~~~a~r~I~~GeEi~i~Y  116 (116)
T smart00317       70 KGNIARFINHSCEPN---CELLFVEV---NGDSRIVIFALRDIKPGEELTIDY  116 (116)
T ss_pred             cCcHHHeeCCCCCCC---EEEEEEEE---CCCcEEEEEECCCcCCCCEEeecC
Confidence            899999999999874   44443211   012369999999999999999999


No 6  
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=94.93  E-value=0.025  Score=60.25  Aligned_cols=45  Identities=27%  Similarity=0.422  Sum_probs=38.3

Q ss_pred             ccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccC
Q 013921          247 LADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYD  297 (434)
Q Consensus       247 ~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG  297 (434)
                      ++=++||+.++++..-.|.+.+      ...+-+.+.+.|++||||+..|+
T Consensus       193 laRFiNHSC~PNa~~~KWtV~~------~lRvGiFakk~I~~GEEITFDYq  237 (729)
T KOG4442|consen  193 LARFINHSCDPNAEVQKWTVPD------ELRVGIFAKKVIKPGEEITFDYQ  237 (729)
T ss_pred             HHHhhcCCCCCCceeeeeeeCC------eeEEEEeEecccCCCceeeEecc
Confidence            5668999999988777898853      45677889999999999999998


No 7  
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=94.57  E-value=0.039  Score=54.59  Aligned_cols=47  Identities=23%  Similarity=0.291  Sum_probs=36.5

Q ss_pred             EEEeeccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccCC
Q 013921          242 LVLIPLADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYDL  298 (434)
Q Consensus       242 ~~LvP~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG~  298 (434)
                      .-|=|-+ ++||+..+   |+.+...+      .+...+++.|||++||||+--||.
T Consensus       192 LwLGPaa-fINHDCrp---nCkFvs~g------~~tacvkvlRDIePGeEITcFYgs  238 (453)
T KOG2589|consen  192 LWLGPAA-FINHDCRP---NCKFVSTG------RDTACVKVLRDIEPGEEITCFYGS  238 (453)
T ss_pred             heeccHH-hhcCCCCC---CceeecCC------CceeeeehhhcCCCCceeEEeecc
Confidence            4455644 79999987   56665432      357889999999999999999996


No 8  
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=94.49  E-value=0.033  Score=53.37  Aligned_cols=52  Identities=27%  Similarity=0.310  Sum_probs=37.0

Q ss_pred             cccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccCCCCCcHHHH
Q 013921          248 ADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYDLNKSNAELA  306 (434)
Q Consensus       248 ~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG~~~sN~~LL  306 (434)
                      .-|+||+...+...-..+++      ....+.++|.++|.+|||+...||. +|-+.++
T Consensus       334 GRLINHS~~gNl~TKvv~Id------g~pHLiLvA~rdIa~GEELlYDYGD-RSkesi~  385 (392)
T KOG1085|consen  334 GRLINHSVRGNLKTKVVEID------GSPHLILVARRDIAQGEELLYDYGD-RSKESIA  385 (392)
T ss_pred             hhhhcccccCcceeeEEEec------CCceEEEEeccccccchhhhhhccc-cchhHHh
Confidence            34789997654311122333      2568999999999999999999995 7765544


No 9  
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=88.27  E-value=0.42  Score=51.07  Aligned_cols=45  Identities=18%  Similarity=0.225  Sum_probs=32.4

Q ss_pred             ccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccC
Q 013921          247 LADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYD  297 (434)
Q Consensus       247 ~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG  297 (434)
                      .+-++||+..+++..-..-+.      .+..+-+.|.|.|.+|||+|..|+
T Consensus       665 k~rFANHS~nPNCYAkvm~V~------GdhRIGifAkRaIeagEELffDYr  709 (739)
T KOG1079|consen  665 KIRFANHSFNPNCYAKVMMVA------GDHRIGIFAKRAIEAGEELFFDYR  709 (739)
T ss_pred             hhhhccCCCCCCcEEEEEEec------CCcceeeeehhhcccCceeeeeec
Confidence            456789999774311112223      245788999999999999999998


No 10 
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=87.26  E-value=0.56  Score=53.21  Aligned_cols=45  Identities=22%  Similarity=0.303  Sum_probs=34.5

Q ss_pred             ccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccC
Q 013921          247 LADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYD  297 (434)
Q Consensus       247 ~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG  297 (434)
                      ++=++||+..+++..-...+.      ++..++|+|.|+|.+||||+..|-
T Consensus       939 iAr~InHsC~PNCyakvi~V~------g~~~IvIyakr~I~~~EElTYDYk  983 (1005)
T KOG1080|consen  939 IARFINHSCNPNCYAKVITVE------GDKRIVIYSKRDIAAGEELTYDYK  983 (1005)
T ss_pred             hhheeecccCCCceeeEEEec------CeeEEEEEEecccccCceeeeecc
Confidence            667899999986432222333      356899999999999999999996


No 11 
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=83.40  E-value=1.2  Score=50.00  Aligned_cols=44  Identities=23%  Similarity=0.357  Sum_probs=36.4

Q ss_pred             ccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccCC
Q 013921          249 DLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYDL  298 (434)
Q Consensus       249 Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG~  298 (434)
                      -+.||+..+++....|.++|      ...+.|.|.|||.+||||+..|.-
T Consensus      1252 RfinhscKPNc~~qkwSVNG------~~Rv~L~A~rDi~kGEELtYDYN~ 1295 (1306)
T KOG1083|consen 1252 RFINHSCKPNCEMQKWSVNG------EYRVGLFALRDLPKGEELTYDYNF 1295 (1306)
T ss_pred             cccccccCCCCccccccccc------eeeeeeeecCCCCCCceEEEeccc
Confidence            34789988888777888864      457888999999999999999973


No 12 
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=72.94  E-value=1.9  Score=45.49  Aligned_cols=45  Identities=27%  Similarity=0.271  Sum_probs=31.7

Q ss_pred             ccccccCCCCCCCCCceEEecC-CCccCCCceEEEEeCCCCCCCCeEEeccCC
Q 013921          247 LADLINHSPGITTEDYAYEIKG-AGLFSRDLLFSLRTPVPVKAGEQVLIQYDL  298 (434)
Q Consensus       247 ~~Dm~NH~~~~~~~~~~~~~~~-~g~~~~~~~~~l~a~r~i~~GeEv~i~YG~  298 (434)
                      +.=++||+..++.   .....+ .|    ...+..++.+||++||||.+.||.
T Consensus       405 ~~r~~nHS~~pN~---~~~~~~~~g----~~~~~~~~~rDI~~geEl~~dy~~  450 (480)
T COG2940         405 VARFINHSCTPNC---EASPIEVNG----IFKISIYAIRDIKAGEELTYDYGP  450 (480)
T ss_pred             ccceeecCCCCCc---ceecccccc----cceeeecccccchhhhhhcccccc
Confidence            4448999997754   332211 11    236778899999999999999996


No 13 
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=72.90  E-value=3.3  Score=42.09  Aligned_cols=49  Identities=20%  Similarity=0.255  Sum_probs=34.1

Q ss_pred             ccccccCCCCCCCCCceEEe-cCCCccCCCceEEEEeCCCCCCCCeEEeccCC
Q 013921          247 LADLINHSPGITTEDYAYEI-KGAGLFSRDLLFSLRTPVPVKAGEQVLIQYDL  298 (434)
Q Consensus       247 ~~Dm~NH~~~~~~~~~~~~~-~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG~  298 (434)
                      ++=++||+..++   ..|.. ..++....-..+.+.+.++|.+|+|++..||.
T Consensus       272 v~RfinHSC~PN---~~~~~v~~~~~~~~~~~i~ffa~~~I~p~~ELT~dYg~  321 (364)
T KOG1082|consen  272 VARFINHSCSPN---LLYQAVFQDEFVLLYLRIGFFALRDISPGEELTLDYGK  321 (364)
T ss_pred             ccccccCCCCcc---ceeeeeeecCCccchheeeeeeccccCCCcccchhhcc
Confidence            456799999874   33322 11122223456788899999999999999996


No 14 
>COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) [Translation, ribosomal structure and biogenesis]
Probab=49.78  E-value=20  Score=29.28  Aligned_cols=54  Identities=20%  Similarity=0.481  Sum_probs=38.4

Q ss_pred             CCCHHHHHHHHHHHHhccccccCCCcEEEeeccccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEec
Q 013921          216 PITLDDFLWAFGILRSRAFSRLRGQNLVLIPLADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQ  295 (434)
Q Consensus       216 ~~t~e~f~WA~~~V~SRaf~~~~~~~~~LvP~~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~  295 (434)
                      ..-+|.|+|+.-++-+|+.--            ||++-..        ..++        +. ..++.++++.||+|.|.
T Consensus         8 ~mRLDKwL~~aR~~KrRslAk------------~~~~~Gr--------V~vN--------G~-~aKpS~~VK~GD~l~i~   58 (100)
T COG1188           8 RMRLDKWLWAARFIKRRSLAK------------EMIEGGR--------VKVN--------GQ-RAKPSKEVKVGDILTIR   58 (100)
T ss_pred             ceehHHHHHHHHHhhhHHHHH------------HHHHCCe--------EEEC--------CE-EcccccccCCCCEEEEE
Confidence            356899999999999999852            3333221        1121        12 23788999999999999


Q ss_pred             cCC
Q 013921          296 YDL  298 (434)
Q Consensus       296 YG~  298 (434)
                      ||.
T Consensus        59 ~~~   61 (100)
T COG1188          59 FGN   61 (100)
T ss_pred             eCC
Confidence            995


No 15 
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=35.07  E-value=33  Score=35.30  Aligned_cols=35  Identities=26%  Similarity=0.303  Sum_probs=28.7

Q ss_pred             CceEEEEeCCCCCCCCeEEeccCCCCCcHHHHHhCC
Q 013921          275 DLLFSLRTPVPVKAGEQVLIQYDLNKSNAELALDYG  310 (434)
Q Consensus       275 ~~~~~l~a~r~i~~GeEv~i~YG~~~sN~~LL~~YG  310 (434)
                      ...+-.++.|+|.+||||.+-||. --+.+|...+|
T Consensus       121 ~~~Ifyrt~r~I~p~eELlVWY~~-e~~~~L~~~~~  155 (396)
T KOG2461|consen  121 GENIFYRTIRDIRPNEELLVWYGS-EYAEELAYGHG  155 (396)
T ss_pred             cCceEEEecccCCCCCeEEEEecc-chHhHhcccCC
Confidence            346778999999999999999996 45677777777


No 16 
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=33.97  E-value=34  Score=33.32  Aligned_cols=29  Identities=17%  Similarity=0.260  Sum_probs=22.5

Q ss_pred             ecccC-CccEEEEccCCCCCCEEEEecCCC
Q 013921           85 PATFP-EGLGLVAQRDIAKNEVVLEVPMKF  113 (434)
Q Consensus        85 ~~~~~-~GrGl~A~~~I~~ge~ll~IP~~~  113 (434)
                      +..+. .|||++|+..++.||.|+.---++
T Consensus       260 ~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdl  289 (392)
T KOG1085|consen  260 EVYKDGKGRGVRAKVNFERGDFVVEYRGDL  289 (392)
T ss_pred             EEeeccccceeEeecccccCceEEEEecce
Confidence            33344 499999999999999998765554


No 17 
>PF08666 SAF:  SAF domain;  InterPro: IPR013974  This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=28.46  E-value=32  Score=24.80  Aligned_cols=15  Identities=40%  Similarity=0.490  Sum_probs=10.9

Q ss_pred             cEEEEccCCCCCCEE
Q 013921           92 LGLVAQRDIAKNEVV  106 (434)
Q Consensus        92 rGl~A~~~I~~ge~l  106 (434)
                      +-++|+++|++|+.|
T Consensus         2 ~vvVA~~di~~G~~i   16 (63)
T PF08666_consen    2 RVVVAARDIPAGTVI   16 (63)
T ss_dssp             SEEEESSTB-TT-BE
T ss_pred             cEEEEeCccCCCCEE
Confidence            358999999999987


No 18 
>PF10281 Ish1:  Putative stress-responsive nuclear envelope protein;  InterPro: IPR018803  This group of proteins, found primarily in fungi, consists of putative stress-responsive nuclear envelope protein Ish1 and homologues []. 
Probab=24.85  E-value=69  Score=21.02  Aligned_cols=17  Identities=24%  Similarity=0.562  Sum_probs=14.5

Q ss_pred             HHHHHHHHHhCCCCCCC
Q 013921           64 VETFWQWLRDQKVVSPK   80 (434)
Q Consensus        64 ~~~f~~Wl~~~G~~~~~   80 (434)
                      -.+|.+||.++|+..++
T Consensus         6 ~~~L~~wL~~~gi~~~~   22 (38)
T PF10281_consen    6 DSDLKSWLKSHGIPVPK   22 (38)
T ss_pred             HHHHHHHHHHcCCCCCC
Confidence            36789999999998876


No 19 
>PF09652 Cas_VVA1548:  Putative CRISPR-associated protein (Cas_VVA1548);  InterPro: IPR013443 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes []. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements [, ]. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci []. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.   This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPR repeats. In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas genes.
Probab=24.78  E-value=62  Score=26.11  Aligned_cols=40  Identities=18%  Similarity=0.309  Sum_probs=25.3

Q ss_pred             HHHHHHHHHhCCCCCCCCCeeecccCCccEEEEccCCCCCCEEE-EecCCC
Q 013921           64 VETFWQWLRDQKVVSPKSPIRPATFPEGLGLVAQRDIAKNEVVL-EVPMKF  113 (434)
Q Consensus        64 ~~~f~~Wl~~~G~~~~~~~v~~~~~~~GrGl~A~~~I~~ge~ll-~IP~~~  113 (434)
                      ....++|++++|+.++.  +.....        ..+|++|++|+ ++|.++
T Consensus         6 H~GAieW~~~qg~~iD~--~v~Hld--------~~~i~~GD~ViGtLPvhL   46 (93)
T PF09652_consen    6 HPGAIEWAKQQGIQIDH--FVDHLD--------PADIQPGDVVIGTLPVHL   46 (93)
T ss_pred             cccHHHHHHHhCCCcce--eeccCC--------HHHccCCCEEEEeCcHHH
Confidence            34567999999998776  221111        56677777654 556554


No 20 
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=23.90  E-value=52  Score=34.63  Aligned_cols=121  Identities=17%  Similarity=0.118  Sum_probs=65.7

Q ss_pred             hhHHHHHHHHHhCCCCCCCCCeeecccC-CccEEEEc-cCCCCCCEEEEecCCCccCccccccch-hhhhhcCCChhHHH
Q 013921           62 AQVETFWQWLRDQKVVSPKSPIRPATFP-EGLGLVAQ-RDIAKNEVVLEVPMKFWINPDTVAASE-IGSLCSGLKPWISV  138 (434)
Q Consensus        62 ~~~~~f~~Wl~~~G~~~~~~~v~~~~~~-~GrGl~A~-~~I~~ge~ll~IP~~~~ls~~~~~~~~-~~~~~~~l~~~~~L  138 (434)
                      ..+..|++|...+|+..+.. +...... .|.+.+|. ..+...+.+..+....-...-...... .+..+.   .|..+
T Consensus         4 ~~l~~~l~~~~~~~~~~~~~-~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~---~~~~~   79 (472)
T KOG1337|consen    4 DVLSALLRWAQCNGISLSSS-LDLRPDELKGLVRWAASESIASSENIKSLKFWLTGNGLSSSKSSLPGNDID---EWPLL   79 (472)
T ss_pred             hHHHHhhhHHhccCccCCcc-cccCccccCcceeeeecccCCCccccccceeccccCCcchhhhcccccccc---ccchh
Confidence            67889999999999998872 4433322 37777777 555555555444444333332222111 111111   11111


Q ss_pred             ---------HHHHHHHhcCCCCCcHHHHhhcCCCCCCccccCHhHHhhcCCCchHHH
Q 013921          139 ---------ALFLIREKKKEDSPWRVYLDILPECTDSTVFWSEEELVELQGTQLLST  186 (434)
Q Consensus       139 ---------al~Ll~E~~~~~S~w~pYl~~LP~~~~~pl~w~~~el~~L~gt~l~~~  186 (434)
                               .+.+.-.+....++|.+|.+.+|....++++|...+.....+.+....
T Consensus        80 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~l~~~~~~~~~s~w~~~i~~  136 (472)
T KOG1337|consen   80 VSIRLIKGEKLLLVPPLLLLIAKRKPYNDLLPIALALFLLLEWAHGEISKWKPYIST  136 (472)
T ss_pred             hhhhhhhhhhhccCCchhhhccccccCccccHHHHHHHHHHhhhccccccchhhhhh
Confidence                     111111123346788899999886667778877665544555544433


No 21 
>TIGR02059 swm_rep_I cyanobacterial long protein repeat. This domain appears in 29 copies in a large (10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.
Probab=23.57  E-value=88  Score=25.64  Aligned_cols=25  Identities=20%  Similarity=0.368  Sum_probs=21.7

Q ss_pred             CCceEEEEeCCCCCCCCeEEeccCC
Q 013921          274 RDLLFSLRTPVPVKAGEQVLIQYDL  298 (434)
Q Consensus       274 ~~~~~~l~a~r~i~~GeEv~i~YG~  298 (434)
                      ....+.|.-.+.|..||||.++|-.
T Consensus        73 s~ktVTLTL~~~V~~Gq~VTVsYt~   97 (101)
T TIGR02059        73 SNTTITLTLAQVVEDGDEVTLSYTK   97 (101)
T ss_pred             cccEEEEEecccccCCCEEEEEeeC
Confidence            3457899999999999999999964


No 22 
>TIGR02620 cas_VVA1548 putative CRISPR-associated protein, VVA1548 family. This model represents a conserved domain of about 95 amino acids found exclusively in species with CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). In all bacterial species with members so far (Vibrio vulnificus YJ016, Mannheimia succiniciproducens MBEL55E, and Nitrosomonas europaea ATCC 19718), but not in the archaeon Methanothermobacter thermautotrophicus str. Delta H, the gene for this protein is in the midst of a cluster of Cas protein genes near CRISPR repeats.
Probab=22.29  E-value=69  Score=25.82  Aligned_cols=39  Identities=21%  Similarity=0.384  Sum_probs=23.3

Q ss_pred             HHHHHHHHhCCCCCCCCCeeecccCCccEEEEccCCCCCCEEE-EecCCC
Q 013921           65 ETFWQWLRDQKVVSPKSPIRPATFPEGLGLVAQRDIAKNEVVL-EVPMKF  113 (434)
Q Consensus        65 ~~f~~Wl~~~G~~~~~~~v~~~~~~~GrGl~A~~~I~~ge~ll-~IP~~~  113 (434)
                      ..-++|++++|..++.  +.....+        .+|.+|++|+ ++|.++
T Consensus         7 ~Ga~eW~~~qG~~iD~--~v~HLd~--------~~i~~GD~ViGtLPv~L   46 (93)
T TIGR02620         7 SGAQEWLSQQGIQIDH--FVDHLDP--------IDISQGDKVIGTLPVSL   46 (93)
T ss_pred             ccHHHHHHhcCCccce--eecccCH--------HHhcCCCEEEEeCCHHH
Confidence            3457999999998776  2222111        4566666554 455543


No 23 
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=20.08  E-value=41  Score=35.38  Aligned_cols=45  Identities=24%  Similarity=0.391  Sum_probs=32.1

Q ss_pred             cccccCCCCCCCCCceEEecCCCccCCCceEEEEeCCCCCCCCeEEeccCC
Q 013921          248 ADLINHSPGITTEDYAYEIKGAGLFSRDLLFSLRTPVPVKAGEQVLIQYDL  298 (434)
Q Consensus       248 ~Dm~NH~~~~~~~~~~~~~~~~g~~~~~~~~~l~a~r~i~~GeEv~i~YG~  298 (434)
                      -++.||+..+......|.+.      .+..+.+.+.+.++.|+|++.+|-.
T Consensus       372 sr~~nh~~~~~v~~~k~~~~------~~t~~~~~a~~~i~~g~e~t~~~n~  416 (463)
T KOG1081|consen  372 SRFLNHSCQPNVETEKWQVI------GDTRVGLFAPRQIEAGEELTFNYNG  416 (463)
T ss_pred             hhhhcccCCCceeechhhee------cccccccccccccccchhhhheeec
Confidence            45689996654433344432      2456788999999999999999963


Done!