Query 018186
Match_columns 359
No_of_seqs 246 out of 1373
Neff 7.5
Searched_HMMs 46136
Date Fri Mar 29 06:51:59 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/018186.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/018186hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1338 Uncharacterized conser 100.0 8.8E-33 1.9E-37 261.3 16.0 274 76-357 9-314 (466)
2 KOG1337 N-methyltransferase [G 100.0 3.2E-32 7E-37 275.6 16.7 268 74-358 47-326 (472)
3 PF00856 SET: SET domain; Int 99.8 1.1E-18 2.4E-23 149.0 8.1 49 259-307 112-162 (162)
4 smart00317 SET SET (Su(var)3-9 99.0 5.8E-10 1.3E-14 90.4 5.9 46 261-306 69-116 (116)
5 KOG2589 Histone tail methylase 95.8 0.0073 1.6E-07 58.3 3.2 47 260-308 192-238 (453)
6 KOG1085 Predicted methyltransf 95.6 0.011 2.4E-07 55.3 3.7 48 267-315 335-385 (392)
7 KOG4442 Clathrin coat binding 94.6 0.034 7.4E-07 57.9 4.0 43 266-308 194-238 (729)
8 KOG1079 Transcriptional repres 94.2 0.046 9.9E-07 56.7 3.8 44 265-308 665-710 (739)
9 KOG1080 Histone H3 (Lys4) meth 94.0 0.045 9.7E-07 60.1 3.7 45 264-308 938-984 (1005)
10 COG2940 Proteins containing SE 91.5 0.1 2.2E-06 53.6 1.9 43 266-308 406-450 (480)
11 KOG1082 Histone H3 (Lys9) meth 85.5 0.55 1.2E-05 46.4 2.4 44 267-310 274-323 (364)
12 KOG1083 Putative transcription 80.7 1.5 3.1E-05 48.2 3.3 44 266-309 1251-1296(1306)
13 KOG1085 Predicted methyltransf 68.0 5.9 0.00013 37.6 3.5 35 91-125 256-290 (392)
14 KOG1141 Predicted histone meth 67.0 2.8 6.2E-05 44.9 1.4 53 267-319 1191-1253(1262)
15 KOG2084 Predicted histone tail 65.1 9.7 0.00021 38.2 4.9 60 259-320 199-265 (482)
16 KOG1338 Uncharacterized conser 56.3 2.4 5.1E-05 41.9 -1.3 78 259-341 269-349 (466)
17 COG1188 Ribosome-associated he 47.7 18 0.0004 28.9 2.7 55 233-309 8-62 (100)
18 KOG2461 Transcription factor B 47.4 19 0.00041 36.1 3.4 35 285-319 121-155 (396)
19 TIGR02059 swm_rep_I cyanobacte 46.2 36 0.00079 27.2 4.1 29 280-308 69-97 (101)
20 smart00317 SET SET (Su(var)3-9 41.4 34 0.00073 26.5 3.5 28 91-118 83-113 (116)
21 KOG4442 Clathrin coat binding 40.1 26 0.00057 37.2 3.2 29 91-119 120-148 (729)
22 PF10281 Ish1: Putative stress 34.0 39 0.00084 21.7 2.2 16 76-91 6-21 (38)
23 KOG1080 Histone H3 (Lys4) meth 31.4 45 0.00097 37.4 3.4 45 74-119 850-894 (1005)
24 PF08666 SAF: SAF domain; Int 29.4 33 0.00072 24.0 1.4 15 103-117 2-16 (63)
25 PF11629 Mst1_SARAH: C termina 24.1 2.5E+02 0.0054 19.4 5.5 40 182-223 5-44 (49)
26 PF09652 Cas_VVA1548: Putative 22.9 57 0.0012 25.7 1.7 42 74-124 4-46 (93)
27 KOG1337 N-methyltransferase [G 21.4 35 0.00075 34.9 0.3 51 73-123 3-55 (472)
28 PF00856 SET: SET domain; Int 21.3 1.2E+02 0.0026 24.5 3.6 27 92-118 129-158 (162)
No 1
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=100.00 E-value=8.8e-33 Score=261.27 Aligned_cols=274 Identities=22% Similarity=0.338 Sum_probs=218.5
Q ss_pred HHHHHHHHHhCC-CCC-CCcEEEeeC----CCceEEEEcccCCCCCEEEEcCCCCcccccCCCC---CchhhhhhccCCC
Q 018186 76 ASTLQKWLSDSG-LPP-QKMAIQKVD----VGERGLVALKNIRKGEKLLFVPPSLVITADSKWS---CPEAGEVLKQCSV 146 (359)
Q Consensus 76 ~~~l~~Wl~~~G-~~~-~~v~i~~~~----~~GrGl~At~~I~~ge~ll~IP~~l~is~~~a~~---~~~~~~~l~~~~l 146 (359)
.+.|+.|++..+ ... ++|.+...+ ..|+|++|+++|++||.|+.+|++++++..+... .|+..+++= +++
T Consensus 9 ~~~fl~w~k~t~eletSpKi~~ndl~~v~~~~G~g~vAtesIkkgE~Lf~~prdsvLsvtts~li~~lps~~rv~L-ne~ 87 (466)
T KOG1338|consen 9 AKRFLLWGKLTLELETSPKIDNNDLPWVERIAGAGIVATESIKKGESLFAYPRDSVLSVTTSALITPLPSDIRVLL-NEV 87 (466)
T ss_pred HHHHHHHHHHhhheeecccccccccchhhhhcccceeeehhhcCCceEEEecCccEEeeehHHhcccchHHHHHHh-hcC
Confidence 689999999987 443 778776543 2499999999999999999999999999876311 222222221 468
Q ss_pred CChHHHHHHHHHHhccCCCCCcHHHHHhcCCC--CCCccccCHHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhh
Q 018186 147 PDWPLLATYLISEASFEKSSRWSNYISALPRQ--PYSLLYWTRAELDRYLEASQIRERAIERITNVIGTYNDLRLRIFSK 224 (359)
Q Consensus 147 ~~~~~Lal~Ll~E~~~g~~S~w~pYl~~LP~~--~~~pl~w~~~el~~lL~gt~l~~~~~~~~~~~~~~y~~~~~~l~~~ 224 (359)
+.|..|++.|++|...+..|+|+||++.+|+. .++|+||+++|++.+++|+-+.+ ..+..+.+.+.|....+++.+.
T Consensus 88 gsw~~Lllvll~E~~~pq~SrWrPYfs~wp~p~rm~spifWdEnEl~~Ll~stvlee-~~Kd~aeI~~~~i~~i~pf~~~ 166 (466)
T KOG1338|consen 88 GSWGMLLLVLLREKKMPQKSRWRPYFSRWPQPARMHSPIFWDENELSMLLCSTVLEE-TVKDKAEIEKDFIFVIQPFKQH 166 (466)
T ss_pred CcHHHHHHHHHHHhhcccccccccHHHhCCChhhcCCCccCCchHHHHHhhcccchh-hHhHHHHHHHHHHHHHHHHHHh
Confidence 89999999999999877779999999999975 57899999999997677776655 7788899999999999999999
Q ss_pred CCCCCCccCCCHHHHHHHHhhhhhcceecCCCC-------------CceEeeeeeecccCCCC-cceeEEeeCCCCeEEE
Q 018186 225 YPDLFPEEVFNMETFKWSFGILFSRLVRLPSMD-------------GRVALVPWADMLNHSCE-VETFLDYDKSSQGVVF 290 (359)
Q Consensus 225 ~p~~f~~~~~t~~~f~WA~~~V~SRaf~~~~~~-------------~~~~LvP~~Dm~NH~~~-~~~~~~~d~~~~~~~l 290 (359)
+|..|.. +++++|..+++++.+.+|.++-.. ..-+|+|.+||+||+.. .|+...|+ ++|+.|
T Consensus 167 ~p~vfs~--~slEdF~y~~Al~laysfdve~~~s~~~~eee~e~e~ngk~m~p~ad~lNhd~~k~nanl~y~--~NcL~m 242 (466)
T KOG1338|consen 167 CPIVFSR--PSLEDFMYAYALGLAYSFDVEFLLSLDNLEEESEIECNGKLMTPIADFLNHDGLKANANLRYE--DNCLEM 242 (466)
T ss_pred Ccchhcc--cCHHHHHHHHHHHHHHheeeehhcchhhhhhhhccccCcccccchhhhhccchhhcccceecc--Ccceee
Confidence 9998864 899999999999999999875320 13589999999999987 77778885 699999
Q ss_pred EEcCcCCCCceEEecCCCCChHHHHhcCCcccCCCCC-------CCCeEEEeeccCCCCccHHHHHHHHHHCCC
Q 018186 291 TTDRQYQPGEQVFISYGKKSNGELLLSYGFVPREGTN-------PSDSVELPLSLKKSDKCYKEKLEALRKYGL 357 (359)
Q Consensus 291 ~a~r~i~~GeEv~isYG~~sN~~LL~~YGFv~~~~~N-------p~D~v~L~l~l~~~d~~~~~K~~~L~~~Gl 357 (359)
+|+|+|.+|+||+++||.++|. |++||.+.-.+.. -.|.+++-.+++.+++....|.-++..+|.
T Consensus 243 va~r~iekgdev~n~dg~~p~~--l~~l~ka~c~gihm~~g~~~l~niv~~l~D~~~d~tm~~~R~il~ql~nt 314 (466)
T KOG1338|consen 243 VADRNIEKGDEVDNSDGLKPMG--LLKLTKALCVGIHMVWGILKLYNIVQILMDVPNDDTMRNMRLILLQLHNT 314 (466)
T ss_pred eecCCCCCccccccccccCcch--hhhhhhhccceeeeecceeecchHHHHHhcCCCcchHHHHHHHHHHhccc
Confidence 9999999999999999998888 6666665442211 123344444667788888888776666653
No 2
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=99.98 E-value=3.2e-32 Score=275.65 Aligned_cols=268 Identities=39% Similarity=0.616 Sum_probs=207.6
Q ss_pred hcHHHHHHHHHhCCCCCCCcEEEeeCCCceEEEEcccCCCCCEEEEcCCCCcccccCCCCCchhhhhhccCCCCCh-HHH
Q 018186 74 ENASTLQKWLSDSGLPPQKMAIQKVDVGERGLVALKNIRKGEKLLFVPPSLVITADSKWSCPEAGEVLKQCSVPDW-PLL 152 (359)
Q Consensus 74 ~~~~~l~~Wl~~~G~~~~~v~i~~~~~~GrGl~At~~I~~ge~ll~IP~~l~is~~~a~~~~~~~~~l~~~~l~~~-~~L 152 (359)
+....+..|.+..|....+..+......++++.+..++..++.+..+|....+..+..... ... ..|
T Consensus 47 ~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------------~~~~~~l 114 (472)
T KOG1337|consen 47 ENIKSLKFWLTGNGLSSSKSSLPGNDIDEWPLLVSIRLIKGEKLLLVPPLLLLIAKRKPYN------------DLLPIAL 114 (472)
T ss_pred cccccceeccccCCcchhhhccccccccccchhhhhhhhhhhhhccCCchhhhccccccCc------------cccHHHH
Confidence 3345566666666665433322222234556666666666665555555555444433211 111 578
Q ss_pred HHHHHHHhccCCCCCcHHHHHhcCCCCCCccccCHHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhhCCCCCC--
Q 018186 153 ATYLISEASFEKSSRWSNYISALPRQPYSLLYWTRAELDRYLEASQIRERAIERITNVIGTYNDLRLRIFSKYPDLFP-- 230 (359)
Q Consensus 153 al~Ll~E~~~g~~S~w~pYl~~LP~~~~~pl~w~~~el~~lL~gt~l~~~~~~~~~~~~~~y~~~~~~l~~~~p~~f~-- 230 (359)
+++|+.|...+..|.|.+|+..||..+++|++|..+++. .|.++.....+..+.+.++..+..+.. +...++..+.
T Consensus 115 ~~~l~~~~~~~~~s~w~~~i~~l~~~~~~p~~~~~~~v~-~l~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~ 192 (472)
T KOG1337|consen 115 ALFLLLEWAHGEISKWKPYISTLPSQYNSPLLWSEDEVK-SLLSTPLFEIVASRRQNLVNKSAELLE-VLQSHPSLFGSD 192 (472)
T ss_pred HHHHHHhhhccccccchhhhhhchhhcCCccccCHHHHH-HhhcchhhHHHHHHHHHhhhhHHHHHH-HHHhcccccccc
Confidence 999999998888899999999999999999999999998 589999888887777777665555543 3344554432
Q ss_pred -ccCCCHHHHHHHHhhhhhcceecCCC--------CCceEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCce
Q 018186 231 -EEVFNMETFKWSFGILFSRLVRLPSM--------DGRVALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGEQ 301 (359)
Q Consensus 231 -~~~~t~~~f~WA~~~V~SRaf~~~~~--------~~~~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~GeE 301 (359)
.+.++++.|.||+++|.||+|+.+.. +...+|+|++||+||+++. ..+.|+..++.+.+++.+++++|||
T Consensus 193 ~~d~~~~~~~~w~~~~~~sr~~~~~~~~~~~~~~~~~~~~L~P~~D~~NH~~~~-~~~~~~~~d~~~~l~~~~~v~~gee 271 (472)
T KOG1337|consen 193 LFDTFTFSAFKWAYSIVNSRAFYLPSLQRLTAGDPDDNEALAPLIDLLNHSPEV-IKAGYNQEDEAVELVAERDVSAGEE 271 (472)
T ss_pred ccCccchHHHHHHHHHHhhhhhccccccccccCCCCcchhhhhhHHhhccCchh-ccccccCCCCcEEEEEeeeecCCCe
Confidence 23489999999999999999987643 2367999999999999998 5566777777999999999999999
Q ss_pred EEecCCCCChHHHHhcCCcccCCCCCCCCeEEEeeccCCCCccHHHHHHHHHHCCCC
Q 018186 302 VFISYGKKSNGELLLSYGFVPREGTNPSDSVELPLSLKKSDKCYKEKLEALRKYGLS 358 (359)
Q Consensus 302 v~isYG~~sN~~LL~~YGFv~~~~~Np~D~v~L~l~l~~~d~~~~~K~~~L~~~Gl~ 358 (359)
|||+||+++|++||++||||.+ +||+|.|.|.+.+...|+.+..|...++++|+.
T Consensus 272 vfi~YG~~~N~eLL~~YGFv~~--~N~~d~v~l~~~l~~~~~~~~~~~~~~~~~~~~ 326 (472)
T KOG1337|consen 272 VFINYGPKSNAELLLHYGFVEE--DNPYDSVTLKLALPPEDVSYLDKSDVLKKNGLP 326 (472)
T ss_pred EEEecCCCchHHHHHhcCCCCC--CCCcceEEEeecccccccchhHHHHHHhhcCCC
Confidence 9999999999999999999987 999999999999999999999999999999875
No 3
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTases) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.76 E-value=1.1e-18 Score=148.98 Aligned_cols=49 Identities=43% Similarity=0.802 Sum_probs=44.1
Q ss_pred ceEeeeeeecccCCCCcceeEEee--CCCCeEEEEEcCcCCCCceEEecCC
Q 018186 259 RVALVPWADMLNHSCEVETFLDYD--KSSQGVVFTTDRQYQPGEQVFISYG 307 (359)
Q Consensus 259 ~~~LvP~~Dm~NH~~~~~~~~~~d--~~~~~~~l~a~r~i~~GeEv~isYG 307 (359)
..+|+|++||+||++.+|+.+.++ ..+++++++|.|+|++||||||+||
T Consensus 112 ~~~l~p~~d~~NHsc~pn~~~~~~~~~~~~~~~~~a~r~I~~GeEi~isYG 162 (162)
T PF00856_consen 112 GIALYPFADMLNHSCDPNCEVSFDFDGDGGCLVVRATRDIKKGEEIFISYG 162 (162)
T ss_dssp EEEEETGGGGSEEESSTSEEEEEEEETTTTEEEEEESS-B-TTSBEEEEST
T ss_pred ccccCcHhHheccccccccceeeEeecccceEEEEECCccCCCCEEEEEEC
Confidence 579999999999999999998887 5789999999999999999999998
No 4
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.00 E-value=5.8e-10 Score=90.43 Aligned_cols=46 Identities=30% Similarity=0.440 Sum_probs=40.7
Q ss_pred EeeeeeecccCCCCcceeEEeeCCCC--eEEEEEcCcCCCCceEEecC
Q 018186 261 ALVPWADMLNHSCEVETFLDYDKSSQ--GVVFTTDRQYQPGEQVFISY 306 (359)
Q Consensus 261 ~LvP~~Dm~NH~~~~~~~~~~d~~~~--~~~l~a~r~i~~GeEv~isY 306 (359)
.+.|+++++||++.+|+.+.+...++ .+.++|.|+|++||||+++|
T Consensus 69 ~~~~~~~~iNHsc~pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~i~Y 116 (116)
T smart00317 69 RKGNIARFINHSCEPNCELLFVEVNGDSRIVIFALRDIKPGEELTIDY 116 (116)
T ss_pred ccCcHHHeeCCCCCCCEEEEEEEECCCcEEEEEECCCcCCCCEEeecC
Confidence 48999999999999999887764444 59999999999999999999
No 5
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=95.79 E-value=0.0073 Score=58.27 Aligned_cols=47 Identities=23% Similarity=0.396 Sum_probs=37.7
Q ss_pred eEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCceEEecCCC
Q 018186 260 VALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGEQVFISYGK 308 (359)
Q Consensus 260 ~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~GeEv~isYG~ 308 (359)
+-|=|. -++||++.+||.+.- .+.+...|++.|||++||||+--||.
T Consensus 192 LwLGPa-afINHDCrpnCkFvs-~g~~tacvkvlRDIePGeEITcFYgs 238 (453)
T KOG2589|consen 192 LWLGPA-AFINHDCRPNCKFVS-TGRDTACVKVLRDIEPGEEITCFYGS 238 (453)
T ss_pred heeccH-HhhcCCCCCCceeec-CCCceeeeehhhcCCCCceeEEeecc
Confidence 345564 489999999986432 23378999999999999999999997
No 6
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=95.63 E-value=0.011 Score=55.27 Aligned_cols=48 Identities=23% Similarity=0.459 Sum_probs=38.8
Q ss_pred ecccCCCCcceeE---EeeCCCCeEEEEEcCcCCCCceEEecCCCCChHHHH
Q 018186 267 DMLNHSCEVETFL---DYDKSSQGVVFTTDRQYQPGEQVFISYGKKSNGELL 315 (359)
Q Consensus 267 Dm~NH~~~~~~~~---~~d~~~~~~~l~a~r~i~~GeEv~isYG~~sN~~LL 315 (359)
-++||+-..|+.. ..| ....++++|.++|.+|||++..||+++-+-++
T Consensus 335 RLINHS~~gNl~TKvv~Id-g~pHLiLvA~rdIa~GEELlYDYGDRSkesi~ 385 (392)
T KOG1085|consen 335 RLINHSVRGNLKTKVVEID-GSPHLILVARRDIAQGEELLYDYGDRSKESIA 385 (392)
T ss_pred hhhcccccCcceeeEEEec-CCceEEEEeccccccchhhhhhccccchhHHh
Confidence 4899999887643 333 45679999999999999999999998876554
No 7
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=94.62 E-value=0.034 Score=57.85 Aligned_cols=43 Identities=23% Similarity=0.468 Sum_probs=33.3
Q ss_pred eecccCCCCcceeEE-ee-CCCCeEEEEEcCcCCCCceEEecCCC
Q 018186 266 ADMLNHSCEVETFLD-YD-KSSQGVVFTTDRQYQPGEQVFISYGK 308 (359)
Q Consensus 266 ~Dm~NH~~~~~~~~~-~d-~~~~~~~l~a~r~i~~GeEv~isYG~ 308 (359)
+=++||++++||... |. .+.-.+-+.|.+.|++||||+..|+-
T Consensus 194 aRFiNHSC~PNa~~~KWtV~~~lRvGiFakk~I~~GEEITFDYqf 238 (729)
T KOG4442|consen 194 ARFINHSCDPNAEVQKWTVPDELRVGIFAKKVIKPGEEITFDYQF 238 (729)
T ss_pred HHhhcCCCCCCceeeeeeeCCeeEEEEeEecccCCCceeeEeccc
Confidence 357999999998642 22 23456778899999999999999873
No 8
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=94.15 E-value=0.046 Score=56.69 Aligned_cols=44 Identities=18% Similarity=0.311 Sum_probs=36.3
Q ss_pred eeecccCCCCcceeEE--eeCCCCeEEEEEcCcCCCCceEEecCCC
Q 018186 265 WADMLNHSCEVETFLD--YDKSSQGVVFTTDRQYQPGEQVFISYGK 308 (359)
Q Consensus 265 ~~Dm~NH~~~~~~~~~--~d~~~~~~~l~a~r~i~~GeEv~isYG~ 308 (359)
.+-++||+.++||... +......+-+.|.|.|++|||||..|+=
T Consensus 665 k~rFANHS~nPNCYAkvm~V~GdhRIGifAkRaIeagEELffDYrY 710 (739)
T KOG1079|consen 665 KIRFANHSFNPNCYAKVMMVAGDHRIGIFAKRAIEAGEELFFDYRY 710 (739)
T ss_pred hhhhccCCCCCCcEEEEEEecCCcceeeeehhhcccCceeeeeecc
Confidence 4568999999998754 3345678899999999999999999963
No 9
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=94.04 E-value=0.045 Score=60.12 Aligned_cols=45 Identities=22% Similarity=0.417 Sum_probs=36.2
Q ss_pred eeeecccCCCCcceeEEee--CCCCeEEEEEcCcCCCCceEEecCCC
Q 018186 264 PWADMLNHSCEVETFLDYD--KSSQGVVFTTDRQYQPGEQVFISYGK 308 (359)
Q Consensus 264 P~~Dm~NH~~~~~~~~~~d--~~~~~~~l~a~r~i~~GeEv~isYG~ 308 (359)
=++=++||++.+||+...- .+...++++|.|+|.+||||+.+|--
T Consensus 938 niAr~InHsC~PNCyakvi~V~g~~~IvIyakr~I~~~EElTYDYkF 984 (1005)
T KOG1080|consen 938 NIARFINHSCNPNCYAKVITVEGDKRIVIYSKRDIAAGEELTYDYKF 984 (1005)
T ss_pred chhheeecccCCCceeeEEEecCeeEEEEEEecccccCceeeeeccc
Confidence 3556899999999976432 24457999999999999999999953
No 10
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=91.52 E-value=0.1 Score=53.55 Aligned_cols=43 Identities=23% Similarity=0.356 Sum_probs=35.9
Q ss_pred eecccCCCCcceeEEeeCCCC--eEEEEEcCcCCCCceEEecCCC
Q 018186 266 ADMLNHSCEVETFLDYDKSSQ--GVVFTTDRQYQPGEQVFISYGK 308 (359)
Q Consensus 266 ~Dm~NH~~~~~~~~~~d~~~~--~~~l~a~r~i~~GeEv~isYG~ 308 (359)
.=++||++.+|+........+ .+.+++.+||.+||||+++||.
T Consensus 406 ~r~~nHS~~pN~~~~~~~~~g~~~~~~~~~rDI~~geEl~~dy~~ 450 (480)
T COG2940 406 ARFINHSCTPNCEASPIEVNGIFKISIYAIRDIKAGEELTYDYGP 450 (480)
T ss_pred cceeecCCCCCcceecccccccceeeecccccchhhhhhcccccc
Confidence 338999999999877655444 6778899999999999999987
No 11
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=85.53 E-value=0.55 Score=46.41 Aligned_cols=44 Identities=30% Similarity=0.528 Sum_probs=33.1
Q ss_pred ecccCCCCcceeEEe---eC---CCCeEEEEEcCcCCCCceEEecCCCCC
Q 018186 267 DMLNHSCEVETFLDY---DK---SSQGVVFTTDRQYQPGEQVFISYGKKS 310 (359)
Q Consensus 267 Dm~NH~~~~~~~~~~---d~---~~~~~~l~a~r~i~~GeEv~isYG~~s 310 (359)
=++||++.+|..+.. +. .--.+.+.|.++|.+|+|++..||...
T Consensus 274 RfinHSC~PN~~~~~v~~~~~~~~~~~i~ffa~~~I~p~~ELT~dYg~~~ 323 (364)
T KOG1082|consen 274 RFINHSCSPNLLYQAVFQDEFVLLYLRIGFFALRDISPGEELTLDYGKAY 323 (364)
T ss_pred ccccCCCCccceeeeeeecCCccchheeeeeeccccCCCcccchhhcccc
Confidence 578999999876532 21 112467889999999999999999743
No 12
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=80.65 E-value=1.5 Score=48.21 Aligned_cols=44 Identities=25% Similarity=0.391 Sum_probs=32.7
Q ss_pred eecccCCCCcceeE-EeeCCC-CeEEEEEcCcCCCCceEEecCCCC
Q 018186 266 ADMLNHSCEVETFL-DYDKSS-QGVVFTTDRQYQPGEQVFISYGKK 309 (359)
Q Consensus 266 ~Dm~NH~~~~~~~~-~~d~~~-~~~~l~a~r~i~~GeEv~isYG~~ 309 (359)
+-+.||++.+||.. .|.-.+ -.+.|.|.|||.+||||+..|..+
T Consensus 1251 ~RfinhscKPNc~~qkwSVNG~~Rv~L~A~rDi~kGEELtYDYN~k 1296 (1306)
T KOG1083|consen 1251 ARFINHSCKPNCEMQKWSVNGEYRVGLFALRDLPKGEELTYDYNFK 1296 (1306)
T ss_pred ccccccccCCCCccccccccceeeeeeeecCCCCCCceEEEecccc
Confidence 44578999998853 233222 246789999999999999999764
No 13
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=68.00 E-value=5.9 Score=37.56 Aligned_cols=35 Identities=20% Similarity=0.202 Sum_probs=29.9
Q ss_pred CCcEEEeeCCCceEEEEcccCCCCCEEEEcCCCCc
Q 018186 91 QKMAIQKVDVGERGLVALKNIRKGEKLLFVPPSLV 125 (359)
Q Consensus 91 ~~v~i~~~~~~GrGl~At~~I~~ge~ll~IP~~l~ 125 (359)
.++.+....+.||||+|+++++.|+.|+.--=+++
T Consensus 256 egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdli 290 (392)
T KOG1085|consen 256 EGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLI 290 (392)
T ss_pred cceeEEeeccccceeEeecccccCceEEEEeccee
Confidence 57888888899999999999999999987655443
No 14
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=66.99 E-value=2.8 Score=44.94 Aligned_cols=53 Identities=26% Similarity=0.384 Sum_probs=37.4
Q ss_pred ecccCCCCcceeE---EeeCCCCe---EEEEEcCcCCCCceEEecCCC----CChHHHHhcCC
Q 018186 267 DMLNHSCEVETFL---DYDKSSQG---VVFTTDRQYQPGEQVFISYGK----KSNGELLLSYG 319 (359)
Q Consensus 267 Dm~NH~~~~~~~~---~~d~~~~~---~~l~a~r~i~~GeEv~isYG~----~sN~~LL~~YG 319 (359)
-++||++.+|..+ -+|..+-. +.+.+.+-|++|.|++..|+= -..-+|+..-|
T Consensus 1191 RfLNHSC~PNl~VQnVfvdTHdlrfPwVAFFt~kyVkAgtELTWDY~Ye~g~v~~keL~C~CG 1253 (1262)
T KOG1141|consen 1191 RFLNHSCDPNLHVQNVFVDTHDLRFPWVAFFTRKYVKAGTELTWDYQYEQGQVATKELTCHCG 1253 (1262)
T ss_pred hhhccCCCccceeeeeeeeccccCCchhhhhhhhhhccCceeeeeccccccccccceEEEecC
Confidence 4799999998654 24433333 456788999999999999973 34456666655
No 15
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=65.13 E-value=9.7 Score=38.16 Aligned_cols=60 Identities=28% Similarity=0.416 Sum_probs=45.2
Q ss_pred ceEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCc-eEEecCCCC--C----hHHHHhcCCc
Q 018186 259 RVALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGE-QVFISYGKK--S----NGELLLSYGF 320 (359)
Q Consensus 259 ~~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~Ge-Ev~isYG~~--s----N~~LL~~YGF 320 (359)
..+|.|..=++||++.+|+...|+ +....+.+...+.+++ +++++|-.. + ...|-..|.|
T Consensus 199 ~~~l~~~~~~~~hsC~pn~~~~~~--~~~~~~~~~~~~~~~~~~l~~~y~~~~~~~~~r~~~l~~~~~f 265 (482)
T KOG2084|consen 199 GRGLFPGSSLFNHSCFPNISVIFD--GRGLALLVPAGIDAGEEELTISYTDPLLSTASRQKQLRQSKLF 265 (482)
T ss_pred eeeecccchhcccCCCCCeEEEEC--CceeEEEeecccCCCCCEEEEeecccccCHHHHHHHHhhccce
Confidence 468999999999999999987775 4556666666777766 999999762 2 3456666667
No 16
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=56.32 E-value=2.4 Score=41.94 Aligned_cols=78 Identities=17% Similarity=0.088 Sum_probs=59.2
Q ss_pred ceEeeeeeecccCCCCcce--eEEeeCCCCeEEEEEcCcCCCCceEEecCCCCChHHHHhcCC-cccCCCCCCCCeEEEe
Q 018186 259 RVALVPWADMLNHSCEVET--FLDYDKSSQGVVFTTDRQYQPGEQVFISYGKKSNGELLLSYG-FVPREGTNPSDSVELP 335 (359)
Q Consensus 259 ~~~LvP~~Dm~NH~~~~~~--~~~~d~~~~~~~l~a~r~i~~GeEv~isYG~~sN~~LL~~YG-Fv~~~~~Np~D~v~L~ 335 (359)
..+++|+++|+|-.-.-.. .+-+| ..+...+++.|.| +.|+.+.|+...+.++..+|| |+-. +--|++.+.+
T Consensus 269 ~ka~c~gihm~~g~~~l~niv~~l~D-~~~d~tm~~~R~i--l~ql~nt~teld~~e~~~syd~ftkk-E~~p~~g~lv- 343 (466)
T KOG1338|consen 269 TKALCVGIHMVWGILKLYNIVQILMD-VPNDDTMRNMRLI--LLQLHNTRTELDINEFHSSYDTFTKK-EVKPAIGKLV- 343 (466)
T ss_pred hhhccceeeeecceeecchHHHHHhc-CCCcchHHHHHHH--HHHhccchhhhhhHHHHHhhhhhhhc-cccccceeee-
Confidence 4689999999988755322 12344 3566788899988 999999999999999999999 5544 5668887777
Q ss_pred eccCCC
Q 018186 336 LSLKKS 341 (359)
Q Consensus 336 l~l~~~ 341 (359)
+.+++.
T Consensus 344 ~glpq~ 349 (466)
T KOG1338|consen 344 IGLPQS 349 (466)
T ss_pred eechhh
Confidence 466664
No 17
>COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) [Translation, ribosomal structure and biogenesis]
Probab=47.69 E-value=18 Score=28.86 Aligned_cols=55 Identities=15% Similarity=0.367 Sum_probs=39.1
Q ss_pred CCCHHHHHHHHhhhhhcceecCCCCCceEeeeeeecccCCCCcceeEEeeCCCCeEEEEEcCcCCCCceEEecCCCC
Q 018186 233 VFNMETFKWSFGILFSRLVRLPSMDGRVALVPWADMLNHSCEVETFLDYDKSSQGVVFTTDRQYQPGEQVFISYGKK 309 (359)
Q Consensus 233 ~~t~~~f~WA~~~V~SRaf~~~~~~~~~~LvP~~Dm~NH~~~~~~~~~~d~~~~~~~l~a~r~i~~GeEv~isYG~~ 309 (359)
..-++.|+|+..++-+|+..- ||++- ..+.++ +-...+.++++.|++|.|.||.+
T Consensus 8 ~mRLDKwL~~aR~~KrRslAk-------------~~~~~-----GrV~vN----G~~aKpS~~VK~GD~l~i~~~~~ 62 (100)
T COG1188 8 RMRLDKWLWAARFIKRRSLAK-------------EMIEG-----GRVKVN----GQRAKPSKEVKVGDILTIRFGNK 62 (100)
T ss_pred ceehHHHHHHHHHhhhHHHHH-------------HHHHC-----CeEEEC----CEEcccccccCCCCEEEEEeCCc
Confidence 356899999999999998752 23221 123342 23348889999999999999974
No 18
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=47.41 E-value=19 Score=36.10 Aligned_cols=35 Identities=26% Similarity=0.539 Sum_probs=31.4
Q ss_pred CCeEEEEEcCcCCCCceEEecCCCCChHHHHhcCC
Q 018186 285 SQGVVFTTDRQYQPGEQVFISYGKKSNGELLLSYG 319 (359)
Q Consensus 285 ~~~~~l~a~r~i~~GeEv~isYG~~sN~~LL~~YG 319 (359)
+..+-+++.|+|++||||.+-||.--+.+|...+|
T Consensus 121 ~~~Ifyrt~r~I~p~eELlVWY~~e~~~~L~~~~~ 155 (396)
T KOG2461|consen 121 GENIFYRTIRDIRPNEELLVWYGSEYAEELAYGHG 155 (396)
T ss_pred cCceEEEecccCCCCCeEEEEeccchHhHhcccCC
Confidence 45688899999999999999999988888888888
No 19
>TIGR02059 swm_rep_I cyanobacterial long protein repeat. This domain appears in 29 copies in a large (10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.
Probab=46.18 E-value=36 Score=27.18 Aligned_cols=29 Identities=24% Similarity=0.398 Sum_probs=24.2
Q ss_pred EeeCCCCeEEEEEcCcCCCCceEEecCCC
Q 018186 280 DYDKSSQGVVFTTDRQYQPGEQVFISYGK 308 (359)
Q Consensus 280 ~~d~~~~~~~l~a~r~i~~GeEv~isYG~ 308 (359)
..+.....+.+...+.|..||+|.++|-+
T Consensus 69 sV~~s~ktVTLTL~~~V~~Gq~VTVsYt~ 97 (101)
T TIGR02059 69 SLGGSNTTITLTLAQVVEDGDEVTLSYTK 97 (101)
T ss_pred EEcCcccEEEEEecccccCCCEEEEEeeC
Confidence 34545568999999999999999999965
No 20
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=41.40 E-value=34 Score=26.55 Aligned_cols=28 Identities=29% Similarity=0.437 Sum_probs=19.9
Q ss_pred CCcEEEeeCCC---ceEEEEcccCCCCCEEE
Q 018186 91 QKMAIQKVDVG---ERGLVALKNIRKGEKLL 118 (359)
Q Consensus 91 ~~v~i~~~~~~---GrGl~At~~I~~ge~ll 118 (359)
+++.+...... ...++|+++|++||.|.
T Consensus 83 pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~ 113 (116)
T smart00317 83 PNCELLFVEVNGDSRIVIFALRDIKPGEELT 113 (116)
T ss_pred CCEEEEEEEECCCcEEEEEECCCcCCCCEEe
Confidence 45555543333 37889999999999985
No 21
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=40.11 E-value=26 Score=37.19 Aligned_cols=29 Identities=28% Similarity=0.297 Sum_probs=25.4
Q ss_pred CCcEEEeeCCCceEEEEcccCCCCCEEEE
Q 018186 91 QKMAIQKVDVGERGLVALKNIRKGEKLLF 119 (359)
Q Consensus 91 ~~v~i~~~~~~GrGl~At~~I~~ge~ll~ 119 (359)
-+|++-.+...|.||.|.++|++|+.|+.
T Consensus 120 A~vevF~Te~KG~GLRA~~dI~~g~FI~E 148 (729)
T KOG4442|consen 120 AKVEVFLTEKKGCGLRAEEDIPKGQFILE 148 (729)
T ss_pred CceeEEEecCcccceeeccccCCCcEEee
Confidence 35777777889999999999999999986
No 22
>PF10281 Ish1: Putative stress-responsive nuclear envelope protein; InterPro: IPR018803 This group of proteins, found primarily in fungi, consists of putative stress-responsive nuclear envelope protein Ish1 and homologues.
Probab=34.03 E-value=39 Score=21.71 Aligned_cols=16 Identities=38% Similarity=0.856 Sum_probs=13.6
Q ss_pred HHHHHHHHHhCCCCCC
Q 018186 76 ASTLQKWLSDSGLPPQ 91 (359)
Q Consensus 76 ~~~l~~Wl~~~G~~~~ 91 (359)
..+|.+||.++|+..+
T Consensus 6 ~~~L~~wL~~~gi~~~ 21 (38)
T PF10281_consen 6 DSDLKSWLKSHGIPVP 21 (38)
T ss_pred HHHHHHHHHHcCCCCC
Confidence 3689999999999864
No 23
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=31.40 E-value=45 Score=37.41 Aligned_cols=45 Identities=13% Similarity=0.201 Sum_probs=31.6
Q ss_pred hcHHHHHHHHHhCCCCCCCcEEEeeCCCceEEEEcccCCCCCEEEE
Q 018186 74 ENASTLQKWLSDSGLPPQKMAIQKVDVGERGLVALKNIRKGEKLLF 119 (359)
Q Consensus 74 ~~~~~l~~Wl~~~G~~~~~v~i~~~~~~GrGl~At~~I~~ge~ll~ 119 (359)
....+++.|.+.+--. ..|.+....-.|.||||.++|.+||.||.
T Consensus 850 ~~~~~~~~~~~~~~rk-k~~~F~~s~iH~wglfa~~~i~~~dmViE 894 (1005)
T KOG1080|consen 850 LDEAEVLRYNQLKFRK-KYVKFGRSGIHGWGLFAMENIAAGDMVIE 894 (1005)
T ss_pred cchHHHHHHHHHhhhh-hhhccccccccccceeeccCccccceEEE
Confidence 3345566665543111 23667766678999999999999999975
No 24
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=29.35 E-value=33 Score=24.05 Aligned_cols=15 Identities=33% Similarity=0.512 Sum_probs=11.2
Q ss_pred eEEEEcccCCCCCEE
Q 018186 103 RGLVALKNIRKGEKL 117 (359)
Q Consensus 103 rGl~At~~I~~ge~l 117 (359)
+-++|+++|++|+.|
T Consensus 2 ~vvVA~~di~~G~~i 16 (63)
T PF08666_consen 2 RVVVAARDIPAGTVI 16 (63)
T ss_dssp SEEEESSTB-TT-BE
T ss_pred cEEEEeCccCCCCEE
Confidence 358999999999998
No 25
>PF11629 Mst1_SARAH: C terminal SARAH domain of Mst1; InterPro: IPR024205 The SARAH (Sav/Rassf/Hpo) domain is found at the C terminus in three classes of eukaryotic tumour suppressors that give the domain its name. In the Sav (Salvador) and Hpo (Hippo) families, the SARAH domain mediates signal transduction from Hpo via the Sav scaffolding protein to the downstream component Wts (Warts); the phosphorylation of Wts by Hpo triggers cell cycle arrest and apoptosis by down-regulating cyclin E, Diap 1 and other targets. The SARAH domain is also involved in dimerisation, as in the human Hpo orthologue, Mst1, which homodimerises via its C-terminal SARAH domain. The SARAH domain is found associated with other domains, such as protein kinase domains, WW/rsp5/WWP domain (IPR001202 from INTERPRO), C1 domain (IPR002219 from INTERPRO), LIM domain (IPR001781 from INTERPRO), or the Ras-associating (RA) domain (IPR000159 from INTERPRO).; GO: 0004674 protein serine/threonine kinase activity; PDB: 2JO8_A.
Probab=24.12 E-value=2.5e+02 Score=19.42 Aligned_cols=40 Identities=20% Similarity=0.321 Sum_probs=27.7
Q ss_pred ccccCHHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHh
Q 018186 182 LLYWTRAELDRYLEASQIRERAIERITNVIGTYNDLRLRIFS 223 (359)
Q Consensus 182 pl~w~~~el~~lL~gt~l~~~~~~~~~~~~~~y~~~~~~l~~ 223 (359)
.-.|+.+|+++.| ..|...+.+-++.++..|..-+++|..
T Consensus 5 Lk~ls~~eL~~rl--~~LD~~ME~Eieelr~RY~~KRqPIld 44 (49)
T PF11629_consen 5 LKFLSYEELQQRL--ASLDPEMEQEIEELRQRYQAKRQPILD 44 (49)
T ss_dssp GGGS-HHHHHHHH--HHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred HhhCCHHHHHHHH--HhCCHHHHHHHHHHHHHHHHhhccHHH
Confidence 3468889988644 345555666677888899888888764
No 26
>PF09652 Cas_VVA1548: Putative CRISPR-associated protein (Cas_VVA1548); InterPro: IPR013443 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPR repeats. In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas genes.
Probab=22.93 E-value=57 Score=25.72 Aligned_cols=42 Identities=17% Similarity=0.400 Sum_probs=26.7
Q ss_pred hcHHHHHHHHHhCCCCCCCcEEEeeCCCceEEEEcccCCCCCEEE-EcCCCC
Q 018186 74 ENASTLQKWLSDSGLPPQKMAIQKVDVGERGLVALKNIRKGEKLL-FVPPSL 124 (359)
Q Consensus 74 ~~~~~l~~Wl~~~G~~~~~v~i~~~~~~GrGl~At~~I~~ge~ll-~IP~~l 124 (359)
.+..-.++|++++|+.++.+.- ..+ ..+|.+|++|+ ++|..+
T Consensus 4 sRH~GAieW~~~qg~~iD~~v~-Hld--------~~~i~~GD~ViGtLPvhL 46 (93)
T PF09652_consen 4 SRHPGAIEWAKQQGIQIDHFVD-HLD--------PADIQPGDVVIGTLPVHL 46 (93)
T ss_pred eecccHHHHHHHhCCCcceeec-cCC--------HHHccCCCEEEEeCcHHH
Confidence 3455678999999987654321 211 56788888775 445443
No 27
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=21.36 E-value=35 Score=34.91 Aligned_cols=51 Identities=14% Similarity=0.178 Sum_probs=34.7
Q ss_pred hhcHHHHHHHHHhCCCCC-CCcEEEeeCCCceEEEEc-ccCCCCCEEEEcCCC
Q 018186 73 LENASTLQKWLSDSGLPP-QKMAIQKVDVGERGLVAL-KNIRKGEKLLFVPPS 123 (359)
Q Consensus 73 ~~~~~~l~~Wl~~~G~~~-~~v~i~~~~~~GrGl~At-~~I~~ge~ll~IP~~ 123 (359)
+.+...|++|...+|+.. .++........|.+.++. +.+...+.+..+...
T Consensus 3 ~~~l~~~l~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~ 55 (472)
T KOG1337|consen 3 VDVLSALLRWAQCNGISLSSSLDLRPDELKGLVRWAASESIASSENIKSLKFW 55 (472)
T ss_pred hhHHHHhhhHHhccCccCCcccccCccccCcceeeeecccCCCccccccceec
Confidence 356789999999999986 456666655667777777 555555555444433
No 28
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=21.28 E-value=1.2e+02 Score=24.49 Aligned_cols=27 Identities=26% Similarity=0.352 Sum_probs=18.9
Q ss_pred CcEEEee---CCCceEEEEcccCCCCCEEE
Q 018186 92 KMAIQKV---DVGERGLVALKNIRKGEKLL 118 (359)
Q Consensus 92 ~v~i~~~---~~~GrGl~At~~I~~ge~ll 118 (359)
++.+... .+...-++|+++|++||.|.
T Consensus 129 n~~~~~~~~~~~~~~~~~a~r~I~~GeEi~ 158 (162)
T PF00856_consen 129 NCEVSFDFDGDGGCLVVRATRDIKKGEEIF 158 (162)
T ss_dssp SEEEEEEEETTTTEEEEEESS-B-TTSBEE
T ss_pred ccceeeEeecccceEEEEECCccCCCCEEE
Confidence 5666554 46678889999999999885
Done!