Query 015859
Match_columns 399
No_of_seqs 80 out of 82
Neff 3.9
Searched_HMMs 46136
Date Fri Mar 29 01:33:48 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/015859.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/015859hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 smart00317 SET SET (Su(var)3-9 98.8 2.3E-08 4.9E-13 80.4 8.2 32 167-198 9-40 (116)
2 KOG1079 Transcriptional repres 98.7 3.7E-08 8E-13 105.6 8.4 125 156-388 596-728 (739)
3 KOG1080 Histone H3 (Lys4) meth 96.1 0.02 4.2E-07 65.5 8.8 68 156-228 868-937 (1005)
4 KOG1082 Histone H3 (Lys9) meth 94.0 0.14 3.1E-06 52.0 6.8 148 152-377 174-322 (364)
5 PF00856 SET: SET domain; Int 92.6 0.097 2.1E-06 43.3 2.7 27 169-195 1-27 (162)
6 KOG2461 Transcription factor B 92.0 0.18 3.8E-06 52.5 4.4 63 155-227 29-95 (396)
7 KOG1085 Predicted methyltransf 88.1 0.35 7.6E-06 49.2 2.6 56 136-193 230-291 (392)
8 KOG4442 Clathrin coat binding 81.6 6.1 0.00013 44.2 8.5 59 168-228 130-191 (729)
9 COG2940 Proteins containing SE 71.7 1.4 2.9E-05 46.4 0.2 29 354-382 427-456 (480)
10 PF00856 SET: SET domain; Int 64.4 4.5 9.7E-05 33.3 1.9 21 355-375 141-162 (162)
11 KOG1083 Putative transcription 45.1 13 0.00029 43.6 2.0 23 356-378 1274-1297(1306)
12 KOG0404 Thioredoxin reductase 44.5 8.7 0.00019 38.6 0.5 51 112-178 242-292 (322)
13 PF14451 Ub-Mut7C: Mut7-C ubiq 37.2 23 0.0005 29.3 1.8 26 162-188 43-75 (81)
14 KOG3192 Mitochondrial J-type c 33.8 1.4E+02 0.003 28.3 6.5 36 56-91 106-141 (168)
15 PF08638 Med14: Mediator compl 30.2 3.1E+02 0.0068 25.9 8.4 85 4-96 16-108 (195)
16 COG1791 Uncharacterized conser 28.9 44 0.00096 31.9 2.5 26 168-193 103-137 (181)
17 KOG1337 N-methyltransferase [G 28.0 41 0.00089 35.4 2.3 21 356-376 257-278 (472)
18 PF02311 AraC_binding: AraC-li 26.6 41 0.00088 27.0 1.6 28 167-194 30-61 (136)
19 PF08666 SAF: SAF domain; Int 25.0 49 0.0011 24.5 1.7 13 358-370 3-16 (63)
20 COG1485 Predicted ATPase [Gene 22.8 55 0.0012 34.4 2.1 41 355-395 160-228 (367)
21 KOG3710 EGL-Nine (EGLN) protei 22.8 78 0.0017 31.8 3.0 75 153-243 118-200 (280)
22 PF08443 RimK: RimK-like ATP-g 21.4 59 0.0013 29.7 1.8 41 133-176 18-60 (190)
No 1
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=98.80 E-value=2.3e-08 Score=80.38 Aligned_cols=32 Identities=22% Similarity=0.250 Sum_probs=28.7
Q ss_pred CCCceEEEeEEecCCcEEEEecCeeecccccc
Q 015859 167 EAGQGLFLCGEANVGAVIAIYPGIIYSPAYYR 198 (399)
Q Consensus 167 ~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~ 198 (399)
.+|.|||..-.+++|++|+-|+|.+..+....
T Consensus 9 ~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~ 40 (116)
T smart00317 9 GKGWGVRATEDIPKGEFIGEYVGEIITSEEAE 40 (116)
T ss_pred CCcEEEEECCccCCCCEEEEEEeEEECHHHHH
Confidence 79999999999999999999999999864443
No 2
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=98.70 E-value=3.7e-08 Score=105.57 Aligned_cols=125 Identities=29% Similarity=0.337 Sum_probs=91.5
Q ss_pred EeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCC-CeeeeccCCeEEecCCCCCCCCCc
Q 015859 156 LDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQN-PYLITRYDGTVINAQPWGSGWDTR 234 (399)
Q Consensus 156 l~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N-~YLi~r~DG~vIDg~pwg~gg~sr 234 (399)
+-+.||.| ||-|+|++-.|.++.+|+.|-|.+-+..-.-.- |- ..|..+ +|||-.+++.|||++-.|
T Consensus 596 ~llapSdV----aGwGlFlKe~v~KnefisEY~GE~IS~dEADrR-Gk-iYDr~~cSflFnln~dyviDs~rkG------ 663 (739)
T KOG1079|consen 596 VLLAPSDV----AGWGLFLKESVSKNEFISEYTGEIISHDEADRR-GK-IYDRYMCSFLFNLNNDYVIDSTRKG------ 663 (739)
T ss_pred eeechhhc----cccceeeccccCCCceeeeecceeccchhhhhc-cc-ccccccceeeeeccccceEeeeeec------
Confidence 77899988 899999999999999999999998773211100 00 013334 799999999999998864
Q ss_pred cccCCCCcCcCCCCCCCcCcCchhHHHhhcCCCCCCCCCCCcccccccChhhhhhhhcCCCCCCCCCeEEEeecCCCCcc
Q 015859 235 ELWDGLTLPEIMPNSKGAEKGSDQFWKLLSKPMDNKRGGSGSEMLERRNPLALAHFANHPAKGMVPNVMICPYDFPLTEK 314 (399)
Q Consensus 235 ~~~~g~~~~~~~~~~~~a~~~~d~~w~~ls~Pl~~s~~~~~~~~le~~NPLAlGH~aNHpp~g~~pNV~~~~yDfP~~~~ 314 (399)
.++|||||-+ .||..+.-+=++.
T Consensus 664 ---------------------------------------------------nk~rFANHS~---nPNCYAkvm~V~G--- 686 (739)
T KOG1079|consen 664 ---------------------------------------------------NKIRFANHSF---NPNCYAKVMMVAG--- 686 (739)
T ss_pred ---------------------------------------------------chhhhccCCC---CCCcEEEEEEecC---
Confidence 1368999987 5887766553332
Q ss_pred cccccCCccccCCchhhhhhhccccceecCCCCCCCCCceEEEEEEEeccC-CCceeeeecccCCCC------CCCCccc
Q 015859 315 DMRPYIPNISFGNAEEVNMRRFGSFWFKWGSGSGSSTPVLKTLALVATRAI-CDEEVLLNYRLSNSK------RRPVWYS 387 (399)
Q Consensus 315 ~LR~YIPNv~~~~~~~~~m~r~g~~w~~~~~~~~~~~~vlr~vVLVAtRdI-~dEELflNYRls~~~------~~P~WY~ 387 (399)
-+=+.+.|.|+| .|||||.+|||++.. .-++||.
T Consensus 687 ---------------------------------------dhRIGifAkRaIeagEELffDYrYs~~~~~k~~~~~~~s~k 727 (739)
T KOG1079|consen 687 ---------------------------------------DHRIGIFAKRAIEAGEELFFDYRYSPEHALKFVGIERESYK 727 (739)
T ss_pred ---------------------------------------CcceeeeehhhcccCceeeeeeccCccccccccccCccccc
Confidence 122447899999 889999999999865 2346665
Q ss_pred c
Q 015859 388 P 388 (399)
Q Consensus 388 p 388 (399)
+
T Consensus 728 ~ 728 (739)
T KOG1079|consen 728 V 728 (739)
T ss_pred c
Confidence 4
No 3
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=96.10 E-value=0.02 Score=65.46 Aligned_cols=68 Identities=26% Similarity=0.409 Sum_probs=51.3
Q ss_pred EeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccc--cccccCCCCCCCCCCCeeeeccCCeEEecCCCC
Q 015859 156 LDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPA--YYRYIPGYPRVDAQNPYLITRYDGTVINAQPWG 228 (399)
Q Consensus 156 l~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~--~~~~ipgyP~vd~~N~YLi~r~DG~vIDg~pwg 228 (399)
|.-.+|.| -|-|||..+.+.+|.-|-=|=|-++.+. =+|-+ .|-+.-..-.|||+-=|++||||...|
T Consensus 868 ~~F~~s~i----H~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~-~Y~~~gi~~sYlfrid~~~ViDAtk~g 937 (1005)
T KOG1080|consen 868 VKFGRSGI----HGWGLFAMENIAAGDMVIEYRGELVRSSIADLREA-RYERMGIGDSYLFRIDDEVVVDATKKG 937 (1005)
T ss_pred hccccccc----cccceeeccCccccceEEEeeceehhhhHHHHHHH-HHhccCcccceeeecccceEEeccccC
Confidence 66778888 4899999999999999999999999841 12221 222223356799987789999999864
No 4
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=93.95 E-value=0.14 Score=51.98 Aligned_cols=148 Identities=18% Similarity=0.217 Sum_probs=88.4
Q ss_pred hCeEEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCCCeeeeccCCeEEecCCCCCCC
Q 015859 152 IGYTLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQNPYLITRYDGTVINAQPWGSGW 231 (399)
Q Consensus 152 lGfsl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N~YLi~r~DG~vIDg~pwg~gg 231 (399)
+=|.|.|-.+ + ..|=||--.=.+++|+-|+=|.|-|-+-.-.+... ..+.|++...|++.+.-..|..
T Consensus 174 ~~~~leIfrt---~-~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~------~~~~~~~~~~~~~~~~~~~~~~-- 241 (364)
T KOG1082|consen 174 LQFHLEVFRT---P-EKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRT------HLREYLDDDCDAYSIADREWVD-- 241 (364)
T ss_pred cccceEEEec---C-CceeeecccccccCCCeeEEEeeEecChHHhhhcc------ccccccccccccchhhhccccc--
Confidence 4455555555 2 36667666558999999999999998854433331 2456766555655555455521
Q ss_pred CCccccCCCCcCcCCCCCCCcCcCchhHHHhhcCCCCCCCCCCCcccccccChhhhhhhhcCCCCCCCCCeEEEeecCCC
Q 015859 232 DTRELWDGLTLPEIMPNSKGAEKGSDQFWKLLSKPMDNKRGGSGSEMLERRNPLALAHFANHPAKGMVPNVMICPYDFPL 311 (399)
Q Consensus 232 ~sr~~~~g~~~~~~~~~~~~a~~~~d~~w~~ls~Pl~~s~~~~~~~~le~~NPLAlGH~aNHpp~g~~pNV~~~~yDfP~ 311 (399)
. -+.+ ......+.+...+.+ ..+...+=-.+|+|+||- -+|||+++.+-.
T Consensus 242 -~--~~~~------------~~~~~~~~~~~~~~~----------~~ida~~~GNv~RfinHS---C~PN~~~~~v~~-- 291 (364)
T KOG1082|consen 242 -E--SPVG------------NTFVAPSLPGGPGRE----------LLIDAKPHGNVARFINHS---CSPNLLYQAVFQ-- 291 (364)
T ss_pred -c--cccc------------ccccccccccCCCcc----------eEEchhhcccccccccCC---CCccceeeeeee--
Confidence 1 0111 001111111111110 223455556788999997 579999987711
Q ss_pred CcccccccCCccccCCchhhhhhhccccceecCCCCCCCCCceEEEEEEEeccC-CCceeeeecccC
Q 015859 312 TEKDMRPYIPNISFGNAEEVNMRRFGSFWFKWGSGSGSSTPVLKTLALVATRAI-CDEEVLLNYRLS 377 (399)
Q Consensus 312 ~~~~LR~YIPNv~~~~~~~~~m~r~g~~w~~~~~~~~~~~~vlr~vVLVAtRdI-~dEELflNYRls 377 (399)
+ + ..+.+--++|.|+++| ..|||=++|-.+
T Consensus 292 --------------~--~--------------------~~~~~~~i~ffa~~~I~p~~ELT~dYg~~ 322 (364)
T KOG1082|consen 292 --------------D--E--------------------FVLLYLRIGFFALRDISPGEELTLDYGKA 322 (364)
T ss_pred --------------c--C--------------------CccchheeeeeeccccCCCcccchhhccc
Confidence 1 0 1233556789999999 889999999865
No 5
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=92.59 E-value=0.097 Score=43.27 Aligned_cols=27 Identities=37% Similarity=0.605 Sum_probs=22.1
Q ss_pred CceEEEeEEecCCcEEEEecCeeeccc
Q 015859 169 GQGLFLCGEANVGAVIAIYPGIIYSPA 195 (399)
Q Consensus 169 G~GVFl~G~v~~GtVVa~YPGvVY~p~ 195 (399)
|+|||.+-.+++|+||.++.+.+..+.
T Consensus 1 GrGl~At~dI~~Ge~I~~p~~~~~~~~ 27 (162)
T PF00856_consen 1 GRGLFATRDIKAGEVILIPRPAILTPD 27 (162)
T ss_dssp SEEEEESS-B-TTEEEEEESEEEEEHH
T ss_pred CEEEEECccCCCCCEEEEECcceEEeh
Confidence 899999999999999988888887753
No 6
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=92.03 E-value=0.18 Score=52.54 Aligned_cols=63 Identities=24% Similarity=0.399 Sum_probs=45.3
Q ss_pred EEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCCCeeeeccC-C---eEEecCCC
Q 015859 155 TLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQNPYLITRYD-G---TVINAQPW 227 (399)
Q Consensus 155 sl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N~YLi~r~D-G---~vIDg~pw 227 (399)
.|.+++|+|+ .+|-||+=+..+++|+.-+=|=|-+ |+-+..-.++|.|+..-|. + .+|||.+-
T Consensus 29 ~l~i~~Ssv~--~~~lgV~s~~~i~~G~~FGP~~G~~--------~~~~~~~~~n~~y~W~I~~~d~~~~~iDg~d~ 95 (396)
T KOG2461|consen 29 ELRIKPSSVP--VTGLGVWSNASILPGTSFGPFEGEI--------IASIDSKSANNRYMWEIFSSDNGYEYIDGTDE 95 (396)
T ss_pred ceEeeccccC--CccccccccccccCcccccCccCcc--------ccccccccccCcceEEEEeCCCceEEeccCCh
Confidence 3789999999 8999999999999988887777776 2222222345677665443 2 68888874
No 7
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=88.09 E-value=0.35 Score=49.17 Aligned_cols=56 Identities=20% Similarity=0.289 Sum_probs=34.9
Q ss_pred cccccHHHHHH----HHHHHh--CeEEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeec
Q 015859 136 TRQLTRTELSQ----RLKDAI--GYTLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYS 193 (399)
Q Consensus 136 t~~l~~~~vs~----~l~~~l--Gfsl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~ 193 (399)
..+-+..+|++ .|.+.+ |=+-.++.--+. ..|+||-.+-....|..|--|=|..-.
T Consensus 230 S~RKtk~~i~~E~~~~l~~~vl~g~~egl~~~~~d--gKGRGv~a~~~F~rgdFVVEY~Gdlie 291 (392)
T KOG1085|consen 230 SNRKTKKQISDEAKHALRDTVLKGTNEGLLEVYKD--GKGRGVRAKVNFERGDFVVEYRGDLIE 291 (392)
T ss_pred cchhhHHHhhHHHHHHHHHHHHhccccceeEEeec--cccceeEeecccccCceEEEEecceee
Confidence 33334445544 444443 333333333444 589999888888999999999887644
No 8
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=81.64 E-value=6.1 Score=44.22 Aligned_cols=59 Identities=22% Similarity=0.308 Sum_probs=38.6
Q ss_pred CCceEEEeEEecCCcEEEEecCeeeccccccc-cCCCCCCCCCC--CeeeeccCCeEEecCCCC
Q 015859 168 AGQGLFLCGEANVGAVIAIYPGIIYSPAYYRY-IPGYPRVDAQN--PYLITRYDGTVINAQPWG 228 (399)
Q Consensus 168 AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~-ipgyP~vd~~N--~YLi~r~DG~vIDg~pwg 228 (399)
.|-||=..-.+++|+.|-=|=|=|-+-.-|+. +.-|-+ ..| -|.|+.--|.+|||.-.|
T Consensus 130 KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~--d~~kh~Yfm~L~~~e~IDAT~KG 191 (729)
T KOG4442|consen 130 KGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAK--DGIKHYYFMALQGGEYIDATKKG 191 (729)
T ss_pred cccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHh--cCCceEEEEEecCCceecccccC
Confidence 45555555589999999999999988543331 111110 122 366666689999999975
No 9
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=71.72 E-value=1.4 Score=46.39 Aligned_cols=29 Identities=28% Similarity=0.407 Sum_probs=24.1
Q ss_pred eEEEEEEEeccC-CCceeeeecccCCCCCC
Q 015859 354 LKTLALVATRAI-CDEEVLLNYRLSNSKRR 382 (399)
Q Consensus 354 lr~vVLVAtRdI-~dEELflNYRls~~~~~ 382 (399)
++-++..|.||| .+|||.++|-.......
T Consensus 427 ~~~~~~~~~rDI~~geEl~~dy~~~~~~~~ 456 (480)
T COG2940 427 IFKISIYAIRDIKAGEELTYDYGPSLEDNR 456 (480)
T ss_pred cceeeecccccchhhhhhccccccccccch
Confidence 667888999999 99999999987755543
No 10
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=64.35 E-value=4.5 Score=33.32 Aligned_cols=21 Identities=38% Similarity=0.567 Sum_probs=17.9
Q ss_pred EEEEEEEeccC-CCceeeeecc
Q 015859 355 KTLALVATRAI-CDEEVLLNYR 375 (399)
Q Consensus 355 r~vVLVAtRdI-~dEELflNYR 375 (399)
.+++++|+|+| .|||||++|.
T Consensus 141 ~~~~~~a~r~I~~GeEi~isYG 162 (162)
T PF00856_consen 141 GCLVVRATRDIKKGEEIFISYG 162 (162)
T ss_dssp TEEEEEESS-B-TTSBEEEEST
T ss_pred ceEEEEECCccCCCCEEEEEEC
Confidence 46889999999 8999999994
No 11
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=45.07 E-value=13 Score=43.64 Aligned_cols=23 Identities=26% Similarity=0.476 Sum_probs=19.8
Q ss_pred EEEEEEeccC-CCceeeeecccCC
Q 015859 356 TLALVATRAI-CDEEVLLNYRLSN 378 (399)
Q Consensus 356 ~vVLVAtRdI-~dEELflNYRls~ 378 (399)
-|+|+|+||| +||||..+|-+.-
T Consensus 1274 Rv~L~A~rDi~kGEELtYDYN~ks 1297 (1306)
T KOG1083|consen 1274 RVGLFALRDLPKGEELTYDYNFKS 1297 (1306)
T ss_pred eeeeeecCCCCCCceEEEeccccc
Confidence 3678999999 9999999987653
No 12
>KOG0404 consensus Thioredoxin reductase [Posttranslational modification, protein turnover, chaperones]
Probab=44.49 E-value=8.7 Score=38.65 Aligned_cols=51 Identities=29% Similarity=0.486 Sum_probs=36.1
Q ss_pred CCCCCcccccCCCCCCCCCCCCcccccccHHHHHHHHHHHhCeEEeecCCCCCcCCCCceEEEeEEe
Q 015859 112 PRRSGLSFAVGPTARPTDSPVVPQTRQLTRTELSQRLKDAIGYTLDLKPSQIPHEEAGQGLFLCGEA 178 (399)
Q Consensus 112 ~r~sgl~fa~~~~~~~~~~~~~~~t~~l~~~~vs~~l~~~lGfsl~vk~SsIph~~AG~GVFl~G~v 178 (399)
-..|||.|+.|++ |+|+.|+- +| =.|.-||-+.+-.++. -.=-|||..|-|
T Consensus 242 l~v~GlFf~IGH~---------Pat~~l~g-qv---e~d~~GYi~t~pgts~---TsvpG~FAAGDV 292 (322)
T KOG0404|consen 242 LPVSGLFFAIGHS---------PATKFLKG-QV---ELDEDGYIVTRPGTSL---TSVPGVFAAGDV 292 (322)
T ss_pred cccceeEEEecCC---------chhhHhcC-ce---eeccCceEEeccCccc---ccccceeecccc
Confidence 4569999999995 88988876 55 5688899887744432 122288887644
No 13
>PF14451 Ub-Mut7C: Mut7-C ubiquitin
Probab=37.22 E-value=23 Score=29.31 Aligned_cols=26 Identities=35% Similarity=0.747 Sum_probs=21.6
Q ss_pred CCCcCCCCceEEEeE-------EecCCcEEEEec
Q 015859 162 QIPHEEAGQGLFLCG-------EANVGAVIAIYP 188 (399)
Q Consensus 162 sIph~~AG~GVFl~G-------~v~~GtVVa~YP 188 (399)
.|||.+-| -|+|+| .+..|+.|++||
T Consensus 43 GVP~tEV~-~i~vNG~~v~~~~~~~~Gd~v~V~P 75 (81)
T PF14451_consen 43 GVPHTEVG-LILVNGRPVDFDYRLKDGDRVAVYP 75 (81)
T ss_pred CCChHHeE-EEEECCEECCCcccCCCCCEEEEEe
Confidence 47888888 577887 678999999998
No 14
>KOG3192 consensus Mitochondrial J-type chaperone [Posttranslational modification, protein turnover, chaperones]
Probab=33.83 E-value=1.4e+02 Score=28.35 Aligned_cols=36 Identities=17% Similarity=0.243 Sum_probs=29.9
Q ss_pred HHHHHHhccCCcHHHHHHHHHHHHHHHHHHHHhhhh
Q 015859 56 EEIIDMAGKASLSDQQQQVLDNIHSQIKRFCLSMDE 91 (399)
Q Consensus 56 eeii~~a~~~~~~~qq~qvq~nih~qi~~~c~~~~~ 91 (399)
|+|-+|-....+..-+.|+|+-|..++..+-++|.+
T Consensus 106 E~IS~~~De~~l~~lk~q~q~ri~q~~~qlge~~es 141 (168)
T KOG3192|consen 106 EAISEMDDEEDLKQLKSQNQERIAQCKQQLGEAFES 141 (168)
T ss_pred HHHHhccCcHHHHHHHHHHHHHHHHHHHHHHHHHhh
Confidence 566677777888888999999999999998877754
No 15
>PF08638 Med14: Mediator complex subunit MED14; InterPro: IPR013947 The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. Saccharomyces cerevisiae (Baker's yeast) RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes, and sporulation. It is required for glucose repression, HO repression, RME1 repression and sporulation. This subunit is also found in higher eukaryotes and MED14 is the agreed unified nomenclature for this subunit. ; GO: 0001104 RNA polymerase II transcription cofactor activity, 0006357 regulation of transcription from RNA polymerase II promoter, 0016592 mediator complex
Probab=30.21 E-value=3.1e+02 Score=25.88 Aligned_cols=85 Identities=21% Similarity=0.360 Sum_probs=60.0
Q ss_pred HHHHHHHHHHHHhc----CCCCCCCccchhhhhcch----hhhhhhhhhhccCCcchhcHHHHHHHhccCCcHHHHHHHH
Q 015859 4 LFQKFQEAVKTLAK----SPTFARDPRQLQFEADMN----RLFLYTSYNRLGRDAEEADAEEIIDMAGKASLSDQQQQVL 75 (399)
Q Consensus 4 ~f~~~q~~v~~la~----~~~~~~~~r~~q~e~D~~----rlf~~tsy~~l~~~~~~~d~eeii~~a~~~~~~~qq~qvq 75 (399)
-|+.+++.+++|+. .+...|--+=+||=.... ||.+++-+.+- + ++...+|+|.. +-++|.+..
T Consensus 16 sy~eL~~l~e~l~~~~~~~~d~~rK~~ll~~~~~~R~~fiKLlvL~kWs~~---~--~~v~k~idl~~---~l~~q~~~~ 87 (195)
T PF08638_consen 16 SYNELQQLIETLPSDDTSQSDSERKRRLLQFAQSTRQRFIKLLVLVKWSRK---A--KDVSKCIDLLN---FLRQQNMCF 87 (195)
T ss_pred HHHHHHHHHHHccccCCccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh---h--hHHHHHHHHHH---HHHHHHHHH
Confidence 57889999999998 443334444455555544 56667776654 3 25677777765 347888888
Q ss_pred HHHHHHHHHHHHhhhhhhcCC
Q 015859 76 DNIHSQIKRFCLSMDEILLLP 96 (399)
Q Consensus 76 ~nih~qi~~~c~~~~~il~~~ 96 (399)
++..+++..++..|..+=+|.
T Consensus 88 ~~~~~~L~~~~~~l~~Ar~p~ 108 (195)
T PF08638_consen 88 EDAADRLFRLKEQLQNARLPN 108 (195)
T ss_pred HHHHHHHHHHHHHhhhccCCC
Confidence 999999999998888877765
No 16
>COG1791 Uncharacterized conserved protein, contains double-stranded beta-helix domain [Function unknown]
Probab=28.91 E-value=44 Score=31.93 Aligned_cols=26 Identities=38% Similarity=0.689 Sum_probs=21.6
Q ss_pred CCceEEE-e---E-----EecCCcEEEEecCeeec
Q 015859 168 AGQGLFL-C---G-----EANVGAVIAIYPGIIYS 193 (399)
Q Consensus 168 AG~GVFl-~---G-----~v~~GtVVa~YPGvVY~ 193 (399)
||.|+|. . | .|.+|++|++-||+=+.
T Consensus 103 aG~GiF~v~~~d~~~~~i~c~~gDLI~vP~gi~Hw 137 (181)
T COG1791 103 AGEGIFDVHSPDGKVYQIRCEKGDLISVPPGIYHW 137 (181)
T ss_pred ecceEEEEECCCCcEEEEEEccCCEEecCCCceEE
Confidence 7999998 2 2 68999999999998654
No 17
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=27.96 E-value=41 Score=35.38 Aligned_cols=21 Identities=43% Similarity=0.697 Sum_probs=19.5
Q ss_pred EEEEEEeccC-CCceeeeeccc
Q 015859 356 TLALVATRAI-CDEEVLLNYRL 376 (399)
Q Consensus 356 ~vVLVAtRdI-~dEELflNYRl 376 (399)
++.++++++| +|||+|.||+-
T Consensus 257 ~~~l~~~~~v~~geevfi~YG~ 278 (472)
T KOG1337|consen 257 AVELVAERDVSAGEEVFINYGP 278 (472)
T ss_pred cEEEEEeeeecCCCeEEEecCC
Confidence 8899999999 99999999984
No 18
>PF02311 AraC_binding: AraC-like ligand binding domain; InterPro: IPR003313 This entry defines the arabinose-binding and dimerisation domain of the bacterial gene regulatory protein AraC. The crystal structure of the arabinose-binding and dimerization domain of the Escherichia coli gene regulatory protein AraC was determined in the presence and absence of L-arabinose. The arabinose-bound molecule shows that the protein adopts an unusual fold, binding sugar within a beta barrel and completely burying the arabinose with the amino-terminal arm of the protein. Dimer contacts in the presence of arabinose are mediated by an antiparallel coiled-coil. In the uncomplexed protein, the amino-terminal arm is disordered, uncovering the sugar-binding pocket and allowing it to serve as an oligomerization interface.; GO: 0006355 regulation of transcription, DNA-dependent; PDB: 1XJA_B 2ARA_A 2AAC_B 2ARC_A.
Probab=26.64 E-value=41 Score=27.00 Aligned_cols=28 Identities=29% Similarity=0.336 Sum_probs=18.5
Q ss_pred CCCceEEE-eE---EecCCcEEEEecCeeecc
Q 015859 167 EAGQGLFL-CG---EANVGAVIAIYPGIIYSP 194 (399)
Q Consensus 167 ~AG~GVFl-~G---~v~~GtVVa~YPGvVY~p 194 (399)
..|+|.|. +| .+.+|+++-+-||.++.-
T Consensus 30 ~~G~~~~~~~~~~~~l~~g~~~li~p~~~H~~ 61 (136)
T PF02311_consen 30 LSGEGTLHIDGQEYPLKPGDLFLIPPGQPHSY 61 (136)
T ss_dssp EEE-EEEEETTEEEEE-TT-EEEE-TTS-EEE
T ss_pred eCCEEEEEECCEEEEEECCEEEEecCCccEEE
Confidence 46778777 33 899999999999999984
No 19
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=25.01 E-value=49 Score=24.46 Aligned_cols=13 Identities=31% Similarity=0.473 Sum_probs=10.3
Q ss_pred EEEEeccC-CCcee
Q 015859 358 ALVATRAI-CDEEV 370 (399)
Q Consensus 358 VLVAtRdI-~dEEL 370 (399)
|+||+||| .|+.|
T Consensus 3 vvVA~~di~~G~~i 16 (63)
T PF08666_consen 3 VVVAARDIPAGTVI 16 (63)
T ss_dssp EEEESSTB-TT-BE
T ss_pred EEEEeCccCCCCEE
Confidence 68999999 78766
No 20
>COG1485 Predicted ATPase [General function prediction only]
Probab=22.82 E-value=55 Score=34.43 Aligned_cols=41 Identities=34% Similarity=0.655 Sum_probs=32.7
Q ss_pred EEEEEEEeccCCCc---------eeee-------------------ecccCCCCCCCCccccCCHHHHh
Q 015859 355 KTLALVATRAICDE---------EVLL-------------------NYRLSNSKRRPVWYSPVDEEEDR 395 (399)
Q Consensus 355 r~vVLVAtRdI~dE---------ELfl-------------------NYRls~~~~~P~WY~pvD~eEd~ 395 (399)
++|+||||-.+.=+ |.|| +||+-...+-|-|++|.|.|.+.
T Consensus 160 ~GV~lvaTSN~~P~~LY~dGlqR~~FLP~I~li~~~~~v~~vD~~~DYR~r~l~~a~~y~~Pl~~~~~~ 228 (367)
T COG1485 160 RGVVLVATSNTAPDNLYKDGLQRERFLPAIDLIKSHFEVVNVDGPVDYRLRKLEQAPVYLTPLDAEAEA 228 (367)
T ss_pred CCcEEEEeCCCChHHhcccchhHHhhHHHHHHHHHheEEEEecCCccccccccccCceeecCCcHHHHH
Confidence 68889999877333 3343 89999999999999999998764
No 21
>KOG3710 consensus EGL-Nine (EGLN) protein [Signal transduction mechanisms]
Probab=22.80 E-value=78 Score=31.85 Aligned_cols=75 Identities=31% Similarity=0.363 Sum_probs=41.1
Q ss_pred CeEEeecCCCCCcCCCCceEEEeEEecCCcEEEEecCeeeccccccccCCCCCCCCCCCeeeeccCCeEEec-----CCC
Q 015859 153 GYTLDLKPSQIPHEEAGQGLFLCGEANVGAVIAIYPGIIYSPAYYRYIPGYPRVDAQNPYLITRYDGTVINA-----QPW 227 (399)
Q Consensus 153 Gfsl~vk~SsIph~~AG~GVFl~G~v~~GtVVa~YPGvVY~p~~~~~ipgyP~vd~~N~YLi~r~DG~vIDg-----~pw 227 (399)
|+-...-.|.|-|...--|-++=| ..-+.||.|||-= --|+++. +||- -||-.|-. +.|
T Consensus 118 ~~L~s~~d~~i~h~~~r~~~~~~g--RtkAMVAcYPGNG--tgYVrHV--------DNP~----gDGRcITcIYYlNqNW 181 (280)
T KOG3710|consen 118 MLLPSPIDSVILHCNGRLGSYIIG--RTKAMVACYPGNG--TGYVRHV--------DNPH----GDGRCITCIYYLNQNW 181 (280)
T ss_pred eeecccchhhhhhhcccccccccc--ceeEEEEEecCCC--ceeeEec--------cCCC----CCceEEEEEEEcccCc
Confidence 333344444454544444445544 5668999999852 1233444 6664 34433311 223
Q ss_pred ---CCCCCCccccCCCCcC
Q 015859 228 ---GSGWDTRELWDGLTLP 243 (399)
Q Consensus 228 ---g~gg~sr~~~~g~~~~ 243 (399)
-.||+=|..|.|.+..
T Consensus 182 D~kv~Gg~Lri~pe~~~~~ 200 (280)
T KOG3710|consen 182 DVKVHGGILRIFPEGSTTF 200 (280)
T ss_pred ceeeccceeEeccCCCCcc
Confidence 2468888889887663
No 22
>PF08443 RimK: RimK-like ATP-grasp domain; InterPro: IPR013651 This ATP-grasp domain is found in the ribosomal S6 modification enzyme RimK. It has an unusual nucleotide-binding fold referred to as palmate, or ATP-grasp fold. This domain is found in a number of enzymes of known structure as well as in urea amidolyase, tubulin-tyrosine ligase, and three enzymes of purine biosynthesis.; PDB: 1UC8_B 1UC9_A.
Probab=21.36 E-value=59 Score=29.65 Aligned_cols=41 Identities=29% Similarity=0.596 Sum_probs=19.6
Q ss_pred CcccccccHHHHHHHHHHHh-CeEEeecCCCCCcCCCCceEEE-eE
Q 015859 133 VPQTRQLTRTELSQRLKDAI-GYTLDLKPSQIPHEEAGQGLFL-CG 176 (399)
Q Consensus 133 ~~~t~~l~~~~vs~~l~~~l-Gfsl~vk~SsIph~~AG~GVFl-~G 176 (399)
+|+|.-....+-.+.+.+.+ ||-+.+||+.= ..|.|||+ +.
T Consensus 18 vP~t~~~~~~~~~~~~~~~~~~~p~ViKp~~g---~~G~gV~~i~~ 60 (190)
T PF08443_consen 18 VPETRVTNSPEEAKEFIEELGGFPVVIKPLRG---SSGRGVFLINS 60 (190)
T ss_dssp ---EEEESSHHHHHHHHHHH--SSEEEE-SB----------EEEES
T ss_pred CCCEEEECCHHHHHHHHHHhcCCCEEEeeCCC---CCCCEEEEecC
Confidence 56766554444445566666 99999999753 67999998 54
Done!