Query 006089
Match_columns 662
No_of_seqs 415 out of 1788
Neff 5.9
Searched_HMMs 46136
Date Thu Mar 28 18:10:54 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/006089.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/006089hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 smart00466 SRA SET and RING fi 100.0 5.8E-58 1.3E-62 436.6 16.4 153 196-352 2-155 (155)
2 KOG1082 Histone H3 (Lys9) meth 100.0 8.5E-55 1.8E-59 470.7 19.5 290 370-662 55-354 (364)
3 PF02182 SAD_SRA: SAD/SRA doma 100.0 8.1E-54 1.8E-58 410.5 13.2 153 196-352 1-155 (155)
4 KOG4442 Clathrin coat binding 100.0 1E-40 2.2E-45 369.4 13.3 162 462-662 93-260 (729)
5 KOG1141 Predicted histone meth 100.0 1.6E-40 3.4E-45 368.1 9.3 153 370-524 668-836 (1262)
6 KOG1080 Histone H3 (Lys4) meth 99.9 1.9E-27 4.2E-32 279.0 10.9 134 487-661 866-1004(1005)
7 KOG1079 Transcriptional repres 99.9 1.3E-26 2.9E-31 256.1 10.7 143 462-638 558-714 (739)
8 smart00317 SET SET (Su(var)3-9 99.9 7.2E-22 1.6E-26 176.5 12.8 111 488-632 1-116 (116)
9 KOG1083 Putative transcription 99.8 1.6E-22 3.5E-27 231.0 2.3 127 476-636 1166-1297(1306)
10 PF05033 Pre-SET: Pre-SET moti 99.8 2.4E-21 5.2E-26 173.7 6.1 100 379-479 1-103 (103)
11 smart00468 PreSET N-terminal t 99.8 9.3E-21 2E-25 168.8 8.3 94 377-471 1-98 (98)
12 KOG1085 Predicted methyltransf 99.6 4.5E-16 9.7E-21 158.6 8.9 123 483-636 252-380 (392)
13 COG2940 Proteins containing SE 99.6 3E-16 6.6E-21 176.3 6.0 168 466-662 311-480 (480)
14 PF00856 SET: SET domain; Int 99.3 4.3E-12 9.2E-17 118.1 9.0 55 575-633 108-162 (162)
15 KOG1141 Predicted histone meth 98.9 1.3E-09 2.7E-14 123.8 6.5 283 372-661 872-1261(1262)
16 KOG1081 Transcription factor N 98.8 9.3E-10 2E-14 123.1 2.1 150 461-662 287-437 (463)
17 KOG2589 Histone tail methylase 98.3 4.7E-07 1E-11 96.2 5.2 114 496-654 136-252 (453)
18 KOG2461 Transcription factor B 97.7 3.3E-05 7.2E-10 85.1 5.2 120 478-637 20-147 (396)
19 smart00508 PostSET Cysteine-ri 96.2 0.0022 4.7E-08 43.8 1.3 16 647-662 2-17 (26)
20 COG3440 Predicted restriction 94.8 0.00081 1.8E-08 70.5 -7.3 142 199-349 6-149 (301)
21 smart00570 AWS associated with 94.3 0.02 4.4E-07 45.4 1.3 25 460-484 25-49 (51)
22 KOG2084 Predicted histone tail 90.9 0.26 5.5E-06 54.8 4.5 44 588-639 208-252 (482)
23 KOG1337 N-methyltransferase [G 70.2 3.7 7.9E-05 46.8 3.3 41 588-635 239-279 (472)
24 PF03638 TCR: Tesmin/TSO1-like 50.8 11 0.00023 29.0 1.7 37 422-480 3-40 (42)
25 KOG3813 Uncharacterized conser 49.2 9.2 0.0002 43.6 1.6 42 423-480 308-350 (640)
26 KOG1171 Metallothionein-like p 43.9 8.3 0.00018 43.0 0.3 37 420-478 215-252 (406)
27 PF12218 End_N_terminal: N ter 33.1 22 0.00048 29.6 1.1 50 290-355 11-60 (67)
28 PF08666 SAF: SAF domain; Int 30.2 25 0.00053 28.0 0.9 16 615-630 3-18 (63)
29 KOG1081 Transcription factor N 24.9 24 0.00052 40.4 -0.0 153 463-659 95-261 (463)
30 KOG2155 Tubulin-tyrosine ligas 23.2 43 0.00093 37.9 1.5 50 584-635 203-252 (631)
No 1
>smart00466 SRA SET and RING finger associated domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533.
Probab=100.00 E-value=5.8e-58 Score=436.65 Aligned_cols=153 Identities=58% Similarity=1.001 Sum_probs=145.9
Q ss_pred CcccccCCCccCCceechhhhhhhhccccCCcCCcccccccCCCCCCCeEEEEEecCCCCCCCCCCCeEEEEcCCCCCCC
Q 006089 196 RKRLGVVPGVEIGDIFFFRMEMCLIGLHSQSMAGIDYMITRSDLDEEPVAVSIISSGGYDDDAEDSDILIYSGQGGNANR 275 (662)
Q Consensus 196 ~k~~G~vpGv~vGd~f~~R~e~~~~GlH~~~~~GI~~~~~~~~~~~~~~A~SIV~SGgy~dd~D~gd~l~YtG~GG~~~~ 275 (662)
.|+||+||||+|||+|++|+||+++|||+++|+||||++.+ +++++|+|||+||||+||+|+||+|+|||+||++.
T Consensus 2 ~~~~G~vpGv~vGd~f~~R~el~~~GlH~~~~~GI~~~~~~---~~~~~A~SIV~SggYedd~D~gd~liYtG~gg~~~- 77 (155)
T smart00466 2 KHIFGPVPGVEVGDIFFFRVELCLVGLHRPTQAGIDGLTAD---EGEPGATSVVSSGGYEDDTDDGDVLIYTGQGGRDM- 77 (155)
T ss_pred CceEeCCCCccCCCEEcchhHhhhhcccCcccCCccccccc---CCCccEEEEEECCCccCcccCCCEEEEEccCCccC-
Confidence 58899999999999999999999999999999999999864 56799999999999999999999999999999965
Q ss_pred CCCcccCcccchhhHHHHHHHHhCCccEEEeccc-cccCCCCceeeecCceeeeeeEEecCCCCceEEEEEeeecCCC
Q 006089 276 KGEQAADQKLERGNLALERSLRRASEVRVIRGMK-DAINQSSKVYVYDGLYTVQESWTEKGKSGCNIFKYKLVRIPGQ 352 (662)
Q Consensus 276 ~~~~~~DQ~l~~gNlAL~~S~~~~~pVRViRg~~-~~~~~~~~~y~YDGLY~V~~~w~e~g~~G~~v~kfkL~R~pgQ 352 (662)
+++|..||+|++||+||++|+++++|||||||++ ...+.+.++|||||||+|+++|.++|++|+.||||+|+|+|||
T Consensus 78 ~~~~~~dQkl~~gNlAL~~S~~~~~PVRViRg~~~~~~~~p~~gyrYDGLY~V~~~w~e~g~~G~~v~kfkL~R~~gQ 155 (155)
T smart00466 78 THGQPEDQKLERGNLALEASCRKGIPVRVVRGMKGYSKYAPGKGYIYDGLYRIVDYWREVGKSGFLVFKFKLVRIPGQ 155 (155)
T ss_pred CCCCccccEecchhHHHHHHHhcCCceEEEccccccCCCCCCCeEEECcEEEEEEEEEecCCCCcEEEEEEEEeCCCC
Confidence 6789999999999999999999999999999999 5567899999999999999999999999999999999999998
No 2
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=100.00 E-value=8.5e-55 Score=470.72 Aligned_cols=290 Identities=37% Similarity=0.689 Sum_probs=244.7
Q ss_pred CCccccccCCCCCCcCCCCCccccCCCCCCCCCCcEEcceeccCC-CcCCCCCCCCCCCCCCCcCCCCC-cccccccCCC
Q 006089 370 SGRVGLILPDLSSGAEAIPIALINDVDDEKGPAYFTYLTTVKYSK-SFRLTQPSFGCNCYSACGPGNPN-CSCVQKNGGD 447 (662)
Q Consensus 370 ~~r~~~i~~DiS~G~E~~PI~~vN~VD~~~~P~~F~Yi~~~~~~~-~~~~~~~~~gC~C~~~C~~~~~~-C~C~~~n~g~ 447 (662)
..+.+.+.+||+.|.|++||+++|+||++.+ .+|.|++...+.. ......+..+|.|.+.|...... |.|...|++.
T Consensus 55 ~~~~~~~~~d~~~~~e~~~v~~~n~id~~~~-~~f~y~~~~~~~~~~~~~~~~~~~c~C~~~~~~~~~~~C~C~~~n~~~ 133 (364)
T KOG1082|consen 55 KLEAKSELEDIALGSENLPVPLVNRIDEDAP-LYFQYIATEIVDPGELSDCENSTGCRCCSSCSSVLPLTCLCERHNGGL 133 (364)
T ss_pred ccccccccccccCccccCceeeeeeccCCcc-ccceeccccccCccccccCccccCCCccCCCCCCCCccccChHhhCCc
Confidence 4456778999999999999999999998877 8999999988887 44445678899999888763222 9999999999
Q ss_pred CcccCCc---eeecCCCceeecCCCCCCCCCCCCcccccCceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHH
Q 006089 448 FPYTANG---VLVSRKPLIYECGPSCPCNRDCKNRVSQTGLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKF 524 (662)
Q Consensus 448 ~~Y~~~G---~L~~~~~~i~EC~~~C~C~~~C~NRv~Q~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~ 524 (662)
++|+.+| .+...++.+|||++.|+|+.+|.|||+|+|++.+||||+|..+|||||+++.|++|+|||||+||+++..
T Consensus 134 ~~~~~~~~~~~~~~~~~~i~EC~~~C~C~~~C~nRv~q~g~~~~leIfrt~~kGwgvRs~~~I~~G~fvcEyaGe~~t~~ 213 (364)
T KOG1082|consen 134 VAYTCDGDCGTLGKFKEPVFECSVACGCHPDCANRVVQKGLQFHLEVFRTPEKGWGVRTLDPIPAGEFVCEYAGEVLTSE 213 (364)
T ss_pred cccccCCccccccccCccccccccCCCCCCcCcchhhccccccceEEEecCCceeeecccccccCCCeeEEEeeEecChH
Confidence 9999998 7788899999999999999999999999999999999999999999999999999999999999999999
Q ss_pred hHhhhcCCCCCceeeecccccccccccCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEE
Q 006089 525 KARQDGEGSNEDYVFDTTRTYDSFKWNYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIF 604 (662)
Q Consensus 525 e~~~~~~~~~d~Ylfd~~~~~~~~~w~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~ 604 (662)
+++.+. ....|.++....+..+.|++......................++|||+.+||++|||||||.||++++.|+.
T Consensus 214 e~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ida~~~GNv~RfinHSC~PN~~~~~v~~ 291 (364)
T KOG1082|consen 214 EAQRRT--HLREYLDDDCDAYSIADREWVDESPVGNTFVAPSLPGGPGRELLIDAKPHGNVARFINHSCSPNLLYQAVFQ 291 (364)
T ss_pred Hhhhcc--ccccccccccccchhhhccccccccccccccccccccCCCcceEEchhhcccccccccCCCCccceeeeeee
Confidence 987653 246777765555555556665554444433333333445678999999999999999999999999999999
Q ss_pred ecCCCceeEEEEEEeecCCCCcEEEEecCCCCCCC--CCC---CCCCCeEeecCCCCCccccC
Q 006089 605 ENNNESFVHVAFFAMRHVPPMTELTYDYGISKSDG--GNY---EPHRKKKCLCGTLKCRGYFG 662 (662)
Q Consensus 605 d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~~~~~--~~~---~~~~~~~C~CGS~~CrG~lG 662 (662)
++.+..++||+|||+++|+||||||||||..+... ... .......|.||+.+||++++
T Consensus 292 ~~~~~~~~~i~ffa~~~I~p~~ELT~dYg~~~~~~~~~~~~~~~~~~~~~c~c~~~~cr~~~~ 354 (364)
T KOG1082|consen 292 DEFVLLYLRIGFFALRDISPGEELTLDYGKAYKLLVQDGANIYTPVMKKNCNCGLEKCRGLLG 354 (364)
T ss_pred cCCccchheeeeeeccccCCCcccchhhcccccccccccccccccccchhhcCCCHHhCcccC
Confidence 99999999999999999999999999999886421 111 24578999999999999875
No 3
>PF02182 SAD_SRA: SAD/SRA domain; InterPro: IPR003105 This domain has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is found mainly in plants, animals, and bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING FINGER domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo. In plants the SRA-YDG domain is associated with the SET domain, found in a family of histone methyl transferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif.; GO: 0042393 histone binding; PDB: 2ZO1_B 2ZKD_A 2ZO0_B 2ZKF_A 2ZKG_B 3FDE_A 3F8I_A 2ZO2_B 3F8J_B 2ZKE_A ....
Probab=100.00 E-value=8.1e-54 Score=410.47 Aligned_cols=153 Identities=55% Similarity=0.928 Sum_probs=123.6
Q ss_pred CcccccCCCccCCceechhhhhhhhccccCCcCCcccccccCCCCCCCeEEEEEecCCCCCCCCCCCeEEEEcCCCCCCC
Q 006089 196 RKRLGVVPGVEIGDIFFFRMEMCLIGLHSQSMAGIDYMITRSDLDEEPVAVSIISSGGYDDDAEDSDILIYSGQGGNANR 275 (662)
Q Consensus 196 ~k~~G~vpGv~vGd~f~~R~e~~~~GlH~~~~~GI~~~~~~~~~~~~~~A~SIV~SGgy~dd~D~gd~l~YtG~GG~~~~ 275 (662)
+|+|||||||+|||||++|+||+++|||+++|+|||+++.. +.++|+|||+||+|+||+|+||+|+|||+||++..
T Consensus 1 ~k~~G~ipGv~vG~~f~~r~~~~~~G~H~~~~~GI~g~~~~----g~~~A~SIV~Sg~y~dd~D~gd~l~YtG~gg~~~~ 76 (155)
T PF02182_consen 1 EKRFGHIPGVEVGDWFPYRMELSIVGLHGPTQAGIDGMKKE----GGPVAYSIVLSGGYEDDEDNGDVLIYTGQGGNDLS 76 (155)
T ss_dssp -TSSS--TT--TT-EESSHHHHHHTTSS--SS-SEEEETTT----ESEEEEEEEESSSSTTCEECSSEEEEE-SSSB--T
T ss_pred CCcEeCCCCccCccEEhHHHHHhHhccCCCccCCeecccCC----CceeeEEEEECCCcccccCCCCEEEEEcCCCcccc
Confidence 47899999999999999999999999999999999998862 23679999999999999999999999999999877
Q ss_pred CCCcccCcccchhhHHHHHHHHhCCccEEEecccccc-CCCCce-eeecCceeeeeeEEecCCCCceEEEEEeeecCCC
Q 006089 276 KGEQAADQKLERGNLALERSLRRASEVRVIRGMKDAI-NQSSKV-YVYDGLYTVQESWTEKGKSGCNIFKYKLVRIPGQ 352 (662)
Q Consensus 276 ~~~~~~DQ~l~~gNlAL~~S~~~~~pVRViRg~~~~~-~~~~~~-y~YDGLY~V~~~w~e~g~~G~~v~kfkL~R~pgQ 352 (662)
+.+|..||+|++||+||++|+++++|||||||++... +++..+ |||||||+|+++|.+++++|+.||||+|+|+|||
T Consensus 77 ~~~~~~dQ~l~~gN~AL~~S~~~~~PVRViR~~~~~~~~ap~~g~yrYDGLY~V~~~w~~~g~~G~~v~kF~L~R~~gQ 155 (155)
T PF02182_consen 77 GNKQPKDQKLERGNLALANSMKTGNPVRVIRGYKLKSSYAPKGGIYRYDGLYKVVKYWREKGKSGFKVFKFKLVRLPGQ 155 (155)
T ss_dssp TT-B-S---SSHHHHHHHHHSGGS-EEEEEEEGGGGGTTS-SSS-EEEEEEEEEEEEEEEE-TTSSEEEEEEEEE-TSS
T ss_pred cccccccccccchhHHHHHHHhcCCCeEEEeecCCCCccCCcCCCEEeCcEEEEEEEEEEeCCCCcEEEEEEEEECCCC
Confidence 7789999999999999999999999999999998774 355566 9999999999999999999999999999999998
No 4
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=100.00 E-value=1e-40 Score=369.41 Aligned_cols=162 Identities=35% Similarity=0.652 Sum_probs=140.8
Q ss_pred ceeecCC-CCC-CCCCCCCcccccCceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhc---CC-CCC
Q 006089 462 LIYECGP-SCP-CNRDCKNRVSQTGLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDG---EG-SNE 535 (662)
Q Consensus 462 ~i~EC~~-~C~-C~~~C~NRv~Q~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~---~~-~~d 535 (662)
+..||++ .|. |+..|.|+-+|+..-.+++||+|..|||||||..+|++|+||.||+||||+..|++.+. .. ...
T Consensus 93 t~iECs~~~C~~cg~~C~NQRFQkkqyA~vevF~Te~KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~d~~k 172 (729)
T KOG4442|consen 93 TSIECSDRECPRCGVYCKNQRFQKKQYAKVEVFLTEKKGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAKDGIK 172 (729)
T ss_pred hhcccCCccCCCccccccchhhhhhccCceeEEEecCcccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHhcCCc
Confidence 4579999 898 99999999999999999999999999999999999999999999999999999987653 11 112
Q ss_pred ceeeecccccccccccCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEE
Q 006089 536 DYVFDTTRTYDSFKWNYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVA 615 (662)
Q Consensus 536 ~Ylfd~~~~~~~~~w~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~ 615 (662)
+|.| +.+....+|||+.+||+||||||||+|||+++.|.+. +..||+
T Consensus 173 h~Yf-----------------------------m~L~~~e~IDAT~KGnlaRFiNHSC~PNa~~~KWtV~----~~lRvG 219 (729)
T KOG4442|consen 173 HYYF-----------------------------MALQGGEYIDATKKGNLARFINHSCDPNAEVQKWTVP----DELRVG 219 (729)
T ss_pred eEEE-----------------------------EEecCCceecccccCcHHHhhcCCCCCCceeeeeeeC----CeeEEE
Confidence 2222 1223457999999999999999999999999999885 478999
Q ss_pred EEEeecCCCCcEEEEecCCCCCCCCCCCCCCCeEeecCCCCCccccC
Q 006089 616 FFAMRHVPPMTELTYDYGISKSDGGNYEPHRKKKCLCGTLKCRGYFG 662 (662)
Q Consensus 616 ~FA~RdI~~GEELT~DYg~~~~~~~~~~~~~~~~C~CGS~~CrG~lG 662 (662)
|||.|.|++||||||||++..... ...+|+||+++|+||||
T Consensus 220 iFakk~I~~GEEITFDYqf~rYGr------~AQ~CyCgeanC~G~IG 260 (729)
T KOG4442|consen 220 IFAKKVIKPGEEITFDYQFDRYGR------DAQPCYCGEANCRGWIG 260 (729)
T ss_pred EeEecccCCCceeeEecccccccc------cccccccCCcccccccC
Confidence 999999999999999999986432 46799999999999998
No 5
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=100.00 E-value=1.6e-40 Score=368.13 Aligned_cols=153 Identities=31% Similarity=0.573 Sum_probs=123.4
Q ss_pred CCccccccCCCCCCcCCCCCccccCCCCCCCCCCcEEcceeccCCCcC---CCCCCCCCCCCCCCcCCCCCcccccccC-
Q 006089 370 SGRVGLILPDLSSGAEAIPIALINDVDDEKGPAYFTYLTTVKYSKSFR---LTQPSFGCNCYSACGPGNPNCSCVQKNG- 445 (662)
Q Consensus 370 ~~r~~~i~~DiS~G~E~~PI~~vN~VD~~~~P~~F~Yi~~~~~~~~~~---~~~~~~gC~C~~~C~~~~~~C~C~~~n~- 445 (662)
.-.+++.+.||+.|+|.+||..+|++|..++| .|.|..+++-....- ...+.++|+|..+|.+ ...|+|.|...
T Consensus 668 p~kp~~~~~Di~~g~e~vpis~~neids~~lp-q~ay~K~~ip~~~nl~n~~~~fl~scdc~~gcid-~~kcachQltvk 745 (1262)
T KOG1141|consen 668 PLKPGNRCTDIPCGREHVPISEKNEIDSHRLP-QAAYKKHMIPTNNNLSNRRKDFLQSCDCPTGCID-SMKCACHQLTVK 745 (1262)
T ss_pred CcCCcceeccccCCccccccceeecccCcCCc-cchhheeeccCCCcccccChhhhhcCCCCcchhh-hhhhhHHHHHHH
Confidence 34568889999999999999999999998755 799988776544321 1345789999999998 57999997521
Q ss_pred -------CCCc----ccCCceeecCCCceeecCCCCCCC-CCCCCcccccCceeeEEEEecCCCCCeeEeCCccCCCceE
Q 006089 446 -------GDFP----YTANGVLVSRKPLIYECGPSCPCN-RDCKNRVSQTGLKVRLDVFKTKDRGWGLRSLDPIRAGTFI 513 (662)
Q Consensus 446 -------g~~~----Y~~~G~L~~~~~~i~EC~~~C~C~-~~C~NRv~Q~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfI 513 (662)
+... |....+....+..+|||+..|+|. +.|.||++|+|.+.+|++|+|.+||||+|++++|.+|+||
T Consensus 746 ~~~t~p~~~v~~t~gykyKRl~e~~ptg~yEc~k~ckc~~~~C~nrmvqhg~qvRlq~fkt~~kGWg~rclddi~~g~fV 825 (1262)
T KOG1141|consen 746 KKTTGPNQNVASTNGYKYKRLIEIRPTGPYECLKACKCCGPDCLNRMVQHGYQVRLQRFKTIHKGWGRRCLDDITGGNFV 825 (1262)
T ss_pred hhccCCCcccccCcchhhHHHHHhcCCCHHHHHHhhccCcHHHHHHHhhcCceeEeeeccccccccceEeeeecCCceEE
Confidence 1111 222233334567799999999986 5899999999999999999999999999999999999999
Q ss_pred EEeecEEeeHH
Q 006089 514 CEYAGEVVDKF 524 (662)
Q Consensus 514 cEY~GEvit~~ 524 (662)
|-|.|-++++.
T Consensus 826 ciy~g~~l~~~ 836 (1262)
T KOG1141|consen 826 CIYPGGALLHQ 836 (1262)
T ss_pred EEecchhhhhh
Confidence 99999998754
No 6
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=99.94 E-value=1.9e-27 Score=279.02 Aligned_cols=134 Identities=35% Similarity=0.681 Sum_probs=113.0
Q ss_pred eeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHh---hhc--CCCCCceeeecccccccccccCCCCCccCCC
Q 006089 487 VRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKAR---QDG--EGSNEDYVFDTTRTYDSFKWNYEPGLIEDDD 561 (662)
Q Consensus 487 ~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~---~~~--~~~~d~Ylfd~~~~~~~~~w~~~~~l~~~~~ 561 (662)
.+|.-.++..+||||||++.|.+|++|.||+||+|...-++ .++ .+.++.|+|.++
T Consensus 866 k~~~F~~s~iH~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~~Y~~~gi~~sYlfrid------------------- 926 (1005)
T KOG1080|consen 866 KYVKFGRSGIHGWGLFAMENIAAGDMVIEYRGELVRSSIADLREARYERMGIGDSYLFRID------------------- 926 (1005)
T ss_pred hhhccccccccccceeeccCccccceEEEeeceehhhhHHHHHHHHHhccCcccceeeecc-------------------
Confidence 34666778889999999999999999999999999754432 122 344789999542
Q ss_pred CCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecCCCCCCCCC
Q 006089 562 PSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYGISKSDGGN 641 (662)
Q Consensus 562 ~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~~~~~~~ 641 (662)
...+|||+++||+||||||||+|||++..+.+++ ..+|.+||.|+|.+||||||||.+...+.
T Consensus 927 -----------~~~ViDAtk~gniAr~InHsC~PNCyakvi~V~g----~~~IvIyakr~I~~~EElTYDYkF~~e~~-- 989 (1005)
T KOG1080|consen 927 -----------DEVVVDATKKGNIARFINHSCNPNCYAKVITVEG----DKRIVIYSKRDIAAGEELTYDYKFPTEDD-- 989 (1005)
T ss_pred -----------cceEEeccccCchhheeecccCCCceeeEEEecC----eeEEEEEEecccccCceeeeecccccccc--
Confidence 2379999999999999999999999998887764 45999999999999999999999876543
Q ss_pred CCCCCCeEeecCCCCCcccc
Q 006089 642 YEPHRKKKCLCGTLKCRGYF 661 (662)
Q Consensus 642 ~~~~~~~~C~CGS~~CrG~l 661 (662)
+..|+|||++|||++
T Consensus 990 -----kipClCgap~Crg~~ 1004 (1005)
T KOG1080|consen 990 -----KIPCLCGAPNCRGFL 1004 (1005)
T ss_pred -----ccccccCCCcccccc
Confidence 799999999999986
No 7
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=99.93 E-value=1.3e-26 Score=256.10 Aligned_cols=143 Identities=29% Similarity=0.524 Sum_probs=123.1
Q ss_pred ceeecCCC-CCC-C---------CCCCCcccccCceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhc
Q 006089 462 LIYECGPS-CPC-N---------RDCKNRVSQTGLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDG 530 (662)
Q Consensus 462 ~i~EC~~~-C~C-~---------~~C~NRv~Q~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~ 530 (662)
...||.|. |.| + .+|.|--+|+|.+.++.|..+...|||+|+++...+++||.||+||+|++.|++.++
T Consensus 558 A~rECdPd~Cl~cg~~~~~d~~~~~C~N~~l~~~~qkr~llapSdVaGwGlFlKe~v~KnefisEY~GE~IS~dEADrRG 637 (739)
T KOG1079|consen 558 AVRECDPDVCLMCGNVDHFDSSKISCKNTNLQRGEQKRVLLAPSDVAGWGLFLKESVSKNEFISEYTGEIISHDEADRRG 637 (739)
T ss_pred hccccCchHHhccCcccccccCccccccchhhhhhhcceeechhhccccceeeccccCCCceeeeecceeccchhhhhcc
Confidence 35799974 754 2 289999999999999999999999999999999999999999999999999998764
Q ss_pred ---CCCCCceeeecccccccccccCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecC
Q 006089 531 ---EGSNEDYVFDTTRTYDSFKWNYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENN 607 (662)
Q Consensus 531 ---~~~~d~Ylfd~~~~~~~~~w~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~ 607 (662)
++....|+|++ ..+|+|||+++||.+||+|||-.|||++..+++.
T Consensus 638 kiYDr~~cSflFnl------------------------------n~dyviDs~rkGnk~rFANHS~nPNCYAkvm~V~-- 685 (739)
T KOG1079|consen 638 KIYDRYMCSFLFNL------------------------------NNDYVIDSTRKGNKIRFANHSFNPNCYAKVMMVA-- 685 (739)
T ss_pred cccccccceeeeec------------------------------cccceEeeeeecchhhhccCCCCCCcEEEEEEec--
Confidence 34455566643 3458999999999999999999999998777764
Q ss_pred CCceeEEEEEEeecCCCCcEEEEecCCCCCC
Q 006089 608 NESFVHVAFFAMRHVPPMTELTYDYGISKSD 638 (662)
Q Consensus 608 d~~~p~I~~FA~RdI~~GEELT~DYg~~~~~ 638 (662)
+..+|+|||.|.|.+||||||||.|+...
T Consensus 686 --GdhRIGifAkRaIeagEELffDYrYs~~~ 714 (739)
T KOG1079|consen 686 --GDHRIGIFAKRAIEAGEELFFDYRYSPEH 714 (739)
T ss_pred --CCcceeeeehhhcccCceeeeeeccCccc
Confidence 44599999999999999999999998544
No 8
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.87 E-value=7.2e-22 Score=176.53 Aligned_cols=111 Identities=39% Similarity=0.746 Sum_probs=89.7
Q ss_pred eEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhc---C--CCCCceeeecccccccccccCCCCCccCCCC
Q 006089 488 RLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDG---E--GSNEDYVFDTTRTYDSFKWNYEPGLIEDDDP 562 (662)
Q Consensus 488 ~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~---~--~~~d~Ylfd~~~~~~~~~w~~~~~l~~~~~~ 562 (662)
++++++++.+|+||+|.++|++|++|++|.|+++...+..... . .....|+|+.
T Consensus 1 ~~~~~~~~~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~--------------------- 59 (116)
T smart00317 1 KLEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERSKAYDTDGADSFYLFEI--------------------- 59 (116)
T ss_pred CcEEEecCCCcEEEEECCccCCCCEEEEEEeEEECHHHHHHHHHHHHhcCCCCEEEEEC---------------------
Confidence 3688999999999999999999999999999999877664321 1 1112444421
Q ss_pred CCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEec
Q 006089 563 SDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDY 632 (662)
Q Consensus 563 ~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DY 632 (662)
...++||+...||++|||||||.||+.++.+..++ ..++.|+|+|||++|||||+||
T Consensus 60 ---------~~~~~id~~~~~~~~~~iNHsc~pN~~~~~~~~~~----~~~~~~~a~r~I~~GeEi~i~Y 116 (116)
T smart00317 60 ---------DSDLCIDARRKGNIARFINHSCEPNCELLFVEVNG----DSRIVIFALRDIKPGEELTIDY 116 (116)
T ss_pred ---------CCCEEEeCCccCcHHHeeCCCCCCCEEEEEEEECC----CcEEEEEECCCcCCCCEEeecC
Confidence 12479999999999999999999999988776643 2389999999999999999999
No 9
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=99.85 E-value=1.6e-22 Score=231.01 Aligned_cols=127 Identities=38% Similarity=0.660 Sum_probs=107.0
Q ss_pred CCCccccc-CceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhc----CCCCCceeeecccccccccc
Q 006089 476 CKNRVSQT-GLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDG----EGSNEDYVFDTTRTYDSFKW 550 (662)
Q Consensus 476 C~NRv~Q~-G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~----~~~~d~Ylfd~~~~~~~~~w 550 (662)
|.|+-+|+ +.-..|+||+.+.+||||+++++|++|+|||||+|||++.++++.+| ....+.|..
T Consensus 1166 c~nqrm~r~e~cp~L~v~~gp~~G~~v~tk~PikagtfI~EYvGeVit~ke~e~~mmtl~~~d~~~~cL----------- 1234 (1306)
T KOG1083|consen 1166 CSNQRMQRHEECPPLEVFRGPKKGWGVRTKEPIKAGTFIMEYVGEVITEKEFEPRMMTLYHNDDDHYCL----------- 1234 (1306)
T ss_pred hhhHHhhhhccCCCcceeccCCCCccccccccccccchHHHHHHHHHHHHhhcccccccCCCCCccccc-----------
Confidence 77777774 56678999999999999999999999999999999999998887552 111222222
Q ss_pred cCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEE
Q 006089 551 NYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTY 630 (662)
Q Consensus 551 ~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~ 630 (662)
.+..+.+||+.++||.+|||||||.|||..|.|-+. ++.||.|||+|||++||||||
T Consensus 1235 -------------------~I~p~l~id~~R~~n~~RfinhscKPNc~~qkwSVN----G~~Rv~L~A~rDi~kGEELtY 1291 (1306)
T KOG1083|consen 1235 -------------------VIDPGLFIDIPRMGNGARFINHSCKPNCEMQKWSVN----GEYRVGLFALRDLPKGEELTY 1291 (1306)
T ss_pred -------------------ccCccccCChhhccccccccccccCCCCcccccccc----ceeeeeeeecCCCCCCceEEE
Confidence 234567999999999999999999999999998774 689999999999999999999
Q ss_pred ecCCCC
Q 006089 631 DYGISK 636 (662)
Q Consensus 631 DYg~~~ 636 (662)
||+...
T Consensus 1292 DYN~ks 1297 (1306)
T KOG1083|consen 1292 DYNFKS 1297 (1306)
T ss_pred eccccc
Confidence 998653
No 10
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions.
The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=99.84 E-value=2.4e-21 Score=173.68 Aligned_cols=100 Identities=42% Similarity=0.906 Sum_probs=74.2
Q ss_pred CCCCCcCCCCCccccCCCCCCCCCCcEEcceeccCCCcC--CCCCCCCCCCCCCCcCCCCCcccccccCCCCcccCCcee
Q 006089 379 DLSSGAEAIPIALINDVDDEKGPAYFTYLTTVKYSKSFR--LTQPSFGCNCYSACGPGNPNCSCVQKNGGDFPYTANGVL 456 (662)
Q Consensus 379 DiS~G~E~~PI~~vN~VD~~~~P~~F~Yi~~~~~~~~~~--~~~~~~gC~C~~~C~~~~~~C~C~~~n~g~~~Y~~~G~L 456 (662)
|||.|+|.+||+++|+||++.+|..|+||+++++...+. ......||+|.++|.. ..+|+|.+++++.++|+.+|+|
T Consensus 1 Dis~g~e~~pI~~~N~vd~~~~p~~F~Yi~~~~~~~~~~~~~~~~~~~C~C~~~C~~-~~~C~C~~~~~~~~~Y~~~g~l 79 (103)
T PF05033_consen 1 DISRGKENVPIPVVNDVDDEPPPPNFEYIPENIYGEGVPDIDPEFLQGCDCSGDCSN-PSNCECLQRNGGIFAYDSNGRL 79 (103)
T ss_dssp -TTCTSSSS-EEEEESSSS--SSTSSEE-SS-EESTTSS-TBGGGTS----SSSSTC-TTTSHHHCCTSSS-SB-TTSSB
T ss_pred CCCCCccCCCEEEEeCCCCCCCCCCeEEeeeEEcCCCccccccccCccCccCCCCCC-CCCCcCccccCccccccCCCcC
Confidence 899999999999999999999999999999999988764 2345689999999954 4789999999999999999999
Q ss_pred e-cCCCceeecCCCCCCCCCCCCc
Q 006089 457 V-SRKPLIYECGPSCPCNRDCKNR 479 (662)
Q Consensus 457 ~-~~~~~i~EC~~~C~C~~~C~NR 479 (662)
. ....+||||++.|+|+.+|.||
T Consensus 80 ~~~~~~~i~EC~~~C~C~~~C~NR 103 (103)
T PF05033_consen 80 RIPDKPPIFECNDNCGCSPSCRNR 103 (103)
T ss_dssp SSSSTSEEE---TTSSS-TTSTT-
T ss_pred ccCCCCeEEeCCCCCCCCCCCCCC
Confidence 8 6789999999999999999998
No 11
>smart00468 PreSET N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.
Probab=99.83 E-value=9.3e-21 Score=168.83 Aligned_cols=94 Identities=38% Similarity=0.918 Sum_probs=84.8
Q ss_pred cCCCCCCcCCCCCccccCCCCCCCCCCcEEcceeccCCCcC---CCCCCCCCCCCCCCcCCCCCcccccccCCCCcc-cC
Q 006089 377 LPDLSSGAEAIPIALINDVDDEKGPAYFTYLTTVKYSKSFR---LTQPSFGCNCYSACGPGNPNCSCVQKNGGDFPY-TA 452 (662)
Q Consensus 377 ~~DiS~G~E~~PI~~vN~VD~~~~P~~F~Yi~~~~~~~~~~---~~~~~~gC~C~~~C~~~~~~C~C~~~n~g~~~Y-~~ 452 (662)
..|||.|+|++||++||+||++.+|..|+||+++.++.++. ...+..||+|.++|.+. ..|.|.+++++.++| ..
T Consensus 1 ~~Dis~G~E~~pI~~vN~vD~~~~p~~F~Yi~~~~~~~gv~~~~~~~~~~gC~C~~~C~~~-~~C~C~~~~~~~~~Y~~~ 79 (98)
T smart00468 1 CLDISNGKENVPVPLVNEVDEDPPPPDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSSS-NKCECARKNGGEFAYELN 79 (98)
T ss_pred CccccCCccCCCcceEecCCCCCCCCCcEECcceEcCCCcccccCCCCCCCCcCCCCCCCC-CcCCcHhhcCCccCcccC
Confidence 36999999999999999999999999999999999988764 45678999999999984 459999999999999 77
Q ss_pred CceeecCCCceeecCCCCC
Q 006089 453 NGVLVSRKPLIYECGPSCP 471 (662)
Q Consensus 453 ~G~L~~~~~~i~EC~~~C~ 471 (662)
+++++..+++|||||+.|+
T Consensus 80 ~~~~~~~~~~IyECn~~C~ 98 (98)
T smart00468 80 GGLRLKRKPLIYECNSRCS 98 (98)
T ss_pred CCEEeCCCCEEEcCCCCCC
Confidence 8888899999999999985
No 12
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=99.64 E-value=4.5e-16 Score=158.58 Aligned_cols=123 Identities=24% Similarity=0.352 Sum_probs=95.5
Q ss_pred cCceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhc---C--CCCCceeeecccccccccccCCCCCc
Q 006089 483 TGLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDG---E--GSNEDYVFDTTRTYDSFKWNYEPGLI 557 (662)
Q Consensus 483 ~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~---~--~~~d~Ylfd~~~~~~~~~w~~~~~l~ 557 (662)
.|....|.+..-..||.||+|...+.+|+||.||.|.+|...|+..++ . .....|.|- | .+
T Consensus 252 ~g~~egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdliei~eAk~rE~~Ya~De~~GcYMYy-------F--~h----- 317 (392)
T KOG1085|consen 252 KGTNEGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLIEISEAKVREEQYANDEEIGCYMYY-------F--EH----- 317 (392)
T ss_pred hccccceeEEeeccccceeEeecccccCceEEEEecceeeechHHHHHHHhccCcccceEEEe-------e--ec-----
Confidence 455566777777779999999999999999999999999877764332 1 111223331 1 11
Q ss_pred cCCCCCCCccccCCCCCEEEecccc-CChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecCCCC
Q 006089 558 EDDDPSDTTEEYDLPYPLVISAKNV-GNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYGISK 636 (662)
Q Consensus 558 ~~~~~~~~~~~~~~~~~~~IDA~~~-GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~~ 636 (662)
....|+|||+.- +-++|.||||-.+||....|.++ +.||+.|.|.|||.+||||+||||+..
T Consensus 318 -------------~sk~yCiDAT~et~~lGRLINHS~~gNl~TKvv~Id----g~pHLiLvA~rdIa~GEELlYDYGDRS 380 (392)
T KOG1085|consen 318 -------------NSKKYCIDATKETPWLGRLINHSVRGNLKTKVVEID----GSPHLILVARRDIAQGEELLYDYGDRS 380 (392)
T ss_pred -------------cCeeeeeecccccccchhhhcccccCcceeeEEEec----CCceEEEEeccccccchhhhhhccccc
Confidence 123589999865 55799999999999999888876 578999999999999999999999864
No 13
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=99.62 E-value=3e-16 Score=176.30 Aligned_cols=168 Identities=29% Similarity=0.464 Sum_probs=124.8
Q ss_pred cCCCCCCCCCCCCcccccCceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhcCCCCCceeeeccccc
Q 006089 466 CGPSCPCNRDCKNRVSQTGLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDGEGSNEDYVFDTTRTY 545 (662)
Q Consensus 466 C~~~C~C~~~C~NRv~Q~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~~~~~d~Ylfd~~~~~ 545 (662)
+...+.+...|.|...+........+..+..+||||||++.|++|++|.+|.|+++...++..+.. .| .... .+
T Consensus 311 ~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~fa~~~i~~~e~i~~~~~~~~~~~~~~~~~~----~~-~~~~-~~ 384 (480)
T COG2940 311 SKSNVSKLKELLNSNGCKKRREPNVVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREE----NY-DLLG-NE 384 (480)
T ss_pred ccccCccccchhhhcccccccchhhhhhhcccccceeehhhccchHHHHHhcCcccchHHHHhhhc----cc-cccc-cc
Confidence 344444445666666667777888888899999999999999999999999999998887754321 11 1110 00
Q ss_pred ccccccCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCC
Q 006089 546 DSFKWNYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPM 625 (662)
Q Consensus 546 ~~~~w~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~G 625 (662)
..+ |... ....++|+...|+++|||||||.||+.+..+... +..++.++|+|||.+|
T Consensus 385 ~~~-~~~~------------------~~~~~~d~~~~g~~~r~~nHS~~pN~~~~~~~~~----g~~~~~~~~~rDI~~g 441 (480)
T COG2940 385 FSF-GLLE------------------DKDKVRDSQKAGDVARFINHSCTPNCEASPIEVN----GIFKISIYAIRDIKAG 441 (480)
T ss_pred cch-hhcc------------------ccchhhhhhhcccccceeecCCCCCcceeccccc----ccceeeecccccchhh
Confidence 000 1100 0146899999999999999999999998765543 2668999999999999
Q ss_pred cEEEEecCCCCCCCC--CCCCCCCeEeecCCCCCccccC
Q 006089 626 TELTYDYGISKSDGG--NYEPHRKKKCLCGTLKCRGYFG 662 (662)
Q Consensus 626 EELT~DYg~~~~~~~--~~~~~~~~~C~CGS~~CrG~lG 662 (662)
||||+||+...+... .........|.|++..|++.++
T Consensus 442 eEl~~dy~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 480 (480)
T COG2940 442 EELTYDYGPSLEDNRELKKLLEKRWGCACGEDRCSHTMS 480 (480)
T ss_pred hhhccccccccccchhhhhhhhhhhccccCCCccCCCCC
Confidence 999999999877643 2233467899999999999874
No 14
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9, human SET7 (also called SET9), which methylates H3 on lysine 4, and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.32 E-value=4.3e-12 Score=118.15 Aligned_cols=55 Identities=22% Similarity=0.142 Sum_probs=44.0
Q ss_pred EEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecC
Q 006089 575 LVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYG 633 (662)
Q Consensus 575 ~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg 633 (662)
...++....+++.|+||||.|||.+..... ....++.|.|.|+|++|||||++||
T Consensus 108 ~~~~~~~l~p~~d~~NHsc~pn~~~~~~~~----~~~~~~~~~a~r~I~~GeEi~isYG 162 (162)
T PF00856_consen 108 DDRDGIALYPFADMLNHSCDPNCEVSFDFD----GDGGCLVVRATRDIKKGEEIFISYG 162 (162)
T ss_dssp EEEEEEEEETGGGGSEEESSTSEEEEEEEE----TTTTEEEEEESS-B-TTSBEEEEST
T ss_pred ccccccccCcHhHheccccccccceeeEee----cccceEEEEECCccCCCCEEEEEEC
Confidence 356677778899999999999999876543 2334899999999999999999997
No 15
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=98.92 E-value=1.3e-09 Score=123.82 Aligned_cols=283 Identities=33% Similarity=0.580 Sum_probs=196.1
Q ss_pred ccccccCCCCCCcCCCCCccccCCCCCCCCCC------cEEcceeccCCCcCCCCCCCCCCCCCCCcCCCCCcccccccC
Q 006089 372 RVGLILPDLSSGAEAIPIALINDVDDEKGPAY------FTYLTTVKYSKSFRLTQPSFGCNCYSACGPGNPNCSCVQKNG 445 (662)
Q Consensus 372 r~~~i~~DiS~G~E~~PI~~vN~VD~~~~P~~------F~Yi~~~~~~~~~~~~~~~~gC~C~~~C~~~~~~C~C~~~n~ 445 (662)
-.|+-..|.+.|.+.+|||++|.+|++.+|.- |.|..+...+. .......||.|.+.|.+. ..|.|.+...
T Consensus 872 ~~g~d~~d~~~g~sg~~~p~~~~~d~~~~~~c~d~~~~~~~~~~~~~s~--~~~~~~~~~s~d~hp~d~-~~~~~~~~~~ 948 (1262)
T KOG1141|consen 872 DKGLDVADFSLGTSGIPIPLVNSVDNDEPPSCEDSKRRFQYNDQVDISS--VSRDFCSGCSCDGHPSDA-SKCECQQLSI 948 (1262)
T ss_pred ccccchhhhhccccCCCCccccccccCCCccccccceeecccccchhhh--hccccccccccCCCCccc-CcccCCCCCh
Confidence 34777889999999999999999999887641 33333332221 123567899999999874 6899986431
Q ss_pred ---CCCc--ccCCce--eec--------CCCceeecCCCCCCCCCCCCcccccCceeeE--------EEEecCCCCCeeE
Q 006089 446 ---GDFP--YTANGV--LVS--------RKPLIYECGPSCPCNRDCKNRVSQTGLKVRL--------DVFKTKDRGWGLR 502 (662)
Q Consensus 446 ---g~~~--Y~~~G~--L~~--------~~~~i~EC~~~C~C~~~C~NRv~Q~G~k~~L--------eVfrT~~kGwGVr 502 (662)
+.+| ...+|. +.. .+-..+||+..|.|...|.||++|.+.++++ .||++...|||++
T Consensus 949 ~~~~~cpp~~s~d~~~~~~eS~~~~ns~~~~~f~e~~~hss~~~~e~~~~v~~~~~~~me~~s~~~l~i~~~~~~~~~~~ 1028 (1262)
T KOG1141|consen 949 EAMKRCPPNLSFDGHDELYESSEKQNSFLKLFFFECNDHSSCHRKEYNRVVQNNIKYPMEVSSFNDLQIFKTAQSGWGVR 1028 (1262)
T ss_pred hhhcCCCCccccCchhhhhhhhhhcchhhhccceeccccchhcccccchhhhcCCccceeeeeccccccccccccccccc
Confidence 2222 222331 111 1345789999999999999999999988764 5678888999999
Q ss_pred eCCccCCCceEEEeecEEeeHHhHhhhcCCCCCcee-----eecc-----cccccccccCCC------------------
Q 006089 503 SLDPIRAGTFICEYAGEVVDKFKARQDGEGSNEDYV-----FDTT-----RTYDSFKWNYEP------------------ 554 (662)
Q Consensus 503 A~~~I~~GtfIcEY~GEvit~~e~~~~~~~~~d~Yl-----fd~~-----~~~~~~~w~~~~------------------ 554 (662)
+..+|+.-+|||+|+|...++.-+.+.-....+.|. ++.. +.....++....
T Consensus 1029 edtD~~~~~~~~~~~~~ppt~~l~~~~r~aqad~~sn~~D~~~~~~l~es~~~~~T~~r~~t~~~~~~~~~d~dd~q~I~ 1108 (1262)
T KOG1141|consen 1029 EDTDIPQSTFICTYVGAPPTDDLADELRNAQADQYSNDLDLKDTVELEESREDHETDFRGDTSDYDDEEGSDGDDGQDIM 1108 (1262)
T ss_pred ccccCCCCcccccccCCCCchhhHHHHhhhhhccccCccchhhhhhhhhcccccccccCCCCCCCcccccccCccHHHHH
Confidence 999999999999999999887654221100011111 1100 000000000000
Q ss_pred CCccCCCCCC----------------------------------Ccc---------------ccCC-CCCEEEeccccCC
Q 006089 555 GLIEDDDPSD----------------------------------TTE---------------EYDL-PYPLVISAKNVGN 584 (662)
Q Consensus 555 ~l~~~~~~~~----------------------------------~~~---------------~~~~-~~~~~IDA~~~GN 584 (662)
...+.++..+ ..+ .+.. ..-|+|||+..||
T Consensus 1109 k~ve~qd~~~~~~~T~~~~RQ~~~~s~k~~~~~s~~~~~~ts~~~~~~dkges~~~~~~~~~~y~~~~~~yvIDAk~eGN 1188 (1262)
T KOG1141|consen 1109 KMVERQDSSESGEETKRLTRQKRKQSKKSGKGGSVEKDDTTSRDSMEKDKGESKDEPVFNWDKYFEPFPLYVIDAKQEGN 1188 (1262)
T ss_pred HHhhcccccccccccchhhhhhhhhhhhcccCccccccccCccchhhhccCccCcccccchhhccCCCceEEEecccccc
Confidence 0000000000 000 0011 1358999999999
Q ss_pred hhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecCCCCCCCCCCCCCCCeEeecCCCCCcccc
Q 006089 585 VARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYGISKSDGGNYEPHRKKKCLCGTLKCRGYF 661 (662)
Q Consensus 585 vaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~~~~~~~~~~~~~~~C~CGS~~CrG~l 661 (662)
++||+||||+||+++|+|+++.||.++|.++|||.+-|++|+||||||+|.... ...+...|+||+.+|||+|
T Consensus 1189 lGRfLNHSC~PNl~VQnVfvdTHdlrfPwVAFFt~kyVkAgtELTWDY~Ye~g~----v~~keL~C~CGa~~CrgrL 1261 (1262)
T KOG1141|consen 1189 LGRFLNHSCDPNLHVQNVFVDTHDLRFPWVAFFTRKYVKAGTELTWDYQYEQGQ----VATKELTCHCGAENCRGRL 1261 (1262)
T ss_pred hhhhhccCCCccceeeeeeeeccccCCchhhhhhhhhhccCceeeeeccccccc----cccceEEEecChhhhhccc
Confidence 999999999999999999999999999999999999999999999999997532 3568899999999999987
No 16
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=98.84 E-value=9.3e-10 Score=123.06 Aligned_cols=150 Identities=29% Similarity=0.452 Sum_probs=104.2
Q ss_pred CceeecC-CCCCCCCCCCCcccccCceeeEEEEecCCCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhcCCCCCceee
Q 006089 461 PLIYECG-PSCPCNRDCKNRVSQTGLKVRLDVFKTKDRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDGEGSNEDYVF 539 (662)
Q Consensus 461 ~~i~EC~-~~C~C~~~C~NRv~Q~G~k~~LeVfrT~~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~~~~~d~Ylf 539 (662)
...+||- ..|.+...|.|+-.-...... ..+ +|..+|.+| +|++++..+...+.......-+.
T Consensus 287 ~~~~~~~p~~~~~~~~~~~~~~sk~~~~e------~~~----~~~~~~~k~------vg~~i~~~e~~~~~~~~~~~~~~ 350 (463)
T KOG1081|consen 287 MLAYEVHPKVCSAEERCHNQQFSKESYPE------PQK----TAKADIRKG------VGEVIDDKECKARLQRVKESDLV 350 (463)
T ss_pred hhhhhhcccccccccccccchhhhhcccc------cch----hhHHhhhcc------cCcccchhhheeehhhhhccchh
Confidence 3456665 469999999888764332211 212 888888888 99999887754321110111111
Q ss_pred ecccccccccccCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEe
Q 006089 540 DTTRTYDSFKWNYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAM 619 (662)
Q Consensus 540 d~~~~~~~~~w~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~ 619 (662)
+ .+. ..+.....||+...||.+||+||||+||+.-+.+.+ ....++.+||.
T Consensus 351 ~---~~~----------------------~~~e~~~~id~~~~~n~sr~~nh~~~~~v~~~k~~~----~~~t~~~~~a~ 401 (463)
T KOG1081|consen 351 D---FYM----------------------VFIQKDRIIDAGPKGNYSRFLNHSCQPNVETEKWQV----IGDTRVGLFAP 401 (463)
T ss_pred h---hhh----------------------hhhhcccccccccccchhhhhcccCCCceeechhhe----ecccccccccc
Confidence 0 000 001111289999999999999999999998877655 35678999999
Q ss_pred ecCCCCcEEEEecCCCCCCCCCCCCCCCeEeecCCCCCccccC
Q 006089 620 RHVPPMTELTYDYGISKSDGGNYEPHRKKKCLCGTLKCRGYFG 662 (662)
Q Consensus 620 RdI~~GEELT~DYg~~~~~~~~~~~~~~~~C~CGS~~CrG~lG 662 (662)
++|++|||||++|..... ...+.|.|++.+|.+.+|
T Consensus 402 ~~i~~g~e~t~~~n~~~~-------~~~~~~~~~~e~~~~~~~ 437 (463)
T KOG1081|consen 402 RQIEAGEELTFNYNGNCE-------GNEKRCCCGSENCTETKG 437 (463)
T ss_pred cccccchhhhheeecccc-------CCcceEeecccccccCCc
Confidence 999999999999987643 356899999999998765
No 17
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=98.34 E-value=4.7e-07 Score=96.16 Aligned_cols=114 Identities=23% Similarity=0.323 Sum_probs=74.9
Q ss_pred CCCCeeEeCCccCCCceEEEeecEEeeHHhHhhhc--CCCCCceeee-cccccccccccCCCCCccCCCCCCCccccCCC
Q 006089 496 DRGWGLRSLDPIRAGTFICEYAGEVVDKFKARQDG--EGSNEDYVFD-TTRTYDSFKWNYEPGLIEDDDPSDTTEEYDLP 572 (662)
Q Consensus 496 ~kGwGVrA~~~I~~GtfIcEY~GEvit~~e~~~~~--~~~~d~Ylfd-~~~~~~~~~w~~~~~l~~~~~~~~~~~~~~~~ 572 (662)
..|=-|.+...+.+|+=|-..+|-|+.-.+++.++ ...+.+|..- .++..-
T Consensus 136 ~~gAkivst~~w~~ndkIe~LvGcIaeLse~eE~~ll~~g~nDFSvmyStRk~c-------------------------- 189 (453)
T KOG2589|consen 136 QNGAKIVSTKSWSRNDKIELLVGCIAELSEAEERSLLRGGGNDFSVMYSTRKRC-------------------------- 189 (453)
T ss_pred CCCceEEeeccccCCccHHHhhhhhhhcChhhhHHHHhccCCceeeeeecccch--------------------------
Confidence 35667899999999999999999986544443321 1112222210 000000
Q ss_pred CCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecCCCCCCCCCCCCCCCeEeec
Q 006089 573 YPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYGISKSDGGNYEPHRKKKCLC 652 (662)
Q Consensus 573 ~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~~~~~~~~~~~~~~~C~C 652 (662)
..+++ .-|+||||-|.|||.+.. .+.-.+++-++|||.||||||-=||..+... ....|.|
T Consensus 190 aqLwL------GPaafINHDCrpnCkFvs-------~g~~tacvkvlRDIePGeEITcFYgs~fFG~------~N~~CeC 250 (453)
T KOG2589|consen 190 AQLWL------GPAAFINHDCRPNCKFVS-------TGRDTACVKVLRDIEPGEEITCFYGSGFFGE------NNEECEC 250 (453)
T ss_pred hhhee------ccHHhhcCCCCCCceeec-------CCCceeeeehhhcCCCCceeEEeecccccCC------CCceeEE
Confidence 01222 247999999999998632 1234788999999999999999999998764 3456666
Q ss_pred CC
Q 006089 653 GT 654 (662)
Q Consensus 653 GS 654 (662)
-+
T Consensus 251 ~T 252 (453)
T KOG2589|consen 251 VT 252 (453)
T ss_pred ee
Confidence 44
No 18
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=97.72 E-value=3.3e-05 Score=85.15 Aligned_cols=120 Identities=23% Similarity=0.282 Sum_probs=83.6
Q ss_pred CcccccCceeeEEEEecCC--CCCeeEeCCccCCCceEEEeecEE-eeHHhHhhhcCCCCCceeeecccccccccccCCC
Q 006089 478 NRVSQTGLKVRLDVFKTKD--RGWGLRSLDPIRAGTFICEYAGEV-VDKFKARQDGEGSNEDYVFDTTRTYDSFKWNYEP 554 (662)
Q Consensus 478 NRv~Q~G~k~~LeVfrT~~--kGwGVrA~~~I~~GtfIcEY~GEv-it~~e~~~~~~~~~d~Ylfd~~~~~~~~~w~~~~ 554 (662)
||..+. +...|.|+.+.. .|.||++...|++|+--.-|.|++ ++... ...+..|.+.+-...
T Consensus 20 ~~~~~~-LP~~l~i~~Ssv~~~~lgV~s~~~i~~G~~FGP~~G~~~~~~~~-----~~~n~~y~W~I~~~d--------- 84 (396)
T KOG2461|consen 20 GRLSKT-LPPELRIKPSSVPVTGLGVWSNASILPGTSFGPFEGEIIASIDS-----KSANNRYMWEIFSSD--------- 84 (396)
T ss_pred Ccchhc-CCCceEeeccccCCccccccccccccCcccccCccCcccccccc-----ccccCcceEEEEeCC---------
Confidence 444333 667788888864 789999999999999999999998 22111 112344544321100
Q ss_pred CCccCCCCCCCccccCCCCCEEEecc--ccCChhhcccCCCCC---CceeEEEEEecCCCceeEEEEEEeecCCCCcEEE
Q 006089 555 GLIEDDDPSDTTEEYDLPYPLVISAK--NVGNVARFMNHSCSP---NVFWQPIIFENNNESFVHVAFFAMRHVPPMTELT 629 (662)
Q Consensus 555 ~l~~~~~~~~~~~~~~~~~~~~IDA~--~~GNvaRFINHSC~P---N~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT 629 (662)
..-++||++ ...|+.||+|=.++. |+.+. ..+ -.|.+.|+|+|+++|||.
T Consensus 85 -----------------~~~~~iDg~d~~~sNWmRYV~~Ar~~eeQNL~A~---Q~~-----~~Ifyrt~r~I~p~eELl 139 (396)
T KOG2461|consen 85 -----------------NGYEYIDGTDEEHSNWMRYVNSARSEEEQNLLAF---QIG-----ENIFYRTIRDIRPNEELL 139 (396)
T ss_pred -----------------CceEEeccCChhhcceeeeecccCChhhhhHHHH---hcc-----CceEEEecccCCCCCeEE
Confidence 112688877 468999999998884 76542 111 268899999999999999
Q ss_pred EecCCCCC
Q 006089 630 YDYGISKS 637 (662)
Q Consensus 630 ~DYg~~~~ 637 (662)
+.|+.++.
T Consensus 140 VWY~~e~~ 147 (396)
T KOG2461|consen 140 VWYGSEYA 147 (396)
T ss_pred EEeccchH
Confidence 99998754
No 19
>smart00508 PostSET Cysteine-rich motif following a subset of SET domains.
Probab=96.24 E-value=0.0022 Score=43.80 Aligned_cols=16 Identities=56% Similarity=1.524 Sum_probs=14.3
Q ss_pred CeEeecCCCCCccccC
Q 006089 647 KKKCLCGTLKCRGYFG 662 (662)
Q Consensus 647 ~~~C~CGS~~CrG~lG 662 (662)
.+.|+|||.+|||+|+
T Consensus 2 ~~~C~CGs~~CRG~l~ 17 (26)
T smart00508 2 KQPCLCGAPNCRGFLG 17 (26)
T ss_pred CeeeeCCCccccceec
Confidence 4789999999999984
No 20
>COG3440 Predicted restriction endonuclease [Defense mechanisms]
Probab=94.79 E-value=0.00081 Score=70.50 Aligned_cols=142 Identities=11% Similarity=-0.097 Sum_probs=109.9
Q ss_pred cccCCCccCCceech-hhhhhhhccccCCcCCcccccccCCCCCCCeEEEEEecCCCCCCCCCCCeEEEEcCCCCCCCCC
Q 006089 199 LGVVPGVEIGDIFFF-RMEMCLIGLHSQSMAGIDYMITRSDLDEEPVAVSIISSGGYDDDAEDSDILIYSGQGGNANRKG 277 (662)
Q Consensus 199 ~G~vpGv~vGd~f~~-R~e~~~~GlH~~~~~GI~~~~~~~~~~~~~~A~SIV~SGgy~dd~D~gd~l~YtG~GG~~~~~~ 277 (662)
++.. ++..+..+-. +.+..-.+.|-|++.++.+.+. ..+.+++.+|+|+++.+.+++..|++-|++ ..+.
T Consensus 6 ~~a~-sf~~~~a~~~i~~~~~~~a~~kp~l~l~v~~~~-------~~~~~~~n~~~~~~e~~~~f~~l~~~~g~~-~~~~ 76 (301)
T COG3440 6 YYAK-SFSQRNASLKIFGGNREAAPHKPILLLDVGRKI-------STFFITENQGIYETELIEPFIQLWSFFGPK-LQKY 76 (301)
T ss_pred hhhc-chhhhhhhhhhcccccccCCcCceeehhhHhhh-------hcccccccccccchhccchHHHHHhhcCcc-cccC
Confidence 3444 5555555555 6777788999999999987654 568899999999999999999999999997 4455
Q ss_pred CcccCcccchhhHHHHHHHHhCCccEEEeccccc-cCCCCceeeecCceeeeeeEEecCCCCceEEEEEeeec
Q 006089 278 EQAADQKLERGNLALERSLRRASEVRVIRGMKDA-INQSSKVYVYDGLYTVQESWTEKGKSGCNIFKYKLVRI 349 (662)
Q Consensus 278 ~~~~DQ~l~~gNlAL~~S~~~~~pVRViRg~~~~-~~~~~~~y~YDGLY~V~~~w~e~g~~G~~v~kfkL~R~ 349 (662)
.+..-+.+.+|+.+++.+++.+-+-+++|+.... ...+-..+-|-|+|.+...|-++...+..+..|...++
T Consensus 77 ~~~~p~~~l~~d~~~h~~~k~~~~~l~~~~~~~~~e~v~~~~~d~el~~~~~~~~~~~~l~~~L~~~~~~~~~ 149 (301)
T COG3440 77 GVDAPFELLQGDGKWHLDIKEGFDGLSIRTLPTEKEFVEYHYIDDELEQSLQYHQGEKRLIDDLISIWRKEVL 149 (301)
T ss_pred CCCCchHHhhccchhhhcccccCCccccCCCccHhhhhhhhhccHHHHHHHHhhcccchhHHHHHHHHHHHHH
Confidence 5666677889999999999999999999998654 23345667788889888888888766666555444443
No 21
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=94.34 E-value=0.02 Score=45.40 Aligned_cols=25 Identities=36% Similarity=0.862 Sum_probs=21.9
Q ss_pred CCceeecCCCCCCCCCCCCcccccC
Q 006089 460 KPLIYECGPSCPCNRDCKNRVSQTG 484 (662)
Q Consensus 460 ~~~i~EC~~~C~C~~~C~NRv~Q~G 484 (662)
+.+.+||+..|.|+..|.||.+|+.
T Consensus 25 R~l~~EC~~~C~~G~~C~NqrFqk~ 49 (51)
T smart00570 25 RMLLIECSSDCPCGSYCSNQRFQKR 49 (51)
T ss_pred HHHhhhcCCCCCCCcCccCcccccC
Confidence 3478999888999999999999975
No 22
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=90.86 E-value=0.26 Score=54.84 Aligned_cols=44 Identities=30% Similarity=0.465 Sum_probs=33.3
Q ss_pred cccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCc-EEEEecCCCCCCC
Q 006089 588 FMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMT-ELTYDYGISKSDG 639 (662)
Q Consensus 588 FINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GE-ELT~DYg~~~~~~ 639 (662)
++||||.||+. +..+. ...++.+...+.+++ ||+..|-...+..
T Consensus 208 ~~~hsC~pn~~---~~~~~-----~~~~~~~~~~~~~~~~~l~~~y~~~~~~~ 252 (482)
T KOG2084|consen 208 LFNHSCFPNIS---VIFDG-----RGLALLVPAGIDAGEEELTISYTDPLLST 252 (482)
T ss_pred hcccCCCCCeE---EEECC-----ceeEEEeecccCCCCCEEEEeecccccCH
Confidence 88999999998 23322 256677888888887 9999998876653
No 23
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=70.25 E-value=3.7 Score=46.81 Aligned_cols=41 Identities=27% Similarity=0.360 Sum_probs=31.6
Q ss_pred cccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecCCC
Q 006089 588 FMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYGIS 635 (662)
Q Consensus 588 FINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~ 635 (662)
+.||+|++. ...+...|. .+.+.+.++|.+|||+.+.||..
T Consensus 239 ~~NH~~~~~----~~~~~~~d~---~~~l~~~~~v~~geevfi~YG~~ 279 (472)
T KOG1337|consen 239 LLNHSPEVI----KAGYNQEDE---AVELVAERDVSAGEEVFINYGPK 279 (472)
T ss_pred hhccCchhc----cccccCCCC---cEEEEEeeeecCCCeEEEecCCC
Confidence 579999992 223333332 78899999999999999999963
No 24
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine-rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin (Q9Y4I5 from SWISSPROT) and TSO1 (Q9LE32 from SWISSPROT). This cysteine-rich motif is also referred to as a CXC domain.
Probab=50.76 E-value=11 Score=28.98 Aligned_cols=37 Identities=43% Similarity=1.030 Sum_probs=28.8
Q ss_pred CCCCCCC-CCCcCCCCCcccccccCCCCcccCCceeecCCCceeecCCCCCCCCCCCCcc
Q 006089 422 SFGCNCY-SACGPGNPNCSCVQKNGGDFPYTANGVLVSRKPLIYECGPSCPCNRDCKNRV 480 (662)
Q Consensus 422 ~~gC~C~-~~C~~~~~~C~C~~~n~g~~~Y~~~G~L~~~~~~i~EC~~~C~C~~~C~NRv 480 (662)
..||.|. ..|.. ..|.|.+.. ..|++.|.| ..|.|..
T Consensus 3 ~~gC~Ckks~Clk--~YC~Cf~~g-------------------~~C~~~C~C-~~C~N~~ 40 (42)
T PF03638_consen 3 KKGCNCKKSKCLK--LYCECFQAG-------------------RFCTPNCKC-QNCKNTE 40 (42)
T ss_pred CCCCcccCcChhh--hhCHHHHCc-------------------CcCCCCccc-CCCCCcC
Confidence 4689996 57885 689998653 369999999 8888864
No 25
>KOG3813 consensus Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]
Probab=49.16 E-value=9.2 Score=43.58 Aligned_cols=42 Identities=31% Similarity=0.842 Sum_probs=27.3
Q ss_pred CCCCCCCCCcCCCCCcccccccCCCCcccCCceeecCCCceeecCCCCCCC-CCCCCcc
Q 006089 423 FGCNCYSACGPGNPNCSCVQKNGGDFPYTANGVLVSRKPLIYECGPSCPCN-RDCKNRV 480 (662)
Q Consensus 423 ~gC~C~~~C~~~~~~C~C~~~n~g~~~Y~~~G~L~~~~~~i~EC~~~C~C~-~~C~NRv 480 (662)
-||+|..-|.| +.|+|.+- |+...-.-..|- |+|. ..|.|-+
T Consensus 308 CGCsCr~~CdP--ETCaCSqa----------GIkCQvDr~~fP----CgC~rEgCgNp~ 350 (640)
T KOG3813|consen 308 CGCSCRGVCDP--ETCACSQA----------GIKCQVDRGEFP----CGCFREGCGNPE 350 (640)
T ss_pred hCCcccceeCh--hhcchhcc----------CceEeecCcccc----cccchhhcCCCc
Confidence 59999999998 68999863 332222222232 7776 4799844
No 26
>KOG1171 consensus Metallothionein-like protein [Inorganic ion transport and metabolism]
Probab=43.90 E-value=8.3 Score=43.05 Aligned_cols=37 Identities=38% Similarity=0.992 Sum_probs=30.2
Q ss_pred CCCCCCCCCC-CCcCCCCCcccccccCCCCcccCCceeecCCCceeecCCCCCCCCCCCC
Q 006089 420 QPSFGCNCYS-ACGPGNPNCSCVQKNGGDFPYTANGVLVSRKPLIYECGPSCPCNRDCKN 478 (662)
Q Consensus 420 ~~~~gC~C~~-~C~~~~~~C~C~~~n~g~~~Y~~~G~L~~~~~~i~EC~~~C~C~~~C~N 478 (662)
....||+|.. +|.. ..|.|.+.+ .-|+..|+| ..|.|
T Consensus 215 ~hkkGC~CkkSgClK--kYCECyQa~-------------------vlCS~nCkC-~~CkN 252 (406)
T KOG1171|consen 215 RHKKGCNCKKSGCLK--KYCECYQAG-------------------VLCSSNCKC-QGCKN 252 (406)
T ss_pred hhcCCCCCccccchH--HHHHHHhcC-------------------CCccccccC-cCCcc
Confidence 3567999985 8986 689999865 349999999 78999
No 27
>PF12218 End_N_terminal: N terminal extension of bacteriophage endosialidase; InterPro: IPR024429 This entry represents the N-terminal extension domain of endosialidases, which is approximately 70 amino acids in length. The two N-terminal domains (this domain and the beta propeller) assemble into the compact 'cap', whereas the C-terminal domain forms an extended tail-like structure. The very N-terminal part of the 'cap' region (residues 246 to 312) holds the only alpha-helix of the protein and is presumably the residual part of the deleted N-terminal head-binding domain.; PDB: 3JU4_A 3GVL_A 3GVK_B 3GVJ_A 1V0E_B 1V0F_E.
Probab=33.10 E-value=22 Score=29.56 Aligned_cols=50 Identities=28% Similarity=0.315 Sum_probs=25.8
Q ss_pred HHHHHHHHhCCccEEEeccccccCCCCceeeecCceeeeeeEEecCCCCceEEEEEeeecCCCCCc
Q 006089 290 LALERSLRRASEVRVIRGMKDAINQSSKVYVYDGLYTVQESWTEKGKSGCNIFKYKLVRIPGQPGA 355 (662)
Q Consensus 290 lAL~~S~~~~~pVRViRg~~~~~~~~~~~y~YDGLY~V~~~w~e~g~~G~~v~kfkL~R~pgQp~~ 355 (662)
.|+-..++.-++=++|-|.- . -|+|. ....++-|+=-+|...|+||||-.
T Consensus 11 ~A~~a~l~a~~~g~~IDg~G-------l------TykVs---~lPd~srf~N~rF~~eri~gqpl~ 60 (67)
T PF12218_consen 11 AAITAALEASPVGRKIDGAG-------L------TYKVS---SLPDISRFKNARFVYERIPGQPLY 60 (67)
T ss_dssp HHHHHHHHHS-TTS-EE-TT--------------EEEES---S---GGGEES-EEEE-SSTT--EE
T ss_pred HHHHHHHhccCCCeEEecCC-------c------eEEEe---eCccHHhhccceEEEeecCCCceE
Confidence 56777777767667776632 1 13433 234566677788999999999854
No 28
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=30.18 E-value=25 Score=28.04 Aligned_cols=16 Identities=31% Similarity=0.382 Sum_probs=11.6
Q ss_pred EEEEeecCCCCcEEEE
Q 006089 615 AFFAMRHVPPMTELTY 630 (662)
Q Consensus 615 ~~FA~RdI~~GEELT~ 630 (662)
.+.|.|||++|+.||-
T Consensus 3 vvVA~~di~~G~~i~~ 18 (63)
T PF08666_consen 3 VVVAARDIPAGTVITA 18 (63)
T ss_dssp EEEESSTB-TT-BECT
T ss_pred EEEEeCccCCCCEEcc
Confidence 3689999999999964
No 29
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=24.90 E-value=24 Score=40.37 Aligned_cols=153 Identities=10% Similarity=0.083 Sum_probs=84.0
Q ss_pred eeecCCCCCCCCCCCCcccccCceeeEEEEecCCCCCe---eEeCCccCCCceEEEeecEEeeHHhHh--hhc-C-CCCC
Q 006089 463 IYECGPSCPCNRDCKNRVSQTGLKVRLDVFKTKDRGWG---LRSLDPIRAGTFICEYAGEVVDKFKAR--QDG-E-GSNE 535 (662)
Q Consensus 463 i~EC~~~C~C~~~C~NRv~Q~G~k~~LeVfrT~~kGwG---VrA~~~I~~GtfIcEY~GEvit~~e~~--~~~-~-~~~d 535 (662)
-++|++.|.|.+.+.+-+ +.. .-+..+..+|+ .++...+..|++|++++|+..-..-+. ... . ....
T Consensus 95 c~~ggs~v~~~s~~~~~~-----r~c-~~~~~~~c~~~~~d~~~~~~~~~~~~vw~~vg~~~~~~c~vc~~~~~~~~~~~ 168 (463)
T KOG1081|consen 95 CFKGGSLVTCKSRIQAPH-----RKC-KPAQLEKCSKRCTDCRAFKKREVGDLVWSKVGEYPWWPCMVCHDPLLPKGMKH 168 (463)
T ss_pred ccCCCccceecccccccc-----ccC-cCccCcccccCCcceeeeccccceeEEeEEcCcccccccceecCcccchhhcc
Confidence 355666666655544331 111 11233445666 888889999999999999986544210 000 0 0000
Q ss_pred -ceeeecccccccccccCCCCCccCCCCCCCccccCCCCCEEEeccccCChhhcccCCCCCCceeEEEEEecCCCceeEE
Q 006089 536 -DYVFDTTRTYDSFKWNYEPGLIEDDDPSDTTEEYDLPYPLVISAKNVGNVARFMNHSCSPNVFWQPIIFENNNESFVHV 614 (662)
Q Consensus 536 -~Ylfd~~~~~~~~~w~~~~~l~~~~~~~~~~~~~~~~~~~~IDA~~~GNvaRFINHSC~PN~~~q~V~~d~~d~~~p~I 614 (662)
.-.|-.. ..|-+ ...++...|+..++|+|++.|+-....+... ..+++
T Consensus 169 ~~~~f~~~-----~~~~~----------------------~~~~~~~~g~~~~~l~~~~~~~s~~~~~~~~----~~~r~ 217 (463)
T KOG1081|consen 169 DHVNFFGC-----YAWTH----------------------EKRVFPYEGQSSKLIPHSKKPASTMSEKIKE----AKARF 217 (463)
T ss_pred ccceeccc-----hhhHH----------------------HhhhhhccchHHHhhhhccccchhhhhhhhc----ccchh
Confidence 0111000 00110 1233333999999999999999887776654 34566
Q ss_pred EEEEeecCCCCcE------EEEecCCCCCCCCCCCCCCCeEeecCCCCCcc
Q 006089 615 AFFAMRHVPPMTE------LTYDYGISKSDGGNYEPHRKKKCLCGTLKCRG 659 (662)
Q Consensus 615 ~~FA~RdI~~GEE------LT~DYg~~~~~~~~~~~~~~~~C~CGS~~CrG 659 (662)
..++.+-++-++- ++.+|....+ .....|.|.+..|..
T Consensus 218 ~~~~~q~~~~~~~~e~k~~~~~~~~~~~~-------~~~~~~~~~~~~~~~ 261 (463)
T KOG1081|consen 218 GKLKAQWEAGIKQKELKPEEYKRIKVVCP-------IGDQQIYSAAVSCIK 261 (463)
T ss_pred hhcccchhhccchhhcccccccccccccC-------cCcccccchhhhhhh
Confidence 6777777766655 5555544322 222335665555543
No 30
>KOG2155 consensus Tubulin-tyrosine ligase-related protein [Posttranslational modification, protein turnover, chaperones]
Probab=23.18 E-value=43 Score=37.92 Aligned_cols=50 Identities=20% Similarity=0.308 Sum_probs=38.0
Q ss_pred ChhhcccCCCCCCceeEEEEEecCCCceeEEEEEEeecCCCCcEEEEecCCC
Q 006089 584 NVARFMNHSCSPNVFWQPIIFENNNESFVHVAFFAMRHVPPMTELTYDYGIS 635 (662)
Q Consensus 584 NvaRFINHSC~PN~~~q~V~~d~~d~~~p~I~~FA~RdI~~GEELT~DYg~~ 635 (662)
.++.-+.||-+||..+.+.++--. .-..-.++-+|+...|||+|-|+-+.
T Consensus 203 efGsrvrHsdePnf~~aPf~fmPq--~vaYsimwp~k~~~tgeE~trDfasg 252 (631)
T KOG2155|consen 203 EFGSRVRHSDEPNFRIAPFMFMPQ--NVAYSIMWPTKPVNTGEEITRDFASG 252 (631)
T ss_pred hhhhhhccCCCCcceeeeheecch--hcceeEEeeccCCCCchHHHHHHhhc
Confidence 356678999999999888776432 22334568899999999999998554
Done!