Query 047072
Match_columns 570
No_of_seqs 439 out of 1709
Neff 5.9
Searched_HMMs 46136
Date Fri Mar 29 07:20:23 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/047072.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/047072hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 smart00466 SRA SET and RING fi 100.0 2E-58 4.4E-63 432.1 15.6 150 141-291 2-155 (155)
2 PF02182 SAD_SRA: SAD/SRA doma 100.0 9.5E-56 2.1E-60 416.2 11.6 150 142-291 2-155 (155)
3 KOG1082 Histone H3 (Lys9) meth 100.0 7.8E-53 1.7E-57 446.7 17.3 297 256-570 34-354 (364)
4 KOG4442 Clathrin coat binding 100.0 3.2E-41 6.9E-46 367.3 11.0 202 327-569 43-259 (729)
5 KOG1141 Predicted histone meth 100.0 3.3E-40 7.2E-45 360.1 5.7 154 295-448 668-838 (1262)
6 KOG1080 Histone H3 (Lys4) meth 99.9 1.9E-28 4.1E-33 282.5 9.1 132 409-569 866-1004(1005)
7 KOG1079 Transcriptional repres 99.9 1.5E-26 3.4E-31 251.5 9.7 144 381-545 555-714 (739)
8 smart00317 SET SET (Su(var)3-9 99.9 1.4E-21 3.1E-26 171.0 12.5 108 411-539 2-116 (116)
9 KOG1083 Putative transcription 99.8 3.9E-22 8.4E-27 224.4 2.1 127 397-544 1165-1298(1306)
10 PF05033 Pre-SET: Pre-SET moti 99.8 5.7E-21 1.2E-25 167.9 6.5 98 304-401 1-103 (103)
11 smart00468 PreSET N-terminal t 99.8 2.9E-20 6.2E-25 162.5 7.7 92 302-393 1-98 (98)
12 KOG1085 Predicted methyltransf 99.7 8.5E-17 1.8E-21 161.3 6.2 121 405-542 252-379 (392)
13 COG2940 Proteins containing SE 99.6 6.1E-17 1.3E-21 178.3 4.9 168 388-569 311-479 (480)
14 PF00856 SET: SET domain; Int 99.3 1.2E-12 2.7E-17 119.2 5.9 49 488-540 114-162 (162)
15 KOG1141 Predicted histone meth 99.1 2.6E-10 5.7E-15 127.2 9.6 333 229-570 785-1262(1262)
16 KOG1081 Transcription factor N 98.7 3E-09 6.6E-14 116.7 1.9 148 382-569 286-436 (463)
17 KOG2589 Histone tail methylase 98.4 2.5E-07 5.5E-12 96.5 4.1 102 418-544 136-241 (453)
18 KOG2461 Transcription factor B 97.7 4.7E-05 1E-09 82.4 5.0 118 407-545 26-148 (396)
19 smart00508 PostSET Cysteine-ri 96.4 0.0014 3.1E-08 43.9 1.2 15 555-569 2-16 (26)
20 COG3440 Predicted restriction 95.9 0.0002 4.3E-09 73.7 -7.4 136 149-288 10-149 (301)
21 smart00570 AWS associated with 95.1 0.011 2.4E-07 46.0 1.5 27 380-406 23-49 (51)
22 KOG2084 Predicted histone tail 89.7 0.28 6.1E-06 53.3 3.5 43 495-545 208-251 (482)
23 KOG1337 N-methyltransferase [G 74.0 2.8 6.1E-05 46.7 3.4 40 495-541 239-278 (472)
24 PF12218 End_N_terminal: N ter 47.4 16 0.00034 29.8 2.3 50 230-293 10-59 (67)
25 PF11403 Yeast_MT: Yeast metal 46.2 13 0.00029 26.5 1.5 18 345-362 21-38 (40)
26 KOG3813 Uncharacterized conser 45.5 11 0.00024 42.3 1.5 42 346-402 308-350 (640)
27 PF03638 TCR: Tesmin/TSO1-like 38.6 24 0.00053 26.5 2.0 37 345-402 3-40 (42)
28 KOG1025 Epidermal growth facto 36.2 50 0.0011 39.7 4.9 18 518-535 428-445 (1177)
29 PF08666 SAF: SAF domain; Int 33.7 19 0.00041 28.0 0.8 15 522-536 3-17 (63)
30 KOG2155 Tubulin-tyrosine ligas 27.1 30 0.00065 38.4 1.1 50 492-543 204-253 (631)
31 KOG1081 Transcription factor N 23.6 27 0.00058 39.2 0.0 127 384-534 94-230 (463)
No 1
>smart00466 SRA SET and RING finger associated domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533.
Probab=100.00 E-value=2e-58 Score=432.15 Aligned_cols=150 Identities=57% Similarity=0.933 Sum_probs=141.2
Q ss_pred CceeecCCCccCCceechhhhhhhhcccCCccCCcceecC-CCceeEEEEEeeCCcCCCCCCCCeEEEeCCCCCCCCCCC
Q 047072 141 KKVIGSVPGVEVGDEFQYRVELNMIGLHLQIQGGIDYVKH-EGKINATSIVASGGYDDKLDNSDVLIYTGQGGNVMNGGK 219 (570)
Q Consensus 141 ~~~~G~vpGv~vGd~f~~r~e~~~~GlH~~~~~GI~~~~~-~g~~~A~SIV~Sggy~dd~d~gd~l~YtG~gg~~~~~~~ 219 (570)
+++||+||||+|||+|++|+||+++|||+++|+||||++. +|+++|+|||+||||+||+|+||+|+||||||++. .++
T Consensus 2 ~~~~G~vpGv~vGd~f~~R~el~~~GlH~~~~~GI~~~~~~~~~~~A~SIV~SggYedd~D~gd~liYtG~gg~~~-~~~ 80 (155)
T smart00466 2 KHIFGPVPGVEVGDIFFFRVELCLVGLHRPTQAGIDGLTADEGEPGATSVVSSGGYEDDTDDGDVLIYTGQGGRDM-THG 80 (155)
T ss_pred CceEeCCCCccCCCEEcchhHhhhhcccCcccCCcccccccCCCccEEEEEECCCccCcccCCCEEEEEccCCccC-CCC
Confidence 4689999999999999999999999999999999999984 57888999999999999999999999999999987 568
Q ss_pred CCcccccccccHHHHhhhhhCCCceEEeCCCCC---CcceeeeeeeeEEEEEEEeecCCCCeeEEeeeecccCCC
Q 047072 220 EPEDQKLERGNVALANNIHEQNPVRVIRGDTKA---FEYRTCIYDGLYLVERYWQDVGSHGKLVYKFKLARIPGQ 291 (570)
Q Consensus 220 ~~~dQ~l~~gN~AL~~s~~~~~pVRViRg~~~~---~~~~~y~YDGLY~V~~~w~e~g~~G~~v~kf~L~R~~GQ 291 (570)
|+.||+|++||+||++||++++|||||||.... .+.++|||||||+|++||.|+|++|+.||||+|+|+|||
T Consensus 81 ~~~dQkl~~gNlAL~~S~~~~~PVRViRg~~~~~~~~p~~gyrYDGLY~V~~~w~e~g~~G~~v~kfkL~R~~gQ 155 (155)
T smart00466 81 QPEDQKLERGNLALEASCRKGIPVRVVRGMKGYSKYAPGKGYIYDGLYRIVDYWREVGKSGFLVFKFKLVRIPGQ 155 (155)
T ss_pred CccccEecchhHHHHHHHhcCCceEEEccccccCCCCCCCeEEECcEEEEEEEEEecCCCCcEEEEEEEEeCCCC
Confidence 999999999999999999999999999997633 345899999999999999999999999999999999998
No 2
>PF02182 SAD_SRA: SAD/SRA domain; InterPro: IPR003105 This domain has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is mainly found in plants, animals, and bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING FINGER domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo. In plants the SRA-YDG domain is associated with the SET domain, found in a family of histone methyl transferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif.; GO: 0042393 histone binding; PDB: 2ZO1_B 2ZKD_A 2ZO0_B 2ZKF_A 2ZKG_B 3FDE_A 3F8I_A 2ZO2_B 3F8J_B 2ZKE_A ....
Probab=100.00 E-value=9.5e-56 Score=416.18 Aligned_cols=150 Identities=59% Similarity=0.941 Sum_probs=124.0
Q ss_pred ceeecCCCccCCceechhhhhhhhcccCCccCCcceecCCCceeEEEEEeeCCcCCCCCCCCeEEEeCCCCCCCCCCCCC
Q 047072 142 KVIGSVPGVEVGDEFQYRVELNMIGLHLQIQGGIDYVKHEGKINATSIVASGGYDDKLDNSDVLIYTGQGGNVMNGGKEP 221 (570)
Q Consensus 142 ~~~G~vpGv~vGd~f~~r~e~~~~GlH~~~~~GI~~~~~~g~~~A~SIV~Sggy~dd~d~gd~l~YtG~gg~~~~~~~~~ 221 (570)
++|||||||+|||||++|+||+++|||+++|+||||++.+|.++|+|||+||+|+||+|+||+|+|||+||++..+++|.
T Consensus 2 k~~G~ipGv~vG~~f~~r~~~~~~G~H~~~~~GI~g~~~~g~~~A~SIV~Sg~y~dd~D~gd~l~YtG~gg~~~~~~~~~ 81 (155)
T PF02182_consen 2 KRFGHIPGVEVGDWFPYRMELSIVGLHGPTQAGIDGMKKEGGPVAYSIVLSGGYEDDEDNGDVLIYTGQGGNDLSGNKQP 81 (155)
T ss_dssp TSSS--TT--TT-EESSHHHHHHTTSS--SS-SEEEETTTESEEEEEEEESSSSTTCEECSSEEEEE-SSSB--TTT-B-
T ss_pred CcEeCCCCccCccEEhHHHHHhHhccCCCccCCeecccCCCceeeEEEEECCCcccccCCCCEEEEEcCCCccccccccc
Confidence 57999999999999999999999999999999999999999999999999999999999999999999999999899999
Q ss_pred cccccccccHHHHhhhhhCCCceEEeCCCCCCc---cee-eeeeeeEEEEEEEeecCCCCeeEEeeeecccCCC
Q 047072 222 EDQKLERGNVALANNIHEQNPVRVIRGDTKAFE---YRT-CIYDGLYLVERYWQDVGSHGKLVYKFKLARIPGQ 291 (570)
Q Consensus 222 ~dQ~l~~gN~AL~~s~~~~~pVRViRg~~~~~~---~~~-y~YDGLY~V~~~w~e~g~~G~~v~kf~L~R~~GQ 291 (570)
.||+|++||+||++|+++++|||||||...... ..+ |||||||+|+++|.+++++|+.||||+|+|+|||
T Consensus 82 ~dQ~l~~gN~AL~~S~~~~~PVRViR~~~~~~~~ap~~g~yrYDGLY~V~~~w~~~g~~G~~v~kF~L~R~~gQ 155 (155)
T PF02182_consen 82 KDQKLERGNLALANSMKTGNPVRVIRGYKLKSSYAPKGGIYRYDGLYKVVKYWREKGKSGFKVFKFKLVRLPGQ 155 (155)
T ss_dssp S---SSHHHHHHHHHSGGS-EEEEEEEGGGGGTTS-SSS-EEEEEEEEEEEEEEEE-TTSSEEEEEEEEE-TSS
T ss_pred ccccccchhHHHHHHHhcCCCeEEEeecCCCCccCCcCCCEEeCcEEEEEEEEEEeCCCCcEEEEEEEEECCCC
Confidence 999999999999999999999999999654333 345 9999999999999999999999999999999998
No 3
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=100.00 E-value=7.8e-53 Score=446.71 Aligned_cols=297 Identities=33% Similarity=0.561 Sum_probs=241.1
Q ss_pred eeeeeeeeEEEEEEEeecCCCCeeEEeeeecccCCCCCCceeeeeeeecccCCCCCcCceEEeccCCCCCCCceEeeeee
Q 047072 256 RTCIYDGLYLVERYWQDVGSHGKLVYKFKLARIPGQPELSWKVGLCVDDISQGKELIPICAVNTVDDEMPPSFKYITNII 335 (570)
Q Consensus 256 ~~y~YDGLY~V~~~w~e~g~~G~~v~kf~L~R~~GQp~l~~k~~~~~~DIS~G~E~~PI~~vN~VD~~~pp~F~Yi~~~~ 335 (570)
..++|+|.+.+...|..... ..+..+.+..||+.|.|.+||+++|+||++.++.|+|++..+
T Consensus 34 ~~~~~~~~~~~~~~~~~~~~------------------~~~~~~~~~~d~~~~~e~~~v~~~n~id~~~~~~f~y~~~~~ 95 (364)
T KOG1082|consen 34 LGLRPKGIASDIVAGMANDK------------------DKLEAKSELEDIALGSENLPVPLVNRIDEDAPLYFQYIATEI 95 (364)
T ss_pred cccccCCceeeehhhhcccc------------------cccccccccccccCccccCceeeeeeccCCccccceeccccc
Confidence 57888888888877652111 134567889999999999999999999988778899999888
Q ss_pred cCC-C-CCCCCCCCCccCCCCCCCC--CcccccccCCCccccCCc---ceeccceeeeccCCcCCCCCCCCCcccccCce
Q 047072 336 YPD-W-CRPVPPKGCDCTNGCSKLE--KCACVAKNGGEIPYNHNR---AIVQAKLLVYECGPSCKCPPSCYNRVSQQGIK 408 (570)
Q Consensus 336 ~~~-~-~~~~~~~gC~C~~~C~~~~--~C~C~~~ngg~~~y~~~g---~l~~~~~~i~EC~~~C~C~~~C~NRv~Q~g~~ 408 (570)
++. . ....+..+|.|...|+... .|.|...|++.++|+.++ .....+.++|||++.|+|++.|.|||+|+|++
T Consensus 96 ~~~~~~~~~~~~~~c~C~~~~~~~~~~~C~C~~~n~~~~~~~~~~~~~~~~~~~~~i~EC~~~C~C~~~C~nRv~q~g~~ 175 (364)
T KOG1082|consen 96 VDPGELSDCENSTGCRCCSSCSSVLPLTCLCERHNGGLVAYTCDGDCGTLGKFKEPVFECSVACGCHPDCANRVVQKGLQ 175 (364)
T ss_pred cCccccccCccccCCCccCCCCCCCCccccChHhhCCccccccCCccccccccCccccccccCCCCCCcCcchhhccccc
Confidence 766 2 2335678999999887542 399999999999999887 67788899999999999999999999999999
Q ss_pred eeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcCCCeeEEecCCCCCC-CCccCC------------Cc
Q 047072 409 VQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTSNDKYLFNIGNNYND-GSLWGG------------LS 475 (570)
Q Consensus 409 ~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~~d~Ylf~l~~~~~~-~~~~~~------------~s 475 (570)
.+|+||+|+.+||||||++.|++|+|||||+||+++..+++.+..+..|.++....+.. ...|.. +.
T Consensus 176 ~~leIfrt~~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 255 (364)
T KOG1082|consen 176 FHLEVFRTPEKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRTHLREYLDDDCDAYSIADREWVDESPVGNTFVAPSLP 255 (364)
T ss_pred cceEEEecCCceeeecccccccCCCeeEEEeeEecChHHhhhccccccccccccccchhhhccccccccccccccccccc
Confidence 99999999999999999999999999999999999999999987777777765432211 001110 00
Q ss_pred cccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCcc-cccccC---C
Q 047072 476 NVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMID-QVYDSS---G 551 (570)
Q Consensus 476 ~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~-~~~~~~---g 551 (570)
........+++...||++|||||||+||+.++.|+.++.++.++||+|||+++|+||||||||||..+. .+.+.. .
T Consensus 256 ~~~~~~~~ida~~~GNv~RfinHSC~PN~~~~~v~~~~~~~~~~~i~ffa~~~I~p~~ELT~dYg~~~~~~~~~~~~~~~ 335 (364)
T KOG1082|consen 256 GGPGRELLIDAKPHGNVARFINHSCSPNLLYQAVFQDEFVLLYLRIGFFALRDISPGEELTLDYGKAYKLLVQDGANIYT 335 (364)
T ss_pred cCCCcceEEchhhcccccccccCCCCccceeeeeeecCCccchheeeeeeccccCCCcccchhhcccccccccccccccc
Confidence 111122223446899999999999999999999999999999999999999999999999999998764 112211 1
Q ss_pred CCcCeEEeeCCCCcccccC
Q 047072 552 NIKKKSCFCGSSECTGWLY 570 (570)
Q Consensus 552 ~~k~~~C~CGS~~CRG~ly 570 (570)
...+..|.||+.+||++++
T Consensus 336 ~~~~~~c~c~~~~cr~~~~ 354 (364)
T KOG1082|consen 336 PVMKKNCNCGLEKCRGLLG 354 (364)
T ss_pred cccchhhcCCCHHhCcccC
Confidence 3467899999999999875
No 4
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=100.00 E-value=3.2e-41 Score=367.30 Aligned_cols=202 Identities=30% Similarity=0.614 Sum_probs=159.9
Q ss_pred CceEeeeeecCCCCCC----CCCCCCccCCCCCC--CCCcccccccCCCccccCCcceeccceeeeccCC-cCC-CCCCC
Q 047072 327 SFKYITNIIYPDWCRP----VPPKGCDCTNGCSK--LEKCACVAKNGGEIPYNHNRAIVQAKLLVYECGP-SCK-CPPSC 398 (570)
Q Consensus 327 ~F~Yi~~~~~~~~~~~----~~~~gC~C~~~C~~--~~~C~C~~~ngg~~~y~~~g~l~~~~~~i~EC~~-~C~-C~~~C 398 (570)
.|.-+..++|...... .+..-|+|...=.+ ...|+|.. -+.++....||++ .|. |+..|
T Consensus 43 ~f~~~~e~~y~~krk~~~ee~~~m~Cdc~~~~~d~~n~~~~cg~-------------~CiNr~t~iECs~~~C~~cg~~C 109 (729)
T KOG4442|consen 43 KFENLDEKFYANKRKKKKEENDEMICDCKPKTGDGANGACACGE-------------DCINRMTSIECSDRECPRCGVYC 109 (729)
T ss_pred hhhhhhhhhhHHhhccCcccCcceeeecccccccccccccccCc-------------cccchhhhcccCCccCCCccccc
Confidence 4554555555442111 13457888763222 23455432 2335566789999 899 99999
Q ss_pred CCcccccCceeeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcC-------CCeeEEecCCCCCCCCcc
Q 047072 399 YNRVSQQGIKVQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTS-------NDKYLFNIGNNYNDGSLW 471 (570)
Q Consensus 399 ~NRv~Q~g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~-------~d~Ylf~l~~~~~~~~~~ 471 (570)
.|+.+|+....+++||.|..|||||||..+|++|+||.||.||||+..|+++|.. .+.|.|.+...
T Consensus 110 ~NQRFQkkqyA~vevF~Te~KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~d~~kh~Yfm~L~~~------- 182 (729)
T KOG4442|consen 110 KNQRFQKKQYAKVEVFLTEKKGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAKDGIKHYYFMALQGG------- 182 (729)
T ss_pred cchhhhhhccCceeEEEecCcccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHhcCCceEEEEEecCC-------
Confidence 9999999999999999999999999999999999999999999999999999864 23455555432
Q ss_pred CCCccccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCcccccccCC
Q 047072 472 GGLSNVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMIDQVYDSSG 551 (570)
Q Consensus 472 ~~~s~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~~~~~~~g 551 (570)
.+|||+ .+||.||||||||+|||+++.|.+.+ ..||+|||.|.|+|||||||||++... +.
T Consensus 183 -----e~IDAT-----~KGnlaRFiNHSC~PNa~~~KWtV~~----~lRvGiFakk~I~~GEEITFDYqf~rY---Gr-- 243 (729)
T KOG4442|consen 183 -----EYIDAT-----KKGNLARFINHSCDPNAEVQKWTVPD----ELRVGIFAKKVIKPGEEITFDYQFDRY---GR-- 243 (729)
T ss_pred -----ceeccc-----ccCcHHHhhcCCCCCCceeeeeeeCC----eeEEEEeEecccCCCceeeEecccccc---cc--
Confidence 468886 79999999999999999999999975 478999999999999999999998642 11
Q ss_pred CCcCeEEeeCCCCccccc
Q 047072 552 NIKKKSCFCGSSECTGWL 569 (570)
Q Consensus 552 ~~k~~~C~CGS~~CRG~l 569 (570)
...+|+||+++|+|||
T Consensus 244 --~AQ~CyCgeanC~G~I 259 (729)
T KOG4442|consen 244 --DAQPCYCGEANCRGWI 259 (729)
T ss_pred --cccccccCCccccccc
Confidence 2468999999999997
No 5
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=100.00 E-value=3.3e-40 Score=360.12 Aligned_cols=154 Identities=33% Similarity=0.596 Sum_probs=128.1
Q ss_pred ceeeeeeeecccCCCCCcCceEEeccCCCCCCCceEeeeeecCCC----CCCCCCCCCccCCCCCCCCCccccccc----
Q 047072 295 SWKVGLCVDDISQGKELIPICAVNTVDDEMPPSFKYITNIIYPDW----CRPVPPKGCDCTNGCSKLEKCACVAKN---- 366 (570)
Q Consensus 295 ~~k~~~~~~DIS~G~E~~PI~~vN~VD~~~pp~F~Yi~~~~~~~~----~~~~~~~gC~C~~~C~~~~~C~C~~~n---- 366 (570)
+.++++-+.||+.|+|.+||..+|++|..+||.|.|-...+-... ..+...++|+|..+|.+...|+|.+..
T Consensus 668 p~kp~~~~~Di~~g~e~vpis~~neids~~lpq~ay~K~~ip~~~nl~n~~~~fl~scdc~~gcid~~kcachQltvk~~ 747 (1262)
T KOG1141|consen 668 PLKPGNRCTDIPCGREHVPISEKNEIDSHRLPQAAYKKHMIPTNNNLSNRRKDFLQSCDCPTGCIDSMKCACHQLTVKKK 747 (1262)
T ss_pred CcCCcceeccccCCccccccceeecccCcCCccchhheeeccCCCcccccChhhhhcCCCCcchhhhhhhhHHHHHHHhh
Confidence 456899999999999999999999999999999999887654332 234457899999999999999997632
Q ss_pred ----CCCc----cccCCcceeccceeeeccCCcCCCCC-CCCCcccccCceeeEEEEEcCCCCceEeecCccCCCCeEEE
Q 047072 367 ----GGEI----PYNHNRAIVQAKLLVYECGPSCKCPP-SCYNRVSQQGIKVQLEIYKTEARGWGVRSLNSIAPGSFIYE 437 (570)
Q Consensus 367 ----gg~~----~y~~~g~l~~~~~~i~EC~~~C~C~~-~C~NRv~Q~g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~E 437 (570)
++.. .|.+.++.-.....+|||+..|+|.+ .|.||++|+|.+.+|.+|+|..+|||+|.+++|..|+|||-
T Consensus 748 ~t~p~~~v~~t~gykyKRl~e~~ptg~yEc~k~ckc~~~~C~nrmvqhg~qvRlq~fkt~~kGWg~rclddi~~g~fVci 827 (1262)
T KOG1141|consen 748 TTGPNQNVASTNGYKYKRLIEIRPTGPYECLKACKCCGPDCLNRMVQHGYQVRLQRFKTIHKGWGRRCLDDITGGNFVCI 827 (1262)
T ss_pred ccCCCcccccCcchhhHHHHHhcCCCHHHHHHhhccCcHHHHHHHhhcCceeEeeeccccccccceEeeeecCCceEEEE
Confidence 1111 24455544445567999999999875 79999999999999999999999999999999999999999
Q ss_pred EeeeeecHHHH
Q 047072 438 FVGELLEEKEA 448 (570)
Q Consensus 438 Y~GEvi~~~e~ 448 (570)
|.|.++++.-+
T Consensus 828 y~g~~l~~~~s 838 (1262)
T KOG1141|consen 828 YPGGALLHQIS 838 (1262)
T ss_pred ecchhhhhhhc
Confidence 99999876543
No 6
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=99.95 E-value=1.9e-28 Score=282.45 Aligned_cols=132 Identities=34% Similarity=0.758 Sum_probs=114.6
Q ss_pred eeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhc------C-CCeeEEecCCCCCCCCccCCCccccCCC
Q 047072 409 VQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRT------S-NDKYLFNIGNNYNDGSLWGGLSNVMPDA 481 (570)
Q Consensus 409 ~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~------~-~d~Ylf~l~~~~~~~~~~~~~s~~~iDa 481 (570)
.+|.-.++..+||||||+++|.+|++|.||+||+|...-++.|. . .+.|+|.++.. .++||
T Consensus 866 k~~~F~~s~iH~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~~Y~~~gi~~sYlfrid~~------------~ViDA 933 (1005)
T KOG1080|consen 866 KYVKFGRSGIHGWGLFAMENIAAGDMVIEYRGELVRSSIADLREARYERMGIGDSYLFRIDDE------------VVVDA 933 (1005)
T ss_pred hhhccccccccccceeeccCccccceEEEeeceehhhhHHHHHHHHHhccCcccceeeecccc------------eEEec
Confidence 34666788899999999999999999999999999876665543 2 68999999853 56888
Q ss_pred CCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCcccccccCCCCcCeEEeeC
Q 047072 482 PSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMIDQVYDSSGNIKKKSCFCG 561 (570)
Q Consensus 482 ~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~~~~~~~g~~k~~~C~CG 561 (570)
+ .+||+||||||||.|||++..+.+++. .+|+|||.|||.+||||||||.+..+. -+.+|+||
T Consensus 934 t-----k~gniAr~InHsC~PNCyakvi~V~g~----~~IvIyakr~I~~~EElTYDYkF~~e~--------~kipClCg 996 (1005)
T KOG1080|consen 934 T-----KKGNIARFINHSCNPNCYAKVITVEGD----KRIVIYSKRDIAAGEELTYDYKFPTED--------DKIPCLCG 996 (1005)
T ss_pred c-----ccCchhheeecccCCCceeeEEEecCe----eEEEEEEecccccCceeeeeccccccc--------cccccccC
Confidence 6 799999999999999999999999865 589999999999999999999987653 26799999
Q ss_pred CCCccccc
Q 047072 562 SSECTGWL 569 (570)
Q Consensus 562 S~~CRG~l 569 (570)
|++|||+|
T Consensus 997 ap~Crg~~ 1004 (1005)
T KOG1080|consen 997 APNCRGFL 1004 (1005)
T ss_pred CCcccccc
Confidence 99999997
No 7
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=99.93 E-value=1.5e-26 Score=251.48 Aligned_cols=144 Identities=33% Similarity=0.557 Sum_probs=125.8
Q ss_pred cceeeeccCC-cCCCC----------CCCCCcccccCceeeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHH
Q 047072 381 AKLLVYECGP-SCKCP----------PSCYNRVSQQGIKVQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAE 449 (570)
Q Consensus 381 ~~~~i~EC~~-~C~C~----------~~C~NRv~Q~g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~ 449 (570)
+.....||.| .|.+. -+|.|--+|++.+.++.+..+...|||+|+.+...+++||.||+||+|+++||+
T Consensus 555 C~~A~rECdPd~Cl~cg~~~~~d~~~~~C~N~~l~~~~qkr~llapSdVaGwGlFlKe~v~KnefisEY~GE~IS~dEAD 634 (739)
T KOG1079|consen 555 CYLAVRECDPDVCLMCGNVDHFDSSKISCKNTNLQRGEQKRVLLAPSDVAGWGLFLKESVSKNEFISEYTGEIISHDEAD 634 (739)
T ss_pred hhhhccccCchHHhccCcccccccCccccccchhhhhhhcceeechhhccccceeeccccCCCceeeeecceeccchhhh
Confidence 3445789997 47652 289999999999999999999999999999999999999999999999999999
Q ss_pred HhcC-----CCeeEEecCCCCCCCCccCCCccccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEE
Q 047072 450 RRTS-----NDKYLFNIGNNYNDGSLWGGLSNVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLF 524 (570)
Q Consensus 450 ~r~~-----~d~Ylf~l~~~~~~~~~~~~~s~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~F 524 (570)
+|.. +-+|+|++... +++|++ ++||.+||+|||-.|||++..+++.+. .+|+||
T Consensus 635 rRGkiYDr~~cSflFnln~d------------yviDs~-----rkGnk~rFANHS~nPNCYAkvm~V~Gd----hRIGif 693 (739)
T KOG1079|consen 635 RRGKIYDRYMCSFLFNLNND------------YVIDST-----RKGNKIRFANHSFNPNCYAKVMMVAGD----HRIGIF 693 (739)
T ss_pred hcccccccccceeeeecccc------------ceEeee-----eecchhhhccCCCCCCcEEEEEEecCC----cceeee
Confidence 9864 45788888654 456774 899999999999999999998888754 679999
Q ss_pred EeecCCCCCeEEEecCCCccc
Q 047072 525 AAENISPLQELTYHYSYMIDQ 545 (570)
Q Consensus 525 A~rdI~~GEELT~DYg~~~~~ 545 (570)
|.|.|.+||||||||.|+.++
T Consensus 694 AkRaIeagEELffDYrYs~~~ 714 (739)
T KOG1079|consen 694 AKRAIEAGEELFFDYRYSPEH 714 (739)
T ss_pred ehhhcccCceeeeeeccCccc
Confidence 999999999999999997653
No 8
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.87 E-value=1.4e-21 Score=171.04 Aligned_cols=108 Identities=42% Similarity=0.757 Sum_probs=89.2
Q ss_pred EEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcC-----C--CeeEEecCCCCCCCCccCCCccccCCCCC
Q 047072 411 LEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTS-----N--DKYLFNIGNNYNDGSLWGGLSNVMPDAPS 483 (570)
Q Consensus 411 LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~-----~--d~Ylf~l~~~~~~~~~~~~~s~~~iDa~~ 483 (570)
+++++++.+|+||+|+++|++|++|++|.|+++...++..... . ..|+|.... .+.+|+.
T Consensus 2 ~~~~~~~~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------------~~~id~~- 68 (116)
T smart00317 2 LEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERSKAYDTDGADSFYLFEIDS------------DLCIDAR- 68 (116)
T ss_pred cEEEecCCCcEEEEECCccCCCCEEEEEEeEEECHHHHHHHHHHHHhcCCCCEEEEECCC------------CEEEeCC-
Confidence 6788999999999999999999999999999999877665421 1 356665532 2456664
Q ss_pred CCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEec
Q 047072 484 SSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHY 539 (570)
Q Consensus 484 ~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DY 539 (570)
..||++|||||||.||+.++.+..++. .++.|+|+|||++|||||+||
T Consensus 69 ----~~~~~~~~iNHsc~pN~~~~~~~~~~~----~~~~~~a~r~I~~GeEi~i~Y 116 (116)
T smart00317 69 ----RKGNIARFINHSCEPNCELLFVEVNGD----SRIVIFALRDIKPGEELTIDY 116 (116)
T ss_pred ----ccCcHHHeeCCCCCCCEEEEEEEECCC----cEEEEEECCCcCCCCEEeecC
Confidence 589999999999999999988876543 379999999999999999999
No 9
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=99.84 E-value=3.9e-22 Score=224.41 Aligned_cols=127 Identities=38% Similarity=0.613 Sum_probs=104.2
Q ss_pred CCCCccccc-CceeeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcC------CCeeEEecCCCCCCCC
Q 047072 397 SCYNRVSQQ-GIKVQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTS------NDKYLFNIGNNYNDGS 469 (570)
Q Consensus 397 ~C~NRv~Q~-g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~------~d~Ylf~l~~~~~~~~ 469 (570)
+|.|+.+|+ +.-.+|+||+.+.+||||+|.++|++|+||+||+|||++.++++.++. .+.|+..++
T Consensus 1165 ~c~nqrm~r~e~cp~L~v~~gp~~G~~v~tk~PikagtfI~EYvGeVit~ke~e~~mmtl~~~d~~~~cL~I~------- 1237 (1306)
T KOG1083|consen 1165 SCSNQRMQRHEECPPLEVFRGPKKGWGVRTKEPIKAGTFIMEYVGEVITEKEFEPRMMTLYHNDDDHYCLVID------- 1237 (1306)
T ss_pred hhhhHHhhhhccCCCcceeccCCCCccccccccccccchHHHHHHHHHHHHhhcccccccCCCCCcccccccC-------
Confidence 477777665 455789999999999999999999999999999999999999888732 122332221
Q ss_pred ccCCCccccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCcc
Q 047072 470 LWGGLSNVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMID 544 (570)
Q Consensus 470 ~~~~~s~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~ 544 (570)
..+++|. .++||.+||+||||.|||..|.|.+.+. .||++||+|||++||||||||++...
T Consensus 1238 -----p~l~id~-----~R~~n~~RfinhscKPNc~~qkwSVNG~----~Rv~L~A~rDi~kGEELtYDYN~ks~ 1298 (1306)
T KOG1083|consen 1238 -----PGLFIDI-----PRMGNGARFINHSCKPNCEMQKWSVNGE----YRVGLFALRDLPKGEELTYDYNFKSF 1298 (1306)
T ss_pred -----ccccCCh-----hhccccccccccccCCCCccccccccce----eeeeeeecCCCCCCceEEEecccccc
Confidence 1234454 4799999999999999999999988755 89999999999999999999987543
No 10
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=99.83 E-value=5.7e-21 Score=167.95 Aligned_cols=98 Identities=46% Similarity=0.940 Sum_probs=71.6
Q ss_pred cccCCCCCcCceEEeccCCCCC-CCceEeeeeecCCCCC---CCCCCCCccCCCCCCCCCcccccccCCCccccCCccee
Q 047072 304 DISQGKELIPICAVNTVDDEMP-PSFKYITNIIYPDWCR---PVPPKGCDCTNGCSKLEKCACVAKNGGEIPYNHNRAIV 379 (570)
Q Consensus 304 DIS~G~E~~PI~~vN~VD~~~p-p~F~Yi~~~~~~~~~~---~~~~~gC~C~~~C~~~~~C~C~~~ngg~~~y~~~g~l~ 379 (570)
|||+|+|.+||+++|+||++.| |.|+||+++++..... .....+|+|.+.|....+|+|...+++.++|+.+|+|.
T Consensus 1 Dis~g~e~~pI~~~N~vd~~~~p~~F~Yi~~~~~~~~~~~~~~~~~~~C~C~~~C~~~~~C~C~~~~~~~~~Y~~~g~l~ 80 (103)
T PF05033_consen 1 DISRGKENVPIPVVNDVDDEPPPPNFEYIPENIYGEGVPDIDPEFLQGCDCSGDCSNPSNCECLQRNGGIFAYDSNGRLR 80 (103)
T ss_dssp -TTCTSSSS-EEEEESSSS--SSTSSEE-SS-EESTTSS-TBGGGTS----SSSSTCTTTSHHHCCTSSS-SB-TTSSBS
T ss_pred CCCCCccCCCEEEEeCCCCCCCCCCeEEeeeEEcCCCccccccccCccCccCCCCCCCCCCcCccccCccccccCCCcCc
Confidence 8999999999999999999986 5899999999877432 23467999999998888999999998889999999887
Q ss_pred -ccceeeeccCCcCCCCCCCCCc
Q 047072 380 -QAKLLVYECGPSCKCPPSCYNR 401 (570)
Q Consensus 380 -~~~~~i~EC~~~C~C~~~C~NR 401 (570)
....+||||++.|.|+++|+||
T Consensus 81 ~~~~~~i~EC~~~C~C~~~C~NR 103 (103)
T PF05033_consen 81 IPDKPPIFECNDNCGCSPSCRNR 103 (103)
T ss_dssp SSSTSEEE---TTSSS-TTSTT-
T ss_pred cCCCCeEEeCCCCCCCCCCCCCC
Confidence 6788999999999999999998
No 11
>smart00468 PreSET N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.
Probab=99.81 E-value=2.9e-20 Score=162.53 Aligned_cols=92 Identities=43% Similarity=0.881 Sum_probs=81.8
Q ss_pred eecccCCCCCcCceEEeccCCCCC-CCceEeeeeecCCCC----CCCCCCCCccCCCCCCCCCcccccccCCCccc-cCC
Q 047072 302 VDDISQGKELIPICAVNTVDDEMP-PSFKYITNIIYPDWC----RPVPPKGCDCTNGCSKLEKCACVAKNGGEIPY-NHN 375 (570)
Q Consensus 302 ~~DIS~G~E~~PI~~vN~VD~~~p-p~F~Yi~~~~~~~~~----~~~~~~gC~C~~~C~~~~~C~C~~~ngg~~~y-~~~ 375 (570)
+.|||+|+|++||++||+||++.| +.|+||++++++... ...+..||+|.++|++...|.|.+++++.++| ...
T Consensus 1 ~~Dis~G~E~~pI~~vN~vD~~~~p~~F~Yi~~~~~~~gv~~~~~~~~~~gC~C~~~C~~~~~C~C~~~~~~~~~Y~~~~ 80 (98)
T smart00468 1 CLDISNGKENVPVPLVNEVDEDPPPPDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSSSNKCECARKNGGEFAYELNG 80 (98)
T ss_pred CccccCCccCCCcceEecCCCCCCCCCcEECcceEcCCCcccccCCCCCCCCcCCCCCCCCCcCCcHhhcCCccCcccCC
Confidence 369999999999999999999876 589999999988743 34567899999999987679999999999999 778
Q ss_pred cceeccceeeeccCCcCC
Q 047072 376 RAIVQAKLLVYECGPSCK 393 (570)
Q Consensus 376 g~l~~~~~~i~EC~~~C~ 393 (570)
++++..+++|||||+.|+
T Consensus 81 ~~~~~~~~~IyECn~~C~ 98 (98)
T smart00468 81 GLRLKRKPLIYECNSRCS 98 (98)
T ss_pred CEEeCCCCEEEcCCCCCC
Confidence 888899999999999985
No 12
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=99.66 E-value=8.5e-17 Score=161.25 Aligned_cols=121 Identities=31% Similarity=0.459 Sum_probs=96.4
Q ss_pred cCceeeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcC---C----CeeEEecCCCCCCCCccCCCccc
Q 047072 405 QGIKVQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTS---N----DKYLFNIGNNYNDGSLWGGLSNV 477 (570)
Q Consensus 405 ~g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~---~----d~Ylf~l~~~~~~~~~~~~~s~~ 477 (570)
.|..-.|.+..-.+||.||+|...+.+|+||.||.|.+|.-.|+..|.. + ..|+|...++ ...+
T Consensus 252 ~g~~egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdliei~eAk~rE~~Ya~De~~GcYMYyF~h~---------sk~y 322 (392)
T KOG1085|consen 252 KGTNEGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLIEISEAKVREEQYANDEEIGCYMYYFEHN---------SKKY 322 (392)
T ss_pred hccccceeEEeeccccceeEeecccccCceEEEEecceeeechHHHHHHHhccCcccceEEEeeecc---------Ceee
Confidence 3444455555556799999999999999999999999998877766542 1 2344433322 2347
Q ss_pred cCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCC
Q 047072 478 MPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYM 542 (570)
Q Consensus 478 ~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~ 542 (570)
+||++ ...+-.+|.||||-.+||....|.+++. ||+.+.|.|||.+||||+||||..
T Consensus 323 CiDAT----~et~~lGRLINHS~~gNl~TKvv~Idg~----pHLiLvA~rdIa~GEELlYDYGDR 379 (392)
T KOG1085|consen 323 CIDAT----KETPWLGRLINHSVRGNLKTKVVEIDGS----PHLILVARRDIAQGEELLYDYGDR 379 (392)
T ss_pred eeecc----cccccchhhhcccccCcceeeEEEecCC----ceEEEEeccccccchhhhhhcccc
Confidence 89998 3577889999999999999999999865 999999999999999999999964
No 13
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=99.65 E-value=6.1e-17 Score=178.31 Aligned_cols=168 Identities=28% Similarity=0.452 Sum_probs=120.7
Q ss_pred cCCcCCCCCCCCCcccccCceeeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcCCCeeEEecCCCCCC
Q 047072 388 CGPSCKCPPSCYNRVSQQGIKVQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTSNDKYLFNIGNNYND 467 (570)
Q Consensus 388 C~~~C~C~~~C~NRv~Q~g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~~d~Ylf~l~~~~~~ 467 (570)
+...+.+...+.|...+........+..+..+||||||++.|++|++|.+|.|+++...++..+.... ..+...+.-
T Consensus 311 ~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~fa~~~i~~~e~i~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~ 387 (480)
T COG2940 311 SKSNVSKLKELLNSNGCKKRREPNVVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREENY---DLLGNEFSF 387 (480)
T ss_pred ccccCccccchhhhcccccccchhhhhhhcccccceeehhhccchHHHHHhcCcccchHHHHhhhccc---cccccccch
Confidence 34444444566666666667777788888999999999999999999999999999988887765322 111111110
Q ss_pred CCccCCCccccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCccccc
Q 047072 468 GSLWGGLSNVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMIDQVY 547 (570)
Q Consensus 468 ~~~~~~~s~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~~~~ 547 (570)
..+ ... ...+|+ ...|+++|||||||.||+.+......+ ..++.++|++||.+|||||+||+..++...
T Consensus 388 ~~~-~~~-~~~~d~-----~~~g~~~r~~nHS~~pN~~~~~~~~~g----~~~~~~~~~rDI~~geEl~~dy~~~~~~~~ 456 (480)
T COG2940 388 GLL-EDK-DKVRDS-----QKAGDVARFINHSCTPNCEASPIEVNG----IFKISIYAIRDIKAGEELTYDYGPSLEDNR 456 (480)
T ss_pred hhc-ccc-chhhhh-----hhcccccceeecCCCCCcceecccccc----cceeeecccccchhhhhhccccccccccch
Confidence 000 000 223333 478999999999999999988765543 467999999999999999999998876422
Q ss_pred c-cCCCCcCeEEeeCCCCccccc
Q 047072 548 D-SSGNIKKKSCFCGSSECTGWL 569 (570)
Q Consensus 548 ~-~~g~~k~~~C~CGS~~CRG~l 569 (570)
. ..-......|.|++..|+++|
T Consensus 457 ~~~~~~~~~~~~~~~~~~~~~~~ 479 (480)
T COG2940 457 ELKKLLEKRWGCACGEDRCSHTM 479 (480)
T ss_pred hhhhhhhhhhccccCCCccCCCC
Confidence 1 111124579999999999987
No 14
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs, which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, the Rubisco LSMT structure shows that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination, and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.34 E-value=1.2e-12 Score=119.16 Aligned_cols=49 Identities=20% Similarity=0.226 Sum_probs=39.9
Q ss_pred ccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecC
Q 047072 488 VYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYS 540 (570)
Q Consensus 488 ~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg 540 (570)
...+++.|+||||.|||.+...... .-..+.|.|.|+|++|||||++||
T Consensus 114 ~l~p~~d~~NHsc~pn~~~~~~~~~----~~~~~~~~a~r~I~~GeEi~isYG 162 (162)
T PF00856_consen 114 ALYPFADMLNHSCDPNCEVSFDFDG----DGGCLVVRATRDIKKGEEIFISYG 162 (162)
T ss_dssp EEETGGGGSEEESSTSEEEEEEEET----TTTEEEEEESS-B-TTSBEEEEST
T ss_pred ccCcHhHheccccccccceeeEeec----ccceEEEEECCccCCCCEEEEEEC
Confidence 3668899999999999998876542 235799999999999999999997
No 15
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=99.09 E-value=2.6e-10 Score=127.25 Aligned_cols=333 Identities=32% Similarity=0.500 Sum_probs=226.4
Q ss_pred ccHHHHhhhhhCCCceEEeCCCCCCcc------------eeeeeeeeEEEEEEEeecCCCCee----------EEeeeec
Q 047072 229 GNVALANNIHEQNPVRVIRGDTKAFEY------------RTCIYDGLYLVERYWQDVGSHGKL----------VYKFKLA 286 (570)
Q Consensus 229 gN~AL~~s~~~~~pVRViRg~~~~~~~------------~~y~YDGLY~V~~~w~e~g~~G~~----------v~kf~L~ 286 (570)
||.-|.+-...+..||.-|=.+.+..| -+|+|-|--..... ..++|.. -|.|+ .
T Consensus 785 ~~~C~nrmvqhg~qvRlq~fkt~~kGWg~rclddi~~g~fVciy~g~~l~~~~---sdks~~~~~~~~~~~id~~~f~-~ 860 (1262)
T KOG1141|consen 785 GPDCLNRMVQHGYQVRLQRFKTIHKGWGRRCLDDITGGNFVCIYPGGALLHQI---SDKSEYIHVTRSLLTIDCFSFD-A 860 (1262)
T ss_pred cHHHHHHHhhcCceeEeeeccccccccceEeeeecCCceEEEEecchhhhhhh---chhhhhcccchhhhcccccchh-c
Confidence 556666667788999998864332222 46667654332221 1222211 13333 3
Q ss_pred ccCCCCCCcee-eeeeeecccCCCCCcCceEEeccCCCCCCCceEeeeeecCCC------CCCCCCCCCccCCCCCCCCC
Q 047072 287 RIPGQPELSWK-VGLCVDDISQGKELIPICAVNTVDDEMPPSFKYITNIIYPDW------CRPVPPKGCDCTNGCSKLEK 359 (570)
Q Consensus 287 R~~GQp~l~~k-~~~~~~DIS~G~E~~PI~~vN~VD~~~pp~F~Yi~~~~~~~~------~~~~~~~gC~C~~~C~~~~~ 359 (570)
|+.-..++... .|+-..|.+.|.+.+|||.+|.+|++.||..+|......+.+ .......+|+|.+.|++...
T Consensus 861 ~~dt~~~~tvD~~g~d~~d~~~g~sg~~~p~~~~~d~~~~~~c~d~~~~~~~~~~~~~s~~~~~~~~~~s~d~hp~d~~~ 940 (1262)
T KOG1141|consen 861 RIDTATYITVDDKGLDVADFSLGTSGIPIPLVNSVDNDEPPSCEDSKRRFQYNDQVDISSVSRDFCSGCSCDGHPSDASK 940 (1262)
T ss_pred cccccceeeccccccchhhhhccccCCCCccccccccCCCccccccceeecccccchhhhhccccccccccCCCCcccCc
Confidence 44444444322 688899999999999999999999999987665543322221 23345689999999999889
Q ss_pred cccccccC---CCcc--ccCCcc--e--------eccceeeeccCCcCCCCCCCCCcccccCceee--------EEEEEc
Q 047072 360 CACVAKNG---GEIP--YNHNRA--I--------VQAKLLVYECGPSCKCPPSCYNRVSQQGIKVQ--------LEIYKT 416 (570)
Q Consensus 360 C~C~~~ng---g~~~--y~~~g~--l--------~~~~~~i~EC~~~C~C~~~C~NRv~Q~g~~~~--------LeVfrT 416 (570)
|.|.+... +..| +..++. + -..+...|||+..|.|...|.|+++|.+.+++ |.||++
T Consensus 941 ~~~~~~~~~~~~~cpp~~s~d~~~~~~eS~~~~ns~~~~~f~e~~~hss~~~~e~~~~v~~~~~~~me~~s~~~l~i~~~ 1020 (1262)
T KOG1141|consen 941 CECQQLSIEAMKRCPPNLSFDGHDELYESSEKQNSFLKLFFFECNDHSSCHRKEYNRVVQNNIKYPMEVSSFNDLQIFKT 1020 (1262)
T ss_pred ccCCCCChhhhcCCCCccccCchhhhhhhhhhcchhhhccceeccccchhcccccchhhhcCCccceeeeeccccccccc
Confidence 99986431 1222 222221 1 11234688999999999999999999998766 456777
Q ss_pred CCCCceEeecCccCCCCeEEEEeeeeecHHHHHHh--cCCCeeEEecCCC----------------------CCCCCccC
Q 047072 417 EARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERR--TSNDKYLFNIGNN----------------------YNDGSLWG 472 (570)
Q Consensus 417 ~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r--~~~d~Ylf~l~~~----------------------~~~~~~~~ 472 (570)
...|||+++..+|+.-+|||+|+|...+..-+.+. .+.+.|.-+++.. +....-.+
T Consensus 1021 ~~~~~~~~edtD~~~~~~~~~~~~~ppt~~l~~~~r~aqad~~sn~~D~~~~~~l~es~~~~~T~~r~~t~~~~~~~~~d 1100 (1262)
T KOG1141|consen 1021 AQSGWGVREDTDIPQSTFICTYVGAPPTDDLADELRNAQADQYSNDLDLKDTVELEESREDHETDFRGDTSDYDDEEGSD 1100 (1262)
T ss_pred ccccccccccccCCCCcccccccCCCCchhhHHHHhhhhhccccCccchhhhhhhhhcccccccccCCCCCCCccccccc
Confidence 88999999999999999999999999876544321 1122222111110 00000000
Q ss_pred --------------------C-------------------CccccCC----------C--------------------CC
Q 047072 473 --------------------G-------------------LSNVMPD----------A--------------------PS 483 (570)
Q Consensus 473 --------------------~-------------------~s~~~iD----------a--------------------~~ 483 (570)
. .+..-+| . ..
T Consensus 1101 ~dd~q~I~k~ve~qd~~~~~~~T~~~~RQ~~~~s~k~~~~~s~~~~~~ts~~~~~~dkges~~~~~~~~~~y~~~~~~yv 1180 (1262)
T KOG1141|consen 1101 GDDGQDIMKMVERQDSSESGEETKRLTRQKRKQSKKSGKGGSVEKDDTTSRDSMEKDKGESKDEPVFNWDKYFEPFPLYV 1180 (1262)
T ss_pred CccHHHHHHHhhcccccccccccchhhhhhhhhhhhcccCccccccccCccchhhhccCccCcccccchhhccCCCceEE
Confidence 0 0000000 0 00
Q ss_pred CCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCcccccccCCCCcCeEEeeCCC
Q 047072 484 SSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMIDQVYDSSGNIKKKSCFCGSS 563 (570)
Q Consensus 484 ~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~~~~~~~g~~k~~~C~CGS~ 563 (570)
++++..||++||+||||+||+.+|+|+++.+|.++|.+||||.+-|++|+||||||+|+..++.. +...|+||+.
T Consensus 1181 IDAk~eGNlGRfLNHSC~PNl~VQnVfvdTHdlrfPwVAFFt~kyVkAgtELTWDY~Ye~g~v~~-----keL~C~CGa~ 1255 (1262)
T KOG1141|consen 1181 IDAKQEGNLGRFLNHSCDPNLHVQNVFVDTHDLRFPWVAFFTRKYVKAGTELTWDYQYEQGQVAT-----KELTCHCGAE 1255 (1262)
T ss_pred EecccccchhhhhccCCCccceeeeeeeeccccCCchhhhhhhhhhccCceeeeecccccccccc-----ceEEEecChh
Confidence 12356999999999999999999999999999999999999999999999999999999887652 5689999999
Q ss_pred CcccccC
Q 047072 564 ECTGWLY 570 (570)
Q Consensus 564 ~CRG~ly 570 (570)
+|||+|.
T Consensus 1256 ~CrgrLL 1262 (1262)
T KOG1141|consen 1256 NCRGRLL 1262 (1262)
T ss_pred hhhcccC
Confidence 9999984
No 16
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=98.74 E-value=3e-09 Score=116.72 Aligned_cols=148 Identities=30% Similarity=0.486 Sum_probs=100.9
Q ss_pred ceeeeccC-CcCCCCCCCCCcccccCceeeEEEEEcCCCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcCC--CeeE
Q 047072 382 KLLVYECG-PSCKCPPSCYNRVSQQGIKVQLEIYKTEARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTSN--DKYL 458 (570)
Q Consensus 382 ~~~i~EC~-~~C~C~~~C~NRv~Q~g~~~~LeVfrT~~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~~--d~Yl 458 (570)
....+||- ..|.+...|.|+......... . +. +|..+|.++ +|++++..+...+... ..-+
T Consensus 286 ~~~~~~~~p~~~~~~~~~~~~~~sk~~~~e------~-~~---~~~~~~~k~------vg~~i~~~e~~~~~~~~~~~~~ 349 (463)
T KOG1081|consen 286 KMLAYEVHPKVCSAEERCHNQQFSKESYPE------P-QK---TAKADIRKG------VGEVIDDKECKARLQRVKESDL 349 (463)
T ss_pred Hhhhhhhcccccccccccccchhhhhcccc------c-ch---hhHHhhhcc------cCcccchhhheeehhhhhccch
Confidence 34456665 469898899988764433222 2 22 788888888 8999998887655421 0000
Q ss_pred EecCCCCCCCCccCCCccccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEe
Q 047072 459 FNIGNNYNDGSLWGGLSNVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYH 538 (570)
Q Consensus 459 f~l~~~~~~~~~~~~~s~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~D 538 (570)
.+..... + .....+|+ ..+||.+||+||||+||+..+.+.+.. ..++.+||.++|++||||||+
T Consensus 350 ~~~~~~~----~---e~~~~id~-----~~~~n~sr~~nh~~~~~v~~~k~~~~~----~t~~~~~a~~~i~~g~e~t~~ 413 (463)
T KOG1081|consen 350 VDFYMVF----I---QKDRIIDA-----GPKGNYSRFLNHSCQPNVETEKWQVIG----DTRVGLFAPRQIEAGEELTFN 413 (463)
T ss_pred hhhhhhh----h---hccccccc-----ccccchhhhhcccCCCceeechhheec----ccccccccccccccchhhhhe
Confidence 0000000 0 00013455 479999999999999999988776543 367999999999999999999
Q ss_pred cCCCcccccccCCCCcCeEEeeCCCCccccc
Q 047072 539 YSYMIDQVYDSSGNIKKKSCFCGSSECTGWL 569 (570)
Q Consensus 539 Yg~~~~~~~~~~g~~k~~~C~CGS~~CRG~l 569 (570)
|...-. ...+.|.|++.+|.+.+
T Consensus 414 ~n~~~~--------~~~~~~~~~~e~~~~~~ 436 (463)
T KOG1081|consen 414 YNGNCE--------GNEKRCCCGSENCTETK 436 (463)
T ss_pred eecccc--------CCcceEeecccccccCC
Confidence 986432 24578999999998764
No 17
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=98.37 E-value=2.5e-07 Score=96.53 Aligned_cols=102 Identities=22% Similarity=0.258 Sum_probs=69.9
Q ss_pred CCCceEeecCccCCCCeEEEEeeeeecHHHHHHhc----CCCeeEEecCCCCCCCCccCCCccccCCCCCCCCcccCCee
Q 047072 418 ARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRT----SNDKYLFNIGNNYNDGSLWGGLSNVMPDAPSSSCGVYGNVG 493 (570)
Q Consensus 418 ~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~----~~d~Ylf~l~~~~~~~~~~~~~s~~~iDa~~~~~~~~GNva 493 (570)
..|--|.|++.+.+|+=|--.+|-|+.-.+++.+. ..++|.........- +-..=..|
T Consensus 136 ~~gAkivst~~w~~ndkIe~LvGcIaeLse~eE~~ll~~g~nDFSvmyStRk~c------------------aqLwLGPa 197 (453)
T KOG2589|consen 136 QNGAKIVSTKSWSRNDKIELLVGCIAELSEAEERSLLRGGGNDFSVMYSTRKRC------------------AQLWLGPA 197 (453)
T ss_pred CCCceEEeeccccCCccHHHhhhhhhhcChhhhHHHHhccCCceeeeeecccch------------------hhheeccH
Confidence 45667889999999999999999887655555542 222221111110000 01223458
Q ss_pred cccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCcc
Q 047072 494 RFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMID 544 (570)
Q Consensus 494 RFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~ 544 (570)
+||||-|.|||.+.. .+. -++.+-++|||.||||||--||..+.
T Consensus 198 afINHDCrpnCkFvs---~g~----~tacvkvlRDIePGeEITcFYgs~fF 241 (453)
T KOG2589|consen 198 AFINHDCRPNCKFVS---TGR----DTACVKVLRDIEPGEEITCFYGSGFF 241 (453)
T ss_pred HhhcCCCCCCceeec---CCC----ceeeeehhhcCCCCceeEEeeccccc
Confidence 999999999998654 122 35789999999999999999998876
No 18
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=97.65 E-value=4.7e-05 Score=82.39 Aligned_cols=118 Identities=26% Similarity=0.338 Sum_probs=82.1
Q ss_pred ceeeEEEEEcC--CCCceEeecCccCCCCeEEEEeeeeecHHHHHHhcCCCeeEEecCCCCCCCCccCCCccccCCCCCC
Q 047072 407 IKVQLEIYKTE--ARGWGVRSLNSIAPGSFIYEFVGELLEEKEAERRTSNDKYLFNIGNNYNDGSLWGGLSNVMPDAPSS 484 (570)
Q Consensus 407 ~~~~LeVfrT~--~kGwGVrA~~~I~~GtfI~EY~GEvi~~~e~~~r~~~d~Ylf~l~~~~~~~~~~~~~s~~~iDa~~~ 484 (570)
+...|.|+.+. ..|.||.+...|++|+--+-|.|+++... . ....+..|...+-.... -..++|+..
T Consensus 26 LP~~l~i~~Ssv~~~~lgV~s~~~i~~G~~FGP~~G~~~~~~-~-~~~~n~~y~W~I~~~d~--------~~~~iDg~d- 94 (396)
T KOG2461|consen 26 LPPELRIKPSSVPVTGLGVWSNASILPGTSFGPFEGEIIASI-D-SKSANNRYMWEIFSSDN--------GYEYIDGTD- 94 (396)
T ss_pred CCCceEeeccccCCccccccccccccCcccccCccCcccccc-c-cccccCcceEEEEeCCC--------ceEEeccCC-
Confidence 55667887774 67899999999999999999999982211 1 11224455444422110 113566653
Q ss_pred CCcccCCeecccccCCC---CCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCccc
Q 047072 485 SCGVYGNVGRFVNHSCS---PNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMIDQ 545 (570)
Q Consensus 485 ~~~~~GNvaRFINHSC~---PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~~~ 545 (570)
....|+.||+|=.++ -|+.+.- .+ -.|.+.|+|+|.|+|||.+.|+.++.+
T Consensus 95 --~~~sNWmRYV~~Ar~~eeQNL~A~Q---~~-----~~Ifyrt~r~I~p~eELlVWY~~e~~~ 148 (396)
T KOG2461|consen 95 --EEHSNWMRYVNSARSEEEQNLLAFQ---IG-----ENIFYRTIRDIRPNEELLVWYGSEYAE 148 (396)
T ss_pred --hhhcceeeeecccCChhhhhHHHHh---cc-----CceEEEecccCCCCCeEEEEeccchHh
Confidence 358999999998887 4876432 11 238899999999999999999987643
No 19
>smart00508 PostSET Cysteine-rich motif following a subset of SET domains.
Probab=96.41 E-value=0.0014 Score=43.87 Aligned_cols=15 Identities=47% Similarity=1.433 Sum_probs=13.7
Q ss_pred CeEEeeCCCCccccc
Q 047072 555 KKSCFCGSSECTGWL 569 (570)
Q Consensus 555 ~~~C~CGS~~CRG~l 569 (570)
.+.|+|||++|||+|
T Consensus 2 ~~~C~CGs~~CRG~l 16 (26)
T smart00508 2 KQPCLCGAPNCRGFL 16 (26)
T ss_pred CeeeeCCCcccccee
Confidence 468999999999998
No 20
>COG3440 Predicted restriction endonuclease [Defense mechanisms]
Probab=95.90 E-value=0.0002 Score=73.70 Aligned_cols=136 Identities=12% Similarity=-0.084 Sum_probs=113.4
Q ss_pred CccCCceech-hhhhhhhcccCCccCCcceecCCCceeEEEEEeeCCcCCCCCCCCeEEEeCCCCCCCCCCCCCcccccc
Q 047072 149 GVEVGDEFQY-RVELNMIGLHLQIQGGIDYVKHEGKINATSIVASGGYDDKLDNSDVLIYTGQGGNVMNGGKEPEDQKLE 227 (570)
Q Consensus 149 Gv~vGd~f~~-r~e~~~~GlH~~~~~GI~~~~~~g~~~A~SIV~Sggy~dd~d~gd~l~YtG~gg~~~~~~~~~~dQ~l~ 227 (570)
++..+-.+-. +.+..-.+.|-|++.++.+.+..+ +.+++.+|+|+++.+.+++..|++-|++ +....+..-+.+.
T Consensus 10 sf~~~~a~~~i~~~~~~~a~~kp~l~l~v~~~~~~---~~~~~n~~~~~~e~~~~f~~l~~~~g~~-~~~~~~~~p~~~l 85 (301)
T COG3440 10 SFSQRNASLKIFGGNREAAPHKPILLLDVGRKIST---FFITENQGIYETELIEPFIQLWSFFGPK-LQKYGVDAPFELL 85 (301)
T ss_pred chhhhhhhhhhcccccccCCcCceeehhhHhhhhc---ccccccccccchhccchHHHHHhhcCcc-cccCCCCCchHHh
Confidence 4444444444 566777889999999999987767 8899999999999999999999999997 3344456667788
Q ss_pred cccHHHHhhhhhCCCceEEeCCCC---CCcceeeeeeeeEEEEEEEeecCCCCeeEEeeeeccc
Q 047072 228 RGNVALANNIHEQNPVRVIRGDTK---AFEYRTCIYDGLYLVERYWQDVGSHGKLVYKFKLARI 288 (570)
Q Consensus 228 ~gN~AL~~s~~~~~pVRViRg~~~---~~~~~~y~YDGLY~V~~~w~e~g~~G~~v~kf~L~R~ 288 (570)
+|+.++..++..+-+-+++|+... -.++..+-|-|+|.+...|.++...+.++..|++.+.
T Consensus 86 ~~d~~~h~~~k~~~~~l~~~~~~~~~e~v~~~~~d~el~~~~~~~~~~~~l~~~L~~~~~~~~~ 149 (301)
T COG3440 86 QGDGKWHLDIKEGFDGLSIRTLPTEKEFVEYHYIDDELEQSLQYHQGEKRLIDDLISIWRKEVL 149 (301)
T ss_pred hccchhhhcccccCCccccCCCccHhhhhhhhhccHHHHHHHHhhcccchhHHHHHHHHHHHHH
Confidence 999999999999999999999543 3456788888999999999999999999999888776
No 21
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=95.08 E-value=0.011 Score=46.00 Aligned_cols=27 Identities=30% Similarity=0.712 Sum_probs=23.7
Q ss_pred ccceeeeccCCcCCCCCCCCCcccccC
Q 047072 380 QAKLLVYECGPSCKCPPSCYNRVSQQG 406 (570)
Q Consensus 380 ~~~~~i~EC~~~C~C~~~C~NRv~Q~g 406 (570)
.++.+.+||+..|+|+..|.|+.+|+.
T Consensus 23 lNR~l~~EC~~~C~~G~~C~NqrFqk~ 49 (51)
T smart00570 23 LNRMLLIECSSDCPCGSYCSNQRFQKR 49 (51)
T ss_pred HHHHHhhhcCCCCCCCcCccCcccccC
Confidence 356788999888999999999999975
No 22
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=89.68 E-value=0.28 Score=53.30 Aligned_cols=43 Identities=33% Similarity=0.511 Sum_probs=33.2
Q ss_pred ccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCC-eEEEecCCCccc
Q 047072 495 FVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQ-ELTYHYSYMIDQ 545 (570)
Q Consensus 495 FINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GE-ELT~DYg~~~~~ 545 (570)
++||||.||+. +.++. ...++++...+.+++ ||+..|-...+.
T Consensus 208 ~~~hsC~pn~~---~~~~~-----~~~~~~~~~~~~~~~~~l~~~y~~~~~~ 251 (482)
T KOG2084|consen 208 LFNHSCFPNIS---VIFDG-----RGLALLVPAGIDAGEEELTISYTDPLLS 251 (482)
T ss_pred hcccCCCCCeE---EEECC-----ceeEEEeecccCCCCCEEEEeecccccC
Confidence 78999999998 33332 236788888888887 999999877643
No 23
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=74.03 E-value=2.8 Score=46.73 Aligned_cols=40 Identities=28% Similarity=0.357 Sum_probs=31.5
Q ss_pred ccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCC
Q 047072 495 FVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSY 541 (570)
Q Consensus 495 FINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~ 541 (570)
+.||+|++ ....+...|. .+-+.+.++|.+|||+++.||.
T Consensus 239 ~~NH~~~~----~~~~~~~~d~---~~~l~~~~~v~~geevfi~YG~ 278 (472)
T KOG1337|consen 239 LLNHSPEV----IKAGYNQEDE---AVELVAERDVSAGEEVFINYGP 278 (472)
T ss_pred hhccCchh----ccccccCCCC---cEEEEEeeeecCCCeEEEecCC
Confidence 57999999 2233343443 6899999999999999999996
No 24
>PF12218 End_N_terminal: N terminal extension of bacteriophage endosialidase; InterPro: IPR024429 This entry represents the N-terminal extension domain of endosialidases, which is approximately 70 amino acids in length. The two N-terminal domains (this domain and the beta propeller) assemble in the compact 'cap' whereas the C-terminal domain forms an extended tail-like structure. The very N-terminal part of the 'cap' region (residues 246 to 312) holds the only alpha-helix of the protein and is presumably the residual part of the deleted N-terminal head-binding domain.; PDB: 3JU4_A 3GVL_A 3GVK_B 3GVJ_A 1V0E_B 1V0F_E.
Probab=47.43 E-value=16 Score=29.83 Aligned_cols=50 Identities=26% Similarity=0.273 Sum_probs=26.9
Q ss_pred cHHHHhhhhhCCCceEEeCCCCCCcceeeeeeeeEEEEEEEeecCCCCeeEEeeeecccCCCCC
Q 047072 230 NVALANNIHEQNPVRVIRGDTKAFEYRTCIYDGLYLVERYWQDVGSHGKLVYKFKLARIPGQPE 293 (570)
Q Consensus 230 N~AL~~s~~~~~pVRViRg~~~~~~~~~y~YDGLY~V~~~w~e~g~~G~~v~kf~L~R~~GQp~ 293 (570)
..|+-..+...++=++|-|... -|+|.. -...+-|.--+|-.+|+||||-
T Consensus 10 t~A~~a~l~a~~~g~~IDg~Gl-----------TykVs~---lPd~srf~N~rF~~eri~gqpl 59 (67)
T PF12218_consen 10 TAAITAALEASPVGRKIDGAGL-----------TYKVSS---LPDISRFKNARFVYERIPGQPL 59 (67)
T ss_dssp HHHHHHHHHHS-TTS-EE-TT------------EEEESS------GGGEES-EEEE-SSTT--E
T ss_pred HHHHHHHHhccCCCeEEecCCc-----------eEEEee---CccHHhhccceEEEeecCCCce
Confidence 5677777887777777777422 244432 3456666667788899999985
No 25
>PF11403 Yeast_MT: Yeast metallothionein; InterPro: IPR022710 Metallothioneins are characterised by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification. For the first 40 residues in the protein, the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster. ; PDB: 1AQS_A 1AQR_A 1RJU_V 1FMY_A 1AOO_A 1AQQ_A.
Probab=46.21 E-value=13 Score=26.53 Aligned_cols=18 Identities=44% Similarity=1.328 Sum_probs=8.5
Q ss_pred CCCCccCCCCCCCCCccc
Q 047072 345 PKGCDCTNGCSKLEKCAC 362 (570)
Q Consensus 345 ~~gC~C~~~C~~~~~C~C 362 (570)
...|+|..+|....+|+|
T Consensus 21 qkscscptgcnsddkcpc 38 (40)
T PF11403_consen 21 QKSCSCPTGCNSDDKCPC 38 (40)
T ss_dssp TTS-SS-TTTTSSTT--T
T ss_pred hhcCCCCCCCCCCCcCCC
Confidence 345666666665566666
No 26
>KOG3813 consensus Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]
Probab=45.48 E-value=11 Score=42.27 Aligned_cols=42 Identities=31% Similarity=0.659 Sum_probs=27.0
Q ss_pred CCCccCCCCCCCCCcccccccCCCccccCCcceeccceeeeccCCcCCCCC-CCCCcc
Q 047072 346 KGCDCTNGCSKLEKCACVAKNGGEIPYNHNRAIVQAKLLVYECGPSCKCPP-SCYNRV 402 (570)
Q Consensus 346 ~gC~C~~~C~~~~~C~C~~~ngg~~~y~~~g~l~~~~~~i~EC~~~C~C~~-~C~NRv 402 (570)
.||+|..-|. ++.|+|.+. |..++-....|- |+|.. .|.|-+
T Consensus 308 CGCsCr~~Cd-PETCaCSqa----------GIkCQvDr~~fP----CgC~rEgCgNp~ 350 (640)
T KOG3813|consen 308 CGCSCRGVCD-PETCACSQA----------GIKCQVDRGEFP----CGCFREGCGNPE 350 (640)
T ss_pred hCCcccceeC-hhhcchhcc----------CceEeecCcccc----cccchhhcCCCc
Confidence 6999997776 579999862 332322222232 88876 799964
No 27
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine-rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 from SWISSPROT and TSO1 Q9LE32 from SWISSPROT. This motif is also referred to as a CXC domain.
Probab=38.61 E-value=24 Score=26.54 Aligned_cols=37 Identities=41% Similarity=1.044 Sum_probs=27.4
Q ss_pred CCCCccCC-CCCCCCCcccccccCCCccccCCcceeccceeeeccCCcCCCCCCCCCcc
Q 047072 345 PKGCDCTN-GCSKLEKCACVAKNGGEIPYNHNRAIVQAKLLVYECGPSCKCPPSCYNRV 402 (570)
Q Consensus 345 ~~gC~C~~-~C~~~~~C~C~~~ngg~~~y~~~g~l~~~~~~i~EC~~~C~C~~~C~NRv 402 (570)
..||.|.. .|.. .-|.|.+.. ..|++.|.| ..|.|..
T Consensus 3 ~~gC~Ckks~Clk-~YC~Cf~~g-------------------~~C~~~C~C-~~C~N~~ 40 (42)
T PF03638_consen 3 KKGCNCKKSKCLK-LYCECFQAG-------------------RFCTPNCKC-QNCKNTE 40 (42)
T ss_pred CCCCcccCcChhh-hhCHHHHCc-------------------CcCCCCccc-CCCCCcC
Confidence 46899964 5765 478887643 469999999 6888864
No 28
>KOG1025 consensus Epidermal growth factor receptor EGFR and related tyrosine kinases [Signal transduction mechanisms]
Probab=36.20 E-value=50 Score=39.74 Aligned_cols=18 Identities=22% Similarity=0.074 Sum_probs=13.5
Q ss_pred cceEEEEEeecCCCCCeE
Q 047072 518 MPHKMLFAAENISPLQEL 535 (570)
Q Consensus 518 ~prI~~FA~rdI~~GEEL 535 (570)
+..+.|-.++-|.+|.-|
T Consensus 428 itsL~lrSLKeIs~G~v~ 445 (1177)
T KOG1025|consen 428 LTSLGLRSLKEISAGAVL 445 (1177)
T ss_pred cceeccchhhhccCCcEE
Confidence 456778888888888654
No 29
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=33.74 E-value=19 Score=28.05 Aligned_cols=15 Identities=27% Similarity=0.326 Sum_probs=11.4
Q ss_pred EEEEeecCCCCCeEE
Q 047072 522 MLFAAENISPLQELT 536 (570)
Q Consensus 522 ~~FA~rdI~~GEELT 536 (570)
.+.|.+||++|+.|+
T Consensus 3 vvVA~~di~~G~~i~ 17 (63)
T PF08666_consen 3 VVVAARDIPAGTVIT 17 (63)
T ss_dssp EEEESSTB-TT-BEC
T ss_pred EEEEeCccCCCCEEc
Confidence 478999999999995
No 30
>KOG2155 consensus Tubulin-tyrosine ligase-related protein [Posttranslational modification, protein turnover, chaperones]
Probab=27.05 E-value=30 Score=38.44 Aligned_cols=50 Identities=16% Similarity=0.323 Sum_probs=39.0
Q ss_pred eecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCCeEEEecCCCc
Q 047072 492 VGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQELTYHYSYMI 543 (570)
Q Consensus 492 vaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GEELT~DYg~~~ 543 (570)
++.-+.||-+||..+.+.++-..+.. .-.+|-+++...|||+|-|+.+..
T Consensus 204 fGsrvrHsdePnf~~aPf~fmPq~va--Ysimwp~k~~~tgeE~trDfasg~ 253 (631)
T KOG2155|consen 204 FGSRVRHSDEPNFRIAPFMFMPQNVA--YSIMWPTKPVNTGEEITRDFASGV 253 (631)
T ss_pred hhhhhccCCCCcceeeeheecchhcc--eeEEeeccCCCCchHHHHHHhhcC
Confidence 34457899999999988877654433 356789999999999999987643
No 31
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=23.60 E-value=27 Score=39.21 Aligned_cols=127 Identities=13% Similarity=0.110 Sum_probs=73.9
Q ss_pred eeeccCCcCCCCCCCCCcccccCceeeEEEEEcCCCCce---EeecCccCCCCeEEEEeeeeecHHH--HHHhcC---CC
Q 047072 384 LVYECGPSCKCPPSCYNRVSQQGIKVQLEIYKTEARGWG---VRSLNSIAPGSFIYEFVGELLEEKE--AERRTS---ND 455 (570)
Q Consensus 384 ~i~EC~~~C~C~~~C~NRv~Q~g~~~~LeVfrT~~kGwG---VrA~~~I~~GtfI~EY~GEvi~~~e--~~~r~~---~d 455 (570)
.-+++++.+.|.+.+.+- . ++. .-+..+..+|+ .++...+..|+||++++|+..-..- .....- ..
T Consensus 94 vc~~ggs~v~~~s~~~~~-~-r~c----~~~~~~~c~~~~~d~~~~~~~~~~~~vw~~vg~~~~~~c~vc~~~~~~~~~~ 167 (463)
T KOG1081|consen 94 VCFKGGSLVTCKSRIQAP-H-RKC----KPAQLEKCSKRCTDCRAFKKREVGDLVWSKVGEYPWWPCMVCHDPLLPKGMK 167 (463)
T ss_pred cccCCCccceeccccccc-c-ccC----cCccCcccccCCcceeeeccccceeEEeEEcCcccccccceecCcccchhhc
Confidence 345666666666554443 1 111 11234556666 8888899999999999999875541 111100 00
Q ss_pred e--eEEecCCCCCCCCccCCCccccCCCCCCCCcccCCeecccccCCCCCceEEEEEEcCCCCccceEEEEEeecCCCCC
Q 047072 456 K--YLFNIGNNYNDGSLWGGLSNVMPDAPSSSCGVYGNVGRFVNHSCSPNLYAQNVLYDHEDKRMPHKMLFAAENISPLQ 533 (570)
Q Consensus 456 ~--Ylf~l~~~~~~~~~~~~~s~~~iDa~~~~~~~~GNvaRFINHSC~PN~~~~~V~~~~~d~~~prI~~FA~rdI~~GE 533 (570)
. -.|... ..|.. ...++. ..|+..++++|++.|+-....+..... +++..++.+-++-++
T Consensus 168 ~~~~~f~~~------~~~~~---~~~~~~-----~~g~~~~~l~~~~~~~s~~~~~~~~~~----~r~~~~~~q~~~~~~ 229 (463)
T KOG1081|consen 168 HDHVNFFGC------YAWTH---EKRVFP-----YEGQSSKLIPHSKKPASTMSEKIKEAK----ARFGKLKAQWEAGIK 229 (463)
T ss_pred cccceeccc------hhhHH---Hhhhhh-----ccchHHHhhhhccccchhhhhhhhccc----chhhhcccchhhccc
Confidence 0 011111 11111 112221 289999999999999988877766533 567777877777666
Q ss_pred e
Q 047072 534 E 534 (570)
Q Consensus 534 E 534 (570)
-
T Consensus 230 ~ 230 (463)
T KOG1081|consen 230 Q 230 (463)
T ss_pred h
Confidence 5
Done!