Query 002895
Match_columns 869
No_of_seqs 381 out of 1510
Neff 4.7
Searched_HMMs 46136
Date Thu Mar 28 12:57:30 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/002895.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/002895hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1079 Transcriptional repres 100.0 5E-126 1E-130 1067.5 29.0 687 22-865 37-739 (739)
2 KOG4442 Clathrin coat binding 100.0 2.2E-42 4.8E-47 393.5 15.0 188 615-851 64-256 (729)
3 KOG1080 Histone H3 (Lys4) meth 100.0 3.1E-30 6.7E-35 309.9 11.4 134 719-852 865-1002(1005)
4 KOG1082 Histone H3 (Lys9) meth 99.9 1.4E-27 3.1E-32 264.3 12.4 141 688-838 154-323 (364)
5 smart00317 SET SET (Su(var)3-9 99.9 1.4E-23 3E-28 190.4 12.1 113 722-834 2-116 (116)
6 KOG1083 Putative transcription 99.9 3.6E-24 7.9E-29 251.2 5.9 132 708-839 1165-1298(1306)
7 KOG1085 Predicted methyltransf 99.7 1.8E-18 4E-23 182.1 8.9 123 715-837 251-379 (392)
8 KOG1141 Predicted histone meth 99.7 1.4E-18 3.1E-23 200.2 4.7 73 781-853 1179-1260(1262)
9 COG2940 Proteins containing SE 99.5 1.4E-15 2.9E-20 174.6 2.6 125 716-840 328-454 (480)
10 PF00856 SET: SET domain; Int 99.5 3.7E-14 8E-19 133.5 5.8 105 731-835 1-162 (162)
11 KOG1081 Transcription factor N 98.9 5E-10 1.1E-14 128.6 1.6 115 707-837 301-417 (463)
12 KOG2589 Histone tail methylase 98.6 4.8E-08 1E-12 107.2 4.5 112 730-847 137-252 (453)
13 KOG2461 Transcription factor B 98.1 2.8E-06 6.1E-11 96.3 5.7 109 718-837 26-145 (396)
14 cd00167 SANT 'SWI3, ADA2, N-Co 93.6 0.17 3.7E-06 38.6 5.2 41 499-541 1-42 (45)
15 smart00717 SANT SANT SWI3, AD 93.5 0.19 4E-06 38.9 5.4 43 498-542 2-45 (49)
16 PF00249 Myb_DNA-binding: Myb- 92.7 0.16 3.5E-06 40.9 4.1 46 174-221 1-48 (48)
17 PF13921 Myb_DNA-bind_6: Myb-l 92.3 0.14 3E-06 42.8 3.2 43 177-222 1-45 (60)
18 smart00717 SANT SANT SWI3, AD 92.1 0.14 3E-06 39.6 2.9 46 175-222 2-48 (49)
19 smart00570 AWS associated with 90.9 0.082 1.8E-06 44.3 0.4 29 672-718 22-50 (51)
20 cd00167 SANT 'SWI3, ADA2, N-Co 88.4 0.43 9.4E-06 36.3 2.8 43 176-220 1-44 (45)
21 PF03638 TCR: Tesmin/TSO1-like 84.5 0.5 1.1E-05 38.2 1.2 31 656-687 2-32 (42)
22 KOG1171 Metallothionein-like p 84.2 0.23 4.9E-06 57.1 -1.1 65 616-687 131-246 (406)
23 PF09111 SLIDE: SLIDE; InterP 81.7 1.3 2.8E-05 43.1 3.1 50 173-222 48-111 (118)
24 PF00249 Myb_DNA-binding: Myb- 79.2 5.4 0.00012 32.1 5.5 43 498-541 2-45 (48)
25 KOG1337 N-methyltransferase [G 78.0 1.6 3.4E-05 51.2 2.9 40 794-836 239-278 (472)
26 PF13921 Myb_DNA-bind_6: Myb-l 75.2 6.3 0.00014 32.9 5.0 41 500-542 1-41 (60)
27 KOG2084 Predicted histone tail 73.0 4 8.8E-05 46.3 4.4 38 794-835 208-246 (482)
28 PF05033 Pre-SET: Pre-SET moti 70.9 3.2 7E-05 38.3 2.5 46 656-712 45-103 (103)
29 TIGR01557 myb_SHAQKYF myb-like 64.5 17 0.00036 31.2 5.3 44 498-542 4-52 (57)
30 PLN03212 Transcription repress 63.2 8.1 0.00018 42.1 3.9 52 169-223 73-125 (249)
31 KOG1141 Predicted histone meth 62.6 14 0.0003 46.1 6.1 55 707-761 992-1054(1262)
32 PF05033 Pre-SET: Pre-SET moti 59.0 5.3 0.00012 36.9 1.5 37 615-656 45-103 (103)
33 PF03638 TCR: Tesmin/TSO1-like 54.2 6.3 0.00014 32.0 1.0 37 615-657 2-40 (42)
34 smart00570 AWS associated with 52.2 4.7 0.0001 34.0 -0.0 14 649-662 18-31 (51)
35 COG5259 RSC8 RSC chromatin rem 51.0 14 0.00031 43.4 3.6 38 496-535 278-315 (531)
36 PLN03091 hypothetical protein; 48.8 21 0.00045 42.0 4.4 53 168-223 61-114 (459)
37 PF14774 FAM177: FAM177 family 47.3 31 0.00066 34.1 4.7 66 143-211 18-97 (123)
38 PLN03212 Transcription repress 43.7 21 0.00046 39.0 3.3 46 174-221 25-72 (249)
39 KOG1081 Transcription factor N 41.2 8.5 0.00018 45.5 -0.2 100 734-835 130-242 (463)
40 KOG1079 Transcriptional repres 40.4 14 0.0003 45.3 1.4 29 27-55 18-51 (739)
41 KOG3813 Uncharacterized conser 36.2 16 0.00035 43.4 1.0 25 659-684 309-333 (640)
42 KOG1082 Histone H3 (Lys9) meth 30.7 32 0.0007 39.3 2.3 42 612-658 103-170 (364)
43 PRK09430 djlA Dna-J like membr 28.4 61 0.0013 35.7 3.9 49 177-225 146-229 (267)
44 PF00856 SET: SET domain; Int 28.1 32 0.0007 32.1 1.5 17 816-832 2-18 (162)
45 PF08271 TF_Zn_Ribbon: TFIIB z 27.5 71 0.0015 25.4 3.1 33 144-177 7-43 (43)
46 KOG0457 Histone acetyltransfer 27.5 1.4E+02 0.0029 35.4 6.5 41 496-538 71-115 (438)
47 KOG4167 Predicted DNA-binding 27.4 71 0.0015 39.8 4.4 40 498-539 620-659 (907)
48 PF08666 SAF: SAF domain; Int 24.8 41 0.00088 27.9 1.3 14 818-831 4-17 (63)
49 TIGR02726 phenyl_P_delta pheny 22.7 65 0.0014 33.0 2.6 49 147-195 22-74 (169)
50 KOG4289 Cadherin EGF LAG seven 20.9 1E+02 0.0022 41.4 4.1 15 731-745 1872-1886(2531)
51 PF14100 PmoA: Methane oxygena 20.4 97 0.0021 34.1 3.5 43 793-836 204-252 (271)
No 1
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=100.00 E-value=5.1e-126 Score=1067.48 Aligned_cols=687 Identities=36% Similarity=0.574 Sum_probs=540.0
Q ss_pred CcccchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhHHHHHHHhhhhccccchhhhccCCCC----Cc---CCccccCCCC
Q 002895 22 DGLGNLTYKLNQLKKQVQAERVVSVKDKIEKNRKKIENDISQLLSTTSRKSVIFAMDNGFG----NM---PLCKYSGFPQ 94 (869)
Q Consensus 22 ~~~~~L~~~i~~LKkqi~~eR~~~ik~k~e~N~~~l~~~~~~l~~~~~~~~r~~~~~~~~~----~~---~l~~~~g~~~ 94 (869)
+.++.+...+..+|+ ++..++.+++++-..++.+...+|+-+- +++.+ ........+ .| |++++||+.+
T Consensus 37 ~~~e~i~~~~~E~k~-~~~~~~~~~~~~~~~~r~k~~~~~~~~~-~~~~~--~~i~~~n~~~~v~~~~~~~~~q~nfmv~ 112 (739)
T KOG1079|consen 37 DRLEKIKILNCEWKK-RRLKPVRSAKEVDGDIRVKVDLDTSIFD-FPSQK--SPINELNAVAQVPIMYSWPPLQQNFMVE 112 (739)
T ss_pred HHHHHHHHHHHHHhh-hhcccccccccccccccccccccccccc-Ccccc--cchhhhcccccccccccCChhhhcceec
Confidence 456666666666666 7788888888888888888888888774 55552 222222222 22 9999999999
Q ss_pred CCCCCCcccccccccccccccccCCCCCCCceeEEeeccccccccccccccceeeEeCCCCeEEEecCCccccCCCcccc
Q 002895 95 GLGDRDYVNSHEVVLSTSSKLSHVQKIPPYTTWIFLDKNQRMAEDQSVVGRRRIYYDQHGSEALVCSDSEEDIIEPEEEK 174 (869)
Q Consensus 95 ~~~d~d~~~~~~v~~~~~iklp~v~klPpYTtWifldrNqrMaedqsvvgrrriYYD~~g~EalicSdseee~~e~eeek 174 (869)
+..+.+++..-++. +..||+|++|.|+|||+|||+||||||++||+|||+|+||| |.|||++| ||+||| ++++|||
T Consensus 113 ~~~~~~~ip~~~~~-v~~~k~~~ieel~~y~~~v~~dr~~~~~~d~v~ve~~~a~~-Q~~~e~dg-~D~~~e-~~~~~ek 188 (739)
T KOG1079|consen 113 DETVLHNIPYMGDE-VLDIKGPFIEELIKYDGKVHGDRNQRFMEDQVFVELVVALY-QYGGEHDG-SDDEEE-EVLEEEK 188 (739)
T ss_pred ccceeccccccccc-ccccccchhhhcccccceeeccccccchhhhhHHHHHHHHH-hcCCcccc-CCCccc-cchhhhc
Confidence 99999988877754 67899999999999999999999999999999999999999 99999999 999999 8889999
Q ss_pred ccCCcccch-hhhhHhhhcCChHHHHHHHHHHhc--CCcHHHHHHHHHhHhhcCCCCCccccccccccccchhhh-hhHH
Q 002895 175 HEFSDGEDR-ILWTVFEEHGLGEEVINAVSQFIG--IATSEVQDRYSTLKEKYDGKNLKEFEDAGHERGIALEKS-LSAA 250 (869)
Q Consensus 175 ~~f~~~ed~-~~~~~~~e~g~sd~v~~~l~~~~~--~~~sei~eRy~~L~~k~~~~~~~~~~~~~~~~~~~~~k~-l~~a 250 (869)
++|.+++|. ++|++.+.++++++||.+|+++|. ++++||+|||.+|+++..+...+...+.+++. +.+++. ++++
T Consensus 189 r~~~e~~~~~~~~~~~~~~~~~~~if~~~~~~f~~k~~~~~lke~~~~l~~~~~p~~~e~~~~~~id~-~~ae~~~r~~~ 267 (739)
T KOG1079|consen 189 RDFLEGEDDDIIESINKLSFPADKIFQAISSMFPDKLTASELKERYGELTSKSLPVAEEPECTPNIDG-SSAEPVQREQA 267 (739)
T ss_pred ccccCcccchhhHhhhhhccchHHHHHHHhhhcccccchhhhhHHHhhhhhccccccCCcccccCCCc-cccChHHHHhh
Confidence 999999999 899999999999999999999995 99999999999999986655555555444544 445555 9999
Q ss_pred hhcccccccccccccccCCcCcCCCCCCCCCCCCCCCCCCCCCCCCccchhhhhccccccccCcCCCccccccccccccc
Q 002895 251 LDSFDNLFCRRCLLFDCRLHGCSQTLINPSEKQPYWSEYEDDRKPCSNHCYLQSRAVQDTVEGSAGNISSIITNTEGTLL 330 (869)
Q Consensus 251 ldsFdnlFCRRClvfDC~lHgcsq~li~~~ekq~~~~~~~~d~~PCg~~Cyl~~~~~~~~~~~s~~~~~~~~~~~~~~~~ 330 (869)
|||||||||||||+|||+||| +|.++||.++.-.|.++..+..|||+.||.++.+......... .+
T Consensus 268 l~sF~tlfCrrCl~ydC~lHg-~~~~~~pn~~~r~e~~~a~~~~pc~p~~~~~l~~~~~~~m~~~---------~~---- 333 (739)
T KOG1079|consen 268 LHSFHTLFCRRCLKYDCFLHG-SQFHAFPNTKKRKEDEPALENEPCGPGCYGLLEGAKEKTMSAV---------VS---- 333 (739)
T ss_pred hcccccceeeeeeeeeccccC-ccccccccccccCCCCccccccCCCCchhhhhhccchhhhhcc---------cc----
Confidence 999999999999999999999 9999999999999999999999999999999866543311000 00
Q ss_pred cCCCCCCCCCCccccccccCcccccccccccccchhhhcCCCCCccchhhhccccccccccccchhhHHHHHHHHhhhhc
Q 002895 331 HCNAEVPGAHSDIMAGERCNSKRVLPVTSEAVDSSEVAIGNENTDTSMQSLGKRKALELNDSVKVFDEIEESLNKKQKKL 410 (869)
Q Consensus 331 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ss~~~~k~~~~~~~~s~~~~~~~~~~~~~~~~~~ 410 (869)
..+.+. + .++|+..
T Consensus 334 ----~~~p~~-----g---------------------------------------------------------~~~qk~~ 347 (739)
T KOG1079|consen 334 ----KCPPIR-----G---------------------------------------------------------DIRQKLV 347 (739)
T ss_pred ----cCCCCc-----c---------------------------------------------------------hhhhhhc
Confidence 000000 0 0112211
Q ss_pred cCccccccCCCCCCCCCCCCCCcccccccccccccccc--ccccccccccccccccccccccCCccCCCCcccccCCCCC
Q 002895 411 LPLDVLTASSDGIPRPDTKSGHHVGAINDNELQMTSKN--TIKKSVSAKVVSHNNIEHNIMDGAKDVNKEPEMKQSFSKG 488 (869)
Q Consensus 411 ~~~~~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 488 (869)
-.+++++ ..+.. ..++....+...+..+. ..+.. ...++.........+.+.......
T Consensus 348 ~~~~~~s-------~~~~~--~~e~~g~~~d~~v~~~~~~~~~~v---------~~~~~~~~s~~~~~c~~~~~~~~~-- 407 (739)
T KOG1079|consen 348 KASSMDS-------DDEHV--EEEDKGHDDDDGVPRGFGGSVNFV---------GEDDTSTHSSTNSICQNPVHGKKD-- 407 (739)
T ss_pred ccccCCc-------chhhc--cccccCcccccccccccccccccc---------cCCcccccccccccccCcccccCC--
Confidence 1111111 00000 00000000001110000 00000 001111111112222222111110
Q ss_pred CCccccccCCCCcHHHHHHHHHhHHhcCCchHHHHHhhcCCCCchHHHHHHHhhcCCCCCCCCCCCCccccccccccchh
Q 002895 489 ELPEGVLCSSEWKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCMEVSTYMRDSSSSMPHKSVAPSSFLEETVKVDTDY 568 (869)
Q Consensus 489 ~~~~~~~~~~~W~~~E~~l~~k~~~ifg~NsC~iAr~ll~g~KtC~eV~~ym~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 568 (869)
...+|+++|+.||++|+.+||.|+|+|||+|+ +|||++||+||..+..... +.. +. ..
T Consensus 408 -------~~~ew~~~ek~~fr~~~~~~~~n~c~Iar~l~--~ktC~~v~~~~~~e~~~~~--------~~~--~~--~~- 465 (739)
T KOG1079|consen 408 -------TNVEWNGAEKVLFRVGSTLYGTNRCSIARNLL--TKTCRQVYEYEQKEVLQGL--------YFD--GR--FR- 465 (739)
T ss_pred -------cccccchhhhHHHHhccccccchhhHHHHHhc--chHHHHHHHHhhcchhhce--------ecc--cc--cc-
Confidence 24689999999999999999999999999995 5999999999997653211 110 00 00
Q ss_pred hhhcCCCchhHHhhhccccccccccCCCCCcchhhhhccCCCCCCcCcccCCCCCC--CCCCCcccCCCccccCCCcccc
Q 002895 569 AEQEMPARPRLLRRRGRARKLKYSWKSAGHPSIWKRIADGKNQSCKQYTPCGCQSM--CGKQCPCLHNGTCCEKYCGYSF 646 (869)
Q Consensus 569 ~~~~~~~r~r~~r~~gr~rklk~~~~s~~~p~~~kri~~~k~~~~~~y~PC~c~~~--C~~~C~C~~~g~~Cek~Cg~~~ 646 (869)
.....+.|+|.+|+.|+.|++.+.|+++.|+.+|. |+||+|+++ |+.+|+|+.++++||+||+
T Consensus 466 ~~~~~~~r~~~~r~~g~~r~k~q~kk~~~~~~v~~------------~qpC~hp~~c~c~~~C~C~~n~~~CEk~C~--- 530 (739)
T KOG1079|consen 466 VELPGPKRARKLRLWGRHRRKIQNKKDSRHTVVWN------------YQPCDHPGPCNCGVGCPCIDNETFCEKFCY--- 530 (739)
T ss_pred cccCcchhhHHHHhhhhHHHhhhcccccCCceeee------------cCcccCCCCCCCCCCCcccccCcchhhccc---
Confidence 11234556888999999999999999999877765 666666644 4689999999999999999
Q ss_pred ccCchhhhhccCCcccCCCCccCCCcccccccCccCcccCcCccCCCCCCCCCCCCCCCCC-CCCchHHhhcccCcEEEE
Q 002895 647 LRCSKSCKNRFRGCHCAKSQCRSRQCPCFAAGRECDPDVCRNCWVSCGDGSLGEPPKRGDG-QCGNMRLLLRQQQRILLA 725 (869)
Q Consensus 647 ~~C~~~C~nRf~GC~C~~~~C~t~~CpC~~a~rECdPd~C~~C~~sCg~g~l~~p~~~~~~-~C~Nr~lqrg~~k~l~V~ 725 (869)
|+.+|.|||+||+| +++|++.+|||+++.|||||++|..||.. +..+++. +|+|+.+|++++++++|+
T Consensus 531 --C~~dC~nrF~GC~C-k~QC~tkqCpC~~A~rECdPd~Cl~cg~~--------~~~d~~~~~C~N~~l~~~~qkr~lla 599 (739)
T KOG1079|consen 531 --CSPDCRNRFPGCRC-KAQCNTKQCPCYLAVRECDPDVCLMCGNV--------DHFDSSKISCKNTNLQRGEQKRVLLA 599 (739)
T ss_pred --CCHHHHhcCCCCCc-ccccccCcCchhhhccccCchHHhccCcc--------cccccCccccccchhhhhhhcceeec
Confidence 99999999999999 99999999999999999999999999851 2333444 999999999999999999
Q ss_pred EcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccCCcccccCCCcEEEeccccCCccccccCCCCCCcce
Q 002895 726 KSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANSSFLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFA 805 (869)
Q Consensus 726 kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~~sYlf~L~~~~~IDAtr~GN~aRFINHSC~PNc~~ 805 (869)
.|.+.|||||+++.+.|++||.||+||+|+++||++|+++|+..+.+|+|+|+.+++|||+++||.+||+|||-+|||++
T Consensus 600 pSdVaGwGlFlKe~v~KnefisEY~GE~IS~dEADrRGkiYDr~~cSflFnln~dyviDs~rkGnk~rFANHS~nPNCYA 679 (739)
T KOG1079|consen 600 PSDVAGWGLFLKESVSKNEFISEYTGEIISHDEADRRGKIYDRYMCSFLFNLNNDYVIDSTRKGNKIRFANHSFNPNCYA 679 (739)
T ss_pred hhhccccceeeccccCCCceeeeecceeccchhhhhcccccccccceeeeeccccceEeeeeecchhhhccCCCCCCcEE
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred eEEEECCeeEEEEEEecCCCCCCeEEEecCCCCCCCCcccCCCCCCCCCCCCCccccccc
Q 002895 806 KVMLVAGDHRVGIFAKEHIEASEELFYDYRYGPDQAPAWARKPEGSKREDSSVSQGRAKK 865 (869)
Q Consensus 806 ~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~~d~~pcwC~~p~~~k~de~~~s~gra~k 865 (869)
.+++|+|++||||||+|+|.+||||||||+|++++++.|-+.+.++++++....+.+++|
T Consensus 680 kvm~V~GdhRIGifAkRaIeagEELffDYrYs~~~~~k~~~~~~~s~k~e~~~~q~~~~~ 739 (739)
T KOG1079|consen 680 KVMMVAGDHRIGIFAKRAIEAGEELFFDYRYSPEHALKFVGIERESYKVELKIFQATQQK 739 (739)
T ss_pred EEEEecCCcceeeeehhhcccCceeeeeeccCccccccccccCccccccchhhhhhhcCC
Confidence 999999999999999999999999999999999999999999999999998888887765
No 2
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=100.00 E-value=2.2e-42 Score=393.46 Aligned_cols=188 Identities=30% Similarity=0.595 Sum_probs=168.2
Q ss_pred CcccCCCCCCCCCCCcccCCCccccCCCccccccCchhhhhccCCcccCCCCccCCCcccccccCccCcccCcCccCCCC
Q 002895 615 QYTPCGCQSMCGKQCPCLHNGTCCEKYCGYSFLRCSKSCKNRFRGCHCAKSQCRSRQCPCFAAGRECDPDVCRNCWVSCG 694 (869)
Q Consensus 615 ~y~PC~c~~~C~~~C~C~~~g~~Cek~Cg~~~~~C~~~C~nRf~GC~C~~~~C~t~~CpC~~a~rECdPd~C~~C~~sCg 694 (869)
..+-|+|..--+. --...|. |+.+|.||+. +.||.++.|..|+
T Consensus 64 ~~m~Cdc~~~~~d---------~~n~~~~-----cg~~CiNr~t-------------------~iECs~~~C~~cg---- 106 (729)
T KOG4442|consen 64 DEMICDCKPKTGD---------GANGACA-----CGEDCINRMT-------------------SIECSDRECPRCG---- 106 (729)
T ss_pred cceeeeccccccc---------ccccccc-----cCccccchhh-------------------hcccCCccCCCcc----
Confidence 5667777642211 1235677 8888888875 6777888888764
Q ss_pred CCCCCCCCCCCCCCCCchHHhhcccCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccC--Cc
Q 002895 695 DGSLGEPPKRGDGQCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN--SS 772 (869)
Q Consensus 695 ~g~l~~p~~~~~~~C~Nr~lqrg~~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~--~s 772 (869)
..|.|++||+.+..+|+|+.+..+||||+|.++|++|+||+||+||||+..|+.+|.+.|+..+ ++
T Consensus 107 ------------~~C~NQRFQkkqyA~vevF~Te~KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~d~~kh~ 174 (729)
T KOG4442|consen 107 ------------VYCKNQRFQKKQYAKVEVFLTEKKGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAKDGIKHY 174 (729)
T ss_pred ------------ccccchhhhhhccCceeEEEecCcccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHhcCCceE
Confidence 4899999999999999999999999999999999999999999999999999999999999875 57
Q ss_pred ccccCCCcEEEeccccCCccccccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecC---CCCCCCCcccCCCC
Q 002895 773 FLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYR---YGPDQAPAWARKPE 849 (869)
Q Consensus 773 Ylf~L~~~~~IDAtr~GN~aRFINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYg---y~~d~~pcwC~~p~ 849 (869)
|+|.|....+||||.+||++|||||||+|||+++.|+|.|..||||||.|.|.+||||||||+ ||.+.++|+||.++
T Consensus 175 Yfm~L~~~e~IDAT~KGnlaRFiNHSC~PNa~~~KWtV~~~lRvGiFakk~I~~GEEITFDYqf~rYGr~AQ~CyCgean 254 (729)
T KOG4442|consen 175 YFMALQGGEYIDATKKGNLARFINHSCDPNAEVQKWTVPDELRVGIFAKKVIKPGEEITFDYQFDRYGRDAQPCYCGEAN 254 (729)
T ss_pred EEEEecCCceecccccCcHHHhhcCCCCCCceeeeeeeCCeeEEEEeEecccCCCceeeEecccccccccccccccCCcc
Confidence 888999999999999999999999999999999999999999999999999999999999996 78899999999999
Q ss_pred CC
Q 002895 850 GS 851 (869)
Q Consensus 850 ~~ 851 (869)
|+
T Consensus 255 C~ 256 (729)
T KOG4442|consen 255 CR 256 (729)
T ss_pred cc
Confidence 88
No 3
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=99.96 E-value=3.1e-30 Score=309.87 Aligned_cols=134 Identities=40% Similarity=0.755 Sum_probs=126.9
Q ss_pred cCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccC--CcccccCCCcEEEeccccCCcccccc
Q 002895 719 QQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN--SSFLFDLNDQYVLDAYRKGDKLKFAN 796 (869)
Q Consensus 719 ~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~--~sYlf~L~~~~~IDAtr~GN~aRFIN 796 (869)
++.|..+++.+|||||||+++|.+|++|+||+||+|.+.-|+.|+..|...+ .+|+|.++++.+|||+.+||+|||||
T Consensus 865 kk~~~F~~s~iH~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~~Y~~~gi~~sYlfrid~~~ViDAtk~gniAr~In 944 (1005)
T KOG1080|consen 865 KKYVKFGRSGIHGWGLFAMENIAAGDMVIEYRGELVRSSIADLREARYERMGIGDSYLFRIDDEVVVDATKKGNIARFIN 944 (1005)
T ss_pred hhhhccccccccccceeeccCccccceEEEeeceehhhhHHHHHHHHHhccCcccceeeecccceEEeccccCchhheee
Confidence 3458899999999999999999999999999999999999999999999875 79999999999999999999999999
Q ss_pred CCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCCCC--CCCCcccCCCCCCC
Q 002895 797 HSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGP--DQAPAWARKPEGSK 852 (869)
Q Consensus 797 HSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~~--d~~pcwC~~p~~~k 852 (869)
|||+|||+++++.|+|+.+|+|||.|+|.+||||||||.|.. +..||+||.|+|++
T Consensus 945 HsC~PNCyakvi~V~g~~~IvIyakr~I~~~EElTYDYkF~~e~~kipClCgap~Crg 1002 (1005)
T KOG1080|consen 945 HSCNPNCYAKVITVEGDKRIVIYSKRDIAAGEELTYDYKFPTEDDKIPCLCGAPNCRG 1002 (1005)
T ss_pred cccCCCceeeEEEecCeeEEEEEEecccccCceeeeeccccccccccccccCCCcccc
Confidence 999999999999999999999999999999999999999855 45799999999985
No 4
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=99.94 E-value=1.4e-27 Score=264.33 Aligned_cols=141 Identities=26% Similarity=0.471 Sum_probs=118.2
Q ss_pred CccCCCCCCCCCCCCCCCCCCCCchHHhhcccCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhc
Q 002895 688 NCWVSCGDGSLGEPPKRGDGQCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYD 767 (869)
Q Consensus 688 ~C~~sCg~g~l~~p~~~~~~~C~Nr~lqrg~~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd 767 (869)
+|+..|+|+ ..|.|+.+|.+.+.+++|++++.+||||++.+.|++|+||+||+||+++..++++|...++
T Consensus 154 EC~~~C~C~----------~~C~nRv~q~g~~~~leIfrt~~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~~~~~ 223 (364)
T KOG1082|consen 154 ECSVACGCH----------PDCANRVVQKGLQFHLEVFRTPEKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRTHLRE 223 (364)
T ss_pred ccccCCCCC----------CcCcchhhccccccceEEEecCCceeeecccccccCCCeeEEEeeEecChHHhhhcccccc
Confidence 566666653 6999999999999999999999999999999999999999999999999999998843322
Q ss_pred cc----CCcccc---------------------cCCCcEEEeccccCCccccccCCCCCCcceeEEEECCe----eEEEE
Q 002895 768 RA----NSSFLF---------------------DLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGD----HRVGI 818 (869)
Q Consensus 768 ~~----~~sYlf---------------------~L~~~~~IDAtr~GN~aRFINHSC~PNc~~~~v~v~G~----~RI~i 818 (869)
.. +..+.+ .....+.|||...||++|||||||.||+.+..+..++. .+|+|
T Consensus 224 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ida~~~GNv~RfinHSC~PN~~~~~v~~~~~~~~~~~i~f 303 (364)
T KOG1082|consen 224 YLDDDCDAYSIADREWVDESPVGNTFVAPSLPGGPGRELLIDAKPHGNVARFINHSCSPNLLYQAVFQDEFVLLYLRIGF 303 (364)
T ss_pred ccccccccchhhhccccccccccccccccccccCCCcceEEchhhcccccccccCCCCccceeeeeeecCCccchheeee
Confidence 21 111111 11345899999999999999999999999988887643 69999
Q ss_pred EEecCCCCCCeEEEecCCCC
Q 002895 819 FAKEHIEASEELFYDYRYGP 838 (869)
Q Consensus 819 fA~RDI~aGEELTfDYgy~~ 838 (869)
||+++|.||||||||||...
T Consensus 304 fa~~~I~p~~ELT~dYg~~~ 323 (364)
T KOG1082|consen 304 FALRDISPGEELTLDYGKAY 323 (364)
T ss_pred eeccccCCCcccchhhcccc
Confidence 99999999999999999663
No 5
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.90 E-value=1.4e-23 Score=190.42 Aligned_cols=113 Identities=41% Similarity=0.736 Sum_probs=102.7
Q ss_pred EEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccC--CcccccCCCcEEEeccccCCccccccCCC
Q 002895 722 ILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN--SSFLFDLNDQYVLDAYRKGDKLKFANHSS 799 (869)
Q Consensus 722 l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~--~sYlf~L~~~~~IDAtr~GN~aRFINHSC 799 (869)
+++..++.+|+||||+.+|++|++|++|.|.++...++..+...|.... ..|+|.+...++||+...||++|||||||
T Consensus 2 ~~~~~~~~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~id~~~~~~~~~~iNHsc 81 (116)
T smart00317 2 LEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERSKAYDTDGADSFYLFEIDSDLCIDARRKGNIARFINHSC 81 (116)
T ss_pred cEEEecCCCcEEEEECCccCCCCEEEEEEeEEECHHHHHHHHHHHHhcCCCCEEEEECCCCEEEeCCccCcHHHeeCCCC
Confidence 5677788999999999999999999999999999999888765555554 38889888889999999999999999999
Q ss_pred CCCcceeEEEECCeeEEEEEEecCCCCCCeEEEec
Q 002895 800 NPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDY 834 (869)
Q Consensus 800 ~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDY 834 (869)
.|||.+..+..++..+|.|+|+|||++|||||+||
T Consensus 82 ~pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~i~Y 116 (116)
T smart00317 82 EPNCELLFVEVNGDSRIVIFALRDIKPGEELTIDY 116 (116)
T ss_pred CCCEEEEEEEECCCcEEEEEECCCcCCCCEEeecC
Confidence 99999999888888899999999999999999999
No 6
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=99.89 E-value=3.6e-24 Score=251.17 Aligned_cols=132 Identities=29% Similarity=0.572 Sum_probs=123.5
Q ss_pred CCCchHHhh-cccCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHh-hhhhcccCCcccccCCCcEEEec
Q 002895 708 QCGNMRLLL-RQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKR-GKIYDRANSSFLFDLNDQYVLDA 785 (869)
Q Consensus 708 ~C~Nr~lqr-g~~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR-~k~yd~~~~sYlf~L~~~~~IDA 785 (869)
.|.|+++++ +...+|.+++.+.+||||.|.++|++|+||+||+||||+..+++.| ...|.....+|+..+..+.+||+
T Consensus 1165 ~c~nqrm~r~e~cp~L~v~~gp~~G~~v~tk~PikagtfI~EYvGeVit~ke~e~~mmtl~~~d~~~~cL~I~p~l~id~ 1244 (1306)
T KOG1083|consen 1165 SCSNQRMQRHEECPPLEVFRGPKKGWGVRTKEPIKAGTFIMEYVGEVITEKEFEPRMMTLYHNDDDHYCLVIDPGLFIDI 1244 (1306)
T ss_pred hhhhHHhhhhccCCCcceeccCCCCccccccccccccchHHHHHHHHHHHHhhcccccccCCCCCcccccccCccccCCh
Confidence 488888886 4677899999999999999999999999999999999999999988 56788888999999999999999
Q ss_pred cccCCccccccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCCCCC
Q 002895 786 YRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGPD 839 (869)
Q Consensus 786 tr~GN~aRFINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~~d 839 (869)
.++||.+||+||+|.|||.++.|.|+|..||++||+|||.+||||||||++...
T Consensus 1245 ~R~~n~~RfinhscKPNc~~qkwSVNG~~Rv~L~A~rDi~kGEELtYDYN~ks~ 1298 (1306)
T KOG1083|consen 1245 PRMGNGARFINHSCKPNCEMQKWSVNGEYRVGLFALRDLPKGEELTYDYNFKSF 1298 (1306)
T ss_pred hhccccccccccccCCCCccccccccceeeeeeeecCCCCCCceEEEecccccc
Confidence 999999999999999999999999999999999999999999999999976443
No 7
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=99.75 E-value=1.8e-18 Score=182.14 Aligned_cols=123 Identities=28% Similarity=0.417 Sum_probs=107.9
Q ss_pred hhcccCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccCC----cccc-cCCCcEEEecccc-
Q 002895 715 LLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANS----SFLF-DLNDQYVLDAYRK- 788 (869)
Q Consensus 715 qrg~~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~~----sYlf-~L~~~~~IDAtr~- 788 (869)
..+....+.+..-.++|.||+|+..+.+|+||.||+|.+|.-.||..|+..|..... .|+| .++..|+|||++-
T Consensus 251 l~g~~egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdliei~eAk~rE~~Ya~De~~GcYMYyF~h~sk~yCiDAT~et 330 (392)
T KOG1085|consen 251 LKGTNEGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLIEISEAKVREEQYANDEEIGCYMYYFEHNSKKYCIDATKET 330 (392)
T ss_pred HhccccceeEEeeccccceeEeecccccCceEEEEecceeeechHHHHHHHhccCcccceEEEeeeccCeeeeeeccccc
Confidence 345556677777778999999999999999999999999999999999999976532 4555 5567899999975
Q ss_pred CCccccccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCCC
Q 002895 789 GDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYG 837 (869)
Q Consensus 789 GN~aRFINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~ 837 (869)
+-++|.||||--+||.++++.++|.+++.++|.|||.+||||+||||-.
T Consensus 331 ~~lGRLINHS~~gNl~TKvv~Idg~pHLiLvA~rdIa~GEELlYDYGDR 379 (392)
T KOG1085|consen 331 PWLGRLINHSVRGNLKTKVVEIDGSPHLILVARRDIAQGEELLYDYGDR 379 (392)
T ss_pred ccchhhhcccccCcceeeEEEecCCceEEEEeccccccchhhhhhcccc
Confidence 5578999999999999999999999999999999999999999999854
No 8
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=99.73 E-value=1.4e-18 Score=200.24 Aligned_cols=73 Identities=30% Similarity=0.517 Sum_probs=65.8
Q ss_pred EEEeccccCCccccccCCCCCCcceeEEEECCe----eEEEEEEecCCCCCCeEEEecCCCCCC-----CCcccCCCCCC
Q 002895 781 YVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGD----HRVGIFAKEHIEASEELFYDYRYGPDQ-----APAWARKPEGS 851 (869)
Q Consensus 781 ~~IDAtr~GN~aRFINHSC~PNc~~~~v~v~G~----~RI~ifA~RDI~aGEELTfDYgy~~d~-----~pcwC~~p~~~ 851 (869)
|+|||...||++||+||||.||+.++.++|+.. +.|+|||.+-|+||+||||||+|..+. ..|.||.-+|+
T Consensus 1179 yvIDAk~eGNlGRfLNHSC~PNl~VQnVfvdTHdlrfPwVAFFt~kyVkAgtELTWDY~Ye~g~v~~keL~C~CGa~~Cr 1258 (1262)
T KOG1141|consen 1179 YVIDAKQEGNLGRFLNHSCDPNLHVQNVFVDTHDLRFPWVAFFTRKYVKAGTELTWDYQYEQGQVATKELTCHCGAENCR 1258 (1262)
T ss_pred EEEecccccchhhhhccCCCccceeeeeeeeccccCCchhhhhhhhhhccCceeeeeccccccccccceEEEecChhhhh
Confidence 789999999999999999999999999999853 689999999999999999999997765 35889988887
Q ss_pred CC
Q 002895 852 KR 853 (869)
Q Consensus 852 k~ 853 (869)
++
T Consensus 1259 gr 1260 (1262)
T KOG1141|consen 1259 GR 1260 (1262)
T ss_pred cc
Confidence 54
No 9
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=99.54 E-value=1.4e-15 Score=174.61 Aligned_cols=125 Identities=35% Similarity=0.620 Sum_probs=107.1
Q ss_pred hcccCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccCCcccc-cCCC-cEEEeccccCCccc
Q 002895 716 LRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANSSFLF-DLND-QYVLDAYRKGDKLK 793 (869)
Q Consensus 716 rg~~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~~sYlf-~L~~-~~~IDAtr~GN~aR 793 (869)
........+..+...|||+||.+.|++|++|.+|.|+++...++..|...|...+..+.| .+.. ..++|+...|+.+|
T Consensus 328 ~~~~~~~~~~~~~~~~~g~fa~~~i~~~e~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~~g~~~r 407 (480)
T COG2940 328 KKRREPNVVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREENYDLLGNEFSFGLLEDKDKVRDSQKAGDVAR 407 (480)
T ss_pred ccccchhhhhhhcccccceeehhhccchHHHHHhcCcccchHHHHhhhccccccccccchhhccccchhhhhhhcccccc
Confidence 344456677788889999999999999999999999999999999998777555554444 3333 68899999999999
Q ss_pred cccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCCCCCC
Q 002895 794 FANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGPDQ 840 (869)
Q Consensus 794 FINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~~d~ 840 (869)
|+||||.|||.+....+.|..++.++|+|||.+||||++||+...+.
T Consensus 408 ~~nHS~~pN~~~~~~~~~g~~~~~~~~~rDI~~geEl~~dy~~~~~~ 454 (480)
T COG2940 408 FINHSCTPNCEASPIEVNGIFKISIYAIRDIKAGEELTYDYGPSLED 454 (480)
T ss_pred eeecCCCCCcceecccccccceeeecccccchhhhhhcccccccccc
Confidence 99999999999988888888899999999999999999999865443
No 10
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.48 E-value=3.7e-14 Score=133.52 Aligned_cols=105 Identities=18% Similarity=0.190 Sum_probs=73.5
Q ss_pred CcEEEEccccCCCCeeEEecccccCHHHHHHh---hhhhcc---------------------------------------
Q 002895 731 GWGAFLKNSVSKNDYLGEYTGELISHREADKR---GKIYDR--------------------------------------- 768 (869)
Q Consensus 731 G~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR---~k~yd~--------------------------------------- 768 (869)
|+||||+++|++|++|+++.+.+++..++... ...+..
T Consensus 1 GrGl~At~dI~~Ge~I~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80 (162)
T PF00856_consen 1 GRGLFATRDIKAGEVILIPRPAILTPDEVSPQPELLRLQLSKALEEQSRSDFSIQKKQKAEKSERSPQLESLHSISLRSE 80 (162)
T ss_dssp SEEEEESS-B-TTEEEEEESEEEEEHHHHHCHHHHSHHTTCSSSCSHHTTHHHHHHHHHHHHHHHHHHHHHHHHHCHTTT
T ss_pred CEEEEECccCCCCCEEEEECcceEEehhhhhcccchhhhhhhhhcccccccccccccccccccccccccccccccccccc
Confidence 89999999999999999999999987776441 000000
Q ss_pred cCCc---------------ccccCCCcEEEeccccCCccccccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEe
Q 002895 769 ANSS---------------FLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYD 833 (869)
Q Consensus 769 ~~~s---------------Ylf~L~~~~~IDAtr~GN~aRFINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfD 833 (869)
.... ...........++.-....+.|+||||.|||.+..........+.|+|.|+|++|||||++
T Consensus 81 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~p~~d~~NHsc~pn~~~~~~~~~~~~~~~~~a~r~I~~GeEi~is 160 (162)
T PF00856_consen 81 LQFSQAFQWSWFISWTRSDFSSRSFSEDDRDGIALYPFADMLNHSCDPNCEVSFDFDGDGGCLVVRATRDIKKGEEIFIS 160 (162)
T ss_dssp CCTCCHHHHHHHHHHHHHEEEEEEETTEEEEEEEEETGGGGSEEESSTSEEEEEEEETTTTEEEEEESS-B-TTSBEEEE
T ss_pred ccccccccchhhccccceeeeccccccccccccccCcHhHheccccccccceeeEeecccceEEEEECCccCCCCEEEEE
Confidence 0000 0000011234556667789999999999999888777677889999999999999999999
Q ss_pred cC
Q 002895 834 YR 835 (869)
Q Consensus 834 Yg 835 (869)
||
T Consensus 161 YG 162 (162)
T PF00856_consen 161 YG 162 (162)
T ss_dssp ST
T ss_pred EC
Confidence 97
No 11
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=98.88 E-value=5e-10 Score=128.58 Aligned_cols=115 Identities=30% Similarity=0.458 Sum_probs=92.6
Q ss_pred CCCCchHHhhcccCcEEEEEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhccc--CCcccccCCCcEEEe
Q 002895 707 GQCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRA--NSSFLFDLNDQYVLD 784 (869)
Q Consensus 707 ~~C~Nr~lqrg~~k~l~V~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~--~~sYlf~L~~~~~ID 784 (869)
..|.|+.+....... + .+ +|..+|.+| +|++|+..+...|...-... ...|+..+..+..||
T Consensus 301 ~~~~~~~~sk~~~~e------~-~~---~~~~~~~k~------vg~~i~~~e~~~~~~~~~~~~~~~~~~~~~e~~~~id 364 (463)
T KOG1081|consen 301 ERCHNQQFSKESYPE------P-QK---TAKADIRKG------VGEVIDDKECKARLQRVKESDLVDFYMVFIQKDRIID 364 (463)
T ss_pred cccccchhhhhcccc------c-ch---hhHHhhhcc------cCcccchhhheeehhhhhccchhhhhhhhhhcccccc
Confidence 578888776554444 1 12 889999998 99999999988775432222 234444444445999
Q ss_pred ccccCCccccccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCCC
Q 002895 785 AYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYG 837 (869)
Q Consensus 785 Atr~GN~aRFINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~ 837 (869)
+.++||.+||+||||+|||....|.+.++.++++||.+.|++|+||||+|++.
T Consensus 365 ~~~~~n~sr~~nh~~~~~v~~~k~~~~~~t~~~~~a~~~i~~g~e~t~~~n~~ 417 (463)
T KOG1081|consen 365 AGPKGNYSRFLNHSCQPNVETEKWQVIGDTRVGLFAPRQIEAGEELTFNYNGN 417 (463)
T ss_pred cccccchhhhhcccCCCceeechhheecccccccccccccccchhhhheeecc
Confidence 99999999999999999999999999999999999999999999999999865
No 12
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=98.56 E-value=4.8e-08 Score=107.25 Aligned_cols=112 Identities=22% Similarity=0.259 Sum_probs=78.3
Q ss_pred CCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccC-CcccccCCCcEEEeccccCCccccccCCCCCCcceeEE
Q 002895 730 AGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN-SSFLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVM 808 (869)
Q Consensus 730 kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~-~sYlf~L~~~~~IDAtr~GN~aRFINHSC~PNc~~~~v 808 (869)
.|--|.+++.+.+|+=|--.+|-|+.-.+++++.-.....+ .+-||.--.. -|...-..++||||.|.|||.+.
T Consensus 137 ~gAkivst~~w~~ndkIe~LvGcIaeLse~eE~~ll~~g~nDFSvmyStRk~---caqLwLGPaafINHDCrpnCkFv-- 211 (453)
T KOG2589|consen 137 NGAKIVSTKSWSRNDKIELLVGCIAELSEAEERSLLRGGGNDFSVMYSTRKR---CAQLWLGPAAFINHDCRPNCKFV-- 211 (453)
T ss_pred CCceEEeeccccCCccHHHhhhhhhhcChhhhHHHHhccCCceeeeeecccc---hhhheeccHHhhcCCCCCCceee--
Confidence 47778999999999999999999987777777633222222 2222221110 11223356899999999999653
Q ss_pred EECCeeEEEEEEecCCCCCCeEEEecC---CCCCCCCcccCC
Q 002895 809 LVAGDHRVGIFAKEHIEASEELFYDYR---YGPDQAPAWARK 847 (869)
Q Consensus 809 ~v~G~~RI~ifA~RDI~aGEELTfDYg---y~~d~~pcwC~~ 847 (869)
..|..++.+-++|||.||||||--|| |+....-|.|-.
T Consensus 212 -s~g~~tacvkvlRDIePGeEITcFYgs~fFG~~N~~CeC~T 252 (453)
T KOG2589|consen 212 -STGRDTACVKVLRDIEPGEEITCFYGSGFFGENNEECECVT 252 (453)
T ss_pred -cCCCceeeeehhhcCCCCceeEEeecccccCCCCceeEEee
Confidence 25667899999999999999999997 455555555543
No 13
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=98.12 E-value=2.8e-06 Score=96.25 Aligned_cols=109 Identities=18% Similarity=0.312 Sum_probs=83.0
Q ss_pred ccCcEEEEEcCC--CCcEEEEccccCCCCeeEEecccccCHHHHHHhhhhhcccCCcccccCC----CcEEEeccc--cC
Q 002895 718 QQQRILLAKSDV--AGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANSSFLFDLN----DQYVLDAYR--KG 789 (869)
Q Consensus 718 ~~k~l~V~kS~~--kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~rR~k~yd~~~~sYlf~L~----~~~~IDAtr--~G 789 (869)
....+.|..+.+ .|.||++...|.+|+--|-|.|+++.... . ...+..|++.+- ..++||++. ..
T Consensus 26 LP~~l~i~~Ssv~~~~lgV~s~~~i~~G~~FGP~~G~~~~~~~-~------~~~n~~y~W~I~~~d~~~~~iDg~d~~~s 98 (396)
T KOG2461|consen 26 LPPELRIKPSSVPVTGLGVWSNASILPGTSFGPFEGEIIASID-S------KSANNRYMWEIFSSDNGYEYIDGTDEEHS 98 (396)
T ss_pred CCCceEeeccccCCccccccccccccCcccccCccCccccccc-c------ccccCcceEEEEeCCCceEEeccCChhhc
Confidence 567888988876 78999999999999999999999822111 0 123455665442 348899984 68
Q ss_pred CccccccCCCC---CCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCCC
Q 002895 790 DKLKFANHSSN---PNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYG 837 (869)
Q Consensus 790 N~aRFINHSC~---PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy~ 837 (869)
|++||+|=+++ -|+.+. .....|.++|+|+|.+||||.++|+-+
T Consensus 99 NWmRYV~~Ar~~eeQNL~A~----Q~~~~Ifyrt~r~I~p~eELlVWY~~e 145 (396)
T KOG2461|consen 99 NWMRYVNSARSEEEQNLLAF----QIGENIFYRTIRDIRPNEELLVWYGSE 145 (396)
T ss_pred ceeeeecccCChhhhhHHHH----hccCceEEEecccCCCCCeEEEEeccc
Confidence 99999998885 687552 234568899999999999999999743
No 14
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=93.60 E-value=0.17 Score=38.56 Aligned_cols=41 Identities=32% Similarity=0.407 Sum_probs=37.1
Q ss_pred CCcHHHHHHHHHhHHhcC-CchHHHHHhhcCCCCchHHHHHHHh
Q 002895 499 EWKPIEKELYLKGVEIFG-RNSCLIARNLLSGLKTCMEVSTYMR 541 (869)
Q Consensus 499 ~W~~~E~~l~~k~~~ifg-~NsC~iAr~ll~g~KtC~eV~~ym~ 541 (869)
.||.-|..++..++..|| .++..||+.+ +.||-.+|..|..
T Consensus 1 ~Wt~eE~~~l~~~~~~~g~~~w~~Ia~~~--~~rs~~~~~~~~~ 42 (45)
T cd00167 1 PWTEEEDELLLEAVKKYGKNNWEKIAKEL--PGRTPKQCRERWR 42 (45)
T ss_pred CCCHHHHHHHHHHHHHHCcCCHHHHHhHc--CCCCHHHHHHHHH
Confidence 499999999999999999 8999999987 6699999988764
No 15
>smart00717 SANT SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains.
Probab=93.50 E-value=0.19 Score=38.86 Aligned_cols=43 Identities=28% Similarity=0.382 Sum_probs=38.6
Q ss_pred CCCcHHHHHHHHHhHHhcC-CchHHHHHhhcCCCCchHHHHHHHhh
Q 002895 498 SEWKPIEKELYLKGVEIFG-RNSCLIARNLLSGLKTCMEVSTYMRD 542 (869)
Q Consensus 498 ~~W~~~E~~l~~k~~~ifg-~NsC~iAr~ll~g~KtC~eV~~ym~~ 542 (869)
..|++-|..+|..++..|| .++..||..| +.+|-.+|..+...
T Consensus 2 ~~Wt~~E~~~l~~~~~~~g~~~w~~Ia~~~--~~rt~~~~~~~~~~ 45 (49)
T smart00717 2 GEWTEEEDELLIELVKKYGKNNWEKIAKEL--PGRTAEQCRERWNN 45 (49)
T ss_pred CCCCHHHHHHHHHHHHHHCcCCHHHHHHHc--CCCCHHHHHHHHHH
Confidence 4799999999999999999 9999999987 68999999887653
No 16
>PF00249 Myb_DNA-binding: Myb-like DNA-binding domain; InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G [, ]. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding [].; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=92.72 E-value=0.16 Score=40.93 Aligned_cols=46 Identities=15% Similarity=0.292 Sum_probs=39.5
Q ss_pred cccCCcccchhhhhHhhhcCChHHHHHHHHHHhc--CCcHHHHHHHHHhH
Q 002895 174 KHEFSDGEDRILWTVFEEHGLGEEVINAVSQFIG--IATSEVQDRYSTLK 221 (869)
Q Consensus 174 k~~f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~~--~~~sei~eRy~~L~ 221 (869)
|..|++.||.+|-.++++||.. -...||+.|. +|+.+++.||..|.
T Consensus 1 r~~Wt~eE~~~l~~~v~~~g~~--~W~~Ia~~~~~~Rt~~qc~~~~~~~~ 48 (48)
T PF00249_consen 1 RGPWTEEEDEKLLEAVKKYGKD--NWKKIAKRMPGGRTAKQCRSRYQNLL 48 (48)
T ss_dssp S-SS-HHHHHHHHHHHHHSTTT--HHHHHHHHHSSSSTHHHHHHHHHHHT
T ss_pred CCCCCHHHHHHHHHHHHHhCCc--HHHHHHHHcCCCCCHHHHHHHHHhhC
Confidence 4569999999999999999998 6788888886 99999999998873
No 17
>PF13921 Myb_DNA-bind_6: Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=92.26 E-value=0.14 Score=42.85 Aligned_cols=43 Identities=16% Similarity=0.452 Sum_probs=36.3
Q ss_pred CCcccchhhhhHhhhcCChHHHHHHHHHHhc-CCcHHHHHHHHH-hHh
Q 002895 177 FSDGEDRILWTVFEEHGLGEEVINAVSQFIG-IATSEVQDRYST-LKE 222 (869)
Q Consensus 177 f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~~-~~~sei~eRy~~-L~~ 222 (869)
|++.||.+|....++||.+ ...||++|+ +++.+|+.||.. |..
T Consensus 1 WT~eEd~~L~~~~~~~g~~---W~~Ia~~l~~Rt~~~~~~r~~~~l~~ 45 (60)
T PF13921_consen 1 WTKEEDELLLELVKKYGND---WKKIAEHLGNRTPKQCRNRWRNHLRP 45 (60)
T ss_dssp S-HHHHHHHHHHHHHHTS----HHHHHHHSTTS-HHHHHHHHHHTTST
T ss_pred CCHHHHHHHHHHHHHHCcC---HHHHHHHHCcCCHHHHHHHHHHHCcc
Confidence 5788999999999999963 889999998 999999999999 753
No 18
>smart00717 SANT SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains.
Probab=92.12 E-value=0.14 Score=39.57 Aligned_cols=46 Identities=17% Similarity=0.405 Sum_probs=39.4
Q ss_pred ccCCcccchhhhhHhhhcCChHHHHHHHHHHh-cCCcHHHHHHHHHhHh
Q 002895 175 HEFSDGEDRILWTVFEEHGLGEEVINAVSQFI-GIATSEVQDRYSTLKE 222 (869)
Q Consensus 175 ~~f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~-~~~~sei~eRy~~L~~ 222 (869)
..|+..||.+|-..+.+||.. -++.|+.+| ++++.+|+.||..|..
T Consensus 2 ~~Wt~~E~~~l~~~~~~~g~~--~w~~Ia~~~~~rt~~~~~~~~~~~~~ 48 (49)
T smart00717 2 GEWTEEEDELLIELVKKYGKN--NWEKIAKELPGRTAEQCRERWNNLLK 48 (49)
T ss_pred CCCCHHHHHHHHHHHHHHCcC--CHHHHHHHcCCCCHHHHHHHHHHHcC
Confidence 569999999999999999952 278888888 4999999999998754
No 19
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=90.87 E-value=0.082 Score=44.25 Aligned_cols=29 Identities=38% Similarity=0.702 Sum_probs=19.6
Q ss_pred cccccccCccCcccCcCccCCCCCCCCCCCCCCCCCCCCchHHhhcc
Q 002895 672 CPCFAAGRECDPDVCRNCWVSCGDGSLGEPPKRGDGQCGNMRLLLRQ 718 (869)
Q Consensus 672 CpC~~a~rECdPd~C~~C~~sCg~g~l~~p~~~~~~~C~Nr~lqrg~ 718 (869)
|.+++...|| |..|+ | | ..|+|++||+++
T Consensus 22 ClNR~l~~EC-~~~C~-~----G------------~~C~NqrFqk~~ 50 (51)
T smart00570 22 CLNRMLLIEC-SSDCP-C----G------------SYCSNQRFQKRQ 50 (51)
T ss_pred HHHHHHhhhc-CCCCC-C----C------------cCccCcccccCc
Confidence 3333457888 56665 2 2 589999999875
No 20
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=88.44 E-value=0.43 Score=36.28 Aligned_cols=43 Identities=14% Similarity=0.338 Sum_probs=36.9
Q ss_pred cCCcccchhhhhHhhhcCChHHHHHHHHHHhc-CCcHHHHHHHHHh
Q 002895 176 EFSDGEDRILWTVFEEHGLGEEVINAVSQFIG-IATSEVQDRYSTL 220 (869)
Q Consensus 176 ~f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~~-~~~sei~eRy~~L 220 (869)
.|+..||.+|-..+.++|.. -...|+++|. ++..+|+.||..+
T Consensus 1 ~Wt~eE~~~l~~~~~~~g~~--~w~~Ia~~~~~rs~~~~~~~~~~~ 44 (45)
T cd00167 1 PWTEEEDELLLEAVKKYGKN--NWEKIAKELPGRTPKQCRERWRNL 44 (45)
T ss_pred CCCHHHHHHHHHHHHHHCcC--CHHHHHhHcCCCCHHHHHHHHHHh
Confidence 37889999999999999952 2788899994 9999999999876
No 21
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 from SWISSPROT [] and TSO1 Q9LE32 from SWISSPROT []. This group of proteins is called a CXC domain in [].
Probab=84.46 E-value=0.5 Score=38.21 Aligned_cols=31 Identities=48% Similarity=1.283 Sum_probs=27.6
Q ss_pred ccCCcccCCCCccCCCcccccccCccCcccCc
Q 002895 656 RFRGCHCAKSQCRSRQCPCFAAGRECDPDVCR 687 (869)
Q Consensus 656 Rf~GC~C~~~~C~t~~CpC~~a~rECdPd~C~ 687 (869)
...||.|.++.|...-|.||++++.|.+. |.
T Consensus 2 ~~~gC~Ckks~Clk~YC~Cf~~g~~C~~~-C~ 32 (42)
T PF03638_consen 2 KKKGCNCKKSKCLKLYCECFQAGRFCTPN-CK 32 (42)
T ss_pred CCCCCcccCcChhhhhCHHHHCcCcCCCC-cc
Confidence 35799999999999999999999999986 54
No 22
>KOG1171 consensus Metallothionein-like protein [Inorganic ion transport and metabolism]
Probab=84.18 E-value=0.23 Score=57.05 Aligned_cols=65 Identities=37% Similarity=1.050 Sum_probs=51.8
Q ss_pred cccCCCC-CCCC-CCCcccCCCccccCCCccccccCchhhhhc-------------------------------------
Q 002895 616 YTPCGCQ-SMCG-KQCPCLHNGTCCEKYCGYSFLRCSKSCKNR------------------------------------- 656 (869)
Q Consensus 616 y~PC~c~-~~C~-~~C~C~~~g~~Cek~Cg~~~~~C~~~C~nR------------------------------------- 656 (869)
-.+|.|. ..|- ..|.|...|.+|..+|. |- +|.|.
T Consensus 131 k~~~~ck~SkclklYCeCFAsG~yC~~~Cn-----Cv-nC~N~~~~e~~r~~a~k~~l~RNP~AFkPKia~s~~~~~da~ 204 (406)
T KOG1171|consen 131 KKKCNCKKSKCLKLYCECFASGVYCTGPCN-----CV-NCFNNPEHESVRLKARKQILERNPNAFKPKIAASSSGIADAS 204 (406)
T ss_pred ccCCCchHHHHHHHhHHHHhhcccccCCcc-----ee-eccCCCcchHHHHHHHHHHhhcCccccccccccCCcccchhh
Confidence 4455564 5565 48999999999999999 86 57554
Q ss_pred ------------cCCcccCCCCccCCCcccccccCccCcccCc
Q 002895 657 ------------FRGCHCAKSQCRSRQCPCFAAGRECDPDVCR 687 (869)
Q Consensus 657 ------------f~GC~C~~~~C~t~~CpC~~a~rECdPd~C~ 687 (869)
-.||+|.+..|..+-|.||+++.-|... |+
T Consensus 205 ~~~~~~~~sa~hkkGC~CkkSgClKkYCECyQa~vlCS~n-Ck 246 (406)
T KOG1171|consen 205 EEASKTPASARHKKGCNCKKSGCLKKYCECYQAGVLCSSN-CK 246 (406)
T ss_pred hhhhccchhhhhcCCCCCccccchHHHHHHHhcCCCcccc-cc
Confidence 4689999999999999999999988733 54
No 23
>PF09111 SLIDE: SLIDE; InterPro: IPR015195 The SLIDE domain adopts a secondary structure comprising a main core of three alpha-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb (IPR014778 from INTERPRO) repeats or homeodomains []. ; GO: 0003676 nucleic acid binding, 0005524 ATP binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006338 chromatin remodeling, 0005634 nucleus; PDB: 2NOG_A 2Y9Y_A 2Y9Z_A 1OFC_X.
Probab=81.67 E-value=1.3 Score=43.11 Aligned_cols=50 Identities=28% Similarity=0.455 Sum_probs=36.9
Q ss_pred ccccCCcccchhhhhHhhhcCC-----hHHHHHHHHH--------Hh-cCCcHHHHHHHHHhHh
Q 002895 173 EKHEFSDGEDRILWTVFEEHGL-----GEEVINAVSQ--------FI-GIATSEVQDRYSTLKE 222 (869)
Q Consensus 173 ek~~f~~~ed~~~~~~~~e~g~-----sd~v~~~l~~--------~~-~~~~sei~eRy~~L~~ 222 (869)
-++.|++.||++|=+.+-+||+ =|.|...|.. || ++|+.||+.|...|-.
T Consensus 48 ~~k~yseeEDRfLl~~~~~~G~~~~~~~e~Ik~~Ir~~p~FrFDwf~kSRt~~el~rR~~tLi~ 111 (118)
T PF09111_consen 48 KKKVYSEEEDRFLLCMLYKYGYDAEGNWEKIKQEIRESPLFRFDWFFKSRTPQELQRRCNTLIK 111 (118)
T ss_dssp S-SSS-HHHHHHHHHHHHHHTTTSTTHHHHHHHHHHH-CGGCT-HHHHTS-HHHHHHHHHHHHH
T ss_pred CCCCcCcHHHHHHHHHHHHhCCCCCchHHHHHHHHHhCCCcccchhcccCCHHHHHHHHHHHHH
Confidence 3788999999999999999999 2444444443 22 9999999999999853
No 24
>PF00249 Myb_DNA-binding: Myb-like DNA-binding domain; InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G [, ]. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding [].; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=79.24 E-value=5.4 Score=32.13 Aligned_cols=43 Identities=23% Similarity=0.405 Sum_probs=34.7
Q ss_pred CCCcHHHHHHHHHhHHhcCCc-hHHHHHhhcCCCCchHHHHHHHh
Q 002895 498 SEWKPIEKELYLKGVEIFGRN-SCLIARNLLSGLKTCMEVSTYMR 541 (869)
Q Consensus 498 ~~W~~~E~~l~~k~~~ifg~N-sC~iAr~ll~g~KtC~eV~~ym~ 541 (869)
..||.-|..+|+.++..||.+ +=.||..+- +.+|=.++-.+..
T Consensus 2 ~~Wt~eE~~~l~~~v~~~g~~~W~~Ia~~~~-~~Rt~~qc~~~~~ 45 (48)
T PF00249_consen 2 GPWTEEEDEKLLEAVKKYGKDNWKKIAKRMP-GGRTAKQCRSRYQ 45 (48)
T ss_dssp -SS-HHHHHHHHHHHHHSTTTHHHHHHHHHS-SSSTHHHHHHHHH
T ss_pred CCCCHHHHHHHHHHHHHhCCcHHHHHHHHcC-CCCCHHHHHHHHH
Confidence 469999999999999999988 999999872 3788777766543
No 25
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=78.05 E-value=1.6 Score=51.25 Aligned_cols=40 Identities=30% Similarity=0.399 Sum_probs=30.9
Q ss_pred cccCCCCCCcceeEEEECCeeEEEEEEecCCCCCCeEEEecCC
Q 002895 794 FANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRY 836 (869)
Q Consensus 794 FINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGEELTfDYgy 836 (869)
+.||++++. ...+..-+..+.+++.++|.+||||+++||-
T Consensus 239 ~~NH~~~~~---~~~~~~~d~~~~l~~~~~v~~geevfi~YG~ 278 (472)
T KOG1337|consen 239 LLNHSPEVI---KAGYNQEDEAVELVAERDVSAGEEVFINYGP 278 (472)
T ss_pred hhccCchhc---cccccCCCCcEEEEEeeeecCCCeEEEecCC
Confidence 679999982 2233333448889999999999999999973
No 26
>PF13921 Myb_DNA-bind_6: Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=75.16 E-value=6.3 Score=32.85 Aligned_cols=41 Identities=32% Similarity=0.443 Sum_probs=32.4
Q ss_pred CcHHHHHHHHHhHHhcCCchHHHHHhhcCCCCchHHHHHHHhh
Q 002895 500 WKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCMEVSTYMRD 542 (869)
Q Consensus 500 W~~~E~~l~~k~~~ifg~NsC~iAr~ll~g~KtC~eV~~ym~~ 542 (869)
||.-|..+++.++..||.+...||..| |.+|=.+|......
T Consensus 1 WT~eEd~~L~~~~~~~g~~W~~Ia~~l--~~Rt~~~~~~r~~~ 41 (60)
T PF13921_consen 1 WTKEEDELLLELVKKYGNDWKKIAEHL--GNRTPKQCRNRWRN 41 (60)
T ss_dssp S-HHHHHHHHHHHHHHTS-HHHHHHHS--TTS-HHHHHHHHHH
T ss_pred CCHHHHHHHHHHHHHHCcCHHHHHHHH--CcCCHHHHHHHHHH
Confidence 999999999999999999999999987 66777777664443
No 27
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=73.00 E-value=4 Score=46.32 Aligned_cols=38 Identities=32% Similarity=0.496 Sum_probs=28.2
Q ss_pred cccCCCCCCcceeEEEECCeeEEEEEEecCCCCCC-eEEEecC
Q 002895 794 FANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASE-ELFYDYR 835 (869)
Q Consensus 794 FINHSC~PNc~~~~v~v~G~~RI~ifA~RDI~aGE-ELTfDYg 835 (869)
++||||.||+. ....+.. ..+++...+.+++ ||+..|-
T Consensus 208 ~~~hsC~pn~~---~~~~~~~-~~~~~~~~~~~~~~~l~~~y~ 246 (482)
T KOG2084|consen 208 LFNHSCFPNIS---VIFDGRG-LALLVPAGIDAGEEELTISYT 246 (482)
T ss_pred hcccCCCCCeE---EEECCce-eEEEeecccCCCCCEEEEeec
Confidence 78999999996 3344544 4466777777776 9999994
No 28
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site [] when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. 
The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity []. ; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=70.87 E-value=3.2 Score=38.32 Aligned_cols=46 Identities=26% Similarity=0.683 Sum_probs=19.0
Q ss_pred ccCCcccCCCCc-cCCCcccccccCc------------cCcccCcCccCCCCCCCCCCCCCCCCCCCCch
Q 002895 656 RFRGCHCAKSQC-RSRQCPCFAAGRE------------CDPDVCRNCWVSCGDGSLGEPPKRGDGQCGNM 712 (869)
Q Consensus 656 Rf~GC~C~~~~C-~t~~CpC~~a~rE------------CdPd~C~~C~~sCg~g~l~~p~~~~~~~C~Nr 712 (869)
.+.||.| .+.| ....|.|...... -....=.+|+..|+++ ..|.||
T Consensus 45 ~~~~C~C-~~~C~~~~~C~C~~~~~~~~~Y~~~g~l~~~~~~~i~EC~~~C~C~----------~~C~NR 103 (103)
T PF05033_consen 45 FLQGCDC-SGDCSNPSNCECLQRNGGIFAYDSNGRLRIPDKPPIFECNDNCGCS----------PSCRNR 103 (103)
T ss_dssp GTS-----SSSSTCTTTSHHHCCTSSS-SB-TTSSBSSSSTSEEE---TTSSS-----------TTSTT-
T ss_pred cCccCcc-CCCCCCCCCCcCccccCccccccCCCcCccCCCCeEEeCCCCCCCC----------CCCCCC
Confidence 3345666 3445 4456666554432 2233334666666653 488886
No 29
>TIGR01557 myb_SHAQKYF myb-like DNA-binding domain, SHAQKYF class. This model describes a DNA-binding domain restricted to (but common in) plant proteins, many of which also contain a response regulator domain. The domain appears related to the Myb-like DNA-binding domain described by Pfam model pfam00249. It is distinguished in part by a well-conserved motif SH[AL]QKY[RF] at the C-terminal end of the motif.
Probab=64.54 E-value=17 Score=31.18 Aligned_cols=44 Identities=16% Similarity=0.191 Sum_probs=36.6
Q ss_pred CCCcHHHHHHHHHhHHhcCC-ch---HHHHHhhcCCCC-chHHHHHHHhh
Q 002895 498 SEWKPIEKELYLKGVEIFGR-NS---CLIARNLLSGLK-TCMEVSTYMRD 542 (869)
Q Consensus 498 ~~W~~~E~~l~~k~~~ifg~-Ns---C~iAr~ll~g~K-tC~eV~~ym~~ 542 (869)
-.||+-|-.+|+.+++.||. |. =.|+.++. .++ |-.+|.++++.
T Consensus 4 ~~WT~eeh~~Fl~ai~~~G~g~~a~pk~I~~~~~-~~~lT~~qV~SH~QK 52 (57)
T TIGR01557 4 VVWTEDLHDRFLQAVQKLGGPDWATPKRILELMV-VDGLTRDQVASHLQK 52 (57)
T ss_pred CCCCHHHHHHHHHHHHHhCCCcccchHHHHHHcC-CCCCCHHHHHHHHHH
Confidence 46999999999999999998 87 78888763 355 88999887764
No 30
>PLN03212 Transcription repressor MYB5; Provisional
Probab=63.16 E-value=8.1 Score=42.14 Aligned_cols=52 Identities=15% Similarity=0.291 Sum_probs=43.6
Q ss_pred CCccccccCCcccchhhhhHhhhcCChHHHHHHHHHHh-cCCcHHHHHHHHHhHhh
Q 002895 169 EPEEEKHEFSDGEDRILWTVFEEHGLGEEVINAVSQFI-GIATSEVQDRYSTLKEK 223 (869)
Q Consensus 169 e~eeek~~f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~-~~~~sei~eRy~~L~~k 223 (869)
.|.=-|..|+..||.+|....+++|-. -..||++| ++|.-.|+.||+.+..+
T Consensus 73 ~P~I~kgpWT~EED~lLlel~~~~GnK---Ws~IAk~LpGRTDnqIKNRWns~LrK 125 (249)
T PLN03212 73 RPSVKRGGITSDEEDLILRLHRLLGNR---WSLIAGRIPGRTDNEIKNYWNTHLRK 125 (249)
T ss_pred chhcccCCCChHHHHHHHHHHHhcccc---HHHHHhhcCCCCHHHHHHHHHHHHhH
Confidence 455567789999999999999999953 67788888 99999999999877654
No 31
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=62.63 E-value=14 Score=46.08 Aligned_cols=55 Identities=16% Similarity=0.260 Sum_probs=44.8
Q ss_pred CCCCchHHhhcccCcEEE--------EEcCCCCcEEEEccccCCCCeeEEecccccCHHHHHH
Q 002895 707 GQCGNMRLLLRQQQRILL--------AKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADK 761 (869)
Q Consensus 707 ~~C~Nr~lqrg~~k~l~V--------~kS~~kG~GLFA~edI~KGefI~EY~GEIIs~~Ea~r 761 (869)
..|.|+.++.+...+.++ +++...|||+.+..+|+.-.||++|+|...+..-+.+
T Consensus 992 ~~e~~~~v~~~~~~~me~~s~~~l~i~~~~~~~~~~~edtD~~~~~~~~~~~~~ppt~~l~~~ 1054 (1262)
T KOG1141|consen 992 RKEYNRVVQNNIKYPMEVSSFNDLQIFKTAQSGWGVREDTDIPQSTFICTYVGAPPTDDLADE 1054 (1262)
T ss_pred ccccchhhhcCCccceeeeecccccccccccccccccccccCCCCcccccccCCCCchhhHHH
Confidence 368899988777666554 4555689999999999999999999999988776643
No 32
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site [] when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. 
The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity []. ; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=58.97 E-value=5.3 Score=36.88 Aligned_cols=37 Identities=41% Similarity=1.072 Sum_probs=20.1
Q ss_pred CcccCCCCCCC--CCCCcccCCCc--------------------cccCCCccccccCchhhhhc
Q 002895 615 QYTPCGCQSMC--GKQCPCLHNGT--------------------CCEKYCGYSFLRCSKSCKNR 656 (869)
Q Consensus 615 ~y~PC~c~~~C--~~~C~C~~~g~--------------------~Cek~Cg~~~~~C~~~C~nR 656 (869)
...-|+|.+.| ...|.|..... .|...|+ |+..|.||
T Consensus 45 ~~~~C~C~~~C~~~~~C~C~~~~~~~~~Y~~~g~l~~~~~~~i~EC~~~C~-----C~~~C~NR 103 (103)
T PF05033_consen 45 FLQGCDCSGDCSNPSNCECLQRNGGIFAYDSNGRLRIPDKPPIFECNDNCG-----CSPSCRNR 103 (103)
T ss_dssp GTS----SSSSTCTTTSHHHCCTSSS-SB-TTSSBSSSSTSEEE---TTSS-----S-TTSTT-
T ss_pred cCccCccCCCCCCCCCCcCccccCccccccCCCcCccCCCCeEEeCCCCCC-----CCCCCCCC
Confidence 35579998889 47899987542 3666677 77777776
No 33
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 from SWISSPROT [] and TSO1 Q9LE32 from SWISSPROT []. This group of proteins is called a CXC domain in [].
Probab=54.19 E-value=6.3 Score=32.01 Aligned_cols=37 Identities=35% Similarity=0.942 Sum_probs=30.5
Q ss_pred CcccCCCC-CCCC-CCCcccCCCccccCCCccccccCchhhhhcc
Q 002895 615 QYTPCGCQ-SMCG-KQCPCLHNGTCCEKYCGYSFLRCSKSCKNRF 657 (869)
Q Consensus 615 ~y~PC~c~-~~C~-~~C~C~~~g~~Cek~Cg~~~~~C~~~C~nRf 657 (869)
+..+|.|. ..|- ..|.|...+.+|...|. |. +|.|..
T Consensus 2 ~~~gC~Ckks~Clk~YC~Cf~~g~~C~~~C~-----C~-~C~N~~ 40 (42)
T PF03638_consen 2 KKKGCNCKKSKCLKLYCECFQAGRFCTPNCK-----CQ-NCKNTE 40 (42)
T ss_pred CCCCCcccCcChhhhhCHHHHCcCcCCCCcc-----cC-CCCCcC
Confidence 35689996 8887 58999999999999999 94 687764
No 34
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=52.16 E-value=4.7 Score=34.01 Aligned_cols=14 Identities=36% Similarity=0.862 Sum_probs=7.8
Q ss_pred CchhhhhccCCccc
Q 002895 649 CSKSCKNRFRGCHC 662 (869)
Q Consensus 649 C~~~C~nRf~GC~C 662 (869)
|+.+|.||+.-=.|
T Consensus 18 CgsdClNR~l~~EC 31 (51)
T smart00570 18 CGSDCLNRMLLIEC 31 (51)
T ss_pred cchHHHHHHHhhhc
Confidence 33667777664333
No 35
>COG5259 RSC8 RSC chromatin remodeling complex subunit RSC8 [Chromatin structure and dynamics / Transcription]
Probab=50.98 E-value=14 Score=43.38 Aligned_cols=38 Identities=32% Similarity=0.532 Sum_probs=33.7
Q ss_pred cCCCCcHHHHHHHHHhHHhcCCchHHHHHhhcCCCCchHH
Q 002895 496 CSSEWKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCME 535 (869)
Q Consensus 496 ~~~~W~~~E~~l~~k~~~ifg~NsC~iAr~ll~g~KtC~e 535 (869)
....|+.-|.-|++.|+++||+.+=.||+++ |+||=-|
T Consensus 278 ~dk~WS~qE~~LLLEGIe~ygDdW~kVA~HV--gtKt~Eq 315 (531)
T COG5259 278 RDKNWSRQELLLLLEGIEMYGDDWDKVARHV--GTKTKEQ 315 (531)
T ss_pred ccccccHHHHHHHHHHHHHhhhhHHHHHHHh--CCCCHHH
Confidence 4568999999999999999999999999998 8898433
No 36
>PLN03091 hypothetical protein; Provisional
Probab=48.84 E-value=21 Score=41.99 Aligned_cols=53 Identities=15% Similarity=0.362 Sum_probs=44.1
Q ss_pred CCCccccccCCcccchhhhhHhhhcCChHHHHHHHHHHh-cCCcHHHHHHHHHhHhh
Q 002895 168 IEPEEEKHEFSDGEDRILWTVFEEHGLGEEVINAVSQFI-GIATSEVQDRYSTLKEK 223 (869)
Q Consensus 168 ~e~eeek~~f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~-~~~~sei~eRy~~L~~k 223 (869)
..|.--|..|+..||.+|....+++|-. ...||++| +++.-+||.||+.+.+|
T Consensus 61 LdP~IkKgpWT~EED~lLLeL~k~~GnK---WskIAk~LPGRTDnqIKNRWnslLKK 114 (459)
T PLN03091 61 LRPDLKRGTFSQQEENLIIELHAVLGNR---WSQIAAQLPGRTDNEIKNLWNSCLKK 114 (459)
T ss_pred cCCcccCCCCCHHHHHHHHHHHHHhCcc---hHHHHHhcCCCCHHHHHHHHHHHHHH
Confidence 3455567889999999999999999952 67788888 99999999999876554
No 37
>PF14774 FAM177: FAM177 family
Probab=47.30 E-value=31 Score=34.12 Aligned_cols=66 Identities=21% Similarity=0.257 Sum_probs=37.9
Q ss_pred cccceeeEeCCCCeEEE-ecCCccccCCCcccccc-CC---cccc--hh-------hhhHhhhcCChHHHHHHHHHHhcC
Q 002895 143 VGRRRIYYDQHGSEALV-CSDSEEDIIEPEEEKHE-FS---DGED--RI-------LWTVFEEHGLGEEVINAVSQFIGI 208 (869)
Q Consensus 143 vgrrriYYD~~g~Eali-cSdseee~~e~eeek~~-f~---~~ed--~~-------~~~~~~e~g~sd~v~~~l~~~~~~ 208 (869)
.=||-||+ ..||.|- .|.+||| .+.++.+.+ +. +... .. +|+...-+.--|=|=+.||.|||.
T Consensus 18 ~prRiihF--sdGetmEE~StdeEe-~e~d~~~~d~~~~~~dp~~l~w~~~~~~~~~~~~~~~l~~~d~~Ge~lA~~fGi 94 (123)
T PF14774_consen 18 KPRRIIHF--SDGETMEEYSTDEEE-EEQDEDQPDKLSVQVDPSKLTWGPWLWFWAWRVGTKSLSGCDYLGEKLASFFGI 94 (123)
T ss_pred CchheeEe--cCCceeeeecccccc-ccccccccccccccCCcccCCcHHHHHHHHHHHHHhHhhHHhhhhhHHHHHhCC
Confidence 35889999 8997776 7766665 333333333 22 1222 11 223333333344555889999999
Q ss_pred CcH
Q 002895 209 ATS 211 (869)
Q Consensus 209 ~~s 211 (869)
+.+
T Consensus 95 t~~ 97 (123)
T PF14774_consen 95 TSP 97 (123)
T ss_pred Cch
Confidence 986
No 38
>PLN03212 Transcription repressor MYB5; Provisional
Probab=43.75 E-value=21 Score=39.01 Aligned_cols=46 Identities=15% Similarity=0.205 Sum_probs=38.5
Q ss_pred cccCCcccchhhhhHhhhcCChHHHHHHHHHHh--cCCcHHHHHHHHHhH
Q 002895 174 KHEFSDGEDRILWTVFEEHGLGEEVINAVSQFI--GIATSEVQDRYSTLK 221 (869)
Q Consensus 174 k~~f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~--~~~~sei~eRy~~L~ 221 (869)
+.-|+..||.+|...+++||-.. ...||+.+ +++.-+..+||...-
T Consensus 25 Rg~WT~EEDe~L~~lV~kyG~~n--W~~IAk~~g~gRT~KQCReRW~N~L 72 (249)
T PLN03212 25 RGPWTVEEDEILVSFIKKEGEGR--WRSLPKRAGLLRCGKSCRLRWMNYL 72 (249)
T ss_pred CCCCCHHHHHHHHHHHHHhCccc--HHHHHHhhhcCCCcchHHHHHHHhh
Confidence 55699999999999999999643 56788766 799999999997654
No 39
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=41.15 E-value=8.5 Score=45.49 Aligned_cols=100 Identities=11% Similarity=-0.004 Sum_probs=67.2
Q ss_pred EEEccccCCCCeeEEecccccCHH--HHHHhhhhhc-ccCC-cccccCC---CcEEEeccccCCccccccCCCCCCccee
Q 002895 734 AFLKNSVSKNDYLGEYTGELISHR--EADKRGKIYD-RANS-SFLFDLN---DQYVLDAYRKGDKLKFANHSSNPNCFAK 806 (869)
Q Consensus 734 LFA~edI~KGefI~EY~GEIIs~~--Ea~rR~k~yd-~~~~-sYlf~L~---~~~~IDAtr~GN~aRFINHSC~PNc~~~ 806 (869)
..|...+..|++|+.++|+..-.. ....+. +. .... .-+|... .....++...|+..++++|++.|+-...
T Consensus 130 ~~~~~~~~~~~~vw~~vg~~~~~~c~vc~~~~--~~~~~~~~~~~f~~~~~~~~~~~~~~~~g~~~~~l~~~~~~~s~~~ 207 (463)
T KOG1081|consen 130 CRAFKKREVGDLVWSKVGEYPWWPCMVCHDPL--LPKGMKHDHVNFFGCYAWTHEKRVFPYEGQSSKLIPHSKKPASTMS 207 (463)
T ss_pred eeeeccccceeEEeEEcCcccccccceecCcc--cchhhccccceeccchhhHHHhhhhhccchHHHhhhhccccchhhh
Confidence 777779999999999999986444 111110 00 0000 0111111 1122333449999999999999999888
Q ss_pred EEEECCeeEEEEEEecCCCCCCe------EEEecC
Q 002895 807 VMLVAGDHRVGIFAKEHIEASEE------LFYDYR 835 (869)
Q Consensus 807 ~v~v~G~~RI~ifA~RDI~aGEE------LTfDYg 835 (869)
.+...+..|++.++.+.++-+.- ++.+|.
T Consensus 208 ~~~~~~~~r~~~~~~q~~~~~~~~e~k~~~~~~~~ 242 (463)
T KOG1081|consen 208 EKIKEAKARFGKLKAQWEAGIKQKELKPEEYKRIK 242 (463)
T ss_pred hhhhcccchhhhcccchhhccchhhcccccccccc
Confidence 88889999999999998888877 666653
No 40
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=40.42 E-value=14 Score=45.26 Aligned_cols=29 Identities=14% Similarity=-0.029 Sum_probs=21.3
Q ss_pred hhHHHHHHHHHHH-----HHHHHHHHHHHHHHHh
Q 002895 27 LTYKLNQLKKQVQ-----AERVVSVKDKIEKNRK 55 (869)
Q Consensus 27 L~~~i~~LKkqi~-----~eR~~~ik~k~e~N~~ 55 (869)
+.-++..++.+-+ ++|+..||+++.++++
T Consensus 18 ~~r~~~~~~~K~~~~~~~~~~~e~i~~~~~E~k~ 51 (739)
T KOG1079|consen 18 RKRVREADEGKSAKSKNPADRLEKIKILNCEWKK 51 (739)
T ss_pred HHHHHHHhhhhhhcccCHHHHHHHHHHHHHHHhh
Confidence 3344445555555 8899999999999998
No 41
>KOG3813 consensus Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]
Probab=36.16 E-value=16 Score=43.42 Aligned_cols=25 Identities=40% Similarity=1.107 Sum_probs=18.4
Q ss_pred CcccCCCCccCCCcccccccCccCcc
Q 002895 659 GCHCAKSQCRSRQCPCFAAGRECDPD 684 (869)
Q Consensus 659 GC~C~~~~C~t~~CpC~~a~rECdPd 684 (869)
||.|. +-|.+.+|.|-+++.-|.-|
T Consensus 309 GCsCr-~~CdPETCaCSqaGIkCQvD 333 (640)
T KOG3813|consen 309 GCSCR-GVCDPETCACSQAGIKCQVD 333 (640)
T ss_pred CCccc-ceeChhhcchhccCceEeec
Confidence 35563 67788888888888888655
No 42
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=30.75 E-value=32 Score=39.26 Aligned_cols=42 Identities=36% Similarity=0.854 Sum_probs=27.9
Q ss_pred CCcCcccCCCCCCCCCC----CcccC----------------------CCccccCCCccccccCchhhhhccC
Q 002895 612 SCKQYTPCGCQSMCGKQ----CPCLH----------------------NGTCCEKYCGYSFLRCSKSCKNRFR 658 (869)
Q Consensus 612 ~~~~y~PC~c~~~C~~~----C~C~~----------------------~g~~Cek~Cg~~~~~C~~~C~nRf~ 658 (869)
.+..-..|.|...|... |.|.. ....|...|+ |+.+|.||+.
T Consensus 103 ~~~~~~~c~C~~~~~~~~~~~C~C~~~n~~~~~~~~~~~~~~~~~~~~~i~EC~~~C~-----C~~~C~nRv~ 170 (364)
T KOG1082|consen 103 DCENSTGCRCCSSCSSVLPLTCLCERHNGGLVAYTCDGDCGTLGKFKEPVFECSVACG-----CHPDCANRVV 170 (364)
T ss_pred cCccccCCCccCCCCCCCCccccChHhhCCccccccCCccccccccCccccccccCCC-----CCCcCcchhh
Confidence 34566778887666532 78877 1234667777 8888888875
No 43
>PRK09430 djlA Dna-J like membrane chaperone protein; Provisional
Probab=28.43 E-value=61 Score=35.66 Aligned_cols=49 Identities=16% Similarity=0.339 Sum_probs=39.3
Q ss_pred CCcccchhhhhHhhhcCChHHHHHHHHHHh-----------------------------------cCCcHHHHHHHHHhH
Q 002895 177 FSDGEDRILWTVFEEHGLGEEVINAVSQFI-----------------------------------GIATSEVQDRYSTLK 221 (869)
Q Consensus 177 f~~~ed~~~~~~~~e~g~sd~v~~~l~~~~-----------------------------------~~~~sei~eRy~~L~ 221 (869)
+...|+.+||.+.+-.|+|+.-|+.+.+++ +.+.+||+..|+.|.
T Consensus 146 l~~~E~~~L~~Ia~~Lgis~~df~~~~~~~~~~~~f~~~~~~~~~~~~~~~~~~~~ay~vLgv~~~as~~eIk~aYr~L~ 225 (267)
T PRK09430 146 LHPNERQVLYVIAEELGFSRFQFDQLLRMMQAGFRFQQQQGGGGYQQAQRGPTLEDAYKVLGVSESDDDQEIKRAYRKLM 225 (267)
T ss_pred CCHHHHHHHHHHHHHcCCCHHHHHHHHHHHHHHHhhcccccccccccccCCCcHHhHHHHcCCCCCCCHHHHHHHHHHHH
Confidence 888899999999999999998877665543 235688999999997
Q ss_pred hhcC
Q 002895 222 EKYD 225 (869)
Q Consensus 222 ~k~~ 225 (869)
.++-
T Consensus 226 ~~~H 229 (267)
T PRK09430 226 SEHH 229 (267)
T ss_pred HHhC
Confidence 7653
No 44
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=28.14 E-value=32 Score=32.13 Aligned_cols=17 Identities=35% Similarity=0.618 Sum_probs=12.9
Q ss_pred EEEEEecCCCCCCeEEE
Q 002895 816 VGIFAKEHIEASEELFY 832 (869)
Q Consensus 816 I~ifA~RDI~aGEELTf 832 (869)
.||||+|||++||-|.+
T Consensus 2 rGl~At~dI~~Ge~I~~ 18 (162)
T PF00856_consen 2 RGLFATRDIKAGEVILI 18 (162)
T ss_dssp EEEEESS-B-TTEEEEE
T ss_pred EEEEECccCCCCCEEEE
Confidence 47999999999998874
No 45
>PF08271 TF_Zn_Ribbon: TFIIB zinc-binding; InterPro: IPR013137 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a zinc finger motif found in transcription factor IIB (TFIIB). In eukaryotes the initiation of transcription of protein-encoding genes by the polymerase II complex (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box).
At least seven different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, and -IIH. TFIIB and TFIID are responsible for promoter recognition and interaction with Pol II; together with Pol II, they form a minimal initiation complex capable of transcription under certain conditions. The TATA box of a Pol II promoter is bound in the initiation complex by the TBP subunit of TFIID, which bends the DNA around the C-terminal domain of TFIIB whereas the N-terminal zinc finger of TFIIB interacts with Pol II. The TFIIB zinc finger adopts a zinc ribbon fold characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites. The zinc finger contacts the Rpb1 subunit of Pol II through its dock domain, a conserved region of about 70 amino acids located close to the polymerase active site. In the Pol II complex this surface is located near the RNA exit groove. Interestingly this sequence is best conserved in the three polymerases that utilise a TFIIB-like general transcription factor (Pol II, Pol III, and archaeal RNA polymerase) but not in Pol I.; GO: 0008270 zinc ion binding, 0006355 regulation of transcription, DNA-dependent; PDB: 1VD4_A 1PFT_A 3K1F_M 3K7A_M 1RO4_A 1RLY_A 1DL6_A.
Probab=27.54 E-value=71 Score=25.44 Aligned_cols=33 Identities=39% Similarity=0.721 Sum_probs=23.7
Q ss_pred ccceeeEeCCCCeEEEecCC----ccccCCCccccccC
Q 002895 144 GRRRIYYDQHGSEALVCSDS----EEDIIEPEEEKHEF 177 (869)
Q Consensus 144 grrriYYD~~g~EalicSds----eee~~e~eeek~~f 177 (869)
|.+.|++|...||. ||+.= ||.++.++-|.++|
T Consensus 7 g~~~~~~D~~~g~~-vC~~CG~Vl~e~~i~~~~e~r~f 43 (43)
T PF08271_consen 7 GSKEIVFDPERGEL-VCPNCGLVLEENIIDEGPEWREF 43 (43)
T ss_dssp SSSEEEEETTTTEE-EETTT-BBEE-TTBSCCCSCCHC
T ss_pred cCCceEEcCCCCeE-ECCCCCCEeecccccCCcccccC
Confidence 55669999999997 99875 55566666666655
No 46
>KOG0457 consensus Histone acetyltransferase complex SAGA/ADA, subunit ADA2 [Chromatin structure and dynamics]
Probab=27.53 E-value=1.4e+02 Score=35.39 Aligned_cols=41 Identities=34% Similarity=0.486 Sum_probs=34.9
Q ss_pred cCCCCcHHHHHHHHHhHHhcC-CchHHHHHhhcCCCCc---hHHHHH
Q 002895 496 CSSEWKPIEKELYLKGVEIFG-RNSCLIARNLLSGLKT---CMEVST 538 (869)
Q Consensus 496 ~~~~W~~~E~~l~~k~~~ifg-~NsC~iAr~ll~g~Kt---C~eV~~ 538 (869)
-...|+.-|..|+++++++|| .|+=-||..+ |+|| |.+-|.
T Consensus 71 ~~~~WtadEEilLLea~~t~G~GNW~dIA~hI--GtKtkeeck~hy~ 115 (438)
T KOG0457|consen 71 LDPSWTADEEILLLEAAETYGFGNWQDIADHI--GTKTKEECKEHYL 115 (438)
T ss_pred CCCCCChHHHHHHHHHHHHhCCCcHHHHHHHH--cccchHHHHHHHH
Confidence 357899999999999999999 7999999988 8887 555554
No 47
>KOG4167 consensus Predicted DNA-binding protein, contains SANT and ELM2 domains [Transcription]
Probab=27.42 E-value=71 Score=39.82 Aligned_cols=40 Identities=20% Similarity=0.552 Sum_probs=34.0
Q ss_pred CCCcHHHHHHHHHhHHhcCCchHHHHHhhcCCCCchHHHHHH
Q 002895 498 SEWKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCMEVSTY 539 (869)
Q Consensus 498 ~~W~~~E~~l~~k~~~ifg~NsC~iAr~ll~g~KtC~eV~~y 539 (869)
.-||++|+-||.|++-.|-++|=+|+..| .+||=+|--+|
T Consensus 620 d~WTp~E~~lF~kA~y~~~KDF~~v~km~--~~KtVaqCVey 659 (907)
T KOG4167|consen 620 DKWTPLERKLFNKALYTYSKDFIFVQKMV--KSKTVAQCVEY 659 (907)
T ss_pred ccccHHHHHHHHHHHHHhcccHHHHHHHh--ccccHHHHHHH
Confidence 56999999999999999999999999987 56885555444
No 48
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=24.78 E-value=41 Score=27.95 Aligned_cols=14 Identities=21% Similarity=0.261 Sum_probs=11.0
Q ss_pred EEEecCCCCCCeEE
Q 002895 818 IFAKEHIEASEELF 831 (869)
Q Consensus 818 ifA~RDI~aGEELT 831 (869)
++|.|||++|+.|+
T Consensus 4 vVA~~di~~G~~i~ 17 (63)
T PF08666_consen 4 VVAARDIPAGTVIT 17 (63)
T ss_dssp EEESSTB-TT-BEC
T ss_pred EEEeCccCCCCEEc
Confidence 78999999999995
No 49
>TIGR02726 phenyl_P_delta phenylphosphate carboxylase, delta subunit. Members of this protein family are the delta subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This delta subunit belongs to HAD family hydrolases.
Probab=22.70 E-value=65 Score=33.01 Aligned_cols=49 Identities=12% Similarity=-0.014 Sum_probs=35.6
Q ss_pred eeeEeCCCCeEEEecCCccccCCCcccc----ccCCcccchhhhhHhhhcCCh
Q 002895 147 RIYYDQHGSEALVCSDSEEDIIEPEEEK----HEFSDGEDRILWTVFEEHGLG 195 (869)
Q Consensus 147 riYYD~~g~EalicSdseee~~e~eeek----~~f~~~ed~~~~~~~~e~g~s 195 (869)
+||||+.|+|.-.+|-.+...+.-=.++ --.|.....++++.++.+|+.
T Consensus 22 ~~~~~~~g~~~~~~~~~D~~~~~~L~~~Gi~laIiT~k~~~~~~~~l~~lgi~ 74 (169)
T TIGR02726 22 RIVINDEGIESRNFDIKDGMGVIVLQLCGIDVAIITSKKSGAVRHRAEELKIK 74 (169)
T ss_pred eEEEcCCCcEEEEEecchHHHHHHHHHCCCEEEEEECCCcHHHHHHHHHCCCc
Confidence 7999999999999998877644222112 245566777888888888885
No 50
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=20.86 E-value=1e+02 Score=41.42 Aligned_cols=15 Identities=13% Similarity=0.127 Sum_probs=10.7
Q ss_pred CcEEEEccccCCCCe
Q 002895 731 GWGAFLKNSVSKNDY 745 (869)
Q Consensus 731 G~GLFA~edI~KGef 745 (869)
-+|+-|..+-++|++
T Consensus 1872 kfG~~a~~pCP~G~~ 1886 (2531)
T KOG4289|consen 1872 KFGSPAAVPCPKGSS 1886 (2531)
T ss_pred ccCCcccccCCCCcc
Confidence 467777777777765
No 51
>PF14100 PmoA: Methane oxygenase PmoA
Probab=20.36 E-value=97 Score=34.11 Aligned_cols=43 Identities=28% Similarity=0.321 Sum_probs=32.7
Q ss_pred ccccCCCCCCcceeEEEECCeeEEEE------EEecCCCCCCeEEEecCC
Q 002895 793 KFANHSSNPNCFAKVMLVAGDHRVGI------FAKEHIEASEELFYDYRY 836 (869)
Q Consensus 793 RFINHSC~PNc~~~~v~v~G~~RI~i------fA~RDI~aGEELTfDYgy 836 (869)
-||+|--+||- ...|.+.+...+++ ..--.|++||.|++.|+.
T Consensus 204 ~~~dhP~N~~~-P~~W~vR~~g~~~~~p~~~~~~~~~l~~G~~l~~rYr~ 252 (271)
T PF14100_consen 204 AILDHPSNPNY-PTPWHVRGYGLFGANPAPAFDGPLTLPPGETLTLRYRV 252 (271)
T ss_pred EEEeCCCCCCC-CcceEEeccCcceecccccccCceecCCCCeEEEEEEE
Confidence 48899998875 46788876655544 445679999999999974
Done!