Query 003198
Match_columns 840
No_of_seqs 386 out of 1527
Neff 4.9
Searched_HMMs 46136
Date Thu Mar 28 18:55:40 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/003198.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/003198hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1079 Transcriptional repres 100.0 9E-127 2E-131 1070.9 27.5 689 22-836 37-739 (739)
2 KOG4442 Clathrin coat binding 100.0 9.4E-44 2E-48 403.6 14.5 189 591-823 64-257 (729)
3 KOG1080 Histone H3 (Lys4) meth 100.0 1.3E-30 2.9E-35 312.4 11.1 134 690-823 865-1002(1005)
4 KOG1082 Histone H3 (Lys9) meth 99.9 6.2E-28 1.4E-32 266.9 11.7 142 658-809 153-323 (364)
5 smart00317 SET SET (Su(var)3-9 99.9 1.4E-23 3.1E-28 190.3 11.9 113 693-805 2-116 (116)
6 KOG1083 Putative transcription 99.9 2E-24 4.3E-29 252.7 4.9 132 679-810 1165-1298(1306)
7 KOG1141 Predicted histone meth 99.7 4.6E-19 9.9E-24 203.7 4.3 73 752-824 1179-1260(1262)
8 KOG1085 Predicted methyltransf 99.7 2.9E-18 6.3E-23 180.0 8.9 124 685-808 250-379 (392)
9 COG2940 Proteins containing SE 99.5 1.3E-15 2.9E-20 174.5 2.6 131 680-810 321-453 (480)
10 PF00856 SET: SET domain; Int 99.5 4.6E-14 9.9E-19 133.0 5.7 105 702-806 1-162 (162)
11 KOG1081 Transcription factor N 98.9 5.2E-10 1.1E-14 128.1 2.0 115 678-808 301-417 (463)
12 KOG2589 Histone tail methylase 98.6 4.6E-08 9.9E-13 107.0 5.0 114 701-820 137-254 (453)
13 KOG2461 Transcription factor B 98.1 2.7E-06 5.8E-11 96.2 5.4 108 689-808 26-145 (396)
14 PF00249 Myb_DNA-binding: Myb- 93.1 0.13 2.8E-06 41.4 3.9 46 174-221 1-48 (48)
15 PF13921 Myb_DNA-bind_6: Myb-l 92.4 0.12 2.7E-06 42.9 3.1 43 177-222 1-45 (60)
16 smart00717 SANT SANT SWI3, AD 92.3 0.12 2.6E-06 39.8 2.8 46 175-222 2-48 (49)
17 smart00717 SANT SANT SWI3, AD 92.2 0.34 7.5E-06 37.2 5.2 42 474-517 2-44 (49)
18 cd00167 SANT 'SWI3, ADA2, N-Co 92.0 0.36 7.7E-06 36.6 5.0 41 475-517 1-42 (45)
19 KOG1171 Metallothionein-like p 91.4 0.046 9.9E-07 62.3 -0.8 63 592-655 131-244 (406)
20 smart00570 AWS associated with 89.7 0.13 2.8E-06 42.9 0.7 12 678-689 39-50 (51)
21 KOG4442 Clathrin coat binding 89.7 0.45 9.7E-06 57.3 5.3 35 600-634 83-120 (729)
22 cd00167 SANT 'SWI3, ADA2, N-Co 88.9 0.37 8.1E-06 36.5 2.7 43 176-220 1-44 (45)
23 PF03638 TCR: Tesmin/TSO1-like 87.7 0.29 6.2E-06 39.3 1.3 28 628-655 3-30 (42)
24 PF09111 SLIDE: SLIDE; InterP 84.0 0.89 1.9E-05 44.0 3.0 50 173-222 48-111 (118)
25 PF00249 Myb_DNA-binding: Myb- 79.2 5.2 0.00011 32.0 5.4 43 474-517 2-45 (48)
26 KOG1337 N-methyltransferase [G 78.9 1.4 3E-05 51.5 2.8 40 765-807 239-278 (472)
27 PF05033 Pre-SET: Pre-SET moti 78.8 1.4 3.1E-05 40.6 2.3 22 590-611 44-67 (103)
28 KOG1141 Predicted histone meth 77.2 4.7 0.0001 49.7 6.4 54 679-732 993-1054(1262)
29 smart00570 AWS associated with 75.9 0.99 2.1E-05 37.7 0.4 8 622-629 20-27 (51)
30 PF03638 TCR: Tesmin/TSO1-like 75.8 1.6 3.4E-05 35.2 1.4 37 591-628 2-40 (42)
31 PF05033 Pre-SET: Pre-SET moti 75.1 1.6 3.5E-05 40.2 1.6 16 612-627 88-103 (103)
32 PF13921 Myb_DNA-bind_6: Myb-l 72.5 7.4 0.00016 32.3 4.8 41 476-518 1-41 (60)
33 KOG2084 Predicted histone tail 72.5 4.2 9.2E-05 46.1 4.4 38 765-806 208-246 (482)
34 PLN03212 Transcription repress 64.4 7.3 0.00016 42.3 3.9 52 169-223 73-125 (249)
35 TIGR01557 myb_SHAQKYF myb-like 60.8 20 0.00044 30.5 5.2 44 474-518 4-52 (57)
36 PF14774 FAM177: FAM177 family 57.0 18 0.00038 35.6 4.7 66 143-211 18-97 (123)
37 PLN03091 hypothetical protein; 51.3 18 0.00038 42.4 4.3 53 168-223 61-114 (459)
38 KOG1082 Histone H3 (Lys9) meth 49.6 12 0.00026 42.5 2.7 41 589-629 104-170 (364)
39 COG5259 RSC8 RSC chromatin rem 47.0 18 0.00038 42.5 3.5 44 472-517 278-322 (531)
40 PLN03212 Transcription repress 45.1 19 0.00041 39.2 3.2 46 174-221 25-72 (249)
41 KOG3813 Uncharacterized conser 43.7 11 0.00025 44.4 1.3 13 760-772 470-482 (640)
42 PF08271 TF_Zn_Ribbon: TFIIB z 33.5 50 0.0011 26.1 3.1 33 144-177 7-43 (43)
43 KOG1081 Transcription factor N 32.9 14 0.0003 43.7 -0.2 105 700-806 122-242 (463)
44 KOG4167 Predicted DNA-binding 32.7 50 0.0011 40.9 4.3 40 474-515 620-659 (907)
45 PF00856 SET: SET domain; Int 30.2 29 0.00063 32.4 1.6 17 787-803 2-18 (162)
46 PRK09430 djlA Dna-J like membr 28.2 60 0.0013 35.6 3.8 49 176-224 145-228 (267)
47 KOG1079 Transcriptional repres 28.1 30 0.00065 42.3 1.5 29 27-55 18-51 (739)
48 PF08666 SAF: SAF domain; Int 27.5 35 0.00075 28.2 1.4 15 788-802 3-17 (63)
49 KOG0457 Histone acetyltransfer 25.6 1E+02 0.0022 36.2 5.1 39 473-513 72-111 (438)
50 KOG1338 Uncharacterized conser 24.7 50 0.0011 38.4 2.4 44 761-810 217-263 (466)
51 smart00760 Bac_DnaA_C Bacteria 24.4 64 0.0014 27.2 2.5 22 196-217 3-24 (60)
52 PLN03142 Probable chromatin-re 23.5 77 0.0017 41.2 4.0 48 174-221 926-984 (1033)
53 TIGR02726 phenyl_P_delta pheny 23.5 63 0.0014 33.0 2.7 49 147-195 22-74 (169)
54 PF14100 PmoA: Methane oxygena 22.8 93 0.002 34.1 4.0 102 691-807 143-252 (271)
55 smart00286 PTI Plant trypsin i 21.6 63 0.0014 24.2 1.6 20 592-611 7-26 (29)
56 KOG3813 Uncharacterized conser 21.3 42 0.00092 39.9 1.0 36 591-626 306-348 (640)
57 cd00150 PlantTI Plant trypsin 21.3 63 0.0014 23.9 1.5 20 592-611 5-24 (27)
58 PF13404 HTH_AsnC-type: AsnC-t 21.2 1E+02 0.0023 24.5 2.9 38 182-221 5-42 (42)
59 smart00468 PreSET N-terminal t 21.1 70 0.0015 29.4 2.2 21 590-610 47-69 (98)
60 KOG3988 Protein-tyrosine sulfo 20.9 66 0.0014 36.0 2.3 21 186-206 122-143 (378)
61 PF03656 Pam16: Pam16; InterP 20.4 78 0.0017 31.3 2.5 36 186-227 54-89 (127)
No 1
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=100.00 E-value=9e-127 Score=1070.87 Aligned_cols=689 Identities=36% Similarity=0.545 Sum_probs=541.1
Q ss_pred CcccchHHHHHHHHHHHHHHHHHHHHHHHHHHHhhHHHHHHHhhhhcccccchhcccCCCC----Cc---CCccccCCCC
Q 003198 22 DGLGNLTYKLNQLKKQVQAERVVSVKDKIEKNRKKIENDISQLLSTTSRKSVIFAMDNGFG----NM---PLCKYSGFPQ 94 (840)
Q Consensus 22 ~~~~~L~~~i~~LKkqi~~~R~~~ik~k~e~n~~~l~~~~~~~~~~~~~~~~~~~~~~~~~----~~---~l~~~~g~~~ 94 (840)
+.++.+...+..+|+ ++..++.+++++-..++.+...+|+-+- +++.+. .......+ .| |++++||+..
T Consensus 37 ~~~e~i~~~~~E~k~-~~~~~~~~~~~~~~~~r~k~~~~~~~~~-~~~~~~--~i~~~n~~~~v~~~~~~~~~q~nfmv~ 112 (739)
T KOG1079|consen 37 DRLEKIKILNCEWKK-RRLKPVRSAKEVDGDIRVKVDLDTSIFD-FPSQKS--PINELNAVAQVPIMYSWPPLQQNFMVE 112 (739)
T ss_pred HHHHHHHHHHHHHhh-hhcccccccccccccccccccccccccc-Cccccc--chhhhcccccccccccCChhhhcceec
Confidence 456666666666666 7888888888888888888888888775 555532 22222222 22 9999999999
Q ss_pred CCCCCCcccccccccccccccccCCCCCCCceeEEeeccccccccccccccceeeEeCCCCeEEEecCCccccCCCcccc
Q 003198 95 GLGDRDYVNSHEVVLSTSSKLSHVQKIPPYTTWIFLDKNQRMAEDQSVVGRRRIYYDQHGSEALVCSDSEEDIIEPEEEK 174 (840)
Q Consensus 95 ~~~d~d~~~~~~v~~~~~iklp~v~klPpYTtWifldrNqrMaedqsvvgrrriYYd~~g~Ealicsdseee~~e~eeek 174 (840)
+..+.+++...++. +..||+|++|.|+|||+|||+||||||++||+|||+|+||| |.|||++| ||+||| ++++|||
T Consensus 113 ~~~~~~~ip~~~~~-v~~~k~~~ieel~~y~~~v~~dr~~~~~~d~v~ve~~~a~~-Q~~~e~dg-~D~~~e-~~~~~ek 188 (739)
T KOG1079|consen 113 DETVLHNIPYMGDE-VLDIKGPFIEELIKYDGKVHGDRNQRFMEDQVFVELVVALY-QYGGEHDG-SDDEEE-EVLEEEK 188 (739)
T ss_pred ccceeccccccccc-ccccccchhhhcccccceeeccccccchhhhhHHHHHHHHH-hcCCcccc-CCCccc-cchhhhc
Confidence 99999888877754 68899999999999999999999999999999999999999 99999999 999999 8889999
Q ss_pred ccCCcccch-hhhhHHhhcCChHHHHHHHHHHhc--CCcHHHHHHHHHhHhhcCCCCCccccccccccccchhhh-hhHH
Q 003198 175 HEFSDGEDR-ILWTVFEEHGLGEEVINAVSQFIG--IATSEVQDRYSTLKEKYDGKNLKEFEDAGHERGIALEKS-LSAA 250 (840)
Q Consensus 175 ~~f~~~ed~-~~~~~~~e~g~~~~v~~~l~~~~~--~~~sei~eRy~~L~~k~~~~~~~~~~~~~~~~~~~l~k~-l~~a 250 (840)
++|.+++|. ++|++.+..+++++||.+|+++|- ++++||+|||.+|+++..+...+...+.++. ++.+++. ++++
T Consensus 189 r~~~e~~~~~~~~~~~~~~~~~~~if~~~~~~f~~k~~~~~lke~~~~l~~~~~p~~~e~~~~~~id-~~~ae~~~r~~~ 267 (739)
T KOG1079|consen 189 RDFLEGEDDDIIESINKLSFPADKIFQAISSMFPDKLTASELKERYGELTSKSLPVAEEPECTPNID-GSSAEPVQREQA 267 (739)
T ss_pred ccccCcccchhhHhhhhhccchHHHHHHHhhhcccccchhhhhHHHhhhhhccccccCCcccccCCC-ccccChHHHHhh
Confidence 999999999 899999999999999999999995 9999999999999998655444444443443 4556665 9999
Q ss_pred hhcccccccccccccccCCcCcCCCccCCCCCCCCCCCCCCCCCCCCCCCccccccccccCCCCCCCccccccccCCccc
Q 003198 251 LDSFDNLFCRRCLLFDCRLHGCSQTLINPSRAVQDTVEGSAGNISSIITNTEGTLLHCNAEVPGAHSDIMAGERCNSKRV 330 (840)
Q Consensus 251 ldsfdnlFCRRClvfDC~lHgcsq~li~p~ekq~~~~~~~~~~~~~~pCg~~Cy~~~~~~~~~~~~~~~~~~~~~~~~~~ 330 (840)
|||||||||||||+||||||| +|.++||.++...|.++... ..|||+.||.+...+... +...+ +
T Consensus 268 l~sF~tlfCrrCl~ydC~lHg-~~~~~~pn~~~r~e~~~a~~---~~pc~p~~~~~l~~~~~~--~m~~~---------~ 332 (739)
T KOG1079|consen 268 LHSFHTLFCRRCLKYDCFLHG-SQFHAFPNTKKRKEDEPALE---NEPCGPGCYGLLEGAKEK--TMSAV---------V 332 (739)
T ss_pred hcccccceeeeeeeeeccccC-ccccccccccccCCCCcccc---ccCCCCchhhhhhccchh--hhhcc---------c
Confidence 999999999999999999999 99999999999999998876 999999999987532211 11101 0
Q ss_pred CCCcccccCCcccccCCCCCCcccccccccccccccCccchhhhHHHHHHHHhhhccccccccccCCCCCCCCCCCCCCc
Q 003198 331 LPVTSEAVDSSEVAIGNENTDTSMQSLGKRKALELNDSVKVFDEIEESLNKKQKKLLPLDVLTASSDGIPRPDTKSGHHV 410 (840)
Q Consensus 331 ~~~~s~~~~~~~~~~~~~~~~~~s~~~~~~~~~~l~~s~~~~~~~~~~~~k~~k~~~~~~~~~~~~~~~~~~d~~~~~~~ 410 (840)
+... ++. . .++||..-..+|++.+. |. ..+
T Consensus 333 -----~~~~----p~~------------g--------------------~~~qk~~~~~~~~s~~~--~~-------~~e 362 (739)
T KOG1079|consen 333 -----SKCP----PIR------------G--------------------DIRQKLVKASSMDSDDE--HV-------EEE 362 (739)
T ss_pred -----ccCC----CCc------------c--------------------hhhhhhcccccCCcchh--hc-------ccc
Confidence 0000 000 0 02233222222222111 00 000
Q ss_pred ccccccccccccccccccccccccccccccccccccCCccCCCCcccccCCCCCCCccccccCCCCcHHHHHHHHHhhhh
Q 003198 411 GAINDNELQMTSKNTIKKSVSAKVVSHNNIEHNIMDGAKDVNKEPEMKQSFSKGELPEGVLCSSEWKPIEKELYLKGVEI 490 (840)
Q Consensus 411 ~~~~~~~~~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~W~~~E~~L~~k~v~~ 490 (840)
+.-..+.....++ .++... -....++.........+.++...... ...+|+++|+.||++++.+
T Consensus 363 ~~g~~~d~~v~~~------~~~~~~-~v~~~~~~~~s~~~~~c~~~~~~~~~---------~~~ew~~~ek~~fr~~~~~ 426 (739)
T KOG1079|consen 363 DKGHDDDDGVPRG------FGGSVN-FVGEDDTSTHSSTNSICQNPVHGKKD---------TNVEWNGAEKVLFRVGSTL 426 (739)
T ss_pred ccCcccccccccc------cccccc-cccCCcccccccccccccCcccccCC---------cccccchhhhHHHHhcccc
Confidence 0000000111000 000000 00001122222222233322111111 2468999999999999999
Q ss_pred cCCchHHHHHhhhCCCCcHHHHHHHHhhcCCCCCCCCCCCCccccccccccchhhhhcCCCchhHHhhhhcccccccccC
Q 003198 491 FGRNSCLIARNLLSGLKTCMEVSTYMRDSSSSMPHKSVAPSSFLEETVKVDTDYAEQEMPARPRLLRRRGRARKLKYSWK 570 (840)
Q Consensus 491 fg~N~C~iA~~ll~g~KTC~EV~~ym~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~r~~r~~~r~rk~k~~~k 570 (840)
||.|+|+|||+| ++|||++||+||..+.... ++.... .......+.|+|.+|+.|+.|+..+.|+
T Consensus 427 ~~~n~c~Iar~l--~~ktC~~v~~~~~~e~~~~--------~~~~~~-----~~~~~~~~~r~~~~r~~g~~r~k~q~kk 491 (739)
T KOG1079|consen 427 YGTNRCSIARNL--LTKTCRQVYEYEQKEVLQG--------LYFDGR-----FRVELPGPKRARKLRLWGRHRRKIQNKK 491 (739)
T ss_pred ccchhhHHHHHh--cchHHHHHHHHhhcchhhc--------eecccc-----cccccCcchhhHHHHhhhhHHHhhhccc
Confidence 999999999999 4599999999999765311 111100 0001234556888999999999999999
Q ss_pred CCCCCccchhcccCCcCCCccccCCCCCCCC--CCCCcccCCCccccCCCCCCcccccccCCcccCCCCccCCCcccccc
Q 003198 571 SAGHPSIWKRIADGKNQSCKQYTPCGCQSMC--GKQCPCLHNGTCCEKYCGCSKSCKNRFRGCHCAKSQCRSRQCPCFAA 648 (840)
Q Consensus 571 s~~~p~~~kri~~~k~~~~~~y~PC~c~~~C--~~~C~C~~~g~~Ce~~CgC~~~C~nRf~GC~C~~~~C~t~~CpC~~a 648 (840)
++.|+.+|. |+||+|+++| +.+|+|+.++++||++|+|+.+|.|||+||+| ++||++++|||++|
T Consensus 492 ~~~~~~v~~------------~qpC~hp~~c~c~~~C~C~~n~~~CEk~C~C~~dC~nrF~GC~C-k~QC~tkqCpC~~A 558 (739)
T KOG1079|consen 492 DSRHTVVWN------------YQPCDHPGPCNCGVGCPCIDNETFCEKFCYCSPDCRNRFPGCRC-KAQCNTKQCPCYLA 558 (739)
T ss_pred ccCCceeee------------cCcccCCCCCCCCCCCcccccCcchhhcccCCHHHHhcCCCCCc-ccccccCcCchhhh
Confidence 999977774 7777777554 68999999999999999999999999999999 99999999999999
Q ss_pred ccccCCCCCCCCccCCCCCCCCCCCCCCCC-CCCchHhhhcccccEEEEecCCCCceEEeccccCCCCeEEeccccccCH
Q 003198 649 GRECDPDVCRNCWVSCGDGSLGEPPKRGDG-QCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISH 727 (840)
Q Consensus 649 ~rECdPd~C~~C~~~Cg~~~l~~p~~~~~~-~C~N~~lq~g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~ 727 (840)
+|||||++|..||+ + +..+++. .|+|+.+|+++++++.|++|.+.|||||+++.+.|++||.||+||+|++
T Consensus 559 ~rECdPd~Cl~cg~-~-------~~~d~~~~~C~N~~l~~~~qkr~llapSdVaGwGlFlKe~v~KnefisEY~GE~IS~ 630 (739)
T KOG1079|consen 559 VRECDPDVCLMCGN-V-------DHFDSSKISCKNTNLQRGEQKRVLLAPSDVAGWGLFLKESVSKNEFISEYTGEIISH 630 (739)
T ss_pred ccccCchHHhccCc-c-------cccccCccccccchhhhhhhcceeechhhccccceeeccccCCCceeeeecceeccc
Confidence 99999999999986 1 2233444 9999999999999999999999999999999999999999999999999
Q ss_pred HHHHHHhhhhcccCccccccCCCcEEEeccccCCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCC
Q 003198 728 READKRGKIYDRANSSFLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRY 807 (840)
Q Consensus 728 ~Ea~rR~k~yd~~~~sYlf~L~~~~~IDA~~~GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy 807 (840)
+||++|+++|+..+.+|+|+|+.+++|||+++||.+||+|||-+|||++++++|+|++||||||+|.|.+||||||||+|
T Consensus 631 dEADrRGkiYDr~~cSflFnln~dyviDs~rkGnk~rFANHS~nPNCYAkvm~V~GdhRIGifAkRaIeagEELffDYrY 710 (739)
T KOG1079|consen 631 DEADRRGKIYDRYMCSFLFNLNNDYVIDSTRKGNKIRFANHSFNPNCYAKVMMVAGDHRIGIFAKRAIEAGEELFFDYRY 710 (739)
T ss_pred hhhhhcccccccccceeeeeccccceEeeeeecchhhhccCCCCCCcEEEEEEecCCcceeeeehhhcccCceeeeeecc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred CCCCCccccCCCCCCCCCCCCcccccccc
Q 003198 808 GPDQAPAWARKPEGSKREDSSVSQGRAKK 836 (840)
Q Consensus 808 ~~d~~pcwc~~pe~~~~d~~~~s~gra~k 836 (840)
+.+.++-|-+.+..+++++....+.+++|
T Consensus 711 s~~~~~k~~~~~~~s~k~e~~~~q~~~~~ 739 (739)
T KOG1079|consen 711 SPEHALKFVGIERESYKVELKIFQATQQK 739 (739)
T ss_pred CccccccccccCccccccchhhhhhhcCC
Confidence 99999999999999999998888888775
No 2
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=100.00 E-value=9.4e-44 Score=403.56 Aligned_cols=189 Identities=31% Similarity=0.600 Sum_probs=170.3
Q ss_pred cccCCCCCCCCCCCCcccCCCccccCCCCCCcccccccCCcccCCCCccCCCccccccccccCCCCCCCCccCCCCCCCC
Q 003198 591 QYTPCGCQSMCGKQCPCLHNGTCCEKYCGCSKSCKNRFRGCHCAKSQCRSRQCPCFAAGRECDPDVCRNCWVSCGDGSLG 670 (840)
Q Consensus 591 ~y~PC~c~~~C~~~C~C~~~g~~Ce~~CgC~~~C~nRf~GC~C~~~~C~t~~CpC~~a~rECdPd~C~~C~~~Cg~~~l~ 670 (840)
..+-|+|...-+ .--...|.|+.+|.||+. ..||.++.|..|++
T Consensus 64 ~~m~Cdc~~~~~---------d~~n~~~~cg~~CiNr~t-------------------~iECs~~~C~~cg~-------- 107 (729)
T KOG4442|consen 64 DEMICDCKPKTG---------DGANGACACGEDCINRMT-------------------SIECSDRECPRCGV-------- 107 (729)
T ss_pred cceeeecccccc---------cccccccccCccccchhh-------------------hcccCCccCCCccc--------
Confidence 566777764322 112467999999999986 47788888887643
Q ss_pred CCCCCCCCCCCchHhhhcccccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccC--ccccccC
Q 003198 671 EPPKRGDGQCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN--SSFLFDL 748 (840)
Q Consensus 671 ~p~~~~~~~C~N~~lq~g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~--~sYlf~L 748 (840)
.|+|++||+.+..+|+||.+..+||||+|.++|++|+||+||+||||+..|+++|.+.|+..+ ++|+|.|
T Consensus 108 --------~C~NQRFQkkqyA~vevF~Te~KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~d~~kh~Yfm~L 179 (729)
T KOG4442|consen 108 --------YCKNQRFQKKQYAKVEVFLTEKKGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAKDGIKHYYFMAL 179 (729)
T ss_pred --------cccchhhhhhccCceeEEEecCcccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHhcCCceEEEEEe
Confidence 799999999999999999999999999999999999999999999999999999999999875 5788899
Q ss_pred CCcEEEeccccCCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecC---CCCCCCccccCCCCCCC
Q 003198 749 NDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYR---YGPDQAPAWARKPEGSK 823 (840)
Q Consensus 749 ~~~~~IDA~~~GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYg---y~~d~~pcwc~~pe~~~ 823 (840)
....+||||.+||++|||||||+|||++++|.|+|..||||||.|.|++||||||||+ ||.+.+||+||.++|++
T Consensus 180 ~~~e~IDAT~KGnlaRFiNHSC~PNa~~~KWtV~~~lRvGiFakk~I~~GEEITFDYqf~rYGr~AQ~CyCgeanC~G 257 (729)
T KOG4442|consen 180 QGGEYIDATKKGNLARFINHSCDPNAEVQKWTVPDELRVGIFAKKVIKPGEEITFDYQFDRYGRDAQPCYCGEANCRG 257 (729)
T ss_pred cCCceecccccCcHHHhhcCCCCCCceeeeeeeCCeeEEEEeEecccCCCceeeEecccccccccccccccCCccccc
Confidence 9999999999999999999999999999999999999999999999999999999996 78899999999999984
No 3
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=99.96 E-value=1.3e-30 Score=312.45 Aligned_cols=134 Identities=40% Similarity=0.755 Sum_probs=127.1
Q ss_pred cccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccC--ccccccCCCcEEEeccccCCcccccc
Q 003198 690 QQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN--SSFLFDLNDQYVLDAYRKGDKLKFAN 767 (840)
Q Consensus 690 ~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~--~sYlf~L~~~~~IDA~~~GN~aRFIN 767 (840)
++.|.++++.+|||||||+++|.+|++|+||+||+|.+.-|+.|+..|...+ .+|||.++...+|||+++||+|||||
T Consensus 865 kk~~~F~~s~iH~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~~Y~~~gi~~sYlfrid~~~ViDAtk~gniAr~In 944 (1005)
T KOG1080|consen 865 KKYVKFGRSGIHGWGLFAMENIAAGDMVIEYRGELVRSSIADLREARYERMGIGDSYLFRIDDEVVVDATKKGNIARFIN 944 (1005)
T ss_pred hhhhccccccccccceeeccCccccceEEEeeceehhhhHHHHHHHHHhccCcccceeeecccceEEeccccCchhheee
Confidence 3458899999999999999999999999999999999999999999999886 79999999999999999999999999
Q ss_pred CCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCCC--CCCccccCCCCCCC
Q 003198 768 HSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGP--DQAPAWARKPEGSK 823 (840)
Q Consensus 768 HSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~~--d~~pcwc~~pe~~~ 823 (840)
|||+|||+++++.|+|+.+|+|||.|+|.+||||||||.|.. +..||+|+.|+|++
T Consensus 945 HsC~PNCyakvi~V~g~~~IvIyakr~I~~~EElTYDYkF~~e~~kipClCgap~Crg 1002 (1005)
T KOG1080|consen 945 HSCNPNCYAKVITVEGDKRIVIYSKRDIAAGEELTYDYKFPTEDDKIPCLCGAPNCRG 1002 (1005)
T ss_pred cccCCCceeeEEEecCeeEEEEEEecccccCceeeeeccccccccccccccCCCcccc
Confidence 999999999999999999999999999999999999999854 45799999999985
No 4
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=99.95 E-value=6.2e-28 Score=266.85 Aligned_cols=142 Identities=25% Similarity=0.460 Sum_probs=119.8
Q ss_pred CCCccCCCCCCCCCCCCCCCCCCCchHhhhcccccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhh
Q 003198 658 RNCWVSCGDGSLGEPPKRGDGQCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIY 737 (840)
Q Consensus 658 ~~C~~~Cg~~~l~~p~~~~~~~C~N~~lq~g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~y 737 (840)
.+|+..|+|+ ..|.|+.+|.+.+.+++|++++.+||||++.+.|++|+||+||+||+++..++++|...+
T Consensus 153 ~EC~~~C~C~----------~~C~nRv~q~g~~~~leIfrt~~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~~~~ 222 (364)
T KOG1082|consen 153 FECSVACGCH----------PDCANRVVQKGLQFHLEVFRTPEKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRTHLR 222 (364)
T ss_pred cccccCCCCC----------CcCcchhhccccccceEEEecCCceeeecccccccCCCeeEEEeeEecChHHhhhccccc
Confidence 3677778775 599999999999999999999999999999999999999999999999999999884322
Q ss_pred ccc----Cccccc---------------------cCCCcEEEeccccCCccccccCCCCCCcceeEEEEcCe----eEEE
Q 003198 738 DRA----NSSFLF---------------------DLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGD----HRVG 788 (840)
Q Consensus 738 d~~----~~sYlf---------------------~L~~~~~IDA~~~GN~aRFINHSC~PNc~~~~v~V~g~----~rI~ 788 (840)
+.. +..+.+ .....+.|||...||++|||||||.||+.+..+..++. .+|+
T Consensus 223 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ida~~~GNv~RfinHSC~PN~~~~~v~~~~~~~~~~~i~ 302 (364)
T KOG1082|consen 223 EYLDDDCDAYSIADREWVDESPVGNTFVAPSLPGGPGRELLIDAKPHGNVARFINHSCSPNLLYQAVFQDEFVLLYLRIG 302 (364)
T ss_pred cccccccccchhhhccccccccccccccccccccCCCcceEEchhhcccccccccCCCCccceeeeeeecCCccchheee
Confidence 221 111111 11345999999999999999999999999988887743 5899
Q ss_pred EEEccCCCCCCeEEEecCCCC
Q 003198 789 IFAKEHIEASEELFYDYRYGP 809 (840)
Q Consensus 789 ifA~RdI~aGEELTfDYgy~~ 809 (840)
|||+++|.||||||||||...
T Consensus 303 ffa~~~I~p~~ELT~dYg~~~ 323 (364)
T KOG1082|consen 303 FFALRDISPGEELTLDYGKAY 323 (364)
T ss_pred eeeccccCCCcccchhhcccc
Confidence 999999999999999999653
No 5
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.90 E-value=1.4e-23 Score=190.33 Aligned_cols=113 Identities=41% Similarity=0.736 Sum_probs=102.6
Q ss_pred EEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccC--ccccccCCCcEEEeccccCCccccccCCC
Q 003198 693 ILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN--SSFLFDLNDQYVLDAYRKGDKLKFANHSS 770 (840)
Q Consensus 693 v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~--~sYlf~L~~~~~IDA~~~GN~aRFINHSC 770 (840)
++++.++.+|+||||+.+|++|++|++|.|.++...++..+...|.... ..|+|.....++||+...||++|||||||
T Consensus 2 ~~~~~~~~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~id~~~~~~~~~~iNHsc 81 (116)
T smart00317 2 LEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERSKAYDTDGADSFYLFEIDSDLCIDARRKGNIARFINHSC 81 (116)
T ss_pred cEEEecCCCcEEEEECCccCCCCEEEEEEeEEECHHHHHHHHHHHHhcCCCCEEEEECCCCEEEeCCccCcHHHeeCCCC
Confidence 5677888999999999999999999999999999998888765555554 38899988889999999999999999999
Q ss_pred CCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEec
Q 003198 771 NPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDY 805 (840)
Q Consensus 771 ~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDY 805 (840)
.||+.+..+..++..+|.|+|+|+|++|||||+||
T Consensus 82 ~pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~i~Y 116 (116)
T smart00317 82 EPNCELLFVEVNGDSRIVIFALRDIKPGEELTIDY 116 (116)
T ss_pred CCCEEEEEEEECCCcEEEEEECCCcCCCCEEeecC
Confidence 99999998888888899999999999999999999
No 6
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=99.89 E-value=2e-24 Score=252.67 Aligned_cols=132 Identities=29% Similarity=0.572 Sum_probs=124.4
Q ss_pred CCCchHhhh-cccccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHH-hhhhcccCccccccCCCcEEEec
Q 003198 679 QCGNMRLLL-RQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKR-GKIYDRANSSFLFDLNDQYVLDA 756 (840)
Q Consensus 679 ~C~N~~lq~-g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR-~k~yd~~~~sYlf~L~~~~~IDA 756 (840)
.|.|+++++ +...+|.++..+.+||||.|+++|++|+||+||+||||+..+++.| +..|.....+|+..+..+.+||+
T Consensus 1165 ~c~nqrm~r~e~cp~L~v~~gp~~G~~v~tk~PikagtfI~EYvGeVit~ke~e~~mmtl~~~d~~~~cL~I~p~l~id~ 1244 (1306)
T KOG1083|consen 1165 SCSNQRMQRHEECPPLEVFRGPKKGWGVRTKEPIKAGTFIMEYVGEVITEKEFEPRMMTLYHNDDDHYCLVIDPGLFIDI 1244 (1306)
T ss_pred hhhhHHhhhhccCCCcceeccCCCCccccccccccccchHHHHHHHHHHHHhhcccccccCCCCCcccccccCccccCCh
Confidence 488999986 5778899999999999999999999999999999999999999988 67888888999999999999999
Q ss_pred cccCCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCCCC
Q 003198 757 YRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGPD 810 (840)
Q Consensus 757 ~~~GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~~d 810 (840)
.++||.+||+||+|.|||.++.|.|+|..||++||+|||.+||||||||++-..
T Consensus 1245 ~R~~n~~RfinhscKPNc~~qkwSVNG~~Rv~L~A~rDi~kGEELtYDYN~ks~ 1298 (1306)
T KOG1083|consen 1245 PRMGNGARFINHSCKPNCEMQKWSVNGEYRVGLFALRDLPKGEELTYDYNFKSF 1298 (1306)
T ss_pred hhccccccccccccCCCCccccccccceeeeeeeecCCCCCCceEEEecccccc
Confidence 999999999999999999999999999999999999999999999999986543
No 7
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=99.75 E-value=4.6e-19 Score=203.68 Aligned_cols=73 Identities=30% Similarity=0.517 Sum_probs=65.7
Q ss_pred EEEeccccCCccccccCCCCCCcceeEEEEcCe----eEEEEEEccCCCCCCeEEEecCCCCCC-----CccccCCCCCC
Q 003198 752 YVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGD----HRVGIFAKEHIEASEELFYDYRYGPDQ-----APAWARKPEGS 822 (840)
Q Consensus 752 ~~IDA~~~GN~aRFINHSC~PNc~~~~v~V~g~----~rI~ifA~RdI~aGEELTfDYgy~~d~-----~pcwc~~pe~~ 822 (840)
|+|||...||++||+||||.||+.++.++|+.. +.|+|||.+-|+||+||||||+|..+. -.|.||.-+|+
T Consensus 1179 yvIDAk~eGNlGRfLNHSC~PNl~VQnVfvdTHdlrfPwVAFFt~kyVkAgtELTWDY~Ye~g~v~~keL~C~CGa~~Cr 1258 (1262)
T KOG1141|consen 1179 YVIDAKQEGNLGRFLNHSCDPNLHVQNVFVDTHDLRFPWVAFFTRKYVKAGTELTWDYQYEQGQVATKELTCHCGAENCR 1258 (1262)
T ss_pred EEEecccccchhhhhccCCCccceeeeeeeeccccCCchhhhhhhhhhccCceeeeeccccccccccceEEEecChhhhh
Confidence 899999999999999999999999999999854 679999999999999999999997654 46889988887
Q ss_pred CC
Q 003198 823 KR 824 (840)
Q Consensus 823 ~~ 824 (840)
++
T Consensus 1259 gr 1260 (1262)
T KOG1141|consen 1259 GR 1260 (1262)
T ss_pred cc
Confidence 54
No 8
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=99.74 E-value=2.9e-18 Score=179.96 Aligned_cols=124 Identities=28% Similarity=0.419 Sum_probs=108.6
Q ss_pred hhhcccccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccCc----cccc-cCCCcEEEecccc
Q 003198 685 LLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANS----SFLF-DLNDQYVLDAYRK 759 (840)
Q Consensus 685 lq~g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~~----sYlf-~L~~~~~IDA~~~ 759 (840)
+..+....+.+..-.++|.||+|+..+.+|+||.||.|.+|.-.|+..|+..|..... .|+| .++..|+|||++-
T Consensus 250 vl~g~~egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdliei~eAk~rE~~Ya~De~~GcYMYyF~h~sk~yCiDAT~e 329 (392)
T KOG1085|consen 250 VLKGTNEGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLIEISEAKVREEQYANDEEIGCYMYYFEHNSKKYCIDATKE 329 (392)
T ss_pred HHhccccceeEEeeccccceeEeecccccCceEEEEecceeeechHHHHHHHhccCcccceEEEeeeccCeeeeeecccc
Confidence 3445566778888888999999999999999999999999999999999999977632 3555 4567899999976
Q ss_pred -CCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCC
Q 003198 760 -GDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYG 808 (840)
Q Consensus 760 -GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~ 808 (840)
+-++|.||||--+||..+++.++|.+++.++|.|||.+||||+||||-.
T Consensus 330 t~~lGRLINHS~~gNl~TKvv~Idg~pHLiLvA~rdIa~GEELlYDYGDR 379 (392)
T KOG1085|consen 330 TPWLGRLINHSVRGNLKTKVVEIDGSPHLILVARRDIAQGEELLYDYGDR 379 (392)
T ss_pred cccchhhhcccccCcceeeEEEecCCceEEEEeccccccchhhhhhcccc
Confidence 4578999999999999999999999999999999999999999999843
No 9
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=99.55 E-value=1.3e-15 Score=174.48 Aligned_cols=131 Identities=34% Similarity=0.566 Sum_probs=109.0
Q ss_pred CCchHhhhcccccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccCccccc-cCCC-cEEEecc
Q 003198 680 CGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANSSFLF-DLND-QYVLDAY 757 (840)
Q Consensus 680 C~N~~lq~g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~~sYlf-~L~~-~~~IDA~ 757 (840)
+.|............+..+...|||+||.+.|++|++|.+|.|+++...++..|...|...+..+.| .+.. ..++|+.
T Consensus 321 ~~~~~~~~~~~~~~~~~~~~~~~~g~fa~~~i~~~e~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~ 400 (480)
T COG2940 321 LLNSNGCKKRREPNVVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREENYDLLGNEFSFGLLEDKDKVRDSQ 400 (480)
T ss_pred hhhhcccccccchhhhhhhcccccceeehhhccchHHHHHhcCcccchHHHHhhhccccccccccchhhccccchhhhhh
Confidence 3333333344455667788889999999999999999999999999999999998877555554444 3333 7899999
Q ss_pred ccCCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCCCC
Q 003198 758 RKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGPD 810 (840)
Q Consensus 758 ~~GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~~d 810 (840)
..|+.+||+||||.|||.+....+.|..++.++|+|||.+||||++||+...+
T Consensus 401 ~~g~~~r~~nHS~~pN~~~~~~~~~g~~~~~~~~~rDI~~geEl~~dy~~~~~ 453 (480)
T COG2940 401 KAGDVARFINHSCTPNCEASPIEVNGIFKISIYAIRDIKAGEELTYDYGPSLE 453 (480)
T ss_pred hcccccceeecCCCCCcceecccccccceeeecccccchhhhhhccccccccc
Confidence 99999999999999999998888888889999999999999999999986544
No 10
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain generally appears as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.47 E-value=4.6e-14 Score=133.00 Aligned_cols=105 Identities=17% Similarity=0.187 Sum_probs=74.0
Q ss_pred CceEEeccccCCCCeEEeccccccCHHHHHHH---hhhhccc--------------------------------------
Q 003198 702 GWGAFLKNSVSKNDYLGEYTGELISHREADKR---GKIYDRA-------------------------------------- 740 (840)
Q Consensus 702 G~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR---~k~yd~~-------------------------------------- 740 (840)
|+||||+++|++|++|+++.+.+++..+.... ...+...
T Consensus 1 GrGl~At~dI~~Ge~I~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80 (162)
T PF00856_consen 1 GRGLFATRDIKAGEVILIPRPAILTPDEVSPQPELLRLQLSKALEEQSRSDFSIQKKQKAEKSERSPQLESLHSISLRSE 80 (162)
T ss_dssp SEEEEESS-B-TTEEEEEESEEEEEHHHHHCHHHHSHHTTCSSSCSHHTTHHHHHHHHHHHHHHHHHHHHHHHHHCHTTT
T ss_pred CEEEEECccCCCCCEEEEECcceEEehhhhhcccchhhhhhhhhcccccccccccccccccccccccccccccccccccc
Confidence 89999999999999999999999987776441 0000000
Q ss_pred -Cc---------------cccccCCCcEEEeccccCCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEe
Q 003198 741 -NS---------------SFLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYD 804 (840)
Q Consensus 741 -~~---------------sYlf~L~~~~~IDA~~~GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfD 804 (840)
.. ............++.-....+.|+||||.|||.+..........+.|+|.|+|++|||||++
T Consensus 81 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~p~~d~~NHsc~pn~~~~~~~~~~~~~~~~~a~r~I~~GeEi~is 160 (162)
T PF00856_consen 81 LQFSQAFQWSWFISWTRSDFSSRSFSEDDRDGIALYPFADMLNHSCDPNCEVSFDFDGDGGCLVVRATRDIKKGEEIFIS 160 (162)
T ss_dssp CCTCCHHHHHHHHHHHHHEEEEEEETTEEEEEEEEETGGGGSEEESSTSEEEEEEEETTTTEEEEEESS-B-TTSBEEEE
T ss_pred ccccccccchhhccccceeeeccccccccccccccCcHhHheccccccccceeeEeecccceEEEEECCccCCCCEEEEE
Confidence 00 00000112244566677789999999999999887777677889999999999999999999
Q ss_pred cC
Q 003198 805 YR 806 (840)
Q Consensus 805 Yg 806 (840)
||
T Consensus 161 YG 162 (162)
T PF00856_consen 161 YG 162 (162)
T ss_dssp ST
T ss_pred EC
Confidence 97
No 11
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=98.88 E-value=5.2e-10 Score=128.13 Aligned_cols=115 Identities=30% Similarity=0.458 Sum_probs=91.7
Q ss_pred CCCCchHhhhcccccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhccc--CccccccCCCcEEEe
Q 003198 678 GQCGNMRLLLRQQQRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRA--NSSFLFDLNDQYVLD 755 (840)
Q Consensus 678 ~~C~N~~lq~g~~~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~--~~sYlf~L~~~~~ID 755 (840)
..|.|+.+....... . .+ +|..+|.+| +|++|+..+...|...-... ...|+..+..+..||
T Consensus 301 ~~~~~~~~sk~~~~e------~-~~---~~~~~~~k~------vg~~i~~~e~~~~~~~~~~~~~~~~~~~~~e~~~~id 364 (463)
T KOG1081|consen 301 ERCHNQQFSKESYPE------P-QK---TAKADIRKG------VGEVIDDKECKARLQRVKESDLVDFYMVFIQKDRIID 364 (463)
T ss_pred cccccchhhhhcccc------c-ch---hhHHhhhcc------cCcccchhhheeehhhhhccchhhhhhhhhhcccccc
Confidence 478888776554443 1 11 888999998 99999999987775432222 223434444444999
Q ss_pred ccccCCccccccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCC
Q 003198 756 AYRKGDKLKFANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYG 808 (840)
Q Consensus 756 A~~~GN~aRFINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~ 808 (840)
+.++||..||+||||+|||....|.+.++.++++||.+.|++||||||+|.+.
T Consensus 365 ~~~~~n~sr~~nh~~~~~v~~~k~~~~~~t~~~~~a~~~i~~g~e~t~~~n~~ 417 (463)
T KOG1081|consen 365 AGPKGNYSRFLNHSCQPNVETEKWQVIGDTRVGLFAPRQIEAGEELTFNYNGN 417 (463)
T ss_pred cccccchhhhhcccCCCceeechhheecccccccccccccccchhhhheeecc
Confidence 99999999999999999999999999999999999999999999999999865
No 12
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=98.57 E-value=4.6e-08 Score=106.97 Aligned_cols=114 Identities=23% Similarity=0.267 Sum_probs=83.0
Q ss_pred CCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccC-ccccccCCCcEEEeccccCCccccccCCCCCCcceeEE
Q 003198 701 AGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRAN-SSFLFDLNDQYVLDAYRKGDKLKFANHSSNPNCFAKVM 779 (840)
Q Consensus 701 kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~-~sYlf~L~~~~~IDA~~~GN~aRFINHSC~PNc~~~~v 779 (840)
.|--|.+++.+.+|+=|--.+|-|+.-.+++++.-.....+ .+.|+.-... -|...-..|+||||.|.|||.+
T Consensus 137 ~gAkivst~~w~~ndkIe~LvGcIaeLse~eE~~ll~~g~nDFSvmyStRk~---caqLwLGPaafINHDCrpnCkF--- 210 (453)
T KOG2589|consen 137 NGAKIVSTKSWSRNDKIELLVGCIAELSEAEERSLLRGGGNDFSVMYSTRKR---CAQLWLGPAAFINHDCRPNCKF--- 210 (453)
T ss_pred CCceEEeeccccCCccHHHhhhhhhhcChhhhHHHHhccCCceeeeeecccc---hhhheeccHHhhcCCCCCCcee---
Confidence 46678899999999999999999988888888743333222 3333332211 1222336789999999999964
Q ss_pred EEcCeeEEEEEEccCCCCCCeEEEecC---CCCCCCccccCCCC
Q 003198 780 LVAGDHRVGIFAKEHIEASEELFYDYR---YGPDQAPAWARKPE 820 (840)
Q Consensus 780 ~V~g~~rI~ifA~RdI~aGEELTfDYg---y~~d~~pcwc~~pe 820 (840)
...|..++.+-++|||+||||||--|| |++...-|.|-.+|
T Consensus 211 vs~g~~tacvkvlRDIePGeEITcFYgs~fFG~~N~~CeC~TCE 254 (453)
T KOG2589|consen 211 VSTGRDTACVKVLRDIEPGEEITCFYGSGFFGENNEECECVTCE 254 (453)
T ss_pred ecCCCceeeeehhhcCCCCceeEEeecccccCCCCceeEEeecc
Confidence 335778899999999999999999998 56666667665554
No 13
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=98.11 E-value=2.7e-06 Score=96.23 Aligned_cols=108 Identities=19% Similarity=0.294 Sum_probs=83.2
Q ss_pred ccccEEEEecCC--CCceEEeccccCCCCeEEeccccc-cCHHHHHHHhhhhcccCccccccCC----CcEEEeccc--c
Q 003198 689 QQQRILLAKSDV--AGWGAFLKNSVSKNDYLGEYTGEL-ISHREADKRGKIYDRANSSFLFDLN----DQYVLDAYR--K 759 (840)
Q Consensus 689 ~~~~v~V~kS~~--kG~GLfA~edI~kGefI~EY~GEI-Is~~Ea~rR~k~yd~~~~sYlf~L~----~~~~IDA~~--~ 759 (840)
....+.|+.+.+ .|.||++...|.+|+-.+.|.|++ ++..+ ...+..|+|.+- ..++||++. .
T Consensus 26 LP~~l~i~~Ssv~~~~lgV~s~~~i~~G~~FGP~~G~~~~~~~~--------~~~n~~y~W~I~~~d~~~~~iDg~d~~~ 97 (396)
T KOG2461|consen 26 LPPELRIKPSSVPVTGLGVWSNASILPGTSFGPFEGEIIASIDS--------KSANNRYMWEIFSSDNGYEYIDGTDEEH 97 (396)
T ss_pred CCCceEeeccccCCccccccccccccCcccccCccCcccccccc--------ccccCcceEEEEeCCCceEEeccCChhh
Confidence 567888988877 788999999999999999999998 22211 123445666442 348999984 6
Q ss_pred CCccccccCCCC---CCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCC
Q 003198 760 GDKLKFANHSSN---PNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYG 808 (840)
Q Consensus 760 GN~aRFINHSC~---PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~ 808 (840)
.|++||+|=+++ -|+.+. .....|.++|+|+|.+||||.++|+-+
T Consensus 98 sNWmRYV~~Ar~~eeQNL~A~----Q~~~~Ifyrt~r~I~p~eELlVWY~~e 145 (396)
T KOG2461|consen 98 SNWMRYVNSARSEEEQNLLAF----QIGENIFYRTIRDIRPNEELLVWYGSE 145 (396)
T ss_pred cceeeeecccCChhhhhHHHH----hccCceEEEecccCCCCCeEEEEeccc
Confidence 899999998885 687652 233468899999999999999999743
No 14
>PF00249 Myb_DNA-binding: Myb-like DNA-binding domain; InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G [, ]. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding [].; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=93.10 E-value=0.13 Score=41.36 Aligned_cols=46 Identities=15% Similarity=0.292 Sum_probs=39.5
Q ss_pred cccCCcccchhhhhHHhhcCChHHHHHHHHHHhc--CCcHHHHHHHHHhH
Q 003198 174 KHEFSDGEDRILWTVFEEHGLGEEVINAVSQFIG--IATSEVQDRYSTLK 221 (840)
Q Consensus 174 k~~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~~--~~~sei~eRy~~L~ 221 (840)
|..||+.||.+|-.++++||.. -...|++.|+ +|+.+++.||..|.
T Consensus 1 r~~Wt~eE~~~l~~~v~~~g~~--~W~~Ia~~~~~~Rt~~qc~~~~~~~~ 48 (48)
T PF00249_consen 1 RGPWTEEEDEKLLEAVKKYGKD--NWKKIAKRMPGGRTAKQCRSRYQNLL 48 (48)
T ss_dssp S-SS-HHHHHHHHHHHHHSTTT--HHHHHHHHHSSSSTHHHHHHHHHHHT
T ss_pred CCCCCHHHHHHHHHHHHHhCCc--HHHHHHHHcCCCCCHHHHHHHHHhhC
Confidence 4579999999999999999998 6788888886 99999999998873
No 15
>PF13921 Myb_DNA-bind_6: Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=92.39 E-value=0.12 Score=42.94 Aligned_cols=43 Identities=16% Similarity=0.452 Sum_probs=36.4
Q ss_pred CCcccchhhhhHHhhcCChHHHHHHHHHHhc-CCcHHHHHHHHH-hHh
Q 003198 177 FSDGEDRILWTVFEEHGLGEEVINAVSQFIG-IATSEVQDRYST-LKE 222 (840)
Q Consensus 177 f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~~-~~~sei~eRy~~-L~~ 222 (840)
||+.||.+|....++||.+ ...||++|+ +++.+|+.||.. |..
T Consensus 1 WT~eEd~~L~~~~~~~g~~---W~~Ia~~l~~Rt~~~~~~r~~~~l~~ 45 (60)
T PF13921_consen 1 WTKEEDELLLELVKKYGND---WKKIAEHLGNRTPKQCRNRWRNHLRP 45 (60)
T ss_dssp S-HHHHHHHHHHHHHHTS----HHHHHHHSTTS-HHHHHHHHHHTTST
T ss_pred CCHHHHHHHHHHHHHHCcC---HHHHHHHHCcCCHHHHHHHHHHHCcc
Confidence 5788999999999999963 889999999 999999999999 753
No 16
>smart00717 SANT SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains.
Probab=92.35 E-value=0.12 Score=39.79 Aligned_cols=46 Identities=15% Similarity=0.346 Sum_probs=39.4
Q ss_pred ccCCcccchhhhhHHhhcCChHHHHHHHHHHhc-CCcHHHHHHHHHhHh
Q 003198 175 HEFSDGEDRILWTVFEEHGLGEEVINAVSQFIG-IATSEVQDRYSTLKE 222 (840)
Q Consensus 175 ~~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~~-~~~sei~eRy~~L~~ 222 (840)
..|++.||.+|-..+.+||.. -++.|+.+|. +++.+|+.||..|..
T Consensus 2 ~~Wt~~E~~~l~~~~~~~g~~--~w~~Ia~~~~~rt~~~~~~~~~~~~~ 48 (49)
T smart00717 2 GEWTEEEDELLIELVKKYGKN--NWEKIAKELPGRTAEQCRERWNNLLK 48 (49)
T ss_pred CCCCHHHHHHHHHHHHHHCcC--CHHHHHHHcCCCCHHHHHHHHHHHcC
Confidence 569999999999999999952 2778888884 999999999998754
No 17
>smart00717 SANT SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains.
Probab=92.16 E-value=0.34 Score=37.21 Aligned_cols=42 Identities=29% Similarity=0.373 Sum_probs=38.1
Q ss_pred CCCcHHHHHHHHHhhhhcC-CchHHHHHhhhCCCCcHHHHHHHHh
Q 003198 474 SEWKPIEKELYLKGVEIFG-RNSCLIARNLLSGLKTCMEVSTYMR 517 (840)
Q Consensus 474 ~~W~~~E~~L~~k~v~~fg-~N~C~iA~~ll~g~KTC~EV~~ym~ 517 (840)
..|++-|..++..++..|| .++..||..| +.+|-.+|..+..
T Consensus 2 ~~Wt~~E~~~l~~~~~~~g~~~w~~Ia~~~--~~rt~~~~~~~~~ 44 (49)
T smart00717 2 GEWTEEEDELLIELVKKYGKNNWEKIAKEL--PGRTAEQCRERWN 44 (49)
T ss_pred CCCCHHHHHHHHHHHHHHCcCCHHHHHHHc--CCCCHHHHHHHHH
Confidence 4799999999999999999 9999999987 6899999988765
No 18
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=92.01 E-value=0.36 Score=36.62 Aligned_cols=41 Identities=32% Similarity=0.407 Sum_probs=36.9
Q ss_pred CCcHHHHHHHHHhhhhcC-CchHHHHHhhhCCCCcHHHHHHHHh
Q 003198 475 EWKPIEKELYLKGVEIFG-RNSCLIARNLLSGLKTCMEVSTYMR 517 (840)
Q Consensus 475 ~W~~~E~~L~~k~v~~fg-~N~C~iA~~ll~g~KTC~EV~~ym~ 517 (840)
.||.-|..++..++..|| .+...||+.+ +.||-.+|..|..
T Consensus 1 ~Wt~eE~~~l~~~~~~~g~~~w~~Ia~~~--~~rs~~~~~~~~~ 42 (45)
T cd00167 1 PWTEEEDELLLEAVKKYGKNNWEKIAKEL--PGRTPKQCRERWR 42 (45)
T ss_pred CCCHHHHHHHHHHHHHHCcCCHHHHHhHc--CCCCHHHHHHHHH
Confidence 499999999999999999 8999999987 6699999988764
No 19
>KOG1171 consensus Metallothionein-like protein [Inorganic ion transport and metabolism]
Probab=91.39 E-value=0.046 Score=62.30 Aligned_cols=63 Identities=37% Similarity=1.008 Sum_probs=51.8
Q ss_pred ccCCCCC-CCCC-CCCcccCCCccccCCCCCCcccccc------------------------------------------
Q 003198 592 YTPCGCQ-SMCG-KQCPCLHNGTCCEKYCGCSKSCKNR------------------------------------------ 627 (840)
Q Consensus 592 y~PC~c~-~~C~-~~C~C~~~g~~Ce~~CgC~~~C~nR------------------------------------------ 627 (840)
-.+|.|+ ..|- ..|.|...|.+|..+|.|- +|.|.
T Consensus 131 k~~~~ck~SkclklYCeCFAsG~yC~~~CnCv-nC~N~~~~e~~r~~a~k~~l~RNP~AFkPKia~s~~~~~da~~~~~~ 209 (406)
T KOG1171|consen 131 KKKCNCKKSKCLKLYCECFASGVYCTGPCNCV-NCFNNPEHESVRLKARKQILERNPNAFKPKIAASSSGIADASEEASK 209 (406)
T ss_pred ccCCCchHHHHHHHhHHHHhhcccccCCccee-eccCCCcchHHHHHHHHHHhhcCccccccccccCCcccchhhhhhhc
Confidence 4466665 5565 4899999999999999998 47664
Q ss_pred -------cCCcccCCCCccCCCccccccccccCCC
Q 003198 628 -------FRGCHCAKSQCRSRQCPCFAAGRECDPD 655 (840)
Q Consensus 628 -------f~GC~C~~~~C~t~~CpC~~a~rECdPd 655 (840)
-.||+|.+..|..+.|.||+++.-|...
T Consensus 210 ~~~sa~hkkGC~CkkSgClKkYCECyQa~vlCS~n 244 (406)
T KOG1171|consen 210 TPASARHKKGCNCKKSGCLKKYCECYQAGVLCSSN 244 (406)
T ss_pred cchhhhhcCCCCCccccchHHHHHHHhcCCCcccc
Confidence 2799999999999999999999888533
No 20
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=89.75 E-value=0.13 Score=42.85 Aligned_cols=12 Identities=42% Similarity=0.661 Sum_probs=10.1
Q ss_pred CCCCchHhhhcc
Q 003198 678 GQCGNMRLLLRQ 689 (840)
Q Consensus 678 ~~C~N~~lq~g~ 689 (840)
..|+|++||+++
T Consensus 39 ~~C~NqrFqk~~ 50 (51)
T smart00570 39 SYCSNQRFQKRQ 50 (51)
T ss_pred cCccCcccccCc
Confidence 389999999875
No 21
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=89.74 E-value=0.45 Score=57.30 Aligned_cols=35 Identities=34% Similarity=0.752 Sum_probs=29.9
Q ss_pred CCCCCCcccCCCccccC-CCC-CCccccc-ccCCcccC
Q 003198 600 MCGKQCPCLHNGTCCEK-YCG-CSKSCKN-RFRGCHCA 634 (840)
Q Consensus 600 ~C~~~C~C~~~g~~Ce~-~Cg-C~~~C~n-Rf~GC~C~ 634 (840)
.||.+|-|....+.|.. .|. |+..|.| ||+-|.|+
T Consensus 83 ~cg~~CiNr~t~iECs~~~C~~cg~~C~NQRFQkkqyA 120 (729)
T KOG4442|consen 83 ACGEDCINRMTSIECSDRECPRCGVYCKNQRFQKKQYA 120 (729)
T ss_pred ccCccccchhhhcccCCccCCCccccccchhhhhhccC
Confidence 45788999999999999 999 9999988 78866664
No 22
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=88.89 E-value=0.37 Score=36.51 Aligned_cols=43 Identities=14% Similarity=0.338 Sum_probs=37.0
Q ss_pred cCCcccchhhhhHHhhcCChHHHHHHHHHHhc-CCcHHHHHHHHHh
Q 003198 176 EFSDGEDRILWTVFEEHGLGEEVINAVSQFIG-IATSEVQDRYSTL 220 (840)
Q Consensus 176 ~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~~-~~~sei~eRy~~L 220 (840)
.|++.||.+|-..+.++|.. -...|+++|. ++..+|+.||..+
T Consensus 1 ~Wt~eE~~~l~~~~~~~g~~--~w~~Ia~~~~~rs~~~~~~~~~~~ 44 (45)
T cd00167 1 PWTEEEDELLLEAVKKYGKN--NWEKIAKELPGRTPKQCRERWRNL 44 (45)
T ss_pred CCCHHHHHHHHHHHHHHCcC--CHHHHHhHcCCCCHHHHHHHHHHh
Confidence 37899999999999999952 2788898884 9999999999876
No 23
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 from SWISSPROT [] and TSO1 Q9LE32 from SWISSPROT []. This group of proteins is called a CXC domain in [].
Probab=87.66 E-value=0.29 Score=39.29 Aligned_cols=28 Identities=50% Similarity=1.220 Sum_probs=26.1
Q ss_pred cCCcccCCCCccCCCccccccccccCCC
Q 003198 628 FRGCHCAKSQCRSRQCPCFAAGRECDPD 655 (840)
Q Consensus 628 f~GC~C~~~~C~t~~CpC~~a~rECdPd 655 (840)
..||.|.++.|....|.||++++.|.+.
T Consensus 3 ~~gC~Ckks~Clk~YC~Cf~~g~~C~~~ 30 (42)
T PF03638_consen 3 KKGCNCKKSKCLKLYCECFQAGRFCTPN 30 (42)
T ss_pred CCCCcccCcChhhhhCHHHHCcCcCCCC
Confidence 5799999999999999999999999886
No 24
>PF09111 SLIDE: SLIDE; InterPro: IPR015195 The SLIDE domain adopts a secondary structure comprising a main core of three alpha-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb (IPR014778 from INTERPRO) repeats or homeodomains []. ; GO: 0003676 nucleic acid binding, 0005524 ATP binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006338 chromatin remodeling, 0005634 nucleus; PDB: 2NOG_A 2Y9Y_A 2Y9Z_A 1OFC_X.
Probab=83.97 E-value=0.89 Score=44.00 Aligned_cols=50 Identities=28% Similarity=0.455 Sum_probs=37.1
Q ss_pred ccccCCcccchhhhhHHhhcCC-----hHHHHHHHHH--------Hh-cCCcHHHHHHHHHhHh
Q 003198 173 EKHEFSDGEDRILWTVFEEHGL-----GEEVINAVSQ--------FI-GIATSEVQDRYSTLKE 222 (840)
Q Consensus 173 ek~~f~~~ed~~~~~~~~e~g~-----~~~v~~~l~~--------~~-~~~~sei~eRy~~L~~ 222 (840)
-++-||+.||++|=+.+-+||+ =|.|...|.. || ++|+.||+.|...|-.
T Consensus 48 ~~k~yseeEDRfLl~~~~~~G~~~~~~~e~Ik~~Ir~~p~FrFDwf~kSRt~~el~rR~~tLi~ 111 (118)
T PF09111_consen 48 KKKVYSEEEDRFLLCMLYKYGYDAEGNWEKIKQEIRESPLFRFDWFFKSRTPQELQRRCNTLIK 111 (118)
T ss_dssp S-SSS-HHHHHHHHHHHHHHTTTSTTHHHHHHHHHHH-CGGCT-HHHHTS-HHHHHHHHHHHHH
T ss_pred CCCCcCcHHHHHHHHHHHHhCCCCCchHHHHHHHHHhCCCcccchhcccCCHHHHHHHHHHHHH
Confidence 3788999999999999999999 2444444443 22 9999999999999863
No 25
>PF00249 Myb_DNA-binding: Myb-like DNA-binding domain; InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G [, ]. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding [].; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=79.23 E-value=5.2 Score=32.04 Aligned_cols=43 Identities=23% Similarity=0.405 Sum_probs=34.8
Q ss_pred CCCcHHHHHHHHHhhhhcCCc-hHHHHHhhhCCCCcHHHHHHHHh
Q 003198 474 SEWKPIEKELYLKGVEIFGRN-SCLIARNLLSGLKTCMEVSTYMR 517 (840)
Q Consensus 474 ~~W~~~E~~L~~k~v~~fg~N-~C~iA~~ll~g~KTC~EV~~ym~ 517 (840)
..||.-|..+|+.++..||.+ .=.||..+. +.||=.++-.+..
T Consensus 2 ~~Wt~eE~~~l~~~v~~~g~~~W~~Ia~~~~-~~Rt~~qc~~~~~ 45 (48)
T PF00249_consen 2 GPWTEEEDEKLLEAVKKYGKDNWKKIAKRMP-GGRTAKQCRSRYQ 45 (48)
T ss_dssp -SS-HHHHHHHHHHHHHSTTTHHHHHHHHHS-SSSTHHHHHHHHH
T ss_pred CCCCHHHHHHHHHHHHHhCCcHHHHHHHHcC-CCCCHHHHHHHHH
Confidence 469999999999999999998 999998772 3898888876543
No 26
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=78.93 E-value=1.4 Score=51.52 Aligned_cols=40 Identities=30% Similarity=0.416 Sum_probs=31.1
Q ss_pred cccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCC
Q 003198 765 FANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRY 807 (840)
Q Consensus 765 FINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy 807 (840)
+.||++.+ ....+..-+..+-+++.++|.+||||+++||-
T Consensus 239 ~~NH~~~~---~~~~~~~~d~~~~l~~~~~v~~geevfi~YG~ 278 (472)
T KOG1337|consen 239 LLNHSPEV---IKAGYNQEDEAVELVAERDVSAGEEVFINYGP 278 (472)
T ss_pred hhccCchh---ccccccCCCCcEEEEEeeeecCCCeEEEecCC
Confidence 57999998 22233333448999999999999999999973
No 27
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTases), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTases) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site [] when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity []. ; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=78.77 E-value=1.4 Score=40.56 Aligned_cols=22 Identities=27% Similarity=0.860 Sum_probs=10.7
Q ss_pred ccccCCCCCCCC--CCCCcccCCC
Q 003198 590 KQYTPCGCQSMC--GKQCPCLHNG 611 (840)
Q Consensus 590 ~~y~PC~c~~~C--~~~C~C~~~g 611 (840)
.....|+|.+.| ...|.|....
T Consensus 44 ~~~~~C~C~~~C~~~~~C~C~~~~ 67 (103)
T PF05033_consen 44 EFLQGCDCSGDCSNPSNCECLQRN 67 (103)
T ss_dssp GGTS----SSSSTCTTTSHHHCCT
T ss_pred ccCccCccCCCCCCCCCCcCcccc
Confidence 345568887777 3567776543
No 28
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=77.21 E-value=4.7 Score=49.73 Aligned_cols=54 Identities=17% Similarity=0.277 Sum_probs=44.2
Q ss_pred CCCchHhhhcccccEE--------EEecCCCCceEEeccccCCCCeEEeccccccCHHHHHH
Q 003198 679 QCGNMRLLLRQQQRIL--------LAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADK 732 (840)
Q Consensus 679 ~C~N~~lq~g~~~~v~--------V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~r 732 (840)
.|.|+.++.+...+.+ |+++...|||+.+..+|+.-.||++|+|...+..-+.+
T Consensus 993 ~e~~~~v~~~~~~~me~~s~~~l~i~~~~~~~~~~~edtD~~~~~~~~~~~~~ppt~~l~~~ 1054 (1262)
T KOG1141|consen 993 KEYNRVVQNNIKYPMEVSSFNDLQIFKTAQSGWGVREDTDIPQSTFICTYVGAPPTDDLADE 1054 (1262)
T ss_pred cccchhhhcCCccceeeeecccccccccccccccccccccCCCCcccccccCCCCchhhHHH
Confidence 6889988877666555 45556689999999999999999999999988776643
No 29
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=75.92 E-value=0.99 Score=37.69 Aligned_cols=8 Identities=38% Similarity=0.746 Sum_probs=4.2
Q ss_pred cccccccC
Q 003198 622 KSCKNRFR 629 (840)
Q Consensus 622 ~~C~nRf~ 629 (840)
++|+||+.
T Consensus 20 sdClNR~l 27 (51)
T smart00570 20 SDCLNRML 27 (51)
T ss_pred hHHHHHHH
Confidence 45555553
No 30
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 from SWISSPROT [] and TSO1 Q9LE32 from SWISSPROT []. This group of proteins is called a CXC domain in [].
Probab=75.81 E-value=1.6 Score=35.16 Aligned_cols=37 Identities=35% Similarity=0.942 Sum_probs=31.2
Q ss_pred cccCCCCC-CCCC-CCCcccCCCccccCCCCCCccccccc
Q 003198 591 QYTPCGCQ-SMCG-KQCPCLHNGTCCEKYCGCSKSCKNRF 628 (840)
Q Consensus 591 ~y~PC~c~-~~C~-~~C~C~~~g~~Ce~~CgC~~~C~nRf 628 (840)
+..+|.|. ..|. ..|.|...|.+|...|.|. +|.|..
T Consensus 2 ~~~gC~Ckks~Clk~YC~Cf~~g~~C~~~C~C~-~C~N~~ 40 (42)
T PF03638_consen 2 KKKGCNCKKSKCLKLYCECFQAGRFCTPNCKCQ-NCKNTE 40 (42)
T ss_pred CCCCCcccCcChhhhhCHHHHCcCcCCCCcccC-CCCCcC
Confidence 35689996 7887 5899999999999999995 688864
No 31
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTases), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTases) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site [] when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity []. ; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=75.12 E-value=1.6 Score=40.16 Aligned_cols=16 Identities=56% Similarity=1.298 Sum_probs=7.9
Q ss_pred ccccCCCCCCcccccc
Q 003198 612 TCCEKYCGCSKSCKNR 627 (840)
Q Consensus 612 ~~Ce~~CgC~~~C~nR 627 (840)
..|...|+|+..|.||
T Consensus 88 ~EC~~~C~C~~~C~NR 103 (103)
T PF05033_consen 88 FECNDNCGCSPSCRNR 103 (103)
T ss_dssp E---TTSSS-TTSTT-
T ss_pred EeCCCCCCCCCCCCCC
Confidence 3577777777777776
No 32
>PF13921 Myb_DNA-bind_6: Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=72.55 E-value=7.4 Score=32.28 Aligned_cols=41 Identities=32% Similarity=0.443 Sum_probs=32.9
Q ss_pred CcHHHHHHHHHhhhhcCCchHHHHHhhhCCCCcHHHHHHHHhh
Q 003198 476 WKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCMEVSTYMRD 518 (840)
Q Consensus 476 W~~~E~~L~~k~v~~fg~N~C~iA~~ll~g~KTC~EV~~ym~~ 518 (840)
||.-|..+++.++..||.+...||..| |.+|=.+|......
T Consensus 1 WT~eEd~~L~~~~~~~g~~W~~Ia~~l--~~Rt~~~~~~r~~~ 41 (60)
T PF13921_consen 1 WTKEEDELLLELVKKYGNDWKKIAEHL--GNRTPKQCRNRWRN 41 (60)
T ss_dssp S-HHHHHHHHHHHHHHTS-HHHHHHHS--TTS-HHHHHHHHHH
T ss_pred CCHHHHHHHHHHHHHHCcCHHHHHHHH--CcCCHHHHHHHHHH
Confidence 999999999999999999999999987 67887888765543
No 33
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=72.51 E-value=4.2 Score=46.12 Aligned_cols=38 Identities=32% Similarity=0.496 Sum_probs=28.3
Q ss_pred cccCCCCCCcceeEEEEcCeeEEEEEEccCCCCCC-eEEEecC
Q 003198 765 FANHSSNPNCFAKVMLVAGDHRVGIFAKEHIEASE-ELFYDYR 806 (840)
Q Consensus 765 FINHSC~PNc~~~~v~V~g~~rI~ifA~RdI~aGE-ELTfDYg 806 (840)
++||||.||+. +...+.. ..+++...+.+++ ||+..|-
T Consensus 208 ~~~hsC~pn~~---~~~~~~~-~~~~~~~~~~~~~~~l~~~y~ 246 (482)
T KOG2084|consen 208 LFNHSCFPNIS---VIFDGRG-LALLVPAGIDAGEEELTISYT 246 (482)
T ss_pred hcccCCCCCeE---EEECCce-eEEEeecccCCCCCEEEEeec
Confidence 89999999996 3334444 4466777777776 9999994
No 34
>PLN03212 Transcription repressor MYB5; Provisional
Probab=64.41 E-value=7.3 Score=42.25 Aligned_cols=52 Identities=15% Similarity=0.291 Sum_probs=43.7
Q ss_pred CCccccccCCcccchhhhhHHhhcCChHHHHHHHHHHh-cCCcHHHHHHHHHhHhh
Q 003198 169 EPEEEKHEFSDGEDRILWTVFEEHGLGEEVINAVSQFI-GIATSEVQDRYSTLKEK 223 (840)
Q Consensus 169 e~eeek~~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~-~~~~sei~eRy~~L~~k 223 (840)
.|.=-|-.||+.||.+|.-..+++|-. -..||++| ++|.-.|+.||+.+..+
T Consensus 73 ~P~I~kgpWT~EED~lLlel~~~~GnK---Ws~IAk~LpGRTDnqIKNRWns~LrK 125 (249)
T PLN03212 73 RPSVKRGGITSDEEDLILRLHRLLGNR---WSLIAGRIPGRTDNEIKNYWNTHLRK 125 (249)
T ss_pred chhcccCCCChHHHHHHHHHHHhcccc---HHHHHhhcCCCCHHHHHHHHHHHHhH
Confidence 455667789999999999999999953 67788888 99999999999877654
No 35
>TIGR01557 myb_SHAQKYF myb-like DNA-binding domain, SHAQKYF class. This model describes a DNA-binding domain restricted to (but common in) plant proteins, many of which also contain a response regulator domain. The domain appears related to the Myb-like DNA-binding domain described by Pfam model pfam00249. It is distinguished in part by a well-conserved motif SH[AL]QKY[RF] at the C-terminal end of the motif.
Probab=60.84 E-value=20 Score=30.48 Aligned_cols=44 Identities=16% Similarity=0.191 Sum_probs=35.8
Q ss_pred CCCcHHHHHHHHHhhhhcCC-ch---HHHHHhhhCCCC-cHHHHHHHHhh
Q 003198 474 SEWKPIEKELYLKGVEIFGR-NS---CLIARNLLSGLK-TCMEVSTYMRD 518 (840)
Q Consensus 474 ~~W~~~E~~L~~k~v~~fg~-N~---C~iA~~ll~g~K-TC~EV~~ym~~ 518 (840)
-.||+-|-..|+.+++.||. |. =.|+.++. .++ |-.+|.++++.
T Consensus 4 ~~WT~eeh~~Fl~ai~~~G~g~~a~pk~I~~~~~-~~~lT~~qV~SH~QK 52 (57)
T TIGR01557 4 VVWTEDLHDRFLQAVQKLGGPDWATPKRILELMV-VDGLTRDQVASHLQK 52 (57)
T ss_pred CCCCHHHHHHHHHHHHHhCCCcccchHHHHHHcC-CCCCCHHHHHHHHHH
Confidence 46999999999999999998 77 77777653 355 88899887763
No 36
>PF14774 FAM177: FAM177 family
Probab=56.96 E-value=18 Score=35.58 Aligned_cols=66 Identities=20% Similarity=0.222 Sum_probs=37.9
Q ss_pred cccceeeEeCCCCeEEE-ecCCccccCCCcccccc-CC----cccc-hh-------hhhHHhhcCChHHHHHHHHHHhcC
Q 003198 143 VGRRRIYYDQHGSEALV-CSDSEEDIIEPEEEKHE-FS----DGED-RI-------LWTVFEEHGLGEEVINAVSQFIGI 208 (840)
Q Consensus 143 vgrrriYYd~~g~Eali-csdseee~~e~eeek~~-f~----~~ed-~~-------~~~~~~e~g~~~~v~~~l~~~~~~ 208 (840)
.=||-||+ +.||+|- .|.+||| .+.++.+.+ ++ ..+- .. +|+...-+.--|=|=+.||.|||.
T Consensus 18 ~prRiihF--sdGetmEE~StdeEe-~e~d~~~~d~~~~~~dp~~l~w~~~~~~~~~~~~~~~l~~~d~~Ge~lA~~fGi 94 (123)
T PF14774_consen 18 KPRRIIHF--SDGETMEEYSTDEEE-EEQDEDQPDKLSVQVDPSKLTWGPWLWFWAWRVGTKSLSGCDYLGEKLASFFGI 94 (123)
T ss_pred CchheeEe--cCCceeeeecccccc-ccccccccccccccCCcccCCcHHHHHHHHHHHHHhHhhHHhhhhhHHHHHhCC
Confidence 35899999 9998776 7766665 333333333 22 2221 12 223333333344455789999999
Q ss_pred CcH
Q 003198 209 ATS 211 (840)
Q Consensus 209 ~~s 211 (840)
+.+
T Consensus 95 t~~ 97 (123)
T PF14774_consen 95 TSP 97 (123)
T ss_pred Cch
Confidence 987
No 37
>PLN03091 hypothetical protein; Provisional
Probab=51.29 E-value=18 Score=42.36 Aligned_cols=53 Identities=15% Similarity=0.362 Sum_probs=44.2
Q ss_pred CCCccccccCCcccchhhhhHHhhcCChHHHHHHHHHHh-cCCcHHHHHHHHHhHhh
Q 003198 168 IEPEEEKHEFSDGEDRILWTVFEEHGLGEEVINAVSQFI-GIATSEVQDRYSTLKEK 223 (840)
Q Consensus 168 ~e~eeek~~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~-~~~~sei~eRy~~L~~k 223 (840)
..|.--|..|+..||.+|....+++|-. -..||++| |++.-.|+.||+.+.+|
T Consensus 61 LdP~IkKgpWT~EED~lLLeL~k~~GnK---WskIAk~LPGRTDnqIKNRWnslLKK 114 (459)
T PLN03091 61 LRPDLKRGTFSQQEENLIIELHAVLGNR---WSQIAAQLPGRTDNEIKNLWNSCLKK 114 (459)
T ss_pred cCCcccCCCCCHHHHHHHHHHHHHhCcc---hHHHHHhcCCCCHHHHHHHHHHHHHH
Confidence 3455667889999999999999999952 67788887 99999999999877554
No 38
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=49.65 E-value=12 Score=42.51 Aligned_cols=41 Identities=37% Similarity=0.888 Sum_probs=29.5
Q ss_pred CccccCCCCCCCCCCC----CcccCC----------------------CccccCCCCCCcccccccC
Q 003198 589 CKQYTPCGCQSMCGKQ----CPCLHN----------------------GTCCEKYCGCSKSCKNRFR 629 (840)
Q Consensus 589 ~~~y~PC~c~~~C~~~----C~C~~~----------------------g~~Ce~~CgC~~~C~nRf~ 629 (840)
+..-..|.|...|... |.|... ...|...|+|..+|.||+.
T Consensus 104 ~~~~~~c~C~~~~~~~~~~~C~C~~~n~~~~~~~~~~~~~~~~~~~~~i~EC~~~C~C~~~C~nRv~ 170 (364)
T KOG1082|consen 104 CENSTGCRCCSSCSSVLPLTCLCERHNGGLVAYTCDGDCGTLGKFKEPVFECSVACGCHPDCANRVV 170 (364)
T ss_pred CccccCCCccCCCCCCCCccccChHhhCCccccccCCccccccccCccccccccCCCCCCcCcchhh
Confidence 3456678887666532 888761 1468889999999999986
No 39
>COG5259 RSC8 RSC chromatin remodeling complex subunit RSC8 [Chromatin structure and dynamics / Transcription]
Probab=46.97 E-value=18 Score=42.50 Aligned_cols=44 Identities=27% Similarity=0.536 Sum_probs=36.5
Q ss_pred cCCCCcHHHHHHHHHhhhhcCCchHHHHHhhhCCCCcHHH-HHHHHh
Q 003198 472 CSSEWKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCME-VSTYMR 517 (840)
Q Consensus 472 ~~~~W~~~E~~L~~k~v~~fg~N~C~iA~~ll~g~KTC~E-V~~ym~ 517 (840)
.+..|+.-|.-|++.|+++||..+-.||+++ |+||=-| ++.|++
T Consensus 278 ~dk~WS~qE~~LLLEGIe~ygDdW~kVA~HV--gtKt~EqCIl~FL~ 322 (531)
T COG5259 278 RDKNWSRQELLLLLEGIEMYGDDWDKVARHV--GTKTKEQCILHFLQ 322 (531)
T ss_pred ccccccHHHHHHHHHHHHHhhhhHHHHHHHh--CCCCHHHHHHHHHc
Confidence 4568999999999999999999999999987 8898544 344443
No 40
>PLN03212 Transcription repressor MYB5; Provisional
Probab=45.09 E-value=19 Score=39.17 Aligned_cols=46 Identities=15% Similarity=0.205 Sum_probs=38.5
Q ss_pred cccCCcccchhhhhHHhhcCChHHHHHHHHHHh--cCCcHHHHHHHHHhH
Q 003198 174 KHEFSDGEDRILWTVFEEHGLGEEVINAVSQFI--GIATSEVQDRYSTLK 221 (840)
Q Consensus 174 k~~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~--~~~~sei~eRy~~L~ 221 (840)
|.-|+..||.+|...+++||-.. ...||+.+ +++.-+..|||...-
T Consensus 25 Rg~WT~EEDe~L~~lV~kyG~~n--W~~IAk~~g~gRT~KQCReRW~N~L 72 (249)
T PLN03212 25 RGPWTVEEDEILVSFIKKEGEGR--WRSLPKRAGLLRCGKSCRLRWMNYL 72 (249)
T ss_pred CCCCCHHHHHHHHHHHHHhCccc--HHHHHHhhhcCCCcchHHHHHHHhh
Confidence 56699999999999999999643 56788766 799999999997654
No 41
>KOG3813 consensus Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]
Probab=43.74 E-value=11 Score=44.41 Aligned_cols=13 Identities=8% Similarity=-0.018 Sum_probs=7.5
Q ss_pred CCccccccCCCCC
Q 003198 760 GDKLKFANHSSNP 772 (840)
Q Consensus 760 GN~aRFINHSC~P 772 (840)
+.++-.|+-+|.+
T Consensus 470 ~sv~~li~asc~~ 482 (640)
T KOG3813|consen 470 TSVSELIKASCHL 482 (640)
T ss_pred cccccccccccCC
Confidence 3455556777753
No 42
>PF08271 TF_Zn_Ribbon: TFIIB zinc-binding; InterPro: IPR013137 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a zinc finger motif found in transcription factor IIB (TFIIB). In eukaryotes the initiation of transcription of protein-encoding genes by the polymerase II complex (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). 
At least seven different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, and -IIH. TFIIB and TFIID are responsible for promoter recognition and interaction with Pol II; together with Pol II, they form a minimal initiation complex capable of transcription under certain conditions. The TATA box of a Pol II promoter is bound in the initiation complex by the TBP subunit of TFIID, which bends the DNA around the C-terminal domain of TFIIB whereas the N-terminal zinc finger of TFIIB interacts with Pol II. The TFIIB zinc finger adopts a zinc ribbon fold characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites. The zinc finger contacts the Rpb1 subunit of Pol II through its dock domain, a conserved region of about 70 amino acids located close to the polymerase active site. In the Pol II complex this surface is located near the RNA exit groove. Interestingly this sequence is best conserved in the three polymerases that utilise a TFIIB-like general transcription factor (Pol II, Pol III, and archaeal RNA polymerase) but not in Pol I. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding, 0006355 regulation of transcription, DNA-dependent; PDB: 1VD4_A 1PFT_A 3K1F_M 3K7A_M 1RO4_A 1RLY_A 1DL6_A.
Probab=33.49 E-value=50 Score=26.15 Aligned_cols=33 Identities=39% Similarity=0.721 Sum_probs=24.2
Q ss_pred ccceeeEeCCCCeEEEecCC----ccccCCCccccccC
Q 003198 144 GRRRIYYDQHGSEALVCSDS----EEDIIEPEEEKHEF 177 (840)
Q Consensus 144 grrriYYd~~g~Ealicsds----eee~~e~eeek~~f 177 (840)
|.+.|++|...||. ||+.= ||.++.++-|.++|
T Consensus 7 g~~~~~~D~~~g~~-vC~~CG~Vl~e~~i~~~~e~r~f 43 (43)
T PF08271_consen 7 GSKEIVFDPERGEL-VCPNCGLVLEENIIDEGPEWREF 43 (43)
T ss_dssp SSSEEEEETTTTEE-EETTT-BBEE-TTBSCCCSCCHC
T ss_pred cCCceEEcCCCCeE-ECCCCCCEeecccccCCcccccC
Confidence 55669999999997 99875 55666666666665
No 43
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=32.89 E-value=14 Score=43.66 Aligned_cols=105 Identities=10% Similarity=0.002 Sum_probs=69.6
Q ss_pred CCCce---EEeccccCCCCeEEeccccccCHH--HHHHHhhhhc-ccCc-cccccCC---CcEEEeccccCCccccccCC
Q 003198 700 VAGWG---AFLKNSVSKNDYLGEYTGELISHR--EADKRGKIYD-RANS-SFLFDLN---DQYVLDAYRKGDKLKFANHS 769 (840)
Q Consensus 700 ~kG~G---LfA~edI~kGefI~EY~GEIIs~~--Ea~rR~k~yd-~~~~-sYlf~L~---~~~~IDA~~~GN~aRFINHS 769 (840)
..+|+ ..|...+..|++|+.++|+..-.. ....+ .+. .... .-+|... .....++...|+..++++|+
T Consensus 122 ~c~~~~~d~~~~~~~~~~~~vw~~vg~~~~~~c~vc~~~--~~~~~~~~~~~~f~~~~~~~~~~~~~~~~g~~~~~l~~~ 199 (463)
T KOG1081|consen 122 KCSKRCTDCRAFKKREVGDLVWSKVGEYPWWPCMVCHDP--LLPKGMKHDHVNFFGCYAWTHEKRVFPYEGQSSKLIPHS 199 (463)
T ss_pred ccccCCcceeeeccccceeEEeEEcCcccccccceecCc--ccchhhccccceeccchhhHHHhhhhhccchHHHhhhhc
Confidence 34455 777779999999999999986443 10111 000 0000 0111111 11223344499999999999
Q ss_pred CCCCcceeEEEEcCeeEEEEEEccCCCCCCe------EEEecC
Q 003198 770 SNPNCFAKVMLVAGDHRVGIFAKEHIEASEE------LFYDYR 806 (840)
Q Consensus 770 C~PNc~~~~v~V~g~~rI~ifA~RdI~aGEE------LTfDYg 806 (840)
+.|+-....+...+..|+..++.+.++-+.- ++.+|.
T Consensus 200 ~~~~s~~~~~~~~~~~r~~~~~~q~~~~~~~~e~k~~~~~~~~ 242 (463)
T KOG1081|consen 200 KKPASTMSEKIKEAKARFGKLKAQWEAGIKQKELKPEEYKRIK 242 (463)
T ss_pred cccchhhhhhhhcccchhhhcccchhhccchhhcccccccccc
Confidence 9999998999999999999999998888877 666653
No 44
>KOG4167 consensus Predicted DNA-binding protein, contains SANT and ELM2 domains [Transcription]
Probab=32.68 E-value=50 Score=40.92 Aligned_cols=40 Identities=20% Similarity=0.552 Sum_probs=34.1
Q ss_pred CCCcHHHHHHHHHhhhhcCCchHHHHHhhhCCCCcHHHHHHH
Q 003198 474 SEWKPIEKELYLKGVEIFGRNSCLIARNLLSGLKTCMEVSTY 515 (840)
Q Consensus 474 ~~W~~~E~~L~~k~v~~fg~N~C~iA~~ll~g~KTC~EV~~y 515 (840)
.-||++|+-||.|.+..|-++|-+|+..| .+||=+|--+|
T Consensus 620 d~WTp~E~~lF~kA~y~~~KDF~~v~km~--~~KtVaqCVey 659 (907)
T KOG4167|consen 620 DKWTPLERKLFNKALYTYSKDFIFVQKMV--KSKTVAQCVEY 659 (907)
T ss_pred ccccHHHHHHHHHHHHHhcccHHHHHHHh--ccccHHHHHHH
Confidence 56999999999999999999999999987 67886655443
No 45
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain appears generally as one part of a larger multidomain protein, and three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies, electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical/AdoMet-dependent methyltransferase fold. A tyrosine strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=30.25 E-value=29 Score=32.39 Aligned_cols=17 Identities=35% Similarity=0.618 Sum_probs=12.9
Q ss_pred EEEEEccCCCCCCeEEE
Q 003198 787 VGIFAKEHIEASEELFY 803 (840)
Q Consensus 787 I~ifA~RdI~aGEELTf 803 (840)
.|+||+|||++||-|.+
T Consensus 2 rGl~At~dI~~Ge~I~~ 18 (162)
T PF00856_consen 2 RGLFATRDIKAGEVILI 18 (162)
T ss_dssp EEEEESS-B-TTEEEEE
T ss_pred EEEEECccCCCCCEEEE
Confidence 47999999999998874
No 46
>PRK09430 djlA Dna-J like membrane chaperone protein; Provisional
Probab=28.24 E-value=60 Score=35.57 Aligned_cols=49 Identities=16% Similarity=0.337 Sum_probs=39.0
Q ss_pred cCCcccchhhhhHHhhcCChHHHHHHHHHHh-----------------------------------cCCcHHHHHHHHHh
Q 003198 176 EFSDGEDRILWTVFEEHGLGEEVINAVSQFI-----------------------------------GIATSEVQDRYSTL 220 (840)
Q Consensus 176 ~f~~~ed~~~~~~~~e~g~~~~v~~~l~~~~-----------------------------------~~~~sei~eRy~~L 220 (840)
++++.|+.+||.+.+-.|+|..-|+.+.+++ +.+.+||+..|+.|
T Consensus 145 ~l~~~E~~~L~~Ia~~Lgis~~df~~~~~~~~~~~~f~~~~~~~~~~~~~~~~~~~~ay~vLgv~~~as~~eIk~aYr~L 224 (267)
T PRK09430 145 SLHPNERQVLYVIAEELGFSRFQFDQLLRMMQAGFRFQQQQGGGGYQQAQRGPTLEDAYKVLGVSESDDDQEIKRAYRKL 224 (267)
T ss_pred CCCHHHHHHHHHHHHHcCCCHHHHHHHHHHHHHHHhhcccccccccccccCCCcHHhHHHHcCCCCCCCHHHHHHHHHHH
Confidence 3888899999999999999987776665542 23568899999999
Q ss_pred Hhhc
Q 003198 221 KEKY 224 (840)
Q Consensus 221 ~~k~ 224 (840)
-.++
T Consensus 225 ~~~~ 228 (267)
T PRK09430 225 MSEH 228 (267)
T ss_pred HHHh
Confidence 7765
No 47
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=28.07 E-value=30 Score=42.33 Aligned_cols=29 Identities=14% Similarity=-0.029 Sum_probs=21.1
Q ss_pred hHHHHHHHHHHHH-----HHHHHHHHHHHHHHHh
Q 003198 27 LTYKLNQLKKQVQ-----AERVVSVKDKIEKNRK 55 (840)
Q Consensus 27 L~~~i~~LKkqi~-----~~R~~~ik~k~e~n~~ 55 (840)
+.-++..++.+-+ ++|+..||+++.++++
T Consensus 18 ~~r~~~~~~~K~~~~~~~~~~~e~i~~~~~E~k~ 51 (739)
T KOG1079|consen 18 RKRVREADEGKSAKSKNPADRLEKIKILNCEWKK 51 (739)
T ss_pred HHHHHHHhhhhhhcccCHHHHHHHHHHHHHHHhh
Confidence 3344444555555 7899999999999988
No 48
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=27.53 E-value=35 Score=28.22 Aligned_cols=15 Identities=20% Similarity=0.171 Sum_probs=11.3
Q ss_pred EEEEccCCCCCCeEE
Q 003198 788 GIFAKEHIEASEELF 802 (840)
Q Consensus 788 ~ifA~RdI~aGEELT 802 (840)
.++|.|||++|+.|+
T Consensus 3 vvVA~~di~~G~~i~ 17 (63)
T PF08666_consen 3 VVVAARDIPAGTVIT 17 (63)
T ss_dssp EEEESSTB-TT-BEC
T ss_pred EEEEeCccCCCCEEc
Confidence 378999999999995
No 49
>KOG0457 consensus Histone acetyltransferase complex SAGA/ADA, subunit ADA2 [Chromatin structure and dynamics]
Probab=25.61 E-value=1e+02 Score=36.17 Aligned_cols=39 Identities=33% Similarity=0.442 Sum_probs=33.8
Q ss_pred CCCCcHHHHHHHHHhhhhcC-CchHHHHHhhhCCCCcHHHHH
Q 003198 473 SSEWKPIEKELYLKGVEIFG-RNSCLIARNLLSGLKTCMEVS 513 (840)
Q Consensus 473 ~~~W~~~E~~L~~k~v~~fg-~N~C~iA~~ll~g~KTC~EV~ 513 (840)
...|+.-|..|+++++++|| .|+=-||..+ |+||=-|+-
T Consensus 72 ~~~WtadEEilLLea~~t~G~GNW~dIA~hI--GtKtkeeck 111 (438)
T KOG0457|consen 72 DPSWTADEEILLLEAAETYGFGNWQDIADHI--GTKTKEECK 111 (438)
T ss_pred CCCCChHHHHHHHHHHHHhCCCcHHHHHHHH--cccchHHHH
Confidence 56899999999999999999 6999999987 888855553
No 50
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=24.74 E-value=50 Score=38.37 Aligned_cols=44 Identities=25% Similarity=0.272 Sum_probs=32.0
Q ss_pred CccccccCC---CCCCcceeEEEEcCeeEEEEEEccCCCCCCeEEEecCCCCC
Q 003198 761 DKLKFANHS---SNPNCFAKVMLVAGDHRVGIFAKEHIEASEELFYDYRYGPD 810 (840)
Q Consensus 761 N~aRFINHS---C~PNc~~~~v~V~g~~rI~ifA~RdI~aGEELTfDYgy~~d 810 (840)
-.+-|+||- |+.|..+ +..-+-++|.|+|++|+|+.--||..++
T Consensus 217 p~ad~lNhd~~k~nanl~y------~~NcL~mva~r~iekgdev~n~dg~~p~ 263 (466)
T KOG1338|consen 217 PIADFLNHDGLKANANLRY------EDNCLEMVADRNIEKGDEVDNSDGLKPM 263 (466)
T ss_pred chhhhhccchhhcccceec------cCcceeeeecCCCCCccccccccccCcc
Confidence 356789995 5555432 3444577999999999999999985544
No 51
>smart00760 Bac_DnaA_C Bacterial dnaA protein helix-turn-helix domain. Could be involved in DNA-binding.
Probab=24.43 E-value=64 Score=27.16 Aligned_cols=22 Identities=27% Similarity=0.709 Sum_probs=19.1
Q ss_pred HHHHHHHHHHhcCCcHHHHHHH
Q 003198 196 EEVINAVSQFIGIATSEVQDRY 217 (840)
Q Consensus 196 ~~v~~~l~~~~~~~~sei~eRy 217 (840)
|+|+++|++++++++.||..+-
T Consensus 3 ~~I~~~Va~~~~i~~~~i~s~~ 24 (60)
T smart00760 3 EEIIEAVAEYFGVKPEDLKSKS 24 (60)
T ss_pred HHHHHHHHHHhCCCHHHHhcCC
Confidence 7899999999999999996543
No 52
>PLN03142 Probable chromatin-remodeling complex ATPase chain; Provisional
Probab=23.55 E-value=77 Score=41.21 Aligned_cols=48 Identities=21% Similarity=0.344 Sum_probs=38.2
Q ss_pred cccCCcccchhhhhHHhhcCCh--HHHHHHHHHH--------h-cCCcHHHHHHHHHhH
Q 003198 174 KHEFSDGEDRILWTVFEEHGLG--EEVINAVSQF--------I-GIATSEVQDRYSTLK 221 (840)
Q Consensus 174 k~~f~~~ed~~~~~~~~e~g~~--~~v~~~l~~~--------~-~~~~sei~eRy~~L~ 221 (840)
++-|++.||++|=..+..||+. |+|...|.+. | ++|+.||+.|...|-
T Consensus 926 ~~~~~~~~d~~~~~~~~~~g~~~~~~~~~~i~~~~~f~fd~~~~srt~~~~~~r~~~l~ 984 (1033)
T PLN03142 926 GKLYNEECDRFMLCMVHKLGYGNWDELKAAFRTSPLFRFDWFVKSRTPQELARRCDTLI 984 (1033)
T ss_pred CCcCCHHHHHHHHHHHHHhccchHHHHHHHHHhCCceeeehhhccCCHHHHHHHHHHHH
Confidence 4679999999999999999984 4455555432 2 999999999999885
No 53
>TIGR02726 phenyl_P_delta phenylphosphate carboxylase, delta subunit. Members of this protein family are the delta subunit of phenylphosphate carboxylase. Phenol (hydroxybenzene) is converted to phenylphosphate, then para-carboxylated by this four-subunit enzyme, with the release of phosphate, to 4-hydroxybenzoate. The enzyme contains neither biotin nor thiamin pyrophosphate. This delta subunit belongs to HAD family hydrolases.
Probab=23.49 E-value=63 Score=32.96 Aligned_cols=49 Identities=12% Similarity=-0.014 Sum_probs=35.8
Q ss_pred eeeEeCCCCeEEEecCCccccCCCcccc----ccCCcccchhhhhHHhhcCCh
Q 003198 147 RIYYDQHGSEALVCSDSEEDIIEPEEEK----HEFSDGEDRILWTVFEEHGLG 195 (840)
Q Consensus 147 riYYd~~g~Ealicsdseee~~e~eeek----~~f~~~ed~~~~~~~~e~g~~ 195 (840)
+||||+.|+|.-.+|-.+...+.-=.++ .-.|.....++++.++.+|+.
T Consensus 22 ~~~~~~~g~~~~~~~~~D~~~~~~L~~~Gi~laIiT~k~~~~~~~~l~~lgi~ 74 (169)
T TIGR02726 22 RIVINDEGIESRNFDIKDGMGVIVLQLCGIDVAIITSKKSGAVRHRAEELKIK 74 (169)
T ss_pred eEEEcCCCcEEEEEecchHHHHHHHHHCCCEEEEEECCCcHHHHHHHHHCCCc
Confidence 7999999999999998877644222111 345666777888888888885
No 54
>PF14100 PmoA: Methane oxygenase PmoA
Probab=22.80 E-value=93 Score=34.13 Aligned_cols=102 Identities=16% Similarity=0.162 Sum_probs=56.9
Q ss_pred ccEEEEecCCCCceEEeccccCCCCeEEeccccccCHHHHHHHhhhhcccCc--cccccCCCcEEEeccccCCccccccC
Q 003198 691 QRILLAKSDVAGWGAFLKNSVSKNDYLGEYTGELISHREADKRGKIYDRANS--SFLFDLNDQYVLDAYRKGDKLKFANH 768 (840)
Q Consensus 691 ~~v~V~kS~~kG~GLfA~edI~kGefI~EY~GEIIs~~Ea~rR~k~yd~~~~--sYlf~L~~~~~IDA~~~GN~aRFINH 768 (840)
..|.+....-.|+++++.+.+..|.++.. +-............. .|...++.. ..... .-|++|
T Consensus 143 ~~v~l~~~~yGGl~~R~~~~~~~g~v~~s--------~G~~g~~~~~g~~a~Wv~~~g~~~~~-----~~~~~-i~~~dh 208 (271)
T PF14100_consen 143 DPVTLGDPGYGGLFWRAARSWDGGTVLTS--------EGKTGEEAAWGKRAPWVDYSGPIDGE-----DGTSG-IAILDH 208 (271)
T ss_pred cceEecCCCcceEEEEccCcccCCeEECC--------CCCcCcccccCCccCceEEEeeeCCC-----cceEE-EEEEeC
Confidence 35667766667889999988855555432 111000001111100 111111111 00111 248899
Q ss_pred CCCCCcceeEEEEcCeeEEEE------EEccCCCCCCeEEEecCC
Q 003198 769 SSNPNCFAKVMLVAGDHRVGI------FAKEHIEASEELFYDYRY 807 (840)
Q Consensus 769 SC~PNc~~~~v~V~g~~rI~i------fA~RdI~aGEELTfDYgy 807 (840)
--+||- ...|.+.+...+++ ..--.|++||.|++.|+.
T Consensus 209 P~N~~~-P~~W~vR~~g~~~~~p~~~~~~~~~l~~G~~l~~rYr~ 252 (271)
T PF14100_consen 209 PSNPNY-PTPWHVRGYGLFGANPAPAFDGPLTLPPGETLTLRYRV 252 (271)
T ss_pred CCCCCC-CcceEEeccCcceecccccccCceecCCCCeEEEEEEE
Confidence 998875 57888886655544 445689999999999974
No 55
>smart00286 PTI Plant trypsin inhibitors.
Probab=21.61 E-value=63 Score=24.22 Aligned_cols=20 Identities=40% Similarity=0.841 Sum_probs=17.3
Q ss_pred ccCCCCCCCCCCCCcccCCC
Q 003198 592 YTPCGCQSMCGKQCPCLHNG 611 (840)
Q Consensus 592 y~PC~c~~~C~~~C~C~~~g 611 (840)
+++|...+.|-..|.|..+|
T Consensus 7 lm~Ck~DsDCl~~CiC~~~G 26 (29)
T smart00286 7 LMECKRDSDCMAECICLANG 26 (29)
T ss_pred hhccccccCcccCCEEcccc
Confidence 67888888999999999876
No 56
>KOG3813 consensus Uncharacterized conserved protein (tumor-suppressor AXUD1 in humans) [General function prediction only]
Probab=21.34 E-value=42 Score=39.92 Aligned_cols=36 Identities=33% Similarity=1.044 Sum_probs=25.6
Q ss_pred cccCCCCCCCCC-CCCcccCCCccccC-----CCCCC-ccccc
Q 003198 591 QYTPCGCQSMCG-KQCPCLHNGTCCEK-----YCGCS-KSCKN 626 (840)
Q Consensus 591 ~y~PC~c~~~C~-~~C~C~~~g~~Ce~-----~CgC~-~~C~n 626 (840)
+--.|+|.+-|+ ..|.|.+.|.-|.. .|||. ..|.|
T Consensus 306 eeCGCsCr~~CdPETCaCSqaGIkCQvDr~~fPCgC~rEgCgN 348 (640)
T KOG3813|consen 306 EECGCSCRGVCDPETCACSQAGIKCQVDRGEFPCGCFREGCGN 348 (640)
T ss_pred HhhCCcccceeChhhcchhccCceEeecCcccccccchhhcCC
Confidence 345688889999 58999999987643 36665 34555
No 57
>cd00150 PlantTI Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges.
Probab=21.28 E-value=63 Score=23.87 Aligned_cols=20 Identities=40% Similarity=0.846 Sum_probs=17.0
Q ss_pred ccCCCCCCCCCCCCcccCCC
Q 003198 592 YTPCGCQSMCGKQCPCLHNG 611 (840)
Q Consensus 592 y~PC~c~~~C~~~C~C~~~g 611 (840)
+++|...+.|-..|.|..+|
T Consensus 5 lm~Ck~DsDCl~~CiC~~~G 24 (27)
T cd00150 5 LMECKRDSDCLAECICLENG 24 (27)
T ss_pred heeccccccccCCCEEcccc
Confidence 56788888898999999876
No 58
>PF13404 HTH_AsnC-type: AsnC-type helix-turn-helix domain; PDB: 2ZNY_E 2ZNZ_G 1RI7_A 2CYY_A 2E1C_A 2VC1_B 2QZ8_A 2W29_C 2IVM_B 2VBX_B ....
Probab=21.22 E-value=1e+02 Score=24.48 Aligned_cols=38 Identities=24% Similarity=0.470 Sum_probs=27.1
Q ss_pred chhhhhHHhhcCChHHHHHHHHHHhcCCcHHHHHHHHHhH
Q 003198 182 DRILWTVFEEHGLGEEVINAVSQFIGIATSEVQDRYSTLK 221 (840)
Q Consensus 182 d~~~~~~~~e~g~~~~v~~~l~~~~~~~~sei~eRy~~L~ 221 (840)
|+-|--.+|+.| ..=+..|++-+|.+++.+.+|.+.|+
T Consensus 5 D~~Il~~Lq~d~--r~s~~~la~~lglS~~~v~~Ri~rL~ 42 (42)
T PF13404_consen 5 DRKILRLLQEDG--RRSYAELAEELGLSESTVRRRIRRLE 42 (42)
T ss_dssp HHHHHHHHHH-T--TS-HHHHHHHHTS-HHHHHHHHHHHH
T ss_pred HHHHHHHHHHcC--CccHHHHHHHHCcCHHHHHHHHHHhC
Confidence 444555666663 34478999999999999999999874
No 59
>smart00468 PreSET N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.
Probab=21.15 E-value=70 Score=29.36 Aligned_cols=21 Identities=24% Similarity=0.860 Sum_probs=12.9
Q ss_pred ccccCCCCCCCCCCC--CcccCC
Q 003198 590 KQYTPCGCQSMCGKQ--CPCLHN 610 (840)
Q Consensus 590 ~~y~PC~c~~~C~~~--C~C~~~ 610 (840)
....-|+|.+.|... |.|+..
T Consensus 47 ~~~~gC~C~~~C~~~~~C~C~~~ 69 (98)
T smart00468 47 SPLVGCSCSGDCSSSNKCECARK 69 (98)
T ss_pred CCCCCCcCCCCCCCCCcCCcHhh
Confidence 455567777777632 777653
No 60
>KOG3988 consensus Protein-tyrosine sulfotransferase TPST1/TPST2 [Posttranslational modification, protein turnover, chaperones]
Probab=20.91 E-value=66 Score=36.04 Aligned_cols=21 Identities=43% Similarity=0.898 Sum_probs=19.3
Q ss_pred hhHHhhcCChHHHH-HHHHHHh
Q 003198 186 WTVFEEHGLGEEVI-NAVSQFI 206 (840)
Q Consensus 186 ~~~~~e~g~~~~v~-~~l~~~~ 206 (840)
|..+||.|.++||| +|+++|+
T Consensus 122 ~~rl~eaGvT~EV~d~AisaFi 143 (378)
T KOG3988|consen 122 WLRLQEAGVTDEVLDSAISAFI 143 (378)
T ss_pred HhhhhhccchHHHHHHHHHHHH
Confidence 77889999999999 8999997
No 61
>PF03656 Pam16: Pam16; InterPro: IPR005341 The Pam16 protein is the fifth essential subunit of the pre-sequence translocase-associated protein import motor (PAM). In Saccharomyces cerevisiae (Baker's yeast), Pam16 is required for preprotein translocation into the matrix, but not for protein insertion into the inner membrane.; PDB: 2GUZ_J.
Probab=20.39 E-value=78 Score=31.31 Aligned_cols=36 Identities=25% Similarity=0.450 Sum_probs=20.6
Q ss_pred hhHHhhcCChHHHHHHHHHHhcCCcHHHHHHHHHhHhhcCCC
Q 003198 186 WTVFEEHGLGEEVINAVSQFIGIATSEVQDRYSTLKEKYDGK 227 (840)
Q Consensus 186 ~~~~~e~g~~~~v~~~l~~~~~~~~sei~eRy~~L~~k~~~~ 227 (840)
.|+++|. -.||+ |.. ..+.+||++||+.|.+-|++.
T Consensus 54 ~Mtl~EA---~~ILn-v~~--~~~~eeI~k~y~~Lf~~Nd~~ 89 (127)
T PF03656_consen 54 GMTLDEA---RQILN-VKE--ELSREEIQKRYKHLFKANDPS 89 (127)
T ss_dssp ---HHHH---HHHHT---G----SHHHHHHHHHHHHHHT-CC
T ss_pred CCCHHHH---HHHcC-CCC--ccCHHHHHHHHHHHHhccCCC
Confidence 4555552 33444 222 678899999999999988764
Done!