Query 026129
Match_columns 243
No_of_seqs 290 out of 1720
Neff 8.0
Searched_HMMs 46136
Date Fri Mar 29 04:07:46 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/026129.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/026129hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG4442 Clathrin coat binding 100.0 5E-52 1.1E-56 383.9 15.9 205 39-243 39-250 (729)
2 KOG1082 Histone H3 (Lys9) meth 100.0 6.9E-38 1.5E-42 282.8 11.3 212 20-235 64-324 (364)
3 KOG1079 Transcriptional repres 100.0 9.2E-33 2E-37 254.9 9.5 152 84-237 553-715 (739)
4 KOG1080 Histone H3 (Lys4) meth 100.0 3.5E-31 7.5E-36 259.0 10.8 129 115-243 867-995 (1005)
5 smart00317 SET SET (Su(var)3-9 100.0 7.3E-28 1.6E-32 181.9 14.0 116 115-230 1-116 (116)
6 KOG1083 Putative transcription 99.9 2.1E-27 4.5E-32 227.3 4.0 134 103-237 1166-1300(1306)
7 KOG1141 Predicted histone meth 99.9 1.8E-26 4E-31 216.3 3.6 124 28-151 685-836 (1262)
8 KOG1085 Predicted methyltransf 99.9 4.1E-22 8.9E-27 168.9 9.7 124 111-234 253-380 (392)
9 COG2940 Proteins containing SE 99.7 1.6E-18 3.4E-23 162.2 5.7 134 103-237 321-455 (480)
10 PF00856 SET: SET domain; Int 99.7 8.7E-18 1.9E-22 132.5 7.4 107 125-231 1-162 (162)
11 KOG2589 Histone tail methylase 99.4 2.8E-13 6E-18 118.8 5.0 115 122-243 135-250 (453)
12 KOG1081 Transcription factor N 99.3 4E-13 8.7E-18 124.5 1.2 141 87-243 286-427 (463)
13 smart00570 AWS associated with 99.1 3.1E-11 6.6E-16 77.8 2.2 49 64-112 2-50 (51)
14 KOG2461 Transcription factor B 98.9 2E-09 4.3E-14 98.0 5.7 112 112-236 26-148 (396)
15 PF05033 Pre-SET: Pre-SET moti 98.9 8E-10 1.7E-14 82.4 2.4 84 22-106 3-103 (103)
16 smart00468 PreSET N-terminal t 98.4 4E-07 8.6E-12 67.3 4.3 58 20-77 3-61 (98)
17 KOG1081 Transcription factor N 98.0 3.2E-06 6.9E-11 78.8 3.1 234 1-237 1-248 (463)
18 KOG1141 Predicted histone meth 97.5 0.00032 6.9E-09 68.1 7.1 154 90-243 981-1252(1262)
19 KOG1337 N-methyltransferase [G 91.6 0.14 2.9E-06 48.3 2.8 41 189-232 238-278 (472)
20 KOG2084 Predicted histone tail 91.1 0.39 8.3E-06 44.5 5.3 44 190-237 208-252 (482)
21 PF03638 TCR: Tesmin/TSO1-like 78.1 1.6 3.5E-05 26.9 1.6 37 65-107 3-40 (42)
22 KOG1338 Uncharacterized conser 76.4 1.8 3.9E-05 39.6 2.1 39 187-231 218-259 (466)
23 PF08666 SAF: SAF domain; Int 76.1 1.6 3.5E-05 28.6 1.4 15 213-227 3-17 (63)
24 KOG2155 Tubulin-tyrosine ligas 57.9 5.6 0.00012 37.1 1.4 50 186-235 203-254 (631)
25 PF02067 Metallothio_5: Metall 56.5 9.3 0.0002 23.3 1.7 22 80-103 6-27 (41)
26 smart00858 SAF This domain fam 53.3 9.1 0.0002 24.8 1.6 16 213-228 3-18 (64)
27 KOG1079 Transcriptional repres 48.7 7.8 0.00017 37.8 0.8 28 80-107 512-539 (739)
28 smart00317 SET SET (Su(var)3-9 46.5 38 0.00082 24.2 4.2 16 126-141 98-113 (116)
29 KOG1171 Metallothionein-like p 42.8 7.8 0.00017 35.7 -0.1 37 63-105 215-252 (406)
30 PF08487 VIT: Vault protein in 40.0 1.1E+02 0.0025 22.7 6.0 34 130-163 39-81 (118)
31 PF14100 PmoA: Methane oxygena 39.2 44 0.00094 29.1 4.0 47 189-236 204-256 (271)
32 COG1188 Ribosome-associated he 34.8 50 0.0011 24.4 3.1 21 215-235 44-64 (100)
33 KOG4454 RNA binding protein (R 33.5 71 0.0015 27.2 4.1 45 1-47 1-54 (267)
34 KOG1338 Uncharacterized conser 32.7 30 0.00065 32.0 1.9 25 122-146 38-62 (466)
35 TIGR03569 NeuB_NnaB N-acetylne 27.8 34 0.00073 30.8 1.5 19 211-229 277-295 (329)
36 PF07773 DUF1619: Protein of u 27.2 39 0.00084 29.5 1.7 6 80-85 16-21 (294)
37 TIGR02059 swm_rep_I cyanobacte 25.7 1.7E+02 0.0036 21.7 4.5 29 205-233 69-98 (101)
38 cd05468 pVHL von Hippel-Landau 23.5 1.2E+02 0.0025 23.7 3.6 36 189-229 12-47 (141)
39 TIGR03586 PseI pseudaminic aci 23.3 51 0.0011 29.6 1.7 19 211-229 275-293 (327)
40 PF11720 Inhibitor_I78: Peptid 20.6 36 0.00077 22.4 0.1 18 216-233 25-42 (60)
No 1
>KOG4442 consensus Clathrin coat binding protein/Huntingtin interacting protein HIP1, involved in regulation of endocytosis [Intracellular trafficking, secretion, and vesicular transport]
Probab=100.00 E-value=5e-52 Score=383.90 Aligned_cols=205 Identities=39% Similarity=0.768 Sum_probs=190.1
Q ss_pred CCCCCcEEcccceeccccccccCCCCCCccccCCCCC----CCCCCCCCCCccceeecCCC-CCC-CCCCCCCcccccCC
Q 026129 39 PKAIPYVFIKRNIYLTKRIKRRLEDDGIFCSCTASPG----SSGVCDRDCHCGMLLSSCSS-GCK-CGNSCLNKPFQNRP 112 (243)
Q Consensus 39 ~~p~~f~~i~~n~~~~~~~~~~~~~~~~~C~C~~~~~----~~~~C~~~C~c~~~~~eC~~-~C~-c~~~C~Nr~~q~~~ 112 (243)
..|..|.-+..++|.....+.....+.+.|+|...-+ ..+.|+.+|.|+++..||++ .|. |+..|.|+.||+..
T Consensus 39 e~~~~f~~~~e~~y~~krk~~~ee~~~m~Cdc~~~~~d~~n~~~~cg~~CiNr~t~iECs~~~C~~cg~~C~NQRFQkkq 118 (729)
T KOG4442|consen 39 EALTKFENLDEKFYANKRKKKKEENDEMICDCKPKTGDGANGACACGEDCINRMTSIECSDRECPRCGVYCKNQRFQKKQ 118 (729)
T ss_pred ccchhhhhhhhhhhHHhhccCcccCcceeeecccccccccccccccCccccchhhhcccCCccCCCccccccchhhhhhc
Confidence 4566788888888877765554444677899998543 35678999999999999999 899 99999999999999
Q ss_pred ccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeecccceecccccCCcccccC
Q 026129 113 VKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEINRDMVIDATYKGNKSRYIN 192 (243)
Q Consensus 113 ~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~~~~~iDa~~~Gn~~RfiN 192 (243)
..+++||.|+++||||+|.++|++|+||+||.||||+..+++.|...|...+..++|+|.+..+.+|||+.+||++||||
T Consensus 119 yA~vevF~Te~KG~GLRA~~dI~~g~FI~EY~GEVI~~~Ef~kR~~~Y~~d~~kh~Yfm~L~~~e~IDAT~KGnlaRFiN 198 (729)
T KOG4442|consen 119 YAKVEVFLTEKKGCGLRAEEDIPKGQFILEYIGEVIEEKEFEKRVKRYAKDGIKHYYFMALQGGEYIDATKKGNLARFIN 198 (729)
T ss_pred cCceeEEEecCcccceeeccccCCCcEEeeeccccccHHHHHHHHHHHHhcCCceEEEEEecCCceecccccCcHHHhhc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred CCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCCC-CCcccC
Q 026129 193 HSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLHD-SLIAYC 243 (243)
Q Consensus 193 HSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~~-~~~C~C 243 (243)
|||+|||.+++|.+++..||+|||.|.|.+||||||||++++++. +|+|+|
T Consensus 199 HSC~PNa~~~KWtV~~~lRvGiFakk~I~~GEEITFDYqf~rYGr~AQ~CyC 250 (729)
T KOG4442|consen 199 HSCDPNAEVQKWTVPDELRVGIFAKKVIKPGEEITFDYQFDRYGRDAQPCYC 250 (729)
T ss_pred CCCCCCceeeeeeeCCeeEEEEeEecccCCCceeeEeccccccccccccccc
Confidence 999999999999999999999999999999999999999999987 699999
No 2
>KOG1082 consensus Histone H3 (Lys9) methyltransferase SUV39H1/Clr4, required for transcriptional silencing [Chromatin structure and dynamics; Transcription]
Probab=100.00 E-value=6.9e-38 Score=282.79 Aligned_cols=212 Identities=29% Similarity=0.453 Sum_probs=161.3
Q ss_pred HHHHhCCCeeecCCCccCCCCCCCcEEcccceeccccccccCCCCCCccccCCCCCCCCCCCCCCCcc------------
Q 026129 20 LLKQIGNPVEFELPDWFIKPKAIPYVFIKRNIYLTKRIKRRLEDDGIFCSCTASPGSSGVCDRDCHCG------------ 87 (243)
Q Consensus 20 ~~~~~~~~~~~~~p~~~~~~~p~~f~~i~~n~~~~~~~~~~~~~~~~~C~C~~~~~~~~~C~~~C~c~------------ 87 (243)
++.+..+..++++-++++...++.|+|+...++... ..........|.|...+... .|. .|.|.
T Consensus 64 d~~~~~e~~~v~~~n~id~~~~~~f~y~~~~~~~~~--~~~~~~~~~~c~C~~~~~~~-~~~-~C~C~~~n~~~~~~~~~ 139 (364)
T KOG1082|consen 64 DIALGSENLPVPLVNRIDEDAPLYFQYIATEIVDPG--ELSDCENSTGCRCCSSCSSV-LPL-TCLCERHNGGLVAYTCD 139 (364)
T ss_pred cccCccccCceeeeeeccCCccccceeccccccCcc--ccccCccccCCCccCCCCCC-CCc-cccChHhhCCccccccC
Confidence 344444444455555665444478999999877553 22334566789999876543 111 24442
Q ss_pred ----------ceeecCCCCCCCCCCCCCcccccCCccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHH
Q 026129 88 ----------MLLSSCSSGCKCGNSCLNKPFQNRPVKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERL 157 (243)
Q Consensus 88 ----------~~~~eC~~~C~c~~~C~Nr~~q~~~~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~ 157 (243)
..++||++.|+|+..|.||++|++...+|+|++++.+||||++.+.|++|+||+||+||+++..+++.+.
T Consensus 140 ~~~~~~~~~~~~i~EC~~~C~C~~~C~nRv~q~g~~~~leIfrt~~kGwgvRs~~~I~~G~fvcEyaGe~~t~~e~~~~~ 219 (364)
T KOG1082|consen 140 GDCGTLGKFKEPVFECSVACGCHPDCANRVVQKGLQFHLEVFRTPEKGWGVRTLDPIPAGEFVCEYAGEVLTSEEAQRRT 219 (364)
T ss_pred CccccccccCccccccccCCCCCCcCcchhhccccccceEEEecCCceeeecccccccCCCeeEEEeeEecChHHhhhcc
Confidence 2489999999999999999999999999999999999999999999999999999999999999998773
Q ss_pred HHhhhcCC--ccee---------------------eeeecccceecccccCCcccccCCCCCCCceeEEEEECC----eE
Q 026129 158 WKMKHLGE--TNFY---------------------LCEINRDMVIDATYKGNKSRYINHSCCPNTEMQKWIIDG----ET 210 (243)
Q Consensus 158 ~~~~~~~~--~~~y---------------------~~~~~~~~~iDa~~~Gn~~RfiNHSC~PN~~~~~~~~~~----~~ 210 (243)
........ ...+ .......+.|||...||++|||||||.||+.++.+..++ ..
T Consensus 220 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ida~~~GNv~RfinHSC~PN~~~~~v~~~~~~~~~~ 299 (364)
T KOG1082|consen 220 HLREYLDDDCDAYSIADREWVDESPVGNTFVAPSLPGGPGRELLIDAKPHGNVARFINHSCSPNLLYQAVFQDEFVLLYL 299 (364)
T ss_pred ccccccccccccchhhhccccccccccccccccccccCCCcceEEchhhcccccccccCCCCccceeeeeeecCCccchh
Confidence 22211111 0010 111245689999999999999999999999998887663 36
Q ss_pred EEEEEEcCCCCCCCeEEEecCCCcC
Q 026129 211 RIGIFATRDIKKGENLTYDYQYEFL 235 (243)
Q Consensus 211 ~i~i~A~rdI~~GEELt~dY~~~~~ 235 (243)
+++|||+++|.+|||||+|||..+.
T Consensus 300 ~i~ffa~~~I~p~~ELT~dYg~~~~ 324 (364)
T KOG1082|consen 300 RIGFFALRDISPGEELTLDYGKAYK 324 (364)
T ss_pred eeeeeeccccCCCcccchhhccccc
Confidence 8999999999999999999998764
No 3
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=99.98 E-value=9.2e-33 Score=254.88 Aligned_cols=152 Identities=36% Similarity=0.632 Sum_probs=139.0
Q ss_pred CCccceeecCCC-CCCC----------CCCCCCcccccCCccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHH
Q 026129 84 CHCGMLLSSCSS-GCKC----------GNSCLNKPFQNRPVKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQT 152 (243)
Q Consensus 84 C~c~~~~~eC~~-~C~c----------~~~C~Nr~~q~~~~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~ 152 (243)
|+|.+...||.| .|.+ ..+|.|--+|++.++++.|..|...|||||+++.+.+++||.||+||+|+++|
T Consensus 553 CpC~~A~rECdPd~Cl~cg~~~~~d~~~~~C~N~~l~~~~qkr~llapSdVaGwGlFlKe~v~KnefisEY~GE~IS~dE 632 (739)
T KOG1079|consen 553 CPCYLAVRECDPDVCLMCGNVDHFDSSKISCKNTNLQRGEQKRVLLAPSDVAGWGLFLKESVSKNEFISEYTGEIISHDE 632 (739)
T ss_pred CchhhhccccCchHHhccCcccccccCccccccchhhhhhhcceeechhhccccceeeccccCCCceeeeecceeccchh
Confidence 777788889986 3654 23799999999999999999999999999999999999999999999999999
Q ss_pred HHHHHHHhhhcCCcceeeeeecccceecccccCCcccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCC
Q 026129 153 CEERLWKMKHLGETNFYLCEINRDMVIDATYKGNKSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQY 232 (243)
Q Consensus 153 ~~~r~~~~~~~~~~~~y~~~~~~~~~iDa~~~Gn~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~ 232 (243)
+++|.+.|... ..+|+|+++.+++|||++.||.+||+|||-+|||.+..+.+.|.+||+|||.|.|.+||||||||++
T Consensus 633 ADrRGkiYDr~--~cSflFnln~dyviDs~rkGnk~rFANHS~nPNCYAkvm~V~GdhRIGifAkRaIeagEELffDYrY 710 (739)
T KOG1079|consen 633 ADRRGKIYDRY--MCSFLFNLNNDYVIDSTRKGNKIRFANHSFNPNCYAKVMMVAGDHRIGIFAKRAIEAGEELFFDYRY 710 (739)
T ss_pred hhhcccccccc--cceeeeeccccceEeeeeecchhhhccCCCCCCcEEEEEEecCCcceeeeehhhcccCceeeeeecc
Confidence 99998776543 4569999999999999999999999999999999999999999999999999999999999999999
Q ss_pred CcCCC
Q 026129 233 EFLHD 237 (243)
Q Consensus 233 ~~~~~ 237 (243)
+-.++
T Consensus 711 s~~~~ 715 (739)
T KOG1079|consen 711 SPEHA 715 (739)
T ss_pred Ccccc
Confidence 87654
No 4
>KOG1080 consensus Histone H3 (Lys4) methyltransferase complex, subunit SET1 and related methyltransferases [Chromatin structure and dynamics; Transcription]
Probab=99.97 E-value=3.5e-31 Score=258.96 Aligned_cols=129 Identities=38% Similarity=0.672 Sum_probs=124.2
Q ss_pred ceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeecccceecccccCCcccccCCC
Q 026129 115 KMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEINRDMVIDATYKGNKSRYINHS 194 (243)
Q Consensus 115 ~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~~~~~iDa~~~Gn~~RfiNHS 194 (243)
.|...++..+||||||.+.|.+|++|+||+||+|...-++.|...|...+....|+|.++...+|||+..||+|||||||
T Consensus 867 ~~~F~~s~iH~wglfa~~~i~~~dmViEY~Ge~vR~~iad~RE~~Y~~~gi~~sYlfrid~~~ViDAtk~gniAr~InHs 946 (1005)
T KOG1080|consen 867 YVKFGRSGIHGWGLFAMENIAAGDMVIEYRGELVRSSIADLREARYERMGIGDSYLFRIDDEVVVDATKKGNIARFINHS 946 (1005)
T ss_pred hhccccccccccceeeccCccccceEEEeeceehhhhHHHHHHHHHhccCcccceeeecccceEEeccccCchhheeecc
Confidence 46777899999999999999999999999999999999999999999988889999999999999999999999999999
Q ss_pred CCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCCCCCcccC
Q 026129 195 CCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLHDSLIAYC 243 (243)
Q Consensus 195 C~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~~~~~C~C 243 (243)
|+|||....+.++|+.+|+|||.|+|.+||||||||.+...+.+..|+|
T Consensus 947 C~PNCyakvi~V~g~~~IvIyakr~I~~~EElTYDYkF~~e~~kipClC 995 (1005)
T KOG1080|consen 947 CNPNCYAKVITVEGDKRIVIYSKRDIAAGEELTYDYKFPTEDDKIPCLC 995 (1005)
T ss_pred cCCCceeeEEEecCeeEEEEEEecccccCceeeeecccccccccccccc
Confidence 9999999999999999999999999999999999999999999999998
No 5
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=99.96 E-value=7.3e-28 Score=181.91 Aligned_cols=116 Identities=49% Similarity=0.782 Sum_probs=101.8
Q ss_pred ceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeecccceecccccCCcccccCCC
Q 026129 115 KMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEINRDMVIDATYKGNKSRYINHS 194 (243)
Q Consensus 115 ~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~~~~~iDa~~~Gn~~RfiNHS 194 (243)
+++++.++++|+||||+++|++|++|++|.|.++...++..+...+........|++.....+.||+...||++||||||
T Consensus 1 ~~~~~~~~~~G~gl~a~~~i~~g~~i~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~id~~~~~~~~~~iNHs 80 (116)
T smart00317 1 KLEVFKSPGKGWGVRATEDIPKGEFIGEYVGEIITSEEAEERSKAYDTDGADSFYLFEIDSDLCIDARRKGNIARFINHS 80 (116)
T ss_pred CcEEEecCCCcEEEEECCccCCCCEEEEEEeEEECHHHHHHHHHHHHhcCCCCEEEEECCCCEEEeCCccCcHHHeeCCC
Confidence 35778888999999999999999999999999999988877654344444335678888778999999999999999999
Q ss_pred CCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEec
Q 026129 195 CCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDY 230 (243)
Q Consensus 195 C~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY 230 (243)
|.||+.+..+..++..++.++|+|||++|||||+||
T Consensus 81 c~pN~~~~~~~~~~~~~~~~~a~r~I~~GeEi~i~Y 116 (116)
T smart00317 81 CEPNCELLFVEVNGDSRIVIFALRDIKPGEELTIDY 116 (116)
T ss_pred CCCCEEEEEEEECCCcEEEEEECCCcCCCCEEeecC
Confidence 999999998888777789999999999999999999
No 6
>KOG1083 consensus Putative transcription factor ASH1/LIN-59 [Transcription]
Probab=99.93 E-value=2.1e-27 Score=227.32 Aligned_cols=134 Identities=42% Similarity=0.776 Sum_probs=120.2
Q ss_pred CCCcccccCC-ccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeecccceecc
Q 026129 103 CLNKPFQNRP-VKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEINRDMVIDA 181 (243)
Q Consensus 103 C~Nr~~q~~~-~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~~~~~iDa 181 (243)
|.|+.+++.. -.+|++++.+.+||||.|.++|++|+||+||+|+|++.++.+.++....+. ..+.|...++.+++||+
T Consensus 1166 c~nqrm~r~e~cp~L~v~~gp~~G~~v~tk~PikagtfI~EYvGeVit~ke~e~~mmtl~~~-d~~~~cL~I~p~l~id~ 1244 (1306)
T KOG1083|consen 1166 CSNQRMQRHEECPPLEVFRGPKKGWGVRTKEPIKAGTFIMEYVGEVITEKEFEPRMMTLYHN-DDDHYCLVIDPGLFIDI 1244 (1306)
T ss_pred hhhHHhhhhccCCCcceeccCCCCccccccccccccchHHHHHHHHHHHHhhcccccccCCC-CCcccccccCccccCCh
Confidence 7888888654 468999999999999999999999999999999999999988875444343 44568889999999999
Q ss_pred cccCCcccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCCC
Q 026129 182 TYKGNKSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLHD 237 (243)
Q Consensus 182 ~~~Gn~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~~ 237 (243)
.++||.+||+||||.|||+++.|.++|..|+++||+|||.+|||||+||++..+.-
T Consensus 1245 ~R~~n~~RfinhscKPNc~~qkwSVNG~~Rv~L~A~rDi~kGEELtYDYN~ks~~~ 1300 (1306)
T KOG1083|consen 1245 PRMGNGARFINHSCKPNCEMQKWSVNGEYRVGLFALRDLPKGEELTYDYNFKSFNY 1300 (1306)
T ss_pred hhccccccccccccCCCCccccccccceeeeeeeecCCCCCCceEEEeccccccCC
Confidence 99999999999999999999999999999999999999999999999999876653
No 7
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=99.92 E-value=1.8e-26 Score=216.34 Aligned_cols=124 Identities=23% Similarity=0.264 Sum_probs=98.9
Q ss_pred eeecCCCccCCCCCCCcEEcccceeccccccccCCCCCCccccCCCCCCCCCCCC--------CCC-ccc----------
Q 026129 28 VEFELPDWFIKPKAIPYVFIKRNIYLTKRIKRRLEDDGIFCSCTASPGSSGVCDR--------DCH-CGM---------- 88 (243)
Q Consensus 28 ~~~~~p~~~~~~~p~~f~~i~~n~~~~~~~~~~~~~~~~~C~C~~~~~~~~~C~~--------~C~-c~~---------- 88 (243)
+++...++++-.+||.+.|-+..+....++.++.++....|+|..+|..+..|.. .|. +..
T Consensus 685 vpis~~neids~~lpq~ay~K~~ip~~~nl~n~~~~fl~scdc~~gcid~~kcachQltvk~~~t~p~~~v~~t~gykyK 764 (1262)
T KOG1141|consen 685 VPISEKNEIDSHRLPQAAYKKHMIPTNNNLSNRRKDFLQSCDCPTGCIDSMKCACHQLTVKKKTTGPNQNVASTNGYKYK 764 (1262)
T ss_pred cccceeecccCcCCccchhheeeccCCCcccccChhhhhcCCCCcchhhhhhhhHHHHHHHhhccCCCcccccCcchhhH
Confidence 3333345666678889999998877777777788889999999998876554420 111 100
Q ss_pred --------eeecCCCCCCCC-CCCCCcccccCCccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHH
Q 026129 89 --------LLSSCSSGCKCG-NSCLNKPFQNRPVKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQ 151 (243)
Q Consensus 89 --------~~~eC~~~C~c~-~~C~Nr~~q~~~~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~ 151 (243)
-.+||+..|+|. ..|.||++|.+.+.+++++++..+|||++...+|.+|.||+-|.|.++++.
T Consensus 765 Rl~e~~ptg~yEc~k~ckc~~~~C~nrmvqhg~qvRlq~fkt~~kGWg~rclddi~~g~fVciy~g~~l~~~ 836 (1262)
T KOG1141|consen 765 RLIEIRPTGPYECLKACKCCGPDCLNRMVQHGYQVRLQRFKTIHKGWGRRCLDDITGGNFVCIYPGGALLHQ 836 (1262)
T ss_pred HHHHhcCCCHHHHHHhhccCcHHHHHHHhhcCceeEeeeccccccccceEeeeecCCceEEEEecchhhhhh
Confidence 178999999987 479999999999999999999999999999999999999999999876543
No 8
>KOG1085 consensus Predicted methyltransferase (contains a SET domain) [General function prediction only]
Probab=99.87 E-value=4.1e-22 Score=168.85 Aligned_cols=124 Identities=31% Similarity=0.423 Sum_probs=106.6
Q ss_pred CCccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceee--e-eecccceecccccC-C
Q 026129 111 RPVKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYL--C-EINRDMVIDATYKG-N 186 (243)
Q Consensus 111 ~~~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~--~-~~~~~~~iDa~~~G-n 186 (243)
+....+.+....+||.||+|+..+.+|+||.||.|.+|...++..|...|........|+ | .....++|||+..- -
T Consensus 253 g~~egl~~~~~dgKGRGv~a~~~F~rgdFVVEY~Gdliei~eAk~rE~~Ya~De~~GcYMYyF~h~sk~yCiDAT~et~~ 332 (392)
T KOG1085|consen 253 GTNEGLLEVYKDGKGRGVRAKVNFERGDFVVEYRGDLIEISEAKVREEQYANDEEIGCYMYYFEHNSKKYCIDATKETPW 332 (392)
T ss_pred ccccceeEEeeccccceeEeecccccCceEEEEecceeeechHHHHHHHhccCcccceEEEeeeccCeeeeeeccccccc
Confidence 444567777788899999999999999999999999999999999988877665443333 3 34567999998764 4
Q ss_pred cccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCc
Q 026129 187 KSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEF 234 (243)
Q Consensus 187 ~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~ 234 (243)
++|.||||-.+|+....+.+++.++++++|.|||.+||||+||||+..
T Consensus 333 lGRLINHS~~gNl~TKvv~Idg~pHLiLvA~rdIa~GEELlYDYGDRS 380 (392)
T KOG1085|consen 333 LGRLINHSVRGNLKTKVVEIDGSPHLILVARRDIAQGEELLYDYGDRS 380 (392)
T ss_pred chhhhcccccCcceeeEEEecCCceEEEEeccccccchhhhhhccccc
Confidence 699999999999999999999999999999999999999999999754
No 9
>COG2940 Proteins containing SET domain [General function prediction only]
Probab=99.74 E-value=1.6e-18 Score=162.22 Aligned_cols=134 Identities=38% Similarity=0.587 Sum_probs=106.8
Q ss_pred CCCcccccCCccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeec-ccceecc
Q 026129 103 CLNKPFQNRPVKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEIN-RDMVIDA 181 (243)
Q Consensus 103 C~Nr~~q~~~~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~-~~~~iDa 181 (243)
+.|............+..+..+|+|+||.+.|++|++|.+|.|+++...++..+...+...+.. +.++.+. ...++|+
T Consensus 321 ~~~~~~~~~~~~~~~~~~~~~~~~g~fa~~~i~~~e~i~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~d~ 399 (480)
T COG2940 321 LLNSNGCKKRREPNVVQESEIKGYGVFALESIKKGEFIIEYHGEIIRRKEAREREENYDLLGNE-FSFGLLEDKDKVRDS 399 (480)
T ss_pred hhhhcccccccchhhhhhhcccccceeehhhccchHHHHHhcCcccchHHHHhhhccccccccc-cchhhccccchhhhh
Confidence 4444333344455666778899999999999999999999999999999988886665332222 2222222 2788999
Q ss_pred cccCCcccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCCC
Q 026129 182 TYKGNKSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLHD 237 (243)
Q Consensus 182 ~~~Gn~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~~ 237 (243)
...|+.+||+||||.||+........|..++.++|+|||.+||||++||+..++..
T Consensus 400 ~~~g~~~r~~nHS~~pN~~~~~~~~~g~~~~~~~~~rDI~~geEl~~dy~~~~~~~ 455 (480)
T COG2940 400 QKAGDVARFINHSCTPNCEASPIEVNGIFKISIYAIRDIKAGEELTYDYGPSLEDN 455 (480)
T ss_pred hhcccccceeecCCCCCcceecccccccceeeecccccchhhhhhccccccccccc
Confidence 99999999999999999999776666677999999999999999999999998873
No 10
>PF00856 SET: SET domain; InterPro: IPR001214 The SET domain generally appears as one part of a larger multidomain protein, and the structures of three very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although electron density maps revealed the location of the AdoMet or AdoHcy cofactor in all three studies, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure. 
Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity. ; GO: 0005515 protein binding; PDB: 3TG5_A 3S7F_A 3RIB_B 3TG4_A 3S7J_A 3S7D_A 3S7B_A 3H6L_A 3SMT_A 3K5K_A ....
Probab=99.73 E-value=8.7e-18 Score=132.52 Aligned_cols=107 Identities=25% Similarity=0.297 Sum_probs=72.8
Q ss_pred CcEEEecccCCCCceEEEeceeeeCHHHHHHH-------------------H-----------------HHhhh---cCC
Q 026129 125 GAGIVADEDIKRGEFVIEYVGEVIDDQTCEER-------------------L-----------------WKMKH---LGE 165 (243)
Q Consensus 125 G~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r-------------------~-----------------~~~~~---~~~ 165 (243)
|+||||+++|++|++|++..+.+++....... . ..... ...
T Consensus 1 GrGl~At~dI~~Ge~I~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80 (162)
T PF00856_consen 1 GRGLFATRDIKAGEVILIPRPAILTPDEVSPQPELLRLQLSKALEEQSRSDFSIQKKQKAEKSERSPQLESLHSISLRSE 80 (162)
T ss_dssp SEEEEESS-B-TTEEEEEESEEEEEHHHHHCHHHHSHHTTCSSSCSHHTTHHHHHHHHHHHHHHHHHHHHHHHHHCHTTT
T ss_pred CEEEEECccCCCCCEEEEECcceEEehhhhhcccchhhhhhhhhcccccccccccccccccccccccccccccccccccc
Confidence 89999999999999999888988887665331 0 00000 000
Q ss_pred ----------------cceeeeeecccceecccccCCcccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEe
Q 026129 166 ----------------TNFYLCEINRDMVIDATYKGNKSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYD 229 (243)
Q Consensus 166 ----------------~~~y~~~~~~~~~iDa~~~Gn~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~d 229 (243)
..............++.....++.|+||||.|||.+..........+.|+|.|||++|||||++
T Consensus 81 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~p~~d~~NHsc~pn~~~~~~~~~~~~~~~~~a~r~I~~GeEi~is 160 (162)
T PF00856_consen 81 LQFSQAFQWSWFISWTRSDFSSRSFSEDDRDGIALYPFADMLNHSCDPNCEVSFDFDGDGGCLVVRATRDIKKGEEIFIS 160 (162)
T ss_dssp CCTCCHHHHHHHHHHHHHEEEEEEETTEEEEEEEEETGGGGSEEESSTSEEEEEEEETTTTEEEEEESS-B-TTSBEEEE
T ss_pred ccccccccchhhccccceeeeccccccccccccccCcHhHheccccccccceeeEeecccceEEEEECCccCCCCEEEEE
Confidence 0001111112234455666788999999999999998766667789999999999999999999
Q ss_pred cC
Q 026129 230 YQ 231 (243)
Q Consensus 230 Y~ 231 (243)
||
T Consensus 161 YG 162 (162)
T PF00856_consen 161 YG 162 (162)
T ss_dssp ST
T ss_pred EC
Confidence 98
No 11
>KOG2589 consensus Histone tail methylase [Chromatin structure and dynamics]
Probab=99.39 E-value=2.8e-13 Score=118.78 Aligned_cols=115 Identities=26% Similarity=0.355 Sum_probs=84.6
Q ss_pred cCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeecccceecccccCCcccccCCCCCCCcee
Q 026129 122 EKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEINRDMVIDATYKGNKSRYINHSCCPNTEM 201 (243)
Q Consensus 122 ~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~~~~~iDa~~~Gn~~RfiNHSC~PN~~~ 201 (243)
...|--|++++.+.+|+-|...+|-|..-.+++++... ..+..+|-.|.....- -+..+-..++||||.|.|||+|
T Consensus 135 ~~~gAkivst~~w~~ndkIe~LvGcIaeLse~eE~~ll--~~g~nDFSvmyStRk~--caqLwLGPaafINHDCrpnCkF 210 (453)
T KOG2589|consen 135 SQNGAKIVSTKSWSRNDKIELLVGCIAELSEAEERSLL--RGGGNDFSVMYSTRKR--CAQLWLGPAAFINHDCRPNCKF 210 (453)
T ss_pred cCCCceEEeeccccCCccHHHhhhhhhhcChhhhHHHH--hccCCceeeeeecccc--hhhheeccHHhhcCCCCCCcee
Confidence 34588899999999999999999998877777766321 2222333222221110 1222346789999999999988
Q ss_pred EEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCCCC-CcccC
Q 026129 202 QKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLHDS-LIAYC 243 (243)
Q Consensus 202 ~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~~~-~~C~C 243 (243)
. ..|..++.|.++|||+||||||--||.+||+.. ..|.|
T Consensus 211 v---s~g~~tacvkvlRDIePGeEITcFYgs~fFG~~N~~CeC 250 (453)
T KOG2589|consen 211 V---STGRDTACVKVLRDIEPGEEITCFYGSGFFGENNEECEC 250 (453)
T ss_pred e---cCCCceeeeehhhcCCCCceeEEeecccccCCCCceeEE
Confidence 4 345689999999999999999999999999985 56776
No 12
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=99.31 E-value=4e-13 Score=124.55 Aligned_cols=141 Identities=40% Similarity=0.732 Sum_probs=114.7
Q ss_pred cceeecCC-CCCCCCCCCCCcccccCCccceEEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCC
Q 026129 87 GMLLSSCS-SGCKCGNSCLNKPFQNRPVKKMKLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGE 165 (243)
Q Consensus 87 ~~~~~eC~-~~C~c~~~C~Nr~~q~~~~~~l~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~ 165 (243)
....++|. ..|.+...|.|+.+....... +.+ +|..+|.+| +|+++...+...++........
T Consensus 286 ~~~~~~~~p~~~~~~~~~~~~~~sk~~~~e------~~~----~~~~~~~k~------vg~~i~~~e~~~~~~~~~~~~~ 349 (463)
T KOG1081|consen 286 KMLAYEVHPKVCSAEERCHNQQFSKESYPE------PQK----TAKADIRKG------VGEVIDDKECKARLQRVKESDL 349 (463)
T ss_pred Hhhhhhhcccccccccccccchhhhhcccc------cch----hhHHhhhcc------cCcccchhhheeehhhhhccch
Confidence 44566665 579998999998775443333 222 888899988 8999999998888777666666
Q ss_pred cceeeeeecccceecccccCCcccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCCCCCcccC
Q 026129 166 TNFYLCEINRDMVIDATYKGNKSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLHDSLIAYC 243 (243)
Q Consensus 166 ~~~y~~~~~~~~~iDa~~~Gn~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~~~~~C~C 243 (243)
..+|+..+..+..||+...||.+||+||||.||+.-+.|.+.+..++.++|.+.|++|||||++|...-......|.|
T Consensus 350 ~~~~~~~~e~~~~id~~~~~n~sr~~nh~~~~~v~~~k~~~~~~t~~~~~a~~~i~~g~e~t~~~n~~~~~~~~~~~~ 427 (463)
T KOG1081|consen 350 VDFYMVFIQKDRIIDAGPKGNYSRFLNHSCQPNVETEKWQVIGDTRVGLFAPRQIEAGEELTFNYNGNCEGNEKRCCC 427 (463)
T ss_pred hhhhhhhhhcccccccccccchhhhhcccCCCceeechhheecccccccccccccccchhhhheeeccccCCcceEee
Confidence 666655555555999999999999999999999999999999999999999999999999999999987777666554
No 13
>smart00570 AWS associated with SET domains. subdomain of PRESET
Probab=99.11 E-value=3.1e-11 Score=77.81 Aligned_cols=49 Identities=43% Similarity=0.944 Sum_probs=43.4
Q ss_pred CCCccccCCCCCCCCCCCCCCCccceeecCCCCCCCCCCCCCcccccCC
Q 026129 64 DGIFCSCTASPGSSGVCDRDCHCGMLLSSCSSGCKCGNSCLNKPFQNRP 112 (243)
Q Consensus 64 ~~~~C~C~~~~~~~~~C~~~C~c~~~~~eC~~~C~c~~~C~Nr~~q~~~ 112 (243)
+...|+|++...+...|+++|+|+++++||+..|+|+..|+||.||++.
T Consensus 2 e~~~C~C~~~~~~~~~CgsdClNR~l~~EC~~~C~~G~~C~NqrFqk~~ 50 (51)
T smart00570 2 DIMTCECKPTDDDEGACGSDCLNRMLLIECSSDCPCGSYCSNQRFQKRQ 50 (51)
T ss_pred CCceeeCccCCCCCCCcchHHHHHHHhhhcCCCCCCCcCccCcccccCc
Confidence 4567999987655678999999999999999999999999999999864
No 14
>KOG2461 consensus Transcription factor BLIMP-1/PRDI-BF1, contains C2H2-type Zn-finger and SET domains [Transcription]
Probab=98.89 E-value=2e-09 Score=97.98 Aligned_cols=112 Identities=24% Similarity=0.259 Sum_probs=84.4
Q ss_pred CccceEEEEe--cCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhhhcCCcceeeeeec----ccceeccc--c
Q 026129 112 PVKKMKLVQT--EKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMKHLGETNFYLCEIN----RDMVIDAT--Y 183 (243)
Q Consensus 112 ~~~~l~v~~s--~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~~~~~~~~y~~~~~----~~~~iDa~--~ 183 (243)
....+.|..+ +..|.||++...|++|+-.+-|.|+++.... .+...+.|++.+- .-.+||++ .
T Consensus 26 LP~~l~i~~Ssv~~~~lgV~s~~~i~~G~~FGP~~G~~~~~~~---------~~~~n~~y~W~I~~~d~~~~~iDg~d~~ 96 (396)
T KOG2461|consen 26 LPPELRIKPSSVPVTGLGVWSNASILPGTSFGPFEGEIIASID---------SKSANNRYMWEIFSSDNGYEYIDGTDEE 96 (396)
T ss_pred CCCceEeeccccCCccccccccccccCcccccCccCccccccc---------cccccCcceEEEEeCCCceEEeccCChh
Confidence 4557888886 6678999999999999999999999822211 2223344555543 23789986 4
Q ss_pred cCCcccccCCCCCC---CceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCCCcCC
Q 026129 184 KGNKSRYINHSCCP---NTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQYEFLH 236 (243)
Q Consensus 184 ~Gn~~RfiNHSC~P---N~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~~~~~ 236 (243)
..|++||+|-+++. |+..- .....|.++|+|+|++||||.++|+.+|-.
T Consensus 97 ~sNWmRYV~~Ar~~eeQNL~A~----Q~~~~Ifyrt~r~I~p~eELlVWY~~e~~~ 148 (396)
T KOG2461|consen 97 HSNWMRYVNSARSEEEQNLLAF----QIGENIFYRTIRDIRPNEELLVWYGSEYAE 148 (396)
T ss_pred hcceeeeecccCChhhhhHHHH----hccCceEEEecccCCCCCeEEEEeccchHh
Confidence 68999999988864 77552 234579999999999999999999998853
No 15
>PF05033 Pre-SET: Pre-SET motif; InterPro: IPR007728 This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. 
The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity []. ; GO: 0008270 zinc ion binding, 0018024 histone-lysine N-methyltransferase activity, 0034968 histone lysine methylation, 0005634 nucleus; PDB: 3K5K_A 2O8J_D 3RJW_B 1ML9_A 1PEG_B 1MVH_A 1MVX_A 3BO5_A 2RFI_B 3MO5_B ....
Probab=98.88 E-value=8e-10 Score=82.44 Aligned_cols=84 Identities=17% Similarity=0.233 Sum_probs=44.0
Q ss_pred HHhCCCeeecCCCccCCC-CCCCcEEcccceeccccccccCCCCCCccccCCCCCCCCCC--CCCC--------------
Q 026129 22 KQIGNPVEFELPDWFIKP-KAIPYVFIKRNIYLTKRIKRRLEDDGIFCSCTASPGSSGVC--DRDC-------------- 84 (243)
Q Consensus 22 ~~~~~~~~~~~p~~~~~~-~p~~f~~i~~n~~~~~~~~~~~~~~~~~C~C~~~~~~~~~C--~~~C-------------- 84 (243)
+...+..++++-+.++.. .|+.|+||.++++..... ........+|+|...|.....| ...-
T Consensus 3 s~g~e~~pI~~~N~vd~~~~p~~F~Yi~~~~~~~~~~-~~~~~~~~~C~C~~~C~~~~~C~C~~~~~~~~~Y~~~g~l~~ 81 (103)
T PF05033_consen 3 SRGKENVPIPVVNDVDDEPPPPNFEYIPENIYGEGVP-DIDPEFLQGCDCSGDCSNPSNCECLQRNGGIFAYDSNGRLRI 81 (103)
T ss_dssp TCTSSSS-EEEEESSSS--SSTSSEE-SS-EESTTSS--TBGGGTS----SSSSTCTTTSHHHCCTSSS-SB-TTSSBSS
T ss_pred CCCccCCCEEEEeCCCCCCCCCCeEEeeeEEcCCCcc-ccccccCccCccCCCCCCCCCCcCccccCccccccCCCcCcc
Confidence 334455555555666654 458999999999877544 3445566799998776333333 1110
Q ss_pred CccceeecCCCCCCCCCCCCCc
Q 026129 85 HCGMLLSSCSSGCKCGNSCLNK 106 (243)
Q Consensus 85 ~c~~~~~eC~~~C~c~~~C~Nr 106 (243)
.-..+++||++.|.|+..|.||
T Consensus 82 ~~~~~i~EC~~~C~C~~~C~NR 103 (103)
T PF05033_consen 82 PDKPPIFECNDNCGCSPSCRNR 103 (103)
T ss_dssp SSTSEEE---TTSSS-TTSTT-
T ss_pred CCCCeEEeCCCCCCCCCCCCCC
Confidence 1233589999999999999997
No 16
>smart00468 PreSET N-terminal to some SET domains. A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.
Probab=98.38 E-value=4e-07 Score=67.31 Aligned_cols=58 Identities=19% Similarity=0.147 Sum_probs=41.4
Q ss_pred HHHHhCCCeeecCCCccCCC-CCCCcEEcccceeccccccccCCCCCCccccCCCCCCC
Q 026129 20 LLKQIGNPVEFELPDWFIKP-KAIPYVFIKRNIYLTKRIKRRLEDDGIFCSCTASPGSS 77 (243)
Q Consensus 20 ~~~~~~~~~~~~~p~~~~~~-~p~~f~~i~~n~~~~~~~~~~~~~~~~~C~C~~~~~~~ 77 (243)
+++...+.+++++-++++.. +|+.|+||.++++..............+|+|...|...
T Consensus 3 Dis~G~E~~pI~~vN~vD~~~~p~~F~Yi~~~~~~~gv~~~~~~~~~~gC~C~~~C~~~ 61 (98)
T smart00468 3 DISNGKENVPVPLVNEVDEDPPPPDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSSS 61 (98)
T ss_pred cccCCccCCCcceEecCCCCCCCCCcEECcceEcCCCcccccCCCCCCCCcCCCCCCCC
Confidence 45555677777777777754 45899999999986654333456678899999876543
No 17
>KOG1081 consensus Transcription factor NSD1 and related SET domain proteins [Transcription]
Probab=98.02 E-value=3.2e-06 Score=78.85 Aligned_cols=234 Identities=24% Similarity=0.174 Sum_probs=156.2
Q ss_pred CCccccCcchhhHHHHHHHHHHHhCCCeeecCCCccCCCCCCCcEEcccceeccccccccCCCCCCccccCCC-CC-CCC
Q 026129 1 MPAAKKNSDNSRIGHAFNKLLKQIGNPVEFELPDWFIKPKAIPYVFIKRNIYLTKRIKRRLEDDGIFCSCTAS-PG-SSG 78 (243)
Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~p~~~~~~~p~~f~~i~~n~~~~~~~~~~~~~~~~~C~C~~~-~~-~~~ 78 (243)
|+...|+++.+.+.+.+-++++...+.....-|....+..+ ..|++++..+..+...+.....+|++..+ .. .+.
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~~~~~~n---~~i~~~v~~~~~~~~~~~~~~~g~~~~~s~p~~~~~ 77 (463)
T KOG1081|consen 1 MSKFKKHSDRNQIPQHDLKCPSHNQESCSLETPPGSAPLGN---LKITRTVRLGKDLFESDACGGIGGSVSASEPNHVSP 77 (463)
T ss_pred CCcccccccccccchhhcccccccccccccCCCccccccCC---ceeeeeeecCcChhhcccccccccccccCCccccCC
Confidence 67889999999999999999999888888888877766665 77888888777766666777888998887 32 234
Q ss_pred CCCCCCCccceeecCCCCCCCCCCCCCcccccCCccceEEEEecCCCcE---EEecccCCCCceEEEeceeeeCHH--HH
Q 026129 79 VCDRDCHCGMLLSSCSSGCKCGNSCLNKPFQNRPVKKMKLVQTEKCGAG---IVADEDIKRGEFVIEYVGEVIDDQ--TC 153 (243)
Q Consensus 79 ~C~~~C~c~~~~~eC~~~C~c~~~C~Nr~~q~~~~~~l~v~~s~~kG~G---v~A~~~I~~G~~I~ey~Gevi~~~--~~ 153 (243)
.|+..+.+....-+|..-+.++..+.+...+......-.-+..+..+++ ..|.+.+..|++|+.++|+..-.. ..
T Consensus 78 ~~~~~~~~~~~~~~c~vc~~ggs~v~~~s~~~~~~r~c~~~~~~~c~~~~~d~~~~~~~~~~~~vw~~vg~~~~~~c~vc 157 (463)
T KOG1081|consen 78 EPGSRRHPKIEPSECFVCFKGGSLVTCKSRIQAPHRKCKPAQLEKCSKRCTDCRAFKKREVGDLVWSKVGEYPWWPCMVC 157 (463)
T ss_pred CCCchhccCCCcchhccccCCCccceeccccccccccCcCccCcccccCCcceeeeccccceeEEeEEcCccccccccee
Confidence 6777788887777776655555433333222222212222334455555 888889999999999999976544 11
Q ss_pred HHHHHHhhhcC-CcceeeeeecccceecccccCCcccccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCe------E
Q 026129 154 EERLWKMKHLG-ETNFYLCEINRDMVIDATYKGNKSRYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGEN------L 226 (243)
Q Consensus 154 ~~r~~~~~~~~-~~~~y~~~~~~~~~iDa~~~Gn~~RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEE------L 226 (243)
........... ...+|....-.....++..+|+..++++|++.|+-.+..+......++..++.+.++.+.- .
T Consensus 158 ~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~~g~~~~~l~~~~~~~s~~~~~~~~~~~r~~~~~~q~~~~~~~~e~k~~~ 237 (463)
T KOG1081|consen 158 HDPLLPKGMKHDHVNFFGCYAWTHEKRVFPYEGQSSKLIPHSKKPASTMSEKIKEAKARFGKLKAQWEAGIKQKELKPEE 237 (463)
T ss_pred cCcccchhhccccceeccchhhHHHhhhhhccchHHHhhhhccccchhhhhhhhcccchhhhcccchhhccchhhccccc
Confidence 11211111111 1222222111123344445899999999999999999888888888998898888887766 5
Q ss_pred EEecCCCcCCC
Q 026129 227 TYDYQYEFLHD 237 (243)
Q Consensus 227 t~dY~~~~~~~ 237 (243)
+-+|...-+..
T Consensus 238 ~~~~~~~~~~~ 248 (463)
T KOG1081|consen 238 YKRIKVVCPIG 248 (463)
T ss_pred ccccccccCcC
Confidence 55555444433
No 18
>KOG1141 consensus Predicted histone methyl transferase [Chromatin structure and dynamics]
Probab=97.46 E-value=0.00032 Score=68.13 Aligned_cols=154 Identities=31% Similarity=0.485 Sum_probs=118.6
Q ss_pred eecCCCCCCCCCCCCCcccccCCccce--------EEEEecCCCcEEEecccCCCCceEEEeceeeeCHHHHHHHHHHhh
Q 026129 90 LSSCSSGCKCGNSCLNKPFQNRPVKKM--------KLVQTEKCGAGIVADEDIKRGEFVIEYVGEVIDDQTCEERLWKMK 161 (243)
Q Consensus 90 ~~eC~~~C~c~~~C~Nr~~q~~~~~~l--------~v~~s~~kG~Gv~A~~~I~~G~~I~ey~Gevi~~~~~~~r~~~~~ 161 (243)
++||+..|.|...|.|+++|++...+. .|+++...|||+.+..+|+.-+||++|+|...+..-+.+......
T Consensus 981 f~e~~~hss~~~~e~~~~v~~~~~~~me~~s~~~l~i~~~~~~~~~~~edtD~~~~~~~~~~~~~ppt~~l~~~~r~aqa 1060 (1262)
T KOG1141|consen 981 FFECNDHSSCHRKEYNRVVQNNIKYPMEVSSFNDLQIFKTAQSGWGVREDTDIPQSTFICTYVGAPPTDDLADELRNAQA 1060 (1262)
T ss_pred ceeccccchhcccccchhhhcCCccceeeeecccccccccccccccccccccCCCCcccccccCCCCchhhHHHHhhhhh
Confidence 789999999999999999998776554 556777889999999999999999999999877655433211100
Q ss_pred hcC-------------------------Ccceeeeee-----c-------------------------------------
Q 026129 162 HLG-------------------------ETNFYLCEI-----N------------------------------------- 174 (243)
Q Consensus 162 ~~~-------------------------~~~~y~~~~-----~------------------------------------- 174 (243)
+.. ....|.-.- +
T Consensus 1061 d~~sn~~D~~~~~~l~es~~~~~T~~r~~t~~~~~~~~~d~dd~q~I~k~ve~qd~~~~~~~T~~~~RQ~~~~s~k~~~~ 1140 (1262)
T KOG1141|consen 1061 DQYSNDLDLKDTVELEESREDHETDFRGDTSDYDDEEGSDGDDGQDIMKMVERQDSSESGEETKRLTRQKRKQSKKSGKG 1140 (1262)
T ss_pred ccccCccchhhhhhhhhcccccccccCCCCCCCcccccccCccHHHHHHHhhcccccccccccchhhhhhhhhhhhcccC
Confidence 000 000010000 0
Q ss_pred ------------------------------------ccceecccccCCcccccCCCCCCCceeEEEEECCe----EEEEE
Q 026129 175 ------------------------------------RDMVIDATYKGNKSRYINHSCCPNTEMQKWIIDGE----TRIGI 214 (243)
Q Consensus 175 ------------------------------------~~~~iDa~~~Gn~~RfiNHSC~PN~~~~~~~~~~~----~~i~i 214 (243)
.-++|||+..||++||+||||.||+.++.++++.+ +.++|
T Consensus 1141 ~s~~~~~~ts~~~~~~dkges~~~~~~~~~~y~~~~~~yvIDAk~eGNlGRfLNHSC~PNl~VQnVfvdTHdlrfPwVAF 1220 (1262)
T KOG1141|consen 1141 GSVEKDDTTSRDSMEKDKGESKDEPVFNWDKYFEPFPLYVIDAKQEGNLGRFLNHSCDPNLHVQNVFVDTHDLRFPWVAF 1220 (1262)
T ss_pred ccccccccCccchhhhccCccCcccccchhhccCCCceEEEecccccchhhhhccCCCccceeeeeeeeccccCCchhhh
Confidence 01789999999999999999999999999998764 57999
Q ss_pred EEcCCCCCCCeEEEecCCCcCCC---CCcccC
Q 026129 215 FATRDIKKGENLTYDYQYEFLHD---SLIAYC 243 (243)
Q Consensus 215 ~A~rdI~~GEELt~dY~~~~~~~---~~~C~C 243 (243)
||.|-|++|+|||+||++....- ...|+|
T Consensus 1221 Ft~kyVkAgtELTWDY~Ye~g~v~~keL~C~C 1252 (1262)
T KOG1141|consen 1221 FTRKYVKAGTELTWDYQYEQGQVATKELTCHC 1252 (1262)
T ss_pred hhhhhhccCceeeeeccccccccccceEEEec
Confidence 99999999999999999987653 467887
No 19
>KOG1337 consensus N-methyltransferase [General function prediction only]
Probab=91.61 E-value=0.14 Score=48.30 Aligned_cols=41 Identities=22% Similarity=0.279 Sum_probs=31.1
Q ss_pred cccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecCC
Q 026129 189 RYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQY 232 (243)
Q Consensus 189 RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~~ 232 (243)
-+.||++++... .+......+.+++.++|.+||||+++||.
T Consensus 238 D~~NH~~~~~~~---~~~~~d~~~~l~~~~~v~~geevfi~YG~ 278 (472)
T KOG1337|consen 238 DLLNHSPEVIKA---GYNQEDEAVELVAERDVSAGEEVFINYGP 278 (472)
T ss_pred HhhccCchhccc---cccCCCCcEEEEEeeeecCCCeEEEecCC
Confidence 468999999221 12222338889999999999999999996
No 20
>KOG2084 consensus Predicted histone tail methylase containing SET domain [Chromatin structure and dynamics]
Probab=91.14 E-value=0.39 Score=44.47 Aligned_cols=44 Identities=36% Similarity=0.690 Sum_probs=32.6
Q ss_pred ccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCC-eEEEecCCCcCCC
Q 026129 190 YINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGE-NLTYDYQYEFLHD 237 (243)
Q Consensus 190 fiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GE-ELt~dY~~~~~~~ 237 (243)
++||||.||+.. ..++ ....+.+...+.+++ ||++.|-...|..
T Consensus 208 ~~~hsC~pn~~~---~~~~-~~~~~~~~~~~~~~~~~l~~~y~~~~~~~ 252 (482)
T KOG2084|consen 208 LFNHSCFPNISV---IFDG-RGLALLVPAGIDAGEEELTISYTDPLLST 252 (482)
T ss_pred hcccCCCCCeEE---EECC-ceeEEEeecccCCCCCEEEEeecccccCH
Confidence 789999999973 2333 344556667777776 9999999888864
No 21
>PF03638 TCR: Tesmin/TSO1-like CXC domain, cysteine-rich domain; InterPro: IPR005172 This entry includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 from SWISSPROT [] and TSO1 Q9LE32 from SWISSPROT []. This group of proteins is called a CXC domain in [].
Probab=78.08 E-value=1.6 Score=26.95 Aligned_cols=37 Identities=32% Similarity=0.731 Sum_probs=29.3
Q ss_pred CCccccCCCCCCCCCCC-CCCCccceeecCCCCCCCCCCCCCcc
Q 026129 65 GIFCSCTASPGSSGVCD-RDCHCGMLLSSCSSGCKCGNSCLNKP 107 (243)
Q Consensus 65 ~~~C~C~~~~~~~~~C~-~~C~c~~~~~eC~~~C~c~~~C~Nr~ 107 (243)
..+|.|..+ .|- ..|.|.+....|++.|.| ..|.|+.
T Consensus 3 ~~gC~Ckks-----~Clk~YC~Cf~~g~~C~~~C~C-~~C~N~~ 40 (42)
T PF03638_consen 3 KKGCNCKKS-----KCLKLYCECFQAGRFCTPNCKC-QNCKNTE 40 (42)
T ss_pred CCCCcccCc-----ChhhhhCHHHHCcCcCCCCccc-CCCCCcC
Confidence 457888863 464 369999999999999999 7888863
No 22
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=76.43 E-value=1.8 Score=39.61 Aligned_cols=39 Identities=21% Similarity=0.415 Sum_probs=31.1
Q ss_pred cccccCCC---CCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEecC
Q 026129 187 KSRYINHS---CCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYDYQ 231 (243)
Q Consensus 187 ~~RfiNHS---C~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~dY~ 231 (243)
.+-|+||. |..|..+ +...+-++|.|+|++|+|+.-.||
T Consensus 218 ~ad~lNhd~~k~nanl~y------~~NcL~mva~r~iekgdev~n~dg 259 (466)
T KOG1338|consen 218 IADFLNHDGLKANANLRY------EDNCLEMVADRNIEKGDEVDNSDG 259 (466)
T ss_pred hhhhhccchhhcccceec------cCcceeeeecCCCCCccccccccc
Confidence 47799995 5666655 245677899999999999999997
No 23
>PF08666 SAF: SAF domain; InterPro: IPR013974 This entry includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. ; PDB: 1C89_A 3NLA_A 3RDN_A 1C8A_A 3FRN_A 1WVO_A 3K3S_H 3G8R_B 1XUU_A 1XUZ_A ....
Probab=76.07 E-value=1.6 Score=28.59 Aligned_cols=15 Identities=40% Similarity=0.554 Sum_probs=11.4
Q ss_pred EEEEcCCCCCCCeEE
Q 026129 213 GIFATRDIKKGENLT 227 (243)
Q Consensus 213 ~i~A~rdI~~GEELt 227 (243)
+++|.|||++|+.|+
T Consensus 3 vvVA~~di~~G~~i~ 17 (63)
T PF08666_consen 3 VVVAARDIPAGTVIT 17 (63)
T ss_dssp EEEESSTB-TT-BEC
T ss_pred EEEEeCccCCCCEEc
Confidence 478999999999985
No 24
>KOG2155 consensus Tubulin-tyrosine ligase-related protein [Posttranslational modification, protein turnover, chaperones]
Probab=57.95 E-value=5.6 Score=37.07 Aligned_cols=50 Identities=18% Similarity=0.321 Sum_probs=36.6
Q ss_pred CcccccCCCCCCCceeEEEEE-CC-eEEEEEEEcCCCCCCCeEEEecCCCcC
Q 026129 186 NKSRYINHSCCPNTEMQKWII-DG-ETRIGIFATRDIKKGENLTYDYQYEFL 235 (243)
Q Consensus 186 n~~RfiNHSC~PN~~~~~~~~-~~-~~~i~i~A~rdI~~GEELt~dY~~~~~ 235 (243)
.++.-+.||-.||..+..... .. -..-.++-+|+...|||+|-|+-...-
T Consensus 203 efGsrvrHsdePnf~~aPf~fmPq~vaYsimwp~k~~~tgeE~trDfasg~~ 254 (631)
T KOG2155|consen 203 EFGSRVRHSDEPNFRIAPFMFMPQNVAYSIMWPTKPVNTGEEITRDFASGVI 254 (631)
T ss_pred hhhhhhccCCCCcceeeeheecchhcceeEEeeccCCCCchHHHHHHhhcCC
Confidence 345568999999998865543 21 234567899999999999998865443
No 25
>PF02067 Metallothio_5: Metallothionein family 5; InterPro: IPR000966 Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, and nickel. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds [, , ] species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units. This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Consequently, all class I and class II MTs (the proteinaceous sequences) have now been grouped into families of phylogenetically-related and thus alignable sequences. Diptera (Drosophila, family 5) MTs are 40-43 residue proteins that contain 10 conserved cysteines arranged in five Cys-X-Cys groups. In particular, the consensus pattern C-G-x(2)-C-x-C-x(2)-Q-x(5)-C-x-C-x(2)-D-C-x-C has been found to be diagnostic of family 5 MTs. The protein is found primarily in the alimentary canal, and its induction is stimulated by ingestion of cadmium or copper []. Mercury, silver and zinc induce the protein to a lesser extent. Family 5 includes subfamilies: d1, d2. Only one d2 is known until now. Subfamilies hit the same entry.; GO: 0046872 metal ion binding
Probab=56.52 E-value=9.3 Score=23.30 Aligned_cols=22 Identities=32% Similarity=1.224 Sum_probs=10.2
Q ss_pred CCCCCCccceeecCCCCCCCCCCC
Q 026129 80 CDRDCHCGMLLSSCSSGCKCGNSC 103 (243)
Q Consensus 80 C~~~C~c~~~~~eC~~~C~c~~~C 103 (243)
|+.+|.|...- |+.+|.|+.+|
T Consensus 6 Cg~~CkC~~~k--cg~~C~C~~dC 27 (41)
T PF02067_consen 6 CGTNCKCSSQK--CGGNCACNQDC 27 (41)
T ss_pred cCCCCEecCCc--cCCCccCCCCc
Confidence 44455444332 45555555443
No 26
>smart00858 SAF This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins.
Probab=53.33 E-value=9.1 Score=24.84 Aligned_cols=16 Identities=38% Similarity=0.526 Sum_probs=13.8
Q ss_pred EEEEcCCCCCCCeEEE
Q 026129 213 GIFATRDIKKGENLTY 228 (243)
Q Consensus 213 ~i~A~rdI~~GEELt~ 228 (243)
.++|.++|.+|+.|+-
T Consensus 3 v~va~~~i~~G~~i~~ 18 (64)
T smart00858 3 VVVAARDLPAGEVITA 18 (64)
T ss_pred EEEEeCccCCCCCcch
Confidence 4688999999999884
No 27
>KOG1079 consensus Transcriptional repressor EZH1 [Transcription]
Probab=48.71 E-value=7.8 Score=37.84 Aligned_cols=28 Identities=29% Similarity=0.673 Sum_probs=16.7
Q ss_pred CCCCCCccceeecCCCCCCCCCCCCCcc
Q 026129 80 CDRDCHCGMLLSSCSSGCKCGNSCLNKP 107 (243)
Q Consensus 80 C~~~C~c~~~~~eC~~~C~c~~~C~Nr~ 107 (243)
|+.+|+|..--..|...|.|...|+||.
T Consensus 512 c~~~C~C~~n~~~CEk~C~C~~dC~nrF 539 (739)
T KOG1079|consen 512 CGVGCPCIDNETFCEKFCYCSPDCRNRF 539 (739)
T ss_pred CCCCCcccccCcchhhcccCCHHHHhcC
Confidence 5666777666555555555555555553
No 28
>smart00317 SET SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain. Putative methyl transferase, based on outlier plant homologues
Probab=46.51 E-value=38 Score=24.15 Aligned_cols=16 Identities=44% Similarity=0.362 Sum_probs=14.2
Q ss_pred cEEEecccCCCCceEE
Q 026129 126 AGIVADEDIKRGEFVI 141 (243)
Q Consensus 126 ~Gv~A~~~I~~G~~I~ 141 (243)
..++|+++|++|+=|.
T Consensus 98 ~~~~a~r~I~~GeEi~ 113 (116)
T smart00317 98 IVIFALRDIKPGEELT 113 (116)
T ss_pred EEEEECCCcCCCCEEe
Confidence 7899999999999774
No 29
>KOG1171 consensus Metallothionein-like protein [Inorganic ion transport and metabolism]
Probab=42.81 E-value=7.8 Score=35.73 Aligned_cols=37 Identities=38% Similarity=0.793 Sum_probs=30.8
Q ss_pred CCCCccccCCCCCCCCCCCC-CCCccceeecCCCCCCCCCCCCC
Q 026129 63 DDGIFCSCTASPGSSGVCDR-DCHCGMLLSSCSSGCKCGNSCLN 105 (243)
Q Consensus 63 ~~~~~C~C~~~~~~~~~C~~-~C~c~~~~~eC~~~C~c~~~C~N 105 (243)
.+..+|+|... .|-. .|.|.+..+-|+.+|+| ..|.|
T Consensus 215 ~hkkGC~CkkS-----gClKkYCECyQa~vlCS~nCkC-~~CkN 252 (406)
T KOG1171|consen 215 RHKKGCNCKKS-----GCLKKYCECYQAGVLCSSNCKC-QGCKN 252 (406)
T ss_pred hhcCCCCCccc-----cchHHHHHHHhcCCCccccccC-cCCcc
Confidence 46678999973 5744 59999999999999999 68888
No 30
>PF08487 VIT: Vault protein inter-alpha-trypsin domain; InterPro: IPR013694 Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition []. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N terminus of a von Willebrand factor type A domain (IPR002035 from INTERPRO) in ITI heavy chains (ITIHs) and their precursors.
Probab=40.05 E-value=1.1e+02 Score=22.69 Aligned_cols=34 Identities=12% Similarity=0.114 Sum_probs=22.1
Q ss_pred ecccCCCCceEE---------EeceeeeCHHHHHHHHHHhhhc
Q 026129 130 ADEDIKRGEFVI---------EYVGEVIDDQTCEERLWKMKHL 163 (243)
Q Consensus 130 A~~~I~~G~~I~---------ey~Gevi~~~~~~~r~~~~~~~ 163 (243)
-.-+||.|..|. .+.|++...+++..........
T Consensus 39 y~fpLp~~A~i~~f~~~i~g~~i~g~v~ek~~A~~~y~~a~~~ 81 (118)
T PF08487_consen 39 YSFPLPEGAAISGFSMWIGGRTIEGEVKEKEEAKQEYEEAVAQ 81 (118)
T ss_pred EEeECCCCeEEEEEEEEECCEEEEEEEecHHHHHHHHHHHHHc
Confidence 344677777775 3578888888877765544443
No 31
>PF14100 PmoA: Methane oxygenase PmoA
Probab=39.16 E-value=44 Score=29.11 Aligned_cols=47 Identities=23% Similarity=0.400 Sum_probs=30.9
Q ss_pred cccCCCCCCCceeEEEEECCeEEEEE------EEcCCCCCCCeEEEecCCCcCC
Q 026129 189 RYINHSCCPNTEMQKWIIDGETRIGI------FATRDIKKGENLTYDYQYEFLH 236 (243)
Q Consensus 189 RfiNHSC~PN~~~~~~~~~~~~~i~i------~A~rdI~~GEELt~dY~~~~~~ 236 (243)
-|++|.-+||-- ..|.+.+...+++ ..-..|++||.|++.|..-..+
T Consensus 204 ~~~dhP~N~~~P-~~W~vR~~g~~~~~p~~~~~~~~~l~~G~~l~~rYr~~v~d 256 (271)
T PF14100_consen 204 AILDHPSNPNYP-TPWHVRGYGLFGANPAPAFDGPLTLPPGETLTLRYRVVVHD 256 (271)
T ss_pred EEEeCCCCCCCC-cceEEeccCcceecccccccCceecCCCCeEEEEEEEEEeC
Confidence 478888887664 3455554433333 3445799999999999754443
No 32
>COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) [Translation, ribosomal structure and biogenesis]
Probab=34.84 E-value=50 Score=24.39 Aligned_cols=21 Identities=19% Similarity=0.358 Sum_probs=18.6
Q ss_pred EEcCCCCCCCeEEEecCCCcC
Q 026129 215 FATRDIKKGENLTYDYQYEFL 235 (243)
Q Consensus 215 ~A~rdI~~GEELt~dY~~~~~ 235 (243)
-+.++++.|++|++.|+...+
T Consensus 44 KpS~~VK~GD~l~i~~~~~~~ 64 (100)
T COG1188 44 KPSKEVKVGDILTIRFGNKEF 64 (100)
T ss_pred ccccccCCCCEEEEEeCCcEE
Confidence 788999999999999997664
No 33
>KOG4454 consensus RNA binding protein (RRM superfamily) [General function prediction only]
Probab=33.49 E-value=71 Score=27.19 Aligned_cols=45 Identities=18% Similarity=0.285 Sum_probs=30.4
Q ss_pred CCccccCcchhhHH-----HHHHHHHHHh----CCCeeecCCCccCCCCCCCcEEc
Q 026129 1 MPAAKKNSDNSRIG-----HAFNKLLKQI----GNPVEFELPDWFIKPKAIPYVFI 47 (243)
Q Consensus 1 ~~~~~~~~~~~~~~-----~~~~~~~~~~----~~~~~~~~p~~~~~~~p~~f~~i 47 (243)
||||+++.|..... +|-+.||.+. |..+.+.||+.-+. +++ |.|+
T Consensus 1 mgaaaae~drtl~v~n~~~~v~eelL~ElfiqaGPV~kv~ip~~~d~-~~k-Fa~v 54 (267)
T KOG4454|consen 1 MGAAAAEMDRTLLVQNMYSGVSEELLSELFIQAGPVYKVGIPSGQDQ-EQK-FAYV 54 (267)
T ss_pred CCCCCcchhhHHHHHhhhhhhhHHHHHHHhhccCceEEEeCCCCccC-CCc-eeee
Confidence 89999999987654 4445566654 66788899977653 333 4444
No 34
>KOG1338 consensus Uncharacterized conserved protein [Function unknown]
Probab=32.71 E-value=30 Score=31.95 Aligned_cols=25 Identities=48% Similarity=0.748 Sum_probs=22.1
Q ss_pred cCCCcEEEecccCCCCceEEEecee
Q 026129 122 EKCGAGIVADEDIKRGEFVIEYVGE 146 (243)
Q Consensus 122 ~~kG~Gv~A~~~I~~G~~I~ey~Ge 146 (243)
...|.|+.|+++|++|+.+..|.+.
T Consensus 38 ~~~G~g~vAtesIkkgE~Lf~~prd 62 (466)
T KOG1338|consen 38 RIAGAGIVATESIKKGESLFAYPRD 62 (466)
T ss_pred hhcccceeeehhhcCCceEEEecCc
Confidence 3459999999999999999999775
No 35
>TIGR03569 NeuB_NnaB N-acetylneuraminate synthase. This family is a subset of the Pfam model pfam03102 and is believed to include only authentic NeuB N-acetylneuraminate (sialic acid) synthase enzymes. The majority of the genes identified by this model are observed adjacent to both the NeuA and NeuC genes which together effect the biosynthesis of CMP-N-acetylneuraminate from UDP-N-acetylglucosamine.
Probab=27.80 E-value=34 Score=30.80 Aligned_cols=19 Identities=53% Similarity=0.712 Sum_probs=16.7
Q ss_pred EEEEEEcCCCCCCCeEEEe
Q 026129 211 RIGIFATRDIKKGENLTYD 229 (243)
Q Consensus 211 ~i~i~A~rdI~~GEELt~d 229 (243)
|-.++|.|||++||.||.+
T Consensus 277 rrsl~a~~di~~G~~lt~~ 295 (329)
T TIGR03569 277 RKSLVAAKDIKKGEIFTED 295 (329)
T ss_pred ceEEEEccCcCCCCEecHH
Confidence 5678999999999999975
No 36
>PF07773 DUF1619: Protein of unknown function (DUF1619); InterPro: IPR011677 This is a group of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich N terminus.
Probab=27.17 E-value=39 Score=29.51 Aligned_cols=6 Identities=67% Similarity=1.736 Sum_probs=2.7
Q ss_pred CCCCCC
Q 026129 80 CDRDCH 85 (243)
Q Consensus 80 C~~~C~ 85 (243)
|+.+|.
T Consensus 16 CD~DC~ 21 (294)
T PF07773_consen 16 CDPDCS 21 (294)
T ss_pred CCcccC
Confidence 444443
No 37
>TIGR02059 swm_rep_I cyanobacterial long protein repeat. This domain appears in 29 copies in a large (10000 amino acid) protein in Synechococcus sp. WH8102 associated with a novel flagellar system, as one of three different repeats. Similar domains are found in two different large (<3500) proteins of Synechocystis PCC6803.
Probab=25.68 E-value=1.7e+02 Score=21.67 Aligned_cols=29 Identities=21% Similarity=0.439 Sum_probs=22.0
Q ss_pred EECCe-EEEEEEEcCCCCCCCeEEEecCCC
Q 026129 205 IIDGE-TRIGIFATRDIKKGENLTYDYQYE 233 (243)
Q Consensus 205 ~~~~~-~~i~i~A~rdI~~GEELt~dY~~~ 233 (243)
.+++. ..+.+.-.+.|..||++|+.|...
T Consensus 69 sV~~s~ktVTLTL~~~V~~Gq~VTVsYt~p 98 (101)
T TIGR02059 69 SLGGSNTTITLTLAQVVEDGDEVTLSYTKN 98 (101)
T ss_pred EEcCcccEEEEEecccccCCCEEEEEeeCC
Confidence 34443 357777789999999999999653
No 38
>cd05468 pVHL von Hippel-Lindau (pVHL) tumor suppressor protein. von Hippel-Lindau (pVHL) protein, the gene product of VHL, is a critical regulator of the ubiquitous oxygen-sensing pathway. It is conserved throughout evolution, as its homologs are found in organisms ranging from mammals to the Drosophila melanogaster, Anopheles gambiae insects and the Caenorhabditis elegans nematode. pVHL acts as the substrate recognition component of an E3 ubiquitin ligase complex. Several proteins have been identified as pVHL-binding proteins that are subject to ubiquitin-mediated proteolysis; the best characterized putative substrates are the alpha subunits of the hypoxia-inducible factor (HIF1alpha, HIF2alpha, and HIF3alpha). In addition to HIF degradation, pVHL has been implicated to be involved in HIF independent cellular processes. Germline VHL mutations cause renal cell carcinomas, hemangioblastomas and pheochromocytomas in humans. pVHL can bind to and direct the proper deposition of fibronecti
Probab=23.54 E-value=1.2e+02 Score=23.70 Aligned_cols=36 Identities=19% Similarity=0.445 Sum_probs=19.4
Q ss_pred cccCCCCCCCceeEEEEECCeEEEEEEEcCCCCCCCeEEEe
Q 026129 189 RYINHSCCPNTEMQKWIIDGETRIGIFATRDIKKGENLTYD 229 (243)
Q Consensus 189 RfiNHSC~PN~~~~~~~~~~~~~i~i~A~rdI~~GEELt~d 229 (243)
+|+|++-.| ++.+|++....-..++ .|++|++.+++
T Consensus 12 ~F~N~t~~~---v~~~Wid~~G~~~~Y~--~l~pg~~~~~~ 47 (141)
T cd05468 12 RFVNRTDRP---VELYWIDYDGKPVSYG--TLQPGETVRQN 47 (141)
T ss_pred EEEeCCCCe---EEEEEECCCCCEEEee--eeCCCCEEeec
Confidence 577777333 3445555444444444 36777776553
No 39
>TIGR03586 PseI pseudaminic acid synthase.
Probab=23.26 E-value=51 Score=29.64 Aligned_cols=19 Identities=42% Similarity=0.767 Sum_probs=16.7
Q ss_pred EEEEEEcCCCCCCCeEEEe
Q 026129 211 RIGIFATRDIKKGENLTYD 229 (243)
Q Consensus 211 ~i~i~A~rdI~~GEELt~d 229 (243)
|-.++|.|||++||-||.+
T Consensus 275 rrsl~a~~di~~G~~it~~ 293 (327)
T TIGR03586 275 RRSLYVVKDIKKGETFTEE 293 (327)
T ss_pred eEEEEEccCcCCCCEecHH
Confidence 6678999999999999865
No 40
>PF11720 Inhibitor_I78: Peptidase inhibitor I78 family; InterPro: IPR021719 This family includes Aspergillus elastase inhibitor and belongs to MEROPS peptidase inhibitor family I78.
Probab=20.61 E-value=36 Score=22.43 Aligned_cols=18 Identities=33% Similarity=0.652 Sum_probs=14.7
Q ss_pred EcCCCCCCCeEEEecCCC
Q 026129 216 ATRDIKKGENLTYDYQYE 233 (243)
Q Consensus 216 A~rdI~~GEELt~dY~~~ 233 (243)
..|=|.||+.+|.||..+
T Consensus 25 ~~Rvi~Pg~~vTmDyr~d 42 (60)
T PF11720_consen 25 TVRVIRPGDAVTMDYRPD 42 (60)
T ss_pred ceEEeCCCCcCcccCCCC
Confidence 446688999999999865
Done!