Query 020402
Match_columns 326
No_of_seqs 303 out of 1562
Neff 6.2
Searched_HMMs 46136
Date Fri Mar 29 09:27:53 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/020402.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/020402hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG2744 DNA-binding proteins B 100.0 2.9E-28 6.3E-33 246.5 14.8 183 38-220 153-339 (512)
2 smart00501 BRIGHT BRIGHT, ARID 99.9 9.4E-27 2E-31 186.3 10.1 91 47-137 1-92 (93)
3 PF01388 ARID: ARID/BRIGHT DNA 99.9 3E-25 6.5E-30 176.7 8.6 88 46-133 4-92 (92)
4 PTZ00199 high mobility group p 99.9 1.2E-21 2.6E-26 157.5 10.0 84 234-317 7-93 (94)
5 cd01389 MATA_HMG-box MATA_HMG- 99.8 1.1E-18 2.4E-23 134.8 7.1 71 249-319 1-72 (77)
6 cd01388 SOX-TCF_HMG-box SOX-TC 99.7 3E-18 6.4E-23 130.8 7.3 69 249-317 1-70 (72)
7 PF00505 HMG_box: HMG (high mo 99.7 1.8E-17 4E-22 124.2 8.3 68 250-317 1-69 (69)
8 PF09011 HMG_box_2: HMG-box do 99.7 3.5E-17 7.7E-22 125.2 8.4 71 247-317 1-73 (73)
9 cd01390 HMGB-UBF_HMG-box HMGB- 99.7 5.9E-17 1.3E-21 120.3 8.3 65 250-314 1-66 (66)
10 smart00398 HMG high mobility g 99.7 9.2E-17 2E-21 120.0 8.5 69 249-317 1-70 (70)
11 COG5648 NHP6B Chromatin-associ 99.7 1.7E-16 3.7E-21 142.5 7.9 85 238-322 59-144 (211)
12 KOG0381 HMG box-containing pro 99.7 5.4E-16 1.2E-20 124.0 10.0 75 246-320 17-95 (96)
13 KOG0527 HMG-box transcription 99.6 5.9E-16 1.3E-20 149.4 5.9 75 244-318 57-132 (331)
14 cd00084 HMG-box High Mobility 99.6 3.4E-15 7.4E-20 110.2 8.3 64 250-313 1-65 (66)
15 KOG0526 Nucleosome-binding fac 99.5 2.1E-14 4.5E-19 143.3 6.7 76 238-317 524-600 (615)
16 KOG3248 Transcription factor T 99.5 1.5E-13 3.3E-18 130.5 11.1 136 180-318 108-261 (421)
17 KOG2510 SWI-SNF chromatin-remo 99.2 1.9E-11 4.1E-16 121.4 6.9 96 46-148 291-387 (532)
18 KOG0528 HMG-box transcription 99.0 3.1E-10 6.8E-15 112.6 3.3 75 245-319 321-396 (511)
19 KOG4715 SWI/SNF-related matrix 98.9 4.7E-09 1E-13 99.5 7.8 75 244-318 59-134 (410)
20 KOG2746 HMG-box transcription 98.3 6E-07 1.3E-11 92.6 4.6 75 238-312 170-247 (683)
21 PF14887 HMG_box_5: HMG (high 97.9 3.3E-05 7.2E-10 59.6 6.3 73 249-321 3-75 (85)
22 PF04690 YABBY: YABBY protein; 96.9 0.0013 2.9E-08 58.3 5.0 43 249-291 121-164 (170)
23 PF06382 DUF1074: Protein of u 96.8 0.006 1.3E-07 54.3 7.8 48 254-305 83-131 (183)
24 COG5648 NHP6B Chromatin-associ 96.1 0.0064 1.4E-07 55.4 4.0 66 249-314 143-209 (211)
25 PF08073 CHDNT: CHDNT (NUC034) 88.3 0.45 9.8E-06 34.7 2.6 39 254-292 13-52 (55)
26 PF00249 Myb_DNA-binding: Myb- 83.2 2.8 6.2E-05 28.8 4.6 39 80-129 10-48 (48)
27 PF04769 MAT_Alpha1: Mating-ty 82.0 2.7 5.9E-05 38.5 5.2 53 244-302 38-91 (201)
28 PF06244 DUF1014: Protein of u 80.1 1.6 3.6E-05 36.8 2.9 45 249-293 71-117 (122)
29 TIGR01624 LRP1_Cterm LRP1 C-te 75.3 1.8 3.9E-05 30.7 1.5 31 185-215 16-47 (50)
30 TIGR03481 HpnM hopanoid biosyn 68.9 14 0.0003 33.6 6.1 42 277-318 67-110 (198)
31 PF05142 DUF702: Domain of unk 68.1 3.1 6.8E-05 36.4 1.6 32 185-216 118-149 (154)
32 PRK15117 ABC transporter perip 64.8 15 0.00034 33.6 5.7 46 273-318 66-114 (211)
33 PF12881 NUT_N: NUT protein N 62.3 14 0.00031 36.0 5.0 63 257-319 232-296 (328)
34 PF09441 Abp2: ARS binding pro 60.0 18 0.00039 32.1 4.9 41 69-113 45-85 (175)
35 PF13921 Myb_DNA-bind_6: Myb-l 56.9 20 0.00044 25.4 4.0 36 81-129 8-43 (60)
36 cd00167 SANT 'SWI3, ADA2, N-Co 55.4 26 0.00057 22.4 4.1 37 81-129 9-45 (45)
37 PF05494 Tol_Tol_Ttg2: Toluene 55.0 30 0.00065 30.1 5.6 42 277-318 42-84 (170)
38 KOG3223 Uncharacterized conser 52.6 6.8 0.00015 35.6 1.1 52 248-302 162-215 (221)
39 PF11304 DUF3106: Protein of u 49.9 57 0.0012 26.7 6.1 40 278-317 11-57 (107)
40 COG2854 Ttg2D ABC-type transpo 43.6 34 0.00073 31.5 4.1 43 277-319 74-117 (202)
41 KOG0493 Transcription factor E 42.1 40 0.00087 32.3 4.4 42 227-274 226-267 (342)
42 PF13873 Myb_DNA-bind_5: Myb/S 40.8 36 0.00077 25.5 3.3 54 77-131 14-71 (78)
43 PF12776 Myb_DNA-bind_3: Myb/S 37.9 64 0.0014 24.8 4.5 61 79-139 10-72 (96)
44 PF12650 DUF3784: Domain of un 37.9 23 0.00049 28.1 1.9 17 286-302 25-41 (97)
45 smart00717 SANT SANT SWI3, AD 37.4 67 0.0015 20.7 4.0 26 99-129 22-47 (49)
46 PF13875 DUF4202: Domain of un 36.6 54 0.0012 29.7 4.2 40 255-297 130-170 (185)
47 PF10545 MADF_DNA_bdg: Alcohol 35.8 37 0.00081 25.1 2.7 38 96-133 24-64 (85)
48 PF02337 Gag_p10: Retroviral G 33.1 1.4E+02 0.0031 23.8 5.7 54 50-110 8-64 (90)
49 PRK09706 transcriptional repre 32.7 1E+02 0.0022 25.7 5.2 41 279-319 88-128 (135)
50 smart00595 MADF subfamily of S 31.2 34 0.00073 26.1 1.9 42 94-135 23-65 (89)
51 PF05066 HARE-HTH: HB1, ASXL, 30.5 83 0.0018 23.3 3.8 43 52-105 3-45 (72)
52 PRK10236 hypothetical protein; 29.9 51 0.0011 31.0 3.1 45 257-301 89-140 (237)
53 PF04967 HTH_10: HTH DNA bindi 28.2 77 0.0017 22.8 3.1 40 88-129 13-52 (53)
54 TIGR00787 dctP tripartite ATP- 27.0 1.1E+02 0.0024 28.2 4.8 28 284-311 213-240 (257)
55 COG1638 DctP TRAP-type C4-dica 26.0 1E+02 0.0022 30.3 4.5 43 277-319 237-279 (332)
56 PRK02363 DNA-directed RNA poly 24.5 60 0.0013 27.6 2.3 63 51-123 4-69 (129)
57 PRK12751 cpxP periplasmic stre 24.5 1.3E+02 0.0028 26.6 4.5 32 278-309 118-149 (162)
58 PRK12750 cpxP periplasmic repr 23.0 2E+02 0.0043 25.5 5.5 36 278-313 125-160 (170)
59 PRK10363 cpxP periplasmic repr 22.1 1.7E+02 0.0037 26.1 4.8 39 277-316 111-149 (166)
60 cd07268 Glo_EDI_BRP_like_4 Thi 22.0 43 0.00092 29.3 0.9 49 42-90 4-52 (149)
61 PF13725 tRNA_bind_2: Possible 20.9 59 0.0013 25.6 1.5 20 93-112 78-97 (101)
62 PLN00131 hypothetical protein; 20.4 3.3E+02 0.0071 24.3 6.1 56 6-61 83-147 (218)
63 cd05694 S1_Rrp5_repeat_hs2_sc2 20.4 73 0.0016 24.0 1.8 31 182-214 4-34 (74)
No 1
>KOG2744 consensus DNA-binding proteins Bright/BRCAA1/RBP1 and related proteins containing BRIGHT domain [Transcription]
Probab=99.95 E-value=2.9e-28 Score=246.54 Aligned_cols=183 Identities=38% Similarity=0.547 Sum_probs=152.0
Q ss_pred CCCCchhhhhcHHHHHHHHHHHHHhcCCCCC-CCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCC-CCC
Q 020402 38 PTAKYEDIAQSSDLFWATLEAFHKSFGDKFK-VPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPT-TIT 115 (326)
Q Consensus 38 ~~~~~e~~~~~~~~F~~~L~~F~~~rG~~l~-~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~-~~~ 115 (326)
+...+|.+..+++.||++|+.||+.+|++|+ +|+|+|++||||.||.+|+++||+++|+.+++|++|+..|+||. ++|
T Consensus 153 ~~~~~e~~~~~~eeF~~dl~~f~~~~~~~~~~iPii~~~~ldL~~Ly~lV~s~GG~~~V~~~k~Wrev~~~l~~pt~tiT 232 (512)
T KOG2744|consen 153 PLYETEGVPKSSEEFMEDLRRFMKKRGTKVKSIPIIGGQPLDLHWLYALVTSRGGLDEVTNKKLWREVIDGLNFPTPTIT 232 (512)
T ss_pred cccccccccccHHHHHHHHHHHHHHhCCcceeccccCCCcchHHHHHHHHhcCCchhHhhhhhhHHHHhccccCCCcccc
Confidence 5555666777999999999999999999997 99999999999999999999999999999999999999999999 999
Q ss_pred cHHHHHHHHHHHhhHHhhhhhhhccCCCCCCCCCCCC-CCCCC-CCCCCCCCCchhhhcCCCCCccccCCCceeeeecCc
Q 020402 116 SASFVLRKYYLSLLYHFEQVYYFRREAPSSSMPDAVS-GSSLD-NGSASPEEGSTINQLGSQGISKLQIGCSVSGVIDGK 193 (326)
Q Consensus 116 ~as~~Lk~~Y~k~L~~fE~~~~~~~~~~~~~~~~~~~-~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~V~g~idg~ 193 (326)
+++|.||++|+++|++||+.+++....+..++.+... .++.. .+....+.++........++...+....+.|+|+|+
T Consensus 233 saaf~lr~~y~K~L~~ye~~~~~~~~~pln~p~~~~~~a~~~~~rE~~~~~~~~~~~~~~~~~~~~~~~~~~aa~~~~g~ 312 (512)
T KOG2744|consen 233 SAAFTLRKQYLKLLFEYECEFEKNRHVPLNSPAELSEEASSSNRREGRRHELSPSKEFQANGPSEEEPAEAEAAPEILGN 312 (512)
T ss_pred hHHHHHHHHHHHHHHHHHHHHHHhccCCCCCcccccccccccccccccccccCcchhhccCCcccccccccccchhhhcc
Confidence 9999999999999999999999998777777665444 22222 233444444432333333334444567899999999
Q ss_pred ccCCceEEEeeccccccccccccCCCC
Q 020402 194 FDNGYLVTVNLGSEQLKGVLYHIPHAH 220 (326)
Q Consensus 194 fd~gy~vtv~~gse~~~g~ly~~p~~~ 220 (326)
|+.||++++.++++.+++++|+.+...
T Consensus 313 f~~~~~~~~~~~s~~ln~~~~~~~~~~ 339 (512)
T KOG2744|consen 313 FLQGLLVFMKDGSEPLNGVLYLGPPDL 339 (512)
T ss_pred ccccCceeccCcchhccCccccccCcc
Confidence 999999999999999999999986643
No 2
>smart00501 BRIGHT BRIGHT, ARID (A/T-rich interaction domain) domain. DNA-binding domain containing a helix-turn-helix structure
Probab=99.94 E-value=9.4e-27 Score=186.31 Aligned_cols=91 Identities=40% Similarity=0.652 Sum_probs=87.2
Q ss_pred hcHHHHHHHHHHHHHhcCCCC-CCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHH
Q 020402 47 QSSDLFWATLEAFHKSFGDKF-KVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYY 125 (326)
Q Consensus 47 ~~~~~F~~~L~~F~~~rG~~l-~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y 125 (326)
++++.|+++|.+||+.+|+++ ++|+|+|++||||+||.+|+++|||++||++++|.+||+.||+++.+++++..|+++|
T Consensus 1 ~~~~~F~~~L~~F~~~~g~~~~~~P~i~g~~vdL~~Ly~~V~~~GG~~~v~~~~~W~~Va~~lg~~~~~~~~~~~lk~~Y 80 (93)
T smart00501 1 RERVLFLDRLYKFMEERGSPLKKIPVIGGKPLDLYRLYRLVQERGGYDQVTKDKKWKEIARELGIPDTSTSAASSLRKHY 80 (93)
T ss_pred CcHHHHHHHHHHHHHHcCCcCCcCCeECCEeCcHHHHHHHHHHccCHHHHcCCCCHHHHHHHhCCCcccchHHHHHHHHH
Confidence 468999999999999999998 7999999999999999999999999999999999999999999998999999999999
Q ss_pred HHhhHHhhhhhh
Q 020402 126 LSLLYHFEQVYY 137 (326)
Q Consensus 126 ~k~L~~fE~~~~ 137 (326)
.+||++||+.+.
T Consensus 81 ~k~L~~yE~~~~ 92 (93)
T smart00501 81 ERYLLPFERFLR 92 (93)
T ss_pred HHHhHHHHHHhh
Confidence 999999999854
No 3
>PF01388 ARID: ARID/BRIGHT DNA binding domain; InterPro: IPR001606 Members of the recently discovered ARID (AT-rich interaction domain; also known as BRIGHT domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. 
Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.; GO: 0003677 DNA binding, 0005622 intracellular; PDB: 1C20_A 1KQQ_A 2JRZ_A 2LM1_A 2YQE_A 2JXJ_A 2EH9_A 2CXY_A 2LI6_A 1KN5_A ....
Probab=99.92 E-value=3e-25 Score=176.73 Aligned_cols=88 Identities=38% Similarity=0.711 Sum_probs=81.9
Q ss_pred hhcHHHHHHHHHHHHHhcCCCC-CCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHH
Q 020402 46 AQSSDLFWATLEAFHKSFGDKF-KVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKY 124 (326)
Q Consensus 46 ~~~~~~F~~~L~~F~~~rG~~l-~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~ 124 (326)
..+++.|++.|++||+.+|+++ .+|.|+|++||||+||.+|+++|||++|+.+++|.+||..||+++.+++.+..|+++
T Consensus 4 ~~~~~~F~~~L~~f~~~~g~~~~~~P~i~g~~vDL~~Ly~~V~~~GG~~~V~~~~~W~~va~~lg~~~~~~~~~~~L~~~ 83 (92)
T PF01388_consen 4 TREREQFLEQLREFHESRGTPIDRPPVIGGKPVDLYKLYKAVMKRGGFDKVTKNKKWREVARKLGFPPSSTSAAQQLRQH 83 (92)
T ss_dssp CHHHHHHHHHHHHHHHHTTSSSSS-SEETTSE-SHHHHHHHHHHHTSHHHHHHHTTHHHHHHHTTS-TTSCHHHHHHHHH
T ss_pred chHHHHHHHHHHHHHHHcCCCCCCCCcCCCEeCcHHHHHHHHHhCcCcccCcccchHHHHHHHhCCCCCCCcHHHHHHHH
Confidence 4678999999999999999997 799999999999999999999999999999999999999999999888888999999
Q ss_pred HHHhhHHhh
Q 020402 125 YLSLLYHFE 133 (326)
Q Consensus 125 Y~k~L~~fE 133 (326)
|++||++||
T Consensus 84 Y~~~L~~fE 92 (92)
T PF01388_consen 84 YEKYLLPFE 92 (92)
T ss_dssp HHHHTHHHH
T ss_pred HHHHhHhhC
Confidence 999999998
No 4
>PTZ00199 high mobility group protein; Provisional
Probab=99.86 E-value=1.2e-21 Score=157.48 Aligned_cols=84 Identities=32% Similarity=0.519 Sum_probs=78.2
Q ss_pred hhhhccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh---hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHH
Q 020402 234 HRRRKRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE---KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYK 310 (326)
Q Consensus 234 rrkkkk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~---~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~ 310 (326)
++.+++++++++||++||+|+|||+||+.++|..|+.+||++. .+|+++||++|+.|+++||++|+++|+.|+++|.
T Consensus 7 ~~~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~ 86 (94)
T PTZ00199 7 KVLVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYE 86 (94)
T ss_pred CccccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence 4446666677899999999999999999999999999999974 8999999999999999999999999999999999
Q ss_pred HHHHHHH
Q 020402 311 SEMLEYR 317 (326)
Q Consensus 311 ~em~~Yk 317 (326)
.||.+|+
T Consensus 87 ~e~~~Y~ 93 (94)
T PTZ00199 87 KEKAEYA 93 (94)
T ss_pred HHHHHHh
Confidence 9999996
No 5
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.76 E-value=1.1e-18 Score=134.79 Aligned_cols=71 Identities=25% Similarity=0.427 Sum_probs=68.7
Q ss_pred CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402 249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS 319 (326)
Q Consensus 249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~ 319 (326)
+||||+||||||+++.|..++.++|+++ .+|+++||++|+.|++++|++|.++|++++++|.+++++|+-.
T Consensus 1 ~~kRP~naf~lf~~~~r~~~~~~~p~~~~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yky~ 72 (77)
T cd01389 1 KIPRPRNAFILYRQDKHAQLKTENPGLTNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYKYT 72 (77)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCccc
Confidence 5899999999999999999999999999 9999999999999999999999999999999999999999853
No 6
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.75 E-value=3e-18 Score=130.79 Aligned_cols=69 Identities=30% Similarity=0.401 Sum_probs=67.1
Q ss_pred CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR 317 (326)
Q Consensus 249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk 317 (326)
+.|||+|||++|++++|.+++++||+++ .+|+++||+.|+.|++++|++|.++|++++++|.+++++|+
T Consensus 1 ~iKrP~naf~~F~~~~r~~~~~~~p~~~~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~ 70 (72)
T cd01388 1 HIKRPMNAFMLFSKRHRRKVLQEYPLKENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYK 70 (72)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCC
Confidence 4689999999999999999999999999 99999999999999999999999999999999999999997
No 7
>PF00505 HMG_box: HMG (high mobility group) box; InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.72 E-value=1.8e-17 Score=124.21 Aligned_cols=68 Identities=40% Similarity=0.651 Sum_probs=65.0
Q ss_pred CCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 250 PKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR 317 (326)
Q Consensus 250 PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk 317 (326)
||||+|||++|+.+.+..++.+||++. .+|+++||++|++|+++||++|.++|++++++|+++|++|+
T Consensus 1 PkrP~~af~lf~~~~~~~~k~~~p~~~~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~ 69 (69)
T PF00505_consen 1 PKRPPNAFMLFCKEKRAKLKEENPDLSNKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK 69 (69)
T ss_dssp SSSS--HHHHHHHHHHHHHHHHSTTSTHHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCHHHHHHHHHHHHHHHHhcccccccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 899999999999999999999999999 99999999999999999999999999999999999999996
No 8
>PF09011 HMG_box_2: HMG-box domain; InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function.; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.71 E-value=3.5e-17 Score=125.16 Aligned_cols=71 Identities=38% Similarity=0.655 Sum_probs=62.4
Q ss_pred CCCCCCCCChHHHHHHHHHHHhccc-CCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 247 PSRPKSNRSGYNFFFAEHYARLKPH-YYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR 317 (326)
Q Consensus 247 p~~PKrP~SAY~lF~~e~r~~lk~~-~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk 317 (326)
|++||+|+|||+||+.+++..++.. .+... .|+++.|++.|++||++||.+|.++|+.++++|+.+|.+|+
T Consensus 1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~ 73 (73)
T PF09011_consen 1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQSFREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN 73 (73)
T ss_dssp SSS--SSSSHHHHHHHHHHHHHHHHT-T-SSHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred CcCCCCCCCHHHHHHHHHHHHHHHhcccCCCHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 5789999999999999999999988 66666 99999999999999999999999999999999999999995
No 9
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.70 E-value=5.9e-17 Score=120.28 Aligned_cols=65 Identities=40% Similarity=0.615 Sum_probs=63.2
Q ss_pred CCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 250 PKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEML 314 (326)
Q Consensus 250 PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~ 314 (326)
||+|+|||++|++++|..++.+||+++ .+|+++||++|++|++++|++|.++|++++++|+.+|.
T Consensus 1 Pkrp~saf~~f~~~~r~~~~~~~p~~~~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~~ 66 (66)
T cd01390 1 PKRPLSAYFLFSQEQRPKLKKENPDASVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEMK 66 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhhC
Confidence 899999999999999999999999998 99999999999999999999999999999999999873
No 10
>smart00398 HMG high mobility group.
Probab=99.69 E-value=9.2e-17 Score=119.95 Aligned_cols=69 Identities=43% Similarity=0.662 Sum_probs=67.2
Q ss_pred CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR 317 (326)
Q Consensus 249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk 317 (326)
+||+|+|||++|++++|..++.++|++. .+|+++||++|+.|++++|++|.++|++++++|++++++|+
T Consensus 1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~~~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~ 70 (70)
T smart00398 1 KPKRPMSAFMLFSQENRAKIKAENPDLSNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK 70 (70)
T ss_pred CcCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 5899999999999999999999999999 99999999999999999999999999999999999999985
No 11
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.66 E-value=1.7e-16 Score=142.49 Aligned_cols=85 Identities=28% Similarity=0.444 Sum_probs=80.6
Q ss_pred ccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 238 KRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEY 316 (326)
Q Consensus 238 kk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Y 316 (326)
+...++++||+.||||+|||++|+.++|.+++.++|++. .++.+.+|++|++|+++||++|.+.|..++++|..++..|
T Consensus 59 k~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l~~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~y 138 (211)
T COG5648 59 KRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKLTFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEEY 138 (211)
T ss_pred HHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCCChHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHhh
Confidence 455677899999999999999999999999999999999 9999999999999999999999999999999999999999
Q ss_pred HhcCCC
Q 020402 317 RSSYDS 322 (326)
Q Consensus 317 k~~~~~ 322 (326)
.++...
T Consensus 139 ~~k~~~ 144 (211)
T COG5648 139 NKKLPN 144 (211)
T ss_pred hcccCC
Confidence 997754
No 12
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.66 E-value=5.4e-16 Score=123.98 Aligned_cols=75 Identities=39% Similarity=0.671 Sum_probs=71.8
Q ss_pred CC--CCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH-HHHhcC
Q 020402 246 DP--SRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEML-EYRSSY 320 (326)
Q Consensus 246 dp--~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~-~Yk~~~ 320 (326)
|+ +.||+|++||++|+.+.|..++.+||++. .++++++|++|++|++++|++|+..|.+++++|..+|. +|+..+
T Consensus 17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~ 95 (96)
T KOG0381|consen 17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGLSVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL 95 (96)
T ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence 66 59999999999999999999999999988 99999999999999999999999999999999999999 998765
No 13
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.61 E-value=5.9e-16 Score=149.41 Aligned_cols=75 Identities=19% Similarity=0.303 Sum_probs=71.7
Q ss_pred CCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 020402 244 LRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRS 318 (326)
Q Consensus 244 ~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~ 318 (326)
++...+.||||||||+|.+.+|.+|.+++|++. .||+|+||.+||.|+|+||++|.++|++.|..|++|+++||-
T Consensus 57 k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHNSEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYKY 132 (331)
T KOG0527|consen 57 KTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHNSEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYKY 132 (331)
T ss_pred CCCccccCCCcchhhhhhHHHHHHHHHhCcchhhHHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCccc
Confidence 445568999999999999999999999999999 999999999999999999999999999999999999999986
No 14
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.60 E-value=3.4e-15 Score=110.19 Aligned_cols=64 Identities=47% Similarity=0.707 Sum_probs=62.3
Q ss_pred CCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHH
Q 020402 250 PKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEM 313 (326)
Q Consensus 250 PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em 313 (326)
||+|+|||++|+++.+..++.++|++. .+|++++|++|+.|++++|++|.+.|+.++++|++++
T Consensus 1 pkrp~~af~~f~~~~~~~~~~~~~~~~~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~ 65 (66)
T cd00084 1 PKRPLSAYFLFSQEHRAEVKAENPGLSVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 799999999999999999999999988 9999999999999999999999999999999999876
No 15
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.50 E-value=2.1e-14 Score=143.28 Aligned_cols=76 Identities=30% Similarity=0.598 Sum_probs=72.0
Q ss_pred ccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 238 KRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEY 316 (326)
Q Consensus 238 kk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Y 316 (326)
+|+.++++||++|||++||||+|+...|..||.+ ++. .+++|..|++|+.|+. |.+|+++|+.||.||+.||.+|
T Consensus 524 ~k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi~~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~~y 599 (615)
T KOG0526|consen 524 KKKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GISVGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMKEY 599 (615)
T ss_pred ccCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--CchHHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHHhh
Confidence 3455778999999999999999999999999998 888 9999999999999999 9999999999999999999999
Q ss_pred H
Q 020402 317 R 317 (326)
Q Consensus 317 k 317 (326)
+
T Consensus 600 k 600 (615)
T KOG0526|consen 600 K 600 (615)
T ss_pred c
Confidence 9
No 16
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.49 E-value=1.5e-13 Score=130.55 Aligned_cols=136 Identities=18% Similarity=0.257 Sum_probs=97.6
Q ss_pred ccCCCceeeeecCcccCCceEEEeecccccccccccc----CCCCCC----------CCCCC-CCCC--cchhhhccccc
Q 020402 180 LQIGCSVSGVIDGKFDNGYLVTVNLGSEQLKGVLYHI----PHAHNV----------SQSSN-NSAA--PTHRRRKRSRL 242 (326)
Q Consensus 180 ~~~~~~V~g~idg~fd~gy~vtv~~gse~~~g~ly~~----p~~~~~----------~~~~~-~~a~--~~rrkkkk~k~ 242 (326)
+..+-.|..+-.|.|.+.|+-+|...-. +..-.|+ |..... ++.+. .++. ..++...+++.
T Consensus 108 ~~l~wp~y~~pt~~~~~p~p~~~~asms--rf~ph~~~p~~p~~~tagiPhpaiv~P~~kqes~~~~~nvk~~~~~k~e~ 185 (421)
T KOG3248|consen 108 HPLGWPVYPIPTFGFRHPYPGVVNASMS--RFSPHHVEPGHPGLHTAGIPHPAIVTPPVKQESDSAPQNVKRQAESKKEE 185 (421)
T ss_pred CccCCccccCCCCCCCCCCchhhhhhhh--hcchhccCCCCCCccccCCCCccccCCcccCcccccccccchhhhccccc
Confidence 4556678888899999999974433322 2222332 211111 11221 1111 23333333333
Q ss_pred cCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 020402 243 ALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRS 318 (326)
Q Consensus 243 k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~ 318 (326)
+.|++ +.|+|+||||+||+|.|++|.+|..-.. .+|+++||++|.+|+.||..+|+|+|++||+-|+..++.|-+
T Consensus 186 e~Kkp-hiKKPLNAFmlyMKEmRa~vvaEctlKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSA 261 (421)
T KOG3248|consen 186 EAKKP-HIKKPLNAFMLYMKEMRAKVVAECTLKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSA 261 (421)
T ss_pred cccCc-cccccHHHHHHHHHHHHHHHHHHhhhhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcch
Confidence 43444 8999999999999999999999999777 999999999999999999999999999999999999888855
No 17
>KOG2510 consensus SWI-SNF chromatin-remodeling complex protein [Chromatin structure and dynamics]
Probab=99.21 E-value=1.9e-11 Score=121.44 Aligned_cols=96 Identities=29% Similarity=0.509 Sum_probs=87.7
Q ss_pred hhcHHHHHHHHHHHHHhcCCCCC-CCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHH
Q 020402 46 AQSSDLFWATLEAFHKSFGDKFK-VPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKY 124 (326)
Q Consensus 46 ~~~~~~F~~~L~~F~~~rG~~l~-~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~ 124 (326)
..+++..++.|+.|++.+.+++. +|.++.|+||||+||..|..+||+..|++++ +++|.-|| .++++.||++
T Consensus 291 qp~r~~wvDR~raF~ee~~Sp~t~~p~~gakPldl~rlYvsvke~gg~~~v~knk--rd~a~~lg-----ssaa~~l~k~ 363 (532)
T KOG2510|consen 291 QPERKEWVDRLRAFTEERASPMTNLPAVGAKPLDLYRLYVSVKEIGGLTQVNKNK--RDLATNLG-----SSAASSLKKQ 363 (532)
T ss_pred CcchhhHHHHHHHHHHhhcCcccccccccccchhHHHHHHHHHHhccceeeccch--hhhhhccc-----hHHHHHHHHH
Confidence 47889999999999999999995 8999999999999999999999999999999 99999888 5788899999
Q ss_pred HHHhhHHhhhhhhhccCCCCCCCC
Q 020402 125 YLSLLYHFEQVYYFRREAPSSSMP 148 (326)
Q Consensus 125 Y~k~L~~fE~~~~~~~~~~~~~~~ 148 (326)
|.+||+.||+.+-.|+++++....
T Consensus 364 y~~~lf~fec~f~Rg~e~p~~~~s 387 (532)
T KOG2510|consen 364 YIQYLFAFECKFERGEEPPPDIFS 387 (532)
T ss_pred HHHHHHhhceeeeccCCCCHHHhh
Confidence 999999999999988887764443
No 18
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=98.96 E-value=3.1e-10 Score=112.65 Aligned_cols=75 Identities=17% Similarity=0.285 Sum_probs=68.6
Q ss_pred CCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402 245 RDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS 319 (326)
Q Consensus 245 kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~ 319 (326)
.-+.+.||||||||+|.++.|.++.+.+||+. ..|+|+||.+|+.|+..||++|+|.-.+.=..|.+.+++||-+
T Consensus 321 ss~PHIKRPMNAFMVWAkDERRKILqA~PDMHNSnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYrYk 396 (511)
T KOG0528|consen 321 SSEPHIKRPMNAFMVWAKDERRKILQAFPDMHNSNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYRYK 396 (511)
T ss_pred CCCccccCCcchhhcccchhhhhhhhcCccccccchhHHhcccccccccccccchHHHHHHHHHhhhccCcccccC
Confidence 33458899999999999999999999999999 9999999999999999999999987777777999999999864
No 19
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]
Probab=98.88 E-value=4.7e-09 Score=99.50 Aligned_cols=75 Identities=24% Similarity=0.372 Sum_probs=71.0
Q ss_pred CCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 020402 244 LRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRS 318 (326)
Q Consensus 244 ~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~ 318 (326)
.+.|..|-+|+-+||.|++..++++++.||++. .||.|+||.+|..|+|+||+.|...++.+|..|.+.|..|..
T Consensus 59 pkpPkppekpl~pymrySrkvWd~VkA~nPe~kLWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~ 134 (410)
T KOG4715|consen 59 PKPPKPPEKPLMPYMRYSRKVWDQVKASNPELKLWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHN 134 (410)
T ss_pred CCCCCCCCcccchhhHHhhhhhhhhhccCcchHHHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhC
Confidence 455678889999999999999999999999999 999999999999999999999999999999999999999975
No 20
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.29 E-value=6e-07 Score=92.65 Aligned_cols=75 Identities=24% Similarity=0.352 Sum_probs=68.6
Q ss_pred ccccccCCCCCCCCCCCChHHHHHHHHH--HHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHH
Q 020402 238 KRSRLALRDPSRPKSNRSGYNFFFAEHY--ARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSE 312 (326)
Q Consensus 238 kk~k~k~kdp~~PKrP~SAY~lF~~e~r--~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~e 312 (326)
..+..-++|..+.++|||||++|++.+| ..+.+.||+.+ +.|++|+|+.|-.|.+.||+.|.+.|.+.|+.|.+.
T Consensus 170 dgrspnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn~DNrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfka 247 (683)
T KOG2746|consen 170 DGRSPNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPNQDNRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFKA 247 (683)
T ss_pred ccCCCCcCcchhhhhhhHHHHHHHhhcCCccchhccCccccchhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhhh
Confidence 3444556777799999999999999999 89999999999 999999999999999999999999999999999876
No 21
>PF14887 HMG_box_5: HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=97.91 E-value=3.3e-05 Score=59.59 Aligned_cols=73 Identities=21% Similarity=0.374 Sum_probs=58.8
Q ss_pred CCCCCCChHHHHHHHHHHHhcccCCCChhHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhcCC
Q 020402 249 RPKSNRSGYNFFFAEHYARLKPHYYGQEKAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSSYD 321 (326)
Q Consensus 249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~~~ 321 (326)
-|..|.+|--+|.+.......+.++.-..-..+.+...|++|++.+|.+|..+|.+|..+|+++|.+|++-..
T Consensus 3 lPE~PKt~qe~Wqq~vi~dYla~~~~dr~K~~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~~~ 75 (85)
T PF14887_consen 3 LPETPKTAQEIWQQSVIGDYLAKFRNDRKKALKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSAPA 75 (85)
T ss_dssp -S----THHHHHHHHHHHHHHHHTTSTHHHHHHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-CCC
T ss_pred CCCCCCCHHHHHHHHHHHHHHHHhhHhHHHHHHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcCCC
Confidence 4678899999999998888888888766333568999999999999999999999999999999999998554
No 22
>PF04690 YABBY: YABBY protein; InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs.
Probab=96.92 E-value=0.0013 Score=58.35 Aligned_cols=43 Identities=21% Similarity=0.316 Sum_probs=39.1
Q ss_pred CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCC
Q 020402 249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLT 291 (326)
Q Consensus 249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls 291 (326)
+-.|-+|||+.|++|.-++||+++|+++ +|+-+..++.|...+
T Consensus 121 KRqR~psaYn~f~k~ei~rik~~~p~ishkeaFs~aAknW~h~p 164 (170)
T PF04690_consen 121 KRQRVPSAYNRFMKEEIQRIKAENPDISHKEAFSAAAKNWAHFP 164 (170)
T ss_pred ccCCCchhHHHHHHHHHHHHHhcCCCCCHHHHHHHHHHhhhhCc
Confidence 3347789999999999999999999999 999999999998765
No 23
>PF06382 DUF1074: Protein of unknown function (DUF1074); InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=96.77 E-value=0.006 Score=54.28 Aligned_cols=48 Identities=23% Similarity=0.392 Sum_probs=41.6
Q ss_pred CChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHH
Q 020402 254 RSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKD 305 (326)
Q Consensus 254 ~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~d 305 (326)
.+||+-|+.+.+.+ |.++. .|+....+..|..|+++||..|..++...
T Consensus 83 nnaYLNFLReFRrk----h~~L~p~dlI~~AAraW~rLSe~eK~rYrr~~~~~ 131 (183)
T PF06382_consen 83 NNAYLNFLREFRRK----HCGLSPQDLIQRAARAWCRLSEAEKNRYRRMAPSV 131 (183)
T ss_pred chHHHHHHHHHHHH----ccCCCHHHHHHHHHHHHHhCCHHHHHHHHhhcchh
Confidence 57899999998874 56777 99999999999999999999999876543
No 24
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=96.09 E-value=0.0064 Score=55.44 Aligned_cols=66 Identities=23% Similarity=0.168 Sum_probs=58.7
Q ss_pred CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEML 314 (326)
Q Consensus 249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~ 314 (326)
+++.+.-+|.-+-.+.|..+...+|+.. .++.++++..|++|++.-|.+|.+.+.++++.|...|+
T Consensus 143 ~~~~~~~~~~e~~~~~r~~~~~~~~~~~~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~ 209 (211)
T COG5648 143 PNKAPIGPFIENEPKIRPKVEGPSPDKALVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP 209 (211)
T ss_pred CCCCCCchhhhccHHhccccCCCCcchhhhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence 5566677777788888888888899998 99999999999999999999999999999999987765
No 25
>PF08073 CHDNT: CHDNT (NUC034) domain; InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases.; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=88.25 E-value=0.45 Score=34.66 Aligned_cols=39 Identities=13% Similarity=0.160 Sum_probs=34.9
Q ss_pred CChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCH
Q 020402 254 RSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTE 292 (326)
Q Consensus 254 ~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~ 292 (326)
.+-|-+|.+-.|+.|.+.||+.. ..+...++.+|++.++
T Consensus 13 lt~yK~Fsq~vRP~l~~~NPk~~~sKl~~l~~AKwrEF~~ 52 (55)
T PF08073_consen 13 LTNYKAFSQHVRPLLAKANPKAPMSKLMMLLQAKWREFQE 52 (55)
T ss_pred HHHHHHHHHHHHHHHHHHCCCCcHHHHHHHHHHHHHHHHh
Confidence 35688999999999999999999 9999999999987654
No 26
>PF00249 Myb_DNA-binding: Myb-like DNA-binding domain; InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding.; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=83.20 E-value=2.8 Score=28.83 Aligned_cols=39 Identities=23% Similarity=0.314 Sum_probs=27.9
Q ss_pred hHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402 80 HRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL 129 (326)
Q Consensus 80 ~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L 129 (326)
..|...|...|.- .|..||..|+. +-...+++.+|.+||
T Consensus 10 ~~l~~~v~~~g~~-------~W~~Ia~~~~~----~Rt~~qc~~~~~~~~ 48 (48)
T PF00249_consen 10 EKLLEAVKKYGKD-------NWKKIAKRMPG----GRTAKQCRSRYQNLL 48 (48)
T ss_dssp HHHHHHHHHSTTT-------HHHHHHHHHSS----SSTHHHHHHHHHHHT
T ss_pred HHHHHHHHHhCCc-------HHHHHHHHcCC----CCCHHHHHHHHHhhC
Confidence 3456666666643 79999999992 223448999999886
No 27
>PF04769 MAT_Alpha1: Mating-type protein MAT alpha 1; InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway.; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=82.03 E-value=2.7 Score=38.48 Aligned_cols=53 Identities=21% Similarity=0.271 Sum_probs=37.4
Q ss_pred CCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHH
Q 020402 244 LRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKG 302 (326)
Q Consensus 244 ~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A 302 (326)
+....++|||.|+||.|..=.- ...++.. .+++..|+..|+.=+- |..|.-+|
T Consensus 38 ~~~~~~~kr~lN~Fm~FRsyy~----~~~~~~~Qk~~S~~l~~lW~~dp~--k~~W~l~a 91 (201)
T PF04769_consen 38 KRSPEKAKRPLNGFMAFRSYYS----PIFPPLPQKELSGILTKLWEKDPF--KNKWSLMA 91 (201)
T ss_pred cccccccccchhHHHHHHHHHH----hhcCCcCHHHHHHHHHHHHhCCcc--HhHHHHHh
Confidence 3344578999999999975544 4456666 9999999999987433 44444443
No 28
>PF06244 DUF1014: Protein of unknown function (DUF1014); InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=80.15 E-value=1.6 Score=36.83 Aligned_cols=45 Identities=18% Similarity=0.313 Sum_probs=39.4
Q ss_pred CC-CCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHH
Q 020402 249 RP-KSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEA 293 (326)
Q Consensus 249 ~P-KrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~e 293 (326)
+| ||-.-||.-|....-++|++++|++- ..+..+|-..|..-+++
T Consensus 71 HPErR~KAAy~afeE~~Lp~lK~E~PgLrlsQ~kq~l~K~w~KSPeN 117 (122)
T PF06244_consen 71 HPERRMKAAYKAFEERRLPELKEENPGLRLSQYKQMLWKEWQKSPEN 117 (122)
T ss_pred CcchhHHHHHHHHHHHHhHHHHhhCCCchHHHHHHHHHHHHhcCCCC
Confidence 44 45557899999999999999999999 99999999999887764
No 29
>TIGR01624 LRP1_Cterm LRP1 C-terminal domain. This model represents a tightly conserved small domain found in LRP1 and related plant proteins. This family also contains a well-conserved putative zinc finger domain (TIGR01623). The rest of the sequence of most members consists of highly divergent, low-complexity sequence.
Probab=75.34 E-value=1.8 Score=30.70 Aligned_cols=31 Identities=32% Similarity=0.557 Sum_probs=27.3
Q ss_pred ceeeeecCcc-cCCceEEEeeccccccccccc
Q 020402 185 SVSGVIDGKF-DNGYLVTVNLGSEQLKGVLYH 215 (326)
Q Consensus 185 ~V~g~idg~f-d~gy~vtv~~gse~~~g~ly~ 215 (326)
.|+++=||.- +..|-.+|+||--.++|+||-
T Consensus 16 Rvs~idd~~~~e~aYQt~V~IgGHvFkGiLyD 47 (50)
T TIGR01624 16 RVTAIDDGEQAEYAYQATVTIGGHVFKGFLHD 47 (50)
T ss_pred EEeccCCCCCceEEEEEEEEECceEEeeEEec
Confidence 5777778876 779999999999999999996
No 30
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per se.
Probab=68.89 E-value=14 Score=33.55 Aligned_cols=42 Identities=17% Similarity=0.381 Sum_probs=35.7
Q ss_pred hHHH-HHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHh
Q 020402 277 KAIS-KKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRS 318 (326)
Q Consensus 277 ~eis-k~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~ 318 (326)
..++ ..+|.-|+.+|+++|+.|.+.-.. ....|-..+..|..
T Consensus 67 ~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~ 110 (198)
T TIGR03481 67 PAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAG 110 (198)
T ss_pred HHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcC
Confidence 4454 478999999999999999987776 78899999999975
No 31
>PF05142 DUF702: Domain of unknown function (DUF702); InterPro: IPR007818 This is a family of plant proteins of unknown function.
Probab=68.05 E-value=3.1 Score=36.45 Aligned_cols=32 Identities=38% Similarity=0.655 Sum_probs=29.4
Q ss_pred ceeeeecCcccCCceEEEeecccccccccccc
Q 020402 185 SVSGVIDGKFDNGYLVTVNLGSEQLKGVLYHI 216 (326)
Q Consensus 185 ~V~g~idg~fd~gy~vtv~~gse~~~g~ly~~ 216 (326)
.|+++=||.-+..|-.+|+||--.|+|+||--
T Consensus 118 RVssiDdgedE~AYQTaV~IGGHVFKGiLYDq 149 (154)
T PF05142_consen 118 RVSSIDDGEDEYAYQTAVNIGGHVFKGILYDQ 149 (154)
T ss_pred EEecccCcccceeeEEeEEECCEEeeeeeecc
Confidence 48888899999999999999999999999974
No 32
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=64.85 E-value=15 Score=33.56 Aligned_cols=46 Identities=17% Similarity=0.200 Sum_probs=37.0
Q ss_pred CCCh-hHHH-HHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHh
Q 020402 273 YGQE-KAIS-KKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRS 318 (326)
Q Consensus 273 p~~~-~eis-k~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~ 318 (326)
|..+ ..++ ..+|.-|+.+++++|+.|.+.-.. ....|-..+.+|..
T Consensus 66 p~~Df~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~ 114 (211)
T PRK15117 66 PYVQVKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHG 114 (211)
T ss_pred ccCCHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCC
Confidence 4455 5554 478999999999999999876655 77889999999975
No 33
>PF12881 NUT_N: NUT protein N terminus; InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=62.34 E-value=14 Score=35.96 Aligned_cols=63 Identities=13% Similarity=0.048 Sum_probs=42.0
Q ss_pred HHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHH-HHHHHHHHHhc
Q 020402 257 YNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKER-YKSEMLEYRSS 319 (326)
Q Consensus 257 Y~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dker-Y~~em~~Yk~~ 319 (326)
+-.|+.-.-..+....|.+. .|-....-+.|...|.-+|..|+|+|++=+|= -++||+.-+-+
T Consensus 232 lSCFLIpvLrsLar~kPtMtlEeGl~ra~qEW~~~SnfdRmifyemaekFmEFEaeEEmq~q~lq 296 (328)
T PF12881_consen 232 LSCFLIPVLRSLARLKPTMTLEEGLWRAVQEWQHTSNFDRMIFYEMAEKFMEFEAEEEMQIQKLQ 296 (328)
T ss_pred hhhhHHHHHHHHHhcCCCccHHHHHHHHHHHhhccccccHHHHHHHHHHHccCCcHHHHHHHHHH
Confidence 33333333333444567777 77777888999999999999999999985442 12455555443
No 34
>PF09441 Abp2: ARS binding protein 2; InterPro: IPR018562 This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals.
Probab=60.01 E-value=18 Score=32.05 Aligned_cols=41 Identities=15% Similarity=0.327 Sum_probs=34.8
Q ss_pred CCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCC
Q 020402 69 VPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTT 113 (326)
Q Consensus 69 ~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~ 113 (326)
+|.-+||..+.|.||..|.++-.- .-+.|.++|-.||+.+.
T Consensus 45 pPkS~Gk~Fs~~~Lf~LI~k~~~k----eikTW~~La~~LGVepp 85 (175)
T PF09441_consen 45 PPKSDGKSFSTFTLFELIRKLESK----EIKTWAQLALELGVEPP 85 (175)
T ss_pred CCCcCCccchHHHHHHHHHHHhhh----hHhHHHHHHHHhCCCCC
Confidence 899999999999999999976432 34689999999999654
No 35
>PF13921 Myb_DNA-bind_6: Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=56.92 E-value=20 Score=25.43 Aligned_cols=36 Identities=19% Similarity=0.280 Sum_probs=23.0
Q ss_pred HHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402 81 RLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL 129 (326)
Q Consensus 81 ~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L 129 (326)
.|...|...|. .|..||..|| .. ...+++..|.++|
T Consensus 8 ~L~~~~~~~g~--------~W~~Ia~~l~-~R----t~~~~~~r~~~~l 43 (60)
T PF13921_consen 8 LLLELVKKYGN--------DWKKIAEHLG-NR----TPKQCRNRWRNHL 43 (60)
T ss_dssp HHHHHHHHHTS---------HHHHHHHST-TS-----HHHHHHHHHHTT
T ss_pred HHHHHHHHHCc--------CHHHHHHHHC-cC----CHHHHHHHHHHHC
Confidence 35555555553 6999999996 11 2347788888766
No 36
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=55.36 E-value=26 Score=22.36 Aligned_cols=37 Identities=19% Similarity=0.277 Sum_probs=24.1
Q ss_pred HHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402 81 RLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL 129 (326)
Q Consensus 81 ~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L 129 (326)
.|...|...|- ..|..|+..|+-- .+..++.+|..++
T Consensus 9 ~l~~~~~~~g~-------~~w~~Ia~~~~~r-----s~~~~~~~~~~~~ 45 (45)
T cd00167 9 LLLEAVKKYGK-------NNWEKIAKELPGR-----TPKQCRERWRNLL 45 (45)
T ss_pred HHHHHHHHHCc-------CCHHHHHhHcCCC-----CHHHHHHHHHHhC
Confidence 45555555552 5799999998641 2337788877653
No 37
>PF05494 Tol_Tol_Ttg2: Toluene tolerance, Ttg2; InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters.; PDB: 2QGU_A.
Probab=55.03 E-value=30 Score=30.07 Aligned_cols=42 Identities=19% Similarity=0.361 Sum_probs=32.1
Q ss_pred hHHHHHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHh
Q 020402 277 KAISKKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRS 318 (326)
Q Consensus 277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~ 318 (326)
.-....+|.-|+.++++||+.|.+.-.+ ....|-..+..|..
T Consensus 42 ~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~ 84 (170)
T PF05494_consen 42 RMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG 84 (170)
T ss_dssp HHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred HHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence 4445678899999999999999876655 67788888888875
No 38
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=52.61 E-value=6.8 Score=35.64 Aligned_cols=52 Identities=17% Similarity=0.339 Sum_probs=42.0
Q ss_pred CCC-CCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHH
Q 020402 248 SRP-KSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKG 302 (326)
Q Consensus 248 ~~P-KrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A 302 (326)
.+| ||=+-||.-|-...-++|+.++|++. .++-.+|-..|..-+|+ ||.+++
T Consensus 162 rHPEkRmrAA~~afEe~~LPrLK~e~P~lrlsQ~Kqll~Kew~KsPDN---P~Nq~~ 215 (221)
T KOG3223|consen 162 RHPEKRMRAAFKAFEEARLPRLKKENPGLRLSQYKQLLKKEWQKSPDN---PFNQAA 215 (221)
T ss_pred cChHHHHHHHHHHHHHhhchhhhhcCCCccHHHHHHHHHHHHhhCCCC---hhhHHh
Confidence 455 45567788898888899999999999 99999999999988875 454443
No 39
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins; however, this cannot be confirmed. Currently no function is known.
Probab=49.94 E-value=57 Score=26.69 Aligned_cols=40 Identities=10% Similarity=0.318 Sum_probs=18.7
Q ss_pred HHHHHHHHHhccCCHHHHHHHHHHHH-------HHHHHHHHHHHHHH
Q 020402 278 AISKKIGVLWSNLTEAEKQVYQEKGL-------KDKERYKSEMLEYR 317 (326)
Q Consensus 278 eisk~ige~Wk~Ls~eeK~~Y~e~A~-------~dkerY~~em~~Yk 317 (326)
++..-+.+.|+.|+++.|..+.+.|. .+++++..-|..|.
T Consensus 11 ~~L~pl~~~W~~l~~~qr~k~l~~a~r~~~mspeqq~r~~~rm~~W~ 57 (107)
T PF11304_consen 11 QALAPLAERWNSLPPEQRRKWLQIAERWPSMSPEQQQRLRERMRRWA 57 (107)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHH
Confidence 33444455555555555554444432 24444444444444
No 40
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=43.56 E-value=34 Score=31.46 Aligned_cols=43 Identities=14% Similarity=0.291 Sum_probs=36.8
Q ss_pred hHHHHHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHhc
Q 020402 277 KAISKKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRSS 319 (326)
Q Consensus 277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~~ 319 (326)
..-...+|.-|+.+|+|+++.|.+.-.. ....|-..|.+|+.+
T Consensus 74 ~~a~~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q 117 (202)
T COG2854 74 YAAKLVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQ 117 (202)
T ss_pred HHHHHHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCC
Confidence 5567788999999999999999876655 778899999999874
No 41
>KOG0493 consensus Transcription factor Engrailed, contains HOX domain [General function prediction only]
Probab=42.11 E-value=40 Score=32.28 Aligned_cols=42 Identities=29% Similarity=0.518 Sum_probs=23.2
Q ss_pred CCCCCcchhhhccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCC
Q 020402 227 NNSAAPTHRRRKRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYG 274 (326)
Q Consensus 227 ~~~a~~~rrkkkk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~ 274 (326)
+|++.+|-||-||++-..+ -=|||++|| ..|+-++||.++..
T Consensus 226 RPSsGPR~Rk~kkkk~~~~---eeKRPRTAF---taeQL~RLK~EF~e 267 (342)
T KOG0493|consen 226 RPSSGPRHRKPKKKKSSSK---EEKRPRTAF---TAEQLQRLKAEFQE 267 (342)
T ss_pred CCCCCcccccccccCCccc---hhcCccccc---cHHHHHHHHHHHhh
Confidence 4555555444333332222 336889985 46777777776543
No 42
>PF13873 Myb_DNA-bind_5: Myb/SANT-like DNA-binding domain
Probab=40.80 E-value=36 Score=25.51 Aligned_cols=54 Identities=13% Similarity=0.218 Sum_probs=33.5
Q ss_pred cchhHHHHHHHhc---CcchhhcccccHHHHHHHhCC-CCCCCcHHHHHHHHHHHhhHH
Q 020402 77 LDLHRLFVEVTSR---GGLGKVIRDRRWKEVVVVFNF-PTTITSASFVLRKYYLSLLYH 131 (326)
Q Consensus 77 lDL~~Ly~~V~~r---GG~~~V~~~~~W~eVa~~l~~-p~~~~~as~~Lk~~Y~k~L~~ 131 (326)
|+|..-|..|..- ++.....+...|.+|+..|+- ++. .--..+|++.|..+...
T Consensus 14 v~~v~~~~~il~~k~~~~~~~~~k~~~W~~I~~~lN~~~~~-~Rs~~~lkkkW~nlk~~ 71 (78)
T PF13873_consen 14 VELVEKHKDILENKFSDSVSNKEKRKAWEEIAEELNALGPG-KRSWKQLKKKWKNLKSK 71 (78)
T ss_pred HHHHHHhHHHHhcccccHHHHHHHHHHHHHHHHHHHhcCCC-CCCHHHHHHHHHHHHHH
Confidence 4455555555543 222333456689999999963 333 44445899999887653
No 43
>PF12776 Myb_DNA-bind_3: Myb/SANT-like DNA-binding domain; InterPro: IPR024752 This domain, found in a range of uncharacterised proteins, may be related to Myb/SANT-like DNA binding domains.
Probab=37.92 E-value=64 Score=24.83 Aligned_cols=61 Identities=18% Similarity=0.290 Sum_probs=42.5
Q ss_pred hhHHHHHHHhcCcc--hhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhhHHhhhhhhhc
Q 020402 79 LHRLFVEVTSRGGL--GKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLLYHFEQVYYFR 139 (326)
Q Consensus 79 L~~Ly~~V~~rGG~--~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L~~fE~~~~~~ 139 (326)
|..|+.+....|.. ...-....|..|+.+|.-.....-...+|+..|..+=..|..+....
T Consensus 10 ll~~~~e~~~~g~~~~~~~fk~~~w~~i~~~~~~~~~~~~t~~qlknk~~~lk~~y~~~~~l~ 72 (96)
T PF12776_consen 10 LLDLLIEQINKGNRPTNGGFKKEGWNNIAEEFNEKTGLNYTKKQLKNKWKTLKKDYRIWKELR 72 (96)
T ss_pred HHHHHHHHHHhCCCCCCCCcCHHHHHHHHHHHHHHhCCcccHHHHHHHHHHHHHHHHHHHHHH
Confidence 44555666667777 34445558999999998644433334589999999999988876554
No 44
>PF12650 DUF3784: Domain of unknown function (DUF3784); InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=37.91 E-value=23 Score=28.09 Aligned_cols=17 Identities=24% Similarity=0.499 Sum_probs=14.2
Q ss_pred HhccCCHHHHHHHHHHH
Q 020402 286 LWSNLTEAEKQVYQEKG 302 (326)
Q Consensus 286 ~Wk~Ls~eeK~~Y~e~A 302 (326)
-||.||+|||+.|.++.
T Consensus 25 Gyntms~eEk~~~D~~~ 41 (97)
T PF12650_consen 25 GYNTMSKEEKEKYDKKK 41 (97)
T ss_pred hcccCCHHHHHHhhHHH
Confidence 48999999999997644
No 45
>smart00717 SANT SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains.
Probab=37.39 E-value=67 Score=20.65 Aligned_cols=26 Identities=15% Similarity=0.328 Sum_probs=18.5
Q ss_pred ccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402 99 RRWKEVVVVFNFPTTITSASFVLRKYYLSLL 129 (326)
Q Consensus 99 ~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L 129 (326)
..|..|+..|+ +-....++..|..++
T Consensus 22 ~~w~~Ia~~~~-----~rt~~~~~~~~~~~~ 47 (49)
T smart00717 22 NNWEKIAKELP-----GRTAEQCRERWNNLL 47 (49)
T ss_pred CCHHHHHHHcC-----CCCHHHHHHHHHHHc
Confidence 57999999997 122337788887765
No 46
>PF13875 DUF4202: Domain of unknown function (DUF4202)
Probab=36.56 E-value=54 Score=29.74 Aligned_cols=40 Identities=10% Similarity=0.274 Sum_probs=33.7
Q ss_pred ChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHH
Q 020402 255 SGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQV 297 (326)
Q Consensus 255 SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~ 297 (326)
-+-++|+..+.+.+...| + ..+..++...|+.||++-++.
T Consensus 130 vacLVFL~~~f~~F~~~~---deeK~v~Il~KTw~KMS~~g~~~ 170 (185)
T PF13875_consen 130 VACLVFLEYYFEDFAAKH---DEEKIVDILRKTWRKMSERGHEA 170 (185)
T ss_pred hHHHHhHHHHHHHHHhcC---CHHHHHHHHHHHHHHCCHHHHHH
Confidence 357889999999998888 4 778899999999999988754
No 47
>PF10545 MADF_DNA_bdg: Alcohol dehydrogenase transcription factor Myb/SANT-like; InterPro: IPR006578 The MADF (myb/SANT-like domain in Adf-1) domain is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. MADF is related to the Myb DNA-binding domain (IPR001005 from INTERPRO). The retroviral oncogene v-myb, and its cellular counterpart c-myb, are nuclear DNA-binding proteins that specifically recognise the sequence YAAC(G/T)G. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Some proteins known to contain a MADF domain are listed below: Drosophila Adf-1, a transcription factor first identified on the basis of its interaction with the alcohol dehydrogenase promoter but that binds the promoters of a diverse group of genes. Drosophila Dorsal-interacting protein 3 (Dip3), which functions both as an activator to bind DNA in a sequence specific manner and a coactivator to stimulate synergistic activation by Dorsal and Twist. Drosophila Stonewall (Stwl), a putative transcription factor required for maintenance of female germline stem cells as well as oocyte differentiation.
Probab=35.79 E-value=37 Score=25.14 Aligned_cols=38 Identities=21% Similarity=0.377 Sum_probs=23.2
Q ss_pred cccccHHHHHHHhC--CCCC-CCcHHHHHHHHHHHhhHHhh
Q 020402 96 IRDRRWKEVVVVFN--FPTT-ITSASFVLRKYYLSLLYHFE 133 (326)
Q Consensus 96 ~~~~~W~eVa~~l~--~p~~-~~~as~~Lk~~Y~k~L~~fE 133 (326)
.+.+.|.+|+..|| ++.. +...-..||..|.+.+...+
T Consensus 24 ~r~~aw~~Ia~~l~~~~~~~~~~~~w~~Lr~~y~~~~~~~~ 64 (85)
T PF10545_consen 24 LREEAWQEIARELGKEFSVDDCKKRWKNLRDRYRRELKKIK 64 (85)
T ss_pred HHHHHHHHHHHHHccchhHHHHHHHHHHHHHHHHHHHHHHh
Confidence 45678999999998 4422 22333345555555555554
No 48
>PF02337 Gag_p10: Retroviral GAG p10 protein; InterPro: IPR003322 Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different. This entry represents matrix proteins from beta-retroviruses such as Mason-Pfizer monkey virus (MPMV) (Simian Mason-Pfizer virus) and Mouse mammary tumor virus (MMTV). This entry also identifies matrix proteins from several eukaryotic endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome.; GO: 0005198 structural molecule activity, 0019028 viral capsid; PDB: 2F77_X 2F76_X.
Probab=33.08 E-value=1.4e+02 Score=23.82 Aligned_cols=54 Identities=19% Similarity=0.103 Sum_probs=36.8
Q ss_pred HHHHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcCcchhhcc---cccHHHHHHHhCC
Q 020402 50 DLFWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRGGLGKVIR---DRRWKEVVVVFNF 110 (326)
Q Consensus 50 ~~F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~---~~~W~eVa~~l~~ 110 (326)
+.|+..|+.++..+|..+ +.=||-.+|..+.+..=+-.+.. -..|..|++.|.-
T Consensus 8 ~~fv~~Lk~lLk~rGi~v-------~~~~L~~f~~~i~~~~PWF~~eG~l~~~~W~kvG~~l~~ 64 (90)
T PF02337_consen 8 QPFVSILKHLLKERGIRV-------KKKDLINFLSFIDKVCPWFPEEGTLDLDNWKKVGEELKR 64 (90)
T ss_dssp HHHHHHHHHHHHCCT-----------HHHHHHHHHHHHHHTT-SS--SS-HHHHHHHHHHHHHH
T ss_pred hHHHHHHHHHHHHcCeee-------cHHHHHHHHHHHHHhCCCCCCCCCcCHHHHHHHHHHHHH
Confidence 789999999999999998 44577888888876554444333 3589999998843
No 49
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=32.70 E-value=1e+02 Score=25.67 Aligned_cols=41 Identities=17% Similarity=0.152 Sum_probs=35.9
Q ss_pred HHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402 279 ISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS 319 (326)
Q Consensus 279 isk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~ 319 (326)
-.+.+-+.|+.|++++++......+...+-|.+-+++|-.+
T Consensus 88 ~~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~ 128 (135)
T PRK09706 88 DQKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKA 128 (135)
T ss_pred HHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 35778889999999999999999999999999988888654
No 50
>smart00595 MADF subfamily of SANT domain.
Probab=31.22 E-value=34 Score=26.15 Aligned_cols=42 Identities=17% Similarity=0.236 Sum_probs=28.8
Q ss_pred hhcccccHHHHHHHhCCCCC-CCcHHHHHHHHHHHhhHHhhhh
Q 020402 94 KVIRDRRWKEVVVVFNFPTT-ITSASFVLRKYYLSLLYHFEQV 135 (326)
Q Consensus 94 ~V~~~~~W~eVa~~l~~p~~-~~~as~~Lk~~Y~k~L~~fE~~ 135 (326)
...+...|.+|+..||.+.. |..-=..||..|.+.+......
T Consensus 23 ~~~r~~aW~~Ia~~l~~~~~~~~~kw~~LR~~y~~e~~r~~~~ 65 (89)
T smart00595 23 KEEKRKAWEEIAEELGLSVEECKKRWKNLRDRYRRELKRLQNG 65 (89)
T ss_pred hHHHHHHHHHHHHHHCcCHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence 34456699999999999543 3333446788888877766553
No 51
>PF05066 HARE-HTH: HB1, ASXL, restriction endonuclease HTH domain; InterPro: IPR007759 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. The delta protein is a dispensable subunit of Bacillus subtilis RNA polymerase (RNAP) that has major effects on the biochemical properties of the purified enzyme. In the presence of delta, RNAP displays an increased specificity of transcription, a decreased affinity for nucleic acids, and an increased efficiency of RNA synthesis because of enhanced recycling. The delta protein contains two distinct regions, an N-terminal domain and a glutamate and aspartate residue-rich C-terminal region.; GO: 0003677 DNA binding, 0006351 transcription, DNA-dependent; PDB: 2KRC_A.
Probab=30.45 E-value=83 Score=23.28 Aligned_cols=43 Identities=14% Similarity=0.228 Sum_probs=25.5
Q ss_pred HHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHH
Q 020402 52 FWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVV 105 (326)
Q Consensus 52 F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa 105 (326)
|++...+-+++.| +++....|+..|.++|++... ...-|..|+
T Consensus 3 ~~eaa~~vL~~~~----------~pm~~~eI~~~i~~~~~~~~~-~k~p~~~i~ 45 (72)
T PF05066_consen 3 FKEAAYEVLEEAG----------RPMTFKEIWEEIQERGLYKKS-GKTPEATIA 45 (72)
T ss_dssp HHHHHHHHHHHH-----------S-EEHHHHHHHHHHHHTS----GGGGGHHHH
T ss_pred HHHHHHHHHHhcC----------CCcCHHHHHHHHHHhCCCCcc-cCCHHHHHH
Confidence 4455555555544 558899999999999999987 223344443
No 52
>PRK10236 hypothetical protein; Provisional
Probab=29.88 E-value=51 Score=31.00 Aligned_cols=45 Identities=20% Similarity=0.312 Sum_probs=30.2
Q ss_pred HHHHHHHHHHHhcccCCC-C-----h-hHHHHHHHHHhccCCHHHHHHHHHH
Q 020402 257 YNFFFAEHYARLKPHYYG-Q-----E-KAISKKIGVLWSNLTEAEKQVYQEK 301 (326)
Q Consensus 257 Y~lF~~e~r~~lk~~~p~-~-----~-~eisk~ige~Wk~Ls~eeK~~Y~e~ 301 (326)
|-=-..+....+|-.+.. . + .-+.+.+.+.|+.|+++|++.+.+.
T Consensus 89 YreIL~DVc~~LKV~y~~~~st~~iE~~il~kll~~a~~kms~eE~~~L~~~ 140 (237)
T PRK10236 89 YRAILLDVSKRLKLKADKEMSTFEIEQQLLEQFLRNTWKKMDEEHKQEFLHA 140 (237)
T ss_pred HHHHHHHHHHHcCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHH
Confidence 333444555555554333 1 2 4478999999999999999988653
No 53
>PF04967 HTH_10: HTH DNA binding domain; InterPro: IPR007050 Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. This entry represents the HTH DNA binding domain found in Halobacterium salinarium (Halobacterium halobium) and described as a putative bacterio-opsin activator.
Probab=28.19 E-value=77 Score=22.77 Aligned_cols=40 Identities=20% Similarity=0.130 Sum_probs=33.0
Q ss_pred hcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402 88 SRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL 129 (326)
Q Consensus 88 ~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L 129 (326)
-..||-.+-++-.=.+||..||++. +.++..||+.-.+++
T Consensus 13 ~~~GYfd~PR~~tl~elA~~lgis~--st~~~~LRrae~kli 52 (53)
T PF04967_consen 13 YELGYFDVPRRITLEELAEELGISK--STVSEHLRRAERKLI 52 (53)
T ss_pred HHcCCCCCCCcCCHHHHHHHhCCCH--HHHHHHHHHHHHHHh
Confidence 3568888888889999999999995 467788998877765
No 54
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptor for TRAP family transporters. Others are the SnoM family (see pfam03480) and the TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=26.97 E-value=1.1e+02 Score=28.19 Aligned_cols=28 Identities=18% Similarity=0.312 Sum_probs=21.3
Q ss_pred HHHhccCCHHHHHHHHHHHHHHHHHHHH
Q 020402 284 GVLWSNLTEAEKQVYQEKGLKDKERYKS 311 (326)
Q Consensus 284 ge~Wk~Ls~eeK~~Y~e~A~~dkerY~~ 311 (326)
.+.|+.|+++.|+...+.+.+.-+....
T Consensus 213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~~ 240 (257)
T TIGR00787 213 KAFWKSLPPDLQAVVKEAAKEAGEYQRK 240 (257)
T ss_pred HHHHhcCCHHHHHHHHHHHHHHHHHHHH
Confidence 4679999999999998877665444433
No 55
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=26.03 E-value=1e+02 Score=30.27 Aligned_cols=43 Identities=14% Similarity=0.224 Sum_probs=33.4
Q ss_pred hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402 277 KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS 319 (326)
Q Consensus 277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~ 319 (326)
..+.-+-...|..|++++|+...+.+.+..+...+...+.++.
T Consensus 237 ~~~~~~s~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e~~ 279 (332)
T COG1638 237 PLAVLVSKAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELEDE 279 (332)
T ss_pred ceeeEEcHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4455556678999999999999999988877777766666553
No 56
>PRK02363 DNA-directed RNA polymerase subunit delta; Reviewed
Probab=24.52 E-value=60 Score=27.64 Aligned_cols=63 Identities=13% Similarity=0.081 Sum_probs=44.2
Q ss_pred HHHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCC---CCcHHHHHHH
Q 020402 51 LFWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTT---ITSASFVLRK 123 (326)
Q Consensus 51 ~F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~---~~~as~~Lk~ 123 (326)
.+++.-..++..+| +++.++.|+.+|.+..|+..-....+=.++...|.+... ++...+.||.
T Consensus 4 S~idvAy~iL~~~~----------~~m~f~dL~~ev~~~~~~s~e~~~~~iaq~YtdLn~DGRFi~lG~n~WgLr~ 69 (129)
T PRK02363 4 SLIEVAYEILKEKK----------EPMSFYDLVNEIQKYLGKSDEEIRERIAQFYTDLNLDGRFISLGDNKWGLRS 69 (129)
T ss_pred cHHHHHHHHHHHcC----------CcccHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHhccCCeeEcCCCceeccc
Confidence 45566666676654 458899999999999998765555667777777777654 4455555665
No 57
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=24.52 E-value=1.3e+02 Score=26.62 Aligned_cols=32 Identities=19% Similarity=0.276 Sum_probs=24.2
Q ss_pred HHHHHHHHHhccCCHHHHHHHHHHHHHHHHHH
Q 020402 278 AISKKIGVLWSNLTEAEKQVYQEKGLKDKERY 309 (326)
Q Consensus 278 eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY 309 (326)
+..+..-++++-|++|+|..|.+.-++-.++.
T Consensus 118 ~~~~~~~qmy~lLTPEQra~l~~~~e~r~~~~ 149 (162)
T PRK12751 118 EMAKVRNQMYNLLTPEQKEALNKKHQERIEKL 149 (162)
T ss_pred HHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHH
Confidence 34566678889999999999988766654444
No 58
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=23.05 E-value=2e+02 Score=25.52 Aligned_cols=36 Identities=22% Similarity=0.232 Sum_probs=28.3
Q ss_pred HHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHH
Q 020402 278 AISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEM 313 (326)
Q Consensus 278 eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em 313 (326)
+..++.-+.+.-|++|+|..|.+.-.+-.+.+...+
T Consensus 125 ~~~~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~ 160 (170)
T PRK12750 125 KMLEKRHQMLSILTPEQKAKFQELQQERMQECQDKM 160 (170)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence 344556678999999999999998877777776655
No 59
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=22.10 E-value=1.7e+02 Score=26.10 Aligned_cols=39 Identities=18% Similarity=0.298 Sum_probs=30.1
Q ss_pred hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402 277 KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEY 316 (326)
Q Consensus 277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Y 316 (326)
.++.++--++.+-|++|+|..|.+..++-.+++.. +..+
T Consensus 111 Vem~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~~-~~~~ 149 (166)
T PRK10363 111 VEMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLRD-VTQW 149 (166)
T ss_pred HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHH-HHhc
Confidence 56777788999999999999998887777666644 4433
No 60
>cd07268 Glo_EDI_BRP_like_4 This conserved domain belongs to a superfamily including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. This protein family belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.
Probab=22.02 E-value=43 Score=29.34 Aligned_cols=49 Identities=10% Similarity=0.079 Sum_probs=39.6
Q ss_pred chhhhhcHHHHHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcC
Q 020402 42 YEDIAQSSDLFWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRG 90 (326)
Q Consensus 42 ~e~~~~~~~~F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rG 90 (326)
|-++-.......+.++.-+.+.|+-+.-=+|+||+|-||+|..-+.-.|
T Consensus 4 HialR~n~~~~A~~w~~~l~~~G~llSen~INGRPI~l~~L~qPl~~~~ 52 (149)
T cd07268 4 HIALRVNENQTAERWKEGLLQCGELLSENEINGRPIALIKLEKPLQFAG 52 (149)
T ss_pred eEEEeeCCHHHHHHHHHHHHHhchhhhccccCCeeEEEEEcCCCceeCC
Confidence 4444455567888999999999999988899999999999987766444
No 61
>PF13725 tRNA_bind_2: Possible tRNA binding domain; PDB: 2ZPA_B.
Probab=20.94 E-value=59 Score=25.59 Aligned_cols=20 Identities=25% Similarity=0.619 Sum_probs=13.8
Q ss_pred hhhcccccHHHHHHHhCCCC
Q 020402 93 GKVIRDRRWKEVVVVFNFPT 112 (326)
Q Consensus 93 ~~V~~~~~W~eVa~~l~~p~ 112 (326)
.+|.+.+-|.+||+.|+++.
T Consensus 78 ~k~LQ~ksw~~~a~~l~l~g 97 (101)
T PF13725_consen 78 AKGLQGKSWEEVAKELGLPG 97 (101)
T ss_dssp HHHCS---HHHHHHHCT-SS
T ss_pred HHHHCCCCHHHHHHHcCCCC
Confidence 46778899999999999985
No 62
>PLN00131 hypothetical protein; Provisional
Probab=20.43 E-value=3.3e+02 Score=24.28 Aligned_cols=56 Identities=16% Similarity=0.201 Sum_probs=27.1
Q ss_pred CCCCCCCCCCCCC-------CCCCCCC-CCCCC-CCCCCCCCCCCchhhhhcHHHHHHHHHHHHH
Q 020402 6 LNGQKSSATTSNS-------NSNSNNN-NNNNK-ASSYYPPPTAKYEDIAQSSDLFWATLEAFHK 61 (326)
Q Consensus 6 ~~~~~~~~~~~~~-------~~~~~~~-~~~~~-~~~~~p~~~~~~e~~~~~~~~F~~~L~~F~~ 61 (326)
+||++|-+++|-- ++.+.-+ |++.. ---.-|.|....+++...+...-+.|+-.+.
T Consensus 83 iGG~GS~~~~SrrP~~DLNstpqpeldlnqpaaheqepepapplddqdlltkrkrvseelrlllq 147 (218)
T PLN00131 83 IGGGGSDAGPSRRPVLDLNSTPQPELDLNQPAAHEQEPEPAPPLDDQDLLTKRKRVSEELRLLLQ 147 (218)
T ss_pred ecCCCCCCCcCcCCcccCCCCCCcccccCCccccccCCCCCCCCCcHHHHHHHHHHHHHHHHHHH
Confidence 6899998877632 2222222 22221 1122333334445555666666666655443
No 63
>cd05694 S1_Rrp5_repeat_hs2_sc2 S1_Rrp5_repeat_hs2_sc2: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 2 (hs2) and S. cerevisiae S1 repeat 2 (sc2). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=20.35 E-value=73 Score=23.99 Aligned_cols=31 Identities=26% Similarity=0.507 Sum_probs=23.5
Q ss_pred CCCceeeeecCcccCCceEEEeecccccccccc
Q 020402 182 IGCSVSGVIDGKFDNGYLVTVNLGSEQLKGVLY 214 (326)
Q Consensus 182 ~~~~V~g~idg~fd~gy~vtv~~gse~~~g~ly 214 (326)
.|..|.|+|-..-||||+|.+.+ +.+.|.|-
T Consensus 4 ~G~~v~g~V~si~d~G~~v~~g~--~gv~Gfl~ 34 (74)
T cd05694 4 EGMVLSGCVSSVEDHGYILDIGI--PGTTGFLP 34 (74)
T ss_pred CCCEEEEEEEEEeCCEEEEEeCC--CCcEEEEE
Confidence 46679999999999999988743 34677543
Done!