Query 022436
Match_columns 297
No_of_seqs 290 out of 1536
Neff 6.5
Searched_HMMs 46136
Date Fri Mar 29 03:31:48 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/022436.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/022436hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 smart00501 BRIGHT BRIGHT, ARID 99.9 4.1E-27 9E-32 185.0 10.2 91 32-122 1-92 (93)
2 PF01388 ARID: ARID/BRIGHT DNA 99.9 1.1E-25 2.4E-30 176.1 8.5 90 29-118 2-92 (92)
3 KOG2744 DNA-binding proteins B 99.9 9.8E-24 2.1E-28 209.8 10.8 156 23-178 153-340 (512)
4 PTZ00199 high mobility group p 99.9 8.3E-22 1.8E-26 155.4 11.0 82 203-284 9-93 (94)
5 cd01389 MATA_HMG-box MATA_HMG- 99.8 6.6E-19 1.4E-23 133.5 7.8 73 216-288 1-74 (77)
6 cd01388 SOX-TCF_HMG-box SOX-TC 99.8 2.1E-18 4.6E-23 129.2 7.8 69 216-284 1-70 (72)
7 PF00505 HMG_box: HMG (high mo 99.7 1.6E-17 3.4E-22 122.4 9.2 68 217-284 1-69 (69)
8 PF09011 HMG_box_2: HMG-box do 99.7 2.7E-17 5.8E-22 123.5 9.4 71 214-284 1-73 (73)
9 cd01390 HMGB-UBF_HMG-box HMGB- 99.7 4.2E-17 9E-22 119.0 9.1 65 217-281 1-66 (66)
10 smart00398 HMG high mobility g 99.7 6.9E-17 1.5E-21 118.6 9.0 69 216-284 1-70 (70)
11 COG5648 NHP6B Chromatin-associ 99.7 3.2E-17 6.9E-22 144.2 7.8 91 204-294 58-149 (211)
12 KOG0381 HMG box-containing pro 99.7 6.1E-16 1.3E-20 121.5 10.8 75 213-287 17-95 (96)
13 cd00084 HMG-box High Mobility 99.6 2.9E-15 6.3E-20 108.7 9.1 64 217-280 1-65 (66)
14 KOG0527 HMG-box transcription 99.6 5.4E-16 1.2E-20 146.9 6.6 83 210-292 56-139 (331)
15 KOG0526 Nucleosome-binding fac 99.5 8.7E-15 1.9E-19 143.2 7.2 79 204-286 523-602 (615)
16 KOG3248 Transcription factor T 99.3 5.7E-12 1.2E-16 117.6 8.0 72 215-286 190-262 (421)
17 KOG2510 SWI-SNF chromatin-remo 99.2 1.3E-11 2.9E-16 120.2 6.7 94 31-131 291-385 (532)
18 KOG4715 SWI/SNF-related matrix 99.0 6.1E-10 1.3E-14 103.4 8.7 77 210-286 58-135 (410)
19 KOG0528 HMG-box transcription 99.0 3.5E-10 7.6E-15 110.1 3.4 82 210-291 319-401 (511)
20 KOG2746 HMG-box transcription 98.3 5E-07 1.1E-11 91.5 4.5 75 205-279 170-247 (683)
21 PF14887 HMG_box_5: HMG (high 98.0 3.1E-05 6.8E-10 58.4 7.3 72 216-288 3-75 (85)
22 PF04690 YABBY: YABBY protein; 96.9 0.0019 4E-08 56.3 5.6 48 211-258 116-164 (170)
23 PF06382 DUF1074: Protein of u 96.7 0.0038 8.1E-08 54.4 6.0 48 221-272 83-131 (183)
24 COG5648 NHP6B Chromatin-associ 96.2 0.0041 8.9E-08 55.5 3.3 67 215-281 142-209 (211)
25 PF08073 CHDNT: CHDNT (NUC034) 88.5 0.64 1.4E-05 33.1 3.4 39 221-259 13-52 (55)
26 PF04769 MAT_Alpha1: Mating-ty 88.3 1.3 2.9E-05 39.7 6.1 56 210-271 37-93 (201)
27 PF06244 DUF1014: Protein of u 86.4 0.88 1.9E-05 37.6 3.6 48 213-260 68-117 (122)
28 PF00249 Myb_DNA-binding: Myb- 81.8 3.1 6.8E-05 28.0 4.3 38 66-114 11-48 (48)
29 TIGR03481 HpnM hopanoid biosyn 78.6 5.5 0.00012 35.4 5.9 45 242-286 65-111 (198)
30 KOG3223 Uncharacterized conser 73.1 2.9 6.2E-05 37.2 2.5 56 212-270 159-216 (221)
31 PRK15117 ABC transporter perip 67.8 14 0.00031 33.1 5.9 47 240-286 66-115 (211)
32 PF09441 Abp2: ARS binding pro 64.1 15 0.00032 31.9 4.9 42 53-98 44-85 (175)
33 PF12881 NUT_N: NUT protein N 57.0 27 0.00058 33.4 5.8 63 222-284 230-294 (328)
34 PF11304 DUF3106: Protein of u 56.4 45 0.00098 26.7 6.3 42 244-285 10-58 (107)
35 cd00167 SANT 'SWI3, ADA2, N-Co 56.1 24 0.00051 22.1 3.9 37 66-114 9-45 (45)
36 PF13921 Myb_DNA-bind_6: Myb-l 56.0 23 0.00051 24.5 4.2 35 67-114 9-43 (60)
37 PF05494 Tol_Tol_Ttg2: Toluene 50.4 18 0.00039 30.8 3.4 44 242-285 39-84 (170)
38 PF13873 Myb_DNA-bind_5: Myb/S 45.0 26 0.00056 25.7 3.1 55 62-116 14-71 (78)
39 PF13875 DUF4202: Domain of un 44.9 47 0.001 29.5 5.1 41 222-264 130-170 (185)
40 TIGR01557 myb_SHAQKYF myb-like 41.7 36 0.00077 24.2 3.2 42 63-115 9-55 (57)
41 TIGR01624 LRP1_Cterm LRP1 C-te 40.6 19 0.00041 25.0 1.5 19 155-173 30-48 (50)
42 smart00717 SANT SANT SWI3, AD 38.5 66 0.0014 20.2 4.0 26 84-114 22-47 (49)
43 PF05142 DUF702: Domain of unk 34.9 22 0.00048 30.5 1.5 21 155-175 131-151 (154)
44 PF04967 HTH_10: HTH DNA bindi 33.8 40 0.00087 23.7 2.4 40 73-114 13-52 (53)
45 PF10545 MADF_DNA_bdg: Alcohol 33.7 37 0.00081 24.6 2.4 38 81-118 24-64 (85)
46 PF12776 Myb_DNA-bind_3: Myb/S 33.7 64 0.0014 24.3 3.8 59 64-122 10-70 (96)
47 PRK10236 hypothetical protein; 33.3 43 0.00093 30.8 3.1 24 245-268 117-140 (237)
48 PRK12750 cpxP periplasmic repr 31.5 1.6E+02 0.0035 25.5 6.3 36 245-280 125-160 (170)
49 PRK10363 cpxP periplasmic repr 30.3 1.6E+02 0.0035 25.7 6.0 33 245-277 112-144 (166)
50 COG2854 Ttg2D ABC-type transpo 29.0 69 0.0015 28.8 3.6 41 247-287 77-118 (202)
51 PF08914 Myb_DNA-bind_2: Rap1 28.1 77 0.0017 23.1 3.2 51 66-121 12-64 (65)
52 PRK12751 cpxP periplasmic stre 27.1 1.6E+02 0.0034 25.5 5.4 30 246-275 119-148 (162)
53 PRK09706 transcriptional repre 26.5 1.8E+02 0.0039 23.6 5.6 43 247-289 89-131 (135)
54 KOG1610 Corticosteroid 11-beta 26.3 1.6E+02 0.0035 28.4 5.8 59 223-284 184-256 (322)
55 PLN03212 Transcription repress 25.6 94 0.002 28.9 3.9 39 67-116 36-74 (249)
56 PF12650 DUF3784: Domain of un 24.9 50 0.0011 25.5 1.8 17 252-268 24-40 (97)
57 PF05066 HARE-HTH: HB1, ASXL, 24.5 1.1E+02 0.0025 22.0 3.6 43 37-90 3-45 (72)
58 TIGR00787 dctP tripartite ATP- 23.5 1.6E+02 0.0035 26.6 5.1 28 251-278 213-240 (257)
59 PF13725 tRNA_bind_2: Possible 23.2 47 0.001 25.6 1.3 21 78-98 78-98 (101)
60 PF11860 DUF3380: Protein of u 22.1 1.2E+02 0.0025 26.7 3.7 44 70-113 130-174 (175)
61 PF06945 DUF1289: Protein of u 22.0 83 0.0018 21.7 2.2 20 252-271 28-47 (51)
62 KOG3838 Mannose lectin ERGIC-5 21.8 82 0.0018 31.3 2.9 38 256-293 268-305 (497)
63 PF00226 DnaJ: DnaJ domain; I 21.5 1.4E+02 0.0031 20.6 3.5 35 229-263 19-60 (64)
64 PRK02363 DNA-directed RNA poly 21.3 67 0.0014 26.8 1.9 64 35-108 3-69 (129)
65 COG1638 DctP TRAP-type C4-dica 21.0 1.6E+02 0.0036 28.2 4.9 38 249-286 242-279 (332)
66 PF02337 Gag_p10: Retroviral G 20.9 1.5E+02 0.0033 23.2 3.8 55 34-95 7-64 (90)
67 PTZ00100 DnaJ chaperone protei 20.9 3.4E+02 0.0073 22.2 6.0 76 35-117 18-93 (116)
68 PF01352 KRAB: KRAB box; Inte 20.1 1.3E+02 0.0027 19.9 2.7 26 245-270 5-31 (41)
No 1
>smart00501 BRIGHT BRIGHT, ARID (A/T-rich interaction domain) domain. DNA-binding domain containing a helix-turn-helix structure
Probab=99.94 E-value=4.1e-27 Score=185.00 Aligned_cols=91 Identities=41% Similarity=0.679 Sum_probs=87.2
Q ss_pred CChhHHHHHHHHHHhhcCCCC-CCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHH
Q 022436 32 KDPIVFWDTLRRFHFIMGTKF-MIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHY 110 (297)
Q Consensus 32 ~~~~~F~~~L~~f~~~~G~~~-~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y 110 (297)
++++.|+++|.+||+.+|+++ .+|+|+|++||||+||.+|+++|||++||.+++|.+|++.||+++++++++..|+.+|
T Consensus 1 ~~~~~F~~~L~~F~~~~g~~~~~~P~i~g~~vdL~~Ly~~V~~~GG~~~v~~~~~W~~Va~~lg~~~~~~~~~~~lk~~Y 80 (93)
T smart00501 1 RERVLFLDRLYKFMEERGSPLKKIPVIGGKPLDLYRLYRLVQERGGYDQVTKDKKWKEIARELGIPDTSTSAASSLRKHY 80 (93)
T ss_pred CcHHHHHHHHHHHHHHcCCcCCcCCeECCEeCcHHHHHHHHHHccCHHHHcCCCCHHHHHHHhCCCcccchHHHHHHHHH
Confidence 478999999999999999998 7999999999999999999999999999999999999999999998999999999999
Q ss_pred HHhhHHHHhhhh
Q 022436 111 LTLLYHYEQVHF 122 (297)
Q Consensus 111 ~k~L~~yE~~~~ 122 (297)
.+||++||+++.
T Consensus 81 ~k~L~~yE~~~~ 92 (93)
T smart00501 81 ERYLLPFERFLR 92 (93)
T ss_pred HHHhHHHHHHhh
Confidence 999999999753
No 2
>PF01388 ARID: ARID/BRIGHT DNA binding domain; InterPro: IPR001606 Members of the recently discovered ARID (AT-rich interaction domain; also known as BRIGHT domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression.
Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.; GO: 0003677 DNA binding, 0005622 intracellular; PDB: 1C20_A 1KQQ_A 2JRZ_A 2LM1_A 2YQE_A 2JXJ_A 2EH9_A 2CXY_A 2LI6_A 1KN5_A ....
Probab=99.92 E-value=1.1e-25 Score=176.05 Aligned_cols=90 Identities=43% Similarity=0.721 Sum_probs=83.0
Q ss_pred ccCCChhHHHHHHHHHHhhcCCCC-CCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHH
Q 022436 29 DVSKDPIVFWDTLRRFHFIMGTKF-MIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLR 107 (297)
Q Consensus 29 ~~~~~~~~F~~~L~~f~~~~G~~~-~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk 107 (297)
....+++.|++.|.+||+.+|+++ .+|.|+|++||||+||.+|+++|||++|+.+++|.+||+.||+++.+++.+..|+
T Consensus 2 ~~~~~~~~F~~~L~~f~~~~g~~~~~~P~i~g~~vDL~~Ly~~V~~~GG~~~V~~~~~W~~va~~lg~~~~~~~~~~~L~ 81 (92)
T PF01388_consen 2 ANTREREQFLEQLREFHESRGTPIDRPPVIGGKPVDLYKLYKAVMKRGGFDKVTKNKKWREVARKLGFPPSSTSAAQQLR 81 (92)
T ss_dssp SSCHHHHHHHHHHHHHHHHTTSSSSS-SEETTSE-SHHHHHHHHHHHTSHHHHHHHTTHHHHHHHTTS-TTSCHHHHHHH
T ss_pred CcchHHHHHHHHHHHHHHHcCCCCCCCCcCCCEeCcHHHHHHHHHhCcCcccCcccchHHHHHHHhCCCCCCCcHHHHHH
Confidence 345789999999999999999997 7999999999999999999999999999999999999999999998888889999
Q ss_pred HHHHHhhHHHH
Q 022436 108 KHYLTLLYHYE 118 (297)
Q Consensus 108 ~~Y~k~L~~yE 118 (297)
++|++||++||
T Consensus 82 ~~Y~~~L~~fE 92 (92)
T PF01388_consen 82 QHYEKYLLPFE 92 (92)
T ss_dssp HHHHHHTHHHH
T ss_pred HHHHHHhHhhC
Confidence 99999999998
No 3
>KOG2744 consensus DNA-binding proteins Bright/BRCAA1/RBP1 and related proteins containing BRIGHT domain [Transcription]
Probab=99.90 E-value=9.8e-24 Score=209.76 Aligned_cols=156 Identities=40% Similarity=0.541 Sum_probs=127.6
Q ss_pred CCCCCCccCCChhHHHHHHHHHHhhcCCCCC-CCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCC-CCC
Q 022436 23 PLSSHEDVSKDPIVFWDTLRRFHFIMGTKFM-IPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSP-TTT 100 (297)
Q Consensus 23 ~~~~~e~~~~~~~~F~~~L~~f~~~~G~~~~-~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~-~~t 100 (297)
+....|.+..+++.||+.|+.||..+|++|+ +|+|+|++||||.||.+|+++||+++|+..++|++|+..|+||. ++|
T Consensus 153 ~~~~~e~~~~~~eeF~~dl~~f~~~~~~~~~~iPii~~~~ldL~~Ly~lV~s~GG~~~V~~~k~Wrev~~~l~~pt~tiT 232 (512)
T KOG2744|consen 153 PLYETEGVPKSSEEFMEDLRRFMKKRGTKVKSIPIIGGQPLDLHWLYALVTSRGGLDEVTNKKLWREVIDGLNFPTPTIT 232 (512)
T ss_pred cccccccccccHHHHHHHHHHHHHHhCCcceeccccCCCcchHHHHHHHHhcCCchhHhhhhhhHHHHhccccCCCcccc
Confidence 6666777778999999999999999999996 99999999999999999999999999999999999999999999 999
Q ss_pred cHHHHHHHHHHHhhHHHHhhhhhccCCCCCCCCcccc----ccccc----------cccCC------CCCcce---E---
Q 022436 101 SASFVLRKHYLTLLYHYEQVHFFKMQGPPCVPSALRA----GLAWL----------LWNIP------RKGLMI---I--- 154 (297)
Q Consensus 101 ~~s~~Lk~~Y~k~L~~yE~~~~~~~~~~~~~~~~~~~----~~~~p----------~~~~~------~~~~~~---i--- 154 (297)
++++.||++|+++|++|||.+++....+...|..... +-... ..+.+ +....+ |
T Consensus 233 saaf~lr~~y~K~L~~ye~~~~~~~~~pln~p~~~~~~a~~~~~rE~~~~~~~~~~~~~~~~~~~~~~~~~~aa~~~~g~ 312 (512)
T KOG2744|consen 233 SAAFTLRKQYLKLLFEYECEFEKNRHVPLNSPAELSEEASSSNRREGRRHELSPSKEFQANGPSEEEPAEAEAAPEILGN 312 (512)
T ss_pred hHHHHHHHHHHHHHHHHHHHHHHhccCCCCCcccccccccccccccccccccCcchhhccCCcccccccccccchhhhcc
Confidence 9999999999999999999999886655555442211 10000 00000 001111 2
Q ss_pred ----EEEEeeeccccccccccCCCCCCC
Q 022436 155 ----RIHILKLGSETLSGVLYHPDHPGP 178 (297)
Q Consensus 155 ----ylvtv~~gse~~~gvly~~~~~~~ 178 (297)
|++++.+|++.++|++||.++...
T Consensus 313 f~~~~~~~~~~~s~~ln~~~~~~~~~~~ 340 (512)
T KOG2744|consen 313 FLQGLLVFMKDGSEPLNGVLYLGPPDLN 340 (512)
T ss_pred ccccCceeccCcchhccCccccccCccc
Confidence 999999999999999999977654
No 4
>PTZ00199 high mobility group protein; Provisional
Probab=99.87 E-value=8.3e-22 Score=155.43 Aligned_cols=82 Identities=35% Similarity=0.603 Sum_probs=76.6
Q ss_pred cccccccCCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-C--HHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHH
Q 022436 203 GRRRRSKRRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-R--EREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRE 279 (297)
Q Consensus 203 ~kkk~~~~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~--~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e 279 (297)
.+++++++.+||++||||+|||+||+.++|..|+.+||+ . +.+|+++||++|+.|+++||.+|+++|..|+++|.+|
T Consensus 9 ~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~~e 88 (94)
T PTZ00199 9 LVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYEKE 88 (94)
T ss_pred cccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHH
Confidence 345555678999999999999999999999999999999 4 8999999999999999999999999999999999999
Q ss_pred HHHHH
Q 022436 280 LKEYK 284 (297)
Q Consensus 280 ~~~yk 284 (297)
|.+|+
T Consensus 89 ~~~Y~ 93 (94)
T PTZ00199 89 KAEYA 93 (94)
T ss_pred HHHHh
Confidence 99996
No 5
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.77 E-value=6.6e-19 Score=133.54 Aligned_cols=73 Identities=29% Similarity=0.539 Sum_probs=70.1
Q ss_pred CCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHhcc
Q 022436 216 YPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKERLK 288 (297)
Q Consensus 216 ~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~~~ 288 (297)
+||||+|||+||+++.|..++.+||+ ++.+|+++||++|+.|++++|++|+++|++++++|++++++|+-..+
T Consensus 1 ~~kRP~naf~lf~~~~r~~~~~~~p~~~~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yky~p~ 74 (77)
T cd01389 1 KIPRPRNAFILYRQDKHAQLKTENPGLTNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYKYTPR 74 (77)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCcccCC
Confidence 58999999999999999999999999 99999999999999999999999999999999999999999987543
No 6
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.76 E-value=2.1e-18 Score=129.22 Aligned_cols=69 Identities=28% Similarity=0.424 Sum_probs=67.0
Q ss_pred CCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 216 YPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYK 284 (297)
Q Consensus 216 ~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk 284 (297)
+.|||+|||++|++++|.+++.+||+ ++.+|+++||++|+.|++++|++|+++|++++++|++++++|+
T Consensus 1 ~iKrP~naf~~F~~~~r~~~~~~~p~~~~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~ 70 (72)
T cd01388 1 HIKRPMNAFMLFSKRHRRKVLQEYPLKENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYK 70 (72)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCC
Confidence 46899999999999999999999999 9999999999999999999999999999999999999999986
No 7
>PF00505 HMG_box: HMG (high mobility group) box; InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.73 E-value=1.6e-17 Score=122.44 Aligned_cols=68 Identities=38% Similarity=0.708 Sum_probs=65.0
Q ss_pred CCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 217 PKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYK 284 (297)
Q Consensus 217 PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk 284 (297)
||||+|||++|+.+++..++.+||+ +..+|+++|+++|+.|+++||++|.+.|++++++|.++|++|+
T Consensus 1 PkrP~~af~lf~~~~~~~~k~~~p~~~~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~ 69 (69)
T PF00505_consen 1 PKRPPNAFMLFCKEKRAKLKEENPDLSNKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK 69 (69)
T ss_dssp SSSS--HHHHHHHHHHHHHHHHSTTSTHHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCHHHHHHHHHHHHHHHHhcccccccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 8999999999999999999999999 9999999999999999999999999999999999999999996
No 8
>PF09011 HMG_box_2: HMG-box domain; InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.72 E-value=2.7e-17 Score=123.55 Aligned_cols=71 Identities=41% Similarity=0.645 Sum_probs=62.9
Q ss_pred CCCCCCCCChHHHHHHHHHHHHHhh-CCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 214 PSYPKPNRSGYNFFFAEKHYKLKSL-YPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYK 284 (297)
Q Consensus 214 p~~PKrP~saY~lF~~e~r~~lk~~-~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk 284 (297)
|++||+|+|||+||+.+++..++.. .+. ...++++.|+++|++||++||.+|+++|+.++++|..+|..|+
T Consensus 1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~ 73 (73)
T PF09011_consen 1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQSFREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN 73 (73)
T ss_dssp SSS--SSSSHHHHHHHHHHHHHHHHT-T-SSHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred CcCCCCCCCHHHHHHHHHHHHHHHhcccCCCHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 6789999999999999999999988 556 8999999999999999999999999999999999999999985
No 9
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.71 E-value=4.2e-17 Score=119.02 Aligned_cols=65 Identities=46% Similarity=0.720 Sum_probs=63.4
Q ss_pred CCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHH
Q 022436 217 PKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELK 281 (297)
Q Consensus 217 PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~ 281 (297)
||+|+|||++|++++|..++.+||+ ++.+|++.||++|++|+++||++|.++|++++++|..+|.
T Consensus 1 Pkrp~saf~~f~~~~r~~~~~~~p~~~~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~~ 66 (66)
T cd01390 1 PKRPLSAYFLFSQEQRPKLKKENPDASVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEMK 66 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhhC
Confidence 8999999999999999999999999 9999999999999999999999999999999999999873
No 10
>smart00398 HMG high mobility group.
Probab=99.70 E-value=6.9e-17 Score=118.64 Aligned_cols=69 Identities=41% Similarity=0.660 Sum_probs=67.1
Q ss_pred CCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 216 YPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYK 284 (297)
Q Consensus 216 ~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk 284 (297)
+||+|+|||++|++++|..++.+||+ +..+|+++||++|+.|++++|++|.++|+.++++|.+++++|+
T Consensus 1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~~~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~ 70 (70)
T smart00398 1 KPKRPMSAFMLFSQENRAKIKAENPDLSNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK 70 (70)
T ss_pred CcCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 58999999999999999999999999 9999999999999999999999999999999999999999984
No 11
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.70 E-value=3.2e-17 Score=144.23 Aligned_cols=91 Identities=31% Similarity=0.545 Sum_probs=84.8
Q ss_pred ccccccCCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 204 RRRRSKRRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKE 282 (297)
Q Consensus 204 kkk~~~~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~ 282 (297)
.+..+++++||+.||||+|||++|+.++|.+++.++|+ .+.++++.+|++|++|+++||++|...|..++++|..++..
T Consensus 58 sk~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l~~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~ 137 (211)
T COG5648 58 SKRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKLTFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEE 137 (211)
T ss_pred HHHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCCChHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHh
Confidence 35667788999999999999999999999999999999 99999999999999999999999999999999999999999
Q ss_pred HHHhcccccccC
Q 022436 283 YKERLKLRQGEG 294 (297)
Q Consensus 283 yk~~~~~~~~~~ 294 (297)
|.++.......+
T Consensus 138 y~~k~~~~~~~~ 149 (211)
T COG5648 138 YNKKLPNKAPIG 149 (211)
T ss_pred hhcccCCCCCCc
Confidence 999887765544
No 12
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.66 E-value=6.1e-16 Score=121.50 Aligned_cols=75 Identities=40% Similarity=0.654 Sum_probs=71.7
Q ss_pred CC--CCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHH-HHHHhc
Q 022436 213 DP--SYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELK-EYKERL 287 (297)
Q Consensus 213 dp--~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~-~yk~~~ 287 (297)
|| ..||||+|||++|+.+.|..++.+||+ +..++++++|++|++|++++|.+|+..|.+++++|..+|. .|+...
T Consensus 17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~ 95 (96)
T KOG0381|consen 17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGLSVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL 95 (96)
T ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence 66 599999999999999999999999999 9999999999999999999999999999999999999999 998754
No 13
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.62 E-value=2.9e-15 Score=108.68 Aligned_cols=64 Identities=41% Similarity=0.676 Sum_probs=62.5
Q ss_pred CCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHH
Q 022436 217 PKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNREL 280 (297)
Q Consensus 217 PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~ 280 (297)
||+|+|||++|+++.|..++.++|+ +..+|+++++++|+.|++++|.+|.+.|+.++++|.+++
T Consensus 1 pkrp~~af~~f~~~~~~~~~~~~~~~~~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~ 65 (66)
T cd00084 1 PKRPLSAYFLFSQEHRAEVKAENPGLSVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 7999999999999999999999999 999999999999999999999999999999999999876
No 14
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.62 E-value=5.4e-16 Score=146.86 Aligned_cols=83 Identities=19% Similarity=0.334 Sum_probs=76.9
Q ss_pred CCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHhcc
Q 022436 210 RRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKERLK 288 (297)
Q Consensus 210 ~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~~~ 288 (297)
.+....+.||||||||+|.+.+|.+|.+++|+ .+.||+|+||.+||.|+|+||.+|.++|++.|+.|++|.++|+-+-+
T Consensus 56 ~k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHNSEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYKYRPR 135 (331)
T KOG0527|consen 56 DKTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHNSEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYKYRPR 135 (331)
T ss_pred CCCCccccCCCcchhhhhhHHHHHHHHHhCcchhhHHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCcccccc
Confidence 45667789999999999999999999999999 99999999999999999999999999999999999999999998765
Q ss_pred cccc
Q 022436 289 LRQG 292 (297)
Q Consensus 289 ~~~~ 292 (297)
.+..
T Consensus 136 RKkk 139 (331)
T KOG0527|consen 136 RKKK 139 (331)
T ss_pred cccc
Confidence 5443
No 15
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.54 E-value=8.7e-15 Score=143.16 Aligned_cols=79 Identities=28% Similarity=0.590 Sum_probs=73.6
Q ss_pred ccccccCCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 204 RRRRSKRRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKE 282 (297)
Q Consensus 204 kkk~~~~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~ 282 (297)
++|+.++.+||++|||++||||+|+...|..||.+ + ++.+|+|++|++|+.|+. |.+|+++|+.||+||+.||.+
T Consensus 523 ~~k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi~~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~~ 598 (615)
T KOG0526|consen 523 KKKKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GISVGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMKE 598 (615)
T ss_pred cccCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--CchHHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHHh
Confidence 33566778999999999999999999999999998 7 999999999999999999 999999999999999999999
Q ss_pred HHHh
Q 022436 283 YKER 286 (297)
Q Consensus 283 yk~~ 286 (297)
|+.-
T Consensus 599 yk~g 602 (615)
T KOG0526|consen 599 YKNG 602 (615)
T ss_pred hcCC
Confidence 9943
No 16
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.30 E-value=5.7e-12 Score=117.60 Aligned_cols=72 Identities=17% Similarity=0.303 Sum_probs=67.8
Q ss_pred CCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 022436 215 SYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKER 286 (297)
Q Consensus 215 ~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~ 286 (297)
.+.|+|+|||++|++|.|++|.++... ...+|.++||.+|.+|+.||.++|.|+|+++|+-|+.-+.+|.++
T Consensus 190 phiKKPLNAFmlyMKEmRa~vvaEctlKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSAR 262 (421)
T KOG3248|consen 190 PHIKKPLNAFMLYMKEMRAKVVAECTLKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSAR 262 (421)
T ss_pred ccccccHHHHHHHHHHHHHHHHHHhhhhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcchh
Confidence 467999999999999999999999988 889999999999999999999999999999999999999888764
No 17
>KOG2510 consensus SWI-SNF chromatin-remodeling complex protein [Chromatin structure and dynamics]
Probab=99.23 E-value=1.3e-11 Score=120.18 Aligned_cols=94 Identities=29% Similarity=0.450 Sum_probs=86.2
Q ss_pred CCChhHHHHHHHHHHhhcCCCC-CCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHH
Q 022436 31 SKDPIVFWDTLRRFHFIMGTKF-MIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKH 109 (297)
Q Consensus 31 ~~~~~~F~~~L~~f~~~~G~~~-~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~ 109 (297)
..+++..+++|+.|++.+.+++ .+|.++.++||||+||..|..+||+.+|++++ +++|.-|| .+++..||++
T Consensus 291 qp~r~~wvDR~raF~ee~~Sp~t~~p~~gakPldl~rlYvsvke~gg~~~v~knk--rd~a~~lg-----ssaa~~l~k~ 363 (532)
T KOG2510|consen 291 QPERKEWVDRLRAFTEERASPMTNLPAVGAKPLDLYRLYVSVKEIGGLTQVNKNK--RDLATNLG-----SSAASSLKKQ 363 (532)
T ss_pred CcchhhHHHHHHHHHHhhcCcccccccccccchhHHHHHHHHHHhccceeeccch--hhhhhccc-----hHHHHHHHHH
Confidence 4688999999999999999999 58999999999999999999999999999998 99999998 5688899999
Q ss_pred HHHhhHHHHhhhhhccCCCCCC
Q 022436 110 YLTLLYHYEQVHFFKMQGPPCV 131 (297)
Q Consensus 110 Y~k~L~~yE~~~~~~~~~~~~~ 131 (297)
|.+||+.|||++-.|++.++..
T Consensus 364 y~~~lf~fec~f~Rg~e~p~~~ 385 (532)
T KOG2510|consen 364 YIQYLFAFECKFERGEEPPPDI 385 (532)
T ss_pred HHHHHHhhceeeeccCCCCHHH
Confidence 9999999999888888776644
No 18
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]
Probab=99.05 E-value=6.1e-10 Score=103.38 Aligned_cols=77 Identities=27% Similarity=0.419 Sum_probs=72.6
Q ss_pred CCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 022436 210 RRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKER 286 (297)
Q Consensus 210 ~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~ 286 (297)
..+.|.+|-+|+-.||.|++..++++++.||+ ...+|.|+||.||..|+++||+.|+..++.+|..|.+.|..|...
T Consensus 58 ~pkpPkppekpl~pymrySrkvWd~VkA~nPe~kLWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~s 135 (410)
T KOG4715|consen 58 RPKPPKPPEKPLMPYMRYSRKVWDQVKASNPELKLWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHNS 135 (410)
T ss_pred CCCCCCCCCcccchhhHHhhhhhhhhhccCcchHHHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence 34567788999999999999999999999999 999999999999999999999999999999999999999999764
No 19
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=98.95 E-value=3.5e-10 Score=110.14 Aligned_cols=82 Identities=13% Similarity=0.329 Sum_probs=73.4
Q ss_pred CCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHhcc
Q 022436 210 RRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKERLK 288 (297)
Q Consensus 210 ~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~~~ 288 (297)
+...+.+.||||||||.|-++.|.++.+..|| ....|+|+||.+||.|+..||++|.|.-.+.-..|.+..++||-+.+
T Consensus 319 ~~ss~PHIKRPMNAFMVWAkDERRKILqA~PDMHNSnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYrYkPR 398 (511)
T KOG0528|consen 319 RASSEPHIKRPMNAFMVWAKDERRKILQAFPDMHNSNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYRYKPR 398 (511)
T ss_pred cCCCCccccCCcchhhcccchhhhhhhhcCccccccchhHHhcccccccccccccchHHHHHHHHHhhhccCcccccCCC
Confidence 34556678999999999999999999999999 89999999999999999999999988777777799999999998766
Q ss_pred ccc
Q 022436 289 LRQ 291 (297)
Q Consensus 289 ~~~ 291 (297)
...
T Consensus 399 PKR 401 (511)
T KOG0528|consen 399 PKR 401 (511)
T ss_pred CCc
Confidence 543
No 20
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.31 E-value=5e-07 Score=91.47 Aligned_cols=75 Identities=23% Similarity=0.454 Sum_probs=68.7
Q ss_pred cccccCCCCCCCCCCCCChHHHHHHHHH--HHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHH
Q 022436 205 RRRSKRRGDPSYPKPNRSGYNFFFAEKH--YKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRE 279 (297)
Q Consensus 205 kk~~~~~~dp~~PKrP~saY~lF~~e~r--~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e 279 (297)
..+...+++..+.++|||||++|++.+| ..+.+.||+ +++.|++++|+.|-.|.+.||+.|.++|.+.|+.|.+.
T Consensus 170 dgrspnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn~DNrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfka 247 (683)
T KOG2746|consen 170 DGRSPNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPNQDNRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFKA 247 (683)
T ss_pred ccCCCCcCcchhhhhhhHHHHHHHhhcCCccchhccCccccchhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhhh
Confidence 3445556777889999999999999999 899999999 99999999999999999999999999999999999886
No 21
>PF14887 HMG_box_5: HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=97.99 E-value=3.1e-05 Score=58.39 Aligned_cols=72 Identities=18% Similarity=0.398 Sum_probs=59.2
Q ss_pred CCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHhcc
Q 022436 216 YPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKERLK 288 (297)
Q Consensus 216 ~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~~~ 288 (297)
-|..|.++--+|.+.....+.+.+++ ...+ .+.+...|++|++.+|.+|...|.+|..+|+.+|.+|++...
T Consensus 3 lPE~PKt~qe~Wqq~vi~dYla~~~~dr~K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~~~ 75 (85)
T PF14887_consen 3 LPETPKTAQEIWQQSVIGDYLAKFRNDRKKA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSAPA 75 (85)
T ss_dssp -S----THHHHHHHHHHHHHHHHTTSTHHHH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-CCC
T ss_pred CCCCCCCHHHHHHHHHHHHHHHHhhHhHHHH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcCCC
Confidence 36778899999999999999999988 4444 568999999999999999999999999999999999987543
No 22
>PF04690 YABBY: YABBY protein; InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=96.89 E-value=0.0019 Score=56.25 Aligned_cols=48 Identities=21% Similarity=0.404 Sum_probs=42.2
Q ss_pred CCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCC
Q 022436 211 RGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLS 258 (297)
Q Consensus 211 ~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls 258 (297)
.+.|++-.|-+|||+.|+++.-++||+.+|+ +.+|.-+..++.|...+
T Consensus 116 ~kPPEKRqR~psaYn~f~k~ei~rik~~~p~ishkeaFs~aAknW~h~p 164 (170)
T PF04690_consen 116 NKPPEKRQRVPSAYNRFMKEEIQRIKAENPDISHKEAFSAAAKNWAHFP 164 (170)
T ss_pred cCCccccCCCchhHHHHHHHHHHHHHhcCCCCCHHHHHHHHHHhhhhCc
Confidence 3445555678999999999999999999999 99999999999998764
No 23
>PF06382 DUF1074: Protein of unknown function (DUF1074); InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=96.72 E-value=0.0038 Score=54.41 Aligned_cols=48 Identities=19% Similarity=0.331 Sum_probs=41.5
Q ss_pred CChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHH
Q 022436 221 RSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKD 272 (297)
Q Consensus 221 ~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~d 272 (297)
.+||+-|+.+.+.+ |.+ ...|+....+.+|..|++++|..|..++...
T Consensus 83 nnaYLNFLReFRrk----h~~L~p~dlI~~AAraW~rLSe~eK~rYrr~~~~~ 131 (183)
T PF06382_consen 83 NNAYLNFLREFRRK----HCGLSPQDLIQRAARAWCRLSEAEKNRYRRMAPSV 131 (183)
T ss_pred chHHHHHHHHHHHH----ccCCCHHHHHHHHHHHHHhCCHHHHHHHHhhcchh
Confidence 57899999998765 566 7799999999999999999999999876543
No 24
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=96.20 E-value=0.0041 Score=55.52 Aligned_cols=67 Identities=22% Similarity=0.203 Sum_probs=59.3
Q ss_pred CCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHH
Q 022436 215 SYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELK 281 (297)
Q Consensus 215 ~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~ 281 (297)
.+++.+.-.|.-+-.+.|+++...+|+ ...++.++++..|++|+++-|.+|.+.++++++.|...++
T Consensus 142 ~~~~~~~~~~~e~~~~~r~~~~~~~~~~~~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~ 209 (211)
T COG5648 142 LPNKAPIGPFIENEPKIRPKVEGPSPDKALVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP 209 (211)
T ss_pred cCCCCCCchhhhccHHhccccCCCCcchhhhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence 345667777777888888888888999 8999999999999999999999999999999999988765
No 25
>PF08073 CHDNT: CHDNT (NUC034) domain; InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases [].; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=88.48 E-value=0.64 Score=33.10 Aligned_cols=39 Identities=13% Similarity=0.284 Sum_probs=34.9
Q ss_pred CChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCCh
Q 022436 221 RSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSP 259 (297)
Q Consensus 221 ~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~ 259 (297)
.+.|-+|.+-.|+.|.+.||+ ....+...++.+|+.-++
T Consensus 13 lt~yK~Fsq~vRP~l~~~NPk~~~sKl~~l~~AKwrEF~~ 52 (55)
T PF08073_consen 13 LTNYKAFSQHVRPLLAKANPKAPMSKLMMLLQAKWREFQE 52 (55)
T ss_pred HHHHHHHHHHHHHHHHHHCCCCcHHHHHHHHHHHHHHHHh
Confidence 456889999999999999999 999999999999987544
No 26
>PF04769 MAT_Alpha1: Mating-type protein MAT alpha 1; InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation []. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway [].; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=88.28 E-value=1.3 Score=39.68 Aligned_cols=56 Identities=14% Similarity=0.353 Sum_probs=40.9
Q ss_pred CCCCCCCCCCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHH
Q 022436 210 RRGDPSYPKPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLK 271 (297)
Q Consensus 210 ~~~dp~~PKrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~ 271 (297)
++....++|||.|+||.|..=.- ...|+ ...+++..|+..|+.=+- |..|.-+|+.
T Consensus 37 ~~~~~~~~kr~lN~Fm~FRsyy~----~~~~~~~Qk~~S~~l~~lW~~dp~--k~~W~l~ak~ 93 (201)
T PF04769_consen 37 RKRSPEKAKRPLNGFMAFRSYYS----PIFPPLPQKELSGILTKLWEKDPF--KNKWSLMAKA 93 (201)
T ss_pred ccccccccccchhHHHHHHHHHH----hhcCCcCHHHHHHHHHHHHhCCcc--HhHHHHHhhh
Confidence 34455678999999999987654 34456 689999999999987333 5666655543
No 27
>PF06244 DUF1014: Protein of unknown function (DUF1014); InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=86.42 E-value=0.88 Score=37.64 Aligned_cols=48 Identities=19% Similarity=0.368 Sum_probs=41.0
Q ss_pred CCCCC-CCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChH
Q 022436 213 DPSYP-KPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPE 260 (297)
Q Consensus 213 dp~~P-KrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~ 260 (297)
...+| ||-.-||.-|.....+.|+.++|+ ....+..+|.++|..-+++
T Consensus 68 ~drHPErR~KAAy~afeE~~Lp~lK~E~PgLrlsQ~kq~l~K~w~KSPeN 117 (122)
T PF06244_consen 68 IDRHPERRMKAAYKAFEERRLPELKEENPGLRLSQYKQMLWKEWQKSPEN 117 (122)
T ss_pred CCCCcchhHHHHHHHHHHHHhHHHHhhCCCchHHHHHHHHHHHHhcCCCC
Confidence 34455 455578999999999999999999 9999999999999987764
No 28
>PF00249 Myb_DNA-binding: Myb-like DNA-binding domain; InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G [, ]. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding [].; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=81.82 E-value=3.1 Score=28.00 Aligned_cols=38 Identities=18% Similarity=0.280 Sum_probs=26.8
Q ss_pred hhHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhh
Q 022436 66 VLYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLL 114 (297)
Q Consensus 66 ~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L 114 (297)
.|...|...|.- .|..||..|+. +-.+.+++.+|.+||
T Consensus 11 ~l~~~v~~~g~~-------~W~~Ia~~~~~----~Rt~~qc~~~~~~~~ 48 (48)
T PF00249_consen 11 KLLEAVKKYGKD-------NWKKIAKRMPG----GRTAKQCRSRYQNLL 48 (48)
T ss_dssp HHHHHHHHSTTT-------HHHHHHHHHSS----SSTHHHHHHHHHHHT
T ss_pred HHHHHHHHhCCc-------HHHHHHHHcCC----CCCHHHHHHHHHhhC
Confidence 455566666643 69999999992 223348999999886
No 29
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.
Probab=78.56 E-value=5.5 Score=35.44 Aligned_cols=45 Identities=27% Similarity=0.522 Sum_probs=39.0
Q ss_pred CHHHHHH-HHHHHhhcCChHHHHHHHHHHHH-HHHHHHHHHHHHHHh
Q 022436 242 REREFTK-MIGESWTNLSPEERKVYQNIGLK-DKERYNRELKEYKER 286 (297)
Q Consensus 242 ~~~eisk-~l~~~Wk~Ls~~eK~~Y~e~a~~-dke~y~~e~~~yk~~ 286 (297)
++..+++ .+|..|+.+++++|+.|.+.... ....|-..+..|...
T Consensus 65 Df~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~~ 111 (198)
T TIGR03481 65 DLPAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAGE 111 (198)
T ss_pred CHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcCc
Confidence 6778866 67999999999999999988877 778899999999763
No 30
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=73.07 E-value=2.9 Score=37.19 Aligned_cols=56 Identities=18% Similarity=0.417 Sum_probs=46.5
Q ss_pred CCCCCC-CCCCChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHH
Q 022436 212 GDPSYP-KPNRSGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGL 270 (297)
Q Consensus 212 ~dp~~P-KrP~saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~ 270 (297)
.|..+| ||=+-||.-|-....+.|+.+||+ ...+.-.+|-.+|..-++. ||.+++.
T Consensus 159 ~ddrHPEkRmrAA~~afEe~~LPrLK~e~P~lrlsQ~Kqll~Kew~KsPDN---P~Nq~~~ 216 (221)
T KOG3223|consen 159 SDDRHPEKRMRAAFKAFEEARLPRLKKENPGLRLSQYKQLLKKEWQKSPDN---PFNQAAV 216 (221)
T ss_pred ccccChHHHHHHHHHHHHHhhchhhhhcCCCccHHHHHHHHHHHHhhCCCC---hhhHHhh
Confidence 344566 666788999999999999999999 9999999999999998886 6665543
No 31
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=67.83 E-value=14 Score=33.12 Aligned_cols=47 Identities=19% Similarity=0.371 Sum_probs=39.4
Q ss_pred CC-CHHHHHH-HHHHHhhcCChHHHHHHHHHHHH-HHHHHHHHHHHHHHh
Q 022436 240 PN-REREFTK-MIGESWTNLSPEERKVYQNIGLK-DKERYNRELKEYKER 286 (297)
Q Consensus 240 p~-~~~eisk-~l~~~Wk~Ls~~eK~~Y~e~a~~-dke~y~~e~~~yk~~ 286 (297)
|. ++..+++ .+|.-|+.+++++|+.|.+.... ...-|-..+.+|...
T Consensus 66 p~~Df~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~q 115 (211)
T PRK15117 66 PYVQVKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHGQ 115 (211)
T ss_pred ccCCHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCCc
Confidence 55 7888876 67999999999999999987766 567899999999763
No 32
>PF09441 Abp2: ARS binding protein 2; InterPro: IPR018562 This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals [].
Probab=64.06 E-value=15 Score=31.91 Aligned_cols=42 Identities=17% Similarity=0.272 Sum_probs=34.8
Q ss_pred CCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCCC
Q 022436 53 MIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSPT 98 (297)
Q Consensus 53 ~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~ 98 (297)
.+|.-+|+..+.|.||..|.++-.- .-+.|.++|-.||..+.
T Consensus 44 ~pPkS~Gk~Fs~~~Lf~LI~k~~~k----eikTW~~La~~LGVepp 85 (175)
T PF09441_consen 44 SPPKSDGKSFSTFTLFELIRKLESK----EIKTWAQLALELGVEPP 85 (175)
T ss_pred CCCCcCCccchHHHHHHHHHHHhhh----hHhHHHHHHHHhCCCCC
Confidence 4799999999999999999876432 23579999999999654
No 33
>PF12881 NUT_N: NUT protein N terminus; InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=56.99 E-value=27 Score=33.43 Aligned_cols=63 Identities=19% Similarity=0.170 Sum_probs=42.6
Q ss_pred ChHHHHHHHHHHHHHhhCCC-CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHH-HHHHHHHH
Q 022436 222 SGYNFFFAEKHYKLKSLYPN-REREFTKMIGESWTNLSPEERKVYQNIGLKDKERY-NRELKEYK 284 (297)
Q Consensus 222 saY~lF~~e~r~~lk~~~p~-~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y-~~e~~~yk 284 (297)
.|+-.|+.-....|....|. ...|-..+.-+.|...|.-||..|.|+|++-.|=- ++||+.-+
T Consensus 230 EAlSCFLIpvLrsLar~kPtMtlEeGl~ra~qEW~~~SnfdRmifyemaekFmEFEaeEEmq~q~ 294 (328)
T PF12881_consen 230 EALSCFLIPVLRSLARLKPTMTLEEGLWRAVQEWQHTSNFDRMIFYEMAEKFMEFEAEEEMQIQK 294 (328)
T ss_pred hhhhhhHHHHHHHHHhcCCCccHHHHHHHHHHHhhccccccHHHHHHHHHHHccCCcHHHHHHHH
Confidence 34444444444445555566 66677777889999999999999999999865422 24555444
No 34
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known.
Probab=56.38 E-value=45 Score=26.70 Aligned_cols=42 Identities=19% Similarity=0.550 Sum_probs=26.6
Q ss_pred HHHHHHHHHHhhcCChHHHHHHHHHHHH-------HHHHHHHHHHHHHH
Q 022436 244 REFTKMIGESWTNLSPEERKVYQNIGLK-------DKERYNRELKEYKE 285 (297)
Q Consensus 244 ~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~-------dke~y~~e~~~yk~ 285 (297)
.++..-+.+.|+.|+++.|..+.+.+.. ++++....|..|..
T Consensus 10 q~~L~pl~~~W~~l~~~qr~k~l~~a~r~~~mspeqq~r~~~rm~~W~~ 58 (107)
T PF11304_consen 10 QQALAPLAERWNSLPPEQRRKWLQIAERWPSMSPEQQQRLRERMRRWAA 58 (107)
T ss_pred HHHHHHHHHHHhcCCHHHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHh
Confidence 4555666677777777777666665533 56666666666655
No 35
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=56.11 E-value=24 Score=22.12 Aligned_cols=37 Identities=16% Similarity=0.310 Sum_probs=23.8
Q ss_pred hhHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhh
Q 022436 66 VLYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLL 114 (297)
Q Consensus 66 ~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L 114 (297)
.|...|...|- ..|..|+..|+.- .+..++.+|..++
T Consensus 9 ~l~~~~~~~g~-------~~w~~Ia~~~~~r-----s~~~~~~~~~~~~ 45 (45)
T cd00167 9 LLLEAVKKYGK-------NNWEKIAKELPGR-----TPKQCRERWRNLL 45 (45)
T ss_pred HHHHHHHHHCc-------CCHHHHHhHcCCC-----CHHHHHHHHHHhC
Confidence 34555555552 5699999999651 2336777776653
No 36
>PF13921 Myb_DNA-bind_6: Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=56.02 E-value=23 Score=24.55 Aligned_cols=35 Identities=14% Similarity=0.302 Sum_probs=22.0
Q ss_pred hHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhh
Q 022436 67 LYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLL 114 (297)
Q Consensus 67 Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L 114 (297)
|...|...|. .|..||..|| . -....++..|..+|
T Consensus 9 L~~~~~~~g~--------~W~~Ia~~l~---~--Rt~~~~~~r~~~~l 43 (60)
T PF13921_consen 9 LLELVKKYGN--------DWKKIAEHLG---N--RTPKQCRNRWRNHL 43 (60)
T ss_dssp HHHHHHHHTS---------HHHHHHHST---T--S-HHHHHHHHHHTT
T ss_pred HHHHHHHHCc--------CHHHHHHHHC---c--CCHHHHHHHHHHHC
Confidence 4455555543 5999999996 1 12236788887766
No 37
>PF05494 Tol_Tol_Ttg2: Toluene tolerance, Ttg2 ; InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation []. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters [].; PDB: 2QGU_A.
Probab=50.41 E-value=18 Score=30.84 Aligned_cols=44 Identities=20% Similarity=0.444 Sum_probs=33.2
Q ss_pred CHHHHHH-HHHHHhhcCChHHHHHHHHHHHH-HHHHHHHHHHHHHH
Q 022436 242 REREFTK-MIGESWTNLSPEERKVYQNIGLK-DKERYNRELKEYKE 285 (297)
Q Consensus 242 ~~~eisk-~l~~~Wk~Ls~~eK~~Y~e~a~~-dke~y~~e~~~yk~ 285 (297)
++..+++ .||.-|+.++++||+.|.+...+ ....|-..+..|..
T Consensus 39 D~~~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~ 84 (170)
T PF05494_consen 39 DFERMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG 84 (170)
T ss_dssp -HHHHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred CHHHHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence 6667755 46889999999999999986665 66778899988875
No 38
>PF13873 Myb_DNA-bind_5: Myb/SANT-like DNA-binding domain
Probab=44.99 E-value=26 Score=25.74 Aligned_cols=55 Identities=13% Similarity=0.131 Sum_probs=33.4
Q ss_pred cchhhhHHHHHhc---CcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhhHH
Q 022436 62 LDLHVLYVEATTR---GGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLLYH 116 (297)
Q Consensus 62 lDL~~Ly~~V~~~---GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L~~ 116 (297)
|+|..-|..|..- ++.........|.+|+..|+--....-....|++.|..+...
T Consensus 14 v~~v~~~~~il~~k~~~~~~~~~k~~~W~~I~~~lN~~~~~~Rs~~~lkkkW~nlk~~ 71 (78)
T PF13873_consen 14 VELVEKHKDILENKFSDSVSNKEKRKAWEEIAEELNALGPGKRSWKQLKKKWKNLKSK 71 (78)
T ss_pred HHHHHHhHHHHhcccccHHHHHHHHHHHHHHHHHHHhcCCCCCCHHHHHHHHHHHHHH
Confidence 3444455554433 233344456799999999974332233445899999887653
No 39
>PF13875 DUF4202: Domain of unknown function (DUF4202)
Probab=44.88 E-value=47 Score=29.46 Aligned_cols=41 Identities=10% Similarity=0.247 Sum_probs=34.6
Q ss_pred ChHHHHHHHHHHHHHhhCCCCHHHHHHHHHHHhhcCChHHHHH
Q 022436 222 SGYNFFFAEKHYKLKSLYPNREREFTKMIGESWTNLSPEERKV 264 (297)
Q Consensus 222 saY~lF~~e~r~~lk~~~p~~~~eisk~l~~~Wk~Ls~~eK~~ 264 (297)
-+.+.|+..+.+.+...|. ...+..++.+.|++||+.-++.
T Consensus 130 vacLVFL~~~f~~F~~~~d--eeK~v~Il~KTw~KMS~~g~~~ 170 (185)
T PF13875_consen 130 VACLVFLEYYFEDFAAKHD--EEKIVDILRKTWRKMSERGHEA 170 (185)
T ss_pred hHHHHhHHHHHHHHHhcCC--HHHHHHHHHHHHHHCCHHHHHH
Confidence 4688999999999998883 5778888999999999987643
No 40
>TIGR01557 myb_SHAQKYF myb-like DNA-binding domain, SHAQKYF class. This model describes a DNA-binding domain restricted to (but common in) plant proteins, many of which also contain a response regulator domain. The domain appears related to the Myb-like DNA-binding domain described by Pfam model pfam00249. It is distinguished in part by a well-conserved motif SH[AL]QKY[RF] at the C-terminal end of the motif.
Probab=41.71 E-value=36 Score=24.23 Aligned_cols=42 Identities=21% Similarity=0.419 Sum_probs=26.0
Q ss_pred chhhhHHHH-HhcCcchhhccccch---HHHHhhhcCCC-CCCcHHHHHHHHHHHhhH
Q 022436 63 DLHVLYVEA-TTRGGYEKVVAEKKW---REVGAVFKFSP-TTTSASFVLRKHYLTLLY 115 (297)
Q Consensus 63 DL~~Ly~~V-~~~GG~~~V~~~~~W---~~Va~~lg~p~-~~t~~s~~Lk~~Y~k~L~ 115 (297)
|++.+|... ...|+ ..| ..|++.|+.+. +.. .++.|+.+|.+
T Consensus 9 eeh~~Fl~ai~~~G~-------g~~a~pk~I~~~~~~~~lT~~----qV~SH~QKy~~ 55 (57)
T TIGR01557 9 DLHDRFLQAVQKLGG-------PDWATPKRILELMVVDGLTRD----QVASHLQKYRL 55 (57)
T ss_pred HHHHHHHHHHHHhCC-------CcccchHHHHHHcCCCCCCHH----HHHHHHHHHHc
Confidence 566666653 33343 248 88999998765 433 56666666643
No 41
>TIGR01624 LRP1_Cterm LRP1 C-terminal domain. This model represents a tightly conserved small domain found in LRP1 and related plant proteins. This family also contains a well-conserved putative zinc finger domain (TIGR01623). The rest of the sequence of most members consists of highly divergent, low-complexity sequence.
Probab=40.63 E-value=19 Score=25.01 Aligned_cols=19 Identities=16% Similarity=0.314 Sum_probs=17.8
Q ss_pred EEEEeeeccccccccccCC
Q 022436 155 RIHILKLGSETLSGVLYHP 173 (297)
Q Consensus 155 ylvtv~~gse~~~gvly~~ 173 (297)
|--+|+||--.++|+||..
T Consensus 30 YQt~V~IgGHvFkGiLyDq 48 (50)
T TIGR01624 30 YQATVTIGGHVFKGFLHDQ 48 (50)
T ss_pred EEEEEEECceEEeeEEecc
Confidence 9999999999999999974
No 42
>smart00717 SANT SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains.
Probab=38.45 E-value=66 Score=20.25 Aligned_cols=26 Identities=15% Similarity=0.382 Sum_probs=18.1
Q ss_pred cchHHHHhhhcCCCCCCcHHHHHHHHHHHhh
Q 022436 84 KKWREVGAVFKFSPTTTSASFVLRKHYLTLL 114 (297)
Q Consensus 84 ~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L 114 (297)
..|..|+..|+ +- .+..++..|..++
T Consensus 22 ~~w~~Ia~~~~---~r--t~~~~~~~~~~~~ 47 (49)
T smart00717 22 NNWEKIAKELP---GR--TAEQCRERWNNLL 47 (49)
T ss_pred CCHHHHHHHcC---CC--CHHHHHHHHHHHc
Confidence 57999999997 21 2236777777665
No 43
>PF05142 DUF702: Domain of unknown function (DUF702) ; InterPro: IPR007818 This is a family of plant proteins of unknown function.
Probab=34.93 E-value=22 Score=30.53 Aligned_cols=21 Identities=19% Similarity=0.490 Sum_probs=19.2
Q ss_pred EEEEeeeccccccccccCCCC
Q 022436 155 RIHILKLGSETLSGVLYHPDH 175 (297)
Q Consensus 155 ylvtv~~gse~~~gvly~~~~ 175 (297)
|--+|+||--.|+|+||-...
T Consensus 131 YQTaV~IGGHVFKGiLYDqG~ 151 (154)
T PF05142_consen 131 YQTAVNIGGHVFKGILYDQGP 151 (154)
T ss_pred eEEeEEECCEEeeeeeeccCC
Confidence 999999999999999998643
No 44
>PF04967 HTH_10: HTH DNA binding domain; InterPro: IPR007050 Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. This entry represents the HTH DNA binding domain found in Halobacterium salinarium (Halobacterium halobium) and described as a putative bacterio-opsin activator.
Probab=33.79 E-value=40 Score=23.66 Aligned_cols=40 Identities=25% Similarity=0.208 Sum_probs=32.4
Q ss_pred hcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhh
Q 022436 73 TRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLL 114 (297)
Q Consensus 73 ~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L 114 (297)
-..||-.+-++-.-.+||+.||++.+. ++..||+.-.+++
T Consensus 13 ~~~GYfd~PR~~tl~elA~~lgis~st--~~~~LRrae~kli 52 (53)
T PF04967_consen 13 YELGYFDVPRRITLEELAEELGISKST--VSEHLRRAERKLI 52 (53)
T ss_pred HHcCCCCCCCcCCHHHHHHHhCCCHHH--HHHHHHHHHHHHh
Confidence 346888888888899999999998653 7788998877765
No 45
>PF10545 MADF_DNA_bdg: Alcohol dehydrogenase transcription factor Myb/SANT-like; InterPro: IPR006578 The MADF (myb/SANT-like domain in Adf-1) domain is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain []. MADF is related to the Myb DNA-binding domain (IPR001005 from INTERPRO). The retroviral oncogene v-myb, and its cellular counterpart c-myb, are nuclear DNA-binding proteins that specifically recognise the sequence YAAC(G/T)G. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Some proteins known to contain a MADF domain are listed below: Drosophila Adf-1, a transcription factor first identified on the basis of its interaction with the alcohol dehydrogenase promoter but that binds the promoters of a diverse group of genes []. Drosophila Dorsal-interacting protein 3 (Dip3), which functions both as an activator to bind DNA in a sequence specific manner and a coactivator to stimulate synergistic activation by Dorsal and Twist []. Drosophila Stonewall (Stwl), a putative transcription factor required for maintenance of female germline stem cells as well as oocyte differentiation.
Probab=33.73 E-value=37 Score=24.64 Aligned_cols=38 Identities=18% Similarity=0.240 Sum_probs=21.6
Q ss_pred ccccchHHHHhhhcCCCC---CCcHHHHHHHHHHHhhHHHH
Q 022436 81 VAEKKWREVGAVFKFSPT---TTSASFVLRKHYLTLLYHYE 118 (297)
Q Consensus 81 ~~~~~W~~Va~~lg~p~~---~t~~s~~Lk~~Y~k~L~~yE 118 (297)
...+.|.+|+..||..-+ +...-..||..|.+.+...+
T Consensus 24 ~r~~aw~~Ia~~l~~~~~~~~~~~~w~~Lr~~y~~~~~~~~ 64 (85)
T PF10545_consen 24 LREEAWQEIARELGKEFSVDDCKKRWKNLRDRYRRELKKIK 64 (85)
T ss_pred HHHHHHHHHHHHHccchhHHHHHHHHHHHHHHHHHHHHHHh
Confidence 356789999999984322 22222345555555544444
No 46
>PF12776 Myb_DNA-bind_3: Myb/SANT-like DNA-binding domain; InterPro: IPR024752 This domain, found in a range of uncharacterised proteins, may be related to Myb/SANT-like DNA binding domains.
Probab=33.68 E-value=64 Score=24.33 Aligned_cols=59 Identities=17% Similarity=0.236 Sum_probs=39.5
Q ss_pred hhhhHHHHHhcCcc--hhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhhHHHHhhhh
Q 022436 64 LHVLYVEATTRGGY--EKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLLYHYEQVHF 122 (297)
Q Consensus 64 L~~Ly~~V~~~GG~--~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L~~yE~~~~ 122 (297)
|..|+.+....|.. ...-....|..|+.+|.-.....-...+|+.+|..+=..|.....
T Consensus 10 ll~~~~e~~~~g~~~~~~~fk~~~w~~i~~~~~~~~~~~~t~~qlknk~~~lk~~y~~~~~ 70 (96)
T PF12776_consen 10 LLDLLIEQINKGNRPTNGGFKKEGWNNIAEEFNEKTGLNYTKKQLKNKWKTLKKDYRIWKE 70 (96)
T ss_pred HHHHHHHHHHhCCCCCCCCcCHHHHHHHHHHHHHHhCCcccHHHHHHHHHHHHHHHHHHHH
Confidence 34455566666666 234444589999999986444333345889999888888887653
No 47
>PRK10236 hypothetical protein; Provisional
Probab=33.33 E-value=43 Score=30.83 Aligned_cols=24 Identities=8% Similarity=0.480 Sum_probs=20.4
Q ss_pred HHHHHHHHHhhcCChHHHHHHHHH
Q 022436 245 EFTKMIGESWTNLSPEERKVYQNI 268 (297)
Q Consensus 245 eisk~l~~~Wk~Ls~~eK~~Y~e~ 268 (297)
-+.+.+.+.|+.|+++|++.+.+.
T Consensus 117 il~kll~~a~~kms~eE~~~L~~~ 140 (237)
T PRK10236 117 LLEQFLRNTWKKMDEEHKQEFLHA 140 (237)
T ss_pred HHHHHHHHHHHHCCHHHHHHHHHH
Confidence 368889999999999999888643
No 48
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=31.46 E-value=1.6e+02 Score=25.55 Aligned_cols=36 Identities=11% Similarity=0.272 Sum_probs=28.5
Q ss_pred HHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHH
Q 022436 245 EFTKMIGESWTNLSPEERKVYQNIGLKDKERYNREL 280 (297)
Q Consensus 245 eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~ 280 (297)
+..+...+++.-|++++|..|.++-.+-.+.+.+.+
T Consensus 125 ~~~~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~ 160 (170)
T PRK12750 125 KMLEKRHQMLSILTPEQKAKFQELQQERMQECQDKM 160 (170)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence 344556678999999999999998877777777666
No 49
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=30.28 E-value=1.6e+02 Score=25.70 Aligned_cols=33 Identities=21% Similarity=0.349 Sum_probs=25.6
Q ss_pred HHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHH
Q 022436 245 EFTKMIGESWTNLSPEERKVYQNIGLKDKERYN 277 (297)
Q Consensus 245 eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~ 277 (297)
+..++-.++.+-|++|+|..|++..++-.+++.
T Consensus 112 em~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~ 144 (166)
T PRK10363 112 EMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLR 144 (166)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHH
Confidence 456666889999999999999877666555553
No 50
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=29.01 E-value=69 Score=28.83 Aligned_cols=41 Identities=22% Similarity=0.422 Sum_probs=34.5
Q ss_pred HHHHHHHhhcCChHHHHHHHHHHHH-HHHHHHHHHHHHHHhc
Q 022436 247 TKMIGESWTNLSPEERKVYQNIGLK-DKERYNRELKEYKERL 287 (297)
Q Consensus 247 sk~l~~~Wk~Ls~~eK~~Y~e~a~~-dke~y~~e~~~yk~~~ 287 (297)
..-+|.-|+.+|+++++.|.+.... ....|-..+.+|+...
T Consensus 77 ~~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q~ 118 (202)
T COG2854 77 KLVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQT 118 (202)
T ss_pred HHHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCCC
Confidence 4567899999999999999987766 5677999999998753
No 51
>PF08914 Myb_DNA-bind_2: Rap1 Myb domain; InterPro: IPR015010 Rap1 Myb adopts a canonical three-helix bundle tertiary structure, with the second and third helices forming a helix-turn-helix variant motif. The function is unclear but it may either interact with DNA via an adaptor protein or it may be only involved in protein-protein interactions []. ; PDB: 1FEX_A.
Probab=28.14 E-value=77 Score=23.13 Aligned_cols=51 Identities=22% Similarity=0.229 Sum_probs=28.7
Q ss_pred hhHHHHHhc--CcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhhHHHHhhh
Q 022436 66 VLYVEATTR--GGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLLYHYEQVH 121 (297)
Q Consensus 66 ~Ly~~V~~~--GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L~~yE~~~ 121 (297)
.|+.-|... .| ..|+.++.|.++++.-- + ...-..+|..|.+.|.+.+..|
T Consensus 12 ~l~~~v~~~~~~~-~~~~Gn~iwk~le~~~~---t-~HtwQSwR~Ry~K~L~~~~~~~ 64 (65)
T PF08914_consen 12 ALLDYVKENERQG-GSVSGNKIWKELEEKHP---T-RHTWQSWRDRYLKHLRGRPRKY 64 (65)
T ss_dssp HHHHHHHHT--ST-TTTTSSHHHHHHHHS-S---S-S--SHHHHHHHHHHT-------
T ss_pred HHHHHHHHhccCC-CCCchHHHHHHHHHHcC---C-CCCHHHHHHHHHHHHhccccCC
Confidence 355666433 33 45788889999988752 1 1223479999999998876543
No 52
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=27.09 E-value=1.6e+02 Score=25.54 Aligned_cols=30 Identities=17% Similarity=0.344 Sum_probs=22.1
Q ss_pred HHHHHHHHhhcCChHHHHHHHHHHHHHHHH
Q 022436 246 FTKMIGESWTNLSPEERKVYQNIGLKDKER 275 (297)
Q Consensus 246 isk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~ 275 (297)
..+...++++.|++++|..|.+..++-.++
T Consensus 119 ~~~~~~qmy~lLTPEQra~l~~~~e~r~~~ 148 (162)
T PRK12751 119 MAKVRNQMYNLLTPEQKEALNKKHQERIEK 148 (162)
T ss_pred HHHHHHHHHHcCCHHHHHHHHHHHHHHHHH
Confidence 445557888999999999998765554333
No 53
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=26.49 E-value=1.8e+02 Score=23.65 Aligned_cols=43 Identities=19% Similarity=0.140 Sum_probs=36.3
Q ss_pred HHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHhccc
Q 022436 247 TKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKERLKL 289 (297)
Q Consensus 247 sk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~~~~ 289 (297)
.+.+-..|+.|+++++.......+...+.|.+-+++|-.+...
T Consensus 89 ~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~~~~ 131 (135)
T PRK09706 89 QKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKARKR 131 (135)
T ss_pred HHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh
Confidence 3567788999999999999999999989999988888775443
No 54
>KOG1610 consensus Corticosteroid 11-beta-dehydrogenase and related short chain-type dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism; General function prediction only]
Probab=26.29 E-value=1.6e+02 Score=28.38 Aligned_cols=59 Identities=22% Similarity=0.430 Sum_probs=42.7
Q ss_pred hHHHHHHHHHHHHHhhC-------CC-------CHHHHHHHHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHH
Q 022436 223 GYNFFFAEKHYKLKSLY-------PN-------REREFTKMIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYK 284 (297)
Q Consensus 223 aY~lF~~e~r~~lk~~~-------p~-------~~~eisk~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk 284 (297)
|--.|+..-|.++..-. |+ ....+.+.+.++|..|+++.|+.|-+.+..+ |++.+..|.
T Consensus 184 aVeaf~D~lR~EL~~fGV~VsiiePG~f~T~l~~~~~~~~~~~~~w~~l~~e~k~~YGedy~~~---~~~~~~~~~ 256 (322)
T KOG1610|consen 184 AVEAFSDSLRRELRPFGVKVSIIEPGFFKTNLANPEKLEKRMKEIWERLPQETKDEYGEDYFED---YKKSLEKYL 256 (322)
T ss_pred HHHHHHHHHHHHHHhcCcEEEEeccCccccccCChHHHHHHHHHHHhcCCHHHHHHHHHHHHHH---HHHHHHhhh
Confidence 44567777787776522 33 3477899999999999999999998777665 455555554
No 55
>PLN03212 Transcription repressor MYB5; Provisional
Probab=25.61 E-value=94 Score=28.86 Aligned_cols=39 Identities=15% Similarity=0.190 Sum_probs=26.1
Q ss_pred hHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhhHH
Q 022436 67 LYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLLYH 116 (297)
Q Consensus 67 Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L~~ 116 (297)
|...|...|. +.|..||..++...+. .+.|.-|.+||.|
T Consensus 36 L~~lV~kyG~-------~nW~~IAk~~g~gRT~----KQCReRW~N~L~P 74 (249)
T PLN03212 36 LVSFIKKEGE-------GRWRSLPKRAGLLRCG----KSCRLRWMNYLRP 74 (249)
T ss_pred HHHHHHHhCc-------ccHHHHHHhhhcCCCc----chHHHHHHHhhch
Confidence 4455666653 3699999998743332 2678889888855
No 56
>PF12650 DUF3784: Domain of unknown function (DUF3784); InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=24.92 E-value=50 Score=25.51 Aligned_cols=17 Identities=24% Similarity=0.524 Sum_probs=13.8
Q ss_pred HHhhcCChHHHHHHHHH
Q 022436 252 ESWTNLSPEERKVYQNI 268 (297)
Q Consensus 252 ~~Wk~Ls~~eK~~Y~e~ 268 (297)
.-|+.||+|||++|.+.
T Consensus 24 aGyntms~eEk~~~D~~ 40 (97)
T PF12650_consen 24 AGYNTMSKEEKEKYDKK 40 (97)
T ss_pred hhcccCCHHHHHHhhHH
Confidence 34789999999999654
No 57
>PF05066 HARE-HTH: HB1, ASXL, restriction endonuclease HTH domain; InterPro: IPR007759 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. The delta protein is a dispensable subunit of Bacillus subtilis RNA polymerase (RNAP) that has major effects on the biochemical properties of the purified enzyme. In the presence of delta, RNAP displays an increased specificity of transcription, a decreased affinity for nucleic acids, and an increased efficiency of RNA synthesis because of enhanced recycling []. The delta protein contains two distinct regions, an N-terminal domain and a glutamate and aspartate residue-rich C-terminal region [].; GO: 0003677 DNA binding, 0006351 transcription, DNA-dependent; PDB: 2KRC_A.
Probab=24.54 E-value=1.1e+02 Score=22.01 Aligned_cols=43 Identities=16% Similarity=0.258 Sum_probs=25.0
Q ss_pred HHHHHHHHHhhcCCCCCCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHH
Q 022436 37 FWDTLRRFHFIMGTKFMIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVG 90 (297)
Q Consensus 37 F~~~L~~f~~~~G~~~~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va 90 (297)
|.+.-..-+++.| +++....|+..+.++|++... ...-|..|+
T Consensus 3 ~~eaa~~vL~~~~----------~pm~~~eI~~~i~~~~~~~~~-~k~p~~~i~ 45 (72)
T PF05066_consen 3 FKEAAYEVLEEAG----------RPMTFKEIWEEIQERGLYKKS-GKTPEATIA 45 (72)
T ss_dssp HHHHHHHHHHHH-----------S-EEHHHHHHHHHHHHTS----GGGGGHHHH
T ss_pred HHHHHHHHHHhcC----------CCcCHHHHHHHHHHhCCCCcc-cCCHHHHHH
Confidence 4555555555555 447788899999999999877 223444444
No 58
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptors for TRAP family transporters. Others are the SnoM family (see pfam03480) and the TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=23.52 E-value=1.6e+02 Score=26.56 Aligned_cols=28 Identities=18% Similarity=0.353 Sum_probs=21.4
Q ss_pred HHHhhcCChHHHHHHHHHHHHHHHHHHH
Q 022436 251 GESWTNLSPEERKVYQNIGLKDKERYNR 278 (297)
Q Consensus 251 ~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~ 278 (297)
.+.|..|++++|+...+.+.+.-+....
T Consensus 213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~~ 240 (257)
T TIGR00787 213 KAFWKSLPPDLQAVVKEAAKEAGEYQRK 240 (257)
T ss_pred HHHHhcCCHHHHHHHHHHHHHHHHHHHH
Confidence 4789999999999998877665444333
No 59
>PF13725 tRNA_bind_2: Possible tRNA binding domain; PDB: 2ZPA_B.
Probab=23.15 E-value=47 Score=25.64 Aligned_cols=21 Identities=24% Similarity=0.403 Sum_probs=13.8
Q ss_pred hhhccccchHHHHhhhcCCCC
Q 022436 78 EKVVAEKKWREVGAVFKFSPT 98 (297)
Q Consensus 78 ~~V~~~~~W~~Va~~lg~p~~ 98 (297)
.+|.+.+-|.+|+++||++..
T Consensus 78 ~k~LQ~ksw~~~a~~l~l~g~ 98 (101)
T PF13725_consen 78 AKGLQGKSWEEVAKELGLPGR 98 (101)
T ss_dssp HHHCS---HHHHHHHCT-SSH
T ss_pred HHHHCCCCHHHHHHHcCCCCC
Confidence 356778999999999999853
No 60
>PF11860 DUF3380: Protein of unknown function (DUF3380); InterPro: IPR024408 Proteins in this entry include lysozyme from Enterobacteria phage PRD1 [, ].
Probab=22.12 E-value=1.2e+02 Score=26.72 Aligned_cols=44 Identities=11% Similarity=0.135 Sum_probs=32.7
Q ss_pred HHHhcCcchhhccccchHHHHhhhcCCCCCC-cHHHHHHHHHHHh
Q 022436 70 EATTRGGYEKVVAEKKWREVGAVFKFSPTTT-SASFVLRKHYLTL 113 (297)
Q Consensus 70 ~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t-~~s~~Lk~~Y~k~ 113 (297)
-+...|++..-.+.+.|..+|+..+-|.-.. .=...|++.|.+|
T Consensus 130 Fi~~~~~L~~aLr~~dW~~fAr~YNGp~y~~n~Yd~kl~~ay~~~ 174 (175)
T PF11860_consen 130 FIKANPALLKALRAKDWAAFARGYNGPGYAKNQYDTKLARAYARF 174 (175)
T ss_pred HHHcCHHHHHHHHhCCHHHHHHHcCCchhhhccHHHHHHHHHHhc
Confidence 3455666888889999999999999987543 3444677777664
No 61
>PF06945 DUF1289: Protein of unknown function (DUF1289); InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=22.01 E-value=83 Score=21.71 Aligned_cols=20 Identities=20% Similarity=0.371 Sum_probs=14.9
Q ss_pred HHhhcCChHHHHHHHHHHHH
Q 022436 252 ESWTNLSPEERKVYQNIGLK 271 (297)
Q Consensus 252 ~~Wk~Ls~~eK~~Y~e~a~~ 271 (297)
..|+.|++++|....+....
T Consensus 28 ~~W~~~s~~er~~i~~~l~~ 47 (51)
T PF06945_consen 28 RDWKSMSDDERRAILARLRA 47 (51)
T ss_pred HHHhhCCHHHHHHHHHHHHH
Confidence 36999999998877654443
No 62
>KOG3838 consensus Mannose lectin ERGIC-53, involved in glycoprotein traffic [Intracellular trafficking, secretion, and vesicular transport]
Probab=21.84 E-value=82 Score=31.32 Aligned_cols=38 Identities=21% Similarity=0.194 Sum_probs=29.5
Q ss_pred cCChHHHHHHHHHHHHHHHHHHHHHHHHHHhccccccc
Q 022436 256 NLSPEERKVYQNIGLKDKERYNRELKEYKERLKLRQGE 293 (297)
Q Consensus 256 ~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~~~~~~~~ 293 (297)
.+.+.||++|++..+.....|.++.++|++...+.++.
T Consensus 268 E~qe~ek~kyqeEfe~~q~elek~k~efkk~hpd~~~e 305 (497)
T KOG3838|consen 268 EMQELEKAKYQEEFEWAQLELEKRKDEFKKSHPDAQGE 305 (497)
T ss_pred hhhHHHHHHHHHHHHHHHHHHhhhHhhhccCCchhhcc
Confidence 45566899999999888888888888888866555443
No 63
>PF00226 DnaJ: DnaJ domain; InterPro: IPR001623 The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein []. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues. Such a structure is shown in the following schematic representation: +------------+-+-------+-----+-----------+--------------------------------+ | N-terminal | | Gly-R | | CXXCXGXG | C-terminal | +------------+-+-------+-----+-----------+--------------------------------+ It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins [], either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones []. ; GO: 0031072 heat shock protein binding; PDB: 2GUZ_C 2L6L_A 1HDJ_A 2EJ7_A 1FPO_C 2CUG_A 2QSA_A 2OCH_A 3BVO_B 3APQ_A ....
Probab=21.47 E-value=1.4e+02 Score=20.59 Aligned_cols=35 Identities=26% Similarity=0.264 Sum_probs=24.8
Q ss_pred HHHHHHHHhhCCC---CHH----HHHHHHHHHhhcCChHHHH
Q 022436 229 AEKHYKLKSLYPN---RER----EFTKMIGESWTNLSPEERK 263 (297)
Q Consensus 229 ~e~r~~lk~~~p~---~~~----eisk~l~~~Wk~Ls~~eK~ 263 (297)
+..+..++.-||+ ... +....|.+.|+.|++.++.
T Consensus 19 ~~y~~l~~~~HPD~~~~~~~~~~~~~~~i~~Ay~~L~~~~~R 60 (64)
T PF00226_consen 19 KAYRRLSKQYHPDKNSGDEAEAEEKFARINEAYEILSDPERR 60 (64)
T ss_dssp HHHHHHHHHTSTTTGTSTHHHHHHHHHHHHHHHHHHHSHHHH
T ss_pred HHHHhhhhccccccchhhhhhhhHHHHHHHHHHHHhCCHHHH
Confidence 3445566777888 233 6888899999998876653
No 64
>PRK02363 DNA-directed RNA polymerase subunit delta; Reviewed
Probab=21.28 E-value=67 Score=26.78 Aligned_cols=64 Identities=8% Similarity=-0.007 Sum_probs=42.3
Q ss_pred hHHHHHHHHHHhhcCCCCCCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCCC---CCcHHHHHHH
Q 022436 35 IVFWDTLRRFHFIMGTKFMIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSPT---TTSASFVLRK 108 (297)
Q Consensus 35 ~~F~~~L~~f~~~~G~~~~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~---~t~~s~~Lk~ 108 (297)
..+++.-..++..+|.+ +.++.|+.+|....|+..-....+=.++..-|.+... +++....||.
T Consensus 3 ~S~idvAy~iL~~~~~~----------m~f~dL~~ev~~~~~~s~e~~~~~iaq~YtdLn~DGRFi~lG~n~WgLr~ 69 (129)
T PRK02363 3 LSLIEVAYEILKEKKEP----------MSFYDLVNEIQKYLGKSDEEIRERIAQFYTDLNLDGRFISLGDNKWGLRS 69 (129)
T ss_pred ccHHHHHHHHHHHcCCc----------ccHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHhccCCeeEcCCCceeccc
Confidence 35566666777766544 7788899999999887655545566777777776654 3444445555
No 65
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=21.02 E-value=1.6e+02 Score=28.21 Aligned_cols=38 Identities=13% Similarity=0.323 Sum_probs=29.5
Q ss_pred HHHHHhhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 022436 249 MIGESWTNLSPEERKVYQNIGLKDKERYNRELKEYKER 286 (297)
Q Consensus 249 ~l~~~Wk~Ls~~eK~~Y~e~a~~dke~y~~e~~~yk~~ 286 (297)
+-...|..|++++|+...+.+++..+...+...+....
T Consensus 242 ~s~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e~~ 279 (332)
T COG1638 242 VSKAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELEDE 279 (332)
T ss_pred EcHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 33578999999999999999988777666666655553
No 66
>PF02337 Gag_p10: Retroviral GAG p10 protein; InterPro: IPR003322 Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes []. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different. This entry represents matrix proteins from beta-retroviruses such as Mason-Pfizer monkey virus (MPMV) (Simian Mason-Pfizer virus) and Mouse mammary tumor virus (MMTV) [, ]. This entry also identifies matrix proteins from several eukaryotic endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome [].; GO: 0005198 structural molecule activity, 0019028 viral capsid; PDB: 2F77_X 2F76_X.
Probab=20.94 E-value=1.5e+02 Score=23.16 Aligned_cols=55 Identities=16% Similarity=0.147 Sum_probs=34.6
Q ss_pred hhHHHHHHHHHHhhcCCCCCCCeeCCeecchhhhHHHHHhcCcchhh---ccccchHHHHhhhcC
Q 022436 34 PIVFWDTLRRFHFIMGTKFMIPVIGGKELDLHVLYVEATTRGGYEKV---VAEKKWREVGAVFKF 95 (297)
Q Consensus 34 ~~~F~~~L~~f~~~~G~~~~~P~i~gk~lDL~~Ly~~V~~~GG~~~V---~~~~~W~~Va~~lg~ 95 (297)
++.|+..|..++..+|..++. =||-.+|..+.+..=.-.. ..-..|..|++.|.-
T Consensus 7 ~~~fv~~Lk~lLk~rGi~v~~-------~~L~~f~~~i~~~~PWF~~eG~l~~~~W~kvG~~l~~ 64 (90)
T PF02337_consen 7 KQPFVSILKHLLKERGIRVKK-------KDLINFLSFIDKVCPWFPEEGTLDLDNWKKVGEELKR 64 (90)
T ss_dssp HHHHHHHHHHHHHCCT----H-------HHHHHHHHHHHHHTT-SS--SS-HHHHHHHHHHHHHH
T ss_pred hhHHHHHHHHHHHHcCeeecH-------HHHHHHHHHHHHhCCCCCCCCCcCHHHHHHHHHHHHH
Confidence 479999999999999998742 3677777776554322221 223589999888743
No 67
>PTZ00100 DnaJ chaperone protein; Provisional
Probab=20.92 E-value=3.4e+02 Score=22.22 Aligned_cols=76 Identities=17% Similarity=0.238 Sum_probs=46.9
Q ss_pred hHHHHHHHHHHhhcCCCCCCCeeCCeecchhhhHHHHHhcCcchhhccccchHHHHhhhcCCCCCCcHHHHHHHHHHHhh
Q 022436 35 IVFWDTLRRFHFIMGTKFMIPVIGGKELDLHVLYVEATTRGGYEKVVAEKKWREVGAVFKFSPTTTSASFVLRKHYLTLL 114 (297)
Q Consensus 35 ~~F~~~L~~f~~~~G~~~~~P~i~gk~lDL~~Ly~~V~~~GG~~~V~~~~~W~~Va~~lg~p~~~t~~s~~Lk~~Y~k~L 114 (297)
..|+..+..+....+..+..|.-+-... +-.+| .-...+|++..... .+-...||++++++ ...+++.|.++.
T Consensus 18 ~~~~~~~~~~~~~~~~~~~~~~s~~~~~-~~~~~-~~~~~~~f~~~Ms~---~eAy~ILGv~~~As--~~eIkkaYRrLa 90 (116)
T PTZ00100 18 RYGYRYLKNQKIFGSNNMSFPLSGFNPS-LGSLF-LKNDLKGFENPMSK---SEAYKILNISPTAS--KERIREAHKQLM 90 (116)
T ss_pred HHHHHHHHHHhhccCccccCCchhhhHH-HHHHH-hccccccccCCCCH---HHHHHHcCCCCCCC--HHHHHHHHHHHH
Confidence 4567777777666665553343211111 22222 12367888886654 57788999998764 347999999988
Q ss_pred HHH
Q 022436 115 YHY 117 (297)
Q Consensus 115 ~~y 117 (297)
..|
T Consensus 91 ~~~ 93 (116)
T PTZ00100 91 LRN 93 (116)
T ss_pred HHh
Confidence 776
No 68
>PF01352 KRAB: KRAB box; InterPro: IPR001909 The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that is found in the N-terminal part of about one third of eukaryotic Krueppel-type C2H2 zinc finger proteins (ZFPs) []. It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box []. The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain [, ]. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin [, ]. KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome []. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.; GO: 0003676 nucleic acid binding, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 1V65_A.
Probab=20.11 E-value=1.3e+02 Score=19.92 Aligned_cols=26 Identities=23% Similarity=0.667 Sum_probs=15.9
Q ss_pred HHHHHHH-HHhhcCChHHHHHHHHHHH
Q 022436 245 EFTKMIG-ESWTNLSPEERKVYQNIGL 270 (297)
Q Consensus 245 eisk~l~-~~Wk~Ls~~eK~~Y~e~a~ 270 (297)
+++--++ +.|..|.+.+|.-|.+.-.
T Consensus 5 Dvav~fs~eEW~~L~~~Qk~ly~dvm~ 31 (41)
T PF01352_consen 5 DVAVYFSQEEWELLDPAQKNLYRDVML 31 (41)
T ss_dssp --TT---HHHHHTS-HHHHHHHHHHHH
T ss_pred EEEEEcChhhcccccceecccchhHHH
Confidence 3433333 6699999999999987654
Done!