Query         020402
Match_columns 326
No_of_seqs    303 out of 1562
Neff          6.2 
Searched_HMMs 46136
Date          Fri Mar 29 09:27:53 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/020402.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/020402hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG2744 DNA-binding proteins B 100.0 2.9E-28 6.3E-33  246.5  14.8  183   38-220   153-339 (512)
  2 smart00501 BRIGHT BRIGHT, ARID  99.9 9.4E-27   2E-31  186.3  10.1   91   47-137     1-92  (93)
  3 PF01388 ARID:  ARID/BRIGHT DNA  99.9   3E-25 6.5E-30  176.7   8.6   88   46-133     4-92  (92)
  4 PTZ00199 high mobility group p  99.9 1.2E-21 2.6E-26  157.5  10.0   84  234-317     7-93  (94)
  5 cd01389 MATA_HMG-box MATA_HMG-  99.8 1.1E-18 2.4E-23  134.8   7.1   71  249-319     1-72  (77)
  6 cd01388 SOX-TCF_HMG-box SOX-TC  99.7   3E-18 6.4E-23  130.8   7.3   69  249-317     1-70  (72)
  7 PF00505 HMG_box:  HMG (high mo  99.7 1.8E-17   4E-22  124.2   8.3   68  250-317     1-69  (69)
  8 PF09011 HMG_box_2:  HMG-box do  99.7 3.5E-17 7.7E-22  125.2   8.4   71  247-317     1-73  (73)
  9 cd01390 HMGB-UBF_HMG-box HMGB-  99.7 5.9E-17 1.3E-21  120.3   8.3   65  250-314     1-66  (66)
 10 smart00398 HMG high mobility g  99.7 9.2E-17   2E-21  120.0   8.5   69  249-317     1-70  (70)
 11 COG5648 NHP6B Chromatin-associ  99.7 1.7E-16 3.7E-21  142.5   7.9   85  238-322    59-144 (211)
 12 KOG0381 HMG box-containing pro  99.7 5.4E-16 1.2E-20  124.0  10.0   75  246-320    17-95  (96)
 13 KOG0527 HMG-box transcription   99.6 5.9E-16 1.3E-20  149.4   5.9   75  244-318    57-132 (331)
 14 cd00084 HMG-box High Mobility   99.6 3.4E-15 7.4E-20  110.2   8.3   64  250-313     1-65  (66)
 15 KOG0526 Nucleosome-binding fac  99.5 2.1E-14 4.5E-19  143.3   6.7   76  238-317   524-600 (615)
 16 KOG3248 Transcription factor T  99.5 1.5E-13 3.3E-18  130.5  11.1  136  180-318   108-261 (421)
 17 KOG2510 SWI-SNF chromatin-remo  99.2 1.9E-11 4.1E-16  121.4   6.9   96   46-148   291-387 (532)
 18 KOG0528 HMG-box transcription   99.0 3.1E-10 6.8E-15  112.6   3.3   75  245-319   321-396 (511)
 19 KOG4715 SWI/SNF-related matrix  98.9 4.7E-09   1E-13   99.5   7.8   75  244-318    59-134 (410)
 20 KOG2746 HMG-box transcription   98.3   6E-07 1.3E-11   92.6   4.6   75  238-312   170-247 (683)
 21 PF14887 HMG_box_5:  HMG (high   97.9 3.3E-05 7.2E-10   59.6   6.3   73  249-321     3-75  (85)
 22 PF04690 YABBY:  YABBY protein;  96.9  0.0013 2.9E-08   58.3   5.0   43  249-291   121-164 (170)
 23 PF06382 DUF1074:  Protein of u  96.8   0.006 1.3E-07   54.3   7.8   48  254-305    83-131 (183)
 24 COG5648 NHP6B Chromatin-associ  96.1  0.0064 1.4E-07   55.4   4.0   66  249-314   143-209 (211)
 25 PF08073 CHDNT:  CHDNT (NUC034)  88.3    0.45 9.8E-06   34.7   2.6   39  254-292    13-52  (55)
 26 PF00249 Myb_DNA-binding:  Myb-  83.2     2.8 6.2E-05   28.8   4.6   39   80-129    10-48  (48)
 27 PF04769 MAT_Alpha1:  Mating-ty  82.0     2.7 5.9E-05   38.5   5.2   53  244-302    38-91  (201)
 28 PF06244 DUF1014:  Protein of u  80.1     1.6 3.6E-05   36.8   2.9   45  249-293    71-117 (122)
 29 TIGR01624 LRP1_Cterm LRP1 C-te  75.3     1.8 3.9E-05   30.7   1.5   31  185-215    16-47  (50)
 30 TIGR03481 HpnM hopanoid biosyn  68.9      14  0.0003   33.6   6.1   42  277-318    67-110 (198)
 31 PF05142 DUF702:  Domain of unk  68.1     3.1 6.8E-05   36.4   1.6   32  185-216   118-149 (154)
 32 PRK15117 ABC transporter perip  64.8      15 0.00034   33.6   5.7   46  273-318    66-114 (211)
 33 PF12881 NUT_N:  NUT protein N   62.3      14 0.00031   36.0   5.0   63  257-319   232-296 (328)
 34 PF09441 Abp2:  ARS binding pro  60.0      18 0.00039   32.1   4.9   41   69-113    45-85  (175)
 35 PF13921 Myb_DNA-bind_6:  Myb-l  56.9      20 0.00044   25.4   4.0   36   81-129     8-43  (60)
 36 cd00167 SANT 'SWI3, ADA2, N-Co  55.4      26 0.00057   22.4   4.1   37   81-129     9-45  (45)
 37 PF05494 Tol_Tol_Ttg2:  Toluene  55.0      30 0.00065   30.1   5.6   42  277-318    42-84  (170)
 38 KOG3223 Uncharacterized conser  52.6     6.8 0.00015   35.6   1.1   52  248-302   162-215 (221)
 39 PF11304 DUF3106:  Protein of u  49.9      57  0.0012   26.7   6.1   40  278-317    11-57  (107)
 40 COG2854 Ttg2D ABC-type transpo  43.6      34 0.00073   31.5   4.1   43  277-319    74-117 (202)
 41 KOG0493 Transcription factor E  42.1      40 0.00087   32.3   4.4   42  227-274   226-267 (342)
 42 PF13873 Myb_DNA-bind_5:  Myb/S  40.8      36 0.00077   25.5   3.3   54   77-131    14-71  (78)
 43 PF12776 Myb_DNA-bind_3:  Myb/S  37.9      64  0.0014   24.8   4.5   61   79-139    10-72  (96)
 44 PF12650 DUF3784:  Domain of un  37.9      23 0.00049   28.1   1.9   17  286-302    25-41  (97)
 45 smart00717 SANT SANT  SWI3, AD  37.4      67  0.0015   20.7   4.0   26   99-129    22-47  (49)
 46 PF13875 DUF4202:  Domain of un  36.6      54  0.0012   29.7   4.2   40  255-297   130-170 (185)
 47 PF10545 MADF_DNA_bdg:  Alcohol  35.8      37 0.00081   25.1   2.7   38   96-133    24-64  (85)
 48 PF02337 Gag_p10:  Retroviral G  33.1 1.4E+02  0.0031   23.8   5.7   54   50-110     8-64  (90)
 49 PRK09706 transcriptional repre  32.7   1E+02  0.0022   25.7   5.2   41  279-319    88-128 (135)
 50 smart00595 MADF subfamily of S  31.2      34 0.00073   26.1   1.9   42   94-135    23-65  (89)
 51 PF05066 HARE-HTH:  HB1, ASXL,   30.5      83  0.0018   23.3   3.8   43   52-105     3-45  (72)
 52 PRK10236 hypothetical protein;  29.9      51  0.0011   31.0   3.1   45  257-301    89-140 (237)
 53 PF04967 HTH_10:  HTH DNA bindi  28.2      77  0.0017   22.8   3.1   40   88-129    13-52  (53)
 54 TIGR00787 dctP tripartite ATP-  27.0 1.1E+02  0.0024   28.2   4.8   28  284-311   213-240 (257)
 55 COG1638 DctP TRAP-type C4-dica  26.0   1E+02  0.0022   30.3   4.5   43  277-319   237-279 (332)
 56 PRK02363 DNA-directed RNA poly  24.5      60  0.0013   27.6   2.3   63   51-123     4-69  (129)
 57 PRK12751 cpxP periplasmic stre  24.5 1.3E+02  0.0028   26.6   4.5   32  278-309   118-149 (162)
 58 PRK12750 cpxP periplasmic repr  23.0   2E+02  0.0043   25.5   5.5   36  278-313   125-160 (170)
 59 PRK10363 cpxP periplasmic repr  22.1 1.7E+02  0.0037   26.1   4.8   39  277-316   111-149 (166)
 60 cd07268 Glo_EDI_BRP_like_4 Thi  22.0      43 0.00092   29.3   0.9   49   42-90      4-52  (149)
 61 PF13725 tRNA_bind_2:  Possible  20.9      59  0.0013   25.6   1.5   20   93-112    78-97  (101)
 62 PLN00131 hypothetical protein;  20.4 3.3E+02  0.0071   24.3   6.1   56    6-61     83-147 (218)
 63 cd05694 S1_Rrp5_repeat_hs2_sc2  20.4      73  0.0016   24.0   1.8   31  182-214     4-34  (74)

No 1  
>KOG2744 consensus DNA-binding proteins Bright/BRCAA1/RBP1 and related proteins containing BRIGHT domain [Transcription]
Probab=99.95  E-value=2.9e-28  Score=246.54  Aligned_cols=183  Identities=38%  Similarity=0.547  Sum_probs=152.0

Q ss_pred             CCCCchhhhhcHHHHHHHHHHHHHhcCCCCC-CCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCC-CCC
Q 020402           38 PTAKYEDIAQSSDLFWATLEAFHKSFGDKFK-VPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPT-TIT  115 (326)
Q Consensus        38 ~~~~~e~~~~~~~~F~~~L~~F~~~rG~~l~-~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~-~~~  115 (326)
                      +...+|.+..+++.||++|+.||+.+|++|+ +|+|+|++||||.||.+|+++||+++|+.+++|++|+..|+||. ++|
T Consensus       153 ~~~~~e~~~~~~eeF~~dl~~f~~~~~~~~~~iPii~~~~ldL~~Ly~lV~s~GG~~~V~~~k~Wrev~~~l~~pt~tiT  232 (512)
T KOG2744|consen  153 PLYETEGVPKSSEEFMEDLRRFMKKRGTKVKSIPIIGGQPLDLHWLYALVTSRGGLDEVTNKKLWREVIDGLNFPTPTIT  232 (512)
T ss_pred             cccccccccccHHHHHHHHHHHHHHhCCcceeccccCCCcchHHHHHHHHhcCCchhHhhhhhhHHHHhccccCCCcccc
Confidence            5555666777999999999999999999997 99999999999999999999999999999999999999999999 999


Q ss_pred             cHHHHHHHHHHHhhHHhhhhhhhccCCCCCCCCCCCC-CCCCC-CCCCCCCCCchhhhcCCCCCccccCCCceeeeecCc
Q 020402          116 SASFVLRKYYLSLLYHFEQVYYFRREAPSSSMPDAVS-GSSLD-NGSASPEEGSTINQLGSQGISKLQIGCSVSGVIDGK  193 (326)
Q Consensus       116 ~as~~Lk~~Y~k~L~~fE~~~~~~~~~~~~~~~~~~~-~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~V~g~idg~  193 (326)
                      +++|.||++|+++|++||+.+++....+..++.+... .++.. .+....+.++........++...+....+.|+|+|+
T Consensus       233 saaf~lr~~y~K~L~~ye~~~~~~~~~pln~p~~~~~~a~~~~~rE~~~~~~~~~~~~~~~~~~~~~~~~~~aa~~~~g~  312 (512)
T KOG2744|consen  233 SAAFTLRKQYLKLLFEYECEFEKNRHVPLNSPAELSEEASSSNRREGRRHELSPSKEFQANGPSEEEPAEAEAAPEILGN  312 (512)
T ss_pred             hHHHHHHHHHHHHHHHHHHHHHHhccCCCCCcccccccccccccccccccccCcchhhccCCcccccccccccchhhhcc
Confidence            9999999999999999999999998777777665444 22222 233444444432333333334444567899999999


Q ss_pred             ccCCceEEEeeccccccccccccCCCC
Q 020402          194 FDNGYLVTVNLGSEQLKGVLYHIPHAH  220 (326)
Q Consensus       194 fd~gy~vtv~~gse~~~g~ly~~p~~~  220 (326)
                      |+.||++++.++++.+++++|+.+...
T Consensus       313 f~~~~~~~~~~~s~~ln~~~~~~~~~~  339 (512)
T KOG2744|consen  313 FLQGLLVFMKDGSEPLNGVLYLGPPDL  339 (512)
T ss_pred             ccccCceeccCcchhccCccccccCcc
Confidence            999999999999999999999986643


No 2  
>smart00501 BRIGHT BRIGHT, ARID (A/T-rich interaction domain) domain. DNA-binding domain containing a helix-turn-helix structure
Probab=99.94  E-value=9.4e-27  Score=186.31  Aligned_cols=91  Identities=40%  Similarity=0.652  Sum_probs=87.2

Q ss_pred             hcHHHHHHHHHHHHHhcCCCC-CCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHH
Q 020402           47 QSSDLFWATLEAFHKSFGDKF-KVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYY  125 (326)
Q Consensus        47 ~~~~~F~~~L~~F~~~rG~~l-~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y  125 (326)
                      ++++.|+++|.+||+.+|+++ ++|+|+|++||||+||.+|+++|||++||++++|.+||+.||+++.+++++..|+++|
T Consensus         1 ~~~~~F~~~L~~F~~~~g~~~~~~P~i~g~~vdL~~Ly~~V~~~GG~~~v~~~~~W~~Va~~lg~~~~~~~~~~~lk~~Y   80 (93)
T smart00501        1 RERVLFLDRLYKFMEERGSPLKKIPVIGGKPLDLYRLYRLVQERGGYDQVTKDKKWKEIARELGIPDTSTSAASSLRKHY   80 (93)
T ss_pred             CcHHHHHHHHHHHHHHcCCcCCcCCeECCEeCcHHHHHHHHHHccCHHHHcCCCCHHHHHHHhCCCcccchHHHHHHHHH
Confidence            468999999999999999998 7999999999999999999999999999999999999999999998999999999999


Q ss_pred             HHhhHHhhhhhh
Q 020402          126 LSLLYHFEQVYY  137 (326)
Q Consensus       126 ~k~L~~fE~~~~  137 (326)
                      .+||++||+.+.
T Consensus        81 ~k~L~~yE~~~~   92 (93)
T smart00501       81 ERYLLPFERFLR   92 (93)
T ss_pred             HHHhHHHHHHhh
Confidence            999999999854


No 3  
>PF01388 ARID:  ARID/BRIGHT DNA binding domain;  InterPro: IPR001606 Members of the recently discovered ARID (AT-rich interaction domain; also known as BRIGHT domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure []. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1 []. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. 
Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions [].; GO: 0003677 DNA binding, 0005622 intracellular; PDB: 1C20_A 1KQQ_A 2JRZ_A 2LM1_A 2YQE_A 2JXJ_A 2EH9_A 2CXY_A 2LI6_A 1KN5_A ....
Probab=99.92  E-value=3e-25  Score=176.73  Aligned_cols=88  Identities=38%  Similarity=0.711  Sum_probs=81.9

Q ss_pred             hhcHHHHHHHHHHHHHhcCCCC-CCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHH
Q 020402           46 AQSSDLFWATLEAFHKSFGDKF-KVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKY  124 (326)
Q Consensus        46 ~~~~~~F~~~L~~F~~~rG~~l-~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~  124 (326)
                      ..+++.|++.|++||+.+|+++ .+|.|+|++||||+||.+|+++|||++|+.+++|.+||..||+++.+++.+..|+++
T Consensus         4 ~~~~~~F~~~L~~f~~~~g~~~~~~P~i~g~~vDL~~Ly~~V~~~GG~~~V~~~~~W~~va~~lg~~~~~~~~~~~L~~~   83 (92)
T PF01388_consen    4 TREREQFLEQLREFHESRGTPIDRPPVIGGKPVDLYKLYKAVMKRGGFDKVTKNKKWREVARKLGFPPSSTSAAQQLRQH   83 (92)
T ss_dssp             CHHHHHHHHHHHHHHHHTTSSSSS-SEETTSE-SHHHHHHHHHHHTSHHHHHHHTTHHHHHHHTTS-TTSCHHHHHHHHH
T ss_pred             chHHHHHHHHHHHHHHHcCCCCCCCCcCCCEeCcHHHHHHHHHhCcCcccCcccchHHHHHHHhCCCCCCCcHHHHHHHH
Confidence            4678999999999999999997 799999999999999999999999999999999999999999999888888999999


Q ss_pred             HHHhhHHhh
Q 020402          125 YLSLLYHFE  133 (326)
Q Consensus       125 Y~k~L~~fE  133 (326)
                      |++||++||
T Consensus        84 Y~~~L~~fE   92 (92)
T PF01388_consen   84 YEKYLLPFE   92 (92)
T ss_dssp             HHHHTHHHH
T ss_pred             HHHHhHhhC
Confidence            999999998


No 4  
>PTZ00199 high mobility group protein; Provisional
Probab=99.86  E-value=1.2e-21  Score=157.48  Aligned_cols=84  Identities=32%  Similarity=0.519  Sum_probs=78.2

Q ss_pred             hhhhccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh---hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHH
Q 020402          234 HRRRKRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE---KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYK  310 (326)
Q Consensus       234 rrkkkk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~---~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~  310 (326)
                      ++.+++++++++||++||+|+|||+||+.++|..|+.+||++.   .+|+++||++|+.|+++||++|+++|+.|+++|.
T Consensus         7 ~~~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~   86 (94)
T PTZ00199          7 KVLVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYE   86 (94)
T ss_pred             CccccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence            4446666677899999999999999999999999999999974   8999999999999999999999999999999999


Q ss_pred             HHHHHHH
Q 020402          311 SEMLEYR  317 (326)
Q Consensus       311 ~em~~Yk  317 (326)
                      .||.+|+
T Consensus        87 ~e~~~Y~   93 (94)
T PTZ00199         87 KEKAEYA   93 (94)
T ss_pred             HHHHHHh
Confidence            9999996


No 5  
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.76  E-value=1.1e-18  Score=134.79  Aligned_cols=71  Identities=25%  Similarity=0.427  Sum_probs=68.7

Q ss_pred             CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402          249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS  319 (326)
Q Consensus       249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~  319 (326)
                      +||||+||||||+++.|..++.++|+++ .+|+++||++|+.|++++|++|.++|++++++|.+++++|+-.
T Consensus         1 ~~kRP~naf~lf~~~~r~~~~~~~p~~~~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yky~   72 (77)
T cd01389           1 KIPRPRNAFILYRQDKHAQLKTENPGLTNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYKYT   72 (77)
T ss_pred             CCCCCCcHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCccc
Confidence            5899999999999999999999999999 9999999999999999999999999999999999999999853


No 6  
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.75  E-value=3e-18  Score=130.79  Aligned_cols=69  Identities=30%  Similarity=0.401  Sum_probs=67.1

Q ss_pred             CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR  317 (326)
Q Consensus       249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk  317 (326)
                      +.|||+|||++|++++|.+++++||+++ .+|+++||+.|+.|++++|++|.++|++++++|.+++++|+
T Consensus         1 ~iKrP~naf~~F~~~~r~~~~~~~p~~~~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~   70 (72)
T cd01388           1 HIKRPMNAFMLFSKRHRRKVLQEYPLKENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYK   70 (72)
T ss_pred             CCCCCCcHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCC
Confidence            4689999999999999999999999999 99999999999999999999999999999999999999997


No 7  
>PF00505 HMG_box:  HMG (high mobility group) box;  InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair []. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins []; LEF1 lymphoid enhancer binding factor 1 []; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor []; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.72  E-value=1.8e-17  Score=124.21  Aligned_cols=68  Identities=40%  Similarity=0.651  Sum_probs=65.0

Q ss_pred             CCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          250 PKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR  317 (326)
Q Consensus       250 PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk  317 (326)
                      ||||+|||++|+.+.+..++.+||++. .+|+++||++|++|+++||++|.++|++++++|+++|++|+
T Consensus         1 PkrP~~af~lf~~~~~~~~k~~~p~~~~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~   69 (69)
T PF00505_consen    1 PKRPPNAFMLFCKEKRAKLKEENPDLSNKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK   69 (69)
T ss_dssp             SSSS--HHHHHHHHHHHHHHHHSTTSTHHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             CcCCCCHHHHHHHHHHHHHHHHhcccccccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            899999999999999999999999999 99999999999999999999999999999999999999996


No 8  
>PF09011 HMG_box_2:  HMG-box domain;  InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.71  E-value=3.5e-17  Score=125.16  Aligned_cols=71  Identities=38%  Similarity=0.655  Sum_probs=62.4

Q ss_pred             CCCCCCCCChHHHHHHHHHHHhccc-CCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          247 PSRPKSNRSGYNFFFAEHYARLKPH-YYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR  317 (326)
Q Consensus       247 p~~PKrP~SAY~lF~~e~r~~lk~~-~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk  317 (326)
                      |++||+|+|||+||+.+++..++.. .+... .|+++.|++.|++||++||.+|.++|+.++++|+.+|.+|+
T Consensus         1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~   73 (73)
T PF09011_consen    1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQSFREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN   73 (73)
T ss_dssp             SSS--SSSSHHHHHHHHHHHHHHHHT-T-SSHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred             CcCCCCCCCHHHHHHHHHHHHHHHhcccCCCHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            5789999999999999999999988 66666 99999999999999999999999999999999999999995


No 9  
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.70  E-value=5.9e-17  Score=120.28  Aligned_cols=65  Identities=40%  Similarity=0.615  Sum_probs=63.2

Q ss_pred             CCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          250 PKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEML  314 (326)
Q Consensus       250 PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~  314 (326)
                      ||+|+|||++|++++|..++.+||+++ .+|+++||++|++|++++|++|.++|++++++|+.+|.
T Consensus         1 Pkrp~saf~~f~~~~r~~~~~~~p~~~~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~~   66 (66)
T cd01390           1 PKRPLSAYFLFSQEQRPKLKKENPDASVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEMK   66 (66)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhhC
Confidence            899999999999999999999999998 99999999999999999999999999999999999873


No 10 
>smart00398 HMG high mobility group.
Probab=99.69  E-value=9.2e-17  Score=119.95  Aligned_cols=69  Identities=43%  Similarity=0.662  Sum_probs=67.2

Q ss_pred             CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYR  317 (326)
Q Consensus       249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk  317 (326)
                      +||+|+|||++|++++|..++.++|++. .+|+++||++|+.|++++|++|.++|++++++|++++++|+
T Consensus         1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~~~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~   70 (70)
T smart00398        1 KPKRPMSAFMLFSQENRAKIKAENPDLSNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK   70 (70)
T ss_pred             CcCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            5899999999999999999999999999 99999999999999999999999999999999999999985


No 11 
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.66  E-value=1.7e-16  Score=142.49  Aligned_cols=85  Identities=28%  Similarity=0.444  Sum_probs=80.6

Q ss_pred             ccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          238 KRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEY  316 (326)
Q Consensus       238 kk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Y  316 (326)
                      +...++++||+.||||+|||++|+.++|.+++.++|++. .++.+.+|++|++|+++||++|.+.|..++++|..++..|
T Consensus        59 k~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l~~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~y  138 (211)
T COG5648          59 KRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKLTFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEEY  138 (211)
T ss_pred             HHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCCChHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHhh
Confidence            455677899999999999999999999999999999999 9999999999999999999999999999999999999999


Q ss_pred             HhcCCC
Q 020402          317 RSSYDS  322 (326)
Q Consensus       317 k~~~~~  322 (326)
                      .++...
T Consensus       139 ~~k~~~  144 (211)
T COG5648         139 NKKLPN  144 (211)
T ss_pred             hcccCC
Confidence            997754


No 12 
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.66  E-value=5.4e-16  Score=123.98  Aligned_cols=75  Identities=39%  Similarity=0.671  Sum_probs=71.8

Q ss_pred             CC--CCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH-HHHhcC
Q 020402          246 DP--SRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEML-EYRSSY  320 (326)
Q Consensus       246 dp--~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~-~Yk~~~  320 (326)
                      |+  +.||+|++||++|+.+.|..++.+||++. .++++++|++|++|++++|++|+..|.+++++|..+|. +|+..+
T Consensus        17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~   95 (96)
T KOG0381|consen   17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGLSVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL   95 (96)
T ss_pred             CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence            66  59999999999999999999999999988 99999999999999999999999999999999999999 998765


No 13 
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.61  E-value=5.9e-16  Score=149.41  Aligned_cols=75  Identities=19%  Similarity=0.303  Sum_probs=71.7

Q ss_pred             CCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 020402          244 LRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRS  318 (326)
Q Consensus       244 ~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~  318 (326)
                      ++...+.||||||||+|.+.+|.+|.+++|++. .||+|+||.+||.|+|+||++|.++|++.|..|++|+++||-
T Consensus        57 k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHNSEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYKY  132 (331)
T KOG0527|consen   57 KTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHNSEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYKY  132 (331)
T ss_pred             CCCccccCCCcchhhhhhHHHHHHHHHhCcchhhHHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCccc
Confidence            445568999999999999999999999999999 999999999999999999999999999999999999999986


No 14 
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.60  E-value=3.4e-15  Score=110.19  Aligned_cols=64  Identities=47%  Similarity=0.707  Sum_probs=62.3

Q ss_pred             CCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHH
Q 020402          250 PKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEM  313 (326)
Q Consensus       250 PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em  313 (326)
                      ||+|+|||++|+++.+..++.++|++. .+|++++|++|+.|++++|++|.+.|+.++++|++++
T Consensus         1 pkrp~~af~~f~~~~~~~~~~~~~~~~~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~   65 (66)
T cd00084           1 PKRPLSAYFLFSQEHRAEVKAENPGLSVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM   65 (66)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCcCCCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence            799999999999999999999999988 9999999999999999999999999999999999876


No 15 
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.50  E-value=2.1e-14  Score=143.28  Aligned_cols=76  Identities=30%  Similarity=0.598  Sum_probs=72.0

Q ss_pred             ccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          238 KRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEY  316 (326)
Q Consensus       238 kk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Y  316 (326)
                      +|+.++++||++|||++||||+|+...|..||.+  ++. .+++|..|++|+.|+.  |.+|+++|+.||.||+.||.+|
T Consensus       524 ~k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi~~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~~y  599 (615)
T KOG0526|consen  524 KKKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GISVGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMKEY  599 (615)
T ss_pred             ccCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--CchHHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHHhh
Confidence            3455778999999999999999999999999998  888 9999999999999999  9999999999999999999999


Q ss_pred             H
Q 020402          317 R  317 (326)
Q Consensus       317 k  317 (326)
                      +
T Consensus       600 k  600 (615)
T KOG0526|consen  600 K  600 (615)
T ss_pred             c
Confidence            9


No 16 
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.49  E-value=1.5e-13  Score=130.55  Aligned_cols=136  Identities=18%  Similarity=0.257  Sum_probs=97.6

Q ss_pred             ccCCCceeeeecCcccCCceEEEeecccccccccccc----CCCCCC----------CCCCC-CCCC--cchhhhccccc
Q 020402          180 LQIGCSVSGVIDGKFDNGYLVTVNLGSEQLKGVLYHI----PHAHNV----------SQSSN-NSAA--PTHRRRKRSRL  242 (326)
Q Consensus       180 ~~~~~~V~g~idg~fd~gy~vtv~~gse~~~g~ly~~----p~~~~~----------~~~~~-~~a~--~~rrkkkk~k~  242 (326)
                      +..+-.|..+-.|.|.+.|+-+|...-.  +..-.|+    |.....          ++.+. .++.  ..++...+++.
T Consensus       108 ~~l~wp~y~~pt~~~~~p~p~~~~asms--rf~ph~~~p~~p~~~tagiPhpaiv~P~~kqes~~~~~nvk~~~~~k~e~  185 (421)
T KOG3248|consen  108 HPLGWPVYPIPTFGFRHPYPGVVNASMS--RFSPHHVEPGHPGLHTAGIPHPAIVTPPVKQESDSAPQNVKRQAESKKEE  185 (421)
T ss_pred             CccCCccccCCCCCCCCCCchhhhhhhh--hcchhccCCCCCCccccCCCCccccCCcccCcccccccccchhhhccccc
Confidence            4556678888899999999974433322  2222332    211111          11221 1111  23333333333


Q ss_pred             cCCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 020402          243 ALRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRS  318 (326)
Q Consensus       243 k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~  318 (326)
                      +.|++ +.|+|+||||+||+|.|++|.+|..-.. .+|+++||++|.+|+.||..+|+|+|++||+-|+..++.|-+
T Consensus       186 e~Kkp-hiKKPLNAFmlyMKEmRa~vvaEctlKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSA  261 (421)
T KOG3248|consen  186 EAKKP-HIKKPLNAFMLYMKEMRAKVVAECTLKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSA  261 (421)
T ss_pred             cccCc-cccccHHHHHHHHHHHHHHHHHHhhhhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcch
Confidence            43444 8999999999999999999999999777 999999999999999999999999999999999999888855


No 17 
>KOG2510 consensus SWI-SNF chromatin-remodeling complex protein [Chromatin structure and dynamics]
Probab=99.21  E-value=1.9e-11  Score=121.44  Aligned_cols=96  Identities=29%  Similarity=0.509  Sum_probs=87.7

Q ss_pred             hhcHHHHHHHHHHHHHhcCCCCC-CCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHH
Q 020402           46 AQSSDLFWATLEAFHKSFGDKFK-VPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKY  124 (326)
Q Consensus        46 ~~~~~~F~~~L~~F~~~rG~~l~-~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~  124 (326)
                      ..+++..++.|+.|++.+.+++. +|.++.|+||||+||..|..+||+..|++++  +++|.-||     .++++.||++
T Consensus       291 qp~r~~wvDR~raF~ee~~Sp~t~~p~~gakPldl~rlYvsvke~gg~~~v~knk--rd~a~~lg-----ssaa~~l~k~  363 (532)
T KOG2510|consen  291 QPERKEWVDRLRAFTEERASPMTNLPAVGAKPLDLYRLYVSVKEIGGLTQVNKNK--RDLATNLG-----SSAASSLKKQ  363 (532)
T ss_pred             CcchhhHHHHHHHHHHhhcCcccccccccccchhHHHHHHHHHHhccceeeccch--hhhhhccc-----hHHHHHHHHH
Confidence            47889999999999999999995 8999999999999999999999999999999  99999888     5788899999


Q ss_pred             HHHhhHHhhhhhhhccCCCCCCCC
Q 020402          125 YLSLLYHFEQVYYFRREAPSSSMP  148 (326)
Q Consensus       125 Y~k~L~~fE~~~~~~~~~~~~~~~  148 (326)
                      |.+||+.||+.+-.|+++++....
T Consensus       364 y~~~lf~fec~f~Rg~e~p~~~~s  387 (532)
T KOG2510|consen  364 YIQYLFAFECKFERGEEPPPDIFS  387 (532)
T ss_pred             HHHHHHhhceeeeccCCCCHHHhh
Confidence            999999999999988887764443


No 18 
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=98.96  E-value=3.1e-10  Score=112.65  Aligned_cols=75  Identities=17%  Similarity=0.285  Sum_probs=68.6

Q ss_pred             CCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402          245 RDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS  319 (326)
Q Consensus       245 kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~  319 (326)
                      .-+.+.||||||||+|.++.|.++.+.+||+. ..|+|+||.+|+.|+..||++|+|.-.+.=..|.+.+++||-+
T Consensus       321 ss~PHIKRPMNAFMVWAkDERRKILqA~PDMHNSnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYrYk  396 (511)
T KOG0528|consen  321 SSEPHIKRPMNAFMVWAKDERRKILQAFPDMHNSNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYRYK  396 (511)
T ss_pred             CCCccccCCcchhhcccchhhhhhhhcCccccccchhHHhcccccccccccccchHHHHHHHHHhhhccCcccccC
Confidence            33458899999999999999999999999999 9999999999999999999999987777777999999999864


No 19 
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin  [Chromatin structure and dynamics]
Probab=98.88  E-value=4.7e-09  Score=99.50  Aligned_cols=75  Identities=24%  Similarity=0.372  Sum_probs=71.0

Q ss_pred             CCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHh
Q 020402          244 LRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRS  318 (326)
Q Consensus       244 ~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~  318 (326)
                      .+.|..|-+|+-+||.|++..++++++.||++. .||.|+||.+|..|+|+||+.|...++.+|..|.+.|..|..
T Consensus        59 pkpPkppekpl~pymrySrkvWd~VkA~nPe~kLWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~  134 (410)
T KOG4715|consen   59 PKPPKPPEKPLMPYMRYSRKVWDQVKASNPELKLWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHN  134 (410)
T ss_pred             CCCCCCCCcccchhhHHhhhhhhhhhccCcchHHHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhC
Confidence            455678889999999999999999999999999 999999999999999999999999999999999999999975


No 20 
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.29  E-value=6e-07  Score=92.65  Aligned_cols=75  Identities=24%  Similarity=0.352  Sum_probs=68.6

Q ss_pred             ccccccCCCCCCCCCCCChHHHHHHHHH--HHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHH
Q 020402          238 KRSRLALRDPSRPKSNRSGYNFFFAEHY--ARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSE  312 (326)
Q Consensus       238 kk~k~k~kdp~~PKrP~SAY~lF~~e~r--~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~e  312 (326)
                      ..+..-++|..+.++|||||++|++.+|  ..+.+.||+.+ +.|++|+|+.|-.|.+.||+.|.+.|.+.|+.|.+.
T Consensus       170 dgrspnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn~DNrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfka  247 (683)
T KOG2746|consen  170 DGRSPNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPNQDNRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFKA  247 (683)
T ss_pred             ccCCCCcCcchhhhhhhHHHHHHHhhcCCccchhccCccccchhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhhh
Confidence            3444556777799999999999999999  89999999999 999999999999999999999999999999999876


No 21 
>PF14887 HMG_box_5:  HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=97.91  E-value=3.3e-05  Score=59.59  Aligned_cols=73  Identities=21%  Similarity=0.374  Sum_probs=58.8

Q ss_pred             CCCCCCChHHHHHHHHHHHhcccCCCChhHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhcCC
Q 020402          249 RPKSNRSGYNFFFAEHYARLKPHYYGQEKAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSSYD  321 (326)
Q Consensus       249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~~~  321 (326)
                      -|..|.+|--+|.+.......+.++.-..-..+.+...|++|++.+|.+|..+|.+|..+|+++|.+|++-..
T Consensus         3 lPE~PKt~qe~Wqq~vi~dYla~~~~dr~K~~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~~~   75 (85)
T PF14887_consen    3 LPETPKTAQEIWQQSVIGDYLAKFRNDRKKALKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSAPA   75 (85)
T ss_dssp             -S----THHHHHHHHHHHHHHHHTTSTHHHHHHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-CCC
T ss_pred             CCCCCCCHHHHHHHHHHHHHHHHhhHhHHHHHHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcCCC
Confidence            4678899999999998888888888766333568999999999999999999999999999999999998554


No 22 
>PF04690 YABBY:  YABBY protein;  InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=96.92  E-value=0.0013  Score=58.35  Aligned_cols=43  Identities=21%  Similarity=0.316  Sum_probs=39.1

Q ss_pred             CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCC
Q 020402          249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLT  291 (326)
Q Consensus       249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls  291 (326)
                      +-.|-+|||+.|++|.-++||+++|+++ +|+-+..++.|...+
T Consensus       121 KRqR~psaYn~f~k~ei~rik~~~p~ishkeaFs~aAknW~h~p  164 (170)
T PF04690_consen  121 KRQRVPSAYNRFMKEEIQRIKAENPDISHKEAFSAAAKNWAHFP  164 (170)
T ss_pred             ccCCCchhHHHHHHHHHHHHHhcCCCCCHHHHHHHHHHhhhhCc
Confidence            3347789999999999999999999999 999999999998765


No 23 
>PF06382 DUF1074:  Protein of unknown function (DUF1074);  InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=96.77  E-value=0.006  Score=54.28  Aligned_cols=48  Identities=23%  Similarity=0.392  Sum_probs=41.6

Q ss_pred             CChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHH
Q 020402          254 RSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKD  305 (326)
Q Consensus       254 ~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~d  305 (326)
                      .+||+-|+.+.+.+    |.++. .|+....+..|..|+++||..|..++...
T Consensus        83 nnaYLNFLReFRrk----h~~L~p~dlI~~AAraW~rLSe~eK~rYrr~~~~~  131 (183)
T PF06382_consen   83 NNAYLNFLREFRRK----HCGLSPQDLIQRAARAWCRLSEAEKNRYRRMAPSV  131 (183)
T ss_pred             chHHHHHHHHHHHH----ccCCCHHHHHHHHHHHHHhCCHHHHHHHHhhcchh
Confidence            57899999998874    56777 99999999999999999999999876543


No 24 
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=96.09  E-value=0.0064  Score=55.44  Aligned_cols=66  Identities=23%  Similarity=0.168  Sum_probs=58.7

Q ss_pred             CCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          249 RPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEML  314 (326)
Q Consensus       249 ~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~  314 (326)
                      +++.+.-+|.-+-.+.|..+...+|+.. .++.++++..|++|++.-|.+|.+.+.++++.|...|+
T Consensus       143 ~~~~~~~~~~e~~~~~r~~~~~~~~~~~~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~  209 (211)
T COG5648         143 PNKAPIGPFIENEPKIRPKVEGPSPDKALVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP  209 (211)
T ss_pred             CCCCCCchhhhccHHhccccCCCCcchhhhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence            5566677777788888888888899998 99999999999999999999999999999999987765


No 25 
>PF08073 CHDNT:  CHDNT (NUC034) domain;  InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases [].; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=88.25  E-value=0.45  Score=34.66  Aligned_cols=39  Identities=13%  Similarity=0.160  Sum_probs=34.9

Q ss_pred             CChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCH
Q 020402          254 RSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTE  292 (326)
Q Consensus       254 ~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~  292 (326)
                      .+-|-+|.+-.|+.|.+.||+.. ..+...++.+|++.++
T Consensus        13 lt~yK~Fsq~vRP~l~~~NPk~~~sKl~~l~~AKwrEF~~   52 (55)
T PF08073_consen   13 LTNYKAFSQHVRPLLAKANPKAPMSKLMMLLQAKWREFQE   52 (55)
T ss_pred             HHHHHHHHHHHHHHHHHHCCCCcHHHHHHHHHHHHHHHHh
Confidence            35688999999999999999999 9999999999987654


No 26 
>PF00249 Myb_DNA-binding:  Myb-like DNA-binding domain;  InterPro: IPR014778 The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family that specifically recognise the sequence YAAC(G/T)G [, ]. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding [].; PDB: 1X41_A 2XAF_B 2XAG_B 2XAH_B 2UXN_B 2Y48_B 2XAQ_B 2X0L_B 2IW5_B 2XAJ_B ....
Probab=83.20  E-value=2.8  Score=28.83  Aligned_cols=39  Identities=23%  Similarity=0.314  Sum_probs=27.9

Q ss_pred             hHHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402           80 HRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL  129 (326)
Q Consensus        80 ~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L  129 (326)
                      ..|...|...|.-       .|..||..|+.    +-...+++.+|.+||
T Consensus        10 ~~l~~~v~~~g~~-------~W~~Ia~~~~~----~Rt~~qc~~~~~~~~   48 (48)
T PF00249_consen   10 EKLLEAVKKYGKD-------NWKKIAKRMPG----GRTAKQCRSRYQNLL   48 (48)
T ss_dssp             HHHHHHHHHSTTT-------HHHHHHHHHSS----SSTHHHHHHHHHHHT
T ss_pred             HHHHHHHHHhCCc-------HHHHHHHHcCC----CCCHHHHHHHHHhhC
Confidence            3456666666643       79999999992    223448999999886


No 27 
>PF04769 MAT_Alpha1:  Mating-type protein MAT alpha 1;  InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation []. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway [].; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=82.03  E-value=2.7  Score=38.48  Aligned_cols=53  Identities=21%  Similarity=0.271  Sum_probs=37.4

Q ss_pred             CCCCCCCCCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHH
Q 020402          244 LRDPSRPKSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKG  302 (326)
Q Consensus       244 ~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A  302 (326)
                      +....++|||.|+||.|..=.-    ...++.. .+++..|+..|+.=+-  |..|.-+|
T Consensus        38 ~~~~~~~kr~lN~Fm~FRsyy~----~~~~~~~Qk~~S~~l~~lW~~dp~--k~~W~l~a   91 (201)
T PF04769_consen   38 KRSPEKAKRPLNGFMAFRSYYS----PIFPPLPQKELSGILTKLWEKDPF--KNKWSLMA   91 (201)
T ss_pred             cccccccccchhHHHHHHHHHH----hhcCCcCHHHHHHHHHHHHhCCcc--HhHHHHHh
Confidence            3344578999999999975544    4456666 9999999999987433  44444443


No 28 
>PF06244 DUF1014:  Protein of unknown function (DUF1014);  InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=80.15  E-value=1.6  Score=36.83  Aligned_cols=45  Identities=18%  Similarity=0.313  Sum_probs=39.4

Q ss_pred             CC-CCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHH
Q 020402          249 RP-KSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEA  293 (326)
Q Consensus       249 ~P-KrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~e  293 (326)
                      +| ||-.-||.-|....-++|++++|++- ..+..+|-..|..-+++
T Consensus        71 HPErR~KAAy~afeE~~Lp~lK~E~PgLrlsQ~kq~l~K~w~KSPeN  117 (122)
T PF06244_consen   71 HPERRMKAAYKAFEERRLPELKEENPGLRLSQYKQMLWKEWQKSPEN  117 (122)
T ss_pred             CcchhHHHHHHHHHHHHhHHHHhhCCCchHHHHHHHHHHHHhcCCCC
Confidence            44 45557899999999999999999999 99999999999887764


No 29 
>TIGR01624 LRP1_Cterm LRP1 C-terminal domain. This model represents a tightly conserved small domain found in LRP1 and related plant proteins. This family also contains a well-conserved putative zinc finger domain (TIGR01623). The rest of the sequence of most members consists of highly divergent, low-complexity sequence.
Probab=75.34  E-value=1.8  Score=30.70  Aligned_cols=31  Identities=32%  Similarity=0.557  Sum_probs=27.3

Q ss_pred             ceeeeecCcc-cCCceEEEeeccccccccccc
Q 020402          185 SVSGVIDGKF-DNGYLVTVNLGSEQLKGVLYH  215 (326)
Q Consensus       185 ~V~g~idg~f-d~gy~vtv~~gse~~~g~ly~  215 (326)
                      .|+++=||.- +..|-.+|+||--.++|+||-
T Consensus        16 Rvs~idd~~~~e~aYQt~V~IgGHvFkGiLyD   47 (50)
T TIGR01624        16 RVTAIDDGEQAEYAYQATVTIGGHVFKGFLHD   47 (50)
T ss_pred             EEeccCCCCCceEEEEEEEEECceEEeeEEec
Confidence            5777778876 779999999999999999996


No 30 
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.
Probab=68.89  E-value=14  Score=33.55  Aligned_cols=42  Identities=17%  Similarity=0.381  Sum_probs=35.7

Q ss_pred             hHHH-HHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHh
Q 020402          277 KAIS-KKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRS  318 (326)
Q Consensus       277 ~eis-k~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~  318 (326)
                      ..++ ..+|.-|+.+|+++|+.|.+.-.. ....|-..+..|..
T Consensus        67 ~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~  110 (198)
T TIGR03481        67 PAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAG  110 (198)
T ss_pred             HHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcC
Confidence            4454 478999999999999999987776 78899999999975


No 31 
>PF05142 DUF702:  Domain of unknown function (DUF702) ;  InterPro: IPR007818 This is a family of plant proteins of unknown function.
Probab=68.05  E-value=3.1  Score=36.45  Aligned_cols=32  Identities=38%  Similarity=0.655  Sum_probs=29.4

Q ss_pred             ceeeeecCcccCCceEEEeecccccccccccc
Q 020402          185 SVSGVIDGKFDNGYLVTVNLGSEQLKGVLYHI  216 (326)
Q Consensus       185 ~V~g~idg~fd~gy~vtv~~gse~~~g~ly~~  216 (326)
                      .|+++=||.-+..|-.+|+||--.|+|+||--
T Consensus       118 RVssiDdgedE~AYQTaV~IGGHVFKGiLYDq  149 (154)
T PF05142_consen  118 RVSSIDDGEDEYAYQTAVNIGGHVFKGILYDQ  149 (154)
T ss_pred             EEecccCcccceeeEEeEEECCEEeeeeeecc
Confidence            48888899999999999999999999999974


No 32 
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=64.85  E-value=15  Score=33.56  Aligned_cols=46  Identities=17%  Similarity=0.200  Sum_probs=37.0

Q ss_pred             CCCh-hHHH-HHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHh
Q 020402          273 YGQE-KAIS-KKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRS  318 (326)
Q Consensus       273 p~~~-~eis-k~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~  318 (326)
                      |..+ ..++ ..+|.-|+.+++++|+.|.+.-.. ....|-..+.+|..
T Consensus        66 p~~Df~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~  114 (211)
T PRK15117         66 PYVQVKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHG  114 (211)
T ss_pred             ccCCHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCC
Confidence            4455 5554 478999999999999999876655 77889999999975


No 33 
>PF12881 NUT_N:  NUT protein N terminus;  InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=62.34  E-value=14  Score=35.96  Aligned_cols=63  Identities=13%  Similarity=0.048  Sum_probs=42.0

Q ss_pred             HHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHH-HHHHHHHHHhc
Q 020402          257 YNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKGLKDKER-YKSEMLEYRSS  319 (326)
Q Consensus       257 Y~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dker-Y~~em~~Yk~~  319 (326)
                      +-.|+.-.-..+....|.+. .|-....-+.|...|.-+|..|+|+|++=+|= -++||+.-+-+
T Consensus       232 lSCFLIpvLrsLar~kPtMtlEeGl~ra~qEW~~~SnfdRmifyemaekFmEFEaeEEmq~q~lq  296 (328)
T PF12881_consen  232 LSCFLIPVLRSLARLKPTMTLEEGLWRAVQEWQHTSNFDRMIFYEMAEKFMEFEAEEEMQIQKLQ  296 (328)
T ss_pred             hhhhHHHHHHHHHhcCCCccHHHHHHHHHHHhhccccccHHHHHHHHHHHccCCcHHHHHHHHHH
Confidence            33333333333444567777 77777888999999999999999999985442 12455555443


No 34 
>PF09441 Abp2:  ARS binding protein 2;  InterPro: IPR018562  This DNA-binding protein binds to the autonomously replicating sequence (ARS) binding element. It may play a role in regulating the cell cycle response to stress signals []. 
Probab=60.01  E-value=18  Score=32.05  Aligned_cols=41  Identities=15%  Similarity=0.327  Sum_probs=34.8

Q ss_pred             CCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCC
Q 020402           69 VPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTT  113 (326)
Q Consensus        69 ~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~  113 (326)
                      +|.-+||..+.|.||..|.++-.-    .-+.|.++|-.||+.+.
T Consensus        45 pPkS~Gk~Fs~~~Lf~LI~k~~~k----eikTW~~La~~LGVepp   85 (175)
T PF09441_consen   45 PPKSDGKSFSTFTLFELIRKLESK----EIKTWAQLALELGVEPP   85 (175)
T ss_pred             CCCcCCccchHHHHHHHHHHHhhh----hHhHHHHHHHHhCCCCC
Confidence            899999999999999999976432    34689999999999654


No 35 
>PF13921 Myb_DNA-bind_6:  Myb-like DNA-binding domain; PDB: 1A5J_A 1MBH_A 1GV5_A 1H89_C 1IDY_A 1MBK_A 1IDZ_A 1H88_C 1GVD_A 1MBG_A ....
Probab=56.92  E-value=20  Score=25.43  Aligned_cols=36  Identities=19%  Similarity=0.280  Sum_probs=23.0

Q ss_pred             HHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402           81 RLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL  129 (326)
Q Consensus        81 ~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L  129 (326)
                      .|...|...|.        .|..||..|| ..    ...+++..|.++|
T Consensus         8 ~L~~~~~~~g~--------~W~~Ia~~l~-~R----t~~~~~~r~~~~l   43 (60)
T PF13921_consen    8 LLLELVKKYGN--------DWKKIAEHLG-NR----TPKQCRNRWRNHL   43 (60)
T ss_dssp             HHHHHHHHHTS---------HHHHHHHST-TS-----HHHHHHHHHHTT
T ss_pred             HHHHHHHHHCc--------CHHHHHHHHC-cC----CHHHHHHHHHHHC
Confidence            35555555553        6999999996 11    2347788888766


No 36 
>cd00167 SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA.
Probab=55.36  E-value=26  Score=22.36  Aligned_cols=37  Identities=19%  Similarity=0.277  Sum_probs=24.1

Q ss_pred             HHHHHHHhcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402           81 RLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL  129 (326)
Q Consensus        81 ~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L  129 (326)
                      .|...|...|-       ..|..|+..|+--     .+..++.+|..++
T Consensus         9 ~l~~~~~~~g~-------~~w~~Ia~~~~~r-----s~~~~~~~~~~~~   45 (45)
T cd00167           9 LLLEAVKKYGK-------NNWEKIAKELPGR-----TPKQCRERWRNLL   45 (45)
T ss_pred             HHHHHHHHHCc-------CCHHHHHhHcCCC-----CHHHHHHHHHHhC
Confidence            45555555552       5799999998641     2337788877653


No 37 
>PF05494 Tol_Tol_Ttg2:  Toluene tolerance, Ttg2 ;  InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation []. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters [].; PDB: 2QGU_A.
Probab=55.03  E-value=30  Score=30.07  Aligned_cols=42  Identities=19%  Similarity=0.361  Sum_probs=32.1

Q ss_pred             hHHHHHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHh
Q 020402          277 KAISKKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRS  318 (326)
Q Consensus       277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~  318 (326)
                      .-....+|.-|+.++++||+.|.+.-.+ ....|-..+..|..
T Consensus        42 ~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~   84 (170)
T PF05494_consen   42 RMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG   84 (170)
T ss_dssp             HHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred             HHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence            4445678899999999999999876655 67788888888875


No 38 
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=52.61  E-value=6.8  Score=35.64  Aligned_cols=52  Identities=17%  Similarity=0.339  Sum_probs=42.0

Q ss_pred             CCC-CCCCChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHHHHHHH
Q 020402          248 SRP-KSNRSGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQVYQEKG  302 (326)
Q Consensus       248 ~~P-KrP~SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~Y~e~A  302 (326)
                      .+| ||=+-||.-|-...-++|+.++|++. .++-.+|-..|..-+|+   ||.+++
T Consensus       162 rHPEkRmrAA~~afEe~~LPrLK~e~P~lrlsQ~Kqll~Kew~KsPDN---P~Nq~~  215 (221)
T KOG3223|consen  162 RHPEKRMRAAFKAFEEARLPRLKKENPGLRLSQYKQLLKKEWQKSPDN---PFNQAA  215 (221)
T ss_pred             cChHHHHHHHHHHHHHhhchhhhhcCCCccHHHHHHHHHHHHhhCCCC---hhhHHh
Confidence            455 45567788898888899999999999 99999999999988875   454443


No 39 
>PF11304 DUF3106:  Protein of unknown function (DUF3106);  InterPro: IPR021455  Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known. 
Probab=49.94  E-value=57  Score=26.69  Aligned_cols=40  Identities=10%  Similarity=0.318  Sum_probs=18.7

Q ss_pred             HHHHHHHHHhccCCHHHHHHHHHHHH-------HHHHHHHHHHHHHH
Q 020402          278 AISKKIGVLWSNLTEAEKQVYQEKGL-------KDKERYKSEMLEYR  317 (326)
Q Consensus       278 eisk~ige~Wk~Ls~eeK~~Y~e~A~-------~dkerY~~em~~Yk  317 (326)
                      ++..-+.+.|+.|+++.|..+.+.|.       .+++++..-|..|.
T Consensus        11 ~~L~pl~~~W~~l~~~qr~k~l~~a~r~~~mspeqq~r~~~rm~~W~   57 (107)
T PF11304_consen   11 QALAPLAERWNSLPPEQRRKWLQIAERWPSMSPEQQQRLRERMRRWA   57 (107)
T ss_pred             HHHHHHHHHHhcCCHHHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHH
Confidence            33444455555555555554444432       24444444444444


No 40 
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=43.56  E-value=34  Score=31.46  Aligned_cols=43  Identities=14%  Similarity=0.291  Sum_probs=36.8

Q ss_pred             hHHHHHHHHHhccCCHHHHHHHHHHHHH-HHHHHHHHHHHHHhc
Q 020402          277 KAISKKIGVLWSNLTEAEKQVYQEKGLK-DKERYKSEMLEYRSS  319 (326)
Q Consensus       277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~-dkerY~~em~~Yk~~  319 (326)
                      ..-...+|.-|+.+|+|+++.|.+.-.. ....|-..|.+|+.+
T Consensus        74 ~~a~~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q  117 (202)
T COG2854          74 YAAKLVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQ  117 (202)
T ss_pred             HHHHHHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCC
Confidence            5567788999999999999999876655 778899999999874


No 41 
>KOG0493 consensus Transcription factor Engrailed, contains HOX domain [General function prediction only]
Probab=42.11  E-value=40  Score=32.28  Aligned_cols=42  Identities=29%  Similarity=0.518  Sum_probs=23.2

Q ss_pred             CCCCCcchhhhccccccCCCCCCCCCCCChHHHHHHHHHHHhcccCCC
Q 020402          227 NNSAAPTHRRRKRSRLALRDPSRPKSNRSGYNFFFAEHYARLKPHYYG  274 (326)
Q Consensus       227 ~~~a~~~rrkkkk~k~k~kdp~~PKrP~SAY~lF~~e~r~~lk~~~p~  274 (326)
                      +|++.+|-||-||++-..+   -=|||++||   ..|+-++||.++..
T Consensus       226 RPSsGPR~Rk~kkkk~~~~---eeKRPRTAF---taeQL~RLK~EF~e  267 (342)
T KOG0493|consen  226 RPSSGPRHRKPKKKKSSSK---EEKRPRTAF---TAEQLQRLKAEFQE  267 (342)
T ss_pred             CCCCCcccccccccCCccc---hhcCccccc---cHHHHHHHHHHHhh
Confidence            4555555444333332222   336889985   46777777776543


No 42 
>PF13873 Myb_DNA-bind_5:  Myb/SANT-like DNA-binding domain
Probab=40.80  E-value=36  Score=25.51  Aligned_cols=54  Identities=13%  Similarity=0.218  Sum_probs=33.5

Q ss_pred             cchhHHHHHHHhc---CcchhhcccccHHHHHHHhCC-CCCCCcHHHHHHHHHHHhhHH
Q 020402           77 LDLHRLFVEVTSR---GGLGKVIRDRRWKEVVVVFNF-PTTITSASFVLRKYYLSLLYH  131 (326)
Q Consensus        77 lDL~~Ly~~V~~r---GG~~~V~~~~~W~eVa~~l~~-p~~~~~as~~Lk~~Y~k~L~~  131 (326)
                      |+|..-|..|..-   ++.....+...|.+|+..|+- ++. .--..+|++.|..+...
T Consensus        14 v~~v~~~~~il~~k~~~~~~~~~k~~~W~~I~~~lN~~~~~-~Rs~~~lkkkW~nlk~~   71 (78)
T PF13873_consen   14 VELVEKHKDILENKFSDSVSNKEKRKAWEEIAEELNALGPG-KRSWKQLKKKWKNLKSK   71 (78)
T ss_pred             HHHHHHhHHHHhcccccHHHHHHHHHHHHHHHHHHHhcCCC-CCCHHHHHHHHHHHHHH
Confidence            4455555555543   222333456689999999963 333 44445899999887653


No 43 
>PF12776 Myb_DNA-bind_3:  Myb/SANT-like DNA-binding domain;  InterPro: IPR024752 This domain, found in a range of uncharacterised proteins, may be related to Myb/SANT-like DNA binding domains.
Probab=37.92  E-value=64  Score=24.83  Aligned_cols=61  Identities=18%  Similarity=0.290  Sum_probs=42.5

Q ss_pred             hhHHHHHHHhcCcc--hhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhhHHhhhhhhhc
Q 020402           79 LHRLFVEVTSRGGL--GKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLLYHFEQVYYFR  139 (326)
Q Consensus        79 L~~Ly~~V~~rGG~--~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L~~fE~~~~~~  139 (326)
                      |..|+.+....|..  ...-....|..|+.+|.-.....-...+|+..|..+=..|..+....
T Consensus        10 ll~~~~e~~~~g~~~~~~~fk~~~w~~i~~~~~~~~~~~~t~~qlknk~~~lk~~y~~~~~l~   72 (96)
T PF12776_consen   10 LLDLLIEQINKGNRPTNGGFKKEGWNNIAEEFNEKTGLNYTKKQLKNKWKTLKKDYRIWKELR   72 (96)
T ss_pred             HHHHHHHHHHhCCCCCCCCcCHHHHHHHHHHHHHHhCCcccHHHHHHHHHHHHHHHHHHHHHH
Confidence            44555666667777  34445558999999998644433334589999999999988876554


No 44 
>PF12650 DUF3784:  Domain of unknown function (DUF3784);  InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=37.91  E-value=23  Score=28.09  Aligned_cols=17  Identities=24%  Similarity=0.499  Sum_probs=14.2

Q ss_pred             HhccCCHHHHHHHHHHH
Q 020402          286 LWSNLTEAEKQVYQEKG  302 (326)
Q Consensus       286 ~Wk~Ls~eeK~~Y~e~A  302 (326)
                      -||.||+|||+.|.++.
T Consensus        25 Gyntms~eEk~~~D~~~   41 (97)
T PF12650_consen   25 GYNTMSKEEKEKYDKKK   41 (97)
T ss_pred             hcccCCHHHHHHhhHHH
Confidence            48999999999997644


No 45 
>smart00717 SANT SANT  SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains.
Probab=37.39  E-value=67  Score=20.65  Aligned_cols=26  Identities=15%  Similarity=0.328  Sum_probs=18.5

Q ss_pred             ccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402           99 RRWKEVVVVFNFPTTITSASFVLRKYYLSLL  129 (326)
Q Consensus        99 ~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L  129 (326)
                      ..|..|+..|+     +-....++..|..++
T Consensus        22 ~~w~~Ia~~~~-----~rt~~~~~~~~~~~~   47 (49)
T smart00717       22 NNWEKIAKELP-----GRTAEQCRERWNNLL   47 (49)
T ss_pred             CCHHHHHHHcC-----CCCHHHHHHHHHHHc
Confidence            57999999997     122337788887765


No 46 
>PF13875 DUF4202:  Domain of unknown function (DUF4202)
Probab=36.56  E-value=54  Score=29.74  Aligned_cols=40  Identities=10%  Similarity=0.274  Sum_probs=33.7

Q ss_pred             ChHHHHHHHHHHHhcccCCCCh-hHHHHHHHHHhccCCHHHHHH
Q 020402          255 SGYNFFFAEHYARLKPHYYGQE-KAISKKIGVLWSNLTEAEKQV  297 (326)
Q Consensus       255 SAY~lF~~e~r~~lk~~~p~~~-~eisk~ige~Wk~Ls~eeK~~  297 (326)
                      -+-++|+..+.+.+...|   + ..+..++...|+.||++-++.
T Consensus       130 vacLVFL~~~f~~F~~~~---deeK~v~Il~KTw~KMS~~g~~~  170 (185)
T PF13875_consen  130 VACLVFLEYYFEDFAAKH---DEEKIVDILRKTWRKMSERGHEA  170 (185)
T ss_pred             hHHHHhHHHHHHHHHhcC---CHHHHHHHHHHHHHHCCHHHHHH
Confidence            357889999999998888   4 778899999999999988754


No 47 
>PF10545 MADF_DNA_bdg:  Alcohol dehydrogenase transcription factor Myb/SANT-like;  InterPro: IPR006578 The MADF (myb/SANT-like domain in Adf-1) domain is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain []. MADF is related to the Myb DNA-binding domain (IPR001005 from INTERPRO). The retroviral oncogene v-myb, and its cellular counterpart c-myb, are nuclear DNA-binding proteins that specifically recognise the sequence YAAC(G/T)G. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Some proteins known to contain a MADF domain are listed below:    Drosophila Adf-1, a transcription factor first identified on the basis of its interaction with the alcohol dehydrogenase promoter but that binds the promoters of a diverse group of genes [].  Drosophila Dorsal-interacting protein 3 (Dip3), which functions both as an activator to bind DNA in a sequence specific manner and a coactivator to stimulate synergistic activation by Dorsal and Twist [].  Drosophila Stonewall (Stwl), a putative transcription factor required for maintenance of female germline stem cells as well as oocyte differentiation.   
Probab=35.79  E-value=37  Score=25.14  Aligned_cols=38  Identities=21%  Similarity=0.377  Sum_probs=23.2

Q ss_pred             cccccHHHHHHHhC--CCCC-CCcHHHHHHHHHHHhhHHhh
Q 020402           96 IRDRRWKEVVVVFN--FPTT-ITSASFVLRKYYLSLLYHFE  133 (326)
Q Consensus        96 ~~~~~W~eVa~~l~--~p~~-~~~as~~Lk~~Y~k~L~~fE  133 (326)
                      .+.+.|.+|+..||  ++.. +...-..||..|.+.+...+
T Consensus        24 ~r~~aw~~Ia~~l~~~~~~~~~~~~w~~Lr~~y~~~~~~~~   64 (85)
T PF10545_consen   24 LREEAWQEIARELGKEFSVDDCKKRWKNLRDRYRRELKKIK   64 (85)
T ss_pred             HHHHHHHHHHHHHccchhHHHHHHHHHHHHHHHHHHHHHHh
Confidence            45678999999998  4422 22333345555555555554


No 48 
>PF02337 Gag_p10:  Retroviral GAG p10 protein;  InterPro: IPR003322 Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes []. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds, their primary sequences can be very different. This entry represents matrix proteins from beta-retroviruses such as Mason-Pfizer monkey virus (MPMV) (Simian Mason-Pfizer virus) and Mouse mammary tumor virus (MMTV) [, ]. This entry also identifies matrix proteins from several eukaryotic endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome [].; GO: 0005198 structural molecule activity, 0019028 viral capsid; PDB: 2F77_X 2F76_X.
Probab=33.08  E-value=1.4e+02  Score=23.82  Aligned_cols=54  Identities=19%  Similarity=0.103  Sum_probs=36.8

Q ss_pred             HHHHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcCcchhhcc---cccHHHHHHHhCC
Q 020402           50 DLFWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRGGLGKVIR---DRRWKEVVVVFNF  110 (326)
Q Consensus        50 ~~F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~---~~~W~eVa~~l~~  110 (326)
                      +.|+..|+.++..+|..+       +.=||-.+|..+.+..=+-.+..   -..|..|++.|.-
T Consensus         8 ~~fv~~Lk~lLk~rGi~v-------~~~~L~~f~~~i~~~~PWF~~eG~l~~~~W~kvG~~l~~   64 (90)
T PF02337_consen    8 QPFVSILKHLLKERGIRV-------KKKDLINFLSFIDKVCPWFPEEGTLDLDNWKKVGEELKR   64 (90)
T ss_dssp             HHHHHHHHHHHHCCT-----------HHHHHHHHHHHHHHTT-SS--SS-HHHHHHHHHHHHHH
T ss_pred             hHHHHHHHHHHHHcCeee-------cHHHHHHHHHHHHHhCCCCCCCCCcCHHHHHHHHHHHHH
Confidence            789999999999999998       44577888888876554444333   3589999998843


No 49 
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=32.70  E-value=1e+02  Score=25.67  Aligned_cols=41  Identities=17%  Similarity=0.152  Sum_probs=35.9

Q ss_pred             HHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402          279 ISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS  319 (326)
Q Consensus       279 isk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~  319 (326)
                      -.+.+-+.|+.|++++++......+...+-|.+-+++|-.+
T Consensus        88 ~~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~  128 (135)
T PRK09706         88 DQKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKA  128 (135)
T ss_pred             HHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            35778889999999999999999999999999988888654


No 50 
>smart00595 MADF subfamily of SANT domain.
Probab=31.22  E-value=34  Score=26.15  Aligned_cols=42  Identities=17%  Similarity=0.236  Sum_probs=28.8

Q ss_pred             hhcccccHHHHHHHhCCCCC-CCcHHHHHHHHHHHhhHHhhhh
Q 020402           94 KVIRDRRWKEVVVVFNFPTT-ITSASFVLRKYYLSLLYHFEQV  135 (326)
Q Consensus        94 ~V~~~~~W~eVa~~l~~p~~-~~~as~~Lk~~Y~k~L~~fE~~  135 (326)
                      ...+...|.+|+..||.+.. |..-=..||..|.+.+......
T Consensus        23 ~~~r~~aW~~Ia~~l~~~~~~~~~kw~~LR~~y~~e~~r~~~~   65 (89)
T smart00595       23 KEEKRKAWEEIAEELGLSVEECKKRWKNLRDRYRRELKRLQNG   65 (89)
T ss_pred             hHHHHHHHHHHHHHHCcCHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            34456699999999999543 3333446788888877766553


No 51 
>PF05066 HARE-HTH:  HB1, ASXL, restriction endonuclease HTH domain;  InterPro: IPR007759 DNA-directed RNA polymerases (EC 2.7.7.6) (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:  RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors.  RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. The delta protein is a dispensable subunit of Bacillus subtilis RNA polymerase (RNAP) that has major effects on the biochemical properties of the purified enzyme. In the presence of delta, RNAP displays an increased specificity of transcription, a decreased affinity for nucleic acids, and an increased efficiency of RNA synthesis because of enhanced recycling []. The delta protein contains two distinct regions: an N-terminal domain and a glutamate and aspartate residue-rich C-terminal region [].; GO: 0003677 DNA binding, 0006351 transcription, DNA-dependent; PDB: 2KRC_A.
Probab=30.45  E-value=83  Score=23.28  Aligned_cols=43  Identities=14%  Similarity=0.228  Sum_probs=25.5

Q ss_pred             HHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHH
Q 020402           52 FWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVV  105 (326)
Q Consensus        52 F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa  105 (326)
                      |++...+-+++.|          +++....|+..|.++|++... ...-|..|+
T Consensus         3 ~~eaa~~vL~~~~----------~pm~~~eI~~~i~~~~~~~~~-~k~p~~~i~   45 (72)
T PF05066_consen    3 FKEAAYEVLEEAG----------RPMTFKEIWEEIQERGLYKKS-GKTPEATIA   45 (72)
T ss_dssp             HHHHHHHHHHHH-----------S-EEHHHHHHHHHHHHTS----GGGGGHHHH
T ss_pred             HHHHHHHHHHhcC----------CCcCHHHHHHHHHHhCCCCcc-cCCHHHHHH
Confidence            4455555555544          558899999999999999987 223344443


No 52 
>PRK10236 hypothetical protein; Provisional
Probab=29.88  E-value=51  Score=31.00  Aligned_cols=45  Identities=20%  Similarity=0.312  Sum_probs=30.2

Q ss_pred             HHHHHHHHHHHhcccCCC-C-----h-hHHHHHHHHHhccCCHHHHHHHHHH
Q 020402          257 YNFFFAEHYARLKPHYYG-Q-----E-KAISKKIGVLWSNLTEAEKQVYQEK  301 (326)
Q Consensus       257 Y~lF~~e~r~~lk~~~p~-~-----~-~eisk~ige~Wk~Ls~eeK~~Y~e~  301 (326)
                      |-=-..+....+|-.+.. .     + .-+.+.+.+.|+.|+++|++.+.+.
T Consensus        89 YreIL~DVc~~LKV~y~~~~st~~iE~~il~kll~~a~~kms~eE~~~L~~~  140 (237)
T PRK10236         89 YRAILLDVSKRLKLKADKEMSTFEIEQQLLEQFLRNTWKKMDEEHKQEFLHA  140 (237)
T ss_pred             HHHHHHHHHHHcCCCCCCCCCHHHHHHHHHHHHHHHHHHHCCHHHHHHHHHH
Confidence            333444555555554333 1     2 4478999999999999999988653


No 53 
>PF04967 HTH_10:  HTH DNA binding domain;  InterPro: IPR007050 Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. This entry represents the HTH DNA binding domain found in Halobacterium salinarium (Halobacterium halobium) and described as a putative bacterio-opsin activator. 
Probab=28.19  E-value=77  Score=22.77  Aligned_cols=40  Identities=20%  Similarity=0.130  Sum_probs=33.0

Q ss_pred             hcCcchhhcccccHHHHHHHhCCCCCCCcHHHHHHHHHHHhh
Q 020402           88 SRGGLGKVIRDRRWKEVVVVFNFPTTITSASFVLRKYYLSLL  129 (326)
Q Consensus        88 ~rGG~~~V~~~~~W~eVa~~l~~p~~~~~as~~Lk~~Y~k~L  129 (326)
                      -..||-.+-++-.=.+||..||++.  +.++..||+.-.+++
T Consensus        13 ~~~GYfd~PR~~tl~elA~~lgis~--st~~~~LRrae~kli   52 (53)
T PF04967_consen   13 YELGYFDVPRRITLEELAEELGISK--STVSEHLRRAERKLI   52 (53)
T ss_pred             HHcCCCCCCCcCCHHHHHHHhCCCH--HHHHHHHHHHHHHHh
Confidence            3568888888889999999999995  467788998877765


No 54 
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptors for TRAP family transporters. Others are the SnoM family (see pfam03480) and TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=26.97  E-value=1.1e+02  Score=28.19  Aligned_cols=28  Identities=18%  Similarity=0.312  Sum_probs=21.3

Q ss_pred             HHHhccCCHHHHHHHHHHHHHHHHHHHH
Q 020402          284 GVLWSNLTEAEKQVYQEKGLKDKERYKS  311 (326)
Q Consensus       284 ge~Wk~Ls~eeK~~Y~e~A~~dkerY~~  311 (326)
                      .+.|+.|+++.|+...+.+.+.-+....
T Consensus       213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~~  240 (257)
T TIGR00787       213 KAFWKSLPPDLQAVVKEAAKEAGEYQRK  240 (257)
T ss_pred             HHHHhcCCHHHHHHHHHHHHHHHHHHHH
Confidence            4679999999999998877665444433


No 55 
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=26.03  E-value=1e+02  Score=30.27  Aligned_cols=43  Identities=14%  Similarity=0.224  Sum_probs=33.4

Q ss_pred             hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHhc
Q 020402          277 KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEYRSS  319 (326)
Q Consensus       277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Yk~~  319 (326)
                      ..+.-+-...|..|++++|+...+.+.+..+...+...+.++.
T Consensus       237 ~~~~~~s~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e~~  279 (332)
T COG1638         237 PLAVLVSKAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELEDE  279 (332)
T ss_pred             ceeeEEcHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4455556678999999999999999988877777766666553


No 56 
>PRK02363 DNA-directed RNA polymerase subunit delta; Reviewed
Probab=24.52  E-value=60  Score=27.64  Aligned_cols=63  Identities=13%  Similarity=0.081  Sum_probs=44.2

Q ss_pred             HHHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcCcchhhcccccHHHHHHHhCCCCC---CCcHHHHHHH
Q 020402           51 LFWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRGGLGKVIRDRRWKEVVVVFNFPTT---ITSASFVLRK  123 (326)
Q Consensus        51 ~F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rGG~~~V~~~~~W~eVa~~l~~p~~---~~~as~~Lk~  123 (326)
                      .+++.-..++..+|          +++.++.|+.+|.+..|+..-....+=.++...|.+...   ++...+.||.
T Consensus         4 S~idvAy~iL~~~~----------~~m~f~dL~~ev~~~~~~s~e~~~~~iaq~YtdLn~DGRFi~lG~n~WgLr~   69 (129)
T PRK02363          4 SLIEVAYEILKEKK----------EPMSFYDLVNEIQKYLGKSDEEIRERIAQFYTDLNLDGRFISLGDNKWGLRS   69 (129)
T ss_pred             cHHHHHHHHHHHcC----------CcccHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHhccCCeeEcCCCceeccc
Confidence            45566666676654          458899999999999998765555667777777777654   4455555665


No 57 
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=24.52  E-value=1.3e+02  Score=26.62  Aligned_cols=32  Identities=19%  Similarity=0.276  Sum_probs=24.2

Q ss_pred             HHHHHHHHHhccCCHHHHHHHHHHHHHHHHHH
Q 020402          278 AISKKIGVLWSNLTEAEKQVYQEKGLKDKERY  309 (326)
Q Consensus       278 eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY  309 (326)
                      +..+..-++++-|++|+|..|.+.-++-.++.
T Consensus       118 ~~~~~~~qmy~lLTPEQra~l~~~~e~r~~~~  149 (162)
T PRK12751        118 EMAKVRNQMYNLLTPEQKEALNKKHQERIEKL  149 (162)
T ss_pred             HHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHH
Confidence            34566678889999999999988766654444


No 58 
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=23.05  E-value=2e+02  Score=25.52  Aligned_cols=36  Identities=22%  Similarity=0.232  Sum_probs=28.3

Q ss_pred             HHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHH
Q 020402          278 AISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEM  313 (326)
Q Consensus       278 eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em  313 (326)
                      +..++.-+.+.-|++|+|..|.+.-.+-.+.+...+
T Consensus       125 ~~~~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~  160 (170)
T PRK12750        125 KMLEKRHQMLSILTPEQKAKFQELQQERMQECQDKM  160 (170)
T ss_pred             HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence            344556678999999999999998877777776655


No 59 
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=22.10  E-value=1.7e+02  Score=26.10  Aligned_cols=39  Identities=18%  Similarity=0.298  Sum_probs=30.1

Q ss_pred             hHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020402          277 KAISKKIGVLWSNLTEAEKQVYQEKGLKDKERYKSEMLEY  316 (326)
Q Consensus       277 ~eisk~ige~Wk~Ls~eeK~~Y~e~A~~dkerY~~em~~Y  316 (326)
                      .++.++--++.+-|++|+|..|.+..++-.+++.. +..+
T Consensus       111 Vem~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~~-~~~~  149 (166)
T PRK10363        111 VEMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLRD-VTQW  149 (166)
T ss_pred             HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHH-HHhc
Confidence            56777788999999999999998887777666644 4433


No 60 
>cd07268 Glo_EDI_BRP_like_4 This conserved domain belongs to a superfamily including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. This protein family belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which shows no domain swapping.
Probab=22.02  E-value=43  Score=29.34  Aligned_cols=49  Identities=10%  Similarity=0.079  Sum_probs=39.6

Q ss_pred             chhhhhcHHHHHHHHHHHHHhcCCCCCCCccCCeecchhHHHHHHHhcC
Q 020402           42 YEDIAQSSDLFWATLEAFHKSFGDKFKVPTVGGKALDLHRLFVEVTSRG   90 (326)
Q Consensus        42 ~e~~~~~~~~F~~~L~~F~~~rG~~l~~P~i~gk~lDL~~Ly~~V~~rG   90 (326)
                      |-++-.......+.++.-+.+.|+-+.-=+|+||+|-||+|..-+.-.|
T Consensus         4 HialR~n~~~~A~~w~~~l~~~G~llSen~INGRPI~l~~L~qPl~~~~   52 (149)
T cd07268           4 HIALRVNENQTAERWKEGLLQCGELLSENEINGRPIALIKLEKPLQFAG   52 (149)
T ss_pred             eEEEeeCCHHHHHHHHHHHHHhchhhhccccCCeeEEEEEcCCCceeCC
Confidence            4444455567888999999999999988899999999999987766444


No 61 
>PF13725 tRNA_bind_2:  Possible tRNA binding domain; PDB: 2ZPA_B.
Probab=20.94  E-value=59  Score=25.59  Aligned_cols=20  Identities=25%  Similarity=0.619  Sum_probs=13.8

Q ss_pred             hhhcccccHHHHHHHhCCCC
Q 020402           93 GKVIRDRRWKEVVVVFNFPT  112 (326)
Q Consensus        93 ~~V~~~~~W~eVa~~l~~p~  112 (326)
                      .+|.+.+-|.+||+.|+++.
T Consensus        78 ~k~LQ~ksw~~~a~~l~l~g   97 (101)
T PF13725_consen   78 AKGLQGKSWEEVAKELGLPG   97 (101)
T ss_dssp             HHHCS---HHHHHHHCT-SS
T ss_pred             HHHHCCCCHHHHHHHcCCCC
Confidence            46778899999999999985


No 62 
>PLN00131 hypothetical protein; Provisional
Probab=20.43  E-value=3.3e+02  Score=24.28  Aligned_cols=56  Identities=16%  Similarity=0.201  Sum_probs=27.1

Q ss_pred             CCCCCCCCCCCCC-------CCCCCCC-CCCCC-CCCCCCCCCCCchhhhhcHHHHHHHHHHHHH
Q 020402            6 LNGQKSSATTSNS-------NSNSNNN-NNNNK-ASSYYPPPTAKYEDIAQSSDLFWATLEAFHK   61 (326)
Q Consensus         6 ~~~~~~~~~~~~~-------~~~~~~~-~~~~~-~~~~~p~~~~~~e~~~~~~~~F~~~L~~F~~   61 (326)
                      +||++|-+++|--       ++.+.-+ |++.. ---.-|.|....+++...+...-+.|+-.+.
T Consensus        83 iGG~GS~~~~SrrP~~DLNstpqpeldlnqpaaheqepepapplddqdlltkrkrvseelrlllq  147 (218)
T PLN00131         83 IGGGGSDAGPSRRPVLDLNSTPQPELDLNQPAAHEQEPEPAPPLDDQDLLTKRKRVSEELRLLLQ  147 (218)
T ss_pred             ecCCCCCCCcCcCCcccCCCCCCcccccCCccccccCCCCCCCCCcHHHHHHHHHHHHHHHHHHH
Confidence            6899998877632       2222222 22221 1122333334445555666666666655443


No 63 
>cd05694 S1_Rrp5_repeat_hs2_sc2 S1_Rrp5_repeat_hs2_sc2: Rrp5 is a trans-acting factor important for biogenesis of both the 40S and 60S eukaryotic ribosomal subunits. Rrp5 has two distinct regions, an N-terminal region containing tandemly repeated S1 RNA-binding domains (12 S1 repeats in Saccharomyces cerevisiae Rrp5 and 14 S1 repeats in Homo sapiens Rrp5) and a C-terminal region containing tetratricopeptide repeat (TPR) motifs thought to be involved in protein-protein interactions. Mutational studies have shown that each region represents a specific functional domain. Deletions within the S1-containing region inhibit pre-rRNA processing at either site A3 or A2, whereas deletions within the TPR region confer an inability to support cleavage of A0-A2. This CD includes H. sapiens S1 repeat 2 (hs2) and S. cerevisiae S1 repeat 2 (sc2). Rrp5 is found in eukaryotes but not in prokaryotes or archaea.
Probab=20.35  E-value=73  Score=23.99  Aligned_cols=31  Identities=26%  Similarity=0.507  Sum_probs=23.5

Q ss_pred             CCCceeeeecCcccCCceEEEeecccccccccc
Q 020402          182 IGCSVSGVIDGKFDNGYLVTVNLGSEQLKGVLY  214 (326)
Q Consensus       182 ~~~~V~g~idg~fd~gy~vtv~~gse~~~g~ly  214 (326)
                      .|..|.|+|-..-||||+|.+.+  +.+.|.|-
T Consensus         4 ~G~~v~g~V~si~d~G~~v~~g~--~gv~Gfl~   34 (74)
T cd05694           4 EGMVLSGCVSSVEDHGYILDIGI--PGTTGFLP   34 (74)
T ss_pred             CCCEEEEEEEEEeCCEEEEEeCC--CCcEEEEE
Confidence            46679999999999999988743  34677543


Done!