Query         032803
Match_columns 133
No_of_seqs    203 out of 1016
Neff          7.3 
Searched_HMMs 46136
Date          Fri Mar 29 06:09:47 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/032803.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/032803hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PTZ00199 high mobility group p  99.9 2.4E-26 5.1E-31  154.5  11.8   89   18-107     3-93  (94)
  2 cd01389 MATA_HMG-box MATA_HMG-  99.9 4.7E-22   1E-26  128.7   8.2   71   37-108     1-71  (77)
  3 cd01388 SOX-TCF_HMG-box SOX-TC  99.9 1.3E-21 2.7E-26  125.2   8.3   70   37-107     1-70  (72)
  4 PF00505 HMG_box:  HMG (high mo  99.9 5.9E-21 1.3E-25  120.2   9.5   69   38-107     1-69  (69)
  5 cd01390 HMGB-UBF_HMG-box HMGB-  99.8 1.4E-20 3.1E-25  117.4   9.3   65   38-103     1-65  (66)
  6 smart00398 HMG high mobility g  99.8 2.5E-20 5.5E-25  117.0   9.4   70   37-107     1-70  (70)
  7 PF09011 HMG_box_2:  HMG-box do  99.8 2.9E-20 6.2E-25  119.2   9.6   72   35-107     1-73  (73)
  8 COG5648 NHP6B Chromatin-associ  99.8 1.7E-20 3.7E-25  140.4   8.0   90   26-116    59-148 (211)
  9 cd00084 HMG-box High Mobility   99.8 1.6E-18 3.6E-23  107.5   9.3   65   38-103     1-65  (66)
 10 KOG0381 HMG box-containing pro  99.8 3.1E-18 6.8E-23  114.5  11.0   76   34-110    17-95  (96)
 11 KOG0527 HMG-box transcription   99.7 1.7E-18 3.6E-23  138.8   6.4   77   31-108    56-132 (331)
 12 KOG0526 Nucleosome-binding fac  99.7 9.6E-17 2.1E-21  133.5   7.5   79   25-108   523-601 (615)
 13 KOG3248 Transcription factor T  99.4 3.9E-13 8.4E-18  107.1   7.2   75   36-111   190-264 (421)
 14 KOG4715 SWI/SNF-related matrix  99.3   6E-12 1.3E-16   99.7   9.0   78   31-109    58-135 (410)
 15 KOG0528 HMG-box transcription   99.2 1.1E-11 2.5E-16  102.4   2.4   77   32-109   320-396 (511)
 16 KOG2746 HMG-box transcription   98.8 4.6E-09 9.9E-14   90.0   5.2   78   24-102   168-247 (683)
 17 PF14887 HMG_box_5:  HMG (high   98.4 3.3E-06 7.2E-11   54.4   8.2   75   37-113     3-77  (85)
 18 PF06382 DUF1074:  Protein of u  97.5 0.00088 1.9E-08   49.6   8.1   49   42-95     83-131 (183)
 19 PF04690 YABBY:  YABBY protein;  97.4 0.00049 1.1E-08   50.8   5.8   49   32-81    116-164 (170)
 20 COG5648 NHP6B Chromatin-associ  97.2 0.00033 7.2E-09   53.1   3.1   69   35-104   141-209 (211)
 21 PF08073 CHDNT:  CHDNT (NUC034)  96.6  0.0037   8E-08   37.9   3.5   40   42-82     13-52  (55)
 22 PF06244 DUF1014:  Protein of u  95.0   0.045 9.7E-07   38.4   4.2   47   36-83     70-117 (122)
 23 PF04769 MAT_Alpha1:  Mating-ty  94.7    0.12 2.5E-06   39.3   6.2   56   31-93     37-92  (201)
 24 TIGR03481 HpnM hopanoid biosyn  91.5    0.66 1.4E-05   34.9   5.9   46   64-109    64-111 (198)
 25 PRK15117 ABC transporter perip  90.0     1.1 2.4E-05   34.0   6.0   46   64-109    68-115 (211)
 26 PF05494 Tol_Tol_Ttg2:  Toluene  83.1     1.7 3.6E-05   31.5   3.5   45   64-108    38-84  (170)
 27 KOG3223 Uncharacterized conser  81.9    0.95 2.1E-05   34.2   1.8   52   36-91    162-214 (221)
 28 PF12881 NUT_N:  NUT protein N   81.4     5.2 0.00011   32.4   5.9   67   42-109   229-296 (328)
 29 PF13875 DUF4202:  Domain of un  74.9     6.9 0.00015   29.4   4.6   40   44-87    131-170 (185)
 30 COG2854 Ttg2D ABC-type transpo  72.6     5.6 0.00012   30.3   3.6   44   70-113    77-121 (202)
 31 PF11304 DUF3106:  Protein of u  69.9      25 0.00054   23.8   6.1   24   69-92     12-35  (107)
 32 PRK12751 cpxP periplasmic stre  56.3      33 0.00073   25.1   5.0   31   70-100   120-150 (162)
 33 PRK10363 cpxP periplasmic repr  54.7      37 0.00081   25.0   5.0   41   67-108   111-151 (166)
 34 PRK09706 transcriptional repre  54.1      50  0.0011   22.8   5.5   43   68-110    87-129 (135)
 35 PF01352 KRAB:  KRAB box;  Inte  54.0      12 0.00027   20.9   1.9   29   66-94      3-32  (41)
 36 PRK12750 cpxP periplasmic repr  52.6      52  0.0011   24.2   5.6   33   71-103   128-160 (170)
 37 PRK10236 hypothetical protein;  48.4      21 0.00045   27.9   3.0   26   69-94    118-143 (237)
 38 PF06945 DUF1289:  Protein of u  48.2      30 0.00065   20.2   3.1   25   65-94     23-47  (51)
 39 PF12650 DUF3784:  Domain of un  41.5      18  0.0004   23.6   1.6   16   76-91     25-40  (97)
 40 KOG3838 Mannose lectin ERGIC-5  40.8      33 0.00072   29.0   3.2   38   79-116   268-305 (497)
 41 PF00887 ACBP:  Acyl CoA bindin  37.7 1.1E+02  0.0023   19.5   5.3   53   45-99     30-86  (87)
 42 TIGR00787 dctP tripartite ATP-  34.8   1E+02  0.0022   23.4   5.1   28   74-101   213-240 (257)
 43 PRK10455 periplasmic protein;   32.7 1.1E+02  0.0023   22.3   4.6   27   69-95    119-145 (161)
 44 PF09164 VitD-bind_III:  Vitami  32.0 1.3E+02  0.0029   18.8   4.9   33   43-76      9-41  (68)
 45 cd07081 ALDH_F20_ACDH_EutE-lik  30.6 1.5E+02  0.0033   24.9   5.7   41   68-108     6-46  (439)
 46 PF03480 SBP_bac_7:  Bacterial   28.6 1.2E+02  0.0027   23.3   4.6   29   74-102   213-241 (286)
 47 COG1638 DctP TRAP-type C4-dica  27.9 1.3E+02  0.0029   24.3   4.8   36   73-108   243-278 (332)
 48 KOG1610 Corticosteroid 11-beta  27.6   2E+02  0.0043   23.5   5.6   58   47-107   187-256 (322)
 49 smart00271 DnaJ DnaJ molecular  26.7 1.2E+02  0.0026   17.2   3.3   32   52-83     22-58  (60)
 50 PF07813 LTXXQ:  LTXXQ motif fa  25.8 1.3E+02  0.0029   18.8   3.7   25   67-91     75-99  (100)
 51 PF05388 Carbpep_Y_N:  Carboxyp  25.3 1.4E+02  0.0031   20.4   3.9   30   66-95     45-74  (113)
 52 cd07133 ALDH_CALDH_CalB Conife  25.0 2.2E+02  0.0047   23.7   5.7   41   68-108     5-45  (434)
 53 cd07122 ALDH_F20_ACDH Coenzyme  24.3 2.2E+02  0.0048   23.9   5.6   41   68-108     6-46  (436)
 54 cd07132 ALDH_F3AB Aldehyde deh  24.0 2.1E+02  0.0047   23.8   5.5   42   68-109     5-46  (443)
 55 PTZ00037 DnaJ_C chaperone prot  23.7 1.8E+02   0.004   24.4   5.0   43   49-91     46-88  (421)
 56 KOG2880 SMAD6 interacting prot  23.7 4.1E+02  0.0088   22.4   6.8   66   42-110    52-119 (424)
 57 PF15581 Imm35:  Immunity prote  23.1      90  0.0019   20.7   2.4   22   65-86     31-52  (93)
 58 PF02026 RyR:  RyR domain;  Int  22.7      87  0.0019   20.7   2.4   21   76-96     60-80  (94)
 59 PRK14291 chaperone protein Dna  22.6 1.9E+02   0.004   23.8   4.8   39   51-90     23-65  (382)
 60 cd08317 Death_ank Death domain  22.2      48   0.001   21.0   1.0   49   64-112     5-60  (84)
 61 PF09655 Nitr_red_assoc:  Conse  22.1 1.8E+02  0.0039   21.0   4.0   41   75-115    33-76  (144)
 62 PF15076 DUF4543:  Domain of un  21.7      61  0.0013   20.4   1.3   21   31-51     25-45  (75)
 63 PRK14296 chaperone protein Dna  21.7 1.9E+02  0.0042   23.7   4.7   39   51-90     24-66  (372)
 64 cd07087 ALDH_F3-13-14_CALDH-li  21.5 2.6E+02  0.0056   23.1   5.5   42   68-109     5-46  (426)
 65 cd07085 ALDH_F6_MMSDH Methylma  21.1 2.8E+02  0.0061   23.2   5.7   38   70-107    47-84  (478)
 66 PF08367 M16C_assoc:  Peptidase  20.8 2.3E+02   0.005   21.6   4.7   30   67-96     13-42  (248)
 67 cd06257 DnaJ DnaJ domain or J-  20.5 1.7E+02  0.0037   16.1   3.8   31   51-81     20-54  (55)
 68 KOG1827 Chromatin remodeling c  20.4     5.3 0.00011   35.3  -4.9   44   41-85    552-595 (629)
 69 PRK10266 curved DNA-binding pr  20.3 2.6E+02  0.0056   22.2   5.1   40   51-90     24-66  (306)
 70 PF06628 Catalase-rel:  Catalas  20.2 1.1E+02  0.0023   18.7   2.3   19   72-90     12-30  (68)
 71 PRK14279 chaperone protein Dna  20.1 1.7E+02  0.0037   24.2   4.1   40   51-91     29-73  (392)
 72 PHA03102 Small T antigen; Revi  20.1 1.8E+02  0.0039   21.0   3.7   36   50-85     26-61  (153)

No 1  
>PTZ00199 high mobility group protein; Provisional
Probab=99.94  E-value=2.4e-26  Score=154.55  Aligned_cols=89  Identities=44%  Similarity=0.696  Sum_probs=82.9

Q ss_pred             CCCCCCCCCCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCC--HHHHHHHHHHHhhCCChHHhHHHHHHHHHH
Q 032803           18 NKKPAKAGRKSGKAAKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKS--VAAVGKAGGEKWKSMSEADKAPYVAKAEKR   95 (133)
Q Consensus        18 ~k~~~~~~kk~~k~~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~--~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~   95 (133)
                      ++.+.+.+++++++.+||+.|+||+|||+|||.++|..|..+||+ +.  +++|+++||++|+.||+++|++|.++|..+
T Consensus         3 ~~~~~~~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~-~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~d   81 (94)
T PTZ00199          3 KKQGKVLVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPE-LAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQED   81 (94)
T ss_pred             ccccCccccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcC-CcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHH
Confidence            355677777888889999999999999999999999999999999 64  899999999999999999999999999999


Q ss_pred             HHHHHHHHHHHH
Q 032803           96 KVEYEKDMKNYN  107 (133)
Q Consensus        96 k~~y~~e~~~y~  107 (133)
                      +.+|..+|..|+
T Consensus        82 k~rY~~e~~~Y~   93 (94)
T PTZ00199         82 KVRYEKEKAEYA   93 (94)
T ss_pred             HHHHHHHHHHHh
Confidence            999999999996


No 2  
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.87  E-value=4.7e-22  Score=128.66  Aligned_cols=71  Identities=27%  Similarity=0.437  Sum_probs=68.8

Q ss_pred             CCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           37 KPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        37 ~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      +|+||+|||||||++.|..|+.+||+ +++.+|+++||++|+.|++++|++|.++|..++++|..++++|+-
T Consensus         1 ~~kRP~naf~lf~~~~r~~~~~~~p~-~~~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yky   71 (77)
T cd01389           1 KIPRPRNAFILYRQDKHAQLKTENPG-LTNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYKY   71 (77)
T ss_pred             CCCCCCcHHHHHHHHHHHHHHHHCCC-CCHHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCcc
Confidence            48999999999999999999999999 899999999999999999999999999999999999999999974


No 3  
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.86  E-value=1.3e-21  Score=125.23  Aligned_cols=70  Identities=33%  Similarity=0.547  Sum_probs=67.5

Q ss_pred             CCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHH
Q 032803           37 KPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYN  107 (133)
Q Consensus        37 ~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~  107 (133)
                      ..|||+||||+||+++|..++.+||+ +++.+|+++||+.|+.||+++|++|.++|..++++|..++++|+
T Consensus         1 ~iKrP~naf~~F~~~~r~~~~~~~p~-~~~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~   70 (72)
T cd01388           1 HIKRPMNAFMLFSKRHRRKVLQEYPL-KENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYK   70 (72)
T ss_pred             CCCCCCcHHHHHHHHHHHHHHHHCCC-CCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCC
Confidence            36899999999999999999999999 89999999999999999999999999999999999999999885


No 4  
>PF00505 HMG_box:  HMG (high mobility group) box;  InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair []. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins []; LEF1 lymphoid enhancer binding factor 1 []; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor []; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.85  E-value=5.9e-21  Score=120.21  Aligned_cols=69  Identities=42%  Similarity=0.758  Sum_probs=65.7

Q ss_pred             CCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHH
Q 032803           38 PKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYN  107 (133)
Q Consensus        38 PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~  107 (133)
                      |+||+|||+|||.+++..++.+||+ +++.+|+++||++|+.||+++|++|.+.|..++..|..++..|+
T Consensus         1 PkrP~~af~lf~~~~~~~~k~~~p~-~~~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~   69 (69)
T PF00505_consen    1 PKRPPNAFMLFCKEKRAKLKEENPD-LSNKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK   69 (69)
T ss_dssp             SSSS--HHHHHHHHHHHHHHHHSTT-STHHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             CcCCCCHHHHHHHHHHHHHHHHhcc-cccccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            8999999999999999999999999 89999999999999999999999999999999999999999995


No 5  
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.84  E-value=1.4e-20  Score=117.36  Aligned_cols=65  Identities=54%  Similarity=0.844  Sum_probs=63.5

Q ss_pred             CCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHH
Q 032803           38 PKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDM  103 (133)
Q Consensus        38 PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~  103 (133)
                      ||+|+|||++|++++|..++..||+ +++.+|++.||.+|+.||+++|++|.+.|..++.+|..+|
T Consensus         1 Pkrp~saf~~f~~~~r~~~~~~~p~-~~~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~   65 (66)
T cd01390           1 PKRPLSAYFLFSQEQRPKLKKENPD-ASVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEM   65 (66)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCcC-CCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence            8999999999999999999999999 8999999999999999999999999999999999999886


No 6  
>smart00398 HMG high mobility group.
Probab=99.84  E-value=2.5e-20  Score=117.01  Aligned_cols=70  Identities=49%  Similarity=0.806  Sum_probs=67.8

Q ss_pred             CCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHH
Q 032803           37 KPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYN  107 (133)
Q Consensus        37 ~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~  107 (133)
                      +|++|+|||++|++++|..+..+||+ +++.+|+++||.+|+.|++++|++|.++|..++.+|..++..|+
T Consensus         1 ~pkrp~~~y~~f~~~~r~~~~~~~~~-~~~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~   70 (70)
T smart00398        1 KPKRPMSAFMLFSQENRAKIKAENPD-LSNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK   70 (70)
T ss_pred             CcCCCCcHHHHHHHHHHHHHHHHCcC-CCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            58999999999999999999999999 89999999999999999999999999999999999999999884


No 7  
>PF09011 HMG_box_2:  HMG-box domain;  InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.84  E-value=2.9e-20  Score=119.20  Aligned_cols=72  Identities=46%  Similarity=0.807  Sum_probs=63.7

Q ss_pred             CCCCCCCCcHHHHHHHHHHHHHHHh-CCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHH
Q 032803           35 PNKPKRPASAFFVFMEEFREQYKKD-HPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYN  107 (133)
Q Consensus        35 p~~PKrP~say~lF~~e~r~~~k~~-~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~  107 (133)
                      |++||+|+|||+||+.+++..++.. ++. .++.++++.|+..|+.||++||.+|.++|..++.+|..+|..|+
T Consensus         1 p~kpK~~~say~lF~~~~~~~~k~~G~~~-~~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~   73 (73)
T PF09011_consen    1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQK-QSFREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN   73 (73)
T ss_dssp             SSS--SSSSHHHHHHHHHHHHHHHHT-T--SSHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred             CcCCCCCCCHHHHHHHHHHHHHHHhcccC-CCHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            6899999999999999999999988 665 78999999999999999999999999999999999999999985


No 8  
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.83  E-value=1.7e-20  Score=140.45  Aligned_cols=90  Identities=33%  Similarity=0.692  Sum_probs=84.7

Q ss_pred             CCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHH
Q 032803           26 RKSGKAAKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKN  105 (133)
Q Consensus        26 kk~~k~~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~  105 (133)
                      +..+++.+|||.||||+||||+|+.++|.+|+..+|+ +++.+|+++||++|++|++++|++|...|..++++|..++..
T Consensus        59 k~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~-l~~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~  137 (211)
T COG5648          59 KRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPK-LTFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEE  137 (211)
T ss_pred             HHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCC-CChHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHh
Confidence            5557788999999999999999999999999999999 899999999999999999999999999999999999999999


Q ss_pred             HHHhCCCCCCC
Q 032803          106 YNRRQAEGTKP  116 (133)
Q Consensus       106 y~~~~~~~~~~  116 (133)
                      |..+++...-.
T Consensus       138 y~~k~~~~~~~  148 (211)
T COG5648         138 YNKKLPNKAPI  148 (211)
T ss_pred             hhcccCCCCCC
Confidence            99998876543


No 9  
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.78  E-value=1.6e-18  Score=107.52  Aligned_cols=65  Identities=51%  Similarity=0.812  Sum_probs=62.9

Q ss_pred             CCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHH
Q 032803           38 PKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDM  103 (133)
Q Consensus        38 PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~  103 (133)
                      |++|+|||++|+++.|..+...||+ +++.+|++.||.+|+.|++++|.+|.+.|..++..|..++
T Consensus         1 pkrp~~af~~f~~~~~~~~~~~~~~-~~~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~   65 (66)
T cd00084           1 PKRPLSAYFLFSQEHRAEVKAENPG-LSVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM   65 (66)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCcC-CCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence            7999999999999999999999999 8999999999999999999999999999999999998765


No 10 
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.78  E-value=3.1e-18  Score=114.52  Aligned_cols=76  Identities=47%  Similarity=0.788  Sum_probs=72.6

Q ss_pred             CC--CCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHH-HHHHhC
Q 032803           34 DP--NKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMK-NYNRRQ  110 (133)
Q Consensus        34 dp--~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~-~y~~~~  110 (133)
                      ||  +.|++|++||++|+.++|..++..||+ +++.+|+++||++|++|++++|++|...|..++.+|..+|. .|+..+
T Consensus        17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~-~~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~   95 (96)
T KOG0381|consen   17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPG-LSVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL   95 (96)
T ss_pred             CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCC-CCHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence            56  599999999999999999999999999 99999999999999999999999999999999999999999 998754


No 11 
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.75  E-value=1.7e-18  Score=138.78  Aligned_cols=77  Identities=30%  Similarity=0.560  Sum_probs=72.9

Q ss_pred             CCCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           31 AAKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        31 ~~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      ......+.||||||||+|.+..|.+|..+||+ +.++||+|+||.+|+.|+++||.+|+++|++++..|.+++.+|+=
T Consensus        56 ~k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~-mHNSEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYKY  132 (331)
T KOG0527|consen   56 DKTSTDRIKRPMNAFMVWSQGQRRKLAKQNPK-MHNSEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYKY  132 (331)
T ss_pred             CCCCccccCCCcchhhhhhHHHHHHHHHhCcc-hhhHHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCccc
Confidence            34566899999999999999999999999999 899999999999999999999999999999999999999999974


No 12 
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.67  E-value=9.6e-17  Score=133.54  Aligned_cols=79  Identities=46%  Similarity=0.731  Sum_probs=74.0

Q ss_pred             CCCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHH
Q 032803           25 GRKSGKAAKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMK  104 (133)
Q Consensus        25 ~kk~~k~~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~  104 (133)
                      +.++.|+.+|||+|||++||||+|++..|..|+.+  + .++++|++.+|++|+.|+.  |.+|++.|+.++++|+.+|.
T Consensus       523 ~~k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--g-i~~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~  597 (615)
T KOG0526|consen  523 KKKKGKKKKDPNAPKRATSAYMLWLNASRESIKED--G-ISVGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMK  597 (615)
T ss_pred             cccCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--C-chHHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHH
Confidence            34667889999999999999999999999999987  5 8999999999999999999  99999999999999999999


Q ss_pred             HHHH
Q 032803          105 NYNR  108 (133)
Q Consensus       105 ~y~~  108 (133)
                      +|+.
T Consensus       598 ~yk~  601 (615)
T KOG0526|consen  598 EYKN  601 (615)
T ss_pred             hhcC
Confidence            9993


No 13 
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.42  E-value=3.9e-13  Score=107.11  Aligned_cols=75  Identities=24%  Similarity=0.434  Sum_probs=68.9

Q ss_pred             CCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHhCC
Q 032803           36 NKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRRQA  111 (133)
Q Consensus        36 ~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~~~  111 (133)
                      ...|+|+||||||+++.|..|..++-- ....+|.++||++|.+||-+|..+|.++|.++++.|.+.+..|.+...
T Consensus       190 phiKKPLNAFmlyMKEmRa~vvaEctl-KeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSARdN  264 (421)
T KOG3248|consen  190 PHIKKPLNAFMLYMKEMRAKVVAECTL-KESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSARDN  264 (421)
T ss_pred             ccccccHHHHHHHHHHHHHHHHHHhhh-hhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcchhhh
Confidence            367999999999999999999999876 577899999999999999999999999999999999999999987543


No 14 
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin  [Chromatin structure and dynamics]
Probab=99.34  E-value=6e-12  Score=99.68  Aligned_cols=78  Identities=26%  Similarity=0.570  Sum_probs=73.6

Q ss_pred             CCCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHh
Q 032803           31 AAKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRR  109 (133)
Q Consensus        31 ~~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~  109 (133)
                      ..+.|.+|-+|+-+||.|++..+++|+..||+ +...+|.++||.+|..|+++||+.|...++..+..|.+-|..|+..
T Consensus        58 ~pkpPkppekpl~pymrySrkvWd~VkA~nPe-~kLWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~s  135 (410)
T KOG4715|consen   58 RPKPPKPPEKPLMPYMRYSRKVWDQVKASNPE-LKLWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHNS  135 (410)
T ss_pred             CCCCCCCCCcccchhhHHhhhhhhhhhccCcc-hHHHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence            44568889999999999999999999999999 9999999999999999999999999999999999999999999863


No 15 
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=99.16  E-value=1.1e-11  Score=102.45  Aligned_cols=77  Identities=25%  Similarity=0.452  Sum_probs=69.6

Q ss_pred             CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHh
Q 032803           32 AKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRR  109 (133)
Q Consensus        32 ~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~  109 (133)
                      ...+...||||||||+|.++.|-.|...+|| +-...|+++||.+|+.|+..||++|.+.-.++-..|.+.+++|+-+
T Consensus       320 ~ss~PHIKRPMNAFMVWAkDERRKILqA~PD-MHNSnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYrYk  396 (511)
T KOG0528|consen  320 ASSEPHIKRPMNAFMVWAKDERRKILQAFPD-MHNSNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYRYK  396 (511)
T ss_pred             CCCCccccCCcchhhcccchhhhhhhhcCcc-ccccchhHHhcccccccccccccchHHHHHHHHHhhhccCcccccC
Confidence            3445678999999999999999999999999 7778999999999999999999999999888888999998888753


No 16 
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.82  E-value=4.6e-09  Score=89.98  Aligned_cols=78  Identities=23%  Similarity=0.374  Sum_probs=71.0

Q ss_pred             CCCCCCCCCCCCCCCCCCCcHHHHHHHHHH--HHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHH
Q 032803           24 AGRKSGKAAKDPNKPKRPASAFFVFMEEFR--EQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEK  101 (133)
Q Consensus        24 ~~kk~~k~~~dp~~PKrP~say~lF~~e~r--~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~  101 (133)
                      .+..+-..+++....++|||+|+||++.+|  ..+...||+ ....-|+++||++|-.|-+.||+.|.++|.+.++.|.+
T Consensus       168 ~kdgrspnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn-~DNrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfk  246 (683)
T KOG2746|consen  168 EKDGRSPNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPN-QDNRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFK  246 (683)
T ss_pred             ccccCCCCcCcchhhhhhhHHHHHHHhhcCCccchhccCcc-ccchhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhh
Confidence            345556677788899999999999999999  889999999 89999999999999999999999999999999999887


Q ss_pred             H
Q 032803          102 D  102 (133)
Q Consensus       102 e  102 (133)
                      .
T Consensus       247 a  247 (683)
T KOG2746|consen  247 A  247 (683)
T ss_pred             h
Confidence            6


No 17 
>PF14887 HMG_box_5:  HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=98.40  E-value=3.3e-06  Score=54.44  Aligned_cols=75  Identities=20%  Similarity=0.385  Sum_probs=61.0

Q ss_pred             CCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHhCCCC
Q 032803           37 KPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRRQAEG  113 (133)
Q Consensus        37 ~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~~~~~  113 (133)
                      .|..|-+|--||.+.........++. -...+ .+.+...|++|++.+|-+|+..|.++..+|+.+|.+|+...+..
T Consensus         3 lPE~PKt~qe~Wqq~vi~dYla~~~~-dr~K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~~~~~   77 (85)
T PF14887_consen    3 LPETPKTAQEIWQQSVIGDYLAKFRN-DRKKA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSAPADA   77 (85)
T ss_dssp             -S----THHHHHHHHHHHHHHHHTTS-THHHH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-CCCTT
T ss_pred             CCCCCCCHHHHHHHHHHHHHHHHhhH-hHHHH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcCCCCC
Confidence            57788999999999999999988887 44444 56899999999999999999999999999999999999866554


No 18 
>PF06382 DUF1074:  Protein of unknown function (DUF1074);  InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=97.48  E-value=0.00088  Score=49.63  Aligned_cols=49  Identities=29%  Similarity=0.435  Sum_probs=42.6

Q ss_pred             CcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHH
Q 032803           42 ASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKR   95 (133)
Q Consensus        42 ~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~   95 (133)
                      -+||+-|+.++|.    .|.+ +...|+....+..|..|++.+|..|..++...
T Consensus        83 nnaYLNFLReFRr----kh~~-L~p~dlI~~AAraW~rLSe~eK~rYrr~~~~~  131 (183)
T PF06382_consen   83 NNAYLNFLREFRR----KHCG-LSPQDLIQRAARAWCRLSEAEKNRYRRMAPSV  131 (183)
T ss_pred             chHHHHHHHHHHH----HccC-CCHHHHHHHHHHHHHhCCHHHHHHHHhhcchh
Confidence            3789999999876    4566 88999999999999999999999999876544


No 19 
>PF04690 YABBY:  YABBY protein;  InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=97.38  E-value=0.00049  Score=50.84  Aligned_cols=49  Identities=29%  Similarity=0.430  Sum_probs=43.0

Q ss_pred             CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCC
Q 032803           32 AKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMS   81 (133)
Q Consensus        32 ~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls   81 (133)
                      .+.|.+..|-+|||..|+++.-..|+..+|+ ++..|.....+..|...+
T Consensus       116 ~kPPEKRqR~psaYn~f~k~ei~rik~~~p~-ishkeaFs~aAknW~h~p  164 (170)
T PF04690_consen  116 NKPPEKRQRVPSAYNRFMKEEIQRIKAENPD-ISHKEAFSAAAKNWAHFP  164 (170)
T ss_pred             cCCccccCCCchhHHHHHHHHHHHHHhcCCC-CCHHHHHHHHHHhhhhCc
Confidence            3445555677899999999999999999999 999999999999998765


No 20 
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=97.18  E-value=0.00033  Score=53.08  Aligned_cols=69  Identities=23%  Similarity=0.345  Sum_probs=61.8

Q ss_pred             CCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHH
Q 032803           35 PNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMK  104 (133)
Q Consensus        35 p~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~  104 (133)
                      ..+++.|..+|+-|-...|..+...+|+ ....+++++++..|..|++.-|.+|.+.+..++..|...+.
T Consensus       141 k~~~~~~~~~~~e~~~~~r~~~~~~~~~-~~~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~  209 (211)
T COG5648         141 KLPNKAPIGPFIENEPKIRPKVEGPSPD-KALVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP  209 (211)
T ss_pred             ccCCCCCCchhhhccHHhccccCCCCcc-hhhhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence            3567888899999999999999999998 78899999999999999999999999999999988876653


No 21 
>PF08073 CHDNT:  CHDNT (NUC034) domain;  InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases [].; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=96.56  E-value=0.0037  Score=37.89  Aligned_cols=40  Identities=20%  Similarity=0.377  Sum_probs=36.0

Q ss_pred             CcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCCh
Q 032803           42 ASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSE   82 (133)
Q Consensus        42 ~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~   82 (133)
                      ++.|-+|.+..|+.|...||+ +..+.+..+++..|+..++
T Consensus        13 lt~yK~Fsq~vRP~l~~~NPk-~~~sKl~~l~~AKwrEF~~   52 (55)
T PF08073_consen   13 LTNYKAFSQHVRPLLAKANPK-APMSKLMMLLQAKWREFQE   52 (55)
T ss_pred             HHHHHHHHHHHHHHHHHHCCC-CcHHHHHHHHHHHHHHHHh
Confidence            467889999999999999999 8999999999999987553


No 22 
>PF06244 DUF1014:  Protein of unknown function (DUF1014);  InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=94.99  E-value=0.045  Score=38.43  Aligned_cols=47  Identities=19%  Similarity=0.350  Sum_probs=40.6

Q ss_pred             CCC-CCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChH
Q 032803           36 NKP-KRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEA   83 (133)
Q Consensus        36 ~~P-KrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~e   83 (133)
                      .+| +|-.-||.-|+...-+.|+.+||+ +..+++-.+|-..|...|+.
T Consensus        70 rHPErR~KAAy~afeE~~Lp~lK~E~Pg-LrlsQ~kq~l~K~w~KSPeN  117 (122)
T PF06244_consen   70 RHPERRMKAAYKAFEERRLPELKEENPG-LRLSQYKQMLWKEWQKSPEN  117 (122)
T ss_pred             CCcchhHHHHHHHHHHHHhHHHHhhCCC-chHHHHHHHHHHHHhcCCCC
Confidence            344 444578999999999999999999 99999999999999887753


No 23 
>PF04769 MAT_Alpha1:  Mating-type protein MAT alpha 1;  InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation []. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway [].; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=94.73  E-value=0.12  Score=39.30  Aligned_cols=56  Identities=20%  Similarity=0.326  Sum_probs=40.2

Q ss_pred             CCCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHH
Q 032803           31 AAKDPNKPKRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAE   93 (133)
Q Consensus        31 ~~~dp~~PKrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~   93 (133)
                      .......++||+|+||+|..-.-    ...|+ ....+++..|+..|..=+-  |..|.-+|.
T Consensus        37 ~~~~~~~~kr~lN~Fm~FRsyy~----~~~~~-~~Qk~~S~~l~~lW~~dp~--k~~W~l~ak   92 (201)
T PF04769_consen   37 RKRSPEKAKRPLNGFMAFRSYYS----PIFPP-LPQKELSGILTKLWEKDPF--KNKWSLMAK   92 (201)
T ss_pred             ccccccccccchhHHHHHHHHHH----hhcCC-cCHHHHHHHHHHHHhCCcc--HhHHHHHhh
Confidence            34455678999999999986664    34555 6778999999999997432  455555443


No 24 
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.
Probab=91.54  E-value=0.66  Score=34.89  Aligned_cols=46  Identities=17%  Similarity=0.432  Sum_probs=39.6

Q ss_pred             CCHHHHHH-HHHHHhhCCChHHhHHHHHHHHH-HHHHHHHHHHHHHHh
Q 032803           64 KSVAAVGK-AGGEKWKSMSEADKAPYVAKAEK-RKVEYEKDMKNYNRR  109 (133)
Q Consensus        64 ~~~~eisk-~l~~~Wk~ls~eeK~~Y~~~a~~-~k~~y~~e~~~y~~~  109 (133)
                      ..+..|++ .||..|+.+|+++++.|.+.... ....|-..+..|...
T Consensus        64 ~Df~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~~  111 (198)
T TIGR03481        64 FDLPAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAGE  111 (198)
T ss_pred             CCHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcCc
Confidence            46778876 68999999999999999999988 677899999988753


No 25 
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=90.04  E-value=1.1  Score=34.01  Aligned_cols=46  Identities=22%  Similarity=0.346  Sum_probs=38.8

Q ss_pred             CCHHHHHH-HHHHHhhCCChHHhHHHHHHHHHH-HHHHHHHHHHHHHh
Q 032803           64 KSVAAVGK-AGGEKWKSMSEADKAPYVAKAEKR-KVEYEKDMKNYNRR  109 (133)
Q Consensus        64 ~~~~eisk-~l~~~Wk~ls~eeK~~Y~~~a~~~-k~~y~~e~~~y~~~  109 (133)
                      ..+..+++ .||..|+.+|+++++.|.+..... ...|-..+..|...
T Consensus        68 ~Df~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~q  115 (211)
T PRK15117         68 VQVKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHGQ  115 (211)
T ss_pred             CCHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCCc
Confidence            56777766 689999999999999999988885 56799999999753


No 26 
>PF05494 Tol_Tol_Ttg2:  Toluene tolerance, Ttg2 ;  InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation []. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters [].; PDB: 2QGU_A.
Probab=83.14  E-value=1.7  Score=31.48  Aligned_cols=45  Identities=18%  Similarity=0.367  Sum_probs=34.4

Q ss_pred             CCHHHHHHH-HHHHhhCCChHHhHHHHHHHHHH-HHHHHHHHHHHHH
Q 032803           64 KSVAAVGKA-GGEKWKSMSEADKAPYVAKAEKR-KVEYEKDMKNYNR  108 (133)
Q Consensus        64 ~~~~eisk~-l~~~Wk~ls~eeK~~Y~~~a~~~-k~~y~~e~~~y~~  108 (133)
                      ..+..|++. ||..|+.||+++++.|.+..... ...|-..+..|..
T Consensus        38 ~D~~~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~   84 (170)
T PF05494_consen   38 FDFERMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG   84 (170)
T ss_dssp             B-HHHHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred             CCHHHHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence            567777765 78899999999999999988875 5668888888875


No 27 
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=81.86  E-value=0.95  Score=34.24  Aligned_cols=52  Identities=23%  Similarity=0.413  Sum_probs=43.0

Q ss_pred             CCC-CCCCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHH
Q 032803           36 NKP-KRPASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAK   91 (133)
Q Consensus        36 ~~P-KrP~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~   91 (133)
                      .+| +|=..||.-|-...-+.|+.+||+ +.++++-.+|-.+|..-|+.   ||..+
T Consensus       162 rHPEkRmrAA~~afEe~~LPrLK~e~P~-lrlsQ~Kqll~Kew~KsPDN---P~Nq~  214 (221)
T KOG3223|consen  162 RHPEKRMRAAFKAFEEARLPRLKKENPG-LRLSQYKQLLKKEWQKSPDN---PFNQA  214 (221)
T ss_pred             cChHHHHHHHHHHHHHhhchhhhhcCCC-ccHHHHHHHHHHHHhhCCCC---hhhHH
Confidence            455 444577999999999999999999 99999999999999988874   45443


No 28 
>PF12881 NUT_N:  NUT protein N terminus;  InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=81.36  E-value=5.2  Score=32.39  Aligned_cols=67  Identities=18%  Similarity=0.169  Sum_probs=49.2

Q ss_pred             CcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHH-HHHHHHHHHh
Q 032803           42 ASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEY-EKDMKNYNRR  109 (133)
Q Consensus        42 ~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y-~~e~~~y~~~  109 (133)
                      ..||.+|..-.--.+....|. ++.-|-..+.-+.|...|.-+|-.|.++|++-.+-= +++|+.-+-+
T Consensus       229 ~EAlSCFLIpvLrsLar~kPt-MtlEeGl~ra~qEW~~~SnfdRmifyemaekFmEFEaeEEmq~q~lq  296 (328)
T PF12881_consen  229 AEALSCFLIPVLRSLARLKPT-MTLEEGLWRAVQEWQHTSNFDRMIFYEMAEKFMEFEAEEEMQIQKLQ  296 (328)
T ss_pred             hhhhhhhHHHHHHHHHhcCCC-ccHHHHHHHHHHHhhccccccHHHHHHHHHHHccCCcHHHHHHHHHH
Confidence            356666666665556667787 788898899999999999999999999999875322 1455544443


No 29 
>PF13875 DUF4202:  Domain of unknown function (DUF4202)
Probab=74.94  E-value=6.9  Score=29.38  Aligned_cols=40  Identities=25%  Similarity=0.483  Sum_probs=33.8

Q ss_pred             HHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHH
Q 032803           44 AFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAP   87 (133)
Q Consensus        44 ay~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~   87 (133)
                      +-++|...+...+...|..    ..+..+|...|+.||+.-++.
T Consensus       131 acLVFL~~~f~~F~~~~de----eK~v~Il~KTw~KMS~~g~~~  170 (185)
T PF13875_consen  131 ACLVFLEYYFEDFAAKHDE----EKIVDILRKTWRKMSERGHEA  170 (185)
T ss_pred             HHHHhHHHHHHHHHhcCCH----HHHHHHHHHHHHHCCHHHHHH
Confidence            5788999999999888743    578899999999999988753


No 30 
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=72.59  E-value=5.6  Score=30.28  Aligned_cols=44  Identities=11%  Similarity=0.219  Sum_probs=36.8

Q ss_pred             HHHHHHHhhCCChHHhHHHHHHHHH-HHHHHHHHHHHHHHhCCCC
Q 032803           70 GKAGGEKWKSMSEADKAPYVAKAEK-RKVEYEKDMKNYNRRQAEG  113 (133)
Q Consensus        70 sk~l~~~Wk~ls~eeK~~Y~~~a~~-~k~~y~~e~~~y~~~~~~~  113 (133)
                      ...||.-|+.+|+++++.|...... ....|-..+..|+.+...-
T Consensus        77 ~~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q~~~v  121 (202)
T COG2854          77 KLVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQTLKV  121 (202)
T ss_pred             HHHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCCCcee
Confidence            3458899999999999999998887 4667999999999875543


No 31 
>PF11304 DUF3106:  Protein of unknown function (DUF3106);  InterPro: IPR021455  Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known. 
Probab=69.93  E-value=25  Score=23.82  Aligned_cols=24  Identities=17%  Similarity=0.367  Sum_probs=10.5

Q ss_pred             HHHHHHHHhhCCChHHhHHHHHHH
Q 032803           69 VGKAGGEKWKSMSEADKAPYVAKA   92 (133)
Q Consensus        69 isk~l~~~Wk~ls~eeK~~Y~~~a   92 (133)
                      +..-+...|+.|++..+..+...+
T Consensus        12 ~L~pl~~~W~~l~~~qr~k~l~~a   35 (107)
T PF11304_consen   12 ALAPLAERWNSLPPEQRRKWLQIA   35 (107)
T ss_pred             HHHHHHHHHhcCCHHHHHHHHHHH
Confidence            334444444444444444444333


No 32 
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=56.32  E-value=33  Score=25.08  Aligned_cols=31  Identities=13%  Similarity=0.225  Sum_probs=23.9

Q ss_pred             HHHHHHHhhCCChHHhHHHHHHHHHHHHHHH
Q 032803           70 GKAGGEKWKSMSEADKAPYVAKAEKRKVEYE  100 (133)
Q Consensus        70 sk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~  100 (133)
                      .+...++++.|++++|..|.+..++-.....
T Consensus       120 ~~~~~qmy~lLTPEQra~l~~~~e~r~~~~~  150 (162)
T PRK12751        120 AKVRNQMYNLLTPEQKEALNKKHQERIEKLQ  150 (162)
T ss_pred             HHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence            4555678899999999999997777655553


No 33 
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=54.67  E-value=37  Score=25.05  Aligned_cols=41  Identities=10%  Similarity=0.278  Sum_probs=31.3

Q ss_pred             HHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           67 AAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        67 ~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      .++.++-.++++.|++++|..|.+..++-...+.. +..+..
T Consensus       111 Vem~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~~-~~~~q~  151 (166)
T PRK10363        111 VEMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLRD-VTQWQK  151 (166)
T ss_pred             HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHH-HHhcCc
Confidence            34566677899999999999999988887777754 554443


No 34 
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=54.11  E-value=50  Score=22.75  Aligned_cols=43  Identities=12%  Similarity=0.176  Sum_probs=37.5

Q ss_pred             HHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHhC
Q 032803           68 AVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRRQ  110 (133)
Q Consensus        68 eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~~  110 (133)
                      .-...|-..|+.|+++++.............|..-+++|-.+.
T Consensus        87 ~~~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~~  129 (135)
T PRK09706         87 EDQKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKAR  129 (135)
T ss_pred             HHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4457889999999999999999999999999999999887653


No 35 
>PF01352 KRAB:  KRAB box;  InterPro: IPR001909 The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that is found in the N-terminal part of about one third of eukaryotic Krueppel-type C2H2 zinc finger proteins (ZFPs) []. It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box []. The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain [, ]. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin [, ]. KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome []. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.; GO: 0003676 nucleic acid binding, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 1V65_A.
Probab=53.97  E-value=12  Score=20.94  Aligned_cols=29  Identities=21%  Similarity=0.247  Sum_probs=16.3

Q ss_pred             HHHHHHHHH-HHhhCCChHHhHHHHHHHHH
Q 032803           66 VAAVGKAGG-EKWKSMSEADKAPYVAKAEK   94 (133)
Q Consensus        66 ~~eisk~l~-~~Wk~ls~eeK~~Y~~~a~~   94 (133)
                      |.+|+--++ +.|..|.+.+|.-|.+.-.+
T Consensus         3 f~Dvav~fs~eEW~~L~~~Qk~ly~dvm~E   32 (41)
T PF01352_consen    3 FEDVAVYFSQEEWELLDPAQKNLYRDVMLE   32 (41)
T ss_dssp             ----TT---HHHHHTS-HHHHHHHHHHHHH
T ss_pred             EEEEEEEcChhhcccccceecccchhHHHH
Confidence            445555454 56999999999998876543


No 36 
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=52.61  E-value=52  Score=24.17  Aligned_cols=33  Identities=15%  Similarity=0.225  Sum_probs=27.2

Q ss_pred             HHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHH
Q 032803           71 KAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDM  103 (133)
Q Consensus        71 k~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~  103 (133)
                      +...+.+..|++++|..|.++..+-...|...+
T Consensus       128 ~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~  160 (170)
T PRK12750        128 EKRHQMLSILTPEQKAKFQELQQERMQECQDKM  160 (170)
T ss_pred             HHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence            345568999999999999999888877777666


No 37 
>PRK10236 hypothetical protein; Provisional
Probab=48.45  E-value=21  Score=27.88  Aligned_cols=26  Identities=19%  Similarity=0.420  Sum_probs=21.7

Q ss_pred             HHHHHHHHhhCCChHHhHHHHHHHHH
Q 032803           69 VGKAGGEKWKSMSEADKAPYVAKAEK   94 (133)
Q Consensus        69 isk~l~~~Wk~ls~eeK~~Y~~~a~~   94 (133)
                      +.+++...|..||++|++.+.+.-..
T Consensus       118 l~kll~~a~~kms~eE~~~L~~~l~~  143 (237)
T PRK10236        118 LEQFLRNTWKKMDEEHKQEFLHAVDA  143 (237)
T ss_pred             HHHHHHHHHHHCCHHHHHHHHHHHhh
Confidence            57889999999999999888765443


No 38 
>PF06945 DUF1289:  Protein of unknown function (DUF1289);  InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=48.23  E-value=30  Score=20.19  Aligned_cols=25  Identities=24%  Similarity=0.590  Sum_probs=17.8

Q ss_pred             CHHHHHHHHHHHhhCCChHHhHHHHHHHHH
Q 032803           65 SVAAVGKAGGEKWKSMSEADKAPYVAKAEK   94 (133)
Q Consensus        65 ~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~   94 (133)
                      +..||..     |..|++++|.........
T Consensus        23 T~dEI~~-----W~~~s~~er~~i~~~l~~   47 (51)
T PF06945_consen   23 TLDEIRD-----WKSMSDDERRAILARLRA   47 (51)
T ss_pred             cHHHHHH-----HhhCCHHHHHHHHHHHHH
Confidence            4556654     999999999776654443


No 39 
>PF12650 DUF3784:  Domain of unknown function (DUF3784);  InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=41.50  E-value=18  Score=23.62  Aligned_cols=16  Identities=31%  Similarity=0.542  Sum_probs=13.7

Q ss_pred             HhhCCChHHhHHHHHH
Q 032803           76 KWKSMSEADKAPYVAK   91 (133)
Q Consensus        76 ~Wk~ls~eeK~~Y~~~   91 (133)
                      -|+.||++||+.|...
T Consensus        25 Gyntms~eEk~~~D~~   40 (97)
T PF12650_consen   25 GYNTMSKEEKEKYDKK   40 (97)
T ss_pred             hcccCCHHHHHHhhHH
Confidence            4899999999999763


No 40 
>KOG3838 consensus Mannose lectin ERGIC-53, involved in glycoprotein traffic [Intracellular trafficking, secretion, and vesicular transport]
Probab=40.85  E-value=33  Score=29.00  Aligned_cols=38  Identities=24%  Similarity=0.391  Sum_probs=31.9

Q ss_pred             CCChHHhHHHHHHHHHHHHHHHHHHHHHHHhCCCCCCC
Q 032803           79 SMSEADKAPYVAKAEKRKVEYEKDMKNYNRRQAEGTKP  116 (133)
Q Consensus        79 ~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~~~~~~~~  116 (133)
                      .|.+.+|++|.+..+.....|.++..+|.+.+++...+
T Consensus       268 E~qe~ek~kyqeEfe~~q~elek~k~efkk~hpd~~~e  305 (497)
T KOG3838|consen  268 EMQELEKAKYQEEFEWAQLELEKRKDEFKKSHPDAQGE  305 (497)
T ss_pred             hhhHHHHHHHHHHHHHHHHHHhhhHhhhccCCchhhcc
Confidence            34567899999999999999999999999988876653


No 41 
>PF00887 ACBP:  Acyl CoA binding protein;  InterPro: IPR000582 Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters []. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor []. ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species []. Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats (IPR002110 from INTERPRO) []. The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein []. Other proteins containing an ACB domain include:   Endozepine-like peptide (ELP) (gene DBIL5) from mouse []. ELP is a testis-specific ACBP homologue that may be involved in the energy metabolism of the mature sperm. MA-DBI, a transmembrane protein of unknown function which has been found in mammals. MA-DBI contains a N-terminal ACB domain. DRS-1 [], a human protein of unknown function that contains a N-terminal ACB domain and a C-terminal enoyl-CoA isomerase/hydratase domain.  ; GO: 0000062 fatty-acyl-CoA binding; PDB: 2CB8_A 2FJ9_A 2LBB_A 1ST7_A 3EPY_B 2FDQ_C 1NTI_A 1HB8_A 1ACA_A 1NVL_A ....
Probab=37.70  E-value=1.1e+02  Score=19.48  Aligned_cols=53  Identities=17%  Similarity=0.366  Sum_probs=30.1

Q ss_pred             HHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCC----hHHhHHHHHHHHHHHHHH
Q 032803           45 FFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMS----EADKAPYVAKAEKRKVEY   99 (133)
Q Consensus        45 y~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls----~eeK~~Y~~~a~~~k~~y   99 (133)
                      |-+|.+.....+....|+...+  +.+.--..|+.|.    ++-++.|++........|
T Consensus        30 YalyKQAt~Gd~~~~~P~~~d~--~~~~K~~AW~~l~gms~~eA~~~Yi~~v~~~~~~~   86 (87)
T PF00887_consen   30 YALYKQATHGDCDTPRPGFFDI--EGRAKWDAWKALKGMSKEEAMREYIELVEELIPKY   86 (87)
T ss_dssp             HHHHHHHHTSS--S-CTTTTCH--HHHHHHHHHHTTTTTHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHhCCCcCCCCcchhH--HHHHHHHHHHHccCCCHHHHHHHHHHHHHHHHHhc
Confidence            6667666655555555653333  3333456798876    455667777777665555


No 42 
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptors for TRAP family transporters. The others are the SnoM family (see pfam03480) and the TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=34.79  E-value=1e+02  Score=23.37  Aligned_cols=28  Identities=21%  Similarity=0.156  Sum_probs=21.4

Q ss_pred             HHHhhCCChHHhHHHHHHHHHHHHHHHH
Q 032803           74 GEKWKSMSEADKAPYVAKAEKRKVEYEK  101 (133)
Q Consensus        74 ~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~  101 (133)
                      ...|..||++.|+...+.+...-..+..
T Consensus       213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~~  240 (257)
T TIGR00787       213 KAFWKSLPPDLQAVVKEAAKEAGEYQRK  240 (257)
T ss_pred             HHHHhcCCHHHHHHHHHHHHHHHHHHHH
Confidence            4779999999999998877766444333


No 43 
>PRK10455 periplasmic protein; Reviewed
Probab=32.72  E-value=1.1e+02  Score=22.30  Aligned_cols=27  Identities=19%  Similarity=0.265  Sum_probs=20.2

Q ss_pred             HHHHHHHHhhCCChHHhHHHHHHHHHH
Q 032803           69 VGKAGGEKWKSMSEADKAPYVAKAEKR   95 (133)
Q Consensus        69 isk~l~~~Wk~ls~eeK~~Y~~~a~~~   95 (133)
                      ..+....++..|++++|+.|.+..++.
T Consensus       119 ~~~~~~qiy~vLTPEQr~q~~~~~ekr  145 (161)
T PRK10455        119 HMETQNKIYNVLTPEQKKQFNANFEKR  145 (161)
T ss_pred             HHHHHHHHHHhCCHHHHHHHHHHHHHH
Confidence            445556789999999999998765443


No 44 
>PF09164 VitD-bind_III:  Vitamin D binding protein, domain III;  InterPro: IPR015247 This domain is predominantly found in Vitamin D binding proteins, and adopts a multihelical structure. It is required for formation of an actin 'clamp', allowing the protein to bind to actin []. ; PDB: 1MA9_A 1KW2_A 1KXP_D 1J7E_A 1J78_A 1LOT_A.
Probab=32.01  E-value=1.3e+02  Score=18.82  Aligned_cols=33  Identities=9%  Similarity=0.260  Sum_probs=23.8

Q ss_pred             cHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHH
Q 032803           43 SAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEK   76 (133)
Q Consensus        43 say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~   76 (133)
                      +.|.=|-+.-.+.++...|+ .+..+|..++.++
T Consensus         9 ~tFtEyKKrL~e~l~~k~P~-at~~~l~~lve~R   41 (68)
T PF09164_consen    9 NTFTEYKKRLAERLRAKLPD-ATPTELKELVEKR   41 (68)
T ss_dssp             S-HHHHHHHHHHHHHHH-TT-S-HHHHHHHHHHH
T ss_pred             ccHHHHHHHHHHHHHHHCCC-CCHHHHHHHHHHH
Confidence            45777888888889999999 7888887777654


No 45 
>cd07081 ALDH_F20_ACDH_EutE-like Coenzyme A acylating aldehyde dehydrogenase (ACDH), Ethanolamine utilization protein EutE, and related proteins. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH, and may be critical enzymes in the fermentative pathway.
Probab=30.58  E-value=1.5e+02  Score=24.90  Aligned_cols=41  Identities=12%  Similarity=0.040  Sum_probs=34.1

Q ss_pred             HHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           68 AVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        68 eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      +.++..-..|+.++..+|..+...+....+.+..++.....
T Consensus         6 ~~A~~A~~~W~~~~~~~R~~iL~~~a~~l~~~~~ela~~~~   46 (439)
T cd07081           6 AAAKVAQQGLSCKSQEMVDLIFRAAAEAAEDARIDLAKLAV   46 (439)
T ss_pred             HHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            44555667899999999999999999988888888887754


No 46 
>PF03480 SBP_bac_7:  Bacterial extracellular solute-binding protein, family 7;  InterPro: IPR018389 This family of proteins are involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes a C4-dicarboxylate-binding protein DctP [, ] and the sialic acid-binding protein SiaP. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins []. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an alpha-helix hinge component [].; GO: 0006810 transport, 0030288 outer membrane-bounded periplasmic space; PDB: 2HZK_C 2HZL_B 2HPG_C 2XWI_A 2XWK_A 2WX9_A 2CEY_A 2WYP_A 3B50_A 2CEX_B ....
Probab=28.58  E-value=1.2e+02  Score=23.27  Aligned_cols=29  Identities=14%  Similarity=0.321  Sum_probs=20.7

Q ss_pred             HHHhhCCChHHhHHHHHHHHHHHHHHHHH
Q 032803           74 GEKWKSMSEADKAPYVAKAEKRKVEYEKD  102 (133)
Q Consensus        74 ~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e  102 (133)
                      ...|..||++.|+...+.+......+...
T Consensus       213 ~~~w~~L~~e~q~~l~~~~~~~~~~~~~~  241 (286)
T PF03480_consen  213 KDWWDSLPDEDQEALDDAADEAEARAREY  241 (286)
T ss_dssp             HHHHHHS-HHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHhcCCHHHHHHHHHHHHHHHHHHHHH
Confidence            35799999999999998777665444333


No 47 
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=27.89  E-value=1.3e+02  Score=24.32  Aligned_cols=36  Identities=19%  Similarity=0.307  Sum_probs=27.0

Q ss_pred             HHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           73 GGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        73 l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      -...|..||++.++...+.+.+..........+...
T Consensus       243 s~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e~  278 (332)
T COG1638         243 SKAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELED  278 (332)
T ss_pred             cHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            457899999999999999888876655555544443


No 48 
>KOG1610 consensus Corticosteroid 11-beta-dehydrogenase and related short chain-type dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism; General function prediction only]
Probab=27.62  E-value=2e+02  Score=23.55  Aligned_cols=58  Identities=19%  Similarity=0.348  Sum_probs=39.2

Q ss_pred             HHHHHHHHHHHHhC-------CCC-----CCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHH
Q 032803           47 VFMEEFREQYKKDH-------PKN-----KSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYN  107 (133)
Q Consensus        47 lF~~e~r~~~k~~~-------p~~-----~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~  107 (133)
                      .|+...|.++..-.       |+.     .+...+.+.+.+.|..|+++.|+.|=+.+..+   |...+..|.
T Consensus       187 af~D~lR~EL~~fGV~VsiiePG~f~T~l~~~~~~~~~~~~~w~~l~~e~k~~YGedy~~~---~~~~~~~~~  256 (322)
T KOG1610|consen  187 AFSDSLRRELRPFGVKVSIIEPGFFKTNLANPEKLEKRMKEIWERLPQETKDEYGEDYFED---YKKSLEKYL  256 (322)
T ss_pred             HHHHHHHHHHHhcCcEEEEeccCccccccCChHHHHHHHHHHHhcCCHHHHHHHHHHHHHH---HHHHHHhhh
Confidence            36777777665321       221     24578999999999999999999998776654   333444444


No 49 
>smart00271 DnaJ DnaJ molecular chaperone homology domain.
Probab=26.66  E-value=1.2e+02  Score=17.24  Aligned_cols=32  Identities=22%  Similarity=0.308  Sum_probs=19.4

Q ss_pred             HHHHHHHhCCCCCC-----HHHHHHHHHHHhhCCChH
Q 032803           52 FREQYKKDHPKNKS-----VAAVGKAGGEKWKSMSEA   83 (133)
Q Consensus        52 ~r~~~k~~~p~~~~-----~~eisk~l~~~Wk~ls~e   83 (133)
                      .+..++.-||+...     ..+....|.+.|..|.+.
T Consensus        22 y~~l~~~~HPD~~~~~~~~~~~~~~~l~~Ay~~L~~~   58 (60)
T smart00271       22 YRKLALKYHPDKNPGDKEEAEEKFKEINEAYEVLSDP   58 (60)
T ss_pred             HHHHHHHHCcCCCCCchHHHHHHHHHHHHHHHHHcCC
Confidence            34455666888333     345666777777776654


No 50 
>PF07813 LTXXQ:  LTXXQ motif family protein;  InterPro: IPR012899 This five residue motif is found in a number of bacterial proteins bearing similarity to the protein CpxP (P32158 from SWISSPROT). This is a periplasmic protein that aids in combating extracytoplasmic protein-mediated toxicity, and may also be involved in the response to alkaline pH []. Another member of this family, Spy (P77754 from SWISSPROT) is also a periplasmic protein that may be involved in the response to stress []. The homology between CpxP and Spy may indicate that these two proteins are functionally related []. The motif is found repeated twice in many members of this entry. ; GO: 0042597 periplasmic space; PDB: 3ITF_B 3QZC_B 3OEO_D 3O39_A.
Probab=25.76  E-value=1.3e+02  Score=18.85  Aligned_cols=25  Identities=12%  Similarity=0.129  Sum_probs=19.1

Q ss_pred             HHHHHHHHHHhhCCChHHhHHHHHH
Q 032803           67 AAVGKAGGEKWKSMSEADKAPYVAK   91 (133)
Q Consensus        67 ~eisk~l~~~Wk~ls~eeK~~Y~~~   91 (133)
                      ..+.......+..|++++|..|..+
T Consensus        75 ~~~~~~~~~~~~vLt~eQk~~~~~l   99 (100)
T PF07813_consen   75 EERAKAQHALYAVLTPEQKEKFDQL   99 (100)
T ss_dssp             HHHHHHHHHHHTTS-HHHHHHHHHH
T ss_pred             HHHHHHHHHHHhcCCHHHHHHHHHh
Confidence            4566777889999999999988764


No 51 
>PF05388 Carbpep_Y_N:  Carboxypeptidase Y pro-peptide;  InterPro: IPR008442 This signature is found at the N terminus of carboxypeptidase Y, which belong to MEROPS peptidase family S10. This region contains the signal peptide and pro-peptide regions [,].; GO: 0004185 serine-type carboxypeptidase activity, 0005773 vacuole
Probab=25.33  E-value=1.4e+02  Score=20.41  Aligned_cols=30  Identities=20%  Similarity=0.092  Sum_probs=25.3

Q ss_pred             HHHHHHHHHHHhhCCChHHhHHHHHHHHHH
Q 032803           66 VAAVGKAGGEKWKSMSEADKAPYVAKAEKR   95 (133)
Q Consensus        66 ~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~   95 (133)
                      +.-+++.+++.++.|+.+-|+.|.++...-
T Consensus        45 ~~~~~~~l~e~l~~Lt~e~k~~W~E~~~~f   74 (113)
T PF05388_consen   45 LEKISKYLNEPLKSLTSEAKALWDEMMLLF   74 (113)
T ss_pred             HHHHHHHHHHHHhhccHHHHHHHHHHHHHC
Confidence            456777789999999999999999988753


No 52 
>cd07133 ALDH_CALDH_CalB Coniferyl aldehyde dehydrogenase-like. Coniferyl aldehyde dehydrogenase (CALDH, EC=1.2.1.68) of Pseudomonas sp. strain HR199 (CalB) which catalyzes the NAD+-dependent oxidation of coniferyl aldehyde to ferulic acid, and similar sequences, are present in this CD.
Probab=24.96  E-value=2.2e+02  Score=23.67  Aligned_cols=41  Identities=7%  Similarity=-0.056  Sum_probs=33.8

Q ss_pred             HHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           68 AVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        68 eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      +.++..-..|+.++..+|..+...+....+.+..++.....
T Consensus         5 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~~   45 (434)
T cd07133           5 ERQKAAFLANPPPSLEERRDRLDRLKALLLDNQDALAEAIS   45 (434)
T ss_pred             HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            44566667899999999999999998988888888887654


No 53 
>cd07122 ALDH_F20_ACDH Coenzyme A acylating aldehyde dehydrogenase (ACDH), ALDH family 20-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH, EC=1.2.1.10), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA . The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH and may be critical enzymes in the fermentative pathway.
Probab=24.34  E-value=2.2e+02  Score=23.88  Aligned_cols=41  Identities=5%  Similarity=0.112  Sum_probs=33.1

Q ss_pred             HHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHH
Q 032803           68 AVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNR  108 (133)
Q Consensus        68 eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~  108 (133)
                      +.++..-..|..++..+|..+...+....+.+..++.....
T Consensus         6 ~~A~~A~~~W~~~~~~eR~~~L~~~a~~l~~~~eela~~~~   46 (436)
T cd07122           6 ERARKAQREFATFSQEQVDKIVEAVAWAAADAAEELAKMAV   46 (436)
T ss_pred             HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            34455566799999999999999988888888888877654


No 54 
>cd07132 ALDH_F3AB Aldehyde dehydrogenase family 3 members A1, A2, and B1 and related proteins. NAD(P)+-dependent, aldehyde dehydrogenase, family 3 members A1 and B1  (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and similar sequences are included in this CD. Human ALDH3A1 is a homodimer with a critical role in cellular defense against oxidative stress; it catalyzes the oxidation of various cellular membrane lipid-derived aldehydes. Corneal crystalline ALDH3A1 protects the cornea and underlying lens against UV-induced oxidative stress. Human ALDH3A2, a microsomal homodimer, catalyzes the oxidation of long-chain aliphatic aldehydes to fatty acids. Human ALDH3B1 is highly expressed in the kidney and liver and catalyzes the oxidation of various medium- and long-chain saturated and unsaturated aliphatic aldehydes.
Probab=24.05  E-value=2.1e+02  Score=23.81  Aligned_cols=42  Identities=7%  Similarity=-0.074  Sum_probs=34.1

Q ss_pred             HHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHh
Q 032803           68 AVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRR  109 (133)
Q Consensus        68 eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~  109 (133)
                      +.++..-..|+.++..+|..+...+....+.+..++..-...
T Consensus         5 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~~l~~~~~~   46 (443)
T cd07132           5 RRAREAFSSGKTRPLEFRIQQLEALLRMLEENEDEIVEALAK   46 (443)
T ss_pred             HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHHHHH
Confidence            445666678999999999999999888888888888776653


No 55 
>PTZ00037 DnaJ_C chaperone protein; Provisional
Probab=23.73  E-value=1.8e+02  Score=24.39  Aligned_cols=43  Identities=16%  Similarity=0.222  Sum_probs=30.8

Q ss_pred             HHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHH
Q 032803           49 MEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAK   91 (133)
Q Consensus        49 ~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~   91 (133)
                      -+..|...++-||+.....+..+.|.+.|..|++.+|....+.
T Consensus        46 KkAYrkla~k~HPDk~~~~e~F~~i~~AYevLsD~~kR~~YD~   88 (421)
T PTZ00037         46 KKAYRKLAIKHHPDKGGDPEKFKEISRAYEVLSDPEKRKIYDE   88 (421)
T ss_pred             HHHHHHHHHHHCCCCCchHHHHHHHHHHHHHhccHHHHHHHhh
Confidence            3455666678899932345788899999999998886554443


No 56 
>KOG2880 consensus SMAD6 interacting protein AMSH, contains JAB/MPN/Mov34 domain [Signal transduction mechanisms]
Probab=23.71  E-value=4.1e+02  Score=22.35  Aligned_cols=66  Identities=14%  Similarity=0.185  Sum_probs=37.8

Q ss_pred             CcHHHHHHHHHHH--HHHHhCCCCCCHHHHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHhC
Q 032803           42 ASAFFVFMEEFRE--QYKKDHPKNKSVAAVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRRQ  110 (133)
Q Consensus        42 ~say~lF~~e~r~--~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~~  110 (133)
                      -+||+||.+-.-=  +-...||+ .  ..+-...-...+.|-++.-..-.++..++..+|..+..+|....
T Consensus        52 enafvLy~ry~tLfiEkipkHrD-y--~s~k~ek~d~~~klk~~~~p~~deL~~~ll~rY~~eyn~y~~~K  119 (424)
T KOG2880|consen   52 ENAFVLYLRYITLFIEKIPKHRD-Y--RSVKPEKEDIRKKLKEEAFPRIDELKAKLLKRYNVEYNEYDHSK  119 (424)
T ss_pred             chhhhHHHHHHHHHHHhcccCcc-h--hhhchhHHHHHHHHHHHhhhhHHHHHHHHHHHHhhHHHHHHHHH
Confidence            3677776543311  11234665 2  23333333344444466666666777888888888888887654


No 57 
>PF15581 Imm35:  Immunity protein 35
Probab=23.08  E-value=90  Score=20.71  Aligned_cols=22  Identities=9%  Similarity=0.335  Sum_probs=17.6

Q ss_pred             CHHHHHHHHHHHhhCCChHHhH
Q 032803           65 SVAAVGKAGGEKWKSMSEADKA   86 (133)
Q Consensus        65 ~~~eisk~l~~~Wk~ls~eeK~   86 (133)
                      +..-+...|...|+-|++++=.
T Consensus        31 ~i~~l~~lIe~eWRGl~~~qV~   52 (93)
T PF15581_consen   31 TIRNLESLIEHEWRGLPEEQVL   52 (93)
T ss_pred             HHHHHHHHHHHHHcCCCHHHHH
Confidence            4556788999999999987753


No 58 
>PF02026 RyR:  RyR domain;  InterPro: IPR003032 This domain is called RyR for Ryanodine receptor []. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown.; PDB: 4ETV_A 3RQR_A 4ETT_A 4ERT_A 4ESU_A 4ETU_A 4ERV_A 3NRT_E.
Probab=22.72  E-value=87  Score=20.65  Aligned_cols=21  Identities=14%  Similarity=0.177  Sum_probs=16.4

Q ss_pred             HhhCCChHHhHHHHHHHHHHH
Q 032803           76 KWKSMSEADKAPYVAKAEKRK   96 (133)
Q Consensus        76 ~Wk~ls~eeK~~Y~~~a~~~k   96 (133)
                      -|..|++.+|..+.+.+....
T Consensus        60 py~~L~e~eK~~dr~~~~e~l   80 (94)
T PF02026_consen   60 PYDELSEEEKEKDRDMVRETL   80 (94)
T ss_dssp             -GGGS-HHHHHHHHHHHHHHH
T ss_pred             ChhhCCHHHHHHhHHHHHHHH
Confidence            399999999999998887654


No 59 
>PRK14291 chaperone protein DnaJ; Provisional
Probab=22.55  E-value=1.9e+02  Score=23.83  Aligned_cols=39  Identities=21%  Similarity=0.280  Sum_probs=27.7

Q ss_pred             HHHHHHHHhCCCCCC----HHHHHHHHHHHhhCCChHHhHHHHH
Q 032803           51 EFREQYKKDHPKNKS----VAAVGKAGGEKWKSMSEADKAPYVA   90 (133)
Q Consensus        51 e~r~~~k~~~p~~~~----~~eisk~l~~~Wk~ls~eeK~~Y~~   90 (133)
                      ..|...+.-||+ .+    ..+..+.|.+.|..|++.+|..-.+
T Consensus        23 ayr~la~~~HPD-~~~~~~~~~~f~~i~~Ay~vLsd~~kR~~YD   65 (382)
T PRK14291         23 AYRRLARKYHPD-FNKNPEAEEKFKEINEAYQVLSDPEKRKLYD   65 (382)
T ss_pred             HHHHHHHHHCCC-CCCCccHHHHHHHHHHHHHHhcCHHHHHHHh
Confidence            445556677888 43    4577789999999999887654333


No 60 
>cd08317 Death_ank Death domain associated with Ankyrins. Death Domain (DD) associated with Ankyrins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. Ankyrins function as adaptor proteins and they interact, through ANK repeats, with structurally diverse membrane proteins, including ion channels/pumps, calcium release channels, and cell adhesion molecules. They play critical roles in the proper expression and membrane localization of these proteins. In mammals, this family includes ankyrin-R for restricted (or ANK1), ankyrin-B for broadly expressed (or ANK2) and ankyrin-G for general or giant (or ANK3). They are expressed in different combinations in many tissues and play non-overlapping functions. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-associati
Probab=22.17  E-value=48  Score=21.04  Aligned_cols=49  Identities=12%  Similarity=0.171  Sum_probs=27.9

Q ss_pred             CCHHHHHHHHHHHhhCCChHHh------HHHHHHHH-HHHHHHHHHHHHHHHhCCC
Q 032803           64 KSVAAVGKAGGEKWKSMSEADK------APYVAKAE-KRKVEYEKDMKNYNRRQAE  112 (133)
Q Consensus        64 ~~~~eisk~l~~~Wk~ls~eeK------~~Y~~~a~-~~k~~y~~e~~~y~~~~~~  112 (133)
                      ..+..|+..||.-|..|-..=-      ..+..... ...++-..-+..|..+...
T Consensus         5 ~~l~~ia~~lG~dW~~LAr~Lg~~~~dI~~i~~~~~~~~~eq~~~mL~~W~~r~g~   60 (84)
T cd08317           5 IRLADISNLLGSDWPQLARELGVSETDIDLIKAENPNSLAQQAQAMLKLWLEREGK   60 (84)
T ss_pred             chHHHHHHHHhhHHHHHHHHcCCCHHHHHHHHHHCCCCHHHHHHHHHHHHHHhcCC
Confidence            6788999999999987653322      22221111 0123344556677776543


No 61 
>PF09655 Nitr_red_assoc:  Conserved nitrate reductase-associated protein (Nitr_red_assoc);  InterPro: IPR013481  Proteins in this entry are found in the Cyanobacteria, and are mostly encoded near nitrate reductase and molybdopterin biosynthesis genes. Molybdopterin guanine dinucleotide is a cofactor for nitrate reductase. These proteins are sometimes annotated as nitrate reductase-associated proteins, though their function is unknown.
Probab=22.09  E-value=1.8e+02  Score=20.99  Aligned_cols=41  Identities=17%  Similarity=0.400  Sum_probs=32.4

Q ss_pred             HHhhCCChHHhHHHHHHH---HHHHHHHHHHHHHHHHhCCCCCC
Q 032803           75 EKWKSMSEADKAPYVAKA---EKRKVEYEKDMKNYNRRQAEGTK  115 (133)
Q Consensus        75 ~~Wk~ls~eeK~~Y~~~a---~~~k~~y~~e~~~y~~~~~~~~~  115 (133)
                      ..|..||.+||+...+..   ..+.+.|...+...-..++..+.
T Consensus        33 ~~W~~l~~~eRq~Lv~~pc~t~~ei~~yr~~L~~li~~~~~~~~   76 (144)
T PF09655_consen   33 SHWQQLSQEERQQLVDLPCDTPEEIQNYREFLQELIRTHAGGPA   76 (144)
T ss_pred             HHHhcCCHHHHHHHHcCCCCCHHHHHHHHHHHHHHHHHHhCCCc
Confidence            579999999999998865   45566888888888877765554


No 62 
>PF15076 DUF4543:  Domain of unknown function (DUF4543)
Probab=21.67  E-value=61  Score=20.37  Aligned_cols=21  Identities=14%  Similarity=0.567  Sum_probs=17.2

Q ss_pred             CCCCCCCCCCCCcHHHHHHHH
Q 032803           31 AAKDPNKPKRPASAFFVFMEE   51 (133)
Q Consensus        31 ~~~dp~~PKrP~say~lF~~e   51 (133)
                      +...|+.|--||.-||++++.
T Consensus        25 r~~K~GfpdepmrE~ml~l~~   45 (75)
T PF15076_consen   25 RPRKPGFPDEPMREYMLHLQA   45 (75)
T ss_pred             CCCCCCCCcchHHHHHHHHHH
Confidence            445688999999999999863


No 63 
>PRK14296 chaperone protein DnaJ; Provisional
Probab=21.66  E-value=1.9e+02  Score=23.71  Aligned_cols=39  Identities=18%  Similarity=0.136  Sum_probs=27.2

Q ss_pred             HHHHHHHHhCCCCCC----HHHHHHHHHHHhhCCChHHhHHHHH
Q 032803           51 EFREQYKKDHPKNKS----VAAVGKAGGEKWKSMSEADKAPYVA   90 (133)
Q Consensus        51 e~r~~~k~~~p~~~~----~~eisk~l~~~Wk~ls~eeK~~Y~~   90 (133)
                      ..|...++-||+ .+    ..+..+.|.+.|..|++.+|..-.+
T Consensus        24 ayrkla~~~HPD-~n~~~~a~~~F~~i~~AyevLsD~~KR~~YD   66 (372)
T PRK14296         24 AYRKLAKQYHPD-LNKSPDAHDKMVEINEAADVLLDKDKRKQYD   66 (372)
T ss_pred             HHHHHHHHHCcC-CCCCchHHHHHHHHHHHHHHhcCHHHhhhhh
Confidence            344555667887 43    4467788999999999888654444


No 64 
>cd07087 ALDH_F3-13-14_CALDH-like ALDH subfamily: Coniferyl aldehyde dehydrogenase, ALDH families 3, 13, and 14, and other related proteins. ALDH subfamily which includes NAD(P)+-dependent, aldehyde dehydrogenase, family 3 member A1 and B1  (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and also plant ALDH family members ALDH3F1, ALDH3H1, and ALDH3I1, fungal ALDH14 (YMR110C) and the protozoan family 13 member (ALDH13), as well as coniferyl aldehyde dehydrogenases (CALDH, EC=1.2.1.68), and other similar  sequences, such as the Pseudomonas putida benzaldehyde dehydrogenase I that is involved in the metabolism of mandelate.
Probab=21.55  E-value=2.6e+02  Score=23.10  Aligned_cols=42  Identities=12%  Similarity=-0.095  Sum_probs=33.6

Q ss_pred             HHHHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHHHh
Q 032803           68 AVGKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYNRR  109 (133)
Q Consensus        68 eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~~~  109 (133)
                      +.++..-..|..++..+|..+...+....+.+..++......
T Consensus         5 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~~~   46 (426)
T cd07087           5 ARLRETFLTGKTRSLEWRKAQLKALKRMLTENEEEIAAALYA   46 (426)
T ss_pred             HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHHHHH
Confidence            345556677999999999999999988888888888766543


No 65 
>cd07085 ALDH_F6_MMSDH Methylmalonate semialdehyde dehydrogenase and ALDH family members 6A1 and 6B2. Methylmalonate semialdehyde dehydrogenase (MMSDH, EC=1.2.1.27) [acylating] from Bacillus subtilis is involved in valine metabolism and catalyses the NAD+- and CoA-dependent oxidation of methylmalonate semialdehyde into propionyl-CoA. Mitochondrial human MMSDH ALDH6A1 and Arabidopsis MMSDH ALDH6B2 are also present in this CD.
Probab=21.15  E-value=2.8e+02  Score=23.25  Aligned_cols=38  Identities=13%  Similarity=0.114  Sum_probs=30.1

Q ss_pred             HHHHHHHhhCCChHHhHHHHHHHHHHHHHHHHHHHHHH
Q 032803           70 GKAGGEKWKSMSEADKAPYVAKAEKRKVEYEKDMKNYN  107 (133)
Q Consensus        70 sk~l~~~Wk~ls~eeK~~Y~~~a~~~k~~y~~e~~~y~  107 (133)
                      ++.....|+.++..+|..+...+......+..++..-.
T Consensus        47 A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~   84 (478)
T cd07085          47 AKAAFPAWSATPVLKRQQVMFKFRQLLEENLDELARLI   84 (478)
T ss_pred             HHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            44455679999999999999988888888877776543


No 66 
>PF08367 M16C_assoc:  Peptidase M16C associated;  InterPro: IPR013578 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This domain appears in eukaryotes as well as bacteria and tends to be found near the C terminus of metalloproteases and related sequences belonging to MEROPS peptidase family M16 (subfamily M16C, clan ME). These include: eupitrilysin, falcilysin, PreP peptidase, CYM1 peptidase and subfamily M16C non-peptidase homologues.; GO: 0008237 metallopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis; PDB: 2FGE_B 3S5I_A 3S5H_A 3S5M_A 3S5K_A.
Probab=20.82  E-value=2.3e+02  Score=21.57  Aligned_cols=30  Identities=17%  Similarity=0.134  Sum_probs=25.0

Q ss_pred             HHHHHHHHHHhhCCChHHhHHHHHHHHHHH
Q 032803           67 AAVGKAGGEKWKSMSEADKAPYVAKAEKRK   96 (133)
Q Consensus        67 ~eisk~l~~~Wk~ls~eeK~~Y~~~a~~~k   96 (133)
                      .+....|.+.+..|++++++...+.+...+
T Consensus        13 ~~e~~~L~~~k~~Ls~~e~~~i~~~~~~L~   42 (248)
T PF08367_consen   13 EEEKEKLAAYKASLSEEEKEKIIEQTKELK   42 (248)
T ss_dssp             HHHHHHHHHHHHCS-HHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHhhCCHHHHHHHHHHHHHHH
Confidence            467788999999999999999998888764


No 67 
>cd06257 DnaJ DnaJ domain or J-domain.  DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification.
Probab=20.46  E-value=1.7e+02  Score=16.13  Aligned_cols=31  Identities=23%  Similarity=0.285  Sum_probs=18.3

Q ss_pred             HHHHHHHHhCCCCCC----HHHHHHHHHHHhhCCC
Q 032803           51 EFREQYKKDHPKNKS----VAAVGKAGGEKWKSMS   81 (133)
Q Consensus        51 e~r~~~k~~~p~~~~----~~eisk~l~~~Wk~ls   81 (133)
                      ..|..++.-||+...    ..+....|...|..|+
T Consensus        20 ~y~~l~~~~HPD~~~~~~~~~~~~~~l~~Ay~~L~   54 (55)
T cd06257          20 AYRKLALKYHPDKNPDDPEAEEKFKEINEAYEVLS   54 (55)
T ss_pred             HHHHHHHHHCcCCCCCcHHHHHHHHHHHHHHHHhc
Confidence            345556667887332    3455666666666664


No 68 
>KOG1827 consensus Chromatin remodeling complex RSC, subunit RSC1/Polybromo and related proteins [Chromatin structure and dynamics; Transcription]
Probab=20.42  E-value=5.3  Score=35.29  Aligned_cols=44  Identities=16%  Similarity=0.276  Sum_probs=39.7

Q ss_pred             CCcHHHHHHHHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHh
Q 032803           41 PASAFFVFMEEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADK   85 (133)
Q Consensus        41 P~say~lF~~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK   85 (133)
                      -+++|++|+.+.+..+...+|+ ..+.+++.+.|..|..|+...+
T Consensus       552 ~~~~~~~~s~~~~~~~~~~np~-v~~~~~~~~vg~~~~~lp~~~k  595 (629)
T KOG1827|consen  552 SPEPYILDSIENRTIIWFENPT-VGFGEVSIIVGNDWDKLPNINK  595 (629)
T ss_pred             CCccccccccccCceeeeeCCC-cccceeEEeecCCcccCccccc
Confidence            5688999999999999999999 8999999999999999994444


No 69 
>PRK10266 curved DNA-binding protein CbpA; Provisional
Probab=20.34  E-value=2.6e+02  Score=22.17  Aligned_cols=40  Identities=23%  Similarity=0.353  Sum_probs=27.4

Q ss_pred             HHHHHHHHhCCCCC---CHHHHHHHHHHHhhCCChHHhHHHHH
Q 032803           51 EFREQYKKDHPKNK---SVAAVGKAGGEKWKSMSEADKAPYVA   90 (133)
Q Consensus        51 e~r~~~k~~~p~~~---~~~eisk~l~~~Wk~ls~eeK~~Y~~   90 (133)
                      ..|...++-||+..   ...+..+.|.+.|..|++..+..-.+
T Consensus        24 ayr~la~k~HPD~~~~~~~~~~f~~i~~Ay~~L~~~~kr~~yD   66 (306)
T PRK10266         24 AYRRLARKYHPDVSKEPDAEARFKEVAEAWEVLSDEQRRAEYD   66 (306)
T ss_pred             HHHHHHHHHCcCCCCCccHHHHHHHHHHHHHHhhhHHHHHHHH
Confidence            34555567788821   26678889999999999776654333


No 70 
>PF06628 Catalase-rel:  Catalase-related immune-responsive;  InterPro: IPR010582 Catalases (1.11.1.6 from EC) are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects []. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases (IPR000763 from INTERPRO) that are closely related to plant peroxidases, and non-haem, manganese-containing catalases (IPR007760 from INTERPRO) that are found in bacteria []. This entry represents a small conserved region within catalase enzymes that carries the immune-responsive amphipathic octa-peptide that is recognised by T cells [].; PDB: 2CAH_A 1NM0_A 1H7K_A 1E93_A 1H6N_A 3HB6_A 2CAG_A 1M85_A 1MQF_A 1A4E_C ....
Probab=20.25  E-value=1.1e+02  Score=18.74  Aligned_cols=19  Identities=11%  Similarity=0.375  Sum_probs=15.1

Q ss_pred             HHHHHhhCCChHHhHHHHH
Q 032803           72 AGGEKWKSMSEADKAPYVA   90 (133)
Q Consensus        72 ~l~~~Wk~ls~eeK~~Y~~   90 (133)
                      .-+..|+.|++.+|+-+..
T Consensus        12 Qa~~ly~~l~~~er~~lv~   30 (68)
T PF06628_consen   12 QARDLYRVLSDEERERLVE   30 (68)
T ss_dssp             HHHHHHHHSSHHHHHHHHH
T ss_pred             hHHHHHHHCCHHHHHHHHH
Confidence            4567899999999987775


No 71 
>PRK14279 chaperone protein DnaJ; Provisional
Probab=20.13  E-value=1.7e+02  Score=24.22  Aligned_cols=40  Identities=20%  Similarity=0.219  Sum_probs=28.4

Q ss_pred             HHHHHHHHhCCCCCC-----HHHHHHHHHHHhhCCChHHhHHHHHH
Q 032803           51 EFREQYKKDHPKNKS-----VAAVGKAGGEKWKSMSEADKAPYVAK   91 (133)
Q Consensus        51 e~r~~~k~~~p~~~~-----~~eisk~l~~~Wk~ls~eeK~~Y~~~   91 (133)
                      ..|...++-||+ .+     ..+..+.|.+.|..|++.+|..-.+.
T Consensus        29 ayr~la~~~HPD-~~~~~~~a~~~f~~i~~Ay~vLsD~~KR~~YD~   73 (392)
T PRK14279         29 AYRKLARELHPD-ANPGDPAAEERFKAVSEAHDVLSDPAKRKEYDE   73 (392)
T ss_pred             HHHHHHHHHCcC-CCCCChHHHHHHHHHHHHHHHhcchhhhhHHHH
Confidence            345556677888 42     34777899999999998887644443


No 72 
>PHA03102 Small T antigen; Reviewed
Probab=20.09  E-value=1.8e+02  Score=21.04  Aligned_cols=36  Identities=17%  Similarity=0.159  Sum_probs=23.6

Q ss_pred             HHHHHHHHHhCCCCCCHHHHHHHHHHHhhCCChHHh
Q 032803           50 EEFREQYKKDHPKNKSVAAVGKAGGEKWKSMSEADK   85 (133)
Q Consensus        50 ~e~r~~~k~~~p~~~~~~eisk~l~~~Wk~ls~eeK   85 (133)
                      +..|..++.-|||.-...+..+.|.+.|..|++..+
T Consensus        26 kAYr~la~~~HPDkgg~~e~~k~in~Ay~~L~d~~~   61 (153)
T PHA03102         26 KAYLRKCLEFHPDKGGDEEKMKELNTLYKKFRESVK   61 (153)
T ss_pred             HHHHHHHHHHCcCCCchhHHHHHHHHHHHHHhhHHH
Confidence            345666677899843345667777777777776544


Done!