Query         032101
Match_columns 147
No_of_seqs    162 out of 1153
Neff          6.7 
Searched_HMMs 46136
Date          Fri Mar 29 09:36:20 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/032101.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/032101hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PTZ00199 high mobility group p  99.9 2.4E-25 5.2E-30  155.3  11.4   86   39-124     7-93  (94)
  2 cd01389 MATA_HMG-box MATA_HMG-  99.8 6.4E-21 1.4E-25  127.8   8.1   72   54-126     1-72  (77)
  3 PF00505 HMG_box:  HMG (high mo  99.8 2.9E-20 6.2E-25  121.1   9.0   69   55-124     1-69  (69)
  4 cd01388 SOX-TCF_HMG-box SOX-TC  99.8 2.6E-20 5.7E-25  123.4   8.2   70   55-125     2-71  (72)
  5 cd01390 HMGB-UBF_HMG-box HMGB-  99.8   1E-19 2.2E-24  117.4   9.0   65   55-120     1-65  (66)
  6 COG5648 NHP6B Chromatin-associ  99.8   5E-20 1.1E-24  143.6   8.8   89   43-132    59-147 (211)
  7 smart00398 HMG high mobility g  99.8 1.8E-19   4E-24  116.8   9.3   70   54-124     1-70  (70)
  8 PF09011 HMG_box_2:  HMG-box do  99.8 2.3E-19 5.1E-24  119.1   9.0   72   52-124     1-73  (73)
  9 KOG0381 HMG box-containing pro  99.8 3.6E-18 7.8E-23  118.2  10.8   76   51-127    17-95  (96)
 10 cd00084 HMG-box High Mobility   99.7 9.6E-18 2.1E-22  107.5   9.0   65   55-120     1-65  (66)
 11 KOG0527 HMG-box transcription   99.7 1.3E-17 2.7E-22  139.0   6.2   79   48-127    56-134 (331)
 12 KOG0526 Nucleosome-binding fac  99.7   3E-17 6.4E-22  142.1   7.9   78   44-126   525-602 (615)
 13 KOG4715 SWI/SNF-related matrix  99.3 2.2E-12 4.9E-17  106.3   7.3   78   48-126    58-135 (410)
 14 KOG3248 Transcription factor T  99.3 2.3E-12 4.9E-17  107.0   5.9   71   54-125   191-261 (421)
 15 KOG0528 HMG-box transcription   99.1 1.4E-11   3E-16  106.1   1.8   80   50-130   321-400 (511)
 16 KOG2746 HMG-box transcription   98.6 4.1E-08 8.9E-13   87.6   4.8   76   43-119   170-247 (683)
 17 PF14887 HMG_box_5:  HMG (high   98.3 7.4E-06 1.6E-10   55.1   7.7   74   54-129     3-76  (85)
 18 PF06382 DUF1074:  Protein of u  97.2  0.0007 1.5E-08   52.2   5.1   49   59-112    83-131 (183)
 19 PF04690 YABBY:  YABBY protein;  97.2 0.00097 2.1E-08   51.2   5.7   49   49-98    116-164 (170)
 20 COG5648 NHP6B Chromatin-associ  97.1 0.00041 8.8E-09   54.7   3.0   69   52-121   141-209 (211)
 21 PF08073 CHDNT:  CHDNT (NUC034)  95.2   0.031 6.7E-07   35.3   3.5   39   60-99     14-52  (55)
 22 PF06244 DUF1014:  Protein of u  93.1    0.15 3.3E-06   37.2   4.0   50   50-100    68-117 (122)
 23 PF04769 MAT_Alpha1:  Mating-ty  90.5    0.87 1.9E-05   35.9   5.9   53   50-109    39-91  (201)
 24 TIGR03481 HpnM hopanoid biosyn  89.8    0.91   2E-05   35.4   5.5   44   83-126    66-111 (198)
 25 PRK15117 ABC transporter perip  89.3     1.1 2.3E-05   35.4   5.6   48   78-126    66-115 (211)
 26 KOG3223 Uncharacterized conser  79.6     3.9 8.5E-05   32.2   4.6   53   53-109   162-215 (221)
 27 PF05494 Tol_Tol_Ttg2:  Toluene  78.3     2.7 5.9E-05   31.5   3.3   47   78-125    36-84  (170)
 28 COG2854 Ttg2D ABC-type transpo  69.3     7.3 0.00016   30.8   3.8   42   88-129    78-120 (202)
 29 PF13875 DUF4202:  Domain of un  68.0      13 0.00027   29.1   4.8   40   60-103   130-169 (185)
 30 PF12881 NUT_N:  NUT protein N   67.9      16 0.00034   30.9   5.6   53   59-112   229-281 (328)
 31 PRK10363 cpxP periplasmic repr  54.7      52  0.0011   25.3   6.0   40   84-124   111-150 (166)
 32 PRK09706 transcriptional repre  49.3      48   0.001   23.7   5.0   44   85-128    87-130 (135)
 33 PRK12751 cpxP periplasmic stre  47.9      44 0.00096   25.4   4.7   34   85-118   118-151 (162)
 34 PRK12750 cpxP periplasmic repr  46.8      78  0.0017   24.1   6.0   35   86-120   126-160 (170)
 35 PF11304 DUF3106:  Protein of u  44.5      98  0.0021   21.7   5.8    9   94-102    56-64  (107)
 36 PF00887 ACBP:  Acyl CoA bindin  39.4      58  0.0013   21.6   3.9   53   62-116    30-86  (87)
 37 PF06945 DUF1289:  Protein of u  38.5      43 0.00093   20.3   2.8   24   83-111    24-47  (51)
 38 KOG1610 Corticosteroid 11-beta  36.1 1.1E+02  0.0023   26.0   5.6   57   65-124   188-256 (322)
 39 PF01352 KRAB:  KRAB box;  Inte  35.5      29 0.00063   20.2   1.6   28   83-110     3-31  (41)
 40 PF12650 DUF3784:  Domain of un  34.8      25 0.00054   23.8   1.5   15   94-108    26-40  (97)
 41 TIGR00787 dctP tripartite ATP-  31.8      99  0.0021   24.3   4.7   28   91-118   213-240 (257)
 42 cd07081 ALDH_F20_ACDH_EutE-lik  31.1 1.2E+02  0.0026   26.4   5.4   40   85-124     6-45  (439)
 43 COG4281 ACB Acyl-CoA-binding p  31.0      52  0.0011   22.3   2.4   61   54-116    16-85  (87)
 44 PF05388 Carbpep_Y_N:  Carboxyp  30.4      75  0.0016   22.7   3.4   29   83-111    45-73  (113)
 45 PRK10236 hypothetical protein;  30.4      49  0.0011   26.8   2.6   26   86-111   118-143 (237)
 46 KOG1827 Chromatin remodeling c  29.8     3.9 8.5E-05   37.4  -4.1   44   58-102   552-595 (629)
 47 COG1638 DctP TRAP-type C4-dica  29.3 1.1E+02  0.0024   25.7   4.7   35   91-125   244-278 (332)
 48 PF06394 Pepsin-I3:  Pepsin inh  28.9      59  0.0013   21.7   2.4   31   95-133    38-68  (76)
 49 cd07133 ALDH_CALDH_CalB Conife  26.4 1.8E+02  0.0038   25.1   5.6   42   84-125     4-45  (434)
 50 PRK10455 periplasmic protein;   26.3 1.4E+02   0.003   22.6   4.3   28   85-112   118-145 (161)
 51 PHA02662 ORF131 putative membr  25.4 1.6E+02  0.0035   23.7   4.7   45   60-105    22-98  (226)
 52 PF12290 DUF3802:  Protein of u  25.2 2.6E+02  0.0057   20.1   5.3   40   72-111    49-101 (113)
 53 PF03480 SBP_bac_7:  Bacterial   25.2 1.1E+02  0.0024   24.3   4.0   31   91-121   213-243 (286)
 54 cd07122 ALDH_F20_ACDH Coenzyme  23.7 1.9E+02  0.0041   25.2   5.3   40   85-124     6-45  (436)
 55 cd07132 ALDH_F3AB Aldehyde deh  23.6   2E+02  0.0043   24.9   5.4   40   85-124     5-44  (443)
 56 PF15581 Imm35:  Immunity prote  23.2      93   0.002   21.5   2.6   21   83-103    32-52  (93)
 57 cd01145 TroA_c Periplasmic bin  23.1 1.4E+02  0.0031   22.7   4.0   48   82-129   116-163 (203)
 58 cd07087 ALDH_F3-13-14_CALDH-li  21.2 2.4E+02  0.0051   24.2   5.4   40   85-124     5-44  (426)
 59 cd07085 ALDH_F6_MMSDH Methylma  21.0 2.4E+02  0.0052   24.5   5.4   37   87-123    47-83  (478)
 60 PTZ00037 DnaJ_C chaperone prot  20.7   2E+02  0.0043   25.1   4.8   43   66-108    46-88  (421)
 61 smart00271 DnaJ DnaJ molecular  20.4 1.7E+02  0.0037   17.2   3.3   34   67-100    20-58  (60)

No 1  
>PTZ00199 high mobility group protein; Provisional
Probab=99.93  E-value=2.4e-25  Score=155.28  Aligned_cols=86  Identities=44%  Similarity=0.699  Sum_probs=79.9

Q ss_pred             ccccccccccCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCC-cHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHH
Q 032101           39 KRQGKREKKAKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVT-AVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYG  117 (147)
Q Consensus        39 kk~~kk~kk~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~-~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~  117 (147)
                      ++.+++++++.+||+.|+||+||||+|++++|..|..+||+++ ++++|+++||++|++||+++|.+|++.|..++++|.
T Consensus         7 ~~~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~   86 (94)
T PTZ00199          7 KVLVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYE   86 (94)
T ss_pred             CccccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence            4445667778899999999999999999999999999999985 479999999999999999999999999999999999


Q ss_pred             HHHHHHh
Q 032101          118 KKMNAYN  124 (147)
Q Consensus       118 k~~~~Y~  124 (147)
                      .+|.+|+
T Consensus        87 ~e~~~Y~   93 (94)
T PTZ00199         87 KEKAEYA   93 (94)
T ss_pred             HHHHHHh
Confidence            9999996


No 2  
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.84  E-value=6.4e-21  Score=127.80  Aligned_cols=72  Identities=31%  Similarity=0.517  Sum_probs=69.4

Q ss_pred             CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhh
Q 032101           54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKK  126 (147)
Q Consensus        54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~  126 (147)
                      .|+||+||||||+++.|..|+.+||+++ +.+|+++||++|++||+++|++|.++|..++++|..++++|...
T Consensus         1 ~~kRP~naf~lf~~~~r~~~~~~~p~~~-~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yky~   72 (77)
T cd01389           1 KIPRPRNAFILYRQDKHAQLKTENPGLT-NNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYKYT   72 (77)
T ss_pred             CCCCCCcHHHHHHHHHHHHHHHHCCCCC-HHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCccc
Confidence            4899999999999999999999999997 79999999999999999999999999999999999999999875


No 3  
>PF00505 HMG_box:  HMG (high mobility group) box;  InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair []. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins []; LEF1 lymphoid enhancer binding factor 1 []; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor []; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.83  E-value=2.9e-20  Score=121.08  Aligned_cols=69  Identities=43%  Similarity=0.797  Sum_probs=65.5

Q ss_pred             CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      |+||+|||++|+.+++..++.+||+++ +.+|+++||++|++||+++|.+|.+.|..++.+|..++.+|+
T Consensus         1 PkrP~~af~lf~~~~~~~~k~~~p~~~-~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~   69 (69)
T PF00505_consen    1 PKRPPNAFMLFCKEKRAKLKEENPDLS-NKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK   69 (69)
T ss_dssp             SSSS--HHHHHHHHHHHHHHHHSTTST-HHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             CcCCCCHHHHHHHHHHHHHHHHhcccc-cccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            899999999999999999999999998 799999999999999999999999999999999999999995


No 4  
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.83  E-value=2.6e-20  Score=123.41  Aligned_cols=70  Identities=36%  Similarity=0.527  Sum_probs=67.4

Q ss_pred             CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101           55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK  125 (147)
Q Consensus        55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~  125 (147)
                      .+||+||||+|++++|..++.+||+++ +.+|+++||++|+.||+++|++|.+.|..++++|..++++|+.
T Consensus         2 iKrP~naf~~F~~~~r~~~~~~~p~~~-~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~y   71 (72)
T cd01388           2 IKRPMNAFMLFSKRHRRKVLQEYPLKE-NRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYKW   71 (72)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCCCCC-HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCCC
Confidence            589999999999999999999999997 7999999999999999999999999999999999999999963


No 5  
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.82  E-value=1e-19  Score=117.38  Aligned_cols=65  Identities=54%  Similarity=0.836  Sum_probs=63.3

Q ss_pred             CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHH
Q 032101           55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKM  120 (147)
Q Consensus        55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~  120 (147)
                      |++|+|||++|++++|..+..+||+++ +.+|++.||++|++||+++|.+|.+.|..++.+|..+|
T Consensus         1 Pkrp~saf~~f~~~~r~~~~~~~p~~~-~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~   65 (66)
T cd01390           1 PKRPLSAYFLFSQEQRPKLKKENPDAS-VTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEM   65 (66)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence            899999999999999999999999997 89999999999999999999999999999999999887


No 6  
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.82  E-value=5e-20  Score=143.59  Aligned_cols=89  Identities=42%  Similarity=0.689  Sum_probs=84.5

Q ss_pred             ccccccCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHH
Q 032101           43 KREKKAKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNA  122 (147)
Q Consensus        43 kk~kk~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~  122 (147)
                      +...+.++|||.|+||+||||+|+.++|.+|..++|.++ |.+|++.+|++|++|++++|.+|...|..++++|..++..
T Consensus        59 k~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l~-~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~  137 (211)
T COG5648          59 KRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKLT-FGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEE  137 (211)
T ss_pred             HHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCCC-hHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHh
Confidence            566778899999999999999999999999999999998 9999999999999999999999999999999999999999


Q ss_pred             HhhhCCCCCC
Q 032101          123 YNKKQVTNLV  132 (147)
Q Consensus       123 Y~~~~~~~~~  132 (147)
                      |+.+..+...
T Consensus       138 y~~k~~~~~~  147 (211)
T COG5648         138 YNKKLPNKAP  147 (211)
T ss_pred             hhcccCCCCC
Confidence            9999887753


No 7  
>smart00398 HMG high mobility group.
Probab=99.81  E-value=1.8e-19  Score=116.81  Aligned_cols=70  Identities=47%  Similarity=0.793  Sum_probs=67.7

Q ss_pred             CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      +|++|+|+|++|++++|..+..+||+++ +.+|++.||++|+.||+++|.+|.+.|..++.+|..++..|.
T Consensus         1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~~-~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~   70 (70)
T smart00398        1 KPKRPMSAFMLFSQENRAKIKAENPDLS-NAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK   70 (70)
T ss_pred             CcCCCCcHHHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            5899999999999999999999999997 899999999999999999999999999999999999999984


No 8  
>PF09011 HMG_box_2:  HMG-box domain;  InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.80  E-value=2.3e-19  Score=119.15  Aligned_cols=72  Identities=43%  Similarity=0.785  Sum_probs=63.3

Q ss_pred             CCCCCCCCCHHHHHHHHHHHHHHHh-CCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           52 PNKPKRPPSAFFVFLEEFRKTFKKE-NPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        52 p~~PKRP~sAy~lF~~e~r~~ik~e-~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      |+.||+|+|||+||+.+++..++.+ ++... +.++++.|++.|++||++||.+|.++|..++.+|..+|..|+
T Consensus         1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~-~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~   73 (73)
T PF09011_consen    1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQS-FREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN   73 (73)
T ss_dssp             SSS--SSSSHHHHHHHHHHHHHHHHT-T-SS-HHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred             CcCCCCCCCHHHHHHHHHHHHHHHhcccCCC-HHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence            6899999999999999999999988 66665 899999999999999999999999999999999999999995


No 9  
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.77  E-value=3.6e-18  Score=118.24  Aligned_cols=76  Identities=49%  Similarity=0.831  Sum_probs=72.3

Q ss_pred             CC--CCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHH-HHhhhC
Q 032101           51 DP--NKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMN-AYNKKQ  127 (147)
Q Consensus        51 dp--~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~-~Y~~~~  127 (147)
                      ||  +.|+||+|||++|+.+.|..++.+||+++ +.+|++++|++|++|++++|.+|...+..++++|..+|. .|+...
T Consensus        17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~~-~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~   95 (96)
T KOG0381|consen   17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGLS-VGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL   95 (96)
T ss_pred             CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCC-HHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence            66  59999999999999999999999999987 899999999999999999999999999999999999999 998754


No 10 
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.75  E-value=9.6e-18  Score=107.52  Aligned_cols=65  Identities=51%  Similarity=0.807  Sum_probs=62.7

Q ss_pred             CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHH
Q 032101           55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKM  120 (147)
Q Consensus        55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~  120 (147)
                      |+||+|||++|+++.|..+..+||+++ +.+|++.||++|+.|++++|.+|.+.|..++.+|..++
T Consensus         1 pkrp~~af~~f~~~~~~~~~~~~~~~~-~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~   65 (66)
T cd00084           1 PKRPLSAYFLFSQEHRAEVKAENPGLS-VGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM   65 (66)
T ss_pred             CCCCCcHHHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence            799999999999999999999999997 79999999999999999999999999999999999875


No 11 
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.71  E-value=1.3e-17  Score=139.03  Aligned_cols=79  Identities=32%  Similarity=0.590  Sum_probs=74.3

Q ss_pred             cCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhC
Q 032101           48 AKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQ  127 (147)
Q Consensus        48 ~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~  127 (147)
                      .+......||||||||+|.+..|.+|.++||.+.| .||+++||.+|+.|+++||.+|+++|++++..|.++..+|+.+-
T Consensus        56 ~k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHN-SEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYKYRP  134 (331)
T KOG0527|consen   56 DKTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHN-SEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYKYRP  134 (331)
T ss_pred             CCCCccccCCCcchhhhhhHHHHHHHHHhCcchhh-HHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCccccc
Confidence            44556789999999999999999999999999997 89999999999999999999999999999999999999998763


No 12 
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.70  E-value=3e-17  Score=142.08  Aligned_cols=78  Identities=41%  Similarity=0.708  Sum_probs=73.6

Q ss_pred             cccccCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHH
Q 032101           44 REKKAKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAY  123 (147)
Q Consensus        44 k~kk~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y  123 (147)
                      ++.++.+|||+|||++||||+|++..|..|+.+  +++ +++|++.+|++|+.||.  |.+|++.|+.++++|+.+|.+|
T Consensus       525 k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi~-~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~~y  599 (615)
T KOG0526|consen  525 KKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GIS-VGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMKEY  599 (615)
T ss_pred             cCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--Cch-HHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHHhh
Confidence            667788999999999999999999999999987  887 99999999999999999  9999999999999999999999


Q ss_pred             hhh
Q 032101          124 NKK  126 (147)
Q Consensus       124 ~~~  126 (147)
                      +.-
T Consensus       600 k~g  602 (615)
T KOG0526|consen  600 KNG  602 (615)
T ss_pred             cCC
Confidence            943


No 13 
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin  [Chromatin structure and dynamics]
Probab=99.35  E-value=2.2e-12  Score=106.25  Aligned_cols=78  Identities=28%  Similarity=0.558  Sum_probs=73.9

Q ss_pred             cCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhh
Q 032101           48 AKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKK  126 (147)
Q Consensus        48 ~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~  126 (147)
                      ..+.|.+|-+|+-.||.|+..++++|+..||++. +.+|+++||.||..|+++||..|...++.++.+|++.|.+|+..
T Consensus        58 ~pkpPkppekpl~pymrySrkvWd~VkA~nPe~k-LWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~s  135 (410)
T KOG4715|consen   58 RPKPPKPPEKPLMPYMRYSRKVWDQVKASNPELK-LWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHNS  135 (410)
T ss_pred             CCCCCCCCCcccchhhHHhhhhhhhhhccCcchH-HHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence            4457889999999999999999999999999998 99999999999999999999999999999999999999999764


No 14 
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.32  E-value=2.3e-12  Score=106.96  Aligned_cols=71  Identities=23%  Similarity=0.416  Sum_probs=65.0

Q ss_pred             CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101           54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK  125 (147)
Q Consensus        54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~  125 (147)
                      ..|+|+||||+|++++|..|..++. ++...+|.++||.+|++||-||.+.|+++|+++++.+......|-+
T Consensus       191 hiKKPLNAFmlyMKEmRa~vvaEct-lKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSA  261 (421)
T KOG3248|consen  191 HIKKPLNAFMLYMKEMRAKVVAECT-LKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSA  261 (421)
T ss_pred             cccccHHHHHHHHHHHHHHHHHHhh-hhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcch
Confidence            6799999999999999999999996 4446899999999999999999999999999999999988877744


No 15 
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=99.14  E-value=1.4e-11  Score=106.11  Aligned_cols=80  Identities=29%  Similarity=0.474  Sum_probs=73.0

Q ss_pred             CCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCC
Q 032101           50 KDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVT  129 (147)
Q Consensus        50 kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~  129 (147)
                      ..++..||||||||+|.++.|..|.+.+|++.| ..|+++||.+|+.||..||++|++.-.++-..|.+..+.|+.+-.+
T Consensus       321 ss~PHIKRPMNAFMVWAkDERRKILqA~PDMHN-SnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYrYkPRP  399 (511)
T KOG0528|consen  321 SSEPHIKRPMNAFMVWAKDERRKILQAFPDMHN-SNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYRYKPRP  399 (511)
T ss_pred             CCCccccCCcchhhcccchhhhhhhhcCccccc-cchhHHhcccccccccccccchHHHHHHHHHhhhccCcccccCCCC
Confidence            334577999999999999999999999999997 6999999999999999999999999999999999999999987655


Q ss_pred             C
Q 032101          130 N  130 (147)
Q Consensus       130 ~  130 (147)
                      .
T Consensus       400 K  400 (511)
T KOG0528|consen  400 K  400 (511)
T ss_pred             C
Confidence            4


No 16 
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.61  E-value=4.1e-08  Score=87.60  Aligned_cols=76  Identities=26%  Similarity=0.453  Sum_probs=69.7

Q ss_pred             ccccccCCCCCCCCCCCCHHHHHHHHHH--HHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHH
Q 032101           43 KREKKAKKDPNKPKRPPSAFFVFLEEFR--KTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKK  119 (147)
Q Consensus        43 kk~kk~~kdp~~PKRP~sAy~lF~~e~r--~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~  119 (147)
                      ..+-..+.|....+||||+|++|++.+|  ..+.+.||+..| ..|+++||+.|-.|.+.||+.|.++|.+.++.|.++
T Consensus       170 dgrspnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn~DN-rtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfka  247 (683)
T KOG2746|consen  170 DGRSPNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPNQDN-RTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFKA  247 (683)
T ss_pred             ccCCCCcCcchhhhhhhHHHHHHHhhcCCccchhccCccccc-hhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhhh
Confidence            4455566777789999999999999999  899999999997 899999999999999999999999999999999886


No 17 
>PF14887 HMG_box_5:  HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=98.25  E-value=7.4e-06  Score=55.11  Aligned_cols=74  Identities=19%  Similarity=0.274  Sum_probs=61.0

Q ss_pred             CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCC
Q 032101           54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVT  129 (147)
Q Consensus        54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~  129 (147)
                      .|..|-++--+|.+.....+...+++.. ..+ .+.+...|++|++.+|.+|+..|.++..+|+.+|.+|+.-...
T Consensus         3 lPE~PKt~qe~Wqq~vi~dYla~~~~dr-~K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~~~~   76 (85)
T PF14887_consen    3 LPETPKTAQEIWQQSVIGDYLAKFRNDR-KKA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSAPAD   76 (85)
T ss_dssp             -S----THHHHHHHHHHHHHHHHTTSTH-HHH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-CCCT
T ss_pred             CCCCCCCHHHHHHHHHHHHHHHHhhHhH-HHH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcCCCC
Confidence            5778889999999999999999999885 344 5699999999999999999999999999999999999876443


No 18 
>PF06382 DUF1074:  Protein of unknown function (DUF1074);  InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=97.20  E-value=0.0007  Score=52.20  Aligned_cols=49  Identities=29%  Similarity=0.480  Sum_probs=42.8

Q ss_pred             CCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHH
Q 032101           59 PSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKL  112 (147)
Q Consensus        59 ~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~  112 (147)
                      -+||+-|+.++|..    |.+++ ..|+....+..|..||+++|..|..++...
T Consensus        83 nnaYLNFLReFRrk----h~~L~-p~dlI~~AAraW~rLSe~eK~rYrr~~~~~  131 (183)
T PF06382_consen   83 NNAYLNFLREFRRK----HCGLS-PQDLIQRAARAWCRLSEAEKNRYRRMAPSV  131 (183)
T ss_pred             chHHHHHHHHHHHH----ccCCC-HHHHHHHHHHHHHhCCHHHHHHHHhhcchh
Confidence            47899999998875    57897 789999999999999999999999876543


No 19 
>PF04690 YABBY:  YABBY protein;  InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=97.18  E-value=0.00097  Score=51.23  Aligned_cols=49  Identities=33%  Similarity=0.542  Sum_probs=42.8

Q ss_pred             CCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCC
Q 032101           49 KKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMS   98 (147)
Q Consensus        49 ~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls   98 (147)
                      .+.|.+-.|-+|||..|+++.-.+|+..||+++ ..|.-...+..|...+
T Consensus       116 ~kPPEKRqR~psaYn~f~k~ei~rik~~~p~is-hkeaFs~aAknW~h~p  164 (170)
T PF04690_consen  116 NKPPEKRQRVPSAYNRFMKEEIQRIKAENPDIS-HKEAFSAAAKNWAHFP  164 (170)
T ss_pred             cCCccccCCCchhHHHHHHHHHHHHHhcCCCCC-HHHHHHHHHHhhhhCc
Confidence            344555567789999999999999999999998 7999999999998765


No 20 
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=97.10  E-value=0.00041  Score=54.74  Aligned_cols=69  Identities=22%  Similarity=0.312  Sum_probs=62.1

Q ss_pred             CCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHH
Q 032101           52 PNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMN  121 (147)
Q Consensus        52 p~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~  121 (147)
                      ..++..|...|+-+-..+|..+...+|... ..++++++|..|.+|++.-|..|.+.+..++..|...+.
T Consensus       141 k~~~~~~~~~~~e~~~~~r~~~~~~~~~~~-~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~  209 (211)
T COG5648         141 KLPNKAPIGPFIENEPKIRPKVEGPSPDKA-LVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP  209 (211)
T ss_pred             ccCCCCCCchhhhccHHhccccCCCCcchh-hhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence            457788888999999999999999999886 789999999999999999999999999999999987654


No 21 
>PF08073 CHDNT:  CHDNT (NUC034) domain;  InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases.; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=95.21  E-value=0.031  Score=35.32  Aligned_cols=39  Identities=21%  Similarity=0.422  Sum_probs=35.1

Q ss_pred             CHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCCh
Q 032101           60 SAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSP   99 (147)
Q Consensus        60 sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~   99 (147)
                      +.|-+|.+-.|..|...||++. ++.|..+++.+|+..++
T Consensus        14 t~yK~Fsq~vRP~l~~~NPk~~-~sKl~~l~~AKwrEF~~   52 (55)
T PF08073_consen   14 TNYKAFSQHVRPLLAKANPKAP-MSKLMMLLQAKWREFQE   52 (55)
T ss_pred             HHHHHHHHHHHHHHHHHCCCCc-HHHHHHHHHHHHHHHHh
Confidence            5688999999999999999997 89999999999987554


No 22 
>PF06244 DUF1014:  Protein of unknown function (DUF1014);  InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=93.11  E-value=0.15  Score=37.16  Aligned_cols=50  Identities=22%  Similarity=0.324  Sum_probs=41.2

Q ss_pred             CCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChh
Q 032101           50 KDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPA  100 (147)
Q Consensus        50 kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~e  100 (147)
                      -|..|-+|---||.-|....-..|+.+||++. .+++-.+|-..|..-++.
T Consensus        68 ~drHPErR~KAAy~afeE~~Lp~lK~E~PgLr-lsQ~kq~l~K~w~KSPeN  117 (122)
T PF06244_consen   68 IDRHPERRMKAAYKAFEERRLPELKEENPGLR-LSQYKQMLWKEWQKSPEN  117 (122)
T ss_pred             CCCCcchhHHHHHHHHHHHHhHHHHhhCCCch-HHHHHHHHHHHHhcCCCC
Confidence            34434455557899999999999999999998 899999999999876653


No 23 
>PF04769 MAT_Alpha1:  Mating-type protein MAT alpha 1;  InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation []. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway [].; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=90.52  E-value=0.87  Score=35.90  Aligned_cols=53  Identities=21%  Similarity=0.353  Sum_probs=38.1

Q ss_pred             CCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHH
Q 032101           50 KDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKA  109 (147)
Q Consensus        50 kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A  109 (147)
                      .....++||.|+||.|..=.-.    -.|+.. ..+++..|+..|..=+.  |..|.-.|
T Consensus        39 ~~~~~~kr~lN~Fm~FRsyy~~----~~~~~~-Qk~~S~~l~~lW~~dp~--k~~W~l~a   91 (201)
T PF04769_consen   39 RSPEKAKRPLNGFMAFRSYYSP----IFPPLP-QKELSGILTKLWEKDPF--KNKWSLMA   91 (201)
T ss_pred             ccccccccchhHHHHHHHHHHh----hcCCcC-HHHHHHHHHHHHhCCcc--HhHHHHHh
Confidence            3455789999999999766553    346776 68999999999986332  45555444


No 24 
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per se.
Probab=89.76  E-value=0.91  Score=35.43  Aligned_cols=44  Identities=18%  Similarity=0.511  Sum_probs=38.7

Q ss_pred             HHHHHH-HHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhhh
Q 032101           83 VSAVGK-AAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNKK  126 (147)
Q Consensus        83 ~~eisk-~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~~  126 (147)
                      |..|++ .||..|+.+|+++++.|.+.... ....|-..+..|...
T Consensus        66 f~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~~  111 (198)
T TIGR03481        66 LPAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAGE  111 (198)
T ss_pred             HHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcCc
Confidence            778876 68999999999999999998888 778899999999763


No 25 
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=89.33  E-value=1.1  Score=35.36  Aligned_cols=48  Identities=29%  Similarity=0.562  Sum_probs=40.3

Q ss_pred             CCCCcHHHHHH-HHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhhh
Q 032101           78 PNVTAVSAVGK-AAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNKK  126 (147)
Q Consensus        78 P~~~~~~eisk-~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~~  126 (147)
                      |... |..+++ .||..|+++|++++..|.+.... ....|-..+..|...
T Consensus        66 p~~D-f~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~q  115 (211)
T PRK15117         66 PYVQ-VKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHGQ  115 (211)
T ss_pred             ccCC-HHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCCc
Confidence            5564 778875 68999999999999999987777 667899999999764


No 26 
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=79.56  E-value=3.9  Score=32.24  Aligned_cols=53  Identities=30%  Similarity=0.483  Sum_probs=43.3

Q ss_pred             CCC-CCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHH
Q 032101           53 NKP-KRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKA  109 (147)
Q Consensus        53 ~~P-KRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A  109 (147)
                      ..| +|=.-||.-|-....+.|+.+||++. ++++-.+|-.+|..-++.   ||.+.+
T Consensus       162 rHPEkRmrAA~~afEe~~LPrLK~e~P~lr-lsQ~Kqll~Kew~KsPDN---P~Nq~~  215 (221)
T KOG3223|consen  162 RHPEKRMRAAFKAFEEARLPRLKKENPGLR-LSQYKQLLKKEWQKSPDN---PFNQAA  215 (221)
T ss_pred             cChHHHHHHHHHHHHHhhchhhhhcCCCcc-HHHHHHHHHHHHhhCCCC---hhhHHh
Confidence            445 45556799999999999999999998 899999999999887775   565544


No 27 
>PF05494 Tol_Tol_Ttg2:  Toluene tolerance, Ttg2 ;  InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters.; PDB: 2QGU_A.
Probab=78.27  E-value=2.7  Score=31.47  Aligned_cols=47  Identities=21%  Similarity=0.548  Sum_probs=36.1

Q ss_pred             CCCCcHHHHHH-HHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhh
Q 032101           78 PNVTAVSAVGK-AAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNK  125 (147)
Q Consensus        78 P~~~~~~eisk-~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~  125 (147)
                      |... |..|++ .||..|+.||++++..|.+.... ....|-..+..|..
T Consensus        36 ~~~D-~~~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~   84 (170)
T PF05494_consen   36 PYFD-FERMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG   84 (170)
T ss_dssp             GGB--HHHHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred             HhCC-HHHHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence            4554 677775 47889999999999999987776 66788999999975


No 28 
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=69.26  E-value=7.3  Score=30.83  Aligned_cols=42  Identities=19%  Similarity=0.421  Sum_probs=35.9

Q ss_pred             HHHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhhhCCC
Q 032101           88 KAAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNKKQVT  129 (147)
Q Consensus        88 k~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~~~~~  129 (147)
                      ..||.-|+++|+++++.|...... ..+.|-..+..|+.+...
T Consensus        78 ~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q~~~  120 (202)
T COG2854          78 LVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQTLK  120 (202)
T ss_pred             HHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCCCce
Confidence            458999999999999999987776 678899999999887543


No 29 
>PF13875 DUF4202:  Domain of unknown function (DUF4202)
Probab=67.97  E-value=13  Score=29.12  Aligned_cols=40  Identities=23%  Similarity=0.379  Sum_probs=33.9

Q ss_pred             CHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhh
Q 032101           60 SAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKA  103 (147)
Q Consensus        60 sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~  103 (147)
                      -+-++|+..+...+...|.    -..+..+|...|..||+.-++
T Consensus       130 vacLVFL~~~f~~F~~~~d----eeK~v~Il~KTw~KMS~~g~~  169 (185)
T PF13875_consen  130 VACLVFLEYYFEDFAAKHD----EEKIVDILRKTWRKMSERGHE  169 (185)
T ss_pred             hHHHHhHHHHHHHHHhcCC----HHHHHHHHHHHHHHCCHHHHH
Confidence            3578999999999998882    357888999999999998765


No 30 
>PF12881 NUT_N:  NUT protein N terminus;  InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=67.86  E-value=16  Score=30.86  Aligned_cols=53  Identities=21%  Similarity=0.281  Sum_probs=41.9

Q ss_pred             CCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHH
Q 032101           59 PSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKL  112 (147)
Q Consensus        59 ~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~  112 (147)
                      ..||..|+.-+...+....|.++ +.|-....-+.|.-.|.-+|..|+++|++=
T Consensus       229 ~EAlSCFLIpvLrsLar~kPtMt-lEeGl~ra~qEW~~~SnfdRmifyemaekF  281 (328)
T PF12881_consen  229 AEALSCFLIPVLRSLARLKPTMT-LEEGLWRAVQEWQHTSNFDRMIFYEMAEKF  281 (328)
T ss_pred             hhhhhhhHHHHHHHHHhcCCCcc-HHHHHHHHHHHhhccccccHHHHHHHHHHH
Confidence            35566666655555666678887 888888888999999999999999999774


No 31 
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=54.73  E-value=52  Score=25.27  Aligned_cols=40  Identities=10%  Similarity=0.280  Sum_probs=31.3

Q ss_pred             HHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           84 SAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        84 ~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      .++.++-.+++.-|+||.|..|.+..+....++.. +..+.
T Consensus       111 Vem~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~~-~~~~q  150 (166)
T PRK10363        111 VEMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLRD-VTQWQ  150 (166)
T ss_pred             HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHH-HHhcC
Confidence            45667778999999999999999888887777754 54443


No 32 
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=49.25  E-value=48  Score=23.67  Aligned_cols=44  Identities=11%  Similarity=0.071  Sum_probs=38.3

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCC
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQV  128 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~  128 (147)
                      +-...|-..|+.|+++++.............|...+++|-....
T Consensus        87 ~~~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~~~  130 (135)
T PRK09706         87 EDQKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKARK  130 (135)
T ss_pred             HHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            34567889999999999999999999999999999999977643


No 33 
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=47.94  E-value=44  Score=25.39  Aligned_cols=34  Identities=12%  Similarity=0.262  Sum_probs=25.9

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHH
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGK  118 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k  118 (147)
                      +..+...+++..|++++|..|.+..++...+...
T Consensus       118 ~~~~~~~qmy~lLTPEQra~l~~~~e~r~~~~~~  151 (162)
T PRK12751        118 EMAKVRNQMYNLLTPEQKEALNKKHQERIEKLQQ  151 (162)
T ss_pred             HHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHh
Confidence            3445567888999999999999888776665543


No 34 
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=46.80  E-value=78  Score=24.10  Aligned_cols=35  Identities=17%  Similarity=0.231  Sum_probs=28.7

Q ss_pred             HHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHH
Q 032101           86 VGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKM  120 (147)
Q Consensus        86 isk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~  120 (147)
                      +.+..-+++..|++++|..|.+.-.+..+.|...+
T Consensus       126 ~~~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~  160 (170)
T PRK12750        126 MLEKRHQMLSILTPEQKAKFQELQQERMQECQDKM  160 (170)
T ss_pred             HHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence            34456678999999999999999888888887766


No 35 
>PF11304 DUF3106:  Protein of unknown function (DUF3106);  InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins; however, this cannot be confirmed. Currently no function is known.
Probab=44.51  E-value=98  Score=21.69  Aligned_cols=9  Identities=33%  Similarity=1.180  Sum_probs=3.5

Q ss_pred             hcCCChhhh
Q 032101           94 WKSMSPAEK  102 (147)
Q Consensus        94 Wk~Ls~eeK  102 (147)
                      |.+||++++
T Consensus        56 W~~LspeqR   64 (107)
T PF11304_consen   56 WAALSPEQR   64 (107)
T ss_pred             HHhCCHHHH
Confidence            333333333


No 36 
>PF00887 ACBP:  Acyl CoA binding protein;  InterPro: IPR000582 Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters []. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor []. ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species []. Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats (IPR002110 from INTERPRO) []. The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein []. Other proteins containing an ACB domain include:   Endozepine-like peptide (ELP) (gene DBIL5) from mouse []. ELP is a testis-specific ACBP homologue that may be involved in the energy metabolism of the mature sperm. MA-DBI, a transmembrane protein of unknown function which has been found in mammals. MA-DBI contains a N-terminal ACB domain. 
DRS-1 [], a human protein of unknown function that contains a N-terminal ACB domain and a C-terminal enoyl-CoA isomerase/hydratase domain.  ; GO: 0000062 fatty-acyl-CoA binding; PDB: 2CB8_A 2FJ9_A 2LBB_A 1ST7_A 3EPY_B 2FDQ_C 1NTI_A 1HB8_A 1ACA_A 1NVL_A ....
Probab=39.45  E-value=58  Score=21.58  Aligned_cols=53  Identities=17%  Similarity=0.340  Sum_probs=32.6

Q ss_pred             HHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCC----hhhhhhHHHHHHHHHHHH
Q 032101           62 FFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMS----PAEKAPYESKAEKLKSEY  116 (147)
Q Consensus        62 y~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls----~eeK~~Y~~~A~~~k~~y  116 (147)
                      |-+|.+.....+....|+..+  -+.+.--+.|+.|.    ++-+..|.+...+....|
T Consensus        30 YalyKQAt~Gd~~~~~P~~~d--~~~~~K~~AW~~l~gms~~eA~~~Yi~~v~~~~~~~   86 (87)
T PF00887_consen   30 YALYKQATHGDCDTPRPGFFD--IEGRAKWDAWKALKGMSKEEAMREYIELVEELIPKY   86 (87)
T ss_dssp             HHHHHHHHTSS--S-CTTTTC--HHHHHHHHHHHTTTTTHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHhCCCcCCCCcchh--HHHHHHHHHHHHccCCCHHHHHHHHHHHHHHHHHhc
Confidence            667777666655566677643  44555567798776    555667777777666555


No 37 
>PF06945 DUF1289:  Protein of unknown function (DUF1289);  InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=38.47  E-value=43  Score=20.34  Aligned_cols=24  Identities=25%  Similarity=0.488  Sum_probs=17.7

Q ss_pred             HHHHHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101           83 VSAVGKAAGGKWKSMSPAEKAPYESKAEK  111 (147)
Q Consensus        83 ~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~  111 (147)
                      ..||..     |..|++++|.........
T Consensus        24 ~dEI~~-----W~~~s~~er~~i~~~l~~   47 (51)
T PF06945_consen   24 LDEIRD-----WKSMSDDERRAILARLRA   47 (51)
T ss_pred             HHHHHH-----HhhCCHHHHHHHHHHHHH
Confidence            456665     999999998877665543


No 38 
>KOG1610 consensus Corticosteroid 11-beta-dehydrogenase and related short chain-type dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism; General function prediction only]
Probab=36.11  E-value=1.1e+02  Score=26.01  Aligned_cols=57  Identities=18%  Similarity=0.367  Sum_probs=39.9

Q ss_pred             HHHHHHHHHHHh-------CCC-----CCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           65 FLEEFRKTFKKE-------NPN-----VTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        65 F~~e~r~~ik~e-------~P~-----~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      |+...|.++..=       -|+     +.+...+.+.+.++|..|+++.|+.|-+.+..+   |+..+..|.
T Consensus       188 f~D~lR~EL~~fGV~VsiiePG~f~T~l~~~~~~~~~~~~~w~~l~~e~k~~YGedy~~~---~~~~~~~~~  256 (322)
T KOG1610|consen  188 FSDSLRRELRPFGVKVSIIEPGFFKTNLANPEKLEKRMKEIWERLPQETKDEYGEDYFED---YKKSLEKYL  256 (322)
T ss_pred             HHHHHHHHHHhcCcEEEEeccCccccccCChHHHHHHHHHHHhcCCHHHHHHHHHHHHHH---HHHHHHhhh
Confidence            777777776522       122     323478889999999999999999998877554   455555554


No 39 
>PF01352 KRAB:  KRAB box;  InterPro: IPR001909 The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that is found in the N-terminal part of about one third of eukaryotic Krueppel-type C2H2 zinc finger proteins (ZFPs) []. It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box []. The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain [, ]. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin [, ]. KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome []. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.; GO: 0003676 nucleic acid binding, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 1V65_A.
Probab=35.53  E-value=29  Score=20.23  Aligned_cols=28  Identities=21%  Similarity=0.369  Sum_probs=15.9

Q ss_pred             HHHHHHHHH-HHhcCCChhhhhhHHHHHH
Q 032101           83 VSAVGKAAG-GKWKSMSPAEKAPYESKAE  110 (147)
Q Consensus        83 ~~eisk~lg-e~Wk~Ls~eeK~~Y~~~A~  110 (147)
                      |.+|+--++ +.|..|.+.+|..|.+.-.
T Consensus         3 f~Dvav~fs~eEW~~L~~~Qk~ly~dvm~   31 (41)
T PF01352_consen    3 FEDVAVYFSQEEWELLDPAQKNLYRDVML   31 (41)
T ss_dssp             ----TT---HHHHHTS-HHHHHHHHHHHH
T ss_pred             EEEEEEEcChhhcccccceecccchhHHH
Confidence            445554444 5699999999999987653


No 40 
>PF12650 DUF3784:  Domain of unknown function (DUF3784);  InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=34.85  E-value=25  Score=23.84  Aligned_cols=15  Identities=40%  Similarity=0.718  Sum_probs=13.2

Q ss_pred             hcCCChhhhhhHHHH
Q 032101           94 WKSMSPAEKAPYESK  108 (147)
Q Consensus        94 Wk~Ls~eeK~~Y~~~  108 (147)
                      |+.||+|||+.|...
T Consensus        26 yntms~eEk~~~D~~   40 (97)
T PF12650_consen   26 YNTMSKEEKEKYDKK   40 (97)
T ss_pred             cccCCHHHHHHhhHH
Confidence            899999999999754


No 41 
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptors for TRAP family transporters. Others are the SnoM family (see pfam03480) and the TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=31.82  E-value=99  Score=24.31  Aligned_cols=28  Identities=25%  Similarity=0.310  Sum_probs=21.6

Q ss_pred             HHHhcCCChhhhhhHHHHHHHHHHHHHH
Q 032101           91 GGKWKSMSPAEKAPYESKAEKLKSEYGK  118 (147)
Q Consensus        91 ge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k  118 (147)
                      .+.|..||++.|....+.+...-..+..
T Consensus       213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~~  240 (257)
T TIGR00787       213 KAFWKSLPPDLQAVVKEAAKEAGEYQRK  240 (257)
T ss_pred             HHHHhcCCHHHHHHHHHHHHHHHHHHHH
Confidence            5779999999999998877765444443


No 42 
>cd07081 ALDH_F20_ACDH_EutE-like Coenzyme A acylating aldehyde dehydrogenase (ACDH), Ethanolamine utilization protein EutE, and related proteins. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH, and may be critical enzymes in the fermentative pathway.
Probab=31.05  E-value=1.2e+02  Score=26.39  Aligned_cols=40  Identities=13%  Similarity=-0.030  Sum_probs=32.2

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      +.++..-..|+.++.++|..+...+....+++..++....
T Consensus         6 ~~A~~A~~~W~~~~~~~R~~iL~~~a~~l~~~~~ela~~~   45 (439)
T cd07081           6 AAAKVAQQGLSCKSQEMVDLIFRAAAEAAEDARIDLAKLA   45 (439)
T ss_pred             HHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4455566789999999999999988888888887777663


No 43 
>COG4281 ACB Acyl-CoA-binding protein [Lipid metabolism]
Probab=31.03  E-value=52  Score=22.28  Aligned_cols=61  Identities=20%  Similarity=0.390  Sum_probs=38.1

Q ss_pred             CCCCCCCH-----HHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCC----hhhhhhHHHHHHHHHHHH
Q 032101           54 KPKRPPSA-----FFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMS----PAEKAPYESKAEKLKSEY  116 (147)
Q Consensus        54 ~PKRP~sA-----y~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls----~eeK~~Y~~~A~~~k~~y  116 (147)
                      .+.+|.|-     |.||-+..-.....+-|++.  .-+.+.--+.|.+|-    ++-++.|.....+.+..|
T Consensus        16 L~~kP~~d~LLkLYAL~KQ~s~GD~~~ekPG~~--d~~gr~K~eAW~~LKGksqedA~qeYialVeeLkak~   85 (87)
T COG4281          16 LSEKPSNDELLKLYALFKQGSVGDNDGEKPGFF--DIVGRYKYEAWAGLKGKSQEDARQEYIALVEELKAKY   85 (87)
T ss_pred             hccCCCcHHHHHHHHHHHhccccccCCCCCCcc--ccccchhHHHHhhccCccHHHHHHHHHHHHHHHHhhc
Confidence            35666665     66666655544555668774  345566668897765    555667777777666544


No 44 
>PF05388 Carbpep_Y_N:  Carboxypeptidase Y pro-peptide;  InterPro: IPR008442 This signature is found at the N terminus of carboxypeptidase Y, which belongs to MEROPS peptidase family S10. This region contains the signal peptide and pro-peptide regions.; GO: 0004185 serine-type carboxypeptidase activity, 0005773 vacuole
Probab=30.41  E-value=75  Score=22.70  Aligned_cols=29  Identities=17%  Similarity=0.193  Sum_probs=25.3

Q ss_pred             HHHHHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101           83 VSAVGKAAGGKWKSMSPAEKAPYESKAEK  111 (147)
Q Consensus        83 ~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~  111 (147)
                      +..+++.+++.++.|+.+.|+.|.+....
T Consensus        45 ~~~~~~~l~e~l~~Lt~e~k~~W~E~~~~   73 (113)
T PF05388_consen   45 LEKISKYLNEPLKSLTSEAKALWDEMMLL   73 (113)
T ss_pred             HHHHHHHHHHHHhhccHHHHHHHHHHHHH
Confidence            56677889999999999999999998754


No 45 
>PRK10236 hypothetical protein; Provisional
Probab=30.38  E-value=49  Score=26.83  Aligned_cols=26  Identities=15%  Similarity=0.351  Sum_probs=21.2

Q ss_pred             HHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101           86 VGKAAGGKWKSMSPAEKAPYESKAEK  111 (147)
Q Consensus        86 isk~lge~Wk~Ls~eeK~~Y~~~A~~  111 (147)
                      +.+.+.+.|..||++|++.+.+.-..
T Consensus       118 l~kll~~a~~kms~eE~~~L~~~l~~  143 (237)
T PRK10236        118 LEQFLRNTWKKMDEEHKQEFLHAVDA  143 (237)
T ss_pred             HHHHHHHHHHHCCHHHHHHHHHHHhh
Confidence            46889999999999999887765443


No 46 
>KOG1827 consensus Chromatin remodeling complex RSC, subunit RSC1/Polybromo and related proteins [Chromatin structure and dynamics; Transcription]
Probab=29.81  E-value=3.9  Score=37.44  Aligned_cols=44  Identities=25%  Similarity=0.404  Sum_probs=39.6

Q ss_pred             CCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhh
Q 032101           58 PPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEK  102 (147)
Q Consensus        58 P~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK  102 (147)
                      -.++|++|..+.+..+-..+|++. +++++.++|.-|..|+...+
T Consensus       552 ~~~~~~~~s~~~~~~~~~~np~v~-~~~~~~~vg~~~~~lp~~~k  595 (629)
T KOG1827|consen  552 SPEPYILDSIENRTIIWFENPTVG-FGEVSIIVGNDWDKLPNINK  595 (629)
T ss_pred             CCccccccccccCceeeeeCCCcc-cceeEEeecCCcccCccccc
Confidence            558899999999999999999997 89999999999999994444


No 47 
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=29.33  E-value=1.1e+02  Score=25.68  Aligned_cols=35  Identities=14%  Similarity=0.232  Sum_probs=27.1

Q ss_pred             HHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101           91 GGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK  125 (147)
Q Consensus        91 ge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~  125 (147)
                      ...|..||++.+....+.+.+......+...+++.
T Consensus       244 ~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e~  278 (332)
T COG1638         244 KAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELED  278 (332)
T ss_pred             HHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            57899999999999999888866666555555544


No 48 
>PF06394 Pepsin-I3:  Pepsin inhibitor-3-like repeated domain;  InterPro: IPR010480 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.  The members of this group of proteins belong to MEROPS inhibitor family I33, clan IR; the nematode aspartyl protease inhibitors or Aspins. They are restricted to parasitic nematode species. Structural features common to the nematode Aspins include the presence of a signal peptide sequence and the conservation of all four cysteine residues in the mature protein. The Y[V.A]RDLT sequence motif has been suggested as being of crucial functional importance in several filarial nematode inhibitors [], this sequence is not conserved in Tco-API-1 from Trichostrongylus colubriformis (Black scour worm) and it has been demonstrated that Tco-API-1, is not an Aspin as it does not inhibit porcine pepsin []. Related inhibitors from Onchocerca volvulus, Ov33 [] and Ascaris suum (Pig roundworm), PI-3 [] inhibit the in vitro activity of aspartyl proteases such as pepsin and cathepsin E (MEROPS peptidase family A1).  Aspin may facilitate the safe passage of the eggs of Ascaris through the host stomach without digestion by pepsin [, ]. 
The other parasitic nematodes known to express homologous proteins do not pass through the stomach of their hosts. Several proteins in the family are potent allergens in mammals. The three-dimensional structures of pepsin inhibitor-3 (PI-3) from A. suum and of the complex between PI-3 and porcine pepsin at 1.75 A and 2.45 A resolution, respectively, have revealed the mechanism of aspartic protease inhibition. PI-3 has a new fold consisting of two identical domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the 'active site flap' (residues 70-82) of pepsin, thus forming an eight-stranded beta-sheet that spans the two proteins. PI-3 has a novel mode of inhibition, using its N-terminal residues to occupy and therefore block the first three binding pockets in pepsin for substrate residues C-terminal to the scissile bond (S1'-S3').; PDB: 1F32_A 1F34_B.
Probab=28.86  E-value=59  Score=21.74  Aligned_cols=31  Identities=23%  Similarity=0.464  Sum_probs=18.4

Q ss_pred             cCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCCCCCC
Q 032101           95 KSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVTNLVP  133 (147)
Q Consensus        95 k~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~~~~~  133 (147)
                      +.|+++|+        ..+..|.+++..|...+...-..
T Consensus        38 R~Lt~~E~--------~eL~~y~~~v~~y~~~l~~~iq~   68 (76)
T PF06394_consen   38 RDLTPDEQ--------QELKTYQKKVAAYKEQLQQQIQE   68 (76)
T ss_dssp             EE--HHHH--------HHHHHHHHHHHHHHHHHTT----
T ss_pred             ccCCHHHH--------HHHHHHHHHHHHHHHHHHHHHHH
Confidence            45666665        45678888888888876655443


No 49 
>cd07133 ALDH_CALDH_CalB Coniferyl aldehyde dehydrogenase-like. Coniferyl aldehyde dehydrogenase (CALDH, EC=1.2.1.68) of Pseudomonas sp. strain HR199 (CalB) which catalyzes the NAD+-dependent oxidation of coniferyl aldehyde to ferulic acid, and similar sequences, are present in this CD.
Probab=26.37  E-value=1.8e+02  Score=25.07  Aligned_cols=42  Identities=14%  Similarity=-0.096  Sum_probs=32.0

Q ss_pred             HHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101           84 SAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK  125 (147)
Q Consensus        84 ~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~  125 (147)
                      .+.++..-..|+.++..+|..+........+++..++.....
T Consensus         4 ~~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~~   45 (434)
T cd07133           4 LERQKAAFLANPPPSLEERRDRLDRLKALLLDNQDALAEAIS   45 (434)
T ss_pred             HHHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            345556666799999999998888888888888877776544


No 50 
>PRK10455 periplasmic protein; Reviewed
Probab=26.33  E-value=1.4e+02  Score=22.59  Aligned_cols=28  Identities=18%  Similarity=0.311  Sum_probs=21.4

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHH
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKL  112 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~  112 (147)
                      +..+.-.++|..|++++|..|.+..++.
T Consensus       118 ~~~~~~~qiy~vLTPEQr~q~~~~~ekr  145 (161)
T PRK10455        118 AHMETQNKIYNVLTPEQKKQFNANFEKR  145 (161)
T ss_pred             HHHHHHHHHHHhCCHHHHHHHHHHHHHH
Confidence            4556667789999999999998765443


No 51 
>PHA02662 ORF131 putative membrane protein; Provisional
Probab=25.37  E-value=1.6e+02  Score=23.68  Aligned_cols=45  Identities=16%  Similarity=0.262  Sum_probs=33.9

Q ss_pred             CHHHHHHHHHHHHHHHh--------------------------------CCCCCcHHHHHHHHHHHhcCCChhhhhhH
Q 032101           60 SAFFVFLEEFRKTFKKE--------------------------------NPNVTAVSAVGKAAGGKWKSMSPAEKAPY  105 (147)
Q Consensus        60 sAy~lF~~e~r~~ik~e--------------------------------~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y  105 (147)
                      +=|-+|+..+-..+-.-                                +...+ |.-+.+.+.|....|++++|..-
T Consensus        22 tLY~lf~~ryL~kLs~~s~~a~a~C~IhIG~I~g~~k~C~v~V~N~C~sna~~s-f~lll~Al~Et~~~Lp~~qK~~i   98 (226)
T PHA02662         22 SLYDVFLARFLRRLAARAAPASAACAVRVGAVRGRLRNCELVVLNRCHTDAADA-LALASAALAETLAELPRADRLAV   98 (226)
T ss_pred             hHHHHHHHHHHHHHHhccCccccccceEEeeEeeecCCceEEEEecccCCHHHH-HHHHHHHHHHHHHhCCHHHHHHH
Confidence            34999999887776421                                22332 88889999999999999998653


No 52 
>PF12290 DUF3802:  Protein of unknown function (DUF3802);  InterPro: IPR020979  This family of proteins is found in bacteria; members are typically between 114 and 143 amino acids in length. There is a conserved KNLFD sequence motif. The annotation with this family suggests that it may be the B subunit of bacterial type IIA DNA topoisomerase, but there is no evidence to support this annotation.
Probab=25.25  E-value=2.6e+02  Score=20.10  Aligned_cols=40  Identities=13%  Similarity=0.260  Sum_probs=29.7

Q ss_pred             HHHHhCCCCCc-------------HHHHHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101           72 TFKKENPNVTA-------------VSAVGKAAGGKWKSMSPAEKAPYESKAEK  111 (147)
Q Consensus        72 ~ik~e~P~~~~-------------~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~  111 (147)
                      .+..+||+++.             +.++...|+..|...+-.+..-|.+..--
T Consensus        49 ~vc~Qnp~L~~~~R~~iirE~Daiv~DLeEVLa~V~~~~aT~eQ~~Fi~Ef~~  101 (113)
T PF12290_consen   49 AVCEQNPELEFSQRFQIIREADAIVYDLEEVLASVWNQKATNEQIAFIEEFIG  101 (113)
T ss_pred             HHHccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHcCCCCHHHHHHHHHHHH
Confidence            45677888862             55777889999999888888777765544


No 53 
>PF03480 SBP_bac_7:  Bacterial extracellular solute-binding protein, family 7;  InterPro: IPR018389 This family of proteins is involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes a C4-dicarboxylate-binding protein DctP and the sialic acid-binding protein SiaP. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an alpha-helix hinge component.; GO: 0006810 transport, 0030288 outer membrane-bounded periplasmic space; PDB: 2HZK_C 2HZL_B 2HPG_C 2XWI_A 2XWK_A 2WX9_A 2CEY_A 2WYP_A 3B50_A 2CEX_B ....
Probab=25.24  E-value=1.1e+02  Score=24.34  Aligned_cols=31  Identities=10%  Similarity=0.288  Sum_probs=22.0

Q ss_pred             HHHhcCCChhhhhhHHHHHHHHHHHHHHHHH
Q 032101           91 GGKWKSMSPAEKAPYESKAEKLKSEYGKKMN  121 (147)
Q Consensus        91 ge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~  121 (147)
                      .+.|..||++.|+...+.+.+....+.....
T Consensus       213 ~~~w~~L~~e~q~~l~~~~~~~~~~~~~~~~  243 (286)
T PF03480_consen  213 KDWWDSLPDEDQEALDDAADEAEARAREYYE  243 (286)
T ss_dssp             HHHHHHS-HHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHhcCCHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4679999999999999887776554444333


No 54 
>cd07122 ALDH_F20_ACDH Coenzyme A acylating aldehyde dehydrogenase (ACDH), ALDH family 20-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH, EC=1.2.1.10), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH and may be critical enzymes in the fermentative pathway.
Probab=23.66  E-value=1.9e+02  Score=25.16  Aligned_cols=40  Identities=5%  Similarity=0.070  Sum_probs=31.0

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      +.++..-..|..++.++|..+...+....+++..++....
T Consensus         6 ~~A~~A~~~W~~~~~~eR~~~L~~~a~~l~~~~eela~~~   45 (436)
T cd07122           6 ERARKAQREFATFSQEQVDKIVEAVAWAAADAAEELAKMA   45 (436)
T ss_pred             HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            3445555679999999999999888888888887776664


No 55 
>cd07132 ALDH_F3AB Aldehyde dehydrogenase family 3 members A1, A2, and B1 and related proteins. NAD(P)+-dependent, aldehyde dehydrogenase, family 3 members A1 and B1  (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and similar sequences are included in this CD. Human ALDH3A1 is a homodimer with a critical role in cellular defense against oxidative stress; it catalyzes the oxidation of various cellular membrane lipid-derived aldehydes. Corneal crystalline ALDH3A1 protects the cornea and underlying lens against UV-induced oxidative stress. Human ALDH3A2, a microsomal homodimer, catalyzes the oxidation of long-chain aliphatic aldehydes to fatty acids. Human ALDH3B1 is highly expressed in the kidney and liver and catalyzes the oxidation of various medium- and long-chain saturated and unsaturated aliphatic aldehydes.
Probab=23.63  E-value=2e+02  Score=24.91  Aligned_cols=40  Identities=8%  Similarity=-0.111  Sum_probs=30.7

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      +.++..-..|..++..+|..+........+++..++.+-.
T Consensus         5 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~~l~~~~   44 (443)
T cd07132           5 RRAREAFSSGKTRPLEFRIQQLEALLRMLEENEDEIVEAL   44 (443)
T ss_pred             HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHHH
Confidence            4455566779999999999999888888787777666543


No 56 
>PF15581 Imm35:  Immunity protein 35
Probab=23.17  E-value=93  Score=21.52  Aligned_cols=21  Identities=5%  Similarity=0.311  Sum_probs=17.0

Q ss_pred             HHHHHHHHHHHhcCCChhhhh
Q 032101           83 VSAVGKAAGGKWKSMSPAEKA  103 (147)
Q Consensus        83 ~~eisk~lge~Wk~Ls~eeK~  103 (147)
                      +.-+...|.+.|+.|++++=.
T Consensus        32 i~~l~~lIe~eWRGl~~~qV~   52 (93)
T PF15581_consen   32 IRNLESLIEHEWRGLPEEQVL   52 (93)
T ss_pred             HHHHHHHHHHHHcCCCHHHHH
Confidence            556788999999999987643


No 57 
>cd01145 TroA_c Periplasmic binding protein TroA_c.  These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea.  They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism.  A typical TroA protein comprises two globular subdomains connected by a single helix and binds its ligands in the cleft between these domains.
Probab=23.09  E-value=1.4e+02  Score=22.74  Aligned_cols=48  Identities=15%  Similarity=0.289  Sum_probs=38.6

Q ss_pred             cHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCC
Q 032101           82 AVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVT  129 (147)
Q Consensus        82 ~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~  129 (147)
                      +...++..|++....+.++.+..|.+.+.....+.......|......
T Consensus       116 ~~~~~a~~I~~~L~~~dP~~~~~y~~N~~~~~~~l~~l~~~~~~~l~~  163 (203)
T cd01145         116 NAPALAKALADALIELDPSEQEEYKENLRVFLAKLNKLLREWERQFEG  163 (203)
T ss_pred             HHHHHHHHHHHHHHHhCcccHHHHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence            356778888899999999999999999988877777777777766554


No 58 
>cd07087 ALDH_F3-13-14_CALDH-like ALDH subfamily: Coniferyl aldehyde dehydrogenase, ALDH families 3, 13, and 14, and other related proteins. ALDH subfamily which includes NAD(P)+-dependent, aldehyde dehydrogenase, family 3 member A1 and B1  (ALDH3A1, ALDH3B1,  EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and also plant ALDH family members ALDH3F1, ALDH3H1, and ALDH3I1, fungal ALDH14 (YMR110C) and the protozoan family 13 member (ALDH13), as well as coniferyl aldehyde dehydrogenases (CALDH, EC=1.2.1.68), and other similar  sequences, such as the Pseudomonas putida benzaldehyde dehydrogenase I that is involved in the metabolism of mandelate.
Probab=21.21  E-value=2.4e+02  Score=24.17  Aligned_cols=40  Identities=13%  Similarity=-0.078  Sum_probs=29.9

Q ss_pred             HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101           85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN  124 (147)
Q Consensus        85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~  124 (147)
                      +.++..-..|..++..+|..+...+....+++..++.+..
T Consensus         5 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~   44 (426)
T cd07087           5 ARLRETFLTGKTRSLEWRKAQLKALKRMLTENEEEIAAAL   44 (426)
T ss_pred             HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHHH
Confidence            3445556679999999998888888888877777766554


No 59 
>cd07085 ALDH_F6_MMSDH Methylmalonate semialdehyde dehydrogenase and ALDH family members 6A1 and 6B2. Methylmalonate semialdehyde dehydrogenase (MMSDH, EC=1.2.1.27) [acylating] from Bacillus subtilis is involved in valine metabolism and catalyses the NAD+- and CoA-dependent oxidation of methylmalonate semialdehyde into propionyl-CoA. Mitochondrial human MMSDH ALDH6A1 and Arabidopsis MMSDH ALDH6B2 are also present in this CD.
Probab=20.96  E-value=2.4e+02  Score=24.49  Aligned_cols=37  Identities=19%  Similarity=0.159  Sum_probs=29.0

Q ss_pred             HHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHH
Q 032101           87 GKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAY  123 (147)
Q Consensus        87 sk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y  123 (147)
                      ++.....|..++.++|..+...+....+++..++..-
T Consensus        47 A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~   83 (478)
T cd07085          47 AKAAFPAWSATPVLKRQQVMFKFRQLLEENLDELARL   83 (478)
T ss_pred             HHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4445567999999999999988888887777666553


No 60 
>PTZ00037 DnaJ_C chaperone protein; Provisional
Probab=20.66  E-value=2e+02  Score=25.08  Aligned_cols=43  Identities=19%  Similarity=0.184  Sum_probs=31.7

Q ss_pred             HHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHH
Q 032101           66 LEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESK  108 (147)
Q Consensus        66 ~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~  108 (147)
                      -+.+|...+.-||+...-.+..+.|.+.|..|++.+|...++.
T Consensus        46 KkAYrkla~k~HPDk~~~~e~F~~i~~AYevLsD~~kR~~YD~   88 (421)
T PTZ00037         46 KKAYRKLAIKHHPDKGGDPEKFKEISRAYEVLSDPEKRKIYDE   88 (421)
T ss_pred             HHHHHHHHHHHCCCCCchHHHHHHHHHHHHHhccHHHHHHHhh
Confidence            3456666778899985335777889999999998876655554


No 61 
>smart00271 DnaJ DnaJ molecular chaperone homology domain.
Probab=20.41  E-value=1.7e+02  Score=17.18  Aligned_cols=34  Identities=18%  Similarity=0.227  Sum_probs=19.2

Q ss_pred             HHHHHHHHHhCCCCCc-----HHHHHHHHHHHhcCCChh
Q 032101           67 EEFRKTFKKENPNVTA-----VSAVGKAAGGKWKSMSPA  100 (147)
Q Consensus        67 ~e~r~~ik~e~P~~~~-----~~eisk~lge~Wk~Ls~e  100 (147)
                      ..++..++.-||+...     ..+....|.+-|..|.+.
T Consensus        20 ~ay~~l~~~~HPD~~~~~~~~~~~~~~~l~~Ay~~L~~~   58 (60)
T smart00271       20 KAYRKLALKYHPDKNPGDKEEAEEKFKEINEAYEVLSDP   58 (60)
T ss_pred             HHHHHHHHHHCcCCCCCchHHHHHHHHHHHHHHHHHcCC
Confidence            3445555666787752     234555666666665543


Done!