Query         023605
Match_columns 280
No_of_seqs    189 out of 356
Neff          7.4 
Searched_HMMs 46136
Date          Fri Mar 29 05:17:25 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/023605.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/023605hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF10536 PMD:  Plant mobile dom 100.0 2.8E-33   6E-38  263.2   7.0  181   96-276     1-195 (363)
  2 PTZ00199 high mobility group p  99.2   2E-12 4.3E-17   99.1   1.1   59    2-60     30-89  (94)
  3 cd01390 HMGB-UBF_HMG-box HMGB-  99.2 4.5E-12 9.8E-17   90.0   1.8   57    2-59      8-64  (66)
  4 cd01388 SOX-TCF_HMG-box SOX-TC  99.2 7.7E-12 1.7E-16   91.0   1.6   59    2-61      9-67  (72)
  5 cd01389 MATA_HMG-box MATA_HMG-  99.1 1.3E-11 2.8E-16   90.9   2.0   59    2-61      9-67  (77)
  6 PF00505 HMG_box:  HMG (high mo  99.1 6.2E-11 1.3E-15   85.0   1.9   57    2-59      8-64  (69)
  7 smart00398 HMG high mobility g  99.0 1.1E-10 2.3E-15   83.6   2.3   58    2-60      9-66  (70)
  8 cd00084 HMG-box High Mobility   99.0 1.1E-10 2.3E-15   82.6   1.9   57    2-59      8-64  (66)
  9 PF09331 DUF1985:  Domain of un  98.9 9.8E-10 2.1E-14   90.4   5.1  123  125-247    14-142 (142)
 10 COG5648 NHP6B Chromatin-associ  98.9   3E-10 6.6E-15   97.1   0.5   58    2-60     78-135 (211)
 11 KOG0381 HMG box-containing pro  98.9 9.7E-10 2.1E-14   84.1   2.8   58    2-60     30-87  (96)
 12 PF09011 HMG_box_2:  HMG-box do  98.8 1.4E-09   3E-14   79.3   1.6   57    2-59     11-68  (73)
 13 KOG0527 HMG-box transcription   98.5 8.6E-08 1.9E-12   88.7   2.7   59    2-61     70-128 (331)
 14 KOG0526 Nucleosome-binding fac  98.1 6.9E-07 1.5E-11   85.7  -0.3   55    2-61    543-597 (615)
 15 KOG3248 Transcription factor T  97.6 3.9E-05 8.4E-10   70.1   2.3   51    2-53    199-249 (421)
 16 KOG4715 SWI/SNF-related matrix  97.4 4.6E-05 9.9E-10   69.1   0.2   59    2-61     72-130 (410)
 17 PF14887 HMG_box_5:  HMG (high   95.8  0.0053 1.1E-07   44.7   1.8   53    6-60     15-67  (85)
 18 KOG0528 HMG-box transcription   95.5  0.0091   2E-07   57.4   2.5   58    2-60    333-390 (511)
 19 PF06382 DUF1074:  Protein of u  93.3   0.058 1.3E-06   45.5   2.3   44    2-50     86-129 (183)
 20 KOG2746 HMG-box transcription   90.6   0.085 1.8E-06   52.8   0.4   56    2-58    189-246 (683)
 21 COG5648 NHP6B Chromatin-associ  90.5     0.1 2.2E-06   45.2   0.7   54    6-60    155-208 (211)
 22 PF03078 ATHILA:  ATHILA ORF-1   85.0     4.3 9.4E-05   39.6   8.1  164   73-246    67-262 (458)
 23 PF06945 DUF1289:  Protein of u  64.7     4.6  0.0001   27.0   1.6   22   33-54     29-50  (51)
 24 PF08073 CHDNT:  CHDNT (NUC034)  63.5     6.7 0.00014   26.9   2.2   34    4-38     18-51  (55)
 25 PF11304 DUF3106:  Protein of u  60.9      11 0.00024   29.3   3.3   14   31-44     35-48  (107)
 26 PF10234 Cluap1:  Clusterin-ass  47.4     8.9 0.00019   34.8   1.0   32   94-125     2-38  (267)
 27 PRK15117 ABC transporter perip  46.8      14  0.0003   32.3   2.1   30   18-48     66-96  (211)
 28 PF05494 Tol_Tol_Ttg2:  Toluene  44.8      16 0.00034   30.4   2.0   31   18-48     36-66  (170)
 29 PF12650 DUF3784:  Domain of un  44.3      13 0.00029   28.0   1.4   19   33-51     25-43  (97)
 30 cd03489 Topoisomer_IB_N_Ldtopo  44.2      24 0.00053   30.8   3.1   54    5-58     61-122 (212)
 31 PF11304 DUF3106:  Protein of u  43.5      29 0.00062   27.0   3.2   65   28-104    14-78  (107)
 32 cd00660 Topoisomer_IB_N Topois  38.1      41 0.00088   29.5   3.6   54    5-58     63-125 (215)
 33 cd03488 Topoisomer_IB_N_htopoI  37.9      42 0.00091   29.4   3.6   54    5-58     63-125 (215)
 34 TIGR03481 HpnM hopanoid biosyn  37.7      20 0.00043   30.9   1.7   22   27-48     71-92  (198)
 35 cd03490 Topoisomer_IB_N_1 Topo  34.0      47   0.001   29.1   3.3   54    5-58     61-124 (217)
 36 cd09071 FAR_C C-terminal domai  31.6      45 0.00098   24.4   2.5   21  229-250    70-90  (92)
 37 PF03457 HA:  Helicase associat  31.4      33 0.00071   23.8   1.7   16   82-97     52-67  (68)
 38 PF05823 Gp-FAR-1:  Nematode fa  31.3      13 0.00029   30.8  -0.5   89    7-103    43-139 (154)
 39 PF04994 TfoX_C:  TfoX C-termin  30.5      53  0.0011   24.1   2.7   36   12-47     41-79  (81)
 40 cd07321 Extradiol_Dioxygenase_  29.3      60  0.0013   23.6   2.8   31   83-116    34-64  (77)
 41 PRK01381 Trp operon repressor;  28.4      35 0.00076   26.2   1.4   59   37-95     33-93  (99)
 42 PF03015 Sterile:  Male sterili  26.6      63  0.0014   23.9   2.6   53  196-251    33-91  (94)
 43 PF02919 Topoisom_I_N:  Eukaryo  26.5      28 0.00061   30.5   0.7   52    5-56     64-124 (215)
 44 PF11460 DUF3007:  Protein of u  25.4      67  0.0014   24.9   2.5   37    7-48     65-101 (104)
 45 PF04189 Gcd10p:  Gcd10p family  23.5   2E+02  0.0044   26.5   5.8  111    7-129   110-232 (299)
 46 PF08373 RAP:  RAP domain;  Int  23.2      36 0.00079   22.7   0.6   17   32-48     40-57  (58)
 47 PF11943 DUF3460:  Protein of u  22.0 1.2E+02  0.0026   21.2   3.0   39    7-47      8-47  (60)
 48 cd07921 PCA_45_Doxase_A_like S  21.7      56  0.0012   25.5   1.5   23   83-105    44-66  (106)
 49 PRK10236 hypothetical protein;  21.3      54  0.0012   29.3   1.4   29   24-52    116-144 (237)
 50 PHA02662 ORF131 putative membr  21.0      89  0.0019   27.6   2.7   25   68-92     64-98  (226)
 51 PRK04156 gltX glutamyl-tRNA sy  20.9      92   0.002   31.5   3.2   62   13-74     33-108 (567)
 52 cd07922 CarBa CarBa is the A s  20.7 1.1E+02  0.0024   22.6   2.8   48   73-124    26-73  (81)
 53 cd07923 Gallate_dioxygenase_C   20.1      59  0.0013   24.8   1.3   23   83-105    36-58  (94)

No 1  
>PF10536 PMD:  Plant mobile domain;  InterPro: IPR019557  This entry represents a domain found in a variety of transposases.
Probab=99.98  E-value=2.8e-33  Score=263.21  Aligned_cols=181  Identities=22%  Similarity=0.354  Sum_probs=157.0

Q ss_pred             CChhhhccc--ccccCHHHHHHHhcccCCCcceEEECCEEEEeCccccceeeeecCCCcccccCCCc---hHHHHHHHHh
Q 023605           96 GFESLLELR--CGKLKRKLCHWLVNQFKPERNIIELHGQKLELCPKMFSKIMGVKDGGMAIKINGAS---DHIAEVRRIF  170 (280)
Q Consensus        96 GFg~LL~i~--~~~l~~~l~~wL~~~~d~~t~~~~i~g~~i~it~~dV~~VlGLP~~G~~i~~~~~~---~~~~~l~~~~  170 (280)
                      |||+|+.|.  ..++++.|+.+|+++|+++|++|++++++++||++||..++|||+.|.+|......   +.++++.+..
T Consensus         1 ~~g~~~~i~~s~~~~~~~li~al~erW~~et~tF~~~~gEmtiTL~DV~~llGLpi~G~pv~~~~~~~~~~~~~~ll~~~   80 (363)
T PF10536_consen    1 GFGILDAIMASRITIDRSLISALVERWDPETNTFHFPWGEMTITLEDVAMLLGLPIDGRPVTGPLPPDWRDLCEELLGVS   80 (363)
T ss_pred             CchhHhhhhhhcCCCCHHHHHHHHHHhCcccCeeecccccccchhhhhhhccccccccccccCccccchhhHHHHHhccc
Confidence            899999999  89999999999999999999999999999999999999999999999999875433   2333333322


Q ss_pred             C----CCCCCcchhHHHHHHhhcccc-chhHHHHHHHhhhceeecccCCC--CCccccccccccCCCccccchHHHHHHH
Q 023605          171 Q----PTVKGIRIRTLEEVIEQLDEA-NKIFKVAFTLFAIATLLCPIGSY--ISTLFLHPIMDVSSIKSLNWATFCYDWL  243 (280)
Q Consensus       171 ~----~~~~~i~l~~L~~~l~~~~~~-~d~f~r~Fll~~~~~~L~Ptt~~--vs~~yl~~l~D~~~i~~ynW~~~Vl~~L  243 (280)
                      .    ..+..+.++++++.+.+.+++ .+.+.|||+++.+|++|||+++.  |+..|++++.|++.+++||||.+||++|
T Consensus        81 ~~~~~~~~~~~~~~wl~~~~~~~~~~d~~~~~rAFll~~lg~~lfp~~~~~~v~~~~l~~~~~l~~~~~~~wg~a~La~l  160 (363)
T PF10536_consen   81 PQIKSKKGSSIRLSWLEEFFSNRPEDDEEQYHRAFLLYWLGSFLFPDKSGDYVSPRYLPLAVDLARIKRYAWGSAVLAYL  160 (363)
T ss_pred             ccccccccccchhhheeccccccccchHHHHHHHHHHHhhhceeccCCCcceeeeeEEeeeeccccccccccHHHHHHHH
Confidence            1    124556778898888544433 24899999999999999999877  9999999999999999999999999999


Q ss_pred             HHHHHHHhhcC--CcccchhHHHHHHHHhhCCCCc
Q 023605          244 VKSICRFQNQQ--AAYIGGCLHFLQVRPLLQLKLS  276 (280)
Q Consensus       244 ~~~i~k~~~~k--~~~i~GC~~lLqi~yld~l~~~  276 (280)
                      +++++++..+.  ..+++||..|||+|+|+|++++
T Consensus       161 y~~L~~~~~~~~~~~~~~g~~~llq~W~werf~~~  195 (363)
T PF10536_consen  161 YRDLCKASRKSASQSNIGGPLWLLQLWAWERFPVG  195 (363)
T ss_pred             HHHHHHHhhhcccccccccceeeeccchhheeecc
Confidence            99999988776  7899999999999999999865


No 2  
>PTZ00199 high mobility group protein; Provisional
Probab=99.23  E-value=2e-12  Score=99.08  Aligned_cols=59  Identities=22%  Similarity=0.374  Sum_probs=55.6

Q ss_pred             cccchHHHHHHHHhhCCCCh-hhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLS-VTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~-~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      ||+|+++.|+.++++||+.+ .+++|+|++|+.|++||++||++|.++|...+.+|..++
T Consensus        30 Y~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~~e~   89 (94)
T PTZ00199         30 YMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYEKEK   89 (94)
T ss_pred             HHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHH
Confidence            89999999999999999986 489999999999999999999999999999999998654


No 3  
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.21  E-value=4.5e-12  Score=90.04  Aligned_cols=57  Identities=28%  Similarity=0.464  Sum_probs=54.1

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSN   59 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~   59 (280)
                      |++|++|+|+.++++||+.+ +.++++.+|+.|++||++||++|.+++++.+.+|..+
T Consensus         8 f~~f~~~~r~~~~~~~p~~~-~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e   64 (66)
T cd01390           8 YFLFSQEQRPKLKKENPDAS-VTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKE   64 (66)
T ss_pred             HHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHh
Confidence            89999999999999999975 9999999999999999999999999999999998754


No 4  
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.17  E-value=7.7e-12  Score=91.01  Aligned_cols=59  Identities=15%  Similarity=0.218  Sum_probs=55.5

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH   61 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~   61 (280)
                      ||+|+++.|..++++||+.+ +..++|.+|+.|++||++||++|.+++.+.+.+|..+.+
T Consensus         9 f~~F~~~~r~~~~~~~p~~~-~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p   67 (72)
T cd01388           9 FMLFSKRHRRKVLQEYPLKE-NRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYP   67 (72)
T ss_pred             HHHHHHHHHHHHHHHCCCCC-HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCc
Confidence            89999999999999999985 999999999999999999999999999999999977653


No 5  
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.15  E-value=1.3e-11  Score=90.93  Aligned_cols=59  Identities=22%  Similarity=0.371  Sum_probs=56.1

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH   61 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~   61 (280)
                      ||+|++++|+.++++||+.+ +..|++++|+.|+.||++||++|.++|++.+.+|..+.+
T Consensus         9 f~lf~~~~r~~~~~~~p~~~-~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p   67 (77)
T cd01389           9 FILYRQDKHAQLKTENPGLT-NNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYP   67 (77)
T ss_pred             HHHHHHHHHHHHHHHCCCCC-HHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCC
Confidence            89999999999999999985 999999999999999999999999999999999988764


No 6  
>PF00505 HMG_box:  HMG (high mobility group) box;  InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.05  E-value=6.2e-11  Score=84.96  Aligned_cols=57  Identities=28%  Similarity=0.429  Sum_probs=53.9

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSN   59 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~   59 (280)
                      |++|++++|..++++||+.+ ..+|++.+|+.|++||++||++|.+++.+.+.+|..+
T Consensus         8 f~lf~~~~~~~~k~~~p~~~-~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~   64 (69)
T PF00505_consen    8 FMLFCKEKRAKLKEENPDLS-NKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKE   64 (69)
T ss_dssp             HHHHHHHHHHHHHHHSTTST-HHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHHhcccc-cccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHH
Confidence            89999999999999999998 9999999999999999999999999999999888643


No 7  
>smart00398 HMG high mobility group.
Probab=99.03  E-value=1.1e-10  Score=83.61  Aligned_cols=58  Identities=26%  Similarity=0.453  Sum_probs=54.5

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      |++|++++|+.++++||+.+ +.++++.+|..|++||++||++|.++++..+.+|....
T Consensus         9 y~~f~~~~r~~~~~~~~~~~-~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~   66 (70)
T smart00398        9 FMLFSQENRAKIKAENPDLS-NAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEM   66 (70)
T ss_pred             HHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHH
Confidence            89999999999999999987 89999999999999999999999999999999887643


No 8  
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.02  E-value=1.1e-10  Score=82.60  Aligned_cols=57  Identities=25%  Similarity=0.405  Sum_probs=53.6

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSN   59 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~   59 (280)
                      ||+|+.|+|+.++++||+.+ ..++.+.+|++|++||+++|++|.+++++.+.+|...
T Consensus         8 f~~f~~~~~~~~~~~~~~~~-~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~   64 (66)
T cd00084           8 YFLFSQEHRAEVKAENPGLS-VGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKE   64 (66)
T ss_pred             HHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHh
Confidence            89999999999999999976 8899999999999999999999999999999988654


No 9  
>PF09331 DUF1985:  Domain of unknown function (DUF1985);  InterPro: IPR015410 This domain is functionally uncharacterised; it is found in a set of Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins. 
Probab=98.94  E-value=9.8e-10  Score=90.41  Aligned_cols=123  Identities=14%  Similarity=0.265  Sum_probs=89.8

Q ss_pred             ceEEECCEEEEeCccccceeeeecCCCcccccCCCch---HHHHHHHHhCCCCCCcchhHHHHHHhhc--cccchhHHHH
Q 023605          125 NIIELHGQKLELCPKMFSKIMGVKDGGMAIKINGASD---HIAEVRRIFQPTVKGIRIRTLEEVIEQL--DEANKIFKVA  199 (280)
Q Consensus       125 ~~~~i~g~~i~it~~dV~~VlGLP~~G~~i~~~~~~~---~~~~l~~~~~~~~~~i~l~~L~~~l~~~--~~~~d~f~r~  199 (280)
                      .-+.++|..|.++..+.+.|+|||++..|-.......   ....+.+.+-..+..+++..+.+++.+.  .+.++.+.-|
T Consensus        14 ~W~~~~g~piRfsl~Ef~lvTGL~C~~~p~~~~~~~~~~~~~~~fw~~Lf~~~~~vtv~dv~~~L~~~~~~~~~~Rlrla   93 (142)
T PF09331_consen   14 IWFVFNGVPIRFSLREFALVTGLNCGPYPKEKKVDKKGKKEKGSFWNKLFGREEDVTVEDVIAKLKKMKKWDSEDRLRLA   93 (142)
T ss_pred             EEEEECCEeeEecHHHHHhhcCCcCCCCCcccchhhccccchhhhhhhhccccccCcHHHHHHHHhhcccCChhhHHHHH
Confidence            4678899999999999999999999988776542111   1112322222345679999999999854  2233445555


Q ss_pred             HHHhhhceeecccCCC-CCccccccccccCCCccccchHHHHHHHHHHH
Q 023605          200 FTLFAIATLLCPIGSY-ISTLFLHPIMDVSSIKSLNWATFCYDWLVKSI  247 (280)
Q Consensus       200 Fll~~~~~~L~Ptt~~-vs~~yl~~l~D~~~i~~ynW~~~Vl~~L~~~i  247 (280)
                      +++++.|.+++++.+. |+..++..+.|++...+|-||.+.++.++++|
T Consensus        94 ~L~~v~gvl~~~~~~~~i~~~~~~~v~Dl~~f~~yPWGr~sF~~~~~sI  142 (142)
T PF09331_consen   94 LLLFVDGVLIATSKTTKIPKEHLKMVDDLEKFLNYPWGRYSFDMLMKSI  142 (142)
T ss_pred             HHHhhheeeeccCCCCCCCHHHHHHHhhHHHHhcCCcHHHHHHHHHhcC
Confidence            5555555555555556 99999999999999999999999999999874


No 10 
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=98.90  E-value=3e-10  Score=97.11  Aligned_cols=58  Identities=24%  Similarity=0.303  Sum_probs=55.3

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      ||+|+++.|++++++||+++ ++++||++|++||+|+|+||.||.+.+...+.+|.+.+
T Consensus        78 yf~y~~~~R~ei~~~~p~l~-~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek  135 (211)
T COG5648          78 YFLYSAENRDEIRKENPKLT-FGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREK  135 (211)
T ss_pred             HHHHHHHHHHHHHHhCCCCC-hHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHH
Confidence            89999999999999999996 99999999999999999999999999999999987755


No 11 
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=98.88  E-value=9.7e-10  Score=84.08  Aligned_cols=58  Identities=21%  Similarity=0.409  Sum_probs=54.2

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      |++|++++|..++++||+. ++.+|+|++|+.|++|++++|.+|..++.+.+.+|..+.
T Consensus        30 ~~~f~~~~~~~~k~~~p~~-~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~   87 (96)
T KOG0381|consen   30 FFLFSSEQRSKIKAENPGL-SVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKEL   87 (96)
T ss_pred             HHHHHHHHHHHHHHhCCCC-CHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHH
Confidence            7899999999999999995 499999999999999999999999999999999997654


No 12 
>PF09011 HMG_box_2:  HMG-box domain;  InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=98.82  E-value=1.4e-09  Score=79.31  Aligned_cols=57  Identities=19%  Similarity=0.248  Sum_probs=49.4

Q ss_pred             cccchHHHHHHHHhh-CCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCC
Q 023605            2 LCFCSNEEVKRLRSE-NSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSN   59 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~-~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~   59 (280)
                      |++|+.|.++.++.+ .|.. ....+.+.+|+.|++||++||++|.++|+..+.+|+.+
T Consensus        11 y~lF~~~~~~~~k~~G~~~~-~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e   68 (73)
T PF09011_consen   11 YNLFMKEMRKEVKEEGGQKQ-SFREVMKEISERWKSLSEEEKEPYEERAKEDKERYERE   68 (73)
T ss_dssp             HHHHHHHHHHHHHHHT-T-S-SHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHhcccCC-CHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHH
Confidence            889999999999999 7744 48899999999999999999999999999999988653


No 13 
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=98.45  E-value=8.6e-08  Score=88.73  Aligned_cols=59  Identities=19%  Similarity=0.321  Sum_probs=56.2

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH   61 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~   61 (280)
                      ||.++.+.|+.+-++||+.- ++.+-|.+|+.||.|+|+||.||++.|++.+++|.++-+
T Consensus        70 FMVWSq~~RRkma~qnP~mH-NSEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehP  128 (331)
T KOG0527|consen   70 FMVWSQGQRRKLAKQNPKMH-NSEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYP  128 (331)
T ss_pred             hhhhhHHHHHHHHHhCcchh-hHHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCC
Confidence            89999999999999999996 999999999999999999999999999999999988763


No 14 
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=98.07  E-value=6.9e-07  Score=85.73  Aligned_cols=55  Identities=11%  Similarity=0.235  Sum_probs=51.4

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH   61 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~   61 (280)
                      ||+|.+..|..+|++  +.+ ++.|+|.+|++||.||.  |.+|.+||+..+.||+.+++
T Consensus       543 ~m~w~~~~r~~ik~d--gi~-~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~  597 (615)
T KOG0526|consen  543 YMLWLNASRESIKED--GIS-VGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMK  597 (615)
T ss_pred             HHHHHHhhhhhHhhc--Cch-HHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHH
Confidence            899999999999999  554 99999999999999999  99999999999999988774


No 15 
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=97.56  E-value=3.9e-05  Score=70.12  Aligned_cols=51  Identities=16%  Similarity=0.328  Sum_probs=47.2

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhc
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMG   53 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k   53 (280)
                      ||.||+|+|+.+-+|-- +|.-+++-+++|++|.+||-||.++|-+-|++.+
T Consensus       199 FmlyMKEmRa~vvaEct-lKeSAaiNqiLGrRWH~LSrEEQAKYyElArKer  249 (421)
T KOG3248|consen  199 FMLYMKEMRAKVVAECT-LKESAAINQILGRRWHALSREEQAKYYELARKER  249 (421)
T ss_pred             HHHHHHHHHHHHHHHhh-hhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHH
Confidence            79999999999999984 8889999999999999999999999998887766


No 16 
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin  [Chromatin structure and dynamics]
Probab=97.36  E-value=4.6e-05  Score=69.06  Aligned_cols=59  Identities=22%  Similarity=0.273  Sum_probs=54.6

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH   61 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~   61 (280)
                      ||-|+.-.-..+|++||+++ .=.+||.+|.-|+.|+|+||..|+..-+.-|.+|+.+++
T Consensus        72 ymrySrkvWd~VkA~nPe~k-LWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smk  130 (410)
T KOG4715|consen   72 YMRYSRKVWDQVKASNPELK-LWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMK  130 (410)
T ss_pred             hhHHhhhhhhhhhccCcchH-HHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHH
Confidence            77788888889999999999 888999999999999999999999999999999998874


No 17 
>PF14887 HMG_box_5:  HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=95.83  E-value=0.0053  Score=44.67  Aligned_cols=53  Identities=6%  Similarity=0.072  Sum_probs=45.1

Q ss_pred             hHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            6 SNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         6 ~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      .+..+..|-+.+|+.. ..+ -|+.+..|++|++.||.++.+||.+..++|+.+.
T Consensus        15 qq~vi~dYla~~~~dr-~K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el   67 (85)
T PF14887_consen   15 QQSVIGDYLAKFRNDR-KKA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYEREL   67 (85)
T ss_dssp             HHHHHHHHHHHTTSTH-HHH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHhhHhH-HHH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHH
Confidence            4567788999999876 333 5699999999999999999999999999998755


No 18 
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=95.47  E-value=0.0091  Score=57.37  Aligned_cols=58  Identities=14%  Similarity=0.299  Sum_probs=50.3

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      ||.+.+|.|.-+-.+.||-- +..+.|++|.+||+||..||+||-+.-...++.|--+-
T Consensus       333 FMVWAkDERRKILqA~PDMH-NSnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~  390 (511)
T KOG0528|consen  333 FMVWAKDERRKILQAFPDMH-NSNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKY  390 (511)
T ss_pred             hhcccchhhhhhhhcCcccc-ccchhHHhcccccccccccccchHHHHHHHHHhhhccC
Confidence            78899999988889999986 88899999999999999999999888777777665443


No 19 
>PF06382 DUF1074:  Protein of unknown function (DUF1074);  InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=93.33  E-value=0.058  Score=45.53  Aligned_cols=44  Identities=18%  Similarity=0.251  Sum_probs=35.0

Q ss_pred             cccchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhh
Q 023605            2 LCFCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDE   50 (280)
Q Consensus         2 ~~~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~   50 (280)
                      |+-|+.|||+    .|.+++ ..++...+.+.|..||++||.+|..++.
T Consensus        86 YLNFLReFRr----kh~~L~-p~dlI~~AAraW~rLSe~eK~rYrr~~~  129 (183)
T PF06382_consen   86 YLNFLREFRR----KHCGLS-PQDLIQRAARAWCRLSEAEKNRYRRMAP  129 (183)
T ss_pred             HHHHHHHHHH----HccCCC-HHHHHHHHHHHHHhCCHHHHHHHHhhcc
Confidence            5666666655    788887 6667777889999999999999999644


No 20 
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=90.59  E-value=0.085  Score=52.77  Aligned_cols=56  Identities=14%  Similarity=0.253  Sum_probs=50.5

Q ss_pred             cccchHHHH--HHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCC
Q 023605            2 LCFCSNEEV--KRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNS   58 (280)
Q Consensus         2 ~~~~~~~~~--~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~   58 (280)
                      +|+|.+..|  -.....||+.. +..|.|++|+.|-+|.+.||+.|.+-+.++|..|-+
T Consensus       189 f~ifskrhr~~g~vhq~~pn~D-NrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfk  246 (683)
T KOG2746|consen  189 FHIFSKRHRGEGRVHQRHPNQD-NRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFK  246 (683)
T ss_pred             HHHHHhhcCCccchhccCcccc-chhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhh
Confidence            578888888  88889999876 999999999999999999999999999999887765


No 21 
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=90.49  E-value=0.1  Score=45.25  Aligned_cols=54  Identities=17%  Similarity=0.230  Sum_probs=43.5

Q ss_pred             hHHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhhhhhhcccCCCCC
Q 023605            6 SNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS   60 (280)
Q Consensus         6 ~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k   60 (280)
                      ..+-|......+|+.. .-..+|++|+.|.+|++.-|++|.+.+++.+.+|++..
T Consensus       155 ~~~~r~~~~~~~~~~~-~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~  208 (211)
T COG5648         155 EPKIRPKVEGPSPDKA-LVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFY  208 (211)
T ss_pred             cHHhccccCCCCcchh-hhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhc
Confidence            3445555666666654 55569999999999999999999999999999998754


No 22 
>PF03078 ATHILA:  ATHILA ORF-1 family;  InterPro: IPR004312 ATHILA is a group of Arabidopsis thaliana retrotransposons belonging to the Ty3/gypsy family of the long terminal repeat (LTR) class of eukaryotic retrotransposons. The central region of ATHILA retrotransposons contains two or three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown.
Probab=85.01  E-value=4.3  Score=39.61  Aligned_cols=164  Identities=15%  Similarity=0.156  Sum_probs=94.0

Q ss_pred             cHHHHHHHHhhCCHHHHHHHHHcCChhhhcccccccCHHHHHHHhcc----c---CC--------CcceEEECCEEEEeC
Q 023605           73 VPERFCALVKSLSEEKKKAIREIGFESLLELRCGKLKRKLCHWLVNQ----F---KP--------ERNIIELHGQKLELC  137 (280)
Q Consensus        73 S~~~~~~~i~~Ls~~qk~~I~~~GFg~LL~i~~~~l~~~l~~wL~~~----~---d~--------~t~~~~i~g~~i~it  137 (280)
                      ++..+..+  .|.++-..+++.+|.+.|..++...-+...+.+|+.-    +   ++        ..-+|.|.|....+|
T Consensus        67 ~~etl~~L--Gl~~dV~~lf~~~gL~~f~~~~~~~Y~eet~qFLaTl~v~~~~~~~~~~~e~~glG~l~F~V~~~~y~ls  144 (458)
T PF03078_consen   67 DPETLQKL--GLLEDVEYLFKKCGLGTFMSYPYPTYPEETRQFLATLKVTFYNPSEPRAKELDGLGYLTFFVYGVEYSLS  144 (458)
T ss_pred             CHHHHHHh--ccHHHHHHHHHhcCchhhccCCCCCcHHHHHHhhheeeeeecccccchhhcccCcceEEEEEcceeeeee
Confidence            33444444  6778888999999999999888876665555555432    1   11        123577789999999


Q ss_pred             ccccceeeeecCCCcccccCCCchHHHHHHHHhCCCCCCcchhHHHHHHhhccccchhHHHHHHHhhhceeecccCCC--
Q 023605          138 PKMFSKIMGVKDGGMAIKINGASDHIAEVRRIFQPTVKGIRIRTLEEVIEQLDEANKIFKVAFTLFAIATLLCPIGSY--  215 (280)
Q Consensus       138 ~~dV~~VlGLP~~G~~i~~~~~~~~~~~l~~~~~~~~~~i~l~~L~~~l~~~~~~~d~f~r~Fll~~~~~~L~Ptt~~--  215 (280)
                      -.+...++|.|.++. +....+.+....+.+..|.+. .++...-...      ..-+=+.+++-=+++..|+|....  
T Consensus       145 i~~L~~i~GF~~~~~-i~~~~~~~el~~~W~~ig~~~-p~~~~~~ks~------~Ir~PviRy~hr~iA~tlf~R~~~~~  216 (458)
T PF03078_consen  145 IKHLERIFGFPSGDE-IKPDFDPEELNDFWATIGGGK-PFNSARSKSN------QIRSPVIRYFHRLIANTLFAREETGT  216 (458)
T ss_pred             HHHHHHHhCCCCccc-cCCCCCchHHHHHHHHhcCCC-cccccccccc------cccChHHHHHHHHHHhhhccccccCc
Confidence            999999999999844 333334444455655555321 1111111110      111223344444556666665533  


Q ss_pred             CCccccccc-----------ccc----CCCccccchHHHHHHHHHH
Q 023605          216 ISTLFLHPI-----------MDV----SSIKSLNWATFCYDWLVKS  246 (280)
Q Consensus       216 vs~~yl~~l-----------~D~----~~i~~ynW~~~Vl~~L~~~  246 (280)
                      |..+-|..+           .|-    .+..+.|=+...++||...
T Consensus       217 v~~~El~~l~~~L~~~Lr~~~~g~~l~~d~~dt~~~~vl~~hL~~y  262 (458)
T PF03078_consen  217 VRNDELEMLDQALKHLLRRTKDGKLLRGDLNDTNVSMVLLDHLCSY  262 (458)
T ss_pred             eechhHHHHHHHHHHHHHhcCCCccccCcccccchhHHHHHHHHhh
Confidence            655543321           111    1235666777777777654


No 23 
>PF06945 DUF1289:  Protein of unknown function (DUF1289);  InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=64.69  E-value=4.6  Score=27.04  Aligned_cols=22  Identities=9%  Similarity=0.157  Sum_probs=18.7

Q ss_pred             cccCCChhhhhhhhhhhhhhcc
Q 023605           33 TYKELPPEQKARYKKRDERMGN   54 (280)
Q Consensus        33 ~wk~l~~~~k~~~~~~~~~~k~   54 (280)
                      .|++||++||...+++.....+
T Consensus        29 ~W~~~s~~er~~i~~~l~~R~~   50 (51)
T PF06945_consen   29 DWKSMSDDERRAILARLRARRA   50 (51)
T ss_pred             HHhhCCHHHHHHHHHHHHHHhc
Confidence            7999999999999988766544


No 24 
>PF08073 CHDNT:  CHDNT (NUC034) domain;  InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases [].; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=63.47  E-value=6.7  Score=26.86  Aligned_cols=34  Identities=15%  Similarity=0.098  Sum_probs=28.8

Q ss_pred             cchHHHHHHHHhhCCCChhhhhhhhhhhhcccCCC
Q 023605            4 FCSNEEVKRLRSENSDLSVTLGLRKHIGKTYKELP   38 (280)
Q Consensus         4 ~~~~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~   38 (280)
                      .|+.-.|..+.++||++. ...+=...+.||+.-+
T Consensus        18 ~Fsq~vRP~l~~~NPk~~-~sKl~~l~~AKwrEF~   51 (55)
T PF08073_consen   18 AFSQHVRPLLAKANPKAP-MSKLMMLLQAKWREFQ   51 (55)
T ss_pred             HHHHHHHHHHHHHCCCCc-HHHHHHHHHHHHHHHH
Confidence            588899999999999997 7777788999997543


No 25 
>PF11304 DUF3106:  Protein of unknown function (DUF3106);  InterPro: IPR021455  Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known. 
Probab=60.90  E-value=11  Score=29.34  Aligned_cols=14  Identities=29%  Similarity=0.674  Sum_probs=6.3

Q ss_pred             hhcccCCChhhhhh
Q 023605           31 GKTYKELPPEQKAR   44 (280)
Q Consensus        31 g~~wk~l~~~~k~~   44 (280)
                      .++|.+||+++++.
T Consensus        35 a~r~~~mspeqq~r   48 (107)
T PF11304_consen   35 AERWPSMSPEQQQR   48 (107)
T ss_pred             HHHHhcCCHHHHHH
Confidence            34444444444444


No 26 
>PF10234 Cluap1:  Clusterin-associated protein-1;  InterPro: IPR019366 This protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell []. 
Probab=47.37  E-value=8.9  Score=34.83  Aligned_cols=32  Identities=19%  Similarity=0.643  Sum_probs=25.8

Q ss_pred             HcCChhhhcccccccC-----HHHHHHHhcccCCCcc
Q 023605           94 EIGFESLLELRCGKLK-----RKLCHWLVNQFKPERN  125 (280)
Q Consensus        94 ~~GFg~LL~i~~~~l~-----~~l~~wL~~~~d~~t~  125 (280)
                      .+||..+++|....-|     .+++.||+.+|||+..
T Consensus         2 ~LGypr~iSmenFrtPNF~LVAeiL~WLv~rydP~~~   38 (267)
T PF10234_consen    2 ALGYPRLISMENFRTPNFELVAEILRWLVKRYDPDAD   38 (267)
T ss_pred             CCCCCCCCcHHHcCCCChHHHHHHHHHHHHHcCCCCC
Confidence            4799999999775554     5788899999999873


No 27 
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=46.78  E-value=14  Score=32.26  Aligned_cols=30  Identities=20%  Similarity=0.294  Sum_probs=22.4

Q ss_pred             CCChhhhhh-hhhhhhcccCCChhhhhhhhhh
Q 023605           18 SDLSVTLGL-RKHIGKTYKELPPEQKARYKKR   48 (280)
Q Consensus        18 p~~~~~~~~-~k~~g~~wk~l~~~~k~~~~~~   48 (280)
                      |... +..+ ..+.|..|+..|+++|+.|.+-
T Consensus        66 p~~D-f~~~s~~vLG~~wr~as~eQr~~F~~~   96 (211)
T PRK15117         66 PYVQ-VKYAGALVLGRYYKDATPAQREAYFAA   96 (211)
T ss_pred             ccCC-HHHHHHHHhhhhhhhCCHHHHHHHHHH
Confidence            4444 3333 6689999999999999987654


No 28 
>PF05494 Tol_Tol_Ttg2:  Toluene tolerance, Ttg2 ;  InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation []. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters [].; PDB: 2QGU_A.
Probab=44.82  E-value=16  Score=30.45  Aligned_cols=31  Identities=16%  Similarity=0.404  Sum_probs=21.2

Q ss_pred             CCChhhhhhhhhhhhcccCCChhhhhhhhhh
Q 023605           18 SDLSVTLGLRKHIGKTYKELPPEQKARYKKR   48 (280)
Q Consensus        18 p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~   48 (280)
                      |....-.....++|..|+.+|+++++.|.+.
T Consensus        36 ~~~D~~~~ar~~LG~~w~~~s~~q~~~F~~~   66 (170)
T PF05494_consen   36 PYFDFERMARRVLGRYWRKASPAQRQRFVEA   66 (170)
T ss_dssp             GGB-HHHHHHHHHGGGTTTS-HHHHHHHHHH
T ss_pred             HhCCHHHHHHHHHHHhHhhCCHHHHHHHHHH
Confidence            3344334446788999999999999987654


No 29 
>PF12650 DUF3784:  Domain of unknown function (DUF3784);  InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=44.28  E-value=13  Score=27.96  Aligned_cols=19  Identities=26%  Similarity=0.543  Sum_probs=15.5

Q ss_pred             cccCCChhhhhhhhhhhhh
Q 023605           33 TYKELPPEQKARYKKRDER   51 (280)
Q Consensus        33 ~wk~l~~~~k~~~~~~~~~   51 (280)
                      -|+.||+|||+.|-+++-.
T Consensus        25 Gyntms~eEk~~~D~~~l~   43 (97)
T PF12650_consen   25 GYNTMSKEEKEKYDKKKLC   43 (97)
T ss_pred             hcccCCHHHHHHhhHHHHH
Confidence            4789999999998877643


No 30 
>cd03489 Topoisomer_IB_N_LdtopoI_like Topoisomer_IB_N_LdtopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the heterodimeric topo I from Leishmania donvanni. Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human
Probab=44.15  E-value=24  Score=30.78  Aligned_cols=54  Identities=9%  Similarity=-0.011  Sum_probs=34.8

Q ss_pred             chHHHHHHHHhhCCCChhhh--hhhh------hhhhcccCCChhhhhhhhhhhhhhcccCCC
Q 023605            5 CSNEEVKRLRSENSDLSVTL--GLRK------HIGKTYKELPPEQKARYKKRDERMGNSGNS   58 (280)
Q Consensus         5 ~~~~~~~~~~~~~p~~~~~~--~~~k------~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~   58 (280)
                      |.+|||+.+...+-.++.++  +..+      .--++-|+||.|||+.-.+.+.+....|.-
T Consensus        61 Ff~df~~~l~~~~~~I~~f~kcDF~~i~~~~~~~~e~kK~~tkeEKk~~K~ek~~~e~~Y~~  122 (212)
T cd03489          61 FFESWREILDKRHHPIRKLELCDFTPIYEWHLREKEKKKSRTKEEKKALKEEKDKEAEPYMW  122 (212)
T ss_pred             HHHHHHHHhcccCccccchhhCCCHHHHHHHHHHHHHHHhCCHHHHHHHHHHHHHhhccCCE
Confidence            78999999976652333322  2222      234578899999999776666666666643


No 31 
>PF11304 DUF3106:  Protein of unknown function (DUF3106);  InterPro: IPR021455  Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known. 
Probab=43.50  E-value=29  Score=26.96  Aligned_cols=65  Identities=20%  Similarity=0.340  Sum_probs=37.5

Q ss_pred             hhhhhcccCCChhhhhhhhhhhhhhcccCCCCCCCCCCCCccccccHHHHHHHHhhCCHHHHHHHHHcCChhhhccc
Q 023605           28 KHIGKTYKELPPEQKARYKKRDERMGNSGNSNSHSGDNEIVETKCVPERFCALVKSLSEEKKKAIREIGFESLLELR  104 (280)
Q Consensus        28 k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~~~k~~~~~~RcS~~~~~~~i~~Ls~~qk~~I~~~GFg~LL~i~  104 (280)
                      .-..+.|.+|+++.|..++..+..-..=-     +.-+.      ....=+.--..|||+|++.+++. |..+-.|+
T Consensus        14 ~pl~~~W~~l~~~qr~k~l~~a~r~~~ms-----peqq~------r~~~rm~~W~~LspeqR~~~R~~-~~~~~~Lp   78 (107)
T PF11304_consen   14 APLAERWNSLPPEQRRKWLQIAERWPSMS-----PEQQQ------RLRERMRRWAALSPEQRQQAREN-YQRFKQLP   78 (107)
T ss_pred             HHHHHHHhcCCHHHHHHHHHHHHHHhcCC-----HHHHH------HHHHHHHHHHhCCHHHHHHHHHH-HHHHHcCC
Confidence            55667899999999998887765443111     11011      11112223356777777777765 55555544


No 32 
>cd00660 Topoisomer_IB_N Topoisomer_IB_N: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donvanni. Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts.  In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I diffe
Probab=38.11  E-value=41  Score=29.47  Aligned_cols=54  Identities=15%  Similarity=0.122  Sum_probs=34.2

Q ss_pred             chHHHHHHHHhhCCCC-hhhh--hh------hhhhhhcccCCChhhhhhhhhhhhhhcccCCC
Q 023605            5 CSNEEVKRLRSENSDL-SVTL--GL------RKHIGKTYKELPPEQKARYKKRDERMGNSGNS   58 (280)
Q Consensus         5 ~~~~~~~~~~~~~p~~-~~~~--~~------~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~   58 (280)
                      |.+|||+.+...+... +.++  +.      -..-.++-|+||.|||+.-.+.+++....|.-
T Consensus        63 Ff~Df~~~l~~~~~~~i~~f~kcDF~~i~~~~~~~~e~kK~~s~eEKk~~K~ek~~~e~~Y~~  125 (215)
T cd00660          63 FFKDFRKILTKEEKHIIKKLSKCDFTPIYQYFEEEKEKKKAMSKEEKKAIKEEKEKLEEPYGY  125 (215)
T ss_pred             HHHHHHHHhccccCccccchhhCCCHHHHHHHHHHHHHHHcCCHHHHHHHHHHHHhhhccCCE
Confidence            7889999996654322 1111  11      22335678899999999776666666666643


No 33 
>cd03488 Topoisomer_IB_N_htopoI_like Topoisomer_IB_N_htopoI_like : N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I.  Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts.  This family may represent more than one structural domain.
Probab=37.88  E-value=42  Score=29.41  Aligned_cols=54  Identities=15%  Similarity=0.078  Sum_probs=34.4

Q ss_pred             chHHHHHHHHhhCCCC-hhhh--hh------hhhhhhcccCCChhhhhhhhhhhhhhcccCCC
Q 023605            5 CSNEEVKRLRSENSDL-SVTL--GL------RKHIGKTYKELPPEQKARYKKRDERMGNSGNS   58 (280)
Q Consensus         5 ~~~~~~~~~~~~~p~~-~~~~--~~------~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~   58 (280)
                      |.+|||+.+...+... +.++  +.      -..-.++-|+||.|||+.-.+.+.+....|.-
T Consensus        63 Ff~Df~~~l~~~~~~~I~~f~kcDF~~i~~~~~~~~e~kK~~tkeEKk~~K~ek~~~e~~Y~~  125 (215)
T cd03488          63 FFKDFKKVMTKEEKVIIKDFSKCDFTQMFAYFKAQKEEKKAMSKEEKKAIKAEKEKLEEEYGF  125 (215)
T ss_pred             HHHHHHHHhccccCccccchhhCCCHHHHHHHHHHHHHHHcCCHHHHHHHHHHHHhhhccCCE
Confidence            7889999996654221 1111  11      22345678899999999776666666666643


No 34 
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.
Probab=37.70  E-value=20  Score=30.95  Aligned_cols=22  Identities=32%  Similarity=0.668  Sum_probs=18.9

Q ss_pred             hhhhhhcccCCChhhhhhhhhh
Q 023605           27 RKHIGKTYKELPPEQKARYKKR   48 (280)
Q Consensus        27 ~k~~g~~wk~l~~~~k~~~~~~   48 (280)
                      ..++|..|+.+|+++|+.|.+-
T Consensus        71 r~vLG~~W~~~s~~Qr~~F~~~   92 (198)
T TIGR03481        71 RLTLGSSWTSLSPEQRRRFIGA   92 (198)
T ss_pred             HHHhhhhhhhCCHHHHHHHHHH
Confidence            5588999999999999987654


No 35 
>cd03490 Topoisomer_IB_N_1 Topoisomer_IB_N_1: A subgroup of the N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB. Topo IB proteins include the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donvanni. Topo I enzymes are divided into:  topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick, the broken strand can then rotate around the unbroken strand to remove DNA supercoils and, the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes.  Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts.  In addition to differences in structure and some biochemical properties, Trypanoso
Probab=34.00  E-value=47  Score=29.12  Aligned_cols=54  Identities=15%  Similarity=0.056  Sum_probs=34.6

Q ss_pred             chHHHHHHHHhhCCCChh--h--hhhh------hhhhhcccCCChhhhhhhhhhhhhhcccCCC
Q 023605            5 CSNEEVKRLRSENSDLSV--T--LGLR------KHIGKTYKELPPEQKARYKKRDERMGNSGNS   58 (280)
Q Consensus         5 ~~~~~~~~~~~~~p~~~~--~--~~~~------k~~g~~wk~l~~~~k~~~~~~~~~~k~~~~~   58 (280)
                      |.+|||+.+...+....+  +  ++..      ..--++-|+||.|||+.-.+.+.+....|.-
T Consensus        61 Ff~df~~~l~~~~~~~~i~~f~kcDF~~i~~~~~~~ke~kK~~tkeEKk~~K~ek~~~e~~Y~~  124 (217)
T cd03490          61 FWKVFVNSFEKDHKFIRRCKLSDADFSLIKNHLEEEKEKKKNLNKEEKEAKKKERAKREYPFNY  124 (217)
T ss_pred             HHHHHHHHhccccCcccccchhhCCCHHHHHHHHHHHHHHHhcCHHHHHHHHHHHHHhhccCCE
Confidence            789999999766633321  1  1222      2234577899999999766666666666643


No 36 
>cd09071 FAR_C C-terminal domain of fatty acyl CoA reductases. C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins in this FAR_C family may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. The function of this C-terminal domain is unclear.
Probab=31.61  E-value=45  Score=24.40  Aligned_cols=21  Identities=14%  Similarity=0.754  Sum_probs=18.2

Q ss_pred             CCccccchHHHHHHHHHHHHHH
Q 023605          229 SIKSLNWATFCYDWLVKSICRF  250 (280)
Q Consensus       229 ~i~~ynW~~~Vl~~L~~~i~k~  250 (280)
                      ++.++||..++.+. +.|+++|
T Consensus        70 D~~~idW~~Y~~~~-~~G~r~y   90 (92)
T cd09071          70 DIRSIDWDDYFENY-IPGLRKY   90 (92)
T ss_pred             CCCCCCHHHHHHHH-HHHHHHH
Confidence            46799999999999 8888876


No 37 
>PF03457 HA:  Helicase associated domain;  InterPro: IPR005114 This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.; PDB: 2KTA_A.
Probab=31.45  E-value=33  Score=23.82  Aligned_cols=16  Identities=25%  Similarity=0.578  Sum_probs=11.3

Q ss_pred             hhCCHHHHHHHHHcCC
Q 023605           82 KSLSEEKKKAIREIGF   97 (280)
Q Consensus        82 ~~Ls~~qk~~I~~~GF   97 (280)
                      ..|+++|.+.++++||
T Consensus        52 g~L~~er~~~L~~lg~   67 (68)
T PF03457_consen   52 GKLTPERIERLDALGF   67 (68)
T ss_dssp             T---HHHHHHHHHHT-
T ss_pred             CCCCHHHHHHHHcCCC
Confidence            4699999999999998


No 38 
>PF05823 Gp-FAR-1:  Nematode fatty acid retinoid binding protein (Gp-FAR-1);  InterPro: IPR008632 Parasitic nematodes produce at least two structurally novel classes of small helix-rich retinol- and fatty-acid-binding proteins that have no counterparts in their plant or animal hosts and thus represent potential targets for new nematicides. Gp-FAR-1 is a member of the nematode-specific fatty-acid- and retinol-binding (FAR) family of proteins but localises to the surface of the organism, placing it in a strategic position for interaction with the host. Gp-FAR-1 functions as a broad-spectrum retinol- and fatty-acid-binding protein, and it is thought that it is involved in the evasion of primary host plant defence systems [].; GO: 0008289 lipid binding; PDB: 2W9Y_A.
Probab=31.32  E-value=13  Score=30.80  Aligned_cols=89  Identities=19%  Similarity=0.241  Sum_probs=49.4

Q ss_pred             HHHHHHHHhhCCCChh-hhhhhhhhhhcccCCChhhhhh---hhhhhhhhcccCCCCCCCCCCCCccccccH----HHHH
Q 023605            7 NEEVKRLRSENSDLSV-TLGLRKHIGKTYKELPPEQKAR---YKKRDERMGNSGNSNSHSGDNEIVETKCVP----ERFC   78 (280)
Q Consensus         7 ~~~~~~~~~~~p~~~~-~~~~~k~~g~~wk~l~~~~k~~---~~~~~~~~k~~~~~~k~~~k~~~~~~RcS~----~~~~   78 (280)
                      +|+-+.+|++.|.+-. +..+....-.+-.+|+|+.|+.   ..+++.....++.....+.       .+..    +.+.
T Consensus        43 de~i~~LK~ksP~L~~k~~~l~~~~k~ki~~L~peak~Fv~~li~~~~~l~~~~~~G~~~~-------~~~lk~~~k~~~  115 (154)
T PF05823_consen   43 DEMIAALKEKSPSLYEKAEKLRDKLKKKIDKLSPEAKAFVKELIAKARSLYAQYSAGEKPD-------LEELKQLAKKVI  115 (154)
T ss_dssp             TTHHHHHHHH-HHHHHHHHHHHHHHHHTTTT--HHHHHHHHHHHHHHHHHHHHHHHT-----------THHHHHHH----
T ss_pred             HHHHHHHHHhCHHHHHHHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHhcCCCCCC-------HHHHHHHHhhhH
Confidence            5778899999999754 5556777778999999998863   3333333333333322111       3334    4555


Q ss_pred             HHHhhCCHHHHHHHHHcCChhhhcc
Q 023605           79 ALVKSLSEEKKKAIREIGFESLLEL  103 (280)
Q Consensus        79 ~~i~~Ls~~qk~~I~~~GFg~LL~i  103 (280)
                      .-++.||++-|+-|++. |-.+..+
T Consensus       116 ~~ykaLs~~ak~dL~k~-FP~i~~~  139 (154)
T PF05823_consen  116 DSYKALSPEAKDDLKKN-FPIIASF  139 (154)
T ss_dssp             HHHHTS-HHHHHHHHHH--TT----
T ss_pred             HHHHcCCHHHHHHHHHH-Cccchhh
Confidence            77889999999999887 6665554


No 39 
>PF04994 TfoX_C:  TfoX C-terminal domain;  InterPro: IPR007077 This domain is found in a number of bacterial proteins including the TfoX gene product of Haemophilus influenzae. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes []. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found in association with the N-terminal domain in some, but not all members of this group, suggesting this is an autonomous and functionally unrelated domain. For example it is found associated with Q9JZR1 from SWISSPROT in IPR002125 from INTERPRO.; PDB: 3BQT_A 3MAB_A.
Probab=30.51  E-value=53  Score=24.12  Aligned_cols=36  Identities=19%  Similarity=0.290  Sum_probs=22.3

Q ss_pred             HHHhhCCCChh---hhhhhhhhhhcccCCChhhhhhhhh
Q 023605           12 RLRSENSDLSV---TLGLRKHIGKTYKELPPEQKARYKK   47 (280)
Q Consensus        12 ~~~~~~p~~~~---~~~~~k~~g~~wk~l~~~~k~~~~~   47 (280)
                      .+++.++++..   .+=.|-.-|..|..|++++|+.-..
T Consensus        41 ~Lk~~~~~~~~~~L~aL~gAi~g~~~~~L~~~~K~~L~~   79 (81)
T PF04994_consen   41 RLKASGPSVCLNLLYALEGAIQGIHWADLPDEEKQELLE   79 (81)
T ss_dssp             HHHHH-TT--HHHHHHHHHHHCTS-GGGS-HHHHHHHHH
T ss_pred             HHHHHCCCCCHHHHHHHHHHHcCCCHHHCCHHHHHHHHh
Confidence            45666877764   3334778899999999999986543


No 40 
>cd07321 Extradiol_Dioxygenase_3A_like Subunit A of Class III extradiol dioxygenases. Extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings.  There are two major groups of dioxygenases according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents subunit A of c
Probab=29.35  E-value=60  Score=23.59  Aligned_cols=31  Identities=23%  Similarity=0.218  Sum_probs=25.0

Q ss_pred             hCCHHHHHHHHHcCChhhhcccccccCHHHHHHH
Q 023605           83 SLSEEKKKAIREIGFESLLELRCGKLKRKLCHWL  116 (280)
Q Consensus        83 ~Ls~~qk~~I~~~GFg~LL~i~~~~l~~~l~~wL  116 (280)
                      .||++|+++|.+--+.+|+++..   |..+...+
T Consensus        34 ~Lt~eE~~al~~rD~~~L~~lG~---~~~~l~k~   64 (77)
T cd07321          34 GLTPEEKAALLARDVGALYVLGV---NPMLLMHF   64 (77)
T ss_pred             CCCHHHHHHHHcCCHHHHHHcCC---CHHHHHHH
Confidence            79999999999999999999874   44444433


No 41 
>PRK01381 Trp operon repressor; Provisional
Probab=28.35  E-value=35  Score=26.25  Aligned_cols=59  Identities=15%  Similarity=0.142  Sum_probs=40.0

Q ss_pred             CChhhhhhhhhhhhhhcc--cCCCCCCCCCCCCccccccHHHHHHHHhhCCHHHHHHHHHc
Q 023605           37 LPPEQKARYKKRDERMGN--SGNSNSHSGDNEIVETKCVPERFCALVKSLSEEKKKAIREI   95 (280)
Q Consensus        37 l~~~~k~~~~~~~~~~k~--~~~~~k~~~k~~~~~~RcS~~~~~~~i~~Ls~~qk~~I~~~   95 (280)
                      |++.|+.-...|-.-...  +++...-...+....++|...+..+.++..+++.|+++++.
T Consensus        33 lTp~Er~al~~R~~I~~~L~~g~~sQREIa~~lGvSiaTITRgsn~Lk~~~~~~k~~l~~~   93 (99)
T PRK01381         33 LTPDEREALGTRVRIVEELLRGELSQREIKQELGVGIATITRGSNSLKTAPPEFKEWLEQQ   93 (99)
T ss_pred             CCHHHHHHHHHHHHHHHHHHcCCcCHHHHHHHhCCceeeehhhHHHhccCCHHHHHHHHHH
Confidence            677777665555444332  33333223445556788999999999999999999998864


No 42 
>PF03015 Sterile:  Male sterility protein;  InterPro: IPR004262 This family represents the C-terminal region of the male sterility protein in a number of organisms. The Arabidopsis thaliana male sterility 2 (MS2) protein is involved in male gametogenesis. The MS2 protein shows sequence similarity to a jojoba protein (also a member of this group) that converts wax fatty acids to fatty alcohols. It has been suggested that a possible function of the MS2 protein may be as a fatty acyl reductase in the formation of pollen wall substances [].; GO: 0016620 oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor, 0055114 oxidation-reduction process
Probab=26.62  E-value=63  Score=23.93  Aligned_cols=53  Identities=11%  Similarity=0.279  Sum_probs=31.8

Q ss_pred             HHHHHHHhhhceeecccCCC------CCccccccccccCCCccccchHHHHHHHHHHHHHHh
Q 023605          196 FKVAFTLFAIATLLCPIGSY------ISTLFLHPIMDVSSIKSLNWATFCYDWLVKSICRFQ  251 (280)
Q Consensus       196 f~r~Fll~~~~~~L~Ptt~~------vs~~yl~~l~D~~~i~~ynW~~~Vl~~L~~~i~k~~  251 (280)
                      ...++-.|+.....+.+.+.      .++.--..+ +. +++++||-.++..+ +.|+++|-
T Consensus        33 ~~~~~~~F~~~eW~F~~~n~~~L~~~l~~~D~~~F-~f-D~~~idW~~Y~~~~-~~G~rkyl   91 (94)
T PF03015_consen   33 ALEVLEYFTTNEWIFDNDNTRRLWERLSPEDREIF-NF-DIRSIDWEEYFRNY-IPGIRKYL   91 (94)
T ss_pred             HHHHHHHHHhCceeecchHHHHHHHhCchhcCcee-cC-CCCCCCHHHHHHHH-HHHHHHHH
Confidence            34444455555555554432      222222222 23 57899999999999 88999874


No 43 
>PF02919 Topoisom_I_N:  Eukaryotic DNA topoisomerase I, DNA binding fragment;  InterPro: IPR008336 DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks []. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis [, ]. DNA topoisomerases are divided into two classes: type I enzymes (5.99.1.2 from EC; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (5.99.1.3 from EC; topoisomerases II, IV and VI) break double-strand DNA []. Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.  This entry represents the N-terminal DNA-binding domain found in eukaryotic topoisomerase I, which is a type IB enzymes. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain []. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes []. More information about this protein can be found at Protein of the Month: DNA Topoisomerase [].; GO: 0003677 DNA binding, 0003917 DNA topoisomerase type I activity, 0006265 DNA topological change, 0005694 chromosome; PDB: 1TL8_A 1K4T_A 1A36_A 1RR8_C 1T8I_A 1SC7_A 1EJ9_A 1LPQ_A 1RRJ_A 1A31_A ....
Probab=26.47  E-value=28  Score=30.49  Aligned_cols=52  Identities=15%  Similarity=0.113  Sum_probs=31.1

Q ss_pred             chHHHHHHHHhhCCC-Chhh-----hhh---hhhhhhcccCCChhhhhhhhhhhhhhcccC
Q 023605            5 CSNEEVKRLRSENSD-LSVT-----LGL---RKHIGKTYKELPPEQKARYKKRDERMGNSG   56 (280)
Q Consensus         5 ~~~~~~~~~~~~~p~-~~~~-----~~~---~k~~g~~wk~l~~~~k~~~~~~~~~~k~~~   56 (280)
                      |.+|||+.+.++++. ++.+     ..+   -..--++-|+||.|||+.--+.+++....|
T Consensus        64 Ff~Df~~~l~~~~~~~i~~~~kcDF~~i~~~~~~~~e~kk~~skeEK~~~K~~k~~~~~~y  124 (215)
T PF02919_consen   64 FFKDFRKVLTKEERKKIKDFDKCDFSPIYEYFEKEKEKKKNMSKEEKKALKEEKEELEEKY  124 (215)
T ss_dssp             HHHHHHHHHCHCCHHH-S-GGGEETHHHHHHHHHHHHHHCTS-CCHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHhhhccCcccCchhhCCCHHHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHhhC
Confidence            788999999988853 2221     111   222335678999999886555555555554


No 44 
>PF11460 DUF3007:  Protein of unknown function (DUF3007);  InterPro: IPR021562  This is a family of uncharacterised proteins found in bacteria and eukaryotes. 
Probab=25.43  E-value=67  Score=24.95  Aligned_cols=37  Identities=19%  Similarity=0.292  Sum_probs=26.7

Q ss_pred             HHHHHHHHhhCCCChhhhhhhhhhhhcccCCChhhhhhhhhh
Q 023605            7 NEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKR   48 (280)
Q Consensus         7 ~~~~~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~~k~~~~~~   48 (280)
                      .++|+.|+++.-..     .....-+++.+|||||.+.-++.
T Consensus        65 ~~Q~k~Ye~a~~~~-----~~~~lqkRle~l~~eE~~~L~~e  101 (104)
T PF11460_consen   65 MQQRKDYEEAVDQL-----TNEELQKRLEELSPEELEALQAE  101 (104)
T ss_pred             HHHHHHHHHHHHHH-----hHHHHHHHHHhCCHHHHHHHHHH
Confidence            46788888765332     24467789999999999876654


No 45 
>PF04189 Gcd10p:  Gcd10p family;  InterPro: IPR007316 eIF-3 is a multisubunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3 [, ].; GO: 0003743 translation initiation factor activity, 0006413 translational initiation
Probab=23.52  E-value=2e+02  Score=26.52  Aligned_cols=111  Identities=18%  Similarity=0.178  Sum_probs=68.7

Q ss_pred             HHHHHHHHhhCCCChhhhhhhhhh--hhcccCCChhhhhhhhhhhhhhcccCCCCCCCCCCCCccccccHHHHHHHHhhC
Q 023605            7 NEEVKRLRSENSDLSVTLGLRKHI--GKTYKELPPEQKARYKKRDERMGNSGNSNSHSGDNEIVETKCVPERFCALVKSL   84 (280)
Q Consensus         7 ~~~~~~~~~~~p~~~~~~~~~k~~--g~~wk~l~~~~k~~~~~~~~~~k~~~~~~k~~~k~~~~~~RcS~~~~~~~i~~L   84 (280)
                      .|...++|++  +.++-..+.|.+  ...|..=|+-.+++|++|+++...++-          ...++++..+.+.+-.=
T Consensus       110 ~eeIe~LK~~--g~sg~eII~kLiens~tF~~KT~FSqeKYlkrK~kKy~~~f----------tv~~pt~~~l~e~y~~k  177 (299)
T PF04189_consen  110 QEEIEELKKE--GVSGEEIIEKLIENSSTFDKKTEFSQEKYLKRKQKKYLKRF----------TVLRPTIRNLCEYYFEK  177 (299)
T ss_pred             HHHHHHHHHc--CCCHHHHHHHHHHhccchhhhhHHHHHHHHHHHHhhhhceE----------EEeCCCHHHHHHHHhhc
Confidence            4566778887  566666666665  358888899999999999877654432          23577888888877544


Q ss_pred             CHHHHHHHHHcCChhhhccccccc----------CHHHHHHHhcccCCCcceEEE
Q 023605           85 SEEKKKAIREIGFESLLELRCGKL----------KRKLCHWLVNQFKPERNIIEL  129 (280)
Q Consensus        85 s~~qk~~I~~~GFg~LL~i~~~~l----------~~~l~~wL~~~~d~~t~~~~i  129 (280)
                      +|.+.--+|.=-++-+|.+....-          .-=++..+++|.......+.+
T Consensus       178 ~p~Ki~~lR~d~la~il~~aNV~~g~r~Lv~D~~~GLv~aav~eRmgg~G~i~~~  232 (299)
T PF04189_consen  178 DPQKIMDLRFDTLAQILSLANVHAGGRVLVVDDCGGLVVAAVAERMGGSGNIITL  232 (299)
T ss_pred             ChHHHhccCHHHHHHHHHhcCCCCCCeEEEEeCCCChHHHHHHHHhCCCceEEEE
Confidence            665554444444444555443211          123445666666665444444


No 46 
>PF08373 RAP:  RAP domain;  InterPro: IPR013584 The ~60-residue RAP (an acronym for RNA-binding domain abundant in Apicomplexans) domain is found in various proteins in eukaryotes. It is particularly abundant in apicomplexans and might mediate a range of cellular functions through its potential interactions with RNA. The RAP domain consists of multiple blocks of charged and aromatic residues and is predicted to be composed of alpha helical and beta strand structures. Two predicted loop regions that are dominated by glycine and tryptophan residues are found before and after the central beta sheet. Some proteins known to contain a RAP domain are listed below:   Human hypothetical protein MGC5297,  Mammalian FAST kinase domain-containing proteins (FASTKDs),   Chlamydomonas reinhardtii chloroplastic trans-splicing factor Raa3. 
Probab=23.24  E-value=36  Score=22.67  Aligned_cols=17  Identities=29%  Similarity=0.532  Sum_probs=14.7

Q ss_pred             hcccCC-Chhhhhhhhhh
Q 023605           32 KTYKEL-PPEQKARYKKR   48 (280)
Q Consensus        32 ~~wk~l-~~~~k~~~~~~   48 (280)
                      -.|.+| +.++|+.|+++
T Consensus        40 ~eW~~l~~~~~k~~YL~~   57 (58)
T PF08373_consen   40 YEWNKLKSREEKIEYLKK   57 (58)
T ss_pred             HHHHhcCCHHHHHHHHhc
Confidence            389999 89999999875


No 47 
>PF11943 DUF3460:  Protein of unknown function (DUF3460);  InterPro: IPR021853  This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are about 70 amino acids in length. This protein has a conserved WDK sequence motif. 
Probab=22.01  E-value=1.2e+02  Score=21.15  Aligned_cols=39  Identities=13%  Similarity=0.263  Sum_probs=25.4

Q ss_pred             HHHHHHHHhhCCCChhhhhhhhh-hhhcccCCChhhhhhhhh
Q 023605            7 NEEVKRLRSENSDLSVTLGLRKH-IGKTYKELPPEQKARYKK   47 (280)
Q Consensus         7 ~~~~~~~~~~~p~~~~~~~~~k~-~g~~wk~l~~~~k~~~~~   47 (280)
                      -.|-.++|++||++..--..|++ .-+|  -++.++.+.|.+
T Consensus         8 TqFl~~lk~~~Pele~~Q~~GRallWDk--~~d~e~~~~~~~   47 (60)
T PF11943_consen    8 TQFLNQLKAKHPELEEEQRAGRALLWDK--PQDLEEQARFRA   47 (60)
T ss_pred             HHHHHHHHHhCCchHHHHHHhhHHhcCC--CCCHHHHHHHHh
Confidence            35788999999998743333433 3444  677777776544


No 48 
>cd07921 PCA_45_Doxase_A_like Subunit A of the Class III Extradiol dioxygenase, Protocatechuate 4,5-dioxygenase, and similar enzymes. This subfamily includes the A subunit of protocatechuate (PCA) 4,5-dioxygenase (LigAB) and two subfamilies of unknown function. The A subunit is the smaller, non-catalytic subunit of LigAB. PCA 4,5-dioxygenase catalyzes the oxidization and subsequent ring-opening of PCA (or 3,4-dihydroxybenzoic acid), which is an intermediate in the breakdown of lignin and other compounds. PCA 4,5-dioxygenase is one of the aromatic ring opening dioxygenases which play key roles in the degradation of aromatic compounds. As members of the Class III extradiol dioxygenase family, the enzymes use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. LigAB-like class III enzymes are usually composed of two subunits, designated A and B, which form a tetramer composed of two copies of each subunit.
Probab=21.69  E-value=56  Score=25.46  Aligned_cols=23  Identities=39%  Similarity=0.525  Sum_probs=21.2

Q ss_pred             hCCHHHHHHHHHcCChhhhcccc
Q 023605           83 SLSEEKKKAIREIGFESLLELRC  105 (280)
Q Consensus        83 ~Ls~~qk~~I~~~GFg~LL~i~~  105 (280)
                      .||++|+++|.+-.+-+|+++.-
T Consensus        44 gLTeEe~~AV~~rD~~~Li~lGg   66 (106)
T cd07921          44 GLTEEQKQAVLDRDWLRLLELGG   66 (106)
T ss_pred             CCCHHHHHHHHhCCHHHHHHhcC
Confidence            79999999999999999998864


No 49 
>PRK10236 hypothetical protein; Provisional
Probab=21.30  E-value=54  Score=29.26  Aligned_cols=29  Identities=17%  Similarity=0.274  Sum_probs=24.1

Q ss_pred             hhhhhhhhhcccCCChhhhhhhhhhhhhh
Q 023605           24 LGLRKHIGKTYKELPPEQKARYKKRDERM   52 (280)
Q Consensus        24 ~~~~k~~g~~wk~l~~~~k~~~~~~~~~~   52 (280)
                      .-+.|..++.|+-||++|++.+.++-...
T Consensus       116 ~il~kll~~a~~kms~eE~~~L~~~l~~~  144 (237)
T PRK10236        116 QLLEQFLRNTWKKMDEEHKQEFLHAVDAR  144 (237)
T ss_pred             HHHHHHHHHHHHHCCHHHHHHHHHHHhhh
Confidence            34589999999999999999988775554


No 50 
>PHA02662 ORF131 putative membrane protein; Provisional
Probab=21.03  E-value=89  Score=27.56  Aligned_cols=25  Identities=16%  Similarity=0.361  Sum_probs=21.4

Q ss_pred             ccccccH----------HHHHHHHhhCCHHHHHHH
Q 023605           68 VETKCVP----------ERFCALVKSLSEEKKKAI   92 (280)
Q Consensus        68 ~~~RcS~----------~~~~~~i~~Ls~~qk~~I   92 (280)
                      +.++||.          +.+.++++.|+++||..|
T Consensus        64 V~N~C~sna~~sf~lll~Al~Et~~~Lp~~qK~~i   98 (226)
T PHA02662         64 VLNRCHTDAADALALASAALAETLAELPRADRLAV   98 (226)
T ss_pred             EEecccCCHHHHHHHHHHHHHHHHHhCCHHHHHHH
Confidence            5688876          678899999999999887


No 51 
>PRK04156 gltX glutamyl-tRNA synthetase; Provisional
Probab=20.94  E-value=92  Score=31.50  Aligned_cols=62  Identities=13%  Similarity=0.171  Sum_probs=34.8

Q ss_pred             HHhhCCCChh-hhhh---hhhhhhcccCCChhhhhhhhhh-hh------hhcccCCCCC---CCCCCCCccccccH
Q 023605           13 LRSENSDLSV-TLGL---RKHIGKTYKELPPEQKARYKKR-DE------RMGNSGNSNS---HSGDNEIVETKCVP   74 (280)
Q Consensus        13 ~~~~~p~~~~-~~~~---~k~~g~~wk~l~~~~k~~~~~~-~~------~~k~~~~~~k---~~~k~~~~~~RcS~   74 (280)
                      +..+||++.. .+.|   .+.+=+.=.+||.||.+.-++. +-      +.+++-.++.   +.+-..++.+||-|
T Consensus        33 ~~~~~pelr~~~~ei~~~v~~~v~~vn~ms~ee~~~~l~~~~pe~~~~~~~~~~~~~~lp~L~~ae~g~V~tRFaP  108 (567)
T PRK04156         33 IMGENPELRSKAKEIIPIVKEVVEEVNSLSLEEQRERLEELAPELLEEEEEKKEEKKGLPPLPNAEKGKVVMRFAP  108 (567)
T ss_pred             hhccChhhhhhhhhHHHHHHHHHHHHhcCCHHHHHHHHHHhChhhhhhhhhhcccccCCCCCCCCCCCeEEEEeCC
Confidence            4567888765 3333   3334447789999988765555 11      1111112222   23336678999966


No 52 
>cd07922 CarBa CarBa is the A subunit of 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and   subsequent ring-opening of 2-aminophenyl-2,3-diol. CarBa is the A subunit of 2-aminophenol 1,6-dioxygenase, which catalyzes the oxidization and subsequent ring-opening of 2-aminophenyl-2,3-diol. 2-aminophenol 1,6-dioxygenase is a key enzyme in the carbazole degradation pathway isolated from bacterial strains with carbazole degradation ability. The enzyme is a heterotetramer composed of two A and two B subunits. CarB belongs to the class III extradiol dioxygenase family, composed of enzymes which use a non-heme Fe(II) to cleave aromatic rings between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Although the enzyme was originally isolated as a meta-cleavage enzyme for 2'-aminobiphenyl-2,3-diol involved in carbazole degradation, the enzyme has also shown high specificity for 2,3-dihydroxybiphenyl.
Probab=20.69  E-value=1.1e+02  Score=22.60  Aligned_cols=48  Identities=21%  Similarity=0.227  Sum_probs=33.9

Q ss_pred             cHHHHHHHHhhCCHHHHHHHHHcCChhhhcccccccCHHHHHHHhcccCCCc
Q 023605           73 VPERFCALVKSLSEEKKKAIREIGFESLLELRCGKLKRKLCHWLVNQFKPER  124 (280)
Q Consensus        73 S~~~~~~~i~~Ls~~qk~~I~~~GFg~LL~i~~~~l~~~l~~wL~~~~d~~t  124 (280)
                      .|..+.+- -.||++++++|++-.+++|.+++   ++..|..+.+=-.||+-
T Consensus        26 DPea~~~~-~gLt~eE~~aL~~~D~~~L~~lG---vhp~L~mh~~~~~np~~   73 (81)
T cd07922          26 DPSAVFEE-YGLTPAERAALREGTFGALTSIG---VHPILQMHYLMYTNPEM   73 (81)
T ss_pred             CHHHHHHH-cCCCHHHHHHHHccCHHHHHHcC---CCHHHHHHHHHHcCccc
Confidence            34444332 27999999999999999999987   55566665554556654


No 53 
>cd07923 Gallate_dioxygenase_C The C-terminal domain of Gallate Dioxygenase, which catalyzes the oxidization and subsequent ring-opening of gallate. Gallate Dioxygenase catalyzes the oxidization and subsequent ring-opening of gallate, an intermediate in the degradation of the aromatic compound, syringate. The reaction product of gallate dioxygenase is 4-oxalomesaconate. The amino acid sequence of the N-terminal and C-terminal regions of gallate dioxygenase exhibits homology with the sequence of the PCA 4,5-dioxygenase B (catalytic) and A subunits, respectively. This model represents the C-terminal domain, which is similar to the A subunit of PCA 4,5-dioxygenase (or LigAB). The enzyme is estimated to be a homodimer according to the Escherichia coli enzyme. Since enzymes in this subfamily have fused A and B subunits, the dimer interface may resemble the tetramer interface of classical LigAB enzymes. This enzyme belongs to the class III extradiol dioxygenase family, composed of enzymes whi
Probab=20.14  E-value=59  Score=24.77  Aligned_cols=23  Identities=22%  Similarity=0.429  Sum_probs=21.2

Q ss_pred             hCCHHHHHHHHHcCChhhhcccc
Q 023605           83 SLSEEKKKAIREIGFESLLELRC  105 (280)
Q Consensus        83 ~Ls~~qk~~I~~~GFg~LL~i~~  105 (280)
                      .||++|+++|++--+.+++++..
T Consensus        36 gLt~Ee~~av~~rD~~~li~~G~   58 (94)
T cd07923          36 GLTEEERTLIRNRDWIGMIRYGV   58 (94)
T ss_pred             CCCHHHHHHHHcchHHHHHHccC
Confidence            79999999999999999999875


Done!