RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy7929
         (259 letters)



>gnl|CDD|221887 pfam12998, ING, Inhibitor of growth proteins N-terminal
           histone-binding.  Histones undergo numerous
           post-translational modifications, including acetylation
           and methylation, at residues which are then probable
           docking sites for various chromatin remodelling
           complexes. Inhibitor of growth proteins (INGs)
           specifically bind to residues that have been thus
           modified. INGs carry a well-characterized C-terminal
           PHD-type zinc-finger domain, binding with lysine
           4-tri-methylated histone H3 (H3K4me3), as well as this
           N-terminal domain that binds unmodified H3 tails.
           Although these two regions can bind histones
           independently, together they increase the apparent
           association of the ING for the H3 tail.
          Length = 104

 Score =  136 bits (344), Expect = 7e-41
 Identities = 48/104 (46%), Positives = 71/104 (68%), Gaps = 1/104 (0%)

Query: 4   SYLEQYLDSLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMK-HYSKDKKKE 62
            YLE YLD L++LP+ELQRNFT +RE+D++ Q ++K +D   + ++     + S  K++E
Sbjct: 1   LYLEDYLDDLENLPLELQRNFTEIREIDAQVQKIIKELDEQIQKFIKENGSNLSNPKEEE 60

Query: 63  TLAEIQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARF 106
            L  IQ+   K +E  D+KVQLA Q YE+VDK+IR+LD DL + 
Sbjct: 61  LLKRIQEELIKAQELQDEKVQLANQAYELVDKHIRRLDKDLEKL 104


>gnl|CDD|227367 COG5034, TNG2, Chromatin remodeling protein, contains PHD zinc
           finger [Chromatin structure and dynamics].
          Length = 271

 Score =  131 bits (331), Expect = 4e-37
 Identities = 69/274 (25%), Positives = 100/274 (36%), Gaps = 25/274 (9%)

Query: 1   MSTSY-LEQYLDSLDSLPIELQRNFTLMRELDSRAQDVMK------TIDRVAEDYLDNMK 53
                 L    D L ++P E    FT + E+D++  D++K      +I +   D      
Sbjct: 1   ADLFPGLNDITDHLANVPSETDIRFTELSEIDAKVCDIIKNLRQMISILKKIIDLDSQTY 60

Query: 54  HYSKDKKKETLAEI-QKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLAR-----FE 107
              +D   + + E+  K     KE    K  LA +  +++ ++ + LD  +A+       
Sbjct: 61  EEVEDGLLKEIRELLLKAIYIQKE----KSDLADRAEKLLRRHRKLLDDRIAKRPHEKVA 116

Query: 108 QEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSS 167
             I+      +        S     SG    AAS       K K +           +S 
Sbjct: 117 ARIENCHDAVSRLERNSYSSAARRSSGEHRSAASSQGSRHTKLKKRKNIHN---LKRRSP 173

Query: 168 NSKKKVAKKITGVGGVVGVLNAIVA---ADPDVAAPSHDVLDMPVDPNEPTYCVCQQVSY 224
               K     T     V      V          +      D      E  YC CQQVSY
Sbjct: 174 ELSSKREVSFTLESPSVPDTATRVKEGNNGGSTKSRGVSSEDNSEG--EELYCFCQQVSY 231

Query: 225 GEMIGCDNPDCPIEWFHFACVSLTTKPKGKWYCP 258
           G+M+ CDN +C  EWFH  CV L   PKGKWYCP
Sbjct: 232 GQMVACDNANCKREWFHLECVGLKEPPKGKWYCP 265


>gnl|CDD|201356 pfam00628, PHD, PHD-finger.  PHD folds into an interleaved type of
           Zn-finger chelating 2 Zn ions in a similar manner to
           that of the RING and FYVE domains. Several PHD fingers
           have been identified as binding modules of methylated
           histone H3.
          Length = 51

 Score = 52.9 bits (127), Expect = 9e-10
 Identities = 21/50 (42%), Positives = 28/50 (56%), Gaps = 9/50 (18%)

Query: 216 YC-VCQQV-SYGEMIGCDNPDCPIEWFHFACVSLT----TKPKGKWYCPK 259
           YC VC +V   GE++ CD   C   WFH AC+         P+G+WYCP+
Sbjct: 1   YCAVCGKVDDDGELLLCDG--CD-RWFHLACLGPPLEPEEIPEGEWYCPE 47


>gnl|CDD|214584 smart00249, PHD, PHD zinc finger.  The plant homeodomain (PHD)
           finger is a C4HC3 zinc-finger-like motif found in
           nuclear proteins thought to be involved in epigenetics
           and chromatin-mediated transcriptional regulation. The
           PHD finger binds two zinc ions using the so-called
           'cross-brace' motif and is thus structurally related to
           the RING finger and the FYVE finger. It is not yet known
           if PHD fingers have a common molecular function. Several
           reports suggest that it can function as a
           protein-protein interaction domain and it was recently
           demonstrated that the PHD finger of p300 can cooperate
           with the adjacent BROMO domain in nucleosome binding in
           vitro. Other reports suggesting that the PHD finger is a
           ubiquitin ligase have been refuted as these domains were
           RING fingers misidentified as PHD fingers.
          Length = 47

 Score = 47.2 bits (112), Expect = 9e-08
 Identities = 20/49 (40%), Positives = 25/49 (51%), Gaps = 8/49 (16%)

Query: 216 YC-VCQQV-SYGEMIGCDNPDCPIEWFHFACVSLTTK---PKGKWYCPK 259
           YC VC +    GE++ CD   C   W+H  C+        P GKWYCPK
Sbjct: 1   YCSVCGKPDDGGELLQCDG--CD-RWYHQTCLGPPLLEEEPDGKWYCPK 46


>gnl|CDD|235307 PRK04537, PRK04537, ATP-dependent RNA helicase RhlB; Provisional.
          Length = 572

 Score = 37.6 bits (87), Expect = 0.006
 Identities = 23/87 (26%), Positives = 33/87 (37%), Gaps = 3/87 (3%)

Query: 102 DLARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGG---AASKSKRGRKKAKDKAESAT 158
            + R  +E +    +   GG  G G G+ SGS   GG    A    + R + K + E   
Sbjct: 416 TIFREAREQRAAEEQRRGGGRSGPGGGSRSGSVGGGGRRDGAGADGKPRPRRKPRVEGEA 475

Query: 159 DAAGDDKSSNSKKKVAKKITGVGGVVG 185
           DAA     +      A +  GV    G
Sbjct: 476 DAAAAGAETPVVAAAAAQAPGVVAADG 502


>gnl|CDD|233430 TIGR01477, RIFIN, variant surface antigen, rifin family.  This
           model represents the rifin branch of the rifin/stevor
           family (pfam02009) of predicted variant surface antigens
           as found in Plasmodium falciparum. This model is based
           on a set of rifin sequences kindly provided by Matt
           Berriman from the Sanger Center. This is a global model
           and assesses a penalty for incomplete sequence.
           Additional fragmentary sequences may be found with the
           fragment model and a cutoff of 20 bits.
          Length = 353

 Score = 35.9 bits (83), Expect = 0.014
 Identities = 44/170 (25%), Positives = 68/170 (40%), Gaps = 14/170 (8%)

Query: 31  DSRAQDVMKTIDRVA----EDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKV--QL 84
           D   + VM+  DR      E+Y + M+   +  K++   EIQK   K      DK+  +L
Sbjct: 57  DPEMKSVMEQFDRQTSQRFEEYDERMQEKRQKCKEQCDKEIQKIILK------DKLEKEL 110

Query: 85  AIQTYEMVDKYIRKLDTDLARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSK 144
             + +  +   I+         E+ + +K  K       G G G   G G  GG A    
Sbjct: 111 TEK-FSTLQTDIQTDAIPTCVCEKSLADKVEKGCLRCGCGLGGGVAPGVGLLGGIAVNVV 169

Query: 145 RGRKKAKDKAESATDAAGDDKSSNSKKKVAKKITGVGGVVGV-LNAIVAA 193
             ++ A   A +A  AAG     N   +  KKI G+  + G  L  I+ A
Sbjct: 170 GWKQAALAAAAAAAIAAGIKAGINVVIEGLKKILGLSTLFGKKLKQIINA 219


>gnl|CDD|233261 TIGR01073, pcrA, ATP-dependent DNA helicase PcrA.  Designed to
           identify pcrA members of the uvrD/rep subfamily [DNA
           metabolism, DNA replication, recombination, and repair].
          Length = 726

 Score = 35.5 bits (82), Expect = 0.023
 Identities = 25/98 (25%), Positives = 36/98 (36%), Gaps = 11/98 (11%)

Query: 104 ARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGD 163
           +RF  EI  + L+  + G   G +             S  + G  +      S   A GD
Sbjct: 626 SRFLNEIPAELLETASTGRRTGATDPK--------GPSIRQAGASR---PTTSQPTAGGD 674

Query: 164 DKSSNSKKKVAKKITGVGGVVGVLNAIVAADPDVAAPS 201
             S     +V  K  G+G VV V       + D+A PS
Sbjct: 675 TLSWAVGDRVNHKKWGIGTVVSVKGGGDDQELDIAFPS 712


>gnl|CDD|192936 pfam12095, DUF3571, Protein of unknown function (DUF3571).  This
          family of proteins is functionally uncharacterized.
          This protein is found in bacteria and eukaryotes.
          Proteins in this family are typically between 85 and 97
          amino acids in length.
          Length = 83

 Score = 32.2 bits (74), Expect = 0.044
 Identities = 13/36 (36%), Positives = 24/36 (66%), Gaps = 4/36 (11%)

Query: 5  YLEQYLDSLDSLPIELQRNFTLMRELDSRAQDVMKT 40
          +L+++L  LDSLP +L +  +    LD++AQ ++ T
Sbjct: 32 WLKEWLTRLDSLPADLAKLPS----LDAQAQRLLDT 63


>gnl|CDD|238103 cd00176, SPEC, Spectrin repeats, found in several proteins involved
           in cytoskeletal structure; family members include
           spectrin, alpha-actinin and dystrophin; the spectrin
           repeat forms a three helix bundle with the second helix
           interrupted by proline in some sequences; the repeats
           are independent folding units; tandem repeats are found
           in differing numbers and arrange in an antiparallel
           manner to form dimers; the repeats are defined by a
           characteristic tryptophan (W) residue in helix A and a
           leucine (L) at the carboxyl end of helix C and separated
           by a linker of 5 residues; two copies of the repeat are
           present here.
          Length = 213

 Score = 33.6 bits (77), Expect = 0.057
 Identities = 15/70 (21%), Positives = 39/70 (55%), Gaps = 1/70 (1%)

Query: 7   EQYLDSLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAE 66
           E     L+S+   L+++  L  EL++  +  +K+++ +AE+ L+     + ++ +E L E
Sbjct: 132 EDLGKDLESVEELLKKHKELEEELEAH-EPRLKSLNELAEELLEEGHPDADEEIEEKLEE 190

Query: 67  IQKYFDKTKE 76
           + + +++  E
Sbjct: 191 LNERWEELLE 200


>gnl|CDD|164795 PHA00370, III, attachment protein.
          Length = 297

 Score = 33.4 bits (76), Expect = 0.092
 Identities = 25/101 (24%), Positives = 37/101 (36%), Gaps = 27/101 (26%)

Query: 117 NTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAK-----------------------DK 153
           NT GG+GGG +G   G GS GG +     G+   K                       D 
Sbjct: 104 NTGGGSGGGDTGGSGGGGSDGGGSEGGSTGKSLTKEGVGAGDFDYPKMANANKDALTEDN 163

Query: 154 AESATDAAGDDKSSNSKKKVAKKITG----VGGVVGVLNAI 190
            ++A     D++   +   V+  I+G    VGG+V      
Sbjct: 164 DQNALQKDADEQLDKASASVSDAISGFMRGVGGLVDNGGGE 204



 Score = 28.0 bits (62), Expect = 5.8
 Identities = 13/32 (40%), Positives = 16/32 (50%), Gaps = 2/32 (6%)

Query: 117 NTTGGAGGGGSGTG--SGSGSAGGAASKSKRG 146
              GG G GGS TG  +G G+ GG +     G
Sbjct: 84  GDGGGTGEGGSDTGGDTGGGNTGGGSGGGDTG 115



 Score = 27.6 bits (61), Expect = 7.7
 Identities = 13/29 (44%), Positives = 15/29 (51%)

Query: 118 TTGGAGGGGSGTGSGSGSAGGAASKSKRG 146
           T G  GGG +G GSG G  GG+      G
Sbjct: 96  TGGDTGGGNTGGGSGGGDTGGSGGGGSDG 124


>gnl|CDD|215565 PLN03083, PLN03083, E3 UFM1-protein ligase 1 homolog; Provisional.
          Length = 803

 Score = 33.3 bits (76), Expect = 0.13
 Identities = 22/106 (20%), Positives = 44/106 (41%), Gaps = 13/106 (12%)

Query: 79  DDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGG 138
           D K  +  ++  + + +I+ +     + E+E+   +++ ++ G   G S    GS  +  
Sbjct: 356 DSKALILGESCVLSNGFIKGI---YDQIEKEMDAFSIQASSAG-LIGSSEKSLGSNESSP 411

Query: 139 AASKSKRGRKKAKDKAESATDAAGDD---------KSSNSKKKVAK 175
           AAS S +G KK K K+ S      +          K     +K  +
Sbjct: 412 AASNSDKGSKKKKGKSTSTKGGTAESIPDDEEDAPKKGKKNQKKGR 457



 Score = 29.0 bits (65), Expect = 2.6
 Identities = 29/176 (16%), Positives = 53/176 (30%), Gaps = 35/176 (19%)

Query: 57  KDKKKETLAE--------IQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQ 108
           KD K   L E        I+  +D+ ++  D     +IQ          +          
Sbjct: 355 KDSKALILGESCVLSNGFIKGIYDQIEKEMDAF---SIQASSAGLIGSSEKSLG------ 405

Query: 109 EIQEKALKNTTGGAGGGGSGTGSGSGSAGGAAS-----------KSKRGRKKAKDKAESA 157
               ++    +    G     G  + + GG A            K K+ +KK +DK+   
Sbjct: 406 --SNESSPAASNSDKGSKKKKGKSTSTKGGTAESIPDDEEDAPKKGKKNQKKGRDKSSKV 463

Query: 158 TDAAGDDKSSNSK--KKVAKKITGVGGVVGVLNAIVAADPDVAAPSHDVLDMPVDP 211
                D K+   K   K  +    +     V+  I+   PD+     +     +  
Sbjct: 464 P---SDSKAGGKKESVKSQEDNNNIPPEEWVMKKILEWVPDLEEDGTEDPGSILKH 516


>gnl|CDD|220817 pfam10593, Z1, Z1 domain.  This uncharacterized domain was
           identified by Iyer and colleagues. It is found
           associated with a helicase domain of superfamily type
           II.
          Length = 231

 Score = 32.6 bits (75), Expect = 0.14
 Identities = 18/70 (25%), Positives = 38/70 (54%), Gaps = 5/70 (7%)

Query: 32  SRAQDVMKTIDRVAEDYLDNMKHYSK--DKKKETLAEIQKYFDKTKEYGDDKVQLAIQTY 89
           SR  DV K +  + E+YL++++   +  D   ET+AE+++ ++   E G D   L   ++
Sbjct: 34  SRFTDVHKQVADLIEEYLNSLRKAVENGDDLGETIAELKELYEDDFEPGTD---LDKPSW 90

Query: 90  EMVDKYIRKL 99
           + +   + K+
Sbjct: 91  DEIQAALPKV 100


>gnl|CDD|217149 pfam02621, VitK2_biosynth, Menaquinone biosynthesis.  This family
           includes two enzymes which are involved in menaquinone
           biosynthesis. One which catalyzes the conversion of
           cyclic de-hypoxanthine futalosine to
           1,4-dihydroxy-6-naphthoate, and one which may be
           involved in the conversion of chorismate to futalosine.
           These enzymes comprise two domains with alpha/beta
           structures, a large domain and a small domain. A pocket
           between the two domains may form the active site, a
           conserved histidine located within this pocket could be
           the catalytic base.
          Length = 248

 Score = 31.7 bits (73), Expect = 0.24
 Identities = 16/71 (22%), Positives = 28/71 (39%), Gaps = 14/71 (19%)

Query: 55  YSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQTYEM---------VDKYIRKLDTDLAR 105
             +D   E   E+++   K+KEY     +   + Y           ++ Y+ +L  DL  
Sbjct: 175 VRRDLGLELAKELEEALRKSKEYALKHPEEIAE-YAAERAGLDEKFIELYVNELSYDLG- 232

Query: 106 FEQEIQEKALK 116
              E   KAL+
Sbjct: 233 ---EEGRKALR 240


>gnl|CDD|216689 pfam01765, RRF, Ribosome recycling factor.  The ribosome recycling
           factor (RRF / ribosome release factor) dissociates the
           ribosome from the mRNA after termination of translation,
           and is essential for bacterial growth. Thus ribosomes are
           "recycled" and ready for another round of protein
           synthesis.
          Length = 165

 Score = 30.9 bits (71), Expect = 0.39
 Identities = 20/76 (26%), Positives = 32/76 (42%), Gaps = 26/76 (34%)

Query: 35  QDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQTYEMVDK 94
           +D    + ++ +D     K  S+D+ K    EIQK                     + DK
Sbjct: 115 RDANDKLKKLEKD-----KEISEDEVKRAEKEIQK---------------------LTDK 148

Query: 95  YIRKLDTDLARFEQEI 110
           YI+K+D  L + E+EI
Sbjct: 149 YIKKIDELLKKKEKEI 164


>gnl|CDD|216868 pfam02084, Bindin, Bindin. 
          Length = 239

 Score = 31.4 bits (71), Expect = 0.41
 Identities = 27/95 (28%), Positives = 39/95 (41%), Gaps = 10/95 (10%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKKITG 179
           GG GG G+G G+  G  GG    S  G     + A  A DA  +    +S        T 
Sbjct: 41  GGGGGPGAGGGAPGGPVGGGGGGSG-GPPGGGEVAGEAEDAMSEFDDYSSSSIEEGDTTI 99

Query: 180 VGGVVGVLNAIVAADPDVAAPSHDVLDMPVDPNEP 214
              V+  + A++ A           +D+PVD N+P
Sbjct: 100 SADVMEKIKAVLGATK---------IDLPVDINDP 125


>gnl|CDD|218083 pfam04426, Bul1_C, Bul1 C terminus.  This family contains the C
           terminus of Saccharomyces cerevisiae Bul1. Bul1 binds
           the ubiquitin ligase Rsp5, via an N terminal PPSY motif.
           The complex containing Bul1 and Rsp5 is involved in
           intracellular trafficking of the general amino acid
           permease Gap1, degradation of Rog1 in cooperation with
           Bul2 and GSK-3, and mitochondrial inheritance. Bul1 may
           contain HEAT repeats.
          Length = 226

 Score = 30.4 bits (69), Expect = 0.63
 Identities = 16/65 (24%), Positives = 32/65 (49%), Gaps = 8/65 (12%)

Query: 12  SLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYF 71
           S +S+PI+L     LM +      + +K I +  +D+L  +K Y K K  E   ++ + +
Sbjct: 58  SDNSIPIKLNSEL-LMNK------EKLKNIKKTFKDFLKKIKEY-KKKFNENKEKLNELY 109

Query: 72  DKTKE 76
           +  + 
Sbjct: 110 NLNRT 114


>gnl|CDD|221517 pfam12300, DUF3628, Protein of unknown function (DUF3628).  This
           domain family is found in bacteria, and is typically
           between 153 and 183 amino acids in length. The family is
           found in association with pfam00270, pfam00271.
          Length = 180

 Score = 30.4 bits (68), Expect = 0.64
 Identities = 19/68 (27%), Positives = 25/68 (36%), Gaps = 5/68 (7%)

Query: 123 GGGGSGTGSGSGSAGGAASKSKRG-----RKKAKDKAESATDAAGDDKSSNSKKKVAKKI 177
           GGG SG G  SG   G   +   G     R   K + E   D+A D   +      A + 
Sbjct: 57  GGGRSGPGERSGRPVGKGPRDGDGGGGRRRGPRKPRLEGERDSAADGAVTPVFGASAPRP 116

Query: 178 TGVGGVVG 185
            G+    G
Sbjct: 117 PGIVAAPG 124



 Score = 27.7 bits (61), Expect = 4.9
 Identities = 15/59 (25%), Positives = 24/59 (40%), Gaps = 1/59 (1%)

Query: 104 ARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAG 162
            R  QE +    ++  G   G   G G   G  GG   +  R + + + + +SA D A 
Sbjct: 48  VRAAQEWRRGGGRSGPGERSGRPVGKGPRDGDGGGGRRRGPR-KPRLEGERDSAADGAV 105


>gnl|CDD|223311 COG0233, Frr, Ribosome recycling factor [Translation, ribosomal
           structure and biogenesis].
          Length = 187

 Score = 30.2 bits (69), Expect = 0.66
 Identities = 22/81 (27%), Positives = 41/81 (50%), Gaps = 10/81 (12%)

Query: 33  RAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDD---KVQLAIQTY 89
           R ++++K   + AE+    +++  +D   +      K  +K KE  +D   K +  IQ  
Sbjct: 111 RRKELVKVAKKYAEEAKVAVRNIRRDANDKI-----KKLEKDKEISEDEVKKAEEEIQK- 164

Query: 90  EMVDKYIRKLDTDLARFEQEI 110
            + D+YI+K+D  L   E+EI
Sbjct: 165 -LTDEYIKKIDELLKDKEKEI 184


>gnl|CDD|239131 cd02666, Peptidase_C19J, A subfamily of Peptidase C19. Peptidase
           C19 contains ubiquitinyl hydrolases. They are
           intracellular peptidases that remove ubiquitin molecules
           from polyubiquitinated peptides by cleavage of isopeptide
           bonds. They hydrolyze bonds involving the carboxyl group
           of the C-terminal Gly residue of ubiquitin. The purpose
           of the de-ubiquitination is thought to be editing of the
           ubiquitin conjugates, which could rescue them from
           degradation, as well as recycling of the ubiquitin. The
           ubiquitin/proteasome system is responsible for most
           protein turnover in the mammalian cell, and with over 50
           members, family C19 is one of the largest families of
           peptidases in the human genome.
          Length = 343

 Score = 30.5 bits (69), Expect = 0.71
 Identities = 21/86 (24%), Positives = 34/86 (39%), Gaps = 13/86 (15%)

Query: 6   LEQY--LDSLDSLPIELQ-----RNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSK- 57
           L++Y   DSL  LP   Q             +     ++  +ID + E   + ++  S  
Sbjct: 195 LDRYFDYDSLTKLPQRSQVQAQLAQPLQRELISMDRYELPSSIDDIDELIREAIQSESSL 254

Query: 58  -----DKKKETLAEIQKYFDKTKEYG 78
                ++  E   EI+K FD  K YG
Sbjct: 255 VRQAQNELAELKHEIEKQFDDLKSYG 280


>gnl|CDD|213033 cd11725, ADDz_Dnmt3, ADDz domain of DNA (cytosine-5)
           methyltransferases (C5-MTases) 3 (Dnmt3).  Dnmt3 is a de
           novo DNA methyltransferase family that includes two
           active enzymes Dnmt3a and -3b and one regulatory factor
           Dnmt3l. The ADDz domain of Dnmt3 is located in the
           C-terminal region of Dnmt3, which is an active catalytic
           domain in Dnmt3a and -b, but lacks some residues for
           enzymatic activity in Dnmt3l. DNA methylation is an
           important epigenetic mechanism involved in diverse
           biological processes such as embryonic development, gene
           expression, and genomic imprinting. The ADDz_Dnmt3
           domain is a PHD-like zinc finger motif that contains two
           parts, a C2-C2 and a PHD-like zinc finger. PHD zinc
           finger domains have been identified in more than 40
           proteins that are mainly involved in chromatin mediated
           transcriptional control; the classical PHD zinc finger
           has a C4-H-C3 motif that spans about 50-80 amino acids.
           In ADDz, the conserved histidine residue of the PHD
           finger is replaced by a cysteine, and an additional zinc
           finger C2-C2 like motif is located about twenty residues
           upstream of the C4-C-C3 motif.
          Length = 126

 Score = 29.5 bits (67), Expect = 0.81
 Identities = 14/51 (27%), Positives = 19/51 (37%), Gaps = 12/51 (23%)

Query: 216 YC-VCQQVSYGEMIGCDNPDCPIEWFHFACV--------SLTTKPKGKWYC 257
           YC +C     GE+I CDN  C    +  AC+              +  W C
Sbjct: 49  YCTIC--CGGGEVILCDNESCC-RVYCTACLDILVGPGTYDKVLLEDPWSC 96


>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
          Length = 1021

 Score = 30.5 bits (68), Expect = 0.94
 Identities = 18/50 (36%), Positives = 25/50 (50%), Gaps = 7/50 (14%)

Query: 105 RFEQEIQEKALKNT-------TGGAGGGGSGTGSGSGSAGGAASKSKRGR 147
           R E++  EKA +N+        G + GGG G G G G+  GA   +  GR
Sbjct: 514 RLERDRLEKARRNSYFLKGMENGLSAGGGPGDGPGVGAGVGAGVGTSDGR 563


>gnl|CDD|236722 PRK10590, PRK10590, ATP-dependent RNA helicase RhlE; Provisional.
          Length = 456

 Score = 30.2 bits (68), Expect = 1.1
 Identities = 16/59 (27%), Positives = 26/59 (44%), Gaps = 1/59 (1%)

Query: 110 IQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSK-RGRKKAKDKAESATDAAGDDKSS 167
           I+ + ++N     GGGG G G G G   G   + +   +  +   AE  +   GD K +
Sbjct: 382 IKAEPIQNGRQQRGGGGRGQGGGRGQQQGQPRRGEGGAKSASAKPAEKPSRRLGDAKPA 440



 Score = 29.0 bits (65), Expect = 3.0
 Identities = 10/55 (18%), Positives = 16/55 (29%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVA 174
            G GGG            G A  +     +   +       AG+ +     +K A
Sbjct: 399 RGQGGGRGQQQGQPRRGEGGAKSASAKPAEKPSRRLGDAKPAGEQQRRRRPRKPA 453


>gnl|CDD|128778 smart00502, BBC, B-Box C-terminal domain.  Coiled coil region
          C-terminal to (some) B-Box domains.
          Length = 127

 Score = 29.2 bits (66), Expect = 1.1
 Identities = 11/57 (19%), Positives = 23/57 (40%), Gaps = 12/57 (21%)

Query: 3  TSYLEQYLDSLDSLPIELQRN------------FTLMRELDSRAQDVMKTIDRVAED 47
           + LE  L  L S+  E++ N              L   L+ R + +++ ++   E+
Sbjct: 16 AAELEDALKQLISIIQEVEENAADVEAQIKAAFDELRNALNKRKKQLLEDLEEQKEN 72


>gnl|CDD|218169 pfam04604, L_biotic_typeA, Type-A lantibiotic.  Lantibiotics are
           antibiotic peptides distinguished by the presence of the
           rare thioether amino acids lanthionine and/or
           methyl-lanthionine. They are produced by Gram-positive
           bacteria as gene-encoded precursor peptides and undergo
           post-translational modification to generate the mature
           peptide. Based on their structural and functional
           features lantibiotics are currently divided into two
           major groups: the flexible amphiphilic type-A and the
           rather rigid and globular type-B. Type-A lantibiotics
           act primarily by pore formation in the bacterial
           membrane by a mechanism involving the interaction with
           specific docking molecules such as the membrane
           precursor lipid II.
          Length = 50

 Score = 27.4 bits (61), Expect = 1.3
 Identities = 10/29 (34%), Positives = 13/29 (44%)

Query: 101 TDLARFEQEIQEKALKNTTGGAGGGGSGT 129
           T+     QE+ E+ L    GG G G   T
Sbjct: 5   TEALNSLQEVSEEELDQILGGKGSGVIKT 33


>gnl|CDD|237034 PRK12278, PRK12278, 50S ribosomal protein L21/unknown domain fusion
           protein; Provisional.
          Length = 221

 Score = 29.4 bits (66), Expect = 1.4
 Identities = 19/78 (24%), Positives = 31/78 (39%), Gaps = 4/78 (5%)

Query: 128 GTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKKITGVGGVVG-- 185
           G G    +A  A +K+K+        A +A  A     ++ +      KITGVG  +   
Sbjct: 116 GAGKVEVAAEAAPAKAKKEAAPKAAPAPAAAAAP--PAAAAAGADDLTKITGVGPALAKK 173

Query: 186 VLNAIVAADPDVAAPSHD 203
           +  A V     +AA +  
Sbjct: 174 LNEAGVTTFAQIAALTDA 191


>gnl|CDD|213032 cd11672, ADDz, ADDz for ATRX, Dnmt3 and Dnmt3l PHD-like zinc finger
           domain.  The ADDz zinc finger domain is present in the
           chromatin-associated proteins
           cytosine-5-methyltransferase 3 (Dnmt3) and ATRX, a SNF2
           type transcription factor protein. The Dnmt3 family
           includes two active DNA methyltransferases, Dnmt3a and
           -3b, and one regulatory factor Dnmt3l. DNA methylation
           is an important epigenetic mechanism involved in diverse
           biological processes such as embryonic development, gene
           expression, and genomic imprinting. The ADDz domain is a
           PHD-like zinc finger motif that contains two parts, a
           C2-C2 and a PHD-like zinc finger. PHD zinc finger
           domains have been identified in more than 40 proteins
           that are mainly involved in chromatin mediated
           transcriptional control; the classical PHD zinc finger
           has a C4-H-C3 motif that spans about 50-80 amino acids.
           In ADDz, the conserved histidine residue of the PHD
           finger is replaced by a cysteine, and an additional zinc
           finger C2-C2 like motif is located about twenty residues
           upstream of the C4-C-C3 motif.
          Length = 120

 Score = 28.8 bits (65), Expect = 1.4
 Identities = 14/51 (27%), Positives = 19/51 (37%), Gaps = 12/51 (23%)

Query: 216 YC-VCQQVSYGEMIGCDNPDCPIEWFHFACV--------SLTTKPKGKWYC 257
           YC +C     GE+I CDN  C    +  AC+              +  W C
Sbjct: 49  YCTIC--CGGGEVILCDNESCC-RVYCTACLDFLVGPGTYDKVLDEDPWSC 96


>gnl|CDD|221581 pfam12446, DUF3682, Protein of unknown function (DUF3682).  This
           domain family is found in eukaryotes, and is typically
           between 125 and 136 amino acids in length.
          Length = 133

 Score = 28.6 bits (64), Expect = 1.5
 Identities = 14/49 (28%), Positives = 20/49 (40%)

Query: 121 GAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNS 169
           G GG  SG+ + +  AG     +      A     SA  + G+  SS S
Sbjct: 6   GTGGVSSGSSAPAPPAGPGPGPNAPPAPAAPGVDSSAGSSGGEAGSSGS 54


>gnl|CDD|107029 PHA01351, PHA01351, putative minor structural protein.
          Length = 1070

 Score = 29.9 bits (67), Expect = 1.5
 Identities = 20/85 (23%), Positives = 36/85 (42%), Gaps = 17/85 (20%)

Query: 16   LPIELQR---NFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFD 72
            +P ELQ     +   R +     +++ TI+ + E            K K  L   Q Y  
Sbjct: 998  IPQELQNTYFEYARNRRVSRYVNEIITTINLLFE------------KHKIDLDTAQSYLQ 1045

Query: 73   KTKEYG--DDKVQLAIQTYEMVDKY 95
            + K+YG  D+++QL    +++   Y
Sbjct: 1046 QLKKYGLTDEEIQLIKLNWQLRSAY 1070


>gnl|CDD|226642 COG4174, COG4174, ABC-type uncharacterized transport system,
           permease component [General function prediction only].
          Length = 364

 Score = 29.7 bits (67), Expect = 1.6
 Identities = 13/45 (28%), Positives = 17/45 (37%), Gaps = 5/45 (11%)

Query: 107 EQEIQEKALKNTTGGA---GGGGSGTGSGSGSAGGAASKSKRGRK 148
           EQ I +  L+    G    GGGG       G  G  +    RG +
Sbjct: 36  EQAIAK--LEGGQSGLDRLGGGGVDASGAGGGVGNISDSQYRGAQ 78


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 30.1 bits (68), Expect = 1.6
 Identities = 22/116 (18%), Positives = 58/116 (50%), Gaps = 3/116 (2%)

Query: 6   LEQYLDSLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYS--KDKKKET 63
           LE+    LD+L  EL+        L+   +++ + I+ + E   +  +     + + +E 
Sbjct: 802 LEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEEL 861

Query: 64  LAEIQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQEKALKNTT 119
             E+++   + +E  D+  +L  +  E +++ +R+L+++LA  ++EI++   +   
Sbjct: 862 KEELEELEAEKEELEDELKELE-EEKEELEEELRELESELAELKEEIEKLRERLEE 916


>gnl|CDD|233237 TIGR01026, fliI_yscN, ATPase FliI/YscN family.  This family of
           ATPases demonstrates extensive homology with ATP
           synthase F1, beta subunit. It is a mixture of members
           with two different protein functions. The first group is
           exemplified by Salmonella typhimurium FliI protein. It
           is needed for flagellar assembly, its ATPase activity is
           required for flagellation, and it may be involved in a
           specialized protein export pathway that proceeds without
           signal peptide cleavage. The second group of proteins
           function in the export of virulence proteins;
           exemplified by Yersinia sp. YscN protein, an ATPase
           involved in the type III secretory pathway for the
           antihost Yops proteins [Energy metabolism, ATP-proton
           motive force interconversion].
          Length = 440

 Score = 29.6 bits (67), Expect = 1.6
 Identities = 22/96 (22%), Positives = 45/96 (46%), Gaps = 9/96 (9%)

Query: 26  LMRELDSR----AQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFD--KTKEY-- 77
           L R L  R    A DV+ +I R+    +      +  K +E L++ +   D  +   Y  
Sbjct: 339 LSRALAQRGHYPAIDVLASISRLMTAIVSEEHRRAARKFRELLSKYKDNEDLIRIGAYQR 398

Query: 78  GDDK-VQLAIQTYEMVDKYIRKLDTDLARFEQEIQE 112
           G D+ +  AI  Y  +++++++   +   FE+ +Q+
Sbjct: 399 GSDRELDFAIAKYPKLERFLKQGINEKVNFEESLQQ 434


>gnl|CDD|131611 TIGR02560, HrpB4, type III secretion protein HrpB4.  This family of
           genes is always found in type III secretion operons in
           a limited number of species including Burkholderia,
           Xanthomonas and Ralstonia.
          Length = 210

 Score = 29.1 bits (65), Expect = 1.6
 Identities = 17/48 (35%), Positives = 20/48 (41%)

Query: 2   STSYLEQYLDSLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYL 49
           S S L +  + L  LP  L      MR L SR   V + IDR     L
Sbjct: 65  SLSALLERANRLAVLPPALLLRVLRMRALFSRRTAVRRCIDRARLSRL 112


>gnl|CDD|227594 COG5269, ZUO1, Ribosome-associated chaperone zuotin [Translation,
           ribosomal structure and biogenesis / Posttranslational
           modification, protein turnover, chaperones].
          Length = 379

 Score = 29.6 bits (66), Expect = 1.9
 Identities = 41/175 (23%), Positives = 73/175 (41%), Gaps = 20/175 (11%)

Query: 36  DVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQTYEMVDKY 95
           D  +T + + EDY D+M+   +D+K+ + A+ ++   K K   + +++  +Q  +  D  
Sbjct: 174 DSWRTFEPLDEDYPDDME--ERDRKRYSEAKNREKRAKLKNQDNARLKRLVQIAKKRDPR 231

Query: 96  IRKLDTDLARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAE 155
           I+         EQE + K ++     AG            A   A  + +G+ +AK+KAE
Sbjct: 232 IKSFK------EQEKEMKKIRKWEREAG------------ARLKALAALKGKAEAKNKAE 273

Query: 156 SATDAAGDDKSSNSKKKVAKKITGVGGVVGVLNAIVAADPDVAAPSHDVLDMPVD 210
              +A     +   K K   K         + NA   AD    A   + +D  VD
Sbjct: 274 IEAEALASATAVKKKAKEVMKKALKMEKKAIKNAAKDADYFGDADKAEHIDEDVD 328


>gnl|CDD|221084 pfam11336, DUF3138, Protein of unknown function (DUF3138).  This
           family of proteins with unknown function appears to be
           restricted to Proteobacteria.
          Length = 514

 Score = 29.5 bits (66), Expect = 2.0
 Identities = 13/50 (26%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 96  IRKLDTDLARFEQEIQE--KALKNTTGGAGGGGSGTGSGSGSAGGAASKS 143
           I+ L   L   +Q++ E   AL      AGGG     + + +A   +S +
Sbjct: 27  IKALQAQLTALQQQVNELRAALAAKPAAAGGGAKIQSAAAAAAAAPSSDA 76


>gnl|CDD|236092 PRK07772, PRK07772, single-stranded DNA-binding protein;
           Provisional.
          Length = 186

 Score = 28.8 bits (65), Expect = 2.1
 Identities = 12/25 (48%), Positives = 14/25 (56%)

Query: 117 NTTGGAGGGGSGTGSGSGSAGGAAS 141
              GG GGGG G+G G G  GG  +
Sbjct: 128 GGGGGFGGGGGGSGGGGGGGGGGGA 152



 Score = 28.5 bits (64), Expect = 2.6
 Identities = 15/48 (31%), Positives = 17/48 (35%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSS 167
           GG GGG  G G G G  G       +    A D   SA  + G     
Sbjct: 134 GGGGGGSGGGGGGGGGGGAPGGGGAQASAPADDPWSSAPASGGFGGGD 181



 Score = 28.5 bits (64), Expect = 2.9
 Identities = 17/45 (37%), Positives = 21/45 (46%), Gaps = 2/45 (4%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDD 164
           GG GGGG G G G G +GG       G       A+++  A  DD
Sbjct: 124 GGGGGGGGGFGGGGGGSGGGGGGGGGGGAPGGGGAQAS--APADD 166


>gnl|CDD|211826 TIGR03497, FliI_clade2, flagellar protein export ATPase FliI.
           Members of this protein family are the FliI protein of
           bacterial flagellum systems. This protein acts to drive
           protein export for flagellar biosynthesis. The most
           closely related family is the YscN family of bacterial
           type III secretion systems. This model represents one
           (of three) segment of the FliI family tree. These have
           been modeled separately in order to exclude the type III
           secretion ATPases more effectively [Cellular processes,
           Chemotaxis and motility].
          Length = 413

 Score = 29.2 bits (66), Expect = 2.1
 Identities = 21/101 (20%), Positives = 43/101 (42%), Gaps = 17/101 (16%)

Query: 24  FTLMRELDSRAQ----DVMKTIDRVAEDYLDN---------MKHYSKDKKKETLAEIQKY 70
             L REL ++      DV+ ++ RV  + +            +  +  K+ E L  I  Y
Sbjct: 311 IVLSRELAAKNHYPAIDVLASVSRVMNEIVSEEHKELAGKLRELLAVYKEAEDLINIGAY 370

Query: 71  FDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQ 111
               K   + K+  AI+  E ++ ++++   +   FE+ +Q
Sbjct: 371 ----KRGSNPKIDEAIRYIEKINSFLKQGIDEKFTFEETVQ 407


>gnl|CDD|237537 PRK13875, PRK13875, conjugal transfer protein TrbL; Provisional.
          Length = 440

 Score = 29.1 bits (66), Expect = 2.2
 Identities = 8/43 (18%), Positives = 11/43 (25%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAG 162
            G   G  G      SA  +  +    R     K+     A  
Sbjct: 335 AGVAAGLGGVARAGASAAASPLRRAASRAAESMKSSFRAGARS 377



 Score = 29.1 bits (66), Expect = 2.6
 Identities = 13/44 (29%), Positives = 19/44 (43%), Gaps = 1/44 (2%)

Query: 120 GGAGGGGSGTGS-GSGSAGGAASKSKRGRKKAKDKAESATDAAG 162
            GA G  +G G      A  AAS  +R   +A +  +S+  A  
Sbjct: 332 SGAAGVAAGLGGVARAGASAAASPLRRAASRAAESMKSSFRAGA 375



 Score = 27.6 bits (62), Expect = 7.4
 Identities = 23/83 (27%), Positives = 30/83 (36%), Gaps = 5/83 (6%)

Query: 118 TTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKKI 177
           T   AGG      +G+G A G  + +  G   A   A     AAG   S+ S        
Sbjct: 277 TGLAAGGAAVAAAAGAGLAAGGGAAAAGGAAAA---ARGGAAAAGGASSAYSAGAAGG-- 331

Query: 178 TGVGGVVGVLNAIVAADPDVAAP 200
           +G  GV   L  +  A    AA 
Sbjct: 332 SGAAGVAAGLGGVARAGASAAAS 354


>gnl|CDD|178850 PRK00083, frr, ribosome recycling factor; Reviewed.
          Length = 185

 Score = 28.5 bits (65), Expect = 2.3
 Identities = 21/76 (27%), Positives = 32/76 (42%), Gaps = 26/76 (34%)

Query: 35  QDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQTYEMVDK 94
           +D    + ++ +D     K  S+D+ K    EIQK                     + DK
Sbjct: 133 RDANDKLKKLEKD-----KEISEDELKRAEDEIQK---------------------LTDK 166

Query: 95  YIRKLDTDLARFEQEI 110
           YI+K+D  LA  E+EI
Sbjct: 167 YIKKIDELLAAKEKEI 182


>gnl|CDD|182507 PRK10510, PRK10510, putative outer membrane lipoprotein;
           Provisional.
          Length = 219

 Score = 28.7 bits (64), Expect = 2.3
 Identities = 14/29 (48%), Positives = 17/29 (58%)

Query: 121 GAGGGGSGTGSGSGSAGGAASKSKRGRKK 149
           G  G G+G GS  G+  GA S SK+ R K
Sbjct: 33  GKSGIGAGIGSLVGAGIGALSSSKKDRGK 61


>gnl|CDD|100109 cd05831, Ribosomal_P1, Ribosomal protein P1. This subfamily
           represents the eukaryotic large ribosomal protein P1.
           Eukaryotic P1 and P2 are functionally equivalent to the
           bacterial protein L7/L12, but are not homologous to
           L7/L12. P1 is located in the L12 stalk, with proteins
           P2, P0, L11, and 28S rRNA. P1 and P2 are the only
           proteins in the ribosome to occur as multimers, always
           appearing as sets of heterodimers. Recent data indicate
           that eukaryotes have four copies (two heterodimers),
           while most archaeal species contain six copies of L12p
           (three homodimers) and bacteria may have four or six
           copies (two or three homodimers), depending on the
           species. Experiments using S. cerevisiae P1 and P2
           indicate that P1 proteins are positioned more internally
           with limited reactivity in the C-terminal domains, while
           P2 proteins seem to be more externally located and are
           more likely to interact with other cellular components.
           In lower eukaryotes, P1 and P2 are further subdivided
           into P1A, P1B, P2A, and P2B, which form P1A/P2B and
           P1B/P2A heterodimers. Some plant species have a third
           P-protein, called P3, which is not homologous to P1 and
           P2. In humans, P1 and P2 are strongly autoimmunogenic.
           They play a significant role in the etiology and
           pathogenesis of systemic lupus erythematosus (SLE). In
           addition, the ribosome-inactivating protein
           trichosanthin (TCS) interacts with human P0, P1, and P2,
           with its primary binding site located in the C-terminal
           region of P2. TCS inactivates the ribosome by
           depurinating a specific adenine in the sarcin-ricin loop
           of 28S rRNA.
          Length = 103

 Score = 27.7 bits (62), Expect = 2.4
 Identities = 17/55 (30%), Positives = 30/55 (54%), Gaps = 7/55 (12%)

Query: 112 EKALKN-------TTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATD 159
            KAL+        +  G GGGG+   + + +A  AA+++K+  KK +++ ES  D
Sbjct: 43  AKALEGKDIKDLLSNVGGGGGGAAPAAAAAAAAAAAAEAKKEEKKEEEEEESDDD 97



 Score = 26.5 bits (59), Expect = 5.9
 Identities = 14/54 (25%), Positives = 30/54 (55%)

Query: 106 FEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATD 159
           F + ++ K +K+     GGGG G    + +A  AA+ ++  +++ K++ E  +D
Sbjct: 42  FAKALEGKDIKDLLSNVGGGGGGAAPAAAAAAAAAAAAEAKKEEKKEEEEEESD 95


>gnl|CDD|218967 pfam06273, eIF-4B, Plant specific eukaryotic initiation factor 4B. 
           This family consists of several plant specific
           eukaryotic initiation factor 4B proteins.
          Length = 496

 Score = 28.9 bits (64), Expect = 2.6
 Identities = 15/46 (32%), Positives = 19/46 (41%), Gaps = 1/46 (2%)

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKSKR-GRKKAKDKAESATDAAGD 163
            GG GGGG    SG       A  S R GRKK +    +  +   +
Sbjct: 189 GGGGGGGGGERRSGGFRDSPGADDSDRWGRKKVETFGSAFGENGEE 234


>gnl|CDD|180777 PRK06958, PRK06958, single-stranded DNA-binding protein;
           Provisional.
          Length = 182

 Score = 28.6 bits (64), Expect = 2.6
 Identities = 11/29 (37%), Positives = 13/29 (44%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRK 148
           GG+GGGG G   G    GG       G +
Sbjct: 112 GGSGGGGGGGDEGGYGGGGGGGGGGYGGE 140



 Score = 27.8 bits (62), Expect = 4.8
 Identities = 11/25 (44%), Positives = 14/25 (56%)

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKS 143
             G GGG +  G G G+ GGA+  S
Sbjct: 142 RSGGGGGRASGGGGGGAGGGASRPS 166



 Score = 27.5 bits (61), Expect = 6.3
 Identities = 11/28 (39%), Positives = 11/28 (39%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGR 147
           GG GG   G G G G  GG      R  
Sbjct: 117 GGGGGDEGGYGGGGGGGGGGYGGESRSG 144



 Score = 27.1 bits (60), Expect = 8.3
 Identities = 18/44 (40%), Positives = 19/44 (43%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGD 163
           GG GGGG G G  S S GG    S  G   A   A   +  AG 
Sbjct: 128 GGGGGGGGGYGGESRSGGGGGRASGGGGGGAGGGASRPSAPAGG 171


>gnl|CDD|117316 pfam08746, zf-RING-like, RING-like domain.  This is a zinc finger
           domain that is related to the C3HC4 RING finger domain
           (pfam00097).
          Length = 43

 Score = 26.2 bits (58), Expect = 2.7
 Identities = 8/18 (44%), Positives = 9/18 (50%), Gaps = 1/18 (5%)

Query: 228 IGCDNPDCPIEWFHFACV 245
             C N DC I W H  C+
Sbjct: 12  QRCGNRDCNIRW-HVDCL 28


>gnl|CDD|149112 pfam07863, CtnDOT_TraJ, Homologues of TraJ from Bacteroides
           conjugative transposon.  Members of this family have
           been implicated as being involved in an unusual form
           of DNA transfer (conjugation) in Bacteroides. The family
           has been named CtnDOT_TraJ to avoid confusion with other
           conjugative transfer systems.
          Length = 61

 Score = 26.5 bits (59), Expect = 3.0
 Identities = 13/32 (40%), Positives = 19/32 (59%)

Query: 116 KNTTGGAGGGGSGTGSGSGSAGGAASKSKRGR 147
           +N    AG GGS   +G+G+A G A+   RG+
Sbjct: 30  RNVNSTAGRGGSSAAAGAGAAMGNAAGRLRGK 61


>gnl|CDD|240291 PTZ00146, PTZ00146, fibrillarin; Provisional.
          Length = 293

 Score = 28.5 bits (64), Expect = 3.0
 Identities = 10/23 (43%), Positives = 10/23 (43%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASK 142
           GG G G  G G G G  GG    
Sbjct: 33  GGRGRGRGGGGGGRGGGGGGGPG 55



 Score = 28.2 bits (63), Expect = 4.5
 Identities = 12/29 (41%), Positives = 14/29 (48%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRK 148
           GG GGGG G G   G  GG   + + G  
Sbjct: 15  GGGGGGGRGGGGRGGGRGGGRGRGRGGGG 43



 Score = 28.2 bits (63), Expect = 4.6
 Identities = 12/32 (37%), Positives = 14/32 (43%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAK 151
           GG GGG  G G G    GG       GR + +
Sbjct: 8   GGRGGGRGGGGGGGRGGGGRGGGRGGGRGRGR 39



 Score = 28.2 bits (63), Expect = 4.8
 Identities = 12/31 (38%), Positives = 14/31 (45%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKA 150
           GG  GGG G G G G  GG     + G +  
Sbjct: 3   GGGFGGGRGGGRGGGGGGGRGGGGRGGGRGG 33



 Score = 27.8 bits (62), Expect = 6.5
 Identities = 10/32 (31%), Positives = 10/32 (31%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAK 151
           GG GGG  G        GG       G    K
Sbjct: 25  GGRGGGRGGGRGRGRGGGGGGRGGGGGGGPGK 56



 Score = 27.4 bits (61), Expect = 7.2
 Identities = 11/32 (34%), Positives = 13/32 (40%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAK 151
            G G GG   G G G  GG      RG  + +
Sbjct: 6   FGGGRGGGRGGGGGGGRGGGGRGGGRGGGRGR 37


>gnl|CDD|218573 pfam05387, Chorion_3, Chorion family 3.  This family consists of
           several Drosophila chorion proteins S36 and S38. The
           chorion genes of Drosophila are amplified in response to
           developmental signals in the follicle cells of the
           ovary.
          Length = 277

 Score = 28.5 bits (63), Expect = 3.1
 Identities = 21/70 (30%), Positives = 28/70 (40%), Gaps = 1/70 (1%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKKITG 179
           G AGGGG G GSG   AG +A   +     A     S  +  G         + A ++  
Sbjct: 21  GSAGGGG-GHGSGQYGAGASAGLEEYVNAAAGGAQPSGGNIIGAQAEIQPTPEEAGRLGR 79

Query: 180 VGGVVGVLNA 189
           V   +  LNA
Sbjct: 80  VQAQLQALNA 89


>gnl|CDD|224429 COG1512, COG1512, Beta-propeller domains of methanol dehydrogenase
           type [General function prediction only].
          Length = 271

 Score = 28.5 bits (64), Expect = 3.2
 Identities = 11/33 (33%), Positives = 15/33 (45%)

Query: 116 KNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRK 148
           +++  G  GG  G  SG G +GG  S    G  
Sbjct: 236 RSSGSGGSGGSGGGSSGGGFSGGGGSSGGGGAS 268



 Score = 28.1 bits (63), Expect = 4.0
 Identities = 12/25 (48%), Positives = 14/25 (56%)

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKS 143
           +GG   GG  +G G  S GG AS S
Sbjct: 246 SGGGSSGGGFSGGGGSSGGGGASGS 270


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 28.9 bits (65), Expect = 3.2
 Identities = 23/113 (20%), Positives = 55/113 (48%), Gaps = 11/113 (9%)

Query: 6   LEQYLDSLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLA 65
           LE+Y   L  +  EL+      R+L    +++ K + + +E            K KE   
Sbjct: 454 LEEYTAELKRIEKELKEIEEKERKLRKELRELEKVLKKESELI----------KLKELAE 503

Query: 66  EIQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQE-KALKN 117
           ++++  +K K+Y  ++++   + YE + + + KL  ++   ++E+++ + LK 
Sbjct: 504 QLKELEEKLKKYNLEELEKKAEEYEKLKEKLIKLKGEIKSLKKELEKLEELKK 556


>gnl|CDD|227244 COG4907, COG4907, Predicted membrane protein [Function unknown].
          Length = 595

 Score = 28.7 bits (64), Expect = 3.3
 Identities = 11/23 (47%), Positives = 15/23 (65%)

Query: 116 KNTTGGAGGGGSGTGSGSGSAGG 138
           ++++ G GGG SG GSG G  G 
Sbjct: 572 RSSSSGGGGGFSGGGSGGGGGGA 594



 Score = 28.0 bits (62), Expect = 6.6
 Identities = 11/21 (52%), Positives = 13/21 (61%)

Query: 120 GGAGGGGSGTGSGSGSAGGAA 140
             +GGGG  +G GSG  GG A
Sbjct: 574 SSSGGGGGFSGGGSGGGGGGA 594


>gnl|CDD|221784 pfam12810, Gly_rich, Glycine rich protein.  This family of proteins
           is greatly expanded in Trichomonas vaginalis. The
           proteins are composed of several glycine rich motifs
           interspersed through the sequence. Although many
           proteins have been annotated by similarity in the family,
           these annotations, given the biased composition of the
           sequences, are unlikely to be functionally
           relevant.
          Length = 248

 Score = 28.3 bits (64), Expect = 3.4
 Identities = 9/19 (47%), Positives = 10/19 (52%)

Query: 120 GGAGGGGSGTGSGSGSAGG 138
           GG GG G+  G   G  GG
Sbjct: 112 GGGGGSGNYNGGSGGFGGG 130


>gnl|CDD|222632 pfam14260, zf-C4pol, C4-type zinc-finger of DNA polymerase delta.
           In fission yeast this zinc-finger domain appears to be the
           region of Pol3 that binds directly to the B-subunit,
           Cdc1. Pol delta is a hetero-tetrameric enzyme comprising
           four evolutionarily well-conserved proteins: the
           catalytic subunit Pol3 and three smaller subunits Cdc1,
           Cdc27 and Cdm1.
          Length = 73

 Score = 26.6 bits (59), Expect = 3.5
 Identities = 10/22 (45%), Positives = 14/22 (63%), Gaps = 2/22 (9%)

Query: 218 VCQ--QVSYGEMIGCDNPDCPI 237
           +CQ  Q S  E + CD+ DCP+
Sbjct: 47  ICQRCQGSLHEEVLCDSRDCPV 68


>gnl|CDD|184923 PRK14959, PRK14959, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 624

 Score = 28.9 bits (64), Expect = 3.5
 Identities = 13/45 (28%), Positives = 18/45 (40%)

Query: 123 GGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSS 167
            GGG+   SGS + G A+  +           +    AAG   SS
Sbjct: 374 SGGGASAPSGSAAEGPASGGAATIPTPGTQGPQGTAPAAGMTPSS 418



 Score = 28.5 bits (63), Expect = 4.1
 Identities = 11/46 (23%), Positives = 20/46 (43%), Gaps = 2/46 (4%)

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKSKRG--RKKAKDKAESATDAAG 162
           +GG     SG+ +   ++GGAA+    G    +    A   T ++ 
Sbjct: 374 SGGGASAPSGSAAEGPASGGAATIPTPGTQGPQGTAPAAGMTPSSA 419


>gnl|CDD|203394 pfam06133, DUF964, Protein of unknown function (DUF964).  This
          family consists of several relatively short bacterial
          and archaeal hypothetical sequences. The function of
          this family is unknown.
          Length = 108

 Score = 27.5 bits (62), Expect = 3.6
 Identities = 9/43 (20%), Positives = 20/43 (46%), Gaps = 1/43 (2%)

Query: 31 DSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDK 73
          D  AQ ++    ++ E+     + + K+  KE   +IQ+   +
Sbjct: 32 DEEAQKLIDEFQKLQEEI-QEKQMFGKEIPKEVQQKIQELKRE 73


>gnl|CDD|224212 COG1293, COG1293, Predicted RNA-binding protein homologous to
           eukaryotic snRNP [Transcription].
          Length = 564

 Score = 28.5 bits (64), Expect = 3.8
 Identities = 30/125 (24%), Positives = 48/125 (38%), Gaps = 14/125 (11%)

Query: 5   YLEQYLDSLDSLPIELQRNFTLMRELDSRAQ---DVMKTIDRVAEDYLDNMKHYSKDKKK 61
            LE+  D L+ L    +        L +  Q   + +K++ R+A+ Y +       DK K
Sbjct: 301 KLEKQEDELEELEKAAEELRQKGELLYANLQLIEEGLKSV-RLADFYGNEEIKIELDKSK 359

Query: 62  ETLAEIQKYFDKTKEYGDDKVQL---------AIQTYEMVDKYIRKLDTDLARFEQEIQE 112
                 Q+YF K K+    KV L         AI  YE     + K +   A  E+  +E
Sbjct: 360 TPSENAQRYFKKYKKLKGAKVNLDRQLSELKEAIAYYESAKTALEKAEGKKA-IEEIREE 418

Query: 113 KALKN 117
              + 
Sbjct: 419 LIEEG 423


>gnl|CDD|234336 TIGR03734, PRTRC_parB, PRTRC system ParB family protein.  A novel
           genetic system characterized by six major proteins,
           including a ParB homolog and a ThiF homolog, is
           designated PRTRC, or ParB-Related, ThiF-Related Cassette.
           It is often found on plasmids. This protein family is the
           member related to ParB, and is designated PRTRC system
           ParB family protein.
          Length = 554

 Score = 28.5 bits (64), Expect = 3.9
 Identities = 9/48 (18%), Positives = 17/48 (35%)

Query: 108 QEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAE 155
           +     A +     A G G+     S +    +  +K   KKA   ++
Sbjct: 325 ERAAAAAAQKPAAPAAGPGTPAKEKSPAETATSGAAKPAAKKAVPSSQ 372


>gnl|CDD|215598 PLN03138, PLN03138, Protein TOC75; Provisional.
          Length = 796

 Score = 28.7 bits (64), Expect = 3.9
 Identities = 15/57 (26%), Positives = 21/57 (36%)

Query: 111 QEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSS 167
               L  +    GGGG G G G    GG       G  +   +  +  DA  D++ S
Sbjct: 67  AVALLSASAISGGGGGGGGGFGGFGGGGGGGGGGGGGWRFWLRLFAPADAHADEEQS 123


>gnl|CDD|218602 pfam05478, Prominin, Prominin.  The prominins are an emerging
           family of proteins that among the multispan membrane
           proteins display a novel topology. Mouse prominin and
           human prominin (mouse)-like 1 (PROML1) are predicted to
           contain five membrane spanning domains, with an
           N-terminal domain exposed to the extracellular space
           followed by four, alternating small cytoplasmic and
           large extracellular, loops and a cytoplasmic C-terminal
           domain. The exact function of prominin is unknown
           although in humans defects in PROM1, the gene coding for
           prominin, cause retinal degeneration.
          Length = 807

 Score = 28.4 bits (64), Expect = 4.0
 Identities = 12/73 (16%), Positives = 28/73 (38%), Gaps = 16/73 (21%)

Query: 10  LDSLDSLPIELQRNFT-----LMRELDSR-------AQDVMKTIDRVAEDYLDNMKHYSK 57
             + + +P +++   +     +   LDS        A+D+   +  V    L+N +  S 
Sbjct: 342 NSTFEEIPSKVKNQTSSVVPDVKAALDSLGTDIKSVAEDLPLQVLSVLSQILNNTQSSSN 401

Query: 58  DKKKETLAEIQKY 70
                 L  +++Y
Sbjct: 402 PY----LPYVEQY 410


>gnl|CDD|233028 TIGR00571, dam, DNA adenine methylase (dam).  All proteins in this
           family for which functions are known are DNA-adenine
           methyltransferases. This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University). The DNA adenine methylase (dam) of
           E. coli and related species is instrumental in
           distinguishing the newly synthesized strand during DNA
           replication for methylation-directed mismatch repair.
           This family includes several phage methylases and a
           number of different restriction enzyme chromosomal
           site-specific modification systems [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 266

 Score = 28.1 bits (63), Expect = 4.4
 Identities = 12/36 (33%), Positives = 21/36 (58%)

Query: 41  IDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKE 76
            + V E  LD  K Y+++  KE   E+++ F+K+ E
Sbjct: 65  KNNVDELILDVRKLYAEENTKEYYYEVREDFNKSTE 100


>gnl|CDD|215278 PLN02502, PLN02502, lysyl-tRNA synthetase.
          Length = 553

 Score = 28.4 bits (64), Expect = 4.4
 Identities = 12/49 (24%), Positives = 19/49 (38%)

Query: 126 GSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVA 174
            S     S +A     K+K+  ++   K E+   AA       S+K  A
Sbjct: 2   ESNGEPLSKNALKKRLKAKQAEEEKAAKEEAKAAAAAAAAKGRSRKSAA 50


>gnl|CDD|151621 pfam11179, DUF2967, Protein of unknown function (DUF2967).  This
           family of proteins with unknown function appears to be
           restricted to Drosophila.
          Length = 284

 Score = 28.0 bits (61), Expect = 4.6
 Identities = 19/65 (29%), Positives = 25/65 (38%)

Query: 113 KALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKK 172
           K   N      G  SG G+GSG+  G    S  G       A  A D A D +  ++   
Sbjct: 38  KNYGNEDAKDAGRASGKGTGSGAGVGGGGGSGSGVGGNGVMAAGAADEAADAQEEDAIAN 97

Query: 173 VAKKI 177
             KK+
Sbjct: 98  RNKKL 102


>gnl|CDD|221526 pfam12316, Dsh_C, Segment polarity protein dishevelled (Dsh) C
           terminal.  This domain family is found in eukaryotes,
           and is typically between 177 and 207 amino acids in
           length. The family is found in association with
           pfam00778, pfam02377, pfam00610, pfam00595. The segment
           polarity gene dishevelled (dsh) is required for pattern
           formation of the embryonic segments. It is involved in
           the determination of body organisation through the
           Wingless pathway (analogous to the Wnt-1 pathway).
          Length = 202

 Score = 27.6 bits (61), Expect = 5.2
 Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 6/48 (12%)

Query: 122 AGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNS 169
            G G +G+    GS    +++S   R +A D  E      G  KS  S
Sbjct: 61  YGSGSAGSQHSEGSRSSGSNRSDGERSRAADGRE------GGRKSGGS 102


>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34.  This family represents
           herpes virus protein U79 and cytomegalovirus early
           phosphoprotein P34 (UL112).
          Length = 238

 Score = 27.9 bits (62), Expect = 5.4
 Identities = 18/94 (19%), Positives = 33/94 (35%), Gaps = 4/94 (4%)

Query: 59  KKKETLAEIQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQEKALKNT 118
           K  E   +  +   + K   + + +   Q  +   +  +K   D     ++ QE+  +N 
Sbjct: 138 KNAEKFEKECRALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKED----RRKKQEEKRRND 193

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKD 152
                GGG G+  G           K  R+K  D
Sbjct: 194 EDKRPGGGGGSSGGQSGLSTKDEPPKEKRQKHHD 227


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 28.1 bits (62), Expect = 5.4
 Identities = 12/42 (28%), Positives = 19/42 (45%), Gaps = 5/42 (11%)

Query: 135 SAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKK 176
           +A    SK    R + K     A  A+ D +   ++KKV K+
Sbjct: 104 AALDKESKKTPRRTRRK-----AAAASSDVEEEKTEKKVRKR 140


>gnl|CDD|235782 PRK06341, PRK06341, single-stranded DNA-binding protein;
           Provisional.
          Length = 166

 Score = 27.5 bits (61), Expect = 5.4
 Identities = 11/33 (33%), Positives = 15/33 (45%), Gaps = 1/33 (3%)

Query: 116 KNTTGGAGGGGSGTGSGS-GSAGGAASKSKRGR 147
           +   GG GGGG   G G  GS+G +    +   
Sbjct: 117 RGEGGGGGGGGDDGGGGDFGSSGPSRGGPRPAS 149



 Score = 27.5 bits (61), Expect = 6.1
 Identities = 14/45 (31%), Positives = 18/45 (40%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDD 164
           G  GGGG G   G G   G++  S+ G + A            DD
Sbjct: 118 GEGGGGGGGGDDGGGGDFGSSGPSRGGPRPASSGGGGNFSRDMDD 162


>gnl|CDD|187662 cd09761, A3DFK9-like_SDR_c, Clostridium thermocellum A3DFK9-like, a
           putative carbohydrate or polyalcohol metabolizing SDR,
           classical (c) SDRs.  This subgroup includes a putative
           carbohydrate or polyalcohol metabolizing SDR (A3DFK9)
           from Clostridium thermocellum. Its members have a
           TGXXXGXG classical-SDR glycine-rich NAD-binding motif,
           and some have a canonical SDR active site tetrad (A3DFK9
           lacks the upstream Asn). SDRs are a functionally diverse
           family of oxidoreductases that have a single domain with
           a structurally conserved Rossmann fold (alpha/beta
           folding pattern with a central beta-sheet), an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Classical SDRs are typically about
           250 residues long, while extended SDRs are approximately
           350 residues. Sequence identity between different SDR
           enzymes is typically in the 15-30% range, but the
           enzymes share the Rossmann fold NAD-binding motif and
           characteristic NAD-binding and catalytic sequence
           patterns. These enzymes catalyze a wide range of
           activities including the metabolism of steroids,
           cofactors, carbohydrates, lipids, aromatic compounds,
           and amino acids, and act in redox sensing. Classical
           SDRs have a TGXXX[AG]XG cofactor binding motif and a
           YXXXK active site motif, with the Tyr residue of the
           active site motif serving as a critical catalytic
           residue (Tyr-151, human 15-hydroxyprostaglandin
           dehydrogenase (15-PGDH) numbering). In addition to the
           Tyr and Lys, there is often an upstream Ser (Ser-138,
           15-PGDH numbering) and/or an Asn (Asn-107, 15-PGDH
           numbering) contributing to the active site; while
           substrate binding is in the C-terminal region, which
           determines specificity. The standard reaction mechanism
           is a 4-pro-S hydride transfer and proton relay involving
           the conserved Tyr and Lys, a water molecule stabilized
           by Asn, and nicotinamide. Extended SDRs have additional
           elements in the C-terminal region, and typically have a
           TGXXGXXG cofactor binding motif. Complex (multidomain)
           SDRs such as ketoreductase domains of fatty acid
           synthase have a GGXGXXG NAD(P)-binding motif and an
           altered active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
           Some atypical SDRs have lost catalytic activity and/or
           have an unusual NAD(P)-binding motif and missing or
           unusual active site residues. Reactions catalyzed within
           the SDR family include isomerization, decarboxylation,
           epimerization, C=N bond reduction, dehydratase activity,
           dehalogenation, Enoyl-CoA reduction, and
           carbonyl-alcohol oxidoreduction.
          Length = 242

 Score = 27.5 bits (61), Expect = 5.6
 Identities = 20/81 (24%), Positives = 27/81 (33%), Gaps = 6/81 (7%)

Query: 119 TGGAGGGGSGTGSGSGSAGG----AASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVA 174
           TGG  G G         AG     A    +RG   A+ +  +     GD       K V 
Sbjct: 7   TGGGHGIGKQICLDFLEAGDKVVFADIDEERGADFAEAEGPNLFFVHGDVADETLVKFVV 66

Query: 175 KKITGVGGVVGVL--NAIVAA 193
             +    G + VL  NA   +
Sbjct: 67  YAMLEKLGRIDVLVNNAARGS 87


>gnl|CDD|222425 pfam13865, FoP_duplication, C-terminal duplication domain of Friend
           of PRMT1.  Fop, or Friend of Prmt1, proteins are
           conserved from fungi and plants to vertebrates. There is
           little that is actually conserved except for this
           C-terminal LDXXLDAYM region (where X is any amino acid).
           The Fop proteins themselves are nuclear proteins
           localised to regions with low levels of DAPI, with a
           punctate/speckle-like distribution. Fop is a
           chromatin-associated protein and it colocalises with
           facultative heterochromatin. It is critical for
           oestrogen-dependent gene activation.
          Length = 76

 Score = 26.3 bits (58), Expect = 5.6
 Identities = 15/60 (25%), Positives = 19/60 (31%), Gaps = 6/60 (10%)

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKSKRGRKKA--KDKAESAT----DAAGDDKSSNSKKK 172
            G  G  G         A     + + GRK      K +  T    DA  D   S +K K
Sbjct: 2   GGRKGSRGGKFRPRGRGARRGRRRGRGGRKGKGGAAKPKPKTREDLDAELDQYMSTTKSK 61


>gnl|CDD|221857 pfam12923, RRP7, Ribosomal RNA-processing protein 7 (RRP7).  RRP7
           is an essential protein in yeast that is involved in
           pre-rRNA processing and ribosome assembly. It is
           speculated to be required for correct assembly of rpS27
           into the pre-ribosomal particle.
          Length = 131

 Score = 27.2 bits (61), Expect = 5.8
 Identities = 18/66 (27%), Positives = 27/66 (40%), Gaps = 4/66 (6%)

Query: 90  EMVDKYIRKLDTDLARFEQEIQEKALKN----TTGGAGGGGSGTGSGSGSAGGAASKSKR 145
           E VD Y+ K D      ++E + ++  +    TT   GG     G+    A     K K 
Sbjct: 21  EEVDTYMEKYDKREEEAKEEAKARSEPDEDGWTTVTRGGRKRKAGASRNKAAEERRKLKE 80

Query: 146 GRKKAK 151
            +KK K
Sbjct: 81  KKKKKK 86


>gnl|CDD|235600 PRK05771, PRK05771, V-type ATP synthase subunit I; Validated.
          Length = 646

 Score = 28.0 bits (63), Expect = 5.8
 Identities = 13/88 (14%), Positives = 33/88 (37%), Gaps = 11/88 (12%)

Query: 28  RELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQ 87
           R+L S    + + +D+        ++ Y          + +      +E   D  +   +
Sbjct: 46  RKLRSLLTKLSEALDK--------LRSYLPKLNPLREEKKKVSVKSLEELIKDVEEELEK 97

Query: 88  TYEMVDKY---IRKLDTDLARFEQEIQE 112
             + + +    I +L+ ++   EQEI+ 
Sbjct: 98  IEKEIKELEEEISELENEIKELEQEIER 125


>gnl|CDD|236172 PRK08173, PRK08173, DNA topoisomerase III; Validated.
          Length = 862

 Score = 28.1 bits (63), Expect = 6.3
 Identities = 12/43 (27%), Positives = 17/43 (39%)

Query: 136 AGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKKIT 178
           A  A   + +    A  KAE A       K + +KK  A+K  
Sbjct: 820 AAAAKKTAAKATAAAATKAEKAAAKKAPAKKTAAKKTAARKTG 862


>gnl|CDD|188995 cd06456, M3A_DCP, Peptidase family M3 dipeptidyl carboxypeptidase
           (DCP).  Peptidase family M3 dipeptidyl carboxypeptidase
           (DCP; Dcp II; peptidyl dipeptidase; EC 3.4.15.5). This
           metal-binding M3A family also includes oligopeptidase A
           (OpdA; EC 3.4.24.70) enzyme. DCP cleaves dipeptides off
           the C-termini of various peptides and proteins, the
           smallest substrate being N-blocked tripeptides and
           unblocked tetrapeptides. DCP from E. coli is inhibited
           by the anti-hypertensive drug captopril, an inhibitor of
           the mammalian angiotensin converting enzyme (ACE, also
           called peptidyl dipeptidase A). Oligopeptidase A (OpdA)
           may play a specific role in the degradation of signal
           peptides after they are released from precursor forms of
           secreted proteins. It can also cleave N-acetyl-L-Ala.
          Length = 654

 Score = 27.8 bits (63), Expect = 6.4
 Identities = 15/45 (33%), Positives = 28/45 (62%), Gaps = 3/45 (6%)

Query: 39  KTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQ 83
           KT + V   +L+++   +K + K+ LAE+Q +    +E GDD++Q
Sbjct: 267 KTPEAV-LAFLEDLAEKAKPQAKKELAELQAFAK--EEGGDDELQ 308


>gnl|CDD|236312 PRK08617, PRK08617, acetolactate synthase; Reviewed.
          Length = 552

 Score = 27.9 bits (63), Expect = 6.6
 Identities = 20/91 (21%), Positives = 39/91 (42%), Gaps = 9/91 (9%)

Query: 13  LDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKK-KETLAEIQKYF 71
           +D LP E+   +   REL     D+  T+D +AE     +   S   +  E L E++   
Sbjct: 298 IDVLPAEIDNYYQPEREL---IGDIAATLDLLAEK----LDGLSLSPQSLEILEELRAQL 350

Query: 72  DKTKEYGDDKVQLAIQTYEMVDKYIRKLDTD 102
           ++  E      + A+    ++   ++ + TD
Sbjct: 351 EELAERPARLEEGAVHPLRIIRA-LQDIVTD 380


>gnl|CDD|235082 PRK02888, PRK02888, nitrous-oxide reductase; Validated.
          Length = 635

 Score = 27.6 bits (62), Expect = 6.7
 Identities = 10/57 (17%), Positives = 16/57 (28%)

Query: 107 EQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGD 163
              +  +    T   AG  G+   +G      AA  +      A         A G+
Sbjct: 6   PSGLSRRQFLGTAALAGAAGAAGSTGLLGGALAAGAAAAAAAAAAAAGGKYEVAPGE 62


>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
           Mitofilin controls mitochondrial cristae morphology.
           Mitofilin is enriched in the narrow space between the
           inner boundary and the outer membranes, where it forms a
           homotypic interaction and assembles into a large
           multimeric protein complex. The first 78 amino acids
           contain a typical amino-terminal-cleavable mitochondrial
           presequence rich in positive-charged and hydroxylated
           residues and a membrane anchor domain. In addition, it
           has three centrally located coiled coil domains.
          Length = 493

 Score = 27.7 bits (62), Expect = 6.7
 Identities = 24/113 (21%), Positives = 46/113 (40%), Gaps = 26/113 (23%)

Query: 10  LDSLDSLPIELQRNFTLMRELDSRAQDVMK-----TIDRVAEDYLDNMKHYSKDKKKETL 64
            + LD L  +L     L  E +   +  +K      + ++ E+ L  ++     K+    
Sbjct: 169 KEELDQLSKKLAE---LKAEEEEELERALKEKREELLSKLEEELLARLES----KEAALE 221

Query: 65  AEIQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQEKALKN 117
            +++  F++ KE             E+  KY  KL  +L R  +  ++K LKN
Sbjct: 222 KQLRLEFEREKE-------------ELRKKYEEKLRQELERQAEAHEQK-LKN 260


>gnl|CDD|190954 pfam04360, Serglycin, Serglycin.  Serglycin is the most prevalent
           proteoglycan produced in haemopoietic cells. Serglycin
           is a proteinase resistant secretory granule
           proteoglycan.
          Length = 150

 Score = 26.8 bits (59), Expect = 6.9
 Identities = 11/18 (61%), Positives = 13/18 (72%)

Query: 124 GGGSGTGSGSGSAGGAAS 141
           G GSG+GSGSGS  G+  
Sbjct: 93  GSGSGSGSGSGSGSGSGF 110



 Score = 26.8 bits (59), Expect = 7.3
 Identities = 11/17 (64%), Positives = 13/17 (76%)

Query: 119 TGGAGGGGSGTGSGSGS 135
           +G   G GSG+GSGSGS
Sbjct: 92  SGSGSGSGSGSGSGSGS 108


>gnl|CDD|233909 TIGR02520, pilus_B_mal_scr, type IVB pilus formation outer membrane
           protein, R64 PilN family.  Several related protein
           families encode outer membrane pore proteins for type II
           secretion, type III secretion, and type IV pilus
           formation. This protein family appears to encode a
           secretin for pilus formation, although it is quite
           different from PilQ. Members include the PilN
           lipoprotein of the plasmid R64 thin pilus, a type IV
           pilus. Scoring between the trusted and noise cutoffs are
           examples of bundle-forming pilus B (bfpB) [Cell
           envelope, Surface structures, Protein fate, Protein and
           peptide secretion and trafficking].
          Length = 497

 Score = 27.5 bits (61), Expect = 6.9
 Identities = 10/28 (35%), Positives = 13/28 (46%)

Query: 118 TTGGAGGGGSGTGSGSGSAGGAASKSKR 145
           +T G+G   SG    SGS    A K + 
Sbjct: 178 STAGSGSSSSGGSGNSGSTQSTAVKLES 205


>gnl|CDD|226903 COG4520, LipA, Surface antigen [Cell envelope biogenesis, outer
           membrane].
          Length = 136

 Score = 26.7 bits (59), Expect = 7.0
 Identities = 9/27 (33%), Positives = 12/27 (44%)

Query: 117 NTTGGAGGGGSGTGSGSGSAGGAASKS 143
            TT G G GG+  G+ +    G  S  
Sbjct: 9   TTTSGGGAGGALAGAQAIPTKGCRSTV 35


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 27.7 bits (62), Expect = 7.1
 Identities = 14/85 (16%), Positives = 43/85 (50%), Gaps = 6/85 (7%)

Query: 28  RELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQ 87
            EL+   ++ ++ +D + ++    ++   ++++K   AE  +Y    KE  + +    ++
Sbjct: 177 EELE-EVEENIERLDLIIDEKRQQLERLRREREK---AE--RYQALLKEKREYEGYELLK 230

Query: 88  TYEMVDKYIRKLDTDLARFEQEIQE 112
             E +++    ++  LA  E+E+++
Sbjct: 231 EKEALERQKEAIERQLASLEEELEK 255


>gnl|CDD|237654 PRK14278, PRK14278, chaperone protein DnaJ; Provisional.
          Length = 378

 Score = 27.7 bits (62), Expect = 7.2
 Identities = 16/35 (45%), Positives = 17/35 (48%), Gaps = 7/35 (20%)

Query: 120 GGAGGGGSGTGSGSGS-------AGGAASKSKRGR 147
            G GGGG G G G           GGAAS+  RGR
Sbjct: 74  AGGGGGGFGGGFGGLGDVFEAFFGGGAASRGPRGR 108


>gnl|CDD|222127 pfam13436, Gly-zipper_OmpA, Glycine-zipper containing OmpA-like
           membrane domain. 
          Length = 116

 Score = 26.5 bits (59), Expect = 7.2
 Identities = 9/32 (28%), Positives = 15/32 (46%)

Query: 119 TGGAGGGGSGTGSGSGSAGGAASKSKRGRKKA 150
            G A G   G  +G+GS   ++  ++R    A
Sbjct: 77  IGAAAGALVGAAAGAGSGQYSSYGAQRRYDNA 108



 Score = 26.5 bits (59), Expect = 8.3
 Identities = 10/27 (37%), Positives = 17/27 (62%)

Query: 121 GAGGGGSGTGSGSGSAGGAASKSKRGR 147
           G GG G+  G+ +G+  GAA+ +  G+
Sbjct: 69  GGGGDGAAIGAAAGALVGAAAGAGSGQ 95


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 27.7 bits (62), Expect = 7.3
 Identities = 10/76 (13%), Positives = 20/76 (26%), Gaps = 2/76 (2%)

Query: 101 TDLARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDA 160
            ++    +  ++   +         G    +   +      K  +   KA      A   
Sbjct: 36  EEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATESDIPKKKTKTAAKAAAAKAPAKKK 95

Query: 161 AGDDKSSNSKKKVAKK 176
                  +S KK  KK
Sbjct: 96  L--KDELDSSKKAEKK 109


>gnl|CDD|204345 pfam09932, DUF2164, Uncharacterized conserved protein (DUF2164). 
          This domain, found in various hypothetical prokaryotic
          proteins, has no known function.
          Length = 76

 Score = 25.6 bits (57), Expect = 7.5
 Identities = 9/17 (52%), Positives = 16/17 (94%)

Query: 56 SKDKKKETLAEIQKYFD 72
          SK++K+E +A+IQ+YF+
Sbjct: 4  SKEQKQELVAKIQRYFE 20


>gnl|CDD|214018 cd12925, iSH2_PIK3R3, Inter-Src homology 2 (iSH2) helical domain of
           Class IA Phosphoinositide 3-kinase Regulatory subunit 3,
           PIK3R3, also called p55gamma.  PI3Ks catalyze the
           transfer of the gamma-phosphoryl group from ATP to the
           3-hydroxyl of the inositol ring of
           D-myo-phosphatidylinositol (PtdIns) or its derivatives.
           They play an important role in a variety of fundamental
           cellular processes, including cell motility, the Ras
           pathway, vesicle trafficking and secretion, immune cell
           activation, and apoptosis. They are classified according
           to their substrate specificity, regulation, and domain
           structure. Class IA PI3Ks are heterodimers of a p110
           catalytic (C) subunit and a p85-related regulatory (R)
           subunit. The R subunit down-regulates PI3K basal
           activity, stabilizes the C subunit, and plays a role in
           the activation downstream of tyrosine kinases. All R
           subunits contain two SH2 domains that flank an
           intervening helical domain (iSH2), which binds to the
           N-terminal adaptor-binding domain (ABD) of the catalytic
           subunit. p55gamma, also called PIK3R3 or p55PIK, also
           contains a unique N-terminal 24-amino acid residue (N24)
           that interacts with cell cycle modulators to promote
           cell cycle progression.
          Length = 161

 Score = 26.9 bits (59), Expect = 7.5
 Identities = 27/124 (21%), Positives = 57/124 (45%), Gaps = 24/124 (19%)

Query: 11  DSLDSLPIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLD-----NMKHYSKDKKKETLA 65
           D++D++  +LQ       E  S+ Q+  K  DR+ E+Y        MK  + +   ET+ 
Sbjct: 1   DNIDAVGRKLQ-------EYHSQYQEKSKEYDRLYEEYTKTSQEIQMKRTAIEAFNETIK 53

Query: 66  EIQK-----------YFDKTKEYGDDK-VQLAIQTYEMVDKYIRKLDTDLARFEQEIQEK 113
             ++           Y ++ +  G++K ++  +  YE +   + ++     R EQ+++ +
Sbjct: 54  IFEEQCHTQERYSKEYIERFRREGNEKEIERIMMNYEKLKSRLGEIHDSKMRLEQDLKTQ 113

Query: 114 ALKN 117
           AL N
Sbjct: 114 ALDN 117


>gnl|CDD|147458 pfam05268, GP38, Phage tail fibre adhesin Gp38.  This family
           contains several Gp38 proteins from T-even-like phages.
           Gp38, together with a second phage protein, gp57,
           catalyzes the organisation of gp37 but is absent from
           the phage particle. Gp37 is responsible for receptor
           recognition.
          Length = 261

 Score = 27.5 bits (61), Expect = 7.6
 Identities = 15/35 (42%), Positives = 18/35 (51%), Gaps = 3/35 (8%)

Query: 115 LKNTTGGAGGG---GSGTGSGSGSAGGAASKSKRG 146
             N   G GGG   G+G  SGS  +GG AS +  G
Sbjct: 168 RGNGVCGGGGGRPFGAGGKSGSHMSGGNASLTAPG 202


>gnl|CDD|147982 pfam06112, Herpes_capsid, Gammaherpesvirus capsid protein.  This
           family consists of several Gammaherpesvirus capsid
           proteins. The exact function of this family is unknown.
          Length = 148

 Score = 26.7 bits (59), Expect = 7.8
 Identities = 17/50 (34%), Positives = 22/50 (44%), Gaps = 12/50 (24%)

Query: 124 GGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDA-AGDDKSSNSKKK 172
            G SG+   SG    ++S S            S + A AGD   S+SKKK
Sbjct: 110 SGSSGSALSSGPGSLSSSSS-----------LSGSGAGAGDTAPSSSKKK 148


>gnl|CDD|238288 cd00520, RRF, Ribosome recycling factor (RRF). Ribosome recycling
           factor dissociates the posttermination complex, composed
           of the ribosome, deacylated tRNA, and mRNA, after
           termination of translation.  Thus ribosomes are
           "recycled" and ready for another round of protein
           synthesis.  RRF is believed to bind the ribosome at the
           A-site in a manner that mimics tRNA, but the specific
           mechanisms remain unclear.  RRF is essential for
           bacterial growth.  It is not necessary for cell growth
           in archaea or eukaryotes, but is found in mitochondria
           or chloroplasts of some eukaryotic species.
          Length = 179

 Score = 26.8 bits (60), Expect = 7.9
 Identities = 20/81 (24%), Positives = 42/81 (51%), Gaps = 10/81 (12%)

Query: 33  RAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKEYGDD---KVQLAIQTY 89
           R ++++K   ++AE+    +++  +D   +      K  +K KE  +D   K +  +Q  
Sbjct: 105 RRKELVKDAKKIAEEAKVAIRNIRRDANDKI-----KKLEKEKEISEDEVKKAEEDLQK- 158

Query: 90  EMVDKYIRKLDTDLARFEQEI 110
            + D+YI+K+D  L   E+E+
Sbjct: 159 -LTDEYIKKIDELLKSKEKEL 178


>gnl|CDD|183745 PRK12787, fliX, flagellar assembly regulator FliX; Reviewed.
          Length = 138

 Score = 26.5 bits (59), Expect = 8.0
 Identities = 10/43 (23%), Positives = 15/43 (34%), Gaps = 5/43 (11%)

Query: 120 GGAGGGGSGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAG 162
            G  G  +  GS S    G++     G     + A  A +A  
Sbjct: 4   YGPNGTTAAGGSRSARRTGSS-----GFSLPDESASGAGEARA 41


>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
            chain; Provisional.
          Length = 1033

 Score = 27.5 bits (61), Expect = 8.1
 Identities = 10/55 (18%), Positives = 18/55 (32%)

Query: 97   RKLDTDLARFEQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGRKKAK 151
            R+ DT +   E+E QE   +          +   + S    G  +       K +
Sbjct: 978  RRCDTLIRLIEKENQEYDERERQARKEKKLAKNATPSKRPSGRQANESPSSLKKR 1032


>gnl|CDD|219366 pfam07295, DUF1451, Protein of unknown function (DUF1451).  This
           family consists of several hypothetical bacterial
           proteins of around 160 residues in length. Members of
           this family contain four highly conserved cysteine
           residues toward the C-terminal region of the protein. The
           function of this family is unknown.
          Length = 148

 Score = 26.8 bits (60), Expect = 8.2
 Identities = 12/61 (19%), Positives = 29/61 (47%), Gaps = 3/61 (4%)

Query: 52  MKHYSKDKKKETLAEIQKYFDKTKEYGDDKVQLAIQTYEMVDKYIRKLDTDLARFEQEIQ 111
           +     ++ K T  E+++  ++ KEY     +L  +   ++  Y+++   DL  F +  +
Sbjct: 1   VLDRLSERLKHTEKELKEAIEQAKEYLQAAEELTREELALIGAYLKR---DLEEFLRSYE 57

Query: 112 E 112
           E
Sbjct: 58  E 58


>gnl|CDD|173607 PTZ00417, PTZ00417, lysine-tRNA ligase; Provisional.
          Length = 585

 Score = 27.7 bits (61), Expect = 8.4
 Identities = 31/117 (26%), Positives = 47/117 (40%), Gaps = 29/117 (24%)

Query: 17  PIELQRNFTLMRELDSRAQDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQK--YFDKT 74
           P+  ++ F  M E   + + VM+   +V           SKDKKKE  AE+    Y++  
Sbjct: 39  PVHCKQCFVTMSE---KKEHVMEGEKKVR------SVQASKDKKKEEEAEVDPRLYYENR 89

Query: 75  KEYGDDKVQLAIQTY-----------EMVDKYIRKLDTDLARFEQEIQEKALKNTTG 120
            ++  ++    I  Y           E V+KY      DLA  E    E  + N TG
Sbjct: 90  SKFIQEQKAKGINPYPHKFERTITVPEFVEKY-----QDLASGEH--LEDTILNVTG 139


>gnl|CDD|236382 PRK09111, PRK09111, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 598

 Score = 27.6 bits (62), Expect = 8.5
 Identities = 8/30 (26%), Positives = 12/30 (40%)

Query: 112 EKALKNTTGGAGGGGSGTGSGSGSAGGAAS 141
           ++AL+    G    G G G   G  G   +
Sbjct: 381 DEALRRLQEGPPSPGGGGGGPPGGGGAPGA 410


>gnl|CDD|176238 cd08277, liver_alcohol_DH_like, Liver alcohol dehydrogenase.
           NAD(P)(H)-dependent oxidoreductases are the major
           enzymes in the interconversion of alcohols and
           aldehydes, or ketones.  Alcohol dehydrogenase in the
           liver converts ethanol and NAD+ to acetaldehyde and
           NADH, while in yeast and some other microorganisms ADH
           catalyzes the conversion of acetaldehyde to ethanol in
           alcoholic fermentation.  There are 7 vertebrate ADH
           classes, 6 of which have been identified in humans.
           Class III, glutathione-dependent formaldehyde
           dehydrogenase, has been identified as the primordial
           form and exists in diverse species, including plants,
           micro-organisms, vertebrates, and invertebrates. Class
           I, typified by  liver dehydrogenase, is an evolving
           form. Gene duplication and functional specialization of
           ADH into ADH classes and subclasses created numerous
           forms in vertebrates.  For example, the A, B and C
           (formerly alpha, beta, gamma) human class I subunits
           have high overall structural similarity, but differ in
           the substrate binding pocket and therefore in substrate
           specificity. In human ADH catalysis, the zinc ion helps
           coordinate the alcohol, followed by deprotonation of a
           histidine (His-51), the ribose of NAD, a serine
           (Ser-48), then the alcohol, which allows the transfer
           of a hydride to NAD+, creating NADH and a zinc-bound
           aldehyde or ketone. In yeast and some bacteria, the
           active site zinc binds an aldehyde, polarizing it, and
           leading to the reverse reaction. ADH is a member of the
           medium chain alcohol dehydrogenase family (MDR), which
           has a NAD(P)(H)-binding domain in a Rossmann fold of a
           beta-alpha form. The NAD(H)-binding region is comprised
           of 2 structurally similar halves, each of which contacts
           a mononucleotide.  A GxGxxG motif after the first
           mononucleotide contact half allows the close contact of
           the coenzyme with the ADH backbone.  The N-terminal
           catalytic domain has a distant homology  to GroES.
           These proteins typically form dimers (typically higher
           plants, mammals) or tetramers (yeast, bacteria), and
           have 2 tightly bound zinc atoms per subunit, a catalytic
           zinc at the active site and a structural zinc in a lobe
           of the catalytic domain.  NAD(H) binding occurs in the
           cleft between the catalytic  and coenzyme-binding
           domains at the active site, and coenzyme binding induces
           a conformational closing of this cleft. Coenzyme binding
           typically precedes and contributes to substrate binding.
          Length = 365

 Score = 27.3 bits (61), Expect = 8.6
 Identities = 21/85 (24%), Positives = 31/85 (36%), Gaps = 14/85 (16%)

Query: 121 GAGGGGSGTGSGSGSAGGAASK----SKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKK 176
           G G  G     G+  AG  AS+         K  K K   ATD      S     +V ++
Sbjct: 192 GLGAVGLSAIMGAKIAG--ASRIIGVDINEDKFEKAKEFGATDFINPKDSDKPVSEVIRE 249

Query: 177 ITGVG--------GVVGVLNAIVAA 193
           +TG G        G   ++N  + +
Sbjct: 250 MTGGGVDYSFECTGNADLMNEALES 274


>gnl|CDD|191367 pfam05761, 5_nucleotid, 5' nucleotidase family.  This family of
           eukaryotic proteins includes 5' nucleotidase enzymes,
           such as purine 5'-nucleotidase EC:3.1.3.5.
          Length = 448

 Score = 27.2 bits (61), Expect = 8.9
 Identities = 13/42 (30%), Positives = 18/42 (42%), Gaps = 9/42 (21%)

Query: 35  QDVMKTIDRVAEDYLDNMKHYSKDKKKETLAEIQKYFDKTKE 76
           QDV   ID V         H     K+E L ++++Y  K  E
Sbjct: 155 QDVRDAIDDV---------HRDGSLKREVLEDLERYVIKDPE 187


>gnl|CDD|199210 cd02258, Peptidase_C25_N, Peptidase C25 family N-terminal domain,
           found in Arg-gingipain (Rgp), Lys-gingipain (Kgp) and
           related proteins.  Peptidase family C25 is a unique
           class of cysteine proteases, exemplified by gingipain,
           which is produced by Porphyromonas gingivalis. P.
           gingivalis is one of the primary gram-negative pathogens
           that causes periodontitis, a disease that is also
           associated with other diseases such as diabetes and
           cardiovascular disease. Gingipains are a group of
           extracellular Arg- and Lys-specific proteinases called
           Arg-gingipain (Rgp) and Lys-gingipain (Kgp); RgpA and
           RgpB are homologous Arg-specific gingipains encoded by
           two closely related genes, rgpA and rgpB, while
           Lys-specific gingipain is encoded by the single kgp
           gene. Mutant studies have shown that, among the large
           quantities of proteolytic enzymes produced by P.
           gingivalis, these three proteases are major virulence
           factors of this bacterium. All three genes encode an
           N-terminal pre-pro fragment, followed by the protease
           domain; however, rgpA and kgp also encode additional
           C-terminal HA (hemagglutinin/adhesion) subunits which
           consist of several sequence-related adhesion domains.
           Although unique, their cysteine protease active site
           residues (His and Cys) forming the catalytic dyad are
           well-conserved, cleaving the C-terminal peptide bond
           with Arg or Lys residues. Gingipains are evolutionarily
           related to other highly specific proteases including
           caspases, clostripain, legumains, and separase.
           Gingipains function by dysregulating host defense and
           inflammatory responses, and degrading host proteins,
           e.g. tissue, cells, matrix, plasma and immunological
           proteins. They are proposed to enhance gingival
           crevicular fluid (GCF) production through activation of
           the kallikrein/kinin pathways, thus increasing vascular
           permeability and causing gingival inflammation, a
           distinctive feature of periodontitis. RgpA and RgpB are
           also able to cleave and activate coagulation factors IX
           and X in order to activate prothrombin to produce
           thrombin, which in turn increases production of GCF. The
           gingipains also play a pivotal role in the survival of
           P. gingivalis in the host by attacking the host defense
           system through cleavage of several immunological
           molecules, while at the same time evading the
           host-immune response by dysregulating the cytokine
           network.
          Length = 382

 Score = 27.3 bits (61), Expect = 9.1
 Identities = 8/20 (40%), Positives = 12/20 (60%)

Query: 62  ETLAEIQKYFDKTKEYGDDK 81
           +T AE + Y DK   Y ++K
Sbjct: 149 KTNAEAKNYVDKIIAYENNK 168


>gnl|CDD|236799 PRK10930, PRK10930, FtsH protease regulator HflK; Provisional.
          Length = 419

 Score = 27.5 bits (61), Expect = 9.2
 Identities = 13/27 (48%), Positives = 14/27 (51%)

Query: 121 GAGGGGSGTGSGSGSAGGAASKSKRGR 147
           G  GGG GTGSG GS+         GR
Sbjct: 53  GGLGGGKGTGSGGGSSSQGPRPQLGGR 79


>gnl|CDD|218598 pfam05470, eIF-3c_N, Eukaryotic translation initiation factor 3
           subunit 8 N-terminus.  The largest of the mammalian
           translation initiation factors, eIF3, consists of at
           least eight subunits ranging in mass from 35 to 170 kDa.
           eIF3 binds to the 40 S ribosome in an early step of
           translation initiation and promotes the binding of
           methionyl-tRNAi and mRNA.
          Length = 593

 Score = 27.5 bits (61), Expect = 9.2
 Identities = 18/110 (16%), Positives = 35/110 (31%), Gaps = 9/110 (8%)

Query: 92  VDKYIRKLDTDLARF----EQEIQEKALKNTTGGAGGGGSGTGSGSGSAGGAASKSKRGR 147
           V K  ++ + D+ R+    E E +E+         G            A    + S    
Sbjct: 117 VKKNNKQFEDDITRYREDPESEDEEEEEDEDDDDDGSDDEDEDEDGVGATEEVAASSESG 176

Query: 148 KKAKDKAESATDAAGDDKSSNSKKKVA----KKITGVGGVVGVLNAIVAA 193
                + +   + A   K    ++       ++IT    V   L  I++A
Sbjct: 177 VDRVKEDDEEDEDADLSKKDVLEEPKMFKKPEEIT-WDDVFKKLKEIMSA 225


>gnl|CDD|236957 PRK11700, PRK11700, hypothetical protein; Provisional.
          Length = 187

 Score = 26.7 bits (60), Expect = 9.4
 Identities = 10/14 (71%), Positives = 11/14 (78%)

Query: 99  LDTDLARFEQEIQE 112
           L  DL RFEQ+IQE
Sbjct: 13  LLADLPRFEQKIQE 26


>gnl|CDD|172341 PRK13808, PRK13808, adenylate kinase; Provisional.
          Length = 333

 Score = 27.2 bits (60), Expect = 9.8
 Identities = 14/53 (26%), Positives = 21/53 (39%), Gaps = 2/53 (3%)

Query: 127 SGTGSGSGSAGGAASKSKRGRKKAKDKAESATDAAGDDKSSNSKKKVAKKITG 179
           +  G+ +         +K G KKA  KA+SA       K   +K  V+ K   
Sbjct: 192 AAVGAANAKKAAKTPAAKSGAKKASAKAKSAAKKVS--KKKAAKTAVSAKKAA 242


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.312    0.130    0.377 

Gapped
Lambda     K      H
   0.267   0.0753    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 12,993,730
Number of extensions: 1240604
Number of successful extensions: 2520
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2263
Number of HSP's successfully gapped: 216
Length of query: 259
Length of database: 10,937,602
Length adjustment: 95
Effective length of query: 164
Effective length of database: 6,723,972
Effective search space: 1102731408
Effective search space used: 1102731408
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.8 bits)
S2: 58 (26.1 bits)