RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy6373
         (199 letters)



>gnl|CDD|133004 cd02510, pp-GalNAc-T, pp-GalNAc-T initiates the formation of
           mucin-type O-linked glycans.  UDP-GalNAc: polypeptide
           alpha-N-acetylgalactosaminyltransferases (pp-GalNAc-T)
           initiate the formation of mucin-type, O-linked glycans
           by catalyzing the transfer of
           alpha-N-acetylgalactosamine (GalNAc) from UDP-GalNAc to
           hydroxyl groups of Ser or Thr residues of core proteins
           to form the Tn antigen (GalNAc-a-1-O-Ser/Thr). These
           enzymes are type II membrane proteins with a GT-A type
           catalytic domain and a lectin domain located on the
           lumen side of the Golgi apparatus. In humans, there are
           15 isozymes of pp-GalNAc-Ts, representing the largest of
           all glycosyltransferase families. Each isozyme has
           unique but partially redundant substrate specificity for
           glycosylation sites on acceptor proteins.
          Length = 299

 Score =  203 bits (520), Expect = 8e-66
 Identities = 72/109 (66%), Positives = 87/109 (79%), Gaps = 2/109 (1%)

Query: 3   RTPMIAGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGHVF 62
           R+P +AGGLF+IDR++F +LG YD  M +WGGENLE+SF+VWQCGGS+EIVPCSRVGH+F
Sbjct: 167 RSPTMAGGLFAIDREWFLELGGYDEGMDIWGGENLELSFKVWQCGGSIEIVPCSRVGHIF 226

Query: 63  R-KRHPYTFPGGSGNVFARNTRRAAEVWMDNYKHYYYAEVPLAKTIPFG 110
           R KR PYTFPGGSG V  RN +R AEVWMD YK Y+Y   P  + I +G
Sbjct: 227 RRKRKPYTFPGGSGTV-LRNYKRVAEVWMDEYKEYFYKARPELRNIDYG 274


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 39.6 bits (92), Expect = 7e-04
 Identities = 17/24 (70%), Positives = 23/24 (95%)

Query: 117 NGGSSEEEEEEKEKKKEEEEEEEQ 140
           +GG SEEEEEE+E+++EEEEEEE+
Sbjct: 859 DGGDSEEEEEEEEEEEEEEEEEEE 882



 Score = 36.5 bits (84), Expect = 0.007
 Identities = 16/25 (64%), Positives = 22/25 (88%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQ 140
           + G S EEEEEE+E+++EEEEEEE+
Sbjct: 859 DGGDSEEEEEEEEEEEEEEEEEEEE 883



 Score = 33.8 bits (77), Expect = 0.060
 Identities = 15/31 (48%), Positives = 22/31 (70%)

Query: 110 GETLNLENGGSSEEEEEEKEKKKEEEEEEEQ 140
           G++   E     EEEEEE+E+++EEEEEE +
Sbjct: 861 GDSEEEEEEEEEEEEEEEEEEEEEEEEEENE 891



 Score = 33.0 bits (75), Expect = 0.095
 Identities = 13/19 (68%), Positives = 18/19 (94%)

Query: 122 EEEEEEKEKKKEEEEEEEQ 140
           EEEEEE+E+++EEEEE E+
Sbjct: 874 EEEEEEEEEEEEEEEENEE 892



 Score = 30.0 bits (67), Expect = 0.84
 Identities = 13/20 (65%), Positives = 17/20 (85%)

Query: 122 EEEEEEKEKKKEEEEEEEQS 141
           EEEEEE+E+++EEE EE  S
Sbjct: 876 EEEEEEEEEEEEEENEEPLS 895



 Score = 26.9 bits (59), Expect = 9.4
 Identities = 24/90 (26%), Positives = 32/90 (35%), Gaps = 3/90 (3%)

Query: 104 AKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEE---QSRGGNRALRPSTYTRHRHQT 160
            +     E  N E  G   E+E E E K E E E E   + +G              H+ 
Sbjct: 649 GERPTEAEGENGEESGGEAEQEGETETKGENESEGEIPAERKGEQEGEGEIEAKEADHKG 708

Query: 161 SFIDQEVAYMKMTSKKYRMGEDEEEKEEEG 190
               +EV +   T  +    E E E  EEG
Sbjct: 709 ETEAEEVEHEGETEAEGTEDEGEIETGEEG 738


>gnl|CDD|217196 pfam02709, Glyco_transf_7C, N-terminal domain of
          galactosyltransferase.  This is the N-terminal domain
          of a family of galactosyltransferases from a wide range
          of Metazoa with three related galactosyltransferases
          activities, all three of which are possessed by one
          sequence in some cases. EC:2.4.1.90,
          N-acetyllactosamine synthase; EC:2.4.1.38,
          Beta-N-acetylglucosaminyl-glycopeptide beta-1,4-
          galactosyltransferase; and EC:2.4.1.22 Lactose
          synthase. Note that N-acetyllactosamine synthase is a
          component of Lactose synthase along with
          alpha-lactalbumin, in the absence of alpha-lactalbumin
          EC:2.4.1.90 is the catalyzed reaction.
          Length = 78

 Score = 34.5 bits (80), Expect = 0.004
 Identities = 11/50 (22%), Positives = 23/50 (46%)

Query: 7  IAGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCS 56
            GG+ +  ++ F K+  +      WGGE+ ++  R+   G  +E    +
Sbjct: 19 YFGGVLAFSKEDFLKVNGFSNNFWGWGGEDDDLYARLLLAGLKIERPKFA 68


>gnl|CDD|133029 cd04186, GT_2_like_c, Subfamily of Glycosyltransferase Family GT2
           of unknown function.  GT-2 includes diverse families of
           glycosyltransferases with a common GT-A type structural
           fold, which has two tightly associated beta/alpha/beta
           domains that tend to form a continuous central sheet of
           at least eight beta-strands. These are enzymes that
           catalyze the transfer of sugar moieties from activated
           donor molecules to specific acceptor molecules, forming
           glycosidic bonds. Glycosyltransferases have been
           classified into more than 90 distinct sequence based
           families.
          Length = 166

 Score = 34.8 bits (81), Expect = 0.012
 Identities = 13/57 (22%), Positives = 29/57 (50%), Gaps = 1/57 (1%)

Query: 4   TPMIAGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGH 60
            P ++G    + R+ FE++G +D    ++  E++++  R    G  +  VP + + H
Sbjct: 109 GPKVSGAFLLVRREVFEEVGGFDEDFFLYY-EDVDLCLRARLAGYRVLYVPQAVIYH 164


>gnl|CDD|235795 PRK06402, rpl12p, 50S ribosomal protein L12P; Reviewed.
          Length = 106

 Score = 33.4 bits (77), Expect = 0.016
 Identities = 13/27 (48%), Positives = 22/27 (81%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
           EE++EE+E+++E+EE EE++  G  AL
Sbjct: 78  EEKKEEEEEEEEKEESEEEAAAGLGAL 104



 Score = 30.7 bits (70), Expect = 0.15
 Identities = 9/20 (45%), Positives = 17/20 (85%)

Query: 122 EEEEEEKEKKKEEEEEEEQS 141
            EE++E+E+++EE+EE E+ 
Sbjct: 77  AEEKKEEEEEEEEKEESEEE 96



 Score = 30.3 bits (69), Expect = 0.25
 Identities = 10/18 (55%), Positives = 15/18 (83%)

Query: 122 EEEEEEKEKKKEEEEEEE 139
              EE+KE+++EEEE+EE
Sbjct: 75  AAAEEKKEEEEEEEEKEE 92



 Score = 30.3 bits (69), Expect = 0.25
 Identities = 10/20 (50%), Positives = 16/20 (80%)

Query: 122 EEEEEEKEKKKEEEEEEEQS 141
               EEK++++EEEEE+E+S
Sbjct: 74  AAAAEEKKEEEEEEEEKEES 93



 Score = 26.8 bits (60), Expect = 3.5
 Identities = 8/18 (44%), Positives = 13/18 (72%)

Query: 122 EEEEEEKEKKKEEEEEEE 139
                E++K++EEEEEE+
Sbjct: 73  AAAAAEEKKEEEEEEEEK 90



 Score = 26.1 bits (58), Expect = 7.2
 Identities = 10/18 (55%), Positives = 12/18 (66%)

Query: 122 EEEEEEKEKKKEEEEEEE 139
                 +EKK+EEEEEEE
Sbjct: 72  AAAAAAEEKKEEEEEEEE 89


>gnl|CDD|215914 pfam00428, Ribosomal_60s, 60s Acidic ribosomal protein.  This
           family includes archaebacterial L12, eukaryotic P0, P1
           and P2.
          Length = 88

 Score = 33.0 bits (76), Expect = 0.018
 Identities = 10/22 (45%), Positives = 14/22 (63%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++      +E+KKEEEEEEE
Sbjct: 57  AAAAAAAAAAEEEKKEEEEEEE 78



 Score = 33.0 bits (76), Expect = 0.021
 Identities = 9/22 (40%), Positives = 17/22 (77%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++   EEEK++++EEEEE++
Sbjct: 60  AAAAAAAEEEKKEEEEEEEEDD 81



 Score = 32.6 bits (75), Expect = 0.023
 Identities = 9/22 (40%), Positives = 16/22 (72%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++    EE++K++EEEEEE+
Sbjct: 59  AAAAAAAAEEEKKEEEEEEEED 80



 Score = 32.2 bits (74), Expect = 0.031
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++     E+EKK+EEEEEEE
Sbjct: 58  AAAAAAAAAEEEKKEEEEEEEE 79



 Score = 30.7 bits (70), Expect = 0.14
 Identities = 8/24 (33%), Positives = 14/24 (58%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQS 141
             ++       E++K+EEEEEE+ 
Sbjct: 56  AAAAAAAAAAAEEEKKEEEEEEEE 79


>gnl|CDD|100109 cd05831, Ribosomal_P1, Ribosomal protein P1. This subfamily
           represents the eukaryotic large ribosomal protein P1.
           Eukaryotic P1 and P2 are functionally equivalent to the
           bacterial protein L7/L12, but are not homologous to
           L7/L12. P1 is located in the L12 stalk, with proteins
           P2, P0, L11, and 28S rRNA. P1 and P2 are the only
           proteins in the ribosome to occur as multimers, always
           appearing as sets of heterodimers. Recent data indicate
           that eukaryotes have four copies (two heterodimers),
           while most archaeal species contain six copies of L12p
           (three homodimers) and bacteria may have four or six
           copies (two or three homodimers), depending on the
           species. Experiments using S. cerevisiae P1 and P2
           indicate that P1 proteins are positioned more internally
           with limited reactivity in the C-terminal domains, while
           P2 proteins seem to be more externally located and are
           more likely to interact with other cellular components.
           In lower eukaryotes, P1 and P2 are further subdivided
           into P1A, P1B, P2A, and P2B, which form P1A/P2B and
           P1B/P2A heterodimers. Some plant species have a third
           P-protein, called P3, which is not homologous to P1 and
           P2. In humans, P1 and P2 are strongly autoimmunogenic.
           They play a significant role in the etiology and
           pathogenesis of systemic lupus erythematosus (SLE). In
           addition, the ribosome-inactivating protein
           trichosanthin (TCS) interacts with human P0, P1, and P2,
           with its primary binding site located in the C-terminal
           region of P2. TCS inactivates the ribosome by
           depurinating a specific adenine in the sarcin-ricin loop
           of 28S rRNA.
          Length = 103

 Score = 32.7 bits (75), Expect = 0.031
 Identities = 11/22 (50%), Positives = 15/22 (68%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++   E +KE+KKEEEEEE 
Sbjct: 73  AAAAAAAEAKKEEKKEEEEEES 94



 Score = 30.8 bits (70), Expect = 0.17
 Identities = 10/26 (38%), Positives = 17/26 (65%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRG 143
             ++  E +++EKK+EEEEE +   G
Sbjct: 74  AAAAAAEAKKEEKKEEEEEESDDDMG 99



 Score = 28.8 bits (65), Expect = 0.76
 Identities = 9/22 (40%), Positives = 15/22 (68%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++    E K+++K+EEEEEE
Sbjct: 72  AAAAAAAAEAKKEEKKEEEEEE 93


>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
          Length = 330

 Score = 33.7 bits (78), Expect = 0.042
 Identities = 13/22 (59%), Positives = 18/22 (81%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
           EEEEEE+E+++EE  EEE + G
Sbjct: 303 EEEEEEEEEEEEEPSEEEAAAG 324



 Score = 32.9 bits (76), Expect = 0.076
 Identities = 15/27 (55%), Positives = 21/27 (77%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
           EEEEEE+E+++EEE  EE++  G  AL
Sbjct: 302 EEEEEEEEEEEEEEPSEEEAAAGLGAL 328



 Score = 31.8 bits (73), Expect = 0.20
 Identities = 12/31 (38%), Positives = 19/31 (61%)

Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
           E   + +  +     EE+E+++EEEEEEE S
Sbjct: 287 ELKEVLSAQAQAAAAEEEEEEEEEEEEEEPS 317



 Score = 31.0 bits (71), Expect = 0.36
 Identities = 11/22 (50%), Positives = 18/22 (81%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
           ++ EEEEE+E+++EEEE  E+ 
Sbjct: 299 AAAEEEEEEEEEEEEEEPSEEE 320


>gnl|CDD|234311 TIGR03685, L12P_arch, 50S ribosomal protein L12P.  This model
           represents the L12P protein of the large (50S) subunit
           of the archaeal ribosome.
          Length = 105

 Score = 31.9 bits (73), Expect = 0.059
 Identities = 16/27 (59%), Positives = 22/27 (81%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
           EEEEEE+E+++EEEE EE++  G  AL
Sbjct: 77  EEEEEEEEEEEEEEESEEEAMAGLGAL 103



 Score = 31.2 bits (71), Expect = 0.12
 Identities = 12/22 (54%), Positives = 18/22 (81%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++  EEEE+E+++EEEEEEE
Sbjct: 70  AAAAAAEEEEEEEEEEEEEEEE 91



 Score = 30.8 bits (70), Expect = 0.16
 Identities = 12/23 (52%), Positives = 19/23 (82%)

Query: 119 GSSEEEEEEKEKKKEEEEEEEQS 141
            ++   EEE+E+++EEEEEEE+S
Sbjct: 70  AAAAAAEEEEEEEEEEEEEEEES 92



 Score = 30.4 bits (69), Expect = 0.18
 Identities = 12/27 (44%), Positives = 19/27 (70%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRGG 144
             ++ EEEEE+E+++EEEEEE +    
Sbjct: 71  AAAAAEEEEEEEEEEEEEEEESEEEAM 97



 Score = 30.0 bits (68), Expect = 0.30
 Identities = 13/23 (56%), Positives = 18/23 (78%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGG 144
           EEEEEE+E+++EEEEE E+    
Sbjct: 76  EEEEEEEEEEEEEEEESEEEAMA 98



 Score = 27.3 bits (61), Expect = 2.3
 Identities = 9/22 (40%), Positives = 15/22 (68%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++     E+E+++EEEEEEE
Sbjct: 67  AAAAAAAAAEEEEEEEEEEEEE 88



 Score = 26.9 bits (60), Expect = 3.5
 Identities = 8/22 (36%), Positives = 14/22 (63%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++      +E+++EEEEEEE
Sbjct: 66  AAAAAAAAAAEEEEEEEEEEEE 87


>gnl|CDD|220577 pfam10111, Glyco_tranf_2_2, Glycosyltransferase like family 2.
           Members of this family of prokaryotic proteins include
           putative glucosyltransferase, which are involved in
           bacterial capsule biosynthesis.
          Length = 278

 Score = 33.1 bits (76), Expect = 0.068
 Identities = 15/65 (23%), Positives = 25/65 (38%), Gaps = 2/65 (3%)

Query: 8   AGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGHVFRKRHP 67
           A     I+R +F K+G +D      GGE+ E+ +R+                     + P
Sbjct: 166 ASSCILINRDFFLKIGGFDENFRGHGGEDFELLYRLLLYYKKFPPPKDLLTYD--EYKWP 223

Query: 68  YTFPG 72
            T+ G
Sbjct: 224 ITYSG 228


>gnl|CDD|100110 cd05832, Ribosomal_L12p, Ribosomal protein L12p. This subfamily
           includes archaeal L12p, the protein that is functionally
           equivalent to L7/L12 in bacteria and the P1 and P2
           proteins in eukaryotes. L12p is homologous to P1 and P2
           but is not homologous to bacterial L7/L12. It is located
           in the L12 stalk, with proteins L10, L11, and 23S rRNA.
           L12p is the only protein in the ribosome to occur as
           multimers, always appearing as sets of dimers. Recent
           data indicate that most archaeal species contain six
           copies of L12p (three homodimers), while eukaryotes have
           four copies (two heterodimers), and bacteria may have
           four or six copies (two or three homodimers), depending
           on the species. The organization of proteins within the
           stalk has been characterized primarily in bacteria,
           where L7/L12 forms either two or three homodimers and
           each homodimer binds to the extended C-terminal helix of
           L10. L7/L12 is attached to the ribosome through L10 and
           is the only ribosomal protein that does not directly
           interact with rRNA. Archaeal L12p is believed to
           function in a similar fashion. However, hybrid ribosomes
           containing the large subunit from E. coli with an
           archaeal stalk are able to bind archaeal and eukaryotic
           elongation factors but not bacterial elongation factors.
           In several mesophilic and thermophilic archaeal species,
           the binding of 23S rRNA to protein L11 and to the
           L10/L12p pentameric complex was found to be
           temperature-dependent and cooperative.
          Length = 106

 Score = 31.7 bits (72), Expect = 0.084
 Identities = 16/27 (59%), Positives = 23/27 (85%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
           EE+EEEK+K++E+EEEEE++  G  AL
Sbjct: 79  EEKEEEKKKEEEKEEEEEEALAGLGAL 105



 Score = 28.2 bits (63), Expect = 1.4
 Identities = 11/19 (57%), Positives = 17/19 (89%)

Query: 121 SEEEEEEKEKKKEEEEEEE 139
            E+ EE++E+KK+EEE+EE
Sbjct: 75  EEKAEEKEEEKKKEEEKEE 93



 Score = 27.1 bits (60), Expect = 3.0
 Identities = 11/22 (50%), Positives = 17/22 (77%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEE 137
           E     EEE++++E+K+EEEEE
Sbjct: 76  EKAEEKEEEKKKEEEKEEEEEE 97



 Score = 26.7 bits (59), Expect = 4.6
 Identities = 10/27 (37%), Positives = 18/27 (66%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
             EE+ +EK++E+++EEE+      AL
Sbjct: 73  AAEEKAEEKEEEKKKEEEKEEEEEEAL 99



 Score = 26.3 bits (58), Expect = 5.2
 Identities = 11/22 (50%), Positives = 20/22 (90%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
           ++EE+ EEKE++K++EEE+E+ 
Sbjct: 73  AAEEKAEEKEEEKKKEEEKEEE 94



 Score = 25.9 bits (57), Expect = 7.7
 Identities = 8/18 (44%), Positives = 14/18 (77%)

Query: 122 EEEEEEKEKKKEEEEEEE 139
              EE+ E+K+EE+++EE
Sbjct: 72  AAAEEKAEEKEEEKKKEE 89


>gnl|CDD|100111 cd05833, Ribosomal_P2, Ribosomal protein P2. This subfamily
           represents the eukaryotic large ribosomal protein P2.
           Eukaryotic P1 and P2 are functionally equivalent to the
           bacterial protein L7/L12, but are not homologous to
           L7/L12. P2 is located in the L12 stalk, with proteins
           P1, P0, L11, and 28S rRNA. P1 and P2 are the only
           proteins in the ribosome to occur as multimers, always
           appearing as sets of heterodimers. Recent data indicate
           that eukaryotes have four copies (two heterodimers),
           while most archaeal species contain six copies of L12p
           (three homodimers). Bacteria may have four or six copies
           of L7/L12 (two or three homodimers) depending on the
           species. Experiments using S. cerevisiae P1 and P2
           indicate that P1 proteins are positioned more internally
           with limited reactivity in the C-terminal domains, while
           P2 proteins seem to be more externally located and are
           more likely to interact with other cellular components.
           In lower eukaryotes, P1 and P2 are further subdivided
           into P1A, P1B, P2A, and P2B, which form P1A/P2B and
           P1B/P2A heterodimers. Some plants have a third
           P-protein, called P3, which is not homologous to P1 and
           P2. In humans, P1 and P2 are strongly autoimmunogenic.
           They play a significant role in the etiology and
           pathogenesis of systemic lupus erythematosus (SLE). In
           addition, the ribosome-inactivating protein
           trichosanthin (TCS) interacts with human P0, P1, and P2,
           with its primary binding site in the C-terminal region
           of P2. TCS inactivates the ribosome by depurinating a
           specific adenine in the sarcin-ricin loop of 28S rRNA.
          Length = 109

 Score = 30.7 bits (70), Expect = 0.18
 Identities = 9/23 (39%), Positives = 13/23 (56%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQ 140
             ++     +KE+KKEE EEE  
Sbjct: 78  AAAAAAAAAKKEEKKEESEEESD 100



 Score = 26.8 bits (60), Expect = 4.3
 Identities = 7/22 (31%), Positives = 13/22 (59%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
             ++      K+++K+EE EEE
Sbjct: 77  AAAAAAAAAAKKEEKKEESEEE 98



 Score = 26.5 bits (59), Expect = 4.4
 Identities = 8/28 (28%), Positives = 15/28 (53%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRGGN 145
             ++    +++EKK+E EEE +   G  
Sbjct: 79  AAAAAAAAKKEEKKEESEEESDDDMGFG 106


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 32.4 bits (73), Expect = 0.18
 Identities = 14/35 (40%), Positives = 21/35 (60%)

Query: 124 EEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRH 158
           EE E+EK+KE+E E E+ R   RA + S+ +    
Sbjct: 598 EEREREKEKEKEREREREREAERAAKASSSSHESR 632


>gnl|CDD|224969 COG2058, RPP1A, Ribosomal protein L12E/L44/L45/RPP1/RPP2
           [Translation, ribosomal structure and biogenesis].
          Length = 109

 Score = 29.7 bits (67), Expect = 0.34
 Identities = 10/23 (43%), Positives = 14/23 (60%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGG 144
           E +E E+E+K+EE EEE      
Sbjct: 82  EADEAEEEEKEEEAEEESDDDML 104



 Score = 29.7 bits (67), Expect = 0.38
 Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRGGNRAL 148
             ++ E +E +E++KEEE EEE        L
Sbjct: 77  AEAAAEADEAEEEEKEEEAEEESDDDMLFGL 107



 Score = 28.5 bits (64), Expect = 1.1
 Identities = 10/24 (41%), Positives = 17/24 (70%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEEQS 141
           G  +  E +E E++++EEE EE+S
Sbjct: 76  GAEAAAEADEAEEEEKEEEAEEES 99



 Score = 27.0 bits (60), Expect = 3.1
 Identities = 9/29 (31%), Positives = 18/29 (62%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGG 144
           E    ++E EEE+++++ EEE ++    G
Sbjct: 78  EAAAEADEAEEEEKEEEAEEESDDDMLFG 106


>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II).  Bone
           sialoprotein (BSP) is a major structural protein of the
           bone matrix that is specifically expressed by
           fully-differentiated osteoblasts. The expression of bone
           sialoprotein (BSP) is normally restricted to mineralised
           connective tissues of bones and teeth where it has been
           associated with mineral crystal formation. However, it
           has been found that ectopic expression of BSP occurs in
           various lesions, including oral and extraoral
           carcinomas, in which it has been associated with the
           formation of microcrystalline deposits and the
           metastasis of cancer cells to bone.
          Length = 291

 Score = 30.8 bits (69), Expect = 0.35
 Identities = 26/79 (32%), Positives = 35/79 (44%), Gaps = 11/79 (13%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRM 179
           S E+EEEE+E+++EE E EE  +G N     ST   H + +S  D               
Sbjct: 135 SDEDEEEEEEEEEEEAEVEENEQGTNGTSTNSTEVDHGNGSSGGDNG-----------EE 183

Query: 180 GEDEEEKEEEGTGRVGEGK 198
           GE+E   E E  G    G 
Sbjct: 184 GEEESVTEAEAEGTTVAGP 202


>gnl|CDD|177133 MTH00061, ND4L, NADH dehydrogenase subunit 4L; Provisional.
          Length = 86

 Score = 29.1 bits (66), Expect = 0.38
 Identities = 9/33 (27%), Positives = 14/33 (42%), Gaps = 11/33 (33%)

Query: 27 MAMSVWGGENLEISF------RVWQCGGSLEIV 53
          M +       +E+        RVW+C   LE+V
Sbjct: 57 MVIFT-----VEVILGLVVLTRVWECSSLLELV 84


>gnl|CDD|185582 PTZ00373, PTZ00373, 60S Acidic ribosomal protein P2; Provisional.
          Length = 112

 Score = 29.5 bits (66), Expect = 0.47
 Identities = 13/19 (68%), Positives = 15/19 (78%)

Query: 125 EEEKEKKKEEEEEEEQSRG 143
           E +KE+KKEEEEEEE   G
Sbjct: 89  EAKKEEKKEEEEEEEDDLG 107



 Score = 26.8 bits (59), Expect = 4.0
 Identities = 9/24 (37%), Positives = 18/24 (75%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEE 139
              G+  E ++E++K++EEEEE++
Sbjct: 82  ATAGAKAEAKKEEKKEEEEEEEDD 105


>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32.  This family
           consists of several mammalian specific proacrosin
           binding protein sp32 sequences. sp32 is a sperm specific
           protein which is known to bind with 55- and 53-kDa
           proacrosins and the 49-kDa acrosin intermediate. The
           exact function of sp32 is unclear; it is thought, however,
           that the binding of sp32 to proacrosin may be involved
           in packaging the acrosin zymogen into the acrosomal
           matrix.
          Length = 243

 Score = 30.0 bits (67), Expect = 0.66
 Identities = 25/100 (25%), Positives = 41/100 (41%), Gaps = 2/100 (2%)

Query: 99  AEV-PLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHR 157
           AEV P   T+P  E   +    S +   E      EE  +   S GG+  +  +   +  
Sbjct: 142 AEVQPTTMTLPIAEHPTITENQSFQPWPERLHNNVEELLQSSLSLGGSVQV-KAPKPKQE 200

Query: 158 HQTSFIDQEVAYMKMTSKKYRMGEDEEEKEEEGTGRVGEG 197
              S + + +   K   K+ +  ++EEE EEE     G+G
Sbjct: 201 QLLSKLQEYLQEHKTEEKQPQEEQEEEEVEEEAKQEEGQG 240


>gnl|CDD|240285 PTZ00135, PTZ00135, 60S acidic ribosomal protein P0; Provisional.
          Length = 310

 Score = 30.0 bits (68), Expect = 0.84
 Identities = 8/25 (32%), Positives = 10/25 (40%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQSRGG 144
           ++           EEEEEEE   G 
Sbjct: 282 AAAAAAAAAAAPAEEEEEEEDDMGF 306



 Score = 29.2 bits (66), Expect = 1.2
 Identities = 7/23 (30%), Positives = 9/23 (39%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGG 144
                     +EEEEEE+    G
Sbjct: 285 AAAAAAAAPAEEEEEEEDDMGFG 307


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 30.0 bits (67), Expect = 0.91
 Identities = 18/83 (21%), Positives = 30/83 (36%), Gaps = 7/83 (8%)

Query: 116  ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQ---TSFIDQEVAYMKM 172
            E+G     +E E+  +   + +EE  +G      P       H               + 
Sbjct: 4064 EDGFEENVQENEESTEDGVKSDEELEQGE----VPEDQAIDNHPKMDAKSTFASAEADEE 4119

Query: 173  TSKKYRMGEDEEEKEEEGTGRVG 195
             + K  +GE+EE  EE+G    G
Sbjct: 4120 NTDKGIVGENEELGEEDGVRGNG 4142


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins are involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 29.7 bits (67), Expect = 0.96
 Identities = 15/76 (19%), Positives = 26/76 (34%), Gaps = 10/76 (13%)

Query: 123 EEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGED 182
           E+E + EKK++E E  E+ +    A   S       +    + +              E 
Sbjct: 204 EKEVKAEKKRQELERVEEKKLEKMAPEASRL-DEMSEGLLEESDDD---------GEEES 253

Query: 183 EEEKEEEGTGRVGEGK 198
           ++E   EG     E  
Sbjct: 254 DDESAWEGFESEYEPI 269


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 29.7 bits (67), Expect = 1.1
 Identities = 12/35 (34%), Positives = 21/35 (60%)

Query: 110 GETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGG 144
            E  N E+   SE+++++K++KKE + E E   G 
Sbjct: 59  KEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGF 93


>gnl|CDD|218223 pfam04712, Radial_spoke, Radial spokehead-like protein.  This
           family includes the radial spoke head proteins RSP4 and
           RSP6 from Chlamydomonas reinhardtii, and several
           eukaryotic homologues, including mammalian RSHL1, the
           protein product of a familial ciliary dyskinesia
           candidate gene.
          Length = 481

 Score = 28.9 bits (65), Expect = 1.7
 Identities = 11/22 (50%), Positives = 17/22 (77%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
           E+E+EE+E+++EE EE E   G
Sbjct: 352 EQEDEEEEEEEEEPEEPEPEEG 373



 Score = 26.9 bits (60), Expect = 9.2
 Identities = 12/19 (63%), Positives = 17/19 (89%)

Query: 122 EEEEEEKEKKKEEEEEEEQ 140
           EEEE+E E+++EEEEE E+
Sbjct: 349 EEEEQEDEEEEEEEEEPEE 367


>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein.  The YqfQ-like protein family
           includes the B. subtilis YqfQ protein, also known as
           VrrA, which is functionally uncharacterized. This family
           of proteins is found in bacteria. Proteins in this
           family are typically between 146 and 237 amino acids in
           length. There are two conserved sequence motifs: QYGP
           and PKLY.
          Length = 155

 Score = 28.2 bits (63), Expect = 2.0
 Identities = 8/31 (25%), Positives = 13/31 (41%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNR 146
           E   + EE  +E E++   E + E      R
Sbjct: 97  EEEETEEESTDETEQEDPPETKTESKEKKKR 127



 Score = 26.6 bits (59), Expect = 6.3
 Identities = 13/61 (21%), Positives = 27/61 (44%), Gaps = 6/61 (9%)

Query: 102 PLAKTIPFGETLNLENGGSSEEEEE------EKEKKKEEEEEEEQSRGGNRALRPSTYTR 155
           PL + +P    +  E   S +EEEE      ++ ++++  E + +S+   +   P   T 
Sbjct: 76  PLVRNLPAMWKIFRELSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTE 135

Query: 156 H 156
            
Sbjct: 136 K 136


>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region.  The myc family
           belongs to the basic helix-loop-helix leucine zipper
           class of transcription factors, see pfam00010. Myc forms
           a heterodimer with Max, and this complex regulates cell
           growth through direct activation of genes involved in
           cell replication. Mutations in the C-terminal 20
           residues of this domain cause unique changes in the
           induction of apoptosis, transformation, and G2 arrest.
          Length = 329

 Score = 28.7 bits (64), Expect = 2.0
 Identities = 13/21 (61%), Positives = 18/21 (85%)

Query: 119 GSSEEEEEEKEKKKEEEEEEE 139
           GS  E EE++E+++EEEEEEE
Sbjct: 223 GSDSESEEDEEEEEEEEEEEE 243



 Score = 28.0 bits (62), Expect = 3.8
 Identities = 13/21 (61%), Positives = 18/21 (85%)

Query: 118 GGSSEEEEEEKEKKKEEEEEE 138
           G  SE EE+E+E+++EEEEEE
Sbjct: 223 GSDSESEEDEEEEEEEEEEEE 243


>gnl|CDD|234419 TIGR03965, mycofact_glyco, mycofactocin system glycosyltransferase.
            Members of this protein family are putative
           glycosyltransferases, members of pfam00535 (glycosyl
           transferase family 2). Members appear mostly in the
           Actinobacteria, where they appear to be part of a system
           for converting a precursor peptide (TIGR03969) into a
           novel redox carrier designated mycofactocin. A radical
           SAM enzyme, TIGR03962, is proposed to be a key
           maturase for mycofactocin.
          Length = 467

 Score = 28.6 bits (64), Expect = 2.1
 Identities = 14/52 (26%), Positives = 29/52 (55%), Gaps = 2/52 (3%)

Query: 14  IDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGHVFRKR 65
           + R+   ++G +D  + V  GE++++ +R+ + GG +   P + V H  R R
Sbjct: 236 VRRRALLEVGGFDERLEV--GEDVDLCWRLCEAGGRVRYEPAAVVAHDHRTR 285


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S.cerevisiae and H.sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 28.2 bits (63), Expect = 2.3
 Identities = 12/19 (63%), Positives = 16/19 (84%)

Query: 122 EEEEEEKEKKKEEEEEEEQ 140
           E+ +EE EK +EEEEEEE+
Sbjct: 166 EDVDEEDEKDEEEEEEEEE 184


>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
           Provisional.
          Length = 2849

 Score = 28.9 bits (64), Expect = 2.6
 Identities = 8/24 (33%), Positives = 18/24 (75%)

Query: 117 NGGSSEEEEEEKEKKKEEEEEEEQ 140
           +    +E+E++ +++ +EEEEEE+
Sbjct: 154 DDDDEDEDEDDDDEEDDEEEEEEE 177



 Score = 28.5 bits (63), Expect = 2.7
 Identities = 10/22 (45%), Positives = 19/22 (86%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
           +E+++++E  +EEEEEEE+ +G
Sbjct: 161 DEDDDDEEDDEEEEEEEEEIKG 182



 Score = 28.5 bits (63), Expect = 3.0
 Identities = 9/27 (33%), Positives = 18/27 (66%)

Query: 114 NLENGGSSEEEEEEKEKKKEEEEEEEQ 140
           N       E+E+E+ + ++++EEEEE+
Sbjct: 150 NFVIDDDDEDEDEDDDDEEDDEEEEEE 176



 Score = 27.3 bits (60), Expect = 7.5
 Identities = 8/21 (38%), Positives = 16/21 (76%)

Query: 122 EEEEEEKEKKKEEEEEEEQSR 142
           +E+E+E +  +E++EEEE+  
Sbjct: 157 DEDEDEDDDDEEDDEEEEEEE 177


>gnl|CDD|117592 pfam09026, Cenp-B_dimeris, Centromere protein B dimerisation
           domain.  The centromere protein B (CENP-B) dimerisation
           domain is composed of two alpha-helices, which are
           folded into an antiparallel configuration. Dimerisation
           of CENP-B is mediated by this domain, in which monomers
           dimerise to form a symmetrical, antiparallel, four-helix
           bundle structure with a large hydrophobic patch in which
           23 residues of one monomer form van der Waals contacts
           with the other monomer. This CENP-B dimer configuration
           may be suitable for capturing two distant CENP-B boxes
           during centromeric heterochromatin formation.
          Length = 101

 Score = 27.4 bits (60), Expect = 2.6
 Identities = 10/36 (27%), Positives = 20/36 (55%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPS 151
           E+  S  +EEE+ + + EE+++E+     +    PS
Sbjct: 10  EDSDSDSDEEEDDDDEDEEDDDEDDDEDDDEVPVPS 45


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 28.4 bits (64), Expect = 2.6
 Identities = 16/111 (14%), Positives = 39/111 (35%), Gaps = 13/111 (11%)

Query: 46  CGGSLEIV-----PCSRVGHVFRKRHPYTFPGGSGNVFARNTRRAAEVWMDNYKHYYYAE 100
           CGG +++          + +V     P+       NVF  +      ++ D        +
Sbjct: 60  CGGMVDLEEFLQLDEDVIVYVIDSHRPWNL----DNVFGSDQVV---IFDDGDIEEELQD 112

Query: 101 VPLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPS 151
            P      + +    ++     +EE+E+  K E++E+++     +      
Sbjct: 113 EPRYDDA-YRDLEEDDDDDEESDEEDEESSKSEDDEDDDDDDDDDDIATRE 162


>gnl|CDD|217049 pfam02459, Adeno_terminal, Adenoviral DNA terminal protein.  This
           protein is covalently attached to the termini of
           replicating DNA in vivo.
          Length = 548

 Score = 28.5 bits (64), Expect = 2.7
 Identities = 15/30 (50%), Positives = 20/30 (66%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQSRGGNRALR 149
             EEEEEE+  ++EEEEEEE+ R     +R
Sbjct: 306 EEEEEEEEEVPEEEEEEEEEEERTFEEEVR 335



 Score = 26.9 bits (60), Expect = 7.3
 Identities = 14/22 (63%), Positives = 17/22 (77%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
             EEEEEE+E  +EEEEEEE+ 
Sbjct: 305 PEEEEEEEEEVPEEEEEEEEEE 326


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 28.3 bits (63), Expect = 2.9
 Identities = 20/69 (28%), Positives = 36/69 (52%), Gaps = 4/69 (5%)

Query: 121 SEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMG 180
           +EEE E K+KK+E+ +E+E  +   +A +     + + Q +     V   K + KK R  
Sbjct: 14  TEEELERKKKKEEKAKEKELKK--LKAAQKEAKAKLQAQQASDGTNVP--KKSEKKSRKR 69

Query: 181 EDEEEKEEE 189
           + E+E  E+
Sbjct: 70  DVEDENPED 78


>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein.  This is a family of
           fungal and plant proteins and contains many hypothetical
           proteins. VID27 is a cytoplasmic protein that plays a
           potential role in vacuolar protein degradation.
          Length = 794

 Score = 28.2 bits (63), Expect = 3.5
 Identities = 9/30 (30%), Positives = 17/30 (56%)

Query: 114 NLENGGSSEEEEEEKEKKKEEEEEEEQSRG 143
           +       EEEE+E+E+++E+E+E      
Sbjct: 383 DANTERDDEEEEDEEEEEEEDEDEGPSKEH 412



 Score = 27.8 bits (62), Expect = 4.2
 Identities = 10/30 (33%), Positives = 23/30 (76%)

Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQ 140
             L +E+  +  ++EEE+++++EEEE+E++
Sbjct: 377 SALEIEDANTERDDEEEEDEEEEEEEDEDE 406



 Score = 27.0 bits (60), Expect = 7.7
 Identities = 11/33 (33%), Positives = 21/33 (63%)

Query: 110 GETLNLENGGSSEEEEEEKEKKKEEEEEEEQSR 142
              +   N    +EEEE++E+++EE+E+E  S+
Sbjct: 378 ALEIEDANTERDDEEEEDEEEEEEEDEDEGPSK 410


>gnl|CDD|222440 pfam13897, GOLD_2, Golgi-dynamics membrane-trafficking.  Sec14-like
           Golgi-trafficking domain. The GOLD domain is always found
           combined with lipid- or membrane-association domains.
          Length = 136

 Score = 27.0 bits (60), Expect = 4.2
 Identities = 11/38 (28%), Positives = 21/38 (55%)

Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRAL 148
            ++++      EEEEE +E++ E  + E  S+  +R L
Sbjct: 47  VSVHVSESSDEEEEEEAEEEEAETGDVEAGSKSQSRPL 84


>gnl|CDD|206063 pfam13892, DBINO, DNA-binding domain.  DBINO is a DNA-binding
           domain found on global transcription activator SNF2L1
           proteins and chromatin re-modelling proteins.
          Length = 140

 Score = 26.8 bits (60), Expect = 4.3
 Identities = 9/20 (45%), Positives = 12/20 (60%)

Query: 123 EEEEEKEKKKEEEEEEEQSR 142
            E+E  E+ K+EEE  E  R
Sbjct: 92  AEKEALEQAKKEEELREAKR 111



 Score = 26.5 bits (59), Expect = 7.5
 Identities = 10/23 (43%), Positives = 13/23 (56%), Gaps = 2/23 (8%)

Query: 122 EEEEEEKEKKKEEEEEEE--QSR 142
            E+E  ++ KKEEE  E   Q R
Sbjct: 92  AEKEALEQAKKEEELREAKRQQR 114


>gnl|CDD|118278 pfam09746, Membralin, Tumour-associated protein.  Membralin is
           evolutionarily highly conserved; though it seems to
           represent a unique protein family. The protein appears
           to contain several transmembrane regions. In humans it
           is expressed in certain cancers, particularly ovarian
           cancers. Membralin-like gene homologues have been
           identified in plants including grape, cotton and tomato.
          Length = 375

 Score = 27.5 bits (61), Expect = 4.5
 Identities = 11/79 (13%), Positives = 25/79 (31%), Gaps = 6/79 (7%)

Query: 113 LNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRAL--RPSTYTRHRHQTSFIDQEVAYM 170
              ++       +  +     E  E  Q+  G   L  RP+    +     + D E    
Sbjct: 105 SLKDSYYYGIGPQTRQN---HETLERYQNILGKLGLPVRPTFAYSNESLYYYFDAENILD 161

Query: 171 KMT-SKKYRMGEDEEEKEE 188
             +      +  +E ++E+
Sbjct: 162 TYSHPNAISLKNEEWDEEQ 180


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 28.0 bits (62), Expect = 4.6
 Identities = 22/68 (32%), Positives = 33/68 (48%), Gaps = 10/68 (14%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGE 181
           + EEEE+ K++E E +   SR  + +   ST        S   QE       S++    E
Sbjct: 397 DTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMAS---QE-------SEEEESVE 446

Query: 182 DEEEKEEE 189
           +EEE+EEE
Sbjct: 447 EEEEEEEE 454



 Score = 27.2 bits (60), Expect = 6.3
 Identities = 15/39 (38%), Positives = 23/39 (58%)

Query: 102 PLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQ 140
           P   +    E  ++E     EEEEEE+E++ EEEE E++
Sbjct: 432 PSMASQESEEEESVEEEEEEEEEEEEEEQESEEEEGEDE 470



 Score = 27.2 bits (60), Expect = 6.5
 Identities = 12/22 (54%), Positives = 18/22 (81%)

Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
           +S+E EEE+  ++EEEEEEE+ 
Sbjct: 435 ASQESEEEESVEEEEEEEEEEE 456


>gnl|CDD|214487 smart00046, DAGKc, Diacylglycerol kinase catalytic domain
          (presumed).  Diacylglycerol (DAG) is a second messenger
          that acts as a protein kinase C activator. DAG can be
          produced from the hydrolysis of phosphatidylinositol
          4,5-bisphosphate (PIP2) by a phosphoinositide-specific
          phospholipase C and by the degradation of
          phosphatidylcholine (PC) by a phospholipase C or the
          concerted actions of phospholipase D and phosphatidate
          phosphohydrolase. This domain is presumed to be the
          catalytic domain. Bacterial homologues are known.
          Length = 124

 Score = 26.9 bits (60), Expect = 4.8
 Identities = 6/14 (42%), Positives = 8/14 (57%)

Query: 70 FPGGSGNVFARNTR 83
           P G+GN  AR+  
Sbjct: 84 LPLGTGNDLARSLG 97


>gnl|CDD|216116 pfam00781, DAGK_cat, Diacylglycerol kinase catalytic domain.
          Diacylglycerol (DAG) is a second messenger that acts as
          a protein kinase C activator. The catalytic domain is
          assumed from the finding of bacterial homologues. YegS
          is the Escherichia coli protein in this family whose
          crystal structure reveals an active site in the
          inter-domain cleft formed by four conserved sequence
          motifs, revealing a novel metal-binding site. The
          residues of this site are conserved across the family.
          Length = 127

 Score = 26.9 bits (60), Expect = 5.1
 Identities = 7/11 (63%), Positives = 8/11 (72%)

Query: 70 FPGGSGNVFAR 80
           P G+GN FAR
Sbjct: 88 IPLGTGNDFAR 98


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 27.5 bits (61), Expect = 5.3
 Identities = 14/44 (31%), Positives = 29/44 (65%), Gaps = 1/44 (2%)

Query: 99  AEVPLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQSR 142
           ++ P AKT P  E  N E+G   E+E+E+ +++K++++E+ +  
Sbjct: 85  SKGPAAKTKPAKEPKN-ESGKEEEKEKEQVKEEKKKKKEKPKEE 127


>gnl|CDD|239572 cd03490, Topoisomer_IB_N_1, Topoisomer_IB_N_1: A subgroup of the
           N-terminal DNA binding fragment found in eukaryotic DNA
           topoisomerase (topo) IB. Topo IB proteins include the
           monomeric yeast and human topo I and heterodimeric topo
           I from Leishmania donovani. Topo I enzymes are divided
           into:  topo type IA (bacterial) and type IB
           (eukaryotic). Topo I relaxes superhelical tension in
           duplex DNA by creating a single-strand nick; the broken
           strand can then rotate around the unbroken strand to
           remove DNA supercoils, and the nick is religated,
           liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit religation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  In addition to differences in
           structure and some biochemical properties,
           Trypanosomatid parasite topos I differ from human topo I
           in their sensitivity to CPTs and other classical topo I
           inhibitors. Trypanosomatid topos I have putative roles
           in organizing the kinetoplast DNA network unique to
           these parasites.  This family may represent more than
           one structural domain.
          Length = 217

 Score = 27.2 bits (60), Expect = 5.4
 Identities = 9/22 (40%), Positives = 16/22 (72%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
           EE+E++K   KEE+E +++ R 
Sbjct: 95  EEKEKKKNLNKEEKEAKKKERA 116


>gnl|CDD|100108 cd04411, Ribosomal_P1_P2_L12p, Ribosomal protein P1, P2, and L12p.
           Ribosomal proteins P1 and P2 are the eukaryotic proteins
           that are functionally equivalent to bacterial L7/L12.
           L12p is the archaeal homolog. Unlike other ribosomal
           proteins, the archaeal L12p and eukaryotic P1 and P2 do
           not share sequence similarity with their bacterial
           counterparts. They are part of the ribosomal stalk
           (called the L7/L12 stalk in bacteria), along with 28S
           rRNA and the proteins L11 and P0 in eukaryotes (23S
           rRNA, L11, and L10e in archaea). In bacterial ribosomes,
           L7/L12 homodimers bind the extended C-terminal helix of
           L10 to anchor the L7/L12 molecules to the ribosome.
           Eukaryotic P1/P2 heterodimers and archaeal L12p
           homodimers are believed to bind the L10 equivalent
           proteins, eukaryotic P0 and archaeal L10e, in a similar
           fashion. P1 and P2 (L12p, L7/L12) are the only proteins
           in the ribosome to occur as multimers, always appearing
           as sets of dimers. Recent data indicate that most
           archaeal species contain six copies of L12p (three
           homodimers), while eukaryotes have two copies each of P1
           and P2 (two heterodimers). Bacteria may have four or six
           copies (two or three homodimers), depending on the
           species. As in bacteria, the stalk is crucial for
           binding of initiation, elongation, and release factors
           in eukaryotes and archaea.
          Length = 105

 Score = 26.5 bits (58), Expect = 5.6
 Identities = 9/22 (40%), Positives = 16/22 (72%)

Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
              +E+ EE KE+++EEE+E+ 
Sbjct: 79  AEPAEKAEEAKEEEEEEEDEDF 100


>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
           subunit 1; Provisional.
          Length = 319

 Score = 27.3 bits (61), Expect = 5.6
 Identities = 10/32 (31%), Positives = 21/32 (65%)

Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQSR 142
           E    E+  S  E+E+E+++ +EEEE++++  
Sbjct: 287 EEEEEEDDYSESEDEDEEDEDEEEEEDDDEGD 318


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 27.7 bits (62), Expect = 5.7
 Identities = 10/33 (30%), Positives = 16/33 (48%)

Query: 114 NLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNR 146
            L +GG   + +E+   K E + E +Q R   R
Sbjct: 578 ALFSGGEETKPQEQPAPKAEAKPERQQDRRKPR 610


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 27.4 bits (61), Expect = 5.9
 Identities = 14/20 (70%), Positives = 16/20 (80%)

Query: 120 SSEEEEEEKEKKKEEEEEEE 139
             EEEEEEKE+KKEEEE+  
Sbjct: 36  PDEEEEEEKEEKKEEEEKTT 55


>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin.  Nucleoplasmins are also
           known as chromatin decondensation proteins. They bind to
           core histones and transfer DNA to them in a reaction
           that requires ATP. This is thought to play a role in the
           assembly of regular nucleosomal arrays.
          Length = 146

 Score = 26.5 bits (59), Expect = 6.2
 Identities = 10/29 (34%), Positives = 21/29 (72%)

Query: 113 LNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
           +  E   S ++EE+E+E+  EE+++E++S
Sbjct: 107 VASEEDESDDDEEDEEEEDDEEDDDEDES 135



 Score = 26.5 bits (59), Expect = 6.4
 Identities = 11/26 (42%), Positives = 18/26 (69%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQS 141
           ++    EEEE+++E   E+E EEE+S
Sbjct: 115 DDDEEDEEEEDDEEDDDEDESEEEES 140


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 27.2 bits (61), Expect = 6.5
 Identities = 10/21 (47%), Positives = 17/21 (80%)

Query: 122 EEEEEEKEKKKEEEEEEEQSR 142
           EE +E+KE+KK+EE E + ++
Sbjct: 282 EEAQEKKEEKKKEEREAKLAK 302


>gnl|CDD|235302 PRK04456, PRK04456, acetyl-CoA decarbonylase/synthase complex
           subunit beta; Reviewed.
          Length = 463

 Score = 27.3 bits (61), Expect = 6.6
 Identities = 11/19 (57%), Positives = 16/19 (84%)

Query: 121 SEEEEEEKEKKKEEEEEEE 139
           + EEEEE+E+++EEEEE  
Sbjct: 403 AAEEEEEEEEEEEEEEEPV 421


>gnl|CDD|204985 pfam12624, Chorein_N, N-terminal region of Chorein, a TM
          vesicle-mediated sorter.  Although mutations in the
          full-length vacuolar protein sorting 13A (VPS13A)
          protein in vertebrates lead to the disease of
          chorea-acanthocytosis, the exact function of any of the
          regions within the protein is not yet known. This
          region is the proposed leucine zipper at the
          N-terminus. The full-length protein is a transmembrane
          protein with a presumed role in vesicle-mediated
          sorting and intracellular protein transport.
          Length = 117

 Score = 26.4 bits (59), Expect = 6.7
 Identities = 10/27 (37%), Positives = 16/27 (59%), Gaps = 4/27 (14%)

Query: 17 KYFEKLGKYDMAMSVWGG----ENLEI 39
          +Y E L K  +++S+W G    ENL +
Sbjct: 15 EYVENLDKEQLSVSIWSGDVELENLRL 41


>gnl|CDD|227499 COG5171, YRB1, Ran GTPase-activating protein (Ran-binding protein)
           [Intracellular trafficking and secretion].
          Length = 211

 Score = 26.9 bits (59), Expect = 6.7
 Identities = 16/70 (22%), Positives = 28/70 (40%)

Query: 123 EEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGED 182
           ++    E    E +E +     N    P    +  H  +  + E    K  +K +R  E+
Sbjct: 46  QQSPFLENAVPEGDEGKGPESPNIHFEPVVELQRVHLKTNEEDETVLFKARAKLFRFDEE 105

Query: 183 EEEKEEEGTG 192
            +E +E GTG
Sbjct: 106 AKEWKERGTG 115


>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
           Members of this family are bacterial proteins with a
           conserved motif [KR]FYDLN, sometimes flanked by a pair
           of CXXC motifs, followed by a long region of low
           complexity sequence in which roughly half the residues
           are Asp and Glu, including multiple runs of five or more
           acidic residues. The function of members of this family
           is unknown.
          Length = 104

 Score = 26.1 bits (58), Expect = 6.7
 Identities = 10/32 (31%), Positives = 17/32 (53%), Gaps = 2/32 (6%)

Query: 110 GETLNLENG--GSSEEEEEEKEKKKEEEEEEE 139
           GE +  E     +   + E+  KK E+EE+E+
Sbjct: 33  GEEVPPEVAKSRAPAADAEDAAKKDEDEEDED 64


>gnl|CDD|133433 cd05297, GH4_alpha_glucosidase_galactosidase, Glycoside Hydrolases
           Family 4; Alpha-glucosidases and alpha-galactosidases.
           Glucosidases cleave glycosidic bonds to release glucose
           from oligosaccharides. Alpha-glucosidases and
           alpha-galactosidases release alpha-D-glucose and
           alpha-D-galactose, respectively, via the hydrolysis of
           alpha-glycopyranoside bonds. Some bacteria
           simultaneously translocate and phosphorylate
           disaccharides via the phosphoenolpyruvate-dependent
           phosphotransferase system (PEP-PTS). After
           translocation, these phospho-disaccharides may be
           hydrolyzed by the GH4 glycoside hydrolases such as the
           alpha-glucosidases. Other organisms (such as archaea
           and Thermotoga maritima) lack the PEP-PTS system, but
           have several enzymes normally associated with the
           PEP-PTS operon. Alpha-glucosidases and
           alpha-galactosidases are part of the NAD(P)-binding
           Rossmann fold superfamily, which includes a wide variety
           of protein families including the NAD(P)-binding domains
           of alcohol dehydrogenases, tyrosine-dependent
           oxidoreductases, glyceraldehyde-3-phosphate
           dehydrogenases, formate/glycerate dehydrogenases,
           siroheme synthases, 6-phosphogluconate dehydrogenases,
           amino acid dehydrogenases, repressor rex, and NAD-binding
           potassium channel domains, among others.
          Length = 423

 Score = 27.1 bits (61), Expect = 7.7
 Identities = 17/83 (20%), Positives = 32/83 (38%), Gaps = 12/83 (14%)

Query: 82  TRRAAEVWMDNYKHYYYAEVPLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
           +   +E     Y  +Y  E    K I +GE    E GG  EE+  E  +++ +    E  
Sbjct: 248 SEHLSE-----YVPHYRKE---TKKIWYGEFNEDEYGGRDEEQGWEWYEERLKLILAEID 299

Query: 142 RGGNRALRPSTYTRHRHQTSFID 164
           +     ++ S      + +  I+
Sbjct: 300 KEELDPVKRS----GEYASPIIE 318


>gnl|CDD|218652 pfam05602, CLPTM1, Cleft lip and palate transmembrane protein 1
           (CLPTM1).  This family consists of several eukaryotic
           cleft lip and palate transmembrane protein 1 sequences.
           Cleft lip with or without cleft palate is a common birth
           defect that is genetically complex. The nonsyndromic
           forms have been studied genetically using linkage and
           candidate-gene association studies with only partial
           success in defining the loci responsible for orofacial
           clefting. CLPTM1 encodes a transmembrane protein and has
           strong homology to two Caenorhabditis elegans genes,
           suggesting that CLPTM1 may belong to a new gene family.
           This family also contains the human cisplatin resistance
           related protein CRR9p which is associated with
           CDDP-induced apoptosis.
          Length = 437

 Score = 26.9 bits (60), Expect = 7.7
 Identities = 12/71 (16%), Positives = 26/71 (36%), Gaps = 12/71 (16%)

Query: 75  GNVFAR--------NTRRAAEVWMDNYKHYYYAEVPLAKTIPF--GETLNLENGGSSEEE 124
           G+++          +     + +        +   PL   +P    +  NL  G S +EE
Sbjct: 110 GSLYLHVYLGLSGYSLDPTDKGYDSGKA--VHFVFPLTTYLPKKKVKKKNLLGGKSEKEE 167

Query: 125 EEEKEKKKEEE 135
            EE++    ++
Sbjct: 168 PEEEKTPAPDK 178


>gnl|CDD|227602 COG5277, COG5277, Actin and related proteins [Cytoskeleton].
          Length = 444

 Score = 27.0 bits (60), Expect = 8.0
 Identities = 10/50 (20%), Positives = 17/50 (34%)

Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMK 171
           EE EEE+EK  E+  E         ++   +      +      E  +  
Sbjct: 248 EEFEEEEEKPAEKSTESTFQLSKETSIAKESKELPDGEEIEFGNEERFKA 297


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.  This
           family represents the C-terminus (approximately 300
           residues) of proteins that are involved as binding
           partners for Prp19 as part of the nuclear pore complex.
           The family in Drosophila is necessary for pre-mRNA
           splicing, and the human protein has been found in
           purifications of the spliceosome. In the past this
           family was thought, erroneously, to be associated with
           microfibrillin.
          Length = 277

 Score = 26.8 bits (59), Expect = 8.2
 Identities = 23/79 (29%), Positives = 34/79 (43%), Gaps = 4/79 (5%)

Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYM 170
           E L LE    S EEEEE+     EEEEE  S           +TR + + +  ++E    
Sbjct: 3   EVLELEEEDESGEEEEEE----SEEEEETDSEDDMEPRLKPVFTRKKDRITIQEREREAA 58

Query: 171 KMTSKKYRMGEDEEEKEEE 189
           K  + +       EE++ E
Sbjct: 59  KEKALEEEAKRKAEERKRE 77


>gnl|CDD|222792 PHA00435, PHA00435, capsid assembly protein.
          Length = 306

 Score = 26.7 bits (59), Expect = 8.4
 Identities = 17/44 (38%), Positives = 23/44 (52%), Gaps = 4/44 (9%)

Query: 102 PLAKTIPFGE----TLNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
           P     PFGE     + +      EEEE E+ ++ EEEE EE+S
Sbjct: 56  PYGNPDPFGEDDEGRIEVRISEDGEEEEVEEGEEDEEEEGEEES 99


>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1.  All
           proteins in this family for which functions are known
           are cyclin dependent protein kinases that are components
           of TFIIH, a complex that is involved in nucleotide
           excision repair and transcription initiation. Also known
           as MAT1 (menage a trois 1). This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 309

 Score = 26.7 bits (59), Expect = 8.4
 Identities = 16/69 (23%), Positives = 33/69 (47%)

Query: 123 EEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGED 182
           E+EEE++++   ++EEE+ +   R  + +        T    + +A  K  S K  M  +
Sbjct: 154 EKEEEEQRRLLLQKEEEEQQMNKRKNKQALLDELETSTLPAAELIAQHKKNSVKLEMQVE 213

Query: 183 EEEKEEEGT 191
           + + E+  T
Sbjct: 214 KPKPEKPNT 222


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 26.6 bits (59), Expect = 8.7
 Identities = 22/90 (24%), Positives = 39/90 (43%), Gaps = 15/90 (16%)

Query: 115 LENGGS----SEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQT---------- 160
           +E+G +    +++ +EE+E++ E+E EEE +    + L   T    R             
Sbjct: 100 VESGATRNYEADKLDEEQEERVEKEREEELAGDAMKKLENRTADSKREMEVLERLEELKE 159

Query: 161 -SFIDQEVAYMKMTSKKYRMGEDEEEKEEE 189
                 +V    M    +R  + EEE+EEE
Sbjct: 160 LQSRRADVDVNSMLEALFRREKKEEEEEEE 189


>gnl|CDD|222648 pfam14283, DUF4366, Domain of unknown function (DUF4366).  This
           family of proteins is found in bacteria and eukaryotes.
           Proteins in this family are typically between 227 and
           387 amino acids in length.
          Length = 213

 Score = 26.5 bits (59), Expect = 8.8
 Identities = 10/28 (35%), Positives = 13/28 (46%)

Query: 121 SEEEEEEKEKKKEEEEEEEQSRGGNRAL 148
           +E    E E + E EEE E+  G    L
Sbjct: 135 TECTGPEPEPEPEPEEEPEKKSGMGPLL 162


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
           This is a family of fungal proteins of unknown function.
          Length = 182

 Score = 26.2 bits (58), Expect = 9.2
 Identities = 7/45 (15%), Positives = 23/45 (51%)

Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQT 160
           +     +++ E+K++K+ E++ E+ ++  +  L   +  + R   
Sbjct: 101 KKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLSELKPRKYA 145


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3i.
          Length = 431

 Score = 26.6 bits (59), Expect = 9.3
 Identities = 9/37 (24%), Positives = 19/37 (51%)

Query: 115 LENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPS 151
           L+     E +E+E E++++  +E E+  G +     S
Sbjct: 362 LDPIDFEEVDEDEDEEEEQRSDEHEEEEGEDSEEEGS 398


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 26.6 bits (59), Expect = 9.5
 Identities = 8/22 (36%), Positives = 16/22 (72%)

Query: 121 SEEEEEEKEKKKEEEEEEEQSR 142
            E  EE++ +++E+  EEE++R
Sbjct: 156 KERAEEKEREEEEKAAEEEKAR 177


>gnl|CDD|204614 pfam11221, Med21, Subunit 21 of Mediator complex.  Med21 has been
           known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in
           Drosophila. The heterodimer of the two subunits Med7 and
           Med21 appears to act as a hinge between the middle and
           the tail regions of Mediator.
          Length = 132

 Score = 25.7 bits (57), Expect = 9.9
 Identities = 12/25 (48%), Positives = 15/25 (60%), Gaps = 1/25 (4%)

Query: 119 GSSEEEEEEKEKKKEEE-EEEEQSR 142
            SSEEE+  + K+ EEE  E E  R
Sbjct: 84  ESSEEEQLRRIKELEEELREVEAER 108


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.313    0.132    0.391 

Gapped
Lambda     K      H
   0.267   0.0812    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 10,627,356
Number of extensions: 999857
Number of successful extensions: 4512
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3626
Number of HSP's successfully gapped: 358
Length of query: 199
Length of database: 10,937,602
Length adjustment: 92
Effective length of query: 107
Effective length of database: 6,857,034
Effective search space: 733702638
Effective search space used: 733702638
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 56 (25.2 bits)