RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy2085
         (358 letters)



>gnl|CDD|212055 cd11486, SLC5sbd_SGLT1, Na(+)/glucose cotransporter SGLT1; solute
           binding domain.  Human SGLT1 (hSGLT1) is a
           high-affinity/low-capacity glucose transporter, which
           can also transport galactose. In the transport
           mechanism, two Na+ ions first bind to the extracellular
           side of the transporter and induce a conformational
           change in the glucose binding site. This results in an
           increased affinity for glucose. A second conformational
           change in the transporter follows, bringing the Na+ and
           glucose binding sites to the inner surface of the
           membrane. Glucose is then released, followed by the Na+
           ions. In the process, hSGLT1 is also able to transport
           water and urea and may be a major pathway for transport
           of these across the intestinal brush-border membrane.
           hSGLT1 is encoded by the SLC5A1 gene and expressed
           mostly in the intestine, but also in the trachea,
           kidney, heart, brain, testis, and prostate. The
           WHO/UNICEF oral rehydration solution (ORS) for the
           treatment of secretory diarrhea contains salt and
           glucose. The glucose, along with sodium ions, is
           transported by hSGLT1 and water is either co-transported
           along with these or follows by osmosis. Mutations in
           SGLT1 are associated with intestinal glucose galactose
           malabsorption (GGM). Up-regulation of intestinal SGLT1
           may protect against enteric infections. SGLT1 is
           expressed in colorectal, head and neck, and prostate
           tumors. Epidermal growth factor receptor (EGFR)
           functions in cell survival by stabilizing SGLT1, and
           thereby maintaining intracellular glucose levels. SGLT1
           is predicted to have 14 membrane-spanning regions. This
           subgroup belongs to the solute carrier 5
           (SLC5) transporter family.
          Length = 635

 Score = 40.6 bits (95), Expect = 0.001
 Identities = 23/81 (28%), Positives = 33/81 (40%), Gaps = 13/81 (16%)

Query: 145 CISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCGFL 204
           C S    TEE+    + D + +E+E E E +E            +K YN      +CGF 
Sbjct: 534 CWSLRNSTEERIDLDADDWTEDEDENEMETDEERKKPGCC----RKAYNW-----FCGFD 584

Query: 205 DN----LTEELEENIIENVYD 221
                 LTEE E  +   + D
Sbjct: 585 QGKAPKLTEEEEAALKMKMTD 605


>gnl|CDD|203864 pfam08159, NUC153, NUC153 domain.  This small domain is found in a
           novel nucleolar family.
          Length = 30

 Score = 33.6 bits (78), Expect = 0.006
 Identities = 11/20 (55%), Positives = 15/20 (75%), Gaps = 1/20 (5%)

Query: 335 DDRFSKLFENPDFQVIEPSS 354
           D RF  +FE+PDF  I+P+S
Sbjct: 1   DPRFKAMFEDPDFA-IDPTS 19


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca2+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 38.4 bits (89), Expect = 0.006
 Identities = 16/31 (51%), Positives = 23/31 (74%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           D+EE+E +   +E  EEEEEEEEEEE  +++
Sbjct: 862 DSEEEEEEEEEEEEEEEEEEEEEEEEEENEE 892



 Score = 37.3 bits (86), Expect = 0.014
 Identities = 14/32 (43%), Positives = 20/32 (62%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
                +G  S +E  EEEEEEEEEEE  ++++
Sbjct: 854 GGGGSDGGDSEEEEEEEEEEEEEEEEEEEEEE 885



 Score = 36.1 bits (83), Expect = 0.027
 Identities = 15/31 (48%), Positives = 22/31 (70%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSD 179
           + + EE+E +   +E  EEEEEEEEEEE+ +
Sbjct: 862 DSEEEEEEEEEEEEEEEEEEEEEEEEEENEE 892



 Score = 35.7 bits (82), Expect = 0.036
 Identities = 16/30 (53%), Positives = 20/30 (66%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEE 176
           S E + EE+E +   +E  EEEEEEEE EE
Sbjct: 863 SEEEEEEEEEEEEEEEEEEEEEEEEEENEE 892



 Score = 34.6 bits (79), Expect = 0.082
 Identities = 16/40 (40%), Positives = 22/40 (55%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKK 190
           + EE+E +   +E  EEEEEEEEEE        W +  +K
Sbjct: 865 EEEEEEEEEEEEEEEEEEEEEEEEENEEPLSLEWPETRQK 904



 Score = 33.8 bits (77), Expect = 0.15
 Identities = 14/34 (41%), Positives = 22/34 (64%)

Query: 154 EKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
           +  G +   +S EEEEEEEEEEE  ++++   +E
Sbjct: 853 DGGGGSDGGDSEEEEEEEEEEEEEEEEEEEEEEE 886



 Score = 33.4 bits (76), Expect = 0.20
 Identities = 15/40 (37%), Positives = 23/40 (57%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
           +E+  +   G    D   EEEEEEEEEEE  ++++   +E
Sbjct: 848 DEKGVDGGGGSDGGDSEEEEEEEEEEEEEEEEEEEEEEEE 887



 Score = 33.0 bits (75), Expect = 0.26
 Identities = 15/31 (48%), Positives = 19/31 (61%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESS 178
            E + EE+E +   +E  EEEEEEE EE  S
Sbjct: 865 EEEEEEEEEEEEEEEEEEEEEEEEENEEPLS 895



 Score = 32.7 bits (74), Expect = 0.36
 Identities = 16/44 (36%), Positives = 25/44 (56%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNL 194
           D         S+E  EEEEEEEEEEE  ++++   +E ++  +L
Sbjct: 853 DGGGGSDGGDSEEEEEEEEEEEEEEEEEEEEEEEEEENEEPLSL 896



 Score = 32.3 bits (73), Expect = 0.46
 Identities = 17/46 (36%), Positives = 26/46 (56%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGP 196
           D +  +G   SD    EEEEEEEEEE  ++++   +E ++  N  P
Sbjct: 848 DEKGVDGGGGSDGGDSEEEEEEEEEEEEEEEEEEEEEEEEEENEEP 893



 Score = 30.3 bits (68), Expect = 2.0
 Identities = 14/31 (45%), Positives = 19/31 (61%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSD 179
           E + EE+E +   +E  EEEEEE EE  S +
Sbjct: 867 EEEEEEEEEEEEEEEEEEEEEEENEEPLSLE 897


>gnl|CDD|215914 pfam00428, Ribosomal_60s, 60s Acidic ribosomal protein.  This
           family includes archaebacterial L12, eukaryotic P0, P1
           and P2.
          Length = 88

 Score = 35.3 bits (82), Expect = 0.007
 Identities = 13/23 (56%), Positives = 19/23 (82%)

Query: 159 ASSDESSEEEEEEEEEEESSDDD 181
           A++  ++EEE++EEEEEE  DDD
Sbjct: 60  AAAAAAAEEEKKEEEEEEEEDDD 82



 Score = 33.4 bits (77), Expect = 0.026
 Identities = 12/24 (50%), Positives = 17/24 (70%)

Query: 157 GKASSDESSEEEEEEEEEEESSDD 180
             A++  + EE++EEEEEEE  DD
Sbjct: 59  AAAAAAAAEEEKKEEEEEEEEDDD 82



 Score = 28.0 bits (63), Expect = 2.2
 Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 154 EKEGKASSDESSEEEEEEEEEEE 176
                A+++E  +EEEEEEEE++
Sbjct: 59  AAAAAAAAEEEKKEEEEEEEEDD 81


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 37.4 bits (87), Expect = 0.009
 Identities = 17/48 (35%), Positives = 31/48 (64%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYN 193
           + +E + EEKE K   +E + ++EEE +EEE  ++ +  TK++K+T  
Sbjct: 35  VPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTT 82



 Score = 29.4 bits (66), Expect = 3.3
 Identities = 12/40 (30%), Positives = 23/40 (57%)

Query: 152 TEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
             EKE     +E  +EE++EEEE+ +  +++   +E K+ 
Sbjct: 30  EVEKEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEE 69


>gnl|CDD|224969 COG2058, RPP1A, Ribosomal protein L12E/L44/L45/RPP1/RPP2
           [Translation, ribosomal structure and biogenesis].
          Length = 109

 Score = 34.7 bits (80), Expect = 0.016
 Identities = 16/35 (45%), Positives = 20/35 (57%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           +        E  A +DE+ EEE+EEE EEES DD 
Sbjct: 69  AAAAAAAGAEAAAEADEAEEEEKEEEAEEESDDDM 103



 Score = 27.4 bits (61), Expect = 4.5
 Identities = 11/29 (37%), Positives = 16/29 (55%)

Query: 155 KEGKASSDESSEEEEEEEEEEESSDDDQA 183
             G  ++ E+ E EEEE+EEE   + D  
Sbjct: 74  AAGAEAAAEADEAEEEEKEEEAEEESDDD 102


>gnl|CDD|100109 cd05831, Ribosomal_P1, Ribosomal protein P1. This subfamily
           represents the eukaryotic large ribosomal protein P1.
           Eukaryotic P1 and P2 are functionally equivalent to the
           bacterial protein L7/L12, but are not homologous to
           L7/L12. P1 is located in the L12 stalk, with proteins
           P2, P0, L11, and 28S rRNA. P1 and P2 are the only
           proteins in the ribosome to occur as multimers, always
           appearing as sets of heterodimers. Recent data indicate
           that eukaryotes have four copies (two heterodimers),
           while most archaeal species contain six copies of L12p
           (three homodimers) and bacteria may have four or six
           copies (two or three homodimers), depending on the
           species. Experiments using S. cerevisiae P1 and P2
           indicate that P1 proteins are positioned more internally
           with limited reactivity in the C-terminal domains, while
           P2 proteins seem to be more externally located and are
           more likely to interact with other cellular components.
           In lower eukaryotes, P1 and P2 are further subdivided
           into P1A, P1B, P2A, and P2B, which form P1A/P2B and
           P1B/P2A heterodimers. Some plant species have a third
           P-protein, called P3, which is not homologous to P1 and
           P2. In humans, P1 and P2 are strongly autoimmunogenic.
           They play a significant role in the etiology and
           pathogenesis of systemic lupus erythematosus (SLE). In
           addition, the ribosome-inactivating protein
           trichosanthin (TCS) interacts with human P0, P1, and P2,
           with its primary binding site located in the C-terminal
           region of P2. TCS inactivates the ribosome by
           depurinating a specific adenine in the sarcin-ricin loop
           of 28S rRNA.
          Length = 103

 Score = 34.2 bits (79), Expect = 0.020
 Identities = 12/25 (48%), Positives = 19/25 (76%)

Query: 157 GKASSDESSEEEEEEEEEEESSDDD 181
             A++   +++EE++EEEEE SDDD
Sbjct: 73  AAAAAAAEAKKEEKKEEEEEESDDD 97


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
           This is a family of fungal proteins of unknown function.
          Length = 182

 Score = 34.7 bits (80), Expect = 0.037
 Identities = 12/71 (16%), Positives = 27/71 (38%), Gaps = 9/71 (12%)

Query: 121 IEGKVEAWDPRMKVKAGTLDCAFNCISNERDTEEKEGKASSDESSEEEEEEEEEEESSDD 180
           ++ + E        K           S ++  ++K+ K    +   E+++E+E E+  +D
Sbjct: 74  VKKEYEEKQKWKWKKKK---------SKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLED 124

Query: 181 DQAWTKEIKKT 191
                 E   T
Sbjct: 125 LTKSYSETLST 135


>gnl|CDD|238121 cd00200, WD40, WD40 domain, found in a number of eukaryotic
           proteins that cover a wide variety of functions
           including adaptor/regulatory modules in signal
           transduction, pre-mRNA processing and cytoskeleton
           assembly; typically contains a GH dipeptide 11-24
           residues from its N-terminus and the WD dipeptide at its
           C-terminus and is 40 residues long, hence the name WD40;
           between GH and WD lies a conserved core; serves as a
           stable propeller-like platform to which proteins can
           bind either stably or reversibly; forms a propeller-like
           structure with several blades where each blade is
           composed of a four-stranded anti-parallel beta-sheet;
           instances with few detectable copies are hypothesized to
           form larger structures by dimerization; each WD40
           sequence repeat forms the first three strands of one
           blade and the last strand in the next blade; the last
           C-terminal WD40 repeat completes the blade structure of
           the first WD40 repeat to create the closed ring
           propeller-structure; residues on the top and bottom
           surface of the propeller are proposed to coordinate
           interactions with other proteins and/or small ligands; 7
           copies of the repeat are present in this alignment.
          Length = 289

 Score = 35.0 bits (81), Expect = 0.042
 Identities = 22/91 (24%), Positives = 38/91 (41%), Gaps = 14/91 (15%)

Query: 49  TSVRISPDGQYVLSTGIYKPRVRCYETDNLSMKFERCFDSEVVTFEILSDDYSSELNSIA 108
            SV  SPDG+ +LS+           +D     ++      + T       + + +NS+A
Sbjct: 181 NSVAFSPDGEKLLSSS----------SDGTIKLWDLSTGKCLGTLRG----HENGVNSVA 226

Query: 109 INPVHQLICVGTIEGKVEAWDPRMKVKAGTL 139
            +P   L+  G+ +G +  WD R      TL
Sbjct: 227 FSPDGYLLASGSEDGTIRVWDLRTGECVQTL 257



 Score = 31.5 bits (72), Expect = 0.54
 Identities = 20/91 (21%), Positives = 38/91 (41%), Gaps = 14/91 (15%)

Query: 49  TSVRISPDGQYVLSTGIYKPRVRCYETDNLSMKFERCFDSEVVTFEILSDDYSSELNSIA 108
            SV  SPDG +V ++      ++ ++               V T       ++ E+NS+A
Sbjct: 139 NSVAFSPDGTFV-ASSSQDGTIKLWDLRTGK---------CVATLT----GHTGEVNSVA 184

Query: 109 INPVHQLICVGTIEGKVEAWDPRMKVKAGTL 139
            +P  + +   + +G ++ WD       GTL
Sbjct: 185 FSPDGEKLLSSSSDGTIKLWDLSTGKCLGTL 215



 Score = 30.4 bits (69), Expect = 1.4
 Identities = 19/83 (22%), Positives = 36/83 (43%), Gaps = 14/83 (16%)

Query: 49  TSVRISPDGQYVLSTGIYKPRVRCYETDNLSMKFERCFDSEVVTFEILSDDYSSELNSIA 108
           +SV  SPDG+ + S+            D  ++K    +D E          ++  +NS+A
Sbjct: 97  SSVAFSPDGRILSSSS----------RDK-TIKV---WDVETGKCLTTLRGHTDWVNSVA 142

Query: 109 INPVHQLICVGTIEGKVEAWDPR 131
            +P    +   + +G ++ WD R
Sbjct: 143 FSPDGTFVASSSQDGTIKLWDLR 165


>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
           subunit 1; Provisional.
          Length = 319

 Score = 34.3 bits (79), Expect = 0.082
 Identities = 19/69 (27%), Positives = 32/69 (46%), Gaps = 5/69 (7%)

Query: 117 CVGTIEGKVEAWDPRMKVKA-----GTLDCAFNCISNERDTEEKEGKASSDESSEEEEEE 171
            +  I+  ++      KVK      G  +     +  + + EE+E   S  E  +EE+E+
Sbjct: 248 ALEAIKEVIKKKGGDFKVKGEPEVVGGDEEDLEELLEKAEEEEEEDDYSESEDEDEEDED 307

Query: 172 EEEEESSDD 180
           EEEEE  D+
Sbjct: 308 EEEEEDDDE 316


>gnl|CDD|218333 pfam04931, DNA_pol_phi, DNA polymerase phi.  This family includes
           the fifth essential DNA polymerase in yeast EC:2.7.7.7.
           Pol5p is localised exclusively to the nucleolus and
           binds near or at the enhancer region of rRNA-encoding
           DNA repeating units.
          Length = 784

 Score = 34.5 bits (79), Expect = 0.087
 Identities = 14/35 (40%), Positives = 19/35 (54%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
           E D +E E +A  D  SE E + E+ EE   +D A
Sbjct: 657 ETDDDEDECEAIEDSESESESDGEDGEEDEQEDDA 691



 Score = 28.3 bits (63), Expect = 7.2
 Identities = 15/53 (28%), Positives = 22/53 (41%)

Query: 145 CISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPA 197
           C + E    E E      E  E+E++ E  E     D+A  + + K  NL  A
Sbjct: 665 CEAIEDSESESESDGEDGEEDEQEDDAEANEGVVPIDKAVRRALPKVLNLPDA 717


>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen.  This
           family consists of several Theileria P67 surface
           antigens. A stage specific surface antigen of Theileria
           parva, p67, is the basis for the development of an
           anti-sporozoite vaccine for the control of East Coast
           fever (ECF) in cattle. The antigen has been shown to
           contain five distinct linear peptide sequences
           recognised by sporozoite-neutralising murine monoclonal
           antibodies.
          Length = 727

 Score = 34.3 bits (78), Expect = 0.098
 Identities = 13/43 (30%), Positives = 23/43 (53%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
           + +TE+ +    S   SEE++++ EEE++        K  KKT
Sbjct: 100 QDNTEQNQDTKGSKTDSEEDDDDSEEEDNKSTSSKDGKGSKKT 142


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3.
          Length = 431

 Score = 34.3 bits (79), Expect = 0.10
 Identities = 13/41 (31%), Positives = 21/41 (51%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
               + E++E +  SDE  EEE E+ EEE S   +   ++ 
Sbjct: 368 EEVDEDEDEEEEQRSDEHEEEEGEDSEEEGSQSREDGSSES 408


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 33.5 bits (77), Expect = 0.14
 Identities = 16/37 (43%), Positives = 22/37 (59%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAW 184
             +  E ++G  S D+  EEEE E EE+E SDD+  W
Sbjct: 82  ERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDDEGEW 118



 Score = 29.6 bits (67), Expect = 2.8
 Identities = 15/34 (44%), Positives = 19/34 (55%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
            ER  +E E    SD+  +EEEE E EE+   DD
Sbjct: 81  EERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDD 114



 Score = 28.5 bits (64), Expect = 5.0
 Identities = 10/34 (29%), Positives = 17/34 (50%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
            E   +++  +    +  ++EEEE E EE  D D
Sbjct: 80  EEERKKKEAEQGLESDDDDDEEEEWEVEEDEDSD 113


>gnl|CDD|235351 PRK05137, tolB, translocation protein TolB; Provisional.
          Length = 435

 Score = 33.4 bits (77), Expect = 0.17
 Identities = 18/49 (36%), Positives = 27/49 (55%), Gaps = 11/49 (22%)

Query: 25  YLLPTNDIRRRIELIQDFEMPGVSTSVRISPDGQYVL-------STGIY 66
           YLL     +R  EL+ +F  PG++ + R SPDG+ V+       +T IY
Sbjct: 229 YLLDLETGQR--ELVGNF--PGMTFAPRFSPDGRKVVMSLSQGGNTDIY 273


>gnl|CDD|100111 cd05833, Ribosomal_P2, Ribosomal protein P2. This subfamily
           represents the eukaryotic large ribosomal protein P2.
           Eukaryotic P1 and P2 are functionally equivalent to the
           bacterial protein L7/L12, but are not homologous to
           L7/L12. P2 is located in the L12 stalk, with proteins
           P1, P0, L11, and 28S rRNA. P1 and P2 are the only
           proteins in the ribosome to occur as multimers, always
           appearing as sets of heterodimers. Recent data indicate
           that eukaryotes have four copies (two heterodimers),
           while most archaeal species contain six copies of L12p
           (three homodimers). Bacteria may have four or six copies
           of L7/L12 (two or three homodimers) depending on the
           species. Experiments using S. cerevisiae P1 and P2
           indicate that P1 proteins are positioned more internally
           with limited reactivity in the C-terminal domains, while
           P2 proteins seem to be more externally located and are
           more likely to interact with other cellular components.
           In lower eukaryotes, P1 and P2 are further subdivided
           into P1A, P1B, P2A, and P2B, which form P1A/P2B and
           P1B/P2A heterodimers. Some plants have a third
           P-protein, called P3, which is not homologous to P1 and
           P2. In humans, P1 and P2 are strongly autoimmunogenic.
           They play a significant role in the etiology and
           pathogenesis of systemic lupus erythematosus (SLE). In
           addition, the ribosome-inactivating protein
           trichosanthin (TCS) interacts with human P0, P1, and P2,
           with its primary binding site in the C-terminal region
           of P2. TCS inactivates the ribosome by depurinating a
           specific adenine in the sarcin-ricin loop of 28S rRNA.
          Length = 109

 Score = 31.5 bits (72), Expect = 0.19
 Identities = 11/28 (39%), Positives = 20/28 (71%)

Query: 157 GKASSDESSEEEEEEEEEEESSDDDQAW 184
             A++  ++++EE++EE EE SDDD  +
Sbjct: 78  AAAAAAAAAKKEEKKEESEEESDDDMGF 105


>gnl|CDD|117592 pfam09026, Cenp-B_dimeris, Centromere protein B dimerisation
           domain.  The centromere protein B (CENP-B) dimerisation
           domain is composed of two alpha-helices, which are
           folded into an antiparallel configuration. Dimerisation
           of CENP-B is mediated by this domain, in which monomers
           dimerise to form a symmetrical, antiparallel, four-helix
           bundle structure with a large hydrophobic patch in which
           23 residues of one monomer form van der Waals contacts
           with the other monomer. This CENP-B dimer configuration
           may be suitable for capturing two distant CENP-B boxes
           during centromeric heterochromatin formation.
          Length = 101

 Score = 31.3 bits (70), Expect = 0.19
 Identities = 12/27 (44%), Positives = 20/27 (74%)

Query: 156 EGKASSDESSEEEEEEEEEEESSDDDQ 182
           EG+  SD  S+EEE++++E+E  DD+ 
Sbjct: 7   EGEEDSDSDSDEEEDDDDEDEEDDDED 33



 Score = 27.8 bits (61), Expect = 3.8
 Identities = 8/29 (27%), Positives = 21/29 (72%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEE 176
           ++ D++E+E     DE  ++E+++E+++E
Sbjct: 12  SDSDSDEEEDDDDEDEEDDDEDDDEDDDE 40



 Score = 27.4 bits (60), Expect = 4.9
 Identities = 12/35 (34%), Positives = 21/35 (60%)

Query: 154 EKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEI 188
           E E  + SD   EE++++E+EE+  +DD     E+
Sbjct: 7   EGEEDSDSDSDEEEDDDDEDEEDDDEDDDEDDDEV 41


>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II).  Bone
           sialoprotein (BSP) is a major structural protein of the
           bone matrix that is specifically expressed by
           fully-differentiated osteoblasts. The expression of bone
           sialoprotein (BSP) is normally restricted to mineralised
           connective tissues of bones and teeth where it has been
           associated with mineral crystal formation. However, it
           has been found that ectopic expression of BSP occurs in
           various lesions, including oral and extraoral
           carcinomas, in which it has been associated with the
           formation of microcrystalline deposits and the
           metastasis of cancer cells to bone.
          Length = 291

 Score = 33.1 bits (75), Expect = 0.20
 Identities = 15/33 (45%), Positives = 21/33 (63%), Gaps = 1/33 (3%)

Query: 151 DTEEKEGKA-SSDESSEEEEEEEEEEESSDDDQ 182
           +  +K  K   SDE  EEEEEEEEEE   ++++
Sbjct: 124 NAGKKATKEDESDEDEEEEEEEEEEEAEVEENE 156



 Score = 29.7 bits (66), Expect = 2.0
 Identities = 14/48 (29%), Positives = 24/48 (50%)

Query: 134 VKAGTLDCAFNCISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
              GT +     +   +       KA+ ++ S+E+EEEEEEEE  + +
Sbjct: 104 ATPGTGNIGLAALQLPKKAGNAGKKATKEDESDEDEEEEEEEEEEEAE 151



 Score = 28.5 bits (63), Expect = 4.8
 Identities = 13/28 (46%), Positives = 17/28 (60%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEE 176
           ++ T+E E     +E  EEEEEE E EE
Sbjct: 127 KKATKEDESDEDEEEEEEEEEEEAEVEE 154



 Score = 27.7 bits (61), Expect = 8.6
 Identities = 13/31 (41%), Positives = 16/31 (51%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           D+ E+ G   S E   EEE   EEE + D D
Sbjct: 49  DSSEENGDGDSSEEEGEEETSNEEENNEDSD 79


>gnl|CDD|215406 PLN02761, PLN02761, lipase class 3 family protein.
          Length = 527

 Score = 32.7 bits (74), Expect = 0.29
 Identities = 26/103 (25%), Positives = 44/103 (42%), Gaps = 17/103 (16%)

Query: 133 KVKAGTLDCAFNCISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD----QAWTKEI 188
           K K  ++ C+ +C S    T +++        S+ + EEE EEE  + +    + W +E+
Sbjct: 36  KFKTCSIICSSSCTSISSSTTQQKQSNKQTHVSDNKREEEPEEELEEKEVSLREIW-REV 94

Query: 189 KKTYNLGPAPKWCGFLDNLTEELEENII------ENVYDDYKF 225
           +   N      W G LD +   L   II      +  YD + F
Sbjct: 95  QGCNN------WEGLLDPMNNHLRREIIRYGEFAQACYDSFDF 131


>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
          Length = 330

 Score = 32.5 bits (75), Expect = 0.29
 Identities = 15/25 (60%), Positives = 20/25 (80%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQA 183
           A++ E  EEEEEEEEEEE S+++ A
Sbjct: 298 AAAAEEEEEEEEEEEEEEPSEEEAA 322



 Score = 29.1 bits (66), Expect = 3.3
 Identities = 12/25 (48%), Positives = 17/25 (68%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQA 183
            ++    EEEEEEEEEEE   +++A
Sbjct: 297 QAAAAEEEEEEEEEEEEEEPSEEEA 321



 Score = 28.7 bits (65), Expect = 4.6
 Identities = 12/31 (38%), Positives = 18/31 (58%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           + +E     +   ++EEEEEEEEEEE  +  
Sbjct: 287 ELKEVLSAQAQAAAAEEEEEEEEEEEEEEPS 317



 Score = 28.7 bits (65), Expect = 4.7
 Identities = 13/28 (46%), Positives = 19/28 (67%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEESSDD 180
             +   A+++E  EEEEEEEEEE S ++
Sbjct: 293 SAQAQAAAAEEEEEEEEEEEEEEPSEEE 320



 Score = 27.9 bits (63), Expect = 8.9
 Identities = 10/27 (37%), Positives = 15/27 (55%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEESSD 179
           +    +   +E  EEEEEE  EEE++ 
Sbjct: 297 QAAAAEEEEEEEEEEEEEEPSEEEAAA 323


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 33.0 bits (75), Expect = 0.30
 Identities = 15/36 (41%), Positives = 22/36 (61%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           S E + EE+E +   ++ SEEEE E+EEEE   +  
Sbjct: 444 SVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEAD 479



 Score = 30.7 bits (69), Expect = 1.4
 Identities = 14/35 (40%), Positives = 20/35 (57%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
             E + EE+E +    E  E E+EEEEEE  +D+ 
Sbjct: 447 EEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADNG 481



 Score = 30.3 bits (68), Expect = 2.1
 Identities = 14/34 (41%), Positives = 22/34 (64%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           E + EE+E +  S+E   E+EEEEEE E+ +  +
Sbjct: 450 EEEEEEEEEEQESEEEEGEDEEEEEEVEADNGSE 483



 Score = 29.9 bits (67), Expect = 2.3
 Identities = 15/36 (41%), Positives = 22/36 (61%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           S E + EE   +   +E  EEEEE+E EEE  +D++
Sbjct: 436 SQESEEEESVEEEEEEEEEEEEEEQESEEEEGEDEE 471



 Score = 29.9 bits (67), Expect = 2.4
 Identities = 14/35 (40%), Positives = 22/35 (62%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
            E + EE+E + S +E  E+EEEEEE E  +  ++
Sbjct: 450 EEEEEEEEEEQESEEEEGEDEEEEEEVEADNGSEE 484



 Score = 29.1 bits (65), Expect = 4.0
 Identities = 13/31 (41%), Positives = 19/31 (61%)

Query: 152 TEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
            E  E +   +E  EEEE+E EEEE  D+++
Sbjct: 442 EESVEEEEEEEEEEEEEEQESEEEEGEDEEE 472



 Score = 28.7 bits (64), Expect = 5.0
 Identities = 12/37 (32%), Positives = 20/37 (54%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
            +  + EE+E +   +E   EEEE E+EEE  + +  
Sbjct: 443 ESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEAD 479



 Score = 28.7 bits (64), Expect = 6.5
 Identities = 13/37 (35%), Positives = 22/37 (59%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
           +++   EE+  +   +E  EEEEEE+E EE   +D+ 
Sbjct: 435 ASQESEEEESVEEEEEEEEEEEEEEQESEEEEGEDEE 471


>gnl|CDD|221581 pfam12446, DUF3682, Protein of unknown function (DUF3682).  This
           domain family is found in eukaryotes, and is typically
           between 125 and 136 amino acids in length.
          Length = 133

 Score = 31.3 bits (71), Expect = 0.34
 Identities = 14/35 (40%), Positives = 17/35 (48%), Gaps = 4/35 (11%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEE----SSDDDQA 183
           E   G  S    + +EEEEEEEE      SD+ Q 
Sbjct: 82  EGPAGTTSGTGHTRQEEEEEEEENEKQQQSDEAQV 116


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
           family consists of several hypothetical bacterial
           proteins of around 200 residues in length. The function
           of this family is unknown.
          Length = 214

 Score = 31.6 bits (72), Expect = 0.41
 Identities = 15/41 (36%), Positives = 25/41 (60%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
           ++E+KE K  +++  EE EEE EEE+    D+   +  +KT
Sbjct: 74  NSEDKEDKGDAEKEDEESEEENEEEDEESSDENEKETEEKT 114



 Score = 30.5 bits (69), Expect = 1.0
 Identities = 13/38 (34%), Positives = 23/38 (60%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTK 186
           E +  E+E +   +ESS+E E+E EE+  S+ ++  T 
Sbjct: 87  EDEESEEENEEEDEESSDENEKETEEKTESNVEKEITN 124



 Score = 30.1 bits (68), Expect = 1.4
 Identities = 16/34 (47%), Positives = 21/34 (61%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDD 180
           +N  D E+K      DE SEEE EEE+EE S ++
Sbjct: 73  ANSEDKEDKGDAEKEDEESEEENEEEDEESSDEN 106



 Score = 29.7 bits (67), Expect = 1.8
 Identities = 10/40 (25%), Positives = 24/40 (60%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
              + +E+E +A++ E  E++ + E+E+E S+++     E
Sbjct: 61  EIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEEDE 100



 Score = 28.9 bits (65), Expect = 3.4
 Identities = 11/41 (26%), Positives = 26/41 (63%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
             E   EE++  A+S++  ++ + E+E+EES ++++   +E
Sbjct: 61  EIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEEDEE 101



 Score = 28.5 bits (64), Expect = 4.3
 Identities = 14/48 (29%), Positives = 28/48 (58%), Gaps = 1/48 (2%)

Query: 147 SNERDTEEKEGKASSDES-SEEEEEEEEEEESSDDDQAWTKEIKKTYN 193
           S++   +E+E K S D+  +E EE +EEE+E+++ +    K   +  +
Sbjct: 41  SDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKED 88



 Score = 27.8 bits (62), Expect = 7.7
 Identities = 9/33 (27%), Positives = 19/33 (57%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           E++ EE E +   ++    +E E+E EE ++ +
Sbjct: 85  EKEDEESEEENEEEDEESSDENEKETEEKTESN 117


>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region.  The myc family
           belongs to the basic helix-loop-helix leucine zipper
           class of transcription factors, see pfam00010. Myc forms
           a heterodimer with Max, and this complex regulates cell
           growth through direct activation of genes involved in
           cell replication. Mutations in the C-terminal 20
           residues of this domain cause unique changes in the
           induction of apoptosis, transformation, and G2 arrest.
          Length = 329

 Score = 32.2 bits (73), Expect = 0.42
 Identities = 15/23 (65%), Positives = 15/23 (65%)

Query: 157 GKASSDESSEEEEEEEEEEESSD 179
           G  S  E  EEEEEEEEEEE  D
Sbjct: 223 GSDSESEEDEEEEEEEEEEEEID 245



 Score = 28.7 bits (64), Expect = 4.1
 Identities = 12/24 (50%), Positives = 18/24 (75%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQ 182
           +SS   SE EE+EEEEEE  ++++
Sbjct: 220 SSSGSDSESEEDEEEEEEEEEEEE 243



 Score = 28.0 bits (62), Expect = 7.8
 Identities = 15/28 (53%), Positives = 18/28 (64%), Gaps = 5/28 (17%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEE 174
           S+  D+E +E     DE  EEEEEEEEE
Sbjct: 221 SSGSDSESEE-----DEEEEEEEEEEEE 243


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 32.0 bits (73), Expect = 0.42
 Identities = 10/35 (28%), Positives = 18/35 (51%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           S  R   + +G  S DE    +++++E+   SD D
Sbjct: 236 SKRRTIAQIDGIDSDDEGDGSDDDDDEDAIESDLD 270


>gnl|CDD|220393 pfam09772, Tmem26, Transmembrane protein 26.  The function of this
           family of transmembrane proteins has not, as yet, been
           determined.
          Length = 282

 Score = 32.0 bits (73), Expect = 0.42
 Identities = 12/26 (46%), Positives = 17/26 (65%)

Query: 9   TNWHKQAVTLVIIGGRYLLPTNDIRR 34
           T   +Q + L++I GR+LLP  DI R
Sbjct: 122 TKGLEQTLLLILIVGRWLLPKGDITR 147


>gnl|CDD|217049 pfam02459, Adeno_terminal, Adenoviral DNA terminal protein.  This
           protein is covalently attached to the termini of
           replicating DNA in vivo.
          Length = 548

 Score = 32.3 bits (74), Expect = 0.43
 Identities = 30/100 (30%), Positives = 44/100 (44%), Gaps = 18/100 (18%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCGFLDNLTEEL-----EE 213
              +E  EEE  EEEEEE  ++++ + +E++ T            +  L EEL       
Sbjct: 305 PEEEEEEEEEVPEEEEEEEEEEERTFEEEVRATV--------AEAIRLLEEELTVSARRH 356

Query: 214 NIIENVYDDYKFVTRQELEDLGLGHLIGTSLLRAYMHGFF 253
                  D Y+ +  Q LEDLG    I  S LR ++  FF
Sbjct: 357 EFFNFAVDFYELL--QRLEDLG---RITESFLRRWVMYFF 391


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S.cerevisiae and H.sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 31.3 bits (71), Expect = 0.53
 Identities = 12/36 (33%), Positives = 18/36 (50%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           +      +  E     +E  EEEEEE+E+ +  DDD
Sbjct: 160 LKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDDDDD 195



 Score = 30.5 bits (69), Expect = 0.93
 Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           E   EE E     +E  EEE+E+ ++++  DDD
Sbjct: 166 EDVDEEDEKDEEEEEEEEEEDEDFDDDDDDDDD 198



 Score = 30.5 bits (69), Expect = 1.1
 Identities = 9/33 (27%), Positives = 22/33 (66%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           + D E+++ +   +E  EE+E+ +++++  DDD
Sbjct: 167 DVDEEDEKDEEEEEEEEEEDEDFDDDDDDDDDD 199



 Score = 30.5 bits (69), Expect = 1.1
 Identities = 12/32 (37%), Positives = 21/32 (65%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           D +E++ K   +E  EEEE+E+ +++  DDD 
Sbjct: 167 DVDEEDEKDEEEEEEEEEEDEDFDDDDDDDDD 198



 Score = 29.0 bits (65), Expect = 3.4
 Identities = 7/33 (21%), Positives = 21/33 (63%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDD 180
           +  + +EK+ +   +E  E+E+ ++++++  DD
Sbjct: 167 DVDEEDEKDEEEEEEEEEEDEDFDDDDDDDDDD 199



 Score = 28.6 bits (64), Expect = 4.4
 Identities = 12/30 (40%), Positives = 20/30 (66%)

Query: 152 TEEKEGKASSDESSEEEEEEEEEEESSDDD 181
            + KE +A   +  +E++EEEEEEE  +D+
Sbjct: 158 KKLKELEAEDVDEEDEKDEEEEEEEEEEDE 187



 Score = 27.8 bits (62), Expect = 6.7
 Identities = 15/53 (28%), Positives = 26/53 (49%)

Query: 130 PRMKVKAGTLDCAFNCISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
            + K K G        I  +    EK+ K    E  +EE+E++EEEE  ++++
Sbjct: 133 SKFKRKVGLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEEEE 185


>gnl|CDD|227693 COG5406, COG5406, Nucleosome binding factor SPN, SPT16 subunit
           [Transcription / DNA replication, recombination, and
           repair / Chromatin structure and dynamics].
          Length = 1001

 Score = 31.9 bits (72), Expect = 0.54
 Identities = 17/34 (50%), Positives = 26/34 (76%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDD 180
           S+E + E  E +ASSD+ S+E +E+EE +ESS+D
Sbjct: 931 SDESEEEVSEYEASSDDESDETDEDEESDESSED 964



 Score = 30.4 bits (68), Expect = 1.9
 Identities = 16/31 (51%), Positives = 17/31 (54%)

Query: 157 GKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
              S DES E EEE  E E SSDD+   T E
Sbjct: 924 MVGSDDESDESEEEVSEYEASSDDESDETDE 954



 Score = 30.0 bits (67), Expect = 2.8
 Identities = 16/46 (34%), Positives = 28/46 (60%), Gaps = 2/46 (4%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA--WTKEIKK 190
           S++ +++E +    SDESSE+  E+E E +SSD++    W +   K
Sbjct: 944 SSDDESDETDEDEESDESSEDLSEDESENDSSDEEDGEDWDELESK 989


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 31.9 bits (73), Expect = 0.56
 Identities = 12/46 (26%), Positives = 27/46 (58%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
             ++   EE+E    SDE  +EE+E+ ++E+  ++++   ++ KK 
Sbjct: 337 DDDDDLEEEEEDVDLSDEEEDEEDEDSDDEDDEEEEEEEKEKKKKK 382



 Score = 31.1 bits (71), Expect = 0.95
 Identities = 10/34 (29%), Positives = 18/34 (52%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
            E D  + E +   D+  EEEEE+ +  +  +D+
Sbjct: 325 EEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDE 358



 Score = 31.1 bits (71), Expect = 1.1
 Identities = 9/33 (27%), Positives = 17/33 (51%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           + + EE       +E  +++ EEEEE+    D+
Sbjct: 322 DEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDE 354



 Score = 30.7 bits (70), Expect = 1.5
 Identities = 12/37 (32%), Positives = 21/37 (56%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
             +E D ++   +   D    +EEE+EE+E+S D+D 
Sbjct: 332 DEDEEDDDDDLEEEEEDVDLSDEEEDEEDEDSDDEDD 368



 Score = 30.4 bits (69), Expect = 1.6
 Identities = 11/26 (42%), Positives = 16/26 (61%)

Query: 156 EGKASSDESSEEEEEEEEEEESSDDD 181
            G+   DE  EE+  ++E+EE  DDD
Sbjct: 316 LGQGEEDEEEEEDGVDDEDEEDDDDD 341



 Score = 30.4 bits (69), Expect = 1.7
 Identities = 15/51 (29%), Positives = 27/51 (52%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPK 199
           E D ++ E +    + S+EEE+EE+E+   +DD+   +E K+      A  
Sbjct: 336 EDDDDDLEEEEEDVDLSDEEEDEEDEDSDDEDDEEEEEEEKEKKKKKSAES 386



 Score = 29.6 bits (67), Expect = 2.8
 Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           E + EE++G    DE  ++++ EEEEE+    D
Sbjct: 321 EDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSD 353



 Score = 28.4 bits (64), Expect = 6.8
 Identities = 20/52 (38%), Positives = 27/52 (51%), Gaps = 4/52 (7%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTK-EIKKTYNLGPAPK 199
             D EE E    SD+  +EEEEEEE+E+        T+ E+  T+   P PK
Sbjct: 351 LSDEEEDEEDEDSDDEDDEEEEEEEKEKKKKKSAESTRSELPFTF---PCPK 399



 Score = 28.0 bits (63), Expect = 9.2
 Identities = 9/43 (20%), Positives = 22/43 (51%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
           E +  +   +   +E  + ++E++EEEE  + ++   K  + T
Sbjct: 345 EEEDVDLSDEEEDEEDEDSDDEDDEEEEEEEKEKKKKKSAEST 387


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 31.3 bits (71), Expect = 0.66
 Identities = 13/35 (37%), Positives = 19/35 (54%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
           + D  E+E   SSD  SE+ +   EE ES + + A
Sbjct: 235 DDDQGEEEESGSSDSLSEDSDASSEEMESGEMEAA 269



 Score = 31.3 bits (71), Expect = 0.74
 Identities = 9/45 (20%), Positives = 24/45 (53%)

Query: 147 SNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
            +E ++ + E     D+  E+E++++ EEE S    + +++   +
Sbjct: 213 GDEPESADSEDNEDEDDPKEDEDDDQGEEEESGSSDSLSEDSDAS 257



 Score = 27.5 bits (61), Expect = 10.0
 Identities = 5/31 (16%), Positives = 15/31 (48%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
             E+++     +E S   +   E+ ++S ++
Sbjct: 230 PKEDEDDDQGEEEESGSSDSLSEDSDASSEE 260


>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin.  Nucleoplasmins are also
           known as chromatin decondensation proteins. They bind to
           core histones and transfer DNA to them in a reaction
           that requires ATP. This is thought to play a role in the
           assembly of regular nucleosomal arrays.
          Length = 146

 Score = 30.4 bits (69), Expect = 0.67
 Identities = 12/36 (33%), Positives = 21/36 (58%)

Query: 156 EGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
           E   S D+  +EEEE++EE++  D+ +     +KK 
Sbjct: 110 EEDESDDDEEDEEEEDDEEDDDEDESEEEESPVKKV 145



 Score = 30.0 bits (68), Expect = 0.97
 Identities = 11/37 (29%), Positives = 26/37 (70%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIK 189
           EE E     ++  EE++EE+++E+ S+++++  K++K
Sbjct: 110 EEDESDDDEEDEEEEDDEEDDDEDESEEEESPVKKVK 146



 Score = 29.2 bits (66), Expect = 1.7
 Identities = 9/28 (32%), Positives = 19/28 (67%)

Query: 154 EKEGKASSDESSEEEEEEEEEEESSDDD 181
            +E ++  DE  EEEE++EE+++  + +
Sbjct: 109 SEEDESDDDEEDEEEEDDEEDDDEDESE 136



 Score = 28.4 bits (64), Expect = 3.0
 Identities = 12/29 (41%), Positives = 20/29 (68%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQAWTKE 187
           AS ++ S+++EE+EEEE+  +DD     E
Sbjct: 108 ASEEDESDDDEEDEEEEDDEEDDDEDESE 136



 Score = 27.3 bits (61), Expect = 6.9
 Identities = 9/31 (29%), Positives = 20/31 (64%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEE 176
           +++E D  + + +   +E  EE+++E+E EE
Sbjct: 107 VASEEDESDDDEEDEEEEDDEEDDDEDESEE 137



 Score = 27.3 bits (61), Expect = 7.5
 Identities = 10/27 (37%), Positives = 18/27 (66%)

Query: 156 EGKASSDESSEEEEEEEEEEESSDDDQ 182
             +  SD+  E+EEEE++EE+  +D+ 
Sbjct: 109 SEEDESDDDEEDEEEEDDEEDDDEDES 135


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.  This
           family represents the C-terminus (approximately 300
           residues) of proteins that are involved as binding
           partners for Prp19 as part of the nuclear pore complex.
           The family in Drosophila is necessary for pre-mRNA
           splicing, and the human protein has been found in
           purifications of the spliceosome. In the past this
           family was thought, erroneously, to be associated with
           microfibrillin.
          Length = 277

 Score = 31.0 bits (70), Expect = 0.77
 Identities = 17/40 (42%), Positives = 22/40 (55%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEI 188
           E +  E E +  S E  EEE EEEEE +S DD +   K +
Sbjct: 1   ETEVLELEEEDESGEEEEEESEEEEETDSEDDMEPRLKPV 40


>gnl|CDD|221490 pfam12253, CAF1A, Chromatin assembly factor 1 subunit A.  The CAF-1
           or chromatin assembly factor-1 consists of three
           subunits, and this is the first, or A. The A domain is
           uniquely required for the progression of S phase in
           mouse cells, independent of its ability to promote
           histone deposition but dependent on its ability to
           interact with HP1 - heterochromatin protein 1-rich
           heterochromatin domains next to centromeres that are
           crucial for chromosome segregation during mitosis. This
           HP1-CAF-1 interaction module functions as a built-in
           replication control for heterochromatin, which, like a
           control barrier, has an impact on S-phase progression in
           addition to DNA-based checkpoints.
          Length = 76

 Score = 28.7 bits (65), Expect = 0.83
 Identities = 12/31 (38%), Positives = 19/31 (61%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           D E +E +   D  SE+EE+EEE+++   D 
Sbjct: 45  DAEWEEEEEGEDLESEDEEDEEEDDDDDMDG 75


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins is involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 31.2 bits (71), Expect = 0.83
 Identities = 18/57 (31%), Positives = 33/57 (57%), Gaps = 4/57 (7%)

Query: 275 KKKKIRERIEQE----RTRGVQLNKLPKVNQELALKLMDEKQKAEETESRKKKKKLQ 327
           K++K  ER  +E    + +  QL +L ++ +E+A K     +K E+ + R +KKKL+
Sbjct: 287 KRRKELEREAKEEKQLKKKLAQLARLKEIAKEVAQKEKARARKKEQRKERGEKKKLK 343


>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
           Provisional.
          Length = 2849

 Score = 31.6 bits (71), Expect = 0.88
 Identities = 11/26 (42%), Positives = 18/26 (69%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEE 176
           D ++++     D+  ++EEEEEEEEE
Sbjct: 154 DDDDEDEDEDDDDEEDDEEEEEEEEE 179



 Score = 29.6 bits (66), Expect = 3.1
 Identities = 13/64 (20%), Positives = 28/64 (43%), Gaps = 1/64 (1%)

Query: 133 KVKAGTLDCAFNCISNERD-TEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
           K + G LD         R   EE      +    +++E+E+E+++  +DD+   +E ++ 
Sbjct: 121 KAEIGDLDMIIIKRRRARHLAEEDMSPRDNFVIDDDDEDEDEDDDDEEDDEEEEEEEEEI 180

Query: 192 YNLG 195
               
Sbjct: 181 KGFD 184


>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
          Length = 619

 Score = 30.9 bits (71), Expect = 1.2
 Identities = 11/42 (26%), Positives = 22/42 (52%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKK 190
             + +     +  +E  ++E+EEEEE+E+ D   A   E+ +
Sbjct: 177 NAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAADESELPE 218


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 30.8 bits (70), Expect = 1.2
 Identities = 40/206 (19%), Positives = 81/206 (39%), Gaps = 32/206 (15%)

Query: 153  EEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYN-LGPAPKWCGFLDNLTEEL 211
            ++   K S   ++EEEE  EE++E+ D+D         +Y+ L   P W     +LT+E 
Sbjct: 1049 KDIIKKKSEKITAEEEEGAEEDDEADDEDDEEELGAAVSYDYLLSMPIW-----SLTKEK 1103

Query: 212  EENIIENVYDDYKFVTRQELEDLGLGHLIGTSLLRAYMHGFFMDIRLYRKAKSVSAPFEF 271
             E +   +         +ELE      L  T+    ++     D+  + +A         
Sbjct: 1104 VEKLNAELEK-----KEKELEK-----LKNTTPKDMWLE----DLDKFEEA--------L 1141

Query: 272  EEFKKKKIRERIEQERTRGVQLNK-LPKVNQELALKLMDEKQKAEETESRKKKKKLQLSA 330
            EE ++ + +E  +++R +     K       +L  K   +K+ + +   +        ++
Sbjct: 1142 EEQEEVEEKEIAKEQRLKSKTKGKASKLRKPKLKKKEKKKKKSSADKSKKASVVG---NS 1198

Query: 331  NLLEDDRFSKLFENPDFQVIEPSSDQ 356
              ++ D   KL + PD +    S   
Sbjct: 1199 KRVDSDEKRKLDDKPDNKKSNSSGSD 1224


>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein.  This is a family of
           fungal and plant proteins and contains many hypothetical
           proteins. VID27 is a cytoplasmic protein that may play a
           role in vacuolar protein degradation.
          Length = 794

 Score = 30.9 bits (70), Expect = 1.3
 Identities = 11/36 (30%), Positives = 19/36 (52%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           I +     + E +   +E  EE+E+E   +E SDD+
Sbjct: 381 IEDANTERDDEEEEDEEEEEEEDEDEGPSKEHSDDE 416



 Score = 30.9 bits (70), Expect = 1.4
 Identities = 14/42 (33%), Positives = 22/42 (52%), Gaps = 3/42 (7%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSD---DDQAWTKE 187
           E +    E     +E  EEEEEE+E+E  S    DD+ + ++
Sbjct: 380 EIEDANTERDDEEEEDEEEEEEEDEDEGPSKEHSDDEEFEED 421



 Score = 30.1 bits (68), Expect = 2.3
 Identities = 14/34 (41%), Positives = 21/34 (61%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           E + EE E +  S E S++EE EE++ ES  +D 
Sbjct: 397 EEEEEEDEDEGPSKEHSDDEEFEEDDVESKYEDS 430



 Score = 28.9 bits (65), Expect = 4.4
 Identities = 14/28 (50%), Positives = 20/28 (71%)

Query: 161 SDESSEEEEEEEEEEESSDDDQAWTKEI 188
           + E  +EEEE+EEEEE  D+D+  +KE 
Sbjct: 385 NTERDDEEEEDEEEEEEEDEDEGPSKEH 412


>gnl|CDD|203043 pfam04546, Sigma70_ner, Sigma-70, non-essential region.  The domain
           is found in the primary vegetative sigma factor. The
           function of this domain is unclear, and the domain can be
           removed without loss of function.
          Length = 211

 Score = 30.2 bits (69), Expect = 1.4
 Identities = 7/33 (21%), Positives = 19/33 (57%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
            T         +E  E++++++E+E+  D+++A
Sbjct: 39  ATAAAIESELDEEDLEDDDDDDEDEDEDDEEEA 71



 Score = 29.5 bits (67), Expect = 2.4
 Identities = 4/34 (11%), Positives = 15/34 (44%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
                 +      D   +++++E+E+E+  ++  
Sbjct: 39  ATAAAIESELDEEDLEDDDDDDEDEDEDDEEEAD 72



 Score = 28.3 bits (64), Expect = 5.2
 Identities = 7/34 (20%), Positives = 16/34 (47%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           N            S+   E+ E++++++E  D+D
Sbjct: 33  NAAAAAATAAAIESELDEEDLEDDDDDDEDEDED 66



 Score = 27.5 bits (62), Expect = 8.9
 Identities = 9/32 (28%), Positives = 20/32 (62%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDD 180
           E + +E++ +   D+  +E+E++EEE +   D
Sbjct: 45  ESELDEEDLEDDDDDDEDEDEDDEEEADLGPD 76


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 30.7 bits (69), Expect = 1.4
 Identities = 13/33 (39%), Positives = 19/33 (57%)

Query: 147  SNERDTEEKEGKASSDESSEEEEEEEEEEESSD 179
            +NE D   KE    + E  + +E+E+EEE S D
Sbjct: 3930 NNESDLVSKEDDNKALEDKDRQEKEDEEEMSDD 3962


>gnl|CDD|235795 PRK06402, rpl12p, 50S ribosomal protein L12P; Reviewed.
          Length = 106

 Score = 28.8 bits (65), Expect = 1.4
 Identities = 14/25 (56%), Positives = 20/25 (80%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQA 183
           A+++E  EEEEEEEE+EES ++  A
Sbjct: 75  AAAEEKKEEEEEEEEKEESEEEAAA 99



 Score = 28.8 bits (65), Expect = 1.7
 Identities = 12/23 (52%), Positives = 19/23 (82%)

Query: 159 ASSDESSEEEEEEEEEEESSDDD 181
           A++ E  +EEEEEEEE+E S+++
Sbjct: 74  AAAAEEKKEEEEEEEEKEESEEE 96



 Score = 28.4 bits (64), Expect = 2.2
 Identities = 10/25 (40%), Positives = 19/25 (76%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQA 183
           A++  ++EE++EEEEEEE  ++ + 
Sbjct: 71  AAAAAAAEEKKEEEEEEEEKEESEE 95



 Score = 27.6 bits (62), Expect = 4.4
 Identities = 10/24 (41%), Positives = 13/24 (54%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEE 176
                +   +E  EEEE+EE EEE
Sbjct: 73  AAAAAEEKKEEEEEEEEKEESEEE 96



 Score = 27.2 bits (61), Expect = 5.1
 Identities = 10/25 (40%), Positives = 18/25 (72%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQA 183
           A++  + E++EEEEEEEE  + ++ 
Sbjct: 72  AAAAAAEEKKEEEEEEEEKEESEEE 96



 Score = 26.8 bits (60), Expect = 6.4
 Identities = 10/25 (40%), Positives = 11/25 (44%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEES 177
                     E  EEEEE+EE EE 
Sbjct: 72  AAAAAAEEKKEEEEEEEEKEESEEE 96


>gnl|CDD|218223 pfam04712, Radial_spoke, Radial spokehead-like protein.  This
           family includes the radial spoke head proteins RSP4 and
           RSP6 from Chlamydomonas reinhardtii, and several
           eukaryotic homologues, including mammalian RSHL1, the
           protein product of a familial ciliary dyskinesia
           candidate gene.
          Length = 481

 Score = 30.4 bits (69), Expect = 1.6
 Identities = 19/62 (30%), Positives = 26/62 (41%), Gaps = 18/62 (29%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEES----------SDDDQ------AWTKEIKKTYNL 194
             ++ E +   DE  EEEEEE EE E           S+D        AWT   + + +L
Sbjct: 344 PEQKDEEEEQEDEEEEEEEEEPEEPEPEEGPPLLTPISEDAPLPNDDPAWT--FRLSSSL 401

Query: 195 GP 196
            P
Sbjct: 402 SP 403


>gnl|CDD|219342 pfam07227, DUF1423, Protein of unknown function (DUF1423).  This
           family represents a conserved region approximately 500
           residues long within a number of Arabidopsis thaliana
           proteins of unknown function.
          Length = 446

 Score = 30.5 bits (69), Expect = 1.6
 Identities = 18/64 (28%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 270 EFEEFKKKKIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETESRKKK--KKLQ 327
           E + F+ K    R E ER + + L K  K  +E A K +  K +  E E  ++   ++L+
Sbjct: 365 EADMFQLKADEARREAERLQRIALAKTEKSEEEYASKYL--KLRLSEAEEERQYLFEELK 422

Query: 328 LSAN 331
           L   
Sbjct: 423 LQEE 426


>gnl|CDD|173534 PTZ00341, PTZ00341, Ring-infected erythrocyte surface antigen;
            Provisional.
          Length = 1136

 Score = 30.5 bits (68), Expect = 1.7
 Identities = 23/90 (25%), Positives = 43/90 (47%), Gaps = 1/90 (1%)

Query: 144  NCISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCGF 203
            N   N  +  E+  + + +E+ EE  EE  EE   + D+   +E+++             
Sbjct: 986  NVEENVEENVEENVEENIEENVEENVEENIEENVEEYDEENVEEVEENVEEYDEENVEEI 1045

Query: 204  LDNLTEELEENIIENVYDDYKFVTRQELED 233
             +N  E +EENI EN+ ++Y     +E+E+
Sbjct: 1046 EENAEENVEENIEENI-EEYDEENVEEIEE 1074


>gnl|CDD|237875 PRK14974, PRK14974, cell division protein FtsY; Provisional.
          Length = 336

 Score = 29.9 bits (68), Expect = 1.7
 Identities = 22/73 (30%), Positives = 40/73 (54%), Gaps = 11/73 (15%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCGFLD 205
           +  + + EE+E    ++E  EEE+EEE++E+    D+A   EIK+             ++
Sbjct: 16  VEEKIEEEEEEEAPEAEEEEEEEDEEEKKEKPGFFDKAKITEIKEKD-----------IE 64

Query: 206 NLTEELEENIIEN 218
           +L EELE  ++E+
Sbjct: 65  DLLEELELELLES 77


>gnl|CDD|223287 COG0209, NrdA, Ribonucleotide reductase, alpha subunit [Nucleotide
           transport and metabolism].
          Length = 651

 Score = 30.4 bits (69), Expect = 1.8
 Identities = 9/23 (39%), Positives = 12/23 (52%)

Query: 20  IIGGRYLLPTNDIRRRIELIQDF 42
           ++  RYLL   D    +EL QD 
Sbjct: 108 VLYSRYLLKDEDGSEILELPQDL 130


>gnl|CDD|234311 TIGR03685, L12P_arch, 50S ribosomal protein L12P.  This model
           represents the L12P protein of the large (50S) subunit
           of the archaeal ribosome.
          Length = 105

 Score = 28.5 bits (64), Expect = 1.8
 Identities = 15/28 (53%), Positives = 19/28 (67%)

Query: 156 EGKASSDESSEEEEEEEEEEESSDDDQA 183
              A+ +E  EEEEEEEEEEES ++  A
Sbjct: 71  AAAAAEEEEEEEEEEEEEEEESEEEAMA 98



 Score = 28.5 bits (64), Expect = 2.2
 Identities = 11/26 (42%), Positives = 14/26 (53%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEESS 178
                +   +E  EEEEEEEE EE +
Sbjct: 71  AAAAAEEEEEEEEEEEEEEEESEEEA 96



 Score = 26.5 bits (59), Expect = 9.9
 Identities = 12/25 (48%), Positives = 18/25 (72%)

Query: 159 ASSDESSEEEEEEEEEEESSDDDQA 183
           A++  + EEEEEEEEEEE  ++ + 
Sbjct: 70  AAAAAAEEEEEEEEEEEEEEEESEE 94


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 30.4 bits (69), Expect = 1.9
 Identities = 13/40 (32%), Positives = 22/40 (55%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
            E    + EGK+ S+E  +E+ + EEE+E  D+D    + 
Sbjct: 311 GEELRRKIEGKSVSEEDEDEDSDSEEEDEDDDEDDDDGEN 350


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 29.4 bits (66), Expect = 2.0
 Identities = 16/39 (41%), Positives = 23/39 (58%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
           E + EE E     +E  E+EEEEEE+EE + D +   K+
Sbjct: 59  EEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDIEKK 97



 Score = 27.1 bits (60), Expect = 9.9
 Identities = 17/45 (37%), Positives = 22/45 (48%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYN 193
           E D EE E     ++  E  E+EEEEEE  +D+       KK  N
Sbjct: 56  EEDEEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDIEKKNIN 100


>gnl|CDD|224212 COG1293, COG1293, Predicted RNA-binding protein homologous to
           eukaryotic snRNP [Transcription].
          Length = 564

 Score = 30.1 bits (68), Expect = 2.0
 Identities = 21/84 (25%), Positives = 34/84 (40%), Gaps = 6/84 (7%)

Query: 269 FEFEEFKKKKIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETESRKKKKKLQL 328
            +FE  K K++   +E+      +L K  K  +    K  DE ++ E+     ++K   L
Sbjct: 273 EKFERDKIKQLASELEK------KLEKELKKLENKLEKQEDELEELEKAAEELRQKGELL 326

Query: 329 SANLLEDDRFSKLFENPDFQVIEP 352
            ANL   +   K     DF   E 
Sbjct: 327 YANLQLIEEGLKSVRLADFYGNEE 350


>gnl|CDD|224217 COG1298, FlhA, Flagellar biosynthesis pathway, component FlhA [Cell
           motility and secretion / Intracellular trafficking and
           secretion].
          Length = 696

 Score = 29.9 bits (68), Expect = 2.3
 Identities = 23/88 (26%), Positives = 36/88 (40%), Gaps = 16/88 (18%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCGFLDNLT 208
           E+  EEKE  A   +  EEEEEEE  ++    D     E++  Y L P        +   
Sbjct: 328 EQQAEEKEKPAEEAKKEEEEEEEESVDDVLLIDPI---ELELGYGLIPLVD-----EQQG 379

Query: 209 EELEENIIENVYDDYKFVTRQELEDLGL 236
            EL         D  + + ++  ++LG 
Sbjct: 380 GEL--------LDRIRGIRKKIAQELGF 399


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 30.0 bits (68), Expect = 2.5
 Identities = 11/33 (33%), Positives = 17/33 (51%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
               E+ +    SDE  EE  + E++E+  DDD
Sbjct: 121 RDLEEDDDDDEESDEEDEESSKSEDDEDDDDDD 153



 Score = 28.4 bits (64), Expect = 6.8
 Identities = 8/36 (22%), Positives = 14/36 (38%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
                D      +   D+   +EE+EE  +   D+D
Sbjct: 113 EPRYDDAYRDLEEDDDDDEESDEEDEESSKSEDDED 148



 Score = 28.0 bits (63), Expect = 8.9
 Identities = 11/33 (33%), Positives = 17/33 (51%)

Query: 150 RDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           RD EE +      +  +EE  + E++E  DDD 
Sbjct: 121 RDLEEDDDDDEESDEEDEESSKSEDDEDDDDDD 153



 Score = 28.0 bits (63), Expect = 8.9
 Identities = 13/39 (33%), Positives = 23/39 (58%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
           E D ++ E     DE S + E++E++++  DDD   T+E
Sbjct: 124 EEDDDDDEESDEEDEESSKSEDDEDDDDDDDDDDIATRE 162


>gnl|CDD|220135 pfam09184, PPP4R2, PPP4R2.  PPP4R2 (protein phosphatase 4 core
           regulatory subunit R2) is the regulatory subunit of the
           histone H2A phosphatase complex. It has been shown to
           confer resistance to the anticancer drug cisplatin in
           yeast, and may confer resistance in higher eukaryotes.
          Length = 285

 Score = 29.4 bits (66), Expect = 2.8
 Identities = 14/32 (43%), Positives = 20/32 (62%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDD 180
           ++D +  E K   ++  EEE EEEEEEE  D+
Sbjct: 254 DQDGDYVEEKELKEDEEEEETEEEEEEEDEDE 285


>gnl|CDD|237799 PRK14715, PRK14715, DNA polymerase II large subunit; Provisional.
          Length = 1627

 Score = 29.8 bits (67), Expect = 3.0
 Identities = 15/51 (29%), Positives = 26/51 (50%), Gaps = 1/51 (1%)

Query: 153 EEKEGKASSDESSEEE-EEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCG 202
           E KE K   DE   EE + EE +EE  ++++ +  E+ +  N+    K+  
Sbjct: 278 ELKEKKEEKDEEKSEEVKTEEVDEEFEEEEKGFYYELYEKVNIEANKKFIK 328


>gnl|CDD|220972 pfam11081, DUF2890, Protein of unknown function (DUF2890).  This
           family is conserved in dsDNA adenoviruses of
           vertebrates. The function is not known.
          Length = 172

 Score = 28.7 bits (64), Expect = 3.0
 Identities = 13/31 (41%), Positives = 18/31 (58%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSD 179
           E + E ++ + S DE  EE EE EEE  +S 
Sbjct: 33  EDEEEMEDWEDSLDEEDEEAEEVEEETAASS 63



 Score = 28.0 bits (62), Expect = 6.2
 Identities = 13/30 (43%), Positives = 17/30 (56%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESS 178
           E D EE E    S +  +EE EE EEE ++
Sbjct: 32  EEDEEEMEDWEDSLDEEDEEAEEVEEETAA 61


>gnl|CDD|226920 COG4547, CobT, Cobalamin biosynthesis protein CobT
           (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole
           phosphoribosyltransferase) [Coenzyme metabolism].
          Length = 620

 Score = 29.4 bits (66), Expect = 3.1
 Identities = 9/36 (25%), Positives = 20/36 (55%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
           +  + +  + +  ++E SE   EE E  + S++D+A
Sbjct: 230 DADEEDGDDDQPDNNEDSEAGREESEGSDESEEDEA 265



 Score = 28.7 bits (64), Expect = 6.0
 Identities = 11/36 (30%), Positives = 20/36 (55%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
            E D +E++G     +++E+ E   EE E SD+ + 
Sbjct: 227 IEEDADEEDGDDDQPDNNEDSEAGREESEGSDESEE 262


>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
           Members of this family are bacterial proteins with a
           conserved motif [KR]FYDLN, sometimes flanked by a pair
           of CXXC motifs, followed by a long region of low
           complexity sequence in which roughly half the residues
           are Asp and Glu, including multiple runs of five or more
           acidic residues. The function of members of this family
           is unknown.
          Length = 104

 Score = 28.0 bits (63), Expect = 3.1
 Identities = 4/33 (12%), Positives = 18/33 (54%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           E   ++ E +   D+   ++++++++++   D 
Sbjct: 51  EDAAKKDEDEEDEDDVVLDDDDDDDDDDDLPDL 83



 Score = 26.9 bits (60), Expect = 6.7
 Identities = 7/33 (21%), Positives = 17/33 (51%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
             D E+   K   +E  ++   ++++++  DDD
Sbjct: 47  AADAEDAAKKDEDEEDEDDVVLDDDDDDDDDDD 79


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 29.6 bits (67), Expect = 3.3
 Identities = 10/49 (20%), Positives = 23/49 (46%)

Query: 151 DTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPK 199
           + ++++      +  +EE++E +E E   DD  +  +   +  L  A K
Sbjct: 154 EDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVWDEDDSEALRQARK 202


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 29.2 bits (65), Expect = 3.6
 Identities = 17/61 (27%), Positives = 28/61 (45%), Gaps = 1/61 (1%)

Query: 123 GKVEAWDPRMKVKAGTLDCAFNCISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
            K +  D +   + G  D A    S++ D E +E    SD S+   + EE E++ S +  
Sbjct: 251 NKKKLDDDKKGKRGGDDD-ADEYDSDDGDDEGREEDYISDSSASGNDPEEREDKLSPEIP 309

Query: 183 A 183
           A
Sbjct: 310 A 310



 Score = 28.8 bits (64), Expect = 5.1
 Identities = 16/53 (30%), Positives = 29/53 (54%), Gaps = 5/53 (9%)

Query: 147 SNERDTEEKEGKASSDESSEE-----EEEEEEEEESSDDDQAWTKEIKKTYNL 194
           ++  D EE+E K S +  ++      E+ EE EEE ++++   +K+ KK   L
Sbjct: 292 ASGNDPEEREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEGGLSKKGKKLKKL 344


>gnl|CDD|222812 PHA00727, PHA00727, hypothetical protein.
          Length = 278

 Score = 29.1 bits (65), Expect = 3.7
 Identities = 25/69 (36%), Positives = 36/69 (52%), Gaps = 11/69 (15%)

Query: 260 RKAKSVSAPFEFEEFKKKKIRERIEQERTRGVQLNKLPKV--NQELALKLMDEKQ-KAEE 316
           RKA+S       EE K+K   E  +++   G  L +L KV   +E  LK    +Q KAE 
Sbjct: 15  RKAQS------LEELKQK--YEEAQKQIADGKTLKRLYKVYEKREFELKKQQFEQLKAEL 66

Query: 317 TESRKKKKK 325
           ++ +KK KK
Sbjct: 67  SKKKKKFKK 75


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 29.3 bits (66), Expect = 3.8
 Identities = 13/54 (24%), Positives = 23/54 (42%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPK 199
              E    E + K+   +  ++E++E + E  +       K+ KKT    P PK
Sbjct: 57  DKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPK 110


>gnl|CDD|100110 cd05832, Ribosomal_L12p, Ribosomal protein L12p. This subfamily
           includes archaeal L12p, the protein that is functionally
           equivalent to L7/L12 in bacteria and the P1 and P2
           proteins in eukaryotes. L12p is homologous to P1 and P2
           but is not homologous to bacterial L7/L12. It is located
           in the L12 stalk, with proteins L10, L11, and 23S rRNA.
           L12p is the only protein in the ribosome to occur as
           multimers, always appearing as sets of dimers. Recent
           data indicate that most archaeal species contain six
           copies of L12p (three homodimers), while eukaryotes have
           four copies (two heterodimers), and bacteria may have
           four or six copies (two or three homodimers), depending
           on the species. The organization of proteins within the
           stalk has been characterized primarily in bacteria,
           where L7/L12 forms either two or three homodimers and
           each homodimer binds to the extended C-terminal helix of
           L10. L7/L12 is attached to the ribosome through L10 and
           is the only ribosomal protein that does not directly
           interact with rRNA. Archaeal L12p is believed to
           function in a similar fashion. However, hybrid ribosomes
           containing the large subunit from E. coli with an
           archaeal stalk are able to bind archaeal and eukaryotic
           elongation factors but not bacterial elongation factors.
           In several mesophilic and thermophilic archaeal species,
           the binding of 23S rRNA to protein L11 and to the
           L10/L12p pentameric complex was found to be
           temperature-dependent and cooperative.
          Length = 106

 Score = 27.8 bits (62), Expect = 3.8
 Identities = 12/24 (50%), Positives = 16/24 (66%)

Query: 152 TEEKEGKASSDESSEEEEEEEEEE 175
            EEK  +   ++  EEE+EEEEEE
Sbjct: 74  AEEKAEEKEEEKKKEEEKEEEEEE 97



 Score = 27.1 bits (60), Expect = 6.3
 Identities = 11/25 (44%), Positives = 18/25 (72%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEES 177
            E++ +   +E  +EEE+EEEEEE+
Sbjct: 74  AEEKAEEKEEEKKKEEEKEEEEEEA 98



 Score = 26.7 bits (59), Expect = 7.7
 Identities = 10/24 (41%), Positives = 15/24 (62%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEE 176
             +E     +E  ++EEE+EEEEE
Sbjct: 73  AAEEKAEEKEEEKKKEEEKEEEEE 96


>gnl|CDD|223095 COG0016, PheS, Phenylalanyl-tRNA synthetase alpha subunit
           [Translation, ribosomal structure and biogenesis].
          Length = 335

 Score = 28.7 bits (65), Expect = 4.2
 Identities = 12/53 (22%), Positives = 23/53 (43%)

Query: 272 EEFKKKKIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETESRKKKK 324
           +  KK      +E+ +  G  +N+L K  ++   +L  E + A   E    +K
Sbjct: 43  DLLKKLGKLSPLEERKEVGALINELKKEVEDAITELTPELEAAGLWERLAFEK 95


>gnl|CDD|188659 cd07440, RGS, Regulator of G protein signaling (RGS) domain
           superfamily.  The RGS domain is an essential part of the
           Regulator of G-protein Signaling (RGS) protein family, a
           diverse group of multifunctional proteins that regulate
           cellular signaling events downstream of G-protein
           coupled receptors (GPCRs). RGS proteins play critical
           regulatory roles as GTPase activating proteins (GAPs) of
           the heterotrimeric G-protein G-alpha-subunits. While
           inactive, G-alpha-subunits bind GDP, which is released
           and replaced by GTP upon agonist activation. GTP binding
           leads to dissociation of the alpha-subunit and the
           beta-gamma-dimer, allowing them to interact with
           effector molecules and propagate signaling cascades
           associated with cellular growth, survival, migration,
           and invasion. Deactivation of the G-protein signaling
           controlled by the RGS domain accelerates GTPase activity
           of the alpha subunit by hydrolysis of GTP to GDP, which
           results in the reassociation of the alpha-subunit with
           the beta-gamma-dimer and thereby inhibition of
           downstream activity. As a major G-protein regulator, RGS
           domain containing proteins are involved in many crucial
           cellular processes such as regulation of intracellular
           trafficking, glial differentiation, embryonic axis
           formation, skeletal and muscle development, and cell
           migration during early embryogenesis. RGS proteins are
           also involved in apoptosis and cell proliferation, as
           well as modulation of cardiac development. Several RGS
           proteins can fine-tune immune responses, while others
           play important roles in neuronal signal modulation.
           Some RGS proteins are principal elements needed for
           proper vision.
          Length = 113

 Score = 27.7 bits (62), Expect = 4.4
 Identities = 18/57 (31%), Positives = 27/57 (47%)

Query: 174 EEESSDDDQAWTKEIKKTYNLGPAPKWCGFLDNLTEELEENIIENVYDDYKFVTRQE 230
              S ++ ++  KEI   Y    APK     +++ EE+EEN+ E   D   F   QE
Sbjct: 35  TTSSDEELKSKAKEIYDKYISKDAPKEINIPESIREEIEENLEEPYPDPDCFDEAQE 91


>gnl|CDD|225180 COG2271, UhpC, Sugar phosphate permease [Carbohydrate transport and
           metabolism].
          Length = 448

 Score = 28.8 bits (65), Expect = 4.5
 Identities = 14/45 (31%), Positives = 21/45 (46%), Gaps = 2/45 (4%)

Query: 150 RDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNL 194
           RD  + EG    +E   +  E  EEE+ ++   AW  +I   Y L
Sbjct: 207 RDRPQSEGLPPIEEYRGDPLEIYEEEKENEGLTAW--QIFVKYVL 249


>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein.  The YqfQ-like protein family
           includes the B. subtilis YqfQ protein, also known as
           VrrA, which is functionally uncharacterized. This family
           of proteins is found in bacteria. Proteins in this
           family are typically between 146 and 237 amino acids in
           length. There are two conserved sequence motifs: QYGP
           and PKLY.
          Length = 155

 Score = 28.2 bits (63), Expect = 4.6
 Identities = 16/45 (35%), Positives = 22/45 (48%)

Query: 155 KEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPK 199
           +E  +S DE  E EEE  +E E  D  +  T+  +K     P PK
Sbjct: 89  RELSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPK 133


>gnl|CDD|220600 pfam10147, CR6_interact, Growth arrest and DNA-damage-inducible
           proteins-interacting protein 1.  Members of this family
           of proteins act as negative regulators of G1 to S cell
           cycle phase progression by inhibiting cyclin-dependent
           kinases. Inhibitory effects are additive with GADD45
           proteins but occur also in the absence of GADD45
           proteins. Furthermore, they act as a repressor of the
           orphan nuclear receptor NR4A1 by inhibiting AB
           domain-mediated transcriptional activity.
          Length = 217

 Score = 28.6 bits (64), Expect = 4.7
 Identities = 13/54 (24%), Positives = 26/54 (48%)

Query: 270 EFEEFKKKKIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETESRKKK 323
           E  E +K+K   R  +E      + K+P++  +   +    +QKA   + RK++
Sbjct: 107 ENREQQKEKEARRQAREAEIAKNMAKMPQMIADWRAQKRKREQKARAAKERKER 160


>gnl|CDD|100108 cd04411, Ribosomal_P1_P2_L12p, Ribosomal protein P1, P2, and L12p.
           Ribosomal proteins P1 and P2 are the eukaryotic proteins
           that are functionally equivalent to bacterial L7/L12.
           L12p is the archaeal homolog. Unlike other ribosomal
           proteins, the archaeal L12p and eukaryotic P1 and P2 do
           not share sequence similarity with their bacterial
           counterparts. They are part of the ribosomal stalk
           (called the L7/L12 stalk in bacteria), along with 28S
           rRNA and the proteins L11 and P0 in eukaryotes (23S
           rRNA, L11, and L10e in archaea). In bacterial ribosomes,
           L7/L12 homodimers bind the extended C-terminal helix of
           L10 to anchor the L7/L12 molecules to the ribosome.
           Eukaryotic P1/P2 heterodimers and archaeal L12p
           homodimers are believed to bind the L10 equivalent
           proteins, eukaryotic P0 and archaeal L10e, in a similar
           fashion. P1 and P2 (L12p, L7/L12) are the only proteins
           in the ribosome to occur as multimers, always appearing
           as sets of dimers. Recent data indicate that most
           archaeal species contain six copies of L12p (three
           homodimers), while eukaryotes have two copies each of P1
           and P2 (two heterodimers). Bacteria may have four or six
           copies (two or three homodimers), depending on the
           species. As in bacteria, the stalk is crucial for
           binding of initiation, elongation, and release factors
           in eukaryotes and archaea.
          Length = 105

 Score = 27.2 bits (60), Expect = 4.8
 Identities = 10/25 (40%), Positives = 17/25 (68%)

Query: 157 GKASSDESSEEEEEEEEEEESSDDD 181
             A+++ + + EE +EEEEE  D+D
Sbjct: 75  AAATAEPAEKAEEAKEEEEEEEDED 99


>gnl|CDD|217861 pfam04050, Upf2, Up-frameshift suppressor 2.  Transcripts
           harbouring premature signals for translation termination
           are recognised and rapidly degraded by eukaryotic cells
           through a pathway known as nonsense-mediated mRNA decay.
           In Saccharomyces cerevisiae, three trans-acting factors
           (Upf1 to Upf3) are required for nonsense-mediated mRNA
           decay.
          Length = 171

 Score = 28.1 bits (63), Expect = 5.2
 Identities = 11/29 (37%), Positives = 18/29 (62%)

Query: 153 EEKEGKASSDESSEEEEEEEEEEESSDDD 181
           EE E   SSDE   +  ++E++EES  ++
Sbjct: 18  EEDEDDESSDEEEVDLPDDEQDEESDSEE 46


>gnl|CDD|152901 pfam12467, CMV_1a, Cucumber mosaic virus 1a protein.  This domain
           family is found in viruses, and is typically between 156
           and 171 amino acids in length. The family is found in
           association with pfam01443, pfam01660. 1a protein is the
           major virulence factor of the cucumber mosaic virus
           (CMV). The Ns strain of CMV causes necrotic lesions to
           Nicotiana spp. while other strains cause systemic
           mosaic. The determinant of the pathogenesis of these
           different strains is the specific amino acid residue at
           position 461 of the 1a protein.
          Length = 175

 Score = 27.9 bits (62), Expect = 5.3
 Identities = 16/60 (26%), Positives = 25/60 (41%), Gaps = 4/60 (6%)

Query: 299 VNQELALKLMDEKQKAEETESRKKKKKLQLSANLLEDDRFSKLFENPDFQVIEPSSDQVR 358
           V +  A  + D K    E    KKKK  +     +++D  S+ FE+      E   D V+
Sbjct: 64  VLKRAAWAVEDGKTLRAE----KKKKLEEALQQPVQEDSVSEEFEDAPDAPSESVRDDVK 119


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E. coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 28.8 bits (64), Expect = 5.4
 Identities = 9/40 (22%), Positives = 19/40 (47%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
           +E D ++ +   +  E   E E E +E  +  + +A  +E
Sbjct: 211 DEEDGDDDQPTENEQEEQGEGEGEGQEGSAPQESEATDRE 250


>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
           subunit G; Reviewed.
          Length = 197

 Score = 28.1 bits (62), Expect = 5.4
 Identities = 15/35 (42%), Positives = 22/35 (62%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
           ER+ EE+  +  SD+S  E  E++E E  SDDD+ 
Sbjct: 161 EREREEQTIEEQSDDSEHEIIEQDESETESDDDKT 195


>gnl|CDD|183594 PRK12558, PRK12558, glutamyl-tRNA synthetase; Provisional.
          Length = 445

 Score = 28.7 bits (65), Expect = 5.9
 Identities = 17/39 (43%), Positives = 24/39 (61%), Gaps = 1/39 (2%)

Query: 284 EQERTRGVQLNK-LPKVNQELALKLMDEKQKAEETESRK 321
           E E  R +QL++ LP +    ALKL +E++ A E E RK
Sbjct: 103 ELELKRKIQLSRGLPPIYDRAALKLTEEEKAALEAEGRK 141


>gnl|CDD|225201 COG2319, COG2319, FOG: WD40 repeat [General function prediction
           only].
          Length = 466

 Score = 28.5 bits (62), Expect = 6.0
 Identities = 22/111 (19%), Positives = 45/111 (40%), Gaps = 30/111 (27%)

Query: 49  TSVRISPDGQYVLSTGIYKPRVRCY--------------ETDNLSMKFER--------CF 86
           +S+  SPDG  ++++G     +R +               +D++   F            
Sbjct: 202 SSLAFSPDGGLLIASGSSDGTIRLWDLSTGKLLRSTLSGHSDSVVSSFSPDGSLLASGSS 261

Query: 87  DSEVVTFEILSDD--------YSSELNSIAINPVHQLICVGTIEGKVEAWD 129
           D  +  +++ S          +SS + S+A +P  +L+  G+ +G V  WD
Sbjct: 262 DGTIRLWDLRSSSSLLRTLSGHSSSVLSVAFSPDGKLLASGSSDGTVRLWD 312


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 28.3 bits (63), Expect = 6.4
 Identities = 15/37 (40%), Positives = 23/37 (62%), Gaps = 1/37 (2%)

Query: 147 SNERDTEEKEGK-ASSDESSEEEEEEEEEEESSDDDQ 182
           S++ D EE E +  S  E S E++ ++EEEE SD + 
Sbjct: 167 SDKDDEEESESEDESKSEESAEDDSDDEEEEDSDSED 203



 Score = 27.9 bits (62), Expect = 9.2
 Identities = 12/36 (33%), Positives = 25/36 (69%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
             ++ +  E E ++ S+ES+E++ ++EEEE+S  +D
Sbjct: 168 DKDDEEESESEDESKSEESAEDDSDDEEEEDSDSED 203


>gnl|CDD|222018 pfam13274, DUF4065, Protein of unknown function (DUF4065).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria, archaea and
           viruses. Proteins in this family are typically between
           155 and 202 amino acids in length.
          Length = 99

 Score = 26.9 bits (60), Expect = 6.4
 Identities = 13/33 (39%), Positives = 22/33 (66%), Gaps = 1/33 (3%)

Query: 202 GFLDNLTEELEENIIENVYDDYKFVTRQELEDL 234
           G LD L+EE E+ I++ V + Y  ++ +EL +L
Sbjct: 58  GDLDELSEE-EKEILDEVIEKYGNLSAKELSEL 89


>gnl|CDD|113868 pfam05113, DUF693, Protein of unknown function (DUF693).  This
           family consists of several uncharacterized proteins from
           Borrelia burgdorferi (Lyme disease spirochete).
          Length = 311

 Score = 28.3 bits (63), Expect = 6.6
 Identities = 19/64 (29%), Positives = 34/64 (53%), Gaps = 4/64 (6%)

Query: 224 KFVTRQELEDLGLGHLIGTSLLRAYMHGFF---MDIRLYRKAKSVSAPFEFEEFKKKKIR 280
           KF   ++ + +  G+L GT +   Y  G F   +D+RL  ++   +   E++ FK K ++
Sbjct: 90  KFAHEKDFDFIMAGYL-GTPMSTDYPGGDFSVELDVRLLSRSNFFNRKLEYKNFKGKTVQ 148

Query: 281 ERIE 284
           E IE
Sbjct: 149 EAIE 152


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 28.3 bits (64), Expect = 6.6
 Identities = 16/34 (47%), Positives = 19/34 (55%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQ 182
           E + EEKE K    E  EEE EEE+EEE     +
Sbjct: 442 EEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKKKK 475



 Score = 28.3 bits (64), Expect = 7.8
 Identities = 14/35 (40%), Positives = 21/35 (60%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQA 183
           E + E+++ +   +E  EE EEE+EEEE     QA
Sbjct: 443 EEEEEKEKKEEEKEEEEEEAEEEKEEEEEKKKKQA 477



 Score = 28.0 bits (63), Expect = 9.2
 Identities = 13/43 (30%), Positives = 27/43 (62%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKT 191
           E+  ++K+  A   +  EEEEE+E++EE  ++++   +E K+ 
Sbjct: 426 EKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEE 468


>gnl|CDD|235370 PRK05244, PRK05244, Der GTPase activator; Provisional.
          Length = 177

 Score = 27.6 bits (62), Expect = 7.0
 Identities = 11/17 (64%), Positives = 15/17 (88%)

Query: 165 SEEEEEEEEEEESSDDD 181
           S++++EEE EEE SDDD
Sbjct: 147 SDDDDEEESEEEESDDD 163


>gnl|CDD|237622 PRK14140, PRK14140, heat shock protein GrpE; Provisional.
          Length = 191

 Score = 27.7 bits (62), Expect = 7.1
 Identities = 10/29 (34%), Positives = 14/29 (48%)

Query: 148 NERDTEEKEGKASSDESSEEEEEEEEEEE 176
            E + EE       +E+ EEE E E  +E
Sbjct: 13  EETEVEEAVEDEVEEETVEEESEAELLDE 41


>gnl|CDD|148013 pfam06152, Phage_min_cap2, Phage minor capsid protein 2.  Family of
           related phage minor capsid proteins.
          Length = 361

 Score = 28.2 bits (63), Expect = 7.1
 Identities = 17/53 (32%), Positives = 28/53 (52%), Gaps = 2/53 (3%)

Query: 290 GVQLNKLPKVNQELALKLMDE--KQKAEETESRKKKKKLQLSANLLEDDRFSK 340
           GV  N  P ++ E A+K+     KQ+  E + RK K+KL ++  L + +   K
Sbjct: 290 GVNTNNQPNIDPEEAIKVYAISQKQRYLERQIRKWKRKLMVAEELGDKEGVEK 342


>gnl|CDD|212672 cd10230, HYOU1-like_NBD, Nucleotide-binding domain of human HYOU1
           and similar proteins.  This subgroup includes human
           HYOU1 (also known as human hypoxia up-regulated 1,
           GRP170; HSP12A; ORP150; GRP-170; ORP-150; the human
           HYOU1 gene maps to 11q23.1-q23.3) and Saccharomyces
           cerevisiae Lhs1p (also known as Cer1p, SsI1). Mammalian
           HYOU1 functions as a nucleotide exchange factor (NEF)
           for HSPA5 (also known as BiP, Grp78 or HspA5) and may
           also function as a HSPA5-independent chaperone. S.
           cerevisiae Lhs1p does not have a detectable endogenous
           ATPase activity like canonical HSP70s, but functions as
           a NEF for Kar2p; its interaction with Kar2p is
           stimulated by nucleotide-binding. In addition, Lhs1p has
           a nucleotide-independent holdase activity that prevents
           heat-induced aggregation of proteins in vitro. This
           subgroup belongs to the 105/110 kDa heat shock protein
           (HSP105/110) subfamily of the HSP70-like family.
           HSP105/110s are believed to function generally as
           co-chaperones of HSP70 chaperones, acting as NEFs, to
           remove ADP from their HSP70 chaperone partners during
           the ATP hydrolysis cycle. HSP70 chaperones assist in
           protein folding and assembly, and can direct incompetent
           "client" proteins towards degradation. Like HSP70
           chaperones, HSP105/110s have an N-terminal
           nucleotide-binding domain (NBD) and a C-terminal
           substrate-binding domain (SBD). For HSP70 chaperones,
           the nucleotide sits in a deep cleft formed between the
           two lobes of the NBD. The two subdomains of each lobe
           change conformation between ATP-bound, ADP-bound, and
           nucleotide-free states. ATP binding opens up the
           substrate-binding site; substrate-binding increases the
           rate of ATP hydrolysis. Hsp70 chaperone activity is also
           regulated by J-domain proteins.
          Length = 388

 Score = 28.3 bits (64), Expect = 7.1
 Identities = 11/22 (50%), Positives = 15/22 (68%), Gaps = 3/22 (13%)

Query: 216 IENVYDDYKF---VTRQELEDL 234
           IE++YDD  F   +TR E E+L
Sbjct: 294 IESLYDDIDFKTKITRAEFEEL 315


>gnl|CDD|227466 COG5137, COG5137, Histone chaperone involved in gene silencing
           [Transcription / Chromatin structure and dynamics].
          Length = 279

 Score = 28.0 bits (62), Expect = 7.3
 Identities = 14/33 (42%), Positives = 18/33 (54%)

Query: 149 ERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           E + +E+ G  S  E + E  EEEEEE    DD
Sbjct: 189 EEEEDEEVGSDSYGEGNRELNEEEEEEAEGSDD 221


>gnl|CDD|131316 TIGR02263, benz_CoA_red_C, benzoyl-CoA reductase, subunit C.  This
           model describes C subunit of benzoyl-CoA reductase, a
           4-subunit enzyme. Many aromatic compounds are
           metabolized by way of benzoyl-CoA. This enzyme acts
           under anaerobic conditions.
          Length = 380

 Score = 28.0 bits (62), Expect = 7.5
 Identities = 21/87 (24%), Positives = 39/87 (44%), Gaps = 11/87 (12%)

Query: 206 NLTEELEENIIENVYDDYKFVTRQELEDLGLGHLIGTSLLRAYMHGFFMDIRLYRKAKSV 265
           NL + +E +    V DD+  V R E  D+ L      +L  A++H           + S 
Sbjct: 249 NLIKSIELSGCYIVDDDFIIVHRFENNDVALAGDPLQNLALAFLH----------DSIST 298

Query: 266 SAPFEFEEFKKKK-IRERIEQERTRGV 291
           +A ++ +E  K K + +++ +    GV
Sbjct: 299 AAKYDDDEADKGKYLLDQVRKNAAEGV 325


>gnl|CDD|222792 PHA00435, PHA00435, capsid assembly protein.
          Length = 306

 Score = 27.9 bits (62), Expect = 7.8
 Identities = 13/36 (36%), Positives = 18/36 (50%)

Query: 146 ISNERDTEEKEGKASSDESSEEEEEEEEEEESSDDD 181
           +    D EE+E +   ++  EE EEE EE E   D 
Sbjct: 73  VRISEDGEEEEVEEGEEDEEEEGEEESEEFEPLGDT 108


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 28.1 bits (63), Expect = 9.0
 Identities = 14/66 (21%), Positives = 34/66 (51%), Gaps = 6/66 (9%)

Query: 270 EFEEFKKK--KIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETESR----KKK 323
           E E+ + +  + R  +  +  +  ++    +  + L  +L ++K++ EE E +    +K 
Sbjct: 451 EIEKLESELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKKRVEELERKLAELRKM 510

Query: 324 KKLQLS 329
           +KL+LS
Sbjct: 511 RKLELS 516


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 28.1 bits (62), Expect = 9.1
 Identities = 46/180 (25%), Positives = 66/180 (36%), Gaps = 12/180 (6%)

Query: 150 RDTEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKEIKKTYNLGPAPKWCGFLDNLTE 209
           R  +E  G  +     EE+EE  EE E  ++ +  TK  +K  N     + C   +   E
Sbjct: 98  RRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQK--NDWRDAEECQKEEKEPE 155

Query: 210 ELEENIIENVYDDYKFVTRQELEDLGLGHLIGTSLLRAYMHGFFMDIRLYRKA--KSVSA 267
             EE   E            E     L H   T              + + K   K   A
Sbjct: 156 PEEE---EKPKRGSLEENNGEFMTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQEA 212

Query: 268 PFEFEEFKKK-----KIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETESRKK 322
             E EE KKK     K+ E  EQ R +     K  +  ++  LK   E+++AE  E R+K
Sbjct: 213 ALELEELKKKREERRKVLEEEEQRRKQEEADRKSREEEEKRRLKEEIERRRAEAAEKRQK 272


>gnl|CDD|220605 pfam10156, Med17, Subunit 17 of Mediator complex.  This Mediator
           complex subunit was formerly known as Srb4 in yeasts or
           Trap80 in Drosophila and human. The Med17 subunit is
           located within the head domain and is essential for cell
           viability to the extent that a mutant strain of
           S. cerevisiae lacking it shows all RNA polymerase
           II-dependent transcription ceasing at non-permissive
           temperatures.
          Length = 454

 Score = 27.7 bits (62), Expect = 9.6
 Identities = 15/41 (36%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 148 NERD-TEEKEGKASSDESSEEEEEEEEEEESSDDDQAWTKE 187
            E    EE   +A+  + SEE +EEE++EE  +DD    K+
Sbjct: 50  TEESLREEIAKEAAKIDFSEESDEEEDDEEDDNDDSEENKD 90


>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
           and chromosome partitioning].
          Length = 420

 Score = 27.8 bits (62), Expect = 10.0
 Identities = 12/66 (18%), Positives = 28/66 (42%), Gaps = 9/66 (13%)

Query: 275 KKKKIRERIEQERTRGVQLNKLPKVNQELALKLMDEKQKAEETES-----RKKKKKLQLS 329
           + +   E+ E       Q  +      +LA  L + K+   +  S     +KK ++L+ +
Sbjct: 177 RAEIAAEQAELTTLLSEQRAQQ----AKLAQLLEERKKTLAQLNSELSADQKKLEELRAN 232

Query: 330 ANLLED 335
            + L++
Sbjct: 233 ESRLKN 238


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.133    0.378 

Gapped
Lambda     K      H
   0.267   0.0788    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 18,385,821
Number of extensions: 1813109
Number of successful extensions: 6115
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4999
Number of HSP's successfully gapped: 492
Length of query: 358
Length of database: 10,937,602
Length adjustment: 98
Effective length of query: 260
Effective length of database: 6,590,910
Effective search space: 1713636600
Effective search space used: 1713636600
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 60 (26.8 bits)