RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy7437
         (279 letters)



>gnl|CDD|240580 cd12945, NOPS_NONA_like, NOPS domain, including C-terminal
           coiled-coil region, in p54nrb/PSF/PSP1 homologs from
           invertebrate species.  The family contains a DBHS domain
           (for Drosophila behavior, human splicing), which
           comprises two conserved RNA recognition motifs (RRMs),
           also termed RBDs (RNA binding domains) or RNPs
           (ribonucleoprotein domains), and a charged
           protein-protein interaction NOPS (NONA and PSP1) domain.
           This model corresponds to the NOPS domain, with a long
           helical C-terminal extension, found in Drosophila
           melanogaster gene no-on-transient A (nonA) encoding
           puff-specific protein Bj6 (also termed NONA), Chironomus
           tentans hrp65 gene encoding protein Hrp65 and similar
           proteins. D. melanogaster NONA is involved in eye
           development and behavior, and may play a role in
           circadian rhythm maintenance, similar to vertebrate
           p54nrb. C. tentans hrp65 is a component of nuclear
           fibers associated with ribonucleoprotein particles in
           transit from the gene to the nuclear pore. The NOPS
           domain specifically binds to the second RNA recognition
           motif (RRM2) domain of the partner DBHS protein via a
           substantial interaction surface. Its highly conserved
           C-terminal residues are critical for functional DBHS
           dimerization while the highly conserved C-terminal
           helical extension, forming a right-handed antiparallel
           heterodimeric coiled-coil, is essential for localization
           of these proteins to subnuclear bodies.
          Length = 100

 Score =  134 bits (339), Expect = 4e-40
 Identities = 67/97 (69%), Positives = 88/97 (90%)

Query: 48  LKPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYEL 107
           L+P +VEP++ +DDE+GL E++++KK+ ++ K+RSIGPRFA   SFE EYGTRWKQL+EL
Sbjct: 1   LRPCVVEPMEEIDDEDGLPEKSLNKKNPEFNKERSIGPRFAEPNSFEHEYGTRWKQLHEL 60

Query: 108 YKQKEEALQKELKLEEEKLEAQMEFARYENETEILRE 144
           YKQKEEAL++ELK+EEEKLEAQME+ARYE+ETE+LRE
Sbjct: 61  YKQKEEALKRELKMEEEKLEAQMEYARYEHETELLRE 97


>gnl|CDD|240579 cd12931, eNOPS_SF, NOPS domain, including C-terminal helical
           extension region, in the p54nrb/PSF/PSP1 family.  All
           members in this family contain a DBHS domain (for
           Drosophila behavior, human splicing), which comprises
           two conserved RNA recognition motifs (RRM1 and RRM2),
           also termed RBDs (RNA binding domains) or RNPs
           (ribonucleoprotein domains), and a charged
           protein-protein interaction NOPS (NONA and PSP1) domain
           with a long helical C-terminal extension. The NOPS
           domain specifically binds to RRM2 domain of the partner
           DBHS protein via a substantial interaction surface. Its
           highly conserved C-terminal residues are critical for
           functional DBHS dimerization while the highly conserved
           C-terminal helical extension, forming a right-handed
           antiparallel heterodimeric coiled-coil, is essential for
           localization of these proteins to subnuclear bodies. PSF
           has an additional large N-terminal domain that
           differentiates it from other family members. The
           p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and
           DNA-binding protein (p54nrb), polypyrimidine
           tract-binding protein (PTB)-associated-splicing factor
           (PSF) and paraspeckle protein 1 (PSP1), which are
           ubiquitously expressed and are well conserved in
           vertebrates. p54nrb, also termed NONO or NMT55, is a
           multi-functional protein involved in numerous nuclear
           processes including transcriptional regulation,
           splicing, DNA unwinding, nuclear retention of
           hyperedited double-stranded RNA, viral RNA processing,
           control of cell proliferation, and circadian rhythm
           maintenance. PSF, also termed POMp100, is also a
           multi-functional protein that binds RNA, single-stranded
           DNA (ssDNA), double-stranded DNA (dsDNA) and many
           factors, and mediates diverse activities in the cell.
           PSP1, also termed PSPC1, is a novel nucleolar factor
           that accumulates within a new nucleoplasmic compartment,
           termed paraspeckles, and diffusely distributes in the
           nucleoplasm. The cellular function of PSP1 remains
           unknown currently. The family also includes some
           p54nrb/PSF/PSP1 homologs from invertebrate species. For
           instance, the Drosophila melanogaster gene
           no-on-transient A (nonA) encoding puff-specific protein
           Bj6 (also termed NONA) and Chironomus tentans hrp65 gene
           encoding protein Hrp65. D. melanogaster NONA is involved
           in eye development and behavior and may play a role in
           circadian rhythm maintenance, similar to vertebrate
           p54nrb. C. tentans Hrp65 is a component of nuclear
           fibers associated with ribonucleoprotein particles in
           transit from the gene to the nuclear pore.
          Length = 90

 Score = 95.8 bits (239), Expect = 3e-25
 Identities = 55/90 (61%), Positives = 71/90 (78%), Gaps = 1/90 (1%)

Query: 49  KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
           +PV+VEPL+  D+E+GL ER V KK++ Y K+R +GPRFA  GSFE+E+G RWK LYEL 
Sbjct: 2   RPVVVEPLEQRDEEDGLPERNV-KKNAGYQKEREVGPRFAPPGSFEYEFGQRWKALYELE 60

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENE 138
           KQ+ E L+KELK   EKLEA+ME ARYE++
Sbjct: 61  KQQREQLEKELKEAREKLEAEMEAARYEHQ 90


>gnl|CDD|240779 cd12333, RRM2_p54nrb_like, RNA recognition motif 2 in the
          p54nrb/PSF/PSP1 family.  This subfamily corresponds to
          the RRM2 of the p54nrb/PSF/PSP1 family, including 54
          kDa nuclear RNA- and DNA-binding protein (p54nrb or
          NonO or NMT55), polypyrimidine tract-binding protein
          (PTB)-associated-splicing factor (PSF or POMp100),
          paraspeckle protein 1 (PSP1 or PSPC1), which are
          ubiquitously expressed and are conserved in
          vertebrates. p54nrb is a multi-functional protein
          involved in numerous nuclear processes including
          transcriptional regulation, splicing, DNA unwinding,
          nuclear retention of hyperedited double-stranded RNA,
          viral RNA processing, control of cell proliferation,
          and circadian rhythm maintenance. PSF is also a
          multi-functional protein that binds RNA,
          single-stranded DNA (ssDNA), double-stranded DNA
          (dsDNA) and many factors, and mediates diverse
          activities in the cell. PSP1 is a novel nucleolar
          factor that accumulates within a new nucleoplasmic
          compartment, termed paraspeckles, and diffusely
          distributes in the nucleoplasm. The cellular function
          of PSP1 remains unknown currently. The family also
          includes some p54nrb/PSF/PSP1 homologs from
          invertebrate species, such as the Drosophila
          melanogaster gene no-on-transient A (nonA) encoding
          puff-specific protein Bj6 (also termed NONA) and
          Chironomus tentans hrp65 gene encoding protein Hrp65.
          D. melanogaster NONA is involved in eye development and
          behavior and may play a role in circadian rhythm
          maintenance, similar to vertebrate p54nrb. C. tentans
          Hrp65 is a component of nuclear fibers associated with
          ribonucleoprotein particles in transit from the gene to
          the nuclear pore. All family members contain a DBHS
          domain (for Drosophila behavior, human splicing), which
          comprises two conserved RNA recognition motifs (RRMs),
          also termed RBDs (RNA binding domains) or RNPs
          (ribonucleoprotein domains), and a charged
          protein-protein interaction module. PSF has an
          additional large N-terminal domain that differentiates
          it from other family members.
          Length = 80

 Score = 92.0 bits (229), Expect = 5e-24
 Identities = 35/55 (63%), Positives = 44/55 (80%)

Query: 3  IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
          +ERAVV+VDDRG S  EGI+EF+RKP A  A+KRC +G F LT S +PV+VEPL+
Sbjct: 26 VERAVVIVDDRGRSTGEGIVEFSRKPGAQAAIKRCSEGCFLLTASPRPVVVEPLE 80


>gnl|CDD|241034 cd12590, RRM2_PSF, RNA recognition motif 2 in vertebrate
          polypyrimidine tract-binding protein
          (PTB)-associated-splicing factor (PSF).  This subgroup
          corresponds to the RRM2 of PSF, also termed proline-
          and glutamine-rich splicing factor, or 100 kDa
          DNA-pairing protein (POMp100), or 100 kDa subunit of
          DNA-binding p52/p100 complex, a multifunctional protein
          that mediates diverse activities in the cell. It is
          ubiquitously expressed and highly conserved in
          vertebrates. PSF binds not only RNA but also both
          single-stranded DNA (ssDNA) and double-stranded DNA
          (dsDNA) and facilitates the renaturation of
          complementary ssDNAs. It promotes the formation of
          D-loops in superhelical duplex DNA, and is involved in
          cell proliferation. PSF can also interact with multiple
          factors. It is an RNA-binding component of spliceosomes
          and binds to insulin-like growth factor response
          element (IGFRE). Moreover, PSF functions as a
          transcriptional repressor interacting with Sin3A and
          mediating silencing through the recruitment of histone
          deacetylases (HDACs) to the DNA binding domain (DBD) of
          nuclear hormone receptors. PSF is an essential pre-mRNA
          splicing factor and is dissociated from PTB and binds
          to U1-70K and serine-arginine (SR) proteins during
          apoptosis. PSF forms a heterodimer with the nuclear
          protein p54nrb, also known as non-POU domain-containing
          octamer-binding protein (NonO). The PSF/p54nrb complex
          displays a variety of functions, such as DNA
          recombination and RNA synthesis, processing, and
          transport. PSF contains two conserved RNA recognition
          motifs (RRMs), also termed RBDs (RNA binding domains)
          or RNPs (ribonucleoprotein domains), which are
          responsible for interactions with RNA and for the
          localization of the protein in speckles. It also
          contains an N-terminal region rich in proline, glycine,
          and glutamine residues, which may play a role in
          interactions recruiting other molecules.
          Length = 80

 Score = 77.4 bits (190), Expect = 2e-18
 Identities = 34/55 (61%), Positives = 44/55 (80%)

Query: 3  IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
          +ERAVV+VDDRG S  +GI+EF  KPAA +A +RC +GVF LT + +PVIVEPL+
Sbjct: 26 VERAVVIVDDRGRSTGKGIVEFASKPAARKAFERCTEGVFLLTTTPRPVIVEPLE 80


>gnl|CDD|241033 cd12589, RRM2_PSP1, RNA recognition motif 2 in vertebrate
          paraspeckle protein 1 (PSP1 or PSPC1).  This subgroup
          corresponds to the RRM2 of PSPC1, also termed
          paraspeckle component 1 (PSPC1), a novel nucleolar
          factor that accumulates within a new nucleoplasmic
          compartment, termed paraspeckles, and diffusely
          distributes in the nucleoplasm. It is ubiquitously
          expressed and highly conserved in vertebrates. Although
          its cellular function remains unknown currently, PSPC1
          forms a novel heterodimer with the nuclear protein
          p54nrb, also known as non-POU domain-containing
          octamer-binding protein (NonO), which localizes to
          paraspeckles in an RNA-dependent manner. PSPC1 contains
          two conserved RNA recognition motifs (RRMs), also
          termed RBDs (RNA binding domains) or RNPs
          (ribonucleoprotein domains), at the N-terminus.
          Length = 80

 Score = 74.3 bits (182), Expect = 3e-17
 Identities = 32/55 (58%), Positives = 42/55 (76%)

Query: 3  IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
          +ERAVV+VDDRG    +G +EF  KPAA +AL+RC DG F LT + +PVIVEP++
Sbjct: 26 VERAVVIVDDRGRPTGKGFVEFAAKPAARKALERCADGAFLLTTTPRPVIVEPME 80


>gnl|CDD|241035 cd12591, RRM2_p54nrb, RNA recognition motif 2 in vertebrate 54
          kDa nuclear RNA- and DNA-binding protein (p54nrb).
          This subgroup corresponds to the RRM2 of p54nrb, also
          termed non-POU domain-containing octamer-binding
          protein (NonO), or 55 kDa nuclear protein (NMT55), or
          DNA-binding p52/p100 complex 52 kDa subunit. p54nrb is
          a multifunctional protein involved in numerous nuclear
          processes including transcriptional regulation,
          splicing, DNA unwinding, nuclear retention of
          hyperedited double-stranded RNA, viral RNA processing,
          control of cell proliferation, and circadian rhythm
          maintenance. It is ubiquitously expressed and highly
          conserved in vertebrates. It binds both single- and
          double-stranded RNA and DNA, and also possesses
          inherent carbonic anhydrase activity. p54nrb forms a
          heterodimer with paraspeckle component 1 (PSPC1 or
          PSP1), localizing to paraspeckles in an RNA-dependent
          manner. It also forms a heterodimer with polypyrimidine
          tract-binding protein-associated-splicing factor (PSF).
          p54nrb contains two conserved RNA recognition motifs
          (RRMs), also termed RBDs (RNA binding domains) or RNPs
          (ribonucleoprotein domains), at the N-terminus.
          Length = 80

 Score = 74.3 bits (182), Expect = 3e-17
 Identities = 32/55 (58%), Positives = 40/55 (72%)

Query: 3  IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
          +ERAVV+VDDRG    +GI+EF  KP+A +AL RC DG F LT   +PV VEP+D
Sbjct: 26 VERAVVIVDDRGRPTGKGIVEFAGKPSARKALDRCSDGAFLLTAFPRPVTVEPMD 80


>gnl|CDD|149257 pfam08075, NOPS, NOPS (NUC059) domain.  This domain is found at the
           C-terminus of NONA and PSP1 proteins adjacent to 1 or 2
           pfam00076 domains.
          Length = 52

 Score = 72.4 bits (178), Expect = 7e-17
 Identities = 35/53 (66%), Positives = 41/53 (77%), Gaps = 1/53 (1%)

Query: 50  PVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWK 102
           PVIVEP++  DDE+GL E+ V KKS DY K+R  GPRFA  GSFE EYG+RWK
Sbjct: 1   PVIVEPMEQNDDEDGLPEKLV-KKSPDYHKEREQGPRFAQPGSFEHEYGSRWK 52


>gnl|CDD|240581 cd12946, NOPS_p54nrb_PSF_PSPC1, NOPS domain, including C-terminal
           coiled-coil region, in p54nrb/PSF/PSPC1 family proteins.
           The family contains a DBHS domain (for Drosophila
           behavior, human splicing), which comprises two conserved
           RNA recognition motifs (RRMs), also termed RBDs (RNA
           binding domains) or RNPs (ribonucleoprotein domains),
           and a charged protein-protein interaction NOPS (NONA and
           PSP1) domain. This model corresponds to the NOPS domain,
           with a long helical C-terminal extension, found in the
           p54nrb/PSF/PSPC1 proteins. The NOPS domain specifically
           binds to the second RNA recognition motif (RRM2) domain
           of the partner DBHS protein via a substantial
           interaction surface. Its highly conserved C-terminal
           residues are critical for functional DBHS dimerization
           while the highly conserved C-terminal helical extension,
           forming a right-handed antiparallel heterodimeric
           coiled-coil, is essential for localization of these
           proteins to subnuclear bodies. Members in the family
           include 54 kDa nuclear RNA- and DNA-binding protein
           (p54nrb), polypyrimidine tract-binding protein
           (PTB)-associated-splicing factor (PSF) and paraspeckle
           protein component 1 (PSPC1 or PSP1), which are
           ubiquitously expressed and are conserved in vertebrates.
           p54nrb, also termed NONO or NMT55, is a multi-functional
           protein involved in numerous nuclear processes including
           transcriptional regulation, splicing, DNA unwinding,
           nuclear retention of hyperedited double-stranded RNA,
           viral RNA processing, control of cell proliferation, and
           circadian rhythm maintenance. PSF, also termed POMp100,
           is a multi-functional protein that binds RNA,
           single-stranded DNA (ssDNA), double-stranded DNA (dsDNA)
           and many factors, and mediates diverse activities in the
           cell. PSPC1 is a novel nucleolar factor that accumulates
           within a new nucleoplasmic compartment, termed
           paraspeckles, and diffusely distributes in the
           nucleoplasm. The cellular function of PSPC1 remains
           unknown currently. PSF has an additional large
           N-terminal domain that differentiates it from other
           family members.
          Length = 93

 Score = 69.8 bits (170), Expect = 2e-15
 Identities = 43/90 (47%), Positives = 65/90 (72%), Gaps = 1/90 (1%)

Query: 49  KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
           +PVIVEP++ +DDE+GL E+   +K+  Y K+R   PRFA  G+FE+EY  RWK L E+ 
Sbjct: 2   RPVIVEPMEQLDDEDGLPEKLA-QKNQQYHKEREQPPRFAQPGTFEYEYAQRWKALDEME 60

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENE 138
           KQ+ E + + +K  +EKLE++ME AR+E++
Sbjct: 61  KQQREQVDRNIKEAKEKLESEMEAARHEHQ 90


>gnl|CDD|240584 cd12949, NOPS_PSPC1, NOPS domain, including C-terminal coiled-coil
           region, in paraspeckle protein component 1 (PSPC1) and
           similar proteins.  The family contains a DBHS domain
           (for Drosophila behavior, human splicing), which
           comprises two conserved RNA recognition motifs (RRMs),
           also termed RBDs (RNA binding domains) or RNPs
           (ribonucleoprotein domains), and a charged
           protein-protein interaction NOPS (NONA and PSP1) domain.
           This model corresponds to the NOPS domain, with a long
           helical C-terminal extension, of paraspeckle component 1
           (PSPC1, also termed PSP1), a novel nucleolar factor that
           accumulates within a new nucleoplasmic compartment,
           termed paraspeckles, and diffusely distributes in the
           nucleoplasm. It is ubiquitously expressed and highly
           conserved in vertebrates. Although its cellular function
           remains unknown currently, PSPC1 forms a novel
           heterodimer with the nuclear protein p54nrb, also known
           as non-POU domain-containing octamer-binding protein
           (NONO), which localizes to paraspeckles in an
           RNA-dependent manner. The NOPS domain specifically binds
           to the second RNA recognition motif (RRM2) domain of the
           partner DBHS protein via a substantial interaction
           surface. Its highly conserved C-terminal residues are
           critical for functional DBHS dimerization while the
           highly conserved C-terminal helical extension, forming a
           right-handed antiparallel heterodimeric coiled-coil, is
           essential for localization of these proteins to
           subnuclear bodies.
          Length = 94

 Score = 68.9 bits (168), Expect = 4e-15
 Identities = 44/90 (48%), Positives = 66/90 (73%), Gaps = 1/90 (1%)

Query: 49  KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
           +PVIVEP++  DDE+GL E+ + +K+  Y K+R   PRFA  G+FEFEY +RWK L E+ 
Sbjct: 2   RPVIVEPMEQFDDEDGLPEKLM-QKTQQYHKEREQPPRFAQPGTFEFEYASRWKALDEME 60

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENE 138
           KQ+ E + + ++  +EKLEA+ME AR+E++
Sbjct: 61  KQQREQVDRNIREAKEKLEAEMEAARHEHQ 90


>gnl|CDD|240582 cd12947, NOPS_p54nrb, NOPS domain, including C-terminal coiled-coil
           region, in 54 kDa nuclear RNA- and DNA-binding protein
           (p54nrb) and similar proteins.  The family contains a
           DBHS domain (for Drosophila behavior, human splicing),
           which comprises two conserved RNA recognition motifs
           (RRMs), also termed RBDs (RNA binding domains) or RNPs
           (ribonucleoprotein domains), and a charged
           protein-protein interaction NOPS (NONA and PSP1) domain.
           This model corresponds to the NOPS domain, with a long
           helical C-terminal extension, found in p54nrb, also
           termed non-POU domain-containing octamer-binding protein
           (NONO), or 55 kDa nuclear protein (NMT55), or
           DNA-binding p52/p100 complex 52 kDa subunit. It is a
           multi-functional protein involved in numerous nuclear
           processes including transcriptional regulation,
           splicing, DNA unwinding, nuclear retention of
           hyperedited double-stranded RNA, viral RNA processing,
           control of cell proliferation, and circadian rhythm
           maintenance. p54nrb is ubiquitously expressed and highly
           conserved in vertebrates. It binds both single- and
           double-stranded RNA and DNA, and also possesses inherent
           carbonic anhydrase activity. p54nrb forms a heterodimer
           with paraspeckle component 1 (PSPC1 or PSP1), localizing
           to paraspeckles in an RNA-dependent manner. It also
           forms a heterodimer with polypyrimidine tract-binding
           protein-associated-splicing factor (PSF). The NOPS
           domain specifically binds to the second RNA recognition
           motif (RRM2) domain of the partner DBHS protein via a
           substantial interaction surface. Its highly conserved
           C-terminal residues are critical for functional DBHS
           dimerization while the highly conserved C-terminal
           helical extension, forming a right-handed antiparallel
           heterodimeric coiled-coil, is essential for paraspeckle
           localization to subnuclear bodies.
          Length = 94

 Score = 68.2 bits (166), Expect = 8e-15
 Identities = 46/94 (48%), Positives = 64/94 (68%), Gaps = 1/94 (1%)

Query: 49  KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
           +PV VEP+D +DDEEGL E+ V K    Y K+R   PRFA  GSFE+EY  RWK L E+ 
Sbjct: 2   RPVTVEPMDQLDDEEGLPEKLVIKNQQ-YHKEREQPPRFAQPGSFEYEYAMRWKALIEME 60

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEIL 142
           KQ++E + + +K   EKLE +ME AR+E++  ++
Sbjct: 61  KQQQEQVDRNIKEAREKLEMEMEAARHEHQVMLM 94


>gnl|CDD|240583 cd12948, NOPS_PSF, NOPS domain, including C-terminal coiled-coil
           region, in polypyrimidine tract-binding protein
           (PTB)-associated-splicing factor (PSF) and similar
           proteins.  This model contains the NOPS (NONA and PSP1)
           domain of PSF (also termed proline- and glutamine-rich
           splicing factor, or 100 kDa DNA-pairing protein
           (POMp100), or 100 kDa subunit of DNA-binding p52/p100
           complex), with a long helical C-terminal extension. PSF
           is a multifunctional protein that mediates diverse
           activities in the cell. It is ubiquitously expressed and
           highly conserved in vertebrates. PSF binds not only RNA
           but also single-stranded DNA (ssDNA) as well as
           double-stranded DNA (dsDNA) and facilitates the
           renaturation of complementary ssDNAs. Additionally, it
           promotes the formation of D-loops in superhelical duplex
           DNA, and is involved in cell proliferation. PSF can also
           interact with multiple factors. It is an RNA-binding
           component of spliceosomes and binds to insulin-like
           growth factor response element (IGFRE). Moreover, PSF
           functions as a transcriptional repressor interacting
           with Sin3A and mediating silencing through the
           recruitment of histone deacetylases (HDACs) to the DNA
           binding domain (DBD) of nuclear hormone receptors. As an
           RNA-binding component of spliceosomes, PSF binds to the
           insulin-like growth factor response element (IGFRE), and
           acts as an independent negative regulator of the
           transcriptional activity of the porcine P-450
           cholesterol side-chain cleavage enzyme gene (P450scc)
           IGFRE. PSF is an essential pre-mRNA splicing factor and
           is dissociated from PTB and binds to U1-70K and
           serine-arginine (SR) proteins during apoptosis. In
           addition, PSF forms a heterodimer with the nuclear
           protein p54nrb, also known as non-POU domain-containing
           octamer-binding protein (NONO). The PSF/p54nrb complex
           displays a variety of functions, such as DNA
           recombination and RNA synthesis, processing, and
           transport. PSF contains two conserved RNA recognition
           motifs (RRMs), also termed RBDs (RNA binding domains) or
           RNPs (ribonucleoprotein domains), which are responsible
           for interactions with RNA and for the localization of
           the protein in speckles. It also contains an N-terminal
           region rich in proline, glycine, and glutamine residues,
           which may play a role in interactions recruiting other
           molecules. The NOPS domain specifically binds to the
           second RNA recognition motif (RRM2) domain of the
           partner DBHS protein via a substantial interaction
           surface. Its highly conserved C-terminal residues are
           critical for functional DBHS dimerization while the
           highly conserved C-terminal helical extension, forming a
           right-handed antiparallel heterodimeric coiled-coil, is
           essential for localization of these proteins to
           subnuclear bodies.
          Length = 97

 Score = 67.9 bits (165), Expect = 1e-14
 Identities = 46/96 (47%), Positives = 71/96 (73%), Gaps = 1/96 (1%)

Query: 49  KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
           +PVIVEPL+ +DDE+GL E+ +++K+  Y K+R   PRFA  G+FE+EY  RWK L E+ 
Sbjct: 2   RPVIVEPLEQLDDEDGLPEK-LAQKNPMYQKERETPPRFAQPGTFEYEYSQRWKSLDEME 60

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRE 144
           KQ+ E ++K +K  +EKLE++ME A +E++  +LR+
Sbjct: 61  KQQREQVEKNMKEAKEKLESEMEDAYHEHQANLLRQ 96


>gnl|CDD|221389 pfam12037, DUF3523, Domain of unknown function (DUF3523).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 257 and 277 amino acids in length. This domain is
           found associated with pfam00004. This domain has a
           conserved LER sequence motif.
          Length = 276

 Score = 43.6 bits (103), Expect = 5e-05
 Identities = 39/133 (29%), Positives = 69/133 (51%), Gaps = 18/133 (13%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           K+ +EL K +E+  Q EL+ + ++ EAQ        + ++ R R    VE   +++   E
Sbjct: 51  KKAFELSKMQEKTRQAELEAKIKEYEAQQA------QAKLERAR----VEAEERRKTLQE 100

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
           + + E Q   A+Y++E    R++ ++    + RQ +E  LK +     R+   EAMRR T
Sbjct: 101 QTQQEQQR--AQYQDELA--RKRYQKELEQQRRQNEEL-LKMQEESVLRQ---EAMRRAT 152

Query: 222 EEIHLRMQQQDEE 234
           EE  L M+++  E
Sbjct: 153 EEEILEMRRETIE 165



 Score = 30.5 bits (69), Expect = 0.84
 Identities = 30/115 (26%), Positives = 56/115 (48%), Gaps = 2/115 (1%)

Query: 96  EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ 155
           E   R K L E  +Q+++  Q + +L  ++ + ++E  R +NE  +  + E    +EA++
Sbjct: 90  EAEERRKTLQEQTQQEQQRAQYQDELARKRYQKELEQQRRQNEELLKMQEESVLRQEAMR 149

Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
           +    EEE LE + E    E E E    + +     R R K+E E ++ + E  +
Sbjct: 150 R--ATEEEILEMRRETIEEEAELERENIRAKIEAEARGRAKEERENEDINREMLK 202



 Score = 30.5 bits (69), Expect = 0.96
 Identities = 19/68 (27%), Positives = 38/68 (55%)

Query: 175 ENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEE 234
           E E +I   + +Q +A  ER + E E + +  +EQ + +++  + Q E    R Q++ E+
Sbjct: 67  ELEAKIKEYEAQQAQAKLERARVEAEERRKTLQEQTQQEQQRAQYQDELARKRYQKELEQ 126

Query: 235 LRRRHQEN 242
            RR+++E 
Sbjct: 127 QRRQNEEL 134



 Score = 28.2 bits (63), Expect = 4.3
 Identities = 23/90 (25%), Positives = 43/90 (47%), Gaps = 10/90 (11%)

Query: 151 EEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
           E+  Q EL+ + ++ EAQ   A+ E          R R    ER+K   E  ++  +  +
Sbjct: 61  EKTRQAELEAKIKEYEAQQAQAKLE----------RARVEAEERRKTLQEQTQQEQQRAQ 110

Query: 211 RCDEEAMRRQTEEIHLRMQQQDEELRRRHQ 240
             DE A +R  +E+  + +Q +E L+ + +
Sbjct: 111 YQDELARKRYQKELEQQRRQNEELLKMQEE 140


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 40.1 bits (95), Expect = 9e-04
 Identities = 37/130 (28%), Positives = 66/130 (50%), Gaps = 8/130 (6%)

Query: 111 KEEALQKELKLEEEKLEAQMEFARYENETEI-LRERECRFVEEALQKELKLEEEKLEAQM 169
           +E   + E   +E  LEA+ E  +  NE E  LRER      + L+K L  +EE L+ ++
Sbjct: 45  EEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNEL--QKLEKRLLQKEENLDRKL 102

Query: 170 E-FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEI-HLR 227
           E   + E E E   ++L Q++ + E++++  EL+E   E+ +   E       EE   + 
Sbjct: 103 ELLEKREEELEKKEKELEQKQQELEKKEE--ELEELIEEQLQEL-ERISGLTAEEAKEIL 159

Query: 228 MQQQDEELRR 237
           +++ +EE R 
Sbjct: 160 LEKVEEEARH 169



 Score = 39.4 bits (93), Expect = 0.001
 Identities = 30/122 (24%), Positives = 62/122 (50%), Gaps = 5/122 (4%)

Query: 116 QKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE 175
           + ++K  EE+ +  +E A+ E E  I +E      EE  +   + E+E  E + E  + E
Sbjct: 30  EAKIKEAEEEAKRILEEAKKEAE-AIKKEALLEAKEEIHKLRNEFEKELRERRNELQKLE 88

Query: 176 NETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEEL 235
                  E L ++    E++++E E KE+  E++    ++ + ++ EE+   +++Q +EL
Sbjct: 89  KRLLQKEENLDRKLELLEKREEELEKKEKELEQK----QQELEKKEEELEELIEEQLQEL 144

Query: 236 RR 237
            R
Sbjct: 145 ER 146


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 38.9 bits (91), Expect = 0.002
 Identities = 42/147 (28%), Positives = 65/147 (44%), Gaps = 7/147 (4%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRER--ECRFVEEALQKEL- 158
           ++L  L ++ E   Q+  +LE+E  E + E    E + + L E   E     E L++EL 
Sbjct: 807 RRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEELE 866

Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEE----QRRCDE 214
           +LE EK E + E    E E E L E+LR+ E++    K+E E      EE      R + 
Sbjct: 867 ELEAEKEELEDELKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERLEV 926

Query: 215 EAMRRQTEEIHLRMQQQDEELRRRHQE 241
           E    + E         + EL R  + 
Sbjct: 927 ELPELEEELEEEYEDTLETELEREIER 953



 Score = 36.6 bits (85), Expect = 0.013
 Identities = 51/161 (31%), Positives = 72/161 (44%), Gaps = 12/161 (7%)

Query: 94  EFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRER--ECRFVE 151
           E E      +L EL K+ EE  ++  +LEEE  E Q E    E E E L+    E R   
Sbjct: 224 ELELALLLAKLKELRKELEELEEELSRLEEELEELQEELEEAEKEIEELKSELEELREEL 283

Query: 152 EALQKEL--------KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKE 203
           E LQ+EL        +LE E    +      ENE E L E+L + +   E  K+E E +E
Sbjct: 284 EELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERE 343

Query: 204 RHAEE--QRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQEN 242
              EE  Q   + E  + + EE    + ++ EEL    +E 
Sbjct: 344 TLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREE 384



 Score = 34.7 bits (80), Expect = 0.056
 Identities = 34/132 (25%), Positives = 55/132 (41%), Gaps = 1/132 (0%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           +QL EL +Q EE  ++   LEEE  + Q      E E E L E      E   + E +LE
Sbjct: 709 RQLEELERQLEELKRELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELE 768

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
             +          E   E  R+ L++   + E + +E E +    E +    E+   R  
Sbjct: 769 SLEEALAKLKEEIEELEEK-RQALQEELEELEEELEEAERRLDALERELESLEQRRERLE 827

Query: 222 EEIHLRMQQQDE 233
           +EI    ++ +E
Sbjct: 828 QEIEELEEEIEE 839



 Score = 32.8 bits (75), Expect = 0.20
 Identities = 35/155 (22%), Positives = 61/155 (39%), Gaps = 13/155 (8%)

Query: 100 RWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF------VEEA 153
           R  +L  L  + +E  ++  +LEEE    +      E   E L E E         +EE 
Sbjct: 223 RELELALLLAKLKELRKELEELEEELSRLE---EELEELQEELEEAEKEIEELKSELEEL 279

Query: 154 LQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEW----ELKERHAEEQ 209
            ++  +L+EE LE + E    E E  +LRE+L + E + E  ++      E  E   EE 
Sbjct: 280 REELEELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEEL 339

Query: 210 RRCDEEAMRRQTEEIHLRMQQQDEELRRRHQENSI 244
              +      +     L   +++ E +       +
Sbjct: 340 EERETLLEELEQLLAELEEAKEELEEKLSALLEEL 374



 Score = 31.6 bits (72), Expect = 0.49
 Identities = 37/156 (23%), Positives = 64/156 (41%), Gaps = 16/156 (10%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF--VEEALQKELK 159
           ++L EL  + EE  ++  +L+E+    + E    E   E L +        +E L+++L 
Sbjct: 309 ERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLS 368

Query: 160 LEEEKLEAQM------------EFARYENETEILREQLRQREADRERQKQEWE--LKERH 205
              E+LE               E A   NE E L+ ++   E   ER  +  E   +E  
Sbjct: 369 ALLEELEELFEALREELAELEAELAEIRNELEELKREIESLEERLERLSERLEDLKEELK 428

Query: 206 AEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
             E    + +    +  E    +++Q EELR R +E
Sbjct: 429 ELEAELEELQTELEELNEELEELEEQLEELRDRLKE 464



 Score = 31.2 bits (71), Expect = 0.69
 Identities = 35/142 (24%), Positives = 65/142 (45%), Gaps = 2/142 (1%)

Query: 105 YELYKQKEEALQKELKLEEEKLEAQMEFARYE-NETEILRERECRFVEEALQKELKLEEE 163
            E  +Q  +   +EL+ E E+ E +++    E    E  RER  + +EE  ++  +LEE+
Sbjct: 784 LEEKRQALQEELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEK 843

Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
             E + E    E E E L+E+L + EA++E  + E +  E   EE    +   +  +  E
Sbjct: 844 LDELEEELEELEKELEELKEELEELEAEKEELEDELKELEEEKEELEE-ELRELESELAE 902

Query: 224 IHLRMQQQDEELRRRHQENSIF 245
           +   +++  E L     +    
Sbjct: 903 LKEEIEKLRERLEELEAKLERL 924



 Score = 30.5 bits (69), Expect = 1.1
 Identities = 30/136 (22%), Positives = 59/136 (43%), Gaps = 5/136 (3%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           ++L EL  Q E+  ++   L+ E    +        + E L  +    +EE  ++   LE
Sbjct: 674 EELAELEAQLEKLEEELKSLKNELRSLEDLLEELRRQLEELERQ----LEELKRELAALE 729

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
           EE  + Q      E E E L E+L + +   E  ++E E     A  + + + E +  + 
Sbjct: 730 EELEQLQSRLEELEEELEELEEELEELQERLEELEEELE-SLEEALAKLKEEIEELEEKR 788

Query: 222 EEIHLRMQQQDEELRR 237
           + +   +++ +EEL  
Sbjct: 789 QALQEELEELEEELEE 804



 Score = 29.7 bits (67), Expect = 2.0
 Identities = 31/123 (25%), Positives = 50/123 (40%), Gaps = 12/123 (9%)

Query: 90   IGSFEFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF 149
            +   E E     ++L EL  +  E  ++  KL E   E + +  R E E   L E     
Sbjct: 879  LKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERLEVELPELEEELEEE 938

Query: 150  VEEALQKELKLEEEKLEAQM------------EFARYENETEILREQLRQREADRERQKQ 197
             E+ L+ EL+ E E+LE ++            E+   E   E L+ Q    E  +E+  +
Sbjct: 939  YEDTLETELEREIERLEEEIEALGPVNLRAIEEYEEVEERYEELKSQREDLEEAKEKLLE 998

Query: 198  EWE 200
              E
Sbjct: 999  VIE 1001



 Score = 28.9 bits (65), Expect = 3.3
 Identities = 38/164 (23%), Positives = 69/164 (42%), Gaps = 14/164 (8%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
            +L EL ++ EE  ++  +L+E   E + E    E     L+E     +EE  +K   L+
Sbjct: 737 SRLEELEEELEELEEELEELQERLEELEEELESLEEALAKLKEE----IEELEEKRQALQ 792

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRER---QKQEWELKERHAEEQRRCDEEAMR 218
           EE  E + E    E   + L  +L   E  RER   + +E E +    EE+    EE + 
Sbjct: 793 EELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELE 852

Query: 219 RQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDLKQGVYQL 262
              +E+    ++ +E    + +           L +L++   +L
Sbjct: 853 ELEKELEELKEELEELEAEKEELEDE-------LKELEEEKEEL 889



 Score = 28.5 bits (64), Expect = 4.4
 Identities = 35/147 (23%), Positives = 64/147 (43%), Gaps = 3/147 (2%)

Query: 102 KQLYELYKQKEEALQKELK-LEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
           K+     +++ E LQ  L+ LEEE  E + E    +   E L E      E   + + ++
Sbjct: 722 KRELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELESLEEALAKLKEEI 781

Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQ 220
           EE + + Q      E   E L E  R+ +A     +   + +ER  +E    +EE    +
Sbjct: 782 EELEEKRQALQEELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEE--IEE 839

Query: 221 TEEIHLRMQQQDEELRRRHQENSIFMQ 247
            EE    ++++ EEL +  +E    ++
Sbjct: 840 LEEKLDELEEELEELEKELEELKEELE 866


>gnl|CDD|233973 TIGR02680, TIGR02680, TIGR02680 family protein.  Members of this
           protein family belong to a conserved four-gene
           neighborhood found sporadically in a phylogenetically
           broad range of bacteria: Nocardia farcinica,
           Symbiobacterium thermophilum, and Streptomyces
           avermitilis (Actinobacteria), Geobacillus kaustophilus
           (Firmicutes), Azoarcus sp. EbN1 and Ralstonia
           solanacearum (Betaproteobacteria). Proteins in this
           family average over 1400 amino acids in length
           [Hypothetical proteins, Conserved].
          Length = 1353

 Score = 38.6 bits (90), Expect = 0.003
 Identities = 34/103 (33%), Positives = 43/103 (41%), Gaps = 3/103 (2%)

Query: 133 ARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE--NETEILREQLRQREA 190
           AR E ET    ERE     EAL++E      +LEA      Y+   E E  R      +A
Sbjct: 288 ARDELETAREEERELDARTEALEREADALRTRLEALQGSPAYQDAEELERARADAEALQA 347

Query: 191 DRERQKQEWELKERHAEE-QRRCDEEAMRRQTEEIHLRMQQQD 232
                +Q     E   EE +RR DEEA R    E  LR  ++ 
Sbjct: 348 AAADARQAIREAESRLEEERRRLDEEAGRLDDAERELRAAREQ 390


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
           bacterial type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. This family
           represents the SMC protein of most bacteria. The smc
           gene is often associated with scpB (TIGR00281) and scpA
           genes, where scp stands for segregation and condensation
           protein. SMC was shown (in Caulobacter crescentus) to be
           induced early in S phase but present and bound to DNA
           throughout the cell cycle [Cellular processes, Cell
           division, DNA metabolism, Chromosome-associated
           proteins].
          Length = 1179

 Score = 38.1 bits (89), Expect = 0.005
 Identities = 36/149 (24%), Positives = 63/149 (42%), Gaps = 18/149 (12%)

Query: 109 KQKEEALQKEL--------KLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
           +++ E LQKEL        +LE++K   +   A  E + E L  +    +EE   K  +L
Sbjct: 280 EEEIEELQKELYALANEISRLEQQKQILRERLANLERQLEELEAQ----LEELESKLDEL 335

Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRC------DE 214
            EE  E + +    + E E L  +L + EA+ E  +   E  E   E  R          
Sbjct: 336 AEELAELEEKLEELKEELESLEAELEELEAELEELESRLEELEEQLETLRSKVAQLELQI 395

Query: 215 EAMRRQTEEIHLRMQQQDEELRRRHQENS 243
            ++  + E +  R+++ ++   R  QE  
Sbjct: 396 ASLNNEIERLEARLERLEDRRERLQQEIE 424



 Score = 37.3 bits (87), Expect = 0.008
 Identities = 47/160 (29%), Positives = 68/160 (42%), Gaps = 21/160 (13%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           +QL EL  Q EE   K  +L EE  E + +    + E E L         EA  +EL+  
Sbjct: 316 RQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEEL--EAELEELESR 373

Query: 162 EEKLEAQMEFARYE------------NETEILREQLRQREADRERQKQEWELKERHAEEQ 209
            E+LE Q+E  R +            NE E L  +L + E  RER +QE E   +  EE 
Sbjct: 374 LEELEEQLETLRSKVAQLELQIASLNNEIERLEARLERLEDRRERLQQEIEELLKKLEEA 433

Query: 210 RRCD-------EEAMRRQTEEIHLRMQQQDEELRRRHQEN 242
              +        E    + +E   R+++  EELR   +E 
Sbjct: 434 ELKELQAELEELEEELEELQEELERLEEALEELREELEEA 473



 Score = 33.9 bits (78), Expect = 0.092
 Identities = 39/147 (26%), Positives = 65/147 (44%), Gaps = 12/147 (8%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
            +L  L  + EE  ++  +L+EE  EA+ E      E + L E+    +EE   +  +LE
Sbjct: 225 LELALLVLRLEELREELEELQEELKEAEEELEELTAELQELEEK----LEELRLEVSELE 280

Query: 162 EEKLEAQMEF-------ARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE 214
           EE  E Q E        +R E + +ILRE+L   E   E  + + E  E   +E    + 
Sbjct: 281 EEIEELQKELYALANEISRLEQQKQILRERLANLERQLEELEAQLEELESKLDELAE-EL 339

Query: 215 EAMRRQTEEIHLRMQQQDEELRRRHQE 241
             +  + EE+   ++  + EL     E
Sbjct: 340 AELEEKLEELKEELESLEAELEELEAE 366



 Score = 33.1 bits (76), Expect = 0.18
 Identities = 38/150 (25%), Positives = 66/150 (44%), Gaps = 11/150 (7%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRER---ECRFVEEALQKEL 158
           K+L EL ++ E+  ++  +L  +    + + AR E E E L ER     + + E   +  
Sbjct: 705 KELEELEEELEQLRKELEELSRQISALRKDLARLEAEVEQLEERIAQLSKELTELEAEIE 764

Query: 159 KLEEEKLEAQMEFARYENETEILREQL----RQREADRERQKQEWELKERHAEEQRRCDE 214
           +LEE   EA+ E A  E E E L  Q+     + +A RE      EL+            
Sbjct: 765 ELEERLEEAEEELAEAEAEIEELEAQIEQLKEELKALREALD---ELRAELTLLNEEAAN 821

Query: 215 EAMRRQTEEIHLRM-QQQDEELRRRHQENS 243
              R ++ E  +   +++ E+L  + +E S
Sbjct: 822 LRERLESLERRIAATERRLEDLEEQIEELS 851



 Score = 31.6 bits (72), Expect = 0.60
 Identities = 29/144 (20%), Positives = 52/144 (36%), Gaps = 6/144 (4%)

Query: 103 QLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEE 162
           +L E  ++  EA  +  +LE +  + + E        + LR       EEA     +LE 
Sbjct: 769 RLEEAEEELAEAEAEIEELEAQIEQLKEELKALREALDELRAELTLLNEEAANLRERLES 828

Query: 163 EKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKER------HAEEQRRCDEEA 216
            +        R E+  E + E     E+     ++  EL E           +R   EEA
Sbjct: 829 LERRIAATERRLEDLEEQIEELSEDIESLAAEIEELEELIEELESELEALLNERASLEEA 888

Query: 217 MRRQTEEIHLRMQQQDEELRRRHQ 240
           +     E+    ++  E   +R +
Sbjct: 889 LALLRSELEELSEELRELESKRSE 912



 Score = 29.6 bits (67), Expect = 2.0
 Identities = 30/99 (30%), Positives = 50/99 (50%), Gaps = 3/99 (3%)

Query: 109 KQKEEALQKELKLEEEKLEA-QMEFARYENETEILRERECRFVEEALQKEL-KLEEEKLE 166
           + +  +L  E++  E +LE  +    R + E E L ++      + LQ EL +LEEE  E
Sbjct: 392 ELQIASLNNEIERLEARLERLEDRRERLQQEIEELLKKLEEAELKELQAELEELEEELEE 451

Query: 167 AQMEFARYENETEILREQLRQ-READRERQKQEWELKER 204
            Q E  R E   E LRE+L +  +A    +++  +L+ R
Sbjct: 452 LQEELERLEEALEELREELEEAEQALDAAERELAQLQAR 490



 Score = 28.1 bits (63), Expect = 6.1
 Identities = 28/127 (22%), Positives = 49/127 (38%), Gaps = 6/127 (4%)

Query: 136 ENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQ 195
           E   EI    E   +EE  +K  +LE+   E + E    E E     EQLR+   +  RQ
Sbjct: 674 ERRREIEELEEK--IEELEEKIAELEKALAELRKELEELEEE----LEQLRKELEELSRQ 727

Query: 196 KQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDL 255
                      E +    EE + + ++E+     + +E   R  +      +    + +L
Sbjct: 728 ISALRKDLARLEAEVEQLEERIAQLSKELTELEAEIEELEERLEEAEEELAEAEAEIEEL 787

Query: 256 KQGVYQL 262
           +  + QL
Sbjct: 788 EAQIEQL 794


>gnl|CDD|215621 PLN03188, PLN03188, kinesin-12 family protein; Provisional.
          Length = 1320

 Score = 38.0 bits (88), Expect = 0.005
 Identities = 28/83 (33%), Positives = 44/83 (53%), Gaps = 5/83 (6%)

Query: 157  ELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEA 216
            E KLE+E+L      +++ +  E LR +L    A  E+QK E + ++R AEE +   + A
Sbjct: 1046 EKKLEQERLRWTEAESKWISLAEELRTELDASRALAEKQKHELDTEKRCAEELKEAMQMA 1105

Query: 217  MRRQTEEIHLRMQQQDEELRRRH 239
            M     E H RM +Q  +L  +H
Sbjct: 1106 M-----EGHARMLEQYADLEEKH 1123


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 36.7 bits (84), Expect = 0.014
 Identities = 28/139 (20%), Positives = 66/139 (47%), Gaps = 5/139 (3%)

Query: 106  ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ---KELKLEE 162
            E  K +E  ++ E   + E+ + ++E  + +   E  +  E +  EE  +    E   + 
Sbjct: 1611 EAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKA 1670

Query: 163  EKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTE 222
            E+ + + E A+   E E    +  ++EA+  ++ +E  LK++ AEE+++ +E     +  
Sbjct: 1671 EEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEE--LKKKEAEEKKKAEELKKAEEEN 1728

Query: 223  EIHLRMQQQDEELRRRHQE 241
            +I     +++ E  ++  E
Sbjct: 1729 KIKAEEAKKEAEEDKKKAE 1747



 Score = 36.3 bits (83), Expect = 0.018
 Identities = 35/127 (27%), Positives = 65/127 (51%), Gaps = 7/127 (5%)

Query: 102  KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK--ELK 159
            ++L +  ++K++  Q + K  EEK +A+ E  + E E +I    E +  EE  +K  E K
Sbjct: 1623 EELKKAEEEKKKVEQLKKKEAEEKKKAE-ELKKAEEENKIKAAEEAKKAEEDKKKAEEAK 1681

Query: 160  LEEEKLEAQMEFARYENETEILREQLRQREADRERQ----KQEWELKERHAEEQRRCDEE 215
              EE  +   E  + E E     E+L+++EA+ +++    K+  E  +  AEE ++  EE
Sbjct: 1682 KAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEE 1741

Query: 216  AMRRQTE 222
              ++  E
Sbjct: 1742 DKKKAEE 1748



 Score = 35.9 bits (82), Expect = 0.026
 Identities = 38/143 (26%), Positives = 79/143 (55%), Gaps = 12/143 (8%)

Query: 102  KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
            K+  EL K +E    +E K  EE  +A+ +      + E  ++ E    E  +++ +KL 
Sbjct: 1546 KKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAE----EARIEEVMKLY 1601

Query: 162  EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
            EE+ + + E A+   E +I  E+L++ E   E++K E +LK++ AEE+++ +E  +++  
Sbjct: 1602 EEEKKMKAEEAKKAEEAKIKAEELKKAE--EEKKKVE-QLKKKEAEEKKKAEE--LKKAE 1656

Query: 222  EEIHLRMQQ---QDEELRRRHQE 241
            EE  ++  +   + EE +++ +E
Sbjct: 1657 EENKIKAAEEAKKAEEDKKKAEE 1679



 Score = 34.7 bits (79), Expect = 0.055
 Identities = 30/138 (21%), Positives = 72/138 (52%), Gaps = 11/138 (7%)

Query: 106  ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRE--RECRFVEEALQKELKLEEE 163
            E  K +E  +++ +KL EE+ + + E A+   E +I  E  ++    ++ +++  K E E
Sbjct: 1585 EAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAE 1644

Query: 164  KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
            + +   E  + E E +I   +  ++  + +++ +E   K+   +E++    EA++++ EE
Sbjct: 1645 EKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEE--AKKAEEDEKKA--AEALKKEAEE 1700

Query: 224  IHLRMQQQDEELRRRHQE 241
                  ++ EEL+++  E
Sbjct: 1701 -----AKKAEELKKKEAE 1713



 Score = 34.3 bits (78), Expect = 0.067
 Identities = 38/147 (25%), Positives = 74/147 (50%), Gaps = 11/147 (7%)

Query: 100  RWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKE-- 157
            R +++ +LY+++++   +E K  EE      E  + E E + + + + +  EE  + E  
Sbjct: 1593 RIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEEL 1652

Query: 158  LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAM 217
             K EEE      E A+   E +   E+ ++ E   E +K+  E  ++ AEE ++  EE  
Sbjct: 1653 KKAEEENKIKAAEEAKKAEEDKKKAEEAKKAE---EDEKKAAEALKKEAEEAKKA-EELK 1708

Query: 218  RRQTEEIHLRMQQQDEELRRRHQENSI 244
            +++ EE      ++ EEL++  +EN I
Sbjct: 1709 KKEAEEK-----KKAEELKKAEEENKI 1730



 Score = 34.0 bits (77), Expect = 0.10
 Identities = 31/140 (22%), Positives = 72/140 (51%), Gaps = 3/140 (2%)

Query: 109  KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
            K+K E  +K  + E++  EA  + A    + E L+++E    ++A  +ELK  EE+ + +
Sbjct: 1674 KKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKA--EELKKAEEENKIK 1731

Query: 169  MEFARYENETEILR-EQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
             E A+ E E +  + E+ ++ E ++++     + +E+ AEE R+  E  +  + +E   +
Sbjct: 1732 AEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEK 1791

Query: 228  MQQQDEELRRRHQENSIFMQ 247
             + + ++  +   +N   + 
Sbjct: 1792 RRMEVDKKIKDIFDNFANII 1811



 Score = 31.6 bits (71), Expect = 0.59
 Identities = 31/143 (21%), Positives = 70/143 (48%), Gaps = 6/143 (4%)

Query: 102  KQLYELYKQKEEALQKELKLEEEKLEA---QMEFARYENETEILRERECRFVEEALQKEL 158
            K+  E  K + EA   E +  EEK EA   + E A+ + +    +  E +  +EA +K  
Sbjct: 1342 KKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAE 1401

Query: 159  KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
            + +++  E +   A  +   E  ++   +++AD  ++K E   + + A+E ++  EEA +
Sbjct: 1402 EDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAE---EAKKADEAKKKAEEAKK 1458

Query: 219  RQTEEIHLRMQQQDEELRRRHQE 241
             +  +      ++ +E +++ +E
Sbjct: 1459 AEEAKKKAEEAKKADEAKKKAEE 1481



 Score = 28.2 bits (62), Expect = 6.4
 Identities = 32/147 (21%), Positives = 70/147 (47%), Gaps = 13/147 (8%)

Query: 102  KQLYELYKQKEEALQ-KELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
            K+  E  K+ EEA +  E K + E+ + + + A+ + E         +   EA   E + 
Sbjct: 1302 KKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEA 1361

Query: 161  EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE----EA 216
             EEK EA       E + E  +++    +   E +K+  E K++  E++++ DE     A
Sbjct: 1362 AEEKAEAA------EKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAA 1415

Query: 217  MRRQTEEIHLRMQQ--QDEELRRRHQE 241
             +++ +E   + ++  + +E +++ +E
Sbjct: 1416 AKKKADEAKKKAEEKKKADEAKKKAEE 1442



 Score = 28.2 bits (62), Expect = 6.5
 Identities = 38/149 (25%), Positives = 74/149 (49%), Gaps = 9/149 (6%)

Query: 102  KQLYELYKQKEEALQK--ELKLEEEKLEA-QMEFARYENETEILRERECRFVEEALQK-- 156
            K+  +  K+  EA +K  E K  EE  +A + + A    + +  ++ E +   + L+K  
Sbjct: 1496 KKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAE 1555

Query: 157  ELKLEEEKLEAQMEFARYENETEILR--EQLRQREADR--ERQKQEWELKERHAEEQRRC 212
            ELK  EEK +A+      E++   LR  E+ ++ E  R  E  K   E K+  AEE ++ 
Sbjct: 1556 ELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKA 1615

Query: 213  DEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
            +E  ++ +  +     +++ E+L+++  E
Sbjct: 1616 EEAKIKAEELKKAEEEKKKVEQLKKKEAE 1644



 Score = 28.2 bits (62), Expect = 6.6
 Identities = 29/143 (20%), Positives = 69/143 (48%), Gaps = 7/143 (4%)

Query: 106  ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
            E  K+ +EA + E K + ++ + + E A+  +E +   E   +  + A +K  + ++   
Sbjct: 1287 EEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAE 1346

Query: 166  EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE-----EAMRRQ 220
             A+ E     +E E   E+    E  +E  K++ +  ++ AEE+++ DE     E  +++
Sbjct: 1347 AAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKK 1406

Query: 221  TEEIHLRMQQQD--EELRRRHQE 241
             +E+      +   +E +++ +E
Sbjct: 1407 ADELKKAAAAKKKADEAKKKAEE 1429



 Score = 28.2 bits (62), Expect = 7.0
 Identities = 34/154 (22%), Positives = 69/154 (44%), Gaps = 11/154 (7%)

Query: 102  KQLYELYKQKEEALQK--ELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK--E 157
            K+  EL K +EE   K  E   + E+ + + E A+   E E       +   E  +K  E
Sbjct: 1647 KKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEE 1706

Query: 158  LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKE-------RHAEEQR 210
            LK +E + + + E  +   E   ++ +  ++EA+ +++K E   K+        H +++ 
Sbjct: 1707 LKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEE 1766

Query: 211  RCDEEAMRRQTEEIHLRMQQQDEELRRRHQENSI 244
                E +R++ E +      +++E RR   +  I
Sbjct: 1767 EKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKI 1800


>gnl|CDD|215696 pfam00076, RRM_1, RNA recognition motif. (a.k.a. RRM, RBD, or RNP
          domain).  The RRM motif is probably diagnostic of an
          RNA binding protein. RRMs are found in a variety of RNA
          binding proteins, including various hnRNP proteins,
          proteins implicated in regulation of alternative
          splicing, and protein components of snRNPs. The motif
          also appears in a few single stranded DNA binding
          proteins. The RRM structure consists of four strands
          and two helices arranged in an alpha/beta sandwich,
          with a third helix present during RNA binding in some
          cases. The C-terminal beta strand (4th strand) and final
          helix are hard to align and have been omitted in the
          SEED alignment. The LA proteins have an N terminal rrm
          which is included in the seed. There is a second region
          towards the C terminus that has some features
          characteristic of a rrm but does not appear to have the
          important structural core of a rrm. The LA proteins are
          one of the main autoantigens in Systemic lupus
          erythematosus (SLE), an autoimmune disease.
          Length = 70

 Score = 32.6 bits (75), Expect = 0.023
 Identities = 11/36 (30%), Positives = 18/36 (50%)

Query: 3  IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQ 38
          IE   ++ D+ G SK    +EF  +  A +AL+   
Sbjct: 25 IESIRIVRDETGRSKGFAFVEFEDEEDAEKALEALN 60


>gnl|CDD|216608 pfam01618, MotA_ExbB, MotA/TolQ/ExbB proton channel family.  This
           family groups together integral membrane proteins that
           appear to be involved in translocation of proteins across a
           membrane. These proteins are probably proton channels.
           MotA is an essential component of the flagellar motor
           that uses a proton gradient to generate rotational
           motion in the flagellum. ExbB is part of the
           TonB-dependent transduction complex. The TonB complex
           uses the proton gradient across the inner bacterial
           membrane to transport large molecules across the outer
           bacterial membrane.
          Length = 139

 Score = 34.1 bits (79), Expect = 0.024
 Identities = 13/53 (24%), Positives = 19/53 (35%)

Query: 115 LQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEA 167
            +    +E       + F R      I R     F+  AL++ L  E  KLE 
Sbjct: 1   REPLSAIEILAQLGFLSFVRLGLAPLIERGAFKEFLRLALEEALDAELRKLER 53


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 35.3 bits (82), Expect = 0.032
 Identities = 27/124 (21%), Positives = 56/124 (45%), Gaps = 18/124 (14%)

Query: 132 FARYENETEI--LRERECRFVEEA------LQKELKLE------EEKLEAQMEFARYENE 177
             +   E ++    E   R +EEA      L+KE  LE      + + E + E     NE
Sbjct: 18  LRKRIAEKKLGSAEELAKRIIEEAKKEAETLKKEALLEAKEEVHKLRAELERELKERRNE 77

Query: 178 TEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRR 237
            + L  +L QRE   +R+ +  + KE + E++    E+ +  + + +  + ++ +E +  
Sbjct: 78  LQRLERRLLQREETLDRKMESLDKKEENLEKK----EKELSNKEKNLDEKEEELEELIAE 133

Query: 238 RHQE 241
           + +E
Sbjct: 134 QREE 137



 Score = 32.6 bits (75), Expect = 0.20
 Identities = 37/138 (26%), Positives = 66/138 (47%), Gaps = 11/138 (7%)

Query: 102 KQLYELYKQKEEALQKELKLE--EEKLEAQMEFARYENETEILRERECRFVEEALQKELK 159
           K++ E  K++ E L+KE  LE  EE  + + E    E E +  R    R     LQ+E  
Sbjct: 35  KRIIEEAKKEAETLKKEALLEAKEEVHKLRAEL---ERELKERRNELQRLERRLLQREET 91

Query: 160 LEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRR 219
           L + K+E+     + E   E   ++L  +E  +   ++E EL+E  AE++   +  +   
Sbjct: 92  L-DRKMES---LDKKEENLEKKEKELSNKE--KNLDEKEEELEELIAEQREELERISGLT 145

Query: 220 QTEEIHLRMQQQDEELRR 237
           Q E   + +++ +EE R 
Sbjct: 146 QEEAKEILLEEVEEEARH 163



 Score = 27.6 bits (62), Expect = 7.6
 Identities = 27/126 (21%), Positives = 58/126 (46%), Gaps = 10/126 (7%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
           + + + L++ L   EE L+ +ME    + + E L ++E    +E   KE  L+E++ E +
Sbjct: 75  RNELQRLERRLLQREETLDRKME--SLDKKEENLEKKE----KELSNKEKNLDEKEEELE 128

Query: 169 MEFARYENETE----ILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEI 224
              A    E E    + +E+ ++   +   ++   E  +   E +    EEA ++  E +
Sbjct: 129 ELIAEQREELERISGLTQEEAKEILLEEVEEEARHEAAKLIKEIEEEAKEEADKKAKEIL 188

Query: 225 HLRMQQ 230
              +Q+
Sbjct: 189 ATAIQR 194


>gnl|CDD|240668 cd00590, RRM_SF, RNA recognition motif (RRM) superfamily.  RRM,
          also known as RBD (RNA binding domain) or RNP
          (ribonucleoprotein domain), is a highly abundant domain
          in eukaryotes found in proteins involved in
          post-transcriptional gene expression processes
          including mRNA and rRNA processing, RNA export, and RNA
          stability. This domain is 90 amino acids in length and
          consists of a four-stranded beta-sheet packed against
          two alpha-helices. RRM usually interacts with ssRNA,
          but is also known to interact with ssDNA as well as
          proteins. RRM binds a variable number of nucleotides,
          ranging from two to eight. The active site includes
          three aromatic side-chains located within the conserved
          RNP1 and RNP2 motifs of the domain. The RRM domain is
          found in a variety of heterogeneous nuclear
          ribonucleoproteins (hnRNPs), proteins implicated in
          regulation of alternative splicing, and protein
          components of small nuclear ribonucleoproteins
          (snRNPs).
          Length = 72

 Score = 31.9 bits (73), Expect = 0.049
 Identities = 11/42 (26%), Positives = 16/42 (38%)

Query: 2  NIERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFF 43
           IE   ++ D  G SK    +EF     A +AL+        
Sbjct: 24 EIESVRIVRDKDGKSKGFAFVEFESPEDAEKALEALNGKELD 65


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 34.0 bits (78), Expect = 0.097
 Identities = 31/135 (22%), Positives = 62/135 (45%), Gaps = 6/135 (4%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           K+L EL ++  E L+ E  L+EE  E   +      E E L+E+      +   ++L+  
Sbjct: 508 KELRELEEELIELLELEEALKEELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDR 567

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
            ++L+  +E  R     +   E+LR+R  + +++ +E E +    EE  +        + 
Sbjct: 568 LQELKELLEELRLLRTRKEELEELRERLKELKKKLKELEERLSQLEELLQ------SLEL 621

Query: 222 EEIHLRMQQQDEELR 236
            E    +++ +EEL 
Sbjct: 622 SEAENELEEAEEELE 636



 Score = 32.0 bits (73), Expect = 0.35
 Identities = 35/140 (25%), Positives = 65/140 (46%), Gaps = 4/140 (2%)

Query: 98  GTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKE 157
               K+L ELY+ + E L++EL  E+E+ E + E      E E         + E L+ E
Sbjct: 469 EEHEKELLELYELELEELEEELSREKEEAELREEI----EELEKELRELEEELIELLELE 524

Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAM 217
             L+EE  E   +      E E L+E+L+ ++   E ++ E  L+E     +        
Sbjct: 525 EALKEELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLEELRLLRTR 584

Query: 218 RRQTEEIHLRMQQQDEELRR 237
           + + EE+  R+++  ++L+ 
Sbjct: 585 KEELEELRERLKELKKKLKE 604



 Score = 29.3 bits (66), Expect = 2.9
 Identities = 23/103 (22%), Positives = 39/103 (37%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           ++L E  +     L++  +L E+    +    + E + E L        EE  +    LE
Sbjct: 301 EELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLESELEELAEEKNELAKLLE 360

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKER 204
           E   E +      E E E   E+L+Q E   +  K+E      
Sbjct: 361 ERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELSA 403



 Score = 28.2 bits (63), Expect = 5.8
 Identities = 26/115 (22%), Positives = 46/115 (40%), Gaps = 1/115 (0%)

Query: 106 ELYKQKEEALQKELKLEEEKL-EAQMEFARYENETEILRERECRFVEEALQKELKLEEEK 164
              K++ E L++ LK  ++KL E +   ++ E   + L   E     E  ++EL+ E EK
Sbjct: 582 RTRKEELEELRERLKELKKKLKELEERLSQLEELLQSLELSEAENELEEAEEELESELEK 641

Query: 165 LEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRR 219
           L  Q E            E+  +      R++ +    E   EE+    E+    
Sbjct: 642 LNLQAELEELLQAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEELEQLEEE 696



 Score = 27.8 bits (62), Expect = 7.5
 Identities = 32/160 (20%), Positives = 68/160 (42%), Gaps = 4/160 (2%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           ++L +L    EE  + + KL+ ++L+ ++       +       E R +    ++  +L 
Sbjct: 533 EKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLEELRLLRTRKEELEELR 592

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRR----CDEEAM 217
           E   E + +    E     L E L+  E      + E   +E  +E ++       EE +
Sbjct: 593 ERLKELKKKLKELEERLSQLEELLQSLELSEAENELEEAEEELESELEKLNLQAELEELL 652

Query: 218 RRQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDLKQ 257
           +   EE+  ++++ + E+RR  Q      Q+   L +L+Q
Sbjct: 653 QAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEELEQ 692


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 33.4 bits (77), Expect = 0.10
 Identities = 19/76 (25%), Positives = 42/76 (55%), Gaps = 11/76 (14%)

Query: 148 RFVEEALQKELKL---EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKER 204
           +   E L+K  K    EEEK+    E  R        +E+ ++++ +++++++E +L + 
Sbjct: 252 KLSPEVLRKVDKTREEEEEKILKAAEEER--------QEEAQEKKEEKKKEEREAKLAKL 303

Query: 205 HAEEQRRCDEEAMRRQ 220
             EEQR+ +E+  ++Q
Sbjct: 304 SPEEQRKLEEKERKKQ 319


>gnl|CDD|240832 cd12386, RRM2_hnRNPM_like, RNA recognition motif 2 in
          heterogeneous nuclear ribonucleoprotein M (hnRNP M) and
          similar proteins.  This subfamily corresponds to the
          RRM2 of heterogeneous nuclear ribonucleoprotein M
          (hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2
          or MST156) and similar proteins. hnRNP M is a pre-mRNA
          binding protein that may play an important role in the
          pre-mRNA processing. It also preferentially binds to
          poly(G) and poly(U) RNA homopolymers. hnRNP M is able
          to interact with early spliceosomes, further
          influencing splicing patterns of specific pre-mRNAs. It
          functions as the receptor of carcinoembryonic antigen
          (CEA) that contains the penta-peptide sequence PELPK
          signaling motif. In addition, hnRNP M and another
          splicing factor Nova-1 work together as dopamine D2
          receptor (D2R) pre-mRNA-binding proteins. They regulate
          alternative splicing of D2R pre-mRNA in an antagonistic
          manner. hnRNP M contains three RNA recognition motifs
          (RRMs), also termed RBDs (RNA binding domains) or RNPs
          (ribonucleoprotein domains), and an unusual
          hexapeptide-repeat region rich in methionine and
          arginine residues (MR repeat motif). MEF-2 is a
          sequence-specific single-stranded DNA (ssDNA) binding
          protein that binds specifically to ssDNA derived from
          the proximal (MB1) element of the myelin basic protein
          (MBP) promoter and represses transcription of the MBP
          gene. MEF-2 shows high sequence homology with hnRNP M.
          It also contains three RRMs, which may be responsible
          for its ssDNA binding activity.
          Length = 74

 Score = 31.2 bits (71), Expect = 0.10
 Identities = 10/33 (30%), Positives = 17/33 (51%)

Query: 2  NIERAVVLVDDRGNSKNEGIIEFTRKPAAAQAL 34
           + RA +  D  G S+  G+++F     A QA+
Sbjct: 24 KVVRADIKEDKEGKSRGMGVVQFEHPIEAVQAI 56


>gnl|CDD|130902 TIGR01843, type_I_hlyD, type I secretion membrane fusion protein,
           HlyD family.  Type I secretion is an ABC transport
           process that exports proteins, without cleavage of any
           signal sequence, from the cytosol to extracellular
           medium across both inner and outer membranes. The
           secretion signal is found in the C-terminus of the
           transported protein. This model represents the adaptor
           protein between the ATP-binding cassette (ABC) protein
           of the inner membrane and the outer membrane protein,
           and is called the membrane fusion protein. This model
           selects a subfamily closely related to HlyD; it is
           defined narrowly and excludes, for example, colicin V
           secretion protein CvaA and multidrug efflux proteins
           [Protein fate, Protein and peptide secretion and
           trafficking].
          Length = 423

 Score = 33.4 bits (77), Expect = 0.12
 Identities = 35/156 (22%), Positives = 63/156 (40%), Gaps = 16/156 (10%)

Query: 95  FEYGTRWKQLYELYKQKEEALQKELK-LEEEKLEAQMEFARYENETEILRER------EC 147
                  K    L++ ++  L+ +L+ +  +  + + E A  + + + LR++      E 
Sbjct: 122 PAVPELIKGQQSLFESRKSTLRAQLELILAQIKQLEAELAGLQAQLQALRQQLEVISEEL 181

Query: 148 RFVEEALQKE-------LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWE 200
               +  +K        L+LE E+ EAQ E  R E E E+L+ Q+   E   ERQ+ E  
Sbjct: 182 EARRKLKEKGLVSRLELLELERERAEAQGELGRLEAELEVLKRQI--DELQLERQQIEQT 239

Query: 201 LKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELR 236
            +E   EE           +      R + Q   +R
Sbjct: 240 FREEVLEELTEAQARLAELRERLNKARDRLQRLIIR 275


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
           envelope biogenesis, outer membrane].
          Length = 387

 Score = 33.4 bits (76), Expect = 0.12
 Identities = 24/117 (20%), Positives = 51/117 (43%), Gaps = 5/117 (4%)

Query: 100 RWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELK 159
           R +      K+ E+  +K+ +   E+L+ +        E E L++ E   ++   Q++  
Sbjct: 66  RIQSQQSSAKKGEQQRKKKEEQVAEELKPKQA-----AEQERLKQLEKERLKAQEQQKQA 120

Query: 160 LEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEA 216
            E EK     +  + E   +   EQ ++ EA + +   E    +  AE +++ +E A
Sbjct: 121 EEAEKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAA 177



 Score = 32.2 bits (73), Expect = 0.24
 Identities = 33/125 (26%), Positives = 60/125 (48%), Gaps = 9/125 (7%)

Query: 102 KQLYELYKQKEEALQKELK-LEEEKLEAQMEFARYENETEILRERECRFVEE-----ALQ 155
           +Q+ E  K K+ A Q+ LK LE+E+L+AQ E  +   E E   + E +  EE     A +
Sbjct: 86  EQVAEELKPKQAAEQERLKQLEKERLKAQ-EQQKQAEEAEKQAQLEQKQQEEQARKAAAE 144

Query: 156 KELKLEEEKLEAQMEFARYENETEILR--EQLRQREADRERQKQEWELKERHAEEQRRCD 213
           ++ K E  K +A  E A+ +   E  +  E+  +   + + + +    K++   E +   
Sbjct: 145 QKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEAKAAA 204

Query: 214 EEAMR 218
           E+A  
Sbjct: 205 EKAKA 209


>gnl|CDD|129705 TIGR00618, sbcc, exonuclease SbcC.  All proteins in this family for
           which functions are known are part of an exonuclease
           complex with sbcD homologs. This complex is involved in
           the initiation of recombination to regulate the levels
           of palindromic sequences in DNA. This family is based on
           the phylogenomic analysis of JA Eisen (1999, Ph.D.
           Thesis, Stanford University) [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 1042

 Score = 33.4 bits (76), Expect = 0.13
 Identities = 28/140 (20%), Positives = 57/140 (40%), Gaps = 6/140 (4%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           +Q +    QK EA +++LK ++   + +        +  +L E + R +  A +      
Sbjct: 239 QQSHAYLTQKREAQEEQLKKQQLLKQLRARIEELRAQEAVLEETQER-INRARKAAPLAA 297

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
             K   Q+E       TE+   Q + R   +   K+   +K++ + E++R   + +  Q 
Sbjct: 298 HIKAVTQIEQQAQRIHTEL---QSKMRSRAKLLMKRAAHVKQQSSIEEQRRLLQTLHSQ- 353

Query: 222 EEIHLRMQQQDEELRRRHQE 241
            EIH+R   +     R    
Sbjct: 354 -EIHIRDAHEVATSIREISC 372


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 33.1 bits (75), Expect = 0.16
 Identities = 17/53 (32%), Positives = 32/53 (60%), Gaps = 5/53 (9%)

Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRR 211
           KL +++ EA  E A+ E E +   E+ R++E ++ER+++    +ER AE   +
Sbjct: 576 KLAKKREEAV-EKAKREAEQKAREEREREKEKEKERERE----REREAERAAK 623



 Score = 30.4 bits (68), Expect = 1.2
 Identities = 16/45 (35%), Positives = 30/45 (66%), Gaps = 5/45 (11%)

Query: 153 ALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQ 197
           A ++E  +E+ K EA+ + AR E E    RE+ +++E +RER+++
Sbjct: 578 AKKREEAVEKAKREAEQK-AREERE----REKEKEKERERERERE 617



 Score = 28.5 bits (63), Expect = 4.8
 Identities = 16/41 (39%), Positives = 25/41 (60%), Gaps = 1/41 (2%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERE 146
           +L K++EEA++K  +  E+K   + E  + E E E  RERE
Sbjct: 576 KLAKKREEAVEKAKREAEQKAREEREREK-EKEKERERERE 615


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 33.1 bits (76), Expect = 0.16
 Identities = 19/58 (32%), Positives = 35/58 (60%), Gaps = 1/58 (1%)

Query: 113 EALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK-ELKLEEEKLEAQM 169
           +   +ELK E EKLE+++E  R E   ++ ++RE R  +  +++ E +LEE+K   + 
Sbjct: 442 KRELEELKREIEKLESELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKKRVEE 499



 Score = 33.1 bits (76), Expect = 0.18
 Identities = 35/112 (31%), Positives = 54/112 (48%), Gaps = 12/112 (10%)

Query: 112 EEALQKELKLEEEK------LEAQMEFARYENETEILRERECRFVEE-----ALQKELKL 160
            EAL K  + E  +       E + E   YE   + L E   R  EE        +ELK 
Sbjct: 391 AEALSKVKEEERPREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKR 450

Query: 161 EEEKLEAQMEFARYENETEILRE-QLRQREADRERQKQEWELKERHAEEQRR 211
           E EKLE+++E  R E   ++ ++ ++R R+   ER ++E E K++  EE  R
Sbjct: 451 EIEKLESELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKKRVEELER 502


>gnl|CDD|235551 PRK05667, dnaG, DNA primase; Validated.
          Length = 580

 Score = 32.9 bits (76), Expect = 0.17
 Identities = 20/124 (16%), Positives = 39/124 (31%), Gaps = 11/124 (8%)

Query: 83  IGPRFATIGSFEFEYGTRWKQLYELYKQKEEA----------LQKELKLEEEKLEAQMEF 132
                  +   +FE    ++ L E    +                 L+     LE+   +
Sbjct: 457 AEEVRDALDEEDFEGLPLFRALLEAILAQPGLTTGSQLLEHLRDAGLEELAALLESLAVW 516

Query: 133 ARYENETEILRERECRFVEEALQK-ELKLEEEKLEAQMEFARYENETEILREQLRQREAD 191
                E     E+E +   E L+   L+   E+L A+         +   R +L Q   +
Sbjct: 517 EEISEEDIAALEKELKDALEKLRDQLLEERLEELIAKERLLEGHGLSSEERLELLQLLIE 576

Query: 192 RERQ 195
            +R+
Sbjct: 577 LKRK 580


>gnl|CDD|216108 pfam00769, ERM, Ezrin/radixin/moesin family.  This family of
           proteins contains a band 4.1 domain (pfam00373) at their
           amino terminus. This family represents the rest of these
           proteins.
          Length = 244

 Score = 32.0 bits (73), Expect = 0.24
 Identities = 34/133 (25%), Positives = 63/133 (47%), Gaps = 7/133 (5%)

Query: 106 ELYKQKEEALQKELKLEEEKLE-AQMEFARYEN---ETEILRERECRFVEEALQKELKLE 161
           E  +++++ L++ ++  EE +  AQ E   YE    E E   ++E    +   +K  +LE
Sbjct: 1   EEAEREQQELEERMEQMEEDMRRAQKELEEYEETALELEEKLKQEEEEAQLLEKKADELE 60

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKE---RHAEEQRRCDEEAMR 218
           EE    + E A  E E E L  ++ +  A+  + ++E E KE   R  +++ R  +EA  
Sbjct: 61  EENRRLEEEAAASEEERERLEAEVDEATAEVAKLEEEREKKEAETRQLQQELREAQEAHE 120

Query: 219 RQTEEIHLRMQQQ 231
           R  +E+       
Sbjct: 121 RARQELLEAAAAP 133



 Score = 29.3 bits (66), Expect = 2.1
 Identities = 25/90 (27%), Positives = 37/90 (41%), Gaps = 4/90 (4%)

Query: 151 EEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
           +E  ++  ++EE+   AQ E   YE     L E+L+Q E     + Q  E K    EE+ 
Sbjct: 8   QELEERMEQMEEDMRRAQKELEEYEETALELEEKLKQEE----EEAQLLEKKADELEEEN 63

Query: 211 RCDEEAMRRQTEEIHLRMQQQDEELRRRHQ 240
           R  EE      EE      + DE      +
Sbjct: 64  RRLEEEAAASEEERERLEAEVDEATAEVAK 93


>gnl|CDD|222290 pfam13654, AAA_32, AAA domain.  This family includes a wide variety
           of AAA domains including some that have lost essential
           nucleotide binding residues in the P-loop.
          Length = 509

 Score = 32.4 bits (75), Expect = 0.25
 Identities = 29/121 (23%), Positives = 55/121 (45%), Gaps = 30/121 (24%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQME-FARYEN----------------ETEILRERECR 148
           E Y+ ++E +++E + + E+   ++E  A+ +                 + E L E E  
Sbjct: 111 EEYEARKEEIEEEFQEKREEAFEELEEEAKEKGFALVRTPGGFVFAPLKDGEPLTEEEFE 170

Query: 149 FVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQ-READRERQKQEWELKERHAE 207
            + E  ++EL+ + ++LE            E L+E LRQ RE +RE +++  EL    A 
Sbjct: 171 ALPEEEREELEEKIDELE------------EELQEILRQLRELEREAREKLRELDREVAL 218

Query: 208 E 208
            
Sbjct: 219 F 219


>gnl|CDD|215212 PLN02372, PLN02372, violaxanthin de-epoxidase.
          Length = 455

 Score = 32.5 bits (74), Expect = 0.26
 Identities = 23/84 (27%), Positives = 38/84 (45%), Gaps = 6/84 (7%)

Query: 110 QKEEALQKELKLEEEKLEAQMEFARYENETEILRE------RECRFVEEALQKELKLEEE 163
           + E+ + KE +  EE+LE ++E    E E+   R       +E    EE   KEL  EE+
Sbjct: 372 EGEKTIVKEARQIEEELEKEVEKLGKEEESLFKRVALEEGLKELEQDEENFLKELSKEEK 431

Query: 164 KLEAQMEFARYENETEILREQLRQ 187
           +L  +++    E E    R    +
Sbjct: 432 ELLEKLKMEASEVEKLFGRALPVR 455


>gnl|CDD|235850 PRK06669, fliH, flagellar assembly protein H; Validated.
          Length = 281

 Score = 31.9 bits (73), Expect = 0.26
 Identities = 20/84 (23%), Positives = 37/84 (44%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
           E  +++EE   ++L+ E      ++     E+  EI+   E    EE L+K  +      
Sbjct: 36  ERLREEEEEQVEQLREEANDEAKEIIEEAEEDAFEIVEAAEEEAKEELLKKTDEASSIIE 95

Query: 166 EAQMEFARYENETEILREQLRQRE 189
           + QM+  R + E E   E+L +  
Sbjct: 96  KLQMQIEREQEEWEEELERLIEEA 119


>gnl|CDD|236912 PRK11448, hsdR, type I restriction enzyme EcoKI subunit R;
           Provisional.
          Length = 1123

 Score = 32.6 bits (75), Expect = 0.27
 Identities = 23/96 (23%), Positives = 51/96 (53%), Gaps = 2/96 (2%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
           +    ALQ+E+   +++LE Q    + +++     +++     E L  EL+ ++++LEAQ
Sbjct: 141 ENLLHALQQEVLTLKQQLELQAR-EKAQSQALAEAQQQELVALEGLAAELEEKQQELEAQ 199

Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKER 204
           +E  + E   E  +E+ ++R+   ++  +  EL E 
Sbjct: 200 LEQLQ-EKAAETSQERKQKRKEITDQAAKRLELSEE 234


>gnl|CDD|240376 PTZ00350, PTZ00350, adenylosuccinate synthetase; Provisional.
          Length = 436

 Score = 32.3 bits (74), Expect = 0.30
 Identities = 25/98 (25%), Positives = 47/98 (47%), Gaps = 14/98 (14%)

Query: 81  RSIGPRFATIGSFEFEYGTR------WKQLYELYKQKEEALQKELKLEEEKLEAQMEFAR 134
           R IGP ++T  S     G R      ++   + Y++  E LQ++  +EE   E ++E  R
Sbjct: 144 RGIGPCYSTKAS---RTGLRVGDLLNFETFEKKYRKLVEKLQEQYNIEEYDAEEELE--R 198

Query: 135 YENETEILREREC---RFVEEALQKELKLEEEKLEAQM 169
           Y+   E L++       F+ +A+++  ++  E   A M
Sbjct: 199 YKGYAEKLKDMIVDTVYFMNKAIKEGKRVLVEGANATM 236


>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
           TolA; Provisional.
          Length = 387

 Score = 32.1 bits (73), Expect = 0.31
 Identities = 25/122 (20%), Positives = 55/122 (45%), Gaps = 10/122 (8%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           +Q  EL +++    ++  +LE+E+L AQ          E  ++ E    + AL+++   E
Sbjct: 87  QQAEELQQKQAAEQERLKQLEKERLAAQ----------EQKKQAEEAAKQAALKQKQAEE 136

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
                A    A+ E E +      ++  A+ +++ +    K+  AE +++ + EA  +  
Sbjct: 137 AAAKAAAAAKAKAEAEAKRAAAAAKKAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAA 196

Query: 222 EE 223
            E
Sbjct: 197 AE 198



 Score = 29.4 bits (66), Expect = 2.3
 Identities = 17/83 (20%), Positives = 40/83 (48%), Gaps = 3/83 (3%)

Query: 153 ALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRC 212
           A ++  K E+++ E   +  +   E E L++  ++R A +E++KQ  E   + A  +++ 
Sbjct: 77  AEEQRKKKEQQQAEELQQ--KQAAEQERLKQLEKERLAAQEQKKQA-EEAAKQAALKQKQ 133

Query: 213 DEEAMRRQTEEIHLRMQQQDEEL 235
            EEA  +       + + + +  
Sbjct: 134 AEEAAAKAAAAAKAKAEAEAKRA 156



 Score = 27.5 bits (61), Expect = 8.9
 Identities = 14/60 (23%), Positives = 28/60 (46%), Gaps = 7/60 (11%)

Query: 182 REQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
            E+ R+++     Q+Q  EL+++ A EQ R  +    R   +     ++Q EE  ++   
Sbjct: 77  AEEQRKKKE----QQQAEELQQKQAAEQERLKQLEKERLAAQ---EQKKQAEEAAKQAAL 129


>gnl|CDD|118696 pfam10168, Nup88, Nuclear pore component.  Nup88 can be divided
           into two structural domains; the N-terminal two-thirds
           of the protein has no obvious structural motifs but is
           the region for binding to Nup98, one of the components
           of the nuclear pore. The C-terminal end is a predicted
           coiled-coil domain. Nup88 is overexpressed in tumour
           cells.
          Length = 717

 Score = 32.2 bits (73), Expect = 0.31
 Identities = 27/147 (18%), Positives = 55/147 (37%), Gaps = 29/147 (19%)

Query: 113 EALQKELKLEEEKLEAQMEFARY-ENETEILRER---------ECRFVEEALQKELKLEE 162
           E  Q+ +KL + + E Q+E  +    E + L ER         E ++ +E L    K   
Sbjct: 561 EEFQRRVKLLQLQKEKQLEDIQDCREERKSLSERAEKLAEKFEEAKYNQELLVNRCKRLL 620

Query: 163 EKLEAQMEFA-----RYENETEILREQLRQREADRERQKQEWELKERH------------ 205
           +   +Q+            E + + +QL+      ++ K++   +  H            
Sbjct: 621 QSANSQLPVLSDSERDMSKELQRINKQLQHLANGIKQVKKKKNYQRYHMASQESPKKSSY 680

Query: 206 --AEEQRRCDEEAMRRQTEEIHLRMQQ 230
              E+Q +   E ++   E I   ++Q
Sbjct: 681 TLPEKQHKTITEILKELGEHIDRMIKQ 707



 Score = 27.9 bits (62), Expect = 7.7
 Identities = 24/119 (20%), Positives = 47/119 (39%), Gaps = 26/119 (21%)

Query: 139 TEILRERECR---FVEEALQKELKLEEEKLEAQMEFARY-ENETEILREQLRQREADRER 194
           T++ RE+         E  Q+ +KL + + E Q+E  +    E + L E+     A++  
Sbjct: 545 TQVFREQYLLKHDLAREEFQRRVKLLQLQKEKQLEDIQDCREERKSLSER-----AEKLA 599

Query: 195 QKQEWELKERHAEEQRRCD----------------EEAMRRQTEEIHLRMQQQDEELRR 237
           +K E E K        RC                 E  M ++ + I+ ++Q     +++
Sbjct: 600 EKFE-EAKYNQELLVNRCKRLLQSANSQLPVLSDSERDMSKELQRINKQLQHLANGIKQ 657


>gnl|CDD|224188 COG1269, NtpI, Archaeal/vacuolar-type H+-ATPase subunit I [Energy
           production and conversion].
          Length = 660

 Score = 31.9 bits (73), Expect = 0.34
 Identities = 29/164 (17%), Positives = 50/164 (30%), Gaps = 30/164 (18%)

Query: 90  IGSFEFEYGTRWKQLYELYKQKEEA---LQKELKLEEEKLEAQMEFARYENETEILRERE 146
           +            ++ EL ++ EE    L +EL+  E+ LE     A  + +  +LR  +
Sbjct: 97  LEEVIKPAEKFSSEVEELTRKLEERLSELDEELEDLEDLLEELEPLAYLDFDLSLLRGLK 156

Query: 147 CRFVEEALQKELKLEE--EKLEAQMEFARYENETE-----------------ILREQLRQ 187
              V   L +  KLE     +E ++       E                   IL E   +
Sbjct: 157 FLLVRLGLVRREKLEALVGVIEDEVALYGENVEASVVIVVAHGAEDLDKVSKILNELGFE 216

Query: 188 R----EADRERQKQ----EWELKERHAEEQRRCDEEAMRRQTEE 223
                E D    +     E  + E   E +    E     +   
Sbjct: 217 LYEVPEFDGGPSELISELEEVIAEIQDELESLRSELEALAEKIA 260


>gnl|CDD|240813 cd12367, RRM2_RBM45, RNA recognition motif 2 in RNA-binding
          protein 45 (RBM45) and similar proteins.  This
          subfamily corresponds to the RRM2 of RBM45, also termed
          developmentally-regulated RNA-binding protein 1 (DRB1),
          a new member of RNA recognition motif (RRM)-type neural
          RNA-binding proteins, which is expressed under
          spatiotemporal control. It is encoded by gene drb1 that
          is expressed in neurons, not in glial cells. RBM45
          predominantly localizes in cytoplasm of cultured cells
          and specifically binds to poly(C) RNA. It could play an
          important role during neurogenesis. RBM45 carries four
          RRMs, also known as RBDs (RNA binding domains) or RNPs
          (ribonucleoprotein domains).
          Length = 74

 Score = 29.3 bits (66), Expect = 0.38
 Identities = 15/42 (35%), Positives = 21/42 (50%), Gaps = 7/42 (16%)

Query: 14 GNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEP 55
          G SK  G ++F +   AA AL+ C        +S K V+ EP
Sbjct: 39 GESKGFGYVKFHKPSQAAVALENCD-------KSFKAVLAEP 73


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins is involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 31.6 bits (72), Expect = 0.40
 Identities = 25/141 (17%), Positives = 59/141 (41%), Gaps = 9/141 (6%)

Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEA---LQKE 157
            ++L +  ++ E+ ++ E K +E +   + +  +   E   L E     +EE+    ++E
Sbjct: 195 HQELLQ--EEYEKEVKAEKKRQELERVEEKKLEKMAPEASRLDEMSEGLLEESDDDGEEE 252

Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAM 217
              E      + E+          R+   QR  ++ R++ E E KE    +++       
Sbjct: 253 SDDESAWEGFESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQLKKKLAQLA-- 310

Query: 218 RRQTEEIHLRMQQQDEELRRR 238
             + +EI   + Q+++   R+
Sbjct: 311 --RLKEIAKEVAQKEKARARK 329


>gnl|CDD|237478 PRK13709, PRK13709, conjugal transfer nickase/helicase TraI;
            Provisional.
          Length = 1747

 Score = 32.1 bits (73), Expect = 0.42
 Identities = 31/143 (21%), Positives = 49/143 (34%), Gaps = 19/143 (13%)

Query: 98   GTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKE 157
            G  W  + +   Q          +  E L      A+ + E  I RE E R  +E ++K 
Sbjct: 1618 GRVWGDIPDNSVQPGAG--NGEPVTAEVL------AQRQAEEAIRRETERR-ADEIVRKM 1668

Query: 158  LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQ-KQEWELKERHAEEQRRCDEEA 216
                     A+ +    + +TE     +  +E DR    ++E  L E    E +R  E  
Sbjct: 1669 ---------AENKPDLPDGKTEQAVRDIAGQERDRAAISEREAALPESVLREPQREREAV 1719

Query: 217  MRRQTEEIHLRMQQQDEELRRRH 239
                 E +     QQ E    R 
Sbjct: 1720 REVARENLLRERLQQMERDMVRD 1742


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
           membrane complex of itself with TolQ and TolR to the
           outer membrane complex of TolB and OprL (also called
           Pal). Most of the length of the protein consists of
           low-complexity sequence that may differ in both length
           and composition from one species to another,
           complicating efforts to discriminate TolA (the most
           divergent gene in the tol-pal system) from paralogs such
           as TonB. Selection of members of the seed alignment and
           criteria for setting scoring cutoffs are based largely on
           conserved operon structure. The Tol-Pal complex is
           required for maintaining outer membrane integrity. Also
           involved in transport (uptake) of colicins and
           filamentous DNA, and implicated in pathogenesis.
           Transport is energized by the proton motive force. TolA
           is an inner membrane protein that interacts with
           periplasmic TolB and with outer membrane porins ompC,
           phoE and lamB [Transport and binding proteins, Other,
           Cellular processes, Pathogenesis].
          Length = 346

 Score = 31.7 bits (72), Expect = 0.42
 Identities = 17/85 (20%), Positives = 36/85 (42%), Gaps = 3/85 (3%)

Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEW 199
           E  R+++     E  +K+   E+ + +   + A  E      ++  +  +   E+QKQ  
Sbjct: 66  EQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAA---KQAEQAAKQAEEKQKQAE 122

Query: 200 ELKERHAEEQRRCDEEAMRRQTEEI 224
           E K + A E +   E    ++ +E 
Sbjct: 123 EAKAKQAAEAKAKAEAEAEKKAKEE 147



 Score = 28.3 bits (63), Expect = 5.2
 Identities = 20/114 (17%), Positives = 48/114 (42%), Gaps = 2/114 (1%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
           K+     ++E + + E+   + E  R   +       +    E+A ++  +  ++  E Q
Sbjct: 59  KKPAAKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQ 118

Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTE 222
            +    E + +   E   + EA+ E++ +E   K+   E + +   EA ++  E
Sbjct: 119 KQAE--EAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAE 170


>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207).  This
           family is found in eukaryotes; it has several conserved
           tryptophan residues. The function is not known.
          Length = 261

 Score = 31.2 bits (71), Expect = 0.43
 Identities = 26/126 (20%), Positives = 62/126 (49%), Gaps = 3/126 (2%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
            Q+++ LQK L  E++K E + E    E    + +E+   +  +  Q+  K    K + +
Sbjct: 98  AQRQKKLQKLL-EEKQKQEREKEREEAELRQRLAKEKYEEWCRQKAQQAAKQRTPKHKKE 156

Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE--EAMRRQTEEIHL 226
              +   + +   + +    + + +++ QEWELK+   ++Q+R +E  +  ++Q EE   
Sbjct: 157 AAESASSSLSGSAKPERNVSQEEAKKRLQEWELKKLKQQQQKREEERRKQRKKQQEEEER 216

Query: 227 RMQQQD 232
           + + ++
Sbjct: 217 KQKAEE 222



 Score = 27.4 bits (61), Expect = 7.9
 Identities = 20/129 (15%), Positives = 56/129 (43%), Gaps = 3/129 (2%)

Query: 116 QKELKLEEEKLEA-QMEFARYENE--TEILRERECRFVEEALQKELKLEEEKLEAQMEFA 172
            KE+KLE +  EA +   +  + +   ++ +  E +  +E  ++  + E  +  A+ ++ 
Sbjct: 76  LKEVKLERQAQEAYENWLSAKQAQRQKKLQKLLEEKQKQEREKEREEAELRQRLAKEKYE 135

Query: 173 RYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQD 232
            +  +      + R  +  +E  +         A+ +R   +E  +++ +E  L+  +Q 
Sbjct: 136 EWCRQKAQQAAKQRTPKHKKEAAESASSSLSGSAKPERNVSQEEAKKRLQEWELKKLKQQ 195

Query: 233 EELRRRHQE 241
           ++ R   + 
Sbjct: 196 QQKREEERR 204


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 31.5 bits (71), Expect = 0.45
 Identities = 33/130 (25%), Positives = 50/130 (38%), Gaps = 1/130 (0%)

Query: 108 YKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEA 167
            +QK +    E   +EEK     E  + +  +      E     +    E        E 
Sbjct: 135 SEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGE-FMTHKLKHTENTFSRGGAEG 193

Query: 168 QMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
               A  E E    ++Q    E +  ++K+E   K    EEQRR  EEA R+  EE   R
Sbjct: 194 AQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADRKSREEEEKR 253

Query: 228 MQQQDEELRR 237
             +++ E RR
Sbjct: 254 RLKEEIERRR 263


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
           found at the N terminus of SMC proteins. The SMC
           (structural maintenance of chromosomes) superfamily
           proteins have ATP-binding domains at the N- and
           C-termini, and two extended coiled-coil domains
           separated by a hinge in the middle. The eukaryotic SMC
           proteins form two kinds of heterodimers: the SMC1/SMC3
           and the SMC2/SMC4 types. These heterodimers constitute
           an essential part of higher order complexes, which are
           involved in chromatin and DNA dynamics. This family also
           includes the RecF and RecN proteins that are involved in
           DNA metabolism and recombination.
          Length = 1162

 Score = 31.9 bits (72), Expect = 0.46
 Identities = 27/129 (20%), Positives = 53/129 (41%), Gaps = 10/129 (7%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
           ++ +E+ ++  KLE+E  + + E    E E + L  +     EE  Q E   E+      
Sbjct: 315 EKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEKL----- 369

Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRM 228
                 + E E+L ++  + E      K + E  E   EE++         + EE  L+ 
Sbjct: 370 -----EQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLLLELSEQEEDLLKE 424

Query: 229 QQQDEELRR 237
           ++++E    
Sbjct: 425 EKKEELKIV 433



 Score = 30.7 bits (69), Expect = 1.0
 Identities = 28/131 (21%), Positives = 57/131 (43%), Gaps = 1/131 (0%)

Query: 105 YELYKQKEEALQKELKLEEEKLEAQMEFAR-YENETEILRERECRFVEEALQKELKLEEE 163
               ++K+E L+K ++  E   E  ++       E ++  + +       L+++L+LEEE
Sbjct: 166 SREKRKKKERLKKLIEETENLAELIIDLEELKLQELKLKEQAKKALEYYQLKEKLELEEE 225

Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
            L         E   ++L+E LR  + + E  KQE E +E    +  + ++E  + +  +
Sbjct: 226 NLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQ 285

Query: 224 IHLRMQQQDEE 234
                    EE
Sbjct: 286 EEELKLLAKEE 296



 Score = 28.8 bits (64), Expect = 4.0
 Identities = 24/135 (17%), Positives = 54/135 (40%), Gaps = 7/135 (5%)

Query: 116 QKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE 175
           ++ L++EEE   ++ +  + E   +++ E E         +ELKL+E KL+ Q       
Sbjct: 154 ERRLEIEEEAAGSREKRKKKERLKKLIEETENLAELIIDLEELKLQELKLKEQ------A 207

Query: 176 NETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR-MQQQDEE 234
            +     +   + E + E       LK          +     ++  E   + +++++E 
Sbjct: 208 KKALEYYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEI 267

Query: 235 LRRRHQENSIFMQVI 249
           L +  +EN    +  
Sbjct: 268 LAQVLKENKEEEKEK 282



 Score = 28.4 bits (63), Expect = 4.8
 Identities = 28/132 (21%), Positives = 56/132 (42%), Gaps = 13/132 (9%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
           EL K+ +E   K    EEE+ + +              + +   +EE L  + KLE E+L
Sbjct: 340 ELEKELKELEIKREAEEEEEEQLE------------KLQEKLEQLEEELLAKKKLESERL 387

Query: 166 EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIH 225
            +  +  + E       E+   +      +++E  LKE   EE +  +E     +T++  
Sbjct: 388 SSAAK-LKEEELELKNEEEKEAKLLLELSEQEEDLLKEEKKEELKIVEELEESLETKQGK 446

Query: 226 LRMQQQDEELRR 237
           L  ++++ E + 
Sbjct: 447 LTEEKEELEKQA 458



 Score = 28.0 bits (62), Expect = 7.4
 Identities = 26/149 (17%), Positives = 66/149 (44%), Gaps = 3/149 (2%)

Query: 96  EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ 155
           +Y    ++  +L ++     Q+E++  +++LE + E      +     E+E +  EE   
Sbjct: 231 DYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQEE--- 287

Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEE 215
           +   L +E+ E + E  + E       E+L++ E + ++ ++E + ++   EE  +  +E
Sbjct: 288 ELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKE 347

Query: 216 AMRRQTEEIHLRMQQQDEELRRRHQENSI 244
              ++  E     Q +  + +    E  +
Sbjct: 348 LEIKREAEEEEEEQLEKLQEKLEQLEEEL 376


>gnl|CDD|217902 pfam04111, APG6, Autophagy protein Apg6.  In yeast, 15 Apg proteins
           coordinate the formation of autophagosomes. Autophagy is
           a bulk degradation process induced by starvation in
           eukaryotic cells. Apg6/Vps30p has two distinct functions
           in the autophagic process, either associated with the
           membrane or in a retrieval step of the carboxypeptidase
           Y sorting pathway.
          Length = 356

 Score = 31.4 bits (71), Expect = 0.50
 Identities = 21/84 (25%), Positives = 39/84 (46%), Gaps = 6/84 (7%)

Query: 117 KELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYEN 176
            ELK EEE+L  ++E    E E + L       + E  +++ +LE E+L+   E+  ++ 
Sbjct: 73  DELKKEEERLLDELE--ELEKEDDDLDGE----LVELQEEKEQLENEELQYLREYNLFDR 126

Query: 177 ETEILREQLRQREADRERQKQEWE 200
               L + L+  E   E    + +
Sbjct: 127 NNLQLEDNLQSLELQYEYSLNQLD 150


>gnl|CDD|234342 TIGR03752, conj_TIGR03752, integrating conjugative element protein,
           PFL_4705 family.  Members of this protein family are
           found occasionally on plasmids such as the Pseudomonas
           putida toluene catabolic TOL plasmid pWWO_p085. Usually,
           however, they are found on the bacterial main chromosome
           in regions flanked by markers of conjugative transfer
           and/or transposition [Mobile and extrachromosomal
           element functions, Plasmid functions].
          Length = 472

 Score = 31.5 bits (72), Expect = 0.51
 Identities = 21/83 (25%), Positives = 38/83 (45%), Gaps = 13/83 (15%)

Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEE 215
           KEL+    KL ++ E  + ENE       L++RE   ++Q Q     +    E +   +E
Sbjct: 69  KELRKRLAKLISENEALKAENER------LQKREQSIDQQIQ-----QAVQSETQELTKE 117

Query: 216 AMRRQTEEIHLRMQQQDEELRRR 238
               Q +    ++Q   ++L+RR
Sbjct: 118 --IEQLKSERQQLQGLIDQLQRR 138


>gnl|CDD|224340 COG1422, COG1422, Predicted membrane protein [Function unknown].
          Length = 201

 Score = 30.8 bits (70), Expect = 0.55
 Identities = 15/56 (26%), Positives = 27/56 (48%), Gaps = 1/56 (1%)

Query: 180 ILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEEL 235
           IL++ L  +E  +E QK   E ++   E Q   D + +++  +E  + M     EL
Sbjct: 63  ILQKLLIDQEKMKELQKMMKEFQKEFREAQESGDMKKLKK-LQEKQMEMMDDQREL 117


>gnl|CDD|179699 PRK03992, PRK03992, proteasome-activating nucleotidase;
           Provisional.
          Length = 389

 Score = 31.3 bits (72), Expect = 0.56
 Identities = 15/51 (29%), Positives = 23/51 (45%), Gaps = 4/51 (7%)

Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILR---EQLRQ 187
           E L   E R  E   Q   +LE +  + + E  + E E E L+   E+L+ 
Sbjct: 1   ERLEALEERNSELEEQIR-QLELKLRDLEAENEKLERELERLKSELEKLKS 50


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc.); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 31.6 bits (72), Expect = 0.57
 Identities = 38/157 (24%), Positives = 65/157 (41%), Gaps = 12/157 (7%)

Query: 94  EFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEA 153
           E+E     K+   L +QKE   ++   LEEE    ++     E E  +  E E    E  
Sbjct: 222 EYEGYELLKEKEALERQKEAIERQLASLEEEL--EKLTEEISELEKRL-EEIEQLLEELN 278

Query: 154 LQKELKLEEEKLEAQMEFARYENETEILREQL-----RQREADRERQKQEWEL---KERH 205
            + +   EEE+L  + +    E E   L   +        +A+    K E E+       
Sbjct: 279 KKIKDLGEEEQLRVKEKIGELEAEIASLERSIAEKERELEDAEERLAKLEAEIDKLLAEI 338

Query: 206 AEEQRRCDEEAMRR-QTEEIHLRMQQQDEELRRRHQE 241
            E +R  +EE  RR +  E +  ++++ E+LR   +E
Sbjct: 339 EELEREIEEERKRRDKLTEEYAELKEELEDLRAELEE 375



 Score = 31.2 bits (71), Expect = 0.72
 Identities = 41/142 (28%), Positives = 65/142 (45%), Gaps = 12/142 (8%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEE--- 162
            L K+K E    EL  E+E LE Q E    E +   L E   +  EE  + E +LEE   
Sbjct: 215 ALLKEKREYEGYELLKEKEALERQKE--AIERQLASLEEELEKLTEEISELEKRLEEIEQ 272

Query: 163 --EKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERH---AEEQRRCDEEAM 217
             E+L  +++    E E   ++E++ + EA+    ++    KER    AEE+    E  +
Sbjct: 273 LLEELNKKIK-DLGEEEQLRVKEKIGELEAEIASLERSIAEKERELEDAEERLAKLEAEI 331

Query: 218 RRQTEEI-HLRMQQQDEELRRR 238
            +   EI  L  + ++E  RR 
Sbjct: 332 DKLLAEIEELEREIEEERKRRD 353



 Score = 30.0 bits (68), Expect = 1.6
 Identities = 31/126 (24%), Positives = 56/126 (44%), Gaps = 5/126 (3%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVE--EALQKELK 159
           ++L +L ++  E  ++  +L+EE      E A        +  +     E  E    E+K
Sbjct: 392 EKLEKLKREINELKRELDRLQEELQRLSEELADLNAAIAGIEAKINELEEEKEDKALEIK 451

Query: 160 LEEEKLE-AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
            +E KLE    + ++YE E   L+E+  + E  +E  K + EL E  A+ +   +     
Sbjct: 452 KQEWKLEQLAADLSKYEQELYDLKEEYDRVE--KELSKLQRELAEAEAQARASEERVRGG 509

Query: 219 RQTEEI 224
           R  EE+
Sbjct: 510 RAVEEV 515



 Score = 28.9 bits (65), Expect = 4.0
 Identities = 31/146 (21%), Positives = 58/146 (39%), Gaps = 12/146 (8%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL- 160
           ++L  L  +      +  +L +E  +A  +    E E E L + E +  E   + E  L 
Sbjct: 688 RELSSLQSELRRIENRLDELSQELSDASRKIGEIEKEIEQLEQEEEKLKERLEELEEDLS 747

Query: 161 --EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
             E+E    + E    E   E L E L + E          +L+ R +  +    +  + 
Sbjct: 748 SLEQEIENVKSELKELEARIEELEEDLHKLEEALN------DLEARLSHSRIPEIQAELS 801

Query: 219 RQTEE---IHLRMQQQDEELRRRHQE 241
           +  EE   I  R+++ +++L R   E
Sbjct: 802 KLEEEVSRIEARLREIEQKLNRLTLE 827



 Score = 27.7 bits (62), Expect = 8.9
 Identities = 28/133 (21%), Positives = 60/133 (45%), Gaps = 5/133 (3%)

Query: 106 ELYKQKEEALQKELKLEEEKLEA-QMEFARYENETEILRERECRFVEEALQKELKLEE-E 163
           E  +Q+EE L++ L+  EE L + + E    ++E + L  R     E+  + E  L + E
Sbjct: 726 EQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKELEARIEELEEDLHKLEEALNDLE 785

Query: 164 KLEAQMEFARYENETEILREQLRQREA---DRERQKQEWELKERHAEEQRRCDEEAMRRQ 220
              +       + E   L E++ + EA   + E++     L++ + E++ +  +E     
Sbjct: 786 ARLSHSRIPEIQAELSKLEEEVSRIEARLREIEQKLNRLTLEKEYLEKEIQELQEQRIDL 845

Query: 221 TEEIHLRMQQQDE 233
            E+I    ++ + 
Sbjct: 846 KEQIKSIEKEIEN 858


>gnl|CDD|202833 pfam03962, Mnd1, Mnd1 family.  This family of proteins includes
           MND1 from S. cerevisiae. The mnd1 protein forms a
           complex with hop2 to promote homologous chromosome
           pairing and meiotic double-strand break repair.
          Length = 188

 Score = 30.7 bits (70), Expect = 0.58
 Identities = 13/60 (21%), Positives = 32/60 (53%), Gaps = 1/60 (1%)

Query: 183 EQLRQREADRER-QKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
           + L + +   E+ +K+  ELK+R AE Q + ++    R+  E    + ++ ++L +  ++
Sbjct: 62  QALNKLKTRLEKLKKELEELKQRIAELQAQIEKLKKGREETEERTELLEELKQLEKELKK 121


>gnl|CDD|223447 COG0370, FeoB, Fe2+ transport system protein B [Inorganic ion
           transport and metabolism].
          Length = 653

 Score = 31.1 bits (71), Expect = 0.61
 Identities = 44/175 (25%), Positives = 72/175 (41%), Gaps = 34/175 (19%)

Query: 41  VFFLTQSL---KPVIVEPLDLVD---------DEEGLSER-------TVSKK--SSDYFK 79
           ++   Q L    P+I+  L+++D         D E LS+        TV+K+    +  K
Sbjct: 98  LYLTLQLLELGIPMILA-LNMIDEAKKRGIRIDIEKLSKLLGVPVVPTVAKRGEGLEELK 156

Query: 80  QRSIGPRFATIGSFEFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENET 139
           +  I    +     E +YG    +  E   ++ EAL ++ +    KL    E        
Sbjct: 157 RAIIELAESKTTPREVDYG----EEIEEEIKELEALSEDPRWLAIKLLEDDELVE----- 207

Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRER 194
            +L+E E R   E L +EL  EEE     +  ARY     ILR  ++Q E ++  
Sbjct: 208 AVLKEPEKRV--EELLEELS-EEEGHLLLIADARYALIERILRSVVKQEEEEKSS 259


>gnl|CDD|202427 pfam02841, GBP_C, Guanylate-binding protein, C-terminal domain.
           Transcription of the anti-viral guanylate-binding
           protein (GBP) is induced by interferon-gamma during
           macrophage induction. This family contains GBP1 and
           GBP2, both GTPases capable of binding GTP, GDP and GMP.
          Length = 297

 Score = 31.1 bits (71), Expect = 0.63
 Identities = 32/131 (24%), Positives = 56/131 (42%), Gaps = 13/131 (9%)

Query: 109 KQKEEALQKELKLEEEKLEA--QMEFARYENETEILRERECRFVEEALQKELKLEEEKLE 166
            + EE LQ+ L  +E   EA  Q + A    E  I  ER      EA + E +L  EK +
Sbjct: 172 VKAEEVLQEFLNSKEAVEEAILQTDQALTAKEKAIEAERA---KAEAAEAEQELLREKQK 228

Query: 167 AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHL 226
            + +    E +    +E ++Q     E ++++         EQ R  E  ++ Q E +  
Sbjct: 229 EEEQ--MMEAQERSYQEHVKQLIEKMEAEREKLL------AEQERMLEHKLQEQEELLKE 280

Query: 227 RMQQQDEELRR 237
             + + E L++
Sbjct: 281 GFKTEAESLQK 291



 Score = 28.8 bits (65), Expect = 3.5
 Identities = 30/145 (20%), Positives = 70/145 (48%), Gaps = 6/145 (4%)

Query: 105 YELYKQKEEALQKELKLEEEK-LEAQMEFARYENETEILRERECRFVEEALQKELKLEEE 163
           Y+L+ ++ + L+ +      K ++A+     + N  E + E   +  +    KE  +E E
Sbjct: 150 YKLFLEERDKLEAKYNQVPRKGVKAEEVLQEFLNSKEAVEEAILQTDQALTAKEKAIEAE 209

Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
           + +A+      E E E+LRE+ ++ E   E Q++ ++   +   E+   + E +  + E 
Sbjct: 210 RAKAEAA----EAEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKLLAEQER 265

Query: 224 -IHLRMQQQDEELRRRHQENSIFMQ 247
            +  ++Q+Q+E L+   +  +  +Q
Sbjct: 266 MLEHKLQEQEELLKEGFKTEAESLQ 290


>gnl|CDD|223024 PHA03252, PHA03252, DNA packaging tegument protein UL25;
           Provisional.
          Length = 589

 Score = 31.2 bits (71), Expect = 0.71
 Identities = 13/35 (37%), Positives = 20/35 (57%)

Query: 203 ERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRR 237
           E H     RC EE +RR  ++  LR+++  E+L R
Sbjct: 17  EGHVRNILRCPEEDLRRLRDDSALRLRRYREDLLR 51


>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein.  Function of MutS2 is
           unknown. It should not be considered a DNA mismatch
           repair protein. It is likely a DNA mismatch binding
           protein of unknown cellular function [DNA metabolism,
           Other].
          Length = 771

 Score = 30.9 bits (70), Expect = 0.76
 Identities = 27/103 (26%), Positives = 46/103 (44%), Gaps = 10/103 (9%)

Query: 107 LYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLE 166
           L   ++E  QK   LE+   E +      E E E L+ERE         +  KLE EK E
Sbjct: 520 LSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERE---------RNKKLELEK-E 569

Query: 167 AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQ 209
           AQ      + E E +  +L++++  + ++ +  E   +  E +
Sbjct: 570 AQEALKALKKEVESIIRELKEKKIHKAKEIKSIEDLVKLKETK 612


>gnl|CDD|223868 COG0797, RlpA, Lipoproteins [Cell envelope biogenesis, outer
           membrane].
          Length = 233

 Score = 30.5 bits (69), Expect = 0.77
 Identities = 16/51 (31%), Positives = 24/51 (47%), Gaps = 9/51 (17%)

Query: 7   VVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
           VV ++DRG   +  II+ ++  AAA  L   + GV         V +E L 
Sbjct: 135 VVRINDRGPFVSGRIIDLSK--AAADKLGMIRSGVA-------KVRIEVLG 176


>gnl|CDD|221371 pfam12004, DUF3498, Domain of unknown function (DUF3498).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 433 and 538 amino acids in length. This domain is
           found associated with pfam00616 and pfam00168. This domain
           has two conserved sequence motifs: DLQ and PLSFQNP.
          Length = 489

 Score = 30.8 bits (69), Expect = 0.81
 Identities = 28/105 (26%), Positives = 45/105 (42%), Gaps = 22/105 (20%)

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
           EE  E   +  +YE E   L+E+LR   + R  ++ E  L  +  + Q+   E   R + 
Sbjct: 356 EENREEGTQAEKYEQEIARLKERLRV--SVRRLEEYERRLLGQEQQMQKLLQEYQARLED 413

Query: 222 EEIHLRMQQQD----------------EELRRRHQENSIFMQVIV 250
            E  LR QQ++                EEL++ H +    MQ +V
Sbjct: 414 SEERLRRQQEEKDSQMKSIISRLMAVEEELKKDHAD----MQAVV 454



 Score = 30.8 bits (69), Expect = 0.86
 Identities = 26/96 (27%), Positives = 45/96 (46%), Gaps = 9/96 (9%)

Query: 123 EEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILR 182
           EE  E   +  +YE E   L+ER    V    + E +L  ++ + Q     Y+   E   
Sbjct: 356 EENREEGTQAEKYEQEIARLKERLRVSVRRLEEYERRLLGQEQQMQKLLQEYQARLEDSE 415

Query: 183 EQLRQREADRERQKQ---------EWELKERHAEEQ 209
           E+LR+++ +++ Q +         E ELK+ HA+ Q
Sbjct: 416 ERLRRQQEEKDSQMKSIISRLMAVEEELKKDHADMQ 451


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
           splicing factor.  These splicing factors consist of an
           N-terminal arginine-rich low complexity domain followed
           by three tandem RNA recognition motifs (pfam00076). The
           well-characterized members of this family are auxiliary
           components of the U2 small nuclear ribonucleoprotein
           splicing factor (U2AF). These proteins are closely
           related to the CC1-like subfamily of splicing factors
           (TIGR01622). Members of this subfamily are found in
           plants, metazoa and fungi.
          Length = 509

 Score = 30.6 bits (69), Expect = 0.85
 Identities = 14/75 (18%), Positives = 26/75 (34%), Gaps = 2/75 (2%)

Query: 170 EFARYENETEILREQLRQREADRERQKQEWELKERHAE--EQRRCDEEAMRRQTEEIHLR 227
           E    E E    R++ R  E  R R +     ++RH    E+   ++   R +       
Sbjct: 3   EEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRS 62

Query: 228 MQQQDEELRRRHQEN 242
            +       RR ++ 
Sbjct: 63  PRSLRYSSVRRSRDR 77



 Score = 27.9 bits (62), Expect = 5.8
 Identities = 14/62 (22%), Positives = 25/62 (40%), Gaps = 2/62 (3%)

Query: 182 REQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
           R++   RE ++ R +      ER    +R  D    R +      R  ++D   R R + 
Sbjct: 1   RDEEPDREREKSRGRDRDRSSERP--RRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRY 58

Query: 242 NS 243
           +S
Sbjct: 59  DS 60


>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
           Mitofilin controls mitochondrial cristae morphology.
           Mitofilin is enriched in the narrow space between the
           inner boundary and the outer membranes, where it forms a
           homotypic interaction and assembles into a large
           multimeric protein complex. The first 78 amino acids
           contain a typical amino-terminal-cleavable mitochondrial
           presequence rich in positive-charged and hydroxylated
           residues and a membrane anchor domain. In addition, it
           has three centrally located coiled coil domains.
          Length = 493

 Score = 30.8 bits (70), Expect = 0.93
 Identities = 39/130 (30%), Positives = 67/130 (51%), Gaps = 15/130 (11%)

Query: 88  ATIGSFEFEYGTRWKQLYELYKQKEEALQKELK--------LEEEKLEAQMEFARYENET 139
           + I S + E     K+L EL  ++EE L++ LK          EE+L A++E      E 
Sbjct: 163 SLIASAKEELDQLSKKLAELKAEEEEELERALKEKREELLSKLEEELLARLESKEAALEK 222

Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEW 199
           ++  E E R  EE L+K+    EEKL  ++E     +E + L+ +L  +  + +R+  + 
Sbjct: 223 QLRLEFE-REKEE-LRKKY---EEKLRQELERQAEAHE-QKLKNELALQAIELQREFNK- 275

Query: 200 ELKERHAEEQ 209
           E+KE+  EE+
Sbjct: 276 EIKEKVEEER 285



 Score = 30.0 bits (68), Expect = 1.4
 Identities = 34/122 (27%), Positives = 61/122 (50%), Gaps = 7/122 (5%)

Query: 96  EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF--VEEA 153
           +  +      E   Q  + L +    EEE+LE  ++  R E  +++  E   R    E A
Sbjct: 160 DLESLIASAKEELDQLSKKLAELKAEEEEELERALKEKREELLSKLEEELLARLESKEAA 219

Query: 154 LQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCD 213
           L+K+L+LE E+ + ++   +YE +   LR++L ++    E QK + EL  +  E QR  +
Sbjct: 220 LEKQLRLEFEREKEELR-KKYEEK---LRQELERQAEAHE-QKLKNELALQAIELQREFN 274

Query: 214 EE 215
           +E
Sbjct: 275 KE 276



 Score = 28.1 bits (63), Expect = 5.8
 Identities = 30/111 (27%), Positives = 52/111 (46%), Gaps = 11/111 (9%)

Query: 111 KEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQME 170
           KEE  Q   KL E K E + E  R       L+E+    + +  ++ L   E K  A  +
Sbjct: 169 KEELDQLSKKLAELKAEEEEELER------ALKEKREELLSKLEEELLARLESKEAALEK 222

Query: 171 FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
             +   E E  +E+LR++  ++ RQ+ E   ++  A EQ+  +E A++   
Sbjct: 223 --QLRLEFEREKEELRKKYEEKLRQELE---RQAEAHEQKLKNELALQAIE 268


>gnl|CDD|227512 COG5185, HEC1, Protein involved in chromosome segregation,
           interacts with SMC proteins [Cell division and
           chromosome partitioning].
          Length = 622

 Score = 30.7 bits (69), Expect = 0.96
 Identities = 28/142 (19%), Positives = 65/142 (45%), Gaps = 15/142 (10%)

Query: 95  FEYGTRWKQLYELYKQKEEALQKELKLEEEK----LEAQMEFARYENETEILRERECRFV 150
           F+Y T   + +   +   E  ++ELKL  EK    +   +   + +N+    + +E   +
Sbjct: 234 FDYFTESYKSFLKLEDNYEPSEQELKLGFEKFVHIINTDIANLKTQNDNLYEKIQEAMKI 293

Query: 151 EEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
            + ++    L E+    + +  +YEN    ++++ ++     E+ K E ELKE       
Sbjct: 294 SQKIKT---LREKWRALKSDSNKYENYVNAMKQKSQEWPGKLEKLKSEIELKEEEI---- 346

Query: 211 RCDEEAMRRQTEEIHLRMQQQD 232
               +A++   +E+H ++++Q 
Sbjct: 347 ----KALQSNIDELHKQLRKQG 364


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 30.8 bits (70), Expect = 0.97
 Identities = 34/105 (32%), Positives = 54/105 (51%), Gaps = 8/105 (7%)

Query: 109 KQKEEALQKELKLEEEKL-EAQMEFARYENETEILRER----ECRFVEEALQKELKLEEE 163
           +++ E  +KELK  EE+L +A  E A  E   E LR+     E ++ EE  ++   L EE
Sbjct: 611 EKELEREEKELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEE---LREE 667

Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEE 208
            LE   E A    E E L ++  + +   E+ K+E E +E+  +E
Sbjct: 668 YLELSRELAGLRAELEELEKRREEIKKTLEKLKEELEEREKAKKE 712



 Score = 29.3 bits (66), Expect = 2.4
 Identities = 35/148 (23%), Positives = 64/148 (43%), Gaps = 12/148 (8%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVE--------EALQKE 157
           EL K+ E     + KLEE+  E +      + E E L E+     E          L + 
Sbjct: 242 ELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEKVKELKELKEKAEEYIKLSEF 301

Query: 158 L-KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEA 216
             +  +E  E +   +R E E   + E++++ E   ER +   ELK++  E ++R +E  
Sbjct: 302 YEEYLDELREIEKRLSRLEEEINGIEERIKELEEKEERLE---ELKKKLKELEKRLEELE 358

Query: 217 MRRQTEEIHLRMQQQDEELRRRHQENSI 244
            R +  E     +++ E L++R    + 
Sbjct: 359 ERHELYEEAKAKKEELERLKKRLTGLTP 386



 Score = 28.1 bits (63), Expect = 7.2
 Identities = 36/130 (27%), Positives = 63/130 (48%), Gaps = 3/130 (2%)

Query: 110 QKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQM 169
           +K E L+K+L   E+KL+   E        + L E     VEE L++ LK  E      +
Sbjct: 549 EKLEELKKKLAELEKKLDELEE--ELAELLKELEELGFESVEE-LEERLKELEPFYNEYL 605

Query: 170 EFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQ 229
           E    E E E   ++L++ E + ++  +E    E+  EE R+  EE  ++ +EE +  ++
Sbjct: 606 ELKDAEKELEREEKELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEELR 665

Query: 230 QQDEELRRRH 239
           ++  EL R  
Sbjct: 666 EEYLELSREL 675


>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584).  This
           protein is found in bacteria and eukaryotes. Proteins in
           this family are typically between 943 and 1234 amino
           acids in length. This family contains a P-loop motif
           suggesting it is a nucleotide binding protein. It may be
           involved in replication.
          Length = 1198

 Score = 30.8 bits (70), Expect = 1.1
 Identities = 30/141 (21%), Positives = 65/141 (46%), Gaps = 10/141 (7%)

Query: 108 YKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEA 167
           Y++ ++ ++++L+ + EK   ++   R E + +     E     +AL+ +L+ + E  + 
Sbjct: 382 YERLKQKIKEQLERDLEKNNERLAAIREEKDRQKAAIEE---DLQALESQLRQQLEAGKL 438

Query: 168 QMEFARYENETEILREQLRQREA-----DRERQKQEWELKERHAEEQRRCDEEAMRRQTE 222
           +     YE E  + R + R   A     + E+ +   E  E+  EEQ + +    + Q+E
Sbjct: 439 EFNEEEYELELRLGRLKQRLDSATATPEELEQLEINDEALEKAQEEQEQAEANVEQLQSE 498

Query: 223 EIHLRMQ--QQDEELRRRHQE 241
              LR +  +  E L+R  + 
Sbjct: 499 LRQLRKRRDEALEALQRAERR 519



 Score = 27.7 bits (62), Expect = 8.7
 Identities = 31/155 (20%), Positives = 69/155 (44%), Gaps = 20/155 (12%)

Query: 96  EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ 155
               + + L +  K+  + L +EL     KL A        +E E+L +++  F +  ++
Sbjct: 292 RLRQQLRTLEDQLKEARDELNQELSAANAKLAAD------RSELELLEDQKGAFEDADIE 345

Query: 156 KELKLEEEKL---EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRC 212
           + L+ + ++L    +++E      +    + Q  QR+ +R +QK    +KE+   +  + 
Sbjct: 346 Q-LQADLDQLPSIRSELEEVEARLDALTGKHQDVQRKYERLKQK----IKEQLERDLEKN 400

Query: 213 DE------EAMRRQTEEIHLRMQQQDEELRRRHQE 241
           +E      E   RQ   I   +Q  + +LR++ + 
Sbjct: 401 NERLAAIREEKDRQKAAIEEDLQALESQLRQQLEA 435


>gnl|CDD|233069 TIGR00643, recG, ATP-dependent DNA helicase RecG.  [DNA metabolism,
           DNA replication, recombination, and repair].
          Length = 630

 Score = 30.4 bits (69), Expect = 1.1
 Identities = 15/60 (25%), Positives = 24/60 (40%), Gaps = 7/60 (11%)

Query: 122 EEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEIL 181
           E EKL+ +   A YE   +   +     +   ++      +EK     EF   E E +IL
Sbjct: 460 ESEKLDLKAAEALYERLKKAFPKYNVGLLHGRMK-----SDEKEAVMEEF--REGEVDIL 512


>gnl|CDD|163620 cd00845, MPP_UshA_N_like, Escherichia coli UshA-like family,
           N-terminal metallophosphatase domain.  This family
           includes the bacterial enzyme UshA, and related enzymes
           including SoxB, CpdB, YhcR, and CD73.  All members have
           a similar domain architecture which includes an
           N-terminal metallophosphatase domain and a C-terminal
           nucleotidase domain.  The N-terminal metallophosphatase
           domain belongs to a large superfamily of distantly
           related metallophosphatases (MPPs) that includes:
           Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat
           debranching enzymes, YfcE-like phosphodiesterases,
           purple acid phosphatases (PAPs), YbbF-like
           UDP-2,3-diacylglucosamine hydrolases, and acid
           sphingomyelinases (ASMases).  MPPs are functionally
           diverse, but all share a conserved domain with an active
           site consisting of two metal ions (usually manganese,
           iron, or zinc) coordinated with octahedral geometry by a
           cage of histidine, aspartate, and asparagine residues.
           The conserved domain is a double beta-sheet sandwich
           with a di-metal active site made up of residues located
           at the C-terminal side of the sheets. This domain is
           thought to allow for productive metal coordination.
          Length = 252

 Score = 29.9 bits (68), Expect = 1.1
 Identities = 13/28 (46%), Positives = 16/28 (57%), Gaps = 2/28 (7%)

Query: 83  IGPRFATIGSFEFEYGTRWKQLYELYKQ 110
           +G    TIG+ EF+YG     L ELYK 
Sbjct: 69  LGYDAVTIGNHEFDYGL--DALAELYKD 94


>gnl|CDD|238427 cd00831, CHS_like, Chalcone and stilbene synthases; plant-specific
           polyketide synthases (PKS) and related enzymes, also
           called type III PKSs. PKS generate an array of different
           products, dependent on the nature of the starter
           molecule. They share a common chemical strategy: after
           the starter molecule is loaded onto the active site
           cysteine, a carboxylative condensation reaction extends
           the polyketide chain. Plant-specific PKS are dimeric
           iterative PKSs, using coenzyme A esters to deliver
           substrate to the active site, but they differ in the
           choice of starter molecule and the number of
           condensation reactions.
          Length = 361

 Score = 30.3 bits (69), Expect = 1.1
 Identities = 15/50 (30%), Positives = 22/50 (44%), Gaps = 9/50 (18%)

Query: 152 EALQKELKLEEEKLEA-QMEFARYEN--------ETEILREQLRQREADR 192
           +A++K L L  E LEA +M   RY N            +  + R +  DR
Sbjct: 295 DAVEKALGLSPEDLEASRMVLRRYGNMSSSSVLYVLAYMEAKGRVKRGDR 344



 Score = 27.6 bits (62), Expect = 8.7
 Identities = 12/26 (46%), Positives = 16/26 (61%), Gaps = 1/26 (3%)

Query: 113 EALQKELKLEEEKLEA-QMEFARYEN 137
           +A++K L L  E LEA +M   RY N
Sbjct: 295 DAVEKALGLSPEDLEASRMVLRRYGN 320


>gnl|CDD|237178 PRK12705, PRK12705, hypothetical protein; Provisional.
          Length = 508

 Score = 30.1 bits (68), Expect = 1.3
 Identities = 27/109 (24%), Positives = 40/109 (36%), Gaps = 3/109 (2%)

Query: 132 FARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLR---QR 188
             R     E  R  +    E   + E  L E K     E  +   E    RE+L+   +R
Sbjct: 26  KKRQRLAKEAERILQEAQKEAEEKLEAALLEAKELLLRERNQQRQEARREREELQREEER 85

Query: 189 EADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRR 237
              +E Q      K  + E Q    E+A+  +  E+    +Q D EL R
Sbjct: 86  LVQKEEQLDARAEKLDNLENQLEEREKALSARELELEELEKQLDNELYR 134



 Score = 29.7 bits (67), Expect = 1.8
 Identities = 34/111 (30%), Positives = 58/111 (52%), Gaps = 6/111 (5%)

Query: 104 LYELYKQKEEALQKELKLEEEKLEAQ--MEFARYENETEILRERECRFVEEALQK-ELKL 160
           +  L K++  A + E  L+E + EA+  +E A  E +  +LRER  +  E   ++ EL+ 
Sbjct: 22  VVLLKKRQRLAKEAERILQEAQKEAEEKLEAALLEAKELLLRERNQQRQEARREREELQR 81

Query: 161 EEEKLEAQMEF--ARYENETEILREQLRQREADRERQKQEWELKERHAEEQ 209
           EEE+L  + E   AR E + + L  QL +RE     ++ E E  E+  + +
Sbjct: 82  EEERLVQKEEQLDARAE-KLDNLENQLEEREKALSARELELEELEKQLDNE 131


>gnl|CDD|235316 PRK04863, mukB, cell division protein MukB; Provisional.
          Length = 1486

 Score = 30.3 bits (69), Expect = 1.3
 Identities = 28/131 (21%), Positives = 51/131 (38%), Gaps = 18/131 (13%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           + L E  K+  + L  E +LE+ + E +             RER  R       ++L+  
Sbjct: 537 RLLAEFCKRLGKNLDDEDELEQLQEELEARLESLSESVSEARER--RMALRQQLEQLQAR 594

Query: 162 EEKLEAQM-EFARYENETEILREQ-------------LRQREADRERQKQEWELKERHAE 207
            ++L A+   +   ++    LREQ               Q+  +RER+      ++  A 
Sbjct: 595 IQRLAARAPAWLAAQDALARLREQSGEEFEDSQDVTEYMQQLLERERELTV--ERDELAA 652

Query: 208 EQRRCDEEAMR 218
            ++  DEE  R
Sbjct: 653 RKQALDEEIER 663


>gnl|CDD|220402 pfam09787, Golgin_A5, Golgin subfamily A member 5.  Members of this
           family of proteins are involved in maintaining Golgi
           structure. They stimulate the formation of Golgi stacks
           and ribbons, and are involved in intra-Golgi retrograde
           transport. Two main interactions have been
           characterized: one with RAB1A that has been activated by
           GTP-binding and another with isoform CASP of CUTL1.
          Length = 509

 Score = 29.8 bits (67), Expect = 1.5
 Identities = 23/113 (20%), Positives = 52/113 (46%), Gaps = 5/113 (4%)

Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
            +QL +L + + E   ++ +L++ + +A       E     L+E       ++   +++L
Sbjct: 217 LQQLLKLLRAEGE--SEKQELQQYRQKAHRILQSKEKRINFLKEGCLFEGLDSSTAQIEL 274

Query: 161 EEEKLEAQM---EFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
           EE K E++    E  + E +   LR + + REA+   + + +  + R   +Q 
Sbjct: 275 EELKHESEHVQEEITKLEGQIIQLRSEAQDREAEASGEAESFRKQPRELSQQI 327


>gnl|CDD|185727 cd08986, GH43_7, Glycosyl hydrolase family 43.  This glycosyl
           hydrolase family 43 (GH43) includes enzymes with
           beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC
           3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-),
           alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase
           (EC 3.2.1.99), xylanase (EC 3.2.1.8),
           endo-alpha-L-arabinanase and galactan
           1,3-beta-galactosidase (EC 3.2.1.145) activities. These
           are inverting enzymes (i.e. they invert the
           stereochemistry of the anomeric carbon atom of the
           substrate) that have an aspartate as the catalytic
           general base, a glutamate as the catalytic general acid
           and another aspartate that is responsible for pKa
           modulation and orienting the catalytic acid. Many of the
           enzymes in this family display both
           alpha-L-arabinofuranosidase and beta-D-xylosidase
           activity using aryl-glycosides as substrates. A common
           structural feature of GH43 enzymes is a 5-bladed
           beta-propeller domain that contains the catalytic acid
           and catalytic base. A long V-shaped groove, partially
           enclosed at one end, forms a single extended
           substrate-binding surface across the face of the
           propeller.
          Length = 269

 Score = 29.7 bits (67), Expect = 1.7
 Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 4/48 (8%)

Query: 58  LVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLY 105
           L DD  GL+   V    S  F +  IG   A +    F+YG ++    
Sbjct: 156 LKDDLSGLAGDPVRIDPSPTFYKDEIGHEGAFV----FKYGGKYYLFG 199


>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
           protein; Reviewed.
          Length = 782

 Score = 29.8 bits (68), Expect = 1.8
 Identities = 30/125 (24%), Positives = 57/125 (45%), Gaps = 17/125 (13%)

Query: 110 QKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK-ELKLEEEKLEAQ 168
            + E  QK  + E    EA+      E + E L+E E + +EEA ++ +  ++E K EA 
Sbjct: 528 LERELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEAD 587

Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRM 228
                     EI++E LRQ +     +     +K     E R+   +A  ++ ++   + 
Sbjct: 588 ----------EIIKE-LRQLQ-----KGGYASVKAHELIEARKRLNKANEKKEKKKKKQK 631

Query: 229 QQQDE 233
           ++Q+E
Sbjct: 632 EKQEE 636



 Score = 29.4 bits (67), Expect = 2.5
 Identities = 17/89 (19%), Positives = 36/89 (40%), Gaps = 2/89 (2%)

Query: 155 QKELKLEEEKLEAQMEFARYENETEILREQLRQREA--DRERQKQEWELKERHAEEQRRC 212
           + E +LE++  EA+      E   E L E+  + +   D+  ++ E E ++   E ++  
Sbjct: 527 ELERELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEA 586

Query: 213 DEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
           DE     +  +       +  EL    + 
Sbjct: 587 DEIIKELRQLQKGGYASVKAHELIEARKR 615



 Score = 29.0 bits (66), Expect = 3.2
 Identities = 22/117 (18%), Positives = 52/117 (44%), Gaps = 18/117 (15%)

Query: 120 KLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETE 179
            LEE + E + +      E E L +       E L++EL+ ++EKL+ +        E +
Sbjct: 524 SLEELERELEQKAE----EAEALLKEA-----EKLKEELEEKKEKLQEE--------EDK 566

Query: 180 ILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQ-TEEIHLRMQQQDEEL 235
           +L E  ++ +   +  K+E +   +   + ++    +++     E   R+ + +E+ 
Sbjct: 567 LLEEAEKEAQQAIKEAKKEADEIIKELRQLQKGGYASVKAHELIEARKRLNKANEKK 623



 Score = 27.9 bits (63), Expect = 6.9
 Identities = 18/69 (26%), Positives = 33/69 (47%), Gaps = 8/69 (11%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEF------ARYENETEILRERECRFVEEALQ 155
           ++L E  ++K+E LQ+E   ++   EA+ E       A+ E +  I   R+ +    A  
Sbjct: 547 EKLKEELEEKKEKLQEE--EDKLLEEAEKEAQQAIKEAKKEADEIIKELRQLQKGGYASV 604

Query: 156 KELKLEEEK 164
           K  +L E +
Sbjct: 605 KAHELIEAR 613


>gnl|CDD|240830 cd12384, RRM_RBM24_RBM38_like, RNA recognition motif in
          eukaryotic RNA-binding protein RBM24, RBM38 and similar
          proteins.  This subfamily corresponds to the RRM of
          RBM24 and RBM38 from vertebrates, SUPpressor family
          member SUP-12 from Caenorhabditis elegans and similar
          proteins. Both RBM24 and RBM38 are preferentially
          expressed in cardiac and skeletal muscle tissues. They
          regulate myogenic differentiation by controlling the
          cell cycle in a p21-dependent or -independent manner.
          RBM24, also termed RNA-binding region-containing
          protein 6, interacts with the 3'-untranslated region
          (UTR) of myogenin mRNA and regulates its stability in
          C2C12 cells. RBM38, also termed CLL-associated antigen
          KW-5, or HSRNASEB, or RNA-binding region-containing
          protein 1 (RNPC1), or ssDNA-binding protein SEB4, is a
          direct target of the p53 family. It is required for
          maintaining the stability of the basal and
          stress-induced p21 mRNA by binding to their 3'-UTRs. It
          also binds the AU-/U-rich elements in p63 3'-UTR and
          regulates p63 mRNA stability and activity. SUP-12 is a
          novel tissue-specific splicing factor that controls
          muscle-specific splicing of the ADF/cofilin pre-mRNA in
          C. elegans. All family members contain a conserved RNA
          recognition motif (RRM), also termed RBD (RNA binding
          domain) or RNP (ribonucleoprotein domain).
          Length = 76

 Score = 27.6 bits (62), Expect = 1.8
 Identities = 14/34 (41%), Positives = 20/34 (58%), Gaps = 1/34 (2%)

Query: 3  IERAVVLVDDR-GNSKNEGIIEFTRKPAAAQALK 35
          IE AVV+ D + G S+  G + F  K +A +A K
Sbjct: 27 IEEAVVITDRQTGKSRGYGFVTFKDKESAERACK 60


>gnl|CDD|224143 COG1222, RPT1, ATP-dependent 26S proteasome regulatory subunit
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 406

 Score = 29.5 bits (67), Expect = 1.9
 Identities = 20/74 (27%), Positives = 31/74 (41%), Gaps = 12/74 (16%)

Query: 138 ETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQ 197
             EIL + E    +E L K    + + LE          E  +L  + ++ EA+  R K+
Sbjct: 6   LDEILGDLESYEPQEYLNKLEDTKLKLLE---------KEKRLLLLEEQRLEAEGLRLKR 56

Query: 198 EWELKERHAEEQRR 211
           E    +R  EE  R
Sbjct: 57  EV---DRLREEIER 67



 Score = 29.2 bits (66), Expect = 2.3
 Identities = 17/72 (23%), Positives = 31/72 (43%), Gaps = 2/72 (2%)

Query: 112 EEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEF 171
           EE    +  L + +     E+     +T++    + + +   L +E +LE E L  + E 
Sbjct: 1   EELDALDEILGDLESYEPQEYLNKLEDTKLKLLEKEKRLL--LLEEQRLEAEGLRLKREV 58

Query: 172 ARYENETEILRE 183
            R   E E L+E
Sbjct: 59  DRLREEIERLKE 70


>gnl|CDD|206563 pfam14395, COOH-NH2_lig, Phage phiEco32-like COOH.NH2 ligase-type
           2.  A family of COOH-NH2 ligases/GCS superfamily found
           in the neighborhood of YheC/D-like ATP-grasp and the
           CotE family of proteins in the firmicutes. Contextual
           analysis suggests that it might be involved in cell wall
           modification and spore coat biosynthesis.
          Length = 261

 Score = 29.2 bits (66), Expect = 2.0
 Identities = 20/78 (25%), Positives = 32/78 (41%), Gaps = 14/78 (17%)

Query: 126 LEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQL 185
           L  +++ A YE + E LR      V+             LEA   +  Y NE E L E +
Sbjct: 192 LSPELQEAFYEGDKEALRP----CVKGVWDD--------LEALPGYTDYRNEIEPLFEMI 239

Query: 186 RQREADRERQ--KQEWEL 201
            + +   E    +Q W++
Sbjct: 240 EEGQTWDEEVDLRQAWKI 257


>gnl|CDD|222631 pfam14259, RRM_6, RNA recognition motif (a.k.a. RRM, RBD, or RNP
          domain). 
          Length = 69

 Score = 27.1 bits (61), Expect = 2.0
 Identities = 10/37 (27%), Positives = 15/37 (40%)

Query: 7  VVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFF 43
          V LV ++   +    +EF     A  ALK+    V  
Sbjct: 28 VRLVRNKDRPRGFAFVEFASPEDAEAALKKLNGLVLD 64


>gnl|CDD|226654 COG4191, COG4191, Signal transduction histidine kinase regulating
           C4-dicarboxylate transport system [Signal transduction
           mechanisms].
          Length = 603

 Score = 29.6 bits (67), Expect = 2.0
 Identities = 20/61 (32%), Positives = 27/61 (44%), Gaps = 6/61 (9%)

Query: 181 LREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQ 240
            R +LR  E    R + E  ++ER A+  R     A  R   EI  R +Q +  LRR   
Sbjct: 320 RRARLRLAELQEARAELERRVEERTADLTR-----ANARLQAEIAER-EQAEAALRRAQD 373

Query: 241 E 241
           E
Sbjct: 374 E 374


>gnl|CDD|163562 TIGR03850, bind_CPR_0540, carbohydrate ABC transporter
           substrate-binding protein, CPR_0540 family.  Members of
           this family are the substrate-binding protein of a
           predicted carbohydrate transporter operon, together with
           permease subunits of ABC transporter homology families.
           This substrate-binding protein frequently co-occurs in
           genomes with a family of disaccharide phosphorylases,
           TIGR02336, suggesting that the molecule transported will
           include
           beta-D-galactopyranosyl-(1->3)-N-acetyl-D-glucosamine
           and related carbohydrates. Members of this family are
           sporadically distributed strain by strain, often in
           species with a human host association, including
           Propionibacterium acnes, Clostridium perfringens, and
           Bacillus cereus
           [Transport and binding proteins, Carbohydrates, organic
           alcohols, and acids].
          Length = 437

 Score = 29.7 bits (67), Expect = 2.1
 Identities = 12/45 (26%), Positives = 27/45 (60%), Gaps = 4/45 (8%)

Query: 90  IGSFEFEYGTR-WKQLYELYKQKEEALQKELKLE---EEKLEAQM 130
           + +FE  YGT+ W+++ E +++  E ++ EL +    E+ +  Q+
Sbjct: 38  VAAFEGGYGTKMWEEVVEAFEKSHEGVKVELTVSKNLEDVITPQI 82


>gnl|CDD|225177 COG2268, COG2268, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 548

 Score = 29.4 bits (66), Expect = 2.4
 Identities = 28/131 (21%), Positives = 58/131 (44%), Gaps = 7/131 (5%)

Query: 120 KLEEEKLEAQMEFARYENETEI-----LRERECRFVEEALQKELKLEEEKLEAQMEFARY 174
           ++ +   +A++     E ETEI      R+ +   +E   Q   K  E+  E ++  A  
Sbjct: 215 RIAQVLQDAEIAENEAEKETEIAIAEANRDAKLVELEVEQQPAGKTAEQTREVKIILAET 274

Query: 175 ENETEILREQLRQREADRERQKQEWELKERHAEEQRRC-DEEAMRRQTEEIHLRMQQQDE 233
           E E    + + R REA++     E  ++E  A+ ++     +A+  +   + L  +Q++ 
Sbjct: 275 EAEVAAWKAETR-REAEQAEILAEQAIQEEKAQAEQEVQHAKALEAREMRVGLIERQKET 333

Query: 234 ELRRRHQENSI 244
           EL  + +   I
Sbjct: 334 ELEPQERSYFI 344



 Score = 27.9 bits (62), Expect = 6.5
 Identities = 26/140 (18%), Positives = 53/140 (37%), Gaps = 5/140 (3%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
           + + EA Q E+  E+   E + +  +     + L  RE R      QKE +LE ++    
Sbjct: 284 ETRREAEQAEILAEQAIQEEKAQAEQEVQHAKALEAREMRVGLIERQKETELEPQERSYF 343

Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRR--CDEEAMRR--QTEEI 224
           +  A+ + + E  +      EA   + +   E      E +R       A     + E++
Sbjct: 344 INAAQRQAQEEA-KAAANIAEAIGAQAEAAVETARETEEAERAEQAALVAAAEAAEQEQV 402

Query: 225 HLRMQQQDEELRRRHQENSI 244
            + ++ +  +     Q   I
Sbjct: 403 EIAVRAEAAKAEAEAQAAEI 422


>gnl|CDD|236272 PRK08475, PRK08475, F0F1 ATP synthase subunit B; Validated.
          Length = 167

 Score = 28.4 bits (64), Expect = 2.4
 Identities = 24/84 (28%), Positives = 46/84 (54%), Gaps = 7/84 (8%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
           ++L E  ++KE+AL+K   LEE K +A++     + E  IL ++    +E+  + +++  
Sbjct: 67  EKLKESKEKKEDALKK---LEEAKEKAELIVETAKKEAYILTQK----IEKQTKDDIENL 119

Query: 162 EEKLEAQMEFARYENETEILREQL 185
            +  E  MEF   + E E++ E L
Sbjct: 120 IKSFEELMEFEVRKMEREVVEEVL 143


>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
          Length = 1036

 Score = 29.5 bits (66), Expect = 2.5
 Identities = 19/53 (35%), Positives = 29/53 (54%), Gaps = 6/53 (11%)

Query: 186 RQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRR 238
           ++RE   E+  +E   +ER AEEQRR +EE    + +    R Q + E  +RR
Sbjct: 254 KRREL--EKLAKEEAERERQAEEQRRREEEKAAMEAD----RAQAKAEVEKRR 300



 Score = 28.3 bits (63), Expect = 6.1
 Identities = 18/59 (30%), Positives = 26/59 (44%), Gaps = 1/59 (1%)

Query: 151 EEALQKELKLEEEKLEA-QMEFARYENETEILREQLRQREADRERQKQEWELKERHAEE 208
           E+ L +E + E EKL   + E  R   E     E+    EADR + K E E +    + 
Sbjct: 247 EDFLLEEKRRELEKLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKLQN 305


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
           biogenesis [Translation, ribosomal structure and
           biogenesis].
          Length = 1077

 Score = 29.3 bits (65), Expect = 2.6
 Identities = 30/183 (16%), Positives = 62/183 (33%), Gaps = 19/183 (10%)

Query: 47  SLKPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYE 106
                 V    +  + E L E    +    +     +  RF    + +   G       E
Sbjct: 540 FFDVSKVANESISSNHEKLMESEFEELKKKWSSLAQLKSRFQKDATLDSIEGEE-----E 594

Query: 107 LYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLE 166
           L +  E+   ++L+ EE   + +ME +R  + T    E       E  ++E   ++E+L 
Sbjct: 595 LIQDDEKGNFEDLEDEENSSDNEMEESRGSSVTAENEESADEVDYETEREENARKKEELR 654

Query: 167 AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHL 226
              E              L +R    ++    +  ++R  EEQ + +         E  +
Sbjct: 655 GNFE--------------LEERGDPEKKDVDWYTEEKRKIEEQLKINRSEFETMVPESRV 700

Query: 227 RMQ 229
            ++
Sbjct: 701 VIE 703


>gnl|CDD|178867 PRK00106, PRK00106, hypothetical protein; Provisional.
          Length = 535

 Score = 29.1 bits (65), Expect = 2.7
 Identities = 30/123 (24%), Positives = 53/123 (43%), Gaps = 3/123 (2%)

Query: 116 QKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE 175
           + E   E  K  A+ E    + E  +  + E R   E +++E K E ++L+ Q+E    E
Sbjct: 50  KAERDAEHIKKTAKRESKALKKELLLEAKEEARKYREEIEQEFKSERQELK-QIESRLTE 108

Query: 176 NETEILR--EQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDE 233
             T + R  E L  +E   E ++Q    K +H +E+    E+   ++  E+         
Sbjct: 109 RATSLDRKDENLSSKEKTLESKEQSLTDKSKHIDEREEQVEKLEEQKKAELERVAALSQA 168

Query: 234 ELR 236
           E R
Sbjct: 169 EAR 171


>gnl|CDD|227606 COG5281, COG5281, Phage-related minor tail protein [Function
           unknown].
          Length = 833

 Score = 29.2 bits (65), Expect = 2.9
 Identities = 23/160 (14%), Positives = 49/160 (30%), Gaps = 15/160 (9%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEAL-QKELKLEEEKLEA 167
            ++ +             + Q+       E    ++      EE     + + ++ +L+ 
Sbjct: 480 AERSQEQMTAALKALLAFQQQIADLSGAKEKASDQKSLLWKAEEQYALLKEEAKQRQLQE 539

Query: 168 QMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
           Q     ++ ET     QL +      +Q   +EL  + A  Q+    +  R    +    
Sbjct: 540 QKALLEHKKETLEYTSQLAELLD---QQADRFELSAQAAGSQKERGSDLYREALAQNAAA 596

Query: 228 MQQQDEELRRRHQENSIFMQVIVWLGDLKQGVYQLGLTEG 267
           + +   EL                  DL QG ++ G    
Sbjct: 597 LNKALNELAAYWSAL-----------DLLQGDWKAGALSA 625


>gnl|CDD|171793 PRK12880, PRK12880, 3-oxoacyl-(acyl carrier protein) synthase III;
           Reviewed.
          Length = 353

 Score = 28.8 bits (64), Expect = 3.0
 Identities = 16/67 (23%), Positives = 35/67 (52%), Gaps = 4/67 (5%)

Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEI---LRERECRFVEEALQKE 157
           ++QL  LY          L+ E +  +  +EF++  +E +I   L  +   ++ + +++E
Sbjct: 223 FRQLENLYMDGANIFNMALECEPKSFKEILEFSK-VDEKDIAFHLFHQSNAYLVDCIKEE 281

Query: 158 LKLEEEK 164
           LKL ++K
Sbjct: 282 LKLNDDK 288


>gnl|CDD|129694 TIGR00606, rad50, rad50.  All proteins in this family for which
           functions are known are involved in recombination,
           recombinational repair, and/or non-homologous end
           joining. They are components of an exonuclease complex
           with MRE11 homologs. This family is distantly related to
           the SbcC family of bacterial proteins. This family is
           based on the phylogenomic analysis of JA Eisen (1999,
           Ph.D. Thesis, Stanford University).
          Length = 1311

 Score = 29.2 bits (65), Expect = 3.0
 Identities = 33/147 (22%), Positives = 63/147 (42%), Gaps = 16/147 (10%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFA-RYENETEILRERECRFVEEALQKELKL 160
            +  + YK+K   ++ ++  +E +LE+  E    YENE + L+ R  + +E  L K +KL
Sbjct: 209 LKYLKQYKEKACEIRDQITSKEAQLESSREIVKSYENELDPLKNRL-KEIEHNLSKIMKL 267

Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQ 220
           +              NE + L+ + +Q E D    + + E   +  +EQ        +R 
Sbjct: 268 D--------------NEIKALKSRKKQMEKDNSELELKMEKVFQGTDEQLNDLYHNHQRT 313

Query: 221 TEEIHLRMQQQDEELRRRHQENSIFMQ 247
             E    +     EL + ++E  +  Q
Sbjct: 314 VREKERELVDCQRELEKLNKERRLLNQ 340


>gnl|CDD|220098 pfam09057, Smac_DIABLO, Second Mitochondria-derived Activator of
           Caspases.  Second Mitochondria-derived Activator of
           Caspases promotes apoptosis by activating caspases in
           the cytochrome c/Apaf-1/caspase-9 pathway, and by
           opposing the inhibitory activity of inhibitor of
           apoptosis proteins (XIAP-BIR3). The protein assumes an
           elongated three-helix bundle structure, and forms a
           dimer in solution.
          Length = 234

 Score = 28.7 bits (64), Expect = 3.1
 Identities = 35/158 (22%), Positives = 57/158 (36%), Gaps = 12/158 (7%)

Query: 70  VSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQ 129
           V+  ++ +  Q +     A + +   EY      L  L KQ   ++ K   +EE+ +   
Sbjct: 73  VTDSANTFLSQTT----LALVDALT-EYTKAVYTLISLQKQYTASIGKMNPVEEDAIWQV 127

Query: 130 MEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQM----EFARYENETEILREQL 185
           +   R E      R  EC   E      + L E   EA      + A       +   Q 
Sbjct: 128 IIGQRVE---VSDRLEECLKFESNWMTAVNLSEMAAEAAYNSGADQASVAARNHLQVAQS 184

Query: 186 RQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
           +  E  +  ++ E +L E  AEE +R  E A      E
Sbjct: 185 QVEEVRQLSKEAEKKLAESKAEEIQRMAEYASSIDLSE 222


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
           approximately 300 residues, found in plants and
           vertebrates. They contain a highly conserved DDRGK
           motif.
          Length = 189

 Score = 28.5 bits (64), Expect = 3.3
 Identities = 18/77 (23%), Positives = 39/77 (50%)

Query: 155 QKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE 214
            K+    EEK   + +    E E E  ++   +RE +R+ +++  E +E+  EE+ R + 
Sbjct: 5   AKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKER 64

Query: 215 EAMRRQTEEIHLRMQQQ 231
           E   R+ +E + +++  
Sbjct: 65  EEQARKEQEEYEKLKSS 81


>gnl|CDD|204414 pfam10211, Ax_dynein_light, Axonemal dynein light chain.  Axonemal
           dynein light chain proteins play a dynamic role in
           flagellar and cilia motility. Eukaryotic cilia and
           flagella are complex organelles consisting of a core
           structure, the axoneme, which is composed of nine
           microtubule doublets forming a cylinder that surrounds a
           pair of central singlet microtubules. This
           ultra-structural arrangement seems to be one of the most
           stable micro-tubular assemblies known and is responsible
           for the flagellar and ciliary movement of a large number
           of organisms ranging from protozoans to mammals. This
           light chain interacts directly with the N-terminal half
           of the heavy chains.
          Length = 189

 Score = 28.3 bits (64), Expect = 3.6
 Identities = 23/85 (27%), Positives = 43/85 (50%), Gaps = 10/85 (11%)

Query: 153 ALQKELKLEEEKLEAQMEFARYENETEILREQ---LRQREADRERQKQEWELKERHAEEQ 209
            ++K L+ E+ K E + E  + E E E L ++   L  +    E++++E    ER  EE+
Sbjct: 111 GMRKALQAEQGKSELEQEIKKLEEEKEELEKRVAELEAKLEAIEKREEE----ERQIEEK 166

Query: 210 RRCDE-EAMRRQTEEIHLRMQQQDE 233
           R  DE   +++Q +   L+ Q +  
Sbjct: 167 RHADEIAFLKKQNQ--QLKSQLEQI 189


>gnl|CDD|217803 pfam03938, OmpH, Outer membrane protein (OmpH-like).  This family
           includes outer membrane proteins such as OmpH among
           others. Skp (OmpH) has been characterized as a molecular
           chaperone that interacts with unfolded proteins as they
           emerge in the periplasm from the Sec translocation
           machinery.
          Length = 157

 Score = 28.0 bits (63), Expect = 3.6
 Identities = 18/88 (20%), Positives = 43/88 (48%), Gaps = 8/88 (9%)

Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
            K   +  +++ + LQ EL+ +E++L+   E  + + +   L E      E    K+ +L
Sbjct: 33  GKAAQKQLEKEFKKLQAELQKKEKELQK--EEQKLQKQAATLSE------EARKAKQQEL 84

Query: 161 EEEKLEAQMEFARYENETEILREQLRQR 188
           ++++ E Q +    + E +  +++L Q 
Sbjct: 85  QQKQQELQQKQQAAQQELQQKQQELLQP 112


>gnl|CDD|163506 TIGR03794, NHLM_micro_HlyD, NHLM bacteriocin system secretion
           protein.  Members of this protein family are homologs of
           the HlyD membrane fusion protein of type I secretion
           systems. Their occurrence in prokaryotic genomes is
           associated with the occurrence of a novel class of
           microcin (small bacteriocins) with a leader peptide
           region related to nitrile hydratase. We designate the
           class of bacteriocin as Nitrile Hydratase Leader
           Microcin, or NHLM. This family, therefore, is designated
           as NHLM bacteriocin system secretion protein. Some but
           not all NHLM-class putative microcins belong to the TOMM
           (thiazole/oxazole modified microcin) class as assessed
           by the presence of the scaffolding protein and/or
           cyclodehydratase in the same gene clusters [Transport
           and binding proteins, Amino acids, peptides and amines,
           Cellular processes, Biosynthesis of natural products].
          Length = 421

 Score = 28.7 bits (64), Expect = 3.6
 Identities = 30/145 (20%), Positives = 53/145 (36%), Gaps = 19/145 (13%)

Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
           EL ++ +E+ QK  +L+E+      E   Y    +  RER         + +  LEE   
Sbjct: 93  ELRERLQESYQKLTQLQEQ----LEEVRNYTGRLKEGRERH------FQKSKEALEETIG 142

Query: 166 EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE---------EA 216
             + E A    E    R  L +  A  +R +   +      E+    D+         +A
Sbjct: 143 RLREELAALSREVGKQRGLLSRGLATFKRDRILQQQWREEQEKYDAADKARAIYALQTKA 202

Query: 217 MRRQTEEIHLRMQQQDEELRRRHQE 241
             R  E +   + Q D +L    ++
Sbjct: 203 DERNLETVLQSLSQADFQLAGVAEK 227


>gnl|CDD|240227 PTZ00009, PTZ00009, heat shock 70 kDa protein; Provisional.
          Length = 653

 Score = 28.6 bits (64), Expect = 3.9
 Identities = 21/80 (26%), Positives = 41/80 (51%), Gaps = 3/80 (3%)

Query: 131 EFARYENETEILRER-ECRFVEEALQKELK--LEEEKLEAQMEFARYENETEILREQLRQ 187
           E  +Y+ E E  RER E +   E     +K  L++EK++ ++  +      + + E L  
Sbjct: 523 EAEKYKAEDEANRERVEAKNGLENYCYSMKNTLQDEKVKGKLSDSDKATIEKAIDEALEW 582

Query: 188 READRERQKQEWELKERHAE 207
            E ++  +K+E+E K++  E
Sbjct: 583 LEKNQLAEKEEFEHKQKEVE 602


>gnl|CDD|226018 COG3487, IrpA, Uncharacterized iron-regulated protein [Inorganic
           ion transport and metabolism].
          Length = 446

 Score = 28.7 bits (64), Expect = 3.9
 Identities = 18/52 (34%), Positives = 27/52 (51%), Gaps = 2/52 (3%)

Query: 82  SIGPRFATIGSFEFEYGTRWK--QLYELYKQKEEALQKELKLEEEKLEAQME 131
           +IG R A +GS+    G+  K   L +L   K+ A  KELK +     A+M+
Sbjct: 326 AIGIRNAYLGSYTRVDGSVVKGPSLADLVAAKDAAANKELKAKLAATVAKMQ 377


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
           consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 28.4 bits (63), Expect = 4.1
 Identities = 8/40 (20%), Positives = 25/40 (62%)

Query: 179 EILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
           + L+E+L +++ D ++ +Q+ +  + +A++QR    +  +
Sbjct: 216 QQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQ 255



 Score = 28.0 bits (62), Expect = 6.4
 Identities = 22/133 (16%), Positives = 59/133 (44%), Gaps = 10/133 (7%)

Query: 120 KLEEEKLEAQMEFAR-----YENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARY 174
            L E+  E  + F R      E E++   +R  +  EE  +K++  ++ + +A       
Sbjct: 185 ALREDN-EKGVNFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQKADFAQDNA 243

Query: 175 ENETEILREQLRQ-READRERQKQEWELKERHAEEQRRCDEEA---MRRQTEEIHLRMQQ 230
           + + + +R++ ++ +   +       +  ++ AE Q+R  E+A   +++  EE       
Sbjct: 244 DKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDH 303

Query: 231 QDEELRRRHQENS 243
           +  +L++  + + 
Sbjct: 304 KAFDLKQESKASE 316


>gnl|CDD|224241 COG1322, COG1322, Predicted nuclease of restriction
           endonuclease-like fold, RmuC family [General function
           prediction only].
          Length = 448

 Score = 28.5 bits (64), Expect = 4.1
 Identities = 19/85 (22%), Positives = 32/85 (37%), Gaps = 7/85 (8%)

Query: 164 KLEAQMEFARYENETEILREQLRQREA-------DRERQKQEWELKERHAEEQRRCDEEA 216
            LE  +    +  E E LR   R  +A       +    K   + +   + EQ +   E+
Sbjct: 40  VLEQLLLLLAFRAEAEQLRTFARSLQALNLELIQELNELKARLQQQLLQSREQLQLLIES 99

Query: 217 MRRQTEEIHLRMQQQDEELRRRHQE 241
           + + + E      +  EEL RR  E
Sbjct: 100 LAQLSSEFQELANEIFEELNRRLAE 124


>gnl|CDD|226513 COG4026, COG4026, Uncharacterized protein containing TOPRIM domain,
           potential nuclease [General function prediction only].
          Length = 290

 Score = 28.3 bits (63), Expect = 4.3
 Identities = 29/93 (31%), Positives = 44/93 (47%), Gaps = 2/93 (2%)

Query: 103 QLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEE 162
           +L EL K+KEE L++  +LE E  E Q    R E E   L E   +   E    + + +E
Sbjct: 143 KLEELQKEKEELLKELEELEAEYEEVQERLKRLEVENSRLEEMLKKLPGEVYDLKKRWDE 202

Query: 163 EKLEAQMEFARYENETEILREQLRQREADRERQ 195
             LE  +E    E  +++++E L     D E Q
Sbjct: 203 --LEPGVELPEEELISDLVKETLNLAPKDIEGQ 233


>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
          Length = 1021

 Score = 28.5 bits (63), Expect = 4.3
 Identities = 24/80 (30%), Positives = 39/80 (48%), Gaps = 4/80 (5%)

Query: 169 MEFARYENETEILREQLRQREADR-ERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
           +E  R E      RE+L +   +R ER++ E E  ER   E+ R + + + R   E   R
Sbjct: 455 LEKKRIERLEREERERLERERMERIERERLERERLERERLERDRLERDRLDRLERERVDR 514

Query: 228 MQQQDEELRRRHQENSIFMQ 247
           +++   E  RR   NS F++
Sbjct: 515 LERDRLEKARR---NSYFLK 531


>gnl|CDD|185616 PTZ00436, PTZ00436, 60S ribosomal protein L19-like protein;
           Provisional.
          Length = 357

 Score = 28.4 bits (62), Expect = 4.4
 Identities = 16/46 (34%), Positives = 28/46 (60%), Gaps = 3/46 (6%)

Query: 156 KELKLEEEKLEAQMEFARYENET---EILREQLRQREADRERQKQE 198
           K  K +E +L  Q+   R ++E    +  +++LR+RE DRER ++E
Sbjct: 146 KNEKKKERQLAEQLAAKRLKDEQHRHKARKQELRKREKDRERARRE 191


>gnl|CDD|172358 PRK13831, PRK13831, conjugal transfer protein TrbI; Provisional.
          Length = 432

 Score = 28.5 bits (64), Expect = 4.5
 Identities = 14/49 (28%), Positives = 22/49 (44%), Gaps = 3/49 (6%)

Query: 184 QLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQD 232
           Q  QRE  R   + E E + R   EQ   +E+ +R +  +   R+Q   
Sbjct: 111 QPGQREERRPTLESEEEWRARLKREQ---EEQYLRERQRQRMARLQANA 156


>gnl|CDD|214636 smart00360, RRM, RNA recognition motif. 
          Length = 73

 Score = 26.4 bits (59), Expect = 4.5
 Identities = 10/38 (26%), Positives = 18/38 (47%), Gaps = 1/38 (2%)

Query: 2  NIERAVVLVD-DRGNSKNEGIIEFTRKPAAAQALKRCQ 38
           +E   ++ D + G SK    +EF  +  A +AL+   
Sbjct: 25 KVESVRLVRDKETGKSKGFAFVEFESEEDAEKALEALN 62


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 28.5 bits (64), Expect = 4.7
 Identities = 29/167 (17%), Positives = 71/167 (42%), Gaps = 36/167 (21%)

Query: 106  ELYKQKEEALQKELKLEEEKLEAQMEFARY--ENETEILRERECRFVEEALQKELKLEEE 163
            +LYK+++E L  +L+ E  +L  ++ F ++    E  I   +     ++ L KELK    
Sbjct: 991  DLYKKRKEYLLGKLERELARLSNKVRFIKHVINGELVITNAK-----KKDLVKELK---- 1041

Query: 164  KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCD---------- 213
                ++ + R+++  +   E++   E +   +  E + ++   E                
Sbjct: 1042 ----KLGYVRFKDIIKKKSEKITAEEEEGAEEDDEADDEDDEEELGAAVSYDYLLSMPIW 1097

Query: 214  ---EEAMRRQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDLKQ 257
               +E + +   E+  + +++ E+L+    ++       +WL DL +
Sbjct: 1098 SLTKEKVEKLNAELE-KKEKELEKLKNTTPKD-------MWLEDLDK 1136


>gnl|CDD|192773 pfam11559, ADIP, Afadin- and alpha-actinin-Binding.  This family
           is found in mammals where it is localised at cell-cell
           adherens junctions, and in Sch. pombe and other fungi
           where it anchors spindle-pole bodies to spindle
           microtubules. It is a coiled-coil structure, and in
           pombe, it is required for anchoring the minus end of
           spindle microtubules to the centrosome equivalent, the
           spindle-pole body. The name ADIP derives from the family
           being composed of Afadin- and alpha-Actinin-Binding
           Proteins Localised at Cell-Cell Adherens Junctions.
          Length = 149

 Score = 27.6 bits (62), Expect = 4.8
 Identities = 27/107 (25%), Positives = 46/107 (42%), Gaps = 18/107 (16%)

Query: 142 LRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWEL 201
            R+R+  F E   +   KLE E    Q    R + + E L           ER+    + 
Sbjct: 44  QRDRDLEFRESLEETLRKLEAEIERLQNTIERLKTQLEDL-----------ERELALLQA 92

Query: 202 KERHAEEQRRCDEEAMRRQTEEIHL-------RMQQQDEELRRRHQE 241
           KER  E++ +  E+ ++ + EE+         R  Q + EL++R +E
Sbjct: 93  KERQLEKKLKTLEQKLKNEKEEVQRLKNIIQQRKTQYNHELKKRDRE 139


>gnl|CDD|233720 TIGR02091, glgC, glucose-1-phosphate adenylyltransferase.  This
           enzyme, glucose-1-phosphate adenylyltransferase, is also
           called ADP-glucose pyrophosphorylase. The plant form is
           an alpha2,beta2 heterodimer, allosterically regulated in
           plants. Both subunits are homologous and included in
           this model. In bacteria, both homomeric forms of GlgC
           and more active heterodimers of GlgC and GlgD have been
           described. This model describes the GlgC subunit only.
           This enzyme appears in variants of glycogen synthesis
           pathways that use ADP-glucose, rather than UDP-glucose
           as in animals [Energy metabolism, Biosynthesis and
           degradation of polysaccharides].
          Length = 361

 Score = 28.4 bits (64), Expect = 5.0
 Identities = 15/55 (27%), Positives = 25/55 (45%), Gaps = 12/55 (21%)

Query: 7   VVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQD------GVF-FLTQSLKPVIVE 54
           V+ VD+ G      I++F  KPA   ++    D      G++ F    LK ++ E
Sbjct: 158 VMQVDEDGR-----IVDFEEKPANPPSIPGMPDFALASMGIYIFDKDVLKELLEE 207


>gnl|CDD|99747 cd06454, KBL_like, KBL_like; this family belongs to the pyridoxal
           phosphate (PLP)-dependent aspartate aminotransferase
           superfamily (fold I). The major groups in this CD
           correspond to serine palmitoyltransferase (SPT),
           5-aminolevulinate synthase (ALAS),
           8-amino-7-oxononanoate synthase (AONS), and
           2-amino-3-ketobutyrate CoA ligase (KBL). SPT is
           responsible for the condensation of L-serine with
           palmitoyl-CoA to produce 3-ketodihydrosphingosine, the
           reaction of the first step in sphingolipid biosynthesis.
           ALAS is involved in heme biosynthesis; it catalyzes the
           synthesis of 5-aminolevulinic acid from glycine and
           succinyl-coenzyme A. AONS catalyses the decarboxylative
           condensation of L-alanine and pimeloyl-CoA in the first
           committed step of biotin biosynthesis. KBL catalyzes the
           second reaction step of the metabolic degradation
           pathway for threonine, converting 2-amino-3-ketobutyrate
           to glycine and acetyl-CoA. The members of this CD are
           widely found in all three forms of life.
          Length = 349

 Score = 28.3 bits (64), Expect = 5.2
 Identities = 16/62 (25%), Positives = 23/62 (37%), Gaps = 17/62 (27%)

Query: 226 LRMQQQDEELRRRHQENSIFMQVIVWLGDLKQGVYQLGLTEG--------PFICECNNKM 277
           L + Q   E R R QEN  +         L++G+ +LG   G        P I +   K 
Sbjct: 246 LEVLQGGPERRERLQENVRY---------LRRGLKELGFPVGGSPSHIIPPLIGDDPAKA 296

Query: 278 KN 279
             
Sbjct: 297 VA 298


>gnl|CDD|215969 pfam00521, DNA_topoisoIV, DNA gyrase/topoisomerase IV, subunit A. 
          Length = 427

 Score = 28.3 bits (64), Expect = 5.4
 Identities = 29/119 (24%), Positives = 51/119 (42%), Gaps = 21/119 (17%)

Query: 134 RYENETEILRERECRFVE---EALQKELKLEEEKLEAQMEFARYENETEILREQLRQREA 190
           +Y N  EIL+E    F+E   E  ++  +   EKLE ++         E L + L + + 
Sbjct: 302 KYLNLKEILKE----FLEHRLEVYKRRKEYLLEKLEERLHIL------EGLLKALNKIDF 351

Query: 191 DRERQKQEWELKERHAEEQRRCDEE--------AMRRQTEEIHLRMQQQDEELRRRHQE 241
             E  +   +LK+   E      E          +RR T+E   +++++ EEL +   E
Sbjct: 352 VIEVIRGSIDLKKAKKELIEELSEIQADYLLDMRLRRLTKEEIEKLEKEIEELEKEIAE 410


>gnl|CDD|241110 cd12666, RRM2_RAVER2, RNA recognition motif 2 in vertebrate
          ribonucleoprotein PTB-binding 2 (raver-2).  This
          subgroup corresponds to the RRM2 of raver-2, a novel
          member of the heterogeneous nuclear ribonucleoprotein
          (hnRNP) family. It is present in vertebrates and shows
          high sequence homology to raver-1, a ubiquitously
          expressed co-repressor of the nucleoplasmic splicing
          repressor polypyrimidine tract-binding protein
          (PTB)-directed splicing of select mRNAs. In contrast,
          raver-2 exerts a distinct spatio-temporal expression
          pattern during embryogenesis and is mainly limited to
          differentiated neurons and glia cells. Although it
          displays nucleo-cytoplasmic shuttling in heterokaryons,
          raver-2 localizes to the nucleus in glia cells and
          neurons. Raver-2 can interact with PTB and may
          participate in PTB-mediated RNA-processing. However,
          there is no evidence indicating that raver-2 can bind
          to cytoplasmic proteins. Raver-2 contains three
          N-terminal RNA recognition motifs (RRMs), also termed
          RBDs (RNA binding domains) or RNPs (ribonucleoprotein
          domains), two putative nuclear localization signals
          (NLS) at the N- and C-termini, a central leucine-rich
          region, and a C-terminal region harboring two
          [SG][IL]LGxxP motifs. Raver-2 binds to PTB through the
          SLLGEPP motif only, and binds to RNA through its RRMs.
          Length = 77

 Score = 26.4 bits (58), Expect = 5.4
 Identities = 12/33 (36%), Positives = 22/33 (66%), Gaps = 1/33 (3%)

Query: 2  NIERAVVLVDD-RGNSKNEGIIEFTRKPAAAQA 33
          NIER  ++  +  G+SK  G +E+ +K +A++A
Sbjct: 25 NIERCFLVYSEVTGHSKGYGFVEYMKKDSASKA 57


>gnl|CDD|240835 cd12389, RRM2_RAVER, RNA recognition motif 2 in ribonucleoprotein
          PTB-binding raver-1, raver-2 and similar proteins.
          This subfamily corresponds to the RRM2 of raver-1 and
          raver-2. Raver-1 is a ubiquitously expressed
          heterogeneous nuclear ribonucleoprotein (hnRNP) that
          serves as a co-repressor of the nucleoplasmic splicing
          repressor polypyrimidine tract-binding protein
          (PTB)-directed splicing of select mRNAs. It shuttles
          between the cytoplasm and the nucleus and can
          accumulate in the perinucleolar compartment, a dynamic
          nuclear substructure that harbors PTB. Raver-1 also
          modulates focal adhesion assembly by binding to the
          cytoskeletal proteins, including alpha-actinin,
          vinculin, and metavinculin (an alternatively spliced
          isoform of vinculin) at adhesion complexes,
          particularly in differentiated muscle tissue. Raver-2
          is a novel member of the heterogeneous nuclear
          ribonucleoprotein (hnRNP) family. It shows high
          sequence homology to raver-1. Raver-2 exerts a
          spatio-temporal expression pattern during embryogenesis
          and is mainly limited to differentiated neurons and
          glia cells. Although it displays nucleo-cytoplasmic
          shuttling in heterokaryons, raver-2 localizes to the
          nucleus in glia cells and neurons. Raver-2 can interact
          with PTB and may participate in PTB-mediated
          RNA-processing. However, there is no evidence
          indicating that raver-2 can bind to cytoplasmic
          proteins. Both raver-1 and raver-2 contain three
          N-terminal RNA recognition motifs (RRMs), also termed
          RBDs (RNA binding domains) or RNPs (ribonucleoprotein
          domains), two putative nuclear localization signals
          (NLS) at the N- and C-termini, a central leucine-rich
          region, and a C-terminal region harboring two
          [SG][IL]LGxxP motifs. They bind to RNA through the
          RRMs. In addition, the two [SG][IL]LGxxP motifs serve
          as the PTB-binding motifs in raver-1. However, raver-2
          interacts with PTB through the SLLGEPP motif only.
          Length = 77

 Score = 26.1 bits (58), Expect = 5.7
 Identities = 10/33 (30%), Positives = 18/33 (54%), Gaps = 1/33 (3%)

Query: 2  NIERAVVLVDDR-GNSKNEGIIEFTRKPAAAQA 33
           +ER  ++  +  G SK  G +E+  K +A +A
Sbjct: 25 AVERCFLVYSESTGESKGYGFVEYASKASALKA 57


>gnl|CDD|234084 TIGR03007, pepcterm_ChnLen, polysaccharide chain length determinant
           protein, PEP-CTERM locus subfamily.  Members of this
           protein family belong to the family of polysaccharide
           chain length determinant proteins (pfam02706). All are
           found in species that encode the PEP-CTERM/exosortase
           system predicted to act in protein sorting in a number
           of Gram-negative bacteria, and are found near the epsH
           homolog that is the putative exosortase gene [Cell
           envelope, Biosynthesis and degradation of surface
           polysaccharides and lipopolysaccharides].
          Length = 498

 Score = 28.1 bits (63), Expect = 5.7
 Identities = 22/98 (22%), Positives = 34/98 (34%), Gaps = 8/98 (8%)

Query: 103 QLYELYKQKEEALQKELKLEEEKLEA-------QMEFARYENETEILRERECRFVEEALQ 155
           ++ +L +QKEE    +    E    A       Q+E A  E E   L  R         +
Sbjct: 283 EIAQLEEQKEEEGSAKNGGPERGEIANPVYQQLQIELAEAEAEIASLEARVAELTARIER 342

Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRE 193
            E  L     E + E  +   + E+ +    Q    RE
Sbjct: 343 LESLLRTIP-EVEAELTQLNRDYEVNKSNYEQLLTRRE 379


>gnl|CDD|114629 pfam05917, DUF874, Helicobacter pylori protein of unknown function
           (DUF874).  This family consists of several hypothetical
           proteins specific to Helicobacter pylori. The function
           of this family is unknown.
          Length = 417

 Score = 27.9 bits (61), Expect = 5.9
 Identities = 24/114 (21%), Positives = 58/114 (50%), Gaps = 1/114 (0%)

Query: 129 QMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQR 188
           ++E A+ + E E  R+R  +   E  Q+E K E+EK + + E     N ++I  EQ +Q+
Sbjct: 127 KIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELAN-SQIKAEQEKQK 185

Query: 189 EADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQEN 242
               +++ ++ + K  +   +   + E  +++TE     + ++ ++  +  ++N
Sbjct: 186 TEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQDLIKEQKDFIKEAEQN 239


>gnl|CDD|215612 PLN03169, PLN03169, chalcone synthase family protein; Provisional.
          Length = 391

 Score = 27.7 bits (62), Expect = 6.4
 Identities = 20/58 (34%), Positives = 31/58 (53%), Gaps = 13/58 (22%)

Query: 153 ALQKELKLEEEKLE----AQMEFARYENET-----EILREQLRQREADRERQKQEWEL 201
            L+K+LKL  EKLE    A M++    + T     E +RE+L+++  + E    EW L
Sbjct: 318 RLEKKLKLAPEKLECSRRALMDYGNVSSNTIVYVLEYMREELKKKGEEDE----EWGL 371


>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
          Length = 880

 Score = 28.1 bits (63), Expect = 6.5
 Identities = 36/128 (28%), Positives = 53/128 (41%), Gaps = 14/128 (10%)

Query: 117 KELKLEEEKLEAQMEFAR--YENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARY 174
            EL  E E+ E Q E AR   +   E+L E E     E  ++   LE E  + +   A  
Sbjct: 216 AELDEEIERYEEQREQARETRDEADEVLEEHE-----ERREELETLEAEIEDLRETIAET 270

Query: 175 ENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR-MQQQDE 233
           E E E L E++R      E      EL+E   +       +    +  E     ++ +DE
Sbjct: 271 EREREELAEEVRDLRERLE------ELEEERDDLLAEAGLDDADAEAVEARREELEDRDE 324

Query: 234 ELRRRHQE 241
           ELR R +E
Sbjct: 325 ELRDRLEE 332


>gnl|CDD|227352 COG5019, CDC3, Septin family protein [Cell division and chromosome
           partitioning / Cytoskeleton].
          Length = 373

 Score = 28.1 bits (63), Expect = 6.5
 Identities = 22/96 (22%), Positives = 41/96 (42%), Gaps = 6/96 (6%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
             LYE Y+ ++      L   +   E  ++        E  RE + +F E+  +KE +LE
Sbjct: 281 NLLYENYRTEK------LSGLKNSGEPSLKEIHEARLNEEERELKKKFTEKIREKEKRLE 334

Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQ 197
           E +     E     ++ E ++++L   E   E+ K 
Sbjct: 335 ELEQNLIEERKELNSKLEEIQKKLEDLEKRLEKLKS 370


>gnl|CDD|222878 PHA02562, 46, endonuclease subunit; Provisional.
          Length = 562

 Score = 28.1 bits (63), Expect = 6.7
 Identities = 17/100 (17%), Positives = 42/100 (42%), Gaps = 4/100 (4%)

Query: 99  TRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKEL 158
            + K+L    ++ + A+ +  ++ +E  E   +    +N+    ++     V++A     
Sbjct: 306 DKLKELQHSLEKLDTAIDELEEIMDEFNEQSKKLLELKNKISTNKQSLITLVDKAK---- 361

Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQE 198
           K++    E Q EF     E   L+++L +    +    +E
Sbjct: 362 KVKAAIEELQAEFVDNAEELAKLQDELDKIVKTKSELVKE 401


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 28.0 bits (62), Expect = 6.8
 Identities = 28/130 (21%), Positives = 57/130 (43%), Gaps = 16/130 (12%)

Query: 114 ALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK---ELKLEEEKLEAQME 170
            L+KE  + + KL + +   + + ++  ++  E R   EA  +   E +L EEK   + E
Sbjct: 452 QLKKENDMLQTKLNSMVSAKQKDKQS--MQSMEKRLKSEADSRVNAEKQLAEEKKRKKEE 509

Query: 171 -------FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
                   A+     E   E L+Q + D E + ++ E   +  EE+ R     + ++ +E
Sbjct: 510 EETAARAAAQAAASREECAESLKQAKQDLEMEIKKLEHDLKLKEEECR----MLEKEAQE 565

Query: 224 IHLRMQQQDE 233
           +    + + E
Sbjct: 566 LRKYQESEKE 575


>gnl|CDD|225159 COG2250, COG2250, Uncharacterized conserved protein related to
           C-terminal domain of eukaryotic chaperone, SACSIN
           [Function unknown].
          Length = 132

 Score = 27.0 bits (60), Expect = 6.8
 Identities = 15/66 (22%), Positives = 31/66 (46%)

Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
            ++L    +  EE L+   +LE+  + ++   A YE   E+  + +   + +  +K L+L
Sbjct: 65  LRELSRELEVPEEILECARELEKRYILSRYPDAEYEGPLELYSKEDAEELLKTAEKVLEL 124

Query: 161 EEEKLE 166
            E  L 
Sbjct: 125 VEGLLG 130


>gnl|CDD|184696 PRK14474, PRK14474, F0F1 ATP synthase subunit B; Provisional.
          Length = 250

 Score = 27.5 bits (61), Expect = 6.8
 Identities = 22/77 (28%), Positives = 33/77 (42%), Gaps = 7/77 (9%)

Query: 161 EEEKLEAQMEFARYENETEILREQLR------QREADRERQKQEWELKERHAEEQRRCDE 214
           E+ + EA  E  RY  + + L +Q        Q  AD +RQ    E +E      R    
Sbjct: 49  EQRQQEAGQEAERYRQKQQSLEQQRASFMAQAQEAADEQRQHLLNEARE-DVATARDEWL 107

Query: 215 EAMRRQTEEIHLRMQQQ 231
           E + R+ +E    +QQQ
Sbjct: 108 EQLEREKQEFFKALQQQ 124


>gnl|CDD|220410 pfam09798, LCD1, DNA damage checkpoint protein.  This is a family
           of proteins which regulate checkpoint kinases. In
           Schizosaccharomyces pombe this protein is called Rad26
           and in Saccharomyces cerevisiae it is called LCD1.
          Length = 648

 Score = 27.9 bits (62), Expect = 7.1
 Identities = 18/56 (32%), Positives = 31/56 (55%), Gaps = 5/56 (8%)

Query: 180 ILREQLR--QREADRERQKQ---EWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQ 230
           +LR++L   Q++   ER KQ     ELKE+H +E ++  +E    + E   L ++Q
Sbjct: 1   MLRDKLDMLQQQKQEERNKQKSRVNELKEKHDQELQKLKQELQSLEDERKFLVLEQ 56


>gnl|CDD|223097 COG0018, ArgS, Arginyl-tRNA synthetase [Translation, ribosomal
           structure and biogenesis].
          Length = 577

 Score = 28.0 bits (63), Expect = 7.2
 Identities = 14/53 (26%), Positives = 27/53 (50%), Gaps = 3/53 (5%)

Query: 104 LYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK 156
           L E Y +  + L+++   +EE  EA+ E  + E+  E       +FV+ +L+ 
Sbjct: 195 LGEYYVKIAKDLEEDPGNDEE--EAREEVEKLESGDEEAELWR-KFVDLSLEG 244


>gnl|CDD|218581 pfam05416, Peptidase_C37, Southampton virus-type processing
           peptidase.  Corresponds to Merops family C37.
           Norwalk-like viruses (NLVs), including the Southampton
           virus, cause acute non-bacterial gastroenteritis in
           humans. The NLV genome encodes three open reading frames
           (ORFs). ORF1 encodes a polyprotein, which is processed
           by the viral protease into six proteins.
          Length = 535

 Score = 27.9 bits (62), Expect = 7.3
 Identities = 18/66 (27%), Positives = 35/66 (53%), Gaps = 2/66 (3%)

Query: 135 YENETEILRERECRF-VEEALQKELKLEEEKLEAQMEFARYENETEI-LREQLRQREADR 192
           Y+   +I  ER  ++ ++E L+   + EEE  E Q   A +  E E  +R+++      R
Sbjct: 258 YDEYKKIREERGGKYSIQEYLEDRERYEEELAERQATEADFCEEEEAKIRQRIFGLRKTR 317

Query: 193 ERQKQE 198
           +++K+E
Sbjct: 318 KQRKEE 323


>gnl|CDD|191249 pfam05279, Asp-B-Hydro_N, Aspartyl beta-hydroxylase N-terminal
           region.  This family includes the N-terminal regions of
           the junctin, junctate and aspartyl beta-hydroxylase
           proteins. Junctate is an integral ER/SR membrane calcium
           binding protein, which comes from an alternatively
           spliced form of the same gene that generates aspartyl
           beta-hydroxylase and junctin. Aspartyl beta-hydroxylase
           catalyzes the post-translational hydroxylation of
           aspartic acid or asparagine residues contained within
           epidermal growth factor (EGF) domains of proteins.
          Length = 240

 Score = 27.6 bits (61), Expect = 7.7
 Identities = 20/115 (17%), Positives = 44/115 (38%), Gaps = 1/115 (0%)

Query: 112 EEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL-EAQME 170
           EE ++++L+   EK+    +      +   L E +    E++  ++  LE  K+ E   +
Sbjct: 102 EEEVKEQLQSLLEKIVVSKQEEDGPGKEPQLDEDKFLLAEDSDDRQETLEAGKVHEETED 161

Query: 171 FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIH 225
               E       +Q  + +A  +  +   E  E+    +   D+       EE +
Sbjct: 162 SYHVEETASEQYKQDMKEKASEQENEDSKEPVEKAERTKAETDDVTEEDYDEEDN 216


>gnl|CDD|227396 COG5064, SRP1, Karyopherin (importin) alpha [Intracellular
           trafficking and secretion].
          Length = 526

 Score = 27.5 bits (61), Expect = 7.9
 Identities = 12/33 (36%), Positives = 21/33 (63%), Gaps = 6/33 (18%)

Query: 206 AEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRR 238
           A+E RR      RR+ +++ LR Q+++E L +R
Sbjct: 22  ADELRR------RREEQQVELRKQKREELLNKR 48


>gnl|CDD|220623 pfam10186, Atg14, UV radiation resistance protein and
           autophagy-related subunit 14.  The Atg14 or Apg14
           proteins are hydrophilic proteins with a predicted
           molecular mass of 40.5 kDa, and have a coiled-coil motif
           at the N terminus region. Yeast cells with mutant Atg14
           are defective not only in autophagy but also in sorting
           of carboxypeptidase Y (CPY), a vacuolar-soluble
           hydrolase, to the vacuole. Subcellular fractionation
           indicates that Apg14p and Apg6p are peripherally
           associated with a membrane structure(s). Apg14p was
           co-immunoprecipitated with Apg6p, suggesting that they
           form a stable protein complex. These results imply that
           Apg6/Vps30p has two distinct functions: in the
           autophagic process and in the vacuolar protein sorting
           pathway. Apg14p may be a component specifically required
           for the function of Apg6/Vps30p through the autophagic
           pathway. There are 17 auto-phagosomal component proteins
           which are categorized into six functional units, one of
           which is the AS-PI3K complex (Vps30/Atg6 and Atg14). The
           AS-PI3K complex and the Atg2-Atg18 complex are essential
           for nucleation, and the specific function of the AS-PI3K
           apparently is to produce phosphatidylinositol
           3-phosphate (PtdIns(3)P) at the pre-autophagosomal
           structure (PAS). The localisation of this complex at the
           PAS is controlled by Atg14. Autophagy mediates the
           cellular response to nutrient deprivation, protein
           aggregation, and pathogen invasion in humans, and
           malfunction of autophagy has been implicated in multiple
           human diseases including cancer. This effect seems to be
           mediated through direct interaction of the human Atg14
           with Beclin 1 in the human phosphatidylinositol 3-kinase
           class III complex.
          Length = 307

 Score = 27.3 bits (61), Expect = 8.0
 Identities = 13/69 (18%), Positives = 28/69 (40%), Gaps = 8/69 (11%)

Query: 181 LREQLRQREADRERQKQEWE--------LKERHAEEQRRCDEEAMRRQTEEIHLRMQQQD 232
           LR  L +   + E  KQ+ E           + A +  + +    + +  +I  R+ Q  
Sbjct: 25  LRLDLARLLLENEELKQKVEEALEGATNEDGKLAADLLKLEVARKKERLNQIRARISQLK 84

Query: 233 EELRRRHQE 241
           EE+ ++ + 
Sbjct: 85  EEIEQKRER 93


>gnl|CDD|197874 smart00787, Spc7, Spc7 kinetochore protein.  This domain is found
           in cell division proteins which are required for
           kinetochore-spindle association.
          Length = 312

 Score = 27.3 bits (61), Expect = 8.2
 Identities = 32/130 (24%), Positives = 54/130 (41%), Gaps = 18/130 (13%)

Query: 97  YGTRWKQLYELYKQKEEALQKELKLEEEKLEAQME------------FARYENETEILRE 144
           Y  R K L  L +  +E L+  LK + + L  ++E                E E   L++
Sbjct: 135 YEWRMKLLEGLKEGLDENLE-GLKEDYKLLMKELELLNSIKPKLRDRKDALEEELRQLKQ 193

Query: 145 RECRFVEEALQKELKLEEEKL-EAQMEFARYENETEILREQLRQREADRER---QKQEWE 200
            E   +E+    EL   +EKL +   E      + E L E+L++ E+  E    +K E  
Sbjct: 194 LE-DELEDCDPTELDRAKEKLKKLLQEIMIKVKKLEELEEELQELESKIEDLTNKKSELN 252

Query: 201 LKERHAEEQR 210
            +   AE++ 
Sbjct: 253 TEIAEAEKKL 262


>gnl|CDD|241124 cd12680, RRM_THOC4, RNA recognition motif in THO complex subunit
          4 (THOC4) and similar proteins.  This subgroup
          corresponds to the RRM of THOC4, also termed
          transcriptional coactivator Aly/REF, or ally of AML-1
          and LEF-1, or bZIP-enhancing factor BEF, an mRNA
          transporter protein with a well conserved RNA
          recognition motif (RRM), also termed RBD (RNA binding
          domain) or RNP (ribonucleoprotein domain). It is
          involved in RNA transportation from the nucleus. THOC4
          was initially identified as a transcription coactivator
          of LEF-1 and AML-1 for the TCRalpha enhancer function.
          In addition, THOC4 specifically binds to rhesus (RH)
          promoter in erythroid. It might be a novel
          transcription cofactor for erythroid-specific genes.
          Length = 75

 Score = 25.7 bits (57), Expect = 8.3
 Identities = 10/35 (28%), Positives = 18/35 (51%)

Query: 2  NIERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKR 36
           +++A V  D  G S     + F R+  A +A+K+
Sbjct: 26 ALKKAAVHYDRSGRSLGTADVVFERRADALKAMKQ 60


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 27.4 bits (61), Expect = 8.3
 Identities = 21/101 (20%), Positives = 43/101 (42%), Gaps = 16/101 (15%)

Query: 145 RECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWE---- 200
           R     +   ++E ++E+E+ E     A          ++L  R AD +R+ +  E    
Sbjct: 106 RNYEADKLDEEQEERVEKEREEELAGDAM---------KKLENRTADSKREMEVLERLEE 156

Query: 201 ---LKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRR 238
              L+ R A+       EA+ R+ ++     +++DE L + 
Sbjct: 157 LKELQSRRADVDVNSMLEALFRREKKEEEEEEEEDEALIKS 197


>gnl|CDD|206172 pfam14002, YniB, YniB-like protein.  The YniB-like protein family
           includes the E. coli YniB protein, which is functionally
           uncharacterized. This family of proteins is found in
           bacteria. Proteins in this family are approximately 180
           amino acids in length. Proteins in this family are
           integral membrane proteins.
          Length = 166

 Score = 26.8 bits (60), Expect = 8.4
 Identities = 13/47 (27%), Positives = 22/47 (46%), Gaps = 7/47 (14%)

Query: 149 FVEEALQ-------KELKLEEEKLEAQMEFARYENETEILREQLRQR 188
           FV  ALQ       +++K   E +E Q+   + +      REQL ++
Sbjct: 85  FVGLALQASGARMSRQVKFIREGIEDQLILEKAKGVEGRTREQLEEK 131


>gnl|CDD|218636 pfam05557, MAD, Mitotic checkpoint protein.  This family consists
           of several eukaryotic mitotic checkpoint (Mitotic arrest
           deficient or MAD) proteins. The mitotic spindle
           checkpoint monitors proper attachment of the bipolar
           spindle to the kinetochores of aligned sister chromatids
           and causes a cell cycle arrest in prometaphase when
           failures occur. Multiple components of the mitotic
           spindle checkpoint have been identified in yeast and
           higher eukaryotes. In S.cerevisiae, the existence of a
           Mad1-dependent complex containing Mad2, Mad3, Bub3 and
           Cdc20 has been demonstrated.
          Length = 722

 Score = 27.6 bits (61), Expect = 8.5
 Identities = 25/137 (18%), Positives = 61/137 (44%), Gaps = 4/137 (2%)

Query: 111 KEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQME 170
           + E +QKEL+ +  ++E + + +      E    RE     E   +   LEE + +A+ E
Sbjct: 74  ENELMQKELEHKRAQIELERKASTLAENYE----RELDRNLELEVRLKALEELEKKAENE 129

Query: 171 FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQ 230
            A  E E ++L+++L       + +K++   + + +  + + D   M+ + +     ++ 
Sbjct: 130 AAEAEEEAKLLKDKLDAESLKLQNEKEDQLKEAKESISRIKNDLSEMQCRAQNADTELKL 189

Query: 231 QDEELRRRHQENSIFMQ 247
            + EL    ++     +
Sbjct: 190 LESELEELREQLEECQK 206


>gnl|CDD|227355 COG5022, COG5022, Myosin heavy chain [Cytoskeleton].
          Length = 1463

 Score = 27.7 bits (62), Expect = 8.6
 Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 4/98 (4%)

Query: 102  KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
             ++ EL K     L + L+ + E +       +  N  ++       +V+     +L   
Sbjct: 906  SEIIELKKSLSSDLIENLEFKTELI---ARLKKLLNNIDLEEGPSIEYVKLPELNKLHEV 962

Query: 162  EEKL-EAQMEFARYENETEILREQLRQREADRERQKQE 198
            E KL E   E+     ++ IL  +  +  ++ +  K+E
Sbjct: 963  ESKLKETSEEYEDLLKKSTILVREGNKANSELKNFKKE 1000


>gnl|CDD|221514 pfam12297, EVC2_like, Ellis van Creveld protein 2 like protein.
           This family of proteins is found in eukaryotes. Proteins
           in this family are typically between 571 and 1310 amino
           acids in length. There are two conserved sequence
           motifs: LPA and ELH. EVC2 is implicated in Ellis van
           Creveld chondrodysplastic dwarfism in humans. Mutations
           in this protein can give rise to this congenital
           condition. LIMBIN is a protein which shares around 80%
           sequence homology with EVC2 and it is implicated in a
           similar condition in bovine chondrodysplastic dwarfism.
          Length = 429

 Score = 27.5 bits (61), Expect = 8.7
 Identities = 25/88 (28%), Positives = 39/88 (44%), Gaps = 6/88 (6%)

Query: 102 KQLYELYKQKEEALQKELKLE-EEKLEAQMEFARYENETEILR-----ERECRFVEEALQ 155
           ++L E Y++K  AL  E  LE  +K+EAQ +    E E          ERE       L 
Sbjct: 209 RRLQEEYERKMVALTAECNLETRKKMEAQHQREMAEMEQAEELLKRAPEREAVECSSLLD 268

Query: 156 KELKLEEEKLEAQMEFARYENETEILRE 183
               LE+E L+  +   + E+  +  R+
Sbjct: 269 TLHGLEQEHLQRSLLLQQEEDFAKAHRQ 296


>gnl|CDD|226809 COG4372, COG4372, Uncharacterized protein conserved in bacteria
           with the myosin-like domain [Function unknown].
          Length = 499

 Score = 27.7 bits (61), Expect = 9.0
 Identities = 29/145 (20%), Positives = 57/145 (39%), Gaps = 18/145 (12%)

Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEE------------ALQK 156
           +++E   Q+     +   +AQ E AR   + + L+ R     E+            A QK
Sbjct: 116 QEREAVRQELAAARQNLAKAQQELARLTKQAQDLQTRLKTLAEQRRQLEAQAQSLQASQK 175

Query: 157 ELKLEEEKLEAQME--FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE 214
           +L+    +L++Q+     R     +  +    +  A + R +   EL  R A  Q+    
Sbjct: 176 QLQASATQLKSQVLDLKLRSAQIEQEAQNLATRANAAQARTE---ELARRAAAAQQTAQA 232

Query: 215 EAMR-RQTEEIHLRMQQQDEELRRR 238
              R  Q  +   ++  + E++R R
Sbjct: 233 IQQRDAQISQKAQQIAARAEQIRER 257


>gnl|CDD|237063 PRK12329, nusA, transcription elongation factor NusA; Provisional.
          Length = 449

 Score = 27.4 bits (61), Expect = 9.2
 Identities = 22/78 (28%), Positives = 33/78 (42%), Gaps = 2/78 (2%)

Query: 133 ARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADR 192
           A Y+ E E  +  E     E  +   +  EE+LEA+    R E +   LRE     E + 
Sbjct: 374 AEYDQEAEDAKVAELISQREEEEALQREAEERLEAEQA-ERAEEDAR-LRELYPLPEDEF 431

Query: 193 ERQKQEWELKERHAEEQR 210
           E + +  E +    EE R
Sbjct: 432 EDEDELEEAQPEEEEEAR 449


>gnl|CDD|226889 COG4487, COG4487, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 438

 Score = 27.5 bits (61), Expect = 9.6
 Identities = 30/147 (20%), Positives = 65/147 (44%), Gaps = 2/147 (1%)

Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL- 160
           K+  E   Q   A +KEL   EE+L  Q +  +     +I +       E A  + L+L 
Sbjct: 49  KEANEKRAQYRSAKKKELSQLEEQLINQKKEQKNLFNEQIKQFELALQDEIAKLEALELL 108

Query: 161 -EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRR 219
             E+  E ++     +  ++ L++QL+      E++++  + +ER   E  +  EE++  
Sbjct: 109 NLEKDKELELLEKELDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEKKLEESLEL 168

Query: 220 QTEEIHLRMQQQDEELRRRHQENSIFM 246
           + E+   ++ + + +L  +  E     
Sbjct: 169 EREKFEEQLHEANLDLEFKENEEQRES 195


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.132    0.367 

Gapped
Lambda     K      H
   0.267   0.0677    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 14,711,351
Number of extensions: 1500143
Number of successful extensions: 6060
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4258
Number of HSP's successfully gapped: 1102
Length of query: 279
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 183
Effective length of database: 6,679,618
Effective search space: 1222370094
Effective search space used: 1222370094
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.2 bits)
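
The derived figures in the statistics footer above can be reproduced from the primary ones. The sketch below is a minimal check, not part of the RPS-BLAST output. It assumes the usual Karlin-Altschul bit-score conversion, bits = (Lambda * S - ln K) / ln 2, and that drop-off values convert as Lambda * X / ln 2. Which parameter set (ungapped or gapped) applies to each footer line is an assumption inferred from which one reproduces the printed value. The function and constant names are hypothetical. Per-hit bit scores in the listing do not all follow from these footer constants (raw score 58 is reported as 26.1 bits in one hit), so only the footer lines are checked here.

```python
import math

# Values copied from the statistics footer of this report.
QUERY_LEN = 279
DB_LETTERS = 10_937_602
DB_SEQS = 44_354
LENGTH_ADJUSTMENT = 96

UNGAPPED = {"lambda": 0.315, "K": 0.132}
GAPPED = {"lambda": 0.267, "K": 0.0677}


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space)."""
    eff_query = query_len - adjustment
    # The length adjustment is subtracted once per database sequence.
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def raw_to_bits(raw_score, params):
    """Convert a raw score to a bit score: (lambda * S - ln K) / ln 2."""
    return (params["lambda"] * raw_score - math.log(params["K"])) / math.log(2)


def dropoff_to_bits(raw_dropoff, params):
    """Convert an X drop-off value to bits: lambda * X / ln 2."""
    return params["lambda"] * raw_dropoff / math.log(2)


if __name__ == "__main__":
    print(effective_search_space(QUERY_LEN, DB_LETTERS, DB_SEQS, LENGTH_ADJUSTMENT))
    print("S1:", round(raw_to_bits(41, UNGAPPED), 1))      # assumed ungapped
    print("S2:", round(raw_to_bits(58, GAPPED), 1))        # assumed gapped
    print("X1:", round(dropoff_to_bits(16, UNGAPPED), 1))  # assumed ungapped
    print("X2:", round(dropoff_to_bits(38, GAPPED), 1))    # assumed gapped
    print("X3:", round(dropoff_to_bits(64, GAPPED), 1))    # assumed gapped
```

With these assumptions the computed values agree with the footer: effective query length 183, effective database length 6,679,618, search space 1,222,370,094, and the bit values printed beside X1, X2, X3, S1 and S2.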