RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy7437
(279 letters)
>gnl|CDD|240580 cd12945, NOPS_NONA_like, NOPS domain, including C-terminal
coiled-coil region, in p54nrb/PSF/PSP1 homologs from
invertebrate species. The family contains a DBHS domain
(for Drosophila behavior, human splicing), which
comprises two conserved RNA recognition motifs (RRMs),
also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and a charged
protein-protein interaction NOPS (NONA and PSP1) domain.
This model corresponds to the NOPS domain, with a long
helical C-terminal extension, found in Drosophila
melanogaster gene no-on-transient A (nonA) encoding
puff-specific protein Bj6 (also termed NONA), Chironomus
tentans hrp65 gene encoding protein Hrp65 and similar
proteins. D. melanogaster NONA is involved in eye
development and behavior, and may play a role in
circadian rhythm maintenance, similar to vertebrate
p54nrb. C. tentans Hrp65 is a component of nuclear
fibers associated with ribonucleoprotein particles in
transit from the gene to the nuclear pore. The NOPS
domain specifically binds to the second RNA recognition
motif (RRM2) domain of the partner DBHS protein via a
substantial interaction surface. Its highly conserved
C-terminal residues are critical for functional DBHS
dimerization while the highly conserved C-terminal
helical extension, forming a right-handed antiparallel
heterodimeric coiled-coil, is essential for localization
of these proteins to subnuclear bodies.
Length = 100
Score = 134 bits (339), Expect = 4e-40
Identities = 67/97 (69%), Positives = 88/97 (90%)
Query: 48 LKPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYEL 107
L+P +VEP++ +DDE+GL E++++KK+ ++ K+RSIGPRFA SFE EYGTRWKQL+EL
Sbjct: 1 LRPCVVEPMEEIDDEDGLPEKSLNKKNPEFNKERSIGPRFAEPNSFEHEYGTRWKQLHEL 60
Query: 108 YKQKEEALQKELKLEEEKLEAQMEFARYENETEILRE 144
YKQKEEAL++ELK+EEEKLEAQME+ARYE+ETE+LRE
Sbjct: 61 YKQKEEALKRELKMEEEKLEAQMEYARYEHETELLRE 97
>gnl|CDD|240579 cd12931, eNOPS_SF, NOPS domain, including C-terminal helical
extension region, in the p54nrb/PSF/PSP1 family. All
members in this family contain a DBHS domain (for
Drosophila behavior, human splicing), which comprises
two conserved RNA recognition motifs (RRM1 and RRM2),
also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and a charged
protein-protein interaction NOPS (NONA and PSP1) domain
with a long helical C-terminal extension. The NOPS
domain specifically binds to RRM2 domain of the partner
DBHS protein via a substantial interaction surface. Its
highly conserved C-terminal residues are critical for
functional DBHS dimerization while the highly conserved
C-terminal helical extension, forming a right-handed
antiparallel heterodimeric coiled-coil, is essential for
localization of these proteins to subnuclear bodies. PSF
has an additional large N-terminal domain that
differentiates it from other family members. The
p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and
DNA-binding protein (p54nrb), polypyrimidine
tract-binding protein (PTB)-associated-splicing factor
(PSF) and paraspeckle protein 1 (PSP1), which are
ubiquitously expressed and are well conserved in
vertebrates. p54nrb, also termed NONO or NMT55, is a
multi-functional protein involved in numerous nuclear
processes including transcriptional regulation,
splicing, DNA unwinding, nuclear retention of
hyperedited double-stranded RNA, viral RNA processing,
control of cell proliferation, and circadian rhythm
maintenance. PSF, also termed POMp100, is also a
multi-functional protein that binds RNA, single-stranded
DNA (ssDNA), double-stranded DNA (dsDNA) and many
factors, and mediates diverse activities in the cell.
PSP1, also termed PSPC1, is a novel nucleolar factor
that accumulates within a new nucleoplasmic compartment,
termed paraspeckles, and diffusely distributes in the
nucleoplasm. The cellular function of PSP1 remains
unknown currently. The family also includes some
p54nrb/PSF/PSP1 homologs from invertebrate species. For
instance, the Drosophila melanogaster gene
no-on-transient A (nonA) encoding puff-specific protein
Bj6 (also termed NONA) and Chironomus tentans hrp65 gene
encoding protein Hrp65. D. melanogaster NONA is involved
in eye development and behavior and may play a role in
circadian rhythm maintenance, similar to vertebrate
p54nrb. C. tentans Hrp65 is a component of nuclear
fibers associated with ribonucleoprotein particles in
transit from the gene to the nuclear pore.
Length = 90
Score = 95.8 bits (239), Expect = 3e-25
Identities = 55/90 (61%), Positives = 71/90 (78%), Gaps = 1/90 (1%)
Query: 49 KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
+PV+VEPL+ D+E+GL ER V KK++ Y K+R +GPRFA GSFE+E+G RWK LYEL
Sbjct: 2 RPVVVEPLEQRDEEDGLPERNV-KKNAGYQKEREVGPRFAPPGSFEYEFGQRWKALYELE 60
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENE 138
KQ+ E L+KELK EKLEA+ME ARYE++
Sbjct: 61 KQQREQLEKELKEAREKLEAEMEAARYEHQ 90
>gnl|CDD|240779 cd12333, RRM2_p54nrb_like, RNA recognition motif 2 in the
p54nrb/PSF/PSP1 family. This subfamily corresponds to
the RRM2 of the p54nrb/PSF/PSP1 family, including 54
kDa nuclear RNA- and DNA-binding protein (p54nrb or
NonO or NMT55), polypyrimidine tract-binding protein
(PTB)-associated-splicing factor (PSF or POMp100),
paraspeckle protein 1 (PSP1 or PSPC1), which are
ubiquitously expressed and are conserved in
vertebrates. p54nrb is a multi-functional protein
involved in numerous nuclear processes including
transcriptional regulation, splicing, DNA unwinding,
nuclear retention of hyperedited double-stranded RNA,
viral RNA processing, control of cell proliferation,
and circadian rhythm maintenance. PSF is also a
multi-functional protein that binds RNA,
single-stranded DNA (ssDNA), double-stranded DNA
(dsDNA) and many factors, and mediates diverse
activities in the cell. PSP1 is a novel nucleolar
factor that accumulates within a new nucleoplasmic
compartment, termed paraspeckles, and diffusely
distributes in the nucleoplasm. The cellular function
of PSP1 remains unknown currently. The family also
includes some p54nrb/PSF/PSP1 homologs from
invertebrate species, such as the Drosophila
melanogaster gene no-on-transient A (nonA) encoding
puff-specific protein Bj6 (also termed NONA) and
Chironomus tentans hrp65 gene encoding protein Hrp65.
D. melanogaster NONA is involved in eye development and
behavior and may play a role in circadian rhythm
maintenance, similar to vertebrate p54nrb. C. tentans
Hrp65 is a component of nuclear fibers associated with
ribonucleoprotein particles in transit from the gene to
the nuclear pore. All family members contain a DBHS
domain (for Drosophila behavior, human splicing), which
comprises two conserved RNA recognition motifs (RRMs),
also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and a charged
protein-protein interaction module. PSF has an
additional large N-terminal domain that differentiates
it from other family members.
Length = 80
Score = 92.0 bits (229), Expect = 5e-24
Identities = 35/55 (63%), Positives = 44/55 (80%)
Query: 3 IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
+ERAVV+VDDRG S EGI+EF+RKP A A+KRC +G F LT S +PV+VEPL+
Sbjct: 26 VERAVVIVDDRGRSTGEGIVEFSRKPGAQAAIKRCSEGCFLLTASPRPVVVEPLE 80
>gnl|CDD|241034 cd12590, RRM2_PSF, RNA recognition motif 2 in vertebrate
polypyrimidine tract-binding protein
(PTB)-associated-splicing factor (PSF). This subgroup
corresponds to the RRM2 of PSF, also termed proline-
and glutamine-rich splicing factor, or 100 kDa
DNA-pairing protein (POMp100), or 100 kDa subunit of
DNA-binding p52/p100 complex, a multifunctional protein
that mediates diverse activities in the cell. It is
ubiquitously expressed and highly conserved in
vertebrates. PSF binds not only RNA but also both
single-stranded DNA (ssDNA) and double-stranded DNA
(dsDNA) and facilitates the renaturation of
complementary ssDNAs. It promotes the formation of
D-loops in superhelical duplex DNA, and is involved in
cell proliferation. PSF can also interact with multiple
factors. It is an RNA-binding component of spliceosomes
and binds to insulin-like growth factor response
element (IGFRE). Moreover, PSF functions as a
transcriptional repressor interacting with Sin3A and
mediating silencing through the recruitment of histone
deacetylases (HDACs) to the DNA binding domain (DBD) of
nuclear hormone receptors. PSF is an essential pre-mRNA
splicing factor and is dissociated from PTB and binds
to U1-70K and serine-arginine (SR) proteins during
apoptosis. PSF forms a heterodimer with the nuclear
protein p54nrb, also known as non-POU domain-containing
octamer-binding protein (NonO). The PSF/p54nrb complex
displays a variety of functions, such as DNA
recombination and RNA synthesis, processing, and
transport. PSF contains two conserved RNA recognition
motifs (RRMs), also termed RBDs (RNA binding domains)
or RNPs (ribonucleoprotein domains), which are
responsible for interactions with RNA and for the
localization of the protein in speckles. It also
contains an N-terminal region rich in proline, glycine,
and glutamine residues, which may play a role in
interactions recruiting other molecules.
Length = 80
Score = 77.4 bits (190), Expect = 2e-18
Identities = 34/55 (61%), Positives = 44/55 (80%)
Query: 3 IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
+ERAVV+VDDRG S +GI+EF KPAA +A +RC +GVF LT + +PVIVEPL+
Sbjct: 26 VERAVVIVDDRGRSTGKGIVEFASKPAARKAFERCTEGVFLLTTTPRPVIVEPLE 80
>gnl|CDD|241033 cd12589, RRM2_PSP1, RNA recognition motif 2 in vertebrate
paraspeckle protein 1 (PSP1 or PSPC1). This subgroup
corresponds to the RRM2 of PSPC1, also termed
paraspeckle component 1 (PSPC1), a novel nucleolar
factor that accumulates within a new nucleoplasmic
compartment, termed paraspeckles, and diffusely
distributes in the nucleoplasm. It is ubiquitously
expressed and highly conserved in vertebrates. Although
its cellular function remains unknown currently, PSPC1
forms a novel heterodimer with the nuclear protein
p54nrb, also known as non-POU domain-containing
octamer-binding protein (NonO), which localizes to
paraspeckles in an RNA-dependent manner. PSPC1 contains
two conserved RNA recognition motifs (RRMs), also
termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), at the N-terminus.
Length = 80
Score = 74.3 bits (182), Expect = 3e-17
Identities = 32/55 (58%), Positives = 42/55 (76%)
Query: 3 IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
+ERAVV+VDDRG +G +EF KPAA +AL+RC DG F LT + +PVIVEP++
Sbjct: 26 VERAVVIVDDRGRPTGKGFVEFAAKPAARKALERCADGAFLLTTTPRPVIVEPME 80
>gnl|CDD|241035 cd12591, RRM2_p54nrb, RNA recognition motif 2 in vertebrate 54
kDa nuclear RNA- and DNA-binding protein (p54nrb).
This subgroup corresponds to the RRM2 of p54nrb, also
termed non-POU domain-containing octamer-binding
protein (NonO), or 55 kDa nuclear protein (NMT55), or
DNA-binding p52/p100 complex 52 kDa subunit. p54nrb is
a multifunctional protein involved in numerous nuclear
processes including transcriptional regulation,
splicing, DNA unwinding, nuclear retention of
hyperedited double-stranded RNA, viral RNA processing,
control of cell proliferation, and circadian rhythm
maintenance. It is ubiquitously expressed and highly
conserved in vertebrates. It binds both single- and
double-stranded RNA and DNA, and also possesses
inherent carbonic anhydrase activity. p54nrb forms a
heterodimer with paraspeckle component 1 (PSPC1 or
PSP1), localizing to paraspeckles in an RNA-dependent
manner. It also forms a heterodimer with polypyrimidine
tract-binding protein-associated-splicing factor (PSF).
p54nrb contains two conserved RNA recognition motifs
(RRMs), also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), at the N-terminus.
Length = 80
Score = 74.3 bits (182), Expect = 3e-17
Identities = 32/55 (58%), Positives = 40/55 (72%)
Query: 3 IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
+ERAVV+VDDRG +GI+EF KP+A +AL RC DG F LT +PV VEP+D
Sbjct: 26 VERAVVIVDDRGRPTGKGIVEFAGKPSARKALDRCSDGAFLLTAFPRPVTVEPMD 80
>gnl|CDD|149257 pfam08075, NOPS, NOPS (NUC059) domain. This domain is found at the
C-terminus of NONA and PSP1 proteins adjacent to 1 or 2
pfam00076 domains.
Length = 52
Score = 72.4 bits (178), Expect = 7e-17
Identities = 35/53 (66%), Positives = 41/53 (77%), Gaps = 1/53 (1%)
Query: 50 PVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWK 102
PVIVEP++ DDE+GL E+ V KKS DY K+R GPRFA GSFE EYG+RWK
Sbjct: 1 PVIVEPMEQNDDEDGLPEKLV-KKSPDYHKEREQGPRFAQPGSFEHEYGSRWK 52
>gnl|CDD|240581 cd12946, NOPS_p54nrb_PSF_PSPC1, NOPS domain, including C-terminal
coiled-coil region, in p54nrb/PSF/PSPC1 family proteins.
The family contains a DBHS domain (for Drosophila
behavior, human splicing), which comprises two conserved
RNA recognition motifs (RRMs), also termed RBDs (RNA
binding domains) or RNPs (ribonucleoprotein domains),
and a charged protein-protein interaction NOPS (NONA and
PSP1) domain. This model corresponds to the NOPS domain,
with a long helical C-terminal extension, found in the
p54nrb/PSF/PSPC1 proteins. The NOPS domain specifically
binds to the second RNA recognition motif (RRM2) domain
of the partner DBHS protein via a substantial
interaction surface. Its highly conserved C-terminal
residues are critical for functional DBHS dimerization
while the highly conserved C-terminal helical extension,
forming a right-handed antiparallel heterodimeric
coiled-coil, is essential for localization of these
proteins to subnuclear bodies. Members in the family
include 54 kDa nuclear RNA- and DNA-binding protein
(p54nrb), polypyrimidine tract-binding protein
(PTB)-associated-splicing factor (PSF) and paraspeckle
protein component 1 (PSPC1 or PSP1), which are
ubiquitously expressed and are conserved in vertebrates.
p54nrb, also termed NONO or NMT55, is a multi-functional
protein involved in numerous nuclear processes including
transcriptional regulation, splicing, DNA unwinding,
nuclear retention of hyperedited double-stranded RNA,
viral RNA processing, control of cell proliferation, and
circadian rhythm maintenance. PSF, also termed POMp100,
is a multi-functional protein that binds RNA,
single-stranded DNA (ssDNA), double-stranded DNA (dsDNA)
and many factors, and mediates diverse activities in the
cell. PSPC1 is a novel nucleolar factor that accumulates
within a new nucleoplasmic compartment, termed
paraspeckles, and diffusely distributes in the
nucleoplasm. The cellular function of PSPC1 remains
unknown currently. PSF has an additional large
N-terminal domain that differentiates it from other
family members.
Length = 93
Score = 69.8 bits (170), Expect = 2e-15
Identities = 43/90 (47%), Positives = 65/90 (72%), Gaps = 1/90 (1%)
Query: 49 KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
+PVIVEP++ +DDE+GL E+ +K+ Y K+R PRFA G+FE+EY RWK L E+
Sbjct: 2 RPVIVEPMEQLDDEDGLPEKLA-QKNQQYHKEREQPPRFAQPGTFEYEYAQRWKALDEME 60
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENE 138
KQ+ E + + +K +EKLE++ME AR+E++
Sbjct: 61 KQQREQVDRNIKEAKEKLESEMEAARHEHQ 90
>gnl|CDD|240584 cd12949, NOPS_PSPC1, NOPS domain, including C-terminal coiled-coil
region, in paraspeckle protein component 1 (PSPC1) and
similar proteins. The family contains a DBHS domain
(for Drosophila behavior, human splicing), which
comprises two conserved RNA recognition motifs (RRMs),
also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and a charged
protein-protein interaction NOPS (NONA and PSP1) domain.
This model corresponds to the NOPS domain, with a long
helical C-terminal extension, of paraspeckle component 1
(PSPC1, also termed PSP1), a novel nucleolar factor that
accumulates within a new nucleoplasmic compartment,
termed paraspeckles, and diffusely distributes in the
nucleoplasm. It is ubiquitously expressed and highly
conserved in vertebrates. Although its cellular function
remains unknown currently, PSPC1 forms a novel
heterodimer with the nuclear protein p54nrb, also known
as non-POU domain-containing octamer-binding protein
(NONO), which localizes to paraspeckles in an
RNA-dependent manner. The NOPS domain specifically binds
to the second RNA recognition motif (RRM2) domain of the
partner DBHS protein via a substantial interaction
surface. Its highly conserved C-terminal residues are
critical for functional DBHS dimerization while the
highly conserved C-terminal helical extension, forming a
right-handed antiparallel heterodimeric coiled-coil, is
essential for localization of these proteins to
subnuclear bodies.
Length = 94
Score = 68.9 bits (168), Expect = 4e-15
Identities = 44/90 (48%), Positives = 66/90 (73%), Gaps = 1/90 (1%)
Query: 49 KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
+PVIVEP++ DDE+GL E+ + +K+ Y K+R PRFA G+FEFEY +RWK L E+
Sbjct: 2 RPVIVEPMEQFDDEDGLPEKLM-QKTQQYHKEREQPPRFAQPGTFEFEYASRWKALDEME 60
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENE 138
KQ+ E + + ++ +EKLEA+ME AR+E++
Sbjct: 61 KQQREQVDRNIREAKEKLEAEMEAARHEHQ 90
>gnl|CDD|240582 cd12947, NOPS_p54nrb, NOPS domain, including C-terminal coiled-coil
region, in 54 kDa nuclear RNA- and DNA-binding protein
(p54nrb) and similar proteins. The family contains a
DBHS domain (for Drosophila behavior, human splicing),
which comprises two conserved RNA recognition motifs
(RRMs), also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and a charged
protein-protein interaction NOPS (NONA and PSP1) domain.
This model corresponds to the NOPS domain, with a long
helical C-terminal extension, found in p54nrb, also
termed non-POU domain-containing octamer-binding protein
(NONO), or 55 kDa nuclear protein (NMT55), or
DNA-binding p52/p100 complex 52 kDa subunit. It is a
multi-functional protein involved in numerous nuclear
processes including transcriptional regulation,
splicing, DNA unwinding, nuclear retention of
hyperedited double-stranded RNA, viral RNA processing,
control of cell proliferation, and circadian rhythm
maintenance. p54nrb is ubiquitously expressed and highly
conserved in vertebrates. It binds both single- and
double-stranded RNA and DNA, and also possesses inherent
carbonic anhydrase activity. p54nrb forms a heterodimer
with paraspeckle component 1 (PSPC1 or PSP1), localizing
to paraspeckles in an RNA-dependent manner. It also
forms a heterodimer with polypyrimidine tract-binding
protein-associated-splicing factor (PSF). The NOPS
domain specifically binds to the second RNA recognition
motif (RRM2) domain of the partner DBHS protein via a
substantial interaction surface. Its highly conserved
C-terminal residues are critical for functional DBHS
dimerization while the highly conserved C-terminal
helical extension, forming a right-handed antiparallel
heterodimeric coiled-coil, is essential for paraspeckle
localization to subnuclear bodies.
Length = 94
Score = 68.2 bits (166), Expect = 8e-15
Identities = 46/94 (48%), Positives = 64/94 (68%), Gaps = 1/94 (1%)
Query: 49 KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
+PV VEP+D +DDEEGL E+ V K Y K+R PRFA GSFE+EY RWK L E+
Sbjct: 2 RPVTVEPMDQLDDEEGLPEKLVIKNQQ-YHKEREQPPRFAQPGSFEYEYAMRWKALIEME 60
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEIL 142
KQ++E + + +K EKLE +ME AR+E++ ++
Sbjct: 61 KQQQEQVDRNIKEAREKLEMEMEAARHEHQVMLM 94
>gnl|CDD|240583 cd12948, NOPS_PSF, NOPS domain, including C-terminal coiled-coil
region, in polypyrimidine tract-binding protein
(PTB)-associated-splicing factor (PSF) and similar
proteins. This model contains the NOPS (NONA and PSP1)
domain of PSF (also termed proline- and glutamine-rich
splicing factor, or 100 kDa DNA-pairing protein
(POMp100), or 100 kDa subunit of DNA-binding p52/p100
complex), with a long helical C-terminal extension. PSF
is a multifunctional protein that mediates diverse
activities in the cell. It is ubiquitously expressed and
highly conserved in vertebrates. PSF binds not only RNA
but also single-stranded DNA (ssDNA) as well as
double-stranded DNA (dsDNA) and facilitates the
renaturation of complementary ssDNAs. Additionally, it
promotes the formation of D-loops in superhelical duplex
DNA, and is involved in cell proliferation. PSF can also
interact with multiple factors. It is an RNA-binding
component of spliceosomes and binds to insulin-like
growth factor response element (IGFRE). Moreover, PSF
functions as a transcriptional repressor interacting
with Sin3A and mediating silencing through the
recruitment of histone deacetylases (HDACs) to the DNA
binding domain (DBD) of nuclear hormone receptors. As an
RNA-binding component of spliceosomes, PSF binds to the
insulin-like growth factor response element (IGFRE), and
acts as an independent negative regulator of the
transcriptional activity of the porcine P-450
cholesterol side-chain cleavage enzyme gene (P450scc)
IGFRE. PSF is an essential pre-mRNA splicing factor and
is dissociated from PTB and binds to U1-70K and
serine-arginine (SR) proteins during apoptosis. In
addition, PSF forms a heterodimer with the nuclear
protein p54nrb, also known as non-POU domain-containing
octamer-binding protein (NONO). The PSF/p54nrb complex
displays a variety of functions, such as DNA
recombination and RNA synthesis, processing, and
transport. PSF contains two conserved RNA recognition
motifs (RRMs), also termed RBDs (RNA binding domains) or
RNPs (ribonucleoprotein domains), which are responsible
for interactions with RNA and for the localization of
the protein in speckles. It also contains an N-terminal
region rich in proline, glycine, and glutamine residues,
which may play a role in interactions recruiting other
molecules. The NOPS domain specifically binds to the
second RNA recognition motif (RRM2) domain of the
partner DBHS protein via a substantial interaction
surface. Its highly conserved C-terminal residues are
critical for functional DBHS dimerization while the
highly conserved C-terminal helical extension, forming a
right-handed antiparallel heterodimeric coiled-coil, is
essential for localization of these proteins to
subnuclear bodies.
Length = 97
Score = 67.9 bits (165), Expect = 1e-14
Identities = 46/96 (47%), Positives = 71/96 (73%), Gaps = 1/96 (1%)
Query: 49 KPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELY 108
+PVIVEPL+ +DDE+GL E+ +++K+ Y K+R PRFA G+FE+EY RWK L E+
Sbjct: 2 RPVIVEPLEQLDDEDGLPEK-LAQKNPMYQKERETPPRFAQPGTFEYEYSQRWKSLDEME 60
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRE 144
KQ+ E ++K +K +EKLE++ME A +E++ +LR+
Sbjct: 61 KQQREQVEKNMKEAKEKLESEMEDAYHEHQANLLRQ 96
>gnl|CDD|221389 pfam12037, DUF3523, Domain of unknown function (DUF3523). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 257 and 277 amino acids in length. This domain is
found associated with pfam00004. This domain has a
conserved LER sequence motif.
Length = 276
Score = 43.6 bits (103), Expect = 5e-05
Identities = 39/133 (29%), Positives = 69/133 (51%), Gaps = 18/133 (13%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
K+ +EL K +E+ Q EL+ + ++ EAQ + ++ R R VE +++ E
Sbjct: 51 KKAFELSKMQEKTRQAELEAKIKEYEAQQA------QAKLERAR----VEAEERRKTLQE 100
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
+ + E Q A+Y++E R++ ++ + RQ +E LK + R+ EAMRR T
Sbjct: 101 QTQQEQQR--AQYQDELA--RKRYQKELEQQRRQNEEL-LKMQEESVLRQ---EAMRRAT 152
Query: 222 EEIHLRMQQQDEE 234
EE L M+++ E
Sbjct: 153 EEEILEMRRETIE 165
Score = 30.5 bits (69), Expect = 0.84
Identities = 30/115 (26%), Positives = 56/115 (48%), Gaps = 2/115 (1%)
Query: 96 EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ 155
E R K L E +Q+++ Q + +L ++ + ++E R +NE + + E +EA++
Sbjct: 90 EAEERRKTLQEQTQQEQQRAQYQDELARKRYQKELEQQRRQNEELLKMQEESVLRQEAMR 149
Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
+ EEE LE + E E E E + + R R K+E E ++ + E +
Sbjct: 150 R--ATEEEILEMRRETIEEEAELERENIRAKIEAEARGRAKEERENEDINREMLK 202
Score = 30.5 bits (69), Expect = 0.96
Identities = 19/68 (27%), Positives = 38/68 (55%)
Query: 175 ENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEE 234
E E +I + +Q +A ER + E E + + +EQ + +++ + Q E R Q++ E+
Sbjct: 67 ELEAKIKEYEAQQAQAKLERARVEAEERRKTLQEQTQQEQQRAQYQDELARKRYQKELEQ 126
Query: 235 LRRRHQEN 242
RR+++E
Sbjct: 127 QRRQNEEL 134
Score = 28.2 bits (63), Expect = 4.3
Identities = 23/90 (25%), Positives = 43/90 (47%), Gaps = 10/90 (11%)
Query: 151 EEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
E+ Q EL+ + ++ EAQ A+ E R R ER+K E ++ + +
Sbjct: 61 EKTRQAELEAKIKEYEAQQAQAKLE----------RARVEAEERRKTLQEQTQQEQQRAQ 110
Query: 211 RCDEEAMRRQTEEIHLRMQQQDEELRRRHQ 240
DE A +R +E+ + +Q +E L+ + +
Sbjct: 111 YQDELARKRYQKELEQQRRQNEELLKMQEE 140
>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
Length = 520
Score = 40.1 bits (95), Expect = 9e-04
Identities = 37/130 (28%), Positives = 66/130 (50%), Gaps = 8/130 (6%)
Query: 111 KEEALQKELKLEEEKLEAQMEFARYENETEI-LRERECRFVEEALQKELKLEEEKLEAQM 169
+E + E +E LEA+ E + NE E LRER + L+K L +EE L+ ++
Sbjct: 45 EEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNEL--QKLEKRLLQKEENLDRKL 102
Query: 170 E-FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEI-HLR 227
E + E E E ++L Q++ + E++++ EL+E E+ + E EE +
Sbjct: 103 ELLEKREEELEKKEKELEQKQQELEKKEE--ELEELIEEQLQEL-ERISGLTAEEAKEIL 159
Query: 228 MQQQDEELRR 237
+++ +EE R
Sbjct: 160 LEKVEEEARH 169
Score = 39.4 bits (93), Expect = 0.001
Identities = 30/122 (24%), Positives = 62/122 (50%), Gaps = 5/122 (4%)
Query: 116 QKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE 175
+ ++K EE+ + +E A+ E E I +E EE + + E+E E + E + E
Sbjct: 30 EAKIKEAEEEAKRILEEAKKEAE-AIKKEALLEAKEEIHKLRNEFEKELRERRNELQKLE 88
Query: 176 NETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEEL 235
E L ++ E++++E E KE+ E++ ++ + ++ EE+ +++Q +EL
Sbjct: 89 KRLLQKEENLDRKLELLEKREEELEKKEKELEQK----QQELEKKEEELEELIEEQLQEL 144
Query: 236 RR 237
R
Sbjct: 145 ER 146
>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
chromosome partitioning].
Length = 1163
Score = 38.9 bits (91), Expect = 0.002
Identities = 42/147 (28%), Positives = 65/147 (44%), Gaps = 7/147 (4%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRER--ECRFVEEALQKEL- 158
++L L ++ E Q+ +LE+E E + E E + + L E E E L++EL
Sbjct: 807 RRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEELE 866
Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEE----QRRCDE 214
+LE EK E + E E E E L E+LR+ E++ K+E E EE R +
Sbjct: 867 ELEAEKEELEDELKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERLEV 926
Query: 215 EAMRRQTEEIHLRMQQQDEELRRRHQE 241
E + E + EL R +
Sbjct: 927 ELPELEEELEEEYEDTLETELEREIER 953
Score = 36.6 bits (85), Expect = 0.013
Identities = 51/161 (31%), Positives = 72/161 (44%), Gaps = 12/161 (7%)
Query: 94 EFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRER--ECRFVE 151
E E +L EL K+ EE ++ +LEEE E Q E E E E L+ E R
Sbjct: 224 ELELALLLAKLKELRKELEELEEELSRLEEELEELQEELEEAEKEIEELKSELEELREEL 283
Query: 152 EALQKEL--------KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKE 203
E LQ+EL +LE E + ENE E L E+L + + E K+E E +E
Sbjct: 284 EELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERE 343
Query: 204 RHAEE--QRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQEN 242
EE Q + E + + EE + ++ EEL +E
Sbjct: 344 TLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREE 384
Score = 34.7 bits (80), Expect = 0.056
Identities = 34/132 (25%), Positives = 55/132 (41%), Gaps = 1/132 (0%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+QL EL +Q EE ++ LEEE + Q E E E L E E + E +LE
Sbjct: 709 RQLEELERQLEELKRELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELE 768
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
+ E E R+ L++ + E + +E E + E + E+ R
Sbjct: 769 SLEEALAKLKEEIEELEEK-RQALQEELEELEEELEEAERRLDALERELESLEQRRERLE 827
Query: 222 EEIHLRMQQQDE 233
+EI ++ +E
Sbjct: 828 QEIEELEEEIEE 839
Score = 32.8 bits (75), Expect = 0.20
Identities = 35/155 (22%), Positives = 61/155 (39%), Gaps = 13/155 (8%)
Query: 100 RWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF------VEEA 153
R +L L + +E ++ +LEEE + E E L E E +EE
Sbjct: 223 RELELALLLAKLKELRKELEELEEELSRLE---EELEELQEELEEAEKEIEELKSELEEL 279
Query: 154 LQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEW----ELKERHAEEQ 209
++ +L+EE LE + E E E +LRE+L + E + E ++ E E EE
Sbjct: 280 REELEELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEEL 339
Query: 210 RRCDEEAMRRQTEEIHLRMQQQDEELRRRHQENSI 244
+ + L +++ E + +
Sbjct: 340 EERETLLEELEQLLAELEEAKEELEEKLSALLEEL 374
Score = 31.6 bits (72), Expect = 0.49
Identities = 37/156 (23%), Positives = 64/156 (41%), Gaps = 16/156 (10%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF--VEEALQKELK 159
++L EL + EE ++ +L+E+ + E E E L + +E L+++L
Sbjct: 309 ERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLS 368
Query: 160 LEEEKLEAQM------------EFARYENETEILREQLRQREADRERQKQEWE--LKERH 205
E+LE E A NE E L+ ++ E ER + E +E
Sbjct: 369 ALLEELEELFEALREELAELEAELAEIRNELEELKREIESLEERLERLSERLEDLKEELK 428
Query: 206 AEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
E + + + E +++Q EELR R +E
Sbjct: 429 ELEAELEELQTELEELNEELEELEEQLEELRDRLKE 464
Score = 31.2 bits (71), Expect = 0.69
Identities = 35/142 (24%), Positives = 65/142 (45%), Gaps = 2/142 (1%)
Query: 105 YELYKQKEEALQKELKLEEEKLEAQMEFARYE-NETEILRERECRFVEEALQKELKLEEE 163
E +Q + +EL+ E E+ E +++ E E RER + +EE ++ +LEE+
Sbjct: 784 LEEKRQALQEELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEK 843
Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
E + E E E E L+E+L + EA++E + E + E EE + + + E
Sbjct: 844 LDELEEELEELEKELEELKEELEELEAEKEELEDELKELEEEKEELEE-ELRELESELAE 902
Query: 224 IHLRMQQQDEELRRRHQENSIF 245
+ +++ E L +
Sbjct: 903 LKEEIEKLRERLEELEAKLERL 924
Score = 30.5 bits (69), Expect = 1.1
Identities = 30/136 (22%), Positives = 59/136 (43%), Gaps = 5/136 (3%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
++L EL Q E+ ++ L+ E + + E L + +EE ++ LE
Sbjct: 674 EELAELEAQLEKLEEELKSLKNELRSLEDLLEELRRQLEELERQ----LEELKRELAALE 729
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
EE + Q E E E L E+L + + E ++E E A + + + E + +
Sbjct: 730 EELEQLQSRLEELEEELEELEEELEELQERLEELEEELE-SLEEALAKLKEEIEELEEKR 788
Query: 222 EEIHLRMQQQDEELRR 237
+ + +++ +EEL
Sbjct: 789 QALQEELEELEEELEE 804
Score = 29.7 bits (67), Expect = 2.0
Identities = 31/123 (25%), Positives = 50/123 (40%), Gaps = 12/123 (9%)
Query: 90 IGSFEFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF 149
+ E E ++L EL + E ++ KL E E + + R E E L E
Sbjct: 879 LKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERLEVELPELEEELEEE 938
Query: 150 VEEALQKELKLEEEKLEAQM------------EFARYENETEILREQLRQREADRERQKQ 197
E+ L+ EL+ E E+LE ++ E+ E E L+ Q E +E+ +
Sbjct: 939 YEDTLETELEREIERLEEEIEALGPVNLRAIEEYEEVEERYEELKSQREDLEEAKEKLLE 998
Query: 198 EWE 200
E
Sbjct: 999 VIE 1001
Score = 28.9 bits (65), Expect = 3.3
Identities = 38/164 (23%), Positives = 69/164 (42%), Gaps = 14/164 (8%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+L EL ++ EE ++ +L+E E + E E L+E +EE +K L+
Sbjct: 737 SRLEELEEELEELEEELEELQERLEELEEELESLEEALAKLKEE----IEELEEKRQALQ 792
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRER---QKQEWELKERHAEEQRRCDEEAMR 218
EE E + E E + L +L E RER + +E E + EE+ EE +
Sbjct: 793 EELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELE 852
Query: 219 RQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDLKQGVYQL 262
+E+ ++ +E + + L +L++ +L
Sbjct: 853 ELEKELEELKEELEELEAEKEELEDE-------LKELEEEKEEL 889
Score = 28.5 bits (64), Expect = 4.4
Identities = 35/147 (23%), Positives = 64/147 (43%), Gaps = 3/147 (2%)
Query: 102 KQLYELYKQKEEALQKELK-LEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
K+ +++ E LQ L+ LEEE E + E + E L E E + + ++
Sbjct: 722 KRELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELESLEEALAKLKEEI 781
Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQ 220
EE + + Q E E L E R+ +A + + +ER +E +EE +
Sbjct: 782 EELEEKRQALQEELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEE--IEE 839
Query: 221 TEEIHLRMQQQDEELRRRHQENSIFMQ 247
EE ++++ EEL + +E ++
Sbjct: 840 LEEKLDELEEELEELEKELEELKEELE 866
>gnl|CDD|233973 TIGR02680, TIGR02680, TIGR02680 family protein. Members of this
protein family belong to a conserved four-gene
neighborhood found sporadically in a phylogenetically
broad range of bacteria: Nocardia farcinica,
Symbiobacterium thermophilum, and Streptomyces
avermitilis (Actinobacteria), Geobacillus kaustophilus
(Firmicutes), Azoarcus sp. EbN1 and Ralstonia
solanacearum (Betaproteobacteria). Proteins in this
family average over 1400 amino acids in length
[Hypothetical proteins, Conserved].
Length = 1353
Score = 38.6 bits (90), Expect = 0.003
Identities = 34/103 (33%), Positives = 43/103 (41%), Gaps = 3/103 (2%)
Query: 133 ARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE--NETEILREQLRQREA 190
AR E ET ERE EAL++E +LEA Y+ E E R +A
Sbjct: 288 ARDELETAREEERELDARTEALEREADALRTRLEALQGSPAYQDAEELERARADAEALQA 347
Query: 191 DRERQKQEWELKERHAEE-QRRCDEEAMRRQTEEIHLRMQQQD 232
+Q E EE +RR DEEA R E LR ++
Sbjct: 348 AAADARQAIREAESRLEEERRRLDEEAGRLDDAERELRAAREQ 390
>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
bacterial type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. This family
represents the SMC protein of most bacteria. The smc
gene is often associated with scpB (TIGR00281) and scpA
genes, where scp stands for segregation and condensation
protein. SMC was shown (in Caulobacter crescentus) to be
induced early in S phase but present and bound to DNA
throughout the cell cycle [Cellular processes, Cell
division, DNA metabolism, Chromosome-associated
proteins].
Length = 1179
Score = 38.1 bits (89), Expect = 0.005
Identities = 36/149 (24%), Positives = 63/149 (42%), Gaps = 18/149 (12%)
Query: 109 KQKEEALQKEL--------KLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
+++ E LQKEL +LE++K + A E + E L + +EE K +L
Sbjct: 280 EEEIEELQKELYALANEISRLEQQKQILRERLANLERQLEELEAQ----LEELESKLDEL 335
Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRC------DE 214
EE E + + + E E L +L + EA+ E + E E E R
Sbjct: 336 AEELAELEEKLEELKEELESLEAELEELEAELEELESRLEELEEQLETLRSKVAQLELQI 395
Query: 215 EAMRRQTEEIHLRMQQQDEELRRRHQENS 243
++ + E + R+++ ++ R QE
Sbjct: 396 ASLNNEIERLEARLERLEDRRERLQQEIE 424
Score = 37.3 bits (87), Expect = 0.008
Identities = 47/160 (29%), Positives = 68/160 (42%), Gaps = 21/160 (13%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+QL EL Q EE K +L EE E + + + E E L EA +EL+
Sbjct: 316 RQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEEL--EAELEELESR 373
Query: 162 EEKLEAQMEFARYE------------NETEILREQLRQREADRERQKQEWELKERHAEEQ 209
E+LE Q+E R + NE E L +L + E RER +QE E + EE
Sbjct: 374 LEELEEQLETLRSKVAQLELQIASLNNEIERLEARLERLEDRRERLQQEIEELLKKLEEA 433
Query: 210 RRCD-------EEAMRRQTEEIHLRMQQQDEELRRRHQEN 242
+ E + +E R+++ EELR +E
Sbjct: 434 ELKELQAELEELEEELEELQEELERLEEALEELREELEEA 473
Score = 33.9 bits (78), Expect = 0.092
Identities = 39/147 (26%), Positives = 65/147 (44%), Gaps = 12/147 (8%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+L L + EE ++ +L+EE EA+ E E + L E+ +EE + +LE
Sbjct: 225 LELALLVLRLEELREELEELQEELKEAEEELEELTAELQELEEK----LEELRLEVSELE 280
Query: 162 EEKLEAQMEF-------ARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE 214
EE E Q E +R E + +ILRE+L E E + + E E +E +
Sbjct: 281 EEIEELQKELYALANEISRLEQQKQILRERLANLERQLEELEAQLEELESKLDELAE-EL 339
Query: 215 EAMRRQTEEIHLRMQQQDEELRRRHQE 241
+ + EE+ ++ + EL E
Sbjct: 340 AELEEKLEELKEELESLEAELEELEAE 366
Score = 33.1 bits (76), Expect = 0.18
Identities = 38/150 (25%), Positives = 66/150 (44%), Gaps = 11/150 (7%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRER---ECRFVEEALQKEL 158
K+L EL ++ E+ ++ +L + + + AR E E E L ER + + E +
Sbjct: 705 KELEELEEELEQLRKELEELSRQISALRKDLARLEAEVEQLEERIAQLSKELTELEAEIE 764
Query: 159 KLEEEKLEAQMEFARYENETEILREQL----RQREADRERQKQEWELKERHAEEQRRCDE 214
+LEE EA+ E A E E E L Q+ + +A RE EL+
Sbjct: 765 ELEERLEEAEEELAEAEAEIEELEAQIEQLKEELKALREALD---ELRAELTLLNEEAAN 821
Query: 215 EAMRRQTEEIHLRM-QQQDEELRRRHQENS 243
R ++ E + +++ E+L + +E S
Sbjct: 822 LRERLESLERRIAATERRLEDLEEQIEELS 851
Score = 31.6 bits (72), Expect = 0.60
Identities = 29/144 (20%), Positives = 52/144 (36%), Gaps = 6/144 (4%)
Query: 103 QLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEE 162
+L E ++ EA + +LE + + + E + LR EEA +LE
Sbjct: 769 RLEEAEEELAEAEAEIEELEAQIEQLKEELKALREALDELRAELTLLNEEAANLRERLES 828
Query: 163 EKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKER------HAEEQRRCDEEA 216
+ R E+ E + E E+ ++ EL E +R EEA
Sbjct: 829 LERRIAATERRLEDLEEQIEELSEDIESLAAEIEELEELIEELESELEALLNERASLEEA 888
Query: 217 MRRQTEEIHLRMQQQDEELRRRHQ 240
+ E+ ++ E +R +
Sbjct: 889 LALLRSELEELSEELRELESKRSE 912
Score = 29.6 bits (67), Expect = 2.0
Identities = 30/99 (30%), Positives = 50/99 (50%), Gaps = 3/99 (3%)
Query: 109 KQKEEALQKELKLEEEKLEA-QMEFARYENETEILRERECRFVEEALQKEL-KLEEEKLE 166
+ + +L E++ E +LE + R + E E L ++ + LQ EL +LEEE E
Sbjct: 392 ELQIASLNNEIERLEARLERLEDRRERLQQEIEELLKKLEEAELKELQAELEELEEELEE 451
Query: 167 AQMEFARYENETEILREQLRQ-READRERQKQEWELKER 204
Q E R E E LRE+L + +A +++ +L+ R
Sbjct: 452 LQEELERLEEALEELREELEEAEQALDAAERELAQLQAR 490
Score = 28.1 bits (63), Expect = 6.1
Identities = 28/127 (22%), Positives = 49/127 (38%), Gaps = 6/127 (4%)
Query: 136 ENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQ 195
E EI E +EE +K +LE+ E + E E E EQLR+ + RQ
Sbjct: 674 ERRREIEELEEK--IEELEEKIAELEKALAELRKELEELEEE----LEQLRKELEELSRQ 727
Query: 196 KQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDL 255
E + EE + + ++E+ + +E R + + + +L
Sbjct: 728 ISALRKDLARLEAEVEQLEERIAQLSKELTELEAEIEELEERLEEAEEELAEAEAEIEEL 787
Query: 256 KQGVYQL 262
+ + QL
Sbjct: 788 EAQIEQL 794
>gnl|CDD|215621 PLN03188, PLN03188, kinesin-12 family protein; Provisional.
Length = 1320
Score = 38.0 bits (88), Expect = 0.005
Identities = 28/83 (33%), Positives = 44/83 (53%), Gaps = 5/83 (6%)
Query: 157 ELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEA 216
E KLE+E+L +++ + E LR +L A E+QK E + ++R AEE + + A
Sbjct: 1046 EKKLEQERLRWTEAESKWISLAEELRTELDASRALAEKQKHELDTEKRCAEELKEAMQMA 1105
Query: 217 MRRQTEEIHLRMQQQDEELRRRH 239
M E H RM +Q +L +H
Sbjct: 1106 M-----EGHARMLEQYADLEEKH 1123
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 36.7 bits (84), Expect = 0.014
Identities = 28/139 (20%), Positives = 66/139 (47%), Gaps = 5/139 (3%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ---KELKLEE 162
E K +E ++ E + E+ + ++E + + E + E + EE + E +
Sbjct: 1611 EAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKA 1670
Query: 163 EKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTE 222
E+ + + E A+ E E + ++EA+ ++ +E LK++ AEE+++ +E +
Sbjct: 1671 EEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEE--LKKKEAEEKKKAEELKKAEEEN 1728
Query: 223 EIHLRMQQQDEELRRRHQE 241
+I +++ E ++ E
Sbjct: 1729 KIKAEEAKKEAEEDKKKAE 1747
Score = 36.3 bits (83), Expect = 0.018
Identities = 35/127 (27%), Positives = 65/127 (51%), Gaps = 7/127 (5%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK--ELK 159
++L + ++K++ Q + K EEK +A+ E + E E +I E + EE +K E K
Sbjct: 1623 EELKKAEEEKKKVEQLKKKEAEEKKKAE-ELKKAEEENKIKAAEEAKKAEEDKKKAEEAK 1681
Query: 160 LEEEKLEAQMEFARYENETEILREQLRQREADRERQ----KQEWELKERHAEEQRRCDEE 215
EE + E + E E E+L+++EA+ +++ K+ E + AEE ++ EE
Sbjct: 1682 KAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEE 1741
Query: 216 AMRRQTE 222
++ E
Sbjct: 1742 DKKKAEE 1748
Score = 35.9 bits (82), Expect = 0.026
Identities = 38/143 (26%), Positives = 79/143 (55%), Gaps = 12/143 (8%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
K+ EL K +E +E K EE +A+ + + E ++ E E +++ +KL
Sbjct: 1546 KKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAE----EARIEEVMKLY 1601
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
EE+ + + E A+ E +I E+L++ E E++K E +LK++ AEE+++ +E +++
Sbjct: 1602 EEEKKMKAEEAKKAEEAKIKAEELKKAE--EEKKKVE-QLKKKEAEEKKKAEE--LKKAE 1656
Query: 222 EEIHLRMQQ---QDEELRRRHQE 241
EE ++ + + EE +++ +E
Sbjct: 1657 EENKIKAAEEAKKAEEDKKKAEE 1679
Score = 34.7 bits (79), Expect = 0.055
Identities = 30/138 (21%), Positives = 72/138 (52%), Gaps = 11/138 (7%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRE--RECRFVEEALQKELKLEEE 163
E K +E +++ +KL EE+ + + E A+ E +I E ++ ++ +++ K E E
Sbjct: 1585 EAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAE 1644
Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
+ + E + E E +I + ++ + +++ +E K+ +E++ EA++++ EE
Sbjct: 1645 EKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEE--AKKAEEDEKKA--AEALKKEAEE 1700
Query: 224 IHLRMQQQDEELRRRHQE 241
++ EEL+++ E
Sbjct: 1701 -----AKKAEELKKKEAE 1713
Score = 34.3 bits (78), Expect = 0.067
Identities = 38/147 (25%), Positives = 74/147 (50%), Gaps = 11/147 (7%)
Query: 100 RWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKE-- 157
R +++ +LY+++++ +E K EE E + E E + + + + + EE + E
Sbjct: 1593 RIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEEL 1652
Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAM 217
K EEE E A+ E + E+ ++ E E +K+ E ++ AEE ++ EE
Sbjct: 1653 KKAEEENKIKAAEEAKKAEEDKKKAEEAKKAE---EDEKKAAEALKKEAEEAKKA-EELK 1708
Query: 218 RRQTEEIHLRMQQQDEELRRRHQENSI 244
+++ EE ++ EEL++ +EN I
Sbjct: 1709 KKEAEEK-----KKAEELKKAEEENKI 1730
Score = 34.0 bits (77), Expect = 0.10
Identities = 31/140 (22%), Positives = 72/140 (51%), Gaps = 3/140 (2%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
K+K E +K + E++ EA + A + E L+++E ++A +ELK EE+ + +
Sbjct: 1674 KKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKA--EELKKAEEENKIK 1731
Query: 169 MEFARYENETEILR-EQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
E A+ E E + + E+ ++ E ++++ + +E+ AEE R+ E + + +E +
Sbjct: 1732 AEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEK 1791
Query: 228 MQQQDEELRRRHQENSIFMQ 247
+ + ++ + +N +
Sbjct: 1792 RRMEVDKKIKDIFDNFANII 1811
Score = 31.6 bits (71), Expect = 0.59
Identities = 31/143 (21%), Positives = 70/143 (48%), Gaps = 6/143 (4%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEA---QMEFARYENETEILRERECRFVEEALQKEL 158
K+ E K + EA E + EEK EA + E A+ + + + E + +EA +K
Sbjct: 1342 KKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAE 1401
Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
+ +++ E + A + E ++ +++AD ++K E + + A+E ++ EEA +
Sbjct: 1402 EDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAE---EAKKADEAKKKAEEAKK 1458
Query: 219 RQTEEIHLRMQQQDEELRRRHQE 241
+ + ++ +E +++ +E
Sbjct: 1459 AEEAKKKAEEAKKADEAKKKAEE 1481
Score = 28.2 bits (62), Expect = 6.4
Identities = 32/147 (21%), Positives = 70/147 (47%), Gaps = 13/147 (8%)
Query: 102 KQLYELYKQKEEALQ-KELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
K+ E K+ EEA + E K + E+ + + + A+ + E + EA E +
Sbjct: 1302 KKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEA 1361
Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE----EA 216
EEK EA E + E +++ + E +K+ E K++ E++++ DE A
Sbjct: 1362 AEEKAEAA------EKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAA 1415
Query: 217 MRRQTEEIHLRMQQ--QDEELRRRHQE 241
+++ +E + ++ + +E +++ +E
Sbjct: 1416 AKKKADEAKKKAEEKKKADEAKKKAEE 1442
Score = 28.2 bits (62), Expect = 6.5
Identities = 38/149 (25%), Positives = 74/149 (49%), Gaps = 9/149 (6%)
Query: 102 KQLYELYKQKEEALQK--ELKLEEEKLEA-QMEFARYENETEILRERECRFVEEALQK-- 156
K+ + K+ EA +K E K EE +A + + A + + ++ E + + L+K
Sbjct: 1496 KKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAE 1555
Query: 157 ELKLEEEKLEAQMEFARYENETEILR--EQLRQREADR--ERQKQEWELKERHAEEQRRC 212
ELK EEK +A+ E++ LR E+ ++ E R E K E K+ AEE ++
Sbjct: 1556 ELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKA 1615
Query: 213 DEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
+E ++ + + +++ E+L+++ E
Sbjct: 1616 EEAKIKAEELKKAEEEKKKVEQLKKKEAE 1644
Score = 28.2 bits (62), Expect = 6.6
Identities = 29/143 (20%), Positives = 69/143 (48%), Gaps = 7/143 (4%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
E K+ +EA + E K + ++ + + E A+ +E + E + + A +K + ++
Sbjct: 1287 EEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAE 1346
Query: 166 EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE-----EAMRRQ 220
A+ E +E E E+ E +E K++ + ++ AEE+++ DE E +++
Sbjct: 1347 AAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKK 1406
Query: 221 TEEIHLRMQQQD--EELRRRHQE 241
+E+ + +E +++ +E
Sbjct: 1407 ADELKKAAAAKKKADEAKKKAEE 1429
Score = 28.2 bits (62), Expect = 7.0
Identities = 34/154 (22%), Positives = 69/154 (44%), Gaps = 11/154 (7%)
Query: 102 KQLYELYKQKEEALQK--ELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK--E 157
K+ EL K +EE K E + E+ + + E A+ E E + E +K E
Sbjct: 1647 KKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEE 1706
Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKE-------RHAEEQR 210
LK +E + + + E + E ++ + ++EA+ +++K E K+ H +++
Sbjct: 1707 LKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEE 1766
Query: 211 RCDEEAMRRQTEEIHLRMQQQDEELRRRHQENSI 244
E +R++ E + +++E RR + I
Sbjct: 1767 EKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKI 1800
>gnl|CDD|215696 pfam00076, RRM_1, RNA recognition motif. (a.k.a. RRM, RBD, or RNP
domain). The RRM motif is probably diagnostic of an
RNA binding protein. RRMs are found in a variety of RNA
binding proteins, including various hnRNP proteins,
proteins implicated in regulation of alternative
splicing, and protein components of snRNPs. The motif
also appears in a few single stranded DNA binding
proteins. The RRM structure consists of four strands
and two helices arranged in an alpha/beta sandwich,
with a third helix present during RNA binding in some
cases. The C-terminal beta strand (4th strand) and final
helix are hard to align and have been omitted in the
SEED alignment. The LA proteins have an N terminal rrm
which is included in the seed. There is a second region
towards the C terminus that has some features
characteristic of a rrm but does not appear to have the
important structural core of a rrm. The LA proteins are
one of the main autoantigens in Systemic lupus
erythematosus (SLE), an autoimmune disease.
Length = 70
Score = 32.6 bits (75), Expect = 0.023
Identities = 11/36 (30%), Positives = 18/36 (50%)
Query: 3 IERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQ 38
IE ++ D+ G SK +EF + A +AL+
Sbjct: 25 IESIRIVRDETGRSKGFAFVEFEDEEDAEKALEALN 60
>gnl|CDD|216608 pfam01618, MotA_ExbB, MotA/TolQ/ExbB proton channel family. This
family groups together integral membrane proteins that
appear to be involved in translocation of proteins across a
membrane. These proteins are probably proton channels.
MotA is an essential component of the flagellar motor
that uses a proton gradient to generate rotational
motion in the flagellum. ExbB is part of the
TonB-dependent transduction complex. The TonB complex
uses the proton gradient across the inner bacterial
membrane to transport large molecules across the outer
bacterial membrane.
Length = 139
Score = 34.1 bits (79), Expect = 0.024
Identities = 13/53 (24%), Positives = 19/53 (35%)
Query: 115 LQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEA 167
+ +E + F R I R F+ AL++ L E KLE
Sbjct: 1 REPLSAIEILAQLGFLSFVRLGLAPLIERGAFKEFLRLALEEALDAELRKLER 53
>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y. Members of this family are
RNase Y, an endoribonuclease. The member from Bacillus
subtilis, YmdA, has been shown to be involved in
turnover of yitJ riboswitch [Transcription, Degradation
of RNA].
Length = 514
Score = 35.3 bits (82), Expect = 0.032
Identities = 27/124 (21%), Positives = 56/124 (45%), Gaps = 18/124 (14%)
Query: 132 FARYENETEI--LRERECRFVEEA------LQKELKLE------EEKLEAQMEFARYENE 177
+ E ++ E R +EEA L+KE LE + + E + E NE
Sbjct: 18 LRKRIAEKKLGSAEELAKRIIEEAKKEAETLKKEALLEAKEEVHKLRAELERELKERRNE 77
Query: 178 TEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRR 237
+ L +L QRE +R+ + + KE + E++ E+ + + + + + ++ +E +
Sbjct: 78 LQRLERRLLQREETLDRKMESLDKKEENLEKK----EKELSNKEKNLDEKEEELEELIAE 133
Query: 238 RHQE 241
+ +E
Sbjct: 134 QREE 137
Score = 32.6 bits (75), Expect = 0.20
Identities = 37/138 (26%), Positives = 66/138 (47%), Gaps = 11/138 (7%)
Query: 102 KQLYELYKQKEEALQKELKLE--EEKLEAQMEFARYENETEILRERECRFVEEALQKELK 159
K++ E K++ E L+KE LE EE + + E E E + R R LQ+E
Sbjct: 35 KRIIEEAKKEAETLKKEALLEAKEEVHKLRAEL---ERELKERRNELQRLERRLLQREET 91
Query: 160 LEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRR 219
L + K+E+ + E E ++L +E + ++E EL+E AE++ + +
Sbjct: 92 L-DRKMES---LDKKEENLEKKEKELSNKE--KNLDEKEEELEELIAEQREELERISGLT 145
Query: 220 QTEEIHLRMQQQDEELRR 237
Q E + +++ +EE R
Sbjct: 146 QEEAKEILLEEVEEEARH 163
Score = 27.6 bits (62), Expect = 7.6
Identities = 27/126 (21%), Positives = 58/126 (46%), Gaps = 10/126 (7%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
+ + + L++ L EE L+ +ME + + E L ++E +E KE L+E++ E +
Sbjct: 75 RNELQRLERRLLQREETLDRKME--SLDKKEENLEKKE----KELSNKEKNLDEKEEELE 128
Query: 169 MEFARYENETE----ILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEI 224
A E E + +E+ ++ + ++ E + E + EEA ++ E +
Sbjct: 129 ELIAEQREELERISGLTQEEAKEILLEEVEEEARHEAAKLIKEIEEEAKEEADKKAKEIL 188
Query: 225 HLRMQQ 230
+Q+
Sbjct: 189 ATAIQR 194
>gnl|CDD|240668 cd00590, RRM_SF, RNA recognition motif (RRM) superfamily. RRM,
also known as RBD (RNA binding domain) or RNP
(ribonucleoprotein domain), is a highly abundant domain
in eukaryotes found in proteins involved in
post-transcriptional gene expression processes
including mRNA and rRNA processing, RNA export, and RNA
stability. This domain is 90 amino acids in length and
consists of a four-stranded beta-sheet packed against
two alpha-helices. RRM usually interacts with ssRNA,
but is also known to interact with ssDNA as well as
proteins. RRM binds a variable number of nucleotides,
ranging from two to eight. The active site includes
three aromatic side-chains located within the conserved
RNP1 and RNP2 motifs of the domain. The RRM domain is
found in a variety of heterogeneous nuclear
ribonucleoproteins (hnRNPs), proteins implicated in
regulation of alternative splicing, and protein
components of small nuclear ribonucleoproteins
(snRNPs).
Length = 72
Score = 31.9 bits (73), Expect = 0.049
Identities = 11/42 (26%), Positives = 16/42 (38%)
Query: 2 NIERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFF 43
IE ++ D G SK +EF A +AL+
Sbjct: 24 EIESVRIVRDKDGKSKGFAFVEFESPEDAEKALEALNGKELD 65
>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
recombination, and repair].
Length = 908
Score = 34.0 bits (78), Expect = 0.097
Identities = 31/135 (22%), Positives = 62/135 (45%), Gaps = 6/135 (4%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
K+L EL ++ E L+ E L+EE E + E E L+E+ + ++L+
Sbjct: 508 KELRELEEELIELLELEEALKEELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDR 567
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
++L+ +E R + E+LR+R + +++ +E E + EE + +
Sbjct: 568 LQELKELLEELRLLRTRKEELEELRERLKELKKKLKELEERLSQLEELLQ------SLEL 621
Query: 222 EEIHLRMQQQDEELR 236
E +++ +EEL
Sbjct: 622 SEAENELEEAEEELE 636
Score = 32.0 bits (73), Expect = 0.35
Identities = 35/140 (25%), Positives = 65/140 (46%), Gaps = 4/140 (2%)
Query: 98 GTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKE 157
K+L ELY+ + E L++EL E+E+ E + E E E + E L+ E
Sbjct: 469 EEHEKELLELYELELEELEEELSREKEEAELREEI----EELEKELRELEEELIELLELE 524
Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAM 217
L+EE E + E E L+E+L+ ++ E ++ E L+E +
Sbjct: 525 EALKEELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLEELRLLRTR 584
Query: 218 RRQTEEIHLRMQQQDEELRR 237
+ + EE+ R+++ ++L+
Sbjct: 585 KEELEELRERLKELKKKLKE 604
Score = 29.3 bits (66), Expect = 2.9
Identities = 23/103 (22%), Positives = 39/103 (37%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
++L E + L++ +L E+ + + E + E L EE + LE
Sbjct: 301 EELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLESELEELAEEKNELAKLLE 360
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKER 204
E E + E E E E+L+Q E + K+E
Sbjct: 361 ERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELSA 403
Score = 28.2 bits (63), Expect = 5.8
Identities = 26/115 (22%), Positives = 46/115 (40%), Gaps = 1/115 (0%)
Query: 106 ELYKQKEEALQKELKLEEEKL-EAQMEFARYENETEILRERECRFVEEALQKELKLEEEK 164
K++ E L++ LK ++KL E + ++ E + L E E ++EL+ E EK
Sbjct: 582 RTRKEELEELRERLKELKKKLKELEERLSQLEELLQSLELSEAENELEEAEEELESELEK 641
Query: 165 LEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRR 219
L Q E E+ + R++ + E EE+ E+
Sbjct: 642 LNLQAELEELLQAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEELEQLEEE 696
Score = 27.8 bits (62), Expect = 7.5
Identities = 32/160 (20%), Positives = 68/160 (42%), Gaps = 4/160 (2%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
++L +L EE + + KL+ ++L+ ++ + E R + ++ +L
Sbjct: 533 EKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLEELRLLRTRKEELEELR 592
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRR----CDEEAM 217
E E + + E L E L+ E + E +E +E ++ EE +
Sbjct: 593 ERLKELKKKLKELEERLSQLEELLQSLELSEAENELEEAEEELESELEKLNLQAELEELL 652
Query: 218 RRQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDLKQ 257
+ EE+ ++++ + E+RR Q Q+ L +L+Q
Sbjct: 653 QAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEELEQ 692
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 33.4 bits (77), Expect = 0.10
Identities = 19/76 (25%), Positives = 42/76 (55%), Gaps = 11/76 (14%)
Query: 148 RFVEEALQKELKL---EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKER 204
+ E L+K K EEEK+ E R +E+ ++++ +++++++E +L +
Sbjct: 252 KLSPEVLRKVDKTREEEEEKILKAAEEER--------QEEAQEKKEEKKKEEREAKLAKL 303
Query: 205 HAEEQRRCDEEAMRRQ 220
EEQR+ +E+ ++Q
Sbjct: 304 SPEEQRKLEEKERKKQ 319
>gnl|CDD|240832 cd12386, RRM2_hnRNPM_like, RNA recognition motif 2 in
heterogeneous nuclear ribonucleoprotein M (hnRNP M) and
similar proteins. This subfamily corresponds to the
RRM2 of heterogeneous nuclear ribonucleoprotein M
(hnRNP M), myelin expression factor 2 (MEF-2 or MyEF-2
or MST156) and similar proteins. hnRNP M is pre-mRNA
binding protein that may play an important role in the
pre-mRNA processing. It also preferentially binds to
poly(G) and poly(U) RNA homopolymers. hnRNP M is able
to interact with early spliceosomes, further
influencing splicing patterns of specific pre-mRNAs. It
functions as the receptor of carcinoembryonic antigen
(CEA) that contains the penta-peptide sequence PELPK
signaling motif. In addition, hnRNP M and another
splicing factor Nova-1 work together as dopamine D2
receptor (D2R) pre-mRNA-binding proteins. They regulate
alternative splicing of D2R pre-mRNA in an antagonistic
manner. hnRNP M contains three RNA recognition motifs
(RRMs), also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and an unusual
hexapeptide-repeat region rich in methionine and
arginine residues (MR repeat motif). MEF-2 is a
sequence-specific single-stranded DNA (ssDNA) binding
protein that binds specifically to ssDNA derived from
the proximal (MB1) element of the myelin basic protein
(MBP) promoter and represses transcription of the MBP
gene. MEF-2 shows high sequence homology with hnRNP M.
It also contains three RRMs, which may be responsible
for its ssDNA binding activity.
Length = 74
Score = 31.2 bits (71), Expect = 0.10
Identities = 10/33 (30%), Positives = 17/33 (51%)
Query: 2 NIERAVVLVDDRGNSKNEGIIEFTRKPAAAQAL 34
+ RA + D G S+ G+++F A QA+
Sbjct: 24 KVVRADIKEDKEGKSRGMGVVQFEHPIEAVQAI 56
>gnl|CDD|130902 TIGR01843, type_I_hlyD, type I secretion membrane fusion protein,
HlyD family. Type I secretion is an ABC transport
process that exports proteins, without cleavage of any
signal sequence, from the cytosol to extracellular
medium across both inner and outer membranes. The
secretion signal is found in the C-terminus of the
transported protein. This model represents the adaptor
protein between the ATP-binding cassette (ABC) protein
of the inner membrane and the outer membrane protein,
and is called the membrane fusion protein. This model
selects a subfamily closely related to HlyD; it is
defined narrowly and excludes, for example, colicin V
secretion protein CvaA and multidrug efflux proteins
[Protein fate, Protein and peptide secretion and
trafficking].
Length = 423
Score = 33.4 bits (77), Expect = 0.12
Identities = 35/156 (22%), Positives = 63/156 (40%), Gaps = 16/156 (10%)
Query: 95 FEYGTRWKQLYELYKQKEEALQKELK-LEEEKLEAQMEFARYENETEILRER------EC 147
K L++ ++ L+ +L+ + + + + E A + + + LR++ E
Sbjct: 122 PAVPELIKGQQSLFESRKSTLRAQLELILAQIKQLEAELAGLQAQLQALRQQLEVISEEL 181
Query: 148 RFVEEALQKE-------LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWE 200
+ +K L+LE E+ EAQ E R E E E+L+ Q+ E ERQ+ E
Sbjct: 182 EARRKLKEKGLVSRLELLELERERAEAQGELGRLEAELEVLKRQI--DELQLERQQIEQT 239
Query: 201 LKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELR 236
+E EE + R + Q +R
Sbjct: 240 FREEVLEELTEAQARLAELRERLNKARDRLQRLIIR 275
>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
envelope biogenesis, outer membrane].
Length = 387
Score = 33.4 bits (76), Expect = 0.12
Identities = 24/117 (20%), Positives = 51/117 (43%), Gaps = 5/117 (4%)
Query: 100 RWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELK 159
R + K+ E+ +K+ + E+L+ + E E L++ E ++ Q++
Sbjct: 66 RIQSQQSSAKKGEQQRKKKEEQVAEELKPKQA-----AEQERLKQLEKERLKAQEQQKQA 120
Query: 160 LEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEA 216
E EK + + E + EQ ++ EA + + E + AE +++ +E A
Sbjct: 121 EEAEKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAA 177
Score = 32.2 bits (73), Expect = 0.24
Identities = 33/125 (26%), Positives = 60/125 (48%), Gaps = 9/125 (7%)
Query: 102 KQLYELYKQKEEALQKELK-LEEEKLEAQMEFARYENETEILRERECRFVEE-----ALQ 155
+Q+ E K K+ A Q+ LK LE+E+L+AQ E + E E + E + EE A +
Sbjct: 86 EQVAEELKPKQAAEQERLKQLEKERLKAQ-EQQKQAEEAEKQAQLEQKQQEEQARKAAAE 144
Query: 156 KELKLEEEKLEAQMEFARYENETEILR--EQLRQREADRERQKQEWELKERHAEEQRRCD 213
++ K E K +A E A+ + E + E+ + + + + + K++ E +
Sbjct: 145 QKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEAKAAA 204
Query: 214 EEAMR 218
E+A
Sbjct: 205 EKAKA 209
>gnl|CDD|129705 TIGR00618, sbcc, exonuclease SbcC. All proteins in this family for
which functions are known are part of an exonuclease
complex with sbcD homologs. This complex is involved in
the initiation of recombination to regulate the levels
of palindromic sequences in DNA. This family is based on
the phylogenomic analysis of JA Eisen (1999, Ph.D.
Thesis, Stanford University) [DNA metabolism, DNA
replication, recombination, and repair].
Length = 1042
Score = 33.4 bits (76), Expect = 0.13
Identities = 28/140 (20%), Positives = 57/140 (40%), Gaps = 6/140 (4%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+Q + QK EA +++LK ++ + + + +L E + R + A +
Sbjct: 239 QQSHAYLTQKREAQEEQLKKQQLLKQLRARIEELRAQEAVLEETQER-INRARKAAPLAA 297
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
K Q+E TE+ Q + R + K+ +K++ + E++R + + Q
Sbjct: 298 HIKAVTQIEQQAQRIHTEL---QSKMRSRAKLLMKRAAHVKQQSSIEEQRRLLQTLHSQ- 353
Query: 222 EEIHLRMQQQDEELRRRHQE 241
EIH+R + R
Sbjct: 354 -EIHIRDAHEVATSIREISC 372
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 33.1 bits (75), Expect = 0.16
Identities = 17/53 (32%), Positives = 32/53 (60%), Gaps = 5/53 (9%)
Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRR 211
KL +++ EA E A+ E E + E+ R++E ++ER+++ +ER AE +
Sbjct: 576 KLAKKREEAV-EKAKREAEQKAREEREREKEKEKERERE----REREAERAAK 623
Score = 30.4 bits (68), Expect = 1.2
Identities = 16/45 (35%), Positives = 30/45 (66%), Gaps = 5/45 (11%)
Query: 153 ALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQ 197
A ++E +E+ K EA+ + AR E E RE+ +++E +RER+++
Sbjct: 578 AKKREEAVEKAKREAEQK-AREERE----REKEKEKERERERERE 617
Score = 28.5 bits (63), Expect = 4.8
Identities = 16/41 (39%), Positives = 25/41 (60%), Gaps = 1/41 (2%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERE 146
+L K++EEA++K + E+K + E + E E E RERE
Sbjct: 576 KLAKKREEAVEKAKREAEQKAREEREREK-EKEKERERERE 615
>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
unknown].
Length = 652
Score = 33.1 bits (76), Expect = 0.16
Identities = 19/58 (32%), Positives = 35/58 (60%), Gaps = 1/58 (1%)
Query: 113 EALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK-ELKLEEEKLEAQM 169
+ +ELK E EKLE+++E R E ++ ++RE R + +++ E +LEE+K +
Sbjct: 442 KRELEELKREIEKLESELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKKRVEE 499
Score = 33.1 bits (76), Expect = 0.18
Identities = 35/112 (31%), Positives = 54/112 (48%), Gaps = 12/112 (10%)
Query: 112 EEALQKELKLEEEK------LEAQMEFARYENETEILRERECRFVEE-----ALQKELKL 160
EAL K + E + E + E YE + L E R EE +ELK
Sbjct: 391 AEALSKVKEEERPREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKR 450
Query: 161 EEEKLEAQMEFARYENETEILRE-QLRQREADRERQKQEWELKERHAEEQRR 211
E EKLE+++E R E ++ ++ ++R R+ ER ++E E K++ EE R
Sbjct: 451 EIEKLESELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKKRVEELER 502
>gnl|CDD|235551 PRK05667, dnaG, DNA primase; Validated.
Length = 580
Score = 32.9 bits (76), Expect = 0.17
Identities = 20/124 (16%), Positives = 39/124 (31%), Gaps = 11/124 (8%)
Query: 83 IGPRFATIGSFEFEYGTRWKQLYELYKQKEEA----------LQKELKLEEEKLEAQMEF 132
+ +FE ++ L E + L+ LE+ +
Sbjct: 457 AEEVRDALDEEDFEGLPLFRALLEAILAQPGLTTGSQLLEHLRDAGLEELAALLESLAVW 516
Query: 133 ARYENETEILRERECRFVEEALQK-ELKLEEEKLEAQMEFARYENETEILREQLRQREAD 191
E E+E + E L+ L+ E+L A+ + R +L Q +
Sbjct: 517 EEISEEDIAALEKELKDALEKLRDQLLEERLEELIAKERLLEGHGLSSEERLELLQLLIE 576
Query: 192 RERQ 195
+R+
Sbjct: 577 LKRK 580
>gnl|CDD|216108 pfam00769, ERM, Ezrin/radixin/moesin family. This family of
proteins contains a band 4.1 domain (pfam00373) at their
amino terminus. This family represents the rest of these
proteins.
Length = 244
Score = 32.0 bits (73), Expect = 0.24
Identities = 34/133 (25%), Positives = 63/133 (47%), Gaps = 7/133 (5%)
Query: 106 ELYKQKEEALQKELKLEEEKLE-AQMEFARYEN---ETEILRERECRFVEEALQKELKLE 161
E +++++ L++ ++ EE + AQ E YE E E ++E + +K +LE
Sbjct: 1 EEAEREQQELEERMEQMEEDMRRAQKELEEYEETALELEEKLKQEEEEAQLLEKKADELE 60
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKE---RHAEEQRRCDEEAMR 218
EE + E A E E E L ++ + A+ + ++E E KE R +++ R +EA
Sbjct: 61 EENRRLEEEAAASEEERERLEAEVDEATAEVAKLEEEREKKEAETRQLQQELREAQEAHE 120
Query: 219 RQTEEIHLRMQQQ 231
R +E+
Sbjct: 121 RARQELLEAAAAP 133
Score = 29.3 bits (66), Expect = 2.1
Identities = 25/90 (27%), Positives = 37/90 (41%), Gaps = 4/90 (4%)
Query: 151 EEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
+E ++ ++EE+ AQ E YE L E+L+Q E + Q E K EE+
Sbjct: 8 QELEERMEQMEEDMRRAQKELEEYEETALELEEKLKQEE----EEAQLLEKKADELEEEN 63
Query: 211 RCDEEAMRRQTEEIHLRMQQQDEELRRRHQ 240
R EE EE + DE +
Sbjct: 64 RRLEEEAAASEEERERLEAEVDEATAEVAK 93
>gnl|CDD|222290 pfam13654, AAA_32, AAA domain. This family includes a wide variety
of AAA domains including some that have lost essential
nucleotide binding residues in the P-loop.
Length = 509
Score = 32.4 bits (75), Expect = 0.25
Identities = 29/121 (23%), Positives = 55/121 (45%), Gaps = 30/121 (24%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQME-FARYEN----------------ETEILRERECR 148
E Y+ ++E +++E + + E+ ++E A+ + + E L E E
Sbjct: 111 EEYEARKEEIEEEFQEKREEAFEELEEEAKEKGFALVRTPGGFVFAPLKDGEPLTEEEFE 170
Query: 149 FVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQ-READRERQKQEWELKERHAE 207
+ E ++EL+ + ++LE E L+E LRQ RE +RE +++ EL A
Sbjct: 171 ALPEEEREELEEKIDELE------------EELQEILRQLRELEREAREKLRELDREVAL 218
Query: 208 E 208
Sbjct: 219 F 219
>gnl|CDD|215212 PLN02372, PLN02372, violaxanthin de-epoxidase.
Length = 455
Score = 32.5 bits (74), Expect = 0.26
Identities = 23/84 (27%), Positives = 38/84 (45%), Gaps = 6/84 (7%)
Query: 110 QKEEALQKELKLEEEKLEAQMEFARYENETEILRE------RECRFVEEALQKELKLEEE 163
+ E+ + KE + EE+LE ++E E E+ R +E EE KEL EE+
Sbjct: 372 EGEKTIVKEARQIEEELEKEVEKLGKEEESLFKRVALEEGLKELEQDEENFLKELSKEEK 431
Query: 164 KLEAQMEFARYENETEILREQLRQ 187
+L +++ E E R +
Sbjct: 432 ELLEKLKMEASEVEKLFGRALPVR 455
>gnl|CDD|235850 PRK06669, fliH, flagellar assembly protein H; Validated.
Length = 281
Score = 31.9 bits (73), Expect = 0.26
Identities = 20/84 (23%), Positives = 37/84 (44%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
E +++EE ++L+ E ++ E+ EI+ E EE L+K +
Sbjct: 36 ERLREEEEEQVEQLREEANDEAKEIIEEAEEDAFEIVEAAEEEAKEELLKKTDEASSIIE 95
Query: 166 EAQMEFARYENETEILREQLRQRE 189
+ QM+ R + E E E+L +
Sbjct: 96 KLQMQIEREQEEWEEELERLIEEA 119
>gnl|CDD|236912 PRK11448, hsdR, type I restriction enzyme EcoKI subunit R;
Provisional.
Length = 1123
Score = 32.6 bits (75), Expect = 0.27
Identities = 23/96 (23%), Positives = 51/96 (53%), Gaps = 2/96 (2%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
+ ALQ+E+ +++LE Q + +++ +++ E L EL+ ++++LEAQ
Sbjct: 141 ENLLHALQQEVLTLKQQLELQAR-EKAQSQALAEAQQQELVALEGLAAELEEKQQELEAQ 199
Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKER 204
+E + E E +E+ ++R+ ++ + EL E
Sbjct: 200 LEQLQ-EKAAETSQERKQKRKEITDQAAKRLELSEE 234
>gnl|CDD|240376 PTZ00350, PTZ00350, adenylosuccinate synthetase; Provisional.
Length = 436
Score = 32.3 bits (74), Expect = 0.30
Identities = 25/98 (25%), Positives = 47/98 (47%), Gaps = 14/98 (14%)
Query: 81 RSIGPRFATIGSFEFEYGTR------WKQLYELYKQKEEALQKELKLEEEKLEAQMEFAR 134
R IGP ++T S G R ++ + Y++ E LQ++ +EE E ++E R
Sbjct: 144 RGIGPCYSTKAS---RTGLRVGDLLNFETFEKKYRKLVEKLQEQYNIEEYDAEEELE--R 198
Query: 135 YENETEILREREC---RFVEEALQKELKLEEEKLEAQM 169
Y+ E L++ F+ +A+++ ++ E A M
Sbjct: 199 YKGYAEKLKDMIVDTVYFMNKAIKEGKRVLVEGANATM 236
>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
TolA; Provisional.
Length = 387
Score = 32.1 bits (73), Expect = 0.31
Identities = 25/122 (20%), Positives = 55/122 (45%), Gaps = 10/122 (8%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+Q EL +++ ++ +LE+E+L AQ E ++ E + AL+++ E
Sbjct: 87 QQAEELQQKQAAEQERLKQLEKERLAAQ----------EQKKQAEEAAKQAALKQKQAEE 136
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
A A+ E E + ++ A+ +++ + K+ AE +++ + EA +
Sbjct: 137 AAAKAAAAAKAKAEAEAKRAAAAAKKAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAA 196
Query: 222 EE 223
E
Sbjct: 197 AE 198
Score = 29.4 bits (66), Expect = 2.3
Identities = 17/83 (20%), Positives = 40/83 (48%), Gaps = 3/83 (3%)
Query: 153 ALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRC 212
A ++ K E+++ E + + E E L++ ++R A +E++KQ E + A +++
Sbjct: 77 AEEQRKKKEQQQAEELQQ--KQAAEQERLKQLEKERLAAQEQKKQA-EEAAKQAALKQKQ 133
Query: 213 DEEAMRRQTEEIHLRMQQQDEEL 235
EEA + + + + +
Sbjct: 134 AEEAAAKAAAAAKAKAEAEAKRA 156
Score = 27.5 bits (61), Expect = 8.9
Identities = 14/60 (23%), Positives = 28/60 (46%), Gaps = 7/60 (11%)
Query: 182 REQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
E+ R+++ Q+Q EL+++ A EQ R + R + ++Q EE ++
Sbjct: 77 AEEQRKKKE----QQQAEELQQKQAAEQERLKQLEKERLAAQ---EQKKQAEEAAKQAAL 129
>gnl|CDD|118696 pfam10168, Nup88, Nuclear pore component. Nup88 can be divided
into two structural domains; the N-terminal two-thirds
of the protein has no obvious structural motifs but is
the region for binding to Nup98, one of the components
of the nuclear pore. The C-terminal end is a predicted
coiled-coil domain. Nup88 is overexpressed in tumour
cells.
Length = 717
Score = 32.2 bits (73), Expect = 0.31
Identities = 27/147 (18%), Positives = 55/147 (37%), Gaps = 29/147 (19%)
Query: 113 EALQKELKLEEEKLEAQMEFARY-ENETEILRER---------ECRFVEEALQKELKLEE 162
E Q+ +KL + + E Q+E + E + L ER E ++ +E L K
Sbjct: 561 EEFQRRVKLLQLQKEKQLEDIQDCREERKSLSERAEKLAEKFEEAKYNQELLVNRCKRLL 620
Query: 163 EKLEAQMEFA-----RYENETEILREQLRQREADRERQKQEWELKERH------------ 205
+ +Q+ E + + +QL+ ++ K++ + H
Sbjct: 621 QSANSQLPVLSDSERDMSKELQRINKQLQHLANGIKQVKKKKNYQRYHMASQESPKKSSY 680
Query: 206 --AEEQRRCDEEAMRRQTEEIHLRMQQ 230
E+Q + E ++ E I ++Q
Sbjct: 681 TLPEKQHKTITEILKELGEHIDRMIKQ 707
Score = 27.9 bits (62), Expect = 7.7
Identities = 24/119 (20%), Positives = 47/119 (39%), Gaps = 26/119 (21%)
Query: 139 TEILRERECR---FVEEALQKELKLEEEKLEAQMEFARY-ENETEILREQLRQREADRER 194
T++ RE+ E Q+ +KL + + E Q+E + E + L E+ A++
Sbjct: 545 TQVFREQYLLKHDLAREEFQRRVKLLQLQKEKQLEDIQDCREERKSLSER-----AEKLA 599
Query: 195 QKQEWELKERHAEEQRRCD----------------EEAMRRQTEEIHLRMQQQDEELRR 237
+K E E K RC E M ++ + I+ ++Q +++
Sbjct: 600 EKFE-EAKYNQELLVNRCKRLLQSANSQLPVLSDSERDMSKELQRINKQLQHLANGIKQ 657
>gnl|CDD|224188 COG1269, NtpI, Archaeal/vacuolar-type H+-ATPase subunit I [Energy
production and conversion].
Length = 660
Score = 31.9 bits (73), Expect = 0.34
Identities = 29/164 (17%), Positives = 50/164 (30%), Gaps = 30/164 (18%)
Query: 90 IGSFEFEYGTRWKQLYELYKQKEEA---LQKELKLEEEKLEAQMEFARYENETEILRERE 146
+ ++ EL ++ EE L +EL+ E+ LE A + + +LR +
Sbjct: 97 LEEVIKPAEKFSSEVEELTRKLEERLSELDEELEDLEDLLEELEPLAYLDFDLSLLRGLK 156
Query: 147 CRFVEEALQKELKLEE--EKLEAQMEFARYENETE-----------------ILREQLRQ 187
V L + KLE +E ++ E IL E +
Sbjct: 157 FLLVRLGLVRREKLEALVGVIEDEVALYGENVEASVVIVVAHGAEDLDKVSKILNELGFE 216
Query: 188 R----EADRERQKQ----EWELKERHAEEQRRCDEEAMRRQTEE 223
E D + E + E E + E +
Sbjct: 217 LYEVPEFDGGPSELISELEEVIAEIQDELESLRSELEALAEKIA 260
>gnl|CDD|240813 cd12367, RRM2_RBM45, RNA recognition motif 2 in RNA-binding
protein 45 (RBM45) and similar proteins. This
subfamily corresponds to the RRM2 of RBM45, also termed
developmentally-regulated RNA-binding protein 1 (DRB1),
a new member of RNA recognition motif (RRM)-type neural
RNA-binding proteins, which is expressed under
spatiotemporal control. It is encoded by gene drb1 that
is expressed in neurons, not in glial cells. RBM45
predominantly localizes in cytoplasm of cultured cells
and specifically binds to poly(C) RNA. It could play an
important role during neurogenesis. RBM45 carries four
RRMs, also known as RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains).
Length = 74
Score = 29.3 bits (66), Expect = 0.38
Identities = 15/42 (35%), Positives = 21/42 (50%), Gaps = 7/42 (16%)
Query: 14 GNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEP 55
G SK G ++F + AA AL+ C +S K V+ EP
Sbjct: 39 GESKGFGYVKFHKPSQAAVALENCD-------KSFKAVLAEP 73
>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis). This nucleolar
family of proteins are involved in 60S ribosomal
biogenesis. They are specifically involved in the
processing beyond the 27S stage of 25S rRNA maturation.
This family contains sequences that bear similarity to
the glioma tumour suppressor candidate region gene 2
protein (p60). This protein has been found to interact
with herpes simplex type 1 regulatory proteins.
Length = 387
Score = 31.6 bits (72), Expect = 0.40
Identities = 25/141 (17%), Positives = 59/141 (41%), Gaps = 9/141 (6%)
Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEA---LQKE 157
++L + ++ E+ ++ E K +E + + + + E L E +EE+ ++E
Sbjct: 195 HQELLQ--EEYEKEVKAEKKRQELERVEEKKLEKMAPEASRLDEMSEGLLEESDDDGEEE 252
Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAM 217
E + E+ R+ QR ++ R++ E E KE +++
Sbjct: 253 SDDESAWEGFESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQLKKKLAQLA-- 310
Query: 218 RRQTEEIHLRMQQQDEELRRR 238
+ +EI + Q+++ R+
Sbjct: 311 --RLKEIAKEVAQKEKARARK 329
>gnl|CDD|237478 PRK13709, PRK13709, conjugal transfer nickase/helicase TraI;
Provisional.
Length = 1747
Score = 32.1 bits (73), Expect = 0.42
Identities = 31/143 (21%), Positives = 49/143 (34%), Gaps = 19/143 (13%)
Query: 98 GTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKE 157
G W + + Q + E L A+ + E I RE E R +E ++K
Sbjct: 1618 GRVWGDIPDNSVQPGAG--NGEPVTAEVL------AQRQAEEAIRRETERR-ADEIVRKM 1668
Query: 158 LKLEEEKLEAQMEFARYENETEILREQLRQREADRERQ-KQEWELKERHAEEQRRCDEEA 216
A+ + + +TE + +E DR ++E L E E +R E
Sbjct: 1669 ---------AENKPDLPDGKTEQAVRDIAGQERDRAAISEREAALPESVLREPQREREAV 1719
Query: 217 MRRQTEEIHLRMQQQDEELRRRH 239
E + QQ E R
Sbjct: 1720 REVARENLLRERLQQMERDMVRD 1742
>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein. TolA couples the inner
membrane complex of itself with TolQ and TolR to the
outer membrane complex of TolB and OprL (also called
Pal). Most of the length of the protein consists of
low-complexity sequence that may differ in both length
and composition from one species to another,
complicating efforts to discriminate TolA (the most
divergent gene in the tol-pal system) from paralogs such
as TonB. Selection of members of the seed alignment and
criteria for setting scoring cutoffs are based largely
on conserved operon structure. The Tol-Pal complex is
required for maintaining outer membrane integrity. Also
involved in transport (uptake) of colicins and
filamentous DNA, and implicated in pathogenesis.
Transport is energized by the proton motive force. TolA
is an inner membrane protein that interacts with
periplasmic TolB and with outer membrane porins ompC,
phoE and lamB [Transport and binding proteins, Other,
Cellular processes, Pathogenesis].
Length = 346
Score = 31.7 bits (72), Expect = 0.42
Identities = 17/85 (20%), Positives = 36/85 (42%), Gaps = 3/85 (3%)
Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEW 199
E R+++ E +K+ E+ + + + A E ++ + + E+QKQ
Sbjct: 66 EQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAA---KQAEQAAKQAEEKQKQAE 122
Query: 200 ELKERHAEEQRRCDEEAMRRQTEEI 224
E K + A E + E ++ +E
Sbjct: 123 EAKAKQAAEAKAKAEAEAEKKAKEE 147
Score = 28.3 bits (63), Expect = 5.2
Identities = 20/114 (17%), Positives = 48/114 (42%), Gaps = 2/114 (1%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
K+ ++E + + E+ + E R + + E+A ++ + ++ E Q
Sbjct: 59 KKPAAKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQ 118
Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTE 222
+ E + + E + EA+ E++ +E K+ E + + EA ++ E
Sbjct: 119 KQAE--EAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAE 170
>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207). This
family is found in eukaryotes; it has several conserved
tryptophan residues. The function is not known.
Length = 261
Score = 31.2 bits (71), Expect = 0.43
Identities = 26/126 (20%), Positives = 62/126 (49%), Gaps = 3/126 (2%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
Q+++ LQK L E++K E + E E + +E+ + + Q+ K K + +
Sbjct: 98 AQRQKKLQKLL-EEKQKQEREKEREEAELRQRLAKEKYEEWCRQKAQQAAKQRTPKHKKE 156
Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE--EAMRRQTEEIHL 226
+ + + + + + + +++ QEWELK+ ++Q+R +E + ++Q EE
Sbjct: 157 AAESASSSLSGSAKPERNVSQEEAKKRLQEWELKKLKQQQQKREEERRKQRKKQQEEEER 216
Query: 227 RMQQQD 232
+ + ++
Sbjct: 217 KQKAEE 222
Score = 27.4 bits (61), Expect = 7.9
Identities = 20/129 (15%), Positives = 56/129 (43%), Gaps = 3/129 (2%)
Query: 116 QKELKLEEEKLEA-QMEFARYENE--TEILRERECRFVEEALQKELKLEEEKLEAQMEFA 172
KE+KLE + EA + + + + ++ + E + +E ++ + E + A+ ++
Sbjct: 76 LKEVKLERQAQEAYENWLSAKQAQRQKKLQKLLEEKQKQEREKEREEAELRQRLAKEKYE 135
Query: 173 RYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQD 232
+ + + R + +E + A+ +R +E +++ +E L+ +Q
Sbjct: 136 EWCRQKAQQAAKQRTPKHKKEAAESASSSLSGSAKPERNVSQEEAKKRLQEWELKKLKQQ 195
Query: 233 EELRRRHQE 241
++ R +
Sbjct: 196 QQKREEERR 204
>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon.
Length = 431
Score = 31.5 bits (71), Expect = 0.45
Identities = 33/130 (25%), Positives = 50/130 (38%), Gaps = 1/130 (0%)
Query: 108 YKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEA 167
+QK + E +EEK E + + + E + E E
Sbjct: 135 SEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGE-FMTHKLKHTENTFSRGGAEG 193
Query: 168 QMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
A E E ++Q E + ++K+E K EEQRR EEA R+ EE R
Sbjct: 194 AQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADRKSREEEEKR 253
Query: 228 MQQQDEELRR 237
+++ E RR
Sbjct: 254 RLKEEIERRR 263
>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain. This domain is
found at the N terminus of SMC proteins. The SMC
(structural maintenance of chromosomes) superfamily
proteins have ATP-binding domains at the N- and
C-termini, and two extended coiled-coil domains
separated by a hinge in the middle. The eukaryotic SMC
proteins form two kinds of heterodimers: the SMC1/SMC3
and the SMC2/SMC4 types. These heterodimers constitute
an essential part of higher order complexes, which are
involved in chromatin and DNA dynamics. This family also
includes the RecF and RecN proteins that are involved in
DNA metabolism and recombination.
Length = 1162
Score = 31.9 bits (72), Expect = 0.46
Identities = 27/129 (20%), Positives = 53/129 (41%), Gaps = 10/129 (7%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
++ +E+ ++ KLE+E + + E E E + L + EE Q E E+
Sbjct: 315 EKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEKL----- 369
Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRM 228
+ E E+L ++ + E K + E E EE++ + EE L+
Sbjct: 370 -----EQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLLLELSEQEEDLLKE 424
Query: 229 QQQDEELRR 237
++++E
Sbjct: 425 EKKEELKIV 433
Score = 30.7 bits (69), Expect = 1.0
Identities = 28/131 (21%), Positives = 57/131 (43%), Gaps = 1/131 (0%)
Query: 105 YELYKQKEEALQKELKLEEEKLEAQMEFAR-YENETEILRERECRFVEEALQKELKLEEE 163
++K+E L+K ++ E E ++ E ++ + + L+++L+LEEE
Sbjct: 166 SREKRKKKERLKKLIEETENLAELIIDLEELKLQELKLKEQAKKALEYYQLKEKLELEEE 225
Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
L E ++L+E LR + + E KQE E +E + + ++E + + +
Sbjct: 226 NLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQ 285
Query: 224 IHLRMQQQDEE 234
EE
Sbjct: 286 EEELKLLAKEE 296
Score = 28.8 bits (64), Expect = 4.0
Identities = 24/135 (17%), Positives = 54/135 (40%), Gaps = 7/135 (5%)
Query: 116 QKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE 175
++ L++EEE ++ + + E +++ E E +ELKL+E KL+ Q
Sbjct: 154 ERRLEIEEEAAGSREKRKKKERLKKLIEETENLAELIIDLEELKLQELKLKEQ------A 207
Query: 176 NETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR-MQQQDEE 234
+ + + E + E LK + ++ E + +++++E
Sbjct: 208 KKALEYYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEI 267
Query: 235 LRRRHQENSIFMQVI 249
L + +EN +
Sbjct: 268 LAQVLKENKEEEKEK 282
Score = 28.4 bits (63), Expect = 4.8
Identities = 28/132 (21%), Positives = 56/132 (42%), Gaps = 13/132 (9%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
EL K+ +E K EEE+ + + + + +EE L + KLE E+L
Sbjct: 340 ELEKELKELEIKREAEEEEEEQLE------------KLQEKLEQLEEELLAKKKLESERL 387
Query: 166 EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIH 225
+ + + E E+ + +++E LKE EE + +E +T++
Sbjct: 388 SSAAK-LKEEELELKNEEEKEAKLLLELSEQEEDLLKEEKKEELKIVEELEESLETKQGK 446
Query: 226 LRMQQQDEELRR 237
L ++++ E +
Sbjct: 447 LTEEKEELEKQA 458
Score = 28.0 bits (62), Expect = 7.4
Identities = 26/149 (17%), Positives = 66/149 (44%), Gaps = 3/149 (2%)
Query: 96 EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ 155
+Y ++ +L ++ Q+E++ +++LE + E + E+E + EE
Sbjct: 231 DYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQEE--- 287
Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEE 215
+ L +E+ E + E + E E+L++ E + ++ ++E + ++ EE + +E
Sbjct: 288 ELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKE 347
Query: 216 AMRRQTEEIHLRMQQQDEELRRRHQENSI 244
++ E Q + + + E +
Sbjct: 348 LEIKREAEEEEEEQLEKLQEKLEQLEEEL 376
>gnl|CDD|217902 pfam04111, APG6, Autophagy protein Apg6. In yeast, 15 Apg proteins
coordinate the formation of autophagosomes. Autophagy is
a bulk degradation process induced by starvation in
eukaryotic cells. Apg6/Vps30p has two distinct functions
in the autophagic process, either associated with the
membrane or in a retrieval step of the carboxypeptidase
Y sorting pathway.
Length = 356
Score = 31.4 bits (71), Expect = 0.50
Identities = 21/84 (25%), Positives = 39/84 (46%), Gaps = 6/84 (7%)
Query: 117 KELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYEN 176
ELK EEE+L ++E E E + L + E +++ +LE E+L+ E+ ++
Sbjct: 73 DELKKEEERLLDELE--ELEKEDDDLDGE----LVELQEEKEQLENEELQYLREYNLFDR 126
Query: 177 ETEILREQLRQREADRERQKQEWE 200
L + L+ E E + +
Sbjct: 127 NNLQLEDNLQSLELQYEYSLNQLD 150
>gnl|CDD|234342 TIGR03752, conj_TIGR03752, integrating conjugative element protein,
PFL_4705 family. Members of this protein family are
found occasionally on plasmids such as the Pseudomonas
putida toluene catabolic TOL plasmid pWWO_p085. Usually,
however, they are found on the bacterial main chromosome
in regions flanked by markers of conjugative transfer
and/or transposition [Mobile and extrachromosomal
element functions, Plasmid functions].
Length = 472
Score = 31.5 bits (72), Expect = 0.51
Identities = 21/83 (25%), Positives = 38/83 (45%), Gaps = 13/83 (15%)
Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEE 215
KEL+ KL ++ E + ENE L++RE ++Q Q + E + +E
Sbjct: 69 KELRKRLAKLISENEALKAENER------LQKREQSIDQQIQ-----QAVQSETQELTKE 117
Query: 216 AMRRQTEEIHLRMQQQDEELRRR 238
Q + ++Q ++L+RR
Sbjct: 118 --IEQLKSERQQLQGLIDQLQRR 138
>gnl|CDD|224340 COG1422, COG1422, Predicted membrane protein [Function unknown].
Length = 201
Score = 30.8 bits (70), Expect = 0.55
Identities = 15/56 (26%), Positives = 27/56 (48%), Gaps = 1/56 (1%)
Query: 180 ILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEEL 235
IL++ L +E +E QK E ++ E Q D + +++ +E + M EL
Sbjct: 63 ILQKLLIDQEKMKELQKMMKEFQKEFREAQESGDMKKLKK-LQEKQMEMMDDQREL 117
>gnl|CDD|179699 PRK03992, PRK03992, proteasome-activating nucleotidase;
Provisional.
Length = 389
Score = 31.3 bits (72), Expect = 0.56
Identities = 15/51 (29%), Positives = 23/51 (45%), Gaps = 4/51 (7%)
Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILR---EQLRQ 187
E L E R E Q +LE + + + E + E E E L+ E+L+
Sbjct: 1 ERLEALEERNSELEEQIR-QLELKLRDLEAENEKLERELERLKSELEKLKS 50
>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
primarily archaeal type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. It is found
in a single copy and is homodimeric in prokaryotes, but
six paralogs (excluded from this family) are found in
eukaryotes, where SMC proteins are heterodimeric. This
family represents the SMC protein of archaea and a few
bacteria (Aquifex, Synechocystis, etc); the SMC of other
bacteria is described by TIGR02168. The N- and
C-terminal domains of this protein are well conserved,
but the central hinge region is skewed in composition
and highly divergent [Cellular processes, Cell division,
DNA metabolism, Chromosome-associated proteins].
Length = 1164
Score = 31.6 bits (72), Expect = 0.57
Identities = 38/157 (24%), Positives = 65/157 (41%), Gaps = 12/157 (7%)
Query: 94 EFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEA 153
E+E K+ L +QKE ++ LEEE ++ E E + E E E
Sbjct: 222 EYEGYELLKEKEALERQKEAIERQLASLEEEL--EKLTEEISELEKRL-EEIEQLLEELN 278
Query: 154 LQKELKLEEEKLEAQMEFARYENETEILREQL-----RQREADRERQKQEWEL---KERH 205
+ + EEE+L + + E E L + +A+ K E E+
Sbjct: 279 KKIKDLGEEEQLRVKEKIGELEAEIASLERSIAEKERELEDAEERLAKLEAEIDKLLAEI 338
Query: 206 AEEQRRCDEEAMRR-QTEEIHLRMQQQDEELRRRHQE 241
E +R +EE RR + E + ++++ E+LR +E
Sbjct: 339 EELEREIEEERKRRDKLTEEYAELKEELEDLRAELEE 375
Score = 31.2 bits (71), Expect = 0.72
Identities = 41/142 (28%), Positives = 65/142 (45%), Gaps = 12/142 (8%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEE--- 162
L K+K E EL E+E LE Q E E + L E + EE + E +LEE
Sbjct: 215 ALLKEKREYEGYELLKEKEALERQKE--AIERQLASLEEELEKLTEEISELEKRLEEIEQ 272
Query: 163 --EKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERH---AEEQRRCDEEAM 217
E+L +++ E E ++E++ + EA+ ++ KER AEE+ E +
Sbjct: 273 LLEELNKKIK-DLGEEEQLRVKEKIGELEAEIASLERSIAEKERELEDAEERLAKLEAEI 331
Query: 218 RRQTEEI-HLRMQQQDEELRRR 238
+ EI L + ++E RR
Sbjct: 332 DKLLAEIEELEREIEEERKRRD 353
Score = 30.0 bits (68), Expect = 1.6
Identities = 31/126 (24%), Positives = 56/126 (44%), Gaps = 5/126 (3%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVE--EALQKELK 159
++L +L ++ E ++ +L+EE E A + + E E E+K
Sbjct: 392 EKLEKLKREINELKRELDRLQEELQRLSEELADLNAAIAGIEAKINELEEEKEDKALEIK 451
Query: 160 LEEEKLE-AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
+E KLE + ++YE E L+E+ + E +E K + EL E A+ + +
Sbjct: 452 KQEWKLEQLAADLSKYEQELYDLKEEYDRVE--KELSKLQRELAEAEAQARASEERVRGG 509
Query: 219 RQTEEI 224
R EE+
Sbjct: 510 RAVEEV 515
Score = 28.9 bits (65), Expect = 4.0
Identities = 31/146 (21%), Positives = 58/146 (39%), Gaps = 12/146 (8%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL- 160
++L L + + +L +E +A + E E E L + E + E + E L
Sbjct: 688 RELSSLQSELRRIENRLDELSQELSDASRKIGEIEKEIEQLEQEEEKLKERLEELEEDLS 747
Query: 161 --EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
E+E + E E E L E L + E +L+ R + + + +
Sbjct: 748 SLEQEIENVKSELKELEARIEELEEDLHKLEEALN------DLEARLSHSRIPEIQAELS 801
Query: 219 RQTEE---IHLRMQQQDEELRRRHQE 241
+ EE I R+++ +++L R E
Sbjct: 802 KLEEEVSRIEARLREIEQKLNRLTLE 827
Score = 27.7 bits (62), Expect = 8.9
Identities = 28/133 (21%), Positives = 60/133 (45%), Gaps = 5/133 (3%)
Query: 106 ELYKQKEEALQKELKLEEEKLEA-QMEFARYENETEILRERECRFVEEALQKELKLEE-E 163
E +Q+EE L++ L+ EE L + + E ++E + L R E+ + E L + E
Sbjct: 726 EQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKELEARIEELEEDLHKLEEALNDLE 785
Query: 164 KLEAQMEFARYENETEILREQLRQREA---DRERQKQEWELKERHAEEQRRCDEEAMRRQ 220
+ + E L E++ + EA + E++ L++ + E++ + +E
Sbjct: 786 ARLSHSRIPEIQAELSKLEEEVSRIEARLREIEQKLNRLTLEKEYLEKEIQELQEQRIDL 845
Query: 221 TEEIHLRMQQQDE 233
E+I ++ +
Sbjct: 846 KEQIKSIEKEIEN 858
>gnl|CDD|202833 pfam03962, Mnd1, Mnd1 family. This family of proteins includes
MND1 from S. cerevisiae. The mnd1 protein forms a
complex with hop2 to promote homologous chromosome
pairing and meiotic double-strand break repair.
Length = 188
Score = 30.7 bits (70), Expect = 0.58
Identities = 13/60 (21%), Positives = 32/60 (53%), Gaps = 1/60 (1%)
Query: 183 EQLRQREADRER-QKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
+ L + + E+ +K+ ELK+R AE Q + ++ R+ E + ++ ++L + ++
Sbjct: 62 QALNKLKTRLEKLKKELEELKQRIAELQAQIEKLKKGREETEERTELLEELKQLEKELKK 121
>gnl|CDD|223447 COG0370, FeoB, Fe2+ transport system protein B [Inorganic ion
transport and metabolism].
Length = 653
Score = 31.1 bits (71), Expect = 0.61
Identities = 44/175 (25%), Positives = 72/175 (41%), Gaps = 34/175 (19%)
Query: 41 VFFLTQSL---KPVIVEPLDLVD---------DEEGLSER-------TVSKK--SSDYFK 79
++ Q L P+I+ L+++D D E LS+ TV+K+ + K
Sbjct: 98 LYLTLQLLELGIPMILA-LNMIDEAKKRGIRIDIEKLSKLLGVPVVPTVAKRGEGLEELK 156
Query: 80 QRSIGPRFATIGSFEFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENET 139
+ I + E +YG + E ++ EAL ++ + KL E
Sbjct: 157 RAIIELAESKTTPREVDYG----EEIEEEIKELEALSEDPRWLAIKLLEDDELVE----- 207
Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRER 194
+L+E E R E L +EL EEE + ARY ILR ++Q E ++
Sbjct: 208 AVLKEPEKRV--EELLEELS-EEEGHLLLIADARYALIERILRSVVKQEEEEKSS 259
>gnl|CDD|202427 pfam02841, GBP_C, Guanylate-binding protein, C-terminal domain.
Transcription of the anti-viral guanylate-binding
protein (GBP) is induced by interferon-gamma during
macrophage induction. This family contains GBP1 and
GBP2, both GTPases capable of binding GTP, GDP and GMP.
Length = 297
Score = 31.1 bits (71), Expect = 0.63
Identities = 32/131 (24%), Positives = 56/131 (42%), Gaps = 13/131 (9%)
Query: 109 KQKEEALQKELKLEEEKLEA--QMEFARYENETEILRERECRFVEEALQKELKLEEEKLE 166
+ EE LQ+ L +E EA Q + A E I ER EA + E +L EK +
Sbjct: 172 VKAEEVLQEFLNSKEAVEEAILQTDQALTAKEKAIEAERA---KAEAAEAEQELLREKQK 228
Query: 167 AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHL 226
+ + E + +E ++Q E ++++ EQ R E ++ Q E +
Sbjct: 229 EEEQ--MMEAQERSYQEHVKQLIEKMEAEREKLL------AEQERMLEHKLQEQEELLKE 280
Query: 227 RMQQQDEELRR 237
+ + E L++
Sbjct: 281 GFKTEAESLQK 291
Score = 28.8 bits (65), Expect = 3.5
Identities = 30/145 (20%), Positives = 70/145 (48%), Gaps = 6/145 (4%)
Query: 105 YELYKQKEEALQKELKLEEEK-LEAQMEFARYENETEILRERECRFVEEALQKELKLEEE 163
Y+L+ ++ + L+ + K ++A+ + N E + E + + KE +E E
Sbjct: 150 YKLFLEERDKLEAKYNQVPRKGVKAEEVLQEFLNSKEAVEEAILQTDQALTAKEKAIEAE 209
Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
+ +A+ E E E+LRE+ ++ E E Q++ ++ + E+ + E + + E
Sbjct: 210 RAKAEAA----EAEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKLLAEQER 265
Query: 224 -IHLRMQQQDEELRRRHQENSIFMQ 247
+ ++Q+Q+E L+ + + +Q
Sbjct: 266 MLEHKLQEQEELLKEGFKTEAESLQ 290
>gnl|CDD|223024 PHA03252, PHA03252, DNA packaging tegument protein UL25;
Provisional.
Length = 589
Score = 31.2 bits (71), Expect = 0.71
Identities = 13/35 (37%), Positives = 20/35 (57%)
Query: 203 ERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRR 237
E H RC EE +RR ++ LR+++ E+L R
Sbjct: 17 EGHVRNILRCPEEDLRRLRDDSALRLRRYREDLLR 51
>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein. Function of MutS2 is
unknown. It should not be considered a DNA mismatch
repair protein. It is likely a DNA mismatch binding
protein of unknown cellular function [DNA metabolism,
Other].
Length = 771
Score = 30.9 bits (70), Expect = 0.76
Identities = 27/103 (26%), Positives = 46/103 (44%), Gaps = 10/103 (9%)
Query: 107 LYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLE 166
L ++E QK LE+ E + E E E L+ERE + KLE EK E
Sbjct: 520 LSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERE---------RNKKLELEK-E 569
Query: 167 AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQ 209
AQ + E E + +L++++ + ++ + E + E +
Sbjct: 570 AQEALKALKKEVESIIRELKEKKIHKAKEIKSIEDLVKLKETK 612
>gnl|CDD|223868 COG0797, RlpA, Lipoproteins [Cell envelope biogenesis, outer
membrane].
Length = 233
Score = 30.5 bits (69), Expect = 0.77
Identities = 16/51 (31%), Positives = 24/51 (47%), Gaps = 9/51 (17%)
Query: 7 VVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFFLTQSLKPVIVEPLD 57
VV ++DRG + II+ ++ AAA L + GV V +E L
Sbjct: 135 VVRINDRGPFVSGRIIDLSK--AAADKLGMIRSGVA-------KVRIEVLG 176
>gnl|CDD|221371 pfam12004, DUF3498, Domain of unknown function (DUF3498). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 433 to 538 amino acids in length. This domain is
found associated with pfam00616, pfam00168. This domain
has two conserved sequence motifs: DLQ and PLSFQNP.
Length = 489
Score = 30.8 bits (69), Expect = 0.81
Identities = 28/105 (26%), Positives = 45/105 (42%), Gaps = 22/105 (20%)
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
EE E + +YE E L+E+LR + R ++ E L + + Q+ E R +
Sbjct: 356 EENREEGTQAEKYEQEIARLKERLRV--SVRRLEEYERRLLGQEQQMQKLLQEYQARLED 413
Query: 222 EEIHLRMQQQD----------------EELRRRHQENSIFMQVIV 250
E LR QQ++ EEL++ H + MQ +V
Sbjct: 414 SEERLRRQQEEKDSQMKSIISRLMAVEEELKKDHAD----MQAVV 454
Score = 30.8 bits (69), Expect = 0.86
Identities = 26/96 (27%), Positives = 45/96 (46%), Gaps = 9/96 (9%)
Query: 123 EEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILR 182
EE E + +YE E L+ER V + E +L ++ + Q Y+ E
Sbjct: 356 EENREEGTQAEKYEQEIARLKERLRVSVRRLEEYERRLLGQEQQMQKLLQEYQARLEDSE 415
Query: 183 EQLRQREADRERQKQ---------EWELKERHAEEQ 209
E+LR+++ +++ Q + E ELK+ HA+ Q
Sbjct: 416 ERLRRQQEEKDSQMKSIISRLMAVEEELKKDHADMQ 451
>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
splicing factor. These splicing factors consist of an
N-terminal arginine-rich low complexity domain followed
by three tandem RNA recognition motifs (pfam00076). The
well-characterized members of this family are auxiliary
components of the U2 small nuclear ribonucleoprotein
splicing factor (U2AF). These proteins are closely
related to the CC1-like subfamily of splicing factors
(TIGR01622). Members of this subfamily are found in
plants, metazoa and fungi.
Length = 509
Score = 30.6 bits (69), Expect = 0.85
Identities = 14/75 (18%), Positives = 26/75 (34%), Gaps = 2/75 (2%)
Query: 170 EFARYENETEILREQLRQREADRERQKQEWELKERHAE--EQRRCDEEAMRRQTEEIHLR 227
E E E R++ R E R R + ++RH E+ ++ R +
Sbjct: 3 EEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRS 62
Query: 228 MQQQDEELRRRHQEN 242
+ RR ++
Sbjct: 63 PRSLRYSSVRRSRDR 77
Score = 27.9 bits (62), Expect = 5.8
Identities = 14/62 (22%), Positives = 25/62 (40%), Gaps = 2/62 (3%)
Query: 182 REQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
R++ RE ++ R + ER +R D R + R ++D R R +
Sbjct: 1 RDEEPDREREKSRGRDRDRSSERP--RRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRY 58
Query: 242 NS 243
+S
Sbjct: 59 DS 60
>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
Mitofilin controls mitochondrial cristae morphology.
Mitofilin is enriched in the narrow space between the
inner boundary and the outer membranes, where it forms a
homotypic interaction and assembles into a large
multimeric protein complex. The first 78 amino acids
contain a typical amino-terminal-cleavable mitochondrial
presequence rich in positively charged and hydroxylated
residues and a membrane anchor domain. In addition, it
has three centrally located coiled coil domains.
Length = 493
Score = 30.8 bits (70), Expect = 0.93
Identities = 39/130 (30%), Positives = 67/130 (51%), Gaps = 15/130 (11%)
Query: 88 ATIGSFEFEYGTRWKQLYELYKQKEEALQKELK--------LEEEKLEAQMEFARYENET 139
+ I S + E K+L EL ++EE L++ LK EE+L A++E E
Sbjct: 163 SLIASAKEELDQLSKKLAELKAEEEEELERALKEKREELLSKLEEELLARLESKEAALEK 222
Query: 140 EILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEW 199
++ E E R EE L+K+ EEKL ++E +E + L+ +L + + +R+ +
Sbjct: 223 QLRLEFE-REKEE-LRKKY---EEKLRQELERQAEAHE-QKLKNELALQAIELQREFNK- 275
Query: 200 ELKERHAEEQ 209
E+KE+ EE+
Sbjct: 276 EIKEKVEEER 285
Score = 30.0 bits (68), Expect = 1.4
Identities = 34/122 (27%), Positives = 61/122 (50%), Gaps = 7/122 (5%)
Query: 96 EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRF--VEEA 153
+ + E Q + L + EEE+LE ++ R E +++ E R E A
Sbjct: 160 DLESLIASAKEELDQLSKKLAELKAEEEEELERALKEKREELLSKLEEELLARLESKEAA 219
Query: 154 LQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCD 213
L+K+L+LE E+ + ++ +YE + LR++L ++ E QK + EL + E QR +
Sbjct: 220 LEKQLRLEFEREKEELR-KKYEEK---LRQELERQAEAHE-QKLKNELALQAIELQREFN 274
Query: 214 EE 215
+E
Sbjct: 275 KE 276
Score = 28.1 bits (63), Expect = 5.8
Identities = 30/111 (27%), Positives = 52/111 (46%), Gaps = 11/111 (9%)
Query: 111 KEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQME 170
KEE Q KL E K E + E R L+E+ + + ++ L E K A +
Sbjct: 169 KEELDQLSKKLAELKAEEEEELER------ALKEKREELLSKLEEELLARLESKEAALEK 222
Query: 171 FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQT 221
+ E E +E+LR++ ++ RQ+ E ++ A EQ+ +E A++
Sbjct: 223 --QLRLEFEREKEELRKKYEEKLRQELE---RQAEAHEQKLKNELALQAIE 268
>gnl|CDD|227512 COG5185, HEC1, Protein involved in chromosome segregation,
interacts with SMC proteins [Cell division and
chromosome partitioning].
Length = 622
Score = 30.7 bits (69), Expect = 0.96
Identities = 28/142 (19%), Positives = 65/142 (45%), Gaps = 15/142 (10%)
Query: 95 FEYGTRWKQLYELYKQKEEALQKELKLEEEK----LEAQMEFARYENETEILRERECRFV 150
F+Y T + + + E ++ELKL EK + + + +N+ + +E +
Sbjct: 234 FDYFTESYKSFLKLEDNYEPSEQELKLGFEKFVHIINTDIANLKTQNDNLYEKIQEAMKI 293
Query: 151 EEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
+ ++ L E+ + + +YEN ++++ ++ E+ K E ELKE
Sbjct: 294 SQKIKT---LREKWRALKSDSNKYENYVNAMKQKSQEWPGKLEKLKSEIELKEEEI---- 346
Query: 211 RCDEEAMRRQTEEIHLRMQQQD 232
+A++ +E+H ++++Q
Sbjct: 347 ----KALQSNIDELHKQLRKQG 364
>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
Length = 880
Score = 30.8 bits (70), Expect = 0.97
Identities = 34/105 (32%), Positives = 54/105 (51%), Gaps = 8/105 (7%)
Query: 109 KQKEEALQKELKLEEEKL-EAQMEFARYENETEILRER----ECRFVEEALQKELKLEEE 163
+++ E +KELK EE+L +A E A E E LR+ E ++ EE ++ L EE
Sbjct: 611 EKELEREEKELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEE---LREE 667
Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEE 208
LE E A E E L ++ + + E+ K+E E +E+ +E
Sbjct: 668 YLELSRELAGLRAELEELEKRREEIKKTLEKLKEELEEREKAKKE 712
Score = 29.3 bits (66), Expect = 2.4
Identities = 35/148 (23%), Positives = 64/148 (43%), Gaps = 12/148 (8%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVE--------EALQKE 157
EL K+ E + KLEE+ E + + E E L E+ E L +
Sbjct: 242 ELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEKVKELKELKEKAEEYIKLSEF 301
Query: 158 L-KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEA 216
+ +E E + +R E E + E++++ E ER + ELK++ E ++R +E
Sbjct: 302 YEEYLDELREIEKRLSRLEEEINGIEERIKELEEKEERLE---ELKKKLKELEKRLEELE 358
Query: 217 MRRQTEEIHLRMQQQDEELRRRHQENSI 244
R + E +++ E L++R +
Sbjct: 359 ERHELYEEAKAKKEELERLKKRLTGLTP 386
Score = 28.1 bits (63), Expect = 7.2
Identities = 36/130 (27%), Positives = 63/130 (48%), Gaps = 3/130 (2%)
Query: 110 QKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQM 169
+K E L+K+L E+KL+ E + L E VEE L++ LK E +
Sbjct: 549 EKLEELKKKLAELEKKLDELEE--ELAELLKELEELGFESVEE-LEERLKELEPFYNEYL 605
Query: 170 EFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQ 229
E E E E ++L++ E + ++ +E E+ EE R+ EE ++ +EE + ++
Sbjct: 606 ELKDAEKELEREEKELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEELR 665
Query: 230 QQDEELRRRH 239
++ EL R
Sbjct: 666 EEYLELSREL 675
>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584). This
protein is found in bacteria and eukaryotes. Proteins in
this family are typically between 943 and 1234 amino
acids in length. This family contains a P-loop motif
suggesting it is a nucleotide binding protein. It may be
involved in replication.
Length = 1198
Score = 30.8 bits (70), Expect = 1.1
Identities = 30/141 (21%), Positives = 65/141 (46%), Gaps = 10/141 (7%)
Query: 108 YKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEA 167
Y++ ++ ++++L+ + EK ++ R E + + E +AL+ +L+ + E +
Sbjct: 382 YERLKQKIKEQLERDLEKNNERLAAIREEKDRQKAAIEE---DLQALESQLRQQLEAGKL 438
Query: 168 QMEFARYENETEILREQLRQREA-----DRERQKQEWELKERHAEEQRRCDEEAMRRQTE 222
+ YE E + R + R A + E+ + E E+ EEQ + + + Q+E
Sbjct: 439 EFNEEEYELELRLGRLKQRLDSATATPEELEQLEINDEALEKAQEEQEQAEANVEQLQSE 498
Query: 223 EIHLRMQ--QQDEELRRRHQE 241
LR + + E L+R +
Sbjct: 499 LRQLRKRRDEALEALQRAERR 519
Score = 27.7 bits (62), Expect = 8.7
Identities = 31/155 (20%), Positives = 69/155 (44%), Gaps = 20/155 (12%)
Query: 96 EYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQ 155
+ + L + K+ + L +EL KL A +E E+L +++ F + ++
Sbjct: 292 RLRQQLRTLEDQLKEARDELNQELSAANAKLAAD------RSELELLEDQKGAFEDADIE 345
Query: 156 KELKLEEEKL---EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRC 212
+ L+ + ++L +++E + + Q QR+ +R +QK +KE+ + +
Sbjct: 346 Q-LQADLDQLPSIRSELEEVEARLDALTGKHQDVQRKYERLKQK----IKEQLERDLEKN 400
Query: 213 DE------EAMRRQTEEIHLRMQQQDEELRRRHQE 241
+E E RQ I +Q + +LR++ +
Sbjct: 401 NERLAAIREEKDRQKAAIEEDLQALESQLRQQLEA 435
>gnl|CDD|233069 TIGR00643, recG, ATP-dependent DNA helicase RecG. [DNA metabolism,
DNA replication, recombination, and repair].
Length = 630
Score = 30.4 bits (69), Expect = 1.1
Identities = 15/60 (25%), Positives = 24/60 (40%), Gaps = 7/60 (11%)
Query: 122 EEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEIL 181
E EKL+ + A YE + + + ++ +EK EF E E +IL
Sbjct: 460 ESEKLDLKAAEALYERLKKAFPKYNVGLLHGRMK-----SDEKEAVMEEF--REGEVDIL 512
>gnl|CDD|163620 cd00845, MPP_UshA_N_like, Escherichia coli UshA-like family,
N-terminal metallophosphatase domain. This family
includes the bacterial enzyme UshA, and related enzymes
including SoxB, CpdB, YhcR, and CD73. All members have
a similar domain architecture which includes an
N-terminal metallophosphatase domain and a C-terminal
nucleotidase domain. The N-terminal metallophosphatase
domain belongs to a large superfamily of distantly
related metallophosphatases (MPPs) that includes:
Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat
debranching enzymes, YfcE-like phosphodiesterases,
purple acid phosphatases (PAPs), YbbF-like
UDP-2,3-diacylglucosamine hydrolases, and acid
sphingomyelinases (ASMases). MPPs are functionally
diverse, but all share a conserved domain with an active
site consisting of two metal ions (usually manganese,
iron, or zinc) coordinated with octahedral geometry by a
cage of histidine, aspartate, and asparagine residues.
The conserved domain is a double beta-sheet sandwich
with a di-metal active site made up of residues located
at the C-terminal side of the sheets. This domain is
thought to allow for productive metal coordination.
Length = 252
Score = 29.9 bits (68), Expect = 1.1
Identities = 13/28 (46%), Positives = 16/28 (57%), Gaps = 2/28 (7%)
Query: 83 IGPRFATIGSFEFEYGTRWKQLYELYKQ 110
+G TIG+ EF+YG L ELYK
Sbjct: 69 LGYDAVTIGNHEFDYGL--DALAELYKD 94
>gnl|CDD|238427 cd00831, CHS_like, Chalcone and stilbene synthases; plant-specific
polyketide synthases (PKS) and related enzymes, also
called type III PKSs. PKS generate an array of different
products, dependent on the nature of the starter
molecule. They share a common chemical strategy, after
the starter molecule is loaded onto the active site
cysteine, a carboxylative condensation reaction extends
the polyketide chain. Plant-specific PKS are dimeric
iterative PKSs, using coenzyme A esters to deliver
substrate to the active site, but they differ in the
choice of starter molecule and the number of
condensation reactions.
Length = 361
Score = 30.3 bits (69), Expect = 1.1
Identities = 15/50 (30%), Positives = 22/50 (44%), Gaps = 9/50 (18%)
Query: 152 EALQKELKLEEEKLEA-QMEFARYEN--------ETEILREQLRQREADR 192
+A++K L L E LEA +M RY N + + R + DR
Sbjct: 295 DAVEKALGLSPEDLEASRMVLRRYGNMSSSSVLYVLAYMEAKGRVKRGDR 344
Score = 27.6 bits (62), Expect = 8.7
Identities = 12/26 (46%), Positives = 16/26 (61%), Gaps = 1/26 (3%)
Query: 113 EALQKELKLEEEKLEA-QMEFARYEN 137
+A++K L L E LEA +M RY N
Sbjct: 295 DAVEKALGLSPEDLEASRMVLRRYGN 320
>gnl|CDD|237178 PRK12705, PRK12705, hypothetical protein; Provisional.
Length = 508
Score = 30.1 bits (68), Expect = 1.3
Identities = 27/109 (24%), Positives = 40/109 (36%), Gaps = 3/109 (2%)
Query: 132 FARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLR---QR 188
R E R + E + E L E K E + E RE+L+ +R
Sbjct: 26 KKRQRLAKEAERILQEAQKEAEEKLEAALLEAKELLLRERNQQRQEARREREELQREEER 85
Query: 189 EADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRR 237
+E Q K + E Q E+A+ + E+ +Q D EL R
Sbjct: 86 LVQKEEQLDARAEKLDNLENQLEEREKALSARELELEELEKQLDNELYR 134
Score = 29.7 bits (67), Expect = 1.8
Identities = 34/111 (30%), Positives = 58/111 (52%), Gaps = 6/111 (5%)
Query: 104 LYELYKQKEEALQKELKLEEEKLEAQ--MEFARYENETEILRERECRFVEEALQK-ELKL 160
+ L K++ A + E L+E + EA+ +E A E + +LRER + E ++ EL+
Sbjct: 22 VVLLKKRQRLAKEAERILQEAQKEAEEKLEAALLEAKELLLRERNQQRQEARREREELQR 81
Query: 161 EEEKLEAQMEF--ARYENETEILREQLRQREADRERQKQEWELKERHAEEQ 209
EEE+L + E AR E + + L QL +RE ++ E E E+ + +
Sbjct: 82 EEERLVQKEEQLDARAE-KLDNLENQLEEREKALSARELELEELEKQLDNE 131
>gnl|CDD|235316 PRK04863, mukB, cell division protein MukB; Provisional.
Length = 1486
Score = 30.3 bits (69), Expect = 1.3
Identities = 28/131 (21%), Positives = 51/131 (38%), Gaps = 18/131 (13%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
+ L E K+ + L E +LE+ + E + RER R ++L+
Sbjct: 537 RLLAEFCKRLGKNLDDEDELEQLQEELEARLESLSESVSEARER--RMALRQQLEQLQAR 594
Query: 162 EEKLEAQM-EFARYENETEILREQ-------------LRQREADRERQKQEWELKERHAE 207
++L A+ + ++ LREQ Q+ +RER+ ++ A
Sbjct: 595 IQRLAARAPAWLAAQDALARLREQSGEEFEDSQDVTEYMQQLLERERELTV--ERDELAA 652
Query: 208 EQRRCDEEAMR 218
++ DEE R
Sbjct: 653 RKQALDEEIER 663
>gnl|CDD|220402 pfam09787, Golgin_A5, Golgin subfamily A member 5. Members of this
family of proteins are involved in maintaining Golgi
structure. They stimulate the formation of Golgi stacks
and ribbons, and are involved in intra-Golgi retrograde
transport. Two main interactions have been
characterized: one with RAB1A that has been activated by
GTP-binding and another with isoform CASP of CUTL1.
Length = 509
Score = 29.8 bits (67), Expect = 1.5
Identities = 23/113 (20%), Positives = 52/113 (46%), Gaps = 5/113 (4%)
Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
+QL +L + + E ++ +L++ + +A E L+E ++ +++L
Sbjct: 217 LQQLLKLLRAEGE--SEKQELQQYRQKAHRILQSKEKRINFLKEGCLFEGLDSSTAQIEL 274
Query: 161 EEEKLEAQM---EFARYENETEILREQLRQREADRERQKQEWELKERHAEEQR 210
EE K E++ E + E + LR + + REA+ + + + + R +Q
Sbjct: 275 EELKHESEHVQEEITKLEGQIIQLRSEAQDREAEASGEAESFRKQPRELSQQI 327
>gnl|CDD|185727 cd08986, GH43_7, Glycosyl hydrolase family 43. This glycosyl
hydrolase family 43 (GH43) includes enzymes with
beta-1,4-xylosidase (xylan 1,4-beta-xylosidase; EC
3.2.1.37), beta-1,3-xylosidase (EC 3.2.1.-),
alpha-L-arabinofuranosidase (EC 3.2.1.55), arabinanase
(EC 3.2.1.99), xylanase (EC 3.2.1.8),
endo-alpha-L-arabinanase and galactan
1,3-beta-galactosidase (EC 3.2.1.145) activities. These
are inverting enzymes (i.e. they invert the
stereochemistry of the anomeric carbon atom of the
substrate) that have an aspartate as the catalytic
general base, a glutamate as the catalytic general acid
and another aspartate that is responsible for pKa
modulation and orienting the catalytic acid. Many of the
enzymes in this family display both
alpha-L-arabinofuranosidase and beta-D-xylosidase
activity using aryl-glycosides as substrates. A common
structural feature of GH43 enzymes is a 5-bladed
beta-propeller domain that contains the catalytic acid
and catalytic base. A long V-shaped groove, partially
enclosed at one end, forms a single extended
substrate-binding surface across the face of the
propeller.
Length = 269
Score = 29.7 bits (67), Expect = 1.7
Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 4/48 (8%)
Query: 58 LVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLY 105
L DD GL+ V S F + IG A + F+YG ++
Sbjct: 156 LKDDLSGLAGDPVRIDPSPTFYKDEIGHEGAFV----FKYGGKYYLFG 199
>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
protein; Reviewed.
Length = 782
Score = 29.8 bits (68), Expect = 1.8
Identities = 30/125 (24%), Positives = 57/125 (45%), Gaps = 17/125 (13%)
Query: 110 QKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK-ELKLEEEKLEAQ 168
+ E QK + E EA+ E + E L+E E + +EEA ++ + ++E K EA
Sbjct: 528 LERELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEAD 587
Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRM 228
EI++E LRQ + + +K E R+ +A ++ ++ +
Sbjct: 588 ----------EIIKE-LRQLQ-----KGGYASVKAHELIEARKRLNKANEKKEKKKKKQK 631
Query: 229 QQQDE 233
++Q+E
Sbjct: 632 EKQEE 636
Score = 29.4 bits (67), Expect = 2.5
Identities = 17/89 (19%), Positives = 36/89 (40%), Gaps = 2/89 (2%)
Query: 155 QKELKLEEEKLEAQMEFARYENETEILREQLRQREA--DRERQKQEWELKERHAEEQRRC 212
+ E +LE++ EA+ E E L E+ + + D+ ++ E E ++ E ++
Sbjct: 527 ELERELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEA 586
Query: 213 DEEAMRRQTEEIHLRMQQQDEELRRRHQE 241
DE + + + EL +
Sbjct: 587 DEIIKELRQLQKGGYASVKAHELIEARKR 615
Score = 29.0 bits (66), Expect = 3.2
Identities = 22/117 (18%), Positives = 52/117 (44%), Gaps = 18/117 (15%)
Query: 120 KLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETE 179
LEE + E + + E E L + E L++EL+ ++EKL+ + E +
Sbjct: 524 SLEELERELEQKAE----EAEALLKEA-----EKLKEELEEKKEKLQEE--------EDK 566
Query: 180 ILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQ-TEEIHLRMQQQDEEL 235
+L E ++ + + K+E + + + ++ +++ E R+ + +E+
Sbjct: 567 LLEEAEKEAQQAIKEAKKEADEIIKELRQLQKGGYASVKAHELIEARKRLNKANEKK 623
Score = 27.9 bits (63), Expect = 6.9
Identities = 18/69 (26%), Positives = 33/69 (47%), Gaps = 8/69 (11%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEF------ARYENETEILRERECRFVEEALQ 155
++L E ++K+E LQ+E ++ EA+ E A+ E + I R+ + A
Sbjct: 547 EKLKEELEEKKEKLQEE--EDKLLEEAEKEAQQAIKEAKKEADEIIKELRQLQKGGYASV 604
Query: 156 KELKLEEEK 164
K +L E +
Sbjct: 605 KAHELIEAR 613
>gnl|CDD|240830 cd12384, RRM_RBM24_RBM38_like, RNA recognition motif in
eukaryotic RNA-binding protein RBM24, RBM38 and similar
proteins. This subfamily corresponds to the RRM of
RBM24 and RBM38 from vertebrate, SUPpressor family
member SUP-12 from Caenorhabditis elegans and similar
proteins. Both RBM24 and RBM38 are preferentially
expressed in cardiac and skeletal muscle tissues. They
regulate myogenic differentiation by controlling the
cell cycle in a p21-dependent or -independent manner.
RBM24, also termed RNA-binding region-containing
protein 6, interacts with the 3'-untranslated region
(UTR) of myogenin mRNA and regulates its stability in
C2C12 cells. RBM38, also termed CLL-associated antigen
KW-5, or HSRNASEB, or RNA-binding region-containing
protein 1 (RNPC1), or ssDNA-binding protein SEB4, is a
direct target of the p53 family. It is required for
maintaining the stability of the basal and
stress-induced p21 mRNA by binding to their 3'-UTRs. It
also binds the AU-/U-rich elements in p63 3'-UTR and
regulates p63 mRNA stability and activity. SUP-12 is a
novel tissue-specific splicing factor that controls
muscle-specific splicing of the ADF/cofilin pre-mRNA in
C. elegans. All family members contain a conserved RNA
recognition motif (RRM), also termed RBD (RNA binding
domain) or RNP (ribonucleoprotein domain).
Length = 76
Score = 27.6 bits (62), Expect = 1.8
Identities = 14/34 (41%), Positives = 20/34 (58%), Gaps = 1/34 (2%)
Query: 3 IERAVVLVDDR-GNSKNEGIIEFTRKPAAAQALK 35
IE AVV+ D + G S+ G + F K +A +A K
Sbjct: 27 IEEAVVITDRQTGKSRGYGFVTFKDKESAERACK 60
>gnl|CDD|224143 COG1222, RPT1, ATP-dependent 26S proteasome regulatory subunit
[Posttranslational modification, protein turnover,
chaperones].
Length = 406
Score = 29.5 bits (67), Expect = 1.9
Identities = 20/74 (27%), Positives = 31/74 (41%), Gaps = 12/74 (16%)
Query: 138 ETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQ 197
EIL + E +E L K + + LE E +L + ++ EA+ R K+
Sbjct: 6 LDEILGDLESYEPQEYLNKLEDTKLKLLE---------KEKRLLLLEEQRLEAEGLRLKR 56
Query: 198 EWELKERHAEEQRR 211
E +R EE R
Sbjct: 57 EV---DRLREEIER 67
Score = 29.2 bits (66), Expect = 2.3
Identities = 17/72 (23%), Positives = 31/72 (43%), Gaps = 2/72 (2%)
Query: 112 EEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEF 171
EE + L + + E+ +T++ + + + L +E +LE E L + E
Sbjct: 1 EELDALDEILGDLESYEPQEYLNKLEDTKLKLLEKEKRLL--LLEEQRLEAEGLRLKREV 58
Query: 172 ARYENETEILRE 183
R E E L+E
Sbjct: 59 DRLREEIERLKE 70
>gnl|CDD|206563 pfam14395, COOH-NH2_lig, Phage phiEco32-like COOH.NH2 ligase-type
2. A family of COOH-NH2 ligases/GCS superfamily found
in the neighborhood of YheC/D-like ATP-grasp and the
CotE family of proteins in the firmicutes. Contextual
analysis suggests that it might be involved in cell wall
modification and spore coat biosynthesis.
Length = 261
Score = 29.2 bits (66), Expect = 2.0
Identities = 20/78 (25%), Positives = 32/78 (41%), Gaps = 14/78 (17%)
Query: 126 LEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQL 185
L +++ A YE + E LR V+ LEA + Y NE E L E +
Sbjct: 192 LSPELQEAFYEGDKEALRP----CVKGVWDD--------LEALPGYTDYRNEIEPLFEMI 239
Query: 186 RQREADRERQ--KQEWEL 201
+ + E +Q W++
Sbjct: 240 EEGQTWDEEVDLRQAWKI 257
>gnl|CDD|222631 pfam14259, RRM_6, RNA recognition motif (a.k.a. RRM, RBD, or RNP
domain).
Length = 69
Score = 27.1 bits (61), Expect = 2.0
Identities = 10/37 (27%), Positives = 15/37 (40%)
Query: 7 VVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQDGVFF 43
V LV ++ + +EF A ALK+ V
Sbjct: 28 VRLVRNKDRPRGFAFVEFASPEDAEAALKKLNGLVLD 64
>gnl|CDD|226654 COG4191, COG4191, Signal transduction histidine kinase regulating
C4-dicarboxylate transport system [Signal transduction
mechanisms].
Length = 603
Score = 29.6 bits (67), Expect = 2.0
Identities = 20/61 (32%), Positives = 27/61 (44%), Gaps = 6/61 (9%)
Query: 181 LREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQ 240
R +LR E R + E ++ER A+ R A R EI R +Q + LRR
Sbjct: 320 RRARLRLAELQEARAELERRVEERTADLTR-----ANARLQAEIAER-EQAEAALRRAQD 373
Query: 241 E 241
E
Sbjct: 374 E 374
>gnl|CDD|163562 TIGR03850, bind_CPR_0540, carbohydrate ABC transporter
substrate-binding protein, CPR_0540 family. Members of
this protein family are the substrate-binding protein of a
predicted carbohydrate transporter operon, together with
permease subunits of ABC transporter homology families.
This substrate-binding protein frequently co-occurs in
genomes with a family of disaccharide phosphorylases,
TIGR02336, suggesting that the molecule transported will
include
beta-D-galactopyranosyl-(1->3)-N-acetyl-D-glucosamine
and related carbohydrates. Members of this family are
sporadically distributed strain by strain, often in species with a
human host association, including Propionibacterium
acnes, Clostridium perfringens, and Bacillus cereus
[Transport and binding proteins, Carbohydrates, organic
alcohols, and acids].
Length = 437
Score = 29.7 bits (67), Expect = 2.1
Identities = 12/45 (26%), Positives = 27/45 (60%), Gaps = 4/45 (8%)
Query: 90 IGSFEFEYGTR-WKQLYELYKQKEEALQKELKLE---EEKLEAQM 130
+ +FE YGT+ W+++ E +++ E ++ EL + E+ + Q+
Sbjct: 38 VAAFEGGYGTKMWEEVVEAFEKSHEGVKVELTVSKNLEDVITPQI 82
>gnl|CDD|225177 COG2268, COG2268, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 548
Score = 29.4 bits (66), Expect = 2.4
Identities = 28/131 (21%), Positives = 58/131 (44%), Gaps = 7/131 (5%)
Query: 120 KLEEEKLEAQMEFARYENETEI-----LRERECRFVEEALQKELKLEEEKLEAQMEFARY 174
++ + +A++ E ETEI R+ + +E Q K E+ E ++ A
Sbjct: 215 RIAQVLQDAEIAENEAEKETEIAIAEANRDAKLVELEVEQQPAGKTAEQTREVKIILAET 274
Query: 175 ENETEILREQLRQREADRERQKQEWELKERHAEEQRRC-DEEAMRRQTEEIHLRMQQQDE 233
E E + + R REA++ E ++E A+ ++ +A+ + + L +Q++
Sbjct: 275 EAEVAAWKAETR-REAEQAEILAEQAIQEEKAQAEQEVQHAKALEAREMRVGLIERQKET 333
Query: 234 ELRRRHQENSI 244
EL + + I
Sbjct: 334 ELEPQERSYFI 344
Score = 27.9 bits (62), Expect = 6.5
Identities = 26/140 (18%), Positives = 53/140 (37%), Gaps = 5/140 (3%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQ 168
+ + EA Q E+ E+ E + + + + L RE R QKE +LE ++
Sbjct: 284 ETRREAEQAEILAEQAIQEEKAQAEQEVQHAKALEAREMRVGLIERQKETELEPQERSYF 343
Query: 169 MEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRR--CDEEAMRR--QTEEI 224
+ A+ + + E + EA + + E E +R A + E++
Sbjct: 344 INAAQRQAQEEA-KAAANIAEAIGAQAEAAVETARETEEAERAEQAALVAAAEAAEQEQV 402
Query: 225 HLRMQQQDEELRRRHQENSI 244
+ ++ + + Q I
Sbjct: 403 EIAVRAEAAKAEAEAQAAEI 422
>gnl|CDD|236272 PRK08475, PRK08475, F0F1 ATP synthase subunit B; Validated.
Length = 167
Score = 28.4 bits (64), Expect = 2.4
Identities = 24/84 (28%), Positives = 46/84 (54%), Gaps = 7/84 (8%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
++L E ++KE+AL+K LEE K +A++ + E IL ++ +E+ + +++
Sbjct: 67 EKLKESKEKKEDALKK---LEEAKEKAELIVETAKKEAYILTQK----IEKQTKDDIENL 119
Query: 162 EEKLEAQMEFARYENETEILREQL 185
+ E MEF + E E++ E L
Sbjct: 120 IKSFEELMEFEVRKMEREVVEEVL 143
>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
Length = 1036
Score = 29.5 bits (66), Expect = 2.5
Identities = 19/53 (35%), Positives = 29/53 (54%), Gaps = 6/53 (11%)
Query: 186 RQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRR 238
++RE E+ +E +ER AEEQRR +EE + + R Q + E +RR
Sbjct: 254 KRREL--EKLAKEEAERERQAEEQRRREEEKAAMEAD----RAQAKAEVEKRR 300
Score = 28.3 bits (63), Expect = 6.1
Identities = 18/59 (30%), Positives = 26/59 (44%), Gaps = 1/59 (1%)
Query: 151 EEALQKELKLEEEKLEA-QMEFARYENETEILREQLRQREADRERQKQEWELKERHAEE 208
E+ L +E + E EKL + E R E E+ EADR + K E E + +
Sbjct: 247 EDFLLEEKRRELEKLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKLQN 305
>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
biogenesis [Translation, ribosomal structure and
biogenesis].
Length = 1077
Score = 29.3 bits (65), Expect = 2.6
Identities = 30/183 (16%), Positives = 62/183 (33%), Gaps = 19/183 (10%)
Query: 47 SLKPVIVEPLDLVDDEEGLSERTVSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYE 106
V + + E L E + + + RF + + G E
Sbjct: 540 FFDVSKVANESISSNHEKLMESEFEELKKKWSSLAQLKSRFQKDATLDSIEGEE-----E 594
Query: 107 LYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLE 166
L + E+ ++L+ EE + +ME +R + T E E ++E ++E+L
Sbjct: 595 LIQDDEKGNFEDLEDEENSSDNEMEESRGSSVTAENEESADEVDYETEREENARKKEELR 654
Query: 167 AQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHL 226
E L +R ++ + ++R EEQ + + E +
Sbjct: 655 GNFE--------------LEERGDPEKKDVDWYTEEKRKIEEQLKINRSEFETMVPESRV 700
Query: 227 RMQ 229
++
Sbjct: 701 VIE 703
>gnl|CDD|178867 PRK00106, PRK00106, hypothetical protein; Provisional.
Length = 535
Score = 29.1 bits (65), Expect = 2.7
Identities = 30/123 (24%), Positives = 53/123 (43%), Gaps = 3/123 (2%)
Query: 116 QKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYE 175
+ E E K A+ E + E + + E R E +++E K E ++L+ Q+E E
Sbjct: 50 KAERDAEHIKKTAKRESKALKKELLLEAKEEARKYREEIEQEFKSERQELK-QIESRLTE 108
Query: 176 NETEILR--EQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDE 233
T + R E L +E E ++Q K +H +E+ E+ ++ E+
Sbjct: 109 RATSLDRKDENLSSKEKTLESKEQSLTDKSKHIDEREEQVEKLEEQKKAELERVAALSQA 168
Query: 234 ELR 236
E R
Sbjct: 169 EAR 171
>gnl|CDD|227606 COG5281, COG5281, Phage-related minor tail protein [Function
unknown].
Length = 833
Score = 29.2 bits (65), Expect = 2.9
Identities = 23/160 (14%), Positives = 49/160 (30%), Gaps = 15/160 (9%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEAL-QKELKLEEEKLEA 167
++ + + Q+ E ++ EE + + ++ +L+
Sbjct: 480 AERSQEQMTAALKALLAFQQQIADLSGAKEKASDQKSLLWKAEEQYALLKEEAKQRQLQE 539
Query: 168 QMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
Q ++ ET QL + +Q +EL + A Q+ + R +
Sbjct: 540 QKALLEHKKETLEYTSQLAELLD---QQADRFELSAQAAGSQKERGSDLYREALAQNAAA 596
Query: 228 MQQQDEELRRRHQENSIFMQVIVWLGDLKQGVYQLGLTEG 267
+ + EL DL QG ++ G
Sbjct: 597 LNKALNELAAYWSAL-----------DLLQGDWKAGALSA 625
>gnl|CDD|171793 PRK12880, PRK12880, 3-oxoacyl-(acyl carrier protein) synthase III;
Reviewed.
Length = 353
Score = 28.8 bits (64), Expect = 3.0
Identities = 16/67 (23%), Positives = 35/67 (52%), Gaps = 4/67 (5%)
Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEI---LRERECRFVEEALQKE 157
++QL LY L+ E + + +EF++ +E +I L + ++ + +++E
Sbjct: 223 FRQLENLYMDGANIFNMALECEPKSFKEILEFSK-VDEKDIAFHLFHQSNAYLVDCIKEE 281
Query: 158 LKLEEEK 164
LKL ++K
Sbjct: 282 LKLNDDK 288
>gnl|CDD|129694 TIGR00606, rad50, rad50. All proteins in this family for which
functions are known are involved in recombination,
recombinational repair, and/or non-homologous end
joining. They are components of an exonuclease complex
with MRE11 homologs. This family is distantly related to
the SbcC family of bacterial proteins. This family is
based on the phylogenomic analysis of JA Eisen (1999,
Ph.D. Thesis, Stanford University).
Length = 1311
Score = 29.2 bits (65), Expect = 3.0
Identities = 33/147 (22%), Positives = 63/147 (42%), Gaps = 16/147 (10%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFA-RYENETEILRERECRFVEEALQKELKL 160
+ + YK+K ++ ++ +E +LE+ E YENE + L+ R + +E L K +KL
Sbjct: 209 LKYLKQYKEKACEIRDQITSKEAQLESSREIVKSYENELDPLKNRL-KEIEHNLSKIMKL 267
Query: 161 EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQ 220
+ NE + L+ + +Q E D + + E + +EQ +R
Sbjct: 268 D--------------NEIKALKSRKKQMEKDNSELELKMEKVFQGTDEQLNDLYHNHQRT 313
Query: 221 TEEIHLRMQQQDEELRRRHQENSIFMQ 247
E + EL + ++E + Q
Sbjct: 314 VREKERELVDCQRELEKLNKERRLLNQ 340
>gnl|CDD|220098 pfam09057, Smac_DIABLO, Second Mitochondria-derived Activator of
Caspases. Second Mitochondria-derived Activator of
Caspases promotes apoptosis by activating caspases in
the cytochrome c/Apaf-1/caspase-9 pathway, and by
opposing the inhibitory activity of inhibitor of
apoptosis proteins (XIAP-BIR3). The protein assumes an
elongated three-helix bundle structure, and forms a
dimer in solution.
Length = 234
Score = 28.7 bits (64), Expect = 3.1
Identities = 35/158 (22%), Positives = 57/158 (36%), Gaps = 12/158 (7%)
Query: 70 VSKKSSDYFKQRSIGPRFATIGSFEFEYGTRWKQLYELYKQKEEALQKELKLEEEKLEAQ 129
V+ ++ + Q + A + + EY L L KQ ++ K +EE+ +
Sbjct: 73 VTDSANTFLSQTT----LALVDALT-EYTKAVYTLISLQKQYTASIGKMNPVEEDAIWQV 127
Query: 130 MEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQM----EFARYENETEILREQL 185
+ R E R EC E + L E EA + A + Q
Sbjct: 128 IIGQRVE---VSDRLEECLKFESNWMTAVNLSEMAAEAAYNSGADQASVAARNHLQVAQS 184
Query: 186 RQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
+ E + ++ E +L E AEE +R E A E
Sbjct: 185 QVEEVRQLSKEAEKKLAESKAEEIQRMAEYASSIDLSE 222
>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain. This is a family of proteins of
approximately 300 residues, found in plants and
vertebrates. They contain a highly conserved DDRGK
motif.
Length = 189
Score = 28.5 bits (64), Expect = 3.3
Identities = 18/77 (23%), Positives = 39/77 (50%)
Query: 155 QKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE 214
K+ EEK + + E E E ++ +RE +R+ +++ E +E+ EE+ R +
Sbjct: 5 AKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKER 64
Query: 215 EAMRRQTEEIHLRMQQQ 231
E R+ +E + +++
Sbjct: 65 EEQARKEQEEYEKLKSS 81
>gnl|CDD|204414 pfam10211, Ax_dynein_light, Axonemal dynein light chain. Axonemal
dynein light chain proteins play a dynamic role in
flagellar and cilia motility. Eukaryotic cilia and
flagella are complex organelles consisting of a core
structure, the axoneme, which is composed of nine
microtubule doublets forming a cylinder that surrounds a
pair of central singlet microtubules. This
ultra-structural arrangement seems to be one of the most
stable micro-tubular assemblies known and is responsible
for the flagellar and ciliary movement of a large number
of organisms ranging from protozoan to mammals. This
light chain interacts directly with the N-terminal half
of the heavy chains.
Length = 189
Score = 28.3 bits (64), Expect = 3.6
Identities = 23/85 (27%), Positives = 43/85 (50%), Gaps = 10/85 (11%)
Query: 153 ALQKELKLEEEKLEAQMEFARYENETEILREQ---LRQREADRERQKQEWELKERHAEEQ 209
++K L+ E+ K E + E + E E E L ++ L + E++++E ER EE+
Sbjct: 111 GMRKALQAEQGKSELEQEIKKLEEEKEELEKRVAELEAKLEAIEKREEE----ERQIEEK 166
Query: 210 RRCDE-EAMRRQTEEIHLRMQQQDE 233
R DE +++Q + L+ Q +
Sbjct: 167 RHADEIAFLKKQNQ--QLKSQLEQI 189
>gnl|CDD|217803 pfam03938, OmpH, Outer membrane protein (OmpH-like). This family
includes outer membrane proteins such as OmpH among
others. Skp (OmpH) has been characterized as a molecular
chaperone that interacts with unfolded proteins as they
emerge in the periplasm from the Sec translocation
machinery.
Length = 157
Score = 28.0 bits (63), Expect = 3.6
Identities = 18/88 (20%), Positives = 43/88 (48%), Gaps = 8/88 (9%)
Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
K + +++ + LQ EL+ +E++L+ E + + + L E E K+ +L
Sbjct: 33 GKAAQKQLEKEFKKLQAELQKKEKELQK--EEQKLQKQAATLSE------EARKAKQQEL 84
Query: 161 EEEKLEAQMEFARYENETEILREQLRQR 188
++++ E Q + + E + +++L Q
Sbjct: 85 QQKQQELQQKQQAAQQELQQKQQELLQP 112
>gnl|CDD|163506 TIGR03794, NHLM_micro_HlyD, NHLM bacteriocin system secretion
protein. Members of this protein family are homologs of
the HlyD membrane fusion protein of type I secretion
systems. Their occurrence in prokaryotic genomes is
associated with the occurrence of a novel class of
microcin (small bacteriocins) with a leader peptide
region related to nitrile hydratase. We designate the
class of bacteriocin as Nitrile Hydratase Leader
Microcin, or NHLM. This family, therefore, is designated
as NHLM bacteriocin system secretion protein. Some but
not all NHLM-class putative microcins belong to the TOMM
(thiazole/oxazole modified microcin) class as assessed
by the presence of the scaffolding protein and/or
cyclodehydratase in the same gene clusters [Transport
and binding proteins, Amino acids, peptides and amines,
Cellular processes, Biosynthesis of natural products].
Length = 421
Score = 28.7 bits (64), Expect = 3.6
Identities = 30/145 (20%), Positives = 53/145 (36%), Gaps = 19/145 (13%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL 165
EL ++ +E+ QK +L+E+ E Y + RER + + LEE
Sbjct: 93 ELRERLQESYQKLTQLQEQ----LEEVRNYTGRLKEGRERH------FQKSKEALEETIG 142
Query: 166 EAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE---------EA 216
+ E A E R L + A +R + + E+ D+ +A
Sbjct: 143 RLREELAALSREVGKQRGLLSRGLATFKRDRILQQQWREEQEKYDAADKARAIYALQTKA 202
Query: 217 MRRQTEEIHLRMQQQDEELRRRHQE 241
R E + + Q D +L ++
Sbjct: 203 DERNLETVLQSLSQADFQLAGVAEK 227
>gnl|CDD|240227 PTZ00009, PTZ00009, heat shock 70 kDa protein; Provisional.
Length = 653
Score = 28.6 bits (64), Expect = 3.9
Identities = 21/80 (26%), Positives = 41/80 (51%), Gaps = 3/80 (3%)
Query: 131 EFARYENETEILRER-ECRFVEEALQKELK--LEEEKLEAQMEFARYENETEILREQLRQ 187
E +Y+ E E RER E + E +K L++EK++ ++ + + + E L
Sbjct: 523 EAEKYKAEDEANRERVEAKNGLENYCYSMKNTLQDEKVKGKLSDSDKATIEKAIDEALEW 582
Query: 188 READRERQKQEWELKERHAE 207
E ++ +K+E+E K++ E
Sbjct: 583 LEKNQLAEKEEFEHKQKEVE 602
>gnl|CDD|226018 COG3487, IrpA, Uncharacterized iron-regulated protein [Inorganic
ion transport and metabolism].
Length = 446
Score = 28.7 bits (64), Expect = 3.9
Identities = 18/52 (34%), Positives = 27/52 (51%), Gaps = 2/52 (3%)
Query: 82 SIGPRFATIGSFEFEYGTRWK--QLYELYKQKEEALQKELKLEEEKLEAQME 131
+IG R A +GS+ G+ K L +L K+ A KELK + A+M+
Sbjct: 326 AIGIRNAYLGSYTRVDGSVVKGPSLADLVAAKDAAANKELKAKLAATVAKMQ 377
>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein. This family
consists of several Borrelia P83/P100 antigen proteins.
Length = 489
Score = 28.4 bits (63), Expect = 4.1
Identities = 8/40 (20%), Positives = 25/40 (62%)
Query: 179 EILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMR 218
+ L+E+L +++ D ++ +Q+ + + +A++QR + +
Sbjct: 216 QQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQ 255
Score = 28.0 bits (62), Expect = 6.4
Identities = 22/133 (16%), Positives = 59/133 (44%), Gaps = 10/133 (7%)
Query: 120 KLEEEKLEAQMEFAR-----YENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARY 174
L E+ E + F R E E++ +R + EE +K++ ++ + +A
Sbjct: 185 ALREDN-EKGVNFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQKADFAQDNA 243
Query: 175 ENETEILREQLRQ-READRERQKQEWELKERHAEEQRRCDEEA---MRRQTEEIHLRMQQ 230
+ + + +R++ ++ + + + ++ AE Q+R E+A +++ EE
Sbjct: 244 DKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDH 303
Query: 231 QDEELRRRHQENS 243
+ +L++ + +
Sbjct: 304 KAFDLKQESKASE 316
>gnl|CDD|224241 COG1322, COG1322, Predicted nuclease of restriction
endonuclease-like fold, RmuC family [General function
prediction only].
Length = 448
Score = 28.5 bits (64), Expect = 4.1
Identities = 19/85 (22%), Positives = 32/85 (37%), Gaps = 7/85 (8%)
Query: 164 KLEAQMEFARYENETEILREQLRQREA-------DRERQKQEWELKERHAEEQRRCDEEA 216
LE + + E E LR R +A + K + + + EQ + E+
Sbjct: 40 VLEQLLLLLAFRAEAEQLRTFARSLQALNLELIQELNELKARLQQQLLQSREQLQLLIES 99
Query: 217 MRRQTEEIHLRMQQQDEELRRRHQE 241
+ + + E + EEL RR E
Sbjct: 100 LAQLSSEFQELANEIFEELNRRLAE 124
>gnl|CDD|226513 COG4026, COG4026, Uncharacterized protein containing TOPRIM domain,
potential nuclease [General function prediction only].
Length = 290
Score = 28.3 bits (63), Expect = 4.3
Identities = 29/93 (31%), Positives = 44/93 (47%), Gaps = 2/93 (2%)
Query: 103 QLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEE 162
+L EL K+KEE L++ +LE E E Q R E E L E + E + + +E
Sbjct: 143 KLEELQKEKEELLKELEELEAEYEEVQERLKRLEVENSRLEEMLKKLPGEVYDLKKRWDE 202
Query: 163 EKLEAQMEFARYENETEILREQLRQREADRERQ 195
LE +E E +++++E L D E Q
Sbjct: 203 --LEPGVELPEEELISDLVKETLNLAPKDIEGQ 233
>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
Length = 1021
Score = 28.5 bits (63), Expect = 4.3
Identities = 24/80 (30%), Positives = 39/80 (48%), Gaps = 4/80 (5%)
Query: 169 MEFARYENETEILREQLRQREADR-ERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR 227
+E R E RE+L + +R ER++ E E ER E+ R + + + R E R
Sbjct: 455 LEKKRIERLEREERERLERERMERIERERLERERLERERLERDRLERDRLDRLERERVDR 514
Query: 228 MQQQDEELRRRHQENSIFMQ 247
+++ E RR NS F++
Sbjct: 515 LERDRLEKARR---NSYFLK 531
>gnl|CDD|185616 PTZ00436, PTZ00436, 60S ribosomal protein L19-like protein;
Provisional.
Length = 357
Score = 28.4 bits (62), Expect = 4.4
Identities = 16/46 (34%), Positives = 28/46 (60%), Gaps = 3/46 (6%)
Query: 156 KELKLEEEKLEAQMEFARYENET---EILREQLRQREADRERQKQE 198
K K +E +L Q+ R ++E + +++LR+RE DRER ++E
Sbjct: 146 KNEKKKERQLAEQLAAKRLKDEQHRHKARKQELRKREKDRERARRE 191
>gnl|CDD|172358 PRK13831, PRK13831, conjugal transfer protein TrbI; Provisional.
Length = 432
Score = 28.5 bits (64), Expect = 4.5
Identities = 14/49 (28%), Positives = 22/49 (44%), Gaps = 3/49 (6%)
Query: 184 QLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQD 232
Q QRE R + E E + R EQ +E+ +R + + R+Q
Sbjct: 111 QPGQREERRPTLESEEEWRARLKREQ---EEQYLRERQRQRMARLQANA 156
>gnl|CDD|214636 smart00360, RRM, RNA recognition motif.
Length = 73
Score = 26.4 bits (59), Expect = 4.5
Identities = 10/38 (26%), Positives = 18/38 (47%), Gaps = 1/38 (2%)
Query: 2 NIERAVVLVD-DRGNSKNEGIIEFTRKPAAAQALKRCQ 38
+E ++ D + G SK +EF + A +AL+
Sbjct: 25 KVESVRLVRDKETGKSKGFAFVEFESEEDAEKALEALN 62
>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
Length = 1388
Score = 28.5 bits (64), Expect = 4.7
Identities = 29/167 (17%), Positives = 71/167 (42%), Gaps = 36/167 (21%)
Query: 106 ELYKQKEEALQKELKLEEEKLEAQMEFARY--ENETEILRERECRFVEEALQKELKLEEE 163
+LYK+++E L +L+ E +L ++ F ++ E I + ++ L KELK
Sbjct: 991 DLYKKRKEYLLGKLERELARLSNKVRFIKHVINGELVITNAK-----KKDLVKELK---- 1041
Query: 164 KLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCD---------- 213
++ + R+++ + E++ E + + E + ++ E
Sbjct: 1042 ----KLGYVRFKDIIKKKSEKITAEEEEGAEEDDEADDEDDEEELGAAVSYDYLLSMPIW 1097
Query: 214 ---EEAMRRQTEEIHLRMQQQDEELRRRHQENSIFMQVIVWLGDLKQ 257
+E + + E+ + +++ E+L+ ++ +WL DL +
Sbjct: 1098 SLTKEKVEKLNAELE-KKEKELEKLKNTTPKD-------MWLEDLDK 1136
>gnl|CDD|192773 pfam11559, ADIP, Afadin- and alpha-actinin-Binding. This family
is found in mammals where it is localised at cell-cell
adherens junctions, and in Sch. pombe and other fungi
where it anchors spindle-pole bodies to spindle
microtubules. It is a coiled-coil structure, and in
Sch. pombe, it is required for anchoring the minus end of
spindle microtubules to the centrosome equivalent, the
spindle-pole body. The name ADIP derives from the family
being composed of Afadin- and alpha-Actinin-Binding
Proteins Localised at Cell-Cell Adherens Junctions.
Length = 149
Score = 27.6 bits (62), Expect = 4.8
Identities = 27/107 (25%), Positives = 46/107 (42%), Gaps = 18/107 (16%)
Query: 142 LRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWEL 201
R+R+ F E + KLE E Q R + + E L ER+ +
Sbjct: 44 QRDRDLEFRESLEETLRKLEAEIERLQNTIERLKTQLEDL-----------ERELALLQA 92
Query: 202 KERHAEEQRRCDEEAMRRQTEEIHL-------RMQQQDEELRRRHQE 241
KER E++ + E+ ++ + EE+ R Q + EL++R +E
Sbjct: 93 KERQLEKKLKTLEQKLKNEKEEVQRLKNIIQQRKTQYNHELKKRDRE 139
>gnl|CDD|233720 TIGR02091, glgC, glucose-1-phosphate adenylyltransferase. This
enzyme, glucose-1-phosphate adenylyltransferase, is also
called ADP-glucose pyrophosphorylase. The plant form is
an alpha2,beta2 heterodimer, allosterically regulated in
plants. Both subunits are homologous and included in
this model. In bacteria, both homomeric forms of GlgC
and more active heterodimers of GlgC and GlgD have been
described. This model describes the GlgC subunit only.
This enzyme appears in variants of glycogen synthesis
pathways that use ADP-glucose, rather than UDP-glucose
as in animals [Energy metabolism, Biosynthesis and
degradation of polysaccharides].
Length = 361
Score = 28.4 bits (64), Expect = 5.0
Identities = 15/55 (27%), Positives = 25/55 (45%), Gaps = 12/55 (21%)
Query: 7 VVLVDDRGNSKNEGIIEFTRKPAAAQALKRCQD------GVF-FLTQSLKPVIVE 54
V+ VD+ G I++F KPA ++ D G++ F LK ++ E
Sbjct: 158 VMQVDEDGR-----IVDFEEKPANPPSIPGMPDFALASMGIYIFDKDVLKELLEE 207
>gnl|CDD|99747 cd06454, KBL_like, KBL_like; this family belongs to the pyridoxal
phosphate (PLP)-dependent aspartate aminotransferase
superfamily (fold I). The major groups in this CD
correspond to serine palmitoyltransferase (SPT),
5-aminolevulinate synthase (ALAS),
8-amino-7-oxononanoate synthase (AONS), and
2-amino-3-ketobutyrate CoA ligase (KBL). SPT is
responsible for the condensation of L-serine with
palmitoyl-CoA to produce 3-ketodihydrosphingosine, the
reaction of the first step in sphingolipid biosynthesis.
ALAS is involved in heme biosynthesis; it catalyzes the
synthesis of 5-aminolevulinic acid from glycine and
succinyl-coenzyme A. AONS catalyzes the decarboxylative
condensation of L-alanine and pimeloyl-CoA in the first
committed step of biotin biosynthesis. KBL catalyzes the
second reaction step of the metabolic degradation
pathway for threonine, converting 2-amino-3-ketobutyrate
to glycine and acetyl-CoA. The members of this CD are
widely found in all three forms of life.
Length = 349
Score = 28.3 bits (64), Expect = 5.2
Identities = 16/62 (25%), Positives = 23/62 (37%), Gaps = 17/62 (27%)
Query: 226 LRMQQQDEELRRRHQENSIFMQVIVWLGDLKQGVYQLGLTEG--------PFICECNNKM 277
L + Q E R R QEN + L++G+ +LG G P I + K
Sbjct: 246 LEVLQGGPERRERLQENVRY---------LRRGLKELGFPVGGSPSHIIPPLIGDDPAKA 296
Query: 278 KN 279
Sbjct: 297 VA 298
>gnl|CDD|215969 pfam00521, DNA_topoisoIV, DNA gyrase/topoisomerase IV, subunit A.
Length = 427
Score = 28.3 bits (64), Expect = 5.4
Identities = 29/119 (24%), Positives = 51/119 (42%), Gaps = 21/119 (17%)
Query: 134 RYENETEILRERECRFVE---EALQKELKLEEEKLEAQMEFARYENETEILREQLRQREA 190
+Y N EIL+E F+E E ++ + EKLE ++ E L + L + +
Sbjct: 302 KYLNLKEILKE----FLEHRLEVYKRRKEYLLEKLEERLHIL------EGLLKALNKIDF 351
Query: 191 DRERQKQEWELKERHAEEQRRCDEE--------AMRRQTEEIHLRMQQQDEELRRRHQE 241
E + +LK+ E E +RR T+E +++++ EEL + E
Sbjct: 352 VIEVIRGSIDLKKAKKELIEELSEIQADYLLDMRLRRLTKEEIEKLEKEIEELEKEIAE 410
>gnl|CDD|241110 cd12666, RRM2_RAVER2, RNA recognition motif 2 in vertebrate
ribonucleoprotein PTB-binding 2 (raver-2). This
subgroup corresponds to the RRM2 of raver-2, a novel
member of the heterogeneous nuclear ribonucleoprotein
(hnRNP) family. It is present in vertebrates and shows
high sequence homology to raver-1, a ubiquitously
expressed co-repressor of the nucleoplasmic splicing
repressor polypyrimidine tract-binding protein
(PTB)-directed splicing of select mRNAs. In contrast,
raver-2 exerts a distinct spatio-temporal expression
pattern during embryogenesis and is mainly limited to
differentiated neurons and glia cells. Although it
displays nucleo-cytoplasmic shuttling in heterokaryons,
raver-2 localizes to the nucleus in glia cells and
neurons. Raver-2 can interact with PTB and may
participate in PTB-mediated RNA-processing. However,
there is no evidence indicating that raver-2 can bind
to cytoplasmic proteins. Raver-2 contains three
N-terminal RNA recognition motifs (RRMs), also termed
RBDs (RNA binding domains) or RNPs (ribonucleoprotein
domains), two putative nuclear localization signals
(NLS) at the N- and C-termini, a central leucine-rich
region, and a C-terminal region harboring two
[SG][IL]LGxxP motifs. Raver-2 binds to PTB through the
SLLGEPP motif only, and binds to RNA through its RRMs.
Length = 77
Score = 26.4 bits (58), Expect = 5.4
Identities = 12/33 (36%), Positives = 22/33 (66%), Gaps = 1/33 (3%)
Query: 2 NIERAVVLVDD-RGNSKNEGIIEFTRKPAAAQA 33
NIER ++ + G+SK G +E+ +K +A++A
Sbjct: 25 NIERCFLVYSEVTGHSKGYGFVEYMKKDSASKA 57
>gnl|CDD|240835 cd12389, RRM2_RAVER, RNA recognition motif 2 in ribonucleoprotein
PTB-binding raver-1, raver-2 and similar proteins.
This subfamily corresponds to the RRM2 of raver-1 and
raver-2. Raver-1 is a ubiquitously expressed
heterogeneous nuclear ribonucleoprotein (hnRNP) that
serves as a co-repressor of the nucleoplasmic splicing
repressor polypyrimidine tract-binding protein
(PTB)-directed splicing of select mRNAs. It shuttles
between the cytoplasm and the nucleus and can
accumulate in the perinucleolar compartment, a dynamic
nuclear substructure that harbors PTB. Raver-1 also
modulates focal adhesion assembly by binding to the
cytoskeletal proteins, including alpha-actinin,
vinculin, and metavinculin (an alternatively spliced
isoform of vinculin) at adhesion complexes,
particularly in differentiated muscle tissue. Raver-2
is a novel member of the heterogeneous nuclear
ribonucleoprotein (hnRNP) family. It shows high
sequence homology to raver-1. Raver-2 exerts a
spatio-temporal expression pattern during embryogenesis
and is mainly limited to differentiated neurons and
glia cells. Although it displays nucleo-cytoplasmic
shuttling in heterokaryons, raver-2 localizes to the
nucleus in glia cells and neurons. Raver-2 can interact
with PTB and may participate in PTB-mediated
RNA-processing. However, there is no evidence
indicating that raver-2 can bind to cytoplasmic
proteins. Both raver-1 and raver-2 contain three
N-terminal RNA recognition motifs (RRMs), also termed
RBDs (RNA binding domains) or RNPs (ribonucleoprotein
domains), two putative nuclear localization signals
(NLS) at the N- and C-termini, a central leucine-rich
region, and a C-terminal region harboring two
[SG][IL]LGxxP motifs. They bind to RNA through the
RRMs. In addition, the two [SG][IL]LGxxP motifs serve
as the PTB-binding motifs in raver-1. However, raver-2
interacts with PTB through the SLLGEPP motif only.
Length = 77
Score = 26.1 bits (58), Expect = 5.7
Identities = 10/33 (30%), Positives = 18/33 (54%), Gaps = 1/33 (3%)
Query: 2 NIERAVVLVDDR-GNSKNEGIIEFTRKPAAAQA 33
+ER ++ + G SK G +E+ K +A +A
Sbjct: 25 AVERCFLVYSESTGESKGYGFVEYASKASALKA 57
>gnl|CDD|234084 TIGR03007, pepcterm_ChnLen, polysaccharide chain length determinant
protein, PEP-CTERM locus subfamily. Members of this
protein family belong to the family of polysaccharide
chain length determinant proteins (pfam02706). All are
found in species that encode the PEP-CTERM/exosortase
system predicted to act in protein sorting in a number
of Gram-negative bacteria, and are found near the epsH
homolog that is the putative exosortase gene [Cell
envelope, Biosynthesis and degradation of surface
polysaccharides and lipopolysaccharides].
Length = 498
Score = 28.1 bits (63), Expect = 5.7
Identities = 22/98 (22%), Positives = 34/98 (34%), Gaps = 8/98 (8%)
Query: 103 QLYELYKQKEEALQKELKLEEEKLEA-------QMEFARYENETEILRERECRFVEEALQ 155
++ +L +QKEE + E A Q+E A E E L R +
Sbjct: 283 EIAQLEEQKEEEGSAKNGGPERGEIANPVYQQLQIELAEAEAEIASLEARVAELTARIER 342
Query: 156 KELKLEEEKLEAQMEFARYENETEILREQLRQREADRE 193
E L E + E + + E+ + Q RE
Sbjct: 343 LESLLRTIP-EVEAELTQLNRDYEVNKSNYEQLLTRRE 379
>gnl|CDD|114629 pfam05917, DUF874, Helicobacter pylori protein of unknown function
(DUF874). This family consists of several hypothetical
proteins specific to Helicobacter pylori. The function
of this family is unknown.
Length = 417
Score = 27.9 bits (61), Expect = 5.9
Identities = 24/114 (21%), Positives = 58/114 (50%), Gaps = 1/114 (0%)
Query: 129 QMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQR 188
++E A+ + E E R+R + E Q+E K E+EK + + E N ++I EQ +Q+
Sbjct: 127 KIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELAN-SQIKAEQEKQK 185
Query: 189 EADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRRHQEN 242
+++ ++ + K + + + E +++TE + ++ ++ + ++N
Sbjct: 186 TEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQDLIKEQKDFIKEAEQN 239
>gnl|CDD|215612 PLN03169, PLN03169, chalcone synthase family protein; Provisional.
Length = 391
Score = 27.7 bits (62), Expect = 6.4
Identities = 20/58 (34%), Positives = 31/58 (53%), Gaps = 13/58 (22%)
Query: 153 ALQKELKLEEEKLE----AQMEFARYENET-----EILREQLRQREADRERQKQEWEL 201
L+K+LKL EKLE A M++ + T E +RE+L+++ + E EW L
Sbjct: 318 RLEKKLKLAPEKLECSRRALMDYGNVSSNTIVYVLEYMREELKKKGEEDE----EWGL 371
>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
Length = 880
Score = 28.1 bits (63), Expect = 6.5
Identities = 36/128 (28%), Positives = 53/128 (41%), Gaps = 14/128 (10%)
Query: 117 KELKLEEEKLEAQMEFAR--YENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARY 174
EL E E+ E Q E AR + E+L E E E ++ LE E + + A
Sbjct: 216 AELDEEIERYEEQREQARETRDEADEVLEEHE-----ERREELETLEAEIEDLRETIAET 270
Query: 175 ENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLR-MQQQDE 233
E E E L E++R E EL+E + + + E ++ +DE
Sbjct: 271 EREREELAEEVRDLRERLE------ELEEERDDLLAEAGLDDADAEAVEARREELEDRDE 324
Query: 234 ELRRRHQE 241
ELR R +E
Sbjct: 325 ELRDRLEE 332
>gnl|CDD|227352 COG5019, CDC3, Septin family protein [Cell division and chromosome
partitioning / Cytoskeleton].
Length = 373
Score = 28.1 bits (63), Expect = 6.5
Identities = 22/96 (22%), Positives = 41/96 (42%), Gaps = 6/96 (6%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
LYE Y+ ++ L + E ++ E RE + +F E+ +KE +LE
Sbjct: 281 NLLYENYRTEK------LSGLKNSGEPSLKEIHEARLNEEERELKKKFTEKIREKEKRLE 334
Query: 162 EEKLEAQMEFARYENETEILREQLRQREADRERQKQ 197
E + E ++ E ++++L E E+ K
Sbjct: 335 ELEQNLIEERKELNSKLEEIQKKLEDLEKRLEKLKS 370
>gnl|CDD|222878 PHA02562, 46, endonuclease subunit; Provisional.
Length = 562
Score = 28.1 bits (63), Expect = 6.7
Identities = 17/100 (17%), Positives = 42/100 (42%), Gaps = 4/100 (4%)
Query: 99 TRWKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKEL 158
+ K+L ++ + A+ + ++ +E E + +N+ ++ V++A
Sbjct: 306 DKLKELQHSLEKLDTAIDELEEIMDEFNEQSKKLLELKNKISTNKQSLITLVDKAK---- 361
Query: 159 KLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQE 198
K++ E Q EF E L+++L + + +E
Sbjct: 362 KVKAAIEELQAEFVDNAEELAKLQDELDKIVKTKSELVKE 401
>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein. This entry is a highly
conserved protein present in eukaryotes.
Length = 680
Score = 28.0 bits (62), Expect = 6.8
Identities = 28/130 (21%), Positives = 57/130 (43%), Gaps = 16/130 (12%)
Query: 114 ALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK---ELKLEEEKLEAQME 170
L+KE + + KL + + + + ++ ++ E R EA + E +L EEK + E
Sbjct: 452 QLKKENDMLQTKLNSMVSAKQKDKQS--MQSMEKRLKSEADSRVNAEKQLAEEKKRKKEE 509
Query: 171 -------FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEE 223
A+ E E L+Q + D E + ++ E + EE+ R + ++ +E
Sbjct: 510 EETAARAAAQAAASREECAESLKQAKQDLEMEIKKLEHDLKLKEEECR----MLEKEAQE 565
Query: 224 IHLRMQQQDE 233
+ + + E
Sbjct: 566 LRKYQESEKE 575
>gnl|CDD|225159 COG2250, COG2250, Uncharacterized conserved protein related to
C-terminal domain of eukaryotic chaperone, SACSIN
[Function unknown].
Length = 132
Score = 27.0 bits (60), Expect = 6.8
Identities = 15/66 (22%), Positives = 31/66 (46%)
Query: 101 WKQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL 160
++L + EE L+ +LE+ + ++ A YE E+ + + + + +K L+L
Sbjct: 65 LRELSRELEVPEEILECARELEKRYILSRYPDAEYEGPLELYSKEDAEELLKTAEKVLEL 124
Query: 161 EEEKLE 166
E L
Sbjct: 125 VEGLLG 130
>gnl|CDD|184696 PRK14474, PRK14474, F0F1 ATP synthase subunit B; Provisional.
Length = 250
Score = 27.5 bits (61), Expect = 6.8
Identities = 22/77 (28%), Positives = 33/77 (42%), Gaps = 7/77 (9%)
Query: 161 EEEKLEAQMEFARYENETEILREQLR------QREADRERQKQEWELKERHAEEQRRCDE 214
E+ + EA E RY + + L +Q Q AD +RQ E +E R
Sbjct: 49 EQRQQEAGQEAERYRQKQQSLEQQRASFMAQAQEAADEQRQHLLNEARE-DVATARDEWL 107
Query: 215 EAMRRQTEEIHLRMQQQ 231
E + R+ +E +QQQ
Sbjct: 108 EQLEREKQEFFKALQQQ 124
>gnl|CDD|220410 pfam09798, LCD1, DNA damage checkpoint protein. This is a family
of proteins which regulate checkpoint kinases. In
Schizosaccharomyces pombe this protein is called Rad26
and in Saccharomyces cerevisiae it is called LCD1.
Length = 648
Score = 27.9 bits (62), Expect = 7.1
Identities = 18/56 (32%), Positives = 31/56 (55%), Gaps = 5/56 (8%)
Query: 180 ILREQLR--QREADRERQKQ---EWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQ 230
+LR++L Q++ ER KQ ELKE+H +E ++ +E + E L ++Q
Sbjct: 1 MLRDKLDMLQQQKQEERNKQKSRVNELKEKHDQELQKLKQELQSLEDERKFLVLEQ 56
>gnl|CDD|223097 COG0018, ArgS, Arginyl-tRNA synthetase [Translation, ribosomal
structure and biogenesis].
Length = 577
Score = 28.0 bits (63), Expect = 7.2
Identities = 14/53 (26%), Positives = 27/53 (50%), Gaps = 3/53 (5%)
Query: 104 LYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQK 156
L E Y + + L+++ +EE EA+ E + E+ E +FV+ +L+
Sbjct: 195 LGEYYVKIAKDLEEDPGNDEE--EAREEVEKLESGDEEAELWR-KFVDLSLEG 244
>gnl|CDD|218581 pfam05416, Peptidase_C37, Southampton virus-type processing
peptidase. Corresponds to Merops family C37.
Norwalk-like viruses (NLVs), including the Southampton
virus, cause acute non-bacterial gastroenteritis in
humans. The NLV genome encodes three open reading frames
(ORFs). ORF1 encodes a polyprotein, which is processed
by the viral protease into six proteins.
Length = 535
Score = 27.9 bits (62), Expect = 7.3
Identities = 18/66 (27%), Positives = 35/66 (53%), Gaps = 2/66 (3%)
Query: 135 YENETEILRERECRF-VEEALQKELKLEEEKLEAQMEFARYENETEI-LREQLRQREADR 192
Y+ +I ER ++ ++E L+ + EEE E Q A + E E +R+++ R
Sbjct: 258 YDEYKKIREERGGKYSIQEYLEDRERYEEELAERQATEADFCEEEEAKIRQRIFGLRKTR 317
Query: 193 ERQKQE 198
+++K+E
Sbjct: 318 KQRKEE 323
>gnl|CDD|191249 pfam05279, Asp-B-Hydro_N, Aspartyl beta-hydroxylase N-terminal
region. This family includes the N-terminal regions of
the junctin, junctate and aspartyl beta-hydroxylase
proteins. Junctate is an integral ER/SR membrane calcium
binding protein, which comes from an alternatively
spliced form of the same gene that generates aspartyl
beta-hydroxylase and junctin. Aspartyl beta-hydroxylase
catalyzes the post-translational hydroxylation of
aspartic acid or asparagine residues contained within
epidermal growth factor (EGF) domains of proteins.
Length = 240
Score = 27.6 bits (61), Expect = 7.7
Identities = 20/115 (17%), Positives = 44/115 (38%), Gaps = 1/115 (0%)
Query: 112 EEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKL-EAQME 170
EE ++++L+ EK+ + + L E + E++ ++ LE K+ E +
Sbjct: 102 EEEVKEQLQSLLEKIVVSKQEEDGPGKEPQLDEDKFLLAEDSDDRQETLEAGKVHEETED 161
Query: 171 FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIH 225
E +Q + +A + + E E+ + D+ EE +
Sbjct: 162 SYHVEETASEQYKQDMKEKASEQENEDSKEPVEKAERTKAETDDVTEEDYDEEDN 216
>gnl|CDD|227396 COG5064, SRP1, Karyopherin (importin) alpha [Intracellular
trafficking and secretion].
Length = 526
Score = 27.5 bits (61), Expect = 7.9
Identities = 12/33 (36%), Positives = 21/33 (63%), Gaps = 6/33 (18%)
Query: 206 AEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRR 238
A+E RR RR+ +++ LR Q+++E L +R
Sbjct: 22 ADELRR------RREEQQVELRKQKREELLNKR 48
>gnl|CDD|220623 pfam10186, Atg14, UV radiation resistance protein and
autophagy-related subunit 14. The Atg14 or Apg14
proteins are hydrophilic proteins with a predicted
molecular mass of 40.5 kDa, and have a coiled-coil motif
at the N terminus region. Yeast cells with mutant Atg14
are defective not only in autophagy but also in sorting
of carboxypeptidase Y (CPY), a vacuolar-soluble
hydrolase, to the vacuole. Subcellular fractionation
indicates that Apg14p and Apg6p are peripherally
associated with a membrane structure(s). Apg14p was
co-immunoprecipitated with Apg6p, suggesting that they
form a stable protein complex. These results imply that
Apg6/Vps30p has two distinct functions: in the
autophagic process and in the vacuolar protein sorting
pathway. Apg14p may be a component specifically required
for the function of Apg6/Vps30p through the autophagic
pathway. There are 17 autophagosomal component proteins
which are categorized into six functional units, one of
which is the AS-PI3K complex (Vps30/Atg6 and Atg14). The
AS-PI3K complex and the Atg2-Atg18 complex are essential
for nucleation, and the specific function of the AS-PI3K
apparently is to produce phosphatidylinositol
3-phosphate (PtdIns(3)P) at the pre-autophagosomal
structure (PAS). The localisation of this complex at the
PAS is controlled by Atg14. Autophagy mediates the
cellular response to nutrient deprivation, protein
aggregation, and pathogen invasion in humans, and
malfunction of autophagy has been implicated in multiple
human diseases including cancer. This effect seems to be
mediated through direct interaction of the human Atg14
with Beclin 1 in the human phosphatidylinositol 3-kinase
class III complex.
Length = 307
Score = 27.3 bits (61), Expect = 8.0
Identities = 13/69 (18%), Positives = 28/69 (40%), Gaps = 8/69 (11%)
Query: 181 LREQLRQREADRERQKQEWE--------LKERHAEEQRRCDEEAMRRQTEEIHLRMQQQD 232
LR L + + E KQ+ E + A + + + + + +I R+ Q
Sbjct: 25 LRLDLARLLLENEELKQKVEEALEGATNEDGKLAADLLKLEVARKKERLNQIRARISQLK 84
Query: 233 EELRRRHQE 241
EE+ ++ +
Sbjct: 85 EEIEQKRER 93
>gnl|CDD|197874 smart00787, Spc7, Spc7 kinetochore protein. This domain is found
in cell division proteins which are required for
kinetochore-spindle association.
Length = 312
Score = 27.3 bits (61), Expect = 8.2
Identities = 32/130 (24%), Positives = 54/130 (41%), Gaps = 18/130 (13%)
Query: 97 YGTRWKQLYELYKQKEEALQKELKLEEEKLEAQME------------FARYENETEILRE 144
Y R K L L + +E L+ LK + + L ++E E E L++
Sbjct: 135 YEWRMKLLEGLKEGLDENLE-GLKEDYKLLMKELELLNSIKPKLRDRKDALEEELRQLKQ 193
Query: 145 RECRFVEEALQKELKLEEEKL-EAQMEFARYENETEILREQLRQREADRER---QKQEWE 200
E +E+ EL +EKL + E + E L E+L++ E+ E +K E
Sbjct: 194 LE-DELEDCDPTELDRAKEKLKKLLQEIMIKVKKLEELEEELQELESKIEDLTNKKSELN 252
Query: 201 LKERHAEEQR 210
+ AE++
Sbjct: 253 TEIAEAEKKL 262
>gnl|CDD|241124 cd12680, RRM_THOC4, RNA recognition motif in THO complex subunit
4 (THOC4) and similar proteins. This subgroup
corresponds to the RRM of THOC4, also termed
transcriptional coactivator Aly/REF, or ally of AML-1
and LEF-1, or bZIP-enhancing factor BEF, an mRNA
transporter protein with a well conserved RNA
recognition motif (RRM), also termed RBD (RNA binding
domain) or RNP (ribonucleoprotein domain). It is
involved in RNA transportation from the nucleus. THOC4
was initially identified as a transcription coactivator
of LEF-1 and AML-1 for the TCRalpha enhancer function.
In addition, THOC4 specifically binds to rhesus (RH)
promoter in erythroid. It might be a novel
transcription cofactor for erythroid-specific genes.
Length = 75
Score = 25.7 bits (57), Expect = 8.3
Identities = 10/35 (28%), Positives = 18/35 (51%)
Query: 2 NIERAVVLVDDRGNSKNEGIIEFTRKPAAAQALKR 36
+++A V D G S + F R+ A +A+K+
Sbjct: 26 ALKKAAVHYDRSGRSLGTADVVFERRADALKAMKQ 60
>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572). Family of
eukaryotic proteins with undetermined function.
Length = 321
Score = 27.4 bits (61), Expect = 8.3
Identities = 21/101 (20%), Positives = 43/101 (42%), Gaps = 16/101 (15%)
Query: 145 RECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWE---- 200
R + ++E ++E+E+ E A ++L R AD +R+ + E
Sbjct: 106 RNYEADKLDEEQEERVEKEREEELAGDAM---------KKLENRTADSKREMEVLERLEE 156
Query: 201 ---LKERHAEEQRRCDEEAMRRQTEEIHLRMQQQDEELRRR 238
L+ R A+ EA+ R+ ++ +++DE L +
Sbjct: 157 LKELQSRRADVDVNSMLEALFRREKKEEEEEEEEDEALIKS 197
>gnl|CDD|206172 pfam14002, YniB, YniB-like protein. The YniB-like protein family
includes the E. coli YniB protein, which is functionally
uncharacterized. This family of proteins is found in
bacteria. Proteins in this family are approximately 180
amino acids in length. Members of this family are
integral membrane proteins.
Length = 166
Score = 26.8 bits (60), Expect = 8.4
Identities = 13/47 (27%), Positives = 22/47 (46%), Gaps = 7/47 (14%)
Query: 149 FVEEALQ-------KELKLEEEKLEAQMEFARYENETEILREQLRQR 188
FV ALQ +++K E +E Q+ + + REQL ++
Sbjct: 85 FVGLALQASGARMSRQVKFIREGIEDQLILEKAKGVEGRTREQLEEK 131
>gnl|CDD|218636 pfam05557, MAD, Mitotic checkpoint protein. This family consists
of several eukaryotic mitotic checkpoint (Mitotic arrest
deficient or MAD) proteins. The mitotic spindle
checkpoint monitors proper attachment of the bipolar
spindle to the kinetochores of aligned sister chromatids
and causes a cell cycle arrest in prometaphase when
failures occur. Multiple components of the mitotic
spindle checkpoint have been identified in yeast and
higher eukaryotes. In S. cerevisiae, the existence of a
Mad1-dependent complex containing Mad2, Mad3, Bub3 and
Cdc20 has been demonstrated.
Length = 722
Score = 27.6 bits (61), Expect = 8.5
Identities = 25/137 (18%), Positives = 61/137 (44%), Gaps = 4/137 (2%)
Query: 111 KEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLEEEKLEAQME 170
+ E +QKEL+ + ++E + + + E RE E + LEE + +A+ E
Sbjct: 74 ENELMQKELEHKRAQIELERKASTLAENYE----RELDRNLELEVRLKALEELEKKAENE 129
Query: 171 FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRRQTEEIHLRMQQ 230
A E E ++L+++L + +K++ + + + + + D M+ + + ++
Sbjct: 130 AAEAEEEAKLLKDKLDAESLKLQNEKEDQLKEAKESISRIKNDLSEMQCRAQNADTELKL 189
Query: 231 QDEELRRRHQENSIFMQ 247
+ EL ++ +
Sbjct: 190 LESELEELREQLEECQK 206
>gnl|CDD|227355 COG5022, COG5022, Myosin heavy chain [Cytoskeleton].
Length = 1463
Score = 27.7 bits (62), Expect = 8.6
Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 4/98 (4%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKLE 161
++ EL K L + L+ + E + + N ++ +V+ +L
Sbjct: 906 SEIIELKKSLSSDLIENLEFKTELI---ARLKKLLNNIDLEEGPSIEYVKLPELNKLHEV 962
Query: 162 EEKL-EAQMEFARYENETEILREQLRQREADRERQKQE 198
E KL E E+ ++ IL + + ++ + K+E
Sbjct: 963 ESKLKETSEEYEDLLKKSTILVREGNKANSELKNFKKE 1000
>gnl|CDD|221514 pfam12297, EVC2_like, Ellis van Creveld protein 2 like protein.
This family of proteins is found in eukaryotes. Proteins
in this family are typically between 571 and 1310 amino
acids in length. There are two conserved sequence
motifs: LPA and ELH. EVC2 is implicated in Ellis van
Creveld chondrodysplastic dwarfism in humans. Mutations
in this protein can give rise to this congenital
condition. LIMBIN is a protein which shares around 80%
sequence homology with EVC2 and it is implicated in a
similar condition in bovine chondrodysplastic dwarfism.
Length = 429
Score = 27.5 bits (61), Expect = 8.7
Identities = 25/88 (28%), Positives = 39/88 (44%), Gaps = 6/88 (6%)
Query: 102 KQLYELYKQKEEALQKELKLE-EEKLEAQMEFARYENETEILR-----ERECRFVEEALQ 155
++L E Y++K AL E LE +K+EAQ + E E ERE L
Sbjct: 209 RRLQEEYERKMVALTAECNLETRKKMEAQHQREMAEMEQAEELLKRAPEREAVECSSLLD 268
Query: 156 KELKLEEEKLEAQMEFARYENETEILRE 183
LE+E L+ + + E+ + R+
Sbjct: 269 TLHGLEQEHLQRSLLLQQEEDFAKAHRQ 296
>gnl|CDD|226809 COG4372, COG4372, Uncharacterized protein conserved in bacteria
with the myosin-like domain [Function unknown].
Length = 499
Score = 27.7 bits (61), Expect = 9.0
Identities = 29/145 (20%), Positives = 57/145 (39%), Gaps = 18/145 (12%)
Query: 109 KQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEE------------ALQK 156
+++E Q+ + +AQ E AR + + L+ R E+ A QK
Sbjct: 116 QEREAVRQELAAARQNLAKAQQELARLTKQAQDLQTRLKTLAEQRRQLEAQAQSLQASQK 175
Query: 157 ELKLEEEKLEAQME--FARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDE 214
+L+ +L++Q+ R + + + A + R + EL R A Q+
Sbjct: 176 QLQASATQLKSQVLDLKLRSAQIEQEAQNLATRANAAQARTE---ELARRAAAAQQTAQA 232
Query: 215 EAMR-RQTEEIHLRMQQQDEELRRR 238
R Q + ++ + E++R R
Sbjct: 233 IQQRDAQISQKAQQIAARAEQIRER 257
>gnl|CDD|237063 PRK12329, nusA, transcription elongation factor NusA; Provisional.
Length = 449
Score = 27.4 bits (61), Expect = 9.2
Identities = 22/78 (28%), Positives = 33/78 (42%), Gaps = 2/78 (2%)
Query: 133 ARYENETEILRERECRFVEEALQKELKLEEEKLEAQMEFARYENETEILREQLRQREADR 192
A Y+ E E + E E + + EE+LEA+ R E + LRE E +
Sbjct: 374 AEYDQEAEDAKVAELISQREEEEALQREAEERLEAEQA-ERAEEDAR-LRELYPLPEDEF 431
Query: 193 ERQKQEWELKERHAEEQR 210
E + + E + EE R
Sbjct: 432 EDEDELEEAQPEEEEEAR 449
>gnl|CDD|226889 COG4487, COG4487, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 438
Score = 27.5 bits (61), Expect = 9.6
Identities = 30/147 (20%), Positives = 65/147 (44%), Gaps = 2/147 (1%)
Query: 102 KQLYELYKQKEEALQKELKLEEEKLEAQMEFARYENETEILRERECRFVEEALQKELKL- 160
K+ E Q A +KEL EE+L Q + + +I + E A + L+L
Sbjct: 49 KEANEKRAQYRSAKKKELSQLEEQLINQKKEQKNLFNEQIKQFELALQDEIAKLEALELL 108
Query: 161 -EEEKLEAQMEFARYENETEILREQLRQREADRERQKQEWELKERHAEEQRRCDEEAMRR 219
E+ E ++ + ++ L++QL+ E++++ + +ER E + EE++
Sbjct: 109 NLEKDKELELLEKELDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEKKLEESLEL 168
Query: 220 QTEEIHLRMQQQDEELRRRHQENSIFM 246
+ E+ ++ + + +L + E
Sbjct: 169 EREKFEEQLHEANLDLEFKENEEQRES 195
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.315 0.132 0.367
Gapped
Lambda K H
0.267 0.0677 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 14,711,351
Number of extensions: 1500143
Number of successful extensions: 6060
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4258
Number of HSP's successfully gapped: 1102
Length of query: 279
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 183
Effective length of database: 6,679,618
Effective search space: 1222370094
Effective search space used: 1222370094
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.2 bits)
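
The effective lengths and the bit values in the summary block above can be
re-derived from the other numbers printed there. The following is a minimal
Python sketch of that arithmetic, assuming the usual Karlin-Altschul
relations (the function and constant names are illustrative and are not part
of RPS-BLAST). The per-hit E-values in this report are not expected to be
reproduced exactly by these formulas: the printed Lambda and K are rounded,
and the same raw score maps to slightly different bit scores across hits
(63 gives 28.3 bits for COG4026 and 28.5 bits for PTZ00266).

```python
import math

# Values copied from the summary block of this report.
QUERY_LENGTH = 279
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 96

# Karlin-Altschul parameters as printed (rounded in the report).
UNGAPPED_LAMBDA, UNGAPPED_K = 0.315, 0.132
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0677


def effective_query_length(query_length=QUERY_LENGTH, adjustment=LENGTH_ADJUSTMENT):
    """Query length minus the length adjustment."""
    return query_length - adjustment


def effective_db_length(letters=DB_LETTERS, sequences=DB_SEQUENCES,
                        adjustment=LENGTH_ADJUSTMENT):
    """Database letters minus one length adjustment per database sequence."""
    return letters - sequences * adjustment


def effective_search_space():
    """Product of the effective query and database lengths."""
    return effective_query_length() * effective_db_length()


def dropoff_bits(raw, lam):
    """Convert an X drop-off given in raw score units to bits."""
    return raw * lam / math.log(2)


def score_bits(raw, lam, k):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw - math.log(k)) / math.log(2)


def approx_evalue(bits, search_space):
    """Approximate E-value for a bit score in a given search space."""
    return search_space * 2.0 ** (-bits)


print(effective_query_length())   # 183
print(effective_db_length())      # 6679618
print(effective_search_space())   # 1222370094
print(round(dropoff_bits(38, GAPPED_LAMBDA), 1))   # X2, reported as 14.6 bits
print(round(dropoff_bits(64, GAPPED_LAMBDA), 1))   # X3, reported as 24.7 bits
print(round(score_bits(58, GAPPED_LAMBDA, GAPPED_K), 1))  # S2, reported as 26.2 bits
```

The integer results match the "Effective length" and "Effective search
space" lines exactly; the bit conversions agree with the X2, X3 and S2 lines
to the one decimal place shown.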