RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11306
(413 letters)
>gnl|CDD|173890 cd06902, lectin_ERGIC-53_ERGL, ERGIC-53 and ERGL type 1
transmembrane proteins, N-terminal lectin domain.
ERGIC-53 and ERGL, N-terminal carbohydrate recognition
domain. ERGIC-53 and ERGL are eukaryotic mannose-binding
type 1 transmembrane proteins of the early secretory
pathway that transport newly synthesized glycoproteins
from the endoplasmic reticulum (ER) to the ER-Golgi
intermediate compartment (ERGIC). ERGIC-53 and ERGL
have an N-terminal lectin-like carbohydrate recognition
domain (represented by this alignment model) as well as
a C-terminal transmembrane domain. ERGIC-53 functions
as a 'cargo receptor' to facilitate the export of
glycoproteins with different characteristics from the
ER, while the ERGIC-53-like protein (ERGL) may act
as a regulator of ERGIC-53. In mammals, ERGIC-53 forms
a complex with MCFD2 (multi-coagulation factor
deficiency 2) which then recruits blood coagulation
factors V and VIII. Mutations in either MCFD2 or
ERGIC-53 cause a mild form of inherited hemophilia known
as combined deficiency of factors V and VIII (F5F8D). In
addition to the lectin and transmembrane domains,
ERGIC-53 and ERGL have a short N-terminal cytoplasmic
region of about 12 amino acids. ERGIC-53 forms
disulphide-linked homodimers and homohexamers. ERGIC-53
and ERGL are sequence-similar to the lectins of
leguminous plants. L-type lectins have a dome-shaped
beta-barrel carbohydrate recognition domain with a
curved seven-stranded beta-sheet referred to as the
"front face" and a flat six-stranded beta-sheet referred
to as the "back face". This domain homodimerizes so
that adjacent back sheets form a contiguous 12-stranded
sheet and homotetramers occur by a back-to-back
association of these homodimers. Though L-type lectins
exhibit both sequence and structural similarity to one
another, their carbohydrate binding specificities differ
widely.
Length = 225
Score = 420 bits (1083), Expect = e-149
Identities = 153/224 (68%), Positives = 180/224 (80%), Gaps = 1/224 (0%)
Query: 31 RFEYKYSFKPPYLAQKDGSVPFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEWW 90
RFEYKYSFK P+LAQKDG+VPFW +GG+ IASLE VR+ PSLRS+KG++WTK +FE W
Sbjct: 2 RFEYKYSFKGPHLAQKDGTVPFWSHGGDAIASLEQVRLTPSLRSKKGSVWTKNPFSFENW 61
Query: 91 NVDIVFRVTGRGRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDNNHNNP 150
V++ FRVTGRGRIGADGLA WYT E+G +G VFGSSD+W G+G+FFDSFDND NNP
Sbjct: 62 EVEVTFRVTGRGRIGADGLAIWYTKERGE-EGPVFGSSDKWNGVGIFFDSFDNDGKKNNP 120
Query: 151 YIMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTNN 210
I+ V NDG ++DHQNDG +Q+L CLRDFRNKPYP RA+I YY N LTV +NG T N
Sbjct: 121 AILVVGNDGTKSYDHQNDGLTQALGSCLRDFRNKPYPVRAKITYYQNVLTVSINNGFTPN 180
Query: 211 EQDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHFLTSSL 254
+ D E+C RVEN+ LP GYFGVSAATGGLADDHD+L FLT SL
Sbjct: 181 KDDYELCTRVENMVLPPNGYFGVSAATGGLADDHDVLSFLTFSL 224
Score = 46.2 bits (110), Expect = 9e-06
Identities = 19/35 (54%), Positives = 23/35 (65%), Gaps = 3/35 (8%)
Query: 296 RFEYKYSFKPPYLAQKDG---YQKDHPDAHPNEEE 327
RFEYKYSFK P+LAQKDG + DA + E+
Sbjct: 2 RFEYKYSFKGPHLAQKDGTVPFWSHGGDAIASLEQ 36
>gnl|CDD|217528 pfam03388, Lectin_leg-like, Legume-like lectin family. Lectins are
structurally diverse proteins that bind to specific
carbohydrates. This family includes the VIP36 and
ERGIC-53 lectins. These two proteins were the first
recognised members of a family of animal lectins similar
(19-24%) to the leguminous plant lectins. The alignment
for this family aligns residues lying towards the
N-terminus, where the similarity of VIP36 and ERGIC-53
is greatest. However, while Fiedler and Simons
identified these proteins as a new family of animal
lectins, our alignment also includes yeast sequences.
ERGIC-53 is a 53kD protein, localised to the
intermediate region between the endoplasmic reticulum
and the Golgi apparatus (ER-Golgi-Intermediate
Compartment, ERGIC). It was identified as a
calcium-dependent, mannose-specific lectin. Its
dysfunction has been associated with combined factors V
and VIII deficiency OMIM:227300 OMIM:601567, suggesting
an important and substrate-specific role for ERGIC-53 in
the glycoprotein-secreting pathway.
Length = 226
Score = 321 bits (825), Expect = e-110
Identities = 123/225 (54%), Positives = 157/225 (69%), Gaps = 1/225 (0%)
Query: 30 ERFEYKYSFKPPYLAQKDGSVPFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEW 89
+RF+YK+S K PYL Q DGS+PFWEYGG+ I S +R+ P L+SQKG++W KQ + +
Sbjct: 1 DRFKYKHSLKAPYLGQGDGSIPFWEYGGSAILSSGYIRLTPDLQSQKGSLWNKQPLDLDS 60
Query: 90 WNVDIVFRVTGRGRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDNNHNN 149
W V++ FRV G GR+G DGLA WYTSE+G G VFGS D W GL +F D++DNDN N
Sbjct: 61 WEVEVTFRVHGSGRLGGDGLAIWYTSERGV-PGPVFGSKDNWDGLAIFLDTYDNDNQTLN 119
Query: 150 PYIMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTN 209
PYI ++NDG+ +DH DG Q LA C DFRNK YPTR RI Y NTLTV +G+
Sbjct: 120 PYISGMLNDGSKPYDHTKDGTDQELASCTADFRNKDYPTRIRITYDKNTLTVMIDDGLLE 179
Query: 210 NEQDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHFLTSSL 254
++ D ++C +VEN+ LP YFGVSAATG L+D+HD+ FLT L
Sbjct: 180 DKVDYKLCFQVENVRLPTGYYFGVSAATGDLSDNHDVFSFLTFQL 224
Score = 31.6 bits (72), Expect = 0.55
Identities = 19/62 (30%), Positives = 24/62 (38%), Gaps = 13/62 (20%)
Query: 295 ERFEYKYSFKPPYLAQKDGYQKDHPDAHPNEEEWYES----ENQRELRQIFQGQSQLAEW 350
+RF+YK+S K PYL Q DG E+ S L Q Q + W
Sbjct: 1 DRFKYKHSLKAPYLGQGDG--------SIPFWEYGGSAILSSGYIRLTPDLQSQKG-SLW 51
Query: 351 TK 352
K
Sbjct: 52 NK 53
>gnl|CDD|173892 cd07308, lectin_leg-like, legume-like lectins: ERGIC-53, ERGL,
VIP36, VIPL, EMP46, and EMP47. The legume-like
(leg-like) lectins are eukaryotic intracellular sugar
transport proteins with a carbohydrate recognition
domain similar to that of the legume lectins. This
domain binds high-mannose-type oligosaccharides for
transport from the endoplasmic reticulum to the Golgi
complex. These leg-like lectins include ERGIC-53, ERGL,
VIP36, VIPL, EMP46, EMP47, and the UIP5
(ULP1-interacting protein 5) precursor protein.
Leg-like lectins have different intracellular
distributions and dynamics in the endoplasmic
reticulum-Golgi system of the secretory pathway and
interact with N-glycans of glycoproteins in a
calcium-dependent manner, suggesting a role in
glycoprotein sorting and trafficking. L-type lectins
have a dome-shaped beta-barrel carbohydrate recognition
domain with a curved seven-stranded beta-sheet referred
to as the "front face" and a flat six-stranded
beta-sheet referred to as the "back face". This domain
homodimerizes so that adjacent back sheets form a
contiguous 12-stranded sheet and homotetramers occur by
a back-to-back association of these homodimers. Though
L-type lectins exhibit both sequence and structural
similarity to one another, their carbohydrate binding
specificities differ widely.
Length = 218
Score = 215 bits (550), Expect = 2e-68
Identities = 89/223 (39%), Positives = 128/223 (57%), Gaps = 5/223 (2%)
Query: 32 FEYKYSFKPPYLAQKDGSVPFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEWWN 91
F ++S PP+L DG + W GG+ + + +R+ P + SQ G++W++ + +
Sbjct: 1 FISEHSLSPPFLDDNDGEIGNWTVGGSTVITKNYIRLTPDVPSQSGSLWSRVPIPAKDFE 60
Query: 92 VDIVFRVTGRGRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDNNHNNPY 151
+++ F + G +G DG AFWYT E GS DG +FG D++KGL +FFD++DND P
Sbjct: 61 IEVEFSIHGGSGLGGDGFAFWYTEEPGS-DGPLFGGPDKFKGLAIFFDTYDNDGK-GFPS 118
Query: 152 IMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTNNE 211
I +NDG ++D++ DG LA C FRN PT RI Y NTL V NN
Sbjct: 119 ISVFLNDGTKSYDYETDGEKLELASCSLKFRNSNAPTTLRISYLNNTLKVDITYSEGNNW 178
Query: 212 QDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHFLTSSL 254
++ C VE++ LP +GYFG SA TG L+D+HDIL T L
Sbjct: 179 KE---CFTVEDVILPSQGYFGFSAQTGDLSDNHDILSVHTYEL 218
>gnl|CDD|173889 cd06901, lectin_VIP36_VIPL, VIP36 and VIPL type 1 transmembrane
proteins, lectin domain. The vesicular integral protein
of 36 kDa (VIP36) is a type 1 transmembrane protein of
the mammalian early secretory pathway that acts as a
cargo receptor transporting high mannose type
glycoproteins between the Golgi and the endoplasmic
reticulum (ER). Lectins of the early secretory pathway
are involved in the selective transport of newly
synthesized glycoproteins from the ER to the ER-Golgi
intermediate compartment (ERGIC). The most prominent
cycling lectin is the mannose-binding type 1 membrane
protein ERGIC-53, which functions as a cargo receptor to
facilitate export of glycoproteins from the ER. L-type
lectins have a dome-shaped beta-barrel carbohydrate
recognition domain with a curved seven-stranded
beta-sheet referred to as the "front face" and a flat
six-stranded beta-sheet referred to as the "back face".
This domain homodimerizes so that adjacent back sheets
form a contiguous 12-stranded sheet and homotetramers
occur by a back-to-back association of these homodimers.
Though L-type lectins exhibit both sequence and
structural similarity to one another, their carbohydrate
binding specificities differ widely.
Length = 248
Score = 189 bits (481), Expect = 9e-58
Identities = 85/221 (38%), Positives = 126/221 (57%), Gaps = 15/221 (6%)
Query: 36 YSFKPPYLAQKDGSV-PFWEYGGNCIASLENVRVAPSLRSQKGAIWTKQTTNFEWWNVDI 94
+S PY Q GS P W++ G+ + + + +R+ P +S++G+IW + W + +
Sbjct: 6 HSLIKPY--QGVGSSMPLWDFLGSTMVTSQYIRLTPDHQSKQGSIWNRVPCYLRDWEMHV 63
Query: 95 VFRVTGRGR-IGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFDNDN---NHNNP 150
F+V G G+ + DG A WYT E+ G VFGS D + GL +FFD++ N N H +P
Sbjct: 64 HFKVHGSGKNLFGDGFAIWYTKER-MQPGPVFGSKDNFHGLAIFFDTYSNQNGEHEHVHP 122
Query: 151 YIMAVVNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTN- 209
YI A+VN+G++++DH DG LAGC FRNK + T I+Y LTV MT+
Sbjct: 123 YISAMVNNGSLSYDHDRDGTHTELAGCSAPFRNKDHDTFVAIRYSKGRLTV-----MTDI 177
Query: 210 -NEQDIEVCLRVENIYLPKEGYFGVSAATGGLADDHDILHF 249
+ + + C V + LP YFG SAATG L+D+HDI+
Sbjct: 178 DGKNEWKECFDVTGVRLPTGYYFGASAATGDLSDNHDIISM 218
>gnl|CDD|173891 cd06903, lectin_EMP46_EMP47, EMP46 and EMP47 type 1 transmembrane
proteins, N-terminal lectin domain. EMP46 and EMP47,
N-terminal carbohydrate recognition domain. EMP46 and
EMP47 are fungal type-I transmembrane proteins that
cycle between the endoplasmic reticulum and the Golgi
apparatus and are thought to function as cargo receptors
that transport newly synthesized glycoproteins. EMP47
is a receptor for EMP46 responsible for the selective
transport of EMP46 by forming hetero-oligomerization
between the two proteins. EMP46 and EMP47 have an
N-terminal lectin-like carbohydrate recognition domain
(represented by this alignment model) as well as a
C-terminal transmembrane domain. EMP46 and EMP47 are 45%
sequence-identical to one another and have sequence
homology to a class of intracellular lectins defined by
ERGIC-53 and VIP36. L-type lectins have a dome-shaped
beta-barrel carbohydrate recognition domain with a
curved seven-stranded beta-sheet referred to as the
"front face" and a flat six-stranded beta-sheet referred
to as the "back face". This domain homodimerizes so
that adjacent back sheets form a contiguous 12-stranded
sheet and homotetramers occur by a back-to-back
association of these homodimers. Though L-type lectins
exhibit both sequence and structural similarity to one
another, their carbohydrate binding specificities differ
widely.
Length = 215
Score = 89.7 bits (223), Expect = 1e-20
Identities = 48/213 (22%), Positives = 91/213 (42%), Gaps = 18/213 (8%)
Query: 47 DGSVPFWEYGGNCIASLENVR-VAPSLRSQKGAIWTKQTTNFEW-WNVDIVFRVTGRGRI 104
+P W+ GN LE+ R + +Q+G++W K+ + + W ++ FR TG
Sbjct: 17 GKLIPNWQTSGN--PKLESGRIILTPPGNQRGSLWLKKPLSLKDEWTIEWTFRSTGPEGR 74
Query: 105 GADGLAFWYTSEKGSYDGE-VFGSSDRWKGLGLFFDSFDNDNNHNNPYIMAVVNDGNMAF 163
GL FW + + G ++ GL L D+ + + +NDG+ +
Sbjct: 75 SGGGLNFWLVKDGNADVGTSSIYGPSKFDGLQLLIDNNGG----SGGSLRGFLNDGSKDY 130
Query: 164 DHQNDGASQSLAGCLRDFRNKPYPTRARIQYYMNTLTVWFHNGMTNNEQDIEVCLRVENI 223
+++ S + CL +++ P+ R+ Y F + N +C + + +
Sbjct: 131 KNEDV-DSLAFGSCLFAYQDSGVPSTIRLSYDAL--NSLFKVQVDNR-----LCFQTDKV 182
Query: 224 YLPKEGY-FGVSAATGGLADDHDILHFLTSSLL 255
LP+ GY FG++AA + +IL + L
Sbjct: 183 QLPQGGYRFGITAANADNPESFEILKLKVWNGL 215
>gnl|CDD|173886 cd01951, lectin_L-type, legume lectins. The L-type (legume-type)
lectins are a highly diverse family of carbohydrate
binding proteins that generally display no enzymatic
activity toward the sugars they bind. This family
includes arcelin, concanavalinA, the lectin-like
receptor kinases, the ERGIC-53/VIP36/EMP46 type1
transmembrane proteins, and an alpha-amylase inhibitor.
L-type lectins have a dome-shaped beta-barrel
carbohydrate recognition domain with a curved
seven-stranded beta-sheet referred to as the "front
face" and a flat six-stranded beta-sheet referred to as
the "back face". This domain homodimerizes so that
adjacent back sheets form a contiguous 12-stranded sheet
and homotetramers occur by a back-to-back association of
these homodimers. Though L-type lectins exhibit both
sequence and structural similarity to one another, their
carbohydrate binding specificities differ widely.
Length = 223
Score = 80.6 bits (199), Expect = 2e-17
Identities = 57/215 (26%), Positives = 88/215 (40%), Gaps = 25/215 (11%)
Query: 49 SVPFWEYGGNCIASLENV--RVAPSLRSQKGAIWTKQ----TTNFEWWNVDIVFRVTGRG 102
+ W+ G+ + ++ R+ P +Q G+ W K + +F F + +G
Sbjct: 12 NQSNWQLNGSATLTTDSGVLRLTPDTGNQAGSAWYKTPIDLSKDFT---TTFKFYLGTKG 68
Query: 103 RIGADGLAFWYTSEK-----GSYDGEVFGSSDRWKGLGLFFDSFDND--NNHNNPYIMAV 155
GADG+AF ++ G G G + + FD++ ND N+ N +I
Sbjct: 69 TNGADGIAFVLQNDPAGALGGGGGGGGLGYGGIGNSVAVEFDTYKNDDNNDPNGNHISID 128
Query: 156 VNDGNMAFDHQNDGASQSLAGCLRDFRNKPYPTR-ARIQY--YMNTLTVWFHNGMTNNEQ 212
VN SL RI Y NTLTV+ NG T
Sbjct: 129 VNGNGNNTALAT-----SLGSASLPNGTGLGNEHTVRITYDPTTNTLTVYLDNGSTLTSL 183
Query: 213 DIEVCLRVENIYLPKEGYFGVSAATGGLADDHDIL 247
DI + + + + P + YFG +A+TGGL + HDIL
Sbjct: 184 DITIPVDLIQL-GPTKAYFGFTASTGGLTNLHDIL 217
>gnl|CDD|215744 pfam00139, Lectin_legB, Legume lectin domain.
Length = 231
Score = 43.8 bits (104), Expect = 6e-05
Identities = 48/190 (25%), Positives = 74/190 (38%), Gaps = 35/190 (18%)
Query: 79 IWTKQTTNFEWWNVDIVFRVTGR--GRIGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGL 136
+W T ++ VF + G DGLAF+ + G SS + LGL
Sbjct: 51 LWDSSTGKVASFSTSFVFAIKNIPKSTNGGDGLAFFLAPSG-TQPG---ASSGGY--LGL 104
Query: 137 FFDSFDNDNNHNNPYIMAVVNDGNMAFDHQN-DG-------------ASQSLAGCLRDFR 182
F S N+ N +N I+AV D + + + D AS+S + D
Sbjct: 105 FNSS--NNGNSSNH-IVAVEFDTFLNPEFNDIDDNHVGIDVNSIISVASESASFVPLDLN 161
Query: 183 N-KPYPTRARIQY--YMNTLTVWFHNGMTNNEQDIEVCLRVENI--YLPKEGYFGVSAAT 237
+ KP + I Y L+V N+ + ++ LP+ Y G SA+T
Sbjct: 162 SGKPI--QVWIDYDGSSKRLSVTLAY---PNKPKRPLLSASVDLSTVLPEWVYVGFSAST 216
Query: 238 GGLADDHDIL 247
GG + H +L
Sbjct: 217 GGATESHYVL 226
>gnl|CDD|173887 cd06899, lectin_legume_LecRK_Arcelin_ConA, legume lectins,
lectin-like receptor kinases, arcelin, concanavalinA,
and alpha-amylase inhibitor. This alignment model
includes the legume lectins (also known as agglutinins),
the arcelin (also known as phytohemagglutinin-L) family
of lectin-like defense proteins, the LecRK family of
lectin-like receptor kinases, concanavalinA (ConA), and
an alpha-amylase inhibitor. Arcelin is a major seed
glycoprotein discovered in kidney beans (Phaseolus
vulgaris) that has insecticidal properties and protects
the seeds from predation by larvae of various bruchids.
Arcelin is devoid of monosaccharide binding properties
and lacks a key metal-binding loop that is present in
other members of this family. Phytohaemagglutinin (PHA)
is a lectin found in plants, especially beans, that
affects cell metabolism by inducing mitosis and by
altering the permeability of the cell membrane to
various proteins. PHA agglutinates most mammalian red
blood cell types by binding glycans on the cell surface.
Medically, PHA is used as a mitogen to trigger cell
division in T-lymphocytes and to activate latent HIV-1
from human peripheral lymphocytes. Plant L-type lectins
are primarily found in the seeds of leguminous plants
where they constitute about 10% of the total soluble
protein of the seed extracts. They are synthesized
during seed development several weeks after flowering
and transported to the vacuole where they become
condensed into specialized vesicles called protein
bodies. L-type lectins have a dome-shaped beta-barrel
carbohydrate recognition domain with a curved
seven-stranded beta-sheet referred to as the "front
face" and a flat six-stranded beta-sheet referred to as
the "back face". This domain homodimerizes so that
adjacent back sheets form a contiguous 12-stranded sheet
and homotetramers occur by a back-to-back association of
these homodimers. Though L-type lectins exhibit both
sequence and structural similarity to one another, their
carbohydrate binding specificities differ widely.
Length = 236
Score = 36.8 bits (86), Expect = 0.012
Identities = 53/195 (27%), Positives = 71/195 (36%), Gaps = 61/195 (31%)
Query: 84 TTNFEWWNVDIVFRVTGRGR-IGADGLAFWYTSEKGSYDGEVFGSSDRWKGLGLFFDSFD 142
+T+F F +T +G DGLAF+ D SS + LGLF S
Sbjct: 64 STSF-------SFSITPPNPSLGGDGLAFFLAP----TDSLPPASSGGY--LGLFNSS-- 108
Query: 143 NDNNHNNPYIMAVVND--GNMAFDHQND---G------ASQSLAGCLRDFRNKPY---PT 188
N+ N +N I+AV D N F +D G S AG D K P
Sbjct: 109 NNGNSSNH-IVAVEFDTFQNPEFGDPDDNHVGIDVNSLVSVK-AGYWDDDGGKLKSGKPM 166
Query: 189 RARIQYYMNTLTVWFHNGMTNNEQDIEVCLRVENI----------------YLPKEGYFG 232
+A I Y + + + V L + LP+E Y G
Sbjct: 167 QAWIDY----------DSSSKR---LSVTLAYSGVAKPKKPLLSYPVDLSKVLPEEVYVG 213
Query: 233 VSAATGGLADDHDIL 247
SA+TG L + H IL
Sbjct: 214 FSASTGLLTELHYIL 228
>gnl|CDD|234354 TIGR03789, pdsO, proteobacterial sortase system OmpA family
protein. A newly defined histidine kinase (TIGR03785)
and response regulator (TIGR03787) gene pair occurs
exclusively in Proteobacteria, mostly of marine origin,
nearly all of which contain a subfamily 6 sortase
(TIGR03784) and its single dedicated target protein
(TIGR03788) adjacent to the sortase. This protein
family shows up only in those species with the
histidine kinase/response regulator gene pair, and often
adjacent to that pair. It belongs to the OmpA protein
family (pfam00691). Its function is unknown. We assign
the gene symbol pdsO, for Proteobacterial Dedicated
Sortase system OmpA family protein.
Length = 239
Score = 31.3 bits (71), Expect = 0.69
Identities = 11/39 (28%), Positives = 20/39 (51%), Gaps = 6/39 (15%)
Query: 259 AKQQEQV------NQEDQKVAQEYAQYEKKLEEQKQHSQ 291
A+Q++Q+ Q +++ EY Q + LE +Q Q
Sbjct: 87 AQQRQQMVALTQKQQALEQLEAEYQQAQVHLETLQQDQQ 125
>gnl|CDD|213783 TIGR03181, PDH_E1_alph_x, pyruvate dehydrogenase E1 component,
alpha subunit. Members of this protein family are the
alpha subunit of the E1 component of pyruvate
dehydrogenase (PDH). This model represents one branch of
a larger family that includes E1-alpha proteins from
2-oxoisovalerate dehydrogenase, acetoin dehydrogenase,
another PDH clade, etc [Energy metabolism, Pyruvate
dehydrogenase].
Length = 341
Score = 31.3 bits (72), Expect = 0.90
Identities = 12/55 (21%), Positives = 25/55 (45%), Gaps = 8/55 (14%)
Query: 260 KQQEQVNQE-DQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYL-AQKD 312
+Q+E + +E + +VA+ A+ + F++ Y+ PP L Q+
Sbjct: 291 EQEEALEEEAEAEVAEAVAEALAL------PPPPVDDIFDHVYAELPPELEEQRA 339
>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
primarily archaeal type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. It is found
in a single copy and is homodimeric in prokaryotes, but
six paralogs (excluded from this family) are found in
eukaryotes, where SMC proteins are heterodimeric. This
family represents the SMC protein of archaea and a few
bacteria (Aquifex, Synechocystis, etc); the SMC of other
bacteria is described by TIGR02168. The N- and
C-terminal domains of this protein are well conserved,
but the central hinge region is skewed in composition
and highly divergent [Cellular processes, Cell division,
DNA metabolism, Chromosome-associated proteins].
Length = 1164
Score = 31.6 bits (72), Expect = 0.99
Identities = 15/98 (15%), Positives = 39/98 (39%), Gaps = 7/98 (7%)
Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQK--D 317
K+ EQ+ QE++K+ + + E+ L +Q +N + + ++ +
Sbjct: 723 KEIEQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKELEARIEELEEDLHKLEEALN 782
Query: 318 HPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKAIA 355
+A + E Q EL ++ + +++ +
Sbjct: 783 DLEARLSHSRIPEI--QAELSKL---EEEVSRIEARLR 815
>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584). This
protein is found in bacteria and eukaryotes. Proteins in
this family are typically between 943 and 1234 amino
acids in length. This family contains a P-loop motif
suggesting it is a nucleotide binding protein. It may be
involved in replication.
Length = 1198
Score = 30.8 bits (70), Expect = 1.5
Identities = 32/148 (21%), Positives = 55/148 (37%), Gaps = 23/148 (15%)
Query: 239 GLADDHDILHFLTSSLLPPGAKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFE 298
A + L L S L ++Q+ + +E + E +L KQ +
Sbjct: 414 QKAAIEEDLQALESQL-------RQQLEAGKLEFNEEEYELELRLGRLKQRLDSA----- 461
Query: 299 YKYSFKPPYLAQKDGYQKDHPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKAIATGL 358
+ P L Q + + A EE ++E E Q QS+L + K L
Sbjct: 462 ---TATPEELEQLEINDEALEKAQ---EEQEQAEANVE-----QLQSELRQLRKRRDEAL 510
Query: 359 DALQQKQDRILAVVSQGGGIPHQVVPGQ 386
+ALQ+ + R+L + + Q+ P
Sbjct: 511 EALQRAERRLLQLRQALDELELQLSPQA 538
>gnl|CDD|197696 smart00389, HOX, Homeodomain. DNA-binding factors that are
involved in the transcriptional regulation of key
developmental processes.
Length = 57
Score = 27.6 bits (62), Expect = 1.9
Identities = 10/32 (31%), Positives = 13/32 (40%), Gaps = 7/32 (21%)
Query: 181 FRNKPYPTRARIQYYMNTL-------TVWFHN 205
F+ PYP+R + L VWF N
Sbjct: 20 FQKNPYPSREEREELAKKLGLSERQVKVWFQN 51
>gnl|CDD|236196 PRK08241, PRK08241, RNA polymerase factor sigma-70; Validated.
Length = 339
Score = 30.3 bits (69), Expect = 2.0
Identities = 23/67 (34%), Positives = 29/67 (43%), Gaps = 14/67 (20%)
Query: 341 FQGQSQLAEWTKAIATG--LDALQQKQDRIL-------AVVSQGGGIPHQVVPG-QPMPM 390
F+G+S L W IAT LDAL+ + R L A + VP +P P
Sbjct: 64 FEGRSSLRTWLYRIATNVCLDALEGRARRPLPTDLGAPAADPVDELVERPEVPWLEPYP- 122
Query: 391 INNDALL 397
DALL
Sbjct: 123 ---DALL 126
>gnl|CDD|99890 cd05816, CBM20_DPE2_repeat2, Disproportionating enzyme 2 (DPE2),
N-terminal CBM20 (carbohydrate-binding module, family
20) domain, repeat 2. DPE2 is a transglucosidase that
is essential for the cytosolic metabolism of maltose in
plant leaves at night. Maltose is an intermediate on
the pathway from starch to sucrose and DPE2 is thought
to metabolize the maltose that is exported from the
chloroplast. DPE2 has two N-terminal CBM20 domains as
well as a C-terminal amylomaltase
(4-alpha-glucanotransferase) catalytic domain. DPE1,
the plastid version of this enzyme, has a
transglucosidase domain that is similar to that of DPE2
but lacks the N-terminal CBM20 domains. Included in
this group are DPE2-like proteins from Dictyostelium,
Entamoeba, and Bacteroides. The CBM20 domain is found
in a large number of starch degrading enzymes including
alpha-amylase, beta-amylase, glucoamylase, and CGTase
(cyclodextrin glucanotransferase). CBM20 is also
present in proteins that have a regulatory role in
starch metabolism in plants (e.g. alpha-amylase) or
glycogen metabolism in mammals (e.g. laforin). CBM20
folds as an antiparallel beta-barrel structure with two
starch binding sites. These two sites are thought to
differ functionally with site 1 acting as the initial
starch recognition site and site 2 involved in the
specific recognition of appropriate regions of starch.
Length = 99
Score = 28.1 bits (63), Expect = 2.6
Identities = 15/40 (37%), Positives = 18/40 (45%), Gaps = 8/40 (20%)
Query: 19 LVVLSSSQNPVERFEYKYSFKPPYLAQKDGSVPFWEYGGN 58
+ +S P FEYKY +A KD V WE G N
Sbjct: 48 DIDISKDSFP---FEYKY-----IIANKDSGVVSWENGPN 79
>gnl|CDD|240584 cd12949, NOPS_PSPC1, NOPS domain, including C-terminal coiled-coil
region, in paraspeckle protein component 1 (PSPC1) and
similar proteins. The family contains a DBHS domain
(for Drosophila behavior, human splicing), which
comprises two conserved RNA recognition motifs (RRMs),
also termed RBDs (RNA binding domains) or RNPs
(ribonucleoprotein domains), and a charged
protein-protein interaction NOPS (NONA and PSP1) domain.
This model corresponds to the NOPS domain, with a long
helical C-terminal extension, of paraspeckle component 1
(PSPC1, also termed PSP1), a novel nucleolar factor that
accumulates within a new nucleoplasmic compartment,
termed paraspeckles, and diffusely distributes in the
nucleoplasm. It is ubiquitously expressed and highly
conserved in vertebrates. Although its cellular function
remains unknown currently, PSPC1 forms a novel
heterodimer with the nuclear protein p54nrb, also known
as non-POU domain-containing octamer-binding protein
(NONO), which localizes to paraspeckles in an
RNA-dependent manner. The NOPS domain specifically binds
to the second RNA recognition motif (RRM2) domain of the
partner DBHS protein via a substantial interaction
surface. Its highly conserved C-terminal residues are
critical for functional DBHS dimerization while the
highly conserved C-terminal helical extension, forming a
right-handed antiparallel heterodimeric coiled-coil, is
essential for localization of these proteins to
subnuclear bodies.
Length = 94
Score = 28.1 bits (62), Expect = 2.8
Identities = 22/74 (29%), Positives = 38/74 (51%), Gaps = 6/74 (8%)
Query: 263 EQVNQED---QKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
EQ + ED +K+ Q+ QY K+ EQ P FE++Y+ + L + + Q++
Sbjct: 10 EQFDDEDGLPEKLMQKTQQYHKE-REQPPRFAQP-GTFEFEYASRWKALDEMEKQQREQV 67
Query: 320 DAHPNE-EEWYESE 332
D + E +E E+E
Sbjct: 68 DRNIREAKEKLEAE 81
>gnl|CDD|182544 PRK10555, PRK10555, aminoglycoside/multidrug efflux system;
Provisional.
Length = 1037
Score = 29.8 bits (67), Expect = 3.0
Identities = 14/34 (41%), Positives = 19/34 (55%), Gaps = 1/34 (2%)
Query: 249 FLTSSLLPPGAKQQEQVNQEDQKVAQEYAQYEKK 282
F TS LP G+ QQ Q + +KV + Y +EK
Sbjct: 571 FTTSVQLPSGSTQQ-QTLKVVEKVEKYYFTHEKD 603
>gnl|CDD|236465 PRK09319, PRK09319, bifunctional 3,4-dihydroxy-2-butanone
4-phosphate synthase/GTP cyclohydrolase II/unknown
domain fusion protein; Provisional.
Length = 555
Score = 29.5 bits (67), Expect = 3.9
Identities = 14/60 (23%), Positives = 23/60 (38%), Gaps = 9/60 (15%)
Query: 318 HPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKA------IATGLD---ALQQKQDRI 368
+WY+ N ++ I +LA+W I++G D LQ + DR
Sbjct: 466 FDQNKVASADWYKQSNHPYIKAIELLLDELAQWPNTKRLGFLISSGDDPALHLQVQLDRQ 525
>gnl|CDD|225087 COG2176, PolC, DNA polymerase III, alpha subunit (gram-positive
type) [DNA replication, recombination, and repair].
Length = 1444
Score = 29.6 bits (67), Expect = 4.5
Identities = 19/85 (22%), Positives = 33/85 (38%), Gaps = 15/85 (17%)
Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
K +E +N+E +K AQE + EKKL+ + VE+ + + + + K
Sbjct: 178 KFEEAINEEVEKAAQEALEAEKKLKAE----SPKVEKPKPLFDGQKGRKIKSTEEIKP-- 231
Query: 320 DAHPNEEEWYESENQRELRQIFQGQ 344
N+ E R +G
Sbjct: 232 ---------LIKINEEETRVKVEGY 247
>gnl|CDD|240581 cd12946, NOPS_p54nrb_PSF_PSPC1, NOPS domain, including C-terminal
coiled-coil region, in p54nrb/PSF/PSPC1 family proteins.
The family contains a DBHS domain (for Drosophila
behavior, human splicing), which comprises two conserved
RNA recognition motifs (RRMs), also termed RBDs (RNA
binding domains) or RNPs (ribonucleoprotein domains),
and a charged protein-protein interaction NOPS (NONA and
PSP1) domain. This model corresponds to the NOPS domain,
with a long helical C-terminal extension, found in the
p54nrb/PSF/PSPC1 proteins. The NOPS domain specifically
binds to the second RNA recognition motif (RRM2) domain
of the partner DBHS protein via a substantial
interaction surface. Its highly conserved C-terminal
residues are critical for functional DBHS dimerization
while the highly conserved C-terminal helical extension,
forming a right-handed antiparallel heterodimeric
coiled-coil, is essential for localization of these
proteins to subnuclear bodies. Members in the family
include 54 kDa nuclear RNA- and DNA-binding protein
(p54nrb), polypyrimidine tract-binding protein
(PTB)-associated-splicing factor (PSF) and paraspeckle
protein component 1 (PSPC1 or PSP1), which are
ubiquitously expressed and are conserved in vertebrates.
p54nrb, also termed NONO or NMT55, is a multi-functional
protein involved in numerous nuclear processes including
transcriptional regulation, splicing, DNA unwinding,
nuclear retention of hyperedited double-stranded RNA,
viral RNA processing, control of cell proliferation, and
circadian rhythm maintenance. PSF, also termed POMp100,
is a multi-functional protein that binds RNA,
single-stranded DNA (ssDNA), double-stranded DNA (dsDNA)
and many factors, and mediates diverse activities in the
cell. PSPC1 is a novel nucleolar factor that accumulates
within a new nucleoplasmic compartment, termed
paraspeckles, and diffusely distributes in the
nucleoplasm. The cellular function of PSPC1 remains
unknown currently. PSF has an additional large
N-terminal domain that differentiates it from other
family members.
Length = 93
Score = 27.4 bits (60), Expect = 4.9
Identities = 24/74 (32%), Positives = 42/74 (56%), Gaps = 6/74 (8%)
Query: 263 EQVNQED---QKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
EQ++ ED +K+AQ+ QY K+ E+ + +Q FEY+Y+ + L + + Q++
Sbjct: 10 EQLDDEDGLPEKLAQKNQQYHKEREQPPRFAQPGT--FEYEYAQRWKALDEMEKQQREQV 67
Query: 320 DAHPNE-EEWYESE 332
D + E +E ESE
Sbjct: 68 DRNIKEAKEKLESE 81
>gnl|CDD|238039 cd00086, homeodomain, Homeodomain; DNA binding domains involved in
the transcriptional regulation of key eukaryotic
developmental processes; may bind to DNA as monomers or
as homo- and/or heterodimers, in a sequence-specific
manner.
Length = 59
Score = 26.4 bits (59), Expect = 5.2
Identities = 9/32 (28%), Positives = 12/32 (37%), Gaps = 7/32 (21%)
Query: 181 FRNKPYPTRARIQYYMNTL-------TVWFHN 205
F PYP+R + L +WF N
Sbjct: 19 FEKNPYPSREEREELAKELGLTERQVKIWFQN 50
>gnl|CDD|233065 TIGR00634, recN, DNA repair protein RecN. All proteins in this
family for which functions are known are ATP binding
proteins involved in the initiation of recombination and
recombinational repair [DNA metabolism, DNA replication,
recombination, and repair].
Length = 563
Score = 28.9 bits (65), Expect = 5.8
Identities = 22/109 (20%), Positives = 48/109 (44%), Gaps = 20/109 (18%)
Query: 259 AKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEY-KYSFKPPYLAQKDGYQKD 317
A E+V + +++ Q + + ++L++++Q Q +R ++ ++ + + +
Sbjct: 154 AGANEKV-KAYRELYQAWLKARQQLKDRQQKEQELAQRLDFLQFQLE----------ELE 202
Query: 318 HPDAHPNEEEWYESENQRELRQIFQGQSQLAEWTKAIATGLDALQQKQD 366
D P E+E E+E QR S L + + L AL+ D
Sbjct: 203 EADLQPGEDEALEAEQQR--------LSNLEKLRELSQNALAALRGDVD 243
>gnl|CDD|235179 PRK03947, PRK03947, prefoldin subunit alpha; Reviewed.
Length = 140
Score = 27.6 bits (62), Expect = 6.1
Identities = 9/32 (28%), Positives = 19/32 (59%)
Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQ 291
K E++ + QK+A AQ ++L++ +Q +
Sbjct: 108 KALEKLEEALQKLASRIAQLAQELQQLQQEAA 139
>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
and chromosome partitioning].
Length = 420
Score = 28.5 bits (64), Expect = 6.7
Identities = 11/46 (23%), Positives = 23/46 (50%)
Query: 246 ILHFLTSSLLPPGAKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQ 291
+L L S+ + A +++ +++ +E A EKK+ EQ+
Sbjct: 17 LLASLLSAAVLAAAFSAAADDKQLKQIQKEIAALEKKIREQQDQRA 62
>gnl|CDD|221122 pfam11490, DNA_pol3_alph_N, DNA polymerase III polC-type
N-terminus. This is an N-terminal domain of DNA
polymerase III polC subunit A that is found only in
Firmicutes. DNA polymerase polC-type III enzyme
functions as the 'replicase' in low G + C Gram-positive
bacteria. Purine asymmetry is a characteristic of
organisms with a heterodimeric DNA polymerase III
alpha-subunit constituted by polC which probably plays a
direct role in the maintenance of strand-biased gene
distribution; since, among prokaryotic genomes, the
distribution of genes on the leading and lagging strands
of the replication fork is known to be biased. The
domain is associated with DNA_pol3_alpha pfam07733.
Length = 180
Score = 28.1 bits (63), Expect = 6.7
Identities = 10/35 (28%), Positives = 20/35 (57%)
Query: 259 AKQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNP 293
+ +EQ +E+ K+A+E + KK E +K+ +
Sbjct: 146 EEFEEQKEEEEAKLAEEALEALKKKEAEKKKKEKE 180
>gnl|CDD|227731 COG5444, COG5444, Uncharacterized conserved protein [Function
unknown].
Length = 565
Score = 28.6 bits (64), Expect = 7.4
Identities = 18/133 (13%), Positives = 33/133 (24%), Gaps = 30/133 (22%)
Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSF---------------- 303
E++ + DQ+ A E K++E K + + E
Sbjct: 157 DTLEKLYKLDQEGMTLMAAVESKMQELKAIIR----QLEEWTIKGGATKKGVPIHYVAKA 212
Query: 304 --------KPPYLAQKDGYQKDHPDAHPNEEEWYESENQRELRQIFQ-GQSQLAEWTKAI 354
K +A + D +E + + L + G LA
Sbjct: 213 FAEVTIHKKAAEVALQSETYLDIKTEL-AKERPDMRDLDKPLESANKTGYEGLAGEDIVP 271
Query: 355 ATGLDALQQKQDR 367
+ Q
Sbjct: 272 FFLAEETGQLAYS 284
>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
protein; Reviewed.
Length = 782
Score = 28.6 bits (65), Expect = 7.9
Identities = 9/46 (19%), Positives = 22/46 (47%), Gaps = 4/46 (8%)
Query: 260 KQQEQVNQEDQKVAQE----YAQYEKKLEEQKQHSQNPVERFEYKY 301
++ EQ +E + + +E + E+K E+ ++ +E E +
Sbjct: 530 RELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEKEA 575
>gnl|CDD|221389 pfam12037, DUF3523, Domain of unknown function (DUF3523). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 257 and 277 amino acids in length. This domain is
found associated with pfam00004. This domain has a
conserved LER sequence motif.
Length = 276
Score = 27.8 bits (62), Expect = 9.6
Identities = 22/80 (27%), Positives = 34/80 (42%), Gaps = 11/80 (13%)
Query: 260 KQQEQVNQEDQKVAQEYAQYEKKLEEQKQHSQNPVERFEYKYSFKPPYLAQKDGYQKDHP 319
QQ Q E +V E + + E+ +Q Q R +Y+ LA+K YQK+
Sbjct: 77 AQQAQAKLERARVEAE-ERRKTLQEQTQQEQQ----RAQYQDE-----LARK-RYQKELE 125
Query: 320 DAHPNEEEWYESENQRELRQ 339
EE + + + LRQ
Sbjct: 126 QQRRQNEELLKMQEESVLRQ 145
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.316 0.134 0.414
Gapped
Lambda K H
0.267 0.0760 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 21,118,718
Number of extensions: 2026201
Number of successful extensions: 2291
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2266
Number of HSP's successfully gapped: 51
Length of query: 413
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 314
Effective length of database: 6,546,556
Effective search space: 2055618584
Effective search space used: 2055618584
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 60 (26.8 bits)
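The figures in the statistics footer above are related by simple arithmetic, and the sketch below reproduces them as a sanity check. It assumes the usual BLAST conventions: effective lengths are obtained by subtracting the length adjustment from the query and from every database sequence, threshold scores (S1, S2) are converted to bits with the Karlin-Altschul formula (lambda * S - ln K) / ln 2, and drop-off values (X1, X2, X3) are converted with lambda * X / ln 2. The function and constant names are illustrative only and are not part of RPS-BLAST. Because the footer prints lambda and K rounded, recomputed bit scores can differ slightly from the printed ones.

```python
import math

# Values copied from the statistics footer of the report above.
QUERY_LENGTH = 413
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 99

# Karlin-Altschul parameters as printed (rounded) in the footer.
UNGAPPED = {"lambda": 0.316, "K": 0.134}
GAPPED = {"lambda": 0.267, "K": 0.0760}


def effective_lengths(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space)."""
    eff_query = query_len - adjustment
    # The adjustment is subtracted once per database sequence.
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, params):
    """Convert a raw alignment score to bits (assumed Karlin-Altschul form)."""
    return (params["lambda"] * raw_score - math.log(params["K"])) / math.log(2)


def dropoff_bits(raw_dropoff, params):
    """Convert an X drop-off value to bits (no K term for score differences)."""
    return params["lambda"] * raw_dropoff / math.log(2)


if __name__ == "__main__":
    eff_q, eff_db, space = effective_lengths(
        QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
    )
    print("Effective length of query:   ", eff_q)
    print("Effective length of database:", eff_db)
    print("Effective search space:      ", space)
    print("S1 = 41 ->", round(bit_score(41, UNGAPPED), 1), "bits (ungapped)")
    print("S2 = 60 ->", round(bit_score(60, GAPPED), 1), "bits (gapped)")
    print("X1 = 16 ->", round(dropoff_bits(16, UNGAPPED), 1), "bits (ungapped)")
    print("X2 = 38 ->", round(dropoff_bits(38, GAPPED), 1), "bits (gapped)")
    print("X3 = 64 ->", round(dropoff_bits(64, GAPPED), 1), "bits (gapped)")
```

The per-hit Expect values are deliberately not recomputed here: RPS-BLAST derives them with additional per-alignment corrections, so the footer's search space and a bit score alone are not expected to reproduce them.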