RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy4097
(129 letters)
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 33.7 bits (78), Expect = 0.002
Identities = 18/38 (47%), Positives = 21/38 (55%), Gaps = 4/38 (10%)
Query: 37 CSVQNGGCHPLATCRETSDTVRSVISCTCPPGMGGSGV 74
C+ NGGCHP ATC T + +CTC G G GV
Sbjct: 1 CAENNGGCHPNATCTNTGGS----FTCTCKSGYTGDGV 34
Score = 24.4 bits (54), Expect = 4.8
Identities = 12/25 (48%), Positives = 13/25 (52%), Gaps = 1/25 (4%)
Query: 6 ECINTQGYRKCGQCPHGWVGDGTTC 30
C NT G C C G+ GDG TC
Sbjct: 13 TCTNTGGSFTC-TCKSGYTGDGVTC 36
>gnl|CDD|100002 cd04962, GT1_like_5, This family is most closely related to the
GT1 family of glycosyltransferases.
Glycosyltransferases catalyze the transfer of sugar
moieties from activated donor molecules to specific
acceptor molecules, forming glycosidic bonds. The
acceptor molecule can be a lipid, a protein, a
heterocyclic compound, or another carbohydrate residue.
This group of glycosyltransferases is most closely
related to the previously defined glycosyltransferase
family 1 (GT1). The members of this family may transfer
UDP, ADP, GDP, or CMP linked sugars. The diverse
enzymatic activities among members of this family
reflect a wide range of biological functions. The
protein structure available for this family has the GTB
topology, one of the two protein topologies observed
for nucleotide-sugar-dependent glycosyltransferases.
GTB proteins have distinct N- and C- terminal domains
each containing a typical Rossmann fold. The two
domains have high structural homology despite minimal
sequence homology. The large cleft that separates the
two domains includes the catalytic center and permits a
high degree of flexibility. The members of this family
are found mainly in bacteria, while some of them are
also found in Archaea and eukaryotes.
Length = 371
Score = 27.2 bits (61), Expect = 2.7
Identities = 8/14 (57%), Positives = 8/14 (57%)
Query: 61 ISCTCPPGMGGSGV 74
I C P GGSGV
Sbjct: 3 IGIVCYPTYGGSGV 16
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar, but
has 8 instead of 6 conserved cysteines. Includes some
cytokine receptors. The EGF domain misses the N-terminus
regions of the Ca2+ binding EGF domains (this is the
main reason for the discrepancy between Swiss-Prot domain
start/end and Pfam). The family is hard to model due to
many similar but different sub-types of EGF domains.
Pfam certainly misses a number of EGF domains.
Length = 32
Score = 25.1 bits (55), Expect = 2.9
Identities = 11/26 (42%), Positives = 15/26 (57%), Gaps = 1/26 (3%)
Query: 93 CEHGGICAPIGDRGYRCQCEPGFTGE 118
C +GG C GY C+C G+TG+
Sbjct: 7 CSNGGTCVDTPG-GYTCECPEGYTGK 31
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular (mostly
animal) proteins. Many of these proteins require calcium
for their biological function and calcium-binding sites
have been found to be located at the N-terminus of
particular EGF-like domains; calcium-binding may be
crucial for numerous protein-protein interactions. Six
conserved core cysteines form three disulfide bridges as
in non calcium-binding EGF domains, whose structures are
very similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 24.9 bits (55), Expect = 3.3
Identities = 12/27 (44%), Positives = 16/27 (59%), Gaps = 1/27 (3%)
Query: 93 CEHGGICAPIGDRGYRCQCEPGFTGES 119
C++GG C YRC C PG+TG +
Sbjct: 11 CQNGGTCVNTVG-SYRCSCPPGYTGRN 36
>gnl|CDD|234438 TIGR03999, thiol_BshA, N-acetyl-alpha-D-glucosaminyl L-malate
synthase BshA. Members of this protein family are
BshA, a glycosyltransferase required for bacillithiol
biosynthesis. This enzyme combines UDP-GlcNAc and
L-malate to form N-acetyl-alpha-D-glucosaminyl L-malate.
Bacillithiol is a low-molecular-weight thiol,
an analog of glutathione and mycothiol, and is found
largely in the Firmicutes [Biosynthesis of cofactors,
prosthetic groups, and carriers, Glutathione and
analogs].
Length = 374
Score = 26.8 bits (60), Expect = 3.5
Identities = 9/14 (64%), Positives = 9/14 (64%)
Query: 61 ISCTCPPGMGGSGV 74
I TC P GGSGV
Sbjct: 3 IGITCYPTYGGSGV 16
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional significance
of EGF-like domains in what appear to be unrelated
proteins is not yet clear; a common feature is that
these repeats are found in the extracellular domain of
membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase); the
domain includes six cysteine residues which have been
shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 24.7 bits (54), Expect = 3.9
Identities = 12/26 (46%), Positives = 15/26 (57%), Gaps = 1/26 (3%)
Query: 93 CEHGGICAPIGDRGYRCQCEPGFTGE 118
C +GG C YRC C PG+TG+
Sbjct: 8 CSNGGTCVNTPG-SYRCVCPPGYTGD 32
>gnl|CDD|221190 pfam11727, ISG65-75, Invariant surface glycoprotein. This family
is found in Trypanosome species, and appears to be one
of two invariant surface glycoproteins, ISG65 and ISG75,
that are found in the mammalian stage of the parasitic
protozoan. The sequence suggests the two families are
polypeptides with N-terminal signal sequences,
hydrophilic extracellular domains, single trans-membrane
alpha-helices and short cytoplasmic domains. They are
both expressed in the bloodstream form but not in the
midgut stage. Both polypeptides are distributed over the
entire surface of the parasite.
Length = 286
Score = 26.3 bits (58), Expect = 4.3
Identities = 11/34 (32%), Positives = 16/34 (47%)
Query: 31 RQGTTGCSVQNGGCHPLATCRETSDTVRSVISCT 64
+ G S + C +A R SD R+VI C+
Sbjct: 167 KPGENAKSSPSQNCDGIAFKRHYSDGGRNVIDCS 200
>gnl|CDD|237357 PRK13348, PRK13348, chromosome replication initiation inhibitor
protein; Provisional.
Length = 294
Score = 25.7 bits (57), Expect = 8.0
Identities = 8/20 (40%), Positives = 10/20 (50%)
Query: 96 GGICAPIGDRGYRCQCEPGF 115
G + P+G YRC P F
Sbjct: 152 GCLAEPLGTMRYRCVASPAF 171
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.322 0.142 0.502
Gapped
Lambda K H
0.267 0.0635 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 6,115,491
Number of extensions: 484260
Number of successful extensions: 398
Number of sequences better than 10.0: 1
Number of HSP's gapped: 394
Number of HSP's successfully gapped: 52
Length of query: 129
Length of database: 10,937,602
Length adjustment: 86
Effective length of query: 43
Effective length of database: 7,123,158
Effective search space: 306295794
Effective search space used: 306295794
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (24.4 bits)