RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11864
(193 letters)
>gnl|CDD|215916 pfam00431, CUB, CUB domain.
Length = 110
Score = 99.7 bits (249), Expect = 2e-27
Identities = 37/72 (51%), Positives = 52/72 (72%)
Query: 1 VALKFQSFEIENHDQCTYDYVEIRDGHAPDSPIIGTYCGYKLPPDIKSSGTKLMIKFVSD 60
++L FQ F++E+HD+C YDYVEIRDG SP++G +CG P DI+S+ ++ IKFVSD
Sbjct: 39 ISLTFQDFDLEDHDECGYDYVEIRDGLPSSSPLLGRFCGSGPPEDIRSTSNQMTIKFVSD 98
Query: 61 GSVQKPGFSAIF 72
S+ K GF A +
Sbjct: 99 SSISKRGFKATY 110
Score = 75.0 bits (185), Expect = 6e-18
Identities = 31/74 (41%), Positives = 43/74 (58%), Gaps = 4/74 (5%)
Query: 118 CGGILNTPNGTLTSPSFPDLYIKNKTCIWEIVAPPQYRISLNFTHFDIEGNNLFQSSCEY 177
CGG+L +G++TSP++P+ Y NK C+W I APP YRISL F FD+E C Y
Sbjct: 1 CGGVLTESSGSITSPNYPNSYPPNKDCVWTIRAPPGYRISLTFQDFDLED----HDECGY 56
Query: 178 DNLTVFSKIGDSFK 191
D + + + S
Sbjct: 57 DYVEIRDGLPSSSP 70
>gnl|CDD|238001 cd00041, CUB, CUB domain; extracellular domain; present in proteins
mostly known to be involved in development; not found in
prokaryotes, plants and yeast.
Length = 113
Score = 95.2 bits (237), Expect = 1e-25
Identities = 33/72 (45%), Positives = 47/72 (65%)
Query: 1 VALKFQSFEIENHDQCTYDYVEIRDGHAPDSPIIGTYCGYKLPPDIKSSGTKLMIKFVSD 60
+ L F+ F++E+ C+YDY+EI DG + SP++G +CG LPP I SSG L ++F SD
Sbjct: 40 IRLTFEDFDLESSPNCSYDYLEIYDGPSTSSPLLGRFCGSTLPPPIISSGNSLTVRFRSD 99
Query: 61 GSVQKPGFSAIF 72
SV GF A +
Sbjct: 100 SSVTGRGFKATY 111
Score = 77.8 bits (192), Expect = 5e-19
Identities = 32/77 (41%), Positives = 44/77 (57%), Gaps = 7/77 (9%)
Query: 118 CGGILNTP-NGTLTSPSFPDLYIKNKTCIWEIVAPPQYRISLNFTHFDIEGNNLFQSSCE 176
CGG L +GT++SP++P+ Y N C+W I APP YRI L F FD+E + +C
Sbjct: 1 CGGTLTASTSGTISSPNYPNNYPNNLNCVWTIEAPPGYRIRLTFEDFDLESSP----NCS 56
Query: 177 YDNLTVFSKIGDSFKSQ 193
YD L ++ G S S
Sbjct: 57 YDYLEIYD--GPSTSSP 71
>gnl|CDD|214483 smart00042, CUB, Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein. This domain is found mostly
among developmentally-regulated proteins. Spermadhesins
contain only this domain.
Length = 102
Score = 91.7 bits (228), Expect = 2e-24
Identities = 36/73 (49%), Positives = 49/73 (67%), Gaps = 1/73 (1%)
Query: 1 VALKFQSFEIENHDQCTYDYVEIRDGHAPDSPIIGTYCGYKLPPDIKSS-GTKLMIKFVS 59
+ L+F F++E+ D C YDYVEI DG + SP++G +CG + PP + SS L + FVS
Sbjct: 30 IELQFTDFDLESSDNCEYDYVEIYDGPSASSPLLGRFCGSEAPPPVISSSSNSLTLTFVS 89
Query: 60 DGSVQKPGFSAIF 72
D SVQK GFSA +
Sbjct: 90 DSSVQKRGFSARY 102
Score = 71.7 bits (176), Expect = 9e-17
Identities = 28/66 (42%), Positives = 39/66 (59%), Gaps = 4/66 (6%)
Query: 127 GTLTSPSFPDLYIKNKTCIWEIVAPPQYRISLNFTHFDIEGNNLFQSSCEYDNLTVFSKI 186
GT+TSP++P Y N C+W I APP YRI L FT FD+E ++ +CEYD + ++
Sbjct: 1 GTITSPNYPQSYPNNLDCVWTIRAPPGYRIELQFTDFDLESSD----NCEYDYVEIYDGP 56
Query: 187 GDSFKS 192
S
Sbjct: 57 SASSPL 62
>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
extracellular matrix molecules mediate cell-matrix and
matrix-matrix interactions thereby providing tissue
integrity. Some members of the matrilin family are
expressed specifically in developing cartilage
rudiments. The matrilin family consists of at least four
members. All the members of the matrilin family contain
VWA domains, EGF-like domains and a heptad repeat
coiled-coil domain at the carboxy terminus which is
responsible for the oligomerization of the matrilins.
The VWA domains have been shown to be essential for
matrilin network formation by interacting with matrix
ligands.
Length = 224
Score = 45.1 bits (107), Expect = 5e-06
Identities = 15/36 (41%), Positives = 18/36 (50%)
Query: 77 DECALEDHGCEHTCKNILGGYECSCKIGYELHSDGK 112
D CA H C+ C + G Y C+C GY L D K
Sbjct: 188 DLCATLSHVCQQVCISTPGSYLCACTEGYALLEDNK 223
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 39.2 bits (92), Expect = 4e-05
Identities = 17/40 (42%), Positives = 22/40 (55%), Gaps = 6/40 (15%)
Query: 77 DECALEDHGCEH--TCKNILGGYECSCKIGYELHSDGKIC 114
DECA + C++ TC N +G Y C C GY DG+ C
Sbjct: 3 DECA-SGNPCQNGGTCVNTVGSYRCECPPGYT---DGRNC 38
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 38.5 bits (90), Expect = 8e-05
Identities = 18/41 (43%), Positives = 21/41 (51%), Gaps = 2/41 (4%)
Query: 76 FDECALEDHGCEH--TCKNILGGYECSCKIGYELHSDGKIC 114
DECA H C C N +G +EC C GYE + DG C
Sbjct: 2 VDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular (mostly
animal) proteins. Many of these proteins require calcium
for their biological function and calcium-binding sites
have been found to be located at the N-terminus of
particular EGF-like domains; calcium-binding may be
crucial for numerous protein-protein interactions. Six
conserved core cysteines form three disulfide bridges as
in non calcium-binding EGF domains, whose structures are
very similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 36.8 bits (86), Expect = 3e-04
Identities = 15/35 (42%), Positives = 20/35 (57%), Gaps = 3/35 (8%)
Query: 77 DECALEDHGCEH--TCKNILGGYECSCKIGYELHS 109
DECA + C++ TC N +G Y CSC GY +
Sbjct: 3 DECA-SGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like. cEGF, or complement
Clr-like EGF, domains have six conserved cysteine
residues disulfide-bonded into the characteristic
pattern 'ababcc'. They are found in blood coagulation
proteins such as fibrillin, Clr and Cls, thrombomodulin,
and the LDL receptor. The core fold of the EGF domain
consists of two small beta-hairpins packed against each
other. Two major structural variants have been
identified based on the structural context of the
C-terminal cysteine residue of disulfide 'c' in the
C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
C-terminal thiol resides on the C-terminal beta-sheet,
resulting in long loop-lengths between the cysteine
residues of disulfide 'c', typically C[10+]XC. These
longer loop-lengths may have arisen by selective
cysteine loss from a four-disulfide EGF template such as
laminin or integrin. Tandem cEGF domains have five
linking residues between terminal cysteines of adjacent
domains. cEGF domains may or may not bind calcium in the
linker region. cEGF domains with the consensus motif
CXN4X[F,Y]XCXC are hydroxylated exclusively on the
asparagine residue.
Length = 24
Score = 34.4 bits (80), Expect = 0.001
Identities = 11/21 (52%), Positives = 13/21 (61%)
Query: 96 GYECSCKIGYELHSDGKICLD 116
Y CSC GY+L DG+ C D
Sbjct: 1 SYTCSCPPGYQLSGDGRTCED 21
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 34.4 bits (80), Expect = 0.002
Identities = 16/38 (42%), Positives = 20/38 (52%), Gaps = 4/38 (10%)
Query: 79 CALEDHGC-EH-TCKNILGGYECSCKIGYELHSDGKIC 114
CA + GC + TC N G + C+CK GY DG C
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYTG--DGVTC 36
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 32.9 bits (75), Expect = 0.006
Identities = 14/31 (45%), Positives = 15/31 (48%), Gaps = 2/31 (6%)
Query: 78 ECALEDHGCEH-TCKNILGGYECSCKIGYEL 107
ECA C + TC N G Y CSC GY
Sbjct: 1 ECA-SGGPCSNGTCINTPGSYTCSCPPGYTG 30
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional significance
of EGF-like domains in what appear to be unrelated
proteins is not yet clear; a common feature is that
these repeats are found in the extracellular domain of
membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase); the
domain includes six cysteine residues which have been
shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 32.1 bits (73), Expect = 0.012
Identities = 13/32 (40%), Positives = 15/32 (46%), Gaps = 3/32 (9%)
Query: 78 ECALEDHGCEH--TCKNILGGYECSCKIGYEL 107
ECA + C + TC N G Y C C GY
Sbjct: 1 ECA-ASNPCSNGGTCVNTPGSYRCVCPPGYTG 31
>gnl|CDD|215063 PLN00120, PLN00120, fucoxanthin-chlorophyll a-c binding protein;
Provisional.
Length = 202
Score = 31.3 bits (71), Expect = 0.20
Identities = 18/45 (40%), Positives = 23/45 (51%), Gaps = 6/45 (13%)
Query: 14 DQCTYD---YVEIRDGHAPDSPIIG---TYCGYKLPPDIKSSGTK 52
DQ +D YVEI+ G ++G T G +LP DI SGT
Sbjct: 57 DQEKFDRLRYVEIKHGRISMLAVVGYLVTEAGIRLPGDIDYSGTS 101
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar, but
has 8 instead of 6 conserved cysteines. Includes some
cytokine receptors. The EGF domain misses the N-terminus
regions of the Ca2+ binding EGF domains (this is the
main reason of discrepancy between swiss-prot domain
start/end and Pfam). The family is hard to model due to
many similar but different sub-types of EGF domains.
Pfam certainly misses a number of EGF domains.
Length = 32
Score = 27.0 bits (60), Expect = 0.85
Identities = 10/26 (38%), Positives = 14/26 (53%), Gaps = 2/26 (7%)
Query: 82 EDHGCEH--TCKNILGGYECSCKIGY 105
++ C + TC + GGY C C GY
Sbjct: 3 PNNPCSNGGTCVDTPGGYTCECPEGY 28
>gnl|CDD|200520 cd11259, Sema_4D, The Sema domain, a protein interacting module, of
semaphorin 4D (Sema4D, also known as CD100).
Sema4D/CD100 is expressed in immune cells and plays
critical roles in immune response; it is thus termed an
"immune semaphorin". It is expressed by lymphocytes and
promotes the aggregation and survival of B lymphocytes
and inhibits cytokine-induced migration of immune cells
in vitro. Sema4D/CD100 knock-out mice demonstrate that
Sema4D is required for normal activation of B and T
lymphocytes. Sema4D increases B-cell and DC function
using either Plexin B1 or CD72 as receptors. The
function of Sema4D in immune response implicates its
role in infectious and noninfectious diseases. Sema4D
belongs to the class 4 transmembrane semaphorin family
of proteins. Semaphorins are regulatory molecules in the
development of the nervous system and in axonal
guidance. They also play important roles in other
biological processes, such as angiogenesis, immune
regulation, respiration systems and cancer. The Sema
domain is located at the N-terminus and contains four
disulfide bonds formed by eight conserved cysteine
residues. It serves as a receptor-recognition and
-binding module.
Length = 471
Score = 29.8 bits (67), Expect = 0.85
Identities = 14/30 (46%), Positives = 18/30 (60%)
Query: 158 LNFTHFDIEGNNLFQSSCEYDNLTVFSKIG 187
LN T + G N FQ +C+Y NLT F +G
Sbjct: 88 LNDTFLYVCGTNAFQPTCDYLNLTSFRLLG 117
>gnl|CDD|193543 cd05667, M20_Acy1_like2, M20 Peptidase Aminoacylase 1 subfamily.
Peptidase M20 family, Uncharacterized subfamily of
bacterial proteins that have been predicted as
N-acyl-L-amino acid amidohydrolase (amaA), thermostable
carboxypeptidase (cpsA-1, cpsA-2 in Sulfolobus
solfataricus) and abgB (aminobenzoyl-glutamate
utilization protein B), and generally are involved in
the urea cycle and metabolism of amino groups.
Aminoacylases 1 (ACY1s) comprise a class of zinc binding
homodimeric enzymes involved in the hydrolysis of
N-acetylated proteins. N-terminal acetylation of
proteins is a widespread and highly conserved
process that is involved in the protection and stability
of proteins. Several types of aminoacylases can be
distinguished on the basis of substrate specificity.
ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino
acids (except L-aspartate), especially
N-acetyl-methionine and acetyl-glutamate into L-amino
acids and an acyl group. However, ACY1 can also catalyze
the reverse reaction, the synthesis of acetylated amino
acids. ACY1 may also play a role in xenobiotic
bioactivation as well as the inter-organ processing of
amino acid-conjugated xenobiotic derivatives
(S-substituted-N-acetyl-L-cysteine).
Length = 402
Score = 27.6 bits (62), Expect = 5.0
Identities = 11/30 (36%), Positives = 16/30 (53%), Gaps = 3/30 (10%)
Query: 43 PPDIKSSGTKLMIKFVSDGSVQKPGFSAIF 72
P + G KLM+K +G ++ P AIF
Sbjct: 145 APPGEEGGAKLMVK---EGVLKNPKVDAIF 171
>gnl|CDD|223160 COG0082, AroC, Chorismate synthase [Amino acid transport and
metabolism].
Length = 369
Score = 27.2 bits (61), Expect = 5.9
Identities = 12/26 (46%), Positives = 15/26 (57%), Gaps = 4/26 (15%)
Query: 7 SFEIENHDQCTYDYVEIRD----GHA 28
+ IEN DQ + DY I+D GHA
Sbjct: 81 ALLIENTDQRSKDYSMIKDPPRPGHA 106
>gnl|CDD|129255 TIGR00151, ispF, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate
synthase. Members of this protein family are
2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase,
the IspF protein of the deoxyxylulose (non-mevalonate)
pathway of IPP biosynthesis. This protein occurs as an
IspDF bifunctional fusion protein in about 20 percent of
bacterial genomes [Biosynthesis of cofactors, prosthetic
groups, and carriers, Other].
Length = 155
Score = 26.1 bits (58), Expect = 8.3
Identities = 21/75 (28%), Positives = 28/75 (37%), Gaps = 11/75 (14%)
Query: 93 ILGGYECSCKIGYELHSDGKICLDA-CGGILN-TPNGTLTSPSFPDLYIKNKTC------ 144
ILGG E + G HSDG + L A +L G + FPD + K
Sbjct: 18 ILGGVEIPHEKGLLAHSDGDVLLHALTDALLGALGLGDIGK-HFPDTDPRWKGADSRVLL 76
Query: 145 --IWEIVAPPQYRIS 157
++ YRI
Sbjct: 77 RHAVALIKEKGYRIG 91
>gnl|CDD|143612 cd07304, Chorismate_synthase, Chorismate synthase, the enzyme
catalyzing the final step of the shikimate pathway.
Chorismate synthase (CS;
5-enolpyruvylshikimate-3-phosphate phospholyase;
1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C.
4.2.3.5) catalyzes the seventh and final step in the
shikimate pathway: the conversion of 5-
enolpyruvylshikimate-3-phosphate (EPSP) to chorismate,
a precursor for the biosynthesis of aromatic compounds.
This process has an absolute requirement for reduced
FMN as a co-factor which is thought to facilitate
cleavage of C-O bonds by transiently donating an
electron to the substrate, with no overall change in its
redox state. Depending on the capacity of these enzymes
to regenerate the reduced form of FMN, chorismate
synthases are divided into two classes: Enzymes, mostly
from plants and eubacteria, that sequester CS from the
cellular environment, are monofunctional, while those
that can generate reduced FMN at the expense of NADPH,
such as found in fungi and the ciliated protozoan
Euglena gracilis, are bifunctional, having an
additional NADPH:FMN oxidoreductase activity. Recently,
bifunctionality of the Mycobacterium tuberculosis
enzyme (MtCS) was determined by measurements of both
chorismate synthase and NADH:FMN oxidoreductase
activities. Since shikimate pathway enzymes are present
in bacteria, fungi and apicomplexan parasites (such as
Toxoplasma gondii, Plasmodium falciparum, and
Cryptosporidium parvum) but absent in mammals, they are
potentially attractive targets for the development of
new therapy against infectious diseases such as
tuberculosis (TB).
Length = 344
Score = 26.6 bits (60), Expect = 9.3
Identities = 9/26 (34%), Positives = 14/26 (53%), Gaps = 4/26 (15%)
Query: 7 SFEIENHDQCTYDYVEIRD----GHA 28
+ I N DQ ++DY ++ GHA
Sbjct: 73 ALLIRNKDQRSWDYSMLKTLPRPGHA 98
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.320 0.140 0.444
Gapped
Lambda K H
0.267 0.0632 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 9,688,940
Number of extensions: 862050
Number of successful extensions: 581
Number of sequences better than 10.0: 1
Number of HSP's gapped: 576
Number of HSP's successfully gapped: 29
Length of query: 193
Length of database: 10,937,602
Length adjustment: 92
Effective length of query: 101
Effective length of database: 6,857,034
Effective search space: 692560434
Effective search space used: 692560434
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 56 (25.6 bits)