RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11800
(220 letters)
>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like. cEGF, or complement
C1r-like EGF, domains have six conserved cysteine
residues disulfide-bonded into the characteristic
pattern 'ababcc'. They are found in blood coagulation
proteins such as fibrillin, C1r and C1s, thrombomodulin,
and the LDL receptor. The core fold of the EGF domain
consists of two small beta-hairpins packed against each
other. Two major structural variants have been
identified based on the structural context of the
C-terminal cysteine residue of disulfide 'c' in the
C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
C-terminal thiol resides on the C-terminal beta-sheet,
resulting in long loop-lengths between the cysteine
residues of disulfide 'c', typically C[10+]XC. These
longer loop-lengths may have arisen by selective
cysteine loss from a four-disulfide EGF template such as
laminin or integrin. Tandem cEGF domains have five
linking residues between terminal cysteines of adjacent
domains. cEGF domains may or may not bind calcium in the
linker region. cEGF domains with the consensus motif
CXN4X[F,Y]XCXC are hydroxylated exclusively on the
asparagine residue.
Length = 24
Score = 40.5 bits (96), Expect = 1e-05
Identities = 12/24 (50%), Positives = 15/24 (62%)
Query: 141 SFKCQCKPGFVLSPTGHACIDVDE 164
S+ C C PG+ LS G C D+DE
Sbjct: 1 SYTCSCPPGYQLSGDGRTCEDIDE 24
Score = 37.4 bits (88), Expect = 2e-04
Identities = 14/24 (58%), Positives = 15/24 (62%)
Query: 73 SYRCACQPGYSPSPDGGFCVDRDE 96
SY C+C PGY S DG C D DE
Sbjct: 1 SYTCSCPPGYQLSGDGRTCEDIDE 24
Score = 37.4 bits (88), Expect = 2e-04
Identities = 14/24 (58%), Positives = 15/24 (62%)
Query: 183 SYRCACQPGYSPSPDGGFCVDRDE 206
SY C+C PGY S DG C D DE
Sbjct: 1 SYTCSCPPGYQLSGDGRTCEDIDE 24
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 40.7 bits (96), Expect = 1e-05
Identities = 14/30 (46%), Positives = 20/30 (66%), Gaps = 1/30 (3%)
Query: 122 DECSQNGMCANG-MCINMDGSFKCQCKPGF 150
DEC+ C NG C+N GS++C+C PG+
Sbjct: 3 DECASGNPCQNGGTCVNTVGSYRCECPPGY 32
Score = 40.3 bits (95), Expect = 2e-05
Identities = 22/43 (51%), Positives = 25/43 (58%), Gaps = 5/43 (11%)
Query: 161 DVDECYENPLICLNG-RCDNTLGSYRCACQPGYSPSPDGGFCV 202
D+DEC C NG C NT+GSYRC C PGY+ DG C
Sbjct: 1 DIDECASGN-PCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39
Score = 38.8 bits (91), Expect = 7e-05
Identities = 21/43 (48%), Positives = 25/43 (58%), Gaps = 5/43 (11%)
Query: 51 NVDECYENPLICLNG-RCDNTLGSYRCACQPGYSPSPDGGFCV 92
++DEC C NG C NT+GSYRC C PGY+ DG C
Sbjct: 1 DIDECASGN-PCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 40.2 bits (94), Expect = 2e-05
Identities = 15/33 (45%), Positives = 19/33 (57%)
Query: 123 ECSQNGMCANGMCINMDGSFKCQCKPGFVLSPT 155
EC+ G C+NG CIN GS+ C C PG+
Sbjct: 1 ECASGGPCSNGTCINTPGSYTCSCPPGYTGDKR 33
Score = 34.0 bits (78), Expect = 0.003
Identities = 16/34 (47%), Positives = 18/34 (52%), Gaps = 1/34 (2%)
Query: 54 ECYENPLICLNGRCDNTLGSYRCACQPGYSPSPD 87
EC C NG C NT GSY C+C PGY+
Sbjct: 1 ECASGG-PCSNGTCINTPGSYTCSCPPGYTGDKR 33
Score = 34.0 bits (78), Expect = 0.003
Identities = 16/34 (47%), Positives = 18/34 (52%), Gaps = 1/34 (2%)
Query: 164 ECYENPLICLNGRCDNTLGSYRCACQPGYSPSPD 197
EC C NG C NT GSY C+C PGY+
Sbjct: 1 ECASGG-PCSNGTCINTPGSYTCSCPPGYTGDKR 33
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular (mostly
animal) proteins. Many of these proteins require calcium
for their biological function and calcium-binding sites
have been found to be located at the N-terminus of
particular EGF-like domains; calcium-binding may be
crucial for numerous protein-protein interactions. Six
conserved core cysteines form three disulfide bridges as
in non calcium-binding EGF domains, whose structures are
very similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 39.9 bits (94), Expect = 3e-05
Identities = 14/32 (43%), Positives = 19/32 (59%), Gaps = 1/32 (3%)
Query: 122 DECSQNGMCAN-GMCINMDGSFKCQCKPGFVL 152
DEC+ C N G C+N GS++C C PG+
Sbjct: 3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34
Score = 39.2 bits (92), Expect = 5e-05
Identities = 17/33 (51%), Positives = 21/33 (63%)
Query: 161 DVDECYENPLICLNGRCDNTLGSYRCACQPGYS 193
D+DEC G C NT+GSYRC+C PGY+
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33
Score = 37.6 bits (88), Expect = 2e-04
Identities = 16/33 (48%), Positives = 21/33 (63%)
Query: 51 NVDECYENPLICLNGRCDNTLGSYRCACQPGYS 83
++DEC G C NT+GSYRC+C PGY+
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 39.3 bits (92), Expect = 4e-05
Identities = 19/42 (45%), Positives = 23/42 (54%), Gaps = 1/42 (2%)
Query: 161 DVDECYENPLIC-LNGRCDNTLGSYRCACQPGYSPSPDGGFC 201
DVDEC + C N C NT+GS+ C C GY + DG C
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
Score = 36.9 bits (86), Expect = 3e-04
Identities = 18/41 (43%), Positives = 22/41 (53%), Gaps = 1/41 (2%)
Query: 52 VDECYENPLIC-LNGRCDNTLGSYRCACQPGYSPSPDGGFC 91
VDEC + C N C NT+GS+ C C GY + DG C
Sbjct: 2 VDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
Score = 35.4 bits (82), Expect = 0.001
Identities = 14/40 (35%), Positives = 19/40 (47%), Gaps = 2/40 (5%)
Query: 122 DECSQNG-MCANGM-CINMDGSFKCQCKPGFVLSPTGHAC 159
DEC+ C C+N GSF+C C G+ + G C
Sbjct: 3 DECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
extracellular matrix molecules mediate cell-matrix and
matrix-matrix interactions thereby providing tissue
integrity. Some members of the matrilin family are
expressed specifically in developing cartilage
rudiments. The matrilin family consists of at least four
members. All the members of the matrilin family contain
VWA domains, EGF-like domains and a heptad repeat
coiled-coil domain at the carboxy terminus which is
responsible for the oligomerization of the matrilins.
The VWA domains have been shown to be essential for
matrilin network formation by interacting with matrix
ligands.
Length = 224
Score = 38.5 bits (90), Expect = 0.001
Identities = 15/40 (37%), Positives = 19/40 (47%), Gaps = 1/40 (2%)
Query: 159 CIDVDECYENPLICLNGRCDNTLGSYRCACQPGYSPSPDG 198
C+ D C +C C +T GSY CAC GY+ D
Sbjct: 184 CVVPDLCATLSHVCQQV-CISTPGSYLCACTEGYALLEDN 222
Score = 33.9 bits (78), Expect = 0.038
Identities = 14/37 (37%), Positives = 17/37 (45%), Gaps = 1/37 (2%)
Query: 52 VDECYENPLICLNGRCDNTLGSYRCACQPGYSPSPDG 88
D C +C C +T GSY CAC GY+ D
Sbjct: 187 PDLCATLSHVCQQV-CISTPGSYLCACTEGYALLEDN 222
Score = 33.1 bits (76), Expect = 0.064
Identities = 17/67 (25%), Positives = 21/67 (31%), Gaps = 27/67 (40%)
Query: 88 GGFCVDRDECRTPGDHDECSQKKKKKKKKKKLYHDECSQNGMCANGMCINMDGSFKCQCK 147
G CV D C T H C Q +CI+ GS+ C C
Sbjct: 181 GKICVVPDLCAT-LSHV-------------------CQQ-------VCISTPGSYLCACT 213
Query: 148 PGFVLSP 154
G+ L
Sbjct: 214 EGYALLE 220
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional significance
of EGF-like domains in what appear to be unrelated
proteins is not yet clear; a common feature is that
these repeats are found in the extracellular domain of
membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase); the
domain includes six cysteine residues which have been
shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 33.6 bits (77), Expect = 0.004
Identities = 13/34 (38%), Positives = 21/34 (61%), Gaps = 1/34 (2%)
Query: 123 ECSQNGMCAN-GMCINMDGSFKCQCKPGFVLSPT 155
EC+ + C+N G C+N GS++C C PG+ +
Sbjct: 1 ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34
Score = 30.5 bits (69), Expect = 0.064
Identities = 13/20 (65%), Positives = 14/20 (70%)
Query: 64 NGRCDNTLGSYRCACQPGYS 83
G C NT GSYRC C PGY+
Sbjct: 11 GGTCVNTPGSYRCVCPPGYT 30
Score = 30.5 bits (69), Expect = 0.064
Identities = 13/20 (65%), Positives = 14/20 (70%)
Query: 174 NGRCDNTLGSYRCACQPGYS 193
G C NT GSYRC C PGY+
Sbjct: 11 GGTCVNTPGSYRCVCPPGYT 30
>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
Pvs28. This family consists of ookinete surface
proteins (Pvs28) from several species of Plasmodium.
Pvs25 and Pvs28 are expressed on the surface of
ookinetes. These proteins are potential candidates for
vaccine and induce antibodies that block the infectivity
of Plasmodium vivax in immunised animals.
Length = 196
Score = 32.8 bits (75), Expect = 0.076
Identities = 19/72 (26%), Positives = 29/72 (40%), Gaps = 10/72 (13%)
Query: 130 CANGMCINMDGSFKCQCKPGFVLSPTGHACIDVDECYE---------NPLICLNGRCDNT 180
C NG I M F+C+C G+VL + C + +C + C+N
Sbjct: 8 CKNGYLIQMSNHFECKCNEGYVLK-NENTCEEKVKCDKLENVNKVCGEYATCINQANKAE 66
Query: 181 LGSYRCACQPGY 192
+ +C C GY
Sbjct: 67 EKALKCGCINGY 78
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 29.8 bits (68), Expect = 0.083
Identities = 15/36 (41%), Positives = 16/36 (44%), Gaps = 3/36 (8%)
Query: 125 SQNGMC-ANGMCINMDGSFKCQCKPGFVLSPTGHAC 159
NG C N C N GSF C CK G+ G C
Sbjct: 3 ENNGGCHPNATCTNTGGSFTCTCKSGYTGD--GVTC 36
Score = 27.5 bits (62), Expect = 0.61
Identities = 14/30 (46%), Positives = 17/30 (56%), Gaps = 1/30 (3%)
Query: 55 CYENPLIC-LNGRCDNTLGSYRCACQPGYS 83
C EN C N C NT GS+ C C+ GY+
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYT 30
Score = 27.5 bits (62), Expect = 0.61
Identities = 14/30 (46%), Positives = 17/30 (56%), Gaps = 1/30 (3%)
Query: 165 CYENPLIC-LNGRCDNTLGSYRCACQPGYS 193
C EN C N C NT GS+ C C+ GY+
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYT 30
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar, but
has 8 instead of 6 conserved cysteines. Includes some
cytokine receptors. The EGF domain misses the N-terminus
regions of the Ca2+ binding EGF domains (this is the
main reason of discrepancy between swiss-prot domain
start/end and Pfam). The family is hard to model due to
many similar but different sub-types of EGF domains.
Pfam certainly misses a number of EGF domains.
Length = 32
Score = 28.9 bits (65), Expect = 0.20
Identities = 11/28 (39%), Positives = 17/28 (60%), Gaps = 1/28 (3%)
Query: 124 CSQNGMCANGM-CINMDGSFKCQCKPGF 150
CS N C+NG C++ G + C+C G+
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTCECPEGY 28
Score = 25.5 bits (56), Expect = 3.2
Identities = 9/20 (45%), Positives = 11/20 (55%)
Query: 64 NGRCDNTLGSYRCACQPGYS 83
G C +T G Y C C GY+
Sbjct: 10 GGTCVDTPGGYTCECPEGYT 29
Score = 25.5 bits (56), Expect = 3.2
Identities = 9/20 (45%), Positives = 11/20 (55%)
Query: 174 NGRCDNTLGSYRCACQPGYS 193
G C +T G Y C C GY+
Sbjct: 10 GGTCVDTPGGYTCECPEGYT 29
>gnl|CDD|219901 pfam08555, DUF1754, Eukaryotic family of unknown function
(DUF1754). This is a eukaryotic protein family of
unknown function.
Length = 90
Score = 30.1 bits (68), Expect = 0.21
Identities = 10/12 (83%), Positives = 10/12 (83%)
Query: 107 SQKKKKKKKKKK 118
KKKKKKKKKK
Sbjct: 18 DVKKKKKKKKKK 29
Score = 28.2 bits (63), Expect = 1.1
Identities = 9/11 (81%), Positives = 10/11 (90%)
Query: 108 QKKKKKKKKKK 118
+KKKKKKKKK
Sbjct: 20 KKKKKKKKKKN 30
>gnl|CDD|224655 COG1741, COG1741, Pirin-related protein [General function
prediction only].
Length = 276
Score = 28.4 bits (64), Expect = 2.6
Identities = 9/27 (33%), Positives = 13/27 (48%)
Query: 13 LHWASAGPAVVHSTYTGLSETQTYTGL 39
+ W +AG +VHS S + GL
Sbjct: 93 VQWMTAGSGIVHSEMNPPSTGKPLHGL 119
>gnl|CDD|201524 pfam00954, S_locus_glycop, S-locus glycoprotein family. In
Brassicaceae, self-incompatible plants have a
self/non-self recognition system. This is
sporophytically controlled by multiple alleles at a
single locus (S). S-locus glycoproteins, as well as
S-receptor kinases, are in linkage with the S-alleles.
Length = 110
Score = 27.2 bits (61), Expect = 3.4
Identities = 13/31 (41%), Positives = 17/31 (54%), Gaps = 2/31 (6%)
Query: 122 DECSQNGMC-ANGMCINMDGSFKCQCKPGFV 151
D+C G C G C +++ S KC C GFV
Sbjct: 78 DQCDVYGRCGPYGYC-DVNTSPKCNCIKGFV 107
>gnl|CDD|99903 cd06080, MUM1_like, Mutated melanoma-associated antigen 1 (MUM-1)
is a melanoma-associated antigen (MAA). MUM-1 belongs
to the mutated or aberrantly expressed type of MAAs,
along with antigens such as CDK4, beta-catenin,
gp100-in4, p15, and N-acetylglucosaminyltransferase V.
It is highly expressed in several types of human
cancers. The PWWP domain, named for a conserved
Pro-Trp-Trp-Pro motif, is a small domain consisting of
100-150 amino acids. The PWWP domain is found in
numerous proteins that are involved in cell division,
growth and differentiation. Most PWWP-domain proteins
seem to be nuclear, often DNA-binding, proteins that
function as transcription factors regulating a variety
of developmental processes.
Length = 80
Score = 26.6 bits (59), Expect = 3.8
Identities = 7/19 (36%), Positives = 12/19 (63%)
Query: 102 DHDECSQKKKKKKKKKKLY 120
H +C++K+K K K+ Y
Sbjct: 55 KHFDCTEKQKLTNKAKESY 73
>gnl|CDD|216726 pfam01826, TIL, Trypsin Inhibitor like cysteine rich domain. This
family contains trypsin inhibitors as well as a domain
found in many extracellular proteins. The domain
typically contains ten cysteine residues that form five
disulphide bonds. The cysteine residues that form the
disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.
Length = 55
Score = 25.8 bits (57), Expect = 4.4
Identities = 8/22 (36%), Positives = 11/22 (50%), Gaps = 1/22 (4%)
Query: 144 CQCKPGFVLSPTGHACIDVDEC 165
C C PG+V G C+ +C
Sbjct: 35 CVCPPGYVRDNDG-KCVPPSQC 55
>gnl|CDD|222466 pfam13945, NST1, Splicing factor, salt tolerance regulator. NST1
is a family of proteins that seem to be involved,
directly or indirectly, in the salt sensitivity of some
cellular functions in yeast. These proteins also
interact with the splicing factor Msl1p.
Length = 189
Score = 27.6 bits (61), Expect = 4.5
Identities = 12/23 (52%), Positives = 14/23 (60%)
Query: 95 DECRTPGDHDECSQKKKKKKKKK 117
DE T +D S K KKKKKK+
Sbjct: 18 DELNTVIHNDSSSSKSKKKKKKR 40
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.136 0.461
Gapped
Lambda K H
0.267 0.0632 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 10,754,766
Number of extensions: 927083
Number of successful extensions: 1248
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1230
Number of HSP's successfully gapped: 65
Length of query: 220
Length of database: 10,937,602
Length adjustment: 93
Effective length of query: 127
Effective length of database: 6,812,680
Effective search space: 865210360
Effective search space used: 865210360
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 57 (25.9 bits)