RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy5750
(286 letters)
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 46.8 bits (112), Expect = 1e-07
Identities = 16/36 (44%), Positives = 21/36 (58%)
Query: 118 CNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQC 153
C CH NA C N GS++C C+ G+TG+G C
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYTGDGVTC 36
Score = 43.3 bits (103), Expect = 2e-06
Identities = 15/36 (41%), Positives = 20/36 (55%)
Query: 216 CMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNC 251
C C+ NA C N G++ C CK G++GDG C
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYTGDGVTC 36
>gnl|CDD|238158 cd00255, nidG2, Nidogen, G2 domain; Nidogen is an important
component of the basement membrane, an extracellular
sheet-like matrix. Nidogen is a multifunctional protein
that interacts with many other basement membrane
proteins, like collagen, perlecan, laminin, and has a
potential role in the assembly and connection of
networks. Nidogen consists of 3 globular domains
(G1-G3), G3 is the laminin-binding domain, while G2 binds
collagen IV and perlecan. Also found in hemicentin, a
protein which functions at various cell-cell and
cell-matrix junctions and might assist in refining broad
regions of cell contact into oriented, line-shaped
junctions. Nidogen G2 consists of an N-terminal EGF-like
domain (excluded from this alignment model) and an
11-stranded beta-barrel with a central helix, a topology
that exhibits high structural similarity to the green
fluorescent proteins of Cnidaria.
Length = 224
Score = 50.0 bits (120), Expect = 3e-07
Identities = 17/66 (25%), Positives = 31/66 (46%), Gaps = 1/66 (1%)
Query: 1 GVFNYSAELIFST-GQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQE 59
G F AE+ F T G++L + + G D + L + I G P + G + +ED +
Sbjct: 84 GEFTRQAEVTFYTGGEKLRITQVARGLDSHGHLLLDTVISGRVPQVPAGATVHIEDYTEL 143
Query: 60 FTHSST 65
+ ++
Sbjct: 144 YHYTGP 149
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 45.4 bits (108), Expect = 4e-07
Identities = 16/37 (43%), Positives = 21/37 (56%)
Query: 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNG 150
D++EC GT C N +C N IGS+ C C G+ N
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNE 37
Score = 40.8 bits (96), Expect = 2e-05
Identities = 17/42 (40%), Positives = 24/42 (57%), Gaps = 2/42 (4%)
Query: 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGF--SGDGFNC 251
DVDEC + C N C+N G+++C C G+ + DG NC
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 44.9 bits (107), Expect = 6e-07
Identities = 19/41 (46%), Positives = 27/41 (65%), Gaps = 2/41 (4%)
Query: 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNCE 252
D+DEC + P C N C+N G+Y+C+C G++ DG NCE
Sbjct: 1 DIDECASGNP-CQNGGTCVNTVGSYRCECPPGYT-DGRNCE 39
Score = 43.8 bits (104), Expect = 1e-06
Identities = 18/41 (43%), Positives = 24/41 (58%), Gaps = 2/41 (4%)
Query: 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQCT 154
DI+EC A + C C N +GSY C+C PG+T +G C
Sbjct: 1 DIDEC-ASGNPCQNGGTCVNTVGSYRCECPPGYT-DGRNCE 39
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular (mostly
animal) proteins. Many of these proteins require calcium
for their biological function and calcium-binding sites
have been found to be located at the N-terminus of
particular EGF-like domains; calcium-binding may be
crucial for numerous protein-protein interactions. Six
conserved core cysteines form three disulfide bridges as
in non calcium-binding EGF domains, whose structures are
very similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 43.0 bits (102), Expect = 3e-06
Identities = 18/41 (43%), Positives = 25/41 (60%), Gaps = 3/41 (7%)
Query: 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNCE 252
D+DEC + P C N C+N G+Y+C C G++G NCE
Sbjct: 1 DIDECASGNP-CQNGGTCVNTVGSYRCSCPPGYTGR--NCE 38
Score = 43.0 bits (102), Expect = 3e-06
Identities = 17/35 (48%), Positives = 21/35 (60%), Gaps = 1/35 (2%)
Query: 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTG 148
DI+EC A + C C N +GSY C C PG+TG
Sbjct: 1 DIDEC-ASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34
>gnl|CDD|214774 smart00682, G2F, G2 nidogen domain and fibulin.
Length = 227
Score = 46.7 bits (111), Expect = 3e-06
Identities = 19/64 (29%), Positives = 35/64 (54%)
Query: 1 GVFNYSAELIFSTGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEF 60
GVF E+ F+ G+ L +++ F G D + LK++ + G P +A G + D +E+
Sbjct: 86 GVFTRETEVTFAGGEILRIKQTFSGLDEHGYLKVKIEVSGRVPQVAAGAEVTIPDYTEEY 145
Query: 61 THSS 64
T++
Sbjct: 146 TYTG 149
>gnl|CDD|219422 pfam07474, G2F, G2F domain. Nidogen, an invariant component of
basement membranes, is a multifunctional protein that
interacts with most other major basement membrane
proteins. The G2 fragment (or G2F domain) contains
binding sites for collagen IV and perlecan. The
structure is composed of an 11-stranded beta-barrel with
a central helix. This domain is structurally related to
that of green fluorescent protein pfam01353. A large
surface patch on the beta-barrel is conserved in all
metazoan nidogens.
Length = 193
Score = 41.7 bits (98), Expect = 1e-04
Identities = 17/64 (26%), Positives = 32/64 (50%)
Query: 1 GVFNYSAELIFSTGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEF 60
GVF E+ F TG+ L +++ F G D L ++ + G P I G ++D +++
Sbjct: 86 GVFKRETEVTFHTGEILRIKQIFSGLDSDGYLLIKTVVSGRVPQIPSGAEVTIKDYTEDY 145
Query: 61 THSS 64
++
Sbjct: 146 HYTG 149
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional significance
of EGF-like domains in what appear to be unrelated
proteins is not yet clear; a common feature is that
these repeats are found in the extracellular domain of
membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase); the
domain includes six cysteine residues which have been
shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 35.5 bits (82), Expect = 0.001
Identities = 15/35 (42%), Positives = 19/35 (54%), Gaps = 1/35 (2%)
Query: 117 ECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGH 151
EC A ++ C C N GSY C C PG+TG+
Sbjct: 1 EC-AASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34
Score = 34.4 bits (79), Expect = 0.003
Identities = 14/33 (42%), Positives = 21/33 (63%), Gaps = 1/33 (3%)
Query: 220 PPICNNNADCINRPGTYQCQCKRGFSGDGFNCE 252
C+N C+N PG+Y+C C G++GD +CE
Sbjct: 5 SNPCSNGGTCVNTPGSYRCVCPPGYTGD-RSCE 36
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar, but
has 8 instead of 6 conserved cysteines. Includes some
cytokine receptors. The EGF domain misses the N-terminus
regions of the Ca2+ binding EGF domains (this is the
main reason of discrepancy between swiss-prot domain
start/end and Pfam). The family is hard to model due to
many similar but different sub-types of EGF domains.
Pfam certainly misses a number of EGF domains.
Length = 32
Score = 32.4 bits (74), Expect = 0.013
Identities = 10/25 (40%), Positives = 16/25 (64%)
Query: 223 CNNNADCINRPGTYQCQCKRGFSGD 247
C+N C++ PG Y C+C G++G
Sbjct: 7 CSNGGTCVDTPGGYTCECPEGYTGK 31
Score = 29.7 bits (67), Expect = 0.13
Identities = 9/24 (37%), Positives = 13/24 (54%)
Query: 125 CHKNAMCFNEIGSYSCQCRPGFTG 148
C C + G Y+C+C G+TG
Sbjct: 7 CSNGGTCVDTPGGYTCECPEGYTG 30
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 29.8 bits (67), Expect = 0.14
Identities = 12/24 (50%), Positives = 15/24 (62%)
Query: 128 NAMCFNEIGSYSCQCRPGFTGNGH 151
N C N GSY+C C PG+TG+
Sbjct: 10 NGTCINTPGSYTCSCPPGYTGDKR 33
Score = 29.4 bits (66), Expect = 0.20
Identities = 14/27 (51%), Positives = 17/27 (62%), Gaps = 1/27 (3%)
Query: 226 NADCINRPGTYQCQCKRGFSGDGFNCE 252
N CIN PG+Y C C G++GD CE
Sbjct: 10 NGTCINTPGSYTCSCPPGYTGDK-RCE 35
>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like. cEGF, or complement
Clr-like EGF, domains have six conserved cysteine
residues disulfide-bonded into the characteristic
pattern 'ababcc'. They are found in blood coagulation
proteins such as fibrillin, Clr and Cls, thrombomodulin,
and the LDL receptor. The core fold of the EGF domain
consists of two small beta-hairpins packed against each
other. Two major structural variants have been
identified based on the structural context of the
C-terminal cysteine residue of disulfide 'c' in the
C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
C-terminal thiol resides on the C-terminal beta-sheet,
resulting in long loop-lengths between the cysteine
residues of disulfide 'c', typically C[10+]XC. These
longer loop-lengths may have arisen by selective
cysteine loss from a four-disulfide EGF template such as
laminin or integrin. Tandem cEGF domains have five
linking residues between terminal cysteines of adjacent
domains. cEGF domains may or may not bind calcium in the
linker region. cEGF domains with the consensus motif
CXN4X[F,Y]XCXC are hydroxylated exclusively on the
asparagine residue.
Length = 24
Score = 28.6 bits (65), Expect = 0.24
Identities = 10/22 (45%), Positives = 14/22 (63%), Gaps = 2/22 (9%)
Query: 137 SYSCQCRPGFT--GNGHQCTEI 156
SY+C C PG+ G+G C +I
Sbjct: 1 SYTCSCPPGYQLSGDGRTCEDI 22
Score = 25.1 bits (56), Expect = 4.9
Identities = 10/19 (52%), Positives = 11/19 (57%), Gaps = 2/19 (10%)
Query: 236 YQCQCKRGF--SGDGFNCE 252
Y C C G+ SGDG CE
Sbjct: 2 YTCSCPPGYQLSGDGRTCE 20
Score = 24.3 bits (54), Expect = 9.3
Identities = 13/27 (48%), Positives = 15/27 (55%), Gaps = 5/27 (18%)
Query: 189 TCNCDPGYQKDYLDDRRVAFVCTDVDE 215
TC+C PGYQ D R C D+DE
Sbjct: 3 TCSCPPGYQLS--GDGR---TCEDIDE 24
>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
extracellular matrix molecules mediate cell-matrix and
matrix-matrix interactions thereby providing tissue
integrity. Some members of the matrilin family are
expressed specifically in developing cartilage
rudiments. The matrilin family consists of at least four
members. All the members of the matrilin family contain
VWA domains, EGF-like domains and a heptad repeat
coiled-coil domain at the carboxy terminus which is
responsible for the oligomerization of the matrilins.
The VWA domains have been shown to be essential for
matrilin network formation by interacting with matrix
ligands.
Length = 224
Score = 32.0 bits (73), Expect = 0.27
Identities = 12/36 (33%), Positives = 17/36 (47%), Gaps = 2/36 (5%)
Query: 209 VCTDVDECMNYPPICNNNADCINRPGTYQCQCKRGF 244
+C D C +C CI+ PG+Y C C G+
Sbjct: 183 ICVVPDLCATLSHVCQQV--CISTPGSYLCACTEGY 216
Score = 28.9 bits (65), Expect = 2.6
Identities = 24/102 (23%), Positives = 39/102 (38%), Gaps = 24/102 (23%)
Query: 46 AVGVSPVLEDLQQEFTHSSTVNDDPCKNFFCVANSSCIVEDDKPTCICNRGFQQLYSEDR 105
AVGV E+ +E S + D + F V + S I ++L +
Sbjct: 140 AVGVGRADEEELREIA-SEPLAD----HVFYVEDFSTI--------------EELTK--K 178
Query: 106 LQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFT 147
Q C + C + +C + +C + GSY C C G+
Sbjct: 179 FQGKI-CVVPDLCATLSHVCQQ--VCISTPGSYLCACTEGYA 217
>gnl|CDD|193419 pfam12946, EGF_MSP1_1, MSP1 EGF domain 1. This EGF-like domain is
found at the C-terminus of the malaria parasite MSP1
protein. MSP1 is the merozoite surface protein 1. This
domain is part of the C-terminal fragment that is
proteolytically processed from the rest of the
protein and is left attached to the surface of the
invading parasite.
Length = 37
Score = 26.2 bits (58), Expect = 2.3
Identities = 10/31 (32%), Positives = 14/31 (45%), Gaps = 1/31 (3%)
Query: 223 CNNNADCINR-PGTYQCQCKRGFSGDGFNCE 252
C NA C G +C+C G+ +G C
Sbjct: 7 CPANAGCFRYLDGREECRCLLGYKKEGGKCV 37
Score = 25.0 bits (55), Expect = 7.2
Identities = 11/30 (36%), Positives = 15/30 (50%), Gaps = 1/30 (3%)
Query: 125 CHKNAMCFNEI-GSYSCQCRPGFTGNGHQC 153
C NA CF + G C+C G+ G +C
Sbjct: 7 CPANAGCFRYLDGREECRCLLGYKKEGGKC 36
>gnl|CDD|239562 cd03480, Rieske_RO_Alpha_PaO, Rieske non-heme iron oxygenase (RO)
family, Pheophorbide a oxygenase (PaO) subfamily,
N-terminal Rieske domain of the oxygenase alpha subunit;
composed of the oxygenase alpha subunits of a small
subfamily of enzymes found in plants as well as oxygenic
cyanobacterial photosynthesizers including LLS1 (lethal
leaf spot 1, also known as PaO) and ACD1 (accelerated
cell death 1). ROs comprise a large class of aromatic
ring-hydroxylating dioxygenases that enable
microorganisms to tolerate and utilize aromatic
compounds for growth. The oxygenase alpha subunit
contains an N-terminal Rieske domain with a [2Fe-2S]
cluster and a C-terminal catalytic domain with a
mononuclear Fe(II) binding site. The Rieske [2Fe-2S]
cluster accepts electrons from a reductase or ferredoxin
component and transfers them to the mononuclear iron for
catalysis. PaO expression increases upon physical
wounding of plant leaves and is thought to catalyze a
key step in chlorophyll degradation. The
Arabidopsis-accelerated cell death gene ACD1 is involved
in oxygenation of PaO.
Length = 138
Score = 28.1 bits (63), Expect = 2.8
Identities = 11/29 (37%), Positives = 13/29 (44%), Gaps = 3/29 (10%)
Query: 146 FTGNGHQCTEITVPQTGPTSPCESDPRAC 174
F G+G C I PQ + PRAC
Sbjct: 87 FDGSG-SCQRI--PQAAEGGKAHTSPRAC 112
>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
Pvs28. This family consists of several ookinete surface
proteins (Pvs28) from several species of Plasmodium.
Pvs25 and Pvs28 are expressed on the surface of
ookinetes. These proteins are potential candidates for
vaccine and induce antibodies that block the infectivity
of Plasmodium vivax in immunised animals.
Length = 196
Score = 27.4 bits (61), Expect = 6.4
Identities = 46/168 (27%), Positives = 66/168 (39%), Gaps = 36/168 (21%)
Query: 91 CICNRGFQQLYSEDRLQDDFGCFDINECNAGTDL---CHKNAMCFN-----EIGSYSCQC 142
C CN G+ L +E+ C + +C+ ++ C + A C N E + C C
Sbjct: 22 CKCNEGYV-LKNENT------CEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGC 74
Query: 143 RPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLD 202
G+T + C VP C S +P + T TC+C+ G K
Sbjct: 75 INGYTLSQGVC----VPNKCNNKVCGSGKCIVDPANPNNT------TCSCNIG--KVPDQ 122
Query: 203 DRRVAFVCTDVDE--CMNYPPICNNNADCINRPGTYQCQCKRGFSGDG 248
+ + CT E C C N +C G Y+C CK GF GDG
Sbjct: 123 NGK----CTKTGETKCSLK---CKENEECKLVGGYYECVCKEGFPGDG 163
Score = 27.4 bits (61), Expect = 7.3
Identities = 25/95 (26%), Positives = 35/95 (36%), Gaps = 12/95 (12%)
Query: 71 CKNFFCVANSSCIVE---DDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHK 127
C N C + CIV+ + TC CN G ++ D G + C +
Sbjct: 90 CNNKVC-GSGKCIVDPANPNNTTCSCNIG--------KVPDQNGKCTKTGETKCSLKCKE 140
Query: 128 NAMCFNEIGSYSCQCRPGFTGNGHQCTEITVPQTG 162
N C G Y C C+ GF G+G P +
Sbjct: 141 NEECKLVGGYYECVCKEGFPGDGGGTGSGGPPTSS 175
>gnl|CDD|165214 PHA02887, PHA02887, EGF-like protein; Provisional.
Length = 126
Score = 26.8 bits (59), Expect = 8.1
Identities = 16/45 (35%), Positives = 26/45 (57%), Gaps = 6/45 (13%)
Query: 58 QEFTHSSTVNDDPCK---NFFCVANSSC--IVEDDKPTCICNRGF 97
Q F +++ + CK N FC+ N C I++ D+ CICN+G+
Sbjct: 73 QNFKRKNSMFFEKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGY 116
>gnl|CDD|114473 pfam05749, Rubella_E2, Rubella membrane glycoprotein E2. Rubella
virus (RV), the sole member of the genus Rubivirus
within the family Togaviridae, is a small enveloped,
positive strand RNA virus. The nucleocapsid consists of
40S genomic RNA and a single species of capsid protein
which is enveloped within a host-derived lipid bilayer
containing two viral glycoproteins, E1 (58 kDa) and E2
(42-46 kDa). In virus infected cells, RV matures by
budding either at the plasma membrane, or at the
internal membranes depending on the cell type and enters
adjacent uninfected cells by a membrane fusion process
in the endosome, directed by E1-E2 heterodimers. The
heterodimer formation is crucial for E1 transport out of
the endoplasmic reticulum to the Golgi and plasma
membrane. In RV E1, a cysteine at position 82 is crucial
for the E1-E2 heterodimer formation and cell surface
expression of the two proteins.
Length = 267
Score = 27.4 bits (60), Expect = 9.2
Identities = 10/27 (37%), Positives = 18/27 (66%)
Query: 106 LQDDFGCFDINECNAGTDLCHKNAMCF 132
LQ +GC+++++ + GT +CH M F
Sbjct: 63 LQGGWGCYNLSDWHQGTHVCHTKHMDF 89
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.321 0.138 0.458
Gapped
Lambda K H
0.267 0.0665 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 13,923,192
Number of extensions: 1229664
Number of successful extensions: 772
Number of sequences better than 10.0: 1
Number of HSP's gapped: 762
Number of HSP's successfully gapped: 55
Length of query: 286
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 190
Effective length of database: 6,679,618
Effective search space: 1269127420
Effective search space used: 1269127420
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 58 (26.3 bits)