RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11798
(245 letters)
>gnl|CDD|221695 pfam12662, cEGF, Complement C1r-like EGF-like. cEGF, or
complement C1r-like EGF, domains have six conserved
cysteine residues disulfide-bonded into the
characteristic pattern 'ababcc'. They are found in
blood coagulation proteins such as fibrillin, C1r and
C1s, thrombomodulin, and the LDL receptor. The core
fold of the EGF domain consists of two small
beta-hairpins packed against each other. Two major
structural variants have been identified based on the
structural context of the C-terminal cysteine residue
of disulfide 'c' in the C-terminal hairpin: hEGFs and
cEGFs. In cEGFs the C-terminal thiol resides on the
C-terminal beta-sheet, resulting in long loop-lengths
between the cysteine residues of disulfide 'c',
typically C[10+]XC. These longer loop-lengths may have
arisen by selective cysteine loss from a four-disulfide
EGF template such as laminin or integrin. Tandem cEGF
domains have five linking residues between terminal
cysteines of adjacent domains. cEGF domains may or may
not bind calcium in the linker region. cEGF domains
with the consensus motif CXN4X[F,Y]XCXC are
hydroxylated exclusively on the asparagine residue.
Length = 24
Score = 45.1 bits (108), Expect = 3e-07
Identities = 14/24 (58%), Positives = 17/24 (70%)
Query: 40 SFRCICPYGYALAPDGRHCIDINE 63
S+ C CP GY L+ DGR C DI+E
Sbjct: 1 SYTCSCPPGYQLSGDGRTCEDIDE 24
Score = 34.4 bits (80), Expect = 0.002
Identities = 10/20 (50%), Positives = 13/20 (65%)
Query: 84 TCECPEGFMLSPNGMKCIDV 103
TC CP G+ LS +G C D+
Sbjct: 3 TCSCPPGYQLSGDGRTCEDI 22
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 42.6 bits (101), Expect = 3e-06
Identities = 19/42 (45%), Positives = 23/42 (54%), Gaps = 4/42 (9%)
Query: 19 DINECLELSN-QCAFRCHNVPGSFRCICPYGYALAPDGRHCI 59
DI+EC + Q C N GS+RC CP GY DGR+C
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39
Score = 36.5 bits (85), Expect = 5e-04
Identities = 16/43 (37%), Positives = 22/43 (51%), Gaps = 5/43 (11%)
Query: 60 DINECKENEGICEDG-KCINIAGGVTCECPEGFMLSPNGMKCI 101
DI+EC C++G C+N G CECP G+ +G C
Sbjct: 1 DIDECASG-NPCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 40.4 bits (95), Expect = 2e-05
Identities = 17/42 (40%), Positives = 24/42 (57%), Gaps = 2/42 (4%)
Query: 19 DINECLELSNQC--AFRCHNVPGSFRCICPYGYALAPDGRHC 58
D++EC + ++ C C N GSF C+CP GY DG +C
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
Score = 38.1 bits (89), Expect = 1e-04
Identities = 13/42 (30%), Positives = 22/42 (52%), Gaps = 1/42 (2%)
Query: 60 DINECKENEGIC-EDGKCINIAGGVTCECPEGFMLSPNGMKC 100
D++EC + C + C+N G C CP+G+ + +G C
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular
(mostly animal) proteins. Many of these proteins
require calcium for their biological function and
calcium-binding sites have been found to be located at
the N-terminus of particular EGF-like domains;
calcium-binding may be crucial for numerous
protein-protein interactions. Six conserved core
cysteines form three disulfide bridges as in
non-calcium-binding EGF domains, whose structures are very
similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 38.0 bits (89), Expect = 1e-04
Identities = 18/42 (42%), Positives = 22/42 (52%), Gaps = 5/42 (11%)
Query: 19 DINECLELSN-QCAFRCHNVPGSFRCICPYGYALAPDGRHCI 59
DI+EC + Q C N GS+RC CP GY GR+C
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT----GRNCE 38
Score = 38.0 bits (89), Expect = 1e-04
Identities = 12/34 (35%), Positives = 15/34 (44%)
Query: 60 DINECKENEGICEDGKCINIAGGVTCECPEGFML 93
DI+EC G C+N G C CP G+
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34
>gnl|CDD|201391 pfam00683, TB, TB domain. This domain is also known as the 8
cysteine domain. This family includes the hybrid
domains. This cysteine rich repeat is found in TGF
binding protein and fibrillin.
Length = 42
Score = 37.3 bits (87), Expect = 3e-04
Identities = 17/42 (40%), Positives = 22/42 (52%)
Query: 115 GTCTLLRKQPITVKECCCSMGQAWGRYCLPCPSPNSGEPATF 156
G C+ +T ECCCS+G+AWG C PCP + E
Sbjct: 1 GRCSNPLPGNVTKSECCCSLGRAWGTPCEPCPVQGTAEFRQL 42
>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
extracellular matrix molecules mediate cell-matrix and
matrix-matrix interactions thereby providing tissue
integrity. Some members of the matrilin family are
expressed specifically in developing cartilage
rudiments. The matrilin family consists of at least four
members. All the members of the matrilin family contain
VWA domains, EGF-like domains and a heptad repeat
coiled-coil domain at the carboxy terminus which is
responsible for the oligomerization of the matrilins.
The VWA domains have been shown to be essential for
matrilin network formation by interacting with matrix
ligands.
Length = 224
Score = 40.1 bits (94), Expect = 5e-04
Identities = 17/53 (32%), Positives = 26/53 (49%)
Query: 4 TMSVTGYRLRVETCEDINECLELSNQCAFRCHNVPGSFRCICPYGYALAPDGR 56
T+ + + + C + C LS+ C C + PGS+ C C GYAL D +
Sbjct: 171 TIEELTKKFQGKICVVPDLCATLSHVCQQVCISTPGSYLCACTEGYALLEDNK 223
Score = 31.6 bits (72), Expect = 0.30
Identities = 12/43 (27%), Positives = 20/43 (46%), Gaps = 1/43 (2%)
Query: 55 GRHCIDINECKENEGICEDGKCINIAGGVTCECPEGFMLSPNG 97
G+ C+ + C +C+ CI+ G C C EG+ L +
Sbjct: 181 GKICVVPDLCATLSHVCQQV-CISTPGSYLCACTEGYALLEDN 222
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 34.4 bits (79), Expect = 0.003
Identities = 14/34 (41%), Positives = 16/34 (47%), Gaps = 1/34 (2%)
Query: 63 ECKENEGICEDGKCINIAGGVTCECPEGFMLSPN 96
EC G C +G CIN G TC CP G+
Sbjct: 1 ECASG-GPCSNGTCINTPGSYTCSCPPGYTGDKR 33
Score = 29.8 bits (67), Expect = 0.12
Identities = 10/22 (45%), Positives = 11/22 (50%)
Query: 33 RCHNVPGSFRCICPYGYALAPD 54
C N PGS+ C CP GY
Sbjct: 12 TCINTPGSYTCSCPPGYTGDKR 33
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 31.7 bits (73), Expect = 0.020
Identities = 14/38 (36%), Positives = 18/38 (47%), Gaps = 3/38 (7%)
Query: 64 CKENEGIC-EDGKCINIAGGVTCECPEGFMLSPNGMKC 100
C EN G C + C N G TC C G+ +G+ C
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYTG--DGVTC 36
Score = 31.0 bits (71), Expect = 0.037
Identities = 12/25 (48%), Positives = 12/25 (48%), Gaps = 2/25 (8%)
Query: 34 CHNVPGSFRCICPYGYALAPDGRHC 58
C N GSF C C GY DG C
Sbjct: 14 CTNTGGSFTCTCKSGYTG--DGVTC 36
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional
significance of EGF-like domains in what appear to be
unrelated proteins is not yet clear; a common feature
is that these repeats are found in the extracellular
domain of membrane-bound proteins or in proteins known
to be secreted (exception: prostaglandin G/H synthase);
the domain includes six cysteine residues which have
been shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 30.1 bits (68), Expect = 0.088
Identities = 16/42 (38%), Positives = 18/42 (42%), Gaps = 8/42 (19%)
Query: 17 CEDINECLELSNQCAFRCHNVPGSFRCICPYGYALAPDGRHC 58
C N C C N PGS+RC+CP GY R C
Sbjct: 2 CAASNPCSNGG-----TCVNTPGSYRCVCPPGYTGD---RSC 35
Score = 29.4 bits (66), Expect = 0.18
Identities = 10/34 (29%), Positives = 14/34 (41%)
Query: 63 ECKENEGICEDGKCINIAGGVTCECPEGFMLSPN 96
EC + G C+N G C CP G+ +
Sbjct: 1 ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar,
but has 8 instead of 6 conserved cysteines. Includes
some cytokine receptors. The EGF domain misses the
N-terminus regions of the Ca2+ binding EGF domains
(this is the main reason of discrepancy between
swiss-prot domain start/end and Pfam). The family is
hard to model due to many similar but different
sub-types of EGF domains. Pfam certainly misses a
number of EGF domains.
Length = 32
Score = 28.9 bits (65), Expect = 0.22
Identities = 13/28 (46%), Positives = 16/28 (57%)
Query: 64 CKENEGICEDGKCINIAGGVTCECPEGF 91
C N G C++ GG TCECPEG+
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTCECPEGY 28
>gnl|CDD|219625 pfam07895, DUF1673, Protein of unknown function (DUF1673). This
family contains hypothetical proteins of unknown
function expressed by two archaeal species.
Length = 204
Score = 31.2 bits (71), Expect = 0.33
Identities = 12/75 (16%), Positives = 20/75 (26%), Gaps = 6/75 (8%)
Query: 163 GFFFVLFFLVIFWSILPIYKTIKDITILSVCAYNMKQTTLHYTNFFFLNFLFDALLEHLM 222
G F L + W I + V Y+ K+ + L + L +
Sbjct: 87 GLFLSLLLYLFTWKKQMIRY--DALAKKPVIRYSNKKKIVRSLLVIILLLILLLLFLY-- 142
Query: 223 KEIARLFVLASLFMQ 237
+ L Q
Sbjct: 143 --YILGHFESLLSAQ 155
>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
Pvs28. This family consists of several ookinete surface
proteins (Pvs28) from several species of Plasmodium.
Pvs25 and Pvs28 are expressed on the surface of
ookinetes. These proteins are potential vaccine
candidates and induce antibodies that block the infectivity
of Plasmodium vivax in immunised animals.
Length = 196
Score = 29.3 bits (66), Expect = 1.5
Identities = 31/110 (28%), Positives = 40/110 (36%), Gaps = 26/110 (23%)
Query: 8 TGYRLRVE-TCEDINECLELSN---------QCA-FRCHNVPGSFRCICPYGYALAPDGR 56
GY L+ E TCE+ +C +L N C + +C C GY L
Sbjct: 26 EGYVLKNENTCEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGCINGYTL----- 80
Query: 57 HCIDINECKENEG---ICEDGKCI---NIAGGVTCECPEGFMLSPNGMKC 100
C N+ +C GKCI TC C G + NG KC
Sbjct: 81 ---SQGVCVPNKCNNKVCGSGKCIVDPANPNNTTCSCNIGKVPDQNG-KC 126
Score = 28.2 bits (63), Expect = 3.5
Identities = 19/67 (28%), Positives = 27/67 (40%), Gaps = 10/67 (14%)
Query: 39 GSFRCICPYGYALAPDGRHCIDINECKENEGI---CED-GKCINIAGG-----VTCECPE 89
F C C GY L + C + +C + E + C + CIN A + C C
Sbjct: 18 NHFECKCNEGYVLKNENT-CEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGCIN 76
Query: 90 GFMLSPN 96
G+ LS
Sbjct: 77 GYTLSQG 83
>gnl|CDD|239585 cd03508, Delta4-sphingolipid-FADS-like, The Delta4-sphingolipid
Fatty Acid Desaturase (Delta4-sphingolipid-FADS)-like CD
includes the integral-membrane enzymes, dihydroceramide
Delta-4 desaturase, involved in the synthesis of
sphingosine; and the human membrane fatty acid (lipid)
desaturase (MLD), reported to modulate biosynthesis of
the epidermal growth factor receptor; and other related
proteins. These proteins are found in various eukaryotes
including vertebrates, higher plants, and fungi. Studies
show that MLD is localized to the endoplasmic reticulum.
As with other members of this superfamily, this domain
family has extensive hydrophobic regions that would be
capable of spanning the membrane bilayer at least twice.
Comparison of sequences also reveals the existence of
three regions of conserved histidine cluster motifs that
contain eight histidine residues: HXXXH, HXXHH, and
HXXHH. These histidine residues are reported to be
catalytically essential and proposed to be the ligands
for the iron atoms contained within the homolog,
stearoyl CoA desaturase.
Length = 289
Score = 29.1 bits (66), Expect = 1.9
Identities = 13/83 (15%), Positives = 26/83 (31%), Gaps = 14/83 (16%)
Query: 154 ATFWSHYPKGFFFVLFFLVIFWSILPIYKTIKDITILSVCAYNMKQTTLHYTNFFFLNFL 213
+S +V F+++ P++ K T L N
Sbjct: 123 GKLFSTVLGKAIWV-TLQPFFYALRPLFVRPKPPTR------------LEVINIVV-QIT 168
Query: 214 FDALLEHLMKEIARLFVLASLFM 236
FD L+ + + ++L F+
Sbjct: 169 FDYLIYYFFGWKSLAYLLLGSFL 191
>gnl|CDD|226137 COG3610, COG3610, Uncharacterized conserved protein [Function
unknown].
Length = 156
Score = 28.4 bits (64), Expect = 2.0
Identities = 15/97 (15%), Positives = 27/97 (27%), Gaps = 18/97 (18%)
Query: 153 PATFWSHYPKGFFFVLFFLVIF---WSILPIYKTIKDITILSVCAYNMKQTTLHYTNF-- 207
F + F ++F LPI L + + + F
Sbjct: 3 LLMLLLDMLFAFIATVGFAIVFNVPPRALPI------CGFLGALGWVVYYLLGKHFGFSI 56
Query: 208 ----FFLNFLFDAL---LEHLMKEIARLFVLASLFMQ 237
F F+ L L K A++F + ++
Sbjct: 57 VVATFIAAFVVGCLGNLLSRRYKTPAKVFTVPAIIPL 93
>gnl|CDD|119297 pfam10777, YlaC, sigma70 family sigma factor YlaC. Members of the
sigma70 family of sigma factors are components of the
RNA polymerase holoenzyme that direct bacterial or
plastid core RNA polymerase to specific promoter
elements. This domain is an inner membrane protein of
unknown function.
Length = 156
Score = 28.1 bits (63), Expect = 2.8
Identities = 13/62 (20%), Positives = 28/62 (45%), Gaps = 12/62 (19%)
Query: 165 FFVLFFLVIFWSILPIYKTIKDITILS--VCAYNMKQTTLHYTNFFFLNFLFDALLEHLM 222
F++ +F+ I P+Y+ +DI +L VC YN ++ + L++ ++
Sbjct: 70 LFIVMNAFLFFDIKPVYR-FEDIDVLDLRVC-YN--------GEWYNTRAVSQQLIDEIL 119
Query: 223 KE 224
Sbjct: 120 NS 121
>gnl|CDD|233335 TIGR01271, CFTR_protein, cystic fibrosis transmembrane conductance
regulator (CFTR). The model describes the cystic
fibrosis transmembrane conductance regulator (CFTR) in
eukaryotes. The principal role of this protein is
chloride ion conductance. The protein is predicted to
consist of 12 transmembrane domains. Mutations or
lesions in the genetic loci have been linked to the
aetiology of asthma, bronchiectasis, chronic obstructive
pulmonary disease etc. Disease-causing mutations have
been studied by 36Cl efflux assays in vitro cell
cultures and electrophysiology, all of which point to
the impairment of chloride channel stability and not the
biosynthetic processing per se [Transport and binding
proteins, Anions].
Length = 1490
Score = 28.3 bits (63), Expect = 4.1
Identities = 14/40 (35%), Positives = 21/40 (52%), Gaps = 3/40 (7%)
Query: 160 YPKGFFFVLFFLVIFWSILP--IYKTIKDITILSVCAYNM 197
Y FFF FF V+F S++P + K I I + +Y +
Sbjct: 307 YSSAFFFSGFF-VVFLSVVPYALIKGIILRRIFTTISYCI 345
>gnl|CDD|200519 cd11258, Sema_4C, The Sema domain, a protein interacting module, of
semaphorin 4C (Sema4C). Sema4C acts as a Plexin B2
ligand to regulate the development of cerebellar granule
cells and to modulate ureteric branching in the
developing kidney. The binding of Sema4C to Plexin B2
results in the phosphorylation of downstream regulator
ErbB-2 and the plexin protein itself. The cytoplasmic
region of Sema4C binds a neurite-outgrowth-related
protein SFAP75, suggesting that Sema4C may also play a
role in neural function. Sema4C belongs to the class 4
transmembrane semaphorin family of proteins. Semaphorins
are regulatory molecules in the development of the
nervous system and in axonal guidance. They also play
important roles in other biological processes, such as
angiogenesis, immune regulation, respiration systems and
cancer. The Sema domain is located at the N-terminus and
contains four disulfide bonds formed by eight conserved
cysteine residues. It serves as a receptor-recognition
and -binding module.
Length = 458
Score = 27.8 bits (62), Expect = 6.2
Identities = 11/23 (47%), Positives = 11/23 (47%)
Query: 136 QAWGRYCLPCPSPNSGEPATFWS 158
Q WGRY P PSP G W
Sbjct: 312 QKWGRYTDPVPSPRPGSCINNWH 334
>gnl|CDD|206444 pfam14276, DUF4363, Domain of unknown function (DUF4363). This
family of proteins is found in bacteria. Proteins in
this family are approximately 120 amino acids in length.
Length = 121
Score = 26.8 bits (60), Expect = 6.4
Identities = 4/32 (12%), Positives = 10/32 (31%)
Query: 164 FFFVLFFLVIFWSILPIYKTIKDITILSVCAY 195
F+L L+ +S + + +
Sbjct: 5 LIFILIILLGVFSNNYLNTSCDKLEEKLEKIE 36
>gnl|CDD|149504 pfam08475, Baculo_VP91_N, Viral capsid protein 91 N-terminal. This
domain is found in Baculoviridae including the
nucleopolyhedrovirus at the N-terminus of the viral
capsid protein 91 (VP91).
Length = 185
Score = 26.8 bits (60), Expect = 7.7
Identities = 9/30 (30%), Positives = 11/30 (36%), Gaps = 2/30 (6%)
Query: 81 GGVTCECPEGFMLSPNGMKCIDVRQDVCYD 110
G V ECP N ++C V C
Sbjct: 122 GWVEMECPANERFDGNQLQC--VPIPPCDG 149
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.328 0.141 0.478
Gapped
Lambda K H
0.267 0.0678 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 12,148,719
Number of extensions: 1113405
Number of successful extensions: 1864
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1853
Number of HSP's successfully gapped: 69
Length of query: 245
Length of database: 10,937,602
Length adjustment: 94
Effective length of query: 151
Effective length of database: 6,768,326
Effective search space: 1022017226
Effective search space used: 1022017226
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (21.7 bits)
S2: 58 (26.2 bits)
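
The length figures in the statistics footer above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the common BLAST convention that the length adjustment is subtracted once from the query and once from every database sequence. The variable names are illustrative and are not part of the RPS-BLAST output.

```python
# Values copied from the statistics footer of this report.
query_length = 245
db_letters = 10_937_602
db_sequences = 44_354
length_adjustment = 94

# Assumption: the adjustment is removed once from the query and once
# from each database sequence before the search space is computed.
effective_query_length = query_length - length_adjustment
effective_db_length = db_letters - db_sequences * length_adjustment
effective_search_space = effective_query_length * effective_db_length

print(effective_query_length)
print(effective_db_length)
print(effective_search_space)
```

Under that assumption the three results agree with the reported effective query length (151), effective database length (6,768,326) and effective search space (1022017226).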