RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy2974
(186 letters)
>gnl|CDD|215916 pfam00431, CUB, CUB domain.
Length = 110
Score = 70.4 bits (173), Expect = 3e-16
Identities = 28/110 (25%), Positives = 44/110 (40%), Gaps = 38/110 (34%)
Query: 81 CKYEITAPNGVIKTPNHPDYYPSKRECIWHFTTTPGHRIKLN------------------ 122
C +T +G I +PN+P+ YP ++C+W PG+RI L
Sbjct: 1 CGGVLTESSGSITSPNYPNSYPPNKDCVWTIRAPPGYRISLTFQDFDLEDHDECGYDYVE 60
Query: 123 --------------------PTDIISISEGLLVRFRSDDTVVGKGFSASY 152
P DI S S + ++F SD ++ +GF A+Y
Sbjct: 61 IRDGLPSSSPLLGRFCGSGPPEDIRSTSNQMTIKFVSDSSISKRGFKATY 110
>gnl|CDD|238001 cd00041, CUB, CUB domain; extracellular domain; present in proteins
mostly known to be involved in development; not found in
prokaryotes, plants and yeast.
Length = 113
Score = 65.9 bits (161), Expect = 2e-14
Identities = 35/113 (30%), Positives = 47/113 (41%), Gaps = 39/113 (34%)
Query: 81 CKYEITAP-NGVIKTPNHPDYYPSKRECIWHFTTTPGHRIKLN----------------- 122
C +TA +G I +PN+P+ YP+ C+W PG+RI+L
Sbjct: 1 CGGTLTASTSGTISSPNYPNNYPNNLNCVWTIEAPPGYRIRLTFEDFDLESSPNCSYDYL 60
Query: 123 ---------------------PTDIISISEGLLVRFRSDDTVVGKGFSASYIA 154
P IIS L VRFRSD +V G+GF A+Y A
Sbjct: 61 EIYDGPSTSSPLLGRFCGSTLPPPIISSGNSLTVRFRSDSSVTGRGFKATYSA 113
>gnl|CDD|214483 smart00042, CUB, Domain first found in C1r, C1s, uEGF, and bone
morphogenetic protein. This domain is found mostly
among developmentally-regulated proteins. Spermadhesins
contain only this domain.
Length = 102
Score = 58.2 bits (141), Expect = 1e-11
Identities = 28/102 (27%), Positives = 38/102 (37%), Gaps = 39/102 (38%)
Query: 90 GVIKTPNHPDYYPSKRECIWHFTTTPGHRIKL---------------------------- 121
G I +PN+P YP+ +C+W PG+RI+L
Sbjct: 1 GTITSPNYPQSYPNNLDCVWTIRAPPGYRIELQFTDFDLESSDNCEYDYVEIYDGPSASS 60
Query: 122 -----------NPTDIISISEGLLVRFRSDDTVVGKGFSASY 152
P I S S L + F SD +V +GFSA Y
Sbjct: 61 PLLGRFCGSEAPPPVISSSSNSLTLTFVSDSSVQKRGFSARY 102
>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
extracellular matrix molecules mediate cell-matrix and
matrix-matrix interactions thereby providing tissue
integrity. Some members of the matrilin family are
expressed specifically in developing cartilage
rudiments. The matrilin family consists of at least four
members. All the members of the matrilin family contain
VWA domains, EGF-like domains and a heptad repeat
coiled-coil domain at the carboxy terminus which is
responsible for the oligomerization of the matrilins.
The VWA domains have been shown to be essential for
matrilin network formation by interacting with matrix
ligands.
Length = 224
Score = 48.1 bits (115), Expect = 4e-07
Identities = 18/49 (36%), Positives = 26/49 (53%), Gaps = 3/49 (6%)
Query: 28 KIGDSFKS---QDKDECMTNNGGCQHECRNTIGSYICSCHNGYTLLENG 73
++ F+ D C T + CQ C +T GSY+C+C GY LLE+
Sbjct: 174 ELTKKFQGKICVVPDLCATLSHVCQQVCISTPGSYLCACTEGYALLEDN 222
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 41.8 bits (99), Expect = 4e-06
Identities = 18/38 (47%), Positives = 23/38 (60%), Gaps = 4/38 (10%)
Query: 41 CMTNNGGC-QH-ECRNTIGSYICSCHNGYTLLENGHDC 76
C NNGGC + C NT GS+ C+C +GYT +G C
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTCTCKSGYTG--DGVTC 36
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 41.5 bits (98), Expect = 5e-06
Identities = 20/43 (46%), Positives = 26/43 (60%), Gaps = 6/43 (13%)
Query: 37 DKDECMTNNGGCQH--ECRNTIGSYICSCHNGYTLLENGHDCK 77
D DEC + N CQ+ C NT+GSY C C GYT +G +C+
Sbjct: 1 DIDECASGNP-CQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular
(mostly animal) proteins. Many of these proteins
require calcium for their biological function and
calcium-binding sites have been found to be located at
the N-terminus of particular EGF-like domains;
calcium-binding may be crucial for numerous
protein-protein interactions. Six conserved core
cysteines form three disulfide bridges as in non
calcium-binding EGF domains, whose structures are very
similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 39.5 bits (93), Expect = 2e-05
Identities = 17/34 (50%), Positives = 19/34 (55%), Gaps = 1/34 (2%)
Query: 37 DKDECMTNNG-GCQHECRNTIGSYICSCHNGYTL 69
D DEC + N C NT+GSY CSC GYT
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 36.3 bits (84), Expect = 3e-04
Identities = 16/34 (47%), Positives = 19/34 (55%), Gaps = 2/34 (5%)
Query: 40 ECMTNNGGCQH-ECRNTIGSYICSCHNGYTLLEN 72
EC + G C + C NT GSY CSC GYT +
Sbjct: 1 EC-ASGGPCSNGTCINTPGSYTCSCPPGYTGDKR 33
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional
significance of EGF-like domains in what appear to be
unrelated proteins is not yet clear; a common feature
is that these repeats are found in the extracellular
domain of membrane-bound proteins or in proteins known
to be secreted (exception: prostaglandin G/H synthase);
the domain includes six cysteine residues which have
been shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 34.8 bits (80), Expect = 0.001
Identities = 14/35 (40%), Positives = 17/35 (48%), Gaps = 3/35 (8%)
Query: 40 ECMTNNGGCQH--ECRNTIGSYICSCHNGYTLLEN 72
EC + C + C NT GSY C C GYT +
Sbjct: 1 EC-AASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34
>gnl|CDD|221695 pfam12662, cEGF, Complement C1r-like EGF-like. cEGF, or
complement C1r-like EGF, domains have six conserved
cysteine residues disulfide-bonded into the
characteristic pattern 'ababcc'. They are found in
blood coagulation proteins such as fibrillin, C1r and
C1s, thrombomodulin, and the LDL receptor. The core
fold of the EGF domain consists of two small
beta-hairpins packed against each other. Two major
structural variants have been identified based on the
structural context of the C-terminal cysteine residue
of disulfide 'c' in the C-terminal hairpin: hEGFs and
cEGFs. In cEGFs the C-terminal thiol resides on the
C-terminal beta-sheet, resulting in long loop-lengths
between the cysteine residues of disulfide 'c',
typically C[10+]XC. These longer loop-lengths may have
arisen by selective cysteine loss from a four-disulfide
EGF template such as laminin or integrin. Tandem cEGF
domains have five linking residues between terminal
cysteines of adjacent domains. cEGF domains may or may
not bind calcium in the linker region. cEGF domains
with the consensus motif CXN4X[F,Y]XCXC are
hydroxylated exclusively on the asparagine residue.
Length = 24
Score = 32.0 bits (74), Expect = 0.010
Identities = 10/20 (50%), Positives = 12/20 (60%)
Query: 58 SYICSCHNGYTLLENGHDCK 77
SY CSC GY L +G C+
Sbjct: 1 SYTCSCPPGYQLSGDGRTCE 20
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 31.6 bits (72), Expect = 0.022
Identities = 18/42 (42%), Positives = 22/42 (52%), Gaps = 2/42 (4%)
Query: 37 DKDECMTNNGGCQH--ECRNTIGSYICSCHNGYTLLENGHDC 76
D DEC C C NTIGS+ C C +GY E+G +C
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42
>gnl|CDD|165346 PHA03054, PHA03054, IMV membrane protein; Provisional.
Length = 72
Score = 30.7 bits (69), Expect = 0.078
Identities = 15/31 (48%), Positives = 17/31 (54%)
Query: 18 SPHDTLTVFSKIGDSFKSQDKDECMTNNGGC 48
SP D LT F +I S S +K TNN GC
Sbjct: 15 SPEDDLTDFIEIVKSVLSDEKTVTSTNNTGC 45
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar,
but has 8 instead of 6 conserved cysteines. Includes
some cytokine receptors. The EGF domain misses the
N-terminus regions of the Ca2+ binding EGF domains
(this is the main reason of discrepancy between
swiss-prot domain start/end and Pfam). The family is
hard to model due to many similar but different
sub-types of EGF domains. Pfam certainly misses a
number of EGF domains.
Length = 32
Score = 29.3 bits (66), Expect = 0.10
Identities = 11/28 (39%), Positives = 14/28 (50%), Gaps = 2/28 (7%)
Query: 43 TNNGGCQH--ECRNTIGSYICSCHNGYT 68
+ N C + C +T G Y C C GYT
Sbjct: 2 SPNNPCSNGGTCVDTPGGYTCECPEGYT 29
>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen. This
family consists of several Theileria P67 surface
antigens. A stage specific surface antigen of Theileria
parva, p67, is the basis for the development of an
anti-sporozoite vaccine for the control of East Coast
fever (ECF) in cattle. The antigen has been shown to
contain five distinct linear peptide sequences
recognised by sporozoite-neutralising murine monoclonal
antibodies.
Length = 727
Score = 30.8 bits (69), Expect = 0.44
Identities = 17/29 (58%), Positives = 21/29 (72%), Gaps = 1/29 (3%)
Query: 156 DTQGSKEFSEIDDDD-EDEDNTDLNSRRG 183
DT+GSK SE DDDD E+EDN +S+ G
Sbjct: 108 DTKGSKTDSEEDDDDSEEEDNKSTSSKDG 136
>gnl|CDD|191582 pfam06679, DUF1180, Protein of unknown function (DUF1180). This
family consists of several hypothetical mammalian
proteins of around 190 residues in length. The function
of this family is unknown.
Length = 163
Score = 29.1 bits (65), Expect = 0.97
Identities = 13/34 (38%), Positives = 20/34 (58%), Gaps = 3/34 (8%)
Query: 152 YIAIDTQ-GSKEFSEIDDDDEDEDNT--DLNSRR 182
Y +DT + E + ++ DDED+D+T D N R
Sbjct: 129 YGVLDTNAENMELTPLEQDDEDDDSTLFDANYPR 162
>gnl|CDD|131633 TIGR02584, cas_NE0113, CRISPR-associated protein, NE0113 family.
Members of this minor CRISPR-associated (Cas) protein
family are found in cas gene clusters in Vibrio
vulnificus YJ016, Nitrosomonas europaea ATCC 19718,
Mannheimia succiniciproducens MBEL55E, and
Verrucomicrobium spinosum [Mobile and extrachromosomal
element functions, Other].
Length = 209
Score = 28.6 bits (64), Expect = 1.6
Identities = 11/45 (24%), Positives = 20/45 (44%), Gaps = 2/45 (4%)
Query: 96 NHPD-YYPSKRECIWHFTTTPGHRIKLNPTD-IISISEGLLVRFR 138
N YYP ++ I T G + ++ ++ ++E VR R
Sbjct: 161 NIRGFYYPPRKGPILEIRTRDGPPAPADTSEAVVELAELPFVRLR 205
>gnl|CDD|187817 cd09686, Csx1_III-U, CRISPR/Cas system-associated protein Csx1.
CRISPR (Clustered Regularly Interspaced Short
Palindromic Repeats) and associated Cas proteins
comprise a system for heritable host defense by
prokaryotic cells against phage and other foreign DNA;
Protein of this family often fused to HTH domain; Some
proteins could have an additional fusion with
RecB-family nuclease domain; Core domain appears to have
a Rossmann-like fold; loosely associated with CRISPR/Cas
systems; also known as NE0113 family.
Length = 209
Score = 28.6 bits (64), Expect = 1.6
Identities = 11/45 (24%), Positives = 20/45 (44%), Gaps = 2/45 (4%)
Query: 96 NHPD-YYPSKRECIWHFTTTPGHRIKLNPTD-IISISEGLLVRFR 138
N YYP ++ I T G + ++ ++ ++E VR R
Sbjct: 161 NIRGFYYPPRKGPILEIRTRDGPPAPADTSEAVVELAELPFVRLR 205
>gnl|CDD|219564 pfam07771, TSGP1, Tick salivary peptide group 1. This contains a
group of peptides derived from a salivary gland cDNA
library of the tick Ixodes scapularis. Also present are
peptides from a related tick species, Ixodes ricinus.
They are characterized by a putative signal peptide
indicative of secretion and conserved cysteine residues.
Length = 120
Score = 27.5 bits (61), Expect = 2.0
Identities = 17/69 (24%), Positives = 25/69 (36%), Gaps = 16/69 (23%)
Query: 43 TNNGGCQHECRNTIGSYICSCHNGYTL--------LENGHDCKEGGCKYEITAPNGVIKT 94
TN GC + C N + S + N C+ G C + T+
Sbjct: 34 TNREGCDYYCWNQDTN---SWDEFFFGDGETCFYNTGNDGVCQNGEC-HLTTSSGE---- 85
Query: 95 PNHPDYYPS 103
P+HPD +P
Sbjct: 86 PSHPDDHPP 94
>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
Transcription initiation factor IIA (TFIIA) is a
heterotrimer, the three subunits being known as alpha,
beta, and gamma, in order of molecular weight. The N and
C-terminal domains of the gamma subunit are represented
in pfam02268 and pfam02751, respectively. This family
represents the precursor that yields both the alpha and
beta subunits. The TFIIA heterotrimer is an essential
general transcription initiation factor for the
expression of genes transcribed by RNA polymerase II.
Together with TFIID, TFIIA binds to the promoter region;
this is the first step in the formation of a
pre-initiation complex (PIC). Binding of the rest of the
transcription machinery follows this step. After
initiation, the PIC does not completely dissociate from
the promoter. Some components, including TFIIA, remain
attached and re-initiate a subsequent round of
transcription.
Length = 332
Score = 27.8 bits (62), Expect = 4.0
Identities = 9/28 (32%), Positives = 12/28 (42%)
Query: 153 IAIDTQGSKEFSEIDDDDEDEDNTDLNS 180
I + DDDDED +DL+
Sbjct: 244 IDGIDSDDEGDGSDDDDDEDAIESDLDD 271
>gnl|CDD|221765 pfam12772, GHBP, Growth hormone receptor binding. Growth hormone
receptor binding protein is produced either by
proteolysis of the GHR (growth hormone receptor) at the
cell surface thereby releasing its extracellular domain,
the GHBP (growth hormone-binding protein), or, in
rodents, by alternative processing of the GHR
transcript. The sheddase proteolytic enzyme responsible
for the cleavage is TACE (tumour necrosis
factor-alpha-converting enzyme). Growth hormone (GH)
binding to GH receptor (GHR) is the initial step that
leads to the physiological functions of the hormone. The
biological effects of GHBP are determined by the serum
levels of growth hormone (GH), which can vary. Low
levels of GH can result in a dwarf phenotype and have
been positively correlated with an increased life
expectancy. High levels of GH can lead to gigantism or a
clinical syndrome termed acromegaly and have been
implicated in diabetic eye and kidney damage.
Length = 289
Score = 27.4 bits (61), Expect = 4.2
Identities = 10/21 (47%), Positives = 14/21 (66%)
Query: 162 EFSEIDDDDEDEDNTDLNSRR 182
EF E+D DD DE N +++R
Sbjct: 29 EFIELDIDDPDEKNEGSDTQR 49
>gnl|CDD|219285 pfam07065, D123, D123. This family contains a number of eukaryotic
D123 proteins approximately 330 residues long. It has
been shown that mutated variants of D123 exhibit
temperature-dependent differences in their degradation
rate. D123 proteins are regulators of eIF2, the central
regulator of translational initiation.
Length = 295
Score = 27.3 bits (61), Expect = 5.1
Identities = 14/55 (25%), Positives = 25/55 (45%), Gaps = 1/55 (1%)
Query: 123 PTDIISISEGLLVRFRSDDTVVGKGFSASYIAIDTQGSKEFSEIDDDDEDEDNTD 177
+ II + E L D ++ S+ I ++ E+S+ +DD+DED
Sbjct: 11 KSKIIPLPEEFLEYLLQDGILLPSEESSLPIYQESS-DNEYSDWFEDDDDEDTDV 64
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.315 0.136 0.427
Gapped
Lambda K H
0.267 0.0653 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 9,189,761
Number of extensions: 806356
Number of successful extensions: 696
Number of sequences better than 10.0: 1
Number of HSP's gapped: 689
Number of HSP's successfully gapped: 31
Length of query: 186
Length of database: 10,937,602
Length adjustment: 91
Effective length of query: 95
Effective length of database: 6,901,388
Effective search space: 655631860
Effective search space used: 655631860
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 56 (25.5 bits)
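
The derived values in the statistics footer above can be cross-checked with a little arithmetic. The sketch below is not part of the RPS-BLAST output. It assumes the usual Karlin-Altschul conversions (bits = (lambda * raw - ln K) / ln 2 for score thresholds, bits = lambda * raw / ln 2 for drop-offs), that X1 and S1 use the ungapped Lambda/K pair, and that X2, X3 and S2 use the gapped pair. All variable and function names are illustrative. Per-hit scores and E-values are deliberately not recomputed, since they need not follow from these footer constants alone.

```python
import math

# Values copied from the statistics footer of the report.
query_len = 186
db_letters = 10_937_602
db_seqs = 44_354
length_adjustment = 91

# (Lambda, K) pairs as printed; which pair applies to which
# parameter is an assumption that the printed bit values bear out.
UNGAPPED = (0.315, 0.136)
GAPPED = (0.267, 0.0653)

# Effective lengths: the adjustment is subtracted once from the query
# and once per database sequence.
eff_query = query_len - length_adjustment
eff_db = db_letters - db_seqs * length_adjustment
search_space = eff_query * eff_db


def dropoff_bits(raw, lam):
    """Convert a raw X drop-off to bits (no K term)."""
    return raw * lam / math.log(2)


def score_bits(raw, lam, k):
    """Convert a raw score threshold to bits."""
    return (raw * lam - math.log(k)) / math.log(2)


x1 = round(dropoff_bits(16, UNGAPPED[0]), 1)
x2 = round(dropoff_bits(38, GAPPED[0]), 1)
x3 = round(dropoff_bits(64, GAPPED[0]), 1)
s1 = round(score_bits(42, *UNGAPPED), 1)
s2 = round(score_bits(56, *GAPPED), 1)

print(eff_query, eff_db, search_space)
print(x1, x2, x3, s1, s2)
```

Under these assumptions the computed values agree with the footer: effective query length 95, effective database length 6,901,388, effective search space 655,631,860, and the bit equivalents printed next to X1, X2, X3, S1 and S2.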