RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy826
(329 letters)
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 41.8 bits (99), Expect = 7e-06
Identities = 17/31 (54%), Positives = 20/31 (64%), Gaps = 1/31 (3%)
Query: 196 DINECSDENICSGNQFCVNTEGSYRCMQCDP 226
DI+EC+ N C CVNT GSYRC +C P
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRC-ECPP 30
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular (mostly
animal) proteins. Many of these proteins require calcium
for their biological function and calcium-binding sites
have been found to be located at the N-terminus of
particular EGF-like domains; calcium-binding may be
crucial for numerous protein-protein interactions. Six
conserved core cysteines form three disulfide bridges as
in non calcium-binding EGF domains, whose structures are
very similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 41.5 bits (98), Expect = 1e-05
Identities = 17/31 (54%), Positives = 19/31 (61%), Gaps = 1/31 (3%)
Query: 196 DINECSDENICSGNQFCVNTEGSYRCMQCDP 226
DI+EC+ N C CVNT GSYRC C P
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRC-SCPP 30
>gnl|CDD|221329 pfam11938, DUF3456, TLR4 regulator and MIR-interacting MSAP. This
family of proteins, found from plants to humans, is
PRAT4 (A and B), a Protein Associated with Toll-like
receptor 4. The Toll family of receptors - TLRs - plays
an essential role in innate recognition of microbial
products, the first line of defence against bacterial
infection. PRAT4A influences the subcellular
distribution and the strength of TLR responses and
alters the relative activity of each TLR. PRAT4B
regulates TLR4 trafficking to the cell surface and the
extent of its expression there. TLR4 recognizes
lipopolysaccharide (LPS), one of the most
immuno-stimulatory glycolipids constituting the outer
membrane of the Gram-negative bacteria. This family has
also been described as a SAP-like MIR-interacting
protein family.
Length = 151
Score = 43.9 bits (104), Expect = 2e-05
Identities = 29/140 (20%), Positives = 43/140 (30%), Gaps = 52/140 (37%)
Query: 2 SGIEKTAKGNF-----AGGDTAWEEEKQKIY-AKSEVRLIEIQEKMCS------------ 43
+ KT D + + +K Y A+SE+RL E+ E +C
Sbjct: 16 EALSKTDPKKEVDVGGFRLDPDGKRKGKKKYYARSELRLTELLEGVCDRMLDYNLHKERS 75
Query: 44 ------------------------------EVSGFLDQCHNFAADIESEIEEWWFKVQHS 73
EV+ QC + E EIEEW+
Sbjct: 76 GSRRFAKGMSPTFQTLHGLVLKGVKVDPSAEVAELKFQCERLLEEHEDEIEEWYKN---- 131
Query: 74 KAKDSDLYTWLCINKLKRCC 93
+ + DL +LC K C
Sbjct: 132 EQLEDDLSKFLCSEHSKACL 151
>gnl|CDD|214589 smart00261, FU, Furin-like repeats.
Length = 45
Score = 39.4 bits (92), Expect = 6e-05
Identities = 14/35 (40%), Positives = 17/35 (48%)
Query: 220 RCMQCDPSCNGCHGDGPDMCEACAEGYKLQQNICI 254
C C P C C G GPD C +C G+ L C+
Sbjct: 3 ECKPCHPECATCTGPGPDDCTSCKHGFFLDGGKCV 37
Score = 30.9 bits (70), Expect = 0.083
Identities = 12/27 (44%), Positives = 15/27 (55%), Gaps = 1/27 (3%)
Query: 160 CSKCHASCESGCSTGGPKGCTKCKSGW 186
C CH C + C+ GP CT CK G+
Sbjct: 4 CKPCHPEC-ATCTGPGPDDCTSCKHGF 29
>gnl|CDD|238021 cd00064, FU, Furin-like repeats. Cysteine rich region. Exact
function of the domain is not known. Furin is a
serine-kinase dependent proprotein processor. Other
members of this family include endoproteases and cell
surface receptors.
Length = 49
Score = 37.1 bits (86), Expect = 5e-04
Identities = 14/31 (45%), Positives = 17/31 (54%)
Query: 224 CDPSCNGCHGDGPDMCEACAEGYKLQQNICI 254
C PSC C G GPD C +C G+ L C+
Sbjct: 2 CHPSCATCTGPGPDQCTSCRHGFYLDGGTCV 32
Score = 29.4 bits (66), Expect = 0.29
Identities = 12/31 (38%), Positives = 15/31 (48%), Gaps = 1/31 (3%)
Query: 162 KCHASCESGCSTGGPKGCTKCKSGWAADKDI 192
CH SC C+ GP CT C+ G+ D
Sbjct: 1 PCHPSCA-TCTGPGPDQCTSCRHGFYLDGGT 30
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional significance
of EGF-like domains in what appear to be unrelated
proteins is not yet clear; a common feature is that
these repeats are found in the extracellular domain of
membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase); the
domain includes six cysteine residues which have been
shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 31.3 bits (71), Expect = 0.040
Identities = 16/28 (57%), Positives = 17/28 (60%), Gaps = 1/28 (3%)
Query: 199 ECSDENICSGNQFCVNTEGSYRCMQCDP 226
EC+ N CS CVNT GSYRC C P
Sbjct: 1 ECAASNPCSNGGTCVNTPGSYRC-VCPP 27
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 31.2 bits (71), Expect = 0.052
Identities = 14/32 (43%), Positives = 19/32 (59%), Gaps = 2/32 (6%)
Query: 196 DINECSDE-NICSGNQFCVNTEGSYRCMQCDP 226
D++EC+D + C N CVNT GS+ C C
Sbjct: 1 DVDECADGTHNCPANTVCVNTIGSFEC-VCPD 31
>gnl|CDD|221261 pfam11845, DUF3365, Protein of unknown function (DUF3365). This
family of proteins is functionally uncharacterized.
This protein is found in bacteria. Proteins in this
family are typically between 198 and 657 amino acids in
length.
Length = 179
Score = 31.6 bits (72), Expect = 0.34
Identities = 8/24 (33%), Positives = 11/24 (45%)
Query: 225 DPSCNGCHGDGPDMCEACAEGYKL 248
+ SC CHG + GYK+
Sbjct: 143 EESCLKCHGAPEEQPGDPGFGYKV 166
>gnl|CDD|173479 PTZ00214, PTZ00214, high cysteine membrane protein Group 4;
Provisional.
Length = 800
Score = 32.2 bits (73), Expect = 0.40
Identities = 26/88 (29%), Positives = 36/88 (40%), Gaps = 9/88 (10%)
Query: 171 CSTGGPKGCTKCKSGWAADKDIGCYDINECSDENIC--SGNQFCVNTEGSY------RCM 222
C++ CT C SG A + GCY IC N C T+ Y + +
Sbjct: 519 CTSTANGACTTC-SGAAFLMNGGCYTTEHYPGSTICDKQSNGKCTTTKKGYGISPDGKLL 577
Query: 223 QCDPSCNGCHGDGPDMCEACAEGYKLQQ 250
+CDP+C C GP C C L++
Sbjct: 578 ECDPTCLACTAPGPGRCTRCPSDKLLKR 605
Score = 29.1 bits (65), Expect = 4.0
Identities = 31/119 (26%), Positives = 43/119 (36%), Gaps = 18/119 (15%)
Query: 159 LCSKCHASCESGCST----GGPKGCTKCKSGW------------AADKDIGCYDINECSD 202
LC SGC+T G CT+C +G+ + D C + E S+
Sbjct: 352 LCGDATNGGVSGCATCGYNSGAVTCTRCSAGYLGVDGKSCSESCSGDTRGVCTKVAEGSE 411
Query: 203 ENICSGNQFCVNT--EGSYRCMQCDPSCNGCHGDGPDMCEACAEGYKLQQNICINTQAK 259
S C T S C C SC C P C+ C+ G L+ +I + A
Sbjct: 412 STEVSCRCVCKPTFYNSSGTCTPCTDSCAVCKDGTPTGCQQCSPGKILEFSIVSSESAD 470
>gnl|CDD|238012 cd00055, EGF_Lam, Laminin-type epidermal growth factor-like domain;
laminins are the major noncollagenous components of
basement membranes that mediate cell adhesion, growth
migration, and differentiation; the laminin-type
epidermal growth factor-like module occurs in tandem
arrays; the domain contains 4 disulfide bonds (loops
a-d) the first three resemble epidermal growth factor
(EGF); the number of copies of this domain in the
different forms of laminins is highly variable ranging
from 3 up to 22 copies.
Length = 50
Score = 28.9 bits (65), Expect = 0.48
Identities = 11/26 (42%), Positives = 13/26 (50%)
Query: 125 KGNGQCVCNKEYTGELCNECNTGYFQ 150
G GQC C TG C+ C GY+
Sbjct: 16 PGTGQCECKPNTTGRRCDRCAPGYYG 41
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 27.5 bits (61), Expect = 0.99
Identities = 14/28 (50%), Positives = 16/28 (57%), Gaps = 2/28 (7%)
Query: 199 ECSDENICSGNQFCVNTEGSYRCMQCDP 226
EC+ CS N C+NT GSY C C P
Sbjct: 1 ECASGGPCS-NGTCINTPGSYTC-SCPP 26
>gnl|CDD|235820 PRK06521, PRK06521, hydrogenase 4 subunit B; Validated.
Length = 667
Score = 29.9 bits (68), Expect = 2.0
Identities = 8/18 (44%), Positives = 14/18 (77%)
Query: 282 IIFQKNVFIASIVGVVVA 299
+ F N+F+AS+V V++A
Sbjct: 115 MGFFYNLFLASMVLVLLA 132
>gnl|CDD|215680 pfam00053, Laminin_EGF, Laminin EGF-like (Domains III and V). This
family is like pfam00008 but has 8 conserved cysteines
instead of six.
Length = 49
Score = 26.9 bits (60), Expect = 2.0
Identities = 11/27 (40%), Positives = 14/27 (51%)
Query: 128 GQCVCNKEYTGELCNECNTGYFQSYKD 154
GQC+C TG C+ C GY+ D
Sbjct: 18 GQCLCKPGVTGRHCDRCKPGYYGLPSD 44
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar, but
has 8 instead of 6 conserved cysteines. Includes some
cytokine receptors. The EGF domain misses the N-terminus
regions of the Ca2+ binding EGF domains (this is the
main reason of discrepancy between swiss-prot domain
start/end and Pfam). The family is hard to model due to
many similar but different sub-types of EGF domains.
Pfam certainly misses a number of EGF domains.
Length = 32
Score = 26.6 bits (59), Expect = 2.1
Identities = 11/22 (50%), Positives = 12/22 (54%)
Query: 200 CSDENICSGNQFCVNTEGSYRC 221
CS N CS CV+T G Y C
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTC 22
>gnl|CDD|219677 pfam07974, EGF_2, EGF-like domain. This family contains EGF
domains found in a variety of extracellular proteins.
Length = 31
Score = 26.6 bits (59), Expect = 2.1
Identities = 13/31 (41%), Positives = 17/31 (54%), Gaps = 1/31 (3%)
Query: 112 CFGNGKCKGNGT-RKGNGQCVCNKEYTGELC 141
C +G C G GT + G+CVC+ Y G C
Sbjct: 1 CSASGICNGRGTCVRPCGKCVCDSGYQGATC 31
>gnl|CDD|237660 PRK14289, PRK14289, chaperone protein DnaJ; Provisional.
Length = 386
Score = 29.8 bits (67), Expect = 2.1
Identities = 26/114 (22%), Positives = 42/114 (36%), Gaps = 29/114 (25%)
Query: 143 ECNTGYFQSYKDEKTILCSKCHASCESGCSTGGPKGCTKCKSGWAADKDIGCYDINECSD 202
E +TG + +K +K + CS CH + G G + C CK + +
Sbjct: 140 EISTGVEKKFKVKKYVPCSHCHGTGAEG--NNGSETCPTCKGSGSVTRV----------- 186
Query: 203 ENICSGNQFCVNTEGSYRCMQCDPSCNGCHGDG---PDMCEACA-EGYKLQQNI 252
+N G MQ +C C+G+G C+ C EG + +
Sbjct: 187 QNTILGT------------MQTQSTCPTCNGEGKIIKKKCKKCGGEGIVYGEEV 228
>gnl|CDD|214543 smart00180, EGF_Lam, Laminin-type epidermal growth factor-like
domain.
Length = 46
Score = 26.1 bits (58), Expect = 3.7
Identities = 10/25 (40%), Positives = 12/25 (48%)
Query: 127 NGQCVCNKEYTGELCNECNTGYFQS 151
GQC C TG C+ C GY+
Sbjct: 17 TGQCECKPNVTGRRCDRCAPGYYGD 41
>gnl|CDD|220356 pfam09709, Cas_Csd1, CRISPR-associated protein (Cas_Csd1).
CRISPR loci appear to be mobile elements with a wide
host range. This entry represents proteins that tend to
be found near CRISPR repeats. The species range, so
far, is exclusively bacterial and mesophilic, although
CRISPR loci are particularly common among the archaea
and thermophilic bacteria. Clusters of short DNA
repeats with nonhomologous spacers, which are found at
regular intervals in the genomes of phylogenetically
distinct prokaryotic species, comprise a family with
recognisable features. This family is known as CRISPR
(short for Clustered, Regularly Interspaced Short
Palindromic Repeats). A number of protein families
appear only in association with these repeats and are
designated Cas (CRISPR-Associated) proteins.
Length = 572
Score = 29.2 bits (66), Expect = 4.0
Identities = 11/40 (27%), Positives = 14/40 (35%), Gaps = 3/40 (7%)
Query: 8 AKGNFAGGDTAWEEEKQKIYAKSEVRLIEIQEKMCSEVSG 47
GNF D + K+ I L+ EK SG
Sbjct: 34 EDGNFLRIDARERDGKKTIPRSM---LVPATEKSAGRSSG 70
>gnl|CDD|234402 TIGR03928, T7_EssCb_Firm, type VII secretion protein EssC,
C-terminal domain. This model describes the C-terminal
domain, or longer subunit, of the Firmicutes type VII
secretion protein EssC. This protein (homologous to EccC
in Actinobacteria) and the WXG100 target proteins are
the only homologous parts of type VII secretion between
Firmicutes and Actinobacteria [Protein fate, Protein and
peptide secretion and trafficking].
Length = 1296
Score = 28.8 bits (65), Expect = 4.6
Identities = 13/42 (30%), Positives = 18/42 (42%), Gaps = 1/42 (2%)
Query: 272 VYVGLCVATYIIFQKNVFI-ASIVGVVVAIYVSVAEYILNDK 312
V + + V I + +FI ASI +V I S Y K
Sbjct: 46 VMIAVTVLISIFQPRGIFIIASIAMSLVTIIFSTTTYFREKK 87
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 25.6 bits (57), Expect = 5.2
Identities = 15/37 (40%), Positives = 18/37 (48%), Gaps = 5/37 (13%)
Query: 200 CSDEN-ICSGNQFCVNTEGSYRCMQCDPSCNGCHGDG 235
C++ N C N C NT GS+ C C G GDG
Sbjct: 1 CAENNGGCHPNATCTNTGGSFTC-TCKS---GYTGDG 33
>gnl|CDD|232957 TIGR00398, metG, methionyl-tRNA synthetase. The methionyl-tRNA
synthetase (metG) is a class I amino acyl-tRNA ligase.
This model appears to recognize the methionyl-tRNA
synthetase of every species, including eukaryotic
cytosolic and mitochondrial forms. The UPGMA difference
tree calculated after search and alignment according to
this model shows an unusual deep split between two
families of MetG. One family contains forms from the
Archaea, yeast cytosol, spirochetes, and E. coli, among
others. The other family includes forms from yeast
mitochondrion, Synechocystis sp., Bacillus subtilis, the
Mycoplasmas, Aquifex aeolicus, and Helicobacter pylori.
The E. coli enzyme is homodimeric, although monomeric
forms can be prepared that are fully active. Activity of
this enzyme in bacteria includes aminoacylation of
fMet-tRNA with Met; subsequent formylation of the Met to
fMet is catalyzed by a separate enzyme. Note that the
protein from Aquifex aeolicus is split into an alpha
(large) and beta (small) subunit; this model does not
include the C-terminal region corresponding to the beta
chain [Protein synthesis, tRNA aminoacylation].
Length = 530
Score = 28.5 bits (64), Expect = 5.3
Identities = 10/38 (26%), Positives = 12/38 (31%), Gaps = 3/38 (7%)
Query: 133 NKEYTGELCNECNTGYFQSYKDEKTILCSKCHASCESG 170
KE C EC Y + C KC + G
Sbjct: 115 EKEIKQLYCPECEMFLPDRYVEGT---CPKCGSEDARG 149
>gnl|CDD|221770 pfam12785, VESA1_N, Variant erythrocyte surface antigen-1. This
family represents the N-terminal of the variant
erythrocyte surface antigen 1, versions a and b, of
Babesia. Babesia bovis is a tick-borne,
intra-erythrocytic, protozoal parasite of cattle that
shares many lifestyle parallels with the most virulent
of the human malarial parasites, Plasmodium falciparum.
Babesia uses antigenic variation to establish consistent
infections of long duration. The two variants of VESA1,
a and b, are expressed from different but closely
related genes, and variation is achieved through the
involvement of a segmental gene conversion mechanism and
low-frequency epigenetic in situ switching of
transcriptional activity from the VESA1 gene-pair to a
possible other gene pair.
Length = 428
Score = 28.4 bits (64), Expect = 5.4
Identities = 29/127 (22%), Positives = 38/127 (29%), Gaps = 23/127 (18%)
Query: 117 KCKGNGTRKGNGQCVCNKEYTGELCNECNTGYFQSYKDEKTILCSKC--------HASCE 168
KC G G K G + + C Y + K C C A +
Sbjct: 77 KCWGGGGGKCKGGGGNGNGHGQK--GGC--KYLKDVKPNNP--CDDCGCMKWDVPKADSD 130
Query: 169 SGCSTGGPKGCTKCKSGWAADKDIGCYDINECS-DENICSGNQFCVNTEGSYRCMQCDPS 227
G G +GCT+C D GC +CS CS + C C C
Sbjct: 131 EGHHLG--RGCTRCSDS--GGSDHGC----KCSTGGGSCSAGKECKCALAGKCCKCCCKG 182
Query: 228 CNGCHGD 234
G +
Sbjct: 183 KCGKGKE 189
>gnl|CDD|177356 PHA02256, PHA02256, hypothetical protein.
Length = 113
Score = 26.8 bits (59), Expect = 7.0
Identities = 15/55 (27%), Positives = 26/55 (47%), Gaps = 5/55 (9%)
Query: 272 VYVGLCVATYIIFQKNVFIASIVGVVVAIY--VSVAEYILNDKTAAFDPPSIITK 324
V+ L + T +IF + F + V V+ IY + + YI+ F S++ K
Sbjct: 10 VFTCLSLLTLMIFVHSKFSSKNVFVLYVIYAIIGIGTYIV---LTMFQTTSVLIK 61
>gnl|CDD|237654 PRK14278, PRK14278, chaperone protein DnaJ; Provisional.
Length = 378
Score = 27.7 bits (62), Expect = 9.3
Identities = 13/41 (31%), Positives = 18/41 (43%), Gaps = 2/41 (4%)
Query: 142 NECNTGYFQSYKDEKTILCSKCHASCESGCSTGGPKGCTKC 182
EC TG + + +LC +CH +G S P C C
Sbjct: 124 EECATGVTKQVTVDTAVLCDRCHGKGTAGDSK--PVTCDTC 162
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.136 0.450
Gapped
Lambda K H
0.267 0.0510 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,584,498
Number of extensions: 1376986
Number of successful extensions: 1394
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1364
Number of HSP's successfully gapped: 80
Length of query: 329
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 232
Effective length of database: 6,635,264
Effective search space: 1539381248
Effective search space used: 1539381248
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.0 bits)