RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy620
(1290 letters)
>gnl|CDD|147730 pfam05735, TSP_C, Thrombospondin C-terminal region. This region is
found at the C-terminus of thrombospondin and related
proteins.
Length = 201
Score = 236 bits (605), Expect = 2e-71
Identities = 88/121 (72%), Positives = 100/121 (82%)
Query: 1015 QIDPHWVIYNHGAEILQTMNSDPGLAIGQDKFSGVDFEGTFFVDTDIDDDYAGFVFSYQS 1074
QIDP+WV+YN GAEI+QT+NSDPGLA+G D F GVDFEGTFF++T DDDY GFVF YQS
Sbjct: 1 QIDPNWVVYNQGAEIVQTLNSDPGLAVGYDAFEGVDFEGTFFINTTTDDDYVGFVFGYQS 60
Query: 1075 SQKFYVMMWKKNSQVYWQTTPFRAVAEPGIQLKVVDSATGPGTMLRNSLWHTGDTENQCD 1134
+ KFYV+MWKK Q YWQ PFRA AEPGIQLK+V+S TGPG LRN+LWHTGDT NQ
Sbjct: 61 NSKFYVVMWKKAEQTYWQANPFRASAEPGIQLKLVNSTTGPGEALRNALWHTGDTTNQVR 120
Query: 1135 L 1135
L
Sbjct: 121 L 121
Score = 116 bits (293), Expect = 2e-29
Identities = 40/55 (72%), Positives = 45/55 (81%)
Query: 1236 YQSSQKFYVMMWKKNSQVYWQTTPFRAVAEPGIQLKVVDSATGPGTMLRNSLWHT 1290
YQS+ KFYV+MWKK Q YWQ PFRA AEPGIQLK+V+S TGPG LRN+LWHT
Sbjct: 58 YQSNSKFYVVMWKKAEQTYWQANPFRASAEPGIQLKLVNSTTGPGEALRNALWHT 112
>gnl|CDD|202235 pfam02412, TSP_3, Thrombospondin type 3 repeat. The thrombospondin
repeat is a short aspartate rich repeat which binds to
calcium ions. The repeat was initially identified in
thrombospondin proteins that contained 7 of these
repeats. The repeat lacks defined secondary structure.
Length = 35
Score = 50.8 bits (122), Expect = 2e-08
Identities = 24/35 (68%), Positives = 28/35 (80%)
Query: 830 DTDNDGTGDACDNDMDNDGINNHADNCPRNANPDQ 864
D+D+DG GDACDND DNDG+ + DNCP NAN DQ
Sbjct: 1 DSDSDGIGDACDNDFDNDGVPDLLDNCPYNANIDQ 35
Score = 46.2 bits (110), Expect = 1e-06
Identities = 21/35 (60%), Positives = 25/35 (71%)
Query: 733 DADEDNIGDICDDDADNDGILNPSDNCPYVHNPDQ 767
D+D D IGD CD+D DNDG+ + DNCPY N DQ
Sbjct: 1 DSDSDGIGDACDNDFDNDGVPDLLDNCPYNANIDQ 35
Score = 44.3 bits (105), Expect = 4e-06
Identities = 19/31 (61%), Positives = 23/31 (74%)
Query: 963 DSDGDGVGNVCDKDFDKDGTPDVSDVCPNNS 993
DSD DG+G+ CD DFD DG PD+ D CP N+
Sbjct: 1 DSDSDGIGDACDNDFDNDGVPDLLDNCPYNA 31
Score = 42.4 bits (100), Expect = 2e-05
Identities = 22/37 (59%), Positives = 28/37 (75%), Gaps = 2/37 (5%)
Query: 488 DSNSNHIGDVCDSGVDTDHDGVPDPMDNCPKVANPNQ 524
DS+S+ IGD CD+ D D+DGVPD +DNCP AN +Q
Sbjct: 1 DSDSDGIGDACDN--DFDNDGVPDLLDNCPYNANIDQ 35
Score = 42.4 bits (100), Expect = 2e-05
Identities = 22/37 (59%), Positives = 28/37 (75%), Gaps = 2/37 (5%)
Query: 889 DSNSNHIGDVCDSGVDTDHDGVPDPMDNCPKVANPNQ 925
DS+S+ IGD CD+ D D+DGVPD +DNCP AN +Q
Sbjct: 1 DSDSDGIGDACDN--DFDNDGVPDLLDNCPYNANIDQ 35
Score = 40.4 bits (95), Expect = 1e-04
Identities = 17/35 (48%), Positives = 21/35 (60%)
Query: 927 DNDRDGKGDECDPDLDGDGISNDEDNCRLIYNPNQ 961
D+D DG GD CD D D DG+ + DNC N +Q
Sbjct: 1 DSDSDGIGDACDNDFDNDGVPDLLDNCPYNANIDQ 35
Score = 27.3 bits (61), Expect = 3.7
Identities = 16/35 (45%), Positives = 18/35 (51%), Gaps = 13/35 (37%)
Query: 465 DSDHDGIGDAC-------------DNCPRVSNPEQ 486
DSD DGIGDAC DNCP +N +Q
Sbjct: 1 DSDSDGIGDACDNDFDNDGVPDLLDNCPYNANIDQ 35
>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
large number of membrane-bound and extracellular (mostly
animal) proteins. Many of these proteins require calcium
for their biological function and calcium-binding sites
have been found to be located at the N-terminus of
particular EGF-like domains; calcium-binding may be
crucial for numerous protein-protein interactions. Six
conserved core cysteines form three disulfide bridges as
in non calcium-binding EGF domains, whose structures are
very similar. EGF_CA can be found in tandem repeat
arrangements.
Length = 38
Score = 42.6 bits (101), Expect = 2e-05
Identities = 17/37 (45%), Positives = 18/37 (48%), Gaps = 1/37 (2%)
Query: 235 DIDECDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 271
DIDEC PC C N YRC CP G+TG
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCS-CPPGYTGRN 36
Score = 35.7 bits (83), Expect = 0.005
Identities = 18/40 (45%), Positives = 21/40 (52%), Gaps = 4/40 (10%)
Query: 290 DIDECADGRNGGCDSNSMCTNTEGSFTCTSLCRNSYMVRN 329
DIDECA G C + C NT GS+ C+ C Y RN
Sbjct: 1 DIDECASG--NPCQNGGTCVNTVGSYRCS--CPPGYTGRN 36
Score = 34.9 bits (81), Expect = 0.008
Identities = 13/35 (37%), Positives = 16/35 (45%), Gaps = 1/35 (2%)
Query: 1131 NQCDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 1165
++C PC C N YRC CP G+TG
Sbjct: 3 DECASGNPCQNGGTCVNTVGSYRCS-CPPGYTGRN 36
Score = 33.8 bits (78), Expect = 0.024
Identities = 16/37 (43%), Positives = 20/37 (54%), Gaps = 3/37 (8%)
Query: 108 CATDNPCFPGVECRDTREGPRCMRCPDGYVGDGIHCK 144
CA+ NPC G C +T RC CP GY G +C+
Sbjct: 5 CASGNPCQNGGTCVNTVGSYRCS-CPPGYTGR--NCE 38
Score = 33.8 bits (78), Expect = 0.024
Identities = 16/37 (43%), Positives = 20/37 (54%), Gaps = 3/37 (8%)
Query: 590 CATDNPCFPGVECRDTREGPRCMRCPDGYVGDGIHCK 626
CA+ NPC G C +T RC CP GY G +C+
Sbjct: 5 CASGNPCQNGGTCVNTVGSYRCS-CPPGYTGR--NCE 38
Score = 31.1 bits (71), Expect = 0.24
Identities = 15/32 (46%), Positives = 17/32 (53%), Gaps = 3/32 (9%)
Query: 153 PCFQGVQCFDTVEGYTCGPCPSGYTGDGERCQ 184
PC G C +TV Y C CP GYTG C+
Sbjct: 10 PCQNGGTCVNTVGSYRCS-CPPGYTGR--NCE 38
>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain.
Length = 39
Score = 40.7 bits (96), Expect = 7e-05
Identities = 16/34 (47%), Positives = 18/34 (52%), Gaps = 1/34 (2%)
Query: 235 DIDECDLAEPCDPRVQCTNLFPGYRCDPCPAGFT 268
DIDEC PC C N YRC+ CP G+T
Sbjct: 1 DIDECASGNPCQNGGTCVNTVGSYRCE-CPPGYT 33
Score = 36.5 bits (85), Expect = 0.003
Identities = 14/29 (48%), Positives = 16/29 (55%), Gaps = 2/29 (6%)
Query: 290 DIDECADGRNGGCDSNSMCTNTEGSFTCT 318
DIDECA G C + C NT GS+ C
Sbjct: 1 DIDECASG--NPCQNGGTCVNTVGSYRCE 27
Score = 34.1 bits (79), Expect = 0.017
Identities = 17/37 (45%), Positives = 21/37 (56%), Gaps = 2/37 (5%)
Query: 108 CATDNPCFPGVECRDTREGPRCMRCPDGYVGDGIHCK 144
CA+ NPC G C +T RC CP GY DG +C+
Sbjct: 5 CASGNPCQNGGTCVNTVGSYRC-ECPPGYT-DGRNCE 39
Score = 34.1 bits (79), Expect = 0.017
Identities = 17/37 (45%), Positives = 21/37 (56%), Gaps = 2/37 (5%)
Query: 590 CATDNPCFPGVECRDTREGPRCMRCPDGYVGDGIHCK 626
CA+ NPC G C +T RC CP GY DG +C+
Sbjct: 5 CASGNPCQNGGTCVNTVGSYRC-ECPPGYT-DGRNCE 39
Score = 33.8 bits (78), Expect = 0.026
Identities = 16/32 (50%), Positives = 18/32 (56%), Gaps = 2/32 (6%)
Query: 153 PCFQGVQCFDTVEGYTCGPCPSGYTGDGERCQ 184
PC G C +TV Y C CP GYT DG C+
Sbjct: 10 PCQNGGTCVNTVGSYRCE-CPPGYT-DGRNCE 39
Score = 33.4 bits (77), Expect = 0.036
Identities = 12/32 (37%), Positives = 16/32 (50%), Gaps = 1/32 (3%)
Query: 1131 NQCDLAEPCDPRVQCTNLFPGYRCDPCPAGFT 1162
++C PC C N YRC+ CP G+T
Sbjct: 3 DECASGNPCQNGGTCVNTVGSYRCE-CPPGYT 33
>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain.
Length = 42
Score = 37.3 bits (87), Expect = 0.002
Identities = 19/45 (42%), Positives = 26/45 (57%), Gaps = 3/45 (6%)
Query: 290 DIDECADGRNGGCDSNSMCTNTEGSFTCTSLCRNSYMVRNVSVGC 334
D+DECADG + C +N++C NT GSF C +C + Y C
Sbjct: 1 DVDECADGTHN-CPANTVCVNTIGSFEC--VCPDGYENNEDGTNC 42
>gnl|CDD|205157 pfam12947, EGF_3, EGF domain. This family includes a variety of
EGF-like domain homologues. This family includes the
C-terminal domain of the malaria parasite MSP1 protein.
Length = 36
Score = 36.0 bits (84), Expect = 0.004
Identities = 17/25 (68%), Positives = 19/25 (76%), Gaps = 1/25 (4%)
Query: 294 CADGRNGGCDSNSMCTNTEGSFTCT 318
CA+ NGGC N+ CTNT GSFTCT
Sbjct: 1 CAEN-NGGCHPNATCTNTGGSFTCT 24
Score = 33.3 bits (77), Expect = 0.032
Identities = 13/30 (43%), Positives = 17/30 (56%), Gaps = 1/30 (3%)
Query: 350 CDRNAKCTRILGNHYACKCDNGWAGDGQFC 379
C NA CT G+ + C C +G+ GDG C
Sbjct: 8 CHPNATCTNTGGS-FTCTCKSGYTGDGVTC 36
>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
growth factor (EGF) and present in a large number of
proteins, mostly animal; the list of proteins currently
known to contain one or more copies of an EGF-like
pattern is large and varied; the functional significance
of EGF-like domains in what appear to be unrelated
proteins is not yet clear; a common feature is that
these repeats are found in the extracellular domain of
membrane-bound proteins or in proteins known to be
secreted (exception: prostaglandin G/H synthase); the
domain includes six cysteine residues which have been
shown to be involved in disulfide bonds; the main
structure is a two-stranded beta-sheet followed by a
loop to a C-terminal short two-stranded sheet;
Subdomains between the conserved cysteines vary in
length; the region between the 5th and 6th cysteine
contains two conserved glycines of which at least one
is present in most EGF-like domains; a subset of
these bind calcium.
Length = 36
Score = 34.4 bits (79), Expect = 0.012
Identities = 17/36 (47%), Positives = 18/36 (50%), Gaps = 2/36 (5%)
Query: 108 CATDNPCFPGVECRDTREGPRCMRCPDGYVGDGIHC 143
CA NPC G C +T RC CP GY GD C
Sbjct: 2 CAASNPCSNGGTCVNTPGSYRC-VCPPGYTGDR-SC 35
Score = 34.4 bits (79), Expect = 0.012
Identities = 17/36 (47%), Positives = 18/36 (50%), Gaps = 2/36 (5%)
Query: 590 CATDNPCFPGVECRDTREGPRCMRCPDGYVGDGIHC 625
CA NPC G C +T RC CP GY GD C
Sbjct: 2 CAASNPCSNGGTCVNTPGSYRC-VCPPGYTGDR-SC 35
Score = 33.6 bits (77), Expect = 0.023
Identities = 14/34 (41%), Positives = 16/34 (47%), Gaps = 1/34 (2%)
Query: 238 ECDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 271
EC + PC C N YRC CP G+TG
Sbjct: 1 ECAASNPCSNGGTCVNTPGSYRCV-CPPGYTGDR 33
Score = 32.1 bits (73), Expect = 0.093
Identities = 13/33 (39%), Positives = 15/33 (45%), Gaps = 1/33 (3%)
Query: 1133 CDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 1165
C + PC C N YRC CP G+TG
Sbjct: 2 CAASNPCSNGGTCVNTPGSYRCV-CPPGYTGDR 33
Score = 31.3 bits (71), Expect = 0.15
Identities = 15/32 (46%), Positives = 17/32 (53%), Gaps = 2/32 (6%)
Query: 153 PCFQGVQCFDTVEGYTCGPCPSGYTGDGERCQ 184
PC G C +T Y C CP GYTGD C+
Sbjct: 7 PCSNGGTCVNTPGSYRCV-CPPGYTGDR-SCE 36
Score = 27.1 bits (60), Expect = 4.8
Identities = 12/34 (35%), Positives = 15/34 (44%), Gaps = 4/34 (11%)
Query: 293 ECADGRNGGCDSNSMCTNTEGSFTCTSLCRNSYM 326
ECA + C + C NT GS+ C C Y
Sbjct: 1 ECAA--SNPCSNGGTCVNTPGSYRCV--CPPGYT 30
>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain.
Length = 35
Score = 30.9 bits (70), Expect = 0.25
Identities = 14/32 (43%), Positives = 16/32 (50%), Gaps = 2/32 (6%)
Query: 108 CATDNPCFPGVECRDTREGPRCMRCPDGYVGD 139
CA+ PC G C +T C CP GY GD
Sbjct: 2 CASGGPCSNGT-CINTPGSYTC-SCPPGYTGD 31
Score = 30.9 bits (70), Expect = 0.25
Identities = 14/32 (43%), Positives = 16/32 (50%), Gaps = 2/32 (6%)
Query: 590 CATDNPCFPGVECRDTREGPRCMRCPDGYVGD 621
CA+ PC G C +T C CP GY GD
Sbjct: 2 CASGGPCSNGT-CINTPGSYTC-SCPPGYTGD 31
Score = 28.6 bits (64), Expect = 1.4
Identities = 13/34 (38%), Positives = 14/34 (41%), Gaps = 2/34 (5%)
Query: 238 ECDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 271
EC PC C N Y C CP G+TG
Sbjct: 1 ECASGGPCSNG-TCINTPGSYTCS-CPPGYTGDK 32
Score = 28.3 bits (63), Expect = 2.3
Identities = 17/32 (53%), Positives = 20/32 (62%), Gaps = 3/32 (9%)
Query: 153 PCFQGVQCFDTVEGYTCGPCPSGYTGDGERCQ 184
PC G C +T YTC CP GYTGD +RC+
Sbjct: 7 PCSNG-TCINTPGSYTC-SCPPGYTGD-KRCE 35
Score = 27.1 bits (60), Expect = 5.3
Identities = 12/33 (36%), Positives = 13/33 (39%), Gaps = 2/33 (6%)
Query: 1133 CDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 1165
C PC C N Y C CP G+TG
Sbjct: 2 CASGGPCSNG-TCINTPGSYTCS-CPPGYTGDK 32
>gnl|CDD|215652 pfam00008, EGF, EGF-like domain. There is no clear separation
between noise and signal. pfam00053 is very similar, but
has 8 instead of 6 conserved cysteines. Includes some
cytokine receptors. The EGF domain misses the N-terminus
regions of the Ca2+ binding EGF domains (this is the
main reason for the discrepancy between Swiss-Prot domain
start/end and Pfam). The family is hard to model due to
many similar but different sub-types of EGF domains.
Pfam certainly misses a number of EGF domains.
Length = 32
Score = 28.9 bits (65), Expect = 1.1
Identities = 15/32 (46%), Positives = 18/32 (56%), Gaps = 1/32 (3%)
Query: 108 CATDNPCFPGVECRDTREGPRCMRCPDGYVGD 139
C+ +NPC G C DT G C CP+GY G
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTCE-CPEGYTGK 31
Score = 28.9 bits (65), Expect = 1.1
Identities = 15/32 (46%), Positives = 18/32 (56%), Gaps = 1/32 (3%)
Query: 590 CATDNPCFPGVECRDTREGPRCMRCPDGYVGD 621
C+ +NPC G C DT G C CP+GY G
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTCE-CPEGYTGK 31
Score = 28.6 bits (64), Expect = 1.6
Identities = 16/27 (59%), Positives = 16/27 (59%), Gaps = 1/27 (3%)
Query: 153 PCFQGVQCFDTVEGYTCGPCPSGYTGD 179
PC G C DT GYTC CP GYTG
Sbjct: 6 PCSNGGTCVDTPGGYTCE-CPEGYTGK 31
Score = 28.2 bits (63), Expect = 2.3
Identities = 8/32 (25%), Positives = 10/32 (31%), Gaps = 1/32 (3%)
Query: 344 CPDGTRCDRNAKCTRILGNHYACKCDNGWAGD 375
C C C Y C+C G+ G
Sbjct: 1 CSPNNPCSNGGTCVD-TPGGYTCECPEGYTGK 31
Score = 27.8 bits (62), Expect = 2.4
Identities = 12/33 (36%), Positives = 15/33 (45%), Gaps = 1/33 (3%)
Query: 239 CDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 271
C PC C + GY C+ CP G+TG
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTCE-CPEGYTGKR 32
Score = 27.8 bits (62), Expect = 2.4
Identities = 12/33 (36%), Positives = 15/33 (45%), Gaps = 1/33 (3%)
Query: 1133 CDLAEPCDPRVQCTNLFPGYRCDPCPAGFTGST 1165
C PC C + GY C+ CP G+TG
Sbjct: 1 CSPNNPCSNGGTCVDTPGGYTCE-CPEGYTGKR 32
>gnl|CDD|177447 PHA02664, PHA02664, hypothetical protein; Provisional.
Length = 534
Score = 32.7 bits (74), Expect = 1.4
Identities = 17/81 (20%), Positives = 30/81 (37%)
Query: 900 DSGVDTDHDGVPDPMDNCPKVANPNQLDNDRDGKGDECDPDLDGDGISNDEDNCRLIYNP 959
D V+ + D P A+ D D + + D DG+ S+ + +
Sbjct: 426 DQDVEAEAHDEFDQDPGAPAHADRADSDEDDMDEQESGDERADGEDDSDSSYSYSTTSSE 485
Query: 960 NQDDSDGDGVGNVCDKDFDKD 980
++ DS D G+ D + D
Sbjct: 486 DESDSADDSWGDESDSGIEHD 506
Score = 30.7 bits (69), Expect = 5.3
Identities = 20/97 (20%), Positives = 34/97 (35%), Gaps = 21/97 (21%)
Query: 832 DNDGTGDACDNDMDNDGINNHADNCPRNANPDQR---DSDHDGI---------------G 873
D D++ + H + P DSD D +
Sbjct: 417 AAAAANAPADQDVEAEA---HDEFDQDPGAPAHADRADSDEDDMDEQESGDERADGEDDS 473
Query: 874 DACDNCPRVSNPEQTDSNSNHIGDVCDSGVDTDHDGV 910
D+ + S+ +++DS + GD DSG++ D GV
Sbjct: 474 DSSYSYSTTSSEDESDSADDSWGDESDSGIEHDDGGV 510
>gnl|CDD|110998 pfam02055, Glyco_hydro_30, O-Glycosyl hydrolase family 30.
Length = 495
Score = 32.2 bits (73), Expect = 1.9
Identities = 25/95 (26%), Positives = 36/95 (37%), Gaps = 5/95 (5%)
Query: 1029 ILQTMNSDPGLA--IGQDKFSGVDFEGTFFVDTDIDDDYAGFVFSYQSSQKFYVMMWKKN 1086
IL+ SD GL G+ + DF + D DDY FS + + +
Sbjct: 102 ILKQYFSDEGLNLQFGRVPIASCDFSIRVYTYADTPDDYQMHNFSLPEEDTQWKIPYIHR 161
Query: 1087 SQVYWQTTPFRAV--AEPGIQLKVVDSATGPGTML 1119
+Q Y Q A PG LK + G G++
Sbjct: 162 AQKYNQRLKLFASPWTAPG-WLKTTGAVNGKGSLK 195
>gnl|CDD|215094 PLN00188, PLN00188, enhanced disease resistance protein (EDR2);
Provisional.
Length = 719
Score = 32.5 bits (74), Expect = 1.9
Identities = 18/62 (29%), Positives = 26/62 (41%), Gaps = 17/62 (27%)
Query: 75 DSMKNPQMRLRKTDEESVDEIELPAIPIVKKPTCATDNPCFPGVECRDTREGPR-CMRCP 133
++ KN + + +EE D+I+L CF G RD R+ R C R
Sbjct: 461 ETTKN-ETKDTAMEEEPQDKIDLS---------------CFSGNLRRDDRDKARDCWRIS 504
Query: 134 DG 135
DG
Sbjct: 505 DG 506
>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like. cEGF, or complement
Clr-like EGF, domains have six conserved cysteine
residues disulfide-bonded into the characteristic
pattern 'ababcc'. They are found in blood coagulation
proteins such as fibrillin, Clr and Cls, thrombomodulin,
and the LDL receptor. The core fold of the EGF domain
consists of two small beta-hairpins packed against each
other. Two major structural variants have been
identified based on the structural context of the
C-terminal cysteine residue of disulfide 'c' in the
C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
C-terminal thiol resides on the C-terminal beta-sheet,
resulting in long loop-lengths between the cysteine
residues of disulfide 'c', typically C[10+]XC. These
longer loop-lengths may have arisen by selective
cysteine loss from a four-disulfide EGF template such as
laminin or integrin. Tandem cEGF domains have five
linking residues between terminal cysteines of adjacent
domains. cEGF domains may or may not bind calcium in the
linker region. cEGF domains with the consensus motif
CXN4X[F,Y]XCXC are hydroxylated exclusively on the
asparagine residue.
Length = 24
Score = 27.8 bits (63), Expect = 2.0
Identities = 13/24 (54%), Positives = 15/24 (62%), Gaps = 3/24 (12%)
Query: 217 YRCGSCPEG--TTGNGTRCHDIDE 238
Y C SCP G +G+G C DIDE
Sbjct: 2 YTC-SCPPGYQLSGDGRTCEDIDE 24
Score = 27.8 bits (63), Expect = 2.0
Identities = 13/24 (54%), Positives = 15/24 (62%), Gaps = 3/24 (12%)
Query: 699 YRCGSCPEG--TTGNGTRCHDIDE 720
Y C SCP G +G+G C DIDE
Sbjct: 2 YTC-SCPPGYQLSGDGRTCEDIDE 24
>gnl|CDD|173479 PTZ00214, PTZ00214, high cysteine membrane protein Group 4;
Provisional.
Length = 800
Score = 31.8 bits (72), Expect = 2.6
Identities = 20/69 (28%), Positives = 28/69 (40%), Gaps = 3/69 (4%)
Query: 142 HCKPGVTCNMRP--CFQGVQCFDTVEGYTCGPCPSGYTGDGERCQRIGGCSRNPCAQGKL 199
+ K GV C + C Q +C C C SGY +CQ G + + C
Sbjct: 22 YNKFGVECGGKQENCAQN-RCILLGSDELCTQCVSGYVPINGKCQLYPGDASSVCIPDDA 80
Query: 200 NEKTRCVRC 208
+ TRC+ C
Sbjct: 81 SSPTRCIEC 89
>gnl|CDD|131138 TIGR02083, LEU2, 3-isopropylmalate dehydratase, large subunit.
Homoaconitase, aconitase, and 3-isopropylmalate
dehydratase have similar overall structures. All are
dehydratases (EC 4.2.1.-) and bind a Fe-4S iron-sulfur
cluster. 3-isopropylmalate dehydratase is split into
large (leuC) and small (leuD) chains in eubacteria.
Several pairs of archaeal proteins resemble the leuC and
leuD pair in length and sequence but even more closely
resemble the respective domains of homoaconitase, and
their identity is uncertain. These homologs are
described by a separate model of subfamily (rather than
equivalog) homology type (TIGR01343). This model along
with TIGR00170 describe clades which consist only of
LeuC sequences. Here, the genes from Pyrococcus
furiosus, Clostridium acetobutylicum, Thermotoga
maritima and others are gene clustered with related
genes from the leucine biosynthesis pathway [Amino acid
biosynthesis, Pyruvate family].
Length = 419
Score = 30.9 bits (70), Expect = 5.2
Identities = 21/66 (31%), Positives = 25/66 (37%), Gaps = 19/66 (28%)
Query: 144 KPGVTCNMRPCFQGVQCFDTVEGY--------------TCGPCPSGYTG---DGERCQRI 186
P V C + P Q V EG TCGPC G+ G +GER I
Sbjct: 321 APDVRCIIIPGSQNVYLEAMKEGLLEIFIEAGAVVSTPTCGPCLGGHMGILAEGERA--I 378
Query: 187 GGCSRN 192
+RN
Sbjct: 379 STTNRN 384
>gnl|CDD|192535 pfam10320, 7TM_GPCR_Srsx, Serpentine type 7TM GPCR chemoreceptor
Srsx. Chemoreception is mediated in Caenorhabditis
elegans by members of the seven-transmembrane
G-protein-coupled receptor class (7TM GPCRs) of proteins
which are of the serpentine type. Srsx is a solo family
amongst the superfamilies of chemoreceptors.
Chemoperception is one of the central senses of soil
nematodes like C. elegans which are otherwise 'blind' and
'deaf'.
Length = 257
Score = 30.3 bits (69), Expect = 6.7
Identities = 9/39 (23%), Positives = 14/39 (35%), Gaps = 7/39 (17%)
Query: 1053 GTFFVDTDIDDDYAGFVFSYQS-------SQKFYVMMWK 1084
T F+ D + + Y SQ F+V W+
Sbjct: 207 NTVFLLLTEDGEVENIIQMYAGIFVNLSFSQNFFVTYWR 245
>gnl|CDD|178752 PLN03213, PLN03213, repressor of silencing 3; Provisional.
Length = 759
Score = 30.6 bits (68), Expect = 6.7
Identities = 29/113 (25%), Positives = 48/113 (42%), Gaps = 12/113 (10%)
Query: 800 QLDSDQDGMRDRRLGDACDNCPTVPNVDQTDTDNDGTGDACDNDMDNDGINNHADNCPRN 859
+ D+ D M D D+ + V ++D GDA +ND I++ AD+ N
Sbjct: 447 ECDTAIDSMADDTAIDSMADDAASDAVAESDD-----GDAVENDT---AIDSMADDTASN 498
Query: 860 ANPDQRDSDHDGIGDACDN-CPRVSNPEQTDSNSNHIGDVCDSGVDTDHDGVP 911
+ + D D+ A D+ +N D S+ + D+ +DT D VP
Sbjct: 499 SMAESDDGDNVEDDTAIDSMADDTAN---DDVGSDDSESLADTVIDTSVDAVP 548
>gnl|CDD|193419 pfam12946, EGF_MSP1_1, MSP1 EGF domain 1. This EGF-like domain is
found at the C-terminus of the malaria parasite MSP1
protein. MSP1 is the merozoite surface protein 1. This
domain is part of the C-terminal fragment that is
proteolytically processed from the rest of the
protein and is left attached to the surface of the
invading parasite.
Length = 37
Score = 26.6 bits (59), Expect = 7.4
Identities = 12/33 (36%), Positives = 15/33 (45%)
Query: 347 GTRCDRNAKCTRILGNHYACKCDNGWAGDGQFC 379
T C NA C R L C+C G+ +G C
Sbjct: 4 KTNCPANAGCFRYLDGREECRCLLGYKKEGGKC 36
>gnl|CDD|173874 cd08509, PBP2_TmCBP_oligosaccharides_like, The substrate binding
domain of a cellulose-binding protein from Thermotoga
maritima contains the type 2 periplasmic binding fold.
This family represents the substrate-binding domain of a
cellulose-binding protein from the hyperthermophilic
bacterium Thermotoga maritima (TmCBP) and its closest
related proteins. TmCBP binds a variety of lengths of
beta-1,4-linked glucose oligomers, ranging from two
sugar rings (cellobiose) to five (cellopentose). TmCBP
is structurally homologous to domains I and III of the
ATP-binding cassette (ABC)-type oligopeptide-binding
proteins and thus belongs to the type 2 periplasmic
binding fold protein (PBP2) superfamily. The type 2
periplasmic binding proteins are soluble ligand-binding
components of ABC or tripartite ATP-independent
transporters and chemotaxis systems. Members of the PBP2
superfamily function in uptake of a variety of
metabolites in bacteria such as amino acids,
carbohydrate, ions, and polyamines. Ligands are then
transported across the cytoplasmic membrane energized by
ATP hydrolysis or electrochemical ion gradient. Besides
transport proteins, the PBP2 superfamily includes the
ligand-binding domains from ionotropic glutamate
receptors, LysR-type transcriptional regulators, and
unorthodox sensor proteins involved in signal
transduction.
Length = 509
Score = 30.4 bits (69), Expect = 7.7
Identities = 11/54 (20%), Positives = 22/54 (40%), Gaps = 9/54 (16%)
Query: 544 WTTVPRWQIP--------TSWITTVTARERLAEKIEDGIWKKELPAIPIVKKPT 589
RW+ P + T ++ L +++ I+ +E+P IP+ P
Sbjct: 438 AGNFGRWKNPELDELIDELNKTTDEAEQKELGNELQK-IFAEEMPVIPLFYNPI 490
>gnl|CDD|114045 pfam05297, Herpes_LMP1, Herpesvirus latent membrane protein 1
(LMP1). This family consists of several latent membrane
protein 1 or LMP1s mostly from Epstein-Barr virus. LMP1
of EBV is a 62-65 kDa plasma membrane protein possessing
six membrane spanning regions, a short cytoplasmic
N-terminus and a long cytoplasmic carboxy tail of 200
amino acids. EBV latent membrane protein 1 (LMP1) is
essential for EBV-mediated transformation and has been
associated with several cases of malignancies. EBV-like
viruses in Cynomolgus monkeys (Macaca fascicularis) have
been associated with high lymphoma rates in
immunosuppressed monkeys.
Length = 382
Score = 30.0 bits (67), Expect = 8.6
Identities = 26/76 (34%), Positives = 39/76 (51%), Gaps = 12/76 (15%)
Query: 727 PNSGQEDADE-DNIGDICDDDADNDGILNPS---DNCPYVHNPDQLDSDQDGMRDRRLGD 782
P++G +D D D+ G D+ D++G +P DN P +PD ++ +G +D D
Sbjct: 249 PDNGPQDPDNTDDNGPQDPDNTDDNGPQDPDNTDDNGP--QDPD--NTADNGPQDPDNTD 304
Query: 783 ACDNCPTDNCPYVHNP 798
DN P D P HNP
Sbjct: 305 --DNGPHDPLP--HNP 316
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.138 0.460
Gapped
Lambda K H
0.267 0.0632 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 64,704,956
Number of extensions: 6206615
Number of successful extensions: 4518
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4475
Number of HSP's successfully gapped: 107
Length of query: 1290
Length of database: 10,937,602
Length adjustment: 108
Effective length of query: 1182
Effective length of database: 6,147,370
Effective search space: 7266191340
Effective search space used: 7266191340
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 65 (29.0 bits)