RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy954
(337 letters)
>gnl|CDD|238060 cd00112, LDLa, Low Density Lipoprotein Receptor Class A domain, a
cysteine-rich repeat that plays a central role in
mammalian cholesterol metabolism; the receptor protein
binds LDL and transports it into cells by endocytosis; 7
successive cysteine-rich repeats of about 40 amino acids
are present in the N-terminal of this multidomain
membrane protein; other homologous domains occur in
related receptors, including the very low-density
lipoprotein receptor and the LDL receptor-related
protein/alpha 2-macroglobulin receptor, and in proteins
which are functionally unrelated, such as the C9
component of complement; the binding of calcium is
required for in vitro formation of the native disulfide
isomer and is necessary in establishing and maintaining
the modular structure.
Length = 35
Score = 53.7 bits (130), Expect = 5e-10
Identities = 20/32 (62%), Positives = 26/32 (81%)
Query: 259 CSPDQFSCGNGRCINTGWLCDHDNDCGDGSDE 290
C P++F C NGRCI + W+CD ++DCGDGSDE
Sbjct: 1 CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDE 32
Score = 51.4 bits (124), Expect = 4e-09
Identities = 21/35 (60%), Positives = 24/35 (68%)
Query: 301 CSSEEFACQNFKCIRKTYHCDGEDDCGDRSDEFNC 335
C EF C N +CI ++ CDGEDDCGD SDE NC
Sbjct: 1 CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEENC 35
Score = 49.1 bits (118), Expect = 2e-08
Identities = 19/33 (57%), Positives = 24/33 (72%)
Query: 121 CDGSKFFCRNGKCISRMWSCDGDDDCGDNSDED 153
C ++F C NG+CI W CDG+DDCGD SDE+
Sbjct: 1 CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEE 33
Score = 38.3 bits (90), Expect = 1e-04
Identities = 16/37 (43%), Positives = 20/37 (54%), Gaps = 6/37 (16%)
Query: 211 ETEFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADEN 247
EF C +CIP W+CDG+ DC DG+DE
Sbjct: 3 PNEFRCANG------RCIPSSWVCDGEDDCGDGSDEE 33
>gnl|CDD|197566 smart00192, LDLa, Low-density lipoprotein receptor domain class A.
Cysteine-rich repeat in the low-density lipoprotein
(LDL) receptor that plays a central role in mammalian
cholesterol metabolism. The N-terminal type A repeats in
LDL receptor bind the lipoproteins. Other homologous
domains occur in related receptors, including the very
low-density lipoprotein receptor and the LDL
receptor-related protein/alpha 2-macroglobulin receptor,
and in proteins which are functionally unrelated, such
as the C9 component of complement. Mutations in the LDL
receptor gene cause familial hypercholesterolemia.
Length = 33
Score = 51.5 bits (124), Expect = 3e-09
Identities = 20/32 (62%), Positives = 24/32 (75%)
Query: 259 CSPDQFSCGNGRCINTGWLCDHDNDCGDGSDE 290
C P +F C NGRCI + W+CD +DCGDGSDE
Sbjct: 2 CPPGEFQCDNGRCIPSSWVCDGVDDCGDGSDE 33
Score = 47.2 bits (113), Expect = 8e-08
Identities = 20/33 (60%), Positives = 22/33 (66%)
Query: 120 TCDGSKFFCRNGKCISRMWSCDGDDDCGDNSDE 152
TC +F C NG+CI W CDG DDCGD SDE
Sbjct: 1 TCPPGEFQCDNGRCIPSSWVCDGVDDCGDGSDE 33
Score = 44.9 bits (107), Expect = 6e-07
Identities = 19/33 (57%), Positives = 22/33 (66%)
Query: 300 TCSSEEFACQNFKCIRKTYHCDGEDDCGDRSDE 332
TC EF C N +CI ++ CDG DDCGD SDE
Sbjct: 1 TCPPGEFQCDNGRCIPSSWVCDGVDDCGDGSDE 33
Score = 36.5 bits (85), Expect = 6e-04
Identities = 16/36 (44%), Positives = 19/36 (52%), Gaps = 6/36 (16%)
Query: 211 ETEFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADE 246
EF C +CIP W+CDG DC DG+DE
Sbjct: 4 PGEFQCDNG------RCIPSSWVCDGVDDCGDGSDE 33
>gnl|CDD|200964 pfam00057, Ldl_recept_a, Low-density lipoprotein receptor domain
class A.
Length = 37
Score = 49.6 bits (119), Expect = 1e-08
Identities = 21/34 (61%), Positives = 25/34 (73%)
Query: 257 SSCSPDQFSCGNGRCINTGWLCDHDNDCGDGSDE 290
S+C PD+F CG+G CI W+CD D DC DGSDE
Sbjct: 1 STCGPDEFQCGSGECIPMSWVCDGDPDCEDGSDE 34
Score = 48.1 bits (115), Expect = 5e-08
Identities = 18/34 (52%), Positives = 21/34 (61%)
Query: 120 TCDGSKFFCRNGKCISRMWSCDGDDDCGDNSDED 153
TC +F C +G+CI W CDGD DC D SDE
Sbjct: 2 TCGPDEFQCGSGECIPMSWVCDGDPDCEDGSDEK 35
Score = 44.2 bits (105), Expect = 1e-06
Identities = 18/37 (48%), Positives = 24/37 (64%)
Query: 299 RTCSSEEFACQNFKCIRKTYHCDGEDDCGDRSDEFNC 335
TC +EF C + +CI ++ CDG+ DC D SDE NC
Sbjct: 1 STCGPDEFQCGSGECIPMSWVCDGDPDCEDGSDEKNC 37
Score = 38.0 bits (89), Expect = 2e-04
Identities = 18/36 (50%), Positives = 21/36 (58%), Gaps = 6/36 (16%)
Query: 213 EFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADENT 248
EF C +CIP W+CDGDPDC DG+DE
Sbjct: 7 EFQCGSG------ECIPMSWVCDGDPDCEDGSDEKN 36
>gnl|CDD|215683 pfam00058, Ldl_recept_b, Low-density lipoprotein receptor repeat
class B. This domain is also known as the YWTD motif
after the most conserved region of the repeat. The YWTD
repeat is found in multiple tandem repeats and has been
predicted to form a beta-propeller structure.
Length = 42
Score = 30.6 bits (70), Expect = 0.080
Identities = 10/41 (24%), Positives = 14/41 (34%)
Query: 26 NYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVY 66
+YWTD LR G++ + P I V
Sbjct: 1 GRLYWTDSSLRASISVADLNGSDRRTLFSEDLQWPNGIAVD 41
>gnl|CDD|214531 smart00135, LY, Low-density lipoprotein-receptor YWTD domain.
Type "B" repeats in low-density lipoprotein (LDL)
receptor that plays a central role in mammalian
cholesterol metabolism. Also present in a variety of
molecules similar to gp300/megalin.
Length = 43
Score = 28.7 bits (65), Expect = 0.48
Identities = 7/23 (30%), Positives = 10/23 (43%)
Query: 19 FAITVHRNYIYWTDLQLRGVYRA 41
A+ +YWTD L + A
Sbjct: 14 LAVDWIEGRLYWTDWGLDVIEVA 36
>gnl|CDD|193472 pfam12999, PRKCSH-like, Glucosidase II beta subunit-like. The
sequences found in this family are similar to a region
found in the beta-subunit of glucosidase II, which is
also known as protein kinase C substrate 80K-H (PRKCSH).
The enzyme catalyzes the sequential removal of two
alpha-1,3-linked glucose residues in the second step of
N-linked oligosaccharide processing. The beta subunit is
required for the solubility and stability of the
heterodimeric enzyme, and is involved in retaining the
enzyme within the endoplasmic reticulum.
Length = 176
Score = 30.5 bits (69), Expect = 0.85
Identities = 24/74 (32%), Positives = 32/74 (43%), Gaps = 21/74 (28%)
Query: 239 DCVDGADENTTALNCPKQSSCSPDQFSCGNGRCINT--------GWLCDHDNDCGDGSDE 290
DC DG+DE P ++CS +F C N I +CD+D C DGSDE
Sbjct: 59 DCPDGSDE-------PGTNACSNGKFYCANEGFIPGYIPSFKVDDGVCDYD-ICCDGSDE 110
Query: 291 G-----KECHDKYR 299
+C + R
Sbjct: 111 ALGKCPNKCGEIAR 124
>gnl|CDD|102374 PRK06434, PRK06434, cystathionine gamma-lyase; Validated.
Length = 384
Score = 30.6 bits (69), Expect = 1.3
Identities = 16/43 (37%), Positives = 24/43 (55%), Gaps = 5/43 (11%)
Query: 33 LQLRGV----YRAEKHTGANMIEMVKRLEDSPRDIHVYSADSQ 71
L LRG+ R EKH N +E+ + L DS + +VY D++
Sbjct: 249 LALRGLKTLGLRMEKHN-KNGMELARFLRDSKKISNVYYPDTE 290
>gnl|CDD|129832 TIGR00749, glk, glucokinase, proteobacterial type. This model
represents glucokinase of E. coli and close homologs,
mostly from other proteobacteria, presumed to have
equivalent function. This glucokinase is more closely
related to a number of uncharacterized paralogs than to
the glucokinase glcK (formerly yqgR) of Bacillus
subtilis and its closest homologs, so the two sets are
represented by separate models [Energy metabolism,
Glycolysis/gluconeogenesis].
Length = 316
Score = 29.1 bits (65), Expect = 3.5
Identities = 15/66 (22%), Positives = 24/66 (36%), Gaps = 11/66 (16%)
Query: 175 GHVQITGVSQPPGIVMVMTTVQTGLMNHPNNRKCDEETEFTCTENKAWNRAQCIPKKWLC 234
GHV V PG+V + + K D E +F + + + I ++ L
Sbjct: 182 GHVSAERVLSGPGLVNIYEAL----------VKADPERQFN-KLPQENLKPKDISERALA 230
Query: 235 DGDPDC 240
DC
Sbjct: 231 GSCTDC 236
>gnl|CDD|219761 pfam08243, SPT2, SPT2 chromatin protein. This family includes the
Saccharomyces cerevisiae protein SPT2 which is a
chromatin protein involved in transcriptional
regulation.
Length = 116
Score = 27.5 bits (61), Expect = 4.5
Identities = 15/50 (30%), Positives = 20/50 (40%)
Query: 279 DHDNDCGDGSDEGKECHDKYRTCSSEEFACQNFKCIRKTYHCDGEDDCGD 328
+HD D D ++ E D+ S E +A R Y EDD D
Sbjct: 23 EHDEDMDDFIEDDDEEQDEIPYDSDEIWAIFGKGRKRSYYDRYDEDDALD 72
>gnl|CDD|233463 TIGR01549, HAD-SF-IA-v1, haloacid dehalogenase superfamily,
subfamily IA, variant 1 with third motif having
Dx(3-4)D or Dx(3-4)E. This model represents part of
one structural subfamily of the Haloacid Dehalogenase
(HAD) superfamily of aspartate-nucleophile hydrolases.
The superfamily is defined by the presence of three
short catalytic motifs. The subfamilies are defined
based on the location and the observed or predicted
fold of a so-called "capping domain", or the absence of
such a domain. Subfamily I consists of sequences in
which the capping domain is found in between the first
and second catalytic motifs. Subfamily II consists of
sequences in which the capping domain is found between
the second and third motifs. Subfamily III sequences
have no capping domain in either of these positions. The
Subfamily IA and IB capping domains are predicted by
PSI-PRED to consist of an alpha helical bundle.
Subfamily I encompasses such a wide region of sequence
space (the sequences are highly divergent) that
modelling it with a single representation is
impossible, resulting in an overly broad description
which allows in many unrelated sequences. Subfamily IA
and IB are separated based on an apparent phylogenetic
bifurcation. Subfamily IA is still too broad to model,
but cannot be further subdivided into large chunks
based on phylogenetic trees. Of the three motifs
defining the HAD superfamily, the third has three
variant forms: (1) hhhhsDxxx(x)(D/E), (2)
hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to
a small amino acid and _h_ to a hydrophobic one. All
three of these variants are found in subfamily IA.
Individual models were made based on seeds exhibiting
only one of the variants each. Variant 1 (this model)
is found in the enzymes phosphoglycolate phosphatase
(TIGR01449) and enolase-phosphatase. These three
variant models (see also TIGR01493 and TIGR01509) were
created with the knowledge that there will be overlap
among them - this is by design and serves the purpose
of eliminating the overlap with models of more
distantly related HAD subfamilies caused by an overly
broad single model [Unknown function, Enzymes of
unknown specificity].
Length = 162
Score = 28.1 bits (63), Expect = 4.6
Identities = 9/40 (22%), Positives = 19/40 (47%)
Query: 33 LQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVYSADSQK 72
LQ Y AE+ +++ RL+++ + + S S +
Sbjct: 60 LQGHIGYDAEEAYIPGAADLLPRLKEAGIKLGIISNGSLR 99
>gnl|CDD|224852 COG1941, FrhG, Coenzyme F420-reducing hydrogenase, gamma subunit
[Energy production and conversion].
Length = 247
Score = 27.7 bits (62), Expect = 7.7
Identities = 15/62 (24%), Positives = 21/62 (33%), Gaps = 16/62 (25%)
Query: 95 GTAECKCDESTKLVNEGRMCVAKNITCDGSKFFCRNGKCISRMWSCDG-------DDDCG 147
+ +C+CD L+ +G C+ TC C SR C G CG
Sbjct: 173 TSEKCRCDLDCCLLEQGLPCMGC-GTC--------AASCPSRAIPCRGCRGNIPRCIKCG 223
Query: 148 DN 149
Sbjct: 224 AC 225
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.135 0.456
Gapped
Lambda K H
0.267 0.0749 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,960,280
Number of extensions: 1400349
Number of successful extensions: 1009
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1008
Number of HSP's successfully gapped: 38
Length of query: 337
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 240
Effective length of database: 6,635,264
Effective search space: 1592463360
Effective search space used: 1592463360
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (26.5 bits)