RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy954
         (337 letters)



>gnl|CDD|238060 cd00112, LDLa, Low Density Lipoprotein Receptor Class A domain, a
           cysteine-rich repeat that plays a central role in
           mammalian cholesterol metabolism; the receptor protein
           binds LDL and transports it into cells by endocytosis; 7
           successive cysteine-rich repeats of about 40 amino acids
           are present in the N-terminal of this multidomain
           membrane protein; other homologous domains occur in
           related receptors, including the very low-density
           lipoprotein receptor and the LDL receptor-related
           protein/alpha 2-macroglobulin receptor, and in proteins
           which are functionally unrelated, such as the C9
           component of complement; the binding of calcium is
           required for in vitro formation of the native disulfide
           isomer and is necessary in establishing and maintaining
           the modular structure.
          Length = 35

 Score = 53.7 bits (130), Expect = 5e-10
 Identities = 20/32 (62%), Positives = 26/32 (81%)

Query: 259 CSPDQFSCGNGRCINTGWLCDHDNDCGDGSDE 290
           C P++F C NGRCI + W+CD ++DCGDGSDE
Sbjct: 1   CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDE 32



 Score = 51.4 bits (124), Expect = 4e-09
 Identities = 21/35 (60%), Positives = 24/35 (68%)

Query: 301 CSSEEFACQNFKCIRKTYHCDGEDDCGDRSDEFNC 335
           C   EF C N +CI  ++ CDGEDDCGD SDE NC
Sbjct: 1   CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEENC 35



 Score = 49.1 bits (118), Expect = 2e-08
 Identities = 19/33 (57%), Positives = 24/33 (72%)

Query: 121 CDGSKFFCRNGKCISRMWSCDGDDDCGDNSDED 153
           C  ++F C NG+CI   W CDG+DDCGD SDE+
Sbjct: 1   CPPNEFRCANGRCIPSSWVCDGEDDCGDGSDEE 33



 Score = 38.3 bits (90), Expect = 1e-04
 Identities = 16/37 (43%), Positives = 20/37 (54%), Gaps = 6/37 (16%)

Query: 211 ETEFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADEN 247
             EF C         +CIP  W+CDG+ DC DG+DE 
Sbjct: 3   PNEFRCANG------RCIPSSWVCDGEDDCGDGSDEE 33


>gnl|CDD|197566 smart00192, LDLa, Low-density lipoprotein receptor domain class A. 
           Cysteine-rich repeat in the low-density lipoprotein
           (LDL) receptor that plays a central role in mammalian
           cholesterol metabolism. The N-terminal type A repeats in
           LDL receptor bind the lipoproteins. Other homologous
           domains occur in related receptors, including the very
           low-density lipoprotein receptor and the LDL
           receptor-related protein/alpha 2-macroglobulin receptor,
           and in proteins which are functionally unrelated, such
           as the C9 component of complement. Mutations in the LDL
           receptor gene cause familial hypercholesterolemia.
          Length = 33

 Score = 51.5 bits (124), Expect = 3e-09
 Identities = 20/32 (62%), Positives = 24/32 (75%)

Query: 259 CSPDQFSCGNGRCINTGWLCDHDNDCGDGSDE 290
           C P +F C NGRCI + W+CD  +DCGDGSDE
Sbjct: 2   CPPGEFQCDNGRCIPSSWVCDGVDDCGDGSDE 33



 Score = 47.2 bits (113), Expect = 8e-08
 Identities = 20/33 (60%), Positives = 22/33 (66%)

Query: 120 TCDGSKFFCRNGKCISRMWSCDGDDDCGDNSDE 152
           TC   +F C NG+CI   W CDG DDCGD SDE
Sbjct: 1   TCPPGEFQCDNGRCIPSSWVCDGVDDCGDGSDE 33



 Score = 44.9 bits (107), Expect = 6e-07
 Identities = 19/33 (57%), Positives = 22/33 (66%)

Query: 300 TCSSEEFACQNFKCIRKTYHCDGEDDCGDRSDE 332
           TC   EF C N +CI  ++ CDG DDCGD SDE
Sbjct: 1   TCPPGEFQCDNGRCIPSSWVCDGVDDCGDGSDE 33



 Score = 36.5 bits (85), Expect = 6e-04
 Identities = 16/36 (44%), Positives = 19/36 (52%), Gaps = 6/36 (16%)

Query: 211 ETEFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADE 246
             EF C         +CIP  W+CDG  DC DG+DE
Sbjct: 4   PGEFQCDNG------RCIPSSWVCDGVDDCGDGSDE 33


>gnl|CDD|200964 pfam00057, Ldl_recept_a, Low-density lipoprotein receptor domain
           class A. 
          Length = 37

 Score = 49.6 bits (119), Expect = 1e-08
 Identities = 21/34 (61%), Positives = 25/34 (73%)

Query: 257 SSCSPDQFSCGNGRCINTGWLCDHDNDCGDGSDE 290
           S+C PD+F CG+G CI   W+CD D DC DGSDE
Sbjct: 1   STCGPDEFQCGSGECIPMSWVCDGDPDCEDGSDE 34



 Score = 48.1 bits (115), Expect = 5e-08
 Identities = 18/34 (52%), Positives = 21/34 (61%)

Query: 120 TCDGSKFFCRNGKCISRMWSCDGDDDCGDNSDED 153
           TC   +F C +G+CI   W CDGD DC D SDE 
Sbjct: 2   TCGPDEFQCGSGECIPMSWVCDGDPDCEDGSDEK 35



 Score = 44.2 bits (105), Expect = 1e-06
 Identities = 18/37 (48%), Positives = 24/37 (64%)

Query: 299 RTCSSEEFACQNFKCIRKTYHCDGEDDCGDRSDEFNC 335
            TC  +EF C + +CI  ++ CDG+ DC D SDE NC
Sbjct: 1   STCGPDEFQCGSGECIPMSWVCDGDPDCEDGSDEKNC 37



 Score = 38.0 bits (89), Expect = 2e-04
 Identities = 18/36 (50%), Positives = 21/36 (58%), Gaps = 6/36 (16%)

Query: 213 EFTCTENKAWNRAQCIPKKWLCDGDPDCVDGADENT 248
           EF C         +CIP  W+CDGDPDC DG+DE  
Sbjct: 7   EFQCGSG------ECIPMSWVCDGDPDCEDGSDEKN 36


>gnl|CDD|215683 pfam00058, Ldl_recept_b, Low-density lipoprotein receptor repeat
          class B.  This domain is also known as the YWTD motif
          after the most conserved region of the repeat. The YWTD
          repeat is found in multiple tandem repeats and has been
          predicted to form a beta-propeller structure.
          Length = 42

 Score = 30.6 bits (70), Expect = 0.080
 Identities = 10/41 (24%), Positives = 14/41 (34%)

Query: 26 NYIYWTDLQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVY 66
            +YWTD  LR         G++   +       P  I V 
Sbjct: 1  GRLYWTDSSLRASISVADLNGSDRRTLFSEDLQWPNGIAVD 41


>gnl|CDD|214531 smart00135, LY, Low-density lipoprotein-receptor YWTD domain.
          Type "B" repeats in low-density lipoprotein (LDL)
          receptor that plays a central role in mammalian
          cholesterol metabolism. Also present in a variety of
          molecules similar to gp300/megalin.
          Length = 43

 Score = 28.7 bits (65), Expect = 0.48
 Identities = 7/23 (30%), Positives = 10/23 (43%)

Query: 19 FAITVHRNYIYWTDLQLRGVYRA 41
           A+      +YWTD  L  +  A
Sbjct: 14 LAVDWIEGRLYWTDWGLDVIEVA 36


>gnl|CDD|193472 pfam12999, PRKCSH-like, Glucosidase II beta subunit-like.  The
           sequences found in this family are similar to a region
           found in the beta-subunit of glucosidase II, which is
           also known as protein kinase C substrate 80K-H (PRKCSH).
           The enzyme catalyzes the sequential removal of two
           alpha-1,3-linked glucose residues in the second step of
           N-linked oligosaccharide processing. The beta subunit is
           required for the solubility and stability of the
           heterodimeric enzyme, and is involved in retaining the
           enzyme within the endoplasmic reticulum.
          Length = 176

 Score = 30.5 bits (69), Expect = 0.85
 Identities = 24/74 (32%), Positives = 32/74 (43%), Gaps = 21/74 (28%)

Query: 239 DCVDGADENTTALNCPKQSSCSPDQFSCGNGRCINT--------GWLCDHDNDCGDGSDE 290
           DC DG+DE       P  ++CS  +F C N   I            +CD+D  C DGSDE
Sbjct: 59  DCPDGSDE-------PGTNACSNGKFYCANEGFIPGYIPSFKVDDGVCDYD-ICCDGSDE 110

Query: 291 G-----KECHDKYR 299
                  +C +  R
Sbjct: 111 ALGKCPNKCGEIAR 124


>gnl|CDD|102374 PRK06434, PRK06434, cystathionine gamma-lyase; Validated.
          Length = 384

 Score = 30.6 bits (69), Expect = 1.3
 Identities = 16/43 (37%), Positives = 24/43 (55%), Gaps = 5/43 (11%)

Query: 33  LQLRGV----YRAEKHTGANMIEMVKRLEDSPRDIHVYSADSQ 71
           L LRG+     R EKH   N +E+ + L DS +  +VY  D++
Sbjct: 249 LALRGLKTLGLRMEKHN-KNGMELARFLRDSKKISNVYYPDTE 290


>gnl|CDD|129832 TIGR00749, glk, glucokinase, proteobacterial type.  This model
           represents glucokinase of E. coli and close homologs,
           mostly from other proteobacteria, presumed to have
           equivalent function. This glucokinase is more closely
           related to a number of uncharacterized paralogs than to
           the glucokinase glcK (formerly yqgR) of Bacillus
           subtilis and its closest homologs, so the two sets are
           represented by separate models [Energy metabolism,
           Glycolysis/gluconeogenesis].
          Length = 316

 Score = 29.1 bits (65), Expect = 3.5
 Identities = 15/66 (22%), Positives = 24/66 (36%), Gaps = 11/66 (16%)

Query: 175 GHVQITGVSQPPGIVMVMTTVQTGLMNHPNNRKCDEETEFTCTENKAWNRAQCIPKKWLC 234
           GHV    V   PG+V +   +           K D E +F     +   + + I ++ L 
Sbjct: 182 GHVSAERVLSGPGLVNIYEAL----------VKADPERQFN-KLPQENLKPKDISERALA 230

Query: 235 DGDPDC 240
               DC
Sbjct: 231 GSCTDC 236


>gnl|CDD|219761 pfam08243, SPT2, SPT2 chromatin protein.  This family includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 116

 Score = 27.5 bits (61), Expect = 4.5
 Identities = 15/50 (30%), Positives = 20/50 (40%)

Query: 279 DHDNDCGDGSDEGKECHDKYRTCSSEEFACQNFKCIRKTYHCDGEDDCGD 328
           +HD D  D  ++  E  D+    S E +A       R  Y    EDD  D
Sbjct: 23  EHDEDMDDFIEDDDEEQDEIPYDSDEIWAIFGKGRKRSYYDRYDEDDALD 72


>gnl|CDD|233463 TIGR01549, HAD-SF-IA-v1, haloacid dehalogenase superfamily,
          subfamily IA, variant 1 with third motif having
          Dx(3-4)D or Dx(3-4)E.  This model represents part of
          one structural subfamily of the Haloacid Dehalogenase
          (HAD) superfamily of aspartate-nucleophile hydrolases.
          The superfamily is defined by the presence of three
          short catalytic motifs. The subfamilies are defined
          based on the location and the observed or predicted
          fold of a so-called "capping domain", or the absence of
          such a domain. Subfamily I consists of sequences in
          which the capping domain is found in between the first
          and second catalytic motifs. Subfamily II consists of
          sequences in which the capping domain is found between
          the second and third motifs. Subfamily III sequences
          have no capping domain in either of these positions. The
          Subfamily IA and IB capping domains are predicted by
          PSI-PRED to consist of an alpha helical bundle.
          Subfamily I encompasses such a wide region of sequence
          space (the sequences are highly divergent) that
          modelling it with a single representation is
          impossible, resulting in an overly broad description
          which allows in many unrelated sequences. Subfamily IA
          and IB are separated based on an apparent phylogenetic
          bifurcation. Subfamily IA is still too broad to model,
          but cannot be further subdivided into large chunks
          based on phylogenetic trees. Of the three motifs
          defining the HAD superfamily, the third has three
          variant forms: (1) hhhhsDxxx(x)(D/E), (2)
          hhhhssxxx(x)D and (3) hhhhDDxxx(x)s where _s_ refers to
          a small amino acid and _h_ to a hydrophobic one. All
          three of these variants are found in subfamily IA.
          Individual models were made based on seeds exhibiting
          only one of the variants each. Variant 1 (this model)
          is found in the enzymes phosphoglycolate phosphatase
          (TIGR01449) and enolase-phosphatase. These three
          variant models (see also TIGR01493 and TIGR01509) were
          created with the knowledge that there will be overlap
          among them - this is by design and serves the purpose
          of eliminating the overlap with models of more
          distantly related HAD subfamilies caused by an overly
          broad single model [Unknown function, Enzymes of
          unknown specificity].
          Length = 162

 Score = 28.1 bits (63), Expect = 4.6
 Identities = 9/40 (22%), Positives = 19/40 (47%)

Query: 33 LQLRGVYRAEKHTGANMIEMVKRLEDSPRDIHVYSADSQK 72
          LQ    Y AE+       +++ RL+++   + + S  S +
Sbjct: 60 LQGHIGYDAEEAYIPGAADLLPRLKEAGIKLGIISNGSLR 99


>gnl|CDD|224852 COG1941, FrhG, Coenzyme F420-reducing hydrogenase, gamma subunit
           [Energy production and conversion].
          Length = 247

 Score = 27.7 bits (62), Expect = 7.7
 Identities = 15/62 (24%), Positives = 21/62 (33%), Gaps = 16/62 (25%)

Query: 95  GTAECKCDESTKLVNEGRMCVAKNITCDGSKFFCRNGKCISRMWSCDG-------DDDCG 147
            + +C+CD    L+ +G  C+    TC           C SR   C G          CG
Sbjct: 173 TSEKCRCDLDCCLLEQGLPCMGC-GTC--------AASCPSRAIPCRGCRGNIPRCIKCG 223

Query: 148 DN 149
             
Sbjct: 224 AC 225


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.135    0.456 

Gapped
Lambda     K      H
   0.267   0.0749    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,960,280
Number of extensions: 1400349
Number of successful extensions: 1009
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1008
Number of HSP's successfully gapped: 38
Length of query: 337
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 240
Effective length of database: 6,635,264
Effective search space: 1592463360
Effective search space used: 1592463360
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (26.5 bits)
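
The effective lengths and search space reported above can be cross-checked with
a little arithmetic. The sketch below is not part of the RPS-BLAST output; it
assumes the usual BLAST convention that the length adjustment is subtracted
once from the query and once per database sequence, and the function name
effective_lengths is hypothetical. The input values are copied from the
statistics footer.

```python
def effective_lengths(query_len, db_letters, db_sequences, length_adjustment):
    """Return (effective query length, effective db length, search space).

    Assumes the adjustment is removed once from the query and once from
    every database sequence, which reproduces the figures in this report.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_letters - db_sequences * length_adjustment
    return eff_query, eff_db, eff_query * eff_db


# Values taken from the footer of this report.
eff_query, eff_db, search_space = effective_lengths(
    query_len=337,
    db_letters=10_937_602,
    db_sequences=44_354,
    length_adjustment=97,
)
print(eff_query)     # effective length of query
print(eff_db)        # effective length of database
print(search_space)  # effective search space
```

Under that assumption the three computed values agree with the "Effective
length of query", "Effective length of database" and "Effective search space"
lines in the footer.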