RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy2856
         (136 letters)



>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
          large number of membrane-bound and extracellular
          (mostly animal) proteins. Many of these proteins
          require calcium for their biological function and
          calcium-binding sites have been found to be located at
          the N-terminus of particular EGF-like domains;
          calcium-binding may be crucial for numerous
          protein-protein interactions. Six conserved core
          cysteines form three disulfide bridges as in non
          calcium-binding EGF domains, whose structures are very
          similar. EGF_CA can be found in tandem repeat
          arrangements.
          Length = 38

 Score = 44.2 bits (105), Expect = 2e-07
 Identities = 19/34 (55%), Positives = 21/34 (61%)

Query: 19 DINECAHPNACGVNALCQNYPGNYTCSCQPGYTG 52
          DI+ECA  N C     C N  G+Y CSC PGYTG
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34



 Score = 43.0 bits (102), Expect = 6e-07
 Identities = 18/38 (47%), Positives = 20/38 (52%), Gaps = 3/38 (7%)

Query: 60 DIDECQYASTHPVCGPGARCTNFPGGYHCECPPGYHGD 97
          DIDEC   ++   C  G  C N  G Y C CPPGY G 
Sbjct: 1  DIDEC---ASGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35



 Score = 34.9 bits (81), Expect = 7e-04
 Identities = 16/33 (48%), Positives = 19/33 (57%), Gaps = 1/33 (3%)

Query: 105 DADECVNR-PCGKDALCSNVDGSYTCTCPPGFR 136
           D DEC +  PC     C N  GSY C+CPPG+ 
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 43.4 bits (103), Expect = 4e-07
 Identities = 18/36 (50%), Positives = 20/36 (55%), Gaps = 3/36 (8%)

Query: 60 DIDECQYASTHPVCGPGARCTNFPGGYHCECPPGYH 95
          DIDEC   ++   C  G  C N  G Y CECPPGY 
Sbjct: 1  DIDEC---ASGNPCQNGGTCVNTVGSYRCECPPGYT 33



 Score = 42.2 bits (100), Expect = 1e-06
 Identities = 17/33 (51%), Positives = 19/33 (57%)

Query: 19 DINECAHPNACGVNALCQNYPGNYTCSCQPGYT 51
          DI+ECA  N C     C N  G+Y C C PGYT
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCECPPGYT 33



 Score = 34.5 bits (80), Expect = 0.001
 Identities = 16/33 (48%), Positives = 18/33 (54%), Gaps = 1/33 (3%)

Query: 105 DADECVNR-PCGKDALCSNVDGSYTCTCPPGFR 136
           D DEC +  PC     C N  GSY C CPPG+ 
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRCECPPGYT 33


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
          growth factor (EGF) and present in a large number of
          proteins, mostly animal; the list of proteins currently
          known to contain one or more copies of an EGF-like
          pattern is large and varied; the functional
          significance of EGF-like domains in what appear to be
          unrelated proteins is not yet clear; a common feature
          is that these repeats are found in the extracellular
          domain of membrane-bound proteins or in proteins known
          to be secreted (exception: prostaglandin G/H synthase);
          the domain includes six cysteine residues which have
          been shown to be involved in disulfide bonds; the main
          structure is a two-stranded beta-sheet followed by a
          loop to a C-terminal short two-stranded sheet;
          Subdomains between the conserved cysteines vary in
          length; the region between the 5th and 6th cysteine
          contains two conserved glycines of which at least one
          is present in most EGF-like domains; a subset of
          these bind calcium.
          Length = 36

 Score = 36.3 bits (84), Expect = 2e-04
 Identities = 17/32 (53%), Positives = 19/32 (59%)

Query: 22 ECAHPNACGVNALCQNYPGNYTCSCQPGYTGN 53
          ECA  N C     C N PG+Y C C PGYTG+
Sbjct: 1  ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGD 32



 Score = 34.4 bits (79), Expect = 0.001
 Identities = 15/30 (50%), Positives = 16/30 (53%)

Query: 68 STHPVCGPGARCTNFPGGYHCECPPGYHGD 97
          +    C  G  C N PG Y C CPPGY GD
Sbjct: 3  AASNPCSNGGTCVNTPGSYRCVCPPGYTGD 32



 Score = 30.5 bits (69), Expect = 0.034
 Identities = 12/26 (46%), Positives = 14/26 (53%)

Query: 111 NRPCGKDALCSNVDGSYTCTCPPGFR 136
           + PC     C N  GSY C CPPG+ 
Sbjct: 5   SNPCSNGGTCVNTPGSYRCVCPPGYT 30


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
          EGF-like domain homologues. This family includes the
          C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 34.4 bits (80), Expect = 0.001
 Identities = 13/25 (52%), Positives = 18/25 (72%)

Query: 29 CGVNALCQNYPGNYTCSCQPGYTGN 53
          C  NA C N  G++TC+C+ GYTG+
Sbjct: 8  CHPNATCTNTGGSFTCTCKSGYTGD 32



 Score = 33.7 bits (78), Expect = 0.002
 Identities = 12/25 (48%), Positives = 16/25 (64%)

Query: 111 NRPCGKDALCSNVDGSYTCTCPPGF 135
           N  C  +A C+N  GS+TCTC  G+
Sbjct: 5   NGGCHPNATCTNTGGSFTCTCKSGY 29



 Score = 32.9 bits (76), Expect = 0.003
 Identities = 14/28 (50%), Positives = 15/28 (53%)

Query: 73  CGPGARCTNFPGGYHCECPPGYHGDAFT 100
           C P A CTN  G + C C  GY GD  T
Sbjct: 8   CHPNATCTNTGGSFTCTCKSGYTGDGVT 35


>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like.  cEGF, or complement
           Clr-like EGF, domains have six conserved cysteine
           residues disulfide-bonded into the characteristic
           pattern 'ababcc'. They are found in blood coagulation
           proteins such as fibrillin, Clr and Cls, thrombomodulin,
           and the LDL receptor. The core fold of the EGF domain
           consists of two small beta-hairpins packed against each
           other. Two major structural variants have been
           identified based on the structural context of the
           C-terminal cysteine residue of disulfide 'c' in the
           C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
           C-terminal thiol resides on the C-terminal beta-sheet,
           resulting in long loop-lengths between the cysteine
           residues of disulfide 'c', typically C[10+]XC. These
           longer loop-lengths may have arisen by selective
           cysteine loss from a four-disulfide EGF template such as
           laminin or integrin. Tandem cEGF domains have five
           linking residues between terminal cysteines of adjacent
           domains. cEGF domains may or may not bind calcium in the
           linker region. cEGF domains with the consensus motif
           CXN4X[F,Y]XCXC are hydroxylated exclusively on the
           asparagine residue.
          Length = 24

 Score = 32.0 bits (74), Expect = 0.007
 Identities = 11/24 (45%), Positives = 11/24 (45%)

Query: 85  GYHCECPPGYHGDAFTTGCVDADE 108
            Y C CPPGY        C D DE
Sbjct: 1   SYTCSCPPGYQLSGDGRTCEDIDE 24



 Score = 30.1 bits (69), Expect = 0.030
 Identities = 13/23 (56%), Positives = 14/23 (60%), Gaps = 1/23 (4%)

Query: 42 YTCSCQPGYTGNPFEG-CIDIDE 63
          YTCSC PGY  +     C DIDE
Sbjct: 2  YTCSCPPGYQLSGDGRTCEDIDE 24



 Score = 24.7 bits (55), Expect = 2.9
 Identities = 8/11 (72%), Positives = 11/11 (100%)

Query: 126 SYTCTCPPGFR 136
           SYTC+CPPG++
Sbjct: 1   SYTCSCPPGYQ 11


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 31.3 bits (71), Expect = 0.016
 Identities = 19/33 (57%), Positives = 21/33 (63%), Gaps = 1/33 (3%)

Query: 22 ECAHPNACGVNALCQNYPGNYTCSCQPGYTGNP 54
          ECA    C  N  C N PG+YTCSC PGYTG+ 
Sbjct: 1  ECASGGPCS-NGTCINTPGSYTCSCPPGYTGDK 32



 Score = 29.0 bits (65), Expect = 0.12
 Identities = 15/30 (50%), Positives = 17/30 (56%), Gaps = 1/30 (3%)

Query: 68 STHPVCGPGARCTNFPGGYHCECPPGYHGD 97
          ++   C  G  C N PG Y C CPPGY GD
Sbjct: 3  ASGGPCSNG-TCINTPGSYTCSCPPGYTGD 31



 Score = 25.6 bits (56), Expect = 1.7
 Identities = 13/26 (50%), Positives = 15/26 (57%), Gaps = 1/26 (3%)

Query: 111 NRPCGKDALCSNVDGSYTCTCPPGFR 136
             PC     C N  GSYTC+CPPG+ 
Sbjct: 5   GGPCSNG-TCINTPGSYTCSCPPGYT 29


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 30.8 bits (70), Expect = 0.023
 Identities = 14/37 (37%), Positives = 20/37 (54%), Gaps = 1/37 (2%)

Query: 19 DINECAHP-NACGVNALCQNYPGNYTCSCQPGYTGNP 54
          D++ECA   + C  N +C N  G++ C C  GY  N 
Sbjct: 1  DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNE 37



 Score = 30.4 bits (69), Expect = 0.033
 Identities = 13/34 (38%), Positives = 18/34 (52%), Gaps = 2/34 (5%)

Query: 105 DADECVNRP--CGKDALCSNVDGSYTCTCPPGFR 136
           D DEC +    C  + +C N  GS+ C CP G+ 
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFECVCPDGYE 34



 Score = 29.6 bits (67), Expect = 0.066
 Identities = 14/35 (40%), Positives = 16/35 (45%), Gaps = 2/35 (5%)

Query: 60 DIDECQYASTHPVCGPGARCTNFPGGYHCECPPGY 94
          D+DEC  A     C     C N  G + C CP GY
Sbjct: 1  DVDEC--ADGTHNCPANTVCVNTIGSFECVCPDGY 33


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 29.7 bits (67), Expect = 0.055
 Identities = 14/25 (56%), Positives = 15/25 (60%)

Query: 73 CGPGARCTNFPGGYHCECPPGYHGD 97
          C  G  C + PGGY CECP GY G 
Sbjct: 7  CSNGGTCVDTPGGYTCECPEGYTGK 31



 Score = 29.3 bits (66), Expect = 0.081
 Identities = 14/30 (46%), Positives = 16/30 (53%)

Query: 23 CAHPNACGVNALCQNYPGNYTCSCQPGYTG 52
          C+  N C     C + PG YTC C  GYTG
Sbjct: 1  CSPNNPCSNGGTCVDTPGGYTCECPEGYTG 30



 Score = 28.2 bits (63), Expect = 0.22
 Identities = 11/25 (44%), Positives = 13/25 (52%)

Query: 111 NRPCGKDALCSNVDGSYTCTCPPGF 135
           N PC     C +  G YTC CP G+
Sbjct: 4   NNPCSNGGTCVDTPGGYTCECPEGY 28


>gnl|CDD|212058 cd11489, SLC5sbd_SGLT5, Na(+)/glucose cotransporter SGLT5 and
           related proteins; solute-binding domain.  Human SGLT5 is
           a glucose transporter, which also transports galactose.
           It is encoded by the SLC5A10 gene, and is exclusively
           expressed in the renal cortex. This subgroup belongs to
           the solute carrier 5 (SLC5) transporter family.
          Length = 604

 Score = 31.4 bits (71), Expect = 0.13
 Identities = 11/24 (45%), Positives = 15/24 (62%), Gaps = 1/24 (4%)

Query: 100 TTGCVDADECVNRPCGKDALCSNV 123
              CVD +EC+ R CG +  CSN+
Sbjct: 313 DVACVDPEECL-RVCGAEVGCSNI 335


>gnl|CDD|212057 cd11488, SLC5sbd_SGLT4, Na(+)/glucose cotransporter SGLT4 and
           related proteins; solute-binding domain.  Human SGLT4
           (hSGLT4) has been reported to be a low-affinity glucose
           transporter with unusual sugar selectivity: it
           transports D-mannose but not galactose or
           3-O-methyl-D-glucoside. It is encoded by the SLC5A9 gene
           and is expressed in intestine, kidney, liver, brain,
           lung, trachea, uterus, and pancreas. hSGLT4 is predicted
           to contain 14 membrane-spanning regions. This subgroup
           belongs to the solute carrier 5 (SLC5) transporter
           family.
          Length = 605

 Score = 30.2 bits (68), Expect = 0.28
 Identities = 12/22 (54%), Positives = 14/22 (63%), Gaps = 1/22 (4%)

Query: 102 GCVDADECVNRPCGKDALCSNV 123
           GCVD DEC  + CG    CSN+
Sbjct: 312 GCVDPDEC-QKICGAKVGCSNI 332



 Score = 26.8 bits (59), Expect = 4.5
 Identities = 11/25 (44%), Positives = 14/25 (56%), Gaps = 5/25 (20%)

Query: 57  GCIDIDECQYASTHPVCGPGARCTN 81
           GC+D DECQ      +CG    C+N
Sbjct: 312 GCVDPDECQ-----KICGAKVGCSN 331


>gnl|CDD|212039 cd10329, SLC5sbd_SGLT1-like, Na(+)/glucose cotransporter SGLT1 and
           related proteins; solute binding domain.  This subfamily
           includes the solute-binding domain of SGLT proteins that
           cotransport Na+ with various solutes. Its members
           include: the human glucose (SGLT1, -2, -4, -5),
           chiro-inositol (SGLT5), and myo-inositol (SMIT)
           cotransporters. It also includes human SGLT3 which has
           been characterized as a glucose sensor and not a
           transporter. It belongs to the solute carrier 5 (SLC5)
           transporter family.
          Length = 564

 Score = 30.0 bits (68), Expect = 0.33
 Identities = 9/22 (40%), Positives = 13/22 (59%), Gaps = 1/22 (4%)

Query: 102 GCVDADECVNRPCGKDALCSNV 123
            CV  +EC+ + CG    CSN+
Sbjct: 315 ACVVPEECM-KVCGNPVGCSNI 335


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 29.3 bits (66), Expect = 0.44
 Identities = 11/33 (33%), Positives = 14/33 (42%)

Query: 103 CVDADECVNRPCGKDALCSNVDGSYTCTCPPGF 135
           CV  D C         +C +  GSY C C  G+
Sbjct: 184 CVVPDLCATLSHVCQQVCISTPGSYLCACTEGY 216



 Score = 29.3 bits (66), Expect = 0.58
 Identities = 11/37 (29%), Positives = 16/37 (43%), Gaps = 5/37 (13%)

Query: 23  CAHPNAC-----GVNALCQNYPGNYTCSCQPGYTGNP 54
           C  P+ C         +C + PG+Y C+C  GY    
Sbjct: 184 CVVPDLCATLSHVCQQVCISTPGSYLCACTEGYALLE 220


>gnl|CDD|236853 PRK11121, nrdG, anaerobic ribonucleotide reductase-activating
          protein; Provisional.
          Length = 154

 Score = 28.4 bits (64), Expect = 0.90
 Identities = 14/30 (46%), Positives = 15/30 (50%)

Query: 65 QYASTHPVCGPGARCTNFPGGYHCECPPGY 94
          QY     V GPG RCT F  G   +CP  Y
Sbjct: 5  QYYPVDVVNGPGTRCTLFVSGCVHQCPGCY 34


>gnl|CDD|116493 pfam07881, Fucose_iso_N1, L-fucose isomerase, first N-terminal
           domain.  The members of this family are similar to
           L-fucose isomerase expressed by E. coli (EC:5.3.1.3).
           This enzyme corresponds to glucose-6-phosphate isomerase
           in glycolysis, and converts an aldo-hexose to a ketose
           to prepare it for aldol cleavage. The enzyme is a
           hexamer, with each subunit being wedge-shaped and
           composed of three domains. Both domains 1 and 2 contain
           central parallel beta-sheets with surrounding alpha
           helices. Domain 1 demonstrates the
           beta-alpha-beta-alpha-beta Rossmann fold. The active
           centre is shared between pairs of subunits related along
           the molecular three-fold axis, with domains 2 and 3 from
           one subunit providing most of the substrate-contacting
           residues, and domain 1 from the adjacent subunit
           contributing some other residues.
          Length = 171

 Score = 28.1 bits (63), Expect = 1.1
 Identities = 10/50 (20%), Positives = 14/50 (28%), Gaps = 8/50 (16%)

Query: 94  YHGDAFTTGCVDADECVNRP-----CGKDALCSNVDGS---YTCTCPPGF 135
            + D     CV AD  +        C +     NV  +     C C    
Sbjct: 43  RYPDGGPVECVIADTTIGGVAEAAACAEKFKKENVGVTITVTPCWCYGSE 92


>gnl|CDD|201524 pfam00954, S_locus_glycop, S-locus glycoprotein family.  In
           Brassicaceae, self-incompatible plants have a
           self/non-self recognition system. This is
           sporophytically controlled by multiple alleles at a
           single locus (S). S-locus glycoproteins, as well as
           S-receptor kinases, are in linkage with the S-alleles.
          Length = 110

 Score = 26.5 bits (59), Expect = 3.3
 Identities = 12/32 (37%), Positives = 15/32 (46%), Gaps = 2/32 (6%)

Query: 106 ADEC-VNRPCGKDALCSNVDGSYTCTCPPGFR 136
            D+C V   CG    C +V+ S  C C  GF 
Sbjct: 77  KDQCDVYGRCGPYGYC-DVNTSPKCNCIKGFV 107


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.325    0.143    0.520 

Gapped
Lambda     K      H
   0.267   0.0720    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 6,707,523
Number of extensions: 540141
Number of successful extensions: 379
Number of sequences better than 10.0: 1
Number of HSP's gapped: 368
Number of HSP's successfully gapped: 71
Length of query: 136
Length of database: 10,937,602
Length adjustment: 87
Effective length of query: 49
Effective length of database: 7,078,804
Effective search space: 346861396
Effective search space used: 346861396
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.0 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (22.0 bits)
S2: 54 (24.6 bits)
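
Note on the effective lengths reported above: the figures in the
statistics footer are mutually consistent if the length adjustment (87)
is subtracted once from the query length and once per database sequence
from the database length. The sketch below only re-derives those
numbers from the values printed in this report. The function name
effective_lengths is hypothetical, and the simple subtraction is an
assumption that happens to reproduce the footer; it is not a
description of RPS-BLAST internals, and it does not attempt to
reproduce the Expect values.

```python
# Values copied from the statistics footer of this report.
QUERY_LENGTH = 136
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 87


def effective_lengths(query_len, db_len, n_seqs, adjustment):
    """Return (effective query length, effective database length,
    effective search space) under the simple-subtraction assumption."""
    eff_query = query_len - adjustment
    # The adjustment is applied once for every sequence in the database.
    eff_db = db_len - n_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


if __name__ == "__main__":
    eff_query, eff_db, space = effective_lengths(
        QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
    )
    print("Effective length of query:", eff_query)
    print("Effective length of database:", eff_db)
    print("Effective search space:", space)
```

With the footer values this gives 49, 7078804 and 346861396, matching
the "Effective length of query", "Effective length of database" and
"Effective search space" lines.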