RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy4097
         (129 letters)



>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
          EGF-like domain homologues. This family includes the
          C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 33.7 bits (78), Expect = 0.002
 Identities = 18/38 (47%), Positives = 21/38 (55%), Gaps = 4/38 (10%)

Query: 37 CSVQNGGCHPLATCRETSDTVRSVISCTCPPGMGGSGV 74
          C+  NGGCHP ATC  T  +     +CTC  G  G GV
Sbjct: 1  CAENNGGCHPNATCTNTGGS----FTCTCKSGYTGDGV 34



 Score = 24.4 bits (54), Expect = 4.8
 Identities = 12/25 (48%), Positives = 13/25 (52%), Gaps = 1/25 (4%)

Query: 6  ECINTQGYRKCGQCPHGWVGDGTTC 30
           C NT G   C  C  G+ GDG TC
Sbjct: 13 TCTNTGGSFTC-TCKSGYTGDGVTC 36


>gnl|CDD|100002 cd04962, GT1_like_5, This family is most closely related to the
          GT1 family of glycosyltransferases.
          Glycosyltransferases catalyze the transfer of sugar
          moieties from activated donor molecules to specific
          acceptor molecules, forming glycosidic bonds. The
          acceptor molecule can be a lipid, a protein, a
          heterocyclic compound, or another carbohydrate residue.
          This group of glycosyltransferases is most closely
          related to the previously defined glycosyltransferase
          family 1 (GT1). The members of this family may transfer
          UDP, ADP, GDP, or CMP linked sugars. The diverse
          enzymatic activities among members of this family
          reflect a wide range of biological functions. The
          protein structure available for this family has the GTB
          topology, one of the two protein topologies observed
          for nucleotide-sugar-dependent glycosyltransferases.
          GTB proteins have distinct N- and C- terminal domains
          each containing a typical Rossmann fold. The two
          domains have high structural homology despite minimal
          sequence homology. The large cleft that separates the
          two domains includes the catalytic center and permits a
          high degree of flexibility. The members of this family
          are found mainly in bacteria, while some of them are
          also found in Archaea and eukaryotes.
          Length = 371

 Score = 27.2 bits (61), Expect = 2.7
 Identities = 8/14 (57%), Positives = 8/14 (57%)

Query: 61 ISCTCPPGMGGSGV 74
          I   C P  GGSGV
Sbjct: 3  IGIVCYPTYGGSGV 16


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason for discrepancy between Swiss-Prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 25.1 bits (55), Expect = 2.9
 Identities = 11/26 (42%), Positives = 15/26 (57%), Gaps = 1/26 (3%)

Query: 93  CEHGGICAPIGDRGYRCQCEPGFTGE 118
           C +GG C      GY C+C  G+TG+
Sbjct: 7   CSNGGTCVDTPG-GYTCECPEGYTGK 31


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non-calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 24.9 bits (55), Expect = 3.3
 Identities = 12/27 (44%), Positives = 16/27 (59%), Gaps = 1/27 (3%)

Query: 93  CEHGGICAPIGDRGYRCQCEPGFTGES 119
           C++GG C       YRC C PG+TG +
Sbjct: 11  CQNGGTCVNTVG-SYRCSCPPGYTGRN 36


>gnl|CDD|234438 TIGR03999, thiol_BshA, N-acetyl-alpha-D-glucosaminyl L-malate
          synthase BshA.  Members of this protein family are
          BshA, a glycosyltransferase required for bacillithiol
          biosynthesis. This enzyme combines UDP-GlcNAc and
          L-malate to form N-acetyl-alpha-D-glucosaminyl L-malate.
          Bacillithiol is a low-molecular-weight thiol,
          an analog of glutathione and mycothiol, and is found
          largely in the Firmicutes [Biosynthesis of cofactors,
          prosthetic groups, and carriers, Glutathione and
          analogs].
          Length = 374

 Score = 26.8 bits (60), Expect = 3.5
 Identities = 9/14 (64%), Positives = 9/14 (64%)

Query: 61 ISCTCPPGMGGSGV 74
          I  TC P  GGSGV
Sbjct: 3  IGITCYPTYGGSGV 16


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 24.7 bits (54), Expect = 3.9
 Identities = 12/26 (46%), Positives = 15/26 (57%), Gaps = 1/26 (3%)

Query: 93  CEHGGICAPIGDRGYRCQCEPGFTGE 118
           C +GG C       YRC C PG+TG+
Sbjct: 8   CSNGGTCVNTPG-SYRCVCPPGYTGD 32


>gnl|CDD|221190 pfam11727, ISG65-75, Invariant surface glycoprotein.  This family
           is found in Trypanosome species, and appears to be one
           of two invariant surface glycoproteins, ISG65 and ISG75,
           that are found in the mammalian stage of the parasitic
           protozoan. The sequence suggests the two families are
           polypeptides with N-terminal signal sequences,
           hydrophilic extracellular domains, single trans-membrane
           alpha-helices and short cytoplasmic domains. They are
           both expressed in the bloodstream form but not in the
           midgut stage. Both polypeptides are distributed over the
           entire surface of the parasite.
          Length = 286

 Score = 26.3 bits (58), Expect = 4.3
 Identities = 11/34 (32%), Positives = 16/34 (47%)

Query: 31  RQGTTGCSVQNGGCHPLATCRETSDTVRSVISCT 64
           + G    S  +  C  +A  R  SD  R+VI C+
Sbjct: 167 KPGENAKSSPSQNCDGIAFKRHYSDGGRNVIDCS 200


>gnl|CDD|237357 PRK13348, PRK13348, chromosome replication initiation inhibitor
           protein; Provisional.
          Length = 294

 Score = 25.7 bits (57), Expect = 8.0
 Identities = 8/20 (40%), Positives = 10/20 (50%)

Query: 96  GGICAPIGDRGYRCQCEPGF 115
           G +  P+G   YRC   P F
Sbjct: 152 GCLAEPLGTMRYRCVASPAF 171


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.322    0.142    0.502 

Gapped
Lambda     K      H
   0.267   0.0635    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 6,115,491
Number of extensions: 484260
Number of successful extensions: 398
Number of sequences better than 10.0: 1
Number of HSP's gapped: 394
Number of HSP's successfully gapped: 52
Length of query: 129
Length of database: 10,937,602
Length adjustment: 86
Effective length of query: 43
Effective length of database: 7,123,158
Effective search space: 306295794
Effective search space used: 306295794
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 53 (24.4 bits)