RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy12432
         (229 letters)



>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
          large number of membrane-bound and extracellular
          (mostly animal) proteins. Many of these proteins
          require calcium for their biological function and
          calcium-binding sites have been found to be located at
          the N-terminus of particular EGF-like domains;
          calcium-binding may be crucial for numerous
          protein-protein interactions. Six conserved core
          cysteines form three disulfide bridges as in non
          calcium-binding EGF domains, whose structures are very
          similar. EGF_CA can be found in tandem repeat
          arrangements.
          Length = 38

 Score = 33.4 bits (77), Expect = 0.005
 Identities = 13/32 (40%), Positives = 14/32 (43%), Gaps = 4/32 (12%)

Query: 37 CLNGATCFTVKIGESLLYNCECADGYMGQRCE 68
          C NG TC          Y C C  GY G+ CE
Sbjct: 11 CQNGGTCVNTVGS----YRCSCPPGYTGRNCE 38


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 30.5 bits (69), Expect = 0.051
 Identities = 16/39 (41%), Positives = 18/39 (46%), Gaps = 7/39 (17%)

Query: 28 CPPTYATWYCLNGATCFTVKIGESLLYNCECADGYMGQR 66
          C P      C NG TC     G    Y CEC +GY G+R
Sbjct: 1  CSPNN---PCSNGGTCVDTPGG----YTCECPEGYTGKR 32


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 29.5 bits (67), Expect = 0.12
 Identities = 17/34 (50%), Positives = 18/34 (52%), Gaps = 7/34 (20%)

Query: 37 CLNGATCF-TVKIGESLLYNCECADGYM-GQRCE 68
          C NG TC  TV    S  Y CEC  GY  G+ CE
Sbjct: 11 CQNGGTCVNTV---GS--YRCECPPGYTDGRNCE 39


>gnl|CDD|165214 PHA02887, PHA02887, EGF-like protein; Provisional.
          Length = 126

 Score = 31.4 bits (71), Expect = 0.15
 Identities = 11/33 (33%), Positives = 16/33 (48%), Gaps = 3/33 (9%)

Query: 36  YCLNGATCFTVKIGESLLYNCECADGYMGQRCE 68
           +C+NG     + + E     C C  GY G RC+
Sbjct: 93  FCINGECMNIIDLDEKF---CICNKGYTGIRCD 122


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
          growth factor (EGF) present in a large number of
          proteins, mostly animal; the list of proteins currently
          known to contain one or more copies of an EGF-like
          pattern is large and varied; the functional
          significance of EGF-like domains in what appear to be
          unrelated proteins is not yet clear; a common feature
          is that these repeats are found in the extracellular
          domain of membrane-bound proteins or in proteins known
          to be secreted (exception: prostaglandin G/H synthase);
          the domain includes six cysteine residues which have
          been shown to be involved in disulfide bonds; the main
          structure is a two-stranded beta-sheet followed by a
          loop to a C-terminal short two-stranded sheet;
          Subdomains between the conserved cysteines vary in
          length; the region between the 5th and 6th cysteine
          contains two conserved glycines of which at least one
          is present in most EGF-like domains; a subset of
          these bind calcium.
          Length = 36

 Score = 29.0 bits (65), Expect = 0.23
 Identities = 13/33 (39%), Positives = 13/33 (39%), Gaps = 5/33 (15%)

Query: 37 CLNGATCFTVKIGESLLYNCECADGYMGQ-RCE 68
          C NG TC          Y C C  GY G   CE
Sbjct: 8  CSNGGTCVNTPGS----YRCVCPPGYTGDRSCE 36


>gnl|CDD|220647 pfam10242, L_HGMIC_fpl, Lipoma HMGIC fusion partner-like protein.
           This is a group of proteins expressed from a series of
           genes referred to as Lipoma HGMIC fusion partner-like.
           The proteins carry four highly conserved transmembrane
           domains in this entry. In certain instances, eg in
           LHFPL5, mutations cause deafness in humans and
           hypospadias, and LHFPL1 is transcribed in six liver
           tumour cell lines.
          Length = 181

 Score = 31.1 bits (71), Expect = 0.28
 Identities = 19/75 (25%), Positives = 27/75 (36%), Gaps = 5/75 (6%)

Query: 53  LYN-CECADGYMGQRCEFKDLDGSYLPSRKQVMLETASIASGASIAVFLVVILCFSLYVH 111
           LY  C      M   C    LD   +PS      + A    G   A+ L+ I C SL+  
Sbjct: 40  LYRRCIGLMDQMELTCGGYALDFLAIPSS---AWQAAMFFVGLGTALLLL-IACLSLFTF 95

Query: 112 CQRRKKQAQAASVCC 126
           C++         +C 
Sbjct: 96  CRQSIISKSVFKICG 110


>gnl|CDD|218597 pfam05466, BASP1, Brain acid soluble protein 1 (BASP1 protein).
           This family consists of several brain acid soluble
           protein 1 (BASP1) or neuronal axonal membrane protein
           NAP-22. The BASP1 is a neuron enriched Ca(2+)-dependent
           calmodulin-binding protein of unknown function.
          Length = 233

 Score = 31.0 bits (69), Expect = 0.37
 Identities = 20/60 (33%), Positives = 26/60 (43%)

Query: 157 TEAPRAADTRTSITITGKGDSVSASQLYHQSSPPLPHMSHPPSCTPLPPAPSQPPDDIKA 216
           TEAP AA   T        DS  +S     SS   P  +  PS T    AP+ P +++K 
Sbjct: 158 TEAPAAAAQETKSDAAPASDSKPSSSEAAPSSKETPAATEAPSSTAKASAPAAPAEEVKP 217


>gnl|CDD|147197 pfam04906, Tweety, Tweety.  The tweety (tty) gene has not been
           characterized at the protein level. However, it is
           thought to form a membrane protein with five potential
           membrane-spanning regions. A number of potential
           functions have been suggested in.
          Length = 406

 Score = 30.4 bits (69), Expect = 0.86
 Identities = 19/60 (31%), Positives = 31/60 (51%), Gaps = 6/60 (10%)

Query: 69  FKDLDGSYLPSRKQVMLETASIASGASIAVFLVVILCFSLYVHCQRRKKQA-QAASVCCT 127
           F+  D SY     Q +L  AS+A  A +A+ L+ +L + + + C RRK++       CC 
Sbjct: 8   FRPEDESYQ----QSLLFLASVA-AACLALSLLFLLFYLITLCCCRRKREEHSNKDCCCV 62


>gnl|CDD|216503 pfam01435, Peptidase_M48, Peptidase family M48. 
          Length = 223

 Score = 29.3 bits (66), Expect = 1.3
 Identities = 20/120 (16%), Positives = 37/120 (30%), Gaps = 12/120 (10%)

Query: 84  MLETASIASGASIAVFLVVILCFSLYVHCQRRKKQAQAASVCCTDGPGSSLQRPRMPFER 143
            +E+ S     ++ + L         +           A+   T      LQ   +P+ R
Sbjct: 106 SVESMSQGLLLNLLLLLGAAALGGRAL--------GFNANGFLTALGIFLLQLLLLPYSR 157

Query: 144 -RPSPADFV---LTRITTEAPRAADTRTSITITGKGDSVSASQLYHQSSPPLPHMSHPPS 199
            +   AD     L      A      R ++    K  + + S++      P    +HPP 
Sbjct: 158 KQEYEADEAGARLGGDKDLARAGYKPRAAVKFLAKLAAENLSRVSGGKLYPELLSTHPPL 217


>gnl|CDD|200219 TIGR02927, SucB_Actino, 2-oxoglutarate dehydrogenase, E2 component,
           dihydrolipoamide succinyltransferase.  This model
           represents an Actinobacterial clade of E2 enzyme, a
           component of the 2-oxoglutarate dehydrogenase complex
           involved in the TCA cycle. These proteins have multiple
           domains including the catalytic domain (pfam00198), one
           or two biotin domains (pfam00364) and an E3-component
           binding domain (pfam02817).
          Length = 579

 Score = 28.1 bits (62), Expect = 4.3
 Identities = 16/71 (22%), Positives = 27/71 (38%), Gaps = 4/71 (5%)

Query: 145 PSPADFVLTRITTEAPRAADTRTSITITGK----GDSVSASQLYHQSSPPLPHMSHPPSC 200
           PSPA  VL  I        +    + I G+    G   + +    +++P     +  P+ 
Sbjct: 49  PSPAAGVLLEIRAPEDDTVEVGGVLAIIGEPGEAGSEPAPAAPEPEAAPEPEAPAPAPTP 108

Query: 201 TPLPPAPSQPP 211
               PAP+ P 
Sbjct: 109 AAEAPAPAAPQ 119


>gnl|CDD|221170 pfam11696, DUF3292, Protein of unknown function (DUF3292).  This
           eukaryotic family of proteins has no known function.
          Length = 641

 Score = 27.4 bits (61), Expect = 6.6
 Identities = 10/43 (23%), Positives = 15/43 (34%), Gaps = 8/43 (18%)

Query: 188 SPPLPHMSHPPSCTPLPPAPSQP----PDDIKADMRSSQVSAT 226
           + PLP    PP  + L  APS P       +       ++   
Sbjct: 402 AAPLP----PPPSSSLRKAPSSPASIDHKQLNLGASEEEIDQA 440


>gnl|CDD|204999 pfam12661, hEGF, Human growth factor-like EGF.  hEGF, or human
          growth factor-like EGF, domains have six conserved
          residues disulfide-bonded into the characteristic
          'ababcc' pattern. They are involved in growth and
          proliferation of cells, in proteins of the Notch/Delta
          pathway, neuregulin and selectins. hEGFs are also found
          in mosaic proteins with four-disulfide laminin EGFs
          such as aggrecan and perlecan. The core fold of the EGF
          domain consists of two small beta-hairpins packed
          against each other. Two major structural variants have
          been identified based on the structural context of the
          C-terminal Cys residue of disulfide 'c' in the
          C-terminal hairpin: hEGFs and cEGFs. In hEGFs the
          C-terminal thiol resides in the beta-turn, resulting in
          shorter loop-lengths between the Cys residues of
          disulfide 'c', typically C[8-9]XC. These shorter
          loop-lengths are also typical of the four-disulfide EGF
          domains, laminin and integrin. Tandem hEGF domains have
          six linking residues between terminal cysteines of
          adjacent domains. hEGF domains may or may not bind
          calcium in the linker region. hEGF domains with the
          consensus motif CXD4X[F,Y]XCXC are hydroxylated
          exclusively in the Asp residue.
          Length = 13

 Score = 24.2 bits (54), Expect = 6.9
 Identities = 7/12 (58%), Positives = 8/12 (66%)

Query: 56 CECADGYMGQRC 67
          C+C  GY G RC
Sbjct: 2  CQCPPGYTGPRC 13


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 27.4 bits (61), Expect = 7.6
 Identities = 8/30 (26%), Positives = 11/30 (36%)

Query: 182 QLYHQSSPPLPHMSHPPSCTPLPPAPSQPP 211
           Q + +   P   +  P      PP P Q P
Sbjct: 208 QGHPEQVQPQQFLPAPSQAPAQPPLPPQLP 237


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 24.4 bits (53), Expect = 9.0
 Identities = 14/34 (41%), Positives = 14/34 (41%), Gaps = 6/34 (17%)

Query: 36 YCLNGATCFTVKIGESLLYNCECADGYMGQ-RCE 68
           C NG TC          Y C C  GY G  RCE
Sbjct: 7  PCSNG-TCINTPGS----YTCSCPPGYTGDKRCE 35


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.131    0.416 

Gapped
Lambda     K      H
   0.267   0.0647    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 11,136,528
Number of extensions: 969047
Number of successful extensions: 1661
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1614
Number of HSP's successfully gapped: 61
Length of query: 229
Length of database: 10,937,602
Length adjustment: 94
Effective length of query: 135
Effective length of database: 6,768,326
Effective search space: 913724010
Effective search space used: 913724010
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (25.9 bits)