RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy12789
         (514 letters)



>gnl|CDD|219677 pfam07974, EGF_2, EGF-like domain.  This family contains EGF
           domains found in a variety of extracellular proteins.
          Length = 31

 Score = 34.3 bits (79), Expect = 0.005
 Identities = 12/28 (42%), Positives = 15/28 (53%), Gaps = 2/28 (7%)

Query: 292 PNGCSGHGSCY--LGKCDCIDGYEGTDC 317
              C+G G+C    GKC C  GY+G  C
Sbjct: 4   SGICNGRGTCVRPCGKCVCDSGYQGATC 31



 Score = 30.9 bits (70), Expect = 0.076
 Identities = 12/29 (41%), Positives = 16/29 (55%), Gaps = 2/29 (6%)

Query: 356 QYSDCSGHGSCV--EGTCHCQSGWKGIGC 382
               C+G G+CV   G C C SG++G  C
Sbjct: 3   ASGICNGRGTCVRPCGKCVCDSGYQGATC 31



 Score = 29.7 bits (67), Expect = 0.21
 Identities = 11/30 (36%), Positives = 13/30 (43%), Gaps = 2/30 (6%)

Query: 388 PNACNRHGTCAFENEEYQCVCAEGWAGVDC 417
              CN  GTC       +CVC  G+ G  C
Sbjct: 4   SGICNGRGTC--VRPCGKCVCDSGYQGATC 31



 Score = 27.4 bits (61), Expect = 1.6
 Identities = 8/26 (30%), Positives = 14/26 (53%), Gaps = 2/26 (7%)

Query: 325 LCSNHGK--YGGGICHCENGWKGPEC 348
           +C+  G      G C C++G++G  C
Sbjct: 6   ICNGRGTCVRPCGKCVCDSGYQGATC 31


>gnl|CDD|113629 pfam04863, EGF_alliinase, Alliinase EGF-like domain.  Allicin is a
           thiosulphinate that gives rise to dithiines, allyl
           sulphides and ajoenes, the three groups of active
           compounds in Allium species. Allicin is synthesised from
           sulfoxide cysteine derivatives by alliinase
           (EC:4.4.1.4), whose C-S lyase activity cleaves
           C(beta)-S(gamma) bonds. It is thought that this enzyme
           forms part of a primitive plant defence system. This
           family represents the N-terminal EGF-like domain.
          Length = 56

 Score = 29.8 bits (67), Expect = 0.37
 Identities = 20/74 (27%), Positives = 28/74 (37%), Gaps = 26/74 (35%)

Query: 282 KEAEGVSSTCPNGCSGHGSCYLGKCDCIDGYEGTDCSKSVCPVLCSNHGKYGGGICHCEN 341
           +EAE V++     CSGHG  +L      DG                     G  IC C  
Sbjct: 9   EEAEAVAAI---NCSGHGRAFL------DGI-----------------ISDGSPICECNT 42

Query: 342 GWKGPECDIPASDC 355
            + GP+C +   +C
Sbjct: 43  CYTGPDCSVLIPNC 56


>gnl|CDD|225566 COG3022, COG3022, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 253

 Score = 32.3 bits (74), Expect = 0.51
 Identities = 15/82 (18%), Positives = 32/82 (39%), Gaps = 13/82 (15%)

Query: 90  EEAPSVVELKDFNSLISATIPSFQFWNSDFNIEQSAFFRFNFDLPRGSNFAVYGRRNVAP 149
            +  S++++ D   L       FQ W + F   + A   FN D+  G +           
Sbjct: 44  NQISSLMKISD--KLAGLNAQRFQDWETQFTPARQAILAFNGDVYTGLDAE--------- 92

Query: 150 SITNYDFSEFIKDNTR-TSAFF 170
           +++  D   +++ + R  S  +
Sbjct: 93  TLSEKDQ-AYLQQHLRILSGLY 113


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 28.2 bits (63), Expect = 0.75
 Identities = 11/27 (40%), Positives = 14/27 (51%)

Query: 388 PNACNRHGTCAFENEEYQCVCAEGWAG 414
            N C+  GTC      Y+CVC  G+ G
Sbjct: 5   SNPCSNGGTCVNTPGSYRCVCPPGYTG 31


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 28.4 bits (64), Expect = 0.87
 Identities = 12/32 (37%), Positives = 15/32 (46%), Gaps = 4/32 (12%)

Query: 388 PNACNRHGTCAFENEE--YQCVCAEGWAGVDC 417
            N C   GTC   N    Y+C C  G+ G +C
Sbjct: 8   GNPCQNGGTC--VNTVGSYRCSCPPGYTGRNC 37



 Score = 26.4 bits (59), Expect = 3.5
 Identities = 9/32 (28%), Positives = 13/32 (40%), Gaps = 4/32 (12%)

Query: 356 QYSDCSGHGSCVEG----TCHCQSGWKGIGCQ 383
             + C   G+CV       C C  G+ G  C+
Sbjct: 7   SGNPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38



 Score = 26.1 bits (58), Expect = 5.0
 Identities = 10/30 (33%), Positives = 13/30 (43%), Gaps = 4/30 (13%)

Query: 292 PNGCSGHGSCYLG----KCDCIDGYEGTDC 317
            N C   G+C       +C C  GY G +C
Sbjct: 8   GNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37


>gnl|CDD|188543 TIGR04028, SBP_KPN_01854, ABC transporter substrate binding
           protein, KPN_01854 family.  Members of this protein
           family are ABC transporter substrate-binding proteins
           related to KPN_01854 from Klebsiella pneumoniae, and
           occur in both Gram-positive and Gram-negative species.
           This transport protein family is closely associated with
           a putative FMN-dependent luciferase-like monooxygenase
           of unknown function (TIGR04027), as well as with the
           other proteins of its transporter complex [Transport and
           binding proteins, Unknown substrate].
          Length = 509

 Score = 32.0 bits (73), Expect = 0.94
 Identities = 21/78 (26%), Positives = 28/78 (35%), Gaps = 16/78 (20%)

Query: 138 NFAVYGRRN------VAPSITNYDFSEFIKDNTRTSAFFRFNFDLPRGSNFA----VYGR 187
           NF +YG  +      V+  I NYD SE +   T      +F F  P    F     V   
Sbjct: 86  NFDLYGLGDKDRKLTVSEVINNYDRSEVVDPLT-----VKFYFSKP-SPGFLQGTSVINS 139

Query: 188 RNVAPSITNYDFSEFIKG 205
             V+ +     F  F  G
Sbjct: 140 GLVSLATLALPFEGFGPG 157


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason of discrepancy between swiss-prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 27.4 bits (61), Expect = 1.5
 Identities = 10/24 (41%), Positives = 12/24 (50%)

Query: 391 CNRHGTCAFENEEYQCVCAEGWAG 414
           C+  GTC      Y C C EG+ G
Sbjct: 7   CSNGGTCVDTPGGYTCECPEGYTG 30



 Score = 26.6 bits (59), Expect = 2.7
 Identities = 10/25 (40%), Positives = 13/25 (52%), Gaps = 4/25 (16%)

Query: 360 CSGHGSCVEG----TCHCQSGWKGI 380
           CS  G+CV+     TC C  G+ G 
Sbjct: 7   CSNGGTCVDTPGGYTCECPEGYTGK 31


>gnl|CDD|221873 pfam12955, DUF3844, Domain of unknown function (DUF3844).  This
           presumed domain is found in fungal species. It contains
           8 largely conserved cysteine residues. This domain is
           found in proteins that are thought to be found in the
           endoplasmic reticulum.
          Length = 103

 Score = 29.2 bits (66), Expect = 1.6
 Identities = 10/20 (50%), Positives = 10/20 (50%), Gaps = 4/20 (20%)

Query: 293 NGCSGHGSCYL----GKCDC 308
           N CSGHGSC         DC
Sbjct: 13  NSCSGHGSCVKKSKSKGGDC 32


>gnl|CDD|236544 PRK09506, mrcB, bifunctional glycosyl transferase/transpeptidase;
           Reviewed.
          Length = 830

 Score = 30.1 bits (68), Expect = 3.2
 Identities = 12/44 (27%), Positives = 16/44 (36%)

Query: 432 DKDGVTDCSDSDCCSQPVCSDQPHIMCLASNDPVEVLLRKQPPS 475
           D DG   C        PV +D P  +C  S    +    +Q P 
Sbjct: 765 DYDGNFVCGSGGMRVLPVWTDDPQSLCQQSEMQQQPSQPQQQPQ 808


>gnl|CDD|240410 PTZ00418, PTZ00418, Poly(A) polymerase; Provisional.
          Length = 593

 Score = 30.2 bits (68), Expect = 3.2
 Identities = 14/73 (19%), Positives = 27/73 (36%), Gaps = 6/73 (8%)

Query: 98  LKDFNSLISATIPSFQFWNSD-FNIEQSAFFRFNFDLPRGSNFAVYGRRNVAPSITNYDF 156
           L+  N+L     P F  +  D ++   S F    F      N + +   ++  +I   DF
Sbjct: 444 LETLNNLKIRPYPKFFKYQDDGWDYASSFFIGLVFFSKNVYNNSTF---DLRYAIR--DF 498

Query: 157 SEFIKDNTRTSAF 169
            + I +      +
Sbjct: 499 VDIINNWPEMEKY 511


>gnl|CDD|215680 pfam00053, Laminin_EGF, Laminin EGF-like (Domains III and V).  This
           family is like pfam00008 but has 8 conserved cysteines
           instead of six.
          Length = 49

 Score = 26.9 bits (60), Expect = 3.7
 Identities = 11/29 (37%), Positives = 13/29 (44%), Gaps = 6/29 (20%)

Query: 295 CSGHGS----CYL--GKCDCIDGYEGTDC 317
           C+ HGS    C    G+C C  G  G  C
Sbjct: 3   CNPHGSLSDTCDPETGQCLCKPGVTGRHC 31


>gnl|CDD|201923 pfam01683, EB, EB module.  This domain has no known function. It is
           found in several C. elegans proteins. The domain
           contains 8 conserved cysteines that probably form four
           disulphide bridges. This domain is found associated with
           kunitz domains pfam00014.
          Length = 52

 Score = 27.0 bits (60), Expect = 3.7
 Identities = 11/27 (40%), Positives = 14/27 (51%), Gaps = 1/27 (3%)

Query: 351 PASDCQYSD-CSGHGSCVEGTCHCQSG 376
           P   C+Y + C G   C+ GTC C  G
Sbjct: 18  PGESCEYDEQCQGGSVCINGTCQCPEG 44


>gnl|CDD|218410 pfam05065, Phage_capsid, Phage capsid family.  Family of
           bacteriophage hypothetical proteins and capsid proteins.
          Length = 240

 Score = 29.3 bits (66), Expect = 4.2
 Identities = 8/48 (16%), Positives = 15/48 (31%)

Query: 137 SNFAVYGRRNVAPSITNYDFSEFIKDNTRTSAFFRFNFDLPRGSNFAV 184
           S + +  R  V        ++ F K+     A  R +  +     F  
Sbjct: 190 SAYIIVDREGVTVERLRDPYTAFEKNQVGFRATERVDGAVVDPEAFKK 237


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 26.3 bits (58), Expect = 4.3
 Identities = 8/22 (36%), Positives = 10/22 (45%)

Query: 394 HGTCAFENEEYQCVCAEGWAGV 415
           +GTC      Y C C  G+ G 
Sbjct: 10  NGTCINTPGSYTCSCPPGYTGD 31


>gnl|CDD|235493 PRK05481, PRK05481, lipoyl synthase; Provisional.
          Length = 289

 Score = 29.3 bits (67), Expect = 4.6
 Identities = 10/20 (50%), Positives = 11/20 (55%), Gaps = 2/20 (10%)

Query: 382 CQEAGCPNA--CNRHGTCAF 399
           C+EA CPN   C   GT  F
Sbjct: 37  CEEASCPNIGECWSRGTATF 56


>gnl|CDD|193537 cd05661, M28_like_PA_2, M28 Zn-Peptidases containing a PA domain
          insert.  Peptidase family M28 (also called
          aminopeptidase Y family), uncharacterized subfamily.
          The M28 family contains aminopeptidases as well as
          carboxypeptidases. They have co-catalytic zinc ions;
          each zinc ion is tetrahedrally co-ordinated, with three
          amino acid ligands plus activated water; one aspartate
          residue binds both metal ions. This subfamily is
          composed of uncharacterized proteins containing a
          protease-associated (PA) domain insert which may
          participate in substrate binding and/or promote
          conformational changes, influencing the stability and
          accessibility of the site to substrate.
          Length = 305

 Score = 28.8 bits (65), Expect = 7.7
 Identities = 8/22 (36%), Positives = 10/22 (45%)

Query: 28 PSVTASFYQRVKFLVEENSVQS 49
          P+    +YQ V F  E   V S
Sbjct: 45 PAGDDGYYQPVPFQEEHEDVTS 66



 Score = 28.8 bits (65), Expect = 7.7
 Identities = 8/22 (36%), Positives = 10/22 (45%)

Query: 474 PSVTASFYQRVKFLVEENSVQS 495
           P+    +YQ V F  E   V S
Sbjct: 45  PAGDDGYYQPVPFQEEHEDVTS 66


>gnl|CDD|238012 cd00055, EGF_Lam, Laminin-type epidermal growth factor-like domain;
           laminins are the major noncollagenous components of
           basement membranes that mediate cell adhesion, growth,
           migration, and differentiation; the laminin-type
           epidermal growth factor-like module occurs in tandem
           arrays; the domain contains 4 disulfide bonds (loops
           a-d) the first three resemble epidermal growth factor
           (EGF); the number of copies of this domain in the
           different forms of laminins is highly variable ranging
           from 3 up to 22 copies.
          Length = 50

 Score = 25.8 bits (57), Expect = 9.4
 Identities = 11/31 (35%), Positives = 15/31 (48%), Gaps = 6/31 (19%)

Query: 295 CSGHGS----CYL--GKCDCIDGYEGTDCSK 319
           C+GHGS    C    G+C+C     G  C +
Sbjct: 4   CNGHGSLSGQCDPGTGQCECKPNTTGRRCDR 34


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.134    0.428 

Gapped
Lambda     K      H
   0.267   0.0632    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 24,880,265
Number of extensions: 2294622
Number of successful extensions: 1790
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1775
Number of HSP's successfully gapped: 58
Length of query: 514
Length of database: 10,937,602
Length adjustment: 101
Effective length of query: 413
Effective length of database: 6,457,848
Effective search space: 2667091224
Effective search space used: 2667091224
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 61 (27.5 bits)