RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11800
         (220 letters)



>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like.  cEGF, or complement
           Clr-like EGF, domains have six conserved cysteine
           residues disulfide-bonded into the characteristic
           pattern 'ababcc'. They are found in blood coagulation
           proteins such as fibrillin, Clr and Cls, thrombomodulin,
           and the LDL receptor. The core fold of the EGF domain
           consists of two small beta-hairpins packed against each
           other. Two major structural variants have been
           identified based on the structural context of the
           C-terminal cysteine residue of disulfide 'c' in the
           C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
           C-terminal thiol resides on the C-terminal beta-sheet,
           resulting in long loop-lengths between the cysteine
           residues of disulfide 'c', typically C[10+]XC. These
           longer loop-lengths may have arisen by selective
           cysteine loss from a four-disulfide EGF template such as
           laminin or integrin. Tandem cEGF domains have five
           linking residues between terminal cysteines of adjacent
           domains. cEGF domains may or may not bind calcium in the
           linker region. cEGF domains with the consensus motif
           CXN4X[F,Y]XCXC are hydroxylated exclusively on the
           asparagine residue.
          Length = 24

 Score = 40.5 bits (96), Expect = 1e-05
 Identities = 12/24 (50%), Positives = 15/24 (62%)

Query: 141 SFKCQCKPGFVLSPTGHACIDVDE 164
           S+ C C PG+ LS  G  C D+DE
Sbjct: 1   SYTCSCPPGYQLSGDGRTCEDIDE 24



 Score = 37.4 bits (88), Expect = 2e-04
 Identities = 14/24 (58%), Positives = 15/24 (62%)

Query: 73 SYRCACQPGYSPSPDGGFCVDRDE 96
          SY C+C PGY  S DG  C D DE
Sbjct: 1  SYTCSCPPGYQLSGDGRTCEDIDE 24



 Score = 37.4 bits (88), Expect = 2e-04
 Identities = 14/24 (58%), Positives = 15/24 (62%)

Query: 183 SYRCACQPGYSPSPDGGFCVDRDE 206
           SY C+C PGY  S DG  C D DE
Sbjct: 1   SYTCSCPPGYQLSGDGRTCEDIDE 24


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 40.7 bits (96), Expect = 1e-05
 Identities = 14/30 (46%), Positives = 20/30 (66%), Gaps = 1/30 (3%)

Query: 122 DECSQNGMCANG-MCINMDGSFKCQCKPGF 150
           DEC+    C NG  C+N  GS++C+C PG+
Sbjct: 3   DECASGNPCQNGGTCVNTVGSYRCECPPGY 32



 Score = 40.3 bits (95), Expect = 2e-05
 Identities = 22/43 (51%), Positives = 25/43 (58%), Gaps = 5/43 (11%)

Query: 161 DVDECYENPLICLNG-RCDNTLGSYRCACQPGYSPSPDGGFCV 202
           D+DEC      C NG  C NT+GSYRC C PGY+   DG  C 
Sbjct: 1   DIDECASGN-PCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39



 Score = 38.8 bits (91), Expect = 7e-05
 Identities = 21/43 (48%), Positives = 25/43 (58%), Gaps = 5/43 (11%)

Query: 51 NVDECYENPLICLNG-RCDNTLGSYRCACQPGYSPSPDGGFCV 92
          ++DEC      C NG  C NT+GSYRC C PGY+   DG  C 
Sbjct: 1  DIDECASGN-PCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 40.2 bits (94), Expect = 2e-05
 Identities = 15/33 (45%), Positives = 19/33 (57%)

Query: 123 ECSQNGMCANGMCINMDGSFKCQCKPGFVLSPT 155
           EC+  G C+NG CIN  GS+ C C PG+     
Sbjct: 1   ECASGGPCSNGTCINTPGSYTCSCPPGYTGDKR 33



 Score = 34.0 bits (78), Expect = 0.003
 Identities = 16/34 (47%), Positives = 18/34 (52%), Gaps = 1/34 (2%)

Query: 54 ECYENPLICLNGRCDNTLGSYRCACQPGYSPSPD 87
          EC      C NG C NT GSY C+C PGY+    
Sbjct: 1  ECASGG-PCSNGTCINTPGSYTCSCPPGYTGDKR 33



 Score = 34.0 bits (78), Expect = 0.003
 Identities = 16/34 (47%), Positives = 18/34 (52%), Gaps = 1/34 (2%)

Query: 164 ECYENPLICLNGRCDNTLGSYRCACQPGYSPSPD 197
           EC      C NG C NT GSY C+C PGY+    
Sbjct: 1   ECASGG-PCSNGTCINTPGSYTCSCPPGYTGDKR 33


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 39.9 bits (94), Expect = 3e-05
 Identities = 14/32 (43%), Positives = 19/32 (59%), Gaps = 1/32 (3%)

Query: 122 DECSQNGMCAN-GMCINMDGSFKCQCKPGFVL 152
           DEC+    C N G C+N  GS++C C PG+  
Sbjct: 3   DECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34



 Score = 39.2 bits (92), Expect = 5e-05
 Identities = 17/33 (51%), Positives = 21/33 (63%)

Query: 161 DVDECYENPLICLNGRCDNTLGSYRCACQPGYS 193
           D+DEC         G C NT+GSYRC+C PGY+
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33



 Score = 37.6 bits (88), Expect = 2e-04
 Identities = 16/33 (48%), Positives = 21/33 (63%)

Query: 51 NVDECYENPLICLNGRCDNTLGSYRCACQPGYS 83
          ++DEC         G C NT+GSYRC+C PGY+
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 39.3 bits (92), Expect = 4e-05
 Identities = 19/42 (45%), Positives = 23/42 (54%), Gaps = 1/42 (2%)

Query: 161 DVDECYENPLIC-LNGRCDNTLGSYRCACQPGYSPSPDGGFC 201
           DVDEC +    C  N  C NT+GS+ C C  GY  + DG  C
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42



 Score = 36.9 bits (86), Expect = 3e-04
 Identities = 18/41 (43%), Positives = 22/41 (53%), Gaps = 1/41 (2%)

Query: 52 VDECYENPLIC-LNGRCDNTLGSYRCACQPGYSPSPDGGFC 91
          VDEC +    C  N  C NT+GS+ C C  GY  + DG  C
Sbjct: 2  VDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42



 Score = 35.4 bits (82), Expect = 0.001
 Identities = 14/40 (35%), Positives = 19/40 (47%), Gaps = 2/40 (5%)

Query: 122 DECSQNG-MCANGM-CINMDGSFKCQCKPGFVLSPTGHAC 159
           DEC+     C     C+N  GSF+C C  G+  +  G  C
Sbjct: 3   DECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 38.5 bits (90), Expect = 0.001
 Identities = 15/40 (37%), Positives = 19/40 (47%), Gaps = 1/40 (2%)

Query: 159 CIDVDECYENPLICLNGRCDNTLGSYRCACQPGYSPSPDG 198
           C+  D C     +C    C +T GSY CAC  GY+   D 
Sbjct: 184 CVVPDLCATLSHVCQQV-CISTPGSYLCACTEGYALLEDN 222



 Score = 33.9 bits (78), Expect = 0.038
 Identities = 14/37 (37%), Positives = 17/37 (45%), Gaps = 1/37 (2%)

Query: 52  VDECYENPLICLNGRCDNTLGSYRCACQPGYSPSPDG 88
            D C     +C    C +T GSY CAC  GY+   D 
Sbjct: 187 PDLCATLSHVCQQV-CISTPGSYLCACTEGYALLEDN 222



 Score = 33.1 bits (76), Expect = 0.064
 Identities = 17/67 (25%), Positives = 21/67 (31%), Gaps = 27/67 (40%)

Query: 88  GGFCVDRDECRTPGDHDECSQKKKKKKKKKKLYHDECSQNGMCANGMCINMDGSFKCQCK 147
           G  CV  D C T   H                    C Q       +CI+  GS+ C C 
Sbjct: 181 GKICVVPDLCAT-LSHV-------------------CQQ-------VCISTPGSYLCACT 213

Query: 148 PGFVLSP 154
            G+ L  
Sbjct: 214 EGYALLE 220


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 33.6 bits (77), Expect = 0.004
 Identities = 13/34 (38%), Positives = 21/34 (61%), Gaps = 1/34 (2%)

Query: 123 ECSQNGMCAN-GMCINMDGSFKCQCKPGFVLSPT 155
           EC+ +  C+N G C+N  GS++C C PG+    +
Sbjct: 1   ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34



 Score = 30.5 bits (69), Expect = 0.064
 Identities = 13/20 (65%), Positives = 14/20 (70%)

Query: 64 NGRCDNTLGSYRCACQPGYS 83
           G C NT GSYRC C PGY+
Sbjct: 11 GGTCVNTPGSYRCVCPPGYT 30



 Score = 30.5 bits (69), Expect = 0.064
 Identities = 13/20 (65%), Positives = 14/20 (70%)

Query: 174 NGRCDNTLGSYRCACQPGYS 193
            G C NT GSYRC C PGY+
Sbjct: 11  GGTCVNTPGSYRCVCPPGYT 30


>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
           Pvs28.  This family consists of several ookinete surface
           proteins (Pvs28) from several species of Plasmodium.
           Pvs25 and Pvs28 are expressed on the surface of
           ookinetes. These proteins are potential candidates for
           vaccine and induce antibodies that block the infectivity
           of Plasmodium vivax in immunised animals.
          Length = 196

 Score = 32.8 bits (75), Expect = 0.076
 Identities = 19/72 (26%), Positives = 29/72 (40%), Gaps = 10/72 (13%)

Query: 130 CANGMCINMDGSFKCQCKPGFVLSPTGHACIDVDECYE---------NPLICLNGRCDNT 180
           C NG  I M   F+C+C  G+VL    + C +  +C +             C+N      
Sbjct: 8   CKNGYLIQMSNHFECKCNEGYVLK-NENTCEEKVKCDKLENVNKVCGEYATCINQANKAE 66

Query: 181 LGSYRCACQPGY 192
             + +C C  GY
Sbjct: 67  EKALKCGCINGY 78


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 29.8 bits (68), Expect = 0.083
 Identities = 15/36 (41%), Positives = 16/36 (44%), Gaps = 3/36 (8%)

Query: 125 SQNGMC-ANGMCINMDGSFKCQCKPGFVLSPTGHAC 159
             NG C  N  C N  GSF C CK G+     G  C
Sbjct: 3   ENNGGCHPNATCTNTGGSFTCTCKSGYTGD--GVTC 36



 Score = 27.5 bits (62), Expect = 0.61
 Identities = 14/30 (46%), Positives = 17/30 (56%), Gaps = 1/30 (3%)

Query: 55 CYENPLIC-LNGRCDNTLGSYRCACQPGYS 83
          C EN   C  N  C NT GS+ C C+ GY+
Sbjct: 1  CAENNGGCHPNATCTNTGGSFTCTCKSGYT 30



 Score = 27.5 bits (62), Expect = 0.61
 Identities = 14/30 (46%), Positives = 17/30 (56%), Gaps = 1/30 (3%)

Query: 165 CYENPLIC-LNGRCDNTLGSYRCACQPGYS 193
           C EN   C  N  C NT GS+ C C+ GY+
Sbjct: 1   CAENNGGCHPNATCTNTGGSFTCTCKSGYT 30


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason of discrepancy between swiss-prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 28.9 bits (65), Expect = 0.20
 Identities = 11/28 (39%), Positives = 17/28 (60%), Gaps = 1/28 (3%)

Query: 124 CSQNGMCANGM-CINMDGSFKCQCKPGF 150
           CS N  C+NG  C++  G + C+C  G+
Sbjct: 1   CSPNNPCSNGGTCVDTPGGYTCECPEGY 28



 Score = 25.5 bits (56), Expect = 3.2
 Identities = 9/20 (45%), Positives = 11/20 (55%)

Query: 64 NGRCDNTLGSYRCACQPGYS 83
           G C +T G Y C C  GY+
Sbjct: 10 GGTCVDTPGGYTCECPEGYT 29



 Score = 25.5 bits (56), Expect = 3.2
 Identities = 9/20 (45%), Positives = 11/20 (55%)

Query: 174 NGRCDNTLGSYRCACQPGYS 193
            G C +T G Y C C  GY+
Sbjct: 10  GGTCVDTPGGYTCECPEGYT 29


>gnl|CDD|219901 pfam08555, DUF1754, Eukaryotic family of unknown function
           (DUF1754).  This is a eukaryotic protein family of
           unknown function.
          Length = 90

 Score = 30.1 bits (68), Expect = 0.21
 Identities = 10/12 (83%), Positives = 10/12 (83%)

Query: 107 SQKKKKKKKKKK 118
             KKKKKKKKKK
Sbjct: 18  DVKKKKKKKKKK 29



 Score = 28.2 bits (63), Expect = 1.1
 Identities = 9/11 (81%), Positives = 10/11 (90%)

Query: 108 QKKKKKKKKKK 118
           +KKKKKKKKK 
Sbjct: 20  KKKKKKKKKKN 30


>gnl|CDD|224655 COG1741, COG1741, Pirin-related protein [General function
           prediction only].
          Length = 276

 Score = 28.4 bits (64), Expect = 2.6
 Identities = 9/27 (33%), Positives = 13/27 (48%)

Query: 13  LHWASAGPAVVHSTYTGLSETQTYTGL 39
           + W +AG  +VHS     S  +   GL
Sbjct: 93  VQWMTAGSGIVHSEMNPPSTGKPLHGL 119


>gnl|CDD|201524 pfam00954, S_locus_glycop, S-locus glycoprotein family.  In
           Brassicaceae, self-incompatible plants have a
           self/non-self recognition system. This is
           sporophytically controlled by multiple alleles at a
           single locus (S). S-locus glycoproteins, as well as
           S-receptor kinases, are in linkage with the S-alleles.
          Length = 110

 Score = 27.2 bits (61), Expect = 3.4
 Identities = 13/31 (41%), Positives = 17/31 (54%), Gaps = 2/31 (6%)

Query: 122 DECSQNGMC-ANGMCINMDGSFKCQCKPGFV 151
           D+C   G C   G C +++ S KC C  GFV
Sbjct: 78  DQCDVYGRCGPYGYC-DVNTSPKCNCIKGFV 107


>gnl|CDD|99903 cd06080, MUM1_like, Mutated melanoma-associated antigen 1 (MUM-1)
           is a melanoma-associated antigen (MAA).  MUM-1 belongs
           to the mutated or aberrantly expressed type of MAAs,
           along with antigens such as CDK4, beta-catenin,
           gp100-in4, p15, and N-acetylglucosaminyltransferase V.
           It is highly expressed in several types of human
           cancers.  The PWWP domain, named for a conserved
           Pro-Trp-Trp-Pro motif, is a small domain consisting of
           100-150 amino acids. The PWWP domain is found in
           numerous proteins that are involved in cell division,
           growth and differentiation. Most PWWP-domain proteins
           seem to be nuclear, often DNA-binding, proteins that
           function as transcription factors regulating a variety
           of developmental processes.
          Length = 80

 Score = 26.6 bits (59), Expect = 3.8
 Identities = 7/19 (36%), Positives = 12/19 (63%)

Query: 102 DHDECSQKKKKKKKKKKLY 120
            H +C++K+K   K K+ Y
Sbjct: 55  KHFDCTEKQKLTNKAKESY 73


>gnl|CDD|216726 pfam01826, TIL, Trypsin Inhibitor like cysteine rich domain.  This
           family contains trypsin inhibitors as well as a domain
           found in many extracellular proteins. The domain
           typically contains ten cysteine residues that form five
           disulphide bonds. The cysteine residues that form the
           disulphide bonds are 1-7, 2-6, 3-5, 4-10 and 8-9.
          Length = 55

 Score = 25.8 bits (57), Expect = 4.4
 Identities = 8/22 (36%), Positives = 11/22 (50%), Gaps = 1/22 (4%)

Query: 144 CQCKPGFVLSPTGHACIDVDEC 165
           C C PG+V    G  C+   +C
Sbjct: 35  CVCPPGYVRDNDG-KCVPPSQC 55


>gnl|CDD|222466 pfam13945, NST1, Splicing factor, salt tolerance regulator.  NST1
           is a family of proteins that seem to be involved,
           directly or indirectly, in the salt sensitivity of some
           cellular functions in yeast. These proteins also
           interact with the splicing factor Msl1p.
          Length = 189

 Score = 27.6 bits (61), Expect = 4.5
 Identities = 12/23 (52%), Positives = 14/23 (60%)

Query: 95  DECRTPGDHDECSQKKKKKKKKK 117
           DE  T   +D  S K KKKKKK+
Sbjct: 18  DELNTVIHNDSSSSKSKKKKKKR 40


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.136    0.461 

Gapped
Lambda     K      H
   0.267   0.0632    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 10,754,766
Number of extensions: 927083
Number of successful extensions: 1248
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1230
Number of HSP's successfully gapped: 65
Length of query: 220
Length of database: 10,937,602
Length adjustment: 93
Effective length of query: 127
Effective length of database: 6,812,680
Effective search space: 865210360
Effective search space used: 865210360
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 57 (25.9 bits)