RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy5750
         (286 letters)



>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 46.8 bits (112), Expect = 1e-07
 Identities = 16/36 (44%), Positives = 21/36 (58%)

Query: 118 CNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQC 153
           C      CH NA C N  GS++C C+ G+TG+G  C
Sbjct: 1   CAENNGGCHPNATCTNTGGSFTCTCKSGYTGDGVTC 36



 Score = 43.3 bits (103), Expect = 2e-06
 Identities = 15/36 (41%), Positives = 20/36 (55%)

Query: 216 CMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNC 251
           C      C+ NA C N  G++ C CK G++GDG  C
Sbjct: 1   CAENNGGCHPNATCTNTGGSFTCTCKSGYTGDGVTC 36


>gnl|CDD|238158 cd00255, nidG2, Nidogen, G2 domain; Nidogen is an important
           component of the basement membrane, an extracellular
           sheet-like matrix. Nidogen is a multifunctional protein
           that interacts with many other basement membrane
           proteins, like collagen, perlecan, laminin, and has a
           potential role in the assembly and connection of
           networks. Nidogen consists of 3 globular domains
           (G1-G3), G3 is the laminin-binding domain, while G2 binds
           collagen IV and perlecan. Also found in hemicentin, a
           protein which functions at various cell-cell and
           cell-matrix junctions and might assist in refining broad
           regions of cell contact into oriented, line-shaped
           junctions. Nidogen G2 consists of an N-terminal EGF-like
           domain (excluded from this alignment model) and an
           11-stranded beta-barrel with a central helix, a topology
           that exhibits high structural similarity to the green
           fluorescent proteins of Cnidaria.
          Length = 224

 Score = 50.0 bits (120), Expect = 3e-07
 Identities = 17/66 (25%), Positives = 31/66 (46%), Gaps = 1/66 (1%)

Query: 1   GVFNYSAELIFST-GQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQE 59
           G F   AE+ F T G++L + +   G D +  L +   I G  P +  G +  +ED  + 
Sbjct: 84  GEFTRQAEVTFYTGGEKLRITQVARGLDSHGHLLLDTVISGRVPQVPAGATVHIEDYTEL 143

Query: 60  FTHSST 65
           + ++  
Sbjct: 144 YHYTGP 149


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 45.4 bits (108), Expect = 4e-07
 Identities = 16/37 (43%), Positives = 21/37 (56%)

Query: 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNG 150
           D++EC  GT  C  N +C N IGS+ C C  G+  N 
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNE 37



 Score = 40.8 bits (96), Expect = 2e-05
 Identities = 17/42 (40%), Positives = 24/42 (57%), Gaps = 2/42 (4%)

Query: 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGF--SGDGFNC 251
           DVDEC +    C  N  C+N  G+++C C  G+  + DG NC
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 44.9 bits (107), Expect = 6e-07
 Identities = 19/41 (46%), Positives = 27/41 (65%), Gaps = 2/41 (4%)

Query: 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNCE 252
           D+DEC +  P C N   C+N  G+Y+C+C  G++ DG NCE
Sbjct: 1   DIDECASGNP-CQNGGTCVNTVGSYRCECPPGYT-DGRNCE 39



 Score = 43.8 bits (104), Expect = 1e-06
 Identities = 18/41 (43%), Positives = 24/41 (58%), Gaps = 2/41 (4%)

Query: 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGHQCT 154
           DI+EC A  + C     C N +GSY C+C PG+T +G  C 
Sbjct: 1   DIDEC-ASGNPCQNGGTCVNTVGSYRCECPPGYT-DGRNCE 39


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 43.0 bits (102), Expect = 3e-06
 Identities = 18/41 (43%), Positives = 25/41 (60%), Gaps = 3/41 (7%)

Query: 212 DVDECMNYPPICNNNADCINRPGTYQCQCKRGFSGDGFNCE 252
           D+DEC +  P C N   C+N  G+Y+C C  G++G   NCE
Sbjct: 1   DIDECASGNP-CQNGGTCVNTVGSYRCSCPPGYTGR--NCE 38



 Score = 43.0 bits (102), Expect = 3e-06
 Identities = 17/35 (48%), Positives = 21/35 (60%), Gaps = 1/35 (2%)

Query: 114 DINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTG 148
           DI+EC A  + C     C N +GSY C C PG+TG
Sbjct: 1   DIDEC-ASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34


>gnl|CDD|214774 smart00682, G2F, G2 nidogen domain and fibulin. 
          Length = 227

 Score = 46.7 bits (111), Expect = 3e-06
 Identities = 19/64 (29%), Positives = 35/64 (54%)

Query: 1   GVFNYSAELIFSTGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEF 60
           GVF    E+ F+ G+ L +++ F G D +  LK++  + G  P +A G    + D  +E+
Sbjct: 86  GVFTRETEVTFAGGEILRIKQTFSGLDEHGYLKVKIEVSGRVPQVAAGAEVTIPDYTEEY 145

Query: 61  THSS 64
           T++ 
Sbjct: 146 TYTG 149


>gnl|CDD|219422 pfam07474, G2F, G2F domain.  Nidogen, an invariant component of
           basement membranes, is a multifunctional protein that
           interacts with most other major basement membrane
           proteins. The G2 fragment (or G2F domain) contains
           binding sites for collagen IV and perlecan. The
           structure is composed of an 11-stranded beta-barrel with
           a central helix. This domain is structurally related to
           that of green fluorescent protein pfam01353. A large
           surface patch on the beta-barrel is conserved in all
           metazoan nidogens.
          Length = 193

 Score = 41.7 bits (98), Expect = 1e-04
 Identities = 17/64 (26%), Positives = 32/64 (50%)

Query: 1   GVFNYSAELIFSTGQRLHVQEEFFGHDVYDQLKMQGSIQGTAPSIAVGVSPVLEDLQQEF 60
           GVF    E+ F TG+ L +++ F G D    L ++  + G  P I  G    ++D  +++
Sbjct: 86  GVFKRETEVTFHTGEILRIKQIFSGLDSDGYLLIKTVVSGRVPQIPSGAEVTIKDYTEDY 145

Query: 61  THSS 64
            ++ 
Sbjct: 146 HYTG 149


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 35.5 bits (82), Expect = 0.001
 Identities = 15/35 (42%), Positives = 19/35 (54%), Gaps = 1/35 (2%)

Query: 117 ECNAGTDLCHKNAMCFNEIGSYSCQCRPGFTGNGH 151
           EC A ++ C     C N  GSY C C PG+TG+  
Sbjct: 1   EC-AASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34



 Score = 34.4 bits (79), Expect = 0.003
 Identities = 14/33 (42%), Positives = 21/33 (63%), Gaps = 1/33 (3%)

Query: 220 PPICNNNADCINRPGTYQCQCKRGFSGDGFNCE 252
              C+N   C+N PG+Y+C C  G++GD  +CE
Sbjct: 5   SNPCSNGGTCVNTPGSYRCVCPPGYTGD-RSCE 36


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason of discrepancy between swiss-prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 32.4 bits (74), Expect = 0.013
 Identities = 10/25 (40%), Positives = 16/25 (64%)

Query: 223 CNNNADCINRPGTYQCQCKRGFSGD 247
           C+N   C++ PG Y C+C  G++G 
Sbjct: 7   CSNGGTCVDTPGGYTCECPEGYTGK 31



 Score = 29.7 bits (67), Expect = 0.13
 Identities = 9/24 (37%), Positives = 13/24 (54%)

Query: 125 CHKNAMCFNEIGSYSCQCRPGFTG 148
           C     C +  G Y+C+C  G+TG
Sbjct: 7   CSNGGTCVDTPGGYTCECPEGYTG 30


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 29.8 bits (67), Expect = 0.14
 Identities = 12/24 (50%), Positives = 15/24 (62%)

Query: 128 NAMCFNEIGSYSCQCRPGFTGNGH 151
           N  C N  GSY+C C PG+TG+  
Sbjct: 10  NGTCINTPGSYTCSCPPGYTGDKR 33



 Score = 29.4 bits (66), Expect = 0.20
 Identities = 14/27 (51%), Positives = 17/27 (62%), Gaps = 1/27 (3%)

Query: 226 NADCINRPGTYQCQCKRGFSGDGFNCE 252
           N  CIN PG+Y C C  G++GD   CE
Sbjct: 10  NGTCINTPGSYTCSCPPGYTGDK-RCE 35


>gnl|CDD|221695 pfam12662, cEGF, Complement C1r-like EGF-like.  cEGF, or complement
           C1r-like EGF, domains have six conserved cysteine
           residues disulfide-bonded into the characteristic
           pattern 'ababcc'. They are found in blood coagulation
           proteins such as fibrillin, C1r and C1s, thrombomodulin,
           and the LDL receptor. The core fold of the EGF domain
           consists of two small beta-hairpins packed against each
           other. Two major structural variants have been
           identified based on the structural context of the
           C-terminal cysteine residue of disulfide 'c' in the
           C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
           C-terminal thiol resides on the C-terminal beta-sheet,
           resulting in long loop-lengths between the cysteine
           residues of disulfide 'c', typically C[10+]XC. These
           longer loop-lengths may have arisen by selective
           cysteine loss from a four-disulfide EGF template such as
           laminin or integrin. Tandem cEGF domains have five
           linking residues between terminal cysteines of adjacent
           domains. cEGF domains may or may not bind calcium in the
           linker region. cEGF domains with the consensus motif
           CXN4X[F,Y]XCXC are hydroxylated exclusively on the
           asparagine residue.
          Length = 24

 Score = 28.6 bits (65), Expect = 0.24
 Identities = 10/22 (45%), Positives = 14/22 (63%), Gaps = 2/22 (9%)

Query: 137 SYSCQCRPGFT--GNGHQCTEI 156
           SY+C C PG+   G+G  C +I
Sbjct: 1   SYTCSCPPGYQLSGDGRTCEDI 22



 Score = 25.1 bits (56), Expect = 4.9
 Identities = 10/19 (52%), Positives = 11/19 (57%), Gaps = 2/19 (10%)

Query: 236 YQCQCKRGF--SGDGFNCE 252
           Y C C  G+  SGDG  CE
Sbjct: 2   YTCSCPPGYQLSGDGRTCE 20



 Score = 24.3 bits (54), Expect = 9.3
 Identities = 13/27 (48%), Positives = 15/27 (55%), Gaps = 5/27 (18%)

Query: 189 TCNCDPGYQKDYLDDRRVAFVCTDVDE 215
           TC+C PGYQ     D R    C D+DE
Sbjct: 3   TCSCPPGYQLS--GDGR---TCEDIDE 24


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 32.0 bits (73), Expect = 0.27
 Identities = 12/36 (33%), Positives = 17/36 (47%), Gaps = 2/36 (5%)

Query: 209 VCTDVDECMNYPPICNNNADCINRPGTYQCQCKRGF 244
           +C   D C     +C     CI+ PG+Y C C  G+
Sbjct: 183 ICVVPDLCATLSHVCQQV--CISTPGSYLCACTEGY 216



 Score = 28.9 bits (65), Expect = 2.6
 Identities = 24/102 (23%), Positives = 39/102 (38%), Gaps = 24/102 (23%)

Query: 46  AVGVSPVLEDLQQEFTHSSTVNDDPCKNFFCVANSSCIVEDDKPTCICNRGFQQLYSEDR 105
           AVGV    E+  +E   S  + D    + F V + S I              ++L    +
Sbjct: 140 AVGVGRADEEELREIA-SEPLAD----HVFYVEDFSTI--------------EELTK--K 178

Query: 106 LQDDFGCFDINECNAGTDLCHKNAMCFNEIGSYSCQCRPGFT 147
            Q    C   + C   + +C +  +C +  GSY C C  G+ 
Sbjct: 179 FQGKI-CVVPDLCATLSHVCQQ--VCISTPGSYLCACTEGYA 217


>gnl|CDD|193419 pfam12946, EGF_MSP1_1, MSP1 EGF domain 1.  This EGF-like domain is
           found at the C-terminus of the malaria parasite MSP1
           protein. MSP1 is the merozoite surface protein 1. This
           domain is part of the C-terminal fragment that is
           proteolytically processed from the rest of the
           protein and is left attached to the surface of the
           invading parasite.
          Length = 37

 Score = 26.2 bits (58), Expect = 2.3
 Identities = 10/31 (32%), Positives = 14/31 (45%), Gaps = 1/31 (3%)

Query: 223 CNNNADCINR-PGTYQCQCKRGFSGDGFNCE 252
           C  NA C     G  +C+C  G+  +G  C 
Sbjct: 7   CPANAGCFRYLDGREECRCLLGYKKEGGKCV 37



 Score = 25.0 bits (55), Expect = 7.2
 Identities = 11/30 (36%), Positives = 15/30 (50%), Gaps = 1/30 (3%)

Query: 125 CHKNAMCFNEI-GSYSCQCRPGFTGNGHQC 153
           C  NA CF  + G   C+C  G+   G +C
Sbjct: 7   CPANAGCFRYLDGREECRCLLGYKKEGGKC 36


>gnl|CDD|239562 cd03480, Rieske_RO_Alpha_PaO, Rieske non-heme iron oxygenase (RO)
           family, Pheophorbide a oxygenase (PaO) subfamily,
           N-terminal Rieske domain of the oxygenase alpha subunit;
           composed of the oxygenase alpha subunits of a small
           subfamily of enzymes found in plants as well as oxygenic
           cyanobacterial photosynthesizers including LLS1 (lethal
           leaf spot 1, also known as PaO) and ACD1 (accelerated
           cell death 1). ROs comprise a large class of aromatic
           ring-hydroxylating dioxygenases that enable
           microorganisms to tolerate and utilize aromatic
           compounds for growth. The oxygenase alpha subunit
           contains an N-terminal Rieske domain with a [2Fe-2S]
           cluster and a C-terminal catalytic domain with a
           mononuclear Fe(II) binding site. The Rieske [2Fe-2S]
           cluster accepts electrons from a reductase or ferredoxin
           component and transfers them to the mononuclear iron for
           catalysis. PaO expression increases upon physical
           wounding of plant leaves and is thought to catalyze a
           key step in chlorophyll degradation. The
           Arabidopsis-accelerated cell death gene ACD1 is involved
           in oxygenation of PaO.
          Length = 138

 Score = 28.1 bits (63), Expect = 2.8
 Identities = 11/29 (37%), Positives = 13/29 (44%), Gaps = 3/29 (10%)

Query: 146 FTGNGHQCTEITVPQTGPTSPCESDPRAC 174
           F G+G  C  I  PQ        + PRAC
Sbjct: 87  FDGSG-SCQRI--PQAAEGGKAHTSPRAC 112


>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
           Pvs28.  This family consists of several ookinete surface
           proteins (Pvs28) from several species of Plasmodium.
           Pvs25 and Pvs28 are expressed on the surface of
           ookinetes. These proteins are potential candidates for
           vaccine and induce antibodies that block the infectivity
           of Plasmodium vivax in immunised animals.
          Length = 196

 Score = 27.4 bits (61), Expect = 6.4
 Identities = 46/168 (27%), Positives = 66/168 (39%), Gaps = 36/168 (21%)

Query: 91  CICNRGFQQLYSEDRLQDDFGCFDINECNAGTDL---CHKNAMCFN-----EIGSYSCQC 142
           C CN G+  L +E+       C +  +C+   ++   C + A C N     E  +  C C
Sbjct: 22  CKCNEGYV-LKNENT------CEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGC 74

Query: 143 RPGFTGNGHQCTEITVPQTGPTSPCESDPRACNPPHSTCTNLTDYRTCNCDPGYQKDYLD 202
             G+T +   C    VP       C S     +P +   T      TC+C+ G  K    
Sbjct: 75  INGYTLSQGVC----VPNKCNNKVCGSGKCIVDPANPNNT------TCSCNIG--KVPDQ 122

Query: 203 DRRVAFVCTDVDE--CMNYPPICNNNADCINRPGTYQCQCKRGFSGDG 248
           + +    CT   E  C      C  N +C    G Y+C CK GF GDG
Sbjct: 123 NGK----CTKTGETKCSLK---CKENEECKLVGGYYECVCKEGFPGDG 163



 Score = 27.4 bits (61), Expect = 7.3
 Identities = 25/95 (26%), Positives = 35/95 (36%), Gaps = 12/95 (12%)

Query: 71  CKNFFCVANSSCIVE---DDKPTCICNRGFQQLYSEDRLQDDFGCFDINECNAGTDLCHK 127
           C N  C  +  CIV+    +  TC CN G        ++ D  G          +  C +
Sbjct: 90  CNNKVC-GSGKCIVDPANPNNTTCSCNIG--------KVPDQNGKCTKTGETKCSLKCKE 140

Query: 128 NAMCFNEIGSYSCQCRPGFTGNGHQCTEITVPQTG 162
           N  C    G Y C C+ GF G+G        P + 
Sbjct: 141 NEECKLVGGYYECVCKEGFPGDGGGTGSGGPPTSS 175


>gnl|CDD|165214 PHA02887, PHA02887, EGF-like protein; Provisional.
          Length = 126

 Score = 26.8 bits (59), Expect = 8.1
 Identities = 16/45 (35%), Positives = 26/45 (57%), Gaps = 6/45 (13%)

Query: 58  QEFTHSSTVNDDPCK---NFFCVANSSC--IVEDDKPTCICNRGF 97
           Q F   +++  + CK   N FC+ N  C  I++ D+  CICN+G+
Sbjct: 73  QNFKRKNSMFFEKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGY 116


>gnl|CDD|114473 pfam05749, Rubella_E2, Rubella membrane glycoprotein E2.  Rubella
           virus (RV), the sole member of the genus Rubivirus
           within the family Togaviridae, is a small enveloped,
           positive strand RNA virus. The nucleocapsid consists of
           40S genomic RNA and a single species of capsid protein
           which is enveloped within a host-derived lipid bilayer
           containing two viral glycoproteins, E1 (58 kDa) and E2
           (42-46 kDa). In virus infected cells, RV matures by
           budding either at the plasma membrane, or at the
           internal membranes depending on the cell type and enters
           adjacent uninfected cells by a membrane fusion process
           in the endosome, directed by E1-E2 heterodimers. The
           heterodimer formation is crucial for E1 transport out of
           the endoplasmic reticulum to the Golgi and plasma
           membrane. In RV E1, a cysteine at position 82 is crucial
           for the E1-E2 heterodimer formation and cell surface
           expression of the two proteins.
          Length = 267

 Score = 27.4 bits (60), Expect = 9.2
 Identities = 10/27 (37%), Positives = 18/27 (66%)

Query: 106 LQDDFGCFDINECNAGTDLCHKNAMCF 132
           LQ  +GC+++++ + GT +CH   M F
Sbjct: 63  LQGGWGCYNLSDWHQGTHVCHTKHMDF 89


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.321    0.138    0.458 

Gapped
Lambda     K      H
   0.267   0.0665    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 13,923,192
Number of extensions: 1229664
Number of successful extensions: 772
Number of sequences better than 10.0: 1
Number of HSP's gapped: 762
Number of HSP's successfully gapped: 55
Length of query: 286
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 190
Effective length of database: 6,679,618
Effective search space: 1269127420
Effective search space used: 1269127420
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.9 bits)
S2: 58 (26.3 bits)