RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11798
         (245 letters)



>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like.  cEGF, or
          complement Clr-like EGF, domains have six conserved
          cysteine residues disulfide-bonded into the
          characteristic pattern 'ababcc'. They are found in
          blood coagulation proteins such as fibrillin, Clr and
          Cls, thrombomodulin, and the LDL receptor. The core
          fold of the EGF domain consists of two small
          beta-hairpins packed against each other. Two major
          structural variants have been identified based on the
          structural context of the C-terminal cysteine residue
          of disulfide 'c' in the C-terminal hairpin: hEGFs and
          cEGFs. In cEGFs the C-terminal thiol resides on the
          C-terminal beta-sheet, resulting in long loop-lengths
          between the cysteine residues of disulfide 'c',
          typically C[10+]XC. These longer loop-lengths may have
          arisen by selective cysteine loss from a four-disulfide
          EGF template such as laminin or integrin. Tandem cEGF
          domains have five linking residues between terminal
          cysteines of adjacent domains. cEGF domains may or may
          not bind calcium in the linker region. cEGF domains
          with the consensus motif CXN4X[F,Y]XCXC are
          hydroxylated exclusively on the asparagine residue.
          Length = 24

 Score = 45.1 bits (108), Expect = 3e-07
 Identities = 14/24 (58%), Positives = 17/24 (70%)

Query: 40 SFRCICPYGYALAPDGRHCIDINE 63
          S+ C CP GY L+ DGR C DI+E
Sbjct: 1  SYTCSCPPGYQLSGDGRTCEDIDE 24



 Score = 34.4 bits (80), Expect = 0.002
 Identities = 10/20 (50%), Positives = 13/20 (65%)

Query: 84  TCECPEGFMLSPNGMKCIDV 103
           TC CP G+ LS +G  C D+
Sbjct: 3   TCSCPPGYQLSGDGRTCEDI 22


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 42.6 bits (101), Expect = 3e-06
 Identities = 19/42 (45%), Positives = 23/42 (54%), Gaps = 4/42 (9%)

Query: 19 DINECLELSN-QCAFRCHNVPGSFRCICPYGYALAPDGRHCI 59
          DI+EC   +  Q    C N  GS+RC CP GY    DGR+C 
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39



 Score = 36.5 bits (85), Expect = 5e-04
 Identities = 16/43 (37%), Positives = 22/43 (51%), Gaps = 5/43 (11%)

Query: 60  DINECKENEGICEDG-KCINIAGGVTCECPEGFMLSPNGMKCI 101
           DI+EC      C++G  C+N  G   CECP G+    +G  C 
Sbjct: 1   DIDECASG-NPCQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 40.4 bits (95), Expect = 2e-05
 Identities = 17/42 (40%), Positives = 24/42 (57%), Gaps = 2/42 (4%)

Query: 19 DINECLELSNQC--AFRCHNVPGSFRCICPYGYALAPDGRHC 58
          D++EC + ++ C     C N  GSF C+CP GY    DG +C
Sbjct: 1  DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42



 Score = 38.1 bits (89), Expect = 1e-04
 Identities = 13/42 (30%), Positives = 22/42 (52%), Gaps = 1/42 (2%)

Query: 60  DINECKENEGIC-EDGKCINIAGGVTCECPEGFMLSPNGMKC 100
           D++EC +    C  +  C+N  G   C CP+G+  + +G  C
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
          large number of membrane-bound and extracellular
          (mostly animal) proteins. Many of these proteins
          require calcium for their biological function and
          calcium-binding sites have been found to be located at
          the N-terminus of particular EGF-like domains;
          calcium-binding may be crucial for numerous
          protein-protein interactions. Six conserved core
          cysteines form three disulfide bridges as in non
          calcium-binding EGF domains, whose structures are very
          similar. EGF_CA can be found in tandem repeat
          arrangements.
          Length = 38

 Score = 38.0 bits (89), Expect = 1e-04
 Identities = 18/42 (42%), Positives = 22/42 (52%), Gaps = 5/42 (11%)

Query: 19 DINECLELSN-QCAFRCHNVPGSFRCICPYGYALAPDGRHCI 59
          DI+EC   +  Q    C N  GS+RC CP GY     GR+C 
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYT----GRNCE 38



 Score = 38.0 bits (89), Expect = 1e-04
 Identities = 12/34 (35%), Positives = 15/34 (44%)

Query: 60 DINECKENEGICEDGKCINIAGGVTCECPEGFML 93
          DI+EC         G C+N  G   C CP G+  
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34


>gnl|CDD|201391 pfam00683, TB, TB domain.  This domain is also known as the 8
           cysteine domain. This family includes the hybrid
           domains. This cysteine rich repeat is found in TGF
           binding protein and fibrillin.
          Length = 42

 Score = 37.3 bits (87), Expect = 3e-04
 Identities = 17/42 (40%), Positives = 22/42 (52%)

Query: 115 GTCTLLRKQPITVKECCCSMGQAWGRYCLPCPSPNSGEPATF 156
           G C+      +T  ECCCS+G+AWG  C PCP   + E    
Sbjct: 1   GRCSNPLPGNVTKSECCCSLGRAWGTPCEPCPVQGTAEFRQL 42


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 40.1 bits (94), Expect = 5e-04
 Identities = 17/53 (32%), Positives = 26/53 (49%)

Query: 4   TMSVTGYRLRVETCEDINECLELSNQCAFRCHNVPGSFRCICPYGYALAPDGR 56
           T+     + + + C   + C  LS+ C   C + PGS+ C C  GYAL  D +
Sbjct: 171 TIEELTKKFQGKICVVPDLCATLSHVCQQVCISTPGSYLCACTEGYALLEDNK 223



 Score = 31.6 bits (72), Expect = 0.30
 Identities = 12/43 (27%), Positives = 20/43 (46%), Gaps = 1/43 (2%)

Query: 55  GRHCIDINECKENEGICEDGKCINIAGGVTCECPEGFMLSPNG 97
           G+ C+  + C     +C+   CI+  G   C C EG+ L  + 
Sbjct: 181 GKICVVPDLCATLSHVCQQV-CISTPGSYLCACTEGYALLEDN 222


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 34.4 bits (79), Expect = 0.003
 Identities = 14/34 (41%), Positives = 16/34 (47%), Gaps = 1/34 (2%)

Query: 63 ECKENEGICEDGKCINIAGGVTCECPEGFMLSPN 96
          EC    G C +G CIN  G  TC CP G+     
Sbjct: 1  ECASG-GPCSNGTCINTPGSYTCSCPPGYTGDKR 33



 Score = 29.8 bits (67), Expect = 0.12
 Identities = 10/22 (45%), Positives = 11/22 (50%)

Query: 33 RCHNVPGSFRCICPYGYALAPD 54
           C N PGS+ C CP GY     
Sbjct: 12 TCINTPGSYTCSCPPGYTGDKR 33


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 31.7 bits (73), Expect = 0.020
 Identities = 14/38 (36%), Positives = 18/38 (47%), Gaps = 3/38 (7%)

Query: 64  CKENEGIC-EDGKCINIAGGVTCECPEGFMLSPNGMKC 100
           C EN G C  +  C N  G  TC C  G+    +G+ C
Sbjct: 1   CAENNGGCHPNATCTNTGGSFTCTCKSGYTG--DGVTC 36



 Score = 31.0 bits (71), Expect = 0.037
 Identities = 12/25 (48%), Positives = 12/25 (48%), Gaps = 2/25 (8%)

Query: 34 CHNVPGSFRCICPYGYALAPDGRHC 58
          C N  GSF C C  GY    DG  C
Sbjct: 14 CTNTGGSFTCTCKSGYTG--DGVTC 36


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
          growth factor (EGF) and present in a large number of
          proteins, mostly animal; the list of proteins currently
          known to contain one or more copies of an EGF-like
          pattern is large and varied; the functional
          significance of EGF-like domains in what appear to be
          unrelated proteins is not yet clear; a common feature
          is that these repeats are found in the extracellular
          domain of membrane-bound proteins or in proteins known
          to be secreted (exception: prostaglandin G/H synthase);
          the domain includes six cysteine residues which have
          been shown to be involved in disulfide bonds; the main
          structure is a two-stranded beta-sheet followed by a
          loop to a C-terminal short two-stranded sheet;
          Subdomains between the conserved cysteines vary in
          length; the region between the 5th and 6th cysteine
          contains two conserved glycines of which at least one
          is present in most EGF-like domains; a subset of
          these bind calcium.
          Length = 36

 Score = 30.1 bits (68), Expect = 0.088
 Identities = 16/42 (38%), Positives = 18/42 (42%), Gaps = 8/42 (19%)

Query: 17 CEDINECLELSNQCAFRCHNVPGSFRCICPYGYALAPDGRHC 58
          C   N C          C N PGS+RC+CP GY      R C
Sbjct: 2  CAASNPCSNGG-----TCVNTPGSYRCVCPPGYTGD---RSC 35



 Score = 29.4 bits (66), Expect = 0.18
 Identities = 10/34 (29%), Positives = 14/34 (41%)

Query: 63 ECKENEGICEDGKCINIAGGVTCECPEGFMLSPN 96
          EC  +      G C+N  G   C CP G+    +
Sbjct: 1  ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 28.9 bits (65), Expect = 0.22
 Identities = 13/28 (46%), Positives = 16/28 (57%)

Query: 64 CKENEGICEDGKCINIAGGVTCECPEGF 91
          C  N      G C++  GG TCECPEG+
Sbjct: 1  CSPNNPCSNGGTCVDTPGGYTCECPEGY 28


>gnl|CDD|219625 pfam07895, DUF1673, Protein of unknown function (DUF1673).  This
           family contains hypothetical proteins of unknown
           function expressed by two archaeal species.
          Length = 204

 Score = 31.2 bits (71), Expect = 0.33
 Identities = 12/75 (16%), Positives = 20/75 (26%), Gaps = 6/75 (8%)

Query: 163 GFFFVLFFLVIFWSILPIYKTIKDITILSVCAYNMKQTTLHYTNFFFLNFLFDALLEHLM 222
           G F  L   +  W    I      +    V  Y+ K+  +       L  +   L  +  
Sbjct: 87  GLFLSLLLYLFTWKKQMIRY--DALAKKPVIRYSNKKKIVRSLLVIILLLILLLLFLY-- 142

Query: 223 KEIARLFVLASLFMQ 237
                    + L  Q
Sbjct: 143 --YILGHFESLLSAQ 155


>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
           Pvs28.  This family consists of several ookinete surface
           proteins (Pvs28) from several species of Plasmodium.
           Pvs25 and Pvs28 are expressed on the surface of
           ookinetes. These proteins are potential candidates for
           vaccine and induce antibodies that block the infectivity
           of Plasmodium vivax in immunised animals.
          Length = 196

 Score = 29.3 bits (66), Expect = 1.5
 Identities = 31/110 (28%), Positives = 40/110 (36%), Gaps = 26/110 (23%)

Query: 8   TGYRLRVE-TCEDINECLELSN---------QCA-FRCHNVPGSFRCICPYGYALAPDGR 56
            GY L+ E TCE+  +C +L N          C          + +C C  GY L     
Sbjct: 26  EGYVLKNENTCEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGCINGYTL----- 80

Query: 57  HCIDINECKENEG---ICEDGKCI---NIAGGVTCECPEGFMLSPNGMKC 100
                  C  N+    +C  GKCI         TC C  G +   NG KC
Sbjct: 81  ---SQGVCVPNKCNNKVCGSGKCIVDPANPNNTTCSCNIGKVPDQNG-KC 126



 Score = 28.2 bits (63), Expect = 3.5
 Identities = 19/67 (28%), Positives = 27/67 (40%), Gaps = 10/67 (14%)

Query: 39 GSFRCICPYGYALAPDGRHCIDINECKENEGI---CED-GKCINIAGG-----VTCECPE 89
            F C C  GY L  +   C +  +C + E +   C +   CIN A       + C C  
Sbjct: 18 NHFECKCNEGYVLKNENT-CEEKVKCDKLENVNKVCGEYATCINQANKAEEKALKCGCIN 76

Query: 90 GFMLSPN 96
          G+ LS  
Sbjct: 77 GYTLSQG 83


>gnl|CDD|239585 cd03508, Delta4-sphingolipid-FADS-like, The Delta4-sphingolipid
           Fatty Acid Desaturase (Delta4-sphingolipid-FADS)-like CD
           includes the integral-membrane enzymes, dihydroceramide
           Delta-4 desaturase, involved in the synthesis of
           sphingosine; and the human membrane fatty acid (lipid)
           desaturase (MLD), reported to modulate biosynthesis of
           the epidermal growth factor receptor; and other related
           proteins. These proteins are found in various eukaryotes
           including vertebrates, higher plants, and fungi. Studies
           show that MLD is localized to the endoplasmic reticulum.
           As with other members of this superfamily, this domain
           family has extensive hydrophobic regions that would be
           capable of spanning the membrane bilayer at least twice.
           Comparison of sequences also reveals the existence of
           three regions of conserved histidine cluster motifs that
           contain eight histidine residues: HXXXH, HXXHH, and
           HXXHH. These histidine residues are reported to be
           catalytically essential and proposed to be the ligands
           for the iron atoms contained within the homolog,
           stearoyl CoA desaturase.
          Length = 289

 Score = 29.1 bits (66), Expect = 1.9
 Identities = 13/83 (15%), Positives = 26/83 (31%), Gaps = 14/83 (16%)

Query: 154 ATFWSHYPKGFFFVLFFLVIFWSILPIYKTIKDITILSVCAYNMKQTTLHYTNFFFLNFL 213
              +S       +V      F+++ P++   K  T             L   N       
Sbjct: 123 GKLFSTVLGKAIWV-TLQPFFYALRPLFVRPKPPTR------------LEVINIVV-QIT 168

Query: 214 FDALLEHLMKEIARLFVLASLFM 236
           FD L+ +     +  ++L   F+
Sbjct: 169 FDYLIYYFFGWKSLAYLLLGSFL 191


>gnl|CDD|226137 COG3610, COG3610, Uncharacterized conserved protein [Function
           unknown].
          Length = 156

 Score = 28.4 bits (64), Expect = 2.0
 Identities = 15/97 (15%), Positives = 27/97 (27%), Gaps = 18/97 (18%)

Query: 153 PATFWSHYPKGFFFVLFFLVIF---WSILPIYKTIKDITILSVCAYNMKQTTLHYTNF-- 207
                      F   + F ++F      LPI         L    + +      +  F  
Sbjct: 3   LLMLLLDMLFAFIATVGFAIVFNVPPRALPI------CGFLGALGWVVYYLLGKHFGFSI 56

Query: 208 ----FFLNFLFDAL---LEHLMKEIARLFVLASLFMQ 237
               F   F+   L   L    K  A++F + ++   
Sbjct: 57  VVATFIAAFVVGCLGNLLSRRYKTPAKVFTVPAIIPL 93


>gnl|CDD|119297 pfam10777, YlaC, sigma70 family sigma factor YlaC.  Members of the
           sigma70 family of sigma factors are components of the
           RNA polymerase holoenzyme that direct bacterial or
           plastid core RNA polymerase to specific promoter
           elements. This domain is an inner membrane protein of
           unknown function.
          Length = 156

 Score = 28.1 bits (63), Expect = 2.8
 Identities = 13/62 (20%), Positives = 28/62 (45%), Gaps = 12/62 (19%)

Query: 165 FFVLFFLVIFWSILPIYKTIKDITILS--VCAYNMKQTTLHYTNFFFLNFLFDALLEHLM 222
            F++    +F+ I P+Y+  +DI +L   VC YN          ++    +   L++ ++
Sbjct: 70  LFIVMNAFLFFDIKPVYR-FEDIDVLDLRVC-YN--------GEWYNTRAVSQQLIDEIL 119

Query: 223 KE 224
             
Sbjct: 120 NS 121


>gnl|CDD|233335 TIGR01271, CFTR_protein, cystic fibrosis transmembrane conductance
           regulator (CFTR).  The model describes the cystic
           fibrosis transmembrane conductance regulator (CFTR) in
           eukaryotes. The principal role of this protein is
           chloride ion conductance. The protein is predicted to
           consist of 12 transmembrane domains. Mutations or
           lesions in the genetic loci have been linked to the
           aetiology of asthma, bronchiectasis, chronic obstructive
           pulmonary disease etc. Disease-causing mutations have
           been studied by 36Cl efflux assays in vitro cell
           cultures and electrophysiology, all of which point to
           the impairment of chloride channel stability and not the
           biosynthetic processing per se [Transport and binding
           proteins, Anions].
          Length = 1490

 Score = 28.3 bits (63), Expect = 4.1
 Identities = 14/40 (35%), Positives = 21/40 (52%), Gaps = 3/40 (7%)

Query: 160 YPKGFFFVLFFLVIFWSILP--IYKTIKDITILSVCAYNM 197
           Y   FFF  FF V+F S++P  + K I    I +  +Y +
Sbjct: 307 YSSAFFFSGFF-VVFLSVVPYALIKGIILRRIFTTISYCI 345


>gnl|CDD|200519 cd11258, Sema_4C, The Sema domain, a protein interacting module, of
           semaphorin 4C (Sema4C).  Sema4C acts as a Plexin B2
           ligand to regulate the development of cerebellar granule
           cells and to modulate ureteric branching in the
           developing kidney. The binding of Sema4C to Plexin B2
           results in the phosphorylation of downstream regulator
           ErbB-2 and the plexin protein itself. The cytoplasmic
           region of Sema4C binds a neurite-outgrowth-related
           protein SFAP75, suggesting that Sema4C may also play a
           role in neural function. Sema4C belongs to the class 4
           transmembrane semaphorin family of proteins. Semaphorins
           are regulatory molecules in the development of the
           nervous system and in axonal guidance. They also play
           important roles in other biological processes, such as
           angiogenesis, immune regulation, respiration systems and
           cancer. The Sema domain is located at the N-terminus and
           contains four disulfide bonds formed by eight conserved
           cysteine residues. It serves as a receptor-recognition
           and -binding module.
          Length = 458

 Score = 27.8 bits (62), Expect = 6.2
 Identities = 11/23 (47%), Positives = 11/23 (47%)

Query: 136 QAWGRYCLPCPSPNSGEPATFWS 158
           Q WGRY  P PSP  G     W 
Sbjct: 312 QKWGRYTDPVPSPRPGSCINNWH 334


>gnl|CDD|206444 pfam14276, DUF4363, Domain of unknown function (DUF4363).  This
           family of proteins is found in bacteria. Proteins in
           this family are approximately 120 amino acids in length.
          Length = 121

 Score = 26.8 bits (60), Expect = 6.4
 Identities = 4/32 (12%), Positives = 10/32 (31%)

Query: 164 FFFVLFFLVIFWSILPIYKTIKDITILSVCAY 195
             F+L  L+  +S   +  +   +        
Sbjct: 5   LIFILIILLGVFSNNYLNTSCDKLEEKLEKIE 36


>gnl|CDD|149504 pfam08475, Baculo_VP91_N, Viral capsid protein 91 N-terminal.  This
           domain is found in Baculoviridae including the
           nucleopolyhedrovirus at the N-terminus of the viral
           capsid protein 91 (VP91).
          Length = 185

 Score = 26.8 bits (60), Expect = 7.7
 Identities = 9/30 (30%), Positives = 11/30 (36%), Gaps = 2/30 (6%)

Query: 81  GGVTCECPEGFMLSPNGMKCIDVRQDVCYD 110
           G V  ECP       N ++C  V    C  
Sbjct: 122 GWVEMECPANERFDGNQLQC--VPIPPCDG 149


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.328    0.141    0.478 

Gapped
Lambda     K      H
   0.267   0.0678    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 12,148,719
Number of extensions: 1113405
Number of successful extensions: 1864
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1853
Number of HSP's successfully gapped: 69
Length of query: 245
Length of database: 10,937,602
Length adjustment: 94
Effective length of query: 151
Effective length of database: 6,768,326
Effective search space: 1022017226
Effective search space used: 1022017226
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (21.7 bits)
S2: 58 (26.2 bits)