RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy10559
         (418 letters)



>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
          large number of membrane-bound and extracellular
          (mostly animal) proteins. Many of these proteins
          require calcium for their biological function and
          calcium-binding sites have been found to be located at
          the N-terminus of particular EGF-like domains;
          calcium-binding may be crucial for numerous
          protein-protein interactions. Six conserved core
          cysteines form three disulfide bridges as in non
          calcium-binding EGF domains, whose structures are very
          similar. EGF_CA can be found in tandem repeat
          arrangements.
          Length = 38

 Score = 33.8 bits (78), Expect = 0.007
 Identities = 12/31 (38%), Positives = 17/31 (54%), Gaps = 1/31 (3%)

Query: 42 PNSCSNHGQCRVNSDSQWECKCSDGWDGKDC 72
           N C N G C VN+   + C C  G+ G++C
Sbjct: 8  GNPCQNGGTC-VNTVGSYRCSCPPGYTGRNC 37



 Score = 30.3 bits (69), Expect = 0.12
 Identities = 9/28 (32%), Positives = 12/28 (42%), Gaps = 4/28 (14%)

Query: 14 CNEHGQCKNG----TCLCVTGWNGKHCT 37
          C   G C N      C C  G+ G++C 
Sbjct: 11 CQNGGTCVNTVGSYRCSCPPGYTGRNCE 38


>gnl|CDD|113629 pfam04863, EGF_alliinase, Alliinase EGF-like domain.  Allicin is
          a thiosulphinate that gives rise to dithiines, allyl
          sulphides and ajoenes, the three groups of active
          compounds in Allium species. Allicin is synthesised
          from sulfoxide cysteine derivatives by alliinase
          (EC:4.4.1.4), whose C-S lyase activity cleaves
          C(beta)-S(gamma) bonds. It is thought that this enzyme
          forms part of a primitive plant defence system. This
          family represents the N-terminal EGF-like domain.
          Length = 56

 Score = 33.6 bits (77), Expect = 0.014
 Identities = 16/40 (40%), Positives = 24/40 (60%), Gaps = 4/40 (10%)

Query: 44 SCSNHGQCRVN---SDSQWECKCSDGWDGKDCSVLLEQNC 80
          +CS HG+  ++   SD    C+C+  + G DCSVL+  NC
Sbjct: 18 NCSGHGRAFLDGIISDGSPICECNTCYTGPDCSVLI-PNC 56


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 32.4 bits (74), Expect = 0.020
 Identities = 12/28 (42%), Positives = 18/28 (64%), Gaps = 1/28 (3%)

Query: 43 NSCSNHGQCRVNSDSQWECKCSDGWDGK 70
          N CSN G C V++   + C+C +G+ GK
Sbjct: 5  NPCSNGGTC-VDTPGGYTCECPEGYTGK 31



 Score = 27.0 bits (60), Expect = 1.7
 Identities = 9/27 (33%), Positives = 12/27 (44%), Gaps = 4/27 (14%)

Query: 13 WCNEHGQCKNG----TCLCVTGWNGKH 35
           C+  G C +     TC C  G+ GK 
Sbjct: 6  PCSNGGTCVDTPGGYTCECPEGYTGKR 32


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
          growth factor (EGF) and present in a large number of
          proteins, mostly animal; the list of proteins currently
          known to contain one or more copies of an EGF-like
          pattern is large and varied; the functional
          significance of EGF-like domains in what appear to be
          unrelated proteins is not yet clear; a common feature
          is that these repeats are found in the extracellular
          domain of membrane-bound proteins or in proteins known
          to be secreted (exception: prostaglandin G/H synthase);
          the domain includes six cysteine residues which have
          been shown to be involved in disulfide bonds; the main
          structure is a two-stranded beta-sheet followed by a
          loop to a C-terminal short two-stranded sheet;
          Subdomains between the conserved cysteines vary in
          length; the region between the 5th and 6th cysteine
          contains two conserved glycines of which at least one
          is present in most EGF-like domains; a subset of
          these bind calcium.
          Length = 36

 Score = 31.7 bits (72), Expect = 0.045
 Identities = 12/29 (41%), Positives = 15/29 (51%), Gaps = 1/29 (3%)

Query: 42 PNSCSNHGQCRVNSDSQWECKCSDGWDGK 70
           N CSN G C VN+   + C C  G+ G 
Sbjct: 5  SNPCSNGGTC-VNTPGSYRCVCPPGYTGD 32



 Score = 25.5 bits (56), Expect = 6.3
 Identities = 9/29 (31%), Positives = 13/29 (44%), Gaps = 5/29 (17%)

Query: 14 CNEHGQCKNG----TCLCVTGWNG-KHCT 37
          C+  G C N      C+C  G+ G + C 
Sbjct: 8  CSNGGTCVNTPGSYRCVCPPGYTGDRSCE 36


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 31.4 bits (72), Expect = 0.052
 Identities = 13/32 (40%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 42 PNSCSNHGQCRVNSDSQWECKCSDGW-DGKDC 72
           N C N G C VN+   + C+C  G+ DG++C
Sbjct: 8  GNPCQNGGTC-VNTVGSYRCECPPGYTDGRNC 38


>gnl|CDD|219677 pfam07974, EGF_2, EGF-like domain.  This family contains EGF
          domains found in a variety of extracellular proteins.
          Length = 31

 Score = 31.3 bits (71), Expect = 0.060
 Identities = 8/31 (25%), Positives = 11/31 (35%), Gaps = 3/31 (9%)

Query: 42 PNSCSNHGQCRVNSDSQWECKCSDGWDGKDC 72
             C+  G C        +C C  G+ G  C
Sbjct: 4  SGICNGRGTCV---RPCGKCVCDSGYQGATC 31



 Score = 30.1 bits (68), Expect = 0.15
 Identities = 10/25 (40%), Positives = 13/25 (52%), Gaps = 2/25 (8%)

Query: 14 CNEHGQC--KNGTCLCVTGWNGKHC 36
          CN  G C    G C+C +G+ G  C
Sbjct: 7  CNGRGTCVRPCGKCVCDSGYQGATC 31


>gnl|CDD|222268 pfam13620, CarboxypepD_reg, Carboxypeptidase regulatory-like
           domain. 
          Length = 81

 Score = 30.7 bits (70), Expect = 0.22
 Identities = 21/82 (25%), Positives = 29/82 (35%), Gaps = 5/82 (6%)

Query: 162 VVRGRVVTSMGMGLVGVRVS---TSTPLEGFTLTRDDGWFDLLVNGGGAVTLQFGRSPFK 218
            + G V  + G  + G  V+     T     T T  DG F L     G  TL      +K
Sbjct: 1   TISGTVTDASGAPIPGATVTLTNADTGTVRGTTTDADGRFSLTGLPPGTYTLTVSAPGYK 60

Query: 219 PHNH-IVHVPWNEVVIIDTITM 239
                 V V   +   +D IT+
Sbjct: 61  SQTVKDVTVTAGQTTTLD-ITL 81


>gnl|CDD|220751 pfam10433, MMS1_N, Mono-functional DNA-alkylating methyl
           methanesulfonate N-term.  MMS1 is a protein that
           protects against replication-dependent DNA damage in
           Saccharomyces cerevisiae. MMS1 belongs to the DDB1
           family of cullin 4 adaptors and the two proteins are
           homologous. MMS1 bridges the interaction of MMS22 and
           Crt10 with Cul8/Rtt101. Cul8/Rtt101 is a cullin protein
           involved in the regulation of DNA replication subsequent
           to DNA damage. The N-terminal region of MMS1 and the
           C-terminal of MMS22 are required for the MMS1-MMS22
           interaction. The human HIV-1 virion-associated protein
           Vpr assembles with DDB1 through interaction with DCAF1
           (chromatin assembly factor) to form an E3 ubiquitin
           ligase that targets cellular substrates for
           proteasome-mediated degradation and subsequent G2
           arrest.
          Length = 513

 Score = 32.7 bits (75), Expect = 0.35
 Identities = 13/50 (26%), Positives = 17/50 (34%)

Query: 306 TSASIFYQIGRLWQVTLKGKGLVKLLRMGRKYLTPSSNISCASSNPPQVV 355
           T A+     G + QVT     L  L             I+ AS N   V+
Sbjct: 422 TLAAGNTSDGVIIQVTENSIRLSDLELGKITDEWSDEIITAASVNGSLVL 471


>gnl|CDD|221873 pfam12955, DUF3844, Domain of unknown function (DUF3844).  This
          presumed domain is found in fungal species. It contains
          8 largely conserved cysteine residues. This domain is
          found in proteins that are thought to be found in the
          endoplasmic reticulum.
          Length = 103

 Score = 30.7 bits (70), Expect = 0.39
 Identities = 15/48 (31%), Positives = 17/48 (35%), Gaps = 17/48 (35%)

Query: 43 NSCSNHGQCRVNSDSQ----WECKCSD-------------GWDGKDCS 73
          NSCS HG C   S S+    + CKC                W G  C 
Sbjct: 13 NSCSGHGSCVKKSKSKGGDCYACKCKPTVVRTGSDKGKTTRWGGPACQ 60


>gnl|CDD|216650 pfam01696, Adeno_E1B_55K, Adenovirus E1B 55K protein / large
           t-antigen.  This family consists of adenovirus E1B 55K
           protein or large t-antigen. E1B 55K binds p53, the tumour
           suppressor protein, converting it from a transcriptional
           activator which responds to damaged DNA into an
           unregulated repressor of genes with a p53 binding site.
           This protects the virus against p53 induced host
           antiviral responses and prevents apoptosis as induced by
           the adenovirus E1A protein. The E1B region of adenovirus
           encodes two proteins E1B 55K the large t-antigen as
           found in this family and E1B 19K pfam01691 the small
           t-antigen which is not found in this family; both of
           these proteins inhibit E1A induced apoptosis. This
           family shows distant similarities to the pectate lyase
           superfamily.
          Length = 387

 Score = 31.1 bits (71), Expect = 1.0
 Identities = 16/91 (17%), Positives = 26/91 (28%), Gaps = 22/91 (24%)

Query: 34  KHCTLEGCPNSCSNHGQCRVNSDSQWECKCSDGWDGKDCSVLLEQ----------NCNDG 83
           K C  E C     + G  R+          S+      C VLL+             +  
Sbjct: 194 KKCVFEKCVLGIISEGDARIRH-----NAASE----TGCFVLLKGTGSIKHNSICGPSTL 244

Query: 84  KDNDKDGLVDCEDPEC---CSNHICRSSQLC 111
            D+    ++ C D       + HI    +  
Sbjct: 245 PDSMDFQMLTCADGNVHPLKTVHIVSHPRKK 275


>gnl|CDD|188502 TIGR03987, TIGR03987, TIGR03987 family protein.  Conserved
           hypothetical protein.
          Length = 120

 Score = 29.6 bits (67), Expect = 1.1
 Identities = 13/40 (32%), Positives = 16/40 (40%), Gaps = 10/40 (25%)

Query: 299 IMPTVLITSASIFYQIG----------RLWQVTLKGKGLV 328
           I   + IT A +FY IG          + W V     GLV
Sbjct: 2   IFAIIFITLALVFYTIGVFSERKSGTLKKWHVIFFWLGLV 41


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
          EGF-like domain homologues. This family includes the
          C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 26.0 bits (58), Expect = 4.1
 Identities = 9/30 (30%), Positives = 12/30 (40%), Gaps = 3/30 (10%)

Query: 45 CSNHGQCRVNSDSQWECKCSDG--WDGKDC 72
          C  +  C  N+   + C C  G   DG  C
Sbjct: 8  CHPNATCT-NTGGSFTCTCKSGYTGDGVTC 36


>gnl|CDD|218955 pfam06247, Plasmod_Pvs28, Plasmodium ookinete surface protein
           Pvs28.  This family consists of several ookinete surface
           proteins (Pvs28) from several species of Plasmodium.
           Pvs25 and Pvs28 are expressed on the surface of
           ookinetes. These proteins are potential candidates for
           vaccine and induce antibodies that block the infectivity
           of Plasmodium vivax in immunised animals.
          Length = 196

 Score = 28.6 bits (64), Expect = 4.3
 Identities = 22/102 (21%), Positives = 37/102 (36%), Gaps = 20/102 (19%)

Query: 14  CNEHGQCKNG---------TCLCVTGW--NGKHCTLEGCPNSCSNHGQCRVNSDSQWECK 62
           C E+  C N           C C+ G+  +   C    C N     G+C V+  +     
Sbjct: 52  CGEYATCINQANKAEEKALKCGCINGYTLSQGVCVPNKCNNKVCGSGKCIVDPANPNNTT 111

Query: 63  CSDGWDGKDCSV--LLEQNCNDGKDNDKDGLVDCEDPECCSN 102
           CS       C++  + +QN    K  +    + C++ E C  
Sbjct: 112 CS-------CNIGKVPDQNGKCTKTGETKCSLKCKENEECKL 146


>gnl|CDD|224883 COG1972, NupC, Nucleoside permease [Nucleotide transport and
           metabolism].
          Length = 404

 Score = 29.1 bits (66), Expect = 5.1
 Identities = 14/42 (33%), Positives = 21/42 (50%), Gaps = 3/42 (7%)

Query: 297 LAIMPTVLITSA--SIFYQIGRL-WQVTLKGKGLVKLLRMGR 335
             ++P ++  SA  SI Y IG L   + + G GL K L   +
Sbjct: 92  FRVLPPIIFISALISILYYIGILPLVIRIIGGGLQKALGTSK 133


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 25.8 bits (57), Expect = 6.5
 Identities = 12/35 (34%), Positives = 19/35 (54%), Gaps = 5/35 (14%)

Query: 42 PNSCSNHGQCRVNSDSQWECKCSDGW----DGKDC 72
           ++C  +  C VN+   +EC C DG+    DG +C
Sbjct: 9  THNCPANTVC-VNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|173909 cd00818, IleRS_core, catalytic core domain of isoleucyl-tRNA
           synthetases.  Isoleucine amino-acyl tRNA synthetases
           (IleRS) catalytic core domain. This class I enzyme is a
           monomer which aminoacylates the 2'-OH of the nucleotide
           at the 3' of the appropriate tRNA. The core domain is
           based on the Rossmann fold and is responsible for the
           ATP-dependent formation of the enzyme bound
           aminoacyl-adenylate. It contains the characteristic
           class I HIGH and KMSKS motifs, which are involved in ATP
           binding.  IleRS has an insertion in the core domain,
           which is subject to both deletions and rearrangements.
           This editing region hydrolyzes mischarged cognate tRNAs
           and thus prevents the incorporation of chemically
           similar amino acids.
          Length = 338

 Score = 28.4 bits (64), Expect = 8.4
 Identities = 19/41 (46%), Positives = 23/41 (56%), Gaps = 9/41 (21%)

Query: 186 LEGFTLTRDDGWF-DLLVNGGGAVTLQFGRSPFKPHNHIVH 225
           LEG   TR  GWF  LL+      T  FG++P+K  N IVH
Sbjct: 257 LEGSDQTR--GWFYSLLLLS----TALFGKAPYK--NVIVH 289


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.320    0.135    0.431 

Gapped
Lambda     K      H
   0.267   0.0632    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 20,294,865
Number of extensions: 1865775
Number of successful extensions: 1365
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1356
Number of HSP's successfully gapped: 32
Length of query: 418
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 319
Effective length of database: 6,546,556
Effective search space: 2088351364
Effective search space used: 2088351364
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 60 (27.1 bits)