RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11864
         (193 letters)



>gnl|CDD|215916 pfam00431, CUB, CUB domain. 
          Length = 110

 Score = 99.7 bits (249), Expect = 2e-27
 Identities = 37/72 (51%), Positives = 52/72 (72%)

Query: 1   VALKFQSFEIENHDQCTYDYVEIRDGHAPDSPIIGTYCGYKLPPDIKSSGTKLMIKFVSD 60
           ++L FQ F++E+HD+C YDYVEIRDG    SP++G +CG   P DI+S+  ++ IKFVSD
Sbjct: 39  ISLTFQDFDLEDHDECGYDYVEIRDGLPSSSPLLGRFCGSGPPEDIRSTSNQMTIKFVSD 98

Query: 61  GSVQKPGFSAIF 72
            S+ K GF A +
Sbjct: 99  SSISKRGFKATY 110



 Score = 75.0 bits (185), Expect = 6e-18
 Identities = 31/74 (41%), Positives = 43/74 (58%), Gaps = 4/74 (5%)

Query: 118 CGGILNTPNGTLTSPSFPDLYIKNKTCIWEIVAPPQYRISLNFTHFDIEGNNLFQSSCEY 177
           CGG+L   +G++TSP++P+ Y  NK C+W I APP YRISL F  FD+E        C Y
Sbjct: 1   CGGVLTESSGSITSPNYPNSYPPNKDCVWTIRAPPGYRISLTFQDFDLED----HDECGY 56

Query: 178 DNLTVFSKIGDSFK 191
           D + +   +  S  
Sbjct: 57  DYVEIRDGLPSSSP 70


>gnl|CDD|238001 cd00041, CUB, CUB domain; extracellular domain; present in proteins
           mostly known to be involved in development; not found in
           prokaryotes, plants and yeast.
          Length = 113

 Score = 95.2 bits (237), Expect = 1e-25
 Identities = 33/72 (45%), Positives = 47/72 (65%)

Query: 1   VALKFQSFEIENHDQCTYDYVEIRDGHAPDSPIIGTYCGYKLPPDIKSSGTKLMIKFVSD 60
           + L F+ F++E+   C+YDY+EI DG +  SP++G +CG  LPP I SSG  L ++F SD
Sbjct: 40  IRLTFEDFDLESSPNCSYDYLEIYDGPSTSSPLLGRFCGSTLPPPIISSGNSLTVRFRSD 99

Query: 61  GSVQKPGFSAIF 72
            SV   GF A +
Sbjct: 100 SSVTGRGFKATY 111



 Score = 77.8 bits (192), Expect = 5e-19
 Identities = 32/77 (41%), Positives = 44/77 (57%), Gaps = 7/77 (9%)

Query: 118 CGGILNTP-NGTLTSPSFPDLYIKNKTCIWEIVAPPQYRISLNFTHFDIEGNNLFQSSCE 176
           CGG L    +GT++SP++P+ Y  N  C+W I APP YRI L F  FD+E +     +C 
Sbjct: 1   CGGTLTASTSGTISSPNYPNNYPNNLNCVWTIEAPPGYRIRLTFEDFDLESSP----NCS 56

Query: 177 YDNLTVFSKIGDSFKSQ 193
           YD L ++   G S  S 
Sbjct: 57  YDYLEIYD--GPSTSSP 71


>gnl|CDD|214483 smart00042, CUB, Domain first found in C1r, C1s, uEGF, and bone
           morphogenetic protein.  This domain is found mostly
           among developmentally-regulated proteins. Spermadhesins
           contain only this domain.
          Length = 102

 Score = 91.7 bits (228), Expect = 2e-24
 Identities = 36/73 (49%), Positives = 49/73 (67%), Gaps = 1/73 (1%)

Query: 1   VALKFQSFEIENHDQCTYDYVEIRDGHAPDSPIIGTYCGYKLPPDIKSS-GTKLMIKFVS 59
           + L+F  F++E+ D C YDYVEI DG +  SP++G +CG + PP + SS    L + FVS
Sbjct: 30  IELQFTDFDLESSDNCEYDYVEIYDGPSASSPLLGRFCGSEAPPPVISSSSNSLTLTFVS 89

Query: 60  DGSVQKPGFSAIF 72
           D SVQK GFSA +
Sbjct: 90  DSSVQKRGFSARY 102



 Score = 71.7 bits (176), Expect = 9e-17
 Identities = 28/66 (42%), Positives = 39/66 (59%), Gaps = 4/66 (6%)

Query: 127 GTLTSPSFPDLYIKNKTCIWEIVAPPQYRISLNFTHFDIEGNNLFQSSCEYDNLTVFSKI 186
           GT+TSP++P  Y  N  C+W I APP YRI L FT FD+E ++    +CEYD + ++   
Sbjct: 1   GTITSPNYPQSYPNNLDCVWTIRAPPGYRIELQFTDFDLESSD----NCEYDYVEIYDGP 56

Query: 187 GDSFKS 192
             S   
Sbjct: 57  SASSPL 62


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 45.1 bits (107), Expect = 5e-06
 Identities = 15/36 (41%), Positives = 18/36 (50%)

Query: 77  DECALEDHGCEHTCKNILGGYECSCKIGYELHSDGK 112
           D CA   H C+  C +  G Y C+C  GY L  D K
Sbjct: 188 DLCATLSHVCQQVCISTPGSYLCACTEGYALLEDNK 223


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 39.2 bits (92), Expect = 4e-05
 Identities = 17/40 (42%), Positives = 22/40 (55%), Gaps = 6/40 (15%)

Query: 77  DECALEDHGCEH--TCKNILGGYECSCKIGYELHSDGKIC 114
           DECA   + C++  TC N +G Y C C  GY    DG+ C
Sbjct: 3   DECA-SGNPCQNGGTCVNTVGSYRCECPPGYT---DGRNC 38


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 38.5 bits (90), Expect = 8e-05
 Identities = 18/41 (43%), Positives = 21/41 (51%), Gaps = 2/41 (4%)

Query: 76  FDECALEDHGCEH--TCKNILGGYECSCKIGYELHSDGKIC 114
            DECA   H C     C N +G +EC C  GYE + DG  C
Sbjct: 2   VDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 36.8 bits (86), Expect = 3e-04
 Identities = 15/35 (42%), Positives = 20/35 (57%), Gaps = 3/35 (8%)

Query: 77  DECALEDHGCEH--TCKNILGGYECSCKIGYELHS 109
           DECA   + C++  TC N +G Y CSC  GY   +
Sbjct: 3   DECA-SGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36


>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like.  cEGF, or complement
           Clr-like EGF, domains have six conserved cysteine
           residues disulfide-bonded into the characteristic
           pattern 'ababcc'. They are found in blood coagulation
           proteins such as fibrillin, Clr and Cls, thrombomodulin,
           and the LDL receptor. The core fold of the EGF domain
           consists of two small beta-hairpins packed against each
           other. Two major structural variants have been
           identified based on the structural context of the
           C-terminal cysteine residue of disulfide 'c' in the
           C-terminal hairpin: hEGFs and cEGFs. In cEGFs the
           C-terminal thiol resides on the C-terminal beta-sheet,
           resulting in long loop-lengths between the cysteine
           residues of disulfide 'c', typically C[10+]XC. These
           longer loop-lengths may have arisen by selective
           cysteine loss from a four-disulfide EGF template such as
           laminin or integrin. Tandem cEGF domains have five
           linking residues between terminal cysteines of adjacent
           domains. cEGF domains may or may not bind calcium in the
           linker region. cEGF domains with the consensus motif
           CXN4X[F,Y]XCXC are hydroxylated exclusively on the
           asparagine residue.
          Length = 24

 Score = 34.4 bits (80), Expect = 0.001
 Identities = 11/21 (52%), Positives = 13/21 (61%)

Query: 96  GYECSCKIGYELHSDGKICLD 116
            Y CSC  GY+L  DG+ C D
Sbjct: 1   SYTCSCPPGYQLSGDGRTCED 21


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 34.4 bits (80), Expect = 0.002
 Identities = 16/38 (42%), Positives = 20/38 (52%), Gaps = 4/38 (10%)

Query: 79  CALEDHGC-EH-TCKNILGGYECSCKIGYELHSDGKIC 114
           CA  + GC  + TC N  G + C+CK GY    DG  C
Sbjct: 1   CAENNGGCHPNATCTNTGGSFTCTCKSGYTG--DGVTC 36


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 32.9 bits (75), Expect = 0.006
 Identities = 14/31 (45%), Positives = 15/31 (48%), Gaps = 2/31 (6%)

Query: 78  ECALEDHGCEH-TCKNILGGYECSCKIGYEL 107
           ECA     C + TC N  G Y CSC  GY  
Sbjct: 1   ECA-SGGPCSNGTCINTPGSYTCSCPPGYTG 30


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 32.1 bits (73), Expect = 0.012
 Identities = 13/32 (40%), Positives = 15/32 (46%), Gaps = 3/32 (9%)

Query: 78  ECALEDHGCEH--TCKNILGGYECSCKIGYEL 107
           ECA   + C +  TC N  G Y C C  GY  
Sbjct: 1   ECA-ASNPCSNGGTCVNTPGSYRCVCPPGYTG 31


>gnl|CDD|215063 PLN00120, PLN00120, fucoxanthin-chlorophyll a-c binding protein;
           Provisional.
          Length = 202

 Score = 31.3 bits (71), Expect = 0.20
 Identities = 18/45 (40%), Positives = 23/45 (51%), Gaps = 6/45 (13%)

Query: 14  DQCTYD---YVEIRDGHAPDSPIIG---TYCGYKLPPDIKSSGTK 52
           DQ  +D   YVEI+ G      ++G   T  G +LP DI  SGT 
Sbjct: 57  DQEKFDRLRYVEIKHGRISMLAVVGYLVTEAGIRLPGDIDYSGTS 101


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason of discrepancy between swiss-prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 27.0 bits (60), Expect = 0.85
 Identities = 10/26 (38%), Positives = 14/26 (53%), Gaps = 2/26 (7%)

Query: 82  EDHGCEH--TCKNILGGYECSCKIGY 105
            ++ C +  TC +  GGY C C  GY
Sbjct: 3   PNNPCSNGGTCVDTPGGYTCECPEGY 28


>gnl|CDD|200520 cd11259, Sema_4D, The Sema domain, a protein interacting module, of
           semaphorin 4D (Sema4D, also known as CD100).
           Sema4D/CD100 is expressed in immune cells and plays
           critical roles in immune response; it is thus termed an
           "immune semaphorin". It is expressed by lymphocytes and
           promotes the aggregation and survival of B lymphocytes
           and inhibits cytokine-induced migration of immune cells
           in vitro. Sema4D/CD100 knock-out mice demonstrate that
           Sema4D is required for normal activation of B and T
           lymphocytes. Sema4D increases B-cell and DC function
           using either Plexin B1 or CD72 as receptors. The
           function of Sema4D in immune response implicates its
           role in infectious and noninfectious diseases. Sema4D
           belongs to the class 4 transmembrane semaphorin family
           of proteins. Semaphorins are regulatory molecules in the
           development of the nervous system and in axonal
           guidance. They also play important roles in other
           biological processes, such as angiogenesis, immune
           regulation, respiration systems and cancer. The Sema
           domain is located at the N-terminus and contains four
           disulfide bonds formed by eight conserved cysteine
           residues. It serves as a receptor-recognition and
           -binding module.
          Length = 471

 Score = 29.8 bits (67), Expect = 0.85
 Identities = 14/30 (46%), Positives = 18/30 (60%)

Query: 158 LNFTHFDIEGNNLFQSSCEYDNLTVFSKIG 187
           LN T   + G N FQ +C+Y NLT F  +G
Sbjct: 88  LNDTFLYVCGTNAFQPTCDYLNLTSFRLLG 117


>gnl|CDD|193543 cd05667, M20_Acy1_like2, M20 Peptidase Aminoacylase 1 subfamily.
           Peptidase M20 family, Uncharacterized subfamily of
           bacterial proteins that have been predicted as
           N-acyl-L-amino acid amidohydrolase (amaA), thermostable
           carboxypeptidase (cpsA-1, cpsA-2 in Sulfolobus
           solfataricus) and abgB (aminobenzoyl-glutamate
           utilization protein B), and generally are involved in
           the urea cycle and metabolism of amino groups.
           Aminoacylases 1 (ACY1s) comprise a class of zinc binding
           homodimeric enzymes involved in the hydrolysis of
           N-acetylated proteins. N-terminal acetylation of
           proteins is a widespread and highly conserved
           process that is involved in the protection and stability
           of proteins. Several types of aminoacylases can be
           distinguished on the basis of substrate specificity.
           ACY1 breaks down cytosolic aliphatic N-acyl-alpha-amino
           acids (except L-aspartate), especially
           N-acetyl-methionine and acetyl-glutamate into L-amino
           acids and an acyl group. However, ACY1 can also catalyze
           the reverse reaction, the synthesis of acetylated amino
           acids. ACY1 may also play a role in xenobiotic
           bioactivation as well as the inter-organ processing of
           amino acid-conjugated xenobiotic derivatives
           (S-substituted-N-acetyl-L-cysteine).
          Length = 402

 Score = 27.6 bits (62), Expect = 5.0
 Identities = 11/30 (36%), Positives = 16/30 (53%), Gaps = 3/30 (10%)

Query: 43  PPDIKSSGTKLMIKFVSDGSVQKPGFSAIF 72
            P  +  G KLM+K   +G ++ P   AIF
Sbjct: 145 APPGEEGGAKLMVK---EGVLKNPKVDAIF 171


>gnl|CDD|223160 COG0082, AroC, Chorismate synthase [Amino acid transport and
           metabolism].
          Length = 369

 Score = 27.2 bits (61), Expect = 5.9
 Identities = 12/26 (46%), Positives = 15/26 (57%), Gaps = 4/26 (15%)

Query: 7   SFEIENHDQCTYDYVEIRD----GHA 28
           +  IEN DQ + DY  I+D    GHA
Sbjct: 81  ALLIENTDQRSKDYSMIKDPPRPGHA 106


>gnl|CDD|129255 TIGR00151, ispF, 2-C-methyl-D-erythritol 2,4-cyclodiphosphate
           synthase.  Members of this protein family are
           2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase,
           the IspF protein of the deoxyxylulose (non-mevalonate)
           pathway of IPP biosynthesis. This protein occurs as an
           IspDF bifunctional fusion protein in about 20 percent of
           bacterial genomes [Biosynthesis of cofactors, prosthetic
           groups, and carriers, Other].
          Length = 155

 Score = 26.1 bits (58), Expect = 8.3
 Identities = 21/75 (28%), Positives = 28/75 (37%), Gaps = 11/75 (14%)

Query: 93  ILGGYECSCKIGYELHSDGKICLDA-CGGILN-TPNGTLTSPSFPDLYIKNKTC------ 144
           ILGG E   + G   HSDG + L A    +L     G +    FPD   + K        
Sbjct: 18  ILGGVEIPHEKGLLAHSDGDVLLHALTDALLGALGLGDIGK-HFPDTDPRWKGADSRVLL 76

Query: 145 --IWEIVAPPQYRIS 157
                ++    YRI 
Sbjct: 77  RHAVALIKEKGYRIG 91


>gnl|CDD|143612 cd07304, Chorismate_synthase, Chorismate synthase, the enzyme
          catalyzing the final step of the shikimate pathway.
          Chorismate synthase (CS;
          5-enolpyruvylshikimate-3-phosphate phospholyase;
          1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C.
          4.2.3.5) catalyzes the seventh and final step in the
          shikimate pathway: the conversion of
          5-enolpyruvylshikimate-3-phosphate (EPSP) to chorismate,
          a precursor for the biosynthesis of aromatic compounds.
          This process has an absolute requirement for reduced
          FMN as a co-factor which is thought to facilitate
          cleavage of C-O bonds by transiently donating an
          electron to the substrate, with no overall change in its
          redox state. Depending on the capacity of these enzymes
          to regenerate the reduced form of FMN, chorismate
          synthases are divided into two classes: Enzymes, mostly
          from plants and eubacteria, that sequester CS from the
          cellular environment, are monofunctional, while those
          that can generate reduced FMN at the expense of NADPH,
          such as found in fungi and the ciliated protozoan
          Euglena gracilis, are bifunctional, having an
          additional NADPH:FMN oxidoreductase activity. Recently,
          bifunctionality of the Mycobacterium tuberculosis
          enzyme (MtCS) was determined by measurements of both
          chorismate synthase and NADH:FMN oxidoreductase
          activities. Since shikimate pathway enzymes are present
          in bacteria, fungi and apicomplexan parasites (such as
          Toxoplasma gondii, Plasmodium falciparum, and
          Cryptosporidium parvum) but absent in mammals, they are
          potentially attractive targets for the development of
          new therapy against infectious diseases such as
          tuberculosis (TB).
          Length = 344

 Score = 26.6 bits (60), Expect = 9.3
 Identities = 9/26 (34%), Positives = 14/26 (53%), Gaps = 4/26 (15%)

Query: 7  SFEIENHDQCTYDYVEIRD----GHA 28
          +  I N DQ ++DY  ++     GHA
Sbjct: 73 ALLIRNKDQRSWDYSMLKTLPRPGHA 98


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.320    0.140    0.444 

Gapped
Lambda     K      H
   0.267   0.0632    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 9,688,940
Number of extensions: 862050
Number of successful extensions: 581
Number of sequences better than 10.0: 1
Number of HSP's gapped: 576
Number of HSP's successfully gapped: 29
Length of query: 193
Length of database: 10,937,602
Length adjustment: 92
Effective length of query: 101
Effective length of database: 6,857,034
Effective search space: 692560434
Effective search space used: 692560434
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 56 (25.6 bits)
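
Note on the statistics footer: the effective lengths and the effective search space reported above can be reproduced from the raw lengths and the length adjustment. The sketch below is only an arithmetic check of this report, assuming the adjustment is subtracted once from the query and once per database sequence. The function name `effective_search_space` is hypothetical and is not part of BLAST.

```python
def effective_search_space(query_len, db_len, num_seqs, length_adjustment):
    """Recompute the effective lengths shown in the report footer.

    Assumption: the length adjustment is subtracted once from the query
    and once for every sequence in the database.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query, eff_db, eff_query * eff_db


# Values copied from the footer of this report.
eff_query, eff_db, space = effective_search_space(
    query_len=193,
    db_len=10_937_602,
    num_seqs=44_354,
    length_adjustment=92,
)
print(eff_query, eff_db, space)
```

With the footer values, the three results match the "Effective length of query", "Effective length of database", and "Effective search space" lines above.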