RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy2974
         (186 letters)



>gnl|CDD|215916 pfam00431, CUB, CUB domain. 
          Length = 110

 Score = 70.4 bits (173), Expect = 3e-16
 Identities = 28/110 (25%), Positives = 44/110 (40%), Gaps = 38/110 (34%)

Query: 81  CKYEITAPNGVIKTPNHPDYYPSKRECIWHFTTTPGHRIKLN------------------ 122
           C   +T  +G I +PN+P+ YP  ++C+W     PG+RI L                   
Sbjct: 1   CGGVLTESSGSITSPNYPNSYPPNKDCVWTIRAPPGYRISLTFQDFDLEDHDECGYDYVE 60

Query: 123 --------------------PTDIISISEGLLVRFRSDDTVVGKGFSASY 152
                               P DI S S  + ++F SD ++  +GF A+Y
Sbjct: 61  IRDGLPSSSPLLGRFCGSGPPEDIRSTSNQMTIKFVSDSSISKRGFKATY 110


>gnl|CDD|238001 cd00041, CUB, CUB domain; extracellular domain; present in proteins
           mostly known to be involved in development; not found in
           prokaryotes, plants and yeast.
          Length = 113

 Score = 65.9 bits (161), Expect = 2e-14
 Identities = 35/113 (30%), Positives = 47/113 (41%), Gaps = 39/113 (34%)

Query: 81  CKYEITAP-NGVIKTPNHPDYYPSKRECIWHFTTTPGHRIKLN----------------- 122
           C   +TA  +G I +PN+P+ YP+   C+W     PG+RI+L                  
Sbjct: 1   CGGTLTASTSGTISSPNYPNNYPNNLNCVWTIEAPPGYRIRLTFEDFDLESSPNCSYDYL 60

Query: 123 ---------------------PTDIISISEGLLVRFRSDDTVVGKGFSASYIA 154
                                P  IIS    L VRFRSD +V G+GF A+Y A
Sbjct: 61  EIYDGPSTSSPLLGRFCGSTLPPPIISSGNSLTVRFRSDSSVTGRGFKATYSA 113


>gnl|CDD|214483 smart00042, CUB, Domain first found in C1r, C1s, uEGF, and bone
           morphogenetic protein.  This domain is found mostly
           among developmentally-regulated proteins. Spermadhesins
           contain only this domain.
          Length = 102

 Score = 58.2 bits (141), Expect = 1e-11
 Identities = 28/102 (27%), Positives = 38/102 (37%), Gaps = 39/102 (38%)

Query: 90  GVIKTPNHPDYYPSKRECIWHFTTTPGHRIKL---------------------------- 121
           G I +PN+P  YP+  +C+W     PG+RI+L                            
Sbjct: 1   GTITSPNYPQSYPNNLDCVWTIRAPPGYRIELQFTDFDLESSDNCEYDYVEIYDGPSASS 60

Query: 122 -----------NPTDIISISEGLLVRFRSDDTVVGKGFSASY 152
                       P  I S S  L + F SD +V  +GFSA Y
Sbjct: 61  PLLGRFCGSEAPPPVISSSSNSLTLTFVSDSSVQKRGFSARY 102


>gnl|CDD|238752 cd01475, vWA_Matrilin, VWA_Matrilin: In cartilaginous plate,
           extracellular matrix molecules mediate cell-matrix and
           matrix-matrix interactions thereby providing tissue
           integrity. Some members of the matrilin family are
           expressed specifically in developing cartilage
           rudiments. The matrilin family consists of at least four
           members. All the members of the matrilin family contain
           VWA domains, EGF-like domains and a heptad repeat
           coiled-coil domain at the carboxy terminus which is
           responsible for the oligomerization of the matrilins.
           The VWA domains have been shown to be essential for
           matrilin network formation by interacting with matrix
           ligands.
          Length = 224

 Score = 48.1 bits (115), Expect = 4e-07
 Identities = 18/49 (36%), Positives = 26/49 (53%), Gaps = 3/49 (6%)

Query: 28  KIGDSFKS---QDKDECMTNNGGCQHECRNTIGSYICSCHNGYTLLENG 73
           ++   F+       D C T +  CQ  C +T GSY+C+C  GY LLE+ 
Sbjct: 174 ELTKKFQGKICVVPDLCATLSHVCQQVCISTPGSYLCACTEGYALLEDN 222


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
          EGF-like domain homologues. This family includes the
          C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 41.8 bits (99), Expect = 4e-06
 Identities = 18/38 (47%), Positives = 23/38 (60%), Gaps = 4/38 (10%)

Query: 41 CMTNNGGC-QH-ECRNTIGSYICSCHNGYTLLENGHDC 76
          C  NNGGC  +  C NT GS+ C+C +GYT   +G  C
Sbjct: 1  CAENNGGCHPNATCTNTGGSFTCTCKSGYTG--DGVTC 36


>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 41.5 bits (98), Expect = 5e-06
 Identities = 20/43 (46%), Positives = 26/43 (60%), Gaps = 6/43 (13%)

Query: 37 DKDECMTNNGGCQH--ECRNTIGSYICSCHNGYTLLENGHDCK 77
          D DEC + N  CQ+   C NT+GSY C C  GYT   +G +C+
Sbjct: 1  DIDECASGNP-CQNGGTCVNTVGSYRCECPPGYT---DGRNCE 39


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
          large number of membrane-bound and extracellular
          (mostly animal) proteins. Many of these proteins
          require calcium for their biological function and
          calcium-binding sites have been found to be located at
          the N-terminus of particular EGF-like domains;
          calcium-binding may be crucial for numerous
          protein-protein interactions. Six conserved core
          cysteines form three disulfide bridges as in non
          calcium-binding EGF domains, whose structures are very
          similar. EGF_CA can be found in tandem repeat
          arrangements.
          Length = 38

 Score = 39.5 bits (93), Expect = 2e-05
 Identities = 17/34 (50%), Positives = 19/34 (55%), Gaps = 1/34 (2%)

Query: 37 DKDECMTNNG-GCQHECRNTIGSYICSCHNGYTL 69
          D DEC + N       C NT+GSY CSC  GYT 
Sbjct: 1  DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 36.3 bits (84), Expect = 3e-04
 Identities = 16/34 (47%), Positives = 19/34 (55%), Gaps = 2/34 (5%)

Query: 40 ECMTNNGGCQH-ECRNTIGSYICSCHNGYTLLEN 72
          EC  + G C +  C NT GSY CSC  GYT  + 
Sbjct: 1  EC-ASGGPCSNGTCINTPGSYTCSCPPGYTGDKR 33


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
          growth factor (EGF); present in a large number of
          proteins, mostly animal; the list of proteins currently
          known to contain one or more copies of an EGF-like
          pattern is large and varied; the functional
          significance of EGF-like domains in what appear to be
          unrelated proteins is not yet clear; a common feature
          is that these repeats are found in the extracellular
          domain of membrane-bound proteins or in proteins known
          to be secreted (exception: prostaglandin G/H synthase);
          the domain includes six cysteine residues which have
          been shown to be involved in disulfide bonds; the main
          structure is a two-stranded beta-sheet followed by a
          loop to a C-terminal short two-stranded sheet;
          Subdomains between the conserved cysteines vary in
          length; the region between the 5th and 6th cysteine
          contains two conserved glycines of which at least one
          is present in most EGF-like domains; a subset of
          these bind calcium.
          Length = 36

 Score = 34.8 bits (80), Expect = 0.001
 Identities = 14/35 (40%), Positives = 17/35 (48%), Gaps = 3/35 (8%)

Query: 40 ECMTNNGGCQH--ECRNTIGSYICSCHNGYTLLEN 72
          EC   +  C +   C NT GSY C C  GYT   +
Sbjct: 1  EC-AASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34


>gnl|CDD|221695 pfam12662, cEGF, Complement Clr-like EGF-like.  cEGF, or
          complement Clr-like EGF, domains have six conserved
          cysteine residues disulfide-bonded into the
          characteristic pattern 'ababcc'. They are found in
          blood coagulation proteins such as fibrillin, Clr and
          Cls, thrombomodulin, and the LDL receptor. The core
          fold of the EGF domain consists of two small
          beta-hairpins packed against each other. Two major
          structural variants have been identified based on the
          structural context of the C-terminal cysteine residue
          of disulfide 'c' in the C-terminal hairpin: hEGFs and
          cEGFs. In cEGFs the C-terminal thiol resides on the
          C-terminal beta-sheet, resulting in long loop-lengths
          between the cysteine residues of disulfide 'c',
          typically C[10+]XC. These longer loop-lengths may have
          arisen by selective cysteine loss from a four-disulfide
          EGF template such as laminin or integrin. Tandem cEGF
          domains have five linking residues between terminal
          cysteines of adjacent domains. cEGF domains may or may
          not bind calcium in the linker region. cEGF domains
          with the consensus motif CXN4X[F,Y]XCXC are
          hydroxylated exclusively on the asparagine residue.
          Length = 24

 Score = 32.0 bits (74), Expect = 0.010
 Identities = 10/20 (50%), Positives = 12/20 (60%)

Query: 58 SYICSCHNGYTLLENGHDCK 77
          SY CSC  GY L  +G  C+
Sbjct: 1  SYTCSCPPGYQLSGDGRTCE 20


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 31.6 bits (72), Expect = 0.022
 Identities = 18/42 (42%), Positives = 22/42 (52%), Gaps = 2/42 (4%)

Query: 37 DKDECMTNNGGCQH--ECRNTIGSYICSCHNGYTLLENGHDC 76
          D DEC      C     C NTIGS+ C C +GY   E+G +C
Sbjct: 1  DVDECADGTHNCPANTVCVNTIGSFECVCPDGYENNEDGTNC 42


>gnl|CDD|165346 PHA03054, PHA03054, IMV membrane protein; Provisional.
          Length = 72

 Score = 30.7 bits (69), Expect = 0.078
 Identities = 15/31 (48%), Positives = 17/31 (54%)

Query: 18 SPHDTLTVFSKIGDSFKSQDKDECMTNNGGC 48
          SP D LT F +I  S  S +K    TNN GC
Sbjct: 15 SPEDDLTDFIEIVKSVLSDEKTVTSTNNTGC 45


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
          between noise and signal. pfam00053 is very similar,
          but has 8 instead of 6 conserved cysteines. Includes
          some cytokine receptors. The EGF domain misses the
          N-terminus regions of the Ca2+ binding EGF domains
          (this is the main reason of discrepancy between
          swiss-prot domain start/end and Pfam). The family is
          hard to model due to many similar but different
          sub-types of EGF domains. Pfam certainly misses a
          number of EGF domains.
          Length = 32

 Score = 29.3 bits (66), Expect = 0.10
 Identities = 11/28 (39%), Positives = 14/28 (50%), Gaps = 2/28 (7%)

Query: 43 TNNGGCQH--ECRNTIGSYICSCHNGYT 68
          + N  C +   C +T G Y C C  GYT
Sbjct: 2  SPNNPCSNGGTCVDTPGGYTCECPEGYT 29


>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen.  This
           family consists of several Theileria P67 surface
           antigens. A stage specific surface antigen of Theileria
           parva, p67, is the basis for the development of an
           anti-sporozoite vaccine for the control of East Coast
           fever (ECF) in cattle. The antigen has been shown to
           contain five distinct linear peptide sequences
           recognised by sporozoite-neutralising murine monoclonal
           antibodies.
          Length = 727

 Score = 30.8 bits (69), Expect = 0.44
 Identities = 17/29 (58%), Positives = 21/29 (72%), Gaps = 1/29 (3%)

Query: 156 DTQGSKEFSEIDDDD-EDEDNTDLNSRRG 183
           DT+GSK  SE DDDD E+EDN   +S+ G
Sbjct: 108 DTKGSKTDSEEDDDDSEEEDNKSTSSKDG 136


>gnl|CDD|191582 pfam06679, DUF1180, Protein of unknown function (DUF1180).  This
           family consists of several hypothetical mammalian
           proteins of around 190 residues in length. The function
           of this family is unknown.
          Length = 163

 Score = 29.1 bits (65), Expect = 0.97
 Identities = 13/34 (38%), Positives = 20/34 (58%), Gaps = 3/34 (8%)

Query: 152 YIAIDTQ-GSKEFSEIDDDDEDEDNT--DLNSRR 182
           Y  +DT   + E + ++ DDED+D+T  D N  R
Sbjct: 129 YGVLDTNAENMELTPLEQDDEDDDSTLFDANYPR 162


>gnl|CDD|131633 TIGR02584, cas_NE0113, CRISPR-associated protein, NE0113 family.
           Members of this minor CRISPR-associated (Cas) protein
           family are found in cas gene clusters in Vibrio
           vulnificus YJ016, Nitrosomonas europaea ATCC 19718,
           Mannheimia succiniciproducens MBEL55E, and
           Verrucomicrobium spinosum [Mobile and extrachromosomal
           element functions, Other].
          Length = 209

 Score = 28.6 bits (64), Expect = 1.6
 Identities = 11/45 (24%), Positives = 20/45 (44%), Gaps = 2/45 (4%)

Query: 96  NHPD-YYPSKRECIWHFTTTPGHRIKLNPTD-IISISEGLLVRFR 138
           N    YYP ++  I    T  G     + ++ ++ ++E   VR R
Sbjct: 161 NIRGFYYPPRKGPILEIRTRDGPPAPADTSEAVVELAELPFVRLR 205


>gnl|CDD|187817 cd09686, Csx1_III-U, CRISPR/Cas system-associated protein Csx1.
           CRISPR (Clustered Regularly Interspaced Short
           Palindromic Repeats) and associated Cas proteins
           comprise a system for heritable host defense by
           prokaryotic cells against phage and other foreign DNA;
           Protein of this family often fused to HTH domain; Some
           proteins could have an additional fusion with
           RecB-family nuclease domain; Core domain appears to have
           a Rossmann-like fold; loosely associated with CRISPR/Cas
           systems; also known as NE0113 family.
          Length = 209

 Score = 28.6 bits (64), Expect = 1.6
 Identities = 11/45 (24%), Positives = 20/45 (44%), Gaps = 2/45 (4%)

Query: 96  NHPD-YYPSKRECIWHFTTTPGHRIKLNPTD-IISISEGLLVRFR 138
           N    YYP ++  I    T  G     + ++ ++ ++E   VR R
Sbjct: 161 NIRGFYYPPRKGPILEIRTRDGPPAPADTSEAVVELAELPFVRLR 205


>gnl|CDD|219564 pfam07771, TSGP1, Tick salivary peptide group 1.  This contains a
           group of peptides derived from a salivary gland cDNA
           library of the tick Ixodes scapularis. Also present are
           peptides from a related tick species, Ixodes ricinus.
           They are characterized by a putative signal peptide
           indicative of secretion and conserved cysteine residues.
          Length = 120

 Score = 27.5 bits (61), Expect = 2.0
 Identities = 17/69 (24%), Positives = 25/69 (36%), Gaps = 16/69 (23%)

Query: 43  TNNGGCQHECRNTIGSYICSCHNGYTL--------LENGHDCKEGGCKYEITAPNGVIKT 94
           TN  GC + C N   +   S    +            N   C+ G C +  T+       
Sbjct: 34  TNREGCDYYCWNQDTN---SWDEFFFGDGETCFYNTGNDGVCQNGEC-HLTTSSGE---- 85

Query: 95  PNHPDYYPS 103
           P+HPD +P 
Sbjct: 86  PSHPDDHPP 94


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 27.8 bits (62), Expect = 4.0
 Identities = 9/28 (32%), Positives = 12/28 (42%)

Query: 153 IAIDTQGSKEFSEIDDDDEDEDNTDLNS 180
           I       +     DDDDED   +DL+ 
Sbjct: 244 IDGIDSDDEGDGSDDDDDEDAIESDLDD 271


>gnl|CDD|221765 pfam12772, GHBP, Growth hormone receptor binding.  Growth hormone
           receptor binding protein is produced either by
           proteolysis of the GHR (growth hormone receptor) at the
           cell surface thereby releasing its extracellular domain,
           the GHBP (growth hormone-binding protein), or, in
           rodents, by alternative processing of the GHR
           transcript. The sheddase proteolytic enzyme responsible
           for the cleavage is TACE (tumour necrosis
           factor-alpha-converting enzyme). Growth hormone (GH)
           binding to GH receptor (GHR) is the initial step that
           leads to the physiological functions of the hormone. The
           biological effects of GHBP are determined by the serum
           levels of growth hormone (GH), which can vary. Low
           levels of GH can result in a dwarf phenotype and have
           been positively correlated with an increased life
           expectancy. High levels of GH can lead to gigantism or a
           clinical syndrome termed acromegaly and have been
           implicated in diabetic eye and kidney damage.
          Length = 289

 Score = 27.4 bits (61), Expect = 4.2
 Identities = 10/21 (47%), Positives = 14/21 (66%)

Query: 162 EFSEIDDDDEDEDNTDLNSRR 182
           EF E+D DD DE N   +++R
Sbjct: 29  EFIELDIDDPDEKNEGSDTQR 49


>gnl|CDD|219285 pfam07065, D123, D123.  This family contains a number of eukaryotic
           D123 proteins approximately 330 residues long. It has
           been shown that mutated variants of D123 exhibit
           temperature-dependent differences in their degradation
           rate. D123 proteins are regulators of eIF2, the central
           regulator of translational initiation.
          Length = 295

 Score = 27.3 bits (61), Expect = 5.1
 Identities = 14/55 (25%), Positives = 25/55 (45%), Gaps = 1/55 (1%)

Query: 123 PTDIISISEGLLVRFRSDDTVVGKGFSASYIAIDTQGSKEFSEIDDDDEDEDNTD 177
            + II + E  L     D  ++    S+  I  ++    E+S+  +DD+DED   
Sbjct: 11  KSKIIPLPEEFLEYLLQDGILLPSEESSLPIYQESS-DNEYSDWFEDDDDEDTDV 64


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.136    0.427 

Gapped
Lambda     K      H
   0.267   0.0653    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 9,189,761
Number of extensions: 806356
Number of successful extensions: 696
Number of sequences better than 10.0: 1
Number of HSP's gapped: 689
Number of HSP's successfully gapped: 31
Length of query: 186
Length of database: 10,937,602
Length adjustment: 91
Effective length of query: 95
Effective length of database: 6,901,388
Effective search space: 655631860
Effective search space used: 655631860
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 56 (25.5 bits)