RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy826
         (329 letters)



>gnl|CDD|214542 smart00179, EGF_CA, Calcium-binding EGF-like domain. 
          Length = 39

 Score = 41.8 bits (99), Expect = 7e-06
 Identities = 17/31 (54%), Positives = 20/31 (64%), Gaps = 1/31 (3%)

Query: 196 DINECSDENICSGNQFCVNTEGSYRCMQCDP 226
           DI+EC+  N C     CVNT GSYRC +C P
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRC-ECPP 30


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 41.5 bits (98), Expect = 1e-05
 Identities = 17/31 (54%), Positives = 19/31 (61%), Gaps = 1/31 (3%)

Query: 196 DINECSDENICSGNQFCVNTEGSYRCMQCDP 226
           DI+EC+  N C     CVNT GSYRC  C P
Sbjct: 1   DIDECASGNPCQNGGTCVNTVGSYRC-SCPP 30


>gnl|CDD|221329 pfam11938, DUF3456, TLR4 regulator and MIR-interacting MSAP.  This
           family of proteins, found from plants to humans, is
           PRAT4 (A and B), a Protein Associated with Toll-like
           receptor 4. The Toll family of receptors - TLRs - plays
           an essential role in innate recognition of microbial
           products, the first line of defence against bacterial
           infection. PRAT4A influences the subcellular
           distribution and the strength of TLR responses and
           alters the relative activity of each TLR. PRAT4B
           regulates TLR4 trafficking to the cell surface and the
           extent of its expression there. TLR4 recognizes
           lipopolysaccharide (LPS), one of the most
           immuno-stimulatory glycolipids constituting the outer
           membrane of the Gram-negative bacteria. This family has
           also been described as a SAP-like MIR-interacting
           protein family.
          Length = 151

 Score = 43.9 bits (104), Expect = 2e-05
 Identities = 29/140 (20%), Positives = 43/140 (30%), Gaps = 52/140 (37%)

Query: 2   SGIEKTAKGNF-----AGGDTAWEEEKQKIY-AKSEVRLIEIQEKMCS------------ 43
             + KT             D   + + +K Y A+SE+RL E+ E +C             
Sbjct: 16  EALSKTDPKKEVDVGGFRLDPDGKRKGKKKYYARSELRLTELLEGVCDRMLDYNLHKERS 75

Query: 44  ------------------------------EVSGFLDQCHNFAADIESEIEEWWFKVQHS 73
                                         EV+    QC     + E EIEEW+      
Sbjct: 76  GSRRFAKGMSPTFQTLHGLVLKGVKVDPSAEVAELKFQCERLLEEHEDEIEEWYKN---- 131

Query: 74  KAKDSDLYTWLCINKLKRCC 93
           +  + DL  +LC    K C 
Sbjct: 132 EQLEDDLSKFLCSEHSKACL 151


>gnl|CDD|214589 smart00261, FU, Furin-like repeats. 
          Length = 45

 Score = 39.4 bits (92), Expect = 6e-05
 Identities = 14/35 (40%), Positives = 17/35 (48%)

Query: 220 RCMQCDPSCNGCHGDGPDMCEACAEGYKLQQNICI 254
            C  C P C  C G GPD C +C  G+ L    C+
Sbjct: 3   ECKPCHPECATCTGPGPDDCTSCKHGFFLDGGKCV 37



 Score = 30.9 bits (70), Expect = 0.083
 Identities = 12/27 (44%), Positives = 15/27 (55%), Gaps = 1/27 (3%)

Query: 160 CSKCHASCESGCSTGGPKGCTKCKSGW 186
           C  CH  C + C+  GP  CT CK G+
Sbjct: 4   CKPCHPEC-ATCTGPGPDDCTSCKHGF 29


>gnl|CDD|238021 cd00064, FU, Furin-like repeats. Cysteine rich region. Exact
           function of the domain is not known. Furin is a
           serine-kinase dependent proprotein processor. Other
           members of this family include endoproteases and cell
           surface receptors.
          Length = 49

 Score = 37.1 bits (86), Expect = 5e-04
 Identities = 14/31 (45%), Positives = 17/31 (54%)

Query: 224 CDPSCNGCHGDGPDMCEACAEGYKLQQNICI 254
           C PSC  C G GPD C +C  G+ L    C+
Sbjct: 2   CHPSCATCTGPGPDQCTSCRHGFYLDGGTCV 32



 Score = 29.4 bits (66), Expect = 0.29
 Identities = 12/31 (38%), Positives = 15/31 (48%), Gaps = 1/31 (3%)

Query: 162 KCHASCESGCSTGGPKGCTKCKSGWAADKDI 192
            CH SC   C+  GP  CT C+ G+  D   
Sbjct: 1   PCHPSCA-TCTGPGPDQCTSCRHGFYLDGGT 30


>gnl|CDD|238010 cd00053, EGF, Epidermal growth factor domain, found in epidermal
           growth factor (EGF) and present in a large number of
           proteins, mostly animal; the list of proteins currently
           known to contain one or more copies of an EGF-like
           pattern is large and varied; the functional significance
           of EGF-like domains in what appear to be unrelated
           proteins is not yet clear; a common feature is that
           these repeats are found in the extracellular domain of
           membrane-bound proteins or in proteins known to be
           secreted (exception: prostaglandin G/H synthase); the
           domain includes six cysteine residues which have been
           shown to be involved in disulfide bonds; the main
           structure is a two-stranded beta-sheet followed by a
           loop to a C-terminal short two-stranded sheet;
           Subdomains between the conserved cysteines vary in
           length; the region between the 5th and 6th cysteine
           contains two conserved glycines of which at least one
           is present in most EGF-like domains; a subset of
           these bind calcium.
          Length = 36

 Score = 31.3 bits (71), Expect = 0.040
 Identities = 16/28 (57%), Positives = 17/28 (60%), Gaps = 1/28 (3%)

Query: 199 ECSDENICSGNQFCVNTEGSYRCMQCDP 226
           EC+  N CS    CVNT GSYRC  C P
Sbjct: 1   ECAASNPCSNGGTCVNTPGSYRC-VCPP 27


>gnl|CDD|219496 pfam07645, EGF_CA, Calcium-binding EGF domain. 
          Length = 42

 Score = 31.2 bits (71), Expect = 0.052
 Identities = 14/32 (43%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 196 DINECSDE-NICSGNQFCVNTEGSYRCMQCDP 226
           D++EC+D  + C  N  CVNT GS+ C  C  
Sbjct: 1   DVDECADGTHNCPANTVCVNTIGSFEC-VCPD 31


>gnl|CDD|221261 pfam11845, DUF3365, Protein of unknown function (DUF3365).  This
           family of proteins is functionally uncharacterized.
           This protein is found in bacteria. Proteins in this
           family are typically between 198 and 657 amino acids in
           length.
          Length = 179

 Score = 31.6 bits (72), Expect = 0.34
 Identities = 8/24 (33%), Positives = 11/24 (45%)

Query: 225 DPSCNGCHGDGPDMCEACAEGYKL 248
           + SC  CHG   +       GYK+
Sbjct: 143 EESCLKCHGAPEEQPGDPGFGYKV 166


>gnl|CDD|173479 PTZ00214, PTZ00214, high cysteine membrane protein Group 4;
           Provisional.
          Length = 800

 Score = 32.2 bits (73), Expect = 0.40
 Identities = 26/88 (29%), Positives = 36/88 (40%), Gaps = 9/88 (10%)

Query: 171 CSTGGPKGCTKCKSGWAADKDIGCYDINECSDENIC--SGNQFCVNTEGSY------RCM 222
           C++     CT C SG A   + GCY         IC    N  C  T+  Y      + +
Sbjct: 519 CTSTANGACTTC-SGAAFLMNGGCYTTEHYPGSTICDKQSNGKCTTTKKGYGISPDGKLL 577

Query: 223 QCDPSCNGCHGDGPDMCEACAEGYKLQQ 250
           +CDP+C  C   GP  C  C     L++
Sbjct: 578 ECDPTCLACTAPGPGRCTRCPSDKLLKR 605



 Score = 29.1 bits (65), Expect = 4.0
 Identities = 31/119 (26%), Positives = 43/119 (36%), Gaps = 18/119 (15%)

Query: 159 LCSKCHASCESGCST----GGPKGCTKCKSGW------------AADKDIGCYDINECSD 202
           LC        SGC+T     G   CT+C +G+            + D    C  + E S+
Sbjct: 352 LCGDATNGGVSGCATCGYNSGAVTCTRCSAGYLGVDGKSCSESCSGDTRGVCTKVAEGSE 411

Query: 203 ENICSGNQFCVNT--EGSYRCMQCDPSCNGCHGDGPDMCEACAEGYKLQQNICINTQAK 259
               S    C  T    S  C  C  SC  C    P  C+ C+ G  L+ +I  +  A 
Sbjct: 412 STEVSCRCVCKPTFYNSSGTCTPCTDSCAVCKDGTPTGCQQCSPGKILEFSIVSSESAD 470


>gnl|CDD|238012 cd00055, EGF_Lam, Laminin-type epidermal growth factor-like domain;
           laminins are the major noncollagenous components of
           basement membranes that mediate cell adhesion, growth,
           migration, and differentiation; the laminin-type
           epidermal growth factor-like module occurs in tandem
           arrays; the domain contains 4 disulfide bonds (loops
           a-d) the first three resemble epidermal growth factor
           (EGF); the number of copies of this domain in the
           different forms of laminins is highly variable ranging
           from 3 up to 22 copies.
          Length = 50

 Score = 28.9 bits (65), Expect = 0.48
 Identities = 11/26 (42%), Positives = 13/26 (50%)

Query: 125 KGNGQCVCNKEYTGELCNECNTGYFQ 150
            G GQC C    TG  C+ C  GY+ 
Sbjct: 16  PGTGQCECKPNTTGRRCDRCAPGYYG 41


>gnl|CDD|214544 smart00181, EGF, Epidermal growth factor-like domain. 
          Length = 35

 Score = 27.5 bits (61), Expect = 0.99
 Identities = 14/28 (50%), Positives = 16/28 (57%), Gaps = 2/28 (7%)

Query: 199 ECSDENICSGNQFCVNTEGSYRCMQCDP 226
           EC+    CS N  C+NT GSY C  C P
Sbjct: 1   ECASGGPCS-NGTCINTPGSYTC-SCPP 26


>gnl|CDD|235820 PRK06521, PRK06521, hydrogenase 4 subunit B; Validated.
          Length = 667

 Score = 29.9 bits (68), Expect = 2.0
 Identities = 8/18 (44%), Positives = 14/18 (77%)

Query: 282 IIFQKNVFIASIVGVVVA 299
           + F  N+F+AS+V V++A
Sbjct: 115 MGFFYNLFLASMVLVLLA 132


>gnl|CDD|215680 pfam00053, Laminin_EGF, Laminin EGF-like (Domains III and V).  This
           family is like pfam00008 but has 8 conserved cysteines
           instead of six.
          Length = 49

 Score = 26.9 bits (60), Expect = 2.0
 Identities = 11/27 (40%), Positives = 14/27 (51%)

Query: 128 GQCVCNKEYTGELCNECNTGYFQSYKD 154
           GQC+C    TG  C+ C  GY+    D
Sbjct: 18  GQCLCKPGVTGRHCDRCKPGYYGLPSD 44


>gnl|CDD|215652 pfam00008, EGF, EGF-like domain.  There is no clear separation
           between noise and signal. pfam00053 is very similar, but
           has 8 instead of 6 conserved cysteines. Includes some
           cytokine receptors. The EGF domain misses the N-terminus
           regions of the Ca2+ binding EGF domains (this is the
           main reason of discrepancy between swiss-prot domain
           start/end and Pfam). The family is hard to model due to
           many similar but different sub-types of EGF domains.
           Pfam certainly misses a number of EGF domains.
          Length = 32

 Score = 26.6 bits (59), Expect = 2.1
 Identities = 11/22 (50%), Positives = 12/22 (54%)

Query: 200 CSDENICSGNQFCVNTEGSYRC 221
           CS  N CS    CV+T G Y C
Sbjct: 1   CSPNNPCSNGGTCVDTPGGYTC 22


>gnl|CDD|219677 pfam07974, EGF_2, EGF-like domain.  This family contains EGF
           domains found in a variety of extracellular proteins.
          Length = 31

 Score = 26.6 bits (59), Expect = 2.1
 Identities = 13/31 (41%), Positives = 17/31 (54%), Gaps = 1/31 (3%)

Query: 112 CFGNGKCKGNGT-RKGNGQCVCNKEYTGELC 141
           C  +G C G GT  +  G+CVC+  Y G  C
Sbjct: 1   CSASGICNGRGTCVRPCGKCVCDSGYQGATC 31


>gnl|CDD|237660 PRK14289, PRK14289, chaperone protein DnaJ; Provisional.
          Length = 386

 Score = 29.8 bits (67), Expect = 2.1
 Identities = 26/114 (22%), Positives = 42/114 (36%), Gaps = 29/114 (25%)

Query: 143 ECNTGYFQSYKDEKTILCSKCHASCESGCSTGGPKGCTKCKSGWAADKDIGCYDINECSD 202
           E +TG  + +K +K + CS CH +   G    G + C  CK   +  +            
Sbjct: 140 EISTGVEKKFKVKKYVPCSHCHGTGAEG--NNGSETCPTCKGSGSVTRV----------- 186

Query: 203 ENICSGNQFCVNTEGSYRCMQCDPSCNGCHGDG---PDMCEACA-EGYKLQQNI 252
           +N   G             MQ   +C  C+G+G      C+ C  EG    + +
Sbjct: 187 QNTILGT------------MQTQSTCPTCNGEGKIIKKKCKKCGGEGIVYGEEV 228


>gnl|CDD|214543 smart00180, EGF_Lam, Laminin-type epidermal growth factor-like
           domain.
          Length = 46

 Score = 26.1 bits (58), Expect = 3.7
 Identities = 10/25 (40%), Positives = 12/25 (48%)

Query: 127 NGQCVCNKEYTGELCNECNTGYFQS 151
            GQC C    TG  C+ C  GY+  
Sbjct: 17  TGQCECKPNVTGRRCDRCAPGYYGD 41


>gnl|CDD|220356 pfam09709, Cas_Csd1, CRISPR-associated protein (Cas_Csd1).
          CRISPR loci appear to be mobile elements with a wide
          host range. This entry represents proteins that tend to
          be found near CRISPR repeats. The species range, so
          far, is exclusively bacterial and mesophilic, although
          CRISPR loci are particularly common among the archaea
          and thermophilic bacteria. Clusters of short DNA
          repeats with nonhomologous spacers, which are found at
          regular intervals in the genomes of phylogenetically
          distinct prokaryotic species, comprise a family with
          recognisable features. This family is known as CRISPR
          (short for Clustered, Regularly Interspaced Short
          Palindromic Repeats). A number of protein families
          appear only in association with these repeats and are
          designated Cas (CRISPR-Associated) proteins.
          Length = 572

 Score = 29.2 bits (66), Expect = 4.0
 Identities = 11/40 (27%), Positives = 14/40 (35%), Gaps = 3/40 (7%)

Query: 8  AKGNFAGGDTAWEEEKQKIYAKSEVRLIEIQEKMCSEVSG 47
            GNF   D    + K+ I       L+   EK     SG
Sbjct: 34 EDGNFLRIDARERDGKKTIPRSM---LVPATEKSAGRSSG 70


>gnl|CDD|234402 TIGR03928, T7_EssCb_Firm, type VII secretion protein EssC,
           C-terminal domain.  This model describes the C-terminal
           domain, or longer subunit, of the Firmicutes type VII
           secretion protein EssC. This protein (homologous to EccC
           in Actinobacteria) and the WXG100 target proteins are
           the only homologous parts of type VII secretion between
           Firmicutes and Actinobacteria [Protein fate, Protein and
           peptide secretion and trafficking].
          Length = 1296

 Score = 28.8 bits (65), Expect = 4.6
 Identities = 13/42 (30%), Positives = 18/42 (42%), Gaps = 1/42 (2%)

Query: 272 VYVGLCVATYIIFQKNVFI-ASIVGVVVAIYVSVAEYILNDK 312
           V + + V   I   + +FI ASI   +V I  S   Y    K
Sbjct: 46  VMIAVTVLISIFQPRGIFIIASIAMSLVTIIFSTTTYFREKK 87


>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 25.6 bits (57), Expect = 5.2
 Identities = 15/37 (40%), Positives = 18/37 (48%), Gaps = 5/37 (13%)

Query: 200 CSDEN-ICSGNQFCVNTEGSYRCMQCDPSCNGCHGDG 235
           C++ N  C  N  C NT GS+ C  C     G  GDG
Sbjct: 1   CAENNGGCHPNATCTNTGGSFTC-TCKS---GYTGDG 33


>gnl|CDD|232957 TIGR00398, metG, methionyl-tRNA synthetase.  The methionyl-tRNA
           synthetase (metG) is a class I amino acyl-tRNA ligase.
           This model appears to recognize the methionyl-tRNA
           synthetase of every species, including eukaryotic
           cytosolic and mitochondrial forms. The UPGMA difference
           tree calculated after search and alignment according to
           this model shows an unusual deep split between two
           families of MetG. One family contains forms from the
           Archaea, yeast cytosol, spirochetes, and E. coli, among
           others. The other family includes forms from yeast
           mitochondrion, Synechocystis sp., Bacillus subtilis, the
           Mycoplasmas, Aquifex aeolicus, and Helicobacter pylori.
           The E. coli enzyme is homodimeric, although monomeric
           forms can be prepared that are fully active. Activity of
           this enzyme in bacteria includes aminoacylation of
           fMet-tRNA with Met; subsequent formylation of the Met to
           fMet is catalyzed by a separate enzyme. Note that the
           protein from Aquifex aeolicus is split into an alpha
           (large) and beta (small) subunit; this model does not
           include the C-terminal region corresponding to the beta
           chain [Protein synthesis, tRNA aminoacylation].
          Length = 530

 Score = 28.5 bits (64), Expect = 5.3
 Identities = 10/38 (26%), Positives = 12/38 (31%), Gaps = 3/38 (7%)

Query: 133 NKEYTGELCNECNTGYFQSYKDEKTILCSKCHASCESG 170
            KE     C EC       Y +     C KC +    G
Sbjct: 115 EKEIKQLYCPECEMFLPDRYVEGT---CPKCGSEDARG 149


>gnl|CDD|221770 pfam12785, VESA1_N, Variant erythrocyte surface antigen-1.  This
           family represents the N-terminal of the variant
           erythrocyte surface antigen 1, versions a and b, of
           Babesia. Babesia bovis is a tick-borne,
           intra-erythrocytic, protozoal parasite of cattle that
           shares many lifestyle parallels with the most virulent
           of the human malarial parasites, Plasmodium falciparum.
           Babesia uses antigenic variation to establish consistent
           infections of long duration. The two variants of VESA1,
           a and b, are expressed from different but closely
           related genes, and variation is achieved through the
           involvement of a segmental gene conversion mechanism and
           low-frequency epigenetic in situ switching of
           transcriptional activity from the VESA1 gene-pair to a
           possible other gene pair.
          Length = 428

 Score = 28.4 bits (64), Expect = 5.4
 Identities = 29/127 (22%), Positives = 38/127 (29%), Gaps = 23/127 (18%)

Query: 117 KCKGNGTRKGNGQCVCNKEYTGELCNECNTGYFQSYKDEKTILCSKC--------HASCE 168
           KC G G  K  G       +  +    C   Y +  K      C  C         A  +
Sbjct: 77  KCWGGGGGKCKGGGGNGNGHGQK--GGC--KYLKDVKPNNP--CDDCGCMKWDVPKADSD 130

Query: 169 SGCSTGGPKGCTKCKSGWAADKDIGCYDINECS-DENICSGNQFCVNTEGSYRCMQCDPS 227
            G   G  +GCT+C        D GC    +CS     CS  + C        C  C   
Sbjct: 131 EGHHLG--RGCTRCSDS--GGSDHGC----KCSTGGGSCSAGKECKCALAGKCCKCCCKG 182

Query: 228 CNGCHGD 234
             G   +
Sbjct: 183 KCGKGKE 189


>gnl|CDD|177356 PHA02256, PHA02256, hypothetical protein.
          Length = 113

 Score = 26.8 bits (59), Expect = 7.0
 Identities = 15/55 (27%), Positives = 26/55 (47%), Gaps = 5/55 (9%)

Query: 272 VYVGLCVATYIIFQKNVFIASIVGVVVAIY--VSVAEYILNDKTAAFDPPSIITK 324
           V+  L + T +IF  + F +  V V+  IY  + +  YI+      F   S++ K
Sbjct: 10  VFTCLSLLTLMIFVHSKFSSKNVFVLYVIYAIIGIGTYIV---LTMFQTTSVLIK 61


>gnl|CDD|237654 PRK14278, PRK14278, chaperone protein DnaJ; Provisional.
          Length = 378

 Score = 27.7 bits (62), Expect = 9.3
 Identities = 13/41 (31%), Positives = 18/41 (43%), Gaps = 2/41 (4%)

Query: 142 NECNTGYFQSYKDEKTILCSKCHASCESGCSTGGPKGCTKC 182
            EC TG  +    +  +LC +CH    +G S   P  C  C
Sbjct: 124 EECATGVTKQVTVDTAVLCDRCHGKGTAGDSK--PVTCDTC 162


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.136    0.450 

Gapped
Lambda     K      H
   0.267   0.0510    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,584,498
Number of extensions: 1376986
Number of successful extensions: 1394
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1364
Number of HSP's successfully gapped: 80
Length of query: 329
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 232
Effective length of database: 6,635,264
Effective search space: 1539381248
Effective search space used: 1539381248
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.0 bits)