RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy13144
         (1031 letters)



>gnl|CDD|205157 pfam12947, EGF_3, EGF domain.  This family includes a variety of
           EGF-like domain homologues. This family includes the
           C-terminal domain of the malaria parasite MSP1 protein.
          Length = 36

 Score = 29.4 bits (67), Expect = 0.67
 Identities = 14/28 (50%), Positives = 15/28 (53%)

Query: 521 PGTCGQNANCRVINHSPICTCKPGFTGD 548
            G C  NA C     S  CTCK G+TGD
Sbjct: 5   NGGCHPNATCTNTGGSFTCTCKSGYTGD 32



 Score = 26.7 bits (60), Expect = 5.1
 Identities = 13/28 (46%), Positives = 15/28 (53%)

Query: 101 PGTCGQNANCKVINHSPICRCKAGFTGD 128
            G C  NA C     S  C CK+G+TGD
Sbjct: 5   NGGCHPNATCTNTGGSFTCTCKSGYTGD 32



 Score = 26.7 bits (60), Expect = 5.1
 Identities = 13/28 (46%), Positives = 15/28 (53%)

Query: 299 PGTCGQNANCKVINHSPICRCKAGFTGD 326
            G C  NA C     S  C CK+G+TGD
Sbjct: 5   NGGCHPNATCTNTGGSFTCTCKSGYTGD 32


>gnl|CDD|201524 pfam00954, S_locus_glycop, S-locus glycoprotein family.  In
            Brassicaceae, self-incompatible plants have a
            self/non-self recognition system. This is sporophytically
            controlled by multiple alleles at a single locus (S).
            S-locus glycoproteins, as well as S-receptor kinases, are
            in linkage with the S-alleles.
          Length = 110

 Score = 31.1 bits (71), Expect = 0.92
 Identities = 12/25 (48%), Positives = 13/25 (52%), Gaps = 1/25 (4%)

Query: 985  GSCGYNALCKVINHSPICTCPDGFV 1009
            G CG    C  +N SP C C  GFV
Sbjct: 84   GRCGPYGYC-DVNTSPKCNCIKGFV 107



 Score = 29.9 bits (68), Expect = 1.9
 Identities = 11/20 (55%), Positives = 14/20 (70%), Gaps = 1/20 (5%)

Query: 158 CGPYSQCRDINGSPSCSCLP 177
           CGPY  C D+N SP C+C+ 
Sbjct: 86  CGPYGYC-DVNTSPKCNCIK 104



 Score = 29.2 bits (66), Expect = 3.8
 Identities = 11/22 (50%), Positives = 14/22 (63%), Gaps = 1/22 (4%)

Query: 930 CGPNSQCRDINGSPSCSCLPTF 951
           CGP   C D+N SP C+C+  F
Sbjct: 86  CGPYGYC-DVNTSPKCNCIKGF 106



 Score = 28.8 bits (65), Expect = 5.4
 Identities = 11/24 (45%), Positives = 12/24 (50%), Gaps = 1/24 (4%)

Query: 522 GTCGQNANCRVINHSPICTCKPGF 545
           G CG    C  +N SP C C  GF
Sbjct: 84  GRCGPYGYC-DVNTSPKCNCIKGF 106


>gnl|CDD|234196 TIGR03397, acid_phos_Burk, acid phosphatase, Burkholderia-type.  A
           member of this family, AcpA from Burkholderia mallei,
           has been characterized as a surface-bound glycoprotein
           with acid phosphatase activity, as can be shown with the
           chromogenic substrate 5-bromo-4-chloro-3-indolyl
           phosphate. This family shares regions of sequence
           similarity with phosphocholine-preferring phospholipase
           C enzymes (TIGR03396) from many of the same species.
          Length = 483

 Score = 32.5 bits (74), Expect = 1.5
 Identities = 21/71 (29%), Positives = 32/71 (45%), Gaps = 9/71 (12%)

Query: 612 NHQ-AVCSCLPNYFGS--PPACRPECTVNTDCPLDKACFNQKCVDPCPDSPPPPLESPPE 668
           NHQ  +C+C P Y  +   PA      ++ D P          + P  +SP   L+ PP+
Sbjct: 162 NHQYLICACAPFYPDADKSPAKSSISVLDGDGPKGTR------LKPADNSPASALDGPPK 215

Query: 669 YVNPCIPSPCG 679
           +VN    +P G
Sbjct: 216 FVNDGNLTPDG 226


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 32.2 bits (73), Expect = 2.1
 Identities = 26/119 (21%), Positives = 33/119 (27%), Gaps = 8/119 (6%)

Query: 129  PFTYCNRIPPPPPPQEDVPEPVNPCYPSPCGPYSQCRDINGSPSCSCLPSYIGSPPNCRP 188
              T     PPPPP  E  P  +    P P GP +         +   LP+    P     
Sbjct: 2694 SLTSLADPPPPPPTPEPAPHALVSATPLPPGPAA------ARQASPALPAAPAPPAVPAG 2747

Query: 189  ECIQNSECPYDKACINEKCADPCPGFCPPGTTGSPFVQCKPIVHEPVYTNPCQPSPCGP 247
                       +         P P   P    G P    +P V     +    PSP  P
Sbjct: 2748 PATPGGPARPARPPTTAGPPAPAPPAAPAA--GPPRRLTRPAVASLSESRESLPSPWDP 2804



 Score = 31.1 bits (70), Expect = 3.9
 Identities = 22/99 (22%), Positives = 24/99 (24%), Gaps = 4/99 (4%)

Query: 127  GDPFTYCNRIPPPPPPQEDVPEPVNPCYPSPCGPYSQCRDIN-GSPSCSCLPSYIGSPPN 185
            GDP        PP  P   VP P     P P  P    R     +P  S  P        
Sbjct: 2549 GDPPPPLPPAAPPAAPDRSVPPP--RPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRG 2606

Query: 186  CRPECIQNSECPYDKACINEKCADPCP-GFCPPGTTGSP 223
                    S  P D    +     P P    P       
Sbjct: 2607 DPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPT 2645


>gnl|CDD|184519 PRK14120, gpmA, phosphoglyceromutase; Provisional.
          Length = 249

 Score = 31.2 bits (71), Expect = 2.4
 Identities = 16/54 (29%), Positives = 18/54 (33%), Gaps = 22/54 (40%)

Query: 657 DSPPPPLESPPEYVNPCIPSPCGPYSQCRDIGGSPSCSCLPNYIGAPPNCRPEC 710
           D+PPPP+E   E            YSQ  D          P Y       R EC
Sbjct: 121 DTPPPPIEDGSE------------YSQDND----------PRYADLGVGPRTEC 152


>gnl|CDD|238011 cd00054, EGF_CA, Calcium-binding EGF-like domain, present in a
           large number of membrane-bound and extracellular (mostly
           animal) proteins. Many of these proteins require calcium
           for their biological function and calcium-binding sites
           have been found to be located at the N-terminus of
           particular EGF-like domains; calcium-binding may be
           crucial for numerous protein-protein interactions. Six
           conserved core cysteines form three disulfide bridges as
           in non calcium-binding EGF domains, whose structures are
           very similar. EGF_CA can be found in tandem repeat
           arrangements.
          Length = 38

 Score = 27.6 bits (62), Expect = 2.6
 Identities = 11/33 (33%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 922 VNPC-IPSPCGPNSQCRDINGSPSCSCLPTFIG 953
           ++ C   +PC     C +  GS  CSC P + G
Sbjct: 2   IDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34



 Score = 26.8 bits (60), Expect = 5.0
 Identities = 12/33 (36%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 670 VNPC-IPSPCGPYSQCRDIGGSPSCSCLPNYIG 701
           ++ C   +PC     C +  GS  CSC P Y G
Sbjct: 2   IDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34



 Score = 26.4 bits (59), Expect = 8.5
 Identities = 12/33 (36%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 150 VNPCY-PSPCGPYSQCRDINGSPSCSCLPSYIG 181
           ++ C   +PC     C +  GS  CSC P Y G
Sbjct: 2   IDECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34


>gnl|CDD|188093 TIGR00864, PCC, polycystin cation channel protein.  The Polycystin
           Cation Channel (PCC) Family (TC 1.A.5) Polycystin is a
           huge protein of 4303aas. Its repeated leucine-rich (LRR)
           segment is found in many proteins. It contains 16
           polycystic kidney disease (PKD) domains, one
           LDL-receptor class A domain, one C-type lectin family
           domain, and 16-18 putative TMSs in positions between
           residues 2200 and 4100. Polycystin-L has been shown to
           be a cation (Na+, K+ and Ca2+) channel that is activated
           by Ca2+. Two members of the PCC family (polycystin 1 and
           2) are mutated in autosomal dominant polycystic kidney
           disease, and polycystin-L is deleted in mice with renal
           and retinal defects. Note: this model is restricted to
           the amino half for technical reasons.
          Length = 2740

 Score = 31.2 bits (70), Expect = 4.0
 Identities = 19/68 (27%), Positives = 23/68 (33%), Gaps = 8/68 (11%)

Query: 37  VYTNPCQPSPCGPNSQCREVNHQAVCSCLPNYFGSPPACRPE---CTVNSDC-PLDKSCQ 92
           +Y   C+    G          +A C    N     PAC      C     C PLD  C 
Sbjct: 524 IYRLRCRLPGAGGP----ACGPEAECRPPDNRSADAPACMKGEQWCPFAHICLPLDAPCH 579

Query: 93  NQKCADPC 100
            Q CA+ C
Sbjct: 580 PQACANGC 587



 Score = 31.2 bits (70), Expect = 4.0
 Identities = 19/68 (27%), Positives = 23/68 (33%), Gaps = 8/68 (11%)

Query: 235 VYTNPCQPSPCGPNSQCREVNHQAVCSCLPNYFGSPPACRPE---CTVNSDC-PLDKSCQ 290
           +Y   C+    G          +A C    N     PAC      C     C PLD  C 
Sbjct: 524 IYRLRCRLPGAGGP----ACGPEAECRPPDNRSADAPACMKGEQWCPFAHICLPLDAPCH 579

Query: 291 NQKCADPC 298
            Q CA+ C
Sbjct: 580 PQACANGC 587


>gnl|CDD|222546 pfam14107, DUF4280, Domain of unknown function (DUF4280).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria and eukaryotes.
           Proteins in this family are typically between 129 and
           456 amino acids in length. There is a single completely
           conserved residue C that may be functionally important.
          Length = 109

 Score = 28.1 bits (63), Expect = 8.0
 Identities = 13/31 (41%), Positives = 15/31 (48%)

Query: 574 PGTTGNPFVLCKLVQNEPVYTNPCQPSPCGP 604
           PG    PF +CK   N  V   PC+P   GP
Sbjct: 41  PGVNIPPFGMCKSPANPTVAPAPCKPVIAGP 71


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.321    0.140    0.510 

Gapped
Lambda     K      H
   0.267   0.0656    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 49,593,889
Number of extensions: 4517119
Number of successful extensions: 4782
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4626
Number of HSP's successfully gapped: 277
Length of query: 1031
Length of database: 10,937,602
Length adjustment: 107
Effective length of query: 924
Effective length of database: 6,191,724
Effective search space: 5721152976
Effective search space used: 5721152976
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 64 (28.6 bits)