RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy6524
         (457 letters)



>gnl|CDD|238113 cd00190, Tryp_SPc, Trypsin-like serine protease; Many of these are
           synthesized as inactive precursor zymogens that are
           cleaved during limited proteolysis to generate their
           active forms. Alignment contains also inactive enzymes
           that have substitutions of the catalytic triad residues.
          Length = 232

 Score =  164 bits (417), Expect = 6e-48
 Identities = 67/114 (58%), Positives = 77/114 (67%), Gaps = 2/114 (1%)

Query: 337 PSGKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAG--RGE 394
           P+G   TV GWGRTSEGG L     EV VPI+S  +C+        IT NMLCAG   G 
Sbjct: 119 PAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGG 178

Query: 395 MDSCQGDSGGPLIINDVGRYELVGIVSWGVGCGRPGYPGVYTRVNRYLSWVKRN 448
            D+CQGDSGGPL+ ND GR  LVGIVSWG GC RP YPGVYTRV+ YL W+++ 
Sbjct: 179 KDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232



 Score =  145 bits (369), Expect = 5e-41
 Identities = 48/128 (37%), Positives = 69/128 (53%), Gaps = 6/128 (4%)

Query: 30  IVGGRPTGVNKYPWVARLVY-DGNFHCGASLINEDYVLTAAHCVRRLKRSKIRIVLGDYD 88
           IVGG    +  +PW   L Y  G   CG SLI+  +VLTAAHCV     S   + LG +D
Sbjct: 1   IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHD 60

Query: 89  QSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEY 148
            S          +  V  ++ H +++ + Y++DIALLKL++PV+ + +VRPICLP     
Sbjct: 61  LSSNEGG---GQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGY- 116

Query: 149 HTVVKGTM 156
             +  GT 
Sbjct: 117 -NLPAGTT 123



 Score =  139 bits (352), Expect = 1e-38
 Identities = 56/137 (40%), Positives = 77/137 (56%), Gaps = 6/137 (4%)

Query: 181 SKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSV 240
           S   + LG +D S          +  V  ++ H +++ + Y++DIALLKL++PV+ + +V
Sbjct: 50  SNYTVRLGSHDLSSNEGG---GQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNV 106

Query: 241 RPICLPPDNIDPS-GKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRIT 299
           RPICLP    +   G   TV GWGRTSEGG L     EV VPI+S  +C+        IT
Sbjct: 107 RPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTIT 166

Query: 300 PNMLCAG--RGEMDSCQ 314
            NMLCAG   G  D+CQ
Sbjct: 167 DNMLCAGGLEGGKDACQ 183


>gnl|CDD|214473 smart00020, Tryp_SPc, Trypsin-like serine protease.  Many of these
           are synthesised as inactive precursor zymogens that are
           cleaved during limited proteolysis to generate their
           active forms. A few, however, are active as single chain
           molecules, and others are inactive due to substitutions
           of the catalytic triad residues.
          Length = 229

 Score =  149 bits (378), Expect = 2e-42
 Identities = 70/117 (59%), Positives = 77/117 (65%), Gaps = 4/117 (3%)

Query: 332 STDIDPSGKMGTVVGWGRTSEG-GSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCA 390
           S    P+G   TV GWGRTSEG GSL     EV VPI+S   CR        IT NMLCA
Sbjct: 114 SNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCA 173

Query: 391 G--RGEMDSCQGDSGGPLIINDVGRYELVGIVSWGVGCGRPGYPGVYTRVNRYLSWV 445
           G   G  D+CQGDSGGPL+ ND GR+ LVGIVSWG GC RPG PGVYTRV+ YL W+
Sbjct: 174 GGLEGGKDACQGDSGGPLVCND-GRWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI 229



 Score =  145 bits (368), Expect = 7e-41
 Identities = 56/129 (43%), Positives = 77/129 (59%), Gaps = 7/129 (5%)

Query: 29  RIVGGRPTGVNKYPWVARLVYDGNFH-CGASLINEDYVLTAAHCVRRLKRSKIRIVLGDY 87
           RIVGG    +  +PW   L Y G  H CG SLI+  +VLTAAHCVR    S IR+ LG +
Sbjct: 1   RIVGGSEANIGSFPWQVSLQYGGGRHFCGGSLISPRWVLTAAHCVRGSDPSNIRVRLGSH 60

Query: 88  DQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSE 147
           D S      E   +  VS ++ H +++ + Y++DIALLKL++PV+ + +VRPICLP  + 
Sbjct: 61  DLSSG----EEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKEPVTLSDNVRPICLPSSNY 116

Query: 148 YHTVVKGTM 156
              V  GT 
Sbjct: 117 --NVPAGTT 123



 Score =  130 bits (329), Expect = 3e-35
 Identities = 63/138 (45%), Positives = 82/138 (59%), Gaps = 8/138 (5%)

Query: 181 SKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSV 240
           S IR+ LG +D S      E   +  VS ++ H +++ + Y++DIALLKL++PV+ + +V
Sbjct: 51  SNIRVRLGSHDLSSG----EEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKEPVTLSDNV 106

Query: 241 RPICLPPDNIDPS-GKMGTVVGWGRTSEG-GSLATEALEVQVPILSPGQCRAMKYKPSRI 298
           RPICLP  N +   G   TV GWGRTSEG GSL     EV VPI+S   CR        I
Sbjct: 107 RPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAI 166

Query: 299 TPNMLCAG--RGEMDSCQ 314
           T NMLCAG   G  D+CQ
Sbjct: 167 TDNMLCAGGLEGGKDACQ 184


>gnl|CDD|215708 pfam00089, Trypsin, Trypsin. 
          Length = 218

 Score =  121 bits (306), Expect = 3e-32
 Identities = 54/107 (50%), Positives = 64/107 (59%), Gaps = 6/107 (5%)

Query: 339 GKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRGEMDSC 398
           G   TV GWG T   G   T   EV VP++S   CR        +T NM+CAG G  D+C
Sbjct: 118 GTTCTVSGWGNTKTLGLPDT-LQEVTVPVVSRETCR--SAYGGTVTDNMICAGAGGKDAC 174

Query: 399 QGDSGGPLIINDVGRYELVGIVSWGVGCGRPGYPGVYTRVNRYLSWV 445
           QGDSGGPL+ +D    EL+GIVSWG GC    YPGVYT V+ YL W+
Sbjct: 175 QGDSGGPLVCSDG---ELIGIVSWGYGCASGNYPGVYTPVSSYLDWI 218



 Score =  105 bits (265), Expect = 2e-26
 Identities = 49/120 (40%), Positives = 66/120 (55%), Gaps = 7/120 (5%)

Query: 30  IVGGRPTGVNKYPWVARLVYDGNFH-CGASLINEDYVLTAAHCVRRLKRSKIRIVLGDYD 88
           IVGG       +PW   L      H CG SLI+E++VLTAAHCV       +R+VLG ++
Sbjct: 1   IVGGDEAQPGSFPWQVSLQVSSGKHFCGGSLISENWVLTAAHCVSN--AKSVRVVLGAHN 58

Query: 89  QSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEY 148
             V  E  E      V  ++ H +++ +  N DIALLKL+ PV+   +VRPICLP  S  
Sbjct: 59  -IVLREGGEQK--FDVKKVIVHPNYNPDTDN-DIALLKLKSPVTLGDTVRPICLPTASSD 114



 Score = 99.4 bits (248), Expect = 4e-24
 Identities = 53/136 (38%), Positives = 73/136 (53%), Gaps = 8/136 (5%)

Query: 180 SSKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKS 239
           +  +R+VLG ++  V  E  E      V  ++ H +++ +  N DIALLKL+ PV+   +
Sbjct: 47  AKSVRVVLGAHN-IVLREGGEQK--FDVKKVIVHPNYNPDTDN-DIALLKLKSPVTLGDT 102

Query: 240 VRPICLP-PDNIDPSGKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRI 298
           VRPICLP   +  P G   TV GWG T   G   T   EV VP++S   CR        +
Sbjct: 103 VRPICLPTASSDLPVGTTCTVSGWGNTKTLGLPDT-LQEVTVPVVSRETCR--SAYGGTV 159

Query: 299 TPNMLCAGRGEMDSCQ 314
           T NM+CAG G  D+CQ
Sbjct: 160 TDNMICAGAGGKDACQ 175


>gnl|CDD|227927 COG5640, COG5640, Secreted trypsin-like serine protease
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 413

 Score = 79.5 bits (196), Expect = 5e-16
 Identities = 39/120 (32%), Positives = 49/120 (40%), Gaps = 14/120 (11%)

Query: 343 TVVGWGRTSEGGSL-----ATEALEVQVPILSPGQCRAMKYKPSRITPNM------LCAG 391
           T   +G T+           T   EV V  +    C   +YK      +        CAG
Sbjct: 162 TNGTFGVTTPSDVPRSSPKGTILHEVAVLFVPLSTC--AQYKGCANASDGATGLTGFCAG 219

Query: 392 RGEMDSCQGDSGGPLIINDVGRYELVGIVSWGVG-CGRPGYPGVYTRVNRYLSWVKRNMK 450
           R   D+CQGDSGGP+           G+VSWG G CG    PGVYT V+ Y  W+     
Sbjct: 220 RPPKDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN 279



 Score = 52.6 bits (126), Expect = 2e-07
 Identities = 38/126 (30%), Positives = 51/126 (40%), Gaps = 15/126 (11%)

Query: 16  TCLLECGVTNQEV--RIVGGRPTGVNKYPWVARLV------YDGNFHCGASLINEDYVLT 67
           T       T  EV  RI+GG      +YP +  LV        G F CG S +   YVLT
Sbjct: 17  TLQPSAAQTADEVSSRIIGGSNANAGEYPSLVALVDRISDYVSGTF-CGGSKLGGRYVLT 75

Query: 68  AAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMMR-AVSSIVRHRHFDVNNYNHDIALLK 126
           AAHC           +  D ++ V          R  V +I  H  +   N  +DIA+L+
Sbjct: 76  AAHCA-----DASSPISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLE 130

Query: 127 LRKPVS 132
           L +  S
Sbjct: 131 LARAAS 136


>gnl|CDD|220189 pfam09342, DUF1986, Domain of unknown function (DUF1986).  This
           domain is found in serine proteases and is predicted to
           contain disulphide bonds.
          Length = 267

 Score = 43.9 bits (103), Expect = 8e-05
 Identities = 29/108 (26%), Positives = 50/108 (46%), Gaps = 10/108 (9%)

Query: 41  YPWVARLVYDGNFHCGASLINEDYVLTAAHCVR--RLKRSKIRIVLGDYDQSVTTETAEP 98
           +PW+A++  +GN+ C   LI+  +VL +  C+    L+ S I +VLG +    + +    
Sbjct: 16  WPWIAKVYVEGNYRCTGVLIDLSWVLVSHSCLWDTSLEHSYISVVLGGHKTLKSVKGPYE 75

Query: 99  TMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDS 146
            + R     V  R          I+LL L+ P +F+  V P  +P   
Sbjct: 76  QIYR-----VDCRKD---LPRSKISLLHLKSPATFSNHVLPTFVPSTR 115


>gnl|CDD|226119 COG3591, COG3591, V8-like Glu-specific endopeptidase [Amino acid
           transport and metabolism].
          Length = 251

 Score = 38.1 bits (89), Expect = 0.005
 Identities = 13/48 (27%), Positives = 20/48 (41%), Gaps = 3/48 (6%)

Query: 39 NKYPW---VARLVYDGNFHCGASLINEDYVLTAAHCVRRLKRSKIRIV 83
           ++P+   V      G     A+LI  + VLTA HC+      +  I 
Sbjct: 47 TQFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYSPDYGEDDIA 94


>gnl|CDD|222077 pfam13365, Trypsin_2, Trypsin-like peptidase domain.  This family
           includes trypsin like peptidase domains.
          Length = 138

 Score = 34.5 bits (79), Expect = 0.038
 Identities = 21/127 (16%), Positives = 37/127 (29%), Gaps = 20/127 (15%)

Query: 61  NEDYVLTAAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNH 120
           ++  +LT AH V     S+I +VL D  +                 +V     D    + 
Sbjct: 8   SDGLILTNAHVVEDADASEIEVVLPDGGRVPAE-------------VV---AADP---DL 48

Query: 121 DIALLKLRKPV-SFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLV 179
           D+ALLK+  P+      +     P       V          +             +  V
Sbjct: 49  DLALLKVDGPLLPAAPLLASSAAPLGGSVVVVGGPGGIGLGASGGGGGVGGLVSGSLGGV 108

Query: 180 SSKIRIV 186
             +  + 
Sbjct: 109 DGRYILT 115


>gnl|CDD|132758 cd07073, NR_LBD_AR, Ligand binding domain of the nuclear receptor
           androgen receptor, ligand activated transcription
           regulator.  The ligand binding domain of the androgen
           receptor (AR): AR is a member of the nuclear receptor
           family. It is activated by binding either of the
           androgenic hormones, testosterone or
           dihydrotestosterone, which are responsible for male
           primary sexual characteristics and for secondary male
           characteristics, respectively. The primary mechanism of
           action of ARs is by direct regulation of gene
           transcription. The binding of an androgen results in a
           conformational change in the androgen receptor which
           causes its transport from the cytosol into the cell
           nucleus, and dimerization. The receptor dimer binds to a
           hormone response element of AR-regulated genes and
           modulates their expression. Another mode of action is
           independent of their interactions with DNA. The
           receptors interact directly with signal transduction
           proteins in the cytoplasm, causing rapid changes in cell
           function, such as ion transport. Like other members of
           the nuclear receptor (NR) superfamily of
           ligand-activated transcription factors, AR has a
           central well conserved DNA binding domain (DBD), a
           variable N-terminal domain, a flexible hinge and a
           C-terminal ligand binding domain (LBD).  The LBD is not
           only involved in binding to androgen, but also involved
           in binding of coactivator proteins and dimerization. A
           ligand dependent nuclear export signal is also present
           at the ligand binding domain.
          Length = 246

 Score = 31.8 bits (72), Expect = 0.50
 Identities = 18/55 (32%), Positives = 30/55 (54%), Gaps = 7/55 (12%)

Query: 383 ITPNMLCAGRGEMDSCQGDSGGPLI--INDVGRYELVGIVSWGVGCGRPGYPGVY 435
           I P ++CAG    D+ Q DS   L+  +N++G  +LV +V W      PG+  ++
Sbjct: 9   IEPGVVCAGH---DNNQPDSFAALLSSLNELGERQLVHVVKWAKAL--PGFRNLH 58


>gnl|CDD|223474 COG0397, COG0397, Uncharacterized conserved protein [Function
           unknown].
          Length = 488

 Score = 31.1 bits (71), Expect = 1.3
 Identities = 11/41 (26%), Positives = 16/41 (39%), Gaps = 2/41 (4%)

Query: 88  DQSVTTETAEP--TMMRAVSSIVRHRHFDVNNYNHDIALLK 126
            + V  E  EP   + R   S +R   F+   Y     LL+
Sbjct: 161 GEPVQREDEEPSAVLTRLAPSHIRFGTFERFAYRDRRDLLR 201



 Score = 31.1 bits (71), Expect = 1.3
 Identities = 11/41 (26%), Positives = 16/41 (39%), Gaps = 2/41 (4%)

Query: 191 DQSVTTETAEP--TMMRAVSSIVRHRHFDVNNYNHDIALLK 229
            + V  E  EP   + R   S +R   F+   Y     LL+
Sbjct: 161 GEPVQREDEEPSAVLTRLAPSHIRFGTFERFAYRDRRDLLR 201


>gnl|CDD|218673 pfam05642, Sporozoite_P67, Sporozoite P67 surface antigen.  This
           family consists of several Theileria P67 surface
           antigens. A stage specific surface antigen of Theileria
           parva, p67, is the basis for the development of an
           anti-sporozoite vaccine for the control of East Coast
           fever (ECF) in cattle. The antigen has been shown to
           contain five distinct linear peptide sequences
           recognised by sporozoite-neutralising murine monoclonal
           antibodies.
          Length = 727

 Score = 30.0 bits (67), Expect = 3.5
 Identities = 34/147 (23%), Positives = 48/147 (32%), Gaps = 18/147 (12%)

Query: 254 GKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRIT-PNMLCAGRGEMDS 312
           G+ G   G G    GG      L          Q    +    R+  P +   G    DS
Sbjct: 214 GRAGVSPGVGVGGLGGVPGVGILASNTSREGQTQDDQERDGDGRVIEPGVGLPGVRVGDS 273

Query: 313 CQDLAPRRPTESHLHFHFLSTDIDPSGKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQ 372
               +  RP+ S       +T   P+    +  G   +S   ++      +  PI SPG 
Sbjct: 274 TSSPSTTRPSGS-------TTTTTPASSGPSAPGGPGSSSRNAVTRSTDSISGPIPSPGA 326

Query: 373 CRAMKYKPSRITPNMLCAGRGEMDSCQ 399
            RA       IT  M   G  EM + Q
Sbjct: 327 PRA-------ITGQM---GEREMFAVQ 343


>gnl|CDD|221348 pfam11969, DcpS_C, Scavenger mRNA decapping enzyme C-term binding. 
           This family consists of several scavenger mRNA decapping
           enzymes (DcpS) and is the C-terminal region. DcpS is a
           scavenger pyrophosphatase that hydrolyses the residual
           cap structure following 3' to 5' decay of an mRNA. The
           association of DcpS with 3' to 5' exonuclease exosome
           components suggests that these two activities are linked
           and there is a coupled exonucleolytic decay-dependent
           decapping pathway. The C-terminal domain contains a
           histidine triad (HIT) sequence with three histidines
           separated by hydrophobic residues. The central histidine
           within the DcpS HIT motif is critical for decapping
           activity and defines the HIT motif as a new mRNA
           decapping domain, making DcpS the first member of the
           HIT family of proteins with a defined biological
           function.
          Length = 113

 Score = 28.3 bits (64), Expect = 3.7
 Identities = 7/20 (35%), Positives = 11/20 (55%)

Query: 321 PTESHLHFHFLSTDIDPSGK 340
           P+  HLH H ++ D +P   
Sbjct: 90  PSVYHLHLHVIAPDFEPGLG 109


>gnl|CDD|219065 pfam06506, PrpR_N, Propionate catabolism activator.  This domain is
           found at the N terminus of several sigma54-dependent
           transcriptional activators including PrpR, which
           activates catabolism of propionate.
          Length = 169

 Score = 28.3 bits (64), Expect = 5.6
 Identities = 10/37 (27%), Positives = 19/37 (51%), Gaps = 3/37 (8%)

Query: 68  AAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMMRAV 104
           A   V+ LK   I++++GD    +  + AE   ++ V
Sbjct: 112 ARAAVKELKAQGIKVIVGD---GLVCDLAEQAGLQGV 145


>gnl|CDD|185390 PRK15493, PRK15493, 5-methylthioadenosine/S-adenosylhomocysteine
           deaminase; Provisional.
          Length = 435

 Score = 28.5 bits (63), Expect = 7.6
 Identities = 16/54 (29%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 121 DIALLKLRKP--VSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRD 172
           ++ LL++ K    SF+    PI +  D+   TV +  MR      + +FGT+ D
Sbjct: 107 ELGLLEMVKSGTTSFSDMFNPIGVDQDAIMETVSRSGMRAAVSRTLFSFGTKED 160


>gnl|CDD|112433 pfam03615, GCM, GCM motif protein. 
          Length = 143

 Score = 27.4 bits (61), Expect = 8.6
 Identities = 14/40 (35%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 368 LSPGQC-RAMKYKPSRITPNMLCAGRGEMDSCQGDSGGPL 406
           L P  C +A + +  +  PN  C GR E+  C+G  G P+
Sbjct: 68  LRPAICDKARRKQQGKQCPNRGCNGRLELIPCRGHCGYPV 107


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.321    0.136    0.423 

Gapped
Lambda     K      H
   0.267   0.0738    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 23,056,350
Number of extensions: 2192723
Number of successful extensions: 1498
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1474
Number of HSP's successfully gapped: 25
Length of query: 457
Length of database: 10,937,602
Length adjustment: 100
Effective length of query: 357
Effective length of database: 6,502,202
Effective search space: 2321286114
Effective search space used: 2321286114
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 61 (27.3 bits)