Query         psy2950
Match_columns 406
No_of_seqs    246 out of 2346
Neff          9.3 
Searched_HMMs 46136
Date          Fri Aug 16 21:42:56 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy2950.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/2950hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00190 Tryp_SPc Trypsin-like  100.0 6.1E-34 1.3E-38  258.0  14.6  191    7-215     1-232 (232)
  2 KOG3627|consensus              100.0 6.1E-32 1.3E-36  249.4  16.9  196    5-217    11-255 (256)
  3 smart00020 Tryp_SPc Trypsin-li 100.0 5.5E-31 1.2E-35  238.5  15.5  188    6-212     1-229 (229)
  4 PF00089 Trypsin:  Trypsin;  In 100.0 7.1E-29 1.5E-33  223.0  13.0  181    7-212     1-220 (220)
  5 KOG3627|consensus              100.0 9.8E-28 2.1E-32  221.3  17.2  164  222-389    85-255 (256)
  6 cd00190 Tryp_SPc Trypsin-like  100.0 1.6E-27 3.4E-32  215.9  17.9  197  178-387    33-232 (232)
  7 smart00020 Tryp_SPc Trypsin-li  99.9 3.6E-24 7.8E-29  193.8  18.1  158  222-384    69-229 (229)
  8 PF00089 Trypsin:  Trypsin;  In  99.9 6.4E-23 1.4E-27  184.2  16.4  186  178-384    33-220 (220)
  9 COG5640 Secreted trypsin-like   99.9 3.8E-23 8.2E-28  186.9  13.6  230    4-249    30-323 (413)
 10 COG5640 Secreted trypsin-like   99.8 3.1E-18 6.8E-23  155.3  17.1  205  178-397    69-287 (413)
 11 PF03761 DUF316:  Domain of unk  99.0   5E-09 1.1E-13   98.0  12.6  164    5-213    40-276 (282)
 12 PF09342 DUF1986:  Domain of un  98.5 6.7E-07 1.5E-11   78.2   9.4   82   15-110    13-131 (267)
 13 PF03761 DUF316:  Domain of unk  98.4 1.6E-06 3.4E-11   81.1  10.9  117  237-384   157-275 (282)
 14 PF09342 DUF1986:  Domain of un  96.7   0.044 9.5E-07   48.6  12.6   60  217-282    71-131 (267)
 15 COG3591 V8-like Glu-specific e  96.5  0.0092   2E-07   53.7   7.6   53  160-216   197-250 (251)
 16 TIGR02037 degP_htrA_DO peripla  94.1    0.11 2.4E-06   51.6   6.3   63   29-110    60-142 (428)
 17 PRK10898 serine endoprotease;   93.5    0.97 2.1E-05   43.6  11.4   24  163-190   195-218 (353)
 18 TIGR02038 protease_degS peripl  92.4    0.57 1.2E-05   45.2   8.1   25  162-190   194-218 (351)
 19 TIGR02037 degP_htrA_DO peripla  89.6     1.7 3.6E-05   43.2   8.6   39  240-282   104-142 (428)
 20 PRK10139 serine endoprotease;   88.0     1.4 2.9E-05   44.2   6.7   25  162-190   208-232 (455)
 21 PF02395 Peptidase_S6:  Immunog  87.4    0.51 1.1E-05   49.9   3.4   53  335-391   213-266 (769)
 22 PRK10898 serine endoprotease;   86.3     8.6 0.00019   37.1  11.0   37  240-281   124-160 (353)
 23 COG3591 V8-like Glu-specific e  84.4     1.8 3.9E-05   39.2   4.9   53  332-388   197-250 (251)
 24 PF02395 Peptidase_S6:  Immunog  82.6       1 2.2E-05   47.7   3.0   32  163-194   213-245 (769)
 25 TIGR02038 protease_degS peripl  82.5     6.5 0.00014   37.9   8.3   38  240-282   124-161 (351)
 26 PRK10942 serine endoprotease;   80.0     4.9 0.00011   40.4   6.7   24  163-190   230-253 (473)
 27 PRK10139 serine endoprotease;   79.6     9.8 0.00021   38.1   8.7  122  240-388   137-259 (455)
 28 PF05580 Peptidase_S55:  SpoIVB  69.4     6.4 0.00014   34.7   3.8   28  332-364   174-201 (218)
 29 PRK10942 serine endoprotease;   60.8      39 0.00085   34.0   8.1   37  240-280   158-194 (473)
 30 PF00947 Pico_P2A:  Picornaviru  59.3      10 0.00022   30.3   2.9   35  337-381    89-123 (127)
 31 PF00947 Pico_P2A:  Picornaviru  52.0      18 0.00039   29.0   3.2   34  165-208    89-122 (127)
 32 TIGR02860 spore_IV_B stage IV   34.1      37  0.0008   33.2   2.9   46  332-388   354-399 (402)
 33 PF05579 Peptidase_S32:  Equine  32.0      51  0.0011   30.1   3.2   22  337-362   207-228 (297)
 34 PF13365 Trypsin_2:  Trypsin-li  29.6      48   0.001   25.5   2.5   18  337-358   103-120 (120)
 35 PF00548 Peptidase_C3:  3C cyst  24.2      95  0.0021   26.5   3.5   26  336-362   145-170 (172)
 36 PF02907 Peptidase_S29:  Hepati  23.7      79  0.0017   25.7   2.7   25  165-194   107-131 (148)
 37 PF00944 Peptidase_S3:  Alphavi  21.3      41 0.00089   27.3   0.6   24  337-364   105-128 (158)

No 1  
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00  E-value=6.1e-34  Score=257.96  Aligned_cols=191  Identities=42%  Similarity=0.818  Sum_probs=164.8

Q ss_pred             cccCCccccceeeEEEEeeee----ccCCCcce---EEEEEEe---------------------------------cCCC
Q psy2950           7 FVEGNPRQLHHQLFIILLRRT----SEGGSLPH---ILQAAEV---------------------------------PLTP   46 (406)
Q Consensus         7 i~~G~~~~~~~~P~~v~i~~~----~c~G~l~~---vltaa~c---------------------------------~~Hp   46 (406)
                      |+||.++..++|||+|+|...    .|+|+|+.   |||||||                                 ++||
T Consensus         1 i~~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp   80 (232)
T cd00190           1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHP   80 (232)
T ss_pred             CcCCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECC
Confidence            789999999999999999865    39999966   9999999                                 4599


Q ss_pred             CCCC-CCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCCCCcccccceee
Q psy2950          47 KEEC-RRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEGGSLPHILQAAE  125 (406)
Q Consensus        47 ~y~~-~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~  125 (406)
                      +|+. ...+|||||||++++.++..++|||||........             + ..+.++|||........+..++...
T Consensus        81 ~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~-------------~-~~~~~~G~g~~~~~~~~~~~~~~~~  146 (232)
T cd00190          81 NYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPA-------------G-TTCTVSGWGRTSEGGPLPDVLQEVN  146 (232)
T ss_pred             CCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCC-------------C-CEEEEEeCCcCCCCCCCCceeeEEE
Confidence            9984 57899999999999999999999999988633332             4 8999999998766545677899999


Q ss_pred             eeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEe
Q psy2950         126 VPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLV  205 (406)
Q Consensus       126 ~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v  205 (406)
                      +.+++.++|+..+..   ...+.+.++|++........|.||+||||++.. +++|+|+||+|++..|.....|.+|+++
T Consensus       147 ~~~~~~~~C~~~~~~---~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~c~~~~~~~~~t~v  222 (232)
T cd00190         147 VPIVSNAECKRAYSY---GGTITDNMLCAGGLEGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRV  222 (232)
T ss_pred             eeeECHHHhhhhccC---cccCCCceEeeCCCCCCCccccCCCCCcEEEEe-CCEEEEEEEEehhhccCCCCCCCEEEEc
Confidence            999999999988764   126889999998654467899999999999988 6999999999999889866778999999


Q ss_pred             eechhhHHHh
Q psy2950         206 SCYSDWVKSI  215 (406)
Q Consensus       206 ~~~~~WI~~~  215 (406)
                      ..|.+||+++
T Consensus       223 ~~~~~WI~~~  232 (232)
T cd00190         223 SSYLDWIQKT  232 (232)
T ss_pred             HHhhHHhhcC
Confidence            9999999853


No 2  
>KOG3627|consensus
Probab=99.98  E-value=6.1e-32  Score=249.36  Aligned_cols=196  Identities=39%  Similarity=0.772  Sum_probs=164.2

Q ss_pred             cccccCCccccceeeEEEEeeee-----ccCCCcce---EEEEEEe----------------------------------
Q psy2950           5 QDFVEGNPRQLHHQLFIILLRRT-----SEGGSLPH---ILQAAEV----------------------------------   42 (406)
Q Consensus         5 ~~i~~G~~~~~~~~P~~v~i~~~-----~c~G~l~~---vltaa~c----------------------------------   42 (406)
                      .||+||.++..+++||+|+|+..     .|+|+|++   |||||||                                  
T Consensus        11 ~~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~   90 (256)
T KOG3627|consen   11 GRIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVE   90 (256)
T ss_pred             CCEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceee
Confidence            58999999999999999999864     49999855   9999988                                  


Q ss_pred             --cCCCCCC-CCCC-cceEEEEeCCccccCCCccccccCCCCCc-cccccccccccccccCCCceEEEeeeccCCCC-CC
Q psy2950          43 --PLTPKEE-CRRS-YAVAGYELTRPFKFNEFVSPICLPNPGLT-VTADVGLISGWGRLSEGADVGLISGWGRLSEG-GS  116 (406)
Q Consensus        43 --~~Hp~y~-~~~~-nDIALlkL~~~v~~~~~v~picl~~~~~~-~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~-~~  116 (406)
                        ++||+|+ .+.. ||||||+|.+++.|+++|+|||||..... ...             +...|.++|||++... ..
T Consensus        91 ~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~-------------~~~~~~v~GWG~~~~~~~~  157 (256)
T KOG3627|consen   91 KIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPP-------------GGTTCLVSGWGRTESGGGP  157 (256)
T ss_pred             EEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCC-------------CCCEEEEEeCCCcCCCCCC
Confidence              2488887 4455 99999999999999999999999855432 111             2278999999987665 35


Q ss_pred             cccccceeeeeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCC-CCC
Q psy2950         117 LPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CAR  195 (406)
Q Consensus       117 ~~~~l~~~~~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~-C~~  195 (406)
                      .+..|+++.+++++.++|+..+....   .+++.++|++......+.|.|||||||++.. .++|+++||+|||.. |..
T Consensus       158 ~~~~L~~~~v~i~~~~~C~~~~~~~~---~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~-~~~~~~~GivS~G~~~C~~  233 (256)
T KOG3627|consen  158 LPDTLQEVDVPIISNSECRRAYGGLG---TITDTMLCAGGPEGGKDACQGDSGGPLVCED-NGRWVLVGIVSWGSGGCGQ  233 (256)
T ss_pred             CCceeEEEEEeEcChhHhcccccCcc---ccCCCEEeeCccCCCCccccCCCCCeEEEee-CCcEEEEEEEEecCCCCCC
Confidence            57889999999999999999887532   3566789999755667899999999999998 558999999999987 998


Q ss_pred             CCCCeeEEEeeechhhHHHhhh
Q psy2950         196 PDFYGVYTLVSCYSDWVKSILY  217 (406)
Q Consensus       196 ~~~p~~~t~v~~~~~WI~~~i~  217 (406)
                      .+.|++|++|..|.+||++.+.
T Consensus       234 ~~~P~vyt~V~~y~~WI~~~~~  255 (256)
T KOG3627|consen  234 PNYPGVYTRVSSYLDWIKENIG  255 (256)
T ss_pred             CCCCeEEeEhHHhHHHHHHHhc
Confidence            7799999999999999998764


No 3  
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.97  E-value=5.5e-31  Score=238.52  Aligned_cols=188  Identities=42%  Similarity=0.789  Sum_probs=161.2

Q ss_pred             ccccCCccccceeeEEEEeeeec----cCCCcce---EEEEEEe--------------------------------cCCC
Q psy2950           6 DFVEGNPRQLHHQLFIILLRRTS----EGGSLPH---ILQAAEV--------------------------------PLTP   46 (406)
Q Consensus         6 ~i~~G~~~~~~~~P~~v~i~~~~----c~G~l~~---vltaa~c--------------------------------~~Hp   46 (406)
                      ||+||.++...+|||+|.++...    |+|+|+.   |||||||                                ++||
T Consensus         1 ~~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p   80 (229)
T smart00020        1 RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHP   80 (229)
T ss_pred             CccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECC
Confidence            68999999999999999998664    9999966   9999999                                3489


Q ss_pred             CCC-CCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCC-CCccccccee
Q psy2950          47 KEE-CRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEG-GSLPHILQAA  124 (406)
Q Consensus        47 ~y~-~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~-~~~~~~l~~~  124 (406)
                      +|+ ....+|||||||++++.+++.++|||||........             + ..+.++|||..... ......++..
T Consensus        81 ~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~-------------~-~~~~~~g~g~~~~~~~~~~~~~~~~  146 (229)
T smart00020       81 NYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPA-------------G-TTCTVSGWGRTSEGAGSLPDTLQEV  146 (229)
T ss_pred             CCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCC-------------C-CEEEEEeCCCCCCCCCcCCCEeeEE
Confidence            987 567899999999999999999999999987433333             4 88999999986542 3456678899


Q ss_pred             eeeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEE
Q psy2950         125 EVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTL  204 (406)
Q Consensus       125 ~~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~  204 (406)
                      .+.+++.+.|...+...   ..+.+.++|++........|.+|+|+||++.. + +|+|+||+|++..|.....|.+|++
T Consensus       147 ~~~~~~~~~C~~~~~~~---~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~  221 (229)
T smart00020      147 NVPIVSNATCRRAYSGG---GAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTR  221 (229)
T ss_pred             EEEEeCHHHhhhhhccc---cccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEE
Confidence            99999999999877542   25788999998765467899999999999988 5 9999999999999986678899999


Q ss_pred             eeechhhH
Q psy2950         205 VSCYSDWV  212 (406)
Q Consensus       205 v~~~~~WI  212 (406)
                      +..|.+||
T Consensus       222 i~~~~~WI  229 (229)
T smart00020      222 VSSYLDWI  229 (229)
T ss_pred             eccccccC
Confidence            99999997


No 4  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.96  E-value=7.1e-29  Score=222.97  Aligned_cols=181  Identities=38%  Similarity=0.772  Sum_probs=156.5

Q ss_pred             cccCCccccceeeEEEEeeeec----cCCCcce---EEEEEEe-------------------------------cCCCCC
Q psy2950           7 FVEGNPRQLHHQLFIILLRRTS----EGGSLPH---ILQAAEV-------------------------------PLTPKE   48 (406)
Q Consensus         7 i~~G~~~~~~~~P~~v~i~~~~----c~G~l~~---vltaa~c-------------------------------~~Hp~y   48 (406)
                      |.||.+++.++|||+|.++...    |+|+|+.   |||||||                               ++||+|
T Consensus         1 i~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~   80 (220)
T PF00089_consen    1 IVGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKY   80 (220)
T ss_dssp             SBSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTS
T ss_pred             CCCCEECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            7899999999999999998755    9999966   9999999                               468898


Q ss_pred             C-CCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCCCCcccccceeeee
Q psy2950          49 E-CRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEGGSLPHILQAAEVP  127 (406)
Q Consensus        49 ~-~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~  127 (406)
                      + ....+|||||||++++.+.+.++|+||+........             + ..+.+.|||.....+ ....++...+.
T Consensus        81 ~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~-------------~-~~~~~~G~~~~~~~~-~~~~~~~~~~~  145 (220)
T PF00089_consen   81 DPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNV-------------G-TSCIVVGWGRTSDNG-YSSNLQSVTVP  145 (220)
T ss_dssp             BTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTT-------------T-SEEEEEESSBSSTTS-BTSBEEEEEEE
T ss_pred             cccccccccccccccccccccccccccccccccccccc-------------c-ccccccccccccccc-ccccccccccc
Confidence            7 446899999999999999999999999985443333             4 899999999865544 45678899999


Q ss_pred             ccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEeee
Q psy2950         128 LTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSC  207 (406)
Q Consensus       128 i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~~  207 (406)
                      +++.+.|...+..     .+.+.++|++.. .....|.||+|+||++.+ .   +|+||.+++..|...+.|.+|+++..
T Consensus       146 ~~~~~~c~~~~~~-----~~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~-~---~lvGI~s~~~~c~~~~~~~v~~~v~~  215 (220)
T PF00089_consen  146 VVSRKTCRSSYND-----NLTPNMICAGSS-GSGDACQGDSGGPLICNN-N---YLVGIVSFGENCGSPNYPGVYTRVSS  215 (220)
T ss_dssp             EEEHHHHHHHTTT-----TSTTTEEEEETT-SSSBGGTTTTTSEEEETT-E---EEEEEEEEESSSSBTTSEEEEEEGGG
T ss_pred             ccccccccccccc-----cccccccccccc-ccccccccccccccccce-e---eecceeeecCCCCCCCcCEEEEEHHH
Confidence            9999999998543     578899999865 557899999999999977 2   79999999999998778999999999


Q ss_pred             chhhH
Q psy2950         208 YSDWV  212 (406)
Q Consensus       208 ~~~WI  212 (406)
                      |.+||
T Consensus       216 ~~~WI  220 (220)
T PF00089_consen  216 YLDWI  220 (220)
T ss_dssp             GHHHH
T ss_pred             hhccC
Confidence            99998


No 5  
>KOG3627|consensus
Probab=99.95  E-value=9.8e-28  Score=221.28  Aligned_cols=164  Identities=45%  Similarity=0.914  Sum_probs=138.9

Q ss_pred             eeeeeeeEee-ccCCCCCCc-CceEEEEeCCCcccCCCcccccCCCCCC---cccCCcEEEEcccccCCC-CCcccccee
Q psy2950         222 QRRRVERIYT-DFYDKSIYK-NDIALLELTRPFKFNEFVSPICLPNPGL---TVTADVGLISGWGRLSEG-GSLPHILQA  295 (406)
Q Consensus       222 ~~~~v~~i~~-p~y~~~~~~-~DiALlkL~~~v~~~~~v~piclp~~~~---~~~~~~~~~~GwG~~~~~-~~~~~~l~~  295 (406)
                      ....+.+++. |+|+..... ||||||+|.+++.|+++|+|||||....   ...+..+.++|||..... ...+..|++
T Consensus        85 ~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~  164 (256)
T KOG3627|consen   85 LVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQE  164 (256)
T ss_pred             hhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEE
Confidence            3344667777 999998877 9999999999999999999999996543   225688999999987764 346788999


Q ss_pred             eeeccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC-CCCCCCCeEE
Q psy2950         296 AEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CARPDFYGVY  374 (406)
Q Consensus       296 ~~v~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~vf  374 (406)
                      +++.+++.++|+..+....   .+.+.+|||+......++|.|||||||++.. .++++|+||+|||.. |+....|++|
T Consensus       165 ~~v~i~~~~~C~~~~~~~~---~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~-~~~~~~~GivS~G~~~C~~~~~P~vy  240 (256)
T KOG3627|consen  165 VDVPIISNSECRRAYGGLG---TITDTMLCAGGPEGGKDACQGDSGGPLVCED-NGRWVLVGIVSWGSGGCGQPNYPGVY  240 (256)
T ss_pred             EEEeEcChhHhcccccCcc---ccCCCEEeeCccCCCCccccCCCCCeEEEee-CCcEEEEEEEEecCCCCCCCCCCeEE
Confidence            9999999999999887432   3567789999756678899999999999987 448999999999988 9987799999


Q ss_pred             EeCCccHHHHHHhHc
Q psy2950         375 TLVSCYSDWVKSILY  389 (406)
Q Consensus       375 t~v~~~~~WI~~~~~  389 (406)
                      |||+.|.+||++.+.
T Consensus       241 t~V~~y~~WI~~~~~  255 (256)
T KOG3627|consen  241 TRVSSYLDWIKENIG  255 (256)
T ss_pred             eEhHHhHHHHHHHhc
Confidence            999999999999875


No 6  
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.95  E-value=1.6e-27  Score=215.90  Aligned_cols=197  Identities=41%  Similarity=0.794  Sum_probs=155.7

Q ss_pred             CCcEEEEEEEEecCCCCCCCCC-eeEEEeeechhhHHHhhhcccceeeeeeeEee-ccCCCCCCcCceEEEEeCCCcccC
Q psy2950         178 DGRYYLCGITSWGVGCARPDFY-GVYTLVSCYSDWVKSILYARHEQRRRVERIYT-DFYDKSIYKNDIALLELTRPFKFN  255 (406)
Q Consensus       178 ~~~~~l~Gi~s~~~~C~~~~~p-~~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~~  255 (406)
                      ..+|+|..     ++|.....+ ....++...   ..... ....+.+.|++++. |+|+.....+|||||||++++.++
T Consensus        33 s~~~VLTa-----AhC~~~~~~~~~~v~~g~~---~~~~~-~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~  103 (232)
T cd00190          33 SPRWVLTA-----AHCVYSSAPSNYTVRLGSH---DLSSN-EGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLS  103 (232)
T ss_pred             eCCEEEEC-----HHhcCCCCCccEEEEeCcc---cccCC-CCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCC
Confidence            45678887     899753211 122222211   11110 01237788999999 999998889999999999999999


Q ss_pred             CCcccccCCCCCCcc-cCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCC
Q psy2950         256 EFVSPICLPNPGLTV-TADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLD  334 (406)
Q Consensus       256 ~~v~piclp~~~~~~-~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~  334 (406)
                      ++++|+|||...... .+..+.++|||........+..|++..+.+++.++|...+...   ..+.++++|++......+
T Consensus       104 ~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~---~~~~~~~~C~~~~~~~~~  180 (232)
T cd00190         104 DNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYG---GTITDNMLCAGGLEGGKD  180 (232)
T ss_pred             CcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCc---ccCCCceEeeCCCCCCCc
Confidence            999999999875333 6899999999987665456778999999999999999887631   167899999997554778


Q ss_pred             cccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHHHHh
Q psy2950         335 SCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSI  387 (406)
Q Consensus       335 ~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI~~~  387 (406)
                      .|.|||||||++.. +++++|+||+|++..|.....|.+||||+.|.+||+++
T Consensus       181 ~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~  232 (232)
T cd00190         181 ACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT  232 (232)
T ss_pred             cccCCCCCcEEEEe-CCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence            99999999999987 68999999999998898767899999999999999864


No 7  
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.92  E-value=3.6e-24  Score=193.80  Aligned_cols=158  Identities=47%  Similarity=0.916  Sum_probs=136.4

Q ss_pred             eeeeeeeEee-ccCCCCCCcCceEEEEeCCCcccCCCcccccCCCCCCcc-cCCcEEEEcccccCC-CCCccccceeeee
Q psy2950         222 QRRRVERIYT-DFYDKSIYKNDIALLELTRPFKFNEFVSPICLPNPGLTV-TADVGLISGWGRLSE-GGSLPHILQAAEV  298 (406)
Q Consensus       222 ~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~~~~v~piclp~~~~~~-~~~~~~~~GwG~~~~-~~~~~~~l~~~~v  298 (406)
                      ..+.|.+++. |+|+.....+|||||+|++++.+++.++|+|||...... .+..+.+.|||.... .......++...+
T Consensus        69 ~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~  148 (229)
T smart00020       69 QVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNV  148 (229)
T ss_pred             eEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEE
Confidence            6788999998 999988889999999999999999999999999863333 688999999998764 2344667899999


Q ss_pred             ccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCC
Q psy2950         299 PLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVS  378 (406)
Q Consensus       299 ~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~  378 (406)
                      .+++.+.|...+...   ..+.++++|++........|.||+||||++.. + +|+|+||+|++..|...+.|.+|+||.
T Consensus       149 ~~~~~~~C~~~~~~~---~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~  223 (229)
T smart00020      149 PIVSNATCRRAYSGG---GAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTRVS  223 (229)
T ss_pred             EEeCHHHhhhhhccc---cccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEEec
Confidence            999999999877532   15788899999754467899999999999987 4 999999999999998667899999999


Q ss_pred             ccHHHH
Q psy2950         379 CYSDWV  384 (406)
Q Consensus       379 ~~~~WI  384 (406)
                      +|.+||
T Consensus       224 ~~~~WI  229 (229)
T smart00020      224 SYLDWI  229 (229)
T ss_pred             cccccC
Confidence            999998


No 8  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.90  E-value=6.4e-23  Score=184.18  Aligned_cols=186  Identities=39%  Similarity=0.752  Sum_probs=147.5

Q ss_pred             CCcEEEEEEEEecCCCCCCCCCeeEEEeeechhhHHHhhhcccceeeeeeeEee-ccCCCCCCcCceEEEEeCCCcccCC
Q psy2950         178 DGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSILYARHEQRRRVERIYT-DFYDKSIYKNDIALLELTRPFKFNE  256 (406)
Q Consensus       178 ~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~~~  256 (406)
                      ..+|+|..     ++|... ...+-..+..  .++......  .+.+.|++++. |.|+.....+|||||||++++.+.+
T Consensus        33 ~~~~vLTa-----ahC~~~-~~~~~v~~g~--~~~~~~~~~--~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~  102 (220)
T PF00089_consen   33 SPRWVLTA-----AHCVDG-ASDIKVRLGT--YSIRNSDGS--EQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD  102 (220)
T ss_dssp             ETTEEEEE-----GGGHTS-GGSEEEEESE--SBTTSTTTT--SEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred             cccccccc-----cccccc-cccccccccc--ccccccccc--ccccccccccccccccccccccccccccccccccccc
Confidence            44577877     899764 1112122222  233222221  37899999988 9999988899999999999999999


Q ss_pred             CcccccCCCCCCcc-cCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCCc
Q psy2950         257 FVSPICLPNPGLTV-TADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDS  335 (406)
Q Consensus       257 ~v~piclp~~~~~~-~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~  335 (406)
                      .++|+||+...... .+..+.+.|||.....+ ....++...+.+++.+.|...+..     .+.+.++|+... ...+.
T Consensus       103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~~~~~~~c~~~~~~-----~~~~~~~c~~~~-~~~~~  175 (220)
T PF00089_consen  103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVPVVSRKTCRSSYND-----NLTPNMICAGSS-GSGDA  175 (220)
T ss_dssp             SBEESBBTSTTHTTTTTSEEEEEESSBSSTTS-BTSBEEEEEEEEEEHHHHHHHTTT-----TSTTTEEEEETT-SSSBG
T ss_pred             cccccccccccccccccccccccccccccccc-cccccccccccccccccccccccc-----cccccccccccc-ccccc
Confidence            99999999844322 68899999999865544 456789999999999999987542     478899999975 56799


Q ss_pred             ccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHH
Q psy2950         336 CQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWV  384 (406)
Q Consensus       336 C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI  384 (406)
                      |.|||||||++...    +|+||+|++..|.....|.+|+||+.|++||
T Consensus       176 ~~g~sG~pl~~~~~----~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  176 CQGDSGGPLICNNN----YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             GTTTTTSEEEETTE----EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             ccccccccccccee----eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            99999999999761    7999999999999877899999999999999


No 9  
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.90  E-value=3.8e-23  Score=186.93  Aligned_cols=230  Identities=23%  Similarity=0.327  Sum_probs=158.3

Q ss_pred             ccccccCCccccceeeEEEEeeee--------ccCCCc---ceEEEEEEe------------------------------
Q psy2950           4 IQDFVEGNPRQLHHQLFIILLRRT--------SEGGSL---PHILQAAEV------------------------------   42 (406)
Q Consensus         4 ~~~i~~G~~~~~~~~P~~v~i~~~--------~c~G~l---~~vltaa~c------------------------------   42 (406)
                      --||+||+.+..++||++|++-.+        .|||+.   ++|||||||                              
T Consensus        30 s~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr  109 (413)
T COG5640          30 SSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVR  109 (413)
T ss_pred             ceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEecccccccccCcceE
Confidence            358999999999999999999532        299998   459999999                              


Q ss_pred             --cCCCCCC-CCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCC---CC
Q psy2950          43 --PLTPKEE-CRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEG---GS  116 (406)
Q Consensus        43 --~~Hp~y~-~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~---~~  116 (406)
                        ..|..|. .++.||||+++|.++....    .+.+-..+..... .--++.|       ......+|+.+...   ..
T Consensus       110 ~i~~~efY~~~n~~ND~Av~~l~~~a~~p----r~ki~~~~~sdt~-l~sv~~~-------s~~~n~t~~~~~~~~v~~~  177 (413)
T COG5640         110 TIYVHEFYSPGNLGNDIAVLELARAASLP----RVKITSFDASDTF-LNSVTTV-------SPMTNGTFGVTTPSDVPRS  177 (413)
T ss_pred             EEeeecccccccccCcceeeccccccccc----hhheeeccCcccc-eeccccc-------ccccceeeeeeeecCCCCC
Confidence              4588886 6799999999999866432    1111111110000 0001111       34555666644322   12


Q ss_pred             cc--cccceeeeeccChhhhhHhhhccC-CCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCC-
Q psy2950         117 LP--HILQAAEVPLTPKEECRRSYAVAG-YSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-  192 (406)
Q Consensus       117 ~~--~~l~~~~~~i~~~~~C~~~~~~~~-~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~-  192 (406)
                      .+  ..|+++.+...+...|.+.++... ......-.-||++.+  ..+.|+||||||++... +...+++||+|||.+ 
T Consensus       178 ~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~--~~daCqGDSGGPi~~~g-~~G~vQ~GVvSwG~~~  254 (413)
T COG5640         178 SPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRP--PKDACQGDSGGPIFHKG-EEGRVQRGVVSWGDGG  254 (413)
T ss_pred             CCccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCC--CcccccCCCCCceEEeC-CCccEEEeEEEecCCC
Confidence            22  478999999999999999886211 111222233999976  47999999999999988 556699999999976 


Q ss_pred             CCCCCCCeeEEEeeechhhHHHhhhcccceeeeeeeEeec-------------cCCCCCCcCceEEEEeC
Q psy2950         193 CARPDFYGVYTLVSCYSDWVKSILYARHEQRRRVERIYTD-------------FYDKSIYKNDIALLELT  249 (406)
Q Consensus       193 C~~~~~p~~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~p-------------~y~~~~~~~DiALlkL~  249 (406)
                      |+....|++||+++.|.+||...+..-....+..+. +.|             .++..++.-|++++-.+
T Consensus       255 Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~~~rp~~-~~~~G~~t~~~p~T~~~~na~t~~g~~~~llva  323 (413)
T COG5640         255 CGGTLIPGVYTNVSNYQDWIAAMTNGLSYLQFRPLG-YRPTGFDTPRDPATNFFFNAQTYEGNTFVLLVA  323 (413)
T ss_pred             CCCCCcceeEEehhHHHHHHHHHhcCCCcccccccc-cccccccccccCCCccccccccccCCeEEEEEe
Confidence            999999999999999999999987654433333333 111             23445566677777765


No 10 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.79  E-value=3.1e-18  Score=155.25  Aligned_cols=205  Identities=30%  Similarity=0.447  Sum_probs=138.3

Q ss_pred             CCcEEEEEEEEecCCCCCCCCCe--eEEEeeechhhHHHhhhcccceeeeeeeEee-ccCCCCCCcCceEEEEeCCCccc
Q psy2950         178 DGRYYLCGITSWGVGCARPDFYG--VYTLVSCYSDWVKSILYARHEQRRRVERIYT-DFYDKSIYKNDIALLELTRPFKF  254 (406)
Q Consensus       178 ~~~~~l~Gi~s~~~~C~~~~~p~--~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~  254 (406)
                      ++||+|.+     ++|...+.|.  -+.++..-+.      +.++.+...+..++. ..|.+.++.||+|+++|.++...
T Consensus        69 ~~RYvLTA-----AHC~~~~s~is~d~~~vv~~l~------d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~  137 (413)
T COG5640          69 GGRYVLTA-----AHCADASSPISSDVNRVVVDLN------DSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASL  137 (413)
T ss_pred             cceEEeee-----hhhccCCCCccccceEEEeccc------ccccccCcceEEEeeecccccccccCcceeecccccccc
Confidence            67899999     8997655432  2333332222      333347778888888 88899999999999999985542


Q ss_pred             CCCcccccCCCCC---Ccc-cCCcEEEEcccccCCCC---Ccc--ccceeeeeccCChhhhhhhhhccCCc-CCCCcceE
Q psy2950         255 NEFVSPICLPNPG---LTV-TADVGLISGWGRLSEGG---SLP--HILQAAEVPLTPKEECRRSYAVAGYS-NYLNQCQV  324 (406)
Q Consensus       255 ~~~v~piclp~~~---~~~-~~~~~~~~GwG~~~~~~---~~~--~~l~~~~v~~~~~~~C~~~~~~~~~~-~~~~~~~~  324 (406)
                      - .++---....+   ... ........+||.+....   ..+  ..|+++.+...+...|..+++..... ....-.-+
T Consensus       138 p-r~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~  216 (413)
T COG5640         138 P-RVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGF  216 (413)
T ss_pred             c-hhheeeccCcccceecccccccccceeeeeeeecCCCCCCCccceeeeeeeeeechHHhhhhccccccCCCCCCccce
Confidence            1 11100000001   000 11222345555543321   112  48999999999999999988521111 11222239


Q ss_pred             EeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC-CCCCCCCeEEEeCCccHHHHHHhHccCCCcccc
Q psy2950         325 CTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CARPDFYGVYTLVSCYSDWVKSILYASVSAKRV  397 (406)
Q Consensus       325 Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~vft~v~~~~~WI~~~~~~~~~~~~~  397 (406)
                      |++.+  .++.|+||||||++.+..+++ +++||+|||.+ |+....|.|||+|+.|.+||...|...+.+.+-
T Consensus       217 cag~~--~~daCqGDSGGPi~~~g~~G~-vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~~~r  287 (413)
T COG5640         217 CAGRP--PKDACQGDSGGPIFHKGEEGR-VQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNGLSYLQFR  287 (413)
T ss_pred             ecCCC--CcccccCCCCCceEEeCCCcc-EEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcCCCccccc
Confidence            99965  589999999999999985555 89999999988 999999999999999999999999887665443


No 11 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=98.99  E-value=5e-09  Score=97.96  Aligned_cols=164  Identities=25%  Similarity=0.459  Sum_probs=109.0

Q ss_pred             cccccCCccccceeeEEEEeeeec-------cCCCcce---EEEEEEecCCC----------------------------
Q psy2950           5 QDFVEGNPRQLHHQLFIILLRRTS-------EGGSLPH---ILQAAEVPLTP----------------------------   46 (406)
Q Consensus         5 ~~i~~G~~~~~~~~P~~v~i~~~~-------c~G~l~~---vltaa~c~~Hp----------------------------   46 (406)
                      .++.+|.++...+.||.|.+....       ++|++|+   |||++||+...                            
T Consensus        40 ~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~  119 (282)
T PF03761_consen   40 SKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEV  119 (282)
T ss_pred             ccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHH
Confidence            346889999999999999997654       5889854   99999992200                            


Q ss_pred             ----------------------------CC------CCCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccc
Q psy2950          47 ----------------------------KE------ECRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISG   92 (406)
Q Consensus        47 ----------------------------~y------~~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~   92 (406)
                                                  ++      ......+++||+|+++  ++....|+|||........       
T Consensus       120 l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~~~-------  190 (282)
T PF03761_consen  120 LSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTNWEK-------  190 (282)
T ss_pred             hccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCcccccc-------
Confidence                                        00      0123467889999998  7788999999987655432       


Q ss_pred             cccccCCCceEEEeeeccCCCCCCcccccceeeeeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCee
Q psy2950          93 WGRLSEGADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPL  172 (406)
Q Consensus        93 wg~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl  172 (406)
                            + +.+.+.|+       .....+....+.+.....|.              ..+|     .....|.+|+||||
T Consensus       191 ------~-~~~~~yg~-------~~~~~~~~~~~~i~~~~~~~--------------~~~~-----~~~~~~~~d~Gg~l  237 (282)
T PF03761_consen  191 ------G-DEVDVYGF-------NSTGKLKHRKLKITNCTKCA--------------YSIC-----TKQYSCKGDRGGPL  237 (282)
T ss_pred             ------C-ceEEEeec-------CCCCeEEEEEEEEEEeeccc--------------eeEe-----cccccCCCCccCeE
Confidence                  3 55555555       11233444444443322211              1122     23477999999999


Q ss_pred             EeeCCCCcEEEEEEEEecC-CCCCCCCCeeEEEeeechhhHH
Q psy2950         173 ACPLPDGRYYLCGITSWGV-GCARPDFYGVYTLVSCYSDWVK  213 (406)
Q Consensus       173 ~~~~~~~~~~l~Gi~s~~~-~C~~~~~p~~~t~v~~~~~WI~  213 (406)
                      +... +|+|+|+||.+.+. .|.. + ...|.++..|.+=|-
T Consensus       238 v~~~-~gr~tlIGv~~~~~~~~~~-~-~~~f~~v~~~~~~IC  276 (282)
T PF03761_consen  238 VKNI-NGRWTLIGVGASGNYECNK-N-NSYFFNVSWYQDEIC  276 (282)
T ss_pred             EEEE-CCCEEEEEEEccCCCcccc-c-ccEEEEHHHhhhhhc
Confidence            9998 89999999998874 3432 1 457888877766443


No 12 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=98.51  E-value=6.7e-07  Score=78.18  Aligned_cols=82  Identities=17%  Similarity=0.157  Sum_probs=61.0

Q ss_pred             cceeeEEEEeeeec---cCCCcc---eEEEEEEecC-------------------------CCC------CCCCCCcceE
Q psy2950          15 LHHQLFIILLRRTS---EGGSLP---HILQAAEVPL-------------------------TPK------EECRRSYAVA   57 (406)
Q Consensus        15 ~~~~P~~v~i~~~~---c~G~l~---~vltaa~c~~-------------------------Hp~------y~~~~~nDIA   57 (406)
                      ...|||.|.|+...   |+|+|+   |+|++..|..                         |..      |..-.+.+++
T Consensus        13 ~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~   92 (267)
T PF09342_consen   13 DYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVL   92 (267)
T ss_pred             cccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeecccccee
Confidence            45699999998655   999994   4999999921                         111      0001357999


Q ss_pred             EEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeecc
Q psy2950          58 GYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGR  110 (406)
Q Consensus        58 LlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~  110 (406)
                      ||.|++|++|+.+|+|..||........             . ..|...|-..
T Consensus        93 LLHL~~~~~fTr~VlP~flp~~~~~~~~-------------~-~~CVAVg~d~  131 (267)
T PF09342_consen   93 LLHLEQPANFTRYVLPTFLPETSNENES-------------D-DECVAVGHDD  131 (267)
T ss_pred             eeeecCcccceeeecccccccccCCCCC-------------C-CceEEEEccc
Confidence            9999999999999999999974444433             4 6888888654


No 13 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=98.44  E-value=1.6e-06  Score=81.13  Aligned_cols=117  Identities=27%  Similarity=0.414  Sum_probs=79.7

Q ss_pred             CCCcCceEEEEeCCCcccCCCcccccCCCCCCcc-cCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCC
Q psy2950         237 SIYKNDIALLELTRPFKFNEFVSPICLPNPGLTV-TADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGY  315 (406)
Q Consensus       237 ~~~~~DiALlkL~~~v~~~~~v~piclp~~~~~~-~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~  315 (406)
                      ....++++||+|+++  ++....|+|||...... .++.+.+.|+       .....+....+.+.....          
T Consensus       157 ~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~~~~~~~~~yg~-------~~~~~~~~~~~~i~~~~~----------  217 (282)
T PF03761_consen  157 FNRPYSPMILELEED--FSKNVSPPCLADSSTNWEKGDEVDVYGF-------NSTGKLKHRKLKITNCTK----------  217 (282)
T ss_pred             cccccceEEEEEccc--ccccCCCEEeCCCccccccCceEEEeec-------CCCCeEEEEEEEEEEeec----------
Confidence            345789999999999  77899999999865544 5666666665       112234444444433322          


Q ss_pred             cCCCCcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC-CCCCCCCeEEEeCCccHHHH
Q psy2950         316 SNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CARPDFYGVYTLVSCYSDWV  384 (406)
Q Consensus       316 ~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~vft~v~~~~~WI  384 (406)
                               |+.........|.+|+||||+... +++++|+||.+.+.. |..  ....|.+|..|.+=|
T Consensus       218 ---------~~~~~~~~~~~~~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~--~~~~f~~v~~~~~~I  275 (282)
T PF03761_consen  218 ---------CAYSICTKQYSCKGDRGGPLVKNI-NGRWTLIGVGASGNYECNK--NNSYFFNVSWYQDEI  275 (282)
T ss_pred             ---------cceeEecccccCCCCccCeEEEEE-CCCEEEEEEEccCCCcccc--cccEEEEHHHhhhhh
Confidence                     111112245779999999999987 899999999987753 432  267888988877643


No 14 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=96.72  E-value=0.044  Score=48.62  Aligned_cols=60  Identities=27%  Similarity=0.400  Sum_probs=43.1

Q ss_pred             hcccceeeeeeeEeeccCCCCCCcCceEEEEeCCCcccCCCcccccCCCCCCcc-cCCcEEEEcccc
Q psy2950         217 YARHEQRRRVERIYTDFYDKSIYKNDIALLELTRPFKFNEFVSPICLPNPGLTV-TADVGLISGWGR  282 (406)
Q Consensus       217 ~~~~~~~~~v~~i~~p~y~~~~~~~DiALlkL~~~v~~~~~v~piclp~~~~~~-~~~~~~~~GwG~  282 (406)
                      ....+|..+|..+..     - ...+++||.|++|+.|+.+|+|+.||....+. ....|...|-..
T Consensus        71 ~Gp~EQI~rVD~~~~-----V-~~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   71 DGPHEQISRVDCFKD-----V-PESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             CCChheEEEeeeeee-----c-cccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence            333345555555543     1 25689999999999999999999999744444 566899888554


No 15 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=96.55  E-value=0.0092  Score=53.74  Aligned_cols=53  Identities=15%  Similarity=0.238  Sum_probs=34.5

Q ss_pred             CcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEee-echhhHHHhh
Q psy2950         160 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVS-CYSDWVKSIL  216 (406)
Q Consensus       160 ~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~-~~~~WI~~~i  216 (406)
                      ..+.+.|+||+|++....    +++||..-+.+-.........+++. ..++||++.+
T Consensus       197 ~~dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         197 DADTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             EecccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence            347889999999997652    8999998876533222223334443 4567777654


No 16 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=94.07  E-value=0.11  Score=51.56  Aligned_cols=63  Identities=17%  Similarity=0.076  Sum_probs=40.4

Q ss_pred             cCCCcce----EEEEEEecCCCC-----------CCC-----CCCcceEEEEeCCccccCCCccccccCCCCCccccccc
Q psy2950          29 EGGSLPH----ILQAAEVPLTPK-----------EEC-----RRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVG   88 (406)
Q Consensus        29 c~G~l~~----vltaa~c~~Hp~-----------y~~-----~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~   88 (406)
                      ++|.++.    |||.+|++-...           |..     ....||||||++.+    ..+.++.|........    
T Consensus        60 GSGfii~~~G~IlTn~Hvv~~~~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~----~~~~~~~l~~~~~~~~----  131 (428)
T TIGR02037        60 GSGVIISADGYILTNNHVVDGADEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK----KNLPVIKLGDSDKLRV----  131 (428)
T ss_pred             eeEEEECCCCEEEEcHHHcCCCCeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC----CCceEEEccCCCCCCC----
Confidence            7777754    999999943111           211     24689999999865    3456777764433222    


Q ss_pred             cccccccccCCCceEEEeeecc
Q psy2950          89 LISGWGRLSEGADVGLISGWGR  110 (406)
Q Consensus        89 ~~~~wg~~~~~~~~~~~~Gwg~  110 (406)
                                + +.+.+.|+..
T Consensus       132 ----------G-~~v~aiG~p~  142 (428)
T TIGR02037       132 ----------G-DWVLAIGNPF  142 (428)
T ss_pred             ----------C-CEEEEEECCC
Confidence                      4 8888888753


No 17 
>PRK10898 serine endoprotease; Provisional
Probab=93.50  E-value=0.97  Score=43.59  Aligned_cols=24  Identities=38%  Similarity=0.555  Sum_probs=18.1

Q ss_pred             cccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950         163 SCQGDSGGPLACPLPDGRYYLCGITSWG  190 (406)
Q Consensus       163 ~C~gdsGgPl~~~~~~~~~~l~Gi~s~~  190 (406)
                      .-.|.|||||+-...    .++||.+..
T Consensus       195 i~~GnSGGPl~n~~G----~vvGI~~~~  218 (353)
T PRK10898        195 INHGNSGGALVNSLG----ELMGINTLS  218 (353)
T ss_pred             cCCCCCcceEECCCC----eEEEEEEEE
Confidence            346889999994432    799998864


No 18 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=92.42  E-value=0.57  Score=45.19  Aligned_cols=25  Identities=32%  Similarity=0.375  Sum_probs=18.3

Q ss_pred             ccccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950         162 DSCQGDSGGPLACPLPDGRYYLCGITSWG  190 (406)
Q Consensus       162 ~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~  190 (406)
                      ..-.|.|||||+-...    .++||.+..
T Consensus       194 ~i~~GnSGGpl~n~~G----~vIGI~~~~  218 (351)
T TIGR02038       194 AINAGNSGGALINTNG----ELVGINTAS  218 (351)
T ss_pred             ccCCCCCcceEECCCC----eEEEEEeee
Confidence            3446889999995432    799998764


No 19 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=89.64  E-value=1.7  Score=43.21  Aligned_cols=39  Identities=23%  Similarity=0.183  Sum_probs=27.5

Q ss_pred             cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcccc
Q psy2950         240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGR  282 (406)
Q Consensus       240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG~  282 (406)
                      ..|+||||+..+    ....++.|...+....|+.+.+.|+-.
T Consensus       104 ~~DlAllkv~~~----~~~~~~~l~~~~~~~~G~~v~aiG~p~  142 (428)
T TIGR02037       104 RTDIAVLKIDAK----KNLPVIKLGDSDKLRVGDWVLAIGNPF  142 (428)
T ss_pred             CCCEEEEEecCC----CCceEEEccCCCCCCCCCEEEEEECCC
Confidence            579999999864    245566776444333799999988753


No 20 
>PRK10139 serine endoprotease; Provisional
Probab=87.97  E-value=1.4  Score=44.17  Aligned_cols=25  Identities=32%  Similarity=0.291  Sum_probs=18.8

Q ss_pred             ccccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950         162 DSCQGDSGGPLACPLPDGRYYLCGITSWG  190 (406)
Q Consensus       162 ~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~  190 (406)
                      ..-.|.|||||+-...    .++||.+..
T Consensus       208 ~in~GnSGGpl~n~~G----~vIGi~~~~  232 (455)
T PRK10139        208 SINRGNSGGALLNLNG----ELIGINTAI  232 (455)
T ss_pred             ccCCCCCcceEECCCC----eEEEEEEEE
Confidence            3456899999995432    799998864


No 21 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S6 (clan PA(S)). The type sample being the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=87.38  E-value=0.51  Score=49.88  Aligned_cols=53  Identities=25%  Similarity=0.436  Sum_probs=31.3

Q ss_pred             cccCCCCCeeEe-ecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHHHHhHccC
Q psy2950         335 SCQGDSGGPLAC-PLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSILYAS  391 (406)
Q Consensus       335 ~C~gDsGgPl~~-~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI~~~~~~~  391 (406)
                      .=.||||+||+. +..+.+|+|+|+++.+.......  ..++-+  -.+|+++++.+.
T Consensus       213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~--~~~~~~--~~~f~~~~~~~d  266 (769)
T PF02395_consen  213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKG--NWWNVI--PPDFINQIKQND  266 (769)
T ss_dssp             --TT-TT-EEEEEETTTTEEEEEEEEEEECCCCHSE--EEEEEE--CHHHHHHHHHHC
T ss_pred             cccCcCCCceEEEEccCCeEEEEEEEccccccCCcc--ceeEEe--cHHHHHHHHhhh
Confidence            457999999985 55678999999999876543321  333222  345555555444


No 22 
>PRK10898 serine endoprotease; Provisional
Probab=86.32  E-value=8.6  Score=37.10  Aligned_cols=37  Identities=22%  Similarity=0.245  Sum_probs=22.9

Q ss_pred             cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEccc
Q psy2950         240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWG  281 (406)
Q Consensus       240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG  281 (406)
                      .+|+||||+... .    ..++-|........|+.+.+.|+-
T Consensus       124 ~~DlAvl~v~~~-~----l~~~~l~~~~~~~~G~~V~aiG~P  160 (353)
T PRK10898        124 LTDLAVLKINAT-N----LPVIPINPKRVPHIGDVVLAIGNP  160 (353)
T ss_pred             CCCEEEEEEcCC-C----CCeeeccCcCcCCCCCEEEEEeCC
Confidence            689999999753 1    233333322222278888888865


No 23 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=84.37  E-value=1.8  Score=39.23  Aligned_cols=53  Identities=15%  Similarity=0.237  Sum_probs=38.2

Q ss_pred             CCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCc-cHHHHHHhH
Q psy2950         332 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSC-YSDWVKSIL  388 (406)
Q Consensus       332 ~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~-~~~WI~~~~  388 (406)
                      ..++|.|+||+|++....    +++||.+-+..--......-.+|+.+ ..+||++.+
T Consensus       197 ~~dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~  250 (251)
T COG3591         197 DADTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI  250 (251)
T ss_pred             EecccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence            457889999999998652    89999998876322233455666554 788998865


No 24 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S6 (clan PA(S)). The type sample being the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=82.56  E-value=1  Score=47.67  Aligned_cols=32  Identities=34%  Similarity=0.509  Sum_probs=23.5

Q ss_pred             cccCCCCCeeEeeCC-CCcEEEEEEEEecCCCC
Q psy2950         163 SCQGDSGGPLACPLP-DGRYYLCGITSWGVGCA  194 (406)
Q Consensus       163 ~C~gdsGgPl~~~~~-~~~~~l~Gi~s~~~~C~  194 (406)
                      .=.||||+||+..+. ..+|+|+|+++.+....
T Consensus       213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~  245 (769)
T PF02395_consen  213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYN  245 (769)
T ss_dssp             --TT-TT-EEEEEETTTTEEEEEEEEEEECCCC
T ss_pred             cccCcCCCceEEEEccCCeEEEEEEEccccccC
Confidence            457999999998775 67999999999875543


No 25 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=82.49  E-value=6.5  Score=37.88  Aligned_cols=38  Identities=21%  Similarity=0.131  Sum_probs=24.0

Q ss_pred             cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcccc
Q psy2950         240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGR  282 (406)
Q Consensus       240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG~  282 (406)
                      .+|+||||+..+-     ..++-+-.......|+.+.+.|+..
T Consensus       124 ~~DlAvlkv~~~~-----~~~~~l~~s~~~~~G~~V~aiG~P~  161 (351)
T TIGR02038       124 LTDLAVLKIEGDN-----LPTIPVNLDRPPHVGDVVLAIGNPY  161 (351)
T ss_pred             CCCEEEEEecCCC-----CceEeccCcCccCCCCEEEEEeCCC
Confidence            6899999997532     2333343222222799999988753


No 26 
>PRK10942 serine endoprotease; Provisional
Probab=80.03  E-value=4.9  Score=40.42  Aligned_cols=24  Identities=33%  Similarity=0.314  Sum_probs=17.7

Q ss_pred             cccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950         163 SCQGDSGGPLACPLPDGRYYLCGITSWG  190 (406)
Q Consensus       163 ~C~gdsGgPl~~~~~~~~~~l~Gi~s~~  190 (406)
                      .-.|.|||||+-...    .++||.+..
T Consensus       230 i~~GnSGGpL~n~~G----eviGI~t~~  253 (473)
T PRK10942        230 INRGNSGGALVNLNG----ELIGINTAI  253 (473)
T ss_pred             cCCCCCcCccCCCCC----eEEEEEEEE
Confidence            346889999995432    799998753


No 27 
>PRK10139 serine endoprotease; Provisional
Probab=79.63  E-value=9.8  Score=38.08  Aligned_cols=122  Identities=16%  Similarity=0.147  Sum_probs=59.2

Q ss_pred             cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCCcCCC
Q psy2950         240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYL  319 (406)
Q Consensus       240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~~~~~  319 (406)
                      .+||||||+..+-    ...++.|-..+....|+.+.+.|.-.. .    ......-.+..+...    ....     .-
T Consensus       137 ~~DlAvlkv~~~~----~l~~~~lg~s~~~~~G~~V~aiG~P~g-~----~~tvt~GivS~~~r~----~~~~-----~~  198 (455)
T PRK10139        137 QSDIALLQIQNPS----KLTQIAIADSDKLRVGDFAVAVGNPFG-L----GQTATSGIISALGRS----GLNL-----EG  198 (455)
T ss_pred             CCCEEEEEecCCC----CCceeEecCccccCCCCEEEEEecCCC-C----CCceEEEEEcccccc----ccCC-----CC
Confidence            6799999997532    344566654333337899888885321 1    111112222221110    0000     00


Q ss_pred             CcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCC-CCCCCCeEEEeCCccHHHHHHhH
Q psy2950         320 NQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGC-ARPDFYGVYTLVSCYSDWVKSIL  388 (406)
Q Consensus       320 ~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c-~~~~~p~vft~v~~~~~WI~~~~  388 (406)
                      ....+=+     ....-.|.|||||+..  +|  .|+||.+....- +.....+...-+..-...+++.+
T Consensus       199 ~~~~iqt-----da~in~GnSGGpl~n~--~G--~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~  259 (455)
T PRK10139        199 LENFIQT-----DASINRGNSGGALLNL--NG--ELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLI  259 (455)
T ss_pred             cceEEEE-----CCccCCCCCcceEECC--CC--eEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHh
Confidence            0112222     1233468899999963  22  699999864321 11112345555544444455444


No 28 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=69.44  E-value=6.4  Score=34.67  Aligned_cols=28  Identities=25%  Similarity=0.136  Sum_probs=23.9

Q ss_pred             CCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC
Q psy2950         332 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG  364 (406)
Q Consensus       332 ~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~  364 (406)
                      .-+.-+|-||+|++.++     +|+|-+++...
T Consensus       174 TGGIvqGMSGSPI~qdG-----KLiGAVthvf~  201 (218)
T PF05580_consen  174 TGGIVQGMSGSPIIQDG-----KLIGAVTHVFV  201 (218)
T ss_pred             hCCEEecccCCCEEECC-----EEEEEEEEEEe
Confidence            34678999999999988     89999998754


No 29 
>PRK10942 serine endoprotease; Provisional
Probab=60.76  E-value=39  Score=34.04  Aligned_cols=37  Identities=24%  Similarity=0.397  Sum_probs=23.6

Q ss_pred             cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcc
Q psy2950         240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGW  280 (406)
Q Consensus       240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~Gw  280 (406)
                      ..|+||||+..+-.    ..++-|-..+....|+.+.+.|.
T Consensus       158 ~~DlAvlki~~~~~----l~~~~lg~s~~l~~G~~V~aiG~  194 (473)
T PRK10942        158 RSDIALIQLQNPKN----LTAIKMADSDALRVGDYTVAIGN  194 (473)
T ss_pred             CCCEEEEEecCCCC----CceeEecCccccCCCCEEEEEcC
Confidence            68999999964322    34555543333237888888774


No 30 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=59.32  E-value=10  Score=30.34  Aligned_cols=35  Identities=26%  Similarity=0.326  Sum_probs=26.3

Q ss_pred             cCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccH
Q psy2950         337 QGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYS  381 (406)
Q Consensus       337 ~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~  381 (406)
                      .||-||+|.|+-     =++||++-|-     ..-..|++|+.+.
T Consensus        89 PGdCGg~L~C~H-----GViGi~Tagg-----~g~VaF~dir~~~  123 (127)
T PF00947_consen   89 PGDCGGILRCKH-----GVIGIVTAGG-----EGHVAFADIRDLL  123 (127)
T ss_dssp             TT-TCSEEEETT-----CEEEEEEEEE-----TTEEEEEECCCGS
T ss_pred             CCCCCceeEeCC-----CeEEEEEeCC-----CceEEEEechhhh
Confidence            478899999986     4999998762     2257799998764


No 31 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=51.95  E-value=18  Score=29.00  Aligned_cols=34  Identities=26%  Similarity=0.356  Sum_probs=25.5

Q ss_pred             cCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEeeec
Q psy2950         165 QGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCY  208 (406)
Q Consensus       165 ~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~~~  208 (406)
                      .||-||+|.|+-     =++||++.|.     .....|++++.+
T Consensus        89 PGdCGg~L~C~H-----GViGi~Tagg-----~g~VaF~dir~~  122 (127)
T PF00947_consen   89 PGDCGGILRCKH-----GVIGIVTAGG-----EGHVAFADIRDL  122 (127)
T ss_dssp             TT-TCSEEEETT-----CEEEEEEEEE-----TTEEEEEECCCG
T ss_pred             CCCCCceeEeCC-----CeEEEEEeCC-----CceEEEEechhh
Confidence            489999999987     4999998872     223578888765


No 32 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=34.05  E-value=37  Score=33.24  Aligned_cols=46  Identities=24%  Similarity=0.274  Sum_probs=33.2

Q ss_pred             CCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHHHHhH
Q psy2950         332 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSIL  388 (406)
Q Consensus       332 ~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI~~~~  388 (406)
                      ..+.-+|-||+|++.++     .|+|-++.-.--+....+++      |.+|+.+..
T Consensus       354 tgGivqGMSGSPi~q~g-----kliGAvtHVfvndpt~GYGi------~ie~Ml~~~  399 (402)
T TIGR02860       354 TGGIVQGMSGSPIIQNG-----KVIGAVTHVFVNDPTSGYGV------YIEWMLKEA  399 (402)
T ss_pred             hCCEEecccCCCEEECC-----EEEEEEEEEEecCCCcceee------hHHHHHHHh
Confidence            34778999999999988     89999987654333233444      578887753


No 33 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=31.98  E-value=51  Score=30.15  Aligned_cols=22  Identities=36%  Similarity=0.588  Sum_probs=15.8

Q ss_pred             cCCCCCeeEeecCCCcEEEEEEEEeC
Q psy2950         337 QGDSGGPLACPLPDGRYYLCGITSWG  362 (406)
Q Consensus       337 ~gDsGgPl~~~~~~~~~~l~Gi~S~~  362 (406)
                      .||||+|++.+.  +  .|+||-+-+
T Consensus       207 ~GDSGSPVVt~d--g--~liGVHTGS  228 (297)
T PF05579_consen  207 PGDSGSPVVTED--G--DLIGVHTGS  228 (297)
T ss_dssp             GGCTT-EEEETT--C---EEEEEEEE
T ss_pred             CCCCCCccCcCC--C--CEEEEEecC
Confidence            489999999864  3  599998654


No 34 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=29.60  E-value=48  Score=25.46  Aligned_cols=18  Identities=50%  Similarity=0.898  Sum_probs=13.4

Q ss_pred             cCCCCCeeEeecCCCcEEEEEE
Q psy2950         337 QGDSGGPLACPLPDGRYYLCGI  358 (406)
Q Consensus       337 ~gDsGgPl~~~~~~~~~~l~Gi  358 (406)
                      .|.|||||+-.  ++  .++||
T Consensus       103 ~G~SGgpv~~~--~G--~vvGi  120 (120)
T PF13365_consen  103 PGSSGGPVFDS--DG--RVVGI  120 (120)
T ss_dssp             TTTTTSEEEET--TS--EEEEE
T ss_pred             CCcEeHhEECC--CC--EEEeC
Confidence            58899999762  44  58886


No 35 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=24.15  E-value=95  Score=26.46  Aligned_cols=26  Identities=31%  Similarity=0.482  Sum_probs=19.3

Q ss_pred             ccCCCCCeeEeecCCCcEEEEEEEEeC
Q psy2950         336 CQGDSGGPLACPLPDGRYYLCGITSWG  362 (406)
Q Consensus       336 C~gDsGgPl~~~~~~~~~~l~Gi~S~~  362 (406)
                      -.|+=||||+.+. .+...++||-.-|
T Consensus       145 ~~G~CG~~l~~~~-~~~~~i~GiHvaG  170 (172)
T PF00548_consen  145 KPGMCGSPLVSRI-GGQGKIIGIHVAG  170 (172)
T ss_dssp             ETTGTTEEEEESC-GGTTEEEEEEEEE
T ss_pred             CCCccCCeEEEee-ccCccEEEEEecc
Confidence            3456699999965 4566899997654


No 36 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=23.73  E-value=79  Score=25.71  Aligned_cols=25  Identities=36%  Similarity=0.768  Sum_probs=17.3

Q ss_pred             cCCCCCeeEeeCCCCcEEEEEEEEecCCCC
Q psy2950         165 QGDSGGPLACPLPDGRYYLCGITSWGVGCA  194 (406)
Q Consensus       165 ~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~  194 (406)
                      .|.||+|++|...    ..+||.... -|.
T Consensus       107 kGSSGgPiLC~~G----H~vG~f~aa-~~t  131 (148)
T PF02907_consen  107 KGSSGGPILCPSG----HAVGMFRAA-VCT  131 (148)
T ss_dssp             TT-TT-EEEETTS----EEEEEEEEE-EEE
T ss_pred             ecCCCCcccCCCC----CEEEEEEEE-EEc
Confidence            7899999999874    788877653 443


No 37 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=21.32  E-value=41  Score=27.26  Aligned_cols=24  Identities=33%  Similarity=0.494  Sum_probs=16.9

Q ss_pred             cCCCCCeeEeecCCCcEEEEEEEEeCCC
Q psy2950         337 QGDSGGPLACPLPDGRYYLCGITSWGVG  364 (406)
Q Consensus       337 ~gDsGgPl~~~~~~~~~~l~Gi~S~~~~  364 (406)
                      .||||.|++-+  .|  .+|||+--|..
T Consensus       105 ~GDSGRpi~DN--sG--rVVaIVLGG~n  128 (158)
T PF00944_consen  105 PGDSGRPIFDN--SG--RVVAIVLGGAN  128 (158)
T ss_dssp             TTSTTEEEEST--TS--BEEEEEEEEEE
T ss_pred             CCCCCCccCcC--CC--CEEEEEecCCC
Confidence            58999999853  33  47888865543


Done!