Query         psy16044
Match_columns 436
No_of_seqs    347 out of 2625
Neff          9.8 
Searched_HMMs 46136
Date          Fri Aug 16 19:26:10 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy16044.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/16044hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00190 Tryp_SPc Trypsin-like  100.0 5.6E-40 1.2E-44  297.7  22.1  226   27-266     1-231 (232)
  2 KOG3627|consensus              100.0 1.6E-37 3.4E-42  286.1  21.4  235   23-268     9-254 (256)
  3 smart00020 Tryp_SPc Trypsin-li 100.0   6E-37 1.3E-41  277.5  23.3  223   26-264     1-229 (229)
  4 PF00089 Trypsin:  Trypsin;  In 100.0 1.9E-34   4E-39  259.4  18.8  215   27-264     1-220 (220)
  5 COG5640 Secreted trypsin-like  100.0   2E-28 4.3E-33  218.6  13.0  238   23-270    29-280 (413)
  6 KOG3627|consensus               99.9 3.4E-26 7.4E-31  210.6  15.5  158  271-430    94-255 (256)
  7 cd00190 Tryp_SPc Trypsin-like   99.9 2.2E-24 4.7E-29  195.4  13.6  156  270-428    77-232 (232)
  8 smart00020 Tryp_SPc Trypsin-li  99.9 9.1E-22   2E-26  178.0  13.2  151  271-425    78-229 (229)
  9 PF00089 Trypsin:  Trypsin;  In  99.8 1.8E-20 3.8E-25  168.3  11.4  145  271-425    76-220 (220)
 10 PF03761 DUF316:  Domain of unk  99.7 2.4E-17 5.3E-22  153.6  14.6  227    2-260    19-271 (282)
 11 COG5640 Secreted trypsin-like   99.6 8.8E-15 1.9E-19  131.7  13.7  160  270-432   112-281 (413)
 12 PF09342 DUF1986:  Domain of un  99.5 1.3E-12 2.8E-17  112.1  14.7  117   35-167    13-131 (267)
 13 COG3591 V8-like Glu-specific e  98.7 2.5E-07 5.4E-12   82.0  11.5  177   33-246    44-225 (251)
 14 PF03761 DUF316:  Domain of unk  98.3   4E-06 8.6E-11   78.2   9.4  121  280-430   159-280 (282)
 15 TIGR02037 degP_htrA_DO peripla  98.2 2.2E-05 4.8E-10   77.6  13.0   85   55-167    57-142 (428)
 16 TIGR02038 protease_degS peripl  98.1 0.00012 2.7E-09   70.1  14.9  160   36-244    53-218 (351)
 17 PRK10898 serine endoprotease;   97.9 0.00035 7.6E-09   67.0  15.5  103   36-167    53-161 (353)
 18 PRK10139 serine endoprotease;   97.9 0.00014 3.1E-09   72.0  12.4  142   55-244    89-232 (455)
 19 PRK10942 serine endoprotease;   97.8  0.0003 6.4E-09   70.1  13.0   84   55-166   110-195 (473)
 20 PF13365 Trypsin_2:  Trypsin-li  97.3 0.00042 9.1E-09   55.2   5.1   21   58-78      1-22  (120)
 21 PF02395 Peptidase_S6:  Immunog  96.2   0.031 6.8E-07   58.5   9.9   66   59-149    68-133 (769)
 22 PF09342 DUF1986:  Domain of un  93.7    0.17 3.8E-06   44.5   5.9   45  280-326    87-131 (267)
 23 PF02395 Peptidase_S6:  Immunog  90.2    0.53 1.1E-05   49.6   5.7   54  375-432   212-266 (769)
 24 TIGR02037 degP_htrA_DO peripla  85.1     5.3 0.00012   39.6   9.2   40  280-326   103-142 (428)
 25 COG0265 DegQ Trypsin-like seri  82.1      27 0.00058   33.5  12.4  144   56-246    72-216 (347)
 26 COG3591 V8-like Glu-specific e  81.2     2.6 5.7E-05   37.9   4.6   52  375-430   199-251 (251)
 27 PRK10898 serine endoprotease;   81.0      10 0.00022   36.5   9.0   40  279-326   122-161 (353)
 28 PF00863 Peptidase_C4:  Peptida  78.4      11 0.00024   33.7   7.6  137   61-246    36-174 (235)
 29 TIGR02038 protease_degS peripl  78.3     7.9 0.00017   37.2   7.3   40  279-326   122-161 (351)
 30 PF00947 Pico_P2A:  Picornaviru  76.1     2.6 5.6E-05   33.3   2.7   34  378-421    89-122 (127)
 31 PRK10139 serine endoprotease;   75.1      17 0.00036   36.4   8.8   39  280-325   136-174 (455)
 32 PF05416 Peptidase_C37:  Southa  69.1      15 0.00033   35.3   6.4   41  204-245   485-527 (535)
 33 PF00548 Peptidase_C3:  3C cyst  60.0     5.2 0.00011   34.0   1.5   29  216-245   143-171 (172)
 34 PF00947 Pico_P2A:  Picornaviru  57.2       8 0.00017   30.7   2.0   23  219-246    89-111 (127)
 35 PRK10942 serine endoprotease;   56.5      58  0.0013   32.8   8.5   39  280-325   157-195 (473)
 36 PF10459 Peptidase_S46:  Peptid  56.4     6.8 0.00015   41.1   1.9   22   57-78     48-70  (698)
 37 PF05580 Peptidase_S55:  SpoIVB  45.1      20 0.00043   31.4   2.7   25  375-404   176-200 (218)
 38 PF05579 Peptidase_S32:  Equine  32.6      39 0.00085   30.6   2.6   22  219-244   207-228 (297)
 39 PF05580 Peptidase_S55:  SpoIVB  30.8      37 0.00079   29.8   2.1   26  215-245   175-200 (218)
 40 PF05579 Peptidase_S32:  Equine  30.6      41 0.00089   30.5   2.4   21  379-403   208-228 (297)
 41 PF00944 Peptidase_S3:  Alphavi  29.5      36 0.00079   27.3   1.7   25  217-245   103-127 (158)
 42 PF02907 Peptidase_S29:  Hepati  28.1      38 0.00083   27.2   1.6   22  377-402   106-127 (148)
 43 KOG1421|consensus               23.5 4.5E+02  0.0098   27.7   8.4  135   59-227    87-223 (955)
 44 TIGR02860 spore_IV_B stage IV   20.3      72  0.0016   31.2   2.2   44  375-429   356-399 (402)

No 1  
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00  E-value=5.6e-40  Score=297.75  Aligned_cols=226  Identities=38%  Similarity=0.723  Sum_probs=192.5

Q ss_pred             EeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEE
Q psy16044         27 LINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVR  106 (436)
Q Consensus        27 i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~  106 (436)
                      |+||+++..++|||+|+|+...     ..++|+||||+++||||||||+....     ...+.|++|...........+.
T Consensus         1 i~~G~~~~~~~~Pw~v~i~~~~-----~~~~C~GtlIs~~~VLTaAhC~~~~~-----~~~~~v~~g~~~~~~~~~~~~~   70 (232)
T cd00190           1 IVGGSEAKIGSFPWQVSLQYTG-----GRHFCGGSLISPRWVLTAAHCVYSSA-----PSNYTVRLGSHDLSSNEGGGQV   70 (232)
T ss_pred             CcCCeECCCCCCCCEEEEEccC-----CcEEEEEEEeeCCEEEECHHhcCCCC-----CccEEEEeCcccccCCCCceEE
Confidence            6899999999999999998763     16899999999999999999997631     4678899998776654445677


Q ss_pred             EeeeEEEeCCCCCC--CCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeee
Q psy16044        107 IPVERIRVHEEFHN--YHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVP  184 (436)
Q Consensus       107 ~~v~~i~~hp~y~~--~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~  184 (436)
                      +.|.++++||+|+.  ..+|||||||++|+. ++.+++|+|||... .....+..+.++|||...........++...+.
T Consensus        71 ~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~-~~~~v~picl~~~~-~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~  148 (232)
T cd00190          71 IKVKKVIVHPNYNPSTYDNDIALLKLKRPVT-LSDNVRPICLPSSG-YNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVP  148 (232)
T ss_pred             EEEEEEEECCCCCCCCCcCCEEEEEECCccc-CCCcccceECCCcc-ccCCCCCEEEEEeCCcCCCCCCCCceeeEEEee
Confidence            89999999999984  689999999999999 67789999999863 345577899999999987655566789999999


Q ss_pred             ccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCC---CCccccccccee
Q psy16044        185 LHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCG---VGIRYSHRQPRL  261 (436)
Q Consensus       185 ~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~---~~~~y~~v~~~~  261 (436)
                      +++...|...+.......+.++|+.......+.|.|||||||++.. +++|+|+||+|+|.. |+   .+.+|+++..+.
T Consensus       149 ~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~  226 (232)
T cd00190         149 IVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYL  226 (232)
T ss_pred             eECHHHhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEe-CCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhh
Confidence            9999999988865345667999998876568899999999999985 589999999999987 77   457899999999


Q ss_pred             cCCcc
Q psy16044        262 INGKE  266 (436)
Q Consensus       262 ~~~~~  266 (436)
                      .||.+
T Consensus       227 ~WI~~  231 (232)
T cd00190         227 DWIQK  231 (232)
T ss_pred             HHhhc
Confidence            99975


No 2  
>KOG3627|consensus
Probab=100.00  E-value=1.6e-37  Score=286.11  Aligned_cols=235  Identities=38%  Similarity=0.688  Sum_probs=188.5

Q ss_pred             CCCeEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCcc-C
Q psy16044         23 RQPRLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEE-E  101 (436)
Q Consensus        23 ~~~~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~-~  101 (436)
                      ...||+||.++.+++|||+|+|+....    ..++|+|+||+++||||||||+....    .. .+.|++|.+..... .
T Consensus         9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~----~~~~Cggsli~~~~vltaaHC~~~~~----~~-~~~V~~G~~~~~~~~~   79 (256)
T KOG3627|consen    9 PEGRIVGGTEAEPGSFPWQVSLQYGGN----GRHLCGGSLISPRWVLTAAHCVKGAS----AS-LYTVRLGEHDINLSVS   79 (256)
T ss_pred             ccCCEeCCccCCCCCCCCEEEEEECCC----cceeeeeEEeeCCEEEEChhhCCCCC----Cc-ceEEEECccccccccc
Confidence            357999999999999999999987642    25799999999999999999997742    12 77888897654433 1


Q ss_pred             Cc--eEEEeeeEEEeCCCCCC--CC-CceeEEEeCCCCCCCCCceeeeeecCCCCC-CCCCCCcEEEEecCccCCC-CCc
Q psy16044        102 KS--EVRIPVERIRVHEEFHN--YH-HDIALLKLSRPTSARDKGVRAVCLTDADKR-PVNPKQQCVATGWGRVKPK-GDL  174 (436)
Q Consensus       102 ~~--~~~~~v~~i~~hp~y~~--~~-~DiAll~L~~~~~~~~~~v~pi~l~~~~~~-~~~~~~~~~~~GwG~~~~~-~~~  174 (436)
                      ..  .....+.++++||.|+.  .. ||||||+|++++. ++..|+|||||..... ....+..|.++|||.+... ...
T Consensus        80 ~~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~-~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~  158 (256)
T KOG3627|consen   80 EGEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVT-FSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPL  158 (256)
T ss_pred             cCchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcc-cCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCC
Confidence            11  24445778889999984  34 9999999999999 7889999999864332 3445589999999988754 345


Q ss_pred             cccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCC---C
Q psy16044        175 VSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGV---G  251 (436)
Q Consensus       175 ~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~---~  251 (436)
                      +..|++..+.+++...|...+.......+.++||+......++|+|||||||++.... +|+++||+|||.+.|+.   +
T Consensus       159 ~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~-~~~~~GivS~G~~~C~~~~~P  237 (256)
T KOG3627|consen  159 PDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNG-RWVLVGIVSWGSGGCGQPNYP  237 (256)
T ss_pred             CceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCC-cEEEEEEEEecCCCCCCCCCC
Confidence            6789999999999999999887542344568999986666889999999999998544 89999999999986655   6


Q ss_pred             cccccccceecCCcccc
Q psy16044        252 IRYSHRQPRLINGKESI  268 (436)
Q Consensus       252 ~~y~~v~~~~~~~~~~~  268 (436)
                      .+|++|..|..||.+.+
T Consensus       238 ~vyt~V~~y~~WI~~~~  254 (256)
T KOG3627|consen  238 GVYTRVSSYLDWIKENI  254 (256)
T ss_pred             eEEeEhHHhHHHHHHHh
Confidence            89999999999998654


No 3  
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00  E-value=6e-37  Score=277.50  Aligned_cols=223  Identities=38%  Similarity=0.723  Sum_probs=185.8

Q ss_pred             eEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceE
Q psy16044         26 RLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEV  105 (436)
Q Consensus        26 ~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~  105 (436)
                      ||+||+++..++|||+|.++...     ..+.|+||||++++|||||||+....     ...+.|++|........ ...
T Consensus         1 ~~~~G~~~~~~~~Pw~~~i~~~~-----~~~~C~GtlIs~~~VLTaahC~~~~~-----~~~~~v~~g~~~~~~~~-~~~   69 (229)
T smart00020        1 RIVGGSEANIGSFPWQVSLQYRG-----GRHFCGGSLISPRWVLTAAHCVYGSD-----PSNIRVRLGSHDLSSGE-EGQ   69 (229)
T ss_pred             CccCCCcCCCCCCCcEEEEEEcC-----CCcEEEEEEecCCEEEECHHHcCCCC-----CcceEEEeCcccCCCCC-Cce
Confidence            58999999999999999998653     26889999999999999999997642     45789999987655432 226


Q ss_pred             EEeeeEEEeCCCCC--CCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCC-CCCccccceeee
Q psy16044        106 RIPVERIRVHEEFH--NYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKP-KGDLVSKLRQIR  182 (436)
Q Consensus       106 ~~~v~~i~~hp~y~--~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~-~~~~~~~l~~~~  182 (436)
                      .+.|..++.||+|+  ...+|||||+|++|+. ++..++|+||+.. ......+..+.++|||.... .......++...
T Consensus        70 ~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~-~~~~~~pi~l~~~-~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~  147 (229)
T smart00020       70 VIKVSKVIIHPNYNPSTYDNDIALLKLKSPVT-LSDNVRPICLPSS-NYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVN  147 (229)
T ss_pred             EEeeEEEEECCCCCCCCCcCCEEEEEECcccC-CCCceeeccCCCc-ccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEE
Confidence            68899999999997  4789999999999998 6678999999985 22344678999999998763 234456788999


Q ss_pred             eeccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCC---CCcccccccc
Q psy16044        183 VPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCG---VGIRYSHRQP  259 (436)
Q Consensus       183 ~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~---~~~~y~~v~~  259 (436)
                      +.+++.+.|...+.......+.++|++......+.|.|||||||++.. + +|+|+||+|+|. .|+   .+.+|+++.+
T Consensus       148 ~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~-~C~~~~~~~~~~~i~~  224 (229)
T smart00020      148 VPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGS-GCARPGKPGVYTRVSS  224 (229)
T ss_pred             EEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECC-CCCCCCCCCEEEEecc
Confidence            999999999988765445677999998876558899999999999975 3 999999999999 587   5678999999


Q ss_pred             eecCC
Q psy16044        260 RLING  264 (436)
Q Consensus       260 ~~~~~  264 (436)
                      +..||
T Consensus       225 ~~~WI  229 (229)
T smart00020      225 YLDWI  229 (229)
T ss_pred             ccccC
Confidence            99997


No 4  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00  E-value=1.9e-34  Score=259.39  Aligned_cols=215  Identities=36%  Similarity=0.728  Sum_probs=178.3

Q ss_pred             EeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEE
Q psy16044         27 LINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVR  106 (436)
Q Consensus        27 i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~  106 (436)
                      |+||.++.+++|||+|.|+....     .++|+|+||+++||||||||+..       ...+.+.+|...........+.
T Consensus         1 i~~g~~~~~~~~p~~v~i~~~~~-----~~~C~G~li~~~~vLTaahC~~~-------~~~~~v~~g~~~~~~~~~~~~~   68 (220)
T PF00089_consen    1 IVGGDPASPGEFPWVVSIRYSNG-----RFFCTGTLISPRWVLTAAHCVDG-------ASDIKVRLGTYSIRNSDGSEQT   68 (220)
T ss_dssp             SBSSEECGTTSSTTEEEEEETTT-----EEEEEEEEEETTEEEEEGGGHTS-------GGSEEEEESESBTTSTTTTSEE
T ss_pred             CCCCEECCCCCCCeEEEEeeCCC-----CeeEeEEeccccccccccccccc-------cccccccccccccccccccccc
Confidence            68999999999999999988642     68899999999999999999976       3567888998444444444588


Q ss_pred             EeeeEEEeCCCCCC--CCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeee
Q psy16044        107 IPVERIRVHEEFHN--YHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVP  184 (436)
Q Consensus       107 ~~v~~i~~hp~y~~--~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~  184 (436)
                      +.+++++.||+|+.  ..+|||||+|++++. +...++|+|++... .....+..+.++|||.....+ ....++...+.
T Consensus        69 ~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~-~~~~~~~~~l~~~~-~~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~  145 (220)
T PF00089_consen   69 IKVSKIIIHPKYDPSTYDNDIALLKLDRPIT-FGDNIQPICLPSAG-SDPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVP  145 (220)
T ss_dssp             EEEEEEEEETTSBTTTTTTSEEEEEESSSSE-HBSSBEESBBTSTT-HTTTTTSEEEEEESSBSSTTS-BTSBEEEEEEE
T ss_pred             ccccccccccccccccccccccccccccccc-cccccccccccccc-ccccccccccccccccccccc-ccccccccccc
Confidence            89999999999984  479999999999988 66789999999842 223678899999999976554 55678999999


Q ss_pred             ccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCCC---ccccccccee
Q psy16044        185 LHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGVG---IRYSHRQPRL  261 (436)
Q Consensus       185 ~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~~---~~y~~v~~~~  261 (436)
                      +++.+.|...+...  ..+.++|++.. ...+.|.|||||||++...    +|+||+|++.. |+..   .+|+++..+.
T Consensus       146 ~~~~~~c~~~~~~~--~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~~----~lvGI~s~~~~-c~~~~~~~v~~~v~~~~  217 (220)
T PF00089_consen  146 VVSRKTCRSSYNDN--LTPNMICAGSS-GSGDACQGDSGGPLICNNN----YLVGIVSFGEN-CGSPNYPGVYTRVSSYL  217 (220)
T ss_dssp             EEEHHHHHHHTTTT--STTTEEEEETT-SSSBGGTTTTTSEEEETTE----EEEEEEEEESS-SSBTTSEEEEEEGGGGH
T ss_pred             cccccccccccccc--ccccccccccc-cccccccccccccccccee----eecceeeecCC-CCCCCcCEEEEEHHHhh
Confidence            99999999874332  45689999987 5589999999999998643    89999999955 6555   7899999999


Q ss_pred             cCC
Q psy16044        262 ING  264 (436)
Q Consensus       262 ~~~  264 (436)
                      +||
T Consensus       218 ~WI  220 (220)
T PF00089_consen  218 DWI  220 (220)
T ss_dssp             HHH
T ss_pred             ccC
Confidence            986


No 5  
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96  E-value=2e-28  Score=218.63  Aligned_cols=238  Identities=24%  Similarity=0.330  Sum_probs=167.2

Q ss_pred             CCCeEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCC
Q psy16044         23 RQPRLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEK  102 (436)
Q Consensus        23 ~~~~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~  102 (436)
                      ...||+||..|+.++||++|++....... ....+|+|+++..|||||||||+....  ........|..+..+..    
T Consensus        29 vs~rIigGs~Anag~~P~~VaLv~~isd~-~s~tfCGgs~l~~RYvLTAAHC~~~~s--~is~d~~~vv~~l~d~S----  101 (413)
T COG5640          29 VSSRIIGGSNANAGEYPSLVALVDRISDY-VSGTFCGGSKLGGRYVLTAAHCADASS--PISSDVNRVVVDLNDSS----  101 (413)
T ss_pred             cceeEecCcccccccCchHHHHHhhcccc-cceeEeccceecceEEeeehhhccCCC--CccccceEEEecccccc----
Confidence            57899999999999999999998765431 125689999999999999999998753  12233345555444333    


Q ss_pred             ceEEEeeeEEEeCCCCC--CCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCC---Ccc--
Q psy16044        103 SEVRIPVERIRVHEEFH--NYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKG---DLV--  175 (436)
Q Consensus       103 ~~~~~~v~~i~~hp~y~--~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~---~~~--  175 (436)
                      ..+...|..++.|..|.  +..||+|+++|.++...--..+...--+..-...+.........+|+.+....   ..+  
T Consensus       102 q~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~g  181 (413)
T COG5640         102 QAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKG  181 (413)
T ss_pred             cccCcceEEEeeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeecCCCCCCCcc
Confidence            23567899999999997  68899999999997762111111111111001112233444566777654321   112  


Q ss_pred             ccceeeeeeccchhhhhhhcCC----CcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCCC
Q psy16044        176 SKLRQIRVPLHNISVCRDKYGD----SVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGVG  251 (436)
Q Consensus       176 ~~l~~~~~~~~~~~~C~~~~~~----~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~~  251 (436)
                      ..|++..+..++...|...++.    .....-.-+|++...  .++|+||||||++....+ ...++||+|||.+-|+..
T Consensus       182 t~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~--~daCqGDSGGPi~~~g~~-G~vQ~GVvSwG~~~Cg~t  258 (413)
T COG5640         182 TILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPP--KDACQGDSGGPIFHKGEE-GRVQRGVVSWGDGGCGGT  258 (413)
T ss_pred             ceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCCC--cccccCCCCCceEEeCCC-ccEEEeEEEecCCCCCCC
Confidence            3689999999999999988752    111222449998776  799999999999998644 458999999999989765


Q ss_pred             ---cccccccceecCCccccCC
Q psy16044        252 ---IRYSHRQPRLINGKESIRG  270 (436)
Q Consensus       252 ---~~y~~v~~~~~~~~~~~~~  270 (436)
                         .+||+|.-|..||.....+
T Consensus       259 ~~~gVyT~vsny~~WI~a~~~~  280 (413)
T COG5640         259 LIPGVYTNVSNYQDWIAAMTNG  280 (413)
T ss_pred             CcceeEEehhHHHHHHHHHhcC
Confidence               5899999999999875543


No 6  
>KOG3627|consensus
Probab=99.94  E-value=3.4e-26  Score=210.62  Aligned_cols=158  Identities=39%  Similarity=0.769  Sum_probs=134.1

Q ss_pred             CCCcccccccc-cccceeecccccccCCCCeeeeccCCCCCC-CCCCCCeEEEEecCCCCCC-CCccccceEEEEEeeCc
Q psy16044        271 AWPWQNLITSF-LSAALLKLSRPTSARDKGVRAVCLTDADKR-PVNPKQQCVATGWGRVKPK-GDLVSKLRQIRVPLHNI  347 (436)
Q Consensus       271 ~~p~~~~~~~~-~diali~l~~~~~~~~~~v~picl~~~~~~-~~~~~~~~~~~Gwg~~~~~-~~~~~~l~~~~~~~~~~  347 (436)
                      .||.|+..... ||||||+|.+++.++.. |+|||||..... ....+..|.++|||++... ...+..|+++.+++++.
T Consensus        94 ~H~~y~~~~~~~nDiall~l~~~v~~~~~-i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~  172 (256)
T KOG3627|consen   94 VHPNYNPRTLENNDIALLRLSEPVTFSSH-IQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISN  172 (256)
T ss_pred             ECCCCCCCCCCCCCEEEEEECCCcccCCc-ccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcCh
Confidence            68999998877 99999999999999988 999999855442 3455689999999987654 24578899999999999


Q ss_pred             cccccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCC-CCCCCCCeEEEeCcccHHHHH
Q psy16044        348 SVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSG-CAKSGYPDVYTKLSFYLPWIR  426 (436)
Q Consensus       348 ~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~v~t~V~~~~~WI~  426 (436)
                      ++|+..+.....+.+.|||++......++|+|||||||++... ++++|+||+|||.. |+....|++||+|+.|.+||+
T Consensus       173 ~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~-~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~  251 (256)
T KOG3627|consen  173 SECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDN-GRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIK  251 (256)
T ss_pred             hHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeC-CcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHH
Confidence            9999988653235677899987656568999999999999863 48999999999988 988778999999999999999


Q ss_pred             HHHh
Q psy16044        427 KQIN  430 (436)
Q Consensus       427 ~~i~  430 (436)
                      +.+.
T Consensus       252 ~~~~  255 (256)
T KOG3627|consen  252 ENIG  255 (256)
T ss_pred             HHhc
Confidence            9875


No 7  
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.92  E-value=2.2e-24  Score=195.41  Aligned_cols=156  Identities=41%  Similarity=0.758  Sum_probs=134.4

Q ss_pred             CCCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCCCCccccceEEEEEeeCccc
Q psy16044        270 GAWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISV  349 (436)
Q Consensus       270 ~~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~  349 (436)
                      ..||.|......+|||||+|++|+.++.. ++|||||.... ....+..+.+.|||........+..++...+.+++.++
T Consensus        77 ~~hp~y~~~~~~~DiAll~L~~~~~~~~~-v~picl~~~~~-~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~  154 (232)
T cd00190          77 IVHPNYNPSTYDNDIALLKLKRPVTLSDN-VRPICLPSSGY-NLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAE  154 (232)
T ss_pred             EECCCCCCCCCcCCEEEEEECCcccCCCc-ccceECCCccc-cCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHH
Confidence            35899988888999999999999999887 99999998763 45678999999999876554556789999999999999


Q ss_pred             cccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHHHHH
Q psy16044        350 CRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWIRKQ  428 (436)
Q Consensus       350 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI~~~  428 (436)
                      |...+.....+.++++|+.........|.|||||||++.. +++++|+||+|++..|.....|.+|++|.+|++||+++
T Consensus       155 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~  232 (232)
T cd00190         155 CKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT  232 (232)
T ss_pred             hhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEe-CCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence            9988764335789999998765456899999999999985 58999999999999888656799999999999999864


No 8  
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.87  E-value=9.1e-22  Score=177.99  Aligned_cols=151  Identities=42%  Similarity=0.814  Sum_probs=128.1

Q ss_pred             CCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCC-CCCccccceEEEEEeeCccc
Q psy16044        271 AWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKP-KGDLVSKLRQIRVPLHNISV  349 (436)
Q Consensus       271 ~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~-~~~~~~~l~~~~~~~~~~~~  349 (436)
                      .||.|......+|+|||+|++|+.+... ++||||+.... ....+..+.+.|||.... .......++...+.+++.+.
T Consensus        78 ~~p~~~~~~~~~DiAll~L~~~i~~~~~-~~pi~l~~~~~-~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~  155 (229)
T smart00020       78 IHPNYNPSTYDNDIALLKLKSPVTLSDN-VRPICLPSSNY-NVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNAT  155 (229)
T ss_pred             ECCCCCCCCCcCCEEEEEECcccCCCCc-eeeccCCCccc-ccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHH
Confidence            4788888888999999999999998886 99999998733 345689999999998763 23345678999999999999


Q ss_pred             cccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHH
Q psy16044        350 CRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWI  425 (436)
Q Consensus       350 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI  425 (436)
                      |...+.....+.+.++|++........|.||+||||++.. + +|+|+||+|++..|...+.|.+|++|.+|++||
T Consensus       156 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~~~~~WI  229 (229)
T smart00020      156 CRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI  229 (229)
T ss_pred             hhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEEeccccccC
Confidence            9988765335788999998765456899999999999985 4 999999999999998667799999999999998


No 9  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.83  E-value=1.8e-20  Score=168.33  Aligned_cols=145  Identities=41%  Similarity=0.810  Sum_probs=124.6

Q ss_pred             CCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCCCCccccceEEEEEeeCcccc
Q psy16044        271 AWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVC  350 (436)
Q Consensus       271 ~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~C  350 (436)
                      .||.|......+|+|||+|++++.+... ++|+||+.... ....++.+.+.||+...... ....++...+.+++.+.|
T Consensus        76 ~h~~~~~~~~~~DiAll~L~~~~~~~~~-~~~~~l~~~~~-~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~~~~~~~c  152 (220)
T PF00089_consen   76 IHPKYDPSTYDNDIALLKLDRPITFGDN-IQPICLPSAGS-DPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVPVVSRKTC  152 (220)
T ss_dssp             EETTSBTTTTTTSEEEEEESSSSEHBSS-BEESBBTSTTH-TTTTTSEEEEEESSBSSTTS-BTSBEEEEEEEEEEHHHH
T ss_pred             cccccccccccccccccccccccccccc-ccccccccccc-cccccccccccccccccccc-cccccccccccccccccc
Confidence            5788888888999999999999988888 99999998443 34788999999999865443 456899999999999999


Q ss_pred             ccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHH
Q psy16044        351 RDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWI  425 (436)
Q Consensus       351 ~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI  425 (436)
                      ...+..  .+.+.++|+... .....|.|||||||++...    +|+||.|++..|...+.|.+|++|+.|++||
T Consensus       153 ~~~~~~--~~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~~----~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  153 RSSYND--NLTPNMICAGSS-GSGDACQGDSGGPLICNNN----YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             HHHTTT--TSTTTEEEEETT-SSSBGGTTTTTSEEEETTE----EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred             cccccc--cccccccccccc-cccccccccccccccccee----eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence            987544  368899999876 4468999999999998642    8999999999998877799999999999999


No 10 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.74  E-value=2.4e-17  Score=153.62  Aligned_cols=227  Identities=21%  Similarity=0.334  Sum_probs=142.6

Q ss_pred             cccccCceecCCCCCCccCCCCCCeEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCC
Q psy16044          2 INLCDTVTFARDCGVGIRYSHRQPRLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFS   81 (436)
Q Consensus         2 ~~~~~~~~~~~~CG~~~~~~~~~~~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~   81 (436)
                      |+.+||..++..||..  ......++.+|..+...+.||+|.+.......  ....++|||||+||||||+||+......
T Consensus        19 Lt~eEN~~rl~~CG~~--~~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~--~~~~~~gtlIS~RHiLtss~~~~~~~~~   94 (282)
T PF03761_consen   19 LTEEENEERLETCGKK--KLPYPSKVFNGTPAESGEAPWAVSVYTKNHNE--GNYFSTGTLISPRHILTSSHCVMNDKSK   94 (282)
T ss_pred             CCHHHHHHHHHhcCCC--CCCCcccccCCcccccCCCCCEEEEEeccCcc--cceecceEEeccCeEEEeeeEEEecccc
Confidence            5678899999999943  33355667999999999999999998765332  2466799999999999999999753221


Q ss_pred             C---CCC------cc-eEEEeccc-----cC----CccCCceEEEeeeEEEeCCCC------CCCCCceeEEEeCCCCCC
Q psy16044         82 L---PIP------EL-WTAVLGDW-----DR----TEEEKSEVRIPVERIRVHEEF------HNYHHDIALLKLSRPTSA  136 (436)
Q Consensus        82 ~---~~~------~~-~~v~~g~~-----~~----~~~~~~~~~~~v~~i~~hp~y------~~~~~DiAll~L~~~~~~  136 (436)
                      .   ...      .. ..+.+-..     ..    ...........+.++++...-      .....+++||+|+++   
T Consensus        95 W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~---  171 (282)
T PF03761_consen   95 WLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED---  171 (282)
T ss_pred             cccCcccccceeeCCCceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc---
Confidence            1   000      00 00000000     00    001111123445555542211      135679999999998   


Q ss_pred             CCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCC
Q psy16044        137 RDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSG  216 (436)
Q Consensus       137 ~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~  216 (436)
                      ++....|+|||+.. .....+..+.+.|+       .....+....+.+.....|.           ..+|.     ...
T Consensus       172 ~~~~~~~~Cl~~~~-~~~~~~~~~~~yg~-------~~~~~~~~~~~~i~~~~~~~-----------~~~~~-----~~~  227 (282)
T PF03761_consen  172 FSKNVSPPCLADSS-TNWEKGDEVDVYGF-------NSTGKLKHRKLKITNCTKCA-----------YSICT-----KQY  227 (282)
T ss_pred             ccccCCCEEeCCCc-cccccCceEEEeec-------CCCCeEEEEEEEEEEeeccc-----------eeEec-----ccc
Confidence            34568999999863 33556667777777       11233555555554433211           12222     257


Q ss_pred             CccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCC-Ccccccccce
Q psy16044        217 ACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGV-GIRYSHRQPR  260 (436)
Q Consensus       217 ~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~-~~~y~~v~~~  260 (436)
                      .|.||+||||+... +|+|+|+||.+.+...|.. ...|.+|..+
T Consensus       228 ~~~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~  271 (282)
T PF03761_consen  228 SCKGDRGGPLVKNI-NGRWTLIGVGASGNYECNKNNSYFFNVSWY  271 (282)
T ss_pred             cCCCCccCeEEEEE-CCCEEEEEEEccCCCcccccccEEEEHHHh
Confidence            89999999999874 8999999999988866643 3444455433


No 11 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.61  E-value=8.8e-15  Score=131.74  Aligned_cols=160  Identities=24%  Similarity=0.355  Sum_probs=108.8

Q ss_pred             CCCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCC---CCc--cccceEEEEEe
Q psy16044        270 GAWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPK---GDL--VSKLRQIRVPL  344 (436)
Q Consensus       270 ~~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~---~~~--~~~l~~~~~~~  344 (436)
                      ..|-.|-..++.||+|+++|.++...-...+...--+..-.............+|+.+...   ...  ...++++.+..
T Consensus       112 ~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~gt~l~e~~v~f  191 (413)
T COG5640         112 YVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKGTILHEVAVLF  191 (413)
T ss_pred             eeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeecCCCCCCCccceeeeeeeee
Confidence            3456666778899999999999655432112111111100112233444555666643211   111  24799999999


Q ss_pred             eCccccccccCC--Ccc--CCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCC-CCCCCCCeEEEeCc
Q psy16044        345 HNISVCRDKYGD--SVE--LHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSG-CAKSGYPDVYTKLS  419 (436)
Q Consensus       345 ~~~~~C~~~~~~--~~~--~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~v~t~V~  419 (436)
                      .+...|.+..+.  ...  ..-.-+|++...  ++.|+||||||++.+..+| ++|+||+|||.+ |+.+..|.|||+|+
T Consensus       192 v~~stc~~~~g~an~~dg~~~lT~~cag~~~--~daCqGDSGGPi~~~g~~G-~vQ~GVvSwG~~~Cg~t~~~gVyT~vs  268 (413)
T COG5640         192 VPLSTCAQYKGCANASDGATGLTGFCAGRPP--KDACQGDSGGPIFHKGEEG-RVQRGVVSWGDGGCGGTLIPGVYTNVS  268 (413)
T ss_pred             echHHhhhhccccccCCCCCCccceecCCCC--cccccCCCCCceEEeCCCc-cEEEeEEEecCCCCCCCCcceeEEehh
Confidence            999999988741  111  112239998765  6999999999999986444 589999999987 99989999999999


Q ss_pred             ccHHHHHHHHhhh
Q psy16044        420 FYLPWIRKQINIA  432 (436)
Q Consensus       420 ~~~~WI~~~i~~~  432 (436)
                      .|.+||...|+..
T Consensus       269 ny~~WI~a~~~~l  281 (413)
T COG5640         269 NYQDWIAAMTNGL  281 (413)
T ss_pred             HHHHHHHHHhcCC
Confidence            9999999988643


No 12 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.48  E-value=1.3e-12  Score=112.11  Aligned_cols=117  Identities=23%  Similarity=0.477  Sum_probs=90.0

Q ss_pred             CCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEecccc--CCccCCceEEEeeeEE
Q psy16044         35 RGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWD--RTEEEKSEVRIPVERI  112 (436)
Q Consensus        35 ~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~--~~~~~~~~~~~~v~~i  112 (436)
                      .-.|||.|.|+..+      .++|+|+||.+.|||++..|+.+-..   ...-+.|.+|...  +....+.+|.+.|..+
T Consensus        13 ~y~WPWlA~IYvdG------~~~CsgvLlD~~WlLvsssCl~~I~L---~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~   83 (267)
T PF09342_consen   13 DYHWPWLADIYVDG------RYWCSGVLLDPHWLLVSSSCLRGISL---SHHYVSALLGGGKTYLSVDGPHEQISRVDCF   83 (267)
T ss_pred             cccCcceeeEEEcC------eEEEEEEEeccceEEEeccccCCccc---ccceEEEEecCcceecccCCChheEEEeeee
Confidence            34699999999875      78999999999999999999976322   1244678888654  2235566777777776


Q ss_pred             EeCCCCCCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCc
Q psy16044        113 RVHEEFHNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  167 (436)
Q Consensus       113 ~~hp~y~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~  167 (436)
                      ..-|.     .+++||.|++|+. |+.+|+|..||.. .........|.++|-..
T Consensus        84 ~~V~~-----S~v~LLHL~~~~~-fTr~VlP~flp~~-~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   84 KDVPE-----SNVLLLHLEQPAN-FTRYVLPTFLPET-SNENESDDECVAVGHDD  131 (267)
T ss_pred             eeccc-----cceeeeeecCccc-ceeeecccccccc-cCCCCCCCceEEEEccc
Confidence            65555     7999999999999 8999999999974 33344556999999543


No 13 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.67  E-value=2.5e-07  Score=82.03  Aligned_cols=177  Identities=20%  Similarity=0.191  Sum_probs=93.5

Q ss_pred             cCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEe-ccccCCccCCceEEEeeeE
Q psy16044         33 SIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVL-GDWDRTEEEKSEVRIPVER  111 (436)
Q Consensus        33 a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~-g~~~~~~~~~~~~~~~v~~  111 (436)
                      ....+|||-+-.......+   ..-|+++||+++.||||+||+.+....  . ..+.+.. |.......   ...+....
T Consensus        44 ~dt~~~Py~av~~~~~~tG---~~~~~~~lI~pntvLTa~Hc~~s~~~G--~-~~~~~~p~g~~~~~~~---~~~~~~~~  114 (251)
T COG3591          44 TDTTQFPYSAVVQFEAATG---RLCTAATLIGPNTVLTAGHCIYSPDYG--E-DDIAAAPPGVNSDGGP---FYGITKIE  114 (251)
T ss_pred             ccCCCCCcceeEEeecCCC---cceeeEEEEcCceEEEeeeEEecCCCC--h-hhhhhcCCcccCCCCC---CCceeeEE
Confidence            4467899998886654332   445777999999999999999775321  0 1122221 22211111   11222223


Q ss_pred             EEeCCC-C---CCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccc
Q psy16044        112 IRVHEE-F---HNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHN  187 (436)
Q Consensus       112 i~~hp~-y---~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~  187 (436)
                      +...|. +   +....|+..+.|+...+ +...+....++..  .....+....++||-......   ..+.+..-.+..
T Consensus       115 ~~~~~g~~~~~d~~~~~v~~~~~~~g~~-~~~~~~~~~~~~~--~~~~~~d~i~v~GYP~dk~~~---~~~~e~t~~v~~  188 (251)
T COG3591         115 IRVYPGELYKEDGASYDVGEAALESGIN-IGDVVNYLKRNTA--SEAKANDRITVIGYPGDKPNI---GTMWESTGKVNS  188 (251)
T ss_pred             EEecCCceeccCCceeeccHHHhccCCC-ccccccccccccc--cccccCceeEEEeccCCCCcc---eeEeeecceeEE
Confidence            323443 2   24556777777774443 2333343333332  233445558999986544210   001111000000


Q ss_pred             hhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044        188 ISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSG  246 (436)
Q Consensus       188 ~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~  246 (436)
                      .+              ..+    ..-..+++.|+||+|++....    +++||.+-|..
T Consensus       189 ~~--------------~~~----l~y~~dT~pG~SGSpv~~~~~----~vigv~~~g~~  225 (251)
T COG3591         189 IK--------------GNK----LFYDADTLPGSSGSPVLISKD----EVIGVHYNGPG  225 (251)
T ss_pred             Ee--------------cce----EEEEecccCCCCCCceEecCc----eEEEEEecCCC
Confidence            00              000    001268999999999997532    89999988765


No 14 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=98.27  E-value=4e-06  Score=78.21  Aligned_cols=121  Identities=26%  Similarity=0.504  Sum_probs=82.8

Q ss_pred             cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCCCCccccceEEEEEeeCccccccccCCCcc
Q psy16044        280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVE  359 (436)
Q Consensus       280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~  359 (436)
                      ...+.+||+|+++  +... ..|+|||.... ....++...+.|+.       ....+....+++.....|..       
T Consensus       159 ~~~~~mIlEl~~~--~~~~-~~~~Cl~~~~~-~~~~~~~~~~yg~~-------~~~~~~~~~~~i~~~~~~~~-------  220 (282)
T PF03761_consen  159 RPYSPMILELEED--FSKN-VSPPCLADSST-NWEKGDEVDVYGFN-------STGKLKHRKLKITNCTKCAY-------  220 (282)
T ss_pred             cccceEEEEEccc--cccc-CCCEEeCCCcc-ccccCceEEEeecC-------CCCeEEEEEEEEEEeeccce-------
Confidence            3467789999999  5555 89999997765 35566777777771       12335555555554322111       


Q ss_pred             CCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecC-CCCCCCCCeEEEeCcccHHHHHHHHh
Q psy16044        360 LHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGS-GCAKSGYPDVYTKLSFYLPWIRKQIN  430 (436)
Q Consensus       360 ~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~-~c~~~~~p~v~t~V~~~~~WI~~~i~  430 (436)
                          .+|     ..+..|.+|+||||+-.. +++|+|+||.+.+. .|...  ...|.+|..|.+=|.+..+
T Consensus       221 ----~~~-----~~~~~~~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~~--~~~f~~v~~~~~~IC~ltG  280 (282)
T PF03761_consen  221 ----SIC-----TKQYSCKGDRGGPLVKNI-NGRWTLIGVGASGNYECNKN--NSYFFNVSWYQDEICELTG  280 (282)
T ss_pred             ----eEe-----cccccCCCCccCeEEEEE-CCCEEEEEEEccCCCccccc--ccEEEEHHHhhhhhcccee
Confidence                122     224789999999999875 89999999998765 34322  5789999999887766543


No 15 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.19  E-value=2.2e-05  Score=77.59  Aligned_cols=85  Identities=19%  Similarity=0.246  Sum_probs=58.4

Q ss_pred             CeeeEEEEecCC-EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCC
Q psy16044         55 PHWCGAVLIHPS-WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRP  133 (436)
Q Consensus        55 ~~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~  133 (436)
                      ...++|.+|+++ +|||++|++.+.       ..+.|.+...         ..+..+-+..++.     .||||||++.+
T Consensus        57 ~~~GSGfii~~~G~IlTn~Hvv~~~-------~~i~V~~~~~---------~~~~a~vv~~d~~-----~DlAllkv~~~  115 (428)
T TIGR02037        57 RGLGSGVIISADGYILTNNHVVDGA-------DEITVTLSDG---------REFKAKLVGKDPR-----TDIAVLKIDAK  115 (428)
T ss_pred             cceeeEEEECCCCEEEEcHHHcCCC-------CeEEEEeCCC---------CEEEEEEEEecCC-----CCEEEEEecCC
Confidence            467999999986 999999999663       4455555321         2234444444544     69999999864


Q ss_pred             CCCCCCceeeeeecCCCCCCCCCCCcEEEEecCc
Q psy16044        134 TSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  167 (436)
Q Consensus       134 ~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~  167 (436)
                      .     .+.++.|.+.  ..+..++.+.++|+..
T Consensus       116 ~-----~~~~~~l~~~--~~~~~G~~v~aiG~p~  142 (428)
T TIGR02037       116 K-----NLPVIKLGDS--DKLRVGDWVLAIGNPF  142 (428)
T ss_pred             C-----CceEEEccCC--CCCCCCCEEEEEECCC
Confidence            2     2567777653  3456899999999864


No 16 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.06  E-value=0.00012  Score=70.08  Aligned_cols=160  Identities=18%  Similarity=0.187  Sum_probs=89.1

Q ss_pred             CCCcceEEEeeeccCCC-----CCCeeeEEEEecCC-EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEee
Q psy16044         36 GAWPWQVSLQVLHPRLG-----LMPHWCGAVLIHPS-WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPV  109 (436)
Q Consensus        36 ~~~Pw~v~i~~~~~~~~-----~~~~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v  109 (436)
                      .--|-+|.|........     ......+|.+|+++ +|||++|.+...       ..+.|.+..         ...+..
T Consensus        53 ~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~-------~~i~V~~~d---------g~~~~a  116 (351)
T TIGR02038        53 RAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA-------DQIVVALQD---------GRKFEA  116 (351)
T ss_pred             hcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC-------CEEEEEECC---------CCEEEE
Confidence            44699999976432110     11346999999977 999999998652       345555432         122344


Q ss_pred             eEEEeCCCCCCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchh
Q psy16044        110 ERIRVHEEFHNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNIS  189 (436)
Q Consensus       110 ~~i~~hp~y~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~  189 (436)
                      +-+..+|.     .||||||++.+-      +.++.+..  ...+..++.+.++|+......     ......+.-+...
T Consensus       117 ~vv~~d~~-----~DlAvlkv~~~~------~~~~~l~~--s~~~~~G~~V~aiG~P~~~~~-----s~t~GiIs~~~r~  178 (351)
T TIGR02038       117 ELVGSDPL-----TDLAVLKIEGDN------LPTIPVNL--DRPPHVGDVVLAIGNPYNLGQ-----TITQGIISATGRN  178 (351)
T ss_pred             EEEEecCC-----CCEEEEEecCCC------CceEeccC--cCccCCCCEEEEEeCCCCCCC-----cEEEEEEEeccCc
Confidence            44444544     799999998541      23444433  345678999999998642211     1111111111110


Q ss_pred             hhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEc
Q psy16044        190 VCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFG  244 (436)
Q Consensus       190 ~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g  244 (436)
                      .    +...  .....+=+     ......|.|||||+-.  +|  .++||.+..
T Consensus       179 ~----~~~~--~~~~~iqt-----da~i~~GnSGGpl~n~--~G--~vIGI~~~~  218 (351)
T TIGR02038       179 G----LSSV--GRQNFIQT-----DAAINAGNSGGALINT--NG--ELVGINTAS  218 (351)
T ss_pred             c----cCCC--CcceEEEE-----CCccCCCCCcceEECC--CC--eEEEEEeee
Confidence            0    0000  00011111     1456789999999953  34  499998764


No 17 
>PRK10898 serine endoprotease; Provisional
Probab=97.94  E-value=0.00035  Score=66.98  Aligned_cols=103  Identities=17%  Similarity=0.157  Sum_probs=64.6

Q ss_pred             CCCcceEEEeeeccCCC-----CCCeeeEEEEecCC-EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEee
Q psy16044         36 GAWPWQVSLQVLHPRLG-----LMPHWCGAVLIHPS-WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPV  109 (436)
Q Consensus        36 ~~~Pw~v~i~~~~~~~~-----~~~~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v  109 (436)
                      .--|-+|.|........     .....-+|.+|+++ +|||++|=+.+.       ..+.|.+...         ..+..
T Consensus        53 ~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a-------~~i~V~~~dg---------~~~~a  116 (353)
T PRK10898         53 RAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA-------DQIIVALQDG---------RVFEA  116 (353)
T ss_pred             HhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC-------CEEEEEeCCC---------CEEEE
Confidence            34588998876532111     01256999999976 999999988652       4455655321         22334


Q ss_pred             eEEEeCCCCCCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCc
Q psy16044        110 ERIRVHEEFHNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  167 (436)
Q Consensus       110 ~~i~~hp~y~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~  167 (436)
                      +-+...|.     .||||||++.+ .     ..++.+..  ...+..+..+.++|+..
T Consensus       117 ~vv~~d~~-----~DlAvl~v~~~-~-----l~~~~l~~--~~~~~~G~~V~aiG~P~  161 (353)
T PRK10898        117 LLVGSDSL-----TDLAVLKINAT-N-----LPVIPINP--KRVPHIGDVVLAIGNPY  161 (353)
T ss_pred             EEEEEcCC-----CCEEEEEEcCC-C-----CCeeeccC--cCcCCCCCEEEEEeCCC
Confidence            43444544     79999999753 1     23444443  23456789999999853


No 18 
>PRK10139 serine endoprotease; Provisional
Probab=97.90  E-value=0.00014  Score=72.00  Aligned_cols=142  Identities=21%  Similarity=0.228  Sum_probs=81.7

Q ss_pred             CeeeEEEEecC--CEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCC
Q psy16044         55 PHWCGAVLIHP--SWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSR  132 (436)
Q Consensus        55 ~~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~  132 (436)
                      ....+|.+|++  -+|||++|.+.+.       ..+.|.+...         ..+..+-+...|.     .||||||++.
T Consensus        89 ~~~GSG~ii~~~~g~IlTn~HVv~~a-------~~i~V~~~dg---------~~~~a~vvg~D~~-----~DlAvlkv~~  147 (455)
T PRK10139         89 EGLGSGVIIDAAKGYVLTNNHVINQA-------QKISIQLNDG---------REFDAKLIGSDDQ-----SDIALLQIQN  147 (455)
T ss_pred             cceEEEEEEECCCCEEEeChHHhCCC-------CEEEEEECCC---------CEEEEEEEEEcCC-----CCEEEEEecC
Confidence            35799999974  6999999998653       4566665321         2344444445554     7999999985


Q ss_pred             CCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCC
Q psy16044        133 PTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLD  212 (436)
Q Consensus       133 ~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~  212 (436)
                      +-.     ..++.|.+.  ..+..++.+.++|+.....     .......+.-+....    ..  .......+=+    
T Consensus       148 ~~~-----l~~~~lg~s--~~~~~G~~V~aiG~P~g~~-----~tvt~GivS~~~r~~----~~--~~~~~~~iqt----  205 (455)
T PRK10139        148 PSK-----LTQIAIADS--DKLRVGDFAVAVGNPFGLG-----QTATSGIISALGRSG----LN--LEGLENFIQT----  205 (455)
T ss_pred             CCC-----CceeEecCc--cccCCCCEEEEEecCCCCC-----CceEEEEEccccccc----cC--CCCcceEEEE----
Confidence            422     446666653  4466799999999742111     111111111111100    00  0000011111    


Q ss_pred             CCCCCccCCCCCeeeeEecCCcEEEEEEEeEc
Q psy16044        213 GFSGACIGDSGGPLQCSLKDGRWYLAGITSFG  244 (436)
Q Consensus       213 ~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g  244 (436)
                       ....-.|.|||||+-.  +|  .++||.+..
T Consensus       206 -da~in~GnSGGpl~n~--~G--~vIGi~~~~  232 (455)
T PRK10139        206 -DASINRGNSGGALLNL--NG--ELIGINTAI  232 (455)
T ss_pred             -CCccCCCCCcceEECC--CC--eEEEEEEEE
Confidence             2556789999999953  34  399999874


No 19 
>PRK10942 serine endoprotease; Provisional
Probab=97.81  E-value=0.0003  Score=70.11  Aligned_cols=84  Identities=24%  Similarity=0.330  Sum_probs=56.4

Q ss_pred             CeeeEEEEecC--CEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCC
Q psy16044         55 PHWCGAVLIHP--SWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSR  132 (436)
Q Consensus        55 ~~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~  132 (436)
                      ....+|.+|+.  -+|||++|.+.+       ...+.|.+...         ..+..+-+..+|.     .||||||++.
T Consensus       110 ~~~GSG~ii~~~~G~IlTn~HVv~~-------a~~i~V~~~dg---------~~~~a~vv~~D~~-----~DlAvlki~~  168 (473)
T PRK10942        110 MALGSGVIIDADKGYVVTNNHVVDN-------ATKIKVQLSDG---------RKFDAKVVGKDPR-----SDIALIQLQN  168 (473)
T ss_pred             cceEEEEEEECCCCEEEeChhhcCC-------CCEEEEEECCC---------CEEEEEEEEecCC-----CCEEEEEecC
Confidence            35799999985  499999999865       24566665421         2234444445554     7999999975


Q ss_pred             CCCCCCCceeeeeecCCCCCCCCCCCcEEEEecC
Q psy16044        133 PTSARDKGVRAVCLTDADKRPVNPKQQCVATGWG  166 (436)
Q Consensus       133 ~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG  166 (436)
                      +-.     ..++.|.+  ...+..+..+.++|+-
T Consensus       169 ~~~-----l~~~~lg~--s~~l~~G~~V~aiG~P  195 (473)
T PRK10942        169 PKN-----LTAIKMAD--SDALRVGDYTVAIGNP  195 (473)
T ss_pred             CCC-----CceeEecC--ccccCCCCEEEEEcCC
Confidence            322     34566654  3446678999999864


No 20 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.29  E-value=0.00042  Score=55.19  Aligned_cols=21  Identities=33%  Similarity=0.556  Sum_probs=19.4

Q ss_pred             eEEEEecCC-EEEecCCCcCCC
Q psy16044         58 CGAVLIHPS-WVVTAAHCIHND   78 (436)
Q Consensus        58 C~GtLIs~~-~VLTAAhC~~~~   78 (436)
                      |+|.+|.++ +|||||||+...
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~   22 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDW   22 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCC
T ss_pred             CEEEEEcCCceEEEchhheecc
Confidence            799999999 999999999764


No 21 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=96.15  E-value=0.031  Score=58.49  Aligned_cols=66  Identities=18%  Similarity=0.193  Sum_probs=36.9

Q ss_pred             EEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCCCCCC
Q psy16044         59 GAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPTSARD  138 (436)
Q Consensus        59 ~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~~~~~  138 (436)
                      ..|||+|++|+|++|=....         -.|.+|....       ..+.+..--.|+.     .|..+-||.+=|.   
T Consensus        68 ~aTLigpqYiVSV~HN~~gy---------~~v~FG~~g~-------~~Y~iV~RNn~~~-----~Df~~pRLnK~VT---  123 (769)
T PF02395_consen   68 VATLIGPQYIVSVKHNGKGY---------NSVSFGNEGQ-------NTYKIVDRNNYPS-----GDFHMPRLNKFVT---  123 (769)
T ss_dssp             S-EEEETTEEEBETTG-TSC---------CEECESCSST-------CEEEEEEEEBETT-----STEBEEEESS------
T ss_pred             eEEEecCCeEEEEEccCCCc---------CceeecccCC-------ceEEEEEccCCCC-----cccceeecCceEE---
Confidence            38999999999999986221         2577776432       2233333333433     5999999998665   


Q ss_pred             CceeeeeecCC
Q psy16044        139 KGVRAVCLTDA  149 (436)
Q Consensus       139 ~~v~pi~l~~~  149 (436)
                       .+.|+.....
T Consensus       124 -EvaP~~~t~~  133 (769)
T PF02395_consen  124 -EVAPAEMTTA  133 (769)
T ss_dssp             -SS----BBSS
T ss_pred             -EEeccccccc
Confidence             2677766553


No 22 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=93.68  E-value=0.17  Score=44.55  Aligned_cols=45  Identities=22%  Similarity=0.347  Sum_probs=36.8

Q ss_pred             cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044        280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  326 (436)
Q Consensus       280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~  326 (436)
                      ...+++|++|++|+.|+.+ |+|..||+... .+.....|...|-..
T Consensus        87 ~~S~v~LLHL~~~~~fTr~-VlP~flp~~~~-~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   87 PESNVLLLHLEQPANFTRY-VLPTFLPETSN-ENESDDECVAVGHDD  131 (267)
T ss_pred             cccceeeeeecCcccceee-ecccccccccC-CCCCCCceEEEEccc
Confidence            4688999999999999999 99999997544 344555999998664


No 23 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=90.18  E-value=0.53  Score=49.64  Aligned_cols=54  Identities=26%  Similarity=0.425  Sum_probs=32.5

Q ss_pred             CCccCCCCceeeeEe-cCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHHHHHHhhh
Q psy16044        375 GACIGDSGGPLQCSL-KDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWIRKQINIA  432 (436)
Q Consensus       375 ~~c~gdsGgpl~~~~-~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI~~~i~~~  432 (436)
                      ..-.||||+||+.-+ +..+|+|+|+.+.+.+.....  ..++-+.  .+|+++.+.+.
T Consensus       212 ~~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~--~~~~~~~--~~f~~~~~~~d  266 (769)
T PF02395_consen  212 YGSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKG--NWWNVIP--PDFINQIKQND  266 (769)
T ss_dssp             B--TT-TT-EEEEEETTTTEEEEEEEEEEECCCCHSE--EEEEEEC--HHHHHHHHHHC
T ss_pred             ccccCcCCCceEEEEccCCeEEEEEEEccccccCCcc--ceeEEec--HHHHHHHHhhh
Confidence            355699999998876 577999999999876543322  3343332  45555555443


No 24 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=85.09  E-value=5.3  Score=39.58  Aligned_cols=40  Identities=20%  Similarity=0.199  Sum_probs=28.0

Q ss_pred             cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044        280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  326 (436)
Q Consensus       280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~  326 (436)
                      ...|+|||+++.+    .. ..++.|....  ....++.+++.|+..
T Consensus       103 ~~~DlAllkv~~~----~~-~~~~~l~~~~--~~~~G~~v~aiG~p~  142 (428)
T TIGR02037       103 PRTDIAVLKIDAK----KN-LPVIKLGDSD--KLRVGDWVLAIGNPF  142 (428)
T ss_pred             CCCCEEEEEecCC----CC-ceEEEccCCC--CCCCCCEEEEEECCC
Confidence            3579999999865    12 5566665332  357899999999864


No 25 
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=82.07  E-value=27  Score=33.49  Aligned_cols=144  Identities=21%  Similarity=0.220  Sum_probs=75.2

Q ss_pred             eeeEEEEec-CCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCC
Q psy16044         56 HWCGAVLIH-PSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPT  134 (436)
Q Consensus        56 ~~C~GtLIs-~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~  134 (436)
                      ...+|.+++ ..+|||..|=+..       ...+.+.+.         ....+..+.+-..+     ..|+|+||.+..-
T Consensus        72 ~~gSg~i~~~~g~ivTn~hVi~~-------a~~i~v~l~---------dg~~~~a~~vg~d~-----~~dlavlki~~~~  130 (347)
T COG0265          72 GLGSGFIISSDGYIVTNNHVIAG-------AEEITVTLA---------DGREVPAKLVGKDP-----ISDLAVLKIDGAG  130 (347)
T ss_pred             ccccEEEEcCCeEEEecceecCC-------cceEEEEeC---------CCCEEEEEEEecCC-----ccCEEEEEeccCC
Confidence            678999999 7899999998754       234444441         11334444443333     3799999998653


Q ss_pred             CCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCC
Q psy16044        135 SARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGF  214 (436)
Q Consensus       135 ~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~  214 (436)
                      .     ...+.+..  ...+..++...+.|-..-.     ........+..+... +-.....    ....+     ...
T Consensus       131 ~-----~~~~~~~~--s~~l~vg~~v~aiGnp~g~-----~~tvt~Givs~~~r~-~v~~~~~----~~~~I-----qtd  188 (347)
T COG0265         131 G-----LPVIALGD--SDKLRVGDVVVAIGNPFGL-----GQTVTSGIVSALGRT-GVGSAGG----YVNFI-----QTD  188 (347)
T ss_pred             C-----CceeeccC--CCCcccCCEEEEecCCCCc-----ccceeccEEeccccc-cccCccc----ccchh-----hcc
Confidence            2     22333333  2334456666677643221     011111112222211 1110000    00111     111


Q ss_pred             CCCccCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044        215 SGACIGDSGGPLQCSLKDGRWYLAGITSFGSG  246 (436)
Q Consensus       215 ~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~  246 (436)
                      ...+.|.||||++..  ++  .++||.+....
T Consensus       189 Aain~gnsGgpl~n~--~g--~~iGint~~~~  216 (347)
T COG0265         189 AAINPGNSGGPLVNI--DG--EVVGINTAIIA  216 (347)
T ss_pred             cccCCCCCCCceEcC--CC--cEEEEEEEEec
Confidence            568899999999963  33  49998887654


No 26 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=81.20  E-value=2.6  Score=37.94  Aligned_cols=52  Identities=23%  Similarity=0.369  Sum_probs=35.1

Q ss_pred             CCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcc-cHHHHHHHHh
Q psy16044        375 GACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSF-YLPWIRKQIN  430 (436)
Q Consensus       375 ~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~-~~~WI~~~i~  430 (436)
                      +++.|+||+|+.....    +++||.+-+..-.......-.+|+.. +++||++.++
T Consensus       199 dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~~  251 (251)
T COG3591         199 DTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNIK  251 (251)
T ss_pred             cccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhhC
Confidence            7888999999987642    79999988764322122233444444 7889887653


No 27 
>PRK10898 serine endoprotease; Provisional
Probab=80.99  E-value=10  Score=36.48  Aligned_cols=40  Identities=20%  Similarity=0.216  Sum_probs=25.4

Q ss_pred             ccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044        279 TSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  326 (436)
Q Consensus       279 ~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~  326 (436)
                      ....|+||||++.+ .     ..++.|..  ......++.+++.|+..
T Consensus       122 d~~~DlAvl~v~~~-~-----l~~~~l~~--~~~~~~G~~V~aiG~P~  161 (353)
T PRK10898        122 DSLTDLAVLKINAT-N-----LPVIPINP--KRVPHIGDVVLAIGNPY  161 (353)
T ss_pred             cCCCCEEEEEEcCC-C-----CCeeeccC--cCcCCCCCEEEEEeCCC
Confidence            34689999999764 1     22233322  22456899999999753


No 28 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ].  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=78.40  E-value=11  Score=33.66  Aligned_cols=137  Identities=18%  Similarity=0.213  Sum_probs=57.0

Q ss_pred             EEecCCEEEecCCCcCCCCCCCCCCcceEEE--eccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCCCCCC
Q psy16044         61 VLIHPSWVVTAAHCIHNDIFSLPIPELWTAV--LGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPTSARD  138 (436)
Q Consensus        61 tLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~--~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~~~~~  138 (436)
                      -|.--.||||-+|-+....+      .+.+.  .|.+.....       ...++..-+     ..||.||+|.+.++-+.
T Consensus        36 gigyG~~iItn~HLf~~nng------~L~i~s~hG~f~v~nt-------~~lkv~~i~-----~~DiviirmPkDfpPf~   97 (235)
T PF00863_consen   36 GIGYGSYIITNAHLFKRNNG------ELTIKSQHGEFTVPNT-------TQLKVHPIE-----GRDIVIIRMPKDFPPFP   97 (235)
T ss_dssp             EEEETTEEEEEGGGGSSTTC------EEEEEETTEEEEECEG-------GGSEEEE-T-----CSSEEEEE--TTS----
T ss_pred             EEeECCEEEEChhhhccCCC------eEEEEeCceEEEcCCc-------cccceEEeC-----CccEEEEeCCcccCCcc
Confidence            35567899999999866422      23332  222221111       011222222     46999999998876322


Q ss_pred             CceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCc
Q psy16044        139 KGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSGAC  218 (436)
Q Consensus       139 ~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C  218 (436)
                      .   -+++     +.+..+..+.++|--......  .....+.  ..+... ..           ..|=.-    --++=
T Consensus        98 ~---kl~F-----R~P~~~e~v~mVg~~fq~k~~--~s~vSes--S~i~p~-~~-----------~~fWkH----wIsTk  149 (235)
T PF00863_consen   98 Q---KLKF-----RAPKEGERVCMVGSNFQEKSI--SSTVSES--SWIYPE-EN-----------SHFWKH----WISTK  149 (235)
T ss_dssp             S------B---------TT-EEEEEEEECSSCCC--EEEEEEE--EEEEEE-TT-----------TTEEEE-----C---
T ss_pred             h---hhhc-----cCCCCCCEEEEEEEEEEcCCe--eEEECCc--eEEeec-CC-----------CCeeEE----EecCC
Confidence            2   1121     334466777888854333111  1111111  000000 00           000000    02345


Q ss_pred             cCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044        219 IGDSGGPLQCSLKDGRWYLAGITSFGSG  246 (436)
Q Consensus       219 ~gDsGgPl~~~~~~~~~~l~GI~S~g~~  246 (436)
                      .||=|.||+.. .+|  .++||-|.+..
T Consensus       150 ~G~CG~PlVs~-~Dg--~IVGiHsl~~~  174 (235)
T PF00863_consen  150 DGDCGLPLVST-KDG--KIVGIHSLTSN  174 (235)
T ss_dssp             TT-TT-EEEET-TT----EEEEEEEEET
T ss_pred             CCccCCcEEEc-CCC--cEEEEEcCccC
Confidence            68889999986 345  49999998754


No 29 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=78.27  E-value=7.9  Score=37.21  Aligned_cols=40  Identities=18%  Similarity=0.213  Sum_probs=26.2

Q ss_pred             ccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044        279 TSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR  326 (436)
Q Consensus       279 ~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~  326 (436)
                      ....|+||||++.+-      ..++.+.  +......++.+.+.|+..
T Consensus       122 d~~~DlAvlkv~~~~------~~~~~l~--~s~~~~~G~~V~aiG~P~  161 (351)
T TIGR02038       122 DPLTDLAVLKIEGDN------LPTIPVN--LDRPPHVGDVVLAIGNPY  161 (351)
T ss_pred             cCCCCEEEEEecCCC------CceEecc--CcCccCCCCEEEEEeCCC
Confidence            346899999998641      2223332  222467899999999864


No 30 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=76.08  E-value=2.6  Score=33.34  Aligned_cols=34  Identities=26%  Similarity=0.461  Sum_probs=25.8

Q ss_pred             cCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCccc
Q psy16044        378 IGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFY  421 (436)
Q Consensus       378 ~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~  421 (436)
                      .||-||+|.|+-.     ++||++.|-+     ....|++++.+
T Consensus        89 PGdCGg~L~C~HG-----ViGi~Tagg~-----g~VaF~dir~~  122 (127)
T PF00947_consen   89 PGDCGGILRCKHG-----VIGIVTAGGE-----GHVAFADIRDL  122 (127)
T ss_dssp             TT-TCSEEEETTC-----EEEEEEEEET-----TEEEEEECCCG
T ss_pred             CCCCCceeEeCCC-----eEEEEEeCCC-----ceEEEEechhh
Confidence            3899999999743     9999998652     24679999875


No 31 
>PRK10139 serine endoprotease; Provisional
Probab=75.15  E-value=17  Score=36.37  Aligned_cols=39  Identities=23%  Similarity=0.372  Sum_probs=26.4

Q ss_pred             cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecC
Q psy16044        280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWG  325 (436)
Q Consensus       280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg  325 (436)
                      ...|+||||++.+..     ..++.|....  ....++.++..|.-
T Consensus       136 ~~~DlAvlkv~~~~~-----l~~~~lg~s~--~~~~G~~V~aiG~P  174 (455)
T PRK10139        136 DQSDIALLQIQNPSK-----LTQIAIADSD--KLRVGDFAVAVGNP  174 (455)
T ss_pred             CCCCEEEEEecCCCC-----CceeEecCcc--ccCCCCEEEEEecC
Confidence            458999999985422     3445554332  45689999999874


No 32 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=69.08  E-value=15  Score=35.34  Aligned_cols=41  Identities=24%  Similarity=0.438  Sum_probs=26.6

Q ss_pred             CeEEeccCCCC--CCCccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044        204 GHLCGGQLDGF--SGACIGDSGGPLQCSLKDGRWYLAGITSFGS  245 (436)
Q Consensus       204 ~~~Ca~~~~~~--~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~  245 (436)
                      .|+-++....+  -++-.||-|-|-++. .++.|+++||..-..
T Consensus       485 GMLLTGaNAK~mDLGT~PGDCGcPYvyK-rgNd~VV~GVH~AAt  527 (535)
T PF05416_consen  485 GMLLTGANAKGMDLGTIPGDCGCPYVYK-RGNDWVVIGVHAAAT  527 (535)
T ss_dssp             EEETTSTT-SSTTTS--TTGTT-EEEEE-ETTEEEEEEEEEEE-
T ss_pred             eeeeecCCccccccCCCCCCCCCceeee-cCCcEEEEEEEehhc
Confidence            45555544332  456799999999997 588999999976543


No 33 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=59.98  E-value=5.2  Score=34.05  Aligned_cols=29  Identities=28%  Similarity=0.411  Sum_probs=22.1

Q ss_pred             CCccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044        216 GACIGDSGGPLQCSLKDGRWYLAGITSFGS  245 (436)
Q Consensus       216 ~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~  245 (436)
                      .+..|+=||||+... ++...++||-.-|.
T Consensus       143 ~t~~G~CG~~l~~~~-~~~~~i~GiHvaG~  171 (172)
T PF00548_consen  143 PTKPGMCGSPLVSRI-GGQGKIIGIHVAGN  171 (172)
T ss_dssp             EEETTGTTEEEEESC-GGTTEEEEEEEEEE
T ss_pred             CCCCCccCCeEEEee-ccCccEEEEEeccC
Confidence            355789999999863 45668999987764


No 34 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=57.17  E-value=8  Score=30.66  Aligned_cols=23  Identities=39%  Similarity=0.732  Sum_probs=18.3

Q ss_pred             cCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044        219 IGDSGGPLQCSLKDGRWYLAGITSFGSG  246 (436)
Q Consensus       219 ~gDsGgPl~~~~~~~~~~l~GI~S~g~~  246 (436)
                      .||-||+|.|+-.     ++||++-|..
T Consensus        89 PGdCGg~L~C~HG-----ViGi~Tagg~  111 (127)
T PF00947_consen   89 PGDCGGILRCKHG-----VIGIVTAGGE  111 (127)
T ss_dssp             TT-TCSEEEETTC-----EEEEEEEEET
T ss_pred             CCCCCceeEeCCC-----eEEEEEeCCC
Confidence            7999999999732     9999998754


No 35 
>PRK10942 serine endoprotease; Provisional
Probab=56.49  E-value=58  Score=32.78  Aligned_cols=39  Identities=23%  Similarity=0.314  Sum_probs=25.0

Q ss_pred             cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecC
Q psy16044        280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWG  325 (436)
Q Consensus       280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg  325 (436)
                      ...|+||||++.+-.     ..++.|...  .....++.+++.|.-
T Consensus       157 ~~~DlAvlki~~~~~-----l~~~~lg~s--~~l~~G~~V~aiG~P  195 (473)
T PRK10942        157 PRSDIALIQLQNPKN-----LTAIKMADS--DALRVGDYTVAIGNP  195 (473)
T ss_pred             CCCCEEEEEecCCCC-----CceeEecCc--cccCCCCEEEEEcCC
Confidence            458999999864322     334444322  245788999988864


No 36 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=56.35  E-value=6.8  Score=41.12  Aligned_cols=22  Identities=27%  Similarity=0.564  Sum_probs=19.5

Q ss_pred             eeEEEEecCC-EEEecCCCcCCC
Q psy16044         57 WCGAVLIHPS-WVVTAAHCIHND   78 (436)
Q Consensus        57 ~C~GtLIs~~-~VLTAAhC~~~~   78 (436)
                      .|+|++||++ .|||--||..+.
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~~~   70 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGYGA   70 (698)
T ss_pred             ceeEEEEcCCceEEecchhhhhH
Confidence            4999999997 999999998763


No 37 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=45.14  E-value=20  Score=31.43  Aligned_cols=25  Identities=32%  Similarity=0.353  Sum_probs=21.5

Q ss_pred             CCccCCCCceeeeEecCCcEEEEEEEEecC
Q psy16044        375 GACIGDSGGPLQCSLKDGRWYLAGITSFGS  404 (436)
Q Consensus       375 ~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~  404 (436)
                      +.-+|.||+|++.+++     |+|-+++..
T Consensus       176 GIvqGMSGSPI~qdGK-----LiGAVthvf  200 (218)
T PF05580_consen  176 GIVQGMSGSPIIQDGK-----LIGAVTHVF  200 (218)
T ss_pred             CEEecccCCCEEECCE-----EEEEEEEEE
Confidence            6889999999987654     999999875


No 38 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=32.58  E-value=39  Score=30.63  Aligned_cols=22  Identities=41%  Similarity=0.656  Sum_probs=15.4

Q ss_pred             cCCCCCeeeeEecCCcEEEEEEEeEc
Q psy16044        219 IGDSGGPLQCSLKDGRWYLAGITSFG  244 (436)
Q Consensus       219 ~gDsGgPl~~~~~~~~~~l~GI~S~g  244 (436)
                      .||||+|++..  +|  .|+||.+-.
T Consensus       207 ~GDSGSPVVt~--dg--~liGVHTGS  228 (297)
T PF05579_consen  207 PGDSGSPVVTE--DG--DLIGVHTGS  228 (297)
T ss_dssp             GGCTT-EEEET--TC---EEEEEEEE
T ss_pred             CCCCCCccCcC--CC--CEEEEEecC
Confidence            68999999964  44  399997643


No 39 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=30.81  E-value=37  Score=29.84  Aligned_cols=26  Identities=27%  Similarity=0.277  Sum_probs=21.7

Q ss_pred             CCCccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044        215 SGACIGDSGGPLQCSLKDGRWYLAGITSFGS  245 (436)
Q Consensus       215 ~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~  245 (436)
                      .+.-+|-||+|++.++     .|+|-++++.
T Consensus       175 GGIvqGMSGSPI~qdG-----KLiGAVthvf  200 (218)
T PF05580_consen  175 GGIVQGMSGSPIIQDG-----KLIGAVTHVF  200 (218)
T ss_pred             CCEEecccCCCEEECC-----EEEEEEEEEE
Confidence            5688999999999765     3999998874


No 40 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=30.61  E-value=41  Score=30.52  Aligned_cols=21  Identities=38%  Similarity=0.591  Sum_probs=15.2

Q ss_pred             CCCCceeeeEecCCcEEEEEEEEec
Q psy16044        379 GDSGGPLQCSLKDGRWYLAGITSFG  403 (436)
Q Consensus       379 gdsGgpl~~~~~~~~~~l~Gi~S~~  403 (436)
                      ||||+|++..+  +  .|+||-+-+
T Consensus       208 GDSGSPVVt~d--g--~liGVHTGS  228 (297)
T PF05579_consen  208 GDSGSPVVTED--G--DLIGVHTGS  228 (297)
T ss_dssp             GCTT-EEEETT--C---EEEEEEEE
T ss_pred             CCCCCccCcCC--C--CEEEEEecC
Confidence            89999999853  3  399998654


No 41 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, with cleavage at a Trp-Ser bond resulting in a total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=29.47  E-value=36  Score=27.27  Aligned_cols=25  Identities=36%  Similarity=0.512  Sum_probs=17.4

Q ss_pred             CccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044        217 ACIGDSGGPLQCSLKDGRWYLAGITSFGS  245 (436)
Q Consensus       217 ~C~gDsGgPl~~~~~~~~~~l~GI~S~g~  245 (436)
                      .-.||||-|++-+  .|+  ++||+--|.
T Consensus       103 g~~GDSGRpi~DN--sGr--VVaIVLGG~  127 (158)
T PF00944_consen  103 GKPGDSGRPIFDN--SGR--VVAIVLGGA  127 (158)
T ss_dssp             -STTSTTEEEEST--TSB--EEEEEEEEE
T ss_pred             CCCCCCCCccCcC--CCC--EEEEEecCC
Confidence            3479999999853  444  888876553


No 42 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme.; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=28.06  E-value=38  Score=27.19  Aligned_cols=22  Identities=36%  Similarity=0.724  Sum_probs=14.8

Q ss_pred             ccCCCCceeeeEecCCcEEEEEEEEe
Q psy16044        377 CIGDSGGPLQCSLKDGRWYLAGITSF  402 (436)
Q Consensus       377 c~gdsGgpl~~~~~~~~~~l~Gi~S~  402 (436)
                      -.|.||||+.|.  .|  ..+||+-.
T Consensus       106 lkGSSGgPiLC~--~G--H~vG~f~a  127 (148)
T PF02907_consen  106 LKGSSGGPILCP--SG--HAVGMFRA  127 (148)
T ss_dssp             HTT-TT-EEEET--TS--EEEEEEEE
T ss_pred             EecCCCCcccCC--CC--CEEEEEEE
Confidence            347799999996  34  59998654


No 43 
>KOG1421|consensus
Probab=23.50  E-value=4.5e+02  Score=27.66  Aligned_cols=135  Identities=17%  Similarity=0.192  Sum_probs=0.0

Q ss_pred             EEEEecCC--EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCCCC
Q psy16044         59 GAVLIHPS--WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPTSA  136 (436)
Q Consensus        59 ~GtLIs~~--~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~~~  136 (436)
                      +|.++++.  ++||+.|-+        .+..+...+            ....=..+-+.|-|...-+|+.+++.+..-. 
T Consensus        87 tgfvvd~~~gyiLtnrhvv--------~pgP~va~a------------vf~n~ee~ei~pvyrDpVhdfGf~r~dps~i-  145 (955)
T KOG1421|consen   87 TGFVVDKKLGYILTNRHVV--------APGPFVASA------------VFDNHEEIEIYPVYRDPVHDFGFFRYDPSTI-  145 (955)
T ss_pred             eEEEEecccceEEEecccc--------CCCCceeEE------------EecccccCCcccccCCchhhcceeecChhhc-


Q ss_pred             CCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCC
Q psy16044        137 RDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSG  216 (436)
Q Consensus       137 ~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~  216 (436)
                      .-..++.+||..   .....+..-+++|          .....+..+-.--.+.-...........-..|-+.+.+-..+
T Consensus       146 r~s~vt~i~lap---~~akvgseirvvg----------NDagEklsIlagflSrldr~apdyg~~~yndfnTfy~Qaass  212 (955)
T KOG1421|consen  146 RFSIVTEICLAP---ELAKVGSEIRVVG----------NDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYIQAASS  212 (955)
T ss_pred             ceeeeeccccCc---cccccCCceEEec----------CCccceEEeehhhhhhccCCCccccccccccccceeeeehhc


Q ss_pred             CccCCCCCeee
Q psy16044        217 ACIGDSGGPLQ  227 (436)
Q Consensus       217 ~C~gDsGgPl~  227 (436)
                      +-.|.||+|++
T Consensus       213 tsggssgspVv  223 (955)
T KOG1421|consen  213 TSGGSSGSPVV  223 (955)
T ss_pred             CCCCCCCCcee


No 44 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=20.32  E-value=72  Score=31.15  Aligned_cols=44  Identities=25%  Similarity=0.347  Sum_probs=29.1

Q ss_pred             CCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHHHHHH
Q psy16044        375 GACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWIRKQI  429 (436)
Q Consensus       375 ~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI~~~i  429 (436)
                      +.-+|.||+|++.+++     |+|-++.-.--.++...++      |++|..+..
T Consensus       356 GivqGMSGSPi~q~gk-----liGAvtHVfvndpt~GYGi------~ie~Ml~~~  399 (402)
T TIGR02860       356 GIVQGMSGSPIIQNGK-----VIGAVTHVFVNDPTSGYGV------YIEWMLKEA  399 (402)
T ss_pred             CEEecccCCCEEECCE-----EEEEEEEEEecCCCcceee------hHHHHHHHh
Confidence            7889999999998665     9998886431111111233      578887654


Done!