Query         psy10089
Match_columns 559
No_of_seqs    467 out of 3309
Neff          9.2 
Searched_HMMs 46136
Date          Fri Aug 16 17:36:41 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy10089.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/10089hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00190 Tryp_SPc Trypsin-like  100.0 1.7E-40 3.6E-45  320.4  24.1  229  288-545     1-232 (232)
  2 KOG3627|consensus              100.0 1.1E-39 2.5E-44  319.9  24.7  238  285-547    10-255 (256)
  3 smart00020 Tryp_SPc Trypsin-li 100.0   1E-37 2.2E-42  300.8  23.9  226  287-542     1-229 (229)
  4 PF00089 Trypsin:  Trypsin;  In 100.0 3.3E-35 7.2E-40  281.1  22.5  218  288-542     1-220 (220)
  5 cd00190 Tryp_SPc Trypsin-like  100.0 6.7E-32 1.4E-36  260.3  17.7  177   33-221     1-231 (232)
  6 COG5640 Secreted trypsin-like  100.0 9.7E-32 2.1E-36  255.0  17.0  248  284-552    29-284 (413)
  7 KOG3627|consensus              100.0   9E-31 1.9E-35  256.8  17.0  186   29-222     9-253 (256)
  8 smart00020 Tryp_SPc Trypsin-li 100.0 9.5E-30 2.1E-34  245.1  19.5  174   32-219     1-229 (229)
  9 PF00089 Trypsin:  Trypsin;  In 100.0 4.1E-28 8.9E-33  232.0  14.6  172   33-219     1-220 (220)
 10 COG5640 Secreted trypsin-like   99.9 1.4E-23   3E-28  199.8   8.3  182   29-224    29-279 (413)
 11 PF03761 DUF316:  Domain of unk  99.5 1.7E-13 3.6E-18  136.2  17.7  212  287-544    41-277 (282)
 12 PF03761 DUF316:  Domain of unk  99.4 1.4E-11   3E-16  122.5  15.9  174   22-200    31-259 (282)
 13 PF09342 DUF1986:  Domain of un  99.3 7.4E-12 1.6E-16  114.4  11.2  115  301-436    14-131 (267)
 14 PF09342 DUF1986:  Domain of un  99.3 2.7E-11 5.9E-16  110.8  10.9  111   42-165    14-129 (267)
 15 COG3591 V8-like Glu-specific e  98.9 2.1E-08 4.6E-13   94.5  12.6  200  299-547    45-251 (251)
 16 COG3591 V8-like Glu-specific e  98.3 7.9E-06 1.7E-10   77.3  11.1  147   40-197    45-224 (251)
 17 TIGR02037 degP_htrA_DO peripla  98.0 8.8E-05 1.9E-09   78.2  14.8   84  325-436    58-142 (428)
 18 TIGR02038 protease_degS peripl  97.7 0.00096 2.1E-08   68.2  16.0   83  325-436    78-161 (351)
 19 TIGR02037 degP_htrA_DO peripla  97.7 0.00032   7E-09   73.9  12.7   82   60-165    58-140 (428)
 20 PRK10898 serine endoprotease;   97.6  0.0032 6.9E-08   64.4  17.1   82  325-435    78-160 (353)
 21 PRK10139 serine endoprotease;   97.5  0.0015 3.3E-08   69.0  14.0   83  325-435    90-174 (455)
 22 PRK10942 serine endoprotease;   97.5  0.0036 7.7E-08   66.5  16.1   83  325-435   111-195 (473)
 23 TIGR02038 protease_degS peripl  97.5  0.0028 6.1E-08   64.8  14.7   75   42-136    53-135 (351)
 24 PRK10898 serine endoprotease;   97.3  0.0064 1.4E-07   62.2  14.9   73   44-136    55-135 (353)
 25 PRK10139 serine endoprotease;   97.3   0.003 6.5E-08   66.7  12.5  107   60-196    90-232 (455)
 26 PF13365 Trypsin_2:  Trypsin-li  97.3 0.00082 1.8E-08   56.9   6.8   20   62-81      1-21  (120)
 27 PF13365 Trypsin_2:  Trypsin-li  97.2 0.00097 2.1E-08   56.5   6.8   21  327-347     1-22  (120)
 28 PRK10942 serine endoprotease;   97.1   0.005 1.1E-07   65.4  11.8   56   60-135   111-168 (473)
 29 PF02395 Peptidase_S6:  Immunog  92.1    0.83 1.8E-05   51.2   9.9   30  170-200   216-245 (769)
 30 PF00548 Peptidase_C3:  3C cyst  89.2     3.4 7.5E-05   37.4   9.6   70  324-416    24-93  (172)
 31 PF02395 Peptidase_S6:  Immunog  85.1    0.81 1.8E-05   51.3   3.7   55  491-547   212-268 (769)
 32 COG0265 DegQ Trypsin-like seri  72.4      64  0.0014   32.9  12.7   58  325-406    72-130 (347)
 33 PF00863 Peptidase_C4:  Peptida  70.0   1E+02  0.0022   29.5  12.2   49  491-546   147-196 (235)
 34 PF00947 Pico_P2A:  Picornaviru  60.8     6.5 0.00014   33.2   2.2   35  494-539    89-123 (127)
 35 PF05580 Peptidase_S55:  SpoIVB  55.7      13 0.00028   34.6   3.4   26  490-522   175-200 (218)
 36 PF00548 Peptidase_C3:  3C cyst  47.3 1.7E+02  0.0037   26.4   9.3   71   58-147    23-93  (172)
 37 PF00863 Peptidase_C4:  Peptida  44.6 2.6E+02  0.0057   26.7  10.3   17   66-82     37-53  (235)
 38 PF05416 Peptidase_C37:  Southa  38.6 2.1E+02  0.0045   29.7   8.9   31  489-522   497-527 (535)
 39 COG0265 DegQ Trypsin-like seri  26.6 6.8E+02   0.015   25.3  11.9   29   60-88     72-101 (347)
 40 PF10459 Peptidase_S46:  Peptid  26.1      42 0.00091   37.6   2.1   20   61-80     48-68  (698)
 41 PF10459 Peptidase_S46:  Peptid  24.4      46 0.00099   37.3   2.0   21  326-346    48-69  (698)
 42 PF08192 Peptidase_S64:  Peptid  23.8 5.5E+02   0.012   28.5   9.7   57  490-547   634-690 (695)
 43 PF05579 Peptidase_S32:  Equine  22.2      56  0.0012   31.5   1.8   21  494-520   207-227 (297)
 44 TIGR02860 spore_IV_B stage IV   22.1      67  0.0015   33.3   2.5   44  489-545   354-398 (402)
 45 PF02907 Peptidase_S29:  Hepati  20.8      73  0.0016   27.3   2.0   21  493-519   106-126 (148)

No 1  
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00  E-value=1.7e-40  Score=320.39  Aligned_cols=229  Identities=35%  Similarity=0.662  Sum_probs=193.0

Q ss_pred             ecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCCC
Q psy10089        288 ITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRT  367 (559)
Q Consensus       288 i~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~  367 (559)
                      |+||.    ++..++|||+|.|+...         ..+.|+|+||+++||||||||+.......+.|++|........  
T Consensus         1 i~~G~----~~~~~~~Pw~v~i~~~~---------~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~--   65 (232)
T cd00190           1 IVGGS----EAKIGSFPWQVSLQYTG---------GRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNE--   65 (232)
T ss_pred             CcCCe----ECCCCCCCCEEEEEccC---------CcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCC--
Confidence            45665    88899999999998731         1489999999999999999999875567889999987665421  


Q ss_pred             CCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCC-CCCCCceEEEEccCCCCCCCCCccc
Q psy10089        368 DREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSA-EEYDDQNCIVTGWGKDKFGVEGRYQ  446 (559)
Q Consensus       368 ~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~-~~~~~~~~~~~GwG~~~~~~~~~~~  446 (559)
                          ...+.+.|.++++||+|+.....+|||||+|++|+.++.+++|||||... ....+..+.++|||....  ....+
T Consensus        66 ----~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~--~~~~~  139 (232)
T cd00190          66 ----GGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSE--GGPLP  139 (232)
T ss_pred             ----CceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCC--CCCCC
Confidence                13567899999999999988889999999999999999999999999775 334568999999999863  23567


Q ss_pred             ccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCC-CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCC
Q psy10089        447 STLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGP-NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCG  525 (559)
Q Consensus       447 ~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~-~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~  525 (559)
                      ..++...+.+++...|...+..     ...+.+.++|+.... ..+.|.|||||||++..   +++++|+||+|+|..|.
T Consensus       140 ~~~~~~~~~~~~~~~C~~~~~~-----~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~---~~~~~lvGI~s~g~~c~  211 (232)
T cd00190         140 DVLQEVNVPIVSNAECKRAYSY-----GGTITDNMLCAGGLEGGKDACQGDSGGPLVCND---NGRGVLVGIVSWGSGCA  211 (232)
T ss_pred             ceeeEEEeeeECHHHhhhhccC-----cccCCCceEeeCCCCCCCccccCCCCCcEEEEe---CCEEEEEEEEehhhccC
Confidence            7899999999999999987653     124789999998543 78899999999999987   68899999999999998


Q ss_pred             CCC-CeeeEeccccHHHHHhh
Q psy10089        526 SDT-PGVYVDVRKFKKWILDN  545 (559)
Q Consensus       526 ~~~-p~vyt~V~~y~~WI~~~  545 (559)
                      ... |++|++|+.|++||+++
T Consensus       212 ~~~~~~~~t~v~~~~~WI~~~  232 (232)
T cd00190         212 RPNYPGVYTRVSSYLDWIQKT  232 (232)
T ss_pred             CCCCCCEEEEcHHhhHHhhcC
Confidence            744 99999999999999864


No 2  
>KOG3627|consensus
Probab=100.00  E-value=1.1e-39  Score=319.92  Aligned_cols=238  Identities=37%  Similarity=0.699  Sum_probs=193.9

Q ss_pred             ceEecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCcc-CccceEEEccceeccc
Q psy10089        285 NCVITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNI-PVTDIKVRGGEWDTIT  363 (559)
Q Consensus       285 ~~ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~-~~~~~~V~~g~~~~~~  363 (559)
                      ..||+||.    ++..++|||++.++....        ..+.|+|+||+++||||||||+... .. .+.|++|.+....
T Consensus        10 ~~~i~~g~----~~~~~~~Pw~~~l~~~~~--------~~~~Cggsli~~~~vltaaHC~~~~~~~-~~~V~~G~~~~~~   76 (256)
T KOG3627|consen   10 EGRIVGGT----EAEPGSFPWQVSLQYGGN--------GRHLCGGSLISPRWVLTAAHCVKGASAS-LYTVRLGEHDINL   76 (256)
T ss_pred             cCCEeCCc----cCCCCCCCCEEEEEECCC--------cceeeeeEEeeCCEEEEChhhCCCCCCc-ceEEEECcccccc
Confidence            35899997    888999999999998321        1469999999999999999999863 22 7888889875554


Q ss_pred             cCCCCCCCCccceeeeeEEEeCCCCCCCCCC-CceEEEEeCCCCCCCCCceecccCCCCC---CCCCceEEEEccCCCCC
Q psy10089        364 NNRTDREPFPYQERTVSQIYIHENFEAKTVF-NDIALIILDFPFPVKNHIGLACTPNSAE---EYDDQNCIVTGWGKDKF  439 (559)
Q Consensus       364 ~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~-~DIALl~L~~p~~~~~~v~picLp~~~~---~~~~~~~~~~GwG~~~~  439 (559)
                      ......   ..+...|.++++||+|+..... ||||||+|++++.|+++|+|||||....   ...+..|.++|||.+..
T Consensus        77 ~~~~~~---~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~  153 (256)
T KOG3627|consen   77 SVSEGE---EQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTES  153 (256)
T ss_pred             ccccCc---hhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCC
Confidence            421100   1244557889999999998877 9999999999999999999999995543   33448999999999864


Q ss_pred             CCCCcccccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeC-CCCCCCCCCCCCcccEEeecCCCCcEEEEEEE
Q psy10089        440 GVEGRYQSTLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASG-GPNQDACKGDGGGPLVCQLKNERDRFTQVGIV  518 (559)
Q Consensus       440 ~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~-~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~  518 (559)
                      + ....+..|+++.+++++.++|...+....     .+.+.||||+. ....++|+|||||||++..   .++++|+||+
T Consensus       154 ~-~~~~~~~L~~~~v~i~~~~~C~~~~~~~~-----~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~---~~~~~~~Giv  224 (256)
T KOG3627|consen  154 G-GGPLPDTLQEVDVPIISNSECRRAYGGLG-----TITDTMLCAGGPEGGKDACQGDSGGPLVCED---NGRWVLVGIV  224 (256)
T ss_pred             C-CCCCCceeEEEEEeEcChhHhcccccCcc-----ccCCCEEeeCccCCCCccccCCCCCeEEEee---CCcEEEEEEE
Confidence            2 23567899999999999999998875321     36677999995 6778899999999999986   3489999999


Q ss_pred             EeCCC-CCCCC-CeeeEeccccHHHHHhhcC
Q psy10089        519 SWGIG-CGSDT-PGVYVDVRKFKKWILDNSH  547 (559)
Q Consensus       519 S~g~~-C~~~~-p~vyt~V~~y~~WI~~~i~  547 (559)
                      |||.. |+... |++||+|+.|.+||++.+.
T Consensus       225 S~G~~~C~~~~~P~vyt~V~~y~~WI~~~~~  255 (256)
T KOG3627|consen  225 SWGSGGCGQPNYPGVYTRVSSYLDWIKENIG  255 (256)
T ss_pred             EecCCCCCCCCCCeEEeEhHHhHHHHHHHhc
Confidence            99988 99875 9999999999999999875


No 3  
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00  E-value=1e-37  Score=300.76  Aligned_cols=226  Identities=37%  Similarity=0.703  Sum_probs=189.8

Q ss_pred             EecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCC
Q psy10089        287 VITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNR  366 (559)
Q Consensus       287 ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~  366 (559)
                      ||+||.    ++..++|||+|.++...         ..+.|+|+||++++|||||||+.......+.|++|.+...... 
T Consensus         1 ~~~~G~----~~~~~~~Pw~~~i~~~~---------~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~-   66 (229)
T smart00020        1 RIVGGS----EANIGSFPWQVSLQYRG---------GRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGE-   66 (229)
T ss_pred             CccCCC----cCCCCCCCcEEEEEEcC---------CCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCC-
Confidence            467886    88999999999998732         2478999999999999999999875556889999987654432 


Q ss_pred             CCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCC-CCCCCceEEEEccCCCCCCCCCcc
Q psy10089        367 TDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSA-EEYDDQNCIVTGWGKDKFGVEGRY  445 (559)
Q Consensus       367 ~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~-~~~~~~~~~~~GwG~~~~~~~~~~  445 (559)
                            ..+.+.|.+++.||+|+.....+|||||+|++|+.+++.++|+|||... ....+..+.++|||.... ..+..
T Consensus        67 ------~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~-~~~~~  139 (229)
T smart00020       67 ------EGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSE-GAGSL  139 (229)
T ss_pred             ------CceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCC-CCCcC
Confidence                  1267899999999999988889999999999999999999999999763 334568999999998764 33455


Q ss_pred             cccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCC-CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCC
Q psy10089        446 QSTLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGP-NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGC  524 (559)
Q Consensus       446 ~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~-~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C  524 (559)
                      ...++...+.+++.+.|...+...     ..+...++|++... ..+.|.|||||||++..   + +|+|+||+|+|..|
T Consensus       140 ~~~~~~~~~~~~~~~~C~~~~~~~-----~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~---~-~~~l~Gi~s~g~~C  210 (229)
T smart00020      140 PDTLQEVNVPIVSNATCRRAYSGG-----GAITDNMLCAGGLEGGKDACQGDSGGPLVCND---G-RWVLVGIVSWGSGC  210 (229)
T ss_pred             CCEeeEEEEEEeCHHHhhhhhccc-----cccCCCcEeecCCCCCCcccCCCCCCeeEEEC---C-CEEEEEEEEECCCC
Confidence            678999999999999999876431     24788999998544 78899999999999976   4 99999999999999


Q ss_pred             CCCC-CeeeEeccccHHHH
Q psy10089        525 GSDT-PGVYVDVRKFKKWI  542 (559)
Q Consensus       525 ~~~~-p~vyt~V~~y~~WI  542 (559)
                      .... |.+|+||++|++||
T Consensus       211 ~~~~~~~~~~~i~~~~~WI  229 (229)
T smart00020      211 ARPGKPGVYTRVSSYLDWI  229 (229)
T ss_pred             CCCCCCCEEEEeccccccC
Confidence            8544 99999999999998


No 4  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00  E-value=3.3e-35  Score=281.08  Aligned_cols=218  Identities=41%  Similarity=0.762  Sum_probs=182.5

Q ss_pred             ecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCCC
Q psy10089        288 ITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRT  367 (559)
Q Consensus       288 i~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~  367 (559)
                      |.||.    ++..++|||+|.++..  .       ..++|+|+||+++||||||||+..  ...+.+.+|........  
T Consensus         1 i~~g~----~~~~~~~p~~v~i~~~--~-------~~~~C~G~li~~~~vLTaahC~~~--~~~~~v~~g~~~~~~~~--   63 (220)
T PF00089_consen    1 IVGGD----PASPGEFPWVVSIRYS--N-------GRFFCTGTLISPRWVLTAAHCVDG--ASDIKVRLGTYSIRNSD--   63 (220)
T ss_dssp             SBSSE----ECGTTSSTTEEEEEET--T-------TEEEEEEEEEETTEEEEEGGGHTS--GGSEEEEESESBTTSTT--
T ss_pred             CCCCE----ECCCCCCCeEEEEeeC--C-------CCeeEeEEeccccccccccccccc--ccccccccccccccccc--
Confidence            56775    8999999999999983  2       058999999999999999999986  56788888873333222  


Q ss_pred             CCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCCC-CCCCceEEEEccCCCCCCCCCccc
Q psy10089        368 DREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSAE-EYDDQNCIVTGWGKDKFGVEGRYQ  446 (559)
Q Consensus       368 ~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~-~~~~~~~~~~GwG~~~~~~~~~~~  446 (559)
                          ...+.+.|++++.||+|+.....+|||||+|++|+.+.+.++|+||+.... ...+..+.++|||....  .+ ..
T Consensus        64 ----~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~--~~-~~  136 (220)
T PF00089_consen   64 ----GSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD--NG-YS  136 (220)
T ss_dssp             ----TTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST--TS-BT
T ss_pred             ----ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc--cc-cc
Confidence                124689999999999999988899999999999999999999999998433 24568999999998753  22 56


Q ss_pred             ccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCC
Q psy10089        447 STLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGS  526 (559)
Q Consensus       447 ~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~  526 (559)
                      ..++...+.+++.+.|...+..       .+.+.++|+......+.|.|||||||++..   .   +|+||++++..|..
T Consensus       137 ~~~~~~~~~~~~~~~c~~~~~~-------~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~---~---~lvGI~s~~~~c~~  203 (220)
T PF00089_consen  137 SNLQSVTVPVVSRKTCRSSYND-------NLTPNMICAGSSGSGDACQGDSGGPLICNN---N---YLVGIVSFGENCGS  203 (220)
T ss_dssp             SBEEEEEEEEEEHHHHHHHTTT-------TSTTTEEEEETTSSSBGGTTTTTSEEEETT---E---EEEEEEEEESSSSB
T ss_pred             cccccccccccccccccccccc-------ccccccccccccccccccccccccccccce---e---eecceeeecCCCCC
Confidence            7899999999999999987442       278899999954668999999999999865   1   79999999999998


Q ss_pred             CC-CeeeEeccccHHHH
Q psy10089        527 DT-PGVYVDVRKFKKWI  542 (559)
Q Consensus       527 ~~-p~vyt~V~~y~~WI  542 (559)
                      .. |.+|+||+.|++||
T Consensus       204 ~~~~~v~~~v~~~~~WI  220 (220)
T PF00089_consen  204 PNYPGVYTRVSSYLDWI  220 (220)
T ss_dssp             TTSEEEEEEGGGGHHHH
T ss_pred             CCcCEEEEEHHHhhccC
Confidence            76 89999999999999


No 5  
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.98  E-value=6.7e-32  Score=260.31  Aligned_cols=177  Identities=38%  Similarity=0.716  Sum_probs=149.1

Q ss_pred             ecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC--ceeeEEeeeEEEcccccccccceeEEee
Q psy10089         33 PISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD--VTYSVAAGEWFINGIVEEELEEEQRRDV  110 (559)
Q Consensus        33 iigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~--~~~~v~~g~~~~~~~~~~~~~~~~~~~v  110 (559)
                      |+||+++.+++|||+|+|+...   ..+.|+|+||+++||||||||+...  ..+.|++|......    .....+.+.|
T Consensus         1 i~~G~~~~~~~~Pw~v~i~~~~---~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~----~~~~~~~~~v   73 (232)
T cd00190           1 IVGGSEAKIGSFPWQVSLQYTG---GRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSS----NEGGGQVIKV   73 (232)
T ss_pred             CcCCeECCCCCCCCEEEEEccC---CcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccC----CCCceEEEEE
Confidence            7899999999999999998753   3478999999999999999999764  56788888765531    1135678899


Q ss_pred             EEEEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCC-CCCCCCcEEEEeecCCC--------------------
Q psy10089        111 LDVRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWN-VTYDSENCVITGWGRDS--------------------  169 (559)
Q Consensus       111 ~~i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~-~~~~~~~~~~~GwG~~~--------------------  169 (559)
                      .++++||.|+.....+|||||||++|+.++.+++|||||... ....+..+.++|||...                    
T Consensus        74 ~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~  153 (232)
T cd00190          74 KKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNA  153 (232)
T ss_pred             EEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHH
Confidence            999999999998889999999999999999999999999864 23356789999998642                    


Q ss_pred             ------------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCCCCC-CCCeeeEEeeeccCccc
Q psy10089        170 ------------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCTP-DMPGLYDVTYSVAAGEW  218 (559)
Q Consensus       170 ------------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~~-~~p~vy~~v~~~~~~~W  218 (559)
                                                    +|+|+||++...   +.++|+||+|++..|.. +.|++|+++..  +.+|
T Consensus       154 ~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~---~~~~lvGI~s~g~~c~~~~~~~~~t~v~~--~~~W  228 (232)
T cd00190         154 ECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDN---GRGVLVGIVSWGSGCARPNYPGVYTRVSS--YLDW  228 (232)
T ss_pred             HhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeC---CEEEEEEEEehhhccCCCCCCCEEEEcHH--hhHH
Confidence                                          689999999643   78999999999999984 89999999877  6799


Q ss_pred             eee
Q psy10089        219 FIN  221 (559)
Q Consensus       219 i~~  221 (559)
                      |.+
T Consensus       229 I~~  231 (232)
T cd00190         229 IQK  231 (232)
T ss_pred             hhc
Confidence            854


No 6  
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.98  E-value=9.7e-32  Score=255.01  Aligned_cols=248  Identities=26%  Similarity=0.375  Sum_probs=173.5

Q ss_pred             CceEecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccc
Q psy10089        284 ENCVITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTIT  363 (559)
Q Consensus       284 ~~~ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~  363 (559)
                      -..||+||.    .+..++||++|++..+  ..+.   ...-+|||+++..|||||||||+....+-...+..+..++..
T Consensus        29 vs~rIigGs----~Anag~~P~~VaLv~~--isd~---~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d   99 (413)
T COG5640          29 VSSRIIGGS----NANAGEYPSLVALVDR--ISDY---VSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLND   99 (413)
T ss_pred             cceeEecCc----ccccccCchHHHHHhh--cccc---cceeEeccceecceEEeeehhhccCCCCccccceEEEecccc
Confidence            456999998    9999999999999863  2111   113589999999999999999998744322223333333333


Q ss_pred             cCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCC-CCceecccCCC--CCCCCCceEEEEccCCCCCC
Q psy10089        364 NNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVK-NHIGLACTPNS--AEEYDDQNCIVTGWGKDKFG  440 (559)
Q Consensus       364 ~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~-~~v~picLp~~--~~~~~~~~~~~~GwG~~~~~  440 (559)
                      ..       ..+...|++++.|..|.+.++.||||+++|.++...- ..+.-.--+..  ............+|+.+...
T Consensus       100 ~S-------q~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~  172 (413)
T COG5640         100 SS-------QAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPS  172 (413)
T ss_pred             cc-------cccCcceEEEeeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeec
Confidence            22       2457889999999999999999999999999876542 11111111110  01111244556677766543


Q ss_pred             CCCcc-c--ccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccEEeecCCCCcEEEEEE
Q psy10089        441 VEGRY-Q--STLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLVCQLKNERDRFTQVGI  517 (559)
Q Consensus       441 ~~~~~-~--~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI  517 (559)
                      ..... +  ..|++..+..++...|...+....... ....-.-+|++.. .+++|+||||||++...   ++...++||
T Consensus       173 ~v~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~d-g~~~lT~~cag~~-~~daCqGDSGGPi~~~g---~~G~vQ~GV  247 (413)
T COG5640         173 DVPRSSPKGTILHEVAVLFVPLSTCAQYKGCANASD-GATGLTGFCAGRP-PKDACQGDSGGPIFHKG---EEGRVQRGV  247 (413)
T ss_pred             CCCCCCCccceeeeeeeeeechHHhhhhccccccCC-CCCCccceecCCC-CcccccCCCCCceEEeC---CCccEEEeE
Confidence            22222 2  479999999999999998875221111 1122223999844 39999999999999876   455689999


Q ss_pred             EEeCCC-CCCCC-CeeeEeccccHHHHHhhcCCCCCC
Q psy10089        518 VSWGIG-CGSDT-PGVYVDVRKFKKWILDNSHGKIID  552 (559)
Q Consensus       518 ~S~g~~-C~~~~-p~vyt~V~~y~~WI~~~i~~~~~~  552 (559)
                      +|||.+ |+.+. |+|||+|+.|.+||...|+.....
T Consensus       248 vSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~  284 (413)
T COG5640         248 VSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNGLSYL  284 (413)
T ss_pred             EEecCCCCCCCCcceeEEehhHHHHHHHHHhcCCCcc
Confidence            999986 99988 999999999999999998875543


No 7  
>KOG3627|consensus
Probab=99.97  E-value=9e-31  Score=256.83  Aligned_cols=186  Identities=41%  Similarity=0.737  Sum_probs=151.2

Q ss_pred             CcceecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccCc--eeeEEeeeEEEccccccccccee
Q psy10089         29 DYIEPISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYDV--TYSVAAGEWFINGIVEEELEEEQ  106 (559)
Q Consensus        29 ~~~riigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~~--~~~v~~g~~~~~~~~~~~~~~~~  106 (559)
                      ...||+||.+|.++++||+|+|++...  ..|.|||+||+++||||||||+....  .+.|++|.+......... ...+
T Consensus         9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~--~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~-~~~~   85 (256)
T KOG3627|consen    9 PEGRIVGGTEAEPGSFPWQVSLQYGGN--GRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEG-EEQL   85 (256)
T ss_pred             ccCCEeCCccCCCCCCCCEEEEEECCC--cceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccC-chhh
Confidence            357899999999999999999987653  23789999999999999999998765  788888876554221110 0124


Q ss_pred             EEeeEEEEECCCCCCCCcC-CceEEEeecCccccCCCcceeccCCCCC---CCCCCcEEEEeecCCC-------------
Q psy10089        107 RRDVLDVRIHPNYSTETLE-NNIALLKLSSNIDFDDYIHPICLPDWNV---TYDSENCVITGWGRDS-------------  169 (559)
Q Consensus       107 ~~~v~~i~~hp~y~~~~~~-~Diall~L~~~v~~~~~v~picl~~~~~---~~~~~~~~~~GwG~~~-------------  169 (559)
                      ...|.++++||+|+..... ||||||+|.+++.|+++|+|||||....   ...+..|.++|||.+.             
T Consensus        86 ~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~  165 (256)
T KOG3627|consen   86 VGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEV  165 (256)
T ss_pred             hceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEE
Confidence            5557789999999998877 9999999999999999999999985443   3345789999999752             


Q ss_pred             --------------------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCC-CC-CCCCeeeEE
Q psy10089        170 --------------------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVV-CT-PDMPGLYDV  209 (559)
Q Consensus       170 --------------------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~-C~-~~~p~vy~~  209 (559)
                                                            ||+||||+|....   .++++||+|||.. |. .+.|++|++
T Consensus       166 ~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~---~~~~~GivS~G~~~C~~~~~P~vyt~  242 (256)
T KOG3627|consen  166 DVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNG---RWVLVGIVSWGSGGCGQPNYPGVYTR  242 (256)
T ss_pred             EEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCC---cEEEEEEEEecCCCCCCCCCCeEEeE
Confidence                                                  7999999996532   7899999999987 98 569999999


Q ss_pred             eeeccCccceeec
Q psy10089        210 TYSVAAGEWFING  222 (559)
Q Consensus       210 v~~~~~~~Wi~~~  222 (559)
                      +..  +.+||.+.
T Consensus       243 V~~--y~~WI~~~  253 (256)
T KOG3627|consen  243 VSS--YLDWIKEN  253 (256)
T ss_pred             hHH--hHHHHHHH
Confidence            887  77998654


No 8  
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.97  E-value=9.5e-30  Score=245.08  Aligned_cols=174  Identities=41%  Similarity=0.762  Sum_probs=147.1

Q ss_pred             eecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC--ceeeEEeeeEEEcccccccccceeEEe
Q psy10089         32 EPISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD--VTYSVAAGEWFINGIVEEELEEEQRRD  109 (559)
Q Consensus        32 riigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~--~~~~v~~g~~~~~~~~~~~~~~~~~~~  109 (559)
                      ||+||+++.+++|||+|.|+...   ..+.|+||||++++|||||||+...  ..+.|++|.+....     ....+.+.
T Consensus         1 ~~~~G~~~~~~~~Pw~~~i~~~~---~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~-----~~~~~~~~   72 (229)
T smart00020        1 RIVGGSEANIGSFPWQVSLQYRG---GRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSS-----GEEGQVIK   72 (229)
T ss_pred             CccCCCcCCCCCCCcEEEEEEcC---CCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCC-----CCCceEEe
Confidence            69999999999999999998654   3478999999999999999999764  36788888765431     11227789


Q ss_pred             eEEEEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCC-CCCCCCcEEEEeecCCC-------------------
Q psy10089        110 VLDVRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWN-VTYDSENCVITGWGRDS-------------------  169 (559)
Q Consensus       110 v~~i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~-~~~~~~~~~~~GwG~~~-------------------  169 (559)
                      |.++++||.|+.....+|||||+|++|+.+++.++|||||... ....+..+.++|||...                   
T Consensus        73 v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~  152 (229)
T smart00020       73 VSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVS  152 (229)
T ss_pred             eEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeC
Confidence            9999999999988889999999999999999999999999852 23356789999998752                   


Q ss_pred             --------------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCCCC-CCCCeeeEEeeeccCc
Q psy10089        170 --------------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCT-PDMPGLYDVTYSVAAG  216 (559)
Q Consensus       170 --------------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~-~~~p~vy~~v~~~~~~  216 (559)
                                                      +|+|+||++...    .|+|+||++++..|. .+.|++|+++..  +.
T Consensus       153 ~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~----~~~l~Gi~s~g~~C~~~~~~~~~~~i~~--~~  226 (229)
T smart00020      153 NATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDG----RWVLVGIVSWGSGCARPGKPGVYTRVSS--YL  226 (229)
T ss_pred             HHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECC----CEEEEEEEEECCCCCCCCCCCEEEEecc--cc
Confidence                                            489999999642    799999999999998 789999999986  67


Q ss_pred             cce
Q psy10089        217 EWF  219 (559)
Q Consensus       217 ~Wi  219 (559)
                      +||
T Consensus       227 ~WI  229 (229)
T smart00020      227 DWI  229 (229)
T ss_pred             ccC
Confidence            996


No 9  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.95  E-value=4.1e-28  Score=231.95  Aligned_cols=172  Identities=40%  Similarity=0.770  Sum_probs=143.4

Q ss_pred             ecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEE
Q psy10089         33 PISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLD  112 (559)
Q Consensus        33 iigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~  112 (559)
                      |+||.++.+++|||+|.|++...   .++|+|+||+++||||||||+.....+.+.+|...+.    ......+.+.|.+
T Consensus         1 i~~g~~~~~~~~p~~v~i~~~~~---~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~----~~~~~~~~~~v~~   73 (220)
T PF00089_consen    1 IVGGDPASPGEFPWVVSIRYSNG---RFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIR----NSDGSEQTIKVSK   73 (220)
T ss_dssp             SBSSEECGTTSSTTEEEEEETTT---EEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTT----STTTTSEEEEEEE
T ss_pred             CCCCEECCCCCCCeEEEEeeCCC---CeeEeEEeccccccccccccccccccccccccccccc----ccccccccccccc
Confidence            78999999999999999987543   5889999999999999999998755677777762221    2223358899999


Q ss_pred             EEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCCC-CCCCCcEEEEeecCCC----------------------
Q psy10089        113 VRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWNV-TYDSENCVITGWGRDS----------------------  169 (559)
Q Consensus       113 i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~-~~~~~~~~~~GwG~~~----------------------  169 (559)
                      ++.||.|+.....+|||||+|++++.+.+.++|+||+.... ...+..+.+.|||...                      
T Consensus        74 ~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~  153 (220)
T PF00089_consen   74 IIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCR  153 (220)
T ss_dssp             EEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHH
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            99999999988899999999999999999999999998432 2356789999998742                      


Q ss_pred             ------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCCCC-CCCCeeeEEeeeccCccce
Q psy10089        170 ------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCT-PDMPGLYDVTYSVAAGEWF  219 (559)
Q Consensus       170 ------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~-~~~p~vy~~v~~~~~~~Wi  219 (559)
                                              +|+|+||++...      +|+||.+++..|. .+.|++|+++..  +.+||
T Consensus       154 ~~~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~------~lvGI~s~~~~c~~~~~~~v~~~v~~--~~~WI  220 (220)
T PF00089_consen  154 SSYNDNLTPNMICAGSSGSGDACQGDSGGPLICNNN------YLVGIVSFGENCGSPNYPGVYTRVSS--YLDWI  220 (220)
T ss_dssp             HHTTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTE------EEEEEEEEESSSSBTTSEEEEEEGGG--GHHHH
T ss_pred             ccccccccccccccccccccccccccccccccccee------eecceeeecCCCCCCCcCEEEEEHHH--hhccC
Confidence                                    689999999542      7999999999997 557999999987  66896


No 10 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.89  E-value=1.4e-23  Score=199.79  Aligned_cols=182  Identities=24%  Similarity=0.392  Sum_probs=128.1

Q ss_pred             CcceecCCccCCCCCCCeEEEEEEEecC-CceeeeEEEEEcCCEEEecccccccCce--eeEEeeeEEEcccccccccce
Q psy10089         29 DYIEPISGRNTYFGEFPWMLVLFYYKRN-MEYFKCGASLIGPNIALTAAHCVQYDVT--YSVAAGEWFINGIVEEELEEE  105 (559)
Q Consensus        29 ~~~riigG~~a~~~~~Pw~v~l~~~~~~-~~~~~CgGtLIs~~~VLTAAhC~~~~~~--~~v~~g~~~~~~~~~~~~~~~  105 (559)
                      -.+|||||..|..++||++|+|.-+... ...-+|||+++..|||||||||+.....  ..+..+...+     .+.+..
T Consensus        29 vs~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l-----~d~Sq~  103 (413)
T COG5640          29 VSSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDL-----NDSSQA  103 (413)
T ss_pred             cceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEecc-----cccccc
Confidence            3689999999999999999999755443 2235799999999999999999976542  2223333333     234667


Q ss_pred             eEEeeEEEEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCCCC-------CCCCcEEEEeecCC----------
Q psy10089        106 QRRDVLDVRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWNVT-------YDSENCVITGWGRD----------  168 (559)
Q Consensus       106 ~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~-------~~~~~~~~~GwG~~----------  168 (559)
                      +..+|++++.|..|.+.++.||||+++|.++...-    .+.+-..+..       .........+||.+          
T Consensus       104 ~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~p----r~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p  179 (413)
T COG5640         104 ERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLP----RVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSP  179 (413)
T ss_pred             cCcceEEEeeecccccccccCcceeeccccccccc----hhheeeccCcccceecccccccccceeeeeeeecCCCCCCC
Confidence            88899999999999999999999999998854321    0011000000       00011122233221          


Q ss_pred             -----------------------------------------------CCCCCCceEeecCCCCCcEEEEEEEEcCCC-CC
Q psy10089        169 -----------------------------------------------SADGGGPLVCPSKEDPTTFFQVGIAAWSVV-CT  200 (559)
Q Consensus       169 -----------------------------------------------~~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~-C~  200 (559)
                                                                     ++|+|||++....+   ...+.||+|||.+ |+
T Consensus       180 ~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~~daCqGDSGGPi~~~g~~---G~vQ~GVvSwG~~~Cg  256 (413)
T COG5640         180 KGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPPKDACQGDSGGPIFHKGEE---GRVQRGVVSWGDGGCG  256 (413)
T ss_pred             ccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCCCcccccCCCCCceEEeCCC---ccEEEeEEEecCCCCC
Confidence                                                           18999999986643   3468999999975 97


Q ss_pred             -CCCCeeeEEeeeccCccceeeccc
Q psy10089        201 -PDMPGLYDVTYSVAAGEWFINGIV  224 (559)
Q Consensus       201 -~~~p~vy~~v~~~~~~~Wi~~~i~  224 (559)
                       .+.|+|||++..  |.+||...+.
T Consensus       257 ~t~~~gVyT~vsn--y~~WI~a~~~  279 (413)
T COG5640         257 GTLIPGVYTNVSN--YQDWIAAMTN  279 (413)
T ss_pred             CCCcceeEEehhH--HHHHHHHHhc
Confidence             899999999988  7799876443


No 11 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.55  E-value=1.7e-13  Score=136.19  Aligned_cols=212  Identities=24%  Similarity=0.389  Sum_probs=132.2

Q ss_pred             EecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccce---------EEEcc
Q psy10089        287 VITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDI---------KVRGG  357 (559)
Q Consensus       287 ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~---------~V~~g  357 (559)
                      ++.+|.    .+...+.||.+.+......      .....++|+|||+||||||+||+..... .+         ...-+
T Consensus        41 ~~~~g~----~~~~~~~pW~v~v~~~~~~------~~~~~~~gtlIS~RHiLtss~~~~~~~~-~W~~~~~~~~~~C~~~  109 (282)
T PF03761_consen   41 KVFNGT----PAESGEAPWAVSVYTKNHN------EGNYFSTGTLISPRHILTSSHCVMNDKS-KWLNGEEFDNKKCEGN  109 (282)
T ss_pred             cccCCc----ccccCCCCCEEEEEeccCc------ccceecceEEeccCeEEEeeeEEEeccc-ccccCcccccceeeCC
Confidence            445554    7778899999999874222      1246689999999999999999974221 11         11111


Q ss_pred             ceeccccCC----------CCCCCCccceeeeeEEEeCCCC----CCCCCCCceEEEEeCCCCCCCCCceecccCCCCCC
Q psy10089        358 EWDTITNNR----------TDREPFPYQERTVSQIYIHENF----EAKTVFNDIALIILDFPFPVKNHIGLACTPNSAEE  423 (559)
Q Consensus       358 ~~~~~~~~~----------~~~~~~~~~~~~V~~i~~Hp~y----~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~~  423 (559)
                      ...+..+..          ...........+|.++++--.-    ......++++||+|+++  ++....|+|||.....
T Consensus       110 ~~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~  187 (282)
T PF03761_consen  110 NNHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTN  187 (282)
T ss_pred             CceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCccc
Confidence            000000000          0000011234566666663222    33345689999999999  7789999999976554


Q ss_pred             C-CCceEEEEccCCCCCCCCCcccccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccE
Q psy10089        424 Y-DDQNCIVTGWGKDKFGVEGRYQSTLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLV  502 (559)
Q Consensus       424 ~-~~~~~~~~GwG~~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv  502 (559)
                      . .+..+.+.|+..         ...+....+.+.....|..                .+|    .....|.||+||||+
T Consensus       188 ~~~~~~~~~yg~~~---------~~~~~~~~~~i~~~~~~~~----------------~~~----~~~~~~~~d~Gg~lv  238 (282)
T PF03761_consen  188 WEKGDEVDVYGFNS---------TGKLKHRKLKITNCTKCAY----------------SIC----TKQYSCKGDRGGPLV  238 (282)
T ss_pred             cccCceEEEeecCC---------CCeEEEEEEEEEEeeccce----------------eEe----cccccCCCCccCeEE
Confidence            3 345666777611         2345566666655333211                122    246789999999999


Q ss_pred             EeecCCCCcEEEEEEEEeCC-CCCCCCCeeeEeccccHHHHHh
Q psy10089        503 CQLKNERDRFTQVGIVSWGI-GCGSDTPGVYVDVRKFKKWILD  544 (559)
Q Consensus       503 ~~~~~~~~~~~l~GI~S~g~-~C~~~~p~vyt~V~~y~~WI~~  544 (559)
                      ...   +++++|+||.+.+. .|... ...|.+|..|.+=|-+
T Consensus       239 ~~~---~gr~tlIGv~~~~~~~~~~~-~~~f~~v~~~~~~IC~  277 (282)
T PF03761_consen  239 KNI---NGRWTLIGVGASGNYECNKN-NSYFFNVSWYQDEICE  277 (282)
T ss_pred             EEE---CCCEEEEEEEccCCCccccc-ccEEEEHHHhhhhhcc
Confidence            887   89999999999775 34322 5788899888775543


No 12 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.36  E-value=1.4e-11  Score=122.46  Aligned_cols=174  Identities=20%  Similarity=0.310  Sum_probs=111.3

Q ss_pred             CCCCCCCCcceecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC-cee---------eEEee-
Q psy10089         22 AENTEEYDYIEPISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD-VTY---------SVAAG-   90 (559)
Q Consensus        22 ~~~~~~~~~~riigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~-~~~---------~v~~g-   90 (559)
                      |++......+++.+|..+..++.||.|.+.........+.++|+|||+||||||+||+... ..+         ...-+ 
T Consensus        31 CG~~~~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~C~~~~  110 (282)
T PF03761_consen   31 CGKKKLPYPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKKCEGNN  110 (282)
T ss_pred             cCCCCCCCcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccceeeCCC
Confidence            4433344556689999999999999999988766555577899999999999999999632 111         01111 


Q ss_pred             -eEEEcc-----cc-----cccccceeEEeeEEEEECCCC----CCCCcCCceEEEeecCccccCCCcceeccCCCCCCC
Q psy10089         91 -EWFING-----IV-----EEELEEEQRRDVLDVRIHPNY----STETLENNIALLKLSSNIDFDDYIHPICLPDWNVTY  155 (559)
Q Consensus        91 -~~~~~~-----~~-----~~~~~~~~~~~v~~i~~hp~y----~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~~  155 (559)
                       .+.+..     ..     ..........+|.++++.-.-    ......++++||+|+++  ++....|+|||......
T Consensus       111 ~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~  188 (282)
T PF03761_consen  111 NHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTNW  188 (282)
T ss_pred             ceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCcccc
Confidence             111100     00     000111223456666653222    23344579999999998  77889999999755432


Q ss_pred             -CCCcEEEEeec---------------------------CCCCCCCCceEeecCCCCCcEEEEEEEEcCC-CCC
Q psy10089        156 -DSENCVITGWG---------------------------RDSADGGGPLVCPSKEDPTTFFQVGIAAWSV-VCT  200 (559)
Q Consensus       156 -~~~~~~~~GwG---------------------------~~~~d~G~pl~~~~~~~~~~~~l~Gi~s~~~-~C~  200 (559)
                       .+....+.|+-                           ...+|+||||+...   .++++++||.+.+. .|.
T Consensus       189 ~~~~~~~~yg~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~d~Gg~lv~~~---~gr~tlIGv~~~~~~~~~  259 (282)
T PF03761_consen  189 EKGDEVDVYGFNSTGKLKHRKLKITNCTKCAYSICTKQYSCKGDRGGPLVKNI---NGRWTLIGVGASGNYECN  259 (282)
T ss_pred             ccCceEEEeecCCCCeEEEEEEEEEEeeccceeEecccccCCCCccCeEEEEE---CCCEEEEEEEccCCCccc
Confidence             23334444441                           11279999999843   47899999998765 444


No 13 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=99.34  E-value=7.4e-12  Score=114.39  Aligned_cols=115  Identities=18%  Similarity=0.317  Sum_probs=87.7

Q ss_pred             CcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCc--cceEEEccceeccccCCCCCCCCccceee
Q psy10089        301 GEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPV--TDIKVRGGEWDTITNNRTDREPFPYQERT  378 (559)
Q Consensus       301 ~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~--~~~~V~~g~~~~~~~~~~~~~~~~~~~~~  378 (559)
                      .-|||+|.|+.  .+        .+.|+|+||.++|||++-.|+.+...  .-+.|.+|.......-    ..+.+|.++
T Consensus        14 y~WPWlA~IYv--dG--------~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v----~Gp~EQI~r   79 (267)
T PF09342_consen   14 YHWPWLADIYV--DG--------RYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSV----DGPHEQISR   79 (267)
T ss_pred             ccCcceeeEEE--cC--------eEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceeccc----CCChheEEE
Confidence            46999999998  44        69999999999999999999987544  4567888876544322    223467777


Q ss_pred             eeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCCCC-CCCceEEEEccCC
Q psy10089        379 VSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSAEE-YDDQNCIVTGWGK  436 (559)
Q Consensus       379 V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~~-~~~~~~~~~GwG~  436 (559)
                      |..+..-|       ..+++||.|++|+.|+.+|+|..||..... .....|..+|--.
T Consensus        80 VD~~~~V~-------~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~  131 (267)
T PF09342_consen   80 VDCFKDVP-------ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD  131 (267)
T ss_pred             eeeeeecc-------ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence            77765433       369999999999999999999999974332 2346899998543


No 14 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=99.27  E-value=2.7e-11  Score=110.77  Aligned_cols=111  Identities=24%  Similarity=0.395  Sum_probs=81.9

Q ss_pred             CCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC----ceeeEEeeeEEEcccccccccceeEEeeEEEEECC
Q psy10089         42 GEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD----VTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHP  117 (559)
Q Consensus        42 ~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~----~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp  117 (559)
                      -.|||.|.|+..+.    +.|.|.||.+.|||++..|+.+-    .-..+.+|......  .-..+.+|++.|..+..-|
T Consensus        14 y~WPWlA~IYvdG~----~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~--~v~Gp~EQI~rVD~~~~V~   87 (267)
T PF09342_consen   14 YHWPWLADIYVDGR----YWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYL--SVDGPHEQISRVDCFKDVP   87 (267)
T ss_pred             ccCcceeeEEEcCe----EEEEEEEeccceEEEeccccCCcccccceEEEEecCcceec--ccCCChheEEEeeeeeecc
Confidence            46999999998765    88999999999999999999652    22456666433221  1134567888887665443


Q ss_pred             CCCCCCcCCceEEEeecCccccCCCcceeccCCCCCC-CCCCcEEEEee
Q psy10089        118 NYSTETLENNIALLKLSSNIDFDDYIHPICLPDWNVT-YDSENCVITGW  165 (559)
Q Consensus       118 ~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~-~~~~~~~~~Gw  165 (559)
                             ..+++||.|++|+.|+.+|+|..||..... .....|+..|-
T Consensus        88 -------~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~  129 (267)
T PF09342_consen   88 -------ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGH  129 (267)
T ss_pred             -------ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEc
Confidence                   348999999999999999999999873322 23457888874


No 15 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.89  E-value=2.1e-08  Score=94.52  Aligned_cols=200  Identities=18%  Similarity=0.219  Sum_probs=111.3

Q ss_pred             ccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCcc--ceEEEc-cceeccccCCCCCCCCccc
Q psy10089        299 FFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVT--DIKVRG-GEWDTITNNRTDREPFPYQ  375 (559)
Q Consensus       299 ~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~--~~~V~~-g~~~~~~~~~~~~~~~~~~  375 (559)
                      ....|||-+-....  ..     ...+-|+++||+++.||||+||+......  .+.+.. |.....       .  +.-
T Consensus        45 dt~~~Py~av~~~~--~~-----tG~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~-------~--~~~  108 (251)
T COG3591          45 DTTQFPYSAVVQFE--AA-----TGRLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDG-------G--PFY  108 (251)
T ss_pred             cCCCCCcceeEEee--cC-----CCcceeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCC-------C--CCC
Confidence            45789999877652  21     12356777999999999999999863321  111211 111100       1  111


Q ss_pred             eeeeeEEEeCCC--CCCCCCCCceEEEEeCCCCCCCCCceecccCCCCCCCCCceEEEEccCCCCCCCCCcccccceEEE
Q psy10089        376 ERTVSQIYIHEN--FEAKTVFNDIALIILDFPFPVKNHIGLACTPNSAEEYDDQNCIVTGWGKDKFGVEGRYQSTLKKVE  453 (559)
Q Consensus       376 ~~~V~~i~~Hp~--y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG~~~~~~~~~~~~~L~~~~  453 (559)
                      .+...++.+-|.  |.......|+..+.|+....+.+.....-++.......+....++||-...       ...+++.+
T Consensus       109 ~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk-------~~~~~~~e  181 (251)
T COG3591         109 GITKIEIRVYPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDK-------PNIGTMWE  181 (251)
T ss_pred             ceeeEEEEecCCceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCC-------CcceeEee
Confidence            222222222333  334445677777888866666666666666655555555568999996443       11111111


Q ss_pred             EEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCC-Ceee
Q psy10089        454 VKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDT-PGVY  532 (559)
Q Consensus       454 v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~-p~vy  532 (559)
                             .|....         .+....    ..-..++|.|+||+|++...   +   +++||.+-|..-.... ..-.
T Consensus       182 -------~t~~v~---------~~~~~~----l~y~~dT~pG~SGSpv~~~~---~---~vigv~~~g~~~~~~~~~n~~  235 (251)
T COG3591         182 -------STGKVN---------SIKGNK----LFYDADTLPGSSGSPVLISK---D---EVIGVHYNGPGANGGSLANNA  235 (251)
T ss_pred             -------ecceeE---------EEecce----EEEEecccCCCCCCceEecC---c---eEEEEEecCCCcccccccCcc
Confidence                   111100         011110    11256899999999999754   2   8999999886422111 2333


Q ss_pred             Eec-cccHHHHHhhcC
Q psy10089        533 VDV-RKFKKWILDNSH  547 (559)
Q Consensus       533 t~V-~~y~~WI~~~i~  547 (559)
                      +|+ ..+++||++.++
T Consensus       236 vr~t~~~~~~I~~~~~  251 (251)
T COG3591         236 VRLTPEILNFIQQNIK  251 (251)
T ss_pred             eEecHHHHHHHHHhhC
Confidence            444 557899988764


No 16 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.26  E-value=7.9e-06  Score=77.32  Aligned_cols=147  Identities=18%  Similarity=0.285  Sum_probs=82.9

Q ss_pred             CCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeE--EEEECC
Q psy10089         40 YFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVL--DVRIHP  117 (559)
Q Consensus        40 ~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~--~i~~hp  117 (559)
                      ....|||-+...+....+ .+-|+++||+++.||||+||+.....-....-. ...+..   ........+.  .+.+.|
T Consensus        45 dt~~~Py~av~~~~~~tG-~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~-~p~g~~---~~~~~~~~~~~~~~~~~~  119 (251)
T COG3591          45 DTTQFPYSAVVQFEAATG-RLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAA-APPGVN---SDGGPFYGITKIEIRVYP  119 (251)
T ss_pred             cCCCCCcceeEEeecCCC-cceeeEEEEcCceEEEeeeEEecCCCChhhhhh-cCCccc---CCCCCCCceeeEEEEecC
Confidence            446899999886655433 346888999999999999999653210000000 001111   1122222222  222244


Q ss_pred             C--CCCCCcCCceEEEeecCccccCCCcceeccCCCCCCCCCCcEEEEeecCCC--------------------------
Q psy10089        118 N--YSTETLENNIALLKLSSNIDFDDYIHPICLPDWNVTYDSENCVITGWGRDS--------------------------  169 (559)
Q Consensus       118 ~--y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~GwG~~~--------------------------  169 (559)
                      .  |.......|+..+.|+....++..+...-++.......+....+.|+-.+.                          
T Consensus       120 g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~~~l~y~~d  199 (251)
T COG3591         120 GELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKGNKLFYDAD  199 (251)
T ss_pred             CceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEecceEEEEec
Confidence            3  334445567777777755566666554444433333334446777764322                          


Q ss_pred             ---CCCCCceEeecCCCCCcEEEEEEEEcCC
Q psy10089        170 ---ADGGGPLVCPSKEDPTTFFQVGIAAWSV  197 (559)
Q Consensus       170 ---~d~G~pl~~~~~~~~~~~~l~Gi~s~~~  197 (559)
                         |+||+|++-...      +++|+..-+.
T Consensus       200 T~pG~SGSpv~~~~~------~vigv~~~g~  224 (251)
T COG3591         200 TLPGSSGSPVLISKD------EVIGVHYNGP  224 (251)
T ss_pred             ccCCCCCCceEecCc------eEEEEEecCC
Confidence               789999987432      6888887664


No 17 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.04  E-value=8.8e-05  Score=78.17  Aligned_cols=84  Identities=15%  Similarity=0.114  Sum_probs=58.1

Q ss_pred             eEeeEEEEcCC-EEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089        325 FQCGATLILPH-VVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD  403 (559)
Q Consensus       325 ~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~  403 (559)
                      ..++|.+|+++ +|||++|++.+  ...+.|.+..               ...+..+-+..++       ..||||||++
T Consensus        58 ~~GSGfii~~~G~IlTn~Hvv~~--~~~i~V~~~~---------------~~~~~a~vv~~d~-------~~DlAllkv~  113 (428)
T TIGR02037        58 GLGSGVIISADGYILTNNHVVDG--ADEITVTLSD---------------GREFKAKLVGKDP-------RTDIAVLKID  113 (428)
T ss_pred             ceeeEEEECCCCEEEEcHHHcCC--CCeEEEEeCC---------------CCEEEEEEEEecC-------CCCEEEEEec
Confidence            57999999986 99999999985  3455555432               1233433333333       3699999998


Q ss_pred             CCCCCCCCceecccCCCCCCCCCceEEEEccCC
Q psy10089        404 FPFPVKNHIGLACTPNSAEEYDDQNCIVTGWGK  436 (559)
Q Consensus       404 ~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG~  436 (559)
                      .+    ..+.++.|........++.++++|+..
T Consensus       114 ~~----~~~~~~~l~~~~~~~~G~~v~aiG~p~  142 (428)
T TIGR02037       114 AK----KNLPVIKLGDSDKLRVGDWVLAIGNPF  142 (428)
T ss_pred             CC----CCceEEEccCCCCCCCCCEEEEEECCC
Confidence            65    345677776655556789999999853


No 18 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.74  E-value=0.00096  Score=68.19  Aligned_cols=83  Identities=8%  Similarity=0.015  Sum_probs=53.3

Q ss_pred             eEeeEEEEcCC-EEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089        325 FQCGATLILPH-VVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD  403 (559)
Q Consensus       325 ~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~  403 (559)
                      ...+|.+|+++ +|||++|.+..  ...+.|.+..               ...+..+-+..+|       ..||||||++
T Consensus        78 ~~GSG~vi~~~G~IlTn~HVV~~--~~~i~V~~~d---------------g~~~~a~vv~~d~-------~~DlAvlkv~  133 (351)
T TIGR02038        78 GLGSGVIMSKEGYILTNYHVIKK--ADQIVVALQD---------------GRKFEAELVGSDP-------LTDLAVLKIE  133 (351)
T ss_pred             ceEEEEEEeCCeEEEecccEeCC--CCEEEEEECC---------------CCEEEEEEEEecC-------CCCEEEEEec
Confidence            46999999977 99999999975  3445555432               1223333333333       3699999998


Q ss_pred             CCCCCCCCceecccCCCCCCCCCceEEEEccCC
Q psy10089        404 FPFPVKNHIGLACTPNSAEEYDDQNCIVTGWGK  436 (559)
Q Consensus       404 ~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG~  436 (559)
                      .+-     +.++.+-.......|+.+.++|+..
T Consensus       134 ~~~-----~~~~~l~~s~~~~~G~~V~aiG~P~  161 (351)
T TIGR02038       134 GDN-----LPTIPVNLDRPPHVGDVVLAIGNPY  161 (351)
T ss_pred             CCC-----CceEeccCcCccCCCCEEEEEeCCC
Confidence            642     2334443333445689999999853


No 19 
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.73  E-value=0.00032  Score=73.90  Aligned_cols=82  Identities=20%  Similarity=0.165  Sum_probs=51.0

Q ss_pred             eeeEEEEEcCC-EEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecCccc
Q psy10089         60 FKCGASLIGPN-IALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSSNID  138 (559)
Q Consensus        60 ~~CgGtLIs~~-~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v~  138 (559)
                      ..++|.+|+++ +|||++|++.....+.|.+..             ...+..+-+..++.       .||||||++.+  
T Consensus        58 ~~GSGfii~~~G~IlTn~Hvv~~~~~i~V~~~~-------------~~~~~a~vv~~d~~-------~DlAllkv~~~--  115 (428)
T TIGR02037        58 GLGSGVIISADGYILTNNHVVDGADEITVTLSD-------------GREFKAKLVGKDPR-------TDIAVLKIDAK--  115 (428)
T ss_pred             ceeeEEEECCCCEEEEcHHHcCCCCeEEEEeCC-------------CCEEEEEEEEecCC-------CCEEEEEecCC--
Confidence            56999999986 999999999876655554321             12233333334443       49999999854  


Q ss_pred             cCCCcceeccCCCCCCCCCCcEEEEee
Q psy10089        139 FDDYIHPICLPDWNVTYDSENCVITGW  165 (559)
Q Consensus       139 ~~~~v~picl~~~~~~~~~~~~~~~Gw  165 (559)
                        ..+.++.|........++.+.+.|+
T Consensus       116 --~~~~~~~l~~~~~~~~G~~v~aiG~  140 (428)
T TIGR02037       116 --KNLPVIKLGDSDKLRVGDWVLAIGN  140 (428)
T ss_pred             --CCceEEEccCCCCCCCCCEEEEEEC
Confidence              2345666654333334555555554


No 20 
>PRK10898 serine endoprotease; Provisional
Probab=97.59  E-value=0.0032  Score=64.37  Aligned_cols=82  Identities=10%  Similarity=0.064  Sum_probs=52.5

Q ss_pred             eEeeEEEEcCC-EEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089        325 FQCGATLILPH-VVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD  403 (559)
Q Consensus       325 ~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~  403 (559)
                      ...+|.+|+++ +|||.+|-+.+  ...+.|.+..               ...+..+-+-..|       .+||||||++
T Consensus        78 ~~GSGfvi~~~G~IlTn~HVv~~--a~~i~V~~~d---------------g~~~~a~vv~~d~-------~~DlAvl~v~  133 (353)
T PRK10898         78 TLGSGVIMDQRGYILTNKHVIND--ADQIIVALQD---------------GRVFEALLVGSDS-------LTDLAVLKIN  133 (353)
T ss_pred             ceeeEEEEeCCeEEEecccEeCC--CCEEEEEeCC---------------CCEEEEEEEEEcC-------CCCEEEEEEc
Confidence            57999999976 99999999974  3456665432               1223333333333       3699999997


Q ss_pred             CCCCCCCCceecccCCCCCCCCCceEEEEccC
Q psy10089        404 FPFPVKNHIGLACTPNSAEEYDDQNCIVTGWG  435 (559)
Q Consensus       404 ~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG  435 (559)
                      .+ .    ..++.|........++.++++|+.
T Consensus       134 ~~-~----l~~~~l~~~~~~~~G~~V~aiG~P  160 (353)
T PRK10898        134 AT-N----LPVIPINPKRVPHIGDVVLAIGNP  160 (353)
T ss_pred             CC-C----CCeeeccCcCcCCCCCEEEEEeCC
Confidence            54 1    233344333334468889999985


No 21 
>PRK10139 serine endoprotease; Provisional
Probab=97.52  E-value=0.0015  Score=68.96  Aligned_cols=83  Identities=18%  Similarity=0.166  Sum_probs=56.1

Q ss_pred             eEeeEEEEcC--CEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEe
Q psy10089        325 FQCGATLILP--HVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIIL  402 (559)
Q Consensus       325 ~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L  402 (559)
                      ...+|.+|++  -+|||.+|.+.+  ...+.|.+..               ...+..+-+-..|       ..||||||+
T Consensus        90 ~~GSG~ii~~~~g~IlTn~HVv~~--a~~i~V~~~d---------------g~~~~a~vvg~D~-------~~DlAvlkv  145 (455)
T PRK10139         90 GLGSGVIIDAAKGYVLTNNHVINQ--AQKISIQLND---------------GREFDAKLIGSDD-------QSDIALLQI  145 (455)
T ss_pred             ceEEEEEEECCCCEEEeChHHhCC--CCEEEEEECC---------------CCEEEEEEEEEcC-------CCCEEEEEe
Confidence            4799999974  599999999985  4566676542               1233333333333       369999999


Q ss_pred             CCCCCCCCCceecccCCCCCCCCCceEEEEccC
Q psy10089        403 DFPFPVKNHIGLACTPNSAEEYDDQNCIVTGWG  435 (559)
Q Consensus       403 ~~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG  435 (559)
                      +.+-    ...++.|........|+.++++|+.
T Consensus       146 ~~~~----~l~~~~lg~s~~~~~G~~V~aiG~P  174 (455)
T PRK10139        146 QNPS----KLTQIAIADSDKLRVGDFAVAVGNP  174 (455)
T ss_pred             cCCC----CCceeEecCccccCCCCEEEEEecC
Confidence            8652    3446667655555578999999874


No 22 
>PRK10942 serine endoprotease; Provisional
Probab=97.47  E-value=0.0036  Score=66.53  Aligned_cols=83  Identities=23%  Similarity=0.168  Sum_probs=55.0

Q ss_pred             eEeeEEEEcC--CEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEe
Q psy10089        325 FQCGATLILP--HVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIIL  402 (559)
Q Consensus       325 ~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L  402 (559)
                      ...+|.+|+.  -+|||.+|.+.+  ...+.|.+.+               ...+..+-+-.+|       ..||||||+
T Consensus       111 ~~GSG~ii~~~~G~IlTn~HVv~~--a~~i~V~~~d---------------g~~~~a~vv~~D~-------~~DlAvlki  166 (473)
T PRK10942        111 ALGSGVIIDADKGYVVTNNHVVDN--ATKIKVQLSD---------------GRKFDAKVVGKDP-------RSDIALIQL  166 (473)
T ss_pred             ceEEEEEEECCCCEEEeChhhcCC--CCEEEEEECC---------------CCEEEEEEEEecC-------CCCEEEEEe
Confidence            4799999985  499999999975  4556666542               1223333333333       369999999


Q ss_pred             CCCCCCCCCceecccCCCCCCCCCceEEEEccC
Q psy10089        403 DFPFPVKNHIGLACTPNSAEEYDDQNCIVTGWG  435 (559)
Q Consensus       403 ~~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG  435 (559)
                      +.+-    ...++.|-.......++.++++|+.
T Consensus       167 ~~~~----~l~~~~lg~s~~l~~G~~V~aiG~P  195 (473)
T PRK10942        167 QNPK----NLTAIKMADSDALRVGDYTVAIGNP  195 (473)
T ss_pred             cCCC----CCceeEecCccccCCCCEEEEEcCC
Confidence            7542    2345666555555678888888874


No 23 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.46  E-value=0.0028  Score=64.79  Aligned_cols=75  Identities=15%  Similarity=0.147  Sum_probs=47.3

Q ss_pred             CCCCeEEEEEEEecC-------CceeeeEEEEEcCC-EEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEE
Q psy10089         42 GEFPWMLVLFYYKRN-------MEYFKCGASLIGPN-IALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDV  113 (559)
Q Consensus        42 ~~~Pw~v~l~~~~~~-------~~~~~CgGtLIs~~-~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i  113 (559)
                      .--|-+|.|.-....       ......+|.+|+++ +|||++|.+.....+.|.+..             +..+..+-+
T Consensus        53 ~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~d-------------g~~~~a~vv  119 (351)
T TIGR02038        53 RAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQD-------------GRKFEAELV  119 (351)
T ss_pred             hcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECC-------------CCEEEEEEE
Confidence            345889988754321       11245999999977 999999999776555544321             122333333


Q ss_pred             EECCCCCCCCcCCceEEEeecCc
Q psy10089        114 RIHPNYSTETLENNIALLKLSSN  136 (559)
Q Consensus       114 ~~hp~y~~~~~~~Diall~L~~~  136 (559)
                      ..+|.       .||||||++.+
T Consensus       120 ~~d~~-------~DlAvlkv~~~  135 (351)
T TIGR02038       120 GSDPL-------TDLAVLKIEGD  135 (351)
T ss_pred             EecCC-------CCEEEEEecCC
Confidence            34443       49999999753


No 24 
>PRK10898 serine endoprotease; Provisional
Probab=97.29  E-value=0.0064  Score=62.18  Aligned_cols=73  Identities=16%  Similarity=0.150  Sum_probs=45.9

Q ss_pred             CCeEEEEEEEecCC-------ceeeeEEEEEcCC-EEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEE
Q psy10089         44 FPWMLVLFYYKRNM-------EYFKCGASLIGPN-IALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRI  115 (559)
Q Consensus        44 ~Pw~v~l~~~~~~~-------~~~~CgGtLIs~~-~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~  115 (559)
                      -|-+|.|.-.....       .....+|.+|+++ +|||+||=+.....+.|.+.+             +..+..+-+..
T Consensus        55 ~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~d-------------g~~~~a~vv~~  121 (353)
T PRK10898         55 APAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQD-------------GRVFEALLVGS  121 (353)
T ss_pred             CCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCC-------------CCEEEEEEEEE
Confidence            47888886543211       1246899999976 999999999766555554321             12233333334


Q ss_pred             CCCCCCCCcCCceEEEeecCc
Q psy10089        116 HPNYSTETLENNIALLKLSSN  136 (559)
Q Consensus       116 hp~y~~~~~~~Diall~L~~~  136 (559)
                      .|.       .||||||++..
T Consensus       122 d~~-------~DlAvl~v~~~  135 (353)
T PRK10898        122 DSL-------TDLAVLKINAT  135 (353)
T ss_pred             cCC-------CCEEEEEEcCC
Confidence            443       49999999743


No 25 
>PRK10139 serine endoprotease; Provisional
Probab=97.26  E-value=0.003  Score=66.73  Aligned_cols=107  Identities=20%  Similarity=0.239  Sum_probs=64.8

Q ss_pred             eeeEEEEEcC--CEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecCcc
Q psy10089         60 FKCGASLIGP--NIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSSNI  137 (559)
Q Consensus        60 ~~CgGtLIs~--~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v  137 (559)
                      -..+|.||++  -+|||.+|.+.+...+.|.+.+             ...+..+-+...|.       .||||||++.+-
T Consensus        90 ~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~~~d-------------g~~~~a~vvg~D~~-------~DlAvlkv~~~~  149 (455)
T PRK10139         90 GLGSGVIIDAAKGYVLTNNHVINQAQKISIQLND-------------GREFDAKLIGSDDQ-------SDIALLQIQNPS  149 (455)
T ss_pred             ceEEEEEEECCCCEEEeChHHhCCCCEEEEEECC-------------CCEEEEEEEEEcCC-------CCEEEEEecCCC
Confidence            3589999974  6999999999877766665421             12233333334433       499999997532


Q ss_pred             ccCCCcceeccCCCCCCCCCCcEEEEee--c--------------CC------------------CCCCCCceEeecCCC
Q psy10089        138 DFDDYIHPICLPDWNVTYDSENCVITGW--G--------------RD------------------SADGGGPLVCPSKED  183 (559)
Q Consensus       138 ~~~~~v~picl~~~~~~~~~~~~~~~Gw--G--------------~~------------------~~d~G~pl~~~~~~~  183 (559)
                          ...++.|........++.+.+.|.  |              +.                  .|.|||||+-...  
T Consensus       150 ----~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~~~~~~~iqtda~in~GnSGGpl~n~~G--  223 (455)
T PRK10139        150 ----KLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNLEGLENFIQTDASINRGNSGGALLNLNG--  223 (455)
T ss_pred             ----CCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCCCCcceEEEECCccCCCCCcceEECCCC--
Confidence                234555543332223444444443  1              10                  1789999986433  


Q ss_pred             CCcEEEEEEEEcC
Q psy10089        184 PTTFFQVGIAAWS  196 (559)
Q Consensus       184 ~~~~~l~Gi~s~~  196 (559)
                          .++||.+..
T Consensus       224 ----~vIGi~~~~  232 (455)
T PRK10139        224 ----ELIGINTAI  232 (455)
T ss_pred             ----eEEEEEEEE
Confidence                388998764


No 26 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.26  E-value=0.00082  Score=56.92  Aligned_cols=20  Identities=50%  Similarity=0.619  Sum_probs=18.6

Q ss_pred             eEEEEEcCC-EEEeccccccc
Q psy10089         62 CGASLIGPN-IALTAAHCVQY   81 (559)
Q Consensus        62 CgGtLIs~~-~VLTAAhC~~~   81 (559)
                      |+|.+|+++ +|||||||+..
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~   21 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVED   21 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTC
T ss_pred             CEEEEEcCCceEEEchhheec
Confidence            789999999 99999999974


No 27 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.21  E-value=0.00097  Score=56.47  Aligned_cols=21  Identities=38%  Similarity=0.471  Sum_probs=19.3

Q ss_pred             eeEEEEcCC-EEEecccccCcc
Q psy10089        327 CGATLILPH-VVMTAAHCVNNI  347 (559)
Q Consensus       327 C~GtLIs~~-~VLTAAhC~~~~  347 (559)
                      |+|.+|+++ +|||+|||+...
T Consensus         1 GTGf~i~~~g~ilT~~Hvv~~~   22 (120)
T PF13365_consen    1 GTGFLIGPDGYILTAAHVVEDW   22 (120)
T ss_dssp             EEEEEEETTTEEEEEHHHHTCC
T ss_pred             CEEEEEcCCceEEEchhheecc
Confidence            789999999 999999999863


No 28 
>PRK10942 serine endoprotease; Provisional
Probab=97.06  E-value=0.005  Score=65.41  Aligned_cols=56  Identities=21%  Similarity=0.293  Sum_probs=37.4

Q ss_pred             eeeEEEEEcC--CEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecC
Q psy10089         60 FKCGASLIGP--NIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSS  135 (559)
Q Consensus        60 ~~CgGtLIs~--~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~  135 (559)
                      ...+|.||+.  -+|||.+|.+.+...+.|.+.+             ...+..+-+..+|.       .||||||++.
T Consensus       111 ~~GSG~ii~~~~G~IlTn~HVv~~a~~i~V~~~d-------------g~~~~a~vv~~D~~-------~DlAvlki~~  168 (473)
T PRK10942        111 ALGSGVIIDADKGYVVTNNHVVDNATKIKVQLSD-------------GRKFDAKVVGKDPR-------SDIALIQLQN  168 (473)
T ss_pred             ceEEEEEEECCCCEEEeChhhcCCCCEEEEEECC-------------CCEEEEEEEEecCC-------CCEEEEEecC
Confidence            3589999985  4999999999876666555321             11233333334443       4999999964


No 29 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=92.09  E-value=0.83  Score=51.21  Aligned_cols=30  Identities=20%  Similarity=0.354  Sum_probs=22.2

Q ss_pred             CCCCCceEeecCCCCCcEEEEEEEEcCCCCC
Q psy10089        170 ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCT  200 (559)
Q Consensus       170 ~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~  200 (559)
                      ||||+||+.-. ...++|+|+|+.+.+....
T Consensus       216 GDSGSPlF~YD-~~~kKWvl~Gv~~~~~~~~  245 (769)
T PF02395_consen  216 GDSGSPLFAYD-KEKKKWVLVGVLSGGNGYN  245 (769)
T ss_dssp             T-TT-EEEEEE-TTTTEEEEEEEEEEECCCC
T ss_pred             CcCCCceEEEE-ccCCeEEEEEEEccccccC
Confidence            89999999865 3457999999999876543


No 30 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=89.19  E-value=3.4  Score=37.45  Aligned_cols=70  Identities=14%  Similarity=0.039  Sum_probs=39.5

Q ss_pred             eeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089        324 VFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD  403 (559)
Q Consensus       324 ~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~  403 (559)
                      .+.|.+..|..+|.|-..|.-     ....+.++.                ..+++.+.+..  .+......||++++|.
T Consensus        24 ~~t~l~~gi~~~~~lvp~H~~-----~~~~i~i~g----------------~~~~~~d~~~l--v~~~~~~~Dl~~v~l~   80 (172)
T PF00548_consen   24 EFTMLALGIYDRYFLVPTHEE-----PEDTIYIDG----------------VEYKVDDSVVL--VDRDGVDTDLTLVKLP   80 (172)
T ss_dssp             EEEEEEEEEEBTEEEEEGGGG-----GCSEEEETT----------------EEEEEEEEEEE--EETTSSEEEEEEEEEE
T ss_pred             eEEEecceEeeeEEEEECcCC-----CcEEEEECC----------------EEEEeeeeEEE--ecCCCcceeEEEEEcc
Confidence            577889999999999999932     222333332                12222222111  1122235699999999


Q ss_pred             CCCCCCCCceecc
Q psy10089        404 FPFPVKNHIGLAC  416 (559)
Q Consensus       404 ~p~~~~~~v~pic  416 (559)
                      +.-.|.+-.+-++
T Consensus        81 ~~~kfrDIrk~~~   93 (172)
T PF00548_consen   81 RNPKFRDIRKFFP   93 (172)
T ss_dssp             SSS-B--GGGGSB
T ss_pred             CCcccCchhhhhc
Confidence            8888865444444


No 31 
>PF02395 Peptidase_S6:  Immunoglobulin A1 protease Serine protease Prosite pattern;  InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=85.12  E-value=0.81  Score=51.28  Aligned_cols=55  Identities=20%  Similarity=0.338  Sum_probs=32.6

Q ss_pred             CCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeE--eccccHHHHHhhcC
Q psy10089        491 DACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYV--DVRKFKKWILDNSH  547 (559)
Q Consensus       491 ~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt--~V~~y~~WI~~~i~  547 (559)
                      ..=.||||+||+.-+ ....+|+|+|+++.+.+..... ..|+  +...+..|+++-..
T Consensus       212 ~~~~GDSGSPlF~YD-~~~kKWvl~Gv~~~~~~~~g~~-~~~~~~~~~f~~~~~~~d~~  268 (769)
T PF02395_consen  212 YGSPGDSGSPLFAYD-KEKKKWVLVGVLSGGNGYNGKG-NWWNVIPPDFINQIKQNDTD  268 (769)
T ss_dssp             B--TT-TT-EEEEEE-TTTTEEEEEEEEEEECCCCHSE-EEEEEECHHHHHHHHHHCCE
T ss_pred             ccccCcCCCceEEEE-ccCCeEEEEEEEccccccCCcc-ceeEEecHHHHHHHHhhhcc
Confidence            345799999999765 3468999999999887654322 2332  33334456555433


No 32 
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=72.35  E-value=64  Score=32.86  Aligned_cols=58  Identities=16%  Similarity=0.121  Sum_probs=36.8

Q ss_pred             eEeeEEEEc-CCEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089        325 FQCGATLIL-PHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD  403 (559)
Q Consensus       325 ~~C~GtLIs-~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~  403 (559)
                      ...+|.+++ ..+|+|-.|-+..  ...+.|.+..               ...+..+-+-.       ....|+|+||.+
T Consensus        72 ~~gSg~i~~~~g~ivTn~hVi~~--a~~i~v~l~d---------------g~~~~a~~vg~-------d~~~dlavlki~  127 (347)
T COG0265          72 GLGSGFIISSDGYIVTNNHVIAG--AEEITVTLAD---------------GREVPAKLVGK-------DPISDLAVLKID  127 (347)
T ss_pred             ccccEEEEcCCeEEEecceecCC--cceEEEEeCC---------------CCEEEEEEEec-------CCccCEEEEEec
Confidence            467888888 7799999998875  4555555411               12233333322       234699999998


Q ss_pred             CCC
Q psy10089        404 FPF  406 (559)
Q Consensus       404 ~p~  406 (559)
                      ..-
T Consensus       128 ~~~  130 (347)
T COG0265         128 GAG  130 (347)
T ss_pred             cCC
Confidence            653


No 33 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=70.03  E-value=1e+02  Score=29.50  Aligned_cols=49  Identities=29%  Similarity=0.395  Sum_probs=24.7

Q ss_pred             CCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeEeccc-cHHHHHhhc
Q psy10089        491 DACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYVDVRK-FKKWILDNS  546 (559)
Q Consensus       491 ~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt~V~~-y~~WI~~~i  546 (559)
                      ++=.||=|.|||...   ++  .++||.|-+..-..  -.+|+.+.. +.+-+.+..
T Consensus       147 sTk~G~CG~PlVs~~---Dg--~IVGiHsl~~~~~~--~N~F~~f~~~f~~~~l~~~  196 (235)
T PF00863_consen  147 STKDGDCGLPLVSTK---DG--KIVGIHSLTSNTSS--RNYFTPFPDDFEEFYLENI  196 (235)
T ss_dssp             ---TT-TT-EEEETT---T----EEEEEEEEETTTS--SEEEEE--TTHHHHHCC-C
T ss_pred             cCCCCccCCcEEEcC---CC--cEEEEEcCccCCCC--eEEEEcCCHHHHHHHhccc
Confidence            334588899999764   22  79999997642221  357777654 445444433


No 34 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=60.85  E-value=6.5  Score=33.19  Aligned_cols=35  Identities=34%  Similarity=0.594  Sum_probs=26.1

Q ss_pred             CCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeEeccccH
Q psy10089        494 KGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYVDVRKFK  539 (559)
Q Consensus       494 ~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt~V~~y~  539 (559)
                      +||-||+|.|+.       =++||++.|-    .+-..|++|+.+.
T Consensus        89 PGdCGg~L~C~H-------GViGi~Tagg----~g~VaF~dir~~~  123 (127)
T PF00947_consen   89 PGDCGGILRCKH-------GVIGIVTAGG----EGHVAFADIRDLL  123 (127)
T ss_dssp             TT-TCSEEEETT-------CEEEEEEEEE----TTEEEEEECCCGS
T ss_pred             CCCCCceeEeCC-------CeEEEEEeCC----CceEEEEechhhh
Confidence            699999999987       4899999872    1246688888763


No 35 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.
Probab=55.70  E-value=13  Score=34.60  Aligned_cols=26  Identities=19%  Similarity=0.355  Sum_probs=22.6

Q ss_pred             CCCCCCCCCcccEEeecCCCCcEEEEEEEEeCC
Q psy10089        490 QDACKGDGGGPLVCQLKNERDRFTQVGIVSWGI  522 (559)
Q Consensus       490 ~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~  522 (559)
                      .+.-||.||+|++...       .|+|-++++.
T Consensus       175 GGIvqGMSGSPI~qdG-------KLiGAVthvf  200 (218)
T PF05580_consen  175 GGIVQGMSGSPIIQDG-------KLIGAVTHVF  200 (218)
T ss_pred             CCEEecccCCCEEECC-------EEEEEEEEEE
Confidence            4678999999999866       8999999985


No 36 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=47.28  E-value=1.7e+02  Score=26.41  Aligned_cols=71  Identities=18%  Similarity=0.075  Sum_probs=37.2

Q ss_pred             ceeeeEEEEEcCCEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecCcc
Q psy10089         58 EYFKCGASLIGPNIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSSNI  137 (559)
Q Consensus        58 ~~~~CgGtLIs~~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v  137 (559)
                      ..+.|.+.-|..+|.|--.|.- .  ...+.++...              +++...+..  .+......||++++|.+.-
T Consensus        23 g~~t~l~~gi~~~~~lvp~H~~-~--~~~i~i~g~~--------------~~~~d~~~l--v~~~~~~~Dl~~v~l~~~~   83 (172)
T PF00548_consen   23 GEFTMLALGIYDRYFLVPTHEE-P--EDTIYIDGVE--------------YKVDDSVVL--VDRDGVDTDLTLVKLPRNP   83 (172)
T ss_dssp             EEEEEEEEEEEBTEEEEEGGGG-G--CSEEEETTEE--------------EEEEEEEEE--EETTSSEEEEEEEEEESSS
T ss_pred             ceEEEecceEeeeEEEEECcCC-C--cEEEEECCEE--------------EEeeeeEEE--ecCCCcceeEEEEEccCCc
Confidence            3466778899999999999922 1  1222222111              111111111  1112224599999998877


Q ss_pred             ccCCCcceec
Q psy10089        138 DFDDYIHPIC  147 (559)
Q Consensus       138 ~~~~~v~pic  147 (559)
                      .|.+-.+-++
T Consensus        84 kfrDIrk~~~   93 (172)
T PF00548_consen   84 KFRDIRKFFP   93 (172)
T ss_dssp             -B--GGGGSB
T ss_pred             ccCchhhhhc
Confidence            7765544444


No 37 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=44.65  E-value=2.6e+02  Score=26.69  Aligned_cols=17  Identities=18%  Similarity=0.120  Sum_probs=13.7

Q ss_pred             EEcCCEEEecccccccC
Q psy10089         66 LIGPNIALTAAHCVQYD   82 (559)
Q Consensus        66 LIs~~~VLTAAhC~~~~   82 (559)
                      |.--.|++|-+|-+..+
T Consensus        37 igyG~~iItn~HLf~~n   53 (235)
T PF00863_consen   37 IGYGSYIITNAHLFKRN   53 (235)
T ss_dssp             EEETTEEEEEGGGGSST
T ss_pred             EeECCEEEEChhhhccC
Confidence            45688999999999653


No 38 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=38.59  E-value=2.1e+02  Score=29.72  Aligned_cols=31  Identities=19%  Similarity=0.372  Sum_probs=22.7

Q ss_pred             CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCC
Q psy10089        489 NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGI  522 (559)
Q Consensus       489 ~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~  522 (559)
                      +-++-+||-|-|-+...   ++.|+++||..-..
T Consensus       497 DLGT~PGDCGcPYvyKr---gNd~VV~GVH~AAt  527 (535)
T PF05416_consen  497 DLGTIPGDCGCPYVYKR---GNDWVVIGVHAAAT  527 (535)
T ss_dssp             TTS--TTGTT-EEEEEE---TTEEEEEEEEEEE-
T ss_pred             ccCCCCCCCCCceeeec---CCcEEEEEEEehhc
Confidence            34567899999999987   78999999987543


No 39 
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=26.63  E-value=6.8e+02  Score=25.28  Aligned_cols=29  Identities=14%  Similarity=0.063  Sum_probs=21.5

Q ss_pred             eeeEEEEEc-CCEEEecccccccCceeeEE
Q psy10089         60 FKCGASLIG-PNIALTAAHCVQYDVTYSVA   88 (559)
Q Consensus        60 ~~CgGtLIs-~~~VLTAAhC~~~~~~~~v~   88 (559)
                      ....|.+++ ..+|||-.|=+.......+.
T Consensus        72 ~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~  101 (347)
T COG0265          72 GLGSGFIISSDGYIVTNNHVIAGAEEITVT  101 (347)
T ss_pred             ccccEEEEcCCeEEEecceecCCcceEEEE
Confidence            457888888 88999999988665444443


No 40 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=26.11  E-value=42  Score=37.62  Aligned_cols=20  Identities=35%  Similarity=0.826  Sum_probs=18.1

Q ss_pred             eeEEEEEcCC-EEEecccccc
Q psy10089         61 KCGASLIGPN-IALTAAHCVQ   80 (559)
Q Consensus        61 ~CgGtLIs~~-~VLTAAhC~~   80 (559)
                      .|+|++||++ .|||=-||..
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~   68 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGY   68 (698)
T ss_pred             ceeEEEEcCCceEEecchhhh
Confidence            3999999988 9999999984


No 41 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=24.40  E-value=46  Score=37.31  Aligned_cols=21  Identities=29%  Similarity=0.738  Sum_probs=18.7

Q ss_pred             EeeEEEEcCC-EEEecccccCc
Q psy10089        326 QCGATLILPH-VVMTAAHCVNN  346 (559)
Q Consensus       326 ~C~GtLIs~~-~VLTAAhC~~~  346 (559)
                      .|+|++||++ .|||=-||..+
T Consensus        48 GCSgsfVS~~GLvlTNHHC~~~   69 (698)
T PF10459_consen   48 GCSGSFVSPDGLVLTNHHCGYG   69 (698)
T ss_pred             ceeEEEEcCCceEEecchhhhh
Confidence            3999999988 99999999844


No 42 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=23.81  E-value=5.5e+02  Score=28.52  Aligned_cols=57  Identities=14%  Similarity=0.272  Sum_probs=36.4

Q ss_pred             CCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeEeccccHHHHHhhcC
Q psy10089        490 QDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYVDVRKFKKWILDNSH  547 (559)
Q Consensus       490 ~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt~V~~y~~WI~~~i~  547 (559)
                      .-+-.||||+=++....+..-..-++|+.. .+.+.....++||-+...++=++++.+
T Consensus       634 ~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlh-sydge~kqfglftPi~~il~rl~~vT~  690 (695)
T PF08192_consen  634 AFASGGDSGSWVLTKLEDNNKGLGVVGMLH-SYDGEQKQFGLFTPINEILDRLEEVTG  690 (695)
T ss_pred             cccCCCCcccEEEecccccccCceeeEEee-ecCCccceeeccCcHHHHHHHHHHhhc
Confidence            344569999999876532222345677765 334444447899988888877776543


No 43 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=22.20  E-value=56  Score=31.54  Aligned_cols=21  Identities=29%  Similarity=0.537  Sum_probs=15.1

Q ss_pred             CCCCCcccEEeecCCCCcEEEEEEEEe
Q psy10089        494 KGDGGGPLVCQLKNERDRFTQVGIVSW  520 (559)
Q Consensus       494 ~GDsGgPLv~~~~~~~~~~~l~GI~S~  520 (559)
                      .||||+|++..+   +   .|+||.+-
T Consensus       207 ~GDSGSPVVt~d---g---~liGVHTG  227 (297)
T PF05579_consen  207 PGDSGSPVVTED---G---DLIGVHTG  227 (297)
T ss_dssp             GGCTT-EEEETT---C----EEEEEEE
T ss_pred             CCCCCCccCcCC---C---CEEEEEec
Confidence            489999999764   2   68999874


No 44 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=22.15  E-value=67  Score=33.31  Aligned_cols=44  Identities=20%  Similarity=0.422  Sum_probs=31.1

Q ss_pred             CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCC-CCeeeEeccccHHHHHhh
Q psy10089        489 NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSD-TPGVYVDVRKFKKWILDN  545 (559)
Q Consensus       489 ~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~-~p~vyt~V~~y~~WI~~~  545 (559)
                      ..+..||.||+|++...       .|+|-+++-.--... ++++      |.+|+.+.
T Consensus       354 tgGivqGMSGSPi~q~g-------kliGAvtHVfvndpt~GYGi------~ie~Ml~~  398 (402)
T TIGR02860       354 TGGIVQGMSGSPIIQNG-------KVIGAVTHVFVNDPTSGYGV------YIEWMLKE  398 (402)
T ss_pred             hCCEEecccCCCEEECC-------EEEEEEEEEEecCCCcceee------hHHHHHHH
Confidence            35778999999999876       899999976422221 1444      56888765


No 45 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=20.82  E-value=73  Score=27.32  Aligned_cols=21  Identities=38%  Similarity=0.868  Sum_probs=14.2

Q ss_pred             CCCCCCcccEEeecCCCCcEEEEEEEE
Q psy10089        493 CKGDGGGPLVCQLKNERDRFTQVGIVS  519 (559)
Q Consensus       493 C~GDsGgPLv~~~~~~~~~~~l~GI~S  519 (559)
                      -.|-||||++|..   +   ..+||..
T Consensus       106 lkGSSGgPiLC~~---G---H~vG~f~  126 (148)
T PF02907_consen  106 LKGSSGGPILCPS---G---HAVGMFR  126 (148)
T ss_dssp             HTT-TT-EEEETT---S---EEEEEEE
T ss_pred             EecCCCCcccCCC---C---CEEEEEE
Confidence            3588999999975   2   6778865


Done!