Query         004596
Match_columns 743
No_of_seqs    359 out of 1843
Neff          5.7 
Searched_HMMs 46136
Date          Fri Mar 29 01:58:48 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/004596.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/004596hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK10139 serine endoprotease;  100.0 3.1E-28 6.7E-33  274.4  23.6  192  384-661    43-264 (455)
  2 TIGR02038 protease_degS peripl 100.0 8.1E-28 1.7E-32  263.1  24.4  192  384-660    48-251 (351)
  3 PRK10898 serine endoprotease;  100.0 1.4E-27   3E-32  261.4  23.2  192  384-660    48-252 (353)
  4 PRK10942 serine endoprotease;   99.9 1.6E-26 3.4E-31  261.8  23.7  173  403-661   111-285 (473)
  5 TIGR02037 degP_htrA_DO peripla  99.9 9.9E-26 2.1E-30  252.6  24.1  173  403-661    58-231 (428)
  6 COG0265 DegQ Trypsin-like seri  99.9 2.2E-20 4.7E-25  204.0  20.7  190  385-659    37-244 (347)
  7 PRK10139 serine endoprotease;   99.8 3.5E-18 7.6E-23  193.1  14.1  129  213-348   136-274 (455)
  8 PRK10942 serine endoprotease;   99.7 3.3E-17 7.1E-22  186.1  13.8  129  214-349   158-296 (473)
  9 TIGR02037 degP_htrA_DO peripla  99.6 1.1E-15 2.4E-20  171.7  14.7  130  214-350   104-243 (428)
 10 TIGR02038 protease_degS peripl  99.6 1.6E-15 3.5E-20  166.4  14.0  128  213-348   123-262 (351)
 11 PRK10898 serine endoprotease;   99.6 1.6E-15 3.5E-20  166.5  13.4  127  214-348   124-263 (353)
 12 COG0265 DegQ Trypsin-like seri  99.6 1.6E-14 3.5E-19  157.9  12.4  130  212-348   116-256 (347)
 13 PF13365 Trypsin_2:  Trypsin-li  99.6 6.3E-14 1.4E-18  127.6  14.0   24  600-623    97-120 (120)
 14 KOG1320 Serine protease [Postt  99.5 3.7E-13   8E-18  150.5  13.9  201  386-655   133-350 (473)
 15 PF00089 Trypsin:  Trypsin;  In  99.3 2.8E-10   6E-15  113.4  18.2  124  518-650    86-218 (220)
 16 cd00190 Tryp_SPc Trypsin-like   99.2 5.8E-10 1.3E-14  111.8  16.3  109  517-629    87-209 (232)
 17 smart00020 Tryp_SPc Trypsin-li  99.0   2E-08 4.3E-13  101.3  19.3  108  517-628    87-208 (229)
 18 KOG1421 Predicted signaling-as  98.9 5.8E-09 1.3E-13  118.7  11.7  191  386-658    57-261 (955)
 19 KOG1320 Serine protease [Postt  98.8 7.2E-09 1.6E-13  116.5   7.7  125  203-333   211-350 (473)
 20 COG3591 V8-like Glu-specific e  98.5 1.6E-06 3.6E-11   90.9  14.6   73  542-633   157-229 (251)
 21 KOG3627 Trypsin [Amino acid tr  98.1 0.00029 6.4E-09   73.1  19.5  117  519-639   106-239 (256)
 22 PF00863 Peptidase_C4:  Peptida  97.9 0.00031 6.7E-09   73.4  15.1  103  517-646    80-185 (235)
 23 COG5640 Secreted trypsin-like   97.2  0.0023   5E-08   69.9  11.3   38  602-639   223-263 (413)
 24 PF03761 DUF316:  Domain of unk  97.0   0.042   9E-07   58.5  18.1   92  517-629   159-256 (282)
 25 PF05579 Peptidase_S32:  Equine  96.7  0.0046   1E-07   65.2   7.4   77  519-631   156-232 (297)
 26 PF10459 Peptidase_S46:  Peptid  96.3  0.0078 1.7E-07   72.1   7.5   21  405-425    49-69  (698)
 27 PF13365 Trypsin_2:  Trypsin-li  94.9    0.03 6.6E-07   50.6   4.0   21  284-304    97-120 (120)
 28 PF10459 Peptidase_S46:  Peptid  94.6   0.026 5.7E-07   67.7   3.8   65  592-656   618-687 (698)
 29 PF00548 Peptidase_C3:  3C cyst  94.1    0.93   2E-05   45.5  12.9   35  593-627   133-170 (172)
 30 KOG1421 Predicted signaling-as  92.0     2.7 5.9E-05   49.9  14.2  152  494-660   578-731 (955)
 31 PF02907 Peptidase_S29:  Hepati  88.8    0.52 1.1E-05   45.4   4.1   44  603-649   104-147 (148)
 32 PF08192 Peptidase_S64:  Peptid  88.2       3 6.4E-05   49.7  10.5  119  516-655   540-688 (695)
 33 PF00949 Peptidase_S7:  Peptida  87.8    0.46   1E-05   45.8   3.1   31  601-631    91-121 (132)
 34 PF00089 Trypsin:  Trypsin;  In  86.6       7 0.00015   38.6  11.0  110  214-324    86-214 (220)
 35 PF00944 Peptidase_S3:  Alphavi  84.9     1.1 2.4E-05   43.3   3.9   35  597-631    96-130 (158)
 36 PF05580 Peptidase_S55:  SpoIVB  80.8     1.4   3E-05   45.9   3.1   45  597-647   170-214 (218)
 37 PF09342 DUF1986:  Domain of un  79.9      15 0.00033   39.1  10.4   33  392-425    17-49  (267)
 38 PF00947 Pico_P2A:  Picornaviru  75.0     3.8 8.2E-05   39.3   4.0   31  596-627    79-109 (127)
 39 KOG0441 Cu2+/Zn2+ superoxide d  54.3     4.5 9.7E-05   40.0   0.2   42   26-67     38-84  (154)
 40 PF01732 DUF31:  Putative pepti  51.8     9.3  0.0002   42.8   2.3   24  602-625   350-373 (374)
 41 PF08192 Peptidase_S64:  Peptid  50.6      76  0.0017   38.4   9.4  109  209-329   537-684 (695)
 42 COG3591 V8-like Glu-specific e  49.8      48   0.001   35.6   7.0   71  234-312   154-227 (251)
 43 PF02907 Peptidase_S29:  Hepati  49.5      12 0.00025   36.5   2.1   42  284-326   104-146 (148)
 44 TIGR02860 spore_IV_B stage IV   47.9      15 0.00032   41.9   3.1   45  597-647   350-394 (402)
 45 PF03510 Peptidase_C24:  2C end  40.6      84  0.0018   29.4   6.2   17  407-424     3-19  (105)
 46 PF05416 Peptidase_C37:  Southa  40.0      89  0.0019   35.9   7.4   37  594-630   483-529 (535)
 47 PF00863 Peptidase_C4:  Peptida  36.5      85  0.0018   33.4   6.3   81  214-307    81-171 (235)
 48 cd00190 Tryp_SPc Trypsin-like   32.7 1.3E+02  0.0029   29.6   6.9   39  214-252    88-132 (232)
 49 PF00571 CBS:  CBS domain CBS d  25.7      56  0.0012   25.5   2.3   21  606-626    28-48  (57)
 50 smart00020 Tryp_SPc Trypsin-li  25.1 4.1E+02  0.0089   26.3   9.0   39  214-252    88-132 (229)
 51 PF00949 Peptidase_S7:  Peptida  24.4      69  0.0015   31.1   3.0   32  280-311    86-120 (132)
 52 PF13267 DUF4058:  Protein of u  23.5      59  0.0013   34.9   2.5   26  701-728   124-150 (254)
 53 PF00947 Pico_P2A:  Picornaviru  21.4 1.2E+02  0.0025   29.4   3.8   26  281-307    80-108 (127)
 54 PF12381 Peptidase_C3G:  Tungro  20.8      84  0.0018   33.1   2.9   54  596-655   169-228 (231)
 55 PF08208 RNA_polI_A34:  DNA-dir  20.8      33 0.00071   35.0   0.0   13   23-35    109-121 (198)
 56 PF02122 Peptidase_S39:  Peptid  20.0 1.1E+02  0.0024   31.8   3.6   49  597-649   137-185 (203)

No 1  
>PRK10139 serine endoprotease; Provisional
Probab=99.96  E-value=3.1e-28  Score=274.37  Aligned_cols=192  Identities=29%  Similarity=0.522  Sum_probs=158.4

Q ss_pred             chHHHhccCceEEEEECC----------------------------CeeEEEEEEeC-CCeEEecccccccccCcceecc
Q 004596          384 PLPIQKALASVCLITIDD----------------------------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVS  434 (743)
Q Consensus       384 p~~i~~a~~SVV~I~~~~----------------------------~~wGSGfvV~~-~G~ILTNaHVV~p~~~~~t~~~  434 (743)
                      ..+++++.||||.|....                            .++||||+|++ +||||||+||++          
T Consensus        43 ~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~----------  112 (455)
T PRK10139         43 APMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN----------  112 (455)
T ss_pred             HHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC----------
Confidence            357899999999996410                            14799999985 799999999997          


Q ss_pred             CCccccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEe
Q 004596          435 GWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYV  514 (743)
Q Consensus       435 g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v  514 (743)
                      +                                                       ...|.|++.+++.   |+|++++ 
T Consensus       113 ~-------------------------------------------------------a~~i~V~~~dg~~---~~a~vvg-  133 (455)
T PRK10139        113 Q-------------------------------------------------------AQKISIQLNDGRE---FDAKLIG-  133 (455)
T ss_pred             C-------------------------------------------------------CCEEEEEECCCCE---EEEEEEE-
Confidence            2                                                       1248888888876   9999999 


Q ss_pred             cCCCCcEEEEEEccCCCCccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCC
Q 004596          515 CKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA  593 (743)
Q Consensus       515 ~~~~~DLALLkLe~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~  593 (743)
                      .|+.+||||||++. +..+++++++++ .+++||+|+++|||     +|+..+++.|+||+..+...         ....
T Consensus       134 ~D~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P-----~g~~~tvt~GivS~~~r~~~---------~~~~  198 (455)
T PRK10139        134 SDDQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNP-----FGLGQTATSGIISALGRSGL---------NLEG  198 (455)
T ss_pred             EcCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecC-----CCCCCceEEEEEcccccccc---------CCCC
Confidence            56779999999985 457889999886 68999999999995     57778999999998766321         0123


Q ss_pred             cCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhcCCc
Q 004596          594 YPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEV  661 (743)
Q Consensus       594 ~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~~d~  661 (743)
                      ...+||||+++++|||||||||.+|+||||+++...+.++..  +++|+||++.++++++++...+.+
T Consensus       199 ~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~--gigfaIP~~~~~~v~~~l~~~g~v  264 (455)
T PRK10139        199 LENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSV--GIGFAIPSNMARTLAQQLIDFGEI  264 (455)
T ss_pred             cceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCcc--ceEEEEEhHHHHHHHHHHhhcCcc
Confidence            456899999999999999999999999999999877654443  899999999999999999876654


No 2  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.96  E-value=8.1e-28  Score=263.09  Aligned_cols=192  Identities=25%  Similarity=0.415  Sum_probs=156.2

Q ss_pred             chHHHhccCceEEEEECC-----------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCCC
Q 004596          384 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH  452 (743)
Q Consensus       384 p~~i~~a~~SVV~I~~~~-----------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~  452 (743)
                      ..+++++.||||.|....           .+.||||+|+++||||||+||++          +                 
T Consensus        48 ~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~----------~-----------------  100 (351)
T TIGR02038        48 NKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIK----------K-----------------  100 (351)
T ss_pred             HHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeC----------C-----------------
Confidence            356889999999997621           34799999999999999999996          2                 


Q ss_pred             CcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCC
Q 004596          453 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ  532 (743)
Q Consensus       453 ~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~  532 (743)
                                                            ...+.|++.+++.   ++|++++ .|+.+||||||++.  ..
T Consensus       101 --------------------------------------~~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvlkv~~--~~  136 (351)
T TIGR02038       101 --------------------------------------ADQIVVALQDGRK---FEAELVG-SDPLTDLAVLKIEG--DN  136 (351)
T ss_pred             --------------------------------------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC--CC
Confidence                                                  1247788888766   8999998 66789999999996  35


Q ss_pred             ccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCccc
Q 004596          533 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG  611 (743)
Q Consensus       533 l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGG  611 (743)
                      ++++++.++ .+++|++|+++|||     +++..+++.|+|++..+...         .......+||||+++++|||||
T Consensus       137 ~~~~~l~~s~~~~~G~~V~aiG~P-----~~~~~s~t~GiIs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGG  202 (351)
T TIGR02038       137 LPTIPVNLDRPPHVGDVVLAIGNP-----YNLGQTITQGIISATGRNGL---------SSVGRQNFIQTDAAINAGNSGG  202 (351)
T ss_pred             CceEeccCcCccCCCCEEEEEeCC-----CCCCCcEEEEEEEeccCccc---------CCCCcceEEEECCccCCCCCcc
Confidence            788888765 68999999999996     56678999999998766321         0012356899999999999999


Q ss_pred             ceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhcCC
Q 004596          612 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQE  660 (743)
Q Consensus       612 PL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~~d  660 (743)
                      ||||.+|+||||+++.....++....+++|+||++.++++++++...+.
T Consensus       203 pl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~  251 (351)
T TIGR02038       203 ALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGR  251 (351)
T ss_pred             eEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCc
Confidence            9999999999999987654433334589999999999999999987654


No 3  
>PRK10898 serine endoprotease; Provisional
Probab=99.96  E-value=1.4e-27  Score=261.41  Aligned_cols=192  Identities=23%  Similarity=0.377  Sum_probs=155.5

Q ss_pred             chHHHhccCceEEEEECC-----------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCCC
Q 004596          384 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH  452 (743)
Q Consensus       384 p~~i~~a~~SVV~I~~~~-----------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~  452 (743)
                      ..+++++.||||.|....           .++||||+|+++||||||+||++          +                 
T Consensus        48 ~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~----------~-----------------  100 (353)
T PRK10898         48 NQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIN----------D-----------------  100 (353)
T ss_pred             HHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeC----------C-----------------
Confidence            356899999999998731           15899999999999999999996          2                 


Q ss_pred             CcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCC
Q 004596          453 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ  532 (743)
Q Consensus       453 ~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~  532 (743)
                                                            ...|.|++.+++.   |+|++++ .|+.+||||||++.  ..
T Consensus       101 --------------------------------------a~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvl~v~~--~~  136 (353)
T PRK10898        101 --------------------------------------ADQIIVALQDGRV---FEALLVG-SDSLTDLAVLKINA--TN  136 (353)
T ss_pred             --------------------------------------CCEEEEEeCCCCE---EEEEEEE-EcCCCCEEEEEEcC--CC
Confidence                                                  1247788888766   9999998 56779999999985  35


Q ss_pred             ccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCccc
Q 004596          533 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG  611 (743)
Q Consensus       533 l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGG  611 (743)
                      +++++++++ .+++|+.|+++|||     .++..+++.|+|++..+....         ......+||||+++++|||||
T Consensus       137 l~~~~l~~~~~~~~G~~V~aiG~P-----~g~~~~~t~Giis~~~r~~~~---------~~~~~~~iqtda~i~~GnSGG  202 (353)
T PRK10898        137 LPVIPINPKRVPHIGDVVLAIGNP-----YNLGQTITQGIISATGRIGLS---------PTGRQNFLQTDASINHGNSGG  202 (353)
T ss_pred             CCeeeccCcCcCCCCCEEEEEeCC-----CCcCCCcceeEEEeccccccC---------CccccceEEeccccCCCCCcc
Confidence            788888876 58999999999996     566688999999987664210         012246899999999999999


Q ss_pred             ceecCCceEEEEEeeeecCCC-CcccCceeEEEehhHHHHHHHHHHhcCC
Q 004596          612 AVVNLDGHMIGLVTSNARHGG-GTVIPHLNFSIPCAVLRPIFEFARDMQE  660 (743)
Q Consensus       612 PL~d~~G~VIGIvtsna~~~g-g~~~p~lnFaIPi~~l~~il~~~~~~~d  660 (743)
                      ||+|.+|+||||+++.....+ +....+++|+||++.++++++++...+.
T Consensus       203 Pl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~  252 (353)
T PRK10898        203 ALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGR  252 (353)
T ss_pred             eEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCc
Confidence            999999999999998765432 2333489999999999999999877555


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=99.95  E-value=1.6e-26  Score=261.76  Aligned_cols=173  Identities=31%  Similarity=0.538  Sum_probs=145.2

Q ss_pred             eeEEEEEEeC-CCeEEecccccccccCcceeccCCccccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccc
Q 004596          403 VWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRA  481 (743)
Q Consensus       403 ~wGSGfvV~~-~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~  481 (743)
                      ++||||+|++ +||||||+||++          +                                              
T Consensus       111 ~~GSG~ii~~~~G~IlTn~HVv~----------~----------------------------------------------  134 (473)
T PRK10942        111 ALGSGVIIDADKGYVVTNNHVVD----------N----------------------------------------------  134 (473)
T ss_pred             ceEEEEEEECCCCEEEeChhhcC----------C----------------------------------------------
Confidence            4799999996 599999999996          2                                              


Q ss_pred             cccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCCccceeCCCC-CCCCCCeEEEEecCCCCCC
Q 004596          482 YKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPR  560 (743)
Q Consensus       482 ~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~  560 (743)
                               ...|.|++.+++.   |.|++++ .|+.+||||||++. +..+++++++++ .+++|++|+++|||     
T Consensus       135 ---------a~~i~V~~~dg~~---~~a~vv~-~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P-----  195 (473)
T PRK10942        135 ---------ATKIKVQLSDGRK---FDAKVVG-KDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNP-----  195 (473)
T ss_pred             ---------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCC-----
Confidence                     1248888888877   9999998 67789999999985 457899999876 69999999999995     


Q ss_pred             CCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCcee
Q 004596          561 CGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLN  640 (743)
Q Consensus       561 ~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~ln  640 (743)
                      +|+..+++.|+|+++.+...         ....+..+||||+++++|||||||+|.+|+||||+++...+.++..  +++
T Consensus       196 ~g~~~tvt~GiVs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~--g~g  264 (473)
T PRK10942        196 YGLGETVTSGIVSALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNI--GIG  264 (473)
T ss_pred             CCCCcceeEEEEEEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcc--cEE
Confidence            57778999999999876311         0123457899999999999999999999999999999877665543  899


Q ss_pred             EEEehhHHHHHHHHHHhcCCc
Q 004596          641 FSIPCAVLRPIFEFARDMQEV  661 (743)
Q Consensus       641 FaIPi~~l~~il~~~~~~~d~  661 (743)
                      |+||++.++++++++.+.+.+
T Consensus       265 faIP~~~~~~v~~~l~~~g~v  285 (473)
T PRK10942        265 FAIPSNMVKNLTSQMVEYGQV  285 (473)
T ss_pred             EEEEHHHHHHHHHHHHhcccc
Confidence            999999999999999876654


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls the mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.94  E-value=9.9e-26  Score=252.56  Aligned_cols=173  Identities=31%  Similarity=0.494  Sum_probs=143.1

Q ss_pred             eeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCCCCcccccccccCCCCCCCccccccccccccc
Q 004596          403 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY  482 (743)
Q Consensus       403 ~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~  482 (743)
                      ++||||+|+++||||||+||++          +                                               
T Consensus        58 ~~GSGfii~~~G~IlTn~Hvv~----------~-----------------------------------------------   80 (428)
T TIGR02037        58 GLGSGVIISADGYILTNNHVVD----------G-----------------------------------------------   80 (428)
T ss_pred             ceeeEEEECCCCEEEEcHHHcC----------C-----------------------------------------------
Confidence            4799999999999999999997          2                                               


Q ss_pred             ccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCCccceeCCCC-CCCCCCeEEEEecCCCCCCC
Q 004596          483 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC  561 (743)
Q Consensus       483 ~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~  561 (743)
                              ...+.|++.+++.   |+|++++ .++.+||||||++. +..++++.++++ .+++|++|+++|||     .
T Consensus        81 --------~~~i~V~~~~~~~---~~a~vv~-~d~~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p-----~  142 (428)
T TIGR02037        81 --------ADEITVTLSDGRE---FKAKLVG-KDPRTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNP-----F  142 (428)
T ss_pred             --------CCeEEEEeCCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECC-----C
Confidence                    1237777777765   8999998 56779999999986 357899999875 68999999999996     5


Q ss_pred             CCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeE
Q 004596          562 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNF  641 (743)
Q Consensus       562 g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnF  641 (743)
                      ++..+++.|+|++..+...         ....+..++|||+++++|||||||||.+|+||||+++.....++..  +++|
T Consensus       143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~--g~~f  211 (428)
T TIGR02037       143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNV--GIGF  211 (428)
T ss_pred             cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCcc--ceEE
Confidence            6778999999998765310         0123456899999999999999999999999999999876654433  8999


Q ss_pred             EEehhHHHHHHHHHHhcCCc
Q 004596          642 SIPCAVLRPIFEFARDMQEV  661 (743)
Q Consensus       642 aIPi~~l~~il~~~~~~~d~  661 (743)
                      +||++.++++++++.+.+.+
T Consensus       212 aiP~~~~~~~~~~l~~~g~~  231 (428)
T TIGR02037       212 AIPSNMAKNVVDQLIEGGKV  231 (428)
T ss_pred             EEEhHHHHHHHHHHHhcCcC
Confidence            99999999999999876553


No 6  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.85  E-value=2.2e-20  Score=203.97  Aligned_cols=190  Identities=27%  Similarity=0.447  Sum_probs=155.2

Q ss_pred             hHHHhccCceEEEEECC-----------------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCC
Q 004596          385 LPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDS  447 (743)
Q Consensus       385 ~~i~~a~~SVV~I~~~~-----------------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~  447 (743)
                      ..++++.++||.|....                 .++||||+++.+|||+||.||++          +            
T Consensus        37 ~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~----------~------------   94 (347)
T COG0265          37 TAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIA----------G------------   94 (347)
T ss_pred             HHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecC----------C------------
Confidence            46888999999997732                 37899999999999999999997          2            


Q ss_pred             CCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEc
Q 004596          448 ASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLG  527 (743)
Q Consensus       448 ~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe  527 (743)
                                                                 ...+.+.+.++..   +++++++ .|...|+|+||++
T Consensus        95 -------------------------------------------a~~i~v~l~dg~~---~~a~~vg-~d~~~dlavlki~  127 (347)
T COG0265          95 -------------------------------------------AEEITVTLADGRE---VPAKLVG-KDPISDLAVLKID  127 (347)
T ss_pred             -------------------------------------------cceEEEEeCCCCE---EEEEEEe-cCCccCEEEEEec
Confidence                                                       1236666666665   8999998 6778999999999


Q ss_pred             cCCCCccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCC
Q 004596          528 YIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHP  606 (743)
Q Consensus       528 ~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~  606 (743)
                      .... ++.+.+.++ .++.|+.++++|+|     +|+..+++.|+++...+...        ........+|||||++++
T Consensus       128 ~~~~-~~~~~~~~s~~l~vg~~v~aiGnp-----~g~~~tvt~Givs~~~r~~v--------~~~~~~~~~IqtdAain~  193 (347)
T COG0265         128 GAGG-LPVIALGDSDKLRVGDVVVAIGNP-----FGLGQTVTSGIVSALGRTGV--------GSAGGYVNFIQTDAAINP  193 (347)
T ss_pred             cCCC-CceeeccCCCCcccCCEEEEecCC-----CCcccceeccEEeccccccc--------cCcccccchhhcccccCC
Confidence            7322 777788876 58899999999995     57779999999999887411        111225678999999999


Q ss_pred             CCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhcC
Q 004596          607 GGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQ  659 (743)
Q Consensus       607 GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~~  659 (743)
                      ||||||++|.+|++|||++......++..  +++|+||+..+.+++..+...+
T Consensus       194 gnsGgpl~n~~g~~iGint~~~~~~~~~~--gigfaiP~~~~~~v~~~l~~~G  244 (347)
T COG0265         194 GNSGGPLVNIDGEVVGINTAIIAPSGGSS--GIGFAIPVNLVAPVLDELISKG  244 (347)
T ss_pred             CCCCCceEcCCCcEEEEEEEEecCCCCcc--eeEEEecHHHHHHHHHHHHHcC
Confidence            99999999999999999999988766533  6999999999999999988744


No 7  
>PRK10139 serine endoprotease; Provisional
Probab=99.76  E-value=3.5e-18  Score=193.11  Aligned_cols=129  Identities=22%  Similarity=0.328  Sum_probs=112.1

Q ss_pred             ccccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeecc
Q 004596          213 STSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIR  287 (743)
Q Consensus       213 ~~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~  287 (743)
                      ..+|+|||||+. ..++|.+.++  +.+++||+|++||+|||      |..++|.|+||++.+..   ..+..|||||++
T Consensus       136 ~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P~g------~~~tvt~GivS~~~r~~~~~~~~~~~iqtda~  208 (455)
T PRK10139        136 DQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNPFG------LGQTATSGIISALGRSGLNLEGLENFIQTDAS  208 (455)
T ss_pred             CCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecCCC------CCCceEEEEEccccccccCCCCcceEEEECCc
Confidence            458999999983 3568888884  46999999999999999      68899999999987642   235679999999


Q ss_pred             ccCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596          288 CLPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN  348 (743)
Q Consensus       288 ~~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~  348 (743)
                      +|||   |||||.+|+||||+++.++.. +..|++|+||++.+..++.+++ .+++.++|+|+.++
T Consensus       209 in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~  274 (455)
T PRK10139        209 INRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGT  274 (455)
T ss_pred             cCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEE
Confidence            9997   999999999999999999876 6789999999999999999988 67899999998754


No 8  
>PRK10942 serine endoprotease; Provisional
Probab=99.72  E-value=3.3e-17  Score=186.08  Aligned_cols=129  Identities=20%  Similarity=0.331  Sum_probs=113.0

Q ss_pred             cccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeeccc
Q 004596          214 TSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIRC  288 (743)
Q Consensus       214 ~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~~  288 (743)
                      .+|+||||++. ..+++.++++  +.+++||||++||+|||      |.+++|.|+||++.+..   ..|..|||||+++
T Consensus       158 ~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P~g------~~~tvt~GiVs~~~r~~~~~~~~~~~iqtda~i  230 (473)
T PRK10942        158 RSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNPYG------LGETVTSGIVSALGRSGLNVENYENFIQTDAAI  230 (473)
T ss_pred             CCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCCCC------CCcceeEEEEEEeecccCCcccccceEEecccc
Confidence            48999999973 3568888885  46999999999999999      68999999999987642   2467899999999


Q ss_pred             cCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceeccc
Q 004596          289 LPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINK  349 (743)
Q Consensus       289 ~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~  349 (743)
                      +||   |||||.+|+||||+++.+... ++.+++|+||++.+..++.++. .+++.++|.|+.++.
T Consensus       231 ~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g~v~rg~lGv~~~~  296 (473)
T PRK10942        231 NRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTE  296 (473)
T ss_pred             CCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhccccccceeeeEeee
Confidence            997   999999999999999999887 7789999999999999999988 678899999987543


No 9  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls the mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.65  E-value=1.1e-15  Score=171.72  Aligned_cols=130  Identities=24%  Similarity=0.364  Sum_probs=112.7

Q ss_pred             cccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCC---CCCCCceEEeeccc
Q 004596          214 TSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP---RSTTRSLLMADIRC  288 (743)
Q Consensus       214 ~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~---~~~~~~~i~tD~~~  288 (743)
                      .+|+||||++.. .++|.+.++  ..+++||+|+++|+|||      +..++|.|+||+..+.   ...+..||+||+++
T Consensus       104 ~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g------~~~~~t~G~vs~~~~~~~~~~~~~~~i~tda~i  176 (428)
T TIGR02037       104 RTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFG------LGQTVTSGIVSALGRSGLGIGDYENFIQTDAAI  176 (428)
T ss_pred             CCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCc------CCCcEEEEEEEecccCccCCCCccceEEECCCC
Confidence            469999999853 468888885  46999999999999999      6889999999988764   23467899999999


Q ss_pred             cCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceecccC
Q 004596          289 LPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINKG  350 (743)
Q Consensus       289 ~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~~  350 (743)
                      +||   |||||.+|+||||+++.+... ++.+++|+||++.+..++.++. .+++.++|+|+.++.-
T Consensus       177 ~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi~~~~~  243 (428)
T TIGR02037       177 NPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGVTIQEV  243 (428)
T ss_pred             CCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCceEeecC
Confidence            996   999999999999999999877 6779999999999999999988 6678899999886543


No 10 
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.64  E-value=1.6e-15  Score=166.43  Aligned_cols=128  Identities=17%  Similarity=0.330  Sum_probs=107.1

Q ss_pred             ccccEEEEEeccCCCCCCceec--CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeecc
Q 004596          213 STSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIR  287 (743)
Q Consensus       213 ~~t~~A~l~i~~~~~~~~~~~~--~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~  287 (743)
                      ..+|+||||++..  ++|.+.+  +..+++||+|++||+|||      +.++++.|+||+..+..   ..+..|||||+.
T Consensus       123 ~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~------~~~s~t~GiIs~~~r~~~~~~~~~~~iqtda~  194 (351)
T TIGR02038       123 PLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYN------LGQTITQGIISATGRNGLSSVGRQNFIQTDAA  194 (351)
T ss_pred             CCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCC------CCCcEEEEEEEeccCcccCCCCcceEEEECCc
Confidence            3479999999853  4677776  446999999999999999      67899999999886542   134679999999


Q ss_pred             ccCC---CceEcCCCcEEEEEeccccccC---CcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596          288 CLPG---GPVFGEHAHFVGILIRPLRQKS---GAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN  348 (743)
Q Consensus       288 ~~pG---g~vf~~~g~liGiv~~~l~~~g---~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~  348 (743)
                      ++||   |||||.+|+||||+++.+...+   ..+++|+||++.+..++.+++ .+++.++|+|+..+
T Consensus       195 i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~  262 (351)
T TIGR02038       195 INAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGE  262 (351)
T ss_pred             cCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEE
Confidence            9997   9999999999999999886552   258999999999999999987 66788899988743


No 11 
>PRK10898 serine endoprotease; Provisional
Probab=99.63  E-value=1.6e-15  Score=166.54  Aligned_cols=127  Identities=18%  Similarity=0.304  Sum_probs=106.1

Q ss_pred             cccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeeccc
Q 004596          214 TSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIRC  288 (743)
Q Consensus       214 ~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~~  288 (743)
                      .+|+||||++..  ++|.+.++  ..+++||+|+++|+|||      +..++|.|+||+..+..   ..+..|||||+++
T Consensus       124 ~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g------~~~~~t~Giis~~~r~~~~~~~~~~~iqtda~i  195 (353)
T PRK10898        124 LTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYN------LGQTITQGIISATGRIGLSPTGRQNFLQTDASI  195 (353)
T ss_pred             CCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCC------cCCCcceeEEEeccccccCCccccceEEecccc
Confidence            479999999853  57776764  45999999999999999      67899999999876541   2345799999999


Q ss_pred             cCC---CceEcCCCcEEEEEeccccccC----CcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596          289 LPG---GPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN  348 (743)
Q Consensus       289 ~pG---g~vf~~~g~liGiv~~~l~~~g----~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~  348 (743)
                      +||   |||+|.+|+||||+++.+...+    ..+++|+||++.+..++.++. .+++.++|+|+..+
T Consensus       196 ~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~  263 (353)
T PRK10898        196 NHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGR  263 (353)
T ss_pred             CCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEE
Confidence            997   9999999999999999886542    258999999999999999987 67788999998744


No 12 
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.56  E-value=1.6e-14  Score=157.94  Aligned_cols=130  Identities=24%  Similarity=0.345  Sum_probs=111.8

Q ss_pred             cccccEEEEEeccCCCCCCceec--CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCC-C---CCCCceEEee
Q 004596          212 KSTSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-R---STTRSLLMAD  285 (743)
Q Consensus       212 ~~~t~~A~l~i~~~~~~~~~~~~--~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~-~---~~~~~~i~tD  285 (743)
                      ...+|+|+||++.... +|.+.+  ++.+++||++++||+|||      |.+++|.||||...+. -   ..+..|||||
T Consensus       116 d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~~~~~~~~Iqtd  188 (347)
T COG0265         116 DPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGSAGGYVNFIQTD  188 (347)
T ss_pred             CCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccCcccccchhhcc
Confidence            4568999999996433 666666  556999999999999999      7999999999998874 1   2367899999


Q ss_pred             ccccCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596          286 IRCLPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN  348 (743)
Q Consensus       286 ~~~~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~  348 (743)
                      |++|||   ||++|.+|++|||++..+... +..|+.|+||++.+..+..++. .+++.+++.|+.+.
T Consensus       189 Aain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~  256 (347)
T COG0265         189 AAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGE  256 (347)
T ss_pred             cccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEE
Confidence            999997   999999999999999999988 5678999999999999999988 36788988887754


No 13 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.55  E-value=6.3e-14  Score=127.59  Aligned_cols=24  Identities=46%  Similarity=0.904  Sum_probs=22.4

Q ss_pred             EccccCCCCcccceecCCceEEEE
Q 004596          600 TTAAVHPGGSGGAVVNLDGHMIGL  623 (743)
Q Consensus       600 TdAai~~GnSGGPL~d~~G~VIGI  623 (743)
                      +++.+.+|+|||||||.+|+||||
T Consensus        97 ~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   97 TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             eecccCCCcEeHhEECCCCEEEeC
Confidence            899999999999999999999998


No 14 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.47  E-value=3.7e-13  Score=150.52  Aligned_cols=201  Identities=21%  Similarity=0.290  Sum_probs=142.0

Q ss_pred             HHHhccCceEEEEECC--------------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCC
Q 004596          386 PIQKALASVCLITIDD--------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSG  451 (743)
Q Consensus       386 ~i~~a~~SVV~I~~~~--------------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~  451 (743)
                      +.++...++|.|+..+              ...||||+++.+|+++||+||+.-.        -..|   .         
T Consensus       133 ~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~--------~~~y---~---------  192 (473)
T KOG1320|consen  133 VFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVE--------PRIY---A---------  192 (473)
T ss_pred             hhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEE--------eccc---c---------
Confidence            3556778888888621              1249999999999999999999511        0000   0         


Q ss_pred             CCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCC--CCceeEeeEEEEecCCCCcEEEEEEccC
Q 004596          452 HTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHL--DPWIWCDAKIVYVCKGPLDVSLLQLGYI  529 (743)
Q Consensus       452 ~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~--~~~~W~~A~VV~v~~~~~DLALLkLe~~  529 (743)
                            +.                     .       ..--.|.++...+  +.   +.+.+++ -+...|+|+++++..
T Consensus       193 ------~~---------------------~-------~~l~~vqi~aa~~~~~s---~ep~i~g-~d~~~gvA~l~ik~~  234 (473)
T KOG1320|consen  193 ------HS---------------------S-------TVLLRVQIDAAIGPGNS---GEPVIVG-VDKVAGVAFLKIKTP  234 (473)
T ss_pred             ------CC---------------------C-------cceeeEEEEEeecCCcc---CCCeEEc-cccccceEEEEEecC
Confidence                  00                     0       0011255555544  44   5677776 356799999999752


Q ss_pred             CCCccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCC
Q 004596          530 PDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGG  608 (743)
Q Consensus       530 p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~Gn  608 (743)
                      ..-+++++++.. .+..|+++..+|.|     +++..+++.|+++...|.......    ........++|||++++.|+
T Consensus       235 ~~i~~~i~~~~~~~~~~G~~~~a~~~~-----f~~~nt~t~g~vs~~~R~~~~lg~----~~g~~i~~~~qtd~ai~~~n  305 (473)
T KOG1320|consen  235 ENILYVIPLGVSSHFRTGVEVSAIGNG-----FGLLNTLTQGMVSGQLRKSFKLGL----ETGVLISKINQTDAAINPGN  305 (473)
T ss_pred             CcccceeecceeeeecccceeeccccC-----ceeeeeeeecccccccccccccCc----ccceeeeeecccchhhhccc
Confidence            233788888775 68999999999985     677789999999987764321110    11123456899999999999


Q ss_pred             cccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHH
Q 004596          609 SGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFA  655 (743)
Q Consensus       609 SGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~  655 (743)
                      ||||++|.+|++||+++.+....+-..  +++|++|.+.+..++.+.
T Consensus       306 sg~~ll~~DG~~IgVn~~~~~ri~~~~--~iSf~~p~d~vl~~v~r~  350 (473)
T KOG1320|consen  306 SGGPLLNLDGEVIGVNTRKVTRIGFSH--GISFKIPIDTVLVIVLRL  350 (473)
T ss_pred             CCCcEEEecCcEeeeeeeeeEEeeccc--cceeccCchHhhhhhhhh
Confidence            999999999999999998865422222  689999999999877665


No 15 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.25  E-value=2.8e-10  Score=113.43  Aligned_cols=124  Identities=22%  Similarity=0.314  Sum_probs=77.2

Q ss_pred             CCcEEEEEEccC---CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCC
Q 004596          518 PLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS  592 (743)
Q Consensus       518 ~~DLALLkLe~~---p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~  592 (743)
                      .+|||||+|+..   .+.+.|+.+...  .+..|+.+.++||+.-.. .+....+....+.-+...    .|... ....
T Consensus        86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~-~~~~~~~~~~~~~~~~~~----~c~~~-~~~~  159 (220)
T PF00089_consen   86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD-NGYSSNLQSVTVPVVSRK----TCRSS-YNDN  159 (220)
T ss_dssp             TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST-TSBTSBEEEEEEEEEEHH----HHHHH-TTTT
T ss_pred             ccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccccc----ccccc-cccc
Confidence            589999999973   356778888773  468999999999985221 111123333333221110    01110 0011


Q ss_pred             CcCeEEEEcc----ccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHH
Q 004596          593 AYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRP  650 (743)
Q Consensus       593 ~~~~~IqTdA----ai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~  650 (743)
                      ....++++..    ..+.|+|||||++.++.++||++.. ...+...  ...+.+++....+
T Consensus       160 ~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~--~~~v~~~v~~~~~  218 (220)
T PF00089_consen  160 LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPN--YPGVYTRVSSYLD  218 (220)
T ss_dssp             STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTT--SEEEEEEGGGGHH
T ss_pred             cccccccccccccccccccccccccccceeeecceeeec-CCCCCCC--cCEEEEEHHHhhc
Confidence            3456788776    7889999999998777899999988 3232222  2467777776554


No 16 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.18  E-value=5.8e-10  Score=111.82  Aligned_cols=109  Identities=23%  Similarity=0.235  Sum_probs=63.9

Q ss_pred             CCCcEEEEEEccC---CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCcccccc-
Q 004596          517 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQR-  590 (743)
Q Consensus       517 ~~~DLALLkLe~~---p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~-  590 (743)
                      ..+|||||+|+..   ...+.|+.+...  .+..|+.++++||+................+.-+.    ...|...... 
T Consensus        87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~----~~~C~~~~~~~  162 (232)
T cd00190          87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVS----NAECKRAYSYG  162 (232)
T ss_pred             CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeEC----HHHhhhhccCc
Confidence            3589999999962   234788888776  67889999999998643211111112222221111    1111111110 


Q ss_pred             CCCcCeEEEEc-----cccCCCCcccceecCC---ceEEEEEeeeec
Q 004596          591 NSAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTSNAR  629 (743)
Q Consensus       591 ~~~~~~~IqTd-----Aai~~GnSGGPL~d~~---G~VIGIvtsna~  629 (743)
                      ......+++..     ...+.|+|||||+...   +.++||++....
T Consensus       163 ~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~  209 (232)
T cd00190         163 GTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG  209 (232)
T ss_pred             ccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc
Confidence            11223455543     3467899999999653   889999988654


No 17 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.02  E-value=2e-08  Score=101.29  Aligned_cols=108  Identities=24%  Similarity=0.291  Sum_probs=61.5

Q ss_pred             CCCcEEEEEEccC---CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCC-CCCeeEeEEEeeeeeccCCCCCcccccc
Q 004596          517 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCG-LSPSVSSGVVAKVVKANLPSYGQSTLQR  590 (743)
Q Consensus       517 ~~~DLALLkLe~~---p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g-~~psvt~GiIS~vv~~~~~~~~~~~~~~  590 (743)
                      ..+|||||+|+..   .+.+.|+.+...  .+..++.+.+.||+......+ .........+.-+..    ..|......
T Consensus        87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~----~~C~~~~~~  162 (229)
T smart00020       87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSN----ATCRRAYSG  162 (229)
T ss_pred             CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCH----HHhhhhhcc
Confidence            4689999999872   345678877664  578899999999985332001 111122121111110    001110000


Q ss_pred             -CCCcCeEEEE-----ccccCCCCcccceecCCc--eEEEEEeeee
Q 004596          591 -NSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTSNA  628 (743)
Q Consensus       591 -~~~~~~~IqT-----dAai~~GnSGGPL~d~~G--~VIGIvtsna  628 (743)
                       ......+++.     +...++|+|||||+...+  .++||++...
T Consensus       163 ~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~  208 (229)
T smart00020      163 GGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS  208 (229)
T ss_pred             ccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence             0112223333     355788999999996443  8999999876


No 18 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=98.92  E-value=5.8e-09  Score=118.72  Aligned_cols=191  Identities=20%  Similarity=0.331  Sum_probs=131.4

Q ss_pred             HHHhccCceEEEEEC----------CCeeEEEEEEeC-CCeEEecccccccccCcceeccCCccccccCCCCCCCCCCCc
Q 004596          386 PIQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG  454 (743)
Q Consensus       386 ~i~~a~~SVV~I~~~----------~~~wGSGfvV~~-~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~  454 (743)
                      .+..+-++||.|...          ..+-|+||+|++ .||||||+||+.|..+...        ++|..+++       
T Consensus        57 ~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~--------avf~n~ee-------  121 (955)
T KOG1421|consen   57 TIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVAS--------AVFDNHEE-------  121 (955)
T ss_pred             hhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCceeE--------EEeccccc-------
Confidence            467788999999863          234699999997 7899999999986432211        12221111       


Q ss_pred             ccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCC---C
Q 004596          455 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIP---D  531 (743)
Q Consensus       455 ~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p---~  531 (743)
                                                                          ++.-.+| .|+-+|+.+++.+...   .
T Consensus       122 ----------------------------------------------------~ei~pvy-rDpVhdfGf~r~dps~ir~s  148 (955)
T KOG1421|consen  122 ----------------------------------------------------IEIYPVY-RDPVHDFGFFRYDPSTIRFS  148 (955)
T ss_pred             ----------------------------------------------------CCccccc-CCchhhcceeecChhhccee
Confidence                                                                1122233 5667899999988521   2


Q ss_pred             CccceeCCCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCccc
Q 004596          532 QLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG  611 (743)
Q Consensus       532 ~l~pi~l~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGG  611 (743)
                      .+..+.+...-.++|.+++++|+     -.|-..++-.|.++.+.+. .|-|...++..+..  .++|.-+...+|.||.
T Consensus       149 ~vt~i~lap~~akvgseirvvgN-----DagEklsIlagflSrldr~-apdyg~~~yndfnT--fy~Qaasstsggssgs  220 (955)
T KOG1421|consen  149 IVTEICLAPELAKVGSEIRVVGN-----DAGEKLSILAGFLSRLDRN-APDYGEDTYNDFNT--FYIQAASSTSGGSSGS  220 (955)
T ss_pred             eeeccccCccccccCCceEEecC-----CccceEEeehhhhhhccCC-Cccccccccccccc--eeeeehhcCCCCCCCC
Confidence            23444444455589999999998     4566678888999988774 34444333433333  3678878889999999


Q ss_pred             ceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhc
Q 004596          612 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDM  658 (743)
Q Consensus       612 PL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~  658 (743)
                      ||+|-+|..|.++.....      ...-.|.+|++-+.+.+..++++
T Consensus       221 pVv~i~gyAVAl~agg~~------ssas~ffLpLdrV~RaL~clq~n  261 (955)
T KOG1421|consen  221 PVVDIPGYAVALNAGGSI------SSASDFFLPLDRVVRALRCLQNN  261 (955)
T ss_pred             ceecccceEEeeecCCcc------cccccceeeccchhhhhhhhhcC
Confidence            999999999999855432      22567999999999988888753


No 19 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.81  E-value=7.2e-09  Score=116.51  Aligned_cols=125  Identities=21%  Similarity=0.319  Sum_probs=102.7

Q ss_pred             ccccccccc-cccccEEEEEeccCCCCCCceec--CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC----
Q 004596          203 ESSNLSLMS-KSTSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR----  275 (743)
Q Consensus       203 ~~~~~~~~~-~~~t~~A~l~i~~~~~~~~~~~~--~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~----  275 (743)
                      -+.+|.+++ ....|+|+|||+....-++.+..  +..++.|+|+.++++||+      +.|++++|+|+...|..    
T Consensus       211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~------~~nt~t~g~vs~~~R~~~~lg  284 (473)
T KOG1320|consen  211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFG------LLNTLTQGMVSGQLRKSFKLG  284 (473)
T ss_pred             ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCce------eeeeeeecccccccccccccC
Confidence            355677888 67789999999743333666654  667999999999999999      79999999999776652    


Q ss_pred             ----CCCCceEEeeccccCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhh
Q 004596          276 ----STTRSLLMADIRCLPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDL  333 (743)
Q Consensus       276 ----~~~~~~i~tD~~~~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l  333 (743)
                          .....++|||+++++|   ||+++.+|+.||++++...+. -+.+++|++|.+.+...+...
T Consensus       285 ~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~  350 (473)
T KOG1320|consen  285 LETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRL  350 (473)
T ss_pred             cccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhh
Confidence                2356689999999997   999999999999999999887 567999999999998666544


No 20 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.52  E-value=1.6e-06  Score=90.89  Aligned_cols=73  Identities=26%  Similarity=0.280  Sum_probs=51.4

Q ss_pred             CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCcccceecCCceEE
Q 004596          542 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI  621 (743)
Q Consensus       542 ~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGGPL~d~~G~VI  621 (743)
                      ..+.++.+.++|||.-.+..+ ..-...+.+..+                  ....++.+|.+.+|+||.||++.+.++|
T Consensus       157 ~~~~~d~i~v~GYP~dk~~~~-~~~e~t~~v~~~------------------~~~~l~y~~dT~pG~SGSpv~~~~~~vi  217 (251)
T COG3591         157 EAKANDRITVIGYPGDKPNIG-TMWESTGKVNSI------------------KGNKLFYDADTLPGSSGSPVLISKDEVI  217 (251)
T ss_pred             ccccCceeEEEeccCCCCcce-eEeeecceeEEE------------------ecceEEEEecccCCCCCCceEecCceEE
Confidence            468999999999985333222 112233333322                  1236889999999999999999989999


Q ss_pred             EEEeeeecCCCC
Q 004596          622 GLVTSNARHGGG  633 (743)
Q Consensus       622 GIvtsna~~~gg  633 (743)
                      |+.+.+....++
T Consensus       218 gv~~~g~~~~~~  229 (251)
T COG3591         218 GVHYNGPGANGG  229 (251)
T ss_pred             EEEecCCCcccc
Confidence            999998765444


No 21 
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.08  E-value=0.00029  Score=73.13  Aligned_cols=117  Identities=22%  Similarity=0.235  Sum_probs=66.6

Q ss_pred             CcEEEEEEcc---CCCCccceeCCCCC----CCCCCeEEEEecCCCCCC-CCCCCeeEeEEEeeeeeccCCCCCcccccc
Q 004596          519 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPR-CGLSPSVSSGVVAKVVKANLPSYGQSTLQR  590 (743)
Q Consensus       519 ~DLALLkLe~---~p~~l~pi~l~~s~----~~~G~~V~vIGyPlfg~~-~g~~psvt~GiIS~vv~~~~~~~~~~~~~~  590 (743)
                      +|||||+++.   ..+.+.|+.+....    ...++.+++.|||..... ..........    .........|......
T Consensus       106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~----~v~i~~~~~C~~~~~~  181 (256)
T KOG3627|consen  106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEV----DVPIISNSECRRAYGG  181 (256)
T ss_pred             CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEE----EEeEcChhHhcccccC
Confidence            8999999996   34567777775332    345589999999753221 0111122211    1111111223322221


Q ss_pred             C-CCcCeEEEEcc-----ccCCCCcccceecCC---ceEEEEEeeeecCCCCcccCce
Q 004596          591 N-SAYPVMLETTA-----AVHPGGSGGAVVNLD---GHMIGLVTSNARHGGGTVIPHL  639 (743)
Q Consensus       591 ~-~~~~~~IqTdA-----ai~~GnSGGPL~d~~---G~VIGIvtsna~~~gg~~~p~l  639 (743)
                      . .....++++..     ..+.|+|||||+-.+   ..++||+++.....+....|+.
T Consensus       182 ~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~v  239 (256)
T KOG3627|consen  182 LGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGV  239 (256)
T ss_pred             ccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeE
Confidence            1 11234677653     357899999999553   6999999998764444345555


No 22 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.88  E-value=0.00031  Score=73.45  Aligned_cols=103  Identities=17%  Similarity=0.331  Sum_probs=50.0

Q ss_pred             CCCcEEEEEEccCCCCccceeC--CCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCc
Q 004596          517 GPLDVSLLQLGYIPDQLCPIDA--DFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAY  594 (743)
Q Consensus       517 ~~~DLALLkLe~~p~~l~pi~l--~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~  594 (743)
                      +..||.++|+..   +++|.+-  ....++.|+.|+++|.=. . ..+....+++-  +.+.+              ...
T Consensus        80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~f-q-~k~~~s~vSes--S~i~p--------------~~~  138 (235)
T PF00863_consen   80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNF-Q-EKSISSTVSES--SWIYP--------------EEN  138 (235)
T ss_dssp             TCSSEEEEE--T---TS----S---B----TT-EEEEEEEEC-S-SCCCEEEEEEE--EEEEE--------------ETT
T ss_pred             CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEE-E-cCCeeEEECCc--eEEee--------------cCC
Confidence            359999999986   6666543  345789999999999721 1 11211222221  22222              122


Q ss_pred             CeEEEEccccCCCCcccceecC-CceEEEEEeeeecCCCCcccCceeEEEehh
Q 004596          595 PVMLETTAAVHPGGSGGAVVNL-DGHMIGLVTSNARHGGGTVIPHLNFSIPCA  646 (743)
Q Consensus       595 ~~~IqTdAai~~GnSGGPL~d~-~G~VIGIvtsna~~~gg~~~p~lnFaIPi~  646 (743)
                      ..+..+-.+...|+-|.||++. +|++|||++......      ..||..|+.
T Consensus       139 ~~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~------~~N~F~~f~  185 (235)
T PF00863_consen  139 SHFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTS------SRNYFTPFP  185 (235)
T ss_dssp             TTEEEE-C---TT-TT-EEEETTT--EEEEEEEEETTT------SSEEEEE--
T ss_pred             CCeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCCC------CeEEEEcCC
Confidence            3456666677899999999975 699999999775433      367777754


No 23 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.23  E-value=0.0023  Score=69.91  Aligned_cols=38  Identities=32%  Similarity=0.587  Sum_probs=30.2

Q ss_pred             cccCCCCcccceecC--CceE-EEEEeeeecCCCCcccCce
Q 004596          602 AAVHPGGSGGAVVNL--DGHM-IGLVTSNARHGGGTVIPHL  639 (743)
Q Consensus       602 Aai~~GnSGGPL~d~--~G~V-IGIvtsna~~~gg~~~p~l  639 (743)
                      ...|.|+||||+|-.  +|++ +||+++....+++..+|++
T Consensus       223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gV  263 (413)
T COG5640         223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGV  263 (413)
T ss_pred             cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCccee
Confidence            467889999999943  4765 9999999888877776663


No 24 
>PF03761 DUF316:  Domain of unknown function (DUF316) ;  InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.99  E-value=0.042  Score=58.54  Aligned_cols=92  Identities=17%  Similarity=0.133  Sum_probs=55.5

Q ss_pred             CCCcEEEEEEccC-CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCC
Q 004596          517 GPLDVSLLQLGYI-PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA  593 (743)
Q Consensus       517 ~~~DLALLkLe~~-p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~  593 (743)
                      ...+++||+++.. .....|+=++++  ....|+.+.+.|+.    ..   ..+....+.-...              ..
T Consensus       159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~----~~---~~~~~~~~~i~~~--------------~~  217 (282)
T PF03761_consen  159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN----ST---GKLKHRKLKITNC--------------TK  217 (282)
T ss_pred             cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC----CC---CeEEEEEEEEEEe--------------ec
Confidence            5789999999973 234444444443  46789999998871    11   1222222221111              01


Q ss_pred             cCeEEEEccccCCCCccccee---cCCceEEEEEeeeec
Q 004596          594 YPVMLETTAAVHPGGSGGAVV---NLDGHMIGLVTSNAR  629 (743)
Q Consensus       594 ~~~~IqTdAai~~GnSGGPL~---d~~G~VIGIvtsna~  629 (743)
                      ....+.++...+.|++|||++   |.+-.||||.+.+..
T Consensus       218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~  256 (282)
T PF03761_consen  218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNY  256 (282)
T ss_pred             cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCC
Confidence            233455666677899999999   333468999877643


No 25 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=96.66  E-value=0.0046  Score=65.22  Aligned_cols=77  Identities=26%  Similarity=0.336  Sum_probs=41.1

Q ss_pred             CcEEEEEEccCCCCccceeCCCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEE
Q 004596          519 LDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVML  598 (743)
Q Consensus       519 ~DLALLkLe~~p~~l~pi~l~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~I  598 (743)
                      -|.|.-.++..+...+.+++....  .| +.|-.-          +.-+..|.|...                    ..+
T Consensus       156 GDfA~~~~~~~~G~~P~~k~a~~~--~G-rAyW~t----------~tGvE~G~ig~~--------------------~~~  202 (297)
T PF05579_consen  156 GDFAEADITNWPGAAPKYKFAQNY--TG-RAYWLT----------STGVEPGFIGGG--------------------GAV  202 (297)
T ss_dssp             TTEEEEEETTS-S---B--B-TT---SE-EEEEEE----------TTEEEEEEEETT--------------------EEE
T ss_pred             CcEEEEECCCCCCCCCceeecCCc--cc-ceEEEc----------ccCcccceecCc--------------------eEE
Confidence            688888886656677777766321  12 122111          123455655532                    123


Q ss_pred             EEccccCCCCcccceecCCceEEEEEeeeecCC
Q 004596          599 ETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG  631 (743)
Q Consensus       599 qTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~  631 (743)
                      +.   .++|+||+|++..+|.+|||++..-+.+
T Consensus       203 ~f---T~~GDSGSPVVt~dg~liGVHTGSn~~G  232 (297)
T PF05579_consen  203 CF---TGPGDSGSPVVTEDGDLIGVHTGSNKRG  232 (297)
T ss_dssp             ES---S-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred             EE---cCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence            33   3689999999999999999999875543


No 26 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=96.32  E-value=0.0078  Score=72.14  Aligned_cols=21  Identities=38%  Similarity=0.512  Sum_probs=19.8

Q ss_pred             EEEEEEeCCCeEEeccccccc
Q 004596          405 ASGVLLNDQGLILTNAHLLEP  425 (743)
Q Consensus       405 GSGfvV~~~G~ILTNaHVV~p  425 (743)
                      |||.+|+++|+|+||.||+-.
T Consensus        49 CSgsfVS~~GLvlTNHHC~~~   69 (698)
T PF10459_consen   49 CSGSFVSPDGLVLTNHHCGYG   69 (698)
T ss_pred             eeEEEEcCCceEEecchhhhh
Confidence            999999999999999999953


No 27 
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=94.87  E-value=0.03  Score=50.58  Aligned_cols=21  Identities=48%  Similarity=0.892  Sum_probs=19.1

Q ss_pred             eeccccCC---CceEcCCCcEEEE
Q 004596          284 ADIRCLPG---GPVFGEHAHFVGI  304 (743)
Q Consensus       284 tD~~~~pG---g~vf~~~g~liGi  304 (743)
                      +|+.+.||   |||||.+|++|||
T Consensus        97 ~~~~~~~G~SGgpv~~~~G~vvGi  120 (120)
T PF13365_consen   97 TDADTRPGSSGGPVFDSDGRVVGI  120 (120)
T ss_dssp             ESSS-STTTTTSEEEETTSEEEEE
T ss_pred             eecccCCCcEeHhEECCCCEEEeC
Confidence            89999997   9999999999997


No 28 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=94.63  E-value=0.026  Score=67.74  Aligned_cols=65  Identities=20%  Similarity=0.285  Sum_probs=44.5

Q ss_pred             CCcCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCccc---Cc--eeEEEehhHHHHHHHHHH
Q 004596          592 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVI---PH--LNFSIPCAVLRPIFEFAR  656 (743)
Q Consensus       592 ~~~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~---p~--lnFaIPi~~l~~il~~~~  656 (743)
                      ...+.-+.+|+-+.+||||+|++|.+|+|||++.-..-.+-...+   |.  -+.++=+..+.-+++.+.
T Consensus       618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~  687 (698)
T PF10459_consen  618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY  687 (698)
T ss_pred             CCeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence            446677788999999999999999999999998765433211110   23  344555566666666553


No 29 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=94.06  E-value=0.93  Score=45.51  Aligned_cols=35  Identities=29%  Similarity=0.526  Sum_probs=29.5

Q ss_pred             CcCeEEEEccccCCCCcccceecC---CceEEEEEeee
Q 004596          593 AYPVMLETTAAVHPGGSGGAVVNL---DGHMIGLVTSN  627 (743)
Q Consensus       593 ~~~~~IqTdAai~~GnSGGPL~d~---~G~VIGIvtsn  627 (743)
                      ..+.++.+.++..+|+.||||+..   .++++||+.+.
T Consensus       133 ~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG  170 (172)
T PF00548_consen  133 TTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG  170 (172)
T ss_dssp             EEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred             EeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence            346788899999999999999942   58999999885


No 30 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=92.04  E-value=2.7  Score=49.94  Aligned_cols=152  Identities=14%  Similarity=0.124  Sum_probs=84.3

Q ss_pred             EEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCCccceeCCCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEe
Q 004596          494 IRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVA  573 (743)
Q Consensus       494 I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~l~pi~l~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS  573 (743)
                      ++++.++...   ..|++.+ -++...+|.+|.+.  .....+++.+..+..||++...|+-.-........+++.-.+-
T Consensus       578 ~~vt~~dS~~---i~a~~~f-L~~t~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~  651 (955)
T KOG1421|consen  578 QRVTEADSDG---IPANVSF-LHPTENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVV  651 (955)
T ss_pred             eEEeeccccc---ccceeeE-ecCccceeEeccCh--hHhhhhccceeeEecCCceeEecccccchhhcccceeeeeEEE
Confidence            5566665555   6777776 35567888888874  3445556667778999999999983100000112233332111


Q ss_pred             eeeeccCCCCCccccccCCCcCeEEEEcccc-CCCCcccceecCCceEEEEEeeeecCC-CCcccCceeEEEehhHHHHH
Q 004596          574 KVVKANLPSYGQSTLQRNSAYPVMLETTAAV-HPGGSGGAVVNLDGHMIGLVTSNARHG-GGTVIPHLNFSIPCAVLRPI  651 (743)
Q Consensus       574 ~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai-~~GnSGGPL~d~~G~VIGIvtsna~~~-gg~~~p~lnFaIPi~~l~~i  651 (743)
                      -+-+...|.+.      .... ..|-..+.+ ..++|| -+.|.+|+++|+=-+-.... ++..+ ..-|.+-+.++.+.
T Consensus       652 ~~ps~~~pr~r------~~n~-e~Is~~~nlsT~c~sg-~ltdddg~vvalwl~~~ge~~~~kd~-~y~~gl~~~~~l~v  722 (955)
T KOG1421|consen  652 IIPSSVMPRFR------ATNL-EVISFMDNLSTSCLSG-RLTDDDGEVVALWLSVVGEDVGGKDY-TYKYGLSMSYILPV  722 (955)
T ss_pred             EecCCCCccee------ecce-EEEEEeccccccccce-EEECCCCeEEEEEeeeeccccCCcee-EEEeccchHHHHHH
Confidence            11111111110      0111 123222222 345554 56688999999977766543 23222 34566667889999


Q ss_pred             HHHHHhcCC
Q 004596          652 FEFARDMQE  660 (743)
Q Consensus       652 l~~~~~~~d  660 (743)
                      ++.++....
T Consensus       723 l~rlk~g~~  731 (955)
T KOG1421|consen  723 LERLKLGPS  731 (955)
T ss_pred             HHHHhcCCC
Confidence            999987533


No 31 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=88.82  E-value=0.52  Score=45.43  Aligned_cols=44  Identities=25%  Similarity=0.484  Sum_probs=30.9

Q ss_pred             ccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHH
Q 004596          603 AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR  649 (743)
Q Consensus       603 ai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~  649 (743)
                      +.-.|.|||||+-.+|++|||-.+..-..+-.+  .+-|. |++.+.
T Consensus       104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak--~i~f~-P~e~l~  147 (148)
T PF02907_consen  104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAK--AIDFI-PVETLP  147 (148)
T ss_dssp             HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEE--EEEEE-EHHHHH
T ss_pred             EEEecCCCCcccCCCCCEEEEEEEEEEcCCcee--eEEEE-eeeecC
Confidence            445799999999999999999988754333333  57777 887653


No 32 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=88.22  E-value=3  Score=49.69  Aligned_cols=119  Identities=14%  Similarity=0.206  Sum_probs=69.6

Q ss_pred             CCCCcEEEEEEcc-------CCCCc------cceeCCC-------CCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeee
Q 004596          516 KGPLDVSLLQLGY-------IPDQL------CPIDADF-------GQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKV  575 (743)
Q Consensus       516 ~~~~DLALLkLe~-------~p~~l------~pi~l~~-------s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~v  575 (743)
                      ..-.|+||++++.       +.+++      +.+.+.+       ....+|..|+=+|.     ..|    .|.|.++++
T Consensus       540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~Gr-----TTg----yT~G~lNg~  610 (695)
T PF08192_consen  540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGR-----TTG----YTTGILNGI  610 (695)
T ss_pred             ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecc-----cCC----ccceEecce
Confidence            3446999999986       11222      1222221       23578999999986     333    577888765


Q ss_pred             eeccCCCCCccccccCCCcCeEEEEc----cccCCCCcccceecCCc------eEEEEEeeeecCCCCcccCceeEEEeh
Q 004596          576 VKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDG------HMIGLVTSNARHGGGTVIPHLNFSIPC  645 (743)
Q Consensus       576 v~~~~~~~~~~~~~~~~~~~~~IqTd----Aai~~GnSGGPL~d~~G------~VIGIvtsna~~~gg~~~p~lnFaIPi  645 (743)
                      .-.    +-.   .+......++...    +=..+|+||.=|++.-+      .|+||..+..+.   .+  .++...|+
T Consensus       611 klv----yw~---dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge---~k--qfglftPi  678 (695)
T PF08192_consen  611 KLV----YWA---DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE---QK--QFGLFTPI  678 (695)
T ss_pred             EEE----Eec---CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc---cc--eeeccCcH
Confidence            321    100   0111112233333    23457999999998533      499999886332   22  58888888


Q ss_pred             hHHHHHHHHH
Q 004596          646 AVLRPIFEFA  655 (743)
Q Consensus       646 ~~l~~il~~~  655 (743)
                      ..+..-+++.
T Consensus       679 ~~il~rl~~v  688 (695)
T PF08192_consen  679 NEILDRLEEV  688 (695)
T ss_pred             HHHHHHHHHh
Confidence            8776655554


No 33 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=87.76  E-value=0.46  Score=45.83  Aligned_cols=31  Identities=26%  Similarity=0.521  Sum_probs=22.1

Q ss_pred             ccccCCCCcccceecCCceEEEEEeeeecCC
Q 004596          601 TAAVHPGGSGGAVVNLDGHMIGLVTSNARHG  631 (743)
Q Consensus       601 dAai~~GnSGGPL~d~~G~VIGIvtsna~~~  631 (743)
                      +....+|.||+|+||.+|++|||-.......
T Consensus        91 ~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~  121 (132)
T PF00949_consen   91 DLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG  121 (132)
T ss_dssp             ---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred             ecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence            3457899999999999999999987765543


No 34 
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=86.56  E-value=7  Score=38.63  Aligned_cols=110  Identities=17%  Similarity=0.141  Sum_probs=64.5

Q ss_pred             cccEEEEEeccC---CCCCCceecCC---CCCCCCeEEEEeCCCCCCCc-cccccceEEEEEeee--cC--CCCCCCceE
Q 004596          214 TSRVAILGVSSY---LKDLPNIALTP---LNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLL  282 (743)
Q Consensus       214 ~t~~A~l~i~~~---~~~~~~~~~~~---~~~~G~~v~~igsPFg~~sP-~~f~ns~s~Givs~~--~~--~~~~~~~~i  282 (743)
                      ..||||||++..   .....++.+..   .++.|+.+.++|-+.....- ..-.......+++..  ..  ........+
T Consensus        86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~  165 (220)
T PF00089_consen   86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI  165 (220)
T ss_dssp             TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred             cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence            359999999854   12233444433   35899999999999863221 012223344444432  11  011234566


Q ss_pred             Eeec----cccC---CCceEcCCCcEEEEEecccccc-CCcceEEEeeHH
Q 004596          283 MADI----RCLP---GGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWE  324 (743)
Q Consensus       283 ~tD~----~~~p---Gg~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~  324 (743)
                      .++.    ....   ||||+..++.||||++.. ..+ ......+.+++.
T Consensus       166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~  214 (220)
T PF00089_consen  166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVS  214 (220)
T ss_dssp             EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGG
T ss_pred             cccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHH
Confidence            6665    4444   599999988999999987 333 332345555544


No 35 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=84.86  E-value=1.1  Score=43.26  Aligned_cols=35  Identities=26%  Similarity=0.489  Sum_probs=28.1

Q ss_pred             EEEEccccCCCCcccceecCCceEEEEEeeeecCC
Q 004596          597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG  631 (743)
Q Consensus       597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~  631 (743)
                      +..-+..-.+|+||-|++|-.|+||||+-..+...
T Consensus        96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG  130 (158)
T PF00944_consen   96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEG  130 (158)
T ss_dssp             EEEETTS-STTSTTEEEESTTSBEEEEEEEEEEET
T ss_pred             EEeccCCCCCCCCCCccCcCCCCEEEEEecCCCCC
Confidence            34456778899999999998999999998877643


No 36 
>PF05580 Peptidase_S55:  SpoIVB peptidase S55;  InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=80.85  E-value=1.4  Score=45.88  Aligned_cols=45  Identities=31%  Similarity=0.477  Sum_probs=34.2

Q ss_pred             EEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhH
Q 004596          597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAV  647 (743)
Q Consensus       597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~  647 (743)
                      .+.-+..+..|+||+|++ .+|++||=++-..-++     |..+|.||++.
T Consensus       170 Ll~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~d-----p~~Gygi~ie~  214 (218)
T PF05580_consen  170 LLEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVND-----PTKGYGIFIEW  214 (218)
T ss_pred             hhhhhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCceeeecHHH
Confidence            344445677899999999 5999999999886433     46789998754


No 37 
>PF09342 DUF1986:  Domain of unknown function (DUF1986);  InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=79.91  E-value=15  Score=39.14  Aligned_cols=33  Identities=30%  Similarity=0.506  Sum_probs=28.2

Q ss_pred             CceEEEEECCCeeEEEEEEeCCCeEEeccccccc
Q 004596          392 ASVCLITIDDGVWASGVLLNDQGLILTNAHLLEP  425 (743)
Q Consensus       392 ~SVV~I~~~~~~wGSGfvV~~~G~ILTNaHVV~p  425 (743)
                      |..+-|.+++.-|+||++|+++ |||++..|+..
T Consensus        17 PWlA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~   49 (267)
T PF09342_consen   17 PWLADIYVDGRYWCSGVLLDPH-WLLVSSSCLRG   49 (267)
T ss_pred             cceeeEEEcCeEEEEEEEeccc-eEEEeccccCC
Confidence            5566777777789999999997 99999999974


No 38 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=74.97  E-value=3.8  Score=39.30  Aligned_cols=31  Identities=29%  Similarity=0.508  Sum_probs=23.3

Q ss_pred             eEEEEccccCCCCcccceecCCceEEEEEeee
Q 004596          596 VMLETTAAVHPGGSGGAVVNLDGHMIGLVTSN  627 (743)
Q Consensus       596 ~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsn  627 (743)
                      .++....++.||+.||+|+ .+--||||+|+.
T Consensus        79 ~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tag  109 (127)
T PF00947_consen   79 NLLIGEGPAEPGDCGGILR-CKHGVIGIVTAG  109 (127)
T ss_dssp             CEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEE
T ss_pred             CceeecccCCCCCCCceeE-eCCCeEEEEEeC
Confidence            3444556899999999999 456699999996


No 39 
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=54.32  E-value=4.5  Score=40.00  Aligned_cols=42  Identities=29%  Similarity=0.256  Sum_probs=30.9

Q ss_pred             hhhhccccccccccCcee---eeeeeeecccccC--ChhhhhhccCC
Q 004596           26 GLKMRRHAFHQYNSGKTT---LSASGMLLPLSFF--DTKVAERNWGV   67 (743)
Q Consensus        26 ~~k~~~~~f~~~~~g~~t---~sas~~~~p~~~~--~~~~~~~~~~~   67 (743)
                      ||.-++|+||.|+.|.+|   .||-...=|.+..  .+.+.+|.+++
T Consensus        38 GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd   84 (154)
T KOG0441|consen   38 GLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD   84 (154)
T ss_pred             cCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence            444499999999999998   6776666566555  46666777766


No 40 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=51.84  E-value=9.3  Score=42.81  Aligned_cols=24  Identities=25%  Similarity=0.515  Sum_probs=21.2

Q ss_pred             cccCCCCcccceecCCceEEEEEe
Q 004596          602 AAVHPGGSGGAVVNLDGHMIGLVT  625 (743)
Q Consensus       602 Aai~~GnSGGPL~d~~G~VIGIvt  625 (743)
                      ....+|+||+.|+|.+|++|||..
T Consensus       350 ~~l~gGaSGS~V~n~~~~lvGIy~  373 (374)
T PF01732_consen  350 YSLGGGASGSMVINQNNELVGIYF  373 (374)
T ss_pred             cCCCCCCCcCeEECCCCCEEEEeC
Confidence            366789999999999999999964


No 41 
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=50.55  E-value=76  Score=38.36  Aligned_cols=109  Identities=17%  Similarity=0.186  Sum_probs=63.0

Q ss_pred             ccccccccEEEEEeccCC-------------CCCCceecC--------CCCCCCCeEEEEeCCCCCCCccccccceEEEE
Q 004596          209 LMSKSTSRVAILGVSSYL-------------KDLPNIALT--------PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGS  267 (743)
Q Consensus       209 ~~~~~~t~~A~l~i~~~~-------------~~~~~~~~~--------~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Gi  267 (743)
                      ++.+.++|+||+||+.+.             ..-|.+.+.        ..+..|.+|+-+|.==|+          |.|+
T Consensus       537 ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G~  606 (695)
T PF08192_consen  537 IINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTGI  606 (695)
T ss_pred             hhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cceE
Confidence            445778899999998432             112333332        247789999999998885          6777


Q ss_pred             Eeeec----CCCC-CCCceEEee----ccccCC---CceEcCCC------cEEEEEeccccccCCcceEEEeeHHHHHHH
Q 004596          268 VANCY----PPRS-TTRSLLMAD----IRCLPG---GPVFGEHA------HFVGILIRPLRQKSGAEIQLVIPWEAIATA  329 (743)
Q Consensus       268 vs~~~----~~~~-~~~~~i~tD----~~~~pG---g~vf~~~g------~liGiv~~~l~~~g~~~l~~~ip~~~i~~~  329 (743)
                      |...-    .++. ....|+++-    +=..+|   .-|+++-+      .|+||+-+.=+  ....+++..||..|++=
T Consensus       607 lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~r  684 (695)
T PF08192_consen  607 LNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILDR  684 (695)
T ss_pred             ecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHHH
Confidence            76431    1111 011223332    112224   44556522      38888874332  22377889999999853


No 42 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=49.77  E-value=48  Score=35.56  Aligned_cols=71  Identities=23%  Similarity=0.218  Sum_probs=55.6

Q ss_pred             cCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCCCCCCceEEeeccccCC---CceEcCCCcEEEEEecccc
Q 004596          234 LTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPG---GPVFGEHAHFVGILIRPLR  310 (743)
Q Consensus       234 ~~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~~~~~~~i~tD~~~~pG---g~vf~~~g~liGiv~~~l~  310 (743)
                      .....+.+|.|.++|.|=.-  |..+....+.|.|-.....      +++-|+...||   .||++.+.++||+.+....
T Consensus       154 ~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~~------~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~  225 (251)
T COG3591         154 TASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKGN------KLFYDADTLPGSSGSPVLISKDEVIGVHYNGPG  225 (251)
T ss_pred             cccccccCceeEEEeccCCC--CcceeEeeecceeEEEecc------eEEEEecccCCCCCCceEecCceEEEEEecCCC
Confidence            34569999999999999884  4345666677766555432      67788888897   9999999999999998887


Q ss_pred             cc
Q 004596          311 QK  312 (743)
Q Consensus       311 ~~  312 (743)
                      ..
T Consensus       226 ~~  227 (251)
T COG3591         226 AN  227 (251)
T ss_pred             cc
Confidence            55


No 43 
>PF02907 Peptidase_S29:  Hepatitis C virus NS3 protease;  InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=49.47  E-value=12  Score=36.52  Aligned_cols=42  Identities=24%  Similarity=0.539  Sum_probs=31.7

Q ss_pred             eeccccCCCceEcCCCcEEEEEeccccccCC-cceEEEeeHHHH
Q 004596          284 ADIRCLPGGPVFGEHAHFVGILIRPLRQKSG-AEIQLVIPWEAI  326 (743)
Q Consensus       284 tD~~~~pGg~vf~~~g~liGiv~~~l~~~g~-~~l~~~ip~~~i  326 (743)
                      +|.+=-.||||+-..|++|||..+-++.+|- -.+.|+ ||+.+
T Consensus       104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l  146 (148)
T PF02907_consen  104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL  146 (148)
T ss_dssp             HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred             EEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence            3344444699999999999999999998844 378888 99865


No 44 
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=47.94  E-value=15  Score=41.91  Aligned_cols=45  Identities=27%  Similarity=0.448  Sum_probs=32.9

Q ss_pred             EEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhH
Q 004596          597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAV  647 (743)
Q Consensus       597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~  647 (743)
                      .+.-+..+..|+||+|++ .+|++||=+|=-.-++     |..+|.|-++.
T Consensus       350 ll~~tgGivqGMSGSPi~-q~gkliGAvtHVfvnd-----pt~GYGi~ie~  394 (402)
T TIGR02860       350 LLEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVND-----PTSGYGVYIEW  394 (402)
T ss_pred             HhhHhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCcceeehHHH
Confidence            333345677899999999 5999999988776543     34678885544


No 45 
>PF03510 Peptidase_C24:  2C endopeptidase (C24) cysteine protease family;  InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=40.59  E-value=84  Score=29.43  Aligned_cols=17  Identities=24%  Similarity=0.425  Sum_probs=14.3

Q ss_pred             EEEEeCCCeEEecccccc
Q 004596          407 GVLLNDQGLILTNAHLLE  424 (743)
Q Consensus       407 GfvV~~~G~ILTNaHVV~  424 (743)
                      ++-|.. |..+|+.||++
T Consensus         3 avHIGn-G~~vt~tHva~   19 (105)
T PF03510_consen    3 AVHIGN-GRYVTVTHVAK   19 (105)
T ss_pred             eEEeCC-CEEEEEEEEec
Confidence            566775 89999999997


No 46 
>PF05416 Peptidase_C37:  Southampton virus-type processing peptidase;  InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=40.05  E-value=89  Score=35.91  Aligned_cols=37  Identities=32%  Similarity=0.372  Sum_probs=24.5

Q ss_pred             cCeEEEEcc-------ccCCCCcccceecCCc---eEEEEEeeeecC
Q 004596          594 YPVMLETTA-------AVHPGGSGGAVVNLDG---HMIGLVTSNARH  630 (743)
Q Consensus       594 ~~~~IqTdA-------ai~~GnSGGPL~d~~G---~VIGIvtsna~~  630 (743)
                      ...||.|.+       ...||+.|-|-+-..|   -|+|++++.++.
T Consensus       483 Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr~  529 (535)
T PF05416_consen  483 QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATRS  529 (535)
T ss_dssp             EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-SS
T ss_pred             eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhccC
Confidence            345666543       4568999999996655   589999998874


No 47 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ].  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=36.53  E-value=85  Score=33.41  Aligned_cols=81  Identities=25%  Similarity=0.357  Sum_probs=41.9

Q ss_pred             cccEEEEEeccCCCCCCceec---CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEe---eecCCCCCCCceEEeecc
Q 004596          214 TSRVAILGVSSYLKDLPNIAL---TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVA---NCYPPRSTTRSLLMADIR  287 (743)
Q Consensus       214 ~t~~A~l~i~~~~~~~~~~~~---~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs---~~~~~~~~~~~~i~tD~~  287 (743)
                      -.|+.++|..   +++|+.+-   -..++.||.|..||+=|--       +++ +-.||   ...+  ..+..|+---+.
T Consensus        81 ~~DiviirmP---kDfpPf~~kl~FR~P~~~e~v~mVg~~fq~-------k~~-~s~vSesS~i~p--~~~~~fWkHwIs  147 (235)
T PF00863_consen   81 GRDIVIIRMP---KDFPPFPQKLKFRAPKEGERVCMVGSNFQE-------KSI-SSTVSESSWIYP--EENSHFWKHWIS  147 (235)
T ss_dssp             CSSEEEEE-----TTS----S---B----TT-EEEEEEEECSS-------CCC-EEEEEEEEEEEE--ETTTTEEEE-C-
T ss_pred             CccEEEEeCC---cccCCcchhhhccCCCCCCEEEEEEEEEEc-------CCe-eEEECCceEEee--cCCCCeeEEEec
Confidence            3599999987   45666554   2369999999999997762       222 22233   2232  123468888888


Q ss_pred             ccCC---CceEc-CCCcEEEEEec
Q 004596          288 CLPG---GPVFG-EHAHFVGILIR  307 (743)
Q Consensus       288 ~~pG---g~vf~-~~g~liGiv~~  307 (743)
                      -.+|   .|+++ ++|.+|||-..
T Consensus       148 Tk~G~CG~PlVs~~Dg~IVGiHsl  171 (235)
T PF00863_consen  148 TKDGDCGLPLVSTKDGKIVGIHSL  171 (235)
T ss_dssp             --TT-TT-EEEETTT--EEEEEEE
T ss_pred             CCCCccCCcEEEcCCCcEEEEEcC
Confidence            8887   89997 68899999883


No 48 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=32.68  E-value=1.3e+02  Score=29.62  Aligned_cols=39  Identities=15%  Similarity=0.133  Sum_probs=24.7

Q ss_pred             cccEEEEEeccCCC---CCCceecCC---CCCCCCeEEEEeCCCC
Q 004596          214 TSRVAILGVSSYLK---DLPNIALTP---LNKRGDLLLAVGSPFG  252 (743)
Q Consensus       214 ~t~~A~l~i~~~~~---~~~~~~~~~---~~~~G~~v~~igsPFg  252 (743)
                      ..|+||||++....   ...++.+..   ....|+.+.+.|....
T Consensus        88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~  132 (232)
T cd00190          88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRT  132 (232)
T ss_pred             cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcC
Confidence            35999999984321   123333322   4778899999986543


No 49 
>PF00571 CBS:  CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.;  InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].  
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=25.70  E-value=56  Score=25.45  Aligned_cols=21  Identities=33%  Similarity=0.574  Sum_probs=18.3

Q ss_pred             CCCcccceecCCceEEEEEee
Q 004596          606 PGGSGGAVVNLDGHMIGLVTS  626 (743)
Q Consensus       606 ~GnSGGPL~d~~G~VIGIvts  626 (743)
                      .+-+.-||+|.+|+++|+++.
T Consensus        28 ~~~~~~~V~d~~~~~~G~is~   48 (57)
T PF00571_consen   28 NGISRLPVVDEDGKLVGIISR   48 (57)
T ss_dssp             HTSSEEEEESTTSBEEEEEEH
T ss_pred             cCCcEEEEEecCCEEEEEEEH
Confidence            477888999999999999975


No 50 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=25.10  E-value=4.1e+02  Score=26.27  Aligned_cols=39  Identities=18%  Similarity=0.163  Sum_probs=24.2

Q ss_pred             cccEEEEEeccCC---CCCCceecCC---CCCCCCeEEEEeCCCC
Q 004596          214 TSRVAILGVSSYL---KDLPNIALTP---LNKRGDLLLAVGSPFG  252 (743)
Q Consensus       214 ~t~~A~l~i~~~~---~~~~~~~~~~---~~~~G~~v~~igsPFg  252 (743)
                      ..|+||||++...   ....++.+..   .+..|+.+.+.|-.-.
T Consensus        88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~  132 (229)
T smart00020       88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRT  132 (229)
T ss_pred             cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCC
Confidence            4699999998431   1233333422   4777888888885443


No 51 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=24.37  E-value=69  Score=31.12  Aligned_cols=32  Identities=19%  Similarity=0.401  Sum_probs=21.0

Q ss_pred             ceEEeeccccCC---CceEcCCCcEEEEEeccccc
Q 004596          280 SLLMADIRCLPG---GPVFGEHAHFVGILIRPLRQ  311 (743)
Q Consensus       280 ~~i~tD~~~~pG---g~vf~~~g~liGiv~~~l~~  311 (743)
                      .+.+.|..+-+|   .|+||.+|++|||--.-+.-
T Consensus        86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~  120 (132)
T PF00949_consen   86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV  120 (132)
T ss_dssp             EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred             eEEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence            456677777776   99999999999997766543


No 52 
>PF13267 DUF4058:  Protein of unknown function (DUF4058)
Probab=23.48  E-value=59  Score=34.92  Aligned_cols=26  Identities=31%  Similarity=0.489  Sum_probs=21.2

Q ss_pred             ccc-cccchhHHHHHHHHHHHHhhccccc
Q 004596          701 EDN-IEGKGSRFAKFIAERREVLKHSTQV  728 (743)
Q Consensus       701 ~~~-~~~~~~~~akfi~~~~~~~~~~~~~  728 (743)
                      |.| ..++|..  +|+++||++|.|.|+|
T Consensus       124 P~NKr~G~gr~--~Y~~KRq~vl~S~tHL  150 (254)
T PF13267_consen  124 PANKRPGEGRA--AYERKRQEVLGSGTHL  150 (254)
T ss_pred             cccCCCCccHH--HHHHHHHHHHhccCce
Confidence            335 4577776  9999999999999987


No 53 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=21.42  E-value=1.2e+02  Score=29.42  Aligned_cols=26  Identities=31%  Similarity=0.551  Sum_probs=18.7

Q ss_pred             eEEeeccccCC---CceEcCCCcEEEEEec
Q 004596          281 LLMADIRCLPG---GPVFGEHAHFVGILIR  307 (743)
Q Consensus       281 ~i~tD~~~~pG---g~vf~~~g~liGiv~~  307 (743)
                      +++.--.+.||   |+|+-++| +|||+++
T Consensus        80 ~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta  108 (127)
T PF00947_consen   80 LLIGEGPAEPGDCGGILRCKHG-VIGIVTA  108 (127)
T ss_dssp             EEEEE-SSSTT-TCSEEEETTC-EEEEEEE
T ss_pred             ceeecccCCCCCCCceeEeCCC-eEEEEEe
Confidence            34444567887   88888875 9999996


No 54 
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=20.83  E-value=84  Score=33.07  Aligned_cols=54  Identities=15%  Similarity=0.206  Sum_probs=35.9

Q ss_pred             eEEEEccccCCCCccccee--cC--CceEEEEEeeeecCCCCcccCceeEEEeh--hHHHHHHHHH
Q 004596          596 VMLETTAAVHPGGSGGAVV--NL--DGHMIGLVTSNARHGGGTVIPHLNFSIPC--AVLRPIFEFA  655 (743)
Q Consensus       596 ~~IqTdAai~~GnSGGPL~--d~--~G~VIGIvtsna~~~gg~~~p~lnFaIPi--~~l~~il~~~  655 (743)
                      .-+++++....|+-|||++  |.  .-+++||+.+.....      ..+||=++  +.|++.++.+
T Consensus       169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~------~~gYAe~itQEDL~~A~~~l  228 (231)
T PF12381_consen  169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANH------AMGYAESITQEDLMRAINKL  228 (231)
T ss_pred             eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccc------cceehhhhhHHHHHHHHHhh
Confidence            3456778899999999998  22  378999999976422      35566554  3444444443


No 55 
>PF08208 RNA_polI_A34:  DNA-directed RNA polymerase I subunit RPA34.5;  InterPro: IPR013240 This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription [].; PDB: 3NFG_N.
Probab=20.82  E-value=33  Score=35.05  Aligned_cols=13  Identities=62%  Similarity=0.950  Sum_probs=0.0

Q ss_pred             Ccchhhhcccccc
Q 004596           23 DPKGLKMRRHAFH   35 (743)
Q Consensus        23 dpk~~k~~~~~f~   35 (743)
                      -|+|||||.|+|=
T Consensus       109 qp~gLk~Rf~P~G  121 (198)
T PF08208_consen  109 QPKGLKMRFFPFG  121 (198)
T ss_dssp             -------------
T ss_pred             CCCCcceeeecCC
Confidence            4899999999884


No 56 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=20.04  E-value=1.1e+02  Score=31.82  Aligned_cols=49  Identities=16%  Similarity=0.175  Sum_probs=18.0

Q ss_pred             EEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHH
Q 004596          597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR  649 (743)
Q Consensus       597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~  649 (743)
                      .....+...+|.||-|+|+.+ +++|+.+......   ...+.|+..|+.-+.
T Consensus       137 ~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~---~~~n~n~~spip~~~  185 (203)
T PF02122_consen  137 FASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGS---NRENNNRMSPIPPIP  185 (203)
T ss_dssp             EEEE-----TT-TT-EEE-SS--EEEEEEEE----------------------
T ss_pred             CCceEcCCCCCCCCCCeEECC-CceEeecCccccc---ccccccccccccccc
Confidence            556667788999999999877 9999998863211   112566666655443


Done!