Query         040739
Match_columns 594
No_of_seqs    183 out of 473
Neff          4.5 
Searched_HMMs 46136
Date          Fri Mar 29 11:19:05 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/040739.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/040739hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF08192 Peptidase_S64:  Peptid  99.7 4.8E-16   1E-20  172.4  12.2  100  328-439   586-686 (695)
  2 PRK10139 serine endoprotease;   98.3 3.4E-06 7.3E-11   92.7  12.4  156  224-436    94-253 (455)
  3 TIGR02038 protease_degS peripl  98.3 1.1E-05 2.4E-10   85.6  13.9  158  224-438    82-243 (351)
  4 PRK10942 serine endoprotease;   98.2 1.5E-05 3.3E-10   88.0  12.3  155  224-435   115-273 (473)
  5 TIGR02037 degP_htrA_DO peripla  98.2 1.5E-05 3.3E-10   86.2  12.1  157  224-438    62-222 (428)
  6 PRK10898 serine endoprotease;   98.1   3E-05 6.5E-10   82.6  13.5  158  224-438    82-244 (353)
  7 PF13365 Trypsin_2:  Trypsin-li  97.4 0.00035 7.5E-09   60.3   6.6   21  378-403   100-120 (120)
  8 COG0265 DegQ Trypsin-like seri  96.8  0.0094   2E-07   62.9  10.8  148  236-436    83-235 (347)
  9 PF00089 Trypsin:  Trypsin;  In  96.7  0.0026 5.6E-08   59.9   5.4  122  288-436    86-218 (220)
 10 KOG1421 Predicted signaling-as  96.0   0.029 6.3E-07   64.6   9.7   92  326-439   157-256 (955)
 11 cd00190 Tryp_SPc Trypsin-like   95.8    0.07 1.5E-06   50.5  10.2   30  378-409   180-209 (232)
 12 smart00020 Tryp_SPc Trypsin-li  95.6    0.12 2.5E-06   49.4  10.7   29  379-411   182-210 (229)
 13 PF00944 Peptidase_S3:  Alphavi  95.5   0.018 3.9E-07   54.7   4.5   29  378-411   102-130 (158)
 14 COG3591 V8-like Glu-specific e  94.3     0.5 1.1E-05   49.1  11.7   27  379-410   200-226 (251)
 15 PF00947 Pico_P2A:  Picornaviru  91.4     0.2 4.4E-06   47.1   3.7   49  363-435    75-123 (127)
 16 PF00863 Peptidase_C4:  Peptida  89.6       2 4.4E-05   44.3   9.3   72  327-411   103-176 (235)
 17 PF05579 Peptidase_S32:  Equine  87.3    0.38 8.2E-06   50.5   2.4   27  378-409   204-230 (297)
 18 PF10459 Peptidase_S46:  Peptid  84.7     1.1 2.4E-05   52.7   4.7   52  379-438   630-683 (698)
 19 PF00949 Peptidase_S7:  Peptida  80.8     1.1 2.4E-05   42.4   2.3   26  379-409    94-119 (132)
 20 PF01732 DUF31:  Putative pepti  62.3     5.8 0.00013   42.8   2.7   22  380-406   353-374 (374)
 21 PF12381 Peptidase_C3G:  Tungro  58.8      13 0.00029   38.3   4.3   47  360-411   162-208 (231)
 22 COG5640 Secreted trypsin-like   51.5      92   0.002   34.7   9.4   32  378-412   224-256 (413)
 23 PF00548 Peptidase_C3:  3C cyst  51.3      43 0.00093   32.7   6.4   29  378-408   143-171 (172)
 24 KOG1320 Serine protease [Postt  49.3      61  0.0013   37.0   8.0   56  364-436   290-345 (473)
 25 KOG1320 Serine protease [Postt  45.0      64  0.0014   36.8   7.3  129  237-406    98-229 (473)
 26 PF02122 Peptidase_S39:  Peptid  30.5      40 0.00087   34.2   2.7   42  379-435   144-185 (203)

No 1  
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=99.65  E-value=4.8e-16  Score=172.45  Aligned_cols=100  Identities=29%  Similarity=0.453  Sum_probs=84.0

Q ss_pred             cccCCCeEEEEeecccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccC-CCCCceEEEEEe
Q 040739          328 SSLIGKQVVKVGRSSGLTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILMKGEN-GEKPRPIGIIWG  406 (594)
Q Consensus       328 ~p~lG~~V~KvGRTTGlT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~-d~~~~aVGLlfG  406 (594)
                      ....|+.|+|+|||||+|+|+|+++++.|+.+ |...+.+++|.+.+...|+.+|||||+|+...++ ...-.+|||+++
T Consensus       586 ~~~~G~~VfK~GrTTgyT~G~lNg~klvyw~d-G~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhs  664 (695)
T PF08192_consen  586 NLVPGMEVFKVGRTTGYTTGILNGIKLVYWAD-GKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHS  664 (695)
T ss_pred             ccCCCCeEEEecccCCccceEecceEEEEecC-CCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeee
Confidence            45679999999999999999999999888876 5566789999986678899999999999964444 334569999999


Q ss_pred             cCCCCCccccccCCCCCcceEeechHHHHhhcC
Q 040739          407 GTANRGRLKLKIGQPPENWTSGVDLGRLLNLLE  439 (594)
Q Consensus       407 Gs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L~  439 (594)
                      .++..           ..|.+|+||..+|+.|+
T Consensus       665 ydge~-----------kqfglftPi~~il~rl~  686 (695)
T PF08192_consen  665 YDGEQ-----------KQFGLFTPINEILDRLE  686 (695)
T ss_pred             cCCcc-----------ceeeccCcHHHHHHHHH
Confidence            87544           68999999999999874


No 2  
>PRK10139 serine endoprotease; Provisional
Probab=98.35  E-value=3.4e-06  Score=92.69  Aligned_cols=156  Identities=21%  Similarity=0.200  Sum_probs=94.9

Q ss_pred             EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739          224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS  303 (594)
Q Consensus       224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s  303 (594)
                      |+++.+.    .-|||||+||+.+...            +.=...||. ....+.+-..+..    |+|++++....+. 
T Consensus        94 G~ii~~~----~g~IlTn~HVv~~a~~------------i~V~~~dg~-~~~a~vvg~D~~~----DlAvlkv~~~~~l-  151 (455)
T PRK10139         94 GVIIDAA----KGYVLTNNHVINQAQK------------ISIQLNDGR-EFDAKLIGSDDQS----DIALLQIQNPSKL-  151 (455)
T ss_pred             EEEEECC----CCEEEeChHHhCCCCE------------EEEEECCCC-EEEEEEEEEcCCC----CEEEEEecCCCCC-
Confidence            6666542    2399999999987542            111123444 1112322234444    8899998632211 


Q ss_pred             ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739          304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD  379 (594)
Q Consensus       304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS  379 (594)
                         +.+.    +|.        .....+|+.|.-+|.--|    +|.|.|+++.-......+   +.++|.++.    --
T Consensus       152 ---~~~~----lg~--------s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~~~---~~~~iqtda----~i  209 (455)
T PRK10139        152 ---TQIA----IAD--------SDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNLEG---LENFIQTDA----SI  209 (455)
T ss_pred             ---ceeE----ecC--------ccccCCCCEEEEEecCCCCCCceEEEEEccccccccCCCC---cceEEEECC----cc
Confidence               1111    111        124678999999988544    588999887632111112   356788776    46


Q ss_pred             CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739          380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN  436 (594)
Q Consensus       380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~  436 (594)
                      .+|.||++++     |.++++|||..+--...        ++..+..|+-|+..+..
T Consensus       210 n~GnSGGpl~-----n~~G~vIGi~~~~~~~~--------~~~~gigfaIP~~~~~~  253 (455)
T PRK10139        210 NRGNSGGALL-----NLNGELIGINTAILAPG--------GGSVGIGFAIPSNMART  253 (455)
T ss_pred             CCCCCcceEE-----CCCCeEEEEEEEEEcCC--------CCccceEEEEEhHHHHH
Confidence            7899999999     99999999998743211        12256789999865443


No 3  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.27  E-value=1.1e-05  Score=85.63  Aligned_cols=158  Identities=19%  Similarity=0.246  Sum_probs=95.2

Q ss_pred             EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739          224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS  303 (594)
Q Consensus       224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s  303 (594)
                      |+++.+ .|    |||||+||..+.+.            ++=...||.. ...+.+-..+..    |+|++++... +  
T Consensus        82 G~vi~~-~G----~IlTn~HVV~~~~~------------i~V~~~dg~~-~~a~vv~~d~~~----DlAvlkv~~~-~--  136 (351)
T TIGR02038        82 GVIMSK-EG----YILTNYHVIKKADQ------------IVVALQDGRK-FEAELVGSDPLT----DLAVLKIEGD-N--  136 (351)
T ss_pred             EEEEeC-Ce----EEEecccEeCCCCE------------EEEEECCCCE-EEEEEEEecCCC----CEEEEEecCC-C--
Confidence            666643 22    89999999987542            1111234431 112222223444    7899987632 1  


Q ss_pred             ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739          304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD  379 (594)
Q Consensus       304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS  379 (594)
                       + +.+.    +|        ......+|+.|.-+|...|    +|.|.|+++.-......+   ..+++.++.    --
T Consensus       137 -~-~~~~----l~--------~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~~~---~~~~iqtda----~i  195 (351)
T TIGR02038       137 -L-PTIP----VN--------LDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSSVG---RQNFIQTDA----AI  195 (351)
T ss_pred             -C-ceEe----cc--------CcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCCCC---cceEEEECC----cc
Confidence             1 1111    11        1235689999999999866    578999888632212112   245677666    36


Q ss_pred             CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhc
Q 040739          380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLL  438 (594)
Q Consensus       380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L  438 (594)
                      .+|.||++++     |.++++|||..+.-...      .+....+..|+-|+..+...+
T Consensus       196 ~~GnSGGpl~-----n~~G~vIGI~~~~~~~~------~~~~~~g~~faIP~~~~~~vl  243 (351)
T TIGR02038       196 NAGNSGGALI-----NTNGELVGINTASFQKG------GDEGGEGINFAIPIKLAHKIM  243 (351)
T ss_pred             CCCCCcceEE-----CCCCeEEEEEeeeeccc------CCCCccceEEEecHHHHHHHH
Confidence            7899999999     99999999987542110      011224678899988777655


No 4  
>PRK10942 serine endoprotease; Provisional
Probab=98.16  E-value=1.5e-05  Score=88.02  Aligned_cols=155  Identities=19%  Similarity=0.196  Sum_probs=91.3

Q ss_pred             EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739          224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS  303 (594)
Q Consensus       224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s  303 (594)
                      |++|...    .-|||||+||+.+....          .+  ...||. ....+.+-..+..    |+|++++....+. 
T Consensus       115 G~ii~~~----~G~IlTn~HVv~~a~~i----------~V--~~~dg~-~~~a~vv~~D~~~----DlAvlki~~~~~l-  172 (473)
T PRK10942        115 GVIIDAD----KGYVVTNNHVVDNATKI----------KV--QLSDGR-KFDAKVVGKDPRS----DIALIQLQNPKNL-  172 (473)
T ss_pred             EEEEECC----CCEEEeChhhcCCCCEE----------EE--EECCCC-EEEEEEEEecCCC----CEEEEEecCCCCC-
Confidence            6666432    23899999998774421          11  123443 1112222233444    8899987532211 


Q ss_pred             ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739          304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD  379 (594)
Q Consensus       304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS  379 (594)
                         +.+.    +|        ......+|+.|.-+|..-|    +|.|.|+++.-...   +...+.++|.++.    --
T Consensus       173 ---~~~~----lg--------~s~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~---~~~~~~~~iqtda----~i  230 (473)
T PRK10942        173 ---TAIK----MA--------DSDALRVGDYTVAIGNPYGLGETVTSGIVSALGRSGL---NVENYENFIQTDA----AI  230 (473)
T ss_pred             ---ceeE----ec--------CccccCCCCEEEEEcCCCCCCcceeEEEEEEeecccC---CcccccceEEecc----cc
Confidence               1121    11        1234689999999998755    58899988863211   1112346677766    35


Q ss_pred             CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHH
Q 040739          380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLL  435 (594)
Q Consensus       380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL  435 (594)
                      .+|.||++++     |.++++|||..+.-...        ++..+..|+-|+..+.
T Consensus       231 ~~GnSGGpL~-----n~~GeviGI~t~~~~~~--------g~~~g~gfaIP~~~~~  273 (473)
T PRK10942        231 NRGNSGGALV-----NLNGELIGINTAILAPD--------GGNIGIGFAIPSNMVK  273 (473)
T ss_pred             CCCCCcCccC-----CCCCeEEEEEEEEEcCC--------CCcccEEEEEEHHHHH
Confidence            6899999999     89999999987643211        1113467888875433


No 5  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.15  E-value=1.5e-05  Score=86.23  Aligned_cols=157  Identities=22%  Similarity=0.224  Sum_probs=93.3

Q ss_pred             EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739          224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS  303 (594)
Q Consensus       224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s  303 (594)
                      |++|... |    |||||+||+.+....          .+.  ..||. ....+.+...+..    |+|++++...   .
T Consensus        62 Gfii~~~-G----~IlTn~Hvv~~~~~i----------~V~--~~~~~-~~~a~vv~~d~~~----DlAllkv~~~---~  116 (428)
T TIGR02037        62 GVIISAD-G----YILTNNHVVDGADEI----------TVT--LSDGR-EFKAKLVGKDPRT----DIAVLKIDAK---K  116 (428)
T ss_pred             EEEECCC-C----EEEEcHHHcCCCCeE----------EEE--eCCCC-EEEEEEEEecCCC----CEEEEEecCC---C
Confidence            6666543 2    899999999885531          111  12333 1112222223333    7799988532   1


Q ss_pred             ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739          304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD  379 (594)
Q Consensus       304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS  379 (594)
                      .+ +.+.    +        .......+|+.|+-+|.--|    +|.|.|+++.-..... +  .+.+++.++.    -.
T Consensus       117 ~~-~~~~----l--------~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~~~-~--~~~~~i~tda----~i  176 (428)
T TIGR02037       117 NL-PVIK----L--------GDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGLGI-G--DYENFIQTDA----AI  176 (428)
T ss_pred             Cc-eEEE----c--------cCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCccCC-C--CccceEEECC----CC
Confidence            11 1111    1        11234689999999998644    6788888876221111 1  1355777766    47


Q ss_pred             CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhc
Q 040739          380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLL  438 (594)
Q Consensus       380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L  438 (594)
                      .+|.||++++     |.++++|||..+.-...        ++..++.|+-||..+.+.+
T Consensus       177 ~~GnSGGpl~-----n~~G~viGI~~~~~~~~--------g~~~g~~faiP~~~~~~~~  222 (428)
T TIGR02037       177 NPGNSGGPLV-----NLRGEVIGINTAIYSPS--------GGNVGIGFAIPSNMAKNVV  222 (428)
T ss_pred             CCCCCCCceE-----CCCCeEEEEEeEEEcCC--------CCccceEEEEEhHHHHHHH
Confidence            7899999999     89999999987643211        1124567888876555444


No 6  
>PRK10898 serine endoprotease; Provisional
Probab=98.13  E-value=3e-05  Score=82.59  Aligned_cols=158  Identities=23%  Similarity=0.290  Sum_probs=92.1

Q ss_pred             EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739          224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS  303 (594)
Q Consensus       224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s  303 (594)
                      |+++. ..|    |||||+||+.+....          .+  ...||.. ...+.+-..+..    |+|++++... +  
T Consensus        82 Gfvi~-~~G----~IlTn~HVv~~a~~i----------~V--~~~dg~~-~~a~vv~~d~~~----DlAvl~v~~~-~--  136 (353)
T PRK10898         82 GVIMD-QRG----YILTNKHVINDADQI----------IV--ALQDGRV-FEALLVGSDSLT----DLAVLKINAT-N--  136 (353)
T ss_pred             EEEEe-CCe----EEEecccEeCCCCEE----------EE--EeCCCCE-EEEEEEEEcCCC----CEEEEEEcCC-C--
Confidence            66664 322    899999999874421          11  1224431 111222123444    8899998532 1  


Q ss_pred             ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEE-EEEeCCCCeEEEEEEEEECCCCCCC
Q 040739          304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYA-LEYNDEKGICFLTDFLVVGENQQTF  378 (594)
Q Consensus       304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~-V~y~~~~G~~~f~dqIVt~~~g~~F  378 (594)
                       + +.+.    +|        ....+.+|+.|.-+|.-.|    +|.|.|++.. ..+.. .+.   .+++.++.    -
T Consensus       137 -l-~~~~----l~--------~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~-~~~---~~~iqtda----~  194 (353)
T PRK10898        137 -L-PVIP----IN--------PKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP-TGR---QNFLQTDA----S  194 (353)
T ss_pred             -C-Ceee----cc--------CcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC-ccc---cceEEecc----c
Confidence             1 1111    11        1124689999999998766    5889999876 32222 121   34566665    3


Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhc
Q 040739          379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLL  438 (594)
Q Consensus       379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L  438 (594)
                      -.+|.||++++     |.++++|||..+.-...     ..+..+.+..|+-|+..+...+
T Consensus       195 i~~GnSGGPl~-----n~~G~vvGI~~~~~~~~-----~~~~~~~g~~faIP~~~~~~~~  244 (353)
T PRK10898        195 INHGNSGGALV-----NSLGELMGINTLSFDKS-----NDGETPEGIGFAIPTQLATKIM  244 (353)
T ss_pred             cCCCCCcceEE-----CCCCeEEEEEEEEeccc-----CCCCcccceEEEEchHHHHHHH
Confidence            57899999999     89999999987532111     0011124678898887755443


No 7  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.44  E-value=0.00035  Score=60.32  Aligned_cols=21  Identities=29%  Similarity=0.441  Sum_probs=18.6

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEE
Q 040739          378 FDLEGDSGSLILMKGENGEKPRPIGI  403 (594)
Q Consensus       378 FS~~GDSGSlVl~~~~~d~~~~aVGL  403 (594)
                      ...+|.||++||     |.++++|||
T Consensus       100 ~~~~G~SGgpv~-----~~~G~vvGi  120 (120)
T PF13365_consen  100 DTRPGSSGGPVF-----DSDGRVVGI  120 (120)
T ss_dssp             S-STTTTTSEEE-----ETTSEEEEE
T ss_pred             ccCCCcEeHhEE-----CCCCEEEeC
Confidence            588999999999     899999997


No 8  
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=96.77  E-value=0.0094  Score=62.91  Aligned_cols=148  Identities=20%  Similarity=0.238  Sum_probs=91.1

Q ss_pred             eEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCCcccccccccccc
Q 040739          236 VGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMSTVTTSVKGLGEI  315 (594)
Q Consensus       236 ~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s~vs~~I~~lG~I  315 (594)
                      -|+|||+||+.+...            +.-...||. .-..+++...+..    |+|++++.....    .+.+.    +
T Consensus        83 g~ivTn~hVi~~a~~------------i~v~l~dg~-~~~a~~vg~d~~~----dlavlki~~~~~----~~~~~----~  137 (347)
T COG0265          83 GYIVTNNHVIAGAEE------------ITVTLADGR-EVPAKLVGKDPIS----DLAVLKIDGAGG----LPVIA----L  137 (347)
T ss_pred             eEEEecceecCCcce------------EEEEeCCCC-EEEEEEEecCCcc----CEEEEEeccCCC----Cceee----c
Confidence            499999999998442            111114554 1123444334444    779988754221    11221    1


Q ss_pred             CcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEE-EeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEe
Q 040739          316 GDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALE-YNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILM  390 (594)
Q Consensus       316 G~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~-y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~  390 (594)
                      |.        .....+|+.|.-+|-..|    +|.|.|+++.-. +.....   +.++|.+..    .-.+|.||..++ 
T Consensus       138 ~~--------s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~~~~---~~~~IqtdA----ain~gnsGgpl~-  201 (347)
T COG0265         138 GD--------SDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGSAGG---YVNFIQTDA----AINPGNSGGPLV-  201 (347)
T ss_pred             cC--------CCCcccCCEEEEecCCCCcccceeccEEeccccccccCccc---ccchhhccc----ccCCCCCCCceE-
Confidence            11        123458999999999888    677877777643 332111   567777665    688999999999 


Q ss_pred             eccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739          391 KGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN  436 (594)
Q Consensus       391 ~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~  436 (594)
                          +.++++||+..+.-...+        +..+..|+-|+..+..
T Consensus       202 ----n~~g~~iGint~~~~~~~--------~~~gigfaiP~~~~~~  235 (347)
T COG0265         202 ----NIDGEVVGINTAIIAPSG--------GSSGIGFAIPVNLVAP  235 (347)
T ss_pred             ----cCCCcEEEEEEEEecCCC--------CcceeEEEecHHHHHH
Confidence                899999999877654321        1123467777766554


No 9  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.70  E-value=0.0026  Score=59.93  Aligned_cols=122  Identities=16%  Similarity=0.228  Sum_probs=63.1

Q ss_pred             cccccccccccCCCCCccccccccccccCcceeecccC-cccccCCCeEEEEeecccceEE---EEEEEEEEEeCC----
Q 040739          288 RADGAFIPFADDFDMSTVTTSVKGLGEIGDVKIVDLQS-PISSLIGKQVVKVGRSSGLTTG---TVLAYALEYNDE----  359 (594)
Q Consensus       288 DaD~AIi~va~~~d~s~vs~~I~~lG~IG~vk~vdl~g-~~~p~lG~~V~KvGRTTGlT~G---~Itai~V~y~~~----  359 (594)
                      +.|+||+++.++....   ..+.   .+      .+.. ......|+.+.-+|-......+   .+....+.+-..    
T Consensus        86 ~~DiAll~L~~~~~~~---~~~~---~~------~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~  153 (220)
T PF00089_consen   86 DNDIALLKLDRPITFG---DNIQ---PI------CLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCR  153 (220)
T ss_dssp             TTSEEEEEESSSSEHB---SSBE---ES------BBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHH
T ss_pred             cccccccccccccccc---cccc---cc------cccccccccccccccccccccccccccccccccccccccccccccc
Confidence            4488999998753221   1221   11      1111 1224678877777776654444   344333322110    


Q ss_pred             ---CCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739          360 ---KGICFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN  436 (594)
Q Consensus       360 ---~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~  436 (594)
                         ... ...+++.+...+..-...|||||.++.     .+..++|++..+....         .+....+|++|...++
T Consensus       154 ~~~~~~-~~~~~~c~~~~~~~~~~~g~sG~pl~~-----~~~~lvGI~s~~~~c~---------~~~~~~v~~~v~~~~~  218 (220)
T PF00089_consen  154 SSYNDN-LTPNMICAGSSGSGDACQGDSGGPLIC-----NNNYLVGIVSFGENCG---------SPNYPGVYTRVSSYLD  218 (220)
T ss_dssp             HHTTTT-STTTEEEEETTSSSBGGTTTTTSEEEE-----TTEEEEEEEEEESSSS---------BTTSEEEEEEGGGGHH
T ss_pred             cccccc-ccccccccccccccccccccccccccc-----ceeeecceeeecCCCC---------CCCcCEEEEEHHHhhc
Confidence               000 011223222212234568999999993     3337999999984321         1123488888876653


No 10 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=96.04  E-value=0.029  Score=64.56  Aligned_cols=92  Identities=17%  Similarity=0.286  Sum_probs=57.7

Q ss_pred             cccccCCCeEEEEeeccc----ceEEEEEEEE---EEEeCCCCeEEEEE-EEEECCCCCCCCCCCCccceEEeeccCCCC
Q 040739          326 PISSLIGKQVVKVGRSSG----LTTGTVLAYA---LEYNDEKGICFLTD-FLVVGENQQTFDLEGDSGSLILMKGENGEK  397 (594)
Q Consensus       326 ~~~p~lG~~V~KvGRTTG----lT~G~Itai~---V~y~~~~G~~~f~d-qIVt~~~g~~FS~~GDSGSlVl~~~~~d~~  397 (594)
                      +..+.+|...+.+|--.|    +-.|.+..+.   =+|.. ...+.|.- .+....    -+.+|-|||+|+     +-.
T Consensus       157 p~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~-~~yndfnTfy~Qaas----stsggssgspVv-----~i~  226 (955)
T KOG1421|consen  157 PELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGE-DTYNDFNTFYIQAAS----STSGGSSGSPVV-----DIP  226 (955)
T ss_pred             ccccccCCceEEecCCccceEEeehhhhhhccCCCccccc-cccccccceeeeehh----cCCCCCCCCcee-----ccc
Confidence            345667777777766442    2223333332   11211 11222322 233333    478899999999     899


Q ss_pred             CceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhcC
Q 040739          398 PRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLLE  439 (594)
Q Consensus       398 ~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L~  439 (594)
                      +++|.|.-||....            ...+|-||.+|+.+|-
T Consensus       227 gyAVAl~agg~~ss------------as~ffLpLdrV~RaL~  256 (955)
T KOG1421|consen  227 GYAVALNAGGSISS------------ASDFFLPLDRVVRALR  256 (955)
T ss_pred             ceEEeeecCCcccc------------cccceeeccchhhhhh
Confidence            99999999998653            4479999999998874


No 11 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=95.85  E-value=0.07  Score=50.52  Aligned_cols=30  Identities=23%  Similarity=0.388  Sum_probs=22.3

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEecCC
Q 040739          378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGTA  409 (594)
Q Consensus       378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~  409 (594)
                      -...||||+.++..  .+....++|++..|..
T Consensus       180 ~~c~gdsGgpl~~~--~~~~~~lvGI~s~g~~  209 (232)
T cd00190         180 DACQGDSGGPLVCN--DNGRGVLVGIVSWGSG  209 (232)
T ss_pred             ccccCCCCCcEEEE--eCCEEEEEEEEehhhc
Confidence            45679999999952  1234789999988764


No 12 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=95.59  E-value=0.12  Score=49.40  Aligned_cols=29  Identities=28%  Similarity=0.443  Sum_probs=21.4

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEecCCCC
Q 040739          379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANR  411 (594)
Q Consensus       379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~  411 (594)
                      ..+||||+.++...  + ...++|++..|. .|
T Consensus       182 ~c~gdsG~pl~~~~--~-~~~l~Gi~s~g~-~C  210 (229)
T smart00020      182 ACQGDSGGPLVCND--G-RWVLVGIVSWGS-GC  210 (229)
T ss_pred             ccCCCCCCeeEEEC--C-CEEEEEEEEECC-CC
Confidence            45699999999421  1 458999998887 44


No 13 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=95.50  E-value=0.018  Score=54.73  Aligned_cols=29  Identities=31%  Similarity=0.541  Sum_probs=25.7

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEecCCCC
Q 040739          378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGTANR  411 (594)
Q Consensus       378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~  411 (594)
                      ...+||||.+++     |+++++|||++||....
T Consensus       102 ~g~~GDSGRpi~-----DNsGrVVaIVLGG~neG  130 (158)
T PF00944_consen  102 VGKPGDSGRPIF-----DNSGRVVAIVLGGANEG  130 (158)
T ss_dssp             S-STTSTTEEEE-----STTSBEEEEEEEEEEET
T ss_pred             CCCCCCCCCccC-----cCCCCEEEEEecCCCCC
Confidence            689999999999     99999999999998654


No 14 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=94.31  E-value=0.5  Score=49.12  Aligned_cols=27  Identities=37%  Similarity=0.570  Sum_probs=23.7

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEecCCC
Q 040739          379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTAN  410 (594)
Q Consensus       379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~  410 (594)
                      ..+|+|||.|+     ..+.+++|++++|..-
T Consensus       200 T~pG~SGSpv~-----~~~~~vigv~~~g~~~  226 (251)
T COG3591         200 TLPGSSGSPVL-----ISKDEVIGVHYNGPGA  226 (251)
T ss_pred             ccCCCCCCceE-----ecCceEEEEEecCCCc
Confidence            77899999999     6777999999999863


No 15 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=91.44  E-value=0.2  Score=47.08  Aligned_cols=49  Identities=24%  Similarity=0.324  Sum_probs=37.0

Q ss_pred             EEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHH
Q 040739          363 CFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLL  435 (594)
Q Consensus       363 ~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL  435 (594)
                      .+.+++++..-    +++|||-|++++      -+-.++||+.||..              +...|.+|+.++
T Consensus        75 h~Q~~~l~g~G----p~~PGdCGg~L~------C~HGViGi~Tagg~--------------g~VaF~dir~~~  123 (127)
T PF00947_consen   75 HYQYNLLIGEG----PAEPGDCGGILR------CKHGVIGIVTAGGE--------------GHVAFADIRDLL  123 (127)
T ss_dssp             EEEECEEEEE-----SSSTT-TCSEEE------ETTCEEEEEEEEET--------------TEEEEEECCCGS
T ss_pred             heecCceeecc----cCCCCCCCceeE------eCCCeEEEEEeCCC--------------ceEEEEechhhh
Confidence            56677777655    799999999999      45669999999984              447888887654


No 16 
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)).  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=89.60  E-value=2  Score=44.33  Aligned_cols=72  Identities=19%  Similarity=0.265  Sum_probs=46.7

Q ss_pred             ccccCCCeEEEEee--cccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEE
Q 040739          327 ISSLIGKQVVKVGR--SSGLTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGII  404 (594)
Q Consensus       327 ~~p~lG~~V~KvGR--TTGlT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLl  404 (594)
                      ..|..++.|+.+|-  .+....-+|+.-...|... +..+++.+|-|        ..||=|++++.    -.++.+||++
T Consensus       103 R~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p~~-~~~fWkHwIsT--------k~G~CG~PlVs----~~Dg~IVGiH  169 (235)
T PF00863_consen  103 RAPKEGERVCMVGSNFQEKSISSTVSESSWIYPEE-NSHFWKHWIST--------KDGDCGLPLVS----TKDGKIVGIH  169 (235)
T ss_dssp             ----TT-EEEEEEEECSSCCCEEEEEEEEEEEEET-TTTEEEE-C-----------TT-TT-EEEE----TTT--EEEEE
T ss_pred             cCCCCCCEEEEEEEEEEcCCeeEEECCceEEeecC-CCCeeEEEecC--------CCCccCCcEEE----cCCCcEEEEE
Confidence            47899999999997  7778888888888777633 23568888854        45999999994    4689999999


Q ss_pred             EecCCCC
Q 040739          405 WGGTANR  411 (594)
Q Consensus       405 fGGs~~~  411 (594)
                      ..++...
T Consensus       170 sl~~~~~  176 (235)
T PF00863_consen  170 SLTSNTS  176 (235)
T ss_dssp             EEEETTT
T ss_pred             cCccCCC
Confidence            9887643


No 17 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=87.34  E-value=0.38  Score=50.50  Aligned_cols=27  Identities=37%  Similarity=0.506  Sum_probs=21.8

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEecCC
Q 040739          378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGTA  409 (594)
Q Consensus       378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~  409 (594)
                      |+.+|||||+|+     ..++.+||+|.|.+.
T Consensus       204 fT~~GDSGSPVV-----t~dg~liGVHTGSn~  230 (297)
T PF05579_consen  204 FTGPGDSGSPVV-----TEDGDLIGVHTGSNK  230 (297)
T ss_dssp             SS-GGCTT-EEE-----ETTC-EEEEEEEEET
T ss_pred             EcCCCCCCCccC-----cCCCCEEEEEecCCC
Confidence            899999999999     688999999999775


No 18 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=84.74  E-value=1.1  Score=52.69  Aligned_cols=52  Identities=31%  Similarity=0.440  Sum_probs=38.7

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEecCCCC--CccccccCCCCCcceEeechHHHHhhc
Q 040739          379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANR--GRLKLKIGQPPENWTSGVDLGRLLNLL  438 (594)
Q Consensus       379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~--g~~~~~~~~~p~~~Tl~~pI~~VL~~L  438 (594)
                      .-+|.|||+|+     |.++++|||.|-|+-..  |-..|.   .-.+-++..+|.-||-.|
T Consensus       630 itGGNSGSPvl-----N~~GeLVGl~FDgn~Esl~~D~~fd---p~~~R~I~VDiRyvL~~l  683 (698)
T PF10459_consen  630 ITGGNSGSPVL-----NAKGELVGLAFDGNWESLSGDIAFD---PELNRTIHVDIRYVLWAL  683 (698)
T ss_pred             cCCCCCCCccC-----CCCceEEEEeecCchhhcccccccc---cccceeEEEEHHHHHHHH
Confidence            77899999999     99999999999998543  111111   114567899998888765


No 19 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=80.82  E-value=1.1  Score=42.40  Aligned_cols=26  Identities=27%  Similarity=0.387  Sum_probs=21.5

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEecCC
Q 040739          379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTA  409 (594)
Q Consensus       379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~  409 (594)
                      -.+|-|||+|+     +.++++|||+++|-.
T Consensus        94 ~~~GsSGSpi~-----n~~g~ivGlYg~g~~  119 (132)
T PF00949_consen   94 FPKGSSGSPIF-----NQNGEIVGLYGNGVE  119 (132)
T ss_dssp             S-TTGTT-EEE-----ETTSCEEEEEEEEEE
T ss_pred             cCCCCCCCceE-----cCCCcEEEEEcccee
Confidence            34699999999     899999999999864


No 20 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=62.25  E-value=5.8  Score=42.79  Aligned_cols=22  Identities=32%  Similarity=0.668  Sum_probs=20.6

Q ss_pred             CCCCccceEEeeccCCCCCceEEEEEe
Q 040739          380 LEGDSGSLILMKGENGEKPRPIGIIWG  406 (594)
Q Consensus       380 ~~GDSGSlVl~~~~~d~~~~aVGLlfG  406 (594)
                      .+|=|||+|+     ++++++|||+||
T Consensus       353 ~gGaSGS~V~-----n~~~~lvGIy~g  374 (374)
T PF01732_consen  353 GGGASGSMVI-----NQNNELVGIYFG  374 (374)
T ss_pred             CCCCCcCeEE-----CCCCCEEEEeCC
Confidence            4799999999     999999999997


No 21 
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=58.77  E-value=13  Score=38.26  Aligned_cols=47  Identities=28%  Similarity=0.316  Sum_probs=31.6

Q ss_pred             CCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCC
Q 040739          360 KGICFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANR  411 (594)
Q Consensus       360 ~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~  411 (594)
                      +|..+.+.-+.-..    .+..||=||+++.- +...-.+++||+.+|+.+.
T Consensus       162 ~~~ytir~gleY~~----~t~~GdCGs~i~~~-~t~~~RKIvGiHVAG~~~~  208 (231)
T PF12381_consen  162 KGQYTIRQGLEYQM----PTMNGDCGSPIVRN-NTQMVRKIVGIHVAGSANH  208 (231)
T ss_pred             CCcEEeeeeeeEEC----CCcCCCccceeeEc-chhhhhhhheeeecccccc
Confidence            34444554444334    68999999999952 1223467999999999764


No 22 
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=51.46  E-value=92  Score=34.66  Aligned_cols=32  Identities=38%  Similarity=0.530  Sum_probs=20.7

Q ss_pred             CCCCCCccceEEeeccCCCCCc-eEEEEEecCCCCC
Q 040739          378 FDLEGDSGSLILMKGENGEKPR-PIGIIWGGTANRG  412 (594)
Q Consensus       378 FS~~GDSGSlVl~~~~~d~~~~-aVGLlfGGs~~~g  412 (594)
                      -+-.||||.+++.+   ..+++ =+|+..=|.+.||
T Consensus       224 daCqGDSGGPi~~~---g~~G~vQ~GVvSwG~~~Cg  256 (413)
T COG5640         224 DACQGDSGGPIFHK---GEEGRVQRGVVSWGDGGCG  256 (413)
T ss_pred             ccccCCCCCceEEe---CCCccEEEeEEEecCCCCC
Confidence            45679999999964   33444 4888844444353


No 23 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=51.32  E-value=43  Score=32.72  Aligned_cols=29  Identities=28%  Similarity=0.321  Sum_probs=23.4

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEecC
Q 040739          378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGT  408 (594)
Q Consensus       378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs  408 (594)
                      -+.+||-||+++.  .......++||+.||+
T Consensus       143 ~t~~G~CG~~l~~--~~~~~~~i~GiHvaG~  171 (172)
T PF00548_consen  143 PTKPGMCGSPLVS--RIGGQGKIIGIHVAGN  171 (172)
T ss_dssp             EEETTGTTEEEEE--SCGGTTEEEEEEEEEE
T ss_pred             CCCCCccCCeEEE--eeccCccEEEEEeccC
Confidence            3778999999994  2334788999999997


No 24 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=49.27  E-value=61  Score=36.96  Aligned_cols=56  Identities=14%  Similarity=0.187  Sum_probs=37.6

Q ss_pred             EEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739          364 FLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN  436 (594)
Q Consensus       364 ~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~  436 (594)
                      +..+.+.+..    -...|-||-+++     +..+.+||+-+.--.   |+.|.     ...++.-|+..|+.
T Consensus       290 ~i~~~~qtd~----ai~~~nsg~~ll-----~~DG~~IgVn~~~~~---ri~~~-----~~iSf~~p~d~vl~  345 (473)
T KOG1320|consen  290 LISKINQTDA----AINPGNSGGPLL-----NLDGEVIGVNTRKVT---RIGFS-----HGISFKIPIDTVLV  345 (473)
T ss_pred             eeeeecccch----hhhcccCCCcEE-----EecCcEeeeeeeeeE---Eeecc-----ccceeccCchHhhh
Confidence            3445555555    578899999999     889999997766443   12122     34467777777774


No 25 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=44.97  E-value=64  Score=36.77  Aligned_cols=129  Identities=19%  Similarity=0.144  Sum_probs=67.8

Q ss_pred             EEEecCceeccCCCCCCCCCCCCCCcccccc-cCCCccccccccccCCCccccccccccccccCCCCCcccccc-ccccc
Q 040739          237 GFLTNRHVAVDLDYPNQKMFHPLPPTLGPGV-YLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMSTVTTSV-KGLGE  314 (594)
Q Consensus       237 yiLTNnHVla~~n~~~~~~~~pgdpILQPg~-~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s~vs~~I-~~lG~  314 (594)
                      -+|||.|+.+..+..           .++.+ ..|...+...|+   ...+..||.|++.+..   ...|-.-. ..+|+
T Consensus        98 ~lltn~~~v~~~~~~-----------~~v~v~~~gs~~k~~~~v---~~~~~~cd~Avv~Ie~---~~f~~~~~~~e~~~  160 (473)
T KOG1320|consen   98 KLLTNAHVVAPNNDH-----------KFVTVKKHGSPRKYKAFV---AAVFEECDLAVVYIES---EEFWKGMNPFELGD  160 (473)
T ss_pred             ceeecCccccccccc-----------cccccccCCCchhhhhhH---HHhhhcccceEEEEee---ccccCCCcccccCC
Confidence            599999999976642           11111 122211112221   2356778888876642   22332111 12222


Q ss_pred             cCcceeecccCcccccCCCeEEEE-eecccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeecc
Q 040739          315 IGDVKIVDLQSPISSLIGKQVVKV-GRSSGLTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILMKGE  393 (594)
Q Consensus       315 IG~vk~vdl~g~~~p~lG~~V~Kv-GRTTGlT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~  393 (594)
                      +             |.+...|+=+ |-+.=+|.|.|.++..+-....+..+..-||....      .+|.||-.++    
T Consensus       161 i-------------p~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~------~~~~s~ep~i----  217 (473)
T KOG1320|consen  161 I-------------PSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI------GPGNSGEPVI----  217 (473)
T ss_pred             C-------------cccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee------cCCccCCCeE----
Confidence            2             3333333333 55666899999999855333334444455555433      4678888888    


Q ss_pred             CCCCCceEEEEEe
Q 040739          394 NGEKPRPIGIIWG  406 (594)
Q Consensus       394 ~d~~~~aVGLlfG  406 (594)
                       ...+...|+.|-
T Consensus       218 -~g~d~~~gvA~l  229 (473)
T KOG1320|consen  218 -VGVDKVAGVAFL  229 (473)
T ss_pred             -EccccccceEEE
Confidence             234555565554


No 26 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=30.54  E-value=40  Score=34.21  Aligned_cols=42  Identities=19%  Similarity=0.170  Sum_probs=15.1

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHH
Q 040739          379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLL  435 (594)
Q Consensus       379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL  435 (594)
                      +.+|+||+.+|     ..+ .+||++.|....         ...++|-+..||..+.
T Consensus       144 T~~G~SGtp~y-----~g~-~vvGvH~G~~~~---------~~~~n~n~~spip~~~  185 (203)
T PF02122_consen  144 TSPGWSGTPYY-----SGK-NVVGVHTGSPSG---------SNRENNNRMSPIPPIP  185 (203)
T ss_dssp             --TT-TT-EEE------SS--EEEEEEEE----------------------------
T ss_pred             CCCCCCCCCeE-----ECC-CceEeecCcccc---------cccccccccccccccc
Confidence            78999999999     344 999999995110         1124566677776654


Done!