Query         007435
Match_columns 604
No_of_seqs    195 out of 433
Neff          4.5 
Searched_HMMs 46136
Date          Thu Mar 28 23:33:40 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/007435.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/007435hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF08192 Peptidase_S64:  Peptid  99.7 8.8E-17 1.9E-21  178.6  12.9  101  337-449   585-686 (695)
  2 PRK10139 serine endoprotease;   98.0 7.3E-05 1.6E-09   82.5  13.1   89  337-446   160-253 (455)
  3 TIGR02038 protease_degS peripl  97.9 0.00013 2.8E-09   77.9  13.9   94  337-448   146-243 (351)
  4 PRK10898 serine endoprotease;   97.8  0.0003 6.5E-09   75.2  13.6   93  338-448   147-244 (353)
  5 PRK10942 serine endoprotease;   97.8 0.00026 5.6E-09   78.6  13.0   89  337-445   181-273 (473)
  6 TIGR02037 degP_htrA_DO peripla  97.8 0.00021 4.6E-09   77.7  12.0   92  337-448   127-222 (428)
  7 PF13365 Trypsin_2:  Trypsin-li  97.1  0.0025 5.3E-08   55.1   7.9   21  388-413   100-120 (120)
  8 PF00089 Trypsin:  Trypsin;  In  96.9   0.002 4.4E-08   60.8   6.2  181  220-446    25-218 (220)
  9 PF00863 Peptidase_C4:  Peptida  95.6    0.12 2.7E-06   53.1  11.3   85  336-445   102-188 (235)
 10 cd00190 Tryp_SPc Trypsin-like   95.6    0.27 5.8E-06   46.7  13.0   30  388-419   180-209 (232)
 11 smart00020 Tryp_SPc Trypsin-li  95.5    0.24 5.3E-06   47.3  12.4   29  389-421   182-210 (229)
 12 COG0265 DegQ Trypsin-like seri  94.9     0.2 4.3E-06   53.1  10.8   90  338-447   142-236 (347)
 13 KOG1421 Predicted signaling-as  94.9   0.059 1.3E-06   62.3   7.0   44  389-449   213-256 (955)
 14 PF00944 Peptidase_S3:  Alphavi  94.7   0.043 9.2E-07   52.4   4.5   32  388-424   102-133 (158)
 15 COG3591 V8-like Glu-specific e  92.3     1.7 3.6E-05   45.4  11.8   27  389-420   200-226 (251)
 16 PF00947 Pico_P2A:  Picornaviru  90.1    0.34 7.3E-06   45.7   3.8   50  372-445    74-123 (127)
 17 PF05579 Peptidase_S32:  Equine  88.4    0.39 8.5E-06   50.5   3.2   28  388-420   204-231 (297)
 18 PF10459 Peptidase_S46:  Peptid  82.0     1.7 3.8E-05   51.2   4.9   52  389-449   630-684 (698)
 19 PF00949 Peptidase_S7:  Peptida  79.8     1.3 2.9E-05   42.0   2.4   26  389-419    94-119 (132)
 20 PF12381 Peptidase_C3G:  Tungro  60.5      11 0.00023   38.9   4.1   33  388-421   176-208 (231)
 21 PF01732 DUF31:  Putative pepti  60.4     6.3 0.00014   42.6   2.6   22  390-416   353-374 (374)
 22 PF00548 Peptidase_C3:  3C cyst  50.8     9.2  0.0002   37.4   1.8   29  389-419   144-172 (172)
 23 KOG1320 Serine protease [Postt  41.1 1.1E+02  0.0024   35.1   8.4   47  388-447   300-346 (473)
 24 PF02122 Peptidase_S39:  Peptid  34.9      31 0.00067   35.1   2.7   43  389-446   144-186 (203)
 25 KOG1320 Serine protease [Postt  32.1 1.3E+02  0.0028   34.5   7.2   45  349-399   173-217 (473)
 26 COG5480 Predicted integral mem  21.1      45 0.00097   32.3   0.9   18  119-136    42-59  (147)

No 1  
>PF08192 Peptidase_S64:  Peptidase family S64;  InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=99.70  E-value=8.8e-17  Score=178.57  Aligned_cols=101  Identities=30%  Similarity=0.438  Sum_probs=85.7

Q ss_pred             ccccCCCeEEEeeecccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccC-CCCCceEEEEE
Q 007435          337 INSLIGRQVMKVGRSSGLTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQN-GEKPRPVGIIW  415 (604)
Q Consensus       337 ~~p~lG~~V~KvGRTTGlT~G~Itai~V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~-d~~~~aVGLlf  415 (604)
                      .....|+.|+|+|||||+|+|+|+++++.|+.+ |...+.+++|.+.+...|+.+|||||+|+.+.++ ...-.+|||++
T Consensus       585 ~~~~~G~~VfK~GrTTgyT~G~lNg~klvyw~d-G~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlh  663 (695)
T PF08192_consen  585 SNLVPGMEVFKVGRTTGYTTGILNGIKLVYWAD-GKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLH  663 (695)
T ss_pred             hccCCCCeEEEecccCCccceEecceEEEEecC-CCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEee
Confidence            345679999999999999999999999888876 5567899999987778899999999999976555 34456999999


Q ss_pred             eccCCCcccccccCCCCcceEEeechHHHHHhcC
Q 007435          416 GGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLLE  449 (604)
Q Consensus       416 GGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~L~  449 (604)
                      +.++..           ..|.+|+||..||+.|+
T Consensus       664 sydge~-----------kqfglftPi~~il~rl~  686 (695)
T PF08192_consen  664 SYDGEQ-----------KQFGLFTPINEILDRLE  686 (695)
T ss_pred             ecCCcc-----------ceeeccCcHHHHHHHHH
Confidence            987643           48999999999999884


No 2  
>PRK10139 serine endoprotease; Provisional
Probab=97.97  E-value=7.3e-05  Score=82.54  Aligned_cols=89  Identities=20%  Similarity=0.192  Sum_probs=61.7

Q ss_pred             ccccCCCeEEEeeeccc----ceEEEEEEEEEE-EeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceE
Q 007435          337 INSLIGRQVMKVGRSSG----LTTGTVMAYALE-YNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPV  411 (604)
Q Consensus       337 ~~p~lG~~V~KvGRTTG----lT~G~Itai~V~-y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aV  411 (604)
                      ....+|+.|.-+|.--|    .|.|.|+++.-. +.. .+   +.++|.++.    --.+|.||..++     |.++++|
T Consensus       160 ~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~-~~---~~~~iqtda----~in~GnSGGpl~-----n~~G~vI  226 (455)
T PRK10139        160 DKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL-EG---LENFIQTDA----SINRGNSGGALL-----NLNGELI  226 (455)
T ss_pred             cccCCCCEEEEEecCCCCCCceEEEEEccccccccCC-CC---cceEEEECC----ccCCCCCcceEE-----CCCCeEE
Confidence            35688999999988655    478888887522 111 11   356677765    356899999999     8999999


Q ss_pred             EEEEeccCCCcccccccCCCCcceEEeechHHHHH
Q 007435          412 GIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLD  446 (604)
Q Consensus       412 GLlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~  446 (604)
                      |+..+--...        ++..+..|+-|+..+..
T Consensus       227 Gi~~~~~~~~--------~~~~gigfaIP~~~~~~  253 (455)
T PRK10139        227 GINTAILAPG--------GGSVGIGFAIPSNMART  253 (455)
T ss_pred             EEEEEEEcCC--------CCccceEEEEEhHHHHH
Confidence            9998743211        12246789999865544


No 3  
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.94  E-value=0.00013  Score=77.85  Aligned_cols=94  Identities=15%  Similarity=0.174  Sum_probs=64.2

Q ss_pred             ccccCCCeEEEeeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 007435          337 INSLIGRQVMKVGRSSG----LTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG  412 (604)
Q Consensus       337 ~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVG  412 (604)
                      ....+|+.|.-+|...|    +|.|.|+++.-......+   ..+++.+..    --.+|.||+.++     |.++++||
T Consensus       146 ~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~~~---~~~~iqtda----~i~~GnSGGpl~-----n~~G~vIG  213 (351)
T TIGR02038       146 RPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSSVG---RQNFIQTDA----AINAGNSGGALI-----NTNGELVG  213 (351)
T ss_pred             CccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCCCC---cceEEEECC----ccCCCCCcceEE-----CCCCeEEE
Confidence            35789999999999876    478988887522111112   134555554    256899999999     89999999


Q ss_pred             EEEeccCCCcccccccCCCCcceEEeechHHHHHhc
Q 007435          413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLL  448 (604)
Q Consensus       413 LlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~L  448 (604)
                      +..+.-...      .+....+..|+-|++.+...+
T Consensus       214 I~~~~~~~~------~~~~~~g~~faIP~~~~~~vl  243 (351)
T TIGR02038       214 INTASFQKG------GDEGGEGINFAIPIKLAHKIM  243 (351)
T ss_pred             EEeeeeccc------CCCCccceEEEecHHHHHHHH
Confidence            987642211      011234678999998877765


No 4  
>PRK10898 serine endoprotease; Provisional
Probab=97.79  E-value=0.0003  Score=75.24  Aligned_cols=93  Identities=18%  Similarity=0.245  Sum_probs=61.5

Q ss_pred             cccCCCeEEEeeeccc----ceEEEEEEEE-EEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 007435          338 NSLIGRQVMKVGRSSG----LTTGTVMAYA-LEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG  412 (604)
Q Consensus       338 ~p~lG~~V~KvGRTTG----lT~G~Itai~-V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVG  412 (604)
                      .+..|+.|.-+|.-.|    .|.|.|.+.. ..+... +.   .+++.++.    --.+|.||..++     |.++++||
T Consensus       147 ~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~~-~~---~~~iqtda----~i~~GnSGGPl~-----n~~G~vvG  213 (353)
T PRK10898        147 VPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSPT-GR---QNFLQTDA----SINHGNSGGALV-----NSLGELMG  213 (353)
T ss_pred             cCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCCc-cc---cceEEecc----ccCCCCCcceEE-----CCCCeEEE
Confidence            4689999999998766    5789998875 322221 21   23455544    357899999999     89999999


Q ss_pred             EEEeccCCCcccccccCCCCcceEEeechHHHHHhc
Q 007435          413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLL  448 (604)
Q Consensus       413 LlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~L  448 (604)
                      |..+.-...     ..+..+.+..|+-|+..+...+
T Consensus       214 I~~~~~~~~-----~~~~~~~g~~faIP~~~~~~~~  244 (353)
T PRK10898        214 INTLSFDKS-----NDGETPEGIGFAIPTQLATKIM  244 (353)
T ss_pred             EEEEEeccc-----CCCCcccceEEEEchHHHHHHH
Confidence            987542211     0011234678998887755544


No 5  
>PRK10942 serine endoprotease; Provisional
Probab=97.76  E-value=0.00026  Score=78.60  Aligned_cols=89  Identities=17%  Similarity=0.183  Sum_probs=59.5

Q ss_pred             ccccCCCeEEEeeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 007435          337 INSLIGRQVMKVGRSSG----LTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG  412 (604)
Q Consensus       337 ~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVG  412 (604)
                      ....+|+.|.-+|..-|    .|.|.|+++.-...   +...+.++|.++.    --.+|.||..++     |.++++||
T Consensus       181 ~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~---~~~~~~~~iqtda----~i~~GnSGGpL~-----n~~GeviG  248 (473)
T PRK10942        181 DALRVGDYTVAIGNPYGLGETVTSGIVSALGRSGL---NVENYENFIQTDA----AINRGNSGGALV-----NLNGELIG  248 (473)
T ss_pred             cccCCCCEEEEEcCCCCCCcceeEEEEEEeecccC---CcccccceEEecc----ccCCCCCcCccC-----CCCCeEEE
Confidence            35689999999998765    48888888762210   1112345666655    246899999999     89999999


Q ss_pred             EEEeccCCCcccccccCCCCcceEEeechHHHH
Q 007435          413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLL  445 (604)
Q Consensus       413 LlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL  445 (604)
                      |..+.-...       + +..+..|+-|+..+.
T Consensus       249 I~t~~~~~~-------g-~~~g~gfaIP~~~~~  273 (473)
T PRK10942        249 INTAILAPD-------G-GNIGIGFAIPSNMVK  273 (473)
T ss_pred             EEEEEEcCC-------C-CcccEEEEEEHHHHH
Confidence            998643211       0 112567888875443


No 6  
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.76  E-value=0.00021  Score=77.68  Aligned_cols=92  Identities=16%  Similarity=0.146  Sum_probs=61.9

Q ss_pred             ccccCCCeEEEeeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 007435          337 INSLIGRQVMKVGRSSG----LTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG  412 (604)
Q Consensus       337 ~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVG  412 (604)
                      ....+|+.|+-+|.--|    +|.|.|+++.-.....   ..+.+++.++.    --.+|.||+.++     |.++++||
T Consensus       127 ~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~~~---~~~~~~i~tda----~i~~GnSGGpl~-----n~~G~viG  194 (428)
T TIGR02037       127 DKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGLGI---GDYENFIQTDA----AINPGNSGGPLV-----NLRGEVIG  194 (428)
T ss_pred             CCCCCCCEEEEEECCCcCCCcEEEEEEEecccCccCC---CCccceEEECC----CCCCCCCCCceE-----CCCCeEEE
Confidence            35689999999998744    5788888875221111   11345666655    367899999999     89999999


Q ss_pred             EEEeccCCCcccccccCCCCcceEEeechHHHHHhc
Q 007435          413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLL  448 (604)
Q Consensus       413 LlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~L  448 (604)
                      |..+.-...        ++..++.|+-|+..+.+.+
T Consensus       195 I~~~~~~~~--------g~~~g~~faiP~~~~~~~~  222 (428)
T TIGR02037       195 INTAIYSPS--------GGNVGIGFAIPSNMAKNVV  222 (428)
T ss_pred             EEeEEEcCC--------CCccceEEEEEhHHHHHHH
Confidence            987643311        1123678888876665544


No 7  
>PF13365 Trypsin_2:  Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.07  E-value=0.0025  Score=55.14  Aligned_cols=21  Identities=33%  Similarity=0.445  Sum_probs=18.6

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEE
Q 007435          388 FDLEGDSGSLILLTGQNGEKPRPVGI  413 (604)
Q Consensus       388 fS~~GDSGSlVl~~~~~d~~~~aVGL  413 (604)
                      ...+|.||+.|+     +.++++|||
T Consensus       100 ~~~~G~SGgpv~-----~~~G~vvGi  120 (120)
T PF13365_consen  100 DTRPGSSGGPVF-----DSDGRVVGI  120 (120)
T ss_dssp             S-STTTTTSEEE-----ETTSEEEEE
T ss_pred             ccCCCcEeHhEE-----CCCCEEEeC
Confidence            588999999999     899999997


No 8  
>PF00089 Trypsin:  Trypsin;  InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.87  E-value=0.002  Score=60.78  Aligned_cols=181  Identities=17%  Similarity=0.205  Sum_probs=87.5

Q ss_pred             eeeeEEEEEeCCCCceeEEeecCcccccCCCCCccCCCCCCCccCCcccCCCc--cccceeeeeccccccccCCCCcccc
Q 007435          220 YGTLGAIVRSRTGNQQVGFLTNRHVAVDLDYPNQKMFHPLPPSLGPGVYLGAV--ERATSFITDDLWYGIFAGTNPETFV  297 (604)
Q Consensus       220 aGTLGcLV~D~~G~~~~yiLSNnHVLa~~n~~~q~~~~pg~pIlQPG~~DGG~--~r~~~fIpl~~~~~i~~g~~p~N~V  297 (604)
                      .--.|+||.++      ++||++||+...+...   -..+...++..  ++..  -...+++....|       ++.+ .
T Consensus        25 ~~C~G~li~~~------~vLTaahC~~~~~~~~---v~~g~~~~~~~--~~~~~~~~v~~~~~h~~~-------~~~~-~   85 (220)
T PF00089_consen   25 FFCTGTLISPR------WVLTAAHCVDGASDIK---VRLGTYSIRNS--DGSEQTIKVSKIIIHPKY-------DPST-Y   85 (220)
T ss_dssp             EEEEEEEEETT------EEEEEGGGHTSGGSEE---EEESESBTTST--TTTSEEEEEEEEEEETTS-------BTTT-T
T ss_pred             eeEeEEecccc------cccccccccccccccc---ccccccccccc--cccccccccccccccccc-------cccc-c
Confidence            34458888874      9999999999921100   00111111111  1111  112222222221       1222 3


Q ss_pred             cccccccccccccCCCCcccccccccccCcceeecccC-cccccCCCeEEEeeecccceEE---EEEEEEEEEeCC----
Q 007435          298 RADGAFIPFAEDFNLNNVTTSVKGVGEIGDVHIIDLQS-PINSLIGRQVMKVGRSSGLTTG---TVMAYALEYNDE----  369 (604)
Q Consensus       298 DaD~AlI~va~~~d~s~vs~~I~~vG~IG~~~~vdlqg-~~~p~lG~~V~KvGRTTGlT~G---~Itai~V~y~~~----  369 (604)
                      +.|+||+++.++++   ....+..   +      .+.. ...+..|+.+.-+|--.....+   .+....+.+-..    
T Consensus        86 ~~DiAll~L~~~~~---~~~~~~~---~------~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~  153 (220)
T PF00089_consen   86 DNDIALLKLDRPIT---FGDNIQP---I------CLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCR  153 (220)
T ss_dssp             TTSEEEEEESSSSE---HBSSBEE---S------BBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHH
T ss_pred             cccccccccccccc---ccccccc---c------cccccccccccccccccccccccccccccccccccccccccccccc
Confidence            55889999887522   1222211   1      1111 1234677777777766654444   444333222100    


Q ss_pred             ---CCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEeccCCCcccccccCCCCcceEEeechHHHHH
Q 007435          370 ---KGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLD  446 (604)
Q Consensus       370 ---~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~  446 (604)
                         ... ....++-+...+..-...|||||.++.     .++.++|++..+.. +        ..+....+|++|...++
T Consensus       154 ~~~~~~-~~~~~~c~~~~~~~~~~~g~sG~pl~~-----~~~~lvGI~s~~~~-c--------~~~~~~~v~~~v~~~~~  218 (220)
T PF00089_consen  154 SSYNDN-LTPNMICAGSSGSGDACQGDSGGPLIC-----NNNYLVGIVSFGEN-C--------GSPNYPGVYTRVSSYLD  218 (220)
T ss_dssp             HHTTTT-STTTEEEEETTSSSBGGTTTTTSEEEE-----TTEEEEEEEEEESS-S--------SBTTSEEEEEEGGGGHH
T ss_pred             cccccc-ccccccccccccccccccccccccccc-----ceeeecceeeecCC-C--------CCCCcCEEEEEHHHhhc
Confidence               000 011122222111223567999999993     33379999999833 3        11223588888887654


No 9  
>PF00863 Peptidase_C4:  Peptidase family C4 This family belongs to family C4 of the peptidase classification.;  InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ].  Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=95.58  E-value=0.12  Score=53.14  Aligned_cols=85  Identities=18%  Similarity=0.181  Sum_probs=53.5

Q ss_pred             cccccCCCeEEEeee--cccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEE
Q 007435          336 PINSLIGRQVMKVGR--SSGLTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGI  413 (604)
Q Consensus       336 ~~~p~lG~~V~KvGR--TTGlT~G~Itai~V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVGL  413 (604)
                      ...|..++.|+.+|-  .+....-+|+.-...|... +..+++.+|-|.+        ||=|..++.    -.++.+||+
T Consensus       102 FR~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p~~-~~~fWkHwIsTk~--------G~CG~PlVs----~~Dg~IVGi  168 (235)
T PF00863_consen  102 FRAPKEGERVCMVGSNFQEKSISSTVSESSWIYPEE-NSHFWKHWISTKD--------GDCGLPLVS----TKDGKIVGI  168 (235)
T ss_dssp             B----TT-EEEEEEEECSSCCCEEEEEEEEEEEEET-TTTEEEE-C---T--------T-TT-EEEE----TTT--EEEE
T ss_pred             ccCCCCCCEEEEEEEEEEcCCeeEEECCceEEeecC-CCCeeEEEecCCC--------CccCCcEEE----cCCCcEEEE
Confidence            368999999999996  7888888888888777643 3466888876544        999999995    578999999


Q ss_pred             EEeccCCCcccccccCCCCcceEEeechHHHH
Q 007435          414 IWGGTANRGRLKLKVGQPPVNWTSGVDLGRLL  445 (604)
Q Consensus       414 lfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL  445 (604)
                      |..++...            ..-||.|+..=+
T Consensus       169 Hsl~~~~~------------~~N~F~~f~~~f  188 (235)
T PF00863_consen  169 HSLTSNTS------------SRNYFTPFPDDF  188 (235)
T ss_dssp             EEEEETTT------------SSEEEEE--TTH
T ss_pred             EcCccCCC------------CeEEEEcCCHHH
Confidence            99988754            335777765433


No 10 
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=95.57  E-value=0.27  Score=46.67  Aligned_cols=30  Identities=27%  Similarity=0.381  Sum_probs=22.2

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEeccC
Q 007435          388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTA  419 (604)
Q Consensus       388 fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~  419 (604)
                      -.-+||||+.++..  .+....++|++..|..
T Consensus       180 ~~c~gdsGgpl~~~--~~~~~~lvGI~s~g~~  209 (232)
T cd00190         180 DACQGDSGGPLVCN--DNGRGVLVGIVSWGSG  209 (232)
T ss_pred             ccccCCCCCcEEEE--eCCEEEEEEEEehhhc
Confidence            45579999999952  1244789999988764


No 11 
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=95.48  E-value=0.24  Score=47.35  Aligned_cols=29  Identities=31%  Similarity=0.437  Sum_probs=21.5

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccCCC
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTANR  421 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~  421 (604)
                      ..+||||+.++...  + ..+++|+...|. .+
T Consensus       182 ~c~gdsG~pl~~~~--~-~~~l~Gi~s~g~-~C  210 (229)
T smart00020      182 ACQGDSGGPLVCND--G-RWVLVGIVSWGS-GC  210 (229)
T ss_pred             ccCCCCCCeeEEEC--C-CEEEEEEEEECC-CC
Confidence            45699999999421  1 458999998887 44


No 12 
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=94.91  E-value=0.2  Score=53.14  Aligned_cols=90  Identities=20%  Similarity=0.271  Sum_probs=61.4

Q ss_pred             cccCCCeEEEeeeccc----ceEEEEEEEEEE-EeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 007435          338 NSLIGRQVMKVGRSSG----LTTGTVMAYALE-YNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG  412 (604)
Q Consensus       338 ~p~lG~~V~KvGRTTG----lT~G~Itai~V~-y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVG  412 (604)
                      ...+|+.|.-+|-..|    +|.|.|+++.-. +.....   +.++|.+..    .-.+|.||..++     +.++.+||
T Consensus       142 ~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~~~~---~~~~IqtdA----ain~gnsGgpl~-----n~~g~~iG  209 (347)
T COG0265         142 KLRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGSAGG---YVNFIQTDA----AINPGNSGGPLV-----NIDGEVVG  209 (347)
T ss_pred             CcccCCEEEEecCCCCcccceeccEEeccccccccCccc---ccchhhccc----ccCCCCCCCceE-----cCCCcEEE
Confidence            3458999999998888    566766666532 322111   666676554    578999999999     89999999


Q ss_pred             EEEeccCCCcccccccCCCCcceEEeechHHHHHh
Q 007435          413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDL  447 (604)
Q Consensus       413 LlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~  447 (604)
                      +..+.-...       + +..+..|+-|+..+...
T Consensus       210 int~~~~~~-------~-~~~gigfaiP~~~~~~v  236 (347)
T COG0265         210 INTAIIAPS-------G-GSSGIGFAIPVNLVAPV  236 (347)
T ss_pred             EEEEEecCC-------C-CcceeEEEecHHHHHHH
Confidence            888766533       1 12235677777766553


No 13 
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=94.86  E-value=0.059  Score=62.27  Aligned_cols=44  Identities=23%  Similarity=0.238  Sum_probs=38.8

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccCCCcccccccCCCCcceEEeechHHHHHhcC
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLLE  449 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~L~  449 (604)
                      +.+|-|||.|+     +-.+++|.|.-||....            ...+|-||.+|+.+|-
T Consensus       213 tsggssgspVv-----~i~gyAVAl~agg~~ss------------as~ffLpLdrV~RaL~  256 (955)
T KOG1421|consen  213 TSGGSSGSPVV-----DIPGYAVALNAGGSISS------------ASDFFLPLDRVVRALR  256 (955)
T ss_pred             CCCCCCCCcee-----cccceEEeeecCCcccc------------cccceeeccchhhhhh
Confidence            67899999999     89999999999998754            5579999999998874


No 14 
>PF00944 Peptidase_S3:  Alphavirus core protein ;  InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=94.67  E-value=0.043  Score=52.35  Aligned_cols=32  Identities=34%  Similarity=0.519  Sum_probs=27.3

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEeccCCCccc
Q 007435          388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRL  424 (604)
Q Consensus       388 fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~  424 (604)
                      .+.+||||..++     |.++++||+++||.++.-|.
T Consensus       102 ~g~~GDSGRpi~-----DNsGrVVaIVLGG~neG~RT  133 (158)
T PF00944_consen  102 VGKPGDSGRPIF-----DNSGRVVAIVLGGANEGRRT  133 (158)
T ss_dssp             S-STTSTTEEEE-----STTSBEEEEEEEEEEETTEE
T ss_pred             CCCCCCCCCccC-----cCCCCEEEEEecCCCCCCce
Confidence            689999999999     99999999999999865333


No 15 
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=92.31  E-value=1.7  Score=45.41  Aligned_cols=27  Identities=33%  Similarity=0.559  Sum_probs=23.8

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccCC
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTAN  420 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~  420 (604)
                      ..||+|||.|+     ..+.+++|++++|..-
T Consensus       200 T~pG~SGSpv~-----~~~~~vigv~~~g~~~  226 (251)
T COG3591         200 TLPGSSGSPVL-----ISKDEVIGVHYNGPGA  226 (251)
T ss_pred             ccCCCCCCceE-----ecCceEEEEEecCCCc
Confidence            67899999999     6677999999999863


No 16 
>PF00947 Pico_P2A:  Picornavirus core protein 2A;  InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=90.09  E-value=0.34  Score=45.75  Aligned_cols=50  Identities=22%  Similarity=0.291  Sum_probs=38.3

Q ss_pred             eEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEeccCCCcccccccCCCCcceEEeechHHHH
Q 007435          372 ICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLL  445 (604)
Q Consensus       372 ~~~f~dqIIt~~~~~~fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL  445 (604)
                      ..+..++++..-    +++|||-|++++      -+..++||+-||..              +..-|.+|+.++
T Consensus        74 ~h~Q~~~l~g~G----p~~PGdCGg~L~------C~HGViGi~Tagg~--------------g~VaF~dir~~~  123 (127)
T PF00947_consen   74 KHYQYNLLIGEG----PAEPGDCGGILR------CKHGVIGIVTAGGE--------------GHVAFADIRDLL  123 (127)
T ss_dssp             SEEEECEEEEE-----SSSTT-TCSEEE------ETTCEEEEEEEEET--------------TEEEEEECCCGS
T ss_pred             hheecCceeecc----cCCCCCCCceeE------eCCCeEEEEEeCCC--------------ceEEEEechhhh
Confidence            466778877655    799999999998      46679999999987              457888887654


No 17 
>PF05579 Peptidase_S32:  Equine arteritis virus serine endopeptidase S32;  InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=88.38  E-value=0.39  Score=50.51  Aligned_cols=28  Identities=32%  Similarity=0.481  Sum_probs=22.5

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEeccCC
Q 007435          388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTAN  420 (604)
Q Consensus       388 fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~  420 (604)
                      |+.+|||||.|+     ..++.+||+|-|.+..
T Consensus       204 fT~~GDSGSPVV-----t~dg~liGVHTGSn~~  231 (297)
T PF05579_consen  204 FTGPGDSGSPVV-----TEDGDLIGVHTGSNKR  231 (297)
T ss_dssp             SS-GGCTT-EEE-----ETTC-EEEEEEEEETT
T ss_pred             EcCCCCCCCccC-----cCCCCEEEEEecCCCc
Confidence            899999999999     6889999999998763


No 18 
>PF10459 Peptidase_S46:  Peptidase S46;  InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains. 
Probab=82.01  E-value=1.7  Score=51.17  Aligned_cols=52  Identities=33%  Similarity=0.445  Sum_probs=39.8

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccCCC--cccccccCCCC-cceEEeechHHHHHhcC
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTANR--GRLKLKVGQPP-VNWTSGVDLGRLLDLLE  449 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~--g~~~~~~g~~p-~~~Tl~~pI~~VL~~L~  449 (604)
                      .-+|-|||.|+     |.++++|||.|-|+-..  |-..    -.| .+.++..+|.-||-.|+
T Consensus       630 itGGNSGSPvl-----N~~GeLVGl~FDgn~Esl~~D~~----fdp~~~R~I~VDiRyvL~~ld  684 (698)
T PF10459_consen  630 ITGGNSGSPVL-----NAKGELVGLAFDGNWESLSGDIA----FDPELNRTIHVDIRYVLWALD  684 (698)
T ss_pred             cCCCCCCCccC-----CCCceEEEEeecCchhhcccccc----cccccceeEEEEHHHHHHHHH
Confidence            67899999999     99999999999998643  1111    112 46789999999887763


No 19 
>PF00949 Peptidase_S7:  Peptidase S7, Flavivirus NS3 serine protease ;  InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA.  Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=79.77  E-value=1.3  Score=42.02  Aligned_cols=26  Identities=31%  Similarity=0.391  Sum_probs=21.6

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccC
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTA  419 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~  419 (604)
                      -.+|-|||+++     +.++++|||+++|-.
T Consensus        94 ~~~GsSGSpi~-----n~~g~ivGlYg~g~~  119 (132)
T PF00949_consen   94 FPKGSSGSPIF-----NQNGEIVGLYGNGVE  119 (132)
T ss_dssp             S-TTGTT-EEE-----ETTSCEEEEEEEEEE
T ss_pred             cCCCCCCCceE-----cCCCcEEEEEcccee
Confidence            34699999999     899999999999864


No 20 
>PF12381 Peptidase_C3G:  Tungro spherical virus-type peptidase;  InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=60.45  E-value=11  Score=38.95  Aligned_cols=33  Identities=33%  Similarity=0.427  Sum_probs=25.5

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEeccCCC
Q 007435          388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANR  421 (604)
Q Consensus       388 fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~  421 (604)
                      .+..||=||+++.- +...-.++||||.+|+.+.
T Consensus       176 ~t~~GdCGs~i~~~-~t~~~RKIvGiHVAG~~~~  208 (231)
T PF12381_consen  176 PTMNGDCGSPIVRN-NTQMVRKIVGIHVAGSANH  208 (231)
T ss_pred             CCcCCCccceeeEc-chhhhhhhheeeecccccc
Confidence            68999999999951 1223467999999999765


No 21 
>PF01732 DUF31:  Putative peptidase (DUF31);  InterPro: IPR022382  This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas. 
Probab=60.43  E-value=6.3  Score=42.64  Aligned_cols=22  Identities=36%  Similarity=0.673  Sum_probs=20.5

Q ss_pred             CCCCccceEEeeccCCCCCceEEEEEe
Q 007435          390 LEGDSGSLILLTGQNGEKPRPVGIIWG  416 (604)
Q Consensus       390 ~~GDSGSlVl~~~~~d~~~~aVGLlfG  416 (604)
                      .+|=|||+|+     ++++.+|||+||
T Consensus       353 ~gGaSGS~V~-----n~~~~lvGIy~g  374 (374)
T PF01732_consen  353 GGGASGSMVI-----NQNNELVGIYFG  374 (374)
T ss_pred             CCCCCcCeEE-----CCCCCEEEEeCC
Confidence            4799999999     999999999997


No 22 
>PF00548 Peptidase_C3:  3C cysteine protease (picornain 3C);  InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=50.81  E-value=9.2  Score=37.39  Aligned_cols=29  Identities=24%  Similarity=0.345  Sum_probs=23.5

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccC
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTA  419 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~  419 (604)
                      +.+||-||+++..  .+....++|||.||++
T Consensus       144 t~~G~CG~~l~~~--~~~~~~i~GiHvaG~G  172 (172)
T PF00548_consen  144 TKPGMCGSPLVSR--IGGQGKIIGIHVAGNG  172 (172)
T ss_dssp             EETTGTTEEEEES--CGGTTEEEEEEEEEES
T ss_pred             CCCCccCCeEEEe--eccCccEEEEEeccCC
Confidence            6689999999952  3447899999999973


No 23 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=41.14  E-value=1.1e+02  Score=35.07  Aligned_cols=47  Identities=15%  Similarity=0.190  Sum_probs=34.3

Q ss_pred             CCCCCCccceEEeeccCCCCCceEEEEEeccCCCcccccccCCCCcceEEeechHHHHHh
Q 007435          388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDL  447 (604)
Q Consensus       388 fS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~~  447 (604)
                      -...|-||-+++     +..+.+||+.|.--.   |..|..     ..++.-|+..|+..
T Consensus       300 ai~~~nsg~~ll-----~~DG~~IgVn~~~~~---ri~~~~-----~iSf~~p~d~vl~~  346 (473)
T KOG1320|consen  300 AINPGNSGGPLL-----NLDGEVIGVNTRKVT---RIGFSH-----GISFKIPIDTVLVI  346 (473)
T ss_pred             hhhcccCCCcEE-----EecCcEeeeeeeeeE---Eeeccc-----cceeccCchHhhhh
Confidence            577899999999     889999997776554   222222     45788888888763


No 24 
>PF02122 Peptidase_S39:  Peptidase S39;  InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, displays extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=34.88  E-value=31  Score=35.09  Aligned_cols=43  Identities=21%  Similarity=0.175  Sum_probs=15.1

Q ss_pred             CCCCCccceEEeeccCCCCCceEEEEEeccCCCcccccccCCCCcceEEeechHHHHH
Q 007435          389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLD  446 (604)
Q Consensus       389 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~g~~p~~~Tl~~pI~~VL~  446 (604)
                      +.+|+||+.++     ..+ .+||+|-|...         +...++|-+..||..+..
T Consensus       144 T~~G~SGtp~y-----~g~-~vvGvH~G~~~---------~~~~~n~n~~spip~~~g  186 (203)
T PF02122_consen  144 TSPGWSGTPYY-----SGK-NVVGVHTGSPS---------GSNRENNNRMSPIPPIPG  186 (203)
T ss_dssp             --TT-TT-EEE------SS--EEEEEEEE-----------------------------
T ss_pred             CCCCCCCCCeE-----ECC-CceEeecCccc---------cccccccccccccccccc
Confidence            78999999999     344 89999999511         112247777777776653


No 25 
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=32.14  E-value=1.3e+02  Score=34.52  Aligned_cols=45  Identities=16%  Similarity=0.175  Sum_probs=29.4

Q ss_pred             eecccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEE
Q 007435          349 GRSSGLTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLIL  399 (604)
Q Consensus       349 GRTTGlT~G~Itai~V~y~~~~G~~~f~dqIIt~~~~~~fS~~GDSGSlVl  399 (604)
                      |-+.=+|.|.|.++..+-...++..+..-||.+..      .+|.||-..+
T Consensus       173 gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~------~~~~s~ep~i  217 (473)
T KOG1320|consen  173 GDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI------GPGNSGEPVI  217 (473)
T ss_pred             CCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee------cCCccCCCeE
Confidence            56666899999999865333334555566665543      4577777777


No 26 
>COG5480 Predicted integral membrane protein [Function unknown]
Probab=21.09  E-value=45  Score=32.29  Aligned_cols=18  Identities=39%  Similarity=0.720  Sum_probs=15.5

Q ss_pred             CCeEEEEEeEEeeCCccC
Q 007435          119 RFSLGTAIGFRIRRGVLT  136 (604)
Q Consensus       119 pnVvGVGIGYKi~~G~~T  136 (604)
                      -++||++||||.++|-.|
T Consensus        42 ~~~v~vAiGyr~~ngwvt   59 (147)
T COG5480          42 QTLVGVAIGYRAKNGWVT   59 (147)
T ss_pred             hhhhheeeeeecCCCcee
Confidence            458999999999999665


Done!