Query 007572
Match_columns 597
No_of_seqs 375 out of 2409
Neff 6.2
Searched_HMMs 46136
Date Thu Mar 28 12:26:14 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/007572.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/007572hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 1.6E-29 3.4E-34 277.5 23.6 192 238-515 43-264 (455)
2 TIGR02038 protease_degS peripl 100.0 4.6E-29 1E-33 265.7 24.3 192 239-515 49-252 (351)
3 PRK10898 serine endoprotease; 100.0 6.7E-29 1.4E-33 264.6 23.2 193 238-515 48-253 (353)
4 PRK10942 serine endoprotease; 100.0 1E-27 2.2E-32 264.4 23.9 173 257-515 111-285 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 5.4E-27 1.2E-31 256.1 24.1 173 257-515 58-231 (428)
6 COG0265 DegQ Trypsin-like seri 99.9 1.2E-21 2.6E-26 208.4 20.6 191 239-514 37-245 (347)
7 PRK10139 serine endoprotease; 99.9 1.7E-21 3.7E-26 214.1 13.6 133 50-189 121-260 (455)
8 PRK10942 serine endoprotease; 99.8 4.9E-20 1.1E-24 203.5 13.4 133 50-189 142-281 (473)
9 TIGR02038 protease_degS peripl 99.8 2.7E-19 5.9E-24 190.9 13.8 132 50-189 108-248 (351)
10 PRK10898 serine endoprotease; 99.8 4.9E-19 1.1E-23 189.0 13.1 131 51-189 109-249 (353)
11 TIGR02037 degP_htrA_DO peripla 99.8 2.5E-18 5.3E-23 188.1 14.1 132 51-189 89-227 (428)
12 COG0265 DegQ Trypsin-like seri 99.7 2.1E-17 4.6E-22 175.9 12.0 132 51-189 103-242 (347)
13 PF13365 Trypsin_2: Trypsin-li 99.6 1.5E-14 3.3E-19 128.4 14.1 24 454-477 97-120 (120)
14 KOG1320 Serine protease [Postt 99.5 1.3E-13 2.8E-18 149.8 13.8 201 240-509 133-350 (473)
15 KOG1320 Serine protease [Postt 99.3 8.1E-12 1.7E-16 136.0 7.9 129 54-188 211-351 (473)
16 PF00089 Trypsin: Trypsin; In 99.3 4E-10 8.6E-15 109.6 18.9 124 372-504 86-218 (220)
17 cd00190 Tryp_SPc Trypsin-like 99.1 4.1E-09 8.9E-14 103.2 17.4 109 371-483 87-209 (232)
18 KOG1421 Predicted signaling-as 99.0 1.9E-09 4.1E-14 119.2 12.6 190 240-511 57-260 (955)
19 smart00020 Tryp_SPc Trypsin-li 98.9 6.3E-08 1.4E-12 95.2 19.2 108 371-482 87-208 (229)
20 COG3591 V8-like Glu-specific e 98.4 8.5E-06 1.9E-10 83.0 15.4 73 396-487 157-229 (251)
21 PF00863 Peptidase_C4: Peptida 98.3 1.3E-05 2.8E-10 81.0 13.7 103 371-500 80-185 (235)
22 PF13365 Trypsin_2: Trypsin-li 97.7 1.8E-05 3.9E-10 69.9 2.8 24 135-158 97-120 (120)
23 KOG3627 Trypsin [Amino acid tr 97.5 0.0091 2E-07 60.3 19.1 114 373-493 106-239 (256)
24 PF03761 DUF316: Domain of unk 97.2 0.02 4.4E-07 59.2 17.4 109 371-504 159-273 (282)
25 COG5640 Secreted trypsin-like 96.9 0.0086 1.9E-07 63.6 11.7 50 456-507 223-275 (413)
26 PF05579 Peptidase_S32: Equine 96.8 0.0083 1.8E-07 61.4 9.8 77 373-485 156-232 (297)
27 PF00089 Trypsin: Trypsin; In 96.4 0.054 1.2E-06 52.4 12.7 114 65-179 86-215 (220)
28 PF10459 Peptidase_S46: Peptid 95.4 0.028 6E-07 65.6 6.7 65 446-510 618-687 (698)
29 PF00548 Peptidase_C3: 3C cyst 95.1 0.26 5.7E-06 47.9 11.4 94 372-481 71-170 (172)
30 COG3591 V8-like Glu-specific e 94.0 0.27 5.8E-06 50.6 9.0 75 84-166 153-227 (251)
31 KOG1421 Predicted signaling-as 94.0 1.9 4.2E-05 49.6 16.2 153 348-514 578-731 (955)
32 PF02907 Peptidase_S29: Hepati 92.1 0.11 2.3E-06 48.3 2.5 45 135-180 101-146 (148)
33 PF00949 Peptidase_S7: Peptida 91.4 0.15 3.3E-06 47.4 2.8 29 457-485 93-121 (132)
34 PF02907 Peptidase_S29: Hepati 90.5 0.34 7.5E-06 45.0 4.2 44 457-503 104-147 (148)
35 PF10459 Peptidase_S46: Peptid 90.3 0.2 4.3E-06 58.7 3.1 30 132-161 623-652 (698)
36 PF00949 Peptidase_S7: Peptida 90.2 0.24 5.3E-06 46.1 3.0 35 131-165 86-120 (132)
37 PF00863 Peptidase_C4: Peptida 89.8 0.79 1.7E-05 46.8 6.5 106 64-180 80-189 (235)
38 PF08192 Peptidase_S64: Peptid 88.3 2.6 5.6E-05 48.6 9.9 119 370-509 540-688 (695)
39 PF00944 Peptidase_S3: Alphavi 88.2 0.56 1.2E-05 43.6 3.8 35 451-485 96-130 (158)
40 smart00020 Tryp_SPc Trypsin-li 87.7 3.7 8.1E-05 39.9 9.6 101 63-163 86-208 (229)
41 cd00190 Tryp_SPc Trypsin-like 87.5 2.9 6.4E-05 40.4 8.8 100 64-163 87-208 (232)
42 PF05580 Peptidase_S55: SpoIVB 86.6 0.58 1.3E-05 46.9 3.2 45 452-502 171-215 (218)
43 PF08192 Peptidase_S64: Peptid 80.6 9.1 0.0002 44.4 9.7 113 59-183 536-684 (695)
44 PF00947 Pico_P2A: Picornaviru 79.9 2.6 5.7E-05 38.9 4.3 29 132-161 80-108 (127)
45 PF00947 Pico_P2A: Picornaviru 78.4 4.2 9.1E-05 37.6 5.1 36 446-482 70-110 (127)
46 PF00944 Peptidase_S3: Alphavi 76.5 3.3 7.1E-05 38.7 3.9 32 132-163 96-127 (158)
47 PF09342 DUF1986: Domain of un 73.7 27 0.00059 36.0 9.9 31 250-281 21-51 (267)
48 PF01732 DUF31: Putative pepti 62.8 4.8 0.0001 43.7 2.3 24 456-479 350-373 (374)
49 TIGR02860 spore_IV_B stage IV 60.2 7.3 0.00016 42.9 3.1 45 452-502 351-395 (402)
50 PF03510 Peptidase_C24: 2C end 54.0 46 0.001 29.9 6.5 17 261-278 3-19 (105)
51 PF05416 Peptidase_C37: Southa 46.8 68 0.0015 35.6 7.6 29 457-485 499-530 (535)
52 PF00548 Peptidase_C3: 3C cyst 45.6 31 0.00068 33.4 4.6 90 65-161 71-169 (172)
53 PF05579 Peptidase_S32: Equine 42.2 17 0.00037 37.8 2.2 28 138-165 204-231 (297)
54 PF05580 Peptidase_S55: SpoIVB 39.7 23 0.00051 35.7 2.7 38 139-179 177-214 (218)
55 PF01732 DUF31: Putative pepti 30.7 36 0.00078 37.0 2.6 26 135-160 348-373 (374)
56 PF12381 Peptidase_C3G: Tungro 30.6 42 0.00092 34.0 2.8 54 450-509 169-228 (231)
57 PF00571 CBS: CBS domain CBS d 28.7 43 0.00094 25.1 2.1 22 459-480 27-48 (57)
58 PF02122 Peptidase_S39: Peptid 26.3 73 0.0016 32.0 3.7 49 451-503 137-185 (203)
59 PF13267 DUF4058: Protein of u 24.6 55 0.0012 33.9 2.5 26 555-582 124-150 (254)
60 PF03761 DUF316: Domain of unk 20.8 6.9E+02 0.015 25.5 9.9 92 64-166 159-258 (282)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=99.97 E-value=1.6e-29 Score=277.46 Aligned_cols=192 Identities=29% Similarity=0.521 Sum_probs=158.0
Q ss_pred chHHHhccCceEEEEeCC----------------------------CeEEEEEEEeC-CcEEEecccccCCCCCcccccc
Q 007572 238 PLPIQKALASVCLITIDD----------------------------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVS 288 (597)
Q Consensus 238 p~~i~~a~~SVV~I~~~~----------------------------~~~GSGflIs~-~G~ILTnaHVV~p~~~~~t~~~ 288 (597)
..+++++.||||.|.+.. .++||||+|++ +||||||+||++.
T Consensus 43 ~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~--------- 113 (455)
T PRK10139 43 APMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQ--------- 113 (455)
T ss_pred HHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCC---------
Confidence 357899999999996410 14799999985 7999999999971
Q ss_pred CCcccccccCCCCCCCCCCCccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEcCCCCceeEeeEEEEe
Q 007572 289 GWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYV 368 (597)
Q Consensus 289 g~~~~~~f~~~~~~~~~~~~~~~~q~~qtl~~k~i~i~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~ 368 (597)
...+.|++.+++. |+|++++
T Consensus 114 --------------------------------------------------------a~~i~V~~~dg~~---~~a~vvg- 133 (455)
T PRK10139 114 --------------------------------------------------------AQKISIQLNDGRE---FDAKLIG- 133 (455)
T ss_pred --------------------------------------------------------CCEEEEEECCCCE---EEEEEEE-
Confidence 2247888888776 9999999
Q ss_pred ecCCCcEEEEEEccCCCCccceecCCC-CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCC
Q 007572 369 CKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 447 (597)
Q Consensus 369 ~~~~~DLALLkl~~~~~~l~pi~l~~s-~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~ 447 (597)
.|+.+||||||++. +..+++++++++ .+++||.|++|||| +|+..+++.|+||+..+.... ...
T Consensus 134 ~D~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P-----~g~~~tvt~GivS~~~r~~~~---------~~~ 198 (455)
T PRK10139 134 SDDQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNP-----FGLGQTATSGIISALGRSGLN---------LEG 198 (455)
T ss_pred EcCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecC-----CCCCCceEEEEEccccccccC---------CCC
Confidence 67789999999985 457889999775 58999999999995 577789999999998763110 023
Q ss_pred cCeEEEEcccccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHHHHHHHHhcCCc
Q 007572 448 YPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEV 515 (597)
Q Consensus 448 ~~~~iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l~~~~~~gd~ 515 (597)
+..+|||||++++|||||||||.+|+||||+++.....++.. +++|+||++.++++++++.+.|.+
T Consensus 199 ~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~--gigfaIP~~~~~~v~~~l~~~g~v 264 (455)
T PRK10139 199 LENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSV--GIGFAIPSNMARTLAQQLIDFGEI 264 (455)
T ss_pred cceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCcc--ceEEEEEhHHHHHHHHHHhhcCcc
Confidence 456899999999999999999999999999999876654433 899999999999999999877665
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.97 E-value=4.6e-29 Score=265.74 Aligned_cols=192 Identities=26% Similarity=0.417 Sum_probs=155.7
Q ss_pred hHHHhccCceEEEEeC-----------CCeEEEEEEEeCCcEEEecccccCCCCCccccccCCcccccccCCCCCCCCCC
Q 007572 239 LPIQKALASVCLITID-----------DGVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHT 307 (597)
Q Consensus 239 ~~i~~a~~SVV~I~~~-----------~~~~GSGflIs~~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~ 307 (597)
.+++++.||||.|... ..+.||||+|+++||||||+||++.
T Consensus 49 ~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~---------------------------- 100 (351)
T TIGR02038 49 KAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKK---------------------------- 100 (351)
T ss_pred HHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCC----------------------------
Confidence 4789999999999762 1347999999999999999999961
Q ss_pred CccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEccCCCCc
Q 007572 308 GVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQL 387 (597)
Q Consensus 308 ~~~~~q~~qtl~~k~i~i~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~~~~~~l 387 (597)
...+.|++.+++. ++|++++ .|+.+||||||++. ..+
T Consensus 101 -------------------------------------~~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvlkv~~--~~~ 137 (351)
T TIGR02038 101 -------------------------------------ADQIVVALQDGRK---FEAELVG-SDPLTDLAVLKIEG--DNL 137 (351)
T ss_pred -------------------------------------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC--CCC
Confidence 1247788887766 9999999 67889999999986 347
Q ss_pred cceecCCC-CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCCCCCc
Q 007572 388 CPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGA 466 (597)
Q Consensus 388 ~pi~l~~s-~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~GnSGGP 466 (597)
+++++..+ .+++||.|+++||| +++..+++.|+|++..+.... ......++||||++++||||||
T Consensus 138 ~~~~l~~s~~~~~G~~V~aiG~P-----~~~~~s~t~GiIs~~~r~~~~---------~~~~~~~iqtda~i~~GnSGGp 203 (351)
T TIGR02038 138 PTIPVNLDRPPHVGDVVLAIGNP-----YNLGQTITQGIISATGRNGLS---------SVGRQNFIQTDAAINAGNSGGA 203 (351)
T ss_pred ceEeccCcCccCCCCEEEEEeCC-----CCCCCcEEEEEEEeccCcccC---------CCCcceEEEECCccCCCCCcce
Confidence 78888654 68999999999996 567789999999998764210 0123468999999999999999
Q ss_pred eecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHHHHHHHHhcCCc
Q 007572 467 VVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEV 515 (597)
Q Consensus 467 L~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l~~~~~~gd~ 515 (597)
|||.+|+||||+++.....++....+++|+||++.++++++++.+.+.+
T Consensus 204 l~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~ 252 (351)
T TIGR02038 204 LINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRV 252 (351)
T ss_pred EECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcc
Confidence 9999999999999876544333335899999999999999999876653
No 3
>PRK10898 serine endoprotease; Provisional
Probab=99.96 E-value=6.7e-29 Score=264.63 Aligned_cols=193 Identities=24% Similarity=0.382 Sum_probs=156.0
Q ss_pred chHHHhccCceEEEEeCC-----------CeEEEEEEEeCCcEEEecccccCCCCCccccccCCcccccccCCCCCCCCC
Q 007572 238 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH 306 (597)
Q Consensus 238 p~~i~~a~~SVV~I~~~~-----------~~~GSGflIs~~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~ 306 (597)
..+++++.++||.|.... ..+||||+|+++||||||+||++.
T Consensus 48 ~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~--------------------------- 100 (353)
T PRK10898 48 NQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIND--------------------------- 100 (353)
T ss_pred HHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCC---------------------------
Confidence 357899999999998721 158999999999999999999961
Q ss_pred CCccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEccCCCC
Q 007572 307 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ 386 (597)
Q Consensus 307 ~~~~~~q~~qtl~~k~i~i~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~~~~~~ 386 (597)
...+.|++.++.. ++|++++ .|+..||||||++. ..
T Consensus 101 --------------------------------------a~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvl~v~~--~~ 136 (353)
T PRK10898 101 --------------------------------------ADQIIVALQDGRV---FEALLVG-SDSLTDLAVLKINA--TN 136 (353)
T ss_pred --------------------------------------CCEEEEEeCCCCE---EEEEEEE-EcCCCCEEEEEEcC--CC
Confidence 1247788888766 9999998 57789999999985 35
Q ss_pred ccceecCCC-CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCCCCC
Q 007572 387 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 465 (597)
Q Consensus 387 l~pi~l~~s-~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~GnSGG 465 (597)
++++++.++ .+++|+.|+++||| +++..+++.|+|++..+..... .....+||||+++++|||||
T Consensus 137 l~~~~l~~~~~~~~G~~V~aiG~P-----~g~~~~~t~Giis~~~r~~~~~---------~~~~~~iqtda~i~~GnSGG 202 (353)
T PRK10898 137 LPVIPINPKRVPHIGDVVLAIGNP-----YNLGQTITQGIISATGRIGLSP---------TGRQNFLQTDASINHGNSGG 202 (353)
T ss_pred CCeeeccCcCcCCCCCEEEEEeCC-----CCcCCCcceeEEEeccccccCC---------ccccceEEeccccCCCCCcc
Confidence 778888765 48999999999996 4667889999999877642110 12235899999999999999
Q ss_pred ceecCCceEEEEEeeeecCCC-CcccCceEEEEehhHHHHHHHHHHhcCCc
Q 007572 466 AVVNLDGHMIGLVTSNARHGG-GTVIPHLNFSIPCAVLRPIFEFARDMQEV 515 (597)
Q Consensus 466 PL~n~~G~VIGIvss~~~~~~-g~~~p~lnFaIPi~~l~~~l~~~~~~gd~ 515 (597)
||+|.+|+||||+++.....+ +....+++|+||++.++++++++...|.+
T Consensus 203 Pl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~ 253 (353)
T PRK10898 203 ALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRV 253 (353)
T ss_pred eEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcc
Confidence 999999999999998765432 23335899999999999999998776663
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.96 E-value=1e-27 Score=264.38 Aligned_cols=173 Identities=31% Similarity=0.538 Sum_probs=144.7
Q ss_pred eEEEEEEEeC-CcEEEecccccCCCCCccccccCCcccccccCCCCCCCCCCCccccccccCCCCCCCcccccccccccc
Q 007572 257 VWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRA 335 (597)
Q Consensus 257 ~~GSGflIs~-~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~~~~~q~~qtl~~k~i~i~~~~~~~~~~ 335 (597)
++||||+|++ +||||||+||++
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~--------------------------------------------------------- 133 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVD--------------------------------------------------------- 133 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcC---------------------------------------------------------
Confidence 4799999996 599999999996
Q ss_pred cccccccCCceEEEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEccCCCCccceecCCC-CCCCCCeEEEEccCCCCCC
Q 007572 336 YKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPR 414 (597)
Q Consensus 336 ~~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~~~~~~l~pi~l~~s-~~~~Ge~V~vIGyPlf~~~ 414 (597)
+...++|++.+++. |+|+|++ .|+.+||||||++. +..++++++.++ .+++|+.|++||||
T Consensus 134 --------~a~~i~V~~~dg~~---~~a~vv~-~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P----- 195 (473)
T PRK10942 134 --------NATKIKVQLSDGRK---FDAKVVG-KDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNP----- 195 (473)
T ss_pred --------CCCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCC-----
Confidence 12247888888776 9999999 68889999999975 456889999765 59999999999995
Q ss_pred CCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCCCCCceecCCceEEEEEeeeecCCCCcccCceE
Q 007572 415 CGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLN 494 (597)
Q Consensus 415 ~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~ln 494 (597)
+|+..+++.|+|+++.+... ....+..+|||||++++|||||||||.+|+||||+++.....++.. +++
T Consensus 196 ~g~~~tvt~GiVs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~--g~g 264 (473)
T PRK10942 196 YGLGETVTSGIVSALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNI--GIG 264 (473)
T ss_pred CCCCcceeEEEEEEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcc--cEE
Confidence 57778999999999876311 0023456899999999999999999999999999999877655543 899
Q ss_pred EEEehhHHHHHHHHHHhcCCc
Q 007572 495 FSIPCAVLRPIFEFARDMQEV 515 (597)
Q Consensus 495 FaIPi~~l~~~l~~~~~~gd~ 515 (597)
|+||++.++++++++.+.+.+
T Consensus 265 faIP~~~~~~v~~~l~~~g~v 285 (473)
T PRK10942 265 FAIPSNMVKNLTSQMVEYGQV 285 (473)
T ss_pred EEEEHHHHHHHHHHHHhcccc
Confidence 999999999999999876664
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.95 E-value=5.4e-27 Score=256.08 Aligned_cols=173 Identities=29% Similarity=0.468 Sum_probs=143.0
Q ss_pred eEEEEEEEeCCcEEEecccccCCCCCccccccCCcccccccCCCCCCCCCCCccccccccCCCCCCCccccccccccccc
Q 007572 257 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY 336 (597)
Q Consensus 257 ~~GSGflIs~~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~~~~~q~~qtl~~k~i~i~~~~~~~~~~~ 336 (597)
++||||+|+++||||||+||++.
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~~--------------------------------------------------------- 80 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVDG--------------------------------------------------------- 80 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcCC---------------------------------------------------------
Confidence 57999999999999999999971
Q ss_pred ccccccCCceEEEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEccCCCCccceecCCC-CCCCCCeEEEEccCCCCCCC
Q 007572 337 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC 415 (597)
Q Consensus 337 ~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~~~~~~l~pi~l~~s-~~~~Ge~V~vIGyPlf~~~~ 415 (597)
...+.|++.+++. ++|++++ .|+.+||||||++. +..++++.++++ .+++|+.|+++||| +
T Consensus 81 --------~~~i~V~~~~~~~---~~a~vv~-~d~~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p-----~ 142 (428)
T TIGR02037 81 --------ADEITVTLSDGRE---FKAKLVG-KDPRTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNP-----F 142 (428)
T ss_pred --------CCeEEEEeCCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECC-----C
Confidence 1237777777665 8999998 57789999999986 357889999764 68999999999995 5
Q ss_pred CCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEE
Q 007572 416 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNF 495 (597)
Q Consensus 416 g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnF 495 (597)
++..+++.|+|++..+... ....+..++|||+++++|||||||||.+|+||||+++.....++. .+++|
T Consensus 143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~--~g~~f 211 (428)
T TIGR02037 143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGN--VGIGF 211 (428)
T ss_pred cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCc--cceEE
Confidence 7778999999998775310 012345689999999999999999999999999999887654443 38999
Q ss_pred EEehhHHHHHHHHHHhcCCc
Q 007572 496 SIPCAVLRPIFEFARDMQEV 515 (597)
Q Consensus 496 aIPi~~l~~~l~~~~~~gd~ 515 (597)
+||++.++++++++.+.+.+
T Consensus 212 aiP~~~~~~~~~~l~~~g~~ 231 (428)
T TIGR02037 212 AIPSNMAKNVVDQLIEGGKV 231 (428)
T ss_pred EEEhHHHHHHHHHHHhcCcC
Confidence 99999999999999886654
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.88 E-value=1.2e-21 Score=208.43 Aligned_cols=191 Identities=26% Similarity=0.432 Sum_probs=155.0
Q ss_pred hHHHhccCceEEEEeCC-----------------CeEEEEEEEeCCcEEEecccccCCCCCccccccCCcccccccCCCC
Q 007572 239 LPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDS 301 (597)
Q Consensus 239 ~~i~~a~~SVV~I~~~~-----------------~~~GSGflIs~~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~ 301 (597)
..++++.++||.|.... ..+||||+++++|||+||.||++.
T Consensus 37 ~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~---------------------- 94 (347)
T COG0265 37 TAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAG---------------------- 94 (347)
T ss_pred HHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCC----------------------
Confidence 47889999999887631 378999999999999999999971
Q ss_pred CCCCCCCccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEc
Q 007572 302 ASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLG 381 (597)
Q Consensus 302 ~~~~~~~~~~~q~~qtl~~k~i~i~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~ 381 (597)
..++.+.+.++.. +++++++ .|+..|+|+||++
T Consensus 95 -------------------------------------------a~~i~v~l~dg~~---~~a~~vg-~d~~~dlavlki~ 127 (347)
T COG0265 95 -------------------------------------------AEEITVTLADGRE---VPAKLVG-KDPISDLAVLKID 127 (347)
T ss_pred -------------------------------------------cceEEEEeCCCCE---EEEEEEe-cCCccCEEEEEec
Confidence 1236666665555 9999999 7888999999999
Q ss_pred cCCCCccceecCCC-CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccC
Q 007572 382 YIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHP 460 (597)
Q Consensus 382 ~~~~~l~pi~l~~s-~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~ 460 (597)
.... ++.+.+.++ .+++|+.++++|+| +++..+++.|+|+...+... .....+..+|||||++++
T Consensus 128 ~~~~-~~~~~~~~s~~l~vg~~v~aiGnp-----~g~~~tvt~Givs~~~r~~v--------~~~~~~~~~IqtdAain~ 193 (347)
T COG0265 128 GAGG-LPVIALGDSDKLRVGDVVVAIGNP-----FGLGQTVTSGIVSALGRTGV--------GSAGGYVNFIQTDAAINP 193 (347)
T ss_pred cCCC-CceeeccCCCCcccCCEEEEecCC-----CCcccceeccEEeccccccc--------cCcccccchhhcccccCC
Confidence 7322 677777765 58899999999995 57889999999999887411 010125568999999999
Q ss_pred CCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHHHHHHHHhcCC
Q 007572 461 GGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQE 514 (597)
Q Consensus 461 GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l~~~~~~gd 514 (597)
|+||||++|.+|++|||++......++.. +++|+||++.+.++++.+...|.
T Consensus 194 gnsGgpl~n~~g~~iGint~~~~~~~~~~--gigfaiP~~~~~~v~~~l~~~G~ 245 (347)
T COG0265 194 GNSGGPLVNIDGEVVGINTAIIAPSGGSS--GIGFAIPVNLVAPVLDELISKGK 245 (347)
T ss_pred CCCCCceEcCCCcEEEEEEEEecCCCCcc--eeEEEecHHHHHHHHHHHHHcCC
Confidence 99999999999999999999988765533 69999999999999999987553
No 7
>PRK10139 serine endoprotease; Provisional
Probab=99.86 E-value=1.7e-21 Score=214.10 Aligned_cols=133 Identities=21% Similarity=0.333 Sum_probs=114.9
Q ss_pred ccccccCCccccCC-CCccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCC
Q 007572 50 FAMEESSNLSLMSK-STSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR 126 (597)
Q Consensus 50 ~~~~~~~~~~~~~~-~~t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~ 126 (597)
+.+++..+|++++. ..+|||||||+. ..+++++++ |+.+++||+|+|||+|||+ ..++|.|+||++.+..
T Consensus 121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~------~~tvt~GivS~~~r~~ 193 (455)
T PRK10139 121 LNDGREFDAKLIGSDDQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNPFGL------GQTATSGIISALGRSG 193 (455)
T ss_pred ECCCCEEEEEEEEEcCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecCCCC------CCceEEEEEccccccc
Confidence 35677788999965 559999999973 357888888 6679999999999999994 7899999999987642
Q ss_pred ---CCCCceEEEecccCCCCcCceeecCCccEEEEEeeccccc-CCcceEEEeeHHHHHHHHHhhhc
Q 007572 127 ---STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 189 (597)
Q Consensus 127 ---~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~aip~~~i~~~~~~l~~ 189 (597)
..+..||||||+++||||||||||.+|+||||+++.++.. +..|++||||.+.++.++.+|+.
T Consensus 194 ~~~~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~ 260 (455)
T PRK10139 194 LNLEGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLID 260 (455)
T ss_pred cCCCCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhh
Confidence 2346799999999999999999999999999999999876 67899999999999999988864
No 8
>PRK10942 serine endoprotease; Provisional
Probab=99.82 E-value=4.9e-20 Score=203.54 Aligned_cols=133 Identities=20% Similarity=0.337 Sum_probs=114.8
Q ss_pred ccccccCCccccCCCC-ccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCC
Q 007572 50 FAMEESSNLSLMSKST-SRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR 126 (597)
Q Consensus 50 ~~~~~~~~~~~~~~~~-t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~ 126 (597)
+++++.-++++++.+. +||||||++. ..+++++++ ++.+++||+|++||+|||+ .++++.|+||++.+..
T Consensus 142 ~~dg~~~~a~vv~~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P~g~------~~tvt~GiVs~~~r~~ 214 (473)
T PRK10942 142 LSDGRKFDAKVVGKDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNPYGL------GETVTSGIVSALGRSG 214 (473)
T ss_pred ECCCCEEEEEEEEecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCCCCC------CcceeEEEEEEeeccc
Confidence 3566777889996655 9999999962 457888888 5679999999999999994 7899999999987642
Q ss_pred ---CCCCceEEEecccCCCCcCceeecCCccEEEEEeeccccc-CCcceEEEeeHHHHHHHHHhhhc
Q 007572 127 ---STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 189 (597)
Q Consensus 127 ---~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~aip~~~i~~~~~~l~~ 189 (597)
..+..||||||+++||||||||||.+|+||||+++.+... ++.+++|+||++.++.++++|..
T Consensus 215 ~~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~ 281 (473)
T PRK10942 215 LNVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVE 281 (473)
T ss_pred CCcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHh
Confidence 1356899999999999999999999999999999999877 77899999999999999998864
No 9
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.80 E-value=2.7e-19 Score=190.92 Aligned_cols=132 Identities=18% Similarity=0.377 Sum_probs=111.5
Q ss_pred ccccccCCccccCCCC-ccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCC
Q 007572 50 FAMEESSNLSLMSKST-SRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR 126 (597)
Q Consensus 50 ~~~~~~~~~~~~~~~~-t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~ 126 (597)
+.++...++++++.+. +||||||++. .+++++++ +..+++||+|++||+|||+ .++++.|+||+..+..
T Consensus 108 ~~dg~~~~a~vv~~d~~~DlAvlkv~~--~~~~~~~l~~s~~~~~G~~V~aiG~P~~~------~~s~t~GiIs~~~r~~ 179 (351)
T TIGR02038 108 LQDGRKFEAELVGSDPLTDLAVLKIEG--DNLPTIPVNLDRPPHVGDVVLAIGNPYNL------GQTITQGIISATGRNG 179 (351)
T ss_pred ECCCCEEEEEEEEecCCCCEEEEEecC--CCCceEeccCcCccCCCCEEEEEeCCCCC------CCcEEEEEEEeccCcc
Confidence 4667778899996654 9999999973 35777777 5579999999999999994 6899999999987642
Q ss_pred ---CCCCceEEEecccCCCCcCceeecCCccEEEEEeeccccc---CCcceEEEeeHHHHHHHHHhhhc
Q 007572 127 ---STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK---SGAEIQLVIPWEAIATACSDLLL 189 (597)
Q Consensus 127 ---~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~---~~~~l~~aip~~~i~~~~~~l~~ 189 (597)
..+..||||||.++||||||||||.+|+||||+++.+... ...+++|+||++.+++++.+++.
T Consensus 180 ~~~~~~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~ 248 (351)
T TIGR02038 180 LSSVGRQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIR 248 (351)
T ss_pred cCCCCcceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhh
Confidence 1345789999999999999999999999999999988654 23689999999999999988764
No 10
>PRK10898 serine endoprotease; Provisional
Probab=99.79 E-value=4.9e-19 Score=189.01 Aligned_cols=131 Identities=18% Similarity=0.341 Sum_probs=109.4
Q ss_pred cccccCCccccCCCC-ccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCC-
Q 007572 51 AMEESSNLSLMSKST-SRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR- 126 (597)
Q Consensus 51 ~~~~~~~~~~~~~~~-t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~- 126 (597)
.+++..++++++.+. +||||||++. .+++++++ +..+++||+|+++|+|||+ ..+++.|+||+..+..
T Consensus 109 ~dg~~~~a~vv~~d~~~DlAvl~v~~--~~l~~~~l~~~~~~~~G~~V~aiG~P~g~------~~~~t~Giis~~~r~~~ 180 (353)
T PRK10898 109 QDGRVFEALLVGSDSLTDLAVLKINA--TNLPVIPINPKRVPHIGDVVLAIGNPYNL------GQTITQGIISATGRIGL 180 (353)
T ss_pred CCCCEEEEEEEEEcCCCCEEEEEEcC--CCCCeeeccCcCcCCCCCEEEEEeCCCCc------CCCcceeEEEecccccc
Confidence 566677888886655 9999999973 46777777 5569999999999999994 6799999999876542
Q ss_pred --CCCCceEEEecccCCCCcCceeecCCccEEEEEeecccccC----CcceEEEeeHHHHHHHHHhhhc
Q 007572 127 --STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLLL 189 (597)
Q Consensus 127 --~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~~----~~~l~~aip~~~i~~~~~~l~~ 189 (597)
.....||||||+++||||||||+|.+|+||||+++.+...+ ..+++|+||.+.+.+++.+++.
T Consensus 181 ~~~~~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~ 249 (353)
T PRK10898 181 SPTGRQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIR 249 (353)
T ss_pred CCccccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhh
Confidence 22357899999999999999999999999999999886542 2589999999999999988764
No 11
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.77 E-value=2.5e-18 Score=188.10 Aligned_cols=132 Identities=24% Similarity=0.395 Sum_probs=113.0
Q ss_pred cccccCCccccCC-CCccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCC-
Q 007572 51 AMEESSNLSLMSK-STSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR- 126 (597)
Q Consensus 51 ~~~~~~~~~~~~~-~~t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~- 126 (597)
+++..-++++++. ..+||||||++.. .+++++.+ +..+++||+|+++|+|||+ ..+++.|+||+..+..
T Consensus 89 ~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~------~~~~t~G~vs~~~~~~~ 161 (428)
T TIGR02037 89 SDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGL------GQTVTSGIVSALGRSGL 161 (428)
T ss_pred CCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcC------CCcEEEEEEEecccCcc
Confidence 4566677888855 4599999999732 47888888 4579999999999999994 7899999999887642
Q ss_pred --CCCCceEEEecccCCCCcCceeecCCccEEEEEeeccccc-CCcceEEEeeHHHHHHHHHhhhc
Q 007572 127 --STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 189 (597)
Q Consensus 127 --~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~aip~~~i~~~~~~l~~ 189 (597)
..+..||+|||+++||||||||||.+|+||||+++.+... +..+++|+||++.++.+++++..
T Consensus 162 ~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~ 227 (428)
T TIGR02037 162 GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIE 227 (428)
T ss_pred CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHh
Confidence 3456789999999999999999999999999999998876 67899999999999999998764
No 12
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.72 E-value=2.1e-17 Score=175.86 Aligned_cols=132 Identities=27% Similarity=0.393 Sum_probs=113.9
Q ss_pred cccccCCccccCC-CCccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCC-C
Q 007572 51 AMEESSNLSLMSK-STSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-R 126 (597)
Q Consensus 51 ~~~~~~~~~~~~~-~~t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~-~ 126 (597)
.++..-+++++++ ..+|+|+||++... .++.+.+ ++.++.||+++|||+||| |.++++.|+||...+. -
T Consensus 103 ~dg~~~~a~~vg~d~~~dlavlki~~~~-~~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v 175 (347)
T COG0265 103 ADGREVPAKLVGKDPISDLAVLKIDGAG-GLPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGV 175 (347)
T ss_pred CCCCEEEEEEEecCCccCEEEEEeccCC-CCceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccc
Confidence 5677788899966 55999999998432 2666666 778999999999999999 5899999999999884 1
Q ss_pred ---CCCCceEEEecccCCCCcCceeecCCccEEEEEeeccccc-CCcceEEEeeHHHHHHHHHhhhc
Q 007572 127 ---STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 189 (597)
Q Consensus 127 ---~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~aip~~~i~~~~~~l~~ 189 (597)
..+..||||||+++|||+|||++|.+|++|||+++.+... +..|++|+||++.+..++.+++.
T Consensus 176 ~~~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~ 242 (347)
T COG0265 176 GSAGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELIS 242 (347)
T ss_pred cCcccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHH
Confidence 1256899999999999999999999999999999999988 57789999999999999998875
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.60 E-value=1.5e-14 Score=128.41 Aligned_cols=24 Identities=46% Similarity=0.904 Sum_probs=22.4
Q ss_pred EcccccCCCCCCceecCCceEEEE
Q 007572 454 TTAAVHPGGSGGAVVNLDGHMIGL 477 (597)
Q Consensus 454 tdAav~~GnSGGPL~n~~G~VIGI 477 (597)
+++.+.+|+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 14
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.51 E-value=1.3e-13 Score=149.83 Aligned_cols=201 Identities=21% Similarity=0.305 Sum_probs=140.5
Q ss_pred HHHhccCceEEEEeCC--------------CeEEEEEEEeCCcEEEecccccCCCCCccccccCCcccccccCCCCCCCC
Q 007572 240 PIQKALASVCLITIDD--------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSG 305 (597)
Q Consensus 240 ~i~~a~~SVV~I~~~~--------------~~~GSGflIs~~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~ 305 (597)
..++...++|.|+..+ ..-||||+++.+|+++||+||+.-. - ..++.+..
T Consensus 133 ~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~--------~---~~y~~~~~----- 196 (473)
T KOG1320|consen 133 VFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVE--------P---RIYAHSST----- 196 (473)
T ss_pred hhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEE--------e---ccccCCCc-----
Confidence 3456678888887621 1249999999999999999999510 0 00000000
Q ss_pred CCCccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEcCC--CCceeEeeEEEEeecCCCcEEEEEEccC
Q 007572 306 HTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHL--DPWIWCDAKIVYVCKGPLDVSLLQLGYI 383 (597)
Q Consensus 306 ~~~~~~~q~~qtl~~k~i~i~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~--~~~~w~~A~Vv~~~~~~~DLALLkl~~~ 383 (597)
.--.+.+....+ .. +.+.++. -++..|+|+++++..
T Consensus 197 --------------------------------------~l~~vqi~aa~~~~~s---~ep~i~g-~d~~~gvA~l~ik~~ 234 (473)
T KOG1320|consen 197 --------------------------------------VLLRVQIDAAIGPGNS---GEPVIVG-VDKVAGVAFLKIKTP 234 (473)
T ss_pred --------------------------------------ceeeEEEEEeecCCcc---CCCeEEc-cccccceEEEEEecC
Confidence 001244444444 44 6677776 367799999999753
Q ss_pred CCCccceecCCC-CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCC
Q 007572 384 PDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGG 462 (597)
Q Consensus 384 ~~~l~pi~l~~s-~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~Gn 462 (597)
..-++++++... .+..|+++.++|.| +++..+++.|+++...|........ ........+|||++++.|+
T Consensus 235 ~~i~~~i~~~~~~~~~~G~~~~a~~~~-----f~~~nt~t~g~vs~~~R~~~~lg~~----~g~~i~~~~qtd~ai~~~n 305 (473)
T KOG1320|consen 235 ENILYVIPLGVSSHFRTGVEVSAIGNG-----FGLLNTLTQGMVSGQLRKSFKLGLE----TGVLISKINQTDAAINPGN 305 (473)
T ss_pred CcccceeecceeeeecccceeeccccC-----ceeeeeeeecccccccccccccCcc----cceeeeeecccchhhhccc
Confidence 334777887654 58999999999994 6888999999999877643222111 1123446799999999999
Q ss_pred CCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHHHHHHH
Q 007572 463 SGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFA 509 (597)
Q Consensus 463 SGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l~~~ 509 (597)
||||++|.+|++||+++.+...-+-. -+++|++|.+.+..++.+.
T Consensus 306 sg~~ll~~DG~~IgVn~~~~~ri~~~--~~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 306 SGGPLLNLDGEVIGVNTRKVTRIGFS--HGISFKIPIDTVLVIVLRL 350 (473)
T ss_pred CCCcEEEecCcEeeeeeeeeEEeecc--ccceeccCchHhhhhhhhh
Confidence 99999999999999999886531111 1689999999999877666
No 15
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.26 E-value=8.1e-12 Score=135.96 Aligned_cols=129 Identities=21% Similarity=0.309 Sum_probs=108.1
Q ss_pred ccCCccccC-CCCccEEEEEEccCCCCCCeeec--CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCC----
Q 007572 54 ESSNLSLMS-KSTSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---- 126 (597)
Q Consensus 54 ~~~~~~~~~-~~~t~~A~l~i~~~~~~~~~~~~--s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~---- 126 (597)
.+..|.+++ +...|+|+||++....-+++++. +..++.|+++.++++||++ .|++++|+|+...|..
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~------~nt~t~g~vs~~~R~~~~lg 284 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGL------LNTLTQGMVSGQLRKSFKLG 284 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCcee------eeeeeecccccccccccccC
Confidence 778899999 56699999999733333666665 7799999999999999995 8999999999887642
Q ss_pred ----CCCCceEEEecccCCCCcCceeecCCccEEEEEeeccccc-CCcceEEEeeHHHHHHHHHhhh
Q 007572 127 ----STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL 188 (597)
Q Consensus 127 ----~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~aip~~~i~~~~~~l~ 188 (597)
.....++|||+++++||+|||++|.+|++||++++...+. -..+++|++|.+.+...+....
T Consensus 285 ~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 285 LETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred cccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 2345689999999999999999999999999999998876 5679999999999987655543
No 16
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.26 E-value=4e-10 Score=109.63 Aligned_cols=124 Identities=22% Similarity=0.325 Sum_probs=75.8
Q ss_pred CCcEEEEEEccC---CCCccceecCCC--CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCC
Q 007572 372 PLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS 446 (597)
Q Consensus 372 ~~DLALLkl~~~---~~~l~pi~l~~s--~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~ 446 (597)
..|+|||+|+.. .....|+.+... .+..|+.+.++||+.-.. .+....+....+.-+.... ..... ...
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~-~~~~~~~~~~~~~~~~~~~----c~~~~-~~~ 159 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD-NGYSSNLQSVTVPVVSRKT----CRSSY-NDN 159 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST-TSBTSBEEEEEEEEEEHHH----HHHHT-TTT
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc-ccccccccccccccccccc----ccccc-ccc
Confidence 589999999974 345677777652 368999999999985221 1111233333332222110 00000 001
Q ss_pred CcCeEEEEcc----cccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHH
Q 007572 447 AYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRP 504 (597)
Q Consensus 447 ~~~~~iqtdA----av~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~ 504 (597)
....++++.. ..+.|+|||||++.++.||||++.. ....... ...+.+++....+
T Consensus 160 ~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~--~~~v~~~v~~~~~ 218 (220)
T PF00089_consen 160 LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPN--YPGVYTRVSSYLD 218 (220)
T ss_dssp STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTT--SEEEEEEGGGGHH
T ss_pred cccccccccccccccccccccccccccceeeecceeeec-CCCCCCC--cCEEEEEHHHhhc
Confidence 2345677665 7889999999998877899999987 3222222 2477888876654
No 17
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.09 E-value=4.1e-09 Score=103.18 Aligned_cols=109 Identities=24% Similarity=0.255 Sum_probs=61.0
Q ss_pred CCCcEEEEEEccC---CCCccceecCCC--CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCccccc-
Q 007572 371 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQR- 444 (597)
Q Consensus 371 ~~~DLALLkl~~~---~~~l~pi~l~~s--~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~- 444 (597)
...|||||+|+.. .....|+.+... .+..|+.++++||+...........+....+.-+.... +......
T Consensus 87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~----C~~~~~~~ 162 (232)
T cd00190 87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAE----CKRAYSYG 162 (232)
T ss_pred CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHH----hhhhccCc
Confidence 3589999999862 223677777655 67889999999998633210111122222222111100 0000000
Q ss_pred CCCcCeEEEE-----cccccCCCCCCceecCC---ceEEEEEeeeec
Q 007572 445 NSAYPVMLET-----TAAVHPGGSGGAVVNLD---GHMIGLVTSNAR 483 (597)
Q Consensus 445 ~~~~~~~iqt-----dAav~~GnSGGPL~n~~---G~VIGIvss~~~ 483 (597)
......++.. ....++|+|||||+... +.++||++....
T Consensus 163 ~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 163 GTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG 209 (232)
T ss_pred ccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc
Confidence 0011223332 33467899999999653 899999987653
No 18
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.02 E-value=1.9e-09 Score=119.25 Aligned_cols=190 Identities=21% Similarity=0.339 Sum_probs=131.3
Q ss_pred HHHhccCceEEEEeC----------CCeEEEEEEEeC-CcEEEecccccCCCCCccccccCCcccccccCCCCCCCCCCC
Q 007572 240 PIQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG 308 (597)
Q Consensus 240 ~i~~a~~SVV~I~~~----------~~~~GSGflIs~-~G~ILTnaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~ 308 (597)
.+..+.++||.|+.. ....|+||++++ .||||||+||+.|--+.. .++|...+
T Consensus 57 ~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va--------~avf~n~e-------- 120 (955)
T KOG1421|consen 57 TIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVA--------SAVFDNHE-------- 120 (955)
T ss_pred hhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCcee--------EEEecccc--------
Confidence 467788999999873 234699999988 699999999998521110 02222211
Q ss_pred ccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEccCC---C
Q 007572 309 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIP---D 385 (597)
Q Consensus 309 ~~~~q~~qtl~~k~i~i~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~~~~---~ 385 (597)
. ++.-.+| .|+-+|+.+++.+... .
T Consensus 121 --------------------------------------e-------------~ei~pvy-rDpVhdfGf~r~dps~ir~s 148 (955)
T KOG1421|consen 121 --------------------------------------E-------------IEIYPVY-RDPVHDFGFFRYDPSTIRFS 148 (955)
T ss_pred --------------------------------------c-------------CCccccc-CCchhhcceeecChhhccee
Confidence 1 1122233 4677899999988521 1
Q ss_pred CccceecCCCCCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCCCCC
Q 007572 386 QLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 465 (597)
Q Consensus 386 ~l~pi~l~~s~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~GnSGG 465 (597)
....+.+.....++|..++++|+ ..+...++-.|-++++.+. .+.+..+....... ..+|.-+...+|.||.
T Consensus 149 ~vt~i~lap~~akvgseirvvgN-----DagEklsIlagflSrldr~-apdyg~~~yndfnT--fy~Qaasstsggssgs 220 (955)
T KOG1421|consen 149 IVTEICLAPELAKVGSEIRVVGN-----DAGEKLSILAGFLSRLDRN-APDYGEDTYNDFNT--FYIQAASSTSGGSSGS 220 (955)
T ss_pred eeeccccCccccccCCceEEecC-----CccceEEeehhhhhhccCC-Cccccccccccccc--eeeeehhcCCCCCCCC
Confidence 23334444555689999999999 5677788999999998875 34444333222222 2578888889999999
Q ss_pred ceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHHHHHHHHh
Q 007572 466 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARD 511 (597)
Q Consensus 466 PL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l~~~~~ 511 (597)
||+|..|..|.++......+ .-.|++|++-+.+.+..+++
T Consensus 221 pVv~i~gyAVAl~agg~~ss------as~ffLpLdrV~RaL~clq~ 260 (955)
T KOG1421|consen 221 PVVDIPGYAVALNAGGSISS------ASDFFLPLDRVVRALRCLQN 260 (955)
T ss_pred ceecccceEEeeecCCcccc------cccceeeccchhhhhhhhhc
Confidence 99999999999986543322 45799999999999988874
No 19
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=98.95 E-value=6.3e-08 Score=95.20 Aligned_cols=108 Identities=25% Similarity=0.298 Sum_probs=59.2
Q ss_pred CCCcEEEEEEccC---CCCccceecCCC--CCCCCCeEEEEccCCCCCC-CCCCCeeEeeEEeeeeeccC-CCCCCcccc
Q 007572 371 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPR-CGLSPSVSSGVVAKVVKANL-PSYGQSTLQ 443 (597)
Q Consensus 371 ~~~DLALLkl~~~---~~~l~pi~l~~s--~~~~Ge~V~vIGyPlf~~~-~g~~~svt~GiVS~v~~~~~-~~~~~~~~~ 443 (597)
...|||||+|+.. ...+.|+.+... .+..|+.+.+.||+..... ......+....+.-+....- ..+...
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~--- 163 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGG--- 163 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccc---
Confidence 4689999999862 234667777554 5778999999999853210 01111222222221111000 000000
Q ss_pred cCCCcCeEEEE-----cccccCCCCCCceecCCc--eEEEEEeeee
Q 007572 444 RNSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTSNA 482 (597)
Q Consensus 444 ~~~~~~~~iqt-----dAav~~GnSGGPL~n~~G--~VIGIvss~~ 482 (597)
......++.. ....++|+|||||+...+ .++||++...
T Consensus 164 -~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 164 -GAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred -cccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0001112222 355788999999996543 8999998875
No 20
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.38 E-value=8.5e-06 Score=83.00 Aligned_cols=73 Identities=26% Similarity=0.269 Sum_probs=50.6
Q ss_pred CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEEEEcccccCCCCCCceecCCceEE
Q 007572 396 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI 475 (597)
Q Consensus 396 ~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav~~GnSGGPL~n~~G~VI 475 (597)
..+.++.+.++|||.-.+..+. .-...+.|..+. ...+++++.+.+|+||.||++.+.++|
T Consensus 157 ~~~~~d~i~v~GYP~dk~~~~~-~~e~t~~v~~~~------------------~~~l~y~~dT~pG~SGSpv~~~~~~vi 217 (251)
T COG3591 157 EAKANDRITVIGYPGDKPNIGT-MWESTGKVNSIK------------------GNKLFYDADTLPGSSGSPVLISKDEVI 217 (251)
T ss_pred ccccCceeEEEeccCCCCccee-EeeecceeEEEe------------------cceEEEEecccCCCCCCceEecCceEE
Confidence 4689999999999852221111 122333333322 126889999999999999999988999
Q ss_pred EEEeeeecCCCC
Q 007572 476 GLVTSNARHGGG 487 (597)
Q Consensus 476 GIvss~~~~~~g 487 (597)
|+.+.+....++
T Consensus 218 gv~~~g~~~~~~ 229 (251)
T COG3591 218 GVHYNGPGANGG 229 (251)
T ss_pred EEEecCCCcccc
Confidence 999987664433
No 21
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.27 E-value=1.3e-05 Score=81.00 Aligned_cols=103 Identities=17% Similarity=0.292 Sum_probs=50.9
Q ss_pred CCCcEEEEEEccCCCCcccee--cCCCCCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCc
Q 007572 371 GPLDVSLLQLGYIPDQLCPID--ADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAY 448 (597)
Q Consensus 371 ~~~DLALLkl~~~~~~l~pi~--l~~s~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~ 448 (597)
+..||.++|+.. +++|.+ +.+..|+.||.|++||. .+.. .-..-.||....+. + ...
T Consensus 80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~-----~fq~--k~~~s~vSesS~i~-p----------~~~ 138 (235)
T PF00863_consen 80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGS-----NFQE--KSISSTVSESSWIY-P----------EEN 138 (235)
T ss_dssp TCSSEEEEE--T---TS----S---B----TT-EEEEEEE-----ECSS--CCCEEEEEEEEEEE-E----------ETT
T ss_pred CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEE-----EEEc--CCeeEEECCceEEe-e----------cCC
Confidence 569999999975 566654 34557999999999998 2221 11111233221110 0 122
Q ss_pred CeEEEEcccccCCCCCCceec-CCceEEEEEeeeecCCCCcccCceEEEEehh
Q 007572 449 PVMLETTAAVHPGGSGGAVVN-LDGHMIGLVTSNARHGGGTVIPHLNFSIPCA 500 (597)
Q Consensus 449 ~~~iqtdAav~~GnSGGPL~n-~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~ 500 (597)
..+..+-.+...|+-|.||++ .+|++|||++...... ..||..|+.
T Consensus 139 ~~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~------~~N~F~~f~ 185 (235)
T PF00863_consen 139 SHFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTS------SRNYFTPFP 185 (235)
T ss_dssp TTEEEE-C---TT-TT-EEEETTT--EEEEEEEEETTT------SSEEEEE--
T ss_pred CCeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCCC------CeEEEEcCC
Confidence 346777778889999999997 5799999999765543 357777764
No 22
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.73 E-value=1.8e-05 Score=69.90 Aligned_cols=24 Identities=46% Similarity=0.855 Sum_probs=22.5
Q ss_pred EecccCCCCcCceeecCCccEEEE
Q 007572 135 ADIRCLPGMEGGPVFGEHAHFVGI 158 (597)
Q Consensus 135 tDa~~~pG~~GG~v~~~~g~liGi 158 (597)
+|+.+.||+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 23
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=97.50 E-value=0.0091 Score=60.29 Aligned_cols=114 Identities=21% Similarity=0.268 Sum_probs=61.6
Q ss_pred CcEEEEEEcc---CCCCccceecCCCC----CCCCCeEEEEccCCCCCC-CCCCCeeE---eeEEeeeeeccCCCCCCcc
Q 007572 373 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPR-CGLSPSVS---SGVVAKVVKANLPSYGQST 441 (597)
Q Consensus 373 ~DLALLkl~~---~~~~l~pi~l~~s~----~~~Ge~V~vIGyPlf~~~-~g~~~svt---~GiVS~v~~~~~~~~~~~~ 441 (597)
.|||||+++. ......|+.+.... ...+..+++.|||..... ......+. .-++....+. ..
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~-------~~ 178 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECR-------RA 178 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhc-------cc
Confidence 8999999986 23456666664322 344588999999853221 01112222 1122211111 00
Q ss_pred cccC-CCcCeEEEEc-----ccccCCCCCCceecCC---ceEEEEEeeeecCCCCcccCce
Q 007572 442 LQRN-SAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTSNARHGGGTVIPHL 493 (597)
Q Consensus 442 ~~~~-~~~~~~iqtd-----Aav~~GnSGGPL~n~~---G~VIGIvss~~~~~~g~~~p~l 493 (597)
.... .....++.+. ...+.|+|||||+-.. ..++||+++.....+....|+.
T Consensus 179 ~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~v 239 (256)
T KOG3627|consen 179 YGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGV 239 (256)
T ss_pred ccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeE
Confidence 0000 0112345554 2357799999999543 6999999998664333334555
No 24
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.17 E-value=0.02 Score=59.18 Aligned_cols=109 Identities=15% Similarity=0.107 Sum_probs=62.9
Q ss_pred CCCcEEEEEEccC-CCCccceecCCC--CCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCC
Q 007572 371 GPLDVSLLQLGYI-PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 447 (597)
Q Consensus 371 ~~~DLALLkl~~~-~~~l~pi~l~~s--~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~ 447 (597)
..++++||.++.. .....|+-+.++ ....|+.+.+.|+. .. ..+....+.-... ..
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~-----~~--~~~~~~~~~i~~~--------------~~ 217 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN-----ST--GKLKHRKLKITNC--------------TK 217 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC-----CC--CeEEEEEEEEEEe--------------ec
Confidence 5789999999973 245555555443 46789999998881 11 1122222221111 01
Q ss_pred cCeEEEEcccccCCCCCCcee---cCCceEEEEEeeeecCCCCcccCceEEEEehhHHHH
Q 007572 448 YPVMLETTAAVHPGGSGGAVV---NLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRP 504 (597)
Q Consensus 448 ~~~~iqtdAav~~GnSGGPL~---n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~ 504 (597)
....+.+......|++||||+ |.+-.||||.+.+...... +..+...+..+++
T Consensus 218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~----~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNK----NNSYFFNVSWYQD 273 (282)
T ss_pred cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccc----cccEEEEHHHhhh
Confidence 223455555667999999999 3345689998766433211 1244555555443
No 25
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=96.92 E-value=0.0086 Score=63.55 Aligned_cols=50 Identities=24% Similarity=0.417 Sum_probs=34.9
Q ss_pred ccccCCCCCCceec--CCceE-EEEEeeeecCCCCcccCceEEEEehhHHHHHHH
Q 007572 456 AAVHPGGSGGAVVN--LDGHM-IGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFE 507 (597)
Q Consensus 456 Aav~~GnSGGPL~n--~~G~V-IGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l~ 507 (597)
...+.|+||||+|- .+|++ +||+++.....++..+|++--. ++.....+.
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~--vsny~~WI~ 275 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTN--VSNYQDWIA 275 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEe--hhHHHHHHH
Confidence 35678999999993 35877 9999999887777777764333 444444333
No 26
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=96.76 E-value=0.0083 Score=61.38 Aligned_cols=77 Identities=25% Similarity=0.320 Sum_probs=40.4
Q ss_pred CcEEEEEEccCCCCccceecCCCCCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeeeeeccCCCCCCcccccCCCcCeEE
Q 007572 373 LDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVML 452 (597)
Q Consensus 373 ~DLALLkl~~~~~~l~pi~l~~s~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~i 452 (597)
-|.|.-.+++.+...+.+++... ..|..-... ..-+..|.|..-. .+
T Consensus 156 GDfA~~~~~~~~G~~P~~k~a~~--~~GrAyW~t-----------~tGvE~G~ig~~~--------------------~~ 202 (297)
T PF05579_consen 156 GDFAEADITNWPGAAPKYKFAQN--YTGRAYWLT-----------STGVEPGFIGGGG--------------------AV 202 (297)
T ss_dssp TTEEEEEETTS-S---B--B-TT---SEEEEEEE-----------TTEEEEEEEETTE--------------------EE
T ss_pred CcEEEEECCCCCCCCCceeecCC--cccceEEEc-----------ccCcccceecCce--------------------EE
Confidence 68888888665666677666522 222221111 1224555554322 23
Q ss_pred EEcccccCCCCCCceecCCceEEEEEeeeecCC
Q 007572 453 ETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 485 (597)
Q Consensus 453 qtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~ 485 (597)
.. .++|+||+|++..+|.+|||++..-+.+
T Consensus 203 ~f---T~~GDSGSPVVt~dg~liGVHTGSn~~G 232 (297)
T PF05579_consen 203 CF---TGPGDSGSPVVTEDGDLIGVHTGSNKRG 232 (297)
T ss_dssp ES---S-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred EE---cCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence 32 3689999999999999999999775543
No 27
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.39 E-value=0.054 Score=52.35 Aligned_cols=114 Identities=18% Similarity=0.150 Sum_probs=71.5
Q ss_pred CccEEEEEEccC---CCCCCeeecCC---CCCCCCeEEEEeCCCCCCCC-CcccCceEEeEEeec--cC--CCCCCCceE
Q 007572 65 TSRVAILGVSSY---LKDLPNIALTP---LNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLL 133 (597)
Q Consensus 65 ~t~~A~l~i~~~---~~~~~~~~~s~---~~~~G~~v~aigsPfg~~~p-~~f~~~~s~Givs~~--~~--~~~~~~~~i 133 (597)
..||||||++.. .....++.+.. .++.|+.+.++|.+...... .-........+++.. .. ........+
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 589999999743 23444555532 35899999999988853211 001122344444432 11 111234567
Q ss_pred EEec----ccCCCCcCceeecCCccEEEEEeeccccc-CCcceEEEeeHHH
Q 007572 134 MADI----RCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEA 179 (597)
Q Consensus 134 ~tDa----~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~aip~~~ 179 (597)
.++. ...+|++||||+..++.||||++.. ..+ ......+..++..
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~ 215 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSS 215 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGG
T ss_pred cccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHH
Confidence 7776 7889999999999998999999987 333 3333566666543
No 28
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.40 E-value=0.028 Score=65.59 Aligned_cols=65 Identities=20% Similarity=0.289 Sum_probs=45.8
Q ss_pred CCcCeEEEEcccccCCCCCCceecCCceEEEEEeeeecCCCCccc---C--ceEEEEehhHHHHHHHHHH
Q 007572 446 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVI---P--HLNFSIPCAVLRPIFEFAR 510 (597)
Q Consensus 446 ~~~~~~iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~---p--~lnFaIPi~~l~~~l~~~~ 510 (597)
...+.-+.+|+-+.+||||+|++|.+|+|||++.-..-.+-...+ | .-+-.+-+..+..+++...
T Consensus 618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred CCeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 456777899999999999999999999999998754332211100 2 3355666667777776653
No 29
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=95.09 E-value=0.26 Score=47.85 Aligned_cols=94 Identities=20% Similarity=0.310 Sum_probs=53.6
Q ss_pred CCcEEEEEEccCCCCccceec--CCCCCCCCCeEEEEccCCCCCCCCCCCee-EeeEEeeeeeccCCCCCCcccccCCCc
Q 007572 372 PLDVSLLQLGYIPDQLCPIDA--DFGQPSLGSAAYVIGHGLFGPRCGLSPSV-SSGVVAKVVKANLPSYGQSTLQRNSAY 448 (597)
Q Consensus 372 ~~DLALLkl~~~~~~l~pi~l--~~s~~~~Ge~V~vIGyPlf~~~~g~~~sv-t~GiVS~v~~~~~~~~~~~~~~~~~~~ 448 (597)
..||++++++.. ..++-++- ........+.+.++=.. ...+.+ ..+.+.....+.. +....
T Consensus 71 ~~Dl~~v~l~~~-~kfrDIrk~~~~~~~~~~~~~l~v~~~------~~~~~~~~v~~v~~~~~i~~---------~g~~~ 134 (172)
T PF00548_consen 71 DTDLTLVKLPRN-PKFRDIRKFFPESIPEYPECVLLVNST------KFPRMIVEVGFVTNFGFINL---------SGTTT 134 (172)
T ss_dssp EEEEEEEEEESS-S-B--GGGGSBSSGGTEEEEEEEEESS------SSTCEEEEEEEEEEEEEEEE---------TTEEE
T ss_pred ceeEEEEEccCC-cccCchhhhhccccccCCCcEEEEECC------CCccEEEEEEEEeecCcccc---------CCCEe
Confidence 589999999762 22222211 11112344444444332 222322 4444444443210 01234
Q ss_pred CeEEEEcccccCCCCCCceec---CCceEEEEEeee
Q 007572 449 PVMLETTAAVHPGGSGGAVVN---LDGHMIGLVTSN 481 (597)
Q Consensus 449 ~~~iqtdAav~~GnSGGPL~n---~~G~VIGIvss~ 481 (597)
+.++.+.++..+|+.||||+. ..+++|||+.+.
T Consensus 135 ~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 135 PRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred eEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 568899999999999999994 258999999874
No 30
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=93.98 E-value=0.27 Score=50.57 Aligned_cols=75 Identities=23% Similarity=0.237 Sum_probs=58.7
Q ss_pred ecCCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCCCCCCceEEEecccCCCCcCceeecCCccEEEEEeecc
Q 007572 84 ALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPL 163 (597)
Q Consensus 84 ~~s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l 163 (597)
......+.+|.|.++|.|-.- |..+....+.+.|-.... .+++-|+-..||+||.||++.+.+|||+.+...
T Consensus 153 ~~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~------~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~ 224 (251)
T COG3591 153 NTASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKG------NKLFYDADTLPGSSGSPVLISKDEVIGVHYNGP 224 (251)
T ss_pred ccccccccCceeEEEeccCCC--CcceeEeeecceeEEEec------ceEEEEecccCCCCCCceEecCceEEEEEecCC
Confidence 345579999999999999885 323445555565544432 368889999999999999999999999999998
Q ss_pred ccc
Q 007572 164 RQK 166 (597)
Q Consensus 164 ~~~ 166 (597)
...
T Consensus 225 ~~~ 227 (251)
T COG3591 225 GAN 227 (251)
T ss_pred Ccc
Confidence 865
No 31
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=93.97 E-value=1.9 Score=49.61 Aligned_cols=153 Identities=14% Similarity=0.100 Sum_probs=83.9
Q ss_pred EEEEEcCCCCceeEeeEEEEeecCCCcEEEEEEccCCCCccceecCCCCCCCCCeEEEEccCCCCCCCCCCCeeEeeEEe
Q 007572 348 IRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVA 427 (597)
Q Consensus 348 i~V~l~~~~~~~w~~A~Vv~~~~~~~DLALLkl~~~~~~l~pi~l~~s~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS 427 (597)
++|+..+... ..|.+.+ -++...+|.+|-+. .....+.+.+..+..||++...|+-.-..-.....+++.-.+-
T Consensus 578 ~~vt~~dS~~---i~a~~~f-L~~t~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~ 651 (955)
T KOG1421|consen 578 QRVTEADSDG---IPANVSF-LHPTENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVV 651 (955)
T ss_pred eEEeeccccc---ccceeeE-ecCccceeEeccCh--hHhhhhccceeeEecCCceeEecccccchhhcccceeeeeEEE
Confidence 5566665555 6788887 36668888888874 3334555656668899999999993100000112344432221
Q ss_pred eeeeccCCCCCCcccccCCCcCeEEEEcccc-cCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHHHHH
Q 007572 428 KVVKANLPSYGQSTLQRNSAYPVMLETTAAV-HPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIF 506 (597)
Q Consensus 428 ~v~~~~~~~~~~~~~~~~~~~~~~iqtdAav-~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~~~l 506 (597)
-+-+...+.+ ...+. ..|-..+.+ ..++| |-+.|.+|+|+|+=-+-....-+.+=-..-+.+-+.++.+++
T Consensus 652 ~~ps~~~pr~------r~~n~-e~Is~~~nlsT~c~s-g~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl 723 (955)
T KOG1421|consen 652 IIPSSVMPRF------RATNL-EVISFMDNLSTSCLS-GRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVL 723 (955)
T ss_pred EecCCCCcce------eecce-EEEEEeccccccccc-eEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHH
Confidence 1111111111 00111 123333332 34444 456688999999976655443222211345666677899999
Q ss_pred HHHHhcCC
Q 007572 507 EFARDMQE 514 (597)
Q Consensus 507 ~~~~~~gd 514 (597)
+.++....
T Consensus 724 ~rlk~g~~ 731 (955)
T KOG1421|consen 724 ERLKLGPS 731 (955)
T ss_pred HHHhcCCC
Confidence 99987544
No 32
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=92.07 E-value=0.11 Score=48.32 Aligned_cols=45 Identities=27% Similarity=0.535 Sum_probs=36.3
Q ss_pred EecccCCCCcCceeecCCccEEEEEeecccccC-CcceEEEeeHHHH
Q 007572 135 ADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS-GAEIQLVIPWEAI 180 (597)
Q Consensus 135 tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~~-~~~l~~aip~~~i 180 (597)
.-+..+-|.|||||+-..|++|||..+.++..+ --.+.|+ ||+.+
T Consensus 101 ~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 101 RPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp EEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred ceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 456788899999999999999999999998873 3478887 99865
No 33
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=91.40 E-value=0.15 Score=47.43 Aligned_cols=29 Identities=28% Similarity=0.571 Sum_probs=21.7
Q ss_pred cccCCCCCCceecCCceEEEEEeeeecCC
Q 007572 457 AVHPGGSGGAVVNLDGHMIGLVTSNARHG 485 (597)
Q Consensus 457 av~~GnSGGPL~n~~G~VIGIvss~~~~~ 485 (597)
...+|.||+|+||.+|++|||-.......
T Consensus 93 d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 93 DFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp -S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred ccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 36799999999999999999987665543
No 34
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=90.53 E-value=0.34 Score=45.02 Aligned_cols=44 Identities=25% Similarity=0.484 Sum_probs=31.3
Q ss_pred cccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHH
Q 007572 457 AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 503 (597)
Q Consensus 457 av~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~ 503 (597)
+...|+|||||+-.+|++|||..+..-..+-.+ .+-|. |++.+.
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak--~i~f~-P~e~l~ 147 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAK--AIDFI-PVETLP 147 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEE--EEEEE-EHHHHH
T ss_pred EEEecCCCCcccCCCCCEEEEEEEEEEcCCcee--eEEEE-eeeecC
Confidence 456899999999999999999887654443333 56777 887653
No 35
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=90.34 E-value=0.2 Score=58.67 Aligned_cols=30 Identities=20% Similarity=0.408 Sum_probs=27.8
Q ss_pred eEEEecccCCCCcCceeecCCccEEEEEee
Q 007572 132 LLMADIRCLPGMEGGPVFGEHAHFVGILIR 161 (597)
Q Consensus 132 ~i~tDa~~~pG~~GG~v~~~~g~liGi~~~ 161 (597)
.++|+.-+--||||.||+|.+|+|||++.-
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FD 652 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFD 652 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeec
Confidence 488999999999999999999999999873
No 36
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=90.24 E-value=0.24 Score=46.10 Aligned_cols=35 Identities=20% Similarity=0.418 Sum_probs=25.1
Q ss_pred ceEEEecccCCCCcCceeecCCccEEEEEeecccc
Q 007572 131 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQ 165 (597)
Q Consensus 131 ~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~ 165 (597)
.+.+.|..+-+|+||.|+||.+|++|||--..+.-
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 46677888999999999999999999998766644
No 37
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=89.77 E-value=0.79 Score=46.76 Aligned_cols=106 Identities=23% Similarity=0.238 Sum_probs=52.0
Q ss_pred CCccEEEEEEccCCCCCCeeec---CCCCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccCCCCCCCceEEEecccC
Q 007572 64 STSRVAILGVSSYLKDLPNIAL---TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCL 140 (597)
Q Consensus 64 ~~t~~A~l~i~~~~~~~~~~~~---s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~~~~~~~~~i~tDa~~~ 140 (597)
.-.||.++|.. +++||++. -..++.||.|..||+=|---+ ..-+||. -|.+.+ .....|.---+...
T Consensus 80 ~~~DiviirmP---kDfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~---~~s~vSe--sS~i~p--~~~~~fWkHwIsTk 149 (235)
T PF00863_consen 80 EGRDIVIIRMP---KDFPPFPQKLKFRAPKEGERVCMVGSNFQEKS---ISSTVSE--SSWIYP--EENSHFWKHWISTK 149 (235)
T ss_dssp TCSSEEEEE-----TTS----S---B----TT-EEEEEEEECSSCC---CEEEEEE--EEEEEE--ETTTTEEEE-C---
T ss_pred CCccEEEEeCC---cccCCcchhhhccCCCCCCEEEEEEEEEEcCC---eeEEECC--ceEEee--cCCCCeeEEEecCC
Confidence 45899999994 57787775 346899999999998775200 1112222 122222 12356888888899
Q ss_pred CCCcCceeecC-CccEEEEEeecccccCCcceEEEeeHHHH
Q 007572 141 PGMEGGPVFGE-HAHFVGILIRPLRQKSGAEIQLVIPWEAI 180 (597)
Q Consensus 141 pG~~GG~v~~~-~g~liGi~~~~l~~~~~~~l~~aip~~~i 180 (597)
+|+.|.||++. +|.+|||-...-.. ...++-.++|=+.+
T Consensus 150 ~G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~~~f~ 189 (235)
T PF00863_consen 150 DGDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFPDDFE 189 (235)
T ss_dssp TT-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--TTHH
T ss_pred CCccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCCHHHH
Confidence 99999999985 78999998743322 23334444454443
No 38
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=88.32 E-value=2.6 Score=48.61 Aligned_cols=119 Identities=13% Similarity=0.176 Sum_probs=70.7
Q ss_pred cCCCcEEEEEEccC-------CCCc------cceecCC-------CCCCCCCeEEEEccCCCCCCCCCCCeeEeeEEeee
Q 007572 370 KGPLDVSLLQLGYI-------PDQL------CPIDADF-------GQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKV 429 (597)
Q Consensus 370 ~~~~DLALLkl~~~-------~~~l------~pi~l~~-------s~~~~Ge~V~vIGyPlf~~~~g~~~svt~GiVS~v 429 (597)
..-.|+||++++.. .+++ +.+.+.. ..+.+|..|+=+|. ..| .|.|.++++
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~Gr-----TTg----yT~G~lNg~ 610 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGR-----TTG----YTTGILNGI 610 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecc-----cCC----ccceEecce
Confidence 34469999999851 1111 1122221 23567999999987 455 488888866
Q ss_pred eeccCCCCCCcccccCCCcCeEEEEc----ccccCCCCCCceecCCc------eEEEEEeeeecCCCCcccCceEEEEeh
Q 007572 430 VKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDG------HMIGLVTSNARHGGGTVIPHLNFSIPC 499 (597)
Q Consensus 430 ~~~~~~~~~~~~~~~~~~~~~~iqtd----Aav~~GnSGGPL~n~~G------~VIGIvss~~~~~~g~~~p~lnFaIPi 499 (597)
.-..- .. +. -....++... +-..+|+||.=|++.-+ .|+||..+.-... . .+++..|+
T Consensus 611 klvyw----~d--G~-i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~---k--qfglftPi 678 (695)
T PF08192_consen 611 KLVYW----AD--GK-IQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQ---K--QFGLFTPI 678 (695)
T ss_pred EEEEe----cC--CC-eEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCcc---c--eeeccCcH
Confidence 43210 00 00 0111233333 44568999999998644 4999998753322 2 58888998
Q ss_pred hHHHHHHHHH
Q 007572 500 AVLRPIFEFA 509 (597)
Q Consensus 500 ~~l~~~l~~~ 509 (597)
..|..-++..
T Consensus 679 ~~il~rl~~v 688 (695)
T PF08192_consen 679 NEILDRLEEV 688 (695)
T ss_pred HHHHHHHHHh
Confidence 8776655543
No 39
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=88.24 E-value=0.56 Score=43.62 Aligned_cols=35 Identities=26% Similarity=0.489 Sum_probs=28.2
Q ss_pred EEEEcccccCCCCCCceecCCceEEEEEeeeecCC
Q 007572 451 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 485 (597)
Q Consensus 451 ~iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~ 485 (597)
+..-+..-.+|+||-|++|..|+||||+-..+..+
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG 130 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEG 130 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEEEET
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCCCCC
Confidence 34446667899999999999999999999887654
No 40
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=87.65 E-value=3.7 Score=39.93 Aligned_cols=101 Identities=17% Similarity=0.139 Sum_probs=53.8
Q ss_pred CCCccEEEEEEccC---CCCCCeeecCC---CCCCCCeEEEEeCCCCCCCCCcccCceEEeEEeeccC-----CC----C
Q 007572 63 KSTSRVAILGVSSY---LKDLPNIALTP---LNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYP-----PR----S 127 (597)
Q Consensus 63 ~~~t~~A~l~i~~~---~~~~~~~~~s~---~~~~G~~v~aigsPfg~~~p~~f~~~~s~Givs~~~~-----~~----~ 127 (597)
....|+||||++.. .....++.+.. .+..|+.+.+.|..-.......+...+....+.-... .. .
T Consensus 86 ~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~ 165 (229)
T smart00020 86 TYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGA 165 (229)
T ss_pred CCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccc
Confidence 35589999999732 12344555532 5778899999985443211111122222222221111 00 0
Q ss_pred CCCceE---E--EecccCCCCcCceeecCCc--cEEEEEeecc
Q 007572 128 TTRSLL---M--ADIRCLPGMEGGPVFGEHA--HFVGILIRPL 163 (597)
Q Consensus 128 ~~~~~i---~--tDa~~~pG~~GG~v~~~~g--~liGi~~~~l 163 (597)
.....+ . .....-+|.+||||+...+ .|+||++..-
T Consensus 166 ~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 166 ITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred cCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 000001 0 1344567999999998765 7999988654
No 41
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=87.55 E-value=2.9 Score=40.43 Aligned_cols=100 Identities=19% Similarity=0.202 Sum_probs=53.4
Q ss_pred CCccEEEEEEccCCC---CCCeeecC-C--CCCCCCeEEEEeCCCCCCC--CCcccCceEEeEEeec--cC--C--CCCC
Q 007572 64 STSRVAILGVSSYLK---DLPNIALT-P--LNKRGDLLLAVGSPFGVLS--PMHFFNSVSMGSVANC--YP--P--RSTT 129 (597)
Q Consensus 64 ~~t~~A~l~i~~~~~---~~~~~~~s-~--~~~~G~~v~aigsPfg~~~--p~~f~~~~s~Givs~~--~~--~--~~~~ 129 (597)
...||||||++.... ...++.+. . .+..|+.+.+.|....... ...-......-+++.. .. . ....
T Consensus 87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~ 166 (232)
T cd00190 87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTIT 166 (232)
T ss_pred CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCC
Confidence 458999999973221 24555552 2 5778999999996443211 0001122222333221 00 0 0001
Q ss_pred CceEEE-----ecccCCCCcCceeecCC---ccEEEEEeecc
Q 007572 130 RSLLMA-----DIRCLPGMEGGPVFGEH---AHFVGILIRPL 163 (597)
Q Consensus 130 ~~~i~t-----Da~~~pG~~GG~v~~~~---g~liGi~~~~l 163 (597)
...+-+ +...-+|.+||||+... ..|+||++...
T Consensus 167 ~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 167 DNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred CceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 111111 33455799999999875 56999988654
No 42
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=86.59 E-value=0.58 Score=46.93 Aligned_cols=45 Identities=31% Similarity=0.477 Sum_probs=33.7
Q ss_pred EEEcccccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHH
Q 007572 452 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVL 502 (597)
Q Consensus 452 iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l 502 (597)
+..+.-+..||||+|++ .+|++||=++-.+-.. |..+|.||++..
T Consensus 171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~d-----p~~Gygi~ie~M 215 (218)
T PF05580_consen 171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVND-----PTKGYGIFIEWM 215 (218)
T ss_pred hhhhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCceeeecHHHH
Confidence 33344567799999999 5899999998776432 356899997754
No 43
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=80.57 E-value=9.1 Score=44.35 Aligned_cols=113 Identities=19% Similarity=0.237 Sum_probs=70.4
Q ss_pred cccCCCCccEEEEEEccCC-------CCC------Ceeec--------CCCCCCCCeEEEEeCCCCCCCCCcccCceEEe
Q 007572 59 SLMSKSTSRVAILGVSSYL-------KDL------PNIAL--------TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMG 117 (597)
Q Consensus 59 ~~~~~~~t~~A~l~i~~~~-------~~~------~~~~~--------s~~~~~G~~v~aigsPfg~~~p~~f~~~~s~G 117 (597)
.++.+.+.|+||+||+... +++ |.+.+ -..+..|..|+=+|.==|+ |.|
T Consensus 536 ~ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G 605 (695)
T PF08192_consen 536 SIINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTG 605 (695)
T ss_pred hhhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cce
Confidence 4555788999999997321 111 11222 2246778999999887775 455
Q ss_pred EEeecc----CCCC-CCCceEEEe----cccCCCCcCceeecCCcc------EEEEEeecccccCCcceEEEeeHHHHHH
Q 007572 118 SVANCY----PPRS-TTRSLLMAD----IRCLPGMEGGPVFGEHAH------FVGILIRPLRQKSGAEIQLVIPWEAIAT 182 (597)
Q Consensus 118 ivs~~~----~~~~-~~~~~i~tD----a~~~pG~~GG~v~~~~g~------liGi~~~~l~~~~~~~l~~aip~~~i~~ 182 (597)
+|.+.. .++. ....|++.- +=..+|.||.=|+++-+. |+||+-+.=. ....|++..||..|..
T Consensus 606 ~lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~ 683 (695)
T PF08192_consen 606 ILNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILD 683 (695)
T ss_pred EecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHH
Confidence 555431 1111 112344444 456679999999997443 8898874332 2347888999999875
Q ss_pred H
Q 007572 183 A 183 (597)
Q Consensus 183 ~ 183 (597)
=
T Consensus 684 r 684 (695)
T PF08192_consen 684 R 684 (695)
T ss_pred H
Confidence 3
No 44
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=79.85 E-value=2.6 Score=38.91 Aligned_cols=29 Identities=31% Similarity=0.501 Sum_probs=22.8
Q ss_pred eEEEecccCCCCcCceeecCCccEEEEEee
Q 007572 132 LLMADIRCLPGMEGGPVFGEHAHFVGILIR 161 (597)
Q Consensus 132 ~i~tDa~~~pG~~GG~v~~~~g~liGi~~~ 161 (597)
+++.--.+.||..||+|+-++| +|||+++
T Consensus 80 ~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta 108 (127)
T PF00947_consen 80 LLIGEGPAEPGDCGGILRCKHG-VIGIVTA 108 (127)
T ss_dssp EEEEE-SSSTT-TCSEEEETTC-EEEEEEE
T ss_pred ceeecccCCCCCCCceeEeCCC-eEEEEEe
Confidence 4555668999999999998876 9999995
No 45
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=78.37 E-value=4.2 Score=37.61 Aligned_cols=36 Identities=28% Similarity=0.527 Sum_probs=24.8
Q ss_pred CCcCeEEE-----EcccccCCCCCCceecCCceEEEEEeeee
Q 007572 446 SAYPVMLE-----TTAAVHPGGSGGAVVNLDGHMIGLVTSNA 482 (597)
Q Consensus 446 ~~~~~~iq-----tdAav~~GnSGGPL~n~~G~VIGIvss~~ 482 (597)
..+|..+| ...+..||+.||+|+=.. -||||+++.-
T Consensus 70 ~YYP~h~Q~~~l~g~Gp~~PGdCGg~L~C~H-GViGi~Tagg 110 (127)
T PF00947_consen 70 EYYPKHYQYNLLIGEGPAEPGDCGGILRCKH-GVIGIVTAGG 110 (127)
T ss_dssp TTB-SEEEECEEEEE-SSSTT-TCSEEEETT-CEEEEEEEEE
T ss_pred cCchhheecCceeecccCCCCCCCceeEeCC-CeEEEEEeCC
Confidence 34555555 456889999999999544 5999999863
No 46
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=76.50 E-value=3.3 Score=38.72 Aligned_cols=32 Identities=22% Similarity=0.400 Sum_probs=26.0
Q ss_pred eEEEecccCCCCcCceeecCCccEEEEEeecc
Q 007572 132 LLMADIRCLPGMEGGPVFGEHAHFVGILIRPL 163 (597)
Q Consensus 132 ~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l 163 (597)
|.+--..-.||.||-|+||..|++|||+++--
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 44455677899999999999999999999643
No 47
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=73.74 E-value=27 Score=36.00 Aligned_cols=31 Identities=32% Similarity=0.524 Sum_probs=25.6
Q ss_pred EEEeCCCeEEEEEEEeCCcEEEecccccCCCC
Q 007572 250 LITIDDGVWASGVLLNDQGLILTNAHLLEPWR 281 (597)
Q Consensus 250 ~I~~~~~~~GSGflIs~~G~ILTnaHVV~p~~ 281 (597)
.|.+++.-||||++|+++ |||++..|+....
T Consensus 21 ~IYvdG~~~CsgvLlD~~-WlLvsssCl~~I~ 51 (267)
T PF09342_consen 21 DIYVDGRYWCSGVLLDPH-WLLVSSSCLRGIS 51 (267)
T ss_pred eEEEcCeEEEEEEEeccc-eEEEeccccCCcc
Confidence 455566789999999997 9999999997533
No 48
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=62.85 E-value=4.8 Score=43.72 Aligned_cols=24 Identities=25% Similarity=0.515 Sum_probs=21.3
Q ss_pred ccccCCCCCCceecCCceEEEEEe
Q 007572 456 AAVHPGGSGGAVVNLDGHMIGLVT 479 (597)
Q Consensus 456 Aav~~GnSGGPL~n~~G~VIGIvs 479 (597)
.....|+||+.|+|++|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 366799999999999999999964
No 49
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=60.18 E-value=7.3 Score=42.90 Aligned_cols=45 Identities=27% Similarity=0.448 Sum_probs=32.5
Q ss_pred EEEcccccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHH
Q 007572 452 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVL 502 (597)
Q Consensus 452 iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l 502 (597)
+.-+.-+..||||+|++ .+|++||=++=.+-.. |.-+|.|-++..
T Consensus 351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvnd-----pt~GYGi~ie~M 395 (402)
T TIGR02860 351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVND-----PTSGYGVYIEWM 395 (402)
T ss_pred hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCcceeehHHHH
Confidence 33344567799999999 6899999887665443 245788876654
No 50
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=53.96 E-value=46 Score=29.93 Aligned_cols=17 Identities=24% Similarity=0.425 Sum_probs=14.3
Q ss_pred EEEEeCCcEEEecccccC
Q 007572 261 GVLLNDQGLILTNAHLLE 278 (597)
Q Consensus 261 GflIs~~G~ILTnaHVV~ 278 (597)
++-|.. |..+|+.||++
T Consensus 3 avHIGn-G~~vt~tHva~ 19 (105)
T PF03510_consen 3 AVHIGN-GRYVTVTHVAK 19 (105)
T ss_pred eEEeCC-CEEEEEEEEec
Confidence 567765 89999999997
No 51
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=46.80 E-value=68 Score=35.62 Aligned_cols=29 Identities=31% Similarity=0.527 Sum_probs=20.0
Q ss_pred cccCCCCCCceecCCc---eEEEEEeeeecCC
Q 007572 457 AVHPGGSGGAVVNLDG---HMIGLVTSNARHG 485 (597)
Q Consensus 457 av~~GnSGGPL~n~~G---~VIGIvss~~~~~ 485 (597)
-..||+-|-|-+=..| -|+|++++.++.+
T Consensus 499 GT~PGDCGcPYvyKrgNd~VV~GVH~AAtr~G 530 (535)
T PF05416_consen 499 GTIPGDCGCPYVYKRGNDWVVIGVHAAATRSG 530 (535)
T ss_dssp S--TTGTT-EEEEEETTEEEEEEEEEEE-SSS
T ss_pred CCCCCCCCCceeeecCCcEEEEEEEehhccCC
Confidence 3568999999996555 4899999988754
No 52
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=45.63 E-value=31 Score=33.41 Aligned_cols=90 Identities=20% Similarity=0.300 Sum_probs=51.2
Q ss_pred CccEEEEEEccCCCCCCeeec--CCC-CCCCCeEEEEeCC-CCCCCCCcc-cC-ceEEeEEeeccCCCCCCCceEEEecc
Q 007572 65 TSRVAILGVSSYLKDLPNIAL--TPL-NKRGDLLLAVGSP-FGVLSPMHF-FN-SVSMGSVANCYPPRSTTRSLLMADIR 138 (597)
Q Consensus 65 ~t~~A~l~i~~~~~~~~~~~~--s~~-~~~G~~v~aigsP-fg~~~p~~f-~~-~~s~Givs~~~~~~~~~~~~i~tDa~ 138 (597)
.+|+++++++. ....+.+.- ... -...+.++++-++ |+- .++ .. ....|.| +..+ ......+.=++.
T Consensus 71 ~~Dl~~v~l~~-~~kfrDIrk~~~~~~~~~~~~~l~v~~~~~~~---~~~~v~~v~~~~~i-~~~g--~~~~~~~~Y~~~ 143 (172)
T PF00548_consen 71 DTDLTLVKLPR-NPKFRDIRKFFPESIPEYPECVLLVNSTKFPR---MIVEVGFVTNFGFI-NLSG--TTTPRSLKYKAP 143 (172)
T ss_dssp EEEEEEEEEES-SS-B--GGGGSBSSGGTEEEEEEEEESSSSTC---EEEEEEEEEEEEEE-EETT--EEEEEEEEEESE
T ss_pred ceeEEEEEccC-CcccCchhhhhccccccCCCcEEEEECCCCcc---EEEEEEEEeecCcc-ccCC--CEeeEEEEEccC
Confidence 48999999963 234444432 222 2455666666654 542 111 11 1233444 3322 122345777888
Q ss_pred cCCCCcCceeecC---CccEEEEEee
Q 007572 139 CLPGMEGGPVFGE---HAHFVGILIR 161 (597)
Q Consensus 139 ~~pG~~GG~v~~~---~g~liGi~~~ 161 (597)
.-+|+.||+|+.. .+.++||=.|
T Consensus 144 t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 144 TKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp EETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred CCCCccCCeEEEeeccCccEEEEEec
Confidence 8899999999974 4569999765
No 53
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=42.20 E-value=17 Score=37.83 Aligned_cols=28 Identities=25% Similarity=0.455 Sum_probs=21.4
Q ss_pred ccCCCCcCceeecCCccEEEEEeecccc
Q 007572 138 RCLPGMEGGPVFGEHAHFVGILIRPLRQ 165 (597)
Q Consensus 138 ~~~pG~~GG~v~~~~g~liGi~~~~l~~ 165 (597)
-..||.||.||+..+|.+||+-++.=.+
T Consensus 204 fT~~GDSGSPVVt~dg~liGVHTGSn~~ 231 (297)
T PF05579_consen 204 FTGPGDSGSPVVTEDGDLIGVHTGSNKR 231 (297)
T ss_dssp SS-GGCTT-EEEETTC-EEEEEEEEETT
T ss_pred EcCCCCCCCccCcCCCCEEEEEecCCCc
Confidence 3479999999999999999999976543
No 54
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=39.68 E-value=23 Score=35.74 Aligned_cols=38 Identities=18% Similarity=0.332 Sum_probs=26.9
Q ss_pred cCCCCcCceeecCCccEEEEEeecccccCCcceEEEeeHHH
Q 007572 139 CLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPWEA 179 (597)
Q Consensus 139 ~~pG~~GG~v~~~~g~liGi~~~~l~~~~~~~l~~aip~~~ 179 (597)
+.-||||.|++- +|+|||-++--|......|.. ++++.
T Consensus 177 IvqGMSGSPI~q-dGKLiGAVthvf~~dp~~Gyg--i~ie~ 214 (218)
T PF05580_consen 177 IVQGMSGSPIIQ-DGKLIGAVTHVFVNDPTKGYG--IFIEW 214 (218)
T ss_pred EEecccCCCEEE-CCEEEEEEEEEEecCCCceee--ecHHH
Confidence 567999999986 899999999877433333444 44443
No 55
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=30.67 E-value=36 Score=36.97 Aligned_cols=26 Identities=23% Similarity=0.373 Sum_probs=21.4
Q ss_pred EecccCCCCcCceeecCCccEEEEEe
Q 007572 135 ADIRCLPGMEGGPVFGEHAHFVGILI 160 (597)
Q Consensus 135 tDa~~~pG~~GG~v~~~~g~liGi~~ 160 (597)
.+...-.|.||..|+|.+|++|||..
T Consensus 348 ~~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 348 DNYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cccCCCCCCCcCeEECCCCCEEEEeC
Confidence 33444579999999999999999965
No 56
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=30.62 E-value=42 Score=33.96 Aligned_cols=54 Identities=15% Similarity=0.206 Sum_probs=37.1
Q ss_pred eEEEEcccccCCCCCCcee--cC--CceEEEEEeeeecCCCCcccCceEEEEeh--hHHHHHHHHH
Q 007572 450 VMLETTAAVHPGGSGGAVV--NL--DGHMIGLVTSNARHGGGTVIPHLNFSIPC--AVLRPIFEFA 509 (597)
Q Consensus 450 ~~iqtdAav~~GnSGGPL~--n~--~G~VIGIvss~~~~~~g~~~p~lnFaIPi--~~l~~~l~~~ 509 (597)
.-+++.++...|+-|||++ |. .-+++||.++..... ..+||=++ +.|++.++.+
T Consensus 169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~------~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANH------AMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccc------cceehhhhhHHHHHHHHHhh
Confidence 4567888999999999999 22 378999999876532 24566544 3455555544
No 57
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=28.71 E-value=43 Score=25.15 Aligned_cols=22 Identities=32% Similarity=0.557 Sum_probs=18.6
Q ss_pred cCCCCCCceecCCceEEEEEee
Q 007572 459 HPGGSGGAVVNLDGHMIGLVTS 480 (597)
Q Consensus 459 ~~GnSGGPL~n~~G~VIGIvss 480 (597)
..+-+.-||+|.+|+++|+++.
T Consensus 27 ~~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 27 KNGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HHTSSEEEEESTTSBEEEEEEH
T ss_pred HcCCcEEEEEecCCEEEEEEEH
Confidence 3577888999999999999974
No 58
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=26.31 E-value=73 Score=31.98 Aligned_cols=49 Identities=16% Similarity=0.175 Sum_probs=18.3
Q ss_pred EEEEcccccCCCCCCceecCCceEEEEEeeeecCCCCcccCceEEEEehhHHH
Q 007572 451 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 503 (597)
Q Consensus 451 ~iqtdAav~~GnSGGPL~n~~G~VIGIvss~~~~~~g~~~p~lnFaIPi~~l~ 503 (597)
+....+...+|.||.|+|+.+ +++|+.+...+.. .-.+.|+-.|+.-+.
T Consensus 137 ~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~---~~~n~n~~spip~~~ 185 (203)
T PF02122_consen 137 FASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGS---NRENNNRMSPIPPIP 185 (203)
T ss_dssp EEEE-----TT-TT-EEE-SS--EEEEEEEE----------------------
T ss_pred CCceEcCCCCCCCCCCeEECC-CceEeecCccccc---ccccccccccccccc
Confidence 566667788999999999988 9999998752221 112556666655443
No 59
>PF13267 DUF4058: Protein of unknown function (DUF4058)
Probab=24.57 E-value=55 Score=33.90 Aligned_cols=26 Identities=31% Similarity=0.489 Sum_probs=21.5
Q ss_pred ccc-cccchhHHHHHHHHHHHHhhccccc
Q 007572 555 EDN-IEGKGSRFAKFIAERREVLKHSTQV 582 (597)
Q Consensus 555 ~~~-~~~~~~~~~~~~~~~~~~~~~~~~~ 582 (597)
|.| .-|+|.. +|.++||++|.|.|+|
T Consensus 124 P~NKr~G~gr~--~Y~~KRq~vl~S~tHL 150 (254)
T PF13267_consen 124 PANKRPGEGRA--AYERKRQEVLGSGTHL 150 (254)
T ss_pred cccCCCCccHH--HHHHHHHHHHhccCce
Confidence 445 4477877 9999999999999987
No 60
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=20.75 E-value=6.9e+02 Score=25.47 Aligned_cols=92 Identities=16% Similarity=0.199 Sum_probs=51.4
Q ss_pred CCccEEEEEEccC-CCCCCeeecCC---CCCCCCeEEEEeC-CCCCCCCCcccCceEEeEEeeccCCCCCCCceEEEecc
Q 007572 64 STSRVAILGVSSY-LKDLPNIALTP---LNKRGDLLLAVGS-PFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIR 138 (597)
Q Consensus 64 ~~t~~A~l~i~~~-~~~~~~~~~s~---~~~~G~~v~aigs-Pfg~~~p~~f~~~~s~Givs~~~~~~~~~~~~i~tDa~ 138 (597)
...+++||++... .....+.-+++ .+..||.+-+-|. .-+ .++...+ .|..... ....+.++-.
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~----~~~~~~~---~i~~~~~----~~~~~~~~~~ 227 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG----KLKHRKL---KITNCTK----CAYSICTKQY 227 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCCC----eEEEEEE---EEEEeec----cceeEecccc
Confidence 4477889999732 13444444533 3677888877555 111 1111111 1111110 1234556666
Q ss_pred cCCCCcCceeecC-Ccc--EEEEEeeccccc
Q 007572 139 CLPGMEGGPVFGE-HAH--FVGILIRPLRQK 166 (597)
Q Consensus 139 ~~pG~~GG~v~~~-~g~--liGi~~~~l~~~ 166 (597)
.-+|..|||++.. +|+ |||+.+..-...
T Consensus 228 ~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~ 258 (282)
T PF03761_consen 228 SCKGDRGGPLVKNINGRWTLIGVGASGNYEC 258 (282)
T ss_pred cCCCCccCeEEEEECCCEEEEEEEccCCCcc
Confidence 6789999999843 454 999988665443
Done!