Query 004542
Match_columns 746
No_of_seqs 380 out of 2258
Neff 5.8
Searched_HMMs 46136
Date Fri Mar 29 01:09:43 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/004542.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/004542hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.5E-30 7.5E-35 290.3 22.2 252 386-726 42-334 (455)
2 TIGR02038 protease_degS peripl 100.0 3.2E-29 6.9E-34 274.3 24.0 253 386-726 47-322 (351)
3 PRK10898 serine endoprotease; 100.0 8.4E-29 1.8E-33 271.1 22.2 252 387-726 48-323 (353)
4 PRK10942 serine endoprotease; 100.0 1.6E-28 3.5E-33 278.1 23.0 233 406-727 111-356 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 2.1E-27 4.6E-32 266.2 24.2 235 406-726 58-301 (428)
6 COG0265 DegQ Trypsin-like seri 99.9 3.3E-21 7.1E-26 210.6 23.1 253 386-728 35-316 (347)
7 PRK10139 serine endoprotease; 99.8 1.1E-20 2.3E-25 213.6 13.5 133 200-338 122-260 (455)
8 PRK10942 serine endoprotease; 99.8 2.1E-19 4.6E-24 204.0 13.4 133 200-338 143-281 (473)
9 TIGR02038 protease_degS peripl 99.8 1.3E-18 2.9E-23 190.7 13.7 132 201-338 110-248 (351)
10 PRK10898 serine endoprotease; 99.8 2.3E-18 5E-23 189.0 13.3 128 205-338 114-249 (353)
11 TIGR02037 degP_htrA_DO peripla 99.7 1.1E-17 2.4E-22 188.0 13.8 128 205-338 94-227 (428)
12 COG0265 DegQ Trypsin-like seri 99.7 5.1E-17 1.1E-21 177.7 11.8 131 202-338 105-242 (347)
13 PF13365 Trypsin_2: Trypsin-li 99.5 8.2E-14 1.8E-18 126.9 13.5 24 603-626 97-120 (120)
14 KOG1320 Serine protease [Postt 99.5 3.1E-13 6.7E-18 151.2 13.4 202 389-659 133-351 (473)
15 PF00089 Trypsin: Trypsin; In 99.3 8.9E-11 1.9E-15 117.1 17.9 122 521-654 86-219 (220)
16 cd00190 Tryp_SPc Trypsin-like 99.2 6.5E-10 1.4E-14 111.6 17.5 106 520-632 87-209 (232)
17 KOG1320 Serine protease [Postt 99.2 3.6E-11 7.9E-16 134.8 7.6 128 203-336 211-350 (473)
18 KOG1421 Predicted signaling-as 99.1 6.1E-10 1.3E-14 126.5 13.8 257 389-727 57-347 (955)
19 smart00020 Tryp_SPc Trypsin-li 99.0 2.8E-08 6E-13 100.3 19.5 108 520-631 87-208 (229)
20 COG3591 V8-like Glu-specific e 98.5 1.8E-06 3.8E-11 90.7 13.4 73 545-636 157-229 (251)
21 KOG3627 Trypsin [Amino acid tr 98.1 0.0002 4.3E-09 74.5 19.7 114 522-642 106-239 (256)
22 PF00863 Peptidase_C4: Peptida 97.7 0.00081 1.8E-08 70.4 14.9 103 520-649 80-185 (235)
23 PF13365 Trypsin_2: Trypsin-li 97.7 2.3E-05 5.1E-10 71.2 2.5 24 284-307 97-120 (120)
24 COG5640 Secreted trypsin-like 97.1 0.0025 5.4E-08 69.6 10.5 51 605-657 223-276 (413)
25 PF03761 DUF316: Domain of unk 97.1 0.029 6.2E-07 59.8 18.3 92 520-632 159-256 (282)
26 PF05579 Peptidase_S32: Equine 96.9 0.0047 1E-07 65.2 9.4 78 521-634 155-232 (297)
27 PF00089 Trypsin: Trypsin; In 96.2 0.063 1.4E-06 53.4 12.3 115 214-329 86-216 (220)
28 PF10459 Peptidase_S46: Peptid 94.4 0.06 1.3E-06 64.8 6.2 66 595-660 618-688 (698)
29 PF00548 Peptidase_C3: 3C cyst 93.9 0.79 1.7E-05 46.0 12.0 35 596-630 133-170 (172)
30 COG3591 V8-like Glu-specific e 93.4 0.4 8.6E-06 51.0 9.3 76 232-315 152-227 (251)
31 KOG1421 Predicted signaling-as 91.5 3 6.6E-05 49.6 13.8 148 497-662 578-730 (955)
32 PF08192 Peptidase_S64: Peptid 91.0 1.1 2.4E-05 53.1 9.9 119 519-658 540-688 (695)
33 PF02907 Peptidase_S29: Hepati 89.4 0.25 5.5E-06 47.5 2.5 45 284-329 101-146 (148)
34 PF00949 Peptidase_S7: Peptida 88.8 0.33 7.2E-06 46.8 2.9 36 280-315 86-121 (132)
35 PF10459 Peptidase_S46: Peptid 88.0 0.36 7.8E-06 58.3 3.1 29 281-309 623-651 (698)
36 PF00949 Peptidase_S7: Peptida 87.4 0.41 9E-06 46.2 2.5 30 605-634 92-121 (132)
37 PF02907 Peptidase_S29: Hepati 86.2 0.73 1.6E-05 44.5 3.5 43 607-652 105-147 (148)
38 PF00944 Peptidase_S3: Alphavi 85.9 0.74 1.6E-05 44.4 3.3 34 601-634 97-130 (158)
39 PF00863 Peptidase_C4: Peptida 82.7 5.2 0.00011 42.4 8.3 92 213-311 80-172 (235)
40 smart00020 Tryp_SPc Trypsin-li 80.9 10 0.00022 38.0 9.5 99 214-312 88-208 (229)
41 PF05580 Peptidase_S55: SpoIVB 80.8 1.3 2.8E-05 46.1 2.9 41 604-650 174-214 (218)
42 cd00190 Tryp_SPc Trypsin-like 80.6 6.5 0.00014 39.1 8.0 99 214-312 88-208 (232)
43 PF09342 DUF1986: Domain of un 77.9 12 0.00026 39.9 8.9 32 398-430 20-51 (267)
44 PF00947 Pico_P2A: Picornaviru 75.8 3.8 8.2E-05 39.3 4.2 31 280-311 79-109 (127)
45 PF02122 Peptidase_S39: Peptid 73.6 14 0.00031 38.3 8.1 48 600-651 137-184 (203)
46 PF00944 Peptidase_S3: Alphavi 71.6 6.5 0.00014 38.1 4.7 49 259-314 81-129 (158)
47 PF08192 Peptidase_S64: Peptid 65.6 47 0.001 40.0 11.0 113 208-332 536-684 (695)
48 PF00947 Pico_P2A: Picornaviru 62.4 8.3 0.00018 37.0 3.4 32 599-631 79-110 (127)
49 KOG0441 Cu2+/Zn2+ superoxide d 54.0 4.6 9.9E-05 39.9 0.2 42 26-67 38-84 (154)
50 PF03510 Peptidase_C24: 2C end 53.9 44 0.00096 31.2 6.6 17 410-427 3-19 (105)
51 TIGR02860 spore_IV_B stage IV 50.9 12 0.00025 42.8 2.8 42 604-651 354-395 (402)
52 PF01732 DUF31: Putative pepti 47.6 11 0.00025 42.1 2.1 24 605-628 350-373 (374)
53 PF00548 Peptidase_C3: 3C cyst 43.0 43 0.00094 33.6 5.2 90 214-310 71-169 (172)
54 PF05579 Peptidase_S32: Equine 38.3 22 0.00047 38.4 2.3 26 288-313 205-230 (297)
55 PF05580 Peptidase_S55: SpoIVB 32.6 41 0.00089 35.3 3.2 38 288-326 177-214 (218)
56 PF05416 Peptidase_C37: Southa 30.2 1.6E+02 0.0035 33.9 7.5 38 597-634 483-530 (535)
57 PF03761 DUF316: Domain of unk 29.6 3.2E+02 0.0069 29.0 9.6 91 214-315 160-258 (282)
58 PF01732 DUF31: Putative pepti 23.5 57 0.0012 36.6 2.6 26 285-310 349-374 (374)
59 PF08208 RNA_polI_A34: DNA-dir 21.1 32 0.00069 35.1 0.0 13 23-35 109-121 (198)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=99.97 E-value=3.5e-30 Score=290.35 Aligned_cols=252 Identities=25% Similarity=0.432 Sum_probs=197.5
Q ss_pred ChhHHHhccCceEEEEeC------------------C----------CeeEEEEEEeC-CCEEEEcccccCCCCCcceee
Q 004542 386 SPLPIQKALASVCLITID------------------D----------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTV 436 (746)
Q Consensus 386 ~p~~i~ka~~SVV~I~~~------------------~----------~~wGSGflIn~-~GlILTnaHVV~p~~~~~t~~ 436 (746)
...+++++.||||.|... . .++||||+|++ +||||||+|||+
T Consensus 42 ~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~--------- 112 (455)
T PRK10139 42 LAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN--------- 112 (455)
T ss_pred HHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC---------
Confidence 456899999999999541 0 24799999985 799999999997
Q ss_pred cCCccccccCCCCCCCCCCCCccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEe
Q 004542 437 SGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVY 516 (746)
Q Consensus 437 ~G~~~~~~f~~~~~~~~~~~~v~~~~k~q~l~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~ 516 (746)
+ ...|.|++.+++. |+|++++
T Consensus 113 -~-------------------------------------------------------a~~i~V~~~dg~~---~~a~vvg 133 (455)
T PRK10139 113 -Q-------------------------------------------------------AQKISIQLNDGRE---FDAKLIG 133 (455)
T ss_pred -C-------------------------------------------------------CCEEEEEECCCCE---EEEEEEE
Confidence 1 2238888888876 9999999
Q ss_pred ecCCCCceEEEEEccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCC
Q 004542 517 VCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS 595 (746)
Q Consensus 517 v~d~~~DLALLkle~~p~~l~pi~L~~s-~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~ 595 (746)
.|+.+||||||++. +..+++++++++ .+++||+|+++||| +|+..+++.|+||+..+... ...
T Consensus 134 -~D~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P-----~g~~~tvt~GivS~~~r~~~---------~~~ 197 (455)
T PRK10139 134 -SDDQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNP-----FGLGQTATSGIISALGRSGL---------NLE 197 (455)
T ss_pred -EcCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecC-----CCCCCceEEEEEcccccccc---------CCC
Confidence 78899999999985 357889999876 68999999999994 67778999999998865310 012
Q ss_pred CcCcEEEEcccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHHHHHHHHHhccccc-----h-hcc
Q 004542 596 AYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEVS-----L-LRK 669 (746)
Q Consensus 596 ~~~~~IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~~il~~~~~~~d~~-----~-l~~ 669 (746)
.+..+|||||++++|||||||||.+|+||||+++.....++. .+++|+||++.++++++++.+.+.+. + ++.
T Consensus 198 ~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~--~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~ 275 (455)
T PRK10139 198 GLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGS--VGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTE 275 (455)
T ss_pred CcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCc--cceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEE
Confidence 345689999999999999999999999999999987765443 38999999999999999998866652 1 223
Q ss_pred CCCCCccceeeeeecCCCCCCCCCCCCCCCc---cccc-cc-ccCCcchhHHHHHHHHHHHh
Q 004542 670 LDEPNKHLASVWALMPPLSPKQGPSLPDLPQ---AALE-DN-IEGKGSRFAKFIAERREVLK 726 (746)
Q Consensus 670 L~~~~~~l~~vW~L~~~~~~~~~~~~~~~p~---~~~~-~~-~~~~~~~~ak~~~~~~~~~~ 726 (746)
+ ++..+..+.|............|+.|+ ++++ |. .+.+|.++..+-+.++.+.+
T Consensus 276 l---~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~ 334 (455)
T PRK10139 276 M---SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIAT 334 (455)
T ss_pred C---CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHHh
Confidence 2 344444455543334455666677775 6777 77 99999999998888777765
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.97 E-value=3.2e-29 Score=274.31 Aligned_cols=253 Identities=23% Similarity=0.312 Sum_probs=189.6
Q ss_pred ChhHHHhccCceEEEEeC-----------CCeeEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCC
Q 004542 386 SPLPIQKALASVCLITID-----------DGVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSG 454 (746)
Q Consensus 386 ~p~~i~ka~~SVV~I~~~-----------~~~wGSGflIn~~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~~~~~~~ 454 (746)
...+++++.+|||.|+.. ..+.||||+|+++||||||+||++ +
T Consensus 47 ~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~----------~---------------- 100 (351)
T TIGR02038 47 FNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIK----------K---------------- 100 (351)
T ss_pred HHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeC----------C----------------
Confidence 345789999999999762 135799999999999999999996 1
Q ss_pred CCCccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEeecCCCCceEEEEEccCCC
Q 004542 455 HTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPD 534 (746)
Q Consensus 455 ~~~v~~~~k~q~l~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLkle~~p~ 534 (746)
...+.|++.+++. ++|++++ .|+.+||||||++. .
T Consensus 101 ---------------------------------------~~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvlkv~~--~ 135 (351)
T TIGR02038 101 ---------------------------------------ADQIVVALQDGRK---FEAELVG-SDPLTDLAVLKIEG--D 135 (351)
T ss_pred ---------------------------------------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC--C
Confidence 1237788888776 9999999 78899999999997 4
Q ss_pred CcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCC
Q 004542 535 QLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSG 613 (746)
Q Consensus 535 ~l~pi~L~~s-~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSG 613 (746)
.++++++..+ .+++||.|+++||| +++..+++.|+|++..+... .......++||||++++||||
T Consensus 136 ~~~~~~l~~s~~~~~G~~V~aiG~P-----~~~~~s~t~GiIs~~~r~~~---------~~~~~~~~iqtda~i~~GnSG 201 (351)
T TIGR02038 136 NLPTIPVNLDRPPHVGDVVLAIGNP-----YNLGQTITQGIISATGRNGL---------SSVGRQNFIQTDAAINAGNSG 201 (351)
T ss_pred CCceEeccCcCccCCCCEEEEEeCC-----CCCCCcEEEEEEEeccCccc---------CCCCcceEEEECCccCCCCCc
Confidence 5788888765 68999999999995 56778999999998866421 012235689999999999999
Q ss_pred CccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHHHHHHHHHhccccc-----h-hccCCCCCccceeeeeecCCC
Q 004542 614 GAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEVS-----L-LRKLDEPNKHLASVWALMPPL 687 (746)
Q Consensus 614 GPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~~il~~~~~~~d~~-----~-l~~L~~~~~~l~~vW~L~~~~ 687 (746)
|||||.+|+||||+++.+...++....+++|+||++.++++++++.+.+... + ++.+ ++..+....+....
T Consensus 202 Gpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~---~~~~~~~lgl~~~~ 278 (351)
T TIGR02038 202 GALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDI---NSVVAQGLGLPDLR 278 (351)
T ss_pred ceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEEC---CHHHHHhcCCCccc
Confidence 9999999999999998765443333458999999999999999998755531 1 1221 22222222222112
Q ss_pred CCCCCCCCCCCCc---cccc-cc-ccCCcchhHHHHHHHHHHHh
Q 004542 688 SPKQGPSLPDLPQ---AALE-DN-IEGKGSRFAKFIAERREVLK 726 (746)
Q Consensus 688 ~~~~~~~~~~~p~---~~~~-~~-~~~~~~~~ak~~~~~~~~~~ 726 (746)
........|+.|+ ++++ |. .+.+|.++..+-+.++.+.+
T Consensus 279 Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~ 322 (351)
T TIGR02038 279 GIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAE 322 (351)
T ss_pred cceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHh
Confidence 2344555677774 6777 66 99999999887776666654
No 3
>PRK10898 serine endoprotease; Provisional
Probab=99.96 E-value=8.4e-29 Score=271.14 Aligned_cols=252 Identities=21% Similarity=0.291 Sum_probs=185.9
Q ss_pred hhHHHhccCceEEEEeCC-----------CeeEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCC
Q 004542 387 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH 455 (746)
Q Consensus 387 p~~i~ka~~SVV~I~~~~-----------~~wGSGflIn~~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~~~~~~~~ 455 (746)
..+++++.+|||.|.... .++||||+|+++||||||+||++ +
T Consensus 48 ~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~----------~----------------- 100 (353)
T PRK10898 48 NQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIN----------D----------------- 100 (353)
T ss_pred HHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeC----------C-----------------
Confidence 457899999999997721 15899999999999999999997 1
Q ss_pred CCccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEeecCCCCceEEEEEccCCCC
Q 004542 456 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ 535 (746)
Q Consensus 456 ~~v~~~~k~q~l~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLkle~~p~~ 535 (746)
...+.|++.+++. |+|++++ .|+.+||||||++. ..
T Consensus 101 --------------------------------------a~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvl~v~~--~~ 136 (353)
T PRK10898 101 --------------------------------------ADQIIVALQDGRV---FEALLVG-SDSLTDLAVLKINA--TN 136 (353)
T ss_pred --------------------------------------CCEEEEEeCCCCE---EEEEEEE-EcCCCCEEEEEEcC--CC
Confidence 1237788888776 9999999 68899999999986 46
Q ss_pred cceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCCC
Q 004542 536 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 614 (746)
Q Consensus 536 l~pi~L~~s-~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSGG 614 (746)
+++++++++ .+++|+.|+++||| +++..+++.|+|++..+.... ......+||||+++++|||||
T Consensus 137 l~~~~l~~~~~~~~G~~V~aiG~P-----~g~~~~~t~Giis~~~r~~~~---------~~~~~~~iqtda~i~~GnSGG 202 (353)
T PRK10898 137 LPVIPINPKRVPHIGDVVLAIGNP-----YNLGQTITQGIISATGRIGLS---------PTGRQNFLQTDASINHGNSGG 202 (353)
T ss_pred CCeeeccCcCcCCCCCEEEEEeCC-----CCcCCCcceeEEEeccccccC---------CccccceEEeccccCCCCCcc
Confidence 788888876 58999999999995 567789999999987664210 012246899999999999999
Q ss_pred ccccCCcEEEEEEeeeccCCC-CcccCceEEEEehhHHHHHHHHHHhccccc-----h-hccCCCCCccceeeeeecCCC
Q 004542 615 AVVNLDGHMIGLVTSNARHGG-GTVIPHLNFSIPCAVLRPIFEFARDMQEVS-----L-LRKLDEPNKHLASVWALMPPL 687 (746)
Q Consensus 615 PL~n~~G~VIGIvssna~~~~-g~~~p~lnFaIPi~~l~~il~~~~~~~d~~-----~-l~~L~~~~~~l~~vW~L~~~~ 687 (746)
||+|.+|+||||+++.....+ +....+++|+||++.++++++++...+.+. + .+.+. +.....-.+....
T Consensus 203 Pl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~---~~~~~~~~~~~~~ 279 (353)
T PRK10898 203 ALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIA---PLHAQGGGIDQLQ 279 (353)
T ss_pred eEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECC---HHHHHhcCCCCCC
Confidence 999999999999998765432 223357999999999999999987755541 1 11111 1111000111111
Q ss_pred CCCCCCCCCCCCc---cccc-cc-ccCCcchhHHHHHHHHHHHh
Q 004542 688 SPKQGPSLPDLPQ---AALE-DN-IEGKGSRFAKFIAERREVLK 726 (746)
Q Consensus 688 ~~~~~~~~~~~p~---~~~~-~~-~~~~~~~~ak~~~~~~~~~~ 726 (746)
........++.|+ ++++ |. .+.+|.++...-+.++.+.+
T Consensus 280 Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~ 323 (353)
T PRK10898 280 GIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAE 323 (353)
T ss_pred eEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh
Confidence 2233444566664 5777 66 99999999877666666654
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.96 E-value=1.6e-28 Score=278.08 Aligned_cols=233 Identities=28% Similarity=0.444 Sum_probs=182.8
Q ss_pred eeEEEEEEeC-CCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCCCCccccccccCCCCCCCcccccccccccc
Q 004542 406 VWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRA 484 (746)
Q Consensus 406 ~wGSGflIn~-~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~~~~~~~~~~v~~~~k~q~l~~k~~~i~~~~~~~~~~ 484 (746)
++||||+|++ +||||||+||++ +
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~----------~---------------------------------------------- 134 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVD----------N---------------------------------------------- 134 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcC----------C----------------------------------------------
Confidence 4799999996 599999999996 1
Q ss_pred cccccccCCceeEEEEEcCCCCceeEeeEEEeecCCCCceEEEEEccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCC
Q 004542 485 YKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPR 563 (746)
Q Consensus 485 ~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLkle~~p~~l~pi~L~~s-~~~~G~~V~vIG~plf~~~ 563 (746)
...|.|++.+++. |+|+|++ .|+.+||||||++. +..+++++++++ .+++|++|+++|| |
T Consensus 135 ---------a~~i~V~~~dg~~---~~a~vv~-~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~-----P 195 (473)
T PRK10942 135 ---------ATKIKVQLSDGRK---FDAKVVG-KDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGN-----P 195 (473)
T ss_pred ---------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcC-----C
Confidence 2238888888877 9999999 78999999999975 457889999876 6999999999999 4
Q ss_pred CCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceE
Q 004542 564 CGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLN 643 (746)
Q Consensus 564 ~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~ln 643 (746)
+|+..+++.|+|++..+... ....+..+|||||++++|||||||+|.+|+||||+++.....++.. +++
T Consensus 196 ~g~~~tvt~GiVs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~--g~g 264 (473)
T PRK10942 196 YGLGETVTSGIVSALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNI--GIG 264 (473)
T ss_pred CCCCcceeEEEEEEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcc--cEE
Confidence 67788999999999876310 0123457899999999999999999999999999999877655533 799
Q ss_pred EEEehhHHHHHHHHHHhccccc-----h-hccCCCCCccceeeeeecCCCCCCCCCCCCCCCc---cccc-cc-ccCCcc
Q 004542 644 FSIPCAVLRPIFEFARDMQEVS-----L-LRKLDEPNKHLASVWALMPPLSPKQGPSLPDLPQ---AALE-DN-IEGKGS 712 (746)
Q Consensus 644 FaIPi~~l~~il~~~~~~~d~~-----~-l~~L~~~~~~l~~vW~L~~~~~~~~~~~~~~~p~---~~~~-~~-~~~~~~ 712 (746)
|+||++.++++++++.+.+.+. + ++.+ ++.++..+.|............|+.|+ ++++ |. .+.+|+
T Consensus 265 faIP~~~~~~v~~~l~~~g~v~rg~lGv~~~~l---~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~ 341 (473)
T PRK10942 265 FAIPSNMVKNLTSQMVEYGQVKRGELGIMGTEL---NSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGK 341 (473)
T ss_pred EEEEHHHHHHHHHHHHhccccccceeeeEeeec---CHHHHHhcCCCCCCceEEEEECCCChHHHcCCCCCCEEEEECCE
Confidence 9999999999999998766642 1 2333 344555555544333344555677775 6777 77 999999
Q ss_pred hhHHHHHHHHHHHhc
Q 004542 713 RFAKFIAERREVLKH 727 (746)
Q Consensus 713 ~~ak~~~~~~~~~~~ 727 (746)
++..+-+.++.+.+.
T Consensus 342 ~V~s~~dl~~~l~~~ 356 (473)
T PRK10942 342 PISSFAALRAQVGTM 356 (473)
T ss_pred ECCCHHHHHHHHHhc
Confidence 999988777766543
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.95 E-value=2.1e-27 Score=266.18 Aligned_cols=235 Identities=28% Similarity=0.421 Sum_probs=178.7
Q ss_pred eeEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCCCCccccccccCCCCCCCccccccccccccc
Q 004542 406 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY 485 (746)
Q Consensus 406 ~wGSGflIn~~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~~~~~~~~~~v~~~~k~q~l~~k~~~i~~~~~~~~~~~ 485 (746)
++||||+|+++||||||+||++ +
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~----------~----------------------------------------------- 80 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVD----------G----------------------------------------------- 80 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcC----------C-----------------------------------------------
Confidence 4799999999999999999997 1
Q ss_pred ccccccCCceeEEEEEcCCCCceeEeeEEEeecCCCCceEEEEEccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCCC
Q 004542 486 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC 564 (746)
Q Consensus 486 ~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLkle~~p~~l~pi~L~~s-~~~~G~~V~vIG~plf~~~~ 564 (746)
...+.|++.+++. ++|++++ .|+.+||||||++. +..+++++++++ .+++|++|+++||| +
T Consensus 81 --------~~~i~V~~~~~~~---~~a~vv~-~d~~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p-----~ 142 (428)
T TIGR02037 81 --------ADEITVTLSDGRE---FKAKLVG-KDPRTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNP-----F 142 (428)
T ss_pred --------CCeEEEEeCCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECC-----C
Confidence 1237788877766 9999999 68889999999986 247899999875 68999999999995 5
Q ss_pred CCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEE
Q 004542 565 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNF 644 (746)
Q Consensus 565 gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnF 644 (746)
++..+++.|+|++..+... ....+..++|||+++++|+|||||||.+|+||||+++.....++. .+++|
T Consensus 143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~--~g~~f 211 (428)
T TIGR02037 143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGN--VGIGF 211 (428)
T ss_pred cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCc--cceEE
Confidence 7778999999998765310 112345689999999999999999999999999999887655443 37999
Q ss_pred EEehhHHHHHHHHHHhccccc--hh-ccCCCCCccceeeeeecCCCCCCCCCCCCCCCc---cccc-cc-ccCCcchhHH
Q 004542 645 SIPCAVLRPIFEFARDMQEVS--LL-RKLDEPNKHLASVWALMPPLSPKQGPSLPDLPQ---AALE-DN-IEGKGSRFAK 716 (746)
Q Consensus 645 aIPi~~l~~il~~~~~~~d~~--~l-~~L~~~~~~l~~vW~L~~~~~~~~~~~~~~~p~---~~~~-~~-~~~~~~~~ak 716 (746)
+||++.++++++++.+.+.+. .| -.+...++..+....|............|+.|+ ++++ |. .+.+|.++..
T Consensus 212 aiP~~~~~~~~~~l~~~g~~~~~~lGi~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~ 291 (428)
T TIGR02037 212 AIPSNMAKNVVDQLIEGGKVQRGWLGVTIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISS 291 (428)
T ss_pred EEEhHHHHHHHHHHHhcCcCcCCcCceEeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCC
Confidence 999999999999998866542 11 011112333333334433333455566677775 6777 66 9999999988
Q ss_pred HHHHHHHHHh
Q 004542 717 FIAERREVLK 726 (746)
Q Consensus 717 ~~~~~~~~~~ 726 (746)
+-+.++.+.+
T Consensus 292 ~~~~~~~l~~ 301 (428)
T TIGR02037 292 FADLRRAIGT 301 (428)
T ss_pred HHHHHHHHHh
Confidence 7776666554
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.88 E-value=3.3e-21 Score=210.58 Aligned_cols=253 Identities=24% Similarity=0.346 Sum_probs=191.7
Q ss_pred ChhHHHhccCceEEEEeCC-----------------CeeEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCC
Q 004542 386 SPLPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPE 448 (746)
Q Consensus 386 ~p~~i~ka~~SVV~I~~~~-----------------~~wGSGflIn~~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~ 448 (746)
....++++.++||.|.... ..+||||+++++|||+||.||++ +
T Consensus 35 ~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~----------~---------- 94 (347)
T COG0265 35 FATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIA----------G---------- 94 (347)
T ss_pred HHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecC----------C----------
Confidence 3457889999999887632 37899999999999999999997 1
Q ss_pred CCCCCCCCCccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEeecCCCCceEEEE
Q 004542 449 DSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQ 528 (746)
Q Consensus 449 ~~~~~~~~~v~~~~k~q~l~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLk 528 (746)
...+.+.+.++.. +++++++ .|+..|+|+||
T Consensus 95 ---------------------------------------------a~~i~v~l~dg~~---~~a~~vg-~d~~~dlavlk 125 (347)
T COG0265 95 ---------------------------------------------AEEITVTLADGRE---VPAKLVG-KDPISDLAVLK 125 (347)
T ss_pred ---------------------------------------------cceEEEEeCCCCE---EEEEEEe-cCCccCEEEEE
Confidence 1226666666665 9999999 89999999999
Q ss_pred EccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccc
Q 004542 529 LGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAV 607 (746)
Q Consensus 529 le~~p~~l~pi~L~~s-~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav 607 (746)
++.... ++.+.+.++ .++.|+.++++|+ ++|+..+++.|+++...+... .....+..+|||||++
T Consensus 126 i~~~~~-~~~~~~~~s~~l~vg~~v~aiGn-----p~g~~~tvt~Givs~~~r~~v--------~~~~~~~~~IqtdAai 191 (347)
T COG0265 126 IDGAGG-LPVIALGDSDKLRVGDVVVAIGN-----PFGLGQTVTSGIVSALGRTGV--------GSAGGYVNFIQTDAAI 191 (347)
T ss_pred eccCCC-CceeeccCCCCcccCCEEEEecC-----CCCcccceeccEEeccccccc--------cCcccccchhhccccc
Confidence 997322 777788876 6889999999999 467889999999999877411 1112256789999999
Q ss_pred cCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHHHHHHHHHhccccc-----h-hccCCCCCccceeee
Q 004542 608 HPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEVS-----L-LRKLDEPNKHLASVW 681 (746)
Q Consensus 608 ~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~~il~~~~~~~d~~-----~-l~~L~~~~~~l~~vW 681 (746)
++|+||||++|.+|++|||++......++.. +++|+||++.+.+++.++...+.+. + +..+..... -
T Consensus 192 n~gnsGgpl~n~~g~~iGint~~~~~~~~~~--gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~-----~ 264 (347)
T COG0265 192 NPGNSGGPLVNIDGEVVGINTAIIAPSGGSS--GIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIA-----L 264 (347)
T ss_pred CCCCCCCceEcCCCcEEEEEEEEecCCCCcc--eeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccc-----c
Confidence 9999999999999999999999988766533 6999999999999999998755331 1 112211111 0
Q ss_pred eecCCCCCCCCCCCCCCCc---cccc-cc-ccCCcchhHHHHHHHHHHHhcc
Q 004542 682 ALMPPLSPKQGPSLPDLPQ---AALE-DN-IEGKGSRFAKFIAERREVLKHS 728 (746)
Q Consensus 682 ~L~~~~~~~~~~~~~~~p~---~~~~-~~-~~~~~~~~ak~~~~~~~~~~~~ 728 (746)
.+............|+.|+ +++. |. .+.+|.++....+.+..+....
T Consensus 265 g~~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~ 316 (347)
T COG0265 265 GLPVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNR 316 (347)
T ss_pred CCCCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccC
Confidence 0111222345555667775 6664 77 9999999999999998888774
No 7
>PRK10139 serine endoprotease; Provisional
Probab=99.84 E-value=1.1e-20 Score=213.55 Aligned_cols=133 Identities=21% Similarity=0.283 Sum_probs=112.4
Q ss_pred cccccCccccc-ccCcccEEEEEEcC-CCCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC--
Q 004542 200 AMEESSNLSLM-SKSTSRVAILGVSS-YLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR-- 275 (746)
Q Consensus 200 ~~~~~~~~~~~-~~~~t~~A~l~i~~-~~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~-- 275 (746)
.++...++.++ .+..+||||||++. ..++..+++.++.+++||+|+|||+||| +..++|.|+||++.|..
T Consensus 122 ~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g------~~~tvt~GivS~~~r~~~~ 195 (455)
T PRK10139 122 NDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFG------LGQTATSGIISALGRSGLN 195 (455)
T ss_pred CCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCC------CCCceEEEEEccccccccC
Confidence 34445557777 67899999999973 3345556677788999999999999999 56899999999998742
Q ss_pred -CCCCCeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHHHHHHHHhhc
Q 004542 276 -STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (746)
Q Consensus 276 -~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i~~~~~~l~~ 338 (746)
..+..||||||++|||||||||||.+|+||||+++.++.. +..|++|+||.+.+..++.+|+.
T Consensus 196 ~~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~ 260 (455)
T PRK10139 196 LEGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLID 260 (455)
T ss_pred CCCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhh
Confidence 2356799999999999999999999999999999999876 67899999999999999988764
No 8
>PRK10942 serine endoprotease; Provisional
Probab=99.80 E-value=2.1e-19 Score=203.99 Aligned_cols=133 Identities=17% Similarity=0.267 Sum_probs=111.7
Q ss_pred cccccCccccc-ccCcccEEEEEEc-CCCCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC--
Q 004542 200 AMEESSNLSLM-SKSTSRVAILGVS-SYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR-- 275 (746)
Q Consensus 200 ~~~~~~~~~~~-~~~~t~~A~l~i~-~~~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~-- 275 (746)
+++..-++.++ .+..+||||||++ ....+...++.++.+++||+|++||+||| |.++++.|+||++.+..
T Consensus 143 ~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~g------~~~tvt~GiVs~~~r~~~~ 216 (473)
T PRK10942 143 SDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPYG------LGETVTSGIVSALGRSGLN 216 (473)
T ss_pred CCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCCC------CCcceeEEEEEEeecccCC
Confidence 34444456666 5688999999996 33344456666778999999999999999 57899999999998752
Q ss_pred -CCCCCeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHHHHHHHHhhc
Q 004542 276 -STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (746)
Q Consensus 276 -~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i~~~~~~l~~ 338 (746)
..+..||||||++|||||||||||.+|+||||+++++... ++.+++|+||++.+..++.+|+.
T Consensus 217 ~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~ 281 (473)
T PRK10942 217 VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVE 281 (473)
T ss_pred cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHh
Confidence 2367899999999999999999999999999999999877 77899999999999999988773
No 9
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.78 E-value=1.3e-18 Score=190.73 Aligned_cols=132 Identities=17% Similarity=0.306 Sum_probs=109.2
Q ss_pred ccccCccccc-ccCcccEEEEEEcCCCCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---C
Q 004542 201 MEESSNLSLM-SKSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---S 276 (746)
Q Consensus 201 ~~~~~~~~~~-~~~~t~~A~l~i~~~~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~---~ 276 (746)
++...++.++ .+..+|+||||++....+..+++.+..+++||+|++||+||| +.++++.|+||+..+.. .
T Consensus 110 dg~~~~a~vv~~d~~~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~------~~~s~t~GiIs~~~r~~~~~~ 183 (351)
T TIGR02038 110 DGRKFEAELVGSDPLTDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYN------LGQTITQGIISATGRNGLSSV 183 (351)
T ss_pred CCCEEEEEEEEecCCCCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCC------CCCcEEEEEEEeccCcccCCC
Confidence 3444456666 568899999999865555556776778999999999999999 46799999999987642 1
Q ss_pred CCCCeEEeecccCCCCcCcceecCCccEEEEEeeecccc---CCcceEEEeehHHHHHHHHHhhc
Q 004542 277 TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK---SGAEIQLVIPWEAIATACSDLLL 338 (746)
Q Consensus 277 ~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~---~~~~l~~~ip~~~i~~~~~~l~~ 338 (746)
.+..||||||.++||||||||||.+|+||||+++.+... ...+++|+||++.+..++.+++.
T Consensus 184 ~~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~ 248 (351)
T TIGR02038 184 GRQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIR 248 (351)
T ss_pred CcceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhh
Confidence 245799999999999999999999999999999988654 23689999999999999887764
No 10
>PRK10898 serine endoprotease; Provisional
Probab=99.77 E-value=2.3e-18 Score=189.04 Aligned_cols=128 Identities=19% Similarity=0.295 Sum_probs=106.3
Q ss_pred Cccccc-ccCcccEEEEEEcCCCCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCC
Q 004542 205 SNLSLM-SKSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRS 280 (746)
Q Consensus 205 ~~~~~~-~~~~t~~A~l~i~~~~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~---~~~~~ 280 (746)
-++.++ .+..+||||||++....+..+++.+..+++||+|+++|+||| +..+++.|+||+..+.. ..+..
T Consensus 114 ~~a~vv~~d~~~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g------~~~~~t~Giis~~~r~~~~~~~~~~ 187 (353)
T PRK10898 114 FEALLVGSDSLTDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYN------LGQTITQGIISATGRIGLSPTGRQN 187 (353)
T ss_pred EEEEEEEEcCCCCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCC------cCCCcceeEEEeccccccCCccccc
Confidence 345555 567899999999865555567777778999999999999999 46789999999887642 22357
Q ss_pred eEEeecccCCCCcCcceecCCccEEEEEeeeccccC----CcceEEEeehHHHHHHHHHhhc
Q 004542 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLLL 338 (746)
Q Consensus 281 ~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~~----~~~l~~~ip~~~i~~~~~~l~~ 338 (746)
||||||+++||||||||+|.+|+||||+++.+...+ ..+++|+||.+.+..++.+++.
T Consensus 188 ~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~ 249 (353)
T PRK10898 188 FLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIR 249 (353)
T ss_pred eEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhh
Confidence 999999999999999999999999999999886542 2689999999999999887763
No 11
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.74 E-value=1.1e-17 Score=187.95 Aligned_cols=128 Identities=20% Similarity=0.330 Sum_probs=107.7
Q ss_pred Cccccc-ccCcccEEEEEEcCC-CCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCC
Q 004542 205 SNLSLM-SKSTSRVAILGVSSY-LKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTR 279 (746)
Q Consensus 205 ~~~~~~-~~~~t~~A~l~i~~~-~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~---~~~~ 279 (746)
-++.++ .+..+|+||||++.. ..+...++.+..+++||+|+++|+||| +..+++.|+||++.+.. ..+.
T Consensus 94 ~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g------~~~~~t~G~vs~~~~~~~~~~~~~ 167 (428)
T TIGR02037 94 FKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFG------LGQTVTSGIVSALGRSGLGIGDYE 167 (428)
T ss_pred EEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCc------CCCcEEEEEEEecccCccCCCCcc
Confidence 345555 567889999999854 334445555678999999999999999 46899999999987752 3466
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHHHHHHHHhhc
Q 004542 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (746)
Q Consensus 280 ~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i~~~~~~l~~ 338 (746)
.|||||++++||||||||||.+|+||||+++.+... +..+++|+||++.+..++.++..
T Consensus 168 ~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~ 227 (428)
T TIGR02037 168 NFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIE 227 (428)
T ss_pred ceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHh
Confidence 799999999999999999999999999999998876 67899999999999999988774
No 12
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.70 E-value=5.1e-17 Score=177.68 Aligned_cols=131 Identities=21% Similarity=0.328 Sum_probs=112.7
Q ss_pred cccCccccc-ccCcccEEEEEEcCCC-CCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCC-C---
Q 004542 202 EESSNLSLM-SKSTSRVAILGVSSYL-KDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-R--- 275 (746)
Q Consensus 202 ~~~~~~~~~-~~~~t~~A~l~i~~~~-~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~-~--- 275 (746)
+..-++.++ .+..+|+|+||++... .+...++++..++.||+++|||+||| |.++++.|+||...|. -
T Consensus 105 g~~~~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~~ 178 (347)
T COG0265 105 GREVPAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGSA 178 (347)
T ss_pred CCEEEEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccCc
Confidence 333445666 6799999999998643 45567778889999999999999999 6799999999999984 1
Q ss_pred CCCCCeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHHHHHHHHhhc
Q 004542 276 STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (746)
Q Consensus 276 ~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i~~~~~~l~~ 338 (746)
..+..||||||++||||+|||++|.+|++|||+++.+... +..|++|+||++.+..++.+++.
T Consensus 179 ~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~ 242 (347)
T COG0265 179 GGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELIS 242 (347)
T ss_pred ccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHH
Confidence 2256899999999999999999999999999999999988 56889999999999999888774
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.54 E-value=8.2e-14 Score=126.93 Aligned_cols=24 Identities=46% Similarity=0.904 Sum_probs=22.4
Q ss_pred EcccccCCCCCCccccCCcEEEEE
Q 004542 603 TTAAVHPGGSGGAVVNLDGHMIGL 626 (746)
Q Consensus 603 TdAav~~GnSGGPL~n~~G~VIGI 626 (746)
+++.+.+|+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999998
No 14
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.47 E-value=3.1e-13 Score=151.18 Aligned_cols=202 Identities=21% Similarity=0.295 Sum_probs=144.4
Q ss_pred HHHhccCceEEEEeCC--------------CeeEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCC
Q 004542 389 PIQKALASVCLITIDD--------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSG 454 (746)
Q Consensus 389 ~i~ka~~SVV~I~~~~--------------~~wGSGflIn~~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~~~~~~~ 454 (746)
..++...++|.|+..+ ...|||++++.+|++|||+||+.-. -..+ .
T Consensus 133 ~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~--------~~~y------------~ 192 (473)
T KOG1320|consen 133 VFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVE--------PRIY------------A 192 (473)
T ss_pred hhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEE--------eccc------------c
Confidence 4567788899888521 1249999999999999999998510 0000 0
Q ss_pred CCCccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCC--CCceeEeeEEEeecCCCCceEEEEEccC
Q 004542 455 HTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHL--DPWIWCDAKIVYVCKGPLDVSLLQLGYI 532 (746)
Q Consensus 455 ~~~v~~~~k~q~l~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~--~~~~W~~A~VV~v~d~~~DLALLkle~~ 532 (746)
+.. ..--.+.++...+ +. +.+.+++ .|...|+|+++++..
T Consensus 193 ~~~----------------------------------~~l~~vqi~aa~~~~~s---~ep~i~g-~d~~~gvA~l~ik~~ 234 (473)
T KOG1320|consen 193 HSS----------------------------------TVLLRVQIDAAIGPGNS---GEPVIVG-VDKVAGVAFLKIKTP 234 (473)
T ss_pred CCC----------------------------------cceeeEEEEEeecCCcc---CCCeEEc-cccccceEEEEEecC
Confidence 000 0011255555555 55 6777777 578899999999752
Q ss_pred CCCcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCC
Q 004542 533 PDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGG 611 (746)
Q Consensus 533 p~~l~pi~L~~s-~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~Gn 611 (746)
..-+++++++.. .+..|+++..+|. ++++..+++.|+++...|...... .. .......++|||+++..|+
T Consensus 235 ~~i~~~i~~~~~~~~~~G~~~~a~~~-----~f~~~nt~t~g~vs~~~R~~~~lg--~~--~g~~i~~~~qtd~ai~~~n 305 (473)
T KOG1320|consen 235 ENILYVIPLGVSSHFRTGVEVSAIGN-----GFGLLNTLTQGMVSGQLRKSFKLG--LE--TGVLISKINQTDAAINPGN 305 (473)
T ss_pred Ccccceeecceeeeecccceeecccc-----CceeeeeeeecccccccccccccC--cc--cceeeeeecccchhhhccc
Confidence 234778888775 7899999999999 478889999999998877531111 10 1123456899999999999
Q ss_pred CCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHHHHHHHHH
Q 004542 612 SGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFAR 659 (746)
Q Consensus 612 SGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~~il~~~~ 659 (746)
||||++|.+|++||+++.+...-+-. -+++|++|.+.+..++.+..
T Consensus 306 sg~~ll~~DG~~IgVn~~~~~ri~~~--~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 306 SGGPLLNLDGEVIGVNTRKVTRIGFS--HGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred CCCcEEEecCcEeeeeeeeeEEeecc--ccceeccCchHhhhhhhhhh
Confidence 99999999999999988876642211 26899999999999877663
No 15
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.31 E-value=8.9e-11 Score=117.10 Aligned_cols=122 Identities=20% Similarity=0.302 Sum_probs=77.8
Q ss_pred CCceEEEEEccC---CCCcceeecCCC--CCCCCCeEEEEecCCCCCCCCCCCeeeeeEE---eeeeeccCCCCCccccc
Q 004542 521 PLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVV---AKVVKANLPSYGQSTLQ 592 (746)
Q Consensus 521 ~~DLALLkle~~---p~~l~pi~L~~s--~~~~G~~V~vIG~plf~~~~gl~psvt~GiV---S~v~~~~~~~~~~~~~~ 592 (746)
.+|||||+++.. ...+.|+.+... .+..|+.+.++|||.... .+....+....+ +...+.. . .
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~-~~~~~~~~~~~~~~~~~~~c~~-------~-~ 156 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD-NGYSSNLQSVTVPVVSRKTCRS-------S-Y 156 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST-TSBTSBEEEEEEEEEEHHHHHH-------H-T
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccccccccc-------c-c
Confidence 589999999973 356778888763 468999999999985321 111123333333 2222211 0 0
Q ss_pred cCCCcCcEEEEcc----cccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHHHH
Q 004542 593 RNSAYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPI 654 (746)
Q Consensus 593 ~~~~~~~~IqTdA----av~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~~i 654 (746)
.......++++.. ..+.|+|||||++.++.|+||++.... .+... ...+.+++..+.+.
T Consensus 157 ~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~-c~~~~--~~~v~~~v~~~~~W 219 (220)
T PF00089_consen 157 NDNLTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGEN-CGSPN--YPGVYTRVSSYLDW 219 (220)
T ss_dssp TTTSTTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESS-SSBTT--SEEEEEEGGGGHHH
T ss_pred ccccccccccccccccccccccccccccccceeeecceeeecCC-CCCCC--cCEEEEEHHHhhcc
Confidence 0113456777776 788999999999877789999998832 22221 24777888766553
No 16
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.20 E-value=6.5e-10 Score=111.61 Aligned_cols=106 Identities=23% Similarity=0.241 Sum_probs=63.1
Q ss_pred CCCceEEEEEccC---CCCcceeecCCC--CCCCCCeEEEEecCCCCCCCCCCCeeeeeE---EeeeeeccCCCCCcccc
Q 004542 520 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGV---VAKVVKANLPSYGQSTL 591 (746)
Q Consensus 520 ~~~DLALLkle~~---p~~l~pi~L~~s--~~~~G~~V~vIG~plf~~~~gl~psvt~Gi---VS~v~~~~~~~~~~~~~ 591 (746)
..+|||||+|+.. ...+.|+.+... .+..|+.++++||+................ ++...+.. ..
T Consensus 87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~-------~~ 159 (232)
T cd00190 87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKR-------AY 159 (232)
T ss_pred CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhh-------hc
Confidence 4689999999862 334678888766 678899999999986432111111222222 22211110 00
Q ss_pred cc-CCCcCcEEEEc-----ccccCCCCCCccccCC---cEEEEEEeeecc
Q 004542 592 QR-NSAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTSNAR 632 (746)
Q Consensus 592 ~~-~~~~~~~IqTd-----Aav~~GnSGGPL~n~~---G~VIGIvssna~ 632 (746)
.. ......+++.. ...+.|+|||||+... +.++||++....
T Consensus 160 ~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 160 SYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG 209 (232)
T ss_pred cCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc
Confidence 00 01123344443 3467899999999653 789999988654
No 17
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.17 E-value=3.6e-11 Score=134.77 Aligned_cols=128 Identities=19% Similarity=0.324 Sum_probs=107.8
Q ss_pred ccCcccccc-cCcccEEEEEEcCC--CCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC----
Q 004542 203 ESSNLSLMS-KSTSRVAILGVSSY--LKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---- 275 (746)
Q Consensus 203 ~~~~~~~~~-~~~t~~A~l~i~~~--~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~---- 275 (746)
.+.+|.+++ +..-|+|++|++.. .+...+++.+..++.|+|+.++++||+ +.|++++|+|+...|+.
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~------~~nt~t~g~vs~~~R~~~~lg 284 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFG------LLNTLTQGMVSGQLRKSFKLG 284 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCce------eeeeeeecccccccccccccC
Confidence 566788886 78889999999633 245557777889999999999999999 58999999999888752
Q ss_pred ----CCCCCeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHHHHHHHHh
Q 004542 276 ----STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDL 336 (746)
Q Consensus 276 ----~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i~~~~~~l 336 (746)
.....++|||+++++||+|||++|.+|+.||++++...+. -+.++.|++|.+.+...+...
T Consensus 285 ~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 285 LETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRL 350 (473)
T ss_pred cccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhh
Confidence 2345689999999999999999999999999999988766 567999999999998766544
No 18
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.11 E-value=6.1e-10 Score=126.50 Aligned_cols=257 Identities=19% Similarity=0.271 Sum_probs=169.0
Q ss_pred HHHhccCceEEEEeC----------CCeeEEEEEEeC-CCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCCCC
Q 004542 389 PIQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG 457 (746)
Q Consensus 389 ~i~ka~~SVV~I~~~----------~~~wGSGflIn~-~GlILTnaHVV~p~~~~~t~~~G~~~~~~f~~~~~~~~~~~~ 457 (746)
.++.+.++||.|+.. ..+-|+||++++ .||||||+||+.|--+... ..|...
T Consensus 57 ~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~--------avf~n~--------- 119 (955)
T KOG1421|consen 57 TIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVAS--------AVFDNH--------- 119 (955)
T ss_pred hhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCceeE--------EEeccc---------
Confidence 567788999999862 345799999987 6899999999985211100 111111
Q ss_pred ccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEeecCCCCceEEEEEccC---CC
Q 004542 458 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYI---PD 534 (746)
Q Consensus 458 v~~~~k~q~l~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLkle~~---p~ 534 (746)
.. ++.-.+| .|+-+|+.++|.+.. ..
T Consensus 120 -----------------------------------------------ee---~ei~pvy-rDpVhdfGf~r~dps~ir~s 148 (955)
T KOG1421|consen 120 -----------------------------------------------EE---IEIYPVY-RDPVHDFGFFRYDPSTIRFS 148 (955)
T ss_pred -----------------------------------------------cc---CCccccc-CCchhhcceeecChhhccee
Confidence 11 2333455 788899999999862 12
Q ss_pred CcceeecCCCCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCCC
Q 004542 535 QLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 614 (746)
Q Consensus 535 ~l~pi~L~~s~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSGG 614 (746)
.+..+.++..-.++|.+++++|+ -.+..-++-.|-++...+. .|-+......... .-++|.-+...+|.||.
T Consensus 149 ~vt~i~lap~~akvgseirvvgN-----DagEklsIlagflSrldr~-apdyg~~~yndfn--Tfy~Qaasstsggssgs 220 (955)
T KOG1421|consen 149 IVTEICLAPELAKVGSEIRVVGN-----DAGEKLSILAGFLSRLDRN-APDYGEDTYNDFN--TFYIQAASSTSGGSSGS 220 (955)
T ss_pred eeeccccCccccccCCceEEecC-----CccceEEeehhhhhhccCC-Ccccccccccccc--ceeeeehhcCCCCCCCC
Confidence 33344455555689999999999 5667788889999988775 2444333322222 23578878889999999
Q ss_pred ccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHHHHHHHHHhcccc--------------chhccCCCCCccceee
Q 004542 615 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEV--------------SLLRKLDEPNKHLASV 680 (746)
Q Consensus 615 PL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~~il~~~~~~~d~--------------~~l~~L~~~~~~l~~v 680 (746)
||+|-+|..|.++..... ...-.|++|++-+.+.+..++...-+ .-.+++...++....+
T Consensus 221 pVv~i~gyAVAl~agg~~------ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rrlGL~sE~eqv~ 294 (955)
T KOG1421|consen 221 PVVDIPGYAVALNAGGSI------SSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRRLGLSSEWEQVV 294 (955)
T ss_pred ceecccceEEeeecCCcc------cccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHhcCCcHHHHHHH
Confidence 999999999999864432 33568999999999999888742111 1123444344433333
Q ss_pred eeecCCCCC--CCCCCCCCCCc--cccc-cc-ccCCcchhHHHHHHHHHHHhc
Q 004542 681 WALMPPLSP--KQGPSLPDLPQ--AALE-DN-IEGKGSRFAKFIAERREVLKH 727 (746)
Q Consensus 681 W~L~~~~~~--~~~~~~~~~p~--~~~~-~~-~~~~~~~~ak~~~~~~~~~~~ 727 (746)
-...|..-- -....+|+.|. .+.+ |. ...|+.-|..|++.-|.+-+.
T Consensus 295 r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l~~iLDeg 347 (955)
T KOG1421|consen 295 RTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEALEQILDEG 347 (955)
T ss_pred HhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHHHHHHhhc
Confidence 333332111 33445677775 4444 66 888999999999988877665
No 19
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.00 E-value=2.8e-08 Score=100.32 Aligned_cols=108 Identities=24% Similarity=0.276 Sum_probs=61.1
Q ss_pred CCCceEEEEEccC---CCCcceeecCCC--CCCCCCeEEEEecCCCCCCC-CCCCeeeeeEEeeeeeccCCCCCcccccc
Q 004542 520 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRC-GLSPSVSSGVVAKVVKANLPSYGQSTLQR 593 (746)
Q Consensus 520 ~~~DLALLkle~~---p~~l~pi~L~~s--~~~~G~~V~vIG~plf~~~~-gl~psvt~GiVS~v~~~~~~~~~~~~~~~ 593 (746)
..+|||||+|+.. ...+.|+.+... .+..|+.+.++|||...... ..........+.-.... .+......
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~----~C~~~~~~ 162 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNA----TCRRAYSG 162 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHH----Hhhhhhcc
Confidence 4689999999862 345778887664 57789999999998543200 11112222222111110 00000000
Q ss_pred -CCCcCcEEEE-----cccccCCCCCCccccCCc--EEEEEEeeec
Q 004542 594 -NSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTSNA 631 (746)
Q Consensus 594 -~~~~~~~IqT-----dAav~~GnSGGPL~n~~G--~VIGIvssna 631 (746)
......+++. +...++|+|||||+...+ .++||++...
T Consensus 163 ~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 163 GGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred ccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0011223333 345788999999996443 8999999876
No 20
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.47 E-value=1.8e-06 Score=90.72 Aligned_cols=73 Identities=26% Similarity=0.281 Sum_probs=50.8
Q ss_pred CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCCCccccCCcEEE
Q 004542 545 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI 624 (746)
Q Consensus 545 ~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSGGPL~n~~G~VI 624 (746)
..+.++.+.++|||.-.+..+ ..-...+.+..+. ...++.+|.+.+|+||.||++.+.++|
T Consensus 157 ~~~~~d~i~v~GYP~dk~~~~-~~~e~t~~v~~~~------------------~~~l~y~~dT~pG~SGSpv~~~~~~vi 217 (251)
T COG3591 157 EAKANDRITVIGYPGDKPNIG-TMWESTGKVNSIK------------------GNKLFYDADTLPGSSGSPVLISKDEVI 217 (251)
T ss_pred ccccCceeEEEeccCCCCcce-eEeeecceeEEEe------------------cceEEEEecccCCCCCCceEecCceEE
Confidence 578899999999974322111 1122233333221 226889999999999999999888999
Q ss_pred EEEeeeccCCCC
Q 004542 625 GLVTSNARHGGG 636 (746)
Q Consensus 625 GIvssna~~~~g 636 (746)
|+.+.+....++
T Consensus 218 gv~~~g~~~~~~ 229 (251)
T COG3591 218 GVHYNGPGANGG 229 (251)
T ss_pred EEEecCCCcccc
Confidence 999988775544
No 21
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.14 E-value=0.0002 Score=74.50 Aligned_cols=114 Identities=21% Similarity=0.276 Sum_probs=65.4
Q ss_pred CceEEEEEcc---CCCCcceeecCCCC----CCCCCeEEEEecCCCCCC-CCCCCeee---eeEEeeeeeccCCCCCccc
Q 004542 522 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPR-CGLSPSVS---SGVVAKVVKANLPSYGQST 590 (746)
Q Consensus 522 ~DLALLkle~---~p~~l~pi~L~~s~----~~~G~~V~vIG~plf~~~-~gl~psvt---~GiVS~v~~~~~~~~~~~~ 590 (746)
+|||||+++. ..+.+.|+.+.... ...+..+++.|||..... ......+. .-+++...+ ...
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C-------~~~ 178 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSEC-------RRA 178 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHh-------ccc
Confidence 8999999986 34567777775332 344588999999854321 01112222 222222111 111
Q ss_pred cccC-CCcCcEEEEcc-----cccCCCCCCccccCC---cEEEEEEeeeccCCCCcccCce
Q 004542 591 LQRN-SAYPVMLETTA-----AVHPGGSGGAVVNLD---GHMIGLVTSNARHGGGTVIPHL 642 (746)
Q Consensus 591 ~~~~-~~~~~~IqTdA-----av~~GnSGGPL~n~~---G~VIGIvssna~~~~g~~~p~l 642 (746)
.... .....++++.. .++.|+|||||+-.+ ..++||+++.....+....|+.
T Consensus 179 ~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~v 239 (256)
T KOG3627|consen 179 YGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGV 239 (256)
T ss_pred ccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeE
Confidence 1110 11134577653 357899999999543 6999999998765443334555
No 22
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.71 E-value=0.00081 Score=70.38 Aligned_cols=103 Identities=17% Similarity=0.320 Sum_probs=50.4
Q ss_pred CCCceEEEEEccCCCCcceeec--CCCCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCc
Q 004542 520 GPLDVSLLQLGYIPDQLCPIDA--DFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAY 597 (746)
Q Consensus 520 ~~~DLALLkle~~p~~l~pi~L--~~s~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~ 597 (746)
+..||.++|+.. +++|.+- .+..++.|+.|+.+|.=. - ..+....++ .-|.+.+ ...
T Consensus 80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~f-q-~k~~~s~vS--esS~i~p--------------~~~ 138 (235)
T PF00863_consen 80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNF-Q-EKSISSTVS--ESSWIYP--------------EEN 138 (235)
T ss_dssp TCSSEEEEE--T---TS----S---B----TT-EEEEEEEEC-S-SCCCEEEEE--EEEEEEE--------------ETT
T ss_pred CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEE-E-cCCeeEEEC--CceEEee--------------cCC
Confidence 469999999974 6666543 344789999999999721 0 111111222 1122222 122
Q ss_pred CcEEEEcccccCCCCCCccccC-CcEEEEEEeeeccCCCCcccCceEEEEehh
Q 004542 598 PVMLETTAAVHPGGSGGAVVNL-DGHMIGLVTSNARHGGGTVIPHLNFSIPCA 649 (746)
Q Consensus 598 ~~~IqTdAav~~GnSGGPL~n~-~G~VIGIvssna~~~~g~~~p~lnFaIPi~ 649 (746)
..+...-..+..|+-|.||++. +|++|||.+...... ..||..|+.
T Consensus 139 ~~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~------~~N~F~~f~ 185 (235)
T PF00863_consen 139 SHFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTS------SRNYFTPFP 185 (235)
T ss_dssp TTEEEE-C---TT-TT-EEEETTT--EEEEEEEEETTT------SSEEEEE--
T ss_pred CCeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCCC------CeEEEEcCC
Confidence 3466666777899999999974 799999999765432 367887764
No 23
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.66 E-value=2.3e-05 Score=71.16 Aligned_cols=24 Identities=46% Similarity=0.855 Sum_probs=22.5
Q ss_pred eecccCCCCcCcceecCCccEEEE
Q 004542 284 ADIRCLPGMEGGPVFGEHAHFVGI 307 (746)
Q Consensus 284 tDa~~~pG~~GG~v~~~~g~liGi 307 (746)
+|+.+.||+|||||||.+|++|||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 24
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.14 E-value=0.0025 Score=69.63 Aligned_cols=51 Identities=24% Similarity=0.394 Sum_probs=35.7
Q ss_pred ccccCCCCCCcccc--CCcEE-EEEEeeeccCCCCcccCceEEEEehhHHHHHHHH
Q 004542 605 AAVHPGGSGGAVVN--LDGHM-IGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEF 657 (746)
Q Consensus 605 Aav~~GnSGGPL~n--~~G~V-IGIvssna~~~~g~~~p~lnFaIPi~~l~~il~~ 657 (746)
...|.|+||||+|- .+|++ +||+++.-...++..+|++--. ++.....+..
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~--vsny~~WI~a 276 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTN--VSNYQDWIAA 276 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEe--hhHHHHHHHH
Confidence 45688999999993 35876 9999999888777766663222 4444444444
No 25
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.11 E-value=0.029 Score=59.80 Aligned_cols=92 Identities=16% Similarity=0.153 Sum_probs=56.9
Q ss_pred CCCceEEEEEccC-CCCcceeecCCC--CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCC
Q 004542 520 GPLDVSLLQLGYI-PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 596 (746)
Q Consensus 520 ~~~DLALLkle~~-p~~l~pi~L~~s--~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~ 596 (746)
..++++||.++.. .....|+-++++ ....|+.+.+.|+. .. ..+....+.-.... .
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~-----~~--~~~~~~~~~i~~~~--------------~ 217 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN-----ST--GKLKHRKLKITNCT--------------K 217 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC-----CC--CeEEEEEEEEEEee--------------c
Confidence 6789999999973 245556666554 46789999988871 11 12222222222110 1
Q ss_pred cCcEEEEcccccCCCCCCcccc-CC--cEEEEEEeeecc
Q 004542 597 YPVMLETTAAVHPGGSGGAVVN-LD--GHMIGLVTSNAR 632 (746)
Q Consensus 597 ~~~~IqTdAav~~GnSGGPL~n-~~--G~VIGIvssna~ 632 (746)
....+.+....+.|++||||+. .+ -.||||.+.+..
T Consensus 218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~ 256 (282)
T PF03761_consen 218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNY 256 (282)
T ss_pred cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCC
Confidence 2334555666789999999983 33 468999887654
No 26
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=96.88 E-value=0.0047 Score=65.17 Aligned_cols=78 Identities=24% Similarity=0.319 Sum_probs=42.6
Q ss_pred CCceEEEEEccCCCCcceeecCCCCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcE
Q 004542 521 PLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVM 600 (746)
Q Consensus 521 ~~DLALLkle~~p~~l~pi~L~~s~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~ 600 (746)
.-|.|.-.++..+...|.+++... ..|- .|-.-. .-+..|.|....+
T Consensus 155 ~GDfA~~~~~~~~G~~P~~k~a~~--~~Gr-AyW~t~----------tGvE~G~ig~~~~-------------------- 201 (297)
T PF05579_consen 155 NGDFAEADITNWPGAAPKYKFAQN--YTGR-AYWLTS----------TGVEPGFIGGGGA-------------------- 201 (297)
T ss_dssp ETTEEEEEETTS-S---B--B-TT---SEE-EEEEET----------TEEEEEEEETTEE--------------------
T ss_pred cCcEEEEECCCCCCCCCceeecCC--cccc-eEEEcc----------cCcccceecCceE--------------------
Confidence 478899888766677777777632 2332 222211 2245565553222
Q ss_pred EEEcccccCCCCCCccccCCcEEEEEEeeeccCC
Q 004542 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 634 (746)
Q Consensus 601 IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~ 634 (746)
++ -.++|+||+|++..+|.+|||++..-+.+
T Consensus 202 ~~---fT~~GDSGSPVVt~dg~liGVHTGSn~~G 232 (297)
T PF05579_consen 202 VC---FTGPGDSGSPVVTEDGDLIGVHTGSNKRG 232 (297)
T ss_dssp EE---SS-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred EE---EcCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence 22 34789999999999999999999876543
No 27
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.20 E-value=0.063 Score=53.40 Aligned_cols=115 Identities=17% Similarity=0.147 Sum_probs=69.9
Q ss_pred cccEEEEEEcCC---CCCCCccccCC---CCCCCCeEEEEeCCCCCCCc-cccccceEEEEEeee--cC--CCCCCCCeE
Q 004542 214 TSRVAILGVSSY---LKDLPNIALTP---LNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLL 282 (746)
Q Consensus 214 ~t~~A~l~i~~~---~~~~~~~~~s~---~~~~Gd~v~aigsPFg~~~p-~~f~n~vs~GiVs~~--~~--~~~~~~~~i 282 (746)
..||||||++.. .....++.... .++.|+.+.++|.+...... .--.......+++.. .. ........+
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 359999999854 12222333322 35899999999988853221 001122334444432 11 111234567
Q ss_pred Eeec----ccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHH
Q 004542 283 MADI----RCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAI 329 (746)
Q Consensus 283 ~tDa----~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i 329 (746)
.++. ...+|++||||++.++.||||++.. ..+ ......+..+...+
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~ 216 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSY 216 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGG
T ss_pred cccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHHh
Confidence 7776 7789999999999998999999988 333 33335666665443
No 28
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.44 E-value=0.06 Score=64.76 Aligned_cols=66 Identities=20% Similarity=0.281 Sum_probs=46.8
Q ss_pred CCcCcEEEEcccccCCCCCCccccCCcEEEEEEeeeccCCCCccc---Cc--eEEEEehhHHHHHHHHHHh
Q 004542 595 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVI---PH--LNFSIPCAVLRPIFEFARD 660 (746)
Q Consensus 595 ~~~~~~IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~---p~--lnFaIPi~~l~~il~~~~~ 660 (746)
...+.-+.+++.+.+||||+|++|.+|+|||++.-..-+...+.+ |. -+.++-+..+..+++.+..
T Consensus 618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~g 688 (698)
T PF10459_consen 618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVYG 688 (698)
T ss_pred CCeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHhC
Confidence 445777889999999999999999999999998654333211111 22 3556666777777777654
No 29
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=93.89 E-value=0.79 Score=46.02 Aligned_cols=35 Identities=29% Similarity=0.526 Sum_probs=29.3
Q ss_pred CcCcEEEEcccccCCCCCCccccC---CcEEEEEEeee
Q 004542 596 AYPVMLETTAAVHPGGSGGAVVNL---DGHMIGLVTSN 630 (746)
Q Consensus 596 ~~~~~IqTdAav~~GnSGGPL~n~---~G~VIGIvssn 630 (746)
..+.++.+.++..+|+-||||+.. .++++||.++.
T Consensus 133 ~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 133 TTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred EeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 346788899999999999999942 57999999874
No 30
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=93.42 E-value=0.4 Score=51.02 Aligned_cols=76 Identities=22% Similarity=0.223 Sum_probs=60.0
Q ss_pred cccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCCCCCCCeEEeecccCCCCcCcceecCCccEEEEEeee
Q 004542 232 IALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (746)
Q Consensus 232 ~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~ 311 (746)
.......+.+|.|.++|.|=.- |.++....+.+.|-.... .+++-|+...||+||.||++.+.++||+.+..
T Consensus 152 ~~~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~------~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g 223 (251)
T COG3591 152 RNTASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKG------NKLFYDADTLPGSSGSPVLISKDEVIGVHYNG 223 (251)
T ss_pred cccccccccCceeEEEeccCCC--CcceeEeeecceeEEEec------ceEEEEecccCCCCCCceEecCceEEEEEecC
Confidence 3345578999999999999874 334555666666655543 26888999999999999999999999999999
Q ss_pred cccc
Q 004542 312 LRQK 315 (746)
Q Consensus 312 l~~~ 315 (746)
....
T Consensus 224 ~~~~ 227 (251)
T COG3591 224 PGAN 227 (251)
T ss_pred CCcc
Confidence 8755
No 31
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=91.50 E-value=3 Score=49.59 Aligned_cols=148 Identities=14% Similarity=0.140 Sum_probs=84.6
Q ss_pred EEEEEcCCCCceeEeeEEEeecCCCCceEEEEEccCCCCcceeecCCCCCCCCCeEEEEecCCCCCCCCC-----CCeee
Q 004542 497 IRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGL-----SPSVS 571 (746)
Q Consensus 497 I~Vrl~~~~~~~W~~A~VV~v~d~~~DLALLkle~~p~~l~pi~L~~s~~~~G~~V~vIG~plf~~~~gl-----~psvt 571 (746)
++|+..+... ..|.+.+ -++...+|.+|.+. ......++.+..+..||++...|+- .++ ..+++
T Consensus 578 ~~vt~~dS~~---i~a~~~f-L~~t~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~~-----~~~r~ltaktsv~ 646 (955)
T KOG1421|consen 578 QRVTEADSDG---IPANVSF-LHPTENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGFT-----EDLRALTAKTSVT 646 (955)
T ss_pred eEEeeccccc---ccceeeE-ecCccceeEeccCh--hHhhhhccceeeEecCCceeEeccc-----ccchhhcccceee
Confidence 4555555555 6788887 57778889888875 3334556666678999999999983 222 12333
Q ss_pred eeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHH
Q 004542 572 SGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVL 651 (746)
Q Consensus 572 ~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l 651 (746)
.=.+-.+-+...|.+ .... ...|...+.+.-+.--|-+.|.+|+++|+=-+......+.+--..-|.+-+..+
T Consensus 647 dvs~~~~ps~~~pr~------r~~n-~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~ 719 (955)
T KOG1421|consen 647 DVSVVIIPSSVMPRF------RATN-LEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYI 719 (955)
T ss_pred eeEEEEecCCCCcce------eecc-eEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHH
Confidence 221111111111111 0011 123333333333333345678899999997776665433221234566677889
Q ss_pred HHHHHHHHhcc
Q 004542 652 RPIFEFARDMQ 662 (746)
Q Consensus 652 ~~il~~~~~~~ 662 (746)
.++++.++...
T Consensus 720 l~vl~rlk~g~ 730 (955)
T KOG1421|consen 720 LPVLERLKLGP 730 (955)
T ss_pred HHHHHHHhcCC
Confidence 99999998743
No 32
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=90.99 E-value=1.1 Score=53.06 Aligned_cols=119 Identities=14% Similarity=0.207 Sum_probs=71.8
Q ss_pred CCCCceEEEEEcc-------CCCCc------ceeecCC-------CCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeee
Q 004542 519 KGPLDVSLLQLGY-------IPDQL------CPIDADF-------GQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKV 578 (746)
Q Consensus 519 d~~~DLALLkle~-------~p~~l------~pi~L~~-------s~~~~G~~V~vIG~plf~~~~gl~psvt~GiVS~v 578 (746)
..-.|+||++++. ..+++ +.+.+.. ....+|..|+=+|. ..| .|.|.++++
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~Gr-----TTg----yT~G~lNg~ 610 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGR-----TTG----YTTGILNGI 610 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecc-----cCC----ccceEecce
Confidence 4456999999985 11121 1122221 23567999999997 455 588888876
Q ss_pred eeccCCCCCccccccCCCcCcEEEEc----ccccCCCCCCccccCCc------EEEEEEeeeccCCCCcccCceEEEEeh
Q 004542 579 VKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDG------HMIGLVTSNARHGGGTVIPHLNFSIPC 648 (746)
Q Consensus 579 ~~~~~~~~~~~~~~~~~~~~~~IqTd----Aav~~GnSGGPL~n~~G------~VIGIvssna~~~~g~~~p~lnFaIPi 648 (746)
.-. +-. ++.-....+++.+ +-..+|+||.=|++.-+ .|+||..+.-+.. -.+++..|+
T Consensus 611 klv----yw~---dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~-----kqfglftPi 678 (695)
T PF08192_consen 611 KLV----YWA---DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQ-----KQFGLFTPI 678 (695)
T ss_pred EEE----Eec---CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCcc-----ceeeccCcH
Confidence 422 100 1111112334444 33568999999997533 4999998864332 258889998
Q ss_pred hHHHHHHHHH
Q 004542 649 AVLRPIFEFA 658 (746)
Q Consensus 649 ~~l~~il~~~ 658 (746)
..|.+=+++.
T Consensus 679 ~~il~rl~~v 688 (695)
T PF08192_consen 679 NEILDRLEEV 688 (695)
T ss_pred HHHHHHHHHh
Confidence 8877666554
No 33
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=89.45 E-value=0.25 Score=47.53 Aligned_cols=45 Identities=27% Similarity=0.524 Sum_probs=35.9
Q ss_pred eecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEeehHHH
Q 004542 284 ADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAI 329 (746)
Q Consensus 284 tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~-~~~~l~~~ip~~~i 329 (746)
.-+.++-|+|||||+-..|++|||..+-++.+ ---.+-|+ ||+.+
T Consensus 101 ~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 101 RPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp EEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred ceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 45667889999999999999999999999887 33477887 99865
No 34
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=88.85 E-value=0.33 Score=46.84 Aligned_cols=36 Identities=19% Similarity=0.397 Sum_probs=25.8
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeeecccc
Q 004542 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK 315 (746)
Q Consensus 280 ~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~~ 315 (746)
.+.+.|..+-+|+||.|+||.+|++|||--..+.-.
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 466788889999999999999999999987776543
No 35
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=88.00 E-value=0.36 Score=58.31 Aligned_cols=29 Identities=21% Similarity=0.425 Sum_probs=26.7
Q ss_pred eEEeecccCCCCcCcceecCCccEEEEEe
Q 004542 281 LLMADIRCLPGMEGGPVFGEHAHFVGILI 309 (746)
Q Consensus 281 ~i~tDa~~~pG~~GG~v~~~~g~liGi~~ 309 (746)
.++|+..|--||||.||+|.+|||||++.
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~F 651 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAF 651 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEee
Confidence 47899999999999999999999999987
No 36
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=87.39 E-value=0.41 Score=46.18 Aligned_cols=30 Identities=27% Similarity=0.539 Sum_probs=21.5
Q ss_pred ccccCCCCCCccccCCcEEEEEEeeeccCC
Q 004542 605 AAVHPGGSGGAVVNLDGHMIGLVTSNARHG 634 (746)
Q Consensus 605 Aav~~GnSGGPL~n~~G~VIGIvssna~~~ 634 (746)
....+|.||+|+||.+|++|||--......
T Consensus 92 ~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 92 LDFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp --S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred cccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 346789999999999999999976655433
No 37
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=86.21 E-value=0.73 Score=44.47 Aligned_cols=43 Identities=26% Similarity=0.484 Sum_probs=30.1
Q ss_pred ccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHHH
Q 004542 607 VHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 652 (746)
Q Consensus 607 v~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l~ 652 (746)
...|+|||||+-.+|++|||..+..-..+- ...+-|. |++.+.
T Consensus 105 ~lkGSSGgPiLC~~GH~vG~f~aa~~trgv--ak~i~f~-P~e~l~ 147 (148)
T PF02907_consen 105 DLKGSSGGPILCPSGHAVGMFRAAVCTRGV--AKAIDFI-PVETLP 147 (148)
T ss_dssp HHTT-TT-EEEETTSEEEEEEEEEEEETTE--EEEEEEE-EHHHHH
T ss_pred EEecCCCCcccCCCCCEEEEEEEEEEcCCc--eeeEEEE-eeeecC
Confidence 457999999999999999998876543332 2367777 887653
No 38
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=85.86 E-value=0.74 Score=44.41 Aligned_cols=34 Identities=26% Similarity=0.487 Sum_probs=27.5
Q ss_pred EEEcccccCCCCCCccccCCcEEEEEEeeeccCC
Q 004542 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 634 (746)
Q Consensus 601 IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~ 634 (746)
..-+..-.+|+||-|++|..|+||||+-....+.
T Consensus 97 tip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG 130 (158)
T PF00944_consen 97 TIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEG 130 (158)
T ss_dssp EEETTS-STTSTTEEEESTTSBEEEEEEEEEEET
T ss_pred EeccCCCCCCCCCCccCcCCCCEEEEEecCCCCC
Confidence 3445667899999999999999999998877654
No 39
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=82.71 E-value=5.2 Score=42.35 Aligned_cols=92 Identities=22% Similarity=0.241 Sum_probs=45.6
Q ss_pred CcccEEEEEEcCCCCCCCccccCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCCCCCCCeEEeecccCCCC
Q 004542 213 STSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGM 292 (746)
Q Consensus 213 ~~t~~A~l~i~~~~~~~~~~~~s~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~~~~~~~~i~tDa~~~pG~ 292 (746)
.-.||.++|......|.|....-..++.||.|..||+=|---+ ..-++|. -|.+.+ .....|+---+.-.+|+
T Consensus 80 ~~~DiviirmPkDfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~---~~s~vSe--sS~i~p--~~~~~fWkHwIsTk~G~ 152 (235)
T PF00863_consen 80 EGRDIVIIRMPKDFPPFPQKLKFRAPKEGERVCMVGSNFQEKS---ISSTVSE--SSWIYP--EENSHFWKHWISTKDGD 152 (235)
T ss_dssp TCSSEEEEE--TTS----S---B----TT-EEEEEEEECSSCC---CEEEEEE--EEEEEE--ETTTTEEEE-C---TT-
T ss_pred CCccEEEEeCCcccCCcchhhhccCCCCCCEEEEEEEEEEcCC---eeEEECC--ceEEee--cCCCCeeEEEecCCCCc
Confidence 3459999999744333332222257899999999999775311 1112222 223333 12356888889999999
Q ss_pred cCcceecC-CccEEEEEeee
Q 004542 293 EGGPVFGE-HAHFVGILIRP 311 (746)
Q Consensus 293 ~GG~v~~~-~g~liGi~~~~ 311 (746)
.|.|+++. +|.+|||-...
T Consensus 153 CG~PlVs~~Dg~IVGiHsl~ 172 (235)
T PF00863_consen 153 CGLPLVSTKDGKIVGIHSLT 172 (235)
T ss_dssp TT-EEEETTT--EEEEEEEE
T ss_pred cCCcEEEcCCCcEEEEEcCc
Confidence 99999975 78899998743
No 40
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=80.87 E-value=10 Score=38.04 Aligned_cols=99 Identities=15% Similarity=0.095 Sum_probs=50.2
Q ss_pred cccEEEEEEcCCC---CCCCccccC---CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCC------------C
Q 004542 214 TSRVAILGVSSYL---KDLPNIALT---PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP------------R 275 (746)
Q Consensus 214 ~t~~A~l~i~~~~---~~~~~~~~s---~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~GiVs~~~~~------------~ 275 (746)
..|+||||++... ....++.+. ..+..|+.+.+.|..-.......+...+....+.-..+. .
T Consensus 88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~ 167 (229)
T smart00020 88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAIT 167 (229)
T ss_pred cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccC
Confidence 4599999997431 112222222 246778888888854332111112222222222211110 0
Q ss_pred CCCCCeEE--eecccCCCCcCcceecCCc--cEEEEEeeec
Q 004542 276 STTRSLLM--ADIRCLPGMEGGPVFGEHA--HFVGILIRPL 312 (746)
Q Consensus 276 ~~~~~~i~--tDa~~~pG~~GG~v~~~~g--~liGi~~~~l 312 (746)
....+... .....-+|.+|||++...+ .|+||++..-
T Consensus 168 ~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 168 DNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 00000000 1344557999999998765 7999988764
No 41
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=80.78 E-value=1.3 Score=46.09 Aligned_cols=41 Identities=29% Similarity=0.438 Sum_probs=32.3
Q ss_pred cccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhH
Q 004542 604 TAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAV 650 (746)
Q Consensus 604 dAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~ 650 (746)
+.-+..|+||+|++ .+|++||=++-.+-.. |..+|.||++.
T Consensus 174 TGGIvqGMSGSPI~-qdGKLiGAVthvf~~d-----p~~Gygi~ie~ 214 (218)
T PF05580_consen 174 TGGIVQGMSGSPII-QDGKLIGAVTHVFVND-----PTKGYGIFIEW 214 (218)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCceeeecHHH
Confidence 34567899999999 6999999998876432 45789998764
No 42
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=80.60 E-value=6.5 Score=39.15 Aligned_cols=99 Identities=18% Similarity=0.170 Sum_probs=49.9
Q ss_pred cccEEEEEEcCCCCC---CCccc--cCC-CCCCCCeEEEEeCCCCCCC--ccccccceEEEEEeee--cCC-C---CCCC
Q 004542 214 TSRVAILGVSSYLKD---LPNIA--LTP-LNKRGDLLLAVGSPFGVLS--PMHFFNSVSMGSVANC--YPP-R---STTR 279 (746)
Q Consensus 214 ~t~~A~l~i~~~~~~---~~~~~--~s~-~~~~Gd~v~aigsPFg~~~--p~~f~n~vs~GiVs~~--~~~-~---~~~~ 279 (746)
..||||||++..... ..++. ... ....|+.+.+.|....... ...-......-+++.. ... . ....
T Consensus 88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~ 167 (232)
T cd00190 88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITD 167 (232)
T ss_pred cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCC
Confidence 349999999743211 12222 221 5778899999986443211 0001112222222221 000 0 0011
Q ss_pred CeEEe-----ecccCCCCcCcceecCC---ccEEEEEeeec
Q 004542 280 SLLMA-----DIRCLPGMEGGPVFGEH---AHFVGILIRPL 312 (746)
Q Consensus 280 ~~i~t-----Da~~~pG~~GG~v~~~~---g~liGi~~~~l 312 (746)
..+-+ +...-+|.+||||+... ..|+||++...
T Consensus 168 ~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 168 NMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred ceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 11111 33455799999999875 56999988654
No 43
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=77.88 E-value=12 Score=39.85 Aligned_cols=32 Identities=31% Similarity=0.513 Sum_probs=26.9
Q ss_pred EEEEeCCCeeEEEEEEeCCCEEEEcccccCCCC
Q 004542 398 CLITIDDGVWASGVLLNDQGLILTNAHLLEPWR 430 (746)
Q Consensus 398 V~I~~~~~~wGSGflIn~~GlILTnaHVV~p~~ 430 (746)
..|.+++.-||||++|+++ |||++..|+...+
T Consensus 20 A~IYvdG~~~CsgvLlD~~-WlLvsssCl~~I~ 51 (267)
T PF09342_consen 20 ADIYVDGRYWCSGVLLDPH-WLLVSSSCLRGIS 51 (267)
T ss_pred eeEEEcCeEEEEEEEeccc-eEEEeccccCCcc
Confidence 4567777889999999997 9999999997433
No 44
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=75.83 E-value=3.8 Score=39.32 Aligned_cols=31 Identities=29% Similarity=0.461 Sum_probs=24.4
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeee
Q 004542 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (746)
Q Consensus 280 ~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~ 311 (746)
.+++.--.+.||..||+|.-++| +|||+|+-
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Tag 109 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCKHG-VIGIVTAG 109 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEETTC-EEEEEEEE
T ss_pred CceeecccCCCCCCCceeEeCCC-eEEEEEeC
Confidence 35666778999999999998886 99999974
No 45
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=73.57 E-value=14 Score=38.27 Aligned_cols=48 Identities=17% Similarity=0.185 Sum_probs=17.9
Q ss_pred EEEEcccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHH
Q 004542 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVL 651 (746)
Q Consensus 600 ~IqTdAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l 651 (746)
.....+...+|.||.|+|+.. +++|+.+...+.. .-.+.|+..|+--+
T Consensus 137 ~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~---~~~n~n~~spip~~ 184 (203)
T PF02122_consen 137 FASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGS---NRENNNRMSPIPPI 184 (203)
T ss_dssp EEEE-----TT-TT-EEE-SS--EEEEEEEE---------------------
T ss_pred CCceEcCCCCCCCCCCeEECC-CceEeecCccccc---cccccccccccccc
Confidence 556667788999999999877 9999998863221 11245666555443
No 46
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=71.59 E-value=6.5 Score=38.15 Aligned_cols=49 Identities=22% Similarity=0.394 Sum_probs=33.6
Q ss_pred cccceEEEEEeeecCCCCCCCCeEEeecccCCCCcCcceecCCccEEEEEeeeccc
Q 004542 259 FFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQ 314 (746)
Q Consensus 259 f~n~vs~GiVs~~~~~~~~~~~~i~tDa~~~pG~~GG~v~~~~g~liGi~~~~l~~ 314 (746)
|+|- -.|-|..... -|.+--..-.||.||-|+||..|++|||+++--..
T Consensus 81 ~YNw-hhGaVqy~~g------rftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~ne 129 (158)
T PF00944_consen 81 FYNW-HHGAVQYSNG------RFTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANE 129 (158)
T ss_dssp EEEE-TTEEEEEETT------EEEEETTS-STTSTTEEEESTTSBEEEEEEEEEEE
T ss_pred eecc-ccceEEEeCC------eEEeccCCCCCCCCCCccCcCCCCEEEEEecCCCC
Confidence 4443 2366654432 24444556789999999999999999999976543
No 47
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=65.65 E-value=47 Score=40.02 Aligned_cols=113 Identities=19% Similarity=0.203 Sum_probs=70.7
Q ss_pred cccccCcccEEEEEEcCCC-------------CCCCccccC--------CCCCCCCeEEEEeCCCCCCCccccccceEEE
Q 004542 208 SLMSKSTSRVAILGVSSYL-------------KDLPNIALT--------PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMG 266 (746)
Q Consensus 208 ~~~~~~~t~~A~l~i~~~~-------------~~~~~~~~s--------~~~~~Gd~v~aigsPFg~~~p~~f~n~vs~G 266 (746)
.++.+.++|+|++|++... ..-|.+.+. ..+..|.+|+=+|.==|+ |.|
T Consensus 536 ~ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G 605 (695)
T PF08192_consen 536 SIINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTG 605 (695)
T ss_pred hhhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cce
Confidence 3445677899999997432 122222221 246779999999887775 355
Q ss_pred EEeeec----CCCC-CCCCeEEee----cccCCCCcCcceecCCcc------EEEEEeeeccccCCcceEEEeehHHHHH
Q 004542 267 SVANCY----PPRS-TTRSLLMAD----IRCLPGMEGGPVFGEHAH------FVGILIRPLRQKSGAEIQLVIPWEAIAT 331 (746)
Q Consensus 267 iVs~~~----~~~~-~~~~~i~tD----a~~~pG~~GG~v~~~~g~------liGi~~~~l~~~~~~~l~~~ip~~~i~~ 331 (746)
+|.... .++. ....|++.- +=..+|.||.=|+++-+. |+||+-+.=. ....|++..||..|..
T Consensus 606 ~lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~ 683 (695)
T PF08192_consen 606 ILNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILD 683 (695)
T ss_pred EecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHH
Confidence 554431 1111 113445444 446779999999997444 8888876432 3347888999998875
Q ss_pred H
Q 004542 332 A 332 (746)
Q Consensus 332 ~ 332 (746)
-
T Consensus 684 r 684 (695)
T PF08192_consen 684 R 684 (695)
T ss_pred H
Confidence 3
No 48
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=62.36 E-value=8.3 Score=37.03 Aligned_cols=32 Identities=28% Similarity=0.486 Sum_probs=23.5
Q ss_pred cEEEEcccccCCCCCCccccCCcEEEEEEeeec
Q 004542 599 VMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNA 631 (746)
Q Consensus 599 ~~IqTdAav~~GnSGGPL~n~~G~VIGIvssna 631 (746)
.++....++.||+.||+|+-.. -||||+++.-
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~H-GViGi~Tagg 110 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCKH-GVIGIVTAGG 110 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEETT-CEEEEEEEEE
T ss_pred CceeecccCCCCCCCceeEeCC-CeEEEEEeCC
Confidence 4555567889999999999544 4899999863
No 49
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=53.96 E-value=4.6 Score=39.95 Aligned_cols=42 Identities=29% Similarity=0.256 Sum_probs=29.5
Q ss_pred hhhhccccccccccCcee---eeeeeeecccccC--chhhhhhccCC
Q 004542 26 GLKMRRHAFHQYNSGKTT---LSASGMLLPLSFF--DTKVAERNWGV 67 (746)
Q Consensus 26 ~~~~~~~~~~~~~~~~~t---~sas~~~~p~~~~--~~~~~~~~~~~ 67 (746)
||.-++|+||.|+.|.+| .||-.-.=|.+.. .+.+.+|..++
T Consensus 38 GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd 84 (154)
T KOG0441|consen 38 GLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD 84 (154)
T ss_pred cCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence 444499999999999998 5775555555454 35566666666
No 50
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=53.91 E-value=44 Score=31.22 Aligned_cols=17 Identities=24% Similarity=0.425 Sum_probs=14.1
Q ss_pred EEEEeCCCEEEEcccccC
Q 004542 410 GVLLNDQGLILTNAHLLE 427 (746)
Q Consensus 410 GflIn~~GlILTnaHVV~ 427 (746)
++-|.. |.++|+.||++
T Consensus 3 avHIGn-G~~vt~tHva~ 19 (105)
T PF03510_consen 3 AVHIGN-GRYVTVTHVAK 19 (105)
T ss_pred eEEeCC-CEEEEEEEEec
Confidence 566775 89999999997
No 51
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=50.89 E-value=12 Score=42.76 Aligned_cols=42 Identities=24% Similarity=0.419 Sum_probs=31.9
Q ss_pred cccccCCCCCCccccCCcEEEEEEeeeccCCCCcccCceEEEEehhHH
Q 004542 604 TAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVL 651 (746)
Q Consensus 604 dAav~~GnSGGPL~n~~G~VIGIvssna~~~~g~~~p~lnFaIPi~~l 651 (746)
+.-+..|+||+|++ .+|++||=++--+-+. |.-+|.|-++.-
T Consensus 354 tgGivqGMSGSPi~-q~gkliGAvtHVfvnd-----pt~GYGi~ie~M 395 (402)
T TIGR02860 354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVND-----PTSGYGVYIEWM 395 (402)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCcceeehHHHH
Confidence 34567799999999 7999999988776643 346788866543
No 52
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=47.62 E-value=11 Score=42.12 Aligned_cols=24 Identities=25% Similarity=0.515 Sum_probs=21.2
Q ss_pred ccccCCCCCCccccCCcEEEEEEe
Q 004542 605 AAVHPGGSGGAVVNLDGHMIGLVT 628 (746)
Q Consensus 605 Aav~~GnSGGPL~n~~G~VIGIvs 628 (746)
....+|+||+.|+|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 366789999999999999999963
No 53
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=43.05 E-value=43 Score=33.63 Aligned_cols=90 Identities=21% Similarity=0.330 Sum_probs=50.4
Q ss_pred cccEEEEEEcCCCCCCCcccc--CC-CCCCCCeEEEEeCC-CCCCCcccc-ccce-EEEEEeeecCCCCCCCCeEEeecc
Q 004542 214 TSRVAILGVSSYLKDLPNIAL--TP-LNKRGDLLLAVGSP-FGVLSPMHF-FNSV-SMGSVANCYPPRSTTRSLLMADIR 287 (746)
Q Consensus 214 ~t~~A~l~i~~~~~~~~~~~~--s~-~~~~Gd~v~aigsP-Fg~~~p~~f-~n~v-s~GiVs~~~~~~~~~~~~i~tDa~ 287 (746)
.+|+++++++.. .....+.- .. .-...+.++++-+. |+- .++ .+.+ ..|.| +... ......|.=++.
T Consensus 71 ~~Dl~~v~l~~~-~kfrDIrk~~~~~~~~~~~~~l~v~~~~~~~---~~~~v~~v~~~~~i-~~~g--~~~~~~~~Y~~~ 143 (172)
T PF00548_consen 71 DTDLTLVKLPRN-PKFRDIRKFFPESIPEYPECVLLVNSTKFPR---MIVEVGFVTNFGFI-NLSG--TTTPRSLKYKAP 143 (172)
T ss_dssp EEEEEEEEEESS-S-B--GGGGSBSSGGTEEEEEEEEESSSSTC---EEEEEEEEEEEEEE-EETT--EEEEEEEEEESE
T ss_pred ceeEEEEEccCC-cccCchhhhhccccccCCCcEEEEECCCCcc---EEEEEEEEeecCcc-ccCC--CEeeEEEEEccC
Confidence 579999999742 22222221 11 22455666666654 552 111 1111 23444 3332 223346777888
Q ss_pred cCCCCcCcceecC---CccEEEEEee
Q 004542 288 CLPGMEGGPVFGE---HAHFVGILIR 310 (746)
Q Consensus 288 ~~pG~~GG~v~~~---~g~liGi~~~ 310 (746)
--+|+.||+|+.. .+.+|||=+|
T Consensus 144 t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 144 TKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp EETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred CCCCccCCeEEEeeccCccEEEEEec
Confidence 8899999999974 4569999765
No 54
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=38.25 E-value=22 Score=38.44 Aligned_cols=26 Identities=27% Similarity=0.511 Sum_probs=20.5
Q ss_pred cCCCCcCcceecCCccEEEEEeeecc
Q 004542 288 CLPGMEGGPVFGEHAHFVGILIRPLR 313 (746)
Q Consensus 288 ~~pG~~GG~v~~~~g~liGi~~~~l~ 313 (746)
-.||.||.||+..+|.+||+-++.=.
T Consensus 205 T~~GDSGSPVVt~dg~liGVHTGSn~ 230 (297)
T PF05579_consen 205 TGPGDSGSPVVTEDGDLIGVHTGSNK 230 (297)
T ss_dssp S-GGCTT-EEEETTC-EEEEEEEEET
T ss_pred cCCCCCCCccCcCCCCEEEEEecCCC
Confidence 46999999999999999999998643
No 55
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=32.63 E-value=41 Score=35.27 Aligned_cols=38 Identities=18% Similarity=0.526 Sum_probs=27.2
Q ss_pred cCCCCcCcceecCCccEEEEEeeeccccCCcceEEEeeh
Q 004542 288 CLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPW 326 (746)
Q Consensus 288 ~~pG~~GG~v~~~~g~liGi~~~~l~~~~~~~l~~~ip~ 326 (746)
+.-||||.|++- +|+|||-++--|......|....|.|
T Consensus 177 IvqGMSGSPI~q-dGKLiGAVthvf~~dp~~Gygi~ie~ 214 (218)
T PF05580_consen 177 IVQGMSGSPIIQ-DGKLIGAVTHVFVNDPTKGYGIFIEW 214 (218)
T ss_pred EEecccCCCEEE-CCEEEEEEEEEEecCCCceeeecHHH
Confidence 556999999986 89999999988754334444443443
No 56
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=30.24 E-value=1.6e+02 Score=33.94 Aligned_cols=38 Identities=34% Similarity=0.413 Sum_probs=25.3
Q ss_pred cCcEEEEccc-------ccCCCCCCccccCCc---EEEEEEeeeccCC
Q 004542 597 YPVMLETTAA-------VHPGGSGGAVVNLDG---HMIGLVTSNARHG 634 (746)
Q Consensus 597 ~~~~IqTdAa-------v~~GnSGGPL~n~~G---~VIGIvssna~~~ 634 (746)
...||.|.+. +.||+-|-|-+-..| -|+|++++.++.+
T Consensus 483 Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr~G 530 (535)
T PF05416_consen 483 QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATRSG 530 (535)
T ss_dssp EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-SSS
T ss_pred eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhccCC
Confidence 3457777653 458999999996555 4899999988743
No 57
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=29.59 E-value=3.2e+02 Score=28.96 Aligned_cols=91 Identities=16% Similarity=0.174 Sum_probs=52.6
Q ss_pred cccEEEEEEcCC---CCCCCccccCC-CCCCCCeEEEEeC-CCCCCCccccccceEEEEEeeecCCCCCCCCeEEeeccc
Q 004542 214 TSRVAILGVSSY---LKDLPNIALTP-LNKRGDLLLAVGS-PFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRC 288 (746)
Q Consensus 214 ~t~~A~l~i~~~---~~~~~~~~~s~-~~~~Gd~v~aigs-PFg~~~p~~f~n~vs~GiVs~~~~~~~~~~~~i~tDa~~ 288 (746)
..+++||+++.. ....+.++.+. .+..||.+-+-|- --+ .++...+. |..+.. ....+.++-..
T Consensus 160 ~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~----~~~~~~~~---i~~~~~----~~~~~~~~~~~ 228 (282)
T PF03761_consen 160 PYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG----KLKHRKLK---ITNCTK----CAYSICTKQYS 228 (282)
T ss_pred ccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCCC----eEEEEEEE---EEEeec----cceeEeccccc
Confidence 347889999754 34556666643 4778898887554 111 11211111 112111 12235566566
Q ss_pred CCCCcCcceec-CCcc--EEEEEeeecccc
Q 004542 289 LPGMEGGPVFG-EHAH--FVGILIRPLRQK 315 (746)
Q Consensus 289 ~pG~~GG~v~~-~~g~--liGi~~~~l~~~ 315 (746)
-+|..|||++. .+|+ |||+.+..-...
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~ 258 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASGNYEC 258 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccCCCcc
Confidence 68999999984 3554 999988665443
No 58
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=23.49 E-value=57 Score=36.62 Aligned_cols=26 Identities=23% Similarity=0.364 Sum_probs=21.2
Q ss_pred ecccCCCCcCcceecCCccEEEEEee
Q 004542 285 DIRCLPGMEGGPVFGEHAHFVGILIR 310 (746)
Q Consensus 285 Da~~~pG~~GG~v~~~~g~liGi~~~ 310 (746)
+...--|.||..|+|.+|++|||.-|
T Consensus 349 ~~~l~gGaSGS~V~n~~~~lvGIy~g 374 (374)
T PF01732_consen 349 NYSLGGGASGSMVINQNNELVGIYFG 374 (374)
T ss_pred ccCCCCCCCcCeEECCCCCEEEEeCC
Confidence 33444699999999999999999753
No 59
>PF08208 RNA_polI_A34: DNA-directed RNA polymerase I subunit RPA34.5; InterPro: IPR013240 This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription [].; PDB: 3NFG_N.
Probab=21.07 E-value=32 Score=35.13 Aligned_cols=13 Identities=62% Similarity=0.950 Sum_probs=0.0
Q ss_pred Ccchhhhcccccc
Q 004542 23 DPKGLKMRRHAFH 35 (746)
Q Consensus 23 ~~~~~~~~~~~~~ 35 (746)
-|+|||||.|+|=
T Consensus 109 qp~gLk~Rf~P~G 121 (198)
T PF08208_consen 109 QPKGLKMRFFPFG 121 (198)
T ss_dssp -------------
T ss_pred CCCCcceeeecCC
Confidence 4899999999884
Done!