Query 006631
Match_columns 637
No_of_seqs 327 out of 2494
Neff 5.8
Searched_HMMs 46136
Date Thu Mar 28 12:13:25 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/006631.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/006631hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10898 serine endoprotease; 99.9 1.4E-21 3E-26 210.2 18.8 158 388-630 49-218 (353)
2 TIGR02038 protease_degS peripl 99.9 2.5E-21 5.4E-26 208.0 19.8 158 388-630 49-218 (351)
3 PRK10139 serine endoprotease; 99.9 2.4E-21 5.2E-26 214.6 18.9 159 388-630 44-232 (455)
4 PRK10139 serine endoprotease; 99.9 8E-22 1.7E-26 218.4 14.1 127 200-338 129-260 (455)
5 TIGR02037 degP_htrA_DO peripla 99.8 1.1E-19 2.4E-24 199.9 19.4 141 406-630 58-199 (428)
6 PRK10942 serine endoprotease; 99.8 3.7E-20 8E-25 206.1 13.9 127 200-338 150-281 (473)
7 PRK10942 serine endoprotease; 99.8 1.7E-19 3.7E-24 200.8 17.4 142 405-630 110-253 (473)
8 TIGR02038 protease_degS peripl 99.8 2E-19 4.4E-24 193.3 14.3 127 200-338 116-248 (351)
9 PRK10898 serine endoprotease; 99.8 2.3E-19 4.9E-24 193.1 13.8 127 200-338 116-249 (353)
10 TIGR02037 degP_htrA_DO peripla 99.8 2E-18 4.4E-23 190.0 14.4 127 200-338 96-227 (428)
11 COG0265 DegQ Trypsin-like seri 99.7 6.5E-18 1.4E-22 181.0 12.5 127 200-338 110-242 (347)
12 COG0265 DegQ Trypsin-like seri 99.6 2.1E-14 4.6E-19 154.0 17.1 164 388-634 37-218 (347)
13 PF13365 Trypsin_2: Trypsin-li 99.5 2.4E-13 5.3E-18 121.3 13.0 24 603-626 97-120 (120)
14 cd00190 Tryp_SPc Trypsin-like 99.4 1E-11 2.2E-16 122.0 16.1 106 521-630 88-207 (232)
15 PF00089 Trypsin: Trypsin; In 99.3 4.9E-11 1.1E-15 116.5 14.1 104 521-630 86-198 (220)
16 smart00020 Tryp_SPc Trypsin-li 99.2 3.5E-10 7.7E-15 111.7 17.4 107 521-631 88-208 (229)
17 KOG1320 Serine protease [Postt 99.2 1.1E-11 2.5E-16 136.3 7.1 127 205-337 213-351 (473)
18 KOG1320 Serine protease [Postt 99.1 7E-10 1.5E-14 122.3 12.2 112 510-631 213-325 (473)
19 KOG1421 Predicted signaling-as 98.6 1.3E-07 2.8E-12 106.3 11.1 165 389-629 57-235 (955)
20 KOG3627 Trypsin [Amino acid tr 98.6 1.3E-06 2.8E-11 88.9 15.8 107 522-631 106-228 (256)
21 COG3591 V8-like Glu-specific e 98.3 5.8E-06 1.3E-10 85.2 13.2 69 545-632 157-225 (251)
22 PF13365 Trypsin_2: Trypsin-li 97.8 1E-05 2.3E-10 71.9 1.9 24 284-307 97-120 (120)
23 PF00863 Peptidase_C4: Peptida 97.4 0.0013 2.8E-08 67.5 11.9 90 521-631 81-173 (235)
24 PF05579 Peptidase_S32: Equine 97.3 0.0022 4.7E-08 66.4 11.0 76 522-633 156-232 (297)
25 COG5640 Secreted trypsin-like 97.2 0.0013 2.7E-08 70.6 9.1 22 405-427 60-81 (413)
26 PF03761 DUF316: Domain of unk 97.2 0.011 2.4E-07 61.6 16.0 92 520-632 159-256 (282)
27 PF10459 Peptidase_S46: Peptid 96.8 0.004 8.7E-08 73.1 8.9 21 408-428 49-69 (698)
28 PF00089 Trypsin: Trypsin; In 96.5 0.014 3.1E-07 56.7 9.4 115 214-329 86-216 (220)
29 PF10459 Peptidase_S46: Peptid 94.4 0.032 6.9E-07 65.8 3.8 34 595-628 618-651 (698)
30 COG3591 V8-like Glu-specific e 93.9 0.28 6E-06 51.1 9.0 77 231-315 151-227 (251)
31 PF02907 Peptidase_S29: Hepati 92.2 0.12 2.6E-06 48.8 3.0 45 284-329 101-146 (148)
32 PF00548 Peptidase_C3: 3C cyst 91.3 2.7 5.9E-05 41.3 11.6 34 597-630 134-170 (172)
33 PF00949 Peptidase_S7: Peptida 90.1 0.24 5.3E-06 46.8 2.9 36 280-315 86-121 (132)
34 PF09342 DUF1986: Domain of un 89.9 2.6 5.7E-05 43.8 10.3 31 397-428 19-49 (267)
35 cd00190 Tryp_SPc Trypsin-like 89.7 1.6 3.4E-05 42.6 8.6 99 214-312 88-208 (232)
36 smart00020 Tryp_SPc Trypsin-li 88.3 2.8 6.1E-05 41.1 9.2 99 214-312 88-208 (229)
37 KOG1421 Predicted signaling-as 87.5 6.2 0.00014 46.3 12.2 46 510-558 588-633 (955)
38 PF00863 Peptidase_C4: Peptida 87.0 1.3 2.8E-05 45.8 6.0 108 214-329 81-189 (235)
39 PF00949 Peptidase_S7: Peptida 85.8 0.61 1.3E-05 44.1 2.7 24 605-628 92-115 (132)
40 PF00944 Peptidase_S3: Alphavi 83.9 1.2 2.6E-05 42.2 3.7 32 600-631 96-127 (158)
41 PF08192 Peptidase_S64: Peptid 81.2 7.2 0.00016 45.7 9.3 114 210-335 538-687 (695)
42 PF00947 Pico_P2A: Picornaviru 78.2 3 6.5E-05 39.2 4.2 30 280-310 79-108 (127)
43 PF00944 Peptidase_S3: Alphavi 77.8 3.5 7.6E-05 39.2 4.5 32 281-312 96-127 (158)
44 PF02907 Peptidase_S29: Hepati 70.0 3.1 6.7E-05 39.5 2.2 24 606-629 104-127 (148)
45 PF05580 Peptidase_S55: SpoIVB 66.8 2.7 5.9E-05 42.9 1.2 29 600-629 170-198 (218)
46 PF08192 Peptidase_S64: Peptid 66.4 21 0.00046 42.0 8.3 98 519-632 540-667 (695)
47 KOG0441 Cu2+/Zn2+ superoxide d 60.6 3.3 7.2E-05 40.1 0.5 42 26-67 38-84 (154)
48 PF00947 Pico_P2A: Picornaviru 60.3 9.7 0.00021 35.8 3.4 33 600-633 80-112 (127)
49 PF01732 DUF31: Putative pepti 51.2 9.4 0.0002 41.9 2.1 24 605-628 350-373 (374)
50 TIGR02860 spore_IV_B stage IV 50.4 6.9 0.00015 43.6 0.9 28 600-628 350-377 (402)
51 PF05580 Peptidase_S55: SpoIVB 41.0 21 0.00046 36.6 2.6 38 288-328 177-214 (218)
52 PF00548 Peptidase_C3: 3C cyst 40.2 58 0.0013 32.0 5.6 89 214-310 71-169 (172)
53 PF05579 Peptidase_S32: Equine 39.5 20 0.00043 37.9 2.2 27 288-314 205-231 (297)
54 PF03761 DUF316: Domain of unk 36.7 2E+02 0.0044 29.8 9.3 91 214-315 160-258 (282)
55 PF03510 Peptidase_C24: 2C end 33.5 1.4E+02 0.0031 27.3 6.5 17 410-427 3-19 (105)
56 PF00571 CBS: CBS domain CBS d 30.3 35 0.00076 26.0 1.8 22 608-629 27-48 (57)
57 PF01732 DUF31: Putative pepti 29.6 37 0.0008 37.3 2.4 26 284-309 348-373 (374)
58 PF05416 Peptidase_C37: Southa 27.0 1.5E+02 0.0032 33.6 6.3 36 598-633 484-529 (535)
59 PF08208 RNA_polI_A34: DNA-dir 25.3 24 0.00051 35.3 0.0 13 23-35 109-121 (198)
No 1
>PRK10898 serine endoprotease; Provisional
Probab=99.88 E-value=1.4e-21 Score=210.22 Aligned_cols=158 Identities=25% Similarity=0.400 Sum_probs=123.2
Q ss_pred hHHhhccCceEEEEeCC-----------CeeeEEEEEeCCCEEEEcccccCCCCCcceeecCCcccccccCCCCCCCCCC
Q 006631 388 LPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHT 456 (637)
Q Consensus 388 ~~i~~a~~SVV~V~~g~-----------~~wGSGvlI~~~GlILTnAHVV~p~~~~~~~~ng~~~~~~~~~~~~~~~~~~ 456 (637)
..++++.|+||.|.... ..+||||+|+++|+||||+||++.
T Consensus 49 ~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~---------------------------- 100 (353)
T PRK10898 49 QAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIND---------------------------- 100 (353)
T ss_pred HHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCC----------------------------
Confidence 46889999999997621 158999999999999999999961
Q ss_pred cccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEccCCCCceeeeEEEEeCCCCCcEEEEEEccCCCCc
Q 006631 457 GVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQL 536 (637)
Q Consensus 457 ~~~~~~~~q~l~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~~~~~DIALLkLe~~~~~l 536 (637)
...+.|++..+. +|+|++++.++. .||||||++. ..+
T Consensus 101 -------------------------------------a~~i~V~~~dg~---~~~a~vv~~d~~-~DlAvl~v~~--~~l 137 (353)
T PRK10898 101 -------------------------------------ADQIIVALQDGR---VFEALLVGSDSL-TDLAVLKINA--TNL 137 (353)
T ss_pred -------------------------------------CCEEEEEeCCCC---EEEEEEEEEcCC-CCEEEEEEcC--CCC
Confidence 112444444433 389999999885 9999999985 357
Q ss_pred ceeeCCCCC-CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccCCccccc
Q 006631 537 CPIDADFGQ-PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGA 615 (637)
Q Consensus 537 ~PI~l~~~~-~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~G~SGGP 615 (637)
+++++.+.. +++|+.|+++|||. ++..+++.|+|++..+.... ......++|+++++++|+||||
T Consensus 138 ~~~~l~~~~~~~~G~~V~aiG~P~-----g~~~~~t~Giis~~~r~~~~---------~~~~~~~iqtda~i~~GnSGGP 203 (353)
T PRK10898 138 PVIPINPKRVPHIGDVVLAIGNPY-----NLGQTITQGIISATGRIGLS---------PTGRQNFLQTDASINHGNSGGA 203 (353)
T ss_pred CeeeccCcCcCCCCCEEEEEeCCC-----CcCCCcceeEEEeccccccC---------CccccceEEeccccCCCCCcce
Confidence 778886554 89999999999983 45578999999987653211 0112358999999999999999
Q ss_pred ccccCceEEEEEeee
Q 006631 616 VVNLDGHMIGLVTRY 630 (637)
Q Consensus 616 L~n~~G~LVGIVsSn 630 (637)
|+|.+|+||||+++.
T Consensus 204 l~n~~G~vvGI~~~~ 218 (353)
T PRK10898 204 LVNSLGELMGINTLS 218 (353)
T ss_pred EECCCCeEEEEEEEE
Confidence 999999999999853
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.87 E-value=2.5e-21 Score=208.05 Aligned_cols=158 Identities=26% Similarity=0.431 Sum_probs=122.8
Q ss_pred hHHhhccCceEEEEeCC-----------CeeeEEEEEeCCCEEEEcccccCCCCCcceeecCCcccccccCCCCCCCCCC
Q 006631 388 LPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHT 456 (637)
Q Consensus 388 ~~i~~a~~SVV~V~~g~-----------~~wGSGvlI~~~GlILTnAHVV~p~~~~~~~~ng~~~~~~~~~~~~~~~~~~ 456 (637)
..++++.||||.|.... ...||||+|+++||||||+||++ +
T Consensus 49 ~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~----------~------------------ 100 (351)
T TIGR02038 49 KAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIK----------K------------------ 100 (351)
T ss_pred HHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeC----------C------------------
Confidence 45889999999997621 24699999999999999999995 1
Q ss_pred cccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEccCCCCceeeeEEEEeCCCCCcEEEEEEccCCCCc
Q 006631 457 GVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQL 536 (637)
Q Consensus 457 ~~~~~~~~q~l~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~~~~~DIALLkLe~~~~~l 536 (637)
...+.|++..+. +++|++++.++. +||||||++. ..+
T Consensus 101 -------------------------------------~~~i~V~~~dg~---~~~a~vv~~d~~-~DlAvlkv~~--~~~ 137 (351)
T TIGR02038 101 -------------------------------------ADQIVVALQDGR---KFEAELVGSDPL-TDLAVLKIEG--DNL 137 (351)
T ss_pred -------------------------------------CCEEEEEECCCC---EEEEEEEEecCC-CCEEEEEecC--CCC
Confidence 112344444332 388999998884 9999999995 347
Q ss_pred ceeeCCCC-CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccCCccccc
Q 006631 537 CPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGA 615 (637)
Q Consensus 537 ~PI~l~~~-~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~G~SGGP 615 (637)
+++++... .+++|+.|+++|||. ++..+++.|+|+...+.... ......++|+++++.+|+||||
T Consensus 138 ~~~~l~~s~~~~~G~~V~aiG~P~-----~~~~s~t~GiIs~~~r~~~~---------~~~~~~~iqtda~i~~GnSGGp 203 (351)
T TIGR02038 138 PTIPVNLDRPPHVGDVVLAIGNPY-----NLGQTITQGIISATGRNGLS---------SVGRQNFIQTDAAINAGNSGGA 203 (351)
T ss_pred ceEeccCcCccCCCCEEEEEeCCC-----CCCCcEEEEEEEeccCcccC---------CCCcceEEEECCccCCCCCcce
Confidence 77888654 589999999999983 45578999999987653210 0123458999999999999999
Q ss_pred ccccCceEEEEEeee
Q 006631 616 VVNLDGHMIGLVTRY 630 (637)
Q Consensus 616 L~n~~G~LVGIVsSn 630 (637)
|+|.+|+||||+++.
T Consensus 204 l~n~~G~vIGI~~~~ 218 (351)
T TIGR02038 204 LINTNGELVGINTAS 218 (351)
T ss_pred EECCCCeEEEEEeee
Confidence 999999999999864
No 3
>PRK10139 serine endoprotease; Provisional
Probab=99.87 E-value=2.4e-21 Score=214.62 Aligned_cols=159 Identities=29% Similarity=0.507 Sum_probs=124.7
Q ss_pred hHHhhccCceEEEEeC------------------C----------CeeeEEEEEeC-CCEEEEcccccCCCCCcceeecC
Q 006631 388 LPIQKALASVCLITID------------------D----------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSG 438 (637)
Q Consensus 388 ~~i~~a~~SVV~V~~g------------------~----------~~wGSGvlI~~-~GlILTnAHVV~p~~~~~~~~ng 438 (637)
..++++.|+||.|... . ..+||||+|++ +||||||+||++ +
T Consensus 44 ~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~----------~ 113 (455)
T PRK10139 44 PMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN----------Q 113 (455)
T ss_pred HHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC----------C
Confidence 5688999999998541 0 14699999985 799999999996 1
Q ss_pred CcccccccCCCCCCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEccCCCCceeeeEEEEeC
Q 006631 439 WRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVC 518 (637)
Q Consensus 439 ~~~~~~~~~~~~~~~~~~~~~~~~~~q~l~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~ 518 (637)
...+.|++..+. .|+|++++.+
T Consensus 114 -------------------------------------------------------a~~i~V~~~dg~---~~~a~vvg~D 135 (455)
T PRK10139 114 -------------------------------------------------------AQKISIQLNDGR---EFDAKLIGSD 135 (455)
T ss_pred -------------------------------------------------------CCEEEEEECCCC---EEEEEEEEEc
Confidence 113455554333 3899999999
Q ss_pred CCCCcEEEEEEccCCCCcceeeCCCCC-CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCc
Q 006631 519 KGPLDVSLLQLGYIPDQLCPIDADFGQ-PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAY 597 (637)
Q Consensus 519 ~~~~DIALLkLe~~~~~l~PI~l~~~~-~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~ 597 (637)
+. +||||||++. +..++++++.++. +++|+.|+++|||. ++..+++.|+|++..+... ....+
T Consensus 136 ~~-~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P~-----g~~~tvt~GivS~~~r~~~---------~~~~~ 199 (455)
T PRK10139 136 DQ-SDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNPF-----GLGQTATSGIISALGRSGL---------NLEGL 199 (455)
T ss_pred CC-CCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecCC-----CCCCceEEEEEcccccccc---------CCCCc
Confidence 85 9999999985 4568899997654 89999999999973 5567899999998764211 01234
Q ss_pred ceEEEecCcccCCcccccccccCceEEEEEeee
Q 006631 598 PVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRY 630 (637)
Q Consensus 598 ~~mlqTta~v~~G~SGGPL~n~~G~LVGIVsSn 630 (637)
..++||++++++|+|||||||.+|+||||+++-
T Consensus 200 ~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~ 232 (455)
T PRK10139 200 ENFIQTDASINRGNSGGALLNLNGELIGINTAI 232 (455)
T ss_pred ceEEEECCccCCCCCcceEECCCCeEEEEEEEE
Confidence 468999999999999999999999999999974
No 4
>PRK10139 serine endoprotease; Provisional
Probab=99.87 E-value=8e-22 Score=218.40 Aligned_cols=127 Identities=22% Similarity=0.273 Sum_probs=111.3
Q ss_pred cccccCCccccCCCCccEEEEEEe-CCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC---
Q 006631 200 AMEESSNLSLMSKSTSRVAILGVS-SYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR--- 275 (637)
Q Consensus 200 a~~~~~~~~~~~~~~t~~A~lki~-~~~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~--- 275 (637)
|+....|+. +||||||++ ...++..++++|+.+++||+|+|||+|||+ ..++|.|+||++.+..
T Consensus 129 a~vvg~D~~------~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~------~~tvt~GivS~~~r~~~~~ 196 (455)
T PRK10139 129 AKLIGSDDQ------SDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGL------GQTATSGIISALGRSGLNL 196 (455)
T ss_pred EEEEEEcCC------CCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCC------CCceEEEEEccccccccCC
Confidence 667776666 999999997 466888999999999999999999999994 6799999999886542
Q ss_pred CCCCceEEEecccCCCCcCcceecCCccEEEEEeeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 006631 276 STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (637)
Q Consensus 276 ~~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~faip~~~i~~~~~~~~~ 338 (637)
.....||||||++|||||||||||.+|+||||+++.++.. +..|++|+||++.+++++.+++.
T Consensus 197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~ 260 (455)
T PRK10139 197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLID 260 (455)
T ss_pred CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhh
Confidence 1235799999999999999999999999999999998766 56799999999999999988764
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.83 E-value=1.1e-19 Score=199.94 Aligned_cols=141 Identities=32% Similarity=0.480 Sum_probs=110.8
Q ss_pred eeeEEEEEeCCCEEEEcccccCCCCCcceeecCCcccccccCCCCCCCCCCcccccccccCCCCCCCccccccccccccc
Q 006631 406 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY 485 (637)
Q Consensus 406 ~wGSGvlI~~~GlILTnAHVV~p~~~~~~~~ng~~~~~~~~~~~~~~~~~~~~~~~~~~q~l~~k~~~~~~~~~~~~~~~ 485 (637)
.+||||+|+++||||||+||++..
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~~~-------------------------------------------------------- 81 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVDGA-------------------------------------------------------- 81 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcCCC--------------------------------------------------------
Confidence 479999999999999999999611
Q ss_pred ccccccCCceEEEEEEccCCCCceeeeEEEEeCCCCCcEEEEEEccCCCCcceeeCCCC-CCCCCCeEEEEecCCCCCCC
Q 006631 486 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC 564 (637)
Q Consensus 486 ~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~~~~~DIALLkLe~~~~~l~PI~l~~~-~~~~Ge~V~VIGyplfg~~~ 564 (637)
..+.|++.... +|+|++++.++ .+||||||++. +..++++++.+. .+++|+.|+++|||.
T Consensus 82 ---------~~i~V~~~~~~---~~~a~vv~~d~-~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p~----- 142 (428)
T TIGR02037 82 ---------DEITVTLSDGR---EFKAKLVGKDP-RTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNPF----- 142 (428)
T ss_pred ---------CeEEEEeCCCC---EEEEEEEEecC-CCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECCC-----
Confidence 12344444332 38899999887 49999999985 356889999754 589999999999984
Q ss_pred CCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccCCcccccccccCceEEEEEeee
Q 006631 565 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRY 630 (637)
Q Consensus 565 g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~G~SGGPL~n~~G~LVGIVsSn 630 (637)
++..+++.|+|+...+... ....+..++++++++.+|+|||||||.+|+||||++..
T Consensus 143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~ 199 (428)
T TIGR02037 143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAI 199 (428)
T ss_pred cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEE
Confidence 5567899999998754310 01234458999999999999999999999999998763
No 6
>PRK10942 serine endoprotease; Provisional
Probab=99.82 E-value=3.7e-20 Score=206.09 Aligned_cols=127 Identities=19% Similarity=0.265 Sum_probs=110.7
Q ss_pred cccccCCccccCCCCccEEEEEEe-CCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCC--
Q 006631 200 AMEESSNLSLMSKSTSRVAILGVS-SYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS-- 276 (637)
Q Consensus 200 a~~~~~~~~~~~~~~t~~A~lki~-~~~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~~-- 276 (637)
|.....|+. +||||||++ ...++.+++++++.+++||+|++||+|||+ .++++.|+||++.+...
T Consensus 150 a~vv~~D~~------~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~g~------~~tvt~GiVs~~~r~~~~~ 217 (473)
T PRK10942 150 AKVVGKDPR------SDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPYGL------GETVTSGIVSALGRSGLNV 217 (473)
T ss_pred EEEEEecCC------CCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCCCC------CcceeEEEEEEeecccCCc
Confidence 666666665 999999996 556788899999999999999999999994 77999999998865421
Q ss_pred -CCCceEEEecccCCCCcCcceecCCccEEEEEeeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 006631 277 -TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (637)
Q Consensus 277 -~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~faip~~~i~~~~~~~~~ 338 (637)
....||||||+++||||||||||.+|+||||+++.+... ++.|++|+||++.++.+++++..
T Consensus 218 ~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~ 281 (473)
T PRK10942 218 ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVE 281 (473)
T ss_pred ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHh
Confidence 245789999999999999999999999999999998876 66899999999999999998764
No 7
>PRK10942 serine endoprotease; Provisional
Probab=99.82 E-value=1.7e-19 Score=200.81 Aligned_cols=142 Identities=34% Similarity=0.544 Sum_probs=112.5
Q ss_pred CeeeEEEEEeC-CCEEEEcccccCCCCCcceeecCCcccccccCCCCCCCCCCcccccccccCCCCCCCccccccccccc
Q 006631 405 GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHR 483 (637)
Q Consensus 405 ~~wGSGvlI~~-~GlILTnAHVV~p~~~~~~~~ng~~~~~~~~~~~~~~~~~~~~~~~~~~q~l~~k~~~~~~~~~~~~~ 483 (637)
.++||||+|++ +||||||+||+. +
T Consensus 110 ~~~GSG~ii~~~~G~IlTn~HVv~----------~--------------------------------------------- 134 (473)
T PRK10942 110 MALGSGVIIDADKGYVVTNNHVVD----------N--------------------------------------------- 134 (473)
T ss_pred cceEEEEEEECCCCEEEeChhhcC----------C---------------------------------------------
Confidence 35899999996 599999999995 1
Q ss_pred ccccccccCCceEEEEEEccCCCCceeeeEEEEeCCCCCcEEEEEEccCCCCcceeeCCCC-CCCCCCeEEEEecCCCCC
Q 006631 484 AYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGP 562 (637)
Q Consensus 484 ~~~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~~~~~DIALLkLe~~~~~l~PI~l~~~-~~~~Ge~V~VIGyplfg~ 562 (637)
...++|++..+.. |+|++++.++. +||||||++. +..++++++.+. .+++|+.|+++|||
T Consensus 135 ----------a~~i~V~~~dg~~---~~a~vv~~D~~-~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P---- 195 (473)
T PRK10942 135 ----------ATKIKVQLSDGRK---FDAKVVGKDPR-SDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNP---- 195 (473)
T ss_pred ----------CCEEEEEECCCCE---EEEEEEEecCC-CCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCC----
Confidence 1134555544333 89999999884 9999999975 456889999765 48999999999997
Q ss_pred CCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccCCcccccccccCceEEEEEeee
Q 006631 563 RCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRY 630 (637)
Q Consensus 563 ~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~G~SGGPL~n~~G~LVGIVsSn 630 (637)
+++..+++.|+|++..+... ....+..++||++++++|+|||||+|.+|+||||+++.
T Consensus 196 -~g~~~tvt~GiVs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~ 253 (473)
T PRK10942 196 -YGLGETVTSGIVSALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAI 253 (473)
T ss_pred -CCCCcceeEEEEEEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEE
Confidence 35567899999998864210 01234568999999999999999999999999999864
No 8
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.81 E-value=2e-19 Score=193.33 Aligned_cols=127 Identities=18% Similarity=0.301 Sum_probs=108.4
Q ss_pred cccccCCccccCCCCccEEEEEEeCCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCC---
Q 006631 200 AMEESSNLSLMSKSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS--- 276 (637)
Q Consensus 200 a~~~~~~~~~~~~~~t~~A~lki~~~~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~~--- 276 (637)
|.....|+. +||||||++...++..+++++..+++||+|++||+|||+ .++++.|+||+..+...
T Consensus 116 a~vv~~d~~------~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~~------~~s~t~GiIs~~~r~~~~~~ 183 (351)
T TIGR02038 116 AELVGSDPL------TDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYNL------GQTITQGIISATGRNGLSSV 183 (351)
T ss_pred EEEEEecCC------CCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCCC------CCcEEEEEEEeccCcccCCC
Confidence 555555555 999999999777788899999999999999999999994 67999999998765321
Q ss_pred CCCceEEEecccCCCCcCcceecCCccEEEEEeecccccC---CcceEEEEeHHHHHHHHHhhhc
Q 006631 277 TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS---GAEIQLVIPWEAIATACSDLLL 338 (637)
Q Consensus 277 ~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~~---~~~l~faip~~~i~~~~~~~~~ 338 (637)
....+|||||+++||||||||||.+|+||||+++.+...+ ..+++|+||++.+.+++.+++.
T Consensus 184 ~~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~ 248 (351)
T TIGR02038 184 GRQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIR 248 (351)
T ss_pred CcceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhh
Confidence 2357899999999999999999999999999999886542 3699999999999999988664
No 9
>PRK10898 serine endoprotease; Provisional
Probab=99.80 E-value=2.3e-19 Score=193.11 Aligned_cols=127 Identities=18% Similarity=0.289 Sum_probs=107.9
Q ss_pred cccccCCccccCCCCccEEEEEEeCCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC---C
Q 006631 200 AMEESSNLSLMSKSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---S 276 (637)
Q Consensus 200 a~~~~~~~~~~~~~~t~~A~lki~~~~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~---~ 276 (637)
|.....|+. +||||||++...++..++++++.+++||+|+++|+|||+ ..+++.|+||+..+.. .
T Consensus 116 a~vv~~d~~------~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g~------~~~~t~Giis~~~r~~~~~~ 183 (353)
T PRK10898 116 ALLVGSDSL------TDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYNL------GQTITQGIISATGRIGLSPT 183 (353)
T ss_pred EEEEEEcCC------CCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCCc------CCCcceeEEEeccccccCCc
Confidence 555565655 999999999777888899999999999999999999994 6789999999775431 1
Q ss_pred CCCceEEEecccCCCCcCcceecCCccEEEEEeecccccC----CcceEEEEeHHHHHHHHHhhhc
Q 006631 277 TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLLL 338 (637)
Q Consensus 277 ~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~~----~~~l~faip~~~i~~~~~~~~~ 338 (637)
....||||||+++||||||||+|.+|+||||+++.+...+ ..+++|+||++.+.+++.+++.
T Consensus 184 ~~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~ 249 (353)
T PRK10898 184 GRQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIR 249 (353)
T ss_pred cccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhh
Confidence 2347899999999999999999999999999999886542 2589999999999999998654
No 10
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.77 E-value=2e-18 Score=189.99 Aligned_cols=127 Identities=20% Similarity=0.314 Sum_probs=109.0
Q ss_pred cccccCCccccCCCCccEEEEEEeCC-CCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC---
Q 006631 200 AMEESSNLSLMSKSTSRVAILGVSSY-LKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR--- 275 (637)
Q Consensus 200 a~~~~~~~~~~~~~~t~~A~lki~~~-~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~--- 275 (637)
|.....|+. +||||||++.. .++.+++++++.+++||+|+++|+|||+ ..++|.|+||+..+..
T Consensus 96 a~vv~~d~~------~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~------~~~~t~G~vs~~~~~~~~~ 163 (428)
T TIGR02037 96 AKLVGKDPR------TDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGL------GQTVTSGIVSALGRSGLGI 163 (428)
T ss_pred EEEEEecCC------CCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcC------CCcEEEEEEEecccCccCC
Confidence 455555544 89999999864 6788899999999999999999999994 6799999999876541
Q ss_pred CCCCceEEEecccCCCCcCcceecCCccEEEEEeeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 006631 276 STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (637)
Q Consensus 276 ~~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~faip~~~i~~~~~~~~~ 338 (637)
.....+|||||+++||||||||||.+|+||||+++.+... +..|++|+||++.++++++++..
T Consensus 164 ~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~ 227 (428)
T TIGR02037 164 GDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIE 227 (428)
T ss_pred CCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHh
Confidence 2345689999999999999999999999999999988766 66799999999999999998764
No 11
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.74 E-value=6.5e-18 Score=180.95 Aligned_cols=127 Identities=22% Similarity=0.318 Sum_probs=110.8
Q ss_pred cccccCCccccCCCCccEEEEEEeCCC-CCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCC-CC-
Q 006631 200 AMEESSNLSLMSKSTSRVAILGVSSYL-KDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-RS- 276 (637)
Q Consensus 200 a~~~~~~~~~~~~~~t~~A~lki~~~~-~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~-~~- 276 (637)
++..+.|+. +|+|+||++... .+...+++++.+++||+++|||+||| |.++++.||||...+. -.
T Consensus 110 a~~vg~d~~------~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~ 177 (347)
T COG0265 110 AKLVGKDPI------SDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGS 177 (347)
T ss_pred EEEEecCCc------cCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccC
Confidence 455555544 999999999644 77789999999999999999999999 5799999999988764 11
Q ss_pred --CCCceEEEecccCCCCcCcceecCCccEEEEEeeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 006631 277 --TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (637)
Q Consensus 277 --~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~faip~~~i~~~~~~~~~ 338 (637)
....||||||++||||||||++|.+|++|||+++.+... +..|++|+||++.+..++.+++.
T Consensus 178 ~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~ 242 (347)
T COG0265 178 AGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELIS 242 (347)
T ss_pred cccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHH
Confidence 245789999999999999999999999999999999988 46789999999999999998764
No 12
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.59 E-value=2.1e-14 Score=153.96 Aligned_cols=164 Identities=27% Similarity=0.406 Sum_probs=123.8
Q ss_pred hHHhhccCceEEEEeCC-----------------CeeeEEEEEeCCCEEEEcccccCCCCCcceeecCCcccccccCCCC
Q 006631 388 LPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDS 450 (637)
Q Consensus 388 ~~i~~a~~SVV~V~~g~-----------------~~wGSGvlI~~~GlILTnAHVV~p~~~~~~~~ng~~~~~~~~~~~~ 450 (637)
..++++.++||.+.... ..+||||+++++|+|+||.||+. +.
T Consensus 37 ~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~----------~a----------- 95 (347)
T COG0265 37 TAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIA----------GA----------- 95 (347)
T ss_pred HHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecC----------Cc-----------
Confidence 46888999999887631 36899999999999999999996 10
Q ss_pred CCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEccCCCCceeeeEEEEeCCCCCcEEEEEEc
Q 006631 451 ASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLG 530 (637)
Q Consensus 451 ~~~~~~~~~~~~~~q~l~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~~~~~DIALLkLe 530 (637)
..+.+.+ +...+++++++..+. ..|+|+||++
T Consensus 96 --------------------------------------------~~i~v~l---~dg~~~~a~~vg~d~-~~dlavlki~ 127 (347)
T COG0265 96 --------------------------------------------EEITVTL---ADGREVPAKLVGKDP-ISDLAVLKID 127 (347)
T ss_pred --------------------------------------------ceEEEEe---CCCCEEEEEEEecCC-ccCEEEEEec
Confidence 0122222 122247899998887 4999999999
Q ss_pred cCCCCcceeeCCCCC-CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccC
Q 006631 531 YIPDQLCPIDADFGQ-PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHP 609 (637)
Q Consensus 531 ~~~~~l~PI~l~~~~-~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~ 609 (637)
.... ++.+.+.+.. ++.|+.++++|.|+ ++..+++.|+++...+... .....+..++||++++++
T Consensus 128 ~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~-----g~~~tvt~Givs~~~r~~v--------~~~~~~~~~IqtdAain~ 193 (347)
T COG0265 128 GAGG-LPVIALGDSDKLRVGDVVVAIGNPF-----GLGQTVTSGIVSALGRTGV--------GSAGGYVNFIQTDAAINP 193 (347)
T ss_pred cCCC-CceeeccCCCCcccCCEEEEecCCC-----CcccceeccEEeccccccc--------cCcccccchhhcccccCC
Confidence 6322 6677776554 78999999999974 5668999999998875310 110124457899999999
Q ss_pred CcccccccccCceEEEEEeeecCCC
Q 006631 610 GGSGGAVVNLDGHMIGLVTRYAGGF 634 (637)
Q Consensus 610 G~SGGPL~n~~G~LVGIVsSna~~~ 634 (637)
|+||||++|.+|++|||++......
T Consensus 194 gnsGgpl~n~~g~~iGint~~~~~~ 218 (347)
T COG0265 194 GNSGGPLVNIDGEVVGINTAIIAPS 218 (347)
T ss_pred CCCCCceEcCCCcEEEEEEEEecCC
Confidence 9999999999999999998765543
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.50 E-value=2.4e-13 Score=121.26 Aligned_cols=24 Identities=46% Similarity=0.904 Sum_probs=22.3
Q ss_pred ecCcccCCcccccccccCceEEEE
Q 006631 603 TTAAVHPGGSGGAVVNLDGHMIGL 626 (637)
Q Consensus 603 Tta~v~~G~SGGPL~n~~G~LVGI 626 (637)
+++.+.+|+|||||||.+|++|||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 14
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.38 E-value=1e-11 Score=122.03 Aligned_cols=106 Identities=23% Similarity=0.245 Sum_probs=62.2
Q ss_pred CCcEEEEEEcc---CCCCcceeeCCCC--CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCcccccc-C
Q 006631 521 PLDVSLLQLGY---IPDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQR-N 594 (637)
Q Consensus 521 ~~DIALLkLe~---~~~~l~PI~l~~~--~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~-~ 594 (637)
.+|||||+|+. ....+.|+.+... .+..|+.++++|||................+. +.....|...... .
T Consensus 88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~----~~~~~~C~~~~~~~~ 163 (232)
T cd00190 88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVP----IVSNAECKRAYSYGG 163 (232)
T ss_pred cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEee----eECHHHhhhhccCcc
Confidence 59999999986 2345789988766 67889999999998643221111111111111 1111112111110 0
Q ss_pred CCcceEEEe-----cCcccCCcccccccccC---ceEEEEEeee
Q 006631 595 SAYPVMLET-----TAAVHPGGSGGAVVNLD---GHMIGLVTRY 630 (637)
Q Consensus 595 ~~~~~mlqT-----ta~v~~G~SGGPL~n~~---G~LVGIVsSn 630 (637)
.....+++. ....|.|+|||||+... +.++||++..
T Consensus 164 ~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g 207 (232)
T cd00190 164 TITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWG 207 (232)
T ss_pred cCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehh
Confidence 111234444 34578999999999653 8899999864
No 15
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.28 E-value=4.9e-11 Score=116.45 Aligned_cols=104 Identities=25% Similarity=0.385 Sum_probs=64.9
Q ss_pred CCcEEEEEEccC---CCCcceeeCCCCC--CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCC
Q 006631 521 PLDVSLLQLGYI---PDQLCPIDADFGQ--PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS 595 (637)
Q Consensus 521 ~~DIALLkLe~~---~~~l~PI~l~~~~--~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~ 595 (637)
.+|||||+|+.. .+.+.|+.+.... +..|+.+.++|||.-... +....+....+.-+.. ..|.... ...
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~-~~~~~~~~~~~~~~~~----~~c~~~~-~~~ 159 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDN-GYSSNLQSVTVPVVSR----KTCRSSY-NDN 159 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTT-SBTSBEEEEEEEEEEH----HHHHHHT-TTT
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccc----ccccccc-ccc
Confidence 489999999973 4667888887633 589999999999852111 1111222222221110 0111110 001
Q ss_pred CcceEEEecC----cccCCcccccccccCceEEEEEeee
Q 006631 596 AYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTRY 630 (637)
Q Consensus 596 ~~~~mlqTta----~v~~G~SGGPL~n~~G~LVGIVsSn 630 (637)
....++++.. ..|.|+|||||++.++.|+||++..
T Consensus 160 ~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~ 198 (220)
T PF00089_consen 160 LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG 198 (220)
T ss_dssp STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE
T ss_pred cccccccccccccccccccccccccccceeeecceeeec
Confidence 2345777665 7899999999998666799999976
No 16
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.23 E-value=3.5e-10 Score=111.69 Aligned_cols=107 Identities=23% Similarity=0.292 Sum_probs=60.8
Q ss_pred CCcEEEEEEcc---CCCCcceeeCCCC--CCCCCCeEEEEecCCCCCCCC-CCCceeeeEEeeeeeecCCcCCcccccc-
Q 006631 521 PLDVSLLQLGY---IPDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCG-LSPSVSSGVVAKVVKANLPSYGQSTLQR- 593 (637)
Q Consensus 521 ~~DIALLkLe~---~~~~l~PI~l~~~--~~~~Ge~V~VIGyplfg~~~g-~~~svs~GiVs~v~~v~~~~~~~~~~~~- 593 (637)
.+|||||+|+. ....+.|+.+... .+..++.++++|||......+ .........+. ......|......
T Consensus 88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~----~~~~~~C~~~~~~~ 163 (229)
T smart00020 88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVP----IVSNATCRRAYSGG 163 (229)
T ss_pred cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEE----EeCHHHhhhhhccc
Confidence 59999999986 3456889888765 577899999999986332101 00111111111 1111111111000
Q ss_pred CCCcceEEEe-----cCcccCCcccccccccCc--eEEEEEeeec
Q 006631 594 NSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTRYA 631 (637)
Q Consensus 594 ~~~~~~mlqT-----ta~v~~G~SGGPL~n~~G--~LVGIVsSna 631 (637)
......+++. +...|+|+|||||+...+ .++||++...
T Consensus 164 ~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 164 GAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred cccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0011123333 356899999999996443 8999998753
No 17
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.23 E-value=1.1e-11 Score=136.25 Aligned_cols=127 Identities=18% Similarity=0.311 Sum_probs=107.3
Q ss_pred CCccccC-CCCccEEEEEEeC--CCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC------
Q 006631 205 SNLSLMS-KSTSRVAILGVSS--YLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR------ 275 (637)
Q Consensus 205 ~~~~~~~-~~~t~~A~lki~~--~~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~------ 275 (637)
..|.++. +...|+|++|++. +.+..++++.+..++.|+++.++++||++ .|++++|+||...|..
T Consensus 213 ~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~------~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 213 GEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGL------LNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred CCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCcee------eeeeeecccccccccccccCcc
Confidence 3466664 7779999999963 33778899999999999999999999995 8899999999776542
Q ss_pred --CCCCceEEEecccCCCCcCcceecCCccEEEEEeeccccc-CCcceEEEEeHHHHHHHHHhhh
Q 006631 276 --STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL 337 (637)
Q Consensus 276 --~~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~faip~~~i~~~~~~~~ 337 (637)
.....++|||+++++||+|||++|.+|+.||+++++.... -..+++|++|.+.+...+....
T Consensus 287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 2345789999999999999999999999999999888765 4579999999999998887643
No 18
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.08 E-value=7e-10 Score=122.31 Aligned_cols=112 Identities=29% Similarity=0.434 Sum_probs=82.5
Q ss_pred eeeEEEEeCCCCCcEEEEEEccCCCCcceeeCCCC-CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCc
Q 006631 510 CDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQ 588 (637)
Q Consensus 510 ~~a~Vv~v~~~~~DIALLkLe~~~~~l~PI~l~~~-~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~ 588 (637)
+.+.++..++ ..|+|+++++....-.+++++... .+..|+.+..+|-| +++..+++.|+++...+-...+ ..
T Consensus 213 ~ep~i~g~d~-~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~-----f~~~nt~t~g~vs~~~R~~~~l-g~ 285 (473)
T KOG1320|consen 213 GEPVIVGVDK-VAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNG-----FGLLNTLTQGMVSGQLRKSFKL-GL 285 (473)
T ss_pred CCCeEEcccc-ccceEEEEEecCCcccceeecceeeeecccceeeccccC-----ceeeeeeeeccccccccccccc-Cc
Confidence 5677777777 499999999752233677777654 48999999999886 4666788999998775432111 10
Q ss_pred cccccCCCcceEEEecCcccCCcccccccccCceEEEEEeeec
Q 006631 589 STLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRYA 631 (637)
Q Consensus 589 ~~~~~~~~~~~mlqTta~v~~G~SGGPL~n~~G~LVGIVsSna 631 (637)
. .......++||++++..|+||||++|.+|+.||+.+.+-
T Consensus 286 ~---~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~ 325 (473)
T KOG1320|consen 286 E---TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKV 325 (473)
T ss_pred c---cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeee
Confidence 0 011234579999999999999999999999999888753
No 19
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=98.64 E-value=1.3e-07 Score=106.30 Aligned_cols=165 Identities=23% Similarity=0.372 Sum_probs=109.6
Q ss_pred HHhhccCceEEEEeC----------CCeeeEEEEEeC-CCEEEEcccccCCCCCcceeecCCcccccccCCCCCCCCCCc
Q 006631 389 PIQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG 457 (637)
Q Consensus 389 ~i~~a~~SVV~V~~g----------~~~wGSGvlI~~-~GlILTnAHVV~p~~~~~~~~ng~~~~~~~~~~~~~~~~~~~ 457 (637)
.+..+.++||.|+.. +.+-|+||++++ .|+||||+||+.|.-+... +.|.
T Consensus 57 ~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~--------avf~----------- 117 (955)
T KOG1421|consen 57 TIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVAS--------AVFD----------- 117 (955)
T ss_pred hhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCceeE--------EEec-----------
Confidence 467788999999762 245699999998 7899999999976432111 1110
Q ss_pred ccccccccCCCCCCCcccccccccccccccccccCCceEEEEEEccCCCCceeeeEEEEeCCCCCcEEEEEEccC---CC
Q 006631 458 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYI---PD 534 (637)
Q Consensus 458 ~~~~~~~q~l~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~V~l~~~~~~~w~~a~Vv~v~~~~~DIALLkLe~~---~~ 534 (637)
+... ++-..+|.|+ -+|+.+++.++. -.
T Consensus 118 -----------------------------------n~ee-------------~ei~pvyrDp-VhdfGf~r~dps~ir~s 148 (955)
T KOG1421|consen 118 -----------------------------------NHEE-------------IEIYPVYRDP-VHDFGFFRYDPSTIRFS 148 (955)
T ss_pred -----------------------------------cccc-------------CCcccccCCc-hhhcceeecChhhccee
Confidence 0101 1122344444 489999998851 11
Q ss_pred CcceeeCCCCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccCCcccc
Q 006631 535 QLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 614 (637)
Q Consensus 535 ~l~PI~l~~~~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~G~SGG 614 (637)
.+.-+.+.....++|.+++++|. ..+...++-.|.++.+.+. .+.+.......+..+ ++|..+...+|.||.
T Consensus 149 ~vt~i~lap~~akvgseirvvgN-----DagEklsIlagflSrldr~-apdyg~~~yndfnTf--y~Qaasstsggssgs 220 (955)
T KOG1421|consen 149 IVTEICLAPELAKVGSEIRVVGN-----DAGEKLSILAGFLSRLDRN-APDYGEDTYNDFNTF--YIQAASSTSGGSSGS 220 (955)
T ss_pred eeeccccCccccccCCceEEecC-----CccceEEeehhhhhhccCC-Cccccccccccccce--eeeehhcCCCCCCCC
Confidence 23444455555689999999998 3456677888888877653 233322223322222 688888899999999
Q ss_pred cccccCceEEEEEee
Q 006631 615 AVVNLDGHMIGLVTR 629 (637)
Q Consensus 615 PL~n~~G~LVGIVsS 629 (637)
||++..|..|.++..
T Consensus 221 pVv~i~gyAVAl~ag 235 (955)
T KOG1421|consen 221 PVVDIPGYAVALNAG 235 (955)
T ss_pred ceecccceEEeeecC
Confidence 999999999998874
No 20
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.59 E-value=1.3e-06 Score=88.90 Aligned_cols=107 Identities=23% Similarity=0.239 Sum_probs=61.3
Q ss_pred CcEEEEEEcc---CCCCcceeeCCCCC----CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccC
Q 006631 522 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRN 594 (637)
Q Consensus 522 ~DIALLkLe~---~~~~l~PI~l~~~~----~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~ 594 (637)
+|||||+|+. +.+.++|+.++... ...+..+++.|||.............. .....+.....|.......
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~---~~~v~i~~~~~C~~~~~~~ 182 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQ---EVDVPIISNSECRRAYGGL 182 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeE---EEEEeEcChhHhcccccCc
Confidence 8999999986 45678888886332 344589999999853221001111111 1111111112233222111
Q ss_pred -CCcceEEEec-----CcccCCcccccccccC---ceEEEEEeeec
Q 006631 595 -SAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTRYA 631 (637)
Q Consensus 595 -~~~~~mlqTt-----a~v~~G~SGGPL~n~~---G~LVGIVsSna 631 (637)
.....++++. ..+|.|||||||+-.. ..++||++...
T Consensus 183 ~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~ 228 (256)
T KOG3627|consen 183 GTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGS 228 (256)
T ss_pred cccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecC
Confidence 1112357664 2468999999999643 69999999754
No 21
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.34 E-value=5.8e-06 Score=85.20 Aligned_cols=69 Identities=26% Similarity=0.232 Sum_probs=46.8
Q ss_pred CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEEEecCcccCCcccccccccCceEE
Q 006631 545 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI 624 (637)
Q Consensus 545 ~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~mlqTta~v~~G~SGGPL~n~~G~LV 624 (637)
..+.++.+.++|||.-.+..+... ...+.+..+. ...++.+|.+++|+||.||++.+.++|
T Consensus 157 ~~~~~d~i~v~GYP~dk~~~~~~~-e~t~~v~~~~------------------~~~l~y~~dT~pG~SGSpv~~~~~~vi 217 (251)
T COG3591 157 EAKANDRITVIGYPGDKPNIGTMW-ESTGKVNSIK------------------GNKLFYDADTLPGSSGSPVLISKDEVI 217 (251)
T ss_pred ccccCceeEEEeccCCCCcceeEe-eecceeEEEe------------------cceEEEEecccCCCCCCceEecCceEE
Confidence 368899999999985322122100 1222222211 126888999999999999999888999
Q ss_pred EEEeeecC
Q 006631 625 GLVTRYAG 632 (637)
Q Consensus 625 GIVsSna~ 632 (637)
|+.+++-.
T Consensus 218 gv~~~g~~ 225 (251)
T COG3591 218 GVHYNGPG 225 (251)
T ss_pred EEEecCCC
Confidence 99997654
No 22
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.77 E-value=1e-05 Score=71.90 Aligned_cols=24 Identities=46% Similarity=0.855 Sum_probs=22.5
Q ss_pred EecccCCCCcCcceecCCccEEEE
Q 006631 284 ADIRCLPGMEGGPVFGEHAHFVGI 307 (637)
Q Consensus 284 tDa~~~pG~sGG~v~~~~g~liGi 307 (637)
+|+.+.||+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 23
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.45 E-value=0.0013 Score=67.52 Aligned_cols=90 Identities=17% Similarity=0.302 Sum_probs=43.3
Q ss_pred CCcEEEEEEccCCCCccee--eCCCCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcc
Q 006631 521 PLDVSLLQLGYIPDQLCPI--DADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYP 598 (637)
Q Consensus 521 ~~DIALLkLe~~~~~l~PI--~l~~~~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~ 598 (637)
..||.++|+.. +++|. ++.+..|+.++.|.+||.= +. ..-..-.++....+- + ....
T Consensus 81 ~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~-----fq--~k~~~s~vSesS~i~-p----------~~~~ 139 (235)
T PF00863_consen 81 GRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSN-----FQ--EKSISSTVSESSWIY-P----------EENS 139 (235)
T ss_dssp CSSEEEEE--T---TS----S---B----TT-EEEEEEEE-----CS--SCCCEEEEEEEEEEE-E----------ETTT
T ss_pred CccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEE-----EE--cCCeeEEECCceEEe-e----------cCCC
Confidence 39999999975 34444 3456678999999999972 11 111122222221110 0 1233
Q ss_pred eEEEecCcccCCcccccccc-cCceEEEEEeeec
Q 006631 599 VMLETTAAVHPGGSGGAVVN-LDGHMIGLVTRYA 631 (637)
Q Consensus 599 ~mlqTta~v~~G~SGGPL~n-~~G~LVGIVsSna 631 (637)
.+..+-.+...|+-|.||++ .+|++|||-+...
T Consensus 140 ~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~ 173 (235)
T PF00863_consen 140 HFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTS 173 (235)
T ss_dssp TEEEE-C---TT-TT-EEEETTT--EEEEEEEEE
T ss_pred CeeEEEecCCCCccCCcEEEcCCCcEEEEEcCcc
Confidence 47888889999999999998 5799999998543
No 24
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.26 E-value=0.0022 Score=66.40 Aligned_cols=76 Identities=25% Similarity=0.313 Sum_probs=40.6
Q ss_pred CcEEEEEEccCCCCcceeeCCCCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCCcceEE
Q 006631 522 LDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVML 601 (637)
Q Consensus 522 ~DIALLkLe~~~~~l~PI~l~~~~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~~~~ml 601 (637)
-|.|.-.+...+...+.+++... ..| ++|-.- + .-+..|.|..-.+ +
T Consensus 156 GDfA~~~~~~~~G~~P~~k~a~~--~~G-rAyW~t------~----tGvE~G~ig~~~~--------------------~ 202 (297)
T PF05579_consen 156 GDFAEADITNWPGAAPKYKFAQN--YTG-RAYWLT------S----TGVEPGFIGGGGA--------------------V 202 (297)
T ss_dssp TTEEEEEETTS-S---B--B-TT---SE-EEEEEE------T----TEEEEEEEETTEE--------------------E
T ss_pred CcEEEEECCCCCCCCCceeecCC--ccc-ceEEEc------c----cCcccceecCceE--------------------E
Confidence 78999888665666666665521 122 233221 1 2245555542222 1
Q ss_pred EecCcccCCcccccccccCceEEEEEe-eecCC
Q 006631 602 ETTAAVHPGGSGGAVVNLDGHMIGLVT-RYAGG 633 (637)
Q Consensus 602 qTta~v~~G~SGGPL~n~~G~LVGIVs-Sna~~ 633 (637)
|-..+||||+|++..+|.+|||-+ ||.+|
T Consensus 203 ---~fT~~GDSGSPVVt~dg~liGVHTGSn~~G 232 (297)
T PF05579_consen 203 ---CFTGPGDSGSPVVTEDGDLIGVHTGSNKRG 232 (297)
T ss_dssp ---ESS-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred ---EEcCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence 346789999999999999999999 66665
No 25
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.22 E-value=0.0013 Score=70.61 Aligned_cols=22 Identities=32% Similarity=0.581 Sum_probs=20.0
Q ss_pred CeeeEEEEEeCCCEEEEcccccC
Q 006631 405 GVWASGVLLNDQGLILTNAHLLE 427 (637)
Q Consensus 405 ~~wGSGvlI~~~GlILTnAHVV~ 427 (637)
..+|.|-+++.+ ||||+|||+.
T Consensus 60 ~tfCGgs~l~~R-YvLTAAHC~~ 81 (413)
T COG5640 60 GTFCGGSKLGGR-YVLTAAHCAD 81 (413)
T ss_pred eeEeccceecce-EEeeehhhcc
Confidence 568999999998 9999999995
No 26
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.20 E-value=0.011 Score=61.55 Aligned_cols=92 Identities=15% Similarity=0.087 Sum_probs=57.8
Q ss_pred CCCcEEEEEEccC-CCCcceeeCCCCC--CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeeeecCCcCCccccccCCC
Q 006631 520 GPLDVSLLQLGYI-PDQLCPIDADFGQ--PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 596 (637)
Q Consensus 520 ~~~DIALLkLe~~-~~~l~PI~l~~~~--~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v~~v~~~~~~~~~~~~~~~ 596 (637)
..++++||.++.. .....|+-+++.. ...|+.+.+.|+. .. ..+....+.-... ..
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~-----~~--~~~~~~~~~i~~~--------------~~ 217 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN-----ST--GKLKHRKLKITNC--------------TK 217 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC-----CC--CeEEEEEEEEEEe--------------ec
Confidence 5799999999972 2567777776543 5679999988871 11 1122222221111 01
Q ss_pred cceEEEecCcccCCcccccccc---cCceEEEEEeeecC
Q 006631 597 YPVMLETTAAVHPGGSGGAVVN---LDGHMIGLVTRYAG 632 (637)
Q Consensus 597 ~~~mlqTta~v~~G~SGGPL~n---~~G~LVGIVsSna~ 632 (637)
....+.+....+.|++||||+. ..-.||||.+.+..
T Consensus 218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~ 256 (282)
T PF03761_consen 218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNY 256 (282)
T ss_pred cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCC
Confidence 2234555667889999999993 33459999987653
No 27
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.78 E-value=0.004 Score=73.13 Aligned_cols=21 Identities=38% Similarity=0.512 Sum_probs=19.8
Q ss_pred eEEEEEeCCCEEEEcccccCC
Q 006631 408 ASGVLLNDQGLILTNAHLLEP 428 (637)
Q Consensus 408 GSGvlI~~~GlILTnAHVV~p 428 (637)
|||.+|+++|+||||+||.-.
T Consensus 49 CSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 49 CSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred eeEEEEcCCceEEecchhhhh
Confidence 999999999999999999953
No 28
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S))and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.49 E-value=0.014 Score=56.73 Aligned_cols=115 Identities=16% Similarity=0.086 Sum_probs=72.0
Q ss_pred CccEEEEEEeCC---C--CCCCcccC-CCCCCCCCeEEEEeCCCCCCCC-CcccCceEEEEEecc--cC--CCCCCCceE
Q 006631 214 TSRVAILGVSSY---L--KDLPNIAL-TPLNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLL 282 (637)
Q Consensus 214 ~t~~A~lki~~~---~--~~~~~~~~-s~~~~~G~~v~aigsPfg~~~p-~~f~n~vs~GiIs~~--~~--~~~~~~~~i 282 (637)
..||||||++.. . ..++.+.. ...++.|+.+.++|.+...... ..........+++.. .. ........+
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 479999999854 1 12333444 2346899999999999853221 011223344555532 11 111235567
Q ss_pred EEec----ccCCCCcCcceecCCccEEEEEeeccccc-CCcceEEEEeHHHH
Q 006631 283 MADI----RCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAI 329 (637)
Q Consensus 283 ~tDa----~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~faip~~~i 329 (637)
.++. ...+|+|||||++.++.||||++.. ..+ ......+.+++..+
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~ 216 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSY 216 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGG
T ss_pred cccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHHh
Confidence 7776 7889999999999998999999987 333 33335666665433
No 29
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.42 E-value=0.032 Score=65.78 Aligned_cols=34 Identities=32% Similarity=0.581 Sum_probs=30.5
Q ss_pred CCcceEEEecCcccCCcccccccccCceEEEEEe
Q 006631 595 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVT 628 (637)
Q Consensus 595 ~~~~~mlqTta~v~~G~SGGPL~n~~G~LVGIVs 628 (637)
...|.-+.+|..+.+||||+||+|.+|+||||+-
T Consensus 618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~F 651 (698)
T PF10459_consen 618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAF 651 (698)
T ss_pred CCeeeEEEeccCcCCCCCCCccCCCCceEEEEee
Confidence 4567778888999999999999999999999986
No 30
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=93.93 E-value=0.28 Score=51.14 Aligned_cols=77 Identities=22% Similarity=0.224 Sum_probs=60.0
Q ss_pred cccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCCCCCceEEEecccCCCCcCcceecCCccEEEEEee
Q 006631 231 NIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIR 310 (637)
Q Consensus 231 ~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~~~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~ 310 (637)
.+.-....+.+|.|.++|.|-.- |..+....+.+.|-.... .+++-|+...||+||.||++.+.++||+...
T Consensus 151 ~~~~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~------~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~ 222 (251)
T COG3591 151 KRNTASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKG------NKLFYDADTLPGSSGSPVLISKDEVIGVHYN 222 (251)
T ss_pred ccccccccccCceeEEEeccCCC--CcceeEeeecceeEEEec------ceEEEEecccCCCCCCceEecCceEEEEEec
Confidence 34456678999999999999764 334455556666554432 3688899999999999999999999999999
Q ss_pred ccccc
Q 006631 311 PLRQK 315 (637)
Q Consensus 311 ~l~~~ 315 (637)
.....
T Consensus 223 g~~~~ 227 (251)
T COG3591 223 GPGAN 227 (251)
T ss_pred CCCcc
Confidence 88866
No 31
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=92.18 E-value=0.12 Score=48.75 Aligned_cols=45 Identities=29% Similarity=0.527 Sum_probs=36.1
Q ss_pred EecccCCCCcCcceecCCccEEEEEeecccccCCc-ceEEEEeHHHH
Q 006631 284 ADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKSGA-EIQLVIPWEAI 329 (637)
Q Consensus 284 tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~~~~-~l~faip~~~i 329 (637)
.-+..+-|+|||||+...|++|||..+.++.++.. .+-|+ ||+.+
T Consensus 101 ~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 101 RPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp EEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred ceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 45667889999999999999999999999887443 77787 99875
No 32
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=91.28 E-value=2.7 Score=41.27 Aligned_cols=34 Identities=29% Similarity=0.475 Sum_probs=28.5
Q ss_pred cceEEEecCcccCCccccccccc---CceEEEEEeee
Q 006631 597 YPVMLETTAAVHPGGSGGAVVNL---DGHMIGLVTRY 630 (637)
Q Consensus 597 ~~~mlqTta~v~~G~SGGPL~n~---~G~LVGIVsSn 630 (637)
++.++.+.++...|+-||||+.. .++++||-++.
T Consensus 134 ~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 134 TPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred eeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 45688889999999999999942 58999998864
No 33
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=90.07 E-value=0.24 Score=46.77 Aligned_cols=36 Identities=19% Similarity=0.397 Sum_probs=25.5
Q ss_pred ceEEEecccCCCCcCcceecCCccEEEEEeeccccc
Q 006631 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK 315 (637)
Q Consensus 280 ~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~ 315 (637)
.+.+.|..+-+|+||.|+||.+|++|||--..+.-.
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 566778889999999999999999999987666443
No 34
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases; ndl, gd and snk process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=89.86 E-value=2.6 Score=43.77 Aligned_cols=31 Identities=32% Similarity=0.570 Sum_probs=26.5
Q ss_pred eEEEEeCCCeeeEEEEEeCCCEEEEcccccCC
Q 006631 397 VCLITIDDGVWASGVLLNDQGLILTNAHLLEP 428 (637)
Q Consensus 397 VV~V~~g~~~wGSGvlI~~~GlILTnAHVV~p 428 (637)
...|.+++.-||||+||+++ |||++..|+..
T Consensus 19 lA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~ 49 (267)
T PF09342_consen 19 LADIYVDGRYWCSGVLLDPH-WLLVSSSCLRG 49 (267)
T ss_pred eeeEEEcCeEEEEEEEeccc-eEEEeccccCC
Confidence 34667777889999999998 99999999963
No 35
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=89.71 E-value=1.6 Score=42.59 Aligned_cols=99 Identities=17% Similarity=0.144 Sum_probs=53.7
Q ss_pred CccEEEEEEeCCC-----CCCCcccCCC-CCCCCCeEEEEeCCCCCCC--CCcccCceEEEEEecc--cCCC----CCCC
Q 006631 214 TSRVAILGVSSYL-----KDLPNIALTP-LNKRGDLLLAVGSPFGVLS--PMHFFNSVSMGSVANC--YPPR----STTR 279 (637)
Q Consensus 214 ~t~~A~lki~~~~-----~~~~~~~~s~-~~~~G~~v~aigsPfg~~~--p~~f~n~vs~GiIs~~--~~~~----~~~~ 279 (637)
..||||||++... ..++.+.... .+..|+.+.+.|....... ...-......-+++.. .... ....
T Consensus 88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~ 167 (232)
T cd00190 88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITD 167 (232)
T ss_pred cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCC
Confidence 3799999997422 2333444443 6788999999996543211 0111122223333321 0000 0011
Q ss_pred ceEEE-----ecccCCCCcCcceecCC---ccEEEEEeecc
Q 006631 280 SLLMA-----DIRCLPGMEGGPVFGEH---AHFVGILIRPL 312 (637)
Q Consensus 280 ~~i~t-----Da~~~pG~sGG~v~~~~---g~liGiv~~~l 312 (637)
..+-+ +...-+|.|||||+... ..|+||++...
T Consensus 168 ~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 168 NMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred ceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 11111 33455799999999875 67999998654
No 36
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=88.33 E-value=2.8 Score=41.11 Aligned_cols=99 Identities=15% Similarity=0.099 Sum_probs=52.5
Q ss_pred CccEEEEEEeCCC-----CCCCcccCC-CCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccC-----CC----CCC
Q 006631 214 TSRVAILGVSSYL-----KDLPNIALT-PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYP-----PR----STT 278 (637)
Q Consensus 214 ~t~~A~lki~~~~-----~~~~~~~~s-~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~-----~~----~~~ 278 (637)
..||||||++... ..++.+... ..+..|+.+.+.|..-.......+...+....+.-... .. ...
T Consensus 88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~ 167 (229)
T smart00020 88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAIT 167 (229)
T ss_pred cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccC
Confidence 4799999997431 123334432 35777899999986543211111222222222221110 00 000
Q ss_pred CceEE-----EecccCCCCcCcceecCCc--cEEEEEeecc
Q 006631 279 RSLLM-----ADIRCLPGMEGGPVFGEHA--HFVGILIRPL 312 (637)
Q Consensus 279 ~~~i~-----tDa~~~pG~sGG~v~~~~g--~liGiv~~~l 312 (637)
...+- .+...-+|.+||||+...+ .|+||++..-
T Consensus 168 ~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 168 DNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 01110 1344567999999998765 7999988654
No 37
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=87.48 E-value=6.2 Score=46.27 Aligned_cols=46 Identities=9% Similarity=0.029 Sum_probs=33.6
Q ss_pred eeeEEEEeCCCCCcEEEEEEccCCCCcceeeCCCCCCCCCCeEEEEecC
Q 006631 510 CDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHG 558 (637)
Q Consensus 510 ~~a~Vv~v~~~~~DIALLkLe~~~~~l~PI~l~~~~~~~Ge~V~VIGyp 558 (637)
..|.+.+.++. ..+|.+|-++ ......++.+..+..|++|...|+-
T Consensus 588 i~a~~~fL~~t-~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~~ 633 (955)
T KOG1421|consen 588 IPANVSFLHPT-ENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGFT 633 (955)
T ss_pred ccceeeEecCc-cceeEeccCh--hHhhhhccceeeEecCCceeEeccc
Confidence 56777777764 7788888874 3334456666668899999999983
No 38
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=86.97 E-value=1.3 Score=45.82 Aligned_cols=108 Identities=20% Similarity=0.192 Sum_probs=51.6
Q ss_pred CccEEEEEEeCCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCCCCCceEEEecccCCCCc
Q 006631 214 TSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGME 293 (637)
Q Consensus 214 ~t~~A~lki~~~~~~~~~~~~s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiIs~~~~~~~~~~~~i~tDa~~~pG~s 293 (637)
..||.++|...+..|.+..-.-...+.||.|..||+=|--.+ ..-+||. -|.+.+ .....|+-=-+.-.+|+.
T Consensus 81 ~~DiviirmPkDfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~---~~s~vSe--sS~i~p--~~~~~fWkHwIsTk~G~C 153 (235)
T PF00863_consen 81 GRDIVIIRMPKDFPPFPQKLKFRAPKEGERVCMVGSNFQEKS---ISSTVSE--SSWIYP--EENSHFWKHWISTKDGDC 153 (235)
T ss_dssp CSSEEEEE--TTS----S---B----TT-EEEEEEEECSSCC---CEEEEEE--EEEEEE--ETTTTEEEE-C---TT-T
T ss_pred CccEEEEeCCcccCCcchhhhccCCCCCCEEEEEEEEEEcCC---eeEEECC--ceEEee--cCCCCeeEEEecCCCCcc
Confidence 479999999866566666556678999999999998875211 1112222 122222 123567777788889999
Q ss_pred CcceecC-CccEEEEEeecccccCCcceEEEEeHHHH
Q 006631 294 GGPVFGE-HAHFVGILIRPLRQKSGAEIQLVIPWEAI 329 (637)
Q Consensus 294 GG~v~~~-~g~liGiv~~~l~~~~~~~l~faip~~~i 329 (637)
|.||++. +|.+|||-...-.. ...++-.++|-+-+
T Consensus 154 G~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~~~f~ 189 (235)
T PF00863_consen 154 GLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFPDDFE 189 (235)
T ss_dssp T-EEEETTT--EEEEEEEEETT-TSSEEEEE--TTHH
T ss_pred CCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCCHHHH
Confidence 9999977 78899999843322 23334444444433
No 39
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=85.76 E-value=0.61 Score=44.12 Aligned_cols=24 Identities=29% Similarity=0.584 Sum_probs=18.5
Q ss_pred CcccCCcccccccccCceEEEEEe
Q 006631 605 AAVHPGGSGGAVVNLDGHMIGLVT 628 (637)
Q Consensus 605 a~v~~G~SGGPL~n~~G~LVGIVs 628 (637)
....+|.||+|+||.+|++|||--
T Consensus 92 ~d~~~GsSGSpi~n~~g~ivGlYg 115 (132)
T PF00949_consen 92 LDFPKGSSGSPIFNQNGEIVGLYG 115 (132)
T ss_dssp --S-TTGTT-EEEETTSCEEEEEE
T ss_pred cccCCCCCCCceEcCCCcEEEEEc
Confidence 357789999999999999999864
No 40
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=83.91 E-value=1.2 Score=42.18 Aligned_cols=32 Identities=25% Similarity=0.398 Sum_probs=25.5
Q ss_pred EEEecCcccCCcccccccccCceEEEEEeeec
Q 006631 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVTRYA 631 (637)
Q Consensus 600 mlqTta~v~~G~SGGPL~n~~G~LVGIVsSna 631 (637)
+...+..-.+||||-|++|.+|++||||-..+
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 34445677899999999999999999998654
No 41
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=81.19 E-value=7.2 Score=45.73 Aligned_cols=114 Identities=18% Similarity=0.149 Sum_probs=70.9
Q ss_pred cCCCCccEEEEEEeCCCC---------------CCCcccC------CCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEE
Q 006631 210 MSKSTSRVAILGVSSYLK---------------DLPNIAL------TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSV 268 (637)
Q Consensus 210 ~~~~~t~~A~lki~~~~~---------------~~~~~~~------s~~~~~G~~v~aigsPfg~~~p~~f~n~vs~GiI 268 (637)
+.+...|+||+||+.... |...+.+ -..+..|..|+=+|.==|+ |.|+|
T Consensus 538 i~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G~l 607 (695)
T PF08192_consen 538 INKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTGIL 607 (695)
T ss_pred hcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cceEe
Confidence 335568999999984220 1111211 1247789999999987776 35666
Q ss_pred eccc----CCCC-CCCceEEEe----cccCCCCcCcceecCCcc------EEEEEeecccccCCcceEEEEeHHHHHHHH
Q 006631 269 ANCY----PPRS-TTRSLLMAD----IRCLPGMEGGPVFGEHAH------FVGILIRPLRQKSGAEIQLVIPWEAIATAC 333 (637)
Q Consensus 269 s~~~----~~~~-~~~~~i~tD----a~~~pG~sGG~v~~~~g~------liGiv~~~l~~~~~~~l~faip~~~i~~~~ 333 (637)
.+.. .++. ....+++.. +=..+|.||.=|+++-+. |+||+-+.=. ....|++..||..|..-+
T Consensus 608 Ng~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~rl 685 (695)
T PF08192_consen 608 NGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILDRL 685 (695)
T ss_pred cceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHHHH
Confidence 6431 1111 112344443 446679999999987444 8898874221 224788999999998766
Q ss_pred Hh
Q 006631 334 SD 335 (637)
Q Consensus 334 ~~ 335 (637)
.+
T Consensus 686 ~~ 687 (695)
T PF08192_consen 686 EE 687 (695)
T ss_pred HH
Confidence 54
No 42
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=78.21 E-value=3 Score=39.16 Aligned_cols=30 Identities=30% Similarity=0.494 Sum_probs=24.1
Q ss_pred ceEEEecccCCCCcCcceecCCccEEEEEee
Q 006631 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIR 310 (637)
Q Consensus 280 ~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~ 310 (637)
.+++.--.+.||..||+|+.++| +|||+++
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta 108 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCKHG-VIGIVTA 108 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEETTC-EEEEEEE
T ss_pred CceeecccCCCCCCCceeEeCCC-eEEEEEe
Confidence 45667778999999999998886 9999996
No 43
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=77.75 E-value=3.5 Score=39.19 Aligned_cols=32 Identities=22% Similarity=0.400 Sum_probs=26.0
Q ss_pred eEEEecccCCCCcCcceecCCccEEEEEeecc
Q 006631 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRPL 312 (637)
Q Consensus 281 ~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l 312 (637)
|.+--..-.||.||-|+||..|++||||++--
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 44555667899999999999999999999643
No 44
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=70.04 E-value=3.1 Score=39.53 Aligned_cols=24 Identities=29% Similarity=0.603 Sum_probs=19.1
Q ss_pred cccCCcccccccccCceEEEEEee
Q 006631 606 AVHPGGSGGAVVNLDGHMIGLVTR 629 (637)
Q Consensus 606 ~v~~G~SGGPL~n~~G~LVGIVsS 629 (637)
+.-.|+|||||+-.+|++|||-.+
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~a 127 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRA 127 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEE
T ss_pred EEEecCCCCcccCCCCCEEEEEEE
Confidence 456799999999889999999775
No 45
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=66.81 E-value=2.7 Score=42.90 Aligned_cols=29 Identities=41% Similarity=0.721 Sum_probs=23.6
Q ss_pred EEEecCcccCCcccccccccCceEEEEEee
Q 006631 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVTR 629 (637)
Q Consensus 600 mlqTta~v~~G~SGGPL~n~~G~LVGIVsS 629 (637)
++..+..+..|+||+|++ .+|+|||=|+-
T Consensus 170 Ll~~TGGIvqGMSGSPI~-qdGKLiGAVth 198 (218)
T PF05580_consen 170 LLEKTGGIVQGMSGSPII-QDGKLIGAVTH 198 (218)
T ss_pred hhhhhCCEEecccCCCEE-ECCEEEEEEEE
Confidence 344456788999999999 69999998873
No 46
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=66.37 E-value=21 Score=41.97 Aligned_cols=98 Identities=16% Similarity=0.286 Sum_probs=55.5
Q ss_pred CCCCcEEEEEEcc-------CCCCcc-----e-eeCC-------CCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeee
Q 006631 519 KGPLDVSLLQLGY-------IPDQLC-----P-IDAD-------FGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKV 578 (637)
Q Consensus 519 ~~~~DIALLkLe~-------~~~~l~-----P-I~l~-------~~~~~~Ge~V~VIGyplfg~~~g~~~svs~GiVs~v 578 (637)
..-.|+|||+++. +.+.+. | +.+. ...+.+|..|+=+|- ..+ .+.|.++.+
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~Gr-----TTg----yT~G~lNg~ 610 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGR-----TTG----YTTGILNGI 610 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecc-----cCC----ccceEecce
Confidence 3447999999985 222222 1 1111 123567999998886 344 477877765
Q ss_pred eeecCCcCCccccccCCCcceEEEec----CcccCCcccccccccCc------eEEEEEeeecC
Q 006631 579 VKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDG------HMIGLVTRYAG 632 (637)
Q Consensus 579 ~~v~~~~~~~~~~~~~~~~~~mlqTt----a~v~~G~SGGPL~n~~G------~LVGIVsSna~ 632 (637)
.-+. -.. +. -....++... .-..+||||.=|++.-+ .|+||..|.-+
T Consensus 611 klvy-w~d-----G~-i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg 667 (695)
T PF08192_consen 611 KLVY-WAD-----GK-IQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG 667 (695)
T ss_pred EEEE-ecC-----CC-eEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC
Confidence 3211 000 00 0011233333 44678999999998533 49999998543
No 47
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=60.59 E-value=3.3 Score=40.10 Aligned_cols=42 Identities=29% Similarity=0.256 Sum_probs=31.3
Q ss_pred hhhhcccccceeccCcee---eeeeeeecccccC--ChhhhhhccCC
Q 006631 26 GLKMRRHAFHQYNSGKTT---LSASGMLLPLSFF--DTKVAERNWGV 67 (637)
Q Consensus 26 ~~k~~~~~f~~~~~g~~t---~sas~~~~p~~~~--~~~~~~~~~~~ 67 (637)
||+-++|+||.|+.|.+| .||-...=|.+.. .+.+..|.+++
T Consensus 38 GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd 84 (154)
T KOG0441|consen 38 GLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD 84 (154)
T ss_pred cCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence 444499999999999998 6776666666555 46667777776
No 48
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=60.29 E-value=9.7 Score=35.83 Aligned_cols=33 Identities=30% Similarity=0.502 Sum_probs=24.5
Q ss_pred EEEecCcccCCcccccccccCceEEEEEeeecCC
Q 006631 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVTRYAGG 633 (637)
Q Consensus 600 mlqTta~v~~G~SGGPL~n~~G~LVGIVsSna~~ 633 (637)
++.....+.||+.||+|+ .+--+|||+|+...+
T Consensus 80 ~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg~g 112 (127)
T PF00947_consen 80 LLIGEGPAEPGDCGGILR-CKHGVIGIVTAGGEG 112 (127)
T ss_dssp EEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEETT
T ss_pred ceeecccCCCCCCCceeE-eCCCeEEEEEeCCCc
Confidence 455567899999999999 455599999987654
No 49
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=51.16 E-value=9.4 Score=41.86 Aligned_cols=24 Identities=25% Similarity=0.515 Sum_probs=21.3
Q ss_pred CcccCCcccccccccCceEEEEEe
Q 006631 605 AAVHPGGSGGAVVNLDGHMIGLVT 628 (637)
Q Consensus 605 a~v~~G~SGGPL~n~~G~LVGIVs 628 (637)
....+|+||+.|+|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 367799999999999999999964
No 50
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=50.38 E-value=6.9 Score=43.62 Aligned_cols=28 Identities=39% Similarity=0.695 Sum_probs=23.1
Q ss_pred EEEecCcccCCcccccccccCceEEEEEe
Q 006631 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVT 628 (637)
Q Consensus 600 mlqTta~v~~G~SGGPL~n~~G~LVGIVs 628 (637)
.+.-+..+..|+||+|++ .+|+|||=||
T Consensus 350 ll~~tgGivqGMSGSPi~-q~gkliGAvt 377 (402)
T TIGR02860 350 LLEKTGGIVQGMSGSPII-QNGKVIGAVT 377 (402)
T ss_pred HhhHhCCEEecccCCCEE-ECCEEEEEEE
Confidence 344456788999999999 6999999776
No 51
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=41.03 E-value=21 Score=36.58 Aligned_cols=38 Identities=18% Similarity=0.318 Sum_probs=27.1
Q ss_pred cCCCCcCcceecCCccEEEEEeecccccCCcceEEEEeHHH
Q 006631 288 CLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPWEA 328 (637)
Q Consensus 288 ~~pG~sGG~v~~~~g~liGiv~~~l~~~~~~~l~faip~~~ 328 (637)
|..||||.|++- +|+|||-|+--|......| ..|+++.
T Consensus 177 IvqGMSGSPI~q-dGKLiGAVthvf~~dp~~G--ygi~ie~ 214 (218)
T PF05580_consen 177 IVQGMSGSPIIQ-DGKLIGAVTHVFVNDPTKG--YGIFIEW 214 (218)
T ss_pred EEecccCCCEEE-CCEEEEEEEEEEecCCCce--eeecHHH
Confidence 567999999986 8999999998764433333 3455544
No 52
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=40.19 E-value=58 Score=31.97 Aligned_cols=89 Identities=19% Similarity=0.304 Sum_probs=49.3
Q ss_pred CccEEEEEEeCCCCCCCc----ccCCCCCCCCCeEEEEeCC-CCCCCCCcc-cC-ceEEEEEecccCCCCCCCceEEEec
Q 006631 214 TSRVAILGVSSYLKDLPN----IALTPLNKRGDLLLAVGSP-FGVLSPMHF-FN-SVSMGSVANCYPPRSTTRSLLMADI 286 (637)
Q Consensus 214 ~t~~A~lki~~~~~~~~~----~~~s~~~~~G~~v~aigsP-fg~~~p~~f-~n-~vs~GiIs~~~~~~~~~~~~i~tDa 286 (637)
.+|+++++++.. ..... |.+.. -...+.++++-++ |+- ..+ .. ....|.| +..+ ......|.=++
T Consensus 71 ~~Dl~~v~l~~~-~kfrDIrk~~~~~~-~~~~~~~l~v~~~~~~~---~~~~v~~v~~~~~i-~~~g--~~~~~~~~Y~~ 142 (172)
T PF00548_consen 71 DTDLTLVKLPRN-PKFRDIRKFFPESI-PEYPECVLLVNSTKFPR---MIVEVGFVTNFGFI-NLSG--TTTPRSLKYKA 142 (172)
T ss_dssp EEEEEEEEEESS-S-B--GGGGSBSSG-GTEEEEEEEEESSSSTC---EEEEEEEEEEEEEE-EETT--EEEEEEEEEES
T ss_pred ceeEEEEEccCC-cccCchhhhhcccc-ccCCCcEEEEECCCCcc---EEEEEEEEeecCcc-ccCC--CEeeEEEEEcc
Confidence 489999999642 11111 22111 2455666666654 441 111 11 1123444 2221 12234577788
Q ss_pred ccCCCCcCcceecC---CccEEEEEee
Q 006631 287 RCLPGMEGGPVFGE---HAHFVGILIR 310 (637)
Q Consensus 287 ~~~pG~sGG~v~~~---~g~liGiv~~ 310 (637)
+--+|+.||+|+.. .+.++||=+|
T Consensus 143 ~t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 143 PTKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp EEETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred CCCCCccCCeEEEeeccCccEEEEEec
Confidence 88899999999964 5679999775
No 53
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=39.53 E-value=20 Score=37.94 Aligned_cols=27 Identities=26% Similarity=0.511 Sum_probs=20.7
Q ss_pred cCCCCcCcceecCCccEEEEEeecccc
Q 006631 288 CLPGMEGGPVFGEHAHFVGILIRPLRQ 314 (637)
Q Consensus 288 ~~pG~sGG~v~~~~g~liGiv~~~l~~ 314 (637)
-.||.||.||+..+|.+||+-++.=.+
T Consensus 205 T~~GDSGSPVVt~dg~liGVHTGSn~~ 231 (297)
T PF05579_consen 205 TGPGDSGSPVVTEDGDLIGVHTGSNKR 231 (297)
T ss_dssp S-GGCTT-EEEETTC-EEEEEEEEETT
T ss_pred cCCCCCCCccCcCCCCEEEEEecCCCc
Confidence 369999999999999999999976433
No 54
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=36.69 E-value=2e+02 Score=29.78 Aligned_cols=91 Identities=16% Similarity=0.188 Sum_probs=54.7
Q ss_pred CccEEEEEEeCC---CCCCCcccCCC-CCCCCCeEEEEeC-CCCCCCCCcccCceEEEEEecccCCCCCCCceEEEeccc
Q 006631 214 TSRVAILGVSSY---LKDLPNIALTP-LNKRGDLLLAVGS-PFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRC 288 (637)
Q Consensus 214 ~t~~A~lki~~~---~~~~~~~~~s~-~~~~G~~v~aigs-Pfg~~~p~~f~n~vs~GiIs~~~~~~~~~~~~i~tDa~~ 288 (637)
..+++||+++.. ...++.++++. .+..||.+-+-|. .-+ .++..-+. |..... ....+.++-..
T Consensus 160 ~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~----~~~~~~~~---i~~~~~----~~~~~~~~~~~ 228 (282)
T PF03761_consen 160 PYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG----KLKHRKLK---ITNCTK----CAYSICTKQYS 228 (282)
T ss_pred ccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCCC----eEEEEEEE---EEEeec----cceeEeccccc
Confidence 357889999854 56677787655 4788898887776 122 11111111 111110 12235566666
Q ss_pred CCCCcCcceecC-Ccc--EEEEEeeccccc
Q 006631 289 LPGMEGGPVFGE-HAH--FVGILIRPLRQK 315 (637)
Q Consensus 289 ~pG~sGG~v~~~-~g~--liGiv~~~l~~~ 315 (637)
-+|..|||++.. +|+ ||||.+..-...
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~ 258 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASGNYEC 258 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccCCCcc
Confidence 689999999833 454 999998655443
No 55
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=33.50 E-value=1.4e+02 Score=27.29 Aligned_cols=17 Identities=24% Similarity=0.425 Sum_probs=14.2
Q ss_pred EEEEeCCCEEEEcccccC
Q 006631 410 GVLLNDQGLILTNAHLLE 427 (637)
Q Consensus 410 GvlI~~~GlILTnAHVV~ 427 (637)
++.|.+ |.++|+.||++
T Consensus 3 avHIGn-G~~vt~tHva~ 19 (105)
T PF03510_consen 3 AVHIGN-GRYVTVTHVAK 19 (105)
T ss_pred eEEeCC-CEEEEEEEEec
Confidence 567775 89999999996
No 56
>PF00571 CBS: CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=30.30 E-value=35 Score=26.04 Aligned_cols=22 Identities=36% Similarity=0.631 Sum_probs=18.4
Q ss_pred cCCcccccccccCceEEEEEee
Q 006631 608 HPGGSGGAVVNLDGHMIGLVTR 629 (637)
Q Consensus 608 ~~G~SGGPL~n~~G~LVGIVsS 629 (637)
..+-+.-||+|.+|+++|+++.
T Consensus 27 ~~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 27 KNGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HHTSSEEEEESTTSBEEEEEEH
T ss_pred HcCCcEEEEEecCCEEEEEEEH
Confidence 3467788999999999999984
No 57
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=29.58 E-value=37 Score=37.27 Aligned_cols=26 Identities=23% Similarity=0.373 Sum_probs=21.7
Q ss_pred EecccCCCCcCcceecCCccEEEEEe
Q 006631 284 ADIRCLPGMEGGPVFGEHAHFVGILI 309 (637)
Q Consensus 284 tDa~~~pG~sGG~v~~~~g~liGiv~ 309 (637)
.+...-.|.||..|+|.+|++|||.-
T Consensus 348 ~~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 348 DNYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cccCCCCCCCcCeEECCCCCEEEEeC
Confidence 34455579999999999999999974
No 58
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=26.98 E-value=1.5e+02 Score=33.64 Aligned_cols=36 Identities=31% Similarity=0.317 Sum_probs=23.8
Q ss_pred ceEEEecC-------cccCCcccccccccCc---eEEEEEeeecCC
Q 006631 598 PVMLETTA-------AVHPGGSGGAVVNLDG---HMIGLVTRYAGG 633 (637)
Q Consensus 598 ~~mlqTta-------~v~~G~SGGPL~n~~G---~LVGIVsSna~~ 633 (637)
..||.|.+ ...|||-|-|-|-..| -|+|+.++.+++
T Consensus 484 ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr~ 529 (535)
T PF05416_consen 484 MGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATRS 529 (535)
T ss_dssp EEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-SS
T ss_pred eeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhccC
Confidence 34676643 4678999999996544 489999988775
No 59
>PF08208 RNA_polI_A34: DNA-directed RNA polymerase I subunit RPA34.5; InterPro: IPR013240 This is a family of proteins conserved from yeasts to humans. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription.; PDB: 3NFG_N.
Probab=25.32 E-value=24 Score=35.30 Aligned_cols=13 Identities=62% Similarity=0.950 Sum_probs=0.0
Q ss_pred Ccchhhhcccccc
Q 006631 23 DPKGLKMRRHAFH 35 (637)
Q Consensus 23 dpk~~k~~~~~f~ 35 (637)
-|+|||||.|+|=
T Consensus 109 qp~gLk~Rf~P~G 121 (198)
T PF08208_consen 109 QPKGLKMRFFPFG 121 (198)
T ss_dssp -------------
T ss_pred CCCCcceeeecCC
Confidence 3899999999884
Done!