Query 005822
Match_columns 675
No_of_seqs 318 out of 2212
Neff 5.7
Searched_HMMs 46136
Date Thu Mar 28 14:15:38 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/005822.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/005822hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10898 serine endoprotease; 99.9 4.4E-24 9.5E-29 231.1 22.1 186 388-658 49-250 (353)
2 TIGR02038 protease_degS peripl 99.9 8.1E-24 1.7E-28 228.8 23.1 186 388-658 49-249 (351)
3 PRK10139 serine endoprotease; 99.9 1E-23 2.3E-28 235.0 22.2 188 387-658 43-261 (455)
4 PRK10942 serine endoprotease; 99.9 1.8E-22 3.8E-27 226.1 22.4 144 497-659 138-283 (473)
5 TIGR02037 degP_htrA_DO peripla 99.9 4.8E-22 1E-26 220.1 22.8 170 406-659 58-229 (428)
6 PRK10139 serine endoprotease; 99.8 1.5E-20 3.3E-25 209.7 14.0 120 213-338 136-260 (455)
7 PRK10942 serine endoprotease; 99.8 5.5E-19 1.2E-23 198.1 13.6 119 214-338 158-281 (473)
8 TIGR02038 protease_degS peripl 99.8 1.3E-18 2.8E-23 188.4 14.0 119 214-338 124-248 (351)
9 PRK10898 serine endoprotease; 99.8 1.8E-18 3.8E-23 187.5 13.6 118 215-338 125-249 (353)
10 TIGR02037 degP_htrA_DO peripla 99.7 1.4E-17 2.9E-22 184.7 14.0 118 215-338 105-227 (428)
11 COG0265 DegQ Trypsin-like seri 99.7 2.6E-16 5.7E-21 169.8 19.7 187 388-657 37-242 (347)
12 COG0265 DegQ Trypsin-like seri 99.7 4.9E-17 1.1E-21 175.4 12.1 121 212-338 116-242 (347)
13 PF13365 Trypsin_2: Trypsin-li 99.5 3.9E-13 8.4E-18 120.8 13.9 24 603-626 97-120 (120)
14 cd00190 Tryp_SPc Trypsin-like 99.4 8.4E-12 1.8E-16 123.4 15.6 181 394-631 12-208 (232)
15 PF00089 Trypsin: Trypsin; In 99.3 3.7E-11 8E-16 118.1 13.5 104 521-630 86-198 (220)
16 KOG1320 Serine protease [Postt 99.2 1.5E-11 3.3E-16 136.2 7.1 128 204-337 212-351 (473)
17 smart00020 Tryp_SPc Trypsin-li 99.2 3.2E-10 7E-15 112.7 15.6 107 521-631 88-208 (229)
18 KOG1320 Serine protease [Postt 99.2 1.3E-10 2.7E-15 129.0 13.3 139 510-658 213-353 (473)
19 KOG1421 Predicted signaling-as 98.5 3.2E-07 7E-12 103.9 10.7 188 389-655 57-258 (955)
20 KOG3627 Trypsin [Amino acid tr 98.5 2.8E-06 6.1E-11 87.0 16.3 107 522-632 106-229 (256)
21 COG3591 V8-like Glu-specific e 98.2 7.9E-06 1.7E-10 84.9 11.0 71 545-634 157-227 (251)
22 PF13365 Trypsin_2: Trypsin-li 97.7 1.5E-05 3.2E-10 71.4 1.8 24 284-307 97-120 (120)
23 PF00863 Peptidase_C4: Peptida 97.7 0.00072 1.6E-08 69.9 13.6 92 521-633 81-175 (235)
24 PF03761 DUF316: Domain of unk 97.6 0.0031 6.8E-08 66.2 18.5 106 520-646 159-270 (282)
25 COG5640 Secreted trypsin-like 97.3 0.00092 2E-08 72.1 8.9 22 405-427 60-81 (413)
26 PF05579 Peptidase_S32: Equine 97.1 0.0032 6.8E-08 65.8 10.8 76 522-633 156-231 (297)
27 PF00089 Trypsin: Trypsin; In 96.4 0.015 3.3E-07 57.0 8.7 114 215-329 87-216 (220)
28 PF10459 Peptidase_S46: Peptid 93.8 0.05 1.1E-06 64.6 3.8 59 595-653 618-684 (698)
29 COG3591 V8-like Glu-specific e 93.1 0.48 1E-05 49.8 9.3 76 232-315 152-227 (251)
30 PF09342 DUF1986: Domain of un 92.4 0.87 1.9E-05 47.6 9.8 34 395-429 17-50 (267)
31 PF02907 Peptidase_S29: Hepati 91.8 0.14 3E-06 48.7 3.1 45 284-329 101-146 (148)
32 PF00548 Peptidase_C3: 3C cyst 91.1 3 6.4E-05 41.4 11.7 34 597-630 134-170 (172)
33 cd00190 Tryp_SPc Trypsin-like 90.3 0.73 1.6E-05 45.3 6.7 98 215-312 89-208 (232)
34 PF10459 Peptidase_S46: Peptid 89.2 0.27 5.9E-06 58.6 3.2 29 282-310 624-652 (698)
35 PF00949 Peptidase_S7: Peptida 89.1 0.32 6.9E-06 46.4 2.9 35 280-314 86-120 (132)
36 PF00863 Peptidase_C4: Peptida 87.9 1 2.3E-05 46.9 6.0 86 215-310 82-171 (235)
37 smart00020 Tryp_SPc Trypsin-li 86.9 3.7 8E-05 40.6 9.2 99 214-312 88-208 (229)
38 PF00944 Peptidase_S3: Alphavi 86.3 1.9 4.1E-05 41.3 6.2 36 601-636 97-132 (158)
39 PF00949 Peptidase_S7: Peptida 85.7 1.7 3.6E-05 41.6 5.6 31 604-634 91-121 (132)
40 PF08192 Peptidase_S64: Peptid 84.7 4.7 0.0001 47.6 9.6 116 208-335 536-687 (695)
41 PF02907 Peptidase_S29: Hepati 83.3 1.9 4E-05 41.3 4.7 42 606-648 104-146 (148)
42 KOG1421 Predicted signaling-as 80.4 32 0.00069 41.1 14.0 46 510-558 588-633 (955)
43 PF08192 Peptidase_S64: Peptid 79.3 15 0.00032 43.6 11.0 90 546-653 587-686 (695)
44 PF05580 Peptidase_S55: SpoIVB 78.5 1.8 3.9E-05 44.5 3.1 45 602-649 172-216 (218)
45 PF00947 Pico_P2A: Picornaviru 77.2 3.3 7.2E-05 39.2 4.2 30 280-310 79-108 (127)
46 PF00944 Peptidase_S3: Alphavi 76.4 2.9 6.3E-05 40.1 3.6 32 281-312 96-127 (158)
47 KOG0441 Cu2+/Zn2+ superoxide d 63.3 3.4 7.4E-05 40.3 1.1 42 26-67 38-84 (154)
48 TIGR02860 spore_IV_B stage IV 57.8 8.7 0.00019 43.2 3.2 45 603-650 353-397 (402)
49 PF00947 Pico_P2A: Picornaviru 57.5 18 0.00038 34.5 4.7 31 600-631 80-110 (127)
50 PF02122 Peptidase_S39: Peptid 51.2 16 0.00035 37.4 3.6 59 600-659 137-195 (203)
51 PF01732 DUF31: Putative pepti 50.0 11 0.00024 41.7 2.4 24 605-628 350-373 (374)
52 PF05416 Peptidase_C37: Southa 48.1 71 0.0015 36.3 8.1 40 597-636 483-532 (535)
53 PF05580 Peptidase_S55: SpoIVB 41.6 21 0.00045 37.0 2.7 39 288-329 177-215 (218)
54 PF00548 Peptidase_C3: 3C cyst 38.2 71 0.0015 31.7 5.8 89 214-310 71-169 (172)
55 PF05579 Peptidase_S32: Equine 34.6 27 0.00058 37.4 2.2 26 288-313 205-230 (297)
56 PF01732 DUF31: Putative pepti 28.7 38 0.00082 37.5 2.4 27 283-309 347-373 (374)
57 PF03761 DUF316: Domain of unk 27.5 3.5E+02 0.0075 28.3 9.3 90 215-315 161-258 (282)
58 PF00571 CBS: CBS domain CBS d 25.2 60 0.0013 24.9 2.3 22 608-629 27-48 (57)
59 TIGR02860 spore_IV_B stage IV 25.0 43 0.00094 37.8 2.0 39 287-326 356-394 (402)
60 PF08208 RNA_polI_A34: DNA-dir 23.8 26 0.00057 35.3 0.0 13 23-35 109-121 (198)
61 PF03510 Peptidase_C24: 2C end 23.1 2.9E+02 0.0063 25.6 6.6 17 410-427 3-19 (105)
No 1
>PRK10898 serine endoprotease; Provisional
Probab=99.92 E-value=4.4e-24 Score=231.05 Aligned_cols=186 Identities=24% Similarity=0.350 Sum_probs=144.7
Q ss_pred hhHhhccCceEEEEeCC-----------CeeEEEEEEeCCcEEEEcccccCCCCCcceeccCCCcccccCCCCCCCCCCC
Q 005822 388 LPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHT 456 (675)
Q Consensus 388 ~~ie~a~~SVV~I~~~~-----------~~wGSGvlIn~~GlILTnAHVV~p~~~g~~~~~g~~~~~~~~~~~~~~~~~~ 456 (675)
..++++.|+||.|.... ..+||||+|+++||||||+||++. .
T Consensus 49 ~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~---------a------------------ 101 (353)
T PRK10898 49 QAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIND---------A------------------ 101 (353)
T ss_pred HHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCC---------C------------------
Confidence 56889999999997621 158999999999999999999961 1
Q ss_pred cccccccccCCCCCCCcccccccccccceeeeeeecCceEEEEEEecCCCCceeeeEEEEecCCCCCeEEEEecCCCCCc
Q 005822 457 GVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQL 536 (675)
Q Consensus 457 ~v~~~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~~~~~~l 536 (675)
..+.|++..+.. |.|++++.++. +||||||++. ..+
T Consensus 102 --------------------------------------~~i~V~~~dg~~---~~a~vv~~d~~-~DlAvl~v~~--~~l 137 (353)
T PRK10898 102 --------------------------------------DQIIVALQDGRV---FEALLVGSDSL-TDLAVLKINA--TNL 137 (353)
T ss_pred --------------------------------------CEEEEEeCCCCE---EEEEEEEEcCC-CCEEEEEEcC--CCC
Confidence 124455444333 88999998886 9999999985 356
Q ss_pred ceeeCCCCC-CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEcccccCCccccc
Q 005822 537 CPIDADFGQ-PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGA 615 (675)
Q Consensus 537 ~PI~l~~~~-~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~G~SGGP 615 (675)
+++.+.+.. +++|+.|+++|||. +...+++.|+|+...+..... .....++|||+++++|+||||
T Consensus 138 ~~~~l~~~~~~~~G~~V~aiG~P~-----g~~~~~t~Giis~~~r~~~~~---------~~~~~~iqtda~i~~GnSGGP 203 (353)
T PRK10898 138 PVIPINPKRVPHIGDVVLAIGNPY-----NLGQTITQGIISATGRIGLSP---------TGRQNFLQTDASINHGNSGGA 203 (353)
T ss_pred CeeeccCcCcCCCCCEEEEEeCCC-----CcCCCcceeEEEeccccccCC---------ccccceEEeccccCCCCCcce
Confidence 777886554 89999999999984 455788999999876532110 112357999999999999999
Q ss_pred eecCCceEEEEEeeeeeCCCc----eeEEEeehHHHHHHHHHHHHhh
Q 005822 616 VVNLDGHMIGLVTRYFKLSCL----KMSKFMLVAKLLAQLSFLFFIF 658 (675)
Q Consensus 616 Lvd~~G~LIGIVssnak~~~~----~~i~f~ip~~~l~~l~~~~~~~ 658 (675)
|+|.+|+||||+++....... ..++|+||++++...+..++.+
T Consensus 204 l~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~ 250 (353)
T PRK10898 204 LVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRD 250 (353)
T ss_pred EECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhc
Confidence 999999999999876544321 4599999999999998887654
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.92 E-value=8.1e-24 Score=228.85 Aligned_cols=186 Identities=24% Similarity=0.374 Sum_probs=144.5
Q ss_pred hhHhhccCceEEEEeCC-----------CeeEEEEEEeCCcEEEEcccccCCCCCcceeccCCCcccccCCCCCCCCCCC
Q 005822 388 LPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHT 456 (675)
Q Consensus 388 ~~ie~a~~SVV~I~~~~-----------~~wGSGvlIn~~GlILTnAHVV~p~~~g~~~~~g~~~~~~~~~~~~~~~~~~ 456 (675)
..++++.||||.|.... ...||||+|+++||||||+||++. .
T Consensus 49 ~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~---------~------------------ 101 (351)
T TIGR02038 49 KAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKK---------A------------------ 101 (351)
T ss_pred HHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCC---------C------------------
Confidence 56889999999997621 247999999999999999999961 1
Q ss_pred cccccccccCCCCCCCcccccccccccceeeeeeecCceEEEEEEecCCCCceeeeEEEEecCCCCCeEEEEecCCCCCc
Q 005822 457 GVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQL 536 (675)
Q Consensus 457 ~v~~~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~~~~~~l 536 (675)
..+.|++..+.. ++|++++.++. +||||||++. ..+
T Consensus 102 --------------------------------------~~i~V~~~dg~~---~~a~vv~~d~~-~DlAvlkv~~--~~~ 137 (351)
T TIGR02038 102 --------------------------------------DQIVVALQDGRK---FEAELVGSDPL-TDLAVLKIEG--DNL 137 (351)
T ss_pred --------------------------------------CEEEEEECCCCE---EEEEEEEecCC-CCEEEEEecC--CCC
Confidence 124444444332 88999998886 9999999985 346
Q ss_pred ceeeCCCC-CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEcccccCCccccc
Q 005822 537 CPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGA 615 (675)
Q Consensus 537 ~PI~l~~~-~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~G~SGGP 615 (675)
+++.+... .+++|+.|+++|||. +...+++.|+|+...+.... ......++|||+.+++|+||||
T Consensus 138 ~~~~l~~s~~~~~G~~V~aiG~P~-----~~~~s~t~GiIs~~~r~~~~---------~~~~~~~iqtda~i~~GnSGGp 203 (351)
T TIGR02038 138 PTIPVNLDRPPHVGDVVLAIGNPY-----NLGQTITQGIISATGRNGLS---------SVGRQNFIQTDAAINAGNSGGA 203 (351)
T ss_pred ceEeccCcCccCCCCEEEEEeCCC-----CCCCcEEEEEEEeccCcccC---------CCCcceEEEECCccCCCCCcce
Confidence 77787654 589999999999984 45578999999987653210 0123457999999999999999
Q ss_pred eecCCceEEEEEeeeeeCCC---ceeEEEeehHHHHHHHHHHHHhh
Q 005822 616 VVNLDGHMIGLVTRYFKLSC---LKMSKFMLVAKLLAQLSFLFFIF 658 (675)
Q Consensus 616 Lvd~~G~LIGIVssnak~~~---~~~i~f~ip~~~l~~l~~~~~~~ 658 (675)
|+|.+|+||||+++...... ...++|+||++++..++..++.+
T Consensus 204 l~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~ 249 (351)
T TIGR02038 204 LINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRD 249 (351)
T ss_pred EECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhc
Confidence 99999999999987654322 24599999999999999888765
No 3
>PRK10139 serine endoprotease; Provisional
Probab=99.92 E-value=1e-23 Score=234.97 Aligned_cols=188 Identities=26% Similarity=0.446 Sum_probs=146.7
Q ss_pred chhHhhccCceEEEEeC------------------C----------CeeEEEEEEeC-CcEEEEcccccCCCCCcceecc
Q 005822 387 PLPIQKALASVCLITID------------------D----------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVS 437 (675)
Q Consensus 387 ~~~ie~a~~SVV~I~~~------------------~----------~~wGSGvlIn~-~GlILTnAHVV~p~~~g~~~~~ 437 (675)
...++++.||||.|... . ..+||||+|++ +||||||+||++.
T Consensus 43 ~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~--------- 113 (455)
T PRK10139 43 APMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQ--------- 113 (455)
T ss_pred HHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCC---------
Confidence 35789999999999641 0 14799999985 7999999999961
Q ss_pred CCCcccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccceeeeeeecCceEEEEEEecCCCCceeeeEEEEe
Q 005822 438 GWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYV 517 (675)
Q Consensus 438 g~~~~~~~~~~~~~~~~~~~v~~~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~ 517 (675)
. ..+.|++..+.. |+|++++.
T Consensus 114 a--------------------------------------------------------~~i~V~~~dg~~---~~a~vvg~ 134 (455)
T PRK10139 114 A--------------------------------------------------------QKISIQLNDGRE---FDAKLIGS 134 (455)
T ss_pred C--------------------------------------------------------CEEEEEECCCCE---EEEEEEEE
Confidence 1 124555544332 89999998
Q ss_pred cCCCCCeEEEEecCCCCCcceeeCCCC-CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCC
Q 005822 518 CKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 596 (675)
Q Consensus 518 ~~~~~DIALLkL~~~~~~l~PI~l~~~-~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~ 596 (675)
++. +||||||++. +..++++++.+. .+++|+.|+++|+|. ++..+++.|+|+...+... ....
T Consensus 135 D~~-~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P~-----g~~~tvt~GivS~~~r~~~---------~~~~ 198 (455)
T PRK10139 135 DDQ-SDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNPF-----GLGQTATSGIISALGRSGL---------NLEG 198 (455)
T ss_pred cCC-CCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecCC-----CCCCceEEEEEcccccccc---------CCCC
Confidence 886 9999999985 457888999764 489999999999973 5567899999998764311 0122
Q ss_pred cceEEEEcccccCCccccceecCCceEEEEEeeeeeCCCc-eeEEEeehHHHHHHHHHHHHhh
Q 005822 597 YPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCL-KMSKFMLVAKLLAQLSFLFFIF 658 (675)
Q Consensus 597 ~~~~lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~-~~i~f~ip~~~l~~l~~~~~~~ 658 (675)
+..++|||+++++|+|||||||.+|+||||+++......+ ..++|+||+..+......++.+
T Consensus 199 ~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~ 261 (455)
T PRK10139 199 LENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDF 261 (455)
T ss_pred cceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhc
Confidence 4458999999999999999999999999999987654433 4599999999998888777654
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.90 E-value=1.8e-22 Score=226.14 Aligned_cols=144 Identities=29% Similarity=0.421 Sum_probs=114.0
Q ss_pred EEEEEecCCCCceeeeEEEEecCCCCCeEEEEecCCCCCcceeeCCCC-CCCCCCeEEEEecCCCCCCCCCCCceeeeEE
Q 005822 497 IRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVV 575 (675)
Q Consensus 497 i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~~~~~~l~PI~l~~~-~~~~G~~V~ViG~glfg~~~g~~pSvs~GiI 575 (675)
++|++.++.. |.|++++.++. +||||||++. +..++++++.+. .+++|+.|+++|+| +++..+++.|+|
T Consensus 138 i~V~~~dg~~---~~a~vv~~D~~-~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P-----~g~~~tvt~GiV 207 (473)
T PRK10942 138 IKVQLSDGRK---FDAKVVGKDPR-SDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNP-----YGLGETVTSGIV 207 (473)
T ss_pred EEEEECCCCE---EEEEEEEecCC-CCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCC-----CCCCcceeEEEE
Confidence 4555544333 88999998886 9999999975 456889999754 58999999999997 356678999999
Q ss_pred eeeEEecCCccCcccccCCCCcceEEEEcccccCCccccceecCCceEEEEEeeeeeCCCc-eeEEEeehHHHHHHHHHH
Q 005822 576 AKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCL-KMSKFMLVAKLLAQLSFL 654 (675)
Q Consensus 576 s~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~-~~i~f~ip~~~l~~l~~~ 654 (675)
+...+... ....+..+++||+++++|+|||||+|.+|+||||+++......+ ..++|+||++++..++..
T Consensus 208 s~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~ 278 (473)
T PRK10942 208 SALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQ 278 (473)
T ss_pred EEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHH
Confidence 98865311 01234568999999999999999999999999999987655444 459999999999998888
Q ss_pred HHhhh
Q 005822 655 FFIFL 659 (675)
Q Consensus 655 ~~~~~ 659 (675)
+..+.
T Consensus 279 l~~~g 283 (473)
T PRK10942 279 MVEYG 283 (473)
T ss_pred HHhcc
Confidence 87643
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.89 E-value=4.8e-22 Score=220.10 Aligned_cols=170 Identities=28% Similarity=0.376 Sum_probs=133.8
Q ss_pred eeEEEEEEeCCcEEEEcccccCCCCCcceeccCCCcccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccce
Q 005822 406 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY 485 (675)
Q Consensus 406 ~wGSGvlIn~~GlILTnAHVV~p~~~g~~~~~g~~~~~~~~~~~~~~~~~~~v~~~~~~~~~~~k~~~~~~~~~~~~~~~ 485 (675)
.+||||+|+++||||||+||+...
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~~~-------------------------------------------------------- 81 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVDGA-------------------------------------------------------- 81 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcCCC--------------------------------------------------------
Confidence 479999999999999999999621
Q ss_pred eeeeeecCceEEEEEEecCCCCceeeeEEEEecCCCCCeEEEEecCCCCCcceeeCCCC-CCCCCCeEEEEecCCCCCCC
Q 005822 486 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC 564 (675)
Q Consensus 486 ~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~~~~~~l~PI~l~~~-~~~~G~~V~ViG~glfg~~~ 564 (675)
..+.|++..+. +|+|++++.++. +||||||++. +..++++.+.+. .+++|+.|+++|||.
T Consensus 82 ---------~~i~V~~~~~~---~~~a~vv~~d~~-~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p~----- 142 (428)
T TIGR02037 82 ---------DEITVTLSDGR---EFKAKLVGKDPR-TDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNPF----- 142 (428)
T ss_pred ---------CeEEEEeCCCC---EEEEEEEEecCC-CCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECCC-----
Confidence 02444444332 388999998875 9999999985 357889999754 589999999999984
Q ss_pred CCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEcccccCCccccceecCCceEEEEEeeeeeCCC-ceeEEEee
Q 005822 565 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSC-LKMSKFML 643 (675)
Q Consensus 565 g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~-~~~i~f~i 643 (675)
+...+++.|+|+...+... ....+..+++||+++.+|+|||||||.+|+||||+++...... ...++|+|
T Consensus 143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~fai 213 (428)
T TIGR02037 143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAI 213 (428)
T ss_pred cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEE
Confidence 5568899999998754310 0122445899999999999999999999999999988765432 24589999
Q ss_pred hHHHHHHHHHHHHhhh
Q 005822 644 VAKLLAQLSFLFFIFL 659 (675)
Q Consensus 644 p~~~l~~l~~~~~~~~ 659 (675)
|+.++..++..+..+.
T Consensus 214 P~~~~~~~~~~l~~~g 229 (428)
T TIGR02037 214 PSNMAKNVVDQLIEGG 229 (428)
T ss_pred EhHHHHHHHHHHHhcC
Confidence 9999999999887764
No 6
>PRK10139 serine endoprotease; Provisional
Probab=99.84 E-value=1.5e-20 Score=209.68 Aligned_cols=120 Identities=22% Similarity=0.294 Sum_probs=105.2
Q ss_pred CcceEEEEEEe-cCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCC---CCCceEEEeccc
Q 005822 213 STSRVAILGVS-SYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS---TTRSLLMADIRC 288 (675)
Q Consensus 213 ~~td~Avlki~-~~~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~~---~~~~~i~tDa~~ 288 (675)
..+|||||||+ ...++.+++++|+.+++||+|+|||+|||+ ..++|.||||++.+... ....||||||++
T Consensus 136 ~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~------~~tvt~GivS~~~r~~~~~~~~~~~iqtda~i 209 (455)
T PRK10139 136 DQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGL------GQTATSGIISALGRSGLNLEGLENFIQTDASI 209 (455)
T ss_pred CCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCC------CCceEEEEEccccccccCCCCcceEEEECCcc
Confidence 34799999997 456778899999999999999999999994 78999999998865421 235799999999
Q ss_pred CCCCCCcceeccCccEEEEEEeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 005822 289 LPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (675)
Q Consensus 289 ~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~~aip~~~i~~~~~~~~~ 338 (675)
|||||||||||.+|+||||+++.+... +..|++||||++.+..++.+++.
T Consensus 210 n~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~ 260 (455)
T PRK10139 210 NRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLID 260 (455)
T ss_pred CCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhh
Confidence 999999999999999999999998765 56799999999999999988764
No 7
>PRK10942 serine endoprotease; Provisional
Probab=99.79 E-value=5.5e-19 Score=198.12 Aligned_cols=119 Identities=19% Similarity=0.290 Sum_probs=104.7
Q ss_pred cceEEEEEEe-cCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCC---CCCceEEEecccC
Q 005822 214 TSRVAILGVS-SYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS---TTRSLLMADIRCL 289 (675)
Q Consensus 214 ~td~Avlki~-~~~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~~---~~~~~i~tDa~~~ 289 (675)
.+||||||++ ...++.+++++++.+++||+|++||+|||+ .++++.||||...+... ....||||||+++
T Consensus 158 ~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~g~------~~tvt~GiVs~~~r~~~~~~~~~~~iqtda~i~ 231 (473)
T PRK10942 158 RSDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPYGL------GETVTSGIVSALGRSGLNVENYENFIQTDAAIN 231 (473)
T ss_pred CCCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCCCC------CcceeEEEEEEeecccCCcccccceEEeccccC
Confidence 4799999996 456778899999999999999999999994 78999999998765421 2457899999999
Q ss_pred CCCCCcceeccCccEEEEEEeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 005822 290 PGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (675)
Q Consensus 290 pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~~aip~~~i~~~~~~~~~ 338 (675)
||||||||||.+|+||||+++.+... ++.|++|+||++.+..++.++..
T Consensus 232 ~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~ 281 (473)
T PRK10942 232 RGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVE 281 (473)
T ss_pred CCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHh
Confidence 99999999999999999999988776 66799999999999999997764
No 8
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.78 E-value=1.3e-18 Score=188.38 Aligned_cols=119 Identities=18% Similarity=0.328 Sum_probs=102.9
Q ss_pred cceEEEEEEecCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCC---CCCceEEEecccCC
Q 005822 214 TSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS---TTRSLLMADIRCLP 290 (675)
Q Consensus 214 ~td~Avlki~~~~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~~---~~~~~i~tDa~~~p 290 (675)
.+||||||++....+..+++++..+++||+|++||+|||+ .++++.|+||+..+... ....+|||||+++|
T Consensus 124 ~~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~~------~~s~t~GiIs~~~r~~~~~~~~~~~iqtda~i~~ 197 (351)
T TIGR02038 124 LTDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYNL------GQTITQGIISATGRNGLSSVGRQNFIQTDAAINA 197 (351)
T ss_pred CCCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCCC------CCcEEEEEEEeccCcccCCCCcceEEEECCccCC
Confidence 3699999998766778889999999999999999999994 67999999998765321 23578999999999
Q ss_pred CCCCcceeccCccEEEEEEecccccC---CcceEEEEeHHHHHHHHHhhhc
Q 005822 291 GMEGGPVFGEHAHFVGILIRPLRQKS---GAEIQLVIPWEAIATACSDLLL 338 (675)
Q Consensus 291 G~sGG~v~~~~g~liGiv~~~l~~~~---~~~l~~aip~~~i~~~~~~~~~ 338 (675)
|||||||||.+|+||||+++.+...+ ..|++|+||++.+.+++.+++.
T Consensus 198 GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~ 248 (351)
T TIGR02038 198 GNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIR 248 (351)
T ss_pred CCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhh
Confidence 99999999999999999999886542 3699999999999999987764
No 9
>PRK10898 serine endoprotease; Provisional
Probab=99.77 E-value=1.8e-18 Score=187.53 Aligned_cols=118 Identities=18% Similarity=0.307 Sum_probs=101.9
Q ss_pred ceEEEEEEecCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC---CCCCceEEEecccCCC
Q 005822 215 SRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIRCLPG 291 (675)
Q Consensus 215 td~Avlki~~~~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~---~~~~~~i~tDa~~~pG 291 (675)
+||||||++...++..++++++.+++||+|+++|+|||+ ..+++.|+||+..+.. .....+|||||+++||
T Consensus 125 ~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g~------~~~~t~Giis~~~r~~~~~~~~~~~iqtda~i~~G 198 (353)
T PRK10898 125 TDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYNL------GQTITQGIISATGRIGLSPTGRQNFLQTDASINHG 198 (353)
T ss_pred CCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCCc------CCCcceeEEEeccccccCCccccceEEeccccCCC
Confidence 699999998767788899999999999999999999994 6799999999775431 1234789999999999
Q ss_pred CCCcceeccCccEEEEEEecccccC----CcceEEEEeHHHHHHHHHhhhc
Q 005822 292 MEGGPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLLL 338 (675)
Q Consensus 292 ~sGG~v~~~~g~liGiv~~~l~~~~----~~~l~~aip~~~i~~~~~~~~~ 338 (675)
||||||+|.+|+||||+++.+...+ ..+++|+||.+.+.+++.+++.
T Consensus 199 nSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~ 249 (353)
T PRK10898 199 NSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIR 249 (353)
T ss_pred CCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhh
Confidence 9999999999999999999886542 2589999999999999987654
No 10
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.74 E-value=1.4e-17 Score=184.72 Aligned_cols=118 Identities=21% Similarity=0.348 Sum_probs=103.5
Q ss_pred ceEEEEEEecC-CCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC---CCCCceEEEecccCC
Q 005822 215 SRVAILGVSSY-LKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIRCLP 290 (675)
Q Consensus 215 td~Avlki~~~-~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~---~~~~~~i~tDa~~~p 290 (675)
+||||||++.. ..+.++++++..+++||+|+++|+|||+ ..++|.|+||...+.. .....+|||||+++|
T Consensus 105 ~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~------~~~~t~G~vs~~~~~~~~~~~~~~~i~tda~i~~ 178 (428)
T TIGR02037 105 TDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGL------GQTVTSGIVSALGRSGLGIGDYENFIQTDAAINP 178 (428)
T ss_pred CCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcC------CCcEEEEEEEecccCccCCCCccceEEECCCCCC
Confidence 59999999854 6777889999999999999999999994 7899999999876531 234568999999999
Q ss_pred CCCCcceeccCccEEEEEEeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 005822 291 GMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (675)
Q Consensus 291 G~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~~aip~~~i~~~~~~~~~ 338 (675)
|||||||||.+|+||||+++.+... +..|++|+||++.+.+++.++..
T Consensus 179 GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~ 227 (428)
T TIGR02037 179 GNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIE 227 (428)
T ss_pred CCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHh
Confidence 9999999999999999999988765 56799999999999999998764
No 11
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.72 E-value=2.6e-16 Score=169.77 Aligned_cols=187 Identities=26% Similarity=0.351 Sum_probs=144.3
Q ss_pred hhHhhccCceEEEEeCC-----------------CeeEEEEEEeCCcEEEEcccccCCCCCcceeccCCCcccccCCCCC
Q 005822 388 LPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDS 450 (675)
Q Consensus 388 ~~ie~a~~SVV~I~~~~-----------------~~wGSGvlIn~~GlILTnAHVV~p~~~g~~~~~g~~~~~~~~~~~~ 450 (675)
..++++.|+||.+.... ..+||||+++++|+|+||.||+.. . .
T Consensus 37 ~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~---------a-~---------- 96 (347)
T COG0265 37 TAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAG---------A-E---------- 96 (347)
T ss_pred HHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCC---------c-c----------
Confidence 56889999999887621 378999999999999999999961 1 1
Q ss_pred CCCCCCcccccccccCCCCCCCcccccccccccceeeeeeecCceEEEEEEecCCCCceeeeEEEEecCCCCCeEEEEec
Q 005822 451 ASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLG 530 (675)
Q Consensus 451 ~~~~~~~v~~~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~ 530 (675)
.+.+.+.. ..++++++++.+.. .|+|++|++
T Consensus 97 ---------------------------------------------~i~v~l~d---g~~~~a~~vg~d~~-~dlavlki~ 127 (347)
T COG0265 97 ---------------------------------------------EITVTLAD---GREVPAKLVGKDPI-SDLAVLKID 127 (347)
T ss_pred ---------------------------------------------eEEEEeCC---CCEEEEEEEecCCc-cCEEEEEec
Confidence 12233311 12378888887775 999999999
Q ss_pred CCCCCcceeeCCCCC-CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEcccccC
Q 005822 531 YIPDQLCPIDADFGQ-PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHP 609 (675)
Q Consensus 531 ~~~~~l~PI~l~~~~-~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~ 609 (675)
.... ++.+.+.+.. ++.|+.+.++|.|+ ++..+++.|+|+...+.... ....+..+||||+++++
T Consensus 128 ~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~-----g~~~tvt~Givs~~~r~~v~--------~~~~~~~~IqtdAain~ 193 (347)
T COG0265 128 GAGG-LPVIALGDSDKLRVGDVVVAIGNPF-----GLGQTVTSGIVSALGRTGVG--------SAGGYVNFIQTDAAINP 193 (347)
T ss_pred cCCC-CceeeccCCCCcccCCEEEEecCCC-----CcccceeccEEecccccccc--------CcccccchhhcccccCC
Confidence 6322 6666776554 78999999999974 56689999999988764110 00114457899999999
Q ss_pred CccccceecCCceEEEEEeeeeeCCCc-eeEEEeehHHHHHHHHHHHHh
Q 005822 610 GGSGGAVVNLDGHMIGLVTRYFKLSCL-KMSKFMLVAKLLAQLSFLFFI 657 (675)
Q Consensus 610 G~SGGPLvd~~G~LIGIVssnak~~~~-~~i~f~ip~~~l~~l~~~~~~ 657 (675)
|+||||++|.+|++|||.+.......+ ..++|+||.......+..++.
T Consensus 194 gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~ 242 (347)
T COG0265 194 GNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELIS 242 (347)
T ss_pred CCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHH
Confidence 999999999999999999988877663 449999999999999988876
No 12
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.70 E-value=4.9e-17 Score=175.44 Aligned_cols=121 Identities=22% Similarity=0.348 Sum_probs=106.7
Q ss_pred CCcceEEEEEEecCC-CCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCC-CC---CCCceEEEec
Q 005822 212 KSTSRVAILGVSSYL-KDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-RS---TTRSLLMADI 286 (675)
Q Consensus 212 ~~~td~Avlki~~~~-~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~-~~---~~~~~i~tDa 286 (675)
...+|+|+||++... .+...+++++.+++||+++|||+||| |.++++.||||...+. -. ....+|||||
T Consensus 116 d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~~~~~~~~IqtdA 189 (347)
T COG0265 116 DPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGSAGGYVNFIQTDA 189 (347)
T ss_pred CCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccCcccccchhhccc
Confidence 345799999998644 67779999999999999999999999 5899999999988763 11 2457899999
Q ss_pred ccCCCCCCcceeccCccEEEEEEeccccc-CCcceEEEEeHHHHHHHHHhhhc
Q 005822 287 RCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLLL 338 (675)
Q Consensus 287 ~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~~aip~~~i~~~~~~~~~ 338 (675)
++||||||||++|.+|++|||+++.+... +..|++|+||++.+..++..++.
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~ 242 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELIS 242 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHH
Confidence 99999999999999999999999999888 46789999999999999998765
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.50 E-value=3.9e-13 Score=120.78 Aligned_cols=24 Identities=46% Similarity=0.904 Sum_probs=22.3
Q ss_pred EcccccCCccccceecCCceEEEE
Q 005822 603 TTAAVHPGGSGGAVVNLDGHMIGL 626 (675)
Q Consensus 603 Tda~v~~G~SGGPLvd~~G~LIGI 626 (675)
+++.+.+|+|||||||.+|++|||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 14
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.38 E-value=8.4e-12 Score=123.43 Aligned_cols=181 Identities=22% Similarity=0.236 Sum_probs=98.0
Q ss_pred cCceEEEEeC-CCeeEEEEEEeCCcEEEEcccccCCCCCcceec-cCCCcccccCCCCCCCCCCCcccccccccCCCCCC
Q 005822 394 LASVCLITID-DGVWASGVLLNDQGLILTNAHLLEPWRFGKTTV-SGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKM 471 (675)
Q Consensus 394 ~~SVV~I~~~-~~~wGSGvlIn~~GlILTnAHVV~p~~~g~~~~-~g~~~~~~~~~~~~~~~~~~~v~~~~~~~~~~~k~ 471 (675)
.|.+|.|... ...+|+|++|+++ +|||+|||+.........+ .|... ..
T Consensus 12 ~Pw~v~i~~~~~~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~-------------------------~~--- 62 (232)
T cd00190 12 FPWQVSLQYTGGRHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHD-------------------------LS--- 62 (232)
T ss_pred CCCEEEEEccCCcEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCccc-------------------------cc---
Confidence 4778888765 5779999999998 9999999996321000000 01000 00
Q ss_pred CcccccccccccceeeeeeecCceEEEEEEecCCCCceeeeEEEEecCCCCCeEEEEecC---CCCCcceeeCCCC--CC
Q 005822 472 PKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGY---IPDQLCPIDADFG--QP 546 (675)
Q Consensus 472 ~~~~~~~~~~~~~~~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~~---~~~~l~PI~l~~~--~~ 546 (675)
..........|.....++. |+.. ...+|||||+|+. ....+.|+.+... .+
T Consensus 63 -----------------~~~~~~~~~~v~~~~~hp~--y~~~-----~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~ 118 (232)
T cd00190 63 -----------------SNEGGGQVIKVKKVIVHPN--YNPS-----TYDNDIALLKLKRPVTLSDNVRPICLPSSGYNL 118 (232)
T ss_pred -----------------CCCCceEEEEEEEEEECCC--CCCC-----CCcCCEEEEEECCcccCCCcccceECCCccccC
Confidence 0000111223333333332 2221 1248999999986 2345789998776 67
Q ss_pred CCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccC-CCCcceEEEE-----cccccCCccccceecCC
Q 005822 547 SLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQR-NSAYPVMLET-----TAAVHPGGSGGAVVNLD 620 (675)
Q Consensus 547 ~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~-~~~~~~~lqT-----da~v~~G~SGGPLvd~~ 620 (675)
..|+.++++|||................+.-+. ...|+..... ......+++. ....|.|+|||||+...
T Consensus 119 ~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~----~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~ 194 (232)
T cd00190 119 PAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVS----NAECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCND 194 (232)
T ss_pred CCCCEEEEEeCCcCCCCCCCCceeeEEEeeeEC----HHHhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEe
Confidence 899999999999643221111111111111110 1111111100 0011223333 34578999999999643
Q ss_pred ---ceEEEEEeeee
Q 005822 621 ---GHMIGLVTRYF 631 (675)
Q Consensus 621 ---G~LIGIVssna 631 (675)
+.++||++...
T Consensus 195 ~~~~~lvGI~s~g~ 208 (232)
T cd00190 195 NGRGVLVGIVSWGS 208 (232)
T ss_pred CCEEEEEEEEehhh
Confidence 89999998654
No 15
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.28 E-value=3.7e-11 Score=118.10 Aligned_cols=104 Identities=25% Similarity=0.371 Sum_probs=65.4
Q ss_pred CCCeEEEEecCC---CCCcceeeCCCCC--CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCC
Q 005822 521 PLDVSLLQLGYI---PDQLCPIDADFGQ--PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS 595 (675)
Q Consensus 521 ~~DIALLkL~~~---~~~l~PI~l~~~~--~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~ 595 (675)
.+|||||+|+.. .+.+.|+.+.... +..|+.+.++|||.-.. .+....+....+..+... .+... ....
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~-~~~~~~~~~~~~~~~~~~----~c~~~-~~~~ 159 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD-NGYSSNLQSVTVPVVSRK----TCRSS-YNDN 159 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST-TSBTSBEEEEEEEEEEHH----HHHHH-TTTT
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccccc----ccccc-cccc
Confidence 489999999973 4567888887633 58999999999986211 111122332322221110 11110 0001
Q ss_pred CcceEEEEcc----cccCCccccceecCCceEEEEEeee
Q 005822 596 AYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTRY 630 (675)
Q Consensus 596 ~~~~~lqTda----~v~~G~SGGPLvd~~G~LIGIVssn 630 (675)
....++++.. ..|.|+|||||++.++.++||++..
T Consensus 160 ~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~ 198 (220)
T PF00089_consen 160 LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG 198 (220)
T ss_dssp STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE
T ss_pred cccccccccccccccccccccccccccceeeecceeeec
Confidence 2344677665 7899999999998767899999877
No 16
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.21 E-value=1.5e-11 Score=136.17 Aligned_cols=128 Identities=19% Similarity=0.314 Sum_probs=108.1
Q ss_pred cCccccCC-CCcceEEEEEEec--CCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCC-----
Q 005822 204 SSNLSLMS-KSTSRVAILGVSS--YLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR----- 275 (675)
Q Consensus 204 ~~~~~~~~-~~~td~Avlki~~--~~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~----- 275 (675)
+.+|.+++ ...-|+|++||+. +-+.+++++.+..++.|+++.++|+||++ .|++++|+||...|..
T Consensus 212 s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~------~nt~t~g~vs~~~R~~~~lg~ 285 (473)
T KOG1320|consen 212 SGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGL------LNTLTQGMVSGQLRKSFKLGL 285 (473)
T ss_pred cCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCcee------eeeeeecccccccccccccCc
Confidence 44577777 6777999999963 23677889999999999999999999995 8999999999776542
Q ss_pred ---CCCCceEEEecccCCCCCCcceeccCccEEEEEEeccccc-CCcceEEEEeHHHHHHHHHhhh
Q 005822 276 ---STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL 337 (675)
Q Consensus 276 ---~~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~~aip~~~i~~~~~~~~ 337 (675)
.....++|||+++++||+|||++|.+|++||+++++...- -.-+++|++|.+.+...+....
T Consensus 286 ~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 286 ETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred ccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 2345689999999999999999999999999999887655 4579999999999998887654
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.20 E-value=3.2e-10 Score=112.71 Aligned_cols=107 Identities=24% Similarity=0.297 Sum_probs=60.5
Q ss_pred CCCeEEEEecC---CCCCcceeeCCCC--CCCCCCeEEEEecCCCCCCCC-CCCceeeeEEeeeEEecCCccCcccccC-
Q 005822 521 PLDVSLLQLGY---IPDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCG-LSPSVSSGVVAKVVKANLPSYGQSTLQR- 593 (675)
Q Consensus 521 ~~DIALLkL~~---~~~~l~PI~l~~~--~~~~G~~V~ViG~glfg~~~g-~~pSvs~GiIs~v~~~~~~~~~~~~~~~- 593 (675)
.+|||||+|+. ....+.|+.+... .+..++.++++|||......+ .........+..+. ...|......
T Consensus 88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~----~~~C~~~~~~~ 163 (229)
T smart00020 88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVS----NATCRRAYSGG 163 (229)
T ss_pred cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeC----HHHhhhhhccc
Confidence 48999999986 2346889888765 578899999999986432111 01111111111110 0001100000
Q ss_pred CCCcceEEEE-----cccccCCccccceecCCc--eEEEEEeeee
Q 005822 594 NSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTRYF 631 (675)
Q Consensus 594 ~~~~~~~lqT-----da~v~~G~SGGPLvd~~G--~LIGIVssna 631 (675)
......+++. +...|+|+|||||+...+ .++||++...
T Consensus 164 ~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 164 GAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred cccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0001112222 356899999999996443 9999998765
No 18
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.20 E-value=1.3e-10 Score=129.01 Aligned_cols=139 Identities=24% Similarity=0.314 Sum_probs=99.5
Q ss_pred eeeEEEEecCCCCCeEEEEecCCCCCcceeeCCCC-CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCc
Q 005822 510 CDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQ 588 (675)
Q Consensus 510 ~~a~vv~~~~~~~DIALLkL~~~~~~l~PI~l~~~-~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~ 588 (675)
+.+.++..++. .|+|+++++......+++++... .+..|+++..+|-| ++...+++.|+++...+-.......
T Consensus 213 ~ep~i~g~d~~-~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~-----f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 213 GEPVIVGVDKV-AGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNG-----FGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred CCCeEEccccc-cceEEEEEecCCcccceeecceeeeecccceeeccccC-----ceeeeeeeecccccccccccccCcc
Confidence 55677776664 99999999742233677777654 48999999999886 4666888999998776532111000
Q ss_pred ccccCCCCcceEEEEcccccCCccccceecCCceEEEEEeeeeeCCCc-eeEEEeehHHHHHHHHHHHHhh
Q 005822 589 STLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCL-KMSKFMLVAKLLAQLSFLFFIF 658 (675)
Q Consensus 589 ~~~~~~~~~~~~lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~-~~i~f~ip~~~l~~l~~~~~~~ 658 (675)
.......++||++++..|+||||++|.+|+.||+.+.+...-.. ..+.|++|.+-++..+--...|
T Consensus 287 ----~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~ 353 (473)
T KOG1320|consen 287 ----TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEF 353 (473)
T ss_pred ----cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhh
Confidence 00123457899999999999999999999999998887765222 3389999998887666554433
No 19
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=98.54 E-value=3.2e-07 Score=103.89 Aligned_cols=188 Identities=22% Similarity=0.319 Sum_probs=122.0
Q ss_pred hHhhccCceEEEEeC----------CCeeEEEEEEeC-CcEEEEcccccCCCCCcceeccCCCcccccCCCCCCCCCCCc
Q 005822 389 PIQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG 457 (675)
Q Consensus 389 ~ie~a~~SVV~I~~~----------~~~wGSGvlIn~-~GlILTnAHVV~p~~~g~~~~~g~~~~~~~~~~~~~~~~~~~ 457 (675)
.+..+.++||.|... ..+-|+||++++ .|+||||+|++.|--+- + .+.|..+
T Consensus 57 ~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~v-----a---~avf~n~--------- 119 (955)
T KOG1421|consen 57 TIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFV-----A---SAVFDNH--------- 119 (955)
T ss_pred hhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCce-----e---EEEeccc---------
Confidence 467788999999862 234799999997 68999999999742110 0 0111111
Q ss_pred ccccccccCCCCCCCcccccccccccceeeeeeecCceEEEEEEecCCCCceeeeEEEEecCCCCCeEEEEecCC---CC
Q 005822 458 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYI---PD 534 (675)
Q Consensus 458 v~~~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~i~Vrl~~~~~~~w~~a~vv~~~~~~~DIALLkL~~~---~~ 534 (675)
.. ++-..+|.++ -+|+.++|.++. ..
T Consensus 120 -------------------------------------ee-------------~ei~pvyrDp-VhdfGf~r~dps~ir~s 148 (955)
T KOG1421|consen 120 -------------------------------------EE-------------IEIYPVYRDP-VHDFGFFRYDPSTIRFS 148 (955)
T ss_pred -------------------------------------cc-------------CCcccccCCc-hhhcceeecChhhccee
Confidence 11 1122344444 289999998862 12
Q ss_pred CcceeeCCCCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEcccccCCcccc
Q 005822 535 QLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 614 (675)
Q Consensus 535 ~l~PI~l~~~~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~G~SGG 614 (675)
.+.-+.+.....+.|.+++++|.- .+..-++-.|.++.+.+. .|.+....+.....+ ++|..+...+|.||.
T Consensus 149 ~vt~i~lap~~akvgseirvvgND-----agEklsIlagflSrldr~-apdyg~~~yndfnTf--y~Qaasstsggssgs 220 (955)
T KOG1421|consen 149 IVTEICLAPELAKVGSEIRVVGND-----AGEKLSILAGFLSRLDRN-APDYGEDTYNDFNTF--YIQAASSTSGGSSGS 220 (955)
T ss_pred eeeccccCccccccCCceEEecCC-----ccceEEeehhhhhhccCC-Cccccccccccccce--eeeehhcCCCCCCCC
Confidence 233444455556899999999973 455677888888888653 344433333333332 578888899999999
Q ss_pred ceecCCceEEEEEeeeeeCCCceeEEEeehHHHHHHHHHHH
Q 005822 615 AVVNLDGHMIGLVTRYFKLSCLKMSKFMLVAKLLAQLSFLF 655 (675)
Q Consensus 615 PLvd~~G~LIGIVssnak~~~~~~i~f~ip~~~l~~l~~~~ 655 (675)
||+|-+|..|.++... .......|+.|.+-..+-+..+
T Consensus 221 pVv~i~gyAVAl~agg---~~ssas~ffLpLdrV~RaL~cl 258 (955)
T KOG1421|consen 221 PVVDIPGYAVALNAGG---SISSASDFFLPLDRVVRALRCL 258 (955)
T ss_pred ceecccceEEeeecCC---cccccccceeeccchhhhhhhh
Confidence 9999999999886433 3334467788877666655433
No 20
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.52 E-value=2.8e-06 Score=87.01 Aligned_cols=107 Identities=22% Similarity=0.229 Sum_probs=61.2
Q ss_pred CCeEEEEecC---CCCCcceeeCCCCC----CCCCCeEEEEecCCCCCCC-CCCCceeeeEEeeeEEecCCccCcccccC
Q 005822 522 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPRC-GLSPSVSSGVVAKVVKANLPSYGQSTLQR 593 (675)
Q Consensus 522 ~DIALLkL~~---~~~~l~PI~l~~~~----~~~G~~V~ViG~glfg~~~-g~~pSvs~GiIs~v~~~~~~~~~~~~~~~ 593 (675)
+|||||+++. ..+.+.|+.++... ...+..+++.|||...... .....+... .+.-.+ ...|......
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~---~v~i~~-~~~C~~~~~~ 181 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEV---DVPIIS-NSECRRAYGG 181 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEE---EEeEcC-hhHhcccccC
Confidence 8999999986 45678888886332 3455899999998633220 111111111 111111 1112222111
Q ss_pred C-CCcceEEEEc-----ccccCCccccceecCC---ceEEEEEeeeee
Q 005822 594 N-SAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTRYFK 632 (675)
Q Consensus 594 ~-~~~~~~lqTd-----a~v~~G~SGGPLvd~~---G~LIGIVssnak 632 (675)
. .....++++. ..+|.|||||||+-.. ..++||++....
T Consensus 182 ~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 182 LGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred ccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 0 0112356654 3369999999999643 699999988754
No 21
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.22 E-value=7.9e-06 Score=84.90 Aligned_cols=71 Identities=24% Similarity=0.177 Sum_probs=47.9
Q ss_pred CCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEcccccCCccccceecCCceEE
Q 005822 545 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI 624 (675)
Q Consensus 545 ~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTda~v~~G~SGGPLvd~~G~LI 624 (675)
..+.++.+.++|||.-.+..+.. =...+.|..+. ...++.+|.+++|+||.||++.+.++|
T Consensus 157 ~~~~~d~i~v~GYP~dk~~~~~~-~e~t~~v~~~~------------------~~~l~y~~dT~pG~SGSpv~~~~~~vi 217 (251)
T COG3591 157 EAKANDRITVIGYPGDKPNIGTM-WESTGKVNSIK------------------GNKLFYDADTLPGSSGSPVLISKDEVI 217 (251)
T ss_pred ccccCceeEEEeccCCCCcceeE-eeecceeEEEe------------------cceEEEEecccCCCCCCceEecCceEE
Confidence 36899999999998522211110 11223332221 125888999999999999999888999
Q ss_pred EEEeeeeeCC
Q 005822 625 GLVTRYFKLS 634 (675)
Q Consensus 625 GIVssnak~~ 634 (675)
|+.+......
T Consensus 218 gv~~~g~~~~ 227 (251)
T COG3591 218 GVHYNGPGAN 227 (251)
T ss_pred EEEecCCCcc
Confidence 9998776533
No 22
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.70 E-value=1.5e-05 Score=71.40 Aligned_cols=24 Identities=46% Similarity=0.855 Sum_probs=22.5
Q ss_pred EecccCCCCCCcceeccCccEEEE
Q 005822 284 ADIRCLPGMEGGPVFGEHAHFVGI 307 (675)
Q Consensus 284 tDa~~~pG~sGG~v~~~~g~liGi 307 (675)
+|+.+.||+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 23
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.67 E-value=0.00072 Score=69.94 Aligned_cols=92 Identities=16% Similarity=0.279 Sum_probs=43.3
Q ss_pred CCCeEEEEecCCCCCccee--eCCCCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcc
Q 005822 521 PLDVSLLQLGYIPDQLCPI--DADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYP 598 (675)
Q Consensus 521 ~~DIALLkL~~~~~~l~PI--~l~~~~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~ 598 (675)
..||.++|+.. +++|. ++....|+.|+.|.++|.= +. ..-..-.+|....+- + ....
T Consensus 81 ~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~-----fq--~k~~~s~vSesS~i~-p----------~~~~ 139 (235)
T PF00863_consen 81 GRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSN-----FQ--EKSISSTVSESSWIY-P----------EENS 139 (235)
T ss_dssp CSSEEEEE--T---TS----S---B----TT-EEEEEEEE-----CS--SCCCEEEEEEEEEEE-E----------ETTT
T ss_pred CccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEE-----EE--cCCeeEEECCceEEe-e----------cCCC
Confidence 39999999975 44443 4455679999999999972 11 111122222221110 0 1123
Q ss_pred eEEEEcccccCCccccceec-CCceEEEEEeeeeeC
Q 005822 599 VMLETTAAVHPGGSGGAVVN-LDGHMIGLVTRYFKL 633 (675)
Q Consensus 599 ~~lqTda~v~~G~SGGPLvd-~~G~LIGIVssnak~ 633 (675)
.+..+-..+..|+-|.|||+ .+|++|||.+.....
T Consensus 140 ~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~ 175 (235)
T PF00863_consen 140 HFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNT 175 (235)
T ss_dssp TEEEE-C---TT-TT-EEEETTT--EEEEEEEEETT
T ss_pred CeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCC
Confidence 46777778899999999997 479999999855433
No 24
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.65 E-value=0.0031 Score=66.20 Aligned_cols=106 Identities=16% Similarity=0.066 Sum_probs=65.2
Q ss_pred CCCCeEEEEecCC-CCCcceeeCCCCC--CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCC
Q 005822 520 GPLDVSLLQLGYI-PDQLCPIDADFGQ--PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 596 (675)
Q Consensus 520 ~~~DIALLkL~~~-~~~l~PI~l~~~~--~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~ 596 (675)
..++++||.++.. .....|+.+++.. ...|+.+.+.|+. .. ..+....+.-..+. .
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~-----~~--~~~~~~~~~i~~~~--------------~ 217 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN-----ST--GKLKHRKLKITNCT--------------K 217 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC-----CC--CeEEEEEEEEEEee--------------c
Confidence 4689999999972 2567888886543 6789999988871 01 12222222211110 0
Q ss_pred cceEEEEcccccCCccccceec---CCceEEEEEeeeeeCCCceeEEEeehHH
Q 005822 597 YPVMLETTAAVHPGGSGGAVVN---LDGHMIGLVTRYFKLSCLKMSKFMLVAK 646 (675)
Q Consensus 597 ~~~~lqTda~v~~G~SGGPLvd---~~G~LIGIVssnak~~~~~~i~f~ip~~ 646 (675)
....+.+....+.|++||||+. ....||||.+.+.........-|+....
T Consensus 218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~ 270 (282)
T PF03761_consen 218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSW 270 (282)
T ss_pred cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHH
Confidence 2234555667889999999993 3457899998776444333444554443
No 25
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.28 E-value=0.00092 Score=72.14 Aligned_cols=22 Identities=32% Similarity=0.581 Sum_probs=20.0
Q ss_pred CeeEEEEEEeCCcEEEEcccccC
Q 005822 405 GVWASGVLLNDQGLILTNAHLLE 427 (675)
Q Consensus 405 ~~wGSGvlIn~~GlILTnAHVV~ 427 (675)
..+|.|-+++.+ ||||+|||+.
T Consensus 60 ~tfCGgs~l~~R-YvLTAAHC~~ 81 (413)
T COG5640 60 GTFCGGSKLGGR-YVLTAAHCAD 81 (413)
T ss_pred eeEeccceecce-EEeeehhhcc
Confidence 568999999998 9999999995
No 26
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.14 E-value=0.0032 Score=65.76 Aligned_cols=76 Identities=26% Similarity=0.286 Sum_probs=39.9
Q ss_pred CCeEEEEecCCCCCcceeeCCCCCCCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEE
Q 005822 522 LDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVML 601 (675)
Q Consensus 522 ~DIALLkL~~~~~~l~PI~l~~~~~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~l 601 (675)
-|.|.-.++..+...+.+++... ..| .+|-.- ...+..|.|..-.+ +
T Consensus 156 GDfA~~~~~~~~G~~P~~k~a~~--~~G-rAyW~t----------~tGvE~G~ig~~~~--------------------~ 202 (297)
T PF05579_consen 156 GDFAEADITNWPGAAPKYKFAQN--YTG-RAYWLT----------STGVEPGFIGGGGA--------------------V 202 (297)
T ss_dssp TTEEEEEETTS-S---B--B-TT---SE-EEEEEE----------TTEEEEEEEETTEE--------------------E
T ss_pred CcEEEEECCCCCCCCCceeecCC--ccc-ceEEEc----------ccCcccceecCceE--------------------E
Confidence 69999999665666666665421 122 222211 12345565543221 1
Q ss_pred EEcccccCCccccceecCCceEEEEEeeeeeC
Q 005822 602 ETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKL 633 (675)
Q Consensus 602 qTda~v~~G~SGGPLvd~~G~LIGIVssnak~ 633 (675)
|-.++||||.||++.+|.+||+.+..-+.
T Consensus 203 ---~fT~~GDSGSPVVt~dg~liGVHTGSn~~ 231 (297)
T PF05579_consen 203 ---CFTGPGDSGSPVVTEDGDLIGVHTGSNKR 231 (297)
T ss_dssp ---ESS-GGCTT-EEEETTC-EEEEEEEEETT
T ss_pred ---EEcCCCCCCCccCcCCCCEEEEEecCCCc
Confidence 34579999999999999999999854433
No 27
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.35 E-value=0.015 Score=56.97 Aligned_cols=114 Identities=16% Similarity=0.091 Sum_probs=71.1
Q ss_pred ceEEEEEEecC-----CCCCCcccC-CCCCCCCCeEEEEeCCCCCCCC-CcccCceEEEEEecc--cC--CCCCCCceEE
Q 005822 215 SRVAILGVSSY-----LKDLPNIAL-TPLNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLLM 283 (675)
Q Consensus 215 td~Avlki~~~-----~~~~~~~~~-s~~~~~G~~v~aigsPfG~~sp-~~f~nsvs~GiIs~~--~~--~~~~~~~~i~ 283 (675)
.||||||++.. ...++.+.. ...++.|+.+.++|-+...... ..........+++.. .. ........+.
T Consensus 87 ~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c 166 (220)
T PF00089_consen 87 NDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMIC 166 (220)
T ss_dssp TSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 49999999854 122333444 2346899999999998853211 011233444555532 11 1112345677
Q ss_pred Eec----ccCCCCCCcceeccCccEEEEEEeccccc-CCcceEEEEeHHHH
Q 005822 284 ADI----RCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAI 329 (675)
Q Consensus 284 tDa----~~~pG~sGG~v~~~~g~liGiv~~~l~~~-~~~~l~~aip~~~i 329 (675)
++. ...+|+|||||++.++.||||++.. ... ......+.++...+
T Consensus 167 ~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~ 216 (220)
T PF00089_consen 167 AGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSY 216 (220)
T ss_dssp EETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGG
T ss_pred ccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHHh
Confidence 776 7889999999999988999999987 333 33335666665533
No 28
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=93.81 E-value=0.05 Score=64.63 Aligned_cols=59 Identities=20% Similarity=0.160 Sum_probs=41.7
Q ss_pred CCcceEEEEcccccCCccccceecCCceEEEEEeeeeeCC--------CceeEEEeehHHHHHHHHH
Q 005822 595 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLS--------CLKMSKFMLVAKLLAQLSF 653 (675)
Q Consensus 595 ~~~~~~lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~--------~~~~i~f~ip~~~l~~l~~ 653 (675)
...+.-+.+|..+.+||||+||+|.+|+||||+.-..-.+ ....=.+.+=+++++..+-
T Consensus 618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ld 684 (698)
T PF10459_consen 618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALD 684 (698)
T ss_pred CCeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHH
Confidence 4467778999999999999999999999999985222111 1111355666777766554
No 29
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=93.15 E-value=0.48 Score=49.77 Aligned_cols=76 Identities=22% Similarity=0.226 Sum_probs=57.9
Q ss_pred ccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccCCCCCCCceEEEecccCCCCCCcceeccCccEEEEEEec
Q 005822 232 IALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (675)
Q Consensus 232 ~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~~~~~~~~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~ 311 (675)
+.-....+.+|.|.++|.|-.- |..+....+.+.|-... ..+++-|+-..||+||.||++.+.++||+....
T Consensus 152 ~~~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~------~~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g 223 (251)
T COG3591 152 RNTASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIK------GNKLFYDADTLPGSSGSPVLISKDEVIGVHYNG 223 (251)
T ss_pred cccccccccCceeEEEeccCCC--CcceeEeeecceeEEEe------cceEEEEecccCCCCCCceEecCceEEEEEecC
Confidence 4445678999999999999764 33334444555544332 236888999999999999999999999999998
Q ss_pred cccc
Q 005822 312 LRQK 315 (675)
Q Consensus 312 l~~~ 315 (675)
....
T Consensus 224 ~~~~ 227 (251)
T COG3591 224 PGAN 227 (251)
T ss_pred CCcc
Confidence 8765
No 30
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=92.41 E-value=0.87 Score=47.58 Aligned_cols=34 Identities=29% Similarity=0.473 Sum_probs=28.2
Q ss_pred CceEEEEeCCCeeEEEEEEeCCcEEEEcccccCCC
Q 005822 395 ASVCLITIDDGVWASGVLLNDQGLILTNAHLLEPW 429 (675)
Q Consensus 395 ~SVV~I~~~~~~wGSGvlIn~~GlILTnAHVV~p~ 429 (675)
|-...|.+++.-||+|+||+++ |||++..|+..-
T Consensus 17 PWlA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~I 50 (267)
T PF09342_consen 17 PWLADIYVDGRYWCSGVLLDPH-WLLVSSSCLRGI 50 (267)
T ss_pred cceeeEEEcCeEEEEEEEeccc-eEEEeccccCCc
Confidence 3445677777889999999998 999999999743
No 31
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=91.85 E-value=0.14 Score=48.73 Aligned_cols=45 Identities=27% Similarity=0.507 Sum_probs=35.6
Q ss_pred EecccCCCCCCcceeccCccEEEEEEecccccCC-cceEEEEeHHHH
Q 005822 284 ADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKSG-AEIQLVIPWEAI 329 (675)
Q Consensus 284 tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~~~~-~~l~~aip~~~i 329 (675)
.-+..+-|+|||||+...|++|||..+-++..+. -.+-|+ ||+.+
T Consensus 101 ~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 101 RPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp EEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred ceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 4456778999999999999999999999988743 377787 99865
No 32
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=91.07 E-value=3 Score=41.38 Aligned_cols=34 Identities=29% Similarity=0.475 Sum_probs=28.4
Q ss_pred cceEEEEcccccCCccccceecC---CceEEEEEeee
Q 005822 597 YPVMLETTAAVHPGGSGGAVVNL---DGHMIGLVTRY 630 (675)
Q Consensus 597 ~~~~lqTda~v~~G~SGGPLvd~---~G~LIGIVssn 630 (675)
.+.++...++..+|+-||||+.. .++++||..+.
T Consensus 134 ~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 134 TPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred eeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 45678889999999999999942 58999998764
No 33
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=90.28 E-value=0.73 Score=45.32 Aligned_cols=98 Identities=17% Similarity=0.148 Sum_probs=53.0
Q ss_pred ceEEEEEEecCC-----CCCCcccCCC-CCCCCCeEEEEeCCCCCCC--CCcccCceEEEEEecc--cCC----CCCCCc
Q 005822 215 SRVAILGVSSYL-----KDLPNIALTP-LNKRGDLLLAVGSPFGVLS--PMHFFNSVSMGSVANC--YPP----RSTTRS 280 (675)
Q Consensus 215 td~Avlki~~~~-----~~~~~~~~s~-~~~~G~~v~aigsPfG~~s--p~~f~nsvs~GiIs~~--~~~----~~~~~~ 280 (675)
.||||||++... ..++.+.... .+..|+.+.+.|-...... ...-......-+++.. ... ......
T Consensus 89 ~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~ 168 (232)
T cd00190 89 NDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDN 168 (232)
T ss_pred CCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCc
Confidence 499999997421 2334455443 6788999999996543211 0111122233333321 000 000111
Q ss_pred eEEE-----ecccCCCCCCcceeccC---ccEEEEEEecc
Q 005822 281 LLMA-----DIRCLPGMEGGPVFGEH---AHFVGILIRPL 312 (675)
Q Consensus 281 ~i~t-----Da~~~pG~sGG~v~~~~---g~liGiv~~~l 312 (675)
.+-+ +...-+|.|||||+... ..|+||++...
T Consensus 169 ~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 169 MLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred eEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 1111 33455799999999875 67999988644
No 34
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=89.24 E-value=0.27 Score=58.57 Aligned_cols=29 Identities=21% Similarity=0.456 Sum_probs=25.0
Q ss_pred EEEecccCCCCCCcceeccCccEEEEEEe
Q 005822 282 LMADIRCLPGMEGGPVFGEHAHFVGILIR 310 (675)
Q Consensus 282 i~tDa~~~pG~sGG~v~~~~g~liGiv~~ 310 (675)
++|+.-|--||||.||+|.+|+|||++.-
T Consensus 624 FlstnDitGGNSGSPvlN~~GeLVGl~FD 652 (698)
T PF10459_consen 624 FLSTNDITGGNSGSPVLNAKGELVGLAFD 652 (698)
T ss_pred EEeccCcCCCCCCCccCCCCceEEEEeec
Confidence 77888888899999999999999998873
No 35
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=89.09 E-value=0.32 Score=46.41 Aligned_cols=35 Identities=20% Similarity=0.418 Sum_probs=24.8
Q ss_pred ceEEEecccCCCCCCcceeccCccEEEEEEecccc
Q 005822 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQ 314 (675)
Q Consensus 280 ~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l~~ 314 (675)
.+...|..+-+|+||.|+||.+|++|||--..+.-
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 46667788999999999999999999997766544
No 36
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=87.90 E-value=1 Score=46.90 Aligned_cols=86 Identities=23% Similarity=0.300 Sum_probs=44.6
Q ss_pred ceEEEEEEecCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEe---cccCCCCCCCceEEEecccCCC
Q 005822 215 SRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVA---NCYPPRSTTRSLLMADIRCLPG 291 (675)
Q Consensus 215 td~Avlki~~~~~~~~~~~~s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs---~~~~~~~~~~~~i~tDa~~~pG 291 (675)
.||.++|.+....+.+..-.-+.++.||.|..||+=|-- ++++ -.|| .+.+ .....|+---+.-.+|
T Consensus 82 ~DiviirmPkDfpPf~~kl~FR~P~~~e~v~mVg~~fq~-------k~~~-s~vSesS~i~p--~~~~~fWkHwIsTk~G 151 (235)
T PF00863_consen 82 RDIVIIRMPKDFPPFPQKLKFRAPKEGERVCMVGSNFQE-------KSIS-STVSESSWIYP--EENSHFWKHWISTKDG 151 (235)
T ss_dssp SSEEEEE--TTS----S---B----TT-EEEEEEEECSS-------CCCE-EEEEEEEEEEE--ETTTTEEEE-C---TT
T ss_pred ccEEEEeCCcccCCcchhhhccCCCCCCEEEEEEEEEEc-------CCee-EEECCceEEee--cCCCCeeEEEecCCCC
Confidence 499999997654444444455789999999999987752 2222 2233 2222 1235688888888999
Q ss_pred CCCcceecc-CccEEEEEEe
Q 005822 292 MEGGPVFGE-HAHFVGILIR 310 (675)
Q Consensus 292 ~sGG~v~~~-~g~liGiv~~ 310 (675)
+.|.|+++. +|.+|||-..
T Consensus 152 ~CG~PlVs~~Dg~IVGiHsl 171 (235)
T PF00863_consen 152 DCGLPLVSTKDGKIVGIHSL 171 (235)
T ss_dssp -TT-EEEETTT--EEEEEEE
T ss_pred ccCCcEEEcCCCcEEEEEcC
Confidence 999999986 7889999874
No 37
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=86.95 E-value=3.7 Score=40.55 Aligned_cols=99 Identities=15% Similarity=0.074 Sum_probs=52.4
Q ss_pred cceEEEEEEecC-----CCCCCcccCC-CCCCCCCeEEEEeCCCCCCCCCcccCceEEEEEecccC-----CC----CCC
Q 005822 214 TSRVAILGVSSY-----LKDLPNIALT-PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYP-----PR----STT 278 (675)
Q Consensus 214 ~td~Avlki~~~-----~~~~~~~~~s-~~~~~G~~v~aigsPfG~~sp~~f~nsvs~GiIs~~~~-----~~----~~~ 278 (675)
..|+||||++.. ...++.+... ..+..|+.+.+.|-.-.......+...+....+..... .. ...
T Consensus 88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~ 167 (229)
T smart00020 88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAIT 167 (229)
T ss_pred cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccC
Confidence 359999999743 1223334442 35777899999985543211111222232332221110 00 000
Q ss_pred CceE---E--EecccCCCCCCcceeccCc--cEEEEEEecc
Q 005822 279 RSLL---M--ADIRCLPGMEGGPVFGEHA--HFVGILIRPL 312 (675)
Q Consensus 279 ~~~i---~--tDa~~~pG~sGG~v~~~~g--~liGiv~~~l 312 (675)
...+ . .+...-+|.+|||++...+ .|+||++..-
T Consensus 168 ~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 168 DNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0111 0 1344567999999998765 7999988654
No 38
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=86.28 E-value=1.9 Score=41.27 Aligned_cols=36 Identities=19% Similarity=0.241 Sum_probs=28.9
Q ss_pred EEEcccccCCccccceecCCceEEEEEeeeeeCCCc
Q 005822 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCL 636 (675)
Q Consensus 601 lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~ 636 (675)
..-+..-.+||||-|++|..|+|||||-..+.+..-
T Consensus 97 tip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~R 132 (158)
T PF00944_consen 97 TIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRR 132 (158)
T ss_dssp EEETTS-STTSTTEEEESTTSBEEEEEEEEEEETTE
T ss_pred EeccCCCCCCCCCCccCcCCCCEEEEEecCCCCCCc
Confidence 344667789999999999999999999877766544
No 39
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=85.68 E-value=1.7 Score=41.59 Aligned_cols=31 Identities=23% Similarity=0.451 Sum_probs=21.8
Q ss_pred cccccCCccccceecCCceEEEEEeeeeeCC
Q 005822 604 TAAVHPGGSGGAVVNLDGHMIGLVTRYFKLS 634 (675)
Q Consensus 604 da~v~~G~SGGPLvd~~G~LIGIVssnak~~ 634 (675)
+....+|.||.|+||.+|++|||--......
T Consensus 91 ~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 91 DLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp ---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred ecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 3357899999999999999999986555544
No 40
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=84.70 E-value=4.7 Score=47.56 Aligned_cols=116 Identities=18% Similarity=0.157 Sum_probs=72.3
Q ss_pred ccCCCCcceEEEEEEecCC-----C-CCC---------cccC------CCCCCCCCeEEEEeCCCCCCCCCcccCceEEE
Q 005822 208 SLMSKSTSRVAILGVSSYL-----K-DLP---------NIAL------TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMG 266 (675)
Q Consensus 208 ~~~~~~~td~Avlki~~~~-----~-~~~---------~~~~------s~~~~~G~~v~aigsPfG~~sp~~f~nsvs~G 266 (675)
+++.+...|+||+||+... + +.+ .+.. -..+..|..|+=+|.==|+ |.|
T Consensus 536 ~ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G 605 (695)
T PF08192_consen 536 SIINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTG 605 (695)
T ss_pred hhhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cce
Confidence 3444666799999998432 0 111 1111 1247789999999988775 456
Q ss_pred EEeccc----CCCC-CCCceEEEe----cccCCCCCCcceeccCcc------EEEEEEecccccCCcceEEEEeHHHHHH
Q 005822 267 SVANCY----PPRS-TTRSLLMAD----IRCLPGMEGGPVFGEHAH------FVGILIRPLRQKSGAEIQLVIPWEAIAT 331 (675)
Q Consensus 267 iIs~~~----~~~~-~~~~~i~tD----a~~~pG~sGG~v~~~~g~------liGiv~~~l~~~~~~~l~~aip~~~i~~ 331 (675)
+|.+.. .++. ....+++.. +=..+|.||.=|+++-+. |+||+-+.=+. ...|++..||..|.+
T Consensus 606 ~lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge--~kqfglftPi~~il~ 683 (695)
T PF08192_consen 606 ILNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE--QKQFGLFTPINEILD 683 (695)
T ss_pred EecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc--cceeeccCcHHHHHH
Confidence 665431 1111 112344443 446679999999987444 88988753322 247888999999987
Q ss_pred HHHh
Q 005822 332 ACSD 335 (675)
Q Consensus 332 ~~~~ 335 (675)
-+.+
T Consensus 684 rl~~ 687 (695)
T PF08192_consen 684 RLEE 687 (695)
T ss_pred HHHH
Confidence 6654
No 41
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=83.29 E-value=1.9 Score=41.33 Aligned_cols=42 Identities=24% Similarity=0.396 Sum_probs=29.5
Q ss_pred cccCCccccceecCCceEEEEEeeeeeCCCc-eeEEEeehHHHH
Q 005822 606 AVHPGGSGGAVVNLDGHMIGLVTRYFKLSCL-KMSKFMLVAKLL 648 (675)
Q Consensus 606 ~v~~G~SGGPLvd~~G~LIGIVssnak~~~~-~~i~f~ip~~~l 648 (675)
+...|+|||||+..+|++|||..+....... +.+-|. |.+-+
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred EEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 4568999999999899999999776654443 447776 76543
No 42
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=80.39 E-value=32 Score=41.08 Aligned_cols=46 Identities=9% Similarity=0.029 Sum_probs=33.4
Q ss_pred eeeEEEEecCCCCCeEEEEecCCCCCcceeeCCCCCCCCCCeEEEEecC
Q 005822 510 CDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHG 558 (675)
Q Consensus 510 ~~a~vv~~~~~~~DIALLkL~~~~~~l~PI~l~~~~~~~G~~V~ViG~g 558 (675)
..|.+.+.... ..+|.+|-++ ......++.+..+..||+|...||-
T Consensus 588 i~a~~~fL~~t-~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~~ 633 (955)
T KOG1421|consen 588 IPANVSFLHPT-ENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGFT 633 (955)
T ss_pred ccceeeEecCc-cceeEeccCh--hHhhhhccceeeEecCCceeEeccc
Confidence 45677666654 7788888874 3344556666778999999999984
No 43
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=79.28 E-value=15 Score=43.63 Aligned_cols=90 Identities=13% Similarity=0.194 Sum_probs=55.1
Q ss_pred CCCCCeEEEEecCCCCCCCCCCCceeeeEEeeeEEecCCccCcccccCCCCcceEEEEc----ccccCCccccceecCCc
Q 005822 546 PSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDG 621 (675)
Q Consensus 546 ~~~G~~V~ViG~glfg~~~g~~pSvs~GiIs~v~~~~~~~~~~~~~~~~~~~~~~lqTd----a~v~~G~SGGPLvd~~G 621 (675)
+.+|..|+-+|. ..+ .+.|.|+.+.-+ .+..... ....+++.. .-..+||||.=|++.-+
T Consensus 587 ~~~G~~VfK~Gr-----TTg----yT~G~lNg~klv---yw~dG~i----~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~ 650 (695)
T PF08192_consen 587 LVPGMEVFKVGR-----TTG----YTTGILNGIKLV---YWADGKI----QSSEFVVSSDNNPAFASGGDSGSWVLTKLE 650 (695)
T ss_pred cCCCCeEEEecc-----cCC----ccceEecceEEE---EecCCCe----EEEEEEEecCCCccccCCCCcccEEEeccc
Confidence 578999998886 234 477877755321 1111000 011223333 34568999999998533
Q ss_pred ------eEEEEEeeeeeCCCceeEEEeehHHHHHHHHH
Q 005822 622 ------HMIGLVTRYFKLSCLKMSKFMLVAKLLAQLSF 653 (675)
Q Consensus 622 ------~LIGIVssnak~~~~~~i~f~ip~~~l~~l~~ 653 (675)
.|+||..++ ++..+-++...|...++.=+.
T Consensus 651 d~~~gLgvvGMlhsy--dge~kqfglftPi~~il~rl~ 686 (695)
T PF08192_consen 651 DNNKGLGVVGMLHSY--DGEQKQFGLFTPINEILDRLE 686 (695)
T ss_pred ccccCceeeEEeeec--CCccceeeccCcHHHHHHHHH
Confidence 499999887 344445778899888876553
No 44
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=78.52 E-value=1.8 Score=44.50 Aligned_cols=45 Identities=27% Similarity=0.431 Sum_probs=32.2
Q ss_pred EEcccccCCccccceecCCceEEEEEeeeeeCCCceeEEEeehHHHHH
Q 005822 602 ETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCLKMSKFMLVAKLLA 649 (675)
Q Consensus 602 qTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~~~i~f~ip~~~l~ 649 (675)
..+..+..|+||+|++ .+|+|||=|+-..-.+.. ..|.+++++.+
T Consensus 172 ~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~dp~--~Gygi~ie~ML 216 (218)
T PF05580_consen 172 EKTGGIVQGMSGSPII-QDGKLIGAVTHVFVNDPT--KGYGIFIEWML 216 (218)
T ss_pred hhhCCEEecccCCCEE-ECCEEEEEEEEEEecCCC--ceeeecHHHHh
Confidence 3345678899999999 599999999866533323 56667776543
No 45
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=77.24 E-value=3.3 Score=39.24 Aligned_cols=30 Identities=30% Similarity=0.494 Sum_probs=24.0
Q ss_pred ceEEEecccCCCCCCcceeccCccEEEEEEe
Q 005822 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIR 310 (675)
Q Consensus 280 ~~i~tDa~~~pG~sGG~v~~~~g~liGiv~~ 310 (675)
.+++.-..+.||..||+|+.++| ||||+++
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta 108 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCKHG-VIGIVTA 108 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEETTC-EEEEEEE
T ss_pred CceeecccCCCCCCCceeEeCCC-eEEEEEe
Confidence 45667778999999999998876 9999996
No 46
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=76.44 E-value=2.9 Score=40.06 Aligned_cols=32 Identities=22% Similarity=0.400 Sum_probs=26.1
Q ss_pred eEEEecccCCCCCCcceeccCccEEEEEEecc
Q 005822 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRPL 312 (675)
Q Consensus 281 ~i~tDa~~~pG~sGG~v~~~~g~liGiv~~~l 312 (675)
|.+--..-.||.||-|+||.+|+|||||++--
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 45555667899999999999999999999643
No 47
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=63.35 E-value=3.4 Score=40.35 Aligned_cols=42 Identities=29% Similarity=0.256 Sum_probs=32.2
Q ss_pred hhhccccccceeccCcee---eeeeeeecccccC--CccccccccCC
Q 005822 26 GLKMRRHAFHQYNSGKTT---LSASGMLLPLSFF--DTKVAERNWGV 67 (675)
Q Consensus 26 ~~k~~~~~f~~~~~g~tt---~sas~~~lp~~~~--~~~~~~~~~~~ 67 (675)
||.-++|+||.|+.|.+| .||-...=|.+.. .|....|.+++
T Consensus 38 GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd 84 (154)
T KOG0441|consen 38 GLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD 84 (154)
T ss_pred cCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence 444499999999999998 6886666666665 57677777776
No 48
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=57.80 E-value=8.7 Score=43.22 Aligned_cols=45 Identities=24% Similarity=0.387 Sum_probs=33.1
Q ss_pred EcccccCCccccceecCCceEEEEEeeeeeCCCceeEEEeehHHHHHH
Q 005822 603 TTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCLKMSKFMLVAKLLAQ 650 (675)
Q Consensus 603 Tda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~~~i~f~ip~~~l~~ 650 (675)
-+..+..|+||+|++ .+|++||=||=..-.+.. -+|.|-+++.++
T Consensus 353 ~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt--~GYGi~ie~Ml~ 397 (402)
T TIGR02860 353 KTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPT--SGYGVYIEWMLK 397 (402)
T ss_pred HhCCEEecccCCCEE-ECCEEEEEEEEEEecCCC--cceeehHHHHHH
Confidence 345678899999999 699999988765555444 456666666554
No 49
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=57.54 E-value=18 Score=34.46 Aligned_cols=31 Identities=29% Similarity=0.408 Sum_probs=22.8
Q ss_pred EEEEcccccCCccccceecCCceEEEEEeeee
Q 005822 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVTRYF 631 (675)
Q Consensus 600 ~lqTda~v~~G~SGGPLvd~~G~LIGIVssna 631 (675)
++.......||+.||+|+- +--||||+|+..
T Consensus 80 ~l~g~Gp~~PGdCGg~L~C-~HGViGi~Tagg 110 (127)
T PF00947_consen 80 LLIGEGPAEPGDCGGILRC-KHGVIGIVTAGG 110 (127)
T ss_dssp EEEEE-SSSTT-TCSEEEE-TTCEEEEEEEEE
T ss_pred ceeecccCCCCCCCceeEe-CCCeEEEEEeCC
Confidence 4444578899999999995 555999998874
No 50
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=51.16 E-value=16 Score=37.44 Aligned_cols=59 Identities=17% Similarity=0.119 Sum_probs=17.9
Q ss_pred EEEEcccccCCccccceecCCceEEEEEeeeeeCCCceeEEEeehHHHHHHHHHHHHhhh
Q 005822 600 MLETTAAVHPGGSGGAVVNLDGHMIGLVTRYFKLSCLKMSKFMLVAKLLAQLSFLFFIFL 659 (675)
Q Consensus 600 ~lqTda~v~~G~SGGPLvd~~G~LIGIVssnak~~~~~~i~f~ip~~~l~~l~~~~~~~~ 659 (675)
.....+...+|.||-|+|+.. +++|+.+...+....+..+++.|..-+.-|-..-++|-
T Consensus 137 ~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip~~~g~tsP~~~~e 195 (203)
T PF02122_consen 137 FASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIPPIPGLTSPKYVFE 195 (203)
T ss_dssp EEEE-----TT-TT-EEE-SS--EEEEEEEE-----------------------------
T ss_pred CCceEcCCCCCCCCCCeEECC-CceEeecCcccccccccccccccccccccccccccccc
Confidence 556667889999999999867 99999987766666778899999988888877776663
No 51
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=49.95 E-value=11 Score=41.68 Aligned_cols=24 Identities=25% Similarity=0.515 Sum_probs=21.2
Q ss_pred ccccCCccccceecCCceEEEEEe
Q 005822 605 AAVHPGGSGGAVVNLDGHMIGLVT 628 (675)
Q Consensus 605 a~v~~G~SGGPLvd~~G~LIGIVs 628 (675)
....+|+||..|+|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 366789999999999999999964
No 52
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=48.09 E-value=71 Score=36.28 Aligned_cols=40 Identities=28% Similarity=0.189 Sum_probs=26.6
Q ss_pred cceEEEEcc-------cccCCccccceecC---CceEEEEEeeeeeCCCc
Q 005822 597 YPVMLETTA-------AVHPGGSGGAVVNL---DGHMIGLVTRYFKLSCL 636 (675)
Q Consensus 597 ~~~~lqTda-------~v~~G~SGGPLvd~---~G~LIGIVssnak~~~~ 636 (675)
...||.|.+ .+-|||-|-|-|-. +-.|+|+.++.++.+++
T Consensus 483 Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr~GNT 532 (535)
T PF05416_consen 483 QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATRSGNT 532 (535)
T ss_dssp EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-SSSSE
T ss_pred eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhccCCCe
Confidence 345676643 36789999999943 46789999998877765
No 53
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=41.64 E-value=21 Score=36.97 Aligned_cols=39 Identities=18% Similarity=0.323 Sum_probs=27.4
Q ss_pred cCCCCCCcceeccCccEEEEEEecccccCCcceEEEEeHHHH
Q 005822 288 CLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPWEAI 329 (675)
Q Consensus 288 ~~pG~sGG~v~~~~g~liGiv~~~l~~~~~~~l~~aip~~~i 329 (675)
|..||||.|++- +|+|||-++--|... -..+..|+++..
T Consensus 177 IvqGMSGSPI~q-dGKLiGAVthvf~~d--p~~Gygi~ie~M 215 (218)
T PF05580_consen 177 IVQGMSGSPIIQ-DGKLIGAVTHVFVND--PTKGYGIFIEWM 215 (218)
T ss_pred EEecccCCCEEE-CCEEEEEEEEEEecC--CCceeeecHHHH
Confidence 557999999985 899999999877443 233344555543
No 54
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=38.19 E-value=71 Score=31.67 Aligned_cols=89 Identities=19% Similarity=0.291 Sum_probs=49.1
Q ss_pred cceEEEEEEecCCCCCCc----ccCCCCCCCCCeEEEEeCC-CCCCCCCcc--cCceEEEEEecccCCCCCCCceEEEec
Q 005822 214 TSRVAILGVSSYLKDLPN----IALTPLNKRGDLLLAVGSP-FGVLSPMHF--FNSVSMGSVANCYPPRSTTRSLLMADI 286 (675)
Q Consensus 214 ~td~Avlki~~~~~~~~~----~~~s~~~~~G~~v~aigsP-fG~~sp~~f--~nsvs~GiIs~~~~~~~~~~~~i~tDa 286 (675)
.+|+++++++.. ..... +.+. .-...+.++++-++ |+- ..+ ......|.| +..+ ......|.=++
T Consensus 71 ~~Dl~~v~l~~~-~kfrDIrk~~~~~-~~~~~~~~l~v~~~~~~~---~~~~v~~v~~~~~i-~~~g--~~~~~~~~Y~~ 142 (172)
T PF00548_consen 71 DTDLTLVKLPRN-PKFRDIRKFFPES-IPEYPECVLLVNSTKFPR---MIVEVGFVTNFGFI-NLSG--TTTPRSLKYKA 142 (172)
T ss_dssp EEEEEEEEEESS-S-B--GGGGSBSS-GGTEEEEEEEEESSSSTC---EEEEEEEEEEEEEE-EETT--EEEEEEEEEES
T ss_pred ceeEEEEEccCC-cccCchhhhhccc-cccCCCcEEEEECCCCcc---EEEEEEEEeecCcc-ccCC--CEeeEEEEEcc
Confidence 369999999642 21111 2211 22455666666654 441 110 111223333 2221 12234677788
Q ss_pred ccCCCCCCcceecc---CccEEEEEEe
Q 005822 287 RCLPGMEGGPVFGE---HAHFVGILIR 310 (675)
Q Consensus 287 ~~~pG~sGG~v~~~---~g~liGiv~~ 310 (675)
+--+|+.||+|+.. .+.++||=.|
T Consensus 143 ~t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 143 PTKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp EEETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred CCCCCccCCeEEEeeccCccEEEEEec
Confidence 88899999999964 5679999765
No 55
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=34.64 E-value=27 Score=37.35 Aligned_cols=26 Identities=27% Similarity=0.511 Sum_probs=20.3
Q ss_pred cCCCCCCcceeccCccEEEEEEeccc
Q 005822 288 CLPGMEGGPVFGEHAHFVGILIRPLR 313 (675)
Q Consensus 288 ~~pG~sGG~v~~~~g~liGiv~~~l~ 313 (675)
-.||.||.||+..+|.+||+-++.-.
T Consensus 205 T~~GDSGSPVVt~dg~liGVHTGSn~ 230 (297)
T PF05579_consen 205 TGPGDSGSPVVTEDGDLIGVHTGSNK 230 (297)
T ss_dssp S-GGCTT-EEEETTC-EEEEEEEEET
T ss_pred cCCCCCCCccCcCCCCEEEEEecCCC
Confidence 35999999999999999999997653
No 56
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=28.73 E-value=38 Score=37.49 Aligned_cols=27 Identities=22% Similarity=0.390 Sum_probs=22.1
Q ss_pred EEecccCCCCCCcceeccCccEEEEEE
Q 005822 283 MADIRCLPGMEGGPVFGEHAHFVGILI 309 (675)
Q Consensus 283 ~tDa~~~pG~sGG~v~~~~g~liGiv~ 309 (675)
+.+...-.|.||..|+|.+|++|||..
T Consensus 347 ~~~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 347 IDNYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred ccccCCCCCCCcCeEECCCCCEEEEeC
Confidence 344455579999999999999999964
No 57
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=27.51 E-value=3.5e+02 Score=28.26 Aligned_cols=90 Identities=17% Similarity=0.187 Sum_probs=53.5
Q ss_pred ceEEEEEEecC---CCCCCcccCCC-CCCCCCeEEEEeC-CCCCCCCCcccCceEEEEEecccCCCCCCCceEEEecccC
Q 005822 215 SRVAILGVSSY---LKDLPNIALTP-LNKRGDLLLAVGS-PFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCL 289 (675)
Q Consensus 215 td~Avlki~~~---~~~~~~~~~s~-~~~~G~~v~aigs-PfG~~sp~~f~nsvs~GiIs~~~~~~~~~~~~i~tDa~~~ 289 (675)
.+++||.++.. ...++-++++. .+..||.+-+-|- .-+ .++..-+.. ..... ....+.++-..-
T Consensus 161 ~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~----~~~~~~~~i---~~~~~----~~~~~~~~~~~~ 229 (282)
T PF03761_consen 161 YSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG----KLKHRKLKI---TNCTK----CAYSICTKQYSC 229 (282)
T ss_pred cceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCCC----eEEEEEEEE---EEeec----cceeEecccccC
Confidence 37888998744 56667787754 5788898887665 222 111111111 11100 123455666666
Q ss_pred CCCCCcceecc-Ccc--EEEEEEeccccc
Q 005822 290 PGMEGGPVFGE-HAH--FVGILIRPLRQK 315 (675)
Q Consensus 290 pG~sGG~v~~~-~g~--liGiv~~~l~~~ 315 (675)
+|..|||++.. +|+ |||+.+..-...
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~ 258 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASGNYEC 258 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccCCCcc
Confidence 89999999843 454 999988655443
No 58
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=25.17 E-value=60 Score=24.93 Aligned_cols=22 Identities=36% Similarity=0.631 Sum_probs=18.3
Q ss_pred cCCccccceecCCceEEEEEee
Q 005822 608 HPGGSGGAVVNLDGHMIGLVTR 629 (675)
Q Consensus 608 ~~G~SGGPLvd~~G~LIGIVss 629 (675)
..+-+.-||+|.+|+++|+++.
T Consensus 27 ~~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 27 KNGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HHTSSEEEEESTTSBEEEEEEH
T ss_pred HcCCcEEEEEecCCEEEEEEEH
Confidence 3567788999999999999963
No 59
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=25.04 E-value=43 Score=37.81 Aligned_cols=39 Identities=18% Similarity=0.512 Sum_probs=30.5
Q ss_pred ccCCCCCCcceeccCccEEEEEEecccccCCcceEEEEeH
Q 005822 287 RCLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPW 326 (675)
Q Consensus 287 ~~~pG~sGG~v~~~~g~liGiv~~~l~~~~~~~l~~aip~ 326 (675)
-|..||||.|++ -+|+|||-++--|-.....|-+..|.|
T Consensus 356 GivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~ 394 (402)
T TIGR02860 356 GIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEW 394 (402)
T ss_pred CEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHH
Confidence 456799999998 679999999998877644566666655
No 60
>PF08208 RNA_polI_A34: DNA-directed RNA polymerase I subunit RPA34.5; InterPro: IPR013240 This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription [].; PDB: 3NFG_N.
Probab=23.83 E-value=26 Score=35.28 Aligned_cols=13 Identities=62% Similarity=0.950 Sum_probs=0.0
Q ss_pred Ccchhhccccccc
Q 005822 23 DPKGLKMRRHAFH 35 (675)
Q Consensus 23 dpk~~k~~~~~f~ 35 (675)
-|+|||||.|+|=
T Consensus 109 qp~gLk~Rf~P~G 121 (198)
T PF08208_consen 109 QPKGLKMRFFPFG 121 (198)
T ss_dssp -------------
T ss_pred CCCCcceeeecCC
Confidence 4899999999884
No 61
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=23.11 E-value=2.9e+02 Score=25.61 Aligned_cols=17 Identities=24% Similarity=0.425 Sum_probs=13.9
Q ss_pred EEEEeCCcEEEEcccccC
Q 005822 410 GVLLNDQGLILTNAHLLE 427 (675)
Q Consensus 410 GvlIn~~GlILTnAHVV~ 427 (675)
++-|.. |.++|+.||++
T Consensus 3 avHIGn-G~~vt~tHva~ 19 (105)
T PF03510_consen 3 AVHIGN-GRYVTVTHVAK 19 (105)
T ss_pred eEEeCC-CEEEEEEEEec
Confidence 566765 79999999986
Done!