Query 006130
Match_columns 660
No_of_seqs 328 out of 2035
Neff 5.9
Searched_HMMs 46136
Date Thu Mar 28 18:47:58 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/006130.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/006130hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 TIGR02038 protease_degS peripl 99.9 1.2E-26 2.6E-31 250.3 23.2 187 387-658 48-246 (351)
2 PRK10139 serine endoprotease; 99.9 1.6E-26 3.5E-31 256.9 22.5 186 387-658 43-258 (455)
3 PRK10898 serine endoprotease; 99.9 2E-26 4.4E-31 248.7 22.2 187 387-658 48-247 (353)
4 PRK10942 serine endoprotease; 99.9 4.4E-25 9.5E-30 246.5 22.9 167 406-658 111-279 (473)
5 TIGR02037 degP_htrA_DO peripla 99.9 1.6E-24 3.4E-29 239.6 23.1 167 406-658 58-225 (428)
6 PRK10139 serine endoprotease; 99.9 4.5E-24 9.8E-29 237.4 13.9 143 204-352 126-275 (455)
7 PRK10942 serine endoprotease; 99.9 1.5E-22 3.2E-27 226.3 13.9 142 205-352 148-296 (473)
8 TIGR02038 protease_degS peripl 99.9 1.9E-21 4.1E-26 210.0 13.9 141 205-351 114-262 (351)
9 PRK10898 serine endoprotease; 99.9 3.6E-21 7.7E-26 208.0 14.4 139 207-351 116-263 (353)
10 TIGR02037 degP_htrA_DO peripla 99.8 1.6E-20 3.5E-25 207.7 15.3 140 206-352 95-242 (428)
11 COG0265 DegQ Trypsin-like seri 99.8 4.6E-19 1E-23 190.9 19.7 187 388-659 37-241 (347)
12 COG0265 DegQ Trypsin-like seri 99.8 1.3E-19 2.9E-24 195.0 13.0 140 206-351 109-256 (347)
13 PF13365 Trypsin_2: Trypsin-li 99.5 9.7E-14 2.1E-18 124.6 13.5 24 603-626 97-120 (120)
14 KOG1320 Serine protease [Postt 99.3 1.2E-12 2.5E-17 144.5 7.3 128 203-336 211-350 (473)
15 KOG1320 Serine protease [Postt 99.3 7.3E-12 1.6E-16 138.2 13.4 201 389-658 133-350 (473)
16 PF00089 Trypsin: Trypsin; In 99.3 7.7E-11 1.7E-15 115.8 17.1 126 521-655 86-220 (220)
17 cd00190 Tryp_SPc Trypsin-like 99.2 5.7E-10 1.2E-14 110.4 17.5 109 520-632 87-209 (232)
18 smart00020 Tryp_SPc Trypsin-li 99.0 1.2E-08 2.6E-13 101.4 17.2 108 520-631 87-208 (229)
19 KOG1421 Predicted signaling-as 98.9 9.1E-09 2E-13 115.6 11.2 188 390-659 58-259 (955)
20 COG3591 V8-like Glu-specific e 98.2 1.9E-05 4.2E-10 81.7 14.3 90 545-656 157-247 (251)
21 KOG3627 Trypsin [Amino acid tr 98.2 0.00011 2.4E-09 75.1 19.5 129 522-656 106-251 (256)
22 PF00863 Peptidase_C4: Peptida 98.1 4.4E-05 9.5E-10 78.4 14.3 103 520-649 80-185 (235)
23 PF13365 Trypsin_2: Trypsin-li 97.8 1.1E-05 2.3E-10 72.3 2.4 24 284-307 97-120 (120)
24 COG5640 Secreted trypsin-like 97.1 0.0038 8.3E-08 67.2 11.3 37 605-641 223-262 (413)
25 PF03761 DUF316: Domain of unk 96.7 0.078 1.7E-06 55.6 17.6 110 520-654 159-274 (282)
26 PF00089 Trypsin: Trypsin; In 96.6 0.023 4.9E-07 55.7 11.7 113 214-327 86-214 (220)
27 PF05579 Peptidase_S32: Equine 96.6 0.011 2.4E-07 61.5 9.6 78 521-634 155-232 (297)
28 KOG1421 Predicted signaling-as 95.0 0.47 1E-05 55.2 14.4 144 497-659 578-727 (955)
29 PF00548 Peptidase_C3: 3C cyst 94.9 0.42 9E-06 47.2 12.5 35 596-630 133-170 (172)
30 PF10459 Peptidase_S46: Peptid 94.6 0.062 1.3E-06 63.7 6.6 64 595-658 618-686 (698)
31 COG3591 V8-like Glu-specific e 94.1 0.26 5.6E-06 51.6 9.3 76 232-315 152-227 (251)
32 PF08192 Peptidase_S64: Peptid 92.8 0.46 1E-05 55.3 9.4 121 519-660 540-690 (695)
33 PF02907 Peptidase_S29: Hepati 91.4 0.14 3E-06 48.4 2.5 45 284-329 101-146 (148)
34 PF10459 Peptidase_S46: Peptid 90.3 0.2 4.3E-06 59.5 3.1 29 281-309 623-651 (698)
35 PF00949 Peptidase_S7: Peptida 90.2 0.23 5.1E-06 47.1 2.9 35 280-314 86-120 (132)
36 PF02907 Peptidase_S29: Hepati 89.6 0.33 7.1E-06 46.0 3.3 44 606-652 104-147 (148)
37 PF00949 Peptidase_S7: Peptida 89.4 0.27 5.9E-06 46.6 2.7 29 606-634 93-121 (132)
38 smart00020 Tryp_SPc Trypsin-li 86.5 4.6 0.0001 39.8 9.7 100 213-312 87-208 (229)
39 PF00944 Peptidase_S3: Alphavi 86.0 0.8 1.7E-05 43.5 3.5 34 601-634 97-130 (158)
40 cd00190 Tryp_SPc Trypsin-like 85.1 6.3 0.00014 38.6 9.8 99 214-312 88-208 (232)
41 PF00863 Peptidase_C4: Peptida 85.0 4.4 9.5E-05 42.2 8.7 106 214-330 81-190 (235)
42 PF00947 Pico_P2A: Picornaviru 84.6 1.8 3.8E-05 40.8 5.1 31 280-311 79-109 (127)
43 PF09342 DUF1986: Domain of un 83.7 6 0.00013 41.4 8.9 33 395-428 17-49 (267)
44 PF05580 Peptidase_S55: SpoIVB 82.7 0.92 2E-05 46.4 2.6 45 601-651 171-215 (218)
45 PF08192 Peptidase_S64: Peptid 80.3 12 0.00026 44.1 10.7 112 209-332 537-684 (695)
46 PF00944 Peptidase_S3: Alphavi 75.0 3.6 7.8E-05 39.2 3.8 32 281-312 96-127 (158)
47 PF00947 Pico_P2A: Picornaviru 61.5 11 0.00024 35.5 4.1 33 599-632 79-111 (127)
48 KOG0441 Cu2+/Zn2+ superoxide d 60.8 3.2 7E-05 40.3 0.4 43 25-67 37-84 (154)
49 PF03510 Peptidase_C24: 2C end 57.8 32 0.0007 31.5 6.3 18 409-427 2-19 (105)
50 PF01732 DUF31: Putative pepti 53.7 7.9 0.00017 42.7 2.1 23 606-628 351-373 (374)
51 TIGR02860 spore_IV_B stage IV 50.7 11 0.00023 42.3 2.5 46 601-652 351-396 (402)
52 PF05579 Peptidase_S32: Equine 48.8 12 0.00026 39.7 2.3 25 288-312 205-229 (297)
53 PF00548 Peptidase_C3: 3C cyst 42.4 56 0.0012 32.2 5.9 90 214-310 71-169 (172)
54 PF03761 DUF316: Domain of unk 30.3 3.3E+02 0.0072 28.3 9.7 92 213-315 159-258 (282)
55 PF05580 Peptidase_S55: SpoIVB 30.1 46 0.001 34.3 3.1 34 288-322 177-210 (218)
56 PF01732 DUF31: Putative pepti 27.3 44 0.00095 36.9 2.6 26 285-310 349-374 (374)
57 PF08208 RNA_polI_A34: DNA-dir 23.7 26 0.00057 35.1 0.0 13 23-35 109-121 (198)
58 PF00571 CBS: CBS domain CBS d 23.5 59 0.0013 24.8 2.0 22 609-630 28-49 (57)
59 PF12381 Peptidase_C3G: Tungro 21.2 84 0.0018 32.5 3.0 53 600-658 170-228 (231)
No 1
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.95 E-value=1.2e-26 Score=250.35 Aligned_cols=187 Identities=26% Similarity=0.438 Sum_probs=151.0
Q ss_pred hhHHhhccCceEEEEeCC-----------CeEEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCC
Q 006130 387 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH 455 (660)
Q Consensus 387 p~~i~~a~~SVV~I~~~~-----------~~wGSGflI~~~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~~~~~~ 455 (660)
..+++++.+|||.|.... ...||||+|+++||||||+||++ +
T Consensus 48 ~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~----------~----------------- 100 (351)
T TIGR02038 48 NKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIK----------K----------------- 100 (351)
T ss_pred HHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeC----------C-----------------
Confidence 356899999999997621 34799999999999999999996 1
Q ss_pred CcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCcceeEeeEEEEecCCCCceEEEEEccCCCC
Q 006130 456 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ 535 (660)
Q Consensus 456 ~~~~~~~~~~~~~~k~~~i~~e~~~~~~~~~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~~~p~~ 535 (660)
.+.+.|++.+++. ++|++++ .|+..||||||++. ..
T Consensus 101 --------------------------------------~~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvlkv~~--~~ 136 (351)
T TIGR02038 101 --------------------------------------ADQIVVALQDGRK---FEAELVG-SDPLTDLAVLKIEG--DN 136 (351)
T ss_pred --------------------------------------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC--CC
Confidence 2237788888765 9999999 78889999999996 34
Q ss_pred cceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCcccc
Q 006130 536 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 614 (660)
Q Consensus 536 l~pi~l~~s-~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~SGG 614 (660)
++++++.++ .+++||.|+++|| |+++..+++.|+|++..+... .......+||||+++.+|||||
T Consensus 137 ~~~~~l~~s~~~~~G~~V~aiG~-----P~~~~~s~t~GiIs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGG 202 (351)
T TIGR02038 137 LPTIPVNLDRPPHVGDVVLAIGN-----PYNLGQTITQGIISATGRNGL---------SSVGRQNFIQTDAAINAGNSGG 202 (351)
T ss_pred CceEeccCcCccCCCCEEEEEeC-----CCCCCCcEEEEEEEeccCccc---------CCCCcceEEEECCccCCCCCcc
Confidence 778888764 6899999999999 466778999999998764321 0112346799999999999999
Q ss_pred cccccCceEEEEeeecccCCCCcccCceEEEEehHHHHHHHHHh
Q 006130 615 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFA 658 (660)
Q Consensus 615 PL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il~~a 658 (660)
||||.+|+||||+++.....++....+++|+||++.++++++.+
T Consensus 203 pl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l 246 (351)
T TIGR02038 203 ALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKI 246 (351)
T ss_pred eEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHH
Confidence 99999999999999876543333335799999999999999765
No 2
>PRK10139 serine endoprotease; Provisional
Probab=99.95 E-value=1.6e-26 Score=256.91 Aligned_cols=186 Identities=28% Similarity=0.513 Sum_probs=151.4
Q ss_pred hhHHhhccCceEEEEeCC----------------------------CeEEEEEEEeC-CCEEEEcccccCCCCCcceeec
Q 006130 387 PLPIQKALASVCLITIDD----------------------------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVS 437 (660)
Q Consensus 387 p~~i~~a~~SVV~I~~~~----------------------------~~wGSGflI~~-~GlILTNaHVVep~~~~~~~~~ 437 (660)
..+++++.|+||.|.... .+.||||+|++ +||||||+|||+
T Consensus 43 ~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~---------- 112 (455)
T PRK10139 43 APMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN---------- 112 (455)
T ss_pred HHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC----------
Confidence 457899999999996410 14799999985 799999999996
Q ss_pred CCccccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCcceeEeeEEEEe
Q 006130 438 GWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYV 517 (660)
Q Consensus 438 g~~~~~~f~~~~~~~~~~~~~~~~~~~~~~~~k~~~i~~e~~~~~~~~~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v 517 (660)
+ ...|.|++.+++. |+|++++
T Consensus 113 ~-------------------------------------------------------a~~i~V~~~dg~~---~~a~vvg- 133 (455)
T PRK10139 113 Q-------------------------------------------------------AQKISIQLNDGRE---FDAKLIG- 133 (455)
T ss_pred C-------------------------------------------------------CCEEEEEECCCCE---EEEEEEE-
Confidence 1 2348888888875 9999999
Q ss_pred cCCCCceEEEEEccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCC
Q 006130 518 CKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 596 (660)
Q Consensus 518 ~d~~~DLALLqL~~~p~~l~pi~l~~s-~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~ 596 (660)
.|+..||||||++. +..++++++.++ .+++||.|+++|| |+|+..+++.|+|++..+... ....
T Consensus 134 ~D~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~-----P~g~~~tvt~GivS~~~r~~~---------~~~~ 198 (455)
T PRK10139 134 SDDQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGN-----PFGLGQTATSGIISALGRSGL---------NLEG 198 (455)
T ss_pred EcCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEec-----CCCCCCceEEEEEcccccccc---------CCCC
Confidence 78889999999985 457889999875 5899999999999 567778999999998754310 0123
Q ss_pred cCcEEEEcccccCCcccccccccCceEEEEeeecccCCCCcccCceEEEEehHHHHHHHHHh
Q 006130 597 YPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFA 658 (660)
Q Consensus 597 ~~~~IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il~~a 658 (660)
+..+||||+++++|||||||||.+|+||||+++.....++. .+++|+||++.++++++.+
T Consensus 199 ~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~--~gigfaIP~~~~~~v~~~l 258 (455)
T PRK10139 199 LENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGS--VGIGFAIPSNMARTLAQQL 258 (455)
T ss_pred cceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCc--cceEEEEEhHHHHHHHHHH
Confidence 45689999999999999999999999999999977654432 3789999999999988764
No 3
>PRK10898 serine endoprotease; Provisional
Probab=99.95 E-value=2e-26 Score=248.73 Aligned_cols=187 Identities=24% Similarity=0.400 Sum_probs=150.4
Q ss_pred hhHHhhccCceEEEEeCC-----------CeEEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCC
Q 006130 387 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH 455 (660)
Q Consensus 387 p~~i~~a~~SVV~I~~~~-----------~~wGSGflI~~~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~~~~~~ 455 (660)
..+++++.+|||.|.... ..+||||+|+++||||||+||++ +
T Consensus 48 ~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~----------~----------------- 100 (353)
T PRK10898 48 NQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIN----------D----------------- 100 (353)
T ss_pred HHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeC----------C-----------------
Confidence 457899999999998731 15899999999999999999996 1
Q ss_pred CcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCcceeEeeEEEEecCCCCceEEEEEccCCCC
Q 006130 456 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ 535 (660)
Q Consensus 456 ~~~~~~~~~~~~~~k~~~i~~e~~~~~~~~~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~~~p~~ 535 (660)
.+.+.|++.+++. |+|++++ .|+..||||||++. ..
T Consensus 101 --------------------------------------a~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvl~v~~--~~ 136 (353)
T PRK10898 101 --------------------------------------ADQIIVALQDGRV---FEALLVG-SDSLTDLAVLKINA--TN 136 (353)
T ss_pred --------------------------------------CCEEEEEeCCCCE---EEEEEEE-EcCCCCEEEEEEcC--CC
Confidence 2237888888775 9999998 67889999999985 35
Q ss_pred cceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCcccc
Q 006130 536 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 614 (660)
Q Consensus 536 l~pi~l~~s-~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~SGG 614 (660)
++++++.++ .+++||.|+++|| |+++..+++.|+|++..+.... ......+||||+++++|||||
T Consensus 137 l~~~~l~~~~~~~~G~~V~aiG~-----P~g~~~~~t~Giis~~~r~~~~---------~~~~~~~iqtda~i~~GnSGG 202 (353)
T PRK10898 137 LPVIPINPKRVPHIGDVVLAIGN-----PYNLGQTITQGIISATGRIGLS---------PTGRQNFLQTDASINHGNSGG 202 (353)
T ss_pred CCeeeccCcCcCCCCCEEEEEeC-----CCCcCCCcceeEEEeccccccC---------CccccceEEeccccCCCCCcc
Confidence 778888765 5899999999999 4566788999999986543210 012245799999999999999
Q ss_pred cccccCceEEEEeeecccCCC-CcccCceEEEEehHHHHHHHHHh
Q 006130 615 AVVNLDGHMIGLVTSNARHGG-GTVIPHLNFSIPCAVLRPIFEFA 658 (660)
Q Consensus 615 PL~Ns~G~VVGIvssna~~~~-g~~ip~lnFaIPi~~l~~il~~a 658 (660)
||+|.+|+||||+++.....+ +....+++|+||++.++++++.+
T Consensus 203 Pl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l 247 (353)
T PRK10898 203 ALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKL 247 (353)
T ss_pred eEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHH
Confidence 999999999999998765432 22334789999999999998764
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.93 E-value=4.4e-25 Score=246.53 Aligned_cols=167 Identities=32% Similarity=0.538 Sum_probs=138.7
Q ss_pred eEEEEEEEeC-CCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccc
Q 006130 406 VWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRA 484 (660)
Q Consensus 406 ~wGSGflI~~-~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~~~~~~k~~~i~~e~~~~~~~ 484 (660)
++||||+|+. +||||||+||++ +
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~----------~---------------------------------------------- 134 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVD----------N---------------------------------------------- 134 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcC----------C----------------------------------------------
Confidence 4799999996 599999999996 1
Q ss_pred cccccccCCceeEEEEEcCCCcceeEeeEEEEecCCCCceEEEEEccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCC
Q 006130 485 YKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPR 563 (660)
Q Consensus 485 ~~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~~~p~~l~pi~l~~s-~~~~Gd~V~vIGygl~g~~ 563 (660)
...|.|++.+++. |+|++++ .|+..||||||++. +..++++++.++ .+++||.|+++|| |
T Consensus 135 ---------a~~i~V~~~dg~~---~~a~vv~-~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~-----P 195 (473)
T PRK10942 135 ---------ATKIKVQLSDGRK---FDAKVVG-KDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGN-----P 195 (473)
T ss_pred ---------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcC-----C
Confidence 2348888888776 9999999 78889999999975 457889999875 6999999999999 5
Q ss_pred CCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCcccccccccCceEEEEeeecccCCCCcccCceE
Q 006130 564 CGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLN 643 (660)
Q Consensus 564 ~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~ln 643 (660)
+++..+++.|+|++..+... ....+..+||||+++++|||||||+|.+|+||||+++.....++. .+++
T Consensus 196 ~g~~~tvt~GiVs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~--~g~g 264 (473)
T PRK10942 196 YGLGETVTSGIVSALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGN--IGIG 264 (473)
T ss_pred CCCCcceeEEEEEEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCc--ccEE
Confidence 67778999999998764310 012345689999999999999999999999999999877655443 3799
Q ss_pred EEEehHHHHHHHHHh
Q 006130 644 FSIPCAVLRPIFEFA 658 (660)
Q Consensus 644 FaIPi~~l~~il~~a 658 (660)
|+||+++++++++.+
T Consensus 265 faIP~~~~~~v~~~l 279 (473)
T PRK10942 265 FAIPSNMVKNLTSQM 279 (473)
T ss_pred EEEEHHHHHHHHHHH
Confidence 999999999998765
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.93 E-value=1.6e-24 Score=239.56 Aligned_cols=167 Identities=31% Similarity=0.502 Sum_probs=137.1
Q ss_pred eEEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCCCcccccccccCCCCCCCccccccccccccc
Q 006130 406 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY 485 (660)
Q Consensus 406 ~wGSGflI~~~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~~~~~~k~~~i~~e~~~~~~~~ 485 (660)
.+||||+|+++||||||+||++ +
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~----------~----------------------------------------------- 80 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVD----------G----------------------------------------------- 80 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcC----------C-----------------------------------------------
Confidence 4799999999999999999996 1
Q ss_pred ccccccCCceeEEEEEcCCCcceeEeeEEEEecCCCCceEEEEEccCCCCcceeecCCC-CCCCCCeEEEEecCCCCCCC
Q 006130 486 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC 564 (660)
Q Consensus 486 ~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~~~p~~l~pi~l~~s-~~~~Gd~V~vIGygl~g~~~ 564 (660)
..++.|++.+++. |+|++++ .|+.+||||||++. +..++++.+.++ .+++|+.|+++|| |+
T Consensus 81 --------~~~i~V~~~~~~~---~~a~vv~-~d~~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~-----p~ 142 (428)
T TIGR02037 81 --------ADEITVTLSDGRE---FKAKLVG-KDPRTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGN-----PF 142 (428)
T ss_pred --------CCeEEEEeCCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEEC-----CC
Confidence 2237788877765 9999998 67889999999985 357899999864 6899999999999 56
Q ss_pred CCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCcccccccccCceEEEEeeecccCCCCcccCceEE
Q 006130 565 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNF 644 (660)
Q Consensus 565 gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnF 644 (660)
++..+++.|+|++..+... ....+..+++||+++.+|+|||||||.+|+||||+++.....++. .+++|
T Consensus 143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~--~g~~f 211 (428)
T TIGR02037 143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGN--VGIGF 211 (428)
T ss_pred cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCc--cceEE
Confidence 7778999999998754310 112345689999999999999999999999999998876654332 37899
Q ss_pred EEehHHHHHHHHHh
Q 006130 645 SIPCAVLRPIFEFA 658 (660)
Q Consensus 645 aIPi~~l~~il~~a 658 (660)
+||++.++++++.+
T Consensus 212 aiP~~~~~~~~~~l 225 (428)
T TIGR02037 212 AIPSNMAKNVVDQL 225 (428)
T ss_pred EEEhHHHHHHHHHH
Confidence 99999999999875
No 6
>PRK10139 serine endoprotease; Provisional
Probab=99.91 E-value=4.5e-24 Score=237.38 Aligned_cols=143 Identities=21% Similarity=0.282 Sum_probs=121.1
Q ss_pred cCCcccc-CCCcccEEEEEEcC-CCCCCCccccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCC---CCC
Q 006130 204 SSNLSLM-SKSTSRVAILGVSS-YLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STT 278 (660)
Q Consensus 204 ~~~~~~~-~~~~td~A~l~i~~-~~~~~~~~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~---~~~ 278 (660)
.-+++++ .+..+||||||++. ...+..+++.|+.+++||+|+|||+||| +..++|.||||++.|.. ..+
T Consensus 126 ~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g------~~~tvt~GivS~~~r~~~~~~~~ 199 (455)
T PRK10139 126 EFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFG------LGQTATSGIISALGRSGLNLEGL 199 (455)
T ss_pred EEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCC------CCCceEEEEEccccccccCCCCc
Confidence 3446677 45559999999973 2333445556778999999999999999 57899999999998752 235
Q ss_pred CCeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHHHHHHHHhhhh-cCCCCcccceeeccc
Q 006130 279 RSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINK 352 (660)
Q Consensus 279 ~~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~ 352 (660)
.+||||||++|||||||||||.+|+||||++++++.. +..|++|+||++.++.++++|+ .+++.++|+|+.++.
T Consensus 200 ~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~ 275 (455)
T PRK10139 200 ENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTE 275 (455)
T ss_pred ceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEE
Confidence 6899999999999999999999999999999999876 7789999999999999999998 778999999988653
No 7
>PRK10942 serine endoprotease; Provisional
Probab=99.88 E-value=1.5e-22 Score=226.30 Aligned_cols=142 Identities=18% Similarity=0.268 Sum_probs=120.3
Q ss_pred CCccccC-CCcccEEEEEEcC-CCCCCCccccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCC---CCCC
Q 006130 205 SNLSLMS-KSTSRVAILGVSS-YLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTR 279 (660)
Q Consensus 205 ~~~~~~~-~~~td~A~l~i~~-~~~~~~~~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~---~~~~ 279 (660)
-++++++ +..+||||||++. ...+..+++.++.+++||+|++||+||| |.+++|.||||++.+.. ..+.
T Consensus 148 ~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~g------~~~tvt~GiVs~~~r~~~~~~~~~ 221 (473)
T PRK10942 148 FDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPYG------LGETVTSGIVSALGRSGLNVENYE 221 (473)
T ss_pred EEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCCC------CCcceeEEEEEEeecccCCccccc
Confidence 3456664 4459999999963 2233334555778999999999999999 57899999999998752 2467
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHHHHHHHHhhhh-cCCCCcccceeeccc
Q 006130 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINK 352 (660)
Q Consensus 280 ~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~ 352 (660)
+||||||++|||||||||||.+|+||||++++++.. ++.|++|+||++.++.++++|. .+++.|+|+|+.++.
T Consensus 222 ~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g~v~rg~lGv~~~~ 296 (473)
T PRK10942 222 NFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTE 296 (473)
T ss_pred ceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhccccccceeeeEeee
Confidence 899999999999999999999999999999999887 7789999999999999999998 788999999988653
No 8
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.86 E-value=1.9e-21 Score=210.01 Aligned_cols=141 Identities=16% Similarity=0.262 Sum_probs=117.8
Q ss_pred CCccccC-CCcccEEEEEEcCCCCCCCccccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCC---CCCCC
Q 006130 205 SNLSLMS-KSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRS 280 (660)
Q Consensus 205 ~~~~~~~-~~~td~A~l~i~~~~~~~~~~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~---~~~~~ 280 (660)
.++++++ +..+||||||++....+..++..++.+++||+|++||+||| +.++++.|+||++.+.. ..+..
T Consensus 114 ~~a~vv~~d~~~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~------~~~s~t~GiIs~~~r~~~~~~~~~~ 187 (351)
T TIGR02038 114 FEAELVGSDPLTDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYN------LGQTITQGIISATGRNGLSSVGRQN 187 (351)
T ss_pred EEEEEEEecCCCCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCC------CCCcEEEEEEEeccCcccCCCCcce
Confidence 3456664 45599999999865444445666778999999999999999 46899999999997742 13467
Q ss_pred eEEeecccCCCCcCcceecCCccEEEEEeeecccc---CCcceEEEEeHHHHHHHHhhhh-cCCCCcccceeecc
Q 006130 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK---SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 351 (660)
Q Consensus 281 ~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~---~~~~i~f~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 351 (660)
||||||+++||||||||||.+|+||||+++.+... ...+++|+||++.++.++++++ .+++.++|+|+..+
T Consensus 188 ~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~ 262 (351)
T TIGR02038 188 FIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGE 262 (351)
T ss_pred EEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEE
Confidence 99999999999999999999999999999998754 2368999999999999999998 67889999998754
No 9
>PRK10898 serine endoprotease; Provisional
Probab=99.85 E-value=3.6e-21 Score=208.04 Aligned_cols=139 Identities=17% Similarity=0.242 Sum_probs=116.2
Q ss_pred ccccC-CCcccEEEEEEcCCCCCCCccccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCC---CCCCCeE
Q 006130 207 LSLMS-KSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLL 282 (660)
Q Consensus 207 ~~~~~-~~~td~A~l~i~~~~~~~~~~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~---~~~~~~i 282 (660)
+.+++ +..+||||||++....+..+++.++.+++||+|++||+||| +..+++.|+||++.|.. .....||
T Consensus 116 a~vv~~d~~~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g------~~~~~t~Giis~~~r~~~~~~~~~~~i 189 (353)
T PRK10898 116 ALLVGSDSLTDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYN------LGQTITQGIISATGRIGLSPTGRQNFL 189 (353)
T ss_pred EEEEEEcCCCCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCC------cCCCcceeEEEeccccccCCccccceE
Confidence 44553 44599999999865445556666778999999999999999 46799999999987642 2235799
Q ss_pred EeecccCCCCcCcceecCCccEEEEEeeeccccC----CcceEEEEeHHHHHHHHhhhh-cCCCCcccceeecc
Q 006130 283 MADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 351 (660)
Q Consensus 283 ~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~~----~~~i~f~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 351 (660)
|||+++|||||||||+|.+|+||||+++.+...+ ..+++|+||++.+..++++++ .+++.++|+|+..+
T Consensus 190 qtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~ 263 (353)
T PRK10898 190 QTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGR 263 (353)
T ss_pred EeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEE
Confidence 9999999999999999999999999999986542 268999999999999999988 77899999998854
No 10
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.84 E-value=1.6e-20 Score=207.68 Aligned_cols=140 Identities=23% Similarity=0.362 Sum_probs=118.9
Q ss_pred Ccccc-CCCcccEEEEEEcCCCCCCCccc--cCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCC---CCCC
Q 006130 206 NLSLM-SKSTSRVAILGVSSYLKDLPNIA--LTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTR 279 (660)
Q Consensus 206 ~~~~~-~~~~td~A~l~i~~~~~~~~~~~--~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~---~~~~ 279 (660)
+++++ .+..+|+||||++.. .+++.+. .++.+++||+|+++|+||| +..++|.|+||++.+.. ..+.
T Consensus 95 ~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g------~~~~~t~G~vs~~~~~~~~~~~~~ 167 (428)
T TIGR02037 95 KAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFG------LGQTVTSGIVSALGRSGLGIGDYE 167 (428)
T ss_pred EEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCc------CCCcEEEEEEEecccCccCCCCcc
Confidence 34455 344589999999853 3455444 4678999999999999999 57899999999988752 3467
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHHHHHHHHhhhh-cCCCCcccceeeccc
Q 006130 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINK 352 (660)
Q Consensus 280 ~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~ 352 (660)
.|||||++++||||||||||.+|+||||++++++.. ++.|++|+||++.++.+++++. .+++.++|+|+.++.
T Consensus 168 ~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi~~~~ 242 (428)
T TIGR02037 168 NFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGVTIQE 242 (428)
T ss_pred ceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCceEeec
Confidence 799999999999999999999999999999999877 7789999999999999999998 678999999998654
No 11
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.82 E-value=4.6e-19 Score=190.86 Aligned_cols=187 Identities=27% Similarity=0.458 Sum_probs=150.7
Q ss_pred hHHhhccCceEEEEeCC-----------------CeEEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCC
Q 006130 388 LPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDS 450 (660)
Q Consensus 388 ~~i~~a~~SVV~I~~~~-----------------~~wGSGflI~~~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~ 450 (660)
.+++++.++||.|.... ..+||||+++++|||+||.||++ +
T Consensus 37 ~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~----------~------------ 94 (347)
T COG0265 37 TAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIA----------G------------ 94 (347)
T ss_pred HHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecC----------C------------
Confidence 57889999999887632 37899999999999999999997 1
Q ss_pred CCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCcceeEeeEEEEecCCCCceEEEEEc
Q 006130 451 ASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLG 530 (660)
Q Consensus 451 ~~~~~~~~~~~~~~~~~~~k~~~i~~e~~~~~~~~~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~ 530 (660)
...+.+.+.++.. +++++++ .|+..|+|+||++
T Consensus 95 -------------------------------------------a~~i~v~l~dg~~---~~a~~vg-~d~~~dlavlki~ 127 (347)
T COG0265 95 -------------------------------------------AEEITVTLADGRE---VPAKLVG-KDPISDLAVLKID 127 (347)
T ss_pred -------------------------------------------cceEEEEeCCCCE---EEEEEEe-cCCccCEEEEEec
Confidence 2236666655554 9999999 8888999999999
Q ss_pred cCCCCcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccC
Q 006130 531 YIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHP 609 (660)
Q Consensus 531 ~~p~~l~pi~l~~s-~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~ 609 (660)
.... ++.+.+.++ .++.|+.++++|+ ++++..+++.|+++...+... .....+..+||||+++++
T Consensus 128 ~~~~-~~~~~~~~s~~l~vg~~v~aiGn-----p~g~~~tvt~Givs~~~r~~v--------~~~~~~~~~IqtdAain~ 193 (347)
T COG0265 128 GAGG-LPVIALGDSDKLRVGDVVVAIGN-----PFGLGQTVTSGIVSALGRTGV--------GSAGGYVNFIQTDAAINP 193 (347)
T ss_pred cCCC-CceeeccCCCCcccCCEEEEecC-----CCCcccceeccEEeccccccc--------cCcccccchhhcccccCC
Confidence 7322 777788776 5889999999999 567889999999998776411 111225667899999999
Q ss_pred CcccccccccCceEEEEeeecccCCCCcccCceEEEEehHHHHHHHHHhc
Q 006130 610 GGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFAR 659 (660)
Q Consensus 610 G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il~~a~ 659 (660)
|+||||++|.+|++|||++......++.. +++|+||+....+++..+.
T Consensus 194 gnsGgpl~n~~g~~iGint~~~~~~~~~~--gigfaiP~~~~~~v~~~l~ 241 (347)
T COG0265 194 GNSGGPLVNIDGEVVGINTAIIAPSGGSS--GIGFAIPVNLVAPVLDELI 241 (347)
T ss_pred CCCCCceEcCCCcEEEEEEEEecCCCCcc--eeEEEecHHHHHHHHHHHH
Confidence 99999999999999999999988765422 5899999999999887653
No 12
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.81 E-value=1.3e-19 Score=195.03 Aligned_cols=140 Identities=20% Similarity=0.285 Sum_probs=121.1
Q ss_pred Ccccc-CCCcccEEEEEEcCCC-CCCCccccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCC-C---CCCC
Q 006130 206 NLSLM-SKSTSRVAILGVSSYL-KDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-R---STTR 279 (660)
Q Consensus 206 ~~~~~-~~~~td~A~l~i~~~~-~~~~~~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~-~---~~~~ 279 (660)
+++++ .+..+|+|+||++... .+...+++++.+++||+++|||+||| |.+++|.||||+..|. - ..+.
T Consensus 109 ~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~~~~~~ 182 (347)
T COG0265 109 PAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGSAGGYV 182 (347)
T ss_pred EEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccCccccc
Confidence 35566 4556999999999643 44456777889999999999999999 6899999999999984 1 2267
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHHHHHHHHhhhh-cCCCCcccceeecc
Q 006130 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 351 (660)
Q Consensus 280 ~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 351 (660)
+||||||++||||||||++|.+|++|||+++++... +..|++|+||++.+..++.++. .+++.++|+|+.+.
T Consensus 183 ~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~ 256 (347)
T COG0265 183 NFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGE 256 (347)
T ss_pred chhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEE
Confidence 899999999999999999999999999999999988 5788999999999999999999 47899999988754
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.54 E-value=9.7e-14 Score=124.57 Aligned_cols=24 Identities=46% Similarity=0.904 Sum_probs=22.3
Q ss_pred EcccccCCcccccccccCceEEEE
Q 006130 603 TTAAVHPGGSGGAVVNLDGHMIGL 626 (660)
Q Consensus 603 Tda~v~~G~SGGPL~Ns~G~VVGI 626 (660)
+++.+.+|+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 14
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.34 E-value=1.2e-12 Score=144.50 Aligned_cols=128 Identities=19% Similarity=0.309 Sum_probs=107.7
Q ss_pred ccCCccccC-CCcccEEEEEEcCCC--CCCCccccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCC----
Q 006130 203 ESSNLSLMS-KSTSRVAILGVSSYL--KDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---- 275 (660)
Q Consensus 203 ~~~~~~~~~-~~~td~A~l~i~~~~--~~~~~~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~---- 275 (660)
-+.+|.+++ +...|+|++|++... ....+.+.+..++.|+|+.++++||+ +.|++|+|+||...|+.
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~------~~nt~t~g~vs~~~R~~~~lg 284 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFG------LLNTLTQGMVSGQLRKSFKLG 284 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCce------eeeeeeecccccccccccccC
Confidence 456688888 666999999996432 34456666889999999999999999 57999999999988852
Q ss_pred ----CCCCCeEEeecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHHHHHHHHhhh
Q 006130 276 ----STTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDL 336 (660)
Q Consensus 276 ----~~~~~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~~i~~~~~~l 336 (660)
.....++|||+++|+||+|||++|.+|++||++++...+. -+.++.|++|.+.+...+...
T Consensus 285 ~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 285 LETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRL 350 (473)
T ss_pred cccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhh
Confidence 2345788999999999999999999999999999998876 678999999999998766554
No 15
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.34 E-value=7.3e-12 Score=138.24 Aligned_cols=201 Identities=21% Similarity=0.282 Sum_probs=138.2
Q ss_pred HHhhccCceEEEEeCC--------------CeEEEEEEEeCCCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCC
Q 006130 389 PIQKALASVCLITIDD--------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSG 454 (660)
Q Consensus 389 ~i~~a~~SVV~I~~~~--------------~~wGSGflI~~~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~~~~~ 454 (660)
+.++...+++.|...+ ..-||||+++.+|+++||+||+.- .-..|
T Consensus 133 ~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~--------~~~~y------------- 191 (473)
T KOG1320|consen 133 VFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRV--------EPRIY------------- 191 (473)
T ss_pred hhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEE--------Eeccc-------------
Confidence 4566788888888621 124999999999999999999851 00000
Q ss_pred CCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCC--CcceeEeeEEEEecCCCCceEEEEEccC
Q 006130 455 HTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHL--DPWIWCDAKIVYVCKGPLDVSLLQLGYI 532 (660)
Q Consensus 455 ~~~~~~~~~~~~~~~k~~~i~~e~~~~~~~~~~n~~~~~~~~I~Vr~~~~--~~~~w~~A~VV~v~d~~~DLALLqL~~~ 532 (660)
.+. . ..--.+.++...+ .. +.+.++. .++..|+|+++++..
T Consensus 192 -----~~~----~------------------------~~l~~vqi~aa~~~~~s---~ep~i~g-~d~~~gvA~l~ik~~ 234 (473)
T KOG1320|consen 192 -----AHS----S------------------------TVLLRVQIDAAIGPGNS---GEPVIVG-VDKVAGVAFLKIKTP 234 (473)
T ss_pred -----cCC----C------------------------cceeeEEEEEeecCCcc---CCCeEEc-cccccceEEEEEecC
Confidence 000 0 0011255555544 44 5666666 467899999999752
Q ss_pred CCCcceeecCCC-CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCc
Q 006130 533 PDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGG 611 (660)
Q Consensus 533 p~~l~pi~l~~s-~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~ 611 (660)
..-+.++++... .+..|+.+..+|. ++++..+++.|+++...+...... .. ........+||++++..|+
T Consensus 235 ~~i~~~i~~~~~~~~~~G~~~~a~~~-----~f~~~nt~t~g~vs~~~R~~~~lg---~~-~g~~i~~~~qtd~ai~~~n 305 (473)
T KOG1320|consen 235 ENILYVIPLGVSSHFRTGVEVSAIGN-----GFGLLNTLTQGMVSGQLRKSFKLG---LE-TGVLISKINQTDAAINPGN 305 (473)
T ss_pred Ccccceeecceeeeecccceeecccc-----CceeeeeeeecccccccccccccC---cc-cceeeeeecccchhhhccc
Confidence 233677777654 6899999999999 578888999999987665321111 01 1123456789999999999
Q ss_pred ccccccccCceEEEEeeecccCCCCcccCceEEEEehHHHHHHHHHh
Q 006130 612 SGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFA 658 (660)
Q Consensus 612 SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il~~a 658 (660)
||||++|.+|++||+++.+...-+=. -.++|++|++.+..++.+.
T Consensus 306 sg~~ll~~DG~~IgVn~~~~~ri~~~--~~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 306 SGGPLLNLDGEVIGVNTRKVTRIGFS--HGISFKIPIDTVLVIVLRL 350 (473)
T ss_pred CCCcEEEecCcEeeeeeeeeEEeecc--ccceeccCchHhhhhhhhh
Confidence 99999999999999988875521000 1479999999988876553
No 16
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.31 E-value=7.7e-11 Score=115.85 Aligned_cols=126 Identities=21% Similarity=0.307 Sum_probs=78.6
Q ss_pred CCceEEEEEccC---CCCcceeecCCC--CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCC
Q 006130 521 PLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS 595 (660)
Q Consensus 521 ~~DLALLqL~~~---p~~l~pi~l~~s--~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~ 595 (660)
.+|||||+|+.. ...+.|+.+... .+..|+.+.++|||.... .+....+....+.-+... .+... ....
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~-~~~~~~~~~~~~~~~~~~----~c~~~-~~~~ 159 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD-NGYSSNLQSVTVPVVSRK----TCRSS-YNDN 159 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST-TSBTSBEEEEEEEEEEHH----HHHHH-TTTT
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccccc----ccccc-cccc
Confidence 589999999974 356678877763 468999999999985221 111123332332221110 01110 0112
Q ss_pred CcCcEEEEcc----cccCCcccccccccCceEEEEeeecccCCCCcccCceEEEEehHHHHHHH
Q 006130 596 AYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIF 655 (660)
Q Consensus 596 ~~~~~IqTda----~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il 655 (660)
....++++.. ..+.|+|||||++.++.|+||++.+ ....... ...+.+++....+++
T Consensus 160 ~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~--~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 160 LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPN--YPGVYTRVSSYLDWI 220 (220)
T ss_dssp STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTT--SEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccccccccceeeecceeeec-CCCCCCC--cCEEEEEHHHhhccC
Confidence 3456788776 7889999999998777899999988 3222221 247788888766653
No 17
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.21 E-value=5.7e-10 Score=110.37 Aligned_cols=109 Identities=23% Similarity=0.240 Sum_probs=63.7
Q ss_pred CCCceEEEEEccC---CCCcceeecCCC--CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCcccccc-
Q 006130 520 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQR- 593 (660)
Q Consensus 520 ~~~DLALLqL~~~---p~~l~pi~l~~s--~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~- 593 (660)
..+|||||+|+.. ...+.|+.+... .+..|+.+++.|||................+.-+.. ..|+.....
T Consensus 87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~----~~C~~~~~~~ 162 (232)
T cd00190 87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSN----AECKRAYSYG 162 (232)
T ss_pred CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECH----HHhhhhccCc
Confidence 3689999999962 234678887766 678899999999986432211111122222211110 011111100
Q ss_pred CCCcCcEEEEc-----ccccCCcccccccccC---ceEEEEeeeccc
Q 006130 594 NSAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTSNAR 632 (660)
Q Consensus 594 ~~~~~~~IqTd-----a~v~~G~SGGPL~Ns~---G~VVGIvssna~ 632 (660)
......+++.. ...+.|+|||||+... ..++||++....
T Consensus 163 ~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 163 GTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG 209 (232)
T ss_pred ccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc
Confidence 11223444443 3467899999999754 899999988754
No 18
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.01 E-value=1.2e-08 Score=101.41 Aligned_cols=108 Identities=23% Similarity=0.272 Sum_probs=61.6
Q ss_pred CCCceEEEEEccC---CCCcceeecCCC--CCCCCCeEEEEecCCCCCCC-CCCCeeeeeEEeeeeeccCCCCCcccccc
Q 006130 520 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRC-GLSPSVSSGVVAKVVKANLPSYGQSTLQR 593 (660)
Q Consensus 520 ~~~DLALLqL~~~---p~~l~pi~l~~s--~~~~Gd~V~vIGygl~g~~~-gl~psvt~GiVS~v~~~~~~~~~~~~~~~ 593 (660)
..+|||||+|+.. ...+.|+.+... .+..++.+.+.|||...... ..........+.-+.. ..|......
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~----~~C~~~~~~ 162 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSN----ATCRRAYSG 162 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCH----HHhhhhhcc
Confidence 4689999999862 335678777654 57789999999998643200 1111222222221110 001100000
Q ss_pred -CCCcCcEEEE-----cccccCCcccccccccCc--eEEEEeeecc
Q 006130 594 -NSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTSNA 631 (660)
Q Consensus 594 -~~~~~~~IqT-----da~v~~G~SGGPL~Ns~G--~VVGIvssna 631 (660)
......+++. ....++|+|||||+...+ .++||++...
T Consensus 163 ~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 163 GGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred ccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0112223433 345788999999996543 9999999876
No 19
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=98.87 E-value=9.1e-09 Score=115.62 Aligned_cols=188 Identities=21% Similarity=0.337 Sum_probs=127.4
Q ss_pred HhhccCceEEEEeC----------CCeEEEEEEEeC-CCEEEEcccccCCCCCcceeecCCccccccCCCCCCCCCCCcc
Q 006130 390 IQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGV 458 (660)
Q Consensus 390 i~~a~~SVV~I~~~----------~~~wGSGflI~~-~GlILTNaHVVep~~~~~~~~~g~~~~~~f~~~~~~~~~~~~~ 458 (660)
+..+.++||.|+.. ....|+||++++ .||||||+||+.|--+.. + ++|...++
T Consensus 58 ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va-------~-avf~n~ee-------- 121 (955)
T KOG1421|consen 58 IANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVA-------S-AVFDNHEE-------- 121 (955)
T ss_pred hhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCcee-------E-EEeccccc--------
Confidence 66788999999862 244699999998 689999999998532111 0 22221111
Q ss_pred cccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCcceeEeeEEEEecCCCCceEEEEEccCC---CC
Q 006130 459 DQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIP---DQ 535 (660)
Q Consensus 459 ~~~~~~~~~~~k~~~i~~e~~~~~~~~~~n~~~~~~~~I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~~~p---~~ 535 (660)
++.-.+| .|+-+|+.+++.+... ..
T Consensus 122 ---------------------------------------------------~ei~pvy-rDpVhdfGf~r~dps~ir~s~ 149 (955)
T KOG1421|consen 122 ---------------------------------------------------IEIYPVY-RDPVHDFGFFRYDPSTIRFSI 149 (955)
T ss_pred ---------------------------------------------------CCccccc-CCchhhcceeecChhhcceee
Confidence 2222344 6778999999998521 12
Q ss_pred cceeecCCCCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCccccc
Q 006130 536 LCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGA 615 (660)
Q Consensus 536 l~pi~l~~s~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~SGGP 615 (660)
+..+.+..+-.+.|.+++++|+ -.+...++-.|.++++.+.- +.++..++. .-....+|.-+...+|.||+|
T Consensus 150 vt~i~lap~~akvgseirvvgN-----DagEklsIlagflSrldr~a-pdyg~~~yn--dfnTfy~Qaasstsggssgsp 221 (955)
T KOG1421|consen 150 VTEICLAPELAKVGSEIRVVGN-----DAGEKLSILAGFLSRLDRNA-PDYGEDTYN--DFNTFYIQAASSTSGGSSGSP 221 (955)
T ss_pred eeccccCccccccCCceEEecC-----CccceEEeehhhhhhccCCC-ccccccccc--cccceeeeehhcCCCCCCCCc
Confidence 3444455555689999999998 45666777888888876532 222222222 122335787778889999999
Q ss_pred ccccCceEEEEeeecccCCCCcccCceEEEEehHHHHHHHHHhc
Q 006130 616 VVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFAR 659 (660)
Q Consensus 616 L~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il~~a~ 659 (660)
|+|-.|..|.++..+.... .-.|++|++-+.+.|.+.+
T Consensus 222 Vv~i~gyAVAl~agg~~ss------as~ffLpLdrV~RaL~clq 259 (955)
T KOG1421|consen 222 VVDIPGYAVALNAGGSISS------ASDFFLPLDRVVRALRCLQ 259 (955)
T ss_pred eecccceEEeeecCCcccc------cccceeeccchhhhhhhhh
Confidence 9999999999986654332 3479999999988876654
No 20
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.24 E-value=1.9e-05 Score=81.71 Aligned_cols=90 Identities=22% Similarity=0.233 Sum_probs=57.9
Q ss_pred CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcEEEEcccccCCcccccccccCceEE
Q 006130 545 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI 624 (660)
Q Consensus 545 ~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v~~G~SGGPL~Ns~G~VV 624 (660)
..+.++.+.++|||.-.++.+ ......+.|..+. ...+++++-+.+|+||+||++.+.++|
T Consensus 157 ~~~~~d~i~v~GYP~dk~~~~-~~~e~t~~v~~~~------------------~~~l~y~~dT~pG~SGSpv~~~~~~vi 217 (251)
T COG3591 157 EAKANDRITVIGYPGDKPNIG-TMWESTGKVNSIK------------------GNKLFYDADTLPGSSGSPVLISKDEVI 217 (251)
T ss_pred ccccCceeEEEeccCCCCcce-eEeeecceeEEEe------------------cceEEEEecccCCCCCCceEecCceEE
Confidence 468999999999975332222 1223334444322 225888999999999999999989999
Q ss_pred EEeeecccCCCCcccCceEEEE-ehHHHHHHHH
Q 006130 625 GLVTSNARHGGGTVIPHLNFSI-PCAVLRPIFE 656 (660)
Q Consensus 625 GIvssna~~~~g~~ip~lnFaI-Pi~~l~~il~ 656 (660)
|+.+.+....++ ...|+++ =...+.++++
T Consensus 218 gv~~~g~~~~~~---~~~n~~vr~t~~~~~~I~ 247 (251)
T COG3591 218 GVHYNGPGANGG---SLANNAVRLTPEILNFIQ 247 (251)
T ss_pred EEEecCCCcccc---cccCcceEecHHHHHHHH
Confidence 999988664433 2344333 2334444444
No 21
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.22 E-value=0.00011 Score=75.14 Aligned_cols=129 Identities=20% Similarity=0.213 Sum_probs=70.1
Q ss_pred CceEEEEEcc---CCCCcceeecCCCC----CCCCCeEEEEecCCCCCC-CCCCCeeeeeEEeeeeeccCCCCCcccccc
Q 006130 522 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPR-CGLSPSVSSGVVAKVVKANLPSYGQSTLQR 593 (660)
Q Consensus 522 ~DLALLqL~~---~p~~l~pi~l~~s~----~~~Gd~V~vIGygl~g~~-~gl~psvt~GiVS~v~~~~~~~~~~~~~~~ 593 (660)
+|||||+++. ..+.+.|+.+.... ...++.+++.|||..... ...........+.-+ ....|......
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~----~~~~C~~~~~~ 181 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPII----SNSECRRAYGG 181 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEc----ChhHhcccccC
Confidence 8999999996 23556777765332 344589999999854321 011111221111111 11112222211
Q ss_pred C-CCcCcEEEEcc-----cccCCcccccccccC---ceEEEEeeecccCCCCcccCceEEEEehHHHHHHHH
Q 006130 594 N-SAYPVMLETTA-----AVHPGGSGGAVVNLD---GHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFE 656 (660)
Q Consensus 594 ~-~~~~~~IqTda-----~v~~G~SGGPL~Ns~---G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~il~ 656 (660)
. .....++++.. ..+.|+|||||+-.. ..++||+++.....+....|.. ...+....++++
T Consensus 182 ~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~v--yt~V~~y~~WI~ 251 (256)
T KOG3627|consen 182 LGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGV--YTRVSSYLDWIK 251 (256)
T ss_pred ccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeE--EeEhHHhHHHHH
Confidence 1 11234577652 357899999999654 6999999998764333234555 333444444443
No 22
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.14 E-value=4.4e-05 Score=78.43 Aligned_cols=103 Identities=16% Similarity=0.269 Sum_probs=50.6
Q ss_pred CCCceEEEEEccCCCCcceee--cCCCCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCc
Q 006130 520 GPLDVSLLQLGYIPDQLCPID--ADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAY 597 (660)
Q Consensus 520 ~~~DLALLqL~~~p~~l~pi~--l~~s~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~ 597 (660)
+..||.++|+.. +++|.+ +....|+.+|.|+++|. .+... -..-.||.....- . ...
T Consensus 80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~-----~fq~k--~~~s~vSesS~i~---------p--~~~ 138 (235)
T PF00863_consen 80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGS-----NFQEK--SISSTVSESSWIY---------P--EEN 138 (235)
T ss_dssp TCSSEEEEE--T---TS----S---B----TT-EEEEEEE-----ECSSC--CCEEEEEEEEEEE---------E--ETT
T ss_pred CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEE-----EEEcC--CeeEEECCceEEe---------e--cCC
Confidence 469999999985 666654 44557999999999998 33211 1111222211100 0 123
Q ss_pred CcEEEEcccccCCccccccccc-CceEEEEeeecccCCCCcccCceEEEEehH
Q 006130 598 PVMLETTAAVHPGGSGGAVVNL-DGHMIGLVTSNARHGGGTVIPHLNFSIPCA 649 (660)
Q Consensus 598 ~~~IqTda~v~~G~SGGPL~Ns-~G~VVGIvssna~~~~g~~ip~lnFaIPi~ 649 (660)
..+..+-..+..|+-|.||++. +|++|||.+...... ..||..|+.
T Consensus 139 ~~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~------~~N~F~~f~ 185 (235)
T PF00863_consen 139 SHFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTS------SRNYFTPFP 185 (235)
T ss_dssp TTEEEE-C---TT-TT-EEEETTT--EEEEEEEEETTT------SSEEEEE--
T ss_pred CCeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCCC------CeEEEEcCC
Confidence 4567777788899999999985 799999998765432 347777653
No 23
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.80 E-value=1.1e-05 Score=72.28 Aligned_cols=24 Identities=46% Similarity=0.855 Sum_probs=22.5
Q ss_pred eecccCCCCcCcceecCCccEEEE
Q 006130 284 ADIRCLPGMEGGPVFGEHAHFVGI 307 (660)
Q Consensus 284 tDa~~~pG~sGGpl~~~~g~liGi 307 (660)
+|+.+.||+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 24
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.10 E-value=0.0038 Score=67.21 Aligned_cols=37 Identities=32% Similarity=0.587 Sum_probs=29.1
Q ss_pred ccccCCccccccccc--CceE-EEEeeecccCCCCcccCc
Q 006130 605 AAVHPGGSGGAVVNL--DGHM-IGLVTSNARHGGGTVIPH 641 (660)
Q Consensus 605 a~v~~G~SGGPL~Ns--~G~V-VGIvssna~~~~g~~ip~ 641 (660)
...|.|+||||+|-. +|++ +||++++....++..+|+
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~g 262 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPG 262 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcce
Confidence 356789999999943 4777 999999988777666655
No 25
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.75 E-value=0.078 Score=55.57 Aligned_cols=110 Identities=15% Similarity=0.098 Sum_probs=63.7
Q ss_pred CCCceEEEEEccC-CCCcceeecCCC--CCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCC
Q 006130 520 GPLDVSLLQLGYI-PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 596 (660)
Q Consensus 520 ~~~DLALLqL~~~-p~~l~pi~l~~s--~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~ 596 (660)
..++++||+++.. .....|+=++++ ....|+.+.+.|+. .. ..+....+.-... ..
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~-----~~--~~~~~~~~~i~~~--------------~~ 217 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN-----ST--GKLKHRKLKITNC--------------TK 217 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC-----CC--CeEEEEEEEEEEe--------------ec
Confidence 5789999999973 134444444443 46789999988871 11 1122222221111 01
Q ss_pred cCcEEEEcccccCCccccccc---ccCceEEEEeeecccCCCCcccCceEEEEehHHHHHH
Q 006130 597 YPVMLETTAAVHPGGSGGAVV---NLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPI 654 (660)
Q Consensus 597 ~~~~IqTda~v~~G~SGGPL~---Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~~i 654 (660)
....+.+....+.|++|||++ |..-.||||.+.+...... ...+.+.+..+++-
T Consensus 218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~----~~~~f~~v~~~~~~ 274 (282)
T PF03761_consen 218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNK----NNSYFFNVSWYQDE 274 (282)
T ss_pred cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccc----cccEEEEHHHhhhh
Confidence 233455556677899999998 3345789998776543221 13566666666543
No 26
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.61 E-value=0.023 Score=55.69 Aligned_cols=113 Identities=18% Similarity=0.148 Sum_probs=69.6
Q ss_pred cccEEEEEEcCC---CCCCCccccCC---CCCCCCeEEEEeCCCCCCCC-cccccceEEEEEeee--cC--CCCCCCCeE
Q 006130 214 TSRVAILGVSSY---LKDLPNIALTP---LNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLL 282 (660)
Q Consensus 214 ~td~A~l~i~~~---~~~~~~~~~s~---~~~~G~~v~aigsPFg~~sp-~~f~n~vs~GiVs~~--~~--~~~~~~~~i 282 (660)
..|+||||++.. .....++.+.. .++.|+.+.++|.+...... .--....+..+++.. .. ........+
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 459999999854 12223333332 45899999999988853211 001122334444432 11 011235667
Q ss_pred Eeec----ccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHH
Q 006130 283 MADI----RCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWE 327 (660)
Q Consensus 283 ~tDa----~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~ 327 (660)
.++. ...+|+|||||++.++.||||++.. ..+ ......+..+..
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~ 214 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVS 214 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGG
T ss_pred cccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHH
Confidence 7776 7889999999999999999999988 333 332345555554
No 27
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=96.60 E-value=0.011 Score=61.47 Aligned_cols=78 Identities=24% Similarity=0.312 Sum_probs=42.3
Q ss_pred CCceEEEEEccCCCCcceeecCCCCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeeeeeccCCCCCccccccCCCcCcE
Q 006130 521 PLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVM 600 (660)
Q Consensus 521 ~~DLALLqL~~~p~~l~pi~l~~s~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v~~~~~~~~~~~~~~~~~~~~~~ 600 (660)
.-|+|.-.++..+...+.+++... -.|..-... ..-+..|.|..-. .
T Consensus 155 ~GDfA~~~~~~~~G~~P~~k~a~~--~~GrAyW~t-----------~tGvE~G~ig~~~--------------------~ 201 (297)
T PF05579_consen 155 NGDFAEADITNWPGAAPKYKFAQN--YTGRAYWLT-----------STGVEPGFIGGGG--------------------A 201 (297)
T ss_dssp ETTEEEEEETTS-S---B--B-TT---SEEEEEEE-----------TTEEEEEEEETTE--------------------E
T ss_pred cCcEEEEECCCCCCCCCceeecCC--cccceEEEc-----------ccCcccceecCce--------------------E
Confidence 378888888665666777766522 222211111 1235667665321 1
Q ss_pred EEEcccccCCcccccccccCceEEEEeeecccCC
Q 006130 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 634 (660)
Q Consensus 601 IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~ 634 (660)
++ -..+|+||+|++..+|.+|||.+...+.+
T Consensus 202 ~~---fT~~GDSGSPVVt~dg~liGVHTGSn~~G 232 (297)
T PF05579_consen 202 VC---FTGPGDSGSPVVTEDGDLIGVHTGSNKRG 232 (297)
T ss_dssp EE---SS-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred EE---EcCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence 33 34689999999999999999999876643
No 28
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=94.96 E-value=0.47 Score=55.22 Aligned_cols=144 Identities=15% Similarity=0.162 Sum_probs=79.6
Q ss_pred EEEEEcCCCcceeEeeEEEEecCCCCceEEEEEccCCCCcceeecCCCCCCCCCeEEEEecCCCCCCCCCC-----Ceee
Q 006130 497 IRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLS-----PSVS 571 (660)
Q Consensus 497 I~Vr~~~~~~~~w~~A~VV~v~d~~~DLALLqL~~~p~~l~pi~l~~s~~~~Gd~V~vIGygl~g~~~gl~-----psvt 571 (660)
++|+.++... ..|.+.+ -++...+|.+|-+. .....+++.+..+..||++...|+ ...+. -+++
T Consensus 578 ~~vt~~dS~~---i~a~~~f-L~~t~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~-----~~~~r~ltaktsv~ 646 (955)
T KOG1421|consen 578 QRVTEADSDG---IPANVSF-LHPTENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGF-----TEDLRALTAKTSVT 646 (955)
T ss_pred eEEeeccccc---ccceeeE-ecCccceeEeccCh--hHhhhhccceeeEecCCceeEecc-----cccchhhcccceee
Confidence 5666666665 6788877 56667888888774 334555677778899999999999 32221 2233
Q ss_pred eeEEeeeeeccCCCCCccccccCCCcCcEEEEcccc-cCCcccccccccCceEEEEeeecccCCCCcccCceEEEEehHH
Q 006130 572 SGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAV-HPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAV 650 (660)
Q Consensus 572 ~GiVS~v~~~~~~~~~~~~~~~~~~~~~~IqTda~v-~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~ 650 (660)
.-.+-.+-....+.+ ...+ -..|...+.+ ..+.| |-+.|.+|+++|+=-+-....-+.+=-..-|.+-+.+
T Consensus 647 dvs~~~~ps~~~pr~------r~~n-~e~Is~~~nlsT~c~s-g~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~ 718 (955)
T KOG1421|consen 647 DVSVVIIPSSVMPRF------RATN-LEVISFMDNLSTSCLS-GRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSY 718 (955)
T ss_pred eeEEEEecCCCCcce------eecc-eEEEEEeccccccccc-eEEECCCCeEEEEEeeeeccccCCceeEEEeccchHH
Confidence 221111111111111 1111 1123332222 23444 5677889999999655544332222113455666677
Q ss_pred HHHHHHHhc
Q 006130 651 LRPIFEFAR 659 (660)
Q Consensus 651 l~~il~~a~ 659 (660)
+.+.++.+|
T Consensus 719 ~l~vl~rlk 727 (955)
T KOG1421|consen 719 ILPVLERLK 727 (955)
T ss_pred HHHHHHHHh
Confidence 788777765
No 29
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=94.94 E-value=0.42 Score=47.21 Aligned_cols=35 Identities=29% Similarity=0.526 Sum_probs=29.2
Q ss_pred CcCcEEEEcccccCCccccccccc---CceEEEEeeec
Q 006130 596 AYPVMLETTAAVHPGGSGGAVVNL---DGHMIGLVTSN 630 (660)
Q Consensus 596 ~~~~~IqTda~v~~G~SGGPL~Ns---~G~VVGIvssn 630 (660)
..+.++.+.++..+|+.||||+.. .++++||..++
T Consensus 133 ~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 133 TTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred EeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 346688899999999999999942 58999998874
No 30
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.58 E-value=0.062 Score=63.68 Aligned_cols=64 Identities=20% Similarity=0.298 Sum_probs=44.6
Q ss_pred CCcCcEEEEcccccCCcccccccccCceEEEEeeecccCCCCccc---Cc--eEEEEehHHHHHHHHHh
Q 006130 595 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVI---PH--LNFSIPCAVLRPIFEFA 658 (660)
Q Consensus 595 ~~~~~~IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~i---p~--lnFaIPi~~l~~il~~a 658 (660)
...+.-+.++.-+.+||||+||+|.+|+|||++.-..-..-...+ |. -+.++-+..+..+++..
T Consensus 618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv 686 (698)
T PF10459_consen 618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV 686 (698)
T ss_pred CCeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence 345666778889999999999999999999998765443211110 22 35566677777777554
No 31
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=94.13 E-value=0.26 Score=51.55 Aligned_cols=76 Identities=22% Similarity=0.223 Sum_probs=59.3
Q ss_pred cccCCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCCCCCCCeEEeecccCCCCcCcceecCCccEEEEEeee
Q 006130 232 IALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (660)
Q Consensus 232 ~~~s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~~~~~~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~ 311 (660)
.......+.+|.|..+|.|=.- |..+....+.+-|-.... .+++-|+...||+||.||++.+.++||+.+..
T Consensus 152 ~~~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~------~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g 223 (251)
T COG3591 152 RNTASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKG------NKLFYDADTLPGSSGSPVLISKDEVIGVHYNG 223 (251)
T ss_pred cccccccccCceeEEEeccCCC--CcceeEeeecceeEEEec------ceEEEEecccCCCCCCceEecCceEEEEEecC
Confidence 3445678999999999999874 334455556665554432 36888999999999999999999999999999
Q ss_pred cccc
Q 006130 312 LRQK 315 (660)
Q Consensus 312 l~~~ 315 (660)
....
T Consensus 224 ~~~~ 227 (251)
T COG3591 224 PGAN 227 (251)
T ss_pred CCcc
Confidence 8865
No 32
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=92.83 E-value=0.46 Score=55.30 Aligned_cols=121 Identities=14% Similarity=0.181 Sum_probs=73.8
Q ss_pred CCCCceEEEEEccC-------CCCc------ceeecCC-------CCCCCCCeEEEEecCCCCCCCCCCCeeeeeEEeee
Q 006130 519 KGPLDVSLLQLGYI-------PDQL------CPIDADF-------GQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKV 578 (660)
Q Consensus 519 d~~~DLALLqL~~~-------p~~l------~pi~l~~-------s~~~~Gd~V~vIGygl~g~~~gl~psvt~GiVS~v 578 (660)
+.-.|+||++++.. .+++ +.+.+.. ..+.+|..|+=+|- ..| .|.|.++.+
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~Gr-----TTg----yT~G~lNg~ 610 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGR-----TTG----YTTGILNGI 610 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecc-----cCC----ccceEecce
Confidence 34469999999851 1111 1222221 23578999999997 455 577777765
Q ss_pred eeccCCCCCccccccCCCcCcEEEEc----ccccCCcccccccccCce------EEEEeeecccCCCCcccCceEEEEeh
Q 006130 579 VKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDGH------MIGLVTSNARHGGGTVIPHLNFSIPC 648 (660)
Q Consensus 579 ~~~~~~~~~~~~~~~~~~~~~~IqTd----a~v~~G~SGGPL~Ns~G~------VVGIvssna~~~~g~~ip~lnFaIPi 648 (660)
.-.. +..+......++..+ +-..+|+||.=|++.-+. |+||..+.-... + .|++..|+
T Consensus 611 klvy-------w~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~---k--qfglftPi 678 (695)
T PF08192_consen 611 KLVY-------WADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQ---K--QFGLFTPI 678 (695)
T ss_pred EEEE-------ecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCcc---c--eeeccCcH
Confidence 3211 111111112233333 345579999999986444 999998863322 2 58899999
Q ss_pred HHHHHHHHHhcC
Q 006130 649 AVLRPIFEFARG 660 (660)
Q Consensus 649 ~~l~~il~~a~g 660 (660)
..|.+-|+...|
T Consensus 679 ~~il~rl~~vT~ 690 (695)
T PF08192_consen 679 NEILDRLEEVTG 690 (695)
T ss_pred HHHHHHHHHhhc
Confidence 999888876544
No 33
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=91.37 E-value=0.14 Score=48.45 Aligned_cols=45 Identities=27% Similarity=0.524 Sum_probs=36.3
Q ss_pred eecccCCCCcCcceecCCccEEEEEeeecccc-CCcceEEEEeHHHH
Q 006130 284 ADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAI 329 (660)
Q Consensus 284 tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~~-~~~~i~f~ip~~~i 329 (660)
.-+..+-|+|||||+...|++|||..+-++.. --..+.|+ ||+.+
T Consensus 101 ~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 101 RPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp EEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred ceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 45668899999999999999999999999877 34478887 99865
No 34
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=90.30 E-value=0.2 Score=59.53 Aligned_cols=29 Identities=21% Similarity=0.425 Sum_probs=26.8
Q ss_pred eEEeecccCCCCcCcceecCCccEEEEEe
Q 006130 281 LLMADIRCLPGMEGGPVFGEHAHFVGILI 309 (660)
Q Consensus 281 ~i~tDa~~~pG~sGGpl~~~~g~liGi~~ 309 (660)
-++|+..|--||||.||+|.+|||||++.
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~F 651 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAF 651 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEee
Confidence 37899999999999999999999999987
No 35
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=90.16 E-value=0.23 Score=47.06 Aligned_cols=35 Identities=20% Similarity=0.418 Sum_probs=25.2
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeeeccc
Q 006130 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQ 314 (660)
Q Consensus 280 ~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l~~ 314 (660)
.+...|..+.+|+||.|+||.+|++|||--.-+.-
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 46677888999999999999999999998776654
No 36
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=89.63 E-value=0.33 Score=46.02 Aligned_cols=44 Identities=25% Similarity=0.484 Sum_probs=30.4
Q ss_pred cccCCcccccccccCceEEEEeeecccCCCCcccCceEEEEehHHHH
Q 006130 606 AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 652 (660)
Q Consensus 606 ~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~ 652 (660)
+...|+|||||+..+|++|||-.+.....+-.+ .+-|. |++.+.
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak--~i~f~-P~e~l~ 147 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAK--AIDFI-PVETLP 147 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEE--EEEEE-EHHHHH
T ss_pred EEEecCCCCcccCCCCCEEEEEEEEEEcCCcee--eEEEE-eeeecC
Confidence 345799999999999999999877644333223 56676 887653
No 37
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=89.43 E-value=0.27 Score=46.61 Aligned_cols=29 Identities=28% Similarity=0.571 Sum_probs=21.6
Q ss_pred cccCCcccccccccCceEEEEeeecccCC
Q 006130 606 AVHPGGSGGAVVNLDGHMIGLVTSNARHG 634 (660)
Q Consensus 606 ~v~~G~SGGPL~Ns~G~VVGIvssna~~~ 634 (660)
-..+|+||+|+||.+|++|||-.......
T Consensus 93 d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 93 DFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp -S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred ccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 36789999999999999999976665544
No 38
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=86.48 E-value=4.6 Score=39.78 Aligned_cols=100 Identities=17% Similarity=0.141 Sum_probs=52.0
Q ss_pred CcccEEEEEEcCCC---CCCCccccC---CCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCC-----C----CC
Q 006130 213 STSRVAILGVSSYL---KDLPNIALT---PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-----R----ST 277 (660)
Q Consensus 213 ~~td~A~l~i~~~~---~~~~~~~~s---~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~-----~----~~ 277 (660)
...|+||||++... ....++.+. ..+..|+.+.+.|..-...+...+...+....+.-..+. . ..
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~ 166 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAI 166 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhcccccc
Confidence 34699999998431 112233332 257778999998854432111112222222222221110 0 00
Q ss_pred CCCeE---E--eecccCCCCcCcceecCCc--cEEEEEeeec
Q 006130 278 TRSLL---M--ADIRCLPGMEGGPVFGEHA--HFVGILIRPL 312 (660)
Q Consensus 278 ~~~~i---~--tDa~~~pG~sGGpl~~~~g--~liGi~~~~l 312 (660)
....+ . .+...-+|.+||||+...+ .|+||++..-
T Consensus 167 ~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 167 TDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 00111 0 1344557999999998765 7999988764
No 39
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=85.98 E-value=0.8 Score=43.46 Aligned_cols=34 Identities=26% Similarity=0.487 Sum_probs=27.3
Q ss_pred EEEcccccCCcccccccccCceEEEEeeecccCC
Q 006130 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 634 (660)
Q Consensus 601 IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~ 634 (660)
..-+..-.+|+||-|++|..|+||||+-..+...
T Consensus 97 tip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG 130 (158)
T PF00944_consen 97 TIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEG 130 (158)
T ss_dssp EEETTS-STTSTTEEEESTTSBEEEEEEEEEEET
T ss_pred EeccCCCCCCCCCCccCcCCCCEEEEEecCCCCC
Confidence 3445677899999999999999999998876653
No 40
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=85.12 E-value=6.3 Score=38.60 Aligned_cols=99 Identities=19% Similarity=0.206 Sum_probs=51.0
Q ss_pred cccEEEEEEcCCCC---CCCcccc--CC-CCCCCCeEEEEeCCCCCCC--CcccccceEEEEEeee--cCCC----CCCC
Q 006130 214 TSRVAILGVSSYLK---DLPNIAL--TP-LNKRGDLLLAVGSPFGVLS--PMHFFNSVSMGSVANC--YPPR----STTR 279 (660)
Q Consensus 214 ~td~A~l~i~~~~~---~~~~~~~--s~-~~~~G~~v~aigsPFg~~s--p~~f~n~vs~GiVs~~--~~~~----~~~~ 279 (660)
..||||||++.... ...++.+ .. .+..|+.+.+.|....... ...-......-+++.. .... ....
T Consensus 88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~ 167 (232)
T cd00190 88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITD 167 (232)
T ss_pred cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCC
Confidence 36999999984321 1223333 21 5778899999986443211 0001112222222221 0000 0011
Q ss_pred CeEEe-----ecccCCCCcCcceecCC---ccEEEEEeeec
Q 006130 280 SLLMA-----DIRCLPGMEGGPVFGEH---AHFVGILIRPL 312 (660)
Q Consensus 280 ~~i~t-----Da~~~pG~sGGpl~~~~---g~liGi~~~~l 312 (660)
..+-+ +...-+|.|||||+... ..|+||++...
T Consensus 168 ~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 168 NMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred ceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 11211 33455799999999886 56999988654
No 41
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=84.95 E-value=4.4 Score=42.16 Aligned_cols=106 Identities=22% Similarity=0.228 Sum_probs=51.3
Q ss_pred cccEEEEEEcCCCCCCCcccc---CCCCCCCCeEEEEeCCCCCCCCcccccceEEEEEeeecCCCCCCCCeEEeecccCC
Q 006130 214 TSRVAILGVSSYLKDLPNIAL---TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLP 290 (660)
Q Consensus 214 ~td~A~l~i~~~~~~~~~~~~---s~~~~~G~~v~aigsPFg~~sp~~f~n~vs~GiVs~~~~~~~~~~~~i~tDa~~~p 290 (660)
..||.++|... ++|++.- -..++.||.|..||+=|--.+ ...+||. -|.+.+ .....|..--+.-.+
T Consensus 81 ~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~---~~s~vSe--sS~i~p--~~~~~fWkHwIsTk~ 150 (235)
T PF00863_consen 81 GRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS---ISSTVSE--SSWIYP--EENSHFWKHWISTKD 150 (235)
T ss_dssp CSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC---CEEEEEE--EEEEEE--ETTTTEEEE-C---T
T ss_pred CccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC---eeEEECC--ceEEee--cCCCCeeEEEecCCC
Confidence 46999999974 4454443 347899999999998775211 1122222 222332 123578888889999
Q ss_pred CCcCcceecC-CccEEEEEeeeccccCCcceEEEEeHHHHH
Q 006130 291 GMEGGPVFGE-HAHFVGILIRPLRQKSGAEIQLVIPWEAIA 330 (660)
Q Consensus 291 G~sGGpl~~~-~g~liGi~~~~l~~~~~~~i~f~ip~~~i~ 330 (660)
|+.|.||++. +|.+|||-...-.. .+...--++|=+...
T Consensus 151 G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~~~f~~ 190 (235)
T PF00863_consen 151 GDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFPDDFEE 190 (235)
T ss_dssp T-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--TTHHH
T ss_pred CccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCCHHHHH
Confidence 9999999987 78899998743322 223333344544443
No 42
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=84.62 E-value=1.8 Score=40.79 Aligned_cols=31 Identities=29% Similarity=0.461 Sum_probs=24.4
Q ss_pred CeEEeecccCCCCcCcceecCCccEEEEEeee
Q 006130 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (660)
Q Consensus 280 ~~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~ 311 (660)
.+++.--.+.||..||+|..++| ||||+|+-
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Tag 109 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCKHG-VIGIVTAG 109 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEETTC-EEEEEEEE
T ss_pred CceeecccCCCCCCCceeEeCCC-eEEEEEeC
Confidence 35556678999999999998887 99999975
No 43
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=83.71 E-value=6 Score=41.36 Aligned_cols=33 Identities=30% Similarity=0.506 Sum_probs=28.0
Q ss_pred CceEEEEeCCCeEEEEEEEeCCCEEEEcccccCC
Q 006130 395 ASVCLITIDDGVWASGVLLNDQGLILTNAHLLEP 428 (660)
Q Consensus 395 ~SVV~I~~~~~~wGSGflI~~~GlILTNaHVVep 428 (660)
|-...|..++.-||+|++|+++ |||++..|+..
T Consensus 17 PWlA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~ 49 (267)
T PF09342_consen 17 PWLADIYVDGRYWCSGVLLDPH-WLLVSSSCLRG 49 (267)
T ss_pred cceeeEEEcCeEEEEEEEeccc-eEEEeccccCC
Confidence 4455677777889999999998 99999999974
No 44
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=82.67 E-value=0.92 Score=46.36 Aligned_cols=45 Identities=31% Similarity=0.477 Sum_probs=33.9
Q ss_pred EEEcccccCCcccccccccCceEEEEeeecccCCCCcccCceEEEEehHHH
Q 006130 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVL 651 (660)
Q Consensus 601 IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l 651 (660)
+..+.-+..|+||+|++ .+|++||=++-..-+. |..+|.||++..
T Consensus 171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~d-----p~~Gygi~ie~M 215 (218)
T PF05580_consen 171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVND-----PTKGYGIFIEWM 215 (218)
T ss_pred hhhhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCceeeecHHHH
Confidence 33344567799999999 5999999998876432 457899998753
No 45
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=80.34 E-value=12 Score=44.10 Aligned_cols=112 Identities=18% Similarity=0.198 Sum_probs=71.8
Q ss_pred ccCCCcccEEEEEEcCCC-------------CCCCccccC--------CCCCCCCeEEEEeCCCCCCCCcccccceEEEE
Q 006130 209 LMSKSTSRVAILGVSSYL-------------KDLPNIALT--------PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGS 267 (660)
Q Consensus 209 ~~~~~~td~A~l~i~~~~-------------~~~~~~~~s--------~~~~~G~~v~aigsPFg~~sp~~f~n~vs~Gi 267 (660)
++.+.++|+|++|++... ..-|.+.+. ..++.|.+|+=+|.==|+ |.|+
T Consensus 537 ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G~ 606 (695)
T PF08192_consen 537 IINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTGI 606 (695)
T ss_pred hhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cceE
Confidence 455777899999997432 112222221 257789999999987775 4666
Q ss_pred Eeeec----CCCC-CCCCeEEee----cccCCCCcCcceecCCcc------EEEEEeeeccccCCcceEEEEeHHHHHHH
Q 006130 268 VANCY----PPRS-TTRSLLMAD----IRCLPGMEGGPVFGEHAH------FVGILIRPLRQKSGAEIQLVIPWEAIATA 332 (660)
Q Consensus 268 Vs~~~----~~~~-~~~~~i~tD----a~~~pG~sGGpl~~~~g~------liGi~~~~l~~~~~~~i~f~ip~~~i~~~ 332 (660)
|.+.. .++. ....|++.. +=..+|.||.=|+++-++ |+||+-+.=. ....|++..||..|..=
T Consensus 607 lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~r 684 (695)
T PF08192_consen 607 LNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILDR 684 (695)
T ss_pred ecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHHH
Confidence 65441 1111 113445444 446779999999997444 9999886533 33467889999988643
No 46
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is active only while part of the polyprotein; cleavage at a Trp-Ser bond results in total loss of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=74.96 E-value=3.6 Score=39.17 Aligned_cols=32 Identities=22% Similarity=0.400 Sum_probs=26.3
Q ss_pred eEEeecccCCCCcCcceecCCccEEEEEeeec
Q 006130 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRPL 312 (660)
Q Consensus 281 ~i~tDa~~~pG~sGGpl~~~~g~liGi~~~~l 312 (660)
|.+--..-.||.||-|+||..|++|||+++--
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 44555667899999999999999999999764
No 47
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=61.51 E-value=11 Score=35.54 Aligned_cols=33 Identities=27% Similarity=0.476 Sum_probs=23.8
Q ss_pred cEEEEcccccCCcccccccccCceEEEEeeeccc
Q 006130 599 VMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNAR 632 (660)
Q Consensus 599 ~~IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~ 632 (660)
.++....+..||+.||+|+-.. -||||+|++-.
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~H-GViGi~Tagg~ 111 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCKH-GVIGIVTAGGE 111 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEETT-CEEEEEEEEET
T ss_pred CceeecccCCCCCCCceeEeCC-CeEEEEEeCCC
Confidence 3445556899999999999655 57999999733
No 48
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=60.83 E-value=3.2 Score=40.28 Aligned_cols=43 Identities=28% Similarity=0.250 Sum_probs=30.7
Q ss_pred chhhhcccccceeccCcee---eeeeeeecccccC--ChhhhhhccCC
Q 006130 25 KGLKMRRHAFHQYNSGKTT---LSASGMLLPLSFF--DTKVAERNWGV 67 (660)
Q Consensus 25 ~~~~~~~~~~~~~~~~~~~---~sas~~~~p~~~~--~~~~~~~~~~~ 67 (660)
+||+-++|+||.|+.|.+| .||-.-.=|.+.. .+.+..|.+++
T Consensus 37 ~GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd 84 (154)
T KOG0441|consen 37 TGLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD 84 (154)
T ss_pred ecCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence 3444499999999999998 6775555555555 36666666666
No 49
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=57.85 E-value=32 Score=31.52 Aligned_cols=18 Identities=22% Similarity=0.340 Sum_probs=14.9
Q ss_pred EEEEEeCCCEEEEcccccC
Q 006130 409 SGVLLNDQGLILTNAHLLE 427 (660)
Q Consensus 409 SGflI~~~GlILTNaHVVe 427 (660)
=++-|.+ |..+|+.||++
T Consensus 2 ~avHIGn-G~~vt~tHva~ 19 (105)
T PF03510_consen 2 WAVHIGN-GRYVTVTHVAK 19 (105)
T ss_pred ceEEeCC-CEEEEEEEEec
Confidence 3677776 89999999997
No 50
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=53.74 E-value=7.9 Score=42.66 Aligned_cols=23 Identities=26% Similarity=0.569 Sum_probs=20.7
Q ss_pred cccCCcccccccccCceEEEEee
Q 006130 606 AVHPGGSGGAVVNLDGHMIGLVT 628 (660)
Q Consensus 606 ~v~~G~SGGPL~Ns~G~VVGIvs 628 (660)
...+|+||+.|+|.+|++|||-.
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 66789999999999999999963
No 51
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=50.67 E-value=11 Score=42.29 Aligned_cols=46 Identities=26% Similarity=0.423 Sum_probs=33.1
Q ss_pred EEEcccccCCcccccccccCceEEEEeeecccCCCCcccCceEEEEehHHHH
Q 006130 601 LETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 652 (660)
Q Consensus 601 IqTda~v~~G~SGGPL~Ns~G~VVGIvssna~~~~g~~ip~lnFaIPi~~l~ 652 (660)
+.-+.-+..|+||+|++ .+|++||=++=..-++ |..+|.|-++.-.
T Consensus 351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvnd-----pt~GYGi~ie~Ml 396 (402)
T TIGR02860 351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVND-----PTSGYGVYIEWML 396 (402)
T ss_pred hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCcceeehHHHHH
Confidence 33344567799999999 6999999877665433 3457888777643
No 52
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=48.81 E-value=12 Score=39.68 Aligned_cols=25 Identities=28% Similarity=0.496 Sum_probs=20.2
Q ss_pred cCCCCcCcceecCCccEEEEEeeec
Q 006130 288 CLPGMEGGPVFGEHAHFVGILIRPL 312 (660)
Q Consensus 288 ~~pG~sGGpl~~~~g~liGi~~~~l 312 (660)
-.||.||.||+..+|.+||+-++.=
T Consensus 205 T~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 205 TGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp S-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred cCCCCCCCccCcCCCCEEEEEecCC
Confidence 4699999999999999999999874
No 53
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=42.42 E-value=56 Score=32.22 Aligned_cols=90 Identities=21% Similarity=0.330 Sum_probs=51.0
Q ss_pred cccEEEEEEcCCCCCCCcccc--CC-CCCCCCeEEEEeCC-CCCCCCccc-ccce-EEEEEeeecCCCCCCCCeEEeecc
Q 006130 214 TSRVAILGVSSYLKDLPNIAL--TP-LNKRGDLLLAVGSP-FGVLSPMHF-FNSV-SMGSVANCYPPRSTTRSLLMADIR 287 (660)
Q Consensus 214 ~td~A~l~i~~~~~~~~~~~~--s~-~~~~G~~v~aigsP-Fg~~sp~~f-~n~v-s~GiVs~~~~~~~~~~~~i~tDa~ 287 (660)
.+|+++++++.. ...+.|.- .+ .-...+.++++-++ |+- .++ ...+ ..|.| +... ......+.=+++
T Consensus 71 ~~Dl~~v~l~~~-~kfrDIrk~~~~~~~~~~~~~l~v~~~~~~~---~~~~v~~v~~~~~i-~~~g--~~~~~~~~Y~~~ 143 (172)
T PF00548_consen 71 DTDLTLVKLPRN-PKFRDIRKFFPESIPEYPECVLLVNSTKFPR---MIVEVGFVTNFGFI-NLSG--TTTPRSLKYKAP 143 (172)
T ss_dssp EEEEEEEEEESS-S-B--GGGGSBSSGGTEEEEEEEEESSSSTC---EEEEEEEEEEEEEE-EETT--EEEEEEEEEESE
T ss_pred ceeEEEEEccCC-cccCchhhhhccccccCCCcEEEEECCCCcc---EEEEEEEEeecCcc-ccCC--CEeeEEEEEccC
Confidence 479999999742 22222221 11 22456666666654 542 111 1112 33444 3321 223446777888
Q ss_pred cCCCCcCcceecC---CccEEEEEee
Q 006130 288 CLPGMEGGPVFGE---HAHFVGILIR 310 (660)
Q Consensus 288 ~~pG~sGGpl~~~---~g~liGi~~~ 310 (660)
--+|+.||||+.. .+.+|||=.|
T Consensus 144 t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 144 TKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp EETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred CCCCccCCeEEEeeccCccEEEEEec
Confidence 8899999999975 3469999765
No 54
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=30.34 E-value=3.3e+02 Score=28.32 Aligned_cols=92 Identities=16% Similarity=0.185 Sum_probs=52.8
Q ss_pred CcccEEEEEEcCC---CCCCCccccCC-CCCCCCeEEEEeC-CCCCCCCcccccceEEEEEeeecCCCCCCCCeEEeecc
Q 006130 213 STSRVAILGVSSY---LKDLPNIALTP-LNKRGDLLLAVGS-PFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIR 287 (660)
Q Consensus 213 ~~td~A~l~i~~~---~~~~~~~~~s~-~~~~G~~v~aigs-PFg~~sp~~f~n~vs~GiVs~~~~~~~~~~~~i~tDa~ 287 (660)
...+++||+++.. ....+.+..++ .+..||.+-+-|- .-+ .++...+. |..... ....+.++-.
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~----~~~~~~~~---i~~~~~----~~~~~~~~~~ 227 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG----KLKHRKLK---ITNCTK----CAYSICTKQY 227 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCCC----eEEEEEEE---EEEeec----cceeEecccc
Confidence 3457889999854 34455566543 4778898887554 111 11111111 111111 1234556666
Q ss_pred cCCCCcCcceecC-Ccc--EEEEEeeecccc
Q 006130 288 CLPGMEGGPVFGE-HAH--FVGILIRPLRQK 315 (660)
Q Consensus 288 ~~pG~sGGpl~~~-~g~--liGi~~~~l~~~ 315 (660)
.-+|..|||++.. +|+ |||+.+..-...
T Consensus 228 ~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~ 258 (282)
T PF03761_consen 228 SCKGDRGGPLVKNINGRWTLIGVGASGNYEC 258 (282)
T ss_pred cCCCCccCeEEEEECCCEEEEEEEccCCCcc
Confidence 6689999999843 554 999988665443
No 55
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=30.07 E-value=46 Score=34.28 Aligned_cols=34 Identities=15% Similarity=0.414 Sum_probs=25.1
Q ss_pred cCCCCcCcceecCCccEEEEEeeeccccCCcceEE
Q 006130 288 CLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQL 322 (660)
Q Consensus 288 ~~pG~sGGpl~~~~g~liGi~~~~l~~~~~~~i~f 322 (660)
|.-||||.|++- +|+|||-++--|..-...|.+.
T Consensus 177 IvqGMSGSPI~q-dGKLiGAVthvf~~dp~~Gygi 210 (218)
T PF05580_consen 177 IVQGMSGSPIIQ-DGKLIGAVTHVFVNDPTKGYGI 210 (218)
T ss_pred EEecccCCCEEE-CCEEEEEEEEEEecCCCceeee
Confidence 566999999986 8999999998874333334443
No 56
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=27.29 E-value=44 Score=36.86 Aligned_cols=26 Identities=23% Similarity=0.364 Sum_probs=21.5
Q ss_pred ecccCCCCcCcceecCCccEEEEEee
Q 006130 285 DIRCLPGMEGGPVFGEHAHFVGILIR 310 (660)
Q Consensus 285 Da~~~pG~sGGpl~~~~g~liGi~~~ 310 (660)
+...--|.||..|+|.+|++|||..|
T Consensus 349 ~~~l~gGaSGS~V~n~~~~lvGIy~g 374 (374)
T PF01732_consen 349 NYSLGGGASGSMVINQNNELVGIYFG 374 (374)
T ss_pred ccCCCCCCCcCeEECCCCCEEEEeCC
Confidence 34445799999999999999999753
No 57
>PF08208 RNA_polI_A34: DNA-directed RNA polymerase I subunit RPA34.5; InterPro: IPR013240 This is a family of proteins conserved from yeasts to humans. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription [].; PDB: 3NFG_N.
Probab=23.73 E-value=26 Score=35.12 Aligned_cols=13 Identities=62% Similarity=0.950 Sum_probs=0.0
Q ss_pred Ccchhhhcccccc
Q 006130 23 DPKGLKMRRHAFH 35 (660)
Q Consensus 23 ~~~~~~~~~~~~~ 35 (660)
-|+|||||.|+|=
T Consensus 109 qp~gLk~Rf~P~G 121 (198)
T PF08208_consen 109 QPKGLKMRFFPFG 121 (198)
T ss_dssp -------------
T ss_pred CCCCcceeeecCC
Confidence 4899999999984
No 58
>PF00571 CBS: CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=23.46 E-value=59 Score=24.85 Aligned_cols=22 Identities=32% Similarity=0.561 Sum_probs=18.5
Q ss_pred CCcccccccccCceEEEEeeec
Q 006130 609 PGGSGGAVVNLDGHMIGLVTSN 630 (660)
Q Consensus 609 ~G~SGGPL~Ns~G~VVGIvssn 630 (660)
.+-+.-||+|.+|+++|+++..
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~~ 49 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISRS 49 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEHH
T ss_pred cCCcEEEEEecCCEEEEEEEHH
Confidence 4677889999999999998753
No 59
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=21.25 E-value=84 Score=32.51 Aligned_cols=53 Identities=15% Similarity=0.211 Sum_probs=36.0
Q ss_pred EEEEcccccCCccccccc--cc--CceEEEEeeecccCCCCcccCceEEEEe--hHHHHHHHHHh
Q 006130 600 MLETTAAVHPGGSGGAVV--NL--DGHMIGLVTSNARHGGGTVIPHLNFSIP--CAVLRPIFEFA 658 (660)
Q Consensus 600 ~IqTda~v~~G~SGGPL~--Ns--~G~VVGIvssna~~~~g~~ip~lnFaIP--i~~l~~il~~a 658 (660)
-+++......|+-|||++ |. ..+++||.+++.... ..+||=+ -+.|.+.++.+
T Consensus 170 gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~------~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 170 GLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANH------AMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeEECCCcCCCccceeeEcchhhhhhhheeeecccccc------cceehhhhhHHHHHHHHHhh
Confidence 456778889999999998 22 379999999986532 1245544 44566655543
Done!