Query 004596
Match_columns 743
No_of_seqs 359 out of 1843
Neff 5.7
Searched_HMMs 46136
Date Fri Mar 29 01:58:48 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/004596.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/004596hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.1E-28 6.7E-33 274.4 23.6 192 384-661 43-264 (455)
2 TIGR02038 protease_degS peripl 100.0 8.1E-28 1.7E-32 263.1 24.4 192 384-660 48-251 (351)
3 PRK10898 serine endoprotease; 100.0 1.4E-27 3E-32 261.4 23.2 192 384-660 48-252 (353)
4 PRK10942 serine endoprotease; 99.9 1.6E-26 3.4E-31 261.8 23.7 173 403-661 111-285 (473)
5 TIGR02037 degP_htrA_DO peripla 99.9 9.9E-26 2.1E-30 252.6 24.1 173 403-661 58-231 (428)
6 COG0265 DegQ Trypsin-like seri 99.9 2.2E-20 4.7E-25 204.0 20.7 190 385-659 37-244 (347)
7 PRK10139 serine endoprotease; 99.8 3.5E-18 7.6E-23 193.1 14.1 129 213-348 136-274 (455)
8 PRK10942 serine endoprotease; 99.7 3.3E-17 7.1E-22 186.1 13.8 129 214-349 158-296 (473)
9 TIGR02037 degP_htrA_DO peripla 99.6 1.1E-15 2.4E-20 171.7 14.7 130 214-350 104-243 (428)
10 TIGR02038 protease_degS peripl 99.6 1.6E-15 3.5E-20 166.4 14.0 128 213-348 123-262 (351)
11 PRK10898 serine endoprotease; 99.6 1.6E-15 3.5E-20 166.5 13.4 127 214-348 124-263 (353)
12 COG0265 DegQ Trypsin-like seri 99.6 1.6E-14 3.5E-19 157.9 12.4 130 212-348 116-256 (347)
13 PF13365 Trypsin_2: Trypsin-li 99.6 6.3E-14 1.4E-18 127.6 14.0 24 600-623 97-120 (120)
14 KOG1320 Serine protease [Postt 99.5 3.7E-13 8E-18 150.5 13.9 201 386-655 133-350 (473)
15 PF00089 Trypsin: Trypsin; In 99.3 2.8E-10 6E-15 113.4 18.2 124 518-650 86-218 (220)
16 cd00190 Tryp_SPc Trypsin-like 99.2 5.8E-10 1.3E-14 111.8 16.3 109 517-629 87-209 (232)
17 smart00020 Tryp_SPc Trypsin-li 99.0 2E-08 4.3E-13 101.3 19.3 108 517-628 87-208 (229)
18 KOG1421 Predicted signaling-as 98.9 5.8E-09 1.3E-13 118.7 11.7 191 386-658 57-261 (955)
19 KOG1320 Serine protease [Postt 98.8 7.2E-09 1.6E-13 116.5 7.7 125 203-333 211-350 (473)
20 COG3591 V8-like Glu-specific e 98.5 1.6E-06 3.6E-11 90.9 14.6 73 542-633 157-229 (251)
21 KOG3627 Trypsin [Amino acid tr 98.1 0.00029 6.4E-09 73.1 19.5 117 519-639 106-239 (256)
22 PF00863 Peptidase_C4: Peptida 97.9 0.00031 6.7E-09 73.4 15.1 103 517-646 80-185 (235)
23 COG5640 Secreted trypsin-like 97.2 0.0023 5E-08 69.9 11.3 38 602-639 223-263 (413)
24 PF03761 DUF316: Domain of unk 97.0 0.042 9E-07 58.5 18.1 92 517-629 159-256 (282)
25 PF05579 Peptidase_S32: Equine 96.7 0.0046 1E-07 65.2 7.4 77 519-631 156-232 (297)
26 PF10459 Peptidase_S46: Peptid 96.3 0.0078 1.7E-07 72.1 7.5 21 405-425 49-69 (698)
27 PF13365 Trypsin_2: Trypsin-li 94.9 0.03 6.6E-07 50.6 4.0 21 284-304 97-120 (120)
28 PF10459 Peptidase_S46: Peptid 94.6 0.026 5.7E-07 67.7 3.8 65 592-656 618-687 (698)
29 PF00548 Peptidase_C3: 3C cyst 94.1 0.93 2E-05 45.5 12.9 35 593-627 133-170 (172)
30 KOG1421 Predicted signaling-as 92.0 2.7 5.9E-05 49.9 14.2 152 494-660 578-731 (955)
31 PF02907 Peptidase_S29: Hepati 88.8 0.52 1.1E-05 45.4 4.1 44 603-649 104-147 (148)
32 PF08192 Peptidase_S64: Peptid 88.2 3 6.4E-05 49.7 10.5 119 516-655 540-688 (695)
33 PF00949 Peptidase_S7: Peptida 87.8 0.46 1E-05 45.8 3.1 31 601-631 91-121 (132)
34 PF00089 Trypsin: Trypsin; In 86.6 7 0.00015 38.6 11.0 110 214-324 86-214 (220)
35 PF00944 Peptidase_S3: Alphavi 84.9 1.1 2.4E-05 43.3 3.9 35 597-631 96-130 (158)
36 PF05580 Peptidase_S55: SpoIVB 80.8 1.4 3E-05 45.9 3.1 45 597-647 170-214 (218)
37 PF09342 DUF1986: Domain of un 79.9 15 0.00033 39.1 10.4 33 392-425 17-49 (267)
38 PF00947 Pico_P2A: Picornaviru 75.0 3.8 8.2E-05 39.3 4.0 31 596-627 79-109 (127)
39 KOG0441 Cu2+/Zn2+ superoxide d 54.3 4.5 9.7E-05 40.0 0.2 42 26-67 38-84 (154)
40 PF01732 DUF31: Putative pepti 51.8 9.3 0.0002 42.8 2.3 24 602-625 350-373 (374)
41 PF08192 Peptidase_S64: Peptid 50.6 76 0.0017 38.4 9.4 109 209-329 537-684 (695)
42 COG3591 V8-like Glu-specific e 49.8 48 0.001 35.6 7.0 71 234-312 154-227 (251)
43 PF02907 Peptidase_S29: Hepati 49.5 12 0.00025 36.5 2.1 42 284-326 104-146 (148)
44 TIGR02860 spore_IV_B stage IV 47.9 15 0.00032 41.9 3.1 45 597-647 350-394 (402)
45 PF03510 Peptidase_C24: 2C end 40.6 84 0.0018 29.4 6.2 17 407-424 3-19 (105)
46 PF05416 Peptidase_C37: Southa 40.0 89 0.0019 35.9 7.4 37 594-630 483-529 (535)
47 PF00863 Peptidase_C4: Peptida 36.5 85 0.0018 33.4 6.3 81 214-307 81-171 (235)
48 cd00190 Tryp_SPc Trypsin-like 32.7 1.3E+02 0.0029 29.6 6.9 39 214-252 88-132 (232)
49 PF00571 CBS: CBS domain CBS d 25.7 56 0.0012 25.5 2.3 21 606-626 28-48 (57)
50 smart00020 Tryp_SPc Trypsin-li 25.1 4.1E+02 0.0089 26.3 9.0 39 214-252 88-132 (229)
51 PF00949 Peptidase_S7: Peptida 24.4 69 0.0015 31.1 3.0 32 280-311 86-120 (132)
52 PF13267 DUF4058: Protein of u 23.5 59 0.0013 34.9 2.5 26 701-728 124-150 (254)
53 PF00947 Pico_P2A: Picornaviru 21.4 1.2E+02 0.0025 29.4 3.8 26 281-307 80-108 (127)
54 PF12381 Peptidase_C3G: Tungro 20.8 84 0.0018 33.1 2.9 54 596-655 169-228 (231)
55 PF08208 RNA_polI_A34: DNA-dir 20.8 33 0.00071 35.0 0.0 13 23-35 109-121 (198)
56 PF02122 Peptidase_S39: Peptid 20.0 1.1E+02 0.0024 31.8 3.6 49 597-649 137-185 (203)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=99.96 E-value=3.1e-28 Score=274.37 Aligned_cols=192 Identities=29% Similarity=0.522 Sum_probs=158.4
Q ss_pred chHHHhccCceEEEEECC----------------------------CeeEEEEEEeC-CCeEEecccccccccCcceecc
Q 004596 384 PLPIQKALASVCLITIDD----------------------------GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVS 434 (743)
Q Consensus 384 p~~i~~a~~SVV~I~~~~----------------------------~~wGSGfvV~~-~G~ILTNaHVV~p~~~~~t~~~ 434 (743)
..+++++.||||.|.... .++||||+|++ +||||||+||++
T Consensus 43 ~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~---------- 112 (455)
T PRK10139 43 APMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN---------- 112 (455)
T ss_pred HHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC----------
Confidence 357899999999996410 14799999985 799999999997
Q ss_pred CCccccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEe
Q 004596 435 GWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYV 514 (743)
Q Consensus 435 g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v 514 (743)
+ ...|.|++.+++. |+|++++
T Consensus 113 ~-------------------------------------------------------a~~i~V~~~dg~~---~~a~vvg- 133 (455)
T PRK10139 113 Q-------------------------------------------------------AQKISIQLNDGRE---FDAKLIG- 133 (455)
T ss_pred C-------------------------------------------------------CCEEEEEECCCCE---EEEEEEE-
Confidence 2 1248888888876 9999999
Q ss_pred cCCCCcEEEEEEccCCCCccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCC
Q 004596 515 CKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 593 (743)
Q Consensus 515 ~~~~~DLALLkLe~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~ 593 (743)
.|+.+||||||++. +..+++++++++ .+++||+|+++||| +|+..+++.|+||+..+... ....
T Consensus 134 ~D~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P-----~g~~~tvt~GivS~~~r~~~---------~~~~ 198 (455)
T PRK10139 134 SDDQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNP-----FGLGQTATSGIISALGRSGL---------NLEG 198 (455)
T ss_pred EcCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecC-----CCCCCceEEEEEcccccccc---------CCCC
Confidence 56779999999985 457889999886 68999999999995 57778999999998766321 0123
Q ss_pred cCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhcCCc
Q 004596 594 YPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQEV 661 (743)
Q Consensus 594 ~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~~d~ 661 (743)
...+||||+++++|||||||||.+|+||||+++...+.++.. +++|+||++.++++++++...+.+
T Consensus 199 ~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~--gigfaIP~~~~~~v~~~l~~~g~v 264 (455)
T PRK10139 199 LENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSV--GIGFAIPSNMARTLAQQLIDFGEI 264 (455)
T ss_pred cceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCcc--ceEEEEEhHHHHHHHHHHhhcCcc
Confidence 456899999999999999999999999999999877654443 899999999999999999876654
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.96 E-value=8.1e-28 Score=263.09 Aligned_cols=192 Identities=25% Similarity=0.415 Sum_probs=156.2
Q ss_pred chHHHhccCceEEEEECC-----------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCCC
Q 004596 384 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH 452 (743)
Q Consensus 384 p~~i~~a~~SVV~I~~~~-----------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~ 452 (743)
..+++++.||||.|.... .+.||||+|+++||||||+||++ +
T Consensus 48 ~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~----------~----------------- 100 (351)
T TIGR02038 48 NKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIK----------K----------------- 100 (351)
T ss_pred HHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeC----------C-----------------
Confidence 356889999999997621 34799999999999999999996 2
Q ss_pred CcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCC
Q 004596 453 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ 532 (743)
Q Consensus 453 ~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~ 532 (743)
...+.|++.+++. ++|++++ .|+.+||||||++. ..
T Consensus 101 --------------------------------------~~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvlkv~~--~~ 136 (351)
T TIGR02038 101 --------------------------------------ADQIVVALQDGRK---FEAELVG-SDPLTDLAVLKIEG--DN 136 (351)
T ss_pred --------------------------------------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC--CC
Confidence 1247788888766 8999998 66789999999996 35
Q ss_pred ccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCccc
Q 004596 533 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 611 (743)
Q Consensus 533 l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGG 611 (743)
++++++.++ .+++|++|+++||| +++..+++.|+|++..+... .......+||||+++++|||||
T Consensus 137 ~~~~~l~~s~~~~~G~~V~aiG~P-----~~~~~s~t~GiIs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGG 202 (351)
T TIGR02038 137 LPTIPVNLDRPPHVGDVVLAIGNP-----YNLGQTITQGIISATGRNGL---------SSVGRQNFIQTDAAINAGNSGG 202 (351)
T ss_pred CceEeccCcCccCCCCEEEEEeCC-----CCCCCcEEEEEEEeccCccc---------CCCCcceEEEECCccCCCCCcc
Confidence 788888765 68999999999996 56678999999998766321 0012356899999999999999
Q ss_pred ceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhcCC
Q 004596 612 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQE 660 (743)
Q Consensus 612 PL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~~d 660 (743)
||||.+|+||||+++.....++....+++|+||++.++++++++...+.
T Consensus 203 pl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~ 251 (351)
T TIGR02038 203 ALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGR 251 (351)
T ss_pred eEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCc
Confidence 9999999999999987654433334589999999999999999987654
No 3
>PRK10898 serine endoprotease; Provisional
Probab=99.96 E-value=1.4e-27 Score=261.41 Aligned_cols=192 Identities=23% Similarity=0.377 Sum_probs=155.5
Q ss_pred chHHHhccCceEEEEECC-----------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCCC
Q 004596 384 PLPIQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGH 452 (743)
Q Consensus 384 p~~i~~a~~SVV~I~~~~-----------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~ 452 (743)
..+++++.||||.|.... .++||||+|+++||||||+||++ +
T Consensus 48 ~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~----------~----------------- 100 (353)
T PRK10898 48 NQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIN----------D----------------- 100 (353)
T ss_pred HHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeC----------C-----------------
Confidence 356899999999998731 15899999999999999999996 2
Q ss_pred CcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCC
Q 004596 453 TGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ 532 (743)
Q Consensus 453 ~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~ 532 (743)
...|.|++.+++. |+|++++ .|+.+||||||++. ..
T Consensus 101 --------------------------------------a~~i~V~~~dg~~---~~a~vv~-~d~~~DlAvl~v~~--~~ 136 (353)
T PRK10898 101 --------------------------------------ADQIIVALQDGRV---FEALLVG-SDSLTDLAVLKINA--TN 136 (353)
T ss_pred --------------------------------------CCEEEEEeCCCCE---EEEEEEE-EcCCCCEEEEEEcC--CC
Confidence 1247788888766 9999998 56779999999985 35
Q ss_pred ccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCccc
Q 004596 533 LCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 611 (743)
Q Consensus 533 l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGG 611 (743)
+++++++++ .+++|+.|+++||| .++..+++.|+|++..+.... ......+||||+++++|||||
T Consensus 137 l~~~~l~~~~~~~~G~~V~aiG~P-----~g~~~~~t~Giis~~~r~~~~---------~~~~~~~iqtda~i~~GnSGG 202 (353)
T PRK10898 137 LPVIPINPKRVPHIGDVVLAIGNP-----YNLGQTITQGIISATGRIGLS---------PTGRQNFLQTDASINHGNSGG 202 (353)
T ss_pred CCeeeccCcCcCCCCCEEEEEeCC-----CCcCCCcceeEEEeccccccC---------CccccceEEeccccCCCCCcc
Confidence 788888876 58999999999996 566688999999987664210 012246899999999999999
Q ss_pred ceecCCceEEEEEeeeecCCC-CcccCceeEEEehhHHHHHHHHHHhcCC
Q 004596 612 AVVNLDGHMIGLVTSNARHGG-GTVIPHLNFSIPCAVLRPIFEFARDMQE 660 (743)
Q Consensus 612 PL~d~~G~VIGIvtsna~~~g-g~~~p~lnFaIPi~~l~~il~~~~~~~d 660 (743)
||+|.+|+||||+++.....+ +....+++|+||++.++++++++...+.
T Consensus 203 Pl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~ 252 (353)
T PRK10898 203 ALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGR 252 (353)
T ss_pred eEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCc
Confidence 999999999999998765432 2333489999999999999999877555
No 4
>PRK10942 serine endoprotease; Provisional
Probab=99.95 E-value=1.6e-26 Score=261.76 Aligned_cols=173 Identities=31% Similarity=0.538 Sum_probs=145.2
Q ss_pred eeEEEEEEeC-CCeEEecccccccccCcceeccCCccccccCCCCCCCCCCCcccccccccCCCCCCCcccccccccccc
Q 004596 403 VWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRA 481 (743)
Q Consensus 403 ~wGSGfvV~~-~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~ 481 (743)
++||||+|++ +||||||+||++ +
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~----------~---------------------------------------------- 134 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVD----------N---------------------------------------------- 134 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcC----------C----------------------------------------------
Confidence 4799999996 599999999996 2
Q ss_pred cccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCCccceeCCCC-CCCCCCeEEEEecCCCCCC
Q 004596 482 YKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPR 560 (743)
Q Consensus 482 ~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~ 560 (743)
...|.|++.+++. |.|++++ .|+.+||||||++. +..+++++++++ .+++|++|+++|||
T Consensus 135 ---------a~~i~V~~~dg~~---~~a~vv~-~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P----- 195 (473)
T PRK10942 135 ---------ATKIKVQLSDGRK---FDAKVVG-KDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNP----- 195 (473)
T ss_pred ---------CCEEEEEECCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCC-----
Confidence 1248888888877 9999998 67789999999985 457899999876 69999999999995
Q ss_pred CCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCcee
Q 004596 561 CGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLN 640 (743)
Q Consensus 561 ~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~ln 640 (743)
+|+..+++.|+|+++.+... ....+..+||||+++++|||||||+|.+|+||||+++...+.++.. +++
T Consensus 196 ~g~~~tvt~GiVs~~~r~~~---------~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~--g~g 264 (473)
T PRK10942 196 YGLGETVTSGIVSALGRSGL---------NVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNI--GIG 264 (473)
T ss_pred CCCCcceeEEEEEEeecccC---------CcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcc--cEE
Confidence 57778999999999876311 0123457899999999999999999999999999999877665543 899
Q ss_pred EEEehhHHHHHHHHHHhcCCc
Q 004596 641 FSIPCAVLRPIFEFARDMQEV 661 (743)
Q Consensus 641 FaIPi~~l~~il~~~~~~~d~ 661 (743)
|+||++.++++++++.+.+.+
T Consensus 265 faIP~~~~~~v~~~l~~~g~v 285 (473)
T PRK10942 265 FAIPSNMVKNLTSQMVEYGQV 285 (473)
T ss_pred EEEEHHHHHHHHHHHHhcccc
Confidence 999999999999999876654
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.94 E-value=9.9e-26 Score=252.56 Aligned_cols=173 Identities=31% Similarity=0.494 Sum_probs=143.1
Q ss_pred eeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCCCCcccccccccCCCCCCCccccccccccccc
Q 004596 403 VWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAY 482 (743)
Q Consensus 403 ~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~ 482 (743)
++||||+|+++||||||+||++ +
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~----------~----------------------------------------------- 80 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVD----------G----------------------------------------------- 80 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcC----------C-----------------------------------------------
Confidence 4799999999999999999997 2
Q ss_pred ccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCCccceeCCCC-CCCCCCeEEEEecCCCCCCC
Q 004596 483 KLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRC 561 (743)
Q Consensus 483 ~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~ 561 (743)
...+.|++.+++. |+|++++ .++.+||||||++. +..++++.++++ .+++|++|+++||| .
T Consensus 81 --------~~~i~V~~~~~~~---~~a~vv~-~d~~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p-----~ 142 (428)
T TIGR02037 81 --------ADEITVTLSDGRE---FKAKLVG-KDPRTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNP-----F 142 (428)
T ss_pred --------CCeEEEEeCCCCE---EEEEEEE-ecCCCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECC-----C
Confidence 1237777777765 8999998 56779999999986 357899999875 68999999999996 5
Q ss_pred CCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeE
Q 004596 562 GLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNF 641 (743)
Q Consensus 562 g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnF 641 (743)
++..+++.|+|++..+... ....+..++|||+++++|||||||||.+|+||||+++.....++.. +++|
T Consensus 143 g~~~~~t~G~vs~~~~~~~---------~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~--g~~f 211 (428)
T TIGR02037 143 GLGQTVTSGIVSALGRSGL---------GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNV--GIGF 211 (428)
T ss_pred cCCCcEEEEEEEecccCcc---------CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCcc--ceEE
Confidence 6778999999998765310 0123456899999999999999999999999999999876654433 8999
Q ss_pred EEehhHHHHHHHHHHhcCCc
Q 004596 642 SIPCAVLRPIFEFARDMQEV 661 (743)
Q Consensus 642 aIPi~~l~~il~~~~~~~d~ 661 (743)
+||++.++++++++.+.+.+
T Consensus 212 aiP~~~~~~~~~~l~~~g~~ 231 (428)
T TIGR02037 212 AIPSNMAKNVVDQLIEGGKV 231 (428)
T ss_pred EEEhHHHHHHHHHHHhcCcC
Confidence 99999999999999876553
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.85 E-value=2.2e-20 Score=203.97 Aligned_cols=190 Identities=27% Similarity=0.447 Sum_probs=155.2
Q ss_pred hHHHhccCceEEEEECC-----------------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCC
Q 004596 385 LPIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDS 447 (743)
Q Consensus 385 ~~i~~a~~SVV~I~~~~-----------------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~ 447 (743)
..++++.++||.|.... .++||||+++.+|||+||.||++ +
T Consensus 37 ~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~----------~------------ 94 (347)
T COG0265 37 TAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIA----------G------------ 94 (347)
T ss_pred HHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecC----------C------------
Confidence 46888999999997732 37899999999999999999997 2
Q ss_pred CCCCCCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEc
Q 004596 448 ASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLG 527 (743)
Q Consensus 448 ~~~~~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe 527 (743)
...+.+.+.++.. +++++++ .|...|+|+||++
T Consensus 95 -------------------------------------------a~~i~v~l~dg~~---~~a~~vg-~d~~~dlavlki~ 127 (347)
T COG0265 95 -------------------------------------------AEEITVTLADGRE---VPAKLVG-KDPISDLAVLKID 127 (347)
T ss_pred -------------------------------------------cceEEEEeCCCCE---EEEEEEe-cCCccCEEEEEec
Confidence 1236666666665 8999998 6778999999999
Q ss_pred cCCCCccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCC
Q 004596 528 YIPDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHP 606 (743)
Q Consensus 528 ~~p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~ 606 (743)
.... ++.+.+.++ .++.|+.++++|+| +|+..+++.|+++...+... ........+|||||++++
T Consensus 128 ~~~~-~~~~~~~~s~~l~vg~~v~aiGnp-----~g~~~tvt~Givs~~~r~~v--------~~~~~~~~~IqtdAain~ 193 (347)
T COG0265 128 GAGG-LPVIALGDSDKLRVGDVVVAIGNP-----FGLGQTVTSGIVSALGRTGV--------GSAGGYVNFIQTDAAINP 193 (347)
T ss_pred cCCC-CceeeccCCCCcccCCEEEEecCC-----CCcccceeccEEeccccccc--------cCcccccchhhcccccCC
Confidence 7322 777788876 58899999999995 57779999999999887411 111225678999999999
Q ss_pred CCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhcC
Q 004596 607 GGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDMQ 659 (743)
Q Consensus 607 GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~~ 659 (743)
||||||++|.+|++|||++......++.. +++|+||+..+.+++..+...+
T Consensus 194 gnsGgpl~n~~g~~iGint~~~~~~~~~~--gigfaiP~~~~~~v~~~l~~~G 244 (347)
T COG0265 194 GNSGGPLVNIDGEVVGINTAIIAPSGGSS--GIGFAIPVNLVAPVLDELISKG 244 (347)
T ss_pred CCCCCceEcCCCcEEEEEEEEecCCCCcc--eeEEEecHHHHHHHHHHHHHcC
Confidence 99999999999999999999988766533 6999999999999999988744
No 7
>PRK10139 serine endoprotease; Provisional
Probab=99.76 E-value=3.5e-18 Score=193.11 Aligned_cols=129 Identities=22% Similarity=0.328 Sum_probs=112.1
Q ss_pred ccccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeecc
Q 004596 213 STSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIR 287 (743)
Q Consensus 213 ~~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~ 287 (743)
..+|+|||||+. ..++|.+.++ +.+++||+|++||+||| |..++|.|+||++.+.. ..+..|||||++
T Consensus 136 ~~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P~g------~~~tvt~GivS~~~r~~~~~~~~~~~iqtda~ 208 (455)
T PRK10139 136 DQSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNPFG------LGQTATSGIISALGRSGLNLEGLENFIQTDAS 208 (455)
T ss_pred CCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecCCC------CCCceEEEEEccccccccCCCCcceEEEECCc
Confidence 458999999983 3568888884 46999999999999999 68899999999987642 235679999999
Q ss_pred ccCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596 288 CLPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 348 (743)
Q Consensus 288 ~~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 348 (743)
+||| |||||.+|+||||+++.++.. +..|++|+||++.+..++.+++ .+++.++|+|+.++
T Consensus 209 in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~ 274 (455)
T PRK10139 209 INRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGT 274 (455)
T ss_pred cCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEE
Confidence 9997 999999999999999999876 6789999999999999999988 67899999998754
No 8
>PRK10942 serine endoprotease; Provisional
Probab=99.72 E-value=3.3e-17 Score=186.08 Aligned_cols=129 Identities=20% Similarity=0.331 Sum_probs=113.0
Q ss_pred cccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeeccc
Q 004596 214 TSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIRC 288 (743)
Q Consensus 214 ~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~~ 288 (743)
.+|+||||++. ..+++.++++ +.+++||||++||+||| |.+++|.|+||++.+.. ..|..|||||+++
T Consensus 158 ~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P~g------~~~tvt~GiVs~~~r~~~~~~~~~~~iqtda~i 230 (473)
T PRK10942 158 RSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNPYG------LGETVTSGIVSALGRSGLNVENYENFIQTDAAI 230 (473)
T ss_pred CCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCCCC------CCcceeEEEEEEeecccCCcccccceEEecccc
Confidence 48999999973 3568888885 46999999999999999 68999999999987642 2467899999999
Q ss_pred cCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceeccc
Q 004596 289 LPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINK 349 (743)
Q Consensus 289 ~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~ 349 (743)
+|| |||||.+|+||||+++.+... ++.+++|+||++.+..++.++. .+++.++|.|+.++.
T Consensus 231 ~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~~~g~v~rg~lGv~~~~ 296 (473)
T PRK10942 231 NRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTE 296 (473)
T ss_pred CCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHHhccccccceeeeEeee
Confidence 997 999999999999999999887 7789999999999999999988 678899999987543
No 9
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.65 E-value=1.1e-15 Score=171.72 Aligned_cols=130 Identities=24% Similarity=0.364 Sum_probs=112.7
Q ss_pred cccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCC---CCCCCceEEeeccc
Q 004596 214 TSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP---RSTTRSLLMADIRC 288 (743)
Q Consensus 214 ~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~---~~~~~~~i~tD~~~ 288 (743)
.+|+||||++.. .++|.+.++ ..+++||+|+++|+||| +..++|.|+||+..+. ...+..||+||+++
T Consensus 104 ~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g------~~~~~t~G~vs~~~~~~~~~~~~~~~i~tda~i 176 (428)
T TIGR02037 104 RTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFG------LGQTVTSGIVSALGRSGLGIGDYENFIQTDAAI 176 (428)
T ss_pred CCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCc------CCCcEEEEEEEecccCccCCCCccceEEECCCC
Confidence 469999999853 468888885 46999999999999999 6889999999988764 23467899999999
Q ss_pred cCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceecccC
Q 004596 289 LPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHINKG 350 (743)
Q Consensus 289 ~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~~~ 350 (743)
+|| |||||.+|+||||+++.+... ++.+++|+||++.+..++.++. .+++.++|+|+.++.-
T Consensus 177 ~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi~~~~~ 243 (428)
T TIGR02037 177 NPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGVTIQEV 243 (428)
T ss_pred CCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCceEeecC
Confidence 996 999999999999999999877 6779999999999999999988 6678899999886543
No 10
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.64 E-value=1.6e-15 Score=166.43 Aligned_cols=128 Identities=17% Similarity=0.330 Sum_probs=107.1
Q ss_pred ccccEEEEEeccCCCCCCceec--CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeecc
Q 004596 213 STSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIR 287 (743)
Q Consensus 213 ~~t~~A~l~i~~~~~~~~~~~~--~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~ 287 (743)
..+|+||||++.. ++|.+.+ +..+++||+|++||+||| +.++++.|+||+..+.. ..+..|||||+.
T Consensus 123 ~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~------~~~s~t~GiIs~~~r~~~~~~~~~~~iqtda~ 194 (351)
T TIGR02038 123 PLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYN------LGQTITQGIISATGRNGLSSVGRQNFIQTDAA 194 (351)
T ss_pred CCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCC------CCCcEEEEEEEeccCcccCCCCcceEEEECCc
Confidence 3479999999853 4677776 446999999999999999 67899999999886542 134679999999
Q ss_pred ccCC---CceEcCCCcEEEEEeccccccC---CcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596 288 CLPG---GPVFGEHAHFVGILIRPLRQKS---GAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 348 (743)
Q Consensus 288 ~~pG---g~vf~~~g~liGiv~~~l~~~g---~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 348 (743)
++|| |||||.+|+||||+++.+...+ ..+++|+||++.+..++.+++ .+++.++|+|+..+
T Consensus 195 i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~ 262 (351)
T TIGR02038 195 INAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGE 262 (351)
T ss_pred cCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEE
Confidence 9997 9999999999999999886552 258999999999999999987 66788899988743
No 11
>PRK10898 serine endoprotease; Provisional
Probab=99.63 E-value=1.6e-15 Score=166.54 Aligned_cols=127 Identities=18% Similarity=0.304 Sum_probs=106.1
Q ss_pred cccEEEEEeccCCCCCCceecC--CCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC---CCCCceEEeeccc
Q 004596 214 TSRVAILGVSSYLKDLPNIALT--PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSLLMADIRC 288 (743)
Q Consensus 214 ~t~~A~l~i~~~~~~~~~~~~~--~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---~~~~~~i~tD~~~ 288 (743)
.+|+||||++.. ++|.+.++ ..+++||+|+++|+||| +..++|.|+||+..+.. ..+..|||||+++
T Consensus 124 ~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g------~~~~~t~Giis~~~r~~~~~~~~~~~iqtda~i 195 (353)
T PRK10898 124 LTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYN------LGQTITQGIISATGRIGLSPTGRQNFLQTDASI 195 (353)
T ss_pred CCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCC------cCCCcceeEEEeccccccCCccccceEEecccc
Confidence 479999999853 57776764 45999999999999999 67899999999876541 2345799999999
Q ss_pred cCC---CceEcCCCcEEEEEeccccccC----CcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596 289 LPG---GPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 348 (743)
Q Consensus 289 ~pG---g~vf~~~g~liGiv~~~l~~~g----~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 348 (743)
+|| |||+|.+|+||||+++.+...+ ..+++|+||++.+..++.++. .+++.++|+|+..+
T Consensus 196 ~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~ 263 (353)
T PRK10898 196 NHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGR 263 (353)
T ss_pred CCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEE
Confidence 997 9999999999999999886542 258999999999999999987 67788999998744
No 12
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.56 E-value=1.6e-14 Score=157.94 Aligned_cols=130 Identities=24% Similarity=0.345 Sum_probs=111.8
Q ss_pred cccccEEEEEeccCCCCCCceec--CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCC-C---CCCCceEEee
Q 004596 212 KSTSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-R---STTRSLLMAD 285 (743)
Q Consensus 212 ~~~t~~A~l~i~~~~~~~~~~~~--~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~-~---~~~~~~i~tD 285 (743)
...+|+|+||++.... +|.+.+ ++.+++||++++||+||| |.+++|.||||...+. - ..+..|||||
T Consensus 116 d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~~~~~~~~Iqtd 188 (347)
T COG0265 116 DPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGSAGGYVNFIQTD 188 (347)
T ss_pred CCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccCcccccchhhcc
Confidence 4568999999996433 666666 556999999999999999 7999999999998874 1 2367899999
Q ss_pred ccccCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhhh-cCCCcccccceecc
Q 004596 286 IRCLPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL-LKEPQNAEKEIHIN 348 (743)
Q Consensus 286 ~~~~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l~-~~~~~~~~~~~~~~ 348 (743)
|++||| ||++|.+|++|||++..+... +..|+.|+||++.+..+..++. .+++.+++.|+.+.
T Consensus 189 Aain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~ 256 (347)
T COG0265 189 AAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGE 256 (347)
T ss_pred cccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEE
Confidence 999997 999999999999999999988 5678999999999999999988 36788988887754
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.55 E-value=6.3e-14 Score=127.59 Aligned_cols=24 Identities=46% Similarity=0.904 Sum_probs=22.4
Q ss_pred EccccCCCCcccceecCCceEEEE
Q 004596 600 TTAAVHPGGSGGAVVNLDGHMIGL 623 (743)
Q Consensus 600 TdAai~~GnSGGPL~d~~G~VIGI 623 (743)
+++.+.+|+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999998
No 14
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.47 E-value=3.7e-13 Score=150.52 Aligned_cols=201 Identities=21% Similarity=0.290 Sum_probs=142.0
Q ss_pred HHHhccCceEEEEECC--------------CeeEEEEEEeCCCeEEecccccccccCcceeccCCccccccCCCCCCCCC
Q 004596 386 PIQKALASVCLITIDD--------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSG 451 (743)
Q Consensus 386 ~i~~a~~SVV~I~~~~--------------~~wGSGfvV~~~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~ 451 (743)
+.++...++|.|+..+ ...||||+++.+|+++||+||+.-. -..| .
T Consensus 133 ~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~--------~~~y---~--------- 192 (473)
T KOG1320|consen 133 VFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVE--------PRIY---A--------- 192 (473)
T ss_pred hhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEE--------eccc---c---------
Confidence 3556778888888621 1249999999999999999999511 0000 0
Q ss_pred CCcccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCC--CCceeEeeEEEEecCCCCcEEEEEEccC
Q 004596 452 HTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHL--DPWIWCDAKIVYVCKGPLDVSLLQLGYI 529 (743)
Q Consensus 452 ~~~~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~--~~~~W~~A~VV~v~~~~~DLALLkLe~~ 529 (743)
+. . ..--.|.++...+ +. +.+.+++ -+...|+|+++++..
T Consensus 193 ------~~---------------------~-------~~l~~vqi~aa~~~~~s---~ep~i~g-~d~~~gvA~l~ik~~ 234 (473)
T KOG1320|consen 193 ------HS---------------------S-------TVLLRVQIDAAIGPGNS---GEPVIVG-VDKVAGVAFLKIKTP 234 (473)
T ss_pred ------CC---------------------C-------cceeeEEEEEeecCCcc---CCCeEEc-cccccceEEEEEecC
Confidence 00 0 0011255555544 44 5677776 356799999999752
Q ss_pred CCCccceeCCCC-CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCC
Q 004596 530 PDQLCPIDADFG-QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGG 608 (743)
Q Consensus 530 p~~l~pi~l~~s-~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~Gn 608 (743)
..-+++++++.. .+..|+++..+|.| +++..+++.|+++...|....... ........++|||++++.|+
T Consensus 235 ~~i~~~i~~~~~~~~~~G~~~~a~~~~-----f~~~nt~t~g~vs~~~R~~~~lg~----~~g~~i~~~~qtd~ai~~~n 305 (473)
T KOG1320|consen 235 ENILYVIPLGVSSHFRTGVEVSAIGNG-----FGLLNTLTQGMVSGQLRKSFKLGL----ETGVLISKINQTDAAINPGN 305 (473)
T ss_pred CcccceeecceeeeecccceeeccccC-----ceeeeeeeecccccccccccccCc----ccceeeeeecccchhhhccc
Confidence 233788888775 68999999999985 677789999999987764321110 11123456899999999999
Q ss_pred cccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHH
Q 004596 609 SGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFA 655 (743)
Q Consensus 609 SGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~ 655 (743)
||||++|.+|++||+++.+....+-.. +++|++|.+.+..++.+.
T Consensus 306 sg~~ll~~DG~~IgVn~~~~~ri~~~~--~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 306 SGGPLLNLDGEVIGVNTRKVTRIGFSH--GISFKIPIDTVLVIVLRL 350 (473)
T ss_pred CCCcEEEecCcEeeeeeeeeEEeeccc--cceeccCchHhhhhhhhh
Confidence 999999999999999998865422222 689999999999877665
No 15
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.25 E-value=2.8e-10 Score=113.43 Aligned_cols=124 Identities=22% Similarity=0.314 Sum_probs=77.2
Q ss_pred CCcEEEEEEccC---CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCC
Q 004596 518 PLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNS 592 (743)
Q Consensus 518 ~~DLALLkLe~~---p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~ 592 (743)
.+|||||+|+.. .+.+.|+.+... .+..|+.+.++||+.-.. .+....+....+.-+... .|... ....
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~-~~~~~~~~~~~~~~~~~~----~c~~~-~~~~ 159 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD-NGYSSNLQSVTVPVVSRK----TCRSS-YNDN 159 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST-TSBTSBEEEEEEEEEEHH----HHHHH-TTTT
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccccc----ccccc-cccc
Confidence 589999999973 356778888773 468999999999985221 111123333333221110 01110 0011
Q ss_pred CcCeEEEEcc----ccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHH
Q 004596 593 AYPVMLETTA----AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRP 650 (743)
Q Consensus 593 ~~~~~IqTdA----ai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~ 650 (743)
....++++.. ..+.|+|||||++.++.++||++.. ...+... ...+.+++....+
T Consensus 160 ~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~--~~~v~~~v~~~~~ 218 (220)
T PF00089_consen 160 LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPN--YPGVYTRVSSYLD 218 (220)
T ss_dssp STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTT--SEEEEEEGGGGHH
T ss_pred cccccccccccccccccccccccccccceeeecceeeec-CCCCCCC--cCEEEEEHHHhhc
Confidence 3456788776 7889999999998777899999988 3232222 2467777776554
No 16
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.18 E-value=5.8e-10 Score=111.82 Aligned_cols=109 Identities=23% Similarity=0.235 Sum_probs=63.9
Q ss_pred CCCcEEEEEEccC---CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCcccccc-
Q 004596 517 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQR- 590 (743)
Q Consensus 517 ~~~DLALLkLe~~---p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~- 590 (743)
..+|||||+|+.. ...+.|+.+... .+..|+.++++||+................+.-+. ...|......
T Consensus 87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~----~~~C~~~~~~~ 162 (232)
T cd00190 87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVS----NAECKRAYSYG 162 (232)
T ss_pred CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeEC----HHHhhhhccCc
Confidence 3589999999962 234788888776 67889999999998643211111112222221111 1111111110
Q ss_pred CCCcCeEEEEc-----cccCCCCcccceecCC---ceEEEEEeeeec
Q 004596 591 NSAYPVMLETT-----AAVHPGGSGGAVVNLD---GHMIGLVTSNAR 629 (743)
Q Consensus 591 ~~~~~~~IqTd-----Aai~~GnSGGPL~d~~---G~VIGIvtsna~ 629 (743)
......+++.. ...+.|+|||||+... +.++||++....
T Consensus 163 ~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 163 GTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG 209 (232)
T ss_pred ccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc
Confidence 11223455543 3467899999999653 889999988654
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.02 E-value=2e-08 Score=101.29 Aligned_cols=108 Identities=24% Similarity=0.291 Sum_probs=61.5
Q ss_pred CCCcEEEEEEccC---CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCC-CCCeeEeEEEeeeeeccCCCCCcccccc
Q 004596 517 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCG-LSPSVSSGVVAKVVKANLPSYGQSTLQR 590 (743)
Q Consensus 517 ~~~DLALLkLe~~---p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g-~~psvt~GiIS~vv~~~~~~~~~~~~~~ 590 (743)
..+|||||+|+.. .+.+.|+.+... .+..++.+.+.||+......+ .........+.-+.. ..|......
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~----~~C~~~~~~ 162 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSN----ATCRRAYSG 162 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCH----HHhhhhhcc
Confidence 4689999999872 345678877664 578899999999985332001 111122121111110 001110000
Q ss_pred -CCCcCeEEEE-----ccccCCCCcccceecCCc--eEEEEEeeee
Q 004596 591 -NSAYPVMLET-----TAAVHPGGSGGAVVNLDG--HMIGLVTSNA 628 (743)
Q Consensus 591 -~~~~~~~IqT-----dAai~~GnSGGPL~d~~G--~VIGIvtsna 628 (743)
......+++. +...++|+|||||+...+ .++||++...
T Consensus 163 ~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 163 GGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred ccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0112223333 355788999999996443 8999999876
No 18
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=98.92 E-value=5.8e-09 Score=118.72 Aligned_cols=191 Identities=20% Similarity=0.331 Sum_probs=131.4
Q ss_pred HHHhccCceEEEEEC----------CCeeEEEEEEeC-CCeEEecccccccccCcceeccCCccccccCCCCCCCCCCCc
Q 004596 386 PIQKALASVCLITID----------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG 454 (743)
Q Consensus 386 ~i~~a~~SVV~I~~~----------~~~wGSGfvV~~-~G~ILTNaHVV~p~~~~~t~~~g~~~~~~f~~~~~~~~~~~~ 454 (743)
.+..+-++||.|... ..+-|+||+|++ .||||||+||+.|..+... ++|..+++
T Consensus 57 ~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~--------avf~n~ee------- 121 (955)
T KOG1421|consen 57 TIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVAS--------AVFDNHEE------- 121 (955)
T ss_pred hhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCceeE--------EEeccccc-------
Confidence 467788999999863 234699999997 7899999999986432211 12221111
Q ss_pred ccccccccCCCCCCCcccccccccccccccccccCCceeEEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCC---C
Q 004596 455 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIP---D 531 (743)
Q Consensus 455 ~~~~~~~q~~~~k~~~i~~~~~~~~~~~~l~l~~~~~~~I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p---~ 531 (743)
++.-.+| .|+-+|+.+++.+... .
T Consensus 122 ----------------------------------------------------~ei~pvy-rDpVhdfGf~r~dps~ir~s 148 (955)
T KOG1421|consen 122 ----------------------------------------------------IEIYPVY-RDPVHDFGFFRYDPSTIRFS 148 (955)
T ss_pred ----------------------------------------------------CCccccc-CCchhhcceeecChhhccee
Confidence 1122233 5667899999988521 2
Q ss_pred CccceeCCCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCccc
Q 004596 532 QLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGG 611 (743)
Q Consensus 532 ~l~pi~l~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGG 611 (743)
.+..+.+...-.++|.+++++|+ -.|-..++-.|.++.+.+. .|-|...++..+.. .++|.-+...+|.||.
T Consensus 149 ~vt~i~lap~~akvgseirvvgN-----DagEklsIlagflSrldr~-apdyg~~~yndfnT--fy~Qaasstsggssgs 220 (955)
T KOG1421|consen 149 IVTEICLAPELAKVGSEIRVVGN-----DAGEKLSILAGFLSRLDRN-APDYGEDTYNDFNT--FYIQAASSTSGGSSGS 220 (955)
T ss_pred eeeccccCccccccCCceEEecC-----CccceEEeehhhhhhccCC-Cccccccccccccc--eeeeehhcCCCCCCCC
Confidence 23444444455589999999998 4566678888999988774 34444333433333 3678878889999999
Q ss_pred ceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHHHHHHHHHhc
Q 004596 612 AVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLRPIFEFARDM 658 (743)
Q Consensus 612 PL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~~il~~~~~~ 658 (743)
||+|-+|..|.++..... ...-.|.+|++-+.+.+..++++
T Consensus 221 pVv~i~gyAVAl~agg~~------ssas~ffLpLdrV~RaL~clq~n 261 (955)
T KOG1421|consen 221 PVVDIPGYAVALNAGGSI------SSASDFFLPLDRVVRALRCLQNN 261 (955)
T ss_pred ceecccceEEeeecCCcc------cccccceeeccchhhhhhhhhcC
Confidence 999999999999855432 22567999999999988888753
No 19
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.81 E-value=7.2e-09 Score=116.51 Aligned_cols=125 Identities=21% Similarity=0.319 Sum_probs=102.7
Q ss_pred ccccccccc-cccccEEEEEeccCCCCCCceec--CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCC----
Q 004596 203 ESSNLSLMS-KSTSRVAILGVSSYLKDLPNIAL--TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---- 275 (743)
Q Consensus 203 ~~~~~~~~~-~~~t~~A~l~i~~~~~~~~~~~~--~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~---- 275 (743)
-+.+|.+++ ....|+|+|||+....-++.+.. +..++.|+|+.++++||+ +.|++++|+|+...|..
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~------~~nt~t~g~vs~~~R~~~~lg 284 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFG------LLNTLTQGMVSGQLRKSFKLG 284 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCce------eeeeeeecccccccccccccC
Confidence 355677888 67789999999743333666654 667999999999999999 79999999999776652
Q ss_pred ----CCCCceEEeeccccCC---CceEcCCCcEEEEEecccccc-CCcceEEEeeHHHHHHHHhhh
Q 004596 276 ----STTRSLLMADIRCLPG---GPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDL 333 (743)
Q Consensus 276 ----~~~~~~i~tD~~~~pG---g~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~~i~~~~~~l 333 (743)
.....++|||+++++| ||+++.+|+.||++++...+. -+.+++|++|.+.+...+...
T Consensus 285 ~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 285 LETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRL 350 (473)
T ss_pred cccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhh
Confidence 2356689999999997 999999999999999999887 567999999999998666544
No 20
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.52 E-value=1.6e-06 Score=90.89 Aligned_cols=73 Identities=26% Similarity=0.280 Sum_probs=51.4
Q ss_pred CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEEEEccccCCCCcccceecCCceEE
Q 004596 542 QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVMLETTAAVHPGGSGGAVVNLDGHMI 621 (743)
Q Consensus 542 ~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai~~GnSGGPL~d~~G~VI 621 (743)
..+.++.+.++|||.-.+..+ ..-...+.+..+ ....++.+|.+.+|+||.||++.+.++|
T Consensus 157 ~~~~~d~i~v~GYP~dk~~~~-~~~e~t~~v~~~------------------~~~~l~y~~dT~pG~SGSpv~~~~~~vi 217 (251)
T COG3591 157 EAKANDRITVIGYPGDKPNIG-TMWESTGKVNSI------------------KGNKLFYDADTLPGSSGSPVLISKDEVI 217 (251)
T ss_pred ccccCceeEEEeccCCCCcce-eEeeecceeEEE------------------ecceEEEEecccCCCCCCceEecCceEE
Confidence 468999999999985333222 112233333322 1236889999999999999999989999
Q ss_pred EEEeeeecCCCC
Q 004596 622 GLVTSNARHGGG 633 (743)
Q Consensus 622 GIvtsna~~~gg 633 (743)
|+.+.+....++
T Consensus 218 gv~~~g~~~~~~ 229 (251)
T COG3591 218 GVHYNGPGANGG 229 (251)
T ss_pred EEEecCCCcccc
Confidence 999998765444
No 21
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.08 E-value=0.00029 Score=73.13 Aligned_cols=117 Identities=22% Similarity=0.235 Sum_probs=66.6
Q ss_pred CcEEEEEEcc---CCCCccceeCCCCC----CCCCCeEEEEecCCCCCC-CCCCCeeEeEEEeeeeeccCCCCCcccccc
Q 004596 519 LDVSLLQLGY---IPDQLCPIDADFGQ----PSLGSAAYVIGHGLFGPR-CGLSPSVSSGVVAKVVKANLPSYGQSTLQR 590 (743)
Q Consensus 519 ~DLALLkLe~---~p~~l~pi~l~~s~----~~~G~~V~vIGyPlfg~~-~g~~psvt~GiIS~vv~~~~~~~~~~~~~~ 590 (743)
+|||||+++. ..+.+.|+.+.... ...++.+++.|||..... .......... .........|......
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~----~v~i~~~~~C~~~~~~ 181 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEV----DVPIISNSECRRAYGG 181 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEE----EEeEcChhHhcccccC
Confidence 8999999996 34567777775332 345589999999753221 0111122211 1111111223322221
Q ss_pred C-CCcCeEEEEcc-----ccCCCCcccceecCC---ceEEEEEeeeecCCCCcccCce
Q 004596 591 N-SAYPVMLETTA-----AVHPGGSGGAVVNLD---GHMIGLVTSNARHGGGTVIPHL 639 (743)
Q Consensus 591 ~-~~~~~~IqTdA-----ai~~GnSGGPL~d~~---G~VIGIvtsna~~~gg~~~p~l 639 (743)
. .....++++.. ..+.|+|||||+-.+ ..++||+++.....+....|+.
T Consensus 182 ~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~v 239 (256)
T KOG3627|consen 182 LGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGV 239 (256)
T ss_pred ccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeE
Confidence 1 11234677653 357899999999553 6999999998764444345555
No 22
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.88 E-value=0.00031 Score=73.45 Aligned_cols=103 Identities=17% Similarity=0.331 Sum_probs=50.0
Q ss_pred CCCcEEEEEEccCCCCccceeC--CCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCc
Q 004596 517 GPLDVSLLQLGYIPDQLCPIDA--DFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAY 594 (743)
Q Consensus 517 ~~~DLALLkLe~~p~~l~pi~l--~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~ 594 (743)
+..||.++|+.. +++|.+- ....++.|+.|+++|.=. . ..+....+++- +.+.+ ...
T Consensus 80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~f-q-~k~~~s~vSes--S~i~p--------------~~~ 138 (235)
T PF00863_consen 80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNF-Q-EKSISSTVSES--SWIYP--------------EEN 138 (235)
T ss_dssp TCSSEEEEE--T---TS----S---B----TT-EEEEEEEEC-S-SCCCEEEEEEE--EEEEE--------------ETT
T ss_pred CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEE-E-cCCeeEEECCc--eEEee--------------cCC
Confidence 359999999986 6666543 345789999999999721 1 11211222221 22222 122
Q ss_pred CeEEEEccccCCCCcccceecC-CceEEEEEeeeecCCCCcccCceeEEEehh
Q 004596 595 PVMLETTAAVHPGGSGGAVVNL-DGHMIGLVTSNARHGGGTVIPHLNFSIPCA 646 (743)
Q Consensus 595 ~~~IqTdAai~~GnSGGPL~d~-~G~VIGIvtsna~~~gg~~~p~lnFaIPi~ 646 (743)
..+..+-.+...|+-|.||++. +|++|||++...... ..||..|+.
T Consensus 139 ~~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~~------~~N~F~~f~ 185 (235)
T PF00863_consen 139 SHFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNTS------SRNYFTPFP 185 (235)
T ss_dssp TTEEEE-C---TT-TT-EEEETTT--EEEEEEEEETTT------SSEEEEE--
T ss_pred CCeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCCC------CeEEEEcCC
Confidence 3456666677899999999975 699999999775433 367777754
No 23
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.23 E-value=0.0023 Score=69.91 Aligned_cols=38 Identities=32% Similarity=0.587 Sum_probs=30.2
Q ss_pred cccCCCCcccceecC--CceE-EEEEeeeecCCCCcccCce
Q 004596 602 AAVHPGGSGGAVVNL--DGHM-IGLVTSNARHGGGTVIPHL 639 (743)
Q Consensus 602 Aai~~GnSGGPL~d~--~G~V-IGIvtsna~~~gg~~~p~l 639 (743)
...|.|+||||+|-. +|++ +||+++....+++..+|++
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gV 263 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGV 263 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCccee
Confidence 467889999999943 4765 9999999888877776663
No 24
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.99 E-value=0.042 Score=58.54 Aligned_cols=92 Identities=17% Similarity=0.133 Sum_probs=55.5
Q ss_pred CCCcEEEEEEccC-CCCccceeCCCC--CCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCC
Q 004596 517 GPLDVSLLQLGYI-PDQLCPIDADFG--QPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSA 593 (743)
Q Consensus 517 ~~~DLALLkLe~~-p~~l~pi~l~~s--~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~ 593 (743)
...+++||+++.. .....|+=++++ ....|+.+.+.|+. .. ..+....+.-... ..
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~----~~---~~~~~~~~~i~~~--------------~~ 217 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN----ST---GKLKHRKLKITNC--------------TK 217 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC----CC---CeEEEEEEEEEEe--------------ec
Confidence 5789999999973 234444444443 46789999998871 11 1222222221111 01
Q ss_pred cCeEEEEccccCCCCccccee---cCCceEEEEEeeeec
Q 004596 594 YPVMLETTAAVHPGGSGGAVV---NLDGHMIGLVTSNAR 629 (743)
Q Consensus 594 ~~~~IqTdAai~~GnSGGPL~---d~~G~VIGIvtsna~ 629 (743)
....+.++...+.|++|||++ |.+-.||||.+.+..
T Consensus 218 ~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~ 256 (282)
T PF03761_consen 218 CAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNY 256 (282)
T ss_pred cceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCC
Confidence 233455666677899999999 333468999877643
No 25
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=96.66 E-value=0.0046 Score=65.22 Aligned_cols=77 Identities=26% Similarity=0.336 Sum_probs=41.1
Q ss_pred CcEEEEEEccCCCCccceeCCCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeeeeeccCCCCCccccccCCCcCeEE
Q 004596 519 LDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKVVKANLPSYGQSTLQRNSAYPVML 598 (743)
Q Consensus 519 ~DLALLkLe~~p~~l~pi~l~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~vv~~~~~~~~~~~~~~~~~~~~~I 598 (743)
-|.|.-.++..+...+.+++.... .| +.|-.- +.-+..|.|... ..+
T Consensus 156 GDfA~~~~~~~~G~~P~~k~a~~~--~G-rAyW~t----------~tGvE~G~ig~~--------------------~~~ 202 (297)
T PF05579_consen 156 GDFAEADITNWPGAAPKYKFAQNY--TG-RAYWLT----------STGVEPGFIGGG--------------------GAV 202 (297)
T ss_dssp TTEEEEEETTS-S---B--B-TT---SE-EEEEEE----------TTEEEEEEEETT--------------------EEE
T ss_pred CcEEEEECCCCCCCCCceeecCCc--cc-ceEEEc----------ccCcccceecCc--------------------eEE
Confidence 688888886656677777766321 12 122111 123455655532 123
Q ss_pred EEccccCCCCcccceecCCceEEEEEeeeecCC
Q 004596 599 ETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 631 (743)
Q Consensus 599 qTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~ 631 (743)
+. .++|+||+|++..+|.+|||++..-+.+
T Consensus 203 ~f---T~~GDSGSPVVt~dg~liGVHTGSn~~G 232 (297)
T PF05579_consen 203 CF---TGPGDSGSPVVTEDGDLIGVHTGSNKRG 232 (297)
T ss_dssp ES---S-GGCTT-EEEETTC-EEEEEEEEETTT
T ss_pred EE---cCCCCCCCccCcCCCCEEEEEecCCCcC
Confidence 33 3689999999999999999999875543
No 26
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.32 E-value=0.0078 Score=72.14 Aligned_cols=21 Identities=38% Similarity=0.512 Sum_probs=19.8
Q ss_pred EEEEEEeCCCeEEeccccccc
Q 004596 405 ASGVLLNDQGLILTNAHLLEP 425 (743)
Q Consensus 405 GSGfvV~~~G~ILTNaHVV~p 425 (743)
|||.+|+++|+|+||.||+-.
T Consensus 49 CSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 49 CSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred eeEEEEcCCceEEecchhhhh
Confidence 999999999999999999953
No 27
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=94.87 E-value=0.03 Score=50.58 Aligned_cols=21 Identities=48% Similarity=0.892 Sum_probs=19.1
Q ss_pred eeccccCC---CceEcCCCcEEEE
Q 004596 284 ADIRCLPG---GPVFGEHAHFVGI 304 (743)
Q Consensus 284 tD~~~~pG---g~vf~~~g~liGi 304 (743)
+|+.+.|| |||||.+|++|||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 89999997 9999999999997
No 28
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.63 E-value=0.026 Score=67.74 Aligned_cols=65 Identities=20% Similarity=0.285 Sum_probs=44.5
Q ss_pred CCcCeEEEEccccCCCCcccceecCCceEEEEEeeeecCCCCccc---Cc--eeEEEehhHHHHHHHHHH
Q 004596 592 SAYPVMLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVI---PH--LNFSIPCAVLRPIFEFAR 656 (743)
Q Consensus 592 ~~~~~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~---p~--lnFaIPi~~l~~il~~~~ 656 (743)
...+.-+.+|+-+.+||||+|++|.+|+|||++.-..-.+-...+ |. -+.++=+..+.-+++.+.
T Consensus 618 g~~pv~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 618 GSVPVNFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred CCeeeEEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 446677788999999999999999999999998765433211110 23 344555566666666553
No 29
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=94.06 E-value=0.93 Score=45.51 Aligned_cols=35 Identities=29% Similarity=0.526 Sum_probs=29.5
Q ss_pred CcCeEEEEccccCCCCcccceecC---CceEEEEEeee
Q 004596 593 AYPVMLETTAAVHPGGSGGAVVNL---DGHMIGLVTSN 627 (743)
Q Consensus 593 ~~~~~IqTdAai~~GnSGGPL~d~---~G~VIGIvtsn 627 (743)
..+.++.+.++..+|+.||||+.. .++++||+.+.
T Consensus 133 ~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 133 TTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred EeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 346788899999999999999942 58999999885
No 30
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=92.04 E-value=2.7 Score=49.94 Aligned_cols=152 Identities=14% Similarity=0.124 Sum_probs=84.3
Q ss_pred EEEEEcCCCCceeEeeEEEEecCCCCcEEEEEEccCCCCccceeCCCCCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEe
Q 004596 494 IRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDADFGQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVA 573 (743)
Q Consensus 494 I~Vrl~~~~~~~W~~A~VV~v~~~~~DLALLkLe~~p~~l~pi~l~~s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS 573 (743)
++++.++... ..|++.+ -++...+|.+|.+. .....+++.+..+..||++...|+-.-........+++.-.+-
T Consensus 578 ~~vt~~dS~~---i~a~~~f-L~~t~n~a~~kydp--~~~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~ 651 (955)
T KOG1421|consen 578 QRVTEADSDG---IPANVSF-LHPTENVASFKYDP--ALEVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVV 651 (955)
T ss_pred eEEeeccccc---ccceeeE-ecCccceeEeccCh--hHhhhhccceeeEecCCceeEecccccchhhcccceeeeeEEE
Confidence 5566665555 6777776 35567888888874 3445556667778999999999983100000112233332111
Q ss_pred eeeeccCCCCCccccccCCCcCeEEEEcccc-CCCCcccceecCCceEEEEEeeeecCC-CCcccCceeEEEehhHHHHH
Q 004596 574 KVVKANLPSYGQSTLQRNSAYPVMLETTAAV-HPGGSGGAVVNLDGHMIGLVTSNARHG-GGTVIPHLNFSIPCAVLRPI 651 (743)
Q Consensus 574 ~vv~~~~~~~~~~~~~~~~~~~~~IqTdAai-~~GnSGGPL~d~~G~VIGIvtsna~~~-gg~~~p~lnFaIPi~~l~~i 651 (743)
-+-+...|.+. .... ..|-..+.+ ..++|| -+.|.+|+++|+=-+-.... ++..+ ..-|.+-+.++.+.
T Consensus 652 ~~ps~~~pr~r------~~n~-e~Is~~~nlsT~c~sg-~ltdddg~vvalwl~~~ge~~~~kd~-~y~~gl~~~~~l~v 722 (955)
T KOG1421|consen 652 IIPSSVMPRFR------ATNL-EVISFMDNLSTSCLSG-RLTDDDGEVVALWLSVVGEDVGGKDY-TYKYGLSMSYILPV 722 (955)
T ss_pred EecCCCCccee------ecce-EEEEEeccccccccce-EEECCCCeEEEEEeeeeccccCCcee-EEEeccchHHHHHH
Confidence 11111111110 0111 123222222 345554 56688999999977766543 23222 34566667889999
Q ss_pred HHHHHhcCC
Q 004596 652 FEFARDMQE 660 (743)
Q Consensus 652 l~~~~~~~d 660 (743)
++.++....
T Consensus 723 l~rlk~g~~ 731 (955)
T KOG1421|consen 723 LERLKLGPS 731 (955)
T ss_pred HHHHhcCCC
Confidence 999987533
No 31
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=88.82 E-value=0.52 Score=45.43 Aligned_cols=44 Identities=25% Similarity=0.484 Sum_probs=30.9
Q ss_pred ccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHH
Q 004596 603 AVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 649 (743)
Q Consensus 603 ai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~ 649 (743)
+.-.|.|||||+-.+|++|||-.+..-..+-.+ .+-|. |++.+.
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak--~i~f~-P~e~l~ 147 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAK--AIDFI-PVETLP 147 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEE--EEEEE-EHHHHH
T ss_pred EEEecCCCCcccCCCCCEEEEEEEEEEcCCcee--eEEEE-eeeecC
Confidence 445799999999999999999988754333333 57777 887653
No 32
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=88.22 E-value=3 Score=49.69 Aligned_cols=119 Identities=14% Similarity=0.206 Sum_probs=69.6
Q ss_pred CCCCcEEEEEEcc-------CCCCc------cceeCCC-------CCCCCCCeEEEEecCCCCCCCCCCCeeEeEEEeee
Q 004596 516 KGPLDVSLLQLGY-------IPDQL------CPIDADF-------GQPSLGSAAYVIGHGLFGPRCGLSPSVSSGVVAKV 575 (743)
Q Consensus 516 ~~~~DLALLkLe~-------~p~~l------~pi~l~~-------s~~~~G~~V~vIGyPlfg~~~g~~psvt~GiIS~v 575 (743)
..-.|+||++++. +.+++ +.+.+.+ ....+|..|+=+|. ..| .|.|.++++
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~Gr-----TTg----yT~G~lNg~ 610 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGR-----TTG----YTTGILNGI 610 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecc-----cCC----ccceEecce
Confidence 3446999999986 11222 1222221 23578999999986 333 577888765
Q ss_pred eeccCCCCCccccccCCCcCeEEEEc----cccCCCCcccceecCCc------eEEEEEeeeecCCCCcccCceeEEEeh
Q 004596 576 VKANLPSYGQSTLQRNSAYPVMLETT----AAVHPGGSGGAVVNLDG------HMIGLVTSNARHGGGTVIPHLNFSIPC 645 (743)
Q Consensus 576 v~~~~~~~~~~~~~~~~~~~~~IqTd----Aai~~GnSGGPL~d~~G------~VIGIvtsna~~~gg~~~p~lnFaIPi 645 (743)
.-. +-. .+......++... +=..+|+||.=|++.-+ .|+||..+..+. .+ .++...|+
T Consensus 611 klv----yw~---dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge---~k--qfglftPi 678 (695)
T PF08192_consen 611 KLV----YWA---DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE---QK--QFGLFTPI 678 (695)
T ss_pred EEE----Eec---CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc---cc--eeeccCcH
Confidence 321 100 0111112233333 23457999999998533 499999886332 22 58888888
Q ss_pred hHHHHHHHHH
Q 004596 646 AVLRPIFEFA 655 (743)
Q Consensus 646 ~~l~~il~~~ 655 (743)
..+..-+++.
T Consensus 679 ~~il~rl~~v 688 (695)
T PF08192_consen 679 NEILDRLEEV 688 (695)
T ss_pred HHHHHHHHHh
Confidence 8776655554
No 33
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=87.76 E-value=0.46 Score=45.83 Aligned_cols=31 Identities=26% Similarity=0.521 Sum_probs=22.1
Q ss_pred ccccCCCCcccceecCCceEEEEEeeeecCC
Q 004596 601 TAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 631 (743)
Q Consensus 601 dAai~~GnSGGPL~d~~G~VIGIvtsna~~~ 631 (743)
+....+|.||+|+||.+|++|||-.......
T Consensus 91 ~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~~ 121 (132)
T PF00949_consen 91 DLDFPKGSSGSPIFNQNGEIVGLYGNGVEVG 121 (132)
T ss_dssp ---S-TTGTT-EEEETTSCEEEEEEEEEE-T
T ss_pred ecccCCCCCCCceEcCCCcEEEEEccceeec
Confidence 3457899999999999999999987765543
No 34
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=86.56 E-value=7 Score=38.63 Aligned_cols=110 Identities=17% Similarity=0.141 Sum_probs=64.5
Q ss_pred cccEEEEEeccC---CCCCCceecCC---CCCCCCeEEEEeCCCCCCCc-cccccceEEEEEeee--cC--CCCCCCceE
Q 004596 214 TSRVAILGVSSY---LKDLPNIALTP---LNKRGDLLLAVGSPFGVLSP-MHFFNSVSMGSVANC--YP--PRSTTRSLL 282 (743)
Q Consensus 214 ~t~~A~l~i~~~---~~~~~~~~~~~---~~~~G~~v~~igsPFg~~sP-~~f~ns~s~Givs~~--~~--~~~~~~~~i 282 (743)
..||||||++.. .....++.+.. .++.|+.+.++|-+.....- ..-.......+++.. .. ........+
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 359999999854 12233444433 35899999999999863221 012223344444432 11 011234566
Q ss_pred Eeec----cccC---CCceEcCCCcEEEEEecccccc-CCcceEEEeeHH
Q 004596 283 MADI----RCLP---GGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWE 324 (743)
Q Consensus 283 ~tD~----~~~p---Gg~vf~~~g~liGiv~~~l~~~-g~~~l~~~ip~~ 324 (743)
.++. .... ||||+..++.||||++.. ..+ ......+.+++.
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~ 214 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVS 214 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGG
T ss_pred cccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHH
Confidence 6665 4444 599999988999999987 333 332345555544
No 35
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=84.86 E-value=1.1 Score=43.26 Aligned_cols=35 Identities=26% Similarity=0.489 Sum_probs=28.1
Q ss_pred EEEEccccCCCCcccceecCCceEEEEEeeeecCC
Q 004596 597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHG 631 (743)
Q Consensus 597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~ 631 (743)
+..-+..-.+|+||-|++|-.|+||||+-..+...
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG 130 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEG 130 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEEEET
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCCCCC
Confidence 34456778899999999998999999998877643
No 36
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=80.85 E-value=1.4 Score=45.88 Aligned_cols=45 Identities=31% Similarity=0.477 Sum_probs=34.2
Q ss_pred EEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhH
Q 004596 597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAV 647 (743)
Q Consensus 597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~ 647 (743)
.+.-+..+..|+||+|++ .+|++||=++-..-++ |..+|.||++.
T Consensus 170 Ll~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~d-----p~~Gygi~ie~ 214 (218)
T PF05580_consen 170 LLEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVND-----PTKGYGIFIEW 214 (218)
T ss_pred hhhhhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCceeeecHHH
Confidence 344445677899999999 5999999999886433 46789998754
No 37
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=79.91 E-value=15 Score=39.14 Aligned_cols=33 Identities=30% Similarity=0.506 Sum_probs=28.2
Q ss_pred CceEEEEECCCeeEEEEEEeCCCeEEeccccccc
Q 004596 392 ASVCLITIDDGVWASGVLLNDQGLILTNAHLLEP 425 (743)
Q Consensus 392 ~SVV~I~~~~~~wGSGfvV~~~G~ILTNaHVV~p 425 (743)
|..+-|.+++.-|+||++|+++ |||++..|+..
T Consensus 17 PWlA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~ 49 (267)
T PF09342_consen 17 PWLADIYVDGRYWCSGVLLDPH-WLLVSSSCLRG 49 (267)
T ss_pred cceeeEEEcCeEEEEEEEeccc-eEEEeccccCC
Confidence 5566777777789999999997 99999999974
No 38
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=74.97 E-value=3.8 Score=39.30 Aligned_cols=31 Identities=29% Similarity=0.508 Sum_probs=23.3
Q ss_pred eEEEEccccCCCCcccceecCCceEEEEEeee
Q 004596 596 VMLETTAAVHPGGSGGAVVNLDGHMIGLVTSN 627 (743)
Q Consensus 596 ~~IqTdAai~~GnSGGPL~d~~G~VIGIvtsn 627 (743)
.++....++.||+.||+|+ .+--||||+|+.
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tag 109 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILR-CKHGVIGIVTAG 109 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEE
T ss_pred CceeecccCCCCCCCceeE-eCCCeEEEEEeC
Confidence 3444556899999999999 456699999996
No 39
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=54.32 E-value=4.5 Score=40.00 Aligned_cols=42 Identities=29% Similarity=0.256 Sum_probs=30.9
Q ss_pred hhhhccccccccccCcee---eeeeeeecccccC--ChhhhhhccCC
Q 004596 26 GLKMRRHAFHQYNSGKTT---LSASGMLLPLSFF--DTKVAERNWGV 67 (743)
Q Consensus 26 ~~k~~~~~f~~~~~g~~t---~sas~~~~p~~~~--~~~~~~~~~~~ 67 (743)
||.-++|+||.|+.|.+| .||-...=|.+.. .+.+.+|.+++
T Consensus 38 GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd 84 (154)
T KOG0441|consen 38 GLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD 84 (154)
T ss_pred cCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence 444499999999999998 6776666566555 46666777766
No 40
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=51.84 E-value=9.3 Score=42.81 Aligned_cols=24 Identities=25% Similarity=0.515 Sum_probs=21.2
Q ss_pred cccCCCCcccceecCCceEEEEEe
Q 004596 602 AAVHPGGSGGAVVNLDGHMIGLVT 625 (743)
Q Consensus 602 Aai~~GnSGGPL~d~~G~VIGIvt 625 (743)
....+|+||+.|+|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 366789999999999999999964
No 41
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=50.55 E-value=76 Score=38.36 Aligned_cols=109 Identities=17% Similarity=0.186 Sum_probs=63.0
Q ss_pred ccccccccEEEEEeccCC-------------CCCCceecC--------CCCCCCCeEEEEeCCCCCCCccccccceEEEE
Q 004596 209 LMSKSTSRVAILGVSSYL-------------KDLPNIALT--------PLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGS 267 (743)
Q Consensus 209 ~~~~~~t~~A~l~i~~~~-------------~~~~~~~~~--------~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Gi 267 (743)
++.+.++|+||+||+.+. ..-|.+.+. ..+..|.+|+-+|.==|+ |.|+
T Consensus 537 ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G~ 606 (695)
T PF08192_consen 537 IINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTGI 606 (695)
T ss_pred hhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cceE
Confidence 445778899999998432 112333332 247789999999998885 6777
Q ss_pred Eeeec----CCCC-CCCceEEee----ccccCC---CceEcCCC------cEEEEEeccccccCCcceEEEeeHHHHHHH
Q 004596 268 VANCY----PPRS-TTRSLLMAD----IRCLPG---GPVFGEHA------HFVGILIRPLRQKSGAEIQLVIPWEAIATA 329 (743)
Q Consensus 268 vs~~~----~~~~-~~~~~i~tD----~~~~pG---g~vf~~~g------~liGiv~~~l~~~g~~~l~~~ip~~~i~~~ 329 (743)
|...- .++. ....|+++- +=..+| .-|+++-+ .|+||+-+.=+ ....+++..||..|++=
T Consensus 607 lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~r 684 (695)
T PF08192_consen 607 LNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILDR 684 (695)
T ss_pred ecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHHH
Confidence 76431 1111 011223332 112224 44556522 38888874332 22377889999999853
No 42
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=49.77 E-value=48 Score=35.56 Aligned_cols=71 Identities=23% Similarity=0.218 Sum_probs=55.6
Q ss_pred cCCCCCCCCeEEEEeCCCCCCCccccccceEEEEEeeecCCCCCCCceEEeeccccCC---CceEcCCCcEEEEEecccc
Q 004596 234 LTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPG---GPVFGEHAHFVGILIRPLR 310 (743)
Q Consensus 234 ~~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs~~~~~~~~~~~~i~tD~~~~pG---g~vf~~~g~liGiv~~~l~ 310 (743)
.....+.+|.|.++|.|=.- |..+....+.|.|-..... +++-|+...|| .||++.+.++||+.+....
T Consensus 154 ~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~~------~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~ 225 (251)
T COG3591 154 TASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKGN------KLFYDADTLPGSSGSPVLISKDEVIGVHYNGPG 225 (251)
T ss_pred cccccccCceeEEEeccCCC--CcceeEeeecceeEEEecc------eEEEEecccCCCCCCceEecCceEEEEEecCCC
Confidence 34569999999999999884 4345666677766555432 67788888897 9999999999999998887
Q ss_pred cc
Q 004596 311 QK 312 (743)
Q Consensus 311 ~~ 312 (743)
..
T Consensus 226 ~~ 227 (251)
T COG3591 226 AN 227 (251)
T ss_pred cc
Confidence 55
No 43
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=49.47 E-value=12 Score=36.52 Aligned_cols=42 Identities=24% Similarity=0.539 Sum_probs=31.7
Q ss_pred eeccccCCCceEcCCCcEEEEEeccccccCC-cceEEEeeHHHH
Q 004596 284 ADIRCLPGGPVFGEHAHFVGILIRPLRQKSG-AEIQLVIPWEAI 326 (743)
Q Consensus 284 tD~~~~pGg~vf~~~g~liGiv~~~l~~~g~-~~l~~~ip~~~i 326 (743)
+|.+=-.||||+-..|++|||..+-++.+|- -.+.|+ ||+.+
T Consensus 104 s~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 104 SDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp HHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred EEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 3344444699999999999999999998844 378888 99865
No 44
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=47.94 E-value=15 Score=41.91 Aligned_cols=45 Identities=27% Similarity=0.448 Sum_probs=32.9
Q ss_pred EEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhH
Q 004596 597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAV 647 (743)
Q Consensus 597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~ 647 (743)
.+.-+..+..|+||+|++ .+|++||=+|=-.-++ |..+|.|-++.
T Consensus 350 ll~~tgGivqGMSGSPi~-q~gkliGAvtHVfvnd-----pt~GYGi~ie~ 394 (402)
T TIGR02860 350 LLEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVND-----PTSGYGVYIEW 394 (402)
T ss_pred HhhHhCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCcceeehHHH
Confidence 333345677899999999 5999999988776543 34678885544
No 45
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=40.59 E-value=84 Score=29.43 Aligned_cols=17 Identities=24% Similarity=0.425 Sum_probs=14.3
Q ss_pred EEEEeCCCeEEecccccc
Q 004596 407 GVLLNDQGLILTNAHLLE 424 (743)
Q Consensus 407 GfvV~~~G~ILTNaHVV~ 424 (743)
++-|.. |..+|+.||++
T Consensus 3 avHIGn-G~~vt~tHva~ 19 (105)
T PF03510_consen 3 AVHIGN-GRYVTVTHVAK 19 (105)
T ss_pred eEEeCC-CEEEEEEEEec
Confidence 566775 89999999997
No 46
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=40.05 E-value=89 Score=35.91 Aligned_cols=37 Identities=32% Similarity=0.372 Sum_probs=24.5
Q ss_pred cCeEEEEcc-------ccCCCCcccceecCCc---eEEEEEeeeecC
Q 004596 594 YPVMLETTA-------AVHPGGSGGAVVNLDG---HMIGLVTSNARH 630 (743)
Q Consensus 594 ~~~~IqTdA-------ai~~GnSGGPL~d~~G---~VIGIvtsna~~ 630 (743)
...||.|.+ ...||+.|-|-+-..| -|+|++++.++.
T Consensus 483 Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr~ 529 (535)
T PF05416_consen 483 QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATRS 529 (535)
T ss_dssp EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-SS
T ss_pred eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhccC
Confidence 345666543 4568999999996655 589999998874
No 47
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=36.53 E-value=85 Score=33.41 Aligned_cols=81 Identities=25% Similarity=0.357 Sum_probs=41.9
Q ss_pred cccEEEEEeccCCCCCCceec---CCCCCCCCeEEEEeCCCCCCCccccccceEEEEEe---eecCCCCCCCceEEeecc
Q 004596 214 TSRVAILGVSSYLKDLPNIAL---TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVA---NCYPPRSTTRSLLMADIR 287 (743)
Q Consensus 214 ~t~~A~l~i~~~~~~~~~~~~---~~~~~~G~~v~~igsPFg~~sP~~f~ns~s~Givs---~~~~~~~~~~~~i~tD~~ 287 (743)
-.|+.++|.. +++|+.+- -..++.||.|..||+=|-- +++ +-.|| ...+ ..+..|+---+.
T Consensus 81 ~~DiviirmP---kDfpPf~~kl~FR~P~~~e~v~mVg~~fq~-------k~~-~s~vSesS~i~p--~~~~~fWkHwIs 147 (235)
T PF00863_consen 81 GRDIVIIRMP---KDFPPFPQKLKFRAPKEGERVCMVGSNFQE-------KSI-SSTVSESSWIYP--EENSHFWKHWIS 147 (235)
T ss_dssp CSSEEEEE-----TTS----S---B----TT-EEEEEEEECSS-------CCC-EEEEEEEEEEEE--ETTTTEEEE-C-
T ss_pred CccEEEEeCC---cccCCcchhhhccCCCCCCEEEEEEEEEEc-------CCe-eEEECCceEEee--cCCCCeeEEEec
Confidence 3599999987 45666554 2369999999999997762 222 22233 2232 123468888888
Q ss_pred ccCC---CceEc-CCCcEEEEEec
Q 004596 288 CLPG---GPVFG-EHAHFVGILIR 307 (743)
Q Consensus 288 ~~pG---g~vf~-~~g~liGiv~~ 307 (743)
-.+| .|+++ ++|.+|||-..
T Consensus 148 Tk~G~CG~PlVs~~Dg~IVGiHsl 171 (235)
T PF00863_consen 148 TKDGDCGLPLVSTKDGKIVGIHSL 171 (235)
T ss_dssp --TT-TT-EEEETTT--EEEEEEE
T ss_pred CCCCccCCcEEEcCCCcEEEEEcC
Confidence 8887 89997 68899999883
No 48
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=32.68 E-value=1.3e+02 Score=29.62 Aligned_cols=39 Identities=15% Similarity=0.133 Sum_probs=24.7
Q ss_pred cccEEEEEeccCCC---CCCceecCC---CCCCCCeEEEEeCCCC
Q 004596 214 TSRVAILGVSSYLK---DLPNIALTP---LNKRGDLLLAVGSPFG 252 (743)
Q Consensus 214 ~t~~A~l~i~~~~~---~~~~~~~~~---~~~~G~~v~~igsPFg 252 (743)
..|+||||++.... ...++.+.. ....|+.+.+.|....
T Consensus 88 ~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~ 132 (232)
T cd00190 88 DNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRT 132 (232)
T ss_pred cCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcC
Confidence 35999999984321 123333322 4778899999986543
No 49
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. The crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=25.70 E-value=56 Score=25.45 Aligned_cols=21 Identities=33% Similarity=0.574 Sum_probs=18.3
Q ss_pred CCCcccceecCCceEEEEEee
Q 004596 606 PGGSGGAVVNLDGHMIGLVTS 626 (743)
Q Consensus 606 ~GnSGGPL~d~~G~VIGIvts 626 (743)
.+-+.-||+|.+|+++|+++.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 477888999999999999975
No 50
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=25.10 E-value=4.1e+02 Score=26.27 Aligned_cols=39 Identities=18% Similarity=0.163 Sum_probs=24.2
Q ss_pred cccEEEEEeccCC---CCCCceecCC---CCCCCCeEEEEeCCCC
Q 004596 214 TSRVAILGVSSYL---KDLPNIALTP---LNKRGDLLLAVGSPFG 252 (743)
Q Consensus 214 ~t~~A~l~i~~~~---~~~~~~~~~~---~~~~G~~v~~igsPFg 252 (743)
..|+||||++... ....++.+.. .+..|+.+.+.|-.-.
T Consensus 88 ~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~ 132 (229)
T smart00020 88 DNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRT 132 (229)
T ss_pred cCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCC
Confidence 4699999998431 1233333422 4777888888885443
No 51
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=24.37 E-value=69 Score=31.12 Aligned_cols=32 Identities=19% Similarity=0.401 Sum_probs=21.0
Q ss_pred ceEEeeccccCC---CceEcCCCcEEEEEeccccc
Q 004596 280 SLLMADIRCLPG---GPVFGEHAHFVGILIRPLRQ 311 (743)
Q Consensus 280 ~~i~tD~~~~pG---g~vf~~~g~liGiv~~~l~~ 311 (743)
.+.+.|..+-+| .|+||.+|++|||--.-+.-
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 456677777776 99999999999997766543
No 52
>PF13267 DUF4058: Protein of unknown function (DUF4058)
Probab=23.48 E-value=59 Score=34.92 Aligned_cols=26 Identities=31% Similarity=0.489 Sum_probs=21.2
Q ss_pred ccc-cccchhHHHHHHHHHHHHhhccccc
Q 004596 701 EDN-IEGKGSRFAKFIAERREVLKHSTQV 728 (743)
Q Consensus 701 ~~~-~~~~~~~~akfi~~~~~~~~~~~~~ 728 (743)
|.| ..++|.. +|+++||++|.|.|+|
T Consensus 124 P~NKr~G~gr~--~Y~~KRq~vl~S~tHL 150 (254)
T PF13267_consen 124 PANKRPGEGRA--AYERKRQEVLGSGTHL 150 (254)
T ss_pred cccCCCCccHH--HHHHHHHHHHhccCce
Confidence 335 4577776 9999999999999987
No 53
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=21.42 E-value=1.2e+02 Score=29.42 Aligned_cols=26 Identities=31% Similarity=0.551 Sum_probs=18.7
Q ss_pred eEEeeccccCC---CceEcCCCcEEEEEec
Q 004596 281 LLMADIRCLPG---GPVFGEHAHFVGILIR 307 (743)
Q Consensus 281 ~i~tD~~~~pG---g~vf~~~g~liGiv~~ 307 (743)
+++.--.+.|| |+|+-++| +|||+++
T Consensus 80 ~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta 108 (127)
T PF00947_consen 80 LLIGEGPAEPGDCGGILRCKHG-VIGIVTA 108 (127)
T ss_dssp EEEEE-SSSTT-TCSEEEETTC-EEEEEEE
T ss_pred ceeecccCCCCCCCceeEeCCC-eEEEEEe
Confidence 34444567887 88888875 9999996
No 54
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=20.83 E-value=84 Score=33.07 Aligned_cols=54 Identities=15% Similarity=0.206 Sum_probs=35.9
Q ss_pred eEEEEccccCCCCccccee--cC--CceEEEEEeeeecCCCCcccCceeEEEeh--hHHHHHHHHH
Q 004596 596 VMLETTAAVHPGGSGGAVV--NL--DGHMIGLVTSNARHGGGTVIPHLNFSIPC--AVLRPIFEFA 655 (743)
Q Consensus 596 ~~IqTdAai~~GnSGGPL~--d~--~G~VIGIvtsna~~~gg~~~p~lnFaIPi--~~l~~il~~~ 655 (743)
.-+++++....|+-|||++ |. .-+++||+.+..... ..+||=++ +.|++.++.+
T Consensus 169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~------~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANH------AMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccc------cceehhhhhHHHHHHHHHhh
Confidence 3456778899999999998 22 378999999976422 35566554 3444444443
No 55
>PF08208 RNA_polI_A34: DNA-directed RNA polymerase I subunit RPA34.5; InterPro: IPR013240 This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription [].; PDB: 3NFG_N.
Probab=20.82 E-value=33 Score=35.05 Aligned_cols=13 Identities=62% Similarity=0.950 Sum_probs=0.0
Q ss_pred Ccchhhhcccccc
Q 004596 23 DPKGLKMRRHAFH 35 (743)
Q Consensus 23 dpk~~k~~~~~f~ 35 (743)
-|+|||||.|+|=
T Consensus 109 qp~gLk~Rf~P~G 121 (198)
T PF08208_consen 109 QPKGLKMRFFPFG 121 (198)
T ss_dssp -------------
T ss_pred CCCCcceeeecCC
Confidence 4899999999884
No 56
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=20.04 E-value=1.1e+02 Score=31.82 Aligned_cols=49 Identities=16% Similarity=0.175 Sum_probs=18.0
Q ss_pred EEEEccccCCCCcccceecCCceEEEEEeeeecCCCCcccCceeEEEehhHHH
Q 004596 597 MLETTAAVHPGGSGGAVVNLDGHMIGLVTSNARHGGGTVIPHLNFSIPCAVLR 649 (743)
Q Consensus 597 ~IqTdAai~~GnSGGPL~d~~G~VIGIvtsna~~~gg~~~p~lnFaIPi~~l~ 649 (743)
.....+...+|.||-|+|+.+ +++|+.+...... ...+.|+..|+.-+.
T Consensus 137 ~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~---~~~n~n~~spip~~~ 185 (203)
T PF02122_consen 137 FASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGS---NRENNNRMSPIPPIP 185 (203)
T ss_dssp EEEE-----TT-TT-EEE-SS--EEEEEEEE----------------------
T ss_pred CCceEcCCCCCCCCCCeEECC-CceEeecCccccc---ccccccccccccccc
Confidence 556667788999999999877 9999998863211 112566666655443
Done!