Query 008426
Match_columns 566
No_of_seqs 248 out of 1774
Neff 4.5
Searched_HMMs 46136
Date Thu Mar 28 12:00:18 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/008426.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/008426hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 99.9 2.4E-21 5.2E-26 209.0 13.6 126 205-337 127-259 (455)
2 PRK10942 serine endoprotease; 99.8 9.7E-20 2.1E-24 197.5 13.3 125 206-337 149-280 (473)
3 TIGR02038 protease_degS peripl 99.8 1.5E-19 3.3E-24 188.8 13.7 126 206-337 115-247 (351)
4 PRK10898 serine endoprotease; 99.8 2.2E-19 4.8E-24 188.0 13.3 126 206-337 115-248 (353)
5 TIGR02037 degP_htrA_DO peripla 99.8 1.6E-18 3.5E-23 184.7 13.9 125 206-337 95-226 (428)
6 COG0265 DegQ Trypsin-like seri 99.7 4.8E-18 1E-22 176.0 11.9 127 206-338 109-242 (347)
7 PRK10898 serine endoprotease; 99.5 3E-14 6.6E-19 149.5 12.6 99 390-559 51-161 (353)
8 TIGR02038 protease_degS peripl 99.5 4.7E-14 1E-18 147.8 13.2 99 390-559 51-161 (351)
9 PRK10139 serine endoprotease; 99.5 4.8E-14 1E-18 152.7 12.5 100 390-559 46-175 (455)
10 TIGR02037 degP_htrA_DO peripla 99.4 6.8E-13 1.5E-17 141.8 12.6 86 405-560 57-143 (428)
11 PRK10942 serine endoprotease; 99.4 1.5E-12 3.2E-17 141.8 11.2 85 405-559 110-196 (473)
12 KOG1320 Serine protease [Postt 99.2 7.6E-12 1.6E-16 135.7 6.7 128 203-336 211-350 (473)
13 COG0265 DegQ Trypsin-like seri 99.1 9.6E-10 2.1E-14 114.4 12.2 102 389-560 38-157 (347)
14 PF13365 Trypsin_2: Trypsin-li 98.7 3.4E-08 7.3E-13 84.8 6.8 22 408-429 1-22 (120)
15 KOG1320 Serine protease [Postt 97.9 4.3E-05 9.4E-10 83.9 8.7 111 392-559 136-262 (473)
16 PF13365 Trypsin_2: Trypsin-li 97.8 1.5E-05 3.2E-10 68.4 2.9 24 284-307 97-120 (120)
17 cd00190 Tryp_SPc Trypsin-like 97.5 0.003 6.4E-08 59.4 13.9 43 520-562 87-134 (232)
18 PF00089 Trypsin: Trypsin; In 97.4 0.0031 6.8E-08 58.9 12.4 40 521-560 86-130 (220)
19 smart00020 Tryp_SPc Trypsin-li 97.2 0.0084 1.8E-07 56.8 13.2 43 520-562 87-134 (229)
20 PF00089 Trypsin: Trypsin; In 96.6 0.055 1.2E-06 50.6 13.3 113 215-330 87-217 (220)
21 KOG1421 Predicted signaling-as 96.4 0.0075 1.6E-07 68.8 7.7 39 390-428 58-107 (955)
22 COG3591 V8-like Glu-specific e 94.1 0.15 3.2E-06 52.6 7.3 76 232-315 152-227 (251)
23 PF10459 Peptidase_S46: Peptid 93.8 0.21 4.6E-06 58.0 8.8 31 394-428 39-69 (698)
24 PF10459 Peptidase_S46: Peptid 92.3 0.12 2.5E-06 60.2 3.8 32 279-310 621-652 (698)
25 PF00863 Peptidase_C4: Peptida 92.0 0.72 1.6E-05 47.3 8.6 34 521-557 81-116 (235)
26 PF02907 Peptidase_S29: Hepati 91.5 0.14 3E-06 48.6 2.6 45 284-329 101-146 (148)
27 PF00949 Peptidase_S7: Peptida 89.7 0.28 6.2E-06 46.1 3.1 35 280-314 86-120 (132)
28 cd00190 Tryp_SPc Trypsin-like 88.8 1.7 3.7E-05 40.8 7.7 98 215-312 89-208 (232)
29 smart00020 Tryp_SPc Trypsin-li 86.9 4.2 9E-05 38.5 9.1 98 215-312 89-208 (229)
30 PF08192 Peptidase_S64: Peptid 86.2 3.5 7.5E-05 47.9 9.3 111 209-333 537-685 (695)
31 PF00863 Peptidase_C4: Peptida 80.7 3.7 8E-05 42.2 6.1 87 215-311 82-172 (235)
32 PF00944 Peptidase_S3: Alphavi 79.5 1.7 3.6E-05 41.6 3.0 31 281-311 96-126 (158)
33 PF00947 Pico_P2A: Picornaviru 79.3 2.8 6E-05 39.4 4.3 30 281-311 80-109 (127)
34 PF03761 DUF316: Domain of unk 77.3 20 0.00043 36.3 10.2 47 511-557 143-199 (282)
35 COG3591 V8-like Glu-specific e 69.6 27 0.00058 36.4 9.0 33 395-428 52-85 (251)
36 KOG0441 Cu2+/Zn2+ superoxide d 60.1 3.5 7.7E-05 39.9 0.6 43 25-67 37-84 (154)
37 KOG1421 Predicted signaling-as 52.4 33 0.00071 40.6 6.6 120 208-336 126-258 (955)
38 PF05579 Peptidase_S32: Equine 45.7 14 0.0003 39.0 2.2 43 260-313 188-230 (297)
39 PF05580 Peptidase_S55: SpoIVB 41.9 22 0.00047 36.4 2.8 38 288-328 177-214 (218)
40 PF01732 DUF31: Putative pepti 35.7 26 0.00057 37.6 2.5 27 283-309 347-373 (374)
41 PF00548 Peptidase_C3: 3C cyst 30.7 67 0.0015 31.2 4.2 90 215-310 72-169 (172)
42 PF08208 RNA_polI_A34: DNA-dir 26.4 22 0.00047 34.9 0.0 13 23-35 109-121 (198)
43 PF01732 DUF31: Putative pepti 25.0 44 0.00096 35.9 2.0 27 405-431 35-71 (374)
44 PF03761 DUF316: Domain of unk 23.5 5.3E+02 0.011 26.1 9.3 89 215-314 161-257 (282)
45 TIGR02860 spore_IV_B stage IV 21.4 62 0.0013 35.9 2.2 39 287-326 356-394 (402)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=99.86 E-value=2.4e-21 Score=209.01 Aligned_cols=126 Identities=22% Similarity=0.338 Sum_probs=105.8
Q ss_pred ccccccC-CCcceEEEEEEccCCCCCC--cccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCCC---CC
Q 008426 205 SNLSLMS-KSTSRVAILGVSSYLKDLP--NIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS---TT 278 (566)
Q Consensus 205 ~~~~~l~-~~~tdlAvLki~~~~~~~~--~~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~~---~~ 278 (566)
-+++++| +..+||||||++.. .+++ .+++|..+++||+|+|||+|||+ ..++|.||||++.+... ..
T Consensus 127 ~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~------~~tvt~GivS~~~r~~~~~~~~ 199 (455)
T PRK10139 127 FDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGL------GQTATSGIISALGRSGLNLEGL 199 (455)
T ss_pred EEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCC------CCceEEEEEccccccccCCCCc
Confidence 3478888 55599999999742 2344 56688899999999999999994 78999999999876421 23
Q ss_pred CceEEEecCCCCCCCCcceeccCCcEEEEEeeeccCCC-CcceEEEeeHHHHHHHHHhhh
Q 008426 279 RSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS-GAEIQLVIPWEAIATACSDLL 337 (566)
Q Consensus 279 ~~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~-g~gl~faIP~~~i~~~~~~ll 337 (566)
..+|||||+++||||||||||.+|+||||+++.++..+ ..|++||||++.++.++.+++
T Consensus 200 ~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~ 259 (455)
T PRK10139 200 ENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLI 259 (455)
T ss_pred ceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHh
Confidence 56899999999999999999999999999999887653 359999999999999888765
No 2
>PRK10942 serine endoprotease; Provisional
Probab=99.81 E-value=9.7e-20 Score=197.45 Aligned_cols=125 Identities=20% Similarity=0.354 Sum_probs=105.1
Q ss_pred cccccC-CCcceEEEEEEccCCCCCC--cccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCC---CCCC
Q 008426 206 NLSLMS-KSTSRVAILGVSSYLKDLP--NIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTR 279 (566)
Q Consensus 206 ~~~~l~-~~~tdlAvLki~~~~~~~~--~~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~---~~~~ 279 (566)
++++++ +..+||||||++. ..+++ .++++..+++||+|++||+|||+ .+++|.||||++.+.. ..+.
T Consensus 149 ~a~vv~~D~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P~g~------~~tvt~GiVs~~~r~~~~~~~~~ 221 (473)
T PRK10942 149 DAKVVGKDPRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNPYGL------GETVTSGIVSALGRSGLNVENYE 221 (473)
T ss_pred EEEEEEecCCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCCCCC------CcceeEEEEEEeecccCCccccc
Confidence 467888 4559999999973 23344 46678899999999999999994 7899999999987642 1345
Q ss_pred ceEEEecCCCCCCCCcceeccCCcEEEEEeeeccCCC-CcceEEEeeHHHHHHHHHhhh
Q 008426 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS-GAEIQLVIPWEAIATACSDLL 337 (566)
Q Consensus 280 ~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~-g~gl~faIP~~~i~~~~~~ll 337 (566)
.+|||||+++||||||||||.+|+||||+++.+...+ +.|++||||++.++.++..+.
T Consensus 222 ~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v~~~l~ 280 (473)
T PRK10942 222 NFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMV 280 (473)
T ss_pred ceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHHHHHHH
Confidence 7899999999999999999999999999999887664 469999999999999888765
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.81 E-value=1.5e-19 Score=188.84 Aligned_cols=126 Identities=17% Similarity=0.334 Sum_probs=103.3
Q ss_pred cccccC-CCcceEEEEEEccCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCCC---CCCce
Q 008426 206 NLSLMS-KSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS---TTRSL 281 (566)
Q Consensus 206 ~~~~l~-~~~tdlAvLki~~~~~~~~~~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~~---~~~~l 281 (566)
++++++ +..+||||||++.......+++.+..+++||+|++||+|||+ .++++.|+||++.+... ....+
T Consensus 115 ~a~vv~~d~~~DlAvlkv~~~~~~~~~l~~s~~~~~G~~V~aiG~P~~~------~~s~t~GiIs~~~r~~~~~~~~~~~ 188 (351)
T TIGR02038 115 EAELVGSDPLTDLAVLKIEGDNLPTIPVNLDRPPHVGDVVLAIGNPYNL------GQTITQGIISATGRNGLSSVGRQNF 188 (351)
T ss_pred EEEEEEecCCCCEEEEEecCCCCceEeccCcCccCCCCEEEEEeCCCCC------CCcEEEEEEEeccCcccCCCCcceE
Confidence 467777 455999999998543233356677889999999999999995 67999999999865321 23568
Q ss_pred EEEecCCCCCCCCcceeccCCcEEEEEeeeccCCC---CcceEEEeeHHHHHHHHHhhh
Q 008426 282 LMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS---GAEIQLVIPWEAIATACSDLL 337 (566)
Q Consensus 282 iqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~---g~gl~faIP~~~i~~~~~~ll 337 (566)
|||||.++||||||||||.+|+||||+++.+...+ ..|++||||++.+..++..++
T Consensus 189 iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~ 247 (351)
T TIGR02038 189 IQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKII 247 (351)
T ss_pred EEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHh
Confidence 99999999999999999999999999998876542 259999999999999888765
No 4
>PRK10898 serine endoprotease; Provisional
Probab=99.80 E-value=2.2e-19 Score=187.98 Aligned_cols=126 Identities=18% Similarity=0.307 Sum_probs=102.2
Q ss_pred cccccC-CCcceEEEEEEccCCCCCCcccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCC---CCCCce
Q 008426 206 NLSLMS-KSTSRVAILGVSSYLKDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTRSL 281 (566)
Q Consensus 206 ~~~~l~-~~~tdlAvLki~~~~~~~~~~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~---~~~~~l 281 (566)
++++++ +..+||||||++.......+++.+..+++||+|+++|+|||+ ..++|.|+||+..+.. .....+
T Consensus 115 ~a~vv~~d~~~DlAvl~v~~~~l~~~~l~~~~~~~~G~~V~aiG~P~g~------~~~~t~Giis~~~r~~~~~~~~~~~ 188 (353)
T PRK10898 115 EALLVGSDSLTDLAVLKINATNLPVIPINPKRVPHIGDVVLAIGNPYNL------GQTITQGIISATGRIGLSPTGRQNF 188 (353)
T ss_pred EEEEEEEcCCCCEEEEEEcCCCCCeeeccCcCcCCCCCEEEEEeCCCCc------CCCcceeEEEeccccccCCccccce
Confidence 356677 445999999998543233356677889999999999999994 6799999999886532 123468
Q ss_pred EEEecCCCCCCCCcceeccCCcEEEEEeeeccCCC----CcceEEEeeHHHHHHHHHhhh
Q 008426 282 LMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS----GAEIQLVIPWEAIATACSDLL 337 (566)
Q Consensus 282 iqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~----g~gl~faIP~~~i~~~~~~ll 337 (566)
|||||+++||||||||+|.+|+||||+++.+...+ ..+++||||++.+..++..++
T Consensus 189 iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~ 248 (353)
T PRK10898 189 LQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLI 248 (353)
T ss_pred EEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHh
Confidence 99999999999999999999999999999876442 258999999999999888765
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.78 E-value=1.6e-18 Score=184.74 Aligned_cols=125 Identities=23% Similarity=0.408 Sum_probs=104.0
Q ss_pred cccccC-CCcceEEEEEEccCCCCCCc--ccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCC---CCCC
Q 008426 206 NLSLMS-KSTSRVAILGVSSYLKDLPN--IALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPR---STTR 279 (566)
Q Consensus 206 ~~~~l~-~~~tdlAvLki~~~~~~~~~--~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~---~~~~ 279 (566)
++++++ +..+|+||||++.. .++|. ++++..+++||+|+++|+|||+ ..++|.|+||++.+.. ..+.
T Consensus 95 ~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~------~~~~t~G~vs~~~~~~~~~~~~~ 167 (428)
T TIGR02037 95 KAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGL------GQTVTSGIVSALGRSGLGIGDYE 167 (428)
T ss_pred EEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcC------CCcEEEEEEEecccCccCCCCcc
Confidence 356777 44589999999853 34554 5577889999999999999994 7899999999987542 2345
Q ss_pred ceEEEecCCCCCCCCcceeccCCcEEEEEeeeccCC-CCcceEEEeeHHHHHHHHHhhh
Q 008426 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQK-SGAEIQLVIPWEAIATACSDLL 337 (566)
Q Consensus 280 ~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~-~g~gl~faIP~~~i~~~~~~ll 337 (566)
.+||||++++||||||||||.+|+||||+++.+... +..|++||||++.+++++..+.
T Consensus 168 ~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~ 226 (428)
T TIGR02037 168 NFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLI 226 (428)
T ss_pred ceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHH
Confidence 689999999999999999999999999999988765 3469999999999999988875
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.75 E-value=4.8e-18 Score=176.01 Aligned_cols=127 Identities=22% Similarity=0.342 Sum_probs=108.3
Q ss_pred cccccC-CCcceEEEEEEccCC-CCCCcccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCC-CC---CCC
Q 008426 206 NLSLMS-KSTSRVAILGVSSYL-KDLPNIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPP-RS---TTR 279 (566)
Q Consensus 206 ~~~~l~-~~~tdlAvLki~~~~-~~~~~~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~-~~---~~~ 279 (566)
+++++| +..+|+|+||++... .....++++..++.||+++|||+||| |.++++.||||+..+. .. ...
T Consensus 109 ~a~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g------~~~tvt~Givs~~~r~~v~~~~~~~ 182 (347)
T COG0265 109 PAKLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFG------LGQTVTSGIVSALGRTGVGSAGGYV 182 (347)
T ss_pred EEEEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCC------cccceeccEEeccccccccCccccc
Confidence 467888 556999999999643 23346778999999999999999999 4899999999999874 22 145
Q ss_pred ceEEEecCCCCCCCCcceeccCCcEEEEEeeeccCCC-CcceEEEeeHHHHHHHHHhhhc
Q 008426 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS-GAEIQLVIPWEAIATACSDLLL 338 (566)
Q Consensus 280 ~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~-g~gl~faIP~~~i~~~~~~lll 338 (566)
.+|||||+++|||||||++|.+|++|||+++.+...+ ..|++|+||++.+..++..++.
T Consensus 183 ~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~ 242 (347)
T COG0265 183 NFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELIS 242 (347)
T ss_pred chhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHH
Confidence 7899999999999999999999999999999998775 4689999999999998887764
No 7
>PRK10898 serine endoprotease; Provisional
Probab=99.55 E-value=3e-14 Score=149.51 Aligned_cols=99 Identities=24% Similarity=0.324 Sum_probs=81.6
Q ss_pred hhcccCcEEEEEeCC-----------CceeeEEEEeCCceEEecccccccccCCcccccCCCCCcccCCCCCCCCCCCcc
Q 008426 390 IQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGV 458 (566)
Q Consensus 390 ~~~~~~sVV~V~~~~-----------~~~GSG~~~~~~G~ilTn~HVv~~~rfg~~~~~g~~~~~~f~~~~~~~~~~~~~ 458 (566)
++++.|+||.|.... .++||||+|+++||||||+|||+..
T Consensus 51 ~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a----------------------------- 101 (353)
T PRK10898 51 VRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA----------------------------- 101 (353)
T ss_pred HHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC-----------------------------
Confidence 677899999996621 2689999999999999999999841
Q ss_pred cccccccCCCCCCCcccccchhhhhhhccccCCCCceeEEEEEccCCCCeEEeeEEEEecCCCcceEEEEeccCCCCCcc
Q 008426 459 DQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCP 538 (566)
Q Consensus 459 ~~~~~~q~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~p~DlALLqie~vp~~L~p 538 (566)
.+|.|++.++ .+++|+|++.+ ...||||||++. ..|++
T Consensus 102 ------------------------------------~~i~V~~~dg---~~~~a~vv~~d-~~~DlAvl~v~~--~~l~~ 139 (353)
T PRK10898 102 ------------------------------------DQIIVALQDG---RVFEALLVGSD-SLTDLAVLKINA--TNLPV 139 (353)
T ss_pred ------------------------------------CEEEEEeCCC---CEEEEEEEEEc-CCCCEEEEEEcC--CCCCe
Confidence 2488988653 57999999986 469999999986 35788
Q ss_pred eec-CCCCCCCCCeEEEecCCC
Q 008426 539 IDA-DFGQPSLGSAAYVIGHGL 559 (566)
Q Consensus 539 i~~-~~~~p~~Gs~V~vIG~pL 559 (566)
+++ +++.+++|++|++||||+
T Consensus 140 ~~l~~~~~~~~G~~V~aiG~P~ 161 (353)
T PRK10898 140 IPINPKRVPHIGDVVLAIGNPY 161 (353)
T ss_pred eeccCcCcCCCCCEEEEEeCCC
Confidence 886 445689999999999996
No 8
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.54 E-value=4.7e-14 Score=147.77 Aligned_cols=99 Identities=26% Similarity=0.310 Sum_probs=81.7
Q ss_pred hhcccCcEEEEEeCC-----------CceeeEEEEeCCceEEecccccccccCCcccccCCCCCcccCCCCCCCCCCCcc
Q 008426 390 IQKALASVCLITIDD-----------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGV 458 (566)
Q Consensus 390 ~~~~~~sVV~V~~~~-----------~~~GSG~~~~~~G~ilTn~HVv~~~rfg~~~~~g~~~~~~f~~~~~~~~~~~~~ 458 (566)
++++.|+||.|.+.. .+.||||+|+++||||||+|||+..
T Consensus 51 ~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~----------------------------- 101 (351)
T TIGR02038 51 VRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA----------------------------- 101 (351)
T ss_pred HHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC-----------------------------
Confidence 677899999996521 3679999999999999999999841
Q ss_pred cccccccCCCCCCCcccccchhhhhhhccccCCCCceeEEEEEccCCCCeEEeeEEEEecCCCcceEEEEeccCCCCCcc
Q 008426 459 DQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCP 538 (566)
Q Consensus 459 ~~~~~~q~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~p~DlALLqie~vp~~L~p 538 (566)
..|.|++.++ .+++|++|+.+ ..+||||||++. ..+++
T Consensus 102 ------------------------------------~~i~V~~~dg---~~~~a~vv~~d-~~~DlAvlkv~~--~~~~~ 139 (351)
T TIGR02038 102 ------------------------------------DQIVVALQDG---RKFEAELVGSD-PLTDLAVLKIEG--DNLPT 139 (351)
T ss_pred ------------------------------------CEEEEEECCC---CEEEEEEEEec-CCCCEEEEEecC--CCCce
Confidence 2488888653 57999999976 479999999996 34788
Q ss_pred eec-CCCCCCCCCeEEEecCCC
Q 008426 539 IDA-DFGQPSLGSAAYVIGHGL 559 (566)
Q Consensus 539 i~~-~~~~p~~Gs~V~vIG~pL 559 (566)
+++ +...+++|++|++||||+
T Consensus 140 ~~l~~s~~~~~G~~V~aiG~P~ 161 (351)
T TIGR02038 140 IPVNLDRPPHVGDVVLAIGNPY 161 (351)
T ss_pred EeccCcCccCCCCEEEEEeCCC
Confidence 886 456799999999999996
No 9
>PRK10139 serine endoprotease; Provisional
Probab=99.53 E-value=4.8e-14 Score=152.70 Aligned_cols=100 Identities=28% Similarity=0.443 Sum_probs=82.1
Q ss_pred hhcccCcEEEEEeC----------------------------CCceeeEEEEeC-CceEEecccccccccCCcccccCCC
Q 008426 390 IQKALASVCLITID----------------------------DGVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWR 440 (566)
Q Consensus 390 ~~~~~~sVV~V~~~----------------------------~~~~GSG~~~~~-~G~ilTn~HVv~~~rfg~~~~~g~~ 440 (566)
.+++.|+||-|.+. ..++||||+|++ +||||||+|||+..
T Consensus 46 ~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a----------- 114 (455)
T PRK10139 46 LEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQA----------- 114 (455)
T ss_pred HHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCC-----------
Confidence 67789999998431 025799999985 79999999999841
Q ss_pred CCcccCCCCCCCCCCCcccccccccCCCCCCCcccccchhhhhhhccccCCCCceeEEEEEccCCCCeEEeeEEEEecCC
Q 008426 441 NGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKG 520 (566)
Q Consensus 441 ~~~~f~~~~~~~~~~~~~~~~~~~q~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~ 520 (566)
.+|.|++.++ ++++|+|++.+ .
T Consensus 115 ------------------------------------------------------~~i~V~~~dg---~~~~a~vvg~D-~ 136 (455)
T PRK10139 115 ------------------------------------------------------QKISIQLNDG---REFDAKLIGSD-D 136 (455)
T ss_pred ------------------------------------------------------CEEEEEECCC---CEEEEEEEEEc-C
Confidence 2588998643 57999999986 4
Q ss_pred CcceEEEEeccCCCCCcceec-CCCCCCCCCeEEEecCCC
Q 008426 521 PLDVSLLQLGYIPDQLCPIDA-DFGQPSLGSAAYVIGHGL 559 (566)
Q Consensus 521 p~DlALLqie~vp~~L~pi~~-~~~~p~~Gs~V~vIG~pL 559 (566)
.+||||||++. +..|.++++ +++.+++|++|++||||+
T Consensus 137 ~~DlAvlkv~~-~~~l~~~~lg~s~~~~~G~~V~aiG~P~ 175 (455)
T PRK10139 137 QSDIALLQIQN-PSKLTQIAIADSDKLRVGDFAVAVGNPF 175 (455)
T ss_pred CCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEecCC
Confidence 79999999986 457899996 667899999999999996
No 10
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.43 E-value=6.8e-13 Score=141.79 Aligned_cols=86 Identities=27% Similarity=0.299 Sum_probs=72.3
Q ss_pred CceeeEEEEeCCceEEecccccccccCCcccccCCCCCcccCCCCCCCCCCCcccccccccCCCCCCCcccccchhhhhh
Q 008426 405 GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRA 484 (566)
Q Consensus 405 ~~~GSG~~~~~~G~ilTn~HVv~~~rfg~~~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~~~~~~~~~~~~~~~~ 484 (566)
.++||||+|+++||||||+|||+.+
T Consensus 57 ~~~GSGfii~~~G~IlTn~Hvv~~~------------------------------------------------------- 81 (428)
T TIGR02037 57 RGLGSGVIISADGYILTNNHVVDGA------------------------------------------------------- 81 (428)
T ss_pred cceeeEEEECCCCEEEEcHHHcCCC-------------------------------------------------------
Confidence 4689999999999999999999841
Q ss_pred hccccCCCCceeEEEEEccCCCCeEEeeEEEEecCCCcceEEEEeccCCCCCcceec-CCCCCCCCCeEEEecCCCC
Q 008426 485 YKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDA-DFGQPSLGSAAYVIGHGLF 560 (566)
Q Consensus 485 ~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~p~DlALLqie~vp~~L~pi~~-~~~~p~~Gs~V~vIG~pLf 560 (566)
.+|+|++.+ ..+++|++++.+ ..+||||||++. +..|+++++ +.+.+++|++|++||||+-
T Consensus 82 ----------~~i~V~~~~---~~~~~a~vv~~d-~~~DlAllkv~~-~~~~~~~~l~~~~~~~~G~~v~aiG~p~g 143 (428)
T TIGR02037 82 ----------DEITVTLSD---GREFKAKLVGKD-PRTDIAVLKIDA-KKNLPVIKLGDSDKLRVGDWVLAIGNPFG 143 (428)
T ss_pred ----------CeEEEEeCC---CCEEEEEEEEec-CCCCEEEEEecC-CCCceEEEccCCCCCCCCCEEEEEECCCc
Confidence 248888864 357999999975 479999999986 356899997 5578899999999999963
No 11
>PRK10942 serine endoprotease; Provisional
Probab=99.38 E-value=1.5e-12 Score=141.83 Aligned_cols=85 Identities=31% Similarity=0.422 Sum_probs=71.8
Q ss_pred CceeeEEEEeC-CceEEecccccccccCCcccccCCCCCcccCCCCCCCCCCCcccccccccCCCCCCCcccccchhhhh
Q 008426 405 GVWASGVLLND-QGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHR 483 (566)
Q Consensus 405 ~~~GSG~~~~~-~G~ilTn~HVv~~~rfg~~~~~g~~~~~~f~~~~~~~~~~~~~~~~~~~q~~~~~~~~~~~~~~~~~~ 483 (566)
.++||||+|++ +||||||+|||+..
T Consensus 110 ~~~GSG~ii~~~~G~IlTn~HVv~~a------------------------------------------------------ 135 (473)
T PRK10942 110 MALGSGVIIDADKGYVVTNNHVVDNA------------------------------------------------------ 135 (473)
T ss_pred cceEEEEEEECCCCEEEeChhhcCCC------------------------------------------------------
Confidence 46999999996 69999999999841
Q ss_pred hhccccCCCCceeEEEEEccCCCCeEEeeEEEEecCCCcceEEEEeccCCCCCcceec-CCCCCCCCCeEEEecCCC
Q 008426 484 AYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQLCPIDA-DFGQPSLGSAAYVIGHGL 559 (566)
Q Consensus 484 ~~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~p~DlALLqie~vp~~L~pi~~-~~~~p~~Gs~V~vIG~pL 559 (566)
.+|+|++.+ .+.|+|+|++.+ ..+||||||++. ++.|.++++ +++.+++|++|++||+|+
T Consensus 136 -----------~~i~V~~~d---g~~~~a~vv~~D-~~~DlAvlki~~-~~~l~~~~lg~s~~l~~G~~V~aiG~P~ 196 (473)
T PRK10942 136 -----------TKIKVQLSD---GRKFDAKVVGKD-PRSDIALIQLQN-PKNLTAIKMADSDALRVGDYTVAIGNPY 196 (473)
T ss_pred -----------CEEEEEECC---CCEEEEEEEEec-CCCCEEEEEecC-CCCCceeEecCccccCCCCEEEEEcCCC
Confidence 258899865 357999999975 479999999985 567999997 667899999999999995
No 12
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.25 E-value=7.6e-12 Score=135.70 Aligned_cols=128 Identities=20% Similarity=0.322 Sum_probs=104.2
Q ss_pred ccccccccC-CCcceEEEEEEccCCCCCC--cccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCCC---
Q 008426 203 ESSNLSLMS-KSTSRVAILGVSSYLKDLP--NIALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS--- 276 (566)
Q Consensus 203 ~~~~~~~l~-~~~tdlAvLki~~~~~~~~--~~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~~--- 276 (566)
-+..|.++| +...|+|++|++....-++ +.+.+..++.|+++.++|+||++ .|++++|++++..+..-
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~------~nt~t~g~vs~~~R~~~~lg 284 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGL------LNTLTQGMVSGQLRKSFKLG 284 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCcee------eeeeeecccccccccccccC
Confidence 455689999 6669999999964322233 45578999999999999999995 89999999999876431
Q ss_pred -----CCCceEEEecCCCCCCCCcceeccCCcEEEEEeeeccCCC-CcceEEEeeHHHHHHHHHhh
Q 008426 277 -----TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKS-GAEIQLVIPWEAIATACSDL 336 (566)
Q Consensus 277 -----~~~~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~-g~gl~faIP~~~i~~~~~~l 336 (566)
....++|||+++++||+|||++|.+|++||++++...+-+ ..+++|++|.|.+...+...
T Consensus 285 ~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~ 350 (473)
T KOG1320|consen 285 LETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRL 350 (473)
T ss_pred cccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhh
Confidence 2345799999999999999999999999999998877643 34999999999998765543
No 13
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.07 E-value=9.6e-10 Score=114.36 Aligned_cols=102 Identities=24% Similarity=0.238 Sum_probs=82.2
Q ss_pred hhhcccCcEEEEEeCC-----------------CceeeEEEEeCCceEEecccccccccCCcccccCCCCCcccCCCCCC
Q 008426 389 PIQKALASVCLITIDD-----------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSA 451 (566)
Q Consensus 389 ~~~~~~~sVV~V~~~~-----------------~~~GSG~~~~~~G~ilTn~HVv~~~rfg~~~~~g~~~~~~f~~~~~~ 451 (566)
..+++.|+||.+.... .++||||+++.+|||+||+|||+..
T Consensus 38 ~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a---------------------- 95 (347)
T COG0265 38 AVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGA---------------------- 95 (347)
T ss_pred HHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCc----------------------
Confidence 3667889999875521 2899999999999999999999931
Q ss_pred CCCCCcccccccccCCCCCCCcccccchhhhhhhccccCCCCceeEEEEEccCCCCeEEeeEEEEecCCCcceEEEEecc
Q 008426 452 SSGHTGVDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGY 531 (566)
Q Consensus 452 ~~~~~~~~~~~~~q~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~p~DlALLqie~ 531 (566)
..|.|.++ .+++++|++++.+ ...|+|+||++.
T Consensus 96 -------------------------------------------~~i~v~l~---dg~~~~a~~vg~d-~~~dlavlki~~ 128 (347)
T COG0265 96 -------------------------------------------EEITVTLA---DGREVPAKLVGKD-PISDLAVLKIDG 128 (347)
T ss_pred -------------------------------------------ceEEEEeC---CCCEEEEEEEecC-CccCEEEEEecc
Confidence 24788884 3578999999964 579999999997
Q ss_pred CCCCCcceec-CCCCCCCCCeEEEecCCCC
Q 008426 532 IPDQLCPIDA-DFGQPSLGSAAYVIGHGLF 560 (566)
Q Consensus 532 vp~~L~pi~~-~~~~p~~Gs~V~vIG~pLf 560 (566)
... +..+.+ +...+++|+.+++||+|+-
T Consensus 129 ~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g 157 (347)
T COG0265 129 AGG-LPVIALGDSDKLRVGDVVVAIGNPFG 157 (347)
T ss_pred CCC-CceeeccCCCCcccCCEEEEecCCCC
Confidence 433 666664 6778899999999999985
No 14
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=98.70 E-value=3.4e-08 Score=84.79 Aligned_cols=22 Identities=45% Similarity=0.939 Sum_probs=20.7
Q ss_pred eeEEEEeCCceEEecccccccc
Q 008426 408 ASGVLLNDQGLILTNAHLLEPW 429 (566)
Q Consensus 408 GSG~~~~~~G~ilTn~HVv~~~ 429 (566)
||||+|+++||||||+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~ 22 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDW 22 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCC
T ss_pred CEEEEEcCCceEEEchhheecc
Confidence 8999999999999999999964
No 15
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.86 E-value=4.3e-05 Score=83.88 Aligned_cols=111 Identities=20% Similarity=0.241 Sum_probs=76.4
Q ss_pred cccCcEEEEEeCC--------------CceeeEEEEeCCceEEecccccccccCCcccccCCCCCcccCCCCCCCCCCCc
Q 008426 392 KALASVCLITIDD--------------GVWASGVLLNDQGLILTNAHLLEPWRFGKTTVSGWRNGVSFQPEDSASSGHTG 457 (566)
Q Consensus 392 ~~~~sVV~V~~~~--------------~~~GSG~~~~~~G~ilTn~HVv~~~rfg~~~~~g~~~~~~f~~~~~~~~~~~~ 457 (566)
+...+||.|+..+ ...|||+|++.+|+|+||+||+... .+. ++.+.
T Consensus 136 ~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~----~~~-------y~~~~--------- 195 (473)
T KOG1320|consen 136 ECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVE----PRI-------YAHSS--------- 195 (473)
T ss_pred cccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEE----ecc-------ccCCC---------
Confidence 3456777776521 3459999999999999999999741 100 00000
Q ss_pred ccccccccCCCCCCCcccccchhhhhhhccccCCCCceeEEEEEccCCCCeEEeeEEEEecCCCcceEEEEeccCCCC-C
Q 008426 458 VDQYQKSQTLPPKMPKIVDSSVDEHRAYKLSSFSRGHRKIRVRLDHLDPWIWCDAKIVYVCKGPLDVSLLQLGYIPDQ-L 536 (566)
Q Consensus 458 ~~~~~~~q~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~I~Vrl~~~~~~~w~~A~VV~vs~~p~DlALLqie~vp~~-L 536 (566)
++ --.|+|++..+. ..-..+.|+++++ ..|||+++++. ++. +
T Consensus 196 ------~~----------------------------l~~vqi~aa~~~-~~s~ep~i~g~d~-~~gvA~l~ik~-~~~i~ 238 (473)
T KOG1320|consen 196 ------TV----------------------------LLRVQIDAAIGP-GNSGEPVIVGVDK-VAGVAFLKIKT-PENIL 238 (473)
T ss_pred ------cc----------------------------eeeEEEEEeecC-CccCCCeEEcccc-ccceEEEEEec-CCccc
Confidence 00 124666665542 3347899998864 79999999974 434 7
Q ss_pred cceec-CCCCCCCCCeEEEecCCC
Q 008426 537 CPIDA-DFGQPSLGSAAYVIGHGL 559 (566)
Q Consensus 537 ~pi~~-~~~~p~~Gs~V~vIG~pL 559 (566)
++|++ -......|+++.++|.|+
T Consensus 239 ~~i~~~~~~~~~~G~~~~a~~~~f 262 (473)
T KOG1320|consen 239 YVIPLGVSSHFRTGVEVSAIGNGF 262 (473)
T ss_pred ceeecceeeeecccceeeccccCc
Confidence 88885 456789999999999986
No 16
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.78 E-value=1.5e-05 Score=68.42 Aligned_cols=24 Identities=46% Similarity=0.855 Sum_probs=22.5
Q ss_pred EecCCCCCCCCcceeccCCcEEEE
Q 008426 284 ADIRCLPGMEGGPVFGEHAHFVGI 307 (566)
Q Consensus 284 TDA~ilPGnsGGpVfn~~G~lIGI 307 (566)
+|+.+.||+|||||||.+|+||||
T Consensus 97 ~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 97 TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ESSS-STTTTTSEEEETTSEEEEE
T ss_pred eecccCCCcEeHhEECCCCEEEeC
Confidence 899999999999999999999997
No 17
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=97.47 E-value=0.003 Score=59.43 Aligned_cols=43 Identities=26% Similarity=0.258 Sum_probs=33.3
Q ss_pred CCcceEEEEeccC---CCCCcceecCCC--CCCCCCeEEEecCCCCCC
Q 008426 520 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGP 562 (566)
Q Consensus 520 ~p~DlALLqie~v---p~~L~pi~~~~~--~p~~Gs~V~vIG~pLfgP 562 (566)
...|||||+|+.. .+.+.||.+... .+..|+.+++.|++....
T Consensus 87 ~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~ 134 (232)
T cd00190 87 YDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSE 134 (232)
T ss_pred CcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCC
Confidence 3589999999862 223688886544 788999999999998764
No 18
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=97.36 E-value=0.0031 Score=58.95 Aligned_cols=40 Identities=30% Similarity=0.479 Sum_probs=32.0
Q ss_pred CcceEEEEeccC---CCCCcceecCC--CCCCCCCeEEEecCCCC
Q 008426 521 PLDVSLLQLGYI---PDQLCPIDADF--GQPSLGSAAYVIGHGLF 560 (566)
Q Consensus 521 p~DlALLqie~v---p~~L~pi~~~~--~~p~~Gs~V~vIG~pLf 560 (566)
..|||||||+.. .+.+.|+.+.. ..+..|+.+.++|++.-
T Consensus 86 ~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~ 130 (220)
T PF00089_consen 86 DNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRT 130 (220)
T ss_dssp TTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBS
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccc
Confidence 589999999984 35578888655 34589999999999974
No 19
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=97.16 E-value=0.0084 Score=56.83 Aligned_cols=43 Identities=26% Similarity=0.261 Sum_probs=32.3
Q ss_pred CCcceEEEEeccC---CCCCcceecCCC--CCCCCCeEEEecCCCCCC
Q 008426 520 GPLDVSLLQLGYI---PDQLCPIDADFG--QPSLGSAAYVIGHGLFGP 562 (566)
Q Consensus 520 ~p~DlALLqie~v---p~~L~pi~~~~~--~p~~Gs~V~vIG~pLfgP 562 (566)
...|||||+|+.. .+.+.||.+... .+..|+.+++.|++....
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~ 134 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSE 134 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCC
Confidence 4589999999872 234778776433 677899999999987653
No 20
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.57 E-value=0.055 Score=50.60 Aligned_cols=113 Identities=14% Similarity=0.134 Sum_probs=70.5
Q ss_pred ceEEEEEEccC---CCCCCccc--C-CCCCCCCCeEEEEeCCCCCCCCCccCCce---EEEEEecc--cC--CCCCCCce
Q 008426 215 SRVAILGVSSY---LKDLPNIA--L-TPLNKRGDLLLAVGSPFGVLSPMHFFNSV---SMGSVANC--YP--PRSTTRSL 281 (566)
Q Consensus 215 tdlAvLki~~~---~~~~~~~~--~-S~~~~~Gd~V~aiGSPFG~lsP~~F~nsv---S~GiISn~--~~--~~~~~~~l 281 (566)
.|+||||++.. .....++. . ...++.|+.+.++|.+..... .....+ ...+++.. .. ........
T Consensus 87 ~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~--~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~ 164 (220)
T PF00089_consen 87 NDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDN--GYSSNLQSVTVPVVSRKTCRSSYNDNLTPNM 164 (220)
T ss_dssp TSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTT--SBTSBEEEEEEEEEEHHHHHHHTTTTSTTTE
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc--ccccccccccccccccccccccccccccccc
Confidence 49999999854 11122222 2 234689999999999996321 111233 34444432 11 01123456
Q ss_pred EEEec----CCCCCCCCcceeccCCcEEEEEeeeccCCCCc-ceEEEeeHHHHH
Q 008426 282 LMADI----RCLPGMEGGPVFGEHAHFVGILIRPLRQKSGA-EIQLVIPWEAIA 330 (566)
Q Consensus 282 iqTDA----~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~g~-gl~faIP~~~i~ 330 (566)
+.++. ...+|+|||||++.++.||||++.. ..++.. ...+.+++..++
T Consensus 165 ~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~ 217 (220)
T PF00089_consen 165 ICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYL 217 (220)
T ss_dssp EEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGH
T ss_pred ccccccccccccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHHhh
Confidence 77776 7889999999999998999999987 333322 356777765443
No 21
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=96.43 E-value=0.0075 Score=68.80 Aligned_cols=39 Identities=33% Similarity=0.525 Sum_probs=32.3
Q ss_pred hhcccCcEEEEEeCC----------CceeeEEEEeC-CceEEeccccccc
Q 008426 390 IQKALASVCLITIDD----------GVWASGVLLND-QGLILTNAHLLEP 428 (566)
Q Consensus 390 ~~~~~~sVV~V~~~~----------~~~GSG~~~~~-~G~ilTn~HVv~~ 428 (566)
+..+.++||.|.... ..-|+||++++ .||||||+|||.+
T Consensus 58 ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~p 107 (955)
T KOG1421|consen 58 IANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAP 107 (955)
T ss_pred hhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCC
Confidence 456789999996531 56799999996 8999999999997
No 22
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=94.09 E-value=0.15 Score=52.63 Aligned_cols=76 Identities=22% Similarity=0.223 Sum_probs=58.3
Q ss_pred ccCCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCCCCCCceEEEecCCCCCCCCcceeccCCcEEEEEeee
Q 008426 232 IALTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (566)
Q Consensus 232 ~~~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~~~~~~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~ 311 (566)
.......+.+|.|.++|.|-.- |..+....+.|.|-.... ..++-|+-..||+||.||++.+.+|||+.+..
T Consensus 152 ~~~~~~~~~~d~i~v~GYP~dk--~~~~~~~e~t~~v~~~~~------~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g 223 (251)
T COG3591 152 RNTASEAKANDRITVIGYPGDK--PNIGTMWESTGKVNSIKG------NKLFYDADTLPGSSGSPVLISKDEVIGVHYNG 223 (251)
T ss_pred cccccccccCceeEEEeccCCC--CcceeEeeecceeEEEec------ceEEEEecccCCCCCCceEecCceEEEEEecC
Confidence 3346678999999999999985 323344455555554432 36889999999999999999999999999987
Q ss_pred ccCC
Q 008426 312 LRQK 315 (566)
Q Consensus 312 l~~~ 315 (566)
....
T Consensus 224 ~~~~ 227 (251)
T COG3591 224 PGAN 227 (251)
T ss_pred CCcc
Confidence 7644
No 23
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=93.83 E-value=0.21 Score=58.03 Aligned_cols=31 Identities=29% Similarity=0.280 Sum_probs=24.4
Q ss_pred cCcEEEEEeCCCceeeEEEEeCCceEEeccccccc
Q 008426 394 LASVCLITIDDGVWASGVLLNDQGLILTNAHLLEP 428 (566)
Q Consensus 394 ~~sVV~V~~~~~~~GSG~~~~~~G~ilTn~HVv~~ 428 (566)
..+||.. + +-.||-+|+++|+|+||.|++-.
T Consensus 39 ~dAvv~f--~--gGCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 39 KDAVVRF--G--GGCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred hhheeec--C--CceeEEEEcCCceEEecchhhhh
Confidence 3566654 2 24899999999999999999864
No 24
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=92.35 E-value=0.12 Score=60.18 Aligned_cols=32 Identities=19% Similarity=0.363 Sum_probs=28.8
Q ss_pred CceEEEecCCCCCCCCcceeccCCcEEEEEee
Q 008426 279 RSLLMADIRCLPGMEGGPVFGEHAHFVGILIR 310 (566)
Q Consensus 279 ~~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~ 310 (566)
+..++|+.-|--||||.||+|.+|+|||++.-
T Consensus 621 pv~FlstnDitGGNSGSPvlN~~GeLVGl~FD 652 (698)
T PF10459_consen 621 PVNFLSTNDITGGNSGSPVLNAKGELVGLAFD 652 (698)
T ss_pred eeEEEeccCcCCCCCCCccCCCCceEEEEeec
Confidence 44589999999999999999999999999985
No 25
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=91.99 E-value=0.72 Score=47.26 Aligned_cols=34 Identities=18% Similarity=0.430 Sum_probs=18.3
Q ss_pred CcceEEEEeccCCCCCccee--cCCCCCCCCCeEEEecC
Q 008426 521 PLDVSLLQLGYIPDQLCPID--ADFGQPSLGSAAYVIGH 557 (566)
Q Consensus 521 p~DlALLqie~vp~~L~pi~--~~~~~p~~Gs~V~vIG~ 557 (566)
..||.++|+.. +++|.+ +.|..|+.|++|..||.
T Consensus 81 ~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~ 116 (235)
T PF00863_consen 81 GRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGS 116 (235)
T ss_dssp CSSEEEEE--T---TS----S---B----TT-EEEEEEE
T ss_pred CccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEE
Confidence 59999999976 467766 68899999999999985
No 26
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=91.49 E-value=0.14 Score=48.58 Aligned_cols=45 Identities=29% Similarity=0.527 Sum_probs=36.4
Q ss_pred EecCCCCCCCCcceeccCCcEEEEEeeeccCCCCc-ceEEEeeHHHH
Q 008426 284 ADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKSGA-EIQLVIPWEAI 329 (566)
Q Consensus 284 TDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~g~-gl~faIP~~~i 329 (566)
.-+..+-|.|||||+-..|.+|||..+.++.++.. .+-|+ ||+.+
T Consensus 101 ~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e~l 146 (148)
T PF02907_consen 101 RPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVETL 146 (148)
T ss_dssp EEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHHHH
T ss_pred ceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eeeec
Confidence 45667889999999999999999999988876554 88888 99865
No 27
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=89.74 E-value=0.28 Score=46.07 Aligned_cols=35 Identities=20% Similarity=0.418 Sum_probs=25.1
Q ss_pred ceEEEecCCCCCCCCcceeccCCcEEEEEeeeccC
Q 008426 280 SLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQ 314 (566)
Q Consensus 280 ~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~ 314 (566)
.+...|..+-+|+||.|+||.+|++|||--..+.-
T Consensus 86 ~~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 86 GIGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eEEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 35677788999999999999999999998876643
No 28
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=88.83 E-value=1.7 Score=40.79 Aligned_cols=98 Identities=16% Similarity=0.153 Sum_probs=51.1
Q ss_pred ceEEEEEEccCCCC---CC--cccCCC-CCCCCCeEEEEeCCCCCCC--CCccCCceEEEEEeccc--CC----CCCCCc
Q 008426 215 SRVAILGVSSYLKD---LP--NIALTP-LNKRGDLLLAVGSPFGVLS--PMHFFNSVSMGSVANCY--PP----RSTTRS 280 (566)
Q Consensus 215 tdlAvLki~~~~~~---~~--~~~~S~-~~~~Gd~V~aiGSPFG~ls--P~~F~nsvS~GiISn~~--~~----~~~~~~ 280 (566)
.|+||||++..... .. .+.... ....|+.+.+.|....... ...-......-+++... .. ......
T Consensus 89 ~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~ 168 (232)
T cd00190 89 NDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDN 168 (232)
T ss_pred CCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCc
Confidence 49999999853211 11 223322 5788999999997554211 00001112222332210 00 000011
Q ss_pred eEE-----EecCCCCCCCCcceeccC---CcEEEEEeeec
Q 008426 281 LLM-----ADIRCLPGMEGGPVFGEH---AHFVGILIRPL 312 (566)
Q Consensus 281 liq-----TDA~ilPGnsGGpVfn~~---G~lIGIv~~~l 312 (566)
.+- .+...-+|.|||||+... ..|+||++...
T Consensus 169 ~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 169 MLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred eEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 111 134456799999999875 66999998643
No 29
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=86.94 E-value=4.2 Score=38.53 Aligned_cols=98 Identities=13% Similarity=0.070 Sum_probs=51.2
Q ss_pred ceEEEEEEccCC--C-CCCc--ccC-CCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccC-----C-------CC
Q 008426 215 SRVAILGVSSYL--K-DLPN--IAL-TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYP-----P-------RS 276 (566)
Q Consensus 215 tdlAvLki~~~~--~-~~~~--~~~-S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~-----~-------~~ 276 (566)
.|+||||++... . ...+ +.. ...+..|+.+.+.|..-.......+...+-...+.-... . ..
T Consensus 89 ~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~ 168 (229)
T smart00020 89 NDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITD 168 (229)
T ss_pred CCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCC
Confidence 599999998531 1 1112 222 235778899999986554211111222222222221110 0 00
Q ss_pred CCCceEE--EecCCCCCCCCcceeccCC--cEEEEEeeec
Q 008426 277 TTRSLLM--ADIRCLPGMEGGPVFGEHA--HFVGILIRPL 312 (566)
Q Consensus 277 ~~~~liq--TDA~ilPGnsGGpVfn~~G--~lIGIv~~~l 312 (566)
...+... .....-+|.+||||+...+ .|+||++...
T Consensus 169 ~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 169 NMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 0000011 1355667999999998765 7999988753
No 30
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=86.16 E-value=3.5 Score=47.88 Aligned_cols=111 Identities=19% Similarity=0.244 Sum_probs=70.7
Q ss_pred ccCCCcceEEEEEEccCC-------CCCC------ccc--------CCCCCCCCCeEEEEeCCCCCCCCCccCCceEEEE
Q 008426 209 LMSKSTSRVAILGVSSYL-------KDLP------NIA--------LTPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGS 267 (566)
Q Consensus 209 ~l~~~~tdlAvLki~~~~-------~~~~------~~~--------~S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~Gi 267 (566)
++.+.++|+||+||+... .+++ .+. .-..+..|..|+=+|.==|+ |.|+
T Consensus 537 ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----------T~G~ 606 (695)
T PF08192_consen 537 IINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----------TTGI 606 (695)
T ss_pred hhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----------cceE
Confidence 344566899999998531 1221 111 02357889999999988876 4566
Q ss_pred Eeccc----CCCCC-CCceEEEe----cCCCCCCCCcceeccCCc------EEEEEeeeccCCCCc--ceEEEeeHHHHH
Q 008426 268 VANCY----PPRST-TRSLLMAD----IRCLPGMEGGPVFGEHAH------FVGILIRPLRQKSGA--EIQLVIPWEAIA 330 (566)
Q Consensus 268 ISn~~----~~~~~-~~~liqTD----A~ilPGnsGGpVfn~~G~------lIGIv~~~l~~~~g~--gl~faIP~~~i~ 330 (566)
|.+.. .++.- ...++.+. +=..+|.||.=|+++-+. |+||..+. .|+ .+++..||+.|.
T Consensus 607 lNg~klvyw~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsy----dge~kqfglftPi~~il 682 (695)
T PF08192_consen 607 LNGIKLVYWADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSY----DGEQKQFGLFTPINEIL 682 (695)
T ss_pred ecceEEEEecCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeec----CCccceeeccCcHHHHH
Confidence 66552 11110 11233333 445679999999997444 99998874 333 688899999888
Q ss_pred HHH
Q 008426 331 TAC 333 (566)
Q Consensus 331 ~~~ 333 (566)
+=+
T Consensus 683 ~rl 685 (695)
T PF08192_consen 683 DRL 685 (695)
T ss_pred HHH
Confidence 643
No 31
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=80.70 E-value=3.7 Score=42.20 Aligned_cols=87 Identities=24% Similarity=0.300 Sum_probs=44.5
Q ss_pred ceEEEEEEccCCCCCCcccC---CCCCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCCCCCCceEEEecCCCCC
Q 008426 215 SRVAILGVSSYLKDLPNIAL---TPLNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCLPG 291 (566)
Q Consensus 215 tdlAvLki~~~~~~~~~~~~---S~~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~~~~~~liqTDA~ilPG 291 (566)
.|+.++|.. .++|++.. -...+.||.|..||+=|-- -...-+||. -|.+.+. ....+-.--+.-.+|
T Consensus 82 ~DiviirmP---kDfpPf~~kl~FR~P~~~e~v~mVg~~fq~---k~~~s~vSe--sS~i~p~--~~~~fWkHwIsTk~G 151 (235)
T PF00863_consen 82 RDIVIIRMP---KDFPPFPQKLKFRAPKEGERVCMVGSNFQE---KSISSTVSE--SSWIYPE--ENSHFWKHWISTKDG 151 (235)
T ss_dssp SSEEEEE-----TTS----S---B----TT-EEEEEEEECSS---CCCEEEEEE--EEEEEEE--TTTTEEEE-C---TT
T ss_pred ccEEEEeCC---cccCCcchhhhccCCCCCCEEEEEEEEEEc---CCeeEEECC--ceEEeec--CCCCeeEEEecCCCC
Confidence 599999997 46777775 3478999999999998763 001112221 2222221 124567777888899
Q ss_pred CCCcceecc-CCcEEEEEeee
Q 008426 292 MEGGPVFGE-HAHFVGILIRP 311 (566)
Q Consensus 292 nsGGpVfn~-~G~lIGIv~~~ 311 (566)
+.|.|+++. +|.+|||-+..
T Consensus 152 ~CG~PlVs~~Dg~IVGiHsl~ 172 (235)
T PF00863_consen 152 DCGLPLVSTKDGKIVGIHSLT 172 (235)
T ss_dssp -TT-EEEETTT--EEEEEEEE
T ss_pred ccCCcEEEcCCCcEEEEEcCc
Confidence 999999975 89999999854
No 32
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=79.50 E-value=1.7 Score=41.56 Aligned_cols=31 Identities=23% Similarity=0.426 Sum_probs=25.7
Q ss_pred eEEEecCCCCCCCCcceeccCCcEEEEEeee
Q 008426 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (566)
Q Consensus 281 liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~ 311 (566)
+.+--..-.||.||-|.||..|+||||+.+.
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG 126 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGG 126 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecC
Confidence 4445566789999999999999999999974
No 33
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=79.25 E-value=2.8 Score=39.43 Aligned_cols=30 Identities=30% Similarity=0.467 Sum_probs=23.9
Q ss_pred eEEEecCCCCCCCCcceeccCCcEEEEEeee
Q 008426 281 LLMADIRCLPGMEGGPVFGEHAHFVGILIRP 311 (566)
Q Consensus 281 liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~ 311 (566)
+++.--.+.||..||+|+-++| ||||+|+.
T Consensus 80 ~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Tag 109 (127)
T PF00947_consen 80 LLIGEGPAEPGDCGGILRCKHG-VIGIVTAG 109 (127)
T ss_dssp EEEEE-SSSTT-TCSEEEETTC-EEEEEEEE
T ss_pred ceeecccCCCCCCCceeEeCCC-eEEEEEeC
Confidence 5666678999999999998776 99999983
No 34
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=77.31 E-value=20 Score=36.34 Aligned_cols=47 Identities=21% Similarity=0.224 Sum_probs=32.0
Q ss_pred eeEEEEec-------CCCcceEEEEeccC-CCCCcceec--CCCCCCCCCeEEEecC
Q 008426 511 DAKIVYVC-------KGPLDVSLLQLGYI-PDQLCPIDA--DFGQPSLGSAAYVIGH 557 (566)
Q Consensus 511 ~A~VV~vs-------~~p~DlALLqie~v-p~~L~pi~~--~~~~p~~Gs~V~vIG~ 557 (566)
.|-++..| ..+++++||.++.. ...+.|+=+ +......|+.+.+-|+
T Consensus 143 ka~il~~C~~~~~~~~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~ 199 (282)
T PF03761_consen 143 KAYILNGCKKIKKNFNRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGF 199 (282)
T ss_pred EEEEEecCCCcccccccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeec
Confidence 45556566 46799999999984 134455444 3345678999888887
No 35
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=69.60 E-value=27 Score=36.39 Aligned_cols=33 Identities=18% Similarity=0.314 Sum_probs=24.2
Q ss_pred CcEEEEEeCCCceeeE-EEEeCCceEEeccccccc
Q 008426 395 ASVCLITIDDGVWASG-VLLNDQGLILTNAHLLEP 428 (566)
Q Consensus 395 ~sVV~V~~~~~~~GSG-~~~~~~G~ilTn~HVv~~ 428 (566)
.+||-.+...+..+.. ++|+++ .|||+.||+-.
T Consensus 52 ~av~~~~~~tG~~~~~~~lI~pn-tvLTa~Hc~~s 85 (251)
T COG3591 52 SAVVQFEAATGRLCTAATLIGPN-TVLTAGHCIYS 85 (251)
T ss_pred ceeEEeecCCCcceeeEEEEcCc-eEEEeeeEEec
Confidence 5677665543334444 999998 99999999964
No 36
>KOG0441 consensus Cu2+/Zn2+ superoxide dismutase SOD1 [Inorganic ion transport and metabolism]
Probab=60.13 E-value=3.5 Score=39.89 Aligned_cols=43 Identities=28% Similarity=0.243 Sum_probs=28.9
Q ss_pred ccccccccccceeccCceee---eeeeeecccccC--CccccccccCC
Q 008426 25 KGLKMRRHAFHQYNSGKTTL---SASGMLLPLSFF--DTKVAERNWGV 67 (566)
Q Consensus 25 k~~kmr~hafh~~~sg~ttl---SaSgllLp~sl~--~~~~~~~~~~~ 67 (566)
+||+-++|+||.|+.|.+|- ||-...=|.+.. .|....|.+++
T Consensus 37 ~GL~pg~hgfHvHqfGD~t~GC~SaGphFNp~~~~hg~p~~~~rH~gd 84 (154)
T KOG0441|consen 37 TGLPPGKHGFHVHQFGDNTNGCKSAGPHFNPNKKTHGGPVDEVRHVGD 84 (154)
T ss_pred ecCCCceeeEEEEeccCCCCChhcCCCCCCCcccCCCCcccccccccc
Confidence 45666999999999999995 553444444444 45555566655
No 37
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=52.45 E-value=33 Score=40.58 Aligned_cols=120 Identities=17% Similarity=0.169 Sum_probs=83.8
Q ss_pred cccCCCcceEEEEEEccCCCC---CCcccCCC-CCCCCCeEEEEeCCCCCCCCCccCCceEEEEEecccCCCC-------
Q 008426 208 SLMSKSTSRVAILGVSSYLKD---LPNIALTP-LNKRGDLLLAVGSPFGVLSPMHFFNSVSMGSVANCYPPRS------- 276 (566)
Q Consensus 208 ~~l~~~~tdlAvLki~~~~~~---~~~~~~S~-~~~~Gd~V~aiGSPFG~lsP~~F~nsvS~GiISn~~~~~~------- 276 (566)
.+-+++.-||-++|-+..... ...+.-++ ..++|-++.++||==|- --++-.|.+|.+.++..
T Consensus 126 pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDagE------klsIlagflSrldr~apdyg~~~y 199 (955)
T KOG1421|consen 126 PVYRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAGE------KLSILAGFLSRLDRNAPDYGEDTY 199 (955)
T ss_pred cccCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCccc------eEEeehhhhhhccCCCcccccccc
Confidence 444566689999988742111 11222222 46889999999997773 33556788887765431
Q ss_pred --CCCceEEEecCCCCCCCCcceeccCCcEEEEEeeeccCCCCcceEEEeeHHHHHHHHHhh
Q 008426 277 --TTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPWEAIATACSDL 336 (566)
Q Consensus 277 --~~~~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~g~gl~faIP~~~i~~~~~~l 336 (566)
....++|.-+...-|.||.||.+-+|..|.++.+..... +=.|++|.+.+.+++.-+
T Consensus 200 ndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~ss---as~ffLpLdrV~RaL~cl 258 (955)
T KOG1421|consen 200 NDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISS---ASDFFLPLDRVVRALRCL 258 (955)
T ss_pred ccccceeeeehhcCCCCCCCCceecccceEEeeecCCcccc---cccceeeccchhhhhhhh
Confidence 123458888888999999999999999999998754433 346899999999876643
No 38
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=45.73 E-value=14 Score=39.03 Aligned_cols=43 Identities=21% Similarity=0.305 Sum_probs=29.2
Q ss_pred CCceEEEEEecccCCCCCCCceEEEecCCCCCCCCcceeccCCcEEEEEeeecc
Q 008426 260 FNSVSMGSVANCYPPRSTTRSLLMADIRCLPGMEGGPVFGEHAHFVGILIRPLR 313 (566)
Q Consensus 260 ~nsvS~GiISn~~~~~~~~~~liqTDA~ilPGnsGGpVfn~~G~lIGIv~~~l~ 313 (566)
++-+..|+|.+.. ++- --.||.||.||+..+|.+||+-++.-.
T Consensus 188 ~tGvE~G~ig~~~-------~~~----fT~~GDSGSPVVt~dg~liGVHTGSn~ 230 (297)
T PF05579_consen 188 STGVEPGFIGGGG-------AVC----FTGPGDSGSPVVTEDGDLIGVHTGSNK 230 (297)
T ss_dssp TTEEEEEEEETTE-------EEE----SS-GGCTT-EEEETTC-EEEEEEEEET
T ss_pred ccCcccceecCce-------EEE----EcCCCCCCCccCcCCCCEEEEEecCCC
Confidence 3455667776642 232 347999999999999999999998643
No 39
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=41.91 E-value=22 Score=36.39 Aligned_cols=38 Identities=18% Similarity=0.309 Sum_probs=26.8
Q ss_pred CCCCCCCcceeccCCcEEEEEeeeccCCCCcceEEEeeHHH
Q 008426 288 CLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPWEA 328 (566)
Q Consensus 288 ilPGnsGGpVfn~~G~lIGIv~~~l~~~~g~gl~faIP~~~ 328 (566)
|..||||.|++- +|+|||-++-.|... ...++.|+++.
T Consensus 177 IvqGMSGSPI~q-dGKLiGAVthvf~~d--p~~Gygi~ie~ 214 (218)
T PF05580_consen 177 IVQGMSGSPIIQ-DGKLIGAVTHVFVND--PTKGYGIFIEW 214 (218)
T ss_pred EEecccCCCEEE-CCEEEEEEEEEEecC--CCceeeecHHH
Confidence 556999999975 899999999876422 23444555553
No 40
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=35.73 E-value=26 Score=37.56 Aligned_cols=27 Identities=22% Similarity=0.390 Sum_probs=22.4
Q ss_pred EEecCCCCCCCCcceeccCCcEEEEEe
Q 008426 283 MADIRCLPGMEGGPVFGEHAHFVGILI 309 (566)
Q Consensus 283 qTDA~ilPGnsGGpVfn~~G~lIGIv~ 309 (566)
+.+...-.|.||..|+|.+|++|||..
T Consensus 347 ~~~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 347 IDNYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred ccccCCCCCCCcCeEECCCCCEEEEeC
Confidence 344455679999999999999999974
No 41
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=30.68 E-value=67 Score=31.16 Aligned_cols=90 Identities=20% Similarity=0.285 Sum_probs=48.4
Q ss_pred ceEEEEEEccCCCCCCcccC--CCC-CCCCCeEEEEeCC-CCCCCCCccCCce-EEEEEecccCCCCCCCceEEEecCCC
Q 008426 215 SRVAILGVSSYLKDLPNIAL--TPL-NKRGDLLLAVGSP-FGVLSPMHFFNSV-SMGSVANCYPPRSTTRSLLMADIRCL 289 (566)
Q Consensus 215 tdlAvLki~~~~~~~~~~~~--S~~-~~~Gd~V~aiGSP-FG~lsP~~F~nsv-S~GiISn~~~~~~~~~~liqTDA~il 289 (566)
+|++++++... .+.+.+.. ... -...+.++++-++ |+-+- .....+ ..|.| +..+ ......|.=+++--
T Consensus 72 ~Dl~~v~l~~~-~kfrDIrk~~~~~~~~~~~~~l~v~~~~~~~~~--~~v~~v~~~~~i-~~~g--~~~~~~~~Y~~~t~ 145 (172)
T PF00548_consen 72 TDLTLVKLPRN-PKFRDIRKFFPESIPEYPECVLLVNSTKFPRMI--VEVGFVTNFGFI-NLSG--TTTPRSLKYKAPTK 145 (172)
T ss_dssp EEEEEEEEESS-S-B--GGGGSBSSGGTEEEEEEEEESSSSTCEE--EEEEEEEEEEEE-EETT--EEEEEEEEEESEEE
T ss_pred eeEEEEEccCC-cccCchhhhhccccccCCCcEEEEECCCCccEE--EEEEEEeecCcc-ccCC--CEeeEEEEEccCCC
Confidence 69999999742 22222221 111 2344555555443 44210 001111 23444 3322 11234577788888
Q ss_pred CCCCCcceecc---CCcEEEEEee
Q 008426 290 PGMEGGPVFGE---HAHFVGILIR 310 (566)
Q Consensus 290 PGnsGGpVfn~---~G~lIGIv~~ 310 (566)
+|+.||+|+.. .+.+|||=+|
T Consensus 146 ~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 146 PGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp TTGTTEEEEESCGGTTEEEEEEEE
T ss_pred CCccCCeEEEeeccCccEEEEEec
Confidence 99999999964 4669999876
No 42
>PF08208 RNA_polI_A34: DNA-directed RNA polymerase I subunit RPA34.5; InterPro: IPR013240 This is a family of proteins conserved from yeasts to human. Subunit A34.5 of RNA polymerase I is a non-essential subunit which is thought to help Pol I overcome topological constraints imposed on ribosomal DNA during the process of transcription [].; PDB: 3NFG_N.
Probab=26.44 E-value=22 Score=34.92 Aligned_cols=13 Identities=62% Similarity=0.950 Sum_probs=0.0
Q ss_pred Ccccccccccccc
Q 008426 23 DPKGLKMRRHAFH 35 (566)
Q Consensus 23 dpk~~kmr~hafh 35 (566)
-|+|||||.++|=
T Consensus 109 qp~gLk~Rf~P~G 121 (198)
T PF08208_consen 109 QPKGLKMRFFPFG 121 (198)
T ss_dssp -------------
T ss_pred CCCCcceeeecCC
Confidence 4899999999984
No 43
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=24.98 E-value=44 Score=35.87 Aligned_cols=27 Identities=26% Similarity=0.272 Sum_probs=21.2
Q ss_pred CceeeEEEEeCC----------ceEEecccccccccC
Q 008426 405 GVWASGVLLNDQ----------GLILTNAHLLEPWRF 431 (566)
Q Consensus 405 ~~~GSG~~~~~~----------G~ilTn~HVv~~~rf 431 (566)
...|||.|+|-. =||-||.||++..|+
T Consensus 35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~~l~n 71 (374)
T PF01732_consen 35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVASNLRN 71 (374)
T ss_pred cCcceEEEEEEeccCCCCCCeEEEEEechhhhccccc
Confidence 468999999733 389999999996543
No 44
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=23.50 E-value=5.3e+02 Score=26.08 Aligned_cols=89 Identities=17% Similarity=0.178 Sum_probs=52.2
Q ss_pred ceEEEEEEccC---CCCCCcccCCC-CCCCCCeEEEEeC-CCCCCCCCccCCceEEEEEecccCCCCCCCceEEEecCCC
Q 008426 215 SRVAILGVSSY---LKDLPNIALTP-LNKRGDLLLAVGS-PFGVLSPMHFFNSVSMGSVANCYPPRSTTRSLLMADIRCL 289 (566)
Q Consensus 215 tdlAvLki~~~---~~~~~~~~~S~-~~~~Gd~V~aiGS-PFG~lsP~~F~nsvS~GiISn~~~~~~~~~~liqTDA~il 289 (566)
.+++||.+... ...+|.++++. .+..||.+-+-|- .-+- .+... --|..+.. ....+.++-..-
T Consensus 161 ~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~~----~~~~~---~~i~~~~~----~~~~~~~~~~~~ 229 (282)
T PF03761_consen 161 YSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNSTGK----LKHRK---LKITNCTK----CAYSICTKQYSC 229 (282)
T ss_pred cceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCCCe----EEEEE---EEEEEeec----cceeEecccccC
Confidence 46788888854 23455566543 5888999888776 2221 11111 11111110 123466666677
Q ss_pred CCCCCcceecc-CCc--EEEEEeeeccC
Q 008426 290 PGMEGGPVFGE-HAH--FVGILIRPLRQ 314 (566)
Q Consensus 290 PGnsGGpVfn~-~G~--lIGIv~~~l~~ 314 (566)
+|..|||++.. +|+ ||||.+..-..
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~ 257 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASGNYE 257 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccCCCc
Confidence 89999999843 444 99999865543
No 45
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=21.36 E-value=62 Score=35.95 Aligned_cols=39 Identities=18% Similarity=0.512 Sum_probs=29.8
Q ss_pred CCCCCCCCcceeccCCcEEEEEeeeccCCCCcceEEEeeH
Q 008426 287 RCLPGMEGGPVFGEHAHFVGILIRPLRQKSGAEIQLVIPW 326 (566)
Q Consensus 287 ~ilPGnsGGpVfn~~G~lIGIv~~~l~~~~g~gl~faIP~ 326 (566)
-|..||||.|++ .+|+|||-++-.|-..---|.+.-|.|
T Consensus 356 GivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~ 394 (402)
T TIGR02860 356 GIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEW 394 (402)
T ss_pred CEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHH
Confidence 456799999987 468999999988865433477777766
Done!