Query psy10089
Match_columns 559
No_of_seqs 467 out of 3309
Neff 9.2
Searched_HMMs 46136
Date Fri Aug 16 17:36:41 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy10089.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/10089hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 1.7E-40 3.6E-45 320.4 24.1 229 288-545 1-232 (232)
2 KOG3627|consensus 100.0 1.1E-39 2.5E-44 319.9 24.7 238 285-547 10-255 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 1E-37 2.2E-42 300.8 23.9 226 287-542 1-229 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 3.3E-35 7.2E-40 281.1 22.5 218 288-542 1-220 (220)
5 cd00190 Tryp_SPc Trypsin-like 100.0 6.7E-32 1.4E-36 260.3 17.7 177 33-221 1-231 (232)
6 COG5640 Secreted trypsin-like 100.0 9.7E-32 2.1E-36 255.0 17.0 248 284-552 29-284 (413)
7 KOG3627|consensus 100.0 9E-31 1.9E-35 256.8 17.0 186 29-222 9-253 (256)
8 smart00020 Tryp_SPc Trypsin-li 100.0 9.5E-30 2.1E-34 245.1 19.5 174 32-219 1-229 (229)
9 PF00089 Trypsin: Trypsin; In 100.0 4.1E-28 8.9E-33 232.0 14.6 172 33-219 1-220 (220)
10 COG5640 Secreted trypsin-like 99.9 1.4E-23 3E-28 199.8 8.3 182 29-224 29-279 (413)
11 PF03761 DUF316: Domain of unk 99.5 1.7E-13 3.6E-18 136.2 17.7 212 287-544 41-277 (282)
12 PF03761 DUF316: Domain of unk 99.4 1.4E-11 3E-16 122.5 15.9 174 22-200 31-259 (282)
13 PF09342 DUF1986: Domain of un 99.3 7.4E-12 1.6E-16 114.4 11.2 115 301-436 14-131 (267)
14 PF09342 DUF1986: Domain of un 99.3 2.7E-11 5.9E-16 110.8 10.9 111 42-165 14-129 (267)
15 COG3591 V8-like Glu-specific e 98.9 2.1E-08 4.6E-13 94.5 12.6 200 299-547 45-251 (251)
16 COG3591 V8-like Glu-specific e 98.3 7.9E-06 1.7E-10 77.3 11.1 147 40-197 45-224 (251)
17 TIGR02037 degP_htrA_DO peripla 98.0 8.8E-05 1.9E-09 78.2 14.8 84 325-436 58-142 (428)
18 TIGR02038 protease_degS peripl 97.7 0.00096 2.1E-08 68.2 16.0 83 325-436 78-161 (351)
19 TIGR02037 degP_htrA_DO peripla 97.7 0.00032 7E-09 73.9 12.7 82 60-165 58-140 (428)
20 PRK10898 serine endoprotease; 97.6 0.0032 6.9E-08 64.4 17.1 82 325-435 78-160 (353)
21 PRK10139 serine endoprotease; 97.5 0.0015 3.3E-08 69.0 14.0 83 325-435 90-174 (455)
22 PRK10942 serine endoprotease; 97.5 0.0036 7.7E-08 66.5 16.1 83 325-435 111-195 (473)
23 TIGR02038 protease_degS peripl 97.5 0.0028 6.1E-08 64.8 14.7 75 42-136 53-135 (351)
24 PRK10898 serine endoprotease; 97.3 0.0064 1.4E-07 62.2 14.9 73 44-136 55-135 (353)
25 PRK10139 serine endoprotease; 97.3 0.003 6.5E-08 66.7 12.5 107 60-196 90-232 (455)
26 PF13365 Trypsin_2: Trypsin-li 97.3 0.00082 1.8E-08 56.9 6.8 20 62-81 1-21 (120)
27 PF13365 Trypsin_2: Trypsin-li 97.2 0.00097 2.1E-08 56.5 6.8 21 327-347 1-22 (120)
28 PRK10942 serine endoprotease; 97.1 0.005 1.1E-07 65.4 11.8 56 60-135 111-168 (473)
29 PF02395 Peptidase_S6: Immunog 92.1 0.83 1.8E-05 51.2 9.9 30 170-200 216-245 (769)
30 PF00548 Peptidase_C3: 3C cyst 89.2 3.4 7.5E-05 37.4 9.6 70 324-416 24-93 (172)
31 PF02395 Peptidase_S6: Immunog 85.1 0.81 1.8E-05 51.3 3.7 55 491-547 212-268 (769)
32 COG0265 DegQ Trypsin-like seri 72.4 64 0.0014 32.9 12.7 58 325-406 72-130 (347)
33 PF00863 Peptidase_C4: Peptida 70.0 1E+02 0.0022 29.5 12.2 49 491-546 147-196 (235)
34 PF00947 Pico_P2A: Picornaviru 60.8 6.5 0.00014 33.2 2.2 35 494-539 89-123 (127)
35 PF05580 Peptidase_S55: SpoIVB 55.7 13 0.00028 34.6 3.4 26 490-522 175-200 (218)
36 PF00548 Peptidase_C3: 3C cyst 47.3 1.7E+02 0.0037 26.4 9.3 71 58-147 23-93 (172)
37 PF00863 Peptidase_C4: Peptida 44.6 2.6E+02 0.0057 26.7 10.3 17 66-82 37-53 (235)
38 PF05416 Peptidase_C37: Southa 38.6 2.1E+02 0.0045 29.7 8.9 31 489-522 497-527 (535)
39 COG0265 DegQ Trypsin-like seri 26.6 6.8E+02 0.015 25.3 11.9 29 60-88 72-101 (347)
40 PF10459 Peptidase_S46: Peptid 26.1 42 0.00091 37.6 2.1 20 61-80 48-68 (698)
41 PF10459 Peptidase_S46: Peptid 24.4 46 0.00099 37.3 2.0 21 326-346 48-69 (698)
42 PF08192 Peptidase_S64: Peptid 23.8 5.5E+02 0.012 28.5 9.7 57 490-547 634-690 (695)
43 PF05579 Peptidase_S32: Equine 22.2 56 0.0012 31.5 1.8 21 494-520 207-227 (297)
44 TIGR02860 spore_IV_B stage IV 22.1 67 0.0015 33.3 2.5 44 489-545 354-398 (402)
45 PF02907 Peptidase_S29: Hepati 20.8 73 0.0016 27.3 2.0 21 493-519 106-126 (148)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=1.7e-40 Score=320.39 Aligned_cols=229 Identities=35% Similarity=0.662 Sum_probs=193.0
Q ss_pred ecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCCC
Q psy10089 288 ITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRT 367 (559)
Q Consensus 288 i~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~ 367 (559)
|+||. ++..++|||+|.|+... ..+.|+|+||+++||||||||+.......+.|++|........
T Consensus 1 i~~G~----~~~~~~~Pw~v~i~~~~---------~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~-- 65 (232)
T cd00190 1 IVGGS----EAKIGSFPWQVSLQYTG---------GRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNE-- 65 (232)
T ss_pred CcCCe----ECCCCCCCCEEEEEccC---------CcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCC--
Confidence 45665 88899999999998731 1489999999999999999999875567889999987665421
Q ss_pred CCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCC-CCCCCceEEEEccCCCCCCCCCccc
Q psy10089 368 DREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSA-EEYDDQNCIVTGWGKDKFGVEGRYQ 446 (559)
Q Consensus 368 ~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~-~~~~~~~~~~~GwG~~~~~~~~~~~ 446 (559)
...+.+.|.++++||+|+.....+|||||+|++|+.++.+++|||||... ....+..+.++|||.... ....+
T Consensus 66 ----~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~--~~~~~ 139 (232)
T cd00190 66 ----GGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSE--GGPLP 139 (232)
T ss_pred ----CceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCC--CCCCC
Confidence 13567899999999999988889999999999999999999999999775 334568999999999863 23567
Q ss_pred ccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCC-CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCC
Q psy10089 447 STLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGP-NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCG 525 (559)
Q Consensus 447 ~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~-~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~ 525 (559)
..++...+.+++...|...+.. ...+.+.++|+.... ..+.|.|||||||++.. +++++|+||+|+|..|.
T Consensus 140 ~~~~~~~~~~~~~~~C~~~~~~-----~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~---~~~~~lvGI~s~g~~c~ 211 (232)
T cd00190 140 DVLQEVNVPIVSNAECKRAYSY-----GGTITDNMLCAGGLEGGKDACQGDSGGPLVCND---NGRGVLVGIVSWGSGCA 211 (232)
T ss_pred ceeeEEEeeeECHHHhhhhccC-----cccCCCceEeeCCCCCCCccccCCCCCcEEEEe---CCEEEEEEEEehhhccC
Confidence 7899999999999999987653 124789999998543 78899999999999987 68899999999999998
Q ss_pred CCC-CeeeEeccccHHHHHhh
Q psy10089 526 SDT-PGVYVDVRKFKKWILDN 545 (559)
Q Consensus 526 ~~~-p~vyt~V~~y~~WI~~~ 545 (559)
... |++|++|+.|++||+++
T Consensus 212 ~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 212 RPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred CCCCCCEEEEcHHhhHHhhcC
Confidence 744 99999999999999864
No 2
>KOG3627|consensus
Probab=100.00 E-value=1.1e-39 Score=319.92 Aligned_cols=238 Identities=37% Similarity=0.699 Sum_probs=193.9
Q ss_pred ceEecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCcc-CccceEEEccceeccc
Q psy10089 285 NCVITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNI-PVTDIKVRGGEWDTIT 363 (559)
Q Consensus 285 ~~ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~-~~~~~~V~~g~~~~~~ 363 (559)
..||+||. ++..++|||++.++.... ..+.|+|+||+++||||||||+... .. .+.|++|.+....
T Consensus 10 ~~~i~~g~----~~~~~~~Pw~~~l~~~~~--------~~~~Cggsli~~~~vltaaHC~~~~~~~-~~~V~~G~~~~~~ 76 (256)
T KOG3627|consen 10 EGRIVGGT----EAEPGSFPWQVSLQYGGN--------GRHLCGGSLISPRWVLTAAHCVKGASAS-LYTVRLGEHDINL 76 (256)
T ss_pred cCCEeCCc----cCCCCCCCCEEEEEECCC--------cceeeeeEEeeCCEEEEChhhCCCCCCc-ceEEEECcccccc
Confidence 35899997 888999999999998321 1469999999999999999999863 22 7888889875554
Q ss_pred cCCCCCCCCccceeeeeEEEeCCCCCCCCCC-CceEEEEeCCCCCCCCCceecccCCCCC---CCCCceEEEEccCCCCC
Q psy10089 364 NNRTDREPFPYQERTVSQIYIHENFEAKTVF-NDIALIILDFPFPVKNHIGLACTPNSAE---EYDDQNCIVTGWGKDKF 439 (559)
Q Consensus 364 ~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~-~DIALl~L~~p~~~~~~v~picLp~~~~---~~~~~~~~~~GwG~~~~ 439 (559)
...... ..+...|.++++||+|+..... ||||||+|++++.|+++|+|||||.... ...+..|.++|||.+..
T Consensus 77 ~~~~~~---~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~ 153 (256)
T KOG3627|consen 77 SVSEGE---EQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTES 153 (256)
T ss_pred ccccCc---hhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCC
Confidence 421100 1244557889999999998877 9999999999999999999999995543 33448999999999864
Q ss_pred CCCCcccccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeC-CCCCCCCCCCCCcccEEeecCCCCcEEEEEEE
Q psy10089 440 GVEGRYQSTLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASG-GPNQDACKGDGGGPLVCQLKNERDRFTQVGIV 518 (559)
Q Consensus 440 ~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~-~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~ 518 (559)
+ ....+..|+++.+++++.++|...+.... .+.+.||||+. ....++|+|||||||++.. .++++|+||+
T Consensus 154 ~-~~~~~~~L~~~~v~i~~~~~C~~~~~~~~-----~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~---~~~~~~~Giv 224 (256)
T KOG3627|consen 154 G-GGPLPDTLQEVDVPIISNSECRRAYGGLG-----TITDTMLCAGGPEGGKDACQGDSGGPLVCED---NGRWVLVGIV 224 (256)
T ss_pred C-CCCCCceeEEEEEeEcChhHhcccccCcc-----ccCCCEEeeCccCCCCccccCCCCCeEEEee---CCcEEEEEEE
Confidence 2 23567899999999999999998875321 36677999995 6778899999999999986 3489999999
Q ss_pred EeCCC-CCCCC-CeeeEeccccHHHHHhhcC
Q psy10089 519 SWGIG-CGSDT-PGVYVDVRKFKKWILDNSH 547 (559)
Q Consensus 519 S~g~~-C~~~~-p~vyt~V~~y~~WI~~~i~ 547 (559)
|||.. |+... |++||+|+.|.+||++.+.
T Consensus 225 S~G~~~C~~~~~P~vyt~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 225 SWGSGGCGQPNYPGVYTRVSSYLDWIKENIG 255 (256)
T ss_pred EecCCCCCCCCCCeEEeEhHHhHHHHHHHhc
Confidence 99988 99875 9999999999999999875
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=1e-37 Score=300.76 Aligned_cols=226 Identities=37% Similarity=0.703 Sum_probs=189.8
Q ss_pred EecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCC
Q psy10089 287 VITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNR 366 (559)
Q Consensus 287 ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~ 366 (559)
||+||. ++..++|||+|.++... ..+.|+|+||++++|||||||+.......+.|++|.+......
T Consensus 1 ~~~~G~----~~~~~~~Pw~~~i~~~~---------~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~- 66 (229)
T smart00020 1 RIVGGS----EANIGSFPWQVSLQYRG---------GRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGE- 66 (229)
T ss_pred CccCCC----cCCCCCCCcEEEEEEcC---------CCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCC-
Confidence 467886 88999999999998732 2478999999999999999999875556889999987654432
Q ss_pred CCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCC-CCCCCceEEEEccCCCCCCCCCcc
Q psy10089 367 TDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSA-EEYDDQNCIVTGWGKDKFGVEGRY 445 (559)
Q Consensus 367 ~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~-~~~~~~~~~~~GwG~~~~~~~~~~ 445 (559)
..+.+.|.+++.||+|+.....+|||||+|++|+.+++.++|+|||... ....+..+.++|||.... ..+..
T Consensus 67 ------~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~-~~~~~ 139 (229)
T smart00020 67 ------EGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSE-GAGSL 139 (229)
T ss_pred ------CceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCC-CCCcC
Confidence 1267899999999999988889999999999999999999999999763 334568999999998764 33455
Q ss_pred cccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCC-CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCC
Q psy10089 446 QSTLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGP-NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGC 524 (559)
Q Consensus 446 ~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~-~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C 524 (559)
...++...+.+++.+.|...+... ..+...++|++... ..+.|.|||||||++.. + +|+|+||+|+|..|
T Consensus 140 ~~~~~~~~~~~~~~~~C~~~~~~~-----~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~---~-~~~l~Gi~s~g~~C 210 (229)
T smart00020 140 PDTLQEVNVPIVSNATCRRAYSGG-----GAITDNMLCAGGLEGGKDACQGDSGGPLVCND---G-RWVLVGIVSWGSGC 210 (229)
T ss_pred CCEeeEEEEEEeCHHHhhhhhccc-----cccCCCcEeecCCCCCCcccCCCCCCeeEEEC---C-CEEEEEEEEECCCC
Confidence 678999999999999999876431 24788999998544 78899999999999976 4 99999999999999
Q ss_pred CCCC-CeeeEeccccHHHH
Q psy10089 525 GSDT-PGVYVDVRKFKKWI 542 (559)
Q Consensus 525 ~~~~-p~vyt~V~~y~~WI 542 (559)
.... |.+|+||++|++||
T Consensus 211 ~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 211 ARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred CCCCCCCEEEEeccccccC
Confidence 8544 99999999999998
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00 E-value=3.3e-35 Score=281.08 Aligned_cols=218 Identities=41% Similarity=0.762 Sum_probs=182.5
Q ss_pred ecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCCC
Q psy10089 288 ITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRT 367 (559)
Q Consensus 288 i~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~ 367 (559)
|.||. ++..++|||+|.++.. . ..++|+|+||+++||||||||+.. ...+.+.+|........
T Consensus 1 i~~g~----~~~~~~~p~~v~i~~~--~-------~~~~C~G~li~~~~vLTaahC~~~--~~~~~v~~g~~~~~~~~-- 63 (220)
T PF00089_consen 1 IVGGD----PASPGEFPWVVSIRYS--N-------GRFFCTGTLISPRWVLTAAHCVDG--ASDIKVRLGTYSIRNSD-- 63 (220)
T ss_dssp SBSSE----ECGTTSSTTEEEEEET--T-------TEEEEEEEEEETTEEEEEGGGHTS--GGSEEEEESESBTTSTT--
T ss_pred CCCCE----ECCCCCCCeEEEEeeC--C-------CCeeEeEEeccccccccccccccc--ccccccccccccccccc--
Confidence 56775 8999999999999983 2 058999999999999999999986 56788888873333222
Q ss_pred CCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCCC-CCCCceEEEEccCCCCCCCCCccc
Q psy10089 368 DREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSAE-EYDDQNCIVTGWGKDKFGVEGRYQ 446 (559)
Q Consensus 368 ~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~-~~~~~~~~~~GwG~~~~~~~~~~~ 446 (559)
...+.+.|++++.||+|+.....+|||||+|++|+.+.+.++|+||+.... ...+..+.++|||.... .+ ..
T Consensus 64 ----~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~--~~-~~ 136 (220)
T PF00089_consen 64 ----GSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD--NG-YS 136 (220)
T ss_dssp ----TTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST--TS-BT
T ss_pred ----ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc--cc-cc
Confidence 124689999999999999988899999999999999999999999998433 24568999999998753 22 56
Q ss_pred ccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCC
Q psy10089 447 STLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGS 526 (559)
Q Consensus 447 ~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~ 526 (559)
..++...+.+++.+.|...+.. .+.+.++|+......+.|.|||||||++.. . +|+||++++..|..
T Consensus 137 ~~~~~~~~~~~~~~~c~~~~~~-------~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~---~---~lvGI~s~~~~c~~ 203 (220)
T PF00089_consen 137 SNLQSVTVPVVSRKTCRSSYND-------NLTPNMICAGSSGSGDACQGDSGGPLICNN---N---YLVGIVSFGENCGS 203 (220)
T ss_dssp SBEEEEEEEEEEHHHHHHHTTT-------TSTTTEEEEETTSSSBGGTTTTTSEEEETT---E---EEEEEEEEESSSSB
T ss_pred cccccccccccccccccccccc-------ccccccccccccccccccccccccccccce---e---eecceeeecCCCCC
Confidence 7899999999999999987442 278899999954668999999999999865 1 79999999999998
Q ss_pred CC-CeeeEeccccHHHH
Q psy10089 527 DT-PGVYVDVRKFKKWI 542 (559)
Q Consensus 527 ~~-p~vyt~V~~y~~WI 542 (559)
.. |.+|+||+.|++||
T Consensus 204 ~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 204 PNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp TTSEEEEEEGGGGHHHH
T ss_pred CCcCEEEEEHHHhhccC
Confidence 76 89999999999999
No 5
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.98 E-value=6.7e-32 Score=260.31 Aligned_cols=177 Identities=38% Similarity=0.716 Sum_probs=149.1
Q ss_pred ecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC--ceeeEEeeeEEEcccccccccceeEEee
Q psy10089 33 PISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD--VTYSVAAGEWFINGIVEEELEEEQRRDV 110 (559)
Q Consensus 33 iigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~--~~~~v~~g~~~~~~~~~~~~~~~~~~~v 110 (559)
|+||+++.+++|||+|+|+... ..+.|+|+||+++||||||||+... ..+.|++|...... .....+.+.|
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~---~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~----~~~~~~~~~v 73 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTG---GRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSS----NEGGGQVIKV 73 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccC---CcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccC----CCCceEEEEE
Confidence 7899999999999999998753 3478999999999999999999764 56788888765531 1135678899
Q ss_pred EEEEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCC-CCCCCCcEEEEeecCCC--------------------
Q psy10089 111 LDVRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWN-VTYDSENCVITGWGRDS-------------------- 169 (559)
Q Consensus 111 ~~i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~-~~~~~~~~~~~GwG~~~-------------------- 169 (559)
.++++||.|+.....+|||||||++|+.++.+++|||||... ....+..+.++|||...
T Consensus 74 ~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~ 153 (232)
T cd00190 74 KKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNA 153 (232)
T ss_pred EEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHH
Confidence 999999999998889999999999999999999999999864 23356789999998642
Q ss_pred ------------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCCCCC-CCCeeeEEeeeccCccc
Q psy10089 170 ------------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCTP-DMPGLYDVTYSVAAGEW 218 (559)
Q Consensus 170 ------------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~~-~~p~vy~~v~~~~~~~W 218 (559)
+|+|+||++... +.++|+||+|++..|.. +.|++|+++.. +.+|
T Consensus 154 ~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~---~~~~lvGI~s~g~~c~~~~~~~~~t~v~~--~~~W 228 (232)
T cd00190 154 ECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDN---GRGVLVGIVSWGSGCARPNYPGVYTRVSS--YLDW 228 (232)
T ss_pred HhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeC---CEEEEEEEEehhhccCCCCCCCEEEEcHH--hhHH
Confidence 689999999643 78999999999999984 89999999877 6799
Q ss_pred eee
Q psy10089 219 FIN 221 (559)
Q Consensus 219 i~~ 221 (559)
|.+
T Consensus 229 I~~ 231 (232)
T cd00190 229 IQK 231 (232)
T ss_pred hhc
Confidence 854
No 6
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.98 E-value=9.7e-32 Score=255.01 Aligned_cols=248 Identities=26% Similarity=0.375 Sum_probs=173.5
Q ss_pred CceEecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccceEEEccceeccc
Q psy10089 284 ENCVITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTIT 363 (559)
Q Consensus 284 ~~~ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~ 363 (559)
-..||+||. .+..++||++|++..+ ..+. ...-+|||+++..|||||||||+....+-...+..+..++..
T Consensus 29 vs~rIigGs----~Anag~~P~~VaLv~~--isd~---~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d 99 (413)
T COG5640 29 VSSRIIGGS----NANAGEYPSLVALVDR--ISDY---VSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLND 99 (413)
T ss_pred cceeEecCc----ccccccCchHHHHHhh--cccc---cceeEeccceecceEEeeehhhccCCCCccccceEEEecccc
Confidence 456999998 9999999999999863 2111 113589999999999999999998744322223333333333
Q ss_pred cCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeCCCCCCC-CCceecccCCC--CCCCCCceEEEEccCCCCCC
Q psy10089 364 NNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILDFPFPVK-NHIGLACTPNS--AEEYDDQNCIVTGWGKDKFG 440 (559)
Q Consensus 364 ~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~-~~v~picLp~~--~~~~~~~~~~~~GwG~~~~~ 440 (559)
.. ..+...|++++.|..|.+.++.||||+++|.++...- ..+.-.--+.. ............+|+.+...
T Consensus 100 ~S-------q~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~ 172 (413)
T COG5640 100 SS-------QAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPS 172 (413)
T ss_pred cc-------cccCcceEEEeeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeec
Confidence 22 2457889999999999999999999999999876542 11111111110 01111244556677766543
Q ss_pred CCCcc-c--ccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccEEeecCCCCcEEEEEE
Q psy10089 441 VEGRY-Q--STLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLVCQLKNERDRFTQVGI 517 (559)
Q Consensus 441 ~~~~~-~--~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI 517 (559)
..... + ..|++..+..++...|...+....... ....-.-+|++.. .+++|+||||||++... ++...++||
T Consensus 173 ~v~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~d-g~~~lT~~cag~~-~~daCqGDSGGPi~~~g---~~G~vQ~GV 247 (413)
T COG5640 173 DVPRSSPKGTILHEVAVLFVPLSTCAQYKGCANASD-GATGLTGFCAGRP-PKDACQGDSGGPIFHKG---EEGRVQRGV 247 (413)
T ss_pred CCCCCCCccceeeeeeeeeechHHhhhhccccccCC-CCCCccceecCCC-CcccccCCCCCceEEeC---CCccEEEeE
Confidence 22222 2 479999999999999998875221111 1122223999844 39999999999999876 455689999
Q ss_pred EEeCCC-CCCCC-CeeeEeccccHHHHHhhcCCCCCC
Q psy10089 518 VSWGIG-CGSDT-PGVYVDVRKFKKWILDNSHGKIID 552 (559)
Q Consensus 518 ~S~g~~-C~~~~-p~vyt~V~~y~~WI~~~i~~~~~~ 552 (559)
+|||.+ |+.+. |+|||+|+.|.+||...|+.....
T Consensus 248 vSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~ 284 (413)
T COG5640 248 VSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNGLSYL 284 (413)
T ss_pred EEecCCCCCCCCcceeEEehhHHHHHHHHHhcCCCcc
Confidence 999986 99988 999999999999999998875543
No 7
>KOG3627|consensus
Probab=99.97 E-value=9e-31 Score=256.83 Aligned_cols=186 Identities=41% Similarity=0.737 Sum_probs=151.2
Q ss_pred CcceecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccCc--eeeEEeeeEEEccccccccccee
Q psy10089 29 DYIEPISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYDV--TYSVAAGEWFINGIVEEELEEEQ 106 (559)
Q Consensus 29 ~~~riigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~~--~~~v~~g~~~~~~~~~~~~~~~~ 106 (559)
...||+||.+|.++++||+|+|++... ..|.|||+||+++||||||||+.... .+.|++|.+......... ...+
T Consensus 9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~--~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~-~~~~ 85 (256)
T KOG3627|consen 9 PEGRIVGGTEAEPGSFPWQVSLQYGGN--GRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEG-EEQL 85 (256)
T ss_pred ccCCEeCCccCCCCCCCCEEEEEECCC--cceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccC-chhh
Confidence 357899999999999999999987653 23789999999999999999998765 788888876554221110 0124
Q ss_pred EEeeEEEEECCCCCCCCcC-CceEEEeecCccccCCCcceeccCCCCC---CCCCCcEEEEeecCCC-------------
Q psy10089 107 RRDVLDVRIHPNYSTETLE-NNIALLKLSSNIDFDDYIHPICLPDWNV---TYDSENCVITGWGRDS------------- 169 (559)
Q Consensus 107 ~~~v~~i~~hp~y~~~~~~-~Diall~L~~~v~~~~~v~picl~~~~~---~~~~~~~~~~GwG~~~------------- 169 (559)
...|.++++||+|+..... ||||||+|.+++.|+++|+|||||.... ...+..|.++|||.+.
T Consensus 86 ~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~ 165 (256)
T KOG3627|consen 86 VGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEV 165 (256)
T ss_pred hceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEE
Confidence 5557789999999998877 9999999999999999999999985443 3345789999999752
Q ss_pred --------------------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCC-CC-CCCCeeeEE
Q psy10089 170 --------------------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVV-CT-PDMPGLYDV 209 (559)
Q Consensus 170 --------------------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~-C~-~~~p~vy~~ 209 (559)
||+||||+|.... .++++||+|||.. |. .+.|++|++
T Consensus 166 ~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~---~~~~~GivS~G~~~C~~~~~P~vyt~ 242 (256)
T KOG3627|consen 166 DVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNG---RWVLVGIVSWGSGGCGQPNYPGVYTR 242 (256)
T ss_pred EEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCC---cEEEEEEEEecCCCCCCCCCCeEEeE
Confidence 7999999996532 7899999999987 98 569999999
Q ss_pred eeeccCccceeec
Q psy10089 210 TYSVAAGEWFING 222 (559)
Q Consensus 210 v~~~~~~~Wi~~~ 222 (559)
+.. +.+||.+.
T Consensus 243 V~~--y~~WI~~~ 253 (256)
T KOG3627|consen 243 VSS--YLDWIKEN 253 (256)
T ss_pred hHH--hHHHHHHH
Confidence 887 77998654
No 8
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.97 E-value=9.5e-30 Score=245.08 Aligned_cols=174 Identities=41% Similarity=0.762 Sum_probs=147.1
Q ss_pred eecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC--ceeeEEeeeEEEcccccccccceeEEe
Q psy10089 32 EPISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD--VTYSVAAGEWFINGIVEEELEEEQRRD 109 (559)
Q Consensus 32 riigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~--~~~~v~~g~~~~~~~~~~~~~~~~~~~ 109 (559)
||+||+++.+++|||+|.|+... ..+.|+||||++++|||||||+... ..+.|++|.+.... ....+.+.
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~---~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~-----~~~~~~~~ 72 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRG---GRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSS-----GEEGQVIK 72 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcC---CCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCC-----CCCceEEe
Confidence 69999999999999999998654 3478999999999999999999764 36788888765431 11227789
Q ss_pred eEEEEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCC-CCCCCCcEEEEeecCCC-------------------
Q psy10089 110 VLDVRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWN-VTYDSENCVITGWGRDS------------------- 169 (559)
Q Consensus 110 v~~i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~-~~~~~~~~~~~GwG~~~------------------- 169 (559)
|.++++||.|+.....+|||||+|++|+.+++.++|||||... ....+..+.++|||...
T Consensus 73 v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~ 152 (229)
T smart00020 73 VSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVS 152 (229)
T ss_pred eEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeC
Confidence 9999999999988889999999999999999999999999852 23356789999998752
Q ss_pred --------------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCCCC-CCCCeeeEEeeeccCc
Q psy10089 170 --------------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCT-PDMPGLYDVTYSVAAG 216 (559)
Q Consensus 170 --------------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~-~~~p~vy~~v~~~~~~ 216 (559)
+|+|+||++... .|+|+||++++..|. .+.|++|+++.. +.
T Consensus 153 ~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~----~~~l~Gi~s~g~~C~~~~~~~~~~~i~~--~~ 226 (229)
T smart00020 153 NATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDG----RWVLVGIVSWGSGCARPGKPGVYTRVSS--YL 226 (229)
T ss_pred HHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECC----CEEEEEEEEECCCCCCCCCCCEEEEecc--cc
Confidence 489999999642 799999999999998 789999999986 67
Q ss_pred cce
Q psy10089 217 EWF 219 (559)
Q Consensus 217 ~Wi 219 (559)
+||
T Consensus 227 ~WI 229 (229)
T smart00020 227 DWI 229 (229)
T ss_pred ccC
Confidence 996
No 9
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.95 E-value=4.1e-28 Score=231.95 Aligned_cols=172 Identities=40% Similarity=0.770 Sum_probs=143.4
Q ss_pred ecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEE
Q psy10089 33 PISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLD 112 (559)
Q Consensus 33 iigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~ 112 (559)
|+||.++.+++|||+|.|++... .++|+|+||+++||||||||+.....+.+.+|...+. ......+.+.|.+
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~---~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~----~~~~~~~~~~v~~ 73 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNG---RFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIR----NSDGSEQTIKVSK 73 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTT---EEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTT----STTTTSEEEEEEE
T ss_pred CCCCEECCCCCCCeEEEEeeCCC---CeeEeEEeccccccccccccccccccccccccccccc----ccccccccccccc
Confidence 78999999999999999987543 5889999999999999999998755677777762221 2223358899999
Q ss_pred EEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCCC-CCCCCcEEEEeecCCC----------------------
Q psy10089 113 VRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWNV-TYDSENCVITGWGRDS---------------------- 169 (559)
Q Consensus 113 i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~-~~~~~~~~~~GwG~~~---------------------- 169 (559)
++.||.|+.....+|||||+|++++.+.+.++|+||+.... ...+..+.+.|||...
T Consensus 74 ~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~ 153 (220)
T PF00089_consen 74 IIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCR 153 (220)
T ss_dssp EEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHH
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 99999999988899999999999999999999999998432 2356789999998742
Q ss_pred ------------------------CCCCCceEeecCCCCCcEEEEEEEEcCCCCC-CCCCeeeEEeeeccCccce
Q psy10089 170 ------------------------ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCT-PDMPGLYDVTYSVAAGEWF 219 (559)
Q Consensus 170 ------------------------~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~-~~~p~vy~~v~~~~~~~Wi 219 (559)
+|+|+||++... +|+||.+++..|. .+.|++|+++.. +.+||
T Consensus 154 ~~~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~------~lvGI~s~~~~c~~~~~~~v~~~v~~--~~~WI 220 (220)
T PF00089_consen 154 SSYNDNLTPNMICAGSSGSGDACQGDSGGPLICNNN------YLVGIVSFGENCGSPNYPGVYTRVSS--YLDWI 220 (220)
T ss_dssp HHTTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTE------EEEEEEEEESSSSBTTSEEEEEEGGG--GHHHH
T ss_pred ccccccccccccccccccccccccccccccccccee------eecceeeecCCCCCCCcCEEEEEHHH--hhccC
Confidence 689999999542 7999999999997 557999999987 66896
No 10
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.89 E-value=1.4e-23 Score=199.79 Aligned_cols=182 Identities=24% Similarity=0.392 Sum_probs=128.1
Q ss_pred CcceecCCccCCCCCCCeEEEEEEEecC-CceeeeEEEEEcCCEEEecccccccCce--eeEEeeeEEEcccccccccce
Q psy10089 29 DYIEPISGRNTYFGEFPWMLVLFYYKRN-MEYFKCGASLIGPNIALTAAHCVQYDVT--YSVAAGEWFINGIVEEELEEE 105 (559)
Q Consensus 29 ~~~riigG~~a~~~~~Pw~v~l~~~~~~-~~~~~CgGtLIs~~~VLTAAhC~~~~~~--~~v~~g~~~~~~~~~~~~~~~ 105 (559)
-.+|||||..|..++||++|+|.-+... ...-+|||+++..|||||||||+..... ..+..+...+ .+.+..
T Consensus 29 vs~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l-----~d~Sq~ 103 (413)
T COG5640 29 VSSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDL-----NDSSQA 103 (413)
T ss_pred cceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEecc-----cccccc
Confidence 3689999999999999999999755443 2235799999999999999999976542 2223333333 234667
Q ss_pred eEEeeEEEEECCCCCCCCcCCceEEEeecCccccCCCcceeccCCCCCC-------CCCCcEEEEeecCC----------
Q psy10089 106 QRRDVLDVRIHPNYSTETLENNIALLKLSSNIDFDDYIHPICLPDWNVT-------YDSENCVITGWGRD---------- 168 (559)
Q Consensus 106 ~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~-------~~~~~~~~~GwG~~---------- 168 (559)
+..+|++++.|..|.+.++.||||+++|.++...- .+.+-..+.. .........+||.+
T Consensus 104 ~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~p----r~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p 179 (413)
T COG5640 104 ERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLP----RVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSP 179 (413)
T ss_pred cCcceEEEeeecccccccccCcceeeccccccccc----hhheeeccCcccceecccccccccceeeeeeeecCCCCCCC
Confidence 88899999999999999999999999998854321 0011000000 00011122233221
Q ss_pred -----------------------------------------------CCCCCCceEeecCCCCCcEEEEEEEEcCCC-CC
Q psy10089 169 -----------------------------------------------SADGGGPLVCPSKEDPTTFFQVGIAAWSVV-CT 200 (559)
Q Consensus 169 -----------------------------------------------~~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~-C~ 200 (559)
++|+|||++....+ ...+.||+|||.+ |+
T Consensus 180 ~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~~daCqGDSGGPi~~~g~~---G~vQ~GVvSwG~~~Cg 256 (413)
T COG5640 180 KGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPPKDACQGDSGGPIFHKGEE---GRVQRGVVSWGDGGCG 256 (413)
T ss_pred ccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCCCcccccCCCCCceEEeCCC---ccEEEeEEEecCCCCC
Confidence 18999999986643 3468999999975 97
Q ss_pred -CCCCeeeEEeeeccCccceeeccc
Q psy10089 201 -PDMPGLYDVTYSVAAGEWFINGIV 224 (559)
Q Consensus 201 -~~~p~vy~~v~~~~~~~Wi~~~i~ 224 (559)
.+.|+|||++.. |.+||...+.
T Consensus 257 ~t~~~gVyT~vsn--y~~WI~a~~~ 279 (413)
T COG5640 257 GTLIPGVYTNVSN--YQDWIAAMTN 279 (413)
T ss_pred CCCcceeEEehhH--HHHHHHHHhc
Confidence 899999999988 7799876443
No 11
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.55 E-value=1.7e-13 Score=136.19 Aligned_cols=212 Identities=24% Similarity=0.389 Sum_probs=132.2
Q ss_pred EecCCCcCCCccccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCccce---------EEEcc
Q psy10089 287 VITGWGRDSAETFFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVTDI---------KVRGG 357 (559)
Q Consensus 287 ri~gg~~~~~~~~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~---------~V~~g 357 (559)
++.+|. .+...+.||.+.+...... .....++|+|||+||||||+||+..... .+ ...-+
T Consensus 41 ~~~~g~----~~~~~~~pW~v~v~~~~~~------~~~~~~~gtlIS~RHiLtss~~~~~~~~-~W~~~~~~~~~~C~~~ 109 (282)
T PF03761_consen 41 KVFNGT----PAESGEAPWAVSVYTKNHN------EGNYFSTGTLISPRHILTSSHCVMNDKS-KWLNGEEFDNKKCEGN 109 (282)
T ss_pred cccCCc----ccccCCCCCEEEEEeccCc------ccceecceEEeccCeEEEeeeEEEeccc-ccccCcccccceeeCC
Confidence 445554 7778899999999874222 1246689999999999999999974221 11 11111
Q ss_pred ceeccccCC----------CCCCCCccceeeeeEEEeCCCC----CCCCCCCceEEEEeCCCCCCCCCceecccCCCCCC
Q psy10089 358 EWDTITNNR----------TDREPFPYQERTVSQIYIHENF----EAKTVFNDIALIILDFPFPVKNHIGLACTPNSAEE 423 (559)
Q Consensus 358 ~~~~~~~~~----------~~~~~~~~~~~~V~~i~~Hp~y----~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~~ 423 (559)
...+..+.. ...........+|.++++--.- ......++++||+|+++ ++....|+|||.....
T Consensus 110 ~~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~ 187 (282)
T PF03761_consen 110 NNHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTN 187 (282)
T ss_pred CceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCccc
Confidence 000000000 0000011234566666663222 33345689999999999 7789999999976554
Q ss_pred C-CCceEEEEccCCCCCCCCCcccccceEEEEEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccE
Q psy10089 424 Y-DDQNCIVTGWGKDKFGVEGRYQSTLKKVEVKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLV 502 (559)
Q Consensus 424 ~-~~~~~~~~GwG~~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv 502 (559)
. .+..+.+.|+.. ...+....+.+.....|.. .+| .....|.||+||||+
T Consensus 188 ~~~~~~~~~yg~~~---------~~~~~~~~~~i~~~~~~~~----------------~~~----~~~~~~~~d~Gg~lv 238 (282)
T PF03761_consen 188 WEKGDEVDVYGFNS---------TGKLKHRKLKITNCTKCAY----------------SIC----TKQYSCKGDRGGPLV 238 (282)
T ss_pred cccCceEEEeecCC---------CCeEEEEEEEEEEeeccce----------------eEe----cccccCCCCccCeEE
Confidence 3 345666777611 2345566666655333211 122 246789999999999
Q ss_pred EeecCCCCcEEEEEEEEeCC-CCCCCCCeeeEeccccHHHHHh
Q psy10089 503 CQLKNERDRFTQVGIVSWGI-GCGSDTPGVYVDVRKFKKWILD 544 (559)
Q Consensus 503 ~~~~~~~~~~~l~GI~S~g~-~C~~~~p~vyt~V~~y~~WI~~ 544 (559)
... +++++|+||.+.+. .|... ...|.+|..|.+=|-+
T Consensus 239 ~~~---~gr~tlIGv~~~~~~~~~~~-~~~f~~v~~~~~~IC~ 277 (282)
T PF03761_consen 239 KNI---NGRWTLIGVGASGNYECNKN-NSYFFNVSWYQDEICE 277 (282)
T ss_pred EEE---CCCEEEEEEEccCCCccccc-ccEEEEHHHhhhhhcc
Confidence 887 89999999999775 34322 5788899888775543
No 12
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.36 E-value=1.4e-11 Score=122.46 Aligned_cols=174 Identities=20% Similarity=0.310 Sum_probs=111.3
Q ss_pred CCCCCCCCcceecCCccCCCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC-cee---------eEEee-
Q psy10089 22 AENTEEYDYIEPISGRNTYFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD-VTY---------SVAAG- 90 (559)
Q Consensus 22 ~~~~~~~~~~riigG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~-~~~---------~v~~g- 90 (559)
|++......+++.+|..+..++.||.|.+.........+.++|+|||+||||||+||+... ..+ ...-+
T Consensus 31 CG~~~~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~C~~~~ 110 (282)
T PF03761_consen 31 CGKKKLPYPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKKCEGNN 110 (282)
T ss_pred cCCCCCCCcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccceeeCCC
Confidence 4433344556689999999999999999988766555577899999999999999999632 111 01111
Q ss_pred -eEEEcc-----cc-----cccccceeEEeeEEEEECCCC----CCCCcCCceEEEeecCccccCCCcceeccCCCCCCC
Q psy10089 91 -EWFING-----IV-----EEELEEEQRRDVLDVRIHPNY----STETLENNIALLKLSSNIDFDDYIHPICLPDWNVTY 155 (559)
Q Consensus 91 -~~~~~~-----~~-----~~~~~~~~~~~v~~i~~hp~y----~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~~ 155 (559)
.+.+.. .. ..........+|.++++.-.- ......++++||+|+++ ++....|+|||......
T Consensus 111 ~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~ 188 (282)
T PF03761_consen 111 NHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTNW 188 (282)
T ss_pred ceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCcccc
Confidence 111100 00 000111223456666653222 23344579999999998 77889999999755432
Q ss_pred -CCCcEEEEeec---------------------------CCCCCCCCceEeecCCCCCcEEEEEEEEcCC-CCC
Q psy10089 156 -DSENCVITGWG---------------------------RDSADGGGPLVCPSKEDPTTFFQVGIAAWSV-VCT 200 (559)
Q Consensus 156 -~~~~~~~~GwG---------------------------~~~~d~G~pl~~~~~~~~~~~~l~Gi~s~~~-~C~ 200 (559)
.+....+.|+- ...+|+||||+... .++++++||.+.+. .|.
T Consensus 189 ~~~~~~~~yg~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~d~Gg~lv~~~---~gr~tlIGv~~~~~~~~~ 259 (282)
T PF03761_consen 189 EKGDEVDVYGFNSTGKLKHRKLKITNCTKCAYSICTKQYSCKGDRGGPLVKNI---NGRWTLIGVGASGNYECN 259 (282)
T ss_pred ccCceEEEeecCCCCeEEEEEEEEEEeeccceeEecccccCCCCccCeEEEEE---CCCEEEEEEEccCCCccc
Confidence 23334444441 11279999999843 47899999998765 444
No 13
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.34 E-value=7.4e-12 Score=114.39 Aligned_cols=115 Identities=18% Similarity=0.317 Sum_probs=87.7
Q ss_pred CcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCc--cceEEEccceeccccCCCCCCCCccceee
Q psy10089 301 GEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPV--TDIKVRGGEWDTITNNRTDREPFPYQERT 378 (559)
Q Consensus 301 ~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~--~~~~V~~g~~~~~~~~~~~~~~~~~~~~~ 378 (559)
.-|||+|.|+. .+ .+.|+|+||.++|||++-.|+.+... .-+.|.+|.......- ..+.+|.++
T Consensus 14 y~WPWlA~IYv--dG--------~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v----~Gp~EQI~r 79 (267)
T PF09342_consen 14 YHWPWLADIYV--DG--------RYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSV----DGPHEQISR 79 (267)
T ss_pred ccCcceeeEEE--cC--------eEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceeccc----CCChheEEE
Confidence 46999999998 44 69999999999999999999987544 4567888876544322 223467777
Q ss_pred eeEEEeCCCCCCCCCCCceEEEEeCCCCCCCCCceecccCCCCCC-CCCceEEEEccCC
Q psy10089 379 VSQIYIHENFEAKTVFNDIALIILDFPFPVKNHIGLACTPNSAEE-YDDQNCIVTGWGK 436 (559)
Q Consensus 379 V~~i~~Hp~y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~~-~~~~~~~~~GwG~ 436 (559)
|..+..-| ..+++||.|++|+.|+.+|+|..||..... .....|..+|--.
T Consensus 80 VD~~~~V~-------~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 80 VDCFKDVP-------ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeeeeecc-------ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 77765433 369999999999999999999999974332 2346899998543
No 14
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.27 E-value=2.7e-11 Score=110.77 Aligned_cols=111 Identities=24% Similarity=0.395 Sum_probs=81.9
Q ss_pred CCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccC----ceeeEEeeeEEEcccccccccceeEEeeEEEEECC
Q psy10089 42 GEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYD----VTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHP 117 (559)
Q Consensus 42 ~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~----~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp 117 (559)
-.|||.|.|+..+. +.|.|.||.+.|||++..|+.+- .-..+.+|...... .-..+.+|++.|..+..-|
T Consensus 14 y~WPWlA~IYvdG~----~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~--~v~Gp~EQI~rVD~~~~V~ 87 (267)
T PF09342_consen 14 YHWPWLADIYVDGR----YWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYL--SVDGPHEQISRVDCFKDVP 87 (267)
T ss_pred ccCcceeeEEEcCe----EEEEEEEeccceEEEeccccCCcccccceEEEEecCcceec--ccCCChheEEEeeeeeecc
Confidence 46999999998765 88999999999999999999652 22456666433221 1134567888887665443
Q ss_pred CCCCCCcCCceEEEeecCccccCCCcceeccCCCCCC-CCCCcEEEEee
Q psy10089 118 NYSTETLENNIALLKLSSNIDFDDYIHPICLPDWNVT-YDSENCVITGW 165 (559)
Q Consensus 118 ~y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~-~~~~~~~~~Gw 165 (559)
..+++||.|++|+.|+.+|+|..||..... .....|+..|-
T Consensus 88 -------~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~ 129 (267)
T PF09342_consen 88 -------ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGH 129 (267)
T ss_pred -------ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEc
Confidence 348999999999999999999999873322 23457888874
No 15
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.89 E-value=2.1e-08 Score=94.52 Aligned_cols=200 Identities=18% Similarity=0.219 Sum_probs=111.3
Q ss_pred ccCcCCceEEEeecccCCCCCCccceeEeeEEEEcCCEEEecccccCccCcc--ceEEEc-cceeccccCCCCCCCCccc
Q psy10089 299 FFGEYPWMMAILTNKINKDGSVTENVFQCGATLILPHVVMTAAHCVNNIPVT--DIKVRG-GEWDTITNNRTDREPFPYQ 375 (559)
Q Consensus 299 ~~~~~Pw~v~i~~~~~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~--~~~V~~-g~~~~~~~~~~~~~~~~~~ 375 (559)
....|||-+-.... .. ...+-|+++||+++.||||+||+...... .+.+.. |..... . +.-
T Consensus 45 dt~~~Py~av~~~~--~~-----tG~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~-------~--~~~ 108 (251)
T COG3591 45 DTTQFPYSAVVQFE--AA-----TGRLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDG-------G--PFY 108 (251)
T ss_pred cCCCCCcceeEEee--cC-----CCcceeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCC-------C--CCC
Confidence 45789999877652 21 12356777999999999999999863321 111211 111100 1 111
Q ss_pred eeeeeEEEeCCC--CCCCCCCCceEEEEeCCCCCCCCCceecccCCCCCCCCCceEEEEccCCCCCCCCCcccccceEEE
Q psy10089 376 ERTVSQIYIHEN--FEAKTVFNDIALIILDFPFPVKNHIGLACTPNSAEEYDDQNCIVTGWGKDKFGVEGRYQSTLKKVE 453 (559)
Q Consensus 376 ~~~V~~i~~Hp~--y~~~~~~~DIALl~L~~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG~~~~~~~~~~~~~L~~~~ 453 (559)
.+...++.+-|. |.......|+..+.|+....+.+.....-++.......+....++||-... ...+++.+
T Consensus 109 ~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk-------~~~~~~~e 181 (251)
T COG3591 109 GITKIEIRVYPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDK-------PNIGTMWE 181 (251)
T ss_pred ceeeEEEEecCCceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCC-------CcceeEee
Confidence 222222222333 334445677777888866666666666666655555555568999996443 11111111
Q ss_pred EEeecchhhhHHhhhcccCCeeccCCceEEEeCCCCCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCC-Ceee
Q psy10089 454 VKLVPRNVCQQQLRKTRLGGVFKLHDSFICASGGPNQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDT-PGVY 532 (559)
Q Consensus 454 v~i~~~~~C~~~~~~~~~~~~~~~~~~~lCa~~~~~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~-p~vy 532 (559)
.|.... .+.... ..-..++|.|+||+|++... + +++||.+-|..-.... ..-.
T Consensus 182 -------~t~~v~---------~~~~~~----l~y~~dT~pG~SGSpv~~~~---~---~vigv~~~g~~~~~~~~~n~~ 235 (251)
T COG3591 182 -------STGKVN---------SIKGNK----LFYDADTLPGSSGSPVLISK---D---EVIGVHYNGPGANGGSLANNA 235 (251)
T ss_pred -------ecceeE---------EEecce----EEEEecccCCCCCCceEecC---c---eEEEEEecCCCcccccccCcc
Confidence 111100 011110 11256899999999999754 2 8999999886422111 2333
Q ss_pred Eec-cccHHHHHhhcC
Q psy10089 533 VDV-RKFKKWILDNSH 547 (559)
Q Consensus 533 t~V-~~y~~WI~~~i~ 547 (559)
+|+ ..+++||++.++
T Consensus 236 vr~t~~~~~~I~~~~~ 251 (251)
T COG3591 236 VRLTPEILNFIQQNIK 251 (251)
T ss_pred eEecHHHHHHHHHhhC
Confidence 444 557899988764
No 16
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.26 E-value=7.9e-06 Score=77.32 Aligned_cols=147 Identities=18% Similarity=0.285 Sum_probs=82.9
Q ss_pred CCCCCCeEEEEEEEecCCceeeeEEEEEcCCEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeE--EEEECC
Q psy10089 40 YFGEFPWMLVLFYYKRNMEYFKCGASLIGPNIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVL--DVRIHP 117 (559)
Q Consensus 40 ~~~~~Pw~v~l~~~~~~~~~~~CgGtLIs~~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~--~i~~hp 117 (559)
....|||-+...+....+ .+-|+++||+++.||||+||+.....-....-. ...+.. ........+. .+.+.|
T Consensus 45 dt~~~Py~av~~~~~~tG-~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~-~p~g~~---~~~~~~~~~~~~~~~~~~ 119 (251)
T COG3591 45 DTTQFPYSAVVQFEAATG-RLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAA-APPGVN---SDGGPFYGITKIEIRVYP 119 (251)
T ss_pred cCCCCCcceeEEeecCCC-cceeeEEEEcCceEEEeeeEEecCCCChhhhhh-cCCccc---CCCCCCCceeeEEEEecC
Confidence 446899999886655433 346888999999999999999653210000000 001111 1122222222 222244
Q ss_pred C--CCCCCcCCceEEEeecCccccCCCcceeccCCCCCCCCCCcEEEEeecCCC--------------------------
Q psy10089 118 N--YSTETLENNIALLKLSSNIDFDDYIHPICLPDWNVTYDSENCVITGWGRDS-------------------------- 169 (559)
Q Consensus 118 ~--y~~~~~~~Diall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~GwG~~~-------------------------- 169 (559)
. |.......|+..+.|+....++..+...-++.......+....+.|+-.+.
T Consensus 120 g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~~~l~y~~d 199 (251)
T COG3591 120 GELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKGNKLFYDAD 199 (251)
T ss_pred CceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEecceEEEEec
Confidence 3 334445567777777755566666554444433333334446777764322
Q ss_pred ---CCCCCceEeecCCCCCcEEEEEEEEcCC
Q psy10089 170 ---ADGGGPLVCPSKEDPTTFFQVGIAAWSV 197 (559)
Q Consensus 170 ---~d~G~pl~~~~~~~~~~~~l~Gi~s~~~ 197 (559)
|+||+|++-... +++|+..-+.
T Consensus 200 T~pG~SGSpv~~~~~------~vigv~~~g~ 224 (251)
T COG3591 200 TLPGSSGSPVLISKD------EVIGVHYNGP 224 (251)
T ss_pred ccCCCCCCceEecCc------eEEEEEecCC
Confidence 789999987432 6888887664
No 17
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.04 E-value=8.8e-05 Score=78.17 Aligned_cols=84 Identities=15% Similarity=0.114 Sum_probs=58.1
Q ss_pred eEeeEEEEcCC-EEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089 325 FQCGATLILPH-VVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD 403 (559)
Q Consensus 325 ~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~ 403 (559)
..++|.+|+++ +|||++|++.+ ...+.|.+.. ...+..+-+..++ ..||||||++
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~~--~~~i~V~~~~---------------~~~~~a~vv~~d~-------~~DlAllkv~ 113 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVDG--ADEITVTLSD---------------GREFKAKLVGKDP-------RTDIAVLKID 113 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcCC--CCeEEEEeCC---------------CCEEEEEEEEecC-------CCCEEEEEec
Confidence 57999999986 99999999985 3455555432 1233433333333 3699999998
Q ss_pred CCCCCCCCceecccCCCCCCCCCceEEEEccCC
Q psy10089 404 FPFPVKNHIGLACTPNSAEEYDDQNCIVTGWGK 436 (559)
Q Consensus 404 ~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG~ 436 (559)
.+ ..+.++.|........++.++++|+..
T Consensus 114 ~~----~~~~~~~l~~~~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 114 AK----KNLPVIKLGDSDKLRVGDWVLAIGNPF 142 (428)
T ss_pred CC----CCceEEEccCCCCCCCCCEEEEEECCC
Confidence 65 345677776655556789999999853
No 18
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.74 E-value=0.00096 Score=68.19 Aligned_cols=83 Identities=8% Similarity=0.015 Sum_probs=53.3
Q ss_pred eEeeEEEEcCC-EEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089 325 FQCGATLILPH-VVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD 403 (559)
Q Consensus 325 ~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~ 403 (559)
...+|.+|+++ +|||++|.+.. ...+.|.+.. ...+..+-+..+| ..||||||++
T Consensus 78 ~~GSG~vi~~~G~IlTn~HVV~~--~~~i~V~~~d---------------g~~~~a~vv~~d~-------~~DlAvlkv~ 133 (351)
T TIGR02038 78 GLGSGVIMSKEGYILTNYHVIKK--ADQIVVALQD---------------GRKFEAELVGSDP-------LTDLAVLKIE 133 (351)
T ss_pred ceEEEEEEeCCeEEEecccEeCC--CCEEEEEECC---------------CCEEEEEEEEecC-------CCCEEEEEec
Confidence 46999999977 99999999975 3445555432 1223333333333 3699999998
Q ss_pred CCCCCCCCceecccCCCCCCCCCceEEEEccCC
Q psy10089 404 FPFPVKNHIGLACTPNSAEEYDDQNCIVTGWGK 436 (559)
Q Consensus 404 ~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG~ 436 (559)
.+- +.++.+-.......|+.+.++|+..
T Consensus 134 ~~~-----~~~~~l~~s~~~~~G~~V~aiG~P~ 161 (351)
T TIGR02038 134 GDN-----LPTIPVNLDRPPHVGDVVLAIGNPY 161 (351)
T ss_pred CCC-----CceEeccCcCccCCCCEEEEEeCCC
Confidence 642 2334443333445689999999853
No 19
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.73 E-value=0.00032 Score=73.90 Aligned_cols=82 Identities=20% Similarity=0.165 Sum_probs=51.0
Q ss_pred eeeEEEEEcCC-EEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecCccc
Q psy10089 60 FKCGASLIGPN-IALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSSNID 138 (559)
Q Consensus 60 ~~CgGtLIs~~-~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v~ 138 (559)
..++|.+|+++ +|||++|++.....+.|.+.. ...+..+-+..++. .||||||++.+
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~~~~~i~V~~~~-------------~~~~~a~vv~~d~~-------~DlAllkv~~~-- 115 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVDGADEITVTLSD-------------GREFKAKLVGKDPR-------TDIAVLKIDAK-- 115 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcCCCCeEEEEeCC-------------CCEEEEEEEEecCC-------CCEEEEEecCC--
Confidence 56999999986 999999999876655554321 12233333334443 49999999854
Q ss_pred cCCCcceeccCCCCCCCCCCcEEEEee
Q psy10089 139 FDDYIHPICLPDWNVTYDSENCVITGW 165 (559)
Q Consensus 139 ~~~~v~picl~~~~~~~~~~~~~~~Gw 165 (559)
..+.++.|........++.+.+.|+
T Consensus 116 --~~~~~~~l~~~~~~~~G~~v~aiG~ 140 (428)
T TIGR02037 116 --KNLPVIKLGDSDKLRVGDWVLAIGN 140 (428)
T ss_pred --CCceEEEccCCCCCCCCCEEEEEEC
Confidence 2345666654333334555555554
No 20
>PRK10898 serine endoprotease; Provisional
Probab=97.59 E-value=0.0032 Score=64.37 Aligned_cols=82 Identities=10% Similarity=0.064 Sum_probs=52.5
Q ss_pred eEeeEEEEcCC-EEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089 325 FQCGATLILPH-VVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD 403 (559)
Q Consensus 325 ~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~ 403 (559)
...+|.+|+++ +|||.+|-+.+ ...+.|.+.. ...+..+-+-..| .+||||||++
T Consensus 78 ~~GSGfvi~~~G~IlTn~HVv~~--a~~i~V~~~d---------------g~~~~a~vv~~d~-------~~DlAvl~v~ 133 (353)
T PRK10898 78 TLGSGVIMDQRGYILTNKHVIND--ADQIIVALQD---------------GRVFEALLVGSDS-------LTDLAVLKIN 133 (353)
T ss_pred ceeeEEEEeCCeEEEecccEeCC--CCEEEEEeCC---------------CCEEEEEEEEEcC-------CCCEEEEEEc
Confidence 57999999976 99999999974 3456665432 1223333333333 3699999997
Q ss_pred CCCCCCCCceecccCCCCCCCCCceEEEEccC
Q psy10089 404 FPFPVKNHIGLACTPNSAEEYDDQNCIVTGWG 435 (559)
Q Consensus 404 ~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG 435 (559)
.+ . ..++.|........++.++++|+.
T Consensus 134 ~~-~----l~~~~l~~~~~~~~G~~V~aiG~P 160 (353)
T PRK10898 134 AT-N----LPVIPINPKRVPHIGDVVLAIGNP 160 (353)
T ss_pred CC-C----CCeeeccCcCcCCCCCEEEEEeCC
Confidence 54 1 233344333334468889999985
No 21
>PRK10139 serine endoprotease; Provisional
Probab=97.52 E-value=0.0015 Score=68.96 Aligned_cols=83 Identities=18% Similarity=0.166 Sum_probs=56.1
Q ss_pred eEeeEEEEcC--CEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEe
Q psy10089 325 FQCGATLILP--HVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIIL 402 (559)
Q Consensus 325 ~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L 402 (559)
...+|.+|++ -+|||.+|.+.+ ...+.|.+.. ...+..+-+-..| ..||||||+
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~--a~~i~V~~~d---------------g~~~~a~vvg~D~-------~~DlAvlkv 145 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQ--AQKISIQLND---------------GREFDAKLIGSDD-------QSDIALLQI 145 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCC--CCEEEEEECC---------------CCEEEEEEEEEcC-------CCCEEEEEe
Confidence 4799999974 599999999985 4566676542 1233333333333 369999999
Q ss_pred CCCCCCCCCceecccCCCCCCCCCceEEEEccC
Q psy10089 403 DFPFPVKNHIGLACTPNSAEEYDDQNCIVTGWG 435 (559)
Q Consensus 403 ~~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG 435 (559)
+.+- ...++.|........|+.++++|+.
T Consensus 146 ~~~~----~l~~~~lg~s~~~~~G~~V~aiG~P 174 (455)
T PRK10139 146 QNPS----KLTQIAIADSDKLRVGDFAVAVGNP 174 (455)
T ss_pred cCCC----CCceeEecCccccCCCCEEEEEecC
Confidence 8652 3446667655555578999999874
No 22
>PRK10942 serine endoprotease; Provisional
Probab=97.47 E-value=0.0036 Score=66.53 Aligned_cols=83 Identities=23% Similarity=0.168 Sum_probs=55.0
Q ss_pred eEeeEEEEcC--CEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEe
Q psy10089 325 FQCGATLILP--HVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIIL 402 (559)
Q Consensus 325 ~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L 402 (559)
...+|.+|+. -+|||.+|.+.+ ...+.|.+.+ ...+..+-+-.+| ..||||||+
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~--a~~i~V~~~d---------------g~~~~a~vv~~D~-------~~DlAvlki 166 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDN--ATKIKVQLSD---------------GRKFDAKVVGKDP-------RSDIALIQL 166 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCC--CCEEEEEECC---------------CCEEEEEEEEecC-------CCCEEEEEe
Confidence 4799999985 499999999975 4556666542 1223333333333 369999999
Q ss_pred CCCCCCCCCceecccCCCCCCCCCceEEEEccC
Q psy10089 403 DFPFPVKNHIGLACTPNSAEEYDDQNCIVTGWG 435 (559)
Q Consensus 403 ~~p~~~~~~v~picLp~~~~~~~~~~~~~~GwG 435 (559)
+.+- ...++.|-.......++.++++|+.
T Consensus 167 ~~~~----~l~~~~lg~s~~l~~G~~V~aiG~P 195 (473)
T PRK10942 167 QNPK----NLTAIKMADSDALRVGDYTVAIGNP 195 (473)
T ss_pred cCCC----CCceeEecCccccCCCCEEEEEcCC
Confidence 7542 2345666555555678888888874
No 23
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.46 E-value=0.0028 Score=64.79 Aligned_cols=75 Identities=15% Similarity=0.147 Sum_probs=47.3
Q ss_pred CCCCeEEEEEEEecC-------CceeeeEEEEEcCC-EEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEE
Q psy10089 42 GEFPWMLVLFYYKRN-------MEYFKCGASLIGPN-IALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDV 113 (559)
Q Consensus 42 ~~~Pw~v~l~~~~~~-------~~~~~CgGtLIs~~-~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i 113 (559)
.--|-+|.|.-.... ......+|.+|+++ +|||++|.+.....+.|.+.. +..+..+-+
T Consensus 53 ~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~d-------------g~~~~a~vv 119 (351)
T TIGR02038 53 RAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQD-------------GRKFEAELV 119 (351)
T ss_pred hcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECC-------------CCEEEEEEE
Confidence 345889988754321 11245999999977 999999999776555544321 122333333
Q ss_pred EECCCCCCCCcCCceEEEeecCc
Q psy10089 114 RIHPNYSTETLENNIALLKLSSN 136 (559)
Q Consensus 114 ~~hp~y~~~~~~~Diall~L~~~ 136 (559)
..+|. .||||||++.+
T Consensus 120 ~~d~~-------~DlAvlkv~~~ 135 (351)
T TIGR02038 120 GSDPL-------TDLAVLKIEGD 135 (351)
T ss_pred EecCC-------CCEEEEEecCC
Confidence 34443 49999999753
No 24
>PRK10898 serine endoprotease; Provisional
Probab=97.29 E-value=0.0064 Score=62.18 Aligned_cols=73 Identities=16% Similarity=0.150 Sum_probs=45.9
Q ss_pred CCeEEEEEEEecCC-------ceeeeEEEEEcCC-EEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEE
Q psy10089 44 FPWMLVLFYYKRNM-------EYFKCGASLIGPN-IALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRI 115 (559)
Q Consensus 44 ~Pw~v~l~~~~~~~-------~~~~CgGtLIs~~-~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~ 115 (559)
-|-+|.|.-..... .....+|.+|+++ +|||+||=+.....+.|.+.+ +..+..+-+..
T Consensus 55 ~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~d-------------g~~~~a~vv~~ 121 (353)
T PRK10898 55 APAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQD-------------GRVFEALLVGS 121 (353)
T ss_pred CCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCC-------------CCEEEEEEEEE
Confidence 47888886543211 1246899999976 999999999766555554321 12233333334
Q ss_pred CCCCCCCCcCCceEEEeecCc
Q psy10089 116 HPNYSTETLENNIALLKLSSN 136 (559)
Q Consensus 116 hp~y~~~~~~~Diall~L~~~ 136 (559)
.|. .||||||++..
T Consensus 122 d~~-------~DlAvl~v~~~ 135 (353)
T PRK10898 122 DSL-------TDLAVLKINAT 135 (353)
T ss_pred cCC-------CCEEEEEEcCC
Confidence 443 49999999743
No 25
>PRK10139 serine endoprotease; Provisional
Probab=97.26 E-value=0.003 Score=66.73 Aligned_cols=107 Identities=20% Similarity=0.239 Sum_probs=64.8
Q ss_pred eeeEEEEEcC--CEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecCcc
Q psy10089 60 FKCGASLIGP--NIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSSNI 137 (559)
Q Consensus 60 ~~CgGtLIs~--~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v 137 (559)
-..+|.||++ -+|||.+|.+.+...+.|.+.+ ...+..+-+...|. .||||||++.+-
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~~~d-------------g~~~~a~vvg~D~~-------~DlAvlkv~~~~ 149 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQAQKISIQLND-------------GREFDAKLIGSDDQ-------SDIALLQIQNPS 149 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCCCCEEEEEECC-------------CCEEEEEEEEEcCC-------CCEEEEEecCCC
Confidence 3589999974 6999999999877766665421 12233333334433 499999997532
Q ss_pred ccCCCcceeccCCCCCCCCCCcEEEEee--c--------------CC------------------CCCCCCceEeecCCC
Q psy10089 138 DFDDYIHPICLPDWNVTYDSENCVITGW--G--------------RD------------------SADGGGPLVCPSKED 183 (559)
Q Consensus 138 ~~~~~v~picl~~~~~~~~~~~~~~~Gw--G--------------~~------------------~~d~G~pl~~~~~~~ 183 (559)
...++.|........++.+.+.|. | +. .|.|||||+-...
T Consensus 150 ----~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~~~~~~~iqtda~in~GnSGGpl~n~~G-- 223 (455)
T PRK10139 150 ----KLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNLEGLENFIQTDASINRGNSGGALLNLNG-- 223 (455)
T ss_pred ----CCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCCCCcceEEEECCccCCCCCcceEECCCC--
Confidence 234555543332223444444443 1 10 1789999986433
Q ss_pred CCcEEEEEEEEcC
Q psy10089 184 PTTFFQVGIAAWS 196 (559)
Q Consensus 184 ~~~~~l~Gi~s~~ 196 (559)
.++||.+..
T Consensus 224 ----~vIGi~~~~ 232 (455)
T PRK10139 224 ----ELIGINTAI 232 (455)
T ss_pred ----eEEEEEEEE
Confidence 388998764
No 26
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.26 E-value=0.00082 Score=56.92 Aligned_cols=20 Identities=50% Similarity=0.619 Sum_probs=18.6
Q ss_pred eEEEEEcCC-EEEeccccccc
Q psy10089 62 CGASLIGPN-IALTAAHCVQY 81 (559)
Q Consensus 62 CgGtLIs~~-~VLTAAhC~~~ 81 (559)
|+|.+|+++ +|||||||+..
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~ 21 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVED 21 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTC
T ss_pred CEEEEEcCCceEEEchhheec
Confidence 789999999 99999999974
No 27
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.21 E-value=0.00097 Score=56.47 Aligned_cols=21 Identities=38% Similarity=0.471 Sum_probs=19.3
Q ss_pred eeEEEEcCC-EEEecccccCcc
Q psy10089 327 CGATLILPH-VVMTAAHCVNNI 347 (559)
Q Consensus 327 C~GtLIs~~-~VLTAAhC~~~~ 347 (559)
|+|.+|+++ +|||+|||+...
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~ 22 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDW 22 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCC
T ss_pred CEEEEEcCCceEEEchhheecc
Confidence 789999999 999999999863
No 28
>PRK10942 serine endoprotease; Provisional
Probab=97.06 E-value=0.005 Score=65.41 Aligned_cols=56 Identities=21% Similarity=0.293 Sum_probs=37.4
Q ss_pred eeeEEEEEcC--CEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecC
Q psy10089 60 FKCGASLIGP--NIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSS 135 (559)
Q Consensus 60 ~~CgGtLIs~--~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~ 135 (559)
...+|.||+. -+|||.+|.+.+...+.|.+.+ ...+..+-+..+|. .||||||++.
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~a~~i~V~~~d-------------g~~~~a~vv~~D~~-------~DlAvlki~~ 168 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDNATKIKVQLSD-------------GRKFDAKVVGKDPR-------SDIALIQLQN 168 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCCCCEEEEEECC-------------CCEEEEEEEEecCC-------CCEEEEEecC
Confidence 3589999985 4999999999876666555321 11233333334443 4999999964
No 29
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=92.09 E-value=0.83 Score=51.21 Aligned_cols=30 Identities=20% Similarity=0.354 Sum_probs=22.2
Q ss_pred CCCCCceEeecCCCCCcEEEEEEEEcCCCCC
Q psy10089 170 ADGGGPLVCPSKEDPTTFFQVGIAAWSVVCT 200 (559)
Q Consensus 170 ~d~G~pl~~~~~~~~~~~~l~Gi~s~~~~C~ 200 (559)
||||+||+.-. ...++|+|+|+.+.+....
T Consensus 216 GDSGSPlF~YD-~~~kKWvl~Gv~~~~~~~~ 245 (769)
T PF02395_consen 216 GDSGSPLFAYD-KEKKKWVLVGVLSGGNGYN 245 (769)
T ss_dssp T-TT-EEEEEE-TTTTEEEEEEEEEEECCCC
T ss_pred CcCCCceEEEE-ccCCeEEEEEEEccccccC
Confidence 89999999865 3457999999999876543
No 30
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=89.19 E-value=3.4 Score=37.45 Aligned_cols=70 Identities=14% Similarity=0.039 Sum_probs=39.5
Q ss_pred eeEeeEEEEcCCEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089 324 VFQCGATLILPHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD 403 (559)
Q Consensus 324 ~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~ 403 (559)
.+.|.+..|..+|.|-..|.- ....+.++. ..+++.+.+.. .+......||++++|.
T Consensus 24 ~~t~l~~gi~~~~~lvp~H~~-----~~~~i~i~g----------------~~~~~~d~~~l--v~~~~~~~Dl~~v~l~ 80 (172)
T PF00548_consen 24 EFTMLALGIYDRYFLVPTHEE-----PEDTIYIDG----------------VEYKVDDSVVL--VDRDGVDTDLTLVKLP 80 (172)
T ss_dssp EEEEEEEEEEBTEEEEEGGGG-----GCSEEEETT----------------EEEEEEEEEEE--EETTSSEEEEEEEEEE
T ss_pred eEEEecceEeeeEEEEECcCC-----CcEEEEECC----------------EEEEeeeeEEE--ecCCCcceeEEEEEcc
Confidence 577889999999999999932 222333332 12222222111 1122235699999999
Q ss_pred CCCCCCCCceecc
Q psy10089 404 FPFPVKNHIGLAC 416 (559)
Q Consensus 404 ~p~~~~~~v~pic 416 (559)
+.-.|.+-.+-++
T Consensus 81 ~~~kfrDIrk~~~ 93 (172)
T PF00548_consen 81 RNPKFRDIRKFFP 93 (172)
T ss_dssp SSS-B--GGGGSB
T ss_pred CCcccCchhhhhc
Confidence 8888865444444
No 31
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=85.12 E-value=0.81 Score=51.28 Aligned_cols=55 Identities=20% Similarity=0.338 Sum_probs=32.6
Q ss_pred CCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeE--eccccHHHHHhhcC
Q psy10089 491 DACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYV--DVRKFKKWILDNSH 547 (559)
Q Consensus 491 ~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt--~V~~y~~WI~~~i~ 547 (559)
..=.||||+||+.-+ ....+|+|+|+++.+.+..... ..|+ +...+..|+++-..
T Consensus 212 ~~~~GDSGSPlF~YD-~~~kKWvl~Gv~~~~~~~~g~~-~~~~~~~~~f~~~~~~~d~~ 268 (769)
T PF02395_consen 212 YGSPGDSGSPLFAYD-KEKKKWVLVGVLSGGNGYNGKG-NWWNVIPPDFINQIKQNDTD 268 (769)
T ss_dssp B--TT-TT-EEEEEE-TTTTEEEEEEEEEEECCCCHSE-EEEEEECHHHHHHHHHHCCE
T ss_pred ccccCcCCCceEEEE-ccCCeEEEEEEEccccccCCcc-ceeEEecHHHHHHHHhhhcc
Confidence 345799999999765 3468999999999887654322 2332 33334456555433
No 32
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=72.35 E-value=64 Score=32.86 Aligned_cols=58 Identities=16% Similarity=0.121 Sum_probs=36.8
Q ss_pred eEeeEEEEc-CCEEEecccccCccCccceEEEccceeccccCCCCCCCCccceeeeeEEEeCCCCCCCCCCCceEEEEeC
Q psy10089 325 FQCGATLIL-PHVVMTAAHCVNNIPVTDIKVRGGEWDTITNNRTDREPFPYQERTVSQIYIHENFEAKTVFNDIALIILD 403 (559)
Q Consensus 325 ~~C~GtLIs-~~~VLTAAhC~~~~~~~~~~V~~g~~~~~~~~~~~~~~~~~~~~~V~~i~~Hp~y~~~~~~~DIALl~L~ 403 (559)
...+|.+++ ..+|+|-.|-+.. ...+.|.+.. ...+..+-+-. ....|+|+||.+
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~--a~~i~v~l~d---------------g~~~~a~~vg~-------d~~~dlavlki~ 127 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAG--AEEITVTLAD---------------GREVPAKLVGK-------DPISDLAVLKID 127 (347)
T ss_pred ccccEEEEcCCeEEEecceecCC--cceEEEEeCC---------------CCEEEEEEEec-------CCccCEEEEEec
Confidence 467888888 7799999998875 4555555411 12233333322 234699999998
Q ss_pred CCC
Q psy10089 404 FPF 406 (559)
Q Consensus 404 ~p~ 406 (559)
..-
T Consensus 128 ~~~ 130 (347)
T COG0265 128 GAG 130 (347)
T ss_pred cCC
Confidence 653
No 33
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=70.03 E-value=1e+02 Score=29.50 Aligned_cols=49 Identities=29% Similarity=0.395 Sum_probs=24.7
Q ss_pred CCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeEeccc-cHHHHHhhc
Q psy10089 491 DACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYVDVRK-FKKWILDNS 546 (559)
Q Consensus 491 ~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt~V~~-y~~WI~~~i 546 (559)
++=.||=|.|||... ++ .++||.|-+..-.. -.+|+.+.. +.+-+.+..
T Consensus 147 sTk~G~CG~PlVs~~---Dg--~IVGiHsl~~~~~~--~N~F~~f~~~f~~~~l~~~ 196 (235)
T PF00863_consen 147 STKDGDCGLPLVSTK---DG--KIVGIHSLTSNTSS--RNYFTPFPDDFEEFYLENI 196 (235)
T ss_dssp ---TT-TT-EEEETT---T----EEEEEEEEETTTS--SEEEEE--TTHHHHHCC-C
T ss_pred cCCCCccCCcEEEcC---CC--cEEEEEcCccCCCC--eEEEEcCCHHHHHHHhccc
Confidence 334588899999764 22 79999997642221 357777654 445444433
No 34
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=60.85 E-value=6.5 Score=33.19 Aligned_cols=35 Identities=34% Similarity=0.594 Sum_probs=26.1
Q ss_pred CCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeEeccccH
Q psy10089 494 KGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYVDVRKFK 539 (559)
Q Consensus 494 ~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt~V~~y~ 539 (559)
+||-||+|.|+. =++||++.|- .+-..|++|+.+.
T Consensus 89 PGdCGg~L~C~H-------GViGi~Tagg----~g~VaF~dir~~~ 123 (127)
T PF00947_consen 89 PGDCGGILRCKH-------GVIGIVTAGG----EGHVAFADIRDLL 123 (127)
T ss_dssp TT-TCSEEEETT-------CEEEEEEEEE----TTEEEEEECCCGS
T ss_pred CCCCCceeEeCC-------CeEEEEEeCC----CceEEEEechhhh
Confidence 699999999987 4899999872 1246688888763
No 35
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=55.70 E-value=13 Score=34.60 Aligned_cols=26 Identities=19% Similarity=0.355 Sum_probs=22.6
Q ss_pred CCCCCCCCCcccEEeecCCCCcEEEEEEEEeCC
Q psy10089 490 QDACKGDGGGPLVCQLKNERDRFTQVGIVSWGI 522 (559)
Q Consensus 490 ~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~ 522 (559)
.+.-||.||+|++... .|+|-++++.
T Consensus 175 GGIvqGMSGSPI~qdG-------KLiGAVthvf 200 (218)
T PF05580_consen 175 GGIVQGMSGSPIIQDG-------KLIGAVTHVF 200 (218)
T ss_pred CCEEecccCCCEEECC-------EEEEEEEEEE
Confidence 4678999999999866 8999999985
No 36
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=47.28 E-value=1.7e+02 Score=26.41 Aligned_cols=71 Identities=18% Similarity=0.075 Sum_probs=37.2
Q ss_pred ceeeeEEEEEcCCEEEecccccccCceeeEEeeeEEEcccccccccceeEEeeEEEEECCCCCCCCcCCceEEEeecCcc
Q psy10089 58 EYFKCGASLIGPNIALTAAHCVQYDVTYSVAAGEWFINGIVEEELEEEQRRDVLDVRIHPNYSTETLENNIALLKLSSNI 137 (559)
Q Consensus 58 ~~~~CgGtLIs~~~VLTAAhC~~~~~~~~v~~g~~~~~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~~~Diall~L~~~v 137 (559)
..+.|.+.-|..+|.|--.|.- . ...+.++... +++...+.. .+......||++++|.+.-
T Consensus 23 g~~t~l~~gi~~~~~lvp~H~~-~--~~~i~i~g~~--------------~~~~d~~~l--v~~~~~~~Dl~~v~l~~~~ 83 (172)
T PF00548_consen 23 GEFTMLALGIYDRYFLVPTHEE-P--EDTIYIDGVE--------------YKVDDSVVL--VDRDGVDTDLTLVKLPRNP 83 (172)
T ss_dssp EEEEEEEEEEEBTEEEEEGGGG-G--CSEEEETTEE--------------EEEEEEEEE--EETTSSEEEEEEEEEESSS
T ss_pred ceEEEecceEeeeEEEEECcCC-C--cEEEEECCEE--------------EEeeeeEEE--ecCCCcceeEEEEEccCCc
Confidence 3466778899999999999922 1 1222222111 111111111 1112224599999998877
Q ss_pred ccCCCcceec
Q psy10089 138 DFDDYIHPIC 147 (559)
Q Consensus 138 ~~~~~v~pic 147 (559)
.|.+-.+-++
T Consensus 84 kfrDIrk~~~ 93 (172)
T PF00548_consen 84 KFRDIRKFFP 93 (172)
T ss_dssp -B--GGGGSB
T ss_pred ccCchhhhhc
Confidence 7765544444
No 37
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=44.65 E-value=2.6e+02 Score=26.69 Aligned_cols=17 Identities=18% Similarity=0.120 Sum_probs=13.7
Q ss_pred EEcCCEEEecccccccC
Q psy10089 66 LIGPNIALTAAHCVQYD 82 (559)
Q Consensus 66 LIs~~~VLTAAhC~~~~ 82 (559)
|.--.|++|-+|-+..+
T Consensus 37 igyG~~iItn~HLf~~n 53 (235)
T PF00863_consen 37 IGYGSYIITNAHLFKRN 53 (235)
T ss_dssp EEETTEEEEEGGGGSST
T ss_pred EeECCEEEEChhhhccC
Confidence 45688999999999653
No 38
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=38.59 E-value=2.1e+02 Score=29.72 Aligned_cols=31 Identities=19% Similarity=0.372 Sum_probs=22.7
Q ss_pred CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCC
Q psy10089 489 NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGI 522 (559)
Q Consensus 489 ~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~ 522 (559)
+-++-+||-|-|-+... ++.|+++||..-..
T Consensus 497 DLGT~PGDCGcPYvyKr---gNd~VV~GVH~AAt 527 (535)
T PF05416_consen 497 DLGTIPGDCGCPYVYKR---GNDWVVIGVHAAAT 527 (535)
T ss_dssp TTS--TTGTT-EEEEEE---TTEEEEEEEEEEE-
T ss_pred ccCCCCCCCCCceeeec---CCcEEEEEEEehhc
Confidence 34567899999999987 78999999987543
No 39
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=26.63 E-value=6.8e+02 Score=25.28 Aligned_cols=29 Identities=14% Similarity=0.063 Sum_probs=21.5
Q ss_pred eeeEEEEEc-CCEEEecccccccCceeeEE
Q psy10089 60 FKCGASLIG-PNIALTAAHCVQYDVTYSVA 88 (559)
Q Consensus 60 ~~CgGtLIs-~~~VLTAAhC~~~~~~~~v~ 88 (559)
....|.+++ ..+|||-.|=+.......+.
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~ 101 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAGAEEITVT 101 (347)
T ss_pred ccccEEEEcCCeEEEecceecCCcceEEEE
Confidence 457888888 88999999988665444443
No 40
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=26.11 E-value=42 Score=37.62 Aligned_cols=20 Identities=35% Similarity=0.826 Sum_probs=18.1
Q ss_pred eeEEEEEcCC-EEEecccccc
Q psy10089 61 KCGASLIGPN-IALTAAHCVQ 80 (559)
Q Consensus 61 ~CgGtLIs~~-~VLTAAhC~~ 80 (559)
.|+|++||++ .|||=-||..
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred ceeEEEEcCCceEEecchhhh
Confidence 3999999988 9999999984
No 41
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=24.40 E-value=46 Score=37.31 Aligned_cols=21 Identities=29% Similarity=0.738 Sum_probs=18.7
Q ss_pred EeeEEEEcCC-EEEecccccCc
Q psy10089 326 QCGATLILPH-VVMTAAHCVNN 346 (559)
Q Consensus 326 ~C~GtLIs~~-~VLTAAhC~~~ 346 (559)
.|+|++||++ .|||=-||..+
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred ceeEEEEcCCceEEecchhhhh
Confidence 3999999988 99999999844
No 42
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=23.81 E-value=5.5e+02 Score=28.52 Aligned_cols=57 Identities=14% Similarity=0.272 Sum_probs=36.4
Q ss_pred CCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCCCCeeeEeccccHHHHHhhcC
Q psy10089 490 QDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSDTPGVYVDVRKFKKWILDNSH 547 (559)
Q Consensus 490 ~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~~p~vyt~V~~y~~WI~~~i~ 547 (559)
.-+-.||||+=++....+..-..-++|+.. .+.+.....++||-+...++=++++.+
T Consensus 634 ~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlh-sydge~kqfglftPi~~il~rl~~vT~ 690 (695)
T PF08192_consen 634 AFASGGDSGSWVLTKLEDNNKGLGVVGMLH-SYDGEQKQFGLFTPINEILDRLEEVTG 690 (695)
T ss_pred cccCCCCcccEEEecccccccCceeeEEee-ecCCccceeeccCcHHHHHHHHHHhhc
Confidence 344569999999876532222345677765 334444447899988888877776543
No 43
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=22.20 E-value=56 Score=31.54 Aligned_cols=21 Identities=29% Similarity=0.537 Sum_probs=15.1
Q ss_pred CCCCCcccEEeecCCCCcEEEEEEEEe
Q psy10089 494 KGDGGGPLVCQLKNERDRFTQVGIVSW 520 (559)
Q Consensus 494 ~GDsGgPLv~~~~~~~~~~~l~GI~S~ 520 (559)
.||||+|++..+ + .|+||.+-
T Consensus 207 ~GDSGSPVVt~d---g---~liGVHTG 227 (297)
T PF05579_consen 207 PGDSGSPVVTED---G---DLIGVHTG 227 (297)
T ss_dssp GGCTT-EEEETT---C----EEEEEEE
T ss_pred CCCCCCccCcCC---C---CEEEEEec
Confidence 489999999764 2 68999874
No 44
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=22.15 E-value=67 Score=33.31 Aligned_cols=44 Identities=20% Similarity=0.422 Sum_probs=31.1
Q ss_pred CCCCCCCCCCcccEEeecCCCCcEEEEEEEEeCCCCCCC-CCeeeEeccccHHHHHhh
Q psy10089 489 NQDACKGDGGGPLVCQLKNERDRFTQVGIVSWGIGCGSD-TPGVYVDVRKFKKWILDN 545 (559)
Q Consensus 489 ~~~~C~GDsGgPLv~~~~~~~~~~~l~GI~S~g~~C~~~-~p~vyt~V~~y~~WI~~~ 545 (559)
..+..||.||+|++... .|+|-+++-.--... ++++ |.+|+.+.
T Consensus 354 tgGivqGMSGSPi~q~g-------kliGAvtHVfvndpt~GYGi------~ie~Ml~~ 398 (402)
T TIGR02860 354 TGGIVQGMSGSPIIQNG-------KVIGAVTHVFVNDPTSGYGV------YIEWMLKE 398 (402)
T ss_pred hCCEEecccCCCEEECC-------EEEEEEEEEEecCCCcceee------hHHHHHHH
Confidence 35778999999999876 899999976422221 1444 56888765
No 45
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, and these are grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme.; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=20.82 E-value=73 Score=27.32 Aligned_cols=21 Identities=38% Similarity=0.868 Sum_probs=14.2
Q ss_pred CCCCCCcccEEeecCCCCcEEEEEEEE
Q psy10089 493 CKGDGGGPLVCQLKNERDRFTQVGIVS 519 (559)
Q Consensus 493 C~GDsGgPLv~~~~~~~~~~~l~GI~S 519 (559)
-.|-||||++|.. + ..+||..
T Consensus 106 lkGSSGgPiLC~~---G---H~vG~f~ 126 (148)
T PF02907_consen 106 LKGSSGGPILCPS---G---HAVGMFR 126 (148)
T ss_dssp HTT-TT-EEEETT---S---EEEEEEE
T ss_pred EecCCCCcccCCC---C---CEEEEEE
Confidence 3588999999975 2 6778865
Done!