Query psy16044
Match_columns 436
No_of_seqs 347 out of 2625
Neff 9.8
Searched_HMMs 46136
Date Fri Aug 16 19:26:10 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy16044.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/16044hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 5.6E-40 1.2E-44 297.7 22.1 226 27-266 1-231 (232)
2 KOG3627|consensus 100.0 1.6E-37 3.4E-42 286.1 21.4 235 23-268 9-254 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 6E-37 1.3E-41 277.5 23.3 223 26-264 1-229 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 1.9E-34 4E-39 259.4 18.8 215 27-264 1-220 (220)
5 COG5640 Secreted trypsin-like 100.0 2E-28 4.3E-33 218.6 13.0 238 23-270 29-280 (413)
6 KOG3627|consensus 99.9 3.4E-26 7.4E-31 210.6 15.5 158 271-430 94-255 (256)
7 cd00190 Tryp_SPc Trypsin-like 99.9 2.2E-24 4.7E-29 195.4 13.6 156 270-428 77-232 (232)
8 smart00020 Tryp_SPc Trypsin-li 99.9 9.1E-22 2E-26 178.0 13.2 151 271-425 78-229 (229)
9 PF00089 Trypsin: Trypsin; In 99.8 1.8E-20 3.8E-25 168.3 11.4 145 271-425 76-220 (220)
10 PF03761 DUF316: Domain of unk 99.7 2.4E-17 5.3E-22 153.6 14.6 227 2-260 19-271 (282)
11 COG5640 Secreted trypsin-like 99.6 8.8E-15 1.9E-19 131.7 13.7 160 270-432 112-281 (413)
12 PF09342 DUF1986: Domain of un 99.5 1.3E-12 2.8E-17 112.1 14.7 117 35-167 13-131 (267)
13 COG3591 V8-like Glu-specific e 98.7 2.5E-07 5.4E-12 82.0 11.5 177 33-246 44-225 (251)
14 PF03761 DUF316: Domain of unk 98.3 4E-06 8.6E-11 78.2 9.4 121 280-430 159-280 (282)
15 TIGR02037 degP_htrA_DO peripla 98.2 2.2E-05 4.8E-10 77.6 13.0 85 55-167 57-142 (428)
16 TIGR02038 protease_degS peripl 98.1 0.00012 2.7E-09 70.1 14.9 160 36-244 53-218 (351)
17 PRK10898 serine endoprotease; 97.9 0.00035 7.6E-09 67.0 15.5 103 36-167 53-161 (353)
18 PRK10139 serine endoprotease; 97.9 0.00014 3.1E-09 72.0 12.4 142 55-244 89-232 (455)
19 PRK10942 serine endoprotease; 97.8 0.0003 6.4E-09 70.1 13.0 84 55-166 110-195 (473)
20 PF13365 Trypsin_2: Trypsin-li 97.3 0.00042 9.1E-09 55.2 5.1 21 58-78 1-22 (120)
21 PF02395 Peptidase_S6: Immunog 96.2 0.031 6.8E-07 58.5 9.9 66 59-149 68-133 (769)
22 PF09342 DUF1986: Domain of un 93.7 0.17 3.8E-06 44.5 5.9 45 280-326 87-131 (267)
23 PF02395 Peptidase_S6: Immunog 90.2 0.53 1.1E-05 49.6 5.7 54 375-432 212-266 (769)
24 TIGR02037 degP_htrA_DO peripla 85.1 5.3 0.00012 39.6 9.2 40 280-326 103-142 (428)
25 COG0265 DegQ Trypsin-like seri 82.1 27 0.00058 33.5 12.4 144 56-246 72-216 (347)
26 COG3591 V8-like Glu-specific e 81.2 2.6 5.7E-05 37.9 4.6 52 375-430 199-251 (251)
27 PRK10898 serine endoprotease; 81.0 10 0.00022 36.5 9.0 40 279-326 122-161 (353)
28 PF00863 Peptidase_C4: Peptida 78.4 11 0.00024 33.7 7.6 137 61-246 36-174 (235)
29 TIGR02038 protease_degS peripl 78.3 7.9 0.00017 37.2 7.3 40 279-326 122-161 (351)
30 PF00947 Pico_P2A: Picornaviru 76.1 2.6 5.6E-05 33.3 2.7 34 378-421 89-122 (127)
31 PRK10139 serine endoprotease; 75.1 17 0.00036 36.4 8.8 39 280-325 136-174 (455)
32 PF05416 Peptidase_C37: Southa 69.1 15 0.00033 35.3 6.4 41 204-245 485-527 (535)
33 PF00548 Peptidase_C3: 3C cyst 60.0 5.2 0.00011 34.0 1.5 29 216-245 143-171 (172)
34 PF00947 Pico_P2A: Picornaviru 57.2 8 0.00017 30.7 2.0 23 219-246 89-111 (127)
35 PRK10942 serine endoprotease; 56.5 58 0.0013 32.8 8.5 39 280-325 157-195 (473)
36 PF10459 Peptidase_S46: Peptid 56.4 6.8 0.00015 41.1 1.9 22 57-78 48-70 (698)
37 PF05580 Peptidase_S55: SpoIVB 45.1 20 0.00043 31.4 2.7 25 375-404 176-200 (218)
38 PF05579 Peptidase_S32: Equine 32.6 39 0.00085 30.6 2.6 22 219-244 207-228 (297)
39 PF05580 Peptidase_S55: SpoIVB 30.8 37 0.00079 29.8 2.1 26 215-245 175-200 (218)
40 PF05579 Peptidase_S32: Equine 30.6 41 0.00089 30.5 2.4 21 379-403 208-228 (297)
41 PF00944 Peptidase_S3: Alphavi 29.5 36 0.00079 27.3 1.7 25 217-245 103-127 (158)
42 PF02907 Peptidase_S29: Hepati 28.1 38 0.00083 27.2 1.6 22 377-402 106-127 (148)
43 KOG1421|consensus 23.5 4.5E+02 0.0098 27.7 8.4 135 59-227 87-223 (955)
44 TIGR02860 spore_IV_B stage IV 20.3 72 0.0016 31.2 2.2 44 375-429 356-399 (402)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=5.6e-40 Score=297.75 Aligned_cols=226 Identities=38% Similarity=0.723 Sum_probs=192.5
Q ss_pred EeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEE
Q psy16044 27 LINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVR 106 (436)
Q Consensus 27 i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~ 106 (436)
|+||+++..++|||+|+|+... ..++|+||||+++||||||||+.... ...+.|++|...........+.
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~-----~~~~C~GtlIs~~~VLTaAhC~~~~~-----~~~~~v~~g~~~~~~~~~~~~~ 70 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTG-----GRHFCGGSLISPRWVLTAAHCVYSSA-----PSNYTVRLGSHDLSSNEGGGQV 70 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccC-----CcEEEEEEEeeCCEEEECHHhcCCCC-----CccEEEEeCcccccCCCCceEE
Confidence 6899999999999999998763 16899999999999999999997631 4678899998776654445677
Q ss_pred EeeeEEEeCCCCCC--CCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeee
Q psy16044 107 IPVERIRVHEEFHN--YHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVP 184 (436)
Q Consensus 107 ~~v~~i~~hp~y~~--~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~ 184 (436)
+.|.++++||+|+. ..+|||||||++|+. ++.+++|+|||... .....+..+.++|||...........++...+.
T Consensus 71 ~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~-~~~~v~picl~~~~-~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~ 148 (232)
T cd00190 71 IKVKKVIVHPNYNPSTYDNDIALLKLKRPVT-LSDNVRPICLPSSG-YNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVP 148 (232)
T ss_pred EEEEEEEECCCCCCCCCcCCEEEEEECCccc-CCCcccceECCCcc-ccCCCCCEEEEEeCCcCCCCCCCCceeeEEEee
Confidence 89999999999984 689999999999999 67789999999863 345577899999999987655566789999999
Q ss_pred ccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCC---CCccccccccee
Q psy16044 185 LHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCG---VGIRYSHRQPRL 261 (436)
Q Consensus 185 ~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~---~~~~y~~v~~~~ 261 (436)
+++...|...+.......+.++|+.......+.|.|||||||++.. +++|+|+||+|+|.. |+ .+.+|+++..+.
T Consensus 149 ~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~ 226 (232)
T cd00190 149 IVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYL 226 (232)
T ss_pred eECHHHhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEe-CCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhh
Confidence 9999999988865345667999998876568899999999999985 589999999999987 77 457899999999
Q ss_pred cCCcc
Q psy16044 262 INGKE 266 (436)
Q Consensus 262 ~~~~~ 266 (436)
.||.+
T Consensus 227 ~WI~~ 231 (232)
T cd00190 227 DWIQK 231 (232)
T ss_pred HHhhc
Confidence 99975
No 2
>KOG3627|consensus
Probab=100.00 E-value=1.6e-37 Score=286.11 Aligned_cols=235 Identities=38% Similarity=0.688 Sum_probs=188.5
Q ss_pred CCCeEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCcc-C
Q psy16044 23 RQPRLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEE-E 101 (436)
Q Consensus 23 ~~~~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~-~ 101 (436)
...||+||.++.+++|||+|+|+.... ..++|+|+||+++||||||||+.... .. .+.|++|.+..... .
T Consensus 9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~----~~~~Cggsli~~~~vltaaHC~~~~~----~~-~~~V~~G~~~~~~~~~ 79 (256)
T KOG3627|consen 9 PEGRIVGGTEAEPGSFPWQVSLQYGGN----GRHLCGGSLISPRWVLTAAHCVKGAS----AS-LYTVRLGEHDINLSVS 79 (256)
T ss_pred ccCCEeCCccCCCCCCCCEEEEEECCC----cceeeeeEEeeCCEEEEChhhCCCCC----Cc-ceEEEECccccccccc
Confidence 357999999999999999999987642 25799999999999999999997742 12 77888897654433 1
Q ss_pred Cc--eEEEeeeEEEeCCCCCC--CC-CceeEEEeCCCCCCCCCceeeeeecCCCCC-CCCCCCcEEEEecCccCCC-CCc
Q psy16044 102 KS--EVRIPVERIRVHEEFHN--YH-HDIALLKLSRPTSARDKGVRAVCLTDADKR-PVNPKQQCVATGWGRVKPK-GDL 174 (436)
Q Consensus 102 ~~--~~~~~v~~i~~hp~y~~--~~-~DiAll~L~~~~~~~~~~v~pi~l~~~~~~-~~~~~~~~~~~GwG~~~~~-~~~ 174 (436)
.. .....+.++++||.|+. .. ||||||+|++++. ++..|+|||||..... ....+..|.++|||.+... ...
T Consensus 80 ~~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~-~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~ 158 (256)
T KOG3627|consen 80 EGEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVT-FSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPL 158 (256)
T ss_pred cCchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcc-cCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCC
Confidence 11 24445778889999984 34 9999999999999 7889999999864332 3445589999999988754 345
Q ss_pred cccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCC---C
Q psy16044 175 VSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGV---G 251 (436)
Q Consensus 175 ~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~---~ 251 (436)
+..|++..+.+++...|...+.......+.++||+......++|+|||||||++.... +|+++||+|||.+.|+. +
T Consensus 159 ~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~-~~~~~GivS~G~~~C~~~~~P 237 (256)
T KOG3627|consen 159 PDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNG-RWVLVGIVSWGSGGCGQPNYP 237 (256)
T ss_pred CceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCC-cEEEEEEEEecCCCCCCCCCC
Confidence 6789999999999999999887542344568999986666889999999999998544 89999999999986655 6
Q ss_pred cccccccceecCCcccc
Q psy16044 252 IRYSHRQPRLINGKESI 268 (436)
Q Consensus 252 ~~y~~v~~~~~~~~~~~ 268 (436)
.+|++|..|..||.+.+
T Consensus 238 ~vyt~V~~y~~WI~~~~ 254 (256)
T KOG3627|consen 238 GVYTRVSSYLDWIKENI 254 (256)
T ss_pred eEEeEhHHhHHHHHHHh
Confidence 89999999999998654
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=6e-37 Score=277.50 Aligned_cols=223 Identities=38% Similarity=0.723 Sum_probs=185.8
Q ss_pred eEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceE
Q psy16044 26 RLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEV 105 (436)
Q Consensus 26 ~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~ 105 (436)
||+||+++..++|||+|.++... ..+.|+||||++++|||||||+.... ...+.|++|........ ...
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~-----~~~~C~GtlIs~~~VLTaahC~~~~~-----~~~~~v~~g~~~~~~~~-~~~ 69 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRG-----GRHFCGGSLISPRWVLTAAHCVYGSD-----PSNIRVRLGSHDLSSGE-EGQ 69 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcC-----CCcEEEEEEecCCEEEECHHHcCCCC-----CcceEEEeCcccCCCCC-Cce
Confidence 58999999999999999998653 26889999999999999999997642 45789999987655432 226
Q ss_pred EEeeeEEEeCCCCC--CCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCC-CCCccccceeee
Q psy16044 106 RIPVERIRVHEEFH--NYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKP-KGDLVSKLRQIR 182 (436)
Q Consensus 106 ~~~v~~i~~hp~y~--~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~-~~~~~~~l~~~~ 182 (436)
.+.|..++.||+|+ ...+|||||+|++|+. ++..++|+||+.. ......+..+.++|||.... .......++...
T Consensus 70 ~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~-~~~~~~pi~l~~~-~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~ 147 (229)
T smart00020 70 VIKVSKVIIHPNYNPSTYDNDIALLKLKSPVT-LSDNVRPICLPSS-NYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVN 147 (229)
T ss_pred EEeeEEEEECCCCCCCCCcCCEEEEEECcccC-CCCceeeccCCCc-ccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEE
Confidence 68899999999997 4789999999999998 6678999999985 22344678999999998763 234456788999
Q ss_pred eeccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCC---CCcccccccc
Q psy16044 183 VPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCG---VGIRYSHRQP 259 (436)
Q Consensus 183 ~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~---~~~~y~~v~~ 259 (436)
+.+++.+.|...+.......+.++|++......+.|.|||||||++.. + +|+|+||+|+|. .|+ .+.+|+++.+
T Consensus 148 ~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~-~C~~~~~~~~~~~i~~ 224 (229)
T smart00020 148 VPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGS-GCARPGKPGVYTRVSS 224 (229)
T ss_pred EEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECC-CCCCCCCCCEEEEecc
Confidence 999999999988765445677999998876558899999999999975 3 999999999999 587 5678999999
Q ss_pred eecCC
Q psy16044 260 RLING 264 (436)
Q Consensus 260 ~~~~~ 264 (436)
+..||
T Consensus 225 ~~~WI 229 (229)
T smart00020 225 YLDWI 229 (229)
T ss_pred ccccC
Confidence 99997
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00 E-value=1.9e-34 Score=259.39 Aligned_cols=215 Identities=36% Similarity=0.728 Sum_probs=178.3
Q ss_pred EeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEE
Q psy16044 27 LINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVR 106 (436)
Q Consensus 27 i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~ 106 (436)
|+||.++.+++|||+|.|+.... .++|+|+||+++||||||||+.. ...+.+.+|...........+.
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~-----~~~C~G~li~~~~vLTaahC~~~-------~~~~~v~~g~~~~~~~~~~~~~ 68 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNG-----RFFCTGTLISPRWVLTAAHCVDG-------ASDIKVRLGTYSIRNSDGSEQT 68 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTT-----EEEEEEEEEETTEEEEEGGGHTS-------GGSEEEEESESBTTSTTTTSEE
T ss_pred CCCCEECCCCCCCeEEEEeeCCC-----CeeEeEEeccccccccccccccc-------cccccccccccccccccccccc
Confidence 68999999999999999988642 68899999999999999999976 3567888998444444444588
Q ss_pred EeeeEEEeCCCCCC--CCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeee
Q psy16044 107 IPVERIRVHEEFHN--YHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVP 184 (436)
Q Consensus 107 ~~v~~i~~hp~y~~--~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~ 184 (436)
+.+++++.||+|+. ..+|||||+|++++. +...++|+|++... .....+..+.++|||.....+ ....++...+.
T Consensus 69 ~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~-~~~~~~~~~l~~~~-~~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~ 145 (220)
T PF00089_consen 69 IKVSKIIIHPKYDPSTYDNDIALLKLDRPIT-FGDNIQPICLPSAG-SDPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVP 145 (220)
T ss_dssp EEEEEEEEETTSBTTTTTTSEEEEEESSSSE-HBSSBEESBBTSTT-HTTTTTSEEEEEESSBSSTTS-BTSBEEEEEEE
T ss_pred ccccccccccccccccccccccccccccccc-cccccccccccccc-ccccccccccccccccccccc-ccccccccccc
Confidence 89999999999984 479999999999988 66789999999842 223678899999999976554 55678999999
Q ss_pred ccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCCC---ccccccccee
Q psy16044 185 LHNISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGVG---IRYSHRQPRL 261 (436)
Q Consensus 185 ~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~~---~~y~~v~~~~ 261 (436)
+++.+.|...+... ..+.++|++.. ...+.|.|||||||++... +|+||+|++.. |+.. .+|+++..+.
T Consensus 146 ~~~~~~c~~~~~~~--~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~~----~lvGI~s~~~~-c~~~~~~~v~~~v~~~~ 217 (220)
T PF00089_consen 146 VVSRKTCRSSYNDN--LTPNMICAGSS-GSGDACQGDSGGPLICNNN----YLVGIVSFGEN-CGSPNYPGVYTRVSSYL 217 (220)
T ss_dssp EEEHHHHHHHTTTT--STTTEEEEETT-SSSBGGTTTTTSEEEETTE----EEEEEEEEESS-SSBTTSEEEEEEGGGGH
T ss_pred cccccccccccccc--ccccccccccc-cccccccccccccccccee----eecceeeecCC-CCCCCcCEEEEEHHHhh
Confidence 99999999874332 45689999987 5589999999999998643 89999999955 6555 7899999999
Q ss_pred cCC
Q psy16044 262 ING 264 (436)
Q Consensus 262 ~~~ 264 (436)
+||
T Consensus 218 ~WI 220 (220)
T PF00089_consen 218 DWI 220 (220)
T ss_dssp HHH
T ss_pred ccC
Confidence 986
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=2e-28 Score=218.63 Aligned_cols=238 Identities=24% Similarity=0.330 Sum_probs=167.2
Q ss_pred CCCeEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCC
Q psy16044 23 RQPRLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEK 102 (436)
Q Consensus 23 ~~~~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~ 102 (436)
...||+||..|+.++||++|++....... ....+|+|+++..|||||||||+.... ........|..+..+..
T Consensus 29 vs~rIigGs~Anag~~P~~VaLv~~isd~-~s~tfCGgs~l~~RYvLTAAHC~~~~s--~is~d~~~vv~~l~d~S---- 101 (413)
T COG5640 29 VSSRIIGGSNANAGEYPSLVALVDRISDY-VSGTFCGGSKLGGRYVLTAAHCADASS--PISSDVNRVVVDLNDSS---- 101 (413)
T ss_pred cceeEecCcccccccCchHHHHHhhcccc-cceeEeccceecceEEeeehhhccCCC--CccccceEEEecccccc----
Confidence 57899999999999999999998765431 125689999999999999999998753 12233345555444333
Q ss_pred ceEEEeeeEEEeCCCCC--CCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCC---Ccc--
Q psy16044 103 SEVRIPVERIRVHEEFH--NYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKG---DLV-- 175 (436)
Q Consensus 103 ~~~~~~v~~i~~hp~y~--~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~---~~~-- 175 (436)
..+...|..++.|..|. +..||+|+++|.++...--..+...--+..-...+.........+|+.+.... ..+
T Consensus 102 q~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~g 181 (413)
T COG5640 102 QAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKG 181 (413)
T ss_pred cccCcceEEEeeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeecCCCCCCCcc
Confidence 23567899999999997 68899999999997762111111111111001112233444566777654321 112
Q ss_pred ccceeeeeeccchhhhhhhcCC----CcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCCC
Q psy16044 176 SKLRQIRVPLHNISVCRDKYGD----SVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGVG 251 (436)
Q Consensus 176 ~~l~~~~~~~~~~~~C~~~~~~----~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~~ 251 (436)
..|++..+..++...|...++. .....-.-+|++... .++|+||||||++....+ ...++||+|||.+-|+..
T Consensus 182 t~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~--~daCqGDSGGPi~~~g~~-G~vQ~GVvSwG~~~Cg~t 258 (413)
T COG5640 182 TILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPP--KDACQGDSGGPIFHKGEE-GRVQRGVVSWGDGGCGGT 258 (413)
T ss_pred ceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCCC--cccccCCCCCceEEeCCC-ccEEEeEEEecCCCCCCC
Confidence 3689999999999999988752 111222449998776 799999999999998644 458999999999989765
Q ss_pred ---cccccccceecCCccccCC
Q psy16044 252 ---IRYSHRQPRLINGKESIRG 270 (436)
Q Consensus 252 ---~~y~~v~~~~~~~~~~~~~ 270 (436)
.+||+|.-|..||.....+
T Consensus 259 ~~~gVyT~vsny~~WI~a~~~~ 280 (413)
T COG5640 259 LIPGVYTNVSNYQDWIAAMTNG 280 (413)
T ss_pred CcceeEEehhHHHHHHHHHhcC
Confidence 5899999999999875543
No 6
>KOG3627|consensus
Probab=99.94 E-value=3.4e-26 Score=210.62 Aligned_cols=158 Identities=39% Similarity=0.769 Sum_probs=134.1
Q ss_pred CCCcccccccc-cccceeecccccccCCCCeeeeccCCCCCC-CCCCCCeEEEEecCCCCCC-CCccccceEEEEEeeCc
Q psy16044 271 AWPWQNLITSF-LSAALLKLSRPTSARDKGVRAVCLTDADKR-PVNPKQQCVATGWGRVKPK-GDLVSKLRQIRVPLHNI 347 (436)
Q Consensus 271 ~~p~~~~~~~~-~diali~l~~~~~~~~~~v~picl~~~~~~-~~~~~~~~~~~Gwg~~~~~-~~~~~~l~~~~~~~~~~ 347 (436)
.||.|+..... ||||||+|.+++.++.. |+|||||..... ....+..|.++|||++... ...+..|+++.+++++.
T Consensus 94 ~H~~y~~~~~~~nDiall~l~~~v~~~~~-i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~ 172 (256)
T KOG3627|consen 94 VHPNYNPRTLENNDIALLRLSEPVTFSSH-IQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISN 172 (256)
T ss_pred ECCCCCCCCCCCCCEEEEEECCCcccCCc-ccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcCh
Confidence 68999998877 99999999999999988 999999855442 3455689999999987654 24578899999999999
Q ss_pred cccccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCC-CCCCCCCeEEEeCcccHHHHH
Q psy16044 348 SVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSG-CAKSGYPDVYTKLSFYLPWIR 426 (436)
Q Consensus 348 ~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~v~t~V~~~~~WI~ 426 (436)
++|+..+.....+.+.|||++......++|+|||||||++... ++++|+||+|||.. |+....|++||+|+.|.+||+
T Consensus 173 ~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~-~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~ 251 (256)
T KOG3627|consen 173 SECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDN-GRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIK 251 (256)
T ss_pred hHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeC-CcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHH
Confidence 9999988653235677899987656568999999999999863 48999999999988 988778999999999999999
Q ss_pred HHHh
Q psy16044 427 KQIN 430 (436)
Q Consensus 427 ~~i~ 430 (436)
+.+.
T Consensus 252 ~~~~ 255 (256)
T KOG3627|consen 252 ENIG 255 (256)
T ss_pred HHhc
Confidence 9875
No 7
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.92 E-value=2.2e-24 Score=195.41 Aligned_cols=156 Identities=41% Similarity=0.758 Sum_probs=134.4
Q ss_pred CCCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCCCCccccceEEEEEeeCccc
Q psy16044 270 GAWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISV 349 (436)
Q Consensus 270 ~~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~ 349 (436)
..||.|......+|||||+|++|+.++.. ++|||||.... ....+..+.+.|||........+..++...+.+++.++
T Consensus 77 ~~hp~y~~~~~~~DiAll~L~~~~~~~~~-v~picl~~~~~-~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~ 154 (232)
T cd00190 77 IVHPNYNPSTYDNDIALLKLKRPVTLSDN-VRPICLPSSGY-NLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAE 154 (232)
T ss_pred EECCCCCCCCCcCCEEEEEECCcccCCCc-ccceECCCccc-cCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHH
Confidence 35899988888999999999999999887 99999998763 45678999999999876554556789999999999999
Q ss_pred cccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHHHHH
Q psy16044 350 CRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWIRKQ 428 (436)
Q Consensus 350 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI~~~ 428 (436)
|...+.....+.++++|+.........|.|||||||++.. +++++|+||+|++..|.....|.+|++|.+|++||+++
T Consensus 155 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 155 CKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred hhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEe-CCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence 9988764335789999998765456899999999999985 58999999999999888656799999999999999864
No 8
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.87 E-value=9.1e-22 Score=177.99 Aligned_cols=151 Identities=42% Similarity=0.814 Sum_probs=128.1
Q ss_pred CCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCC-CCCccccceEEEEEeeCccc
Q psy16044 271 AWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKP-KGDLVSKLRQIRVPLHNISV 349 (436)
Q Consensus 271 ~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~-~~~~~~~l~~~~~~~~~~~~ 349 (436)
.||.|......+|+|||+|++|+.+... ++||||+.... ....+..+.+.|||.... .......++...+.+++.+.
T Consensus 78 ~~p~~~~~~~~~DiAll~L~~~i~~~~~-~~pi~l~~~~~-~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~ 155 (229)
T smart00020 78 IHPNYNPSTYDNDIALLKLKSPVTLSDN-VRPICLPSSNY-NVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNAT 155 (229)
T ss_pred ECCCCCCCCCcCCEEEEEECcccCCCCc-eeeccCCCccc-ccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHH
Confidence 4788888888999999999999998886 99999998733 345689999999998763 23345678999999999999
Q ss_pred cccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHH
Q psy16044 350 CRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWI 425 (436)
Q Consensus 350 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI 425 (436)
|...+.....+.+.++|++........|.||+||||++.. + +|+|+||+|++..|...+.|.+|++|.+|++||
T Consensus 156 C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 156 CRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred hhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEEeccccccC
Confidence 9988765335788999998765456899999999999985 4 999999999999998667799999999999998
No 9
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.83 E-value=1.8e-20 Score=168.33 Aligned_cols=145 Identities=41% Similarity=0.810 Sum_probs=124.6
Q ss_pred CCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCCCCccccceEEEEEeeCcccc
Q psy16044 271 AWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVC 350 (436)
Q Consensus 271 ~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~C 350 (436)
.||.|......+|+|||+|++++.+... ++|+||+.... ....++.+.+.||+...... ....++...+.+++.+.|
T Consensus 76 ~h~~~~~~~~~~DiAll~L~~~~~~~~~-~~~~~l~~~~~-~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~~~~~~~c 152 (220)
T PF00089_consen 76 IHPKYDPSTYDNDIALLKLDRPITFGDN-IQPICLPSAGS-DPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVPVVSRKTC 152 (220)
T ss_dssp EETTSBTTTTTTSEEEEEESSSSEHBSS-BEESBBTSTTH-TTTTTSEEEEEESSBSSTTS-BTSBEEEEEEEEEEHHHH
T ss_pred cccccccccccccccccccccccccccc-ccccccccccc-cccccccccccccccccccc-cccccccccccccccccc
Confidence 5788888888999999999999988888 99999998443 34788999999999865443 456899999999999999
Q ss_pred ccccCCCccCCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHH
Q psy16044 351 RDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWI 425 (436)
Q Consensus 351 ~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI 425 (436)
...+.. .+.+.++|+... .....|.|||||||++... +|+||.|++..|...+.|.+|++|+.|++||
T Consensus 153 ~~~~~~--~~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~~----~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 153 RSSYND--NLTPNMICAGSS-GSGDACQGDSGGPLICNNN----YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp HHHTTT--TSTTTEEEEETT-SSSBGGTTTTTSEEEETTE----EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccc--cccccccccccc-cccccccccccccccccee----eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 987544 368899999876 4468999999999998642 8999999999998877799999999999999
No 10
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.74 E-value=2.4e-17 Score=153.62 Aligned_cols=227 Identities=21% Similarity=0.334 Sum_probs=142.6
Q ss_pred cccccCceecCCCCCCccCCCCCCeEeCCcccCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCC
Q psy16044 2 INLCDTVTFARDCGVGIRYSHRQPRLINGKESIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFS 81 (436)
Q Consensus 2 ~~~~~~~~~~~~CG~~~~~~~~~~~i~~G~~a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~ 81 (436)
|+.+||..++..||.. ......++.+|..+...+.||+|.+....... ....++|||||+||||||+||+......
T Consensus 19 Lt~eEN~~rl~~CG~~--~~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~--~~~~~~gtlIS~RHiLtss~~~~~~~~~ 94 (282)
T PF03761_consen 19 LTEEENEERLETCGKK--KLPYPSKVFNGTPAESGEAPWAVSVYTKNHNE--GNYFSTGTLISPRHILTSSHCVMNDKSK 94 (282)
T ss_pred CCHHHHHHHHHhcCCC--CCCCcccccCCcccccCCCCCEEEEEeccCcc--cceecceEEeccCeEEEeeeEEEecccc
Confidence 5678899999999943 33355667999999999999999998765332 2466799999999999999999753221
Q ss_pred C---CCC------cc-eEEEeccc-----cC----CccCCceEEEeeeEEEeCCCC------CCCCCceeEEEeCCCCCC
Q psy16044 82 L---PIP------EL-WTAVLGDW-----DR----TEEEKSEVRIPVERIRVHEEF------HNYHHDIALLKLSRPTSA 136 (436)
Q Consensus 82 ~---~~~------~~-~~v~~g~~-----~~----~~~~~~~~~~~v~~i~~hp~y------~~~~~DiAll~L~~~~~~ 136 (436)
. ... .. ..+.+-.. .. ...........+.++++...- .....+++||+|+++
T Consensus 95 W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--- 171 (282)
T PF03761_consen 95 WLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--- 171 (282)
T ss_pred cccCcccccceeeCCCceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc---
Confidence 1 000 00 00000000 00 001111123445555542211 135679999999998
Q ss_pred CCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCC
Q psy16044 137 RDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSG 216 (436)
Q Consensus 137 ~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~ 216 (436)
++....|+|||+.. .....+..+.+.|+ .....+....+.+.....|. ..+|. ...
T Consensus 172 ~~~~~~~~Cl~~~~-~~~~~~~~~~~yg~-------~~~~~~~~~~~~i~~~~~~~-----------~~~~~-----~~~ 227 (282)
T PF03761_consen 172 FSKNVSPPCLADSS-TNWEKGDEVDVYGF-------NSTGKLKHRKLKITNCTKCA-----------YSICT-----KQY 227 (282)
T ss_pred ccccCCCEEeCCCc-cccccCceEEEeec-------CCCCeEEEEEEEEEEeeccc-----------eeEec-----ccc
Confidence 34568999999863 33556667777777 11233555555554433211 12222 257
Q ss_pred CccCCCCCeeeeEecCCcEEEEEEEeEcCCCCCC-Ccccccccce
Q psy16044 217 ACIGDSGGPLQCSLKDGRWYLAGITSFGSGYCGV-GIRYSHRQPR 260 (436)
Q Consensus 217 ~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~~C~~-~~~y~~v~~~ 260 (436)
.|.||+||||+... +|+|+|+||.+.+...|.. ...|.+|..+
T Consensus 228 ~~~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~ 271 (282)
T PF03761_consen 228 SCKGDRGGPLVKNI-NGRWTLIGVGASGNYECNKNNSYFFNVSWY 271 (282)
T ss_pred cCCCCccCeEEEEE-CCCEEEEEEEccCCCcccccccEEEEHHHh
Confidence 89999999999874 8999999999988866643 3444455433
No 11
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.61 E-value=8.8e-15 Score=131.74 Aligned_cols=160 Identities=24% Similarity=0.355 Sum_probs=108.8
Q ss_pred CCCCcccccccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCC---CCc--cccceEEEEEe
Q psy16044 270 GAWPWQNLITSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPK---GDL--VSKLRQIRVPL 344 (436)
Q Consensus 270 ~~~p~~~~~~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~---~~~--~~~l~~~~~~~ 344 (436)
..|-.|-..++.||+|+++|.++...-...+...--+..-.............+|+.+... ... ...++++.+..
T Consensus 112 ~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~gt~l~e~~v~f 191 (413)
T COG5640 112 YVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKGTILHEVAVLF 191 (413)
T ss_pred eeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeecCCCCCCCccceeeeeeeee
Confidence 3456666778899999999999655432112111111100112233444555666643211 111 24799999999
Q ss_pred eCccccccccCC--Ccc--CCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecCC-CCCCCCCeEEEeCc
Q psy16044 345 HNISVCRDKYGD--SVE--LHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSG-CAKSGYPDVYTKLS 419 (436)
Q Consensus 345 ~~~~~C~~~~~~--~~~--~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~v~t~V~ 419 (436)
.+...|.+..+. ... ..-.-+|++... ++.|+||||||++.+..+| ++|+||+|||.+ |+.+..|.|||+|+
T Consensus 192 v~~stc~~~~g~an~~dg~~~lT~~cag~~~--~daCqGDSGGPi~~~g~~G-~vQ~GVvSwG~~~Cg~t~~~gVyT~vs 268 (413)
T COG5640 192 VPLSTCAQYKGCANASDGATGLTGFCAGRPP--KDACQGDSGGPIFHKGEEG-RVQRGVVSWGDGGCGGTLIPGVYTNVS 268 (413)
T ss_pred echHHhhhhccccccCCCCCCccceecCCCC--cccccCCCCCceEEeCCCc-cEEEeEEEecCCCCCCCCcceeEEehh
Confidence 999999988741 111 112239998765 6999999999999986444 589999999987 99989999999999
Q ss_pred ccHHHHHHHHhhh
Q psy16044 420 FYLPWIRKQINIA 432 (436)
Q Consensus 420 ~~~~WI~~~i~~~ 432 (436)
.|.+||...|+..
T Consensus 269 ny~~WI~a~~~~l 281 (413)
T COG5640 269 NYQDWIAAMTNGL 281 (413)
T ss_pred HHHHHHHHHhcCC
Confidence 9999999988643
No 12
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.48 E-value=1.3e-12 Score=112.11 Aligned_cols=117 Identities=23% Similarity=0.477 Sum_probs=90.0
Q ss_pred CCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEecccc--CCccCCceEEEeeeEE
Q psy16044 35 RGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWD--RTEEEKSEVRIPVERI 112 (436)
Q Consensus 35 ~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~--~~~~~~~~~~~~v~~i 112 (436)
.-.|||.|.|+..+ .++|+|+||.+.|||++..|+.+-.. ...-+.|.+|... +....+.+|.+.|..+
T Consensus 13 ~y~WPWlA~IYvdG------~~~CsgvLlD~~WlLvsssCl~~I~L---~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~ 83 (267)
T PF09342_consen 13 DYHWPWLADIYVDG------RYWCSGVLLDPHWLLVSSSCLRGISL---SHHYVSALLGGGKTYLSVDGPHEQISRVDCF 83 (267)
T ss_pred cccCcceeeEEEcC------eEEEEEEEeccceEEEeccccCCccc---ccceEEEEecCcceecccCCChheEEEeeee
Confidence 34699999999875 78999999999999999999976322 1244678888654 2235566777777776
Q ss_pred EeCCCCCCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCc
Q psy16044 113 RVHEEFHNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 167 (436)
Q Consensus 113 ~~hp~y~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~ 167 (436)
..-|. .+++||.|++|+. |+.+|+|..||.. .........|.++|-..
T Consensus 84 ~~V~~-----S~v~LLHL~~~~~-fTr~VlP~flp~~-~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 84 KDVPE-----SNVLLLHLEQPAN-FTRYVLPTFLPET-SNENESDDECVAVGHDD 131 (267)
T ss_pred eeccc-----cceeeeeecCccc-ceeeecccccccc-cCCCCCCCceEEEEccc
Confidence 65555 7999999999999 8999999999974 33344556999999543
No 13
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.67 E-value=2.5e-07 Score=82.03 Aligned_cols=177 Identities=20% Similarity=0.191 Sum_probs=93.5
Q ss_pred cCCCCCcceEEEeeeccCCCCCCeeeEEEEecCCEEEecCCCcCCCCCCCCCCcceEEEe-ccccCCccCCceEEEeeeE
Q psy16044 33 SIRGAWPWQVSLQVLHPRLGLMPHWCGAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVL-GDWDRTEEEKSEVRIPVER 111 (436)
Q Consensus 33 a~~~~~Pw~v~i~~~~~~~~~~~~~C~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~-g~~~~~~~~~~~~~~~v~~ 111 (436)
....+|||-+-.......+ ..-|+++||+++.||||+||+.+.... . ..+.+.. |....... ...+....
T Consensus 44 ~dt~~~Py~av~~~~~~tG---~~~~~~~lI~pntvLTa~Hc~~s~~~G--~-~~~~~~p~g~~~~~~~---~~~~~~~~ 114 (251)
T COG3591 44 TDTTQFPYSAVVQFEAATG---RLCTAATLIGPNTVLTAGHCIYSPDYG--E-DDIAAAPPGVNSDGGP---FYGITKIE 114 (251)
T ss_pred ccCCCCCcceeEEeecCCC---cceeeEEEEcCceEEEeeeEEecCCCC--h-hhhhhcCCcccCCCCC---CCceeeEE
Confidence 4467899998886654332 445777999999999999999775321 0 1122221 22211111 11222223
Q ss_pred EEeCCC-C---CCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccc
Q psy16044 112 IRVHEE-F---HNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHN 187 (436)
Q Consensus 112 i~~hp~-y---~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~ 187 (436)
+...|. + +....|+..+.|+...+ +...+....++.. .....+....++||-...... ..+.+..-.+..
T Consensus 115 ~~~~~g~~~~~d~~~~~v~~~~~~~g~~-~~~~~~~~~~~~~--~~~~~~d~i~v~GYP~dk~~~---~~~~e~t~~v~~ 188 (251)
T COG3591 115 IRVYPGELYKEDGASYDVGEAALESGIN-IGDVVNYLKRNTA--SEAKANDRITVIGYPGDKPNI---GTMWESTGKVNS 188 (251)
T ss_pred EEecCCceeccCCceeeccHHHhccCCC-ccccccccccccc--cccccCceeEEEeccCCCCcc---eeEeeecceeEE
Confidence 323443 2 24556777777774443 2333343333332 233445558999986544210 001111000000
Q ss_pred hhhhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044 188 ISVCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGSG 246 (436)
Q Consensus 188 ~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~ 246 (436)
.+ ..+ ..-..+++.|+||+|++.... +++||.+-|..
T Consensus 189 ~~--------------~~~----l~y~~dT~pG~SGSpv~~~~~----~vigv~~~g~~ 225 (251)
T COG3591 189 IK--------------GNK----LFYDADTLPGSSGSPVLISKD----EVIGVHYNGPG 225 (251)
T ss_pred Ee--------------cce----EEEEecccCCCCCCceEecCc----eEEEEEecCCC
Confidence 00 000 001268999999999997532 89999988765
No 14
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=98.27 E-value=4e-06 Score=78.21 Aligned_cols=121 Identities=26% Similarity=0.504 Sum_probs=82.8
Q ss_pred cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCCCCCCCCccccceEEEEEeeCccccccccCCCcc
Q psy16044 280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVE 359 (436)
Q Consensus 280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~ 359 (436)
...+.+||+|+++ +... ..|+|||.... ....++...+.|+. ....+....+++.....|..
T Consensus 159 ~~~~~mIlEl~~~--~~~~-~~~~Cl~~~~~-~~~~~~~~~~yg~~-------~~~~~~~~~~~i~~~~~~~~------- 220 (282)
T PF03761_consen 159 RPYSPMILELEED--FSKN-VSPPCLADSST-NWEKGDEVDVYGFN-------STGKLKHRKLKITNCTKCAY------- 220 (282)
T ss_pred cccceEEEEEccc--cccc-CCCEEeCCCcc-ccccCceEEEeecC-------CCCeEEEEEEEEEEeeccce-------
Confidence 3467789999999 5555 89999997765 35566777777771 12335555555554322111
Q ss_pred CCCCceecccCCCCCCCccCCCCceeeeEecCCcEEEEEEEEecC-CCCCCCCCeEEEeCcccHHHHHHHHh
Q psy16044 360 LHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFGS-GCAKSGYPDVYTKLSFYLPWIRKQIN 430 (436)
Q Consensus 360 ~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~-~c~~~~~p~v~t~V~~~~~WI~~~i~ 430 (436)
.+| ..+..|.+|+||||+-.. +++|+|+||.+.+. .|... ...|.+|..|.+=|.+..+
T Consensus 221 ----~~~-----~~~~~~~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~~--~~~f~~v~~~~~~IC~ltG 280 (282)
T PF03761_consen 221 ----SIC-----TKQYSCKGDRGGPLVKNI-NGRWTLIGVGASGNYECNKN--NSYFFNVSWYQDEICELTG 280 (282)
T ss_pred ----eEe-----cccccCCCCccCeEEEEE-CCCEEEEEEEccCCCccccc--ccEEEEHHHhhhhhcccee
Confidence 122 224789999999999875 89999999998765 34322 5789999999887766543
No 15
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.19 E-value=2.2e-05 Score=77.59 Aligned_cols=85 Identities=19% Similarity=0.246 Sum_probs=58.4
Q ss_pred CeeeEEEEecCC-EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCC
Q psy16044 55 PHWCGAVLIHPS-WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRP 133 (436)
Q Consensus 55 ~~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~ 133 (436)
...++|.+|+++ +|||++|++.+. ..+.|.+... ..+..+-+..++. .||||||++.+
T Consensus 57 ~~~GSGfii~~~G~IlTn~Hvv~~~-------~~i~V~~~~~---------~~~~a~vv~~d~~-----~DlAllkv~~~ 115 (428)
T TIGR02037 57 RGLGSGVIISADGYILTNNHVVDGA-------DEITVTLSDG---------REFKAKLVGKDPR-----TDIAVLKIDAK 115 (428)
T ss_pred cceeeEEEECCCCEEEEcHHHcCCC-------CeEEEEeCCC---------CEEEEEEEEecCC-----CCEEEEEecCC
Confidence 467999999986 999999999663 4455555321 2234444444544 69999999864
Q ss_pred CCCCCCceeeeeecCCCCCCCCCCCcEEEEecCc
Q psy16044 134 TSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 167 (436)
Q Consensus 134 ~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~ 167 (436)
. .+.++.|.+. ..+..++.+.++|+..
T Consensus 116 ~-----~~~~~~l~~~--~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 116 K-----NLPVIKLGDS--DKLRVGDWVLAIGNPF 142 (428)
T ss_pred C-----CceEEEccCC--CCCCCCCEEEEEECCC
Confidence 2 2567777653 3456899999999864
No 16
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of a PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.06 E-value=0.00012 Score=70.08 Aligned_cols=160 Identities=18% Similarity=0.187 Sum_probs=89.1
Q ss_pred CCCcceEEEeeeccCCC-----CCCeeeEEEEecCC-EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEee
Q psy16044 36 GAWPWQVSLQVLHPRLG-----LMPHWCGAVLIHPS-WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPV 109 (436)
Q Consensus 36 ~~~Pw~v~i~~~~~~~~-----~~~~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v 109 (436)
.--|-+|.|........ ......+|.+|+++ +|||++|.+... ..+.|.+.. ...+..
T Consensus 53 ~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~-------~~i~V~~~d---------g~~~~a 116 (351)
T TIGR02038 53 RAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA-------DQIVVALQD---------GRKFEA 116 (351)
T ss_pred hcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC-------CEEEEEECC---------CCEEEE
Confidence 44699999976432110 11346999999977 999999998652 345555432 122344
Q ss_pred eEEEeCCCCCCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchh
Q psy16044 110 ERIRVHEEFHNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNIS 189 (436)
Q Consensus 110 ~~i~~hp~y~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~ 189 (436)
+-+..+|. .||||||++.+- +.++.+.. ...+..++.+.++|+...... ......+.-+...
T Consensus 117 ~vv~~d~~-----~DlAvlkv~~~~------~~~~~l~~--s~~~~~G~~V~aiG~P~~~~~-----s~t~GiIs~~~r~ 178 (351)
T TIGR02038 117 ELVGSDPL-----TDLAVLKIEGDN------LPTIPVNL--DRPPHVGDVVLAIGNPYNLGQ-----TITQGIISATGRN 178 (351)
T ss_pred EEEEecCC-----CCEEEEEecCCC------CceEeccC--cCccCCCCEEEEEeCCCCCCC-----cEEEEEEEeccCc
Confidence 44444544 799999998541 23444433 345678999999998642211 1111111111110
Q ss_pred hhhhhcCCCcCCcCCeEEeccCCCCCCCccCCCCCeeeeEecCCcEEEEEEEeEc
Q psy16044 190 VCRDKYGDSVELHGGHLCGGQLDGFSGACIGDSGGPLQCSLKDGRWYLAGITSFG 244 (436)
Q Consensus 190 ~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g 244 (436)
. +... .....+=+ ......|.|||||+-. +| .++||.+..
T Consensus 179 ~----~~~~--~~~~~iqt-----da~i~~GnSGGpl~n~--~G--~vIGI~~~~ 218 (351)
T TIGR02038 179 G----LSSV--GRQNFIQT-----DAAINAGNSGGALINT--NG--ELVGINTAS 218 (351)
T ss_pred c----cCCC--CcceEEEE-----CCccCCCCCcceEECC--CC--eEEEEEeee
Confidence 0 0000 00011111 1456789999999953 34 499998764
No 17
>PRK10898 serine endoprotease; Provisional
Probab=97.94 E-value=0.00035 Score=66.98 Aligned_cols=103 Identities=17% Similarity=0.157 Sum_probs=64.6
Q ss_pred CCCcceEEEeeeccCCC-----CCCeeeEEEEecCC-EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEee
Q psy16044 36 GAWPWQVSLQVLHPRLG-----LMPHWCGAVLIHPS-WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPV 109 (436)
Q Consensus 36 ~~~Pw~v~i~~~~~~~~-----~~~~~C~GtLIs~~-~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v 109 (436)
.--|-+|.|........ .....-+|.+|+++ +|||++|=+.+. ..+.|.+... ..+..
T Consensus 53 ~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a-------~~i~V~~~dg---------~~~~a 116 (353)
T PRK10898 53 RAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA-------DQIIVALQDG---------RVFEA 116 (353)
T ss_pred HhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC-------CEEEEEeCCC---------CEEEE
Confidence 34588998876532111 01256999999976 999999988652 4455655321 22334
Q ss_pred eEEEeCCCCCCCCCceeEEEeCCCCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCc
Q psy16044 110 ERIRVHEEFHNYHHDIALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 167 (436)
Q Consensus 110 ~~i~~hp~y~~~~~DiAll~L~~~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~ 167 (436)
+-+...|. .||||||++.+ . ..++.+.. ...+..+..+.++|+..
T Consensus 117 ~vv~~d~~-----~DlAvl~v~~~-~-----l~~~~l~~--~~~~~~G~~V~aiG~P~ 161 (353)
T PRK10898 117 LLVGSDSL-----TDLAVLKINAT-N-----LPVIPINP--KRVPHIGDVVLAIGNPY 161 (353)
T ss_pred EEEEEcCC-----CCEEEEEEcCC-C-----CCeeeccC--cCcCCCCCEEEEEeCCC
Confidence 43444544 79999999753 1 23444443 23456789999999853
No 18
>PRK10139 serine endoprotease; Provisional
Probab=97.90 E-value=0.00014 Score=72.00 Aligned_cols=142 Identities=21% Similarity=0.228 Sum_probs=81.7
Q ss_pred CeeeEEEEecC--CEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCC
Q psy16044 55 PHWCGAVLIHP--SWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSR 132 (436)
Q Consensus 55 ~~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~ 132 (436)
....+|.+|++ -+|||++|.+.+. ..+.|.+... ..+..+-+...|. .||||||++.
T Consensus 89 ~~~GSG~ii~~~~g~IlTn~HVv~~a-------~~i~V~~~dg---------~~~~a~vvg~D~~-----~DlAvlkv~~ 147 (455)
T PRK10139 89 EGLGSGVIIDAAKGYVLTNNHVINQA-------QKISIQLNDG---------REFDAKLIGSDDQ-----SDIALLQIQN 147 (455)
T ss_pred cceEEEEEEECCCCEEEeChHHhCCC-------CEEEEEECCC---------CEEEEEEEEEcCC-----CCEEEEEecC
Confidence 35799999974 6999999998653 4566665321 2344444445554 7999999985
Q ss_pred CCCCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCC
Q psy16044 133 PTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLD 212 (436)
Q Consensus 133 ~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~ 212 (436)
+-. ..++.|.+. ..+..++.+.++|+..... .......+.-+.... .. .......+=+
T Consensus 148 ~~~-----l~~~~lg~s--~~~~~G~~V~aiG~P~g~~-----~tvt~GivS~~~r~~----~~--~~~~~~~iqt---- 205 (455)
T PRK10139 148 PSK-----LTQIAIADS--DKLRVGDFAVAVGNPFGLG-----QTATSGIISALGRSG----LN--LEGLENFIQT---- 205 (455)
T ss_pred CCC-----CceeEecCc--cccCCCCEEEEEecCCCCC-----CceEEEEEccccccc----cC--CCCcceEEEE----
Confidence 422 446666653 4466799999999742111 111111111111100 00 0000011111
Q ss_pred CCCCCccCCCCCeeeeEecCCcEEEEEEEeEc
Q psy16044 213 GFSGACIGDSGGPLQCSLKDGRWYLAGITSFG 244 (436)
Q Consensus 213 ~~~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g 244 (436)
....-.|.|||||+-. +| .++||.+..
T Consensus 206 -da~in~GnSGGpl~n~--~G--~vIGi~~~~ 232 (455)
T PRK10139 206 -DASINRGNSGGALLNL--NG--ELIGINTAI 232 (455)
T ss_pred -CCccCCCCCcceEECC--CC--eEEEEEEEE
Confidence 2556789999999953 34 399999874
No 19
>PRK10942 serine endoprotease; Provisional
Probab=97.81 E-value=0.0003 Score=70.11 Aligned_cols=84 Identities=24% Similarity=0.330 Sum_probs=56.4
Q ss_pred CeeeEEEEecC--CEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCC
Q psy16044 55 PHWCGAVLIHP--SWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSR 132 (436)
Q Consensus 55 ~~~C~GtLIs~--~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~ 132 (436)
....+|.+|+. -+|||++|.+.+ ...+.|.+... ..+..+-+..+|. .||||||++.
T Consensus 110 ~~~GSG~ii~~~~G~IlTn~HVv~~-------a~~i~V~~~dg---------~~~~a~vv~~D~~-----~DlAvlki~~ 168 (473)
T PRK10942 110 MALGSGVIIDADKGYVVTNNHVVDN-------ATKIKVQLSDG---------RKFDAKVVGKDPR-----SDIALIQLQN 168 (473)
T ss_pred cceEEEEEEECCCCEEEeChhhcCC-------CCEEEEEECCC---------CEEEEEEEEecCC-----CCEEEEEecC
Confidence 35799999985 499999999865 24566665421 2234444445554 7999999975
Q ss_pred CCCCCCCceeeeeecCCCCCCCCCCCcEEEEecC
Q psy16044 133 PTSARDKGVRAVCLTDADKRPVNPKQQCVATGWG 166 (436)
Q Consensus 133 ~~~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG 166 (436)
+-. ..++.|.+ ...+..+..+.++|+-
T Consensus 169 ~~~-----l~~~~lg~--s~~l~~G~~V~aiG~P 195 (473)
T PRK10942 169 PKN-----LTAIKMAD--SDALRVGDYTVAIGNP 195 (473)
T ss_pred CCC-----CceeEecC--ccccCCCCEEEEEcCC
Confidence 322 34566654 3446678999999864
No 20
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.29 E-value=0.00042 Score=55.19 Aligned_cols=21 Identities=33% Similarity=0.556 Sum_probs=19.4
Q ss_pred eEEEEecCC-EEEecCCCcCCC
Q psy16044 58 CGAVLIHPS-WVVTAAHCIHND 78 (436)
Q Consensus 58 C~GtLIs~~-~VLTAAhC~~~~ 78 (436)
|+|.+|.++ +|||||||+...
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~ 22 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDW 22 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCC
T ss_pred CEEEEEcCCceEEEchhheecc
Confidence 799999999 999999999764
No 21
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=96.15 E-value=0.031 Score=58.49 Aligned_cols=66 Identities=18% Similarity=0.193 Sum_probs=36.9
Q ss_pred EEEEecCCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCCCCCC
Q psy16044 59 GAVLIHPSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPTSARD 138 (436)
Q Consensus 59 ~GtLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~~~~~ 138 (436)
..|||+|++|+|++|=.... -.|.+|.... ..+.+..--.|+. .|..+-||.+=|.
T Consensus 68 ~aTLigpqYiVSV~HN~~gy---------~~v~FG~~g~-------~~Y~iV~RNn~~~-----~Df~~pRLnK~VT--- 123 (769)
T PF02395_consen 68 VATLIGPQYIVSVKHNGKGY---------NSVSFGNEGQ-------NTYKIVDRNNYPS-----GDFHMPRLNKFVT--- 123 (769)
T ss_dssp S-EEEETTEEEBETTG-TSC---------CEECESCSST-------CEEEEEEEEBETT-----STEBEEEESS------
T ss_pred eEEEecCCeEEEEEccCCCc---------CceeecccCC-------ceEEEEEccCCCC-----cccceeecCceEE---
Confidence 38999999999999986221 2577776432 2233333333433 5999999998665
Q ss_pred CceeeeeecCC
Q psy16044 139 KGVRAVCLTDA 149 (436)
Q Consensus 139 ~~v~pi~l~~~ 149 (436)
.+.|+.....
T Consensus 124 -EvaP~~~t~~ 133 (769)
T PF02395_consen 124 -EVAPAEMTTA 133 (769)
T ss_dssp -SS----BBSS
T ss_pred -EEeccccccc
Confidence 2677766553
No 22
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=93.68 E-value=0.17 Score=44.55 Aligned_cols=45 Identities=22% Similarity=0.347 Sum_probs=36.8
Q ss_pred cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044 280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 326 (436)
Q Consensus 280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~ 326 (436)
...+++|++|++|+.|+.+ |+|..||+... .+.....|...|-..
T Consensus 87 ~~S~v~LLHL~~~~~fTr~-VlP~flp~~~~-~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 87 PESNVLLLHLEQPANFTRY-VLPTFLPETSN-ENESDDECVAVGHDD 131 (267)
T ss_pred cccceeeeeecCcccceee-ecccccccccC-CCCCCCceEEEEccc
Confidence 4688999999999999999 99999997544 344555999998664
No 23
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=90.18 E-value=0.53 Score=49.64 Aligned_cols=54 Identities=26% Similarity=0.425 Sum_probs=32.5
Q ss_pred CCccCCCCceeeeEe-cCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHHHHHHhhh
Q psy16044 375 GACIGDSGGPLQCSL-KDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWIRKQINIA 432 (436)
Q Consensus 375 ~~c~gdsGgpl~~~~-~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI~~~i~~~ 432 (436)
..-.||||+||+.-+ +..+|+|+|+.+.+.+..... ..++-+. .+|+++.+.+.
T Consensus 212 ~~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~--~~~~~~~--~~f~~~~~~~d 266 (769)
T PF02395_consen 212 YGSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKG--NWWNVIP--PDFINQIKQND 266 (769)
T ss_dssp B--TT-TT-EEEEEETTTTEEEEEEEEEEECCCCHSE--EEEEEEC--HHHHHHHHHHC
T ss_pred ccccCcCCCceEEEEccCCeEEEEEEEccccccCCcc--ceeEEec--HHHHHHHHhhh
Confidence 355699999998876 577999999999876543322 3343332 45555555443
No 24
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=85.09 E-value=5.3 Score=39.58 Aligned_cols=40 Identities=20% Similarity=0.199 Sum_probs=28.0
Q ss_pred cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044 280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 326 (436)
Q Consensus 280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~ 326 (436)
...|+|||+++.+ .. ..++.|.... ....++.+++.|+..
T Consensus 103 ~~~DlAllkv~~~----~~-~~~~~l~~~~--~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 103 PRTDIAVLKIDAK----KN-LPVIKLGDSD--KLRVGDWVLAIGNPF 142 (428)
T ss_pred CCCCEEEEEecCC----CC-ceEEEccCCC--CCCCCCEEEEEECCC
Confidence 3579999999865 12 5566665332 357899999999864
No 25
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=82.07 E-value=27 Score=33.49 Aligned_cols=144 Identities=21% Similarity=0.220 Sum_probs=75.2
Q ss_pred eeeEEEEec-CCEEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCC
Q psy16044 56 HWCGAVLIH-PSWVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPT 134 (436)
Q Consensus 56 ~~C~GtLIs-~~~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~ 134 (436)
...+|.+++ ..+|||..|=+.. ...+.+.+. ....+..+.+-..+ ..|+|+||.+..-
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~-------a~~i~v~l~---------dg~~~~a~~vg~d~-----~~dlavlki~~~~ 130 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAG-------AEEITVTLA---------DGREVPAKLVGKDP-----ISDLAVLKIDGAG 130 (347)
T ss_pred ccccEEEEcCCeEEEecceecCC-------cceEEEEeC---------CCCEEEEEEEecCC-----ccCEEEEEeccCC
Confidence 678999999 7899999998754 234444441 11334444443333 3799999998653
Q ss_pred CCCCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCC
Q psy16044 135 SARDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGF 214 (436)
Q Consensus 135 ~~~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~ 214 (436)
. ...+.+.. ...+..++...+.|-..-. ........+..+... +-..... ....+ ...
T Consensus 131 ~-----~~~~~~~~--s~~l~vg~~v~aiGnp~g~-----~~tvt~Givs~~~r~-~v~~~~~----~~~~I-----qtd 188 (347)
T COG0265 131 G-----LPVIALGD--SDKLRVGDVVVAIGNPFGL-----GQTVTSGIVSALGRT-GVGSAGG----YVNFI-----QTD 188 (347)
T ss_pred C-----CceeeccC--CCCcccCCEEEEecCCCCc-----ccceeccEEeccccc-cccCccc----ccchh-----hcc
Confidence 2 22333333 2334456666677643221 011111112222211 1110000 00111 111
Q ss_pred CCCccCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044 215 SGACIGDSGGPLQCSLKDGRWYLAGITSFGSG 246 (436)
Q Consensus 215 ~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~~ 246 (436)
...+.|.||||++.. ++ .++||.+....
T Consensus 189 Aain~gnsGgpl~n~--~g--~~iGint~~~~ 216 (347)
T COG0265 189 AAINPGNSGGPLVNI--DG--EVVGINTAIIA 216 (347)
T ss_pred cccCCCCCCCceEcC--CC--cEEEEEEEEec
Confidence 568899999999963 33 49998887654
No 26
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=81.20 E-value=2.6 Score=37.94 Aligned_cols=52 Identities=23% Similarity=0.369 Sum_probs=35.1
Q ss_pred CCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcc-cHHHHHHHHh
Q psy16044 375 GACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSF-YLPWIRKQIN 430 (436)
Q Consensus 375 ~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~-~~~WI~~~i~ 430 (436)
+++.|+||+|+..... +++||.+-+..-.......-.+|+.. +++||++.++
T Consensus 199 dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~~ 251 (251)
T COG3591 199 DTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNIK 251 (251)
T ss_pred cccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhhC
Confidence 7888999999987642 79999988764322122233444444 7889887653
No 27
>PRK10898 serine endoprotease; Provisional
Probab=80.99 E-value=10 Score=36.48 Aligned_cols=40 Identities=20% Similarity=0.216 Sum_probs=25.4
Q ss_pred ccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044 279 TSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 326 (436)
Q Consensus 279 ~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~ 326 (436)
....|+||||++.+ . ..++.|.. ......++.+++.|+..
T Consensus 122 d~~~DlAvl~v~~~-~-----l~~~~l~~--~~~~~~G~~V~aiG~P~ 161 (353)
T PRK10898 122 DSLTDLAVLKINAT-N-----LPVIPINP--KRVPHIGDVVLAIGNPY 161 (353)
T ss_pred cCCCCEEEEEEcCC-C-----CCeeeccC--cCcCCCCCEEEEEeCCC
Confidence 34689999999764 1 22233322 22456899999999753
No 28
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=78.40 E-value=11 Score=33.66 Aligned_cols=137 Identities=18% Similarity=0.213 Sum_probs=57.0
Q ss_pred EEecCCEEEecCCCcCCCCCCCCCCcceEEE--eccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCCCCCC
Q psy16044 61 VLIHPSWVVTAAHCIHNDIFSLPIPELWTAV--LGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPTSARD 138 (436)
Q Consensus 61 tLIs~~~VLTAAhC~~~~~~~~~~~~~~~v~--~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~~~~~ 138 (436)
-|.--.||||-+|-+....+ .+.+. .|.+..... ...++..-+ ..||.||+|.+.++-+.
T Consensus 36 gigyG~~iItn~HLf~~nng------~L~i~s~hG~f~v~nt-------~~lkv~~i~-----~~DiviirmPkDfpPf~ 97 (235)
T PF00863_consen 36 GIGYGSYIITNAHLFKRNNG------ELTIKSQHGEFTVPNT-------TQLKVHPIE-----GRDIVIIRMPKDFPPFP 97 (235)
T ss_dssp EEEETTEEEEEGGGGSSTTC------EEEEEETTEEEEECEG-------GGSEEEE-T-----CSSEEEEE--TTS----
T ss_pred EEeECCEEEEChhhhccCCC------eEEEEeCceEEEcCCc-------cccceEEeC-----CccEEEEeCCcccCCcc
Confidence 35567899999999866422 23332 222221111 011222222 46999999998876322
Q ss_pred CceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCCCc
Q psy16044 139 KGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSGAC 218 (436)
Q Consensus 139 ~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C 218 (436)
. -+++ +.+..+..+.++|--...... .....+. ..+... .. ..|=.- --++=
T Consensus 98 ~---kl~F-----R~P~~~e~v~mVg~~fq~k~~--~s~vSes--S~i~p~-~~-----------~~fWkH----wIsTk 149 (235)
T PF00863_consen 98 Q---KLKF-----RAPKEGERVCMVGSNFQEKSI--SSTVSES--SWIYPE-EN-----------SHFWKH----WISTK 149 (235)
T ss_dssp S------B---------TT-EEEEEEEECSSCCC--EEEEEEE--EEEEEE-TT-----------TTEEEE-----C---
T ss_pred h---hhhc-----cCCCCCCEEEEEEEEEEcCCe--eEEECCc--eEEeec-CC-----------CCeeEE----EecCC
Confidence 2 1121 334466777888854333111 1111111 000000 00 000000 02345
Q ss_pred cCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044 219 IGDSGGPLQCSLKDGRWYLAGITSFGSG 246 (436)
Q Consensus 219 ~gDsGgPl~~~~~~~~~~l~GI~S~g~~ 246 (436)
.||=|.||+.. .+| .++||-|.+..
T Consensus 150 ~G~CG~PlVs~-~Dg--~IVGiHsl~~~ 174 (235)
T PF00863_consen 150 DGDCGLPLVST-KDG--KIVGIHSLTSN 174 (235)
T ss_dssp TT-TT-EEEET-TT----EEEEEEEEET
T ss_pred CCccCCcEEEc-CCC--cEEEEEcCccC
Confidence 68889999986 345 49999998754
No 29
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=78.27 E-value=7.9 Score=37.21 Aligned_cols=40 Identities=18% Similarity=0.213 Sum_probs=26.2
Q ss_pred ccccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecCC
Q psy16044 279 TSFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWGR 326 (436)
Q Consensus 279 ~~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg~ 326 (436)
....|+||||++.+- ..++.+. +......++.+.+.|+..
T Consensus 122 d~~~DlAvlkv~~~~------~~~~~l~--~s~~~~~G~~V~aiG~P~ 161 (351)
T TIGR02038 122 DPLTDLAVLKIEGDN------LPTIPVN--LDRPPHVGDVVLAIGNPY 161 (351)
T ss_pred cCCCCEEEEEecCCC------CceEecc--CcCccCCCCEEEEEeCCC
Confidence 346899999998641 2223332 222467899999999864
No 30
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=76.08 E-value=2.6 Score=33.34 Aligned_cols=34 Identities=26% Similarity=0.461 Sum_probs=25.8
Q ss_pred cCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCccc
Q psy16044 378 IGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFY 421 (436)
Q Consensus 378 ~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~ 421 (436)
.||-||+|.|+-. ++||++.|-+ ....|++++.+
T Consensus 89 PGdCGg~L~C~HG-----ViGi~Tagg~-----g~VaF~dir~~ 122 (127)
T PF00947_consen 89 PGDCGGILRCKHG-----VIGIVTAGGE-----GHVAFADIRDL 122 (127)
T ss_dssp TT-TCSEEEETTC-----EEEEEEEEET-----TEEEEEECCCG
T ss_pred CCCCCceeEeCCC-----eEEEEEeCCC-----ceEEEEechhh
Confidence 3899999999743 9999998652 24679999875
No 31
>PRK10139 serine endoprotease; Provisional
Probab=75.15 E-value=17 Score=36.37 Aligned_cols=39 Identities=23% Similarity=0.372 Sum_probs=26.4
Q ss_pred cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecC
Q psy16044 280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWG 325 (436)
Q Consensus 280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg 325 (436)
...|+||||++.+.. ..++.|.... ....++.++..|.-
T Consensus 136 ~~~DlAvlkv~~~~~-----l~~~~lg~s~--~~~~G~~V~aiG~P 174 (455)
T PRK10139 136 DQSDIALLQIQNPSK-----LTQIAIADSD--KLRVGDFAVAVGNP 174 (455)
T ss_pred CCCCEEEEEecCCCC-----CceeEecCcc--ccCCCCEEEEEecC
Confidence 458999999985422 3445554332 45689999999874
No 32
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=69.08 E-value=15 Score=35.34 Aligned_cols=41 Identities=24% Similarity=0.438 Sum_probs=26.6
Q ss_pred CeEEeccCCCC--CCCccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044 204 GHLCGGQLDGF--SGACIGDSGGPLQCSLKDGRWYLAGITSFGS 245 (436)
Q Consensus 204 ~~~Ca~~~~~~--~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~ 245 (436)
.|+-++....+ -++-.||-|-|-++. .++.|+++||..-..
T Consensus 485 GMLLTGaNAK~mDLGT~PGDCGcPYvyK-rgNd~VV~GVH~AAt 527 (535)
T PF05416_consen 485 GMLLTGANAKGMDLGTIPGDCGCPYVYK-RGNDWVVIGVHAAAT 527 (535)
T ss_dssp EEETTSTT-SSTTTS--TTGTT-EEEEE-ETTEEEEEEEEEEE-
T ss_pred eeeeecCCccccccCCCCCCCCCceeee-cCCcEEEEEEEehhc
Confidence 45555544332 456799999999997 588999999976543
No 33
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=59.98 E-value=5.2 Score=34.05 Aligned_cols=29 Identities=28% Similarity=0.411 Sum_probs=22.1
Q ss_pred CCccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044 216 GACIGDSGGPLQCSLKDGRWYLAGITSFGS 245 (436)
Q Consensus 216 ~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~ 245 (436)
.+..|+=||||+... ++...++||-.-|.
T Consensus 143 ~t~~G~CG~~l~~~~-~~~~~i~GiHvaG~ 171 (172)
T PF00548_consen 143 PTKPGMCGSPLVSRI-GGQGKIIGIHVAGN 171 (172)
T ss_dssp EEETTGTTEEEEESC-GGTTEEEEEEEEEE
T ss_pred CCCCCccCCeEEEee-ccCccEEEEEeccC
Confidence 355789999999863 45668999987764
No 34
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=57.17 E-value=8 Score=30.66 Aligned_cols=23 Identities=39% Similarity=0.732 Sum_probs=18.3
Q ss_pred cCCCCCeeeeEecCCcEEEEEEEeEcCC
Q psy16044 219 IGDSGGPLQCSLKDGRWYLAGITSFGSG 246 (436)
Q Consensus 219 ~gDsGgPl~~~~~~~~~~l~GI~S~g~~ 246 (436)
.||-||+|.|+-. ++||++-|..
T Consensus 89 PGdCGg~L~C~HG-----ViGi~Tagg~ 111 (127)
T PF00947_consen 89 PGDCGGILRCKHG-----VIGIVTAGGE 111 (127)
T ss_dssp TT-TCSEEEETTC-----EEEEEEEEET
T ss_pred CCCCCceeEeCCC-----eEEEEEeCCC
Confidence 7999999999732 9999998754
No 35
>PRK10942 serine endoprotease; Provisional
Probab=56.49 E-value=58 Score=32.78 Aligned_cols=39 Identities=23% Similarity=0.314 Sum_probs=25.0
Q ss_pred cccccceeecccccccCCCCeeeeccCCCCCCCCCCCCeEEEEecC
Q psy16044 280 SFLSAALLKLSRPTSARDKGVRAVCLTDADKRPVNPKQQCVATGWG 325 (436)
Q Consensus 280 ~~~diali~l~~~~~~~~~~v~picl~~~~~~~~~~~~~~~~~Gwg 325 (436)
...|+||||++.+-. ..++.|... .....++.+++.|.-
T Consensus 157 ~~~DlAvlki~~~~~-----l~~~~lg~s--~~l~~G~~V~aiG~P 195 (473)
T PRK10942 157 PRSDIALIQLQNPKN-----LTAIKMADS--DALRVGDYTVAIGNP 195 (473)
T ss_pred CCCCEEEEEecCCCC-----CceeEecCc--cccCCCCEEEEEcCC
Confidence 458999999864322 334444322 245788999988864
No 36
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=56.35 E-value=6.8 Score=41.12 Aligned_cols=22 Identities=27% Similarity=0.564 Sum_probs=19.5
Q ss_pred eeEEEEecCC-EEEecCCCcCCC
Q psy16044 57 WCGAVLIHPS-WVVTAAHCIHND 78 (436)
Q Consensus 57 ~C~GtLIs~~-~VLTAAhC~~~~ 78 (436)
.|+|++||++ .|||--||..+.
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~~~ 70 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGYGA 70 (698)
T ss_pred ceeEEEEcCCceEEecchhhhhH
Confidence 4999999997 999999998763
No 37
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=45.14 E-value=20 Score=31.43 Aligned_cols=25 Identities=32% Similarity=0.353 Sum_probs=21.5
Q ss_pred CCccCCCCceeeeEecCCcEEEEEEEEecC
Q psy16044 375 GACIGDSGGPLQCSLKDGRWYLAGITSFGS 404 (436)
Q Consensus 375 ~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~ 404 (436)
+.-+|.||+|++.+++ |+|-+++..
T Consensus 176 GIvqGMSGSPI~qdGK-----LiGAVthvf 200 (218)
T PF05580_consen 176 GIVQGMSGSPIIQDGK-----LIGAVTHVF 200 (218)
T ss_pred CEEecccCCCEEECCE-----EEEEEEEEE
Confidence 6889999999987654 999999875
No 38
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=32.58 E-value=39 Score=30.63 Aligned_cols=22 Identities=41% Similarity=0.656 Sum_probs=15.4
Q ss_pred cCCCCCeeeeEecCCcEEEEEEEeEc
Q psy16044 219 IGDSGGPLQCSLKDGRWYLAGITSFG 244 (436)
Q Consensus 219 ~gDsGgPl~~~~~~~~~~l~GI~S~g 244 (436)
.||||+|++.. +| .|+||.+-.
T Consensus 207 ~GDSGSPVVt~--dg--~liGVHTGS 228 (297)
T PF05579_consen 207 PGDSGSPVVTE--DG--DLIGVHTGS 228 (297)
T ss_dssp GGCTT-EEEET--TC---EEEEEEEE
T ss_pred CCCCCCccCcC--CC--CEEEEEecC
Confidence 68999999964 44 399997643
No 39
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=30.81 E-value=37 Score=29.84 Aligned_cols=26 Identities=27% Similarity=0.277 Sum_probs=21.7
Q ss_pred CCCccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044 215 SGACIGDSGGPLQCSLKDGRWYLAGITSFGS 245 (436)
Q Consensus 215 ~~~C~gDsGgPl~~~~~~~~~~l~GI~S~g~ 245 (436)
.+.-+|-||+|++.++ .|+|-++++.
T Consensus 175 GGIvqGMSGSPI~qdG-----KLiGAVthvf 200 (218)
T PF05580_consen 175 GGIVQGMSGSPIIQDG-----KLIGAVTHVF 200 (218)
T ss_pred CCEEecccCCCEEECC-----EEEEEEEEEE
Confidence 5688999999999765 3999998874
No 40
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=30.61 E-value=41 Score=30.52 Aligned_cols=21 Identities=38% Similarity=0.591 Sum_probs=15.2
Q ss_pred CCCCceeeeEecCCcEEEEEEEEec
Q psy16044 379 GDSGGPLQCSLKDGRWYLAGITSFG 403 (436)
Q Consensus 379 gdsGgpl~~~~~~~~~~l~Gi~S~~ 403 (436)
||||+|++..+ + .|+||-+-+
T Consensus 208 GDSGSPVVt~d--g--~liGVHTGS 228 (297)
T PF05579_consen 208 GDSGSPVVTED--G--DLIGVHTGS 228 (297)
T ss_dssp GCTT-EEEETT--C---EEEEEEEE
T ss_pred CCCCCccCcCC--C--CEEEEEecC
Confidence 89999999853 3 399998654
No 41
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=29.47 E-value=36 Score=27.27 Aligned_cols=25 Identities=36% Similarity=0.512 Sum_probs=17.4
Q ss_pred CccCCCCCeeeeEecCCcEEEEEEEeEcC
Q psy16044 217 ACIGDSGGPLQCSLKDGRWYLAGITSFGS 245 (436)
Q Consensus 217 ~C~gDsGgPl~~~~~~~~~~l~GI~S~g~ 245 (436)
.-.||||-|++-+ .|+ ++||+--|.
T Consensus 103 g~~GDSGRpi~DN--sGr--VVaIVLGG~ 127 (158)
T PF00944_consen 103 GKPGDSGRPIFDN--SGR--VVAIVLGGA 127 (158)
T ss_dssp -STTSTTEEEEST--TSB--EEEEEEEEE
T ss_pred CCCCCCCCccCcC--CCC--EEEEEecCC
Confidence 3479999999853 444 888876553
No 42
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=28.06 E-value=38 Score=27.19 Aligned_cols=22 Identities=36% Similarity=0.724 Sum_probs=14.8
Q ss_pred ccCCCCceeeeEecCCcEEEEEEEEe
Q psy16044 377 CIGDSGGPLQCSLKDGRWYLAGITSF 402 (436)
Q Consensus 377 c~gdsGgpl~~~~~~~~~~l~Gi~S~ 402 (436)
-.|.||||+.|. .| ..+||+-.
T Consensus 106 lkGSSGgPiLC~--~G--H~vG~f~a 127 (148)
T PF02907_consen 106 LKGSSGGPILCP--SG--HAVGMFRA 127 (148)
T ss_dssp HTT-TT-EEEET--TS--EEEEEEEE
T ss_pred EecCCCCcccCC--CC--CEEEEEEE
Confidence 347799999996 34 59998654
No 43
>KOG1421|consensus
Probab=23.50 E-value=4.5e+02 Score=27.66 Aligned_cols=135 Identities=17% Similarity=0.192 Sum_probs=0.0
Q ss_pred EEEEecCC--EEEecCCCcCCCCCCCCCCcceEEEeccccCCccCCceEEEeeeEEEeCCCCCCCCCceeEEEeCCCCCC
Q psy16044 59 GAVLIHPS--WVVTAAHCIHNDIFSLPIPELWTAVLGDWDRTEEEKSEVRIPVERIRVHEEFHNYHHDIALLKLSRPTSA 136 (436)
Q Consensus 59 ~GtLIs~~--~VLTAAhC~~~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~i~~hp~y~~~~~DiAll~L~~~~~~ 136 (436)
+|.++++. ++||+.|-+ .+..+...+ ....=..+-+.|-|...-+|+.+++.+..-.
T Consensus 87 tgfvvd~~~gyiLtnrhvv--------~pgP~va~a------------vf~n~ee~ei~pvyrDpVhdfGf~r~dps~i- 145 (955)
T KOG1421|consen 87 TGFVVDKKLGYILTNRHVV--------APGPFVASA------------VFDNHEEIEIYPVYRDPVHDFGFFRYDPSTI- 145 (955)
T ss_pred eEEEEecccceEEEecccc--------CCCCceeEE------------EecccccCCcccccCCchhhcceeecChhhc-
Q ss_pred CCCceeeeeecCCCCCCCCCCCcEEEEecCccCCCCCccccceeeeeeccchhhhhhhcCCCcCCcCCeEEeccCCCCCC
Q psy16044 137 RDKGVRAVCLTDADKRPVNPKQQCVATGWGRVKPKGDLVSKLRQIRVPLHNISVCRDKYGDSVELHGGHLCGGQLDGFSG 216 (436)
Q Consensus 137 ~~~~v~pi~l~~~~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~ 216 (436)
.-..++.+||.. .....+..-+++| .....+..+-.--.+.-...........-..|-+.+.+-..+
T Consensus 146 r~s~vt~i~lap---~~akvgseirvvg----------NDagEklsIlagflSrldr~apdyg~~~yndfnTfy~Qaass 212 (955)
T KOG1421|consen 146 RFSIVTEICLAP---ELAKVGSEIRVVG----------NDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYIQAASS 212 (955)
T ss_pred ceeeeeccccCc---cccccCCceEEec----------CCccceEEeehhhhhhccCCCccccccccccccceeeeehhc
Q ss_pred CccCCCCCeee
Q psy16044 217 ACIGDSGGPLQ 227 (436)
Q Consensus 217 ~C~gDsGgPl~ 227 (436)
+-.|.||+|++
T Consensus 213 tsggssgspVv 223 (955)
T KOG1421|consen 213 TSGGSSGSPVV 223 (955)
T ss_pred CCCCCCCCcee
No 44
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=20.32 E-value=72 Score=31.15 Aligned_cols=44 Identities=25% Similarity=0.347 Sum_probs=29.1
Q ss_pred CCccCCCCceeeeEecCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHHHHHH
Q psy16044 375 GACIGDSGGPLQCSLKDGRWYLAGITSFGSGCAKSGYPDVYTKLSFYLPWIRKQI 429 (436)
Q Consensus 375 ~~c~gdsGgpl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~v~t~V~~~~~WI~~~i 429 (436)
+.-+|.||+|++.+++ |+|-++.-.--.++...++ |++|..+..
T Consensus 356 GivqGMSGSPi~q~gk-----liGAvtHVfvndpt~GYGi------~ie~Ml~~~ 399 (402)
T TIGR02860 356 GIVQGMSGSPIIQNGK-----VIGAVTHVFVNDPTSGYGV------YIEWMLKEA 399 (402)
T ss_pred CEEecccCCCEEECCE-----EEEEEEEEEecCCCcceee------hHHHHHHHh
Confidence 7889999999998665 9998886431111111233 578887654
Done!