Query psy15063
Match_columns 417
No_of_seqs 361 out of 2209
Neff 7.8
Searched_HMMs 46136
Date Fri Aug 16 19:29:46 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy15063.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/15063hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 2.7E-39 5.9E-44 301.0 24.3 224 105-412 1-232 (232)
2 KOG3627|consensus 100.0 2.2E-38 4.7E-43 302.1 23.5 230 103-414 11-255 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 9.5E-36 2E-40 277.7 23.9 220 105-409 2-229 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 4.5E-34 9.8E-39 264.0 23.3 215 105-409 1-220 (220)
5 COG5640 Secreted trypsin-like 100.0 1.5E-27 3.1E-32 226.9 16.0 228 101-414 29-279 (413)
6 PF03761 DUF316: Domain of unk 99.5 5.9E-13 1.3E-17 129.0 18.5 223 90-415 28-281 (282)
7 PF09342 DUF1986: Domain of un 99.2 2.4E-10 5.2E-15 104.9 14.5 112 118-241 15-131 (267)
8 cd00190 Tryp_SPc Trypsin-like 98.8 2.7E-09 5.9E-14 98.8 4.1 45 320-364 1-46 (232)
9 PF00089 Trypsin: Trypsin; In 98.8 6.6E-09 1.4E-13 95.5 4.9 45 320-364 1-46 (220)
10 smart00020 Tryp_SPc Trypsin-li 98.7 7.5E-09 1.6E-13 96.1 4.0 48 319-366 1-49 (229)
11 KOG3627|consensus 98.6 6.1E-08 1.3E-12 92.2 4.9 51 316-366 9-61 (256)
12 COG5640 Secreted trypsin-like 98.5 7.4E-08 1.6E-12 93.0 2.6 58 313-370 26-88 (413)
13 COG3591 V8-like Glu-specific e 98.2 2.1E-05 4.5E-10 74.1 12.4 117 118-243 48-173 (251)
14 TIGR02037 degP_htrA_DO peripla 97.6 0.0013 2.9E-08 67.6 13.9 84 128-241 58-142 (428)
15 PF09342 DUF1986: Domain of un 97.0 0.00098 2.1E-08 61.9 5.1 36 327-362 12-47 (267)
16 PRK10898 serine endoprotease; 96.9 0.046 9.9E-07 54.9 16.3 83 128-241 78-161 (353)
17 PF03761 DUF316: Domain of unk 96.9 0.0011 2.4E-08 64.2 4.5 50 314-363 36-89 (282)
18 TIGR02038 protease_degS peripl 96.7 0.087 1.9E-06 52.8 16.7 83 128-241 78-161 (351)
19 PF13365 Trypsin_2: Trypsin-li 96.6 0.0055 1.2E-07 50.4 6.3 23 130-155 1-24 (120)
20 PRK10139 serine endoprotease; 96.5 0.15 3.2E-06 53.0 17.3 83 128-240 90-174 (455)
21 PRK10942 serine endoprotease; 96.4 0.11 2.3E-06 54.4 15.7 83 128-240 111-195 (473)
22 COG3591 V8-like Glu-specific e 94.6 0.029 6.3E-07 53.1 3.3 37 328-364 46-85 (251)
23 PF02395 Peptidase_S6: Immunog 93.3 1.2 2.5E-05 49.2 12.8 66 131-225 68-133 (769)
24 PF13365 Trypsin_2: Trypsin-li 87.1 0.31 6.6E-06 39.8 1.3 19 345-363 1-20 (120)
25 PF00947 Pico_P2A: Picornaviru 80.5 3 6.5E-05 35.2 4.4 33 363-406 91-123 (127)
26 PF05579 Peptidase_S32: Equine 56.2 20 0.00043 34.3 4.7 39 349-390 195-234 (297)
27 PRK14065 exodeoxyribonuclease 53.1 13 0.00029 28.9 2.6 28 2-29 25-52 (86)
28 PF00863 Peptidase_C4: Peptida 52.1 2.1E+02 0.0046 27.0 11.4 41 363-410 152-193 (235)
29 PF00548 Peptidase_C3: 3C cyst 44.1 65 0.0014 28.7 6.0 69 128-221 25-93 (172)
30 KOG1067|consensus 29.6 63 0.0014 34.3 3.9 65 327-404 460-533 (760)
31 TIGR01280 xseB exodeoxyribonuc 26.2 55 0.0012 24.4 2.2 24 2-25 1-24 (67)
32 PRK14068 exodeoxyribonuclease 20.9 79 0.0017 24.3 2.1 31 2-32 6-36 (76)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=2.7e-39 Score=301.00 Aligned_cols=224 Identities=35% Similarity=0.628 Sum_probs=195.7
Q ss_pred ecCCCCCCCCCCCCCcceeeEcc----ccceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcE
Q psy15063 105 KSPHHPARPSSGLTCDYDVAIRQ----DVCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHV 180 (417)
Q Consensus 105 ~~~g~~a~~~~~~~~PW~v~i~~----~~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q 180 (417)
+++|.++.++ +|||+|.+.. +.|+||||+++||||| |||+.+.. ...+.|++|..+...... ..+
T Consensus 1 i~~G~~~~~~---~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTa---AhC~~~~~----~~~~~v~~g~~~~~~~~~-~~~ 69 (232)
T cd00190 1 IVGGSEAKIG---SFPWQVSLQYTGGRHFCGGSLISPRWVLTA---AHCVYSSA----PSNYTVRLGSHDLSSNEG-GGQ 69 (232)
T ss_pred CcCCeECCCC---CCCCEEEEEccCCcEEEEEEEeeCCEEEEC---HHhcCCCC----CccEEEEeCcccccCCCC-ceE
Confidence 3577777776 5999999953 5999999999999999 99997642 467999999988765443 567
Q ss_pred EEeeeEEEeCCCCCCCCCCCCeEEEEecCCcCCCCceeeeeCCCCCCcc-CCCeEEEEEcCccCCCCCCCCccceEEEEE
Q psy15063 181 RRGVRRVLFHSHFHPFVLSNDIALLQLDRPVPLTGTIQPVCLPQKGESF-IGKRGHVVGWGVTSFPMGEPSPTLQKLEVK 259 (417)
Q Consensus 181 ~~~V~~i~iHP~Y~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~-~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~ 259 (417)
.+.|.++++||+|+.....+|||||||++|+.++++++|||||...... .+..+.++|||.+.... ..+..|++..+.
T Consensus 70 ~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~-~~~~~~~~~~~~ 148 (232)
T cd00190 70 VIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGG-PLPDVLQEVNVP 148 (232)
T ss_pred EEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCC-CCCceeeEEEee
Confidence 8889999999999998889999999999999999999999999985333 78999999999987653 467889999999
Q ss_pred EeChhhhhhhhh--ccCCCCeEEecCCC-CCCCccCCCCCcccccccccccccCccccCCcceeecccccccccCCcEEE
Q psy15063 260 VLSNARCSTVIE--ESIGIGMLCAAPDE-TQGTCFVPVSPVGYTKKHLQQFHQGTTYRQPRRRIILGGEADIGEFPWQVA 336 (417)
Q Consensus 260 i~~~~~C~~~~~--~~~~~~~iCa~~~~-~~~~C~g~~~p~~~t~~~~~~wi~~~~~~~~~~~~~~G~~~~~~~~p~~~~ 336 (417)
+++...|...+. ..+.+.+||+.... ..+.|.|
T Consensus 149 ~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~g-------------------------------------------- 184 (232)
T cd00190 149 IVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQG-------------------------------------------- 184 (232)
T ss_pred eECHHHhhhhccCcccCCCceEeeCCCCCCCccccC--------------------------------------------
Confidence 999999998886 47889999999887 7889999
Q ss_pred EccCCeeeeeeEEecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCeEEEecccchhHHhhh
Q psy15063 337 IALDGMFFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPDLYTRVSEYIRWIHVN 412 (417)
Q Consensus 337 ~~~~~~~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~vyt~Vs~~~~WI~~~ 412 (417)
|+||||++..+++++|+||+|+| ..|. ..+.|.+||+|+.|++||+++
T Consensus 185 --------------------------dsGgpl~~~~~~~~~lvGI~s~g-~~c~-~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 185 --------------------------DSGGPLVCNDNGRGVLVGIVSWG-SGCA-RPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred --------------------------CCCCcEEEEeCCEEEEEEEEehh-hccC-CCCCCCEEEEcHHhhHHhhcC
Confidence 99999999989999999999999 6798 558999999999999999874
No 2
>KOG3627|consensus
Probab=100.00 E-value=2.2e-38 Score=302.06 Aligned_cols=230 Identities=33% Similarity=0.599 Sum_probs=194.3
Q ss_pred eEecCCCCCCCCCCCCCcceeeEc-----cccceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCC
Q psy15063 103 YFKSPHHPARPSSGLTCDYDVAIR-----QDVCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNET 177 (417)
Q Consensus 103 ~~~~~g~~a~~~~~~~~PW~v~i~-----~~~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~ 177 (417)
.++++|.++.++ .+||++.+. .++|+|+||+++||||| |||+.+.. .. .+.|++|.++.......
T Consensus 11 ~~i~~g~~~~~~---~~Pw~~~l~~~~~~~~~Cggsli~~~~vlta---aHC~~~~~---~~-~~~V~~G~~~~~~~~~~ 80 (256)
T KOG3627|consen 11 GRIVGGTEAEPG---SFPWQVSLQYGGNGRHLCGGSLISPRWVLTA---AHCVKGAS---AS-LYTVRLGEHDINLSVSE 80 (256)
T ss_pred CCEeCCccCCCC---CCCCEEEEEECCCcceeeeeEEeeCCEEEEC---hhhCCCCC---Cc-ceEEEECcccccccccc
Confidence 478889888888 499999984 35999999999999999 99997642 11 78999998866654221
Q ss_pred Cc--EEEeeeEEEeCCCCCCCCCC-CCeEEEEecCCcCCCCceeeeeCCCCCC---ccCCCeEEEEEcCccCCCCCCCCc
Q psy15063 178 SH--VRRGVRRVLFHSHFHPFVLS-NDIALLQLDRPVPLTGTIQPVCLPQKGE---SFIGKRGHVVGWGVTSFPMGEPSP 251 (417)
Q Consensus 178 ~~--q~~~V~~i~iHP~Y~~~~~~-nDIALLkL~~pv~~s~~v~PicLp~~~~---~~~~~~~~v~GWG~~~~~~~~~~~ 251 (417)
.. ....|.++++||+|+..... ||||||+|.+++.|+++|+|||||.... ...+..|+++|||.+.......+.
T Consensus 81 ~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~ 160 (256)
T KOG3627|consen 81 GEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPD 160 (256)
T ss_pred CchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCc
Confidence 22 45558889999999998888 9999999999999999999999985544 225689999999999876335789
Q ss_pred cceEEEEEEeChhhhhhhhhc--cCCCCeEEecCC-CCCCCccCCCCCcccccccccccccCccccCCcceeeccccccc
Q psy15063 252 TLQKLEVKVLSNARCSTVIEE--SIGIGMLCAAPD-ETQGTCFVPVSPVGYTKKHLQQFHQGTTYRQPRRRIILGGEADI 328 (417)
Q Consensus 252 ~L~~~~v~i~~~~~C~~~~~~--~~~~~~iCa~~~-~~~~~C~g~~~p~~~t~~~~~~wi~~~~~~~~~~~~~~G~~~~~ 328 (417)
.||++++++++...|+..+.. .+++.||||+.. .+.++|+|
T Consensus 161 ~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~G------------------------------------ 204 (256)
T KOG3627|consen 161 TLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQG------------------------------------ 204 (256)
T ss_pred eeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccC------------------------------------
Confidence 999999999999999998876 477789999984 47889999
Q ss_pred ccCCcEEEEccCCeeeeeeEEecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcCCC-ccccCCCCeEEEecccchh
Q psy15063 329 GEFPWQVAIALDGMFFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGVTG-CAIMPSYPDLYTRVSEYIR 407 (417)
Q Consensus 329 ~~~p~~~~~~~~~~~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~~~-C~~~~~~p~vyt~Vs~~~~ 407 (417)
||||||++..+++++++||+||| .+ |+ ..+.|++||+|+.|.+
T Consensus 205 ----------------------------------DSGGPLv~~~~~~~~~~GivS~G-~~~C~-~~~~P~vyt~V~~y~~ 248 (256)
T KOG3627|consen 205 ----------------------------------DSGGPLVCEDNGRWVLVGIVSWG-SGGCG-QPNYPGVYTRVSSYLD 248 (256)
T ss_pred ----------------------------------CCCCeEEEeeCCcEEEEEEEEec-CCCCC-CCCCCeEEeEhHHhHH
Confidence 99999999887689999999999 55 99 6679999999999999
Q ss_pred HHhhhhc
Q psy15063 408 WIHVNAI 414 (417)
Q Consensus 408 WI~~~~~ 414 (417)
||++.+.
T Consensus 249 WI~~~~~ 255 (256)
T KOG3627|consen 249 WIKENIG 255 (256)
T ss_pred HHHHHhc
Confidence 9999875
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=9.5e-36 Score=277.67 Aligned_cols=220 Identities=35% Similarity=0.602 Sum_probs=190.3
Q ss_pred ecCCCCCCCCCCCCCcceeeEccc----cceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcE
Q psy15063 105 KSPHHPARPSSGLTCDYDVAIRQD----VCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHV 180 (417)
Q Consensus 105 ~~~g~~a~~~~~~~~PW~v~i~~~----~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q 180 (417)
+++|.++.+. +|||++.+... .|+||||++++|||| |||+.+.. ...+.|++|.++...... .+
T Consensus 2 ~~~G~~~~~~---~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTa---ahC~~~~~----~~~~~v~~g~~~~~~~~~--~~ 69 (229)
T smart00020 2 IVGGSEANIG---SFPWQVSLQYRGGRHFCGGSLISPRWVLTA---AHCVYGSD----PSNIRVRLGSHDLSSGEE--GQ 69 (229)
T ss_pred ccCCCcCCCC---CCCcEEEEEEcCCCcEEEEEEecCCEEEEC---HHHcCCCC----CcceEEEeCcccCCCCCC--ce
Confidence 5678888777 59999998544 699999999999999 99997642 367999999987765332 27
Q ss_pred EEeeeEEEeCCCCCCCCCCCCeEEEEecCCcCCCCceeeeeCCCCCCcc-CCCeEEEEEcCccCCCCCCCCccceEEEEE
Q psy15063 181 RRGVRRVLFHSHFHPFVLSNDIALLQLDRPVPLTGTIQPVCLPQKGESF-IGKRGHVVGWGVTSFPMGEPSPTLQKLEVK 259 (417)
Q Consensus 181 ~~~V~~i~iHP~Y~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~-~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~ 259 (417)
.+.|.++++||+|+.....+|||||+|++|+.+++.++|||||...... .+..+.++|||.........+..++...+.
T Consensus 70 ~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~ 149 (229)
T smart00020 70 VIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVP 149 (229)
T ss_pred EEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEE
Confidence 7889999999999988889999999999999999999999999874333 789999999999875333467789999999
Q ss_pred EeChhhhhhhhhc--cCCCCeEEecCCC-CCCCccCCCCCcccccccccccccCccccCCcceeecccccccccCCcEEE
Q psy15063 260 VLSNARCSTVIEE--SIGIGMLCAAPDE-TQGTCFVPVSPVGYTKKHLQQFHQGTTYRQPRRRIILGGEADIGEFPWQVA 336 (417)
Q Consensus 260 i~~~~~C~~~~~~--~~~~~~iCa~~~~-~~~~C~g~~~p~~~t~~~~~~wi~~~~~~~~~~~~~~G~~~~~~~~p~~~~ 336 (417)
+++...|...+.. .+.+.++|++... ..+.|.|
T Consensus 150 ~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~g-------------------------------------------- 185 (229)
T smart00020 150 IVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQG-------------------------------------------- 185 (229)
T ss_pred EeCHHHhhhhhccccccCCCcEeecCCCCCCcccCC--------------------------------------------
Confidence 9999999988765 5889999999887 7889999
Q ss_pred EccCCeeeeeeEEecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCeEEEecccchhHH
Q psy15063 337 IALDGMFFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPDLYTRVSEYIRWI 409 (417)
Q Consensus 337 ~~~~~~~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~vyt~Vs~~~~WI 409 (417)
|+||||++..+ +|+|+||+|+| ..|. ..+.|.+|+||+.|++||
T Consensus 186 --------------------------dsG~pl~~~~~-~~~l~Gi~s~g-~~C~-~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 186 --------------------------DSGGPLVCNDG-RWVLVGIVSWG-SGCA-RPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred --------------------------CCCCeeEEECC-CEEEEEEEEEC-CCCC-CCCCCCEEEEeccccccC
Confidence 99999999766 89999999999 6998 778999999999999998
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00 E-value=4.5e-34 Score=263.97 Aligned_cols=215 Identities=34% Similarity=0.629 Sum_probs=184.0
Q ss_pred ecCCCCCCCCCCCCCcceeeEcc----ccceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcE
Q psy15063 105 KSPHHPARPSSGLTCDYDVAIRQ----DVCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHV 180 (417)
Q Consensus 105 ~~~g~~a~~~~~~~~PW~v~i~~----~~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q 180 (417)
|++|.++.++ +|||++.+.. ++|+|+||+++||||| |||+.. ...+.+++|......... ..+
T Consensus 1 i~~g~~~~~~---~~p~~v~i~~~~~~~~C~G~li~~~~vLTa---ahC~~~------~~~~~v~~g~~~~~~~~~-~~~ 67 (220)
T PF00089_consen 1 IVGGDPASPG---EFPWVVSIRYSNGRFFCTGTLISPRWVLTA---AHCVDG------ASDIKVRLGTYSIRNSDG-SEQ 67 (220)
T ss_dssp SBSSEECGTT---SSTTEEEEEETTTEEEEEEEEEETTEEEEE---GGGHTS------GGSEEEEESESBTTSTTT-TSE
T ss_pred CCCCEECCCC---CCCeEEEEeeCCCCeeEeEEeccccccccc---cccccc------cccccccccccccccccc-ccc
Confidence 4567777777 5999999944 4699999999999999 999975 367899999854444333 458
Q ss_pred EEeeeEEEeCCCCCCCCCCCCeEEEEecCCcCCCCceeeeeCCCCCCcc-CCCeEEEEEcCccCCCCCCCCccceEEEEE
Q psy15063 181 RRGVRRVLFHSHFHPFVLSNDIALLQLDRPVPLTGTIQPVCLPQKGESF-IGKRGHVVGWGVTSFPMGEPSPTLQKLEVK 259 (417)
Q Consensus 181 ~~~V~~i~iHP~Y~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~-~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~ 259 (417)
.+.|++++.||+|+.....+|||||+|++|+.+.+.++|+||+...... .+..+.+.|||...... .+..++...+.
T Consensus 68 ~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~--~~~~~~~~~~~ 145 (220)
T PF00089_consen 68 TIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNG--YSSNLQSVTVP 145 (220)
T ss_dssp EEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTS--BTSBEEEEEEE
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc--ccccccccccc
Confidence 8999999999999998889999999999999999999999999954432 78999999999966443 56789999999
Q ss_pred EeChhhhhhhhhccCCCCeEEecCCCCCCCccCCCCCcccccccccccccCccccCCcceeecccccccccCCcEEEEcc
Q psy15063 260 VLSNARCSTVIEESIGIGMLCAAPDETQGTCFVPVSPVGYTKKHLQQFHQGTTYRQPRRRIILGGEADIGEFPWQVAIAL 339 (417)
Q Consensus 260 i~~~~~C~~~~~~~~~~~~iCa~~~~~~~~C~g~~~p~~~t~~~~~~wi~~~~~~~~~~~~~~G~~~~~~~~p~~~~~~~ 339 (417)
+++...|...+...+.+.++|+......+.|.|
T Consensus 146 ~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g----------------------------------------------- 178 (220)
T PF00089_consen 146 VVSRKTCRSSYNDNLTPNMICAGSSGSGDACQG----------------------------------------------- 178 (220)
T ss_dssp EEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTT-----------------------------------------------
T ss_pred ccccccccccccccccccccccccccccccccc-----------------------------------------------
Confidence 999999999866568899999998556789999
Q ss_pred CCeeeeeeEEecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCeEEEecccchhHH
Q psy15063 340 DGMFFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPDLYTRVSEYIRWI 409 (417)
Q Consensus 340 ~~~~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~vyt~Vs~~~~WI 409 (417)
|+||||++. +. +|+||+|++ ..|. ..+.|.+|+||+.|++||
T Consensus 179 -----------------------~sG~pl~~~--~~-~lvGI~s~~-~~c~-~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 179 -----------------------DSGGPLICN--NN-YLVGIVSFG-ENCG-SPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp -----------------------TTTSEEEET--TE-EEEEEEEEE-SSSS-BTTSEEEEEEGGGGHHHH
T ss_pred -----------------------ccccccccc--ee-eecceeeec-CCCC-CCCcCEEEEEHHHhhccC
Confidence 999999994 44 899999999 9999 666799999999999998
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95 E-value=1.5e-27 Score=226.92 Aligned_cols=228 Identities=22% Similarity=0.299 Sum_probs=172.7
Q ss_pred ceeEecCCCCCCCCCCCCCcceeeEcc--------ccceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCC
Q psy15063 101 VTYFKSPHHPARPSSGLTCDYDVAIRQ--------DVCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLT 172 (417)
Q Consensus 101 ~~~~~~~g~~a~~~~~~~~PW~v~i~~--------~~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~ 172 (417)
...+|++|..|..+ +||.+|++.. .+|||+++..|||||| |||+.... .......+|..+..+.+
T Consensus 29 vs~rIigGs~Anag---~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTA---AHC~~~~s-~is~d~~~vv~~l~d~S 101 (413)
T COG5640 29 VSSRIIGGSNANAG---EYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTA---AHCADASS-PISSDVNRVVVDLNDSS 101 (413)
T ss_pred cceeEecCcccccc---cCchHHHHHhhcccccceeEeccceecceEEeee---hhhccCCC-CccccceEEEecccccc
Confidence 34688999988888 6999998832 2999999999999999 99998764 33334455555554443
Q ss_pred CCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEEEecCCcCCCCceeeeeCCCCCCcc-----CCCeEEEEEcCccCCCCC
Q psy15063 173 QLNETSHVRRGVRRVLFHSHFHPFVLSNDIALLQLDRPVPLTGTIQPVCLPQKGESF-----IGKRGHVVGWGVTSFPMG 247 (417)
Q Consensus 173 ~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~-----~~~~~~v~GWG~~~~~~~ 247 (417)
..+...|.+++.|..|.+.++.||||+++|.++...- .+ .|-.-...+.+ ......+.+||.+.....
T Consensus 102 -----q~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~p-r~-ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v 174 (413)
T COG5640 102 -----QAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLP-RV-KITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDV 174 (413)
T ss_pred -----cccCcceEEEeeecccccccccCcceeeccccccccc-hh-heeeccCcccceecccccccccceeeeeeeecCC
Confidence 3566779999999999999999999999999976532 11 11121221111 345566778887764321
Q ss_pred C-C---CccceEEEEEEeChhhhhhhhhc------cCCCCeEEecCCCCCCCccCCCCCcccccccccccccCccccCCc
Q psy15063 248 E-P---SPTLQKLEVKVLSNARCSTVIEE------SIGIGMLCAAPDETQGTCFVPVSPVGYTKKHLQQFHQGTTYRQPR 317 (417)
Q Consensus 248 ~-~---~~~L~~~~v~i~~~~~C~~~~~~------~~~~~~iCa~~~~~~~~C~g~~~p~~~t~~~~~~wi~~~~~~~~~ 317 (417)
+ . ...|+++.+...+-.+|.+.+.. ...-+-||++.. .+++|+|
T Consensus 175 ~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~-~~daCqG------------------------- 228 (413)
T COG5640 175 PRSSPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRP-PKDACQG------------------------- 228 (413)
T ss_pred CCCCCccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCC-CcccccC-------------------------
Confidence 1 1 24799999999999999988752 222334999988 5999999
Q ss_pred ceeecccccccccCCcEEEEccCCeeeeeeEEecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCe
Q psy15063 318 RRIILGGEADIGEFPWQVAIALDGMFFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPD 397 (417)
Q Consensus 318 ~~~~~G~~~~~~~~p~~~~~~~~~~~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~ 397 (417)
|||||++.+.+.-.+++||+|||+.+|+ ....|+
T Consensus 229 ---------------------------------------------DSGGPi~~~g~~G~vQ~GVvSwG~~~Cg-~t~~~g 262 (413)
T COG5640 229 ---------------------------------------------DSGGPIFHKGEEGRVQRGVVSWGDGGCG-GTLIPG 262 (413)
T ss_pred ---------------------------------------------CCCCceEEeCCCccEEEeEEEecCCCCC-CCCcce
Confidence 9999999987555789999999977799 999999
Q ss_pred EEEecccchhHHhhhhc
Q psy15063 398 LYTRVSEYIRWIHVNAI 414 (417)
Q Consensus 398 vyt~Vs~~~~WI~~~~~ 414 (417)
|||+|+.|.+||...|.
T Consensus 263 VyT~vsny~~WI~a~~~ 279 (413)
T COG5640 263 VYTNVSNYQDWIAAMTN 279 (413)
T ss_pred eEEehhHHHHHHHHHhc
Confidence 99999999999998764
No 6
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.52 E-value=5.9e-13 Score=129.02 Aligned_cols=223 Identities=18% Similarity=0.293 Sum_probs=135.7
Q ss_pred cCCCCccccccceeEecCCCCCCCCCCCCCcceeeEccc-------cceEEEeeCCeEEeeccceeeecCCCCC----CC
Q psy15063 90 TYSCGDVSKERVTYFKSPHHPARPSSGLTCDYDVAIRQD-------VCAVRIEFEKVNLARKVGGVCDIDQLRD----TP 158 (417)
Q Consensus 90 ~~~CG~~~~~~~~~~~~~g~~a~~~~~~~~PW~v~i~~~-------~C~GtLIs~~~VLTA~~~A~Cv~~~~~~----~~ 158 (417)
...||.. ..+......+|..+... +.||.+.+... +++|||||+|||||| +||+...... ..
T Consensus 28 l~~CG~~-~~~~~~~~~~g~~~~~~---~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLts---s~~~~~~~~~W~~~~~ 100 (282)
T PF03761_consen 28 LETCGKK-KLPYPSKVFNGTPAESG---EAPWAVSVYTKNHNEGNYFSTGTLISPRHILTS---SHCVMNDKSKWLNGEE 100 (282)
T ss_pred HHhcCCC-CCCCcccccCCcccccC---CCCCEEEEEeccCcccceecceEEeccCeEEEe---eeEEEecccccccCcc
Confidence 3668822 12222222455555544 47888877332 579999999999999 9999743220 00
Q ss_pred -----c--c--ceEEE---eceecC---CCCCCCCcEEEeeeEEEeCCC----CCCCCCCCCeEEEEecCCcCCCCceee
Q psy15063 159 -----V--S--ELVLH---LGDHDL---TQLNETSHVRRGVRRVLFHSH----FHPFVLSNDIALLQLDRPVPLTGTIQP 219 (417)
Q Consensus 159 -----~--~--~~~V~---lG~~~~---~~~~~~~~q~~~V~~i~iHP~----Y~~~~~~nDIALLkL~~pv~~s~~v~P 219 (417)
. . .+.|- +-.... ............|.++++--. .......++++||+|+++ ++....|
T Consensus 101 ~~~~~C~~~~~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~ 178 (282)
T PF03761_consen 101 FDNKKCEGNNNHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSP 178 (282)
T ss_pred cccceeeCCCceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCC
Confidence 0 0 11110 000000 000000223345666655311 123445689999999999 7888999
Q ss_pred eeCCCCCCcc-CCCeEEEEEcCccCCCCCCCCccceEEEEEEeChhhhhhhhhccCCCCeEEecCCCCCCCccCCCCCcc
Q psy15063 220 VCLPQKGESF-IGKRGHVVGWGVTSFPMGEPSPTLQKLEVKVLSNARCSTVIEESIGIGMLCAAPDETQGTCFVPVSPVG 298 (417)
Q Consensus 220 icLp~~~~~~-~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~iCa~~~~~~~~C~g~~~p~~ 298 (417)
+|||...... .++...+.|+ . ....+....+.+.....| ...+|. .+..|.+
T Consensus 179 ~Cl~~~~~~~~~~~~~~~yg~---~-----~~~~~~~~~~~i~~~~~~---------~~~~~~----~~~~~~~------ 231 (282)
T PF03761_consen 179 PCLADSSTNWEKGDEVDVYGF---N-----STGKLKHRKLKITNCTKC---------AYSICT----KQYSCKG------ 231 (282)
T ss_pred EEeCCCccccccCceEEEeec---C-----CCCeEEEEEEEEEEeecc---------ceeEec----ccccCCC------
Confidence 9999876654 6667777776 1 223455555555432221 122333 4577888
Q ss_pred cccccccccccCccccCCcceeecccccccccCCcEEEEccCCeeeeeeEEecCceeeeccccccCCCceEEeeCCeEEE
Q psy15063 299 YTKKHLQQFHQGTTYRQPRRRIILGGEADIGEFPWQVAIALDGMFFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVL 378 (417)
Q Consensus 299 ~t~~~~~~wi~~~~~~~~~~~~~~G~~~~~~~~p~~~~~~~~~~~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l 378 (417)
|+||||+...+|+++|
T Consensus 232 ----------------------------------------------------------------d~Gg~lv~~~~gr~tl 247 (282)
T PF03761_consen 232 ----------------------------------------------------------------DRGGPLVKNINGRWTL 247 (282)
T ss_pred ----------------------------------------------------------------CccCeEEEEECCCEEE
Confidence 9999999999999999
Q ss_pred EeEEEEcCCCccccCCCCeEEEecccchhHHhhhhcc
Q psy15063 379 AGIVSYGVTGCAIMPSYPDLYTRVSEYIRWIHVNAIV 415 (417)
Q Consensus 379 ~GI~S~g~~~C~~~~~~p~vyt~Vs~~~~WI~~~~~~ 415 (417)
+||.+.+...|. . ....|.+|..|.+=|-+.+++
T Consensus 248 IGv~~~~~~~~~--~-~~~~f~~v~~~~~~IC~ltGI 281 (282)
T PF03761_consen 248 IGVGASGNYECN--K-NNSYFFNVSWYQDEICELTGI 281 (282)
T ss_pred EEEEccCCCccc--c-cccEEEEHHHhhhhhccceec
Confidence 999998844554 2 267899999998877666553
No 7
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.23 E-value=2.4e-10 Score=104.88 Aligned_cols=112 Identities=16% Similarity=0.136 Sum_probs=86.9
Q ss_pred CCcceeeEccc---cceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecC-CCCCCCCcEEEeeeEEEeCCCC
Q psy15063 118 TCDYDVAIRQD---VCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDL-TQLNETSHVRRGVRRVLFHSHF 193 (417)
Q Consensus 118 ~~PW~v~i~~~---~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~-~~~~~~~~q~~~V~~i~iHP~Y 193 (417)
.|||.+.|+.. .|.|+||.+.|||++ ..|+.+-.. ...-+.|.+|.... .....+.+|.++|..+..=|
T Consensus 15 ~WPWlA~IYvdG~~~CsgvLlD~~WlLvs---ssCl~~I~L--~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~-- 87 (267)
T PF09342_consen 15 HWPWLADIYVDGRYWCSGVLLDPHWLLVS---SSCLRGISL--SHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVP-- 87 (267)
T ss_pred cCcceeeEEEcCeEEEEEEEeccceEEEe---ccccCCccc--ccceEEEEecCcceecccCCChheEEEeeeeeecc--
Confidence 59999999544 999999999999999 999965221 23567889997652 22244478888888876543
Q ss_pred CCCCCCCCeEEEEecCCcCCCCceeeeeCCCCCCcc-CCCeEEEEEcCc
Q psy15063 194 HPFVLSNDIALLQLDRPVPLTGTIQPVCLPQKGESF-IGKRGHVVGWGV 241 (417)
Q Consensus 194 ~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~-~~~~~~v~GWG~ 241 (417)
..+++||.|++|+.|+.+|+|..||+..... ....|...|-..
T Consensus 88 -----~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 88 -----ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred -----ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 3689999999999999999999999844433 566899888554
No 8
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.82 E-value=2.7e-09 Score=98.78 Aligned_cols=45 Identities=53% Similarity=1.119 Sum_probs=41.7
Q ss_pred eecccccccccCCcEEEEccC-CeeeeeeEEecCceeeeccccccC
Q psy15063 320 IILGGEADIGEFPWQVAIALD-GMFFCGGALLNEHFVLTAAHCIMT 364 (417)
Q Consensus 320 ~~~G~~~~~~~~p~~~~~~~~-~~~~Cggsli~~~~vLtaAhC~ds 364 (417)
+++|..+..++|||++.++.. ..+.|+||||+++||||||||+..
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~ 46 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYS 46 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCC
Confidence 478999999999999999887 789999999999999999999954
No 9
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=98.77 E-value=6.6e-09 Score=95.53 Aligned_cols=45 Identities=49% Similarity=0.947 Sum_probs=42.6
Q ss_pred eecccccccccCCcEEEEccCC-eeeeeeEEecCceeeeccccccC
Q psy15063 320 IILGGEADIGEFPWQVAIALDG-MFFCGGALLNEHFVLTAAHCIMT 364 (417)
Q Consensus 320 ~~~G~~~~~~~~p~~~~~~~~~-~~~Cggsli~~~~vLtaAhC~ds 364 (417)
|.+|.++..++|||++.++... .++|.|+||+++||||||||++.
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~ 46 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDG 46 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTS
T ss_pred CCCCEECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccc
Confidence 5789999999999999999887 99999999999999999999977
No 10
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=98.73 E-value=7.5e-09 Score=96.09 Aligned_cols=48 Identities=52% Similarity=1.077 Sum_probs=43.4
Q ss_pred eeecccccccccCCcEEEEccCC-eeeeeeEEecCceeeeccccccCCC
Q psy15063 319 RIILGGEADIGEFPWQVAIALDG-MFFCGGALLNEHFVLTAAHCIMTGG 366 (417)
Q Consensus 319 ~~~~G~~~~~~~~p~~~~~~~~~-~~~Cggsli~~~~vLtaAhC~dsGg 366 (417)
|+++|..+..++|||++.++... .+.|.||||++++|||||||+....
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~ 49 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSD 49 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCC
Confidence 57899999999999999998876 8899999999999999999996543
No 11
>KOG3627|consensus
Probab=98.56 E-value=6.1e-08 Score=92.18 Aligned_cols=51 Identities=47% Similarity=0.974 Sum_probs=46.1
Q ss_pred CcceeecccccccccCCcEEEEccCC--eeeeeeEEecCceeeeccccccCCC
Q psy15063 316 PRRRIILGGEADIGEFPWQVAIALDG--MFFCGGALLNEHFVLTAAHCIMTGG 366 (417)
Q Consensus 316 ~~~~~~~G~~~~~~~~p~~~~~~~~~--~~~Cggsli~~~~vLtaAhC~dsGg 366 (417)
...|+++|.++..++|||+++++... .+.|||+||+++||||||||+....
T Consensus 9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~ 61 (256)
T KOG3627|consen 9 PEGRIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGAS 61 (256)
T ss_pred ccCCEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCC
Confidence 35799999999999999999998876 7899999999999999999997654
No 12
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.46 E-value=7.4e-08 Score=93.03 Aligned_cols=58 Identities=40% Similarity=0.650 Sum_probs=48.9
Q ss_pred ccCCcceeecccccccccCCcEEEEccC-----CeeeeeeEEecCceeeeccccccCCCceEE
Q psy15063 313 YRQPRRRIILGGEADIGEFPWQVAIALD-----GMFFCGGALLNEHFVLTAAHCIMTGGPLTF 370 (417)
Q Consensus 313 ~~~~~~~~~~G~~~~~~~~p~~~~~~~~-----~~~~Cggsli~~~~vLtaAhC~dsGgPL~~ 370 (417)
..+-..||++|..|+.++||++|++... .-.+||||+++.|||||||||+|.-+|.-.
T Consensus 26 ~devs~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~ 88 (413)
T COG5640 26 ADEVSSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISS 88 (413)
T ss_pred ccccceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccc
Confidence 3445789999999999999999998442 235999999999999999999988776554
No 13
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.20 E-value=2.1e-05 Score=74.08 Aligned_cols=117 Identities=11% Similarity=0.064 Sum_probs=65.7
Q ss_pred CCcceeeE------ccccceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEe-ceecCCCCCCCCcEEEeeeEEEeC
Q psy15063 118 TCDYDVAI------RQDVCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHL-GDHDLTQLNETSHVRRGVRRVLFH 190 (417)
Q Consensus 118 ~~PW~v~i------~~~~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~l-G~~~~~~~~~~~~q~~~V~~i~iH 190 (417)
.|||..-. ...-|+++||+++.|||| +||+.+..... ..+.+.. |...-.. +.-.+...++.+.
T Consensus 48 ~~Py~av~~~~~~tG~~~~~~~lI~pntvLTa---~Hc~~s~~~G~--~~~~~~p~g~~~~~~----~~~~~~~~~~~~~ 118 (251)
T COG3591 48 QFPYSAVVQFEAATGRLCTAATLIGPNTVLTA---GHCIYSPDYGE--DDIAAAPPGVNSDGG----PFYGITKIEIRVY 118 (251)
T ss_pred CCCcceeEEeecCCCcceeeEEEEcCceEEEe---eeEEecCCCCh--hhhhhcCCcccCCCC----CCCceeeEEEEec
Confidence 68887765 222566699999999999 99998764321 3333333 3222111 1112222223224
Q ss_pred CC--CCCCCCCCCeEEEEecCCcCCCCceeeeeCCCCCCccCCCeEEEEEcCccC
Q psy15063 191 SH--FHPFVLSNDIALLQLDRPVPLTGTIQPVCLPQKGESFIGKRGHVVGWGVTS 243 (417)
Q Consensus 191 P~--Y~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GWG~~~ 243 (417)
|. |..+....|+..+.|+...++.+.+...-++.......++...++|+-...
T Consensus 119 ~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk 173 (251)
T COG3591 119 PGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDK 173 (251)
T ss_pred CCceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCC
Confidence 43 344444567777777755555565555555555444456668889976543
No 14
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.56 E-value=0.0013 Score=67.62 Aligned_cols=84 Identities=17% Similarity=0.112 Sum_probs=58.1
Q ss_pred ccceEEEeeCC-eEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEEE
Q psy15063 128 DVCAVRIEFEK-VNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALLQ 206 (417)
Q Consensus 128 ~~C~GtLIs~~-~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALLk 206 (417)
..++|.+|+++ +|||+ +|++.+. ..+.|.+.. ...+..+-+..+| ..||||||
T Consensus 58 ~~GSGfii~~~G~IlTn---~Hvv~~~------~~i~V~~~~----------~~~~~a~vv~~d~-------~~DlAllk 111 (428)
T TIGR02037 58 GLGSGVIISADGYILTN---NHVVDGA------DEITVTLSD----------GREFKAKLVGKDP-------RTDIAVLK 111 (428)
T ss_pred ceeeEEEECCCCEEEEc---HHHcCCC------CeEEEEeCC----------CCEEEEEEEEecC-------CCCEEEEE
Confidence 48999999987 99999 9999753 566676542 1223333333333 36999999
Q ss_pred ecCCcCCCCceeeeeCCCCCCccCCCeEEEEEcCc
Q psy15063 207 LDRPVPLTGTIQPVCLPQKGESFIGKRGHVVGWGV 241 (417)
Q Consensus 207 L~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GWG~ 241 (417)
++.+ ..+.++.|.+......|+.+++.|+..
T Consensus 112 v~~~----~~~~~~~l~~~~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 112 IDAK----KNLPVIKLGDSDKLRVGDWVLAIGNPF 142 (428)
T ss_pred ecCC----CCceEEEccCCCCCCCCCEEEEEECCC
Confidence 9865 345677776554433899999999864
No 15
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment, which induces dorsoventral polarity of the Drosophila embryo.
Probab=97.01 E-value=0.00098 Score=61.90 Aligned_cols=36 Identities=33% Similarity=0.933 Sum_probs=34.1
Q ss_pred ccccCCcEEEEccCCeeeeeeEEecCceeeeccccc
Q psy15063 327 DIGEFPWQVAIALDGMFFCGGALLNEHFVLTAAHCI 362 (417)
Q Consensus 327 ~~~~~p~~~~~~~~~~~~Cggsli~~~~vLtaAhC~ 362 (417)
+...|||.|.|+-+|.+.|.|.||+.+|||++.+|+
T Consensus 12 e~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl 47 (267)
T PF09342_consen 12 EDYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCL 47 (267)
T ss_pred ccccCcceeeEEEcCeEEEEEEEeccceEEEecccc
Confidence 456799999999999999999999999999999998
No 16
>PRK10898 serine endoprotease; Provisional
Probab=96.88 E-value=0.046 Score=54.86 Aligned_cols=83 Identities=17% Similarity=0.102 Sum_probs=52.2
Q ss_pred ccceEEEeeCC-eEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEEE
Q psy15063 128 DVCAVRIEFEK-VNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALLQ 206 (417)
Q Consensus 128 ~~C~GtLIs~~-~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALLk 206 (417)
..-+|.+|+++ +|||+ +|=+.+. ..+.|.+.. ...+...-+...| .+||||||
T Consensus 78 ~~GSGfvi~~~G~IlTn---~HVv~~a------~~i~V~~~d----------g~~~~a~vv~~d~-------~~DlAvl~ 131 (353)
T PRK10898 78 TLGSGVIMDQRGYILTN---KHVINDA------DQIIVALQD----------GRVFEALLVGSDS-------LTDLAVLK 131 (353)
T ss_pred ceeeEEEEeCCeEEEec---ccEeCCC------CEEEEEeCC----------CCEEEEEEEEEcC-------CCCEEEEE
Confidence 46899999976 99999 9988642 556676532 1223333233333 37999999
Q ss_pred ecCCcCCCCceeeeeCCCCCCccCCCeEEEEEcCc
Q psy15063 207 LDRPVPLTGTIQPVCLPQKGESFIGKRGHVVGWGV 241 (417)
Q Consensus 207 L~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GWG~ 241 (417)
++.+ . ..++-|........|+.+.+.|+..
T Consensus 132 v~~~-~----l~~~~l~~~~~~~~G~~V~aiG~P~ 161 (353)
T PRK10898 132 INAT-N----LPVIPINPKRVPHIGDVVLAIGNPY 161 (353)
T ss_pred EcCC-C----CCeeeccCcCcCCCCCEEEEEeCCC
Confidence 9854 1 2334443332222789999999753
No 17
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.88 E-value=0.0011 Score=64.16 Aligned_cols=50 Identities=30% Similarity=0.670 Sum_probs=41.2
Q ss_pred cCCcceeecccccccccCCcEEEEccCC----eeeeeeEEecCceeeecccccc
Q psy15063 314 RQPRRRIILGGEADIGEFPWQVAIALDG----MFFCGGALLNEHFVLTAAHCIM 363 (417)
Q Consensus 314 ~~~~~~~~~G~~~~~~~~p~~~~~~~~~----~~~Cggsli~~~~vLtaAhC~d 363 (417)
.+...+..+|..+...+.||.+.+...+ ..++.||+||+|||||++||+.
T Consensus 36 ~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~ 89 (282)
T PF03761_consen 36 LPYPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVM 89 (282)
T ss_pred CCCcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEE
Confidence 3344556888899999999999996643 3678999999999999999994
No 18
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=96.70 E-value=0.087 Score=52.82 Aligned_cols=83 Identities=13% Similarity=0.080 Sum_probs=53.1
Q ss_pred ccceEEEeeCC-eEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEEE
Q psy15063 128 DVCAVRIEFEK-VNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALLQ 206 (417)
Q Consensus 128 ~~C~GtLIs~~-~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALLk 206 (417)
...+|.+|+++ +|||+ +|.+.+. ..+.|.+.. ...+..+-+..+| ..||||||
T Consensus 78 ~~GSG~vi~~~G~IlTn---~HVV~~~------~~i~V~~~d----------g~~~~a~vv~~d~-------~~DlAvlk 131 (351)
T TIGR02038 78 GLGSGVIMSKEGYILTN---YHVIKKA------DQIVVALQD----------GRKFEAELVGSDP-------LTDLAVLK 131 (351)
T ss_pred ceEEEEEEeCCeEEEec---ccEeCCC------CEEEEEECC----------CCEEEEEEEEecC-------CCCEEEEE
Confidence 36899999977 99999 9999643 556666532 1223333333333 47999999
Q ss_pred ecCCcCCCCceeeeeCCCCCCccCCCeEEEEEcCc
Q psy15063 207 LDRPVPLTGTIQPVCLPQKGESFIGKRGHVVGWGV 241 (417)
Q Consensus 207 L~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GWG~ 241 (417)
++.+- +.++.|........|+.+++.|+..
T Consensus 132 v~~~~-----~~~~~l~~s~~~~~G~~V~aiG~P~ 161 (351)
T TIGR02038 132 IEGDN-----LPTIPVNLDRPPHVGDVVLAIGNPY 161 (351)
T ss_pred ecCCC-----CceEeccCcCccCCCCEEEEEeCCC
Confidence 98642 2344443332222899999999864
No 19
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=96.61 E-value=0.0055 Score=50.45 Aligned_cols=23 Identities=9% Similarity=-0.305 Sum_probs=20.1
Q ss_pred ceEEEeeCC-eEEeeccceeeecCCCC
Q psy15063 130 CAVRIEFEK-VNLARKVGGVCDIDQLR 155 (417)
Q Consensus 130 C~GtLIs~~-~VLTA~~~A~Cv~~~~~ 155 (417)
|+|.+|.++ +|||+ +||+.....
T Consensus 1 GTGf~i~~~g~ilT~---~Hvv~~~~~ 24 (120)
T PF13365_consen 1 GTGFLIGPDGYILTA---AHVVEDWND 24 (120)
T ss_dssp EEEEEEETTTEEEEE---HHHHTCCTT
T ss_pred CEEEEEcCCceEEEc---hhheecccc
Confidence 689999999 99999 999976533
No 20
>PRK10139 serine endoprotease; Provisional
Probab=96.50 E-value=0.15 Score=53.01 Aligned_cols=83 Identities=17% Similarity=0.176 Sum_probs=55.2
Q ss_pred ccceEEEeeC--CeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEE
Q psy15063 128 DVCAVRIEFE--KVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALL 205 (417)
Q Consensus 128 ~~C~GtLIs~--~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALL 205 (417)
...+|.+|++ -+|||. +|.+.+. ..+.|.+.. ...+..+-+...| ..|||||
T Consensus 90 ~~GSG~ii~~~~g~IlTn---~HVv~~a------~~i~V~~~d----------g~~~~a~vvg~D~-------~~DlAvl 143 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTN---NHVINQA------QKISIQLND----------GREFDAKLIGSDD-------QSDIALL 143 (455)
T ss_pred ceEEEEEEECCCCEEEeC---hHHhCCC------CEEEEEECC----------CCEEEEEEEEEcC-------CCCEEEE
Confidence 4789999984 599999 9999653 667777642 1233333333332 4799999
Q ss_pred EecCCcCCCCceeeeeCCCCCCccCCCeEEEEEcC
Q psy15063 206 QLDRPVPLTGTIQPVCLPQKGESFIGKRGHVVGWG 240 (417)
Q Consensus 206 kL~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GWG 240 (417)
|++.+- ...++.|.+......|+.+...|..
T Consensus 144 kv~~~~----~l~~~~lg~s~~~~~G~~V~aiG~P 174 (455)
T PRK10139 144 QIQNPS----KLTQIAIADSDKLRVGDFAVAVGNP 174 (455)
T ss_pred EecCCC----CCceeEecCccccCCCCEEEEEecC
Confidence 998542 2346667554332279999999874
No 21
>PRK10942 serine endoprotease; Provisional
Probab=96.42 E-value=0.11 Score=54.35 Aligned_cols=83 Identities=16% Similarity=0.151 Sum_probs=53.6
Q ss_pred ccceEEEeeC--CeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEE
Q psy15063 128 DVCAVRIEFE--KVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALL 205 (417)
Q Consensus 128 ~~C~GtLIs~--~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALL 205 (417)
...+|.+|+. -+|||. +|.+.+. ..+.|.+... ..+..+-+..+| ..|||||
T Consensus 111 ~~GSG~ii~~~~G~IlTn---~HVv~~a------~~i~V~~~dg----------~~~~a~vv~~D~-------~~DlAvl 164 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTN---NHVVDNA------TKIKVQLSDG----------RKFDAKVVGKDP-------RSDIALI 164 (473)
T ss_pred ceEEEEEEECCCCEEEeC---hhhcCCC------CEEEEEECCC----------CEEEEEEEEecC-------CCCEEEE
Confidence 4789999985 499999 9998643 6677776421 223333333333 4799999
Q ss_pred EecCCcCCCCceeeeeCCCCCCccCCCeEEEEEcC
Q psy15063 206 QLDRPVPLTGTIQPVCLPQKGESFIGKRGHVVGWG 240 (417)
Q Consensus 206 kL~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GWG 240 (417)
|++.+-. ..++.|-+......|+.+...|.-
T Consensus 165 ki~~~~~----l~~~~lg~s~~l~~G~~V~aiG~P 195 (473)
T PRK10942 165 QLQNPKN----LTAIKMADSDALRVGDYTVAIGNP 195 (473)
T ss_pred EecCCCC----CceeEecCccccCCCCEEEEEcCC
Confidence 9975332 335666544332378888888854
No 22
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=94.61 E-value=0.029 Score=53.05 Aligned_cols=37 Identities=30% Similarity=0.653 Sum_probs=30.0
Q ss_pred cccCCcEEEEcc---CCeeeeeeEEecCceeeeccccccC
Q psy15063 328 IGEFPWQVAIAL---DGMFFCGGALLNEHFVLTAAHCIMT 364 (417)
Q Consensus 328 ~~~~p~~~~~~~---~~~~~Cggsli~~~~vLtaAhC~ds 364 (417)
...|||.+.... .+.+-|.++||+++-||||+||..+
T Consensus 46 t~~~Py~av~~~~~~tG~~~~~~~lI~pntvLTa~Hc~~s 85 (251)
T COG3591 46 TTQFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYS 85 (251)
T ss_pred CCCCCcceeEEeecCCCcceeeEEEEcCceEEEeeeEEec
Confidence 457999998855 3456677799999999999999944
No 23
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=93.26 E-value=1.2 Score=49.19 Aligned_cols=66 Identities=8% Similarity=0.013 Sum_probs=37.8
Q ss_pred eEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEEEecCC
Q psy15063 131 AVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALLQLDRP 210 (417)
Q Consensus 131 ~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALLkL~~p 210 (417)
..|||+|++|+|+ +|=..+ .-.|.+|.... ..+.+..--.|+. .|+.+-||.+=
T Consensus 68 ~aTLigpqYiVSV---~HN~~g--------y~~v~FG~~g~--------~~Y~iV~RNn~~~-------~Df~~pRLnK~ 121 (769)
T PF02395_consen 68 VATLIGPQYIVSV---KHNGKG--------YNSVSFGNEGQ--------NTYKIVDRNNYPS-------GDFHMPRLNKF 121 (769)
T ss_dssp S-EEEETTEEEBE---TTG-TS--------CCEECESCSST--------CEEEEEEEEBETT-------STEBEEEESS-
T ss_pred eEEEecCCeEEEE---EccCCC--------cCceeecccCC--------ceEEEEEccCCCC-------cccceeecCce
Confidence 4899999999999 774421 23577776432 3444444444544 69999999987
Q ss_pred cCCCCceeeeeCCCC
Q psy15063 211 VPLTGTIQPVCLPQK 225 (417)
Q Consensus 211 v~~s~~v~PicLp~~ 225 (417)
|+ -+.|+-+...
T Consensus 122 VT---EvaP~~~t~~ 133 (769)
T PF02395_consen 122 VT---EVAPAEMTTA 133 (769)
T ss_dssp -----SS----BBSS
T ss_pred EE---EEeccccccc
Confidence 76 3667666544
No 24
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=87.07 E-value=0.31 Score=39.79 Aligned_cols=19 Identities=37% Similarity=0.613 Sum_probs=17.7
Q ss_pred eeeEEecCc-eeeecccccc
Q psy15063 345 CGGALLNEH-FVLTAAHCIM 363 (417)
Q Consensus 345 Cggsli~~~-~vLtaAhC~d 363 (417)
|.|.+|+++ +|||||||+.
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~ 20 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVE 20 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHT
T ss_pred CEEEEEcCCceEEEchhhee
Confidence 578999999 9999999996
No 25
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=80.45 E-value=3 Score=35.20 Aligned_cols=33 Identities=21% Similarity=0.372 Sum_probs=26.2
Q ss_pred cCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCeEEEecccch
Q psy15063 363 MTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPDLYTRVSEYI 406 (417)
Q Consensus 363 dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~vyt~Vs~~~ 406 (417)
|-||+|.|+.+ ++||++.| .+.-..|++|..+.
T Consensus 91 dCGg~L~C~HG----ViGi~Tag-------g~g~VaF~dir~~~ 123 (127)
T PF00947_consen 91 DCGGILRCKHG----VIGIVTAG-------GEGHVAFADIRDLL 123 (127)
T ss_dssp -TCSEEEETTC----EEEEEEEE-------ETTEEEEEECCCGS
T ss_pred CCCceeEeCCC----eEEEEEeC-------CCceEEEEechhhh
Confidence 68999999755 99999998 34557899998763
No 26
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=56.16 E-value=20 Score=34.34 Aligned_cols=39 Identities=21% Similarity=0.232 Sum_probs=24.8
Q ss_pred EecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcC-CCcc
Q psy15063 349 LLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGV-TGCA 390 (417)
Q Consensus 349 li~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~-~~C~ 390 (417)
+|....-+-=-||=|||+|++.. +|. ++||.+-.. .+|+
T Consensus 195 ~ig~~~~~~fT~~GDSGSPVVt~-dg~--liGVHTGSn~~G~g 234 (297)
T PF05579_consen 195 FIGGGGAVCFTGPGDSGSPVVTE-DGD--LIGVHTGSNKRGSG 234 (297)
T ss_dssp EEETTEEEESS-GGCTT-EEEET-TC---EEEEEEEEETTTEE
T ss_pred eecCceEEEEcCCCCCCCccCcC-CCC--EEEEEecCCCcCce
Confidence 44455555555677999999987 555 999997653 4554
No 27
>PRK14065 exodeoxyribonuclease VII small subunit; Provisional
Probab=53.06 E-value=13 Score=28.91 Aligned_cols=28 Identities=25% Similarity=0.491 Sum_probs=22.6
Q ss_pred CChhHHHHHHHHHHhhcCCCccccccCC
Q psy15063 2 LSFEDKVKQIKEILAQLSPSDYSEDNGS 29 (417)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29 (417)
.|||++++.+++||.+|.+=+-|-.-++
T Consensus 25 ~sFE~klerakeiLe~LndpeisL~eSv 52 (86)
T PRK14065 25 KSFEEHVHSLEQAIDRLNDPNLSLKDGM 52 (86)
T ss_pred ccHHHHHHHHHHHHHHhcCCCCCHHHHH
Confidence 4899999999999999998776543333
No 28
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=52.05 E-value=2.1e+02 Score=26.98 Aligned_cols=41 Identities=24% Similarity=0.372 Sum_probs=25.0
Q ss_pred cCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCeEEEeccc-chhHHh
Q psy15063 363 MTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPDLYTRVSE-YIRWIH 410 (417)
Q Consensus 363 dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~vyt~Vs~-~~~WI~ 410 (417)
|=|.||+...+|. ++||-|.+.. ...-.+|+.+.. |.+.+.
T Consensus 152 ~CG~PlVs~~Dg~--IVGiHsl~~~-----~~~~N~F~~f~~~f~~~~l 193 (235)
T PF00863_consen 152 DCGLPLVSTKDGK--IVGIHSLTSN-----TSSRNYFTPFPDDFEEFYL 193 (235)
T ss_dssp -TT-EEEETTT----EEEEEEEEET-----TTSSEEEEE--TTHHHHHC
T ss_pred ccCCcEEEcCCCc--EEEEEcCccC-----CCCeEEEEcCCHHHHHHHh
Confidence 5699999998888 9999998732 223458888775 555443
No 29
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=44.08 E-value=65 Score=28.74 Aligned_cols=69 Identities=12% Similarity=0.026 Sum_probs=38.5
Q ss_pred ccceEEEeeCCeEEeeccceeeecCCCCCCCccceEEEeceecCCCCCCCCcEEEeeeEEEeCCCCCCCCCCCCeEEEEe
Q psy15063 128 DVCAVRIEFEKVNLARKVGGVCDIDQLRDTPVSELVLHLGDHDLTQLNETSHVRRGVRRVLFHSHFHPFVLSNDIALLQL 207 (417)
Q Consensus 128 ~~C~GtLIs~~~VLTA~~~A~Cv~~~~~~~~~~~~~V~lG~~~~~~~~~~~~q~~~V~~i~iHP~Y~~~~~~nDIALLkL 207 (417)
..|.+..|..+|.|-- .|.- ....+.++. ..+.+...+.. .+......||++++|
T Consensus 25 ~t~l~~gi~~~~~lvp---~H~~---------~~~~i~i~g-----------~~~~~~d~~~l--v~~~~~~~Dl~~v~l 79 (172)
T PF00548_consen 25 FTMLALGIYDRYFLVP---THEE---------PEDTIYIDG-----------VEYKVDDSVVL--VDRDGVDTDLTLVKL 79 (172)
T ss_dssp EEEEEEEEEBTEEEEE---GGGG---------GCSEEEETT-----------EEEEEEEEEEE--EETTSSEEEEEEEEE
T ss_pred EEEecceEeeeEEEEE---CcCC---------CcEEEEECC-----------EEEEeeeeEEE--ecCCCcceeEEEEEc
Confidence 4566778999999988 7721 223444432 22333332221 122233569999999
Q ss_pred cCCcCCCCceeeee
Q psy15063 208 DRPVPLTGTIQPVC 221 (417)
Q Consensus 208 ~~pv~~s~~v~Pic 221 (417)
++.-.|-+..+-++
T Consensus 80 ~~~~kfrDIrk~~~ 93 (172)
T PF00548_consen 80 PRNPKFRDIRKFFP 93 (172)
T ss_dssp ESSS-B--GGGGSB
T ss_pred cCCcccCchhhhhc
Confidence 99888877665555
No 30
>KOG1067|consensus
Probab=29.63 E-value=63 Score=34.26 Aligned_cols=65 Identities=20% Similarity=0.295 Sum_probs=43.8
Q ss_pred ccccCCcEEEEcc-----CCe----eeeeeEEecCceeeeccccccCCCceEEeeCCeEEEEeEEEEcCCCccccCCCCe
Q psy15063 327 DIGEFPWQVAIAL-----DGM----FFCGGALLNEHFVLTAAHCIMTGGPLTFEQDGYHVLAGIVSYGVTGCAIMPSYPD 397 (417)
Q Consensus 327 ~~~~~p~~~~~~~-----~~~----~~Cggsli~~~~vLtaAhC~dsGgPL~~~~~g~~~l~GI~S~g~~~C~~~~~~p~ 397 (417)
.+.+|||.+.+-- +++ -.|||||-=- |+|-|+-+..-|. -+|++.-....-+ .-..|-
T Consensus 460 lP~dfPftIRv~SeVleSnGSsSMASvCGGslALm----------DaGvPv~a~vAGv--aiGlvt~td~e~g-~i~dyr 526 (760)
T KOG1067|consen 460 LPEDFPFTIRVTSEVLESNGSSSMASVCGGSLALM----------DAGVPVSAHVAGV--AIGLVTKTDPEKG-EIEDYR 526 (760)
T ss_pred CcccCceEEEEeeeeeecCCcchHHhhhcchhhhh----------hcCCcccccccee--EEEeEeccCcccC-Ccccce
Confidence 4689999998822 333 3999987554 9999999977665 8888865422333 344556
Q ss_pred EEEeccc
Q psy15063 398 LYTRVSE 404 (417)
Q Consensus 398 vyt~Vs~ 404 (417)
+-|+|--
T Consensus 527 iltDIlG 533 (760)
T KOG1067|consen 527 ILTDILG 533 (760)
T ss_pred eehhhcc
Confidence 6666543
No 31
>TIGR01280 xseB exodeoxyribonuclease VII, small subunit. This protein is the small subunit for exodeoxyribonuclease VII. Exodeoxyribonuclease VII is made of a complex of four small subunits to one large subunit. The complex degrades single-stranded DNA into large acid-insoluble oligonucleotides. These nucleotides are then degraded further into acid-soluble oligonucleotides.
Probab=26.23 E-value=55 Score=24.40 Aligned_cols=24 Identities=29% Similarity=0.605 Sum_probs=20.3
Q ss_pred CChhHHHHHHHHHHhhcCCCcccc
Q psy15063 2 LSFEDKVKQIKEILAQLSPSDYSE 25 (417)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~~~ 25 (417)
+|||+.++++.+|+.+|..=+-+-
T Consensus 1 ~sfEe~l~~Le~Iv~~LE~~~l~L 24 (67)
T TIGR01280 1 LSFEEALSELEQIVQKLESGDLAL 24 (67)
T ss_pred CCHHHHHHHHHHHHHHHHCCCCCH
Confidence 689999999999999998655443
No 32
>PRK14068 exodeoxyribonuclease VII small subunit; Provisional
Probab=20.89 E-value=79 Score=24.26 Aligned_cols=31 Identities=26% Similarity=0.459 Sum_probs=23.2
Q ss_pred CChhHHHHHHHHHHhhcCCCccccccCCCCc
Q psy15063 2 LSFEDKVKQIKEILAQLSPSDYSEDNGSDDY 32 (417)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 32 (417)
+|||..++++.+|+.+|..=|-+-.-.+.-|
T Consensus 6 ~sfEeal~~Le~IV~~LE~gdl~Leesl~ly 36 (76)
T PRK14068 6 QSFEEMMQELEQIVQKLDNETVSLEESLDLY 36 (76)
T ss_pred cCHHHHHHHHHHHHHHHHcCCCCHHHHHHHH
Confidence 4899999999999999987665444333333
Done!