Query psy2950
Match_columns 406
No_of_seqs 246 out of 2346
Neff 9.3
Searched_HMMs 46136
Date Fri Aug 16 21:42:56 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy2950.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/2950hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 6.1E-34 1.3E-38 258.0 14.6 191 7-215 1-232 (232)
2 KOG3627|consensus 100.0 6.1E-32 1.3E-36 249.4 16.9 196 5-217 11-255 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 5.5E-31 1.2E-35 238.5 15.5 188 6-212 1-229 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 7.1E-29 1.5E-33 223.0 13.0 181 7-212 1-220 (220)
5 KOG3627|consensus 100.0 9.8E-28 2.1E-32 221.3 17.2 164 222-389 85-255 (256)
6 cd00190 Tryp_SPc Trypsin-like 100.0 1.6E-27 3.4E-32 215.9 17.9 197 178-387 33-232 (232)
7 smart00020 Tryp_SPc Trypsin-li 99.9 3.6E-24 7.8E-29 193.8 18.1 158 222-384 69-229 (229)
8 PF00089 Trypsin: Trypsin; In 99.9 6.4E-23 1.4E-27 184.2 16.4 186 178-384 33-220 (220)
9 COG5640 Secreted trypsin-like 99.9 3.8E-23 8.2E-28 186.9 13.6 230 4-249 30-323 (413)
10 COG5640 Secreted trypsin-like 99.8 3.1E-18 6.8E-23 155.3 17.1 205 178-397 69-287 (413)
11 PF03761 DUF316: Domain of unk 99.0 5E-09 1.1E-13 98.0 12.6 164 5-213 40-276 (282)
12 PF09342 DUF1986: Domain of un 98.5 6.7E-07 1.5E-11 78.2 9.4 82 15-110 13-131 (267)
13 PF03761 DUF316: Domain of unk 98.4 1.6E-06 3.4E-11 81.1 10.9 117 237-384 157-275 (282)
14 PF09342 DUF1986: Domain of un 96.7 0.044 9.5E-07 48.6 12.6 60 217-282 71-131 (267)
15 COG3591 V8-like Glu-specific e 96.5 0.0092 2E-07 53.7 7.6 53 160-216 197-250 (251)
16 TIGR02037 degP_htrA_DO peripla 94.1 0.11 2.4E-06 51.6 6.3 63 29-110 60-142 (428)
17 PRK10898 serine endoprotease; 93.5 0.97 2.1E-05 43.6 11.4 24 163-190 195-218 (353)
18 TIGR02038 protease_degS peripl 92.4 0.57 1.2E-05 45.2 8.1 25 162-190 194-218 (351)
19 TIGR02037 degP_htrA_DO peripla 89.6 1.7 3.6E-05 43.2 8.6 39 240-282 104-142 (428)
20 PRK10139 serine endoprotease; 88.0 1.4 2.9E-05 44.2 6.7 25 162-190 208-232 (455)
21 PF02395 Peptidase_S6: Immunog 87.4 0.51 1.1E-05 49.9 3.4 53 335-391 213-266 (769)
22 PRK10898 serine endoprotease; 86.3 8.6 0.00019 37.1 11.0 37 240-281 124-160 (353)
23 COG3591 V8-like Glu-specific e 84.4 1.8 3.9E-05 39.2 4.9 53 332-388 197-250 (251)
24 PF02395 Peptidase_S6: Immunog 82.6 1 2.2E-05 47.7 3.0 32 163-194 213-245 (769)
25 TIGR02038 protease_degS peripl 82.5 6.5 0.00014 37.9 8.3 38 240-282 124-161 (351)
26 PRK10942 serine endoprotease; 80.0 4.9 0.00011 40.4 6.7 24 163-190 230-253 (473)
27 PRK10139 serine endoprotease; 79.6 9.8 0.00021 38.1 8.7 122 240-388 137-259 (455)
28 PF05580 Peptidase_S55: SpoIVB 69.4 6.4 0.00014 34.7 3.8 28 332-364 174-201 (218)
29 PRK10942 serine endoprotease; 60.8 39 0.00085 34.0 8.1 37 240-280 158-194 (473)
30 PF00947 Pico_P2A: Picornaviru 59.3 10 0.00022 30.3 2.9 35 337-381 89-123 (127)
31 PF00947 Pico_P2A: Picornaviru 52.0 18 0.00039 29.0 3.2 34 165-208 89-122 (127)
32 TIGR02860 spore_IV_B stage IV 34.1 37 0.0008 33.2 2.9 46 332-388 354-399 (402)
33 PF05579 Peptidase_S32: Equine 32.0 51 0.0011 30.1 3.2 22 337-362 207-228 (297)
34 PF13365 Trypsin_2: Trypsin-li 29.6 48 0.001 25.5 2.5 18 337-358 103-120 (120)
35 PF00548 Peptidase_C3: 3C cyst 24.2 95 0.0021 26.5 3.5 26 336-362 145-170 (172)
36 PF02907 Peptidase_S29: Hepati 23.7 79 0.0017 25.7 2.7 25 165-194 107-131 (148)
37 PF00944 Peptidase_S3: Alphavi 21.3 41 0.00089 27.3 0.6 24 337-364 105-128 (158)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=6.1e-34 Score=257.96 Aligned_cols=191 Identities=42% Similarity=0.818 Sum_probs=164.8
Q ss_pred cccCCccccceeeEEEEeeee----ccCCCcce---EEEEEEe---------------------------------cCCC
Q psy2950 7 FVEGNPRQLHHQLFIILLRRT----SEGGSLPH---ILQAAEV---------------------------------PLTP 46 (406)
Q Consensus 7 i~~G~~~~~~~~P~~v~i~~~----~c~G~l~~---vltaa~c---------------------------------~~Hp 46 (406)
|+||.++..++|||+|+|... .|+|+|+. ||||||| ++||
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp 80 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHP 80 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECC
Confidence 789999999999999999865 39999966 9999999 4599
Q ss_pred CCCC-CCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCCCCcccccceee
Q psy2950 47 KEEC-RRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEGGSLPHILQAAE 125 (406)
Q Consensus 47 ~y~~-~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~ 125 (406)
+|+. ...+|||||||++++.++..++|||||........ + ..+.++|||........+..++...
T Consensus 81 ~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~-------------~-~~~~~~G~g~~~~~~~~~~~~~~~~ 146 (232)
T cd00190 81 NYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPA-------------G-TTCTVSGWGRTSEGGPLPDVLQEVN 146 (232)
T ss_pred CCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCC-------------C-CEEEEEeCCcCCCCCCCCceeeEEE
Confidence 9984 57899999999999999999999999988633332 4 8999999998766545677899999
Q ss_pred eeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEe
Q psy2950 126 VPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLV 205 (406)
Q Consensus 126 ~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v 205 (406)
+.+++.++|+..+.. ...+.+.++|++........|.||+||||++.. +++|+|+||+|++..|.....|.+|+++
T Consensus 147 ~~~~~~~~C~~~~~~---~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~c~~~~~~~~~t~v 222 (232)
T cd00190 147 VPIVSNAECKRAYSY---GGTITDNMLCAGGLEGGKDACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRV 222 (232)
T ss_pred eeeECHHHhhhhccC---cccCCCceEeeCCCCCCCccccCCCCCcEEEEe-CCEEEEEEEEehhhccCCCCCCCEEEEc
Confidence 999999999988764 126889999998654467899999999999988 6999999999999889866778999999
Q ss_pred eechhhHHHh
Q psy2950 206 SCYSDWVKSI 215 (406)
Q Consensus 206 ~~~~~WI~~~ 215 (406)
..|.+||+++
T Consensus 223 ~~~~~WI~~~ 232 (232)
T cd00190 223 SSYLDWIQKT 232 (232)
T ss_pred HHhhHHhhcC
Confidence 9999999853
No 2
>KOG3627|consensus
Probab=99.98 E-value=6.1e-32 Score=249.36 Aligned_cols=196 Identities=39% Similarity=0.772 Sum_probs=164.2
Q ss_pred cccccCCccccceeeEEEEeeee-----ccCCCcce---EEEEEEe----------------------------------
Q psy2950 5 QDFVEGNPRQLHHQLFIILLRRT-----SEGGSLPH---ILQAAEV---------------------------------- 42 (406)
Q Consensus 5 ~~i~~G~~~~~~~~P~~v~i~~~-----~c~G~l~~---vltaa~c---------------------------------- 42 (406)
.||+||.++..+++||+|+|+.. .|+|+|++ |||||||
T Consensus 11 ~~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~ 90 (256)
T KOG3627|consen 11 GRIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVE 90 (256)
T ss_pred CCEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceee
Confidence 58999999999999999999864 49999855 9999988
Q ss_pred --cCCCCCC-CCCC-cceEEEEeCCccccCCCccccccCCCCCc-cccccccccccccccCCCceEEEeeeccCCCC-CC
Q psy2950 43 --PLTPKEE-CRRS-YAVAGYELTRPFKFNEFVSPICLPNPGLT-VTADVGLISGWGRLSEGADVGLISGWGRLSEG-GS 116 (406)
Q Consensus 43 --~~Hp~y~-~~~~-nDIALlkL~~~v~~~~~v~picl~~~~~~-~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~-~~ 116 (406)
++||+|+ .+.. ||||||+|.+++.|+++|+|||||..... ... +...|.++|||++... ..
T Consensus 91 ~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~-------------~~~~~~v~GWG~~~~~~~~ 157 (256)
T KOG3627|consen 91 KIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPP-------------GGTTCLVSGWGRTESGGGP 157 (256)
T ss_pred EEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCC-------------CCCEEEEEeCCCcCCCCCC
Confidence 2488887 4455 99999999999999999999999855432 111 2278999999987665 35
Q ss_pred cccccceeeeeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCC-CCC
Q psy2950 117 LPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CAR 195 (406)
Q Consensus 117 ~~~~l~~~~~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~-C~~ 195 (406)
.+..|+++.+++++.++|+..+.... .+++.++|++......+.|.|||||||++.. .++|+++||+|||.. |..
T Consensus 158 ~~~~L~~~~v~i~~~~~C~~~~~~~~---~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~-~~~~~~~GivS~G~~~C~~ 233 (256)
T KOG3627|consen 158 LPDTLQEVDVPIISNSECRRAYGGLG---TITDTMLCAGGPEGGKDACQGDSGGPLVCED-NGRWVLVGIVSWGSGGCGQ 233 (256)
T ss_pred CCceeEEEEEeEcChhHhcccccCcc---ccCCCEEeeCccCCCCccccCCCCCeEEEee-CCcEEEEEEEEecCCCCCC
Confidence 57889999999999999999887532 3566789999755667899999999999998 558999999999987 998
Q ss_pred CCCCeeEEEeeechhhHHHhhh
Q psy2950 196 PDFYGVYTLVSCYSDWVKSILY 217 (406)
Q Consensus 196 ~~~p~~~t~v~~~~~WI~~~i~ 217 (406)
.+.|++|++|..|.+||++.+.
T Consensus 234 ~~~P~vyt~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 234 PNYPGVYTRVSSYLDWIKENIG 255 (256)
T ss_pred CCCCeEEeEhHHhHHHHHHHhc
Confidence 7799999999999999998764
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.97 E-value=5.5e-31 Score=238.52 Aligned_cols=188 Identities=42% Similarity=0.789 Sum_probs=161.2
Q ss_pred ccccCCccccceeeEEEEeeeec----cCCCcce---EEEEEEe--------------------------------cCCC
Q psy2950 6 DFVEGNPRQLHHQLFIILLRRTS----EGGSLPH---ILQAAEV--------------------------------PLTP 46 (406)
Q Consensus 6 ~i~~G~~~~~~~~P~~v~i~~~~----c~G~l~~---vltaa~c--------------------------------~~Hp 46 (406)
||+||.++...+|||+|.++... |+|+|+. ||||||| ++||
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p 80 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHP 80 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECC
Confidence 68999999999999999998664 9999966 9999999 3489
Q ss_pred CCC-CCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCC-CCccccccee
Q psy2950 47 KEE-CRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEG-GSLPHILQAA 124 (406)
Q Consensus 47 ~y~-~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~-~~~~~~l~~~ 124 (406)
+|+ ....+|||||||++++.+++.++|||||........ + ..+.++|||..... ......++..
T Consensus 81 ~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~-------------~-~~~~~~g~g~~~~~~~~~~~~~~~~ 146 (229)
T smart00020 81 NYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPA-------------G-TTCTVSGWGRTSEGAGSLPDTLQEV 146 (229)
T ss_pred CCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCC-------------C-CEEEEEeCCCCCCCCCcCCCEeeEE
Confidence 987 567899999999999999999999999987433333 4 88999999986542 3456678899
Q ss_pred eeeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEE
Q psy2950 125 EVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTL 204 (406)
Q Consensus 125 ~~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~ 204 (406)
.+.+++.+.|...+... ..+.+.++|++........|.+|+|+||++.. + +|+|+||+|++..|.....|.+|++
T Consensus 147 ~~~~~~~~~C~~~~~~~---~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~ 221 (229)
T smart00020 147 NVPIVSNATCRRAYSGG---GAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTR 221 (229)
T ss_pred EEEEeCHHHhhhhhccc---cccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEE
Confidence 99999999999877542 25788999998765467899999999999988 5 9999999999999986678899999
Q ss_pred eeechhhH
Q psy2950 205 VSCYSDWV 212 (406)
Q Consensus 205 v~~~~~WI 212 (406)
+..|.+||
T Consensus 222 i~~~~~WI 229 (229)
T smart00020 222 VSSYLDWI 229 (229)
T ss_pred eccccccC
Confidence 99999997
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.96 E-value=7.1e-29 Score=222.97 Aligned_cols=181 Identities=38% Similarity=0.772 Sum_probs=156.5
Q ss_pred cccCCccccceeeEEEEeeeec----cCCCcce---EEEEEEe-------------------------------cCCCCC
Q psy2950 7 FVEGNPRQLHHQLFIILLRRTS----EGGSLPH---ILQAAEV-------------------------------PLTPKE 48 (406)
Q Consensus 7 i~~G~~~~~~~~P~~v~i~~~~----c~G~l~~---vltaa~c-------------------------------~~Hp~y 48 (406)
|.||.+++.++|||+|.++... |+|+|+. ||||||| ++||+|
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~ 80 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKY 80 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTS
T ss_pred CCCCEECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 7899999999999999998755 9999966 9999999 468898
Q ss_pred C-CCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCCCCcccccceeeee
Q psy2950 49 E-CRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEGGSLPHILQAAEVP 127 (406)
Q Consensus 49 ~-~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~ 127 (406)
+ ....+|||||||++++.+.+.++|+||+........ + ..+.+.|||.....+ ....++...+.
T Consensus 81 ~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~-------------~-~~~~~~G~~~~~~~~-~~~~~~~~~~~ 145 (220)
T PF00089_consen 81 DPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNV-------------G-TSCIVVGWGRTSDNG-YSSNLQSVTVP 145 (220)
T ss_dssp BTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTT-------------T-SEEEEEESSBSSTTS-BTSBEEEEEEE
T ss_pred cccccccccccccccccccccccccccccccccccccc-------------c-ccccccccccccccc-ccccccccccc
Confidence 7 446899999999999999999999999985443333 4 899999999865544 45678899999
Q ss_pred ccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEeee
Q psy2950 128 LTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSC 207 (406)
Q Consensus 128 i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~~ 207 (406)
+++.+.|...+.. .+.+.++|++.. .....|.||+|+||++.+ . +|+||.+++..|...+.|.+|+++..
T Consensus 146 ~~~~~~c~~~~~~-----~~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~-~---~lvGI~s~~~~c~~~~~~~v~~~v~~ 215 (220)
T PF00089_consen 146 VVSRKTCRSSYND-----NLTPNMICAGSS-GSGDACQGDSGGPLICNN-N---YLVGIVSFGENCGSPNYPGVYTRVSS 215 (220)
T ss_dssp EEEHHHHHHHTTT-----TSTTTEEEEETT-SSSBGGTTTTTSEEEETT-E---EEEEEEEEESSSSBTTSEEEEEEGGG
T ss_pred ccccccccccccc-----cccccccccccc-ccccccccccccccccce-e---eecceeeecCCCCCCCcCEEEEEHHH
Confidence 9999999998543 578899999865 557899999999999977 2 79999999999998778999999999
Q ss_pred chhhH
Q psy2950 208 YSDWV 212 (406)
Q Consensus 208 ~~~WI 212 (406)
|.+||
T Consensus 216 ~~~WI 220 (220)
T PF00089_consen 216 YLDWI 220 (220)
T ss_dssp GHHHH
T ss_pred hhccC
Confidence 99998
No 5
>KOG3627|consensus
Probab=99.95 E-value=9.8e-28 Score=221.28 Aligned_cols=164 Identities=45% Similarity=0.914 Sum_probs=138.9
Q ss_pred eeeeeeeEee-ccCCCCCCc-CceEEEEeCCCcccCCCcccccCCCCCC---cccCCcEEEEcccccCCC-CCcccccee
Q psy2950 222 QRRRVERIYT-DFYDKSIYK-NDIALLELTRPFKFNEFVSPICLPNPGL---TVTADVGLISGWGRLSEG-GSLPHILQA 295 (406)
Q Consensus 222 ~~~~v~~i~~-p~y~~~~~~-~DiALlkL~~~v~~~~~v~piclp~~~~---~~~~~~~~~~GwG~~~~~-~~~~~~l~~ 295 (406)
....+.+++. |+|+..... ||||||+|.+++.|+++|+|||||.... ...+..+.++|||..... ...+..|++
T Consensus 85 ~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~ 164 (256)
T KOG3627|consen 85 LVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQE 164 (256)
T ss_pred hhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEE
Confidence 3344667777 999998877 9999999999999999999999996543 225688999999987764 346788999
Q ss_pred eeeccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC-CCCCCCCeEE
Q psy2950 296 AEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CARPDFYGVY 374 (406)
Q Consensus 296 ~~v~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~vf 374 (406)
+++.+++.++|+..+.... .+.+.+|||+......++|.|||||||++.. .++++|+||+|||.. |+....|++|
T Consensus 165 ~~v~i~~~~~C~~~~~~~~---~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~-~~~~~~~GivS~G~~~C~~~~~P~vy 240 (256)
T KOG3627|consen 165 VDVPIISNSECRRAYGGLG---TITDTMLCAGGPEGGKDACQGDSGGPLVCED-NGRWVLVGIVSWGSGGCGQPNYPGVY 240 (256)
T ss_pred EEEeEcChhHhcccccCcc---ccCCCEEeeCccCCCCccccCCCCCeEEEee-CCcEEEEEEEEecCCCCCCCCCCeEE
Confidence 9999999999999887432 3567789999756678899999999999987 448999999999988 9987799999
Q ss_pred EeCCccHHHHHHhHc
Q psy2950 375 TLVSCYSDWVKSILY 389 (406)
Q Consensus 375 t~v~~~~~WI~~~~~ 389 (406)
|||+.|.+||++.+.
T Consensus 241 t~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 241 TRVSSYLDWIKENIG 255 (256)
T ss_pred eEhHHhHHHHHHHhc
Confidence 999999999999875
No 6
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.95 E-value=1.6e-27 Score=215.90 Aligned_cols=197 Identities=41% Similarity=0.794 Sum_probs=155.7
Q ss_pred CCcEEEEEEEEecCCCCCCCCC-eeEEEeeechhhHHHhhhcccceeeeeeeEee-ccCCCCCCcCceEEEEeCCCcccC
Q psy2950 178 DGRYYLCGITSWGVGCARPDFY-GVYTLVSCYSDWVKSILYARHEQRRRVERIYT-DFYDKSIYKNDIALLELTRPFKFN 255 (406)
Q Consensus 178 ~~~~~l~Gi~s~~~~C~~~~~p-~~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~~ 255 (406)
..+|+|.. ++|.....+ ....++... ..... ....+.+.|++++. |+|+.....+|||||||++++.++
T Consensus 33 s~~~VLTa-----AhC~~~~~~~~~~v~~g~~---~~~~~-~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~ 103 (232)
T cd00190 33 SPRWVLTA-----AHCVYSSAPSNYTVRLGSH---DLSSN-EGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLS 103 (232)
T ss_pred eCCEEEEC-----HHhcCCCCCccEEEEeCcc---cccCC-CCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCC
Confidence 45678887 899753211 122222211 11110 01237788999999 999998889999999999999999
Q ss_pred CCcccccCCCCCCcc-cCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCC
Q psy2950 256 EFVSPICLPNPGLTV-TADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLD 334 (406)
Q Consensus 256 ~~v~piclp~~~~~~-~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~ 334 (406)
++++|+|||...... .+..+.++|||........+..|++..+.+++.++|...+... ..+.++++|++......+
T Consensus 104 ~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~---~~~~~~~~C~~~~~~~~~ 180 (232)
T cd00190 104 DNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYG---GTITDNMLCAGGLEGGKD 180 (232)
T ss_pred CcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCc---ccCCCceEeeCCCCCCCc
Confidence 999999999875333 6899999999987665456778999999999999999887631 167899999997554778
Q ss_pred cccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHHHHh
Q psy2950 335 SCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSI 387 (406)
Q Consensus 335 ~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI~~~ 387 (406)
.|.|||||||++.. +++++|+||+|++..|.....|.+||||+.|.+||+++
T Consensus 181 ~c~gdsGgpl~~~~-~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 181 ACQGDSGGPLVCND-NGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred cccCCCCCcEEEEe-CCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence 99999999999987 68999999999998898767899999999999999864
No 7
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.92 E-value=3.6e-24 Score=193.80 Aligned_cols=158 Identities=47% Similarity=0.916 Sum_probs=136.4
Q ss_pred eeeeeeeEee-ccCCCCCCcCceEEEEeCCCcccCCCcccccCCCCCCcc-cCCcEEEEcccccCC-CCCccccceeeee
Q psy2950 222 QRRRVERIYT-DFYDKSIYKNDIALLELTRPFKFNEFVSPICLPNPGLTV-TADVGLISGWGRLSE-GGSLPHILQAAEV 298 (406)
Q Consensus 222 ~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~~~~v~piclp~~~~~~-~~~~~~~~GwG~~~~-~~~~~~~l~~~~v 298 (406)
..+.|.+++. |+|+.....+|||||+|++++.+++.++|+|||...... .+..+.+.|||.... .......++...+
T Consensus 69 ~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~ 148 (229)
T smart00020 69 QVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNV 148 (229)
T ss_pred eEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEE
Confidence 6788999998 999988889999999999999999999999999863333 688999999998764 2344667899999
Q ss_pred ccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCC
Q psy2950 299 PLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVS 378 (406)
Q Consensus 299 ~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~ 378 (406)
.+++.+.|...+... ..+.++++|++........|.||+||||++.. + +|+|+||+|++..|...+.|.+|+||.
T Consensus 149 ~~~~~~~C~~~~~~~---~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~ 223 (229)
T smart00020 149 PIVSNATCRRAYSGG---GAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTRVS 223 (229)
T ss_pred EEeCHHHhhhhhccc---cccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEEec
Confidence 999999999877532 15788899999754467899999999999987 4 999999999999998667899999999
Q ss_pred ccHHHH
Q psy2950 379 CYSDWV 384 (406)
Q Consensus 379 ~~~~WI 384 (406)
+|.+||
T Consensus 224 ~~~~WI 229 (229)
T smart00020 224 SYLDWI 229 (229)
T ss_pred cccccC
Confidence 999998
No 8
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.90 E-value=6.4e-23 Score=184.18 Aligned_cols=186 Identities=39% Similarity=0.752 Sum_probs=147.5
Q ss_pred CCcEEEEEEEEecCCCCCCCCCeeEEEeeechhhHHHhhhcccceeeeeeeEee-ccCCCCCCcCceEEEEeCCCcccCC
Q psy2950 178 DGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSILYARHEQRRRVERIYT-DFYDKSIYKNDIALLELTRPFKFNE 256 (406)
Q Consensus 178 ~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~~~ 256 (406)
..+|+|.. ++|... ...+-..+.. .++...... .+.+.|++++. |.|+.....+|||||||++++.+.+
T Consensus 33 ~~~~vLTa-----ahC~~~-~~~~~v~~g~--~~~~~~~~~--~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~ 102 (220)
T PF00089_consen 33 SPRWVLTA-----AHCVDG-ASDIKVRLGT--YSIRNSDGS--EQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD 102 (220)
T ss_dssp ETTEEEEE-----GGGHTS-GGSEEEEESE--SBTTSTTTT--SEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred cccccccc-----cccccc-cccccccccc--ccccccccc--ccccccccccccccccccccccccccccccccccccc
Confidence 44577877 899764 1112122222 233222221 37899999988 9999988899999999999999999
Q ss_pred CcccccCCCCCCcc-cCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCCcCCCCcceEEeecCCCCCCc
Q psy2950 257 FVSPICLPNPGLTV-TADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDS 335 (406)
Q Consensus 257 ~v~piclp~~~~~~-~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~ 335 (406)
.++|+||+...... .+..+.+.|||.....+ ....++...+.+++.+.|...+.. .+.+.++|+... ...+.
T Consensus 103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~~~~~~~c~~~~~~-----~~~~~~~c~~~~-~~~~~ 175 (220)
T PF00089_consen 103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVPVVSRKTCRSSYND-----NLTPNMICAGSS-GSGDA 175 (220)
T ss_dssp SBEESBBTSTTHTTTTTSEEEEEESSBSSTTS-BTSBEEEEEEEEEEHHHHHHHTTT-----TSTTTEEEEETT-SSSBG
T ss_pred cccccccccccccccccccccccccccccccc-cccccccccccccccccccccccc-----cccccccccccc-ccccc
Confidence 99999999844322 68899999999865544 456789999999999999987542 478899999975 56799
Q ss_pred ccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHH
Q psy2950 336 CQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWV 384 (406)
Q Consensus 336 C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI 384 (406)
|.|||||||++... +|+||+|++..|.....|.+|+||+.|++||
T Consensus 176 ~~g~sG~pl~~~~~----~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 176 CQGDSGGPLICNNN----YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp GTTTTTSEEEETTE----EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccccccccccee----eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 99999999999761 7999999999999877899999999999999
No 9
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.90 E-value=3.8e-23 Score=186.93 Aligned_cols=230 Identities=23% Similarity=0.327 Sum_probs=158.3
Q ss_pred ccccccCCccccceeeEEEEeeee--------ccCCCc---ceEEEEEEe------------------------------
Q psy2950 4 IQDFVEGNPRQLHHQLFIILLRRT--------SEGGSL---PHILQAAEV------------------------------ 42 (406)
Q Consensus 4 ~~~i~~G~~~~~~~~P~~v~i~~~--------~c~G~l---~~vltaa~c------------------------------ 42 (406)
--||+||+.+..++||++|++-.+ .|||+. ++|||||||
T Consensus 30 s~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr 109 (413)
T COG5640 30 SSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVR 109 (413)
T ss_pred ceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEecccccccccCcceE
Confidence 358999999999999999999532 299998 459999999
Q ss_pred --cCCCCCC-CCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeeccCCCC---CC
Q psy2950 43 --PLTPKEE-CRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGRLSEG---GS 116 (406)
Q Consensus 43 --~~Hp~y~-~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~~~~~---~~ 116 (406)
..|..|. .++.||||+++|.++.... .+.+-..+..... .--++.| ......+|+.+... ..
T Consensus 110 ~i~~~efY~~~n~~ND~Av~~l~~~a~~p----r~ki~~~~~sdt~-l~sv~~~-------s~~~n~t~~~~~~~~v~~~ 177 (413)
T COG5640 110 TIYVHEFYSPGNLGNDIAVLELARAASLP----RVKITSFDASDTF-LNSVTTV-------SPMTNGTFGVTTPSDVPRS 177 (413)
T ss_pred EEeeecccccccccCcceeeccccccccc----hhheeeccCcccc-eeccccc-------ccccceeeeeeeecCCCCC
Confidence 4588886 6799999999999866432 1111111110000 0001111 34555666644322 12
Q ss_pred cc--cccceeeeeccChhhhhHhhhccC-CCCCCCCCeeecCCCCCCcccccCCCCCeeEeeCCCCcEEEEEEEEecCC-
Q psy2950 117 LP--HILQAAEVPLTPKEECRRSYAVAG-YSNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG- 192 (406)
Q Consensus 117 ~~--~~l~~~~~~i~~~~~C~~~~~~~~-~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~- 192 (406)
.+ ..|+++.+...+...|.+.++... ......-.-||++.+ ..+.|+||||||++... +...+++||+|||.+
T Consensus 178 ~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~--~~daCqGDSGGPi~~~g-~~G~vQ~GVvSwG~~~ 254 (413)
T COG5640 178 SPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRP--PKDACQGDSGGPIFHKG-EEGRVQRGVVSWGDGG 254 (413)
T ss_pred CCccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCC--CcccccCCCCCceEEeC-CCccEEEeEEEecCCC
Confidence 22 478999999999999999886211 111222233999976 47999999999999988 556699999999976
Q ss_pred CCCCCCCeeEEEeeechhhHHHhhhcccceeeeeeeEeec-------------cCCCCCCcCceEEEEeC
Q psy2950 193 CARPDFYGVYTLVSCYSDWVKSILYARHEQRRRVERIYTD-------------FYDKSIYKNDIALLELT 249 (406)
Q Consensus 193 C~~~~~p~~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~p-------------~y~~~~~~~DiALlkL~ 249 (406)
|+....|++||+++.|.+||...+..-....+..+. +.| .++..++.-|++++-.+
T Consensus 255 Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~~~rp~~-~~~~G~~t~~~p~T~~~~na~t~~g~~~~llva 323 (413)
T COG5640 255 CGGTLIPGVYTNVSNYQDWIAAMTNGLSYLQFRPLG-YRPTGFDTPRDPATNFFFNAQTYEGNTFVLLVA 323 (413)
T ss_pred CCCCCcceeEEehhHHHHHHHHHhcCCCcccccccc-cccccccccccCCCccccccccccCCeEEEEEe
Confidence 999999999999999999999987654433333333 111 23445566677777765
No 10
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.79 E-value=3.1e-18 Score=155.25 Aligned_cols=205 Identities=30% Similarity=0.447 Sum_probs=138.3
Q ss_pred CCcEEEEEEEEecCCCCCCCCCe--eEEEeeechhhHHHhhhcccceeeeeeeEee-ccCCCCCCcCceEEEEeCCCccc
Q psy2950 178 DGRYYLCGITSWGVGCARPDFYG--VYTLVSCYSDWVKSILYARHEQRRRVERIYT-DFYDKSIYKNDIALLELTRPFKF 254 (406)
Q Consensus 178 ~~~~~l~Gi~s~~~~C~~~~~p~--~~t~v~~~~~WI~~~i~~~~~~~~~v~~i~~-p~y~~~~~~~DiALlkL~~~v~~ 254 (406)
++||+|.+ ++|...+.|. -+.++..-+. +.++.+...+..++. ..|.+.++.||+|+++|.++...
T Consensus 69 ~~RYvLTA-----AHC~~~~s~is~d~~~vv~~l~------d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~ 137 (413)
T COG5640 69 GGRYVLTA-----AHCADASSPISSDVNRVVVDLN------DSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASL 137 (413)
T ss_pred cceEEeee-----hhhccCCCCccccceEEEeccc------ccccccCcceEEEeeecccccccccCcceeecccccccc
Confidence 67899999 8997655432 2333332222 333347778888888 88899999999999999985542
Q ss_pred CCCcccccCCCCC---Ccc-cCCcEEEEcccccCCCC---Ccc--ccceeeeeccCChhhhhhhhhccCCc-CCCCcceE
Q psy2950 255 NEFVSPICLPNPG---LTV-TADVGLISGWGRLSEGG---SLP--HILQAAEVPLTPKEECRRSYAVAGYS-NYLNQCQV 324 (406)
Q Consensus 255 ~~~v~piclp~~~---~~~-~~~~~~~~GwG~~~~~~---~~~--~~l~~~~v~~~~~~~C~~~~~~~~~~-~~~~~~~~ 324 (406)
- .++---....+ ... ........+||.+.... ..+ ..|+++.+...+...|..+++..... ....-.-+
T Consensus 138 p-r~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~ 216 (413)
T COG5640 138 P-RVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGF 216 (413)
T ss_pred c-hhheeeccCcccceecccccccccceeeeeeeecCCCCCCCccceeeeeeeeeechHHhhhhccccccCCCCCCccce
Confidence 1 11100000001 000 11222345555543321 112 48999999999999999988521111 11222239
Q ss_pred EeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC-CCCCCCCeEEEeCCccHHHHHHhHccCCCcccc
Q psy2950 325 CTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CARPDFYGVYTLVSCYSDWVKSILYASVSAKRV 397 (406)
Q Consensus 325 Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~vft~v~~~~~WI~~~~~~~~~~~~~ 397 (406)
|++.+ .++.|+||||||++.+..+++ +++||+|||.+ |+....|.|||+|+.|.+||...|...+.+.+-
T Consensus 217 cag~~--~~daCqGDSGGPi~~~g~~G~-vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~~~r 287 (413)
T COG5640 217 CAGRP--PKDACQGDSGGPIFHKGEEGR-VQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNGLSYLQFR 287 (413)
T ss_pred ecCCC--CcccccCCCCCceEEeCCCcc-EEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcCCCccccc
Confidence 99965 589999999999999985555 89999999988 999999999999999999999999887665443
No 11
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=98.99 E-value=5e-09 Score=97.96 Aligned_cols=164 Identities=25% Similarity=0.459 Sum_probs=109.0
Q ss_pred cccccCCccccceeeEEEEeeeec-------cCCCcce---EEEEEEecCCC----------------------------
Q psy2950 5 QDFVEGNPRQLHHQLFIILLRRTS-------EGGSLPH---ILQAAEVPLTP---------------------------- 46 (406)
Q Consensus 5 ~~i~~G~~~~~~~~P~~v~i~~~~-------c~G~l~~---vltaa~c~~Hp---------------------------- 46 (406)
.++.+|.++...+.||.|.+.... ++|++|+ |||++||+...
T Consensus 40 ~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~ 119 (282)
T PF03761_consen 40 SKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEV 119 (282)
T ss_pred ccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHH
Confidence 346889999999999999997654 5889854 99999992200
Q ss_pred ----------------------------CC------CCCCCcceEEEEeCCccccCCCccccccCCCCCccccccccccc
Q psy2950 47 ----------------------------KE------ECRRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVGLISG 92 (406)
Q Consensus 47 ----------------------------~y------~~~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~ 92 (406)
++ ......+++||+|+++ ++....|+|||........
T Consensus 120 l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~~~------- 190 (282)
T PF03761_consen 120 LSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTNWEK------- 190 (282)
T ss_pred hccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCcccccc-------
Confidence 00 0123467889999998 7788999999987655432
Q ss_pred cccccCCCceEEEeeeccCCCCCCcccccceeeeeccChhhhhHhhhccCCCCCCCCCeeecCCCCCCcccccCCCCCee
Q psy2950 93 WGRLSEGADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYLNQCQVCTGTKQGGLDSCQGDSGGPL 172 (406)
Q Consensus 93 wg~~~~~~~~~~~~Gwg~~~~~~~~~~~l~~~~~~i~~~~~C~~~~~~~~~~~~~~~~~~Ca~~~~~~~~~C~gdsGgPl 172 (406)
+ +.+.+.|+ .....+....+.+.....|. ..+| .....|.+|+||||
T Consensus 191 ------~-~~~~~yg~-------~~~~~~~~~~~~i~~~~~~~--------------~~~~-----~~~~~~~~d~Gg~l 237 (282)
T PF03761_consen 191 ------G-DEVDVYGF-------NSTGKLKHRKLKITNCTKCA--------------YSIC-----TKQYSCKGDRGGPL 237 (282)
T ss_pred ------C-ceEEEeec-------CCCCeEEEEEEEEEEeeccc--------------eeEe-----cccccCCCCccCeE
Confidence 3 55555555 11233444444443322211 1122 23477999999999
Q ss_pred EeeCCCCcEEEEEEEEecC-CCCCCCCCeeEEEeeechhhHH
Q psy2950 173 ACPLPDGRYYLCGITSWGV-GCARPDFYGVYTLVSCYSDWVK 213 (406)
Q Consensus 173 ~~~~~~~~~~l~Gi~s~~~-~C~~~~~p~~~t~v~~~~~WI~ 213 (406)
+... +|+|+|+||.+.+. .|.. + ...|.++..|.+=|-
T Consensus 238 v~~~-~gr~tlIGv~~~~~~~~~~-~-~~~f~~v~~~~~~IC 276 (282)
T PF03761_consen 238 VKNI-NGRWTLIGVGASGNYECNK-N-NSYFFNVSWYQDEIC 276 (282)
T ss_pred EEEE-CCCEEEEEEEccCCCcccc-c-ccEEEEHHHhhhhhc
Confidence 9998 89999999998874 3432 1 457888877766443
No 12
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=98.51 E-value=6.7e-07 Score=78.18 Aligned_cols=82 Identities=17% Similarity=0.157 Sum_probs=61.0
Q ss_pred cceeeEEEEeeeec---cCCCcc---eEEEEEEecC-------------------------CCC------CCCCCCcceE
Q psy2950 15 LHHQLFIILLRRTS---EGGSLP---HILQAAEVPL-------------------------TPK------EECRRSYAVA 57 (406)
Q Consensus 15 ~~~~P~~v~i~~~~---c~G~l~---~vltaa~c~~-------------------------Hp~------y~~~~~nDIA 57 (406)
...|||.|.|+... |+|+|+ |+|++..|.. |.. |..-.+.+++
T Consensus 13 ~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~ 92 (267)
T PF09342_consen 13 DYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVL 92 (267)
T ss_pred cccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeecccccee
Confidence 45699999998655 999994 4999999921 111 0001357999
Q ss_pred EEEeCCccccCCCccccccCCCCCccccccccccccccccCCCceEEEeeecc
Q psy2950 58 GYELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGADVGLISGWGR 110 (406)
Q Consensus 58 LlkL~~~v~~~~~v~picl~~~~~~~~~~~~~~~~wg~~~~~~~~~~~~Gwg~ 110 (406)
||.|++|++|+.+|+|..||........ . ..|...|-..
T Consensus 93 LLHL~~~~~fTr~VlP~flp~~~~~~~~-------------~-~~CVAVg~d~ 131 (267)
T PF09342_consen 93 LLHLEQPANFTRYVLPTFLPETSNENES-------------D-DECVAVGHDD 131 (267)
T ss_pred eeeecCcccceeeecccccccccCCCCC-------------C-CceEEEEccc
Confidence 9999999999999999999974444433 4 6888888654
No 13
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=98.44 E-value=1.6e-06 Score=81.13 Aligned_cols=117 Identities=27% Similarity=0.414 Sum_probs=79.7
Q ss_pred CCCcCceEEEEeCCCcccCCCcccccCCCCCCcc-cCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCC
Q psy2950 237 SIYKNDIALLELTRPFKFNEFVSPICLPNPGLTV-TADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGY 315 (406)
Q Consensus 237 ~~~~~DiALlkL~~~v~~~~~v~piclp~~~~~~-~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~ 315 (406)
....++++||+|+++ ++....|+|||...... .++.+.+.|+ .....+....+.+.....
T Consensus 157 ~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~~~~~~~~~yg~-------~~~~~~~~~~~~i~~~~~---------- 217 (282)
T PF03761_consen 157 FNRPYSPMILELEED--FSKNVSPPCLADSSTNWEKGDEVDVYGF-------NSTGKLKHRKLKITNCTK---------- 217 (282)
T ss_pred cccccceEEEEEccc--ccccCCCEEeCCCccccccCceEEEeec-------CCCCeEEEEEEEEEEeec----------
Confidence 345789999999999 77899999999865544 5666666665 112234444444433322
Q ss_pred cCCCCcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC-CCCCCCCeEEEeCCccHHHH
Q psy2950 316 SNYLNQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG-CARPDFYGVYTLVSCYSDWV 384 (406)
Q Consensus 316 ~~~~~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~-c~~~~~p~vft~v~~~~~WI 384 (406)
|+.........|.+|+||||+... +++++|+||.+.+.. |.. ....|.+|..|.+=|
T Consensus 218 ---------~~~~~~~~~~~~~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~--~~~~f~~v~~~~~~I 275 (282)
T PF03761_consen 218 ---------CAYSICTKQYSCKGDRGGPLVKNI-NGRWTLIGVGASGNYECNK--NNSYFFNVSWYQDEI 275 (282)
T ss_pred ---------cceeEecccccCCCCccCeEEEEE-CCCEEEEEEEccCCCcccc--cccEEEEHHHhhhhh
Confidence 111112245779999999999987 899999999987753 432 267888988877643
No 14
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=96.72 E-value=0.044 Score=48.62 Aligned_cols=60 Identities=27% Similarity=0.400 Sum_probs=43.1
Q ss_pred hcccceeeeeeeEeeccCCCCCCcCceEEEEeCCCcccCCCcccccCCCCCCcc-cCCcEEEEcccc
Q psy2950 217 YARHEQRRRVERIYTDFYDKSIYKNDIALLELTRPFKFNEFVSPICLPNPGLTV-TADVGLISGWGR 282 (406)
Q Consensus 217 ~~~~~~~~~v~~i~~p~y~~~~~~~DiALlkL~~~v~~~~~v~piclp~~~~~~-~~~~~~~~GwG~ 282 (406)
....+|..+|..+.. - ...+++||.|++|+.|+.+|+|+.||....+. ....|...|-..
T Consensus 71 ~Gp~EQI~rVD~~~~-----V-~~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 71 DGPHEQISRVDCFKD-----V-PESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred CCChheEEEeeeeee-----c-cccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 333345555555543 1 25689999999999999999999999744444 566899888554
No 15
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=96.55 E-value=0.0092 Score=53.74 Aligned_cols=53 Identities=15% Similarity=0.238 Sum_probs=34.5
Q ss_pred CcccccCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEee-echhhHHHhh
Q psy2950 160 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVS-CYSDWVKSIL 216 (406)
Q Consensus 160 ~~~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~-~~~~WI~~~i 216 (406)
..+.+.|+||+|++.... +++||..-+.+-.........+++. ..++||++.+
T Consensus 197 ~~dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 197 DADTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred EecccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 347889999999997652 8999998876533222223334443 4567777654
No 16
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=94.07 E-value=0.11 Score=51.56 Aligned_cols=63 Identities=17% Similarity=0.076 Sum_probs=40.4
Q ss_pred cCCCcce----EEEEEEecCCCC-----------CCC-----CCCcceEEEEeCCccccCCCccccccCCCCCccccccc
Q psy2950 29 EGGSLPH----ILQAAEVPLTPK-----------EEC-----RRSYAVAGYELTRPFKFNEFVSPICLPNPGLTVTADVG 88 (406)
Q Consensus 29 c~G~l~~----vltaa~c~~Hp~-----------y~~-----~~~nDIALlkL~~~v~~~~~v~picl~~~~~~~~~~~~ 88 (406)
++|.++. |||.+|++-... |.. ....||||||++.+ ..+.++.|........
T Consensus 60 GSGfii~~~G~IlTn~Hvv~~~~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~----~~~~~~~l~~~~~~~~---- 131 (428)
T TIGR02037 60 GSGVIISADGYILTNNHVVDGADEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK----KNLPVIKLGDSDKLRV---- 131 (428)
T ss_pred eeEEEECCCCEEEEcHHHcCCCCeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC----CCceEEEccCCCCCCC----
Confidence 7777754 999999943111 211 24689999999865 3456777764433222
Q ss_pred cccccccccCCCceEEEeeecc
Q psy2950 89 LISGWGRLSEGADVGLISGWGR 110 (406)
Q Consensus 89 ~~~~wg~~~~~~~~~~~~Gwg~ 110 (406)
+ +.+.+.|+..
T Consensus 132 ----------G-~~v~aiG~p~ 142 (428)
T TIGR02037 132 ----------G-DWVLAIGNPF 142 (428)
T ss_pred ----------C-CEEEEEECCC
Confidence 4 8888888753
No 17
>PRK10898 serine endoprotease; Provisional
Probab=93.50 E-value=0.97 Score=43.59 Aligned_cols=24 Identities=38% Similarity=0.555 Sum_probs=18.1
Q ss_pred cccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950 163 SCQGDSGGPLACPLPDGRYYLCGITSWG 190 (406)
Q Consensus 163 ~C~gdsGgPl~~~~~~~~~~l~Gi~s~~ 190 (406)
.-.|.|||||+-... .++||.+..
T Consensus 195 i~~GnSGGPl~n~~G----~vvGI~~~~ 218 (353)
T PRK10898 195 INHGNSGGALVNSLG----ELMGINTLS 218 (353)
T ss_pred cCCCCCcceEECCCC----eEEEEEEEE
Confidence 346889999994432 799998864
No 18
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=92.42 E-value=0.57 Score=45.19 Aligned_cols=25 Identities=32% Similarity=0.375 Sum_probs=18.3
Q ss_pred ccccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950 162 DSCQGDSGGPLACPLPDGRYYLCGITSWG 190 (406)
Q Consensus 162 ~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~ 190 (406)
..-.|.|||||+-... .++||.+..
T Consensus 194 ~i~~GnSGGpl~n~~G----~vIGI~~~~ 218 (351)
T TIGR02038 194 AINAGNSGGALINTNG----ELVGINTAS 218 (351)
T ss_pred ccCCCCCcceEECCCC----eEEEEEeee
Confidence 3446889999995432 799998764
No 19
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=89.64 E-value=1.7 Score=43.21 Aligned_cols=39 Identities=23% Similarity=0.183 Sum_probs=27.5
Q ss_pred cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcccc
Q psy2950 240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGR 282 (406)
Q Consensus 240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG~ 282 (406)
..|+||||+..+ ....++.|...+....|+.+.+.|+-.
T Consensus 104 ~~DlAllkv~~~----~~~~~~~l~~~~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 104 RTDIAVLKIDAK----KNLPVIKLGDSDKLRVGDWVLAIGNPF 142 (428)
T ss_pred CCCEEEEEecCC----CCceEEEccCCCCCCCCCEEEEEECCC
Confidence 579999999864 245566776444333799999988753
No 20
>PRK10139 serine endoprotease; Provisional
Probab=87.97 E-value=1.4 Score=44.17 Aligned_cols=25 Identities=32% Similarity=0.291 Sum_probs=18.8
Q ss_pred ccccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950 162 DSCQGDSGGPLACPLPDGRYYLCGITSWG 190 (406)
Q Consensus 162 ~~C~gdsGgPl~~~~~~~~~~l~Gi~s~~ 190 (406)
..-.|.|||||+-... .++||.+..
T Consensus 208 ~in~GnSGGpl~n~~G----~vIGi~~~~ 232 (455)
T PRK10139 208 SINRGNSGGALLNLNG----ELIGINTAI 232 (455)
T ss_pred ccCCCCCcceEECCCC----eEEEEEEEE
Confidence 3456899999995432 799998864
No 21
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=87.38 E-value=0.51 Score=49.88 Aligned_cols=53 Identities=25% Similarity=0.436 Sum_probs=31.3
Q ss_pred cccCCCCCeeEe-ecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHHHHhHccC
Q psy2950 335 SCQGDSGGPLAC-PLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSILYAS 391 (406)
Q Consensus 335 ~C~gDsGgPl~~-~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI~~~~~~~ 391 (406)
.=.||||+||+. +..+.+|+|+|+++.+....... ..++-+ -.+|+++++.+.
T Consensus 213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~--~~~~~~--~~~f~~~~~~~d 266 (769)
T PF02395_consen 213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNGKG--NWWNVI--PPDFINQIKQND 266 (769)
T ss_dssp --TT-TT-EEEEEETTTTEEEEEEEEEEECCCCHSE--EEEEEE--CHHHHHHHHHHC
T ss_pred cccCcCCCceEEEEccCCeEEEEEEEccccccCCcc--ceeEEe--cHHHHHHHHhhh
Confidence 457999999985 55678999999999876543321 333222 345555555444
No 22
>PRK10898 serine endoprotease; Provisional
Probab=86.32 E-value=8.6 Score=37.10 Aligned_cols=37 Identities=22% Similarity=0.245 Sum_probs=22.9
Q ss_pred cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEccc
Q psy2950 240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWG 281 (406)
Q Consensus 240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG 281 (406)
.+|+||||+... . ..++-|........|+.+.+.|+-
T Consensus 124 ~~DlAvl~v~~~-~----l~~~~l~~~~~~~~G~~V~aiG~P 160 (353)
T PRK10898 124 LTDLAVLKINAT-N----LPVIPINPKRVPHIGDVVLAIGNP 160 (353)
T ss_pred CCCEEEEEEcCC-C----CCeeeccCcCcCCCCCEEEEEeCC
Confidence 689999999753 1 233333322222278888888865
No 23
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=84.37 E-value=1.8 Score=39.23 Aligned_cols=53 Identities=15% Similarity=0.237 Sum_probs=38.2
Q ss_pred CCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCc-cHHHHHHhH
Q psy2950 332 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSC-YSDWVKSIL 388 (406)
Q Consensus 332 ~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~-~~~WI~~~~ 388 (406)
..++|.|+||+|++.... +++||.+-+..--......-.+|+.+ ..+||++.+
T Consensus 197 ~~dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 197 DADTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred EecccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 457889999999998652 89999998876322233455666554 788998865
No 24
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=82.56 E-value=1 Score=47.67 Aligned_cols=32 Identities=34% Similarity=0.509 Sum_probs=23.5
Q ss_pred cccCCCCCeeEeeCC-CCcEEEEEEEEecCCCC
Q psy2950 163 SCQGDSGGPLACPLP-DGRYYLCGITSWGVGCA 194 (406)
Q Consensus 163 ~C~gdsGgPl~~~~~-~~~~~l~Gi~s~~~~C~ 194 (406)
.=.||||+||+..+. ..+|+|+|+++.+....
T Consensus 213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~ 245 (769)
T PF02395_consen 213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYN 245 (769)
T ss_dssp --TT-TT-EEEEEETTTTEEEEEEEEEEECCCC
T ss_pred cccCcCCCceEEEEccCCeEEEEEEEccccccC
Confidence 457999999998775 67999999999875543
No 25
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=82.49 E-value=6.5 Score=37.88 Aligned_cols=38 Identities=21% Similarity=0.131 Sum_probs=24.0
Q ss_pred cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcccc
Q psy2950 240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGR 282 (406)
Q Consensus 240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG~ 282 (406)
.+|+||||+..+- ..++-+-.......|+.+.+.|+..
T Consensus 124 ~~DlAvlkv~~~~-----~~~~~l~~s~~~~~G~~V~aiG~P~ 161 (351)
T TIGR02038 124 LTDLAVLKIEGDN-----LPTIPVNLDRPPHVGDVVLAIGNPY 161 (351)
T ss_pred CCCEEEEEecCCC-----CceEeccCcCccCCCCEEEEEeCCC
Confidence 6899999997532 2333343222222799999988753
No 26
>PRK10942 serine endoprotease; Provisional
Probab=80.03 E-value=4.9 Score=40.42 Aligned_cols=24 Identities=33% Similarity=0.314 Sum_probs=17.7
Q ss_pred cccCCCCCeeEeeCCCCcEEEEEEEEec
Q psy2950 163 SCQGDSGGPLACPLPDGRYYLCGITSWG 190 (406)
Q Consensus 163 ~C~gdsGgPl~~~~~~~~~~l~Gi~s~~ 190 (406)
.-.|.|||||+-... .++||.+..
T Consensus 230 i~~GnSGGpL~n~~G----eviGI~t~~ 253 (473)
T PRK10942 230 INRGNSGGALVNLNG----ELIGINTAI 253 (473)
T ss_pred cCCCCCcCccCCCCC----eEEEEEEEE
Confidence 346889999995432 799998753
No 27
>PRK10139 serine endoprotease; Provisional
Probab=79.63 E-value=9.8 Score=38.08 Aligned_cols=122 Identities=16% Similarity=0.147 Sum_probs=59.2
Q ss_pred cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcccccCCCCCccccceeeeeccCChhhhhhhhhccCCcCCC
Q psy2950 240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGWGRLSEGGSLPHILQAAEVPLTPKEECRRSYAVAGYSNYL 319 (406)
Q Consensus 240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~GwG~~~~~~~~~~~l~~~~v~~~~~~~C~~~~~~~~~~~~~ 319 (406)
.+||||||+..+- ...++.|-..+....|+.+.+.|.-.. . ......-.+..+... .... .-
T Consensus 137 ~~DlAvlkv~~~~----~l~~~~lg~s~~~~~G~~V~aiG~P~g-~----~~tvt~GivS~~~r~----~~~~-----~~ 198 (455)
T PRK10139 137 QSDIALLQIQNPS----KLTQIAIADSDKLRVGDFAVAVGNPFG-L----GQTATSGIISALGRS----GLNL-----EG 198 (455)
T ss_pred CCCEEEEEecCCC----CCceeEecCccccCCCCEEEEEecCCC-C----CCceEEEEEcccccc----ccCC-----CC
Confidence 6799999997532 344566654333337899888885321 1 111112222221110 0000 00
Q ss_pred CcceEEeecCCCCCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCC-CCCCCCeEEEeCCccHHHHHHhH
Q psy2950 320 NQCQVCTGTKQGGLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGC-ARPDFYGVYTLVSCYSDWVKSIL 388 (406)
Q Consensus 320 ~~~~~Ca~~~~~~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c-~~~~~p~vft~v~~~~~WI~~~~ 388 (406)
....+=+ ....-.|.|||||+.. +| .|+||.+....- +.....+...-+..-...+++.+
T Consensus 199 ~~~~iqt-----da~in~GnSGGpl~n~--~G--~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~ 259 (455)
T PRK10139 199 LENFIQT-----DASINRGNSGGALLNL--NG--ELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLI 259 (455)
T ss_pred cceEEEE-----CCccCCCCCcceEECC--CC--eEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHh
Confidence 0112222 1233468899999963 22 699999864321 11112345555544444455444
No 28
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=69.44 E-value=6.4 Score=34.67 Aligned_cols=28 Identities=25% Similarity=0.136 Sum_probs=23.9
Q ss_pred CCCcccCCCCCeeEeecCCCcEEEEEEEEeCCC
Q psy2950 332 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVG 364 (406)
Q Consensus 332 ~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~ 364 (406)
.-+.-+|-||+|++.++ +|+|-+++...
T Consensus 174 TGGIvqGMSGSPI~qdG-----KLiGAVthvf~ 201 (218)
T PF05580_consen 174 TGGIVQGMSGSPIIQDG-----KLIGAVTHVFV 201 (218)
T ss_pred hCCEEecccCCCEEECC-----EEEEEEEEEEe
Confidence 34678999999999988 89999998754
No 29
>PRK10942 serine endoprotease; Provisional
Probab=60.76 E-value=39 Score=34.04 Aligned_cols=37 Identities=24% Similarity=0.397 Sum_probs=23.6
Q ss_pred cCceEEEEeCCCcccCCCcccccCCCCCCcccCCcEEEEcc
Q psy2950 240 KNDIALLELTRPFKFNEFVSPICLPNPGLTVTADVGLISGW 280 (406)
Q Consensus 240 ~~DiALlkL~~~v~~~~~v~piclp~~~~~~~~~~~~~~Gw 280 (406)
..|+||||+..+-. ..++-|-..+....|+.+.+.|.
T Consensus 158 ~~DlAvlki~~~~~----l~~~~lg~s~~l~~G~~V~aiG~ 194 (473)
T PRK10942 158 RSDIALIQLQNPKN----LTAIKMADSDALRVGDYTVAIGN 194 (473)
T ss_pred CCCEEEEEecCCCC----CceeEecCccccCCCCEEEEEcC
Confidence 68999999964322 34555543333237888888774
No 30
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=59.32 E-value=10 Score=30.34 Aligned_cols=35 Identities=26% Similarity=0.326 Sum_probs=26.3
Q ss_pred cCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccH
Q psy2950 337 QGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYS 381 (406)
Q Consensus 337 ~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~ 381 (406)
.||-||+|.|+- =++||++-|- ..-..|++|+.+.
T Consensus 89 PGdCGg~L~C~H-----GViGi~Tagg-----~g~VaF~dir~~~ 123 (127)
T PF00947_consen 89 PGDCGGILRCKH-----GVIGIVTAGG-----EGHVAFADIRDLL 123 (127)
T ss_dssp TT-TCSEEEETT-----CEEEEEEEEE-----TTEEEEEECCCGS
T ss_pred CCCCCceeEeCC-----CeEEEEEeCC-----CceEEEEechhhh
Confidence 478899999986 4999998762 2257799998764
No 31
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=51.95 E-value=18 Score=29.00 Aligned_cols=34 Identities=26% Similarity=0.356 Sum_probs=25.5
Q ss_pred cCCCCCeeEeeCCCCcEEEEEEEEecCCCCCCCCCeeEEEeeec
Q psy2950 165 QGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCY 208 (406)
Q Consensus 165 ~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~~~~~p~~~t~v~~~ 208 (406)
.||-||+|.|+- =++||++.|. .....|++++.+
T Consensus 89 PGdCGg~L~C~H-----GViGi~Tagg-----~g~VaF~dir~~ 122 (127)
T PF00947_consen 89 PGDCGGILRCKH-----GVIGIVTAGG-----EGHVAFADIRDL 122 (127)
T ss_dssp TT-TCSEEEETT-----CEEEEEEEEE-----TTEEEEEECCCG
T ss_pred CCCCCceeEeCC-----CeEEEEEeCC-----CceEEEEechhh
Confidence 489999999987 4999998872 223578888765
No 32
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=34.05 E-value=37 Score=33.24 Aligned_cols=46 Identities=24% Similarity=0.274 Sum_probs=33.2
Q ss_pred CCCcccCCCCCeeEeecCCCcEEEEEEEEeCCCCCCCCCCeEEEeCCccHHHHHHhH
Q psy2950 332 GLDSCQGDSGGPLACPLPDGRYYLCGITSWGVGCARPDFYGVYTLVSCYSDWVKSIL 388 (406)
Q Consensus 332 ~~~~C~gDsGgPl~~~~~~~~~~l~Gi~S~~~~c~~~~~p~vft~v~~~~~WI~~~~ 388 (406)
..+.-+|-||+|++.++ .|+|-++.-.--+....+++ |.+|+.+..
T Consensus 354 tgGivqGMSGSPi~q~g-----kliGAvtHVfvndpt~GYGi------~ie~Ml~~~ 399 (402)
T TIGR02860 354 TGGIVQGMSGSPIIQNG-----KVIGAVTHVFVNDPTSGYGV------YIEWMLKEA 399 (402)
T ss_pred hCCEEecccCCCEEECC-----EEEEEEEEEEecCCCcceee------hHHHHHHHh
Confidence 34778999999999988 89999987654333233444 578887753
No 33
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=31.98 E-value=51 Score=30.15 Aligned_cols=22 Identities=36% Similarity=0.588 Sum_probs=15.8
Q ss_pred cCCCCCeeEeecCCCcEEEEEEEEeC
Q psy2950 337 QGDSGGPLACPLPDGRYYLCGITSWG 362 (406)
Q Consensus 337 ~gDsGgPl~~~~~~~~~~l~Gi~S~~ 362 (406)
.||||+|++.+. + .|+||-+-+
T Consensus 207 ~GDSGSPVVt~d--g--~liGVHTGS 228 (297)
T PF05579_consen 207 PGDSGSPVVTED--G--DLIGVHTGS 228 (297)
T ss_dssp GGCTT-EEEETT--C---EEEEEEEE
T ss_pred CCCCCCccCcCC--C--CEEEEEecC
Confidence 489999999864 3 599998654
No 34
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=29.60 E-value=48 Score=25.46 Aligned_cols=18 Identities=50% Similarity=0.898 Sum_probs=13.4
Q ss_pred cCCCCCeeEeecCCCcEEEEEE
Q psy2950 337 QGDSGGPLACPLPDGRYYLCGI 358 (406)
Q Consensus 337 ~gDsGgPl~~~~~~~~~~l~Gi 358 (406)
.|.|||||+-. ++ .++||
T Consensus 103 ~G~SGgpv~~~--~G--~vvGi 120 (120)
T PF13365_consen 103 PGSSGGPVFDS--DG--RVVGI 120 (120)
T ss_dssp TTTTTSEEEET--TS--EEEEE
T ss_pred CCcEeHhEECC--CC--EEEeC
Confidence 58899999762 44 58886
No 35
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=24.15 E-value=95 Score=26.46 Aligned_cols=26 Identities=31% Similarity=0.482 Sum_probs=19.3
Q ss_pred ccCCCCCeeEeecCCCcEEEEEEEEeC
Q psy2950 336 CQGDSGGPLACPLPDGRYYLCGITSWG 362 (406)
Q Consensus 336 C~gDsGgPl~~~~~~~~~~l~Gi~S~~ 362 (406)
-.|+=||||+.+. .+...++||-.-|
T Consensus 145 ~~G~CG~~l~~~~-~~~~~i~GiHvaG 170 (172)
T PF00548_consen 145 KPGMCGSPLVSRI-GGQGKIIGIHVAG 170 (172)
T ss_dssp ETTGTTEEEEESC-GGTTEEEEEEEEE
T ss_pred CCCccCCeEEEee-ccCccEEEEEecc
Confidence 3456699999965 4566899997654
No 36
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=23.73 E-value=79 Score=25.71 Aligned_cols=25 Identities=36% Similarity=0.768 Sum_probs=17.3
Q ss_pred cCCCCCeeEeeCCCCcEEEEEEEEecCCCC
Q psy2950 165 QGDSGGPLACPLPDGRYYLCGITSWGVGCA 194 (406)
Q Consensus 165 ~gdsGgPl~~~~~~~~~~l~Gi~s~~~~C~ 194 (406)
.|.||+|++|... ..+||.... -|.
T Consensus 107 kGSSGgPiLC~~G----H~vG~f~aa-~~t 131 (148)
T PF02907_consen 107 KGSSGGPILCPSG----HAVGMFRAA-VCT 131 (148)
T ss_dssp TT-TT-EEEETTS----EEEEEEEEE-EEE
T ss_pred ecCCCCcccCCCC----CEEEEEEEE-EEc
Confidence 7899999999874 788877653 443
No 37
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families of serine protease (denoted S1 - S66) have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes. Togavirin is only active while part of the polyprotein: cleavage at a Trp-Ser bond results in a total lack of activity. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=21.32 E-value=41 Score=27.26 Aligned_cols=24 Identities=33% Similarity=0.494 Sum_probs=16.9
Q ss_pred cCCCCCeeEeecCCCcEEEEEEEEeCCC
Q psy2950 337 QGDSGGPLACPLPDGRYYLCGITSWGVG 364 (406)
Q Consensus 337 ~gDsGgPl~~~~~~~~~~l~Gi~S~~~~ 364 (406)
.||||.|++-+ .| .+|||+--|..
T Consensus 105 ~GDSGRpi~DN--sG--rVVaIVLGG~n 128 (158)
T PF00944_consen 105 PGDSGRPIFDN--SG--RVVAIVLGGAN 128 (158)
T ss_dssp TTSTTEEEEST--TS--BEEEEEEEEEE
T ss_pred CCCCCCccCcC--CC--CEEEEEecCCC
Confidence 58999999853 33 47888865543
Done!