Query psy6524
Match_columns 457
No_of_seqs 281 out of 1574
Neff 7.1
Searched_HMMs 46136
Date Fri Aug 16 20:17:43 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy6524.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/6524hhsearch_cdd -cpu 12 -v 0
 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd00190 Tryp_SPc Trypsin-like  100.0 4.2E-40 9.2E-45  311.9  24.5  228    30-448  1-232 (232)
  2 KOG3627|consensus              100.0 6.8E-38 1.5E-42  304.9  25.9  237    26-450  9-255 (256)
  3 smart00020 Tryp_SPc Trypsin-li 100.0 1.1E-36 2.4E-41  289.2  23.2  225    29-445  1-229 (229)
  4 PF00089 Trypsin: Trypsin; In   100.0 2.3E-34   5E-39  270.7  23.0  217    30-445  1-220 (220)
  5 COG5640 Secreted trypsin-like  100.0   3E-28 6.4E-33  238.8  15.0   92   360-451  184-280 (413)
  6 PF03761 DUF316: Domain of unk   99.6 6.4E-14 1.4E-18  138.9  17.8   56     19-74  29-90 (282)
  7 KOG3627|consensus               99.3 1.5E-11 3.2E-16  119.6  10.6  140   102-350  87-229 (256)
  8 PF09342 DUF1986: Domain of un   99.2 6.5E-11 1.4E-15  112.1   9.5   99    38-144  13-113 (267)
  9 cd00190 Tryp_SPc Trypsin-like   99.0 1.7E-09 3.6E-14  102.1   9.4   62   101-169  70-131 (232)
 10 smart00020 Tryp_SPc Trypsin-li  98.7 4.8E-08   1E-12   92.4   9.4   61   102-169  71-131 (229)
 11 COG3591 V8-like Glu-specific e  98.5 2.8E-06 6.2E-11   82.3  13.4   54   394-450  197-251 (251)
 12 PF00089 Trypsin: Trypsin; In    98.4 1.3E-06 2.7E-11   81.8   8.6   60   102-168  69-128 (220)
 13 TIGR02037 degP_htrA_DO peripla  97.6  0.0013 2.9E-08   69.3  15.2   38   223-264  105-142 (428)
 14 TIGR02038 protease_degS peripl  97.3   0.011 2.3E-07   60.9  16.4   42     41-84  55-108 (351)
 15 PRK10898 serine endoprotease;   97.0   0.029 6.3E-07   57.7  17.0   42     41-84  55-108 (353)
 16 PRK10139 serine endoprotease;   96.9    0.03 6.6E-07   59.6  15.9   31     53-85  90-122 (455)
 17 PF13365 Trypsin_2: Trypsin-li   96.9   0.003 6.4E-08   53.3   6.8   21     55-75  1-22 (120)
 18 PRK10942 serine endoprotease;   96.6   0.046   1E-06   58.5  14.8   31     53-85  111-143 (473)
 19 PF09342 DUF1986: Domain of un   93.9    0.15 3.3E-06   49.2   6.6   76   181-264  55-131 (267)
 20 PF02395 Peptidase_S6: Immunog   82.2     1.2 2.5E-05   50.5   3.5   31   397-427  213-245 (769)
 21 PF00947 Pico_P2A: Picornaviru   71.8     2.8 6.1E-05   36.6   2.1   38   398-445  88-125 (127)
 22 COG5640 Secreted trypsin-like   68.2      12 0.00025   38.5   5.9  156    27-291  30-200 (413)
 23 PF00548 Peptidase_C3: 3C cyst   41.0 1.8E+02  0.0039   26.6   8.6   28   397-424  144-171 (172)
 24 PF13365 Trypsin_2: Trypsin-li   37.0      26 0.00055   28.8   2.2   20   397-419  101-120 (120)
 25 PF03761 DUF316: Domain of unk   32.7      61  0.0013   31.7   4.5   52   394-447  225-277 (282)
 26 PF05579 Peptidase_S32: Equine   30.4      50  0.0011   32.7   3.2   23   399-424  207-229 (297)
 27 PF02907 Peptidase_S29: Hepati   30.2      38 0.00083   30.0   2.2   23   398-423  106-128 (148)
 28 PF00863 Peptidase_C4: Peptida   28.1 5.7E+02   0.012   24.9  12.2   40   396-440  147-186 (235)
 29 KOG0276|consensus               26.4 2.3E+02   0.005   31.5   7.6   20     31-51  16-35 (794)
 30 PF10459 Peptidase_S46: Peptid   25.9      62  0.0013   36.6   3.4   29     44-74  39-69 (698)
 31 PF05580 Peptidase_S55: SpoIVB   21.1 1.1E+02  0.0023   29.5   3.4   26   395-424  175-200 (218)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=4.2e-40 Score=311.85 Aligned_cols=228 Identities=49% Similarity=0.905 Sum_probs=189.1
Q ss_pred eecCeecCCCCCCeEEEEEeC-CeeEEEEEEEeCCeeeecccccccccCccEEEEEccccCCCcccCCCCceeeeeeEEE
Q psy6524 30 IVGGRPTGVNKYPWVARLVYD-GNFHCGASLINEDYVLTAAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMMRAVSSIV 108 (457)
Q Consensus 30 i~~G~~~~~~~~Pw~v~i~~~-~~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~v~~G~~~~~~~~~~~~~~~~~~v~~i~ 108 (457)
|+||+++..++|||+|.|+.. ..+.|+||||+++||||||||+.+.....+.|++|........
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~--------------- 65 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNE--------------- 65 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCC---------------
Confidence 689999999999999999987 7889999999999999999999865567788888865544211
Q ss_pred ecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeeeccCCCceeeeccceEEEeC
Q psy6524 109 RHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLVSSKIRIVLG 188 (457)
Q Consensus 109 ~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~~~~~~~~~~~~~~~~~~~g 188 (457)
T Consensus 66 -------------------------------------------------------------------------------- 65 (232)
T cd00190 66 -------------------------------------------------------------------------------- 65 (232)
T ss_pred --------------------------------------------------------------------------------
Confidence
Q ss_pred cccCCcccccCccceeeeeeEEEecCCCCCCCCCcceEEEeeCCCcccCCCccccccCCCC-CCCCCCeEEEEecccccC
Q psy6524 189 DYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDN-IDPSGKMGTVVGWGRTSE 267 (457)
Q Consensus 189 ~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~~DiAllkL~~~~~~s~~v~PicLp~~~-~~~~~~~~~~~Gwg~~~~ 267 (457)
...+.+.|+++++||+|+.....+|||||+|++++.++++++|||||... ....+..+.+.|||....
T Consensus 66 -----------~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~ 134 (232)
T cd00190 66 -----------GGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSE 134 (232)
T ss_pred -----------CceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCC
Confidence 02345678889999999888888999999999999999999999999886 334678899999998754
Q ss_pred CCCCcccceeccccccCchhccccccCCCCCCCCceecCCCCCCcccCCCCCCCCCCccccccccccCCCCCCceeEecc
Q psy6524 268 GGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRGEMDSCQDLAPRRPTESHLHFHFLSTDIDPSGKMGTVVGW 347 (457)
Q Consensus 268 ~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Cag~~~~~~C~~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 347 (457)
....+
T Consensus 135 ~~~~~--------------------------------------------------------------------------- 139 (232)
T cd00190 135 GGPLP--------------------------------------------------------------------------- 139 (232)
T ss_pred CCCCC---------------------------------------------------------------------------
Confidence 32111
Q ss_pred ccccCCCCccccceEeEeeecChhhhcccccCCCCCCCCeEEeecC--CCCCCcCCCCCceEEeeCCcEEEEEEEEecCC
Q psy6524 348 GRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRG--EMDSCQGDSGGPLIINDVGRYELVGIVSWGVG 425 (457)
Q Consensus 348 g~~~~~~~~~~~l~~~~~~~~s~~~C~~~~~~~~~i~~~~lCa~~~--~~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~~ 425 (457)
..++...+.+++...|...+.....+.+.++|+... ....|.||+||||++..+++++|+||+|++..
T Consensus 140 ----------~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 140 ----------DVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG 209 (232)
T ss_pred ----------ceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc
Confidence 237788888888888887643324678899999833 67899999999999998899999999999998
Q ss_pred CCCCCCCeEEEeCcccHHHHHHH
Q psy6524 426 CGRPGYPGVYTRVNRYLSWVKRN 448 (457)
Q Consensus 426 C~~~~~p~vyt~V~~~~dWI~~~ 448 (457)
|.....|.+||+|..|++||+++
T Consensus 210 c~~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 210 CARPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred cCCCCCCCEEEEcHHhhHHhhcC
Confidence 98767899999999999999864
No 2
>KOG3627|consensus
Probab=100.00 E-value=6.8e-38 Score=304.85 Aligned_cols=237 Identities=46% Similarity=0.897 Sum_probs=186.9
Q ss_pred ccceeecCeecCCCCCCeEEEEEeCC--eeEEEEEEEeCCeeeecccccccc-cCccEEEEEccccCCCcccCCCCceee
Q psy6524 26 QEVRIVGGRPTGVNKYPWVARLVYDG--NFHCGASLINEDYVLTAAHCVRRL-KRSKIRIVLGDYDQSVTTETAEPTMMR 102 (457)
Q Consensus 26 ~~~ri~~G~~~~~~~~Pw~v~i~~~~--~~~C~GtLIs~~~VLTAAhCv~~~-~~~~~~v~~G~~~~~~~~~~~~~~~~~ 102 (457)
...||+||.++.+++|||+|+|.... .++|+|+||+++||||||||+... .. .+.|++|.+........
T Consensus 9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~-~~~V~~G~~~~~~~~~~------- 80 (256)
T KOG3627|consen 9 PEGRIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASAS-LYTVRLGEHDINLSVSE------- 80 (256)
T ss_pred ccCCEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCc-ceEEEECcccccccccc-------
Confidence 46799999999999999999999876 789999999999999999999763 22 77777776544432110
Q ss_pred eeeEEEecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeeeccCCCceeeeccc
Q psy6524 103 AVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLVSSK 182 (457)
Q Consensus 103 ~v~~i~~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~~~~~~~~~~~~~~ 182 (457)
T Consensus 81 -------------------------------------------------------------------------------- 80 (256)
T KOG3627|consen 81 -------------------------------------------------------------------------------- 80 (256)
T ss_pred --------------------------------------------------------------------------------
Confidence
Q ss_pred eEEEeCcccCCcccccCccceeeeeeEEEecCCCCCCCCC-cceEEEeeCCCcccCCCccccccCCCCC---CCCCCeEE
Q psy6524 183 IRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYN-HDIALLKLRKPVSFTKSVRPICLPPDNI---DPSGKMGT 258 (457)
Q Consensus 183 ~~~~~g~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~-~DiAllkL~~~~~~s~~v~PicLp~~~~---~~~~~~~~ 258 (457)
........|.++++||+|+..... ||||||+|..++.|++.|+|||||.... ...+..+.
T Consensus 81 ----------------~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~ 144 (256)
T KOG3627|consen 81 ----------------GEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCL 144 (256)
T ss_pred ----------------CchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEE
Confidence 000123346677799999988877 9999999999999999999999985554 33558888
Q ss_pred EEecccccCCCCCcccceeccccccCchhccccccCCCCCCCCceecCCCCCCcccCCCCCCCCCCccccccccccCCCC
Q psy6524 259 VVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRGEMDSCQDLAPRRPTESHLHFHFLSTDIDPS 338 (457)
Q Consensus 259 ~~Gwg~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Cag~~~~~~C~~~s~~~~~~~~~~~~~~~~~~~~~ 338 (457)
++|||.+.....
T Consensus 145 v~GWG~~~~~~~-------------------------------------------------------------------- 156 (256)
T KOG3627|consen 145 VSGWGRTESGGG-------------------------------------------------------------------- 156 (256)
T ss_pred EEeCCCcCCCCC--------------------------------------------------------------------
Confidence 999988765411
Q ss_pred CCceeEeccccccCCCCccccceEeEeeecChhhhcccccCCCCCCCCeEEee--cCCCCCCcCCCCCceEEeeCCcEEE
Q psy6524 339 GKMGTVVGWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAG--RGEMDSCQGDSGGPLIINDVGRYEL 416 (457)
Q Consensus 339 ~~~~~~~~~g~~~~~~~~~~~l~~~~~~~~s~~~C~~~~~~~~~i~~~~lCa~--~~~~~~C~gDsGgPLv~~~~~~~~L 416 (457)
..+..|+++++++++...|...+.....+.+.+||++ ....++|.|||||||++..+++++|
T Consensus 157 ----------------~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~ 220 (256)
T KOG3627|consen 157 ----------------PLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVL 220 (256)
T ss_pred ----------------CCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEE
Confidence 1122367777777887778776544334666789998 4667789999999999998779999
Q ss_pred EEEEEecCC-CCCCCCCeEEEeCcccHHHHHHHhh
Q psy6524 417 VGIVSWGVG-CGRPGYPGVYTRVNRYLSWVKRNMK 450 (457)
Q Consensus 417 ~GI~S~~~~-C~~~~~p~vyt~V~~~~dWI~~~i~ 450 (457)
+||+|||.. |.....|++||+|+.|.+||++.+.
T Consensus 221 ~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 221 VGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKENIG 255 (256)
T ss_pred EEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHHhc
Confidence 999999987 9988899999999999999999875
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=1.1e-36 Score=289.23 Aligned_cols=225 Identities=52% Similarity=0.941 Sum_probs=185.3
Q ss_pred eeecCeecCCCCCCeEEEEEeCC-eeEEEEEEEeCCeeeecccccccccCccEEEEEccccCCCcccCCCCceeeeeeEE
Q psy6524 29 RIVGGRPTGVNKYPWVARLVYDG-NFHCGASLINEDYVLTAAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMMRAVSSI 107 (457)
Q Consensus 29 ri~~G~~~~~~~~Pw~v~i~~~~-~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~v~~G~~~~~~~~~~~~~~~~~~v~~i 107 (457)
||+||+++.+++|||+|.|+... .+.|+||||++++|||||||+.+.....+.|++|..+.....
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~-------------- 66 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGE-------------- 66 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCC--------------
Confidence 68999999999999999999886 789999999999999999999865556788888866543211
Q ss_pred EecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeeeccCCCceeeeccceEEEe
Q psy6524 108 VRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLVSSKIRIVL 187 (457)
Q Consensus 108 ~~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~~~~~~~~~~~~~~~~~~~ 187 (457)
T Consensus 67 -------------------------------------------------------------------------------- 66 (229)
T smart00020 67 -------------------------------------------------------------------------------- 66 (229)
T ss_pred --------------------------------------------------------------------------------
Confidence
Q ss_pred CcccCCcccccCccceeeeeeEEEecCCCCCCCCCcceEEEeeCCCcccCCCccccccCCCC-CCCCCCeEEEEeccccc
Q psy6524 188 GDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDN-IDPSGKMGTVVGWGRTS 266 (457)
Q Consensus 188 g~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~~DiAllkL~~~~~~s~~v~PicLp~~~-~~~~~~~~~~~Gwg~~~ 266 (457)
....+.|..++.||+|+.....+|+|||+|++|+.+++.++|+|||... ....+..+.+.|||...
T Consensus 67 -------------~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~ 133 (229)
T smart00020 67 -------------EGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTS 133 (229)
T ss_pred -------------CceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCC
Confidence 0134678888899999888888999999999999999999999999873 33467889999999865
Q ss_pred CCCCCcccceeccccccCchhccccccCCCCCCCCceecCCCCCCcccCCCCCCCCCCccccccccccCCCCCCceeEec
Q psy6524 267 EGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRGEMDSCQDLAPRRPTESHLHFHFLSTDIDPSGKMGTVVG 346 (457)
Q Consensus 267 ~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Cag~~~~~~C~~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 346 (457)
...
T Consensus 134 ~~~----------------------------------------------------------------------------- 136 (229)
T smart00020 134 EGA----------------------------------------------------------------------------- 136 (229)
T ss_pred CCC-----------------------------------------------------------------------------
Confidence 311
Q ss_pred cccccCCCCccccceEeEeeecChhhhcccccCCCCCCCCeEEeecC--CCCCCcCCCCCceEEeeCCcEEEEEEEEecC
Q psy6524 347 WGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRG--EMDSCQGDSGGPLIINDVGRYELVGIVSWGV 424 (457)
Q Consensus 347 ~g~~~~~~~~~~~l~~~~~~~~s~~~C~~~~~~~~~i~~~~lCa~~~--~~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~ 424 (457)
+.....++...+.+++...|...+.....+.+.++|++.. ....|.||+||||++..+ +|+|+||+|++.
T Consensus 137 -------~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~-~~~l~Gi~s~g~ 208 (229)
T smart00020 137 -------GSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDG-RWVLVGIVSWGS 208 (229)
T ss_pred -------CcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECC-CEEEEEEEEECC
Confidence 0111237788888999999987653333578899999843 578999999999999887 999999999999
Q ss_pred CCCCCCCCeEEEeCcccHHHH
Q psy6524 425 GCGRPGYPGVYTRVNRYLSWV 445 (457)
Q Consensus 425 ~C~~~~~p~vyt~V~~~~dWI 445 (457)
.|.....|.+|++|..|++||
T Consensus 209 ~C~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 209 GCARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred CCCCCCCCCEEEEeccccccC
Confidence 998677899999999999998
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00 E-value=2.3e-34 Score=270.72 Aligned_cols=217 Identities=47% Similarity=0.933 Sum_probs=179.4
Q ss_pred eecCeecCCCCCCeEEEEEeCC-eeEEEEEEEeCCeeeecccccccccCccEEEEEccccCCCcccCCCCceeeeeeEEE
Q psy6524 30 IVGGRPTGVNKYPWVARLVYDG-NFHCGASLINEDYVLTAAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMMRAVSSIV 108 (457)
Q Consensus 30 i~~G~~~~~~~~Pw~v~i~~~~-~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~v~~G~~~~~~~~~~~~~~~~~~v~~i~ 108 (457)
|.||.++.+++|||+|.|.... .++|+|+||+++||||||||+.. ...+.+++|.........
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~--~~~~~v~~g~~~~~~~~~-------------- 64 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDG--ASDIKVRLGTYSIRNSDG-------------- 64 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTS--GGSEEEEESESBTTSTTT--------------
T ss_pred CCCCEECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccc--cccccccccccccccccc--------------
Confidence 7899999999999999999987 89999999999999999999975 567888888632221111
Q ss_pred ecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeeeccCCCceeeeccceEEEeC
Q psy6524 109 RHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLVSSKIRIVLG 188 (457)
Q Consensus 109 ~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~~~~~~~~~~~~~~~~~~~g 188 (457)
T Consensus 65 -------------------------------------------------------------------------------- 64 (220)
T PF00089_consen 65 -------------------------------------------------------------------------------- 64 (220)
T ss_dssp --------------------------------------------------------------------------------
T ss_pred --------------------------------------------------------------------------------
Confidence
Q ss_pred cccCCcccccCccceeeeeeEEEecCCCCCCCCCcceEEEeeCCCcccCCCccccccCCCCC-CCCCCeEEEEecccccC
Q psy6524 189 DYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDNI-DPSGKMGTVVGWGRTSE 267 (457)
Q Consensus 189 ~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~~DiAllkL~~~~~~s~~v~PicLp~~~~-~~~~~~~~~~Gwg~~~~ 267 (457)
..+.+.|++++.||+|+.....+|+|||+|++++.+.+.++|+||+.... ...+..+.+.||+....
T Consensus 65 ------------~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~ 132 (220)
T PF00089_consen 65 ------------SEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSD 132 (220)
T ss_dssp ------------TSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSST
T ss_pred ------------cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 13466788888999998888889999999999999999999999998443 35788899999998644
Q ss_pred CCCCcccceeccccccCchhccccccCCCCCCCCceecCCCCCCcccCCCCCCCCCCccccccccccCCCCCCceeEecc
Q psy6524 268 GGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGRGEMDSCQDLAPRRPTESHLHFHFLSTDIDPSGKMGTVVGW 347 (457)
Q Consensus 268 ~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Cag~~~~~~C~~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 347 (457)
.. .
T Consensus 133 ~~-~---------------------------------------------------------------------------- 135 (220)
T PF00089_consen 133 NG-Y---------------------------------------------------------------------------- 135 (220)
T ss_dssp TS-B----------------------------------------------------------------------------
T ss_pred cc-c----------------------------------------------------------------------------
Confidence 32 1
Q ss_pred ccccCCCCccccceEeEeeecChhhhcccccCCCCCCCCeEEeec-CCCCCCcCCCCCceEEeeCCcEEEEEEEEecCCC
Q psy6524 348 GRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAGR-GEMDSCQGDSGGPLIINDVGRYELVGIVSWGVGC 426 (457)
Q Consensus 348 g~~~~~~~~~~~l~~~~~~~~s~~~C~~~~~~~~~i~~~~lCa~~-~~~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~~C 426 (457)
+..++...+.+++...|... ....+.+.++|+.. ...+.|.|||||||++.++ +|+||.+++..|
T Consensus 136 ---------~~~~~~~~~~~~~~~~c~~~--~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~---~lvGI~s~~~~c 201 (220)
T PF00089_consen 136 ---------SSNLQSVTVPVVSRKTCRSS--YNDNLTPNMICAGSSGSGDACQGDSGGPLICNNN---YLVGIVSFGENC 201 (220)
T ss_dssp ---------TSBEEEEEEEEEEHHHHHHH--TTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTE---EEEEEEEEESSS
T ss_pred ---------cccccccccccccccccccc--ccccccccccccccccccccccccccccccccee---eecceeeecCCC
Confidence 12377788888888899875 23347889999984 5578999999999999875 799999999999
Q ss_pred CCCCCCeEEEeCcccHHHH
Q psy6524 427 GRPGYPGVYTRVNRYLSWV 445 (457)
Q Consensus 427 ~~~~~p~vyt~V~~~~dWI 445 (457)
.....|.+|++|+.|++||
T Consensus 202 ~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 202 GSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp SBTTSEEEEEEGGGGHHHH
T ss_pred CCCCcCEEEEEHHHhhccC
Confidence 9887899999999999999
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95 E-value=3e-28 Score=238.82 Aligned_cols=92 Identities=37% Similarity=0.664 Sum_probs=71.2
Q ss_pred ceEeEeeecChhhhccccc----CCCCCCCCeEEeecCCCCCCcCCCCCceEEeeCCcEEEEEEEEecCC-CCCCCCCeE
Q psy6524 360 ALEVQVPILSPGQCRAMKY----KPSRITPNMLCAGRGEMDSCQGDSGGPLIINDVGRYELVGIVSWGVG-CGRPGYPGV 434 (457)
Q Consensus 360 l~~~~~~~~s~~~C~~~~~----~~~~i~~~~lCa~~~~~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~~-C~~~~~p~v 434 (457)
+.++.+...+..+|...+. ......-.-+|++...+++|+||||||++.+.+...+++||+|||.+ |+.+..|.|
T Consensus 184 l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gV 263 (413)
T COG5640 184 LHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPPKDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGV 263 (413)
T ss_pred eeeeeeeeechHHhhhhccccccCCCCCCccceecCCCCcccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCccee
Confidence 5555555555555554331 01112222399997779999999999999999888899999999985 999999999
Q ss_pred EEeCcccHHHHHHHhhc
Q psy6524 435 YTRVNRYLSWVKRNMKD 451 (457)
Q Consensus 435 yt~V~~~~dWI~~~i~~ 451 (457)
||+|+.|.+||..+|+.
T Consensus 264 yT~vsny~~WI~a~~~~ 280 (413)
T COG5640 264 YTNVSNYQDWIAAMTNG 280 (413)
T ss_pred EEehhHHHHHHHHHhcC
Confidence 99999999999998764
No 6
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.58 E-value=6.4e-14 Score=138.88 Aligned_cols=56 Identities=25% Similarity=0.585 Sum_probs=46.6
Q ss_pred CCCCCCC--ccceeecCeecCCCCCCeEEEEEeCC----eeEEEEEEEeCCeeeeccccccc
Q psy6524 19 LECGVTN--QEVRIVGGRPTGVNKYPWVARLVYDG----NFHCGASLINEDYVLTAAHCVRR 74 (457)
Q Consensus 19 ~~cg~~~--~~~ri~~G~~~~~~~~Pw~v~i~~~~----~~~C~GtLIs~~~VLTAAhCv~~ 74 (457)
..||+.. .+.++.+|..+..++.||+|.+...+ .++++|||||+||||||+||+..
T Consensus 29 ~~CG~~~~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~ 90 (282)
T PF03761_consen 29 ETCGKKKLPYPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMN 90 (282)
T ss_pred HhcCCCCCCCcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEe
Confidence 4688443 45568999999999999999998754 35689999999999999999974
No 7
>KOG3627|consensus
Probab=99.28 E-value=1.5e-11 Score=119.63 Aligned_cols=140 Identities=30% Similarity=0.512 Sum_probs=97.7
Q ss_pred eeeeEEEecCcCCCCCCc-cceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeeeccCCCceeeec
Q psy6524 102 RAVSSIVRHRHFDVNNYN-HDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLVS 180 (457)
Q Consensus 102 ~~v~~i~~h~~y~~~~~~-~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~~~~~~~~~~~~ 180 (457)
..+.+++.||+|+..+.. ||||||+|..++.|+++|+|||||.+... ..........++|||....+..
T Consensus 87 ~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~-----~~~~~~~~~~v~GWG~~~~~~~----- 156 (256)
T KOG3627|consen 87 GDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADP-----YFPPGGTTCLVSGWGRTESGGG----- 156 (256)
T ss_pred ceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCccc-----CCCCCCCEEEEEeCCCcCCCCC-----
Confidence 345578899999999888 99999999999999999999999854432 1111223445788873111100
Q ss_pred cceEEEeCcccCCcccccCccceeeeeeEEEecCCCCCCCCCcceEEEeeCCCcccCCCccccccCCCCCCCCCCeEEEE
Q psy6524 181 SKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDNIDPSGKMGTVV 260 (457)
Q Consensus 181 ~~~~~~~g~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~~DiAllkL~~~~~~s~~v~PicLp~~~~~~~~~~~~~~ 260 (457)
..+
T Consensus 157 ------------------------------------------------------~~~----------------------- 159 (256)
T KOG3627|consen 157 ------------------------------------------------------PLP----------------------- 159 (256)
T ss_pred ------------------------------------------------------CCC-----------------------
Confidence 000
Q ss_pred ecccccCCCCCcccceeccccccCchhccccccCCCCCCCCceecC--CCCCCcccCCCCCCCCCCccccccccccCCCC
Q psy6524 261 GWGRTSEGGSLATEALEVQVPILSPGQCRAMKYKPSRITPNMLCAG--RGEMDSCQDLAPRRPTESHLHFHFLSTDIDPS 338 (457)
Q Consensus 261 Gwg~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~~Cag--~~~~~~C~~~s~~~~~~~~~~~~~~~~~~~~~ 338 (457)
..|+++++++++..+|+..+.....+++.|+||+ .+++++|+|||++++.-.... .
T Consensus 160 ------------~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~----------~ 217 (256)
T KOG3627|consen 160 ------------DTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNG----------R 217 (256)
T ss_pred ------------ceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCC----------c
Confidence 1133456778888889988776656778899999 577889999999885555432 3
Q ss_pred CCceeEeccccc
Q psy6524 339 GKMGTVVGWGRT 350 (457)
Q Consensus 339 ~~~~~~~~~g~~ 350 (457)
...+.++.||..
T Consensus 218 ~~~~GivS~G~~ 229 (256)
T KOG3627|consen 218 WVLVGIVSWGSG 229 (256)
T ss_pred EEEEEEEEecCC
Confidence 456677788765
No 8
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=99.20 E-value=6.5e-11 Score=112.06 Aligned_cols=99 Identities=23% Similarity=0.538 Sum_probs=81.6
Q ss_pred CCCCCeEEEEEeCCeeEEEEEEEeCCeeeeccccccccc--CccEEEEEccccCCCcccCCCCceeeeeeEEEecCcCCC
Q psy6524 38 VNKYPWVARLVYDGNFHCGASLINEDYVLTAAHCVRRLK--RSKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDV 115 (457)
Q Consensus 38 ~~~~Pw~v~i~~~~~~~C~GtLIs~~~VLTAAhCv~~~~--~~~~~v~~G~~~~~~~~~~~~~~~~~~v~~i~~h~~y~~ 115 (457)
.-.|||.|.|+.++.+.|+|+||.+.|||++..|+.+.+ ..-+.+++|.......- .....|++.|..+..-
T Consensus 13 ~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v-~Gp~EQI~rVD~~~~V----- 86 (267)
T PF09342_consen 13 DYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSV-DGPHEQISRVDCFKDV----- 86 (267)
T ss_pred cccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceeccc-CCChheEEEeeeeeec-----
Confidence 346999999999999999999999999999999998744 46678999987754433 2355677777766543
Q ss_pred CCCccceEEEEecCccccCCCeeeeeCCC
Q psy6524 116 NNYNHDIALLKLRKPVSFTKSVRPICLPP 144 (457)
Q Consensus 116 ~~~~~DIaLl~L~~~v~~~~~v~picl~~ 144 (457)
.+.+++||.|++|+.|+++|+|..||.
T Consensus 87 --~~S~v~LLHL~~~~~fTr~VlP~flp~ 113 (267)
T PF09342_consen 87 --PESNVLLLHLEQPANFTRYVLPTFLPE 113 (267)
T ss_pred --cccceeeeeecCcccceeeeccccccc
Confidence 356899999999999999999999985
No 9
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.99 E-value=1.7e-09 Score=102.14 Aligned_cols=62 Identities=34% Similarity=0.732 Sum_probs=51.0
Q ss_pred eeeeeEEEecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeee
Q psy6524 101 MRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGT 169 (457)
Q Consensus 101 ~~~v~~i~~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~ 169 (457)
.+.|.++++||+|+.....+|||||+|++|+.++.+++|||||.+.. ...-++.+.++|||.
T Consensus 70 ~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~-------~~~~~~~~~~~G~g~ 131 (232)
T cd00190 70 VIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGY-------NLPAGTTCTVSGWGR 131 (232)
T ss_pred EEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccc-------cCCCCCEEEEEeCCc
Confidence 45688999999999999999999999999999999999999997641 122345667888883
No 10
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=98.72 E-value=4.8e-08 Score=92.43 Aligned_cols=61 Identities=36% Similarity=0.726 Sum_probs=49.0
Q ss_pred eeeeEEEecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeee
Q psy6524 102 RAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGT 169 (457)
Q Consensus 102 ~~v~~i~~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~ 169 (457)
+.|.+++.||+|+.....+|||||+|++|+.++.+++|||||.... ...-++.+.++|||.
T Consensus 71 ~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~-------~~~~~~~~~~~g~g~ 131 (229)
T smart00020 71 IKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNY-------NVPAGTTCTVSGWGR 131 (229)
T ss_pred EeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCccc-------ccCCCCEEEEEeCCC
Confidence 4578899999999989999999999999999999999999997521 112234567788873
No 11
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.46 E-value=2.8e-06 Score=82.28 Aligned_cols=54 Identities=26% Similarity=0.519 Sum_probs=38.2
Q ss_pred CCCCCcCCCCCceEEeeCCcEEEEEEEEecCCCCCCCCCeEEEeC-cccHHHHHHHhh
Q psy6524 394 EMDSCQGDSGGPLIINDVGRYELVGIVSWGVGCGRPGYPGVYTRV-NRYLSWVKRNMK 450 (457)
Q Consensus 394 ~~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~~C~~~~~p~vyt~V-~~~~dWI~~~i~ 450 (457)
..+++.|+||+|++...+ +++||.+-+..-.......-.+++ ..+++||.+.++
T Consensus 197 ~~dT~pG~SGSpv~~~~~---~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~~ 251 (251)
T COG3591 197 DADTLPGSSGSPVLISKD---EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNIK 251 (251)
T ss_pred EecccCCCCCCceEecCc---eEEEEEecCCCcccccccCcceEecHHHHHHHHHhhC
Confidence 457899999999998776 899999888642211222334455 458999998764
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=98.37 E-value=1.3e-06 Score=81.83 Aligned_cols=60 Identities=37% Similarity=0.747 Sum_probs=47.9
Q ss_pred eeeeEEEecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeee
Q psy6524 102 RAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFG 168 (457)
Q Consensus 102 ~~v~~i~~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g 168 (457)
+.|.+++.||+|+.....+|||||+|.+++.+.+.++|+|++.... ...-+..+.++|||
T Consensus 69 ~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~-------~~~~~~~~~~~G~~ 128 (220)
T PF00089_consen 69 IKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGS-------DPNVGTSCIVVGWG 128 (220)
T ss_dssp EEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTH-------TTTTTSEEEEEESS
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccc-------cccccccccccccc
Confidence 4577899999999999999999999999999999999999997221 01123345677776
No 13
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.61 E-value=0.0013 Score=69.26 Aligned_cols=38 Identities=26% Similarity=0.221 Sum_probs=27.6
Q ss_pred cceEEEeeCCCcccCCCccccccCCCCCCCCCCeEEEEeccc
Q psy6524 223 HDIALLKLRKPVSFTKSVRPICLPPDNIDPSGKMGTVVGWGR 264 (457)
Q Consensus 223 ~DiAllkL~~~~~~s~~v~PicLp~~~~~~~~~~~~~~Gwg~ 264 (457)
.|+|||+++.+ ..+.++.|.+......++.+.+.|+..
T Consensus 105 ~DlAllkv~~~----~~~~~~~l~~~~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 105 TDIAVLKIDAK----KNLPVIKLGDSDKLRVGDWVLAIGNPF 142 (428)
T ss_pred CCEEEEEecCC----CCceEEEccCCCCCCCCCEEEEEECCC
Confidence 48999998754 345667777666666888888888753
No 14
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.27 E-value=0.011 Score=60.86 Aligned_cols=42 Identities=19% Similarity=0.379 Sum_probs=30.1
Q ss_pred CCeEEEEEeCC-----------eeEEEEEEEeCC-eeeecccccccccCccEEEEE
Q psy6524 41 YPWVARLVYDG-----------NFHCGASLINED-YVLTAAHCVRRLKRSKIRIVL 84 (457)
Q Consensus 41 ~Pw~v~i~~~~-----------~~~C~GtLIs~~-~VLTAAhCv~~~~~~~~~v~~ 84 (457)
-|-+|.|.... ....+|.+|+++ +|||++|-+.+ .+.+.|.+
T Consensus 55 ~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~--~~~i~V~~ 108 (351)
T TIGR02038 55 APAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKK--ADQIVVAL 108 (351)
T ss_pred CCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCC--CCEEEEEE
Confidence 47888886421 246999999977 99999999964 34455554
No 15
>PRK10898 serine endoprotease; Provisional
Probab=97.05 E-value=0.029 Score=57.73 Aligned_cols=42 Identities=19% Similarity=0.351 Sum_probs=30.0
Q ss_pred CCeEEEEEeCC-----------eeEEEEEEEeCC-eeeecccccccccCccEEEEE
Q psy6524 41 YPWVARLVYDG-----------NFHCGASLINED-YVLTAAHCVRRLKRSKIRIVL 84 (457)
Q Consensus 41 ~Pw~v~i~~~~-----------~~~C~GtLIs~~-~VLTAAhCv~~~~~~~~~v~~ 84 (457)
-|-+|.|.... ....+|.+|+++ +|||+||=+.+ ...+.|.+
T Consensus 55 ~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~--a~~i~V~~ 108 (353)
T PRK10898 55 APAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIND--ADQIIVAL 108 (353)
T ss_pred CCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCC--CCEEEEEe
Confidence 47777776421 146999999976 99999999863 45566554
No 16
>PRK10139 serine endoprotease; Provisional
Probab=96.89 E-value=0.03 Score=59.57 Aligned_cols=31 Identities=32% Similarity=0.446 Sum_probs=24.3
Q ss_pred eEEEEEEEeC--CeeeecccccccccCccEEEEEc
Q psy6524 53 FHCGASLINE--DYVLTAAHCVRRLKRSKIRIVLG 85 (457)
Q Consensus 53 ~~C~GtLIs~--~~VLTAAhCv~~~~~~~~~v~~G 85 (457)
...+|.+|++ -+|||++|.+.+ ...+.|.+.
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~--a~~i~V~~~ 122 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQ--AQKISIQLN 122 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCC--CCEEEEEEC
Confidence 4799999974 699999999974 456666653
No 17
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=96.88 E-value=0.003 Score=53.28 Aligned_cols=21 Identities=48% Similarity=0.607 Sum_probs=19.3
Q ss_pred EEEEEEeCC-eeeecccccccc
Q psy6524 55 CGASLINED-YVLTAAHCVRRL 75 (457)
Q Consensus 55 C~GtLIs~~-~VLTAAhCv~~~ 75 (457)
|+|.+|+++ +|||||||+...
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~ 22 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDW 22 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCC
T ss_pred CEEEEEcCCceEEEchhheecc
Confidence 789999999 999999999764
No 18
>PRK10942 serine endoprotease; Provisional
Probab=96.60 E-value=0.046 Score=58.48 Aligned_cols=31 Identities=29% Similarity=0.473 Sum_probs=24.0
Q ss_pred eEEEEEEEeC--CeeeecccccccccCccEEEEEc
Q psy6524 53 FHCGASLINE--DYVLTAAHCVRRLKRSKIRIVLG 85 (457)
Q Consensus 53 ~~C~GtLIs~--~~VLTAAhCv~~~~~~~~~v~~G 85 (457)
...+|.+|++ -+|||++|.+.+ ...+.|.+.
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~--a~~i~V~~~ 143 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDN--ATKIKVQLS 143 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCC--CCEEEEEEC
Confidence 4699999985 499999999864 456666653
No 19
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=93.94 E-value=0.15 Score=49.25 Aligned_cols=76 Identities=21% Similarity=0.325 Sum_probs=55.6
Q ss_pred cceEEEeCcccCCcccccCccceeeeeeEEEecCCCCCCCCCcceEEEeeCCCcccCCCccccccCCCCCCC-CCCeEEE
Q psy6524 181 SKIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDNIDP-SGKMGTV 259 (457)
Q Consensus 181 ~~~~~~~g~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~~DiAllkL~~~~~~s~~v~PicLp~~~~~~-~~~~~~~ 259 (457)
.=+.++||....-..+ +..-.|++.|..+..-|. .+++||.|++|+.|+.+|+|..||...... ....|..
T Consensus 55 ~YvsallG~~Kt~~~v-~Gp~EQI~rVD~~~~V~~-------S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVA 126 (267)
T PF09342_consen 55 HYVSALLGGGKTYLSV-DGPHEQISRVDCFKDVPE-------SNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVA 126 (267)
T ss_pred ceEEEEecCcceeccc-CCChheEEEeeeeeeccc-------cceeeeeecCcccceeeecccccccccCCCCCCCceEE
Confidence 3456777766655544 334556777777666554 589999999999999999999999754443 5558998
Q ss_pred Eeccc
Q psy6524 260 VGWGR 264 (457)
Q Consensus 260 ~Gwg~ 264 (457)
.|-..
T Consensus 127 Vg~d~ 131 (267)
T PF09342_consen 127 VGHDD 131 (267)
T ss_pred EEccc
Confidence 88766
No 20
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=82.17 E-value=1.2 Score=50.46 Aligned_cols=31 Identities=42% Similarity=0.710 Sum_probs=22.3
Q ss_pred CCcCCCCCceEEee--CCcEEEEEEEEecCCCC
Q psy6524 397 SCQGDSGGPLIIND--VGRYELVGIVSWGVGCG 427 (457)
Q Consensus 397 ~C~gDsGgPLv~~~--~~~~~L~GI~S~~~~C~ 427 (457)
.-.||||+|||.-+ ..+|+|+|+++.+....
T Consensus 213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~ 245 (769)
T PF02395_consen 213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYN 245 (769)
T ss_dssp --TT-TT-EEEEEETTTTEEEEEEEEEEECCCC
T ss_pred cccCcCCCceEEEEccCCeEEEEEEEccccccC
Confidence 34699999998755 67899999999886543
No 21
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=71.76 E-value=2.8 Score=36.59 Aligned_cols=38 Identities=26% Similarity=0.445 Sum_probs=28.6
Q ss_pred CcCCCCCceEEeeCCcEEEEEEEEecCCCCCCCCCeEEEeCcccHHHH
Q psy6524 398 CQGDSGGPLIINDVGRYELVGIVSWGVGCGRPGYPGVYTRVNRYLSWV 445 (457)
Q Consensus 398 C~gDsGgPLv~~~~~~~~L~GI~S~~~~C~~~~~p~vyt~V~~~~dWI 445 (457)
=+||-||+|.|..+ ++||++.|-+ ....|++|..+ .|+
T Consensus 88 ~PGdCGg~L~C~HG----ViGi~Tagg~-----g~VaF~dir~~-~~~ 125 (127)
T PF00947_consen 88 EPGDCGGILRCKHG----VIGIVTAGGE-----GHVAFADIRDL-LWL 125 (127)
T ss_dssp STT-TCSEEEETTC----EEEEEEEEET-----TEEEEEECCCG-STT
T ss_pred CCCCCCceeEeCCC----eEEEEEeCCC-----ceEEEEechhh-hee
Confidence 35899999999885 9999988742 34789999884 454
No 22
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=68.20 E-value=12 Score=38.51 Aligned_cols=156 Identities=26% Similarity=0.349 Sum_probs=103.2
Q ss_pred cceeecCeecCCCCCCeEEEEEeC-----CeeEEEEEEEeCCeeeecccccccccCccEEEEEccccCCCcccCCCCcee
Q psy6524 27 EVRIVGGRPTGVNKYPWVARLVYD-----GNFHCGASLINEDYVLTAAHCVRRLKRSKIRIVLGDYDQSVTTETAEPTMM 101 (457)
Q Consensus 27 ~~ri~~G~~~~~~~~Pw~v~i~~~-----~~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~v~~G~~~~~~~~~~~~~~~~ 101 (457)
+.||+||..|+.++||++|+|... ...+|||+++..|||||||||+....+-...
T Consensus 30 s~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d-------------------- 89 (413)
T COG5640 30 SSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSD-------------------- 89 (413)
T ss_pred ceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCcccc--------------------
Confidence 579999999999999999998643 2468999999999999999998642210000
Q ss_pred eeeeEEEecCcCCCCCCccceEEEEecCccccCCCeeeeeCCCCCCCcccccCcccccccceeeeeeeccCCCceeeecc
Q psy6524 102 RAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDSEYHTVVKGTMRCRQRAAVLAFGTQRDGSDVKLVSS 181 (457)
Q Consensus 102 ~~v~~i~~h~~y~~~~~~~DIaLl~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~~~v~~~g~~~~~~~~~~~~~ 181 (457)
T Consensus 90 -------------------------------------------------------------------------------- 89 (413)
T COG5640 90 -------------------------------------------------------------------------------- 89 (413)
T ss_pred --------------------------------------------------------------------------------
Confidence
Q ss_pred ceEEEeCcccCCcccccCccceeeeeeEEEecCCCCCCCCCcceEEEeeCCCcccCCCccccccCCCCC-----CCCCCe
Q psy6524 182 KIRIVLGDYDQSVTTETAEPTMMRAVSSIVRHRHFDVNNYNHDIALLKLRKPVSFTKSVRPICLPPDNI-----DPSGKM 256 (457)
Q Consensus 182 ~~~~~~g~~~~~~~~~~~~~~~~~~V~~i~~hp~~~~~~~~~DiAllkL~~~~~~s~~v~PicLp~~~~-----~~~~~~ 256 (457)
-.+|+.+ +.+....+...|..+..|..|...++.||+|+++|.++.... .++ +-+..+.. ......
T Consensus 90 ~~~vv~~-------l~d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~p-r~k-i~~~~~sdt~l~sv~~~s~ 160 (413)
T COG5640 90 VNRVVVD-------LNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLP-RVK-ITSFDASDTFLNSVTTVSP 160 (413)
T ss_pred ceEEEec-------ccccccccCcceEEEeeecccccccccCcceeeccccccccc-hhh-eeeccCcccceeccccccc
Confidence 0112222 112224556789999999999999999999999999866421 111 11111111 012333
Q ss_pred EEEEecccccCCC-----CCcccceeccccccCchhcccc
Q psy6524 257 GTVVGWGRTSEGG-----SLATEALEVQVPILSPGQCRAM 291 (457)
Q Consensus 257 ~~~~Gwg~~~~~~-----~~~~~l~~~~~~~~~~~~C~~~ 291 (457)
....+|+.+.... +..+.|+++.+.+++...|...
T Consensus 161 ~~n~t~~~~~~~~v~~~~p~gt~l~e~~v~fv~~stc~~~ 200 (413)
T COG5640 161 MTNGTFGVTTPSDVPRSSPKGTILHEVAVLFVPLSTCAQY 200 (413)
T ss_pred ccceeeeeeeecCCCCCCCccceeeeeeeeeechHHhhhh
Confidence 4455666554321 1225788999999999999763
No 23
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=40.95 E-value=1.8e+02 Score=26.64 Aligned_cols=28 Identities=29% Similarity=0.417 Sum_probs=22.7
Q ss_pred CCcCCCCCceEEeeCCcEEEEEEEEecC
Q psy6524 397 SCQGDSGGPLIINDVGRYELVGIVSWGV 424 (457)
Q Consensus 397 ~C~gDsGgPLv~~~~~~~~L~GI~S~~~ 424 (457)
+..|+=||||+...++...++||-..|.
T Consensus 144 t~~G~CG~~l~~~~~~~~~i~GiHvaG~ 171 (172)
T PF00548_consen 144 TKPGMCGSPLVSRIGGQGKIIGIHVAGN 171 (172)
T ss_dssp EETTGTTEEEEESCGGTTEEEEEEEEEE
T ss_pred CCCCccCCeEEEeeccCccEEEEEeccC
Confidence 3468889999997777789999987663
No 24
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=36.95 E-value=26 Score=28.82 Aligned_cols=20 Identities=50% Similarity=0.943 Sum_probs=14.2
Q ss_pred CCcCCCCCceEEeeCCcEEEEEE
Q psy6524 397 SCQGDSGGPLIINDVGRYELVGI 419 (457)
Q Consensus 397 ~C~gDsGgPLv~~~~~~~~L~GI 419 (457)
+-.|.|||||+-.+ + .++||
T Consensus 101 ~~~G~SGgpv~~~~-G--~vvGi 120 (120)
T PF13365_consen 101 TRPGSSGGPVFDSD-G--RVVGI 120 (120)
T ss_dssp -STTTTTSEEEETT-S--EEEEE
T ss_pred cCCCcEeHhEECCC-C--EEEeC
Confidence 34589999997644 3 58886
No 25
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=32.74 E-value=61 Score=31.73 Aligned_cols=52 Identities=33% Similarity=0.632 Sum_probs=41.4
Q ss_pred CCCCCcCCCCCceEEeeCCcEEEEEEEEecC-CCCCCCCCeEEEeCcccHHHHHH
Q psy6524 394 EMDSCQGDSGGPLIINDVGRYELVGIVSWGV-GCGRPGYPGVYTRVNRYLSWVKR 447 (457)
Q Consensus 394 ~~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~-~C~~~~~p~vyt~V~~~~dWI~~ 447 (457)
....|.+|+||||+...+++|+|+||.+.+. .|... ...|.+|..|.+=|-+
T Consensus 225 ~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~--~~~f~~v~~~~~~IC~ 277 (282)
T PF03761_consen 225 KQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKN--NSYFFNVSWYQDEICE 277 (282)
T ss_pred ccccCCCCccCeEEEEECCCEEEEEEEccCCCccccc--ccEEEEHHHhhhhhcc
Confidence 4577999999999999999999999998775 45322 5788898888776543
No 26
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=30.41 E-value=50 Score=32.74 Aligned_cols=23 Identities=35% Similarity=0.692 Sum_probs=17.8
Q ss_pred cCCCCCceEEeeCCcEEEEEEEEecC
Q psy6524 399 QGDSGGPLIINDVGRYELVGIVSWGV 424 (457)
Q Consensus 399 ~gDsGgPLv~~~~~~~~L~GI~S~~~ 424 (457)
.||||+|++.+++ .|+||-+-+.
T Consensus 207 ~GDSGSPVVt~dg---~liGVHTGSn 229 (297)
T PF05579_consen 207 PGDSGSPVVTEDG---DLIGVHTGSN 229 (297)
T ss_dssp GGCTT-EEEETTC----EEEEEEEEE
T ss_pred CCCCCCccCcCCC---CEEEEEecCC
Confidence 5899999999876 6999987764
No 27
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C virus is involved in cell transformation and that the ability to transform requires an active enzyme.; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=30.24 E-value=38 Score=30.03 Aligned_cols=23 Identities=30% Similarity=0.572 Sum_probs=16.8
Q ss_pred CcCCCCCceEEeeCCcEEEEEEEEec
Q psy6524 398 CQGDSGGPLIINDVGRYELVGIVSWG 423 (457)
Q Consensus 398 C~gDsGgPLv~~~~~~~~L~GI~S~~ 423 (457)
-.|.||||+.|+.+ ..+||+.-.
T Consensus 106 lkGSSGgPiLC~~G---H~vG~f~aa 128 (148)
T PF02907_consen 106 LKGSSGGPILCPSG---HAVGMFRAA 128 (148)
T ss_dssp HTT-TT-EEEETTS---EEEEEEEEE
T ss_pred EecCCCCcccCCCC---CEEEEEEEE
Confidence 35889999999876 799998544
No 28
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=28.15 E-value=5.7e+02 Score=24.90 Aligned_cols=40 Identities=28% Similarity=0.425 Sum_probs=22.6
Q ss_pred CCCcCCCCCceEEeeCCcEEEEEEEEecCCCCCCCCCeEEEeCcc
Q psy6524 396 DSCQGDSGGPLIINDVGRYELVGIVSWGVGCGRPGYPGVYTRVNR 440 (457)
Q Consensus 396 ~~C~gDsGgPLv~~~~~~~~L~GI~S~~~~C~~~~~p~vyt~V~~ 440 (457)
++-.||=|.||+...++ .++||-|-+..-.. -..|+.+..
T Consensus 147 sTk~G~CG~PlVs~~Dg--~IVGiHsl~~~~~~---~N~F~~f~~ 186 (235)
T PF00863_consen 147 STKDGDCGLPLVSTKDG--KIVGIHSLTSNTSS---RNYFTPFPD 186 (235)
T ss_dssp ---TT-TT-EEEETTT----EEEEEEEEETTTS---SEEEEE--T
T ss_pred cCCCCccCCcEEEcCCC--cEEEEEcCccCCCC---eEEEEcCCH
Confidence 44568889999986544 69999998764322 257887754
No 29
>KOG0276|consensus
Probab=26.40 E-value=2.3e+02 Score=31.48 Aligned_cols=20 Identities=25% Similarity=0.610 Sum_probs=12.6
Q ss_pred ecCeecCCCCCCeEEEEEeCC
Q psy6524 31 VGGRPTGVNKYPWVARLVYDG 51 (457)
Q Consensus 31 ~~G~~~~~~~~Pw~v~i~~~~ 51 (457)
+.+.+--|.+ ||+.+-.+++
T Consensus 16 VKsVd~HPte-Pw~la~LynG 35 (794)
T KOG0276|consen 16 VKSVDFHPTE-PWILAALYNG 35 (794)
T ss_pred eeeeecCCCC-ceEEEeeecC
Confidence 4455555556 9987666655
No 30
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=25.95 E-value=62 Score=36.59 Aligned_cols=29 Identities=34% Similarity=0.588 Sum_probs=22.2
Q ss_pred EEEEEe-CCeeEEEEEEEeCC-eeeeccccccc
Q psy6524 44 VARLVY-DGNFHCGASLINED-YVLTAAHCVRR 74 (457)
Q Consensus 44 ~v~i~~-~~~~~C~GtLIs~~-~VLTAAhCv~~ 74 (457)
+-+|.. .+ .|+|++||++ .|||--||..+
T Consensus 39 ~dAvv~f~g--GCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 39 KDAVVRFGG--GCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred hhheeecCC--ceeEEEEcCCceEEecchhhhh
Confidence 445544 33 3999999987 89999999864
No 31
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.
Probab=21.11 E-value=1.1e+02 Score=29.50 Aligned_cols=26 Identities=35% Similarity=0.515 Sum_probs=22.5
Q ss_pred CCCCcCCCCCceEEeeCCcEEEEEEEEecC
Q psy6524 395 MDSCQGDSGGPLIINDVGRYELVGIVSWGV 424 (457)
Q Consensus 395 ~~~C~gDsGgPLv~~~~~~~~L~GI~S~~~ 424 (457)
.+.-+|-||+|++.++ +|+|-+++..
T Consensus 175 GGIvqGMSGSPI~qdG----KLiGAVthvf 200 (218)
T PF05580_consen 175 GGIVQGMSGSPIIQDG----KLIGAVTHVF 200 (218)
T ss_pred CCEEecccCCCEEECC----EEEEEEEEEE
Confidence 4577899999999988 8999998875
Done!