Query psy15052
Match_columns 422
No_of_seqs 391 out of 1985
Neff 8.9
Searched_HMMs 46136
Date Fri Aug 16 19:13:00 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy15052.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/15052hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 7.5E-43 1.6E-47 321.2 23.0 226 139-370 5-232 (232)
2 KOG3627|consensus 100.0 5.7E-41 1.2E-45 314.5 21.8 234 131-372 12-255 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 3.3E-39 7.2E-44 296.9 24.9 221 139-367 6-229 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 2.2E-37 4.8E-42 282.5 19.9 214 139-367 5-220 (220)
5 COG5640 Secreted trypsin-like 100.0 1.5E-30 3.3E-35 239.0 13.5 241 131-380 32-287 (413)
6 PF03761 DUF316: Domain of unk 99.6 6.1E-14 1.3E-18 133.3 17.0 201 139-370 46-278 (282)
7 PF09342 DUF1986: Domain of un 99.5 6.2E-13 1.3E-17 117.5 13.7 116 142-264 12-130 (267)
8 COG3591 V8-like Glu-specific e 99.0 1.6E-09 3.5E-14 98.4 10.8 199 141-372 44-251 (251)
9 KOG3627|consensus 98.5 1.1E-07 2.3E-12 89.0 4.5 51 371-422 204-255 (256)
10 TIGR02037 degP_htrA_DO peripla 98.2 1.2E-05 2.6E-10 81.0 11.6 85 157-265 57-142 (428)
11 TIGR02038 protease_degS peripl 98.1 2.5E-05 5.3E-10 76.5 12.3 140 159-346 79-219 (351)
12 cd00190 Tryp_SPc Trypsin-like 98.1 1.8E-06 3.8E-11 78.9 3.6 49 371-420 184-232 (232)
13 PRK10898 serine endoprotease; 98.0 8.1E-05 1.8E-09 72.8 13.6 82 158-264 78-160 (353)
14 COG5640 Secreted trypsin-like 98.0 6.5E-06 1.4E-10 77.3 4.5 50 371-421 228-278 (413)
15 PRK10139 serine endoprotease; 97.9 9.6E-05 2.1E-09 74.7 10.6 141 158-345 90-232 (455)
16 PRK10942 serine endoprotease; 97.7 0.00031 6.7E-09 71.4 11.6 83 158-264 111-195 (473)
17 smart00020 Tryp_SPc Trypsin-li 97.7 4.1E-05 9E-10 69.8 4.1 45 371-417 185-229 (229)
18 PF13365 Trypsin_2: Trypsin-li 97.5 0.00019 4.1E-09 58.2 6.0 21 160-180 1-22 (120)
19 PF00089 Trypsin: Trypsin; In 97.5 8.8E-05 1.9E-09 67.0 3.6 44 370-417 177-220 (220)
20 PF02395 Peptidase_S6: Immunog 95.8 0.037 8.1E-07 59.1 9.0 65 162-249 69-133 (769)
21 smart00680 CLIP Clip or disulp 87.8 0.26 5.6E-06 33.4 1.0 38 29-66 1-52 (52)
22 PF12032 CLIP: Regulatory CLIP 86.6 0.12 2.6E-06 35.6 -1.2 17 29-45 1-18 (54)
23 COG0265 DegQ Trypsin-like seri 81.8 34 0.00074 33.3 13.3 145 158-348 72-217 (347)
24 PF00548 Peptidase_C3: 3C cyst 81.2 5.8 0.00012 34.6 6.8 67 160-245 27-93 (172)
25 PF00863 Peptidase_C4: Peptida 79.6 28 0.00061 31.9 10.8 177 164-397 37-217 (235)
26 PF00947 Pico_P2A: Picornaviru 76.5 2.5 5.4E-05 34.4 2.7 34 320-363 89-122 (127)
27 PF05579 Peptidase_S32: Equine 44.2 15 0.00032 34.1 1.8 22 320-345 207-228 (297)
28 PF05580 Peptidase_S55: SpoIVB 31.1 41 0.00089 30.3 2.5 26 316-346 175-200 (218)
29 PF00944 Peptidase_S3: Alphavi 31.1 28 0.00061 28.7 1.4 24 320-347 105-128 (158)
30 PF10459 Peptidase_S46: Peptid 27.9 43 0.00094 36.0 2.5 22 157-178 46-68 (698)
31 PF03761 DUF316: Domain of unk 26.6 53 0.0011 30.8 2.7 44 370-416 230-274 (282)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=7.5e-43 Score=321.16 Aligned_cols=226 Identities=42% Similarity=0.704 Sum_probs=195.5
Q ss_pred CCCCCCCcCeEEEEEec-CCCCceEEEccCCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCC
Q psy15052 139 SSKNTTQRPYVKPLKES-LGRPVNVYSNNEKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKY 217 (422)
Q Consensus 139 ~~a~~~~~Pw~v~i~~~-~~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y 217 (422)
..+..++|||+|+|+.. ....|+|+||+++||||||||+.......+.|++|..+...... ..+.+.|.++++||+|
T Consensus 5 ~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~--~~~~~~v~~~~~hp~y 82 (232)
T cd00190 5 SEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEG--GGQVIKVKKVIVHPNY 82 (232)
T ss_pred eECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCC--ceEEEEEEEEEECCCC
Confidence 55778999999999977 56789999999999999999998765678999999887765322 2578899999999999
Q ss_pred CCCCCCCceEEEEECCCcccCCCeeeeecCCCC-CCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHHHhhc
Q psy15052 218 NFFTYEYDLALVRLETPVEFAPNIVPICLPGSD-DLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSMFLRA 296 (422)
Q Consensus 218 ~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~-~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~~~~~ 296 (422)
+.....+|||||||++++.++++++|||||... ....+..+.++|||...........++...+.+++.+.|...+..
T Consensus 83 ~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~- 161 (232)
T cd00190 83 NPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSY- 161 (232)
T ss_pred CCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccC-
Confidence 998889999999999999999999999999875 455688999999998766555678899999999999999988753
Q ss_pred CCcccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeecccccCCCCCceeeeccccccccccc
Q psy15052 297 GRYEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWVLDT 370 (422)
Q Consensus 297 ~~~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~C~~~~~p~vyt~V~~y~~WI~~~ 370 (422)
...+.+.++|+.......+.|.|||||||++. .+++|+|+||+|++..|...+.|++|++|+.|++||+++
T Consensus 162 --~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~-~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 162 --GGTITDNMLCAGGLEGGKDACQGDSGGPLVCN-DNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred --cccCCCceEeeCCCCCCCccccCCCCCcEEEE-eCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence 11378899999865547789999999999998 458999999999998898668899999999999999874
No 2
>KOG3627|consensus
Probab=100.00 E-value=5.7e-41 Score=314.53 Aligned_cols=234 Identities=37% Similarity=0.647 Sum_probs=193.1
Q ss_pred ccccccCCCCCCCCCcCeEEEEEecC--CCCceEEEccCCEEeecccCcCCC-CCceEEEEEeeeecccCCCCCC-cEEE
Q psy15052 131 SYYSHKDISSKNTTQRPYVKPLKESL--GRPVNVYSNNEKVDDFSTESVNSL-LTSQIKIRVGEYDFSKLEEPYP-YQER 206 (422)
Q Consensus 131 ~~~~~~~~~~a~~~~~Pw~v~i~~~~--~~~C~GtLIs~~~VLTAAhCv~~~-~~~~~~V~lG~~~~~~~~~~~~-~~~~ 206 (422)
++.++ .++.+++|||+|+|+... .+.|+|+||+++||||||||+... .. .+.|++|++.......... .+..
T Consensus 12 ~i~~g---~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~-~~~V~~G~~~~~~~~~~~~~~~~~ 87 (256)
T KOG3627|consen 12 RIVGG---TEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASAS-LYTVRLGEHDINLSVSEGEEQLVG 87 (256)
T ss_pred CEeCC---ccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCc-ceEEEECccccccccccCchhhhc
Confidence 44444 467788999999999876 568999999999999999999873 22 8899999876654422211 2455
Q ss_pred eEEEEEECCCCCCCCCC-CceEEEEECCCcccCCCeeeeecCCCCC---CCCCCeEEEEEecccCCC-CCCCCCceEEEE
Q psy15052 207 GVVKKMVHPKYNFFTYE-YDLALVRLETPVEFAPNIVPICLPGSDD---LLIGENATVTGWGRLSEG-GSLPPVLQKVTV 281 (422)
Q Consensus 207 ~V~~i~iHp~y~~~~~~-nDIALLkL~~pv~~s~~v~PicLp~~~~---~~~~~~~~v~GwG~~~~~-~~~~~~L~~~~v 281 (422)
.|.++++||+|+..... ||||||+|++++.|+++|+|||||.... ...+..+.++|||.+... ...+..|+++.+
T Consensus 88 ~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v 167 (256)
T KOG3627|consen 88 DVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDV 167 (256)
T ss_pred eeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEE
Confidence 58889999999998877 9999999999999999999999985543 345588999999987654 356888999999
Q ss_pred eccChhhHHHHHhhcCCcccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeeccc-ccCCCCCceeeec
Q psy15052 282 PIVSNEKCRSMFLRAGRYEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIG-CAEANLPGVCTRI 360 (422)
Q Consensus 282 ~v~~~~~C~~~~~~~~~~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~-C~~~~~p~vyt~V 360 (422)
.+++.+.|...+.... .+.+.++||+......++|+|||||||++.... +|+++||+|||.+ |+..+.|++||+|
T Consensus 168 ~i~~~~~C~~~~~~~~---~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~-~~~~~GivS~G~~~C~~~~~P~vyt~V 243 (256)
T KOG3627|consen 168 PIISNSECRRAYGGLG---TITDTMLCAGGPEGGKDACQGDSGGPLVCEDNG-RWVLVGIVSWGSGGCGQPNYPGVYTRV 243 (256)
T ss_pred eEcChhHhcccccCcc---ccCCCEEeeCccCCCCccccCCCCCeEEEeeCC-cEEEEEEEEecCCCCCCCCCCeEEeEh
Confidence 9999999998886432 256678999975677889999999999998544 8999999999988 9988899999999
Q ss_pred ccccccccccCC
Q psy15052 361 SKFVPWVLDTGD 372 (422)
Q Consensus 361 ~~y~~WI~~~i~ 372 (422)
+.|.+||++.+.
T Consensus 244 ~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 244 SSYLDWIKENIG 255 (256)
T ss_pred HHhHHHHHHHhc
Confidence 999999998774
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=3.3e-39 Score=296.88 Aligned_cols=221 Identities=42% Similarity=0.722 Sum_probs=189.7
Q ss_pred CCCCCCCcCeEEEEEecC-CCCceEEEccCCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCC
Q psy15052 139 SSKNTTQRPYVKPLKESL-GRPVNVYSNNEKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKY 217 (422)
Q Consensus 139 ~~a~~~~~Pw~v~i~~~~-~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y 217 (422)
..+.+++|||+|.|+... ...|+|+||++++|||||||+.......+.|++|..+.....+ .+.+.|.++++||+|
T Consensus 6 ~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~---~~~~~v~~~~~~p~~ 82 (229)
T smart00020 6 SEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEE---GQVIKVSKVIIHPNY 82 (229)
T ss_pred CcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCC---ceEEeeEEEEECCCC
Confidence 567889999999999876 6679999999999999999998755568999999887655432 277899999999999
Q ss_pred CCCCCCCceEEEEECCCcccCCCeeeeecCCCC-CCCCCCeEEEEEecccCC-CCCCCCCceEEEEeccChhhHHHHHhh
Q psy15052 218 NFFTYEYDLALVRLETPVEFAPNIVPICLPGSD-DLLIGENATVTGWGRLSE-GGSLPPVLQKVTVPIVSNEKCRSMFLR 295 (422)
Q Consensus 218 ~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~-~~~~~~~~~v~GwG~~~~-~~~~~~~L~~~~v~v~~~~~C~~~~~~ 295 (422)
+.....+|||||||++|+.+++.++|+||+... ....+..+.++|||.... .......++...+.+++.+.|...+..
T Consensus 83 ~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~ 162 (229)
T smart00020 83 NPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSG 162 (229)
T ss_pred CCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhcc
Confidence 988889999999999999999999999999863 355688999999998753 234567889999999999999987754
Q ss_pred cCCcccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeecccccCCCCCceeeecccccccc
Q psy15052 296 AGRYEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWV 367 (422)
Q Consensus 296 ~~~~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~C~~~~~p~vyt~V~~y~~WI 367 (422)
. ..+.+.++|++......+.|.|||||||+... + +|+|+||+|+|..|...+.|.+|+||++|++||
T Consensus 163 ~---~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~-~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 163 G---GAITDNMLCAGGLEGGKDACQGDSGGPLVCND-G-RWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred c---cccCCCcEeecCCCCCCcccCCCCCCeeEEEC-C-CEEEEEEEEECCCCCCCCCCCEEEEeccccccC
Confidence 2 13788999998655467899999999999984 3 999999999999998778899999999999998
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00 E-value=2.2e-37 Score=282.51 Aligned_cols=214 Identities=37% Similarity=0.701 Sum_probs=184.3
Q ss_pred CCCCCCCcCeEEEEEecC-CCCceEEEccCCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCC
Q psy15052 139 SSKNTTQRPYVKPLKESL-GRPVNVYSNNEKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKY 217 (422)
Q Consensus 139 ~~a~~~~~Pw~v~i~~~~-~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y 217 (422)
..+.+++|||+|.|+... .++|+|+||+++||||||||+.. ..++.+++|.......... .+.+.|++++.||+|
T Consensus 5 ~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~--~~~~~v~~g~~~~~~~~~~--~~~~~v~~~~~h~~~ 80 (220)
T PF00089_consen 5 DPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDG--ASDIKVRLGTYSIRNSDGS--EQTIKVSKIIIHPKY 80 (220)
T ss_dssp EECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTS--GGSEEEEESESBTTSTTTT--SEEEEEEEEEEETTS
T ss_pred EECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccc--ccccccccccccccccccc--ccccccccccccccc
Confidence 567889999999999887 67899999999999999999987 5689999998444333321 488999999999999
Q ss_pred CCCCCCCceEEEEECCCcccCCCeeeeecCCCC-CCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHHHhhc
Q psy15052 218 NFFTYEYDLALVRLETPVEFAPNIVPICLPGSD-DLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSMFLRA 296 (422)
Q Consensus 218 ~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~-~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~~~~~ 296 (422)
+.....+|||||||++++.+.+.+.|+||+... ....+..+.+.||+.....+ ....++...+.+++.+.|...+..
T Consensus 81 ~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~~~~~~~c~~~~~~- 158 (220)
T PF00089_consen 81 DPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVPVVSRKTCRSSYND- 158 (220)
T ss_dssp BTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTS-BTSBEEEEEEEEEEHHHHHHHTTT-
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccc-cccccccccccccccccccccccc-
Confidence 998889999999999999999999999999844 34678899999999865544 567899999999999999987432
Q ss_pred CCcccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeecccccCCCCCceeeecccccccc
Q psy15052 297 GRYEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWV 367 (422)
Q Consensus 297 ~~~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~C~~~~~p~vyt~V~~y~~WI 367 (422)
.+.+.++|+... ...+.|.|||||||++.+. +|+||++++..|...+.|.+|+||+.|++||
T Consensus 159 ----~~~~~~~c~~~~-~~~~~~~g~sG~pl~~~~~----~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 159 ----NLTPNMICAGSS-GSGDACQGDSGGPLICNNN----YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp ----TSTTTEEEEETT-SSSBGGTTTTTSEEEETTE----EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred ----cccccccccccc-cccccccccccccccccee----eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 267889999865 5688999999999998744 8999999999999888899999999999998
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.97 E-value=1.5e-30 Score=238.98 Aligned_cols=241 Identities=21% Similarity=0.302 Sum_probs=173.5
Q ss_pred ccccccCCCCCCCCCcCeEEEEEecC-----CCCceEEEccCCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEE
Q psy15052 131 SYYSHKDISSKNTTQRPYVKPLKESL-----GRPVNVYSNNEKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQE 205 (422)
Q Consensus 131 ~~~~~~~~~~a~~~~~Pw~v~i~~~~-----~~~C~GtLIs~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~ 205 (422)
++.++ +.|+.++||++|++.... ..+|+|+++..|||||||||+....+-...+..+..++++..+ .+.
T Consensus 32 rIigG---s~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq---~~r 105 (413)
T COG5640 32 RIIGG---SNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDSSQ---AER 105 (413)
T ss_pred eEecC---cccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEeccccccc---ccC
Confidence 44555 689999999999997433 4579999999999999999998755434444555555554444 467
Q ss_pred EeEEEEEECCCCCCCCCCCceEEEEECCCcccCC-CeeeeecCC--CCCCCCCCeEEEEEecccCCCC-----CCCCCce
Q psy15052 206 RGVVKKMVHPKYNFFTYEYDLALVRLETPVEFAP-NIVPICLPG--SDDLLIGENATVTGWGRLSEGG-----SLPPVLQ 277 (422)
Q Consensus 206 ~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~~s~-~v~PicLp~--~~~~~~~~~~~v~GwG~~~~~~-----~~~~~L~ 277 (422)
..|.+++.|..|...++.||+|+++|.++..... .+.-..-+. ..+..........+|+.+.... +....|+
T Consensus 106 g~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~gt~l~ 185 (413)
T COG5640 106 GHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKGTILH 185 (413)
T ss_pred cceEEEeeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeecCCCCCCCccceee
Confidence 7899999999999999999999999999664321 111111111 0112223344556666543321 2224789
Q ss_pred EEEEeccChhhHHHHHhhcCC-cccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeeccc-ccCCCCCc
Q psy15052 278 KVTVPIVSNEKCRSMFLRAGR-YEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIG-CAEANLPG 355 (422)
Q Consensus 278 ~~~v~v~~~~~C~~~~~~~~~-~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~-C~~~~~p~ 355 (422)
+..+...+...|...++.... .....-.-+|++.. ..++|+||||||++.+..++ ..++||+|||.+ |+....|+
T Consensus 186 e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~--~~daCqGDSGGPi~~~g~~G-~vQ~GVvSwG~~~Cg~t~~~g 262 (413)
T COG5640 186 EVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRP--PKDACQGDSGGPIFHKGEEG-RVQRGVVSWGDGGCGGTLIPG 262 (413)
T ss_pred eeeeeeechHHhhhhccccccCCCCCCccceecCCC--CcccccCCCCCceEEeCCCc-cEEEeEEEecCCCCCCCCcce
Confidence 999999999999988852211 11122223999844 48999999999999985544 479999999987 99999999
Q ss_pred eeeecccccccccccCCCCCCcccc
Q psy15052 356 VCTRISKFVPWVLDTGDSGGPLQVK 380 (422)
Q Consensus 356 vyt~V~~y~~WI~~~i~~~~pl~~~ 380 (422)
|||+|+.|.+||..+++.-++++..
T Consensus 263 VyT~vsny~~WI~a~~~~l~~~~~r 287 (413)
T COG5640 263 VYTNVSNYQDWIAAMTNGLSYLQFR 287 (413)
T ss_pred eEEehhHHHHHHHHHhcCCCccccc
Confidence 9999999999999988876666544
No 6
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.58 E-value=6.1e-14 Score=133.31 Aligned_cols=201 Identities=19% Similarity=0.234 Sum_probs=125.6
Q ss_pred CCCCCCCcCeEEEEEecCC----CCceEEEccCCEEeecccCcCCCCCc-----------------eEEEE---Eeeeec
Q psy15052 139 SSKNTTQRPYVKPLKESLG----RPVNVYSNNEKVDDFSTESVNSLLTS-----------------QIKIR---VGEYDF 194 (422)
Q Consensus 139 ~~a~~~~~Pw~v~i~~~~~----~~C~GtLIs~~~VLTAAhCv~~~~~~-----------------~~~V~---lG~~~~ 194 (422)
..+...+.||+|.+..... .+++|+|||+||||||+||+...... .+.|- +-....
T Consensus 46 ~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v 125 (282)
T PF03761_consen 46 TPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEVLSKIDV 125 (282)
T ss_pred cccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHHhccEEE
Confidence 5567789999999986542 34799999999999999999632110 11110 000000
Q ss_pred --ccCCCCCCcEEEeEEEEEECCCC----CCCCCCCceEEEEECCCcccCCCeeeeecCCCCC-CCCCCeEEEEEecccC
Q psy15052 195 --SKLEEPYPYQERGVVKKMVHPKY----NFFTYEYDLALVRLETPVEFAPNIVPICLPGSDD-LLIGENATVTGWGRLS 267 (422)
Q Consensus 195 --~~~~~~~~~~~~~V~~i~iHp~y----~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~-~~~~~~~~v~GwG~~~ 267 (422)
.............|.++++--.- .......+++||+|+++ ++....|+||+.... ...+..+.+.|+
T Consensus 126 ~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~~~~~~~~~yg~---- 199 (282)
T PF03761_consen 126 RCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTNWEKGDEVDVYGF---- 199 (282)
T ss_pred EeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCccccccCceEEEeec----
Confidence 00000001233566666663222 23344689999999998 778899999998754 334666667666
Q ss_pred CCCCCCCCceEEEEeccChhhHHHHHhhcCCcccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeeccc
Q psy15052 268 EGGSLPPVLQKVTVPIVSNEKCRSMFLRAGRYEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIG 347 (422)
Q Consensus 268 ~~~~~~~~L~~~~v~v~~~~~C~~~~~~~~~~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~ 347 (422)
.....+....+.+..... |..........|.+|+||||+.. .+++|+|+||.+.+..
T Consensus 200 ---~~~~~~~~~~~~i~~~~~-------------------~~~~~~~~~~~~~~d~Gg~lv~~-~~gr~tlIGv~~~~~~ 256 (282)
T PF03761_consen 200 ---NSTGKLKHRKLKITNCTK-------------------CAYSICTKQYSCKGDRGGPLVKN-INGRWTLIGVGASGNY 256 (282)
T ss_pred ---CCCCeEEEEEEEEEEeec-------------------cceeEecccccCCCCccCeEEEE-ECCCEEEEEEEccCCC
Confidence 113334455555443221 22112224578999999999988 6899999999997753
Q ss_pred -ccCCCCCceeeeccccccccccc
Q psy15052 348 -CAEANLPGVCTRISKFVPWVLDT 370 (422)
Q Consensus 348 -C~~~~~p~vyt~V~~y~~WI~~~ 370 (422)
|.. ....|.+|+.|.+=|=+.
T Consensus 257 ~~~~--~~~~f~~v~~~~~~IC~l 278 (282)
T PF03761_consen 257 ECNK--NNSYFFNVSWYQDEICEL 278 (282)
T ss_pred cccc--cccEEEEHHHhhhhhccc
Confidence 321 267888998887755443
No 7
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.48 E-value=6.2e-13 Score=117.51 Aligned_cols=116 Identities=14% Similarity=0.088 Sum_probs=91.1
Q ss_pred CCCCcCeEEEEEecCCCCceEEEccCCEEeecccCcCCCCC--ceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCC
Q psy15052 142 NTTQRPYVKPLKESLGRPVNVYSNNEKVDDFSTESVNSLLT--SQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNF 219 (422)
Q Consensus 142 ~~~~~Pw~v~i~~~~~~~C~GtLIs~~~VLTAAhCv~~~~~--~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~ 219 (422)
+...|||.|.|+..+...|+|+||.+.|||++..|+.+.+. .-+.|.+|.......-+....|.++|..+..-|
T Consensus 12 e~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~---- 87 (267)
T PF09342_consen 12 EDYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVP---- 87 (267)
T ss_pred ccccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeecc----
Confidence 35679999999999888999999999999999999987444 457788887664332233336888888776544
Q ss_pred CCCCCceEEEEECCCcccCCCeeeeecCCCC-CCCCCCeEEEEEec
Q psy15052 220 FTYEYDLALVRLETPVEFAPNIVPICLPGSD-DLLIGENATVTGWG 264 (422)
Q Consensus 220 ~~~~nDIALLkL~~pv~~s~~v~PicLp~~~-~~~~~~~~~v~GwG 264 (422)
+.+++||.|++|+.|+.+|+|..||... +......|..+|-.
T Consensus 88 ---~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d 130 (267)
T PF09342_consen 88 ---ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHD 130 (267)
T ss_pred ---ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEcc
Confidence 3789999999999999999999999743 33445689999854
No 8
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.04 E-value=1.6e-09 Score=98.40 Aligned_cols=199 Identities=17% Similarity=0.145 Sum_probs=112.1
Q ss_pred CCCCCcCeEEEEEe--cCCCC-ceEEEccCCEEeecccCcCCCCCc--eEEEEE-eeeecccCCCCCCcEEEeEEEEEEC
Q psy15052 141 KNTTQRPYVKPLKE--SLGRP-VNVYSNNEKVDDFSTESVNSLLTS--QIKIRV-GEYDFSKLEEPYPYQERGVVKKMVH 214 (422)
Q Consensus 141 a~~~~~Pw~v~i~~--~~~~~-C~GtLIs~~~VLTAAhCv~~~~~~--~~~V~l-G~~~~~~~~~~~~~~~~~V~~i~iH 214 (422)
.+..+|||-+-... ..+++ |+++||+++.||||+||+...... ++.+.. |.. .+..+...+.-.+..+.
T Consensus 44 ~dt~~~Py~av~~~~~~tG~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~-----~~~~~~~~~~~~~~~~~ 118 (251)
T COG3591 44 TDTTQFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVN-----SDGGPFYGITKIEIRVY 118 (251)
T ss_pred ccCCCCCcceeEEeecCCCcceeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCccc-----CCCCCCCceeeEEEEec
Confidence 46788999655543 33555 566999999999999999764322 222221 211 11111223333334334
Q ss_pred CC--CCCCCCCCceEEEEECCCcccCCCeeeeecCCCCCCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHH
Q psy15052 215 PK--YNFFTYEYDLALVRLETPVEFAPNIVPICLPGSDDLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSM 292 (422)
Q Consensus 215 p~--y~~~~~~nDIALLkL~~pv~~s~~v~PicLp~~~~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~ 292 (422)
|. |.......|+..+.|+....+.+.+....++.......++...++||-.... ..+++- +.|...
T Consensus 119 ~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~-----~~~~~~-------e~t~~v 186 (251)
T COG3591 119 PGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKP-----NIGTMW-------ESTGKV 186 (251)
T ss_pred CCceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCCC-----cceeEe-------eeccee
Confidence 44 2334455677777777555555555555555555556677789999854321 111110 011111
Q ss_pred HhhcCCcccccCceEEEeecCCCCCCCcCCCCCCceeecCCCcEEEeeEEeecccccCCCCCceeeecc-cccccccccC
Q psy15052 293 FLRAGRYEFISDIFMCAGFDNGGRDSCQGDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRIS-KFVPWVLDTG 371 (422)
Q Consensus 293 ~~~~~~~~~i~~~~lCa~~~~~~~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~C~~~~~p~vyt~V~-~y~~WI~~~i 371 (422)
. .+.... .....+.+.|+||+|++.... +++||.+.+..-.......-.+|+. .+++||++.+
T Consensus 187 ~-------~~~~~~-----l~y~~dT~pG~SGSpv~~~~~----~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 187 N-------SIKGNK-----LFYDADTLPGSSGSPVLISKD----EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred E-------EEecce-----EEEEecccCCCCCCceEecCc----eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 0 011111 122468899999999998733 7999999887633233334445554 4678888765
Q ss_pred C
Q psy15052 372 D 372 (422)
Q Consensus 372 ~ 372 (422)
+
T Consensus 251 ~ 251 (251)
T COG3591 251 K 251 (251)
T ss_pred C
Confidence 3
No 9
>KOG3627|consensus
Probab=98.49 E-value=1.1e-07 Score=88.96 Aligned_cols=51 Identities=47% Similarity=1.094 Sum_probs=46.2
Q ss_pred CCCCCCccccCCCCceeEEeeeeecCc-ccCCCCCceEEEccCchhHHhhhcC
Q psy15052 371 GDSGGPLQVKGKDGRYFLAGIISWGIG-CAEANLPGVCTRISKFVPWVLDTVT 422 (422)
Q Consensus 371 i~~~~pl~~~~~~~~~~l~g~~s~g~~-~~~~~~~~v~~~v~~~~~Wi~~~~~ 422 (422)
-++++||.+.... +|+++|++|||.. |+....|+|||||++|.+||.+++.
T Consensus 204 GDSGGPLv~~~~~-~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 204 GDSGGPLVCEDNG-RWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKENIG 255 (256)
T ss_pred CCCCCeEEEeeCC-cEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHHhc
Confidence 4789999888755 9999999999998 9998899999999999999999873
No 10
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.20 E-value=1.2e-05 Score=81.05 Aligned_cols=85 Identities=16% Similarity=0.124 Sum_probs=60.2
Q ss_pred CCCceEEEccCC-EEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCc
Q psy15052 157 GRPVNVYSNNEK-VDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPV 235 (422)
Q Consensus 157 ~~~C~GtLIs~~-~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv 235 (422)
...++|.+|+++ +|||++|.+.+ ...+.|.+.. ...+..+-+..++ ..||||||++.+
T Consensus 57 ~~~GSGfii~~~G~IlTn~Hvv~~--~~~i~V~~~~-----------~~~~~a~vv~~d~-------~~DlAllkv~~~- 115 (428)
T TIGR02037 57 RGLGSGVIISADGYILTNNHVVDG--ADEITVTLSD-----------GREFKAKLVGKDP-------RTDIAVLKIDAK- 115 (428)
T ss_pred cceeeEEEECCCCEEEEcHHHcCC--CCeEEEEeCC-----------CCEEEEEEEEecC-------CCCEEEEEecCC-
Confidence 346899999976 99999999986 4566666542 1233444333444 379999999864
Q ss_pred ccCCCeeeeecCCCCCCCCCCeEEEEEecc
Q psy15052 236 EFAPNIVPICLPGSDDLLIGENATVTGWGR 265 (422)
Q Consensus 236 ~~s~~v~PicLp~~~~~~~~~~~~v~GwG~ 265 (422)
..+.++.|........++.+++.|+..
T Consensus 116 ---~~~~~~~l~~~~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 116 ---KNLPVIKLGDSDKLRVGDWVLAIGNPF 142 (428)
T ss_pred ---CCceEEEccCCCCCCCCCEEEEEECCC
Confidence 346677886666667899999999863
No 11
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.15 E-value=2.5e-05 Score=76.46 Aligned_cols=140 Identities=14% Similarity=0.090 Sum_probs=79.8
Q ss_pred CceEEEccCC-EEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCccc
Q psy15052 159 PVNVYSNNEK-VDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPVEF 237 (422)
Q Consensus 159 ~C~GtLIs~~-~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~~ 237 (422)
..+|.+|+++ +|||++|.+.+ ...+.|.+.. ...+..+-+..+| ..||||||++.+-
T Consensus 79 ~GSG~vi~~~G~IlTn~HVV~~--~~~i~V~~~d-----------g~~~~a~vv~~d~-------~~DlAvlkv~~~~-- 136 (351)
T TIGR02038 79 LGSGVIMSKEGYILTNYHVIKK--ADQIVVALQD-----------GRKFEAELVGSDP-------LTDLAVLKIEGDN-- 136 (351)
T ss_pred eEEEEEEeCCeEEEecccEeCC--CCEEEEEECC-----------CCEEEEEEEEecC-------CCCEEEEEecCCC--
Confidence 4789999876 99999999976 3456666532 1233444343444 4799999998632
Q ss_pred CCCeeeeecCCCCCCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHHHhhcCCcccccCceEEEeecCCCCC
Q psy15052 238 APNIVPICLPGSDDLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSMFLRAGRYEFISDIFMCAGFDNGGRD 317 (422)
Q Consensus 238 s~~v~PicLp~~~~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~~~~~~~~~~i~~~~lCa~~~~~~~~ 317 (422)
+.++.|-.......|+.+++.|+.... ........+..+.... +... -....+=. +..
T Consensus 137 ---~~~~~l~~s~~~~~G~~V~aiG~P~~~-----~~s~t~GiIs~~~r~~----~~~~-----~~~~~iqt-----da~ 194 (351)
T TIGR02038 137 ---LPTIPVNLDRPPHVGDVVLAIGNPYNL-----GQTITQGIISATGRNG----LSSV-----GRQNFIQT-----DAA 194 (351)
T ss_pred ---CceEeccCcCccCCCCEEEEEeCCCCC-----CCcEEEEEEEeccCcc----cCCC-----CcceEEEE-----CCc
Confidence 334555444456779999999986321 1122222222222110 0000 00111111 234
Q ss_pred CCcCCCCCCceeecCCCcEEEeeEEeecc
Q psy15052 318 SCQGDSGGPLQVKGKDGRYFLAGIISWGI 346 (422)
Q Consensus 318 ~C~GDSGgPLv~~~~~~~~~lvGV~S~g~ 346 (422)
.-.|.|||||+-. +| .++||.+...
T Consensus 195 i~~GnSGGpl~n~--~G--~vIGI~~~~~ 219 (351)
T TIGR02038 195 INAGNSGGALINT--NG--ELVGINTASF 219 (351)
T ss_pred cCCCCCcceEECC--CC--eEEEEEeeee
Confidence 5578999999854 23 3999998643
No 12
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.13 E-value=1.8e-06 Score=78.85 Aligned_cols=49 Identities=55% Similarity=1.118 Sum_probs=43.9
Q ss_pred CCCCCCccccCCCCceeEEeeeeecCcccCCCCCceEEEccCchhHHhhh
Q psy15052 371 GDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWVLDT 420 (422)
Q Consensus 371 i~~~~pl~~~~~~~~~~l~g~~s~g~~~~~~~~~~v~~~v~~~~~Wi~~~ 420 (422)
-++++||.+... ++|+|+||+|||..|.....|.+|+||++|.+||+++
T Consensus 184 gdsGgpl~~~~~-~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 184 GDSGGPLVCNDN-GRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred CCCCCcEEEEeC-CEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence 578999988764 8999999999999998767899999999999999875
No 13
>PRK10898 serine endoprotease; Provisional
Probab=98.04 E-value=8.1e-05 Score=72.83 Aligned_cols=82 Identities=13% Similarity=0.053 Sum_probs=54.5
Q ss_pred CCceEEEccCC-EEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCcc
Q psy15052 158 RPVNVYSNNEK-VDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPVE 236 (422)
Q Consensus 158 ~~C~GtLIs~~-~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~ 236 (422)
..-+|.+|+++ +|||+||=+.+ ...+.|.+.. ...+..+-+..+| .+||||||++..
T Consensus 78 ~~GSGfvi~~~G~IlTn~HVv~~--a~~i~V~~~d-----------g~~~~a~vv~~d~-------~~DlAvl~v~~~-- 135 (353)
T PRK10898 78 TLGSGVIMDQRGYILTNKHVIND--ADQIIVALQD-----------GRVFEALLVGSDS-------LTDLAVLKINAT-- 135 (353)
T ss_pred ceeeEEEEeCCeEEEecccEeCC--CCEEEEEeCC-----------CCEEEEEEEEEcC-------CCCEEEEEEcCC--
Confidence 34789999876 99999999975 4567776532 1223343344444 389999999853
Q ss_pred cCCCeeeeecCCCCCCCCCCeEEEEEec
Q psy15052 237 FAPNIVPICLPGSDDLLIGENATVTGWG 264 (422)
Q Consensus 237 ~s~~v~PicLp~~~~~~~~~~~~v~GwG 264 (422)
...++.|-.......|+.+++.|+.
T Consensus 136 ---~l~~~~l~~~~~~~~G~~V~aiG~P 160 (353)
T PRK10898 136 ---NLPVIPINPKRVPHIGDVVLAIGNP 160 (353)
T ss_pred ---CCCeeeccCcCcCCCCCEEEEEeCC
Confidence 1234445443445679999999975
No 14
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.99 E-value=6.5e-06 Score=77.34 Aligned_cols=50 Identities=40% Similarity=0.890 Sum_probs=44.2
Q ss_pred CCCCCCccccCCCCceeEEeeeeecCc-ccCCCCCceEEEccCchhHHhhhc
Q psy15052 371 GDSGGPLQVKGKDGRYFLAGIISWGIG-CAEANLPGVCTRISKFVPWVLDTV 421 (422)
Q Consensus 371 i~~~~pl~~~~~~~~~~l~g~~s~g~~-~~~~~~~~v~~~v~~~~~Wi~~~~ 421 (422)
-++++|+.-+. ...+.+.|++|||.+ |+.+..|+|||+|+.|.+||.++|
T Consensus 228 GDSGGPi~~~g-~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~ 278 (413)
T COG5640 228 GDSGGPIFHKG-EEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMT 278 (413)
T ss_pred CCCCCceEEeC-CCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHh
Confidence 46889997776 456789999999988 999999999999999999999875
No 15
>PRK10139 serine endoprotease; Provisional
Probab=97.86 E-value=9.6e-05 Score=74.72 Aligned_cols=141 Identities=18% Similarity=0.150 Sum_probs=82.7
Q ss_pred CCceEEEccC--CEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCc
Q psy15052 158 RPVNVYSNNE--KVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPV 235 (422)
Q Consensus 158 ~~C~GtLIs~--~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv 235 (422)
...+|.+|++ -+|||++|.+.+ ...+.|.+.. ...+..+-+-..| ..||||||++.+-
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~--a~~i~V~~~d-----------g~~~~a~vvg~D~-------~~DlAvlkv~~~~ 149 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQ--AQKISIQLND-----------GREFDAKLIGSDD-------QSDIALLQIQNPS 149 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCC--CCEEEEEECC-----------CCEEEEEEEEEcC-------CCCEEEEEecCCC
Confidence 4679999974 699999999986 4577777632 1233444444444 4799999998542
Q ss_pred ccCCCeeeeecCCCCCCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHHHhhcCCcccccCceEEEeecCCC
Q psy15052 236 EFAPNIVPICLPGSDDLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSMFLRAGRYEFISDIFMCAGFDNGG 315 (422)
Q Consensus 236 ~~s~~v~PicLp~~~~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~~~~~~~~~~i~~~~lCa~~~~~~ 315 (422)
...++.|........|+.+++.|+-.. . ........+.-+.+.. ... . -....+=+ +
T Consensus 150 ----~l~~~~lg~s~~~~~G~~V~aiG~P~g---~--~~tvt~GivS~~~r~~----~~~-~----~~~~~iqt-----d 206 (455)
T PRK10139 150 ----KLTQIAIADSDKLRVGDFAVAVGNPFG---L--GQTATSGIISALGRSG----LNL-E----GLENFIQT-----D 206 (455)
T ss_pred ----CCceeEecCccccCCCCEEEEEecCCC---C--CCceEEEEEccccccc----cCC-C----CcceEEEE-----C
Confidence 345677765555678999999997421 1 1112222222221110 000 0 00112222 2
Q ss_pred CCCCcCCCCCCceeecCCCcEEEeeEEeec
Q psy15052 316 RDSCQGDSGGPLQVKGKDGRYFLAGIISWG 345 (422)
Q Consensus 316 ~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g 345 (422)
...-+|.|||||+-. +| .++||.+..
T Consensus 207 a~in~GnSGGpl~n~--~G--~vIGi~~~~ 232 (455)
T PRK10139 207 ASINRGNSGGALLNL--NG--ELIGINTAI 232 (455)
T ss_pred CccCCCCCcceEECC--CC--eEEEEEEEE
Confidence 345579999999864 22 299999874
No 16
>PRK10942 serine endoprotease; Provisional
Probab=97.71 E-value=0.00031 Score=71.40 Aligned_cols=83 Identities=19% Similarity=0.243 Sum_probs=56.4
Q ss_pred CCceEEEccC--CEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCc
Q psy15052 158 RPVNVYSNNE--KVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPV 235 (422)
Q Consensus 158 ~~C~GtLIs~--~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv 235 (422)
...+|.+|+. -+|||++|.+.+ ...+.|.+... ..+..+-+..+| ..||||||++.+-
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~--a~~i~V~~~dg-----------~~~~a~vv~~D~-------~~DlAvlki~~~~ 170 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDN--ATKIKVQLSDG-----------RKFDAKVVGKDP-------RSDIALIQLQNPK 170 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCC--CCEEEEEECCC-----------CEEEEEEEEecC-------CCCEEEEEecCCC
Confidence 3579999985 599999999986 45677776421 233343333444 4799999997432
Q ss_pred ccCCCeeeeecCCCCCCCCCCeEEEEEec
Q psy15052 236 EFAPNIVPICLPGSDDLLIGENATVTGWG 264 (422)
Q Consensus 236 ~~s~~v~PicLp~~~~~~~~~~~~v~GwG 264 (422)
...++.|-.......|+.+++.|+-
T Consensus 171 ----~l~~~~lg~s~~l~~G~~V~aiG~P 195 (473)
T PRK10942 171 ----NLTAIKMADSDALRVGDYTVAIGNP 195 (473)
T ss_pred ----CCceeEecCccccCCCCEEEEEcCC
Confidence 2345666555556779999998864
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=97.66 E-value=4.1e-05 Score=69.80 Aligned_cols=45 Identities=53% Similarity=1.163 Sum_probs=41.2
Q ss_pred CCCCCCccccCCCCceeEEeeeeecCcccCCCCCceEEEccCchhHH
Q psy15052 371 GDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWV 417 (422)
Q Consensus 371 i~~~~pl~~~~~~~~~~l~g~~s~g~~~~~~~~~~v~~~v~~~~~Wi 417 (422)
.++++|+.+..+ +|+|+|++|||..|...+.|.+|+||++|.+||
T Consensus 185 gdsG~pl~~~~~--~~~l~Gi~s~g~~C~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 185 GDSGGPLVCNDG--RWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred CCCCCeeEEECC--CEEEEEEEEECCCCCCCCCCCEEEEeccccccC
Confidence 478999988765 999999999999998777899999999999998
No 18
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.53 E-value=0.00019 Score=58.24 Aligned_cols=21 Identities=5% Similarity=-0.178 Sum_probs=19.1
Q ss_pred ceEEEccCC-EEeecccCcCCC
Q psy15052 160 VNVYSNNEK-VDDFSTESVNSL 180 (422)
Q Consensus 160 C~GtLIs~~-~VLTAAhCv~~~ 180 (422)
|+|.+|+++ +|||||||+...
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~ 22 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDW 22 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCC
T ss_pred CEEEEEcCCceEEEchhheecc
Confidence 689999999 999999999864
No 19
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=97.48 E-value=8.8e-05 Score=66.97 Aligned_cols=44 Identities=48% Similarity=1.066 Sum_probs=40.1
Q ss_pred cCCCCCCccccCCCCceeEEeeeeecCcccCCCCCceEEEccCchhHH
Q psy15052 370 TGDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWV 417 (422)
Q Consensus 370 ~i~~~~pl~~~~~~~~~~l~g~~s~g~~~~~~~~~~v~~~v~~~~~Wi 417 (422)
..++|+|+.+... +|+||+|++..|...+.|.||+||+.|++||
T Consensus 177 ~g~sG~pl~~~~~----~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 177 QGDSGGPLICNNN----YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp TTTTTSEEEETTE----EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccee----eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 4678999988775 8999999999999988899999999999999
No 20
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=95.84 E-value=0.037 Score=59.12 Aligned_cols=65 Identities=12% Similarity=0.078 Sum_probs=39.0
Q ss_pred EEEccCCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCcccCCCe
Q psy15052 162 VYSNNEKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPVEFAPNI 241 (422)
Q Consensus 162 GtLIs~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~~s~~v 241 (422)
.|||+|++|+|++|=... .-.|.+|.... ..+.+.+--.|+. .|+.+-||++=|. -+
T Consensus 69 aTLigpqYiVSV~HN~~g----y~~v~FG~~g~---------~~Y~iV~RNn~~~-------~Df~~pRLnK~VT---Ev 125 (769)
T PF02395_consen 69 ATLIGPQYIVSVKHNGKG----YNSVSFGNEGQ---------NTYKIVDRNNYPS-------GDFHMPRLNKFVT---EV 125 (769)
T ss_dssp -EEEETTEEEBETTG-TS----CCEECESCSST---------CEEEEEEEEBETT-------STEBEEEESS------SS
T ss_pred EEEecCCeEEEEEccCCC----cCceeecccCC---------ceEEEEEccCCCC-------cccceeecCceEE---EE
Confidence 799999999999997633 23456665332 3456666656654 6999999998664 46
Q ss_pred eeeecCCC
Q psy15052 242 VPICLPGS 249 (422)
Q Consensus 242 ~PicLp~~ 249 (422)
.|+-....
T Consensus 126 aP~~~t~~ 133 (769)
T PF02395_consen 126 APAEMTTA 133 (769)
T ss_dssp ----BBSS
T ss_pred eccccccc
Confidence 67666443
No 21
>smart00680 CLIP Clip or disulphide knot domain. Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme.
Probab=87.77 E-value=0.26 Score=33.43 Aligned_cols=38 Identities=18% Similarity=0.482 Sum_probs=25.9
Q ss_pred eee-CCceeeeeeeeeecc-----CCC-------ceecccc-CCeEEEEecC
Q psy15052 29 CSV-NSVEGRCMFVWECIN-----TDG-------HHLGMCV-DTFMFGSCCS 66 (422)
Q Consensus 29 C~~-~~~~g~C~~~~~C~~-----~~g-------~~~~~C~-~~~~~~~CC~ 66 (422)
|.+ ++..|.|+.+.+|+. ... .....|+ ++.-..||||
T Consensus 1 C~tp~~~~G~Cv~~~~C~~~~~~l~~~~~~~~~~l~~~~Cg~~~~~~~vCCp 52 (52)
T smart00680 1 CRTPDGERGTCVPISDCPSLLSLLKSDPPEDLNFLRKSQCGFGNREPLVCCP 52 (52)
T ss_pred CcCCCCCcEEeEEHHhChHHHHHHccCCHHHHHHHHHccCCCCCCCEeeeCc
Confidence 677 789999999999983 221 1234895 3334558996
No 22
>PF12032 CLIP: Regulatory CLIP domain of proteinases; InterPro: IPR022700 CLIP is a regulatory domain which controls the proteinase action of various proteins of the trypsin family, e.g. easter and pap2. The CLIP domain remains linked to the protease domain after cleavage of a conserved residue which retains the protein in zymogen form. It is named CLIP because it can be drawn in the shape of a paper clip. It has many disulphide bonds and highly conserved cysteine residues, and so it folds extensively [, ]. This entry represents the CLIP domain and is found in association with PF00089 from PFAM.; PDB: 2IKE_A 2XXL_A 2IKD_A.
Probab=86.64 E-value=0.12 Score=35.55 Aligned_cols=17 Identities=35% Similarity=0.902 Sum_probs=15.3
Q ss_pred eee-CCceeeeeeeeeec
Q psy15052 29 CSV-NSVEGRCMFVWECI 45 (422)
Q Consensus 29 C~~-~~~~g~C~~~~~C~ 45 (422)
|++ ++..|+|+.+.+|+
T Consensus 1 C~tp~g~~G~Cv~i~~C~ 18 (54)
T PF12032_consen 1 CTTPNGEPGRCVPIRSCP 18 (54)
T ss_dssp EE-TTSSEEEEEETTTBH
T ss_pred CcCCCCCcEEEecHHHCH
Confidence 788 89999999999999
No 23
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=81.80 E-value=34 Score=33.30 Aligned_cols=145 Identities=18% Similarity=0.106 Sum_probs=77.0
Q ss_pred CCceEEEcc-CCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCcc
Q psy15052 158 RPVNVYSNN-EKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPVE 236 (422)
Q Consensus 158 ~~C~GtLIs-~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~ 236 (422)
...+|.+++ ..+|||-.|=+.+ ...+.|.+ .+ ...+..+-+-..+ ..|+|+||.+..-.
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~--a~~i~v~l--~d---------g~~~~a~~vg~d~-------~~dlavlki~~~~~ 131 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAG--AEEITVTL--AD---------GREVPAKLVGKDP-------ISDLAVLKIDGAGG 131 (347)
T ss_pred ccccEEEEcCCeEEEecceecCC--cceEEEEe--CC---------CCEEEEEEEecCC-------ccCEEEEEeccCCC
Confidence 456788887 8899999998876 56677766 11 1233344333332 47999999986432
Q ss_pred cCCCeeeeecCCCCCCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHHHhhcCCcccccCceEEEeecCCCC
Q psy15052 237 FAPNIVPICLPGSDDLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSMFLRAGRYEFISDIFMCAGFDNGGR 316 (422)
Q Consensus 237 ~s~~v~PicLp~~~~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~~~~~~~~~~i~~~~lCa~~~~~~~ 316 (422)
...+.+........++...+.|-... .....-..-+....+. +-.... ...+.+ ....
T Consensus 132 ----~~~~~~~~s~~l~vg~~v~aiGnp~g-----~~~tvt~Givs~~~r~-~v~~~~-------~~~~~I-----qtdA 189 (347)
T COG0265 132 ----LPVIALGDSDKLRVGDVVVAIGNPFG-----LGQTVTSGIVSALGRT-GVGSAG-------GYVNFI-----QTDA 189 (347)
T ss_pred ----CceeeccCCCCcccCCEEEEecCCCC-----cccceeccEEeccccc-cccCcc-------cccchh-----hccc
Confidence 22334444444445666666663321 1111111222222221 100000 001111 1123
Q ss_pred CCCcCCCCCCceeecCCCcEEEeeEEeecccc
Q psy15052 317 DSCQGDSGGPLQVKGKDGRYFLAGIISWGIGC 348 (422)
Q Consensus 317 ~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~~C 348 (422)
...+|.|||||+..+. .++||.+.....
T Consensus 190 ain~gnsGgpl~n~~g----~~iGint~~~~~ 217 (347)
T COG0265 190 AINPGNSGGPLVNIDG----EVVGINTAIIAP 217 (347)
T ss_pred ccCCCCCCCceEcCCC----cEEEEEEEEecC
Confidence 4678999999986422 299998876554
No 24
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=81.18 E-value=5.8 Score=34.57 Aligned_cols=67 Identities=16% Similarity=0.027 Sum_probs=36.4
Q ss_pred ceEEEccCCEEeecccCcCCCCCceEEEEEeeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCcccCC
Q psy15052 160 VNVYSNNEKVDDFSTESVNSLLTSQIKIRVGEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPVEFAP 239 (422)
Q Consensus 160 C~GtLIs~~~VLTAAhCv~~~~~~~~~V~lG~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~~s~ 239 (422)
+.+.-|..+|.|--.|.- ....+.++. ..+++.+.+.. .+......||+|++|.+.-.|.+
T Consensus 27 ~l~~gi~~~~~lvp~H~~-----~~~~i~i~g------------~~~~~~d~~~l--v~~~~~~~Dl~~v~l~~~~kfrD 87 (172)
T PF00548_consen 27 MLALGIYDRYFLVPTHEE-----PEDTIYIDG------------VEYKVDDSVVL--VDRDGVDTDLTLVKLPRNPKFRD 87 (172)
T ss_dssp EEEEEEEBTEEEEEGGGG-----GCSEEEETT------------EEEEEEEEEEE--EETTSSEEEEEEEEEESSS-B--
T ss_pred EecceEeeeEEEEECcCC-----CcEEEEECC------------EEEEeeeeEEE--ecCCCcceeEEEEEccCCcccCc
Confidence 556679999999999921 222333321 12233332221 11122246999999998888876
Q ss_pred Ceeeee
Q psy15052 240 NIVPIC 245 (422)
Q Consensus 240 ~v~Pic 245 (422)
..+-++
T Consensus 88 Irk~~~ 93 (172)
T PF00548_consen 88 IRKFFP 93 (172)
T ss_dssp GGGGSB
T ss_pred hhhhhc
Confidence 555555
No 25
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=79.56 E-value=28 Score=31.86 Aligned_cols=177 Identities=16% Similarity=0.126 Sum_probs=73.8
Q ss_pred EccCCEEeecccCcCCCCCceEEEEE--eeeecccCCCCCCcEEEeEEEEEECCCCCCCCCCCceEEEEECCCcccCCCe
Q psy15052 164 SNNEKVDDFSTESVNSLLTSQIKIRV--GEYDFSKLEEPYPYQERGVVKKMVHPKYNFFTYEYDLALVRLETPVEFAPNI 241 (422)
Q Consensus 164 LIs~~~VLTAAhCv~~~~~~~~~V~l--G~~~~~~~~~~~~~~~~~V~~i~iHp~y~~~~~~nDIALLkL~~pv~~s~~v 241 (422)
+.--.|+||-+|-+...+ ..+.|.- |.+....... .+|..+ ...||.||||.+.+. +.-
T Consensus 37 igyG~~iItn~HLf~~nn-g~L~i~s~hG~f~v~nt~~------lkv~~i----------~~~DiviirmPkDfp--Pf~ 97 (235)
T PF00863_consen 37 IGYGSYIITNAHLFKRNN-GELTIKSQHGEFTVPNTTQ------LKVHPI----------EGRDIVIIRMPKDFP--PFP 97 (235)
T ss_dssp EEETTEEEEEGGGGSSTT-CEEEEEETTEEEEECEGGG------SEEEE-----------TCSSEEEEE--TTS------
T ss_pred EeECCEEEEChhhhccCC-CeEEEEeCceEEEcCCccc------cceEEe----------CCccEEEEeCCcccC--Ccc
Confidence 345789999999987643 4465553 3443332221 122222 147999999988664 222
Q ss_pred eeeecCCCCCCCCCCeEEEEEecccCCCCCCCCCceEEEEeccChhhHHHHHhhcCCcccccCceEEEeecCCCCCCCcC
Q psy15052 242 VPICLPGSDDLLIGENATVTGWGRLSEGGSLPPVLQKVTVPIVSNEKCRSMFLRAGRYEFISDIFMCAGFDNGGRDSCQG 321 (422)
Q Consensus 242 ~PicLp~~~~~~~~~~~~v~GwG~~~~~~~~~~~L~~~~v~v~~~~~C~~~~~~~~~~~~i~~~~lCa~~~~~~~~~C~G 321 (422)
+.+++ ..+..++.+.++|--...... .. ...+...+.. .....+-... .++=.|
T Consensus 98 ~kl~F---R~P~~~e~v~mVg~~fq~k~~--~s--~vSesS~i~p---------------~~~~~fWkHw----IsTk~G 151 (235)
T PF00863_consen 98 QKLKF---RAPKEGERVCMVGSNFQEKSI--SS--TVSESSWIYP---------------EENSHFWKHW----ISTKDG 151 (235)
T ss_dssp S---B-------TT-EEEEEEEECSSCCC--EE--EEEEEEEEEE---------------ETTTTEEEE-----C---TT
T ss_pred hhhhc---cCCCCCCEEEEEEEEEEcCCe--eE--EECCceEEee---------------cCCCCeeEEE----ecCCCC
Confidence 22222 123456677777754322111 11 1111111110 0111222211 122246
Q ss_pred CCCCCceeecCCCcEEEeeEEeecccccCCCCCceeeecccccccccccCCCCCCc--cccCCCCceeEEeeeeecCc
Q psy15052 322 DSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKFVPWVLDTGDSGGPL--QVKGKDGRYFLAGIISWGIG 397 (422)
Q Consensus 322 DSGgPLv~~~~~~~~~lvGV~S~g~~C~~~~~p~vyt~V~~y~~WI~~~i~~~~pl--~~~~~~~~~~l~g~~s~g~~ 397 (422)
|=|.||+.. .+| .+|||.|.+..-. .-..|+.+.. ++++..++....+ ..+++ |.--.++||.-
T Consensus 152 ~CG~PlVs~-~Dg--~IVGiHsl~~~~~---~~N~F~~f~~--~f~~~~l~~~~~~~w~k~W~----fn~d~i~Wg~l 217 (235)
T PF00863_consen 152 DCGLPLVST-KDG--KIVGIHSLTSNTS---SRNYFTPFPD--DFEEFYLENIEELEWVKHWK----FNPDKISWGSL 217 (235)
T ss_dssp -TT-EEEET-TT----EEEEEEEEETTT---SSEEEEE--T--THHHHHCC-CCC--EECS--------CCCEEETTE
T ss_pred ccCCcEEEc-CCC--cEEEEEcCccCCC---CeEEEEcCCH--HHHHHHhcccccCccccCCE----ECCccceEcCc
Confidence 779999986 344 3999999764422 2346766543 3333333332222 11232 55667788754
No 26
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=76.54 E-value=2.5 Score=34.43 Aligned_cols=34 Identities=29% Similarity=0.400 Sum_probs=25.7
Q ss_pred cCCCCCCceeecCCCcEEEeeEEeecccccCCCCCceeeecccc
Q psy15052 320 QGDSGGPLQVKGKDGRYFLAGIISWGIGCAEANLPGVCTRISKF 363 (422)
Q Consensus 320 ~GDSGgPLv~~~~~~~~~lvGV~S~g~~C~~~~~p~vyt~V~~y 363 (422)
+||-||+|.|+.. ++||++.|- +.-.-|++|+.+
T Consensus 89 PGdCGg~L~C~HG-----ViGi~Tagg-----~g~VaF~dir~~ 122 (127)
T PF00947_consen 89 PGDCGGILRCKHG-----VIGIVTAGG-----EGHVAFADIRDL 122 (127)
T ss_dssp TT-TCSEEEETTC-----EEEEEEEEE-----TTEEEEEECCCG
T ss_pred CCCCCceeEeCCC-----eEEEEEeCC-----CceEEEEechhh
Confidence 6899999999865 999999862 223568888775
No 27
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=44.15 E-value=15 Score=34.11 Aligned_cols=22 Identities=32% Similarity=0.493 Sum_probs=16.6
Q ss_pred cCCCCCCceeecCCCcEEEeeEEeec
Q psy15052 320 QGDSGGPLQVKGKDGRYFLAGIISWG 345 (422)
Q Consensus 320 ~GDSGgPLv~~~~~~~~~lvGV~S~g 345 (422)
.||||+|++..+. .|+||.+..
T Consensus 207 ~GDSGSPVVt~dg----~liGVHTGS 228 (297)
T PF05579_consen 207 PGDSGSPVVTEDG----DLIGVHTGS 228 (297)
T ss_dssp GGCTT-EEEETTC-----EEEEEEEE
T ss_pred CCCCCCccCcCCC----CEEEEEecC
Confidence 4899999998744 399999864
No 28
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=31.13 E-value=41 Score=30.27 Aligned_cols=26 Identities=35% Similarity=0.521 Sum_probs=22.0
Q ss_pred CCCCcCCCCCCceeecCCCcEEEeeEEeecc
Q psy15052 316 RDSCQGDSGGPLQVKGKDGRYFLAGIISWGI 346 (422)
Q Consensus 316 ~~~C~GDSGgPLv~~~~~~~~~lvGV~S~g~ 346 (422)
...-+|=||+|++.+++ |+|-+++..
T Consensus 175 GGIvqGMSGSPI~qdGK-----LiGAVthvf 200 (218)
T PF05580_consen 175 GGIVQGMSGSPIIQDGK-----LIGAVTHVF 200 (218)
T ss_pred CCEEecccCCCEEECCE-----EEEEEEEEE
Confidence 35678999999998765 999999875
No 29
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=31.08 E-value=28 Score=28.68 Aligned_cols=24 Identities=38% Similarity=0.639 Sum_probs=18.0
Q ss_pred cCCCCCCceeecCCCcEEEeeEEeeccc
Q psy15052 320 QGDSGGPLQVKGKDGRYFLAGIISWGIG 347 (422)
Q Consensus 320 ~GDSGgPLv~~~~~~~~~lvGV~S~g~~ 347 (422)
.||||-|++-+ .|+ +|||+-.|..
T Consensus 105 ~GDSGRpi~DN--sGr--VVaIVLGG~n 128 (158)
T PF00944_consen 105 PGDSGRPIFDN--SGR--VVAIVLGGAN 128 (158)
T ss_dssp TTSTTEEEEST--TSB--EEEEEEEEEE
T ss_pred CCCCCCccCcC--CCC--EEEEEecCCC
Confidence 69999999754 343 8899887654
No 30
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=27.86 E-value=43 Score=35.96 Aligned_cols=22 Identities=5% Similarity=-0.229 Sum_probs=19.1
Q ss_pred CCCceEEEccCC-EEeecccCcC
Q psy15052 157 GRPVNVYSNNEK-VDDFSTESVN 178 (422)
Q Consensus 157 ~~~C~GtLIs~~-~VLTAAhCv~ 178 (422)
...|+|++||++ .|||--||..
T Consensus 46 ~gGCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 46 GGGCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred CCceeEEEEcCCceEEecchhhh
Confidence 456999999987 8999999983
No 31
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=26.56 E-value=53 Score=30.79 Aligned_cols=44 Identities=30% Similarity=0.446 Sum_probs=32.6
Q ss_pred cCCCCCCccccCCCCceeEEeeeeecCc-ccCCCCCceEEEccCchhH
Q psy15052 370 TGDSGGPLQVKGKDGRYFLAGIISWGIG-CAEANLPGVCTRISKFVPW 416 (422)
Q Consensus 370 ~i~~~~pl~~~~~~~~~~l~g~~s~g~~-~~~~~~~~v~~~v~~~~~W 416 (422)
..+.++||+-.. +|+|+++|+.+-+.. |... ...|.+|+.|.+=
T Consensus 230 ~~d~Gg~lv~~~-~gr~tlIGv~~~~~~~~~~~--~~~f~~v~~~~~~ 274 (282)
T PF03761_consen 230 KGDRGGPLVKNI-NGRWTLIGVGASGNYECNKN--NSYFFNVSWYQDE 274 (282)
T ss_pred CCCccCeEEEEE-CCCEEEEEEEccCCCccccc--ccEEEEHHHhhhh
Confidence 456789996555 799999999987764 4332 7788888887653
Done!