Query psy15989
Match_columns 363
No_of_seqs 427 out of 2686
Neff 9.9
Searched_HMMs 46136
Date Fri Aug 16 18:06:55 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy15989.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/15989hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 2.3E-35 5E-40 256.8 17.6 211 20-256 1-224 (232)
2 KOG3627|consensus 100.0 1E-33 2.2E-38 250.4 18.2 223 16-257 9-246 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 7.4E-33 1.6E-37 240.8 17.9 212 19-257 1-225 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 7.9E-31 1.7E-35 226.4 12.5 205 20-256 1-215 (220)
5 COG5640 Secreted trypsin-like 99.9 3.1E-26 6.7E-31 197.8 11.6 224 15-257 28-270 (413)
6 PF03761 DUF316: Domain of unk 99.7 2.5E-16 5.4E-21 141.1 16.0 205 4-233 27-253 (282)
7 KOG3627|consensus 99.6 1.1E-15 2.4E-20 135.1 9.5 105 252-361 87-194 (256)
8 cd00190 Tryp_SPc Trypsin-like 99.6 1.9E-14 4.1E-19 124.9 10.9 108 248-362 67-175 (232)
9 smart00020 Tryp_SPc Trypsin-li 99.4 9.6E-13 2.1E-17 114.1 10.4 106 250-362 69-176 (229)
10 PF09342 DUF1986: Domain of un 99.4 1E-12 2.3E-17 108.8 9.9 102 28-144 13-114 (267)
11 PF00089 Trypsin: Trypsin; In 99.3 5.8E-12 1.3E-16 108.3 10.0 104 249-361 66-169 (220)
12 COG3591 V8-like Glu-specific e 98.5 7.3E-07 1.6E-11 76.3 9.4 173 27-234 45-223 (251)
13 PF13365 Trypsin_2: Trypsin-li 98.4 1.2E-05 2.7E-10 61.8 13.4 22 49-70 1-23 (120)
14 TIGR02037 degP_htrA_DO peripla 97.8 0.00055 1.2E-08 65.1 13.4 70 46-145 57-127 (428)
15 TIGR02038 protease_degS peripl 97.7 0.0022 4.7E-08 59.2 15.5 74 30-129 54-135 (351)
16 PRK10139 serine endoprotease; 97.5 0.0027 5.9E-08 60.6 13.5 69 47-145 90-160 (455)
17 PRK10898 serine endoprotease; 97.5 0.0055 1.2E-07 56.6 15.1 73 31-129 55-135 (353)
18 PRK10942 serine endoprotease; 97.3 0.0057 1.2E-07 58.7 13.7 68 47-144 111-180 (473)
19 COG5640 Secreted trypsin-like 97.0 0.0019 4E-08 57.6 6.4 110 250-362 104-221 (413)
20 PF09342 DUF1986: Domain of un 95.4 0.032 6.9E-07 47.3 5.2 59 246-316 72-130 (267)
21 PF02395 Peptidase_S6: Immunog 95.3 0.26 5.6E-06 50.0 12.0 65 51-144 69-133 (769)
22 PF00548 Peptidase_C3: 3C cyst 94.1 2 4.4E-05 35.2 12.7 147 45-234 23-170 (172)
23 PF00863 Peptidase_C4: Peptida 91.3 1.1 2.4E-05 38.5 7.6 25 208-234 148-172 (235)
24 PF03761 DUF316: Domain of unk 87.1 3.3 7.2E-05 36.9 8.1 42 267-315 158-199 (282)
25 COG0265 DegQ Trypsin-like seri 56.2 1.7E+02 0.0036 27.0 11.9 58 47-130 72-130 (347)
26 PF05416 Peptidase_C37: Southa 46.2 34 0.00075 32.0 4.5 27 207-233 499-525 (535)
27 PF00947 Pico_P2A: Picornaviru 46.2 18 0.00039 27.7 2.3 21 210-234 89-109 (127)
28 PF10459 Peptidase_S46: Peptid 41.9 16 0.00036 37.0 2.0 21 48-68 48-69 (698)
29 smart00816 Amb_V_allergen Amb 29.8 32 0.0007 20.3 1.1 31 1-40 4-36 (45)
30 TIGR02037 degP_htrA_DO peripla 29.1 85 0.0018 29.9 4.5 38 269-316 104-141 (428)
31 cd07268 Glo_EDI_BRP_like_4 Thi 23.9 61 0.0013 25.7 2.0 27 269-295 36-63 (149)
32 PF03913 Amb_V_allergen: Amb V 23.3 40 0.00086 19.8 0.7 30 1-39 3-34 (44)
33 PF05579 Peptidase_S32: Equine 21.6 62 0.0013 28.4 1.8 21 210-233 207-227 (297)
34 COG3102 Uncharacterized protei 21.3 76 0.0017 25.7 2.1 26 269-294 74-100 (185)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=2.3e-35 Score=256.78 Aligned_cols=211 Identities=33% Similarity=0.460 Sum_probs=160.5
Q ss_pred eecCeecCCCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCce
Q psy15989 20 VTYGQKTARGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQ 99 (363)
Q Consensus 20 IvgG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~ 99 (363)
|+||+++.+++|||+|+|+... ..++|+||||+++||||||||+... ....+.|++|........ ...+
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~---~~~~C~GtlIs~~~VLTaAhC~~~~------~~~~~~v~~g~~~~~~~~--~~~~ 69 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTG---GRHFCGGSLISPRWVLTAAHCVYSS------APSNYTVRLGSHDLSSNE--GGGQ 69 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccC---CcEEEEEEEeeCCEEEECHHhcCCC------CCccEEEEeCcccccCCC--CceE
Confidence 6899999999999999998763 2789999999999999999999753 246789999998775432 3567
Q ss_pred eeeeeEEEECCCCCCCCCCCCeEEEEeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEEecCeeEEe
Q psy15989 100 NKQVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFKIDSAWYLR 179 (363)
Q Consensus 100 ~~~V~~i~~hp~y~~~~~~nDIALl~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~~~~~~~~~ 179 (363)
.+.|+++++||+|+.....+|||||+|++|+.+++.++|||||.... .. ..+ ..+. ..||+...... ....
T Consensus 70 ~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~-~~----~~~-~~~~-~~G~g~~~~~~--~~~~ 140 (232)
T cd00190 70 VIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGY-NL----PAG-TTCT-VSGWGRTSEGG--PLPD 140 (232)
T ss_pred EEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccc-cC----CCC-CEEE-EEeCCcCCCCC--CCCc
Confidence 88999999999999988899999999999999999999999998742 10 111 2232 46666543210 1123
Q ss_pred eeeEEEEeecCeeeccccc----------eEEecC--CCccCcccccceeEEEEcCccEEEEEEEEEEecCcceecCC-C
Q psy15989 180 GIVSITVARDGLRVCDTKH----------YVVFTD--VANVCNGDSGGGMVFKIDSAWYLRGIVSITVARDGLRVCDT-K 246 (363)
Q Consensus 180 ~l~~~~~~~~~~~~C~~~~----------~~~~~~--~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~~~~~g~~~c~~-~ 246 (363)
.++...+.+.+...|...+ .|.... ....|.|||||||++..+++++|+||+|++. . |.. .
T Consensus 141 ~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~-----~-c~~~~ 214 (232)
T cd00190 141 VLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS-----G-CARPN 214 (232)
T ss_pred eeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh-----c-cCCCC
Confidence 4556666777777786332 233333 5679999999999999889999999999952 2 765 5
Q ss_pred cceEEEeeeE
Q psy15989 247 HYVVFTDVKR 256 (363)
Q Consensus 247 ~~~v~~~V~~ 256 (363)
.+.+|++|..
T Consensus 215 ~~~~~t~v~~ 224 (232)
T cd00190 215 YPGVYTRVSS 224 (232)
T ss_pred CCCEEEEcHH
Confidence 6778888654
No 2
>KOG3627|consensus
Probab=100.00 E-value=1e-33 Score=250.44 Aligned_cols=223 Identities=28% Similarity=0.422 Sum_probs=167.4
Q ss_pred ccceeecCeecCCCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCC
Q psy15989 16 AQPLVTYGQKTARGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDE 95 (363)
Q Consensus 16 ~~~rIvgG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~ 95 (363)
...||+||.++.+++|||+|+|+.... ..|+|||+||+++||||||||+.... .. .+.|++|.+........
T Consensus 9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~--~~~~Cggsli~~~~vltaaHC~~~~~-----~~-~~~V~~G~~~~~~~~~~ 80 (256)
T KOG3627|consen 9 PEGRIVGGTEAEPGSFPWQVSLQYGGN--GRHLCGGSLISPRWVLTAAHCVKGAS-----AS-LYTVRLGEHDINLSVSE 80 (256)
T ss_pred ccCCEeCCccCCCCCCCCEEEEEECCC--cceeeeeEEeeCCEEEEChhhCCCCC-----Cc-ceEEEECcccccccccc
Confidence 357899999999999999999987642 26799999999999999999997632 11 78899998866544211
Q ss_pred CC-ceeeeeeEEEECCCCCCCCCC-CCeEEEEeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEEec
Q psy15989 96 GG-VQNKQVKRVHIYPTFNSSNYL-GDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFKID 173 (363)
Q Consensus 96 ~~-~~~~~V~~i~~hp~y~~~~~~-nDIALl~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~~~ 173 (363)
.. .....|.++++||+|+..+.. ||||||+|++++.|++.|+|||||...... ...+...|. .+|||.....
T Consensus 81 ~~~~~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~----~~~~~~~~~-v~GWG~~~~~- 154 (256)
T KOG3627|consen 81 GEEQLVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPY----FPPGGTTCL-VSGWGRTESG- 154 (256)
T ss_pred CchhhhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccC----CCCCCCEEE-EEeCCCcCCC-
Confidence 21 245568899999999998877 999999999999999999999998554310 122435666 6889876433
Q ss_pred CeeEEeeeeEEEEeecCeeeccccc----------eEEec--CCCccCcccccceeEEEEcCccEEEEEEEEEEecCcce
Q psy15989 174 SAWYLRGIVSITVARDGLRVCDTKH----------YVVFT--DVANVCNGDSGGGMVFKIDSAWYLRGIVSITVARDGLR 241 (363)
Q Consensus 174 ~~~~~~~l~~~~~~~~~~~~C~~~~----------~~~~~--~~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~~~~~g~~ 241 (363)
.......|+...+++.+...|...+ .|... ...+.|.|||||||++....+++++||+|++ ..
T Consensus 155 ~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G-----~~ 229 (256)
T KOG3627|consen 155 GGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWG-----SG 229 (256)
T ss_pred CCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEec-----CC
Confidence 1112345667788888888897433 34442 3466899999999999987789999999994 33
Q ss_pred ecCCC-cceEEEeeeEE
Q psy15989 242 VCDTK-HYVVFTDVKRV 257 (363)
Q Consensus 242 ~c~~~-~~~v~~~V~~i 257 (363)
.|... .|.+|++|..+
T Consensus 230 ~C~~~~~P~vyt~V~~y 246 (256)
T KOG3627|consen 230 GCGQPNYPGVYTRVSSY 246 (256)
T ss_pred CCCCCCCCeEEeEhHHh
Confidence 26654 78899987653
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=7.4e-33 Score=240.77 Aligned_cols=212 Identities=32% Similarity=0.452 Sum_probs=155.6
Q ss_pred eeecCeecCCCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCc
Q psy15989 19 LVTYGQKTARGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGV 98 (363)
Q Consensus 19 rIvgG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~ 98 (363)
||+||+++.+++|||+|.++... ..+.|+||||+++||||||||+.... ...+.|++|.+..... ...
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~---~~~~C~GtlIs~~~VLTaahC~~~~~------~~~~~v~~g~~~~~~~---~~~ 68 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRG---GRHFCGGSLISPRWVLTAAHCVYGSD------PSNIRVRLGSHDLSSG---EEG 68 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcC---CCcEEEEEEecCCEEEECHHHcCCCC------CcceEEEeCcccCCCC---CCc
Confidence 68999999999999999998654 37889999999999999999997531 3579999998876543 222
Q ss_pred eeeeeeEEEECCCCCCCCCCCCeEEEEeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEEecCeeEE
Q psy15989 99 QNKQVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFKIDSAWYL 178 (363)
Q Consensus 99 ~~~~V~~i~~hp~y~~~~~~nDIALl~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~~~~~~~~ 178 (363)
+.+.|.++++||+|+.....+|||||+|++|+.+++.++|||||...... ..+ ..+. ..||+.... ......
T Consensus 69 ~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~-----~~~-~~~~-~~g~g~~~~-~~~~~~ 140 (229)
T smart00020 69 QVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNV-----PAG-TTCT-VSGWGRTSE-GAGSLP 140 (229)
T ss_pred eEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCccccc-----CCC-CEEE-EEeCCCCCC-CCCcCC
Confidence 78899999999999988889999999999999999999999998862211 111 1222 355554321 111112
Q ss_pred eeeeEEEEeecCeeeccccc----------eEEecC--CCccCcccccceeEEEEcCccEEEEEEEEEEecCcceecC-C
Q psy15989 179 RGIVSITVARDGLRVCDTKH----------YVVFTD--VANVCNGDSGGGMVFKIDSAWYLRGIVSITVARDGLRVCD-T 245 (363)
Q Consensus 179 ~~l~~~~~~~~~~~~C~~~~----------~~~~~~--~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~~~~~g~~~c~-~ 245 (363)
..+....+.+.+...|...+ .|.... ....|.||+||||+...+ +|+|+||++++ . .|. .
T Consensus 141 ~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~-~~~l~Gi~s~g-----~-~C~~~ 213 (229)
T smart00020 141 DTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDG-RWVLVGIVSWG-----S-GCARP 213 (229)
T ss_pred CEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECC-CEEEEEEEEEC-----C-CCCCC
Confidence 23445555666666676322 122222 467899999999999877 99999999994 3 566 4
Q ss_pred CcceEEEeeeEE
Q psy15989 246 KHYVVFTDVKRV 257 (363)
Q Consensus 246 ~~~~v~~~V~~i 257 (363)
..+.++++|..+
T Consensus 214 ~~~~~~~~i~~~ 225 (229)
T smart00020 214 GKPGVYTRVSSY 225 (229)
T ss_pred CCCCEEEEeccc
Confidence 567888887753
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.97 E-value=7.9e-31 Score=226.39 Aligned_cols=205 Identities=29% Similarity=0.561 Sum_probs=151.2
Q ss_pred eecCeecCCCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCce
Q psy15989 20 VTYGQKTARGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQ 99 (363)
Q Consensus 20 IvgG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~ 99 (363)
|+||.++.+++|||+|.|+.... .++|+|+||+++||||||||+.. ...+.+++|........ ...+
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~---~~~C~G~li~~~~vLTaahC~~~--------~~~~~v~~g~~~~~~~~--~~~~ 67 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNG---RFFCTGTLISPRWVLTAAHCVDG--------ASDIKVRLGTYSIRNSD--GSEQ 67 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTT---EEEEEEEEEETTEEEEEGGGHTS--------GGSEEEEESESBTTSTT--TTSE
T ss_pred CCCCEECCCCCCCeEEEEeeCCC---CeeEeEEeccccccccccccccc--------ccccccccccccccccc--cccc
Confidence 78999999999999999988653 88999999999999999999974 35788999985444332 4458
Q ss_pred eeeeeEEEECCCCCCCCCCCCeEEEEeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEEecCeeEEe
Q psy15989 100 NKQVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFKIDSAWYLR 179 (363)
Q Consensus 100 ~~~V~~i~~hp~y~~~~~~nDIALl~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~~~~~~~~~ 179 (363)
.+.|++++.||+|+.....+|||||+|++++.+.+.++|+||+.... . ......|. ..||+.....+ ...
T Consensus 68 ~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~-~-----~~~~~~~~-~~G~~~~~~~~---~~~ 137 (220)
T PF00089_consen 68 TIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGS-D-----PNVGTSCI-VVGWGRTSDNG---YSS 137 (220)
T ss_dssp EEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTH-T-----TTTTSEEE-EEESSBSSTTS---BTS
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc-c-----cccccccc-ccccccccccc---ccc
Confidence 89999999999999988899999999999999999999999998332 0 01112333 45666532211 112
Q ss_pred eeeEEEEeecCeeecccc--------ceEEec-CCCccCcccccceeEEEEcCccEEEEEEEEEEecCcceecCCC-cce
Q psy15989 180 GIVSITVARDGLRVCDTK--------HYVVFT-DVANVCNGDSGGGMVFKIDSAWYLRGIVSITVARDGLRVCDTK-HYV 249 (363)
Q Consensus 180 ~l~~~~~~~~~~~~C~~~--------~~~~~~-~~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~~~~~g~~~c~~~-~~~ 249 (363)
.+....+.+.+...|... ..+... .....|.|||||||++... +|+||++.. ..|... .+.
T Consensus 138 ~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~---~lvGI~s~~------~~c~~~~~~~ 208 (220)
T PF00089_consen 138 NLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGGPLICNNN---YLVGIVSFG------ENCGSPNYPG 208 (220)
T ss_dssp BEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTE---EEEEEEEEE------SSSSBTTSEE
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccccccee---eecceeeec------CCCCCCCcCE
Confidence 234445555556667642 233333 3468999999999998764 799999984 346554 468
Q ss_pred EEEeeeE
Q psy15989 250 VFTDVKR 256 (363)
Q Consensus 250 v~~~V~~ 256 (363)
++++|..
T Consensus 209 v~~~v~~ 215 (220)
T PF00089_consen 209 VYTRVSS 215 (220)
T ss_dssp EEEEGGG
T ss_pred EEEEHHH
Confidence 8888764
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.94 E-value=3.1e-26 Score=197.75 Aligned_cols=224 Identities=24% Similarity=0.302 Sum_probs=145.2
Q ss_pred CccceeecCeecCCCCCcceEEEEEeeCC-ceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccC
Q psy15989 15 KAQPLVTYGQKTARGQWPWHVALYRTEGI-NLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFS 93 (363)
Q Consensus 15 ~~~~rIvgG~~a~~~~~Pw~v~l~~~~~~-~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~ 93 (363)
..+.||+||..|+.++||++|+|..+... ....||||+++..|||||||||+.... +...+..+|..+..+.
T Consensus 28 evs~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s---~is~d~~~vv~~l~d~---- 100 (413)
T COG5640 28 EVSSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASS---PISSDVNRVVVDLNDS---- 100 (413)
T ss_pred ccceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCC---CccccceEEEeccccc----
Confidence 36789999999999999999999654322 235689999999999999999997642 2344445555554433
Q ss_pred CCCCceeeeeeEEEECCCCCCCCCCCCeEEEEeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEEec
Q psy15989 94 DEGGVQNKQVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFKID 173 (363)
Q Consensus 94 ~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~~~ 173 (363)
...+...|++++.|..|.+.++.||||+++|.++..... ..|..-.....-+.. . ....=.++.+++......
T Consensus 101 --Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr--~ki~~~~~sdt~l~s--v-~~~s~~~n~t~~~~~~~~ 173 (413)
T COG5640 101 --SQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPR--VKITSFDASDTFLNS--V-TTVSPMTNGTFGVTTPSD 173 (413)
T ss_pred --ccccCcceEEEeeecccccccccCcceeeccccccccch--hheeeccCcccceec--c-cccccccceeeeeeeecC
Confidence 455788899999999999999999999999999765321 112221111100000 0 000111133333332211
Q ss_pred C-eeEE--eeeeEEEEeecCeeeccccc--------------eEEecCCCccCcccccceeEEEEcCccEEEEEEEEEEe
Q psy15989 174 S-AWYL--RGIVSITVARDGLRVCDTKH--------------YVVFTDVANVCNGDSGGGMVFKIDSAWYLRGIVSITVA 236 (363)
Q Consensus 174 ~-~~~~--~~l~~~~~~~~~~~~C~~~~--------------~~~~~~~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~~~ 236 (363)
. .... ..+.+..+...+...|...+ +++.+..++.|+||||||++.+..+.+.++||+||+
T Consensus 174 v~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~~daCqGDSGGPi~~~g~~G~vQ~GVvSwG-- 251 (413)
T COG5640 174 VPRSSPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPPKDACQGDSGGPIFHKGEEGRVQRGVVSWG-- 251 (413)
T ss_pred CCCCCCccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCCCcccccCCCCCceEEeCCCccEEEeEEEec--
Confidence 1 1111 13445555556666665332 233444578999999999999998888999999994
Q ss_pred cCcceecC-CCcceEEEeeeEE
Q psy15989 237 RDGLRVCD-TKHYVVFTDVKRV 257 (363)
Q Consensus 237 ~~g~~~c~-~~~~~v~~~V~~i 257 (363)
...|. ...+++||+|+.+
T Consensus 252 ---~~~Cg~t~~~gVyT~vsny 270 (413)
T COG5640 252 ---DGGCGGTLIPGVYTNVSNY 270 (413)
T ss_pred ---CCCCCCCCcceeEEehhHH
Confidence 33354 4567888887655
No 6
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.71 E-value=2.5e-16 Score=141.07 Aligned_cols=205 Identities=26% Similarity=0.397 Sum_probs=122.7
Q ss_pred CCCCCCccccCCccceeecCeecCCCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCC----ccc---
Q psy15989 4 RDVSCGTVVYNKAQPLVTYGQKTARGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDK----PVD--- 76 (363)
Q Consensus 4 ~~~~CG~~~~~~~~~rIvgG~~a~~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~----~~~--- 76 (363)
|...||... ....+++.+|..+...+.||.|.++.........+++|||||+||||||+||+....... ..+
T Consensus 27 rl~~CG~~~-~~~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~ 105 (282)
T PF03761_consen 27 RLETCGKKK-LPYPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKK 105 (282)
T ss_pred HHHhcCCCC-CCCcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccce
Confidence 345799652 334456899999999999999999887655456778999999999999999998543211 100
Q ss_pred --CC--cEEEE---eceecc---cccCCCCCceeeeeeEEEECCCC----CCCCCCCCeEEEEeCCcccCCCCccceEec
Q psy15989 77 --SD--TLVIY---LGKYHQ---HQFSDEGGVQNKQVKRVHIYPTF----NSSNYLGDIALLQLSSDVDYSMYVRPVCLW 142 (363)
Q Consensus 77 --~~--~~~V~---~G~~~~---~~~~~~~~~~~~~V~~i~~hp~y----~~~~~~nDIALl~L~~~~~~~~~v~picL~ 142 (363)
.. .+.|- +-.... .... ........|.++++--.- +......+++||+|+++ ++....|+|||
T Consensus 106 C~~~~~~l~vP~~~l~~~~v~~~~~~~-~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~ 182 (282)
T PF03761_consen 106 CEGNNNHLIVPEEVLSKIDVRCCNCFS-NGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLA 182 (282)
T ss_pred eeCCCceEEeCHHHhccEEEEeecccc-cCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeC
Confidence 00 11110 000000 0000 122234567777663222 33445689999999999 78899999998
Q ss_pred CCCCCCceeeeeecee-eecCCCCccceEEecCeeEEeeeeEEEEeecCeeeccccceEEecCCCccCcccccceeEEEE
Q psy15989 143 DDSTAPLQLSAVEGTS-VCNGDSGGGMVFKIDSAWYLRGIVSITVARDGLRVCDTKHYVVFTDVANVCNGDSGGGMVFKI 221 (363)
Q Consensus 143 ~~~~~~~~~~~~~~~~-~c~g~~G~~~~~~~~~~~~~~~l~~~~~~~~~~~~C~~~~~~~~~~~~~~C~gdsGgpl~~~~ 221 (363)
..... + ..+.. ...|. ..... +....+.+. .|.. ...........|.||+||||+...
T Consensus 183 ~~~~~-~----~~~~~~~~yg~-------~~~~~-----~~~~~~~i~---~~~~-~~~~~~~~~~~~~~d~Gg~lv~~~ 241 (282)
T PF03761_consen 183 DSSTN-W----EKGDEVDVYGF-------NSTGK-----LKHRKLKIT---NCTK-CAYSICTKQYSCKGDRGGPLVKNI 241 (282)
T ss_pred CCccc-c----ccCceEEEeec-------CCCCe-----EEEEEEEEE---Eeec-cceeEecccccCCCCccCeEEEEE
Confidence 77651 1 11110 01111 00111 111122111 2321 112233456789999999999999
Q ss_pred cCccEEEEEEEE
Q psy15989 222 DSAWYLRGIVSI 233 (363)
Q Consensus 222 ~~~~~l~Gi~s~ 233 (363)
+++|+++|+.+-
T Consensus 242 ~gr~tlIGv~~~ 253 (282)
T PF03761_consen 242 NGRWTLIGVGAS 253 (282)
T ss_pred CCCEEEEEEEcc
Confidence 999999999775
No 7
>KOG3627|consensus
Probab=99.63 E-value=1.1e-15 Score=135.05 Aligned_cols=105 Identities=30% Similarity=0.529 Sum_probs=84.2
Q ss_pred EeeeEEEEcCCCCCCCCC-CceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEecCCCCC--Cccccce
Q psy15989 252 TDVKRVHIYPTFNSSNYL-GDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWGYDEND--RVSEELK 328 (363)
Q Consensus 252 ~~V~~ii~h~~y~~~~~~-nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG~~~~~--~~~~~L~ 328 (363)
..|.+++.||+|+..... ||||||+|++++.|+++|+|||||..... .....+..|.++|||.++.. ..+..||
T Consensus 87 ~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~---~~~~~~~~~~v~GWG~~~~~~~~~~~~L~ 163 (256)
T KOG3627|consen 87 GDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADP---YFPPGGTTCLVSGWGRTESGGGPLPDTLQ 163 (256)
T ss_pred ceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCccc---CCCCCCCEEEEEeCCCcCCCCCCCCceeE
Confidence 346678899999998877 99999999999999999999999854321 00234588999999998755 6789999
Q ss_pred EEEeceeecccccccCccccccCCCCCeEEecc
Q psy15989 329 MAIMPIVSHQQCLWSNPQFFSQFTSDETFCAGF 361 (363)
Q Consensus 329 ~~~v~ii~~~~C~~~~~~~~~~~i~~~~iCag~ 361 (363)
+++++++++++|+..+.. ...++++|||||.
T Consensus 164 ~~~v~i~~~~~C~~~~~~--~~~~~~~~~Ca~~ 194 (256)
T KOG3627|consen 164 EVDVPIISNSECRRAYGG--LGTITDTMLCAGG 194 (256)
T ss_pred EEEEeEcChhHhcccccC--ccccCCCEEeeCc
Confidence 999999999999876431 1146778999986
No 8
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.56 E-value=1.9e-14 Score=124.91 Aligned_cols=108 Identities=33% Similarity=0.499 Sum_probs=88.9
Q ss_pred ceEEEeeeEEEEcCCCCCCCCCCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEecCCCCC-Ccccc
Q psy15989 248 YVVFTDVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWGYDEND-RVSEE 326 (363)
Q Consensus 248 ~~v~~~V~~ii~h~~y~~~~~~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG~~~~~-~~~~~ 326 (363)
......|.++++||+|+.....+|||||||++++.++++++|||||..... ...+..+.++|||.+... ..+..
T Consensus 67 ~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~-----~~~~~~~~~~G~g~~~~~~~~~~~ 141 (232)
T cd00190 67 GGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYN-----LPAGTTCTVSGWGRTSEGGPLPDV 141 (232)
T ss_pred ceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCcccc-----CCCCCEEEEEeCCcCCCCCCCCce
Confidence 355678999999999999888999999999999999999999999976421 355788999999987533 46778
Q ss_pred ceEEEeceeecccccccCccccccCCCCCeEEeccC
Q psy15989 327 LKMAIMPIVSHQQCLWSNPQFFSQFTSDETFCAGFR 362 (363)
Q Consensus 327 L~~~~v~ii~~~~C~~~~~~~~~~~i~~~~iCag~~ 362 (363)
|+++.+.+++.++|+..+.. ...+.+++|||+..
T Consensus 142 ~~~~~~~~~~~~~C~~~~~~--~~~~~~~~~C~~~~ 175 (232)
T cd00190 142 LQEVNVPIVSNAECKRAYSY--GGTITDNMLCAGGL 175 (232)
T ss_pred eeEEEeeeECHHHhhhhccC--cccCCCceEeeCCC
Confidence 99999999999999865321 13578899999754
No 9
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.42 E-value=9.6e-13 Score=114.11 Aligned_cols=106 Identities=35% Similarity=0.542 Sum_probs=86.2
Q ss_pred EEEeeeEEEEcCCCCCCCCCCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEecCCCC--CCccccc
Q psy15989 250 VFTDVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWGYDEN--DRVSEEL 327 (363)
Q Consensus 250 v~~~V~~ii~h~~y~~~~~~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG~~~~--~~~~~~L 327 (363)
....|.+++.||+|+.....+|||||+|++|+.+++.++|||||..... ...+..+.++|||.... +..+..|
T Consensus 69 ~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~-----~~~~~~~~~~g~g~~~~~~~~~~~~~ 143 (229)
T smart00020 69 QVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYN-----VPAGTTCTVSGWGRTSEGAGSLPDTL 143 (229)
T ss_pred eEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccc-----cCCCCEEEEEeCCCCCCCCCcCCCEe
Confidence 5568999999999998888999999999999999999999999976322 24578899999998753 3456789
Q ss_pred eEEEeceeecccccccCccccccCCCCCeEEeccC
Q psy15989 328 KMAIMPIVSHQQCLWSNPQFFSQFTSDETFCAGFR 362 (363)
Q Consensus 328 ~~~~v~ii~~~~C~~~~~~~~~~~i~~~~iCag~~ 362 (363)
+.+.+.+++.+.|...+.. ...+.++++||+..
T Consensus 144 ~~~~~~~~~~~~C~~~~~~--~~~~~~~~~C~~~~ 176 (229)
T smart00020 144 QEVNVPIVSNATCRRAYSG--GGAITDNMLCAGGL 176 (229)
T ss_pred eEEEEEEeCHHHhhhhhcc--ccccCCCcEeecCC
Confidence 9999999999999865321 12478889999754
No 10
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=99.42 E-value=1e-12 Score=108.84 Aligned_cols=102 Identities=22% Similarity=0.432 Sum_probs=80.5
Q ss_pred CCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEE
Q psy15989 28 RGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVH 107 (363)
Q Consensus 28 ~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~ 107 (363)
.-.|||.|.|+..+ .+.|.|+||.+.|||++-.|+.+-. +...-+.|.+|.......-+.+.+|.+.|..+.
T Consensus 13 ~y~WPWlA~IYvdG----~~~CsgvLlD~~WlLvsssCl~~I~----L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~ 84 (267)
T PF09342_consen 13 DYHWPWLADIYVDG----RYWCSGVLLDPHWLLVSSSCLRGIS----LSHHYVSALLGGGKTYLSVDGPHEQISRVDCFK 84 (267)
T ss_pred cccCcceeeEEEcC----eEEEEEEEeccceEEEeccccCCcc----cccceEEEEecCcceecccCCChheEEEeeeee
Confidence 35699999999876 8999999999999999999997532 233557888888763332234666777787665
Q ss_pred ECCCCCCCCCCCCeEEEEeCCcccCCCCccceEecCC
Q psy15989 108 IYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDD 144 (363)
Q Consensus 108 ~hp~y~~~~~~nDIALl~L~~~~~~~~~v~picL~~~ 144 (363)
.-| ..+++||+|++|+.|+.+|+|..||..
T Consensus 85 ~V~-------~S~v~LLHL~~~~~fTr~VlP~flp~~ 114 (267)
T PF09342_consen 85 DVP-------ESNVLLLHLEQPANFTRYVLPTFLPET 114 (267)
T ss_pred ecc-------ccceeeeeecCcccceeeecccccccc
Confidence 543 358999999999999999999999873
No 11
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.34 E-value=5.8e-12 Score=108.34 Aligned_cols=104 Identities=28% Similarity=0.558 Sum_probs=85.9
Q ss_pred eEEEeeeEEEEcCCCCCCCCCCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEecCCCCCCccccce
Q psy15989 249 VVFTDVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWGYDENDRVSEELK 328 (363)
Q Consensus 249 ~v~~~V~~ii~h~~y~~~~~~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~L~ 328 (363)
....+|.+++.||+|+.....+|||||+|++++.+.+.++|+||+..... ...+..+.+.|||.......+..++
T Consensus 66 ~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~-----~~~~~~~~~~G~~~~~~~~~~~~~~ 140 (220)
T PF00089_consen 66 EQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSD-----PNVGTSCIVVGWGRTSDNGYSSNLQ 140 (220)
T ss_dssp SEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHT-----TTTTSEEEEEESSBSSTTSBTSBEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccc-----ccccccccccccccccccccccccc
Confidence 35678999999999999888999999999999999999999999873321 2457899999999875433567899
Q ss_pred EEEeceeecccccccCccccccCCCCCeEEecc
Q psy15989 329 MAIMPIVSHQQCLWSNPQFFSQFTSDETFCAGF 361 (363)
Q Consensus 329 ~~~v~ii~~~~C~~~~~~~~~~~i~~~~iCag~ 361 (363)
+..+.+++.+.|+.. +...+.+.++|++.
T Consensus 141 ~~~~~~~~~~~c~~~----~~~~~~~~~~c~~~ 169 (220)
T PF00089_consen 141 SVTVPVVSRKTCRSS----YNDNLTPNMICAGS 169 (220)
T ss_dssp EEEEEEEEHHHHHHH----TTTTSTTTEEEEET
T ss_pred ccccccccccccccc----cccccccccccccc
Confidence 999999999999754 23347788999875
No 12
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.50 E-value=7.3e-07 Score=76.28 Aligned_cols=173 Identities=18% Similarity=0.203 Sum_probs=88.1
Q ss_pred CCCCCcceEEEEEeeCCceeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEe-ceecccccCCCCCceeeeeeE
Q psy15989 27 ARGQWPWHVALYRTEGINLSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYL-GKYHQHQFSDEGGVQNKQVKR 105 (363)
Q Consensus 27 ~~~~~Pw~v~l~~~~~~~~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~-G~~~~~~~~~~~~~~~~~V~~ 105 (363)
....|||-+-.+..... ..+-|+++||+++-||||+||+.....+. ..+.+.. |... .+.....+..
T Consensus 45 dt~~~Py~av~~~~~~t-G~~~~~~~lI~pntvLTa~Hc~~s~~~G~----~~~~~~p~g~~~-------~~~~~~~~~~ 112 (251)
T COG3591 45 DTTQFPYSAVVQFEAAT-GRLCTAATLIGPNTVLTAGHCIYSPDYGE----DDIAAAPPGVNS-------DGGPFYGITK 112 (251)
T ss_pred cCCCCCcceeEEeecCC-CcceeeEEEEcCceEEEeeeEEecCCCCh----hhhhhcCCcccC-------CCCCCCceee
Confidence 45789999888665443 25567789999999999999998754221 2222322 3221 2222222333
Q ss_pred --EEECCC--CCCCCCCCCeEEEEeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEE-ecCeeEEee
Q psy15989 106 --VHIYPT--FNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFK-IDSAWYLRG 180 (363)
Q Consensus 106 --i~~hp~--y~~~~~~nDIALl~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~-~~~~~~~~~ 180 (363)
+...|. |+.+....|+..+.|+...++.+.+....++...... ... . ....|.|.... ...+|...+
T Consensus 113 ~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~-----~~d--~-i~v~GYP~dk~~~~~~~e~t~ 184 (251)
T COG3591 113 IEIRVYPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAK-----AND--R-ITVIGYPGDKPNIGTMWESTG 184 (251)
T ss_pred EEEEecCCceeccCCceeeccHHHhccCCCccccccccccccccccc-----cCc--e-eEEEeccCCCCcceeEeeecc
Confidence 322443 3444555666666666444444444433332222100 011 1 11344443222 111111111
Q ss_pred eeEEEEeecCeeeccccceEEecCCCccCcccccceeEEEEcCccEEEEEEEEE
Q psy15989 181 IVSITVARDGLRVCDTKHYVVFTDVANVCNGDSGGGMVFKIDSAWYLRGIVSIT 234 (363)
Q Consensus 181 l~~~~~~~~~~~~C~~~~~~~~~~~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~ 234 (363)
-+.+- ....+....+.+.|+||+|++...+ +++|+..-+
T Consensus 185 ~v~~~------------~~~~l~y~~dT~pG~SGSpv~~~~~---~vigv~~~g 223 (251)
T COG3591 185 KVNSI------------KGNKLFYDADTLPGSSGSPVLISKD---EVIGVHYNG 223 (251)
T ss_pred eeEEE------------ecceEEEEecccCCCCCCceEecCc---eEEEEEecC
Confidence 11000 0012334467889999999997765 889887763
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=98.40 E-value=1.2e-05 Score=61.77 Aligned_cols=22 Identities=36% Similarity=0.531 Sum_probs=19.6
Q ss_pred EEEEEeeCC-EEEeeccCcccCC
Q psy15989 49 CGGSLVSVN-YVITAAHCVTKKP 70 (363)
Q Consensus 49 CgGsLIs~~-~VLTAAhC~~~~~ 70 (363)
|+|.+|.++ +|||||||+....
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~ 23 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWN 23 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCT
T ss_pred CEEEEEcCCceEEEchhheeccc
Confidence 689999999 9999999998643
No 14
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.76 E-value=0.00055 Score=65.08 Aligned_cols=70 Identities=24% Similarity=0.363 Sum_probs=46.6
Q ss_pred eEEEEEEEeeCC-EEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEEECCCCCCCCCCCCeEEE
Q psy15989 46 SYVCGGSLVSVN-YVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVHIYPTFNSSNYLGDIALL 124 (363)
Q Consensus 46 ~~~CgGsLIs~~-~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl 124 (363)
...++|.+|++. +|||++|.+.+ ...+.|.+.. ...+..+-+..++ ..|||||
T Consensus 57 ~~~GSGfii~~~G~IlTn~Hvv~~--------~~~i~V~~~~-----------~~~~~a~vv~~d~-------~~DlAll 110 (428)
T TIGR02037 57 RGLGSGVIISADGYILTNNHVVDG--------ADEITVTLSD-----------GREFKAKLVGKDP-------RTDIAVL 110 (428)
T ss_pred cceeeEEEECCCCEEEEcHHHcCC--------CCeEEEEeCC-----------CCEEEEEEEEecC-------CCCEEEE
Confidence 467999999986 99999999964 3566666532 1234444333443 3599999
Q ss_pred EeCCcccCCCCccceEecCCC
Q psy15989 125 QLSSDVDYSMYVRPVCLWDDS 145 (363)
Q Consensus 125 ~L~~~~~~~~~v~picL~~~~ 145 (363)
+++.+ ..+.++.|....
T Consensus 111 kv~~~----~~~~~~~l~~~~ 127 (428)
T TIGR02037 111 KIDAK----KNLPVIKLGDSD 127 (428)
T ss_pred EecCC----CCceEEEccCCC
Confidence 99865 245577775443
No 15
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.67 E-value=0.0022 Score=59.24 Aligned_cols=74 Identities=26% Similarity=0.405 Sum_probs=47.7
Q ss_pred CCcceEEEEEeeCC-------ceeEEEEEEEeeCC-EEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceee
Q psy15989 30 QWPWHVALYRTEGI-------NLSYVCGGSLVSVN-YVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNK 101 (363)
Q Consensus 30 ~~Pw~v~l~~~~~~-------~~~~~CgGsLIs~~-~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~ 101 (363)
.-|-+|.|+..... ......+|.+|+++ +|||++|.+.. .+.+.|.+.+ +..+
T Consensus 54 ~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~--------~~~i~V~~~d-----------g~~~ 114 (351)
T TIGR02038 54 AAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKK--------ADQIVVALQD-----------GRKF 114 (351)
T ss_pred cCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCC--------CCEEEEEECC-----------CCEE
Confidence 34888998754311 11346899999977 99999999964 3456665422 1234
Q ss_pred eeeEEEECCCCCCCCCCCCeEEEEeCCc
Q psy15989 102 QVKRVHIYPTFNSSNYLGDIALLQLSSD 129 (363)
Q Consensus 102 ~V~~i~~hp~y~~~~~~nDIALl~L~~~ 129 (363)
..+-+..+| ..||||||++.+
T Consensus 115 ~a~vv~~d~-------~~DlAvlkv~~~ 135 (351)
T TIGR02038 115 EAELVGSDP-------LTDLAVLKIEGD 135 (351)
T ss_pred EEEEEEecC-------CCCEEEEEecCC
Confidence 444343443 359999999864
No 16
>PRK10139 serine endoprotease; Provisional
Probab=97.46 E-value=0.0027 Score=60.59 Aligned_cols=69 Identities=23% Similarity=0.395 Sum_probs=45.1
Q ss_pred EEEEEEEeeC--CEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEEECCCCCCCCCCCCeEEE
Q psy15989 47 YVCGGSLVSV--NYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVHIYPTFNSSNYLGDIALL 124 (363)
Q Consensus 47 ~~CgGsLIs~--~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl 124 (363)
...+|.+|++ -+|||.+|.+.+ ...+.|.+.+ . ..+..+-+-..| ..|||||
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~--------a~~i~V~~~d---------g--~~~~a~vvg~D~-------~~DlAvl 143 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQ--------AQKISIQLND---------G--REFDAKLIGSDD-------QSDIALL 143 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCC--------CCEEEEEECC---------C--CEEEEEEEEEcC-------CCCEEEE
Confidence 4689999974 699999999964 4567777532 1 234444343433 3599999
Q ss_pred EeCCcccCCCCccceEecCCC
Q psy15989 125 QLSSDVDYSMYVRPVCLWDDS 145 (363)
Q Consensus 125 ~L~~~~~~~~~v~picL~~~~ 145 (363)
|++.+- ...++.|....
T Consensus 144 kv~~~~----~l~~~~lg~s~ 160 (455)
T PRK10139 144 QIQNPS----KLTQIAIADSD 160 (455)
T ss_pred EecCCC----CCceeEecCcc
Confidence 998642 24466665433
No 17
>PRK10898 serine endoprotease; Provisional
Probab=97.46 E-value=0.0055 Score=56.60 Aligned_cols=73 Identities=19% Similarity=0.321 Sum_probs=46.9
Q ss_pred CcceEEEEEeeCC-------ceeEEEEEEEeeCC-EEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeee
Q psy15989 31 WPWHVALYRTEGI-------NLSYVCGGSLVSVN-YVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQ 102 (363)
Q Consensus 31 ~Pw~v~l~~~~~~-------~~~~~CgGsLIs~~-~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~ 102 (363)
-|-+|.|...... .....-+|.+|+++ +|||++|=+.+ ...+.|.+.+ ...+.
T Consensus 55 ~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~--------a~~i~V~~~d-----------g~~~~ 115 (353)
T PRK10898 55 APAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVIND--------ADQIIVALQD-----------GRVFE 115 (353)
T ss_pred CCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCC--------CCEEEEEeCC-----------CCEEE
Confidence 4888888664311 11256899999976 99999998863 3556666532 12333
Q ss_pred eeEEEECCCCCCCCCCCCeEEEEeCCc
Q psy15989 103 VKRVHIYPTFNSSNYLGDIALLQLSSD 129 (363)
Q Consensus 103 V~~i~~hp~y~~~~~~nDIALl~L~~~ 129 (363)
.+-+...| .+||||||++.+
T Consensus 116 a~vv~~d~-------~~DlAvl~v~~~ 135 (353)
T PRK10898 116 ALLVGSDS-------LTDLAVLKINAT 135 (353)
T ss_pred EEEEEEcC-------CCCEEEEEEcCC
Confidence 43344433 369999999864
No 18
>PRK10942 serine endoprotease; Provisional
Probab=97.30 E-value=0.0057 Score=58.73 Aligned_cols=68 Identities=25% Similarity=0.386 Sum_probs=43.8
Q ss_pred EEEEEEEeeC--CEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEEECCCCCCCCCCCCeEEE
Q psy15989 47 YVCGGSLVSV--NYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVHIYPTFNSSNYLGDIALL 124 (363)
Q Consensus 47 ~~CgGsLIs~--~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl 124 (363)
...+|.+|++ -+|||.+|.+.+ ...+.|.+.+ ...+..+-+..+| ..|||||
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~--------a~~i~V~~~d-----------g~~~~a~vv~~D~-------~~DlAvl 164 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDN--------ATKIKVQLSD-----------GRKFDAKVVGKDP-------RSDIALI 164 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCC--------CCEEEEEECC-----------CCEEEEEEEEecC-------CCCEEEE
Confidence 4689999985 599999999864 4567777532 1233444333443 3599999
Q ss_pred EeCCcccCCCCccceEecCC
Q psy15989 125 QLSSDVDYSMYVRPVCLWDD 144 (363)
Q Consensus 125 ~L~~~~~~~~~v~picL~~~ 144 (363)
|++.+-. ..++.|...
T Consensus 165 ki~~~~~----l~~~~lg~s 180 (473)
T PRK10942 165 QLQNPKN----LTAIKMADS 180 (473)
T ss_pred EecCCCC----CceeEecCc
Confidence 9975322 345666443
No 19
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.00 E-value=0.0019 Score=57.60 Aligned_cols=110 Identities=19% Similarity=0.169 Sum_probs=66.6
Q ss_pred EEEeeeEEEEcCCCCCCCCCCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEecCCCC----CCcc-
Q psy15989 250 VFTDVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWGYDEN----DRVS- 324 (363)
Q Consensus 250 v~~~V~~ii~h~~y~~~~~~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG~~~~----~~~~- 324 (363)
....|..++.|..|.+.++.||||+++|.++...-. + -|-+-+..+. ...++.........+||.+.. ...+
T Consensus 104 ~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr-~-ki~~~~~sdt-~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~ 180 (413)
T COG5640 104 ERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPR-V-KITSFDASDT-FLNSVTTVSPMTNGTFGVTTPSDVPRSSPK 180 (413)
T ss_pred cCcceEEEeeecccccccccCcceeeccccccccch-h-heeeccCccc-ceecccccccccceeeeeeeecCCCCCCCc
Confidence 345688899999999999999999999998655321 0 1111111110 011233345566778886521 1223
Q ss_pred -ccceEEEeceeecccccccCc-ccc-ccCCCCCeEEeccC
Q psy15989 325 -EELKMAIMPIVSHQQCLWSNP-QFF-SQFTSDETFCAGFR 362 (363)
Q Consensus 325 -~~L~~~~v~ii~~~~C~~~~~-~~~-~~~i~~~~iCag~~ 362 (363)
..|+++.|..++.++|++.+. ... .....-.-+|||.+
T Consensus 181 gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~ 221 (413)
T COG5640 181 GTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRP 221 (413)
T ss_pred cceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCC
Confidence 489999999999999976542 111 11111123999864
No 20
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=95.40 E-value=0.032 Score=47.35 Aligned_cols=59 Identities=19% Similarity=0.279 Sum_probs=42.1
Q ss_pred CcceEEEeeeEEEEcCCCCCCCCCCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEec
Q psy15989 246 KHYVVFTDVKRVHIYPTFNSSNYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWG 316 (363)
Q Consensus 246 ~~~~v~~~V~~ii~h~~y~~~~~~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG 316 (363)
+...|..+|..+..-| ..+++||.|++|+.|+.+|+|..||+.... ......|...|--
T Consensus 72 Gp~EQI~rVD~~~~V~-------~S~v~LLHL~~~~~fTr~VlP~flp~~~~~-----~~~~~~CVAVg~d 130 (267)
T PF09342_consen 72 GPHEQISRVDCFKDVP-------ESNVLLLHLEQPANFTRYVLPTFLPETSNE-----NESDDECVAVGHD 130 (267)
T ss_pred CChheEEEeeeeeecc-------ccceeeeeecCcccceeeecccccccccCC-----CCCCCceEEEEcc
Confidence 3455666777654322 468999999999999999999999973332 1234589888853
No 21
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=95.27 E-value=0.26 Score=49.96 Aligned_cols=65 Identities=20% Similarity=0.323 Sum_probs=37.3
Q ss_pred EEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEEECCCCCCCCCCCCeEEEEeCCcc
Q psy15989 51 GSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVHIYPTFNSSNYLGDIALLQLSSDV 130 (363)
Q Consensus 51 GsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl~L~~~~ 130 (363)
.|||+|++|+|++|=...+ -.|.+|.... ..+.+.+--.|+. .|..+.||.+=|
T Consensus 69 aTLigpqYiVSV~HN~~gy----------~~v~FG~~g~---------~~Y~iV~RNn~~~-------~Df~~pRLnK~V 122 (769)
T PF02395_consen 69 ATLIGPQYIVSVKHNGKGY----------NSVSFGNEGQ---------NTYKIVDRNNYPS-------GDFHMPRLNKFV 122 (769)
T ss_dssp -EEEETTEEEBETTG-TSC----------CEECESCSST---------CEEEEEEEEBETT-------STEBEEEESS--
T ss_pred EEEecCCeEEEEEccCCCc----------CceeecccCC---------ceEEEEEccCCCC-------cccceeecCceE
Confidence 8999999999999976221 2455555321 3455555545543 499999999866
Q ss_pred cCCCCccceEecCC
Q psy15989 131 DYSMYVRPVCLWDD 144 (363)
Q Consensus 131 ~~~~~v~picL~~~ 144 (363)
+ -+.|+-....
T Consensus 123 T---EvaP~~~t~~ 133 (769)
T PF02395_consen 123 T---EVAPAEMTTA 133 (769)
T ss_dssp ----SS----BBSS
T ss_pred E---EEeccccccc
Confidence 5 3677766544
No 22
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=94.11 E-value=2 Score=35.22 Aligned_cols=147 Identities=13% Similarity=0.068 Sum_probs=70.8
Q ss_pred eeEEEEEEEeeCCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEEECCCCCCCCCCCCeEEE
Q psy15989 45 LSYVCGGSLVSVNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVHIYPTFNSSNYLGDIALL 124 (363)
Q Consensus 45 ~~~~CgGsLIs~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl 124 (363)
..+.+.+..|.++|.|--.|.-. .. .+.++.. .+++...+.. .+......||+|+
T Consensus 23 g~~t~l~~gi~~~~~lvp~H~~~---------~~--~i~i~g~------------~~~~~d~~~l--v~~~~~~~Dl~~v 77 (172)
T PF00548_consen 23 GEFTMLALGIYDRYFLVPTHEEP---------ED--TIYIDGV------------EYKVDDSVVL--VDRDGVDTDLTLV 77 (172)
T ss_dssp EEEEEEEEEEEBTEEEEEGGGGG---------CS--EEEETTE------------EEEEEEEEEE--EETTSSEEEEEEE
T ss_pred ceEEEecceEeeeEEEEECcCCC---------cE--EEEECCE------------EEEeeeeEEE--ecCCCcceeEEEE
Confidence 36777888999999999999211 12 2222221 2222222111 1122234599999
Q ss_pred EeCCcccCCCCccceEecCCCCCCceeeeeeceeeecCCCCccceEEecCeeEEeeeeEEEE-eecCeeeccccceEEec
Q psy15989 125 QLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGTSVCNGDSGGGMVFKIDSAWYLRGIVSITV-ARDGLRVCDTKHYVVFT 203 (363)
Q Consensus 125 ~L~~~~~~~~~v~picL~~~~~~~~~~~~~~~~~~c~g~~G~~~~~~~~~~~~~~~l~~~~~-~~~~~~~C~~~~~~~~~ 203 (363)
+|++.-+|.|..+-++- .... +.+...... ....+....... .+...+. ....... ...+.
T Consensus 78 ~l~~~~kfrDIrk~~~~-~~~~------~~~~~l~v~-~~~~~~~~~~v~-----~v~~~~~i~~~g~~~-----~~~~~ 139 (172)
T PF00548_consen 78 KLPRNPKFRDIRKFFPE-SIPE------YPECVLLVN-STKFPRMIVEVG-----FVTNFGFINLSGTTT-----PRSLK 139 (172)
T ss_dssp EEESSS-B--GGGGSBS-SGGT------EEEEEEEEE-SSSSTCEEEEEE-----EEEEEEEEEETTEEE-----EEEEE
T ss_pred EccCCcccCchhhhhcc-cccc------CCCcEEEEE-CCCCccEEEEEE-----EEeecCccccCCCEe-----eEEEE
Confidence 99999889887777761 1111 111111112 222221111111 0111111 1001111 11111
Q ss_pred CCCccCcccccceeEEEEcCccEEEEEEEEE
Q psy15989 204 DVANVCNGDSGGGMVFKIDSAWYLRGIVSIT 234 (363)
Q Consensus 204 ~~~~~C~gdsGgpl~~~~~~~~~l~Gi~s~~ 234 (363)
-......|+.||||+.+.++...++||+.-+
T Consensus 140 Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 140 YKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp EESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred EccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 1223446999999999887788999998753
No 23
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=91.29 E-value=1.1 Score=38.50 Aligned_cols=25 Identities=36% Similarity=0.437 Sum_probs=16.6
Q ss_pred cCcccccceeEEEEcCccEEEEEEEEE
Q psy15989 208 VCNGDSGGGMVFKIDSAWYLRGIVSIT 234 (363)
Q Consensus 208 ~C~gdsGgpl~~~~~~~~~l~Gi~s~~ 234 (363)
.=.||.|.|++...++ .++||.|.+
T Consensus 148 Tk~G~CG~PlVs~~Dg--~IVGiHsl~ 172 (235)
T PF00863_consen 148 TKDGDCGLPLVSTKDG--KIVGIHSLT 172 (235)
T ss_dssp --TT-TT-EEEETTT----EEEEEEEE
T ss_pred CCCCccCCcEEEcCCC--cEEEEEcCc
Confidence 3468999999987776 799999985
No 24
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=87.08 E-value=3.3 Score=36.87 Aligned_cols=42 Identities=36% Similarity=0.476 Sum_probs=30.3
Q ss_pred CCCCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEe
Q psy15989 267 NYLGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGW 315 (363)
Q Consensus 267 ~~~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GW 315 (363)
....+.+||+|+++ ++..+.|+|||++... ...+....+-|+
T Consensus 158 ~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~-----~~~~~~~~~yg~ 199 (282)
T PF03761_consen 158 NRPYSPMILELEED--FSKNVSPPCLADSSTN-----WEKGDEVDVYGF 199 (282)
T ss_pred ccccceEEEEEccc--ccccCCCEEeCCCccc-----cccCceEEEeec
Confidence 44678999999999 8889999999976542 233444445554
No 25
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=56.16 E-value=1.7e+02 Score=26.96 Aligned_cols=58 Identities=19% Similarity=0.321 Sum_probs=36.8
Q ss_pred EEEEEEEee-CCEEEeeccCcccCCCCCcccCCcEEEEeceecccccCCCCCceeeeeeEEEECCCCCCCCCCCCeEEEE
Q psy15989 47 YVCGGSLVS-VNYVITAAHCVTKKPYDKPVDSDTLVIYLGKYHQHQFSDEGGVQNKQVKRVHIYPTFNSSNYLGDIALLQ 125 (363)
Q Consensus 47 ~~CgGsLIs-~~~VLTAAhC~~~~~~~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~V~~i~~hp~y~~~~~~nDIALl~ 125 (363)
....|.+++ +.+|+|-.|=+.. +..+.|.+-+ +..+..+-+-.. ...|+|+||
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~--------a~~i~v~l~d-----------g~~~~a~~vg~d-------~~~dlavlk 125 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAG--------AEEITVTLAD-----------GREVPAKLVGKD-------PISDLAVLK 125 (347)
T ss_pred ccccEEEEcCCeEEEecceecCC--------cceEEEEeCC-----------CCEEEEEEEecC-------CccCEEEEE
Confidence 568888888 7899999997753 4556666510 122333333222 245999999
Q ss_pred eCCcc
Q psy15989 126 LSSDV 130 (363)
Q Consensus 126 L~~~~ 130 (363)
.+..-
T Consensus 126 i~~~~ 130 (347)
T COG0265 126 IDGAG 130 (347)
T ss_pred eccCC
Confidence 99753
No 26
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=46.24 E-value=34 Score=31.98 Aligned_cols=27 Identities=26% Similarity=0.533 Sum_probs=21.5
Q ss_pred ccCcccccceeEEEEcCccEEEEEEEE
Q psy15989 207 NVCNGDSGGGMVFKIDSAWYLRGIVSI 233 (363)
Q Consensus 207 ~~C~gdsGgpl~~~~~~~~~l~Gi~s~ 233 (363)
+.-.||.|.|.+++..+-|+++|++..
T Consensus 499 GT~PGDCGcPYvyKrgNd~VV~GVH~A 525 (535)
T PF05416_consen 499 GTIPGDCGCPYVYKRGNDWVVIGVHAA 525 (535)
T ss_dssp S--TTGTT-EEEEEETTEEEEEEEEEE
T ss_pred CCCCCCCCCceeeecCCcEEEEEEEeh
Confidence 345799999999999999999999765
No 27
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=46.23 E-value=18 Score=27.72 Aligned_cols=21 Identities=38% Similarity=0.464 Sum_probs=16.3
Q ss_pred cccccceeEEEEcCccEEEEEEEEE
Q psy15989 210 NGDSGGGMVFKIDSAWYLRGIVSIT 234 (363)
Q Consensus 210 ~gdsGgpl~~~~~~~~~l~Gi~s~~ 234 (363)
.||.||+|.|+. =++||++.+
T Consensus 89 PGdCGg~L~C~H----GViGi~Tag 109 (127)
T PF00947_consen 89 PGDCGGILRCKH----GVIGIVTAG 109 (127)
T ss_dssp TT-TCSEEEETT----CEEEEEEEE
T ss_pred CCCCCceeEeCC----CeEEEEEeC
Confidence 599999999986 578887764
No 28
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=41.91 E-value=16 Score=36.96 Aligned_cols=21 Identities=43% Similarity=0.702 Sum_probs=19.0
Q ss_pred EEEEEEeeCC-EEEeeccCccc
Q psy15989 48 VCGGSLVSVN-YVITAAHCVTK 68 (363)
Q Consensus 48 ~CgGsLIs~~-~VLTAAhC~~~ 68 (363)
.|+|++||+. .|||=-||..+
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred ceeEEEEcCCceEEecchhhhh
Confidence 4999999998 89999999864
No 29
>smart00816 Amb_V_allergen Amb V Allergen. Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactivity T cell epitopes within these related allergens.
Probab=29.76 E-value=32 Score=20.31 Aligned_cols=31 Identities=32% Similarity=0.882 Sum_probs=18.7
Q ss_pred CCc-CCCCCCccccCCccceeecCeecCCCCC-cceEEEEEe
Q psy15989 1 MCY-RDVSCGTVVYNKAQPLVTYGQKTARGQW-PWHVALYRT 40 (363)
Q Consensus 1 ~~~-~~~~CG~~~~~~~~~rIvgG~~a~~~~~-Pw~v~l~~~ 40 (363)
+|| .-..||.. ... --+++|.| ||||--+..
T Consensus 4 ~Cy~aG~~CGek--r~Y-------CcSdpGrYCpwqvVCYeS 36 (45)
T smart00816 4 LCYWAGTNCGEK--RKY-------CCSDPGRYCPWQVVCYES 36 (45)
T ss_pred chhccccccccc--Ccc-------ccCCCcccCCceEEEeeh
Confidence 366 45578864 111 12466777 999987653
No 30
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=29.07 E-value=85 Score=29.90 Aligned_cols=38 Identities=24% Similarity=0.171 Sum_probs=26.4
Q ss_pred CCceEEEeecceeecccceEeeeeCCCCCCccccccccCCcEEEEEec
Q psy15989 269 LGDIALLQLSSDVDYSMYVRPVCLWDDSTAPLQLSAVEGRDGTVIGWG 316 (363)
Q Consensus 269 ~nDIall~L~~~v~~~~~v~picl~~~~~~~~~~~~~~~~~~~v~GWG 316 (363)
..||||||++.+ ..+.++.|.+... ...|..+.+.|+-
T Consensus 104 ~~DlAllkv~~~----~~~~~~~l~~~~~------~~~G~~v~aiG~p 141 (428)
T TIGR02037 104 RTDIAVLKIDAK----KNLPVIKLGDSDK------LRVGDWVLAIGNP 141 (428)
T ss_pred CCCEEEEEecCC----CCceEEEccCCCC------CCCCCEEEEEECC
Confidence 479999999864 3466777754332 3457888888874
No 31
>cd07268 Glo_EDI_BRP_like_4 This conserved domain belongs to a superfamily including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. This protein family belongs to a conserved domain superfamily that is found in a variety of structurally related metalloproteins, including the bleomycin resistance protein, glyoxalase I, and type I ring-cleaving dioxygenases. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). The protein superfamily contains members with or without domain swapping. The proteins of this family share three conserved metal binding amino acids with the type I extradiol dioxygenases, which show no domain swapping.
Probab=23.94 E-value=61 Score=25.67 Aligned_cols=27 Identities=22% Similarity=0.128 Sum_probs=21.8
Q ss_pred CCceEEEeecceeeccc-ceEeeeeCCC
Q psy15989 269 LGDIALLQLSSDVDYSM-YVRPVCLWDD 295 (363)
Q Consensus 269 ~nDIall~L~~~v~~~~-~v~picl~~~ 295 (363)
..=|+|++|.+|+.+.. .|.-|.||-.
T Consensus 36 GRPI~l~~L~qPl~~~~~~I~cvELP~P 63 (149)
T cd07268 36 GRPIALIKLEKPLQFAGWSISIVELPFP 63 (149)
T ss_pred CeeEEEEEcCCCceeCCcEEEEEEeCCC
Confidence 34699999999999987 5777788754
No 32
>PF03913 Amb_V_allergen: Amb V Allergen; InterPro: IPR005611 Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactivity T cell epitopes within these related allergens [].; PDB: 2BBG_A 3BBG_A 1BBG_A.
Probab=23.26 E-value=40 Score=19.85 Aligned_cols=30 Identities=30% Similarity=0.743 Sum_probs=12.1
Q ss_pred CCc-CCCCCCccccCCccceeecCeecCCCCC-cceEEEEE
Q psy15989 1 MCY-RDVSCGTVVYNKAQPLVTYGQKTARGQW-PWHVALYR 39 (363)
Q Consensus 1 ~~~-~~~~CG~~~~~~~~~rIvgG~~a~~~~~-Pw~v~l~~ 39 (363)
+|| .-..||.. ... --.++|.| ||||--+.
T Consensus 3 ~Cy~aG~~CGek--r~Y-------CcSdpGrYCpwqvVCYe 34 (44)
T PF03913_consen 3 PCYWAGNICGEK--RAY-------CCSDPGRYCPWQVVCYE 34 (44)
T ss_dssp --E-ESSTTS-T--TSE-------EE-SSSSS-----EEES
T ss_pred cccccccccccc--CCe-------ecCCCcccccceeeeec
Confidence 366 45578864 111 12567887 99997754
No 33
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=21.57 E-value=62 Score=28.43 Aligned_cols=21 Identities=33% Similarity=0.347 Sum_probs=15.0
Q ss_pred cccccceeEEEEcCccEEEEEEEE
Q psy15989 210 NGDSGGGMVFKIDSAWYLRGIVSI 233 (363)
Q Consensus 210 ~gdsGgpl~~~~~~~~~l~Gi~s~ 233 (363)
.||||+|.+.... .++||++-
T Consensus 207 ~GDSGSPVVt~dg---~liGVHTG 227 (297)
T PF05579_consen 207 PGDSGSPVVTEDG---DLIGVHTG 227 (297)
T ss_dssp GGCTT-EEEETTC----EEEEEEE
T ss_pred CCCCCCccCcCCC---CEEEEEec
Confidence 4899999997653 68898865
No 34
>COG3102 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=21.33 E-value=76 Score=25.68 Aligned_cols=26 Identities=15% Similarity=0.082 Sum_probs=20.8
Q ss_pred CCceEEEeecceeeccc-ceEeeeeCC
Q psy15989 269 LGDIALLQLSSDVDYSM-YVRPVCLWD 294 (363)
Q Consensus 269 ~nDIall~L~~~v~~~~-~v~picl~~ 294 (363)
..-|.|+||.+|+.|.. .+.-|.||-
T Consensus 74 GRpI~li~l~~Pl~v~~w~id~iELP~ 100 (185)
T COG3102 74 GRPICLIKLHQPLQVAHWQIDIIELPY 100 (185)
T ss_pred CceEEEEEcCCcceecceEEEEEEccC
Confidence 34699999999999877 577777874
Done!