Query psy4294
Match_columns 238
No_of_seqs 113 out of 1653
Neff 9.2
Searched_HMMs 46136
Date Fri Aug 16 21:52:02 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy4294.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/4294hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 7.9E-33 1.7E-37 223.8 15.1 171 21-209 1-232 (232)
2 smart00020 Tryp_SPc Trypsin-li 100.0 5.2E-30 1.1E-34 207.5 15.4 169 20-206 1-229 (229)
3 KOG3627|consensus 100.0 2.8E-29 6.1E-34 207.2 17.4 183 17-211 9-255 (256)
4 PF00089 Trypsin: Trypsin; In 100.0 6.5E-29 1.4E-33 199.5 14.6 166 21-206 1-220 (220)
5 COG5640 Secreted trypsin-like 99.9 1.8E-22 3.9E-27 166.5 11.2 194 17-225 29-293 (413)
6 PF09342 DUF1986: Domain of un 99.6 2.4E-14 5.2E-19 113.1 10.4 120 28-158 12-131 (267)
7 PF03761 DUF316: Domain of unk 99.2 1.4E-10 2.9E-15 97.2 11.7 183 17-211 38-280 (282)
8 COG3591 V8-like Glu-specific e 97.8 0.00017 3.6E-09 58.8 9.6 40 28-67 45-87 (251)
9 PF13365 Trypsin_2: Trypsin-li 97.7 0.00012 2.7E-09 52.5 6.4 63 46-115 1-65 (120)
10 TIGR02037 degP_htrA_DO peripla 96.9 0.0044 9.6E-08 55.1 8.4 58 43-117 57-115 (428)
11 TIGR02038 protease_degS peripl 96.5 0.034 7.3E-07 48.2 10.5 69 32-117 55-135 (351)
12 PRK10898 serine endoprotease; 96.1 0.065 1.4E-06 46.5 10.1 69 32-117 55-135 (353)
13 PRK10942 serine endoprotease; 96.1 0.035 7.7E-07 50.0 8.7 57 44-117 111-169 (473)
14 PRK10139 serine endoprotease; 95.5 0.073 1.6E-06 47.8 8.4 57 44-117 90-148 (455)
15 PF10459 Peptidase_S46: Peptid 48.6 12 0.00026 35.6 2.1 20 46-65 49-69 (698)
16 PF00863 Peptidase_C4: Peptida 37.8 2.3E+02 0.0049 23.2 7.9 56 50-119 37-94 (235)
17 PF02395 Peptidase_S6: Immunog 36.3 87 0.0019 30.4 5.7 65 47-131 68-132 (769)
18 COG0265 DegQ Trypsin-like seri 28.4 1.9E+02 0.004 24.9 6.1 57 44-117 72-129 (347)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=7.9e-33 Score=223.82 Aligned_cols=171 Identities=38% Similarity=0.643 Sum_probs=140.2
Q ss_pred eecCeecCCCCCceEEEEeeC-CeeeEEEEEeeCCEEEEccCccccccccccccccEEEeceeeecc---cccccccccc
Q psy4294 21 IVGGRDVNPGEVPYIVSLSLY-GNLYCGGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNY---KEQKRQPQLN 96 (238)
Q Consensus 21 i~gG~~~~~~~~Pw~v~i~~~-~~~~C~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~---~~~~~~~~~~ 96 (238)
|+||++++.++|||+|.++.. ..+.|+|+||+++||||||||+.... ...+.+.+|...... .....
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~----~~~~~v~~g~~~~~~~~~~~~~~----- 71 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSA----PSNYTVRLGSHDLSSNEGGGQVI----- 71 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCC----CccEEEEeCcccccCCCCceEEE-----
Confidence 579999999999999999987 78999999999999999999998742 227778888776543 34444
Q ss_pred eeeEEEeCC-------CCceEEEEecCCcccccccccccccCcccccccccccccCCCCceeEEecccccccCC------
Q psy4294 97 EIALIYWHS-------DADLAMVKLKEPFRQTTFVKPLDYYTARETNYINDVLSKTDRSEMSIVSGFGVTFQRD------ 163 (238)
Q Consensus 97 ~v~~i~~hp-------~~Diall~L~~~~~~~~~i~picl~~~~~~~~~~~~~~~~~~~~~~~~~G~g~~~~~~------ 163 (238)
.|++++.|| .+|+|||||++|+.++.+++|+|||...... .. +..+.+.|||......
T Consensus 72 ~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~--------~~-~~~~~~~G~g~~~~~~~~~~~~ 142 (232)
T cd00190 72 KVKKVIVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNL--------PA-GTTCTVSGWGRTSEGGPLPDVL 142 (232)
T ss_pred EEEEEEECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccC--------CC-CCEEEEEeCCcCCCCCCCCcee
Confidence 899999998 5899999999999999999999999874322 44 8899999999986541
Q ss_pred ----------------CC--CCccCCCCCCC-------------c------------eEEEEeeCCCCCC-CCCceeeeC
Q psy4294 164 ----------------KD--GIVSWGIGCAL-------------G------------YPGIVSWGIGCAL-GYPGVYVRV 199 (238)
Q Consensus 164 ----------------~~--~~~~~~~~C~~-------------G------------L~Gv~s~g~~C~~-~~p~vyt~V 199 (238)
+. ..+.+.++|+. | |+||+|+|..|.. +.|++|++|
T Consensus 143 ~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v 222 (232)
T cd00190 143 QEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRV 222 (232)
T ss_pred eEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEc
Confidence 11 23445566664 1 9999999999997 889999999
Q ss_pred CCCHHHHHHH
Q psy4294 200 DHYDPWIQSV 209 (238)
Q Consensus 200 ~~~~dWI~~~ 209 (238)
..|++||+++
T Consensus 223 ~~~~~WI~~~ 232 (232)
T cd00190 223 SSYLDWIQKT 232 (232)
T ss_pred HHhhHHhhcC
Confidence 9999999864
No 2
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.97 E-value=5.2e-30 Score=207.45 Aligned_cols=169 Identities=40% Similarity=0.668 Sum_probs=137.5
Q ss_pred eeecCeecCCCCCceEEEEeeCC-eeeEEEEEeeCCEEEEccCccccccccccccccEEEeceeeecccc--cccccccc
Q psy4294 20 RIVGGRDVNPGEVPYIVSLSLYG-NLYCGGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKE--QKRQPQLN 96 (238)
Q Consensus 20 ~i~gG~~~~~~~~Pw~v~i~~~~-~~~C~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~--~~~~~~~~ 96 (238)
||+||+++..++|||++.++... .+.|+|+||++++|||||||+.... ...+.+.+|........ ...
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~----~~~~~v~~g~~~~~~~~~~~~~----- 71 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSD----PSNIRVRLGSHDLSSGEEGQVI----- 71 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCC----CcceEEEeCcccCCCCCCceEE-----
Confidence 58999999999999999999876 8899999999999999999998743 22778888876643332 445
Q ss_pred eeeEEEeCC-------CCceEEEEecCCcccccccccccccCcccccccccccccCCCCceeEEecccccccCC------
Q psy4294 97 EIALIYWHS-------DADLAMVKLKEPFRQTTFVKPLDYYTARETNYINDVLSKTDRSEMSIVSGFGVTFQRD------ 163 (238)
Q Consensus 97 ~v~~i~~hp-------~~Diall~L~~~~~~~~~i~picl~~~~~~~~~~~~~~~~~~~~~~~~~G~g~~~~~~------ 163 (238)
.+..++.|| .+|+|||||++|+.++..++|+||+...... .. +..+.+.|||......
T Consensus 72 ~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~--------~~-~~~~~~~g~g~~~~~~~~~~~~ 142 (229)
T smart00020 72 KVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNV--------PA-GTTCTVSGWGRTSEGAGSLPDT 142 (229)
T ss_pred eeEEEEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCccccc--------CC-CCEEEEEeCCCCCCCCCcCCCE
Confidence 899999998 5899999999999998999999999873333 44 8889999999886411
Q ss_pred -----------------CC--CCccCCCCCCC-------------c-----------eEEEEeeCCCCCC-CCCceeeeC
Q psy4294 164 -----------------KD--GIVSWGIGCAL-------------G-----------YPGIVSWGIGCAL-GYPGVYVRV 199 (238)
Q Consensus 164 -----------------~~--~~~~~~~~C~~-------------G-----------L~Gv~s~g~~C~~-~~p~vyt~V 199 (238)
+. ..+...++|++ | |+||+|+|..|.. +.|.+|++|
T Consensus 143 ~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~~C~~~~~~~~~~~i 222 (229)
T smart00020 143 LQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGSGCARPGKPGVYTRV 222 (229)
T ss_pred eeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECCCCCCCCCCCEEEEe
Confidence 11 13455566664 1 9999999999995 789999999
Q ss_pred CCCHHHH
Q psy4294 200 DHYDPWI 206 (238)
Q Consensus 200 ~~~~dWI 206 (238)
..|.+||
T Consensus 223 ~~~~~WI 229 (229)
T smart00020 223 SSYLDWI 229 (229)
T ss_pred ccccccC
Confidence 9999998
No 3
>KOG3627|consensus
Probab=99.97 E-value=2.8e-29 Score=207.18 Aligned_cols=183 Identities=34% Similarity=0.621 Sum_probs=142.9
Q ss_pred CCCeeecCeecCCCCCceEEEEeeCC--eeeEEEEEeeCCEEEEccCccccccccccccccEEEeceeeeccccccc-cc
Q psy4294 17 IGGRIVGGRDVNPGEVPYIVSLSLYG--NLYCGGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKR-QP 93 (238)
Q Consensus 17 ~~~~i~gG~~~~~~~~Pw~v~i~~~~--~~~C~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~-~~ 93 (238)
...+|+||.++..+++||++++..+. .+.|+|+||+++||||||||+.... .. ...|+.|+.......... ..
T Consensus 9 ~~~~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~---~~-~~~V~~G~~~~~~~~~~~~~~ 84 (256)
T KOG3627|consen 9 PEGRIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGAS---AS-LYTVRLGEHDINLSVSEGEEQ 84 (256)
T ss_pred ccCCEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCC---Cc-ceEEEECccccccccccCchh
Confidence 46899999999999999999999876 7899999999999999999999853 11 666777765422221111 11
Q ss_pred ccceeeEEEeCC-------C-CceEEEEecCCcccccccccccccCcccc-cccccccccCCCCceeEEecccccccC--
Q psy4294 94 QLNEIALIYWHS-------D-ADLAMVKLKEPFRQTTFVKPLDYYTARET-NYINDVLSKTDRSEMSIVSGFGVTFQR-- 162 (238)
Q Consensus 94 ~~~~v~~i~~hp-------~-~Diall~L~~~~~~~~~i~picl~~~~~~-~~~~~~~~~~~~~~~~~~~G~g~~~~~-- 162 (238)
....+.+++.|| . +||||++|+.++.|+..++|+|||..... . ... +..|.++|||++...
T Consensus 85 ~~~~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~-------~~~-~~~~~v~GWG~~~~~~~ 156 (256)
T KOG3627|consen 85 LVGDVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYF-------PPG-GTTCLVSGWGRTESGGG 156 (256)
T ss_pred hhceeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCC-------CCC-CCEEEEEeCCCcCCCCC
Confidence 222577888998 3 89999999999999999999999854431 1 033 688999999998654
Q ss_pred C---------------------CCC--CccCCCCCCC-------------c------------eEEEEeeCCC-CCC-CC
Q psy4294 163 D---------------------KDG--IVSWGIGCAL-------------G------------YPGIVSWGIG-CAL-GY 192 (238)
Q Consensus 163 ~---------------------~~~--~~~~~~~C~~-------------G------------L~Gv~s~g~~-C~~-~~ 192 (238)
. +.. .+...++|++ | ++||+|||.. |+. ..
T Consensus 157 ~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~ 236 (256)
T KOG3627|consen 157 PLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNY 236 (256)
T ss_pred CCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCC
Confidence 2 222 3555678987 1 8999999987 998 69
Q ss_pred CceeeeCCCCHHHHHHHHh
Q psy4294 193 PGVYVRVDHYDPWIQSVKN 211 (238)
Q Consensus 193 p~vyt~V~~~~dWI~~~i~ 211 (238)
|++||+|+.|.|||++.+.
T Consensus 237 P~vyt~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 237 PGVYTRVSSYLDWIKENIG 255 (256)
T ss_pred CeEEeEhHHhHHHHHHHhc
Confidence 9999999999999999875
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.96 E-value=6.5e-29 Score=199.46 Aligned_cols=166 Identities=34% Similarity=0.611 Sum_probs=133.7
Q ss_pred eecCeecCCCCCceEEEEeeCC-eeeEEEEEeeCCEEEEccCccccccccccccccEEEeceeee-cc--cccccccccc
Q psy4294 21 IVGGRDVNPGEVPYIVSLSLYG-NLYCGGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYR-NY--KEQKRQPQLN 96 (238)
Q Consensus 21 i~gG~~~~~~~~Pw~v~i~~~~-~~~C~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~-~~--~~~~~~~~~~ 96 (238)
|.||.++..++|||+|.++... .++|+|+||+++||||||||+.... .+.+.+|.... .. .....
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~~~------~~~v~~g~~~~~~~~~~~~~~----- 69 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDGAS------DIKVRLGTYSIRNSDGSEQTI----- 69 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTSGG------SEEEEESESBTTSTTTTSEEE-----
T ss_pred CCCCEECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccccc------ccccccccccccccccccccc-----
Confidence 6899999999999999999977 8999999999999999999999833 67777887221 11 12445
Q ss_pred eeeEEEeCC-------CCceEEEEecCCcccccccccccccCcccccccccccccCCCCceeEEecccccccCC------
Q psy4294 97 EIALIYWHS-------DADLAMVKLKEPFRQTTFVKPLDYYTARETNYINDVLSKTDRSEMSIVSGFGVTFQRD------ 163 (238)
Q Consensus 97 ~v~~i~~hp-------~~Diall~L~~~~~~~~~i~picl~~~~~~~~~~~~~~~~~~~~~~~~~G~g~~~~~~------ 163 (238)
.+++++.|| .+|+|||||++++.+...++|+|++...... .. +..+.+.||+......
T Consensus 70 ~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~--------~~-~~~~~~~G~~~~~~~~~~~~~~ 140 (220)
T PF00089_consen 70 KVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDP--------NV-GTSCIVVGWGRTSDNGYSSNLQ 140 (220)
T ss_dssp EEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTT--------TT-TSEEEEEESSBSSTTSBTSBEE
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccc--------cc-cccccccccccccccccccccc
Confidence 899999998 4799999999999999999999999854432 34 8899999999874322
Q ss_pred ---------------CCCCccCCCCC----------CC----------c-eEEEEeeCCCCCC-CCCceeeeCCCCHHHH
Q psy4294 164 ---------------KDGIVSWGIGC----------AL----------G-YPGIVSWGIGCAL-GYPGVYVRVDHYDPWI 206 (238)
Q Consensus 164 ---------------~~~~~~~~~~C----------~~----------G-L~Gv~s~g~~C~~-~~p~vyt~V~~~~dWI 206 (238)
+...+...++| .+ + |+||.+++..|.. +.|++|+||+.|+|||
T Consensus 141 ~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 141 SVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 22223444444 33 2 9999999999998 6799999999999998
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.88 E-value=1.8e-22 Score=166.49 Aligned_cols=194 Identities=26% Similarity=0.428 Sum_probs=125.3
Q ss_pred CCCeeecCeecCCCCCceEEEEeeC-----CeeeEEEEEeeCCEEEEccCccccccccccccccEEEeceeeeccccccc
Q psy4294 17 IGGRIVGGRDVNPGEVPYIVSLSLY-----GNLYCGGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKR 91 (238)
Q Consensus 17 ~~~~i~gG~~~~~~~~Pw~v~i~~~-----~~~~C~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~ 91 (238)
.++||+||..|+.++||++|++..+ ...+|+|+++..|||||||||+.... +......+.+-+.......+..
T Consensus 29 vs~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s--~is~d~~~vv~~l~d~Sq~~rg 106 (413)
T COG5640 29 VSSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASS--PISSDVNRVVVDLNDSSQAERG 106 (413)
T ss_pred cceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCC--CccccceEEEecccccccccCc
Confidence 6789999999999999999998753 24689999999999999999999855 1112333333333444455555
Q ss_pred ccccceeeEEEeCC-------CCceEEEEecCCccccc-ccccccccCcccccccccccccCCCCceeEEecccccccCC
Q psy4294 92 QPQLNEIALIYWHS-------DADLAMVKLKEPFRQTT-FVKPLDYYTARETNYINDVLSKTDRSEMSIVSGFGVTFQRD 163 (238)
Q Consensus 92 ~~~~~~v~~i~~hp-------~~Diall~L~~~~~~~~-~i~picl~~~~~~~~~~~~~~~~~~~~~~~~~G~g~~~~~~ 163 (238)
++-+++.|. .||+|+++|.++..... .+.-..-+. -++.+..+...+...+|+.+....
T Consensus 107 -----~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sd--------t~l~sv~~~s~~~n~t~~~~~~~~ 173 (413)
T COG5640 107 -----HVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASD--------TFLNSVTTVSPMTNGTFGVTTPSD 173 (413)
T ss_pred -----ceEEEeeecccccccccCcceeeccccccccchhheeeccCcc--------cceecccccccccceeeeeeeecC
Confidence 777888877 78999999998654211 011000000 000001113333444444333221
Q ss_pred --------------------------CC-------CCccCCCCCCC---------------------c--eEEEEeeCCC
Q psy4294 164 --------------------------KD-------GIVSWGIGCAL---------------------G--YPGIVSWGIG 187 (238)
Q Consensus 164 --------------------------~~-------~~~~~~~~C~~---------------------G--L~Gv~s~g~~ 187 (238)
+. ....-..+|++ | ++||+|||.+
T Consensus 174 v~~~~p~gt~l~e~~v~fv~~stc~~~~g~an~~dg~~~lT~~cag~~~~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~ 253 (413)
T COG5640 174 VPRSSPKGTILHEVAVLFVPLSTCAQYKGCANASDGATGLTGFCAGRPPKDACQGDSGGPIFHKGEEGRVQRGVVSWGDG 253 (413)
T ss_pred CCCCCCccceeeeeeeeeechHHhhhhccccccCCCCCCccceecCCCCcccccCCCCCceEEeCCCccEEEeEEEecCC
Confidence 11 11111235555 1 9999999996
Q ss_pred -CCC-CCCceeeeCCCCHHHHHHHHhCCCCcceeeehhhh
Q psy4294 188 -CAL-GYPGVYVRVDHYDPWIQSVKNNGDNAGVLISALHM 225 (238)
Q Consensus 188 -C~~-~~p~vyt~V~~~~dWI~~~i~~~~~~~~~~~~~~~ 225 (238)
|+. ..|.|||+|+.|.|||..+|+.-+......+..++
T Consensus 254 ~Cg~t~~~gVyT~vsny~~WI~a~~~~l~~~~~rp~~~~~ 293 (413)
T COG5640 254 GCGGTLIPGVYTNVSNYQDWIAAMTNGLSYLQFRPLGYRP 293 (413)
T ss_pred CCCCCCcceeEEehhHHHHHHHHHhcCCCccccccccccc
Confidence 998 88999999999999999999887765554444333
No 6
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=99.57 E-value=2.4e-14 Score=113.14 Aligned_cols=120 Identities=15% Similarity=0.306 Sum_probs=91.6
Q ss_pred CCCCCceEEEEeeCCeeeEEEEEeeCCEEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCCC
Q psy4294 28 NPGEVPYIVSLSLYGNLYCGGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSDA 107 (238)
Q Consensus 28 ~~~~~Pw~v~i~~~~~~~C~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~~ 107 (238)
+-..|||.+.|+.++.++|+|+||.++|+|++..|+...+...+ -+.+.+|..........-+.|+..|+.+..-|+.
T Consensus 12 e~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~--YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S 89 (267)
T PF09342_consen 12 EDYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHH--YVSALLGGGKTYLSVDGPHEQISRVDCFKDVPES 89 (267)
T ss_pred ccccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccc--eEEEEecCcceecccCCChheEEEeeeeeecccc
Confidence 34579999999999999999999999999999999987332222 4556777655322222223344499999888899
Q ss_pred ceEEEEecCCcccccccccccccCcccccccccccccCCCCceeEEecccc
Q psy4294 108 DLAMVKLKEPFRQTTFVKPLDYYTARETNYINDVLSKTDRSEMSIVSGFGV 158 (238)
Q Consensus 108 Diall~L~~~~~~~~~i~picl~~~~~~~~~~~~~~~~~~~~~~~~~G~g~ 158 (238)
+++|++|+.|++|+.+++|..+|.....+ .. ...|...|-..
T Consensus 90 ~v~LLHL~~~~~fTr~VlP~flp~~~~~~--------~~-~~~CVAVg~d~ 131 (267)
T PF09342_consen 90 NVLLLHLEQPANFTRYVLPTFLPETSNEN--------ES-DDECVAVGHDD 131 (267)
T ss_pred ceeeeeecCcccceeeecccccccccCCC--------CC-CCceEEEEccc
Confidence 99999999999999999999999743333 33 66888888765
No 7
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.22 E-value=1.4e-10 Score=97.24 Aligned_cols=183 Identities=19% Similarity=0.238 Sum_probs=111.5
Q ss_pred CCCeeecCeecCCCCCceEEEEeeCC----eeeEEEEEeeCCEEEEccCccccccccc----cccccEEEec-------e
Q psy4294 17 IGGRIVGGRDVNPGEVPYIVSLSLYG----NLYCGGSLISLQWFLSARHCFVTENLVW----NQFNPLIIAG-------S 81 (238)
Q Consensus 17 ~~~~i~gG~~~~~~~~Pw~v~i~~~~----~~~C~GtLI~~~~VLTaAhCv~~~~~~~----~~~~~~v~~g-------~ 81 (238)
..+++.+|.++..++.||.+.+...+ ...++||+||+|||||++||+......| ........-+ .
T Consensus 38 ~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~ 117 (282)
T PF03761_consen 38 YPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPE 117 (282)
T ss_pred CcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccceeeCCCceEEeCH
Confidence 45567899999999999999998743 3567999999999999999998644333 0000011111 0
Q ss_pred eeec---c----cccccccccceeeEEEeC-----------CCCceEEEEecCCcccccccccccccCcccccccccccc
Q psy4294 82 IYRN---Y----KEQKRQPQLNEIALIYWH-----------SDADLAMVKLKEPFRQTTFVKPLDYYTARETNYINDVLS 143 (238)
Q Consensus 82 ~~~~---~----~~~~~~~~~~~v~~i~~h-----------p~~Diall~L~~~~~~~~~i~picl~~~~~~~~~~~~~~ 143 (238)
.... . ...........+.++++- ..++.+|++|+++ ++....|+||+......
T Consensus 118 ~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~~~~~~------- 188 (282)
T PF03761_consen 118 EVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLADSSTNW------- 188 (282)
T ss_pred HHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCCCcccc-------
Confidence 0000 0 000001111244444331 1678999999999 67789999999876543
Q ss_pred cCCCCceeEEecccccccCC------------CCCCccCCCCCCC---c-----------eEEEEeeCC-CCCCCCCcee
Q psy4294 144 KTDRSEMSIVSGFGVTFQRD------------KDGIVSWGIGCAL---G-----------YPGIVSWGI-GCALGYPGVY 196 (238)
Q Consensus 144 ~~~~~~~~~~~G~g~~~~~~------------~~~~~~~~~~C~~---G-----------L~Gv~s~g~-~C~~~~p~vy 196 (238)
.. ++...+.|+..+..-. ..........|.+ | |+||.+.+. .|... ...|
T Consensus 189 -~~-~~~~~~yg~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~-~~~f 265 (282)
T PF03761_consen 189 -EK-GDEVDVYGFNSTGKLKHRKLKITNCTKCAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKN-NSYF 265 (282)
T ss_pred -cc-CceEEEeecCCCCeEEEEEEEEEEeeccceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCccccc-ccEE
Confidence 33 5555556661111000 1112233455664 2 999998886 35433 6889
Q ss_pred eeCCCCHHHHHHHHh
Q psy4294 197 VRVDHYDPWIQSVKN 211 (238)
Q Consensus 197 t~V~~~~dWI~~~i~ 211 (238)
.+|..|.|=|-+..+
T Consensus 266 ~~v~~~~~~IC~ltG 280 (282)
T PF03761_consen 266 FNVSWYQDEICELTG 280 (282)
T ss_pred EEHHHhhhhhcccee
Confidence 999999887766543
No 8
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=97.84 E-value=0.00017 Score=58.77 Aligned_cols=40 Identities=25% Similarity=0.510 Sum_probs=31.5
Q ss_pred CCCCCceEEEEee---CCeeeEEEEEeeCCEEEEccCcccccc
Q psy4294 28 NPGEVPYIVSLSL---YGNLYCGGSLISLQWFLSARHCFVTEN 67 (238)
Q Consensus 28 ~~~~~Pw~v~i~~---~~~~~C~GtLI~~~~VLTaAhCv~~~~ 67 (238)
+-..|||-+-... .++.-|+++||+++.|||++||+....
T Consensus 45 dt~~~Py~av~~~~~~tG~~~~~~~lI~pntvLTa~Hc~~s~~ 87 (251)
T COG3591 45 DTTQFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYSPD 87 (251)
T ss_pred cCCCCCcceeEEeecCCCcceeeEEEEcCceEEEeeeEEecCC
Confidence 3457999877755 345556679999999999999999854
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.71 E-value=0.00012 Score=52.50 Aligned_cols=63 Identities=22% Similarity=0.184 Sum_probs=35.8
Q ss_pred EEEEEeeCC-EEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCC-CceEEEEec
Q psy4294 46 CGGSLISLQ-WFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSD-ADLAMVKLK 115 (238)
Q Consensus 46 C~GtLI~~~-~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~-~Diall~L~ 115 (238)
|+|.+|.++ +|||++||+.............+...... ...... ...-+...+. .|+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~--~~~~~~-----~~~~~~~~~~~~D~All~v~ 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPD--GRRVPP-----VAEVVYFDPDDYDLALLKVD 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETT--SCEEET-----EEEEEEEETT-TTEEEEEES
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecC--CCEEee-----eEEEEEECCccccEEEEEEe
Confidence 789999999 99999999997331111011222211111 011111 1344444555 899999999
No 10
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=96.93 E-value=0.0044 Score=55.10 Aligned_cols=58 Identities=16% Similarity=0.239 Sum_probs=43.4
Q ss_pred eeeEEEEEeeCC-EEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCCCceEEEEecCC
Q psy4294 43 NLYCGGSLISLQ-WFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSDADLAMVKLKEP 117 (238)
Q Consensus 43 ~~~C~GtLI~~~-~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~~Diall~L~~~ 117 (238)
...++|.+|+++ +|||.+|++.... .+.|.+.+ ...+ ..+-+..++..|+|+||++.+
T Consensus 57 ~~~GSGfii~~~G~IlTn~Hvv~~~~------~i~V~~~~------~~~~-----~a~vv~~d~~~DlAllkv~~~ 115 (428)
T TIGR02037 57 RGLGSGVIISADGYILTNNHVVDGAD------EITVTLSD------GREF-----KAKLVGKDPRTDIAVLKIDAK 115 (428)
T ss_pred cceeeEEEECCCCEEEEcHHHcCCCC------eEEEEeCC------CCEE-----EEEEEEecCCCCEEEEEecCC
Confidence 457999999986 9999999998755 55555432 2234 555566678899999999865
No 11
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=96.49 E-value=0.034 Score=48.19 Aligned_cols=69 Identities=19% Similarity=0.264 Sum_probs=49.2
Q ss_pred CceEEEEeeCC-----------eeeEEEEEeeCC-EEEEccCccccccccccccccEEEeceeeecccccccccccceee
Q psy4294 32 VPYIVSLSLYG-----------NLYCGGSLISLQ-WFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIA 99 (238)
Q Consensus 32 ~Pw~v~i~~~~-----------~~~C~GtLI~~~-~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~ 99 (238)
-|-+|.|+..+ ....+|.+|+++ +|||.+|.+.... .+.|.+. +...+ ..+
T Consensus 55 ~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~------~i~V~~~------dg~~~-----~a~ 117 (351)
T TIGR02038 55 APAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKAD------QIVVALQ------DGRKF-----EAE 117 (351)
T ss_pred CCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCCC------EEEEEEC------CCCEE-----EEE
Confidence 47888886521 235999999987 9999999998754 4555542 22334 555
Q ss_pred EEEeCCCCceEEEEecCC
Q psy4294 100 LIYWHSDADLAMVKLKEP 117 (238)
Q Consensus 100 ~i~~hp~~Diall~L~~~ 117 (238)
-+...+..|+|+||++.+
T Consensus 118 vv~~d~~~DlAvlkv~~~ 135 (351)
T TIGR02038 118 LVGSDPLTDLAVLKIEGD 135 (351)
T ss_pred EEEecCCCCEEEEEecCC
Confidence 566778999999999864
No 12
>PRK10898 serine endoprotease; Provisional
Probab=96.09 E-value=0.065 Score=46.46 Aligned_cols=69 Identities=17% Similarity=0.274 Sum_probs=48.8
Q ss_pred CceEEEEeeCC-----------eeeEEEEEeeCC-EEEEccCccccccccccccccEEEeceeeecccccccccccceee
Q psy4294 32 VPYIVSLSLYG-----------NLYCGGSLISLQ-WFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIA 99 (238)
Q Consensus 32 ~Pw~v~i~~~~-----------~~~C~GtLI~~~-~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~ 99 (238)
-|-+|.|.... ....+|.+|+++ +|||.+|=+.... .+.|.+.+ ...+ ..+
T Consensus 55 ~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~------~i~V~~~d------g~~~-----~a~ 117 (353)
T PRK10898 55 APAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDAD------QIIVALQD------GRVF-----EAL 117 (353)
T ss_pred CCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCCC------EEEEEeCC------CCEE-----EEE
Confidence 37777776521 246899999976 9999999998755 55555422 2233 455
Q ss_pred EEEeCCCCceEEEEecCC
Q psy4294 100 LIYWHSDADLAMVKLKEP 117 (238)
Q Consensus 100 ~i~~hp~~Diall~L~~~ 117 (238)
-+...|..|+|+||++.+
T Consensus 118 vv~~d~~~DlAvl~v~~~ 135 (353)
T PRK10898 118 LVGSDSLTDLAVLKINAT 135 (353)
T ss_pred EEEEcCCCCEEEEEEcCC
Confidence 566678899999999754
No 13
>PRK10942 serine endoprotease; Provisional
Probab=96.09 E-value=0.035 Score=50.00 Aligned_cols=57 Identities=14% Similarity=0.289 Sum_probs=42.6
Q ss_pred eeEEEEEeeC--CEEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCCCceEEEEecCC
Q psy4294 44 LYCGGSLISL--QWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSDADLAMVKLKEP 117 (238)
Q Consensus 44 ~~C~GtLI~~--~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~~Diall~L~~~ 117 (238)
...+|.+|+. -+|||.+|.+.... .+.|.+.+ ...+ ..+-+..++..|+||||++.+
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~a~------~i~V~~~d------g~~~-----~a~vv~~D~~~DlAvlki~~~ 169 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDNAT------KIKVQLSD------GRKF-----DAKVVGKDPRSDIALIQLQNP 169 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCCCC------EEEEEECC------CCEE-----EEEEEEecCCCCEEEEEecCC
Confidence 4699999985 49999999998765 66666532 2334 555666788999999999753
No 14
>PRK10139 serine endoprotease; Provisional
Probab=95.54 E-value=0.073 Score=47.76 Aligned_cols=57 Identities=18% Similarity=0.296 Sum_probs=42.9
Q ss_pred eeEEEEEeeC--CEEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCCCceEEEEecCC
Q psy4294 44 LYCGGSLISL--QWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSDADLAMVKLKEP 117 (238)
Q Consensus 44 ~~C~GtLI~~--~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~~Diall~L~~~ 117 (238)
...+|.+|++ -+|||.+|.+.... .+.|.+. +...+ ..+-+...+..|+|+||++.+
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~a~------~i~V~~~------dg~~~-----~a~vvg~D~~~DlAvlkv~~~ 148 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQAQ------KISIQLN------DGREF-----DAKLIGSDDQSDIALLQIQNP 148 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCCCC------EEEEEEC------CCCEE-----EEEEEEEcCCCCEEEEEecCC
Confidence 4689999974 59999999998866 6666652 22334 556666677899999999854
No 15
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=48.56 E-value=12 Score=35.58 Aligned_cols=20 Identities=35% Similarity=0.698 Sum_probs=18.6
Q ss_pred EEEEEeeCC-EEEEccCcccc
Q psy4294 46 CGGSLISLQ-WFLSARHCFVT 65 (238)
Q Consensus 46 C~GtLI~~~-~VLTaAhCv~~ 65 (238)
|+|++||++ .|||--||...
T Consensus 49 CSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 49 CSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred eeEEEEcCCceEEecchhhhh
Confidence 999999998 89999999975
No 16
>PF00863 Peptidase_C4: Peptidase family C4. This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=37.82 E-value=2.3e+02 Score=23.21 Aligned_cols=56 Identities=14% Similarity=0.237 Sum_probs=28.7
Q ss_pred EeeCCEEEEccCccccccccccccccEEE--eceeeecccccccccccceeeEEEeCCCCceEEEEecCCcc
Q psy4294 50 LISLQWFLSARHCFVTENLVWNQFNPLII--AGSIYRNYKEQKRQPQLNEIALIYWHSDADLAMVKLKEPFR 119 (238)
Q Consensus 50 LI~~~~VLTaAhCv~~~~~~~~~~~~~v~--~g~~~~~~~~~~~~~~~~~v~~i~~hp~~Diall~L~~~~~ 119 (238)
+..-.|++|-+|-+.+..- .+.+. .|...-.+.. .. .+..+ +..||.++||..+++
T Consensus 37 igyG~~iItn~HLf~~nng-----~L~i~s~hG~f~v~nt~-~l-----kv~~i---~~~DiviirmPkDfp 94 (235)
T PF00863_consen 37 IGYGSYIITNAHLFKRNNG-----ELTIKSQHGEFTVPNTT-QL-----KVHPI---EGRDIVIIRMPKDFP 94 (235)
T ss_dssp EEETTEEEEEGGGGSSTTC-----EEEEEETTEEEEECEGG-GS-----EEEE----TCSSEEEEE--TTS-
T ss_pred EeECCEEEEChhhhccCCC-----eEEEEeCceEEEcCCcc-cc-----ceEEe---CCccEEEEeCCcccC
Confidence 5567999999999977430 22222 1222111111 11 22222 267999999988764
No 17
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes.
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=36.29 E-value=87 Score=30.37 Aligned_cols=65 Identities=20% Similarity=0.166 Sum_probs=35.9
Q ss_pred EEEEeeCCEEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCCCceEEEEecCCccccccccc
Q psy4294 47 GGSLISLQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSDADLAMVKLKEPFRQTTFVKP 126 (238)
Q Consensus 47 ~GtLI~~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~~Diall~L~~~~~~~~~i~p 126 (238)
.+|||+|++|+|++|=... .-.+.+|.... ..+ .+..-..|+..|+.+-||.+-++- +.|
T Consensus 68 ~aTLigpqYiVSV~HN~~g--------y~~v~FG~~g~----~~Y-----~iV~RNn~~~~Df~~pRLnK~VTE---vaP 127 (769)
T PF02395_consen 68 VATLIGPQYIVSVKHNGKG--------YNSVSFGNEGQ----NTY-----KIVDRNNYPSGDFHMPRLNKFVTE---VAP 127 (769)
T ss_dssp S-EEEETTEEEBETTG-TS--------CCEECESCSST----CEE-----EEEEEEBETTSTEBEEEESS---S---S--
T ss_pred eEEEecCCeEEEEEccCCC--------cCceeecccCC----ceE-----EEEEccCCCCcccceeecCceEEE---Eec
Confidence 4899999999999998732 23345554322 223 344444455679999999887753 445
Q ss_pred ccccC
Q psy4294 127 LDYYT 131 (238)
Q Consensus 127 icl~~ 131 (238)
+.+..
T Consensus 128 ~~~t~ 132 (769)
T PF02395_consen 128 AEMTT 132 (769)
T ss_dssp --BBS
T ss_pred ccccc
Confidence 54443
No 18
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=28.36 E-value=1.9e+02 Score=24.85 Aligned_cols=57 Identities=16% Similarity=0.178 Sum_probs=40.0
Q ss_pred eeEEEEEee-CCEEEEccCccccccccccccccEEEeceeeecccccccccccceeeEEEeCCCCceEEEEecCC
Q psy4294 44 LYCGGSLIS-LQWFLSARHCFVTENLVWNQFNPLIIAGSIYRNYKEQKRQPQLNEIALIYWHSDADLAMVKLKEP 117 (238)
Q Consensus 44 ~~C~GtLI~-~~~VLTaAhCv~~~~~~~~~~~~~v~~g~~~~~~~~~~~~~~~~~v~~i~~hp~~Diall~L~~~ 117 (238)
...+|.++. ..+|+|-.|=+.... ...+.. ...... ..+.+-..+..|+|++|.+..
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~a~------~i~v~l------~dg~~~-----~a~~vg~d~~~dlavlki~~~ 129 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAGAE------EITVTL------ADGREV-----PAKLVGKDPISDLAVLKIDGA 129 (347)
T ss_pred ccccEEEEcCCeEEEecceecCCcc------eEEEEe------CCCCEE-----EEEEEecCCccCEEEEEeccC
Confidence 567888888 678999999887644 444444 333444 556666667889999999865
Done!