Query psy7302
Match_columns 741
No_of_seqs 371 out of 2123
Neff 6.1
Searched_HMMs 46136
Date Fri Aug 16 23:21:05 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy7302.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/7302hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00190 Tryp_SPc Trypsin-like 100.0 4.3E-40 9.3E-45 332.0 24.9 230 448-685 1-231 (232)
2 KOG3627|consensus 100.0 3.8E-38 8.3E-43 327.6 25.2 239 445-736 10-255 (256)
3 smart00020 Tryp_SPc Trypsin-li 100.0 3.2E-37 6.8E-42 312.5 25.5 227 447-683 1-229 (229)
4 PF00089 Trypsin: Trypsin; In 100.0 2.3E-35 4.9E-40 295.7 23.3 220 448-683 1-220 (220)
5 COG5640 Secreted trypsin-like 100.0 1.4E-28 3.1E-33 259.1 14.5 231 445-684 30-275 (413)
6 PF03761 DUF316: Domain of unk 99.7 7E-16 1.5E-20 163.9 17.7 223 432-685 28-277 (282)
7 PF09342 DUF1986: Domain of un 99.5 3.1E-12 6.8E-17 130.5 18.7 120 455-582 12-131 (267)
8 COG3591 V8-like Glu-specific e 98.5 6.3E-07 1.4E-11 93.7 11.3 177 453-662 43-224 (251)
9 cd00190 Tryp_SPc Trypsin-like 98.4 3E-07 6.5E-12 92.7 4.9 52 683-734 181-232 (232)
10 COG5640 Secreted trypsin-like 98.3 8.7E-07 1.9E-11 95.3 6.3 61 677-737 219-280 (413)
11 smart00020 Tryp_SPc Trypsin-li 98.0 6.4E-06 1.4E-10 83.4 4.3 48 683-731 182-229 (229)
12 PF13365 Trypsin_2: Trypsin-li 98.0 2.7E-05 5.9E-10 70.9 8.0 22 474-495 1-23 (120)
13 TIGR02037 degP_htrA_DO peripla 97.9 0.00022 4.8E-09 80.9 15.3 85 471-582 57-142 (428)
14 PF00089 Trypsin: Trypsin; In 97.9 1.5E-05 3.2E-10 79.8 4.7 48 681-731 173-220 (220)
15 PRK10898 serine endoprotease; 97.8 0.00084 1.8E-08 74.5 17.6 83 472-582 78-161 (353)
16 TIGR02038 protease_degS peripl 97.8 0.00071 1.5E-08 75.0 16.7 140 472-661 78-218 (351)
17 PRK10139 serine endoprotease; 97.4 0.0013 2.8E-08 75.4 12.1 142 472-662 90-233 (455)
18 PRK10942 serine endoprotease; 97.3 0.0045 9.7E-08 71.3 15.0 83 472-581 111-195 (473)
19 PF02395 Peptidase_S6: Immunog 94.4 0.14 3.1E-06 62.2 9.2 65 476-565 69-133 (769)
20 PF00548 Peptidase_C3: 3C cyst 87.2 4.1 9E-05 40.8 9.7 150 468-662 21-171 (172)
21 KOG3627|consensus 78.2 0.81 1.8E-05 47.6 0.6 18 668-685 235-252 (256)
22 smart00130 KR Kringle domain. 76.7 2.5 5.4E-05 37.3 3.1 35 401-436 44-82 (83)
23 PF00051 Kringle: Kringle doma 76.4 1.3 2.8E-05 38.4 1.2 33 400-433 42-77 (79)
24 PF00863 Peptidase_C4: Peptida 76.1 18 0.0004 38.1 9.8 145 478-676 37-184 (235)
25 PF03761 DUF316: Domain of unk 74.2 3.6 7.9E-05 43.8 4.2 48 682-731 227-275 (282)
26 cd00108 KR Kringle domain; Kri 73.0 3 6.5E-05 36.7 2.7 66 364-435 9-82 (83)
27 COG0265 DegQ Trypsin-like seri 65.4 75 0.0016 35.1 12.3 146 472-665 72-218 (347)
28 PF02395 Peptidase_S6: Immunog 50.8 12 0.00026 46.0 3.1 36 678-713 208-245 (769)
29 PF05416 Peptidase_C37: Southa 40.6 99 0.0022 35.4 7.9 38 623-660 485-525 (535)
30 PF11593 Med3: Mediator comple 26.9 1.8E+02 0.0039 32.8 7.0 19 21-39 99-118 (379)
31 KOG1924|consensus 24.3 2.1E+02 0.0045 35.3 7.3 10 720-729 978-987 (1102)
32 PF00947 Pico_P2A: Picornaviru 23.5 71 0.0015 30.7 2.8 34 637-679 89-122 (127)
No 1
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=100.00 E-value=4.3e-40 Score=332.04 Aligned_cols=230 Identities=47% Similarity=0.844 Sum_probs=201.0
Q ss_pred eeCCeecCCCCcceEEEEEeeCCceeEEEEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEE
Q psy7302 448 VVGGEDADPAEWCWQVALINSLNQYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATT 527 (741)
Q Consensus 448 IvGG~~a~~ge~PW~VsL~~~~~~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~I 527 (741)
|+||+.+..++|||+|+|+.....++|+|+||+++||||||||+.+.. ...+.|++|....... ....+.+.|.++
T Consensus 1 i~~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~--~~~~~v~~g~~~~~~~--~~~~~~~~v~~~ 76 (232)
T cd00190 1 IVGGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSA--PSNYTVRLGSHDLSSN--EGGGQVIKVKKV 76 (232)
T ss_pred CcCCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCC--CccEEEEeCcccccCC--CCceEEEEEEEE
Confidence 689999999999999999876567999999999999999999997643 4578899998776432 134578899999
Q ss_pred EEecCCCCCCCCCceEEEEeccccccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHH
Q psy7302 528 YIHHNHNSQTLDNDIALLKLHGQAELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECI 607 (741)
Q Consensus 528 iiHP~Y~~~t~~nDIALLkL~~pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~ 607 (741)
++||+|+.....+|||||||++++.++++++|||||........+..+.++|||...........+++..+.+++..+|.
T Consensus 77 ~~hp~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~ 156 (232)
T cd00190 77 IVHPNYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECK 156 (232)
T ss_pred EECCCCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhh
Confidence 99999998888999999999999999999999999987655677899999999987655456778999999999999998
Q ss_pred HhhccccccccccCCCeEEecCCC-CCCCCCCCCCCceEEeeCCeEEEEEEEecCCCCCCCCCCceeeeccccCCCccc
Q psy7302 608 RKINAVTEKIFILPASSFCAGGEE-GNDACQGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFNGMSKV 685 (741)
Q Consensus 608 ~~~~~~~~~~~~i~~~~~CAg~~~-g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrVS~y~~wI~~ 685 (741)
..+.. ...+.+++||++... ..+.|.|||||||++..+++|+|+||+|+|..|...+.|++|++|+.|.+||..
T Consensus 157 ~~~~~----~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~ 231 (232)
T cd00190 157 RAYSY----GGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQK 231 (232)
T ss_pred hhccC----cccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhc
Confidence 87753 124678999999766 789999999999999999999999999999999888899999999999999974
No 2
>KOG3627|consensus
Probab=100.00 E-value=3.8e-38 Score=327.58 Aligned_cols=239 Identities=43% Similarity=0.862 Sum_probs=191.5
Q ss_pred CCceeCCeecCCCCcceEEEEEeeCC-ceeEEEEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCC-cEEE
Q psy7302 445 GARVVGGEDADPAEWCWQVALINSLN-QYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPG-AQTL 522 (741)
Q Consensus 445 ~~RIvGG~~a~~ge~PW~VsL~~~~~-~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~-~q~~ 522 (741)
..||+||..+.+++|||+|+|..... .|+|+|+||+++||||||||+.... .. .+.|++|.+.......... .+..
T Consensus 10 ~~~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~-~~-~~~V~~G~~~~~~~~~~~~~~~~~ 87 (256)
T KOG3627|consen 10 EGRIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGAS-AS-LYTVRLGEHDINLSVSEGEEQLVG 87 (256)
T ss_pred cCCEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCC-Cc-ceEEEECccccccccccCchhhhc
Confidence 57999999999999999999987754 7899999999999999999998742 12 7889999876554322221 2455
Q ss_pred EEEEEEEecCCCCCCCC-CceEEEEeccccccCCceeEEecCCCCC--cCCCCCeEEEEEecccCCC-CCCcceeEEEEE
Q psy7302 523 RVATTYIHHNHNSQTLD-NDIALLKLHGQAELKDGVCLVCLPARGV--NHAAGKRCTVTGYGYMGEA-GPIPLRVREAEI 598 (741)
Q Consensus 523 ~V~~IiiHP~Y~~~t~~-nDIALLkL~~pi~~s~~V~PICLP~~~~--~~~~g~~~~v~GWG~t~~~-~~~s~~L~~~~v 598 (741)
.|.++++||+|+..... ||||||+|.+++.|++.|+|||||.... ....+..+.++|||.+... ...+..|+++++
T Consensus 88 ~v~~~i~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v 167 (256)
T KOG3627|consen 88 DVEKIIVHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDV 167 (256)
T ss_pred eeeEEEECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEE
Confidence 68889999999998877 9999999999999999999999985543 2455689999999987654 355778999999
Q ss_pred eeeChhHHHHhhccccccccccCCCeEEecC-CCCCCCCCCCCCCceEEeeCCeEEEEEEEecCCCCCCCCCCceeeecc
Q psy7302 599 PIVSDTECIRKINAVTEKIFILPASSFCAGG-EEGNDACQGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVS 677 (741)
Q Consensus 599 ~vis~~~C~~~~~~~~~~~~~i~~~~~CAg~-~~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrVS 677 (741)
++++..+|+..+.... .+.+.+|||+. ..+.++|+|||||||++..+++|+|+||+|||.+
T Consensus 168 ~i~~~~~C~~~~~~~~----~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~-------------- 229 (256)
T KOG3627|consen 168 PIISNSECRRAYGGLG----TITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG-------------- 229 (256)
T ss_pred eEcChhHhcccccCcc----ccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCC--------------
Confidence 9999999988876421 24566899998 5567899999999999998767777777777654
Q ss_pred ccCCCcccCCCCCceEeccCCeEEEEeeeecCCCCCCCCCCeEEEeccchHhHHHHhhc
Q psy7302 678 SFNGMSKVGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFIGWINQIIS 736 (741)
Q Consensus 678 ~y~~wI~~~DsGgp~~~~~~~~~~L~GI~S~G~~Cg~~~~P~VYTrVs~y~dWI~~vI~ 736 (741)
.|+..+.|++|+||+.|++||++.|.
T Consensus 230 ---------------------------------~C~~~~~P~vyt~V~~y~~WI~~~~~ 255 (256)
T KOG3627|consen 230 ---------------------------------GCGQPNYPGVYTRVSSYLDWIKENIG 255 (256)
T ss_pred ---------------------------------CCCCCCCCeEEeEhHHhHHHHHHHhc
Confidence 05555568888888888888888774
No 3
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=100.00 E-value=3.2e-37 Score=312.47 Aligned_cols=227 Identities=46% Similarity=0.837 Sum_probs=196.3
Q ss_pred ceeCCeecCCCCcceEEEEEeeCCceeEEEEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEE
Q psy7302 447 RVVGGEDADPAEWCWQVALINSLNQYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVAT 526 (741)
Q Consensus 447 RIvGG~~a~~ge~PW~VsL~~~~~~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~ 526 (741)
||+||+++.+++|||+|.|+.....+.|+|+||+++||||||||+.... ...+.|++|.+..... ...+.+.|.+
T Consensus 1 ~~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~--~~~~~v~~g~~~~~~~---~~~~~~~v~~ 75 (229)
T smart00020 1 RIVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSD--PSNIRVRLGSHDLSSG---EEGQVIKVSK 75 (229)
T ss_pred CccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCC--CcceEEEeCcccCCCC---CCceEEeeEE
Confidence 6899999999999999999866447899999999999999999998743 3578999998875432 1227789999
Q ss_pred EEEecCCCCCCCCCceEEEEeccccccCCceeEEecCCCCCcCCCCCeEEEEEecccCC-CCCCcceeEEEEEeeeChhH
Q psy7302 527 TYIHHNHNSQTLDNDIALLKLHGQAELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGE-AGPIPLRVREAEIPIVSDTE 605 (741)
Q Consensus 527 IiiHP~Y~~~t~~nDIALLkL~~pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~-~~~~s~~L~~~~v~vis~~~ 605 (741)
+++||+|+.....+|||||+|++++.+.+.++||||+........+..+.++|||.... .......++...+.+++.++
T Consensus 76 ~~~~p~~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~ 155 (229)
T smart00020 76 VIIHPNYNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNAT 155 (229)
T ss_pred EEECCCCCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHH
Confidence 99999999888899999999999999999999999998755566789999999998653 23456689999999999999
Q ss_pred HHHhhccccccccccCCCeEEecCCC-CCCCCCCCCCCceEEeeCCeEEEEEEEecCCCCCCCCCCceeeeccccCCCc
Q psy7302 606 CIRKINAVTEKIFILPASSFCAGGEE-GNDACQGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFNGMS 683 (741)
Q Consensus 606 C~~~~~~~~~~~~~i~~~~~CAg~~~-g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrVS~y~~wI 683 (741)
|...+... ..+.+.+||++... +.+.|.|||||||++..+ +|+|+||+|+|..|...+.|.+|+||+.|.+||
T Consensus 156 C~~~~~~~----~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 156 CRRAYSGG----GAITDNMLCAGGLEGGKDACQGDSGGPLVCNDG-RWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred hhhhhccc----cccCCCcEeecCCCCCCcccCCCCCCeeEEECC-CEEEEEEEEECCCCCCCCCCCEEEEeccccccC
Confidence 98876431 24678899999766 789999999999999887 999999999999998888999999999999998
No 4
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=100.00 E-value=2.3e-35 Score=295.75 Aligned_cols=220 Identities=39% Similarity=0.774 Sum_probs=190.8
Q ss_pred eeCCeecCCCCcceEEEEEeeCCceeEEEEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEE
Q psy7302 448 VVGGEDADPAEWCWQVALINSLNQYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATT 527 (741)
Q Consensus 448 IvGG~~a~~ge~PW~VsL~~~~~~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~I 527 (741)
|+||..+.+++|||+|.|+.....++|+|+||+++||||||||+.. ...+.+.+|...+... ....+.+.|.++
T Consensus 1 i~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~----~~~~~v~~g~~~~~~~--~~~~~~~~v~~~ 74 (220)
T PF00089_consen 1 IVGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDG----ASDIKVRLGTYSIRNS--DGSEQTIKVSKI 74 (220)
T ss_dssp SBSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTS----GGSEEEEESESBTTST--TTTSEEEEEEEE
T ss_pred CCCCEECCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccc----cccccccccccccccc--cccccccccccc
Confidence 7899999999999999998775589999999999999999999987 3568889998433322 233589999999
Q ss_pred EEecCCCCCCCCCceEEEEeccccccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHH
Q psy7302 528 YIHHNHNSQTLDNDIALLKLHGQAELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECI 607 (741)
Q Consensus 528 iiHP~Y~~~t~~nDIALLkL~~pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~ 607 (741)
++||+|+.....+|||||+|++++.+.+.++|+||+........+..+.++|||.....+ ....++...+.+++.+.|.
T Consensus 75 ~~h~~~~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~-~~~~~~~~~~~~~~~~~c~ 153 (220)
T PF00089_consen 75 IIHPKYDPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNG-YSSNLQSVTVPVVSRKTCR 153 (220)
T ss_dssp EEETTSBTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTS-BTSBEEEEEEEEEEHHHHH
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc-ccccccccccccccccccc
Confidence 999999988889999999999999999999999999865555788999999999865444 5568999999999999998
Q ss_pred HhhccccccccccCCCeEEecCCCCCCCCCCCCCCceEEeeCCeEEEEEEEecCCCCCCCCCCceeeeccccCCCc
Q psy7302 608 RKINAVTEKIFILPASSFCAGGEEGNDACQGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFNGMS 683 (741)
Q Consensus 608 ~~~~~~~~~~~~i~~~~~CAg~~~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrVS~y~~wI 683 (741)
..++. .+.+.++|++.....+.|.|||||||++.+. +|+||++++..|...+.|.+|+||+.|++||
T Consensus 154 ~~~~~------~~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~---~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 154 SSYND------NLTPNMICAGSSGSGDACQGDSGGPLICNNN---YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp HHTTT------TSTTTEEEEETTSSSBGGTTTTTSEEEETTE---EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccc------cccccccccccccccccccccccccccccee---eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 87432 2467899999765689999999999999874 7999999999999999999999999999998
No 5
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=1.4e-28 Score=259.15 Aligned_cols=231 Identities=31% Similarity=0.489 Sum_probs=171.0
Q ss_pred CCceeCCeecCCCCcceEEEEEeeC----CceeEEEEEeccceEEeecccccccc-cCCCcEEEEEceeccccccCCCCc
Q psy7302 445 GARVVGGEDADPAEWCWQVALINSL----NQYLCGGALIGTQWVLTAAHCVTNIV-RSGDAVYVRVGDHDLTRKYGSPGA 519 (741)
Q Consensus 445 ~~RIvGG~~a~~ge~PW~VsL~~~~----~~~~CgGTLIS~rwVLTAAHCv~~~~-~~~~~i~V~lG~~~l~~~~~~~~~ 519 (741)
..||+||..|+.++||++|+|.... +..+|||++|..|||||||||+.... ...+...|.++..+. ...
T Consensus 30 s~rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~------Sq~ 103 (413)
T COG5640 30 SSRIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDS------SQA 103 (413)
T ss_pred ceeEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEeccccc------ccc
Confidence 5799999999999999999996443 35789999999999999999998754 333445555554443 334
Q ss_pred EEEEEEEEEEecCCCCCCCCCceEEEEeccccccCC-ceeEEecCCC-CCcCCCCCeEEEEEecccCCCC---CC--cce
Q psy7302 520 QTLRVATTYIHHNHNSQTLDNDIALLKLHGQAELKD-GVCLVCLPAR-GVNHAAGKRCTVTGYGYMGEAG---PI--PLR 592 (741)
Q Consensus 520 q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~pi~~s~-~V~PICLP~~-~~~~~~g~~~~v~GWG~t~~~~---~~--s~~ 592 (741)
+...|..++.|..|...++.||||+++|.++..... .+.-.--+.. .............+||.+.... .. ...
T Consensus 104 ~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a~~pr~ki~~~~~sdt~l~sv~~~s~~~n~t~~~~~~~~v~~~~p~gt~ 183 (413)
T COG5640 104 ERGHVRTIYVHEFYSPGNLGNDIAVLELARAASLPRVKITSFDASDTFLNSVTTVSPMTNGTFGVTTPSDVPRSSPKGTI 183 (413)
T ss_pred cCcceEEEeeecccccccccCcceeeccccccccchhheeeccCcccceecccccccccceeeeeeeecCCCCCCCccce
Confidence 788999999999999999999999999998764321 1111100100 0112233444566776543221 11 247
Q ss_pred eEEEEEeeeChhHHHHhhc--cccccccccCCCeEEecCCCCCCCCCCCCCCceEEeeCCeEEEEEEEecCCC-CCCCCC
Q psy7302 593 VREAEIPIVSDTECIRKIN--AVTEKIFILPASSFCAGGEEGNDACQGDGGGPLVCQDDGFYELAGLVSWGFG-CGRQDV 669 (741)
Q Consensus 593 L~~~~v~vis~~~C~~~~~--~~~~~~~~i~~~~~CAg~~~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~g-C~~~~~ 669 (741)
|+++.+..++...|...+. ........+. -||++... .++|+||||||++.+.++.++++||+|||.+ |++.+.
T Consensus 184 l~e~~v~fv~~stc~~~~g~an~~dg~~~lT--~~cag~~~-~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~ 260 (413)
T COG5640 184 LHEVAVLFVPLSTCAQYKGCANASDGATGLT--GFCAGRPP-KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLI 260 (413)
T ss_pred eeeeeeeeechHHhhhhccccccCCCCCCcc--ceecCCCC-cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCc
Confidence 9999999999999998874 1111111222 39998544 9999999999999999888999999999997 999999
Q ss_pred CceeeeccccCCCcc
Q psy7302 670 PGVYVKVSSFNGMSK 684 (741)
Q Consensus 670 P~VYtrVS~y~~wI~ 684 (741)
|+|||+|+.|.+||.
T Consensus 261 ~gVyT~vsny~~WI~ 275 (413)
T COG5640 261 PGVYTNVSNYQDWIA 275 (413)
T ss_pred ceeEEehhHHHHHHH
Confidence 999999999999993
No 6
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=99.68 E-value=7e-16 Score=163.93 Aligned_cols=223 Identities=23% Similarity=0.397 Sum_probs=147.8
Q ss_pred cccCCCCCCcCCCCCceeCCeecCCCCcceEEEEEeeCC---ceeEEEEEeccceEEeecccccccccCC----------
Q psy7302 432 KYVCGVKGTARERGARVVGGEDADPAEWCWQVALINSLN---QYLCGGALIGTQWVLTAAHCVTNIVRSG---------- 498 (741)
Q Consensus 432 ~~~CG~~~~~~~~~~RIvGG~~a~~ge~PW~VsL~~~~~---~~~CgGTLIS~rwVLTAAHCv~~~~~~~---------- 498 (741)
...||..... ...++.+|..+...+.||+|.+..... .++++|+|||+|||||++||+......+
T Consensus 28 l~~CG~~~~~--~~~~~~~g~~~~~~~~pW~v~v~~~~~~~~~~~~~gtlIS~RHiLtss~~~~~~~~~W~~~~~~~~~~ 105 (282)
T PF03761_consen 28 LETCGKKKLP--YPSKVFNGTPAESGEAPWAVSVYTKNHNEGNYFSTGTLISPRHILTSSHCVMNDKSKWLNGEEFDNKK 105 (282)
T ss_pred HHhcCCCCCC--CcccccCCcccccCCCCCEEEEEeccCcccceecceEEeccCeEEEeeeEEEecccccccCcccccce
Confidence 4579965432 245679999999999999999975432 3678999999999999999997432211
Q ss_pred ---CcEEEEEceecccc-------ccCCCCcEEEEEEEEEEecCC----CCCCCCCceEEEEeccccccCCceeEEecCC
Q psy7302 499 ---DAVYVRVGDHDLTR-------KYGSPGAQTLRVATTYIHHNH----NSQTLDNDIALLKLHGQAELKDGVCLVCLPA 564 (741)
Q Consensus 499 ---~~i~V~lG~~~l~~-------~~~~~~~q~~~V~~IiiHP~Y----~~~t~~nDIALLkL~~pi~~s~~V~PICLP~ 564 (741)
+...+.+-...+.. ...........|.++++.-.- +.....++++||+|+++ +...+.|+||+.
T Consensus 106 C~~~~~~l~vP~~~l~~~~v~~~~~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~--~~~~~~~~Cl~~ 183 (282)
T PF03761_consen 106 CEGNNNHLIVPEEVLSKIDVRCCNCFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEED--FSKNVSPPCLAD 183 (282)
T ss_pred eeCCCceEEeCHHHhccEEEEeecccccCCcccceeEEEEEEecCCCcccccccccceEEEEEccc--ccccCCCEEeCC
Confidence 00011111111000 000111234566777664322 23445789999999999 678889999998
Q ss_pred CCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHHHhhccccccccccCCCeEEecCCCCCCCCCCCCCCce
Q psy7302 565 RGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECIRKINAVTEKIFILPASSFCAGGEEGNDACQGDGGGPL 644 (741)
Q Consensus 565 ~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~~~~~~~~~~~~~i~~~~~CAg~~~g~~~C~GDSGGPL 644 (741)
.......+..+.+.|+. ....+....+.+.....|. ..++ .....|.||+||||
T Consensus 184 ~~~~~~~~~~~~~yg~~-------~~~~~~~~~~~i~~~~~~~---------------~~~~----~~~~~~~~d~Gg~l 237 (282)
T PF03761_consen 184 SSTNWEKGDEVDVYGFN-------STGKLKHRKLKITNCTKCA---------------YSIC----TKQYSCKGDRGGPL 237 (282)
T ss_pred CccccccCceEEEeecC-------CCCeEEEEEEEEEEeeccc---------------eeEe----cccccCCCCccCeE
Confidence 77766677777777771 1124555566665443321 1122 23578999999999
Q ss_pred EEeeCCeEEEEEEEecCCCCCCCCCCceeeeccccCCCccc
Q psy7302 645 VCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFNGMSKV 685 (741)
Q Consensus 645 v~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrVS~y~~wI~~ 685 (741)
+...+++|+|+||.+.+..-...+ ...|.+|..|.+-||.
T Consensus 238 v~~~~gr~tlIGv~~~~~~~~~~~-~~~f~~v~~~~~~IC~ 277 (282)
T PF03761_consen 238 VKNINGRWTLIGVGASGNYECNKN-NSYFFNVSWYQDEICE 277 (282)
T ss_pred EEEECCCEEEEEEEccCCCccccc-ccEEEEHHHhhhhhcc
Confidence 999999999999998876322212 7889999999999986
No 7
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo.
Probab=99.46 E-value=3.1e-12 Score=130.48 Aligned_cols=120 Identities=22% Similarity=0.404 Sum_probs=101.0
Q ss_pred CCCCcceEEEEEeeCCceeEEEEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCC
Q psy7302 455 DPAEWCWQVALINSLNQYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHN 534 (741)
Q Consensus 455 ~~ge~PW~VsL~~~~~~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~ 534 (741)
+...|||+|.||.. +.+.|+|+||.++|||++-.|+.+.......+.|.+|..........+.+|.+.|..+..-|
T Consensus 12 e~y~WPWlA~IYvd-G~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~--- 87 (267)
T PF09342_consen 12 EDYHWPWLADIYVD-GRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVP--- 87 (267)
T ss_pred ccccCcceeeEEEc-CeEEEEEEEeccceEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeecc---
Confidence 45679999999865 78999999999999999999998876666788899998875555567778988888766544
Q ss_pred CCCCCCceEEEEeccccccCCceeEEecCCCCCcCCCCCeEEEEEecc
Q psy7302 535 SQTLDNDIALLKLHGQAELKDGVCLVCLPARGVNHAAGKRCTVTGYGY 582 (741)
Q Consensus 535 ~~t~~nDIALLkL~~pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~ 582 (741)
..+++||.|++|+.|+.+|+|..||...........|..+|-..
T Consensus 88 ----~S~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 88 ----ESNVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred ----ccceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 47899999999999999999999997656666677999998654
No 8
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.51 E-value=6.3e-07 Score=93.74 Aligned_cols=177 Identities=19% Similarity=0.191 Sum_probs=93.2
Q ss_pred ecCCCCcceEEEEEee--CCceeEEEEEeccceEEeecccccccccCCCcEEEEE-ceeccccccCCCCcEEEEEEEEEE
Q psy7302 453 DADPAEWCWQVALINS--LNQYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRV-GDHDLTRKYGSPGAQTLRVATTYI 529 (741)
Q Consensus 453 ~a~~ge~PW~VsL~~~--~~~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~l-G~~~l~~~~~~~~~q~~~V~~Iii 529 (741)
......|||-+..... .+.+-|+|+||+++.||||+||+.........+.+.. |...- ..+... +....+.+
T Consensus 43 V~dt~~~Py~av~~~~~~tG~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~----~~~~~~-~~~~~~~~ 117 (251)
T COG3591 43 VTDTTQFPYSAVVQFEAATGRLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSD----GGPFYG-ITKIEIRV 117 (251)
T ss_pred cccCCCCCcceeEEeecCCCcceeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCC----CCCCCc-eeeEEEEe
Confidence 3556789997766433 2345677799999999999999987544323333332 22111 111111 22222222
Q ss_pred ecC--CCCCCCCCceEEEEeccccccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHH
Q psy7302 530 HHN--HNSQTLDNDIALLKLHGQAELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECI 607 (741)
Q Consensus 530 HP~--Y~~~t~~nDIALLkL~~pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~ 607 (741)
.|. |.......|+..+.|+...++.+.+...-++..... ..+....++||-..... ...+.+..-.+..
T Consensus 118 ~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~-~~~d~i~v~GYP~dk~~---~~~~~e~t~~v~~----- 188 (251)
T COG3591 118 YPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEA-KANDRITVIGYPGDKPN---IGTMWESTGKVNS----- 188 (251)
T ss_pred cCCceeccCCceeeccHHHhccCCCcccccccccccccccc-ccCceeEEEeccCCCCc---ceeEeeecceeEE-----
Confidence 332 233444566777777644555555444344433322 34444888888533221 0011111111100
Q ss_pred HhhccccccccccCCCeEEecCCCCCCCCCCCCCCceEEeeCCeEEEEEEEecCC
Q psy7302 608 RKINAVTEKIFILPASSFCAGGEEGNDACQGDGGGPLVCQDDGFYELAGLVSWGF 662 (741)
Q Consensus 608 ~~~~~~~~~~~~i~~~~~CAg~~~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~ 662 (741)
+.... .....+++.|+||+|++...+ +++||..-|.
T Consensus 189 ------------~~~~~----l~y~~dT~pG~SGSpv~~~~~---~vigv~~~g~ 224 (251)
T COG3591 189 ------------IKGNK----LFYDADTLPGSSGSPVLISKD---EVIGVHYNGP 224 (251)
T ss_pred ------------Eecce----EEEEecccCCCCCCceEecCc---eEEEEEecCC
Confidence 00000 022468899999999998765 8999977664
No 9
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.38 E-value=3e-07 Score=92.72 Aligned_cols=52 Identities=54% Similarity=1.142 Sum_probs=48.2
Q ss_pred cccCCCCCceEeccCCeEEEEeeeecCCCCCCCCCCeEEEeccchHhHHHHh
Q psy7302 683 SKVGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFIGWINQI 734 (741)
Q Consensus 683 I~~~DsGgp~~~~~~~~~~L~GI~S~G~~Cg~~~~P~VYTrVs~y~dWI~~v 734 (741)
.+.+|||+|++...+++|+|+||+|+|..|+..+.|++|++|++|++||+++
T Consensus 181 ~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~~ 232 (232)
T cd00190 181 ACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQKT 232 (232)
T ss_pred cccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhcC
Confidence 4779999999999999999999999999998767899999999999999874
No 10
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.30 E-value=8.7e-07 Score=95.32 Aligned_cols=61 Identities=34% Similarity=0.662 Sum_probs=55.2
Q ss_pred cccCCCcccCCCCCceEeccCCeEEEEeeeecCCC-CCCCCCCeEEEeccchHhHHHHhhcc
Q psy7302 677 SSFNGMSKVGDGGGPLVCQDDGFYELAGLVSWGFG-CGRQDVPGVYVKVSSFIGWINQIISV 737 (741)
Q Consensus 677 S~y~~wI~~~DsGgp~~~~~~~~~~L~GI~S~G~~-Cg~~~~P~VYTrVs~y~dWI~~vI~~ 737 (741)
.++..=.+.+|||+|++....+-..+.||+|||.+ |+....|+|||+|+.|.+||..+|+.
T Consensus 219 g~~~~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~ 280 (413)
T COG5640 219 GRPPKDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG 280 (413)
T ss_pred CCCCcccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence 34555568999999999999999999999999987 99999999999999999999998864
No 11
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=97.96 E-value=6.4e-06 Score=83.43 Aligned_cols=48 Identities=56% Similarity=1.203 Sum_probs=45.2
Q ss_pred cccCCCCCceEeccCCeEEEEeeeecCCCCCCCCCCeEEEeccchHhHH
Q psy7302 683 SKVGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFIGWI 731 (741)
Q Consensus 683 I~~~DsGgp~~~~~~~~~~L~GI~S~G~~Cg~~~~P~VYTrVs~y~dWI 731 (741)
++.+|+|+|++...+ +|+|+||+++|..|+..+.|.+|+||++|++||
T Consensus 182 ~c~gdsG~pl~~~~~-~~~l~Gi~s~g~~C~~~~~~~~~~~i~~~~~WI 229 (229)
T smart00020 182 ACQGDSGGPLVCNDG-RWVLVGIVSWGSGCARPGKPGVYTRVSSYLDWI 229 (229)
T ss_pred ccCCCCCCeeEEECC-CEEEEEEEEECCCCCCCCCCCEEEEeccccccC
Confidence 577999999999888 999999999999999778899999999999998
No 12
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.96 E-value=2.7e-05 Score=70.88 Aligned_cols=22 Identities=45% Similarity=0.649 Sum_probs=19.7
Q ss_pred EEEEEeccc-eEEeecccccccc
Q psy7302 474 CGGALIGTQ-WVLTAAHCVTNIV 495 (741)
Q Consensus 474 CgGTLIS~r-wVLTAAHCv~~~~ 495 (741)
|+|.||+++ +|||+|||+....
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~ 23 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWN 23 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCT
T ss_pred CEEEEEcCCceEEEchhheeccc
Confidence 689999999 9999999998653
No 13
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.88 E-value=0.00022 Score=80.88 Aligned_cols=85 Identities=21% Similarity=0.347 Sum_probs=58.4
Q ss_pred ceeEEEEEeccc-eEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEecc
Q psy7302 471 QYLCGGALIGTQ-WVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHG 549 (741)
Q Consensus 471 ~~~CgGTLIS~r-wVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~ 549 (741)
...++|.+|+++ ||||++|.+.+ ...+.|.+... ..+..+-+..++ ..||||||++.
T Consensus 57 ~~~GSGfii~~~G~IlTn~Hvv~~----~~~i~V~~~~~-----------~~~~a~vv~~d~-------~~DlAllkv~~ 114 (428)
T TIGR02037 57 RGLGSGVIISADGYILTNNHVVDG----ADEITVTLSDG-----------REFKAKLVGKDP-------RTDIAVLKIDA 114 (428)
T ss_pred cceeeEEEECCCCEEEEcHHHcCC----CCeEEEEeCCC-----------CEEEEEEEEecC-------CCCEEEEEecC
Confidence 457999999976 99999999976 34566665421 334544444443 47999999975
Q ss_pred ccccCCceeEEecCCCCCcCCCCCeEEEEEecc
Q psy7302 550 QAELKDGVCLVCLPARGVNHAAGKRCTVTGYGY 582 (741)
Q Consensus 550 pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~ 582 (741)
+ ..+.++.|... .....|+.++++||..
T Consensus 115 ~----~~~~~~~l~~~-~~~~~G~~v~aiG~p~ 142 (428)
T TIGR02037 115 K----KNLPVIKLGDS-DKLRVGDWVLAIGNPF 142 (428)
T ss_pred C----CCceEEEccCC-CCCCCCCEEEEEECCC
Confidence 4 23556667543 2357899999999864
No 14
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=97.85 E-value=1.5e-05 Score=79.79 Aligned_cols=48 Identities=50% Similarity=1.114 Sum_probs=44.6
Q ss_pred CCcccCCCCCceEeccCCeEEEEeeeecCCCCCCCCCCeEEEeccchHhHH
Q psy7302 681 GMSKVGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSFIGWI 731 (741)
Q Consensus 681 ~wI~~~DsGgp~~~~~~~~~~L~GI~S~G~~Cg~~~~P~VYTrVs~y~dWI 731 (741)
.-.+.+|||+|+++.+. +|+||++++.+|+..+.|.+|+||+.|++||
T Consensus 173 ~~~~~g~sG~pl~~~~~---~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 173 GDACQGDSGGPLICNNN---YLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp SBGGTTTTTSEEEETTE---EEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccee---eecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 45688999999999887 9999999999999888899999999999999
No 15
>PRK10898 serine endoprotease; Provisional
Probab=97.79 E-value=0.00084 Score=74.45 Aligned_cols=83 Identities=18% Similarity=0.335 Sum_probs=54.2
Q ss_pred eeEEEEEeccc-eEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEeccc
Q psy7302 472 YLCGGALIGTQ-WVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHGQ 550 (741)
Q Consensus 472 ~~CgGTLIS~r-wVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~p 550 (741)
..-+|.+|+++ ||||++|=+.+ .+.+.|.+..- ..+...-+..+| .+||||||++..
T Consensus 78 ~~GSGfvi~~~G~IlTn~HVv~~----a~~i~V~~~dg-----------~~~~a~vv~~d~-------~~DlAvl~v~~~ 135 (353)
T PRK10898 78 TLGSGVIMDQRGYILTNKHVIND----ADQIIVALQDG-----------RVFEALLVGSDS-------LTDLAVLKINAT 135 (353)
T ss_pred ceeeEEEEeCCeEEEecccEeCC----CCEEEEEeCCC-----------CEEEEEEEEEcC-------CCCEEEEEEcCC
Confidence 56899999875 99999999865 34566665421 234444344443 489999999753
Q ss_pred cccCCceeEEecCCCCCcCCCCCeEEEEEecc
Q psy7302 551 AELKDGVCLVCLPARGVNHAAGKRCTVTGYGY 582 (741)
Q Consensus 551 i~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~ 582 (741)
. +.++.|... .....|+..+++|+..
T Consensus 136 -~----l~~~~l~~~-~~~~~G~~V~aiG~P~ 161 (353)
T PRK10898 136 -N----LPVIPINPK-RVPHIGDVVLAIGNPY 161 (353)
T ss_pred -C----CCeeeccCc-CcCCCCCEEEEEeCCC
Confidence 1 233444322 2346789999999863
No 16
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.77 E-value=0.00071 Score=74.95 Aligned_cols=140 Identities=18% Similarity=0.291 Sum_probs=79.4
Q ss_pred eeEEEEEeccc-eEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEeccc
Q psy7302 472 YLCGGALIGTQ-WVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHGQ 550 (741)
Q Consensus 472 ~~CgGTLIS~r-wVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~p 550 (741)
...+|.+|+++ ||||++|-+.+ .+.+.|.+... ..+..+-+..++ ..||||||++..
T Consensus 78 ~~GSG~vi~~~G~IlTn~HVV~~----~~~i~V~~~dg-----------~~~~a~vv~~d~-------~~DlAvlkv~~~ 135 (351)
T TIGR02038 78 GLGSGVIMSKEGYILTNYHVIKK----ADQIVVALQDG-----------RKFEAELVGSDP-------LTDLAVLKIEGD 135 (351)
T ss_pred ceEEEEEEeCCeEEEecccEeCC----CCEEEEEECCC-----------CEEEEEEEEecC-------CCCEEEEEecCC
Confidence 46999999876 99999999965 34566665321 334444444444 589999999754
Q ss_pred cccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHHHhhccccccccccCCCeEEecCC
Q psy7302 551 AELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECIRKINAVTEKIFILPASSFCAGGE 630 (741)
Q Consensus 551 i~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~~~~~~~~~~~~~i~~~~~CAg~~ 630 (741)
- +.++.|-. ......|+.+.++|+..... ..+....+.-+... .+..... ...+ .
T Consensus 136 ~-----~~~~~l~~-s~~~~~G~~V~aiG~P~~~~-----~s~t~GiIs~~~r~----~~~~~~~------~~~i----q 190 (351)
T TIGR02038 136 N-----LPTIPVNL-DRPPHVGDVVLAIGNPYNLG-----QTITQGIISATGRN----GLSSVGR------QNFI----Q 190 (351)
T ss_pred C-----CceEeccC-cCccCCCCEEEEEeCCCCCC-----CcEEEEEEEeccCc----ccCCCCc------ceEE----E
Confidence 2 23444432 23457899999999864211 11222222211110 0000000 0011 1
Q ss_pred CCCCCCCCCCCCceEEeeCCeEEEEEEEecC
Q psy7302 631 EGNDACQGDGGGPLVCQDDGFYELAGLVSWG 661 (741)
Q Consensus 631 ~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G 661 (741)
-+...-.|.|||||+..++ .|+||.+..
T Consensus 191 tda~i~~GnSGGpl~n~~G---~vIGI~~~~ 218 (351)
T TIGR02038 191 TDAAINAGNSGGALINTNG---ELVGINTAS 218 (351)
T ss_pred ECCccCCCCCcceEECCCC---eEEEEEeee
Confidence 1234457889999996553 799998764
No 17
>PRK10139 serine endoprotease; Provisional
Probab=97.37 E-value=0.0013 Score=75.38 Aligned_cols=142 Identities=18% Similarity=0.281 Sum_probs=81.7
Q ss_pred eeEEEEEecc--ceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEecc
Q psy7302 472 YLCGGALIGT--QWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHG 549 (741)
Q Consensus 472 ~~CgGTLIS~--rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~ 549 (741)
...+|.+|++ .||||.+|.+.+ .+.+.|.+... ..+..+-+...+ ..||||||++.
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~----a~~i~V~~~dg-----------~~~~a~vvg~D~-------~~DlAvlkv~~ 147 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQ----AQKISIQLNDG-----------REFDAKLIGSDD-------QSDIALLQIQN 147 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCC----CCEEEEEECCC-----------CEEEEEEEEEcC-------CCCEEEEEecC
Confidence 4689999974 699999999976 35677776421 334544444433 58999999975
Q ss_pred ccccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHHHhhccccccccccCCCeEEecC
Q psy7302 550 QAELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECIRKINAVTEKIFILPASSFCAGG 629 (741)
Q Consensus 550 pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~~~~~~~~~~~~~i~~~~~CAg~ 629 (741)
+- .+.++.|... .....|+.+.++|+-.... . . +..-+++...= .... ... ....+
T Consensus 148 ~~----~l~~~~lg~s-~~~~~G~~V~aiG~P~g~~-~----t---vt~GivS~~~r-~~~~-~~~-----~~~~i---- 203 (455)
T PRK10139 148 PS----KLTQIAIADS-DKLRVGDFAVAVGNPFGLG-Q----T---ATSGIISALGR-SGLN-LEG-----LENFI---- 203 (455)
T ss_pred CC----CCceeEecCc-cccCCCCEEEEEecCCCCC-C----c---eEEEEEccccc-cccC-CCC-----cceEE----
Confidence 42 2345666432 3456789999999853111 1 1 12223321100 0000 000 00111
Q ss_pred CCCCCCCCCCCCCceEEeeCCeEEEEEEEecCC
Q psy7302 630 EEGNDACQGDGGGPLVCQDDGFYELAGLVSWGF 662 (741)
Q Consensus 630 ~~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~ 662 (741)
..+...-.|.|||||+..++ .|+||.++..
T Consensus 204 qtda~in~GnSGGpl~n~~G---~vIGi~~~~~ 233 (455)
T PRK10139 204 QTDASINRGNSGGALLNLNG---ELIGINTAIL 233 (455)
T ss_pred EECCccCCCCCcceEECCCC---eEEEEEEEEE
Confidence 11234457999999996543 8999998743
No 18
>PRK10942 serine endoprotease; Provisional
Probab=97.26 E-value=0.0045 Score=71.34 Aligned_cols=83 Identities=20% Similarity=0.308 Sum_probs=55.1
Q ss_pred eeEEEEEecc--ceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEecc
Q psy7302 472 YLCGGALIGT--QWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHG 549 (741)
Q Consensus 472 ~~CgGTLIS~--rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~ 549 (741)
...+|.+|+. .+|||.+|.+.+ .+.+.|.+... ..+..+-+..++ ..||||||++.
T Consensus 111 ~~GSG~ii~~~~G~IlTn~HVv~~----a~~i~V~~~dg-----------~~~~a~vv~~D~-------~~DlAvlki~~ 168 (473)
T PRK10942 111 ALGSGVIIDADKGYVVTNNHVVDN----ATKIKVQLSDG-----------RKFDAKVVGKDP-------RSDIALIQLQN 168 (473)
T ss_pred ceEEEEEEECCCCEEEeChhhcCC----CCEEEEEECCC-----------CEEEEEEEEecC-------CCCEEEEEecC
Confidence 4689999985 599999999876 35677776421 234444444444 48999999964
Q ss_pred ccccCCceeEEecCCCCCcCCCCCeEEEEEec
Q psy7302 550 QAELKDGVCLVCLPARGVNHAAGKRCTVTGYG 581 (741)
Q Consensus 550 pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG 581 (741)
+- .+.++-|-.. .....|+.++++|+-
T Consensus 169 ~~----~l~~~~lg~s-~~l~~G~~V~aiG~P 195 (473)
T PRK10942 169 PK----NLTAIKMADS-DALRVGDYTVAIGNP 195 (473)
T ss_pred CC----CCceeEecCc-cccCCCCEEEEEcCC
Confidence 32 2335555432 235678888888874
No 19
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=94.39 E-value=0.14 Score=62.15 Aligned_cols=65 Identities=17% Similarity=0.247 Sum_probs=39.1
Q ss_pred EEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEeccccccCC
Q psy7302 476 GALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHGQAELKD 555 (741)
Q Consensus 476 GTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~pi~~s~ 555 (741)
+|||++++|+|++|=... .-.|.+|.... ..+.+.+---|+. .|+.+.||.+=|.
T Consensus 69 aTLigpqYiVSV~HN~~g------y~~v~FG~~g~---------~~Y~iV~RNn~~~-------~Df~~pRLnK~VT--- 123 (769)
T PF02395_consen 69 ATLIGPQYIVSVKHNGKG------YNSVSFGNEGQ---------NTYKIVDRNNYPS-------GDFHMPRLNKFVT--- 123 (769)
T ss_dssp -EEEETTEEEBETTG-TS------CCEECESCSST---------CEEEEEEEEBETT-------STEBEEEESS------
T ss_pred EEEecCCeEEEEEccCCC------cCceeecccCC---------ceEEEEEccCCCC-------cccceeecCceEE---
Confidence 899999999999997622 23466775332 4566666666654 6999999997764
Q ss_pred ceeEEecCCC
Q psy7302 556 GVCLVCLPAR 565 (741)
Q Consensus 556 ~V~PICLP~~ 565 (741)
.+.|+-+...
T Consensus 124 EvaP~~~t~~ 133 (769)
T PF02395_consen 124 EVAPAEMTTA 133 (769)
T ss_dssp SS----BBSS
T ss_pred EEeccccccc
Confidence 3567666543
No 20
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=87.19 E-value=4.1 Score=40.78 Aligned_cols=150 Identities=15% Similarity=0.131 Sum_probs=74.2
Q ss_pred eCCceeEEEEEeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEe
Q psy7302 468 SLNQYLCGGALIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKL 547 (741)
Q Consensus 468 ~~~~~~CgGTLIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL 547 (741)
..+.+.+.|..|..+|+|--.|- . .. ..+.++. ..+.+.+.+.. .+......||+|++|
T Consensus 21 ~~g~~t~l~~gi~~~~~lvp~H~--~---~~--~~i~i~g------------~~~~~~d~~~l--v~~~~~~~Dl~~v~l 79 (172)
T PF00548_consen 21 GKGEFTMLALGIYDRYFLVPTHE--E---PE--DTIYIDG------------VEYKVDDSVVL--VDRDGVDTDLTLVKL 79 (172)
T ss_dssp TTEEEEEEEEEEEBTEEEEEGGG--G---GC--SEEEETT------------EEEEEEEEEEE--EETTSSEEEEEEEEE
T ss_pred CCceEEEecceEeeeEEEEECcC--C---Cc--EEEEECC------------EEEEeeeeEEE--ecCCCcceeEEEEEc
Confidence 33467788889999999999992 1 12 2333332 22333332222 122233579999999
Q ss_pred ccccccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEE-EEeeeChhHHHHhhccccccccccCCCeEE
Q psy7302 548 HGQAELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREA-EIPIVSDTECIRKINAVTEKIFILPASSFC 626 (741)
Q Consensus 548 ~~pi~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~-~v~vis~~~C~~~~~~~~~~~~~i~~~~~C 626 (741)
++.-.|.+..+-+. ........+.++-|... .. ..+..+ .+..... .+ +....+.
T Consensus 80 ~~~~kfrDIrk~~~-----~~~~~~~~~~l~v~~~~--~~---~~~~~v~~v~~~~~------i~--------~~g~~~~ 135 (172)
T PF00548_consen 80 PRNPKFRDIRKFFP-----ESIPEYPECVLLVNSTK--FP---RMIVEVGFVTNFGF------IN--------LSGTTTP 135 (172)
T ss_dssp ESSS-B--GGGGSB-----SSGGTEEEEEEEEESSS--ST---CEEEEEEEEEEEEE------EE--------ETTEEEE
T ss_pred cCCcccCchhhhhc-----cccccCCCcEEEEECCC--Cc---cEEEEEEEEeecCc------cc--------cCCCEee
Confidence 88877866554444 11122334444444311 11 011110 1110000 00 0001111
Q ss_pred ecCCCCCCCCCCCCCCceEEeeCCeEEEEEEEecCC
Q psy7302 627 AGGEEGNDACQGDGGGPLVCQDDGFYELAGLVSWGF 662 (741)
Q Consensus 627 Ag~~~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~ 662 (741)
.-......+-.|+-||+|+.+.++...++||...|.
T Consensus 136 ~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG~ 171 (172)
T PF00548_consen 136 RSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAGN 171 (172)
T ss_dssp EEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEEE
T ss_pred EEEEEccCCCCCccCCeEEEeeccCccEEEEEeccC
Confidence 111112234568889999998877889999988764
No 21
>KOG3627|consensus
Probab=78.24 E-value=0.81 Score=47.56 Aligned_cols=18 Identities=44% Similarity=0.698 Sum_probs=16.5
Q ss_pred CCCceeeeccccCCCccc
Q psy7302 668 DVPGVYVKVSSFNGMSKV 685 (741)
Q Consensus 668 ~~P~VYtrVS~y~~wI~~ 685 (741)
+.|+||++|+.|.+||+.
T Consensus 235 ~~P~vyt~V~~y~~WI~~ 252 (256)
T KOG3627|consen 235 NYPGVYTRVSSYLDWIKE 252 (256)
T ss_pred CCCeEEeEhHHhHHHHHH
Confidence 679999999999999975
No 22
>smart00130 KR Kringle domain. Named after a Danish pastry. Found in several serine proteases and in ROR-like receptors. Can occur in up to 38 copies (in apolipoprotein(a)). Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides.
Probab=76.72 E-value=2.5 Score=37.27 Aligned_cols=35 Identities=23% Similarity=0.301 Sum_probs=23.3
Q ss_pred CCcccccccccCCCC---Cc-cccccCccCCcccccccCC
Q psy7302 401 RPPFQCNICLFAASY---PT-YSTTSTTVRPAVYNKYVCG 436 (741)
Q Consensus 401 ~~~l~~n~c~~~~~~---p~-~t~~~~~~~~~~~~~~~CG 436 (741)
...+..||||||++. |. |+ ......+.+|+-..|+
T Consensus 44 ~~~~~hNyCRNPd~~~~~PWCyv-~~~~~~~e~C~ip~C~ 82 (83)
T smart00130 44 EAGLEENYCRNPDGDSEGPWCYT-TDPNVRWEYCDIPQCE 82 (83)
T ss_pred cccccccccCCCCCCCCCCEEEe-CCCCcceEeCCCCcCC
Confidence 447899999999994 22 55 3344556666666664
No 23
>PF00051 Kringle: Kringle domain; InterPro: IPR000001 Kringles are autonomous structural domains, found throughout the blood clotting and fibrinolytic proteins. Kringle domains are believed to play a role in binding mediators (e.g., membranes, other proteins or phospholipids), and in the regulation of proteolytic activity [, , ]. Kringle domains [, , ] are characterised by a triple loop, 3-disulphide bridge structure, whose conformation is defined by a number of hydrogen bonds and small pieces of anti-parallel beta-sheet. They are found in a varying number of copies in some plasma proteins including prothrombin and urokinase-type plasminogen activator, which are serine proteases belonging to MEROPS peptidase family S1A. Steroid or nuclear hormone receptors (4A nuclear receptor, NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. 
While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.; PDB: 1JFN_A 3HN4_A 2FEB_A 4A5T_S 5HPG_B 4DUR_B 4DUU_A 2KNF_A 2QJ4_B 2QJ2_A ....
Probab=76.35 E-value=1.3 Score=38.42 Aligned_cols=33 Identities=15% Similarity=0.200 Sum_probs=21.8
Q ss_pred CCCcccccccccCCCCCc---cccccCccCCcccccc
Q psy7302 400 KRPPFQCNICLFAASYPT---YSTTSTTVRPAVYNKY 433 (741)
Q Consensus 400 ~~~~l~~n~c~~~~~~p~---~t~~~~~~~~~~~~~~ 433 (741)
+..+|.+||||||++++. |+ ......+++|+-.
T Consensus 42 ~~~~l~~NyCRNPd~~~~PWCy~-~~~~~~~~~C~Ip 77 (79)
T PF00051_consen 42 PEAGLGHNYCRNPDGDPRPWCYT-KDPGIRWEYCDIP 77 (79)
T ss_dssp TTSTHTTTSSBETTSCSSSEEEB-SSTTESEEEBSSE
T ss_pred cccCcCcceeeCCCCCCCcceEe-cCCCceEEecCCC
Confidence 456889999999999865 44 3334455555433
No 24
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=76.05 E-value=18 Score=38.14 Aligned_cols=145 Identities=17% Similarity=0.224 Sum_probs=63.0
Q ss_pred EeccceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEE---EEEEecCCCCCCCCCceEEEEeccccccC
Q psy7302 478 LIGTQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVA---TTYIHHNHNSQTLDNDIALLKLHGQAELK 554 (741)
Q Consensus 478 LIS~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~---~IiiHP~Y~~~t~~nDIALLkL~~pi~~s 554 (741)
|.--.||||-+|-+.+.+ ..+.|..-. -.+.+. ++-+||- ...||.||||.+.+.
T Consensus 37 igyG~~iItn~HLf~~nn---g~L~i~s~h------------G~f~v~nt~~lkv~~i-----~~~DiviirmPkDfp-- 94 (235)
T PF00863_consen 37 IGYGSYIITNAHLFKRNN---GELTIKSQH------------GEFTVPNTTQLKVHPI-----EGRDIVIIRMPKDFP-- 94 (235)
T ss_dssp EEETTEEEEEGGGGSSTT---CEEEEEETT------------EEEEECEGGGSEEEE------TCSSEEEEE--TTS---
T ss_pred EeECCEEEEChhhhccCC---CeEEEEeCc------------eEEEcCCccccceEEe-----CCccEEEEeCCcccC--
Confidence 345569999999986632 224443221 112222 2334442 268999999987663
Q ss_pred CceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHHHhhccccccccccCCCeEEecCCCCCC
Q psy7302 555 DGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECIRKINAVTEKIFILPASSFCAGGEEGND 634 (741)
Q Consensus 555 ~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~~~~~~~~~~~~~i~~~~~CAg~~~g~~ 634 (741)
+.-+-+++ ..+..+..+.++|--..... ....+. +...+... . ...|=.. --.
T Consensus 95 Pf~~kl~F----R~P~~~e~v~mVg~~fq~k~--~~s~vS--esS~i~p~----~------------~~~fWkH---wIs 147 (235)
T PF00863_consen 95 PFPQKLKF----RAPKEGERVCMVGSNFQEKS--ISSTVS--ESSWIYPE----E------------NSHFWKH---WIS 147 (235)
T ss_dssp ---S---B--------TT-EEEEEEEECSSCC--CEEEEE--EEEEEEEE----T------------TTTEEEE----C-
T ss_pred Ccchhhhc----cCCCCCCEEEEEEEEEEcCC--eeEEEC--CceEEeec----C------------CCCeeEE---Eec
Confidence 22222222 12245667777776543221 111111 11111110 0 0011100 112
Q ss_pred CCCCCCCCceEEeeCCeEEEEEEEecCCCCCCCCCCceeeec
Q psy7302 635 ACQGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKV 676 (741)
Q Consensus 635 ~C~GDSGGPLv~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrV 676 (741)
+=.||=|.|||.-.|+ .+|||.|.+..-. .-.+|+.+
T Consensus 148 Tk~G~CG~PlVs~~Dg--~IVGiHsl~~~~~---~~N~F~~f 184 (235)
T PF00863_consen 148 TKDGDCGLPLVSTKDG--KIVGIHSLTSNTS---SRNYFTPF 184 (235)
T ss_dssp --TT-TT-EEEETTT----EEEEEEEEETTT---SSEEEEE-
T ss_pred CCCCccCCcEEEcCCC--cEEEEEcCccCCC---CeEEEEcC
Confidence 3357779999998877 8999999775332 23467665
No 25
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=74.19 E-value=3.6 Score=43.78 Aligned_cols=48 Identities=33% Similarity=0.511 Sum_probs=37.4
Q ss_pred CcccCCCCCceEeccCCeEEEEeeeecCC-CCCCCCCCeEEEeccchHhHH
Q psy7302 682 MSKVGDGGGPLVCQDDGFYELAGLVSWGF-GCGRQDVPGVYVKVSSFIGWI 731 (741)
Q Consensus 682 wI~~~DsGgp~~~~~~~~~~L~GI~S~G~-~Cg~~~~P~VYTrVs~y~dWI 731 (741)
..+.+|.|||++...+++|.|+||.+.+. .|... ...|.+|+.|.+=|
T Consensus 227 ~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~--~~~f~~v~~~~~~I 275 (282)
T PF03761_consen 227 YSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKN--NSYFFNVSWYQDEI 275 (282)
T ss_pred ccCCCCccCeEEEEECCCEEEEEEEccCCCccccc--ccEEEEHHHhhhhh
Confidence 34788999999999999999999998875 34322 56888888776533
No 26
>cd00108 KR Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides.
Probab=73.02 E-value=3 Score=36.72 Aligned_cols=66 Identities=18% Similarity=0.291 Sum_probs=35.9
Q ss_pred ceeecC----CCcccceeeeeeeeceEEEecCccccccccCCCcccccccccCCCCC---c-cccccCccCCcccccccC
Q psy7302 364 ARVVGG----EDADMLCQLYVTVWGAIVLHNGLRTQEFRQKRPPFQCNICLFAASYP---T-YSTTSTTVRPAVYNKYVC 435 (741)
Q Consensus 364 ~~~~~~----~~~~~~~~~~~~~~g~i~~~~~~~~~~~~~~~~~l~~n~c~~~~~~p---~-~t~~~~~~~~~~~~~~~C 435 (741)
+..|.| +..++.|+.|.- +...........+ +...+..||||||+++. . |+.. ....+.+|+-..|
T Consensus 9 G~~YrG~~s~T~sG~~C~~W~s---~~~~~~~~~~~~~--~~~~~~hNyCRNPd~~~~~PWCyv~~-~~~~~eyC~ip~C 82 (83)
T cd00108 9 GESYRGTVSTTKSGKPCQRWNS---QLPHQHKFNPERF--PEGLLEENYCRNPDGDPEGPWCYTTD-PNVRWEYCDIPRC 82 (83)
T ss_pred CCcccCceeECCCCCcccCCcc---cCccccccccccc--ccccccccccCCCCCCCCCCEEEeCC-CCccEeecCCCcC
Confidence 445666 336778888822 1111111111111 24678999999999982 2 4433 2455666655554
No 27
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=65.39 E-value=75 Score=35.05 Aligned_cols=146 Identities=18% Similarity=0.272 Sum_probs=74.4
Q ss_pred eeEEEEEec-cceEEeecccccccccCCCcEEEEEceeccccccCCCCcEEEEEEEEEEecCCCCCCCCCceEEEEeccc
Q psy7302 472 YLCGGALIG-TQWVLTAAHCVTNIVRSGDAVYVRVGDHDLTRKYGSPGAQTLRVATTYIHHNHNSQTLDNDIALLKLHGQ 550 (741)
Q Consensus 472 ~~CgGTLIS-~rwVLTAAHCv~~~~~~~~~i~V~lG~~~l~~~~~~~~~q~~~V~~IiiHP~Y~~~t~~nDIALLkL~~p 550 (741)
...+|.+|+ ..+|||..|=+.. .+.+.|.+. + ...+...-+-..+ ..|||+||.+..
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~----a~~i~v~l~--d---------g~~~~a~~vg~d~-------~~dlavlki~~~ 129 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAG----AEEITVTLA--D---------GREVPAKLVGKDP-------ISDLAVLKIDGA 129 (347)
T ss_pred ccccEEEEcCCeEEEecceecCC----cceEEEEeC--C---------CCEEEEEEEecCC-------ccCEEEEEeccC
Confidence 567888888 7899999998765 345556551 1 1334444333322 589999999865
Q ss_pred cccCCceeEEecCCCCCcCCCCCeEEEEEecccCCCCCCcceeEEEEEeeeChhHHHHhhccccccccccCCCeEEecCC
Q psy7302 551 AELKDGVCLVCLPARGVNHAAGKRCTVTGYGYMGEAGPIPLRVREAEIPIVSDTECIRKINAVTEKIFILPASSFCAGGE 630 (741)
Q Consensus 551 i~~s~~V~PICLP~~~~~~~~g~~~~v~GWG~t~~~~~~s~~L~~~~v~vis~~~C~~~~~~~~~~~~~i~~~~~CAg~~ 630 (741)
-. +..+-+... ..+..++...+.|-... .. ...-.--+..+... +...... . .+.| .
T Consensus 130 ~~----~~~~~~~~s-~~l~vg~~v~aiGnp~g-~~----~tvt~Givs~~~r~-~v~~~~~--~------~~~I----q 186 (347)
T COG0265 130 GG----LPVIALGDS-DKLRVGDVVVAIGNPFG-LG----QTVTSGIVSALGRT-GVGSAGG--Y------VNFI----Q 186 (347)
T ss_pred CC----CceeeccCC-CCcccCCEEEEecCCCC-cc----cceeccEEeccccc-cccCccc--c------cchh----h
Confidence 32 222223222 22235555566554321 01 11111122222221 1000000 0 0011 1
Q ss_pred CCCCCCCCCCCCceEEeeCCeEEEEEEEecCCCCC
Q psy7302 631 EGNDACQGDGGGPLVCQDDGFYELAGLVSWGFGCG 665 (741)
Q Consensus 631 ~g~~~C~GDSGGPLv~~~~g~~~LvGVvS~G~gC~ 665 (741)
.......|.|||||+.... .++||.+......
T Consensus 187 tdAain~gnsGgpl~n~~g---~~iGint~~~~~~ 218 (347)
T COG0265 187 TDAAINPGNSGGPLVNIDG---EVVGINTAIIAPS 218 (347)
T ss_pred cccccCCCCCCCceEcCCC---cEEEEEEEEecCC
Confidence 1245678999999997543 7899888766543
No 28
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=50.75 E-value=12 Score=46.01 Aligned_cols=36 Identities=28% Similarity=0.522 Sum_probs=24.8
Q ss_pred ccCCCcccCCCCCceEe--ccCCeEEEEeeeecCCCCC
Q psy7302 678 SFNGMSKVGDGGGPLVC--QDDGFYELAGLVSWGFGCG 713 (741)
Q Consensus 678 ~y~~wI~~~DsGgp~~~--~~~~~~~L~GI~S~G~~Cg 713 (741)
.+......||||+|++. ..+++|+|+|+++.+.+.+
T Consensus 208 pL~n~~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~ 245 (769)
T PF02395_consen 208 PLPNYGSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYN 245 (769)
T ss_dssp SSBEB--TT-TT-EEEEEETTTTEEEEEEEEEEECCCC
T ss_pred ccccccccCcCCCceEEEEccCCeEEEEEEEccccccC
Confidence 34455678999999997 5567999999999876543
No 29
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=40.56 E-value=99 Score=35.37 Aligned_cols=38 Identities=18% Similarity=0.294 Sum_probs=26.8
Q ss_pred CeEEecCCC---CCCCCCCCCCCceEEeeCCeEEEEEEEec
Q psy7302 623 SSFCAGGEE---GNDACQGDGGGPLVCQDDGFYELAGLVSW 660 (741)
Q Consensus 623 ~~~CAg~~~---g~~~C~GDSGGPLv~~~~g~~~LvGVvS~ 660 (741)
.||-.|... +-++-.||-|.|.|+..+|.|+++||...
T Consensus 485 GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~A 525 (535)
T PF05416_consen 485 GMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAA 525 (535)
T ss_dssp EEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEE
T ss_pred eeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEeh
Confidence 355555322 35778999999999999999999999754
No 30
>PF11593 Med3: Mediator complex subunit 3 fungal; InterPro: IPR020998 The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. 
The CDK8 module contains: MED12, MED13, CCNC and CDK8. Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. This entry represents the subunit Med3, which is a physical target for Cyc8-Tup1, a yeast transcriptional co-repressor.; GO: 0001104 RNA polymerase II transcription cofactor activity, 0006357 regulation of transcription from RNA polymerase II promoter, 0016592 mediator complex
Probab=26.89 E-value=1.8e+02 Score=32.76 Aligned_cols=19 Identities=16% Similarity=0.163 Sum_probs=11.8
Q ss_pred cccCCC-CCCceEEeccCCC
Q psy7302 21 DAYSGT-LPPNLVIMERNGT 39 (741)
Q Consensus 21 d~~~~~-~~~~~~~~~~~~~ 39 (741)
++|+++ -.+..+..+..++
T Consensus 99 ~eyse~~~~kkF~pLEtL~~ 118 (379)
T PF11593_consen 99 PEYSEKYNSKKFQPLETLSS 118 (379)
T ss_pred HHHhcccCCccceechhhhc
Confidence 567765 4566666666653
No 31
>KOG1924|consensus
Probab=24.31 E-value=2.1e+02 Score=35.31 Aligned_cols=10 Identities=10% Similarity=0.507 Sum_probs=4.5
Q ss_pred EEEeccchHh
Q psy7302 720 VYVKVSSFIG 729 (741)
Q Consensus 720 VYTrVs~y~d 729 (741)
+|.+|..|.+
T Consensus 978 FFaDi~tFrn 987 (1102)
T KOG1924|consen 978 FFADIRTFRN 987 (1102)
T ss_pred HHHHHHHHHH
Confidence 3444444443
No 32
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=23.52 E-value=71 Score=30.65 Aligned_cols=34 Identities=26% Similarity=0.492 Sum_probs=23.4
Q ss_pred CCCCCCceEEeeCCeEEEEEEEecCCCCCCCCCCceeeecccc
Q psy7302 637 QGDGGGPLVCQDDGFYELAGLVSWGFGCGRQDVPGVYVKVSSF 679 (741)
Q Consensus 637 ~GDSGGPLv~~~~g~~~LvGVvS~G~gC~~~~~P~VYtrVS~y 679 (741)
.||-||+|+|+. =++||+..|-. .-.-|+++..|
T Consensus 89 PGdCGg~L~C~H----GViGi~Tagg~-----g~VaF~dir~~ 122 (127)
T PF00947_consen 89 PGDCGGILRCKH----GVIGIVTAGGE-----GHVAFADIRDL 122 (127)
T ss_dssp TT-TCSEEEETT----CEEEEEEEEET-----TEEEEEECCCG
T ss_pred CCCCCceeEeCC----CeEEEEEeCCC-----ceEEEEechhh
Confidence 689999999987 48999988732 22336666554
Done!