Query 040739
Match_columns 594
No_of_seqs 183 out of 473
Neff 4.5
Searched_HMMs 46136
Date Fri Mar 29 11:19:05 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/040739.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/040739hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF08192 Peptidase_S64: Peptid 99.7 4.8E-16 1E-20 172.4 12.2 100 328-439 586-686 (695)
2 PRK10139 serine endoprotease; 98.3 3.4E-06 7.3E-11 92.7 12.4 156 224-436 94-253 (455)
3 TIGR02038 protease_degS peripl 98.3 1.1E-05 2.4E-10 85.6 13.9 158 224-438 82-243 (351)
4 PRK10942 serine endoprotease; 98.2 1.5E-05 3.3E-10 88.0 12.3 155 224-435 115-273 (473)
5 TIGR02037 degP_htrA_DO peripla 98.2 1.5E-05 3.3E-10 86.2 12.1 157 224-438 62-222 (428)
6 PRK10898 serine endoprotease; 98.1 3E-05 6.5E-10 82.6 13.5 158 224-438 82-244 (353)
7 PF13365 Trypsin_2: Trypsin-li 97.4 0.00035 7.5E-09 60.3 6.6 21 378-403 100-120 (120)
8 COG0265 DegQ Trypsin-like seri 96.8 0.0094 2E-07 62.9 10.8 148 236-436 83-235 (347)
9 PF00089 Trypsin: Trypsin; In 96.7 0.0026 5.6E-08 59.9 5.4 122 288-436 86-218 (220)
10 KOG1421 Predicted signaling-as 96.0 0.029 6.3E-07 64.6 9.7 92 326-439 157-256 (955)
11 cd00190 Tryp_SPc Trypsin-like 95.8 0.07 1.5E-06 50.5 10.2 30 378-409 180-209 (232)
12 smart00020 Tryp_SPc Trypsin-li 95.6 0.12 2.5E-06 49.4 10.7 29 379-411 182-210 (229)
13 PF00944 Peptidase_S3: Alphavi 95.5 0.018 3.9E-07 54.7 4.5 29 378-411 102-130 (158)
14 COG3591 V8-like Glu-specific e 94.3 0.5 1.1E-05 49.1 11.7 27 379-410 200-226 (251)
15 PF00947 Pico_P2A: Picornaviru 91.4 0.2 4.4E-06 47.1 3.7 49 363-435 75-123 (127)
16 PF00863 Peptidase_C4: Peptida 89.6 2 4.4E-05 44.3 9.3 72 327-411 103-176 (235)
17 PF05579 Peptidase_S32: Equine 87.3 0.38 8.2E-06 50.5 2.4 27 378-409 204-230 (297)
18 PF10459 Peptidase_S46: Peptid 84.7 1.1 2.4E-05 52.7 4.7 52 379-438 630-683 (698)
19 PF00949 Peptidase_S7: Peptida 80.8 1.1 2.4E-05 42.4 2.3 26 379-409 94-119 (132)
20 PF01732 DUF31: Putative pepti 62.3 5.8 0.00013 42.8 2.7 22 380-406 353-374 (374)
21 PF12381 Peptidase_C3G: Tungro 58.8 13 0.00029 38.3 4.3 47 360-411 162-208 (231)
22 COG5640 Secreted trypsin-like 51.5 92 0.002 34.7 9.4 32 378-412 224-256 (413)
23 PF00548 Peptidase_C3: 3C cyst 51.3 43 0.00093 32.7 6.4 29 378-408 143-171 (172)
24 KOG1320 Serine protease [Postt 49.3 61 0.0013 37.0 8.0 56 364-436 290-345 (473)
25 KOG1320 Serine protease [Postt 45.0 64 0.0014 36.8 7.3 129 237-406 98-229 (473)
26 PF02122 Peptidase_S39: Peptid 30.5 40 0.00087 34.2 2.7 42 379-435 144-185 (203)
No 1
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=99.65 E-value=4.8e-16 Score=172.45 Aligned_cols=100 Identities=29% Similarity=0.453 Sum_probs=84.0
Q ss_pred cccCCCeEEEEeecccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccC-CCCCceEEEEEe
Q 040739 328 SSLIGKQVVKVGRSSGLTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILMKGEN-GEKPRPIGIIWG 406 (594)
Q Consensus 328 ~p~lG~~V~KvGRTTGlT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~-d~~~~aVGLlfG 406 (594)
....|+.|+|+|||||+|+|+|+++++.|+.+ |...+.+++|.+.+...|+.+|||||+|+...++ ...-.+|||+++
T Consensus 586 ~~~~G~~VfK~GrTTgyT~G~lNg~klvyw~d-G~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhs 664 (695)
T PF08192_consen 586 NLVPGMEVFKVGRTTGYTTGILNGIKLVYWAD-GKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHS 664 (695)
T ss_pred ccCCCCeEEEecccCCccceEecceEEEEecC-CCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeee
Confidence 45679999999999999999999999888876 5566789999986678899999999999964444 334569999999
Q ss_pred cCCCCCccccccCCCCCcceEeechHHHHhhcC
Q 040739 407 GTANRGRLKLKIGQPPENWTSGVDLGRLLNLLE 439 (594)
Q Consensus 407 Gs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L~ 439 (594)
.++.. ..|.+|+||..+|+.|+
T Consensus 665 ydge~-----------kqfglftPi~~il~rl~ 686 (695)
T PF08192_consen 665 YDGEQ-----------KQFGLFTPINEILDRLE 686 (695)
T ss_pred cCCcc-----------ceeeccCcHHHHHHHHH
Confidence 87544 68999999999999874
No 2
>PRK10139 serine endoprotease; Provisional
Probab=98.35 E-value=3.4e-06 Score=92.69 Aligned_cols=156 Identities=21% Similarity=0.200 Sum_probs=94.9
Q ss_pred EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739 224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS 303 (594)
Q Consensus 224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s 303 (594)
|+++.+. .-|||||+||+.+... +.=...||. ....+.+-..+.. |+|++++....+.
T Consensus 94 G~ii~~~----~g~IlTn~HVv~~a~~------------i~V~~~dg~-~~~a~vvg~D~~~----DlAvlkv~~~~~l- 151 (455)
T PRK10139 94 GVIIDAA----KGYVLTNNHVINQAQK------------ISIQLNDGR-EFDAKLIGSDDQS----DIALLQIQNPSKL- 151 (455)
T ss_pred EEEEECC----CCEEEeChHHhCCCCE------------EEEEECCCC-EEEEEEEEEcCCC----CEEEEEecCCCCC-
Confidence 6666542 2399999999987542 111123444 1112322234444 8899998632211
Q ss_pred ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739 304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD 379 (594)
Q Consensus 304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS 379 (594)
+.+. +|. .....+|+.|.-+|.--| +|.|.|+++.-......+ +.++|.++. --
T Consensus 152 ---~~~~----lg~--------s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~~~---~~~~iqtda----~i 209 (455)
T PRK10139 152 ---TQIA----IAD--------SDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNLEG---LENFIQTDA----SI 209 (455)
T ss_pred ---ceeE----ecC--------ccccCCCCEEEEEecCCCCCCceEEEEEccccccccCCCC---cceEEEECC----cc
Confidence 1111 111 124678999999988544 588999887632111112 356788776 46
Q ss_pred CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739 380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN 436 (594)
Q Consensus 380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~ 436 (594)
.+|.||++++ |.++++|||..+--... ++..+..|+-|+..+..
T Consensus 210 n~GnSGGpl~-----n~~G~vIGi~~~~~~~~--------~~~~gigfaIP~~~~~~ 253 (455)
T PRK10139 210 NRGNSGGALL-----NLNGELIGINTAILAPG--------GGSVGIGFAIPSNMART 253 (455)
T ss_pred CCCCCcceEE-----CCCCeEEEEEEEEEcCC--------CCccceEEEEEhHHHHH
Confidence 7899999999 99999999998743211 12256789999865443
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.27 E-value=1.1e-05 Score=85.63 Aligned_cols=158 Identities=19% Similarity=0.246 Sum_probs=95.2
Q ss_pred EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739 224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS 303 (594)
Q Consensus 224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s 303 (594)
|+++.+ .| |||||+||..+.+. ++=...||.. ...+.+-..+.. |+|++++... +
T Consensus 82 G~vi~~-~G----~IlTn~HVV~~~~~------------i~V~~~dg~~-~~a~vv~~d~~~----DlAvlkv~~~-~-- 136 (351)
T TIGR02038 82 GVIMSK-EG----YILTNYHVIKKADQ------------IVVALQDGRK-FEAELVGSDPLT----DLAVLKIEGD-N-- 136 (351)
T ss_pred EEEEeC-Ce----EEEecccEeCCCCE------------EEEEECCCCE-EEEEEEEecCCC----CEEEEEecCC-C--
Confidence 666643 22 89999999987542 1111234431 112222223444 7899987632 1
Q ss_pred ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739 304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD 379 (594)
Q Consensus 304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS 379 (594)
+ +.+. +| ......+|+.|.-+|...| +|.|.|+++.-......+ ..+++.++. --
T Consensus 137 -~-~~~~----l~--------~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~~~---~~~~iqtda----~i 195 (351)
T TIGR02038 137 -L-PTIP----VN--------LDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSSVG---RQNFIQTDA----AI 195 (351)
T ss_pred -C-ceEe----cc--------CcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCCCC---cceEEEECC----cc
Confidence 1 1111 11 1235689999999999866 578999888632212112 245677666 36
Q ss_pred CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhc
Q 040739 380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLL 438 (594)
Q Consensus 380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L 438 (594)
.+|.||++++ |.++++|||..+.-... .+....+..|+-|+..+...+
T Consensus 196 ~~GnSGGpl~-----n~~G~vIGI~~~~~~~~------~~~~~~g~~faIP~~~~~~vl 243 (351)
T TIGR02038 196 NAGNSGGALI-----NTNGELVGINTASFQKG------GDEGGEGINFAIPIKLAHKIM 243 (351)
T ss_pred CCCCCcceEE-----CCCCeEEEEEeeeeccc------CCCCccceEEEecHHHHHHHH
Confidence 7899999999 99999999987542110 011224678899988777655
No 4
>PRK10942 serine endoprotease; Provisional
Probab=98.16 E-value=1.5e-05 Score=88.02 Aligned_cols=155 Identities=19% Similarity=0.196 Sum_probs=91.3
Q ss_pred EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739 224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS 303 (594)
Q Consensus 224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s 303 (594)
|++|... .-|||||+||+.+.... .+ ...||. ....+.+-..+.. |+|++++....+.
T Consensus 115 G~ii~~~----~G~IlTn~HVv~~a~~i----------~V--~~~dg~-~~~a~vv~~D~~~----DlAvlki~~~~~l- 172 (473)
T PRK10942 115 GVIIDAD----KGYVVTNNHVVDNATKI----------KV--QLSDGR-KFDAKVVGKDPRS----DIALIQLQNPKNL- 172 (473)
T ss_pred EEEEECC----CCEEEeChhhcCCCCEE----------EE--EECCCC-EEEEEEEEecCCC----CEEEEEecCCCCC-
Confidence 6666432 23899999998774421 11 123443 1112222233444 8899987532211
Q ss_pred ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739 304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD 379 (594)
Q Consensus 304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS 379 (594)
+.+. +| ......+|+.|.-+|..-| +|.|.|+++.-... +...+.++|.++. --
T Consensus 173 ---~~~~----lg--------~s~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~---~~~~~~~~iqtda----~i 230 (473)
T PRK10942 173 ---TAIK----MA--------DSDALRVGDYTVAIGNPYGLGETVTSGIVSALGRSGL---NVENYENFIQTDA----AI 230 (473)
T ss_pred ---ceeE----ec--------CccccCCCCEEEEEcCCCCCCcceeEEEEEEeecccC---CcccccceEEecc----cc
Confidence 1121 11 1234689999999998755 58899988863211 1112346677766 35
Q ss_pred CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHH
Q 040739 380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLL 435 (594)
Q Consensus 380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL 435 (594)
.+|.||++++ |.++++|||..+.-... ++..+..|+-|+..+.
T Consensus 231 ~~GnSGGpL~-----n~~GeviGI~t~~~~~~--------g~~~g~gfaIP~~~~~ 273 (473)
T PRK10942 231 NRGNSGGALV-----NLNGELIGINTAILAPD--------GGNIGIGFAIPSNMVK 273 (473)
T ss_pred CCCCCcCccC-----CCCCeEEEEEEEEEcCC--------CCcccEEEEEEHHHHH
Confidence 6899999999 89999999987643211 1113467888875433
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.15 E-value=1.5e-05 Score=86.23 Aligned_cols=157 Identities=22% Similarity=0.224 Sum_probs=93.3
Q ss_pred EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739 224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS 303 (594)
Q Consensus 224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s 303 (594)
|++|... | |||||+||+.+.... .+. ..||. ....+.+...+.. |+|++++... .
T Consensus 62 Gfii~~~-G----~IlTn~Hvv~~~~~i----------~V~--~~~~~-~~~a~vv~~d~~~----DlAllkv~~~---~ 116 (428)
T TIGR02037 62 GVIISAD-G----YILTNNHVVDGADEI----------TVT--LSDGR-EFKAKLVGKDPRT----DIAVLKIDAK---K 116 (428)
T ss_pred EEEECCC-C----EEEEcHHHcCCCCeE----------EEE--eCCCC-EEEEEEEEecCCC----CEEEEEecCC---C
Confidence 6666543 2 899999999885531 111 12333 1112222223333 7799988532 1
Q ss_pred ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCC
Q 040739 304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFD 379 (594)
Q Consensus 304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS 379 (594)
.+ +.+. + .......+|+.|+-+|.--| +|.|.|+++.-..... + .+.+++.++. -.
T Consensus 117 ~~-~~~~----l--------~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~~~-~--~~~~~i~tda----~i 176 (428)
T TIGR02037 117 NL-PVIK----L--------GDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGLGI-G--DYENFIQTDA----AI 176 (428)
T ss_pred Cc-eEEE----c--------cCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCccCC-C--CccceEEECC----CC
Confidence 11 1111 1 11234689999999998644 6788888876221111 1 1355777766 47
Q ss_pred CCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhc
Q 040739 380 LEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLL 438 (594)
Q Consensus 380 ~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L 438 (594)
.+|.||++++ |.++++|||..+.-... ++..++.|+-||..+.+.+
T Consensus 177 ~~GnSGGpl~-----n~~G~viGI~~~~~~~~--------g~~~g~~faiP~~~~~~~~ 222 (428)
T TIGR02037 177 NPGNSGGPLV-----NLRGEVIGINTAIYSPS--------GGNVGIGFAIPSNMAKNVV 222 (428)
T ss_pred CCCCCCCceE-----CCCCeEEEEEeEEEcCC--------CCccceEEEEEhHHHHHHH
Confidence 7899999999 89999999987643211 1124567888876555444
No 6
>PRK10898 serine endoprotease; Provisional
Probab=98.13 E-value=3e-05 Score=82.59 Aligned_cols=158 Identities=23% Similarity=0.290 Sum_probs=92.1
Q ss_pred EEEEEeCCCCceeEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCC
Q 040739 224 GAIVKSQTGSRQVGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMS 303 (594)
Q Consensus 224 GclVtD~~G~~~~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s 303 (594)
|+++. ..| |||||+||+.+.... .+ ...||.. ...+.+-..+.. |+|++++... +
T Consensus 82 Gfvi~-~~G----~IlTn~HVv~~a~~i----------~V--~~~dg~~-~~a~vv~~d~~~----DlAvl~v~~~-~-- 136 (353)
T PRK10898 82 GVIMD-QRG----YILTNKHVINDADQI----------IV--ALQDGRV-FEALLVGSDSLT----DLAVLKINAT-N-- 136 (353)
T ss_pred EEEEe-CCe----EEEecccEeCCCCEE----------EE--EeCCCCE-EEEEEEEEcCCC----CEEEEEEcCC-C--
Confidence 66664 322 899999999874421 11 1224431 111222123444 8899998532 1
Q ss_pred ccccccccccccCcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEE-EEEeCCCCeEEEEEEEEECCCCCCC
Q 040739 304 TVTTSVKGLGEIGDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYA-LEYNDEKGICFLTDFLVVGENQQTF 378 (594)
Q Consensus 304 ~vs~~I~~lG~IG~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~-V~y~~~~G~~~f~dqIVt~~~g~~F 378 (594)
+ +.+. +| ....+.+|+.|.-+|.-.| +|.|.|++.. ..+.. .+. .+++.++. -
T Consensus 137 -l-~~~~----l~--------~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~-~~~---~~~iqtda----~ 194 (353)
T PRK10898 137 -L-PVIP----IN--------PKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP-TGR---QNFLQTDA----S 194 (353)
T ss_pred -C-Ceee----cc--------CcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC-ccc---cceEEecc----c
Confidence 1 1111 11 1124689999999998766 5889999876 32222 121 34566665 3
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhc
Q 040739 379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLL 438 (594)
Q Consensus 379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L 438 (594)
-.+|.||++++ |.++++|||..+.-... ..+..+.+..|+-|+..+...+
T Consensus 195 i~~GnSGGPl~-----n~~G~vvGI~~~~~~~~-----~~~~~~~g~~faIP~~~~~~~~ 244 (353)
T PRK10898 195 INHGNSGGALV-----NSLGELMGINTLSFDKS-----NDGETPEGIGFAIPTQLATKIM 244 (353)
T ss_pred cCCCCCcceEE-----CCCCeEEEEEEEEeccc-----CCCCcccceEEEEchHHHHHHH
Confidence 57899999999 89999999987532111 0011124678898887755443
No 7
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.44 E-value=0.00035 Score=60.32 Aligned_cols=21 Identities=29% Similarity=0.441 Sum_probs=18.6
Q ss_pred CCCCCCccceEEeeccCCCCCceEEE
Q 040739 378 FDLEGDSGSLILMKGENGEKPRPIGI 403 (594)
Q Consensus 378 FS~~GDSGSlVl~~~~~d~~~~aVGL 403 (594)
...+|.||++|| |.++++|||
T Consensus 100 ~~~~G~SGgpv~-----~~~G~vvGi 120 (120)
T PF13365_consen 100 DTRPGSSGGPVF-----DSDGRVVGI 120 (120)
T ss_dssp S-STTTTTSEEE-----ETTSEEEEE
T ss_pred ccCCCcEeHhEE-----CCCCEEEeC
Confidence 588999999999 899999997
No 8
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=96.77 E-value=0.0094 Score=62.91 Aligned_cols=148 Identities=20% Similarity=0.238 Sum_probs=91.1
Q ss_pred eEEEecCceeccCCCCCCCCCCCCCCcccccccCCCccccccccccCCCccccccccccccccCCCCCcccccccccccc
Q 040739 236 VGFLTNRHVAVDLDYPNQKMFHPLPPTLGPGVYLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMSTVTTSVKGLGEI 315 (594)
Q Consensus 236 ~yiLTNnHVla~~n~~~~~~~~pgdpILQPg~~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s~vs~~I~~lG~I 315 (594)
-|+|||+||+.+... +.-...||. .-..+++...+.. |+|++++..... .+.+. +
T Consensus 83 g~ivTn~hVi~~a~~------------i~v~l~dg~-~~~a~~vg~d~~~----dlavlki~~~~~----~~~~~----~ 137 (347)
T COG0265 83 GYIVTNNHVIAGAEE------------ITVTLADGR-EVPAKLVGKDPIS----DLAVLKIDGAGG----LPVIA----L 137 (347)
T ss_pred eEEEecceecCCcce------------EEEEeCCCC-EEEEEEEecCCcc----CEEEEEeccCCC----Cceee----c
Confidence 499999999998442 111114554 1123444334444 779988754221 11221 1
Q ss_pred CcceeecccCcccccCCCeEEEEeeccc----ceEEEEEEEEEE-EeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEe
Q 040739 316 GDVKIVDLQSPISSLIGKQVVKVGRSSG----LTTGTVLAYALE-YNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILM 390 (594)
Q Consensus 316 G~vk~vdl~g~~~p~lG~~V~KvGRTTG----lT~G~Itai~V~-y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~ 390 (594)
|. .....+|+.|.-+|-..| +|.|.|+++.-. +..... +.++|.+.. .-.+|.||..++
T Consensus 138 ~~--------s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~~~~---~~~~IqtdA----ain~gnsGgpl~- 201 (347)
T COG0265 138 GD--------SDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGSAGG---YVNFIQTDA----AINPGNSGGPLV- 201 (347)
T ss_pred cC--------CCCcccCCEEEEecCCCCcccceeccEEeccccccccCccc---ccchhhccc----ccCCCCCCCceE-
Confidence 11 123458999999999888 677877777643 332111 567777665 688999999999
Q ss_pred eccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739 391 KGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN 436 (594)
Q Consensus 391 ~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~ 436 (594)
+.++++||+..+.-...+ +..+..|+-|+..+..
T Consensus 202 ----n~~g~~iGint~~~~~~~--------~~~gigfaiP~~~~~~ 235 (347)
T COG0265 202 ----NIDGEVVGINTAIIAPSG--------GSSGIGFAIPVNLVAP 235 (347)
T ss_pred ----cCCCcEEEEEEEEecCCC--------CcceeEEEecHHHHHH
Confidence 899999999877654321 1123467777766554
No 9
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.70 E-value=0.0026 Score=59.93 Aligned_cols=122 Identities=16% Similarity=0.228 Sum_probs=63.1
Q ss_pred cccccccccccCCCCCccccccccccccCcceeecccC-cccccCCCeEEEEeecccceEE---EEEEEEEEEeCC----
Q 040739 288 RADGAFIPFADDFDMSTVTTSVKGLGEIGDVKIVDLQS-PISSLIGKQVVKVGRSSGLTTG---TVLAYALEYNDE---- 359 (594)
Q Consensus 288 DaD~AIi~va~~~d~s~vs~~I~~lG~IG~vk~vdl~g-~~~p~lG~~V~KvGRTTGlT~G---~Itai~V~y~~~---- 359 (594)
+.|+||+++.++.... ..+. .+ .+.. ......|+.+.-+|-......+ .+....+.+-..
T Consensus 86 ~~DiAll~L~~~~~~~---~~~~---~~------~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~ 153 (220)
T PF00089_consen 86 DNDIALLKLDRPITFG---DNIQ---PI------CLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCR 153 (220)
T ss_dssp TTSEEEEEESSSSEHB---SSBE---ES------BBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHH
T ss_pred cccccccccccccccc---cccc---cc------cccccccccccccccccccccccccccccccccccccccccccccc
Confidence 4488999998753221 1221 11 1111 1224678877777776654444 344333322110
Q ss_pred ---CCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739 360 ---KGICFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN 436 (594)
Q Consensus 360 ---~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~ 436 (594)
... ...+++.+...+..-...|||||.++. .+..++|++..+.... .+....+|++|...++
T Consensus 154 ~~~~~~-~~~~~~c~~~~~~~~~~~g~sG~pl~~-----~~~~lvGI~s~~~~c~---------~~~~~~v~~~v~~~~~ 218 (220)
T PF00089_consen 154 SSYNDN-LTPNMICAGSSGSGDACQGDSGGPLIC-----NNNYLVGIVSFGENCG---------SPNYPGVYTRVSSYLD 218 (220)
T ss_dssp HHTTTT-STTTEEEEETTSSSBGGTTTTTSEEEE-----TTEEEEEEEEEESSSS---------BTTSEEEEEEGGGGHH
T ss_pred cccccc-ccccccccccccccccccccccccccc-----ceeeecceeeecCCCC---------CCCcCEEEEEHHHhhc
Confidence 000 011223222212234568999999993 3337999999984321 1123488888876653
No 10
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=96.04 E-value=0.029 Score=64.56 Aligned_cols=92 Identities=17% Similarity=0.286 Sum_probs=57.7
Q ss_pred cccccCCCeEEEEeeccc----ceEEEEEEEE---EEEeCCCCeEEEEE-EEEECCCCCCCCCCCCccceEEeeccCCCC
Q 040739 326 PISSLIGKQVVKVGRSSG----LTTGTVLAYA---LEYNDEKGICFLTD-FLVVGENQQTFDLEGDSGSLILMKGENGEK 397 (594)
Q Consensus 326 ~~~p~lG~~V~KvGRTTG----lT~G~Itai~---V~y~~~~G~~~f~d-qIVt~~~g~~FS~~GDSGSlVl~~~~~d~~ 397 (594)
+..+.+|...+.+|--.| +-.|.+..+. =+|.. ...+.|.- .+.... -+.+|-|||+|+ +-.
T Consensus 157 p~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~-~~yndfnTfy~Qaas----stsggssgspVv-----~i~ 226 (955)
T KOG1421|consen 157 PELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGE-DTYNDFNTFYIQAAS----STSGGSSGSPVV-----DIP 226 (955)
T ss_pred ccccccCCceEEecCCccceEEeehhhhhhccCCCccccc-cccccccceeeeehh----cCCCCCCCCcee-----ccc
Confidence 345667777777766442 2223333332 11211 11222322 233333 478899999999 899
Q ss_pred CceEEEEEecCCCCCccccccCCCCCcceEeechHHHHhhcC
Q 040739 398 PRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLNLLE 439 (594)
Q Consensus 398 ~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~~L~ 439 (594)
+++|.|.-||.... ...+|-||.+|+.+|-
T Consensus 227 gyAVAl~agg~~ss------------as~ffLpLdrV~RaL~ 256 (955)
T KOG1421|consen 227 GYAVALNAGGSISS------------ASDFFLPLDRVVRALR 256 (955)
T ss_pred ceEEeeecCCcccc------------cccceeeccchhhhhh
Confidence 99999999998653 4479999999998874
No 11
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=95.85 E-value=0.07 Score=50.52 Aligned_cols=30 Identities=23% Similarity=0.388 Sum_probs=22.3
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEecCC
Q 040739 378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGTA 409 (594)
Q Consensus 378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~ 409 (594)
-...||||+.++.. .+....++|++..|..
T Consensus 180 ~~c~gdsGgpl~~~--~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 180 DACQGDSGGPLVCN--DNGRGVLVGIVSWGSG 209 (232)
T ss_pred ccccCCCCCcEEEE--eCCEEEEEEEEehhhc
Confidence 45679999999952 1234789999988764
No 12
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=95.59 E-value=0.12 Score=49.40 Aligned_cols=29 Identities=28% Similarity=0.443 Sum_probs=21.4
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEecCCCC
Q 040739 379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANR 411 (594)
Q Consensus 379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~ 411 (594)
..+||||+.++... + ...++|++..|. .|
T Consensus 182 ~c~gdsG~pl~~~~--~-~~~l~Gi~s~g~-~C 210 (229)
T smart00020 182 ACQGDSGGPLVCND--G-RWVLVGIVSWGS-GC 210 (229)
T ss_pred ccCCCCCCeeEEEC--C-CEEEEEEEEECC-CC
Confidence 45699999999421 1 458999998887 44
No 13
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is active only while part of the polyprotein; cleavage at a Trp-Ser bond results in total loss of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=95.50 E-value=0.018 Score=54.73 Aligned_cols=29 Identities=31% Similarity=0.541 Sum_probs=25.7
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEecCCCC
Q 040739 378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGTANR 411 (594)
Q Consensus 378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~ 411 (594)
...+||||.+++ |+++++|||++||....
T Consensus 102 ~g~~GDSGRpi~-----DNsGrVVaIVLGG~neG 130 (158)
T PF00944_consen 102 VGKPGDSGRPIF-----DNSGRVVAIVLGGANEG 130 (158)
T ss_dssp S-STTSTTEEEE-----STTSBEEEEEEEEEEET
T ss_pred CCCCCCCCCccC-----cCCCCEEEEEecCCCCC
Confidence 689999999999 99999999999998654
No 14
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=94.31 E-value=0.5 Score=49.12 Aligned_cols=27 Identities=37% Similarity=0.570 Sum_probs=23.7
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEecCCC
Q 040739 379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTAN 410 (594)
Q Consensus 379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~ 410 (594)
..+|+|||.|+ ..+.+++|++++|..-
T Consensus 200 T~pG~SGSpv~-----~~~~~vigv~~~g~~~ 226 (251)
T COG3591 200 TLPGSSGSPVL-----ISKDEVIGVHYNGPGA 226 (251)
T ss_pred ccCCCCCCceE-----ecCceEEEEEecCCCc
Confidence 77899999999 6777999999999863
No 15
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=91.44 E-value=0.2 Score=47.08 Aligned_cols=49 Identities=24% Similarity=0.324 Sum_probs=37.0
Q ss_pred EEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHH
Q 040739 363 CFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLL 435 (594)
Q Consensus 363 ~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL 435 (594)
.+.+++++..- +++|||-|++++ -+-.++||+.||.. +...|.+|+.++
T Consensus 75 h~Q~~~l~g~G----p~~PGdCGg~L~------C~HGViGi~Tagg~--------------g~VaF~dir~~~ 123 (127)
T PF00947_consen 75 HYQYNLLIGEG----PAEPGDCGGILR------CKHGVIGIVTAGGE--------------GHVAFADIRDLL 123 (127)
T ss_dssp EEEECEEEEE-----SSSTT-TCSEEE------ETTCEEEEEEEEET--------------TEEEEEECCCGS
T ss_pred heecCceeecc----cCCCCCCCceeE------eCCCeEEEEEeCCC--------------ceEEEEechhhh
Confidence 56677777655 799999999999 45669999999984 447888887654
No 16
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=89.60 E-value=2 Score=44.33 Aligned_cols=72 Identities=19% Similarity=0.265 Sum_probs=46.7
Q ss_pred ccccCCCeEEEEee--cccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEE
Q 040739 327 ISSLIGKQVVKVGR--SSGLTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGII 404 (594)
Q Consensus 327 ~~p~lG~~V~KvGR--TTGlT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLl 404 (594)
..|..++.|+.+|- .+....-+|+.-...|... +..+++.+|-| ..||=|++++. -.++.+||++
T Consensus 103 R~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p~~-~~~fWkHwIsT--------k~G~CG~PlVs----~~Dg~IVGiH 169 (235)
T PF00863_consen 103 RAPKEGERVCMVGSNFQEKSISSTVSESSWIYPEE-NSHFWKHWIST--------KDGDCGLPLVS----TKDGKIVGIH 169 (235)
T ss_dssp ----TT-EEEEEEEECSSCCCEEEEEEEEEEEEET-TTTEEEE-C-----------TT-TT-EEEE----TTT--EEEEE
T ss_pred cCCCCCCEEEEEEEEEEcCCeeEEECCceEEeecC-CCCeeEEEecC--------CCCccCCcEEE----cCCCcEEEEE
Confidence 47899999999997 7778888888888777633 23568888854 45999999994 4689999999
Q ss_pred EecCCCC
Q 040739 405 WGGTANR 411 (594)
Q Consensus 405 fGGs~~~ 411 (594)
..++...
T Consensus 170 sl~~~~~ 176 (235)
T PF00863_consen 170 SLTSNTS 176 (235)
T ss_dssp EEEETTT
T ss_pred cCccCCC
Confidence 9887643
No 17
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=87.34 E-value=0.38 Score=50.50 Aligned_cols=27 Identities=37% Similarity=0.506 Sum_probs=21.8
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEecCC
Q 040739 378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGTA 409 (594)
Q Consensus 378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~ 409 (594)
|+.+|||||+|+ ..++.+||+|.|.+.
T Consensus 204 fT~~GDSGSPVV-----t~dg~liGVHTGSn~ 230 (297)
T PF05579_consen 204 FTGPGDSGSPVV-----TEDGDLIGVHTGSNK 230 (297)
T ss_dssp SS-GGCTT-EEE-----ETTC-EEEEEEEEET
T ss_pred EcCCCCCCCccC-----cCCCCEEEEEecCCC
Confidence 899999999999 688999999999775
No 18
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=84.74 E-value=1.1 Score=52.69 Aligned_cols=52 Identities=31% Similarity=0.440 Sum_probs=38.7
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEecCCCC--CccccccCCCCCcceEeechHHHHhhc
Q 040739 379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANR--GRLKLKIGQPPENWTSGVDLGRLLNLL 438 (594)
Q Consensus 379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~--g~~~~~~~~~p~~~Tl~~pI~~VL~~L 438 (594)
.-+|.|||+|+ |.++++|||.|-|+-.. |-..|. .-.+-++..+|.-||-.|
T Consensus 630 itGGNSGSPvl-----N~~GeLVGl~FDgn~Esl~~D~~fd---p~~~R~I~VDiRyvL~~l 683 (698)
T PF10459_consen 630 ITGGNSGSPVL-----NAKGELVGLAFDGNWESLSGDIAFD---PELNRTIHVDIRYVLWAL 683 (698)
T ss_pred cCCCCCCCccC-----CCCceEEEEeecCchhhcccccccc---cccceeEEEEHHHHHHHH
Confidence 77899999999 99999999999998543 111111 114567899998888765
No 19
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=80.82 E-value=1.1 Score=42.40 Aligned_cols=26 Identities=27% Similarity=0.387 Sum_probs=21.5
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEecCC
Q 040739 379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTA 409 (594)
Q Consensus 379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~ 409 (594)
-.+|-|||+|+ +.++++|||+++|-.
T Consensus 94 ~~~GsSGSpi~-----n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 94 FPKGSSGSPIF-----NQNGEIVGLYGNGVE 119 (132)
T ss_dssp S-TTGTT-EEE-----ETTSCEEEEEEEEEE
T ss_pred cCCCCCCCceE-----cCCCcEEEEEcccee
Confidence 34699999999 899999999999864
No 20
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=62.25 E-value=5.8 Score=42.79 Aligned_cols=22 Identities=32% Similarity=0.668 Sum_probs=20.6
Q ss_pred CCCCccceEEeeccCCCCCceEEEEEe
Q 040739 380 LEGDSGSLILMKGENGEKPRPIGIIWG 406 (594)
Q Consensus 380 ~~GDSGSlVl~~~~~d~~~~aVGLlfG 406 (594)
.+|=|||+|+ ++++++|||+||
T Consensus 353 ~gGaSGS~V~-----n~~~~lvGIy~g 374 (374)
T PF01732_consen 353 GGGASGSMVI-----NQNNELVGIYFG 374 (374)
T ss_pred CCCCCcCeEE-----CCCCCEEEEeCC
Confidence 4799999999 999999999997
No 21
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=58.77 E-value=13 Score=38.26 Aligned_cols=47 Identities=28% Similarity=0.316 Sum_probs=31.6
Q ss_pred CCeEEEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCC
Q 040739 360 KGICFLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANR 411 (594)
Q Consensus 360 ~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~ 411 (594)
+|..+.+.-+.-.. .+..||=||+++.- +...-.+++||+.+|+.+.
T Consensus 162 ~~~ytir~gleY~~----~t~~GdCGs~i~~~-~t~~~RKIvGiHVAG~~~~ 208 (231)
T PF12381_consen 162 KGQYTIRQGLEYQM----PTMNGDCGSPIVRN-NTQMVRKIVGIHVAGSANH 208 (231)
T ss_pred CCcEEeeeeeeEEC----CCcCCCccceeeEc-chhhhhhhheeeecccccc
Confidence 34444554444334 68999999999952 1223467999999999764
No 22
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=51.46 E-value=92 Score=34.66 Aligned_cols=32 Identities=38% Similarity=0.530 Sum_probs=20.7
Q ss_pred CCCCCCccceEEeeccCCCCCc-eEEEEEecCCCCC
Q 040739 378 FDLEGDSGSLILMKGENGEKPR-PIGIIWGGTANRG 412 (594)
Q Consensus 378 FS~~GDSGSlVl~~~~~d~~~~-aVGLlfGGs~~~g 412 (594)
-+-.||||.+++.+ ..+++ =+|+..=|.+.||
T Consensus 224 daCqGDSGGPi~~~---g~~G~vQ~GVvSwG~~~Cg 256 (413)
T COG5640 224 DACQGDSGGPIFHK---GEEGRVQRGVVSWGDGGCG 256 (413)
T ss_pred ccccCCCCCceEEe---CCCccEEEeEEEecCCCCC
Confidence 45679999999964 33444 4888844444353
No 23
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=51.32 E-value=43 Score=32.72 Aligned_cols=29 Identities=28% Similarity=0.321 Sum_probs=23.4
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEecC
Q 040739 378 FDLEGDSGSLILMKGENGEKPRPIGIIWGGT 408 (594)
Q Consensus 378 FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs 408 (594)
-+.+||-||+++. .......++||+.||+
T Consensus 143 ~t~~G~CG~~l~~--~~~~~~~i~GiHvaG~ 171 (172)
T PF00548_consen 143 PTKPGMCGSPLVS--RIGGQGKIIGIHVAGN 171 (172)
T ss_dssp EEETTGTTEEEEE--SCGGTTEEEEEEEEEE
T ss_pred CCCCCccCCeEEE--eeccCccEEEEEeccC
Confidence 3778999999994 2334788999999997
No 24
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=49.27 E-value=61 Score=36.96 Aligned_cols=56 Identities=14% Similarity=0.187 Sum_probs=37.6
Q ss_pred EEEEEEEECCCCCCCCCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHHh
Q 040739 364 FLTDFLVVGENQQTFDLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLLN 436 (594)
Q Consensus 364 ~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL~ 436 (594)
+..+.+.+.. -...|-||-+++ +..+.+||+-+.--. |+.|. ...++.-|+..|+.
T Consensus 290 ~i~~~~qtd~----ai~~~nsg~~ll-----~~DG~~IgVn~~~~~---ri~~~-----~~iSf~~p~d~vl~ 345 (473)
T KOG1320|consen 290 LISKINQTDA----AINPGNSGGPLL-----NLDGEVIGVNTRKVT---RIGFS-----HGISFKIPIDTVLV 345 (473)
T ss_pred eeeeecccch----hhhcccCCCcEE-----EecCcEeeeeeeeeE---Eeecc-----ccceeccCchHhhh
Confidence 3445555555 578899999999 889999997766443 12122 34467777777774
No 25
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=44.97 E-value=64 Score=36.77 Aligned_cols=129 Identities=19% Similarity=0.144 Sum_probs=67.8
Q ss_pred EEEecCceeccCCCCCCCCCCCCCCcccccc-cCCCccccccccccCCCccccccccccccccCCCCCcccccc-ccccc
Q 040739 237 GFLTNRHVAVDLDYPNQKMFHPLPPTLGPGV-YLGAVERATSFHHRRPLTFVRADGAFIPFADDFDMSTVTTSV-KGLGE 314 (594)
Q Consensus 237 yiLTNnHVla~~n~~~~~~~~pgdpILQPg~-~DGG~~rd~~FIp~~p~N~VDaD~AIi~va~~~d~s~vs~~I-~~lG~ 314 (594)
-+|||.|+.+..+.. .++.+ ..|...+...|+ ...+..||.|++.+.. ...|-.-. ..+|+
T Consensus 98 ~lltn~~~v~~~~~~-----------~~v~v~~~gs~~k~~~~v---~~~~~~cd~Avv~Ie~---~~f~~~~~~~e~~~ 160 (473)
T KOG1320|consen 98 KLLTNAHVVAPNNDH-----------KFVTVKKHGSPRKYKAFV---AAVFEECDLAVVYIES---EEFWKGMNPFELGD 160 (473)
T ss_pred ceeecCccccccccc-----------cccccccCCCchhhhhhH---HHhhhcccceEEEEee---ccccCCCcccccCC
Confidence 599999999976642 11111 122211112221 2356778888876642 22332111 12222
Q ss_pred cCcceeecccCcccccCCCeEEEE-eecccceEEEEEEEEEEEeCCCCeEEEEEEEEECCCCCCCCCCCCccceEEeecc
Q 040739 315 IGDVKIVDLQSPISSLIGKQVVKV-GRSSGLTTGTVLAYALEYNDEKGICFLTDFLVVGENQQTFDLEGDSGSLILMKGE 393 (594)
Q Consensus 315 IG~vk~vdl~g~~~p~lG~~V~Kv-GRTTGlT~G~Itai~V~y~~~~G~~~f~dqIVt~~~g~~FS~~GDSGSlVl~~~~ 393 (594)
+ |.+...|+=+ |-+.=+|.|.|.++..+-....+..+..-||.... .+|.||-.++
T Consensus 161 i-------------p~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~------~~~~s~ep~i---- 217 (473)
T KOG1320|consen 161 I-------------PSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI------GPGNSGEPVI---- 217 (473)
T ss_pred C-------------cccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee------cCCccCCCeE----
Confidence 2 3333333333 55666899999999855333334444455555433 4678888888
Q ss_pred CCCCCceEEEEEe
Q 040739 394 NGEKPRPIGIIWG 406 (594)
Q Consensus 394 ~d~~~~aVGLlfG 406 (594)
...+...|+.|-
T Consensus 218 -~g~d~~~gvA~l 229 (473)
T KOG1320|consen 218 -VGVDKVAGVAFL 229 (473)
T ss_pred -EccccccceEEE
Confidence 234555565554
No 26
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein. The nucleotide sequence for the RNA of PLrV has been determined. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide.
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, displays extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus.; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=30.54 E-value=40 Score=34.21 Aligned_cols=42 Identities=19% Similarity=0.170 Sum_probs=15.1
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEecCCCCCccccccCCCCCcceEeechHHHH
Q 040739 379 DLEGDSGSLILMKGENGEKPRPIGIIWGGTANRGRLKLKIGQPPENWTSGVDLGRLL 435 (594)
Q Consensus 379 S~~GDSGSlVl~~~~~d~~~~aVGLlfGGs~~~g~~~~~~~~~p~~~Tl~~pI~~VL 435 (594)
+.+|+||+.+| ..+ .+||++.|.... ...++|-+..||..+.
T Consensus 144 T~~G~SGtp~y-----~g~-~vvGvH~G~~~~---------~~~~n~n~~spip~~~ 185 (203)
T PF02122_consen 144 TSPGWSGTPYY-----SGK-NVVGVHTGSPSG---------SNRENNNRMSPIPPIP 185 (203)
T ss_dssp --TT-TT-EEE------SS--EEEEEEEE----------------------------
T ss_pred CCCCCCCCCeE-----ECC-CceEeecCcccc---------cccccccccccccccc
Confidence 78999999999 344 999999995110 1124566677776654
Done!