Query 012266
Match_columns 467
No_of_seqs 178 out of 422
Neff 5.3
Searched_HMMs 46136
Date Fri Mar 29 00:48:59 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/012266.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/012266hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF08192 Peptidase_S64: Peptid 99.7 4.6E-17 9.9E-22 175.7 13.2 102 336-449 584-686 (695)
2 PRK10139 serine endoprotease; 98.0 7.3E-05 1.6E-09 80.3 13.2 89 337-446 160-253 (455)
3 TIGR02038 protease_degS peripl 98.0 0.00013 2.8E-09 75.7 14.0 94 337-448 146-243 (351)
4 PRK10898 serine endoprotease; 97.9 0.00018 4E-09 74.7 13.5 93 337-447 146-243 (353)
5 PRK10942 serine endoprotease; 97.9 0.00015 3.2E-09 78.3 12.8 88 337-444 181-272 (473)
6 TIGR02037 degP_htrA_DO peripla 97.8 0.00016 3.4E-09 76.7 12.0 92 337-448 127-222 (428)
7 PF13365 Trypsin_2: Trypsin-li 97.2 0.0013 2.9E-08 55.4 7.4 21 388-413 100-120 (120)
8 PF00089 Trypsin: Trypsin; In 96.7 0.0038 8.3E-08 57.6 6.8 45 388-446 174-218 (220)
9 PF00863 Peptidase_C4: Peptida 95.3 0.17 3.6E-06 50.4 11.0 105 298-444 81-187 (235)
10 COG0265 DegQ Trypsin-like seri 95.2 0.17 3.6E-06 52.2 11.0 90 339-448 143-237 (347)
11 cd00190 Tryp_SPc Trypsin-like 94.6 0.87 1.9E-05 42.1 13.1 30 388-419 180-209 (232)
12 smart00020 Tryp_SPc Trypsin-li 94.5 0.84 1.8E-05 42.5 12.8 43 389-443 182-224 (229)
13 KOG1421 Predicted signaling-as 94.2 0.11 2.4E-06 58.2 7.1 45 388-449 212-256 (955)
14 PF00944 Peptidase_S3: Alphavi 94.0 0.12 2.6E-06 47.5 5.6 29 388-421 102-130 (158)
15 PF00947 Pico_P2A: Picornaviru 92.5 0.16 3.4E-06 46.1 4.0 50 372-445 74-123 (127)
16 PF05579 Peptidase_S32: Equine 89.7 0.29 6.3E-06 49.6 3.2 28 388-420 204-231 (297)
17 COG3591 V8-like Glu-specific e 89.0 6.1 0.00013 39.9 12.0 28 389-421 200-227 (251)
18 PF10459 Peptidase_S46: Peptid 86.9 0.77 1.7E-05 52.3 4.7 61 375-449 621-684 (698)
19 PF00949 Peptidase_S7: Peptida 83.8 0.77 1.7E-05 42.0 2.3 26 389-419 94-119 (132)
20 PF12381 Peptidase_C3G: Tungro 82.4 1.4 3E-05 43.5 3.7 43 388-443 176-218 (231)
21 PF01732 DUF31: Putative pepti 64.5 4.9 0.00011 42.1 2.6 22 390-416 353-374 (374)
22 PF00548 Peptidase_C3: 3C cyst 54.3 7.9 0.00017 36.5 1.9 29 389-419 144-172 (172)
23 KOG1320 Serine protease [Postt 40.6 1.2E+02 0.0026 33.5 8.4 57 374-447 290-346 (473)
24 KOG1320 Serine protease [Postt 40.4 1.1E+02 0.0023 33.8 8.0 95 295-416 133-229 (473)
25 PF02122 Peptidase_S39: Peptid 36.7 29 0.00062 34.0 2.7 44 388-446 143-186 (203)
26 COG5480 Predicted integral mem 25.7 33 0.00072 31.7 1.0 19 118-136 41-59 (147)
No 1
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=99.71 E-value=4.6e-17 Score=175.67 Aligned_cols=102 Identities=29% Similarity=0.428 Sum_probs=85.6
Q ss_pred cccccCCCcEEEeeeccCceEEEEEEEEEEEeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccC-CCCCceEEEE
Q 012266 336 PINSLIGRQVMKVGRSSGLTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQN-GEKPRPVGII 414 (467)
Q Consensus 336 ~~~p~lG~~V~KvGRTTGlT~G~I~ai~v~y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~-d~~~~~VGLl 414 (467)
......|+.|+|+|||||+|+|+|+++.+.|+.+ |...+.+++|...++..|+.+|||||+|+.+.++ ...-.+||||
T Consensus 584 ~~~~~~G~~VfK~GrTTgyT~G~lNg~klvyw~d-G~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMl 662 (695)
T PF08192_consen 584 VSNLVPGMEVFKVGRTTGYTTGILNGIKLVYWAD-GKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGML 662 (695)
T ss_pred hhccCCCCeEEEecccCCccceEecceEEEEecC-CCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEe
Confidence 3455679999999999999999999999888876 5567899999987788899999999999965444 3345699999
Q ss_pred EeccCCCCccccccCCCCcceeeeechHHHHhhcC
Q 012266 415 WGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLLE 449 (467)
Q Consensus 415 fgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~L~ 449 (467)
++.++.. ..|.+|+||..||+.|.
T Consensus 663 hsydge~-----------kqfglftPi~~il~rl~ 686 (695)
T PF08192_consen 663 HSYDGEQ-----------KQFGLFTPINEILDRLE 686 (695)
T ss_pred eecCCcc-----------ceeeccCcHHHHHHHHH
Confidence 9986543 48999999999999884
No 2
>PRK10139 serine endoprotease; Provisional
Probab=97.99 E-value=7.3e-05 Score=80.25 Aligned_cols=89 Identities=20% Similarity=0.192 Sum_probs=61.1
Q ss_pred ccccCCCcEEEeeeccC----ceEEEEEEEEEE-EeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceE
Q 012266 337 INSLIGRQVMKVGRSSG----LTTGTVMAYALE-YNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPV 411 (467)
Q Consensus 337 ~~p~lG~~V~KvGRTTG----lT~G~I~ai~v~-y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~V 411 (467)
....+|+.|.-+|.--| .|.|.|+++.-. +.. .+ +.++|.+.. --.+|.||..++ |.++++|
T Consensus 160 ~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~-~~---~~~~iqtda----~in~GnSGGpl~-----n~~G~vI 226 (455)
T PRK10139 160 DKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL-EG---LENFIQTDA----SINRGNSGGALL-----NLNGELI 226 (455)
T ss_pred cccCCCCEEEEEecCCCCCCceEEEEEccccccccCC-CC---cceEEEECC----ccCCCCCcceEE-----CCCCeEE
Confidence 44678999999887544 578888887522 111 11 356777766 356899999999 9999999
Q ss_pred EEEEeccCCCCccccccCCCCcceeeeechHHHHh
Q 012266 412 GIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLD 446 (467)
Q Consensus 412 GLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~ 446 (467)
|+..+--... ++..+..|+-|++.+.+
T Consensus 227 Gi~~~~~~~~--------~~~~gigfaIP~~~~~~ 253 (455)
T PRK10139 227 GINTAILAPG--------GGSVGIGFAIPSNMART 253 (455)
T ss_pred EEEEEEEcCC--------CCccceEEEEEhHHHHH
Confidence 9998743211 12346788888865443
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.96 E-value=0.00013 Score=75.74 Aligned_cols=94 Identities=15% Similarity=0.177 Sum_probs=63.4
Q ss_pred ccccCCCcEEEeeeccC----ceEEEEEEEEEEEeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 012266 337 INSLIGRQVMKVGRSSG----LTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG 412 (467)
Q Consensus 337 ~~p~lG~~V~KvGRTTG----lT~G~I~ai~v~y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VG 412 (467)
..+.+|+.|.-+|...| +|.|.|+++.-......+. .+++.+.. --.+|.||..++ |.++++||
T Consensus 146 ~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~~~~---~~~iqtda----~i~~GnSGGpl~-----n~~G~vIG 213 (351)
T TIGR02038 146 RPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSSVGR---QNFIQTDA----AINAGNSGGALI-----NTNGELVG 213 (351)
T ss_pred CccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCCCCc---ceEEEECC----ccCCCCCcceEE-----CCCCeEEE
Confidence 45789999999999866 4789998875221111121 35566655 357899999999 99999999
Q ss_pred EEEeccCCCCccccccCCCCcceeeeechHHHHhhc
Q 012266 413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLL 448 (467)
Q Consensus 413 LlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~L 448 (467)
+..+.-... .+.......|+-|++.+.+.+
T Consensus 214 I~~~~~~~~------~~~~~~g~~faIP~~~~~~vl 243 (351)
T TIGR02038 214 INTASFQKG------GDEGGEGINFAIPIKLAHKIM 243 (351)
T ss_pred EEeeeeccc------CCCCccceEEEecHHHHHHHH
Confidence 987642211 011234678888887766555
No 4
>PRK10898 serine endoprotease; Provisional
Probab=97.88 E-value=0.00018 Score=74.73 Aligned_cols=93 Identities=18% Similarity=0.226 Sum_probs=60.4
Q ss_pred ccccCCCcEEEeeeccC----ceEEEEEEEE-EEEeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceE
Q 012266 337 INSLIGRQVMKVGRSSG----LTTGTVMAYA-LEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPV 411 (467)
Q Consensus 337 ~~p~lG~~V~KvGRTTG----lT~G~I~ai~-v~y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~V 411 (467)
..+..|+.|.-+|.-.| .|.|.|++.. ..+... +. .++|.+.. --.+|.||..++ |.++++|
T Consensus 146 ~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~~-~~---~~~iqtda----~i~~GnSGGPl~-----n~~G~vv 212 (353)
T PRK10898 146 RVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSPT-GR---QNFLQTDA----SINHGNSGGALV-----NSLGELM 212 (353)
T ss_pred CcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCCc-cc---cceEEecc----ccCCCCCcceEE-----CCCCeEE
Confidence 34688999999998766 5789998875 322221 11 24455555 357899999999 9999999
Q ss_pred EEEEeccCCCCccccccCCCCcceeeeechHHHHhh
Q 012266 412 GIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDL 447 (467)
Q Consensus 412 GLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~ 447 (467)
|+..+.-... ..+.......|+-|++.+.+.
T Consensus 213 GI~~~~~~~~-----~~~~~~~g~~faIP~~~~~~~ 243 (353)
T PRK10898 213 GINTLSFDKS-----NDGETPEGIGFAIPTQLATKI 243 (353)
T ss_pred EEEEEEeccc-----CCCCcccceEEEEchHHHHHH
Confidence 9987532211 001113467888777764433
No 5
>PRK10942 serine endoprotease; Provisional
Probab=97.86 E-value=0.00015 Score=78.33 Aligned_cols=88 Identities=16% Similarity=0.164 Sum_probs=58.7
Q ss_pred ccccCCCcEEEeeeccC----ceEEEEEEEEEEEeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 012266 337 INSLIGRQVMKVGRSSG----LTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG 412 (467)
Q Consensus 337 ~~p~lG~~V~KvGRTTG----lT~G~I~ai~v~y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VG 412 (467)
....+|+.|.-+|.--| .|.|.|+++.-... +...+.++|.+.. --.+|.||..++ |.++++||
T Consensus 181 ~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~---~~~~~~~~iqtda----~i~~GnSGGpL~-----n~~GeviG 248 (473)
T PRK10942 181 DALRVGDYTVAIGNPYGLGETVTSGIVSALGRSGL---NVENYENFIQTDA----AINRGNSGGALV-----NLNGELIG 248 (473)
T ss_pred cccCCCCEEEEEcCCCCCCcceeEEEEEEeecccC---CcccccceEEecc----ccCCCCCcCccC-----CCCCeEEE
Confidence 45689999999998755 48888988762210 1112346666665 246899999999 99999999
Q ss_pred EEEeccCCCCccccccCCCCcceeeeechHHH
Q 012266 413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRL 444 (467)
Q Consensus 413 LlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~V 444 (467)
+..+.-... ++.....|+-|+..+
T Consensus 249 I~t~~~~~~--------g~~~g~gfaIP~~~~ 272 (473)
T PRK10942 249 INTAILAPD--------GGNIGIGFAIPSNMV 272 (473)
T ss_pred EEEEEEcCC--------CCcccEEEEEEHHHH
Confidence 998643211 112246777777543
No 6
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=97.82 E-value=0.00016 Score=76.68 Aligned_cols=92 Identities=17% Similarity=0.177 Sum_probs=61.1
Q ss_pred ccccCCCcEEEeeeccC----ceEEEEEEEEEEEeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEE
Q 012266 337 INSLIGRQVMKVGRSSG----LTTGTVMAYALEYNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVG 412 (467)
Q Consensus 337 ~~p~lG~~V~KvGRTTG----lT~G~I~ai~v~y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VG 412 (467)
....+|+.|.-+|.--| +|.|.|+++.-..... + .+.+++.+.. --.+|.||+.++ |.++++||
T Consensus 127 ~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~~~-~--~~~~~i~tda----~i~~GnSGGpl~-----n~~G~viG 194 (428)
T TIGR02037 127 DKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGLGI-G--DYENFIQTDA----AINPGNSGGPLV-----NLRGEVIG 194 (428)
T ss_pred CCCCCCCEEEEEECCCcCCCcEEEEEEEecccCccCC-C--CccceEEECC----CCCCCCCCCceE-----CCCCeEEE
Confidence 45689999999998744 6788888875221111 1 1345666665 467899999999 99999999
Q ss_pred EEEeccCCCCccccccCCCCcceeeeechHHHHhhc
Q 012266 413 IIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLL 448 (467)
Q Consensus 413 LlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~L 448 (467)
+..+.-... ++.....++-|++.+.+.|
T Consensus 195 I~~~~~~~~--------g~~~g~~faiP~~~~~~~~ 222 (428)
T TIGR02037 195 INTAIYSPS--------GGNVGIGFAIPSNMAKNVV 222 (428)
T ss_pred EEeEEEcCC--------CCccceEEEEEhHHHHHHH
Confidence 987643211 1123568888876554433
No 7
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=97.20 E-value=0.0013 Score=55.42 Aligned_cols=21 Identities=33% Similarity=0.445 Sum_probs=18.6
Q ss_pred CCCCCCccceEEeeccCCCCCceEEE
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGI 413 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGL 413 (467)
...+|.||+.|| |.++++|||
T Consensus 100 ~~~~G~SGgpv~-----~~~G~vvGi 120 (120)
T PF13365_consen 100 DTRPGSSGGPVF-----DSDGRVVGI 120 (120)
T ss_dssp S-STTTTTSEEE-----ETTSEEEEE
T ss_pred ccCCCcEeHhEE-----CCCCEEEeC
Confidence 688999999999 899999997
No 8
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=96.73 E-value=0.0038 Score=57.58 Aligned_cols=45 Identities=24% Similarity=0.225 Sum_probs=31.4
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHHHHh
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLD 446 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~ 446 (467)
-.-.|||||.++. .++.++|++..+.... ......+|++|...++
T Consensus 174 ~~~~g~sG~pl~~-----~~~~lvGI~s~~~~c~---------~~~~~~v~~~v~~~~~ 218 (220)
T PF00089_consen 174 DACQGDSGGPLIC-----NNNYLVGIVSFGENCG---------SPNYPGVYTRVSSYLD 218 (220)
T ss_dssp BGGTTTTTSEEEE-----TTEEEEEEEEEESSSS---------BTTSEEEEEEGGGGHH
T ss_pred ccccccccccccc-----ceeeecceeeecCCCC---------CCCcCEEEEEHHHhhc
Confidence 3567999999993 3337999999883321 1223688999887654
No 9
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=95.35 E-value=0.17 Score=50.43 Aligned_cols=105 Identities=18% Similarity=0.183 Sum_probs=59.9
Q ss_pred cccccccccccccCCCCcccccccccccCceeeecccCcccccCCCcEEEeee--ccCceEEEEEEEEEEEeCCCCeEEE
Q 012266 298 RADGAFIPFAEDFNLNNVTTSVKGVGEIGDVHIIDLQSPINSLIGRQVMKVGR--SSGLTTGTVMAYALEYNDEKGICFF 375 (467)
Q Consensus 298 DaD~Ali~~a~~~d~s~vs~~I~~iG~iG~v~~v~l~g~~~p~lG~~V~KvGR--TTGlT~G~I~ai~v~y~~~~G~~~f 375 (467)
+.|+.+|+..+.+.+ .... .-...|..++.|+.+|- .+....-+|+.-...|... +..++
T Consensus 81 ~~DiviirmPkDfpP--f~~k---------------l~FR~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p~~-~~~fW 142 (235)
T PF00863_consen 81 GRDIVIIRMPKDFPP--FPQK---------------LKFRAPKEGERVCMVGSNFQEKSISSTVSESSWIYPEE-NSHFW 142 (235)
T ss_dssp CSSEEEEE--TTS------S------------------B----TT-EEEEEEEECSSCCCEEEEEEEEEEEEET-TTTEE
T ss_pred CccEEEEeCCcccCC--cchh---------------hhccCCCCCCEEEEEEEEEEcCCeeEEECCceEEeecC-CCCee
Confidence 447788887764422 1111 13477999999999996 7777888888887777633 34467
Q ss_pred EEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHHH
Q 012266 376 TDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRL 444 (467)
Q Consensus 376 ~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~V 444 (467)
+.+|-|. .||=|++++- -.++.+||+|..++... ..-+|.|+..=
T Consensus 143 kHwIsTk--------~G~CG~PlVs----~~Dg~IVGiHsl~~~~~------------~~N~F~~f~~~ 187 (235)
T PF00863_consen 143 KHWISTK--------DGDCGLPLVS----TKDGKIVGIHSLTSNTS------------SRNYFTPFPDD 187 (235)
T ss_dssp EE-C-----------TT-TT-EEEE----TTT--EEEEEEEEETTT------------SSEEEEE--TT
T ss_pred EEEecCC--------CCccCCcEEE----cCCCcEEEEEcCccCCC------------CeEEEEcCCHH
Confidence 8887654 4999999994 57899999999988755 33577776543
No 10
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=95.21 E-value=0.17 Score=52.21 Aligned_cols=90 Identities=21% Similarity=0.282 Sum_probs=60.5
Q ss_pred ccCCCcEEEeeeccC----ceEEEEEEEEEE-EeCCCCeEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEEE
Q 012266 339 SLIGRQVMKVGRSSG----LTTGTVMAYALE-YNDEKGICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGI 413 (467)
Q Consensus 339 p~lG~~V~KvGRTTG----lT~G~I~ai~v~-y~~~~G~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VGL 413 (467)
..+|+.|.-+|-..| +|.|.|+++.-. +..... +.++|.+.. .-.+|.||..++ +.++++||+
T Consensus 143 l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~~~~---~~~~IqtdA----ain~gnsGgpl~-----n~~g~~iGi 210 (347)
T COG0265 143 LRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGSAGG---YVNFIQTDA----AINPGNSGGPLV-----NIDGEVVGI 210 (347)
T ss_pred cccCCEEEEecCCCCcccceeccEEeccccccccCccc---ccchhhccc----ccCCCCCCCceE-----cCCCcEEEE
Confidence 448888888888887 566777766532 222111 566776654 578999999999 899999998
Q ss_pred EEeccCCCCccccccCCCCcceeeeechHHHHhhc
Q 012266 414 IWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLL 448 (467)
Q Consensus 414 lfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~L 448 (467)
..+.-...+ +.....|+-|+..+...+
T Consensus 211 nt~~~~~~~--------~~~gigfaiP~~~~~~v~ 237 (347)
T COG0265 211 NTAIIAPSG--------GSSGIGFAIPVNLVAPVL 237 (347)
T ss_pred EEEEecCCC--------CcceeEEEecHHHHHHHH
Confidence 877655331 122356777776655443
No 11
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=94.56 E-value=0.87 Score=42.12 Aligned_cols=30 Identities=27% Similarity=0.381 Sum_probs=21.7
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccC
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTA 419 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~ 419 (467)
-.-+||||+.++.. .+....++|++..|..
T Consensus 180 ~~c~gdsGgpl~~~--~~~~~~lvGI~s~g~~ 209 (232)
T cd00190 180 DACQGDSGGPLVCN--DNGRGVLVGIVSWGSG 209 (232)
T ss_pred ccccCCCCCcEEEE--eCCEEEEEEEEehhhc
Confidence 34569999999952 1244789999987764
No 12
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=94.47 E-value=0.84 Score=42.53 Aligned_cols=43 Identities=23% Similarity=0.259 Sum_probs=27.6
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHH
Q 012266 389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGR 443 (467)
Q Consensus 389 S~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~ 443 (467)
.-+||||+.++... + ..+++|+...|. .++ ......+|..|..
T Consensus 182 ~c~gdsG~pl~~~~--~-~~~l~Gi~s~g~-~C~--------~~~~~~~~~~i~~ 224 (229)
T smart00020 182 ACQGDSGGPLVCND--G-RWVLVGIVSWGS-GCA--------RPGKPGVYTRVSS 224 (229)
T ss_pred ccCCCCCCeeEEEC--C-CEEEEEEEEECC-CCC--------CCCCCCEEEEecc
Confidence 34599999999421 1 458999998876 441 1234456666654
No 13
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=94.22 E-value=0.11 Score=58.21 Aligned_cols=45 Identities=22% Similarity=0.212 Sum_probs=39.6
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHHHHhhcC
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDLLE 449 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~L~ 449 (467)
-+.+|-|||.|+ +-.+++|.|.-||.... ...+|-||++|+++|-
T Consensus 212 stsggssgspVv-----~i~gyAVAl~agg~~ss------------as~ffLpLdrV~RaL~ 256 (955)
T KOG1421|consen 212 STSGGSSGSPVV-----DIPGYAVALNAGGSISS------------ASDFFLPLDRVVRALR 256 (955)
T ss_pred cCCCCCCCCcee-----cccceEEeeecCCcccc------------cccceeeccchhhhhh
Confidence 377899999999 99999999999998765 5689999999998873
No 14
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=93.96 E-value=0.12 Score=47.48 Aligned_cols=29 Identities=34% Similarity=0.544 Sum_probs=25.7
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccCCC
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANR 421 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~ 421 (467)
.+.+||||..++ |+++++||+++||..+.
T Consensus 102 ~g~~GDSGRpi~-----DNsGrVVaIVLGG~neG 130 (158)
T PF00944_consen 102 VGKPGDSGRPIF-----DNSGRVVAIVLGGANEG 130 (158)
T ss_dssp S-STTSTTEEEE-----STTSBEEEEEEEEEEET
T ss_pred CCCCCCCCCccC-----cCCCCEEEEEecCCCCC
Confidence 689999999999 99999999999998753
No 15
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=92.49 E-value=0.16 Score=46.05 Aligned_cols=50 Identities=22% Similarity=0.291 Sum_probs=39.2
Q ss_pred eEEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHHHH
Q 012266 372 ICFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLL 445 (467)
Q Consensus 372 ~~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL 445 (467)
..+..+.++... +++|||-|++++ -+..++||+-||.. +...|.+|+.++
T Consensus 74 ~h~Q~~~l~g~G----p~~PGdCGg~L~------C~HGViGi~Tagg~--------------g~VaF~dir~~~ 123 (127)
T PF00947_consen 74 KHYQYNLLIGEG----PAEPGDCGGILR------CKHGVIGIVTAGGE--------------GHVAFADIRDLL 123 (127)
T ss_dssp SEEEECEEEEE-----SSSTT-TCSEEE------ETTCEEEEEEEEET--------------TEEEEEECCCGS
T ss_pred hheecCceeecc----cCCCCCCCceeE------eCCCeEEEEEeCCC--------------ceEEEEechhhh
Confidence 467888888877 799999999998 46669999999987 458888887653
No 16
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=89.68 E-value=0.29 Score=49.59 Aligned_cols=28 Identities=32% Similarity=0.481 Sum_probs=22.5
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccCC
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTAN 420 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~ 420 (467)
|+.+|||||.|+ .+++.+||+|-|.+..
T Consensus 204 fT~~GDSGSPVV-----t~dg~liGVHTGSn~~ 231 (297)
T PF05579_consen 204 FTGPGDSGSPVV-----TEDGDLIGVHTGSNKR 231 (297)
T ss_dssp SS-GGCTT-EEE-----ETTC-EEEEEEEEETT
T ss_pred EcCCCCCCCccC-----cCCCCEEEEEecCCCc
Confidence 899999999999 7899999999997753
No 17
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=89.04 E-value=6.1 Score=39.87 Aligned_cols=28 Identities=32% Similarity=0.542 Sum_probs=24.3
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEeccCCC
Q 012266 389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTANR 421 (467)
Q Consensus 389 S~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~ 421 (467)
..||+|||.|+ ..+.+++|++++|-.-.
T Consensus 200 T~pG~SGSpv~-----~~~~~vigv~~~g~~~~ 227 (251)
T COG3591 200 TLPGSSGSPVL-----ISKDEVIGVHYNGPGAN 227 (251)
T ss_pred ccCCCCCCceE-----ecCceEEEEEecCCCcc
Confidence 67899999999 67779999999998743
No 18
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=86.87 E-value=0.77 Score=52.34 Aligned_cols=61 Identities=31% Similarity=0.405 Sum_probs=45.6
Q ss_pred EEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCC---CCcceeeeechHHHHhhcC
Q 012266 375 FTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQ---PPVNWTSGVDLGRLLDLLE 449 (467)
Q Consensus 375 f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~---~~~~~t~~~pI~~VL~~L~ 449 (467)
--||+-+.+ .-+|-|||.++ |.++++|||.|-|+-.. +.++- +..+.++..+|.-||-.|+
T Consensus 621 pv~FlstnD-----itGGNSGSPvl-----N~~GeLVGl~FDgn~Es----l~~D~~fdp~~~R~I~VDiRyvL~~ld 684 (698)
T PF10459_consen 621 PVNFLSTND-----ITGGNSGSPVL-----NAKGELVGLAFDGNWES----LSGDIAFDPELNRTIHVDIRYVLWALD 684 (698)
T ss_pred eeEEEeccC-----cCCCCCCCccC-----CCCceEEEEeecCchhh----cccccccccccceeEEEEHHHHHHHHH
Confidence 345655555 67899999999 99999999999998543 22222 1357789999999988763
No 19
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=83.79 E-value=0.77 Score=41.97 Aligned_cols=26 Identities=31% Similarity=0.391 Sum_probs=21.5
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEeccC
Q 012266 389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTA 419 (467)
Q Consensus 389 S~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~ 419 (467)
-.+|-|||++| |.++++|||++++-.
T Consensus 94 ~~~GsSGSpi~-----n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 94 FPKGSSGSPIF-----NQNGEIVGLYGNGVE 119 (132)
T ss_dssp S-TTGTT-EEE-----ETTSCEEEEEEEEEE
T ss_pred cCCCCCCCceE-----cCCCcEEEEEcccee
Confidence 34699999999 999999999999864
No 20
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=82.44 E-value=1.4 Score=43.49 Aligned_cols=43 Identities=26% Similarity=0.319 Sum_probs=31.6
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHH
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGR 443 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~ 443 (467)
.+..||=||+++.- +....-.++|||.+|+.+.| ..|+++|..
T Consensus 176 ~t~~GdCGs~i~~~-~t~~~RKIvGiHVAG~~~~~------------~gYAe~itQ 218 (231)
T PF12381_consen 176 PTMNGDCGSPIVRN-NTQMVRKIVGIHVAGSANHA------------MGYAESITQ 218 (231)
T ss_pred CCcCCCccceeeEc-chhhhhhhheeeeccccccc------------ceehhhhhH
Confidence 69999999999962 11223569999999998653 477777653
No 21
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=64.47 E-value=4.9 Score=42.11 Aligned_cols=22 Identities=36% Similarity=0.673 Sum_probs=20.5
Q ss_pred CCCCccceEEeeccCCCCCceEEEEEe
Q 012266 390 LEGDSGSLILLTGQNGEKPRPVGIIWG 416 (467)
Q Consensus 390 ~~GDSGSlvl~~~~~d~~~~~VGLlfg 416 (467)
.+|=|||+|+ ++++++||++||
T Consensus 353 ~gGaSGS~V~-----n~~~~lvGIy~g 374 (374)
T PF01732_consen 353 GGGASGSMVI-----NQNNELVGIYFG 374 (374)
T ss_pred CCCCCcCeEE-----CCCCCEEEEeCC
Confidence 4799999999 999999999997
No 22
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=54.29 E-value=7.9 Score=36.51 Aligned_cols=29 Identities=24% Similarity=0.345 Sum_probs=23.3
Q ss_pred CCCCCccceEEeeccCCCCCceEEEEEeccC
Q 012266 389 DLEGDSGSLILLTGQNGEKPRPVGIIWGGTA 419 (467)
Q Consensus 389 S~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~ 419 (467)
+.+||-||+++.. .+....++|||.||++
T Consensus 144 t~~G~CG~~l~~~--~~~~~~i~GiHvaG~G 172 (172)
T PF00548_consen 144 TKPGMCGSPLVSR--IGGQGKIIGIHVAGNG 172 (172)
T ss_dssp EETTGTTEEEEES--CGGTTEEEEEEEEEES
T ss_pred CCCCccCCeEEEe--eccCccEEEEEeccCC
Confidence 6789999999942 3347889999999863
No 23
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=40.59 E-value=1.2e+02 Score=33.49 Aligned_cols=57 Identities=11% Similarity=0.110 Sum_probs=37.8
Q ss_pred EEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHHHHhh
Q 012266 374 FFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLDL 447 (467)
Q Consensus 374 ~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~~ 447 (467)
+..+.+-+.. -...|-||-+++ +.++.+||+.+.--... .+. ..-++.-|++.|+..
T Consensus 290 ~i~~~~qtd~----ai~~~nsg~~ll-----~~DG~~IgVn~~~~~ri---~~~-----~~iSf~~p~d~vl~~ 346 (473)
T KOG1320|consen 290 LISKINQTDA----AINPGNSGGPLL-----NLDGEVIGVNTRKVTRI---GFS-----HGISFKIPIDTVLVI 346 (473)
T ss_pred eeeeecccch----hhhcccCCCcEE-----EecCcEeeeeeeeeEEe---ecc-----ccceeccCchHhhhh
Confidence 3444444544 578899999999 89999999887765421 111 134677788777653
No 24
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=40.40 E-value=1.1e+02 Score=33.84 Aligned_cols=95 Identities=15% Similarity=0.127 Sum_probs=52.2
Q ss_pred ccccccccccccccccCCCCcccc-cccccccCceeeecccCcccccCCCcEEEe-eeccCceEEEEEEEEEEEeCCCCe
Q 012266 295 TFVRADGAFIPFAEDFNLNNVTTS-VKGVGEIGDVHIIDLQSPINSLIGRQVMKV-GRSSGLTTGTVMAYALEYNDEKGI 372 (467)
Q Consensus 295 N~vDaD~Ali~~a~~~d~s~vs~~-I~~iG~iG~v~~v~l~g~~~p~lG~~V~Kv-GRTTGlT~G~I~ai~v~y~~~~G~ 372 (467)
++..||.|++.+. ....|-.. -...|.+ |.+...|+=+ |-+.=+|.|.|.++..+-...++.
T Consensus 133 ~~~~cd~Avv~Ie---~~~f~~~~~~~e~~~i-------------p~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~ 196 (473)
T KOG1320|consen 133 VFEECDLAVVYIE---SEEFWKGMNPFELGDI-------------PSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSST 196 (473)
T ss_pred hhhcccceEEEEe---eccccCCCcccccCCC-------------cccCccEEEEcCCcEEEEeeEEEEEEeccccCCCc
Confidence 4678888986654 22233211 1233332 2333333322 555668999999998553322345
Q ss_pred EEEEEEEEEcCCCCCCCCCCCccceEEeeccCCCCCceEEEEEe
Q 012266 373 CFFTDFLVVGENQQTFDLEGDSGSLILLTGQNGEKPRPVGIIWG 416 (467)
Q Consensus 373 ~~f~dqiit~~~~~~fS~~GDSGSlvl~~~~~d~~~~~VGLlfg 416 (467)
.+..-||.+. ..+|.||-..+ ...+...|+.|-
T Consensus 197 ~l~~vqi~aa------~~~~~s~ep~i-----~g~d~~~gvA~l 229 (473)
T KOG1320|consen 197 VLLRVQIDAA------IGPGNSGEPVI-----VGVDKVAGVAFL 229 (473)
T ss_pred ceeeEEEEEe------ecCCccCCCeE-----EccccccceEEE
Confidence 5555665554 34678888887 344555555554
No 25
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=36.66 E-value=29 Score=33.99 Aligned_cols=44 Identities=20% Similarity=0.148 Sum_probs=15.2
Q ss_pred CCCCCCccceEEeeccCCCCCceEEEEEeccCCCCccccccCCCCcceeeeechHHHHh
Q 012266 388 FDLEGDSGSLILLTGQNGEKPRPVGIIWGGTANRGRLKLKVGQPPVNWTSGVDLGRLLD 446 (467)
Q Consensus 388 fS~~GDSGSlvl~~~~~d~~~~~VGLlfgG~~~~g~~~~~~~~~~~~~t~~~pI~~VL~ 446 (467)
-+.+|+||+.++ ..+ ++||+|-|... +...++|-+..||..+..
T Consensus 143 ~T~~G~SGtp~y-----~g~-~vvGvH~G~~~---------~~~~~n~n~~spip~~~g 186 (203)
T PF02122_consen 143 NTSPGWSGTPYY-----SGK-NVVGVHTGSPS---------GSNRENNNRMSPIPPIPG 186 (203)
T ss_dssp ---TT-TT-EEE------SS--EEEEEEEE-----------------------------
T ss_pred CCCCCCCCCCeE-----ECC-CceEeecCccc---------cccccccccccccccccc
Confidence 378999999999 445 99999999511 122457777778777653
No 26
>COG5480 Predicted integral membrane protein [Function unknown]
Probab=25.71 E-value=33 Score=31.71 Aligned_cols=19 Identities=37% Similarity=0.679 Sum_probs=16.2
Q ss_pred cCCcEEEEEEEeeeCCccc
Q 012266 118 RRFSLGTAIGFRIRRGVLT 136 (467)
Q Consensus 118 ~pnVvGvgiGyK~~~G~~T 136 (467)
.-+++|++||||.++|-.|
T Consensus 41 t~~~v~vAiGyr~~ngwvt 59 (147)
T COG5480 41 TQTLVGVAIGYRAKNGWVT 59 (147)
T ss_pred hhhhhheeeeeecCCCcee
Confidence 4578999999999999655
Done!