Query 008706
Match_columns 557
No_of_seqs 32 out of 34
Neff 2.3
Searched_HMMs 46136
Date Thu Mar 28 15:33:33 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/008706.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/008706hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 TIGR02038 protease_degS peripl 99.1 5.9E-10 1.3E-14 111.1 12.1 180 24-234 50-249 (351)
2 TIGR02037 degP_htrA_DO peripla 99.1 3.5E-10 7.6E-15 114.2 10.4 161 42-234 58-228 (428)
3 PF13365 Trypsin_2: Trypsin-li 99.0 2.6E-10 5.7E-15 91.1 3.9 109 44-181 1-118 (120)
4 PRK10898 serine endoprotease; 99.0 6.3E-09 1.4E-13 104.2 12.4 178 24-231 50-247 (353)
5 COG3591 V8-like Glu-specific e 98.9 6.7E-09 1.4E-13 102.3 12.0 165 46-232 68-250 (251)
6 PRK10942 serine endoprotease; 98.7 2.2E-08 4.7E-13 104.3 8.3 159 43-233 112-281 (473)
7 PRK10139 serine endoprotease; 98.7 1.6E-08 3.5E-13 104.8 7.2 131 42-186 90-231 (455)
8 COG0265 DegQ Trypsin-like seri 98.0 2.5E-05 5.5E-10 76.9 8.7 169 42-246 72-253 (347)
9 PF00089 Trypsin: Trypsin; In 97.8 5.3E-05 1.1E-09 65.6 6.9 173 30-228 13-220 (220)
10 cd00190 Tryp_SPc Trypsin-like 96.4 0.033 7.2E-07 48.6 9.7 149 28-181 11-198 (232)
11 smart00020 Tryp_SPc Trypsin-li 96.0 0.067 1.5E-06 47.3 9.7 156 27-189 11-205 (229)
12 PF05416 Peptidase_C37: Southa 90.8 0.082 1.8E-06 57.3 0.5 127 34-189 374-524 (535)
13 PF10459 Peptidase_S46: Peptid 79.7 1.2 2.7E-05 50.2 2.6 41 24-66 31-72 (698)
14 KOG1421 Predicted signaling-as 78.7 2.8 6.1E-05 48.3 4.9 42 20-61 61-105 (955)
15 PF04152 Mre11_DNA_bind: Mre11 55.0 12 0.00025 35.2 3.0 31 220-250 43-80 (175)
16 PF00949 Peptidase_S7: Peptida 53.4 10 0.00022 35.3 2.3 24 157-180 87-110 (132)
17 PF00548 Peptidase_C3: 3C cyst 42.6 66 0.0014 30.3 5.9 127 44-188 27-167 (172)
18 PF10459 Peptidase_S46: Peptid 34.1 22 0.00048 40.6 1.7 26 161-186 627-652 (698)
19 PF08192 Peptidase_S64: Peptid 29.5 3E+02 0.0064 32.3 9.3 54 85-138 534-599 (695)
20 PF10385 RNA_pol_Rpb2_45: RNA 28.2 70 0.0015 26.3 3.2 35 144-181 3-40 (66)
21 PRK08903 DnaA regulatory inact 26.1 27 0.00059 32.5 0.5 26 214-239 170-196 (227)
22 TIGR00583 mre11 DNA repair pro 22.9 54 0.0012 35.1 2.1 14 237-250 352-365 (405)
23 KOG1320 Serine protease [Postt 22.7 2.1E+02 0.0045 32.0 6.4 130 43-186 88-232 (473)
No 1
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.11 E-value=5.9e-10 Score=111.07 Aligned_cols=180 Identities=14% Similarity=0.221 Sum_probs=121.3
Q ss_pred hhcccCceeEEEEecc----------CCccceeEEEecc-eeeccCcCcCcHHHHhhhhhhcccccccccCCceeeeccc
Q 008706 24 IFSGKGLAMARISVAA----------SAVSGTGFLIHRN-LLLTTHVNLPSVAAAETAEIRLQNGVAAALVPHRFFITSS 92 (557)
Q Consensus 24 ifs~k~~AvArI~~~~----------~gG~GTGFLIspn-LLLTNnhvLpSaaaA~~Aev~lq~g~~a~L~P~RFFITs~ 92 (557)
++..-.+||-+|.... ..+.||||+|+++ ++|||+||+..+ ...+|.|.+|.. .+-+.--.|+
T Consensus 50 ~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~---~~i~V~~~dg~~---~~a~vv~~d~ 123 (351)
T TIGR02038 50 AVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA---DQIVVALQDGRK---FEAELVGSDP 123 (351)
T ss_pred HHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC---CEEEEEECCCCE---EEEEEEEecC
Confidence 3444557888876421 1257999999977 999999999754 346678888764 2334455789
Q ss_pred cceeEEEeeccCCCCCCCCCCCCCcccccCCCCccccceEEEeecCCccceeeccCcEEEe---------ecCceeeecC
Q 008706 93 VLDLTIVGLDSADGDSNAPGQQPHHLKTCSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIA---------TDNLIKLSTD 163 (557)
Q Consensus 93 ~LDfTiVAvd~v~~d~~s~Gq~ph~Lk~~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~---------~DnfIhysTD 163 (557)
..||.|+-++..+ ..++++.....+.+|+.|+.||+|.--..++-.|.|... ..+||.....
T Consensus 124 ~~DlAvlkv~~~~---------~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~~~~~~~iqtda~ 194 (351)
T TIGR02038 124 LTDLAVLKIEGDN---------LPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSSVGRQNFIQTDAA 194 (351)
T ss_pred CCCEEEEEecCCC---------CceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCCCCcceEEEECCc
Confidence 9999999997532 234666655579999999999999322334445544322 1345555555
Q ss_pred CCcCCCCCcccccCCCeeEEEeccccccCCCCCCCCCCcCCCCCccccccccccCcchhHHhHHHhhhcCC
Q 008706 164 GIIWSPGSAGFDVQGNLAFMICDPMKLATSPNTKSSSTSSSSSSSWKKDSSMQFGIPIPIICDWLNQHWEG 234 (557)
Q Consensus 164 t~~wSSGSAgFn~qgnlafmVc~p~~lA~sP~~~~sstSssss~s~kk~~i~q~GI~IssI~~wl~qhw~g 234 (557)
-.+|.||.|.||.+|++.=|+.+-.. . .... ...-.+|.|||..+.+.|.+-.++
T Consensus 195 i~~GnSGGpl~n~~G~vIGI~~~~~~--~---~~~~-----------~~~g~~faIP~~~~~~vl~~l~~~ 249 (351)
T TIGR02038 195 INAGNSGGALINTNGELVGINTASFQ--K---GGDE-----------GGEGINFAIPIKLAHKIMGKIIRD 249 (351)
T ss_pred cCCCCCcceEECCCCeEEEEEeeeec--c---cCCC-----------CccceEEEecHHHHHHHHHHHhhc
Confidence 67899999999999998877654211 0 0000 123468999999998888765433
No 2
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.10 E-value=3.5e-10 Score=114.17 Aligned_cols=161 Identities=17% Similarity=0.245 Sum_probs=114.9
Q ss_pred ccceeEEEecc-eeeccCcCcCcHHHHhhhhhhcccccccccCCceeeeccccceeEEEeeccCCCCCCCCCCCCCcccc
Q 008706 42 VSGTGFLIHRN-LLLTTHVNLPSVAAAETAEIRLQNGVAAALVPHRFFITSSVLDLTIVGLDSADGDSNAPGQQPHHLKT 120 (557)
Q Consensus 42 G~GTGFLIspn-LLLTNnhvLpSaaaA~~Aev~lq~g~~a~L~P~RFFITs~~LDfTiVAvd~v~~d~~s~Gq~ph~Lk~ 120 (557)
+.||||+|+++ ++|||+||+..+. ..+|.+.+|... +-+..-.|+..||.|+-++.. ....++++
T Consensus 58 ~~GSGfii~~~G~IlTn~Hvv~~~~---~i~V~~~~~~~~---~a~vv~~d~~~DlAllkv~~~--------~~~~~~~l 123 (428)
T TIGR02037 58 GLGSGVIISADGYILTNNHVVDGAD---EITVTLSDGREF---KAKLVGKDPRTDIAVLKIDAK--------KNLPVIKL 123 (428)
T ss_pred ceeeEEEECCCCEEEEcHHHcCCCC---eEEEEeCCCCEE---EEEEEEecCCCCEEEEEecCC--------CCceEEEc
Confidence 57999999986 9999999998753 456777777642 223344689999999999742 23445777
Q ss_pred cCCCCccccceEEEeecCCccceeeccCcEEEe---------ecCceeeecCCCcCCCCCcccccCCCeeEEEecccccc
Q 008706 121 CSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIA---------TDNLIKLSTDGIIWSPGSAGFDVQGNLAFMICDPMKLA 191 (557)
Q Consensus 121 ~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~---------~DnfIhysTDt~~wSSGSAgFn~qgnlafmVc~p~~lA 191 (557)
.....+.+|+.|+.||+|.-...++-.|.|... ..+||...+...+|.||++.||.+|++.=|..+-..
T Consensus 124 ~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~-- 201 (428)
T TIGR02037 124 GDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGLGIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYS-- 201 (428)
T ss_pred cCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCccCCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEc--
Confidence 755689999999999999444455556665432 234666666678899999999999998776543211
Q ss_pred CCCCCCCCCCcCCCCCccccccccccCcchhHHhHHHhhhcCC
Q 008706 192 TSPNTKSSSTSSSSSSSWKKDSSMQFGIPIPIICDWLNQHWEG 234 (557)
Q Consensus 192 ~sP~~~~sstSssss~s~kk~~i~q~GI~IssI~~wl~qhw~g 234 (557)
+. . . ..-..|.|||..|.++|.+.-++
T Consensus 202 --~~-g--~-----------~~g~~faiP~~~~~~~~~~l~~~ 228 (428)
T TIGR02037 202 --PS-G--G-----------NVGIGFAIPSNMAKNVVDQLIEG 228 (428)
T ss_pred --CC-C--C-----------ccceEEEEEhHHHHHHHHHHHhc
Confidence 00 0 0 12457899999999999886654
No 3
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.01 E-value=2.6e-10 Score=91.11 Aligned_cols=109 Identities=21% Similarity=0.264 Sum_probs=62.3
Q ss_pred ceeEEEecc-eeeccCcCcCcHHHHhh---hhh--hcccccccccCC--ceeeecccc-ceeEEEeeccCCCCCCCCCCC
Q 008706 44 GTGFLIHRN-LLLTTHVNLPSVAAAET---AEI--RLQNGVAAALVP--HRFFITSSV-LDLTIVGLDSADGDSNAPGQQ 114 (557)
Q Consensus 44 GTGFLIspn-LLLTNnhvLpSaaaA~~---Aev--~lq~g~~a~L~P--~RFFITs~~-LDfTiVAvd~v~~d~~s~Gq~ 114 (557)
||||+|.++ ++||++||+........ .++ .+..+.. .+ -+..-.+.. +||.|+-|+
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~D~All~v~------------ 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRR---VPPVAEVVYFDPDDYDLALLKVD------------ 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCE---EETEEEEEEEETT-TTEEEEEES------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCE---EeeeEEEEEECCccccEEEEEEe------------
Confidence 899999999 99999999998776532 222 2222221 22 444445555 999999997
Q ss_pred CCcccccCCCCccccceEEEeecCCccceeeccCcEEEeecCceeeecCCCcCCCCCcccccCCCee
Q 008706 115 PHHLKTCSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIATDNLIKLSTDGIIWSPGSAGFDVQGNLA 181 (557)
Q Consensus 115 ph~Lk~~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~~DnfIhysTDt~~wSSGSAgFn~qgnla 181 (557)
+ ........+....+.....-...++.+.. | |.+++.+|+||+|+||.+|++.
T Consensus 66 ~--------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~-~~~~~~~G~SGgpv~~~~G~vv 118 (120)
T PF13365_consen 66 P--------WTGVGGGVRVPGSTSGVSPTSTNDNRMLY-----I-TDADTRPGSSGGPVFDSDGRVV 118 (120)
T ss_dssp C--------EEEEEEEEEEEEEEEEEEEEEEEETEEEE-----E-ESSS-STTTTTSEEEETTSEEE
T ss_pred c--------ccceeeeeEeeeeccccccccCcccceeE-----e-eecccCCCcEeHhEECCCCEEE
Confidence 0 00111111111122111111111122222 5 8999999999999999998864
No 4
>PRK10898 serine endoprotease; Provisional
Probab=98.95 E-value=6.3e-09 Score=104.23 Aligned_cols=178 Identities=14% Similarity=0.209 Sum_probs=119.0
Q ss_pred hhcccCceeEEEEeccC----C------ccceeEEEecc-eeeccCcCcCcHHHHhhhhhhcccccccccCCceeeeccc
Q 008706 24 IFSGKGLAMARISVAAS----A------VSGTGFLIHRN-LLLTTHVNLPSVAAAETAEIRLQNGVAAALVPHRFFITSS 92 (557)
Q Consensus 24 ifs~k~~AvArI~~~~~----g------G~GTGFLIspn-LLLTNnhvLpSaaaA~~Aev~lq~g~~a~L~P~RFFITs~ 92 (557)
++..-.+||-.|..... + +.||||+|+++ ++|||+||+..+. .-.|.|.+|... +-+.--.|+
T Consensus 50 ~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~---~i~V~~~dg~~~---~a~vv~~d~ 123 (353)
T PRK10898 50 AVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDAD---QIIVALQDGRVF---EALLVGSDS 123 (353)
T ss_pred HHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCCC---EEEEEeCCCCEE---EEEEEEEcC
Confidence 34444577777764321 1 57999999975 9999999998643 356778777642 233445689
Q ss_pred cceeEEEeeccCCCCCCCCCCCCCcccccCCCCccccceEEEeecCCccceeeccCcEEE---------eecCceeeecC
Q 008706 93 VLDLTIVGLDSADGDSNAPGQQPHHLKTCSKPNLDLGSIVYLLGYMEEKELMVGEGKVAI---------ATDNLIKLSTD 163 (557)
Q Consensus 93 ~LDfTiVAvd~v~~d~~s~Gq~ph~Lk~~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~---------~~DnfIhysTD 163 (557)
..||.|+-++.. +..++++.....+.+||.|+.||+|.--..++-.|-|.. ...+||....-
T Consensus 124 ~~DlAvl~v~~~---------~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~~~~~~~iqtda~ 194 (353)
T PRK10898 124 LTDLAVLKINAT---------NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSPTGRQNFLQTDAS 194 (353)
T ss_pred CCCEEEEEEcCC---------CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCCccccceEEeccc
Confidence 999999999752 133466665557889999999999932233444454431 12367777777
Q ss_pred CCcCCCCCcccccCCCeeEEEeccccccCCCCCCCCCCcCCCCCccccccccccCcchhHHhHHHhhh
Q 008706 164 GIIWSPGSAGFDVQGNLAFMICDPMKLATSPNTKSSSTSSSSSSSWKKDSSMQFGIPIPIICDWLNQH 231 (557)
Q Consensus 164 t~~wSSGSAgFn~qgnlafmVc~p~~lA~sP~~~~sstSssss~s~kk~~i~q~GI~IssI~~wl~qh 231 (557)
-.+|.||.|.||.+|++-=|..+-.. .... . ....-..|-|||..+...+.+-
T Consensus 195 i~~GnSGGPl~n~~G~vvGI~~~~~~--~~~~---~----------~~~~g~~faIP~~~~~~~~~~l 247 (353)
T PRK10898 195 INHGNSGGALVNSLGELMGINTLSFD--KSND---G----------ETPEGIGFAIPTQLATKIMDKL 247 (353)
T ss_pred cCCCCCcceEECCCCeEEEEEEEEec--ccCC---C----------CcccceEEEEchHHHHHHHHHH
Confidence 78999999999999999776553211 0000 0 0012367889999988888774
No 5
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.95 E-value=6.7e-09 Score=102.31 Aligned_cols=165 Identities=18% Similarity=0.135 Sum_probs=112.5
Q ss_pred eEEEecceeeccCcCcCcHHHHhh--hhh---hccccc----cc----ccCCceeeeccccceeEEEeeccCCCCCCCCC
Q 008706 46 GFLIHRNLLLTTHVNLPSVAAAET--AEI---RLQNGV----AA----ALVPHRFFITSSVLDLTIVGLDSADGDSNAPG 112 (557)
Q Consensus 46 GFLIspnLLLTNnhvLpSaaaA~~--Aev---~lq~g~----~a----~L~P~RFFITs~~LDfTiVAvd~v~~d~~s~G 112 (557)
+|||.||++|||.|++-|-.--+. +-+ .-.+|. .. .-.| =+++.+..+...|+.-..+ .+..+|
T Consensus 68 ~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~--g~~~~~d~~~~~v~~~~~~-~g~~~~ 144 (251)
T COG3591 68 ATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYP--GELYKEDGASYDVGEAALE-SGINIG 144 (251)
T ss_pred EEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecC--CceeccCCceeeccHHHhc-cCCCcc
Confidence 399999999999999988764221 111 111122 11 1244 2455666666666653333 234555
Q ss_pred CCCCcccccCCCCccccceEEEeecCCccc----eeeccCcEEEeecCceeeecCCCcCCCCCcccccCCCeeEEEeccc
Q 008706 113 QQPHHLKTCSKPNLDLGSIVYLLGYMEEKE----LMVGEGKVAIATDNLIKLSTDGIIWSPGSAGFDVQGNLAFMICDPM 188 (557)
Q Consensus 113 q~ph~Lk~~~kpkl~lGE~VsIIqHP~pK~----laIrEnKVv~~~DnfIhysTDt~~wSSGSAgFn~qgnlafmVc~p~ 188 (557)
--..||+.--...+++|+.|.++|+|..|. --.-.++|-.+..+++-|..||.+|+|||++|+-.. -|+.-|
T Consensus 145 ~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~~~l~y~~dT~pG~SGSpv~~~~~----~vigv~ 220 (251)
T COG3591 145 DVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKGNKLFYDADTLPGSSGSPVLISKD----EVIGVH 220 (251)
T ss_pred ccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEecceEEEEecccCCCCCCceEecCc----eEEEEE
Confidence 566666766667889999999999995554 223467777888899999999999999999999885 566666
Q ss_pred cccCCCCCCCCCCcCCCCCccccccccccCcch-hHHhHHHhhhc
Q 008706 189 KLATSPNTKSSSTSSSSSSSWKKDSSMQFGIPI-PIICDWLNQHW 232 (557)
Q Consensus 189 ~lA~sP~~~~sstSssss~s~kk~~i~q~GI~I-ssI~~wl~qhw 232 (557)
- ++...-+ .+..|+++++ +.|.+||.|..
T Consensus 221 ~-~g~~~~~--------------~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 221 Y-NGPGANG--------------GSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred e-cCCCccc--------------ccccCcceEecHHHHHHHHHhh
Confidence 5 3332222 3578999997 67888887753
No 6
>PRK10942 serine endoprotease; Provisional
Probab=98.74 E-value=2.2e-08 Score=104.27 Aligned_cols=159 Identities=17% Similarity=0.249 Sum_probs=107.9
Q ss_pred cceeEEEec--ceeeccCcCcCcHHHHhhhhhhcccccccccCCceeeeccccceeEEEeeccCCCCCCCCCCCCCcccc
Q 008706 43 SGTGFLIHR--NLLLTTHVNLPSVAAAETAEIRLQNGVAAALVPHRFFITSSVLDLTIVGLDSADGDSNAPGQQPHHLKT 120 (557)
Q Consensus 43 ~GTGFLIsp--nLLLTNnhvLpSaaaA~~Aev~lq~g~~a~L~P~RFFITs~~LDfTiVAvd~v~~d~~s~Gq~ph~Lk~ 120 (557)
.|+||+|++ .++|||+||+..+. .-.|.|.+|... +-+.--+|...||.|+-+++. + ...++++
T Consensus 112 ~GSG~ii~~~~G~IlTn~HVv~~a~---~i~V~~~dg~~~---~a~vv~~D~~~DlAvlki~~~--~------~l~~~~l 177 (473)
T PRK10942 112 LGSGVIIDADKGYVVTNNHVVDNAT---KIKVQLSDGRKF---DAKVVGKDPRSDIALIQLQNP--K------NLTAIKM 177 (473)
T ss_pred eEEEEEEECCCCEEEeChhhcCCCC---EEEEEECCCCEE---EEEEEEecCCCCEEEEEecCC--C------CCceeEe
Confidence 699999985 59999999987653 456788887642 223334689999999998631 1 2345777
Q ss_pred cCCCCccccceEEEeecCCccceeeccCcEEEe---------ecCceeeecCCCcCCCCCcccccCCCeeEEEecccccc
Q 008706 121 CSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIA---------TDNLIKLSTDGIIWSPGSAGFDVQGNLAFMICDPMKLA 191 (557)
Q Consensus 121 ~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~---------~DnfIhysTDt~~wSSGSAgFn~qgnlafmVc~p~~lA 191 (557)
....++.+|++|+.||+|.--..++-.|.|.-. .++||...+.-.+|.||.+.||.+|++-=|.++-..
T Consensus 178 g~s~~l~~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~-- 255 (473)
T PRK10942 178 ADSDALRVGDYTVAIGNPYGLGETVTSGIVSALGRSGLNVENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILA-- 255 (473)
T ss_pred cCccccCCCCEEEEEcCCCCCCcceeEEEEEEeecccCCcccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEc--
Confidence 766689999999999999222223444555421 245555555555999999999999998766554221
Q ss_pred CCCCCCCCCCcCCCCCccccccccccCcchhHHhHHHhhhcC
Q 008706 192 TSPNTKSSSTSSSSSSSWKKDSSMQFGIPIPIICDWLNQHWE 233 (557)
Q Consensus 192 ~sP~~~~sstSssss~s~kk~~i~q~GI~IssI~~wl~qhw~ 233 (557)
++ . . ..-..|-||+..+.+++.|--+
T Consensus 256 --~~-g--~-----------~~g~gfaIP~~~~~~v~~~l~~ 281 (473)
T PRK10942 256 --PD-G--G-----------NIGIGFAIPSNMVKNLTSQMVE 281 (473)
T ss_pred --CC-C--C-----------cccEEEEEEHHHHHHHHHHHHh
Confidence 00 0 0 1345678899888888777644
No 7
>PRK10139 serine endoprotease; Provisional
Probab=98.74 E-value=1.6e-08 Score=104.82 Aligned_cols=131 Identities=15% Similarity=0.227 Sum_probs=94.9
Q ss_pred ccceeEEEec--ceeeccCcCcCcHHHHhhhhhhcccccccccCCceeeeccccceeEEEeeccCCCCCCCCCCCCCccc
Q 008706 42 VSGTGFLIHR--NLLLTTHVNLPSVAAAETAEIRLQNGVAAALVPHRFFITSSVLDLTIVGLDSADGDSNAPGQQPHHLK 119 (557)
Q Consensus 42 G~GTGFLIsp--nLLLTNnhvLpSaaaA~~Aev~lq~g~~a~L~P~RFFITs~~LDfTiVAvd~v~~d~~s~Gq~ph~Lk 119 (557)
+.|+||+|.+ .++|||+||+..+. .-+|.|.+|.. .+-+.--+|+..||.|+-++... ...+++
T Consensus 90 ~~GSG~ii~~~~g~IlTn~HVv~~a~---~i~V~~~dg~~---~~a~vvg~D~~~DlAvlkv~~~~--------~l~~~~ 155 (455)
T PRK10139 90 GLGSGVIIDAAKGYVLTNNHVINQAQ---KISIQLNDGRE---FDAKLIGSDDQSDIALLQIQNPS--------KLTQIA 155 (455)
T ss_pred ceEEEEEEECCCCEEEeChHHhCCCC---EEEEEECCCCE---EEEEEEEEcCCCCEEEEEecCCC--------CCceeE
Confidence 5799999974 69999999998653 45678888764 23345567999999999997421 233467
Q ss_pred ccCCCCccccceEEEeecCCccceeeccCcEEEe---------ecCceeeecCCCcCCCCCcccccCCCeeEEEec
Q 008706 120 TCSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIA---------TDNLIKLSTDGIIWSPGSAGFDVQGNLAFMICD 186 (557)
Q Consensus 120 ~~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~---------~DnfIhysTDt~~wSSGSAgFn~qgnlafmVc~ 186 (557)
+..-.++.+|++|+.||+|---..++-.|-|.-. .++||.-.+--.+|.||.|.||.+|++.=|..+
T Consensus 156 lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~ 231 (455)
T PRK10139 156 IADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNLEGLENFIQTDASINRGNSGGALLNLNGELIGINTA 231 (455)
T ss_pred ecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCCCCcceEEEECCccCCCCCcceEECCCCeEEEEEEE
Confidence 7655689999999999999333445555644311 235665555567899999999999998766554
No 8
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=97.98 E-value=2.5e-05 Score=76.92 Aligned_cols=169 Identities=21% Similarity=0.313 Sum_probs=111.5
Q ss_pred ccceeEEEe-cceeeccCcCcCcHHHHhhhhhhcccccccccCCceeeeccccceeEEEeeccCCCCCCCCCCCCCcccc
Q 008706 42 VSGTGFLIH-RNLLLTTHVNLPSVAAAETAEIRLQNGVAAALVPHRFFITSSVLDLTIVGLDSADGDSNAPGQQPHHLKT 120 (557)
Q Consensus 42 G~GTGFLIs-pnLLLTNnhvLpSaaaA~~Aev~lq~g~~a~L~P~RFFITs~~LDfTiVAvd~v~~d~~s~Gq~ph~Lk~ 120 (557)
+.|+||++. ...++||+||+..+ ....+.+.+|+. ++-++.-.|+.-|+.++-++.... .-++.+
T Consensus 72 ~~gSg~i~~~~g~ivTn~hVi~~a---~~i~v~l~dg~~---~~a~~vg~d~~~dlavlki~~~~~--------~~~~~~ 137 (347)
T COG0265 72 GLGSGFIISSDGYIVTNNHVIAGA---EEITVTLADGRE---VPAKLVGKDPISDLAVLKIDGAGG--------LPVIAL 137 (347)
T ss_pred ccccEEEEcCCeEEEecceecCCc---ceEEEEeCCCCE---EEEEEEecCCccCEEEEEeccCCC--------Cceeec
Confidence 689999999 89999999999983 334455566654 445555689999999999976433 234556
Q ss_pred cCCCCccccceEEEeecCCccceeeccCcEEEe----------ecCceeeecC--CCcCCCCCcccccCCCeeEEEeccc
Q 008706 121 CSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIA----------TDNLIKLSTD--GIIWSPGSAGFDVQGNLAFMICDPM 188 (557)
Q Consensus 121 ~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~----------~DnfIhysTD--t~~wSSGSAgFn~qgnlafmVc~p~ 188 (557)
....++.+|++|.-||-|.--.-++=.|.|... ..+|| .|| ..+|.||.+.||.+|.+-.|--+--
T Consensus 138 ~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~~~~~~~~I--qtdAain~gnsGgpl~n~~g~~iGint~~~ 215 (347)
T COG0265 138 GDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGSAGGYVNFI--QTDAAINPGNSGGPLVNIDGEVVGINTAII 215 (347)
T ss_pred cCCCCcccCCEEEEecCCCCcccceeccEEeccccccccCcccccchh--hcccccCCCCCCCceEcCCCcEEEEEEEEe
Confidence 666688899999999999322233334433221 12455 556 7799999999999999987554433
Q ss_pred cccCCCCCCCCCCcCCCCCccccccccccCcchhHHhHHHhhhcCCCccccCCCCcce
Q 008706 189 KLATSPNTKSSSTSSSSSSSWKKDSSMQFGIPIPIICDWLNQHWEGNLDELTKPKLPI 246 (557)
Q Consensus 189 ~lA~sP~~~~sstSssss~s~kk~~i~q~GI~IssI~~wl~qhw~g~lde~~kpklp~ 246 (557)
..++- +. . -.|.||+..+-.-+.+--... .+..|.+=+
T Consensus 216 ~~~~~-----~~-----------g--igfaiP~~~~~~v~~~l~~~G--~v~~~~lgv 253 (347)
T COG0265 216 APSGG-----SS-----------G--IGFAIPVNLVAPVLDELISKG--KVVRGYLGV 253 (347)
T ss_pred cCCCC-----cc-----------e--eEEEecHHHHHHHHHHHHHcC--Cccccccce
Confidence 31110 00 1 567788887777776655522 444554443
No 9
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=97.82 E-value=5.3e-05 Score=65.58 Aligned_cols=173 Identities=15% Similarity=0.157 Sum_probs=96.5
Q ss_pred ceeEEEEeccCCccceeEEEecceeeccCcCcCcHHHHhhhhh-------hccccc-----ccccCCceeeec-ccccee
Q 008706 30 LAMARISVAASAVSGTGFLIHRNLLLTTHVNLPSVAAAETAEI-------RLQNGV-----AAALVPHRFFIT-SSVLDL 96 (557)
Q Consensus 30 ~AvArI~~~~~gG~GTGFLIspnLLLTNnhvLpSaaaA~~Aev-------~lq~g~-----~a~L~P~RFFIT-s~~LDf 96 (557)
+-++.|......-..+|+||+++.+||..|.+.. +..-++ +...+. ...+..+.-|-. ...-|+
T Consensus 13 p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~---~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~Di 89 (220)
T PF00089_consen 13 PWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDG---ASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDI 89 (220)
T ss_dssp TTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTS---GGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSE
T ss_pred CeEEEEeeCCCCeeEeEEeccccccccccccccc---ccccccccccccccccccccccccccccccccccccccccccc
Confidence 4567765543234589999999999999999999 211112 111111 011112111222 237899
Q ss_pred EEEeeccCCCCCCCCCCCCCcccccCCC-CccccceEEEeecCCc--cc--eeeccCcEEE-------------eecCce
Q 008706 97 TIVGLDSADGDSNAPGQQPHHLKTCSKP-NLDLGSIVYLLGYMEE--KE--LMVGEGKVAI-------------ATDNLI 158 (557)
Q Consensus 97 TiVAvd~v~~d~~s~Gq~ph~Lk~~~kp-kl~lGE~VsIIqHP~p--K~--laIrEnKVv~-------------~~DnfI 158 (557)
.||-|+..- ..+.....+.+.... .+.+|..+.++|++.. .. -.++...+.. +.++++
T Consensus 90 All~L~~~~----~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~ 165 (220)
T PF00089_consen 90 ALLKLDRPI----TFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMI 165 (220)
T ss_dssp EEEEESSSS----EHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEE
T ss_pred ccccccccc----ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 999997741 111122224444433 5678999999999922 11 1333333332 123444
Q ss_pred eeec----CCCcCCCCCcccccCCCeeEEEeccccccCCCCCCCCCCcCCCCCccccccccccCcchhHHhHHH
Q 008706 159 KLST----DGIIWSPGSAGFDVQGNLAFMICDPMKLATSPNTKSSSTSSSSSSSWKKDSSMQFGIPIPIICDWL 228 (557)
Q Consensus 159 hysT----Dt~~wSSGSAgFn~qgnlafmVc~p~~lA~sP~~~~sstSssss~s~kk~~i~q~GI~IssI~~wl 228 (557)
-... |...|.||+|.|+.++ .++-... .+ . .. .. ..-....++|+.+.+||
T Consensus 166 c~~~~~~~~~~~g~sG~pl~~~~~----~lvGI~s-~~-~----~c-------~~--~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 166 CAGSSGSGDACQGDSGGPLICNNN----YLVGIVS-FG-E----NC-------GS--PNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEETTSSSBGGTTTTTSEEEETTE----EEEEEEE-EE-S----SS-------SB--TTSEEEEEEGGGGHHHH
T ss_pred ccccccccccccccccccccccee----eecceee-ec-C----CC-------CC--CCcCEEEEEHHHhhccC
Confidence 4333 8889999999999996 3333322 11 0 00 00 11135568999999997
No 10
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=96.36 E-value=0.033 Score=48.61 Aligned_cols=149 Identities=19% Similarity=0.134 Sum_probs=78.6
Q ss_pred cCceeEEEEeccCCccceeEEEecceeeccCcCcCcHHHHhhhhh-----hccc---cc----ccc-cCCceeeeccccc
Q 008706 28 KGLAMARISVAASAVSGTGFLIHRNLLLTTHVNLPSVAAAETAEI-----RLQN---GV----AAA-LVPHRFFITSSVL 94 (557)
Q Consensus 28 k~~AvArI~~~~~gG~GTGFLIspnLLLTNnhvLpSaaaA~~Aev-----~lq~---g~----~a~-L~P~RFFITs~~L 94 (557)
..+-+++|........-+|.||+++++||.+|-+.... +....| +... +. ... .....|--....-
T Consensus 11 ~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~-~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~ 89 (232)
T cd00190 11 SFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSA-PSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDN 89 (232)
T ss_pred CCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCC-CccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcC
Confidence 34567777543223356899999999999999998753 111111 1111 01 111 2222222224578
Q ss_pred eeEEEeeccCCCCCCCCCCCCCcccccCCC-CccccceEEEeecCCccc-----eeeccCcEEE---------------e
Q 008706 95 DLTIVGLDSADGDSNAPGQQPHHLKTCSKP-NLDLGSIVYLLGYMEEKE-----LMVGEGKVAI---------------A 153 (557)
Q Consensus 95 DfTiVAvd~v~~d~~s~Gq~ph~Lk~~~kp-kl~lGE~VsIIqHP~pK~-----laIrEnKVv~---------------~ 153 (557)
|+.|+-|+..-...... .| ..+.... .+..|..+++.|...... -.++..++.. +
T Consensus 90 DiAll~L~~~~~~~~~v--~p--icl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~ 165 (232)
T cd00190 90 DIALLKLKRPVTLSDNV--RP--ICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTI 165 (232)
T ss_pred CEEEEEECCcccCCCcc--cc--eECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccC
Confidence 99999996421111111 22 2222222 567789999998762211 1122222211 1
Q ss_pred ecCcee-----eecCCCcCCCCCcccccCCCee
Q 008706 154 TDNLIK-----LSTDGIIWSPGSAGFDVQGNLA 181 (557)
Q Consensus 154 ~DnfIh-----ysTDt~~wSSGSAgFn~qgnla 181 (557)
.++++- -..++..|.||++.|...++-.
T Consensus 166 ~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~ 198 (232)
T cd00190 166 TDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRG 198 (232)
T ss_pred CCceEeeCCCCCCCccccCCCCCcEEEEeCCEE
Confidence 122221 1567788999999999876443
No 11
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=95.96 E-value=0.067 Score=47.28 Aligned_cols=156 Identities=19% Similarity=0.191 Sum_probs=86.1
Q ss_pred ccCceeEEEEeccCCccceeEEEecceeeccCcCcCcHHHHhhhhhhccc-----c---c---ccccCCceeee-ccccc
Q 008706 27 GKGLAMARISVAASAVSGTGFLIHRNLLLTTHVNLPSVAAAETAEIRLQN-----G---V---AAALVPHRFFI-TSSVL 94 (557)
Q Consensus 27 ~k~~AvArI~~~~~gG~GTGFLIspnLLLTNnhvLpSaaaA~~Aev~lq~-----g---~---~a~L~P~RFFI-Ts~~L 94 (557)
...+-+|+|......-.=+|.||+++++||..|-+.... ...-.|.+.. . . ...+..+-.|. +...-
T Consensus 11 ~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~-~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~ 89 (229)
T smart00020 11 GSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSD-PSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDN 89 (229)
T ss_pred CCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCC-CcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcC
Confidence 345667777533212236899999999999999998765 1111222211 1 1 11233333333 67789
Q ss_pred eeEEEeeccCCCCCCCCCCCCCcccccC-CCCccccceEEEeecCCccc------eeeccCcEEEe--------------
Q 008706 95 DLTIVGLDSADGDSNAPGQQPHHLKTCS-KPNLDLGSIVYLLGYMEEKE------LMVGEGKVAIA-------------- 153 (557)
Q Consensus 95 DfTiVAvd~v~~d~~s~Gq~ph~Lk~~~-kpkl~lGE~VsIIqHP~pK~------laIrEnKVv~~-------------- 153 (557)
|+.|+-|+..-.-... -+| +.+.. ...+..|..+.+.|+..... -.++...+..+
T Consensus 90 DiAll~L~~~i~~~~~--~~p--i~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~ 165 (229)
T smart00020 90 DIALLKLKSPVTLSDN--VRP--ICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGA 165 (229)
T ss_pred CEEEEEECcccCCCCc--eee--ccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccc
Confidence 9999999652111111 122 22222 12577789999999873321 11121121111
Q ss_pred -ecCce-----eeecCCCcCCCCCcccccCCCeeEEEecccc
Q 008706 154 -TDNLI-----KLSTDGIIWSPGSAGFDVQGNLAFMICDPMK 189 (557)
Q Consensus 154 -~DnfI-----hysTDt~~wSSGSAgFn~qgnlafmVc~p~~ 189 (557)
.++.+ ....++..|.+|++.|...+ -|.++-..-
T Consensus 166 ~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~--~~~l~Gi~s 205 (229)
T smart00020 166 ITDNMLCAGGLEGGKDACQGDSGGPLVCNDG--RWVLVGIVS 205 (229)
T ss_pred cCCCcEeecCCCCCCcccCCCCCCeeEEECC--CEEEEEEEE
Confidence 11111 12577888999999999876 777776654
No 12
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=90.83 E-value=0.082 Score=57.29 Aligned_cols=127 Identities=23% Similarity=0.217 Sum_probs=62.5
Q ss_pred EEEeccCCccceeEEEecceeeccCcCcCcHHHHhh----hhhhcccccccccCCcee-eeccccceeEEEeeccCCCCC
Q 008706 34 RISVAASAVSGTGFLIHRNLLLTTHVNLPSVAAAET----AEIRLQNGVAAALVPHRF-FITSSVLDLTIVGLDSADGDS 108 (557)
Q Consensus 34 rI~~~~~gG~GTGFLIspnLLLTNnhvLpSaaaA~~----Aev~lq~g~~a~L~P~RF-FITs~~LDfTiVAvd~v~~d~ 108 (557)
||+ .+|+|=||-|||+||+|+-||||.-..-.+ ++|... ..=.=.+| |-.--.-|+|=.-|++.-.+
T Consensus 374 Riv---~fGsGWGfWVS~~lfITttHViP~g~~E~FGv~i~~i~vh----~sGeF~~~rFpk~iRPDvtgmiLEeGapE- 445 (535)
T PF05416_consen 374 RIV---KFGSGWGFWVSPTLFITTTHVIPPGAKEAFGVPISQIQVH----KSGEFCRFRFPKPIRPDVTGMILEEGAPE- 445 (535)
T ss_dssp TEE---EETTEEEEESSSSEEEEEGGGS-STTSEETTEECGGEEEE----EETTEEEEEESS-SSTTS---EE-SS--T-
T ss_pred hhe---ecCCceeeeecceEEEEeeeecCCcchhhhCCChhHeEEe----eccceEEEecCCCCCCCccceeeccCCCC-
Confidence 555 568999999999999999999997654333 111111 11111233 22233346666666553333
Q ss_pred CCCCCCCCcccccCCCCccccceEEEeecC---Cccceeecc---------CcEEEe-------ecCceeeecCCCcCCC
Q 008706 109 NAPGQQPHHLKTCSKPNLDLGSIVYLLGYM---EEKELMVGE---------GKVAIA-------TDNLIKLSTDGIIWSP 169 (557)
Q Consensus 109 ~s~Gq~ph~Lk~~~kpkl~lGE~VsIIqHP---~pK~laIrE---------nKVv~~-------~DnfIhysTDt~~wSS 169 (557)
|-+.+||--- +.-.|++|= ||+|-. .-|--..---|.||--
T Consensus 446 --------------------GtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v~GQ~GMLLTGaNAK~mDLGT~PGDC 505 (535)
T PF05416_consen 446 --------------------GTVCSILIKRPSGELLPLAVRMGTHASMKIQGRTVHGQMGMLLTGANAKGMDLGTIPGDC 505 (535)
T ss_dssp --------------------T-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEEEEEEEEETTSTT-SSTTTS--TTGT
T ss_pred --------------------ceEEEEEEEcCCccchhhhhhhccceeEEEcceeecceeeeeeecCCccccccCCCCCCC
Confidence 2222222211 222333332 333311 1121222334788888
Q ss_pred CCcccccCCCeeEEEecccc
Q 008706 170 GSAGFDVQGNLAFMICDPMK 189 (557)
Q Consensus 170 GSAgFn~qgnlafmVc~p~~ 189 (557)
|-|-+=--||. |+||--|-
T Consensus 506 GcPYvyKrgNd-~VV~GVH~ 524 (535)
T PF05416_consen 506 GCPYVYKRGND-WVVIGVHA 524 (535)
T ss_dssp T-EEEEEETTE-EEEEEEEE
T ss_pred CCceeeecCCc-EEEEEEEe
Confidence 99988888887 89998887
No 13
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=79.65 E-value=1.2 Score=50.18 Aligned_cols=41 Identities=17% Similarity=0.201 Sum_probs=27.3
Q ss_pred hhcccCceeEEEEeccCCccceeEEEecc-eeeccCcCcCcHHH
Q 008706 24 IFSGKGLAMARISVAASAVSGTGFLIHRN-LLLTTHVNLPSVAA 66 (557)
Q Consensus 24 ifs~k~~AvArI~~~~~gG~GTGFLIspn-LLLTNnhvLpSaaa 66 (557)
|....+.++.+|+.= +|.+||-+|||. |+|||||.--++-.
T Consensus 31 l~~~~~~s~dAvv~f--~gGCSgsfVS~~GLvlTNHHC~~~~Iq 72 (698)
T PF10459_consen 31 LYSPNGSSKDAVVRF--GGGCSGSFVSPDGLVLTNHHCGYGAIQ 72 (698)
T ss_pred HhCCCccchhheeec--CCceeEEEEcCCceEEecchhhhhHHH
Confidence 455555566666520 234799999985 99999998654433
No 14
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=78.74 E-value=2.8 Score=48.26 Aligned_cols=42 Identities=17% Similarity=0.288 Sum_probs=30.1
Q ss_pred hhhhhhcccCceeEEEEeccCC-ccceeEEEec--ceeeccCcCc
Q 008706 20 MKAAIFSGKGLAMARISVAASA-VSGTGFLIHR--NLLLTTHVNL 61 (557)
Q Consensus 20 ~kaaifs~k~~AvArI~~~~~g-G~GTGFLIsp--nLLLTNnhvL 61 (557)
+-.++-+=++.||+.--....| .+||||.|++ ++.|||-||.
T Consensus 61 VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv 105 (955)
T KOG1421|consen 61 VVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVV 105 (955)
T ss_pred hcccEEEEEehheeecccccccccceeEEEEecccceEEEecccc
Confidence 3345566677777775433333 6999999997 6799999986
No 15
>PF04152 Mre11_DNA_bind: Mre11 DNA-binding presumed domain ; InterPro: IPR007281 The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication []. Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1 [].; GO: 0004519 endonuclease activity, 0030145 manganese ion binding, 0006302 double-strand break repair, 0005634 nucleus; PDB: 4FBW_B 4FBK_A 4FCX_B 4FBQ_B 3T1I_B.
Probab=54.97 E-value=12 Score=35.23 Aligned_cols=31 Identities=23% Similarity=0.532 Sum_probs=20.5
Q ss_pred chhHHhHHHhhhc-------CCCccccCCCCcceeeee
Q 008706 220 PIPIICDWLNQHW-------EGNLDELTKPKLPIIRLM 250 (557)
Q Consensus 220 ~IssI~~wl~qhw-------~g~lde~~kpklp~~rlm 250 (557)
.|...++.-+.+| ....+.-.+|+||||||=
T Consensus 43 ~Ve~mI~~A~~~~~~~~~~~~~~~~~~~~~~lPLIRLR 80 (175)
T PF04152_consen 43 KVEEMIEEAKEEWEELQREPDDQTGHPKQPPLPLIRLR 80 (175)
T ss_dssp HHHHHHHHHHHHC--HHHHT--STTTSSS-SS-EEEEE
T ss_pred HHHHHHHHhHhhhccccccccccccCcccCCCCEEEEE
Confidence 4667777777888 334456789999999994
No 16
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=53.41 E-value=10 Score=35.27 Aligned_cols=24 Identities=25% Similarity=0.232 Sum_probs=15.5
Q ss_pred ceeeecCCCcCCCCCcccccCCCe
Q 008706 157 LIKLSTDGIIWSPGSAGFDVQGNL 180 (557)
Q Consensus 157 fIhysTDt~~wSSGSAgFn~qgnl 180 (557)
+.--.+|--+|+||||.||.+|..
T Consensus 87 ~~~~~~d~~~GsSGSpi~n~~g~i 110 (132)
T PF00949_consen 87 IGAIDLDFPKGSSGSPIFNQNGEI 110 (132)
T ss_dssp EEEE---S-TTGTT-EEEETTSCE
T ss_pred EEeeecccCCCCCCCceEcCCCcE
Confidence 334457888999999999999875
No 17
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=42.61 E-value=66 Score=30.29 Aligned_cols=127 Identities=18% Similarity=0.076 Sum_probs=69.3
Q ss_pred ceeEEEecceeeccCcCcCcHHHHhh-hhhhcccccccc-cCCceee-eccc--cceeEEEeeccCCCCCCCCCCCCCcc
Q 008706 44 GTGFLIHRNLLLTTHVNLPSVAAAET-AEIRLQNGVAAA-LVPHRFF-ITSS--VLDLTIVGLDSADGDSNAPGQQPHHL 118 (557)
Q Consensus 44 GTGFLIspnLLLTNnhvLpSaaaA~~-Aev~lq~g~~a~-L~P~RFF-ITs~--~LDfTiVAvd~v~~d~~s~Gq~ph~L 118 (557)
+.++.|..+++|.+.| +.. -+|-+ +|+... .. .+- ++.. .+|+|||-|+..+ .+ ++- .
T Consensus 27 ~l~~gi~~~~~lvp~H-------~~~~~~i~i-~g~~~~~~d--~~~lv~~~~~~~Dl~~v~l~~~~----kf-rDI--r 89 (172)
T PF00548_consen 27 MLALGIYDRYFLVPTH-------EEPEDTIYI-DGVEYKVDD--SVVLVDRDGVDTDLTLVKLPRNP----KF-RDI--R 89 (172)
T ss_dssp EEEEEEEBTEEEEEGG-------GGGCSEEEE-TTEEEEEEE--EEEEEETTSSEEEEEEEEEESSS-----B---G--G
T ss_pred EecceEeeeEEEEECc-------CCCcEEEEE-CCEEEEeee--eEEEecCCCcceeEEEEEccCCc----cc-Cch--h
Confidence 5677999999999999 221 12222 244221 23 322 3322 6899999995411 11 111 2
Q ss_pred cccCCCCccccceEEEeecCCccceeeccCcEE---------EeecCceeeecCCCcCCCCCcccccCCCeeEEEeccc
Q 008706 119 KTCSKPNLDLGSIVYLLGYMEEKELMVGEGKVA---------IATDNLIKLSTDGIIWSPGSAGFDVQGNLAFMICDPM 188 (557)
Q Consensus 119 k~~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv---------~~~DnfIhysTDt~~wSSGSAgFn~qgnlafmVc~p~ 188 (557)
+..++.-=...+.+.+|--++...+.+..+.|. ..+++.++|..-|.+|.=||+..-..|. ..-|+--|
T Consensus 90 k~~~~~~~~~~~~~l~v~~~~~~~~~~~v~~v~~~~~i~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~-~~~i~GiH 167 (172)
T PF00548_consen 90 KFFPESIPEYPECVLLVNSTKFPRMIVEVGFVTNFGFINLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGG-QGKIIGIH 167 (172)
T ss_dssp GGSBSSGGTEEEEEEEEESSSSTCEEEEEEEEEEEEEEEETTEEEEEEEEEESEEETTGTTEEEEESCGG-TTEEEEEE
T ss_pred hhhccccccCCCcEEEEECCCCccEEEEEEEEeecCccccCCCEeeEEEEEccCCCCCccCCeEEEeecc-CccEEEEE
Confidence 222211113455555555444444444433333 3467899999999999999998864332 34444444
No 18
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=34.11 E-value=22 Score=40.58 Aligned_cols=26 Identities=23% Similarity=0.236 Sum_probs=14.9
Q ss_pred ecCCCcCCCCCcccccCCCeeEEEec
Q 008706 161 STDGIIWSPGSAGFDVQGNLAFMICD 186 (557)
Q Consensus 161 sTDt~~wSSGSAgFn~qgnlafmVc~ 186 (557)
+.|+.-|-||||++|..|+|-=++-|
T Consensus 627 tnDitGGNSGSPvlN~~GeLVGl~FD 652 (698)
T PF10459_consen 627 TNDITGGNSGSPVLNAKGELVGLAFD 652 (698)
T ss_pred ccCcCCCCCCCccCCCCceEEEEeec
Confidence 44666666666666666665444443
No 19
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=29.47 E-value=3e+02 Score=32.31 Aligned_cols=54 Identities=22% Similarity=0.199 Sum_probs=32.4
Q ss_pred ceeeeccccceeEEEeeccCCCCCCCCCCCCC---cccccCCC---------CccccceEEEeecC
Q 008706 85 HRFFITSSVLDLTIVGLDSADGDSNAPGQQPH---HLKTCSKP---------NLDLGSIVYLLGYM 138 (557)
Q Consensus 85 ~RFFITs~~LDfTiVAvd~v~~d~~s~Gq~ph---~Lk~~~kp---------kl~lGE~VsIIqHP 138 (557)
+|--|-..-+|++||-|+...--.+.+|-+.- |-|++.-. ++..|..|.-+|-.
T Consensus 534 ER~ii~~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrT 599 (695)
T PF08192_consen 534 ERSIINKRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRT 599 (695)
T ss_pred cchhhcccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEeccc
Confidence 34455555569999999765533444444332 33333222 45569999999987
No 20
>PF10385 RNA_pol_Rpb2_45: RNA polymerase beta subunit external 1 domain; InterPro: IPR019462 DNA-directed RNA polymerases 2.7.7.6 from EC (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length []. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors. RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs. 
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared with three in eukaryotes (not including mitochondrial or chloroplast polymerases). This entry represents a domain in prokaryotic polymerases that spans the gap between domains 4 and 5 of the protein. It is also known as the external 1 region of the polymerase and is bound in association with the external 2 region []. ; GO: 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 2GHO_C 1YNN_C 1I6V_C 1YNJ_C 1HQM_C 1SMY_M 3DXJ_M 3AOI_C 2A68_M 1ZYR_C ....
Probab=28.22 E-value=70 Score=26.30 Aligned_cols=35 Identities=31% Similarity=0.350 Sum_probs=28.1
Q ss_pred eeccCcEEEeecCceeeecCCCcCC---CCCcccccCCCee
Q 008706 144 MVGEGKVAIATDNLIKLSTDGIIWS---PGSAGFDVQGNLA 181 (557)
Q Consensus 144 aIrEnKVv~~~DnfIhysTDt~~wS---SGSAgFn~qgnla 181 (557)
.|.+||| +|..+.+++|.+.+. ++++-.|..|+|.
T Consensus 3 kV~~g~V---t~~i~YLtA~eEe~~~IAqA~~~ld~~g~~~ 40 (66)
T PF10385_consen 3 KVKNGKV---TDEIEYLTADEEEKYVIAQANAPLDEDGKFI 40 (66)
T ss_dssp EEETTEE---ECEEEEEECTTCCCSEEE-TTS-BSSTTBBC
T ss_pred EEeCCEE---CCeeEEEchhhcCCcEecccCeeeecCCEEE
Confidence 3577888 999999999999998 8888889888774
No 21
>PRK08903 DnaA regulatory inactivator Hda; Validated
Probab=26.06 E-value=27 Score=32.45 Aligned_cols=26 Identities=19% Similarity=0.446 Sum_probs=21.8
Q ss_pred ccccCcchhH-HhHHHhhhcCCCcccc
Q 008706 214 SMQFGIPIPI-ICDWLNQHWEGNLDEL 239 (557)
Q Consensus 214 i~q~GI~Iss-I~~wl~qhw~g~lde~ 239 (557)
.++.||+|+. +++||.++|.||+.++
T Consensus 170 ~~~~~v~l~~~al~~L~~~~~gn~~~l 196 (227)
T PRK08903 170 AAERGLQLADEVPDYLLTHFRRDMPSL 196 (227)
T ss_pred HHHcCCCCCHHHHHHHHHhccCCHHHH
Confidence 3467899976 9999999999998764
No 22
>TIGR00583 mre11 DNA repair protein (mre11). All proteins in this family for which functions are known are subunits of a nuclease complex made up of multiple proteins including MRE11 and RAD50 homologs. The functions of this nuclease complex include recombinational repair and non-homologous end joining. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). The proteins in this family are distantly related to proteins in the SbcCD complex of bacteria.
Probab=22.90 E-value=54 Score=35.05 Aligned_cols=14 Identities=57% Similarity=0.944 Sum_probs=12.0
Q ss_pred cccCCCCcceeeee
Q 008706 237 DELTKPKLPIIRLM 250 (557)
Q Consensus 237 de~~kpklp~~rlm 250 (557)
++..+|+||||||-
T Consensus 352 ~~~~~~~~plirl~ 365 (405)
T TIGR00583 352 DEPREPPLPLIRLK 365 (405)
T ss_pred cccccCCCceEEEE
Confidence 46789999999995
No 23
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=22.67 E-value=2.1e+02 Score=31.97 Aligned_cols=130 Identities=20% Similarity=0.290 Sum_probs=80.1
Q ss_pred cceeEEEecceeeccCcCcCcHHHHhhhhhhccccccc--ccCCceeeeccccceeEEEeeccCCCCCCCCCCCCCcccc
Q 008706 43 SGTGFLIHRNLLLTTHVNLPSVAAAETAEIRLQNGVAA--ALVPHRFFITSSVLDLTIVGLDSADGDSNAPGQQPHHLKT 120 (557)
Q Consensus 43 ~GTGFLIspnLLLTNnhvLpSaaaA~~Aev~lq~g~~a--~L~P~RFFITs~~LDfTiVAvd~v~~d~~s~Gq~ph~Lk~ 120 (557)
-|.||.|.=..||||.|+++-+..+....|. -+|... .=.+...| .+-|+.+|.+|..+- -.|..| |.+
T Consensus 88 ~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~-~~gs~~k~~~~v~~~~---~~cd~Avv~Ie~~~f---~~~~~~--~e~ 158 (473)
T KOG1320|consen 88 GGSGFAIYGKKLLTNAHVVAPNNDHKFVTVK-KHGSPRKYKAFVAAVF---EECDLAVVYIESEEF---WKGMNP--FEL 158 (473)
T ss_pred cccchhhcccceeecCccccccccccccccc-cCCCchhhhhhHHHhh---hcccceEEEEeeccc---cCCCcc--ccc
Confidence 4899999999999999999977777777776 565532 12334444 577888888865332 222222 333
Q ss_pred cCCCCccccceEEEeecCCccceeeccCcEEEeec--------CceeeecCCCcC--CCCCccccc---CCCeeEEEec
Q 008706 121 CSKPNLDLGSIVYLLGYMEEKELMVGEGKVAIATD--------NLIKLSTDGIIW--SPGSAGFDV---QGNLAFMICD 186 (557)
Q Consensus 121 ~~kpkl~lGE~VsIIqHP~pK~laIrEnKVv~~~D--------nfIhysTDt~~w--SSGSAgFn~---qgnlafmVc~ 186 (557)
. +.--+.+.|+|++ -.-+-+-.|-|+-+.= +++.-..|...| -+|-+.+.+ +--+||..++
T Consensus 159 ~--~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik 232 (473)
T KOG1320|consen 159 G--DIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIK 232 (473)
T ss_pred C--CCcccCccEEEEc---CCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCCccCCCeEEccccccceEEEEEe
Confidence 3 2233467788887 2233344444443332 233345565555 788888866 5567888875
Done!