Query 041120
Match_columns 340
No_of_seqs 352 out of 1901
Neff 7.9
Searched_HMMs 46136
Date Fri Mar 29 03:59:12 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/041120.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/041120hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1542 Cysteine proteinase Ca 100.0 1.5E-82 3.2E-87 581.2 22.9 323 2-339 13-372 (372)
2 PTZ00203 cathepsin L protease; 100.0 4.7E-76 1E-80 561.0 33.1 275 55-337 31-339 (348)
3 PTZ00021 falcipain-2; Provisio 100.0 1.9E-75 4.2E-80 572.5 28.6 280 56-339 163-489 (489)
4 PTZ00200 cysteine proteinase; 100.0 3.2E-73 6.8E-78 555.1 29.4 276 56-339 120-446 (448)
5 KOG1543 Cysteine proteinase Ca 100.0 1.8E-66 3.8E-71 491.8 27.4 266 66-338 30-324 (325)
6 cd02621 Peptidase_C1A_Cathepsi 100.0 2.4E-53 5.2E-58 389.0 19.2 187 142-336 1-240 (243)
7 cd02698 Peptidase_C1A_Cathepsi 100.0 4.6E-53 9.9E-58 386.1 19.8 192 142-337 1-237 (239)
8 cd02248 Peptidase_C1A Peptidas 100.0 5.7E-52 1.2E-56 371.1 20.1 186 143-336 1-210 (210)
9 cd02620 Peptidase_C1A_Cathepsi 100.0 3.4E-52 7.3E-57 379.7 18.4 185 143-334 1-234 (236)
10 PF00112 Peptidase_C1: Papain 100.0 4.7E-51 1E-55 366.2 17.2 191 142-337 1-219 (219)
11 smart00645 Pept_C1 Papain fami 100.0 1.2E-50 2.6E-55 353.0 17.5 166 142-332 1-169 (174)
12 PTZ00049 cathepsin C-like prot 100.0 3.2E-49 6.9E-54 395.6 19.9 193 139-339 378-677 (693)
13 PTZ00364 dipeptidyl-peptidase 100.0 1.1E-48 2.4E-53 387.9 19.4 186 140-334 203-455 (548)
14 PTZ00462 Serine-repeat antigen 100.0 1.9E-42 4.1E-47 355.8 18.7 182 154-339 544-782 (1004)
15 cd02619 Peptidase_C1 C1 Peptid 100.0 4.7E-42 1E-46 308.4 17.3 173 145-319 1-213 (223)
16 KOG1544 Predicted cysteine pro 100.0 3.7E-37 8.1E-42 279.1 4.5 237 91-334 151-456 (470)
17 COG4870 Cysteine protease [Pos 100.0 1.4E-29 3E-34 235.2 7.0 177 141-321 98-316 (372)
18 cd00585 Peptidase_C1B Peptidas 99.9 1.2E-22 2.6E-27 198.1 4.0 78 155-233 55-159 (437)
19 PF08246 Inhibitor_I29: Cathep 99.7 5E-17 1.1E-21 115.7 7.2 57 62-118 1-58 (58)
20 PF03051 Peptidase_C1_2: Pepti 99.6 9.2E-17 2E-21 157.0 2.0 78 155-233 56-160 (438)
21 smart00848 Inhibitor_I29 Cathe 99.5 7.3E-15 1.6E-19 103.9 5.2 56 62-117 1-57 (57)
22 COG3579 PepC Aminopeptidase C 98.7 2.2E-09 4.7E-14 99.4 -0.6 77 156-233 59-162 (444)
23 PF08127 Propeptide_C1: Peptid 96.0 0.0067 1.4E-07 39.6 2.7 35 90-126 3-37 (41)
24 KOG4128 Bleomycin hydrolases a 95.2 0.016 3.4E-07 54.5 3.0 79 154-233 62-169 (457)
25 PF05543 Peptidase_C47: Stapho 90.6 0.24 5.3E-06 42.6 3.1 121 158-305 17-146 (175)
26 KOG4128 Bleomycin hydrolases a 79.6 0.2 4.3E-06 47.3 -3.1 38 279-316 371-412 (457)
27 PF13529 Peptidase_C39_2: Pept 77.8 1.9 4.1E-05 34.7 2.6 23 278-303 122-144 (144)
28 cd00044 CysPc Calpains, domain 60.6 13 0.00028 35.2 4.6 27 279-305 235-263 (315)
29 PF08139 LPAM_1: Prokaryotic m 60.0 13 0.00029 21.4 2.7 11 22-32 5-15 (25)
30 COG3017 LolB Outer membrane li 55.0 25 0.00053 31.2 4.9 30 21-50 1-30 (206)
31 COG5510 Predicted small secret 40.3 27 0.00059 22.9 2.1 14 23-36 1-14 (44)
32 PRK09810 entericidin A; Provis 38.8 32 0.00069 22.4 2.3 10 23-32 1-10 (41)
33 PRK10081 entericidin B membran 38.5 32 0.0007 23.1 2.3 13 23-35 1-13 (48)
34 COG2854 Ttg2D ABC-type transpo 35.5 99 0.0021 27.5 5.6 91 23-119 2-93 (202)
35 PF00648 Peptidase_C2: Calpain 34.2 69 0.0015 29.8 4.9 27 279-305 213-243 (298)
36 KOG4702 Uncharacterized conser 31.6 1E+02 0.0022 22.5 4.1 32 60-92 29-60 (77)
37 COG4990 Uncharacterized protei 29.7 33 0.00073 29.9 1.7 23 279-305 147-169 (195)
38 PF09778 Guanylate_cyc_2: Guan 29.2 59 0.0013 29.1 3.2 23 277-301 158-180 (212)
39 smart00230 CysPc Calpain-like 26.8 1E+02 0.0022 29.3 4.7 27 279-305 227-255 (318)
40 PF15284 PAGK: Phage-encoded v 24.9 69 0.0015 22.7 2.3 15 24-38 1-15 (61)
41 PF07351 DUF1480: Protein of u 24.2 1.5E+02 0.0032 22.1 3.9 52 247-298 4-56 (80)
42 TIGR03042 PS_II_psbQ_bact phot 24.0 1.4E+02 0.003 25.1 4.3 34 26-59 3-36 (142)
43 cd02549 Peptidase_C39A A sub-f 22.9 88 0.0019 25.0 3.0 22 279-303 93-114 (141)
44 PF15588 Imm7: Immunity protei 22.5 3E+02 0.0065 21.9 5.9 32 282-313 17-55 (115)
No 1
>KOG1542 consensus Cysteine proteinase Cathepsin F [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.5e-82 Score=581.23 Aligned_cols=323 Identities=40% Similarity=0.689 Sum_probs=267.7
Q ss_pred CceeehhhhhccchhHHHHHHhhHHHHHHHHHH---HHHHhhhccccc--cCCCCCCChhhHHHHHHHHHHHhCCccCCH
Q 041120 2 QHRLFIAIYTNLHLKIAIDMRMMLRNAVLSLFL---LWVLGIPAGAWS--EGYPQKYDPQSMEERFENWLKQYSREYGSE 76 (340)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~m~~~~~l~l~~---~~~l~~~~~~~~--~~~~~~~~~~~~~~~f~~w~~~~~k~Y~~~ 76 (340)
+||+.+++.+.++.++...-..- . ++.+.. ...+.+ ..... ...|+ ...+++.|..|+.+|+|+|.+.
T Consensus 13 ~~r~~~~~~~~~~~~~~~~~~~~--~-~~~~~~v~~~~~~~i-~~v~~~~~~~~~---~l~~~~~F~~F~~kf~r~Y~s~ 85 (372)
T KOG1542|consen 13 NHRSEMDCKTLVAFRKCPIEFTA--L-SVSLSVVPLGDDLTI-RQVVRLQDLNPR---GLGLEDSFKLFTIKFGRSYASR 85 (372)
T ss_pred ccccccchhhhhhhhccchhhhh--h-hhhccccccchhhhh-hhhhhhcccCCc---ccchHHHHHHHHHhcCcccCcH
Confidence 79999999999998877432110 0 000000 000000 00000 11121 2345889999999999999999
Q ss_pred HHHHHHHHHHHHHHHHHHHhccCC-CceEEEcccCCCCCHHHHHHhhcCCCCC-C-C---CCCCCCCCCCCCCCeeeccC
Q 041120 77 DEWQRRFGIYSSNVQYIDYINSQN-LSFKLTDNKFADLSNEEFISTYLGYNKP-Y-N---EPRWPSVQYLGLPASVDWRK 150 (340)
Q Consensus 77 ~E~~~R~~iF~~Nl~~I~~~N~~~-~s~~~g~N~FsDlt~eEf~~~~~g~~~~-~-~---~~~~~~~~~~~lP~~~Dwr~ 150 (340)
+|..+|+.+|+.|+..+++++... .|.++|+|+|||||+|||++++++.+.. . . ....+..+...||++||||+
T Consensus 86 eE~~~Rl~iF~~N~~~a~~~q~~d~gsA~yGvtqFSDlT~eEFkk~~l~~~~~~~~~~~~~~~~~~~~~~~lP~~fDWR~ 165 (372)
T KOG1542|consen 86 EEHAHRLSIFKHNLLRAERLQENDPGSAEYGVTQFSDLTEEEFKKIYLGVKRRGSKLPGDAAEAPIEPGESLPESFDWRD 165 (372)
T ss_pred HHHHHHHHHHHHHHHHHHHhhhcCccccccCccchhhcCHHHHHHHhhccccccccCccccccCcCCCCCCCCcccchhc
Confidence 999999999999999999999876 4899999999999999999999876652 1 1 11123344568999999999
Q ss_pred CCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhccCCCCCCCCCCCchHHHHHHHHHhCCCCCCCCC
Q 041120 151 EGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITKIGGVTTEDDY 230 (340)
Q Consensus 151 ~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~~~Gi~~e~~y 230 (340)
+|+||||||||+||||||||+++++|++++|++|++++||||+|+||+.. +.||+||.+.+|++|+++.+|+..|++|
T Consensus 166 kgaVTpVKnQG~CGSCWAFS~tG~vEga~~i~~g~LvsLSEQeLvDCD~~--d~gC~GGl~~nA~~~~~~~gGL~~E~dY 243 (372)
T KOG1542|consen 166 KGAVTPVKNQGMCGSCWAFSTTGAVEGAWAIATGKLVSLSEQELVDCDSC--DNGCNGGLMDNAFKYIKKAGGLEKEKDY 243 (372)
T ss_pred cCCccccccCCcCcchhhhhhhhhhhhHHHhhcCcccccchhhhhcccCc--CCcCCCCChhHHHHHHHHhCCccccccC
Confidence 99999999999999999999999999999999999999999999999976 9999999999999999898899999999
Q ss_pred CCCCCCC-CcCCCCCCceeEEeceeEEcCCC--------------------cccccccCceecC---CCCCC-CCeEEEE
Q 041120 231 PYRGKND-RCQTDKTKHHAVTITGYEAIPAR--------------------YAFQLYSHGVFDE---YCGHQ-LNHGVTV 285 (340)
Q Consensus 231 Py~~~~~-~c~~~~~~~~~~~i~~y~~~~~~--------------------~~f~~y~~Gi~~~---~c~~~-~~Hav~i 285 (340)
||++..+ .|...+ ....+.|.+|..++.+ ..+|+|++||+.+ .|+.. ++|||+|
T Consensus 244 PY~g~~~~~C~~~~-~~~~v~I~~f~~l~~nE~~ia~wLv~~GPi~vgiNa~~mQ~YrgGV~~P~~~~Cs~~~~~HaVLl 322 (372)
T KOG1542|consen 244 PYTGKKGNQCHFDK-SKIVVSIKDFSMLSNNEDQIAAWLVTFGPLSVGINAKPMQFYRGGVSCPSKYICSPKLLNHAVLL 322 (372)
T ss_pred CccccCCCccccch-hhceEEEeccEecCCCHHHHHHHHHhcCCeEEEEchHHHHHhcccccCCCcccCCccccCceEEE
Confidence 9999988 999884 5678999999999877 6799999999998 39864 8999999
Q ss_pred EEEeecC-CeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceeeeec
Q 041120 286 VGYGEDH-GEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYPVKR 339 (340)
Q Consensus 286 VGyg~~~-g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp~~~ 339 (340)
||||.+. .++|||||||||++|||+||+|+.||. |+|||++.++-+.++
T Consensus 323 vGyG~~g~~~PYWIVKNSWG~~WGE~GY~~l~RG~-----N~CGi~~mvss~~v~ 372 (372)
T KOG1542|consen 323 VGYGSSGYEKPYWIVKNSWGTSWGEKGYYKLCRGS-----NACGIADMVSSAAVN 372 (372)
T ss_pred EeecCCCCCCceEEEECCccccccccceEEEeccc-----cccccccchhhhhcC
Confidence 9999987 899999999999999999999999993 459999999887764
No 2
>PTZ00203 cathepsin L protease; Provisional
Probab=100.00 E-value=4.7e-76 Score=560.97 Aligned_cols=275 Identities=38% Similarity=0.704 Sum_probs=232.3
Q ss_pred hhhHHHHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccCCCceEEEcccCCCCCHHHHHHhhcCCCC--C-CCC
Q 041120 55 PQSMEERFENWLKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQNLSFKLTDNKFADLSNEEFISTYLGYNK--P-YNE 131 (340)
Q Consensus 55 ~~~~~~~f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~~~s~~~g~N~FsDlt~eEf~~~~~g~~~--~-~~~ 131 (340)
+.++++.|++|+++|+|+|.+.+|+.+|+.||++|+++|++||+++.+|++|+|+|+|||.|||.+++++... + ...
T Consensus 31 ~~~~~~~f~~~~~~~~K~Y~~~~E~~~R~~iF~~N~~~I~~~N~~~~~~~lg~N~FaDlT~eEf~~~~l~~~~~~~~~~~ 110 (348)
T PTZ00203 31 GTPAAALFEEFKRTYQRAYGTLTEEQQRLANFERNLELMREHQARNPHARFGITKFFDLSEAEFAARYLNGAAYFAAAKQ 110 (348)
T ss_pred ccHHHHHHHHHHHHhCCCCCChHHHHHHHHHHHHHHHHHHHHhccCCCeEEeccccccCCHHHHHHHhcCCCcccccccc
Confidence 4578889999999999999988899999999999999999999987899999999999999999988764211 1 110
Q ss_pred ---CCCCC--CCCCCCCCeeeccCCCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhccCCCCCCCC
Q 041120 132 ---PRWPS--VQYLGLPASVDWRKEGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCDVNSENQGC 206 (340)
Q Consensus 132 ---~~~~~--~~~~~lP~~~Dwr~~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~~~~~~~gC 206 (340)
..... .+..++|++||||++|+|+||||||.||||||||+++++|+++++++++.+.||+|+|+||+.. +.||
T Consensus 111 ~~~~~~~~~~~~~~~lP~~~DWR~~g~VtpVkdQg~CGSCWAfa~~~aiEs~~~i~~~~~~~LSeQqLvdC~~~--~~GC 188 (348)
T PTZ00203 111 HAGQHYRKARADLSAVPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDHV--DNGC 188 (348)
T ss_pred cccccccccccccccCCCCCcCCcCCCCCCccccCCCccHHHHhhHHHHHHHHHHhcCCCccCCHHHHHhccCC--CCCC
Confidence 11111 1234789999999999999999999999999999999999999999999999999999999874 7899
Q ss_pred CCCchHHHHHHHHHh--CCCCCCCCCCCCCCCC---CcCCCCCCceeEEeceeEEcCCC--------------------c
Q 041120 207 NGGYMEKAFEFITKI--GGVTTEDDYPYRGKND---RCQTDKTKHHAVTITGYEAIPAR--------------------Y 261 (340)
Q Consensus 207 ~GG~~~~a~~~i~~~--~Gi~~e~~yPy~~~~~---~c~~~~~~~~~~~i~~y~~~~~~--------------------~ 261 (340)
+||++..|++|++++ ||+++|++|||.+.++ .|.........+.+++|..++.+ .
T Consensus 189 ~GG~~~~a~~yi~~~~~ggi~~e~~YPY~~~~~~~~~C~~~~~~~~~~~i~~~~~i~~~e~~~~~~l~~~GPv~v~i~a~ 268 (348)
T PTZ00203 189 GGGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVSMESSERVMAAWLAKNGPISIAVDAS 268 (348)
T ss_pred CCCCHHHHHHHHHHhcCCCCCccccCCCccCCCCCCcCCCCcccccceEecceeecCcCHHHHHHHHHhCCCEEEEEEhh
Confidence 999999999999865 6799999999998766 58643221234667787766543 5
Q ss_pred ccccccCceecCCCC-CCCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceeee
Q 041120 262 AFQLYSHGVFDEYCG-HQLNHGVTVVGYGEDHGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYPV 337 (340)
Q Consensus 262 ~f~~y~~Gi~~~~c~-~~~~Hav~iVGyg~~~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp~ 337 (340)
+|++|++|||+. |. ..++|||+|||||.++|++|||||||||++|||+|||||+|+. | .|||++.+....
T Consensus 269 ~f~~Y~~GIy~~-c~~~~~nHaVliVGYG~~~g~~YWiikNSWG~~WGe~GY~ri~rg~-n----~Cgi~~~~~~~~ 339 (348)
T PTZ00203 269 SFMSYHSGVLTS-CIGEQLNHGVLLVGYNMTGEVPYWVIKNSWGEDWGEKGYVRVTMGV-N----ACLLTGYPVSVH 339 (348)
T ss_pred hhcCccCceeec-cCCCCCCeEEEEEEEecCCCceEEEEEcCCCCCcCcCceEEEEcCC-C----cccccceEEEEe
Confidence 899999999985 86 4589999999999988999999999999999999999999984 4 499997766543
No 3
>PTZ00021 falcipain-2; Provisional
Probab=100.00 E-value=1.9e-75 Score=572.54 Aligned_cols=280 Identities=39% Similarity=0.717 Sum_probs=237.4
Q ss_pred hhHHHHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccCC-CceEEEcccCCCCCHHHHHHhhcCCCCC--CC-C
Q 041120 56 QSMEERFENWLKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQN-LSFKLTDNKFADLSNEEFISTYLGYNKP--YN-E 131 (340)
Q Consensus 56 ~~~~~~f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~~-~s~~~g~N~FsDlt~eEf~~~~~g~~~~--~~-~ 131 (340)
.+....|++|+.+|+|+|.+.+|+.+|+.+|++|+++|++||+++ .+|++|+|+|+|||.|||++++++.... .. .
T Consensus 163 ~e~~~~F~~wk~ky~K~Y~~~eE~~~R~~iF~~Nl~~Ie~hN~~~~~ty~lgiNqFsDlT~EEF~~~~l~~~~~~~~~~~ 242 (489)
T PTZ00021 163 LENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVLYKKGMNRFGDLSFEEFKKKYLTLKSFDFKSNG 242 (489)
T ss_pred hHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHhhccCCCCEEEeccccccCCHHHHHHHhcccccccccccc
Confidence 345567999999999999998899999999999999999999875 7999999999999999999988764321 00 0
Q ss_pred ---C---CC-------CCCCCCCCCCeeeccCCCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhcc
Q 041120 132 ---P---RW-------PSVQYLGLPASVDWRKEGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCD 198 (340)
Q Consensus 132 ---~---~~-------~~~~~~~lP~~~Dwr~~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~ 198 (340)
. .. ...+....|+++|||+.|.|+||||||.||||||||+++++|++++|+++..+.||+|+|+||+
T Consensus 243 ~~~~~~~~~~~~~~~~~~~~~~~~P~s~DWR~~g~VtpVKdQG~CGSCWAFAa~~alEs~~~I~~g~~v~LSeQqLVDCs 322 (489)
T PTZ00021 243 KKSPRVINYDDVIKKYKPKDATFDHAKYDWRLHNGVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELVDCS 322 (489)
T ss_pred ccccccccccccccccccccccCCccccccccCCCCCCcccccccccHHHHHHHHHHHHHHHHHcCCCcccCHHHHhhhc
Confidence 0 00 0000112489999999999999999999999999999999999999999999999999999999
Q ss_pred CCCCCCCCCCCchHHHHHHHHHhCCCCCCCCCCCCCC-CCCcCCCCCCceeEEeceeEEcCCC-----------------
Q 041120 199 VNSENQGCNGGYMEKAFEFITKIGGVTTEDDYPYRGK-NDRCQTDKTKHHAVTITGYEAIPAR----------------- 260 (340)
Q Consensus 199 ~~~~~~gC~GG~~~~a~~~i~~~~Gi~~e~~yPy~~~-~~~c~~~~~~~~~~~i~~y~~~~~~----------------- 260 (340)
.. +.||+||++..|++|+.+++||++|++|||.+. .+.|.... ....+++++|..++..
T Consensus 323 ~~--n~GC~GG~~~~Af~yi~~~gGl~tE~~YPY~~~~~~~C~~~~-~~~~~~i~~y~~i~~~~lk~al~~~GPVsv~i~ 399 (489)
T PTZ00021 323 FK--NNGCYGGLIPNAFEDMIELGGLCSEDDYPYVSDTPELCNIDR-CKEKYKIKSYVSIPEDKFKEAIRFLGPISVSIA 399 (489)
T ss_pred cC--CCCCCCcchHhhhhhhhhccccCcccccCccCCCCCcccccc-ccccceeeeEEEecHHHHHHHHHhcCCeEEEEE
Confidence 75 889999999999999988889999999999987 47897542 2345677888776533
Q ss_pred --cccccccCceecCCCCCCCCeEEEEEEEeecCC----------eeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccc
Q 041120 261 --YAFQLYSHGVFDEYCGHQLNHGVTVVGYGEDHG----------EKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICG 328 (340)
Q Consensus 261 --~~f~~y~~Gi~~~~c~~~~~Hav~iVGyg~~~g----------~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cg 328 (340)
.+|++|++|||+++|+..++|||+|||||++++ .+|||||||||++|||+|||||+|+.+.. .|+||
T Consensus 400 a~~~f~~YkgGIy~~~C~~~~nHAVlIVGYG~e~~~~~~~~~~~~~~YWIVKNSWGt~WGE~GY~rI~r~~~g~-~n~CG 478 (489)
T PTZ00021 400 VSDDFAFYKGGIFDGECGEEPNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNSWGESWGEKGFIRIETDENGL-MKTCS 478 (489)
T ss_pred eecccccCCCCcCCCCCCCccceEEEEEEecCcCCcccccccCCCCCEEEEECCCCCCcccCeEEEEEcCCCCC-CCCCC
Confidence 589999999999889888999999999997632 47999999999999999999999985422 46799
Q ss_pred eeecceeeeec
Q 041120 329 ILMQASYPVKR 339 (340)
Q Consensus 329 i~~~~~yp~~~ 339 (340)
|++.++||+++
T Consensus 479 I~t~a~yP~~~ 489 (489)
T PTZ00021 479 LGTEAYVPLIE 489 (489)
T ss_pred CcccceeEecC
Confidence 99999999874
No 4
>PTZ00200 cysteine proteinase; Provisional
Probab=100.00 E-value=3.2e-73 Score=555.06 Aligned_cols=276 Identities=38% Similarity=0.712 Sum_probs=234.1
Q ss_pred hhHHHHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccCCCceEEEcccCCCCCHHHHHHhhcCCCCCCC-----
Q 041120 56 QSMEERFENWLKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQNLSFKLTDNKFADLSNEEFISTYLGYNKPYN----- 130 (340)
Q Consensus 56 ~~~~~~f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~~~s~~~g~N~FsDlt~eEf~~~~~g~~~~~~----- 130 (340)
.++...|++|+++|+|.|.+.+|+.+|+.+|++|++.|++||. ..+|++|+|+|+|||+|||.+++++.+.+..
T Consensus 120 ~e~~~~F~~f~~ky~K~Y~~~~E~~~R~~iF~~Nl~~I~~hN~-~~~y~lgiN~FsDlT~eEF~~~~~~~~~~~~~~~~~ 198 (448)
T PTZ00200 120 FEVYLEFEEFNKKYNRKHATHAERLNRFLTFRNNYLEVKSHKG-DEPYSKEINKFSDLTEEEFRKLFPVIKVPPKSNSTS 198 (448)
T ss_pred HHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHhcC-cCCeEEeccccccCCHHHHHHHhccCCCcccccccc
Confidence 3556689999999999999889999999999999999999996 3689999999999999999988765432210
Q ss_pred -----------CCCCCC---------CCC----CCCCCeeeccCCCCCCccCCCC-CCCchHHHHHHHHHHHHHHHHcCC
Q 041120 131 -----------EPRWPS---------VQY----LGLPASVDWRKEGAVTPVKDQG-QCGSCWAFSAVAAVEGINKLKTGK 185 (340)
Q Consensus 131 -----------~~~~~~---------~~~----~~lP~~~Dwr~~g~vtpV~nQg-~cGsCwAfA~~~~lE~~~~~~~~~ 185 (340)
...+.. .++ ..+|++||||+.|.|+|||||| .||||||||+++++|++++++++.
T Consensus 199 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P~~~DWR~~g~vtpVkdQG~~CGSCWAFat~~aiEs~~~i~~~~ 278 (448)
T PTZ00200 199 HNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRRADAVTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDK 278 (448)
T ss_pred cccccccccccccccccccccccccccccccccccCCCCccCCCCCCCCCcccCCCccchHHHHhHHHHHHHHHHHhcCC
Confidence 000000 001 1259999999999999999999 999999999999999999999999
Q ss_pred ccccChhHhhhccCCCCCCCCCCCchHHHHHHHHHhCCCCCCCCCCCCCCCCCcCCCCCCceeEEeceeEEcCCC-----
Q 041120 186 LVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITKIGGVTTEDDYPYRGKNDRCQTDKTKHHAVTITGYEAIPAR----- 260 (340)
Q Consensus 186 ~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~~~Gi~~e~~yPy~~~~~~c~~~~~~~~~~~i~~y~~~~~~----- 260 (340)
.+.||+|+|+||+.. +.||+||++..|++|++++ ||++|++|||.+..+.|.... ...+.|.+|..++..
T Consensus 279 ~~~LSeQqLvDC~~~--~~GC~GG~~~~A~~yi~~~-Gi~~e~~YPY~~~~~~C~~~~--~~~~~i~~y~~~~~~~~l~~ 353 (448)
T PTZ00200 279 SVDLSEQELVNCDTK--SQGCSGGYPDTALEYVKNK-GLSSSSDVPYLAKDGKCVVSS--TKKVYIDSYLVAKGKDVLNK 353 (448)
T ss_pred CeecCHHHHhhccCc--cCCCCCCcHHHHHHHHhhc-CccccccCCCCCCCCCCcCCC--CCeeEecceEecCHHHHHHH
Confidence 999999999999874 7899999999999999887 899999999999999997653 234556666544321
Q ss_pred --------------cccccccCceecCCCCCCCCeEEEEEEEeec--CCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCC
Q 041120 261 --------------YAFQLYSHGVFDEYCGHQLNHGVTVVGYGED--HGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNI 324 (340)
Q Consensus 261 --------------~~f~~y~~Gi~~~~c~~~~~Hav~iVGyg~~--~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~ 324 (340)
.+|+.|++|||+++|+..++|||+|||||.+ +|.+|||||||||++|||+|||||+|+.. +.
T Consensus 354 ~l~~GPV~v~i~~~~~f~~Yk~GIy~~~C~~~~nHaV~lVGyG~d~~~g~~YWIIkNSWG~~WGe~GY~ri~r~~~--g~ 431 (448)
T PTZ00200 354 SLVISPTVVYIAVSRELLKYKSGVYNGECGKSLNHAVLLVGEGYDEKTKKRYWIIKNSWGTDWGENGYMRLERTNE--GT 431 (448)
T ss_pred HHhcCCEEEEeecccccccCCCCccccccCCCCcEEEEEEEecccCCCCCceEEEEcCCCCCcccCeeEEEEeCCC--CC
Confidence 5899999999998898779999999999854 68899999999999999999999999742 24
Q ss_pred cccceeecceeeeec
Q 041120 325 GICGILMQASYPVKR 339 (340)
Q Consensus 325 ~~Cgi~~~~~yp~~~ 339 (340)
|.|||++.+.||++.
T Consensus 432 n~CGI~~~~~~P~~~ 446 (448)
T PTZ00200 432 DKCGILTVGLTPVFY 446 (448)
T ss_pred CcCCccccceeeEEe
Confidence 679999999999873
No 5
>KOG1543 consensus Cysteine proteinase Cathepsin L [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.8e-66 Score=491.76 Aligned_cols=266 Identities=47% Similarity=0.848 Sum_probs=229.8
Q ss_pred HHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccC-CCceEEEcccCCCCCHHHHHHhhcCCCCCCCC-CC-CCCCCCCCC
Q 041120 66 LKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQ-NLSFKLTDNKFADLSNEEFISTYLGYNKPYNE-PR-WPSVQYLGL 142 (340)
Q Consensus 66 ~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~-~~s~~~g~N~FsDlt~eEf~~~~~g~~~~~~~-~~-~~~~~~~~l 142 (340)
+.+|.+.|.+..|+..|+.+|++|++.|+.||.. ..+|++++|+|+|+|.+|++..+.+.+.+... .. .......++
T Consensus 30 ~~~~~~~y~~~~~~~~r~~~f~~n~~~~~~~n~~~~~~~~~g~n~~~d~~~ee~~~~~~~~~~~~~~~~~~~~~~~~~~~ 109 (325)
T KOG1543|consen 30 LVKFLKRYEDRVEKKARRAIFKENLQKIESHNLKYVLSFLMGVNQFADLTTEEFKRKKTGKKPPEIKRDKFTEKLDGDDL 109 (325)
T ss_pred hhhhccccccHHHHHHHHHHHHHHHHHHHhhhhhhceeeeeccccccccchHHHHHhhccccCccccccccccccchhhC
Confidence 6777777876789999999999999999999998 58999999999999999999988876554321 11 112234589
Q ss_pred CCeeeccCCCC-CCccCCCCCCCchHHHHHHHHHHHHHHHHcC-CccccChhHhhhccCCCCCCCCCCCchHHHHHHHHH
Q 041120 143 PASVDWRKEGA-VTPVKDQGQCGSCWAFSAVAAVEGINKLKTG-KLVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITK 220 (340)
Q Consensus 143 P~~~Dwr~~g~-vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~ 220 (340)
|++||||++|. ++||||||.||||||||++++||++++|+++ .+++||+|+|+||+.. +++||+||++..|++|+++
T Consensus 110 p~s~DwR~~~~~~~~vkdQg~CgsCWAFaa~~aie~~~~i~~g~~l~sLSeq~lvdC~~~-~~~GC~GG~~~~A~~yi~~ 188 (325)
T KOG1543|consen 110 PDSFDWRDKGAVTPPVKDQGSCGSCWAFAATGALEDRYNIKTGGKLLSLSEQDLVDCCGE-CGDGCNGGEPKNAFKYIKK 188 (325)
T ss_pred CCCccccccCCcCCCcCCCCcCcchHHHHHHHHHHHHHHHHhCCccCccChhhhhhccCC-CCCCcCCCCHHHHHHHHHH
Confidence 99999999974 5559999999999999999999999999999 8999999999999987 6889999999999999999
Q ss_pred hCCCCCCCCCCCCCCCCCcCCCCCCceeEEeceeEEcCCC---------------------cccccccCceecCCCC-C-
Q 041120 221 IGGVTTEDDYPYRGKNDRCQTDKTKHHAVTITGYEAIPAR---------------------YAFQLYSHGVFDEYCG-H- 277 (340)
Q Consensus 221 ~~Gi~~e~~yPy~~~~~~c~~~~~~~~~~~i~~y~~~~~~---------------------~~f~~y~~Gi~~~~c~-~- 277 (340)
+||+.++++|||.+..+.|..... ...+.+.++..++.+ .+|++|++|||.++|. .
T Consensus 189 ~G~~t~~~~Ypy~~~~~~C~~~~~-~~~~~~~~~~~~~~~e~~i~~~v~~~GPv~v~~~a~~~F~~Y~~GVy~~~~~~~~ 267 (325)
T KOG1543|consen 189 NGGVTECENYPYIGKDGTCKSNKK-DKTVTIKGFYNVPANEEAIAEAVAKNGPVSVAIDAYEDFSLYKGGVYAEEKGDDK 267 (325)
T ss_pred hCCCCCCcCCCCcCCCCCccCCCc-cceeEeeeeeecCcCHHHHHHHHHhcCCeEEEEeehhhhhhccCceEeCCCCCCC
Confidence 965555999999999999998854 556777777766655 6899999999999854 4
Q ss_pred CCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeeccee-eee
Q 041120 278 QLNHGVTVVGYGEDHGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASY-PVK 338 (340)
Q Consensus 278 ~~~Hav~iVGyg~~~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~y-p~~ 338 (340)
.++|||+|||||..++.+|||||||||+.|||+|||||.|+.++ |+|++.++| |+.
T Consensus 268 ~~~Hav~iVGyG~~~~~~YWivkNSWG~~WGe~Gy~ri~r~~~~-----~~I~~~~~~~p~~ 324 (325)
T KOG1543|consen 268 EGDHAVLIVGYGTGDGVDYWIVKNSWGTDWGEKGYFRIARGVNK-----CGIASEASYGPIK 324 (325)
T ss_pred CCCceEEEEEEcCCCCceeEEEEcCCCCCcccCceEEEecCCCc-----hhhhcccccCCCC
Confidence 59999999999996678999999999999999999999999644 999999999 764
No 6
>cd02621 Peptidase_C1A_CathepsinC Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopts the papain fold and forms the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. Loss-of-function mutations in cathepsin C are assoc
Probab=100.00 E-value=2.4e-53 Score=388.99 Aligned_cols=187 Identities=39% Similarity=0.809 Sum_probs=158.0
Q ss_pred CCCeeeccCCC----CCCccCCCCCCCchHHHHHHHHHHHHHHHHcCC------ccccChhHhhhccCCCCCCCCCCCch
Q 041120 142 LPASVDWRKEG----AVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGK------LVSLSEQELVDCDVNSENQGCNGGYM 211 (340)
Q Consensus 142 lP~~~Dwr~~g----~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~------~~~LS~q~l~dc~~~~~~~gC~GG~~ 211 (340)
||++||||+.+ +|+||+|||.||||||||+++++|++++++++. .+.||+|+|+||+.. +.||+||++
T Consensus 1 lP~~fDwr~~~~~~~~v~~v~dQg~CGsCwAfa~~~~ies~~~i~~~~~~~~~~~~~lS~q~l~dC~~~--~~GC~GG~~ 78 (243)
T cd02621 1 LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSPQHVLSCSQY--SQGCDGGFP 78 (243)
T ss_pred CCCcccccccCCCCcccccCCCCCcCccHHHHHHHHHHHHHHHHHhCCCCccccCcccCHHHhhhhcCC--CCCCCCCCH
Confidence 79999999998 999999999999999999999999999998876 689999999999864 789999999
Q ss_pred HHHHHHHHHhCCCCCCCCCCCCC-CCCCcCCCCCCceeEEeceeEEcC-----CC---------------------cccc
Q 041120 212 EKAFEFITKIGGVTTEDDYPYRG-KNDRCQTDKTKHHAVTITGYEAIP-----AR---------------------YAFQ 264 (340)
Q Consensus 212 ~~a~~~i~~~~Gi~~e~~yPy~~-~~~~c~~~~~~~~~~~i~~y~~~~-----~~---------------------~~f~ 264 (340)
..|++|++++ |+++|++|||.. ..+.|.........+.++.|..+. .. .+|+
T Consensus 79 ~~a~~~~~~~-Gi~~e~~yPY~~~~~~~C~~~~~~~~~~~~~~~~~i~~~~~~~~~~~ik~~i~~~GPv~v~~~~~~~F~ 157 (243)
T cd02621 79 FLVGKFAEDF-GIVTEDYFPYTADDDRPCKASPSECRRYYFSDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVYSDFD 157 (243)
T ss_pred HHHHHHHHhc-CcCCCceeCCCCCCCCCCCCCccccccccccceeEcccccccCCHHHHHHHHHHcCCEEEEEEeccccc
Confidence 9999999988 899999999998 677897543122233333333221 01 5899
Q ss_pred cccCceecCC-----CCC---------CCCeEEEEEEEeecC--CeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccc
Q 041120 265 LYSHGVFDEY-----CGH---------QLNHGVTVVGYGEDH--GEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICG 328 (340)
Q Consensus 265 ~y~~Gi~~~~-----c~~---------~~~Hav~iVGyg~~~--g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cg 328 (340)
+|++|||+.+ |+. .++|||+|||||++. |.+|||||||||++|||+|||||+|+. | .||
T Consensus 158 ~Y~~GIy~~~~~~~~C~~~~~~~~~~~~~~HaV~iVGyg~~~~~g~~YWiirNSWG~~WGe~Gy~~i~~~~-~----~cg 232 (243)
T cd02621 158 FYKEGVYHHTDNDEVSDGDNDNFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEKGYFKIRRGT-N----ECG 232 (243)
T ss_pred ccCCeEECcCCcccccccccccccCcccCCeEEEEEEeeccCCCCCcEEEEEcCCCCCCCcCCeEEEecCC-c----ccC
Confidence 9999999875 642 479999999999986 899999999999999999999999984 3 499
Q ss_pred eeecceee
Q 041120 329 ILMQASYP 336 (340)
Q Consensus 329 i~~~~~yp 336 (340)
|++.+.+.
T Consensus 233 i~~~~~~~ 240 (243)
T cd02621 233 IESQAVFA 240 (243)
T ss_pred cccceEee
Confidence 99998653
No 7
>cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.
Probab=100.00 E-value=4.6e-53 Score=386.10 Aligned_cols=192 Identities=32% Similarity=0.654 Sum_probs=161.9
Q ss_pred CCCeeeccCCC---CCCccCCCC---CCCchHHHHHHHHHHHHHHHHcC---CccccChhHhhhccCCCCCCCCCCCchH
Q 041120 142 LPASVDWRKEG---AVTPVKDQG---QCGSCWAFSAVAAVEGINKLKTG---KLVSLSEQELVDCDVNSENQGCNGGYME 212 (340)
Q Consensus 142 lP~~~Dwr~~g---~vtpV~nQg---~cGsCwAfA~~~~lE~~~~~~~~---~~~~LS~q~l~dc~~~~~~~gC~GG~~~ 212 (340)
||++||||+.+ +|+|||||| .||||||||++++||+++.++++ ..+.||+|+|+||+. +.||+||++.
T Consensus 1 lP~~~Dwr~~~~~~~v~~vk~Qg~~~~CGsCwAfa~~~aies~~~i~~~~~~~~~~lS~Q~lldC~~---~~gC~GG~~~ 77 (239)
T cd02698 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQVVIDCAG---GGSCHGGDPG 77 (239)
T ss_pred CCCCcccccCCCCcccCccccCCCCCCCCcchHHHhHHHHHHHHHHHHCCCCCCcccCHHHHHhCCC---CCCccCcCHH
Confidence 69999999987 899999998 89999999999999999999875 357899999999986 6799999999
Q ss_pred HHHHHHHHhCCCCCCCCCCCCCCCCCcCCCCC--------------CceeEEeceeEEcCCC------------------
Q 041120 213 KAFEFITKIGGVTTEDDYPYRGKNDRCQTDKT--------------KHHAVTITGYEAIPAR------------------ 260 (340)
Q Consensus 213 ~a~~~i~~~~Gi~~e~~yPy~~~~~~c~~~~~--------------~~~~~~i~~y~~~~~~------------------ 260 (340)
.|++|++++ |+++|++|||....+.|..... ....+.+++|..++..
T Consensus 78 ~a~~~~~~~-Gl~~e~~yPY~~~~~~C~~~~~~~~c~~~~~c~~~~~~~~~~i~~~~~~~~~~~i~~~l~~~GPV~v~i~ 156 (239)
T cd02698 78 GVYEYAHKH-GIPDETCNPYQAKDGECNPFNRCGTCNPFGECFAIKNYTLYFVSDYGSVSGRDKMMAEIYARGPISCGIM 156 (239)
T ss_pred HHHHHHHHc-CcCCCCeeCCcCCCCCCcCCCCCCCcccCcccccccccceEEeeeceecCCHHHHHHHHHHcCCEEEEEE
Confidence 999999998 8999999999987777753100 1123566666655422
Q ss_pred --cccccccCceecCC-CCCCCCeEEEEEEEeecC-CeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceee
Q 041120 261 --YAFQLYSHGVFDEY-CGHQLNHGVTVVGYGEDH-GEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYP 336 (340)
Q Consensus 261 --~~f~~y~~Gi~~~~-c~~~~~Hav~iVGyg~~~-g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp 336 (340)
.+|+.|++|||+.+ |...++|||+|||||+++ |++|||||||||++|||+|||||+|+.-..-.+.|||++++.|+
T Consensus 157 ~~~~f~~Y~~GIy~~~~~~~~~~HaV~IVGyG~~~~g~~YWiikNSWG~~WGe~Gy~~i~rg~~~~~~~~~~i~~~~~~~ 236 (239)
T cd02698 157 ATEALENYTGGVYKEYVQDPLINHIISVAGWGVDENGVEYWIVRNSWGEPWGERGWFRIVTSSYKGARYNLAIEEDCAWA 236 (239)
T ss_pred ecccccccCCeEEccCCCCCcCCeEEEEEEEEecCCCCEEEEEEcCCCcccCcCceEEEEccCCcccccccccccceEEE
Confidence 57999999999886 556789999999999886 99999999999999999999999998511112459999999997
Q ss_pred e
Q 041120 337 V 337 (340)
Q Consensus 337 ~ 337 (340)
.
T Consensus 237 ~ 237 (239)
T cd02698 237 D 237 (239)
T ss_pred e
Confidence 5
No 8
>cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to h
Probab=100.00 E-value=5.7e-52 Score=371.08 Aligned_cols=186 Identities=58% Similarity=1.135 Sum_probs=167.2
Q ss_pred CCeeeccCCCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhccCCCCCCCCCCCchHHHHHHHHHhC
Q 041120 143 PASVDWRKEGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITKIG 222 (340)
Q Consensus 143 P~~~Dwr~~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~~~ 222 (340)
|++||||+.+.++||+|||.||+|||||++++||++++++++....||+|+|++|... .+.+|+||.+..|+++++++
T Consensus 1 P~~~d~r~~~~~~~v~dQg~cgsCwAfa~~~~le~~~~i~~~~~~~lS~q~l~~c~~~-~~~gC~GG~~~~a~~~~~~~- 78 (210)
T cd02248 1 PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCSTS-GNNGCNGGNPDNAFEYVKNG- 78 (210)
T ss_pred CCcccCCcCCCCCCCccCCCCcchHHhHHHHHHHHHHHHHcCCCcccCHHHHhccCCC-CCCCCCCCCHHHhHHHHHHC-
Confidence 8899999999999999999999999999999999999999998899999999999874 47899999999999999887
Q ss_pred CCCCCCCCCCCCCCCCcCCCCCCceeEEeceeEEcCCC----------------------cccccccCceecCC-C-CCC
Q 041120 223 GVTTEDDYPYRGKNDRCQTDKTKHHAVTITGYEAIPAR----------------------YAFQLYSHGVFDEY-C-GHQ 278 (340)
Q Consensus 223 Gi~~e~~yPy~~~~~~c~~~~~~~~~~~i~~y~~~~~~----------------------~~f~~y~~Gi~~~~-c-~~~ 278 (340)
|+++|++|||......|.... .....++.+|..++.. .+|+.|++|||..+ | ...
T Consensus 79 Gi~~e~~yPY~~~~~~C~~~~-~~~~~~i~~~~~i~~~~~~~ik~~l~~~gPV~~~~~~~~~f~~y~~Giy~~~~~~~~~ 157 (210)
T cd02248 79 GLASESDYPYTGKDGTCKYNS-SKVGAKITGYSNVPPGDEEALKAALANYGPVSVAIDASSSFQFYKGGIYSGPCCSNTN 157 (210)
T ss_pred CcCccccCCccCCCCCccCCC-CcccEEEeeEEEcCCCcHHHHHHHHhhcCCEEEEEecCcccccCCCCceeCCCCCCCc
Confidence 899999999999888898763 3466888888877652 58999999999987 4 356
Q ss_pred CCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceee
Q 041120 279 LNHGVTVVGYGEDHGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYP 336 (340)
Q Consensus 279 ~~Hav~iVGyg~~~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp 336 (340)
++|||+|||||++.+.+|||||||||++||++|||||+|+. + .|||++.+.||
T Consensus 158 ~~Hav~iVGy~~~~~~~ywiv~NSWG~~WG~~Gy~~i~~~~-~----~cgi~~~~~~~ 210 (210)
T cd02248 158 LNHAVLLVGYGTENGVDYWIVKNSWGTSWGEKGYIRIARGS-N----LCGIASYASYP 210 (210)
T ss_pred CCEEEEEEEEeecCCceEEEEEcCCCCccccCcEEEEEcCC-C----ccCceeeeecC
Confidence 79999999999998899999999999999999999999984 3 49999998887
No 9
>cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane
Probab=100.00 E-value=3.4e-52 Score=379.71 Aligned_cols=185 Identities=37% Similarity=0.732 Sum_probs=152.8
Q ss_pred CCeeeccCC--CCC--CccCCCCCCCchHHHHHHHHHHHHHHHHcC--CccccChhHhhhccCCCCCCCCCCCchHHHHH
Q 041120 143 PASVDWRKE--GAV--TPVKDQGQCGSCWAFSAVAAVEGINKLKTG--KLVSLSEQELVDCDVNSENQGCNGGYMEKAFE 216 (340)
Q Consensus 143 P~~~Dwr~~--g~v--tpV~nQg~cGsCwAfA~~~~lE~~~~~~~~--~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~ 216 (340)
|++||||++ +++ +||+|||.||||||||++++||+++.++++ +.+.||+|+|+||+.. .+.||+||++..|++
T Consensus 1 p~~~DwR~~~~~~~~v~~v~dQg~CGsCwAfa~~~~le~~~~i~~~~~~~~~LS~Q~lidC~~~-~~~gC~GG~~~~a~~ 79 (236)
T cd02620 1 PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQDLLSCCSG-CGDGCNGGYPDAAWK 79 (236)
T ss_pred CCcccchhhCCCCCCccccCCcccchhHHHHHHHHHHhhHHHHhcCCCCccccCHHHHHhhcCC-CCCCCCCCCHHHHHH
Confidence 899999997 454 599999999999999999999999999888 7789999999999874 478999999999999
Q ss_pred HHHHhCCCCCCCCCCCCCCCCC------------------cCCCCC---CceeEEeceeEEcCCC---------------
Q 041120 217 FITKIGGVTTEDDYPYRGKNDR------------------CQTDKT---KHHAVTITGYEAIPAR--------------- 260 (340)
Q Consensus 217 ~i~~~~Gi~~e~~yPy~~~~~~------------------c~~~~~---~~~~~~i~~y~~~~~~--------------- 260 (340)
|++++ |+++|++|||.+.... |..... ....+++..+..+...
T Consensus 80 ~i~~~-G~~~e~~yPY~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~~~~~~~~~~~~~ik~~l~~~GPv~ 158 (236)
T cd02620 80 YLTTT-GVVTGGCQPYTIPPCGHHPEGPPPCCGTPYCTPKCQDGCEKTYEEDKHKGKSAYSVPSDETDIMKEIMTNGPVQ 158 (236)
T ss_pred HHHhc-CCCcCCEecCcCCCCccCCCCCCCCCCCCCCCCCCCcCCccccceeeeeecceeeeCCHHHHHHHHHHHCCCeE
Confidence 99988 8999999999886543 322210 0112233333333221
Q ss_pred ------cccccccCceecCCCCC-CCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecc
Q 041120 261 ------YAFQLYSHGVFDEYCGH-QLNHGVTVVGYGEDHGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQA 333 (340)
Q Consensus 261 ------~~f~~y~~Gi~~~~c~~-~~~Hav~iVGyg~~~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~ 333 (340)
++|+.|++|||+.+|+. .++|||+|||||+++|++|||||||||++|||+|||||+|+. | .|||++.+
T Consensus 159 v~i~~~~~f~~Y~~Giy~~~~~~~~~~HaV~iVGyg~~~g~~YWivrNSWG~~WGe~Gy~ri~~~~-~----~cgi~~~~ 233 (236)
T cd02620 159 AAFTVYEDFLYYKSGVYQHTSGKQLGGHAVKIIGWGVENGVPYWLAANSWGTDWGENGYFRILRGS-N----ECGIESEV 233 (236)
T ss_pred EEEEechhhhhcCCcEEeecCCCCcCCeEEEEEEEeccCCeeEEEEEeCCCCCCCCCcEEEEEccC-c----ccccccce
Confidence 68999999999876654 468999999999988999999999999999999999999984 3 39999987
Q ss_pred e
Q 041120 334 S 334 (340)
Q Consensus 334 ~ 334 (340)
+
T Consensus 234 ~ 234 (236)
T cd02620 234 V 234 (236)
T ss_pred e
Confidence 5
No 10
>PF00112 Peptidase_C1: Papain family cysteine protease This is family C1 in the peptidase classification. ; InterPro: IPR000668 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues. The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. ; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MOR_B 3HHI_B 1S4V_A 3F75_A 1MEG_A 1PCI_C 1PPO_A 3HD3_B 1F29_A 1EWL_A ....
Probab=100.00 E-value=4.7e-51 Score=366.21 Aligned_cols=191 Identities=46% Similarity=0.955 Sum_probs=162.0
Q ss_pred CCCeeeccCC-CCCCccCCCCCCCchHHHHHHHHHHHHHHHHc-CCccccChhHhhhccCCCCCCCCCCCchHHHHHHHH
Q 041120 142 LPASVDWRKE-GAVTPVKDQGQCGSCWAFSAVAAVEGINKLKT-GKLVSLSEQELVDCDVNSENQGCNGGYMEKAFEFIT 219 (340)
Q Consensus 142 lP~~~Dwr~~-g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~-~~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~ 219 (340)
||++||||+. +.++||+|||.||+|||||+++++|++++++. ...+.||+|+|++|.. ..+.+|+||++..|+++++
T Consensus 1 lP~~~D~r~~~~~~~~v~dQg~~gsCwafa~~~~~e~~~~~~~~~~~~~lS~q~l~~~~~-~~~~~c~gg~~~~a~~~~~ 79 (219)
T PF00112_consen 1 LPKSFDWRDKGGRITPVRDQGSCGSCWAFAAAAALESRLAIQNNGKNVDLSEQYLIDCSN-KYNKGCDGGSPFDALKYIK 79 (219)
T ss_dssp STSSEEGGGTTTCSG---BTTSSBTHHHHHHHHHHHHHHHHHHTSSCEEB-HHHHHHHST-GTSSTTBBBEHHHHHHHHH
T ss_pred CCCCEecccCCCCcCccccCCcccccccchhccceecccccccccccccccccccccccc-ccccccccCcccccceeec
Confidence 7999999998 48999999999999999999999999999999 7889999999999998 2367999999999999999
Q ss_pred HhCCCCCCCCCCCCCCC-CCcCCCCCCceeEEeceeEEcCCC----------------------c-ccccccCceecCC-
Q 041120 220 KIGGVTTEDDYPYRGKN-DRCQTDKTKHHAVTITGYEAIPAR----------------------Y-AFQLYSHGVFDEY- 274 (340)
Q Consensus 220 ~~~Gi~~e~~yPy~~~~-~~c~~~~~~~~~~~i~~y~~~~~~----------------------~-~f~~y~~Gi~~~~- 274 (340)
+..|+++|++|||.... ..|..........++..|..+... . +|+.|++|||..+
T Consensus 80 ~~~Gi~~e~~~pY~~~~~~~c~~~~~~~~~~~i~~~~~~~~~~~~~ik~~L~~~gpV~~~~~~~~~~f~~~~~gi~~~~~ 159 (219)
T PF00112_consen 80 NNNGIVTEEDYPYNGNENPTCKSKKSNSYYVKIKGYGKVKDNDIEDIKKALMKYGPVVASIDVSSEDFQNYKSGIYDPPD 159 (219)
T ss_dssp HHTSBEBTTTS--SSSSSCSSCHSGGGEEEBEESEEEEEESTCHHHHHHHHHHHSSEEEEEEEESHHHHTEESSEECSTS
T ss_pred ccCcccccccccccccccccccccccccccccccccccccccchhHHHHHHhhCceeeeeeeccccccccccceeeeccc
Confidence 93389999999999877 688876322224678888776653 4 5999999999986
Q ss_pred CC-CCCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceeee
Q 041120 275 CG-HQLNHGVTVVGYGEDHGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYPV 337 (340)
Q Consensus 275 c~-~~~~Hav~iVGyg~~~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp~ 337 (340)
|. ..++|||+|||||++.+++|||||||||++||++|||||+|+.++ +|||+++++||+
T Consensus 160 ~~~~~~~Hav~iVGy~~~~~~~~wiv~NSWG~~WG~~Gy~~i~~~~~~----~c~i~~~~~~~~ 219 (219)
T PF00112_consen 160 CSNESGGHAVLIVGYDDENGKGYWIVKNSWGTDWGDNGYFRISYDYNN----ECGIESQAVYPI 219 (219)
T ss_dssp SSSSSEEEEEEEEEEEEETTEEEEEEE-SBTTTSTBTTEEEEESSSSS----GGGTTSSEEEEE
T ss_pred cccccccccccccccccccceeeEeeehhhCCccCCCeEEEEeeCCCC----cCccCceeeecC
Confidence 76 478999999999999999999999999999999999999999643 499999999996
No 11
>smart00645 Pept_C1 Papain family cysteine protease.
Probab=100.00 E-value=1.2e-50 Score=352.97 Aligned_cols=166 Identities=61% Similarity=1.161 Sum_probs=146.6
Q ss_pred CCCeeeccCCCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhccCCCCCCCCCCCchHHHHHHHHHh
Q 041120 142 LPASVDWRKEGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITKI 221 (340)
Q Consensus 142 lP~~~Dwr~~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~~ 221 (340)
||++||||+.++++||+|||.||+|||||+++++|+++++++++.++||+|+|++|... .+.||+||.+..|++|++++
T Consensus 1 lP~~~D~R~~~~~~~v~dQg~CGsCwAfa~~~~ie~~~~i~~~~~~~lS~q~l~~C~~~-~~~gC~GG~~~~a~~~~~~~ 79 (174)
T smart00645 1 LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCSTG-GNNGCNGGLPDNAFEYIKKN 79 (174)
T ss_pred CCCcCcccccCCCCccccCcccchHHHHHHHHHHHHHHHHhcCCccccCHHHHhhhcCC-CCCCCCCcCHHHHHHHHHHc
Confidence 69999999999999999999999999999999999999999998999999999999875 46699999999999999887
Q ss_pred CCCCCCCCCCCCCCCCCcCCCCCCceeEEeceeEEcCCCcccccccCceecCC-CCC-CCCeEEEEEEEeec-CCeeEEE
Q 041120 222 GGVTTEDDYPYRGKNDRCQTDKTKHHAVTITGYEAIPARYAFQLYSHGVFDEY-CGH-QLNHGVTVVGYGED-HGEKYWL 298 (340)
Q Consensus 222 ~Gi~~e~~yPy~~~~~~c~~~~~~~~~~~i~~y~~~~~~~~f~~y~~Gi~~~~-c~~-~~~Hav~iVGyg~~-~g~~ywi 298 (340)
+|+++|++|||.. .+.+ .. .+|+.|++|||+.+ |+. .++|+|+|||||.+ +|++|||
T Consensus 80 ~Gi~~e~~~PY~~-------------~~~~----~~---~~f~~Y~~Gi~~~~~~~~~~~~Hav~ivGyg~~~~g~~yWi 139 (174)
T smart00645 80 GGLETESCYPYTG-------------SVAI----DA---SDFQFYKSGIYDHPGCGSGTLDHAVLIVGYGTEENGKDYWI 139 (174)
T ss_pred CCcccccccCccc-------------EEEE----Ec---ccccCCcCeEECCCCCCCCcccEEEEEEEEeecCCCeeEEE
Confidence 6799999999976 1111 11 25999999999985 865 37999999999987 8999999
Q ss_pred EEcCCCCCCCCCceEEEEeCCCCCCCcccceeec
Q 041120 299 VKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQ 332 (340)
Q Consensus 299 vkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~ 332 (340)
||||||+.|||+|||||+++..+ .|||+..
T Consensus 140 i~NSwG~~WG~~G~~~i~~~~~~----~c~i~~~ 169 (174)
T smart00645 140 VKNSWGTDWGENGYFRIARGKNN----ECGIEAS 169 (174)
T ss_pred EECCCCCCcccCeEEEEEcCCCC----ccCceee
Confidence 99999999999999999998534 4999554
No 12
>PTZ00049 cathepsin C-like protein; Provisional
Probab=100.00 E-value=3.2e-49 Score=395.63 Aligned_cols=193 Identities=28% Similarity=0.577 Sum_probs=158.0
Q ss_pred CCCCCCeeeccCC----CCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCC-----c-----cccChhHhhhccCCCCCC
Q 041120 139 YLGLPASVDWRKE----GAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGK-----L-----VSLSEQELVDCDVNSENQ 204 (340)
Q Consensus 139 ~~~lP~~~Dwr~~----g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~-----~-----~~LS~q~l~dc~~~~~~~ 204 (340)
..+||++||||+. +.++||+|||.||||||||+++++|++++|+++. . ..||+|+|+||+.. +.
T Consensus 378 ~~~LP~sfDWRd~~~~~~~vtpVkdQG~CGSCWAFAat~alEsR~~Ia~~~~l~~~~~~~~~~~LS~QqLLDCs~~--nq 455 (693)
T PTZ00049 378 IDELPKNFTWGDPFNNNTREYDVTNQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNFDDLLSIQTVLSCSFY--DQ 455 (693)
T ss_pred cccCCCCEecCcCCCCCCcccCCCCCccCcHHHHHHHHHHHHHHHHHHhccccccccccccccCcCHHHhcccCCC--CC
Confidence 4589999999984 6799999999999999999999999999998643 1 27999999999874 78
Q ss_pred CCCCCchHHHHHHHHHhCCCCCCCCCCCCCCCCCcCCCCCC--------------------------------------c
Q 041120 205 GCNGGYMEKAFEFITKIGGVTTEDDYPYRGKNDRCQTDKTK--------------------------------------H 246 (340)
Q Consensus 205 gC~GG~~~~a~~~i~~~~Gi~~e~~yPy~~~~~~c~~~~~~--------------------------------------~ 246 (340)
||+||++..|++|++++ ||++|++|||.+..+.|...... .
T Consensus 456 GC~GG~~~~A~kya~~~-GI~tEscYPY~a~~g~C~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 534 (693)
T PTZ00049 456 GCNGGFPYLVSKMAKLQ-GIPLDKVFPYTATEQTCPYQVDQSANSMNGSANLRQINAVFFSSETQSDMHADFEAPISSEP 534 (693)
T ss_pred CcCCCcHHHHHHHHHHC-CCCcCCccCCcCCCCCCCCCCCCccccccccccccccccccccccccccccccccccccccc
Confidence 99999999999999888 89999999999988888642110 0
Q ss_pred eeEEeceeEEcC--------CC---------------------cccccccCceecC-------CCCC-------------
Q 041120 247 HAVTITGYEAIP--------AR---------------------YAFQLYSHGVFDE-------YCGH------------- 277 (340)
Q Consensus 247 ~~~~i~~y~~~~--------~~---------------------~~f~~y~~Gi~~~-------~c~~------------- 277 (340)
..+.++.|..++ .. .+|++|++|||+. .|..
T Consensus 535 ~r~y~k~y~yI~g~y~~~~~~~E~~Im~eI~~~GPVsVsIda~~dF~~YksGVY~~~~~~h~~~C~~d~~~~~~~~~~~G 614 (693)
T PTZ00049 535 ARWYAKDYNYIGGCYGCNQCNGEKIMMNEIYRNGPIVASFEASPDFYDYADGVYYVEDFPHARRCTVDLPKHNGVYNITG 614 (693)
T ss_pred cceeeeeeEEecccccccCCCCHHHHHHHHHhcCCEEEEEEechhhhcCCCccccCcccccccccCCccccccccccccc
Confidence 112334444432 11 5899999999985 2642
Q ss_pred --CCCeEEEEEEEeec--CCe--eEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceeeeec
Q 041120 278 --QLNHGVTVVGYGED--HGE--KYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYPVKR 339 (340)
Q Consensus 278 --~~~Hav~iVGyg~~--~g~--~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp~~~ 339 (340)
.++|||+|||||.+ +|. +|||||||||+.|||+|||||+|+. | .|||++++.|+..+
T Consensus 615 ~e~~NHAVlIVGwG~d~enG~~~~YWIVRNSWGt~WGenGYfKI~RG~-N----~CGIEs~a~~~~pd 677 (693)
T PTZ00049 615 WEKVNHAIVLVGWGEEEINGKLYKYWIGRNSWGKNWGKEGYFKIIRGK-N----FSGIESQSLFIEPD 677 (693)
T ss_pred cccCceEEEEEEeccccCCCcccCEEEEECCCCCCcccCceEEEEcCC-C----ccCCccceeEEeee
Confidence 36999999999985 453 7999999999999999999999995 3 49999999998743
No 13
>PTZ00364 dipeptidyl-peptidase I precursor; Provisional
Probab=100.00 E-value=1.1e-48 Score=387.90 Aligned_cols=186 Identities=27% Similarity=0.560 Sum_probs=153.9
Q ss_pred CCCCCeeeccCCC---CCCccCCCCC---CCchHHHHHHHHHHHHHHHHc------CCccccChhHhhhccCCCCCCCCC
Q 041120 140 LGLPASVDWRKEG---AVTPVKDQGQ---CGSCWAFSAVAAVEGINKLKT------GKLVSLSEQELVDCDVNSENQGCN 207 (340)
Q Consensus 140 ~~lP~~~Dwr~~g---~vtpV~nQg~---cGsCwAfA~~~~lE~~~~~~~------~~~~~LS~q~l~dc~~~~~~~gC~ 207 (340)
.+||++||||+.| +|+||||||. ||||||||+++++|++++|++ +..+.||+|+|+||+.. +.||+
T Consensus 203 ~~LP~sfDWR~~gg~~~VtpVrdQg~~~~CGSCWAFAav~alEsr~~I~tn~~~~~g~~~~LS~QqLVDCs~~--n~GCd 280 (548)
T PTZ00364 203 DPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDPLGQQTFLSARHVLDCSQY--GQGCA 280 (548)
T ss_pred cCCCCccccCcCCCCccCCCCcCCCCCCCCcCHHHHHHHHHHHHHHHHHhCCCcccCcccCcCHHHHhcccCC--CCCCC
Confidence 5799999999987 7999999999 999999999999999999988 34689999999999864 78999
Q ss_pred CCchHHHHHHHHHhCCCCCCCCC--CCCCCCC---CcCCCCCCceeEE------eceeEEcCCC----------------
Q 041120 208 GGYMEKAFEFITKIGGVTTEDDY--PYRGKND---RCQTDKTKHHAVT------ITGYEAIPAR---------------- 260 (340)
Q Consensus 208 GG~~~~a~~~i~~~~Gi~~e~~y--Py~~~~~---~c~~~~~~~~~~~------i~~y~~~~~~---------------- 260 (340)
||++..|++|++++ ||++|++| ||.+.++ .|+... ....+. +.+|..+..+
T Consensus 281 GG~p~~A~~yi~~~-GI~tE~dY~~PY~~~dg~~~~Ck~~~-~~~~y~~~~~~~I~gyy~~~~~e~~I~~eI~~~GPVsV 358 (548)
T PTZ00364 281 GGFPEEVGKFAETF-GILTTDSYYIPYDSGDGVERACKTRR-PSRRYYFTNYGPLGGYYGAVTDPDEIIWEIYRHGPVPA 358 (548)
T ss_pred CCcHHHHHHHHHhC-CcccccccCCCCCCCCCCCCCCCCCc-ccceeeeeeeEEecceeecCCcHHHHHHHHHHcCCeEE
Confidence 99999999999888 89999999 9987665 487542 222222 3333322222
Q ss_pred -----cccccccCceecC---------CC-----------CCCCCeEEEEEEEeec-CCeeEEEEEcCCCC--CCCCCce
Q 041120 261 -----YAFQLYSHGVFDE---------YC-----------GHQLNHGVTVVGYGED-HGEKYWLVKNSWGT--SWGEAGY 312 (340)
Q Consensus 261 -----~~f~~y~~Gi~~~---------~c-----------~~~~~Hav~iVGyg~~-~g~~ywivkNSWG~--~WGe~Gy 312 (340)
.+|..|++|||.+ .| ...++|||+|||||.+ +|.+|||||||||+ +|||+||
T Consensus 359 aIda~~df~~YksGiy~gi~~~~~~~~~~~~~~~~~~~~~~~~~nHAVlIVGYG~de~G~~YWIVKNSWGt~~~WGE~GY 438 (548)
T PTZ00364 359 SVYANSDWYNCDENSTEDVRYVSLDDYSTASADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSWCDGGT 438 (548)
T ss_pred EEEechHHHhcCCCCccCeeccccccccccccCCcccccccccCCeEEEEEEecccCCCceEEEEECCCCCCCCcccCCe
Confidence 5799999999862 11 1347999999999974 78899999999999 9999999
Q ss_pred EEEEeCCCCCCCcccceeecce
Q 041120 313 IRMARNSPSSNIGICGILMQAS 334 (340)
Q Consensus 313 ~~i~~~~~~~~~~~Cgi~~~~~ 334 (340)
|||+|+. |. |||++.+.
T Consensus 439 fRI~RG~-N~----CGIes~~v 455 (548)
T PTZ00364 439 RKIARGV-NA----YNIESEVV 455 (548)
T ss_pred EEEEcCC-Cc----ccccceee
Confidence 9999984 43 99999976
No 14
>PTZ00462 Serine-repeat antigen protein; Provisional
Probab=100.00 E-value=1.9e-42 Score=355.78 Aligned_cols=182 Identities=25% Similarity=0.542 Sum_probs=145.8
Q ss_pred CCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhccCCCCCCCCCCCc-hHHHHHHHHHhCCCCCCCCCCC
Q 041120 154 VTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCDVNSENQGCNGGY-MEKAFEFITKIGGVTTEDDYPY 232 (340)
Q Consensus 154 vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~~~~~~~gC~GG~-~~~a~~~i~~~~Gi~~e~~yPy 232 (340)
..||+|||.||+|||||+++++|++++++++..+.||+|+|+||+...++.||.||. +..++.|+.++||+++|++|||
T Consensus 544 ~i~VKDQG~CGSCWAFASaaaLES~~cIkgg~~v~LSeQqLVDCs~~~gn~GC~GG~~~~efl~yI~e~GgLptESdYPY 623 (1004)
T PTZ00462 544 KIQIEDQGNCAISWIFASKYHLETIKCMKGYEPHAISALYIANCSKGEHKDRCDEGSNPLEFLQIIEDNGFLPADSNYLY 623 (1004)
T ss_pred CCCcccCCcchHHHHHHHHHHHHHHHHHhcCCCcccCHHHHHhcccccCCCCCCCCCcHHHHHHHHHHcCCCcccccCCC
Confidence 479999999999999999999999999999999999999999999765678999997 5556699988877999999999
Q ss_pred CC--CCCCcCCCCC-----------------CceeEEeceeEEcCC-----------C------------------cccc
Q 041120 233 RG--KNDRCQTDKT-----------------KHHAVTITGYEAIPA-----------R------------------YAFQ 264 (340)
Q Consensus 233 ~~--~~~~c~~~~~-----------------~~~~~~i~~y~~~~~-----------~------------------~~f~ 264 (340)
.+ ..+.|..... ....+.+.+|..+.. . .+|+
T Consensus 624 t~k~~~g~Cp~~~~~w~n~~~~~kll~~~~~~~~~i~~kgY~~~~s~~~~~n~d~~i~~IK~eI~~kGPVaV~IdAsdf~ 703 (1004)
T PTZ00462 624 NYTKVGEDCPDEEDHWMNLLDHGKILNHNKKEPNSLDGKAYRAYESEHFHDKMDAFIKIIKDEIMNKGSVIAYIKAENVL 703 (1004)
T ss_pred ccCCCCCCCCCCcccccccccccccccccccccceeeccceEEecccccccchhhHHHHHHHHHHhcCCEEEEEEeehHH
Confidence 75 4567864311 001223344443321 0 5788
Q ss_pred ccc-CceecCC-CCC-CCCeEEEEEEEeec-----CCeeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecceee
Q 041120 265 LYS-HGVFDEY-CGH-QLNHGVTVVGYGED-----HGEKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQASYP 336 (340)
Q Consensus 265 ~y~-~Gi~~~~-c~~-~~~Hav~iVGyg~~-----~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~yp 336 (340)
.|. +|||... |+. .++|||+|||||.+ .+++|||||||||+.|||+|||||.|...+ .|||+.-..+|
T Consensus 704 ~Y~~sGIyv~~~Cgs~~~nHAVlIVGYGt~in~eg~gk~YWIVRNSWGt~WGEnGYFKI~r~g~n----~CGin~i~t~~ 779 (1004)
T PTZ00462 704 GYEFNGKKVQNLCGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGYFKVDMYGPS----HCEDNFIHSVV 779 (1004)
T ss_pred hhhcCCccccCCCCCCcCCceEEEEEecccccccCCCCceEEEEcCCCCCcCCCeEEEEEeCCCC----CCccchheeee
Confidence 884 8986654 884 57999999999974 257899999999999999999999995444 49999999999
Q ss_pred eec
Q 041120 337 VKR 339 (340)
Q Consensus 337 ~~~ 339 (340)
++|
T Consensus 780 ~fn 782 (1004)
T PTZ00462 780 IFN 782 (1004)
T ss_pred eEe
Confidence 876
No 15
>cd02619 Peptidase_C1 C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel str
Probab=100.00 E-value=4.7e-42 Score=308.38 Aligned_cols=173 Identities=35% Similarity=0.649 Sum_probs=146.0
Q ss_pred eeeccCCCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcC--CccccChhHhhhccCCCC---CCCCCCCchHHHHH-HH
Q 041120 145 SVDWRKEGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTG--KLVSLSEQELVDCDVNSE---NQGCNGGYMEKAFE-FI 218 (340)
Q Consensus 145 ~~Dwr~~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~--~~~~LS~q~l~dc~~~~~---~~gC~GG~~~~a~~-~i 218 (340)
.+|||+.+ ++||+|||.||+|||||+++++|+++.+++. +.++||+|+|++|..... ..||.||.+..++. ++
T Consensus 1 ~~d~r~~~-~~~v~dQg~~gsCwafa~~~~les~~~~~~~~~~~~~lS~q~l~~c~~~~~~~~~~~c~gG~~~~~~~~~~ 79 (223)
T cd02619 1 SVDLRPLR-LTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECLGINGSCDGGGPLSALLKLV 79 (223)
T ss_pred CCcchhcC-CCCcccCCCCcCcHHHHHHHHHHHHHHHhcCCcccccCCHHHHHHhccccccccCCCCCCCcHHHHHHHHH
Confidence 48999988 9999999999999999999999999999987 789999999999987632 37999999999998 77
Q ss_pred HHhCCCCCCCCCCCCCCCCCcCCC---CCCceeEEeceeEEcCCC----------------------cccccccCceec-
Q 041120 219 TKIGGVTTEDDYPYRGKNDRCQTD---KTKHHAVTITGYEAIPAR----------------------YAFQLYSHGVFD- 272 (340)
Q Consensus 219 ~~~~Gi~~e~~yPy~~~~~~c~~~---~~~~~~~~i~~y~~~~~~----------------------~~f~~y~~Gi~~- 272 (340)
+.+ |+++|++|||......|... .......++..|..+... ..|..|++|++.
T Consensus 80 ~~~-Gi~~e~~~Py~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~~ik~aL~~~gPv~~~~~~~~~~~~~~~~~~~~ 158 (223)
T cd02619 80 ALK-GIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRVLKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYE 158 (223)
T ss_pred HHc-CCCccccCCCCCCCCCCCCCCccchhhcceeecceeEeCchhHHHHHHHHHHCCCEEEEEEcccchhcccCccccc
Confidence 777 89999999999987776532 122345677777766542 678899999873
Q ss_pred ----C-CCC-CCCCeEEEEEEEeecC--CeeEEEEEcCCCCCCCCCceEEEEeCC
Q 041120 273 ----E-YCG-HQLNHGVTVVGYGEDH--GEKYWLVKNSWGTSWGEAGYIRMARNS 319 (340)
Q Consensus 273 ----~-~c~-~~~~Hav~iVGyg~~~--g~~ywivkNSWG~~WGe~Gy~~i~~~~ 319 (340)
. .|. ..++|||+|||||++. +++|||||||||+.||++||+||+++.
T Consensus 159 ~~~~~~~~~~~~~~Hav~ivGy~~~~~~~~~~~i~~NSwG~~wg~~Gy~~i~~~~ 213 (223)
T cd02619 159 EIVYLLYEDGDLGGHAVVIVGYDDNYVEGKGAFIVKNSWGTDWGDNGYGRISYED 213 (223)
T ss_pred cccccccCCCccCCeEEEEEeecCCCCCCCCEEEEEeCCCCccccCCEEEEehhh
Confidence 2 233 4579999999999986 889999999999999999999999984
No 16
>KOG1544 consensus Predicted cysteine proteinase TIN-ag [General function prediction only]
Probab=100.00 E-value=3.7e-37 Score=279.09 Aligned_cols=237 Identities=29% Similarity=0.567 Sum_probs=181.1
Q ss_pred HHHHHhccCCCceEEE-cccCCCCCHHHHHHhhcCCCCCCCC----CCC--CCCCCCCCCCeeeccCC--CCCCccCCCC
Q 041120 91 QYIDYINSQNLSFKLT-DNKFADLSNEEFISTYLGYNKPYNE----PRW--PSVQYLGLPASVDWRKE--GAVTPVKDQG 161 (340)
Q Consensus 91 ~~I~~~N~~~~s~~~g-~N~FsDlt~eEf~~~~~g~~~~~~~----~~~--~~~~~~~lP~~~Dwr~~--g~vtpV~nQg 161 (340)
..||++|..+-+|+.+ ..+|..||.++=.+..+|..+|... ... ......+||+.||-|++ +++.|+.|||
T Consensus 151 d~iE~in~G~YgW~A~NYSaFWGmtL~DGiKyRLGTL~Ps~sv~nMNEi~~~l~p~~~LPE~F~As~KWp~liH~plDQg 230 (470)
T KOG1544|consen 151 DMIEAINQGNYGWQAGNYSAFWGMTLDDGIKYRLGTLRPSSSVMNMNEIYTVLNPGEVLPEAFEASEKWPNLIHEPLDQG 230 (470)
T ss_pred HHHHHHhcCCccccccchhhhhcccccccceeeecccCchhhhhhHHhHhhccCcccccchhhhhhhcCCccccCccccC
Confidence 4688889877778765 4589999999877767776655321 111 22334689999999987 8999999999
Q ss_pred CCCchHHHHHHHHHHHHHHHHc-CC-ccccChhHhhhccCCCCCCCCCCCchHHHHHHHHHhCCCCCCCCCCCCCC----
Q 041120 162 QCGSCWAFSAVAAVEGINKLKT-GK-LVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITKIGGVTTEDDYPYRGK---- 235 (340)
Q Consensus 162 ~cGsCwAfA~~~~lE~~~~~~~-~~-~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~~~Gi~~e~~yPy~~~---- 235 (340)
+|++.|||+++++...+++|.. |+ ...||+|+|++|... ...||+||+.+.|+=|+.+. |++...+|||...
T Consensus 231 nCa~SWafSTaavasDRiAI~S~GR~t~~LSpQnLlSC~~h-~q~GC~gG~lDRAWWYlRKr-GvVsdhCYP~~~dQ~~~ 308 (470)
T KOG1544|consen 231 NCAGSWAFSTAAVASDRVAIHSLGRMTPVLSPQNLLSCDTH-QQQGCRGGRLDRAWWYLRKR-GVVSDHCYPFSGDQAGP 308 (470)
T ss_pred CcccceeeeeehhccceeEEeeccccccccChHHhcchhhh-hhccCccCcccchheeeecc-cccccccccccCCCCCC
Confidence 9999999999999999999876 33 368999999999876 47899999999999999998 8999999999763
Q ss_pred CCCcCCCCC-------------CceeE-EeceeEEcCCC--------------------------cccccccCceecCCC
Q 041120 236 NDRCQTDKT-------------KHHAV-TITGYEAIPAR--------------------------YAFQLYSHGVFDEYC 275 (340)
Q Consensus 236 ~~~c~~~~~-------------~~~~~-~i~~y~~~~~~--------------------------~~f~~y~~Gi~~~~c 275 (340)
.+.|...+. ..... ..+-|...|+. ++|..|++|||.+..
T Consensus 309 ~~~C~m~sR~~grgkRqat~~CPn~~~~Sn~iyq~tPPYrVSSnE~eImkElM~NGPVQA~m~VHEDFF~YkgGiY~H~~ 388 (470)
T KOG1544|consen 309 APPCMMHSRAMGRGKRQATAHCPNSYVNSNDIYQVTPPYRVSSNEKEIMKELMENGPVQALMEVHEDFFLYKGGIYSHTP 388 (470)
T ss_pred CCCceeeccccCcccccccCcCCCcccccCceeeecCCeeccCCHHHHHHHHHhCCChhhhhhhhhhhhhhccceeeccc
Confidence 233432110 00000 11223333322 899999999998752
Q ss_pred C---------CCCCeEEEEEEEeecC---C--eeEEEEEcCCCCCCCCCceEEEEeCCCCCCCcccceeecce
Q 041120 276 G---------HQLNHGVTVVGYGEDH---G--EKYWLVKNSWGTSWGEAGYIRMARNSPSSNIGICGILMQAS 334 (340)
Q Consensus 276 ~---------~~~~Hav~iVGyg~~~---g--~~ywivkNSWG~~WGe~Gy~~i~~~~~~~~~~~Cgi~~~~~ 334 (340)
. ..+.|+|.|.|||.+. | .+|||..||||+.|||+|||||-|+. |. |-|++...
T Consensus 389 ~~~~~~e~yr~~gtHsVk~tGWG~~~~~~G~~~KyW~aANSWG~~WGE~GYFriLRGv-Ne----cdIEsfvI 456 (470)
T KOG1544|consen 389 VSLGRPERYRRHGTHSVKITGWGEETLPDGRTLKYWTAANSWGPAWGERGYFRILRGV-NE----CDIESFVI 456 (470)
T ss_pred cccCCchhhhhcccceEEEeecccccCCCCCeeEEEEeecccccccccCceEEEeccc-cc----hhhhHhhh
Confidence 1 2468999999999982 3 47999999999999999999999996 54 99998753
No 17
>COG4870 Cysteine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=1.4e-29 Score=235.15 Aligned_cols=177 Identities=31% Similarity=0.560 Sum_probs=117.0
Q ss_pred CCCCeeeccCCCCCCccCCCCCCCchHHHHHHHHHHHHHHHHcCCccccChhHhhhccCCCCCCCC-----CCCchHHHH
Q 041120 141 GLPASVDWRKEGAVTPVKDQGQCGSCWAFSAVAAVEGINKLKTGKLVSLSEQELVDCDVNSENQGC-----NGGYMEKAF 215 (340)
Q Consensus 141 ~lP~~~Dwr~~g~vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~~~~~LS~q~l~dc~~~~~~~gC-----~GG~~~~a~ 215 (340)
.+|+.||||+.|.|+||||||.||+||||++++++|+.+.-.. ...+|+..+..-.......+| +||....+.
T Consensus 98 s~~~~fd~r~~g~vs~v~dQg~~Gscwaf~t~~sles~l~~~~--~w~~s~~nm~~ll~~~ye~~fd~~~~d~g~~~m~~ 175 (372)
T COG4870 98 SLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLESYLNPES--AWDFSENNMKNLLGVPYEKGFDYTSNDGGNADMSA 175 (372)
T ss_pred cchhheeeeccCCcccccccCcccceEeeeehhhhhheecccc--cccccccchhhhcCCCccccCCCccccCCcccccc
Confidence 5899999999999999999999999999999999998764332 344555444321111112222 377777777
Q ss_pred HHHHHhCCCCCCCCCCCCCCCCCcCCCCCCceeEEeceeEEcCCC----------cccccccC-----------------
Q 041120 216 EFITKIGGVTTEDDYPYRGKNDRCQTDKTKHHAVTITGYEAIPAR----------YAFQLYSH----------------- 268 (340)
Q Consensus 216 ~~i~~~~Gi~~e~~yPy~~~~~~c~~~~~~~~~~~i~~y~~~~~~----------~~f~~y~~----------------- 268 (340)
.|+.+..|.+.|.+-||......|....+... +...-..++.. .-|++|..
T Consensus 176 a~l~e~sgpv~et~d~y~~~s~~~~~~~p~~k--~~~~~~~i~~~~~~LdnG~i~~~~~~yg~~s~~~~id~~~~~~~~~ 253 (372)
T COG4870 176 AYLTEWSGPVYETDDPYSENSYFSPTNLPVTK--HVQEAQIIPSRKKYLDNGNIKAMFGFYGAVSSSMYIDATNSLGICI 253 (372)
T ss_pred ccccccCCcchhhcCccccccccCCcCCchhh--ccccceecccchhhhcccchHHHHhhhccccceeEEeccccccccc
Confidence 78888878888888888776555544311111 11111111111 12333321
Q ss_pred ceecCCCCCCCCeEEEEEEEeec----------CCeeEEEEEcCCCCCCCCCceEEEEeCCCC
Q 041120 269 GVFDEYCGHQLNHGVTVVGYGED----------HGEKYWLVKNSWGTSWGEAGYIRMARNSPS 321 (340)
Q Consensus 269 Gi~~~~c~~~~~Hav~iVGyg~~----------~g~~ywivkNSWG~~WGe~Gy~~i~~~~~~ 321 (340)
+.|........+|||+||||++. .|.+.||||||||+.||++|||||+|..-+
T Consensus 254 ~~~~~~s~~~~gHAv~iVGyDDs~~~n~~~~~~~g~GAfiikNSWGt~wG~~GYfwisY~ya~ 316 (372)
T COG4870 254 PYPYVDSGENWGHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGENGYFWISYYYAL 316 (372)
T ss_pred CCCCCCccccccceEEEEeccccccccccccCCCCCceEEEECccccccccCceEEEEeeecc
Confidence 11111112457999999999987 256799999999999999999999998543
No 18
>cd00585 Peptidase_C1B Peptidase C1B subfamily (MEROPS database nomenclature); composed of eukaryotic bleomycin hydrolases (BH) and bacterial aminopeptidases C (pepC). The proteins of this subfamily contain a large insert relative to the C1A peptidase (papain) subfamily. BH is a cysteine peptidase that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. Bleomycin, a glycopeptide derived from the bacterium Streptomyces verticillus, is an effective anticancer drug due to its ability to induce DNA strand breaks. Human BH is the major cause of tumor cell resistance to bleomycin chemotherapy, and is also genetically linked to Alzheimer's disease. In addition to its peptidase activity, the yeast BH (Gal6) binds DNA and acts as a repressor in the Gal4 regulatory system. BH forms a hexameric ring barrel structure w
Probab=99.86 E-value=1.2e-22 Score=198.12 Aligned_cols=78 Identities=26% Similarity=0.365 Sum_probs=64.3
Q ss_pred CccCCCCCCCchHHHHHHHHHHHHHHHH-cCCccccChhHhhhccCC--------------------------CCCCCCC
Q 041120 155 TPVKDQGQCGSCWAFSAVAAVEGINKLK-TGKLVSLSEQELVDCDVN--------------------------SENQGCN 207 (340)
Q Consensus 155 tpV~nQg~cGsCwAfA~~~~lE~~~~~~-~~~~~~LS~q~l~dc~~~--------------------------~~~~gC~ 207 (340)
.||+||++-|.||.||++..+|+.+..+ +...++||+.++..-++. ......+
T Consensus 55 ~~vtnQ~~SGrCW~FA~Ln~lr~~~~k~~~~~~felSq~Yl~f~dklEkaN~fle~ii~~~~~~~~~R~v~~ll~~~~~D 134 (437)
T cd00585 55 EPVTNQKSSGRCWLFAALNVLRHQFMKKLNLKEFEFSQSYLFFWDKLEKANYFLENIIETADEPLDDRLVQFLLANPQND 134 (437)
T ss_pred CCcccCCCCchhHHHHCHHHHHHHHHHHcCCCCEEeCcHHHHHHHHHHHHHHHHHHHHHHhcCCCccHHHHHHHhCCcCC
Confidence 4899999999999999999999987764 556799999988752211 0244579
Q ss_pred CCchHHHHHHHHHhCCCCCCCCCCCC
Q 041120 208 GGYMEKAFEFITKIGGVTTEDDYPYR 233 (340)
Q Consensus 208 GG~~~~a~~~i~~~~Gi~~e~~yPy~ 233 (340)
||....+...++++ |+++++.||-+
T Consensus 135 GGqw~m~~~li~KY-GvVPk~~~pet 159 (437)
T cd00585 135 GGQWDMLVNLIEKY-GLVPKSVMPES 159 (437)
T ss_pred CCchHHHHHHHHHc-CCCcccccCCC
Confidence 99999999999998 89999999964
No 19
>PF08246 Inhibitor_I29: Cathepsin propeptide inhibitor domain (I29); InterPro: IPR013201 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This entry represents a peptidase inhibitor domain, which belongs to MEROPS peptidase inhibitor family I29. The domain is also found at the N terminus of a variety of peptidase precursors that belong to MEROPS peptidase subfamily C1A; these include cathepsin L, papain, and procaricain (P10056 from SWISSPROT). It forms an alpha-helical domain that runs through the substrate-binding site, preventing access. Removal of this region by proteolytic cleavage results in activation of the enzyme. This domain is also found, in one or more copies, in a variety of cysteine peptidase inhibitors such as salarin.; PDB: 3QT4_A 3QJ3_A 2C0Y_A 2L95_A 1CJL_A 1CS8_A 7PCK_A 1BY8_A 1PCI_A 2O6X_A ....
Probab=99.70 E-value=5e-17 Score=115.67 Aligned_cols=57 Identities=39% Similarity=0.703 Sum_probs=50.4
Q ss_pred HHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccCC-CceEEEcccCCCCCHHHH
Q 041120 62 FENWLKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQN-LSFKLTDNKFADLSNEEF 118 (340)
Q Consensus 62 f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~~-~s~~~g~N~FsDlt~eEf 118 (340)
|++|+++|+|.|.+++|+..|+.+|++|++.|++||+.+ .+|++|+|+|||||++||
T Consensus 1 F~~~~~~~~k~Y~~~~e~~~R~~~F~~N~~~I~~~N~~~~~~~~~~~N~fsD~t~eEf 58 (58)
T PF08246_consen 1 FEQFKKKYGKSYKSAEEEARRFAIFKENLRRIEEHNANGNNTYKLGLNQFSDMTPEEF 58 (58)
T ss_dssp HHHHHHHCT---SSHHHHHHHHHHHHHHHHHHHHHHHTTSSSEEE-SSTTTTSSHHHH
T ss_pred CHHHHHHcCCCCCCHHHHHHHHHHHHHHHHHHHHHhcCCCCCeEEeCccccCcChhhC
Confidence 899999999999999999999999999999999999544 899999999999999997
No 20
>PF03051 Peptidase_C1_2: Peptidase C1-like family. This family is a subfamily of the Prosite entry; InterPro: IPR004134 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of proteins belongs to MEROPS peptidase family C1, sub-family C1B (bleomycin hydrolase, clan CA). This family contains prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3PW3_F 2CB5_A 1CB5_C 2DZZ_A 2E02_A 2E01_A 2E03_A 1A6R_A 1GCB_A 3GCB_A ....
Probab=99.62 E-value=9.2e-17 Score=157.00 Aligned_cols=78 Identities=28% Similarity=0.436 Sum_probs=51.3
Q ss_pred CccCCCCCCCchHHHHHHHHHHHHHHHHcC-CccccChhHhh----------------hccCCC----------CCCCCC
Q 041120 155 TPVKDQGQCGSCWAFSAVAAVEGINKLKTG-KLVSLSEQELV----------------DCDVNS----------ENQGCN 207 (340)
Q Consensus 155 tpV~nQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~LS~q~l~----------------dc~~~~----------~~~gC~ 207 (340)
.||.||.+-|.||.||++..++..+..+.+ ...+||+.++. ++.... .....+
T Consensus 56 ~~vtnQk~SGRCW~FA~lN~lR~~~~kk~~l~~felSq~Yl~F~DKlEKaN~fLe~ii~~~~~~~d~R~v~~ll~~~~~D 135 (438)
T PF03051_consen 56 GPVTNQKSSGRCWLFAALNVLRHEIMKKLNLKDFELSQNYLFFWDKLEKANYFLENIIDTADEPLDDRLVRFLLKNPVSD 135 (438)
T ss_dssp -S--B--BSSTHHHHHHHHHHHHHHHHHCT-SS--B-HHHHHHHHHHHHHHHHHHHHHHCCTS-TTSHHHHHHHHSTT-S
T ss_pred CCCCCCCCCCCcchhhchHHHHHHHHHHcCCCceEeechHHHHHHHHHHHHHHHHHHHHHhcCCcchHHHHHHHhcCCCC
Confidence 489999999999999999999999887765 67899999876 222111 123568
Q ss_pred CCchHHHHHHHHHhCCCCCCCCCCCC
Q 041120 208 GGYMEKAFEFITKIGGVTTEDDYPYR 233 (340)
Q Consensus 208 GG~~~~a~~~i~~~~Gi~~e~~yPy~ 233 (340)
||....+...++++ ||++.+.||-+
T Consensus 136 GGqw~~~~nli~KY-GvVPk~~mpet 160 (438)
T PF03051_consen 136 GGQWDMVVNLIKKY-GVVPKSVMPET 160 (438)
T ss_dssp -B-HHHHHHHHHHH----BGGGSTTG
T ss_pred CCchHHHHHHHHHc-CcCcHhhCCCC
Confidence 99999999999998 89999999965
No 21
>smart00848 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L, where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain, such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a s
Probab=99.54 E-value=7.3e-15 Score=103.91 Aligned_cols=56 Identities=48% Similarity=0.846 Sum_probs=53.2
Q ss_pred HHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccCC-CceEEEcccCCCCCHHH
Q 041120 62 FENWLKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQN-LSFKLTDNKFADLSNEE 117 (340)
Q Consensus 62 f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~~-~s~~~g~N~FsDlt~eE 117 (340)
|++|+.+|+|.|.+.+|+..|+.+|++|++.|+.||+.+ .+|++|+|+|+|||++|
T Consensus 1 f~~~~~~~~k~y~~~~e~~~r~~~f~~n~~~i~~~N~~~~~~~~~~~N~fsDlt~eE 57 (57)
T smart00848 1 FEQWKKKYGKSYSSEEEELRRFEIFKENLKFIEEHNKKNDHSYTLGLNQFADLTNEE 57 (57)
T ss_pred ChHHHHHhCCCCCCHHHHHHHHHHHHHHHHHHHHHHhcCCCCeEecCcccccCCCCC
Confidence 688999999999999999999999999999999999877 89999999999999986
No 22
>COG3579 PepC Aminopeptidase C [Amino acid transport and metabolism]
Probab=98.70 E-value=2.2e-09 Score=99.38 Aligned_cols=77 Identities=23% Similarity=0.311 Sum_probs=56.7
Q ss_pred ccCCCCCCCchHHHHHHHHHHHHHHHHcC-CccccChhHhhhccCC--------------------------CCCCCCCC
Q 041120 156 PVKDQGQCGSCWAFSAVAAVEGINKLKTG-KLVSLSEQELVDCDVN--------------------------SENQGCNG 208 (340)
Q Consensus 156 pV~nQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~LS~q~l~dc~~~--------------------------~~~~gC~G 208 (340)
||-||.+.|-||-||++..+...+.-.-+ +.+.||..++.-.++. -.+.--+|
T Consensus 59 ~vtNQk~SGRCWmFAAlNtfRhk~~~el~le~fElSQaytfFwDKlEKaN~FleqIi~tadq~ldsRlv~~LL~~PqqDG 138 (444)
T COG3579 59 KVTNQKQSGRCWMFAALNTFRHKLISELKLEDFELSQAYTFFWDKLEKANWFLEQIIETADQELDSRLVSFLLATPQQDG 138 (444)
T ss_pred ccccccccceehHHHHHHHHHHHHHHhcCcceeehhhHHHHHHHHHHHhhHHHHHHHhhcccchHHHHHHHHHcCccccC
Confidence 89999999999999999998754433322 3477887766533221 01334589
Q ss_pred CchHHHHHHHHHhCCCCCCCCCCCC
Q 041120 209 GYMEKAFEFITKIGGVTTEDDYPYR 233 (340)
Q Consensus 209 G~~~~a~~~i~~~~Gi~~e~~yPy~ 233 (340)
|-.+.....+.++ |+++.+.||-.
T Consensus 139 GQwdM~v~l~eKY-GvVpK~~ypes 162 (444)
T COG3579 139 GQWDMFVSLFEKY-GVVPKSVYPES 162 (444)
T ss_pred chHHHHHHHHHHh-CCCchhhcccc
Confidence 9999998988888 89999999865
No 23
>PF08127 Propeptide_C1: Peptidase family C1 propeptide; InterPro: IPR012599 This domain is found at the N-terminus of cathepsin B and cathepsin B-like peptidases that belong to MEROPS peptidase subfamily C1A. Cathepsins B are lysosomal cysteine proteinases belonging to the papain superfamily and are unique in their ability to act as both endo- and exopeptidases. They are synthesized as inactive zymogens. Activation of the peptidases occurs with the removal of the propeptide.; GO: 0004197 cysteine-type endopeptidase activity, 0050790 regulation of catalytic activity; PDB: 1MIR_A 1PBH_A 2PBH_A 3PBH_A.
Probab=96.01 E-value=0.0067 Score=39.60 Aligned_cols=35 Identities=40% Similarity=0.534 Sum_probs=22.4
Q ss_pred HHHHHHhccCCCceEEEcccCCCCCHHHHHHhhcCCC
Q 041120 90 VQYIDYINSQNLSFKLTDNKFADLSNEEFISTYLGYN 126 (340)
Q Consensus 90 l~~I~~~N~~~~s~~~g~N~FsDlt~eEf~~~~~g~~ 126 (340)
-++|+.+|+.+.+|++|.| |.+.|.++++.+ +|..
T Consensus 3 de~I~~IN~~~~tWkAG~N-F~~~~~~~ik~L-lGv~ 37 (41)
T PF08127_consen 3 DEFIDYINSKNTTWKAGRN-FENTSIEYIKRL-LGVL 37 (41)
T ss_dssp HHHHHHHHHCT-SEEE-----SSB-HHHHHHC-S-B-
T ss_pred HHHHHHHHcCCCcccCCCC-CCCCCHHHHHHH-cCCC
Confidence 3689999999899999999 899999988774 4543
No 24
>KOG4128 consensus Bleomycin hydrolases and aminopeptidases of cysteine protease family [Amino acid transport and metabolism]
Probab=95.18 E-value=0.016 Score=54.47 Aligned_cols=79 Identities=24% Similarity=0.327 Sum_probs=58.5
Q ss_pred CCccCCCCCCCchHHHHHHHHHHHHHHHHcC-CccccChhHhhhcc--------------------CCC--------CCC
Q 041120 154 VTPVKDQGQCGSCWAFSAVAAVEGINKLKTG-KLVSLSEQELVDCD--------------------VNS--------ENQ 204 (340)
Q Consensus 154 vtpV~nQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~LS~q~l~dc~--------------------~~~--------~~~ 204 (340)
-+||.||..-|-||.|+.+..+.--+..+-+ ....||..+|.--+ ... .+.
T Consensus 62 ~~pvtnqkssGrcWift~ln~lrl~~~~kLnl~eFElSqayLFFwdKlErcnyFL~~vvd~a~r~ep~DgRlvq~Ll~nP 141 (457)
T KOG4128|consen 62 RQPVTNQKSSGRCWIFTGLNLLRLEMDRKLNLPEFELSQAYLFFWDKLERCNYFLWTVVDLAMRCEPLDGRLVQNLLKNP 141 (457)
T ss_pred CcccccCcCCCceEEEechhHHHHHHHhcCCcchhhhhhHHHHHHHHHHHHHHHHHHHHHHHhhcCCcccHHHHHHHhCC
Confidence 3699999999999999999988654444332 34788888775321 110 134
Q ss_pred CCCCCchHHHHHHHHHhCCCCCCCCCCCC
Q 041120 205 GCNGGYMEKAFEFITKIGGVTTEDDYPYR 233 (340)
Q Consensus 205 gC~GG~~~~a~~~i~~~~Gi~~e~~yPy~ 233 (340)
.-+||.....+..++++ |+.+..+||-.
T Consensus 142 ~~DGGqw~MfvNlVkKY-GviPKkcy~~s 169 (457)
T KOG4128|consen 142 VPDGGQWQMFVNLVKKY-GVIPKKCYLHS 169 (457)
T ss_pred CCCCchHHHHHHHHHHh-CCCcHHhcccc
Confidence 45899999999999888 89999999854
No 25
>PF05543 Peptidase_C47: Staphopain peptidase C47; InterPro: IPR008750 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue.
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the peptidase family C47 (staphopain family, clan CA). The type examples are the staphopains, which are one of four major families of proteinases secreted by the Gram-positive Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 1X9Y_D 1Y4H_B 1PXV_B 1CV8_A.
Probab=90.64 E-value=0.24 Score=42.62 Aligned_cols=121 Identities=18% Similarity=0.334 Sum_probs=59.1
Q ss_pred CCCCCCCchHHHHHHHHHHHHHH--------HHcCCccccChhHhhhccCCCCCCCCCCCchHHHHHHHHHhCCCCCCCC
Q 041120 158 KDQGQCGSCWAFSAVAAVEGINK--------LKTGKLVSLSEQELVDCDVNSENQGCNGGYMEKAFEFITKIGGVTTEDD 229 (340)
Q Consensus 158 ~nQg~cGsCwAfA~~~~lE~~~~--------~~~~~~~~LS~q~l~dc~~~~~~~gC~GG~~~~a~~~i~~~~Gi~~e~~ 229 (340)
..||.-+=|-+||.++.|-+... +.+.....+|+++|.+++. .+...++|.+.. |..+.
T Consensus 17 EtQg~~pWCa~Ya~aailN~~~~~~~~~A~~iMr~~yPn~s~~~l~~~~~----------~~~~~i~y~ks~-g~~~~-- 83 (175)
T PF05543_consen 17 ETQGYNPWCAGYAMAAILNATTNTKIYNAKDIMRYLYPNVSEEQLKFTSL----------TPNQMIKYAKSQ-GRNPQ-- 83 (175)
T ss_dssp ---SSSS-HHHHHHHHHHHHHCT-S---HHHHHHHHSTTS-CCCHHH--B-----------HHHHHHHHHHT-TEEEE--
T ss_pred eccCcCcHHHHHHHHHHHHhhhCcCcCCHHHHHHHHCCCCCHHHHhhcCC----------CHHHHHHHHHHc-Ccchh--
Confidence 35888899999999999875421 1122236788888888764 356899998877 54421
Q ss_pred CCCCCCCCCcCCCCCCceeEEeceeEEcCCCcccccccCceecCCCCCCCCeEEEEEEEeec-CCeeEEEEEcCCCC
Q 041120 230 YPYRGKNDRCQTDKTKHHAVTITGYEAIPARYAFQLYSHGVFDEYCGHQLNHGVTVVGYGED-HGEKYWLVKNSWGT 305 (340)
Q Consensus 230 yPy~~~~~~c~~~~~~~~~~~i~~y~~~~~~~~f~~y~~Gi~~~~c~~~~~Hav~iVGyg~~-~g~~ywivkNSWG~ 305 (340)
| ..... .....+ . .++ .+.......+. ....-+...+||++||||-.- +|.++.++=|=|-.
T Consensus 84 ~--~n~~~--s~~eV~-~--~~~------~nk~i~i~~~~-v~~~~~~~~gHAlavvGya~~~~g~~~y~~WNPW~~ 146 (175)
T PF05543_consen 84 Y--NNRMP--SFDEVK-K--LID------NNKGIAILADR-VEQTNGPHAGHALAVVGYAKPNNGQKTYYFWNPWWN 146 (175)
T ss_dssp E--ECS-----HHHHH-H--HHH------TT-EEEEEEEE-TTSCTTB--EEEEEEEEEEEETTSEEEEEEE-TT-S
T ss_pred H--hcCCC--CHHHHH-H--HHH------cCCCeEEEecc-cccCCCCccceeEEEEeeeecCCCCeEEEEeCCccC
Confidence 0 00000 000000 0 000 00000000000 001122347899999999874 67999999888754
No 26
>KOG4128 consensus Bleomycin hydrolases and aminopeptidases of cysteine protease family [Amino acid transport and metabolism]
Probab=79.56 E-value=0.2 Score=47.33 Aligned_cols=38 Identities=32% Similarity=0.581 Sum_probs=30.3
Q ss_pred CCeEEEEEEEee-c---CCeeEEEEEcCCCCCCCCCceEEEE
Q 041120 279 LNHGVTVVGYGE-D---HGEKYWLVKNSWGTSWGEAGYIRMA 316 (340)
Q Consensus 279 ~~Hav~iVGyg~-~---~g~~ywivkNSWG~~WGe~Gy~~i~ 316 (340)
..|||++.|-+. + .+-.-|-|.||||.+-|.+||..|.
T Consensus 371 mthAml~T~v~~kd~~~g~~~~~rVenswgkd~gkkg~~~mt 412 (457)
T KOG4128|consen 371 MTHAMLLTSVGLKDPATGGLNEHRVENSWGKDLGKKGVNKMT 412 (457)
T ss_pred HHHHHHhhhccccCcccCCchhhhhhchhhhhccccchhhhh
Confidence 479999999882 2 3334699999999999999996653
No 27
>PF13529 Peptidase_C39_2: Peptidase_C39 like family; PDB: 3ERV_A.
Probab=77.78 E-value=1.9 Score=34.72 Aligned_cols=23 Identities=39% Similarity=0.775 Sum_probs=16.3
Q ss_pred CCCeEEEEEEEeecCCeeEEEEEcCC
Q 041120 278 QLNHGVTVVGYGEDHGEKYWLVKNSW 303 (340)
Q Consensus 278 ~~~Hav~iVGyg~~~g~~ywivkNSW 303 (340)
..+|.|+|+||+.+. +++|-.+|
T Consensus 122 ~~~H~vvi~Gy~~~~---~~~v~DP~ 144 (144)
T PF13529_consen 122 YGGHYVVIIGYDEDG---YVYVNDPW 144 (144)
T ss_dssp TTEEEEEEEEE-SSE----EEEE-TT
T ss_pred cCCEEEEEEEEeCCC---EEEEeCCC
Confidence 468999999998742 78888877
No 28
>cd00044 CysPc Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction.
Probab=60.65 E-value=13 Score=35.18 Aligned_cols=27 Identities=26% Similarity=0.537 Sum_probs=23.9
Q ss_pred CCeEEEEEEEeecC--CeeEEEEEcCCCC
Q 041120 279 LNHGVTVVGYGEDH--GEKYWLVKNSWGT 305 (340)
Q Consensus 279 ~~Hav~iVGyg~~~--g~~ywivkNSWG~ 305 (340)
.+||=.|++...-+ |.+...+||-||.
T Consensus 235 ~~HaY~Vl~~~~~~~~~~~lv~lrNPWg~ 263 (315)
T cd00044 235 KGHAYSVLDVREVQEEGLRLLRLRNPWGV 263 (315)
T ss_pred cCcceEEeEEEEEccCceEEEEecCCccC
Confidence 48999999998766 8899999999994
No 29
>PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognises a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached. This lipid attachment site is found in homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed the mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection.
Probab=60.00 E-value=13 Score=21.42 Aligned_cols=11 Identities=27% Similarity=0.144 Sum_probs=7.0
Q ss_pred HhhHHHHHHHH
Q 041120 22 RMMLRNAVLSL 32 (340)
Q Consensus 22 ~~m~~~~~l~l 32 (340)
.||||+++.++
T Consensus 5 ~mmKkil~~l~ 15 (25)
T PF08139_consen 5 SMMKKILFPLL 15 (25)
T ss_pred HHHHHHHHHHH
Confidence 57788855444
No 30
>COG3017 LolB Outer membrane lipoprotein involved in outer membrane biogenesis [Cell envelope biogenesis, outer membrane]
Probab=55.02 E-value=25 Score=31.25 Aligned_cols=30 Identities=30% Similarity=0.160 Sum_probs=14.4
Q ss_pred HHhhHHHHHHHHHHHHHHhhhccccccCCC
Q 041120 21 MRMMLRNAVLSLFLLWVLGIPAGAWSEGYP 50 (340)
Q Consensus 21 ~~~m~~~~~l~l~~~~~l~~~~~~~~~~~~ 50 (340)
|.||++...+++.++.+|+.+++..+...|
T Consensus 1 ~~~~~~~~~~l~~~As~LL~aC~~~~~~~~ 30 (206)
T COG3017 1 MPMMKRLLFLLLALASLLLTACTLTASRPP 30 (206)
T ss_pred CchHHHHHHHHHHHHHHHHHhccCcCCCCC
Confidence 455666555444444444445444443333
No 31
>COG5510 Predicted small secreted protein [Function unknown]
Probab=40.26 E-value=27 Score=22.94 Aligned_cols=14 Identities=29% Similarity=0.449 Sum_probs=7.0
Q ss_pred hhHHHHHHHHHHHH
Q 041120 23 MMLRNAVLSLFLLW 36 (340)
Q Consensus 23 ~m~~~~~l~l~~~~ 36 (340)
||+|.+++++++++
T Consensus 1 mmk~t~l~i~~vll 14 (44)
T COG5510 1 MMKKTILLIALVLL 14 (44)
T ss_pred CchHHHHHHHHHHH
Confidence 56775444443333
No 32
>PRK09810 entericidin A; Provisional
Probab=38.81 E-value=32 Score=22.40 Aligned_cols=10 Identities=60% Similarity=0.594 Sum_probs=5.9
Q ss_pred hhHHHHHHHH
Q 041120 23 MMLRNAVLSL 32 (340)
Q Consensus 23 ~m~~~~~l~l 32 (340)
||+|.+++++
T Consensus 1 mMkk~~~l~~ 10 (41)
T PRK09810 1 MMKRLIVLVL 10 (41)
T ss_pred ChHHHHHHHH
Confidence 5777655443
No 33
>PRK10081 entericidin B membrane lipoprotein; Provisional
Probab=38.46 E-value=32 Score=23.14 Aligned_cols=13 Identities=15% Similarity=0.215 Sum_probs=6.9
Q ss_pred hhHHHHHHHHHHH
Q 041120 23 MMLRNAVLSLFLL 35 (340)
Q Consensus 23 ~m~~~~~l~l~~~ 35 (340)
||+|.+.++++++
T Consensus 1 MmKk~i~~i~~~l 13 (48)
T PRK10081 1 MVKKTIAAIFSVL 13 (48)
T ss_pred ChHHHHHHHHHHH
Confidence 5777655444333
No 34
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=35.51 E-value=99 Score=27.50 Aligned_cols=91 Identities=12% Similarity=0.116 Sum_probs=39.4
Q ss_pred hhHHHHHHHHHHHHHHhhhccccccCCCCCCChhhHHHHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhccCC-C
Q 041120 23 MMLRNAVLSLFLLWVLGIPAGAWSEGYPQKYDPQSMEERFENWLKQYSREYGSEDEWQRRFGIYSSNVQYIDYINSQN-L 101 (340)
Q Consensus 23 ~m~~~~~l~l~~~~~l~~~~~~~~~~~~~~~~~~~~~~~f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~I~~~N~~~-~ 101 (340)
||++.+.+..++++++++..++...+.|...-....++.|..-+..=-+ ++ .++...|..+.++=+..+ +.+. .
T Consensus 2 ~m~k~l~~~~ll~~a~a~~~~~~~~~~~~~~v~~~a~~~ls~lk~~~~~-~k-~dp~~l~~~v~~~l~p~v---d~~~~a 76 (202)
T COG2854 2 MMKKSLTILALLVIAFASSLAAAAPANPYSLVQEAADKVLSILKNNQAK-IK-QDPQYLRQIVDQELLPYV---DFKYAA 76 (202)
T ss_pred hHHHHHHHHHHHHHHHHHHHHhhCccchHHHHHHHHHHHHHHHhccchh-hc-cCHHHHHHHHHHHhhhhh---cHHHHH
Confidence 4556544444333333322222223333111123445566666544332 21 234444433333322222 2222 3
Q ss_pred ceEEEcccCCCCCHHHHH
Q 041120 102 SFKLTDNKFADLSNEEFI 119 (340)
Q Consensus 102 s~~~g~N~FsDlt~eEf~ 119 (340)
.+.|| +-+...|+|+-.
T Consensus 77 ~~vLG-k~~k~aspeQ~~ 93 (202)
T COG2854 77 KLVLG-KYYKTASPEQRQ 93 (202)
T ss_pred HHHhc-cccccCCHHHHH
Confidence 45677 777788887654
No 35
>PF00648 Peptidase_C2: Calpain family cysteine protease. This is family C2 in the peptidase classification.; InterPro: IPR001300 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micromolar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the millimolar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28 kDa subunit and an 80 kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80 kDa subunit: a 19-amino acid NH2-terminal sequence; active site domain IIa; active site domain IIb (domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related); domain III; an 18-amino acid extended sequence linking domain III to domain IV; and domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in the cells of these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin (IPR001259 from INTERPRO). The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+-dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma. Calpains are a family of cytosolic cysteine proteinases (see PDOC00126 from PROSITEDOC). Members of the calpain family are believed to function in various biological processes, including integrin-mediated cell migration, cytoskeletal remodeling, cell differentiation and apoptosis. The calpain family includes numerous members from C. elegans to mammals, with homologues in yeast and bacteria. The best characterised members are the m- and mu-calpains, both of which are heterodimers composed of a large catalytic subunit and a small regulatory subunit. The large subunit comprises four domains (dI-dIV) while the small subunit has two domains (dV-dVI). Domain dI is a short region cleaved by autolysis, dII is the catalytic core, dIII is a C2-like domain, and dIV consists of five calcium-binding EF-hand motifs. The crystal structure of calpain has been solved.
The catalytic region consists of two distinct structural domains (dIIa and dIIb). dIIa contains a central helix flanked on three faces by a cluster of alpha-helices and is entirely unrelated to the corresponding domain in the typical thiol proteinases. The fold of dIIb is similar to the corresponding domain in other cysteine proteinases and contains two three-stranded anti-parallel beta-sheets. The catalytic triad residues (C, H, N) are located in dIIa and dIIb. The activation of the domain is dependent on the binding of two calcium atoms in two non-EF-hand calcium-binding sites located in the catalytic core, one close to the Cys active site in dIIa and one at the end of dIIb. Calcium binding induces conformational changes in the catalytic domain that align the active site. The profile covers the whole catalytic domain.; GO: 0004198 calcium-dependent cysteine-type endopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 2NQA_A 1KFU_L 1KFX_L 1QXP_B 2R9C_A 1TL9_A 2G8E_A 1KXR_B 2G8J_A 2NQG_A ....
Probab=34.22 E-value=69 Score=29.83 Aligned_cols=27 Identities=26% Similarity=0.559 Sum_probs=19.8
Q ss_pred CCeEEEEEEEeecCC----eeEEEEEcCCCC
Q 041120 279 LNHGVTVVGYGEDHG----EKYWLVKNSWGT 305 (340)
Q Consensus 279 ~~Hav~iVGyg~~~g----~~ywivkNSWG~ 305 (340)
.+||-.|++....++ .+.-.+||-||.
T Consensus 213 ~~HaY~Vl~~~~~~~~~~~~~lv~LrNPwg~ 243 (298)
T PF00648_consen 213 PGHAYAVLDVREVNGNGEGHRLVKLRNPWGS 243 (298)
T ss_dssp TTS-EEEEEEEEEEETTEEEEEEEEE-TTSS
T ss_pred cceeEEEEEEEeeccccceeEEEEEcCCCcc
Confidence 489999999986533 567789999995
No 36
>KOG4702 consensus Uncharacterized conserved protein [Function unknown]
Probab=31.59 E-value=1e+02 Score=22.52 Aligned_cols=32 Identities=28% Similarity=0.450 Sum_probs=24.0
Q ss_pred HHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHH
Q 041120 60 ERFENWLKQYSREYGSEDEWQRRFGIYSSNVQY 92 (340)
Q Consensus 60 ~~f~~w~~~~~k~Y~~~~E~~~R~~iF~~Nl~~ 92 (340)
..|++|+.+|.+.-.. .|...|.+-|++-++.
T Consensus 29 e~Fee~v~~~krel~p-pe~~~~~EE~~~~lRe 60 (77)
T KOG4702|consen 29 EIFEEFVRGYKRELSP-PEATKRKEEYENFLRE 60 (77)
T ss_pred HHHHHHHHhccccCCC-hHHHhhHHHHHHHHHH
Confidence 4699999999998854 4777777777665554
No 37
>COG4990 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=29.68 E-value=33 Score=29.87 Aligned_cols=23 Identities=22% Similarity=0.400 Sum_probs=17.7
Q ss_pred CCeEEEEEEEeecCCeeEEEEEcCCCC
Q 041120 279 LNHGVTVVGYGEDHGEKYWLVKNSWGT 305 (340)
Q Consensus 279 ~~Hav~iVGyg~~~g~~ywivkNSWG~ 305 (340)
.-|+|+|+|||+. ++..-++||.
T Consensus 147 s~H~v~itgyDk~----n~yynDpyG~ 169 (195)
T COG4990 147 SIHSVLITGYDKY----NIYYNDPYGY 169 (195)
T ss_pred ceeeeEeeccccc----ceEecccccc
Confidence 4599999999874 5667777753
No 38
>PF09778 Guanylate_cyc_2: Guanylylate cyclase; InterPro: IPR018616 Members of this family of proteins catalyse the conversion of guanosine triphosphate (GTP) to 3',5'-cyclic guanosine monophosphate (cGMP) and pyrophosphate.
Probab=29.24 E-value=59 Score=29.14 Aligned_cols=23 Identities=17% Similarity=0.283 Sum_probs=16.2
Q ss_pred CCCCeEEEEEEEeecCCeeEEEEEc
Q 041120 277 HQLNHGVTVVGYGEDHGEKYWLVKN 301 (340)
Q Consensus 277 ~~~~Hav~iVGyg~~~g~~ywivkN 301 (340)
...+|-|+|+||+.+.+ =++++|
T Consensus 158 ~Y~GHYVVlcGyd~~~~--~~~yrd 180 (212)
T PF09778_consen 158 DYQGHYVVLCGYDAATK--EFEYRD 180 (212)
T ss_pred CccEEEEEEEeecCCCC--eEEEeC
Confidence 35689999999998643 244444
No 39
>smart00230 CysPc Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).
Probab=26.84 E-value=1e+02 Score=29.25 Aligned_cols=27 Identities=26% Similarity=0.554 Sum_probs=22.0
Q ss_pred CCeEEEEEEEeecCCee--EEEEEcCCCC
Q 041120 279 LNHGVTVVGYGEDHGEK--YWLVKNSWGT 305 (340)
Q Consensus 279 ~~Hav~iVGyg~~~g~~--ywivkNSWG~ 305 (340)
.+||=.|++...-++.+ ...+||-||.
T Consensus 227 ~~HaYsVl~v~~~~~~~~~Ll~lrNPWg~ 255 (318)
T smart00230 227 KGHAYSVTDVREVQGRRQELLRLRNPWGQ 255 (318)
T ss_pred cCccEEEEEEEEEecCCeEEEEEECCCCC
Confidence 48999999988765555 8999999993
No 40
>PF15284 PAGK: Phage-encoded virulence factor
Probab=24.92 E-value=69 Score=22.66 Aligned_cols=15 Identities=27% Similarity=0.414 Sum_probs=6.7
Q ss_pred hHHHHHHHHHHHHHH
Q 041120 24 MLRNAVLSLFLLWVL 38 (340)
Q Consensus 24 m~~~~~l~l~~~~~l 38 (340)
|++...++|.++|+|
T Consensus 1 Mkk~ksifL~l~~~L 15 (61)
T PF15284_consen 1 MKKFKSIFLALVFIL 15 (61)
T ss_pred ChHHHHHHHHHHHHH
Confidence 444444444444444
No 41
>PF07351 DUF1480: Protein of unknown function (DUF1480); InterPro: IPR009950 This family consists of several hypothetical Enterobacterial proteins of around 80 residues in length. The function of this family is unknown.
Probab=24.23 E-value=1.5e+02 Score=22.08 Aligned_cols=52 Identities=15% Similarity=0.121 Sum_probs=31.6
Q ss_pred eeEEeceeEEcCCC-cccccccCceecCCCCCCCCeEEEEEEEeecCCeeEEE
Q 041120 247 HAVTITGYEAIPAR-YAFQLYSHGVFDEYCGHQLNHGVTVVGYGEDHGEKYWL 298 (340)
Q Consensus 247 ~~~~i~~y~~~~~~-~~f~~y~~Gi~~~~c~~~~~Hav~iVGyg~~~g~~ywi 298 (340)
..++|..|...... .+-..-......-||.++.+-+|.+=||+.+...++||
T Consensus 4 t~vkIg~fEIdDA~l~~~~~~~~~tlsIPCksdpdlcmQLDgWDe~TSiPA~l 56 (80)
T PF07351_consen 4 TVVKIGSFEIDDAELSSEPDKGEDTLSIPCKSDPDLCMQLDGWDEHTSIPAIL 56 (80)
T ss_pred eEEEEEEEEEEeeEecCCCCCCCCeEEeecCCChhheeEecccccCCccceEE
Confidence 45677777554322 00001123455667999999999999999875444443
No 42
>TIGR03042 PS_II_psbQ_bact photosystem II protein PsbQ. This protein, through the member sll1638 from Synechocystis sp. PCC 6803, was shown to be part of the cyanobacterial photosystem II. It is homologous to (but quite diverged from) the chloroplast PsbQ protein, called oxygen-evolving enhancer protein 3 (OEE3). We designate this cyanobacterial protein PsbQ by homology.
Probab=24.01 E-value=1.4e+02 Score=25.07 Aligned_cols=34 Identities=24% Similarity=0.351 Sum_probs=14.7
Q ss_pred HHHHHHHHHHHHHhhhccccccCCCCCCChhhHH
Q 041120 26 RNAVLSLFLLWVLGIPAGAWSEGYPQKYDPQSME 59 (340)
Q Consensus 26 ~~~~l~l~~~~~l~~~~~~~~~~~~~~~~~~~~~ 59 (340)
+++.++|+++++++++++..+...|--||..++.
T Consensus 3 ~~~s~~Lv~~~~~Lvsc~~p~~~~p~tysp~~l~ 36 (142)
T TIGR03042 3 SLASLLLVLLLTFLVSCSGPAAAVPPTYSPAQLA 36 (142)
T ss_pred hHHHHHHHHHHHHHHHcCCCcccCCCCCCHHHHH
Confidence 3444444444444444433333334344555444
No 43
>cd02549 Peptidase_C39A A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are
Probab=22.92 E-value=88 Score=24.98 Aligned_cols=22 Identities=23% Similarity=0.406 Sum_probs=15.6
Q ss_pred CCeEEEEEEEeecCCeeEEEEEcCC
Q 041120 279 LNHGVTVVGYGEDHGEKYWLVKNSW 303 (340)
Q Consensus 279 ~~Hav~iVGyg~~~g~~ywivkNSW 303 (340)
.+|.|+|+||+. .+..+|.+.|
T Consensus 93 ~gH~vVv~g~~~---~~~~~i~DP~ 114 (141)
T cd02549 93 SGHAMVVIGYDR---KGNVYVNDPG 114 (141)
T ss_pred CCeEEEEEEEcC---CCCEEEECCC
Confidence 689999999982 1235566665
No 44
>PF15588 Imm7: Immunity protein 7
Probab=22.47 E-value=3e+02 Score=21.93 Aligned_cols=32 Identities=28% Similarity=0.630 Sum_probs=24.0
Q ss_pred EEEEEEEeec--CCeeEEEEEcCC-----CCCCCCCceE
Q 041120 282 GVTVVGYGED--HGEKYWLVKNSW-----GTSWGEAGYI 313 (340)
Q Consensus 282 av~iVGyg~~--~g~~ywivkNSW-----G~~WGe~Gy~ 313 (340)
-|+.||++++ +.+.|.|++.+- ...=|.+||.
T Consensus 17 ~v~~vG~ADd~~~~~~yiilQR~~~~de~D~~~~~d~~~ 55 (115)
T PF15588_consen 17 NVLMVGFADDEDGPKEYIILQRSLEFDEQDEDLGSDGYY 55 (115)
T ss_pred cEEEEEEecCCCCCceEEEEEccCCCCCcccccCcCcEE
Confidence 4999999987 456799999964 3444567886
Done!