Query 018104
Match_columns 360
No_of_seqs 322 out of 1908
Neff 8.5
Searched_HMMs 46136
Date Fri Mar 29 06:10:44 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/018104.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/018104hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1542 Cysteine proteinase Ca 100.0 1.4E-81 3.1E-86 568.5 24.3 298 34-343 66-371 (372)
2 PTZ00203 cathepsin L protease; 100.0 4.6E-79 9.9E-84 578.2 36.2 298 32-341 31-338 (348)
3 PTZ00021 falcipain-2; Provisio 100.0 2.9E-77 6.3E-82 581.4 31.2 308 30-343 160-488 (489)
4 PTZ00200 cysteine proteinase; 100.0 1.1E-76 2.4E-81 575.8 33.5 308 29-345 116-447 (448)
5 KOG1543 Cysteine proteinase Ca 100.0 2.4E-69 5.2E-74 508.4 29.7 286 43-342 30-323 (325)
6 cd02621 Peptidase_C1A_Cathepsi 100.0 5.6E-57 1.2E-61 411.8 22.7 210 126-342 1-241 (243)
7 cd02698 Peptidase_C1A_Cathepsi 100.0 2.7E-56 5.9E-61 405.8 23.4 211 126-342 1-237 (239)
8 cd02248 Peptidase_C1A Peptidas 100.0 6.1E-56 1.3E-60 396.4 23.2 207 127-341 1-210 (210)
9 cd02620 Peptidase_C1A_Cathepsi 100.0 8.8E-56 1.9E-60 401.7 21.8 205 127-339 1-234 (236)
10 PF00112 Peptidase_C1: Papain 100.0 4.6E-54 1E-58 386.0 18.4 213 126-342 1-219 (219)
11 PTZ00364 dipeptidyl-peptidase 100.0 4E-53 8.7E-58 416.7 23.7 210 123-341 202-457 (548)
12 PTZ00049 cathepsin C-like prot 100.0 6.3E-53 1.4E-57 419.1 23.7 214 123-343 378-676 (693)
13 smart00645 Pept_C1 Papain fami 100.0 1.4E-49 3E-54 344.6 18.4 168 126-338 1-170 (174)
14 cd02619 Peptidase_C1 C1 Peptid 100.0 6.2E-46 1.3E-50 334.0 20.6 194 129-325 1-213 (223)
15 PTZ00462 Serine-repeat antigen 100.0 1.1E-44 2.4E-49 369.6 22.6 210 138-352 544-790 (1004)
16 KOG1544 Predicted cysteine pro 100.0 4.5E-41 9.7E-46 300.9 7.9 265 67-340 151-457 (470)
17 COG4870 Cysteine protease [Pos 100.0 4.4E-31 9.6E-36 242.6 7.6 198 124-327 97-316 (372)
18 cd00585 Peptidase_C1B Peptidas 99.9 2.2E-26 4.8E-31 222.7 14.4 183 139-324 55-399 (437)
19 PF03051 Peptidase_C1_2: Pepti 99.8 5.5E-20 1.2E-24 178.4 17.1 183 139-324 56-400 (438)
20 PF08246 Inhibitor_I29: Cathep 99.7 8.5E-17 1.9E-21 113.4 6.8 56 39-94 1-58 (58)
21 smart00848 Inhibitor_I29 Cathe 99.5 1.2E-14 2.6E-19 102.0 5.2 55 39-93 1-57 (57)
22 COG3579 PepC Aminopeptidase C 99.4 7.6E-13 1.7E-17 120.6 10.2 74 140-214 59-160 (444)
23 KOG4128 Bleomycin hydrolases a 98.6 2.8E-08 6.1E-13 90.9 4.0 75 139-214 63-167 (457)
24 PF13529 Peptidase_C39_2: Pept 97.2 0.007 1.5E-07 49.5 11.5 57 243-309 87-144 (144)
25 PF05543 Peptidase_C47: Stapho 96.6 0.027 5.9E-07 47.9 10.1 129 142-324 17-154 (175)
26 PF08127 Propeptide_C1: Peptid 96.3 0.0031 6.6E-08 40.6 2.1 35 66-102 3-37 (41)
27 PF14399 Transpep_BrtH: NlpC/p 91.3 0.53 1.1E-05 44.3 6.5 66 244-323 77-143 (317)
28 COG4990 Uncharacterized protei 89.9 0.79 1.7E-05 39.2 5.5 51 239-310 117-168 (195)
29 PF13956 Ibs_toxin: Toxin Ibs, 83.7 0.44 9.5E-06 24.5 0.4 15 1-15 1-15 (19)
30 cd02549 Peptidase_C39A A sub-f 79.8 5.5 0.00012 32.2 6.0 44 248-309 70-114 (141)
31 PF09778 Guanylate_cyc_2: Guan 76.6 10 0.00022 33.7 6.9 61 244-307 112-180 (212)
32 cd00044 CysPc Calpains, domain 68.4 20 0.00044 33.8 7.5 28 283-311 234-263 (315)
33 PF12385 Peptidase_C70: Papain 59.2 1.1E+02 0.0024 25.9 10.4 38 244-296 97-135 (166)
34 PF08139 LPAM_1: Prokaryotic m 58.5 7.7 0.00017 22.0 1.5 14 1-14 7-20 (25)
35 PF02402 Lysis_col: Lysis prot 54.8 4.4 9.5E-05 26.1 0.2 18 1-18 1-18 (46)
36 PF10731 Anophelin: Thrombin i 54.5 14 0.00031 25.5 2.6 23 1-23 1-23 (65)
37 PF06291 Lambda_Bor: Bor prote 52.9 7.8 0.00017 29.9 1.3 22 1-22 1-22 (97)
38 PF09403 FadA: Adhesion protei 52.7 5.8 0.00012 32.2 0.6 15 1-15 1-15 (126)
39 PF01640 Peptidase_C10: Peptid 52.2 71 0.0015 27.8 7.5 51 246-320 141-192 (192)
40 PRK10081 entericidin B membran 52.0 16 0.00036 24.1 2.5 15 1-15 2-16 (48)
41 COG5510 Predicted small secret 48.8 21 0.00045 23.1 2.5 12 1-12 2-13 (44)
42 PRK09810 entericidin A; Provis 43.4 25 0.00054 22.5 2.3 13 1-13 2-14 (41)
43 PF11948 DUF3465: Protein of u 37.8 39 0.00085 27.5 3.2 15 1-15 1-15 (131)
44 PF11106 YjbE: Exopolysacchari 36.4 32 0.0007 25.1 2.2 15 1-15 1-15 (80)
45 PF10880 DUF2673: Protein of u 36.0 76 0.0016 21.7 3.8 14 1-14 1-14 (65)
46 PF11777 DUF3316: Protein of u 35.3 62 0.0013 25.6 4.0 15 1-15 1-15 (114)
47 PF03032 Brevenin: Brevenin/es 34.6 27 0.00058 23.0 1.5 17 1-17 3-19 (46)
48 PF14060 DUF4252: Domain of un 33.2 47 0.001 27.5 3.2 34 2-45 1-34 (155)
49 COG2143 Thioredoxin-related pr 33.1 63 0.0014 27.3 3.8 18 1-18 1-18 (182)
50 PF05968 Bacillus_PapR: Bacill 32.3 39 0.00085 22.0 1.9 15 1-15 1-15 (48)
51 COG5294 Uncharacterized protei 31.7 41 0.0009 26.5 2.3 15 1-15 1-15 (113)
52 PF00648 Peptidase_C2: Calpain 31.5 75 0.0016 29.5 4.6 37 285-322 214-281 (298)
53 PF11873 DUF3393: Domain of un 28.8 75 0.0016 28.1 3.8 20 54-73 50-69 (204)
54 PLN02923 xylose isomerase 28.1 38 0.00083 33.3 1.9 48 1-48 1-49 (478)
55 PF15588 Imm7: Immunity protei 28.1 1.9E+02 0.0041 22.9 5.7 33 287-319 17-55 (115)
56 PRK10053 hypothetical protein; 27.9 52 0.0011 26.9 2.4 14 1-14 1-14 (130)
57 PRK13883 conjugal transfer pro 27.3 1.3E+02 0.0029 25.2 4.8 15 1-15 1-15 (151)
58 smart00230 CysPc Calpain-like 26.6 1.2E+02 0.0025 28.8 5.0 28 283-311 226-255 (318)
59 PF07437 YfaZ: YfaZ precursor; 26.4 53 0.0012 28.4 2.4 17 1-17 1-17 (180)
60 PRK10780 periplasmic chaperone 25.7 91 0.002 26.4 3.7 15 1-15 1-15 (165)
61 PF15284 PAGK: Phage-encoded v 25.6 37 0.0008 23.6 1.0 12 3-14 6-17 (61)
62 TIGR00156 conserved hypothetic 24.7 62 0.0013 26.3 2.3 13 1-13 1-13 (126)
63 PF02553 CbiN: Cobalt transpor 24.6 51 0.0011 24.1 1.6 14 1-14 1-14 (74)
64 PF15240 Pro-rich: Proline-ric 24.5 62 0.0013 27.9 2.4 11 4-14 3-13 (179)
65 PF11912 DUF3430: Protein of u 24.5 57 0.0012 28.5 2.3 15 1-15 1-15 (212)
66 PF10107 Endonuc_Holl: Endonuc 24.4 1.9E+02 0.0042 24.3 5.2 18 32-49 21-39 (156)
67 COG4871 Uncharacterized protei 23.6 48 0.001 28.0 1.5 16 140-155 135-152 (193)
68 PF11839 DUF3359: Protein of u 23.0 91 0.002 24.0 2.8 22 1-22 1-22 (96)
69 TIGR01655 yxeA_fam conserved h 22.7 60 0.0013 25.8 1.9 15 1-15 1-15 (114)
70 KOG4702 Uncharacterized conser 22.6 1.6E+02 0.0036 21.1 3.8 30 38-68 30-60 (77)
71 PF07910 Peptidase_C78: Peptid 22.6 1E+02 0.0022 27.6 3.5 23 284-306 155-177 (218)
72 PF15240 Pro-rich: Proline-ric 22.1 54 0.0012 28.3 1.6 16 5-20 1-16 (179)
73 PF04202 Mfp-3: Foot protein 3 22.0 72 0.0016 22.6 1.9 15 1-15 1-15 (71)
74 PRK09838 periplasmic copper-bi 21.2 86 0.0019 25.0 2.5 15 1-15 1-15 (115)
75 COG5567 Predicted small peripl 21.0 1.3E+02 0.0029 20.5 2.9 22 1-22 1-22 (58)
76 PF11337 DUF3139: Protein of u 20.9 93 0.002 23.1 2.5 13 1-13 1-15 (85)
77 PRK15240 resistance to complem 20.5 68 0.0015 27.8 1.9 17 1-17 1-17 (185)
78 PRK13697 cytochrome c6; Provis 20.4 2.3E+02 0.0049 21.8 4.8 15 1-15 1-15 (111)
79 PRK13859 type IV secretion sys 20.4 78 0.0017 21.2 1.7 14 1-14 1-14 (55)
80 KOG4404 Tandem pore domain K+ 20.1 2E+02 0.0044 27.3 5.0 32 24-55 27-59 (350)
No 1
>KOG1542 consensus Cysteine proteinase Cathepsin F [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.4e-81 Score=568.52 Aligned_cols=298 Identities=43% Similarity=0.745 Sum_probs=261.6
Q ss_pred HHHHHHHHHHHhc-cccCChHHHHHHHHHHHHHHHHHHhhCCCC-CCeEEecccCCCCChhhhhhccccccccccccccc
Q 018104 34 GLWDLYERWRSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTNKMD-KPYKLKLNKFADMTNHEFASTYAGSKIKHHRMFQG 111 (360)
Q Consensus 34 ~~~~~f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~-~s~~~g~N~fsD~t~~Ef~~~~~~~~~~~~~~~~~ 111 (360)
...+.|..|+.+| |+|.+.+|...|+.||+.|+..++++++.. .|.+.|+|+|||||+|||++++++.+....+....
T Consensus 66 ~~~~~F~~F~~kf~r~Y~s~eE~~~Rl~iF~~N~~~a~~~q~~d~gsA~yGvtqFSDlT~eEFkk~~l~~~~~~~~~~~~ 145 (372)
T KOG1542|consen 66 GLEDSFKLFTIKFGRSYASREEHAHRLSIFKHNLLRAERLQENDPGSAEYGVTQFSDLTEEEFKKIYLGVKRRGSKLPGD 145 (372)
T ss_pred chHHHHHHHHHhcCcccCcHHHHHHHHHHHHHHHHHHHHhhhcCccccccCccchhhcCHHHHHHHhhccccccccCccc
Confidence 3477899999999 999999999999999999999999998876 59999999999999999999998765531110000
Q ss_pred cCCCcccccCCCCCCCCceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHHhhcCCCCCCCCC
Q 018104 112 TRGNGTFMYGKVTSIPPSVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELVDCDTDQNQGCNG 191 (360)
Q Consensus 112 ~~~~~~~~~~~~~~lP~~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~dc~~~~~~gc~G 191 (360)
... ........||++||||++|.||||||||+||||||||+++++|+++.+++|++++||||+|+||+.. ++||+|
T Consensus 146 ~~~---~~~~~~~~lP~~fDWR~kgaVTpVKnQG~CGSCWAFS~tG~vEga~~i~~g~LvsLSEQeLvDCD~~-d~gC~G 221 (372)
T KOG1542|consen 146 AAE---APIEPGESLPESFDWRDKGAVTPVKNQGMCGSCWAFSTTGAVEGAWAIATGKLVSLSEQELVDCDSC-DNGCNG 221 (372)
T ss_pred ccc---CcCCCCCCCCcccchhccCCccccccCCcCcchhhhhhhhhhhhHHHhhcCcccccchhhhhcccCc-CCcCCC
Confidence 000 1123446899999999999999999999999999999999999999999999999999999999986 899999
Q ss_pred cchhhHHHHHHHcCCCCCCCCCcccCCCC-CcCCCCCCCCcEEecceEEcCCChHHHHHHHH-HhCCeEEEEecCCcccc
Q 018104 192 GLMELAFEFIKKKGGVTTEAKYPYQANDG-TCDVSKESSPAVSIDGHENVPANHEDALLKAV-AKQPVSVAIDAGSSDFQ 269 (360)
Q Consensus 192 G~~~~a~~~~~~~~Gi~~e~~yPY~~~~~-~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l-~~gPV~v~~~~~~~~f~ 269 (360)
|.+..|++|+++.+|+..|.+|||++..+ .|..... ...+.|+++..++. ++++|.+.| .+|||+|+|++. .++
T Consensus 222 Gl~~nA~~~~~~~gGL~~E~dYPY~g~~~~~C~~~~~-~~~v~I~~f~~l~~-nE~~ia~wLv~~GPi~vgiNa~--~mQ 297 (372)
T KOG1542|consen 222 GLMDNAFKYIKKAGGLEKEKDYPYTGKKGNQCHFDKS-KIVVSIKDFSMLSN-NEDQIAAWLVTFGPLSVGINAK--PMQ 297 (372)
T ss_pred CChhHHHHHHHHhCCccccccCCccccCCCccccchh-hceEEEeccEecCC-CHHHHHHHHHhcCCeEEEEchH--HHH
Confidence 99999999988888999999999999888 8998774 77899999999976 889999988 679999999975 799
Q ss_pred cccCceEeC---CCCCC-CCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccccccceeeee
Q 018104 270 FYSEGVFTG---ECGTE-LNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASYPIK 343 (360)
Q Consensus 270 ~y~~Giy~~---~~~~~-~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~~~~ 343 (360)
+|.+||..+ .|+.. ++|||+|||||...-.++|||||||||++||++||+|+.|| .|.|||++.++-+.+
T Consensus 298 ~YrgGV~~P~~~~Cs~~~~~HaVLlvGyG~~g~~~PYWIVKNSWG~~WGE~GY~~l~RG----~N~CGi~~mvss~~v 371 (372)
T KOG1542|consen 298 FYRGGVSCPSKYICSPKLLNHAVLLVGYGSSGYEKPYWIVKNSWGTSWGEKGYYKLCRG----SNACGIADMVSSAAV 371 (372)
T ss_pred HhcccccCCCcccCCccccCceEEEEeecCCCCCCceEEEECCccccccccceEEEecc----ccccccccchhhhhc
Confidence 999999977 67765 89999999999973379999999999999999999999999 467999999876543
No 2
>PTZ00203 cathepsin L protease; Provisional
Probab=100.00 E-value=4.6e-79 Score=578.23 Aligned_cols=298 Identities=37% Similarity=0.715 Sum_probs=248.8
Q ss_pred hhHHHHHHHHHHHhc-cccCChHHHHHHHHHHHHHHHHHHhhCCCCCCeEEecccCCCCChhhhhhcccccccccccccc
Q 018104 32 EEGLWDLYERWRSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTNKMDKPYKLKLNKFADMTNHEFASTYAGSKIKHHRMFQ 110 (360)
Q Consensus 32 ~~~~~~~f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~~s~~~g~N~fsD~t~~Ef~~~~~~~~~~~~~~~~ 110 (360)
+..+..+|++||++| |.|.+.+|+.+|++||++|+++|++||+++.+|++|+|+|+|||+|||++++++......+. .
T Consensus 31 ~~~~~~~f~~~~~~~~K~Y~~~~E~~~R~~iF~~N~~~I~~~N~~~~~~~lg~N~FaDlT~eEf~~~~l~~~~~~~~~-~ 109 (348)
T PTZ00203 31 GTPAAALFEEFKRTYQRAYGTLTEEQQRLANFERNLELMREHQARNPHARFGITKFFDLSEAEFAARYLNGAAYFAAA-K 109 (348)
T ss_pred ccHHHHHHHHHHHHhCCCCCChHHHHHHHHHHHHHHHHHHHHhccCCCeEEeccccccCCHHHHHHHhcCCCcccccc-c
Confidence 567888999999999 99988889999999999999999999988789999999999999999998775321100000 0
Q ss_pred ccCCCccccc--CCCCCCCCceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHHhhcCCCCCC
Q 018104 111 GTRGNGTFMY--GKVTSIPPSVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELVDCDTDQNQG 188 (360)
Q Consensus 111 ~~~~~~~~~~--~~~~~lP~~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~dc~~~~~~g 188 (360)
.... ..... ....+||++||||++|.|+||||||.||||||||+++++|+++++++++.++||+|+|+||+.. +.|
T Consensus 110 ~~~~-~~~~~~~~~~~~lP~~~DWR~~g~VtpVkdQg~CGSCWAfa~~~aiEs~~~i~~~~~~~LSeQqLvdC~~~-~~G 187 (348)
T PTZ00203 110 QHAG-QHYRKARADLSAVPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDHV-DNG 187 (348)
T ss_pred cccc-ccccccccccccCCCCCcCCcCCCCCCccccCCCccHHHHhhHHHHHHHHHHhcCCCccCCHHHHHhccCC-CCC
Confidence 0000 00111 1234689999999999999999999999999999999999999999999999999999999875 789
Q ss_pred CCCcchhhHHHHHHHc--CCCCCCCCCcccCCCC---CcCCCCCCCCcEEecceEEcCCChHHHHHHHHH-hCCeEEEEe
Q 018104 189 CNGGLMELAFEFIKKK--GGVTTEAKYPYQANDG---TCDVSKESSPAVSIDGHENVPANHEDALLKAVA-KQPVSVAID 262 (360)
Q Consensus 189 c~GG~~~~a~~~~~~~--~Gi~~e~~yPY~~~~~---~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~-~gPV~v~~~ 262 (360)
|+||++..|++|+.++ +|+++|++|||.+.++ .|..........++++|..++. +++.|+.+|. .|||+|+|+
T Consensus 188 C~GG~~~~a~~yi~~~~~ggi~~e~~YPY~~~~~~~~~C~~~~~~~~~~~i~~~~~i~~-~e~~~~~~l~~~GPv~v~i~ 266 (348)
T PTZ00203 188 CGGGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVSMES-SERVMAAWLAKNGPISIAVD 266 (348)
T ss_pred CCCCCHHHHHHHHHHhcCCCCCccccCCCccCCCCCCcCCCCcccccceEecceeecCc-CHHHHHHHHHhCCCEEEEEE
Confidence 9999999999999864 6799999999998766 5764322123467889988865 7788999986 589999999
Q ss_pred cCCcccccccCceEeCCCC-CCCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccccccceee
Q 018104 263 AGSSDFQFYSEGVFTGECG-TELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASYP 341 (360)
Q Consensus 263 ~~~~~f~~y~~Giy~~~~~-~~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~~ 341 (360)
+. +|++|++|||+. |. ..++|||+|||||.+ +|++|||||||||++||++|||||+|+. |.|||++.++..
T Consensus 267 a~--~f~~Y~~GIy~~-c~~~~~nHaVliVGYG~~-~g~~YWiikNSWG~~WGe~GY~ri~rg~----n~Cgi~~~~~~~ 338 (348)
T PTZ00203 267 AS--SFMSYHSGVLTS-CIGEQLNHGVLLVGYNMT-GEVPYWVIKNSWGEDWGEKGYVRVTMGV----NACLLTGYPVSV 338 (348)
T ss_pred hh--hhcCccCceeec-cCCCCCCeEEEEEEEecC-CCceEEEEEcCCCCCcCcCceEEEEcCC----CcccccceEEEE
Confidence 84 799999999985 64 457999999999986 7899999999999999999999999984 569999777665
No 3
>PTZ00021 falcipain-2; Provisional
Probab=100.00 E-value=2.9e-77 Score=581.37 Aligned_cols=308 Identities=36% Similarity=0.656 Sum_probs=256.9
Q ss_pred CChhHHHHHHHHHHHhc-cccCChHHHHHHHHHHHHHHHHHHhhCCCC-CCeEEecccCCCCChhhhhhccccccccc-c
Q 018104 30 ESEEGLWDLYERWRSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTNKMD-KPYKLKLNKFADMTNHEFASTYAGSKIKH-H 106 (360)
Q Consensus 30 ~~~~~~~~~f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~-~s~~~g~N~fsD~t~~Ef~~~~~~~~~~~-~ 106 (360)
-++.+....|++|+.+| |.|.+.+|+.+|++||++|+++|++||+++ .+|++|+|+|+|||.|||++++++..... .
T Consensus 160 ~~n~e~~~~F~~wk~ky~K~Y~~~eE~~~R~~iF~~Nl~~Ie~hN~~~~~ty~lgiNqFsDlT~EEF~~~~l~~~~~~~~ 239 (489)
T PTZ00021 160 MTNLENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVLYKKGMNRFGDLSFEEFKKKYLTLKSFDFK 239 (489)
T ss_pred ccChHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHhhccCCCCEEEeccccccCCHHHHHHHhccccccccc
Confidence 45566677899999999 999999999999999999999999999874 89999999999999999998876543110 0
Q ss_pred cc-ccccC--C-C---cccccCCCCCCCCceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHH
Q 018104 107 RM-FQGTR--G-N---GTFMYGKVTSIPPSVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELV 179 (360)
Q Consensus 107 ~~-~~~~~--~-~---~~~~~~~~~~lP~~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~ 179 (360)
.. ..... . . ..........+|++||||+.|.|+||||||.||||||||+++++|++++++++..++||+|+|+
T Consensus 240 ~~~~~~~~~~~~~~~~~~~~~~~~~~~P~s~DWR~~g~VtpVKdQG~CGSCWAFAa~~alEs~~~I~~g~~v~LSeQqLV 319 (489)
T PTZ00021 240 SNGKKSPRVINYDDVIKKYKPKDATFDHAKYDWRLHNGVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELV 319 (489)
T ss_pred cccccccccccccccccccccccccCCccccccccCCCCCCcccccccccHHHHHHHHHHHHHHHHHcCCCcccCHHHHh
Confidence 00 00000 0 0 0000011112499999999999999999999999999999999999999999999999999999
Q ss_pred hhcCCCCCCCCCcchhhHHHHHHHcCCCCCCCCCcccCC-CCCcCCCCCCCCcEEecceEEcCCChHHHHHHHHH-hCCe
Q 018104 180 DCDTDQNQGCNGGLMELAFEFIKKKGGVTTEAKYPYQAN-DGTCDVSKESSPAVSIDGHENVPANHEDALLKAVA-KQPV 257 (360)
Q Consensus 180 dc~~~~~~gc~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~-~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~-~gPV 257 (360)
||+.. +.||+||++..|+.|+.+++|+++|++|||.+. .+.|..... ...+++++|..++ +++|+++|. .|||
T Consensus 320 DCs~~-n~GC~GG~~~~Af~yi~~~gGl~tE~~YPY~~~~~~~C~~~~~-~~~~~i~~y~~i~---~~~lk~al~~~GPV 394 (489)
T PTZ00021 320 DCSFK-NNGCYGGLIPNAFEDMIELGGLCSEDDYPYVSDTPELCNIDRC-KEKYKIKSYVSIP---EDKFKEAIRFLGPI 394 (489)
T ss_pred hhccC-CCCCCCcchHhhhhhhhhccccCcccccCccCCCCCccccccc-cccceeeeEEEec---HHHHHHHHHhcCCe
Confidence 99875 889999999999999988779999999999987 478975433 3457888998885 467899996 5899
Q ss_pred EEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeecC---------CCceEEEEEcCCCCCCCCCcEEEEEecCCCC
Q 018104 258 SVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTTL---------DGTKYWIVRNSWGPEWGEKGYIRMQRGISDK 328 (360)
Q Consensus 258 ~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~---------~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~ 328 (360)
+|+|++. .+|++|++|||++.|+..++|||+|||||++. .+.+|||||||||++|||+|||||+|+.+..
T Consensus 395 sv~i~a~-~~f~~YkgGIy~~~C~~~~nHAVlIVGYG~e~~~~~~~~~~~~~~YWIVKNSWGt~WGE~GY~rI~r~~~g~ 473 (489)
T PTZ00021 395 SVSIAVS-DDFAFYKGGIFDGECGEEPNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNSWGESWGEKGFIRIETDENGL 473 (489)
T ss_pred EEEEEee-cccccCCCCcCCCCCCCccceEEEEEEecCcCCcccccccCCCCCEEEEECCCCCCcccCeEEEEEcCCCCC
Confidence 9999997 68999999999988988899999999999752 1247999999999999999999999986544
Q ss_pred CCCccccccceeeee
Q 018104 329 KGLCGIAMEASYPIK 343 (360)
Q Consensus 329 ~~~Cgi~~~~~~~~~ 343 (360)
.|+|||++.+.||++
T Consensus 474 ~n~CGI~t~a~yP~~ 488 (489)
T PTZ00021 474 MKTCSLGTEAYVPLI 488 (489)
T ss_pred CCCCCCcccceeEec
Confidence 578999999999986
No 4
>PTZ00200 cysteine proteinase; Provisional
Probab=100.00 E-value=1.1e-76 Score=575.82 Aligned_cols=308 Identities=37% Similarity=0.699 Sum_probs=255.6
Q ss_pred cCChhHHHHHHHHHHHhc-cccCChHHHHHHHHHHHHHHHHHHhhCCCCCCeEEecccCCCCChhhhhhcccccccccc-
Q 018104 29 LESEEGLWDLYERWRSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTNKMDKPYKLKLNKFADMTNHEFASTYAGSKIKHH- 106 (360)
Q Consensus 29 ~~~~~~~~~~f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~~s~~~g~N~fsD~t~~Ef~~~~~~~~~~~~- 106 (360)
...+.++...|++|+++| |.|.+.+|+.+|+.||++|+++|++||. +.+|++|+|+|+|||+|||.+++++...+..
T Consensus 116 ~~~e~e~~~~F~~f~~ky~K~Y~~~~E~~~R~~iF~~Nl~~I~~hN~-~~~y~lgiN~FsDlT~eEF~~~~~~~~~~~~~ 194 (448)
T PTZ00200 116 PKLEFEVYLEFEEFNKKYNRKHATHAERLNRFLTFRNNYLEVKSHKG-DEPYSKEINKFSDLTEEEFRKLFPVIKVPPKS 194 (448)
T ss_pred ccchHHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHhcC-cCCeEEeccccccCCHHHHHHHhccCCCcccc
Confidence 345566777899999999 9999889999999999999999999996 4689999999999999999988765332110
Q ss_pred c----c--cccc-CCCccccc---------CC----CCCCCCceecCCCCCCCCCCCCC-CCCcHHHHHHHHHHHHHHHH
Q 018104 107 R----M--FQGT-RGNGTFMY---------GK----VTSIPPSVDWRKKGSVTAVKDQG-QCGSCWAFSTIAAVEGINHI 165 (360)
Q Consensus 107 ~----~--~~~~-~~~~~~~~---------~~----~~~lP~~~Dwr~~g~vtpV~dQg-~cGsCwAfA~~~~le~~~~~ 165 (360)
. . .... .....+.. .. ...+|++||||+.|.|+|||||| .||||||||+++++|+++++
T Consensus 195 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P~~~DWR~~g~vtpVkdQG~~CGSCWAFat~~aiEs~~~i 274 (448)
T PTZ00200 195 NSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRRADAVTKVKDQGLNCGSCWAFSSVGSVESLYKI 274 (448)
T ss_pred cccccccccccccccccccccccccccccccccccccccCCCCccCCCCCCCCCcccCCCccchHHHHhHHHHHHHHHHH
Confidence 0 0 0000 00000000 00 01269999999999999999999 99999999999999999999
Q ss_pred hcCCccccCHHHHHhhcCCCCCCCCCcchhhHHHHHHHcCCCCCCCCCcccCCCCCcCCCCCCCCcEEecceEEcCCChH
Q 018104 166 MTNKLVSLSEQELVDCDTDQNQGCNGGLMELAFEFIKKKGGVTTEAKYPYQANDGTCDVSKESSPAVSIDGHENVPANHE 245 (360)
Q Consensus 166 ~~~~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~ 245 (360)
+++..++||+|+|+||... +.||+||++..|++|++++ |+++|++|||.+..+.|.... .....+.+|..++ ..
T Consensus 275 ~~~~~~~LSeQqLvDC~~~-~~GC~GG~~~~A~~yi~~~-Gi~~e~~YPY~~~~~~C~~~~--~~~~~i~~y~~~~--~~ 348 (448)
T PTZ00200 275 YRDKSVDLSEQELVNCDTK-SQGCSGGYPDTALEYVKNK-GLSSSSDVPYLAKDGKCVVSS--TKKVYIDSYLVAK--GK 348 (448)
T ss_pred hcCCCeecCHHHHhhccCc-cCCCCCCcHHHHHHHHhhc-CccccccCCCCCCCCCCcCCC--CCeeEecceEecC--HH
Confidence 9999999999999999875 8899999999999999887 999999999999999997653 3456788887664 34
Q ss_pred HHHHHHHHhCCeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeec-CCCceEEEEEcCCCCCCCCCcEEEEEec
Q 018104 246 DALLKAVAKQPVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTT-LDGTKYWIVRNSWGPEWGEKGYIRMQRG 324 (360)
Q Consensus 246 ~~i~~~l~~gPV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~-~~g~~ywivkNSWG~~WG~~Gy~~i~~~ 324 (360)
+.+++++..|||+|+|++. .+|+.|++|||+++|+..++|||+|||||.+ .+|.+|||||||||++||++|||||+|+
T Consensus 349 ~~l~~~l~~GPV~v~i~~~-~~f~~Yk~GIy~~~C~~~~nHaV~lVGyG~d~~~g~~YWIIkNSWG~~WGe~GY~ri~r~ 427 (448)
T PTZ00200 349 DVLNKSLVISPTVVYIAVS-RELLKYKSGVYNGECGKSLNHAVLLVGEGYDEKTKKRYWIIKNSWGTDWGENGYMRLERT 427 (448)
T ss_pred HHHHHHHhcCCEEEEeecc-cccccCCCCccccccCCCCcEEEEEEEecccCCCCCceEEEEcCCCCCcccCeeEEEEeC
Confidence 5677777889999999997 7899999999998898889999999999854 3678999999999999999999999997
Q ss_pred CCCCCCCccccccceeeeecC
Q 018104 325 ISDKKGLCGIAMEASYPIKKS 345 (360)
Q Consensus 325 ~~~~~~~Cgi~~~~~~~~~~~ 345 (360)
.. ..|.|||++.+.||++..
T Consensus 428 ~~-g~n~CGI~~~~~~P~~~~ 447 (448)
T PTZ00200 428 NE-GTDKCGILTVGLTPVFYS 447 (448)
T ss_pred CC-CCCcCCccccceeeEEec
Confidence 42 247899999999998753
No 5
>KOG1543 consensus Cysteine proteinase Cathepsin L [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=2.4e-69 Score=508.43 Aligned_cols=286 Identities=48% Similarity=0.815 Sum_probs=247.6
Q ss_pred HHhc-cccCChHHHHHHHHHHHHHHHHHHhhCCC-CCCeEEecccCCCCChhhhhhccccccccccccccccCCCccccc
Q 018104 43 RSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTNKM-DKPYKLKLNKFADMTNHEFASTYAGSKIKHHRMFQGTRGNGTFMY 120 (360)
Q Consensus 43 ~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N~~-~~s~~~g~N~fsD~t~~Ef~~~~~~~~~~~~~~~~~~~~~~~~~~ 120 (360)
+.+| +.|.+..|+..|+.+|.+|++.|+.||.. ..+|++|+|+|+|+|.+|++..+.+.++...... .. ...
T Consensus 30 ~~~~~~~y~~~~~~~~r~~~f~~n~~~~~~~n~~~~~~~~~g~n~~~d~~~ee~~~~~~~~~~~~~~~~---~~---~~~ 103 (325)
T KOG1543|consen 30 LVKFLKRYEDRVEKKARRAIFKENLQKIESHNLKYVLSFLMGVNQFADLTTEEFKRKKTGKKPPEIKRD---KF---TEK 103 (325)
T ss_pred hhhhccccccHHHHHHHHHHHHHHHHHHHhhhhhhceeeeeccccccccchHHHHHhhccccCcccccc---cc---ccc
Confidence 7788 88876778999999999999999999998 6999999999999999999988776554332100 00 111
Q ss_pred CCCCCCCCceecCCCCC-CCCCCCCCCCCcHHHHHHHHHHHHHHHHhcC-CccccCHHHHHhhcCCCCCCCCCcchhhHH
Q 018104 121 GKVTSIPPSVDWRKKGS-VTAVKDQGQCGSCWAFSTIAAVEGINHIMTN-KLVSLSEQELVDCDTDQNQGCNGGLMELAF 198 (360)
Q Consensus 121 ~~~~~lP~~~Dwr~~g~-vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~-~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~ 198 (360)
....++|++||||++|. ++||||||.||||||||++++||++++|+++ .++.||+|+|+||+..++.||.||.+..|+
T Consensus 104 ~~~~~~p~s~DwR~~~~~~~~vkdQg~CgsCWAFaa~~aie~~~~i~~g~~l~sLSeq~lvdC~~~~~~GC~GG~~~~A~ 183 (325)
T KOG1543|consen 104 LDGDDLPDSFDWRDKGAVTPPVKDQGSCGSCWAFAATGALEDRYNIKTGGKLLSLSEQDLVDCCGECGDGCNGGEPKNAF 183 (325)
T ss_pred cchhhCCCCccccccCCcCCCcCCCCcCcchHHHHHHHHHHHHHHHHhCCccCccChhhhhhccCCCCCCcCCCCHHHHH
Confidence 23457999999999974 5559999999999999999999999999999 899999999999998667899999999999
Q ss_pred HHHHHcCCCCCCCCCcccCCCCCcCCCCCCCCcEEecceEEcCCChHHHHHHHHHh-CCeEEEEecCCcccccccCceEe
Q 018104 199 EFIKKKGGVTTEAKYPYQANDGTCDVSKESSPAVSIDGHENVPANHEDALLKAVAK-QPVSVAIDAGSSDFQFYSEGVFT 277 (360)
Q Consensus 199 ~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~~-gPV~v~~~~~~~~f~~y~~Giy~ 277 (360)
+|+++++|+..+++|||.+..+.|..... .....+.++..++.+ +++|++++++ |||+|+|++.. +|++|++|||.
T Consensus 184 ~yi~~~G~~t~~~~Ypy~~~~~~C~~~~~-~~~~~~~~~~~~~~~-e~~i~~~v~~~GPv~v~~~a~~-~F~~Y~~GVy~ 260 (325)
T KOG1543|consen 184 KYIKKNGGVTECENYPYIGKDGTCKSNKK-DKTVTIKGFYNVPAN-EEAIAEAVAKNGPVSVAIDAYE-DFSLYKGGVYA 260 (325)
T ss_pred HHHHHhCCCCCCcCCCCcCCCCCccCCCc-cceeEeeeeeecCcC-HHHHHHHHHhcCCeEEEEeehh-hhhhccCceEe
Confidence 99999944444999999999999998775 567788888888875 9999999955 79999999995 99999999999
Q ss_pred CCCCC--CCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCcccccccee-ee
Q 018104 278 GECGT--ELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASY-PI 342 (360)
Q Consensus 278 ~~~~~--~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~-~~ 342 (360)
+++.. .++|||+|||||. .++.+|||||||||++|||+|||||.|+.++ |+|++.+.| |+
T Consensus 261 ~~~~~~~~~~Hav~iVGyG~-~~~~~YWivkNSWG~~WGe~Gy~ri~r~~~~----~~I~~~~~~~p~ 323 (325)
T KOG1543|consen 261 EEKGDDKEGDHAVLIVGYGT-GDGVDYWIVKNSWGTDWGEKGYFRIARGVNK----CGIASEASYGPI 323 (325)
T ss_pred CCCCCCCCCCceEEEEEEcC-CCCceeEEEEcCCCCCcccCceEEEecCCCc----hhhhcccccCCC
Confidence 87555 5999999999999 6889999999999999999999999999654 999999988 54
No 6
>cd02621 Peptidase_C1A_CathepsinC Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopt the papain fold and form the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. Loss-of-function mutations in cathepsin C are assoc
Probab=100.00 E-value=5.6e-57 Score=411.75 Aligned_cols=210 Identities=39% Similarity=0.745 Sum_probs=179.2
Q ss_pred CCCceecCCCC----CCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCC------ccccCHHHHHhhcCCCCCCCCCcchh
Q 018104 126 IPPSVDWRKKG----SVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNK------LVSLSEQELVDCDTDQNQGCNGGLME 195 (360)
Q Consensus 126 lP~~~Dwr~~g----~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~------~~~lS~q~l~dc~~~~~~gc~GG~~~ 195 (360)
||++||||+.+ +|+||||||.||+|||||++++||+++++++++ .+.||+|+|+||... +.||+||++.
T Consensus 1 lP~~fDwr~~~~~~~~v~~v~dQg~CGsCwAfa~~~~ies~~~i~~~~~~~~~~~~~lS~q~l~dC~~~-~~GC~GG~~~ 79 (243)
T cd02621 1 LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSPQHVLSCSQY-SQGCDGGFPF 79 (243)
T ss_pred CCCcccccccCCCCcccccCCCCCcCccHHHHHHHHHHHHHHHHHhCCCCccccCcccCHHHhhhhcCC-CCCCCCCCHH
Confidence 79999999998 999999999999999999999999999998876 789999999999864 7899999999
Q ss_pred hHHHHHHHcCCCCCCCCCcccC-CCCCcCCCCCCCCcEEecceEEc----CCChHHHHHHHHH-hCCeEEEEecCCcccc
Q 018104 196 LAFEFIKKKGGVTTEAKYPYQA-NDGTCDVSKESSPAVSIDGHENV----PANHEDALLKAVA-KQPVSVAIDAGSSDFQ 269 (360)
Q Consensus 196 ~a~~~~~~~~Gi~~e~~yPY~~-~~~~c~~~~~~~~~~~i~~~~~v----~~~~~~~i~~~l~-~gPV~v~~~~~~~~f~ 269 (360)
.+++|+.+. |+++|++|||.. ....|..........++..+..+ ...++++|+++|. +|||+++|++. ++|.
T Consensus 80 ~a~~~~~~~-Gi~~e~~yPY~~~~~~~C~~~~~~~~~~~~~~~~~i~~~~~~~~~~~ik~~i~~~GPv~v~~~~~-~~F~ 157 (243)
T cd02621 80 LVGKFAEDF-GIVTEDYFPYTADDDRPCKASPSECRRYYFSDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVY-SDFD 157 (243)
T ss_pred HHHHHHHhc-CcCCCceeCCCCCCCCCCCCCccccccccccceeEcccccccCCHHHHHHHHHHcCCEEEEEEec-cccc
Confidence 999999887 999999999998 67788754312233444444433 1347889999995 58999999997 6899
Q ss_pred cccCceEeCC-----CCC---------CCCeEEEEEEeeecC-CCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccc
Q 018104 270 FYSEGVFTGE-----CGT---------ELNHGVAAVGYGTTL-DGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGI 334 (360)
Q Consensus 270 ~y~~Giy~~~-----~~~---------~~~Hav~iVGyg~~~-~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi 334 (360)
.|++|||+.. |.. .++|||+|||||++. ++++|||||||||++||++|||||+|+. |.|||
T Consensus 158 ~Y~~GIy~~~~~~~~C~~~~~~~~~~~~~~HaV~iVGyg~~~~~g~~YWiirNSWG~~WGe~Gy~~i~~~~----~~cgi 233 (243)
T cd02621 158 FYKEGVYHHTDNDEVSDGDNDNFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEKGYFKIRRGT----NECGI 233 (243)
T ss_pred ccCCeEECcCCcccccccccccccCcccCCeEEEEEEeeccCCCCCcEEEEEcCCCCCCCcCCeEEEecCC----cccCc
Confidence 9999999874 532 468999999999874 4889999999999999999999999984 56999
Q ss_pred cccceeee
Q 018104 335 AMEASYPI 342 (360)
Q Consensus 335 ~~~~~~~~ 342 (360)
++.+.++.
T Consensus 234 ~~~~~~~~ 241 (243)
T cd02621 234 ESQAVFAY 241 (243)
T ss_pred ccceEeec
Confidence 99987653
No 7
>cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.
Probab=100.00 E-value=2.7e-56 Score=405.79 Aligned_cols=211 Identities=31% Similarity=0.619 Sum_probs=180.2
Q ss_pred CCCceecCCCC---CCCCCCCCC---CCCcHHHHHHHHHHHHHHHHhcC---CccccCHHHHHhhcCCCCCCCCCcchhh
Q 018104 126 IPPSVDWRKKG---SVTAVKDQG---QCGSCWAFSTIAAVEGINHIMTN---KLVSLSEQELVDCDTDQNQGCNGGLMEL 196 (360)
Q Consensus 126 lP~~~Dwr~~g---~vtpV~dQg---~cGsCwAfA~~~~le~~~~~~~~---~~~~lS~q~l~dc~~~~~~gc~GG~~~~ 196 (360)
||++||||+.+ +|+|||||| .||||||||++++||+++.++++ ..+.||+|+|+||+. +.||+||++..
T Consensus 1 lP~~~Dwr~~~~~~~v~~vk~Qg~~~~CGsCwAfa~~~aies~~~i~~~~~~~~~~lS~Q~lldC~~--~~gC~GG~~~~ 78 (239)
T cd02698 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQVVIDCAG--GGSCHGGDPGG 78 (239)
T ss_pred CCCCcccccCCCCcccCccccCCCCCCCCcchHHHhHHHHHHHHHHHHCCCCCCcccCHHHHHhCCC--CCCccCcCHHH
Confidence 69999999988 999999998 89999999999999999998865 367899999999986 68999999999
Q ss_pred HHHHHHHcCCCCCCCCCcccCCCCCcCCCC--------------CCCCcEEecceEEcCCChHHHHHHHH-HhCCeEEEE
Q 018104 197 AFEFIKKKGGVTTEAKYPYQANDGTCDVSK--------------ESSPAVSIDGHENVPANHEDALLKAV-AKQPVSVAI 261 (360)
Q Consensus 197 a~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~--------------~~~~~~~i~~~~~v~~~~~~~i~~~l-~~gPV~v~~ 261 (360)
+++|+.++ |+++|++|||......|.... ......+++.|..++ ++++|+++| .+|||+++|
T Consensus 79 a~~~~~~~-Gl~~e~~yPY~~~~~~C~~~~~~~~c~~~~~c~~~~~~~~~~i~~~~~~~--~~~~i~~~l~~~GPV~v~i 155 (239)
T cd02698 79 VYEYAHKH-GIPDETCNPYQAKDGECNPFNRCGTCNPFGECFAIKNYTLYFVSDYGSVS--GRDKMMAEIYARGPISCGI 155 (239)
T ss_pred HHHHHHHc-CcCCCCeeCCcCCCCCCcCCCCCCCcccCcccccccccceEEeeeceecC--CHHHHHHHHHHcCCEEEEE
Confidence 99999997 999999999998776665310 012345677776664 467899888 568999999
Q ss_pred ecCCcccccccCceEeCC-CCCCCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecC-CCCCCCccccccce
Q 018104 262 DAGSSDFQFYSEGVFTGE-CGTELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGI-SDKKGLCGIAMEAS 339 (360)
Q Consensus 262 ~~~~~~f~~y~~Giy~~~-~~~~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~-~~~~~~Cgi~~~~~ 339 (360)
.+. ++|+.|++|||+.. |...++|||+|||||++.++++|||||||||++||++|||||+|+. .+..+.|||+++++
T Consensus 156 ~~~-~~f~~Y~~GIy~~~~~~~~~~HaV~IVGyG~~~~g~~YWiikNSWG~~WGe~Gy~~i~rg~~~~~~~~~~i~~~~~ 234 (239)
T cd02698 156 MAT-EALENYTGGVYKEYVQDPLINHIISVAGWGVDENGVEYWIVRNSWGEPWGERGWFRIVTSSYKGARYNLAIEEDCA 234 (239)
T ss_pred Eec-ccccccCCeEEccCCCCCcCCeEEEEEEEEecCCCCEEEEEEcCCCcccCcCceEEEEccCCcccccccccccceE
Confidence 998 58999999999874 4556799999999998744899999999999999999999999996 12336799999999
Q ss_pred eee
Q 018104 340 YPI 342 (360)
Q Consensus 340 ~~~ 342 (360)
|+.
T Consensus 235 ~~~ 237 (239)
T cd02698 235 WAD 237 (239)
T ss_pred EEe
Confidence 875
No 8
>cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to h
Probab=100.00 E-value=6.1e-56 Score=396.41 Aligned_cols=207 Identities=60% Similarity=1.099 Sum_probs=187.2
Q ss_pred CCceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHHhhcCCCCCCCCCcchhhHHHHHHHcCC
Q 018104 127 PPSVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELVDCDTDQNQGCNGGLMELAFEFIKKKGG 206 (360)
Q Consensus 127 P~~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~~~~~~G 206 (360)
|++||||+.+.++||+|||.||+|||||++++||++++++++..++||+|+|++|....+.+|.||+...|++++.+. |
T Consensus 1 P~~~d~r~~~~~~~v~dQg~cgsCwAfa~~~~le~~~~i~~~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~-G 79 (210)
T cd02248 1 PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCSTSGNNGCNGGNPDNAFEYVKNG-G 79 (210)
T ss_pred CCcccCCcCCCCCCCccCCCCcchHHhHHHHHHHHHHHHHcCCCcccCHHHHhccCCCCCCCCCCCCHHHhHHHHHHC-C
Confidence 789999999999999999999999999999999999999999999999999999986447899999999999999887 9
Q ss_pred CCCCCCCcccCCCCCcCCCCCCCCcEEecceEEcCCChHHHHHHHHHh-CCeEEEEecCCcccccccCceEeCCCC--CC
Q 018104 207 VTTEAKYPYQANDGTCDVSKESSPAVSIDGHENVPANHEDALLKAVAK-QPVSVAIDAGSSDFQFYSEGVFTGECG--TE 283 (360)
Q Consensus 207 i~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~~-gPV~v~~~~~~~~f~~y~~Giy~~~~~--~~ 283 (360)
+++|++|||......|..... ....++.+|..++..+.++||++|.+ |||++++.+. ++|..|++|||+.++. ..
T Consensus 80 i~~e~~yPY~~~~~~C~~~~~-~~~~~i~~~~~i~~~~~~~ik~~l~~~gPV~~~~~~~-~~f~~y~~Giy~~~~~~~~~ 157 (210)
T cd02248 80 LASESDYPYTGKDGTCKYNSS-KVGAKITGYSNVPPGDEEALKAALANYGPVSVAIDAS-SSFQFYKGGIYSGPCCSNTN 157 (210)
T ss_pred cCccccCCccCCCCCccCCCC-cccEEEeeEEEcCCCcHHHHHHHHhhcCCEEEEEecC-cccccCCCCceeCCCCCCCc
Confidence 999999999988888986653 56789999999987678999999955 8999999997 6899999999987543 56
Q ss_pred CCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccccccceee
Q 018104 284 LNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASYP 341 (360)
Q Consensus 284 ~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~~ 341 (360)
++|||+|||||++ .+++|||||||||++||++|||||+++. +.|||++++.||
T Consensus 158 ~~Hav~iVGy~~~-~~~~ywiv~NSWG~~WG~~Gy~~i~~~~----~~cgi~~~~~~~ 210 (210)
T cd02248 158 LNHAVLLVGYGTE-NGVDYWIVKNSWGTSWGEKGYIRIARGS----NLCGIASYASYP 210 (210)
T ss_pred CCEEEEEEEEeec-CCceEEEEEcCCCCccccCcEEEEEcCC----CccCceeeeecC
Confidence 7999999999997 6889999999999999999999999984 569999888775
No 9
>cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane
Probab=100.00 E-value=8.8e-56 Score=401.70 Aligned_cols=205 Identities=36% Similarity=0.725 Sum_probs=173.0
Q ss_pred CCceecCCC--CCC--CCCCCCCCCCcHHHHHHHHHHHHHHHHhcC--CccccCHHHHHhhcCCCCCCCCCcchhhHHHH
Q 018104 127 PPSVDWRKK--GSV--TAVKDQGQCGSCWAFSTIAAVEGINHIMTN--KLVSLSEQELVDCDTDQNQGCNGGLMELAFEF 200 (360)
Q Consensus 127 P~~~Dwr~~--g~v--tpV~dQg~cGsCwAfA~~~~le~~~~~~~~--~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~ 200 (360)
|++||||++ +++ +||+|||.||+|||||++++||+++.++++ +.+.||+|+|+||...++.||+||++..|++|
T Consensus 1 p~~~DwR~~~~~~~~v~~v~dQg~CGsCwAfa~~~~le~~~~i~~~~~~~~~LS~Q~lidC~~~~~~gC~GG~~~~a~~~ 80 (236)
T cd02620 1 PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQDLLSCCSGCGDGCNGGYPDAAWKY 80 (236)
T ss_pred CCcccchhhCCCCCCccccCCcccchhHHHHHHHHHHhhHHHHhcCCCCccccCHHHHHhhcCCCCCCCCCCCHHHHHHH
Confidence 889999997 554 599999999999999999999999999887 78999999999998755789999999999999
Q ss_pred HHHcCCCCCCCCCcccCCCCC------------------cCCCCC---CCCcEEecceEEcCCChHHHHHHHHH-hCCeE
Q 018104 201 IKKKGGVTTEAKYPYQANDGT------------------CDVSKE---SSPAVSIDGHENVPANHEDALLKAVA-KQPVS 258 (360)
Q Consensus 201 ~~~~~Gi~~e~~yPY~~~~~~------------------c~~~~~---~~~~~~i~~~~~v~~~~~~~i~~~l~-~gPV~ 258 (360)
++++ |+++|++|||...+.. |..... .....++..+..+. .++++|+.+|. +|||+
T Consensus 81 i~~~-G~~~e~~yPY~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~~~~~~~~-~~~~~ik~~l~~~GPv~ 158 (236)
T cd02620 81 LTTT-GVVTGGCQPYTIPPCGHHPEGPPPCCGTPYCTPKCQDGCEKTYEEDKHKGKSAYSVP-SDETDIMKEIMTNGPVQ 158 (236)
T ss_pred HHhc-CCCcCCEecCcCCCCccCCCCCCCCCCCCCCCCCCCcCCccccceeeeeecceeeeC-CHHHHHHHHHHHCCCeE
Confidence 9988 9999999999876543 322111 01123445555554 36789999995 58999
Q ss_pred EEEecCCcccccccCceEeCCCCC-CCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCcccccc
Q 018104 259 VAIDAGSSDFQFYSEGVFTGECGT-ELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAME 337 (360)
Q Consensus 259 v~~~~~~~~f~~y~~Giy~~~~~~-~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~ 337 (360)
++|.+. ++|+.|++|||+..++. .++|||+|||||++ ++++|||||||||++|||+|||||+|+. +.|||+++
T Consensus 159 v~i~~~-~~f~~Y~~Giy~~~~~~~~~~HaV~iVGyg~~-~g~~YWivrNSWG~~WGe~Gy~ri~~~~----~~cgi~~~ 232 (236)
T cd02620 159 AAFTVY-EDFLYYKSGVYQHTSGKQLGGHAVKIIGWGVE-NGVPYWLAANSWGTDWGENGYFRILRGS----NECGIESE 232 (236)
T ss_pred EEEEec-hhhhhcCCcEEeecCCCCcCCeEEEEEEEecc-CCeeEEEEEeCCCCCCCCCcEEEEEccC----cccccccc
Confidence 999996 78999999999875554 46899999999987 8899999999999999999999999984 56999998
Q ss_pred ce
Q 018104 338 AS 339 (360)
Q Consensus 338 ~~ 339 (360)
++
T Consensus 233 ~~ 234 (236)
T cd02620 233 VV 234 (236)
T ss_pred ee
Confidence 75
No 10
>PF00112 Peptidase_C1: Papain family cysteine protease This is family C1 in the peptidase classification. ; InterPro: IPR000668 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues. The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides are their ability to inhibit the activity of their cognate enzymes and the high selectivity that certain propeptides exhibit for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. ; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MOR_B 3HHI_B 1S4V_A 3F75_A 1MEG_A 1PCI_C 1PPO_A 3HD3_B 1F29_A 1EWL_A ....
Probab=100.00 E-value=4.6e-54 Score=385.99 Aligned_cols=213 Identities=45% Similarity=0.865 Sum_probs=182.0
Q ss_pred CCCceecCCC-CCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhc-CCccccCHHHHHhhcCCCCCCCCCcchhhHHHHHHH
Q 018104 126 IPPSVDWRKK-GSVTAVKDQGQCGSCWAFSTIAAVEGINHIMT-NKLVSLSEQELVDCDTDQNQGCNGGLMELAFEFIKK 203 (360)
Q Consensus 126 lP~~~Dwr~~-g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~-~~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~~~~ 203 (360)
||++||||+. +.++||+|||.||+|||||+++++|++++++. ...++||+|+|++|....+.+|+||++..|++++++
T Consensus 1 lP~~~D~r~~~~~~~~v~dQg~~gsCwafa~~~~~e~~~~~~~~~~~~~lS~q~l~~~~~~~~~~c~gg~~~~a~~~~~~ 80 (219)
T PF00112_consen 1 LPKSFDWRDKGGRITPVRDQGSCGSCWAFAAAAALESRLAIQNNGKNVDLSEQYLIDCSNKYNKGCDGGSPFDALKYIKN 80 (219)
T ss_dssp STSSEEGGGTTTCSG---BTTSSBTHHHHHHHHHHHHHHHHHHTSSCEEB-HHHHHHHSTGTSSTTBBBEHHHHHHHHHH
T ss_pred CCCCEecccCCCCcCccccCCcccccccchhccceeccccccccccccccccccccccccccccccccCcccccceeecc
Confidence 7999999998 48999999999999999999999999999999 788999999999999733679999999999999999
Q ss_pred cCCCCCCCCCcccCCC-CCcCCCCCCCCcEEecceEEcCCChHHHHHHHHHh-CCeEEEEecCCcccccccCceEeCC-C
Q 018104 204 KGGVTTEAKYPYQAND-GTCDVSKESSPAVSIDGHENVPANHEDALLKAVAK-QPVSVAIDAGSSDFQFYSEGVFTGE-C 280 (360)
Q Consensus 204 ~~Gi~~e~~yPY~~~~-~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~~-gPV~v~~~~~~~~f~~y~~Giy~~~-~ 280 (360)
+.|+++|++|||.... ..|..........++..+..+...++++|+++|.+ |||++++.+...+|..|++|||+.+ +
T Consensus 81 ~~Gi~~e~~~pY~~~~~~~c~~~~~~~~~~~i~~~~~~~~~~~~~ik~~L~~~gpV~~~~~~~~~~f~~~~~gi~~~~~~ 160 (219)
T PF00112_consen 81 NNGIVTEEDYPYNGNENPTCKSKKSNSYYVKIKGYGKVKDNDIEDIKKALMKYGPVVASIDVSSEDFQNYKSGIYDPPDC 160 (219)
T ss_dssp HTSBEBTTTS--SSSSSCSSCHSGGGEEEBEESEEEEEESTCHHHHHHHHHHHSSEEEEEEEESHHHHTEESSEECSTSS
T ss_pred cCcccccccccccccccccccccccccccccccccccccccchhHHHHHHhhCceeeeeeeccccccccccceeeecccc
Confidence 3499999999999877 67876532112468889998877779999999965 8999999998446999999999884 5
Q ss_pred C-CCCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccccccceeee
Q 018104 281 G-TELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASYPI 342 (360)
Q Consensus 281 ~-~~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~~~ 342 (360)
. ..++|||+|||||++ .+++|||||||||++||++|||||+|+.+ ++|||++.++||+
T Consensus 161 ~~~~~~Hav~iVGy~~~-~~~~~wiv~NSWG~~WG~~Gy~~i~~~~~---~~c~i~~~~~~~~ 219 (219)
T PF00112_consen 161 SNESGGHAVLIVGYDDE-NGKGYWIVKNSWGTDWGDNGYFRISYDYN---NECGIESQAVYPI 219 (219)
T ss_dssp SSSSEEEEEEEEEEEEE-TTEEEEEEE-SBTTTSTBTTEEEEESSSS---SGGGTTSSEEEEE
T ss_pred ccccccccccccccccc-cceeeEeeehhhCCccCCCeEEEEeeCCC---CcCccCceeeecC
Confidence 5 467999999999998 68999999999999999999999999964 3699999999995
No 11
>PTZ00364 dipeptidyl-peptidase I precursor; Provisional
Probab=100.00 E-value=4e-53 Score=416.71 Aligned_cols=210 Identities=26% Similarity=0.520 Sum_probs=175.7
Q ss_pred CCCCCCceecCCCC---CCCCCCCCCC---CCcHHHHHHHHHHHHHHHHhcC------CccccCHHHHHhhcCCCCCCCC
Q 018104 123 VTSIPPSVDWRKKG---SVTAVKDQGQ---CGSCWAFSTIAAVEGINHIMTN------KLVSLSEQELVDCDTDQNQGCN 190 (360)
Q Consensus 123 ~~~lP~~~Dwr~~g---~vtpV~dQg~---cGsCwAfA~~~~le~~~~~~~~------~~~~lS~q~l~dc~~~~~~gc~ 190 (360)
..+||++||||+.| +|+||||||. ||||||||++++||++++++++ ..+.||+|+|+||+.. +.||+
T Consensus 202 ~~~LP~sfDWR~~gg~~~VtpVrdQg~~~~CGSCWAFAav~alEsr~~I~tn~~~~~g~~~~LS~QqLVDCs~~-n~GCd 280 (548)
T PTZ00364 202 GDPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDPLGQQTFLSARHVLDCSQY-GQGCA 280 (548)
T ss_pred ccCCCCccccCcCCCCccCCCCcCCCCCCCCcCHHHHHHHHHHHHHHHHHhCCCcccCcccCcCHHHHhcccCC-CCCCC
Confidence 35799999999987 7999999999 9999999999999999999873 4688999999999875 78999
Q ss_pred CcchhhHHHHHHHcCCCCCCCCC--cccCCCC---CcCCCCCCCCcEE------ecceEEcCCChHHHHHHHH-HhCCeE
Q 018104 191 GGLMELAFEFIKKKGGVTTEAKY--PYQANDG---TCDVSKESSPAVS------IDGHENVPANHEDALLKAV-AKQPVS 258 (360)
Q Consensus 191 GG~~~~a~~~~~~~~Gi~~e~~y--PY~~~~~---~c~~~~~~~~~~~------i~~~~~v~~~~~~~i~~~l-~~gPV~ 258 (360)
||++..|++|++++ |+++|++| ||.+.++ .|..... ...+. +.+|..+. .++++|+.+| .+|||+
T Consensus 281 GG~p~~A~~yi~~~-GI~tE~dY~~PY~~~dg~~~~Ck~~~~-~~~y~~~~~~~I~gyy~~~-~~e~~I~~eI~~~GPVs 357 (548)
T PTZ00364 281 GGFPEEVGKFAETF-GILTTDSYYIPYDSGDGVERACKTRRP-SRRYYFTNYGPLGGYYGAV-TDPDEIIWEIYRHGPVP 357 (548)
T ss_pred CCcHHHHHHHHHhC-CcccccccCCCCCCCCCCCCCCCCCcc-cceeeeeeeEEecceeecC-CcHHHHHHHHHHcCCeE
Confidence 99999999999887 99999999 9987655 4865432 22333 33444443 3678899988 469999
Q ss_pred EEEecCCcccccccCceEeC---------CC-----------CCCCCeEEEEEEeeecCCCceEEEEEcCCCC--CCCCC
Q 018104 259 VAIDAGSSDFQFYSEGVFTG---------EC-----------GTELNHGVAAVGYGTTLDGTKYWIVRNSWGP--EWGEK 316 (360)
Q Consensus 259 v~~~~~~~~f~~y~~Giy~~---------~~-----------~~~~~Hav~iVGyg~~~~g~~ywivkNSWG~--~WG~~ 316 (360)
|+|++. .+|..|++|||.+ .| ...++|||+|||||.+++|.+|||||||||+ +|||+
T Consensus 358 VaIda~-~df~~YksGiy~gi~~~~~~~~~~~~~~~~~~~~~~~~~nHAVlIVGYG~de~G~~YWIVKNSWGt~~~WGE~ 436 (548)
T PTZ00364 358 ASVYAN-SDWYNCDENSTEDVRYVSLDDYSTASADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSWCDG 436 (548)
T ss_pred EEEEec-hHHHhcCCCCccCeeccccccccccccCCcccccccccCCeEEEEEEecccCCCceEEEEECCCCCCCCcccC
Confidence 999997 6899999998752 11 1347999999999986578999999999999 99999
Q ss_pred cEEEEEecCCCCCCCccccccceee
Q 018104 317 GYIRMQRGISDKKGLCGIAMEASYP 341 (360)
Q Consensus 317 Gy~~i~~~~~~~~~~Cgi~~~~~~~ 341 (360)
|||||+|+. |.|||+++++..
T Consensus 437 GYfRI~RG~----N~CGIes~~v~~ 457 (548)
T PTZ00364 437 GTRKIARGV----NAYNIESEVVVM 457 (548)
T ss_pred CeEEEEcCC----Ccccccceeeee
Confidence 999999995 569999998843
No 12
>PTZ00049 cathepsin C-like protein; Provisional
Probab=100.00 E-value=6.3e-53 Score=419.14 Aligned_cols=214 Identities=29% Similarity=0.610 Sum_probs=177.2
Q ss_pred CCCCCCceecCCC----CCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCC-----c-----cccCHHHHHhhcCCCCCC
Q 018104 123 VTSIPPSVDWRKK----GSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNK-----L-----VSLSEQELVDCDTDQNQG 188 (360)
Q Consensus 123 ~~~lP~~~Dwr~~----g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~-----~-----~~lS~q~l~dc~~~~~~g 188 (360)
..+||++||||+. +.++||+|||.||||||||++++||++++|++++ . ..||+|+|+||+.. +.|
T Consensus 378 ~~~LP~sfDWRd~~~~~~~vtpVkdQG~CGSCWAFAat~alEsR~~Ia~~~~l~~~~~~~~~~~LS~QqLLDCs~~-nqG 456 (693)
T PTZ00049 378 IDELPKNFTWGDPFNNNTREYDVTNQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNFDDLLSIQTVLSCSFY-DQG 456 (693)
T ss_pred cccCCCCEecCcCCCCCCcccCCCCCccCcHHHHHHHHHHHHHHHHHHhccccccccccccccCcCHHHhcccCCC-CCC
Confidence 4689999999984 6799999999999999999999999999998642 1 27999999999875 899
Q ss_pred CCCcchhhHHHHHHHcCCCCCCCCCcccCCCCCcCCCCCC--------------------------------------CC
Q 018104 189 CNGGLMELAFEFIKKKGGVTTEAKYPYQANDGTCDVSKES--------------------------------------SP 230 (360)
Q Consensus 189 c~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~--------------------------------------~~ 230 (360)
|+||++..|++|+++. ||++|.+|||.+..+.|...... ..
T Consensus 457 C~GG~~~~A~kya~~~-GI~tEscYPY~a~~g~C~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 535 (693)
T PTZ00049 457 CNGGFPYLVSKMAKLQ-GIPLDKVFPYTATEQTCPYQVDQSANSMNGSANLRQINAVFFSSETQSDMHADFEAPISSEPA 535 (693)
T ss_pred cCCCcHHHHHHHHHHC-CCCcCCccCCcCCCCCCCCCCCCcccccccccccccccccccccccccccccccccccccccc
Confidence 9999999999999887 99999999999887788542110 11
Q ss_pred cEEecceEEcC-------CChHHHHHHHHH-hCCeEEEEecCCcccccccCceEeC-------CCCC-------------
Q 018104 231 AVSIDGHENVP-------ANHEDALLKAVA-KQPVSVAIDAGSSDFQFYSEGVFTG-------ECGT------------- 282 (360)
Q Consensus 231 ~~~i~~~~~v~-------~~~~~~i~~~l~-~gPV~v~~~~~~~~f~~y~~Giy~~-------~~~~------------- 282 (360)
++.++.|..+. ..++++|+++|. +|||+|+|++. ++|++|++|||+. .|..
T Consensus 536 r~y~k~y~yI~g~y~~~~~~~E~~Im~eI~~~GPVsVsIda~-~dF~~YksGVY~~~~~~h~~~C~~d~~~~~~~~~~~G 614 (693)
T PTZ00049 536 RWYAKDYNYIGGCYGCNQCNGEKIMMNEIYRNGPIVASFEAS-PDFYDYADGVYYVEDFPHARRCTVDLPKHNGVYNITG 614 (693)
T ss_pred ceeeeeeEEecccccccCCCCHHHHHHHHHhcCCEEEEEEec-hhhhcCCCccccCcccccccccCCccccccccccccc
Confidence 23345555442 246788999985 69999999997 6899999999974 2532
Q ss_pred --CCCeEEEEEEeeecC-CCc--eEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccccccceeeee
Q 018104 283 --ELNHGVAAVGYGTTL-DGT--KYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASYPIK 343 (360)
Q Consensus 283 --~~~Hav~iVGyg~~~-~g~--~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~~~~ 343 (360)
.++|||+|||||.+. +|+ +|||||||||++||++|||||+|+. |.|||++++.|+..
T Consensus 615 ~e~~NHAVlIVGwG~d~enG~~~~YWIVRNSWGt~WGenGYfKI~RG~----N~CGIEs~a~~~~p 676 (693)
T PTZ00049 615 WEKVNHAIVLVGWGEEEINGKLYKYWIGRNSWGKNWGKEGYFKIIRGK----NFSGIESQSLFIEP 676 (693)
T ss_pred cccCceEEEEEEeccccCCCcccCEEEEECCCCCCcccCceEEEEcCC----CccCCccceeEEee
Confidence 368999999999753 453 7999999999999999999999994 56999999998764
No 13
>smart00645 Pept_C1 Papain family cysteine protease.
Probab=100.00 E-value=1.4e-49 Score=344.56 Aligned_cols=168 Identities=62% Similarity=1.159 Sum_probs=148.9
Q ss_pred CCCceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHHhhcCCCCCCCCCcchhhHHHHHHHcC
Q 018104 126 IPPSVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELVDCDTDQNQGCNGGLMELAFEFIKKKG 205 (360)
Q Consensus 126 lP~~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~~~~~~ 205 (360)
||++||||+.++++||+|||.||+|||||+++++|++++++++..++||+|+|++|....+.+|+||++..|++|+.+++
T Consensus 1 lP~~~D~R~~~~~~~v~dQg~CGsCwAfa~~~~ie~~~~i~~~~~~~lS~q~l~~C~~~~~~gC~GG~~~~a~~~~~~~~ 80 (174)
T smart00645 1 LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCSTGGNNGCNGGLPDNAFEYIKKNG 80 (174)
T ss_pred CCCcCcccccCCCCccccCcccchHHHHHHHHHHHHHHHHhcCCccccCHHHHhhhcCCCCCCCCCcCHHHHHHHHHHcC
Confidence 69999999999999999999999999999999999999999998999999999999874356999999999999998866
Q ss_pred CCCCCCCCcccCCCCCcCCCCCCCCcEEecceEEcCCChHHHHHHHHHhCCeEEEEecCCcccccccCceEeC-CCCCC-
Q 018104 206 GVTTEAKYPYQANDGTCDVSKESSPAVSIDGHENVPANHEDALLKAVAKQPVSVAIDAGSSDFQFYSEGVFTG-ECGTE- 283 (360)
Q Consensus 206 Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~~gPV~v~~~~~~~~f~~y~~Giy~~-~~~~~- 283 (360)
|+++|++|||.. ++.+.+. +|+.|++|||+. .|...
T Consensus 81 Gi~~e~~~PY~~----------------------------------------~~~~~~~--~f~~Y~~Gi~~~~~~~~~~ 118 (174)
T smart00645 81 GLETESCYPYTG----------------------------------------SVAIDAS--DFQFYKSGIYDHPGCGSGT 118 (174)
T ss_pred CcccccccCccc----------------------------------------EEEEEcc--cccCCcCeEECCCCCCCCc
Confidence 999999999974 4555554 599999999987 47653
Q ss_pred CCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCccccccc
Q 018104 284 LNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEA 338 (360)
Q Consensus 284 ~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~ 338 (360)
.+|+|+|||||.+.++++|||||||||+.||++|||||.|+.. +.|||+...
T Consensus 119 ~~Hav~ivGyg~~~~g~~yWii~NSwG~~WG~~G~~~i~~~~~---~~c~i~~~~ 170 (174)
T smart00645 119 LDHAVLIVGYGTEENGKDYWIVKNSWGTDWGENGYFRIARGKN---NECGIEASV 170 (174)
T ss_pred ccEEEEEEEEeecCCCeeEEEEECCCCCCcccCeEEEEEcCCC---CccCceeee
Confidence 7999999999986578899999999999999999999999842 459996543
No 14
>cd02619 Peptidase_C1 C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel str
Probab=100.00 E-value=6.2e-46 Score=333.99 Aligned_cols=194 Identities=35% Similarity=0.564 Sum_probs=167.1
Q ss_pred ceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcC--CccccCHHHHHhhcCCC----CCCCCCcchhhHHH-HH
Q 018104 129 SVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTN--KLVSLSEQELVDCDTDQ----NQGCNGGLMELAFE-FI 201 (360)
Q Consensus 129 ~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~--~~~~lS~q~l~dc~~~~----~~gc~GG~~~~a~~-~~ 201 (360)
++|||+.+ ++||+|||.||+|||||+++++|++++++++ ..++||+|+|++|.... ..+|.||.+..++. ++
T Consensus 1 ~~d~r~~~-~~~v~dQg~~gsCwafa~~~~les~~~~~~~~~~~~~lS~q~l~~c~~~~~~~~~~~c~gG~~~~~~~~~~ 79 (223)
T cd02619 1 SVDLRPLR-LTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECLGINGSCDGGGPLSALLKLV 79 (223)
T ss_pred CCcchhcC-CCCcccCCCCcCcHHHHHHHHHHHHHHHhcCCcccccCCHHHHHHhccccccccCCCCCCCcHHHHHHHHH
Confidence 48999988 9999999999999999999999999999987 88999999999998763 37999999999998 77
Q ss_pred HHcCCCCCCCCCcccCCCCCcCCC---CCCCCcEEecceEEcCCChHHHHHHHHHh-CCeEEEEecCCcccccccCceEe
Q 018104 202 KKKGGVTTEAKYPYQANDGTCDVS---KESSPAVSIDGHENVPANHEDALLKAVAK-QPVSVAIDAGSSDFQFYSEGVFT 277 (360)
Q Consensus 202 ~~~~Gi~~e~~yPY~~~~~~c~~~---~~~~~~~~i~~~~~v~~~~~~~i~~~l~~-gPV~v~~~~~~~~f~~y~~Giy~ 277 (360)
+.+ |+++|.+|||......|... .......++..|..+...++++||++|.+ |||++++.+. ..|..|++|++.
T Consensus 80 ~~~-Gi~~e~~~Py~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~~ik~aL~~~gPv~~~~~~~-~~~~~~~~~~~~ 157 (223)
T cd02619 80 ALK-GIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRVLKNNIEDIKEALAKGGPVVAGFDVY-SGFDRLKEGIIY 157 (223)
T ss_pred HHc-CCCccccCCCCCCCCCCCCCCccchhhcceeecceeEeCchhHHHHHHHHHHCCCEEEEEEcc-cchhcccCcccc
Confidence 776 99999999999877766532 12235678899988877778999999965 8999999997 789999999862
Q ss_pred ------CCC-CCCCCeEEEEEEeeecC-CCceEEEEEcCCCCCCCCCcEEEEEecC
Q 018104 278 ------GEC-GTELNHGVAAVGYGTTL-DGTKYWIVRNSWGPEWGEKGYIRMQRGI 325 (360)
Q Consensus 278 ------~~~-~~~~~Hav~iVGyg~~~-~g~~ywivkNSWG~~WG~~Gy~~i~~~~ 325 (360)
..+ ...++|||+|||||++. .+++|||||||||+.||++||+||+++.
T Consensus 158 ~~~~~~~~~~~~~~~Hav~ivGy~~~~~~~~~~~i~~NSwG~~wg~~Gy~~i~~~~ 213 (223)
T cd02619 158 EEIVYLLYEDGDLGGHAVVIVGYDDNYVEGKGAFIVKNSWGTDWGDNGYGRISYED 213 (223)
T ss_pred ccccccccCCCccCCeEEEEEeecCCCCCCCCEEEEEeCCCCccccCCEEEEehhh
Confidence 122 34579999999999873 2789999999999999999999999985
No 15
>PTZ00462 Serine-repeat antigen protein; Provisional
Probab=100.00 E-value=1.1e-44 Score=369.60 Aligned_cols=210 Identities=22% Similarity=0.435 Sum_probs=167.9
Q ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHHhhcCC-CCCCCCCcchh-hHHHHHHHcCCCCCCCCCcc
Q 018104 138 VTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELVDCDTD-QNQGCNGGLME-LAFEFIKKKGGVTTEAKYPY 215 (360)
Q Consensus 138 vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~dc~~~-~~~gc~GG~~~-~a~~~~~~~~Gi~~e~~yPY 215 (360)
..||+|||.||+|||||+++++|++++++++..+.||+|+|+||+.. ++.||.||... .++.|+.+++|+++|.+|||
T Consensus 544 ~i~VKDQG~CGSCWAFASaaaLES~~cIkgg~~v~LSeQqLVDCs~~~gn~GC~GG~~~~efl~yI~e~GgLptESdYPY 623 (1004)
T PTZ00462 544 KIQIEDQGNCAISWIFASKYHLETIKCMKGYEPHAISALYIANCSKGEHKDRCDEGSNPLEFLQIIEDNGFLPADSNYLY 623 (1004)
T ss_pred CCCcccCCcchHHHHHHHHHHHHHHHHHhcCCCcccCHHHHHhcccccCCCCCCCCCcHHHHHHHHHHcCCCcccccCCC
Confidence 57899999999999999999999999999999999999999999864 46899999744 56689988867999999999
Q ss_pred cC--CCCCcCCCCC-----------------CCCcEEecceEEcCCC----h----HHHHHHHHHh-CCeEEEEecCCcc
Q 018104 216 QA--NDGTCDVSKE-----------------SSPAVSIDGHENVPAN----H----EDALLKAVAK-QPVSVAIDAGSSD 267 (360)
Q Consensus 216 ~~--~~~~c~~~~~-----------------~~~~~~i~~~~~v~~~----~----~~~i~~~l~~-gPV~v~~~~~~~~ 267 (360)
.. ..+.|..... ......+.+|..+... + +++|+++|.+ |||+|+|++. +
T Consensus 624 t~k~~~g~Cp~~~~~w~n~~~~~kll~~~~~~~~~i~~kgY~~~~s~~~~~n~d~~i~~IK~eI~~kGPVaV~IdAs--d 701 (1004)
T PTZ00462 624 NYTKVGEDCPDEEDHWMNLLDHGKILNHNKKEPNSLDGKAYRAYESEHFHDKMDAFIKIIKDEIMNKGSVIAYIKAE--N 701 (1004)
T ss_pred ccCCCCCCCCCCcccccccccccccccccccccceeeccceEEecccccccchhhHHHHHHHHHHhcCCEEEEEEee--h
Confidence 75 4567864211 0112334566655431 1 3688888955 8999999985 6
Q ss_pred ccccc-CceEe-CCCCC-CCCeEEEEEEeeecC----CCceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCcccccccee
Q 018104 268 FQFYS-EGVFT-GECGT-ELNHGVAAVGYGTTL----DGTKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASY 340 (360)
Q Consensus 268 f~~y~-~Giy~-~~~~~-~~~Hav~iVGyg~~~----~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~ 340 (360)
|+.|. +|||. ..|+. .++|||+|||||.+. .+++|||||||||+.||++|||||.|.. .++|||+....+
T Consensus 702 f~~Y~~sGIyv~~~Cgs~~~nHAVlIVGYGt~in~eg~gk~YWIVRNSWGt~WGEnGYFKI~r~g---~n~CGin~i~t~ 778 (1004)
T PTZ00462 702 VLGYEFNGKKVQNLCGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGYFKVDMYG---PSHCEDNFIHSV 778 (1004)
T ss_pred HHhhhcCCccccCCCCCCcCCceEEEEEecccccccCCCCceEEEEcCCCCCcCCCeEEEEEeCC---CCCCccchheee
Confidence 88885 89864 46874 579999999999742 3579999999999999999999999843 356999999999
Q ss_pred eeecCCCCCCCC
Q 018104 341 PIKKSATNPTGP 352 (360)
Q Consensus 341 ~~~~~~~~~~~~ 352 (360)
++++...|....
T Consensus 779 ~~fn~d~~~~~~ 790 (1004)
T PTZ00462 779 VIFNIDLPKNKK 790 (1004)
T ss_pred eeEeeccccccC
Confidence 999888876553
No 16
>KOG1544 consensus Predicted cysteine proteinase TIN-ag [General function prediction only]
Probab=100.00 E-value=4.5e-41 Score=300.93 Aligned_cols=265 Identities=27% Similarity=0.507 Sum_probs=200.9
Q ss_pred HHHHhhCCCCCCeEEe-cccCCCCChhhhhhccccccccccccccccCCCcccccCCCCCCCCceecCCC--CCCCCCCC
Q 018104 67 MHVHQTNKMDKPYKLK-LNKFADMTNHEFASTYAGSKIKHHRMFQGTRGNGTFMYGKVTSIPPSVDWRKK--GSVTAVKD 143 (360)
Q Consensus 67 ~~I~~~N~~~~s~~~g-~N~fsD~t~~Ef~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lP~~~Dwr~~--g~vtpV~d 143 (360)
.+|+++|..+.+|.++ ..+|..||.++--+..++..++.. ++..-... ........+||+.||-|++ +++.|+.|
T Consensus 151 d~iE~in~G~YgW~A~NYSaFWGmtL~DGiKyRLGTL~Ps~-sv~nMNEi-~~~l~p~~~LPE~F~As~KWp~liH~plD 228 (470)
T KOG1544|consen 151 DMIEAINQGNYGWQAGNYSAFWGMTLDDGIKYRLGTLRPSS-SVMNMNEI-YTVLNPGEVLPEAFEASEKWPNLIHEPLD 228 (470)
T ss_pred HHHHHHhcCCccccccchhhhhcccccccceeeecccCchh-hhhhHHhH-hhccCcccccchhhhhhhcCCccccCccc
Confidence 3689999888999997 679999999887666665443322 11100000 0011234689999999987 89999999
Q ss_pred CCCCCcHHHHHHHHHHHHHHHHhcCC--ccccCHHHHHhhcCCCCCCCCCcchhhHHHHHHHcCCCCCCCCCcccCCC--
Q 018104 144 QGQCGSCWAFSTIAAVEGINHIMTNK--LVSLSEQELVDCDTDQNQGCNGGLMELAFEFIKKKGGVTTEAKYPYQAND-- 219 (360)
Q Consensus 144 Qg~cGsCwAfA~~~~le~~~~~~~~~--~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~~-- 219 (360)
||+|++.|||+++++...+++|++.. ...||+|+|++|......||.||....|+-|+.+. |++...+|||...+
T Consensus 229 QgnCa~SWafSTaavasDRiAI~S~GR~t~~LSpQnLlSC~~h~q~GC~gG~lDRAWWYlRKr-GvVsdhCYP~~~dQ~~ 307 (470)
T KOG1544|consen 229 QGNCAGSWAFSTAAVASDRVAIHSLGRMTPVLSPQNLLSCDTHQQQGCRGGRLDRAWWYLRKR-GVVSDHCYPFSGDQAG 307 (470)
T ss_pred cCCcccceeeeeehhccceeEEeeccccccccChHHhcchhhhhhccCccCcccchheeeecc-cccccccccccCCCCC
Confidence 99999999999999999999887643 46899999999988778999999999999999887 99999999997522
Q ss_pred --CCcCCC------------------CC-CCCcEEecceEEcCCChHHHHHHHH-HhCCeEEEEecCCcccccccCceEe
Q 018104 220 --GTCDVS------------------KE-SSPAVSIDGHENVPANHEDALLKAV-AKQPVSVAIDAGSSDFQFYSEGVFT 277 (360)
Q Consensus 220 --~~c~~~------------------~~-~~~~~~i~~~~~v~~~~~~~i~~~l-~~gPV~v~~~~~~~~f~~y~~Giy~ 277 (360)
+.|... .. ....++++.-..|.. ++++|++.| .+|||-+.|.+- ++|..|++|||.
T Consensus 308 ~~~~C~m~sR~~grgkRqat~~CPn~~~~Sn~iyq~tPPYrVSS-nE~eImkElM~NGPVQA~m~VH-EDFF~YkgGiY~ 385 (470)
T KOG1544|consen 308 PAPPCMMHSRAMGRGKRQATAHCPNSYVNSNDIYQVTPPYRVSS-NEKEIMKELMENGPVQALMEVH-EDFFLYKGGIYS 385 (470)
T ss_pred CCCCceeeccccCcccccccCcCCCcccccCceeeecCCeeccC-CHHHHHHHHHhCCChhhhhhhh-hhhhhhccceee
Confidence 233211 10 012233333334544 566666666 789999999885 899999999997
Q ss_pred CCCC---------CCCCeEEEEEEeeecC--CC--ceEEEEEcCCCCCCCCCcEEEEEecCCCCCCCcccccccee
Q 018104 278 GECG---------TELNHGVAAVGYGTTL--DG--TKYWIVRNSWGPEWGEKGYIRMQRGISDKKGLCGIAMEASY 340 (360)
Q Consensus 278 ~~~~---------~~~~Hav~iVGyg~~~--~g--~~ywivkNSWG~~WG~~Gy~~i~~~~~~~~~~Cgi~~~~~~ 340 (360)
+... ..+.|+|.|.|||.+. .| .+|||..||||+.|||+|||||.|+.++ |.|++...-
T Consensus 386 H~~~~~~~~e~yr~~gtHsVk~tGWG~~~~~~G~~~KyW~aANSWG~~WGE~GYFriLRGvNe----cdIEsfvIg 457 (470)
T KOG1544|consen 386 HTPVSLGRPERYRRHGTHSVKITGWGEETLPDGRTLKYWTAANSWGPAWGERGYFRILRGVNE----CDIESFVIG 457 (470)
T ss_pred ccccccCCchhhhhcccceEEEeecccccCCCCCeeEEEEeecccccccccCceEEEeccccc----hhhhHhhhh
Confidence 7321 1468999999999873 23 4799999999999999999999999755 999987653
No 17
>COG4870 Cysteine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.97 E-value=4.4e-31 Score=242.62 Aligned_cols=198 Identities=26% Similarity=0.441 Sum_probs=135.4
Q ss_pred CCCCCceecCCCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHhcCCccccCHHHHHh-----hcCCC-CCCCCCcchhhH
Q 018104 124 TSIPPSVDWRKKGSVTAVKDQGQCGSCWAFSTIAAVEGINHIMTNKLVSLSEQELVD-----CDTDQ-NQGCNGGLMELA 197 (360)
Q Consensus 124 ~~lP~~~Dwr~~g~vtpV~dQg~cGsCwAfA~~~~le~~~~~~~~~~~~lS~q~l~d-----c~~~~-~~gc~GG~~~~a 197 (360)
..+|+.||||+.|.|+||||||.||+|||||+++++|+.+.-.. ..++|+..+.. |.... ...-+||....+
T Consensus 97 ~s~~~~fd~r~~g~vs~v~dQg~~Gscwaf~t~~sles~l~~~~--~w~~s~~nm~~ll~~~ye~~fd~~~~d~g~~~m~ 174 (372)
T COG4870 97 ASLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLESYLNPES--AWDFSENNMKNLLGVPYEKGFDYTSNDGGNADMS 174 (372)
T ss_pred ccchhheeeeccCCcccccccCcccceEeeeehhhhhheecccc--cccccccchhhhcCCCccccCCCccccCCccccc
Confidence 35899999999999999999999999999999999999765333 34455544332 22111 111248888888
Q ss_pred HHHHHHcCCCCCCCCCcccCCCCCcCCCCCCCCcEEecceEEcCCC----hHHHHHHHHH-hCCeE--EEEecCCccccc
Q 018104 198 FEFIKKKGGVTTEAKYPYQANDGTCDVSKESSPAVSIDGHENVPAN----HEDALLKAVA-KQPVS--VAIDAGSSDFQF 270 (360)
Q Consensus 198 ~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~----~~~~i~~~l~-~gPV~--v~~~~~~~~f~~ 270 (360)
..|+.++.|.+.+.+-||......|....+ ...++.....++.. +.-.|++++. .|-+. +.|++. .+..
T Consensus 175 ~a~l~e~sgpv~et~d~y~~~s~~~~~~~p--~~k~~~~~~~i~~~~~~LdnG~i~~~~~~yg~~s~~~~id~~--~~~~ 250 (372)
T COG4870 175 AAYLTEWSGPVYETDDPYSENSYFSPTNLP--VTKHVQEAQIIPSRKKYLDNGNIKAMFGFYGAVSSSMYIDAT--NSLG 250 (372)
T ss_pred cccccccCCcchhhcCccccccccCCcCCc--hhhccccceecccchhhhcccchHHHHhhhccccceeEEecc--cccc
Confidence 889999999999999999887666654332 12223333333221 1223555553 34333 235554 2222
Q ss_pred ccCceEeCCCCCCCCeEEEEEEeeecC---------CCceEEEEEcCCCCCCCCCcEEEEEecCCC
Q 018104 271 YSEGVFTGECGTELNHGVAAVGYGTTL---------DGTKYWIVRNSWGPEWGEKGYIRMQRGISD 327 (360)
Q Consensus 271 y~~Giy~~~~~~~~~Hav~iVGyg~~~---------~g~~ywivkNSWG~~WG~~Gy~~i~~~~~~ 327 (360)
..-+.|........+|||+||||||.. .|.++||||||||+.||++|||||++....
T Consensus 251 ~~~~~~~~~s~~~~gHAv~iVGyDDs~~~n~~~~~~~g~GAfiikNSWGt~wG~~GYfwisY~ya~ 316 (372)
T COG4870 251 ICIPYPYVDSGENWGHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGENGYFWISYYYAL 316 (372)
T ss_pred cccCCCCCCccccccceEEEEeccccccccccccCCCCCceEEEECccccccccCceEEEEeeecc
Confidence 333444444446789999999999863 467899999999999999999999998654
No 18
>cd00585 Peptidase_C1B Peptidase C1B subfamily (MEROPS database nomenclature); composed of eukaryotic bleomycin hydrolases (BH) and bacterial aminopeptidases C (pepC). The proteins of this subfamily contain a large insert relative to the C1A peptidase (papain) subfamily. BH is a cysteine peptidase that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. Bleomycin, a glycopeptide derived from the bacterium Streptomyces verticillus, is an effective anticancer drug due to its ability to induce DNA strand breaks. Human BH is the major cause of tumor cell resistance to bleomycin chemotherapy, and is also genetically linked to Alzheimer's disease. In addition to its peptidase activity, the yeast BH (Gal6) binds DNA and acts as a repressor in the Gal4 regulatory system. BH forms a hexameric ring barrel structure w
Probab=99.94 E-value=2.2e-26 Score=222.73 Aligned_cols=183 Identities=23% Similarity=0.370 Sum_probs=131.4
Q ss_pred CCCCCCCCCCcHHHHHHHHHHHHHHHHh-cCCccccCHHHHHh----------------hcCC-----------CCCCCC
Q 018104 139 TAVKDQGQCGSCWAFSTIAAVEGINHIM-TNKLVSLSEQELVD----------------CDTD-----------QNQGCN 190 (360)
Q Consensus 139 tpV~dQg~cGsCwAfA~~~~le~~~~~~-~~~~~~lS~q~l~d----------------c~~~-----------~~~gc~ 190 (360)
.||+||+..|.||.||++..||+.+.+. +.+.++||+.+++. +... .....+
T Consensus 55 ~~vtnQ~~SGrCW~FA~Ln~lr~~~~k~~~~~~felSq~Yl~f~dklEkaN~fle~ii~~~~~~~~~R~v~~ll~~~~~D 134 (437)
T cd00585 55 EPVTNQKSSGRCWLFAALNVLRHQFMKKLNLKEFEFSQSYLFFWDKLEKANYFLENIIETADEPLDDRLVQFLLANPQND 134 (437)
T ss_pred CCcccCCCCchhHHHHCHHHHHHHHHHHcCCCCEEeCcHHHHHHHHHHHHHHHHHHHHHHhcCCCccHHHHHHHhCCcCC
Confidence 3899999999999999999999988874 55789999987754 2111 245679
Q ss_pred CcchhhHHHHHHHcCCCCCCCCCcccC-----------------------------------------------------
Q 018104 191 GGLMELAFEFIKKKGGVTTEAKYPYQA----------------------------------------------------- 217 (360)
Q Consensus 191 GG~~~~a~~~~~~~~Gi~~e~~yPY~~----------------------------------------------------- 217 (360)
||....+...+.++ |+++++.||-+.
T Consensus 135 GGqw~m~~~li~KY-GvVPk~~~pet~~s~~t~~~n~~L~~kLr~~a~~lr~~~~~~~~~~~l~~~~~~~~~~iy~il~~ 213 (437)
T cd00585 135 GGQWDMLVNLIEKY-GLVPKSVMPESFNSENSRRLNYLLNRKLREDALELRKLVAKGASKEEIEAKKEEMLKEVYRILAI 213 (437)
T ss_pred CCchHHHHHHHHHc-CCCcccccCCCcCccchHHHHHHHHHHHHHHHHHHHHHHhcCCcHHHHHHHHHHHHHHHHHHHHH
Confidence 99999999999887 999999999210
Q ss_pred ---------------CCC------------------CcCCC-------CCC--C---CcE-----------EecceEEcC
Q 018104 218 ---------------NDG------------------TCDVS-------KES--S---PAV-----------SIDGHENVP 241 (360)
Q Consensus 218 ---------------~~~------------------~c~~~-------~~~--~---~~~-----------~i~~~~~v~ 241 (360)
.++ .|... .+. . ..+ +...|.+++
T Consensus 214 ~lG~pP~~F~~~y~dkd~~~~~~~~~TP~~F~~~yv~~~~~dyV~l~~~p~~~~p~~~~y~ve~~~Nv~~g~~~~y~Nvp 293 (437)
T cd00585 214 ALGEPPEKFDWEYRDKDKKYHEIKELTPLEFYKKYVKFDLDDYVSLINDPRPDKPYNKLYTVEYLGNVVGGRPILYLNVP 293 (437)
T ss_pred HcCCCCceEEEEEEeCCCCeeeCCCcCHHHHHHHhcCCCccceEEEEeCCCCCCCCCceEEEecCCcccccccceEEecC
Confidence 000 00000 000 0 001 112344454
Q ss_pred CChHHHHH-HHHHhC-CeEEEEecCCcccccccCceEeCC----------------------CCCCCCeEEEEEEeeecC
Q 018104 242 ANHEDALL-KAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGE----------------------CGTELNHGVAAVGYGTTL 297 (360)
Q Consensus 242 ~~~~~~i~-~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~----------------------~~~~~~Hav~iVGyg~~~ 297 (360)
.....++. ++|..| ||.+++++. .|..|++||++.. +.+..+|||+|||||.+.
T Consensus 294 ~d~l~~~~~~~L~~g~pV~~g~Dv~--~~~~~k~GI~d~~~~~~~~~f~~~~~~~KaeRl~~~es~~tHAM~ivGv~~D~ 371 (437)
T cd00585 294 MDVLKKAAIAQLKDGEPVWFGCDVG--KFSDRKSGILDTDLFDYELLFGIDFGLNKAERLDYGESLMTHAMVLTGVDLDE 371 (437)
T ss_pred HHHHHHHHHHHHhcCCCEEEEEEcC--hhhccCCccccCcccchhhhcCccccCCHHHHHhhcCCcCCeEEEEEEEEecC
Confidence 33333322 566665 999999996 5678999999653 223468999999999876
Q ss_pred CCc-eEEEEEcCCCCCCCCCcEEEEEec
Q 018104 298 DGT-KYWIVRNSWGPEWGEKGYIRMQRG 324 (360)
Q Consensus 298 ~g~-~ywivkNSWG~~WG~~Gy~~i~~~ 324 (360)
+|+ .||+||||||+.||++||++|+++
T Consensus 372 ~g~p~yw~VkNSWG~~~G~~Gy~~ms~~ 399 (437)
T cd00585 372 DGKPVKWKVENSWGEKVGKKGYFVMSDD 399 (437)
T ss_pred CCCcceEEEEcccCCCCCCCcceehhHH
Confidence 676 699999999999999999999987
No 19
>PF03051 Peptidase_C1_2: Peptidase C1-like family This family is a subfamily of the Prosite entry; InterPro: IPR004134 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of proteins belongs to MEROPS peptidase family C1, sub-family C1B (bleomycin hydrolase, clan CA). This family contains prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3PW3_F 2CB5_A 1CB5_C 2DZZ_A 2E02_A 2E01_A 2E03_A 1A6R_A 1GCB_A 3GCB_A ....
Probab=99.84 E-value=5.5e-20 Score=178.35 Aligned_cols=183 Identities=27% Similarity=0.454 Sum_probs=111.0
Q ss_pred CCCCCCCCCCcHHHHHHHHHHHHHHHHhcC-CccccCHHHHH----------------hhcCC-----------CCCCCC
Q 018104 139 TAVKDQGQCGSCWAFSTIAAVEGINHIMTN-KLVSLSEQELV----------------DCDTD-----------QNQGCN 190 (360)
Q Consensus 139 tpV~dQg~cGsCwAfA~~~~le~~~~~~~~-~~~~lS~q~l~----------------dc~~~-----------~~~gc~ 190 (360)
.||.||...|.||.||++..++..+.++.+ +.++||+.++. ++... .....+
T Consensus 56 ~~vtnQk~SGRCW~FA~lN~lR~~~~kk~~l~~felSq~Yl~F~DKlEKaN~fLe~ii~~~~~~~d~R~v~~ll~~~~~D 135 (438)
T PF03051_consen 56 GPVTNQKSSGRCWLFAALNVLRHEIMKKLNLKDFELSQNYLFFWDKLEKANYFLENIIDTADEPLDDRLVRFLLKNPVSD 135 (438)
T ss_dssp -S--B--BSSTHHHHHHHHHHHHHHHHHCT-SS--B-HHHHHHHHHHHHHHHHHHHHHHCCTS-TTSHHHHHHHHSTT-S
T ss_pred CCCCCCCCCCCcchhhchHHHHHHHHHHcCCCceEeechHHHHHHHHHHHHHHHHHHHHHhcCCcchHHHHHHHhcCCCC
Confidence 399999999999999999999999888776 78999998874 22211 134578
Q ss_pred CcchhhHHHHHHHcCCCCCCCCCccc------------------------------------------------------
Q 018104 191 GGLMELAFEFIKKKGGVTTEAKYPYQ------------------------------------------------------ 216 (360)
Q Consensus 191 GG~~~~a~~~~~~~~Gi~~e~~yPY~------------------------------------------------------ 216 (360)
||....+.+.++++ |+|+.+.||-+
T Consensus 136 GGqw~~~~nli~KY-GvVPk~~mpet~~s~~t~~~n~~l~~~Lr~~a~~LR~~~~~~~~~~~l~~~k~~~l~~iy~il~~ 214 (438)
T PF03051_consen 136 GGQWDMVVNLIKKY-GVVPKSVMPETFSSSNTSEMNEMLNTKLREYALELRKLVKAGKSEEELRKLKEEMLAEIYRILAI 214 (438)
T ss_dssp -B-HHHHHHHHHHH----BGGGSTTGCGCHBHHHHHHHHHHHHHHHHHHHHHHHHTTTTCHHHHHHHHHHHHHHHHHHHH
T ss_pred CCchHHHHHHHHHc-CcCcHhhCCCCCCCCChHHHHHHHHHHHHHHHHHHHHHHHcCCCHHHHHHHHHHHHHHHHHHHHH
Confidence 99999999999988 99999999910
Q ss_pred --------------CCCC----------------CcCCC--------------CCCCCcEEe-----------cceEEcC
Q 018104 217 --------------ANDG----------------TCDVS--------------KESSPAVSI-----------DGHENVP 241 (360)
Q Consensus 217 --------------~~~~----------------~c~~~--------------~~~~~~~~i-----------~~~~~v~ 241 (360)
..++ .+... .+....+.+ ..|.++|
T Consensus 215 ~lG~PP~~F~~ey~dkd~~~~~~~~~TP~eF~~kyv~~~~ddyVsLin~P~~~~py~~~y~ve~~~Nv~~g~~~~ylNvp 294 (438)
T PF03051_consen 215 YLGEPPEKFTWEYRDKDKKYHRGKNYTPLEFYKKYVGFDLDDYVSLINDPRSHHPYNKLYTVEYLGNVVGGRPVRYLNVP 294 (438)
T ss_dssp HH---SSSEEEEEE-TTS-EEEEEEE-HHHHHHHCTTS-GGGEEEEE--T-TTS-TTCEEEETTTTSSTT-EEEEEEE--
T ss_pred HcCCCChheeEEEeccccccccccccCchhHHHHHhCCCCcceEEEeeCCCccCccceeEEEccCCCEECCcceeEeccC
Confidence 0000 00000 000011111 1244554
Q ss_pred CChH-HHHHHHHHhC-CeEEEEecCCcccccccCceEeCCC----------------------CCCCCeEEEEEEeeecC
Q 018104 242 ANHE-DALLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGEC----------------------GTELNHGVAAVGYGTTL 297 (360)
Q Consensus 242 ~~~~-~~i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~~----------------------~~~~~Hav~iVGyg~~~ 297 (360)
.... +.+.++|..| ||..+.++. . +...+.||.+... .+..+|||+|||.+.+.
T Consensus 295 id~lk~~~i~~Lk~G~~VwfgcDV~-k-~~~~k~Gi~D~~~~d~~~~fg~~~~~~K~~Rl~~~eS~~tHAM~itGv~~D~ 372 (438)
T PF03051_consen 295 IDELKDAAIKSLKAGYPVWFGCDVG-K-FFDRKNGIMDTDLYDYDSLFGVDFNMSKAERLDYGESTMTHAMVITGVDLDE 372 (438)
T ss_dssp HHHHHHHHHHHHHTT--EEEEEETT-T-TEETTTTEE-TTSB-HHHHHT--S-S-HHHHHHTTSS--EEEEEEEEEEE-T
T ss_pred HHHHHHHHHHHHHcCCcEEEeccCC-c-cccccchhhccchhhhhhhhccccccCHHHHHHhCCCCCceeEEEEEEEecc
Confidence 3222 2334445667 999999997 3 4556789875532 12348999999999977
Q ss_pred CCc-eEEEEEcCCCCCCCCCcEEEEEec
Q 018104 298 DGT-KYWIVRNSWGPEWGEKGYIRMQRG 324 (360)
Q Consensus 298 ~g~-~ywivkNSWG~~WG~~Gy~~i~~~ 324 (360)
+|+ .+|+|+||||+..|.+||+.|+..
T Consensus 373 ~g~p~~wkVeNSWG~~~g~kGy~~msd~ 400 (438)
T PF03051_consen 373 DGKPVRWKVENSWGTDNGDKGYFYMSDD 400 (438)
T ss_dssp TSSEEEEEEE-SBTTTSTBTTEEEEEHH
T ss_pred CCCeeEEEEEcCCCCCCCCCcEEEECHH
Confidence 776 699999999999999999999975
No 20
>PF08246 Inhibitor_I29: Cathepsin propeptide inhibitor domain (I29); InterPro: IPR013201 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This entry represents a peptidase inhibitor domain, which belongs to MEROPS peptidase inhibitor family I29. The domain is also found at the N terminus of a variety of peptidase precursors that belong to MEROPS peptidase subfamily C1A; these include cathepsin L, papain, and procaricain (P10056 from SWISSPROT) []. It forms an alpha-helical domain that runs through the substrate-binding site, preventing access. Removal of this region by proteolytic cleavage results in activation of the enzyme. This domain is also found, in one or more copies, in a variety of cysteine peptidase inhibitors such as salarin [].; PDB: 3QT4_A 3QJ3_A 2C0Y_A 2L95_A 1CJL_A 1CS8_A 7PCK_A 1BY8_A 1PCI_A 2O6X_A ....
Probab=99.68 E-value=8.5e-17 Score=113.36 Aligned_cols=56 Identities=36% Similarity=0.704 Sum_probs=49.4
Q ss_pred HHHHHHhc-cccCChHHHHHHHHHHHHHHHHHHhhC-CCCCCeEEecccCCCCChhhh
Q 018104 39 YERWRSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTN-KMDKPYKLKLNKFADMTNHEF 94 (360)
Q Consensus 39 f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N-~~~~s~~~g~N~fsD~t~~Ef 94 (360)
|++|+++| |.|.+.+|+.+|+++|++|++.|++|| ..+.+|++|+|+|||||.+||
T Consensus 1 F~~~~~~~~k~Y~~~~e~~~R~~~F~~N~~~I~~~N~~~~~~~~~~~N~fsD~t~eEf 58 (58)
T PF08246_consen 1 FEQFKKKYGKSYKSAEEEARRFAIFKENLRRIEEHNANGNNTYKLGLNQFSDMTPEEF 58 (58)
T ss_dssp HHHHHHHCT---SSHHHHHHHHHHHHHHHHHHHHHHHTTSSSEEE-SSTTTTSSHHHH
T ss_pred CHHHHHHcCCCCCCHHHHHHHHHHHHHHHHHHHHHhcCCCCCeEEeCccccCcChhhC
Confidence 89999999 999999999999999999999999999 445899999999999999997
No 21
>smart00848 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a s
Probab=99.53 E-value=1.2e-14 Score=102.00 Aligned_cols=55 Identities=38% Similarity=0.745 Sum_probs=52.1
Q ss_pred HHHHHHhc-cccCChHHHHHHHHHHHHHHHHHHhhCCCC-CCeEEecccCCCCChhh
Q 018104 39 YERWRSHH-TVSRSLDEKHKRFNVFKQNVMHVHQTNKMD-KPYKLKLNKFADMTNHE 93 (360)
Q Consensus 39 f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~-~s~~~g~N~fsD~t~~E 93 (360)
|++|+.+| |.|.+.+|..+|+.+|.+|++.|+.||+.+ .+|++|+|+|||||++|
T Consensus 1 f~~~~~~~~k~y~~~~e~~~r~~~f~~n~~~i~~~N~~~~~~~~~~~N~fsDlt~eE 57 (57)
T smart00848 1 FEQWKKKYGKSYSSEEEELRRFEIFKENLKFIEEHNKKNDHSYTLGLNQFADLTNEE 57 (57)
T ss_pred ChHHHHHhCCCCCCHHHHHHHHHHHHHHHHHHHHHHhcCCCCeEecCcccccCCCCC
Confidence 68999999 999999999999999999999999999876 89999999999999986
No 22
>COG3579 PepC Aminopeptidase C [Amino acid transport and metabolism]
Probab=99.42 E-value=7.6e-13 Score=120.59 Aligned_cols=74 Identities=23% Similarity=0.390 Sum_probs=56.2
Q ss_pred CCCCCCCCCcHHHHHHHHHHHHHHHHhcC-CccccCHHHHHhhc----------------CC-----------CCCCCCC
Q 018104 140 AVKDQGQCGSCWAFSTIAAVEGINHIMTN-KLVSLSEQELVDCD----------------TD-----------QNQGCNG 191 (360)
Q Consensus 140 pV~dQg~cGsCwAfA~~~~le~~~~~~~~-~~~~lS~q~l~dc~----------------~~-----------~~~gc~G 191 (360)
||-||...|.||.||++..+.-.+...-+ +.+.||..++.-.+ .. ...--+|
T Consensus 59 ~vtNQk~SGRCWmFAAlNtfRhk~~~el~le~fElSQaytfFwDKlEKaN~FleqIi~tadq~ldsRlv~~LL~~PqqDG 138 (444)
T COG3579 59 KVTNQKQSGRCWMFAALNTFRHKLISELKLEDFELSQAYTFFWDKLEKANWFLEQIIETADQELDSRLVSFLLATPQQDG 138 (444)
T ss_pred ccccccccceehHHHHHHHHHHHHHHhcCcceeehhhHHHHHHHHHHHhhHHHHHHHhhcccchHHHHHHHHHcCccccC
Confidence 89999999999999999998766554444 56888886653211 00 2344589
Q ss_pred cchhhHHHHHHHcCCCCCCCCCc
Q 018104 192 GLMELAFEFIKKKGGVTTEAKYP 214 (360)
Q Consensus 192 G~~~~a~~~~~~~~Gi~~e~~yP 214 (360)
|-.......+.++ |+++.++||
T Consensus 139 GQwdM~v~l~eKY-GvVpK~~yp 160 (444)
T COG3579 139 GQWDMFVSLFEKY-GVVPKSVYP 160 (444)
T ss_pred chHHHHHHHHHHh-CCCchhhcc
Confidence 9888888888887 999999999
No 23
>KOG4128 consensus Bleomycin hydrolases and aminopeptidases of cysteine protease family [Amino acid transport and metabolism]
Probab=98.61 E-value=2.8e-08 Score=90.86 Aligned_cols=75 Identities=25% Similarity=0.396 Sum_probs=58.9
Q ss_pred CCCCCCCCCCcHHHHHHHHHHHHHHHHhcC-CccccCHHHHHh--------------------hcCC---------CCCC
Q 018104 139 TAVKDQGQCGSCWAFSTIAAVEGINHIMTN-KLVSLSEQELVD--------------------CDTD---------QNQG 188 (360)
Q Consensus 139 tpV~dQg~cGsCwAfA~~~~le~~~~~~~~-~~~~lS~q~l~d--------------------c~~~---------~~~g 188 (360)
+||.||.+.|.||.|+++..+.-.+..+-+ ..+.||..+|+- |..- .+..
T Consensus 63 ~pvtnqkssGrcWift~ln~lrl~~~~kLnl~eFElSqayLFFwdKlErcnyFL~~vvd~a~r~ep~DgRlvq~Ll~nP~ 142 (457)
T KOG4128|consen 63 QPVTNQKSSGRCWIFTGLNLLRLEMDRKLNLPEFELSQAYLFFWDKLERCNYFLWTVVDLAMRCEPLDGRLVQNLLKNPV 142 (457)
T ss_pred cccccCcCCCceEEEechhHHHHHHHhcCCcchhhhhhHHHHHHHHHHHHHHHHHHHHHHHhhcCCcccHHHHHHHhCCC
Confidence 599999999999999999998766665544 568899877741 2111 2344
Q ss_pred CCCcchhhHHHHHHHcCCCCCCCCCc
Q 018104 189 CNGGLMELAFEFIKKKGGVTTEAKYP 214 (360)
Q Consensus 189 c~GG~~~~a~~~~~~~~Gi~~e~~yP 214 (360)
-+||.....++.++++ |+.+..+||
T Consensus 143 ~DGGqw~MfvNlVkKY-GviPKkcy~ 167 (457)
T KOG4128|consen 143 PDGGQWQMFVNLVKKY-GVIPKKCYL 167 (457)
T ss_pred CCCchHHHHHHHHHHh-CCCcHHhcc
Confidence 5899999999999887 999999998
No 24
>PF13529 Peptidase_C39_2: Peptidase_C39 like family; PDB: 3ERV_A.
Probab=97.18 E-value=0.007 Score=49.49 Aligned_cols=57 Identities=25% Similarity=0.427 Sum_probs=34.5
Q ss_pred ChHHHHHHHHHhC-CeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeecCCCceEEEEEcCC
Q 018104 243 NHEDALLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTTLDGTKYWIVRNSW 309 (360)
Q Consensus 243 ~~~~~i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~~g~~ywivkNSW 309 (360)
.+.+.|++.|.+| ||.+.+...-... ....+. ....+|.|+|+||+.+ . +++|..+|
T Consensus 87 ~~~~~i~~~i~~G~Pvi~~~~~~~~~~---~~~~~~---~~~~~H~vvi~Gy~~~---~-~~~v~DP~ 144 (144)
T PF13529_consen 87 ASFDDIKQEIDAGRPVIVSVNSGWRPP---NGDGYD---GTYGGHYVVIIGYDED---G-YVYVNDPW 144 (144)
T ss_dssp S-HHHHHHHHHTT--EEEEEETTSS-----TTEEEE---E-TTEEEEEEEEE-SS---E--EEEE-TT
T ss_pred CcHHHHHHHHHCCCcEEEEEEcccccC---CCCCcC---CCcCCEEEEEEEEeCC---C-EEEEeCCC
Confidence 4568899999887 9999997431111 111111 2347999999999985 2 78888877
No 25
>PF05543 Peptidase_C47: Staphopain peptidase C47; InterPro: IPR008750 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the peptidase family C47 (staphopain family, clan CA). The type examples are the staphopains, which are one of four major families of proteinases secreted by the Gram-positive Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme [, , ].; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 1X9Y_D 1Y4H_B 1PXV_B 1CV8_A.
Probab=96.57 E-value=0.027 Score=47.87 Aligned_cols=129 Identities=18% Similarity=0.286 Sum_probs=73.3
Q ss_pred CCCCCCCcHHHHHHHHHHHHHHHHh--------cCCccccCHHHHHhhcCCCCCCCCCcchhhHHHHHHHcCCCCCCCCC
Q 018104 142 KDQGQCGSCWAFSTIAAVEGINHIM--------TNKLVSLSEQELVDCDTDQNQGCNGGLMELAFEFIKKKGGVTTEAKY 213 (360)
Q Consensus 142 ~dQg~cGsCwAfA~~~~le~~~~~~--------~~~~~~lS~q~l~dc~~~~~~gc~GG~~~~a~~~~~~~~Gi~~e~~y 213 (360)
..||.-+-|-+||.+++|-+..... ..-...+|+++|.++.. .+...++|.+.. |....
T Consensus 17 EtQg~~pWCa~Ya~aailN~~~~~~~~~A~~iMr~~yPn~s~~~l~~~~~---------~~~~~i~y~ks~-g~~~~--- 83 (175)
T PF05543_consen 17 ETQGYNPWCAGYAMAAILNATTNTKIYNAKDIMRYLYPNVSEEQLKFTSL---------TPNQMIKYAKSQ-GRNPQ--- 83 (175)
T ss_dssp ---SSSS-HHHHHHHHHHHHHCT-S---HHHHHHHHSTTS-CCCHHH--B----------HHHHHHHHHHT-TEEEE---
T ss_pred eccCcCcHHHHHHHHHHHHhhhCcCcCCHHHHHHHHCCCCCHHHHhhcCC---------CHHHHHHHHHHc-Ccchh---
Confidence 4689999999999999987642111 11234677777766643 345777887665 53210
Q ss_pred cccCCCCCcCCCCCCCCcEEecceEEcCCChHHHHHHHHHh-CCeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEE
Q 018104 214 PYQANDGTCDVSKESSPAVSIDGHENVPANHEDALLKAVAK-QPVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVG 292 (360)
Q Consensus 214 PY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~~l~~-gPV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVG 292 (360)
| .. .. -+.+++++.+.+ .|+.+..+.... ..+...+|||+|||
T Consensus 84 -~------------------~n---~~--~s~~eV~~~~~~nk~i~i~~~~v~~------------~~~~~~gHAlavvG 127 (175)
T PF05543_consen 84 -Y------------------NN---RM--PSFDEVKKLIDNNKGIAILADRVEQ------------TNGPHAGHALAVVG 127 (175)
T ss_dssp -E------------------EC---S-----HHHHHHHHHTT-EEEEEEEETTS------------CTTB--EEEEEEEE
T ss_pred -H------------------hc---CC--CCHHHHHHHHHcCCCeEEEeccccc------------CCCCccceeEEEEe
Confidence 0 00 01 146778888865 588876654311 12345799999999
Q ss_pred eeecCCCceEEEEEcCCCCCCCCCcEEEEEec
Q 018104 293 YGTTLDGTKYWIVRNSWGPEWGEKGYIRMQRG 324 (360)
Q Consensus 293 yg~~~~g~~ywivkNSWG~~WG~~Gy~~i~~~ 324 (360)
|-.-.+|.++.++=|=| +++++-++-+
T Consensus 128 ya~~~~g~~~y~~WNPW-----~~~~~~~sa~ 154 (175)
T PF05543_consen 128 YAKPNNGQKTYYFWNPW-----WNDVMIQSAK 154 (175)
T ss_dssp EEEETTSEEEEEEE-TT------SS-EEEETT
T ss_pred eeecCCCCeEEEEeCCc-----cCCcEEEecC
Confidence 98765778999998888 3556665544
No 26
>PF08127 Propeptide_C1: Peptidase family C1 propeptide; InterPro: IPR012599 This domain is found at the N-terminus of cathepsin B and cathepsin B-like peptidases that belong to MEROPS peptidase subfamily C1A. Cathepsin B enzymes are lysosomal cysteine proteinases belonging to the papain superfamily and are unique in their ability to act as both endo- and exopeptidases. They are synthesized as inactive zymogens. Activation of the peptidases occurs with the removal of the propeptide [, ]. ; GO: 0004197 cysteine-type endopeptidase activity, 0050790 regulation of catalytic activity; PDB: 1MIR_A 1PBH_A 2PBH_A 3PBH_A.
Probab=96.28 E-value=0.0031 Score=40.56 Aligned_cols=35 Identities=14% Similarity=0.148 Sum_probs=22.4
Q ss_pred HHHHHhhCCCCCCeEEecccCCCCChhhhhhcccccc
Q 018104 66 VMHVHQTNKMDKPYKLKLNKFADMTNHEFASTYAGSK 102 (360)
Q Consensus 66 ~~~I~~~N~~~~s~~~g~N~fsD~t~~Ef~~~~~~~~ 102 (360)
-++|+.+|+.+.+|++|.| |.+.|.++++.++ |..
T Consensus 3 de~I~~IN~~~~tWkAG~N-F~~~~~~~ik~Ll-Gv~ 37 (41)
T PF08127_consen 3 DEFIDYINSKNTTWKAGRN-FENTSIEYIKRLL-GVL 37 (41)
T ss_dssp HHHHHHHHHCT-SEEE-----SSB-HHHHHHCS--B-
T ss_pred HHHHHHHHcCCCcccCCCC-CCCCCHHHHHHHc-CCC
Confidence 3678999998899999999 8999999887764 443
No 27
>PF14399 Transpep_BrtH: NlpC/p60-like transpeptidase
Probab=91.25 E-value=0.53 Score=44.35 Aligned_cols=66 Identities=17% Similarity=0.221 Sum_probs=41.6
Q ss_pred hHHHHHHHHHhC-CeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEEEE
Q 018104 244 HEDALLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIRMQ 322 (360)
Q Consensus 244 ~~~~i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~i~ 322 (360)
-.+.|++.|.+| ||.+.++.+ +..|...-| .....+|.|+|+||+++ ++.+.++-. ....+.+++
T Consensus 77 ~~~~l~~~l~~g~pv~~~~D~~---~lpy~~~~~---~~~~~~H~i~v~G~d~~--~~~~~v~D~------~~~~~~~~~ 142 (317)
T PF14399_consen 77 AWEELKEALDAGRPVIVWVDMY---YLPYRPNYY---KKHHADHYIVVYGYDEE--EDVFYVSDP------PSYEPGRLP 142 (317)
T ss_pred HHHHHHHHHhCCCceEEEeccc---cCCCCcccc---ccccCCcEEEEEEEeCC--CCEEEEEcC------CCCcceeec
Confidence 356788888787 999998876 333433322 12346899999999975 234555533 233445555
Q ss_pred e
Q 018104 323 R 323 (360)
Q Consensus 323 ~ 323 (360)
+
T Consensus 143 ~ 143 (317)
T PF14399_consen 143 Y 143 (317)
T ss_pred H
Confidence 4
No 28
>COG4990 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=89.87 E-value=0.79 Score=39.23 Aligned_cols=51 Identities=16% Similarity=0.210 Sum_probs=37.8
Q ss_pred EcCCChHHHHHHHHHhC-CeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeecCCCceEEEEEcCCC
Q 018104 239 NVPANHEDALLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTTLDGTKYWIVRNSWG 310 (360)
Q Consensus 239 ~v~~~~~~~i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~~g~~ywivkNSWG 310 (360)
.+...+..+|+..|.+| ||.+-.... .. ..-|+|+|+|||+. ++..-++||
T Consensus 117 d~tGksl~~ik~ql~kg~PV~iw~T~~----~~------------~s~H~v~itgyDk~-----n~yynDpyG 168 (195)
T COG4990 117 DLTGKSLSDIKGQLLKGRPVVIWVTNF----HS------------YSIHSVLITGYDKY-----NIYYNDPYG 168 (195)
T ss_pred cCcCCcHHHHHHHHhcCCcEEEEEecc----cc------------cceeeeEeeccccc-----ceEeccccc
Confidence 34556889999999776 998766443 21 35799999999975 677777775
No 29
>PF13956 Ibs_toxin: Toxin Ibs, type I toxin-antitoxin system
Probab=83.71 E-value=0.44 Score=24.49 Aligned_cols=15 Identities=20% Similarity=0.434 Sum_probs=10.1
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
|+++.++++++|.++
T Consensus 1 MMk~vIIlvvLLliS 15 (19)
T PF13956_consen 1 MMKLVIILVVLLLIS 15 (19)
T ss_pred CceehHHHHHHHhcc
Confidence 677777766666664
No 30
>cd02549 Peptidase_C39A A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are
Probab=79.80 E-value=5.5 Score=32.24 Aligned_cols=44 Identities=27% Similarity=0.410 Sum_probs=29.6
Q ss_pred HHHHHHhC-CeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeecCCCceEEEEEcCC
Q 018104 248 LLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTTLDGTKYWIVRNSW 309 (360)
Q Consensus 248 i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~~g~~ywivkNSW 309 (360)
+++.+..+ ||.+.++.. ......+|.|+|+||+. .+..+|.+.|
T Consensus 70 ~~~~l~~~~Pvi~~~~~~--------------~~~~~~gH~vVv~g~~~----~~~~~i~DP~ 114 (141)
T cd02549 70 LLRQLAAGHPVIVSVNLG--------------VSITPSGHAMVVIGYDR----KGNVYVNDPG 114 (141)
T ss_pred HHHHHHCCCeEEEEEecC--------------cccCCCCeEEEEEEEcC----CCCEEEECCC
Confidence 66777666 999887641 01224689999999982 1246677776
No 31
>PF09778 Guanylate_cyc_2: Guanylylate cyclase; InterPro: IPR018616 Members of this family of proteins catalyse the conversion of guanosine triphosphate (GTP) to 3',5'-cyclic guanosine monophosphate (cGMP) and pyrophosphate.
Probab=76.64 E-value=10 Score=33.65 Aligned_cols=61 Identities=20% Similarity=0.297 Sum_probs=35.3
Q ss_pred hHHHHHHHHHhC-CeEEEEecCCcccccccCceEeC---C----CCCCCCeEEEEEEeeecCCCceEEEEEc
Q 018104 244 HEDALLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTG---E----CGTELNHGVAAVGYGTTLDGTKYWIVRN 307 (360)
Q Consensus 244 ~~~~i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~---~----~~~~~~Hav~iVGyg~~~~g~~ywivkN 307 (360)
..++|...|..| |+.+-++...-.=..-+...... . .....+|=|+|+||+.. .+-++++|
T Consensus 112 s~~ei~~hl~~g~~aIvLVd~~~L~C~~Ck~~~~~~~~~~~~~~~~~Y~GHYVVlcGyd~~---~~~~~yrd 180 (212)
T PF09778_consen 112 SIQEIIEHLSSGGPAIVLVDASLLHCDLCKSNCFDPIGSKCFGRSPDYQGHYVVLCGYDAA---TKEFEYRD 180 (212)
T ss_pred cHHHHHHHHhCCCcEEEEEccccccChhhcccccccccccccCCCCCccEEEEEEEeecCC---CCeEEEeC
Confidence 567888889765 77777766411100012222211 1 12356999999999975 34466655
No 32
>cd00044 CysPc Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction.
Probab=68.39 E-value=20 Score=33.79 Aligned_cols=28 Identities=21% Similarity=0.456 Sum_probs=23.7
Q ss_pred CCCeEEEEEEeeecCC--CceEEEEEcCCCC
Q 018104 283 ELNHGVAAVGYGTTLD--GTKYWIVRNSWGP 311 (360)
Q Consensus 283 ~~~Hav~iVGyg~~~~--g~~ywivkNSWG~ 311 (360)
..+||-.|++...- + +.....+||-||.
T Consensus 234 ~~~HaY~Vl~~~~~-~~~~~~lv~lrNPWg~ 263 (315)
T cd00044 234 VKGHAYSVLDVREV-QEEGLRLLRLRNPWGV 263 (315)
T ss_pred ccCcceEEeEEEEE-ccCceEEEEecCCccC
Confidence 35899999999875 4 7889999999994
No 33
>PF12385 Peptidase_C70: Papain-like cysteine protease AvrRpt2; InterPro: IPR022118 This is a family of cysteine proteases, found in actinobacteria, proteobacteria and firmicutes. Papain-like cysteine proteases play a crucial role in plant-pathogen/pest interactions. On entering the host they act on non-self substrates, thereby manipulating the host to evade proteolysis []. AvrRpt2 from Pseudomonas syringae pv tomato DC3000 triggers resistance to P. syringae-2-dependent defence responses, including hypersensitive cell death, by cleaving the Arabidopsis RIN4 protein which is monitored by the cognate resistance protein RPS2 [].
Probab=59.25 E-value=1.1e+02 Score=25.89 Aligned_cols=38 Identities=18% Similarity=0.234 Sum_probs=27.3
Q ss_pred hHHHHHHHH-HhCCeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeec
Q 018104 244 HEDALLKAV-AKQPVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTT 296 (360)
Q Consensus 244 ~~~~i~~~l-~~gPV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~ 296 (360)
..+.+...| .+||+.+++.... .....|+++|.|-+.+
T Consensus 97 t~e~~~~LL~~yGPLwv~~~~P~---------------~~~~~H~~ViTGI~~d 135 (166)
T PF12385_consen 97 TAEGLANLLREYGPLWVAWEAPG---------------DSWVAHASVITGIDGD 135 (166)
T ss_pred CHHHHHHHHHHcCCeEEEecCCC---------------CcceeeEEEEEeecCC
Confidence 456778888 5699999966541 1234799999999765
No 34
>PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognises a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached [,]. This lipid attachment site is found in homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection [].
Probab=58.48 E-value=7.7 Score=21.98 Aligned_cols=14 Identities=29% Similarity=0.420 Sum_probs=9.4
Q ss_pred ChhHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLAL 14 (360)
Q Consensus 1 Mk~~l~~~~~~l~l 14 (360)
||++++.++.++.|
T Consensus 7 mKkil~~l~a~~~L 20 (25)
T PF08139_consen 7 MKKILFPLLALFML 20 (25)
T ss_pred HHHHHHHHHHHHHH
Confidence 37777777666655
No 35
>PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined []. The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells []. A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB []. Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveal extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology at the C terminus, and in the C-terminal end of the central region are thought to correspond to the catalytic and receptor-recognition domains, respectively []. Sequence similarities between colicins E2, A and E1 [] are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 [] immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides []. Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase []. The mature ColE2 lysis protein is located in the cell envelope [].; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane
Probab=54.77 E-value=4.4 Score=26.13 Aligned_cols=18 Identities=33% Similarity=0.759 Sum_probs=12.5
Q ss_pred ChhHHHHHHHHHHHHHHh
Q 018104 1 MKRVYLLAAFLLALVLGI 18 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~ 18 (360)
||++++++++++.++.++
T Consensus 1 MkKi~~~~i~~~~~~L~a 18 (46)
T PF02402_consen 1 MKKIIFIGIFLLTMLLAA 18 (46)
T ss_pred CcEEEEeHHHHHHHHHHH
Confidence 888888877777754333
No 36
>PF10731 Anophelin: Thrombin inhibitor from mosquito; InterPro: IPR018932 Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing.
Probab=54.51 E-value=14 Score=25.48 Aligned_cols=23 Identities=22% Similarity=0.323 Sum_probs=17.4
Q ss_pred ChhHHHHHHHHHHHHHHhhcCcc
Q 018104 1 MKRVYLLAAFLLALVLGIVEGFD 23 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~~~~~~ 23 (360)
|.+.|++|.++++.+.+.++++.
T Consensus 1 MA~Kl~vialLC~aLva~vQ~AP 23 (65)
T PF10731_consen 1 MASKLIVIALLCVALVAIVQSAP 23 (65)
T ss_pred CcchhhHHHHHHHHHHHHHhcCc
Confidence 77788888888887766666665
No 37
>PF06291 Lambda_Bor: Bor protein; InterPro: IPR010438 This family consists of several Bacteriophage lambda Bor and Escherichia coli Iss proteins. Expression of bor significantly increases the survival of the E. coli host cell in animal serum. This property is a well-known bacterial virulence determinant; indeed, bor and its adjacent sequences are highly homologous to the iss serum resistance locus of the plasmid ColV2-K94, which confers virulence in animals. It has been suggested that lysogeny may generally have a role in bacterial survival in animal hosts, and perhaps in pathogenesis [].
Probab=52.92 E-value=7.8 Score=29.92 Aligned_cols=22 Identities=36% Similarity=0.419 Sum_probs=15.7
Q ss_pred ChhHHHHHHHHHHHHHHhhcCc
Q 018104 1 MKRVYLLAAFLLALVLGIVEGF 22 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~~~~~ 22 (360)
||++|++.+++|+|...+++..
T Consensus 1 mKk~ll~~~lallLtgCatqt~ 22 (97)
T PF06291_consen 1 MKKLLLAAALALLLTGCATQTF 22 (97)
T ss_pred CcHHHHHHHHHHHHcccceeEE
Confidence 9999999988887754433333
No 38
>PF09403 FadA: Adhesion protein FadA; InterPro: IPR018543 FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices. ; PDB: 3ETZ_B 3ETY_A 2GL2_B 3ETX_C 3ETW_A.
Probab=52.70 E-value=5.8 Score=32.22 Aligned_cols=15 Identities=40% Similarity=0.738 Sum_probs=0.0
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||+++|+.+++++.+
T Consensus 1 MKK~ll~~~lllss~ 15 (126)
T PF09403_consen 1 MKKILLLGMLLLSSI 15 (126)
T ss_dssp ---------------
T ss_pred ChHHHHHHHHHHHHH
Confidence 898776655444444
No 39
>PF01640 Peptidase_C10: Peptidase C10 family classification.; InterPro: IPR000200 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to MEROPS peptidase family C10 (streptopain family, clan CA). Streptopain is a cysteine protease found in Streptococcus pyogenes that shows some structural and functional similarity to papain (family C1) [, ]. The order of the catalytic cysteine/histidine dyad is the same and the surrounding sequences are similar. The two proteins also show similar specificities, both preferring a hydrophobic residue at the P2 site [, ]. Streptopain shows a high degree of sequence similarity to the S. pyogenes exotoxin B, and strong similarity to the prtT gene product of Porphyromonas gingivalis (Bacteroides gingivalis), both of which have been included in the family [].; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 4D8I_A 4D8E_A 4D8B_A 3BBA_B 3BB7_A 2JTC_A 1PVJ_A 1DKI_D 2UZJ_A.
Probab=52.24 E-value=71 Score=27.76 Aligned_cols=51 Identities=25% Similarity=0.470 Sum_probs=30.9
Q ss_pred HHHHHHHHhC-CeEEEEecCCcccccccCceEeCCCCCCCCeEEEEEEeeecCCCceEEEEEcCCCCCCCCCcEEE
Q 018104 246 DALLKAVAKQ-PVSVAIDAGSSDFQFYSEGVFTGECGTELNHGVAAVGYGTTLDGTKYWIVRNSWGPEWGEKGYIR 320 (360)
Q Consensus 246 ~~i~~~l~~g-PV~v~~~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~~g~~ywivkNSWG~~WG~~Gy~~ 320 (360)
+.|+..|.++ ||.+.-... . .+||.+|=||+.+ .||-+==.||-. .+||++
T Consensus 141 ~~i~~el~~~rPV~~~g~~~-~-----------------~GHawViDGy~~~----~~~H~NwGW~G~--~nGyy~ 192 (192)
T PF01640_consen 141 DMIRNELDNGRPVLYSGNSK-S-----------------GGHAWVIDGYDSD----GYFHCNWGWGGS--SNGYYR 192 (192)
T ss_dssp HHHHHHHHTT--EEEEEEET-T-----------------EEEEEEEEEEESS----SEEEEE-SSTTT--T-EEEE
T ss_pred HHHHHHHHcCCCEEEEEecC-C-----------------CCeEEEEcCccCC----CeEEEeeCccCC--CCCccC
Confidence 5677778665 988654332 0 1999999999653 577654334322 678875
No 40
>PRK10081 entericidin B membrane lipoprotein; Provisional
Probab=51.97 E-value=16 Score=24.15 Aligned_cols=15 Identities=20% Similarity=0.166 Sum_probs=8.7
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||+++.+++++++++
T Consensus 2 mKk~i~~i~~~l~~~ 16 (48)
T PRK10081 2 VKKTIAAIFSVLVLS 16 (48)
T ss_pred hHHHHHHHHHHHHHH
Confidence 667666655555543
No 41
>COG5510 Predicted small secreted protein [Function unknown]
Probab=48.83 E-value=21 Score=23.09 Aligned_cols=12 Identities=50% Similarity=0.747 Sum_probs=5.6
Q ss_pred ChhHHHHHHHHH
Q 018104 1 MKRVYLLAAFLL 12 (360)
Q Consensus 1 Mk~~l~~~~~~l 12 (360)
||+.++++++++
T Consensus 2 mk~t~l~i~~vl 13 (44)
T COG5510 2 MKKTILLIALVL 13 (44)
T ss_pred chHHHHHHHHHH
Confidence 666444443333
No 42
>PRK09810 entericidin A; Provisional
Probab=43.37 E-value=25 Score=22.52 Aligned_cols=13 Identities=31% Similarity=0.444 Sum_probs=6.4
Q ss_pred ChhHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLA 13 (360)
Q Consensus 1 Mk~~l~~~~~~l~ 13 (360)
||+++++++++++
T Consensus 2 Mkk~~~l~~~~~~ 14 (41)
T PRK09810 2 MKRLIVLVLLAST 14 (41)
T ss_pred hHHHHHHHHHHHH
Confidence 5665555543333
No 43
>PF11948 DUF3465: Protein of unknown function (DUF3465); InterPro: IPR021856 This family of proteins is functionally uncharacterised and is found in bacteria. Proteins in this family are typically between 131 and 151 amino acids in length and have a conserved HWTH sequence motif.
Probab=37.75 E-value=39 Score=27.50 Aligned_cols=15 Identities=33% Similarity=0.326 Sum_probs=10.9
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||+++..++++|+.+
T Consensus 1 m~~~~~~~~~~~~~~ 15 (131)
T PF11948_consen 1 MKRFLALFLSVLSAF 15 (131)
T ss_pred CcchHHHHHHHHHHh
Confidence 898888776666664
No 44
>PF11106 YjbE: Exopolysaccharide production protein YjbE
Probab=36.40 E-value=32 Score=25.09 Aligned_cols=15 Identities=33% Similarity=0.350 Sum_probs=11.3
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++++.++.++.|.
T Consensus 1 MKK~~~~~~~i~~l~ 15 (80)
T PF11106_consen 1 MKKIIYGLFAILALA 15 (80)
T ss_pred ChhHHHHHHHHHHHH
Confidence 999998776666664
No 45
>PF10880 DUF2673: Protein of unknown function (DUF2673); InterPro: IPR024247 This family of proteins with unknown function appears to be restricted to Rickettsiae spp.
Probab=36.02 E-value=76 Score=21.66 Aligned_cols=14 Identities=21% Similarity=0.197 Sum_probs=9.7
Q ss_pred ChhHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLAL 14 (360)
Q Consensus 1 Mk~~l~~~~~~l~l 14 (360)
||++|-+|+++.+.
T Consensus 1 mknllkillilafa 14 (65)
T PF10880_consen 1 MKNLLKILLILAFA 14 (65)
T ss_pred ChhHHHHHHHHHHh
Confidence 88888776655554
No 46
>PF11777 DUF3316: Protein of unknown function (DUF3316); InterPro: IPR016879 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=35.31 E-value=62 Score=25.61 Aligned_cols=15 Identities=47% Similarity=0.662 Sum_probs=10.2
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||+++++++++|+.+
T Consensus 1 MKk~~ll~~~ll~s~ 15 (114)
T PF11777_consen 1 MKKIILLASLLLLSS 15 (114)
T ss_pred CchHHHHHHHHHHHH
Confidence 888888875554444
No 47
>PF03032 Brevenin: Brevenin/esculentin/gaegurin/rugosin family; InterPro: IPR004275 In addition to the highly specific cell-mediated immune system, vertebrates possess an efficient host-defence mechanism against invading microorganisms which involves the synthesis of highly potent antimicrobial peptides with a large spectrum of activity. This entry represents a number of these defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins.; GO: 0006952 defense response, 0042742 defense response to bacterium, 0005576 extracellular region
Probab=34.63 E-value=27 Score=22.99 Aligned_cols=17 Identities=35% Similarity=0.503 Sum_probs=10.4
Q ss_pred ChhHHHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALVLG 17 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~ 17 (360)
||+.|++++++-.++.+
T Consensus 3 lKKsllLlfflG~ISlS 19 (46)
T PF03032_consen 3 LKKSLLLLFFLGTISLS 19 (46)
T ss_pred chHHHHHHHHHHHcccc
Confidence 77777766655555433
No 48
>PF14060 DUF4252: Domain of unknown function (DUF4252)
Probab=33.20 E-value=47 Score=27.50 Aligned_cols=34 Identities=18% Similarity=0.436 Sum_probs=17.4
Q ss_pred hhHHHHHHHHHHHHHHhhcCccCCccccCChhHHHHHHHHHHHh
Q 018104 2 KRVYLLAAFLLALVLGIVEGFDFHEKELESEEGLWDLYERWRSH 45 (360)
Q Consensus 2 k~~l~~~~~~l~l~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~ 45 (360)
|++|++++++++.+.+.++ ....+...|+++...
T Consensus 1 Kk~i~~l~l~~~~~~~~aq----------~~~~~~~~~~~~~~~ 34 (155)
T PF14060_consen 1 KKIILILLLLLACLASCAQ----------QGQSLQKYFDKYSEN 34 (155)
T ss_pred ChhHHHHHHHHHHHHHhcc----------cchhHHHHHHHhCCC
Confidence 5666666655555433221 124555667665443
No 49
>COG2143 Thioredoxin-related protein [Posttranslational modification, protein turnover, chaperones]
Probab=33.07 E-value=63 Score=27.34 Aligned_cols=18 Identities=28% Similarity=0.563 Sum_probs=12.6
Q ss_pred ChhHHHHHHHHHHHHHHh
Q 018104 1 MKRVYLLAAFLLALVLGI 18 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~ 18 (360)
|||+|+++++++++..++
T Consensus 1 ~mRvl~i~Lliis~fl~a 18 (182)
T COG2143 1 VMRVLLIVLLIISLFLSA 18 (182)
T ss_pred CcchHHHHHHHHHHHHHH
Confidence 678887777777775443
No 50
>PF05968 Bacillus_PapR: Bacillus PapR protein; InterPro: IPR009239 This family consists of the Bacillus species-specific PapR protein. The papR gene belongs to the PlcR regulon and is located 70 bp downstream from plcR. It encodes a 48-amino-acid peptide. Disruption of the papR gene abolishes expression of the PlcR regulon, resulting in a large decrease in haemolysis and virulence in insect larvae. A processed form of PapR activates the PlcR regulon by allowing PlcR to bind to its DNA target. This activating mechanism is strain specific [].
Probab=32.27 E-value=39 Score=21.97 Aligned_cols=15 Identities=27% Similarity=0.452 Sum_probs=11.7
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++|++-++.+...
T Consensus 1 mkkll~~slltlam~ 15 (48)
T PF05968_consen 1 MKKLLIGSLLTLAMA 15 (48)
T ss_pred CchHHHhHHHHHHHH
Confidence 899998877776664
No 51
>COG5294 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=31.70 E-value=41 Score=26.49 Aligned_cols=15 Identities=27% Similarity=0.448 Sum_probs=10.9
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++|+.|+.+++++
T Consensus 1 MKkil~~ilall~~i 15 (113)
T COG5294 1 MKKILIGILALLLII 15 (113)
T ss_pred CcchHHHHHHHHHHH
Confidence 899998666666554
No 52
>PF00648 Peptidase_C2: Calpain family cysteine protease This is family C2 in the peptidase classification. ; InterPro: IPR001300 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium []. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals [, ]: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only []. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28kDa subunit and an 80kDa subunit that shares 55-65% sequence homology between the two proteases [, ]. The crystallographic structure of m-calpain reveals six "domains" in the 80kDa subunit: A 19-amino acid NH2-terminal sequence; Active site domain IIa; Active site domain IIb. Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related []. Domain III; An 18-amino acid extended sequence linking domain III to domain IV; Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity []. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft allowing for the proper formation of the catalytic triad []. 
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms' cells is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin (IPR001259 from INTERPRO). The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma []. Calpains are a family of cytosolic cysteine proteinases (see PDOC00126 from PROSITEDOC). Members of the calpain family are believed to function in various biological processes, including integrin-mediated cell migration, cytoskeletal remodeling, cell differentiation and apoptosis [, ]. The calpain family includes numerous members from C. elegans to mammals and with homologues in yeast and bacteria. The best characterised members are the m- and mu-calpains; both proteins are heterodimers composed of a large catalytic subunit and a small regulatory subunit. The large subunit comprises four domains (dI-dIV) while the small subunit has two domains (dV-dVI). Domain dI is a short region cleaved by autolysis, dII is the catalytic core, dIII is a C2-like domain, dIV consists of five calcium binding EF-hand motifs []. The crystal structure of calpain has been solved [, ]. 
The catalytic region consists of two distinct structural domains (dIIa and dIIb). dIIa contains a central helix flanked on three faces by a cluster of alpha-helices and is entirely unrelated to the corresponding domain in the typical thiol proteinases. The fold of dIIb is similar to the corresponding domain in other cysteine proteinases and contains two three-stranded anti-parallel beta-sheets. The catalytic triad residues (C,H,N) are located in dIIa and dIIb. The activation of the domain is dependent on the binding of two calcium atoms in two non EF-hand calcium binding sites located in the catalytic core, one close to the Cys active site in dIIa and one at the end of dIIb. Calcium binding induces conformational changes in the catalytic domain that align the active site [][]. The profile covers the whole catalytic domain.; GO: 0004198 calcium-dependent cysteine-type endopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 2NQA_A 1KFU_L 1KFX_L 1QXP_B 2R9C_A 1TL9_A 2G8E_A 1KXR_B 2G8J_A 2NQG_A ....
Probab=31.54 E-value=75 Score=29.54 Aligned_cols=37 Identities=24% Similarity=0.501 Sum_probs=0.0
Q ss_pred CeEEEEEEeeecCCC----ceEEEEEcCCCC---------------------------CCCCCcEEEEE
Q 018104 285 NHGVAAVGYGTTLDG----TKYWIVRNSWGP---------------------------EWGEKGYIRMQ 322 (360)
Q Consensus 285 ~Hav~iVGyg~~~~g----~~ywivkNSWG~---------------------------~WG~~Gy~~i~ 322 (360)
+||-.|+++... ++ ...-.+||-||. .-.++|.|||+
T Consensus 214 ~HaY~Vl~~~~~-~~~~~~~~lv~LrNPwg~~~w~G~ws~~s~~W~~~~~~~~~~~~~~~~~dg~FWM~ 281 (298)
T PF00648_consen 214 GHAYAVLDVREV-NGNGEGHRLVKLRNPWGSTEWKGDWSDDSPEWTEIHPSLRKRLNQSSSDDGTFWMS 281 (298)
T ss_dssp TS-EEEEEEEEE-EETTEEEEEEEEE-TTSS---SSTTSTTSGGGGGS-HHHHHHHTTTSSSSSEEEEE
T ss_pred ceeEEEEEEEee-ccccceeEEEEEcCCCccccccccccccccccccCCHHHHhhcccccccCccHhHh
No 53
>PF11873 DUF3393: Domain of unknown function (DUF3393); InterPro: IPR024570 Membrane-bound lytic murein transglycosylase C (also known as murein hydrolase C), is a murein-degrading enzyme that may play a role in the recycling of muropeptides during cell elongation and/or cell division. This entry represents the N-terminal domain, whose function is currently not known.
Probab=28.79 E-value=75 Score=28.09 Aligned_cols=20 Identities=15% Similarity=0.219 Sum_probs=11.4
Q ss_pred HHHHHHHHHHHHHHHHHhhC
Q 018104 54 EKHKRFNVFKQNVMHVHQTN 73 (360)
Q Consensus 54 E~~~R~~if~~n~~~I~~~N 73 (360)
....-+..|..|+..+=--|
T Consensus 50 ~~~~l~~~~~~~i~~~WG~~ 69 (204)
T PF11873_consen 50 GLDILMGQFSKNIEKIWGKN 69 (204)
T ss_pred HHHHHHHHHHHHHHHHhCCC
Confidence 44455666777776654333
No 54
>PLN02923 xylose isomerase
Probab=28.10 E-value=38 Score=33.27 Aligned_cols=48 Identities=15% Similarity=-0.046 Sum_probs=24.3
Q ss_pred ChhHHHHHHHHHHHHHHhhcCccCCccccCChhHHHHHHHHHHHhc-cc
Q 018104 1 MKRVYLLAAFLLALVLGIVEGFDFHEKELESEEGLWDLYERWRSHH-TV 48 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~-k~ 48 (360)
||..++++++++.|++.+.........-..+-..-.+.++.||..| +.
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~yF~~ 49 (478)
T PLN02923 1 MKGGSILLLLLCALLCLSGVIAAQPPTCPADLGSKCSDSDEWEGEFFPG 49 (478)
T ss_pred CCcchhhHHHHHHHHHHHHHHhcCCCCCchhhcccccccHHHHHHhcCC
Confidence 7777777777777664432221111111111122234588888777 64
No 55
>PF15588 Imm7: Immunity protein 7
Probab=28.07 E-value=1.9e+02 Score=22.94 Aligned_cols=33 Identities=27% Similarity=0.528 Sum_probs=24.8
Q ss_pred EEEEEEeeecC-CCceEEEEEcCC-----CCCCCCCcEE
Q 018104 287 GVAAVGYGTTL-DGTKYWIVRNSW-----GPEWGEKGYI 319 (360)
Q Consensus 287 av~iVGyg~~~-~g~~ywivkNSW-----G~~WG~~Gy~ 319 (360)
-|++||+++++ +-+.|-|++.+- ...=|.+||.
T Consensus 17 ~v~~vG~ADd~~~~~~yiilQR~~~~de~D~~~~~d~~~ 55 (115)
T PF15588_consen 17 NVLMVGFADDEDGPKEYIILQRSLEFDEQDEDLGSDGYY 55 (115)
T ss_pred cEEEEEEecCCCCCceEEEEEccCCCCCcccccCcCcEE
Confidence 48999999875 446899999864 4455668886
No 56
>PRK10053 hypothetical protein; Provisional
Probab=27.95 E-value=52 Score=26.87 Aligned_cols=14 Identities=29% Similarity=0.252 Sum_probs=10.1
Q ss_pred ChhHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLAL 14 (360)
Q Consensus 1 Mk~~l~~~~~~l~l 14 (360)
||++++.+++++++
T Consensus 1 MKK~~~~~~~~~~s 14 (130)
T PRK10053 1 MKLQAIALASFLVM 14 (130)
T ss_pred CcHHHHHHHHHHHH
Confidence 89987777665554
No 57
>PRK13883 conjugal transfer protein TrbH; Provisional
Probab=27.31 E-value=1.3e+02 Score=25.19 Aligned_cols=15 Identities=47% Similarity=0.640 Sum_probs=12.0
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
|++++++++++++|.
T Consensus 1 Mrk~l~~~~l~l~La 15 (151)
T PRK13883 1 MRKIVLLALLALALG 15 (151)
T ss_pred ChhHHHHHHHHHHHh
Confidence 888888888777774
No 58
>smart00230 CysPc Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).
Probab=26.62 E-value=1.2e+02 Score=28.81 Aligned_cols=28 Identities=21% Similarity=0.467 Sum_probs=22.1
Q ss_pred CCCeEEEEEEeeecCCCce--EEEEEcCCCC
Q 018104 283 ELNHGVAAVGYGTTLDGTK--YWIVRNSWGP 311 (360)
Q Consensus 283 ~~~Hav~iVGyg~~~~g~~--ywivkNSWG~ 311 (360)
..+||=.|++...- ++.+ -..+||-||.
T Consensus 226 v~~HaYsVl~v~~~-~~~~~~Ll~lrNPWg~ 255 (318)
T smart00230 226 VKGHAYSVTDVREV-QGRRQELLRLRNPWGQ 255 (318)
T ss_pred ccCccEEEEEEEEE-ecCCeEEEEEECCCCC
Confidence 35899999999764 4444 8999999983
No 59
>PF07437 YfaZ: YfaZ precursor; InterPro: IPR009998 This family contains the precursor of the bacterial protein YfaZ (approximately 180 residues long). Many members of this family are hypothetical proteins.
Probab=26.43 E-value=53 Score=28.41 Aligned_cols=17 Identities=41% Similarity=0.426 Sum_probs=11.7
Q ss_pred ChhHHHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALVLG 17 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~ 17 (360)
|||++++.+++|++++.
T Consensus 1 m~k~~~a~~~~l~~~s~ 17 (180)
T PF07437_consen 1 MKKFLLASAAALLLVSA 17 (180)
T ss_pred CchHHHHHHHHHHHHhh
Confidence 88888877666666533
No 60
>PRK10780 periplasmic chaperone; Provisional
Probab=25.75 E-value=91 Score=26.36 Aligned_cols=15 Identities=47% Similarity=0.476 Sum_probs=11.2
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++++++++.++++
T Consensus 1 Mkk~~~~~~l~l~~~ 15 (165)
T PRK10780 1 MKKWLLAAGLGLALA 15 (165)
T ss_pred ChHHHHHHHHHHHHH
Confidence 899998777665554
No 61
>PF15284 PAGK: Phage-encoded virulence factor
Probab=25.60 E-value=37 Score=23.65 Aligned_cols=12 Identities=25% Similarity=0.650 Sum_probs=5.2
Q ss_pred hHHHHHHHHHHH
Q 018104 3 RVYLLAAFLLAL 14 (360)
Q Consensus 3 ~~l~~~~~~l~l 14 (360)
+++|+++++|..
T Consensus 6 sifL~l~~~LsA 17 (61)
T PF15284_consen 6 SIFLALVFILSA 17 (61)
T ss_pred HHHHHHHHHHHH
Confidence 444444444443
No 62
>TIGR00156 conserved hypothetical protein TIGR00156. As of the last revision, this family consists only of two proteins from Escherichia coli and one from the related species Haemophilus influenzae.
Probab=24.66 E-value=62 Score=26.28 Aligned_cols=13 Identities=31% Similarity=0.169 Sum_probs=9.6
Q ss_pred ChhHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLA 13 (360)
Q Consensus 1 Mk~~l~~~~~~l~ 13 (360)
||+++++++++|+
T Consensus 1 MKK~~~~~~~~l~ 13 (126)
T TIGR00156 1 MKFQAIVLASALV 13 (126)
T ss_pred CchHHHHHHHHHH
Confidence 8998887776544
No 63
>PF02553 CbiN: Cobalt transport protein component CbiN; InterPro: IPR003705 The cobalt transport protein CbiN is part of the active cobalt transport system involved in the uptake of cobalt into the cell for the biosynthesis of cobalamin (vitamin B12). It has been suggested that CbiN may function as the periplasmic binding protein component of the active cobalt transport system [].; GO: 0015087 cobalt ion transmembrane transporter activity, 0006824 cobalt ion transport, 0009236 cobalamin biosynthetic process, 0016020 membrane
Probab=24.58 E-value=51 Score=24.08 Aligned_cols=14 Identities=29% Similarity=0.323 Sum_probs=9.3
Q ss_pred ChhHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLAL 14 (360)
Q Consensus 1 Mk~~l~~~~~~l~l 14 (360)
|||+++++++++++
T Consensus 1 ~kn~~l~~~vv~l~ 14 (74)
T PF02553_consen 1 MKNLLLLLLVVALA 14 (74)
T ss_pred CceeHHHHHHHHHH
Confidence 78877777655444
No 64
>PF15240 Pro-rich: Proline-rich
Probab=24.53 E-value=62 Score=27.90 Aligned_cols=11 Identities=55% Similarity=0.667 Sum_probs=4.6
Q ss_pred HHHHHHHHHHH
Q 018104 4 VYLLAAFLLAL 14 (360)
Q Consensus 4 ~l~~~~~~l~l 14 (360)
|||+.++||+|
T Consensus 3 lVLLSvALLAL 13 (179)
T PF15240_consen 3 LVLLSVALLAL 13 (179)
T ss_pred hHHHHHHHHHh
Confidence 33434444444
No 65
>PF11912 DUF3430: Protein of unknown function (DUF3430); InterPro: IPR021837 This family of proteins is functionally uncharacterised and is found in eukaryotes. Proteins in this family are typically between 209 and 265 amino acids in length.
Probab=24.52 E-value=57 Score=28.53 Aligned_cols=15 Identities=27% Similarity=0.441 Sum_probs=8.3
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||-++++|+|+++++
T Consensus 1 MKll~~lilli~~~~ 15 (212)
T PF11912_consen 1 MKLLISLILLILLII 15 (212)
T ss_pred CcHHHHHHHHHHHHH
Confidence 776555555544444
No 66
>PF10107 Endonuc_Holl: Endonuclease related to archaeal Holliday junction resolvase; InterPro: IPR019287 This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases.
Probab=24.44 E-value=1.9e+02 Score=24.29 Aligned_cols=18 Identities=28% Similarity=0.649 Sum_probs=13.3
Q ss_pred hhHHHHHHHHHHHhc-ccc
Q 018104 32 EEGLWDLYERWRSHH-TVS 49 (360)
Q Consensus 32 ~~~~~~~f~~~~~~~-k~Y 49 (360)
+.....+|++|+... ..-
T Consensus 21 ~~~a~~~fe~wr~~~~~~~ 39 (156)
T PF10107_consen 21 ERRARELFEQWRQRESETL 39 (156)
T ss_pred HHHHHHHHHHHHHhHHHHH
Confidence 456777899998886 443
No 67
>COG4871 Uncharacterized protein conserved in archaea [Function unknown]
Probab=23.57 E-value=48 Score=27.98 Aligned_cols=16 Identities=38% Similarity=0.974 Sum_probs=11.1
Q ss_pred CCCCCCCCC--cHHHHHH
Q 018104 140 AVKDQGQCG--SCWAFST 155 (360)
Q Consensus 140 pV~dQg~cG--sCwAfA~ 155 (360)
|-.|=|.|| +|.|||.
T Consensus 135 P~tNCg~CGEqtCmaFAi 152 (193)
T COG4871 135 PQTNCGKCGEQTCMAFAI 152 (193)
T ss_pred CCCccccchhHHHHHHHH
Confidence 445666776 7899974
No 68
>PF11839 DUF3359: Protein of unknown function (DUF3359); InterPro: IPR021793 This family of proteins is functionally uncharacterised and is found in bacteria. Proteins in this family are about 80 amino acids in length.
Probab=23.03 E-value=91 Score=24.00 Aligned_cols=22 Identities=36% Similarity=0.335 Sum_probs=14.2
Q ss_pred ChhHHHHHHHHHHHHHHhhcCc
Q 018104 1 MKRVYLLAAFLLALVLGIVEGF 22 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~~~~~ 22 (360)
||++|+..+++.+++...+++.
T Consensus 1 M~k~l~sal~~~~~L~~GCAst 22 (96)
T PF11839_consen 1 MKKLLLSALALAALLLAGCAST 22 (96)
T ss_pred CchHHHHHHHHHHHHHhHccCC
Confidence 8888877666665554444443
No 69
>TIGR01655 yxeA_fam conserved hypothetical protein TIGR01655. This model represents a family of small (about 115 amino acids) uncharacterized proteins with N-terminal signal sequences, found exclusively in Gram-positive organisms. Most genomes that have any members of this family have at least two members.
Probab=22.66 E-value=60 Score=25.76 Aligned_cols=15 Identities=20% Similarity=0.348 Sum_probs=11.1
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++++.++.+++++
T Consensus 1 mKK~li~li~~ivv~ 15 (114)
T TIGR01655 1 MKKGLAILLALIVVI 15 (114)
T ss_pred CceehHHHHHHHHhH
Confidence 888888876666554
No 70
>KOG4702 consensus Uncharacterized conserved protein [Function unknown]
Probab=22.58 E-value=1.6e+02 Score=21.12 Aligned_cols=30 Identities=17% Similarity=0.175 Sum_probs=21.5
Q ss_pred HHHHHHHhc-cccCChHHHHHHHHHHHHHHHH
Q 018104 38 LYERWRSHH-TVSRSLDEKHKRFNVFKQNVMH 68 (360)
Q Consensus 38 ~f~~~~~~~-k~Y~~~~E~~~R~~if~~n~~~ 68 (360)
.|++|+..| +.- .+-|..+|.+-|++-++.
T Consensus 30 ~Fee~v~~~krel-~ppe~~~~~EE~~~~lRe 60 (77)
T KOG4702|consen 30 IFEEFVRGYKREL-SPPEATKRKEEYENFLRE 60 (77)
T ss_pred HHHHHHHhccccC-CChHHHhhHHHHHHHHHH
Confidence 599999999 665 344667777777666554
No 71
>PF07910 Peptidase_C78: Peptidase family C78; InterPro: IPR012462 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This entry contains UfSP1 and UfSP2, which are cysteine peptidases required for the processing and activation of Ubiquitin fold modifier 1 (Ufm1, IPR005375 from INTERPRO) and for its release from conjugated cellular proteins. UfSP1 and UfSP2 are 217 aa and 461 aa, respectively. The peptidases belong to MEROPS peptidase family C78, clan CA. The UfSP2 family has an N-terminal extension with one or more zinc finger domains of the C2H2 type (IPR007087 from INTERPRO), which have been shown to be involved in protein:protein interaction. UfSP2 is present in most, if not all, multi-cellular organisms including plants, nematodes, flies, and mammals, whereas UfSP1 is not present in plants and nematodes. ; PDB: 3OQC_B 2Z84_A.
Probab=22.55 E-value=1e+02 Score=27.56 Aligned_cols=23 Identities=26% Similarity=0.273 Sum_probs=15.8
Q ss_pred CCeEEEEEEeeecCCCceEEEEE
Q 018104 284 LNHGVAAVGYGTTLDGTKYWIVR 306 (360)
Q Consensus 284 ~~Hav~iVGyg~~~~g~~ywivk 306 (360)
.+|+.+|||+....+|.-+++|-
T Consensus 155 ~ghS~TIvGie~~~~g~~~LLVl 177 (218)
T PF07910_consen 155 DGHSRTIVGIERNKDGEVNLLVL 177 (218)
T ss_dssp TTEEEEEEEEEE-TT--EEEEEE
T ss_pred cccceEEEEEEECCCCCEEEEEE
Confidence 58999999999865666666553
No 72
>PF15240 Pro-rich: Proline-rich
Probab=22.13 E-value=54 Score=28.25 Aligned_cols=16 Identities=25% Similarity=0.173 Sum_probs=12.2
Q ss_pred HHHHHHHHHHHHHhhc
Q 018104 5 YLLAAFLLALVLGIVE 20 (360)
Q Consensus 5 l~~~~~~l~l~~~~~~ 20 (360)
||||||+.+||..+.|
T Consensus 1 MLlVLLSvALLALSSA 16 (179)
T PF15240_consen 1 MLLVLLSVALLALSSA 16 (179)
T ss_pred ChhHHHHHHHHHhhhc
Confidence 6888888888866544
No 73
>PF04202 Mfp-3: Foot protein 3; InterPro: IPR007328 Mytilus foot protein-3 (Mfp-3) is a highly polymorphic protein family located in the byssal adhesive plaques of blue mussels.
Probab=22.01 E-value=72 Score=22.59 Aligned_cols=15 Identities=20% Similarity=0.397 Sum_probs=10.0
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
|.++.+..+++|+|+
T Consensus 1 mnn~Si~VLlaLvLI 15 (71)
T PF04202_consen 1 MNNLSIAVLLALVLI 15 (71)
T ss_pred CCchhHHHHHHHHHH
Confidence 777776666666664
No 74
>PRK09838 periplasmic copper-binding protein; Provisional
Probab=21.18 E-value=86 Score=25.01 Aligned_cols=15 Identities=40% Similarity=0.436 Sum_probs=12.7
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++++.+++.|++.
T Consensus 1 mk~~~~~~~~~~~~~ 15 (115)
T PRK09838 1 MKKALKVAMFSLFSV 15 (115)
T ss_pred CchHHHHHHHHHHHH
Confidence 899999888888876
No 75
>COG5567 Predicted small periplasmic lipoprotein [Cell motility and secretion]
Probab=21.02 E-value=1.3e+02 Score=20.52 Aligned_cols=22 Identities=27% Similarity=0.152 Sum_probs=14.5
Q ss_pred ChhHHHHHHHHHHHHHHhhcCc
Q 018104 1 MKRVYLLAAFLLALVLGIVEGF 22 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~~~~~~ 22 (360)
||+.+.-++++..|++.++.+.
T Consensus 1 mk~~~~s~~ala~l~sLA~CG~ 22 (58)
T COG5567 1 MKNVFKSLLALATLFSLAGCGL 22 (58)
T ss_pred ChhHHHHHHHHHHHHHHHhccc
Confidence 8888877777766664444444
No 76
>PF11337 DUF3139: Protein of unknown function (DUF3139); InterPro: IPR021486 This family of proteins with unknown function appears to be restricted to Firmicutes.
Probab=20.90 E-value=93 Score=23.10 Aligned_cols=13 Identities=31% Similarity=0.659 Sum_probs=6.4
Q ss_pred Chh--HHHHHHHHHH
Q 018104 1 MKR--VYLLAAFLLA 13 (360)
Q Consensus 1 Mk~--~l~~~~~~l~ 13 (360)
||+ ++++++++++
T Consensus 1 MKK~kii~iii~li~ 15 (85)
T PF11337_consen 1 MKKKKIILIIIILIV 15 (85)
T ss_pred CCchHHHHHHHHHHH
Confidence 777 4444433333
No 77
>PRK15240 resistance to complement killing; Provisional
Probab=20.50 E-value=68 Score=27.84 Aligned_cols=17 Identities=35% Similarity=0.399 Sum_probs=12.4
Q ss_pred ChhHHHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALVLG 17 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~~~ 17 (360)
||++|++++++++++..
T Consensus 1 Mkk~~~~~~~~~~~~~~ 17 (185)
T PRK15240 1 MKKIVLSSLLLSAAGLA 17 (185)
T ss_pred CchhHHHHHHHHHHHhc
Confidence 99999877776666533
No 78
>PRK13697 cytochrome c6; Provisional
Probab=20.45 E-value=2.3e+02 Score=21.76 Aligned_cols=15 Identities=33% Similarity=0.452 Sum_probs=10.2
Q ss_pred ChhHHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLALV 15 (360)
Q Consensus 1 Mk~~l~~~~~~l~l~ 15 (360)
||++++.++++++++
T Consensus 1 m~~~~~~~~~~~~~~ 15 (111)
T PRK13697 1 MKKILSLVLLGLLLL 15 (111)
T ss_pred ChhHHHHHHHHHHHH
Confidence 888887766665554
No 79
>PRK13859 type IV secretion system lipoprotein VirB7; Provisional
Probab=20.44 E-value=78 Score=21.23 Aligned_cols=14 Identities=43% Similarity=0.292 Sum_probs=9.5
Q ss_pred ChhHHHHHHHHHHH
Q 018104 1 MKRVYLLAAFLLAL 14 (360)
Q Consensus 1 Mk~~l~~~~~~l~l 14 (360)
||..||++++.|.-
T Consensus 1 MKY~lL~l~l~La~ 14 (55)
T PRK13859 1 MKYCLLCLALALAG 14 (55)
T ss_pred CchhHHHHHHHHHh
Confidence 78777777665554
No 80
>KOG4404 consensus Tandem pore domain K+ channel TASK3/THIK-1 [Inorganic ion transport and metabolism]
Probab=20.07 E-value=2e+02 Score=27.31 Aligned_cols=32 Identities=22% Similarity=0.148 Sum_probs=20.3
Q ss_pred CCccccCChhHHHHHHHHHHHhc-cccCChHHH
Q 018104 24 FHEKELESEEGLWDLYERWRSHH-TVSRSLDEK 55 (360)
Q Consensus 24 ~~~~~~~~~~~~~~~f~~~~~~~-k~Y~~~~E~ 55 (360)
++.-+.+.|..-+..++.=+.++ ++|.=.+|+
T Consensus 27 FdaLEse~E~~~r~~l~~~~~~~~~kyn~s~~d 59 (350)
T KOG4404|consen 27 FDALESENEARERERLERRLANLKRKYNLSEED 59 (350)
T ss_pred HHHhcCcchHHHHHHHHHHHHHHHHhhCCCHHH
Confidence 34344555666667788878888 888544444
Done!