Query 018781
Match_columns 350
No_of_seqs 296 out of 1885
Neff 8.2
Searched_HMMs 46136
Date Fri Mar 29 03:58:59 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/018781.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/018781hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1542 Cysteine proteinase Ca 100.0 3.5E-83 7.7E-88 581.9 25.6 297 43-349 66-371 (372)
2 PTZ00203 cathepsin L protease; 100.0 9.7E-80 2.1E-84 584.4 34.2 299 41-348 31-339 (348)
3 PTZ00021 falcipain-2; Provisio 100.0 3.4E-78 7.3E-83 589.8 31.8 309 35-349 156-488 (489)
4 PTZ00200 cysteine proteinase; 100.0 4.1E-77 8.9E-82 580.4 32.7 304 37-349 115-445 (448)
5 KOG1543 Cysteine proteinase Ca 100.0 9.7E-70 2.1E-74 512.4 30.4 287 52-348 30-323 (325)
6 cd02621 Peptidase_C1A_Cathepsi 100.0 5.2E-58 1.1E-62 419.2 21.9 207 133-346 1-239 (243)
7 cd02698 Peptidase_C1A_Cathepsi 100.0 1.5E-57 3.3E-62 414.7 22.7 210 133-348 1-237 (239)
8 cd02248 Peptidase_C1A Peptidas 100.0 2.9E-57 6.3E-62 405.2 23.0 207 134-347 1-210 (210)
9 cd02620 Peptidase_C1A_Cathepsi 100.0 5.9E-57 1.3E-61 410.1 21.6 205 134-345 1-234 (236)
10 PF00112 Peptidase_C1: Papain 100.0 6.3E-56 1.4E-60 398.0 19.5 213 133-348 1-219 (219)
11 PTZ00049 cathepsin C-like prot 100.0 1.8E-53 3.8E-58 424.5 23.1 212 130-348 378-675 (693)
12 PTZ00364 dipeptidyl-peptidase 100.0 5.6E-53 1.2E-57 416.9 22.2 206 131-345 203-455 (548)
13 smart00645 Pept_C1 Papain fami 100.0 3.6E-50 7.9E-55 349.0 18.3 166 133-343 1-169 (174)
14 cd02619 Peptidase_C1 C1 Peptid 100.0 8.3E-47 1.8E-51 339.5 20.1 193 136-331 1-213 (223)
15 PTZ00462 Serine-repeat antigen 100.0 9.5E-45 2.1E-49 371.3 21.3 201 145-350 544-782 (1004)
16 KOG1544 Predicted cysteine pro 100.0 3.6E-42 7.9E-47 309.6 7.2 262 77-345 151-456 (470)
17 COG4870 Cysteine protease [Pos 100.0 6.7E-31 1.5E-35 242.5 7.5 198 131-332 97-315 (372)
18 cd00585 Peptidase_C1B Peptidas 99.9 6E-24 1.3E-28 206.4 13.1 179 146-330 55-399 (437)
19 PF03051 Peptidase_C1_2: Pepti 99.7 1.7E-16 3.7E-21 154.6 16.4 179 146-330 56-400 (438)
20 PF08246 Inhibitor_I29: Cathep 99.7 1.3E-16 2.8E-21 112.9 7.1 57 48-104 1-58 (58)
21 smart00848 Inhibitor_I29 Cathe 99.5 1.6E-14 3.5E-19 101.7 5.5 56 48-103 1-57 (57)
22 COG3579 PepC Aminopeptidase C 98.9 1.4E-08 3E-13 93.6 10.5 75 147-222 59-161 (444)
23 KOG4128 Bleomycin hydrolases a 97.5 0.00011 2.4E-09 68.0 4.3 75 146-221 63-167 (457)
24 PF13529 Peptidase_C39_2: Pept 97.3 0.0025 5.5E-08 52.2 10.2 57 250-315 87-144 (144)
25 PF05543 Peptidase_C47: Stapho 96.9 0.015 3.2E-07 49.7 10.8 118 149-315 17-144 (175)
26 PF08127 Propeptide_C1: Peptid 96.2 0.0041 8.8E-08 40.3 2.5 35 77-113 4-38 (41)
27 PF14399 Transpep_BrtH: NlpC/p 89.8 0.83 1.8E-05 43.0 6.4 56 251-313 77-133 (317)
28 COG4990 Uncharacterized protei 86.6 1.5 3.3E-05 37.7 5.2 51 246-316 117-168 (195)
29 cd00044 CysPc Calpains, domain 79.8 7 0.00015 37.0 7.4 27 291-317 235-263 (315)
30 PF09778 Guanylate_cyc_2: Guan 77.3 8 0.00017 34.4 6.5 59 251-313 112-180 (212)
31 PF03032 Brevenin: Brevenin/es 75.5 2 4.3E-05 28.5 1.6 20 4-23 1-20 (46)
32 cd02549 Peptidase_C39A A sub-f 72.3 11 0.00023 30.6 5.7 44 255-315 70-114 (141)
33 PF11106 YjbE: Exopolysacchari 70.4 3.7 8.1E-05 30.0 2.1 23 6-28 1-23 (80)
34 PF15240 Pro-rich: Proline-ric 69.8 2.5 5.4E-05 36.5 1.3 15 9-23 1-15 (179)
35 PF12385 Peptidase_C70: Papain 68.6 70 0.0015 27.2 9.6 38 251-303 97-135 (166)
36 PF08139 LPAM_1: Prokaryotic m 67.5 6 0.00013 22.6 2.1 15 6-20 7-21 (25)
37 PF10731 Anophelin: Thrombin i 67.2 6 0.00013 27.5 2.5 19 7-25 3-21 (65)
38 PF07172 GRP: Glycine rich pro 65.2 5.6 0.00012 30.7 2.4 16 8-23 5-20 (95)
39 COG5510 Predicted small secret 61.1 9.4 0.0002 24.8 2.4 15 6-20 2-16 (44)
40 PF11777 DUF3316: Protein of u 60.6 11 0.00023 30.1 3.4 14 6-19 1-14 (114)
41 KOG4702 Uncharacterized conser 53.7 59 0.0013 23.5 5.7 32 46-78 29-60 (77)
42 PRK10081 entericidin B membran 53.0 16 0.00034 24.4 2.5 15 6-20 2-16 (48)
43 PF02402 Lysis_col: Lysis prot 52.9 3.5 7.6E-05 26.8 -0.5 13 6-18 1-13 (46)
44 PRK11443 lipoprotein; Provisio 50.3 11 0.00025 30.5 2.0 18 6-23 1-18 (124)
45 PF11948 DUF3465: Protein of u 49.0 16 0.00035 29.9 2.6 18 6-23 1-18 (131)
46 PRK13883 conjugal transfer pro 48.8 32 0.0007 28.9 4.4 19 6-24 1-19 (151)
47 COG4588 AcfC Accessory coloniz 48.8 22 0.00047 31.6 3.5 49 6-58 1-49 (252)
48 PRK09810 entericidin A; Provis 48.7 19 0.0004 23.3 2.3 10 6-15 2-11 (41)
49 PF11873 DUF3393: Domain of un 48.4 20 0.00044 31.8 3.4 18 6-23 1-18 (204)
50 PF09403 FadA: Adhesion protei 47.4 30 0.00064 28.2 3.9 21 6-29 1-21 (126)
51 PF11153 DUF2931: Protein of u 47.4 14 0.00031 32.8 2.4 19 6-24 1-19 (216)
52 PRK10780 periplasmic chaperone 41.2 19 0.0004 30.7 2.0 28 6-33 1-29 (165)
53 PF01640 Peptidase_C10: Peptid 39.3 1.3E+02 0.0027 26.3 7.0 52 252-326 140-192 (192)
54 PRK10053 hypothetical protein; 38.2 24 0.00053 28.9 2.1 13 6-18 1-13 (130)
55 PF14060 DUF4252: Domain of un 37.7 37 0.0008 28.2 3.3 17 7-23 1-17 (155)
56 smart00230 CysPc Calpain-like 37.3 56 0.0012 31.0 4.8 26 291-316 227-254 (318)
57 PF12771 SusD-like_2: Starch-b 36.9 15 0.00033 37.1 0.9 24 6-29 1-24 (488)
58 TIGR00156 conserved hypothetic 36.8 23 0.00051 28.8 1.8 10 6-15 1-10 (126)
59 PF07437 YfaZ: YfaZ precursor; 36.6 25 0.00054 30.5 2.1 23 6-29 1-23 (180)
60 PRK11372 lysozyme inhibitor; P 35.9 33 0.00072 27.2 2.5 20 5-24 2-21 (109)
61 PF12276 DUF3617: Protein of u 34.6 28 0.00061 29.2 2.1 18 6-23 1-18 (162)
62 PRK10936 TMAO reductase system 34.5 25 0.00054 33.4 1.9 20 6-25 1-20 (343)
63 PLN00131 hypothetical protein; 33.9 15 0.00033 30.8 0.2 18 2-19 29-46 (218)
64 PLN03024 Putative EG45-like do 32.8 23 0.00049 28.9 1.1 28 6-33 1-28 (125)
65 COG3637 Opacity protein and re 32.6 31 0.00067 30.4 2.1 22 6-27 1-22 (199)
66 COG1792 MreC Cell shape-determ 32.4 1.1E+02 0.0023 28.6 5.7 27 43-69 56-82 (284)
67 PRK13835 conjugal transfer pro 32.1 90 0.002 26.1 4.5 19 6-24 1-19 (145)
68 PF05540 Serpulina_VSP: Serpul 31.9 30 0.00064 33.1 1.9 24 6-29 1-24 (377)
69 PF13677 MotB_plug: Membrane M 30.9 1.1E+02 0.0023 21.2 4.1 14 41-54 43-56 (58)
70 PRK09934 fimbrial-like adhesin 30.9 32 0.00069 29.4 1.8 19 6-24 1-19 (171)
71 KOG3554 Histone deacetylase co 30.7 1.5E+02 0.0032 29.6 6.4 32 31-62 279-310 (693)
72 PRK03577 acid shock protein pr 30.3 48 0.001 25.5 2.4 25 6-30 1-25 (102)
73 TIGR03519 Bac_Flav_fam_1 Bacte 29.8 29 0.00063 32.4 1.5 16 6-21 1-16 (292)
74 PF10614 CsgF: Type VIII secre 29.6 56 0.0012 27.2 2.9 72 6-78 1-79 (142)
75 PF15284 PAGK: Phage-encoded v 28.9 54 0.0012 23.0 2.3 14 6-19 1-16 (61)
76 TIGR01165 cbiN cobalt transpor 28.1 13 0.00028 28.3 -0.9 22 5-26 2-23 (91)
77 PRK15240 resistance to complem 28.0 39 0.00084 29.4 1.8 17 6-22 1-17 (185)
78 PF05984 Cytomega_UL20A: Cytom 27.9 51 0.0011 24.7 2.1 22 6-27 1-22 (100)
79 PF06873 SerH: Cell surface im 27.9 39 0.00084 33.2 2.0 24 6-29 1-25 (403)
80 PF00879 Defensin_propep: Defe 27.5 60 0.0013 22.1 2.3 18 6-24 1-18 (52)
81 COG3088 CcmH Uncharacterized p 27.2 1.1E+02 0.0023 25.8 4.1 13 59-71 29-41 (153)
82 PRK15209 long polar fimbrial p 26.9 49 0.0011 28.2 2.3 18 6-23 1-18 (174)
83 PF15588 Imm7: Immunity protei 26.8 2E+02 0.0043 22.9 5.6 35 294-329 17-58 (115)
84 PRK15346 outer membrane secret 26.7 71 0.0015 32.4 3.7 52 6-59 1-55 (499)
85 PF11853 DUF3373: Protein of u 26.5 79 0.0017 31.9 3.9 13 6-18 1-13 (489)
86 TIGR02744 TrbI_Ftype type-F co 25.1 1.4E+02 0.0031 23.8 4.4 45 42-86 37-82 (112)
87 PF10107 Endonuc_Holl: Endonuc 25.1 1.5E+02 0.0033 25.0 4.7 18 41-58 21-38 (156)
88 PTZ00045 apical membrane antig 25.0 72 0.0016 32.7 3.3 20 5-24 15-34 (595)
89 TIGR03044 PS_II_psb27 photosys 24.4 1.5E+02 0.0033 24.4 4.5 16 43-58 67-82 (135)
90 PF11119 DUF2633: Protein of u 24.4 69 0.0015 22.4 2.1 16 5-20 8-23 (59)
91 PF06585 JHBP: Haemolymph juve 23.9 85 0.0018 28.2 3.4 21 6-26 1-21 (248)
92 COG5633 Predicted periplasmic 22.8 51 0.0011 26.4 1.4 24 6-29 1-24 (123)
93 PRK09838 periplasmic copper-bi 22.6 69 0.0015 25.6 2.2 19 6-24 1-19 (115)
94 TIGR02052 MerP mercuric transp 22.2 63 0.0014 23.1 1.9 13 6-18 1-13 (92)
95 PRK11671 mltC murein transglyc 22.1 1.4E+02 0.003 29.0 4.5 17 6-22 1-17 (359)
96 PRK09936 hypothetical protein; 20.4 1.8E+02 0.0038 27.4 4.7 24 6-30 1-24 (296)
97 COG5266 CbiK ABC-type Co2+ tra 20.3 70 0.0015 29.4 2.0 22 6-27 1-22 (264)
98 COG3054 Predicted transcriptio 20.3 73 0.0016 26.9 1.9 24 6-29 1-24 (184)
99 PRK13733 conjugal transfer pro 20.1 1.5E+02 0.0033 25.5 3.8 18 6-23 1-18 (171)
No 1
>KOG1542 consensus Cysteine proteinase Cathepsin F [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=3.5e-83 Score=581.85 Aligned_cols=297 Identities=47% Similarity=0.822 Sum_probs=263.6
Q ss_pred HHHHHHHHHHHHhCCccCChHHHHHHHHHHHHHHHHHHHhccCC-CcEEEEcccCCCCChHhHhhhhcCCCCC-CCCCCC
Q 018781 43 KLIELFESWMSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNKEV-TSYWLGLNEFADMSHEEFKNKYLGLKPQ-FPTRRQ 120 (350)
Q Consensus 43 ~~~~~f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~-~s~~~g~N~fsDlt~~E~~~~~~~~~~~-~~~~~~ 120 (350)
...+.|..|+.+|+|+|.+.+|..+|++||++|+..++++++.. .|.+.|+|+|||||+|||++++++.+.. .+. +.
T Consensus 66 ~~~~~F~~F~~kf~r~Y~s~eE~~~Rl~iF~~N~~~a~~~q~~d~gsA~yGvtqFSDlT~eEFkk~~l~~~~~~~~~-~~ 144 (372)
T KOG1542|consen 66 GLEDSFKLFTIKFGRSYASREEHAHRLSIFKHNLLRAERLQENDPGSAEYGVTQFSDLTEEEFKKIYLGVKRRGSKL-PG 144 (372)
T ss_pred chHHHHHHHHHhcCcccCcHHHHHHHHHHHHHHHHHHHHhhhcCccccccCccchhhcCHHHHHHHhhccccccccC-cc
Confidence 34789999999999999999999999999999999999998875 5899999999999999999999876553 111 11
Q ss_pred CCCcccccccCCCCCeeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccChHHhhhhcCCCCCCCCCCc
Q 018781 121 PSAEFSYRDVKALPKSVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSEQELIDCDTSFNNGCNGGL 200 (350)
Q Consensus 121 ~~~~~~~~~~~~lP~~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~q~l~~c~~~~~~gC~GG~ 200 (350)
.....+......||++||||++|.||||||||+||||||||+++++|+++.+++|++++||||+|+||+. .++||+||.
T Consensus 145 ~~~~~~~~~~~~lP~~fDWR~kgaVTpVKnQG~CGSCWAFS~tG~vEga~~i~~g~LvsLSEQeLvDCD~-~d~gC~GGl 223 (372)
T KOG1542|consen 145 DAAEAPIEPGESLPESFDWRDKGAVTPVKNQGMCGSCWAFSTTGAVEGAWAIATGKLVSLSEQELVDCDS-CDNGCNGGL 223 (372)
T ss_pred ccccCcCCCCCCCCcccchhccCCccccccCCcCcchhhhhhhhhhhhHHHhhcCcccccchhhhhcccC-cCCcCCCCC
Confidence 1111122344689999999999999999999999999999999999999999999999999999999997 489999999
Q ss_pred hHHHHHHHHHhCCCCCCCCCccccCCC-ccCCCccCceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeecCcccccc
Q 018781 201 MDYAFKYIVASGGLHKEEDYPYLMEEG-TCEDKKEEMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEASGTDFQFY 278 (350)
Q Consensus 201 ~~~a~~~~~~~~Gi~~e~~yPY~~~~~-~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y 278 (350)
+..|++|+++.+|+..|++|||++..+ .|...+ ....+.|.+|..++. |+++|.+.| ++|||+|+|++ ..+|.|
T Consensus 224 ~~nA~~~~~~~gGL~~E~dYPY~g~~~~~C~~~~-~~~~v~I~~f~~l~~-nE~~ia~wLv~~GPi~vgiNa--~~mQ~Y 299 (372)
T KOG1542|consen 224 MDNAFKYIKKAGGLEKEKDYPYTGKKGNQCHFDK-SKIVVSIKDFSMLSN-NEDQIAAWLVTFGPLSVGINA--KPMQFY 299 (372)
T ss_pred hhHHHHHHHHhCCccccccCCccccCCCccccch-hhceEEEeccEecCC-CHHHHHHHHHhcCCeEEEEch--HHHHHh
Confidence 999999988888999999999999887 898654 567889999999976 899999998 78999999997 489999
Q ss_pred cCCeeeC---CCCCC-CCeEEEEEEEeecC-CeeEEEEEcCCCCCCCCCceEEEEecCCCCCCcccccccccceec
Q 018781 279 SGGVFTG---PCGAE-LDHGVAAVGYGKSK-GSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASIPLK 349 (350)
Q Consensus 279 ~~Giy~~---~~~~~-~~Hav~iVGyg~~~-g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~p~~ 349 (350)
.+||..+ .|... ++|||+|||||... .++|||||||||++|||+||+|+.||. |.|||+++++-+.+
T Consensus 300 rgGV~~P~~~~Cs~~~~~HaVLlvGyG~~g~~~PYWIVKNSWG~~WGE~GY~~l~RG~----N~CGi~~mvss~~v 371 (372)
T KOG1542|consen 300 RGGVSCPSKYICSPKLLNHAVLLVGYGSSGYEKPYWIVKNSWGTSWGEKGYYKLCRGS----NACGIADMVSSAAV 371 (372)
T ss_pred cccccCCCcccCCccccCceEEEEeecCCCCCCceEEEECCccccccccceEEEeccc----cccccccchhhhhc
Confidence 9999987 68764 89999999999987 899999999999999999999999995 58999999886654
No 2
>PTZ00203 cathepsin L protease; Provisional
Probab=100.00 E-value=9.7e-80 Score=584.38 Aligned_cols=299 Identities=36% Similarity=0.713 Sum_probs=250.2
Q ss_pred hhHHHHHHHHHHHHhCCccCChHHHHHHHHHHHHHHHHHHHhccCCCcEEEEcccCCCCChHhHhhhhcCCCCCC-CCCC
Q 018781 41 MDKLIELFESWMSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNKEVTSYWLGLNEFADMSHEEFKNKYLGLKPQF-PTRR 119 (350)
Q Consensus 41 ~~~~~~~f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~~s~~~g~N~fsDlt~~E~~~~~~~~~~~~-~~~~ 119 (350)
..++..+|++||++|+|.|.+.+|+.+|++||++|+++|++||+++.+|++|+|+|+|||.|||++++++..... +...
T Consensus 31 ~~~~~~~f~~~~~~~~K~Y~~~~E~~~R~~iF~~N~~~I~~~N~~~~~~~lg~N~FaDlT~eEf~~~~l~~~~~~~~~~~ 110 (348)
T PTZ00203 31 GTPAAALFEEFKRTYQRAYGTLTEEQQRLANFERNLELMREHQARNPHARFGITKFFDLSEAEFAARYLNGAAYFAAAKQ 110 (348)
T ss_pred ccHHHHHHHHHHHHhCCCCCChHHHHHHHHHHHHHHHHHHHHhccCCCeEEeccccccCCHHHHHHHhcCCCcccccccc
Confidence 456777899999999999998889999999999999999999987789999999999999999998776321111 0000
Q ss_pred CCCCcccc--cccCCCCCeeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccChHHhhhhcCCCCCCCC
Q 018781 120 QPSAEFSY--RDVKALPKSVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSEQELIDCDTSFNNGCN 197 (350)
Q Consensus 120 ~~~~~~~~--~~~~~lP~~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~q~l~~c~~~~~~gC~ 197 (350)
.....+.. .+..+||++||||++|.|+||||||.||||||||+++++|+++++++++.++||+|+|+||+. .+.||+
T Consensus 111 ~~~~~~~~~~~~~~~lP~~~DWR~~g~VtpVkdQg~CGSCWAfa~~~aiEs~~~i~~~~~~~LSeQqLvdC~~-~~~GC~ 189 (348)
T PTZ00203 111 HAGQHYRKARADLSAVPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDH-VDNGCG 189 (348)
T ss_pred cccccccccccccccCCCCCcCCcCCCCCCccccCCCccHHHHhhHHHHHHHHHHhcCCCccCCHHHHHhccC-CCCCCC
Confidence 00000111 123468999999999999999999999999999999999999999999999999999999986 478999
Q ss_pred CCchHHHHHHHHHh--CCCCCCCCCccccCCC---ccCCCccCceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeec
Q 018781 198 GGLMDYAFKYIVAS--GGLHKEEDYPYLMEEG---TCEDKKEEMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEAS 271 (350)
Q Consensus 198 GG~~~~a~~~~~~~--~Gi~~e~~yPY~~~~~---~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~ 271 (350)
||++..|++|+.++ +|+++|++|||.+.++ .|...........+++|..++. +++.|+.+| .+|||+|+|++.
T Consensus 190 GG~~~~a~~yi~~~~~ggi~~e~~YPY~~~~~~~~~C~~~~~~~~~~~i~~~~~i~~-~e~~~~~~l~~~GPv~v~i~a~ 268 (348)
T PTZ00203 190 GGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVSMES-SERVMAAWLAKNGPISIAVDAS 268 (348)
T ss_pred CCCHHHHHHHHHHhcCCCCCccccCCCccCCCCCCcCCCCcccccceEecceeecCc-CHHHHHHHHHhCCCEEEEEEhh
Confidence 99999999999764 6799999999998765 5763222123467888888866 788899998 469999999984
Q ss_pred CcccccccCCeeeCCCCC-CCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCccccccccccee
Q 018781 272 GTDFQFYSGGVFTGPCGA-ELDHGVAAVGYGKSKGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASIPL 348 (350)
Q Consensus 272 ~~~f~~y~~Giy~~~~~~-~~~Hav~iVGyg~~~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~p~ 348 (350)
+|+.|++|||+. |.. .++|||+|||||+++|++|||||||||++|||+|||||+|+. |.|||+++++...
T Consensus 269 --~f~~Y~~GIy~~-c~~~~~nHaVliVGYG~~~g~~YWiikNSWG~~WGe~GY~ri~rg~----n~Cgi~~~~~~~~ 339 (348)
T PTZ00203 269 --SFMSYHSGVLTS-CIGEQLNHGVLLVGYNMTGEVPYWVIKNSWGEDWGEKGYVRVTMGV----NACLLTGYPVSVH 339 (348)
T ss_pred --hhcCccCceeec-cCCCCCCeEEEEEEEecCCCceEEEEEcCCCCCcCcCceEEEEcCC----CcccccceEEEEe
Confidence 899999999985 753 579999999999988899999999999999999999999984 5899997776543
No 3
>PTZ00021 falcipain-2; Provisional
Probab=100.00 E-value=3.4e-78 Score=589.77 Aligned_cols=309 Identities=41% Similarity=0.750 Sum_probs=259.0
Q ss_pred CCcCCChhHHHHHHHHHHHHhCCccCChHHHHHHHHHHHHHHHHHHHhccCC-CcEEEEcccCCCCChHhHhhhhcCCCC
Q 018781 35 PEHLTSMDKLIELFESWMSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNKEV-TSYWLGLNEFADMSHEEFKNKYLGLKP 113 (350)
Q Consensus 35 ~~~~~~~~~~~~~f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~-~s~~~g~N~fsDlt~~E~~~~~~~~~~ 113 (350)
..-+.+..+...+|++||.+|+|+|.+.+|+.+|+.||++|+++|++||+++ .+|++|+|+|+|||.|||++++++.+.
T Consensus 156 ~~~~~~n~e~~~~F~~wk~ky~K~Y~~~eE~~~R~~iF~~Nl~~Ie~hN~~~~~ty~lgiNqFsDlT~EEF~~~~l~~~~ 235 (489)
T PTZ00021 156 SKFLMTNLENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVLYKKGMNRFGDLSFEEFKKKYLTLKS 235 (489)
T ss_pred hhhhccChHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHhhccCCCCEEEeccccccCCHHHHHHHhccccc
Confidence 3334445566788999999999999999999999999999999999999864 799999999999999999988776432
Q ss_pred C-CCC--C--CCC---C---CcccccccCCCCCeeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccCh
Q 018781 114 Q-FPT--R--RQP---S---AEFSYRDVKALPKSVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSE 182 (350)
Q Consensus 114 ~-~~~--~--~~~---~---~~~~~~~~~~lP~~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~ 182 (350)
. .+. . ... . ..+.......+|+++|||+.|.|+||||||.||||||||+++++|++++++++..++||+
T Consensus 236 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P~s~DWR~~g~VtpVKdQG~CGSCWAFAa~~alEs~~~I~~g~~v~LSe 315 (489)
T PTZ00021 236 FDFKSNGKKSPRVINYDDVIKKYKPKDATFDHAKYDWRLHNGVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSE 315 (489)
T ss_pred cccccccccccccccccccccccccccccCCccccccccCCCCCCcccccccccHHHHHHHHHHHHHHHHHcCCCcccCH
Confidence 1 110 0 000 0 000001111249999999999999999999999999999999999999999999999999
Q ss_pred HHhhhhcCCCCCCCCCCchHHHHHHHHHhCCCCCCCCCccccC-CCccCCCccCceeEEEeeeEecCCCcHHHHHHHHh-
Q 018781 183 QELIDCDTSFNNGCNGGLMDYAFKYIVASGGLHKEEDYPYLME-EGTCEDKKEEMEVVTISGYQDVPENDEQSLLKALA- 260 (350)
Q Consensus 183 q~l~~c~~~~~~gC~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~-~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al~- 260 (350)
|+|+||+. .+.||+||++..|++|+.+++|+++|++|||.+. ++.|....+ ...+++.+|..++ +++|+++|.
T Consensus 316 QqLVDCs~-~n~GC~GG~~~~Af~yi~~~gGl~tE~~YPY~~~~~~~C~~~~~-~~~~~i~~y~~i~---~~~lk~al~~ 390 (489)
T PTZ00021 316 QELVDCSF-KNNGCYGGLIPNAFEDMIELGGLCSEDDYPYVSDTPELCNIDRC-KEKYKIKSYVSIP---EDKFKEAIRF 390 (489)
T ss_pred HHHhhhcc-CCCCCCCcchHhhhhhhhhccccCcccccCccCCCCCccccccc-cccceeeeEEEec---HHHHHHHHHh
Confidence 99999986 4889999999999999988889999999999987 478974332 3457888998885 468899995
Q ss_pred cCCcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeecC----------CeeEEEEEcCCCCCCCCCceEEEEec
Q 018781 261 HQPVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKSK----------GSDYIIVKNSWGPKWGERGYIRMKRN 330 (350)
Q Consensus 261 ~GPV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~----------g~~ywivkNSWG~~WG~~GY~~i~~~ 330 (350)
.|||+|+|++. .+|+.|++|||+++|+..++|||+|||||+++ +.+|||||||||++|||+|||||+|+
T Consensus 391 ~GPVsv~i~a~-~~f~~YkgGIy~~~C~~~~nHAVlIVGYG~e~~~~~~~~~~~~~~YWIVKNSWGt~WGE~GY~rI~r~ 469 (489)
T PTZ00021 391 LGPISVSIAVS-DDFAFYKGGIFDGECGEEPNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNSWGESWGEKGFIRIETD 469 (489)
T ss_pred cCCeEEEEEee-cccccCCCCcCCCCCCCccceEEEEEEecCcCCcccccccCCCCCEEEEECCCCCCcccCeEEEEEcC
Confidence 69999999997 69999999999988988889999999999753 24799999999999999999999999
Q ss_pred CCCCCCcccccccccceec
Q 018781 331 TGKPEGLCGINKMASIPLK 349 (350)
Q Consensus 331 ~~~~~~~CgI~~~~~~p~~ 349 (350)
.+...|+|||++.+.||++
T Consensus 470 ~~g~~n~CGI~t~a~yP~~ 488 (489)
T PTZ00021 470 ENGLMKTCSLGTEAYVPLI 488 (489)
T ss_pred CCCCCCCCCCcccceeEec
Confidence 6544579999999999986
No 4
>PTZ00200 cysteine proteinase; Provisional
Probab=100.00 E-value=4.1e-77 Score=580.43 Aligned_cols=304 Identities=35% Similarity=0.637 Sum_probs=254.3
Q ss_pred cCCChhHHHHHHHHHHHHhCCccCChHHHHHHHHHHHHHHHHHHHhccCCCcEEEEcccCCCCChHhHhhhhcCCCCCCC
Q 018781 37 HLTSMDKLIELFESWMSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNKEVTSYWLGLNEFADMSHEEFKNKYLGLKPQFP 116 (350)
Q Consensus 37 ~~~~~~~~~~~f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~~s~~~g~N~fsDlt~~E~~~~~~~~~~~~~ 116 (350)
.+..+.++..+|++|+++|+|.|.+.+|+.+|+.||++|++.|++||.. .+|++|+|+|+|||++||.+++++.+.+..
T Consensus 115 ~~~~e~e~~~~F~~f~~ky~K~Y~~~~E~~~R~~iF~~Nl~~I~~hN~~-~~y~lgiN~FsDlT~eEF~~~~~~~~~~~~ 193 (448)
T PTZ00200 115 DPKLEFEVYLEFEEFNKKYNRKHATHAERLNRFLTFRNNYLEVKSHKGD-EPYSKEINKFSDLTEEEFRKLFPVIKVPPK 193 (448)
T ss_pred CccchHHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHhcCc-CCeEEeccccccCCHHHHHHHhccCCCccc
Confidence 3444556778999999999999999899999999999999999999964 689999999999999999988765332110
Q ss_pred CC---C--------CCCCcccc---------c---c-cCCCCCeeeccCCCCCCccccCC-CCcchHHHHHHHHHHHHHH
Q 018781 117 TR---R--------QPSAEFSY---------R---D-VKALPKSVDWRKKGAVTPVKNQG-SCGSCWAFSTVAAVEGINQ 171 (350)
Q Consensus 117 ~~---~--------~~~~~~~~---------~---~-~~~lP~~~Dwr~~g~v~pV~dQg-~cGsCwAfA~~~~lE~~~~ 171 (350)
.. . .....+.. . + ...+|++||||+.|.|+|||||| .||||||||+++++|++++
T Consensus 194 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P~~~DWR~~g~vtpVkdQG~~CGSCWAFat~~aiEs~~~ 273 (448)
T PTZ00200 194 SNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRRADAVTKVKDQGLNCGSCWAFSSVGSVESLYK 273 (448)
T ss_pred ccccccccccccccccccccccccccccccccccccccccCCCCccCCCCCCCCCcccCCCccchHHHHhHHHHHHHHHH
Confidence 00 0 00000000 0 0 01269999999999999999999 9999999999999999999
Q ss_pred HHcCCCcccChHHhhhhcCCCCCCCCCCchHHHHHHHHHhCCCCCCCCCccccCCCccCCCccCceeEEEeeeEecCCCc
Q 018781 172 IVSGNLTSLSEQELIDCDTSFNNGCNGGLMDYAFKYIVASGGLHKEEDYPYLMEEGTCEDKKEEMEVVTISGYQDVPEND 251 (350)
Q Consensus 172 ~~~~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~ 251 (350)
++++..++||+|+|+||+. .+.||+||++..|++|++++ |+++|++|||.+..+.|.... .....|.+|..++ .
T Consensus 274 i~~~~~~~LSeQqLvDC~~-~~~GC~GG~~~~A~~yi~~~-Gi~~e~~YPY~~~~~~C~~~~--~~~~~i~~y~~~~--~ 347 (448)
T PTZ00200 274 IYRDKSVDLSEQELVNCDT-KSQGCSGGYPDTALEYVKNK-GLSSSSDVPYLAKDGKCVVSS--TKKVYIDSYLVAK--G 347 (448)
T ss_pred HhcCCCeecCHHHHhhccC-ccCCCCCCcHHHHHHHHhhc-CccccccCCCCCCCCCCcCCC--CCeeEecceEecC--H
Confidence 9999999999999999986 47899999999999999776 999999999999999997543 2345688887654 3
Q ss_pred HHHHHHHHhcCCcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEee--cCCeeEEEEEcCCCCCCCCCceEEEEe
Q 018781 252 EQSLLKALAHQPVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGK--SKGSDYIIVKNSWGPKWGERGYIRMKR 329 (350)
Q Consensus 252 ~~~i~~al~~GPV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~--~~g~~ywivkNSWG~~WG~~GY~~i~~ 329 (350)
.+.+++++.+|||+|+|.++ .+|+.|++|||+++|+..++|||+|||||. ++|.+|||||||||++|||+|||||+|
T Consensus 348 ~~~l~~~l~~GPV~v~i~~~-~~f~~Yk~GIy~~~C~~~~nHaV~lVGyG~d~~~g~~YWIIkNSWG~~WGe~GY~ri~r 426 (448)
T PTZ00200 348 KDVLNKSLVISPTVVYIAVS-RELLKYKSGVYNGECGKSLNHAVLLVGEGYDEKTKKRYWIIKNSWGTDWGENGYMRLER 426 (448)
T ss_pred HHHHHHHHhcCCEEEEeecc-cccccCCCCccccccCCCCcEEEEEEEecccCCCCCceEEEEcCCCCCcccCeeEEEEe
Confidence 55677777889999999997 799999999999889877899999999985 467899999999999999999999999
Q ss_pred cCCCCCCcccccccccceec
Q 018781 330 NTGKPEGLCGINKMASIPLK 349 (350)
Q Consensus 330 ~~~~~~~~CgI~~~~~~p~~ 349 (350)
+.. +.|.|||++.+.||++
T Consensus 427 ~~~-g~n~CGI~~~~~~P~~ 445 (448)
T PTZ00200 427 TNE-GTDKCGILTVGLTPVF 445 (448)
T ss_pred CCC-CCCcCCccccceeeEE
Confidence 742 3579999999999985
No 5
>KOG1543 consensus Cysteine proteinase Cathepsin L [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=9.7e-70 Score=512.44 Aligned_cols=287 Identities=48% Similarity=0.842 Sum_probs=247.9
Q ss_pred HHHhCCccCChHHHHHHHHHHHHHHHHHHHhccC-CCcEEEEcccCCCCChHhHhhhhcCCCCCCCCCCCCCCccccccc
Q 018781 52 MSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNKE-VTSYWLGLNEFADMSHEEFKNKYLGLKPQFPTRRQPSAEFSYRDV 130 (350)
Q Consensus 52 ~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~~-~~s~~~g~N~fsDlt~~E~~~~~~~~~~~~~~~~~~~~~~~~~~~ 130 (350)
+.+|.+.|.+..|+..|+.+|.+|++.|+.||.. ..+|++++|+|+|+|.+|++....+.+.+.. ...........
T Consensus 30 ~~~~~~~y~~~~~~~~r~~~f~~n~~~~~~~n~~~~~~~~~g~n~~~d~~~ee~~~~~~~~~~~~~---~~~~~~~~~~~ 106 (325)
T KOG1543|consen 30 LVKFLKRYEDRVEKKARRAIFKENLQKIESHNLKYVLSFLMGVNQFADLTTEEFKRKKTGKKPPEI---KRDKFTEKLDG 106 (325)
T ss_pred hhhhccccccHHHHHHHHHHHHHHHHHHHhhhhhhceeeeeccccccccchHHHHHhhccccCccc---cccccccccch
Confidence 6777788876778899999999999999999997 6899999999999999999988776544332 00011112234
Q ss_pred CCCCCeeeccCCC-CCCccccCCCCcchHHHHHHHHHHHHHHHHcC-CCcccChHHhhhhcCCCCCCCCCCchHHHHHHH
Q 018781 131 KALPKSVDWRKKG-AVTPVKNQGSCGSCWAFSTVAAVEGINQIVSG-NLTSLSEQELIDCDTSFNNGCNGGLMDYAFKYI 208 (350)
Q Consensus 131 ~~lP~~~Dwr~~g-~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~ 208 (350)
.++|++||||+++ .++||||||.||||||||++++||++++|+++ .++.||+|+|+||+...++||+||.+..|++|+
T Consensus 107 ~~~p~s~DwR~~~~~~~~vkdQg~CgsCWAFaa~~aie~~~~i~~g~~l~sLSeq~lvdC~~~~~~GC~GG~~~~A~~yi 186 (325)
T KOG1543|consen 107 DDLPDSFDWRDKGAVTPPVKDQGSCGSCWAFAATGALEDRYNIKTGGKLLSLSEQDLVDCCGECGDGCNGGEPKNAFKYI 186 (325)
T ss_pred hhCCCCccccccCCcCCCcCCCCcCcchHHHHHHHHHHHHHHHHhCCccCccChhhhhhccCCCCCCcCCCCHHHHHHHH
Confidence 5899999999996 56669999999999999999999999999999 899999999999997668899999999999999
Q ss_pred HHhCCCCCCCCCccccCCCccCCCccCceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeecCcccccccCCeeeCCC
Q 018781 209 VASGGLHKEEDYPYLMEEGTCEDKKEEMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFTGPC 287 (350)
Q Consensus 209 ~~~~Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~~~~ 287 (350)
.+++|+..+++|||....+.|..... .....+.++..++.+ +++|+++| .+|||+|+|+++ .+|+.|++|||.+++
T Consensus 187 ~~~G~~t~~~~Ypy~~~~~~C~~~~~-~~~~~~~~~~~~~~~-e~~i~~~v~~~GPv~v~~~a~-~~F~~Y~~GVy~~~~ 263 (325)
T KOG1543|consen 187 KKNGGVTECENYPYIGKDGTCKSNKK-DKTVTIKGFYNVPAN-EEAIAEAVAKNGPVSVAIDAY-EDFSLYKGGVYAEEK 263 (325)
T ss_pred HHhCCCCCCcCCCCcCCCCCccCCCc-cceeEeeeeeecCcC-HHHHHHHHHhcCCeEEEEeeh-hhhhhccCceEeCCC
Confidence 99954444999999999999997654 566778888888875 99999999 569999999998 499999999999875
Q ss_pred CC--CCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCcccccccccc-ee
Q 018781 288 GA--ELDHGVAAVGYGKSKGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASI-PL 348 (350)
Q Consensus 288 ~~--~~~Hav~iVGyg~~~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~-p~ 348 (350)
.. .++|||+|||||..++.+|||||||||++|||+|||||.|+.+ .|+|++.+.| |+
T Consensus 264 ~~~~~~~Hav~iVGyG~~~~~~YWivkNSWG~~WGe~Gy~ri~r~~~----~~~I~~~~~~~p~ 323 (325)
T KOG1543|consen 264 GDDKEGDHAVLIVGYGTGDGVDYWIVKNSWGTDWGEKGYFRIARGVN----KCGIASEASYGPI 323 (325)
T ss_pred CCCCCCCceEEEEEEcCCCCceeEEEEcCCCCCcccCceEEEecCCC----chhhhcccccCCC
Confidence 54 5899999999999667899999999999999999999999964 6999999999 65
No 6
>cd02621 Peptidase_C1A_CathepsinC Cathepsin C; also known as Dipeptidyl Peptidase I (DPPI), an atypical papain-like cysteine peptidase with chloride dependency and dipeptidyl aminopeptidase activity, resulting from its tetrameric structure which limits substrate access. Each subunit of the tetramer is composed of three peptides: the heavy and light chains, which together adopts the papain fold and forms the catalytic domain; and the residual propeptide region, which forms a beta barrel and points towards the substrate's N-terminus. The subunit composition is the result of the unique characteristic of procathepsin C maturation involving the cleavage of the catalytic domain and the non-autocatalytic excision of an activation peptide within its propeptide region. By removing N-terminal dipeptide extensions, cathepsin C activates granule serine peptidases (granzymes) involved in cell-mediated apoptosis, inflammation and tissue remodelling. Loss-of-function mutations in cathepsin C are assoc
Probab=100.00 E-value=5.2e-58 Score=419.15 Aligned_cols=207 Identities=37% Similarity=0.738 Sum_probs=178.0
Q ss_pred CCCeeeccCCC----CCCccccCCCCcchHHHHHHHHHHHHHHHHcCC------CcccChHHhhhhcCCCCCCCCCCchH
Q 018781 133 LPKSVDWRKKG----AVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGN------LTSLSEQELIDCDTSFNNGCNGGLMD 202 (350)
Q Consensus 133 lP~~~Dwr~~g----~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~------~~~lS~q~l~~c~~~~~~gC~GG~~~ 202 (350)
||++||||+.+ +|+||+|||.||||||||+++++|+++++++++ .+.||+|+|++|.. .+.||+||++.
T Consensus 1 lP~~fDwr~~~~~~~~v~~v~dQg~CGsCwAfa~~~~ies~~~i~~~~~~~~~~~~~lS~q~l~dC~~-~~~GC~GG~~~ 79 (243)
T cd02621 1 LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSPQHVLSCSQ-YSQGCDGGFPF 79 (243)
T ss_pred CCCcccccccCCCCcccccCCCCCcCccHHHHHHHHHHHHHHHHHhCCCCccccCcccCHHHhhhhcC-CCCCCCCCCHH
Confidence 79999999988 999999999999999999999999999998876 68999999999986 47899999999
Q ss_pred HHHHHHHHhCCCCCCCCCcccc-CCCccCCCccCceeEEEeeeEecC----CCcHHHHHHHH-hcCCcEEEEeecCcccc
Q 018781 203 YAFKYIVASGGLHKEEDYPYLM-EEGTCEDKKEEMEVVTISGYQDVP----ENDEQSLLKAL-AHQPVSVAIEASGTDFQ 276 (350)
Q Consensus 203 ~a~~~~~~~~Gi~~e~~yPY~~-~~~~c~~~~~~~~~~~i~~~~~v~----~~~~~~i~~al-~~GPV~v~i~~~~~~f~ 276 (350)
.+++|+.++ |+++|++|||.. ....|..........++..|..+. ..++++|+++| .+|||+++|++. ++|+
T Consensus 80 ~a~~~~~~~-Gi~~e~~yPY~~~~~~~C~~~~~~~~~~~~~~~~~i~~~~~~~~~~~ik~~i~~~GPv~v~~~~~-~~F~ 157 (243)
T cd02621 80 LVGKFAEDF-GIVTEDYFPYTADDDRPCKASPSECRRYYFSDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVY-SDFD 157 (243)
T ss_pred HHHHHHHhc-CcCCCceeCCCCCCCCCCCCCccccccccccceeEcccccccCCHHHHHHHHHHcCCEEEEEEec-cccc
Confidence 999999877 899999999998 677897433122333444444331 34789999999 579999999997 7999
Q ss_pred cccCCeeeCC-----CCC---------CCCeEEEEEEEeecC--CeeEEEEEcCCCCCCCCCceEEEEecCCCCCCcccc
Q 018781 277 FYSGGVFTGP-----CGA---------ELDHGVAAVGYGKSK--GSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGI 340 (350)
Q Consensus 277 ~y~~Giy~~~-----~~~---------~~~Hav~iVGyg~~~--g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI 340 (350)
.|++|||+.+ |.. .++|||+|||||++. +++|||||||||++|||+|||||+|+. |.|||
T Consensus 158 ~Y~~GIy~~~~~~~~C~~~~~~~~~~~~~~HaV~iVGyg~~~~~g~~YWiirNSWG~~WGe~Gy~~i~~~~----~~cgi 233 (243)
T cd02621 158 FYKEGVYHHTDNDEVSDGDNDNFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEKGYFKIRRGT----NECGI 233 (243)
T ss_pred ccCCeEECcCCcccccccccccccCcccCCeEEEEEEeeccCCCCCcEEEEEcCCCCCCCcCCeEEEecCC----cccCc
Confidence 9999999874 532 469999999999976 899999999999999999999999984 58999
Q ss_pred cccccc
Q 018781 341 NKMASI 346 (350)
Q Consensus 341 ~~~~~~ 346 (350)
++++.+
T Consensus 234 ~~~~~~ 239 (243)
T cd02621 234 ESQAVF 239 (243)
T ss_pred ccceEe
Confidence 999865
No 7
>cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.
Probab=100.00 E-value=1.5e-57 Score=414.73 Aligned_cols=210 Identities=31% Similarity=0.597 Sum_probs=180.4
Q ss_pred CCCeeeccCCC---CCCccccCC---CCcchHHHHHHHHHHHHHHHHcC---CCcccChHHhhhhcCCCCCCCCCCchHH
Q 018781 133 LPKSVDWRKKG---AVTPVKNQG---SCGSCWAFSTVAAVEGINQIVSG---NLTSLSEQELIDCDTSFNNGCNGGLMDY 203 (350)
Q Consensus 133 lP~~~Dwr~~g---~v~pV~dQg---~cGsCwAfA~~~~lE~~~~~~~~---~~~~lS~q~l~~c~~~~~~gC~GG~~~~ 203 (350)
||++||||+.+ +|+|||||| .||||||||++++||+++.++++ ..+.||+|+|+||+. +.||+||++..
T Consensus 1 lP~~~Dwr~~~~~~~v~~vk~Qg~~~~CGsCwAfa~~~aies~~~i~~~~~~~~~~lS~Q~lldC~~--~~gC~GG~~~~ 78 (239)
T cd02698 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQVVIDCAG--GGSCHGGDPGG 78 (239)
T ss_pred CCCCcccccCCCCcccCccccCCCCCCCCcchHHHhHHHHHHHHHHHHCCCCCCcccCHHHHHhCCC--CCCccCcCHHH
Confidence 69999999987 999999998 89999999999999999998875 357899999999986 68999999999
Q ss_pred HHHHHHHhCCCCCCCCCccccCCCccCCCc----------c----CceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEE
Q 018781 204 AFKYIVASGGLHKEEDYPYLMEEGTCEDKK----------E----EMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAI 268 (350)
Q Consensus 204 a~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~----------~----~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i 268 (350)
+++|+.++ |+++|++|||...+..|.... + ....+.+++|..++ ++++|+++| .+|||+++|
T Consensus 79 a~~~~~~~-Gl~~e~~yPY~~~~~~C~~~~~~~~c~~~~~c~~~~~~~~~~i~~~~~~~--~~~~i~~~l~~~GPV~v~i 155 (239)
T cd02698 79 VYEYAHKH-GIPDETCNPYQAKDGECNPFNRCGTCNPFGECFAIKNYTLYFVSDYGSVS--GRDKMMAEIYARGPISCGI 155 (239)
T ss_pred HHHHHHHc-CcCCCCeeCCcCCCCCCcCCCCCCCcccCcccccccccceEEeeeceecC--CHHHHHHHHHHcCCEEEEE
Confidence 99999886 899999999988766665210 0 12345677777774 478899998 689999999
Q ss_pred eecCcccccccCCeeeCC-CCCCCCeEEEEEEEeecC-CeeEEEEEcCCCCCCCCCceEEEEecCC-CCCCccccccccc
Q 018781 269 EASGTDFQFYSGGVFTGP-CGAELDHGVAAVGYGKSK-GSDYIIVKNSWGPKWGERGYIRMKRNTG-KPEGLCGINKMAS 345 (350)
Q Consensus 269 ~~~~~~f~~y~~Giy~~~-~~~~~~Hav~iVGyg~~~-g~~ywivkNSWG~~WG~~GY~~i~~~~~-~~~~~CgI~~~~~ 345 (350)
.++ ++|+.|++|||+.. |...++|||+|||||+++ +++|||||||||++|||+|||||+|+.. +-.|+|||++.++
T Consensus 156 ~~~-~~f~~Y~~GIy~~~~~~~~~~HaV~IVGyG~~~~g~~YWiikNSWG~~WGe~Gy~~i~rg~~~~~~~~~~i~~~~~ 234 (239)
T cd02698 156 MAT-EALENYTGGVYKEYVQDPLINHIISVAGWGVDENGVEYWIVRNSWGEPWGERGWFRIVTSSYKGARYNLAIEEDCA 234 (239)
T ss_pred Eec-ccccccCCeEEccCCCCCcCCeEEEEEEEEecCCCCEEEEEEcCCCcccCcCceEEEEccCCcccccccccccceE
Confidence 997 69999999999874 455679999999999876 8999999999999999999999999961 1236899999999
Q ss_pred cee
Q 018781 346 IPL 348 (350)
Q Consensus 346 ~p~ 348 (350)
|+.
T Consensus 235 ~~~ 237 (239)
T cd02698 235 WAD 237 (239)
T ss_pred EEe
Confidence 875
No 8
>cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to h
Probab=100.00 E-value=2.9e-57 Score=405.16 Aligned_cols=207 Identities=59% Similarity=1.085 Sum_probs=187.7
Q ss_pred CCeeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccChHHhhhhcCCCCCCCCCCchHHHHHHHHHhCC
Q 018781 134 PKSVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSEQELIDCDTSFNNGCNGGLMDYAFKYIVASGG 213 (350)
Q Consensus 134 P~~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~~G 213 (350)
|++||||+.+.++||+|||.||+|||||++++||++++++++...+||+|+|++|....+.+|.||++..+++++.+. |
T Consensus 1 P~~~d~r~~~~~~~v~dQg~cgsCwAfa~~~~le~~~~i~~~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~-G 79 (210)
T cd02248 1 PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCSTSGNNGCNGGNPDNAFEYVKNG-G 79 (210)
T ss_pred CCcccCCcCCCCCCCccCCCCcchHHhHHHHHHHHHHHHHcCCCcccCHHHHhccCCCCCCCCCCCCHHHhHHHHHHC-C
Confidence 789999999999999999999999999999999999999999889999999999986447899999999999988776 8
Q ss_pred CCCCCCCccccCCCccCCCccCceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeecCcccccccCCeeeCCCC--CC
Q 018781 214 LHKEEDYPYLMEEGTCEDKKEEMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFTGPCG--AE 290 (350)
Q Consensus 214 i~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~~~~~--~~ 290 (350)
+++|++|||......|.... .....++.+|..++..++++||++| .+|||++++.+. ++|+.|++|||..++. ..
T Consensus 80 i~~e~~yPY~~~~~~C~~~~-~~~~~~i~~~~~i~~~~~~~ik~~l~~~gPV~~~~~~~-~~f~~y~~Giy~~~~~~~~~ 157 (210)
T cd02248 80 LASESDYPYTGKDGTCKYNS-SKVGAKITGYSNVPPGDEEALKAALANYGPVSVAIDAS-SSFQFYKGGIYSGPCCSNTN 157 (210)
T ss_pred cCccccCCccCCCCCccCCC-CcccEEEeeEEEcCCCcHHHHHHHHhhcCCEEEEEecC-cccccCCCCceeCCCCCCCc
Confidence 99999999999888898543 3567899999999877789999999 569999999997 7999999999987543 46
Q ss_pred CCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCcccccccccce
Q 018781 291 LDHGVAAVGYGKSKGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASIP 347 (350)
Q Consensus 291 ~~Hav~iVGyg~~~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~p 347 (350)
++|||+|||||++.+++|||||||||++||++|||||+|+. |.|||++.+.||
T Consensus 158 ~~Hav~iVGy~~~~~~~ywiv~NSWG~~WG~~Gy~~i~~~~----~~cgi~~~~~~~ 210 (210)
T cd02248 158 LNHAVLLVGYGTENGVDYWIVKNSWGTSWGEKGYIRIARGS----NLCGIASYASYP 210 (210)
T ss_pred CCEEEEEEEEeecCCceEEEEEcCCCCccccCcEEEEEcCC----CccCceeeeecC
Confidence 79999999999988899999999999999999999999984 589999998886
No 9
>cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane
Probab=100.00 E-value=5.9e-57 Score=410.14 Aligned_cols=205 Identities=36% Similarity=0.702 Sum_probs=173.9
Q ss_pred CCeeeccCC--CCC--CccccCCCCcchHHHHHHHHHHHHHHHHcC--CCcccChHHhhhhcCCCCCCCCCCchHHHHHH
Q 018781 134 PKSVDWRKK--GAV--TPVKNQGSCGSCWAFSTVAAVEGINQIVSG--NLTSLSEQELIDCDTSFNNGCNGGLMDYAFKY 207 (350)
Q Consensus 134 P~~~Dwr~~--g~v--~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~--~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~ 207 (350)
|++||||++ +++ +||+|||.||||||||++++||+++.++++ +.+.||+|+|+||+...+.||+||++..+++|
T Consensus 1 p~~~DwR~~~~~~~~v~~v~dQg~CGsCwAfa~~~~le~~~~i~~~~~~~~~LS~Q~lidC~~~~~~gC~GG~~~~a~~~ 80 (236)
T cd02620 1 PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQDLLSCCSGCGDGCNGGYPDAAWKY 80 (236)
T ss_pred CCcccchhhCCCCCCccccCCcccchhHHHHHHHHHHhhHHHHhcCCCCccccCHHHHHhhcCCCCCCCCCCCHHHHHHH
Confidence 899999996 554 599999999999999999999999999887 77899999999998644789999999999999
Q ss_pred HHHhCCCCCCCCCccccCCCc------------------cCCCcc---CceeEEEeeeEecCCCcHHHHHHHH-hcCCcE
Q 018781 208 IVASGGLHKEEDYPYLMEEGT------------------CEDKKE---EMEVVTISGYQDVPENDEQSLLKAL-AHQPVS 265 (350)
Q Consensus 208 ~~~~~Gi~~e~~yPY~~~~~~------------------c~~~~~---~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~ 265 (350)
++++ |+++|++|||...+.. |..... .....++..+..+. .++++||.+| .+|||+
T Consensus 81 i~~~-G~~~e~~yPY~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~~~~~~~~-~~~~~ik~~l~~~GPv~ 158 (236)
T cd02620 81 LTTT-GVVTGGCQPYTIPPCGHHPEGPPPCCGTPYCTPKCQDGCEKTYEEDKHKGKSAYSVP-SDETDIMKEIMTNGPVQ 158 (236)
T ss_pred HHhc-CCCcCCEecCcCCCCccCCCCCCCCCCCCCCCCCCCcCCccccceeeeeecceeeeC-CHHHHHHHHHHHCCCeE
Confidence 9887 8999999999876543 332111 11234455555554 3789999999 679999
Q ss_pred EEEeecCcccccccCCeeeCCCCC-CCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCcccccccc
Q 018781 266 VAIEASGTDFQFYSGGVFTGPCGA-ELDHGVAAVGYGKSKGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMA 344 (350)
Q Consensus 266 v~i~~~~~~f~~y~~Giy~~~~~~-~~~Hav~iVGyg~~~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~ 344 (350)
++|.+. ++|+.|++|||+.+++. .++|||+|||||++++++|||||||||++|||+|||||+|+. |+|||++.+
T Consensus 159 v~i~~~-~~f~~Y~~Giy~~~~~~~~~~HaV~iVGyg~~~g~~YWivrNSWG~~WGe~Gy~ri~~~~----~~cgi~~~~ 233 (236)
T cd02620 159 AAFTVY-EDFLYYKSGVYQHTSGKQLGGHAVKIIGWGVENGVPYWLAANSWGTDWGENGYFRILRGS----NECGIESEV 233 (236)
T ss_pred EEEEec-hhhhhcCCcEEeecCCCCcCCeEEEEEEEeccCCeeEEEEEeCCCCCCCCCcEEEEEccC----cccccccce
Confidence 999996 79999999999876554 468999999999988999999999999999999999999984 589999887
Q ss_pred c
Q 018781 345 S 345 (350)
Q Consensus 345 ~ 345 (350)
+
T Consensus 234 ~ 234 (236)
T cd02620 234 V 234 (236)
T ss_pred e
Confidence 5
No 10
>PF00112 Peptidase_C1: Papain family cysteine protease This is family C1 in the peptidase classification. ; InterPro: IPR000668 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues. The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. ; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MOR_B 3HHI_B 1S4V_A 3F75_A 1MEG_A 1PCI_C 1PPO_A 3HD3_B 1F29_A 1EWL_A ....
Probab=100.00 E-value=6.3e-56 Score=397.97 Aligned_cols=213 Identities=47% Similarity=0.877 Sum_probs=183.5
Q ss_pred CCCeeeccCC-CCCCccccCCCCcchHHHHHHHHHHHHHHHHc-CCCcccChHHhhhhcCCCCCCCCCCchHHHHHHHHH
Q 018781 133 LPKSVDWRKK-GAVTPVKNQGSCGSCWAFSTVAAVEGINQIVS-GNLTSLSEQELIDCDTSFNNGCNGGLMDYAFKYIVA 210 (350)
Q Consensus 133 lP~~~Dwr~~-g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~-~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~ 210 (350)
||++||||+. +.++||+|||.||+|||||+++++|++++++. ...++||+|+|++|....+.+|+||++..|++++++
T Consensus 1 lP~~~D~r~~~~~~~~v~dQg~~gsCwafa~~~~~e~~~~~~~~~~~~~lS~q~l~~~~~~~~~~c~gg~~~~a~~~~~~ 80 (219)
T PF00112_consen 1 LPKSFDWRDKGGRITPVRDQGSCGSCWAFAAAAALESRLAIQNNGKNVDLSEQYLIDCSNKYNKGCDGGSPFDALKYIKN 80 (219)
T ss_dssp STSSEEGGGTTTCSG---BTTSSBTHHHHHHHHHHHHHHHHHHTSSCEEB-HHHHHHHSTGTSSTTBBBEHHHHHHHHHH
T ss_pred CCCCEecccCCCCcCccccCCcccccccchhccceeccccccccccccccccccccccccccccccccCcccccceeecc
Confidence 7999999998 48999999999999999999999999999999 788999999999999734679999999999999998
Q ss_pred hCCCCCCCCCccccCC-CccCCCccCceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeecCcccccccCCeeeCC-C
Q 018781 211 SGGLHKEEDYPYLMEE-GTCEDKKEEMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFTGP-C 287 (350)
Q Consensus 211 ~~Gi~~e~~yPY~~~~-~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~~~-~ 287 (350)
+.|+++|++|||.... ..|..........++..|..+...++++|+++| .+|||++++.+...+|+.|++|||..+ |
T Consensus 81 ~~Gi~~e~~~pY~~~~~~~c~~~~~~~~~~~i~~~~~~~~~~~~~ik~~L~~~gpV~~~~~~~~~~f~~~~~gi~~~~~~ 160 (219)
T PF00112_consen 81 NNGIVTEEDYPYNGNENPTCKSKKSNSYYVKIKGYGKVKDNDIEDIKKALMKYGPVVASIDVSSEDFQNYKSGIYDPPDC 160 (219)
T ss_dssp HTSBEBTTTS--SSSSSCSSCHSGGGEEEBEESEEEEEESTCHHHHHHHHHHHSSEEEEEEEESHHHHTEESSEECSTSS
T ss_pred cCcccccccccccccccccccccccccccccccccccccccchhHHHHHHhhCceeeeeeeccccccccccceeeecccc
Confidence 4599999999999877 688855433235788899988777899999999 569999999998446999999999884 5
Q ss_pred C-CCCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCccccccccccee
Q 018781 288 G-AELDHGVAAVGYGKSKGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASIPL 348 (350)
Q Consensus 288 ~-~~~~Hav~iVGyg~~~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~p~ 348 (350)
. ..++|||+|||||++.+++|||||||||++||++||+||+|+.+ ++|||++.++||+
T Consensus 161 ~~~~~~Hav~iVGy~~~~~~~~wiv~NSWG~~WG~~Gy~~i~~~~~---~~c~i~~~~~~~~ 219 (219)
T PF00112_consen 161 SNESGGHAVLIVGYDDENGKGYWIVKNSWGTDWGDNGYFRISYDYN---NECGIESQAVYPI 219 (219)
T ss_dssp SSSSEEEEEEEEEEEEETTEEEEEEE-SBTTTSTBTTEEEEESSSS---SGGGTTSSEEEEE
T ss_pred ccccccccccccccccccceeeEeeehhhCCccCCCeEEEEeeCCC---CcCccCceeeecC
Confidence 5 46799999999999999999999999999999999999999964 4899999999996
No 11
>PTZ00049 cathepsin C-like protein; Provisional
Probab=100.00 E-value=1.8e-53 Score=424.50 Aligned_cols=212 Identities=29% Similarity=0.571 Sum_probs=176.3
Q ss_pred cCCCCCeeeccCC----CCCCccccCCCCcchHHHHHHHHHHHHHHHHcCC-----C-----cccChHHhhhhcCCCCCC
Q 018781 130 VKALPKSVDWRKK----GAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGN-----L-----TSLSEQELIDCDTSFNNG 195 (350)
Q Consensus 130 ~~~lP~~~Dwr~~----g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~-----~-----~~lS~q~l~~c~~~~~~g 195 (350)
..+||++||||+. +.++||+|||.||||||||+++++|++++|+.+. . ..||+|+|+||+. .++|
T Consensus 378 ~~~LP~sfDWRd~~~~~~~vtpVkdQG~CGSCWAFAat~alEsR~~Ia~~~~l~~~~~~~~~~~LS~QqLLDCs~-~nqG 456 (693)
T PTZ00049 378 IDELPKNFTWGDPFNNNTREYDVTNQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNFDDLLSIQTVLSCSF-YDQG 456 (693)
T ss_pred cccCCCCEecCcCCCCCCcccCCCCCccCcHHHHHHHHHHHHHHHHHHhccccccccccccccCcCHHHhcccCC-CCCC
Confidence 4689999999984 6799999999999999999999999999998642 1 2799999999986 4889
Q ss_pred CCCCchHHHHHHHHHhCCCCCCCCCccccCCCccCCCccC--------------------------------------ce
Q 018781 196 CNGGLMDYAFKYIVASGGLHKEEDYPYLMEEGTCEDKKEE--------------------------------------ME 237 (350)
Q Consensus 196 C~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~--------------------------------------~~ 237 (350)
|+||++..|++|+.+. ||++|.+|||.+..+.|...... ..
T Consensus 457 C~GG~~~~A~kya~~~-GI~tEscYPY~a~~g~C~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 535 (693)
T PTZ00049 457 CNGGFPYLVSKMAKLQ-GIPLDKVFPYTATEQTCPYQVDQSANSMNGSANLRQINAVFFSSETQSDMHADFEAPISSEPA 535 (693)
T ss_pred cCCCcHHHHHHHHHHC-CCCcCCccCCcCCCCCCCCCCCCcccccccccccccccccccccccccccccccccccccccc
Confidence 9999999999999876 89999999999887778532110 11
Q ss_pred eEEEeeeEecC-------CCcHHHHHHHH-hcCCcEEEEeecCcccccccCCeeeC-------CCCC-------------
Q 018781 238 VVTISGYQDVP-------ENDEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFTG-------PCGA------------- 289 (350)
Q Consensus 238 ~~~i~~~~~v~-------~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~~-------~~~~------------- 289 (350)
.+.++.|..+. ..++++|+++| .+|||+|+|++. ++|+.|++|||+. .|..
T Consensus 536 r~y~k~y~yI~g~y~~~~~~~E~~Im~eI~~~GPVsVsIda~-~dF~~YksGVY~~~~~~h~~~C~~d~~~~~~~~~~~G 614 (693)
T PTZ00049 536 RWYAKDYNYIGGCYGCNQCNGEKIMMNEIYRNGPIVASFEAS-PDFYDYADGVYYVEDFPHARRCTVDLPKHNGVYNITG 614 (693)
T ss_pred ceeeeeeEEecccccccCCCCHHHHHHHHHhcCCEEEEEEec-hhhhcCCCccccCcccccccccCCccccccccccccc
Confidence 23345555542 24688999998 579999999997 7899999999975 2532
Q ss_pred --CCCeEEEEEEEeec--CCe--eEEEEEcCCCCCCCCCceEEEEecCCCCCCccccccccccee
Q 018781 290 --ELDHGVAAVGYGKS--KGS--DYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASIPL 348 (350)
Q Consensus 290 --~~~Hav~iVGyg~~--~g~--~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~p~ 348 (350)
.++|||+|||||.+ +|. +|||||||||++||++|||||+|+. |.|||++.+.|++
T Consensus 615 ~e~~NHAVlIVGwG~d~enG~~~~YWIVRNSWGt~WGenGYfKI~RG~----N~CGIEs~a~~~~ 675 (693)
T PTZ00049 615 WEKVNHAIVLVGWGEEEINGKLYKYWIGRNSWGKNWGKEGYFKIIRGK----NFSGIESQSLFIE 675 (693)
T ss_pred cccCceEEEEEEeccccCCCcccCEEEEECCCCCCcccCceEEEEcCC----CccCCccceeEEe
Confidence 36999999999974 453 7999999999999999999999995 5899999999875
No 12
>PTZ00364 dipeptidyl-peptidase I precursor; Provisional
Probab=100.00 E-value=5.6e-53 Score=416.92 Aligned_cols=206 Identities=22% Similarity=0.506 Sum_probs=173.5
Q ss_pred CCCCCeeeccCCC---CCCccccCCC---CcchHHHHHHHHHHHHHHHHcC------CCcccChHHhhhhcCCCCCCCCC
Q 018781 131 KALPKSVDWRKKG---AVTPVKNQGS---CGSCWAFSTVAAVEGINQIVSG------NLTSLSEQELIDCDTSFNNGCNG 198 (350)
Q Consensus 131 ~~lP~~~Dwr~~g---~v~pV~dQg~---cGsCwAfA~~~~lE~~~~~~~~------~~~~lS~q~l~~c~~~~~~gC~G 198 (350)
.+||++||||++| +++||||||. ||||||||+++++|++++++++ ..+.||+|+|+||+. .++||+|
T Consensus 203 ~~LP~sfDWR~~gg~~~VtpVrdQg~~~~CGSCWAFAav~alEsr~~I~tn~~~~~g~~~~LS~QqLVDCs~-~n~GCdG 281 (548)
T PTZ00364 203 DPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDPLGQQTFLSARHVLDCSQ-YGQGCAG 281 (548)
T ss_pred cCCCCccccCcCCCCccCCCCcCCCCCCCCcCHHHHHHHHHHHHHHHHHhCCCcccCcccCcCHHHHhcccC-CCCCCCC
Confidence 5799999999987 7999999999 9999999999999999999873 468899999999986 4789999
Q ss_pred CchHHHHHHHHHhCCCCCCCCC--ccccCCC---ccCCCccCceeEE------EeeeEecCCCcHHHHHHHH-hcCCcEE
Q 018781 199 GLMDYAFKYIVASGGLHKEEDY--PYLMEEG---TCEDKKEEMEVVT------ISGYQDVPENDEQSLLKAL-AHQPVSV 266 (350)
Q Consensus 199 G~~~~a~~~~~~~~Gi~~e~~y--PY~~~~~---~c~~~~~~~~~~~------i~~~~~v~~~~~~~i~~al-~~GPV~v 266 (350)
|++..|++|+.++ ||++|++| ||.+..+ .|.... ....+. +.+|..+. .++++|+.+| .+|||+|
T Consensus 282 G~p~~A~~yi~~~-GI~tE~dY~~PY~~~dg~~~~Ck~~~-~~~~y~~~~~~~I~gyy~~~-~~e~~I~~eI~~~GPVsV 358 (548)
T PTZ00364 282 GFPEEVGKFAETF-GILTTDSYYIPYDSGDGVERACKTRR-PSRRYYFTNYGPLGGYYGAV-TDPDEIIWEIYRHGPVPA 358 (548)
T ss_pred CcHHHHHHHHHhC-CcccccccCCCCCCCCCCCCCCCCCc-ccceeeeeeeEEecceeecC-CcHHHHHHHHHHcCCeEE
Confidence 9999999999876 89999999 9987655 486432 222233 33444443 4788899998 6799999
Q ss_pred EEeecCcccccccCCeeeC---------CC-----------CCCCCeEEEEEEEee-cCCeeEEEEEcCCCC--CCCCCc
Q 018781 267 AIEASGTDFQFYSGGVFTG---------PC-----------GAELDHGVAAVGYGK-SKGSDYIIVKNSWGP--KWGERG 323 (350)
Q Consensus 267 ~i~~~~~~f~~y~~Giy~~---------~~-----------~~~~~Hav~iVGyg~-~~g~~ywivkNSWG~--~WG~~G 323 (350)
+|+++ .+|+.|++|||.+ .| ...++|||+|||||. ++|.+|||||||||+ +|||+|
T Consensus 359 aIda~-~df~~YksGiy~gi~~~~~~~~~~~~~~~~~~~~~~~~~nHAVlIVGYG~de~G~~YWIVKNSWGt~~~WGE~G 437 (548)
T PTZ00364 359 SVYAN-SDWYNCDENSTEDVRYVSLDDYSTASADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSWCDGG 437 (548)
T ss_pred EEEec-hHHHhcCCCCccCeeccccccccccccCCcccccccccCCeEEEEEEecccCCCceEEEEECCCCCCCCcccCC
Confidence 99997 7899999998752 11 134699999999997 578899999999999 999999
Q ss_pred eEEEEecCCCCCCccccccccc
Q 018781 324 YIRMKRNTGKPEGLCGINKMAS 345 (350)
Q Consensus 324 Y~~i~~~~~~~~~~CgI~~~~~ 345 (350)
||||+||. |+|||++.++
T Consensus 438 YfRI~RG~----N~CGIes~~v 455 (548)
T PTZ00364 438 TRKIARGV----NAYNIESEVV 455 (548)
T ss_pred eEEEEcCC----Ccccccceee
Confidence 99999995 5899999987
No 13
>smart00645 Pept_C1 Papain family cysteine protease.
Probab=100.00 E-value=3.6e-50 Score=348.98 Aligned_cols=166 Identities=61% Similarity=1.123 Sum_probs=148.2
Q ss_pred CCCeeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccChHHhhhhcCCCCCCCCCCchHHHHHHHHHhC
Q 018781 133 LPKSVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSEQELIDCDTSFNNGCNGGLMDYAFKYIVASG 212 (350)
Q Consensus 133 lP~~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~~ 212 (350)
||++||||+.++++||+|||.||+|||||+++++|++++++++..++||+|+|++|....+.||+||++..|++|+.+++
T Consensus 1 lP~~~D~R~~~~~~~v~dQg~CGsCwAfa~~~~ie~~~~i~~~~~~~lS~q~l~~C~~~~~~gC~GG~~~~a~~~~~~~~ 80 (174)
T smart00645 1 LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCSTGGNNGCNGGLPDNAFEYIKKNG 80 (174)
T ss_pred CCCcCcccccCCCCccccCcccchHHHHHHHHHHHHHHHHhcCCccccCHHHHhhhcCCCCCCCCCcCHHHHHHHHHHcC
Confidence 69999999999999999999999999999999999999999998999999999999864456999999999999998866
Q ss_pred CCCCCCCCccccCCCccCCCccCceeEEEeeeEecCCCcHHHHHHHHhcCCcEEEEeecCcccccccCCeeeC-CCCC-C
Q 018781 213 GLHKEEDYPYLMEEGTCEDKKEEMEVVTISGYQDVPENDEQSLLKALAHQPVSVAIEASGTDFQFYSGGVFTG-PCGA-E 290 (350)
Q Consensus 213 Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al~~GPV~v~i~~~~~~f~~y~~Giy~~-~~~~-~ 290 (350)
|+++|++|||.. ++.+.+. +|+.|++|||+. +|.. .
T Consensus 81 Gi~~e~~~PY~~----------------------------------------~~~~~~~--~f~~Y~~Gi~~~~~~~~~~ 118 (174)
T smart00645 81 GLETESCYPYTG----------------------------------------SVAIDAS--DFQFYKSGIYDHPGCGSGT 118 (174)
T ss_pred CcccccccCccc----------------------------------------EEEEEcc--cccCCcCeEECCCCCCCCc
Confidence 899999999965 4555553 699999999987 4764 3
Q ss_pred CCeEEEEEEEeec-CCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCccccccc
Q 018781 291 LDHGVAAVGYGKS-KGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKM 343 (350)
Q Consensus 291 ~~Hav~iVGyg~~-~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~ 343 (350)
.+|||+|||||.+ ++++|||||||||++|||+|||||+|+.. |.|||+..
T Consensus 119 ~~Hav~ivGyg~~~~g~~yWii~NSwG~~WG~~G~~~i~~~~~---~~c~i~~~ 169 (174)
T smart00645 119 LDHAVLIVGYGTEENGKDYWIVKNSWGTDWGENGYFRIARGKN---NECGIEAS 169 (174)
T ss_pred ccEEEEEEEEeecCCCeeEEEEECCCCCCcccCeEEEEEcCCC---CccCceee
Confidence 7999999999987 88899999999999999999999999852 57999544
No 14
>cd02619 Peptidase_C1 C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel str
Probab=100.00 E-value=8.3e-47 Score=339.52 Aligned_cols=193 Identities=37% Similarity=0.579 Sum_probs=167.0
Q ss_pred eeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcC--CCcccChHHhhhhcCCC----CCCCCCCchHHHHH-HH
Q 018781 136 SVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSG--NLTSLSEQELIDCDTSF----NNGCNGGLMDYAFK-YI 208 (350)
Q Consensus 136 ~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~--~~~~lS~q~l~~c~~~~----~~gC~GG~~~~a~~-~~ 208 (350)
++|||+.+ ++||+|||.||+|||||+++++|+.++++.+ +.++||+|+|++|.... ..+|.||.+..++. ++
T Consensus 1 ~~d~r~~~-~~~v~dQg~~gsCwafa~~~~les~~~~~~~~~~~~~lS~q~l~~c~~~~~~~~~~~c~gG~~~~~~~~~~ 79 (223)
T cd02619 1 SVDLRPLR-LTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECLGINGSCDGGGPLSALLKLV 79 (223)
T ss_pred CCcchhcC-CCCcccCCCCcCcHHHHHHHHHHHHHHHhcCCcccccCCHHHHHHhccccccccCCCCCCCcHHHHHHHHH
Confidence 48999988 9999999999999999999999999999987 78999999999998653 36999999999998 77
Q ss_pred HHhCCCCCCCCCccccCCCccCCC---ccCceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeecCcccccccCCeee
Q 018781 209 VASGGLHKEEDYPYLMEEGTCEDK---KEEMEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFT 284 (350)
Q Consensus 209 ~~~~Gi~~e~~yPY~~~~~~c~~~---~~~~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~ 284 (350)
+++ |+++|.+|||......|... .......++..|..+...++++||++| ..|||++++.+. ..|..|++|++.
T Consensus 80 ~~~-Gi~~e~~~Py~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~~ik~aL~~~gPv~~~~~~~-~~~~~~~~~~~~ 157 (223)
T cd02619 80 ALK-GIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRVLKNNIEDIKEALAKGGPVVAGFDVY-SGFDRLKEGIIY 157 (223)
T ss_pred HHc-CCCccccCCCCCCCCCCCCCCccchhhcceeecceeEeCchhHHHHHHHHHHCCCEEEEEEcc-cchhcccCcccc
Confidence 665 99999999999887766532 123345788889888777799999999 569999999997 799999999862
Q ss_pred -----C-CC-CCCCCeEEEEEEEeecC--CeeEEEEEcCCCCCCCCCceEEEEecC
Q 018781 285 -----G-PC-GAELDHGVAAVGYGKSK--GSDYIIVKNSWGPKWGERGYIRMKRNT 331 (350)
Q Consensus 285 -----~-~~-~~~~~Hav~iVGyg~~~--g~~ywivkNSWG~~WG~~GY~~i~~~~ 331 (350)
. .+ ...++|||+|||||++. +++|||||||||++||++||+||+++.
T Consensus 158 ~~~~~~~~~~~~~~~Hav~ivGy~~~~~~~~~~~i~~NSwG~~wg~~Gy~~i~~~~ 213 (223)
T cd02619 158 EEIVYLLYEDGDLGGHAVVIVGYDDNYVEGKGAFIVKNSWGTDWGDNGYGRISYED 213 (223)
T ss_pred ccccccccCCCccCCeEEEEEeecCCCCCCCCEEEEEeCCCCccccCCEEEEehhh
Confidence 1 22 34579999999999986 889999999999999999999999984
No 15
>PTZ00462 Serine-repeat antigen protein; Provisional
Probab=100.00 E-value=9.5e-45 Score=371.26 Aligned_cols=201 Identities=24% Similarity=0.460 Sum_probs=160.3
Q ss_pred CCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccChHHhhhhcCC-CCCCCCCCc-hHHHHHHHHHhCCCCCCCCCcc
Q 018781 145 VTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSEQELIDCDTS-FNNGCNGGL-MDYAFKYIVASGGLHKEEDYPY 222 (350)
Q Consensus 145 v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~q~l~~c~~~-~~~gC~GG~-~~~a~~~~~~~~Gi~~e~~yPY 222 (350)
..||+|||.||+|||||+++++|++++++++..+.||+|+|+||+.. ++.||.||+ +..++.|+.+++|+++|++|||
T Consensus 544 ~i~VKDQG~CGSCWAFASaaaLES~~cIkgg~~v~LSeQqLVDCs~~~gn~GC~GG~~~~efl~yI~e~GgLptESdYPY 623 (1004)
T PTZ00462 544 KIQIEDQGNCAISWIFASKYHLETIKCMKGYEPHAISALYIANCSKGEHKDRCDEGSNPLEFLQIIEDNGFLPADSNYLY 623 (1004)
T ss_pred CCCcccCCcchHHHHHHHHHHHHHHHHHhcCCCcccCHHHHHhcccccCCCCCCCCCcHHHHHHHHHHcCCCcccccCCC
Confidence 47999999999999999999999999999999999999999999853 468999997 5556689988877999999999
Q ss_pred cc--CCCccCCCcc-----------------CceeEEEeeeEecCCC----c----HHHHHHHH-hcCCcEEEEeecCcc
Q 018781 223 LM--EEGTCEDKKE-----------------EMEVVTISGYQDVPEN----D----EQSLLKAL-AHQPVSVAIEASGTD 274 (350)
Q Consensus 223 ~~--~~~~c~~~~~-----------------~~~~~~i~~~~~v~~~----~----~~~i~~al-~~GPV~v~i~~~~~~ 274 (350)
.. ..+.|+.... ....+.+.+|..+... + ++.|+++| .+|||+|+|++. +
T Consensus 624 t~k~~~g~Cp~~~~~w~n~~~~~kll~~~~~~~~~i~~kgY~~~~s~~~~~n~d~~i~~IK~eI~~kGPVaV~IdAs--d 701 (1004)
T PTZ00462 624 NYTKVGEDCPDEEDHWMNLLDHGKILNHNKKEPNSLDGKAYRAYESEHFHDKMDAFIKIIKDEIMNKGSVIAYIKAE--N 701 (1004)
T ss_pred ccCCCCCCCCCCcccccccccccccccccccccceeeccceEEecccccccchhhHHHHHHHHHHhcCCEEEEEEee--h
Confidence 75 4567863211 0112334556555431 1 46888888 469999999985 6
Q ss_pred ccccc-CCeeeC-CCCC-CCCeEEEEEEEeec-----CCeeEEEEEcCCCCCCCCCceEEEEecCCCCCCcccccccccc
Q 018781 275 FQFYS-GGVFTG-PCGA-ELDHGVAAVGYGKS-----KGSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMASI 346 (350)
Q Consensus 275 f~~y~-~Giy~~-~~~~-~~~Hav~iVGyg~~-----~g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~~ 346 (350)
|+.|. +|||.. .|+. .++|||+|||||.. .+++|||||||||+.|||+|||||.|.. .++|||.....+
T Consensus 702 f~~Y~~sGIyv~~~Cgs~~~nHAVlIVGYGt~in~eg~gk~YWIVRNSWGt~WGEnGYFKI~r~g---~n~CGin~i~t~ 778 (1004)
T PTZ00462 702 VLGYEFNGKKVQNLCGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGYFKVDMYG---PSHCEDNFIHSV 778 (1004)
T ss_pred HHhhhcCCccccCCCCCCcCCceEEEEEecccccccCCCCceEEEEcCCCCCcCCCeEEEEEeCC---CCCCccchheee
Confidence 88885 898654 6874 57999999999973 2579999999999999999999999842 358999888887
Q ss_pred eecC
Q 018781 347 PLKK 350 (350)
Q Consensus 347 p~~~ 350 (350)
|+++
T Consensus 779 ~~fn 782 (1004)
T PTZ00462 779 VIFN 782 (1004)
T ss_pred eeEe
Confidence 7753
No 16
>KOG1544 consensus Predicted cysteine proteinase TIN-ag [General function prediction only]
Probab=100.00 E-value=3.6e-42 Score=309.60 Aligned_cols=262 Identities=27% Similarity=0.501 Sum_probs=201.7
Q ss_pred HHHHHhccCCCcEEEE-cccCCCCChHhHhhhhcCCCCCCCCCCCCCCcc-cccccCCCCCeeeccCC--CCCCccccCC
Q 018781 77 KHIDQRNKEVTSYWLG-LNEFADMSHEEFKNKYLGLKPQFPTRRQPSAEF-SYRDVKALPKSVDWRKK--GAVTPVKNQG 152 (350)
Q Consensus 77 ~~I~~~N~~~~s~~~g-~N~fsDlt~~E~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~lP~~~Dwr~~--g~v~pV~dQg 152 (350)
.+|+++|..+.+|.++ +.+|..||.++-.+..++.-++......-...+ ......+||+.||-|++ +++.|+.|||
T Consensus 151 d~iE~in~G~YgW~A~NYSaFWGmtL~DGiKyRLGTL~Ps~sv~nMNEi~~~l~p~~~LPE~F~As~KWp~liH~plDQg 230 (470)
T KOG1544|consen 151 DMIEAINQGNYGWQAGNYSAFWGMTLDDGIKYRLGTLRPSSSVMNMNEIYTVLNPGEVLPEAFEASEKWPNLIHEPLDQG 230 (470)
T ss_pred HHHHHHhcCCccccccchhhhhcccccccceeeecccCchhhhhhHHhHhhccCcccccchhhhhhhcCCccccCccccC
Confidence 4688999877889887 348999999887776666543332110000000 11223589999999987 8999999999
Q ss_pred CCcchHHHHHHHHHHHHHHHHcCC--CcccChHHhhhhcCCCCCCCCCCchHHHHHHHHHhCCCCCCCCCccccC----C
Q 018781 153 SCGSCWAFSTVAAVEGINQIVSGN--LTSLSEQELIDCDTSFNNGCNGGLMDYAFKYIVASGGLHKEEDYPYLME----E 226 (350)
Q Consensus 153 ~cGsCwAfA~~~~lE~~~~~~~~~--~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~~Gi~~e~~yPY~~~----~ 226 (350)
+|++.|||+++++...+++|++.. ...||+|+|++|.....+||.||+...|+=|+.+. |++...+|||... +
T Consensus 231 nCa~SWafSTaavasDRiAI~S~GR~t~~LSpQnLlSC~~h~q~GC~gG~lDRAWWYlRKr-GvVsdhCYP~~~dQ~~~~ 309 (470)
T KOG1544|consen 231 NCAGSWAFSTAAVASDRVAIHSLGRMTPVLSPQNLLSCDTHQQQGCRGGRLDRAWWYLRKR-GVVSDHCYPFSGDQAGPA 309 (470)
T ss_pred CcccceeeeeehhccceeEEeeccccccccChHHhcchhhhhhccCccCcccchheeeecc-cccccccccccCCCCCCC
Confidence 999999999999999999988743 36899999999997778999999999999999887 8999999999752 2
Q ss_pred Ccc------------------CCCccC-ceeEEEeeeEecCCCcHHHHHHHH-hcCCcEEEEeecCcccccccCCeeeCC
Q 018781 227 GTC------------------EDKKEE-MEVVTISGYQDVPENDEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFTGP 286 (350)
Q Consensus 227 ~~c------------------~~~~~~-~~~~~i~~~~~v~~~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~~~ 286 (350)
+.| .....+ ...++.+.-+.|++ ++++|++.| .+|||-+.|.|- ++|..|++|||.+.
T Consensus 310 ~~C~m~sR~~grgkRqat~~CPn~~~~Sn~iyq~tPPYrVSS-nE~eImkElM~NGPVQA~m~VH-EDFF~YkgGiY~H~ 387 (470)
T KOG1544|consen 310 PPCMMHSRAMGRGKRQATAHCPNSYVNSNDIYQVTPPYRVSS-NEKEIMKELMENGPVQALMEVH-EDFFLYKGGIYSHT 387 (470)
T ss_pred CCceeeccccCcccccccCcCCCcccccCceeeecCCeeccC-CHHHHHHHHHhCCChhhhhhhh-hhhhhhccceeecc
Confidence 233 221111 12344444455655 567777766 899999999885 99999999999873
Q ss_pred CC---------CCCCeEEEEEEEeecC-----CeeEEEEEcCCCCCCCCCceEEEEecCCCCCCccccccccc
Q 018781 287 CG---------AELDHGVAAVGYGKSK-----GSDYIIVKNSWGPKWGERGYIRMKRNTGKPEGLCGINKMAS 345 (350)
Q Consensus 287 ~~---------~~~~Hav~iVGyg~~~-----g~~ywivkNSWG~~WG~~GY~~i~~~~~~~~~~CgI~~~~~ 345 (350)
.. ..+.|+|.|.|||++. ..+|||..||||+.|||+|||||-||+| +|-|+++..
T Consensus 388 ~~~~~~~e~yr~~gtHsVk~tGWG~~~~~~G~~~KyW~aANSWG~~WGE~GYFriLRGvN----ecdIEsfvI 456 (470)
T KOG1544|consen 388 PVSLGRPERYRRHGTHSVKITGWGEETLPDGRTLKYWTAANSWGPAWGERGYFRILRGVN----ECDIESFVI 456 (470)
T ss_pred ccccCCchhhhhcccceEEEeecccccCCCCCeeEEEEeecccccccccCceEEEecccc----chhhhHhhh
Confidence 21 2468999999999853 2589999999999999999999999974 799998754
No 17
>COG4870 Cysteine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.97 E-value=6.7e-31 Score=242.54 Aligned_cols=198 Identities=29% Similarity=0.429 Sum_probs=132.7
Q ss_pred CCCCCeeeccCCCCCCccccCCCCcchHHHHHHHHHHHHHHHHcCCCcccChHHhhhhcCC-CCCCC-----CCCchHHH
Q 018781 131 KALPKSVDWRKKGAVTPVKNQGSCGSCWAFSTVAAVEGINQIVSGNLTSLSEQELIDCDTS-FNNGC-----NGGLMDYA 204 (350)
Q Consensus 131 ~~lP~~~Dwr~~g~v~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~~~~~lS~q~l~~c~~~-~~~gC-----~GG~~~~a 204 (350)
..+|+.||||+.|.|+||||||.||+|||||+++++|+.+.-.. .-++|+..+..-... ...+| +||....+
T Consensus 97 ~s~~~~fd~r~~g~vs~v~dQg~~Gscwaf~t~~sles~l~~~~--~w~~s~~nm~~ll~~~ye~~fd~~~~d~g~~~m~ 174 (372)
T COG4870 97 ASLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLESYLNPES--AWDFSENNMKNLLGVPYEKGFDYTSNDGGNADMS 174 (372)
T ss_pred ccchhheeeeccCCcccccccCcccceEeeeehhhhhheecccc--cccccccchhhhcCCCccccCCCccccCCccccc
Confidence 35899999999999999999999999999999999999876433 334454433322111 12233 47888888
Q ss_pred HHHHHHhCCCCCCCCCccccCCCccCCCccCceeEEEeeeEecCC----CcHHHHHHHH-hcCCcEEEEeecCccccccc
Q 018781 205 FKYIVASGGLHKEEDYPYLMEEGTCEDKKEEMEVVTISGYQDVPE----NDEQSLLKAL-AHQPVSVAIEASGTDFQFYS 279 (350)
Q Consensus 205 ~~~~~~~~Gi~~e~~yPY~~~~~~c~~~~~~~~~~~i~~~~~v~~----~~~~~i~~al-~~GPV~v~i~~~~~~f~~y~ 279 (350)
..|+.+..|.+.+.+-||......|..... ...++..-..++. -+...|++++ ..|-+...|.+....+....
T Consensus 175 ~a~l~e~sgpv~et~d~y~~~s~~~~~~~p--~~k~~~~~~~i~~~~~~LdnG~i~~~~~~yg~~s~~~~id~~~~~~~~ 252 (372)
T COG4870 175 AAYLTEWSGPVYETDDPYSENSYFSPTNLP--VTKHVQEAQIIPSRKKYLDNGNIKAMFGFYGAVSSSMYIDATNSLGIC 252 (372)
T ss_pred cccccccCCcchhhcCccccccccCCcCCc--hhhccccceecccchhhhcccchHHHHhhhccccceeEEecccccccc
Confidence 888888899999999999887666654221 1111111122221 1233466666 45655433333212332222
Q ss_pred CCeeeCCCCCCCCeEEEEEEEeec----------CCeeEEEEEcCCCCCCCCCceEEEEecCC
Q 018781 280 GGVFTGPCGAELDHGVAAVGYGKS----------KGSDYIIVKNSWGPKWGERGYIRMKRNTG 332 (350)
Q Consensus 280 ~Giy~~~~~~~~~Hav~iVGyg~~----------~g~~ywivkNSWG~~WG~~GY~~i~~~~~ 332 (350)
-+.+........+|||+||||||. .|.++||||||||+.||++|||||+|..-
T Consensus 253 ~~~~~~~s~~~~gHAv~iVGyDDs~~~n~~~~~~~g~GAfiikNSWGt~wG~~GYfwisY~ya 315 (372)
T COG4870 253 IPYPYVDSGENWGHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGENGYFWISYYYA 315 (372)
T ss_pred cCCCCCCccccccceEEEEeccccccccccccCCCCCceEEEECccccccccCceEEEEeeec
Confidence 334433333567999999999993 35789999999999999999999999854
No 18
>cd00585 Peptidase_C1B Peptidase C1B subfamily (MEROPS database nomenclature); composed of eukaryotic bleomycin hydrolases (BH) and bacterial aminopeptidases C (pepC). The proteins of this subfamily contain a large insert relative to the C1A peptidase (papain) subfamily. BH is a cysteine peptidase that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. Bleomycin, a glycopeptide derived from the bacterium Streptomyces verticillus, is an effective anticancer drug due to its ability to induce DNA strand breaks. Human BH is the major cause of tumor cell resistance to bleomycin chemotherapy, and is also genetically linked to Alzheimer's disease. In addition to its peptidase activity, the yeast BH (Gal6) binds DNA and acts as a repressor in the Gal4 regulatory system. BH forms a hexameric ring barrel structure w
Probab=99.91 E-value=6e-24 Score=206.41 Aligned_cols=179 Identities=25% Similarity=0.372 Sum_probs=127.6
Q ss_pred CccccCCCCcchHHHHHHHHHHHHHHHH-cCCCcccChHHhhhhcC----------------C-----------CCCCCC
Q 018781 146 TPVKNQGSCGSCWAFSTVAAVEGINQIV-SGNLTSLSEQELIDCDT----------------S-----------FNNGCN 197 (350)
Q Consensus 146 ~pV~dQg~cGsCwAfA~~~~lE~~~~~~-~~~~~~lS~q~l~~c~~----------------~-----------~~~gC~ 197 (350)
.||+||+..|.||.||+...+|+.+.++ +.+.++||+.+++..++ . .....+
T Consensus 55 ~~vtnQ~~SGrCW~FA~Ln~lr~~~~k~~~~~~felSq~Yl~f~dklEkaN~fle~ii~~~~~~~~~R~v~~ll~~~~~D 134 (437)
T cd00585 55 EPVTNQKSSGRCWLFAALNVLRHQFMKKLNLKEFEFSQSYLFFWDKLEKANYFLENIIETADEPLDDRLVQFLLANPQND 134 (437)
T ss_pred CCcccCCCCchhHHHHCHHHHHHHHHHHcCCCCEEeCcHHHHHHHHHHHHHHHHHHHHHHhcCCCccHHHHHHHhCCcCC
Confidence 4899999999999999999999988774 45679999998876221 0 134568
Q ss_pred CCchHHHHHHHHHhCCCCCCCCCccccC--C-------------------------C-----------------------
Q 018781 198 GGLMDYAFKYIVASGGLHKEEDYPYLME--E-------------------------G----------------------- 227 (350)
Q Consensus 198 GG~~~~a~~~~~~~~Gi~~e~~yPY~~~--~-------------------------~----------------------- 227 (350)
||....+...+.++ |+++++.||-+.. . +
T Consensus 135 GGqw~m~~~li~KY-GvVPk~~~pet~~s~~t~~~n~~L~~kLr~~a~~lr~~~~~~~~~~~l~~~~~~~~~~iy~il~~ 213 (437)
T cd00585 135 GGQWDMLVNLIEKY-GLVPKSVMPESFNSENSRRLNYLLNRKLREDALELRKLVAKGASKEEIEAKKEEMLKEVYRILAI 213 (437)
T ss_pred CCchHHHHHHHHHc-CCCcccccCCCcCccchHHHHHHHHHHHHHHHHHHHHHHhcCCcHHHHHHHHHHHHHHHHHHHHH
Confidence 99999999999886 9999999984210 0 0
Q ss_pred ---ccCCC---------------c--c----------------------Cc------eeE-----------EEeeeEecC
Q 018781 228 ---TCEDK---------------K--E----------------------EM------EVV-----------TISGYQDVP 248 (350)
Q Consensus 228 ---~c~~~---------------~--~----------------------~~------~~~-----------~i~~~~~v~ 248 (350)
.++.. . . +. ..+ ....|.++|
T Consensus 214 ~lG~pP~~F~~~y~dkd~~~~~~~~~TP~~F~~~yv~~~~~dyV~l~~~p~~~~p~~~~y~ve~~~Nv~~g~~~~y~Nvp 293 (437)
T cd00585 214 ALGEPPEKFDWEYRDKDKKYHEIKELTPLEFYKKYVKFDLDDYVSLINDPRPDKPYNKLYTVEYLGNVVGGRPILYLNVP 293 (437)
T ss_pred HcCCCCceEEEEEEeCCCCeeeCCCcCHHHHHHHhcCCCccceEEEEeCCCCCCCCCceEEEecCCcccccccceEEecC
Confidence 00000 0 0 00 000 011222332
Q ss_pred CCcHHHHH----HHHhc-CCcEEEEeecCcccccccCCeeeCC---------------------C-CCCCCeEEEEEEEe
Q 018781 249 ENDEQSLL----KALAH-QPVSVAIEASGTDFQFYSGGVFTGP---------------------C-GAELDHGVAAVGYG 301 (350)
Q Consensus 249 ~~~~~~i~----~al~~-GPV~v~i~~~~~~f~~y~~Giy~~~---------------------~-~~~~~Hav~iVGyg 301 (350)
++.++ ++|.. +||.+++++. .|..|++||++.. | .+..+|||+|||||
T Consensus 294 ---~d~l~~~~~~~L~~g~pV~~g~Dv~--~~~~~k~GI~d~~~~~~~~~f~~~~~~~KaeRl~~~es~~tHAM~ivGv~ 368 (437)
T cd00585 294 ---MDVLKKAAIAQLKDGEPVWFGCDVG--KFSDRKSGILDTDLFDYELLFGIDFGLNKAERLDYGESLMTHAMVLTGVD 368 (437)
T ss_pred ---HHHHHHHHHHHHhcCCCEEEEEEcC--hhhccCCccccCcccchhhhcCccccCCHHHHHhhcCCcCCeEEEEEEEE
Confidence 34444 56654 5999999996 5779999999653 1 23468999999999
Q ss_pred ecC-Ce-eEEEEEcCCCCCCCCCceEEEEec
Q 018781 302 KSK-GS-DYIIVKNSWGPKWGERGYIRMKRN 330 (350)
Q Consensus 302 ~~~-g~-~ywivkNSWG~~WG~~GY~~i~~~ 330 (350)
.+. |+ .||+||||||+.||++||++|+++
T Consensus 369 ~D~~g~p~yw~VkNSWG~~~G~~Gy~~ms~~ 399 (437)
T cd00585 369 LDEDGKPVKWKVENSWGEKVGKKGYFVMSDD 399 (437)
T ss_pred ecCCCCcceEEEEcccCCCCCCCcceehhHH
Confidence 854 65 699999999999999999999875
No 19
>PF03051 Peptidase_C1_2: Peptidase C1-like family This family is a subfamily of the Prosite entry; InterPro: IPR004134 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of proteins belong to MEROPS peptidase family C1, sub-family C1B (bleomycin hydrolase, clan CA). This family contains prokaryotic and eukaryotic aminopeptidases and bleomycin hydrolases.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3PW3_F 2CB5_A 1CB5_C 2DZZ_A 2E02_A 2E01_A 2E03_A 1A6R_A 1GCB_A 3GCB_A ....
Probab=99.72 E-value=1.7e-16 Score=154.58 Aligned_cols=179 Identities=23% Similarity=0.364 Sum_probs=107.3
Q ss_pred CccccCCCCcchHHHHHHHHHHHHHHHHcC-CCcccChHHhhhh----------------cCC-----------CCCCCC
Q 018781 146 TPVKNQGSCGSCWAFSTVAAVEGINQIVSG-NLTSLSEQELIDC----------------DTS-----------FNNGCN 197 (350)
Q Consensus 146 ~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~lS~q~l~~c----------------~~~-----------~~~gC~ 197 (350)
.||.||...|.||.||+...++..+.++.+ +.++||+.++... ... .....+
T Consensus 56 ~~vtnQk~SGRCW~FA~lN~lR~~~~kk~~l~~felSq~Yl~F~DKlEKaN~fLe~ii~~~~~~~d~R~v~~ll~~~~~D 135 (438)
T PF03051_consen 56 GPVTNQKSSGRCWLFAALNVLRHEIMKKLNLKDFELSQNYLFFWDKLEKANYFLENIIDTADEPLDDRLVRFLLKNPVSD 135 (438)
T ss_dssp -S--B--BSSTHHHHHHHHHHHHHHHHHCT-SS--B-HHHHHHHHHHHHHHHHHHHHHHCCTS-TTSHHHHHHHHSTT-S
T ss_pred CCCCCCCCCCCcchhhchHHHHHHHHHHcCCCceEeechHHHHHHHHHHHHHHHHHHHHHhcCCcchHHHHHHHhcCCCC
Confidence 499999999999999999999999888776 6799999988622 111 123468
Q ss_pred CCchHHHHHHHHHhCCCCCCCCCccccCC---------------------------------------------------
Q 018781 198 GGLMDYAFKYIVASGGLHKEEDYPYLMEE--------------------------------------------------- 226 (350)
Q Consensus 198 GG~~~~a~~~~~~~~Gi~~e~~yPY~~~~--------------------------------------------------- 226 (350)
||....+...++++ |||+.+.||-+...
T Consensus 136 GGqw~~~~nli~KY-GvVPk~~mpet~~s~~t~~~n~~l~~~Lr~~a~~LR~~~~~~~~~~~l~~~k~~~l~~iy~il~~ 214 (438)
T PF03051_consen 136 GGQWDMVVNLIKKY-GVVPKSVMPETFSSSNTSEMNEMLNTKLREYALELRKLVKAGKSEEELRKLKEEMLAEIYRILAI 214 (438)
T ss_dssp -B-HHHHHHHHHHH----BGGGSTTGCGCHBHHHHHHHHHHHHHHHHHHHHHHHHTTTTCHHHHHHHHHHHHHHHHHHHH
T ss_pred CCchHHHHHHHHHc-CcCcHhhCCCCCCCCChHHHHHHHHHHHHHHHHHHHHHHHcCCCHHHHHHHHHHHHHHHHHHHHH
Confidence 99999999999887 89999999842100
Q ss_pred --CccCCC------ccCce-----------------------eEEE---------------------------eeeEecC
Q 018781 227 --GTCEDK------KEEME-----------------------VVTI---------------------------SGYQDVP 248 (350)
Q Consensus 227 --~~c~~~------~~~~~-----------------------~~~i---------------------------~~~~~v~ 248 (350)
+.++.. ..... .+.+ ..|.++
T Consensus 215 ~lG~PP~~F~~ey~dkd~~~~~~~~~TP~eF~~kyv~~~~ddyVsLin~P~~~~py~~~y~ve~~~Nv~~g~~~~ylNv- 293 (438)
T PF03051_consen 215 YLGEPPEKFTWEYRDKDKKYHRGKNYTPLEFYKKYVGFDLDDYVSLINDPRSHHPYNKLYTVEYLGNVVGGRPVRYLNV- 293 (438)
T ss_dssp HH---SSSEEEEEE-TTS-EEEEEEE-HHHHHHHCTTS-GGGEEEEE--T-TTS-TTCEEEETTTTSSTT-EEEEEEE--
T ss_pred HcCCCChheeEEEeccccccccccccCchhHHHHHhCCCCcceEEEeeCCCccCccceeEEEccCCCEECCcceeEecc-
Confidence 000000 00000 0000 012222
Q ss_pred CCcHHHHHH----HHhcC-CcEEEEeecCcccccccCCeeeCCC----------------------CCCCCeEEEEEEEe
Q 018781 249 ENDEQSLLK----ALAHQ-PVSVAIEASGTDFQFYSGGVFTGPC----------------------GAELDHGVAAVGYG 301 (350)
Q Consensus 249 ~~~~~~i~~----al~~G-PV~v~i~~~~~~f~~y~~Giy~~~~----------------------~~~~~Hav~iVGyg 301 (350)
..+.|++ +|..| ||-.+-+|. . +...+.||.+... .+..+|||+|||.+
T Consensus 294 --pid~lk~~~i~~Lk~G~~VwfgcDV~-k-~~~~k~Gi~D~~~~d~~~~fg~~~~~~K~~Rl~~~eS~~tHAM~itGv~ 369 (438)
T PF03051_consen 294 --PIDELKDAAIKSLKAGYPVWFGCDVG-K-FFDRKNGIMDTDLYDYDSLFGVDFNMSKAERLDYGESTMTHAMVITGVD 369 (438)
T ss_dssp ---HHHHHHHHHHHHHTT--EEEEEETT-T-TEETTTTEE-TTSB-HHHHHT--S-S-HHHHHHTTSS--EEEEEEEEEE
T ss_pred --CHHHHHHHHHHHHHcCCcEEEeccCC-c-cccccchhhccchhhhhhhhccccccCHHHHHHhCCCCCceeEEEEEEE
Confidence 2444444 45666 999999996 3 4556788875422 02248999999999
Q ss_pred e-cCCe-eEEEEEcCCCCCCCCCceEEEEec
Q 018781 302 K-SKGS-DYIIVKNSWGPKWGERGYIRMKRN 330 (350)
Q Consensus 302 ~-~~g~-~ywivkNSWG~~WG~~GY~~i~~~ 330 (350)
. ++|+ .+|+|+||||++.|.+||+.|+..
T Consensus 370 ~D~~g~p~~wkVeNSWG~~~g~kGy~~msd~ 400 (438)
T PF03051_consen 370 LDEDGKPVRWKVENSWGTDNGDKGYFYMSDD 400 (438)
T ss_dssp E-TTSSEEEEEEE-SBTTTSTBTTEEEEEHH
T ss_pred eccCCCeeEEEEEcCCCCCCCCCcEEEECHH
Confidence 8 5565 699999999999999999999853
No 20
>PF08246 Inhibitor_I29: Cathepsin propeptide inhibitor domain (I29); InterPro: IPR013201 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This entry represents a peptidase inhibitor domain, which belongs to MEROPS peptidase inhibitor family I29. The domain is also found at the N terminus of a variety of peptidase precursors that belong to MEROPS peptidase subfamily C1A; these include cathepsin L, papain, and procaricain (P10056 from SWISSPROT) []. It forms an alpha-helical domain that runs through the substrate-binding site, preventing access. Removal of this region by proteolytic cleavage results in activation of the enzyme. This domain is also found, in one or more copies, in a variety of cysteine peptidase inhibitors such as salarin [].; PDB: 3QT4_A 3QJ3_A 2C0Y_A 2L95_A 1CJL_A 1CS8_A 7PCK_A 1BY8_A 1PCI_A 2O6X_A ....
Probab=99.67 E-value=1.3e-16 Score=112.91 Aligned_cols=57 Identities=53% Similarity=0.886 Sum_probs=50.3
Q ss_pred HHHHHHHhCCccCChHHHHHHHHHHHHHHHHHHHhcc-CCCcEEEEcccCCCCChHhH
Q 018781 48 FESWMSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNK-EVTSYWLGLNEFADMSHEEF 104 (350)
Q Consensus 48 f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~-~~~s~~~g~N~fsDlt~~E~ 104 (350)
|++|+++|+|.|.+.+|+..|+.+|.+|++.|++||+ .+.+|++|+|+|||||++||
T Consensus 1 F~~~~~~~~k~Y~~~~e~~~R~~~F~~N~~~I~~~N~~~~~~~~~~~N~fsD~t~eEf 58 (58)
T PF08246_consen 1 FEQFKKKYGKSYKSAEEEARRFAIFKENLRRIEEHNANGNNTYKLGLNQFSDMTPEEF 58 (58)
T ss_dssp HHHHHHHCT---SSHHHHHHHHHHHHHHHHHHHHHHHTTSSSEEE-SSTTTTSSHHHH
T ss_pred CHHHHHHcCCCCCCHHHHHHHHHHHHHHHHHHHHHhcCCCCCeEEeCccccCcChhhC
Confidence 8999999999999999999999999999999999994 45899999999999999997
No 21
>smart00848 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29). This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a s
Probab=99.52 E-value=1.6e-14 Score=101.69 Aligned_cols=56 Identities=61% Similarity=0.990 Sum_probs=53.1
Q ss_pred HHHHHHHhCCccCChHHHHHHHHHHHHHHHHHHHhccCC-CcEEEEcccCCCCChHh
Q 018781 48 FESWMSKHGKTYKCIEEKLHRFEIFKENLKHIDQRNKEV-TSYWLGLNEFADMSHEE 103 (350)
Q Consensus 48 f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~I~~~N~~~-~s~~~g~N~fsDlt~~E 103 (350)
|++|+.+|+|.|.+.+|...|+.+|.+|++.|+.||+.+ .+|++|+|+|+|||++|
T Consensus 1 f~~~~~~~~k~y~~~~e~~~r~~~f~~n~~~i~~~N~~~~~~~~~~~N~fsDlt~eE 57 (57)
T smart00848 1 FEQWKKKYGKSYSSEEEELRRFEIFKENLKFIEEHNKKNDHSYTLGLNQFADLTNEE 57 (57)
T ss_pred ChHHHHHhCCCCCCHHHHHHHHHHHHHHHHHHHHHHhcCCCCeEecCcccccCCCCC
Confidence 689999999999999999999999999999999999876 89999999999999986
No 22
>COG3579 PepC Aminopeptidase C [Amino acid transport and metabolism]
Probab=98.87 E-value=1.4e-08 Score=93.59 Aligned_cols=75 Identities=21% Similarity=0.249 Sum_probs=54.9
Q ss_pred ccccCCCCcchHHHHHHHHHHHHHHHHcC-CCcccChHHhhhhcCC---------------------------CCCCCCC
Q 018781 147 PVKNQGSCGSCWAFSTVAAVEGINQIVSG-NLTSLSEQELIDCDTS---------------------------FNNGCNG 198 (350)
Q Consensus 147 pV~dQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~lS~q~l~~c~~~---------------------------~~~gC~G 198 (350)
||-||...|-||-||+..++.-.+...-+ +.+.||..++.-.++. ...--+|
T Consensus 59 ~vtNQk~SGRCWmFAAlNtfRhk~~~el~le~fElSQaytfFwDKlEKaN~FleqIi~tadq~ldsRlv~~LL~~PqqDG 138 (444)
T COG3579 59 KVTNQKQSGRCWMFAALNTFRHKLISELKLEDFELSQAYTFFWDKLEKANWFLEQIIETADQELDSRLVSFLLATPQQDG 138 (444)
T ss_pred ccccccccceehHHHHHHHHHHHHHHhcCcceeehhhHHHHHHHHHHHhhHHHHHHHhhcccchHHHHHHHHHcCccccC
Confidence 89999999999999999988765554433 3477888766543320 1223478
Q ss_pred CchHHHHHHHHHhCCCCCCCCCcc
Q 018781 199 GLMDYAFKYIVASGGLHKEEDYPY 222 (350)
Q Consensus 199 G~~~~a~~~~~~~~Gi~~e~~yPY 222 (350)
|-.......+.++ |+++.++||-
T Consensus 139 GQwdM~v~l~eKY-GvVpK~~ype 161 (444)
T COG3579 139 GQWDMFVSLFEKY-GVVPKSVYPE 161 (444)
T ss_pred chHHHHHHHHHHh-CCCchhhccc
Confidence 8888888877776 8999999974
No 23
>KOG4128 consensus Bleomycin hydrolases and aminopeptidases of cysteine protease family [Amino acid transport and metabolism]
Probab=97.48 E-value=0.00011 Score=68.03 Aligned_cols=75 Identities=24% Similarity=0.277 Sum_probs=56.3
Q ss_pred CccccCCCCcchHHHHHHHHHHHHHHHHcC-CCcccChHHhhhhcC--------------------C---------CCCC
Q 018781 146 TPVKNQGSCGSCWAFSTVAAVEGINQIVSG-NLTSLSEQELIDCDT--------------------S---------FNNG 195 (350)
Q Consensus 146 ~pV~dQg~cGsCwAfA~~~~lE~~~~~~~~-~~~~lS~q~l~~c~~--------------------~---------~~~g 195 (350)
+||.||...|-||-|+....+.--+.++-+ ....||..+|+-.++ . .+..
T Consensus 63 ~pvtnqkssGrcWift~ln~lrl~~~~kLnl~eFElSqayLFFwdKlErcnyFL~~vvd~a~r~ep~DgRlvq~Ll~nP~ 142 (457)
T KOG4128|consen 63 QPVTNQKSSGRCWIFTGLNLLRLEMDRKLNLPEFELSQAYLFFWDKLERCNYFLWTVVDLAMRCEPLDGRLVQNLLKNPV 142 (457)
T ss_pred cccccCcCCCceEEEechhHHHHHHHhcCCcchhhhhhHHHHHHHHHHHHHHHHHHHHHHHhhcCCcccHHHHHHHhCCC
Confidence 699999999999999999987655554433 347889888763221 0 1334
Q ss_pred CCCCchHHHHHHHHHhCCCCCCCCCc
Q 018781 196 CNGGLMDYAFKYIVASGGLHKEEDYP 221 (350)
Q Consensus 196 C~GG~~~~a~~~~~~~~Gi~~e~~yP 221 (350)
-+||....-++.++++ |+.+.++||
T Consensus 143 ~DGGqw~MfvNlVkKY-GviPKkcy~ 167 (457)
T KOG4128|consen 143 PDGGQWQMFVNLVKKY-GVIPKKCYL 167 (457)
T ss_pred CCCchHHHHHHHHHHh-CCCcHHhcc
Confidence 5788888888888776 899999996
No 24
>PF13529 Peptidase_C39_2: Peptidase_C39 like family; PDB: 3ERV_A.
Probab=97.32 E-value=0.0025 Score=52.15 Aligned_cols=57 Identities=21% Similarity=0.438 Sum_probs=33.8
Q ss_pred CcHHHHHHHHhcC-CcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeecCCeeEEEEEcCC
Q 018781 250 NDEQSLLKALAHQ-PVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKSKGSDYIIVKNSW 315 (350)
Q Consensus 250 ~~~~~i~~al~~G-PV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~g~~ywivkNSW 315 (350)
.+.+.|++.|..| ||.+.+....... ..+.+. ....+|.|+|+||+++. +++|-.+|
T Consensus 87 ~~~~~i~~~i~~G~Pvi~~~~~~~~~~---~~~~~~---~~~~~H~vvi~Gy~~~~---~~~v~DP~ 144 (144)
T PF13529_consen 87 ASFDDIKQEIDAGRPVIVSVNSGWRPP---NGDGYD---GTYGGHYVVIIGYDEDG---YVYVNDPW 144 (144)
T ss_dssp S-HHHHHHHHHTT--EEEEEETTSS-----TTEEEE---E-TTEEEEEEEEE-SSE----EEEE-TT
T ss_pred CcHHHHHHHHHCCCcEEEEEEcccccC---CCCCcC---CCcCCEEEEEEEEeCCC---EEEEeCCC
Confidence 4679999999776 9999986421111 111111 13468999999999843 78888877
No 25
>PF05543 Peptidase_C47: Staphopain peptidase C47; InterPro: IPR008750 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belong to the peptidase family C47 (staphopain family, clan CA). The type example are the staphopains, which are one of four major families of proteinases secreted by the Gram-positive Staphylococcus aureus. These staphylococcal cysteine proteases are secreted as preproenzymes that are proteolytically cleaved to generate the mature enzyme [, , ].; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 1X9Y_D 1Y4H_B 1PXV_B 1CV8_A.
Probab=96.90 E-value=0.015 Score=49.71 Aligned_cols=118 Identities=19% Similarity=0.297 Sum_probs=65.2
Q ss_pred ccCCCCcchHHHHHHHHHHHHHHH--------HcCCCcccChHHhhhhcCCCCCCCCCCchHHHHHHHHHhCCCCCCCCC
Q 018781 149 KNQGSCGSCWAFSTVAAVEGINQI--------VSGNLTSLSEQELIDCDTSFNNGCNGGLMDYAFKYIVASGGLHKEEDY 220 (350)
Q Consensus 149 ~dQg~cGsCwAfA~~~~lE~~~~~--------~~~~~~~lS~q~l~~c~~~~~~gC~GG~~~~a~~~~~~~~Gi~~e~~y 220 (350)
..||.-+-|-+||.+++|-+.... .+.-...+|+++|-+++- .+...++|.+..+ ..
T Consensus 17 EtQg~~pWCa~Ya~aailN~~~~~~~~~A~~iMr~~yPn~s~~~l~~~~~---------~~~~~i~y~ks~g-~~----- 81 (175)
T PF05543_consen 17 ETQGYNPWCAGYAMAAILNATTNTKIYNAKDIMRYLYPNVSEEQLKFTSL---------TPNQMIKYAKSQG-RN----- 81 (175)
T ss_dssp ---SSSS-HHHHHHHHHHHHHCT-S---HHHHHHHHSTTS-CCCHHH--B----------HHHHHHHHHHTT-EE-----
T ss_pred eccCcCcHHHHHHHHHHHHhhhCcCcCCHHHHHHHHCCCCCHHHHhhcCC---------CHHHHHHHHHHcC-cc-----
Confidence 358999999999999988764211 111124566666655542 3567777765442 11
Q ss_pred ccccCCCccCCCccCceeEEEeeeEecCCCcHHHHHHHHh-cCCcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEE
Q 018781 221 PYLMEEGTCEDKKEEMEVVTISGYQDVPENDEQSLLKALA-HQPVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVG 299 (350)
Q Consensus 221 PY~~~~~~c~~~~~~~~~~~i~~~~~v~~~~~~~i~~al~-~GPV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVG 299 (350)
+ .+..- ..+.+++++.+. +-|+.+....... ..+...+|||+|||
T Consensus 82 --------------------~-~~~n~-~~s~~eV~~~~~~nk~i~i~~~~v~~------------~~~~~~gHAlavvG 127 (175)
T PF05543_consen 82 --------------------P-QYNNR-MPSFDEVKKLIDNNKGIAILADRVEQ------------TNGPHAGHALAVVG 127 (175)
T ss_dssp --------------------E-EEECS----HHHHHHHHHTT-EEEEEEEETTS------------CTTB--EEEEEEEE
T ss_pred --------------------h-hHhcC-CCCHHHHHHHHHcCCCeEEEeccccc------------CCCCccceeEEEEe
Confidence 0 01111 114788888884 4677775554311 12345689999999
Q ss_pred Eee-cCCeeEEEEEcCC
Q 018781 300 YGK-SKGSDYIIVKNSW 315 (350)
Q Consensus 300 yg~-~~g~~ywivkNSW 315 (350)
|-. .+|.++.++=|=|
T Consensus 128 ya~~~~g~~~y~~WNPW 144 (175)
T PF05543_consen 128 YAKPNNGQKTYYFWNPW 144 (175)
T ss_dssp EEEETTSEEEEEEE-TT
T ss_pred eeecCCCCeEEEEeCCc
Confidence 988 5668899997777
No 26
>PF08127 Propeptide_C1: Peptidase family C1 propeptide; InterPro: IPR012599 This domain is found at the N-terminal of cathepsin B and cathepsin B-like peptidases that belong to MEROPS peptidase subfamily C1A. Cathepsin B are lysosomal cysteine proteinases belonging to the papain superfamily and are unique in their ability to act as both an endo- and an exopeptidases. They are synthesized as inactive zymogens. Activation of the peptidases occurs with the removal of the propeptide [, ]. ; GO: 0004197 cysteine-type endopeptidase activity, 0050790 regulation of catalytic activity; PDB: 1MIR_A 1PBH_A 2PBH_A 3PBH_A.
Probab=96.23 E-value=0.0041 Score=40.27 Aligned_cols=35 Identities=34% Similarity=0.439 Sum_probs=22.2
Q ss_pred HHHHHhccCCCcEEEEcccCCCCChHhHhhhhcCCCC
Q 018781 77 KHIDQRNKEVTSYWLGLNEFADMSHEEFKNKYLGLKP 113 (350)
Q Consensus 77 ~~I~~~N~~~~s~~~g~N~fsDlt~~E~~~~~~~~~~ 113 (350)
++|+.+|+.+.+|++|.| |.+.+.++++.++ |..+
T Consensus 4 e~I~~IN~~~~tWkAG~N-F~~~~~~~ik~Ll-Gv~~ 38 (41)
T PF08127_consen 4 EFIDYINSKNTTWKAGRN-FENTSIEYIKRLL-GVLP 38 (41)
T ss_dssp HHHHHHHHCT-SEEE-----SSB-HHHHHHCS--B-T
T ss_pred HHHHHHHcCCCcccCCCC-CCCCCHHHHHHHc-CCCC
Confidence 678999998899999999 8999999887654 5443
No 27
>PF14399 Transpep_BrtH: NlpC/p60-like transpeptidase
Probab=89.78 E-value=0.83 Score=43.01 Aligned_cols=56 Identities=20% Similarity=0.304 Sum_probs=36.1
Q ss_pred cHHHHHHHHhcC-CcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeecCCeeEEEEEc
Q 018781 251 DEQSLLKALAHQ-PVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKSKGSDYIIVKN 313 (350)
Q Consensus 251 ~~~~i~~al~~G-PV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~g~~ywivkN 313 (350)
..+.|+++|..| ||.+.++.+ +..|...-| .....+|.|+|+||+++++ .+.++-+
T Consensus 77 ~~~~l~~~l~~g~pv~~~~D~~---~lpy~~~~~---~~~~~~H~i~v~G~d~~~~-~~~v~D~ 133 (317)
T PF14399_consen 77 AWEELKEALDAGRPVIVWVDMY---YLPYRPNYY---KKHHADHYIVVYGYDEEED-VFYVSDP 133 (317)
T ss_pred HHHHHHHHHhCCCceEEEeccc---cCCCCcccc---ccccCCcEEEEEEEeCCCC-EEEEEcC
Confidence 356788888777 999998775 333333222 1223589999999997643 4555533
No 28
>COG4990 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=86.62 E-value=1.5 Score=37.74 Aligned_cols=51 Identities=22% Similarity=0.292 Sum_probs=35.7
Q ss_pred ecCCCcHHHHHHHHhcC-CcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeecCCeeEEEEEcCCC
Q 018781 246 DVPENDEQSLLKALAHQ-PVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKSKGSDYIIVKNSWG 316 (350)
Q Consensus 246 ~v~~~~~~~i~~al~~G-PV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~g~~ywivkNSWG 316 (350)
.++..+..+|+..|..| ||.+-.... -. ..-|+|+|.|||+. ++..-++||
T Consensus 117 d~tGksl~~ik~ql~kg~PV~iw~T~~----~~------------~s~H~v~itgyDk~----n~yynDpyG 168 (195)
T COG4990 117 DLTGKSLSDIKGQLLKGRPVVIWVTNF----HS------------YSIHSVLITGYDKY----NIYYNDPYG 168 (195)
T ss_pred cCcCCcHHHHHHHHhcCCcEEEEEecc----cc------------cceeeeEeeccccc----ceEeccccc
Confidence 45566899999999655 988765432 21 23599999999984 455557775
No 29
>cd00044 CysPc Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction.
Probab=79.76 E-value=7 Score=37.00 Aligned_cols=27 Identities=19% Similarity=0.475 Sum_probs=23.7
Q ss_pred CCeEEEEEEEeecC--CeeEEEEEcCCCC
Q 018781 291 LDHGVAAVGYGKSK--GSDYIIVKNSWGP 317 (350)
Q Consensus 291 ~~Hav~iVGyg~~~--g~~ywivkNSWG~ 317 (350)
.+||-.|++....+ +.....+||-||.
T Consensus 235 ~~HaY~Vl~~~~~~~~~~~lv~lrNPWg~ 263 (315)
T cd00044 235 KGHAYSVLDVREVQEEGLRLLRLRNPWGV 263 (315)
T ss_pred cCcceEEeEEEEEccCceEEEEecCCccC
Confidence 48999999998866 7889999999994
No 30
>PF09778 Guanylate_cyc_2: Guanylylate cyclase; InterPro: IPR018616 Members of this family of proteins catalyse the conversion of guanosine triphosphate (GTP) to 3',5'-cyclic guanosine monophosphate (cGMP) and pyrophosphate.
Probab=77.31 E-value=8 Score=34.44 Aligned_cols=59 Identities=20% Similarity=0.318 Sum_probs=35.1
Q ss_pred cHHHHHHHHhc-CCcEEEEeecCccccc--ccCCeeeC---CC----CCCCCeEEEEEEEeecCCeeEEEEEc
Q 018781 251 DEQSLLKALAH-QPVSVAIEASGTDFQF--YSGGVFTG---PC----GAELDHGVAAVGYGKSKGSDYIIVKN 313 (350)
Q Consensus 251 ~~~~i~~al~~-GPV~v~i~~~~~~f~~--y~~Giy~~---~~----~~~~~Hav~iVGyg~~~g~~ywivkN 313 (350)
..++|...|.. ||+.+-++.. -+.. -+...... .| ....+|-|+|+||+.+.+ -++++|
T Consensus 112 s~~ei~~hl~~g~~aIvLVd~~--~L~C~~Ck~~~~~~~~~~~~~~~~~Y~GHYVVlcGyd~~~~--~~~yrd 180 (212)
T PF09778_consen 112 SIQEIIEHLSSGGPAIVLVDAS--LLHCDLCKSNCFDPIGSKCFGRSPDYQGHYVVLCGYDAATK--EFEYRD 180 (212)
T ss_pred cHHHHHHHHhCCCcEEEEEccc--cccChhhcccccccccccccCCCCCccEEEEEEEeecCCCC--eEEEeC
Confidence 57889998855 5666666553 2220 02222211 11 235699999999998653 366666
No 31
>PF03032 Brevenin: Brevenin/esculentin/gaegurin/rugosin family; InterPro: IPR004275 In addition to the highly specific cell-mediated immune system, vertebrates possess an efficient host-defence mechanism against invading microorganisms which involves the synthesis of highly potent antimicrobial peptides with a large spectrum of activity. This entry represents a number of these defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins.; GO: 0006952 defense response, 0042742 defense response to bacterium, 0005576 extracellular region
Probab=75.50 E-value=2 Score=28.47 Aligned_cols=20 Identities=35% Similarity=0.346 Sum_probs=15.2
Q ss_pred cchhHHHHHHHHHHHHHhhh
Q 018781 4 FSHSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 4 ~~~~~~~~~~~~~~~~~~~~ 23 (350)
|.|+|.++|+++|=++.++.
T Consensus 1 ftlKKsllLlfflG~ISlSl 20 (46)
T PF03032_consen 1 FTLKKSLLLLFFLGTISLSL 20 (46)
T ss_pred CcchHHHHHHHHHHHcccch
Confidence 67999999998876664444
No 32
>cd02549 Peptidase_C39A A sub-family of peptidase family C39. Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. The cysteine peptidases in family C39 cleave the "double-glycine" leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which are
Probab=72.28 E-value=11 Score=30.55 Aligned_cols=44 Identities=25% Similarity=0.443 Sum_probs=28.6
Q ss_pred HHHHHhcC-CcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeecCCeeEEEEEcCC
Q 018781 255 LLKALAHQ-PVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKSKGSDYIIVKNSW 315 (350)
Q Consensus 255 i~~al~~G-PV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~g~~ywivkNSW 315 (350)
+++.+..| ||.+.++.. + .....+|.|+|+||+.+ +..+|.+.|
T Consensus 70 ~~~~l~~~~Pvi~~~~~~---~-----------~~~~~gH~vVv~g~~~~---~~~~i~DP~ 114 (141)
T cd02549 70 LLRQLAAGHPVIVSVNLG---V-----------SITPSGHAMVVIGYDRK---GNVYVNDPG 114 (141)
T ss_pred HHHHHHCCCeEEEEEecC---c-----------ccCCCCeEEEEEEEcCC---CCEEEECCC
Confidence 66777554 998877541 0 11235899999999821 336666775
No 33
>PF11106 YjbE: Exopolysaccharide production protein YjbE
Probab=70.36 E-value=3.7 Score=30.04 Aligned_cols=23 Identities=17% Similarity=0.063 Sum_probs=16.7
Q ss_pred hhHHHHHHHHHHHHHhhhhcccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDF 28 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~ 28 (350)
|||+++++++|+.+..++++++.
T Consensus 1 MKK~~~~~~~i~~l~~~s~~aA~ 23 (80)
T PF11106_consen 1 MKKIIYGLFAILALASSSAFAAP 23 (80)
T ss_pred ChhHHHHHHHHHHHHhcchhhhh
Confidence 89999888777777666655543
No 34
>PF15240 Pro-rich: Proline-rich
Probab=69.77 E-value=2.5 Score=36.46 Aligned_cols=15 Identities=53% Similarity=0.687 Sum_probs=11.5
Q ss_pred HHHHHHHHHHHHhhh
Q 018781 9 LLLLSLSLSLFACSS 23 (350)
Q Consensus 9 ~~~~~~~~~~~~~~~ 23 (350)
||||+|+++||++++
T Consensus 1 MLlVLLSvALLALSS 15 (179)
T PF15240_consen 1 MLLVLLSVALLALSS 15 (179)
T ss_pred ChhHHHHHHHHHhhh
Confidence 678888878877766
No 35
>PF12385 Peptidase_C70: Papain-like cysteine protease AvrRpt2; InterPro: IPR022118 This is a family of cysteine proteases, found in actinobacteria, proteobacteria and firmicutes. Papain-like cysteine proteases play a crucial role in plant-pathogen/pest interactions. On entering the host they act on non-self substrates, thereby manipulating the host to evade proteolysis []. AvrRpt2 from Pseudomonas syringae pv tomato DC3000 triggers resistance to P. syringae-2-dependent defence responses, including hypersensitive cell death, by cleaving the Arabidopsis RIN4 protein which is monitored by the cognate resistance protein RPS2 [].
Probab=68.57 E-value=70 Score=27.16 Aligned_cols=38 Identities=26% Similarity=0.326 Sum_probs=27.1
Q ss_pred cHHHHHHHH-hcCCcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeec
Q 018781 251 DEQSLLKAL-AHQPVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKS 303 (350)
Q Consensus 251 ~~~~i~~al-~~GPV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~ 303 (350)
+.+.+...| .+||+-++..... +.-..|+++|.|-+.+
T Consensus 97 t~e~~~~LL~~yGPLwv~~~~P~---------------~~~~~H~~ViTGI~~d 135 (166)
T PF12385_consen 97 TAEGLANLLREYGPLWVAWEAPG---------------DSWVAHASVITGIDGD 135 (166)
T ss_pred CHHHHHHHHHHcCCeEEEecCCC---------------CcceeeEEEEEeecCC
Confidence 467888888 7899998855431 1223699999997754
No 36
>PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognises a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached [,]. This lipid attachment site is found in homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection [].
Probab=67.47 E-value=6 Score=22.62 Aligned_cols=15 Identities=33% Similarity=0.443 Sum_probs=8.1
Q ss_pred hhHHHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSLFA 20 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~ 20 (350)
|||++++++.+++++
T Consensus 7 mKkil~~l~a~~~La 21 (25)
T PF08139_consen 7 MKKILFPLLALFMLA 21 (25)
T ss_pred HHHHHHHHHHHHHHh
Confidence 466665555555443
No 37
>PF10731 Anophelin: Thrombin inhibitor from mosquito; InterPro: IPR018932 Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing.
Probab=67.15 E-value=6 Score=27.50 Aligned_cols=19 Identities=21% Similarity=0.272 Sum_probs=13.0
Q ss_pred hHHHHHHHHHHHHHhhhhc
Q 018781 7 SKLLLLSLSLSLFACSSLA 25 (350)
Q Consensus 7 ~~~~~~~~~~~~~~~~~~~ 25 (350)
.|+|+|.+||+.++.+.++
T Consensus 3 ~Kl~vialLC~aLva~vQ~ 21 (65)
T PF10731_consen 3 SKLIVIALLCVALVAIVQS 21 (65)
T ss_pred chhhHHHHHHHHHHHHHhc
Confidence 5788888877777554443
No 38
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=65.24 E-value=5.6 Score=30.75 Aligned_cols=16 Identities=38% Similarity=0.407 Sum_probs=7.7
Q ss_pred HHHHHHHHHHHHHhhh
Q 018781 8 KLLLLSLSLSLFACSS 23 (350)
Q Consensus 8 ~~~~~~~~~~~~~~~~ 23 (350)
++|||.|+|+++|+++
T Consensus 5 ~~llL~l~LA~lLlis 20 (95)
T PF07172_consen 5 AFLLLGLLLAALLLIS 20 (95)
T ss_pred HHHHHHHHHHHHHHHH
Confidence 3455555544444444
No 39
>COG5510 Predicted small secreted protein [Function unknown]
Probab=61.06 E-value=9.4 Score=24.82 Aligned_cols=15 Identities=27% Similarity=0.199 Sum_probs=9.5
Q ss_pred hhHHHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSLFA 20 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~ 20 (350)
|||.++|++++++..
T Consensus 2 mk~t~l~i~~vll~s 16 (44)
T COG5510 2 MKKTILLIALVLLAS 16 (44)
T ss_pred chHHHHHHHHHHHHH
Confidence 888776666654443
No 40
>PF11777 DUF3316: Protein of unknown function (DUF3316); InterPro: IPR016879 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=60.65 E-value=11 Score=30.09 Aligned_cols=14 Identities=36% Similarity=0.373 Sum_probs=9.6
Q ss_pred hhHHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSLF 19 (350)
Q Consensus 6 ~~~~~~~~~~~~~~ 19 (350)
|||+|+|++++++-
T Consensus 1 MKk~~ll~~~ll~s 14 (114)
T PF11777_consen 1 MKKIILLASLLLLS 14 (114)
T ss_pred CchHHHHHHHHHHH
Confidence 89998888553333
No 41
>KOG4702 consensus Uncharacterized conserved protein [Function unknown]
Probab=53.67 E-value=59 Score=23.47 Aligned_cols=32 Identities=22% Similarity=0.322 Sum_probs=23.2
Q ss_pred HHHHHHHHHhCCccCChHHHHHHHHHHHHHHHH
Q 018781 46 ELFESWMSKHGKTYKCIEEKLHRFEIFKENLKH 78 (350)
Q Consensus 46 ~~f~~~~~~~~k~Y~~~~E~~~R~~if~~n~~~ 78 (350)
.-|++|+.+|.+.-.++ |...|..-|.+-+++
T Consensus 29 e~Fee~v~~~krel~pp-e~~~~~EE~~~~lRe 60 (77)
T KOG4702|consen 29 EIFEEFVRGYKRELSPP-EATKRKEEYENFLRE 60 (77)
T ss_pred HHHHHHHHhccccCCCh-HHHhhHHHHHHHHHH
Confidence 47999999999987544 666677666655554
No 42
>PRK10081 entericidin B membrane lipoprotein; Provisional
Probab=53.04 E-value=16 Score=24.44 Aligned_cols=15 Identities=13% Similarity=0.166 Sum_probs=9.6
Q ss_pred hhHHHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSLFA 20 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~ 20 (350)
|||+|.+++++++++
T Consensus 2 mKk~i~~i~~~l~~~ 16 (48)
T PRK10081 2 VKKTIAAIFSVLVLS 16 (48)
T ss_pred hHHHHHHHHHHHHHH
Confidence 788877766544443
No 43
>PF02402 Lysis_col: Lysis protein; InterPro: IPR003059 The DNA sequence of the entire colicin E2 operon has been determined []. The operon comprises the colicin activity gene (ceaB), the colicin immunity gene (ceiB) and the lysis gene (celB), which is essential for colicin release from producing cells []. A putative LexA binding site is located upstream from ceaB, and a rho-independent terminator structure is located downstream from celB []. Comparison of the amino acid sequences of colicin E2 and cloacin DF13 reveal extensive similarity. These colicins have different modes of action and recognise different cell surface receptors; the two major regions of heterology at the C terminus, and in the C-terminal end of the central region are thought to correspond to the catalytic and receptor-recognition domains, respectively []. Sequence similarities between colicins E2, A and E1 [] are less striking. The colicin E2 (pyocin) immunity protein does not share similarity with either the colicin E3 or cloacin DF13 [] immunity proteins. By contrast, the lysis proteins of the ColE2, ColE1 and CloDF13 plasmids are almost identical except in the N-terminal regions, which themselves are similar to lipoprotein signal peptides []. Processing of the ColE2 prolysis protein to the mature form is prevented by globomycin, a specific inhibitor of the lipoprotein signal peptidase []. The mature ColE2 lysis protein is located in the cell envelope [].; GO: 0009405 pathogenesis, 0019835 cytolysis, 0019867 outer membrane
Probab=52.89 E-value=3.5 Score=26.78 Aligned_cols=13 Identities=15% Similarity=0.345 Sum_probs=7.5
Q ss_pred hhHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSL 18 (350)
Q Consensus 6 ~~~~~~~~~~~~~ 18 (350)
|||++++.++++.
T Consensus 1 MkKi~~~~i~~~~ 13 (46)
T PF02402_consen 1 MKKIIFIGIFLLT 13 (46)
T ss_pred CcEEEEeHHHHHH
Confidence 7777555554444
No 44
>PRK11443 lipoprotein; Provisional
Probab=50.33 E-value=11 Score=30.55 Aligned_cols=18 Identities=28% Similarity=0.368 Sum_probs=13.1
Q ss_pred hhHHHHHHHHHHHHHhhh
Q 018781 6 HSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~ 23 (350)
|||+|+++++++|..|++
T Consensus 1 Mk~~~~~~~~~lLsgCa~ 18 (124)
T PRK11443 1 MKKFIAPLLALLLSGCQI 18 (124)
T ss_pred ChHHHHHHHHHHHHhccC
Confidence 888888877766665555
No 45
>PF11948 DUF3465: Protein of unknown function (DUF3465); InterPro: IPR021856 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 131 and 151 amino acids in length. This protein has a conserved HWTH sequence motif.
Probab=49.05 E-value=16 Score=29.85 Aligned_cols=18 Identities=22% Similarity=0.176 Sum_probs=11.8
Q ss_pred hhHHHHHHHHHHHHHhhh
Q 018781 6 HSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~ 23 (350)
||+++.+.+++++.+..+
T Consensus 1 m~~~~~~~~~~~~~~~~~ 18 (131)
T PF11948_consen 1 MKRFLALFLSVLSAFSTA 18 (131)
T ss_pred CcchHHHHHHHHHHhccc
Confidence 888888777755553333
No 46
>PRK13883 conjugal transfer protein TrbH; Provisional
Probab=48.80 E-value=32 Score=28.95 Aligned_cols=19 Identities=21% Similarity=0.233 Sum_probs=14.6
Q ss_pred hhHHHHHHHHHHHHHhhhh
Q 018781 6 HSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~ 24 (350)
|||++++.++.+++.-|+.
T Consensus 1 Mrk~l~~~~l~l~LaGCAt 19 (151)
T PRK13883 1 MRKIVLLALLALALGGCAT 19 (151)
T ss_pred ChhHHHHHHHHHHHhcccC
Confidence 8999998888666665654
No 47
>COG4588 AcfC Accessory colonization factor AcfC, contains ABC-type periplasmic domain [General function prediction only]
Probab=48.76 E-value=22 Score=31.58 Aligned_cols=49 Identities=22% Similarity=0.279 Sum_probs=29.6
Q ss_pred hhHHHHHHHHHHHHHhhhhcccccccCCCCCcCCChhHHHHHHHHHHHHhCCc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFSIVGYSPEHLTSMDKLIELFESWMSKHGKT 58 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~k~ 58 (350)
|||.+++++.++|.+.+++.+++..-.+.++. ..+.+.=+.|.++-++.
T Consensus 1 Mk~~~~i~~~~~La~s~~~~adinlYGpGGPh----taL~~vA~~~~ektg~k 49 (252)
T COG4588 1 MKKAVLILLIFLLAFSSAANADINLYGPGGPH----TALKDVAKKYEEKTGIK 49 (252)
T ss_pred CchhHHHHHHHHHHhhhhhcceEEEecCCCCc----HHHHHHHHHHHHHhCeE
Confidence 78877766665566556777777754443332 24455666666665554
No 48
>PRK09810 entericidin A; Provisional
Probab=48.68 E-value=19 Score=23.26 Aligned_cols=10 Identities=30% Similarity=0.501 Sum_probs=6.8
Q ss_pred hhHHHHHHHH
Q 018781 6 HSKLLLLSLS 15 (350)
Q Consensus 6 ~~~~~~~~~~ 15 (350)
|+|+++++++
T Consensus 2 Mkk~~~l~~~ 11 (41)
T PRK09810 2 MKRLIVLVLL 11 (41)
T ss_pred hHHHHHHHHH
Confidence 7887766644
No 49
>PF11873 DUF3393: Domain of unknown function (DUF3393); InterPro: IPR024570 Membrane-bound lytic murein transglycosylase C (also known as murein hydrolase C), is a murein-degrading enzyme that may play a role in the recycling of muropeptides during cell elongation and/or cell division. This entry represents the N-terminal domain, whose function is currently not known.
Probab=48.37 E-value=20 Score=31.75 Aligned_cols=18 Identities=56% Similarity=0.763 Sum_probs=10.0
Q ss_pred hhHHHHHHHHHHHHHhhh
Q 018781 6 HSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~ 23 (350)
|+++++++++++|..|+.
T Consensus 1 ~k~l~~~~~~~lL~~Cs~ 18 (204)
T PF11873_consen 1 KKKLLLLLIALLLSGCSS 18 (204)
T ss_pred CcCHHHHHHHHHHHHhCC
Confidence 555555555555555553
No 50
>PF09403 FadA: Adhesion protein FadA; InterPro: IPR018543 FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices. ; PDB: 3ETZ_B 3ETY_A 2GL2_B 3ETX_C 3ETW_A.
Probab=47.41 E-value=30 Score=28.25 Aligned_cols=21 Identities=29% Similarity=0.156 Sum_probs=0.0
Q ss_pred hhHHHHHHHHHHHHHhhhhccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~ 29 (350)
|+|+|++ ++|++++.+|+++.
T Consensus 1 MKK~ll~---~~lllss~sfaA~~ 21 (126)
T PF09403_consen 1 MKKILLL---GMLLLSSISFAATA 21 (126)
T ss_dssp ------------------------
T ss_pred ChHHHHH---HHHHHHHHHHHccc
Confidence 8886533 23344444444444
No 51
>PF11153 DUF2931: Protein of unknown function (DUF2931); InterPro: IPR021326 Some members in this family of proteins are annotated as lipoproteins however this cannot be confirmed. Currently, there is no known function.
Probab=47.40 E-value=14 Score=32.82 Aligned_cols=19 Identities=47% Similarity=0.579 Sum_probs=13.4
Q ss_pred hhHHHHHHHHHHHHHhhhh
Q 018781 6 HSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~ 24 (350)
|+|+++|+++|++..|+..
T Consensus 1 mk~i~~l~l~lll~~C~~~ 19 (216)
T PF11153_consen 1 MKKILLLLLLLLLTGCSTN 19 (216)
T ss_pred ChHHHHHHHHHHHHhhcCC
Confidence 8899888866655555553
No 52
>PRK10780 periplasmic chaperone; Provisional
Probab=41.16 E-value=19 Score=30.68 Aligned_cols=28 Identities=18% Similarity=-0.058 Sum_probs=14.4
Q ss_pred hhHHHHHHHHH-HHHHhhhhcccccccCC
Q 018781 6 HSKLLLLSLSL-SLFACSSLAHDFSIVGY 33 (350)
Q Consensus 6 ~~~~~~~~~~~-~~~~~~~~~~~~~~~~~ 33 (350)
|+|+++++++. +++++++++++.-+..+
T Consensus 1 Mkk~~~~~~l~l~~~~~~~a~a~~KIg~V 29 (165)
T PRK10780 1 MKKWLLAAGLGLALATSAGAQAADKIAIV 29 (165)
T ss_pred ChHHHHHHHHHHHHHHHHHHHHhcCeEEe
Confidence 89998765553 33333333344444333
No 53
>PF01640 Peptidase_C10: Peptidase C10 family classification.; InterPro: IPR000200 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belong to MEROPS peptidase family C10 (streptopain family, clan CA). Streptopain is a cysteine protease found in Streptococcus pyogenes that shows some structural and functional similarity to papain (family C1) [, ]. The order of the catalytic cysteine/histidine dyad is the same and the surrounding sequences are similar. The two proteins also show similar specificities, both preferring a hydrophobic residue at the P2 site [, ]. Streptopain shows a high degree of sequence similarity to the S. pyogenes exotoxin B, and strong similarity to the prtT gene product of Porphyromonas gingivalis (Bacteroides gingivalis), both of which have been included in the family [].; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 4D8I_A 4D8E_A 4D8B_A 3BBA_B 3BB7_A 2JTC_A 1PVJ_A 1DKI_D 2UZJ_A.
Probab=39.26 E-value=1.3e+02 Score=26.26 Aligned_cols=52 Identities=23% Similarity=0.430 Sum_probs=29.7
Q ss_pred HHHHHHHHhc-CCcEEEEeecCcccccccCCeeeCCCCCCCCeEEEEEEEeecCCeeEEEEEcCCCCCCCCCceEE
Q 018781 252 EQSLLKALAH-QPVSVAIEASGTDFQFYSGGVFTGPCGAELDHGVAAVGYGKSKGSDYIIVKNSWGPKWGERGYIR 326 (350)
Q Consensus 252 ~~~i~~al~~-GPV~v~i~~~~~~f~~y~~Giy~~~~~~~~~Hav~iVGyg~~~g~~ywivkNSWG~~WG~~GY~~ 326 (350)
.+.|+..|.+ .||.+.-... . .+||.+|=||... .|+-+==.||-. .+||++
T Consensus 140 ~~~i~~el~~~rPV~~~g~~~-~-----------------~GHawViDGy~~~---~~~H~NwGW~G~--~nGyy~ 192 (192)
T PF01640_consen 140 MDMIRNELDNGRPVLYSGNSK-S-----------------GGHAWVIDGYDSD---GYFHCNWGWGGS--SNGYYR 192 (192)
T ss_dssp HHHHHHHHHTT--EEEEEEET-T-----------------EEEEEEEEEEESS---SEEEEE-SSTTT--T-EEEE
T ss_pred HHHHHHHHHcCCCEEEEEecC-C-----------------CCeEEEEcCccCC---CeEEEeeCccCC--CCCccC
Confidence 3567777844 5987654322 0 1799999999643 577663233321 479885
No 54
>PRK10053 hypothetical protein; Provisional
Probab=38.18 E-value=24 Score=28.91 Aligned_cols=13 Identities=8% Similarity=-0.097 Sum_probs=9.6
Q ss_pred hhHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSL 18 (350)
Q Consensus 6 ~~~~~~~~~~~~~ 18 (350)
|||.+++++++++
T Consensus 1 MKK~~~~~~~~~~ 13 (130)
T PRK10053 1 MKLQAIALASFLV 13 (130)
T ss_pred CcHHHHHHHHHHH
Confidence 8998777776555
No 55
>PF14060 DUF4252: Domain of unknown function (DUF4252)
Probab=37.67 E-value=37 Score=28.22 Aligned_cols=17 Identities=29% Similarity=0.335 Sum_probs=9.0
Q ss_pred hHHHHHHHHHHHHHhhh
Q 018781 7 SKLLLLSLSLSLFACSS 23 (350)
Q Consensus 7 ~~~~~~~~~~~~~~~~~ 23 (350)
||+|+++++++++++++
T Consensus 1 Kk~i~~l~l~~~~~~~~ 17 (155)
T PF14060_consen 1 KKIILILLLLLACLASC 17 (155)
T ss_pred ChhHHHHHHHHHHHHHh
Confidence 45566666555544444
No 56
>smart00230 CysPc Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).
Probab=37.27 E-value=56 Score=30.99 Aligned_cols=26 Identities=19% Similarity=0.518 Sum_probs=21.5
Q ss_pred CCeEEEEEEEeecCCee--EEEEEcCCC
Q 018781 291 LDHGVAAVGYGKSKGSD--YIIVKNSWG 316 (350)
Q Consensus 291 ~~Hav~iVGyg~~~g~~--ywivkNSWG 316 (350)
.+||=.|++...-++.+ -..+||-||
T Consensus 227 ~~HaYsVl~v~~~~~~~~~Ll~lrNPWg 254 (318)
T smart00230 227 KGHAYSVTDVREVQGRRQELLRLRNPWG 254 (318)
T ss_pred cCccEEEEEEEEEecCCeEEEEEECCCC
Confidence 48999999988755445 899999998
No 57
>PF12771 SusD-like_2: Starch-binding associating with outer membrane; PDB: 3EJN_A 3FDH_A 3MX3_A.
Probab=36.86 E-value=15 Score=37.06 Aligned_cols=24 Identities=21% Similarity=0.275 Sum_probs=0.0
Q ss_pred hhHHHHHHHHHHHHHhhhhccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~ 29 (350)
|||+|+++++++++.||--|.+++
T Consensus 1 MKK~Il~i~l~~~~~sc~df~diN 24 (488)
T PF12771_consen 1 MKKIILIILLALLLSSCDDFEDIN 24 (488)
T ss_dssp ------------------------
T ss_pred CcHHHHHHHHHHHHHhcCCHHHcc
Confidence 999999888887777777666666
No 58
>TIGR00156 conserved hypothetical protein TIGR00156. As of the last revision, this family consists only of two proteins from Escherichia coli and one from the related species Haemophilus influenzae.
Probab=36.84 E-value=23 Score=28.83 Aligned_cols=10 Identities=10% Similarity=-0.124 Sum_probs=8.2
Q ss_pred hhHHHHHHHH
Q 018781 6 HSKLLLLSLS 15 (350)
Q Consensus 6 ~~~~~~~~~~ 15 (350)
|||+++++++
T Consensus 1 MKK~~~~~~~ 10 (126)
T TIGR00156 1 MKFQAIVLAS 10 (126)
T ss_pred CchHHHHHHH
Confidence 8998887777
No 59
>PF07437 YfaZ: YfaZ precursor; InterPro: IPR009998 This family contains the precursor of the bacterial protein YfaZ (approximately 180 residues long). Many members of this family are hypothetical proteins.
Probab=36.61 E-value=25 Score=30.54 Aligned_cols=23 Identities=35% Similarity=0.385 Sum_probs=14.8
Q ss_pred hhHHHHHHHHHHHHHhhhhccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~ 29 (350)
|||+++++++ +|++.+.+|++.+
T Consensus 1 m~k~~~a~~~-~l~~~s~~a~A~~ 23 (180)
T PF07437_consen 1 MKKFLLASAA-ALLLVSASANAIS 23 (180)
T ss_pred CchHHHHHHH-HHHHHhhhhheee
Confidence 8888776655 4444566666666
No 60
>PRK11372 lysozyme inhibitor; Provisional
Probab=35.86 E-value=33 Score=27.16 Aligned_cols=20 Identities=40% Similarity=0.652 Sum_probs=14.5
Q ss_pred chhHHHHHHHHHHHHHhhhh
Q 018781 5 SHSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 5 ~~~~~~~~~~~~~~~~~~~~ 24 (350)
+||++|.+++.++|-.|+..
T Consensus 2 ~mk~ll~~~~~~lL~gCs~~ 21 (109)
T PRK11372 2 SMKKLLIICLPVLLTGCSAY 21 (109)
T ss_pred chHHHHHHHHHHHHHHhcCC
Confidence 79998877777666666653
No 61
>PF12276 DUF3617: Protein of unknown function (DUF3617); InterPro: IPR022061 This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 179 amino acids in length. There is a single completely conserved residue C that may be functionally important.
Probab=34.62 E-value=28 Score=29.18 Aligned_cols=18 Identities=39% Similarity=0.587 Sum_probs=12.8
Q ss_pred hhHHHHHHHHHHHHHhhh
Q 018781 6 HSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~ 23 (350)
|||.++++++++++++.+
T Consensus 1 M~~~~~~~~~~~~~~~~~ 18 (162)
T PF12276_consen 1 MKRRLLLALALALLALAA 18 (162)
T ss_pred CchHHHHHHHHHHHHhhc
Confidence 788888888777765433
No 62
>PRK10936 TMAO reductase system periplasmic protein TorT; Provisional
Probab=34.47 E-value=25 Score=33.40 Aligned_cols=20 Identities=45% Similarity=0.560 Sum_probs=14.3
Q ss_pred hhHHHHHHHHHHHHHhhhhc
Q 018781 6 HSKLLLLSLSLSLFACSSLA 25 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~ 25 (350)
|||+++|+++++|+-+.+-+
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~ 20 (343)
T PRK10936 1 MRKLLFLLLSLFLLSLTAFA 20 (343)
T ss_pred ChhHHHHHHHHHHHHHHHHH
Confidence 89998888887766555433
No 63
>PLN00131 hypothetical protein; Provisional
Probab=33.86 E-value=15 Score=30.83 Aligned_cols=18 Identities=33% Similarity=0.286 Sum_probs=14.4
Q ss_pred cccchhHHHHHHHHHHHH
Q 018781 2 AFFSHSKLLLLSLSLSLF 19 (350)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~ 19 (350)
.||+|+|-+.+++.++++
T Consensus 29 fFF~m~K~~~~SL~~~~m 46 (218)
T PLN00131 29 FFFFMRKGGIVSLILVYM 46 (218)
T ss_pred HHHHHHhchHHhhHHHHh
Confidence 589999998888776555
No 64
>PLN03024 Putative EG45-like domain containing protein 1; Provisional
Probab=32.85 E-value=23 Score=28.86 Aligned_cols=28 Identities=21% Similarity=0.185 Sum_probs=20.2
Q ss_pred hhHHHHHHHHHHHHHhhhhcccccccCC
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFSIVGY 33 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33 (350)
|+|-|||.+.+++++.+++.++.+.-++
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~G~AT~ 28 (125)
T PLN03024 1 MSKRILIFSTVLVFLFSVSYATPGIATF 28 (125)
T ss_pred CceeeHHHHHHHHHHhhhhcccceEEEE
Confidence 6777777777777777887777775443
No 65
>COG3637 Opacity protein and related surface antigens [Cell envelope biogenesis, outer membrane]
Probab=32.64 E-value=31 Score=30.36 Aligned_cols=22 Identities=27% Similarity=0.267 Sum_probs=17.0
Q ss_pred hhHHHHHHHHHHHHHhhhhccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHD 27 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~ 27 (350)
||+++.+.+++++++++++|++
T Consensus 1 mk~~l~~a~~~~~~~~~a~aad 22 (199)
T COG3637 1 MKKLLAAAALAALLLSAAAAAD 22 (199)
T ss_pred ChhHHHHHHHHHHHhhhhhhhh
Confidence 7888888888788877777665
No 66
>COG1792 MreC Cell shape-determining protein [Cell envelope biogenesis, outer membrane]
Probab=32.45 E-value=1.1e+02 Score=28.64 Aligned_cols=27 Identities=7% Similarity=0.012 Sum_probs=13.2
Q ss_pred HHHHHHHHHHHHhCCccCChHHHHHHH
Q 018781 43 KLIELFESWMSKHGKTYKCIEEKLHRF 69 (350)
Q Consensus 43 ~~~~~f~~~~~~~~k~Y~~~~E~~~R~ 69 (350)
+.-..+.++.+.|.+.|...++...+.
T Consensus 56 ~~v~~~~~~~~~~~~~~~en~~Lk~~l 82 (284)
T COG1792 56 EFVDGVLEFLKSLKDLALENEELKKEL 82 (284)
T ss_pred HHHHhHHHHHHHhHHHHHHhHHHHHHH
Confidence 333445555566666664443333333
No 67
>PRK13835 conjugal transfer protein TrbH; Provisional
Probab=32.14 E-value=90 Score=26.07 Aligned_cols=19 Identities=16% Similarity=0.139 Sum_probs=14.5
Q ss_pred hhHHHHHHHHHHHHHhhhh
Q 018781 6 HSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~ 24 (350)
|||++++.++.+++.-|+.
T Consensus 1 mrk~~~~~~~al~LaGCaT 19 (145)
T PRK13835 1 LRRLLAACILALLLSGCQT 19 (145)
T ss_pred ChhHHHHHHHHHHHhcccc
Confidence 8999988888777665554
No 68
>PF05540 Serpulina_VSP: Serpulina hyodysenteriae variable surface protein; InterPro: IPR008838 This family consists of several variable surface proteins from Treponema hyodysenteriae (Serpulina hyodysenteriae).
Probab=31.89 E-value=30 Score=33.14 Aligned_cols=24 Identities=25% Similarity=0.328 Sum_probs=22.0
Q ss_pred hhHHHHHHHHHHHHHhhhhccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~ 29 (350)
|||++|....|+.++.+++|.|-+
T Consensus 1 MKK~lLt~~alltia~~SvFGmYG 24 (377)
T PF05540_consen 1 MKKVLLTAIALLTIASASVFGMYG 24 (377)
T ss_pred CcchHHHHHHHHHHHhhhhheecc
Confidence 899999999999999999999888
No 69
>PF13677 MotB_plug: Membrane MotB of proton-channel complex MotA/MotB
Probab=30.95 E-value=1.1e+02 Score=21.21 Aligned_cols=14 Identities=21% Similarity=0.316 Sum_probs=6.4
Q ss_pred hhHHHHHHHHHHHH
Q 018781 41 MDKLIELFESWMSK 54 (350)
Q Consensus 41 ~~~~~~~f~~~~~~ 54 (350)
..+..+..+.+...
T Consensus 43 ~~k~~~~~~s~~~~ 56 (58)
T PF13677_consen 43 KEKFEEVAQSFQQA 56 (58)
T ss_pred HHHHHHHHHHHHHh
Confidence 33444555544443
No 70
>PRK09934 fimbrial-like adhesin protein SfmF; Provisional
Probab=30.94 E-value=32 Score=29.38 Aligned_cols=19 Identities=11% Similarity=0.116 Sum_probs=12.3
Q ss_pred hhHHHHHHHHHHHHHhhhh
Q 018781 6 HSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~ 24 (350)
|||++++.++++++..++.
T Consensus 1 m~~~~~~~~~~~~~~~~~~ 19 (171)
T PRK09934 1 MRRVFFACFCGLLWSPLSW 19 (171)
T ss_pred ChhHHHHHHHHHhhChhhh
Confidence 8988877766555544443
No 71
>KOG3554 consensus Histone deacetylase complex, MTA1 component [Chromatin structure and dynamics]
Probab=30.71 E-value=1.5e+02 Score=29.61 Aligned_cols=32 Identities=28% Similarity=0.221 Sum_probs=21.8
Q ss_pred cCCCCCcCCChhHHHHHHHHHHHHhCCccCCh
Q 018781 31 VGYSPEHLTSMDKLIELFESWMSKHGKTYKCI 62 (350)
Q Consensus 31 ~~~~~~~~~~~~~~~~~f~~~~~~~~k~Y~~~ 62 (350)
+.+.|.-.+=...-..+|++=..+|+|.|.+.
T Consensus 279 vLCRDemEEWSasEanLFEeALeKyGKDFndI 310 (693)
T KOG3554|consen 279 VLCRDEMEEWSASEANLFEEALEKYGKDFNDI 310 (693)
T ss_pred eeehhhhhhccchhhHHHHHHHHHhcccHHHH
Confidence 44544444444445678999999999999543
No 72
>PRK03577 acid shock protein precursor; Provisional
Probab=30.33 E-value=48 Score=25.47 Aligned_cols=25 Identities=16% Similarity=0.061 Sum_probs=18.8
Q ss_pred hhHHHHHHHHHHHHHhhhhcccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFSI 30 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~ 30 (350)
|+|+|.|++--+|=+.+..|+....
T Consensus 1 MKKVLAlvVAa~~glSs~AFAA~ta 25 (102)
T PRK03577 1 MKKVLALVVAAAMGLSSAAFAAETA 25 (102)
T ss_pred ChHHHHHHHHHHHHhhHHHHhcccc
Confidence 8998888888777777777776553
No 73
>TIGR03519 Bac_Flav_fam_1 Bacteroidetes-specific putative membrane protein. This model describes a protein family unique to, and greatly expanded in, the Bacteroidetes. Species in this lineage include several, such as Cytophaga hutchinsonii and Flavobacterium johnsoniae, that exhibit a poorly understood rapid gliding phenotype. Several members of this protein family are found in operons with other genes whose loss leads to a loss of this motility.
Probab=29.76 E-value=29 Score=32.43 Aligned_cols=16 Identities=44% Similarity=0.523 Sum_probs=10.3
Q ss_pred hhHHHHHHHHHHHHHh
Q 018781 6 HSKLLLLSLSLSLFAC 21 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~ 21 (350)
|||+++++++++++.+
T Consensus 1 mkk~~~~~~l~~~~~~ 16 (292)
T TIGR03519 1 MKKILLLLLLLLLLTV 16 (292)
T ss_pred CceeehhhHHHHHhhh
Confidence 7888777666544433
No 74
>PF10614 CsgF: Type VIII secretion system (T8SS), CsgF protein; InterPro: IPR018893 Fimbriae are cell-surface protein polymers, of e.g. Escherichia coli and Salmonella spp, that mediate interactions important for host and environmental persistence, development of biofilms, motility, colonisation and invasion of cells, and conjugation. Four general assembly pathways for different fimbriae have been proposed, one of which is extracellular nucleation-precipitation (ENP), that differs from the others in that fibre-growth occurs extracellularly. Thin aggregative fimbriae (Tafi) are the only fimbriae dependent on the ENP pathway. Tafi were first identified in Salmonella spp. and the controlling operon termed agf; however subsequent isolation of the homologous operon in E. coli led to its being called csg. Tafi are known as curli because, in the absence of extracellular polysaccharides, their morphology appears curled; however, when expressed with such polysaccharides their morphology appears as a tangled amorphous matrix []. CsgF is one of three putative curli assembly factors appearing to act as a nucleator protein. Unlike eukaryotic amyloid formation, curli biogenesis is a productive pathway requiring a specific assembly machinery [].
Probab=29.60 E-value=56 Score=27.21 Aligned_cols=72 Identities=19% Similarity=0.116 Sum_probs=32.1
Q ss_pred hhHHHHHHHHHHHHHhhhhcccccccCCCCCcCCChhHHHHHHHHHHHHhCCccCChHHH-------HHHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFSIVGYSPEHLTSMDKLIELFESWMSKHGKTYKCIEEK-------LHRFEIFKENLKH 78 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~~k~Y~~~~E~-------~~R~~if~~n~~~ 78 (350)
||+..++.+++++++ +..+.++..+--+=.+.---..+.-.|-.=++.=+..|+++..+ ..-...|.++|+.
T Consensus 1 mk~~~l~a~l~~~~~-~~~a~A~eLVY~PvNPsFGGnplNgs~LL~~A~AQN~~~dp~~~~~~~~~~~S~l~~F~~sLqs 79 (142)
T PF10614_consen 1 MKYRGLLALLLLLLA-ASSAQAQELVYTPVNPSFGGNPLNGSWLLSSAQAQNDFKDPSAEDDFSTSSLSALDRFTQSLQS 79 (142)
T ss_pred CcEeHHHHHHHHHHc-ccccchhheEeeccCCCCCCCcccHHHHhhhhhhcCCcCCCccccccccCCCCHHHHHHHHHHH
Confidence 677766555544443 44444544311111111111222334444455556666554443 1235566666654
No 75
>PF15284 PAGK: Phage-encoded virulence factor
Probab=28.88 E-value=54 Score=23.01 Aligned_cols=14 Identities=29% Similarity=0.437 Sum_probs=6.0
Q ss_pred hhHH--HHHHHHHHHH
Q 018781 6 HSKL--LLLSLSLSLF 19 (350)
Q Consensus 6 ~~~~--~~~~~~~~~~ 19 (350)
|+|+ |+|.++++|.
T Consensus 1 Mkk~ksifL~l~~~Ls 16 (61)
T PF15284_consen 1 MKKFKSIFLALVFILS 16 (61)
T ss_pred ChHHHHHHHHHHHHHH
Confidence 5533 4444443333
No 76
>TIGR01165 cbiN cobalt transport protein. This model describes the cobalt transporter in bacteria and its equivalents in archaea. It principally functions in the ion uptake mechanism. It is a multisubunit transporter with two integral membrane proteins and two closely associated cytoplasmic subunits. This transporter belongs to the ABC transporter superfamily (ABC stands for ATP Binding Cassette). This superfamily includes two groups, one of which catalyzes the uptake of small molecules, including ions, from the external milieu, and the other of which is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both the uptake and the efflux processes.
Probab=28.12 E-value=13 Score=28.35 Aligned_cols=22 Identities=18% Similarity=0.146 Sum_probs=13.7
Q ss_pred chhHHHHHHHHHHHHHhhhhcc
Q 018781 5 SHSKLLLLSLSLSLFACSSLAH 26 (350)
Q Consensus 5 ~~~~~~~~~~~~~~~~~~~~~~ 26 (350)
+|||-++|++++++++.++++-
T Consensus 2 ~~~~~~~ll~~v~~l~~~pl~~ 23 (91)
T TIGR01165 2 SMKKTIWLLAAVAALVVLPLLI 23 (91)
T ss_pred CcchhHHHHHHHHHHHHHHHHh
Confidence 6887766666666665555543
No 77
>PRK15240 resistance to complement killing; Provisional
Probab=27.95 E-value=39 Score=29.44 Aligned_cols=17 Identities=35% Similarity=0.303 Sum_probs=11.7
Q ss_pred hhHHHHHHHHHHHHHhh
Q 018781 6 HSKLLLLSLSLSLFACS 22 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~ 22 (350)
|||.+++++++++++..
T Consensus 1 Mkk~~~~~~~~~~~~~~ 17 (185)
T PRK15240 1 MKKIVLSSLLLSAAGLA 17 (185)
T ss_pred CchhHHHHHHHHHHHhc
Confidence 88988777665555555
No 78
>PF05984 Cytomega_UL20A: Cytomegalovirus UL20A protein; InterPro: IPR009245 This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein [].
Probab=27.93 E-value=51 Score=24.68 Aligned_cols=22 Identities=18% Similarity=0.125 Sum_probs=13.3
Q ss_pred hhHHHHHHHHHHHHHhhhhccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHD 27 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~ 27 (350)
|.+-++|+.+|+..||+++|+-
T Consensus 1 MaRRlwiLslLAVtLtVALAAP 22 (100)
T PF05984_consen 1 MARRLWILSLLAVTLTVALAAP 22 (100)
T ss_pred CchhhHHHHHHHHHHHHHhhcc
Confidence 3344455555577778887654
No 79
>PF06873 SerH: Cell surface immobilisation antigen SerH; InterPro: IPR009670 This entry consists of several cell surface immobilisation antigen SerH proteins which seem to be specific to Tetrahymena thermophila. The SerH locus of T. thermophila is one of several paralogous loci with genes encoding variants of the major cell surface protein known as the immobilisation antigen (i-ag) [].
Probab=27.91 E-value=39 Score=33.16 Aligned_cols=24 Identities=13% Similarity=0.108 Sum_probs=12.6
Q ss_pred hhHHHHHHHHH-HHHHhhhhccccc
Q 018781 6 HSKLLLLSLSL-SLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~-~~~~~~~~~~~~~ 29 (350)
|+++|||+.|| ++++...+++..+
T Consensus 1 M~~k~lii~Lii~~llv~~i~a~~G 25 (403)
T PF06873_consen 1 MQNKILIICLIISSLLVSQISATPG 25 (403)
T ss_pred CcchhhHHHHHHHHHHHheeccCCC
Confidence 55555555444 4555555555444
No 80
>PF00879 Defensin_propep: Defensin propeptide The pattern for this Prosite entry doesn't match the propeptide.; InterPro: IPR002366 Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses [], containing three pairs of intramolecular disulphide bonds []. On the basis of their size and pattern of disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Alpha-defensins, which have been identified in humans, monkeys and several rodent species, are particularly abundant in neutrophils, certain macrophage populations and Paneth cells of the small intestine. Every mammalian species explored thus far has beta-defensins. In cows, as many as 13 beta-defensins exist in neutrophils. However, in other species, beta-defensins are more often produced by epithelial cells lining various organs (e.g. the epidermis, bronchial tree and genitourinary tract). Theta-defensins are cyclic and have so far only been identified in primate phagocytes. Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic) regions interact with anionic phospholipid head groups and water. 
Subsequently, some defensins can aggregate to form `channel-like' pores; others might bind to and cover the microbial membrane in a `carpet-like' manner. The net outcome is the disruption of membrane integrity and function, which ultimately leads to the lysis of microorganisms. Some defensins are synthesized as propeptides which may be relevant to this process - in neutrophils only the mature peptides have been identified but in Paneth cells, the propeptide is stored in vesicles [] and appears to be cleaved by trypsin on activation. ; GO: 0006952 defense response
Probab=27.52 E-value=60 Score=22.10 Aligned_cols=18 Identities=28% Similarity=0.218 Sum_probs=10.5
Q ss_pred hhHHHHHHHHHHHHHhhhh
Q 018781 6 HSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~ 24 (350)
||.+.+|+-+ +|+++.++
T Consensus 1 MRTL~LLaAl-LLlAlqaQ 18 (52)
T PF00879_consen 1 MRTLALLAAL-LLLALQAQ 18 (52)
T ss_pred CcHHHHHHHH-HHHHHHHh
Confidence 7777777764 44434443
No 81
>COG3088 CcmH Uncharacterized protein involved in biosynthesis of c-type cytochromes [Posttranslational modification, protein turnover, chaperones]
Probab=27.17 E-value=1.1e+02 Score=25.81 Aligned_cols=13 Identities=0% Similarity=-0.003 Sum_probs=6.8
Q ss_pred cCChHHHHHHHHH
Q 018781 59 YKCIEEKLHRFEI 71 (350)
Q Consensus 59 Y~~~~E~~~R~~i 71 (350)
+++++++.+..++
T Consensus 29 f~~~~qe~ra~~L 41 (153)
T COG3088 29 FADPAQEQRARAL 41 (153)
T ss_pred CCCHHHHHHHHHH
Confidence 6666555444443
No 82
>PRK15209 long polar fimbrial protein LpfA; Provisional
Probab=26.85 E-value=49 Score=28.17 Aligned_cols=18 Identities=22% Similarity=0.224 Sum_probs=10.1
Q ss_pred hhHHHHHHHHHHHHHhhh
Q 018781 6 HSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~ 23 (350)
|||+++.++++.++..++
T Consensus 1 Mkk~~~~~~~~~~~~~~a 18 (174)
T PRK15209 1 MKKVVFALSALALTSTSV 18 (174)
T ss_pred CchHHHHHHHHHHHhhhc
Confidence 888766555544443333
No 83
>PF15588 Imm7: Immunity protein 7
Probab=26.76 E-value=2e+02 Score=22.90 Aligned_cols=35 Identities=29% Similarity=0.548 Sum_probs=24.9
Q ss_pred EEEEEEEeec--CCeeEEEEEcCC-----CCCCCCCceEEEEe
Q 018781 294 GVAAVGYGKS--KGSDYIIVKNSW-----GPKWGERGYIRMKR 329 (350)
Q Consensus 294 av~iVGyg~~--~g~~ywivkNSW-----G~~WG~~GY~~i~~ 329 (350)
-|++||++++ +...|-|++.+- .+.=|.+||. +..
T Consensus 17 ~v~~vG~ADd~~~~~~yiilQR~~~~de~D~~~~~d~~~-~e~ 58 (115)
T PF15588_consen 17 NVLMVGFADDEDGPKEYIILQRSLEFDEQDEDLGSDGYY-TEC 58 (115)
T ss_pred cEEEEEEecCCCCCceEEEEEccCCCCCcccccCcCcEE-EEE
Confidence 3999999983 346899999963 4445668886 444
No 84
>PRK15346 outer membrane secretin SsaC; Provisional
Probab=26.73 E-value=71 Score=32.41 Aligned_cols=52 Identities=17% Similarity=0.102 Sum_probs=28.7
Q ss_pred hhHHHHHHHHHHHHHhhhhcccccccCCCCCc---CCChhHHHHHHHHHHHHhCCcc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFSIVGYSPEH---LTSMDKLIELFESWMSKHGKTY 59 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~f~~~~~~~~k~Y 59 (350)
|+|+++|++||+|+-...+++.... +.+.. ...+.++....+.+-+..++..
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~--w~~~~~~~~~~~~di~~vl~~~a~~~g~ni 55 (499)
T PRK15346 1 MKKLLILIFLFLLNTAKFAASKSIP--WQGNPFFIYSRGMPLAEVLHDLGANYGIPV 55 (499)
T ss_pred CchhHHHHHHHHHhhhhhhccCCCC--CCCCCEEEEECCCcHHHHHHHHHHHhCCCE
Confidence 7788777777666655555444431 21111 1133455666666666666664
No 85
>PF11853 DUF3373: Protein of unknown function (DUF3373); InterPro: IPR021803 This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 472 to 574 amino acids in length.
Probab=26.52 E-value=79 Score=31.89 Aligned_cols=13 Identities=46% Similarity=0.562 Sum_probs=11.0
Q ss_pred hhHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSL 18 (350)
Q Consensus 6 ~~~~~~~~~~~~~ 18 (350)
|||+|.|+|+.+|
T Consensus 1 Mkk~~~l~l~aal 13 (489)
T PF11853_consen 1 MKKLISLSLAAAL 13 (489)
T ss_pred CchhHHHHHHHHH
Confidence 8999999888765
No 86
>TIGR02744 TrbI_Ftype type-F conjugative transfer system protein TrbI. This protein is an essential component of the F-type conjugative transfer system for plasmid DNA transfer and has been shown to be localized to the periplasm.
Probab=25.14 E-value=1.4e+02 Score=23.76 Aligned_cols=45 Identities=4% Similarity=0.082 Sum_probs=29.5
Q ss_pred hHHHHHHHHHHHHhCCccCChHHHHHHHHHHHHHH-HHHHHhccCC
Q 018781 42 DKLIELFESWMSKHGKTYKCIEEKLHRFEIFKENL-KHIDQRNKEV 86 (350)
Q Consensus 42 ~~~~~~f~~~~~~~~k~Y~~~~E~~~R~~if~~n~-~~I~~~N~~~ 86 (350)
.++...-++|...=.+.=-++++...+-..|..-+ +.+.+.++++
T Consensus 37 fdmk~tld~F~~q~~~~~lte~q~~~~~~rF~~~L~~~L~~yq~~H 82 (112)
T TIGR02744 37 FDMKQTLDAFFDSASQKKLSEAQQKALLGRFNALLEAELQAWQAQH 82 (112)
T ss_pred EecHHHHHHHHHHHhhcCCCHHHHHHHHHHHHHHHHHHHHHHHHhC
Confidence 35555666666655555557778888888888888 4455555543
No 87
>PF10107 Endonuc_Holl: Endonuclease related to archaeal Holliday junction resolvase; InterPro: IPR019287 This domain is found in various predicted bacterial endonucleases which are distantly related to archaeal Holliday junction resolvases.
Probab=25.10 E-value=1.5e+02 Score=25.02 Aligned_cols=18 Identities=33% Similarity=0.650 Sum_probs=13.0
Q ss_pred hhHHHHHHHHHHHHhCCc
Q 018781 41 MDKLIELFESWMSKHGKT 58 (350)
Q Consensus 41 ~~~~~~~f~~~~~~~~k~ 58 (350)
+.....+|++|++.....
T Consensus 21 ~~~a~~~fe~wr~~~~~~ 38 (156)
T PF10107_consen 21 ERRARELFEQWRQRESET 38 (156)
T ss_pred HHHHHHHHHHHHHhHHHH
Confidence 556678899998875433
No 88
>PTZ00045 apical membrane antigen 1; Provisional
Probab=25.03 E-value=72 Score=32.67 Aligned_cols=20 Identities=25% Similarity=0.190 Sum_probs=10.8
Q ss_pred chhHHHHHHHHHHHHHhhhh
Q 018781 5 SHSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 5 ~~~~~~~~~~~~~~~~~~~~ 24 (350)
-|++|..++++.+++++...
T Consensus 15 ~~~~~~~~~~l~~~~~~~~~ 34 (595)
T PTZ00045 15 HMRKLYCLSFLSVLCSVHIF 34 (595)
T ss_pred hhhhhHHHHHHHHHHHHHHh
Confidence 47887555555444444433
No 89
>TIGR03044 PS_II_psb27 photosystem II protein Psb27. Members of this family are the Psb27 protein of the cyanobacterial photosynthetic supracomplex, photosystem II. Although most protein components of both cyanobacterial and chloroplast versions of photosystem II are closely related and described together by single model families, this family is strictly bacterial. Some uncharacterized proteins with highly divergent sequences, from Arabidopsis, score between trusted and noise cutoffs for this model but are not at this time assigned as functionally equivalent photosystem II proteins.
Probab=24.39 E-value=1.5e+02 Score=24.40 Aligned_cols=16 Identities=13% Similarity=0.345 Sum_probs=9.2
Q ss_pred HHHHHHHHHHHHhCCc
Q 018781 43 KLIELFESWMSKHGKT 58 (350)
Q Consensus 43 ~~~~~f~~~~~~~~k~ 58 (350)
+.+++=.+|..+|.+.
T Consensus 67 ~ar~~indyvsrYRr~ 82 (135)
T TIGR03044 67 EARQLINDYISRYRRR 82 (135)
T ss_pred HHHHHHHHHHHHhcCC
Confidence 4455566666666555
No 90
>PF11119 DUF2633: Protein of unknown function (DUF2633); InterPro: IPR022576 This family is conserved largely in Proteobacteria. Several members are named as YfgG. The function is not known.
Probab=24.36 E-value=69 Score=22.38 Aligned_cols=16 Identities=25% Similarity=0.440 Sum_probs=12.4
Q ss_pred chhHHHHHHHHHHHHH
Q 018781 5 SHSKLLLLSLSLSLFA 20 (350)
Q Consensus 5 ~~~~~~~~~~~~~~~~ 20 (350)
+|-|++||+.+|.||.
T Consensus 8 ~mtriVLLISfiIlfg 23 (59)
T PF11119_consen 8 RMTRIVLLISFIILFG 23 (59)
T ss_pred hHHHHHHHHHHHHHHH
Confidence 6788888888877774
No 91
>PF06585 JHBP: Haemolymph juvenile hormone binding protein (JHBP); InterPro: IPR010562 This family consists of several insect specific haemolymph juvenile hormone binding proteins (JHBP). Juvenile hormone (JH) has a profound effect on insects. It regulates embryogenesis, maintains the status quo of larva development and stimulates reproductive maturation in the adult forms. JH is transported from the sites of its synthesis to target tissues by a haemolymph carrier called juvenile hormone-binding protein (JHBP). JHBP protects the JH molecules from hydrolysis by non-specific esterases present in the insect haemolymph []. The crystal structure of the JHBP from Galleria mellonella (Wax moth) shows an unusual fold consisting of a long alpha-helix wrapped in a much curved antiparallel beta-sheet. The folding pattern for this structure closely resembles that found in some tandem-repeat mammalian lipid-binding and bactericidal permeability-increasing proteins, with a similar organisation of the major cavity and a disulphide bond linking the long helix and the beta-sheet. It would appear that JHBP forms two cavities, only one of which, the one near the N- and C-termini, binds the hormone; binding induces a conformational change, of unknown significance [, ].; PDB: 3A1Z_D 3AOS_B 3AOT_A 2RQF_A 2RCK_A 3E8W_A 3E8T_A.
Probab=23.85 E-value=85 Score=28.19 Aligned_cols=21 Identities=29% Similarity=0.319 Sum_probs=0.0
Q ss_pred hhHHHHHHHHHHHHHhhhhcc
Q 018781 6 HSKLLLLSLSLSLFACSSLAH 26 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~ 26 (350)
|+.++++.++++++++.+.++
T Consensus 1 M~~~~~~~~l~~~~~~~~~~~ 21 (248)
T PF06585_consen 1 MKLLVLFFLLVLVFSVASSAA 21 (248)
T ss_dssp ---------------------
T ss_pred CcchHHHHHHHHHHHHHHhcc
Confidence 666655555544444444433
No 92
>COG5633 Predicted periplasmic lipoprotein [General function prediction only]
Probab=22.78 E-value=51 Score=26.42 Aligned_cols=24 Identities=38% Similarity=0.386 Sum_probs=15.5
Q ss_pred hhHHHHHHHHHHHHHhhhhccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~ 29 (350)
|||..++++.++|++-|.+-+.+.
T Consensus 1 Mrk~~~~~l~~~lLvGCsS~~~i~ 24 (123)
T COG5633 1 MRKLCLLSLALLLLVGCSSHQEIL 24 (123)
T ss_pred CceehHHHHHHHHhhccCCCCCcc
Confidence 889888766666666555444333
No 93
>PRK09838 periplasmic copper-binding protein; Provisional
Probab=22.63 E-value=69 Score=25.63 Aligned_cols=19 Identities=11% Similarity=-0.099 Sum_probs=11.9
Q ss_pred hhHHHHHHHHHHHHHhhhh
Q 018781 6 HSKLLLLSLSLSLFACSSL 24 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~ 24 (350)
|||++..+++.++++.++.
T Consensus 1 mk~~~~~~~~~~~~~~~~~ 19 (115)
T PRK09838 1 MKKALKVAMFSLFSVIGFN 19 (115)
T ss_pred CchHHHHHHHHHHHHHhhh
Confidence 7888776666555554443
No 94
>TIGR02052 MerP mercuric transport protein periplasmic component. This model represents the periplasmic mercury (II) binding protein of the bacterial mercury detoxification system which passes mercuric ion to the MerT transporter for subsequent reduction to Hg(0) by the mercuric reductase MerA. MerP contains a distinctive GMTCXXC motif associated with metal binding. MerP is related to a larger family of metal binding proteins (pfam00403).
Probab=22.25 E-value=63 Score=23.08 Aligned_cols=13 Identities=31% Similarity=0.184 Sum_probs=7.6
Q ss_pred hhHHHHHHHHHHH
Q 018781 6 HSKLLLLSLSLSL 18 (350)
Q Consensus 6 ~~~~~~~~~~~~~ 18 (350)
|+|++.|.+|+++
T Consensus 1 ~~~~~~~~~~~~~ 13 (92)
T TIGR02052 1 MKKLATLLALFVL 13 (92)
T ss_pred ChhHHHHHHHHHH
Confidence 7787655554333
No 95
>PRK11671 mltC murein transglycosylase C; Provisional
Probab=22.12 E-value=1.4e+02 Score=29.00 Aligned_cols=17 Identities=29% Similarity=0.329 Sum_probs=10.6
Q ss_pred hhHHHHHHHHHHHHHhh
Q 018781 6 HSKLLLLSLSLSLFACS 22 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~ 22 (350)
|+|++.+++++.|+..|
T Consensus 1 ~k~~~~~~~~~~~l~~c 17 (359)
T PRK11671 1 MKKYLALALIAPLLISC 17 (359)
T ss_pred CchHHHHHHHHHHHhhh
Confidence 88888666554444444
No 96
>PRK09936 hypothetical protein; Provisional
Probab=20.42 E-value=1.8e+02 Score=27.37 Aligned_cols=24 Identities=17% Similarity=0.146 Sum_probs=15.4
Q ss_pred hhHHHHHHHHHHHHHhhhhcccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFSI 30 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~ 30 (350)
|+|+|.|+|.++ ++..++.+|.++
T Consensus 1 m~~~~~~~l~~l-~~~~~~~a~~g~ 24 (296)
T PRK09936 1 MRKFIFVLLTLL-LVSPFSQAMKGI 24 (296)
T ss_pred ChhHHHHHHHHH-HcCchhhccccc
Confidence 899988887744 334335566664
No 97
>COG5266 CbiK ABC-type Co2+ transport system, periplasmic component [Inorganic ion transport and metabolism]
Probab=20.27 E-value=70 Score=29.36 Aligned_cols=22 Identities=41% Similarity=0.463 Sum_probs=16.0
Q ss_pred hhHHHHHHHHHHHHHhhhhccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHD 27 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~ 27 (350)
|+|.+.|..++++|..++-|+.
T Consensus 1 MKK~l~l~a~~l~~s~~a~AH~ 22 (264)
T COG5266 1 MKKILTLGASSLLFSASAFAHF 22 (264)
T ss_pred CchhHHHHHHHHHHHhhhccee
Confidence 8999998888888855554443
No 98
>COG3054 Predicted transcriptional regulator [General function prediction only]
Probab=20.26 E-value=73 Score=26.88 Aligned_cols=24 Identities=29% Similarity=0.292 Sum_probs=17.4
Q ss_pred hhHHHHHHHHHHHHHhhhhccccc
Q 018781 6 HSKLLLLSLSLSLFACSSLAHDFS 29 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~ 29 (350)
|+|..++.++++++-..+.|+-.-
T Consensus 1 Mkk~~l~~~~~~~~p~~~~AHnlq 24 (184)
T COG3054 1 MKKRKLLALLCLLLPMMASAHNLQ 24 (184)
T ss_pred CchhhHHHHHHHHhHHHHHHhhcc
Confidence 788888888877777666666544
No 99
>PRK13733 conjugal transfer protein TraV; Provisional
Probab=20.09 E-value=1.5e+02 Score=25.47 Aligned_cols=18 Identities=22% Similarity=0.146 Sum_probs=10.5
Q ss_pred hhHHHHHHHHHHHHHhhh
Q 018781 6 HSKLLLLSLSLSLFACSS 23 (350)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~ 23 (350)
||+|..|++++.+|+++-
T Consensus 1 MK~~~~li~l~~~LlL~G 18 (171)
T PRK13733 1 MKQISLLIPLLGTLLLSG 18 (171)
T ss_pred CchhhHHHHHHHHHHhcc
Confidence 778766666555444443
Done!