RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11694
(655 letters)
>gnl|CDD|215726 pfam00112, Peptidase_C1, Papain family cysteine protease.
Length = 213
Score = 232 bits (593), Expect = 2e-72
Identities = 86/193 (44%), Positives = 114/193 (59%), Gaps = 17/193 (8%)
Query: 296 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSN 355
LPE+FDWR +G ++ VK+QG+C CWAFSAVG +E + I+ L LS QQLVDCD N
Sbjct: 1 LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQLVDCDTGN 60
Query: 356 GGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGE 415
GCNGG D+A +YI NGG+V++ YPY A + C + K+K Y +PY +
Sbjct: 61 NGCNGGLPDNAFEYIKKNGGIVTESDYPYTAHDGT--CKFKKSNSKYAKIKGYGDVPYND 118
Query: 416 EEEMKKWVATRGPLSVGMNANGLF--YYSGGVID---LNQRL--------YGTS--IPYW 460
EE ++ +A GP+SV ++A Y GV + L YGT +PYW
Sbjct: 119 EEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTECSGELDHAVLIVGYGTENGVPYW 178
Query: 461 IVKNSWGSDWGEK 473
IVKNSWG+DWGE
Sbjct: 179 IVKNSWGTDWGEN 191
Score = 168 bits (429), Expect = 1e-48
Identities = 66/158 (41%), Positives = 92/158 (58%), Gaps = 12/158 (7%)
Query: 497 KLSRLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEE 556
KL L+ ++LVDCD N GCNGG D+A +YI NGG+V++ YPY A + C +
Sbjct: 44 KLVSLSEQQLVDCDTGNNGCNGGLPDNAFEYIKKNGGIVTESDYPYTAHDGT--CKFKKS 101
Query: 557 EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLF--YYSGGVIDLNQRLCNPK 614
K+K Y +PY +EE ++ +A GP+SV ++A Y GV + C+ +
Sbjct: 102 NSKYAKIKGYGDVPYNDEEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTE--CSGE 159
Query: 615 AQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
+HA++IVGYG E PYWIVKNSWG+DWGE
Sbjct: 160 L-DHAVLIVGYGTENGV-----PYWIVKNSWGTDWGEN 191
Score = 85.7 bits (213), Expect = 6e-19
Identities = 27/52 (51%), Positives = 36/52 (69%)
Query: 148 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
LPE+FDWR +G ++ VK+QG+C CWAFSAVG +E + I+ L LS Q
Sbjct: 1 LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQ 52
>gnl|CDD|239068 cd02248, Peptidase_C1A, Peptidase C1A subfamily (MEROPS database
nomenclature); composed of cysteine peptidases (CPs)
similar to papain, including the mammalian CPs
(cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain
is an endopeptidase with specific substrate preferences,
primarily for bulky hydrophobic or aromatic residues at
the S2 subsite, a hydrophobic pocket in papain that
accommodates the P2 sidechain of the substrate (the
second residue away from the scissile bond). Most
members of the papain subfamily are endopeptidases. Some
exceptions to this rule can be explained by specific
details of the catalytic domains like the occluding loop
in cathepsin B which confers an additional
carboxydipeptidyl activity and the mini-chain of
cathepsin H resulting in an N-terminal exopeptidase
activity. Papain-like CPs have different functions in
various organisms. Plant CPs are used to mobilize
storage proteins in seeds. Parasitic CPs act
extracellularly to help invade tissues and cells, to
hatch or to evade the host immune system. Mammalian CPs
are primarily lysosomal enzymes with the exception of
cathepsin W, which is retained in the endoplasmic
reticulum. They are responsible for protein degradation
in the lysosome. Papain-like CPs are synthesized as
inactive proenzymes with N-terminal propeptide regions,
which are removed upon activation. In addition to its
inhibitory role, the propeptide is required for proper
folding of the newly synthesized enzyme and its
stabilization in denaturing pH conditions. Residues
within the propeptide region also play a role in the
transport of the proenzyme to lysosomes or acidified
vesicles. Also included in this subfamily are proteins
classified as non-peptidase homologs, which lack
peptidase activity or have missing active site residues.
Length = 210
Score = 223 bits (570), Expect = 5e-69
Identities = 83/193 (43%), Positives = 111/193 (57%), Gaps = 20/193 (10%)
Query: 297 PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCD-MSN 355
PE+ DWR +G ++ VK+QG C CWAFS VG +E +AI+ L LS QQLVDC N
Sbjct: 1 PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCSTSGN 60
Query: 356 GGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGE 415
GCNGG D+A +Y + NGG+ S+ YPY + C + K+ YS +P G+
Sbjct: 61 NGCNGGNPDNAFEY-VKNGGLASESDYPYTGKDGT--CKYNSSKV-GAKITGYSNVPPGD 116
Query: 416 EEEMKKWVATRGPLSVGMNANGLF-YYSGGVID----LNQRL--------YGTS--IPYW 460
EE +K +A GP+SV ++A+ F +Y GG+ N L YGT + YW
Sbjct: 117 EEALKAALANYGPVSVAIDASSSFQFYKGGIYSGPCCSNTNLNHAVLLVGYGTENGVDYW 176
Query: 461 IVKNSWGSDWGEK 473
IVKNSWG+ WGEK
Sbjct: 177 IVKNSWGTSWGEK 189
Score = 165 bits (419), Expect = 3e-47
Identities = 65/158 (41%), Positives = 91/158 (57%), Gaps = 13/158 (8%)
Query: 497 KLSRLATEKLVDCD-MSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGE 555
KL L+ ++LVDC N GCNGG D+A +Y+ NGG+ S+ YPY + C
Sbjct: 43 KLVSLSEQQLVDCSTSGNNGCNGGNPDNAFEYV-KNGGLASESDYPYTGKDGT--CKYNS 99
Query: 556 EEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLF-YYSGGVIDLNQRLCNPK 614
+ K+ YS +P G+EE +K +A GP+SV ++A+ F +Y GG+ C+
Sbjct: 100 SKV-GAKITGYSNVPPGDEEALKAALANYGPVSVAIDASSSFQFYKGGIYSGPC--CSNT 156
Query: 615 AQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
NHA+++VGYG E D YWIVKNSWG+ WGEK
Sbjct: 157 NLNHAVLLVGYGTENGVD-----YWIVKNSWGTSWGEK 189
Score = 86.9 bits (216), Expect = 2e-19
Identities = 25/50 (50%), Positives = 33/50 (66%)
Query: 149 PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQ 198
PE+ DWR +G ++ VK+QG C CWAFS VG +E +AI+ L LS Q
Sbjct: 1 PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQ 50
>gnl|CDD|214761 smart00645, Pept_C1, Papain family cysteine protease.
Length = 175
Score = 181 bits (462), Expect = 1e-53
Identities = 70/195 (35%), Positives = 92/195 (47%), Gaps = 60/195 (30%)
Query: 296 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCD-MS 354
LPE+FDWR +G ++ VK+QG+C CWAFSA G +E + I+ L LS QQLVDC
Sbjct: 1 LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCSGGG 60
Query: 355 NGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYG 414
N GCNGG D+A +YI NGG+ ++ YPY S
Sbjct: 61 NCGCNGGLPDNAFEYIKKNGGLETESCYPYTGS--------------------------- 93
Query: 415 EEEEMKKWVATRGPLSVGMNANGLFYYSGGVID----LNQRL--------YGTSI----P 458
V ++A+ +Y G+ D + L YGT +
Sbjct: 94 ----------------VAIDASDFQFYKSGIYDHPGCGSGTLDHAVLIVGYGTEVENGKD 137
Query: 459 YWIVKNSWGSDWGEK 473
YWIVKNSWG+DWGE
Sbjct: 138 YWIVKNSWGTDWGEN 152
Score = 118 bits (297), Expect = 1e-30
Identities = 51/157 (32%), Positives = 69/157 (43%), Gaps = 49/157 (31%)
Query: 497 KLSRLATEKLVDCD-MSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGE 555
KL L+ ++LVDC N GCNGG D+A +YI NGG+ ++ YPY S
Sbjct: 44 KLVSLSEQQLVDCSGGGNCGCNGGLPDNAFEYIKKNGGLETESCYPYTGS---------- 93
Query: 556 EEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLCNPKA 615
V ++A+ +Y G+ D C
Sbjct: 94 ---------------------------------VAIDASDFQFYKSGIYD--HPGCGSGT 118
Query: 616 QNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
+HA++IVGYG E + YWIVKNSWG+DWGE
Sbjct: 119 LDHAVLIVGYGTEVENGK---DYWIVKNSWGTDWGEN 152
Score = 89.9 bits (224), Expect = 7e-21
Identities = 26/52 (50%), Positives = 35/52 (67%)
Query: 148 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
LPE+FDWR +G ++ VK+QG+C CWAFSA G +E + I+ L LS Q
Sbjct: 1 LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQ 52
>gnl|CDD|185513 PTZ00203, PTZ00203, cathepsin L protease; Provisional.
Length = 348
Score = 166 bits (421), Expect = 6e-46
Identities = 97/299 (32%), Positives = 149/299 (49%), Gaps = 47/299 (15%)
Query: 198 QHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDL--QQLTG 255
+ + + Y ++ + +R NF N+E ++Q+ + A FG+ KFFDLSE++ + L G
Sbjct: 43 RTYQRAYGTLTEEQQRLANFERNLELMREHQARNP-HARFGITKFFDLSEAEFAARYLNG 101
Query: 256 LNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRH---GDDL---PEAFDWRAEGVIS 309
A F++ A Q + DL P+A DWR +G ++
Sbjct: 102 --------------AAYFAA--------AKQHAGQHYRKARADLSAVPDAVDWREKGAVT 139
Query: 310 KVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSNGGCNGGRMDDALQY 369
VK QG C CWAFSAVG +E+ A+ G+ L LS QQLV CD + GC GG M A ++
Sbjct: 140 PVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDHVDNGCGGGLMLQAFEW 199
Query: 370 IIDN--GGVVSDQAYPYKASESERG-CLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATR 426
++ N G V ++++YPY + + C E ++ Y + E M W+A
Sbjct: 200 VLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVSME-SSERVMAAWLAKN 258
Query: 427 GPLSVGMNANGLFYYSGGVI---DLNQRLYGT---------SIPYWIVKNSWGSDWGEK 473
GP+S+ ++A+ Y GV+ Q +G +PYW++KNSWG DWGEK
Sbjct: 259 GPISIAVDASSFMSYHSGVLTSCIGEQLNHGVLLVGYNMTGEVPYWVIKNSWGEDWGEK 317
Score = 105 bits (263), Expect = 1e-24
Identities = 54/159 (33%), Positives = 85/159 (53%), Gaps = 13/159 (8%)
Query: 497 KLSRLATEKLVDCDMSNGGCNGGRMDDALQYIIDN--GGVVSDQAYPYKASESERG-CLV 553
KL RL+ ++LV CD + GC GG M A ++++ N G V ++++YPY + + C
Sbjct: 169 KLVRLSEQQLVSCDHVDNGCGGGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSN 228
Query: 554 GEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLCNP 613
E ++ Y + E M W+A GP+S+ ++A+ Y GV+ C
Sbjct: 229 SSELAPGARIDGYVSME-SSERVMAAWLAKNGPISIAVDASSFMSYHSGVLTS----CIG 283
Query: 614 KAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
+ NH +++VGY G +PYW++KNSWG DWGEK
Sbjct: 284 EQLNHGVLLVGY----NMTG-EVPYWVIKNSWGEDWGEK 317
Score = 79.0 bits (194), Expect = 9e-16
Identities = 54/172 (31%), Positives = 82/172 (47%), Gaps = 32/172 (18%)
Query: 37 YLNSPVTR-FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFF 95
Y+ +P F F R + + Y ++ + +R NF N+E ++Q + A F + KFF
Sbjct: 29 YVGTPAAALFEEFKRTYQRAYGTLTEEQQRLANFERNLELMREHQARNP-HARFGITKFF 87
Query: 96 DLSDSDL--QQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRH---GDDL-- 148
DLS+++ + L G A F++ A Q + DL
Sbjct: 88 DLSEAEFAARYLNG--------------AAYFAA--------AKQHAGQHYRKARADLSA 125
Query: 149 -PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
P+A DWR +G ++ VK QG C CWAFSAVG +E+ A+ G+ L LS Q
Sbjct: 126 VPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQ 177
>gnl|CDD|240232 PTZ00021, PTZ00021, falcipain-2; Provisional.
Length = 489
Score = 156 bits (396), Expect = 4e-41
Identities = 96/308 (31%), Positives = 153/308 (49%), Gaps = 38/308 (12%)
Query: 191 NLTELSVQHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDL 250
N L ++ H K Y + +++ +R+ +FV N+ K + ++++ G+N+F DLS +
Sbjct: 167 NSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVLYKKGMNRFGDLSFEEF 226
Query: 251 QQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRHGDDL--PEAFDWRAEGVI 308
++ L L + + ++P N D + D +DWR +
Sbjct: 227 KK-KYLTL-KSFDFKSNGKKSPRVINYDDV------IKKYKPKDATFDHAKYDWRLHNGV 278
Query: 309 SKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSNGGCNGGRMDDALQ 368
+ VK+Q C CWAFS VGVVE+ +AI+ N L LS Q+LVDC N GC GG + +A +
Sbjct: 279 TPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELVDCSFKNNGCYGGLIPNAFE 338
Query: 369 YIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGP 428
+I+ GG+ S+ YPY S++ C + + K K+K Y IP E++ K+ + GP
Sbjct: 339 DMIELGGLCSEDDYPY-VSDTPELCNIDRCKE-KYKIKSYVSIP---EDKFKEAIRFLGP 393
Query: 429 LSVGMNANGLF-YYSGGVID------LNQRL----YGTSIP------------YWIVKNS 465
+SV + + F +Y GG+ D N + YG Y+I+KNS
Sbjct: 394 ISVSIAVSDDFAFYKGGIFDGECGEEPNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNS 453
Query: 466 WGSDWGEK 473
WG WGEK
Sbjct: 454 WGESWGEK 461
Score = 107 bits (269), Expect = 1e-24
Identities = 60/162 (37%), Positives = 92/162 (56%), Gaps = 15/162 (9%)
Query: 497 KLSRLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEE 556
+L L+ ++LVDC N GC GG + +A + +I+ GG+ S+ YPY S++ C +
Sbjct: 309 ELVSLSEQELVDCSFKNNGCYGGLIPNAFEDMIELGGLCSEDDYPY-VSDTPELCNIDRC 367
Query: 557 EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLF-YYSGGVIDLNQRLCNPKA 615
+ K K+K Y IP E++ K+ + GP+SV + + F +Y GG+ D C +
Sbjct: 368 KE-KYKIKSYVSIP---EDKFKEAIRFLGPISVSIAVSDDFAFYKGGIFDGE---CG-EE 419
Query: 616 QNHALIIVGYGEEEKKDGTSIP-----YWIVKNSWGSDWGEK 652
NHA+I+VGYG EE + + Y+I+KNSWG WGEK
Sbjct: 420 PNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNSWGESWGEK 461
Score = 80.6 bits (199), Expect = 6e-16
Identities = 47/175 (26%), Positives = 83/175 (47%), Gaps = 12/175 (6%)
Query: 28 ESNIFQTRGYLNS--PVTRFLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSG 85
N ++ + + V F F+++H K Y + +++ +R+ +FV N+ K + +++
Sbjct: 150 LINFADSKFLMTNLENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENV 209
Query: 86 TAVFEVNKFFDLSDSDLQQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRHG 145
+N+F DLS + ++ L L + + ++P N D +
Sbjct: 210 LYKKGMNRFGDLSFEEFKK-KYLTL-KSFDFKSNGKKSPRVINYDDV------IKKYKPK 261
Query: 146 DDL--PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQ 198
D +DWR ++ VK+Q C CWAFS VGVVE+ +AI+ N L LS Q
Sbjct: 262 DATFDHAKYDWRLHNGVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQ 316
>gnl|CDD|240310 PTZ00200, PTZ00200, cysteine proteinase; Provisional.
Length = 448
Score = 144 bits (365), Expect = 2e-37
Identities = 87/304 (28%), Positives = 140/304 (46%), Gaps = 49/304 (16%)
Query: 200 HDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDLQQLTGLNLD 259
+++ +++ + L R F N + + ++ ++ +NKF DL+E + ++L
Sbjct: 133 YNRKHATHAERLNRFLTFRNNYLEVKSHKGDE--PYSKEINKFSDLTEEEFRKL------ 184
Query: 260 STLEDIQPSLQAPFSSNQTDT----EMRAFQFNSLRHGDDL-----------PEAFDWRA 304
I+ ++ +S+ D +L+ + E DWR
Sbjct: 185 --FPVIKVPPKSNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRR 242
Query: 305 EGVISKVKEQGK-CACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSNGGCNGGRM 363
++KVK+QG C CWAFS+VG VE+++ I + +LS Q+LV+CD + GC+GG
Sbjct: 243 ADAVTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSVDLSEQELVNCDTKSQGCSGGYP 302
Query: 364 DDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWV 423
D AL+Y + N G+ S PY A + + C+V KV + Y + G++ K V
Sbjct: 303 DTALEY-VKNKGLSSSSDVPYLAKDGK--CVVS--STKKVYIDSYL-VAKGKDVLNKSLV 356
Query: 424 ATRGPLSVGMNA-NGLFYYSGGVID------LNQR--LYG------TSIPYWIVKNSWGS 468
P V + L Y GV + LN L G T YWI+KNSWG+
Sbjct: 357 I--SPTVVYIAVSRELLKYKSGVYNGECGKSLNHAVLLVGEGYDEKTKKRYWIIKNSWGT 414
Query: 469 DWGE 472
DWGE
Sbjct: 415 DWGE 418
Score = 99.4 bits (248), Expect = 4e-22
Identities = 63/186 (33%), Positives = 89/186 (47%), Gaps = 25/186 (13%)
Query: 467 GSDWGEKVEDKVGSSGNRTRDLELTGVLPSKLSRLATEKLVDCDMSNGGCNGGRMDDALQ 526
GS W V S RD + L+ ++LV+CD + GC+GG D AL+
Sbjct: 257 GSCWAFSSVGSVESLYKIYRDKSV---------DLSEQELVNCDTKSQGCSGGYPDTALE 307
Query: 527 YIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGP 586
Y + N G+ S PY A + + C+V KV + Y + G++ K V P
Sbjct: 308 Y-VKNKGLSSSSDVPYLAKDGK--CVVS--STKKVYIDSYL-VAKGKDVLNKSLVI--SP 359
Query: 587 LSVGMNA-NGLFYYSGGVIDLNQRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSW 645
V + L Y GV + C K+ NHA+++VG G +EK T YWI+KNSW
Sbjct: 360 TVVYIAVSRELLKYKSGVYNGE---CG-KSLNHAVLLVGEGYDEK---TKKRYWIIKNSW 412
Query: 646 GSDWGE 651
G+DWGE
Sbjct: 413 GTDWGE 418
Score = 72.8 bits (179), Expect = 2e-13
Identities = 40/171 (23%), Positives = 74/171 (43%), Gaps = 26/171 (15%)
Query: 45 FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFFDLSDSDLQQ 104
F F + +++ +++ + L R F N + + + E+NKF DL++ + ++
Sbjct: 126 FEEFNKKYNRKHATHAERLNRFLTFRNNYLEVK--SHKGDEPYSKEINKFSDLTEEEFRK 183
Query: 105 LTGLNLDSTLEDIQPSLQAPFSSNQTDT----EMRAFQFNSLRHGDDL-----------P 149
L I+ ++ +S+ D +L+ +
Sbjct: 184 L--------FPVIKVPPKSNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITG 235
Query: 150 EAFDWRAEGVISKVKEQGK-CACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
E DWR ++KVK+QG C CWAFS+VG VE+++ I + +LS Q
Sbjct: 236 EGLDWRRADAVTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSVDLSEQE 286
>gnl|CDD|239111 cd02620, Peptidase_C1A_CathepsinB, Cathepsin B group; composed of
cathepsin B and similar proteins, including
tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin
B is a lysosomal papain-like cysteine peptidase which is
expressed in all tissues and functions primarily as an
exopeptidase through its carboxydipeptidyl activity.
Together with other cathepsins, it is involved in the
degradation of proteins, proenzyme activation, Ag
processing, metabolism and apoptosis. Cathepsin B has
been implicated in a number of human diseases such as
cancer, rheumatoid arthritis, osteoporosis and
Alzheimer's disease. The unique carboxydipeptidyl
activity of cathepsin B is attributed to the presence of
an occluding loop in its active site which favors the
binding of the C-termini of substrate proteins. Some
members of this group do not possess the occluding loop.
TIN-Ag is an extracellular matrix basement protein which
was originally identified as a target Ag involved in
anti-tubular basement membrane antibody-mediated
interstitial nephritis. It plays a role in renal
tubulogenesis and is defective in hereditary
tubulointerstitial disorders. TIN-Ag is exclusively
expressed in kidney tissues.
Length = 236
Score = 126 bits (319), Expect = 4e-33
Identities = 68/218 (31%), Positives = 91/218 (41%), Gaps = 44/218 (20%)
Query: 297 PEAFDWRAE----GVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSL--TELSVQQLVD 350
PE+FD R + I ++++QG C CWAFSAV IQ N LS Q L+
Sbjct: 1 PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQDLLS 60
Query: 351 CDMSNG-GCNGGRMDDALQYIIDNGGVVSDQAYPYKASES-------------------- 389
C G GCNGG D A +Y+ G VV+ PY
Sbjct: 61 CCSGCGDGCNGGYPDAAWKYLTTTG-VVTGGCQPYTIPPCGHHPEGPPPCCGTPYCTPKC 119
Query: 390 ERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNAN-GLFYYSGGV--- 445
+ GC EE K K K +P +E ++ K + T GP+ YY GV
Sbjct: 120 QDGCEKTYEE-DKHKGKSAYSVP-SDETDIMKEIMTNGPVQAAFTVYEDFLYYKSGVYQH 177
Query: 446 IDLNQ------RL--YGTS--IPYWIVKNSWGSDWGEK 473
Q ++ +G +PYW+ NSWG+DWGE
Sbjct: 178 TSGKQLGGHAVKIIGWGVENGVPYWLAANSWGTDWGEN 215
Score = 101 bits (255), Expect = 2e-24
Identities = 52/175 (29%), Positives = 72/175 (41%), Gaps = 33/175 (18%)
Query: 500 RLATEKLVDCDMSNG-GCNGGRMDDALQYIIDNGGVVSDQAYPYKASES----------- 547
L+ + L+ C G GCNGG D A +Y+ G VV+ PY
Sbjct: 52 LLSAQDLLSCCSGCGDGCNGGYPDAAWKYLTTTG-VVTGGCQPYTIPPCGHHPEGPPPCC 110
Query: 548 ---------ERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNAN-GLF 597
+ GC EE K K K +P +E ++ K + T GP+
Sbjct: 111 GTPYCTPKCQDGCEKTYEE-DKHKGKSAYSVP-SDETDIMKEIMTNGPVQAAFTVYEDFL 168
Query: 598 YYSGGVIDLNQRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
YY GV Q + HA+ I+G+G E +PYW+ NSWG+DWGE
Sbjct: 169 YYKSGVY---QHTSGKQLGGHAVKIIGWGVENG-----VPYWLAANSWGTDWGEN 215
Score = 48.0 bits (115), Expect = 4e-06
Identities = 22/57 (38%), Positives = 28/57 (49%), Gaps = 6/57 (10%)
Query: 149 PEAFDWRAE----GVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNL--TELSVQH 199
PE+FD R + I ++++QG C CWAFSAV IQ N LS Q
Sbjct: 1 PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQD 57
>gnl|CDD|239112 cd02621, Peptidase_C1A_CathepsinC, Cathepsin C; also known as
Dipeptidyl Peptidase I (DPPI), an atypical papain-like
cysteine peptidase with chloride dependency and
dipeptidyl aminopeptidase activity, resulting from its
tetrameric structure which limits substrate access. Each
subunit of the tetramer is composed of three peptides:
the heavy and light chains, which together adopt the
papain fold and form the catalytic domain; and the
residual propeptide region, which forms a beta barrel
and points towards the substrate's N-terminus. The
subunit composition is the result of the unique
characteristic of procathepsin C maturation involving
the cleavage of the catalytic domain and the
non-autocatalytic excision of an activation peptide
within its propeptide region. By removing N-terminal
dipeptide extensions, cathepsin C activates granule
serine peptidases (granzymes) involved in cell-mediated
apoptosis, inflammation and tissue remodelling.
Loss-of-function mutations in cathepsin C are associated
with Papillon-Lefevre and Haim-Munk syndromes, rare
diseases characterized by hyperkeratosis and early-onset
periodontitis. Cathepsin C is widely expressed in many
tissues with high levels in lung, kidney and placenta.
It is also highly expressed in cytotoxic lymphocytes and
mature myeloid cells.
Length = 243
Score = 107 bits (269), Expect = 3e-26
Identities = 64/221 (28%), Positives = 94/221 (42%), Gaps = 45/221 (20%)
Query: 296 LPEAFDWR----AEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSL------TELSV 345
LP++FDW +S V+ QG C C+AF++V +EA I N LS
Sbjct: 1 LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSP 60
Query: 346 QQLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKV 405
Q ++ C + GC+GG ++ D G+V++ +PY A + +R C E +
Sbjct: 61 QHVLSCSQYSQGCDGGFPFLVGKFAEDF-GIVTEDYFPYTA-DDDRPCKASPSECRRYYF 118
Query: 406 KEYSRIP--YG--EEEEMKKWVATRGPLSVGMNAN--GLFYYSG--GVIDLNQRL----- 452
+Y+ + YG E+EMK + GP+ V FY G D ++
Sbjct: 119 SDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVYSDFDFYKEGVYHHTDNDEVSDGDND 178
Query: 453 ----------------YGT----SIPYWIVKNSWGSDWGEK 473
+G YWIVKNSWGS WGEK
Sbjct: 179 NFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEK 219
Score = 104 bits (260), Expect = 6e-25
Identities = 52/168 (30%), Positives = 80/168 (47%), Gaps = 20/168 (11%)
Query: 500 RLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGF 559
L+ + ++ C + GC+GG ++ D G +V++ +PY A + +R C E
Sbjct: 57 ILSPQHVLSCSQYSQGCDGGFPFLVGKFAEDFG-IVTEDYFPYTA-DDDRPCKASPSECR 114
Query: 560 KVKVKEYSRIP--YG--EEEEMKKWVATRGPLSVGMNAN--GLFYYSG--GVIDLNQRLC 611
+ +Y+ + YG E+EMK + GP+ V FY G D ++
Sbjct: 115 RYYFSDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVYSDFDFYKEGVYHHTDNDEVSD 174
Query: 612 NPKAQ-------NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
NHA+++VG+GE+E K YWIVKNSWGS WGEK
Sbjct: 175 GDNDNFNPFELTNHAVLLVGWGEDEIKGE---KYWIVKNSWGSSWGEK 219
Score = 42.0 bits (99), Expect = 4e-04
Identities = 22/62 (35%), Positives = 30/62 (48%), Gaps = 10/62 (16%)
Query: 148 LPEAFDWR----AEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNL------TELSV 197
LP++FDW +S V+ QG C C+AF++V +EA I N LS
Sbjct: 1 LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSP 60
Query: 198 QH 199
QH
Sbjct: 61 QH 62
>gnl|CDD|239110 cd02619, Peptidase_C1, C1 Peptidase family (MEROPS database
nomenclature), also referred to as the papain family;
composed of two subfamilies of cysteine peptidases
(CPs), C1A (papain) and C1B (bleomycin hydrolase).
Papain-like enzymes are mostly endopeptidases with some
exceptions like cathepsins B, C, H and X, which are
exopeptidases. Papain-like CPs have different functions
in various organisms. Plant CPs are used to mobilize
storage proteins in seeds while mammalian CPs are
primarily lysosomal enzymes responsible for protein
degradation in the lysosome. Papain-like CPs are
synthesized as inactive proenzymes with N-terminal
propeptide regions, which are removed upon activation.
Bleomycin hydrolase (BH) is a CP that detoxifies
bleomycin by hydrolysis of an amide group. It acts as a
carboxypeptidase on its C-terminus to convert itself
into an aminopeptidase and peptide ligase. BH is found
in all tissues in mammals as well as in many other
eukaryotes. It forms a hexameric ring barrel structure
with the active sites imbedded in the central channel.
Some members of the C1 family are proteins classified as
non-peptidase homologs which lack peptidase activity or
have missing active site residues.
Length = 223
Score = 103 bits (258), Expect = 5e-25
Identities = 57/204 (27%), Positives = 87/204 (42%), Gaps = 31/204 (15%)
Query: 300 FDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGN--SLTELSVQQLVDCD----- 352
D R ++ VK QG CWAF++ +E+ + I+G +LS Q L C
Sbjct: 2 VDLR-PLRLTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECL 60
Query: 353 MSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKA-SESERGCLVGEEEGFKVKVKEYSRI 411
NG C+GG AL ++ G+ ++ YPY A S+ E KVK+K+Y R+
Sbjct: 61 GINGSCDGGGPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRV 120
Query: 412 PYGEEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLYGT---------------- 455
E++K+ +A GP+ G + F I + +Y
Sbjct: 121 LKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAVVIVGY 180
Query: 456 ------SIPYWIVKNSWGSDWGEK 473
+IVKNSWG+DWG+
Sbjct: 181 DDNYVEGKGAFIVKNSWGTDWGDN 204
Score = 91.4 bits (227), Expect = 6e-21
Identities = 46/149 (30%), Positives = 72/149 (48%), Gaps = 8/149 (5%)
Query: 509 CDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKA-SESERGCLVGEEEGFKVKVKEYS 567
C NG C+GG AL ++ G+ ++ YPY A S+ E KVK+K+Y
Sbjct: 59 CLGINGSCDGGGPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYR 118
Query: 568 RIPYGEEEEMKKWVATRGPLSVGMNA-NGLFYYSGGVIDLNQRLCNPKAQ---NHALIIV 623
R+ E++K+ +A GP+ G + +G G+I + HA++IV
Sbjct: 119 RVLKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAVVIV 178
Query: 624 GYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
GY + + +IVKNSWG+DWG+
Sbjct: 179 GYDDNYVEGK---GAFIVKNSWGTDWGDN 204
Score = 41.7 bits (98), Expect = 5e-04
Identities = 16/51 (31%), Positives = 26/51 (50%), Gaps = 3/51 (5%)
Query: 152 FDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNN--LTELSVQHH 200
D R ++ VK QG CWAF++ +E+ + I+G +LS Q+
Sbjct: 2 VDLR-PLRLTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYL 51
>gnl|CDD|239149 cd02698, Peptidase_C1A_CathepsinX, Cathepsin X; the only
papain-like lysosomal cysteine peptidase exhibiting
carboxymonopeptidase activity. It can also act as a
carboxydipeptidase, like cathepsin B, but has been shown
to preferentially cleave substrates through a
monopeptidyl carboxypeptidase pathway. The propeptide
region of cathepsin X, the shortest among papain-like
peptidases, is covalently attached to the active site
cysteine in the inactive form of the enzyme. Little is
known about the biological function of cathepsin X. Some
studies point to a role in early tumorigenesis. A more
recent study indicates that cathepsin X expression is
restricted to immune cells suggesting a role in
phagocytosis and the regulation of the immune response.
Length = 239
Score = 99.0 bits (247), Expect = 3e-23
Identities = 58/215 (26%), Positives = 88/215 (40%), Gaps = 42/215 (19%)
Query: 296 LPEAFDWR-AEGV--ISKVKEQ---GKCACCWAFSAVGVVEAMHAIQGN---SLTELSVQ 346
LP+++DWR GV +S + Q C CWA + + I LSVQ
Sbjct: 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60
Query: 347 QLVDCDMSNGG-CNGGRMDDALQYIIDNGGVVSDQAYPYKASESE-------RGCLVGEE 398
++DC + GG C+GG +Y +G + + PY+A + E C E
Sbjct: 61 VVIDC--AGGGSCHGGDPGGVYEYAHKHG-IPDETCNPYQAKDGECNPFNRCGTCNPFGE 117
Query: 399 -----EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFY-YSGGVIDLNQRL 452
V +Y + ++M + RGP+S G+ A Y+GGV +
Sbjct: 118 CFAIKNYTLYFVSDYGSV--SGRDKMMAEIYARGPISCGIMATEALENYTGGVYKEYVQD 175
Query: 453 -----------YGT---SIPYWIVKNSWGSDWGEK 473
+G + YWIV+NSWG WGE+
Sbjct: 176 PLINHIISVAGWGVDENGVEYWIVRNSWGEPWGER 210
Score = 84.8 bits (210), Expect = 2e-18
Identities = 45/166 (27%), Positives = 73/166 (43%), Gaps = 26/166 (15%)
Query: 501 LATEKLVDCDMSNGG-CNGGRMDDALQYIIDNGGVVSDQAYPYKASESE-------RGCL 552
L+ + ++DC + GG C+GG +Y +G + + PY+A + E C
Sbjct: 57 LSVQVVIDC--AGGGSCHGGDPGGVYEYAHKHG-IPDETCNPYQAKDGECNPFNRCGTCN 113
Query: 553 VGEE-----EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFY-YSGGVIDL 606
E V +Y + ++M + RGP+S G+ A Y+GGV
Sbjct: 114 PFGECFAIKNYTLYFVSDYGSV--SGRDKMMAEIYARGPISCGIMATEALENYTGGVYKE 171
Query: 607 NQRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
+ NH + + G+G +E + YWIV+NSWG WGE+
Sbjct: 172 YVQDPLI---NHIISVAGWGVDENG----VEYWIVRNSWGEPWGER 210
Score = 35.9 bits (83), Expect = 0.039
Identities = 18/61 (29%), Positives = 25/61 (40%), Gaps = 9/61 (14%)
Query: 148 LPEAFDWR-AEGV--ISKVKEQ---GKCACCWAFSAVGVVEAMHAIQGNN---LTELSVQ 198
LP+++DWR GV +S + Q C CWA + + I LSVQ
Sbjct: 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60
Query: 199 H 199
Sbjct: 61 V 61
>gnl|CDD|214853 smart00848, Inhibitor_I29, Cathepsin propeptide inhibitor domain
(I29). This domain is found at the N-terminus of some
C1 peptidases such as Cathepsin L where it acts as a
propeptide. There are also a number of proteins that
are composed solely of multiple copies of this domain
such as the peptidase inhibitor salarin. This family is
classified as I29 by MEROPS. Peptide proteinase
inhibitors can be found as single domain proteins or as
single or multiple domains within proteins; these are
referred to as either simple or compound inhibitors,
respectively. In many cases they are synthesised as
part of a larger precursor protein, either as a
prepropeptide or as an N-terminal domain associated
with an inactive peptidase or zymogen. This domain
prevents access of the substrate to the active site.
Removal of the N-terminal inhibitor domain either by
interaction with a second peptidase or by autocatalytic
cleavage activates the zymogen. Other inhibitors
interact directly with proteinases using a simple
noncovalent lock and key mechanism; while yet others
use a conformational change-based trapping mechanism
that depends on their structural and thermodynamic
properties.
Length = 57
Score = 59.9 bits (146), Expect = 1e-11
Identities = 18/55 (32%), Positives = 30/55 (54%)
Query: 45 FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFFDLSD 99
F + + H K YSS E+ RR F N++K E++ ++ + VN+F DL+
Sbjct: 1 FEQWKKKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYEHSYKLGVNQFSDLTP 55
Score = 54.2 bits (131), Expect = 1e-09
Identities = 18/50 (36%), Positives = 28/50 (56%)
Query: 198 QHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSE 247
+ H K YSS E+ RR F N++K E++ + + GVN+F DL+
Sbjct: 6 KKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYEHSYKLGVNQFSDLTP 55
>gnl|CDD|219764 pfam08246, Inhibitor_I29, Cathepsin propeptide inhibitor domain
(I29). This domain is found at the N-terminus of some
C1 peptidases such as Cathepsin L where it acts as a
propeptide. There are also a number of proteins that are
composed solely of multiple copies of this domain such
as the peptidase inhibitor salarin. This family is
classified as I29 by MEROPS.
Length = 58
Score = 56.1 bits (136), Expect = 2e-10
Identities = 15/58 (25%), Positives = 32/58 (55%)
Query: 45 FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFFDLSDSDL 102
F ++ + + K Y S E+ L R + F N+ E++ ++ + + +N+F DL+D +
Sbjct: 1 FEDWKKKYGKSYYSEEEELYRFQIFKENLRFIEEHNKKGNVSYTLGLNQFADLTDEEF 58
Score = 49.2 bits (118), Expect = 7e-08
Identities = 14/54 (25%), Positives = 29/54 (53%)
Query: 197 VQHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDL 250
+ + K Y S E+ L R + F N+ E++ + + + G+N+F DL++ +
Sbjct: 5 KKKYGKSYYSEEEELYRFQIFKENLRFIEEHNKKGNVSYTLGLNQFADLTDEEF 58
>gnl|CDD|240244 PTZ00049, PTZ00049, cathepsin C-like protein; Provisional.
Length = 693
Score = 58.4 bits (141), Expect = 9e-09
Identities = 33/96 (34%), Positives = 46/96 (47%), Gaps = 20/96 (20%)
Query: 574 EEEMKKWVATRGPLSVGMNANGLFY-YSGGVIDLNQ----RLCN---PKAQ--------- 616
E+ M + GP+ A+ FY Y+ GV + R C PK
Sbjct: 557 EKIMMNEIYRNGPIVASFEASPDFYDYADGVYYVEDFPHARRCTVDLPKHNGVYNITGWE 616
Query: 617 --NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWG 650
NHA+++VG+GEEE +G YWI +NSWG +WG
Sbjct: 617 KVNHAIVLVGWGEEEI-NGKLYKYWIGRNSWGKNWG 651
Score = 36.1 bits (83), Expect = 0.060
Identities = 29/118 (24%), Positives = 45/118 (38%), Gaps = 25/118 (21%)
Query: 294 DDLPEAFDW----RAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQ----------GNS 339
D+LP+ F W V Q C C+ S + + I N
Sbjct: 379 DELPKNFTWGDPFNNNTREYDVTNQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNF 438
Query: 340 LTELSVQQLVDCDMSNGGCNGGRMDDALQYIIDN----GGVVSDQAYPYKASESERGC 393
LS+Q ++ C + GCNGG Y++ G+ D+ +PY A +E+ C
Sbjct: 439 DDLLSIQTVLSCSFYDQGCNGG-----FPYLVSKMAKLQGIPLDKVFPYTA--TEQTC 489
Score = 33.8 bits (77), Expect = 0.38
Identities = 10/18 (55%), Positives = 12/18 (66%)
Query: 454 GTSIPYWIVKNSWGSDWG 471
G YWI +NSWG +WG
Sbjct: 634 GKLYKYWIGRNSWGKNWG 651
>gnl|CDD|227207 COG4870, COG4870, Cysteine protease [Posttranslational
modification, protein turnover, chaperones].
Length = 372
Score = 56.0 bits (135), Expect = 3e-08
Identities = 61/281 (21%), Positives = 87/281 (30%), Gaps = 41/281 (14%)
Query: 228 QSEDSGTAVFGVNKFFDLSESDLQQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQF 287
Q + +L+ + DS L SL+ + Q +
Sbjct: 31 QLVLLRDKLSTSGIIIELAPKLIDFSEPEEKDSLLPVSLDSLEDCSPTGQVPDPVDLGSC 90
Query: 288 NSLRHGDDLPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQ----GNSLTEL 343
+L LP FD R EG +S VK+QG CWAF+ +E + S +
Sbjct: 91 TALNASASLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLE-SYLNPESAWDFSENNM 149
Query: 344 SVQQLVDC--DMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASE--SERGCLVGE-- 397
V +GG D + Y+ + G V + PY + S V +
Sbjct: 150 KNLLGVPYEKGFDYTSNDGGNADMSAAYLTEWSGPVYETDDPYSENSYFSPTNLPVTKHV 209
Query: 398 -EEGFKVKVKEY---SRIP-----YGEEEEMKKWVATR------GPLSVGMNANGLFYYS 442
E K+Y I YG AT V N
Sbjct: 210 QEAQIIPSRKKYLDNGNIKAMFGFYGAVSSSMYIDATNSLGICIPYPYVDSGENW----- 264
Query: 443 G------GVIDLNQRLYGTSIPY----WIVKNSWGSDWGEK 473
G G D P +I+KNSWG++WGE
Sbjct: 265 GHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGEN 305
Score = 45.6 bits (108), Expect = 6e-05
Identities = 34/159 (21%), Positives = 55/159 (34%), Gaps = 1/159 (0%)
Query: 96 DLSDSDLQQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRHGDDLPEAFDWR 155
+L+ + DS L SL+ + Q + +L LP FD R
Sbjct: 47 ELAPKLIDFSEPEEKDSLLPVSLDSLEDCSPTGQVPDPVDLGSCTALNASASLPSYFDRR 106
Query: 156 AEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQHHDKVYSSVEDLLRRHE 215
EG +S VK+QG CWAF+ +E + + + + E
Sbjct: 107 DEGKVSPVKDQGSGGSCWAFATTRSLE-SYLNPESAWDFSENNMKNLLGVPYEKGFDYTS 165
Query: 216 NFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDLQQLT 254
N N + + Y +E SG + + + S L
Sbjct: 166 NDGGNADMSAAYLTEWSGPVYETDDPYSENSYFSPTNLP 204
Score = 44.4 bits (105), Expect = 1e-04
Identities = 16/41 (39%), Positives = 24/41 (58%), Gaps = 5/41 (12%)
Query: 617 NHALIIVGYGEEEKKDGT-SIPY----WIVKNSWGSDWGEK 652
HA++IVGY + + P +I+KNSWG++WGE
Sbjct: 265 GHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGEN 305
>gnl|CDD|240381 PTZ00364, PTZ00364, dipeptidyl-peptidase I precursor; Provisional.
Length = 548
Score = 54.5 bits (131), Expect = 1e-07
Identities = 49/236 (20%), Positives = 79/236 (33%), Gaps = 54/236 (22%)
Query: 289 SLRHGDDLPEAFDWRAEGVISKVKE---QG---KCACCWAFSAVGVVEAMHAIQGN---- 338
S + GD P A+ W G S + C + +A+ + A + N
Sbjct: 198 SHQLGDPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDP 257
Query: 339 --SLTELSVQQLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAY-PYKASESERGCLV 395
T LS + ++DC GC GG ++ ++ G + +D Y PY + +
Sbjct: 258 LGQQTFLSARHVLDCSQYGQGCAGGFPEEVGKFAETFGILTTDSYYIPYDSGDGVERACK 317
Query: 396 GEEEGFKVKVKEYSRIP--YG---EEEEMKKWVATRGPLSVGMNANGLFY---------- 440
+ Y + YG + +E+ + GP+ + AN +Y
Sbjct: 318 TRRPSRRYYFTNYGPLGGYYGAVTDPDEIIWEIYRHGPVPASVYANSDWYNCDENSTEDV 377
Query: 441 -------YSGGVIDLNQRLY--------------GT---SIPYWIVKNSWGS--DW 470
YS D R Y GT YW+V + WGS W
Sbjct: 378 RYVSLDDYSTASADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSW 433
Score = 49.9 bits (119), Expect = 4e-06
Identities = 35/176 (19%), Positives = 65/176 (36%), Gaps = 29/176 (16%)
Query: 499 SRLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAY-PYKASESERGCLVGEEE 557
+ L+ ++DC GC GG ++ ++ G + +D Y PY + +
Sbjct: 262 TFLSARHVLDCSQYGQGCAGGFPEEVGKFAETFGILTTDSYYIPYDSGDGVERACKTRRP 321
Query: 558 GFKVKVKEYSRIP--YG---EEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLCN 612
+ Y + YG + +E+ + GP+ + AN +Y + R +
Sbjct: 322 SRRYYFTNYGPLGGYYGAVTDPDEIIWEIYRHGPVPASVYANSDWYNCDENSTEDVRYVS 381
Query: 613 PKAQ-----------------NHALIIVGYGEEEKKDGTSIPYWIVKNSWGS--DW 649
NH ++I+G+G +E YW+V + WGS W
Sbjct: 382 LDDYSTASADRPLRHYFASNVNHTVLIIGWGTDE----NGGDYWLVLDPWGSRRSW 433
Score = 32.9 bits (75), Expect = 0.58
Identities = 16/71 (22%), Positives = 25/71 (35%), Gaps = 12/71 (16%)
Query: 141 SLRHGDDLPEAFDWRAEGVISKVKE---QG---KCACCWAFSAVGVVEAMHAIQGNNL-- 192
S + GD P A+ W G S + C + +A+ + A + N
Sbjct: 198 SHQLGDPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDP 257
Query: 193 ----TELSVQH 199
T LS +H
Sbjct: 258 LGQQTFLSARH 268
>gnl|CDD|185641 PTZ00462, PTZ00462, Serine-repeat antigen protein; Provisional.
Length = 1004
Score = 52.4 bits (125), Expect = 7e-07
Identities = 20/45 (44%), Positives = 26/45 (57%)
Query: 608 QRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
Q LC +HA+ IVGYG + YWIV+NSWG WG++
Sbjct: 713 QNLCGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDE 757
Score = 39.3 bits (91), Expect = 0.007
Identities = 49/228 (21%), Positives = 82/228 (35%), Gaps = 62/228 (27%)
Query: 310 KVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMS--NGGCNGGRMD-DA 366
++++QG CA W F++ +E + ++G +S + +C C+ G +
Sbjct: 546 QIEDQGNCAISWIFASKYHLETIKCMKGYEPHAISALYIANCSKGEHKDRCDEGSNPLEF 605
Query: 367 LQYIIDNGGVVSDQAYPY--------------------------------KASESERGCL 394
LQ I DNG + +D Y Y S +
Sbjct: 606 LQIIEDNGFLPADSNYLYNYTKVGEDCPDEEDHWMNLLDHGKILNHNKKEPNSLDGKAYR 665
Query: 395 VGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFYY--SGGVID----- 447
E E F K+ + +I +K + +G + + A + Y +G +
Sbjct: 666 AYESEHFHDKMDAFIKI-------IKDEIMNKGSVIAYIKAENVLGYEFNGKKVQNLCGD 718
Query: 448 ------LNQRLYGTSI-------PYWIVKNSWGSDWGEKVEDKVGSSG 482
+N YG I YWIV+NSWG WG++ KV G
Sbjct: 719 DTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGYFKVDMYG 766
>gnl|CDD|202517 pfam03051, Peptidase_C1_2, Peptidase C1-like family. This family
is closely related to the Peptidase_C1 family pfam00112,
containing several prokaryotic and eukaryotic
aminopeptidases and bleomycin hydrolases.
Length = 438
Score = 34.2 bits (79), Expect = 0.23
Identities = 14/36 (38%), Positives = 21/36 (58%), Gaps = 3/36 (8%)
Query: 617 NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
HA+++ G E++ T W V+NSWG D G+K
Sbjct: 360 THAMVLTGVDEDDDGKPTK---WKVENSWGDDSGKK 392
>gnl|CDD|238328 cd00585, Peptidase_C1B, Peptidase C1B subfamily (MEROPS database
nomenclature); composed of eukaryotic bleomycin
hydrolases (BH) and bacterial aminopeptidases C (pepC).
The proteins of this subfamily contain a large insert
relative to the C1A peptidase (papain) subfamily. BH is
a cysteine peptidase that detoxifies bleomycin by
hydrolysis of an amide group. It acts as a
carboxypeptidase on its C-terminus to convert itself
into an aminopeptidase and peptide ligase. BH is found
in all tissues in mammals as well as in many other
eukaryotes. Bleomycin, a glycopeptide derived from the
fungus Streptomyces verticillus, is an effective
anticancer drug due to its ability to induce DNA strand
breaks. Human BH is the major cause of tumor cell
resistance to bleomycin chemotherapy, and is also
genetically linked to Alzheimer's disease. In addition
to its peptidase activity, the yeast BH (Gal6) binds DNA
and acts as a repressor in the Gal4 regulatory system.
BH forms a hexameric ring barrel structure with the
active sites imbedded in the central channel. The
bacterial homolog of BH, called pepC, is a cysteine
aminopeptidase possessing broad specificity. Although
its crystal structure has not been solved, biochemical
analysis shows that pepC also forms a hexamer.
Length = 437
Score = 32.6 bits (75), Expect = 0.61
Identities = 14/41 (34%), Positives = 20/41 (48%), Gaps = 6/41 (14%)
Query: 617 NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK---VM 654
HA+++ G +E W V+NSWG G+K VM
Sbjct: 359 THAMVLTGVDLDEDGKPVK---WKVENSWGEKVGKKGYFVM 396
>gnl|CDD|237194 PRK12765, PRK12765, flagellar capping protein; Provisional.
Length = 595
Score = 31.4 bits (71), Expect = 1.6
Identities = 24/68 (35%), Positives = 33/68 (48%), Gaps = 1/68 (1%)
Query: 198 QHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVF-GVNKFFDLSESDLQQLTGL 256
Q + V +VEDL+ + N VTN+ A Y SE F GV++ + S L L
Sbjct: 292 QDTEGVTKAVEDLVDAYNNLVTNLNAATSYNSETGTKGTFQGVSEITSIRSSILADLFSQ 351
Query: 257 NLDSTLED 264
+D T ED
Sbjct: 352 VVDGTDED 359
>gnl|CDD|227573 COG5248, TAF19, Transcription initiation factor TFIID, subunit
TAF13 [Transcription].
Length = 126
Score = 29.5 bits (66), Expect = 2.6
Identities = 18/93 (19%), Positives = 31/93 (33%), Gaps = 7/93 (7%)
Query: 15 GYLHTFMIKVALLESNIFQTRGYLNSPVTRFLNFMRDHDKVYSSVEDLLRRHENFVTNVE 74
Y+ +M + N+ Q R F +R K VE+LL +E
Sbjct: 38 EYVLDYMSILCTNAHNMAQVRNKTK--TEDFKFALRRDPKKLGRVEELLITNEEI---KL 92
Query: 75 KAEDYQREDSGTAVF-EVNKFFDLSDSDLQQLT 106
+ ++ +DS +N S L +
Sbjct: 93 AKKAFEPKDSRYRKELRINTKMH-SFITLNKFI 124
>gnl|CDD|183636 PRK12631, flgC, flagellar basal body rod protein FlgC; Provisional.
Length = 138
Score = 29.5 bits (66), Expect = 2.8
Identities = 24/75 (32%), Positives = 33/75 (44%), Gaps = 7/75 (9%)
Query: 191 NLTELSVQHHDKVYSSVEDLLR-RHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESD 249
N T ++ + D V SSV+ R RH F + KA+ Q G AV G+ ESD
Sbjct: 22 NTTASNIANADSVSSSVDKTYRARHPIFEAEMAKAQSQQQASQGVAVKGI------VESD 75
Query: 250 LQQLTGLNLDSTLED 264
L + D + D
Sbjct: 76 KPLLKEYSPDHPMAD 90
>gnl|CDD|226107 COG3579, PepC, Aminopeptidase C [Amino acid transport and
metabolism].
Length = 444
Score = 30.5 bits (69), Expect = 3.2
Identities = 13/35 (37%), Positives = 20/35 (57%), Gaps = 3/35 (8%)
Query: 618 HALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
HA+++ G +E + W V+NSWG D G+K
Sbjct: 363 HAMVLTGVDLDETGNPLR---WKVENSWGKDVGKK 394
>gnl|CDD|184358 PRK13874, PRK13874, conjugal transfer protein TrbJ; Provisional.
Length = 230
Score = 29.5 bits (67), Expect = 4.7
Identities = 13/27 (48%), Positives = 15/27 (55%), Gaps = 2/27 (7%)
Query: 324 SAVGVVEAMHAIQGNSLTELSVQQLVD 350
SA G ++A A GN L L QQL D
Sbjct: 170 SATGALQAAQA--GNQLLALQAQQLAD 194
>gnl|CDD|191696 pfam07168, Ureide_permease, Ureide permease. Heterocyclic nitrogen
compounds may serve as nitrogen sources or nitrogen
transport compounds in plants that are not able to fix
nitrogen. This family represents ureide permease, a
transporter of a wide spectrum of oxo derivatives of
heterocyclic nitrogen compounds, including allantoin,
uric acid and xanthine; it has 10 putative transmembrane
domains with a large cytosolic central domain containing
a 'Walker A' motif. Ureide permease is likely to
transport other purine degradation products when
nitrogen sources are low. Transport is dependent on
glucose and a proton gradient. The family is found in
bacteria, plants and yeast.
Length = 336
Score = 29.4 bits (66), Expect = 5.7
Identities = 14/62 (22%), Positives = 26/62 (41%), Gaps = 2/62 (3%)
Query: 177 AVGVVEAMHAIQG-NNLTELSVQHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTA 235
AV + A+H+ +N +L+ + + S+ L ++E E +GTA
Sbjct: 140 AVFLGSAVHSSNAADNKEKLNAFENYQSEFSISSLELMSRMNSEDLENGEA-DDAKAGTA 198
Query: 236 VF 237
F
Sbjct: 199 EF 200
>gnl|CDD|219580 pfam07793, DUF1631, Protein of unknown function (DUF1631). The
members of this family are sequences derived from a
group of hypothetical proteins expressed by certain
bacterial species. The region concerned is approximately
440 amino acid residues in length.
Length = 729
Score = 29.2 bits (66), Expect = 9.6
Identities = 19/71 (26%), Positives = 32/71 (45%), Gaps = 19/71 (26%)
Query: 14 LGYLHTFMIKVALLESNIFQTRGYLNSPVTRFLNFM--------------RD--HDKVYS 57
+G L ++KVALL+ + F +RG P R LN + RD + K+
Sbjct: 375 IGRLQIPVLKVALLDKSFF-SRG--EHPARRLLNEIAEAGIGWGGDDDGLRDSLYAKIEE 431
Query: 58 SVEDLLRRHEN 68
V+ +L ++
Sbjct: 432 IVQRILNEFDD 442
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.315 0.133 0.400
Gapped
Lambda K H
0.267 0.0764 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 33,042,956
Number of extensions: 3205812
Number of successful extensions: 2058
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1981
Number of HSP's successfully gapped: 63
Length of query: 655
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 552
Effective length of database: 6,369,140
Effective search space: 3515765280
Effective search space used: 3515765280
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 62 (27.6 bits)
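
The effective lengths and search space in the footer above can be re-derived from the other reported values. The sketch below is a minimal check, not part of the RPS-BLAST output: the function names are hypothetical, and it assumes the usual Karlin-Altschul conversion from raw to bit scores, bits = (lambda * S - ln K) / ln 2, with the gapped parameters applied to S2 and the ungapped ones to S1. The footer values appear consistent with that assumption.

```python
import math

# Values copied from the statistics footer of this report.
QUERY_LENGTH = 655
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 103
UNGAPPED_LAMBDA, UNGAPPED_K = 0.315, 0.133
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0764


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective database length, their product).

    The length adjustment is subtracted once from the query and once
    from every database sequence.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits (assumed Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    print(effective_search_space(
        QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT))
    # S1 (raw 42) uses the ungapped parameters, S2 (raw 62) the gapped ones.
    print(round(bit_score(42, UNGAPPED_LAMBDA, UNGAPPED_K), 1))
    print(round(bit_score(62, GAPPED_LAMBDA, GAPPED_K), 1))
```

Because Lambda and K are printed to only three significant figures, bit scores recomputed this way for individual hits may differ from the reported values in the last digit.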