Citrus Sinensis ID: 047264


Local Sequence Feature Prediction

Prediction (Method); result rows follow in the same order
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------15
MGEREREREMKMLSSNCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR
ccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHccccccccccccccccccccccccHHHHHHHHHHHHHHHccccccHHHHHHHHHHHHHHHHHHHHHHHcccccccEEEcccccccccHHHHHHHHcccccccccccccc
cccccccccccccccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHccccccEEEccHHHcccHHHHHHHHHHHHHHHccccccHHHHHHHHHHHHHHHHHHHHHHccccccccEEEEEcccccccHHHHHHHHccccccccHHHccc
MGEREREREMKMLSSNCKLLLRLSLCVSLLMIICIRcsssrtlqgheydgefsivgyspeeltsTDKLVELFESWMlkhgksyesteEKLHRLEIFKDNLKHIDARNRELQITSsywlglnefsdmsheEFKNkyltglkpdddefrrr
MGEREREREMKMLSSNCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHgksyesteekLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFknkyltglkpdddefrrr
MGererereMKMlssncklllrlslcvsllMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR
***************NCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSD************************
***************NCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKY**************
***********MLSSNCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR
***********MLSSNCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLT************
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHHoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHHHHHoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHoooooooooooooooooo
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MGEREREREMKMLSSNCKLLLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR
no confident homologs detected
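
The per-residue tracks above are easier to compare once each one is collapsed into runs of identical states. The following is a minimal sketch of that step; the function name `segments` and the toy track are illustrative assumptions and are not part of the report.

```python
from itertools import groupby


def segments(track):
    """Collapse a per-residue track into (state, start, end) runs.

    Positions are 1-based and inclusive, matching the residue number marker.
    """
    runs = []
    pos = 1
    for state, group in groupby(track):
        length = sum(1 for _ in group)
        runs.append((state, pos, pos + length - 1))
        pos += length
    return runs


# Toy topology track (hypothetical, not copied from the report):
# 3 inside residues, a 4-residue helix, 3 outside residues.
toy = "iiiHHHHooo"
print(segments(toy))
```

Applied to a topology or secondary-structure row, this gives the start and end of every predicted helix, signal-peptide, or loop stretch in one pass.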

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   149     2.2.26 [Sep-21-2011]
O65493  355     Xylem cysteine proteinase  no        no        0.771    0.323    0.515     4e-31
P05994  348     Papaya proteinase 4 OS=Ca  N/A       no        0.624    0.267    0.578     4e-28
P00784  345     Papain OS=Carica papaya P  N/A       no        0.724    0.313    0.492     4e-27
P10056  348     Caricain OS=Carica papaya  N/A       no        0.751    0.321    0.469     1e-26
Q9LM66  356     Xylem cysteine proteinase  no        no        0.590    0.247    0.630     6e-26
P14080  352     Chymopapain OS=Carica pap  N/A       no        0.791    0.335    0.440     1e-23
P43297  462     Cysteine proteinase RD21a  no        no        0.630    0.203    0.377     2e-11
P00785  380     Actinidain OS=Actinidia c  N/A       no        0.476    0.186    0.438     2e-11
A5HII1  380     Actinidain OS=Actinidia d  N/A       no        0.476    0.186    0.438     2e-11
P80884  345     Ananain OS=Ananas comosus  N/A       no        0.483    0.208    0.413     3e-11
>sp|O65493|XCP1_ARATH Xylem cysteine proteinase 1 OS=Arabidopsis thaliana GN=XCP1 PE=1 SV=1
 Score =  133 bits (335), Expect = 4e-31,   Method: Compositional matrix adjust.
 Identities = 67/130 (51%), Positives = 92/130 (70%), Gaps = 15/130 (11%)

Query: 20  LLRLSLCVSLLMIICIRCSSSRTLQGHEYDGEFSIVGYSPEELTSTDKLVELFESWMLKH 79
           L + SL V++     + C+ +R         +FSIVGY+PE LT+TDKL+ELFESWM +H
Sbjct: 8   LSKFSLLVAISASALLCCAFAR---------DFSIVGYTPEHLTNTDKLLELFESWMSEH 58

Query: 80  GKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLTGL 139
            K+Y+S EEK+HR E+F++NL HID RN E+   +SYWLGLNEF+D++HEEFK +YL   
Sbjct: 59  SKAYKSVEEKVHRFEVFRENLMHIDQRNNEI---NSYWLGLNEFADLTHEEFKGRYLGLA 115

Query: 140 KPDDDEFRRR 149
           KP   +F R+
Sbjct: 116 KP---QFSRK 122
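
The coverage and identity columns in the table can be reproduced from the alignment statistics. The sketch below assumes, based only on the numbers shown in this report, that coverage is (alignment columns minus gap columns) divided by sequence length, that identity is identities divided by alignment columns, and that values are truncated rather than rounded to three decimals. The function names are hypothetical and this is not the server's documented formula.

```python
import math


def trunc3(x):
    """Truncate (not round) to three decimals, as the table values appear to be."""
    return math.floor(x * 1000) / 1000


def coverage(aligned_cols, gaps, seq_len):
    """Fraction of a sequence covered by ungapped alignment columns (assumed formula)."""
    return trunc3((aligned_cols - gaps) / seq_len)


def identity(identical, aligned_cols):
    """Fraction of alignment columns that are identical."""
    return trunc3(identical / aligned_cols)


# O65493 alignment: Identities = 67/130, Gaps = 15/130, query 149 aa, hit 355 aa.
print(coverage(130, 15, 149))  # Q cover  -> 0.771
print(coverage(130, 15, 355))  # H cover  -> 0.323
print(identity(67, 130))       # Identity -> 0.515
```

Rounding instead of truncating would give a Q cover of 0.772, which does not match the table, hence the truncation assumption.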




Probable thiol protease.
Arabidopsis thaliana (taxid: 3702)
EC: 3.4.22.-
>sp|P05994|PAPA4_CARPA Papaya proteinase 4 OS=Carica papaya PE=1 SV=3
>sp|P00784|PAPA1_CARPA Papain OS=Carica papaya PE=1 SV=1
>sp|P10056|PAPA3_CARPA Caricain OS=Carica papaya PE=1 SV=2
>sp|Q9LM66|XCP2_ARATH Xylem cysteine proteinase 2 OS=Arabidopsis thaliana GN=XCP2 PE=1 SV=2
>sp|P14080|PAPA2_CARPA Chymopapain OS=Carica papaya PE=1 SV=2
>sp|P43297|RD21A_ARATH Cysteine proteinase RD21a OS=Arabidopsis thaliana GN=RD21A PE=1 SV=1
>sp|P00785|ACTN_ACTCH Actinidain OS=Actinidia chinensis PE=1 SV=4
>sp|A5HII1|ACTN_ACTDE Actinidain OS=Actinidia deliciosa PE=1 SV=1
>sp|P80884|ANAN_ANACO Ananain OS=Ananas comosus GN=AN1 PE=1 SV=2

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      149
224131910  349     predicted protein [Populus trichocarpa]   0.617    0.263    0.696     5e-31
356508487  349     PREDICTED: xylem cysteine proteinase 2-l  0.744    0.318    0.561     5e-31
225428328  707     PREDICTED: cysteine proteinase-like [Vit  0.637    0.134    0.622     2e-30
356517184  350     PREDICTED: xylem cysteine proteinase 2-l  0.765    0.325    0.573     4e-30
359491865  351     PREDICTED: xylem cysteine proteinase 1-l  0.577    0.245    0.733     9e-30
255635584  345     unknown [Glycine max]                     0.704    0.304    0.628     1e-29
356508490  349     PREDICTED: xylem cysteine proteinase 2-l  0.704    0.300    0.628     1e-29
42573181   288     Xylem cysteine proteinase 1 [Arabidopsis  0.771    0.399    0.515     2e-29
18418684   355     Xylem cysteine proteinase 1 [Arabidopsis  0.771    0.323    0.515     2e-29
37780039   351     cysteine protease 14 [Trifolium repens]   0.590    0.250    0.717     3e-29
>gi|224131910|ref|XP_002328138.1| predicted protein [Populus trichocarpa] gi|222837653|gb|EEE76018.1| predicted protein [Populus trichocarpa]
 Score =  139 bits (349), Expect = 5e-31,   Method: Compositional matrix adjust.
 Identities = 69/99 (69%), Positives = 82/99 (82%), Gaps = 7/99 (7%)

Query: 51  EFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNREL 110
           +FSIVGYSPE LTS DKLVELFESW+  HGK+Y S EEKLHR E+FK+NLKHID RN+E+
Sbjct: 26  DFSIVGYSPEHLTSVDKLVELFESWISGHGKAYNSLEEKLHRFEVFKENLKHIDQRNKEV 85

Query: 111 QITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR 149
              +SYWLGLNEF+D+SHEEFK+K+L GL P   EF R+
Sbjct: 86  ---TSYWLGLNEFADLSHEEFKSKFL-GLYP---EFPRK 117
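
The statistics line of each BLAST alignment can be read programmatically before any further calculation. This is a small sketch using a regular expression; the names `STAT` and `parse_stats` are illustrative assumptions, and the pattern only covers the three fields that appear in this report.

```python
import re

# Matches "Identities = 69/99", "Positives = 82/99", "Gaps = 7/99".
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def parse_stats(line):
    """Return {field: (count, alignment_columns)} for the fields present in the line."""
    return {name: (int(n), int(d)) for name, n, d in STAT.findall(line)}


line = " Identities = 69/99 (69%), Positives = 82/99 (82%), Gaps = 7/99 (7%)"
print(parse_stats(line))
```

Lines that omit the gap field, as in the GO-term alignment further down, simply yield a dictionary without a `Gaps` key.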




Source: Populus trichocarpa

Species: Populus trichocarpa

Genus: Populus

Family: Salicaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|356508487|ref|XP_003522988.1| PREDICTED: xylem cysteine proteinase 2-like [Glycine max]
>gi|225428328|ref|XP_002279940.1| PREDICTED: cysteine proteinase-like [Vitis vinifera]
>gi|356517184|ref|XP_003527269.1| PREDICTED: xylem cysteine proteinase 2-like [Glycine max]
>gi|359491865|ref|XP_002273243.2| PREDICTED: xylem cysteine proteinase 1-like [Vitis vinifera]
>gi|255635584|gb|ACU18142.1| unknown [Glycine max]
>gi|356508490|ref|XP_003522989.1| PREDICTED: xylem cysteine proteinase 2-like [Glycine max]
>gi|42573181|ref|NP_974687.1| Xylem cysteine proteinase 1 [Arabidopsis thaliana] gi|332661102|gb|AEE86502.1| Xylem cysteine proteinase 1 [Arabidopsis thaliana]
>gi|18418684|ref|NP_567983.1| Xylem cysteine proteinase 1 [Arabidopsis thaliana] gi|71153408|sp|O65493.1|XCP1_ARATH RecName: Full=Xylem cysteine proteinase 1; Short=AtXCP1; Flags: Precursor gi|6708181|gb|AAF25831.1|AF191027_1 papain-type cysteine endopeptidase XCP1 [Arabidopsis thaliana] gi|3080415|emb|CAA18734.1| cysteine proteinase-like protein [Arabidopsis thaliana] gi|7270487|emb|CAB80252.1| cysteine proteinase-like protein [Arabidopsis thaliana] gi|26449881|dbj|BAC42063.1| putative cysteine proteinase [Arabidopsis thaliana] gi|28827736|gb|AAO50712.1| unknown protein [Arabidopsis thaliana] gi|332661101|gb|AEE86501.1| Xylem cysteine proteinase 1 [Arabidopsis thaliana]
>gi|37780039|gb|AAP32192.1| cysteine protease 14 [Trifolium repens]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology Terms Detected by BLAST

ID                  Length  Definition                       Q cover  H cover  Identity  E-value
Query               149
TAIR|locus:2122113  355     XCP1 "xylem cysteine peptidase   0.624    0.261    0.616     4.9e-30
TAIR|locus:2030427  356     XCP2 "xylem cysteine peptidase   0.590    0.247    0.630     9.4e-27
TAIR|locus:2825832  462     RD21A "responsive to dehydrati   0.630    0.203    0.377     4.9e-12
TAIR|locus:2024362  437     XBCP3 "xylem bark cysteine pep   0.496    0.169    0.454     1.2e-11
TAIR|locus:2128253  371     AT4G11320 [Arabidopsis thalian   0.389    0.156    0.557     1.4e-11
TAIR|locus:2167821  463     RD21B "responsive to dehydratio  0.624    0.200    0.392     1.7e-11
TAIR|locus:2128243  364     AT4G11310 [Arabidopsis thalian   0.530    0.217    0.482     2.8e-11
UNIPROTKB|Q3T0I2    335     CTSH "Pro-cathepsin H" [Bos ta   0.409    0.182    0.492     3.7e-11
TAIR|locus:2097104  376     AT3G43960 [Arabidopsis thalian   0.516    0.204    0.392     6.4e-11
UNIPROTKB|F1RKR7    197     CTSH "Cathepsin H light chain"   0.503    0.380    0.444     7.4e-11
TAIR|locus:2122113 XCP1 "xylem cysteine peptidase 1" [Arabidopsis thaliana (taxid:3702)]
 Score = 332 (121.9 bits), Expect = 4.9e-30, P = 4.9e-30
 Identities = 61/99 (61%), Positives = 80/99 (80%)

Query:    51 EFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNREL 110
             +FSIVGY+PE LT+TDKL+ELFESWM +H K+Y+S EEK+HR E+F++NL HID RN E+
Sbjct:    30 DFSIVGYTPEHLTNTDKLLELFESWMSEHSKAYKSVEEKVHRFEVFRENLMHIDQRNNEI 89

Query:   111 QITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR 149
                +SYWLGLNEF+D++HEEFK +YL   KP   +F R+
Sbjct:    90 ---NSYWLGLNEFADLTHEEFKGRYLGLAKP---QFSRK 122




GO:0005576 "extracellular region" evidence=ISM
GO:0006508 "proteolysis" evidence=IEA;ISS
GO:0008234 "cysteine-type peptidase activity" evidence=IEA;ISS
GO:0000325 "plant-type vacuole" evidence=IDA
GO:0005634 "nucleus" evidence=IDA
GO:0010623 "developmental programmed cell death" evidence=IMP
GO:0010413 "glucuronoxylan metabolic process" evidence=RCA
GO:0045492 "xylan biosynthetic process" evidence=RCA
TAIR|locus:2030427 XCP2 "xylem cysteine peptidase 2" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2825832 RD21A "responsive to dehydration 21A" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2024362 XBCP3 "xylem bark cysteine peptidase 3" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2128253 AT4G11320 [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2167821 RD21B "responsive to dehydration 21B" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2128243 AT4G11310 [Arabidopsis thaliana (taxid:3702)]
UNIPROTKB|Q3T0I2 CTSH "Pro-cathepsin H" [Bos taurus (taxid:9913)]
TAIR|locus:2097104 AT3G43960 [Arabidopsis thaliana (taxid:3702)]
UNIPROTKB|F1RKR7 CTSH "Cathepsin H light chain" [Sus scrofa (taxid:9823)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer was detected by BLAST in SWISS-PROT.

EC Number Prediction by EzyPred Server

Failed to connect to the EzyPred server.

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input:
fgenesh4_pm.C_scaffold_66000095  hypothetical protein (349 aa)  (Populus trichocarpa)

Predicted Functional Partners:
Partner                           Description                                         Score
MYB020                            hypothetical protein (109 aa)                       0.702
gw1.III.864.1                     annotation not available (313 aa)                   0.702
NAC105                            NAC domain protein, IPR003441 (310 aa)              0.702
HB5                               SubName- Full=Class III HD-Zip protein 5; (851 aa)  0.510
HB6                               SubName- Full=Class III HD-Zip protein 6; (837 aa)  0.510
gw1.I.8677.1                      annotation not available (139 aa)                   0.508
grail3.0061005101                 hypothetical protein (175 aa)                       0.508
grail3.0050017401                 hypothetical protein (174 aa)                       0.508
grail3.0052005901                 annotation not available (72 aa)                    0.505
estExt_fgenesh4_pm.C_LG_VIII0087  hypothetical protein (521 aa)                       0.505

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       149
smart00848  57      smart00848, Inhibitor_I29, Cathepsin propeptide in  4e-21
pfam08246   58      pfam08246, Inhibitor_I29, Cathepsin propeptide inh  1e-20
PTZ00021    489     PTZ00021, PTZ00021, falcipain-2; Provisional        2e-12
PTZ00200    448     PTZ00200, PTZ00200, cysteine proteinase; Provision  5e-06
PTZ00203    348     PTZ00203, PTZ00203, cathepsin L protease; Provisio  2e-04
>gnl|CDD|214853 smart00848, Inhibitor_I29, Cathepsin propeptide inhibitor domain (I29)
 Score = 80.4 bits (199), Expect = 4e-21
 Identities = 31/59 (52%), Positives = 40/59 (67%), Gaps = 2/59 (3%)

Query: 72  FESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEE 130
           FE W  KHGKSY S EE+  R  IFK+NLK I+  N++     SY LG+N+FSD++ EE
Sbjct: 1   FEQWKKKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKY--EHSYKLGVNQFSDLTPEE 57


This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L, where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain, such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain, either by interaction with a second peptidase or by autocatalytic cleavage, activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock-and-key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.
Length = 57

>gnl|CDD|219764 pfam08246, Inhibitor_I29, Cathepsin propeptide inhibitor domain (I29)
>gnl|CDD|240232 PTZ00021, PTZ00021, falcipain-2; Provisional
>gnl|CDD|240310 PTZ00200, PTZ00200, cysteine proteinase; Provisional
>gnl|CDD|185513 PTZ00203, PTZ00203, cathepsin L protease; Provisional

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       149
PF08246     58      Inhibitor_I29: Cathepsin propeptide inhibitor doma  99.81
PTZ00203    348     cathepsin L protease; Provisional                   99.76
smart00848  57      Inhibitor_I29 Cathepsin propeptide inhibitor domai  99.71
PTZ00200    448     cysteine proteinase; Provisional                    99.64
PTZ00021    489     falcipain-2; Provisional                            99.63
KOG1542     372     consensus Cysteine proteinase Cathepsin F [Posttra  99.47
KOG1543     325     consensus Cysteine proteinase Cathepsin L [Posttra  98.58
PF08127     41      Propeptide_C1: Peptidase family C1 propeptide; Int  92.49
>PF08246 Inhibitor_I29: Cathepsin propeptide inhibitor domain (I29); InterPro: IPR013201 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively.
Probab=99.81  E-value=5.8e-20  Score=118.45  Aligned_cols=58  Identities=52%  Similarity=0.835  Sum_probs=50.4

Q ss_pred             HHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhchhcccCcceeeecccCCCCCHHHH
Q 047264           72 FESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEF  131 (149)
Q Consensus        72 F~~wk~ky~K~Y~s~~E~~~R~~iF~~Nl~~I~~hN~~~~g~~sy~lglN~FaDLT~eEF  131 (149)
                      |+.|+.+|+|.|.+..|..+|+.||++|++.|.+||+..  ..+|++|+|+|||||++||
T Consensus         1 F~~~~~~~~k~Y~~~~e~~~R~~~F~~N~~~I~~~N~~~--~~~~~~~~N~fsD~t~eEf   58 (58)
T PF08246_consen    1 FEQFKKKYGKSYKSAEEEARRFAIFKENLRRIEEHNANG--NNTYKLGLNQFSDMTPEEF   58 (58)
T ss_dssp             HHHHHHHCT---SSHHHHHHHHHHHHHHHHHHHHHHHTT--SSSEEE-SSTTTTSSHHHH
T ss_pred             CHHHHHHcCCCCCCHHHHHHHHHHHHHHHHHHHHHhcCC--CCCeEEeCccccCcChhhC
Confidence            889999999999999999999999999999999999544  3799999999999999998
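
The HHsearch summary line above is a row of `key=value` tokens, so it can be turned into a dictionary with plain string handling. The sketch below is one way to do that; the function name `parse_hh_summary` is an illustrative assumption, and every value is converted to a float with any trailing percent sign stripped.

```python
def parse_hh_summary(line):
    """Parse an HHsearch summary line of key=value tokens into a dict of floats."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        fields[key] = float(value.rstrip("%"))
    return fields


line = ("Probab=99.81  E-value=5.8e-20  Score=118.45  Aligned_cols=58  "
        "Identities=52%  Similarity=0.835  Sum_probs=50.4")
summary = parse_hh_summary(line)
print(summary["Probab"], summary["Aligned_cols"])
```

Keeping the original key spellings (`Probab`, `Aligned_cols`, `Sum_probs`) makes it straightforward to compare hits across the domain and structure-template sections.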



In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain, either by interaction with a second peptidase or by autocatalytic cleavage, activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock-and-key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This entry represents a peptidase inhibitor domain, which belongs to MEROPS peptidase inhibitor family I29. The domain is also found at the N terminus of a variety of peptidase precursors that belong to MEROPS peptidase subfamily C1A; these include cathepsin L, papain, and procaricain (P10056 from SWISSPROT). It forms an alpha-helical domain that runs through the substrate-binding site, preventing access. Removal of this region by proteolytic cleavage results in activation of the enzyme. This domain is also found, in one or more copies, in a variety of cysteine peptidase inhibitors such as salarin. PDB: 3QT4_A 3QJ3_A 2C0Y_A 2L95_A 1CJL_A 1CS8_A 7PCK_A 1BY8_A 1PCI_A 2O6X_A ....

>PTZ00203 cathepsin L protease; Provisional
>smart00848 Inhibitor_I29 Cathepsin propeptide inhibitor domain (I29)
>PTZ00200 cysteine proteinase; Provisional
>PTZ00021 falcipain-2; Provisional
>KOG1542 consensus Cysteine proteinase Cathepsin F [Posttranslational modification, protein turnover, chaperones]
>KOG1543 consensus Cysteine proteinase Cathepsin L [Posttranslational modification, protein turnover, chaperones]
>PF08127 Propeptide_C1: Peptidase family C1 propeptide; InterPro: IPR012599 This domain is found at the N-terminal of cathepsin B and cathepsin B-like peptidases that belong to MEROPS peptidase subfamily C1A.

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                           E-value
Query   149
3tnx_A  363     Structure Of The Precursor Of A Thermostable Varian  2e-26
1pci_A  322     Procaricain Length = 322                             6e-26
2o6x_A  310     Crystal Structure Of Procathepsin L1 From Fasciola   1e-07
3f75_P  106     Activated Toxoplasma Gondii Cathepsin L (Tgcpl) In   3e-07
3qt4_A  329     Structure Of Digestive Procathepsin L 3 Of Tenebrio  8e-05
7pck_A  314     Crystal Structure Of Wild Type Human Procathepsin K  2e-04
>pdb|3TNX|A Chain A, Structure Of The Precursor Of A Thermostable Variant Of Papain At 2.6 Angstroem Resolution Length = 363

Iteration: 1

 Score =  114 bits (285), Expect = 2e-26,   Method: Compositional matrix adjust.
 Identities = 53/85 (62%), Positives = 71/85 (83%), Gaps = 3/85 (3%)

Query: 51  EFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNREL 110
           +FSIVGYS  +LTST++L++LFESWMLKH K Y++ +EK++R EIFKDNLK+ID  N++ 
Sbjct: 45  DFSIVGYSQNDLTSTERLIQLFESWMLKHNKIYKNIDEKIYRFEIFKDNLKYIDETNKK- 103

Query: 111 QITSSYWLGLNEFSDMSHEEFKNKY 135
              +SYWLGLN F+DMS++EFK KY
Sbjct: 104 --NNSYWLGLNVFADMSNDEFKEKY 126

>pdb|1PCI|A Chain A, Procaricain Length = 322
>pdb|2O6X|A Chain A, Crystal Structure Of Procathepsin L1 From Fasciola Hepatica Length = 310
>pdb|3F75|P Chain P, Activated Toxoplasma Gondii Cathepsin L (Tgcpl) In Complex With Its Propeptide Length = 106
>pdb|3QT4|A Chain A, Structure Of Digestive Procathepsin L 3 Of Tenebrio Molitor Larval Midgut Length = 329
>pdb|7PCK|A Chain A, Crystal Structure Of Wild Type Human Procathepsin K Length = 314

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   149
1pci_A  322     Procaricain; zymogen, hydrolase, thiol protease; 3  3e-40
1by8_A  314     Protein (procathepsin K); hydrolase(sulfhydryl pro  7e-33
2c0y_A  315     Procathepsin S; proenzyme, proteinase, hydrolase,   4e-32
3qj3_A  331     Cathepsin L-like protein; hydrolase, proteinase, l  1e-31
3qt4_A  329     Cathepsin-L-like midgut cysteine proteinase; hydro  8e-31
1cs8_A  316     Human procathepsin L; prosegment, propeptide, inhi  8e-30
2o6x_A  310     Procathepsin L1, secreted cathepsin L 1; hydrolase  5e-29
3f75_P  106     Toxopain-2, cathepsin L propeptide; medical struct  1e-28
1xkg_A  312     DER P I, major mite fecal allergen DER P 1; major   3e-28
2l95_A  80      Crammer, LP06209P; cysteine proteinase inhibitor,   2e-24
3pdf_A  441     Cathepsin C, dipeptidyl peptidase 1; two domains,   2e-12
>1pci_A Procaricain; zymogen, hydrolase, thiol protease; 3.20A {Carica papaya} SCOP: d.3.1.1 Length = 322
 Score =  136 bits (344), Expect = 3e-40
 Identities = 53/99 (53%), Positives = 73/99 (73%), Gaps = 4/99 (4%)

Query: 51  EFSIVGYSPEELTSTDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNREL 110
           +FSIVGYS ++LTST++L++LF SWML H K YE+ +EKL+R EIFKDNL +ID  N++ 
Sbjct: 1   DFSIVGYSQDDLTSTERLIQLFNSWMLNHNKFYENVDEKLYRFEIFKDNLNYIDETNKK- 59

Query: 111 QITSSYWLGLNEFSDMSHEEFKNKYLTGLKPDDDEFRRR 149
              +SYWLGLNEF+D+S++EF  KY+ G   D    +  
Sbjct: 60  --NNSYWLGLNEFADLSNDEFNEKYV-GSLIDATIEQSY 95


>1by8_A Protein (procathepsin K); hydrolase(sulfhydryl proteinase), papain; 2.60A {Homo sapiens} SCOP: d.3.1.1 PDB: 7pck_A Length = 314
>2c0y_A Procathepsin S; proenzyme, proteinase, hydrolase, thiol protease, prosegment binding loop, glycoprotein, lysosome, protease, zymogen; 2.1A {Homo sapiens} Length = 315
>3qj3_A Cathepsin L-like protein; hydrolase, proteinase, larVal midgut; 1.85A {Tenebrio molitor} Length = 331
>3qt4_A Cathepsin-L-like midgut cysteine proteinase; hydrolase, zymogen, intramolecular DISS bonds, insect larVal midgut; HET: PG4 PG6; 2.11A {Tenebrio molitor} Length = 329
>1cs8_A Human procathepsin L; prosegment, propeptide, inhibition, hydrolase; HET: OCS; 1.80A {Homo sapiens} SCOP: d.3.1.1 PDB: 1cjl_A 3hwn_A* Length = 316
>2o6x_A Procathepsin L1, secreted cathepsin L 1; hydrolase, thiol protease, cysteine protease, zymogen, hydro; 1.40A {Fasciola hepatica} Length = 310
>3f75_P Toxopain-2, cathepsin L propeptide; medical structural genomics of pathogenic protozoa, MSGPP, C protease, parasite, protozoa, hydrolase; 1.99A {Toxoplasma gondii} Length = 106
>1xkg_A DER P I, major mite fecal allergen DER P 1; major allergen, cysteine protease, house DUST mite, dermatop pteronyssinus; 1.61A {Dermatophagoides pteronyssinus} SCOP: d.3.1.1 Length = 312
>2l95_A Crammer, LP06209P; cysteine proteinase inhibitor, intrinsic disorder P like protein, hydrolase; NMR {Drosophila melanogaster} Length = 80
>3pdf_A Cathepsin C, dipeptidyl peptidase 1; two domains, cystein protease, hydrolase-hydrolase inhibitor; HET: LXV NAG; 1.85A {Homo sapiens} PDB: 1jqp_A* 2djf_B* 1k3b_B* 2djg_B* 2djf_A* 1k3b_A* 2djg_A* 2djf_C* 1k3b_C* 2djg_C* Length = 441

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   149
3f75_P  106     Toxopain-2, cathepsin L propeptide; medical struct  99.88
2l95_A  80      Crammer, LP06209P; cysteine proteinase inhibitor,   99.87
3tnx_A  363     Papain; hydrolase, cytoplasm for recombinant expre  99.86
3qj3_A  331     Cathepsin L-like protein; hydrolase, proteinase, l  99.78
1pci_A  322     Procaricain; zymogen, hydrolase, thiol protease; 3  99.76
3qt4_A  329     Cathepsin-L-like midgut cysteine proteinase; hydro  99.74
1by8_A  314     Protein (procathepsin K); hydrolase(sulfhydryl pro  99.71
2c0y_A  315     Procathepsin S; proenzyme, proteinase, hydrolase,   99.71
1cs8_A  316     Human procathepsin L; prosegment, propeptide, inhi  99.69
2o6x_A  310     Procathepsin L1, secreted cathepsin L 1; hydrolase  99.64
1xkg_A  312     DER P I, major mite fecal allergen DER P 1; major   99.38
3pdf_A  441     Cathepsin C, dipeptidyl peptidase 1; two domains,   98.58
3hhi_A  325     Cathepsin B-like cysteine protease; occluding loop  97.49
3pbh_A  317     Procathepsin B; thiol protease, cysteine protease,  96.38
>3f75_P Toxopain-2, cathepsin L propeptide; medical structural genomics of pathogenic protozoa, MSGPP, C protease, parasite, protozoa, hydrolase; 1.99A {Toxoplasma gondii}
Probab=99.88  E-value=1.6e-22  Score=145.08  Aligned_cols=74  Identities=41%  Similarity=0.640  Sum_probs=68.3

Q ss_pred             ChhHHHHHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhchhcccCcceeeecccCCCCCHHHHHHHHhcCCCC
Q 047264           64 STDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSHEEFKNKYLTGLKP  141 (149)
Q Consensus        64 ~~~~~~~~F~~wk~ky~K~Y~s~~E~~~R~~iF~~Nl~~I~~hN~~~~g~~sy~lglN~FaDLT~eEF~~~~~~g~~~  141 (149)
                      ++..+...|+.|+.+|+|.|.+..|+.+|+.||++|+++|++||++.   .+|++|||+|||||++||++.|+ |++.
T Consensus        17 ~~~~~~~~F~~wk~~~~K~Y~~~~E~~~R~~iF~~Nl~~I~~hN~~~---~sy~lglN~FaDLT~eEF~~~~l-g~~~   90 (106)
T 3f75_P           17 KEAHFQDAFSSFQAMYAKSYATEEEKQRRYAIFKNNLVYIHTHNQQG---YSYSLKMNHFGDLSRDEFRRKYL-GFKK   90 (106)
T ss_dssp             CHHHHHHHHHHHHHHHTCCCSSHHHHHHHHHHHHHHHHHHHHHHTSC---CSEEECCCTTTTCCHHHHHHHSC-CBCC
T ss_pred             CcHHHHHHHHHHHHHhCCcCCCHHHHHHHHHHHHHHHHHHHHHHhcC---CCeeeCCcccccCCHHHHHHHHc-CCCC
Confidence            45678899999999999999998999999999999999999999874   79999999999999999999998 8653



>2l95_A Crammer, LP06209P; cysteine proteinase inhibitor, intrinsic disorder P like protein, hydrolase; NMR {Drosophila melanogaster}
>3tnx_A Papain; hydrolase, cytoplasm for recombinant expression; 2.62A {Carica papaya}
>3qj3_A Cathepsin L-like protein; hydrolase, proteinase, larVal midgut; 1.85A {Tenebrio molitor} SCOP: d.3.1.0
>1pci_A Procaricain; zymogen, hydrolase, thiol protease; 3.20A {Carica papaya} SCOP: d.3.1.1
>3qt4_A Cathepsin-L-like midgut cysteine proteinase; hydrolase, zymogen, intramolecular DISS bonds, insect larVal midgut; HET: PG4 PG6; 2.11A {Tenebrio molitor}
>1by8_A Protein (procathepsin K); hydrolase(sulfhydryl proteinase), papain; 2.60A {Homo sapiens} SCOP: d.3.1.1 PDB: 7pck_A
>2c0y_A Procathepsin S; proenzyme, proteinase, hydrolase, thiol protease, prosegment binding loop, glycoprotein, lysosome, protease, zymogen; 2.1A {Homo sapiens}
>1cs8_A Human procathepsin L; prosegment, propeptide, inhibition, hydrolase; HET: OCS; 1.80A {Homo sapiens} SCOP: d.3.1.1 PDB: 1cjl_A 3hwn_A*
>2o6x_A Procathepsin L1, secreted cathepsin L 1; hydrolase, thiol protease, cysteine protease, zymogen, hydro; 1.40A {Fasciola hepatica}
>1xkg_A DER P I, major mite fecal allergen DER P 1; major allergen, cysteine protease, house DUST mite, dermatop pteronyssinus; 1.61A {Dermatophagoides pteronyssinus} SCOP: d.3.1.1
>3pdf_A Cathepsin C, dipeptidyl peptidase 1; two domains, cystein protease, hydrolase-hydrolase inhibitor; HET: LXV NAG; 1.85A {Homo sapiens} PDB: 1jqp_A* 2djf_B* 1k3b_B* 2djg_B* 2djf_A* 1k3b_A* 2djg_A* 2djf_C* 1k3b_C* 2djg_C*
>3hhi_A Cathepsin B-like cysteine protease; occluding loop, hydrolase, THIO protease; HET: 074; 1.60A {Trypanosoma brucei} SCOP: d.3.1.0 PDB: 4hwy_A* 3mor_A*
>3pbh_A Procathepsin B; thiol protease, cysteine protease, proenzyme, papain; 2.50A {Homo sapiens} SCOP: d.3.1.1 PDB: 2pbh_A 1pbh_A 1mir_A

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                          E-value
Query    149
d1xkga1  302     d.3.1.1 (A:4-305) Major mite fecal allergen der p   9e-21
d1cs8a_  316     d.3.1.1 (A:) (Pro)cathepsin L {Human (Homo sapiens  4e-20
>d1xkga1 d.3.1.1 (A:4-305) Major mite fecal allergen der p 1 {House-dust mite (Dermatophagoides pteronyssinus) [TaxId: 6956]} Length = 302

class: Alpha and beta proteins (a+b)
fold: Cysteine proteinases
superfamily: Cysteine proteinases
family: Papain-like
domain: Major mite fecal allergen der p 1
species: House-dust mite (Dermatophagoides pteronyssinus) [TaxId: 6956]
 Score = 83.6 bits (205), Expect = 9e-21
 Identities = 17/75 (22%), Positives = 37/75 (49%), Gaps = 10/75 (13%)

Query: 69  VELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQITSSYWLGLNEFSDMSH 128
           ++ FE +     KSY + E++    + F +++K++ +              +N  SD+S 
Sbjct: 2   IKTFEEYKKAFNKSYATFEDEEAARKNFLESVKYVQSNG----------GAINHLSDLSL 51

Query: 129 EEFKNKYLTGLKPDD 143
           +EFKN++L   +  +
Sbjct: 52  DEFKNRFLMSAEAFE 66


>d1cs8a_ d.3.1.1 (A:) (Pro)cathepsin L {Human (Homo sapiens) [TaxId: 9606]} Length = 316

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    149
d1cs8a_  316     (Pro)cathepsin L {Human (Homo sapiens) [TaxId: 960  99.72
d1xkga1  302     Major mite fecal allergen der p 1 {House-dust mite  99.48
>d1cs8a_ d.3.1.1 (A:) (Pro)cathepsin L {Human (Homo sapiens) [TaxId: 9606]}
class: Alpha and beta proteins (a+b)
fold: Cysteine proteinases
superfamily: Cysteine proteinases
family: Papain-like
domain: (Pro)cathepsin L
species: Human (Homo sapiens) [TaxId: 9606]
Probab=99.72  E-value=4.3e-18  Score=138.28  Aligned_cols=74  Identities=27%  Similarity=0.512  Sum_probs=64.6

Q ss_pred             hhHHHHHHHHHHHHhCCccCCHHHHHHHHHHHHHHHHHHHHhchhcc-cCcceeeecccCCCCCHHHHHHHHhcCCC
Q 047264           65 TDKLVELFESWMLKHGKSYESTEEKLHRLEIFKDNLKHIDARNRELQ-ITSSYWLGLNEFSDMSHEEFKNKYLTGLK  140 (149)
Q Consensus        65 ~~~~~~~F~~wk~ky~K~Y~s~~E~~~R~~iF~~Nl~~I~~hN~~~~-g~~sy~lglN~FaDLT~eEF~~~~~~g~~  140 (149)
                      +..+...|++||++|+|.|.+ .|+.+|++||.+|++.|++||++.. +..+|++|+|+|+|||++||.+.++ +..
T Consensus         5 ~~~l~~~F~~f~~~~~K~Y~~-~ee~~R~~iF~~N~~~I~~~N~~~~~~~~~~~~g~N~fsDlt~eEf~~~~~-~~~   79 (316)
T d1cs8a_           5 DHSLEAQWTKWKAMHNRLYGM-NEEGWRRAVWEKNMKMIELHNQEYREGKHSFTMAMNAFGDMTSEEFRQVMN-GFQ   79 (316)
T ss_dssp             CGGGHHHHHHHHHHTTCCCCT-THHHHHHHHHHHHHHHHHHHHHHHHTTCCSEEECCCTTTTCCHHHHHHHHC-CBC
T ss_pred             cHHHHHHHHHHHHHhCCcCCC-HHHHHHHHHHHHHHHHHHHHHhHhhcCCCceEEeceeccccCcHHHHhhhc-ccc
Confidence            345678899999999999987 4678999999999999999998752 4579999999999999999999887 543



>d1xkga1 d.3.1.1 (A:4-305) Major mite fecal allergen der p 1 {House-dust mite (Dermatophagoides pteronyssinus) [TaxId: 6956]}