Citrus sinensis ID: 025989

Local Sequence Feature Prediction

Prediction and (Method)	Result

Residue Number Marker

Protein Sequence

Secondary Structure (PSIPRED)

Secondary Structure Prediction (SSPRO)

Coil and Loop (DISEMBL)

Flexible Loop (DISEMBL)

Low Complexity Region (SEG)

Disordered Region (IsUnstruct)

Disordered Region (DISOPRED)

Disordered Region (DISEMBL)

Disordered Region (DISPRO)

Transmembrane Helix (TMHMM)

Transmembrane Helix (HMMTOP)

Transmembrane Helix (MEMSAT)

TM Helix, Signal Peptide (MEMSAT_SVM)

TM Helix, Signal Peptide (Phobius)

Signal Peptide (SignalP HMM Mode)

Signal Peptide (SignalP NN Mode)

Coiled Coils (COILS)

Positional Conservation

--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-------240-----

MAQTLTRAITALSIRSSRLSLLSSKRLLSTNTTTVTAPSPLPSLLFSRRAAAPLSHAVGLISPLPSTRFCQIRCRANRSGNSAYSPLNSGSNFSDRPPTEMAPLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQRRVEPQPQRAQDRPRYNDRTRYVRRRENTR

cHHHHHHHHHHHHHHHHHHHHHHHHHHccccccccccccccccHHHHcccccccHHcccccccccccccccccccccccccccccccccccccccccccccccccccccccEEEEEEcccccccccHHHHHHHHHHHHHHHHccHHHHHccEEEEEEEEEEEEEcEEcHHHHHHHcccccEEEEccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc

cHHHHHHHHHHHHHHHHHcccccHHHHHcccccccccccccccccccccccccccccccccccccccccccEEEEccccccccccccccccccccccccccEEccccccccEEEEEEcccccccccHHHHHHHHHHHHHHHHccHHHHHHHHHHEcccccccEEEEEcHHHHHHHcccccEEEEEccccccccccccccccccccEEEcccccccccccccccccccccccccccHHHHHHHccc

MAQTLTRAITALSIRSSRLSLLSSkrllstntttvtapsplpsllfsrraaaplshavglisplpstrfcqircranrsgnsaysplnsgsnfsdrpptemaplfpgcdyehwlivmdkpggegatkQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERyfgfgceldeetsnkleglpgvlfvlpdsyvdpenkdygaelfvngeivqrsperqrrvepqpqraqdrpryndrtRYVRRRENTR

MAQTLTRAITALSIRSSRLsllsskrllstntttvtapsplpsLLFSRRAAAPLSHAvglisplpsTRFCQIRCRANrsgnsaysplnsgsnfsdrPPTEMAPLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGseeeakkkiynVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGeivqrsperqrrvepqpqraqdrpryndrtryvrrrentr

MAQTLTRAITAlsirssrlsllsskrllsTNTTTVTAPSPLPSLLFSRRAAAPLSHAVGLISPLPSTRFCQIRCRANRSGNSAYSPLNSGSNFSDRPPTEMAPLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIvqrsperqrrvepqpqraqDrpryndrtryvrrrentr

*******************************************LLFSRRAAAPLSHAVGLISPLPSTRFCQIRCR***************************PLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEI**************************************

*********TALSIRSSRLSLLSS*****************************************************************************APLFPGCDYEHWLIVMDKP*****TKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEI**************************************

**********ALSIRSSRLSLLSSKRLLSTNTTTVTAPSPLPSLLFSRRAAAPLSHAVGLISPLPSTRFCQIRCRANRS*****************PPTEMAPLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQ******************************

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSiiiiiiihhhhhhhhhhhhhhhhoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

no confident homologs detected

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

Original result of BLAST against SWISS-PROT Database

ID	Alignment graph	Length	Definition	RBH(Q2H)	RBH(H2Q)	Q cover	H cover	Identity	E-value
Query		245	2.2.26 [Sep-21-2011]
Q38732		230	DAG protein, chloroplasti	N/A	no	0.538	0.573	0.510	1e-34
Q9LKA5		395	Uncharacterized protein A	no	no	0.489	0.303	0.532	2e-31

>sp\|Q38732\|DAG_ANTMA DAG protein, chloroplastic OS=Antirrhinum majus GN=DAG PE=2 SV=1	Back alignment and function description

 Score =  146 bits (369), Expect = 1e-34,   Method: Compositional matrix adjust.
 Identities = 71/139 (51%), Positives = 94/139 (67%), Gaps = 7/139 (5%)

Query: 104 LFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGF 163
           + PGCDY HWLIVM+ P     T++QMID Y+ TLA V+GS EEAKK +Y  S   Y GF
Sbjct: 80  MLPGCDYNHWLIVMEFPKDPAPTREQMIDTYLNTLATVLGSMEEAKKNMYAFSTTTYTGF 139

Query: 164 GCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQRRVEPQPQ 223
            C + EETS K +GLPGVL+VLPDSY+D +NKDYG + +VNGEI+   P +    +P+  
Sbjct: 140 QCTVTEETSEKFKGLPGVLWVLPDSYIDVKNKDYGGDKYVNGEII---PCQYPTYQPKQS 196

Query: 224 RAQDRPRYNDRTRYVRRRE 242
           R+    +Y  +  YVR+R+
Sbjct: 197 RSS---KYKSKA-YVRQRD 211
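The identity and gap counts in the summary line can be rechecked directly from the aligned rows. The following is a minimal sketch, assuming the three alignment blocks above are concatenated in order and that percentages are truncated to whole numbers; the function name `alignment_stats` is illustrative and not part of BLAST. The "Positives" count is not recomputed because it depends on the substitution matrix.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Count identical columns and gap columns in a pairwise alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    length = len(query)
    # A column is an identity when both rows carry the same residue.
    identities = sum(q == s and q != "-" for q, s in zip(query, subject))
    # A column is a gap when either row carries '-'.
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, subject))
    return {"length": length, "identities": identities, "gaps": gaps}


# Aligned rows copied from the Query/Sbjct blocks of the Q38732 hit.
QUERY = (
    "LFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGF"
    "GCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQRRVEPQPQ"
    "RAQDRPRYNDRTRYVRRRE"
)
SBJCT = (
    "MLPGCDYNHWLIVMEFPKDPAPTREQMIDTYLNTLATVLGSMEEAKKNMYAFSTTTYTGF"
    "QCTVTEETSEKFKGLPGVLWVLPDSYIDVKNKDYGGDKYVNGEII---PCQYPTYQPKQS"
    "RSS---KYKSKA-YVRQRD"
)

stats = alignment_stats(QUERY, SBJCT)
n = stats["length"]
print(f"Identities = {stats['identities']}/{n} ({100 * stats['identities'] // n}%)")
print(f"Gaps = {stats['gaps']}/{n} ({100 * stats['gaps'] // n}%)")
```

Run on these rows, the sketch should reproduce the reported 71/139 identities and 7/139 gaps.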

Acts very early in chloroplast development, being required for expression of RNA polymerase beta subunit gene, and hence indirectly for subsequent expression of CAB and RBCS genes.
Antirrhinum majus (taxid: 4151)

>sp\|Q9LKA5\|UMP1_ARATH Uncharacterized protein At3g15000, mitochondrial OS=Arabidopsis thaliana GN=At3g15000 PE=1 SV=1	Back alignment and function description

Close Homologs in the Non-Redundant Database Detected by BLAST

Original result of BLAST against Nonredundant Database

GI	Alignment Graph	Length	Definition	Q cover	H cover	Identity	E-value
Query		245
255584289		248	DAG protein, chloroplast precursor, puta	0.971	0.959	0.807	1e-108
118487925		241	unknown [Populus trichocarpa]	0.979	0.995	0.764	1e-102
224102209		241	predicted protein [Populus trichocarpa]	0.975	0.991	0.769	1e-100
225433215		233	PREDICTED: DAG protein, chloroplastic [V	0.934	0.982	0.762	1e-100
449432522		243	PREDICTED: DAG protein, chloroplastic-li	0.971	0.979	0.743	6e-99
297823137		219	hypothetical protein ARALYDRAFT_482286 [	0.730	0.817	0.938	2e-95
15226108		219	protein differentiation and greening-lik	0.730	0.817	0.933	5e-95
388494872		217	unknown [Medicago truncatula]	0.755	0.852	0.870	8e-95
225438029		227	PREDICTED: DAG protein, chloroplastic-li	0.885	0.955	0.725	1e-94
2246378		198	plastid protein [Arabidopsis thaliana]	0.730	0.904	0.933	2e-94
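A hit table like the one above can be filtered programmatically. The sketch below assumes the rows are tab-separated with an empty "Alignment Graph" column, as they appear in this export; the helper name `read_hits` and the thresholds are illustrative choices, and only three of the ten rows are embedded to keep the example short.

```python
import csv
import io

# Three rows copied from the table above (definitions are truncated in the source).
TABLE = (
    "GI\tAlignment Graph\tLength\tDefinition\tQ cover\tH cover\tIdentity\tE-value\n"
    "255584289\t\t248\tDAG protein, chloroplast precursor, puta\t0.971\t0.959\t0.807\t1e-108\n"
    "297823137\t\t219\thypothetical protein ARALYDRAFT_482286 [\t0.730\t0.817\t0.938\t2e-95\n"
    "2246378\t\t198\tplastid protein [Arabidopsis thaliana]\t0.730\t0.904\t0.933\t2e-94\n"
)


def read_hits(text: str) -> list:
    """Parse the tab-separated hit table into dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    hits = []
    for row in reader:
        hits.append({
            "gi": row["GI"],
            "length": int(row["Length"]),
            "definition": row["Definition"],
            "q_cover": float(row["Q cover"]),
            "h_cover": float(row["H cover"]),
            "identity": float(row["Identity"]),
            "evalue": float(row["E-value"]),
        })
    return hits


hits = read_hits(TABLE)
# Illustrative filter: keep hits that are at least 90% identical to the query.
high_identity = [h["gi"] for h in hits if h["identity"] >= 0.9]
print(high_identity)
```

High identity alone does not imply full-length similarity: both Arabidopsis hits kept by this filter cover only about 73% of the query.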

>gi\|255584289\|ref\|XP_002532881.1\| DAG protein, chloroplast precursor, putative [Ricinus communis] gi\|223527366\|gb\|EEF29510.1\| DAG protein, chloroplast precursor, putative [Ricinus communis]	Back alignment and taxonomy information

 Score =  395 bits (1015), Expect = e-108,   Method: Compositional matrix adjust.
 Identities = 206/255 (80%), Positives = 213/255 (83%), Gaps = 17/255 (6%)

Query: 1   MAQTLTRAITALSIRSSRLSLLSSKRLLSTNTTTVT----------APSPLPSLLFSRRA 50
           MAQTL R+ T    R    SLL  KRLLST TTT T           P   PSLLF+RR+
Sbjct: 1   MAQTLARSFT----RHLTFSLLLPKRLLSTITTTTTTATTSTTSIICPPLPPSLLFTRRS 56

Query: 51  AAPLSHAVGLISPLPSTRFCQIRCRANRSGNSAYSPLNSGSNFSDRPPTEMAPLFPGCDY 110
             PLSHAV  I P   TRF  IRCR NR+GNSAYSPLNSGSNFSDRPP EMAPLFPGCDY
Sbjct: 57  LLPLSHAVHSIKP---TRFTSIRCRVNRAGNSAYSPLNSGSNFSDRPPNEMAPLFPGCDY 113

Query: 111 EHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEE 170
           EHWLIVMDKPGGEGATKQQMIDCYI+TLA+VVGSEEEAKKKIYNVSCERYFGFGCE+DEE
Sbjct: 114 EHWLIVMDKPGGEGATKQQMIDCYIQTLAKVVGSEEEAKKKIYNVSCERYFGFGCEIDEE 173

Query: 171 TSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQRRVEPQPQRAQDRPR 230
           TSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQRRVEPQPQRA DRPR
Sbjct: 174 TSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEIVQRSPERQRRVEPQPQRANDRPR 233

Query: 231 YNDRTRYVRRRENTR 245
           YNDRTRYVRRREN R
Sbjct: 234 YNDRTRYVRRRENMR 248

Source: Ricinus communis

Species: Ricinus communis

Genus: Ricinus

Family: Euphorbiaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi\|118487925\|gb\|ABK95784.1\| unknown [Populus trichocarpa]	Back alignment and taxonomy information

>gi\|224102209\|ref\|XP_002312591.1\| predicted protein [Populus trichocarpa] gi\|222852411\|gb\|EEE89958.1\| predicted protein [Populus trichocarpa]	Back alignment and taxonomy information

>gi\|225433215\|ref\|XP_002285392.1\| PREDICTED: DAG protein, chloroplastic [Vitis vinifera] gi\|147779193\|emb\|CAN67991.1\| hypothetical protein VITISV_023920 [Vitis vinifera]	Back alignment and taxonomy information

>gi\|449432522\|ref\|XP_004134048.1\| PREDICTED: DAG protein, chloroplastic-like [Cucumis sativus] gi\|449517983\|ref\|XP_004166023.1\| PREDICTED: DAG protein, chloroplastic-like [Cucumis sativus]	Back alignment and taxonomy information

>gi\|297823137\|ref\|XP_002879451.1\| hypothetical protein ARALYDRAFT_482286 [Arabidopsis lyrata subsp. lyrata] gi\|297325290\|gb\|EFH55710.1\| hypothetical protein ARALYDRAFT_482286 [Arabidopsis lyrata subsp. lyrata]	Back alignment and taxonomy information

>gi|15226108|ref|NP_180901.1| protein differentiation and greening-like 1 [Arabidopsis thaliana] gi|17933285|gb|AAL48226.1|AF446351_1 At2g33430/F4P9.20 [Arabidopsis thaliana] gi|2459425|gb|AAB80660.1| plastid protein [Arabidopsis thaliana] gi|20453405|gb|AAM19941.1| At2g33430/F4P9.20 [Arabidopsis thaliana] gi|110736869|dbj|BAF00392.1| plastid protein [Arabidopsis thaliana] gi|330253739|gb|AEC08833.1| protein differentiation and greening-like 1 [Arabidopsis thaliana]	Back alignment and taxonomy information

>gi\|388494872\|gb\|AFK35502.1\| unknown [Medicago truncatula]	Back alignment and taxonomy information

>gi\|225438029\|ref\|XP_002271431.1\| PREDICTED: DAG protein, chloroplastic-like [Vitis vinifera]	Back alignment and taxonomy information

>gi\|2246378\|emb\|CAB06698.1\| plastid protein [Arabidopsis thaliana]	Back alignment and taxonomy information

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

Original result of BLAST against Gene Ontology (AMIGO)

ID	Alignment graph	Length	Definition	Q cover	H cover	Identity	E-value
Query		245
TAIR\|locus:2063389		232	MORF6 "multiple organellar RNA	0.714	0.754	0.780	2.5e-72
TAIR\|locus:2206639		229	AT1G32580 "AT1G32580" [Arabido	0.702	0.751	0.788	1.4e-71
TAIR\|locus:2051003		219	DAL1 "differentiation and gree	0.575	0.643	0.922	3e-69
TAIR\|locus:2083348		244	MORF3 "multiple organellar RNA	0.706	0.709	0.486	9.2e-36
TAIR\|locus:2200131		232	MORF9 "multiple organellar RNA	0.497	0.525	0.551	4.7e-32
TAIR\|locus:2086310		395	RIP1 "RNA-editing factor inter	0.681	0.422	0.458	8.8e-31
UNIPROTKB\|Q2R8U1		374	Os11g0216400 "Os11g0216400 pro	0.685	0.449	0.426	1.3e-27
TAIR\|locus:2119782		419	MORF1 "multiple organellar RNA	0.444	0.260	0.477	1.9e-22
TAIR\|locus:2030200		192	AT1G72530 "AT1G72530" [Arabido	0.506	0.645	0.437	2.6e-22
TAIR\|locus:2156344		723	MORF4 "AT5G44780" [Arabidopsis	0.591	0.200	0.346	1.6e-16

TAIR\|locus:2063389 MORF6 "multiple organellar RNA editing factor 6" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

 Score = 731 (262.4 bits), Expect = 2.5e-72, P = 2.5e-72
 Identities = 139/178 (78%), Positives = 153/178 (85%)

Query:    30 TNTTTVTAPSPLPSLLFSRRAAAPLSHAVGLISPLPSTRFCQIRCRANRSGNSAYSPLNS 89
             + +  V +PSPLPS L SRR +  + HAVG I  L  TRF  IR R +RSG S YSPL S
Sbjct:    19 STSNAVASPSPLPSHLISRRFSPTIFHAVGYIPAL--TRFTTIRTRMDRSGGS-YSPLKS 75

Query:    90 GSNFSDRPPTEMAPLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAK 149
             GSNFSDRPPTEMAPLFPGCDYEHWLIVM+KPGGE A KQQMIDCY++TLA++VGSEEEA+
Sbjct:    76 GSNFSDRPPTEMAPLFPGCDYEHWLIVMEKPGGENAQKQQMIDCYVQTLAKIVGSEEEAR 135

Query:   150 KKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYVDPENKDYGAELFVNGEI 207
             KKIYNVSCERYFGFGCE+DEETSNKLEGLPGVLFVLPDSYVDPE KDYGAELFVNGE+
Sbjct:   136 KKIYNVSCERYFGFGCEIDEETSNKLEGLPGVLFVLPDSYVDPEFKDYGAELFVNGEV 193

GO:0009507 "chloroplast" evidence=ISM

GO:0005739 "mitochondrion" evidence=IDA

TAIR\|locus:2206639 AT1G32580 "AT1G32580" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

TAIR\|locus:2051003 DAL1 "differentiation and greening-like 1" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

TAIR\|locus:2083348 MORF3 "multiple organellar RNA editing factor 3" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

TAIR\|locus:2200131 MORF9 "multiple organellar RNA editing factor 9" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

TAIR\|locus:2086310 RIP1 "RNA-editing factor interacting protein 1" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

UNIPROTKB\|Q2R8U1 Os11g0216400 "Os11g0216400 protein" [Oryza sativa Japonica Group (taxid:39947)]	Back alignment and assigned GO terms

TAIR\|locus:2119782 MORF1 "multiple organellar RNA editing factor 1" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

TAIR\|locus:2030200 AT1G72530 "AT1G72530" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

TAIR\|locus:2156344 MORF4 "AT5G44780" [Arabidopsis thaliana (taxid:3702)]	Back alignment and assigned GO terms

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

Original result of BLAST against SWISS-PROT

No confident hit for EC number transfer detected by BLAST in SWISS-PROT

EC Number Prediction by EzyPred Server

Original result from EzyPred Server

Failed to connect to the EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!

Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Original result from the STRING server

Failed to connect to the STRING server

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

Original result of RPS-BLAST against CDD database part I

Conserved Domains Detected by HHsearch

Original result of HHsearch against CDD database

ID	Alignment Graph	Length	Definition	Probability
Query		245
PF05922		82	Inhibitor_I9: Peptidase inhibitor I9; InterPro: IP	99.32

>PF05922 Inhibitor_I9: Peptidase inhibitor I9; InterPro: IPR010259 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively	Back alignment and domain information

Probab=99.32  E-value=1.4e-12  Score=95.50  Aligned_cols=76  Identities=24%  Similarity=0.330  Sum_probs=54.3

Q ss_pred             eEEEEecCCCCCCCCchhhHHHHHHHHHHhhCCh----hhhhcceeEEecccceeeeeecCHHHHHhhcCCCCeEEEeCC
Q 025989          112 HWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSE----EEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPD  187 (245)
Q Consensus       112 tYIVyMd~p~~~~ps~~~~~~~h~s~LaSVLgS~----eeAk~sIlYSYt~sfnGFAArLTeEEAe~Lk~lPGVVSVfPD  187 (245)
                      .|||.|+.....    +...+++.+++.+++.+.    ......++|+|+..|+||+|+|+++++++|+++|+|.+|.||
T Consensus         1 ~YIV~~k~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~Gfs~~l~~~~i~~L~~~p~V~~Ve~D   76 (82)
T PF05922_consen    1 RYIVVFKDDASA----ASSFSSHKSWQASILKSALKSASSINAKVLYSYDNAFNGFSAKLSEEEIEKLRKDPGVKSVEPD   76 (82)
T ss_dssp             EEEEEE-TTSTH----HCHHHHHHHHHH----HHHHTH-TTT-EEEEEESSTSSEEEEEE-HHHHHHHHTSTTEEEEEEE
T ss_pred             CEEEEECCCCCc----chhHHHHHHHHHHHHhhhhhhhcccCCceEEEEeeeEEEEEEEeCHHHHHHHHcCCCeEEEEeC
Confidence            599999876431    223566666666554321    234567899999999999999999999999999999999999


Q ss_pred             CCCC
Q 025989          188 SYVD  191 (245)
Q Consensus       188 s~lk  191 (245)
                      ..++
T Consensus        77 ~~v~   80 (82)
T PF05922_consen   77 QVVS   80 (82)
T ss_dssp             CEEE
T ss_pred             ceEe
Confidence            8764
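The summary line that precedes each HHsearch alignment (Probab, E-value, Score and so on) is a run of `key=value` pairs, so it can be read into a dictionary. This is a minimal sketch, assuming every value is numeric apart from an optional trailing percent sign; the function name `parse_hh_summary` is illustrative and not part of HH-suite.

```python
import re


def parse_hh_summary(line: str) -> dict:
    """Turn 'Probab=99.32  E-value=1.4e-12 ...' into a dict of floats."""
    fields = {}
    # Keys may contain letters, digits, underscores and hyphens (e.g. 'E-value').
    for key, value in re.findall(r"([\w-]+)=(\S+)", line):
        fields[key] = float(value.rstrip("%"))
    return fields


# Summary line copied from the PF05922 hit above.
SUMMARY = (
    "Probab=99.32  E-value=1.4e-12  Score=95.50  Aligned_cols=76  "
    "Identities=24%  Similarity=0.330  Sum_probs=54.3"
)

fields = parse_hh_summary(SUMMARY)
print(fields["Probab"], fields["E-value"], fields["Identities"])
```

Percentages such as `Identities=24%` are stored as the plain number 24.0, so any later comparison should use the same scale.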

In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain, either by interaction with a second peptidase or by autocatalytic cleavage, activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins, amongst others, are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus. Proteinase propeptide inhibitors (sometimes referred to as activation peptides) are responsible for the modulation of folding and activity of the pro-enzyme or zymogen. The pro-segment docks into the enzyme moiety, shielding the substrate binding site and thereby promoting inhibition of the enzyme. Several such propeptides share a similar topology, despite often low sequence identities.
The propeptide region has an open-sandwich antiparallel-alpha/antiparallel-beta fold, with two alpha-helices and four beta-strands with a (beta/alpha/beta)x2 topology. This group of sequences contains the propeptide domain at the N terminus of peptidases belonging to MEROPS family S8A, subtilisins. A number of the members of this group of sequences belong to MEROPS inhibitor family I9, clan I-. The propeptide is removed by proteolytic cleavage, which activates the enzyme. GO: 0004252 serine-type endopeptidase activity, 0042802 identical protein binding, 0043086 negative regulation of catalytic activity; PDB: 3CNQ_P 1SPB_P 3CO0_P 1ITP_A 1V5I_B 1SCJ_B 3P5B_P 2XTJ_P 2W2M_P 2P4E_P ....

Homologous Structure Templates

Structure Templates Detected by BLAST

Original result of BLAST against Protein Data Bank

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

Original result of RPS-BLAST against PDB70 database

ID	Alignment Graph	Length	Definition	E-value
Query		245
1vt4_I		1221	APAF-1 related killer DARK; drosophila apoptosome,	4e-06

>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221	Back alignment and structure

 Score = 46.8 bits (110), Expect = 4e-06
 Identities = 23/173 (13%), Positives = 43/173 (24%), Gaps = 52/173 (30%)

Query: 85  SPLNSGSNFSDRPP-------TEMAPLFPGCDYEHWLIVMDKPGGEGATKQQMIDCYIKT 137
               S S+ S            E+  L     YE+ L+V+              +     
Sbjct: 211 PNWTSRSDHSSNIKLRIHSIQAELRRLLKSKPYENCLLVLL-------------N----- 252

Query: 138 LAQVVGSEEEAKKKIYNVSCERYFGFGCEL-----DEETSNKLEGLPGVLFVLPDSYVDP 192
                         + N      F   C++      ++ ++ L         L D +   
Sbjct: 253 --------------VQNAKAWNAFNLSCKILLTTRFKQVTDFLSAATTTHISL-DHHSMT 297

Query: 193 ENKDYGAELFVN--GEIVQRSPERQRRVEPQ-----PQRAQDRPRYNDRTRYV 238
              D    L +       Q  P       P+      +  +D     D  ++V
Sbjct: 298 LTPDEVKSLLLKYLDCRPQDLPREVLTTNPRRLSIIAESIRDGLATWDNWKHV 350


Structure Templates Detected by HHsearch

Original result of HHsearch against PDB70 database

ID	Alignment Graph	Length	Definition	Probability
Query		245
2w2n_P		114	Proprotein convertase subtilisin/kexin type 9; hyd	99.43
3cnq_P		80	Subtilisin BPN'; uncleaved, proenzyme, substrate c	98.98
2qtw_A		124	Proprotein convertase subtilisin/kexin type 9 Pro;	98.98
2p4e_P		692	Proprotein convertase subtilisin/kexin type 9; pro	98.27
1v5i_B		76	POIA1, IA-1=serine proteinase inhibitor; protease-	98.05
3t41_A		471	Epidermin leader peptide processing serine protea;	97.85
3afg_A		539	Subtilisin-like serine protease; propeptide, therm	97.5
2z2z_A		395	TK-subtilisin precursor; thermococcus kodakaraensi	97.08
2z30_B		65	TK-subtilisin; thermococcus kodakaraensis, hydrola	94.72
3vta_A		621	Cucumisin; subtilisin-like fold, serine protease,	85.43

>2w2n_P Proprotein convertase subtilisin/kexin type 9; hydrolase-receptor complex, PCSK9, proprotein converta low-density lipoprotein receptor, EGF; 2.30A {Homo sapiens} PDB: 2w2m_P 2w2o_P 2w2p_P 2w2q_P 2xtj_P	Back alignment and structure

Probab=99.43  E-value=7e-14  Score=110.82  Aligned_cols=75  Identities=17%  Similarity=0.144  Sum_probs=62.1

Q ss_pred             eeEEEEecCCCCCCCCchhhHHHHHHHHHHhhCChhhhhcceeEEecccceeeeeecCHHHHHhhcCCCCeEEEeCCCCC
Q 025989          111 EHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYV  190 (245)
Q Consensus       111 ktYIVyMd~p~~~~ps~~~~~~~h~s~LaSVLgS~eeAk~sIlYSYt~sfnGFAArLTeEEAe~Lk~lPGVVSVfPDs~l  190 (245)
                      +.|||+|+....     ...+..|++||.+++++ +.+...++|+|++.|+||+|+|+++|+++|+++|+|++|.||+.+
T Consensus        38 ~~YIV~lk~~~~-----~~~~~~h~~~l~s~~~~-~~~~~~i~~sY~~~~~GFaa~Lt~~~~~~L~~~P~V~~VE~D~~v  111 (114)
T 2w2n_P           38 GTYVVVLKEETH-----LSQSERTARRLQAQAAR-RGYLTKILHVFHGLLPGFLVKMSGDLLELALKLPHVDYIEEDSSV  111 (114)
T ss_dssp             EEEEEEECTTCC-----HHHHHHHHHHHHHHHHH-TTCCCEEEEEECSSSSEEEEECCGGGHHHHHTSTTEEEEEEEEEE
T ss_pred             CcEEEEECCCCC-----HHHHHHHHHHHHHHhhh-cccCCceEEEecccceEEEEEcCHHHHHHHHcCCCccEEEeCceE
Confidence            789999975432     23456788898888764 234567999999999999999999999999999999999999865


Q ss_pred             C
Q 025989          191 D  191 (245)
Q Consensus       191 k  191 (245)
                      +
T Consensus       112 ~  112 (114)
T 2w2n_P          112 F  112 (114)
T ss_dssp             E
T ss_pred             e
Confidence            4


>3cnq_P Subtilisin BPN'; uncleaved, proenzyme, substrate complex, hydrolase, metal- binding, protease, secreted, serine protease, sporulation; 1.71A {Bacillus amyloliquefaciens} PDB: 3bgo_P 3co0_P 1spb_P 1scj_B	Back alignment and structure

>2qtw_A Proprotein convertase subtilisin/kexin type 9 Pro; coronary heart disease, hypercholest low density lipoprotein receptor, autocatalytic cleavage; HET: NAG; 1.90A {Homo sapiens} PDB: 3m0c_A 2pmw_A 3h42_A 3bps_P 3gcw_P 3gcx_P 3p5b_P 3p5c_P	Back alignment and structure

>2p4e_P Proprotein convertase subtilisin/kexin type 9; protease, LDL receptor, LDL, endocytosis, hydrol; 1.98A {Homo sapiens}	Back alignment and structure

>1v5i_B POIA1, IA-1=serine proteinase inhibitor; protease-inhibitor complex, subtilisin, hydrolase-Pro binding complex; 1.50A {Pleurotus ostreatus} SCOP: d.58.3.2 PDB: 1itp_A	Back alignment and structure

>3t41_A Epidermin leader peptide processing serine protea; structural genomics, center for structural genomics of infec diseases, csgid; 1.95A {Staphylococcus aureus} PDB: 3qfh_A	Back alignment and structure

>3afg_A Subtilisin-like serine protease; propeptide, thermococcus kodakaraensis, hydrolas protease; 2.00A {Thermococcus kodakarensis}	Back alignment and structure

>2z2z_A TK-subtilisin precursor; thermococcus kodakaraensis, hydrolase; 1.87A {Thermococcus kodakarensis} PDB: 2e1p_A 2zwp_A 2zwo_A	Back alignment and structure

>2z30_B TK-subtilisin; thermococcus kodakaraensis, hydrolase; 1.65A {Thermococcus kodakarensis} PDB: 2z2y_B 3a3p_B 2z56_B 2z58_B 2z57_B 3a3n_B 3a3o_B	Back alignment and structure

>3vta_A Cucumisin; subtilisin-like fold, serine protease, hydrolase; HET: DFP NAG FUC BMA MAN; 2.75A {Cucumis melo}	Back alignment and structure

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

Original result of RPS-BLAST against SCOP70(version1.75) database

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

Original result of HHsearch against SCOP70(version1.75) database

ID	Alignment Graph	Length	Definition	Probability
Query		245
d1scjb_		71	Subtilisin prosegment {Bacillus subtilis [TaxId: 1	98.05
d1v5ib1		72	Proteinase A inhibitor 1, POIA1 {Oyster mushroom (	98.03

>d1scjb_ d.58.3.2 (B:) Subtilisin prosegment {Bacillus subtilis [TaxId: 1423]}	Back information, alignment and structure

class: Alpha and beta proteins (a+b)
fold: Ferredoxin-like
superfamily: Protease propeptides/inhibitors
family: Subtilase propeptides/inhibitors
domain: Subtilisin prosegment
species: Bacillus subtilis [TaxId: 1423]

Probab=98.05  E-value=5.6e-06  Score=58.86  Aligned_cols=68  Identities=15%  Similarity=0.124  Sum_probs=49.9

Q ss_pred             eeEEEEecCCCCCCCCchhhHHHHHHHHHHhhCChhhhhcceeEEecccceeeeeecCHHHHHhhcCCCCeEEEeCCCCC
Q 025989          111 EHWLIVMDKPGGEGATKQQMIDCYIKTLAQVVGSEEEAKKKIYNVSCERYFGFGCELDEETSNKLEGLPGVLFVLPDSYV  190 (245)
Q Consensus       111 ktYIVyMd~p~~~~ps~~~~~~~h~s~LaSVLgS~eeAk~sIlYSYt~sfnGFAArLTeEEAe~Lk~lPGVVSVfPDs~l  190 (245)
                      +-|||.+.......     ....+.+++.+       ...++.+.|+ .++||+|+|++++++.|+..|+|..|=+|...
T Consensus         2 ~~YIV~fK~~~~~~-----~~~~~~~~v~~-------~gg~v~~~~~-~i~gfs~~l~~~~~~~L~~~p~V~yVE~D~v~   68 (71)
T d1scjb_           2 KKYIVGFKQTMSAM-----SSAKKKDVISQ-------KGGKVEKQFK-YVNAAAATLDEKAVKELKKDPSVAYVEEDHIA   68 (71)
T ss_dssp             EEEEEEECSSSSCC-----SHHHHHHHHHT-------TTCEEEEECS-SSSEEEEEECHHHHHHHHTSTTEEEEEECCEE
T ss_pred             CcEEEEECCCCChH-----HHHHHHHHHHH-------cCCeEEEEEe-ecceEEEEeCHHHHHHHHcCCCceEEeCCcEE
Confidence            57999997543221     22333333332       2345789997 69999999999999999999999999999865


Q ss_pred             C
Q 025989          191 D  191 (245)
Q Consensus       191 k  191 (245)
                      +
T Consensus        69 ~   69 (71)
T d1scjb_          69 H   69 (71)
T ss_dssp             E
T ss_pred             E
Confidence            3


>d1v5ib1 d.58.3.2 (B:1-72) Proteinase A inhibitor 1, POIA1 {Oyster mushroom (Pleurotus ostreatus) [TaxId: 5322]}	Back information, alignment and structure