Citrus sinensis ID: 040090


Local Sequence Feature Prediction

Prediction (Method); results follow below, one track per line in the same order
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110--
MASQQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEMGHRGGQMRREQMGSEGYQEMGRKGGLSTIDKSGGERAAEEGIEIDESKYKTSS
cccHHHHHHHHHHHHccccccccccccccHHHHHHHHHHccccccccccccccHHHHHHHcccccHHHHHccHHHHHHHHcccccccccccccHHHHHHccccccccccccc
cccHHHHHHHHHHHHcccEEcccccccccHHHHHHHHcccccccccHHHHHccccHHHcccccccccHHHcccccHHHccccccccccccccHHHHHHcccccccccccccc
MASQQGRKELDTRArqgetvvpggtggksLEAQEHLAegrsrggqtrreQLGTegyqemghrggqmrreqmgsegyqemgrkgglstidksggeraaeegieideskyktss
masqqgrkeldtrarqgetvvpggtggkslEAQEHLaegrsrggqtrreqlgtegyqemghrggqMRREQMGSEGYqemgrkgglstidksggeraaeegieideskyktss
MASQQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEMGHRGGQMRREQMGSEGYQEMGRKGGLSTIDKSggeraaeegieideSKYKTSS
****************************************************************************************************************
****************************************************************************************************************
**************RQGETVVPGGTGGKSLEAQE**************EQLGTEGYQ***********************RKGGLSTIDKSGGERAAEEGIEI*********
****************************************************************************************************************
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MASQQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEMGHRGGQMRREQMGSEGYQEMGRKGGLSTIDKSGGERAAEEGIEIDESKYKTSS
no confident homologs detected
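The per-residue tracks above are easier to compare once they are collapsed into intervals. The following is a minimal sketch, assuming each track is a string with one symbol per residue and that a run of identical symbols forms one segment. The meaning of individual symbols (for example `h` for a helix in the transmembrane tracks) is an assumption, since the report does not define them, and the function names are illustrative only.

```python
from itertools import groupby


def track_segments(track):
    """Collapse a per-residue track into (symbol, start, end) runs.

    Positions are 1-based and inclusive, matching the residue number marker.
    """
    segments = []
    position = 1
    for symbol, run in groupby(track):
        length = sum(1 for _ in run)
        segments.append((symbol, position, position + length - 1))
        position += length
    return segments


def segments_with(track, symbol):
    """Return only the (start, end) intervals labelled with the given symbol."""
    return [(start, end) for sym, start, end in track_segments(track) if sym == symbol]


# Toy track (not taken from the report): 5 'i', 4 'h', 3 'o'.
toy_track = "iiiii" + "hhhh" + "ooo"
print(track_segments(toy_track))
print(segments_with(toy_track, "h"))
```

Applied to one of the transmembrane tracks, `segments_with(track, "h")` would list the residue ranges of any predicted helices under the symbol assumption stated above.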

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   112     2.2.26 [Sep-21-2011]
Q02400  133     Late embryogenesis abunda  N/A       no        1.0      0.842    0.676     3e-38
Q07187  152     Em-like protein GEA1 OS=A  yes       no        0.964    0.710    0.590     6e-35
Q05191  153     Late embryogenesis abunda  N/A       no        1.0      0.732    0.588     1e-34
Q02973  92      Em-like protein GEA6 OS=A  no        no        0.821    1.0      0.651     4e-31
P17639  92      EMB-1 protein OS=Daucus c  N/A       no        0.803    0.978    0.672     5e-31
P42755  93      Em protein H5 OS=Triticum  N/A       no        0.821    0.989    0.672     6e-30
P46520  95      Embryonic abundant protei  yes       no        0.821    0.968    0.669     8e-30
Q08000  93      Em protein H2 OS=Triticum  N/A       no        0.821    0.989    0.663     2e-29
P04568  93      Em protein OS=Triticum ae  N/A       no        0.821    0.989    0.654     3e-29
Q05190  93      Late embryogenesis abunda  N/A       no        0.821    0.989    0.654     3e-29
>sp|Q02400|LE193_HORVU Late embryogenesis abundant protein B19.3 OS=Hordeum vulgare GN=B19.3 PE=2 SV=1
 Score =  156 bits (395), Expect = 3e-38,   Method: Compositional matrix adjust.
 Identities = 90/133 (67%), Positives = 102/133 (76%), Gaps = 21/133 (15%)

Query: 1   MAS-QQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQ-------------- 45
           MAS QQ R ELD  AR+GETVVPGGTGGK+LEAQEHLAEGRSRGGQ              
Sbjct: 1   MASGQQERSELDRMAREGETVVPGGTGGKTLEAQEHLAEGRSRGGQTRKDQLGEEGYREM 60

Query: 46  ------TRREQLGTEGYQEMGHRGGQMRREQMGSEGYQEMGRKGGLSTIDKSGGERAAEE 99
                 TR+EQLG EGY+EMGH+GG+ R+EQMG EGY EMGRKGGLST+++SGGERAA E
Sbjct: 61  GHKGGETRKEQLGEEGYREMGHKGGETRKEQMGEEGYHEMGRKGGLSTMEESGGERAARE 120

Query: 100 GIEIDESKYKTSS 112
           GI+IDESK+KT S
Sbjct: 121 GIDIDESKFKTKS 133
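
The identity and coverage values in the hit table can be related to the alignment statistics above. The following is a rough sketch, assuming identity is the number of identical positions divided by the alignment length, and coverage is the number of ungapped columns divided by the sequence length. These definitions are inferred from the numbers in this report, not stated by it, and the function name is hypothetical.

```python
def alignment_fractions(identities, aln_len, gaps, query_len, hit_len):
    """Derive identity and coverage fractions from BLAST summary counts.

    Assumed definitions (inferred, not documented by the report):
      identity = identities / alignment length
      coverage = ungapped columns / sequence length
    """
    paired_columns = aln_len - gaps  # columns where neither sequence has a gap
    return {
        "identity": identities / aln_len,
        "q_cover": paired_columns / query_len,
        "h_cover": paired_columns / hit_len,
    }


# Counts from the Q02400 alignment: 90/133 identities, 21 gaps,
# query length 112, hit length 133.
stats = alignment_fractions(identities=90, aln_len=133, gaps=21,
                            query_len=112, hit_len=133)
print({name: round(value, 4) for name, value in stats.items()})
```

For Q02400 this gives roughly 0.677, 1.0 and 0.842, which agrees with the 0.676, 1.0 and 0.842 in the table if the report truncates values rather than rounding them.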




LEA (late embryogenesis abundant) proteins are abundant in the seed embryos of higher plants.
Hordeum vulgare (taxid: 4513)
>sp|Q07187|EM1_ARATH Em-like protein GEA1 OS=Arabidopsis thaliana GN=EM1 PE=1 SV=1
>sp|Q05191|LE194_HORVU Late embryogenesis abundant protein B19.4 OS=Hordeum vulgare GN=B19.4 PE=2 SV=1
>sp|Q02973|EM6_ARATH Em-like protein GEA6 OS=Arabidopsis thaliana GN=EM6 PE=1 SV=1
>sp|P17639|EMB1_DAUCA EMB-1 protein OS=Daucus carota GN=EMB-1 PE=2 SV=1
>sp|P42755|EM4_WHEAT Em protein H5 OS=Triticum aestivum GN=EMH5 PE=2 SV=1
>sp|P46520|EMP1_ORYSJ Embryonic abundant protein 1 OS=Oryza sativa subsp. japonica GN=EMP1 PE=2 SV=1
>sp|Q08000|EM3_WHEAT Em protein H2 OS=Triticum aestivum GN=EMH2 PE=2 SV=1
>sp|P04568|EM1_WHEAT Em protein OS=Triticum aestivum GN=EM PE=2 SV=1
>sp|Q05190|LE19A_HORVU Late embryogenesis abundant protein B19.1A OS=Hordeum vulgare GN=B19.1A PE=2 SV=1
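
The SWISS-PROT hit headers above follow a regular `>db|accession|entry_name description KEY=value` layout, so they can be split into fields. The following is a minimal sketch, assuming only the keys `OS`, `OX`, `GN`, `PE` and `SV` occur (`OX` does not appear in this report and is included as an assumption). The function name and the output field names are illustrative.

```python
import re

# Keys assumed to appear in UniProt-style headers; OX is not used in this report.
_KEYS = r"(?:OS|OX|GN|PE|SV)"


def parse_uniprot_header(line):
    """Split a '>db|accession|entry_name description KEY=value ...' header."""
    match = re.match(r"^>(\w+)\|([^|]+)\|(\S+)\s+(.*)$", line.strip())
    if match is None:
        raise ValueError("not a UniProt-style header: " + line)
    db, accession, entry_name, rest = match.groups()

    # Everything before the first KEY= token is the protein description.
    first_key = re.search(r"\s" + _KEYS + "=", rest)
    description = rest[: first_key.start()] if first_key else rest

    # Each value runs until the next KEY= token or the end of the line.
    pairs = re.findall(r"\b(OS|OX|GN|PE|SV)=(.*?)(?=\s+" + _KEYS + r"=|$)", rest)
    fields = dict(pairs)
    return {
        "db": db,
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
        "organism": fields.get("OS"),
        "gene": fields.get("GN"),
        "protein_existence": fields.get("PE"),
        "sequence_version": fields.get("SV"),
    }


header = (">sp|P46520|EMP1_ORYSJ Embryonic abundant protein 1 "
          "OS=Oryza sativa subsp. japonica GN=EMP1 PE=2 SV=1")
print(parse_uniprot_header(header))
```

Values are kept as strings so that multi-word organism names such as `Oryza sativa subsp. japonica` survive intact.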

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      112
255579322  112     Late seed maturation protein P8B6, putat  1.0      1.0      0.857     1e-45
18499      110     late embryogenesis abundant protein [Gos  0.982    1.0      0.845     6e-44
444336     110     water stress-related protein              0.982    1.0      0.836     1e-43
33151040   113     Em protein [Quercus robur]                1.0      0.991    0.812     3e-42
356568122  112     PREDICTED: em-like protein GEA1-like [Gl  0.991    0.991    0.821     9e-42
1754977    112     Em protein [Robinia pseudoacacia]         0.982    0.982    0.810     2e-41
146189786  112     late embryogenesis abundant protein [Vig  0.991    0.991    0.794     3e-41
255637579  112     unknown [Glycine max]                     0.991    0.991    0.812     7e-41
3641278    112     late embryogenic abundant protein [Vigna  0.991    0.991    0.785     7e-41
388494570  112     unknown [Lotus japonicus]                 0.991    0.991    0.803     7e-41
>gi|255579322|ref|XP_002530506.1| Late seed maturation protein P8B6, putative [Ricinus communis] gi|223529963|gb|EEF31890.1| Late seed maturation protein P8B6, putative [Ricinus communis]
 Score =  186 bits (472), Expect = 1e-45,   Method: Compositional matrix adjust.
 Identities = 96/112 (85%), Positives = 105/112 (93%)

Query: 1   MASQQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEMG 60
           M+S Q R ELD RA++GETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGY+E+G
Sbjct: 1   MSSDQERAELDARAKRGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYKELG 60

Query: 61  HRGGQMRREQMGSEGYQEMGRKGGLSTIDKSGGERAAEEGIEIDESKYKTSS 112
           H+GG+ RREQ+G+EGYQEMGRKGGLSTIDKSGGERAAEEGIEIDESKYK  S
Sbjct: 61  HKGGETRREQIGTEGYQEMGRKGGLSTIDKSGGERAAEEGIEIDESKYKARS 112




Source: Ricinus communis

Species: Ricinus communis

Genus: Ricinus

Family: Euphorbiaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|18499|emb|CAA38374.1| late embryogenesis abundant protein [Gossypium hirsutum] gi|167330|gb|AAA33057.1| embryogensis abundant protein [Gossypium hirsutum] gi|167353|gb|AAB00728.1| water-stress protectant protein [Gossypium hirsutum] gi|167355|gb|AAA33064.1| late embryogenesis-abundant protein 2-D [Gossypium hirsutum]
>gi|444336|prf||1906384B water stress-related protein
>gi|33151040|gb|AAP97398.1| Em protein [Quercus robur]
>gi|356568122|ref|XP_003552262.1| PREDICTED: em-like protein GEA1-like [Glycine max]
>gi|1754977|gb|AAB39473.1| Em protein [Robinia pseudoacacia]
>gi|146189786|emb|CAM92311.1| late embryogenesis abundant protein [Vigna radiata] gi|148291150|emb|CAN84534.1| late embryogenesis abundant protein [Vigna radiata] gi|148291152|emb|CAN84535.1| late embryogenesis abundant protein [Vigna radiata]
>gi|255637579|gb|ACU19115.1| unknown [Glycine max]
>gi|3641278|gb|AAC36329.1| late embryogenic abundant protein [Vigna radiata]
>gi|388494570|gb|AFK35351.1| unknown [Lotus japonicus]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology Terms Detected by BLAST

ID                  Length  Definition                      Q cover  H cover  Identity  E-value
Query               112
TAIR|locus:2074383  152     EM1 "AT3G51810" [Arabidopsis t  0.75     0.552    0.776     3.7e-32
TAIR|locus:2065041  92      GEA6 "AT2G40170" [Arabidopsis   0.714    0.869    0.7       5.2e-26
UNIPROTKB|P46520    95      EMP1 "Embryonic abundant prote  0.633    0.747    0.718     1.4e-23
TAIR|locus:2074383 EM1 "AT3G51810" [Arabidopsis thaliana (taxid:3702)]
 Score = 352 (129.0 bits), Expect = 3.7e-32, P = 3.7e-32
 Identities = 66/85 (77%), Positives = 77/85 (90%)

Query:     1 MASQQ-GRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEM 59
             MAS+Q  R+ELD +A+QGETVVPGGTGG SLEAQEHLAEGRS+GGQTR+EQLG EGYQE+
Sbjct:     1 MASKQLSREELDEKAKQGETVVPGGTGGHSLEAQEHLAEGRSKGGQTRKEQLGHEGYQEI 60

Query:    60 GHRGGQMRREQMGSEGYQEMGRKGG 84
             GH+GG+ R+EQ+G EGYQEMG KGG
Sbjct:    61 GHKGGEARKEQLGHEGYQEMGHKGG 85


GO:0003674 "molecular_function" evidence=ND
GO:0005737 "cytoplasm" evidence=ISM
GO:0009793 "embryo development ending in seed dormancy" evidence=RCA;TAS
GO:0009640 "photomorphogenesis" evidence=RCA
GO:0009737 "response to abscisic acid stimulus" evidence=RCA;TAS
GO:0009845 "seed germination" evidence=RCA
GO:0009909 "regulation of flower development" evidence=RCA
GO:0009933 "meristem structural organization" evidence=RCA
GO:0010162 "seed dormancy process" evidence=RCA
GO:0010182 "sugar mediated signaling pathway" evidence=RCA
GO:0010228 "vegetative to reproductive phase transition of meristem" evidence=RCA
GO:0016567 "protein ubiquitination" evidence=RCA
GO:0019915 "lipid storage" evidence=RCA
GO:0048825 "cotyledon development" evidence=RCA
GO:0050826 "response to freezing" evidence=RCA
GO:0051301 "cell division" evidence=RCA
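
Each GO line above has the same three parts: an identifier, a quoted term name, and one or more evidence codes separated by semicolons. The following is a minimal parsing sketch under that assumption. The function name and the dictionary keys are hypothetical, and lines that do not match the pattern are returned as `None` rather than guessed at.

```python
import re

# GO identifier, quoted term name, then evidence codes joined by ';'.
GO_LINE = re.compile(r'^(GO:\d{7})\s+"([^"]*)"\s+evidence=(\S+)$')


def parse_go_line(line):
    """Parse one 'GO:nnnnnnn "name" evidence=A;B' line into a dict."""
    match = GO_LINE.match(line.strip())
    if match is None:
        return None
    go_id, name, evidence = match.groups()
    return {"id": go_id, "name": name, "evidence": evidence.split(";")}


example = 'GO:0009793 "embryo development ending in seed dormancy" evidence=RCA;TAS'
print(parse_go_line(example))
```

Splitting the evidence field makes it straightforward to separate terms supported only by computational codes such as `RCA` from those that also carry an author statement (`TAS`).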
TAIR|locus:2065041 GEA6 "AT2G40170" [Arabidopsis thaliana (taxid:3702)]
UNIPROTKB|P46520 EMP1 "Embryonic abundant protein 1" [Oryza sativa Japonica Group (taxid:39947)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name         Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
P22701  EM2_WHEAT    No assigned EC number  0.6403    0.8214          0.9787        N/A       no
P42755  EM4_WHEAT    No assigned EC number  0.6725    0.8214          0.9892        N/A       no
Q02400  LE193_HORVU  No assigned EC number  0.6766    1.0             0.8421        N/A       no
P09443  LE19_GOSHI   No assigned EC number  0.5855    0.8125          0.8921        N/A       no
P17639  EMB1_DAUCA   No assigned EC number  0.6727    0.8035          0.9782        N/A       no
P46532  LE19B_HORVU  No assigned EC number  0.6548    0.8214          0.9892        N/A       no
Q05190  LE19A_HORVU  No assigned EC number  0.6548    0.8214          0.9892        N/A       no
P04568  EM1_WHEAT    No assigned EC number  0.6548    0.8214          0.9892        N/A       no
Q07187  EM1_ARATH    No assigned EC number  0.5906    0.9642          0.7105        yes       no
Q08000  EM3_WHEAT    No assigned EC number  0.6637    0.8214          0.9892        N/A       no
P46520  EMP1_ORYSJ   No assigned EC number  0.6695    0.8214          0.9684        yes       no

EC Number Prediction by EzyPred Server

Failed to connect to the EzyPred server

EC Number Prediction by EFICAz Software

No EC number assigned; probably not an enzyme.


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Failed to connect to the STRING server


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition                                          E-value
Query      112
pfam00477  109     pfam00477, LEA_5, Small hydrophilic plant seed pro  6e-28
>gnl|CDD|109530 pfam00477, LEA_5, Small hydrophilic plant seed protein
 Score = 97.8 bits (243), Expect = 6e-28
 Identities = 88/107 (82%), Positives = 99/107 (92%)

Query: 2   ASQQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEMGH 61
           + Q+ R+ELD RA+QGETVVPGGTGGKSLEAQEHLAEGRS+GGQTR+EQLGTEGYQEMG 
Sbjct: 3   SGQEEREELDRRAKQGETVVPGGTGGKSLEAQEHLAEGRSKGGQTRKEQLGTEGYQEMGT 62

Query: 62  RGGQMRREQMGSEGYQEMGRKGGLSTIDKSGGERAAEEGIEIDESKY 108
           +GGQ R+EQMG EGY EMGRKGGLST+D+SGGERAA EGIEIDESK+
Sbjct: 63  KGGQTRKEQMGHEGYSEMGRKGGLSTMDESGGERAAREGIEIDESKF 109


Length = 109

Conserved Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    112
PF00477  109     LEA_5: Small hydrophilic plant seed protein; Inter  100.0
PF00477  109     LEA_5: Small hydrophilic plant seed protein; Inter  99.58
COG3729  73      GsiB General stress protein [General function pred  99.3
COG3729  73      GsiB General stress protein [General function pred  99.25
PF10685  23      KGG: Stress-induced bacterial acidophilic repeat m  96.69
PF10685  23      KGG: Stress-induced bacterial acidophilic repeat m  96.34
>PF00477 LEA_5: Small hydrophilic plant seed protein; InterPro: IPR000389 This entry contains a number of bacterial proteins annotated as stress-induced and members of the plant LEA (late embryogenesis abundant) proteins, which are small hydrophilic plant seed proteins that are structurally related
Probab=100.00  E-value=2e-50  Score=292.98  Aligned_cols=108  Identities=71%  Similarity=1.085  Sum_probs=106.8

Q ss_pred             Ccc-hhhhHHHHHHHhcCCccccCCCCCcchhHHHHHHHhhhccchhhhhhhchhhHHhhhccccchhhhhcCchhHHHh
Q 040090            1 MAS-QQGRKELDTRARQGETVVPGGTGGKSLEAQEHLAEGRSRGGQTRREQLGTEGYQEMGHRGGQMRREQMGSEGYQEM   79 (112)
Q Consensus         1 m~~-~~~~~eld~~a~~getvv~ggtgg~s~~aqEfy~E~G~KGGeats~~~G~efY~EiG~KGGeat~e~~g~efYeEi   79 (112)
                      ||| |++|++||++||+|+|||||||||+||+||++++|+|+|||++|+++||+|||++||+|||++|+++|+++||++|
T Consensus         1 ma~~q~~r~eld~~aregetvv~gGtggksl~aqe~laEggkKGGetr~e~~G~E~YqEiG~KGGe~t~e~~g~EfY~ei   80 (109)
T PF00477_consen    1 MASGQESREELDARAREGETVVPGGTGGKSLEAQERLAEGGKKGGETRKEQHGKEFYQEIGKKGGEATKEKHGKEFYEEI   80 (109)
T ss_pred             CcchhHHHHHHHHHHhcCCccccCCCCCCcchHHHHHHHHHhhcccchhhhcchhHHHHHhhccCccchhhhchHHHHHH
Confidence            899 6699999999999999999999999999999999999999999999999999999999999999999999999999


Q ss_pred             hhcCCccccccCCchhHHhhCCccCcccc
Q 040090           80 GRKGGLSTIDKSGGERAAEEGIEIDESKY  108 (112)
Q Consensus        80 G~KGG~a~~~~~g~efyeE~G~~~de~~~  108 (112)
                      |||||+++++++|.|+++++||+||||||
T Consensus        81 GrKGG~~~~~~~g~era~~eg~~~de~~~  109 (109)
T PF00477_consen   81 GRKGGEATSDKSGGERAAEEGIEIDESKF  109 (109)
T ss_pred             HHhhCcccccccchHHHHHcCCCcccccC
Confidence            99999999999999999999999999998
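
The HHsearch summary line that precedes the alignment above is a run of `key=value` tokens, which makes it easy to turn into numbers for filtering hits. The following is a small sketch, assuming every value is numeric and that a trailing `%` denotes a percentage. The function name is illustrative and is not part of HHsearch.

```python
def parse_hh_stats(line):
    """Convert 'Probab=100.00  E-value=2e-50 ...' into a dict of floats.

    Percentages such as 'Identities=71%' are stored as fractions (0.71).
    Tokens without '=' are ignored.
    """
    stats = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue
        if value.endswith("%"):
            stats[key] = float(value[:-1]) / 100.0
        else:
            stats[key] = float(value)
    return stats


summary = ("Probab=100.00  E-value=2e-50  Score=292.98  Aligned_cols=108  "
           "Identities=71%  Similarity=1.085  Sum_probs=106.8")
hh_stats = parse_hh_stats(summary)
print(hh_stats["Probab"], hh_stats["E-value"], hh_stats["Aligned_cols"])
```

A threshold such as `hh_stats["Probab"] >= 80.0` could then mirror the probability cut-off quoted in the structure template sections of this report.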



These proteins contain from 83 to 153 amino acid residues and may play a role in equipping the seed for survival, maintaining a minimal level of hydration in the dry organism and preventing the denaturation of cytoplasmic components. They may also play a role during imbibition by controlling water uptake.

>PF00477 LEA_5: Small hydrophilic plant seed protein; InterPro: IPR000389 This entry contains a number of bacterial proteins annotated as stress-induced and members of the plant LEA (late embryogenesis abundant) proteins, which are small hydrophilic plant seed proteins that are structurally related
>COG3729 GsiB General stress protein [General function prediction only]
>COG3729 GsiB General stress protein [General function prediction only]
>PF10685 KGG: Stress-induced bacterial acidophilic repeat motif; InterPro: IPR019626 This repeat contains a highly conserved, characteristic sequence motif, KGG, that is recognised by plants and lower eukaryotes
>PF10685 KGG: Stress-induced bacterial acidophilic repeat motif; InterPro: IPR019626 This repeat contains a highly conserved, characteristic sequence motif, KGG, that is recognised by plants and lower eukaryotes

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with E-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with E-value below 0.005

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with E-value below 0.005

Homologous Domains Detected by HHsearch

No hit with probability above 80.00