Citrus sinensis ID: 027132


Local Sequence Feature Prediction

Prediction and (Method) Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------
MAAATSLLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRPQNIESSSTSSASPDRFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAASGGRMGGRSFSSHSSSSSSRTYMVEPRVGFSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVLDLRL
ccccccHHHcccccccccccccccccccccccccccccccccccccccccccEEEEEEccccccccccccccccccccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHccccccHHHHccccccccccccccccccccccccccccccccccccccccccccccccccccccEEEcccccHHHHHHHHHHHHHHHHHccccccccccEEEEccEEEEEEEEEEEccc
ccHHHHHHHcccccccccccccccccccccccccccccccccccccccccccEEEEEEccccccccccccccccccccHHHHHHHHHHHHHHHHHcHHHHHHHHHHHHHccccHHHHcccccccccccccccccccccccccccccccccccccccccccccccccEEEcccEEEccccHHHHHHHHHHHHHHHHHHccccccccccEEEEccccEEEEEEEEEccc
MAAATSLLELNafkwtqlrpnprlfphrpakfsanlNNFRTLQRtttarnfngvkcfcrpqniessstssaspdrffdpLEILTNALSKSIQALKKPAVIAVVLGFLltwdpnlafaasggrmggrsfsshssssssrtymveprvgfsasapyyapspfggagggiyvgpavgvgvGAGSSLFLILMGFAAFVLVSGflsdrsdgsvltatdktsvIKLQVLDLRL
MAAATsllelnafkwtqlrpnPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRPQNIessstssaspdRFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAASGGRMGGRSFSshssssssrTYMVEPRVGFSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSdgsvltatdktsviklqvldlrl
MAAATSLLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRPQNIESSSTSSASPDRFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAAsggrmggrsfsshssssssrTYMVEPRVGFSASAPYYAPSPFggagggiyvgpavgvgvgagSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVLDLRL
******LLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCR****************FFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAA***************************VGFSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVL****
***ATSL*ELNAFKWTQLRPNPRLFPHRPAKFSANLN***************GV**************************EILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFA*********************************SAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVLDLRL
MAAATSLLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRPQN**********PDRFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAASG********************MVEPRVGFSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVLDLRL
****TSLLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRPQ*************RFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFA******************************FSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVLDLRL
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHHHHHHHHoooooooooooooooooooooooooooooooooooooooooooooooooooooooooHHHHHHHHHHHHHHHHHHHHHHHiiiiiiiiiiiiiiiiiiiiiiiiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooHHHHHHHHHHHHHHHHHHHHHHHHHiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHHHHHoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooHHHHHHHHHHHHHHHHHHHHHHiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHoooooooooooooooooooooooooooooooooooooooooooooooooooooooooHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiHHHHHHHHHHHHHHHHHHHHHooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooHHHHHHHHHHHHHHHHHHHHHHiiiiiiiiiiiiiiiiiiiiiiiiiii
SSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MAAATSLLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRPQNIESSSTSSASPDRFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAASGGRMGGRSFSSHSSSSSSRTYMVEPRVGFSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAFVLVSGFLSDRSDGSVLTATDKTSVIKLQVLDLRL
no confident homologs detected
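
The tracks above carry one character per residue, so contiguous runs of a symbol can be turned into residue ranges. The following is a minimal sketch, assuming the common convention that `H` marks a predicted helix in the transmembrane rows; the function name `state_segments` and the toy track are illustrative and not part of the report.

```python
from itertools import groupby


def state_segments(track, state):
    """Return 1-based inclusive (start, end) spans where `track` equals `state`."""
    spans = []
    pos = 1  # residue numbering in the report starts at 1
    for symbol, group in groupby(track):
        length = len(list(group))
        if symbol == state:
            spans.append((pos, pos + length - 1))
        pos += length
    return spans


# Toy 20-residue track (hypothetical, not copied from the report).
toy_track = "iiiHHHHHoooooHHHHiii"
helices = state_segments(toy_track, "H")
print(helices)
```

Passing one of the full transmembrane rows as `track` would list the predicted helix boundaries for that method, which makes it easier to compare where the methods agree.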

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

No hits with e-value below 0.001 by BLAST

Close Homologs in the Non-Redundant Database Detected by BLAST

GI Alignment Graph Length Definition Q cover H cover Identity E-value
Query 227
449449537 406 PREDICTED: uncharacterized protein LOC10 0.647 0.362 0.644 2e-39
449518362 231 PREDICTED: uncharacterized protein LOC10 0.647 0.636 0.644 1e-38
356573091 378 PREDICTED: uncharacterized protein LOC10 0.850 0.510 0.533 9e-33
357512223 390 hypothetical protein MTR_7g114660 [Medic 0.845 0.492 0.507 1e-32
18404967 391 uncharacterized protein [Arabidopsis tha 0.903 0.524 0.489 1e-32
11993861 391 unknown protein [Arabidopsis thaliana] g 0.903 0.524 0.484 5e-32
255542366 360 conserved hypothetical protein [Ricinus 0.629 0.397 0.707 2e-31
297848040 389 hypothetical protein ARALYDRAFT_474728 [ 0.907 0.529 0.478 1e-30
118489113 407 unknown [Populus trichocarpa x Populus d 0.929 0.518 0.483 1e-30
148905805 460 unknown [Picea sitchensis] 0.572 0.282 0.559 3e-25
>gi|449449537|ref|XP_004142521.1| PREDICTED: uncharacterized protein LOC101210275 [Cucumis sativus]
 Score =  167 bits (424), Expect = 2e-39,   Method: Compositional matrix adjust.
 Identities = 96/149 (64%), Positives = 116/149 (77%), Gaps = 2/149 (1%)

Query: 74  DRFFDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAASGGRMGGRSFSSHSS 133
           D+   P+E++ N++  +++AL+KPA+ AV+LG LL +DPN A AASGGR+GG +FSS SS
Sbjct: 85  DKPKSPMEVIGNSIINALKALQKPAIAAVLLGLLLMYDPNSALAASGGRVGGNAFSSRSS 144

Query: 134 SSSSRTYMVEPRVGFSASAPYYAPSPFGGAGGGIYVGPAVGVGVGAGSSLFLILMGFAAF 193
           SSS          GFS SAPY +PS FGG  GGIYVGPAVGVG+GAGSS   IL GFAAF
Sbjct: 145 SSSRSYSTPRMSSGFSYSAPYTSPSMFGG--GGIYVGPAVGVGLGAGSSFVFILAGFAAF 202

Query: 194 VLVSGFLSDRSDGSVLTATDKTSVIKLQV 222
           +LVSGFLSDRSD SVLTA+DKTSV+KLQV
Sbjct: 203 LLVSGFLSDRSDTSVLTASDKTSVLKLQV 231
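
The summary line of the alignment above can be read back into numbers. The sketch below parses the `Identities`, `Positives` and `Gaps` counts and recomputes the identity fraction; 96/149 is about 0.644, which agrees with the Identity column of the hit table. The helper name `parse_alignment_stats` is an assumption made for illustration.

```python
import re

# Matches e.g. "Identities = 96/149" and captures the label and both counts.
STATS_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_alignment_stats(line):
    """Map each statistic name (lower-cased) to a (count, alignment_length) pair."""
    return {name.lower(): (int(n), int(d)) for name, n, d in STATS_RE.findall(line)}


line = " Identities = 96/149 (64%), Positives = 116/149 (77%), Gaps = 2/149 (1%)"
stats = parse_alignment_stats(line)
matches, columns = stats["identities"]
identity_fraction = matches / columns
print(stats, round(identity_fraction, 3))
```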




Source: Cucumis sativus

Species: Cucumis sativus

Genus: Cucumis

Family: Cucurbitaceae

Order: Cucurbitales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|449518362|ref|XP_004166211.1| PREDICTED: uncharacterized protein LOC101226468 [Cucumis sativus]
>gi|356573091|ref|XP_003554698.1| PREDICTED: uncharacterized protein LOC100800250 [Glycine max]
>gi|357512223|ref|XP_003626400.1| hypothetical protein MTR_7g114660 [Medicago truncatula] gi|355501415|gb|AES82618.1| hypothetical protein MTR_7g114660 [Medicago truncatula]
>gi|18404967|ref|NP_564660.1| uncharacterized protein [Arabidopsis thaliana] gi|20260342|gb|AAM13069.1| unknown protein [Arabidopsis thaliana] gi|21537407|gb|AAM61748.1| unknown [Arabidopsis thaliana] gi|332194993|gb|AEE33114.1| uncharacterized protein [Arabidopsis thaliana]
>gi|11993861|gb|AAG42914.1|AF327533_1 unknown protein [Arabidopsis thaliana] gi|13926263|gb|AAK49603.1|AF372887_1 At1g54520/F20D21_34 [Arabidopsis thaliana] gi|28416543|gb|AAO42802.1| At1g54520/F20D21_34 [Arabidopsis thaliana]
>gi|255542366|ref|XP_002512246.1| conserved hypothetical protein [Ricinus communis] gi|223548207|gb|EEF49698.1| conserved hypothetical protein [Ricinus communis]
>gi|297848040|ref|XP_002891901.1| hypothetical protein ARALYDRAFT_474728 [Arabidopsis lyrata subsp. lyrata] gi|297337743|gb|EFH68160.1| hypothetical protein ARALYDRAFT_474728 [Arabidopsis lyrata subsp. lyrata]
>gi|118489113|gb|ABK96363.1| unknown [Populus trichocarpa x Populus deltoides]
>gi|148905805|gb|ABR16066.1| unknown [Picea sitchensis]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID Alignment graph Length Definition Q cover H cover Identity E-value
Query 227
TAIR|locus:2020158 391 AT1G54520 "AT1G54520" [Arabido 0.903 0.524 0.399 1.9e-26
TAIR|locus:2020158 AT1G54520 "AT1G54520" [Arabidopsis thaliana (taxid:3702)]
 Score = 298 (110.0 bits), Expect = 1.9e-26, P = 1.9e-26
 Identities = 93/233 (39%), Positives = 120/233 (51%)

Query:     1 MAAATSLLELNAFKWTQLRPNPRLFPHRPAKFSANLNNFRTLQRTTTARNFNGVKCFCRP 60
             MA++++ LEL  F+W Q  P P  +  RP   +  L + +  +R+ + R    VK     
Sbjct:     1 MASSSTFLELTPFQWNQ--PLP--YTQRPHHRTVLLYS-KPQRRSNSIRLQISVKY---- 51

Query:    61 QNIESSSTSSASPD-RF-FDPLEILTNALSKSIQALKKPAVIAVVLGFLLTWDPNLAFAA 118
                   STSS+ PD R  F+P E +   + K++ +LKKPA+ AV+LG LL +DPN A AA
Sbjct:    52 ----KQSTSSSDPDLRSNFNPFEQIAIQVKKALDSLKKPAIAAVLLGLLLFYDPNSALAA 107

Query:   119 XXXXX---XXXXXXXXXXXXXXXTYMV----EPRVGFSA-SAPYYAPSPFXXXXXXXXXX 170
                                    +Y V     P   +SA +APYY PSPF          
Sbjct:   108 SGGRIGGNSFSSRSRSSSSSSSQSYSVPRTSNPSFSYSARTAPYYGPSPFGGGFVGPAVG 167

Query:   171 XXXXXXXXXXSSLFLILMGFAAFVLVSGFLSDRS-DGSVLTATDKTSVIKLQV 222
                       SS  LIL+GFAAFVLVSGFLSDRS D S+LT T KTSVIKLQV
Sbjct:   168 FGFGGF----SSFSLILVGFAAFVLVSGFLSDRSQDDSILTDTQKTSVIKLQV 216


Parameters:
  V=100
  filter=SEG
  E=0.001

  ctxfactor=1.00

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   +0      0   BLOSUM62        0.323   0.134   0.393    same    same    same
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S W   T  X   E2     S2
   +0      0      227       187   0.00085  110 3  11 22  0.38    32
                                                     31  0.41    35


Statistics:

  Database:  /share/blast/go-seqdb.fasta
   Title:  go_20130330-seqdb.fasta
   Posted:  5:47:42 AM PDT Apr 1, 2013
   Created:  5:47:42 AM PDT Apr 1, 2013
   Format:  XDF-1
   # of letters in database:  169,044,731
   # of sequences in database:  368,745
   # of database sequences satisfying E:  1
  No. of states in DFA:  570 (61 KB)
  Total size of DFA:  140 KB (2087 KB)
  Time to generate neighborhood:  0.00u 0.00s 0.00t   Elapsed:  00:00:00
  No. of threads or processors used:  24
  Search cpu time:  13.96u 0.09s 14.05t   Elapsed:  00:00:02
  Total cpu time:  13.96u 0.09s 14.05t   Elapsed:  00:00:02
  Start:  Fri May 10 01:09:52 2013   End:  Fri May 10 01:09:54 2013
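
The raw score and the bit score reported for the AT1G54520 hit are related through the Karlin-Altschul parameters listed above: bits = (Lambda * S - ln K) / ln 2. The sketch below is a worked check, assuming the gapped "As Used" row (Q=9,R=2; Lambda = 0.244, K = 0.0300) is the one that applies to this alignment; with S = 298 it reproduces roughly 110 bits. It also shows the standard conversion P = 1 - exp(-E), which is why P and Expect are printed as the same value when E is tiny. The function names are illustrative.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits using Karlin-Altschul Lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def p_value(e_value):
    """P = 1 - exp(-E); expm1 keeps precision when E is very small."""
    return -math.expm1(-e_value)


# Values copied from the report; the choice of the gapped row is an assumption.
bits = bit_score(298, 0.244, 0.0300)
print(round(bits, 1), p_value(1.9e-26))
```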


GO:0003674 "molecular_function" evidence=ND
GO:0008150 "biological_process" evidence=ND
GO:0009507 "chloroplast" evidence=ISM;IDA
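
Each GO line above pairs a term with one or more evidence codes separated by semicolons. In the Gene Ontology evidence code scheme, ND means no biological data available, ISM means inferred from sequence model, and IDA means inferred from direct assay, so only the chloroplast term carries actual evidence here. A minimal parsing sketch follows, assuming every line has the `ID "name" evidence=CODES` layout seen in this report; `parse_go_line` is a hypothetical helper.

```python
import re

# GO identifier, quoted term name, then semicolon-separated evidence codes.
GO_LINE_RE = re.compile(r'^(GO:\d{7})\s+"([^"]*)"\s+evidence=(\S+)$')


def parse_go_line(line):
    """Split one GO annotation line into its ID, name and list of evidence codes."""
    match = GO_LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized GO line: {line!r}")
    go_id, name, evidence = match.groups()
    return {"id": go_id, "name": name, "evidence": evidence.split(";")}


term = parse_go_line('GO:0009507 "chloroplast" evidence=ISM;IDA')
print(term)
```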

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer detected in SWISS-PROT by BLAST

EC Number Prediction by EzyPred Server

Failed to connect to EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Failed to connect to STRING server


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID Alignment Graph Length Definition E-value
Query 227
pfam07466 280 pfam07466, DUF1517, Protein of unknown function (D 5e-20
COG4371 334 COG4371, COG4371, Predicted membrane protein [Func 7e-07
>gnl|CDD|219420 pfam07466, DUF1517, Protein of unknown function (DUF1517)
 Score = 85.5 bits (212), Expect = 5e-20
 Identities = 44/110 (40%), Positives = 52/110 (47%), Gaps = 13/110 (11%)

Query: 117 AASGGRMGGRSFSSHSSSSSSRTYMVEPRVGFSASAPYYAPSPFGGAGGGIYVGPAVGVG 176
           AASGGR+GG SF + S SSSS      PR        YY     GG  G  ++ P  G G
Sbjct: 2   AASGGRIGGGSFRAPSRSSSS------PRSSSPGGGGYY--GSPGGGFGFPFLIPFFGFG 53

Query: 177 VGAGSSLFLILMGFAAFVLVSGFLSDRS----DGSVLTATDKTSVIKLQV 222
            G G    LILM   A VLV+ F S         S   +  K SV++LQV
Sbjct: 54  GGGGLFGLLILMA-IAGVLVNAFRSAGGGGGGLSSAGRSNGKVSVVQLQV 102


This family consists of several hypothetical glycine-rich plant and bacterial proteins of around 300 residues in length. The function of this family is unknown. Length = 280

>gnl|CDD|226808 COG4371, COG4371, Predicted membrane protein [Function unknown]

Conserved Domains Detected by HHsearch

ID Alignment Graph Length Definition Probability
Query 227
COG4371 334 Predicted membrane protein [Function unknown] 99.88
PF07466 289 DUF1517: Protein of unknown function (DUF1517); In 99.85
COG4371 334 Predicted membrane protein [Function unknown] 82.24
PF07466 289 DUF1517: Protein of unknown function (DUF1517); In 80.47
>COG4371 Predicted membrane protein [Function unknown]
Probab=99.88  E-value=5e-22  Score=179.85  Aligned_cols=101  Identities=35%  Similarity=0.427  Sum_probs=75.3

Q ss_pred             hhhhh-cCCcccCCCCCCCCCCCCCCccc--cCCCCCCCCCCCCCCCCCCCCCCCce---eecccccccccC-chhHHHH
Q 027132          114 LAFAA-SGGRMGGRSFSSHSSSSSSRTYM--VEPRVGFSASAPYYAPSPFGGAGGGI---YVGPAVGVGVGA-GSSLFLI  186 (227)
Q Consensus       114 ~A~AA-SGGRiGGgSFrS~SssSspRSYs--~p~~~~ygYg~~gY~~SPfgy~GGGf---Fl~P~fGfGgGg-G~~~fli  186 (227)
                      .|+|| |||||||+|||+||..  +|+|+  .|+.++|+  +++|     +  ||||   |+||.+|+|+|+ |+|.+|+
T Consensus        43 ~a~aarSGGriGGgSfraps~~--sr~YS~~gpsGGgY~--gg~Y-----~--GGGfgfPfiip~~G~GGGfgGiFgilv  111 (334)
T COG4371          43 VAAAARSGGRIGGGSFRAPSGY--SRGYSGGGPSGGGYS--GGGY-----S--GGGFGFPFIIPGGGGGGGFGGIFGILV  111 (334)
T ss_pred             HHHHHhhCCCccCCCCCCCCCC--CCCcCCCCCCCCCCC--CCCC-----C--CCCcCcCeEeccCCcCCccccHHHHHH
Confidence            35555 9999999999999742  48998  35554433  2234     3  6776   999999998874 8899999


Q ss_pred             HHHHHHHHhhhhhcccCC-----CCCcccCCCceEEEEEeeeccc
Q 027132          187 LMGFAAFVLVSGFLSDRS-----DGSVLTATDKTSVIKLQVLDLR  226 (227)
Q Consensus       187 Li~iag~v~V~~F~~~~~-----~g~~~~~~~~VSV~kLQVGLLa  226 (227)
                      |++|+. ++|..||..-+     +.....++|+|.+.++||||||
T Consensus       112 f~aian-~vv~~~Rr~~ssGe~~g~~~~~S~ptv~~~~vQvgLLA  155 (334)
T COG4371         112 FGAIAN-GVVGMMRRNLSSGEARGLSSLGSSPTVQAVSVQVGLLA  155 (334)
T ss_pred             HHHHHH-HHHHHHHhcCCCCCcCCccccCCCCceeEEeeeehhhh
Confidence            999988 58899986311     1133346899999999999997
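
The HHsearch summary line that heads the alignment above is a sequence of `key=value` tokens, so it can be read into a dictionary for filtering or tabulation. This is a minimal sketch, assuming all values are numeric apart from a trailing percent sign; `parse_hhsearch_summary` is a hypothetical helper rather than part of HHsearch.

```python
def parse_hhsearch_summary(line):
    """Parse 'Probab=99.88  E-value=5e-22 ...' into a dict of floats."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        # Identities is printed as a percentage, e.g. "35%".
        fields[key] = float(value.rstrip("%"))
    return fields


summary = parse_hhsearch_summary(
    "Probab=99.88  E-value=5e-22  Score=179.85  Aligned_cols=101  "
    "Identities=35%  Similarity=0.427  Sum_probs=75.3"
)
print(summary["Probab"], summary["Aligned_cols"])
```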



>PF07466 DUF1517: Protein of unknown function (DUF1517); InterPro: IPR010903 This family consists of several hypothetical glycine-rich plant and bacterial proteins of around 300 residues in length
>COG4371 Predicted membrane protein [Function unknown]
>PF07466 DUF1517: Protein of unknown function (DUF1517); InterPro: IPR010903 This family consists of several hypothetical glycine-rich plant and bacterial proteins of around 300 residues in length

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

No hit with probability above 80.00