Citrus sinensis ID: 034242


Local Sequence Feature Prediction

Prediction and (Method)Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100
MQQLKICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE
cccHHHHHHHHHHHHHcccccEEEEccccccccccHHHHHHHHHHHHHHHHcccccccccHHHHccccHHHHHHHHHHccEEEEEEccccEEEEEEEEEc
ccEEEEEHHHHHHHHHcccccEEEEccccccHcccHHHHHHHHHHHHHHcccccccccccHHHHccccHHHHHHHHHHHccEEEccccccEEEEEEEEcc
MQQLKICATRRRRMAYSRTETYvllepgveekfVTEEELKARLKYWLENWagqvgkgglppdlakFATIDEAVAFLITNVCELelqgdvgsiQWYEVRLE
mqqlkicatrrrrmaysrtetyvllepgveekfvTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELelqgdvgsiqwYEVRLE
MQQLKICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE
*****ICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEV***
********TRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE
MQQLKICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE
MQQLKICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MQQLKICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE
no confident homologs detected
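
The per-residue tracks above are easier to compare as residue ranges. The following is a minimal sketch, not part of the original report, that run-length encodes a track into 1-based segments. The function name `segments` is hypothetical, and the input is the PSIPRED track copied from above (c = coil, H = helix, E = strand).

```python
from itertools import groupby

# PSIPRED track copied from the report (c = coil, H = helix, E = strand).
PSIPRED = (
    "cccHHHHHHHHHHHHHcccccEEEEccccccccccHHHHHHHHHHHHHHHHcccccccccHHHH"
    "ccccHHHHHHHHHHccEEEEEEccccEEEEEEEEEc"
)


def segments(track):
    """Run-length encode a per-residue track.

    Returns a list of (state, start, end) tuples with 1-based,
    inclusive residue numbers.
    """
    out = []
    pos = 1
    for state, run in groupby(track):
        n = len(list(run))
        out.append((state, pos, pos + n - 1))
        pos += n
    return out


# Print only the helix and strand segments.
for state, start, end in segments(PSIPRED):
    if state in ("H", "E"):
        print(f"{state} {start}-{end}")
```

The same function works on any of the other tracks, for example to list the `h` (helix) runs in the MEMSAT line.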

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

No hits with e-value below 0.001 by BLAST

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      100
359488522  151     PREDICTED: uncharacterized protein LOC10  0.92     0.609    0.690     2e-30
297805830  157     hypothetical protein ARALYDRAFT_916402 [  0.94     0.598    0.602     2e-29
270342082  152     hypothetical protein [Phaseolus vulgaris  0.92     0.605    0.666     1e-28
388494090  151     unknown [Lotus japonicus]                 0.91     0.602    0.687     5e-28
21553859   156     unknown [Arabidopsis thaliana]            0.93     0.596    0.602     1e-27
18421842   156     chlororespiratory reduction 7 [Arabidops  0.93     0.596    0.602     2e-27
356573538  152     PREDICTED: uncharacterized protein LOC10  0.92     0.605    0.677     2e-26
118489214  162     unknown [Populus trichocarpa x Populus d  0.94     0.580    0.606     5e-26
224080113  99      predicted protein [Populus trichocarpa]   0.94     0.949    0.606     1e-25
326494344  151     predicted protein [Hordeum vulgare subsp  0.89     0.589    0.595     2e-25
>gi|359488522|ref|XP_002280011.2| PREDICTED: uncharacterized protein LOC100249838 [Vitis vinifera] gi|296082204|emb|CBI21209.3| unnamed protein product [Vitis vinifera]
 Score =  136 bits (342), Expect = 2e-30,   Method: Compositional matrix adjust.
 Identities = 67/97 (69%), Positives = 80/97 (82%), Gaps = 5/97 (5%)

Query: 4   LKICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDL 63
           +K  A RRRR AYS+TETYVL+EPG +E+FV++EELKARLK WLENW G+     LPPDL
Sbjct: 60  VKSYAMRRRR-AYSQTETYVLMEPGKDEEFVSQEELKARLKGWLENWPGK----ALPPDL 114

Query: 64  AKFATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE 100
           AKF TID+AV +L+  VCELE+ GDVGSIQWYE+RLE
Sbjct: 115 AKFQTIDDAVMYLVKAVCELEIDGDVGSIQWYEIRLE 151
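
The Q cover, H cover, and Identity values reported for this hit can be reproduced from the alignment statistics. The sketch below is an inference from the numbers shown, not a documented definition: it assumes coverage counts alignment columns without a gap, divided by the query or hit length, and that values are truncated rather than rounded to three decimals. The helper name `truncated_ratio` is hypothetical.

```python
def truncated_ratio(numerator, denominator, digits=3):
    """Return numerator/denominator truncated (not rounded) to `digits` decimals."""
    scale = 10 ** digits
    return (numerator * scale // denominator) / scale


# Values read from the Vitis vinifera alignment and the hit table.
alignment_length = 97  # columns, from "Identities = 67/97"
identities = 67
gaps = 5
query_length = 100
hit_length = 151

# Assumption: coverage counts only columns where both sequences have a residue.
aligned_pairs = alignment_length - gaps

identity = truncated_ratio(identities, alignment_length)
q_cover = truncated_ratio(aligned_pairs, query_length)
h_cover = truncated_ratio(aligned_pairs, hit_length)
print(identity, q_cover, h_cover)  # prints 0.69 0.92 0.609
```

Under the same assumptions, the CRR7 hit in the GO section (95 columns, 5 gaps, hit length 156) gives an H cover of 0.576, which also agrees with the value reported there.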




Source: Vitis vinifera
Species: Vitis vinifera
Genus: Vitis
Family: Vitaceae
Order: Vitales
Class:
Phylum: Streptophyta
Superkingdom: Eukaryota

>gi|297805830|ref|XP_002870799.1| hypothetical protein ARALYDRAFT_916402 [Arabidopsis lyrata subsp. lyrata] gi|297316635|gb|EFH47058.1| hypothetical protein ARALYDRAFT_916402 [Arabidopsis lyrata subsp. lyrata]
>gi|270342082|gb|ACZ74666.1| hypothetical protein [Phaseolus vulgaris]
>gi|388494090|gb|AFK35111.1| unknown [Lotus japonicus]
>gi|21553859|gb|AAM62952.1| unknown [Arabidopsis thaliana]
>gi|18421842|ref|NP_568563.1| chlororespiratory reduction 7 [Arabidopsis thaliana] gi|9758849|dbj|BAB09375.1| unnamed protein product [Arabidopsis thaliana] gi|17529164|gb|AAL38808.1| unknown protein [Arabidopsis thaliana] gi|21281014|gb|AAM45076.1| unknown protein [Arabidopsis thaliana] gi|332007025|gb|AED94408.1| chlororespiratory reduction 7 [Arabidopsis thaliana]
>gi|356573538|ref|XP_003554915.1| PREDICTED: uncharacterized protein LOC100801516 [Glycine max]
>gi|118489214|gb|ABK96413.1| unknown [Populus trichocarpa x Populus deltoides]
>gi|224080113|ref|XP_002306021.1| predicted protein [Populus trichocarpa] gi|222848985|gb|EEE86532.1| predicted protein [Populus trichocarpa]
>gi|326494344|dbj|BAJ90441.1| predicted protein [Hordeum vulgare subsp. vulgare] gi|326520968|dbj|BAJ92847.1| predicted protein [Hordeum vulgare subsp. vulgare]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                  Length  Definition                     Q cover  H cover  Identity  E-value
Query               100
TAIR|locus:2157250  156     CRR7 "AT5G39210" [Arabidopsis  0.9      0.576    0.621     1.3e-27
TAIR|locus:2157250 CRR7 "AT5G39210" [Arabidopsis thaliana (taxid:3702)]
 Score = 309 (113.8 bits), Expect = 1.3e-27, P = 1.3e-27
 Identities = 59/95 (62%), Positives = 76/95 (80%)

Query:     6 ICATRRRRMAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAK 65
             +CATRRRR+ +S ++TYVLLE G +E+FVTE+ELKA+L+ WLENW        LPPDLA+
Sbjct:    67 VCATRRRRV-HSNSDTYVLLEAGQDEQFVTEDELKAKLRGWLENWP----VNSLPPDLAR 121

Query:    66 FATIDEAVAFLITNVCELELQGDVGSIQWYEVRLE 100
             F  +DEAV FL+  VCELE+ G+VGS+QWY+VRLE
Sbjct:   122 FDDLDEAVDFLVKAVCELEIDGEVGSVQWYQVRLE 156


Parameters:
  V=100
  filter=SEG
  E=0.001

  ctxfactor=1.00

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   +0      0   BLOSUM62        0.319   0.136   0.410    same    same    same
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S W   T  X   E2     S2
   +0      0      100       100   0.00091  102 3  11 22  0.40    30
                                                     29  0.47    31


Statistics:

  Database:  /share/blast/go-seqdb.fasta
   Title:  go_20130330-seqdb.fasta
   Posted:  5:47:42 AM PDT Apr 1, 2013
   Created:  5:47:42 AM PDT Apr 1, 2013
   Format:  XDF-1
   # of letters in database:  169,044,731
   # of sequences in database:  368,745
   # of database sequences satisfying E:  1
  No. of states in DFA:  588 (63 KB)
  Total size of DFA:  126 KB (2079 KB)
  Time to generate neighborhood:  0.00u 0.00s 0.00t   Elapsed:  00:00:00
  No. of threads or processors used:  24
  Search cpu time:  10.56u 0.10s 10.66t   Elapsed:  00:00:00
  Total cpu time:  10.56u 0.10s 10.66t   Elapsed:  00:00:00
  Start:  Mon May 20 15:46:19 2013   End:  Mon May 20 15:46:19 2013


GO:0003674 "molecular_function" evidence=ND
GO:0005634 "nucleus" evidence=ISM
GO:0009507 "chloroplast" evidence=IDA
GO:0010275 "NAD(P)H dehydrogenase complex assembly" evidence=IGI;IMP
GO:0010598 "NAD(P)H dehydrogenase complex (plastoquinone)" evidence=IGI
GO:0016020 "membrane" evidence=IDA
GO:0009570 "chloroplast stroma" evidence=IDA
GO:0000023 "maltose metabolic process" evidence=RCA
GO:0010207 "photosystem II assembly" evidence=RCA
GO:0016556 "mRNA modification" evidence=RCA
GO:0019252 "starch biosynthetic process" evidence=RCA
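
The GO terms above carry different kinds of evidence, so it can help to separate experimentally supported terms from computational ones before transferring annotation. The following is a minimal sketch, assuming the line layout shown above. `parse_go` and `EXPERIMENTAL` are hypothetical names, and the set lists the Gene Ontology experimental evidence codes.

```python
import re

# GO term lines copied from the report.
GO_LINES = '''\
GO:0003674 "molecular_function" evidence=ND
GO:0005634 "nucleus" evidence=ISM
GO:0009507 "chloroplast" evidence=IDA
GO:0010275 "NAD(P)H dehydrogenase complex assembly" evidence=IGI;IMP
GO:0010598 "NAD(P)H dehydrogenase complex (plastoquinone)" evidence=IGI
GO:0016020 "membrane" evidence=IDA
GO:0009570 "chloroplast stroma" evidence=IDA
GO:0000023 "maltose metabolic process" evidence=RCA
GO:0010207 "photosystem II assembly" evidence=RCA
GO:0016556 "mRNA modification" evidence=RCA
GO:0019252 "starch biosynthetic process" evidence=RCA
'''

# Experimental evidence codes (e.g. IDA = inferred from direct assay).
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

LINE_RE = re.compile(r'^(GO:\d{7}) "(.+)" evidence=(\S+)$')


def parse_go(text):
    """Parse lines into (go_id, name, [evidence codes]) tuples."""
    terms = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            go_id, name, codes = match.groups()
            terms.append((go_id, name, codes.split(";")))
    return terms


terms = parse_go(GO_LINES)
# Keep terms with at least one experimental evidence code.
experimental = [t for t in terms if EXPERIMENTAL & set(t[2])]
for go_id, name, codes in experimental:
    print(go_id, name, ",".join(codes))
```

With this input, the experimentally supported terms are the chloroplast, chloroplast stroma, membrane, and NAD(P)H dehydrogenase complex entries. The RCA, ISM, and ND terms are filtered out.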

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer in SWISS-PROT detected by BLAST

EC Number Prediction by EzyPred Server

Failed to connect to EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input:
GSVIVG00038032001  SubName: Full=Chromosome chr14 scaffold_9, whole genome shotgun sequence; (158 aa) (Vitis vinifera)

Predicted Functional Partners:
Partner            Score  Description
GSVIVG00023910001  0.702  SubName: Full=Chromosome chr6 scaffold_3, whole genome shotgun sequence; (216 aa)
ndhM               0.616  SubName: Full=Chromosome chr18 scaffold_1, whole genome shotgun sequence;; NDH-1 shuttles elect [...] (208 aa)
PsbP2              0.604  SubName: Full=Chromosome chr13 scaffold_17, whole genome shotgun sequence; (238 aa)
GSVIVG00028499001  0.600  RecName: Full=Photosystem II reaction center psb28 protein; (179 aa)
GSVIVG00028771001  0.581  SubName: Full=Chromosome chr13 scaffold_45, whole genome shotgun sequence; (378 aa)
GSVIVG00018549001  0.575  SubName: Full=Chromosome chr13 scaffold_17, whole genome shotgun sequence; (235 aa)
GSVIVG00005655001  0.568  SubName: Full=Putative uncharacterized protein (Chromosome undetermined scaffold_155, whole gen [...] (254 aa)
GSVIVG00023239001  0.550  SubName: Full=Chromosome chr8 scaffold_29, whole genome shotgun sequence; (219 aa)
GSVIVG00038203001  0.537  SubName: Full=Chromosome chr14 scaffold_9, whole genome shotgun sequence; (175 aa)
GSVIVG00030316001  0.537  SubName: Full=Chromosome chr1 scaffold_5, whole genome shotgun sequence; (102 aa)
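
To put the partner scores in context, they can be mapped to the named confidence cutoffs that STRING uses (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). This is a rough sketch, assuming the numbers listed above are STRING combined confidence scores. `confidence_band` is a hypothetical helper.

```python
# (partner, score) pairs copied from the list above.
PARTNERS = [
    ("GSVIVG00023910001", 0.702),
    ("ndhM", 0.616),
    ("PsbP2", 0.604),
    ("GSVIVG00028499001", 0.600),
    ("GSVIVG00028771001", 0.581),
    ("GSVIVG00018549001", 0.575),
    ("GSVIVG00005655001", 0.568),
    ("GSVIVG00023239001", 0.550),
    ("GSVIVG00038203001", 0.537),
    ("GSVIVG00030316001", 0.537),
]


def confidence_band(score):
    """Map a score to STRING's named confidence cutoffs."""
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below cutoff"


bands = {name: confidence_band(score) for name, score in PARTNERS}
high = [name for name, band in bands.items() if band in ("high", "highest")]
print(high)  # prints ['GSVIVG00023910001']
```

Only the top partner clears the high-confidence cutoff. The remaining nine, including ndhM and PsbP2, fall in the medium band.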

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition                                          E-value
Query      100
pfam12095  83      pfam12095, DUF3571, Protein of unknown function (D  2e-22
>gnl|CDD|192936 pfam12095, DUF3571, Protein of unknown function (DUF3571)
 Score = 82.7 bits (205), Expect = 2e-22
 Identities = 36/81 (44%), Positives = 46/81 (56%), Gaps = 7/81 (8%)

Query: 20  ETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITN 79
           + YV+LEPG  E+F+T  EL A LK WL           LP DLAK  ++D     L+  
Sbjct: 10  DHYVVLEPGQPEQFLTPAELLAWLKEWLTR------LDSLPADLAKLPSLDAQAQRLLDT 63

Query: 80  VCELELQGDVGSIQWYEVRLE 100
            CELE+ G   ++QWY VRLE
Sbjct: 64  ACELEI-GPGLTLQWYAVRLE 83


This family of proteins is functionally uncharacterized. It is found in bacteria and eukaryotes. Proteins in this family are typically between 85 and 97 amino acids in length.
Length = 83

Conserved Domains Detected by HHsearch

ID        Length  Definition                                          Probability
Query     100
PF12095   83      DUF3571: Protein of unknown function (DUF3571); In  100.0
PHA03373  247     tegument protein; Provisional                       84.06
>PF12095 DUF3571: Protein of unknown function (DUF3571); InterPro: IPR021954 This family of proteins is functionally uncharacterised
Probab=100.00  E-value=4.9e-45  Score=249.35  Aligned_cols=79  Identities=51%  Similarity=0.993  Sum_probs=58.1

Q ss_pred             hhhccCCcEEEecCCCCccccCHHHHHHHHHHHHHhhcccCCCCCCChhhhccCCHHHHHHHHHHccccccccCCceeEE
Q 034242           14 MAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQ   93 (100)
Q Consensus        14 ~my~q~D~YVvLEp~~~EqfLT~~Ell~~Lk~~L~~~~~~~~~~~LP~dL~k~~s~~~qaq~Lldt~CELEi~pg~~~lQ   93 (100)
                      -|| ++|||||||||+||||||++||++|||+||++.      ++||+||+||+|+++||+||++|+|||||||| +|||
T Consensus         5 lm~-~~d~yVvLEp~~~Eqflt~~Ell~~Lk~~L~~~------~~LP~dL~~~~s~~~qa~~Lldt~CeLeigpg-~~lQ   76 (83)
T PF12095_consen    5 LMY-QEDHYVVLEPGQPEQFLTPEELLEKLKEWLQNQ------DDLPPDLAKFSSVEEQAQYLLDTACELEIGPG-GYLQ   76 (83)
T ss_dssp             -S------EEEEESSS-SEEE-HHHHHHHHHHHHHHT------TTS-HHHHH---HHHHHHHHHHH---EEEETT-EEEE
T ss_pred             hhh-ccCCEEEecCCCCcccCCHHHHHHHHHHHHHcC------CCCCHHHHhCCCHHHHHHHHHHhceeeecCCC-CEEE
Confidence            488 999999999999999999999999999999994      48999999999999999999999999999999 6999


Q ss_pred             EEEEeeC
Q 034242           94 WYEVRLE  100 (100)
Q Consensus        94 WYaVRLE  100 (100)
                      |||||||
T Consensus        77 WyaVRLE   83 (83)
T PF12095_consen   77 WYAVRLE   83 (83)
T ss_dssp             EEE----
T ss_pred             EEEEecC
Confidence            9999998



This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 85 and 97 amino acids in length. PDB: 2KRX_A.
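
The HHsearch summary line that precedes the PF12095 alignment is a run of `key=value` fields, which makes it easy to read programmatically. The following is a minimal sketch, assuming that layout. `parse_hh_summary` is a hypothetical helper, not part of any HHsearch tooling.

```python
import re

# Summary line copied from the PF12095 hit above.
SUMMARY = (
    "Probab=100.00  E-value=4.9e-45  Score=249.35  Aligned_cols=79  "
    "Identities=51%  Similarity=0.993  Sum_probs=58.1"
)


def parse_hh_summary(line):
    """Parse 'key=value' fields into a dict of floats.

    A trailing percent sign (as in Identities=51%) is dropped.
    """
    fields = {}
    for key, value in re.findall(r"([\w-]+)=(\S+)", line):
        fields[key] = float(value.rstrip("%"))
    return fields


fields = parse_hh_summary(SUMMARY)
print(fields["Probab"], fields["E-value"], fields["Aligned_cols"])
```

The same function can be applied to the summary line of the 2krx_A structure hit, since it uses the same fields.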

>PHA03373 tegument protein; Provisional

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                          E-value
Query   100
2krx_A  94      Solution Nmr Structure Of Asl3597 From Nostoc Sp. P  5e-05
>pdb|2KRX|A Chain A, Solution Nmr Structure Of Asl3597 From Nostoc Sp. Pcc7120. Northeast Structural Genomics Consortium Target Id Nsr244 Length = 94

Iteration: 1

 Score = 42.7 bits (99), Expect = 5e-05,   Method: Compositional matrix adjust.
 Identities = 31/86 (36%), Positives = 45/86 (52%), Gaps = 11/86 (12%)

Query: 18  RTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLI 77
           + + +V+LE    E+F+T  EL  +LK  LE    ++    LP +L K  ++      LI
Sbjct: 8   QQDNFVVLETNQPEQFLTTIELLEKLKGELE----KISFSDLPLELQKLDSLPAQAQHLI 63

Query: 78  TNVCELELQGDVGS---IQWYEVRLE 100
              CEL    DVG+   +QWY VRLE
Sbjct: 64  DTSCEL----DVGAGKYLQWYAVRLE 85

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   100
2krx_A  94      ASL3597 protein; structural genomics, PSI-2, prote  1e-28
>2krx_A ASL3597 protein; structural genomics, PSI-2, protein structure initiative, no structural genomics consortium, NESG, unknown function; NMR {Nostoc SP} Length = 94
 Score = 98.3 bits (245), Expect = 1e-28
 Identities = 29/81 (35%), Positives = 42/81 (51%), Gaps = 5/81 (6%)

Query: 20  ETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITN 79
           + +V+LE    E+F+T  EL  +LK  LE  +       LP +L K  ++      LI  
Sbjct: 10  DNFVVLETNQPEQFLTTIELLEKLKGELEKISFSD----LPLELQKLDSLPAQAQHLIDT 65

Query: 80  VCELELQGDVGSIQWYEVRLE 100
            CEL++ G    +QWY VRLE
Sbjct: 66  SCELDV-GAGKYLQWYAVRLE 85


Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   100
2krx_A  94      ASL3597 protein; structural genomics, PSI-2, prote  100.0
>2krx_A ASL3597 protein; structural genomics, PSI-2, protein structure initiative, no structural genomics consortium, NESG, unknown function; NMR {Nostoc SP}
Probab=100.00  E-value=1.1e-45  Score=256.48  Aligned_cols=81  Identities=36%  Similarity=0.613  Sum_probs=78.2

Q ss_pred             hhhccCCcEEEecCCCCccccCHHHHHHHHHHHHHhhcccCCCCCCChhhhccCCHHHHHHHHHHccccccccCCceeEE
Q 034242           14 MAYSRTETYVLLEPGVEEKFVTEEELKARLKYWLENWAGQVGKGGLPPDLAKFATIDEAVAFLITNVCELELQGDVGSIQ   93 (100)
Q Consensus        14 ~my~q~D~YVvLEp~~~EqfLT~~Ell~~Lk~~L~~~~~~~~~~~LP~dL~k~~s~~~qaq~Lldt~CELEi~pg~~~lQ   93 (100)
                      -|| |+|||||||||+||||||++||++|||+||+++|+    ++||+||+||+|+++||+||+||+|||||+||+ |||
T Consensus         5 lmy-q~D~yVvLEp~~~EqfLT~~Ell~~Lk~~L~~~~~----~~LP~dL~~~~s~~~qaq~Lldt~CELei~pG~-~lQ   78 (94)
T 2krx_A            5 LMY-QQDNFVVLETNQPEQFLTTIELLEKLKGELEKISF----SDLPLELQKLDSLPAQAQHLIDTSCELDVGAGK-YLQ   78 (94)
T ss_dssp             CSC-CCCCEEEEESSSCSEEECHHHHHHHHHHHHHHSCT----TTSCHHHHHCCCHHHHHHHHHHHCCCEEEETTE-EEE
T ss_pred             hhc-ccCCEEEecCCCCcccCCHHHHHHHHHHHHHhCcc----ccCCHHHHhCCCHHHHHHHHHHheeeeeeCCCC-EEE
Confidence            389 99999999999999999999999999999999995    499999999999999999999999999999995 999


Q ss_pred             EEEEeeC
Q 034242           94 WYEVRLE  100 (100)
Q Consensus        94 WYaVRLE  100 (100)
                      |||||||
T Consensus        79 WYaVRLE   85 (94)
T 2krx_A           79 WYAVRLE   85 (94)
T ss_dssp             EEECCCC
T ss_pred             EEEEEEe
Confidence            9999998




Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

No hit with probability above 80.00