Citrus Sinensis ID: 030195


Local Sequence Feature Prediction

Prediction (Method) and Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-
MAATAGSIFAASSQFASATRSVSCRNATPSVSSTLGSSFKGASLPRSSPKQKKTVRITGKVTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTDLYSLS
cccccccccccccccccccEEccccccccccccccccccccccccccccccccEEEEccEEEEEEEEccccHHHHccccccccHHHccccccEEEEEcccccccccccEEEEEccccccccccccEEEEEcccccccccccccHHHHHHHHHcccccccEEEEEEEcccEEEEEEcccccc
ccccccHHHccccccccccccccccccccccccccccccccccccccccccccEEEEccEEEEEEEEEcccHHHHHHcccccHHHEEcccccEEEEccccccccccccEEEEEcccHHccccccHcEEEEcccccccccccccHHHHHHHHcccccccEEEEEEEccccEEEEEccccccc
MAATAGSIFAASSQFasatrsvscrnatpsvsstlgssfkgaslprsspkqkktvRITGKVTAAVttatnpyeeieeytrpswamfelgkapvywktmnglppmsgeklkifynpyakkllpnedfgigfnggfnqpfmcggepramlrknrgqndspfytiqicvpkhgmyyftdlysls
MAATAGSIFAASSQFASATRSVSCRNAtpsvsstlgssfkgaslprsspkqkktvritgkvtaavttatnpyeeieeytrpswaMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTDLYSLS
MAATagsifaassqfasatrsVSCRNATPSVSSTLGSSFKGASLPRSSPKQKKTVRITGKvtaavttatNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTDLYSLS
*******************************************************RITGKVTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPR***********SPFYTIQICVPKHGMYYFTDLY***
**********ASSQF**************SVSSTLGSSFKGASLP**********RITGKVTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEP*********QNDSPFYTIQICVPKHGMYYFTDLYSLS
*********************************************************TGKVTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTDLYSLS
*************************************SFKGASLPRSSPKQKKTVRITGKVTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTDLYS**
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooo
SSSSSSSSSSSSSSSSSSooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
SSSSSSSSSSSSSSSSSSSoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MAATAGSIFAASSQFASATRSVSCRNATPSVSSTLGSSFKGASLPRSSPKQKKTVRITGKVTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTDLYSLS
no confident homologs detected
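
Each track above is a per-residue string aligned to the residue number marker, so a contiguous run of one symbol corresponds to one predicted segment. A minimal Python sketch of that reading, assuming a track has been copied into a string; the helper name `runs` and the short demo string are illustrative and not part of this report:

```python
from itertools import groupby

def runs(track, symbols=None):
    """Return (symbol, start, end) for each run of identical characters.

    Positions are 1-based and inclusive, matching the residue number marker.
    If `symbols` is given, only runs of those characters are reported.
    """
    out = []
    pos = 1
    for sym, group in groupby(track):
        length = sum(1 for _ in group)
        if symbols is None or sym in symbols:
            out.append((sym, pos, pos + length - 1))
        pos += length
    return out

# Hypothetical short track (not taken from the report):
demo = "cccEEEEccHHHc"
print(runs(demo, symbols="EH"))  # [('E', 4, 7), ('H', 10, 12)]
```

Passing the PSIPRED or SSPRO row with `symbols="EH"` would list the predicted strands and helices; passing a membrane-topology row with `symbols="h"` would list predicted transmembrane spans.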

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

No hits with e-value below 0.001 by BLAST

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      181
225442943  282     PREDICTED: uncharacterized protein LOC10  0.966    0.620    0.659     5e-61
255553199  279     conserved hypothetical protein [Ricinus   0.950    0.616    0.651     9e-60
224054336  286     predicted protein [Populus trichocarpa]   0.972    0.615    0.625     4e-56
114149950  277     chloroplast post-illumination chlorophyl  0.850    0.555    0.645     1e-54
217072944  254     unknown [Medicago truncatula]             0.906    0.645    0.618     8e-53
357468177  280     hypothetical protein MTR_4g010140 [Medic  0.906    0.585    0.618     9e-53
357468179  209     hypothetical protein MTR_4g010140 [Medic  0.906    0.784    0.618     1e-52
388490544  182     unknown [Medicago truncatula]             0.906    0.901    0.612     7e-52
359807532  277     uncharacterized protein LOC100790141 [Gl  0.900    0.588    0.596     1e-51
217072380  240     unknown [Medicago truncatula]             0.906    0.683    0.612     2e-51
>gi|225442943|ref|XP_002266165.1| PREDICTED: uncharacterized protein LOC100264489 isoform 1 [Vitis vinifera] gi|225442945|ref|XP_002266207.1| PREDICTED: uncharacterized protein LOC100264489 isoform 2 [Vitis vinifera] gi|297743467|emb|CBI36334.3| unnamed protein product [Vitis vinifera]
 Score =  239 bits (609), Expect = 5e-61,   Method: Compositional matrix adjust.
 Identities = 118/179 (65%), Positives = 136/179 (75%), Gaps = 4/179 (2%)

Query: 1   MAATAGSIFAASSQFASATRSVSCRNATPSVSSTLGSSFKGASLPRSSPKQKKTVRITGK 60
           MAAT GSIFA+S+Q      SVS  N +PSV+  L S+F GASL    PK  K V++TGK
Sbjct: 1   MAATTGSIFASSTQRFPTVTSVSGTNGSPSVAGRLASNFMGASLRSRLPKMGKVVKVTGK 60

Query: 61  VTAAVTTATNPYEEIEEYTRPSWAMFELGKAPVYWKTMNGLPPMSGEKLKIFYNPYAKKL 120
           V+AA   AT P EE +E   PSWAMFELG+APVYWKTMNGLPP SGEKLK+FYNP A KL
Sbjct: 61  VSAAAV-ATTPVEETKEVKLPSWAMFELGRAPVYWKTMNGLPPSSGEKLKLFYNPVASKL 119

Query: 121 LPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHG---MYYFTD 176
           +PNEDFGIGFNGGFNQP MCGGEPRAML+K RG+ D P YTIQIC+PKH    ++ FT+
Sbjct: 120 VPNEDFGIGFNGGFNQPIMCGGEPRAMLKKARGKADRPIYTIQICIPKHAVNLIFSFTN 178
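
The summary-table values for this hit appear to follow from the alignment statistics above: 179 aligned columns with 4 gaps leave 175 residue pairs, and 175/181 and 175/282 truncate to the listed Q cover (0.966) and H cover (0.620), while 118/179 truncates to the listed identity (0.659). This is an inference from the numbers, not a documented definition; the helper name and the truncate-to-three-decimals convention are assumptions.

```python
def per_mille(numerator, denominator):
    """Truncated (not rounded) ratio in thousandths, using integer arithmetic."""
    return (numerator * 1000) // denominator

columns, gaps, identities = 179, 4, 118   # from the Identities/Gaps line
query_len, hit_len = 181, 282             # query and hit lengths from the table

pairs = columns - gaps                    # aligned residue pairs (assumed definition)
q_cover = per_mille(pairs, query_len)
h_cover = per_mille(pairs, hit_len)
identity = per_mille(identities, columns)

print(q_cover, h_cover, identity)  # 966 620 659
```

Working in integer thousandths avoids floating-point surprises when comparing against the three-decimal values printed in the table.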




Source: Vitis vinifera

Species: Vitis vinifera

Genus: Vitis

Family: Vitaceae

Order: Vitales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|255553199|ref|XP_002517642.1| conserved hypothetical protein [Ricinus communis] gi|223543274|gb|EEF44806.1| conserved hypothetical protein [Ricinus communis]
>gi|224054336|ref|XP_002298209.1| predicted protein [Populus trichocarpa] gi|222845467|gb|EEE83014.1| predicted protein [Populus trichocarpa]
>gi|114149950|gb|ABI51594.1| chloroplast post-illumination chlorophyll fluorescence increase protein [Nicotiana tabacum]
>gi|217072944|gb|ACJ84832.1| unknown [Medicago truncatula]
>gi|357468177|ref|XP_003604373.1| hypothetical protein MTR_4g010140 [Medicago truncatula] gi|217073576|gb|ACJ85148.1| unknown [Medicago truncatula] gi|355505428|gb|AES86570.1| hypothetical protein MTR_4g010140 [Medicago truncatula] gi|388504280|gb|AFK40206.1| unknown [Medicago truncatula]
>gi|357468179|ref|XP_003604374.1| hypothetical protein MTR_4g010140 [Medicago truncatula] gi|355505429|gb|AES86571.1| hypothetical protein MTR_4g010140 [Medicago truncatula]
>gi|388490544|gb|AFK33338.1| unknown [Medicago truncatula]
>gi|359807532|ref|NP_001240893.1| uncharacterized protein LOC100790141 [Glycine max] gi|255639039|gb|ACU19820.1| unknown [Glycine max]
>gi|217072380|gb|ACJ84550.1| unknown [Medicago truncatula]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                  Length  Definition                     Q cover  H cover  Identity  E-value
Query               181
TAIR|locus:2093287  268     PIFI "AT3G15840" [Arabidopsis  0.751    0.507    0.605     5.2e-42
TAIR|locus:2093287 PIFI "AT3G15840" [Arabidopsis thaliana (taxid:3702)]
 Score = 445 (161.7 bits), Expect = 5.2e-42, P = 5.2e-42
 Identities = 86/142 (60%), Positives = 102/142 (71%)

Query:    30 SVSSTLGSSFKGASLPRSSPKQKKTVRITGKXXXXXXXXXNPYEEIEEYTRPSWAMFELG 89
             S  S L SS  G  L R  P++K  +++             P EEI+EY  PSWAMFE+G
Sbjct:    20 SKRSFLYSSRIGPIL-RRFPRKKLDLQVKAVATTLA-----PLEEIKEYKLPSWAMFEMG 73

Query:    90 KAPVYWKTMNGLPPMSGEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLR 149
              APVYWKTMNGLPP SGEKLK+FYNP A KL  NED+G+ FNGGFNQP MCGGEPRAML+
Sbjct:    74 TAPVYWKTMNGLPPTSGEKLKLFYNPAASKLTLNEDYGVAFNGGFNQPIMCGGEPRAMLK 133

Query:   150 KNRGQNDSPFYTIQICVPKHGM 171
             K+RG+ DSP YT+QIC+PKH +
Sbjct:   134 KDRGKADSPIYTMQICIPKHAV 155


Parameters:
  V=100
  filter=SEG
  E=0.001

  ctxfactor=1.00

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   +0      0   BLOSUM62        0.317   0.135   0.420    same    same    same
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S W   T  X   E2     S2
   +0      0      181       155   0.00096  105 3  11 22  0.38    32
                                                     30  0.41    34
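
The bit score reported for the PIFI hit (161.7 bits for a raw score of 445) can be reproduced from the gapped Lambda and K in the parameter table (the Q=9,R=2 row) using the standard Karlin-Altschul normalization, bits = (Lambda * S - ln K) / ln 2. A short check in Python; the function name is illustrative:

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Raw score from the "Score = 445 (161.7 bits)" line;
# Lambda and K from the Q=9,R=2 (gapped) row of the parameter table.
bits = bit_score(445, lam=0.244, k=0.0300)
print(round(bits, 1))  # 161.7
```

The reported E-value is not recomputed here, since it also depends on effective search-space corrections that the report does not fully spell out.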


Statistics:

  Database:  /share/blast/go-seqdb.fasta
   Title:  go_20130330-seqdb.fasta
   Posted:  5:47:42 AM PDT Apr 1, 2013
   Created:  5:47:42 AM PDT Apr 1, 2013
   Format:  XDF-1
   # of letters in database:  169,044,731
   # of sequences in database:  368,745
   # of database sequences satisfying E:  1
  No. of states in DFA:  598 (64 KB)
  Total size of DFA:  159 KB (2095 KB)
  Time to generate neighborhood:  0.00u 0.00s 0.00t   Elapsed:  00:00:00
  No. of threads or processors used:  24
  Search cpu time:  13.87u 0.12s 13.99t   Elapsed:  00:00:01
  Total cpu time:  13.87u 0.12s 13.99t   Elapsed:  00:00:01
  Start:  Mon May 20 16:07:06 2013   End:  Mon May 20 16:07:07 2013


GO:0003674 "molecular_function" evidence=ND
GO:0009507 "chloroplast" evidence=ISM;IDA
GO:0009579 "thylakoid" evidence=IDA
GO:0009570 "chloroplast stroma" evidence=IDA
GO:0010478 "chlororespiration" evidence=IMP
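
The GO lines above share one layout: identifier, quoted term name, and a semicolon-separated list of evidence codes. A minimal parsing sketch in Python, assuming that layout holds for every line; the regular expression, function name, and the choice of which codes count as experimental (the standard GO experimental codes EXP, IDA, IPI, IMP, IGI, IEP) are illustrative choices, not part of this report:

```python
import re

GO_LINE = re.compile(r'^(GO:\d{7})\s+"([^"]*)"\s+evidence=(\S+)$')
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}

def parse_go_line(line):
    """Split one GO annotation line into id, name, and evidence codes."""
    m = GO_LINE.match(line.strip())
    if m is None:
        return None
    go_id, name, evidence = m.groups()
    return {"id": go_id, "name": name, "evidence": evidence.split(";")}

lines = [
    'GO:0003674 "molecular_function" evidence=ND',
    'GO:0009507 "chloroplast" evidence=ISM;IDA',
    'GO:0009579 "thylakoid" evidence=IDA',
    'GO:0009570 "chloroplast stroma" evidence=IDA',
    'GO:0010478 "chlororespiration" evidence=IMP',
]

terms = [parse_go_line(line) for line in lines]
# Keep only terms backed by at least one experimental evidence code.
supported = [t["name"] for t in terms if EXPERIMENTAL & set(t["evidence"])]
print(supported)
```

Filtering this way separates the experimentally supported terms of the Arabidopsis homolog from the ND placeholder for molecular function.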

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer detected in SWISS-PROT by BLAST

EC Number Prediction by EzyPred Server

Failed to connect to EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input:
estExt_Genewise1_v1.C_LG_I0106
SubName: Full=Putative uncharacterized protein; (286 aa)
(Populus trichocarpa)
Predicted Functional Partners:
eugene3.02080025
hypothetical protein (465 aa)
Score: 0.410

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

Conserved Domains Detected by HHsearch

ID        Length  Definition                                          Probability
Query     181
PF03423   87      CBM_25: Carbohydrate binding domain (family 25); I  95.3
PLN02316  1036    synthase/transferase                                93.91
PLN02316  1036    synthase/transferase                                89.22
>PF03423 CBM_25: Carbohydrate binding domain (family 25); InterPro: IPR005085 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity.
Probab=95.30  E-value=0.019  Score=41.23  Aligned_cols=66  Identities=20%  Similarity=0.369  Sum_probs=30.5

Q ss_pred             CCceEEEEccccCCCCCCcceeeeecCCCCCceecCCchhhhhhhhhCCCCCCceEEEEeecCceeeEEEe
Q 030195          106 GEKLKIFYNPYAKKLLPNEDFGIGFNGGFNQPFMCGGEPRAMLRKNRGQNDSPFYTIQICVPKHGMYYFTD  176 (181)
Q Consensus       106 Ge~L~lfyNp~as~l~PNe~fGiaFNGGFNQPIMCGGePR~M~~k~RGkad~PiYtI~I~vPkHa~~L~f~  176 (181)
                      |+.++|||||..+.|.-..  -|=+.+|||. -. ....-.|.+... .....-+...|.||+.|..|-|-
T Consensus         1 G~~vtVyYn~~~~~l~g~~--~v~~~~G~n~-W~-~~~~~~m~~~~~-~~~~~~~~~tv~vP~~a~~~dfv   66 (87)
T PF03423_consen    1 GETVTVYYNPSLTALSGAP--NVHLHGGFNR-WT-HVPGFGMTKMCV-PDEGGWWKATVDVPEDAYVMDFV   66 (87)
T ss_dssp             -SEEEEEE---E-SSS-S---EEEEEETTS--B--SSS-EE-EEESS----TTEEEEEEE--TTTSEEEEE
T ss_pred             CCEEEEEEEeCCCCCCCCC--cEEEEecCCC-CC-cCCCCCcceeee-eecCCEEEEEEEEcCCceEEEEE
Confidence            7899999999877775222  2344444442 11 111123333221 11167899999999998766653



A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement that CBMs exist as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar-binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. The previous classification of cellulose-binding domains was based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM25 from CAZY, which has a starch-binding function, as has been demonstrated in one case. PDB: 2LAB_A 2C3X_B 2C3V_A 2C3W_C 2LAA_A.

>PLN02316 synthase/transferase
>PLN02316 synthase/transferase

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    181
d2gf3a2  104     Sarcosine oxidase {Bacillus sp., strain b0618 [Tax  85.05
>d2gf3a2 d.16.1.3 (A:218-321) Sarcosine oxidase {Bacillus sp., strain b0618 [TaxId: 1409]}
class: Alpha and beta proteins (a+b)
fold: FAD-linked reductases, C-terminal domain
superfamily: FAD-linked reductases, C-terminal domain
family: D-aminoacid oxidase-like
domain: Sarcosine oxidase
species: Bacillus sp., strain b0618 [TaxId: 1409]
Probab=85.05  E-value=0.38  Score=31.85  Aligned_cols=43  Identities=23%  Similarity=0.363  Sum_probs=30.3

Q ss_pred             CchhhhccCCcceE-EEecC----CCCCCCCCceEEEEccccCCCCCC
Q 030195           81 PSWAMFELGKAPVY-WKTMN----GLPPMSGEKLKIFYNPYAKKLLPN  123 (181)
Q Consensus        81 psWa~fElG~apVy-Wkt~n----GlpP~sGe~L~lfyNp~as~l~PN  123 (181)
                      .....|..++.||| |+..+    |+|..-|..+||.+.=......|+
T Consensus        14 ~~~~~~~~~~fP~fi~~~~~~~~YGfP~~~~~g~Ki~~~~~g~~~dPd   61 (104)
T d2gf3a2          14 DESKYSNDIDFPGFMVEVPNGIYYGFPSFGGCGLKLGYHTFGQKIDPD   61 (104)
T ss_dssp             CHHHHBGGGTCCEEEEEETTEEEEEECBSTTCCEEEEESSCCEECCTT
T ss_pred             CCccccccCCCCEEEEECCCCeEEecCCCCCCceEEEEecCCCccCcc
Confidence            34556777888888 77776    899999999999886333333443
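
The HHsearch summary lines in this report (Probab=..., E-value=..., Score=...) are whitespace-separated key=value pairs, so they are straightforward to load for filtering or tabulation. A minimal Python sketch, assuming every field is numeric apart from an optional trailing percent sign; the function name is illustrative:

```python
def parse_hh_stats(line):
    """Parse an HHsearch summary line of key=value fields into floats."""
    stats = {}
    for field in line.split():
        key, _, value = field.partition("=")
        stats[key] = float(value.rstrip("%"))
    return stats

# Summary line of the d2gf3a2 hit, copied from the report.
line = ("Probab=85.05  E-value=0.38  Score=31.85  Aligned_cols=43  "
        "Identities=23%  Similarity=0.363  Sum_probs=30.3")
stats = parse_hh_stats(line)
print(stats["Probab"], stats["Aligned_cols"], stats["Identities"])
```

With the values in a dictionary, hits can be compared directly, for example the 43 aligned columns at 85.05 probability here against the 66 columns at 95.30 for the PF03423 hit.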