Citrus sinensis ID: 030167


Local Sequence Feature Prediction

Prediction (Method) and Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180--
MKMALNCSSSTCITFSYNKIRNPTNTSSLTLNTGFFHLAQKSPFRVSAATSPAKARFVARRKESVRVRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFRSSMINRISCDSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
cccccccccccEEEEEEcccccccccccccccccccccccccccEEEcccccccEEEEEEEEEEEEEEccccHHHHHHcccccccccccHHHHEEccccEEEEEEEEEEEEccccccccccEEccccEEEEEEEEcccccccccccccccHHHHHHHHHHHHHHHHHHHHHHHHcEEcEEEc
ccEEEEcccccEEEEEcccccccccccEEEEccccccccccccEEEEcccccccEEEEEEccccEEEHHcccHHHHHHccccccEEEEcccEEEEcccccEEEEEEEEEEEcccccccccEEEcccEEEEEEEEcccccccccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEEEEEEEEc
mkmalncssstcitfsynkirnptntssltlntgffhlaqkspfrvsaatspAKARFVARRKESVRVRQLQRPlieymslpasqysvldaeriervddntfrsSMINriscdsnssnsevqQLTSDAFIEVSIevpfafrafpveaieSTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
mkmalncssstCITFSYNKIRNPTNTSSLTLNTGFFHLAQKSPFRVSAATSPAKarfvarrkesvrvrqlqrplieymslpasqysvldaERIERVDDNTFRSSMINriscdsnssnSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
MKMALNCSSSTCITFSYNKIRNPtntssltlntGFFHLAQKSPFRVSAATSPAKARFVARRKESVRVRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFRSSMINRISCDSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
*********STCITFSYNKIRNPTNTSSLTLNTGFFHLAQKSPF********************VRVRQLQRPLIEYMSLPASQYSVLDAERIERV***************************TSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIE**
*********S****F*Y*******************************************************PLIEYMSLPASQYSVLDAERIERVDDNTFRSSMINRISC***********LTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
********SSTCITFSYNKIRNPTNTSSLTLNTGFFHLAQKSP************************RQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFRSSMINRISC**********QLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
*KMALNCSSSTCITFSYNKIRNPTNTSSLTLNTGFFHLAQKSPFRVSAATSPAKARFVARRKESVRVRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFRSSMINRISCDSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MKMALNCSSSTCITFSYNKIRNPTNTSSLTLNTGFFHLAQKSPFRVSAATSPAKARFVARRKESVRVRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFRSSMINRISCDSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFIEVN
no confident homologs detected
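
The per-residue tracks above are easier to scan once they are collapsed into residue ranges. The following is a minimal sketch, assuming each track is a plain string with one character per residue; the function name `track_segments` and the toy string are hypothetical and are not taken from the report.

```python
from itertools import groupby


def track_segments(track, keep):
    """Return (symbol, start, end) runs, 1-based inclusive, for symbols in `keep`."""
    segments = []
    position = 1
    for symbol, run in groupby(track):
        length = len(list(run))
        if symbol in keep:
            segments.append((symbol, position, position + length - 1))
        position += length
    return segments


# Toy secondary-structure string (hypothetical, not copied from the report).
toy_track = "cccEEEEccHHHHHc"
segments = track_segments(toy_track, keep="EH")
print(segments)
```

The same function could be applied to a disorder or transmembrane track by changing `keep` to the symbols of interest in that track.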

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

No hits with e-value below 0.001 by BLAST

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      182
7406448    285     putative protein [Arabidopsis thaliana]   0.708    0.452    0.563     1e-45
30680516   255     uncharacterized protein [Arabidopsis tha  0.708    0.505    0.558     3e-45
255583732  245     conserved hypothetical protein [Ricinus   0.934    0.693    0.48      4e-44
297806421  252     hypothetical protein ARALYDRAFT_487226 [  0.884    0.638    0.479     5e-44
225448890  239     PREDICTED: uncharacterized protein SYNPC  0.747    0.569    0.531     8e-44
388515479  226     unknown [Medicago truncatula]             0.802    0.646    0.512     2e-41
356500349  228     PREDICTED: uncharacterized protein LOC10  0.741    0.592    0.526     2e-40
356534738  231     PREDICTED: uncharacterized protein LOC10  0.692    0.545    0.519     1e-39
357439939  227     (RAP Annotation release2) Galactose-bind  0.741    0.594    0.515     2e-39
449493277  242     PREDICTED: uncharacterized protein SYNPC  0.824    0.619    0.473     3e-37
>gi|7406448|emb|CAB85550.1| putative protein [Arabidopsis thaliana]
 Score =  187 bits (476), Expect = 1e-45,   Method: Compositional matrix adjust.
 Identities = 102/181 (56%), Positives = 117/181 (64%), Gaps = 52/181 (28%)

Query: 43  PFRVSAATSPAKARFVARRKESVRVRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFR 102
           P RVS++++P KARF+AR+K+SV VRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFR
Sbjct: 56  PIRVSSSSTP-KARFIARQKQSVSVRQLQRPLIEYMSLPASQYSVLDAERIERVDDNTFR 114

Query: 103 ---------------------------------------------------SSMINRISC 111
                                                              +SM+NR+SC
Sbjct: 115 CYVYTFKFFNFEVCPVLLVRVEEQPNGCCIKLLSCKLEGSPVVVAQNDKFDASMVNRVSC 174

Query: 112 DSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVS 171
           DS    S  QQ+TSDA IEV+IE+PFAFR FPV AIE+TGTQVLDQILKLMLPRF+SQV 
Sbjct: 175 DSTQEGSSEQQITSDAVIEVNIEIPFAFRVFPVGAIEATGTQVLDQILKLMLPRFLSQVC 234

Query: 172 R 172
           R
Sbjct: 235 R 235
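
The statistics line of the alignment above gives counts over alignment columns, and the percentages in parentheses look truncated rather than rounded (102/181 is 56.35%, printed as 56%; 52/181 is 28.7%, printed as 28%). The following is a minimal sketch of reading that line, assuming the exact `Name = n/total (p%)` layout shown in this report; the names `STATS_RE` and `parse_blast_stats` are hypothetical.

```python
import re

# Matches e.g. "Identities = 102/181 (56%)" as printed in the report.
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_blast_stats(line):
    """Return {name: (count, total, printed_percent)} for one statistics line."""
    stats = {}
    for name, count, total, percent in STATS_RE.findall(line):
        stats[name] = (int(count), int(total), int(percent))
    return stats


line = " Identities = 102/181 (56%), Positives = 117/181 (64%), Gaps = 52/181 (28%)"
stats = parse_blast_stats(line)
for name, (count, total, percent) in stats.items():
    # Integer division truncates, which reproduces the printed percentage here.
    truncated = 100 * count // total
    print(name, count, total, percent, truncated)
```

The same counts hint at how the coverage columns may be derived: 181 - 52 = 129 ungapped columns, and 129/182 (about 0.709) and 129/285 (about 0.453) are close to the reported Q cover (0.708) and H cover (0.452). This is an inference from the numbers, not something the report states.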




Source: Arabidopsis thaliana
Species: Arabidopsis thaliana
Genus: Arabidopsis
Family: Brassicaceae
Order: Brassicales
Class:
Phylum: Streptophyta
Superkingdom: Eukaryota

>gi|30680516|ref|NP_196064.2| uncharacterized protein [Arabidopsis thaliana] gi|63147366|gb|AAY34156.1| At5g04440 [Arabidopsis thaliana] gi|87116598|gb|ABD19663.1| At5g04440 [Arabidopsis thaliana] gi|332003362|gb|AED90745.1| uncharacterized protein [Arabidopsis thaliana]
>gi|255583732|ref|XP_002532619.1| conserved hypothetical protein [Ricinus communis] gi|223527639|gb|EEF29750.1| conserved hypothetical protein [Ricinus communis]
>gi|297806421|ref|XP_002871094.1| hypothetical protein ARALYDRAFT_487226 [Arabidopsis lyrata subsp. lyrata] gi|297316931|gb|EFH47353.1| hypothetical protein ARALYDRAFT_487226 [Arabidopsis lyrata subsp. lyrata]
>gi|225448890|ref|XP_002270872.1| PREDICTED: uncharacterized protein SYNPCC7002_A1590 [Vitis vinifera] gi|296085939|emb|CBI31380.3| unnamed protein product [Vitis vinifera]
>gi|388515479|gb|AFK45801.1| unknown [Medicago truncatula]
>gi|356500349|ref|XP_003518995.1| PREDICTED: uncharacterized protein LOC100789119 [Glycine max]
>gi|356534738|ref|XP_003535909.1| PREDICTED: uncharacterized protein LOC100797206 [Glycine max]
>gi|357439939|ref|XP_003590247.1| (RAP Annotation release2) Galactose-binding like domain containing protein [Medicago truncatula] gi|355479295|gb|AES60498.1| (RAP Annotation release2) Galactose-binding like domain containing protein [Medicago truncatula]
>gi|449493277|ref|XP_004159242.1| PREDICTED: uncharacterized protein SYNPCC7002_A1590-like [Cucumis sativus]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                  Length  Definition                      Q cover  H cover  Identity  E-value
Query               182
TAIR|locus:2184377  255     AT5G04440 "AT5G04440" [Arabido  0.467    0.333    0.623     6.8e-24
TAIR|locus:2184377 AT5G04440 "AT5G04440" [Arabidopsis thaliana (taxid:3702)]
 Score = 274 (101.5 bits), Expect = 6.8e-24, P = 6.8e-24
 Identities = 53/85 (62%), Positives = 67/85 (78%)

Query:    88 LDAERIERVDDNTFRSSMINRISCDSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAI 147
             L+   +    ++ F +SM+NR+SCDS    S  QQ+TSDA IEV+IE+PFAFR FPV AI
Sbjct:   151 LEGSPVVVAQNDKFDASMVNRVSCDSTQEGSSEQQITSDAVIEVNIEIPFAFRVFPVGAI 210

Query:   148 ESTGTQVLDQILKLMLPRFMSQVSR 172
             E+TGTQVLDQILKLMLPRF+SQ+S+
Sbjct:   211 EATGTQVLDQILKLMLPRFLSQLSK 235


GO:0003674 "molecular_function" evidence=ND
GO:0005634 "nucleus" evidence=ISM
GO:0008150 "biological_process" evidence=ND
GO:0009507 "chloroplast" evidence=IDA
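
The evidence codes decide how much weight each transferred term deserves: ND means no biological data are available, ISM is inferred from a sequence model, and IDA is inferred from a direct assay. The following is a minimal sketch that parses the four lines above and drops the ND placeholders, assuming the `GO:id "name" evidence=CODE` layout used in this report; `GO_RE` and `parse_go_line` are hypothetical names.

```python
import re

# One annotation per line, e.g.: GO:0009507 "chloroplast" evidence=IDA
GO_RE = re.compile(r'^(GO:\d{7}) "([^"]+)" evidence=(\w+)$')


def parse_go_line(line):
    """Return a dict with id, name and evidence, or None if the line does not match."""
    match = GO_RE.match(line.strip())
    if match is None:
        return None
    go_id, name, evidence = match.groups()
    return {"id": go_id, "name": name, "evidence": evidence}


lines = [
    'GO:0003674 "molecular_function" evidence=ND',
    'GO:0005634 "nucleus" evidence=ISM',
    'GO:0008150 "biological_process" evidence=ND',
    'GO:0009507 "chloroplast" evidence=IDA',
]
terms = [parse_go_line(line) for line in lines]
# ND entries are root-term placeholders, so keep only terms with real evidence.
informative = [term for term in terms if term["evidence"] != "ND"]
print(informative)
```

For this hit only the two cellular-component terms survive the filter, and only the chloroplast localization rests on a direct assay.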

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer detected in SWISS-PROT by BLAST

EC Number Prediction by EzyPred Server

Failed to connect to the EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Failed to connect to the STRING server


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition                                          E-value
Query      182
pfam09366  170     pfam09366, DUF1997, Protein of unknown function (D  4e-20
>gnl|CDD|117908 pfam09366, DUF1997, Protein of unknown function (DUF1997)
 Score = 81.9 bits (203), Expect = 4e-20
 Identities = 31/160 (19%), Positives = 55/160 (34%), Gaps = 46/160 (28%)

Query: 57  FVARRKESVRVRQLQRPLIEYMSLPASQYSVL-DAERIERVDDNTFRSSM--IN------ 107
           F A +   + +     PL EY+  P   +S L D  ++ER+ D  +R ++          
Sbjct: 1   FSASQSLDLPLPAPAEPLAEYLRQPQRVFSALLDPMKVERLGDGRYRLTVRPFGFFGFEV 60

Query: 108 ------RISCDSNS-------------------------------SNSEVQQLTSDAFIE 130
                 R+  + +                                  +    L  DA + 
Sbjct: 61  EPVVVLRVEPEDDGLTIELLDCELEGLPLVNDDFDLDLRASLYPDREATETGLEGDADLS 120

Query: 131 VSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQV 170
           V++ +P   R  P   +ESTG  +L  IL+ +  R   Q+
Sbjct: 121 VTVSLPPPLRLLPEPVLESTGESLLQGILRQIKRRLTRQL 160


This family of proteins are functionally uncharacterized. Length = 170

Conserved Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    182
PF09366  158     DUF1997: Protein of unknown function (DUF1997); In  99.96
PF06240  140     COXG: Carbon monoxide dehydrogenase subunit G (Cox  90.35
>PF09366 DUF1997: Protein of unknown function (DUF1997); InterPro: IPR018971 This family of proteins are functionally uncharacterised
Probab=99.96  E-value=1e-28  Score=196.86  Aligned_cols=107  Identities=37%  Similarity=0.550  Sum_probs=94.6

Q ss_pred             CcchhhhhcCccchhhc-cCccceeecCccceee----------------------------------------------
Q 030167           71 QRPLIEYMSLPASQYSV-LDAERIERVDDNTFRS----------------------------------------------  103 (182)
Q Consensus        71 ~rPL~eYLrqPa~~ysv-LD~~riErL~ddtFRa----------------------------------------------  103 (182)
                      +.||+|||++|+|++++ +|++++|+|||++||+                                              
T Consensus         1 ~~~l~~YL~~~~r~~~~~~d~~~ie~l~~~~yr~~~~~~~~~~~~v~P~v~l~v~~~~~~~~i~~~~~~l~G~~~~~~~~   80 (158)
T PF09366_consen    1 QAPLAEYLSDPQRWFSALFDPMRIEPLGDNTYRLKMRPFQFFGFEVEPVVDLRVWPQDDGLTIRSLDCELRGSPLVEQND   80 (158)
T ss_pred             CCchHHHHhCchhHHHHhcCHHHcEEcCCCeEEEEEcCccEEEEEEEEEEEEEEEEcCCCeEEEEEEEEEeCCCccccCC
Confidence            36899999999995555 6999999999999992                                              


Q ss_pred             ----eeeeeEEecCCCCCCCceeEeeeceEEEEEecCCCcccccHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhhhhce
Q 030167          104 ----SMINRISCDSNSSNSEVQQLTSDAFIEVSIEVPFAFRAFPVEAIESTGTQVLDQILKLMLPRFMSQVSRSICYSFI  179 (182)
Q Consensus       104 ----~miNR~s~~~~~~~~~~~~L~gda~LeV~veVP~pF~l~P~~alE~tGN~lLq~VL~~i~prfL~QLl~DY~~w~~  179 (182)
                          +|.|.+.|+.   .++.+.++|+++|+|.+++|+||+++|.+++|+|||++|++|+++|+|||++||.+||+.|..
T Consensus        81 ~f~l~~~~~l~~~~---~~~~t~l~~~~~l~V~v~~P~~~~~~P~~~l~~~G~~vl~~il~~i~~r~~~~l~~Dy~~w~~  157 (158)
T PF09366_consen   81 GFSLDLQASLYPEE---PPGRTRLEGDADLSVSVELPPPFRLLPESLLESTGNAVLQQILRQIKPRFLQQLQADYHRWAR  157 (158)
T ss_pred             cEEEEEEEEEEEec---CCCceEEEEEEEEEEEEEcChhHHhCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh
Confidence                3334444443   566789999999999999999999999999999999999999999999999999999999987


Q ss_pred             e
Q 030167          180 E  180 (182)
Q Consensus       180 ~  180 (182)
                      |
T Consensus       158 ~  158 (158)
T PF09366_consen  158 E  158 (158)
T ss_pred             C
Confidence            6
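
Each HHsearch alignment above opens with a summary line of `key=value` fields (`Probab`, `E-value`, `Score`, `Aligned_cols`, `Identities`, `Similarity`, `Sum_probs`). The following is a minimal sketch for turning such a line into numbers, assuming fields are whitespace-separated and every value is numeric apart from an optional trailing `%`; the function name `parse_hhsearch_summary` is hypothetical.

```python
def parse_hhsearch_summary(line):
    """Return {field: float} for a whitespace-separated key=value summary line."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            # Skip tokens that are not key=value pairs.
            continue
        fields[key] = float(value.rstrip("%"))
    return fields


line = ("Probab=99.96  E-value=1e-28  Score=196.86  Aligned_cols=107  "
        "Identities=37%  Similarity=0.550  Sum_probs=94.6")
summary = parse_hhsearch_summary(line)
print(summary["Probab"], summary["E-value"], summary["Aligned_cols"])
```

Parsed this way, the DUF1997 hit (probability 99.96) and the lower-scoring CoxG hit can be compared on the same fields.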



>PF06240 COXG: Carbon monoxide dehydrogenase subunit G (CoxG); InterPro: IPR010419 The CO dehydrogenase structural genes coxMSL are flanked by nine accessory genes arranged as the cox gene cluster

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    182
d2pcsa1  147     Hypothetical protein GKP20 {Geobacillus kaustophil  91.05
>d2pcsa1 d.129.3.10 (A:1-147) Hypothetical protein GKP20 {Geobacillus kaustophilus [TaxId: 1462]}
class: Alpha and beta proteins (a+b)
fold: TBP-like
superfamily: Bet v1-like
family: CoxG-like
domain: Hypothetical protein GKP20
species: Geobacillus kaustophilus [TaxId: 1462]
Probab=91.05  E-value=0.87  Score=30.67  Aligned_cols=40  Identities=8%  Similarity=0.234  Sum_probs=25.1

Q ss_pred             EeeeccCCcchhhhhcCccchhhccC-ccceeecCccceee
Q 030167           64 SVRVRQLQRPLIEYMSLPASQYSVLD-AERIERVDDNTFRS  103 (182)
Q Consensus        64 ~i~V~e~~rPL~eYLrqPa~~ysvLD-~~riErL~ddtFRa  103 (182)
                      ++.|+-.+.-+=+.|.+|.+.-..+. -+.+|.+++++|.+
T Consensus         6 ~~~i~~~~e~v~~~l~D~~~~~~~~Pg~~~~~~~~~~~~~~   46 (147)
T d2pcsa1           6 SIELKGTVEEVWSKLMDPSILSKCIMGCKSLELIGEDKYKA   46 (147)
T ss_dssp             EEEEESCHHHHHHHHTCHHHHHHHSTTEEEEEEEETTEEEE
T ss_pred             eEEeCCCHHHHHHHHcCHHHHHhhCcchhhceecCCCEEEE
Confidence            34444333445666666666544444 45799999999984