Citrus sinensis ID: 025411


Local Sequence Feature Prediction

Prediction (Method) and Result (result rows follow below in the same order)
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-------240-------250---
MDHGSNPQMQMSDEQHAIHHVNYVPEHELHHISNGDVMDDEHDEGNGVGESEAMEGDAPSDPGSLSDNRAVSEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSTTPAIPIANNQNNRGLPGTPQRLSVPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQRNKGQFTSAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVREL
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccEEEEccEEEEEccccHHHHHHHHHccccccccccccccccccccccccccccccccccHHHHHHHHHHHHHHHHcccHHHHcHHHHHHHHHHHHHccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHHHccc
ccccccccccccccccccHEEcccccccccccccccccccccccccccccccccccccccccccccccccccccccEEEEEEEcEEEEEccccHHHHHHHHHHccccccccccccccccccccccccccccccccHHHHHHHHHHHHHHHHHcccccEEEHHHHHHHHHHcccccccEccccccccccccccccccccccccccccccccccEEEcccccccccccccccccccccHHHHHHHHHHHHHHHcc
mdhgsnpqmqmsdeqhaihhvnyvpehelhhisngdvmddehdegngvgeseamegdapsdpgslsdnravseigdqltlsfqgqvyvfdsvSPEKVQAVLLLLggrevpsttpaipiannqnnrglpgtpqrlsvPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQRnkgqftsaksnnedsasaisswgsnqswagdvngsqnqdivcrhcgisekstpmmrrgpegprtlcNACGLMWANKVREL
MDHGSNPQMQMSDEQHAIHHVNYVPEHELHHISNGDVMDDEHDEGNGVGESEAMEGDAPSDPGSLSDNRAVSEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSttpaipiannqnnrglpgtpqrlsvpqRLASLIrfrekrkernfekkirytvrkevalrmqrnkgqftsaksnneDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGIsekstpmmrrgpeGPRTLCNACGLMWANKVREL
MDHGSNPQMQMSDEQHAIHHVNYVPEHELHHISNGDVMDDEHDEGNGVGESEAMEGDAPSDPGSLSDNRAVSEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSTTPAIPIANNQNNRGLPGTPQRLSVPQRLASLirfrekrkernfekkirYTVRKEVALRMQRNKGQFTSAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVREL
*******************HVNYV*************************************************IGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREV*************************************************IRYTVR***********************************************DIVCRHCGI****************TLCNACGLMWAN*****
********************************************************************************SFQGQVYVFDSVSPEKV**********************************************************************************SAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVRE*
***********SDEQHAIHHVNYVPEHELHHISNGDVMDDEHDEGNGVGE**************LSDNRAVSEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSTTPAIPIANNQNNRGLPGTPQRLSVPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQRNKG*******************GSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVREL
*****************I**V*YVPEHELH********************************************GDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPST**A****************QRLSVPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQ***************************************DI*C**CGISEKSTPMMRRGPEGPRTLCNACGLMWANKVREL
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MDHGSNPQMQMSDEQHAIHHVNYVPEHELHHISNGDVMDDEHDEGNGVGESEAMEGDAPSDPGSLSDNRAVSEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSTTPAIPIANNQNNRGLPGTPQRLSVPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQRNKGQFTSAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVREL
no confident homologs detected
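Each result row above is a per-residue track aligned to the 253-residue query, so a run of identical characters marks one predicted segment. The following is a minimal sketch of how such a track could be collapsed into residue ranges. The function name `segments` and the toy track are illustrative assumptions, not part of the report.

```python
from itertools import groupby


def segments(track):
    """Collapse a per-residue track into (state, start, end) runs.

    Positions are 1-based and inclusive, matching the residue number marker.
    """
    runs = []
    pos = 1
    for state, group in groupby(track):
        length = len(list(group))
        runs.append((state, pos, pos + length - 1))
        pos += length
    return runs


# Toy track for illustration only (not copied from the report)
toy = "cccEEEEccHHH"
for state, start, end in segments(toy):
    print(state, start, end)
```

Applied to the PSIPRED or SSPRO row, the same function would list every predicted strand (E) and helix (H) with its residue range.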

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   253     2.2.26 [Sep-21-2011]
Q8H1G0  302     GATA transcription factor  yes       no        0.968    0.811    0.690     1e-93
Q8GXL7  297     GATA transcription factor  no        no        0.972    0.828    0.661     4e-88
Q9LRH6  309     GATA transcription factor  no        no        0.683    0.559    0.538     4e-46
Q8L500  468     Two-component response re  no        no        0.498    0.269    0.312     2e-07
O50055  355     Zinc finger protein CONST  no        no        0.332    0.236    0.370     7e-07
B0G188  695     GATA zinc finger domain-c  yes       no        0.150    0.054    0.525     8e-07
A2YQ93  742     Two-component response re  N/A       no        0.162    0.055    0.560     1e-06
Q0D3B6  742     Two-component response re  no        no        0.162    0.055    0.560     1e-06
Q00858  457     Cutinase gene palindrome-  N/A       no        0.162    0.089    0.488     2e-06
Q93WK5  727     Two-component response re  no        no        0.205    0.071    0.472     2e-06
>sp|Q8H1G0|GAT28_ARATH GATA transcription factor 28 OS=Arabidopsis thaliana GN=GATA28 PE=2 SV=1
 Score =  342 bits (878), Expect = 1e-93,   Method: Compositional matrix adjust.
 Identities = 176/255 (69%), Positives = 202/255 (79%), Gaps = 10/255 (3%)

Query: 3   HGSNPQMQMSDEQHAIHHVNYVPEHELHHISNGDVM-DDEHDEGNGVGESEAMEGDAPSD 61
           HGSN +M + + Q  +H V +   H LHHI NG  M DD+ D+GN  G SE +E D PS 
Sbjct: 5   HGSNARMHIREAQDPMH-VQF-EHHALHHIHNGSGMVDDQADDGNAGGMSEGVETDIPSH 62

Query: 62  PGSLSDNRAV-----SEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSTTPAI 116
           PG+++DNR       SE GDQLTLSFQGQVYVFDSV PEKVQAVLLLLGGRE+P   P  
Sbjct: 63  PGNVTDNRGEVVDRGSEQGDQLTLSFQGQVYVFDSVLPEKVQAVLLLLGGRELPQAAPPG 122

Query: 117 PIANNQNNR--GLPGTPQRLSVPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQRN 174
             + +QNNR   LPGTPQR S+PQRLASL+RFREKRK RNF+KKIRYTVRKEVALRMQRN
Sbjct: 123 LGSPHQNNRVSSLPGTPQRFSIPQRLASLVRFREKRKGRNFDKKIRYTVRKEVALRMQRN 182

Query: 175 KGQFTSAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEG 234
           KGQFTSAKSNN+++ASA SSWGSNQ+WA + + +Q+Q+I CRHCGI EKSTPMMRRGP G
Sbjct: 183 KGQFTSAKSNNDEAASAGSSWGSNQTWAIESSEAQHQEISCRHCGIGEKSTPMMRRGPAG 242

Query: 235 PRTLCNACGLMWANK 249
           PRTLCNACGLMWANK
Sbjct: 243 PRTLCNACGLMWANK 257




Transcriptional activator that specifically binds 5'-GATA-3' or 5'-GAT-3' motifs within gene promoters.
Arabidopsis thaliana (taxid: 3702)
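The Identity, Q cover and H cover columns of the summary table appear to follow from the alignment statistics of each hit. For Q8H1G0 the figures are consistent with identity = identities / alignment length, and coverage = (alignment length minus gaps) / sequence length. This relationship is inferred from the numbers in the report, not documented by it, and the function name below is hypothetical.

```python
def alignment_ratios(identities, align_len, gaps, query_len, hit_len):
    """Recompute summary ratios from BLAST alignment statistics.

    Assumption: coverage counts only columns where both sequences
    carry a residue (alignment length minus gap columns).
    """
    aligned_pairs = align_len - gaps
    return {
        "identity": round(identities / align_len, 3),
        "q_cover": round(aligned_pairs / query_len, 3),
        "h_cover": round(aligned_pairs / hit_len, 3),
    }


# Q8H1G0: Identities = 176/255, Gaps = 10/255, query length 253, hit length 302
ratios = alignment_ratios(176, 255, 10, 253, 302)
print(ratios)
```

With these inputs the three ratios match the 0.690, 0.968 and 0.811 listed for Q8H1G0 in the table.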
>sp|Q8GXL7|GAT24_ARATH GATA transcription factor 24 OS=Arabidopsis thaliana GN=GATA24 PE=2 SV=2
>sp|Q9LRH6|GAT25_ARATH GATA transcription factor 25 OS=Arabidopsis thaliana GN=GATA25 PE=2 SV=2
>sp|Q8L500|APRR9_ARATH Two-component response regulator-like APRR9 OS=Arabidopsis thaliana GN=APRR9 PE=2 SV=2
>sp|O50055|COL1_ARATH Zinc finger protein CONSTANS-LIKE 1 OS=Arabidopsis thaliana GN=COL1 PE=1 SV=1
>sp|B0G188|GTAP_DICDI GATA zinc finger domain-containing protein 16 OS=Dictyostelium discoideum GN=gtaP PE=4 SV=1
>sp|A2YQ93|PRR37_ORYSI Two-component response regulator-like PRR37 OS=Oryza sativa subsp. indica GN=PRR37 PE=2 SV=2
>sp|Q0D3B6|PRR37_ORYSJ Two-component response regulator-like PRR37 OS=Oryza sativa subsp. japonica GN=PRR37 PE=2 SV=1
>sp|Q00858|CGPB_FUSSO Cutinase gene palindrome-binding protein OS=Fusarium solani subsp. pisi PE=2 SV=1
>sp|Q93WK5|APRR7_ARATH Two-component response regulator-like APRR7 OS=Arabidopsis thaliana GN=APRR7 PE=2 SV=1
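The SWISS-PROT hit headers above pack the accession, entry name, protein name, organism (OS), gene name (GN), protein existence level (PE) and sequence version (SV) into one line. A minimal sketch of splitting such a header follows. The regular expression is an assumption that covers the headers shown in this report (including the one without a GN field) and is not a complete UniProt header parser.

```python
import re

# Pattern written for the header lines in this report; GN is optional
HEADER = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+"
    r"(?P<name>.+?)\s+OS=(?P<organism>.+?)"
    r"(?:\s+GN=(?P<gene>\S+))?"
    r"\s+PE=(?P<pe>\d+)\s+SV=(?P<sv>\d+)\s*$"
)


def parse_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    match = HEADER.match(line)
    return match.groupdict() if match else None


with_gene = parse_header(
    ">sp|Q8H1G0|GAT28_ARATH GATA transcription factor 28 "
    "OS=Arabidopsis thaliana GN=GATA28 PE=2 SV=1"
)
without_gene = parse_header(
    ">sp|Q00858|CGPB_FUSSO Cutinase gene palindrome-binding protein "
    "OS=Fusarium solani subsp. pisi PE=2 SV=1"
)
print(with_gene["accession"], with_gene["gene"], with_gene["organism"])
print(without_gene["accession"], without_gene["gene"], without_gene["organism"])
```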

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      253
356526306  300     PREDICTED: GATA transcription factor 28-  0.988    0.833    0.703     1e-101
18403600   302     GATA transcription factor 28 [Arabidopsi  0.968    0.811    0.690     6e-92
15028099   302     putative flowering protein CONSTANS [Ara  0.968    0.811    0.690     1e-91
30686115   295     GATA transcription factor 24 [Arabidopsi  0.972    0.833    0.666     6e-88
297847530  302     hypothetical protein ARALYDRAFT_892135 [  0.988    0.827    0.685     2e-87
449447986  299     PREDICTED: GATA transcription factor 24-  0.964    0.816    0.640     1e-86
449491798  303     PREDICTED: GATA transcription factor 24-  0.964    0.805    0.640     2e-86
18402914   297     GATA transcription factor 24 [Arabidopsi  0.972    0.828    0.661     2e-86
21537322   297     flowering protein CONSTANS, putative [Ar  0.972    0.828    0.661     3e-86
225443335  299     PREDICTED: GATA transcription factor 28-  0.964    0.816    0.680     3e-86
>gi|356526306|ref|XP_003531759.1| PREDICTED: GATA transcription factor 28-like [Glycine max]
 Score =  372 bits (956), Expect = e-101,   Method: Compositional matrix adjust.
 Identities = 185/263 (70%), Positives = 218/263 (82%), Gaps = 13/263 (4%)

Query: 3   HGSNPQMQMSDEQHAIHHVNYVPEHE---LHHISNGDVMDDEHDEG--NGVGESEAMEGD 57
           HG + ++ ++D QH IH V YV EHE   LHHISNG+ +DD+H++G     G SE+MEG+
Sbjct: 5   HGGDSRIHITDGQHPIH-VPYVQEHEHHGLHHISNGNGIDDDHNDGGDTNCGGSESMEGE 63

Query: 58  APSDPGSLSDNRAV-----SEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPST 112
            PS+ G+L DN AV      + GDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGRE+P T
Sbjct: 64  VPSNHGNLPDNHAVMMDQGGDSGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREIPPT 123

Query: 113 TPAIPIANNQNNRGLPGTPQRLSVPQRLASLIRFREKRKERNFEKKIRYTVRKEVALRMQ 172
            PA+P++ N NNRG  GTPQ+ SVPQRLASLIRFREKRKERN++KKIRYTVRKEVALRMQ
Sbjct: 124 MPAMPVSPNHNNRGYTGTPQKFSVPQRLASLIRFREKRKERNYDKKIRYTVRKEVALRMQ 183

Query: 173 RNKGQFTSAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGP 232
           RNKGQFTS+KSNN++SAS  ++WG +++W  D +GSQ QDIVCRHCGISEKSTPMMRRGP
Sbjct: 184 RNKGQFTSSKSNNDESASNATNWGMDENWTADNSGSQQQDIVCRHCGISEKSTPMMRRGP 243

Query: 233 EGPRTLCNACGLMWANK--VREL 253
           EGPRTLCNACGLMWANK  +R+L
Sbjct: 244 EGPRTLCNACGLMWANKGILRDL 266




Source: Glycine max

Species: Glycine max

Genus: Glycine

Family: Fabaceae

Order: Fabales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|18403600|ref|NP_564593.1| GATA transcription factor 28 [Arabidopsis thaliana] gi|42571823|ref|NP_974002.1| GATA transcription factor 28 [Arabidopsis thaliana] gi|71660840|sp|Q8H1G0.1|GAT28_ARATH RecName: Full=GATA transcription factor 28; AltName: Full=Protein TIFY 2A; AltName: Full=ZIM-like 2 protein gi|23297318|gb|AAN12940.1| putative flowering protein CONSTANS [Arabidopsis thaliana] gi|38603660|dbj|BAD02931.1| GATA-type zinc finger protein [Arabidopsis thaliana] gi|332194567|gb|AEE32688.1| GATA transcription factor 28 [Arabidopsis thaliana] gi|332194568|gb|AEE32689.1| GATA transcription factor 28 [Arabidopsis thaliana]
>gi|15028099|gb|AAK76580.1| putative flowering protein CONSTANS [Arabidopsis thaliana]
>gi|30686115|ref|NP_850618.1| GATA transcription factor 24 [Arabidopsis thaliana] gi|14596059|gb|AAK68757.1| Unknown protein [Arabidopsis thaliana] gi|17978695|gb|AAL47341.1| unknown protein [Arabidopsis thaliana] gi|332642950|gb|AEE76471.1| GATA transcription factor 24 [Arabidopsis thaliana]
>gi|297847530|ref|XP_002891646.1| hypothetical protein ARALYDRAFT_892135 [Arabidopsis lyrata subsp. lyrata] gi|297337488|gb|EFH67905.1| hypothetical protein ARALYDRAFT_892135 [Arabidopsis lyrata subsp. lyrata]
>gi|449447986|ref|XP_004141747.1| PREDICTED: GATA transcription factor 24-like [Cucumis sativus]
>gi|449491798|ref|XP_004159006.1| PREDICTED: GATA transcription factor 24-like [Cucumis sativus]
>gi|18402914|ref|NP_566676.1| GATA transcription factor 24 [Arabidopsis thaliana] gi|71660846|sp|Q8GXL7.2|GAT24_ARATH RecName: Full=GATA transcription factor 24; AltName: Full=Protein TIFY 2B; AltName: Full=ZIM-like 1 protein gi|9280218|dbj|BAB01708.1| unnamed protein product [Arabidopsis thaliana] gi|38603658|dbj|BAD02930.1| GATA-type zinc finger protein [Arabidopsis thaliana] gi|332642949|gb|AEE76470.1| GATA transcription factor 24 [Arabidopsis thaliana]
>gi|21537322|gb|AAM61663.1| flowering protein CONSTANS, putative [Arabidopsis thaliana]
>gi|225443335|ref|XP_002263707.1| PREDICTED: GATA transcription factor 28-like [Vitis vinifera]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology Terms Detected by BLAST

ID                      Length  Definition                      Q cover  H cover  Identity  E-value
Query                   253
TAIR|locus:2017582      302     ZML2 "ZIM-LIKE 2" [Arabidopsis  0.968    0.811    0.635     7.9e-78
TAIR|locus:505006360    297     ZML1 "ZIM-like 1" [Arabidopsis  0.972    0.828    0.615     4e-74
DICTYBASE|DDB_G0270756  1006    gtaG "GATA zinc finger domain-  0.150    0.037    0.6       1.3e-06
CGD|CAL0005605          442     orf19.1577 [Candida albicans (  0.241    0.138    0.439     5.8e-06
UNIPROTKB|Q5ALK1        442     CaO19.1577 "Putative uncharact  0.241    0.138    0.439     5.8e-06
TAIR|locus:2145259      331     GATA12 "GATA transcription fac  0.118    0.090    0.562     7.3e-06
DICTYBASE|DDB_G0295707  695     gtaP "GATA zinc finger domain-  0.276    0.100    0.376     8.6e-06
TAIR|locus:2083388      149     GATA15 "GATA transcription fac  0.166    0.281    0.477     1.3e-05
SGD|S000004744          560     GAT2 "Protein containing GATA   0.284    0.128    0.375     3.7e-05
TAIR|locus:2077932      295     MNP "MONOPOLE" [Arabidopsis th  0.284    0.244    0.364     9.3e-05
TAIR|locus:2017582 ZML2 "ZIM-LIKE 2" [Arabidopsis thaliana (taxid:3702)]
 Score = 783 (280.7 bits), Expect = 7.9e-78, P = 7.9e-78
 Identities = 162/255 (63%), Positives = 186/255 (72%)

Query:     3 HGSNPQMQMSDEQHAIHHVNYVPEHELHHISNGDVM-DDEHDEGNGVGESEAMEGDAPSD 61
             HGSN +M + + Q  +H V +   H LHHI NG  M DD+ D+GN  G SE +E D PS 
Sbjct:     5 HGSNARMHIREAQDPMH-VQF-EHHALHHIHNGSGMVDDQADDGNAGGMSEGVETDIPSH 62

Query:    62 PGSLSDNRAV-----SEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLGGREVPSTTPAI 116
             PG+++DNR       SE GDQLTLSFQGQVYVFDSV PEKVQAVLLLLGGRE+P   P  
Sbjct:    63 PGNVTDNRGEVVDRGSEQGDQLTLSFQGQVYVFDSVLPEKVQAVLLLLGGRELPQAAPPG 122

Query:   117 PIANNQNNR--GLPGTPQRLSVPQRLASLXXXXXXXXXXXXXXXXXYTVRKEVALRMQRN 174
               + +QNNR   LPGTPQR S+PQRLASL                 YTVRKEVALRMQRN
Sbjct:   123 LGSPHQNNRVSSLPGTPQRFSIPQRLASLVRFREKRKGRNFDKKIRYTVRKEVALRMQRN 182

Query:   175 KGQFTSAKSNNEDSASAISSWGSNQSWAGDVNGSQNQDIVCRHCGISEKSTPMMRRGPEG 234
             KGQFTSAKSNN+++ASA SSWGSNQ+WA + + +Q+Q+I CRHCGI EKSTPMMRRGP G
Sbjct:   183 KGQFTSAKSNNDEAASAGSSWGSNQTWAIESSEAQHQEISCRHCGIGEKSTPMMRRGPAG 242

Query:   235 PRTLCNACGLMWANK 249
             PRTLCNACGLMWANK
Sbjct:   243 PRTLCNACGLMWANK 257




GO:0003700 "sequence-specific DNA binding transcription factor activity" evidence=IEA;ISS
GO:0005634 "nucleus" evidence=ISM
GO:0006355 "regulation of transcription, DNA-dependent" evidence=IEA
GO:0008270 "zinc ion binding" evidence=IEA
GO:0043565 "sequence-specific DNA binding" evidence=IEA
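The GO terms listed for ZML2 share one layout: a GO identifier, the quoted term name, and one or more evidence codes separated by semicolons. The sketch below shows one way to split such a line into fields. The pattern and the function name `parse_go_line` are assumptions fitted to the lines in this report.

```python
import re

# Layout assumed from the report: GO:nnnnnnn "term name" evidence=CODE[;CODE...]
GO_LINE = re.compile(r'^(GO:\d{7})\s+"([^"]+)"\s+evidence=(\S+)$')


def parse_go_line(line):
    """Return id, term and evidence codes, or None if the line does not match."""
    match = GO_LINE.match(line.strip())
    if match is None:
        return None
    go_id, term, evidence = match.groups()
    return {"id": go_id, "term": term, "evidence": evidence.split(";")}


record = parse_go_line(
    'GO:0003700 "sequence-specific DNA binding transcription factor activity" '
    "evidence=IEA;ISS"
)
print(record)
```

Splitting the evidence field makes it straightforward to separate computationally inferred codes such as IEA from the others when deciding which terms to transfer to the query.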
TAIR|locus:505006360 ZML1 "ZIM-like 1" [Arabidopsis thaliana (taxid:3702)]
DICTYBASE|DDB_G0270756 gtaG "GATA zinc finger domain-containing protein 7" [Dictyostelium discoideum (taxid:44689)]
CGD|CAL0005605 orf19.1577 [Candida albicans (taxid:5476)]
UNIPROTKB|Q5ALK1 CaO19.1577 "Putative uncharacterized protein" [Candida albicans SC5314 (taxid:237561)]
TAIR|locus:2145259 GATA12 "GATA transcription factor 12" [Arabidopsis thaliana (taxid:3702)]
DICTYBASE|DDB_G0295707 gtaP "GATA zinc finger domain-containing protein 16" [Dictyostelium discoideum (taxid:44689)]
TAIR|locus:2083388 GATA15 "GATA transcription factor 15" [Arabidopsis thaliana (taxid:3702)]
SGD|S000004744 GAT2 "Protein containing GATA family zinc finger motifs" [Saccharomyces cerevisiae (taxid:4932)]
TAIR|locus:2077932 MNP "MONOPOLE" [Arabidopsis thaliana (taxid:3702)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name         Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q8H1G0  GAT28_ARATH  No assigned EC number  0.6901    0.9683          0.8112        yes       no

EC Number Prediction by EzyPred Server

Failed to connect to the EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Failed to connect to the STRING server


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition                                          E-value
Query      253
pfam06203  45      pfam06203, CCT, CCT motif                           3e-19
pfam00320  36      pfam00320, GATA, GATA zinc finger                   4e-12
smart00401 52      smart00401, ZnF_GATA, zinc finger binding to DNA c  4e-12
pfam06200  36      pfam06200, tify, tify domain                        7e-11
cd00202    54      cd00202, ZnF_GATA, Zinc finger DNA binding domain;  1e-10
smart00979 36      smart00979, TIFY, This short possible domain is fo  4e-10
pfam09425  27      pfam09425, CCT_2, Divergent CCT motif               4e-05
>gnl|CDD|203407 pfam06203, CCT, CCT motif
 Score = 78.0 bits (193), Expect = 3e-19
 Identities = 23/45 (51%), Positives = 29/45 (64%)

Query: 139 RLASLIRFREKRKERNFEKKIRYTVRKEVALRMQRNKGQFTSAKS 183
           R A+L+R++EKRK R F+KKIRY  RK VA    R KG+F     
Sbjct: 1   REAALLRYKEKRKTRKFDKKIRYASRKAVAESRPRVKGRFVKQSE 45


This short motif is found in a number of plant proteins. It is rich in basic amino acids and has been called a CCT motif after Co, Col and Toc1. The CCT motif is about 45 amino acids long and contains a putative nuclear localisation signal within the second half of the CCT motif. Toc1 mutants have been identified in this region. Length = 45

>gnl|CDD|109380 pfam00320, GATA, GATA zinc finger
>gnl|CDD|214648 smart00401, ZnF_GATA, zinc finger binding to DNA consensus sequence [AT]GATA[AG]
>gnl|CDD|203405 pfam06200, tify, tify domain
>gnl|CDD|238123 cd00202, ZnF_GATA, Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C
>gnl|CDD|203405 pfam06200, tify, tify domain
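The cd00202 entry above gives the zinc-finger consensus topology as C-X(2)-C-X(17)-C-X(2)-C. A consensus of this kind can be turned into a regular expression and scanned against the query. In the query's C-terminal region the spacer between the second and third cysteine appears to be 20 residues rather than 17, so the sketch below allows a spacer of 17 to 20 residues. That range is an assumption made for this example, and the fragment is query residues 207-246 copied from the alignment rows of this report.

```python
import re

# C-X(2)-C-X(17..20)-C-X(2)-C; the upper bound of 20 is an assumption for this query
GATA_FINGER = re.compile(r"C.{2}C(.{17,20})C.{2}C")

# Query residues 207-246, taken from the alignments in this report
fragment = "GSQNQDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMW"
offset = 207  # residue number of the first character in the fragment

match = GATA_FINGER.search(fragment)
if match:
    start = offset + match.start()
    end = offset + match.end() - 1
    spacer_length = len(match.group(1))
    print(f"finger at {start}-{end}, spacer of {spacer_length} residues")
else:
    print("no finger found")
```

The reported range falls inside the region that the GATA-type structure templates align to, which is what one would expect if the four cysteines form the zinc-binding site.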

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       253
PF06200     36      tify: tify domain; InterPro: IPR010399 The tify do  99.69
PF00320     36      GATA: GATA zinc finger; InterPro: IPR000679 Zinc f  99.47
cd00202     54      ZnF_GATA Zinc finger DNA binding domain; binds spe  99.46
smart00401  52      ZnF_GATA zinc finger binding to DNA consensus sequ  99.45
PF06203     45      CCT: CCT motif; InterPro: IPR010402 The CCT (CONST  99.4
PF09425     27      CCT_2: Divergent CCT motif; InterPro: IPR018467 Th  98.55
KOG1601     340     consensus GATA-4/5/6 transcription factors [Transc  98.41
COG5641     498     GAT1 GATA Zn-finger-containing transcription facto  97.6
KOG3554     693     consensus Histone deacetylase complex, MTA1 compon  91.57
>PF06200 tify: tify domain; InterPro: IPR010399 The tify domain is a 36-amino acid domain only found among Embryophyta (land plants)
Probab=99.69  E-value=2.9e-17  Score=107.34  Aligned_cols=34  Identities=47%  Similarity=0.705  Sum_probs=32.2

Q ss_pred             CCCCCceeEEEccEEEeeCCCCHHHHHHHHHHhC
Q 025411           72 SEIGDQLTLSFQGQVYVFDSVSPEKVQAVLLLLG  105 (253)
Q Consensus        72 ~~~~~QLTIfY~G~V~VfD~v~~eKaq~Im~lA~  105 (253)
                      ++.++||||||+|+|+|||+||+|||++||+||+
T Consensus         2 ~~~~~qLTIfY~G~V~Vfd~v~~~Ka~~im~lA~   35 (36)
T PF06200_consen    2 SPETAQLTIFYGGQVCVFDDVPPDKAQEIMLLAS   35 (36)
T ss_pred             CCCCCcEEEEECCEEEEeCCCCHHHHHHHHHHhc
Confidence            4678999999999999999999999999999997



It has been named after the most conserved amino acid pattern (TIF[F/Y]XG) it contains, but was previously known as the Zim domain. As the use of uppercase characters (TIFY) might imply that the domain is fully conserved across proteins, lowercase lettering has been chosen to highlight its natural variability. Based on the domain architecture, tify domain-containing proteins can be classified into two groups. Group I is formed by proteins possessing a CCT (CONSTANS, CO-like, and TOC1) domain and a GATA-type zinc finger in addition to the tify domain. Group II contains proteins characterised by the tify domain but lacking a GATA-type zinc finger. Tify domain-containing proteins might be involved in developmental processes, and some of them have features that are characteristic of transcription factors: a nuclear localisation and the presence of a putative DNA-binding domain. Proteins known to contain a tify domain include Arabidopsis thaliana Zinc-finger protein expressed in Inflorescence Meristem (ZIM), a putative transcription factor involved in inflorescence and flower development; the A. thaliana ZIM-like proteins (ZML); and A. thaliana PEAPOD1 and PEAPOD2 (PPD1 and PPD2).

>PF00320 GATA: GATA zinc finger; InterPro: IPR000679 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule
>cd00202 ZnF_GATA Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C
>smart00401 ZnF_GATA zinc finger binding to DNA consensus sequence [AT]GATA[AG]
>PF06203 CCT: CCT motif; InterPro: IPR010402 The CCT (CONSTANS, CO-like, and TOC1) domain is a highly conserved basic module of ~43 amino acids, which is found near the C terminus of plant proteins often involved in light signal transduction
>PF09425 CCT_2: Divergent CCT motif; InterPro: IPR018467 The short CCT (CO, COL, TOC1) motif is found in a number of plant proteins, including Constans (CO), Constans-like (COL) and TOC1
>KOG1601 consensus GATA-4/5/6 transcription factors [Transcription]
>COG5641 GAT1 GATA Zn-finger-containing transcription factor [Transcription]
>KOG3554 consensus Histone deacetylase complex, MTA1 component [Chromatin structure and dynamics]
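Each expanded HHsearch hit in this section carries a one-line summary of key=value pairs (Probab, E-value, Score, Aligned_cols, Identities, Similarity, Sum_probs). A minimal sketch of turning that line into numbers follows. The function name is hypothetical, and the example line is the PF06200 summary copied from this report.

```python
def parse_hh_summary(line):
    """Split an HHsearch hit summary line into a dict of floats.

    Percent signs (as in Identities=47%) are dropped before conversion.
    """
    stats = {}
    for token in line.split():
        key, _, value = token.partition("=")
        stats[key] = float(value.rstrip("%"))
    return stats


line = (
    "Probab=99.69  E-value=2.9e-17  Score=107.34  Aligned_cols=34  "
    "Identities=47%  Similarity=0.705  Sum_probs=32.2"
)
stats = parse_hh_summary(line)
print(stats["Probab"], stats["E-value"], stats["Aligned_cols"])
```

Having the values as floats makes it easy to rank hits by probability or to filter them by E-value.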

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   253
2kae_A  71      GATA-type transcription factor; zinc finger, GATA-  2e-09
1gnf_A  46      Transcription factor GATA-1; zinc finger, transcri  3e-08
4gat_A  66      Nitrogen regulatory protein AREA; DNA binding prot  2e-07
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  3e-07
3dfx_A  63      Trans-acting T-cell-specific transcription factor   2e-06
>2kae_A GATA-type transcription factor; zinc finger, GATA-type, DNA; NMR {Caenorhabditis elegans} Length = 71
 Score = 51.8 bits (124), Expect = 2e-09
 Identities = 11/42 (26%), Positives = 16/42 (38%), Gaps = 4/42 (9%)

Query: 207 GSQN--QDIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMW 246
           GS    +   C +C ++E  T   R         CNAC +  
Sbjct: 1   GSHMNKKSFQCSNCSVTE--TIRWRNIRSKEGIQCNACFIYQ 40


>1gnf_A Transcription factor GATA-1; zinc finger, transcription regulation; NMR {Mus musculus} SCOP: g.39.1.1 PDB: 1y0j_A 2l6y_A 2l6z_A Length = 46
>4gat_A Nitrogen regulatory protein AREA; DNA binding protein, transcription factor, zinc binding domain, complex (transcription regulation/DNA); HET: DNA; NMR {Emericella nidulans} SCOP: g.39.1.1 PDB: 5gat_A* 6gat_A* 7gat_A* Length = 66
>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I* Length = 43
>3dfx_A Trans-acting T-cell-specific transcription factor GATA-3; activator, DNA-binding, metal-binding, nucleus; HET: DNA; 2.70A {Mus musculus} PDB: 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* Length = 63

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   253
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  99.6
1gnf_A  46      Transcription factor GATA-1; zinc finger, transcri  99.58
3dfx_A  63      Trans-acting T-cell-specific transcription factor   99.56
4gat_A  66      Nitrogen regulatory protein AREA; DNA binding prot  99.52
2kae_A  71      GATA-type transcription factor; zinc finger, GATA-  99.47
4hc9_A  115     Trans-acting T-cell-specific transcription factor;  99.31
4hc9_A  115     Trans-acting T-cell-specific transcription factor;  99.12
3ogl_Q  21      JAZ1 incomplete degron peptide; leucine-rich repea  98.26
3ogk_Q  22      JAZ1 incomplete degron peptide; leucine rich repea  98.18
>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I*
Probab=99.60  E-value=2.9e-16  Score=105.59  Aligned_cols=36  Identities=44%  Similarity=0.894  Sum_probs=32.8

Q ss_pred             ccccccccccCCCCccccCCCCCcccchHHHHHHHhhhc
Q 025411          213 IVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVR  251 (253)
Q Consensus       213 ~~C~~C~~~~~~Tp~wR~Gp~G~~~LCNaCGl~~~k~~~  251 (253)
                      ..|++|++  +.||+||+||+|+ +|||||||+|++..+
T Consensus         2 ~~C~~C~t--t~Tp~WR~gp~G~-~LCNaCGl~~k~~~~   37 (43)
T 2vut_I            2 TTCTNCFT--QTTPLWRRNPEGQ-PLCNACGLFLKLHGV   37 (43)
T ss_dssp             CCCSSSCC--CCCSCCEECTTSC-EECHHHHHHHHHHSS
T ss_pred             CcCCccCC--CCCCccccCCCCC-cccHHHHHHHHHhCC
Confidence            57999999  7899999999997 999999999988653



>1gnf_A Transcription factor GATA-1; zinc finger, transcription regulation; NMR {Mus musculus} SCOP: g.39.1.1 PDB: 1y0j_A 2l6y_A 2l6z_A
>3dfx_A Trans-acting T-cell-specific transcription factor GATA-3; activator, DNA-binding, metal-binding, nucleus; HET: DNA; 2.70A {Mus musculus} PDB: 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A*
>4gat_A Nitrogen regulatory protein AREA; DNA binding protein, transcription factor, zinc binding domain, complex (transcription regulation/DNA); HET: DNA; NMR {Emericella nidulans} SCOP: g.39.1.1 PDB: 5gat_A* 6gat_A* 7gat_A*
>2kae_A GATA-type transcription factor; zinc finger, GATA-type, DNA; NMR {Caenorhabditis elegans}
>4hc9_A Trans-acting T-cell-specific transcription factor; zinc finger, GATA transcription factor, DNA bridging, transc DNA complex; HET: DNA; 1.60A {Homo sapiens} PDB: 4hc7_A* 4hca_A* 3dfx_A* 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* 1gnf_A 1y0j_A 2l6y_A 2l6z_A
>4hc9_A Trans-acting T-cell-specific transcription factor; zinc finger, GATA transcription factor, DNA bridging, transc DNA complex; HET: DNA; 1.60A {Homo sapiens} PDB: 4hc7_A* 4hca_A* 3dfx_A* 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* 1gnf_A 1y0j_A 2l6y_A 2l6z_A

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                          E-value
Query    253
d1y0ja1  39      g.39.1.1 (A:200-238) Erythroid transcription facto  7e-10
d2vuti1  42      g.39.1.1 (I:671-712) Erythroid transcription facto  1e-08
d3gata_  66      g.39.1.1 (A:) Erythroid transcription factor GATA-  7e-07
>d1y0ja1 g.39.1.1 (A:200-238) Erythroid transcription factor GATA-1 {Mouse (Mus musculus) [TaxId: 10090]} Length = 39

class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: Erythroid transcription factor GATA-1
domain: Erythroid transcription factor GATA-1
species: Mouse (Mus musculus) [TaxId: 10090]
 Score = 50.9 bits (122), Expect = 7e-10
 Identities = 15/32 (46%), Positives = 18/32 (56%), Gaps = 3/32 (9%)

Query: 215 CRHCGISEKSTPMMRRGPEGPRTLCNACGLMW 246
           C +CG +   TP+ RR   G   LCNACGL  
Sbjct: 5   CVNCGATA--TPLWRRDRTG-HYLCNACGLYH 33


>d2vuti1 g.39.1.1 (I:671-712) Erythroid transcription factor GATA-1 {Emericella nidulans [TaxId: 162425]} Length = 42
>d3gata_ g.39.1.1 (A:) Erythroid transcription factor GATA-1 {Chicken (Gallus gallus) [TaxId: 9031]} Length = 66

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    253
d1y0ja1  39      Erythroid transcription factor GATA-1 {Mouse (Mus   99.66
d2vuti1  42      Erythroid transcription factor GATA-1 {Emericella   99.63
d3gata_  66      Erythroid transcription factor GATA-1 {Chicken (Ga  99.54
>d1y0ja1 g.39.1.1 (A:200-238) Erythroid transcription factor GATA-1 {Mouse (Mus musculus) [TaxId: 10090]}
class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: Erythroid transcription factor GATA-1
domain: Erythroid transcription factor GATA-1
species: Mouse (Mus musculus) [TaxId: 10090]
Probab=99.66  E-value=8.5e-18  Score=108.80  Aligned_cols=37  Identities=41%  Similarity=0.806  Sum_probs=33.1

Q ss_pred             cccccccccccCCCCccccCCCCCcccchHHHHHHHhhhc
Q 025411          212 DIVCRHCGISEKSTPMMRRGPEGPRTLCNACGLMWANKVR  251 (253)
Q Consensus       212 ~~~C~~C~~~~~~Tp~wR~Gp~G~~~LCNaCGl~~~k~~~  251 (253)
                      .+.|++|++  +.||+||+||+| ++|||||||+|++..+
T Consensus         2 ~r~C~~Cgt--t~Tp~WR~gp~G-~~LCNACGl~~r~~G~   38 (39)
T d1y0ja1           2 ARECVNCGA--TATPLWRRDRTG-HYLCNACGLYHKMNGQ   38 (39)
T ss_dssp             CCCCSSSCC--CCCSCCEECTTS-CEECSSHHHHHHHSCC
T ss_pred             cCCCCCCCC--CCCcccccCCCC-CEeeHHhHHHHHHhCC
Confidence            468999999  789999999999 6999999999987643



>d2vuti1 g.39.1.1 (I:671-712) Erythroid transcription factor GATA-1 {Emericella nidulans [TaxId: 162425]}
>d3gata_ g.39.1.1 (A:) Erythroid transcription factor GATA-1 {Chicken (Gallus gallus) [TaxId: 9031]}