Citrus sinensis ID: 026843


Local Sequence Feature Prediction

Prediction (Method) and Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230--
MSQPKTMTNKLAIRRREVGEGSTSDNSSYTSSSSGESMSSKMRLANKIINSSSVSTGTHDESVKVAEKLLHEHDNIEVHYFTTNSSNSNNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARKAMQAAAESGTTTAKDNSSFSKIKLQNNMEKKPRTSHVAQYKKVQCNTPDPDPPHHEYRSQRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSHT
ccccccccccccccccccccccccccccccccccccccHHHHHHHHHHccccccccccccccccHHHHHccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHHHHHHHHHHHHHcccccccccccccHHHHcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHHHccccccc
ccccccccccEEEEEccccccccccccccccccccccccHHHHHHHHHccccccccccccccccccccccccccccccccccccccccccccEEEccccccccccccccccccccHHHccccEEccccccccccccccccccccccccHHHHHHHHHHHcccccccccccHcccccccccccccccccccEEccccEEEEccccHHHccccccHHHHHHHHHHHHHcccccc
msqpktmtNKLAIRRrevgegstsdnssytssssgesmSSKMRLANKIinsssvstgthdESVKVAEKLLHehdnievhyfttnssnsnntvricsdcnttttplwrsgprgpkslcnacGIRQRKARKAMQAAAEsgtttakdnssfSKIKlqnnmekkprtshvaqykkvqcntpdpdpphheyrsqrkLCFKDFAlslssnsalkqvfpRDVEEAAILLMELSCGFSHT
msqpktmtnklairrrevgegstsdnssytssssgesmSSKMRLANKIInsssvstgthDESVKVAEKLLHEHDNIEVHyfttnssnsnntVRICSDCNTtttplwrsgprgpkslCNACGIRQRKARKAMQAAAesgtttakdnssfskiklqnnmekkprtshVAQYKKVQCNTPDPDPPHHEYRSQRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSHT
MSQPKTMTNKLAIRRREVGEGstsdnssytssssgesmssKMRLANKIINSSSVSTGTHDESVKVAEKLLHEHDNIEVHYFttnssnsnntVRICSDCNTTTTPLWRSGPRGPKSLCNACGIrqrkarkamqaaaESGTTTAKDNSSFSKIKLQNNMEKKPRTSHVAQYKKVQCNTPDPDPPHHEYRSQRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSHT
******************************************************************EKLLHEHDNIEVHYFTTNSSNSNNTVRICSDCNTTTTPLWR********LCNACG*********************************************************************KLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCG****
************************************************************************************************DCNTTTTPLWRSGPRGPKSLCNACGIRQ*********************************************************************************************AAILLMELSCGFSHT
MSQPKTMTNKLAIRRRE************************MRLANKIINSS*********SVKVAEKLLHEHDNIEVHYFTTNSSNSNNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGI************************SFSKIKLQNN************YKKVQCNTPDPDPPHHEYRSQRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSHT
*******************************************************************************************VRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARKAMQAAAES***************************************************QRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSHT
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MSQPKTMTNKLAIRRREVGEGSTSDNSSYTSSSSGESMSSKMRLANKIINSSSVSTGTHDESVKVAEKLLHEHDNIEVHYFTTNSSNSNNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARKAMQAAAESGTTTAKDNSSFSKIKLQNNMEKKPRTSHVAQYKKVQCNTPDPDPPHHEYRSQRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSHT
no confident homologs detected
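
Each prediction track above is a per-residue string aligned to the 232-residue sequence, so a contiguous run of one symbol corresponds to a residue range. A minimal Python sketch of that conversion follows; the helper name `runs` and the toy track are illustrative assumptions, not output of the prediction tools.

```python
from itertools import groupby

def runs(track):
    """Collapse a per-residue track into (symbol, start, end) runs, 1-based inclusive."""
    out = []
    pos = 1
    for symbol, group in groupby(track):
        length = sum(1 for _ in group)
        out.append((symbol, pos, pos + length - 1))
        pos += length
    return out

# Toy track for illustration only (not one of the actual prediction lines).
toy = "cccHHHHHcc"
helices = [(start, end) for sym, start, end in runs(toy) if sym == "H"]
print(helices)  # [(4, 8)]
```

The same helper works for the other tracks once the symbol of interest is chosen, for example `h` in the transmembrane lines or `*` in the disorder lines.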

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   232     2.2.26 [Sep-21-2011]
Q9SZI6  352     Putative GATA transcripti  yes       no        0.607    0.400    0.443     4e-22
Q5HZ36  398     GATA transcription factor  no        no        0.176    0.103    0.853     5e-17
Q8LC59  120     GATA transcription factor  no        no        0.219    0.425    0.568     6e-13
Q9FJ10  139     GATA transcription factor  no        no        0.202    0.338    0.595     3e-12
Q8LG10  149     GATA transcription factor  no        no        0.25     0.389    0.517     9e-12
Q6QPM2  211     GATA transcription factor  no        no        0.232    0.255    0.551     1e-11
Q8VZP4  308     GATA transcription factor  no        no        0.202    0.152    0.595     3e-11
Q8LC79  295     GATA transcription factor  no        no        0.185    0.145    0.651     5e-11
Q9LIB5  190     GATA transcription factor  no        no        0.176    0.215    0.682     5e-11
Q6DBP8  303     GATA transcription factor  no        no        0.202    0.155    0.595     5e-11
>sp|Q9SZI6|GAT22_ARATH Putative GATA transcription factor 22 OS=Arabidopsis thaliana GN=GATA22 PE=2 SV=1
 Score =  105 bits (261), Expect = 4e-22,   Method: Compositional matrix adjust.
 Identities = 75/169 (44%), Positives = 93/169 (55%), Gaps = 28/169 (16%)

Query: 84  NSSNSNNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARKAMQAAAE----SGT 139
           N  N++  +RICSDCNTT TPLWRSGPRGPKSLCNACGIRQRKAR+A  A A     SG 
Sbjct: 190 NGYNNDCVIRICSDCNTTKTPLWRSGPRGPKSLCNACGIRQRKARRAAMATATATAVSGV 249

Query: 140 TTAKDNSS--------------FSKIKLQNNMEKKPRT---SHVAQYKKVQCNTPDPDPP 182
           +                      S + L+ N  K+  T   + +A+  + Q N+      
Sbjct: 250 SPPVMKKKMQNKNKISNGVYKILSPLPLKVNTCKRMITLEETALAEDLETQSNST----- 304

Query: 183 HHEYRSQRKLCFKDFALSLSSNSALKQVFPRDVEEAAILLMELSCGFSH 231
                S   + F D AL LS +SA +QVFP+D +EAAILLM LS G  H
Sbjct: 305 --MLSSSDNIYFDDLALLLSKSSAYQQVFPQDEKEAAILLMALSHGMVH 351




Transcriptional regulator that specifically binds 5'-GATA-3' or 5'-GAT-3' motifs within gene promoters.
Arabidopsis thaliana (taxid: 3702)
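
The summary columns for the Q9SZI6 hit can be reproduced from the alignment statistics above. The sketch below assumes that coverage is the number of ungapped alignment columns divided by the sequence length, and that values are truncated (not rounded) to three decimals; both are inferences from the numbers shown in this report, not documented behaviour of the pipeline. The helper name `trunc3` is hypothetical.

```python
def trunc3(numerator, denominator):
    """Ratio truncated (not rounded) to three decimals, using integer arithmetic."""
    return (numerator * 1000 // denominator) / 1000

# Values read from the Q9SZI6 alignment above.
identities, columns, gaps = 75, 169, 28
query_len, hit_len = 232, 352

ungapped = columns - gaps            # 141 columns pair a query residue with a hit residue
print(trunc3(identities, columns))   # 0.443  (Identity)
print(trunc3(ungapped, query_len))   # 0.607  (Q cover)
print(trunc3(ungapped, hit_len))     # 0.4    (H cover, shown as 0.400)
```

The same arithmetic also matches the top non-redundant hit (210 columns, 13 gaps, hit length 234), which gives 0.849 and 0.841 for query and hit coverage.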
>sp|Q5HZ36|GAT21_ARATH GATA transcription factor 21 OS=Arabidopsis thaliana GN=GATA21 PE=1 SV=2
>sp|Q8LC59|GAT23_ARATH GATA transcription factor 23 OS=Arabidopsis thaliana GN=GATA23 PE=2 SV=2
>sp|Q9FJ10|GAT16_ARATH GATA transcription factor 16 OS=Arabidopsis thaliana GN=GATA16 PE=2 SV=1
>sp|Q8LG10|GAT15_ARATH GATA transcription factor 15 OS=Arabidopsis thaliana GN=GATA15 PE=2 SV=2
>sp|Q6QPM2|GAT19_ARATH GATA transcription factor 19 OS=Arabidopsis thaliana GN=GATA19 PE=2 SV=2
>sp|Q8VZP4|GAT10_ARATH GATA transcription factor 10 OS=Arabidopsis thaliana GN=GATA10 PE=2 SV=1
>sp|Q8LC79|GAT18_ARATH GATA transcription factor 18 OS=Arabidopsis thaliana GN=GATA18 PE=2 SV=2
>sp|Q9LIB5|GAT17_ARATH GATA transcription factor 17 OS=Arabidopsis thaliana GN=GATA17 PE=2 SV=1
>sp|Q6DBP8|GAT11_ARATH GATA transcription factor 11 OS=Arabidopsis thaliana GN=GATA11 PE=2 SV=1

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      232
224088836  234     predicted protein [Populus trichocarpa]   0.849    0.841    0.519     9e-41
118487597  303     unknown [Populus trichocarpa]             0.849    0.650    0.519     1e-40
255550794  186     conserved hypothetical protein [Ricinus   0.793    0.989    0.538     1e-35
356554076  306     PREDICTED: putative GATA transcription f  0.905    0.686    0.449     1e-34
118489347  303     unknown [Populus trichocarpa x Populus d  0.573    0.438    0.671     5e-34
118488977  306     unknown [Populus trichocarpa x Populus d  0.577    0.437    0.659     6e-34
225429550  306     PREDICTED: putative GATA transcription f  0.875    0.663    0.420     1e-32
255546095  312     hypothetical protein RCOM_1046780 [Ricin  0.909    0.676    0.425     6e-31
147805325  211     hypothetical protein VITISV_032017 [Viti  0.862    0.947    0.432     2e-29
356556282  315     PREDICTED: putative GATA transcription f  0.810    0.596    0.412     6e-29
>gi|224088836|ref|XP_002308561.1| predicted protein [Populus trichocarpa] gi|222854537|gb|EEE92084.1| predicted protein [Populus trichocarpa]
 Score =  172 bits (436), Expect = 9e-41,   Method: Compositional matrix adjust.
 Identities = 109/210 (51%), Positives = 134/210 (63%), Gaps = 13/210 (6%)

Query: 25  DNSSYTSSSSGESMSSKMRLANKIINSSSVSTGTHDESVKVAEKLLHEHDNIEVHYFTTN 84
           D +  +  SS + M SKMRL  K+ NS+   T            +L  H+    +    +
Sbjct: 36  DGAEESGESSVKWMPSKMRLMQKMTNSNCSETDHMPMKF-----MLKFHNQQYQNNEINS 90

Query: 85  SSNSNNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRK--ARKAMQAAAESGTTTA 142
           SSNSN+ +R+CSDCNTT+TPLWRSGPRGPKSLCNACGIRQRK     A  AAA +GT  A
Sbjct: 91  SSNSNSNIRVCSDCNTTSTPLWRSGPRGPKSLCNACGIRQRKARRAMAAAAAAANGTVIA 150

Query: 143 KDNSSFSKIKLQNNMEKKPRTSHVAQYKKVQCNTPDPDPPHHEYRSQRKLCFKDFALSLS 202
            + SS ++    NN  KK RT+HV+Q KK+        PP    +SQ+KLCFK+ ALSLS
Sbjct: 151 IEASSSTRSTKVNNKVKKSRTNHVSQNKKLS------KPPESSLQSQKKLCFKNLALSLS 204

Query: 203 SNSALKQVFPRDVEEAAILLMELSCGFSHT 232
            N AL+QV P DVEEAAILLMELSCGF H+
Sbjct: 205 KNPALQQVLPHDVEEAAILLMELSCGFIHS 234




Source: Populus trichocarpa

Species: Populus trichocarpa

Genus: Populus

Family: Salicaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|118487597|gb|ABK95624.1| unknown [Populus trichocarpa]
>gi|255550794|ref|XP_002516445.1| conserved hypothetical protein [Ricinus communis] gi|223544265|gb|EEF45786.1| conserved hypothetical protein [Ricinus communis]
>gi|356554076|ref|XP_003545375.1| PREDICTED: putative GATA transcription factor 22-like [Glycine max]
>gi|118489347|gb|ABK96478.1| unknown [Populus trichocarpa x Populus deltoides]
>gi|118488977|gb|ABK96296.1| unknown [Populus trichocarpa x Populus deltoides]
>gi|225429550|ref|XP_002279283.1| PREDICTED: putative GATA transcription factor 22 [Vitis vinifera] gi|296081660|emb|CBI20665.3| unnamed protein product [Vitis vinifera]
>gi|255546095|ref|XP_002514107.1| hypothetical protein RCOM_1046780 [Ricinus communis] gi|223546563|gb|EEF48061.1| hypothetical protein RCOM_1046780 [Ricinus communis]
>gi|147805325|emb|CAN63090.1| hypothetical protein VITISV_032017 [Vitis vinifera]
>gi|356556282|ref|XP_003546455.1| PREDICTED: putative GATA transcription factor 22-like [Glycine max]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology Terms Detected by BLAST

ID                    Length  Definition                      Q cover  H cover  Identity  E-value
Query                 232
TAIR|locus:2170277    398     GNC "GATA, nitrate-inducible,   0.133    0.077    0.903     1.3e-23
TAIR|locus:2120845    352     CGA1 "cytokinin-responsive gat  0.603    0.397    0.415     1.3e-20
TAIR|locus:2148558    120     GATA23 "GATA transcription fac  0.133    0.258    0.741     1.9e-14
TAIR|locus:504955441  197     AT4G16141 [Arabidopsis thalian  0.129    0.152    0.733     6.5e-13
TAIR|locus:2155919    139     GATA16 "GATA transcription fac  0.129    0.215    0.733     1.8e-12
TAIR|locus:2083388    149     GATA15 "GATA transcription fac  0.241    0.375    0.482     2.8e-12
TAIR|locus:2077932    295     MNP "MONOPOLE" [Arabidopsis th  0.400    0.315    0.356     4.2e-11
TAIR|locus:2093678    190     GATA17 "GATA transcription fac  0.568    0.694    0.326     4.5e-11
TAIR|locus:2115195    211     GATA19 "GATA transcription fac  0.129    0.142    0.8       1.2e-10
TAIR|locus:2062095    208     GATA20 "GATA transcription fac  0.129    0.144    0.766     3.2e-10
TAIR|locus:2170277 GNC "GATA, nitrate-inducible, carbon metabolism-involved" [Arabidopsis thaliana (taxid:3702)]
 Score = 172 (65.6 bits), Expect = 1.3e-23, Sum P(2) = 1.3e-23
 Identities = 28/31 (90%), Positives = 30/31 (96%)

Query:    92 VRICSDCNTTTTPLWRSGPRGPKSLCNACGI 122
             +R+CSDCNTT TPLWRSGPRGPKSLCNACGI
Sbjct:   229 IRVCSDCNTTKTPLWRSGPRGPKSLCNACGI 259


GO:0003700 "sequence-specific DNA binding transcription factor activity" evidence=IEA;ISS
GO:0005634 "nucleus" evidence=ISM
GO:0006355 "regulation of transcription, DNA-dependent" evidence=IEA;IMP
GO:0008270 "zinc ion binding" evidence=IEA
GO:0043565 "sequence-specific DNA binding" evidence=IEA
GO:0010255 "glucose mediated signaling pathway" evidence=IMP
GO:0051171 "regulation of nitrogen compound metabolic process" evidence=IEP;IMP
GO:0007623 "circadian rhythm" evidence=IEP
GO:0009416 "response to light stimulus" evidence=IEP
GO:0005515 "protein binding" evidence=IPI
GO:0009740 "gibberellic acid mediated signaling pathway" evidence=IEP
GO:0009910 "negative regulation of flower development" evidence=IMP
GO:0010187 "negative regulation of seed germination" evidence=IEP
GO:0010380 "regulation of chlorophyll biosynthetic process" evidence=IMP
GO:0010468 "regulation of gene expression" evidence=IDA
GO:0044212 "transcription regulatory region DNA binding" evidence=IDA
GO:0009965 "leaf morphogenesis" evidence=RCA
GO:0030154 "cell differentiation" evidence=RCA
GO:0045893 "positive regulation of transcription, DNA-dependent" evidence=RCA
TAIR|locus:2120845 CGA1 "cytokinin-responsive gata factor 1" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2148558 GATA23 "GATA transcription factor 23" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:504955441 AT4G16141 [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2155919 GATA16 "GATA transcription factor 16" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2083388 GATA15 "GATA transcription factor 15" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2077932 MNP "MONOPOLE" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2093678 GATA17 "GATA transcription factor 17" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2115195 GATA19 "GATA transcription factor 19" [Arabidopsis thaliana (taxid:3702)]
TAIR|locus:2062095 GATA20 "GATA transcription factor 20" [Arabidopsis thaliana (taxid:3702)]
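
Each GO annotation listed for the top hit (GNC) in this section follows one pattern: an accession, a quoted term name, and semicolon-separated evidence codes. The Python sketch below parses one such line; the function name `parse_go` and the regular expression are assumptions made for illustration and are not part of the annotation pipeline.

```python
import re

# Accession, quoted term name, then evidence codes separated by ';'.
GO_LINE = re.compile(r'^(GO:\d{7}) "([^"]+)" evidence=(\S+)$')

def parse_go(line):
    """Return (accession, term name, evidence codes) or None if the line does not match."""
    m = GO_LINE.match(line.strip())
    if m is None:
        return None
    accession, name, evidence = m.groups()
    return accession, name, evidence.split(";")

example = 'GO:0006355 "regulation of transcription, DNA-dependent" evidence=IEA;IMP'
print(parse_go(example))
# ('GO:0006355', 'regulation of transcription, DNA-dependent', ['IEA', 'IMP'])
```

Splitting the evidence field makes it easy to separate electronically inferred terms (IEA) from those with experimental codes such as IMP or IDA before transferring annotations to the query.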

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer in SWISS-PROT detected by BLAST

EC Number Prediction by EzyPred Server

Failed to connect to EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment; probably not an enzyme.


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Failed to connect to STRING server


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       232
smart00401  52      smart00401, ZnF_GATA, zinc finger binding to DNA c  3e-16
pfam00320   36      pfam00320, GATA, GATA zinc finger                   2e-15
cd00202     54      cd00202, ZnF_GATA, Zinc finger DNA binding domain;  1e-12
>gnl|CDD|214648 smart00401, ZnF_GATA, zinc finger binding to DNA consensus sequence [AT]GATA[AG]
 Score = 70.1 bits (172), Expect = 3e-16
 Identities = 22/45 (48%), Positives = 28/45 (62%)

Query: 90  NTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARKAMQAA 134
            + R CS+C TT TPLWR GP G K+LCNACG+  +K     +  
Sbjct: 1   GSGRSCSNCGTTETPLWRRGPSGNKTLCNACGLYYKKHGGLKRPL 45


Length = 52

>gnl|CDD|109380 pfam00320, GATA, GATA zinc finger
>gnl|CDD|238123 cd00202, ZnF_GATA, Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C
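
The cd00202 description gives the GATA zinc-finger topology as C-X(2)-C-X(17)-C-X(2)-C. A quick check against the query, sketched below in Python, suggests the spacing in this protein is C-X(2)-C-X(18)-C-X(2)-C (Cys95, Cys98, Cys117, Cys120), so the pattern is widened to accept a spacer of 17 or 18 residues. That widening and the variable names are assumptions made for this illustration.

```python
import re

# Query residues 90-134, copied from the smart00401 alignment above.
fragment = "NTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARKAMQAA"
offset = 90  # residue number of fragment[0]

# C-X(2)-C-X(17)-C-X(2)-C, widened to allow an 18-residue spacer (assumption).
pattern = re.compile(r"C.{2}C(.{17,18})C.{2}C")

m = pattern.search(fragment)
if m:
    start = offset + m.start()      # first cysteine of the finger
    end = offset + m.end() - 1      # last cysteine of the finger
    print(start, end, len(m.group(1)))  # 95 120 18
```

With the strict X(17) spacer the search finds nothing in this fragment, which is why the looser pattern is used here.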

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       232
PF00320     36      GATA: GATA zinc finger; InterPro: IPR000679 Zinc f  99.49
cd00202     54      ZnF_GATA Zinc finger DNA binding domain; binds spe  99.47
smart00401  52      ZnF_GATA zinc finger binding to DNA consensus sequ  99.46
KOG1601     340     consensus GATA-4/5/6 transcription factors [Transc  98.66
COG5641     498     GAT1 GATA Zn-finger-containing transcription facto  97.76
KOG3554     693     consensus Histone deacetylase complex, MTA1 compon  84.07
>PF00320 GATA: GATA zinc finger; InterPro: IPR000679 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule.
Probab=99.49  E-value=5.9e-15  Score=96.80  Aligned_cols=35  Identities=57%  Similarity=1.230  Sum_probs=28.2

Q ss_pred             ccCCCCCCCCcccCCCCCCcccchHHHHHHHHhhh
Q 026843           95 CSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARK  129 (232)
Q Consensus        95 C~~Cgtt~TP~WRrGP~G~~~LCNACGL~~~k~r~  129 (232)
                      |++|++++||+||+||.|+.+|||+||++|++.++
T Consensus         1 C~~C~tt~t~~WR~~~~g~~~LCn~Cg~~~kk~~~   35 (36)
T PF00320_consen    1 CSNCGTTETPQWRRGPNGNRTLCNACGLYYKKYGK   35 (36)
T ss_dssp             -TTT--ST-SSEEEETTSEE-EEHHHHHHHHHHSS
T ss_pred             CcCCcCCCCchhhcCCCCCCHHHHHHHHHHHHhCC
Confidence            89999999999999999988899999999998764



Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

This entry represents GATA-type zinc fingers (Znf). A number of transcription factors (including erythroid-specific transcription factor and nitrogen regulatory proteins) specifically bind the DNA sequence (A/T)GATA(A/G) in the regulatory regions of genes. They are consequently termed GATA-binding transcription factors. The interactions occur via highly conserved Znf domains in which the zinc ion is coordinated by 4 cysteine residues. NMR studies have shown the core of the Znf to comprise 2 irregular anti-parallel beta-sheets and an alpha-helix, followed by a long loop to the C-terminal end of the finger. The N-terminal part, which includes the helix, is similar in structure, but not sequence, to the N-terminal zinc module of the glucocorticoid receptor DNA-binding domain. The helix and the loop connecting the 2 beta-sheets interact with the major groove of the DNA, while the C-terminal tail wraps around into the minor groove. It is this tail that is the essential determinant of specific binding. Interactions between the Znf and DNA are mainly hydrophobic, explaining the preponderance of thymines in the binding site; a large number of interactions with the phosphate backbone have also been observed. Two GATA zinc fingers are found in the GATA transcription factors. However, there are several proteins that contain only a single copy of the domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.

GO: 0003700 sequence-specific DNA binding transcription factor activity, 0008270 zinc ion binding, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent

PDB: 3GAT_A 2GAT_A 1GAU_A 1GAT_A 1Y0J_A 1GNF_A 2L6Z_A 2L6Y_A 3DFV_D 3DFX_B ....
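
The `Probab=... E-value=...` summary line shown for the PF00320 hit above is a row of `key=value` fields, which makes it straightforward to load into a dictionary. The Python sketch below shows one way to do that; the function name `parse_hhsearch_stats` is hypothetical, and converting every field to a float is a simplifying assumption.

```python
def parse_hhsearch_stats(line):
    """Split an HHsearch summary line into a dict of floats (percent signs dropped)."""
    stats = {}
    for field in line.split():
        key, _, value = field.partition("=")
        stats[key] = float(value.rstrip("%"))
    return stats

line = ("Probab=99.49  E-value=5.9e-15  Score=96.80  Aligned_cols=35  "
        "Identities=57%  Similarity=1.230  Sum_probs=28.2")
stats = parse_hhsearch_stats(line)
print(stats["Probab"], stats["E-value"], stats["Aligned_cols"])  # 99.49 5.9e-15 35.0
```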

>cd00202 ZnF_GATA Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C
>smart00401 ZnF_GATA zinc finger binding to DNA consensus sequence [AT]GATA[AG]
>KOG1601 consensus GATA-4/5/6 transcription factors [Transcription]
>COG5641 GAT1 GATA Zn-finger-containing transcription factor [Transcription]
>KOG3554 consensus Histone deacetylase complex, MTA1 component [Chromatin structure and dynamics]

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                           E-value
Query   232
2vus_I  43      Crystal Structure Of Unliganded Nmra-Area Zinc Fing  4e-04
4gat_A  66      Solution Nmr Structure Of The Wild Type Dna Binding  8e-04
>pdb|2VUS|I Chain I, Crystal Structure Of Unliganded Nmra-Area Zinc Finger Complex Length = 43

Iteration: 1

 Score = 41.6 bits (96), Expect = 4e-04,   Method: Composition-based stats.
 Identities = 17/28 (60%), Positives = 21/28 (75%), Gaps = 1/28 (3%)

Query: 95  CSDCNTTTTPLWRSGPRGPKSLCNACGI 122
           C++C T TTPLWR  P G + LCNACG+
Sbjct: 4   CTNCFTQTTPLWRRNPEG-QPLCNACGL 30

>pdb|4GAT|A Chain A, Solution Nmr Structure Of The Wild Type Dna Binding Domain Of Area Complexed To A 13bp Dna Containing A Cgata Site, Regularized Mean Structure Length = 66

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   232
2kae_A  71      GATA-type transcription factor; zinc finger, GATA-  4e-13
1gnf_A  46      Transcription factor GATA-1; zinc finger, transcri  3e-12
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  2e-10
3dfx_A  63      Trans-acting T-cell-specific transcription factor   2e-08
4gat_A  66      Nitrogen regulatory protein AREA; DNA binding prot  2e-08
1vt4_I  1221    APAF-1 related killer DARK; drosophila apoptosome,  2e-04
1vt4_I  1221    APAF-1 related killer DARK; drosophila apoptosome,  7e-04
>2kae_A GATA-type transcription factor; zinc finger, GATA-type, DNA; NMR {Caenorhabditis elegans} Length = 71
 Score = 61.8 bits (150), Expect = 4e-13
 Identities = 17/41 (41%), Positives = 21/41 (51%)

Query: 86  SNSNNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRK 126
           S+ N     CS+C+ T T  WR+        CNAC I QRK
Sbjct: 2   SHMNKKSFQCSNCSVTETIRWRNIRSKEGIQCNACFIYQRK 42


>1gnf_A Transcription factor GATA-1; zinc finger, transcription regulation; NMR {Mus musculus} SCOP: g.39.1.1 PDB: 1y0j_A 2l6y_A 2l6z_A Length = 46
>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I* Length = 43
>3dfx_A Trans-acting T-cell-specific transcription factor GATA-3; activator, DNA-binding, metal-binding, nucleus; HET: DNA; 2.70A {Mus musculus} PDB: 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* Length = 63
>4gat_A Nitrogen regulatory protein AREA; DNA binding protein, transcription factor, zinc binding domain, complex (transcription regulation/DNA); HET: DNA; NMR {Emericella nidulans} SCOP: g.39.1.1 PDB: 5gat_A* 6gat_A* 7gat_A* Length = 66
>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221
>1vt4_I APAF-1 related killer DARK; drosophila apoptosome, apoptosis, programmed cell death; HET: DTP; 6.90A {Drosophila melanogaster} PDB: 3iz8_A* Length = 1221

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   232
3dfx_A  63      Trans-acting T-cell-specific transcription factor   99.59
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  99.59
1gnf_A  46      Transcription factor GATA-1; zinc finger, transcri  99.58
4gat_A  66      Nitrogen regulatory protein AREA; DNA binding prot  99.54
2kae_A  71      GATA-type transcription factor; zinc finger, GATA-  99.51
4hc9_A  115     Trans-acting T-cell-specific transcription factor;  99.3
4hc9_A  115     Trans-acting T-cell-specific transcription factor;  99.21
>3dfx_A Trans-acting T-cell-specific transcription factor GATA-3; activator, DNA-binding, metal-binding, nucleus; HET: DNA; 2.70A {Mus musculus} PDB: 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A*
Probab=99.59  E-value=8.2e-16  Score=111.85  Aligned_cols=40  Identities=38%  Similarity=0.816  Sum_probs=36.3

Q ss_pred             CCCCccccCCCCCCCCcccCCCCCCcccchHHHHHHHHhhh
Q 026843           89 NNTVRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKARK  129 (232)
Q Consensus        89 ~~~~~~C~~Cgtt~TP~WRrGP~G~~~LCNACGL~~~k~r~  129 (232)
                      ......|++|++++||+||+||+|+ +|||||||+|++...
T Consensus         4 ~~~~~~C~~C~tt~Tp~WR~gp~G~-~LCNACGl~~~~~~~   43 (63)
T 3dfx_A            4 RRAGTSCANCQTTTTTLWRRNANGD-PVCNACGLYYKLHNI   43 (63)
T ss_dssp             CCTTCCCTTTCCSCCSSCCCCTTSC-CCCHHHHHHHHHHSS
T ss_pred             CCCCCcCCCcCCCCCCccCCCCCCC-chhhHHHHHHHHcCC
Confidence            3567889999999999999999997 999999999998764



>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I*
>1gnf_A Transcription factor GATA-1; zinc finger, transcription regulation; NMR {Mus musculus} SCOP: g.39.1.1 PDB: 1y0j_A 2l6y_A 2l6z_A
>4gat_A Nitrogen regulatory protein AREA; DNA binding protein, transcription factor, zinc binding domain, complex (transcription regulation/DNA); HET: DNA; NMR {Emericella nidulans} SCOP: g.39.1.1 PDB: 5gat_A* 6gat_A* 7gat_A*
>2kae_A GATA-type transcription factor; zinc finger, GATA-type, DNA; NMR {Caenorhabditis elegans}
>4hc9_A Trans-acting T-cell-specific transcription factor; zinc finger, GATA transcription factor, DNA bridging, transc DNA complex; HET: DNA; 1.60A {Homo sapiens} PDB: 4hc7_A* 4hca_A* 3dfx_A* 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* 1gnf_A 1y0j_A 2l6y_A 2l6z_A
>4hc9_A Trans-acting T-cell-specific transcription factor; zinc finger, GATA transcription factor, DNA bridging, transc DNA complex; HET: DNA; 1.60A {Homo sapiens} PDB: 4hc7_A* 4hca_A* 3dfx_A* 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* 1gnf_A 1y0j_A 2l6y_A 2l6z_A

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                          E-value
Query    232
d1y0ja1  39      g.39.1.1 (A:200-238) Erythroid transcription facto  4e-12
d2vuti1  42      g.39.1.1 (I:671-712) Erythroid transcription facto  1e-09
d3gata_  66      g.39.1.1 (A:) Erythroid transcription factor GATA-  1e-07
>d1y0ja1 g.39.1.1 (A:200-238) Erythroid transcription factor GATA-1 {Mouse (Mus musculus) [TaxId: 10090]} Length = 39

class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: Erythroid transcription factor GATA-1
domain: Erythroid transcription factor GATA-1
species: Mouse (Mus musculus) [TaxId: 10090]
 Score = 57.0 bits (138), Expect = 4e-12
 Identities = 16/34 (47%), Positives = 19/34 (55%), Gaps = 1/34 (2%)

Query: 93  RICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRK 126
           R C +C  T TPLWR    G   LCNACG+  + 
Sbjct: 3   RECVNCGATATPLWRRDRTG-HYLCNACGLYHKM 35


>d2vuti1 g.39.1.1 (I:671-712) Erythroid transcription factor GATA-1 {Emericella nidulans [TaxId: 162425]} Length = 42
>d3gata_ g.39.1.1 (A:) Erythroid transcription factor GATA-1 {Chicken (Gallus gallus) [TaxId: 9031]} Length = 66

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    232
d1y0ja1  39      Erythroid transcription factor GATA-1 {Mouse (Mus   99.65
d2vuti1  42      Erythroid transcription factor GATA-1 {Emericella   99.62
d3gata_  66      Erythroid transcription factor GATA-1 {Chicken (Ga  99.54
>d1y0ja1 g.39.1.1 (A:200-238) Erythroid transcription factor GATA-1 {Mouse (Mus musculus) [TaxId: 10090]}
class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: Erythroid transcription factor GATA-1
domain: Erythroid transcription factor GATA-1
species: Mouse (Mus musculus) [TaxId: 10090]
Probab=99.65  E-value=2.1e-17  Score=108.79  Aligned_cols=35  Identities=46%  Similarity=1.051  Sum_probs=32.5

Q ss_pred             CccccCCCCCCCCcccCCCCCCcccchHHHHHHHHh
Q 026843           92 VRICSDCNTTTTPLWRSGPRGPKSLCNACGIRQRKA  127 (232)
Q Consensus        92 ~~~C~~Cgtt~TP~WRrGP~G~~~LCNACGL~~~k~  127 (232)
                      .+.|++|++++||+||+||+| ++|||||||+|+..
T Consensus         2 ~r~C~~Cgtt~Tp~WR~gp~G-~~LCNACGl~~r~~   36 (39)
T d1y0ja1           2 ARECVNCGATATPLWRRDRTG-HYLCNACGLYHKMN   36 (39)
T ss_dssp             CCCCSSSCCCCCSCCEECTTS-CEECSSHHHHHHHS
T ss_pred             cCCCCCCCCCCCcccccCCCC-CEeeHHhHHHHHHh
Confidence            578999999999999999999 68999999999864



>d2vuti1 g.39.1.1 (I:671-712) Erythroid transcription factor GATA-1 {Emericella nidulans [TaxId: 162425]}
>d3gata_ g.39.1.1 (A:) Erythroid transcription factor GATA-1 {Chicken (Gallus gallus) [TaxId: 9031]}