Citrus Sinensis ID: 032112


Local Sequence Feature Prediction

Prediction (Method) and Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------
MMDPSDKGFESDEVNSSGSKRLDGVSSDENSIKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRAILGITKEEKKSKRGNSNSSSNSSSNKLGDSLKQRLYALGREVLMQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHccHHHHccccHHHHHHccccccccccccccccccHHHHHHHcccHHHHHcccHHHHccccccHHHHHHHHHHHHHHccccc
cccccccccccccccccccccccccccccccccEEEcccccccccccccccccccHHHHHHcHHHHHHHHHHHcccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHcccHHHHHHHHHHHHccccEcc
mmdpsdkgfesdevnssgskrldgvssdensikktcadcgttktplwrggpagpkslcnacgirsrkkRRAILGItkeekkskrgnsnsssnsssnklgdSLKQRLYALGREVLMQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA
mmdpsdkgfesdevnssgskrldgvssdensikktcadcgttktplwrggpagpkslcnacgirsrkkrrailgitkeekkskrgnsnsssnsssnklgdsLKQRLYALGREVLMQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA
MMDPSDKGFESDEVNSSGSKRLDGVSSDENSIKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRAILGITKEEkkskrgnsnsssnsssnkLGDSLKQRLYALGREVLMQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA
***********************************CADCGTTKTPLWRGGPAGPKSLCNACGIR*********************************************************************AVLLMALSY*****
*************************************DCGTTKTPLWRGGPAGPKSLCNACGIRS********************************************************************AVLLMALSYGSVYA
******************************SIKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRAILGITKE********************GDSLKQRLYALGREVLMQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA
*******************************IKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRA*************************************************EKQRKTLGEEEQAAVLLMALSYGSVYA
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MMDPSDKGFESDEVNSSGSKRLDGVSSDENSIKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRAILGITKEEKKSKRGNSNSSSNSSSNKLGDSLKQRLYALGREVLMQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA
no confident homologs detected
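
The per-residue tracks above are easier to compare once each one is collapsed into runs of identical states. A minimal Python sketch, assuming each track is a plain string with one character per residue; the helper name `track_segments` and the short demo string are illustrative and are not taken from the report:

```python
from itertools import groupby


def track_segments(track):
    """Collapse a per-residue track into (state, start, end) runs.

    Positions are 1-based and inclusive, matching the residue ruler.
    """
    segments = []
    pos = 1
    for state, run in groupby(track):
        length = len(list(run))
        segments.append((state, pos, pos + length - 1))
        pos += length
    return segments


# Illustrative toy track (hypothetical, not copied from the report).
demo = "cccHHHHccEEc"
for state, start, end in track_segments(demo):
    print(state, start, end)
```

The same function can be applied to any of the tracks, for example to list the `H` runs of a secondary-structure line or the `h` runs of a transmembrane line.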

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   147     2.2.26 [Sep-21-2011]
Q8LG10  149     GATA transcription factor  yes       no        0.925    0.912    0.562     4e-39
Q9FJ10  139     GATA transcription factor  no        no        0.931    0.985    0.577     1e-36
Q9LIB5  190     GATA transcription factor  no        no        0.782    0.605    0.48      6e-26
Q8LC59  120     GATA transcription factor  no        no        0.326    0.4      0.653     1e-13
Q5HZ36  398     GATA transcription factor  no        no        0.272    0.100    0.707     2e-13
Q9SZI6  352     Putative GATA transcripti  no        no        0.877    0.366    0.335     3e-13
Q9ZPX0  208     GATA transcription factor  no        no        0.278    0.197    0.658     3e-11
Q6QPM2  211     GATA transcription factor  no        no        0.285    0.199    0.642     3e-11
Q8LC79  295     GATA transcription factor  no        no        0.306    0.152    0.6       1e-10
Q6DBP8  303     GATA transcription factor  no        no        0.299    0.145    0.568     9e-10
>sp|Q8LG10|GAT15_ARATH GATA transcription factor 15 OS=Arabidopsis thaliana GN=GATA15 PE=2 SV=2 Back     alignment and function description
 Score =  160 bits (404), Expect = 4e-39,   Method: Compositional matrix adjust.
 Identities = 90/160 (56%), Positives = 107/160 (66%), Gaps = 24/160 (15%)

Query: 1   MMDPSDKGFESDEVNSSGSKRLDGVSSDENSI-----------KKTCADCGTTKTPLWRG 49
           M+DP++K  +S+ + S    +L  V + E              KK+CA CGT+KTPLWRG
Sbjct: 1   MLDPTEKVIDSESMES----KLTSVDAIEEHSSSSSNEAISNEKKSCAICGTSKTPLWRG 56

Query: 50  GPAGPKSLCNACGIRSRKKRRAILGITKEEKKSKRGNSNSSSNSSSNKLGDSLKQRLYAL 109
           GPAGPKSLCNACGIR+RKKRR ++    E+KK K  N N        K GDSLKQRL  L
Sbjct: 57  GPAGPKSLCNACGIRNRKKRRTLISNRSEDKKKKSHNRNP-------KFGDSLKQRLMEL 109

Query: 110 GREVLMQRSSVEKQRKT-LGEEEQAAVLLMALSYG-SVYA 147
           GREV+MQRS+ E QR+  LGEEEQAAVLLMALSY  SVYA
Sbjct: 110 GREVMMQRSTAENQRRNKLGEEEQAAVLLMALSYASSVYA 149
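
The identity and coverage figures listed for this hit can be reproduced from the alignment statistics. A small Python sketch, assuming that coverage means the number of ungapped alignment columns divided by the sequence length; that definition is inferred from the reported numbers and is not stated in the report, and the variable names are illustrative:

```python
# Values read from the Q8LG10 alignment block.
aligned_columns = 160   # denominator of "Identities = 90/160"
identical = 90          # numerator of "Identities = 90/160"
gap_columns = 24        # numerator of "Gaps = 24/160"
query_len = 147         # query length
hit_len = 149           # Q8LG10 length

# Assumption: only columns with a residue in both sequences count toward coverage.
ungapped = aligned_columns - gap_columns

identity = identical / aligned_columns
q_cover = ungapped / query_len
h_cover = ungapped / hit_len

print(f"identity={identity:.4f} q_cover={q_cover:.4f} h_cover={h_cover:.4f}")
```

Under this assumption the three ratios agree with the reported 0.562, 0.925 and 0.912 once the last digit is truncated rather than rounded.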




Transcriptional regulator that specifically binds 5'-GATA-3' or 5'-GAT-3' motifs within gene promoters.
Arabidopsis thaliana (taxid: 3702)
>sp|Q9FJ10|GAT16_ARATH GATA transcription factor 16 OS=Arabidopsis thaliana GN=GATA16 PE=2 SV=1 Back     alignment and function description
>sp|Q9LIB5|GAT17_ARATH GATA transcription factor 17 OS=Arabidopsis thaliana GN=GATA17 PE=2 SV=1 Back     alignment and function description
>sp|Q8LC59|GAT23_ARATH GATA transcription factor 23 OS=Arabidopsis thaliana GN=GATA23 PE=2 SV=2 Back     alignment and function description
>sp|Q5HZ36|GAT21_ARATH GATA transcription factor 21 OS=Arabidopsis thaliana GN=GATA21 PE=1 SV=2 Back     alignment and function description
>sp|Q9SZI6|GAT22_ARATH Putative GATA transcription factor 22 OS=Arabidopsis thaliana GN=GATA22 PE=2 SV=1 Back     alignment and function description
>sp|Q9ZPX0|GAT20_ARATH GATA transcription factor 20 OS=Arabidopsis thaliana GN=GATA20 PE=2 SV=2 Back     alignment and function description
>sp|Q6QPM2|GAT19_ARATH GATA transcription factor 19 OS=Arabidopsis thaliana GN=GATA19 PE=2 SV=2 Back     alignment and function description
>sp|Q8LC79|GAT18_ARATH GATA transcription factor 18 OS=Arabidopsis thaliana GN=GATA18 PE=2 SV=2 Back     alignment and function description
>sp|Q6DBP8|GAT11_ARATH GATA transcription factor 11 OS=Arabidopsis thaliana GN=GATA11 PE=2 SV=1 Back     alignment and function description
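
The SWISS-PROT hit headers above share one layout (accession, entry name, description, then `OS=`, `GN=`, `PE=` and `SV=` fields), so they can be split into fields with a regular expression. A hedged Python sketch; the function name `parse_swissprot_header` is hypothetical, and the pattern assumes every field is present, so a header without `GN=` would return `None`:

```python
import re

# Pattern for headers of the form seen above:
# >sp|ACCESSION|ENTRY description OS=organism GN=gene PE=n SV=n
HEADER_RE = re.compile(
    r"^>?sp\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+(?P<description>.*?)"
    r"\s+OS=(?P<organism>.*?)\s+GN=(?P<gene>\S+)\s+PE=(?P<pe>\d+)\s+SV=(?P<sv>\d+)"
)


def parse_swissprot_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    match = HEADER_RE.match(line)
    return match.groupdict() if match else None


header = (
    ">sp|Q8LG10|GAT15_ARATH GATA transcription factor 15 "
    "OS=Arabidopsis thaliana GN=GATA15 PE=2 SV=2"
)
fields = parse_swissprot_header(header)
print(fields["accession"], fields["gene"], fields["organism"])
```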

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      147
225431869  153     PREDICTED: GATA transcription factor 16   1.0      0.960    0.660     4e-47
449432898  148     PREDICTED: GATA transcription factor 16-  0.972    0.966    0.708     2e-45
255556286  149     GATA transcription factor, putative [Ric  0.952    0.939    0.634     5e-43
449432896  151     PREDICTED: GATA transcription factor 16-  0.931    0.907    0.696     3e-41
224110254  125     predicted protein [Populus trichocarpa]   0.761    0.896    0.685     3e-39
18397703   149     GATA transcription factor 15 [Arabidopsi  0.925    0.912    0.562     2e-37
297829216  137     hypothetical protein ARALYDRAFT_477989 [  0.775    0.832    0.661     1e-36
388516305  144     unknown [Lotus japonicus]                 0.952    0.972    0.589     4e-36
7549639    136     hypothetical protein [Arabidopsis thalia  0.734    0.794    0.692     5e-36
21536761   136     unknown [Arabidopsis thaliana]            0.734    0.794    0.683     9e-36
>gi|225431869|ref|XP_002275498.1| PREDICTED: GATA transcription factor 16 [Vitis vinifera] gi|296083288|emb|CBI22924.3| unnamed protein product [Vitis vinifera] Back     alignment and taxonomy information
 Score =  192 bits (488), Expect = 4e-47,   Method: Compositional matrix adjust.
 Identities = 101/153 (66%), Positives = 122/153 (79%), Gaps = 6/153 (3%)

Query: 1   MMDPSDKGFESDEVNSSGSKRLDGVSSDENSIKKTCADCGTTKTPLWRGGPAGPKSLCNA 60
           M+D S+KG ES+++N+     +    S  N  KKTCADCGTTKTPLWRGGPAGPKSLCNA
Sbjct: 1   MVDLSEKGSESEDMNNKNPDAVSSAESQVNEPKKTCADCGTTKTPLWRGGPAGPKSLCNA 60

Query: 61  CGIRSRKKRRAILGITK---EEKKSKR---GNSNSSSNSSSNKLGDSLKQRLYALGREVL 114
           CGIRSRKKRRA LG+ K   +++K+KR    + N+   + +NKLGDSLK+RL+ALGREVL
Sbjct: 61  CGIRSRKKRRAFLGLNKGSTDDRKAKRSSNHSHNNGGGNGNNKLGDSLKRRLFALGREVL 120

Query: 115 MQRSSVEKQRKTLGEEEQAAVLLMALSYGSVYA 147
           +QRS+VEKQR+ LGEEEQAAVLLMALSYG VYA
Sbjct: 121 LQRSTVEKQRRKLGEEEQAAVLLMALSYGYVYA 153




Source: Vitis vinifera

Species: Vitis vinifera

Genus: Vitis

Family: Vitaceae

Order: Vitales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|449432898|ref|XP_004134235.1| PREDICTED: GATA transcription factor 16-like isoform 2 [Cucumis sativus] Back     alignment and taxonomy information
>gi|255556286|ref|XP_002519177.1| GATA transcription factor, putative [Ricinus communis] gi|223541492|gb|EEF43041.1| GATA transcription factor, putative [Ricinus communis] Back     alignment and taxonomy information
>gi|449432896|ref|XP_004134234.1| PREDICTED: GATA transcription factor 16-like isoform 1 [Cucumis sativus] Back     alignment and taxonomy information
>gi|224110254|ref|XP_002315462.1| predicted protein [Populus trichocarpa] gi|222864502|gb|EEF01633.1| predicted protein [Populus trichocarpa] Back     alignment and taxonomy information
>gi|18397703|ref|NP_566290.1| GATA transcription factor 15 [Arabidopsis thaliana] gi|71660789|sp|Q8LG10.2|GAT15_ARATH RecName: Full=GATA transcription factor 15 gi|17380940|gb|AAL36282.1| unknown protein [Arabidopsis thaliana] gi|20258947|gb|AAM14189.1| unknown protein [Arabidopsis thaliana] gi|332640929|gb|AEE74450.1| GATA transcription factor 15 [Arabidopsis thaliana] Back     alignment and taxonomy information
>gi|297829216|ref|XP_002882490.1| hypothetical protein ARALYDRAFT_477989 [Arabidopsis lyrata subsp. lyrata] gi|297328330|gb|EFH58749.1| hypothetical protein ARALYDRAFT_477989 [Arabidopsis lyrata subsp. lyrata] Back     alignment and taxonomy information
>gi|388516305|gb|AFK46214.1| unknown [Lotus japonicus] Back     alignment and taxonomy information
>gi|7549639|gb|AAF63824.1| hypothetical protein [Arabidopsis thaliana] Back     alignment and taxonomy information
>gi|21536761|gb|AAM61093.1| unknown [Arabidopsis thaliana] Back     alignment and taxonomy information

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                    Length  Definition                      Q cover  H cover  Identity  E-value
Query                 147
TAIR|locus:2083388    149     GATA15 "GATA transcription fac  0.945    0.932    0.541     1.2e-33
TAIR|locus:2155919    139     GATA16 "GATA transcription fac  0.931    0.985    0.530     4.7e-32
TAIR|locus:2093678    190     GATA17 "GATA transcription fac  0.469    0.363    0.623     2.3e-29
TAIR|locus:504955441  197     AT4G16141 [Arabidopsis thalian  0.448    0.335    0.6       1.2e-26
TAIR|locus:2120845    352     CGA1 "cytokinin-responsive gat  0.469    0.196    0.541     2.4e-20
TAIR|locus:2148558    120     GATA23 "GATA transcription fac  0.326    0.4      0.653     7.3e-20
TAIR|locus:2170277    398     GNC "GATA, nitrate-inducible,   0.319    0.118    0.680     1.8e-19
TAIR|locus:2115195    211     GATA19 "GATA transcription fac  0.360    0.251    0.574     4.4e-13
TAIR|locus:2062095    208     GATA20 "GATA transcription fac  0.278    0.197    0.658     5e-12
TAIR|locus:2077932    295     MNP "MONOPOLE" [Arabidopsis th  0.414    0.206    0.491     8.7e-12
TAIR|locus:2083388 GATA15 "GATA transcription factor 15" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
 Score = 366 (133.9 bits), Expect = 1.2e-33, P = 1.2e-33
 Identities = 85/157 (54%), Positives = 103/157 (65%)

Query:     1 MMDPSDKGFESDEVNSSGSKRLDGV----SSDENSI----KKTCADCGTTKTPLWRGGPA 52
             M+DP++K  +S+ + S  +  +D +    SS  N      KK+CA CGT+KTPLWRGGPA
Sbjct:     1 MLDPTEKVIDSESMESKLTS-VDAIEEHSSSSSNEAISNEKKSCAICGTSKTPLWRGGPA 59

Query:    53 GPKSLCNACGIRSRKKRRAILGITKEEXXXXXXXXXXXXXXXXXXLGDSLKQRLYALGRE 112
             GPKSLCNACGIR+RKKRR ++    E+                   GDSLKQRL  LGRE
Sbjct:    60 GPKSLCNACGIRNRKKRRTLISNRSEDKKKKSHNRNPK-------FGDSLKQRLMELGRE 112

Query:   113 VLMQRSSVEKQRKT-LGEEEQAAVLLMALSYGS-VYA 147
             V+MQRS+ E QR+  LGEEEQAAVLLMALSY S VYA
Sbjct:   113 VMMQRSTAENQRRNKLGEEEQAAVLLMALSYASSVYA 149




GO:0003700 "sequence-specific DNA binding transcription factor activity" evidence=IEA;ISS
GO:0005634 "nucleus" evidence=ISM
GO:0006355 "regulation of transcription, DNA-dependent" evidence=IEA
GO:0008270 "zinc ion binding" evidence=IEA
GO:0043565 "sequence-specific DNA binding" evidence=IEA
GO:0009407 "toxin catabolic process" evidence=RCA
TAIR|locus:2155919 GATA16 "GATA transcription factor 16" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2093678 GATA17 "GATA transcription factor 17" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:504955441 AT4G16141 [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2120845 CGA1 "cytokinin-responsive gata factor 1" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2148558 GATA23 "GATA transcription factor 23" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2170277 GNC "GATA, nitrate-inducible, carbon metabolism-involved" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2115195 GATA19 "GATA transcription factor 19" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2062095 GATA20 "GATA transcription factor 20" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms
TAIR|locus:2077932 MNP "MONOPOLE" [Arabidopsis thaliana (taxid:3702)] Back     alignment and assigned GO terms

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name         Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q8LG10  GAT15_ARATH  No assigned EC number  0.5625    0.9251          0.9127        yes       no

EC Number Prediction by Ezypred Server

Failed to connect to the Ezypred server.

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Your Input:
gw1.X.3599.1
hypothetical protein (118 aa)
(Populus trichocarpa)
Predicted Functional Partners:
 
Sorry, there are no predicted associations at the current settings.
 

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       147
pfam00320   36      pfam00320, GATA, GATA zinc finger                   3e-16
smart00401  52      smart00401, ZnF_GATA, zinc finger binding to DNA c  4e-14
cd00202     54      cd00202, ZnF_GATA, Zinc finger DNA binding domain;  3e-12
>gnl|CDD|109380 pfam00320, GATA, GATA zinc finger Back     alignment and domain information
 Score = 66.9 bits (164), Expect = 3e-16
 Identities = 22/35 (62%), Positives = 27/35 (77%)

Query: 36 CADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRR 70
          C++CGTTKTPLWR GP G ++LCNACG+  RK   
Sbjct: 1  CSNCGTTKTPLWRRGPDGNRTLCNACGLYYRKHGL 35


This domain uses four cysteine residues to coordinate a zinc ion. This domain binds to DNA. Two GATA zinc fingers are found in the GATA transcription factors. However, several proteins contain only a single copy of the domain. Length = 36

>gnl|CDD|214648 smart00401, ZnF_GATA, zinc finger binding to DNA consensus sequence [AT]GATA[AG] Back     alignment and domain information
>gnl|CDD|238123 cd00202, ZnF_GATA, Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Back     alignment and domain information
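
The cd00202 entry above gives the zinc-finger topology as C-X(2)-C-X(17)-C-X(2)-C. The spacing in the query can be checked with a regular expression over the aligned region. This is a sketch only: the region string is query residues 36-70 copied from the pfam00320 alignment, and allowing the central loop to span 17 to 20 residues is an assumption of this example, not something the report specifies.

```python
import re

# Query residues 36-70, copied from the pfam00320 alignment above.
region = "CADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRR"
region_start = 36  # 1-based position of region[0] in the full query

# C-X(2)-C-X(n)-C-X(2)-C with the loop captured so its length can be reported.
# The 17-20 range for n is an assumption made for this sketch.
finger = re.compile(r"C.{2}C(.{17,20})C.{2}C")

m = finger.search(region)
if m:
    first = region_start + m.start()
    last = region_start + m.end() - 1
    print(f"finger spans {first}-{last}, loop length {len(m.group(1))}")
else:
    print("no GATA-type finger found")
```

For this query the four cysteines fall at positions 36, 39, 58 and 61, so the loop between the second and third cysteine is 18 residues long, one more than the X(17) in the consensus.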

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       147
PF00320     36      GATA: GATA zinc finger; InterPro: IPR000679 Zinc f  99.58
cd00202     54      ZnF_GATA Zinc finger DNA binding domain; binds spe  99.58
smart00401  52      ZnF_GATA zinc finger binding to DNA consensus sequ  99.57
KOG1601     340     consensus GATA-4/5/6 transcription factors [Transc  98.87
COG5641     498     GAT1 GATA Zn-finger-containing transcription facto  98.71
COG5641     498     GAT1 GATA Zn-finger-containing transcription facto  95.83
KOG3554     693     consensus Histone deacetylase complex, MTA1 compon  94.15
PF14803     34      Nudix_N_2: Nudix N-terminal; PDB: 3CNG_C.           84.1
PF01412     116     ArfGap: Putative GTPase activating protein for Arf  81.38
>PF00320 GATA: GATA zinc finger; InterPro: IPR000679 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule Back     alignment and domain information
Probab=99.58  E-value=3.7e-16  Score=95.35  Aligned_cols=35  Identities=54%  Similarity=1.272  Sum_probs=28.3

Q ss_pred             cccCCCCCCCccccCCCCCcccchhHHHHHHhhcc
Q 032112           36 CADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRR   70 (147)
Q Consensus        36 C~nCgtt~Tp~WR~gp~G~~~LCNaCGl~~~k~~~   70 (147)
                      |+||+|++||+||++|.|+.+|||+||+||+++++
T Consensus         1 C~~C~tt~t~~WR~~~~g~~~LCn~Cg~~~kk~~~   35 (36)
T PF00320_consen    1 CSNCGTTETPQWRRGPNGNRTLCNACGLYYKKYGK   35 (36)
T ss_dssp             -TTT--ST-SSEEEETTSEE-EEHHHHHHHHHHSS
T ss_pred             CcCCcCCCCchhhcCCCCCCHHHHHHHHHHHHhCC
Confidence            89999999999999999987799999999998865



Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

This entry represents GATA-type zinc fingers (Znf). A number of transcription factors (including erythroid-specific transcription factor and nitrogen regulatory proteins) specifically bind the DNA sequence (A/T)GATA(A/G) in the regulatory regions of genes. They are consequently termed GATA-binding transcription factors. The interactions occur via highly conserved Znf domains in which the zinc ion is coordinated by 4 cysteine residues. NMR studies have shown the core of the Znf to comprise 2 irregular anti-parallel beta-sheets and an alpha-helix, followed by a long loop to the C-terminal end of the finger. The N-terminal part, which includes the helix, is similar in structure, but not sequence, to the N-terminal zinc module of the glucocorticoid receptor DNA-binding domain. The helix and the loop connecting the 2 beta-sheets interact with the major groove of the DNA, while the C-terminal tail wraps around into the minor groove. It is this tail that is the essential determinant of specific binding. Interactions between the Znf and DNA are mainly hydrophobic, explaining the preponderance of thymines in the binding site; a large number of interactions with the phosphate backbone have also been observed. Two GATA zinc fingers are found in the GATA transcription factors. However, several proteins contain only a single copy of the domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.

GO: 0003700 sequence-specific DNA binding transcription factor activity, 0008270 zinc ion binding, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent

PDB: 3GAT_A 2GAT_A 1GAU_A 1GAT_A 1Y0J_A 1GNF_A 2L6Z_A 2L6Y_A 3DFV_D 3DFX_B ....

>cd00202 ZnF_GATA Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Back     alignment and domain information
>smart00401 ZnF_GATA zinc finger binding to DNA consensus sequence [AT]GATA[AG] Back     alignment and domain information
>KOG1601 consensus GATA-4/5/6 transcription factors [Transcription] Back     alignment and domain information
>COG5641 GAT1 GATA Zn-finger-containing transcription factor [Transcription] Back     alignment and domain information
>COG5641 GAT1 GATA Zn-finger-containing transcription factor [Transcription] Back     alignment and domain information
>KOG3554 consensus Histone deacetylase complex, MTA1 component [Chromatin structure and dynamics] Back     alignment and domain information
>PF14803 Nudix_N_2: Nudix N-terminal; PDB: 3CNG_C Back     alignment and domain information
>PF01412 ArfGap: Putative GTPase activating protein for Arf; InterPro: IPR001164 This entry describes a family of small GTPase activating proteins, for example ARF1-directed GTPase-activating protein, the cycle control GTPase activating protein (GAP) GCS1 which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins [] Back     alignment and domain information

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                           E-value
Query   147
2vus_I  43      Crystal Structure Of Unliganded Nmra-Area Zinc Fing  7e-05
4gat_A  66      Solution Nmr Structure Of The Wild Type Dna Binding  3e-04
7gat_A  66      Solution Nmr Structure Of The L22v Mutant Dna Bindi  8e-04
>pdb|2VUS|I Chain I, Crystal Structure Of Unliganded Nmra-Area Zinc Finger Complex Length = 43 Back     alignment and structure

Iteration: 1

 Score = 42.7 bits (99), Expect = 7e-05,   Method: Composition-based stats.
 Identities = 17/29 (58%), Positives = 20/29 (68%), Gaps = 1/29 (3%)

Query: 35 TCADCGTTKTPLWRGGPAGPKSLCNACGI 63
          TC +C T  TPLWR  P G + LCNACG+
Sbjct: 3  TCTNCFTQTTPLWRRNPEG-QPLCNACGL 30

>pdb|4GAT|A Chain A, Solution Nmr Structure Of The Wild Type Dna Binding Domain Of Area Complexed To A 13bp Dna Containing A Cgata Site, Regularized Mean Structure Length = 66 Back     alignment and structure
>pdb|7GAT|A Chain A, Solution Nmr Structure Of The L22v Mutant Dna Binding Domain Of Area Complexed To A 13 Bp Dna Containing A Tgata Site, 34 Structures Length = 66 Back     alignment and structure

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   147
2kae_A  71      GATA-type transcription factor; zinc finger, GATA-  1e-16
1gnf_A  46      Transcription factor GATA-1; zinc finger, transcri  5e-14
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  2e-12
4gat_A  66      Nitrogen regulatory protein AREA; DNA binding prot  1e-09
3dfx_A  63      Trans-acting T-cell-specific transcription factor   1e-09
>2kae_A GATA-type transcription factor; zinc finger, GATA-type, DNA; NMR {Caenorhabditis elegans} Length = 71 Back     alignment and structure
 Score = 68.8 bits (168), Expect = 1e-16
 Identities = 20/69 (28%), Positives = 31/69 (44%), Gaps = 2/69 (2%)

Query: 27 SDENSIKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRK--KRRAILGITKEEKKSKR 84
          S  N     C++C  T+T  WR   +     CNAC I  RK  K R +  + K +K+  +
Sbjct: 2  SHMNKKSFQCSNCSVTETIRWRNIRSKEGIQCNACFIYQRKYNKTRPVTAVNKYQKRKLK 61

Query: 85 GNSNSSSNS 93
              +  +S
Sbjct: 62 VQETNGVDS 70


>1gnf_A Transcription factor GATA-1; zinc finger, transcription regulation; NMR {Mus musculus} SCOP: g.39.1.1 PDB: 1y0j_A 2l6y_A 2l6z_A Length = 46 Back     alignment and structure
>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I* Length = 43 Back     alignment and structure
>4gat_A Nitrogen regulatory protein AREA; DNA binding protein, transcription factor, zinc binding domain, complex (transcription regulation/DNA); HET: DNA; NMR {Emericella nidulans} SCOP: g.39.1.1 PDB: 5gat_A* 6gat_A* 7gat_A* Length = 66 Back     alignment and structure
>3dfx_A Trans-acting T-cell-specific transcription factor GATA-3; activator, DNA-binding, metal-binding, nucleus; HET: DNA; 2.70A {Mus musculus} PDB: 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* Length = 63 Back     alignment and structure

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   147
4hc9_A  115     Trans-acting T-cell-specific transcription factor;  99.86
3dfx_A  63      Trans-acting T-cell-specific transcription factor   99.85
4gat_A  66      Nitrogen regulatory protein AREA; DNA binding prot  99.81
2kae_A  71      GATA-type transcription factor; zinc finger, GATA-  99.76
1gnf_A  46      Transcription factor GATA-1; zinc finger, transcri  99.72
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  99.7
4hc9_A  115     Trans-acting T-cell-specific transcription factor;  99.63
2vut_I  43      AREA, nitrogen regulatory protein AREA; transcript  87.09
>4hc9_A Trans-acting T-cell-specific transcription factor; zinc finger, GATA transcription factor, DNA bridging, transc DNA complex; HET: DNA; 1.60A {Homo sapiens} PDB: 4hc7_A* 4hca_A* 3dfx_A* 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* 1gnf_A 1y0j_A 2l6y_A 2l6z_A Back     alignment and structure
Probab=99.86  E-value=4.5e-23  Score=153.35  Aligned_cols=97  Identities=25%  Similarity=0.545  Sum_probs=67.9

Q ss_pred             CCCCccccCCCCCCCccccCCCCCcccchhHHHHHHhhcccccccCchhhhh-------cccCCCCCCCC--CCCC----
Q 032112           31 SIKKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRAILGITKEEKKS-------KRGNSNSSSNS--SSNK----   97 (147)
Q Consensus        31 ~~~~~C~nCgtt~Tp~WR~gp~G~~~LCNaCGl~~~k~~~~~~~~~~~~~k~-------k~~~~~~~~~~--~~~~----   97 (147)
                      ...++|+||+|++||+||+|++| .+|||||||||+.+++.++.+++..+..       .+.++....++  +...    
T Consensus         3 ~~~~~C~~Cg~~~Tp~WRr~~~g-~~lCnaCgl~~Kl~G~nRP~~KpKKR~~~~~~~~~~C~~C~t~~tp~WRr~~~g~~   81 (115)
T 4hc9_A            3 HMGRECVNCGATSTPLWRRDGTG-HYLCNACGLYHKMNGQNRPLIKPKRRLSAARRAGTSCANCQTTTTTLWRRNANGDP   81 (115)
T ss_dssp             ---CCCTTTCCSCCSSCEECTTS-CEECHHHHHHHHHHSSCCCCSSCCCCCCCCCCTTCCCTTTCCSCCSSCEECTTSCE
T ss_pred             CCCCCCCCCCCccCCcceECCCC-CCcCcchhhhhhhcccccccccccccccccccccccCCCcCCCCcceeEECCCCCC
Confidence            45789999999999999999999 5999999999999987775443322111       12233332221  2121    


Q ss_pred             --CCCcchhhhhccCcchhhhhhhHHHHhhhcC
Q 032112           98 --LGDSLKQRLYALGREVLMQRSSVEKQRKTLG  128 (147)
Q Consensus        98 --~~~~l~~kl~~v~r~~~mkk~~i~~rkrkl~  128 (147)
                        ..||||+++|++.||+.|+++.|++|+||++
T Consensus        82 lCNaCgl~~~~~~~~rp~~~~~~~i~~r~r~~s  114 (115)
T 4hc9_A           82 VCNACGLYYKLHNINRPLTMKKEGIQTRNRKMS  114 (115)
T ss_dssp             ECHHHHHHHHHHSSCCCGGGCCSSCCCCC----
T ss_pred             cchHHHHHHHHhCCCCCccccccchhhcccccc
Confidence              2269999999999999999999999999986



>3dfx_A Trans-acting T-cell-specific transcription factor GATA-3; activator, DNA-binding, metal-binding, nucleus; HET: DNA; 2.70A {Mus musculus} PDB: 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* Back     alignment and structure
>4gat_A Nitrogen regulatory protein AREA; DNA binding protein, transcription factor, zinc binding domain, complex (transcription regulation/DNA); HET: DNA; NMR {Emericella nidulans} SCOP: g.39.1.1 PDB: 5gat_A* 6gat_A* 7gat_A* Back     alignment and structure
>2kae_A GATA-type transcription factor; zinc finger, GATA-type, DNA; NMR {Caenorhabditis elegans} Back     alignment and structure
>1gnf_A Transcription factor GATA-1; zinc finger, transcription regulation; NMR {Mus musculus} SCOP: g.39.1.1 PDB: 1y0j_A 2l6y_A 2l6z_A Back     alignment and structure
>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I* Back     alignment and structure
>4hc9_A Trans-acting T-cell-specific transcription factor; zinc finger, GATA transcription factor, DNA bridging, transc DNA complex; HET: DNA; 1.60A {Homo sapiens} PDB: 4hc7_A* 4hca_A* 3dfx_A* 3dfv_D* 2gat_A* 3gat_A* 1gat_A* 1gau_A* 1gnf_A 1y0j_A 2l6y_A 2l6z_A Back     alignment and structure
>2vut_I AREA, nitrogen regulatory protein AREA; transcription regulation, protein-protein interactions, metal-binding, nitrate assimilation; HET: NAD; 2.3A {Emericella nidulans} SCOP: g.39.1.1 PDB: 2vus_I* 2vuu_I* Back     alignment and structure

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                          E-value
Query    147
d1y0ja1  39      g.39.1.1 (A:200-238) Erythroid transcription facto  3e-13
d2vuti1  42      g.39.1.1 (I:671-712) Erythroid transcription facto  4e-12
d3gata_  66      g.39.1.1 (A:) Erythroid transcription factor GATA-  9e-08
>d1y0ja1 g.39.1.1 (A:200-238) Erythroid transcription factor GATA-1 {Mouse (Mus musculus) [TaxId: 10090]} Length = 39 Back     information, alignment and structure

class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: Erythroid transcription factor GATA-1
domain: Erythroid transcription factor GATA-1
species: Mouse (Mus musculus) [TaxId: 10090]
 Score = 58.2 bits (141), Expect = 3e-13
 Identities = 16/37 (43%), Positives = 21/37 (56%), Gaps = 1/37 (2%)

Query: 34 KTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRR 70
          + C +CG T TPLWR    G   LCNACG+  +   +
Sbjct: 3  RECVNCGATATPLWRRDRTG-HYLCNACGLYHKMNGQ 38


>d2vuti1 g.39.1.1 (I:671-712) Erythroid transcription factor GATA-1 {Emericella nidulans [TaxId: 162425]} Length = 42 Back     information, alignment and structure
>d3gata_ g.39.1.1 (A:) Erythroid transcription factor GATA-1 {Chicken (Gallus gallus) [TaxId: 9031]} Length = 66 Back     information, alignment and structure

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    147
d3gata_  66      Erythroid transcription factor GATA-1 {Chicken (Ga  99.81
d2vuti1  42      Erythroid transcription factor GATA-1 {Emericella   99.79
d1y0ja1  39      Erythroid transcription factor GATA-1 {Mouse (Mus   99.75
d1neea2  37      Zinc-binding domain of translation initiation fact  87.97
d1dcqa2  122     Pyk2-associated protein beta ARF-GAP domain {Mouse  82.85
d1k81a_  36      Zinc-binding domain of translation initiation fact  82.32
>d3gata_ g.39.1.1 (A:) Erythroid transcription factor GATA-1 {Chicken (Gallus gallus) [TaxId: 9031]} Back     information, alignment and structure
class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: Erythroid transcription factor GATA-1
domain: Erythroid transcription factor GATA-1
species: Chicken (Gallus gallus) [TaxId: 9031]
Probab=99.81  E-value=7.4e-22  Score=133.10  Aligned_cols=52  Identities=31%  Similarity=0.741  Sum_probs=39.0

Q ss_pred             CCccccCCCCCCCccccCCCCCcccchhHHHHHHhhcccc-cccCchhhhhccc
Q 032112           33 KKTCADCGTTKTPLWRGGPAGPKSLCNACGIRSRKKRRAI-LGITKEEKKSKRG   85 (147)
Q Consensus        33 ~~~C~nCgtt~Tp~WR~gp~G~~~LCNaCGl~~~k~~~~~-~~~~~~~~k~k~~   85 (147)
                      ...|+||+|++||+||+||.| .+|||||||||++++..+ +.+.++.++.+++
T Consensus         4 g~~C~nCgt~~Tp~WRr~~~G-~~lCNACGl~~~~~~~~RP~~~~~~~i~~r~r   56 (66)
T d3gata_           4 GTVCSNCQTSTTTLWRRSPMG-DPVCNACGLYYKLHQVNRPLTMRKDGIQTRNR   56 (66)
T ss_dssp             TCCCTTTCCCCCSSEEECTTS-CEEEHHHHHHHHHHCSCCCGGGCCSSCCCCSC
T ss_pred             CCCCCCCCCCCCcccccCCCC-CcccchhHHHHHHhCCcCCccccccccccccC
Confidence            467999999999999999999 599999999999655332 2344444444333



>d2vuti1 g.39.1.1 (I:671-712) Erythroid transcription factor GATA-1 {Emericella nidulans [TaxId: 162425]} Back     information, alignment and structure
>d1y0ja1 g.39.1.1 (A:200-238) Erythroid transcription factor GATA-1 {Mouse (Mus musculus) [TaxId: 10090]} Back     information, alignment and structure
>d1neea2 g.59.1.1 (A:99-135) Zinc-binding domain of translation initiation factor 2 beta {Archaeon Methanobacterium thermoautotrophicum [TaxId: 145262]} Back     information, alignment and structure
>d1dcqa2 g.45.1.1 (A:247-368) Pyk2-associated protein beta ARF-GAP domain {Mouse (Mus musculus) [TaxId: 10090]} Back     information, alignment and structure
>d1k81a_ g.59.1.1 (A:) Zinc-binding domain of translation initiation factor 2 beta {Archaeon Methanococcus jannaschii [TaxId: 2190]} Back     information, alignment and structure