Citrus sinensis ID: 019440

Local Sequence Feature Prediction

Prediction and (Method)	Result

Residue Number Marker

Protein Sequence

Secondary Structure (PSIPRED)

Secondary Structure Prediction (SSPRO)

Coil and Loop (DISEMBL)

Flexible Loop (DISEMBL)

Low Complexity Region (SEG)

Disordered region (IsUnstruct)

Disordered Region (DISOPRED)

Disordered Region (DISEMBL)

Disordered Region (DISPRO)

Transmembrane Helix (TMHMM)

Transmembrane Helix (HMMTOP)

Transmembrane Helix (MEMSAT)

TM Helix, Signal Peptide (MEMSAT_SVM)

TM Helix, Signal Peptide (Phobius)

Signal Peptide (SignalP HMM Mode)

Signal Peptide (SignalP NN Mode)

Coiled Coils (COILS)

Positional Conservation

--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230-------240-------250-------260-------270-------280-------290-------300-------310-------320-------330-------340-

MDRRDGLALPGSASFYMQRGMTGSGSGTQPSLHGSPGIHPLSNPSLQFQSNIGGSTIGSTLSVDPSSAISPHGVNVTASASMPQSEPVKRKRGRPRKYGPDGSVSLALSPSVSTHPGTISPTQKRGRGRPPGTGRKQQVSSLGESLSGSAGMGFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTATLRQPSSSGGSVTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGPKMKNKKGEASEGVRDSEHQSVENPVTPTTAPSSQNLTPTSSVGGVWAGSRQMDMMRNAHVDIDLMRG

ccccccccccccccEEEEcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccEEEEEEccccHHHHHHHHHHHHcccEEEEEEEccEEEEEEEcccccccccEEEEEEEEEEEEEcEEEEccccccccccccEEEEEEcccccEEEEEcccEEEEEEcEEEEEEEEcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc

ccccccccccccccEEEEccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHcccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHccccccccccEEEEEcccHHHHHHHHHHHHccccEEEEEEcccEEEEEEEEcccccccEEEEEcEEEEEEEccccccccccccccccccEEEEEEcccccEEEccHHHHHEccccEEEEEEEEccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccEEcccc

mdrrdglalpgsasfymqrgmtgsgsgtqpslhgspgihplsnpslqfqsniggstigstlsvdpssaisphgvnvtasasmpqsepvkrkrgrprkygpdgsvslalspsvsthpgtisptqkrgrgrppgtgrkqQVSSlgeslsgsagmgftphVITVAVGEDIAMKLLSFSQQGPRAICVLSangaistatlrqpsssggsvtyeGRFEILCLSgsyllsgnggsrnrsgglsvslaspdgrvigggVGGMLIAANNVQVIVGSflwggpkmknkkgeasegvrdsehqsvenpvtpttapssqnltptssvggvwagsrqmdmmRNAHVDIDLMRG

mdrrdglalpgSASFYMQRGMTGSGSGTQPSLHGSPGIHPLSNPSLQFQSNIGGSTIGSTLSVDPSSAISPHGvnvtasasmpqsepvkrkrgrprkygpdgsvslalspsvsthpgtisptqkrgrgrppgTGRKQQVSSLGESLSGSAGMGFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAIStatlrqpsssggsVTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGPKMKNKKGEAsegvrdsehqsvenpvtpttapssqnltptSSVGGVWAGSRQMDMMRNAHVDIDLMRG

MDRRDGLALPGSASFYMQRGMTGSGSGTQPSLHGSPGIHPLSNPSLQFQSNIGGSTIGSTLSVDPSSAISPHGVNVTASASMPQSEpvkrkrgrprkygpdgSVSLALSPSVSTHPGTISptqkrgrgrppgtgrkqqVsslgeslsgsAGMGFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTATLRQPSSSGGSVTYEGRFEIlclsgsyllsgnggsrnrsgglsvslASPDgrvigggvggMLIAANNVQVIVGSFLWGGPKMKNKKGEASEGVRDSEHQSVENPVTPTTAPSSQNLTPTSSVGGVWAGSRQMDMMRNAHVDIDLMRG

*******************************************************************************************************************************************************MGFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTA***********VTYEGRFEILCLSGSYLLSG******************DGRVIGGGVGGMLIAANNVQVIVGSFLWGG********************************************************************

********************************************************************************************************************************************************GFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTATLRQPSSSGGSVTYEGRFEILCLSGSYL************GLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFL*******************************************************************L***

********LPGSASFYMQRGM*************SPGIHPLSNPSLQFQSNIGGSTIGSTLSVDPSSAISPHGVNVT**************************VSLALSP*************************************GSAGMGFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTAT**********VTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGPK***************************************SVGGVWAGSRQMDMMRNAHVDIDLMRG

**************FY**RGMTGSGSGTQPSLHGSPGIHPLSNPSLQ*******************************************************SVSL******************************QQVSSLGESLSGSAGMGFTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTATLRQPSSSGGSVTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGP*******************************************************************

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooHHHHHHHHHHHHHHHHHHHHHHHHHiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

no confident homologs detected
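The per-residue tracks above (one character per residue, for example c/E/H for PSIPRED or o/i/H for the transmembrane predictors) are easier to read as segments. The following is a minimal sketch, not part of the original report, that collapses such a string into runs with 1-based, inclusive coordinates. The function name `segments` and the toy input are illustrative assumptions.

```python
from itertools import groupby


def segments(states):
    """Collapse a per-residue state string into (state, start, end) runs.

    Coordinates are 1-based and inclusive, matching the residue number marker.
    """
    runs = []
    pos = 1
    for state, group in groupby(states):
        length = len(list(group))
        runs.append((state, pos, pos + length - 1))
        pos += length
    return runs


# Toy input for illustration only; paste a real track from the report instead.
toy = "ccEEEccHHHH"
for state, start, end in segments(toy):
    print(state, start, end)
```

Applied to one of the transmembrane tracks, the same function would report where the predicted helix run begins and ends.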

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

Original result of BLAST against SWISS-PROT Database

ID	Alignment graph	Length	Definition	RBH(Q2H)	RBH(H2Q)	Q cover	H cover	Identity	E-value
Query		341	2.2.26 [Sep-21-2011]
Q9S7C9		311	Putative DNA-binding prot	no	no	0.410	0.450	0.374	7e-08

>sp|Q9S7C9|ESCA_ARATH Putative DNA-binding protein ESCAROLA OS=Arabidopsis thaliana GN=ESC PE=2 SV=1

 Score = 58.5 bits (140), Expect = 7e-08,   Method: Compositional matrix adjust.
 Identities = 58/155 (37%), Positives = 83/155 (53%), Gaps = 15/155 (9%)

Query: 124 KRGRGRPPGTGRKQQVSSLGESLSGSAGMGFTPHVITVAVGEDIAMKLLSFSQQGPRAIC 183
           KR RGRPPG+  K +   +    S +A      HV+ V+ G DI   + +++++  R + 
Sbjct: 86  KRPRGRPPGSKNKAKPPIIVTRDSPNA---LRSHVLEVSPGADIVESVSTYARRRGRGVS 142

Query: 184 VLSANGAISTATLRQPSS---------SGGSVTYEGRFEILCLSGSYLLSGNGGSRNRSG 234
           VL  NG +S  TLRQP +          GG VT  GRFEIL L+G+ L      +   +G
Sbjct: 143 VLGGNGTVSNVTLRQPVTPGNGGGVSGGGGVVTLHGRFEILSLTGTVLPP---PAPPGAG 199

Query: 235 GLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSF 269
           GLS+ LA   G+V+GG V   LIA+  V ++  SF
Sbjct: 200 GLSIFLAGGQGQVVGGSVVAPLIASAPVILMAASF 234

Arabidopsis thaliana (taxid: 3702)
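The Q cover, H cover and Identity columns of the summary table can be reproduced from the alignment statistics above. The sketch below assumes (this is inferred from the numbers, not stated by the report) that coverage is the number of ungapped alignment columns divided by the sequence length, that identity is identities divided by alignment length, and that values are truncated to three decimals. The function names are hypothetical.

```python
import math


def trunc3(x):
    """Truncate (not round) to three decimals; assumed to match the table."""
    return math.floor(x * 1000) / 1000


def summarize(identities, aln_len, gaps, query_len, hit_len):
    """Derive summary-table values from a BLAST statistics line."""
    aligned = aln_len - gaps  # alignment columns without a gap in either sequence
    return {
        "q_cover": trunc3(aligned / query_len),
        "h_cover": trunc3(aligned / hit_len),
        "identity": trunc3(identities / aln_len),
    }


# Q9S7C9: Identities = 58/155, Gaps = 15/155, query length 341, hit length 311
print(summarize(58, 155, 15, 341, 311))
```

For the Q9S7C9 hit this gives 0.410, 0.450 and 0.374, in agreement with the table row.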

Close Homologs in the Non-Redundant Database Detected by BLAST

Original result of BLAST against Nonredundant Database

GI	Alignment Graph	Length	Definition	Q cover	H cover	Identity	E-value
Query		341
255541558		340	DNA binding protein, putative [Ricinus c	0.994	0.997	0.789	1e-143
133907524		340	AT-hook DNA-binding protein [Gossypium h	0.994	0.997	0.733	1e-134
225454180		345	PREDICTED: uncharacterized protein LOC10	0.994	0.982	0.769	1e-131
224130232		336	predicted protein [Populus trichocarpa]	0.979	0.994	0.749	1e-126
224067876		328	predicted protein [Populus trichocarpa]	0.961	1.0	0.709	1e-121
358249184		341	uncharacterized protein LOC100814615 [Gl	0.982	0.982	0.685	1e-115
356568374		342	PREDICTED: uncharacterized protein LOC10	0.982	0.979	0.672	1e-112
449441474		334	PREDICTED: uncharacterized protein LOC10	0.973	0.994	0.661	1e-110
449518609		334	PREDICTED: uncharacterized LOC101203138	0.973	0.994	0.658	1e-110
356504535		340	PREDICTED: uncharacterized protein LOC10	0.988	0.991	0.656	1e-109
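A hit table like the one above can be loaded for filtering with a few lines of code. This sketch assumes the rows are tab-separated with an empty Alignment Graph column, as they appear in this text dump. The `Hit` class and the 0.95 coverage cutoff are illustrative choices, not values taken from the report.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gi: str
    length: int
    definition: str
    q_cover: float
    h_cover: float
    identity: float
    e_value: float


def parse_hit(row):
    """Parse one tab-separated row of the non-redundant hit table."""
    gi, _graph, length, definition, q_cov, h_cov, ident, e_val = row.split("\t")
    return Hit(gi, int(length), definition,
               float(q_cov), float(h_cov), float(ident), float(e_val))


# Two rows copied from the table above
rows = [
    "255541558\t\t340\tDNA binding protein, putative [Ricinus c\t0.994\t0.997\t0.789\t1e-143",
    "224067876\t\t328\tpredicted protein [Populus trichocarpa]\t0.961\t1.0\t0.709\t1e-121",
]
hits = [parse_hit(r) for r in rows]

# Hits aligned over nearly the full length of both sequences (cutoff is an assumption)
full_length = [h.gi for h in hits if h.q_cover >= 0.95 and h.h_cover >= 0.95]
print(full_length)
```

All ten hits in the table have query and hit coverage above 0.96, so each of them would pass this filter.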

>gi|255541558|ref|XP_002511843.1| DNA binding protein, putative [Ricinus communis] gi|223549023|gb|EEF50512.1| DNA binding protein, putative [Ricinus communis]

 Score =  515 bits (1326), Expect = e-143,   Method: Compositional matrix adjust.
 Identities = 270/342 (78%), Positives = 308/342 (90%), Gaps = 3/342 (0%)

Query: 1   MDRRDGLALPGSASFYMQRGMTGSGSGTQPSLHGSPGIHPLSNPSLQFQSNIGGSTIGST 60
           MDRRD +A+ GSASFYMQRGMTGSGSGTQ  L+ S GI+PL++ ++ FQSN+G +TIGST
Sbjct: 1   MDRRDAMAMSGSASFYMQRGMTGSGSGTQSGLNVSSGINPLTSTNVSFQSNVGANTIGST 60

Query: 61  LSVDPSSAISPHGVNVTASASMPQ-SEPVKRKRGRPRKYGPDGSVSLALSPSVSTHPGTI 119
           L ++ S+AI PHGVNV AS+ MP   EPVKRKRGRPRKYGPDG+VSLALSPS+STHPGTI
Sbjct: 61  LPLETSTAIPPHGVNVGASSLMPPPGEPVKRKRGRPRKYGPDGTVSLALSPSLSTHPGTI 120

Query: 120 SPTQKRGRGRPPGTGRKQQVSSLGESLSGSAGMGFTPHVITVAVGEDIAMKLLSFSQQGP 179
           +PTQKRGRGRPPGTGRKQQ++SLGE LSGSAGMGFTPH+IT+AVGEDIA K++SFSQQGP
Sbjct: 121 TPTQKRGRGRPPGTGRKQQLASLGEWLSGSAGMGFTPHIITIAVGEDIATKIMSFSQQGP 180

Query: 180 RAICVLSANGAISTATLRQPSSSGGSVTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVS 239
           RAIC+LSANGA+ST TLRQPS+SGGSVTYEGRFEILCLSGSYL++ NGGSRNR+GGLSVS
Sbjct: 181 RAICILSANGAVSTVTLRQPSTSGGSVTYEGRFEILCLSGSYLVTSNGGSRNRTGGLSVS 240

Query: 240 LASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGPKMKNKKGEASEGVRDSEHQSVENPV 299
           LASPDGRVIGGGVGGMLIAA+ VQVIVGSFLWGG K KNKKGE  EG RDS+HQ+VENPV
Sbjct: 241 LASPDGRVIGGGVGGMLIAASPVQVIVGSFLWGGSKAKNKKGEGPEGARDSDHQTVENPV 300

Query: 300 TPTTAPSSQNLTPTSSVGGVWAGSRQMDMMRNAHVDIDLMRG 341
           TP++ P SQNLTPTSS+ G+W GS+ +D MRN HVDIDLMRG
Sbjct: 301 TPSSVPPSQNLTPTSSI-GLWPGSQSLD-MRNTHVDIDLMRG 340

Source: Ricinus communis

Species: Ricinus communis

Genus: Ricinus

Family: Euphorbiaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|133907524|gb|ABO42262.1| AT-hook DNA-binding protein [Gossypium hirsutum]

>gi|225454180|ref|XP_002272142.1| PREDICTED: uncharacterized protein LOC100265498 [Vitis vinifera] gi|297745264|emb|CBI40344.3| unnamed protein product [Vitis vinifera]

>gi|224130232|ref|XP_002320785.1| predicted protein [Populus trichocarpa] gi|222861558|gb|EEE99100.1| predicted protein [Populus trichocarpa]

>gi|224067876|ref|XP_002302577.1| predicted protein [Populus trichocarpa] gi|222844303|gb|EEE81850.1| predicted protein [Populus trichocarpa]

>gi|358249184|ref|NP_001239751.1| uncharacterized protein LOC100814615 [Glycine max] gi|255636132|gb|ACU18409.1| unknown [Glycine max]

>gi|356568374|ref|XP_003552386.1| PREDICTED: uncharacterized protein LOC100802542 isoform 1 [Glycine max] gi|356568376|ref|XP_003552387.1| PREDICTED: uncharacterized protein LOC100802542 isoform 2 [Glycine max]

>gi|449441474|ref|XP_004138507.1| PREDICTED: uncharacterized protein LOC101203138 [Cucumis sativus]

>gi|449518609|ref|XP_004166329.1| PREDICTED: uncharacterized LOC101203138 [Cucumis sativus]

>gi|356504535|ref|XP_003521051.1| PREDICTED: uncharacterized protein LOC100783475 [Glycine max]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

Original result of BLAST against Gene Ontology (AMIGO)

ID	Alignment graph	Length	Definition	Q cover	H cover	Identity	E-value
Query		341
TAIR|locus:2050766		348	AT2G45850 [Arabidopsis thalian	0.967	0.948	0.417	3e-53
TAIR|locus:2098861		354	AT3G61310 [Arabidopsis thalian	0.979	0.943	0.407	3.1e-51
TAIR|locus:2031306		361	AT1G63480 [Arabidopsis thalian	0.560	0.529	0.376	2.4e-33
TAIR|locus:2031321		378	AT1G63470 [Arabidopsis thalian	0.557	0.502	0.350	4.3e-32
TAIR|locus:2051038		351	AT2G33620 [Arabidopsis thalian	0.454	0.441	0.420	7.7e-31
TAIR|locus:2118091		356	AHL1 "AT-hook motif nuclear-lo	0.539	0.516	0.408	7.9e-30
TAIR|locus:2126946		318	AT4G00200 [Arabidopsis thalian	0.548	0.588	0.394	3.4e-29
TAIR|locus:2132599		334	AT4G22770 [Arabidopsis thalian	0.516	0.526	0.388	1.2e-26
TAIR|locus:2153142		419	AHL4 "AT-HOOK MOTIF NUCLEAR LO	0.398	0.324	0.411	4.7e-26
TAIR|locus:2178505		386	AT5G46640 [Arabidopsis thalian	0.434	0.383	0.411	2.3e-23

TAIR|locus:2050766 AT2G45850 [Arabidopsis thaliana (taxid:3702)]

 Score = 551 (199.0 bits), Expect = 3.0e-53, P = 3.0e-53
 Identities = 150/359 (41%), Positives = 190/359 (52%)

Query:     1 MDRRDGLALPGSASFYMQRGMTGSGSGTQPSLHGSP----GIHPLSNPSLQFQSNIGGST 56
             MDRRD + L GS S+Y+ RG++GSG    P+ HGSP    G+  L N +  F S  G + 
Sbjct:     1 MDRRDAMGLSGSGSYYIHRGLSGSGP---PTFHGSPQQQQGLRHLPNQNSPFGS--GSTG 55

Query:    57 IGS-TLSVDPSSAIS-------PH--GVNVTASASMPQSEXXXXXXXXXXXXXXXXSVSL 106
              GS +L  DPS A +       PH  GVN+ A    P                   SVSL
Sbjct:    56 FGSPSLHGDPSLATAAGGAGALPHHIGVNMIAPPPPPSETPMKRKRGRPRKYGQDGSVSL 115

Query:   107 ALSPS-VSTHPGTISXXXXXXXXXXXXXXXXXXVXXXXXXXXXXAGMGFTPHVITVAVGE 165
             ALS S VST   T +                  +          +GM FTPHVI V++GE
Sbjct:   116 ALSSSSVSTI--TPNNSNKRGRGRPPGSGKKQRMASVGELMPSSSGMSFTPHVIAVSIGE 173

Query:   166 DIAMKLLSFSQQGPRAICVLSANGAISTATLRQPSSSGGSVTYEGRFEIXXXXXXXXXXX 225
             DIA K+++FSQQGPRAICVLSA+GA+STATL QPS+S G++ YEGRFEI           
Sbjct:   174 DIASKVIAFSQQGPRAICVLSASGAVSTATLIQPSASPGAIKYEGRFEILALSTSYIVAT 233

Query:   226 XXXXXXXXXXXXXXXASPDXXXXXXXXXXMLIAANNVQVIVGSFLWGGPKMKNKKGE--A 283
                            ASPD           LIAA+ VQVIVGSF+W  PK+K+KK E  A
Sbjct:   234 DGSFRNRTGNLSVSLASPDGRVIGGAIGGPLIAASPVQVIVGSFIWAAPKIKSKKREEEA 293

Query:   284 SEGVRDSEHQSVENPVTPTTAPSSQNLTPTSSVGGVWA-GSRQMDMMRNAHVDIDLMRG 341
             SE V++++   V +    T +P  Q   P  ++  +W+ GSRQMDM R+AH DIDLMRG
Sbjct:   294 SEVVQETDDHHVLDNNNNTISPVPQQ-QPNQNL--IWSTGSRQMDM-RHAHADIDLMRG 348

GO:0003677 "DNA binding" evidence=IEA;ISS

GO:0005634 "nucleus" evidence=ISM;IDA

GO:0008150 "biological_process" evidence=ND

GO:0010103 "stomatal complex morphogenesis" evidence=RCA
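When transferring GO terms from a homolog, it helps to separate experimentally supported annotations from computational ones. The sketch below parses lines in the format shown above and flags terms that carry at least one experimental evidence code. The parser and its names are illustrative assumptions. The set of experimental codes follows the standard GO evidence code list.

```python
import re

GO_LINE = re.compile(r'^(GO:\d{7}) "([^"]+)" evidence=(.+)$')

# Experimental evidence codes defined by the Gene Ontology Consortium
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def parse_go(line):
    """Return (GO id, term name, list of evidence codes) for one report line."""
    match = GO_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a GO annotation line: {line!r}")
    go_id, name, codes = match.groups()
    return go_id, name, codes.split(";")


lines = [
    'GO:0003677 "DNA binding" evidence=IEA;ISS',
    'GO:0005634 "nucleus" evidence=ISM;IDA',
    'GO:0008150 "biological_process" evidence=ND',
    'GO:0010103 "stomatal complex morphogenesis" evidence=RCA',
]
for line in lines:
    go_id, name, codes = parse_go(line)
    supported = any(code in EXPERIMENTAL for code in codes)
    print(go_id, name, codes, "experimental" if supported else "computational or no data")
```

Of the four terms listed for AT2G45850, only "nucleus" carries an experimental code (IDA).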

TAIR|locus:2098861 AT3G61310 [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2031306 AT1G63480 [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2031321 AT1G63470 [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2051038 AT2G33620 [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2118091 AHL1 "AT-hook motif nuclear-localized protein 1" [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2126946 AT4G00200 [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2132599 AT4G22770 [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2153142 AHL4 "AT-HOOK MOTIF NUCLEAR LOCALIZED PROTEIN 4" [Arabidopsis thaliana (taxid:3702)]

TAIR|locus:2178505 AT5G46640 [Arabidopsis thaliana (taxid:3702)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

Original result of BLAST against SWISS-PROT

No confident hit for EC number transfer detected by BLAST in SWISS-PROT

EC Number Prediction by EzyPred Server

Original result from EzyPred Server

Failed to connect to the EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!

Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Original result from the STRING server

Failed to connect to the STRING server

Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

Original result of RPS-BLAST against CDD database part I

ID	Alignment Graph	Length	Definition	E-value
Query		341
pfam03479		120	pfam03479, DUF296, Domain of unknown function (DUF	4e-36
cd11378		113	cd11378, DUF296, Domain of unknown function found	7e-31
COG1661		141	COG1661, COG1661, Predicted DNA-binding protein wi	0.001

>gnl|CDD|217587 pfam03479, DUF296, Domain of unknown function (DUF296)

 Score =  126 bits (319), Expect = 4e-36
 Identities = 54/123 (43%), Positives = 66/123 (53%), Gaps = 9/123 (7%)

Query: 154 FTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTATLRQPS---SSGGSVTYEG 210
             PHV+ +  GED+   L +F++Q      VLS  GA+S  TLRQP     S G VT EG
Sbjct: 1   GRPHVLRLEPGEDLVESLEAFARQRGIGAAVLSGIGAVSNVTLRQPDEEAKSYGVVTLEG 60

Query: 211 RFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGV-GGMLIAANNVQVIVGSF 269
           RFEIL LSG+    G       SG L VSLA PDG+V+GG +  G + A   V V   SF
Sbjct: 61  RFEILSLSGTISPGG-----KPSGHLHVSLADPDGQVVGGHLAEGTVFATGEVVVTELSF 115

Query: 270 LWG 272
              
Sbjct: 116 ENA 118

This putative domain is found in proteins that contain AT-hook motifs (pfam02178), which strongly suggests a DNA-binding function for the proteins as a whole. There are three highly conserved histidine residues, e.g. at positions 117, 119 and 133 in Reut_B5223, which should form a structurally conserved metal-binding unit, based on structural comparison with known metal-binding structures. The proteins should work as trimers. Length = 120

>gnl|CDD|211390 cd11378, DUF296, Domain of unknown function found in archaea, bacteria, and plants

>gnl|CDD|224575 COG1661, COG1661, Predicted DNA-binding protein with PD1-like DNA-binding motif [General function prediction only]

Conserved Domains Detected by HHsearch

Original result of HHsearch against CDD database

ID	Alignment Graph	Length	Definition	Probability
Query		341
PF03479		120	DUF296: Domain of unknown function (DUF296); Inter	99.94
COG1661		141	Predicted DNA-binding protein with PD1-like DNA-bi	99.87
PF02178		13	AT_hook: AT hook motif; InterPro: IPR017956 AT hoo	96.07
smart00384		26	AT_hook DNA binding domain with preference for A/T	96.04
PF14621		219	RFX5_DNA_bdg: RFX5 DNA-binding domain	82.02

>PF03479 DUF296: Domain of unknown function (DUF296); InterPro: IPR005175 This putative conserved domain is found in proteins that contain AT-hook motifs (IPR000637 from InterPro), suggesting a DNA-binding function for the proteins as a whole; however, the function of this domain is unknown.

Probab=99.94  E-value=6.6e-27  Score=196.57  Aligned_cols=113  Identities=30%  Similarity=0.391  Sum_probs=94.6

Q ss_pred             ceEEEEEecCCchHHHHHHHHHHhCCccEEEEeeeceeeeEEEeCCCC--CCCceeeeeeeEEEEeeceeeeCCCCCCCC
Q 019440          154 FTPHVITVAVGEDIAMKLLSFSQQGPRAICVLSANGAISTATLRQPSS--SGGSVTYEGRFEILCLSGSYLLSGNGGSRN  231 (341)
Q Consensus       154 f~phVIrV~~GEDV~~kI~~Faqq~~~aicILSa~GaVSnVTLRqp~s--~~~tvtyeG~FEILSLSGT~~~~~~~~~~~  231 (341)
                      |++|++||++||||+++|.+||+++++..|+|+++|+|++|+|++++.  .....+|+|+|||+||+|||...++    .
T Consensus         1 ~r~~~~rl~~Gedl~~~l~~~~~~~~i~~~~is~iGsl~~~~l~~~~~~~~~~~~~~~g~~Ei~sl~G~i~~~~g----~   76 (120)
T PF03479_consen    1 GRVFVIRLDPGEDLLESLEAFAREHGIRSGVISGIGSLSNVTLGYYDPPSYYEPLEFEGPFEIISLSGTISPEDG----K   76 (120)
T ss_dssp             EEEEEEEEETTSBHHHHHHHHHHHHT-SSEEEEEEEEEEEEEEEEEETTTEEEEEEEESEEEEEEEEEEEEEETT----E
T ss_pred             CcEEEEEECCCCHHHHHHHHHHHHCCCcEEEEEEEeEEeEEEEEEecccCCcceEEecccEEEEEeEEEEECCCC----C
Confidence            789999999999999999999999999889999999999999999843  3468899999999999999998443    2


Q ss_pred             CCCceEEEEeCCCCeEEeeecC-cceEeecceEEEEEEccCC
Q 019440          232 RSGGLSVSLASPDGRVIGGGVG-GMLIAANNVQVIVGSFLWG  272 (341)
Q Consensus       232 ~~~hLhISLAg~dGqViGGhV~-G~LIAAtpVqVVvgSF~~~  272 (341)
                      ++.||||+|+|.||+|+||||. |.++++  +||+|-.+...
T Consensus        77 ~~~HlHisl~~~~g~v~gGHl~~g~v~~t--~Ev~i~~~~~~  116 (120)
T PF03479_consen   77 PFVHLHISLADPDGQVFGGHLLEGTVFAT--AEVVITELSGI  116 (120)
T ss_dssp             EEEEEEEEEE-TTSEEEEEEEEEEEEEEE--EEEEEEEETTE
T ss_pred             CcceEEEEEECCCCeEEeeEeCCCEEeEE--EEEEEEEecCc
Confidence            6789999999999999999999 555444  55555554443

Overexpression of a protein containing this domain (Q9S7C9 from SWISS-PROT) in Arabidopsis thaliana causes late flowering and modified leaf development. PDB: 2DT4_A 2P6Y_A 3HWU_A 3HTN_A 2NMU_A 2H6L_A 2HX0_A.
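Each HHsearch alignment in this report opens with a one-line summary of key=value pairs (Probab, E-value, Score and so on). A small parser such as the following sketch makes those values available for sorting or thresholding. The function name is an assumption, and the percent sign on Identities is simply stripped.

```python
import re


def parse_hh_summary(line):
    """Parse an HHsearch summary line of key=value pairs into floats."""
    fields = {}
    # Keys may contain a hyphen (E-value) or underscore (Aligned_cols)
    for key, value in re.findall(r"([\w-]+)=(\S+)", line):
        fields[key] = float(value.rstrip("%"))
    return fields


summary = ("Probab=99.94  E-value=6.6e-27  Score=196.57  Aligned_cols=113  "
           "Identities=30%  Similarity=0.391  Sum_probs=94.6")
fields = parse_hh_summary(summary)
print(fields["Probab"], fields["E-value"], fields["Aligned_cols"])
```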

>COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif [General function prediction only]

>PF02178 AT_hook: AT hook motif; InterPro: IPR017956 AT hooks are DNA-binding motifs with a preference for A/T rich regions

>smart00384 AT_hook DNA binding domain with preference for A/T rich regions

>PF14621 RFX5_DNA_bdg: RFX5 DNA-binding domain
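The AT_hook hits listed above (PF02178, smart00384) can be cross-checked by eye against the query, because AT-hook motifs are built around an Arg-Gly-Arg-Pro core. The sketch below scans a fragment of the query (residues 85 to 136, copied from the alignments earlier in this report) for that literal core. This is a crude illustration only: it is not how HHsearch scored the hits, and the function name and fragment choice are assumptions.

```python
import re

# Query residues 85-136, taken from the BLAST alignments in this report
FRAGMENT_START = 85
FRAGMENT = "SEPVKRKRGRPRKYGPDGSVSLALSPSVSTHPGTISPTQKRGRGRPPGTGRK"


def find_core(seq, offset=1, core="RGRP"):
    """Return 1-based inclusive (start, end) positions of each literal core match."""
    return [(m.start() + offset, m.end() + offset - 1)
            for m in re.finditer(core, seq)]


hits = find_core(FRAGMENT, FRAGMENT_START)
print(hits)  # [(92, 95), (127, 130)]
```

Both matches sit in lysine- and arginine-rich stretches (KRKRGRPRK and KRGRGRPPG), upstream of the DUF296 domain that starts near residue 154.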

Homologous Structure Templates

Structure Templates Detected by BLAST

Original result of BLAST against Protein Data Bank

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

Original result of RPS-BLAST against PDB70 database

ID	Alignment Graph	Length	Definition	E-value
Query		341
2hx0_A		154	Putative DNA-binding protein; NESG, PSI-2, SCR59,	5e-23
2p6y_A		142	Hypothetical protein VCA0587; NESG, Q9KM02_vibch,	7e-23
2dt4_A		143	Hypothetical protein PH0802; PPC domain, structura	2e-14
3htn_A		149	Putative DNA binding protein; DUF269 family protei	3e-07
2h6l_A		146	Hypothetical protein; NESG GR103, structural genom	6e-06
3hwu_A		147	Putative DNA-binding protein; YP_299413.1, structu	1e-05
1qzv_F		154	Plant photosystem I: subunit PSAF; photosynthesis,	3e-05

>2hx0_A Putative DNA-binding protein; NESG, PSI-2, SCR59, structural genomics, protein structure initiative; 1.55A {Salmonella choleraesuis} SCOP: d.290.1.3 PDB: 2nmu_A Length = 154

 Score = 92.4 bits (229), Expect = 5e-23
 Identities = 24/124 (19%), Positives = 55/124 (44%), Gaps = 10/124 (8%)

Query: 147 SGSAGMGFTPHVITVAVGEDIAMKLLSFSQQ-GPRAICVLSANGAISTATLRQPSSSGGS 205
           S         + + +  G+++  +L +F QQ   RA  +    G+++   LR       +
Sbjct: 11  SHHNASTARFYALRLLPGQEVFSQLHAFVQQNQLRAAWIAGCTGSLTDVALRYAGQ-EAT 69

Query: 206 VTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVI 265
            +  G FE++ L+G+  L+G          L ++++ P G ++GG +         ++++
Sbjct: 70  TSLTGTFEVISLNGTLELTGEH--------LHLAVSDPYGVMLGGHMMPGCTVRTTLELV 121

Query: 266 VGSF 269
           +G  
Sbjct: 122 IGEL 125

>2p6y_A Hypothetical protein VCA0587; NESG, Q9KM02_vibch, VCR80, structural genomics, PSI-2, prote structure initiative; 1.63A {Vibrio cholerae} Length = 142

>2dt4_A Hypothetical protein PH0802; PPC domain, structural genomics, unknown function; 1.60A {Pyrococcus horikoshii} Length = 143

>3htn_A Putative DNA binding protein; DUF269 family protein, structural genomics, joint center for structural genomics, JCSG; HET: MSE 1PE; 1.50A {Bacteroides thetaiotaomicron vpi-5482} Length = 149

>2h6l_A Hypothetical protein; NESG GR103, structural genomics, PSI-2, protein structure initiative, northeast structural genomics consortium; 2.00A {Archaeoglobus fulgidus} SCOP: d.290.1.3 Length = 146

>3hwu_A Putative DNA-binding protein; YP_299413.1, structural genomi center for structural genomics, JCSG, protein structure INI PSI-2; HET: MSE; 1.30A {Ralstonia eutropha} Length = 147

>1qzv_F Plant photosystem I: subunit PSAF; photosynthesis,plant photosynthetic reaction center, peripheral antenna; HET: CL1 PQN; 4.44A {Pisum sativum} SCOP: i.5.1.1 Length = 154

Structure Templates Detected by HHsearch

Original result of HHsearch against PDB70 database

ID	Alignment Graph	Length	Definition	Probability
Query		341
2p6y_A		142	Hypothetical protein VCA0587; NESG, Q9KM02_vibch,	99.94
2hx0_A		154	Putative DNA-binding protein; NESG, PSI-2, SCR59,	99.94
2dt4_A		143	Hypothetical protein PH0802; PPC domain, structura	99.94
2h6l_A		146	Hypothetical protein; NESG GR103, structural genom	99.93
3htn_A		149	Putative DNA binding protein; DUF269 family protei	99.92
3hwu_A		147	Putative DNA-binding protein; YP_299413.1, structu	99.88
2ezd_A		26	High mobility group protein HMG-I/HMG-Y; DNA bindi	96.77

>2p6y_A Hypothetical protein VCA0587; NESG, Q9KM02_vibch, VCR80, structural genomics, PSI-2, prote structure initiative; 1.63A {Vibrio cholerae}

Probab=99.94  E-value=1.9e-26  Score=198.06  Aligned_cols=115  Identities=22%  Similarity=0.302  Sum_probs=103.4

Q ss_pred             ceEEEEEecCCchHHHHHHHHHHhCCc-cEEEEeeeceeeeEEEeCCCCCCCceeeeeeeEEEEeeceeeeCCCCCCCCC
Q 019440          154 FTPHVITVAVGEDIAMKLLSFSQQGPR-AICVLSANGAISTATLRQPSSSGGSVTYEGRFEILCLSGSYLLSGNGGSRNR  232 (341)
Q Consensus       154 f~phVIrV~~GEDV~~kI~~Faqq~~~-aicILSa~GaVSnVTLRqp~s~~~tvtyeG~FEILSLSGT~~~~~~~~~~~~  232 (341)
                      |++|++||++||||+++|.+||+++++ +.||++++|++++|+||+++.. .+++|+|+|||+||+|||.+. .      
T Consensus         2 ~r~~~lrL~~Gedl~~~i~~~~~~~~i~~a~v~~~iGsl~~~~l~~~~~~-~~~~~~g~~EIlsl~Gti~~~-~------   73 (142)
T 2p6y_A            2 IHLIALRLTRGMDLKQQIVQLVQQHRIHAGSIASCVGCLSTLHIRLADSV-STLQVSAPFEILSLSGTLTYQ-H------   73 (142)
T ss_dssp             CEEEEEEECTTCBHHHHHHHHHHHTTCSSEEEEEEEEEEEEEEEECTTSS-CEEEECSCEEEEEEEEEECSS-C------
T ss_pred             CcEEEEEECCCCcHHHHHHHHHHHhCCCEEEEEEeEEEEEeEEEECCCCC-ccEecCCcEEEEEeEEEEeCC-C------
Confidence            899999999999999999999999886 7999999999999999999974 478899999999999999995 1      


Q ss_pred             CCceEEEEeCCCCeEEeeecCcceEeecceEEEEEEccCCCCCcc
Q 019440          233 SGGLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGPKMK  277 (341)
Q Consensus       233 ~~hLhISLAg~dGqViGGhV~G~LIAAtpVqVVvgSF~~~~~k~~  277 (341)
                       .||||+|+|.||+|+||||.+..++..++||++..|....+.++
T Consensus        74 -~HlHisl~~~~G~v~GGHl~~g~~V~~t~Ev~i~~~~~~~~~R~  117 (142)
T 2p6y_A           74 -CHLHIAVADAQGRVWGGHLLEGNLINTTAELMIHHYPQHHFTRE  117 (142)
T ss_dssp             -EEEEEEEECTTSCEEEEEECTTCEECC-EEEEEEECTTEEEEEE
T ss_pred             -CEEEEEEECCCCCEEccccCCCCeEEEEEEEEEEEccCCeEEEe
Confidence             59999999999999999999766667889999999988667743

>2hx0_A Putative DNA-binding protein; NESG, PSI-2, SCR59, structural genomics, protein structure initiative; 1.55A {Salmonella choleraesuis} SCOP: d.290.1.3 PDB: 2nmu_A

>2dt4_A Hypothetical protein PH0802; PPC domain, structural genomics, unknown function; 1.60A {Pyrococcus horikoshii}

>2h6l_A Hypothetical protein; NESG GR103, structural genomics, PSI-2, protein structure initiative, northeast structural genomics consortium; 2.00A {Archaeoglobus fulgidus} SCOP: d.290.1.3

>3htn_A Putative DNA binding protein; DUF269 family protein, structural genomics, joint center for structural genomics, JCSG; HET: MSE 1PE; 1.50A {Bacteroides thetaiotaomicron vpi-5482} SCOP: d.290.1.0

>3hwu_A Putative DNA-binding protein; YP_299413.1, structural genomi center for structural genomics, JCSG, protein structure INI PSI-2; HET: MSE; 1.30A {Ralstonia eutropha}

>2ezd_A High mobility group protein HMG-I/HMG-Y; DNA binding protein, minor groove DNA binding, transcriptional CO-activator, architectural factor; HET: DNA; NMR {Homo sapiens} SCOP: j.10.1.1 PDB: 2eze_A*

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

Original result of RPS-BLAST against SCOP70(version1.75) database

ID	Alignment Graph	Length	Definition	E-value
Query		341
d2hx0a1		136	d.290.1.3 (A:6-141) Hypothetical protein STM3071 {	4e-20
d2h6la1		138	d.290.1.3 (A:1-138) Hypothetical protein AF0104 {A	2e-13

>d2hx0a1 d.290.1.3 (A:6-141) Hypothetical protein STM3071 {Salmonella typhimurium [TaxId: 90371]} Length = 136


class: Alpha and beta proteins (a+b)
fold: AF0104/ALDC/Ptd012-like
superfamily: AF0104/ALDC/Ptd012-like
family: AF0104-like
domain: Hypothetical protein STM3071
species: Salmonella typhimurium [TaxId: 90371]

 Score = 83.0 bits (205), Expect = 4e-20
 Identities = 23/124 (18%), Positives = 55/124 (44%), Gaps = 13/124 (10%)

Query: 147 SGSAGMGFTPHVITVAVGEDIAMKLLSFS-QQGPRAICVLSANGAISTATLRQPSSSGGS 205
           + S       + + +  G+++  +L +F  Q   RA  +    G+++   LR       +
Sbjct: 2   NASTA---RFYALRLLPGQEVFSQLHAFVQQNQLRAAWIAGCTGSLTDVALRYAGQ-EAT 57

Query: 206 VTYEGRFEILCLSGSYLLSGNGGSRNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVI 265
            +  G FE++ L+G+  L+G          L ++++ P G ++GG +         ++++
Sbjct: 58  TSLTGTFEVISLNGTLELTGEH--------LHLAVSDPYGVMLGGHMMPGCTVRTTLELV 109

Query: 266 VGSF 269
           +G  
Sbjct: 110 IGEL 113

>d2h6la1 d.290.1.3 (A:1-138) Hypothetical protein AF0104 {Archaeoglobus fulgidus [TaxId: 2234]} Length = 138

Homologous Domains Detected by HHsearch

Original result of HHsearch against SCOP70(version1.75) database

ID	Alignment Graph	Length	Definition	Probability
Query		341
d2hx0a1		136	Hypothetical protein STM3071 {Salmonella typhimuri	99.94
d2h6la1		138	Hypothetical protein AF0104 {Archaeoglobus fulgidu	99.88

>d2hx0a1 d.290.1.3 (A:6-141) Hypothetical protein STM3071 {Salmonella typhimurium [TaxId: 90371]}

class: Alpha and beta proteins (a+b)
fold: AF0104/ALDC/Ptd012-like
superfamily: AF0104/ALDC/Ptd012-like
family: AF0104-like
domain: Hypothetical protein STM3071
species: Salmonella typhimurium [TaxId: 90371]

Probab=99.94  E-value=2e-26  Score=194.82  Aligned_cols=118  Identities=19%  Similarity=0.334  Sum_probs=107.9

Q ss_pred             CCCceEEEEEecCCchHHHHHHHHHHhCCc-cEEEEeeeceeeeEEEeCCCCCCCceeeeeeeEEEEeeceeeeCCCCCC
Q 019440          151 GMGFTPHVITVAVGEDIAMKLLSFSQQGPR-AICVLSANGAISTATLRQPSSSGGSVTYEGRFEILCLSGSYLLSGNGGS  229 (341)
Q Consensus       151 g~~f~phVIrV~~GEDV~~kI~~Faqq~~~-aicILSa~GaVSnVTLRqp~s~~~tvtyeG~FEILSLSGT~~~~~~~~~  229 (341)
                      ++..+.|++||++||||+++|.+||+++++ +.||++++|++++|+|++++. .....++|+|||+||+|+|...+    
T Consensus         3 ~~~~R~~~lrl~~Gedl~~~i~~~~~~~~I~~a~V~~~iGs~~~~~~~~~~~-~~~~~~~g~~Ei~sl~G~I~~~~----   77 (136)
T d2hx0a1           3 ASTARFYALRLLPGQEVFSQLHAFVQQNQLRAAWIAGCTGSLTDVALRYAGQ-EATTSLTGTFEVISLNGTLELTG----   77 (136)
T ss_dssp             CCCCEEEEEEECTTCBHHHHHHHHHHHHTCSSEEEEEEEEEEEEEEEECTTC-SSCEEEEEEEEEEEEEEEEETTE----
T ss_pred             CCCCcEEEEEECCCChHHHHHHHHHHHhCCCEEEEEEEeeeeEEEEEEeCCC-CCcEEecCcEEEEEEEEEeccCC----
Confidence            567899999999999999999999998885 899999999999999999986 45668999999999999998876    


Q ss_pred             CCCCCceEEEEeCCCCeEEeeecCcceEeecceEEEEEEccCCCCCcc
Q 019440          230 RNRSGGLSVSLASPDGRVIGGGVGGMLIAANNVQVIVGSFLWGGPKMK  277 (341)
Q Consensus       230 ~~~~~hLhISLAg~dGqViGGhV~G~LIAAtpVqVVvgSF~~~~~k~~  277 (341)
                          .||||+|+|.||+|+||||.+++++++++||+|..|....++++
T Consensus        78 ----~HlH~~~a~~~g~v~gGhL~~g~~v~~t~Eivi~~l~~~~~~R~  121 (136)
T d2hx0a1          78 ----EHLHLAVSDPYGVMLGGHMMPGCTVRTTLELVIGELPALTFSRQ  121 (136)
T ss_dssp             ----EEEEEEEECTTSCEEEEEECTTCEEEEEEEEEEEECTTEEEEEE
T ss_pred             ----CeEEEEEECCCCcEEeEEecCCcEEEEEEEEEEEEccCCceEEc
Confidence                39999999999999999999989999999999999988767653

>d2h6la1 d.290.1.3 (A:1-138) Hypothetical protein AF0104 {Archaeoglobus fulgidus [TaxId: 2234]}