Citrus Sinensis ID: 026596


Local Sequence Feature Prediction

Prediction and (Method) Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180-------190-------200-------210-------220-------230------
MANGVEKESVLMMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASKHGNGSHRSHNGRVCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASRRVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAKT
cccccccccccccccHHHHHHHHccEEcccEEEEEEccccccccccEEEEEEEEcccEEEEEEcccccccccccHHHHHHcccccccccccccEEEEEcccEEcccHHHHHHHHHHHHccccccccccccccccccccccccccccccccccccHHHHHHHHHHHccccccccccccccccccHHHHHHHHHHHccccccccccccccEEEEcccEEEEcccccHHHHHHHHHccc
cccccccccEEEEEcHHHHHHHHcccccccEEEEEEccccccccccEEEEEEEcccEEEEEEEcccccccccccHHHHHHHccccccccccccEEEEEccccccccHHHHHHHHHHHHccccccccccccccccccHEEEEccccccEEEEEccHHHHHHHHHHHHcccccccccccccEEcccHHHHHHHHHHcccccccccccccEEEEcccEEEEEcccccccHHHHHHHccc
MANGVEKESVLMMFSDEELReisgvkrggdyievtcgctshrygdavgrlrvfsngdleitcectpgcnedkmtpgafekhsgreTARKWKNNVWVIangekvplSKTVLLKYYNqaskhgngshrshngrvchrdefvrcaRCNKERRFRLRTKEECLIHHnaladknwkcsdlpydkitcddeeERASRrvyrgcirsptckgctscvcfgcdicrfsdcscqtcidftrnakt
MANGVEKESVLMMFSDEELReisgvkrggdYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQAskhgngshrshngrvchrdefvrcarcnkerrfrlrtkeeclihhnaladknwkcsdLPYDKitcddeeerasrrvyrgcirsptckgctSCVCFGCDICRFSDCSCQTCIDFTRNAKT
MANGVEKESVLMMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASKHGNGSHRSHNGRVCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASRRVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAKT
******************LREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCN*****************ARKWKNNVWVIANGEKVPLSKTVLLKYYNQA************GRVCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASRRVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFT*****
************MFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMT***************WKNNVWVIANGEKVPLSKTVLLKYYN******************HRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDD****************PTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNA**
MANGVEKESVLMMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQAS**********NGRVCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASRRVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAKT
*******ESVLMMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASK************VCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERAS******CIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAK*
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MANGVEKESVLMMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASKHGNGSHRSHNGRVCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASRRVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAKT
no confident homologs detected
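
The per-residue rows above (secondary structure, topology, masked regions) are easier to read as residue ranges. A minimal Python sketch, assuming each row is a plain string with one character per residue and 1-based numbering as in the residue number marker. `state_segments` is a hypothetical helper name, and the input is a toy string rather than a row from this record:

```python
from itertools import groupby

def state_segments(track):
    """Collapse a per-residue string into (state, start, end) runs, 1-based inclusive."""
    segments = []
    pos = 1
    for state, run in groupby(track):
        length = sum(1 for _ in run)
        segments.append((state, pos, pos + length - 1))
        pos += length
    return segments

# Toy secondary-structure string (illustrative only, not taken from the record)
toy_track = "cccHHHHccEEEc"
segments = state_segments(toy_track)
helices = [(start, end) for state, start, end in segments if state == "H"]
print(segments)
print(helices)
```

The same function could be applied to any of the rows above by pasting the full string in place of `toy_track`. What each character means is defined by the respective predictor and is not restated here.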

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   236     2.2.26 [Sep-21-2011]
Q8GZA8  237     Protein ULTRAPETALA 1 OS=  yes       no        0.953    0.949    0.755     1e-102
Q8S8I2  228     Protein ULTRAPETALA 2 OS=  no        no        0.919    0.951    0.695     6e-91
>sp|Q8GZA8|ULT1_ARATH Protein ULTRAPETALA 1 OS=Arabidopsis thaliana GN=ULT1 PE=1 SV=1
 Score =  370 bits (951), Expect = e-102,   Method: Compositional matrix adjust.
 Identities = 170/225 (75%), Positives = 195/225 (86%)

Query: 12  MMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNED 71
           M+F  EEL+E+SGV  GGDY+EV CGCTSHRYGDAV RLRVF  GDLEITCECTPGC+ED
Sbjct: 13  MLFKQEELQEMSGVNVGGDYVEVMCGCTSHRYGDAVARLRVFPTGDLEITCECTPGCDED 72

Query: 72  KMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASKHGNGSHRSHNGR 131
           K+TP AFEKHSGRETARKWKNNVWVI  GEKVPLSKTVLLKYYN++SK  + S+RS   +
Sbjct: 73  KLTPAAFEKHSGRETARKWKNNVWVIIGGEKVPLSKTVLLKYYNESSKKCSRSNRSQGAK 132

Query: 132 VCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASR 191
           VCHRDEFV C  C KERRFRLR+++EC +HHNA+ D NWKCSD PYDKITC++EEER SR
Sbjct: 133 VCHRDEFVGCNDCGKERRFRLRSRDECRLHHNAMGDPNWKCSDFPYDKITCEEEEERGSR 192

Query: 192 RVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAKT 236
           +VYRGC RSP+CKGCTSCVCFGC++CRFS+C+CQTC+DFT N K 
Sbjct: 193 KVYRGCTRSPSCKGCTSCVCFGCELCRFSECTCQTCVDFTSNVKA 237
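
The coverage and identity figures in the summary table can be cross-checked against this alignment. A small Python sketch, assuming coverage is the aligned span divided by the full sequence length (an assumption, since the page does not state its definition). `coverage` is a hypothetical helper:

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive coordinates)."""
    return (end - start + 1) / length

# Coordinates and lengths taken from the ULT1 alignment and summary table above.
q_cover = coverage(12, 236, 236)   # query residues 12..236 of 236
h_cover = coverage(13, 237, 237)   # hit residues 13..237 of 237
identity = 170 / 225               # from "Identities = 170/225"

print(f"{q_cover:.4f} {h_cover:.4f} {identity:.4f}")
```

Under that assumption the values come out close to the 0.953, 0.949 and 0.755 reported in the table. The small difference in the last digit of the identity suggests the page truncates rather than rounds, though that is a guess.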




Putative transcription factor that acts as a key negative regulator of cell accumulation in shoot and floral meristems. Negatively regulates the size of the WUSCHEL (WUS)-expressing organizing center in inflorescence meristems. May act by down-regulating expression of WUS.
Arabidopsis thaliana (taxid: 3702)
>sp|Q8S8I2|ULT2_ARATH Protein ULTRAPETALA 2 OS=Arabidopsis thaliana GN=ULT2 PE=2 SV=1

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      236
224126749  235     predicted protein [Populus trichocarpa]   0.995    1.0      0.881     1e-120
224127015  235     predicted protein [Populus trichocarpa]   0.995    1.0      0.872     1e-119
356542889  236     PREDICTED: protein ULTRAPETALA 1-like [G  1.0      1.0      0.877     1e-119
225447249  235     PREDICTED: protein ULTRAPETALA 1 [Vitis   0.944    0.948    0.910     1e-118
255637610  236     unknown [Glycine max]                     1.0      1.0      0.868     1e-117
359807494  237     uncharacterized protein LOC100815782 [Gl  1.0      0.995    0.864     1e-116
449438068  235     PREDICTED: protein ULTRAPETALA 1-like [C  0.995    1.0      0.822     1e-112
388500952  233     unknown [Lotus japonicus]                 0.983    0.995    0.795     1e-107
388510756  233     unknown [Lotus japonicus]                 0.983    0.995    0.787     1e-106
297803264  237     hypothetical protein ARALYDRAFT_491956 [  0.995    0.991    0.731     1e-101
>gi|224126749|ref|XP_002319917.1| predicted protein [Populus trichocarpa] gi|118481891|gb|ABK92881.1| unknown [Populus trichocarpa] gi|222858293|gb|EEE95840.1| predicted protein [Populus trichocarpa]
 Score =  435 bits (1119), Expect = e-120,   Method: Compositional matrix adjust.
 Identities = 208/236 (88%), Positives = 222/236 (94%), Gaps = 1/236 (0%)

Query: 1   MANGVEKESVLMMFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEI 60
           MANG E+E VL+ FS+EELRE+SGVKRGGDYIEV CGCTSHRYGDAVGRLRVF NG+LEI
Sbjct: 1   MANGAEREMVLL-FSEEELREMSGVKRGGDYIEVMCGCTSHRYGDAVGRLRVFINGELEI 59

Query: 61  TCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASKH 120
           TCECTPGC+EDKMTP AFEKHSGRETARKWKNNVWVI NGEKVPL KTVLLKYYNQASK+
Sbjct: 60  TCECTPGCDEDKMTPAAFEKHSGRETARKWKNNVWVIVNGEKVPLVKTVLLKYYNQASKN 119

Query: 121 GNGSHRSHNGRVCHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKI 180
           GNGSHRSHNGRVCHRDEFVRC +CNKERRFRLR+KEEC IHH+ALAD NWKC+D+PYDKI
Sbjct: 120 GNGSHRSHNGRVCHRDEFVRCNKCNKERRFRLRSKEECRIHHDALADANWKCADMPYDKI 179

Query: 181 TCDDEEERASRRVYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAKT 236
           TCDD+EERASRRVYRGC RSPTCKGCTSCVCFGC+ICRFSDCSCQTC DFTRNAK 
Sbjct: 180 TCDDDEERASRRVYRGCTRSPTCKGCTSCVCFGCEICRFSDCSCQTCTDFTRNAKV 235




Source: Populus trichocarpa

Species: Populus trichocarpa

Genus: Populus

Family: Salicaceae

Order: Malpighiales

Class:

Phylum: Streptophyta

Superkingdom: Eukaryota

>gi|224127015|ref|XP_002329362.1| predicted protein [Populus trichocarpa] gi|222870412|gb|EEF07543.1| predicted protein [Populus trichocarpa]
>gi|356542889|ref|XP_003539897.1| PREDICTED: protein ULTRAPETALA 1-like [Glycine max]
>gi|225447249|ref|XP_002278664.1| PREDICTED: protein ULTRAPETALA 1 [Vitis vinifera] gi|147771900|emb|CAN75705.1| hypothetical protein VITISV_031418 [Vitis vinifera] gi|297739267|emb|CBI28918.3| unnamed protein product [Vitis vinifera]
>gi|255637610|gb|ACU19130.1| unknown [Glycine max]
>gi|359807494|ref|NP_001241143.1| uncharacterized protein LOC100815782 [Glycine max] gi|255636902|gb|ACU18784.1| unknown [Glycine max]
>gi|449438068|ref|XP_004136812.1| PREDICTED: protein ULTRAPETALA 1-like [Cucumis sativus] gi|449493082|ref|XP_004159188.1| PREDICTED: protein ULTRAPETALA 1-like [Cucumis sativus]
>gi|388500952|gb|AFK38542.1| unknown [Lotus japonicus]
>gi|388510756|gb|AFK43444.1| unknown [Lotus japonicus]
>gi|297803264|ref|XP_002869516.1| hypothetical protein ARALYDRAFT_491956 [Arabidopsis lyrata subsp. lyrata] gi|297315352|gb|EFH45775.1| hypothetical protein ARALYDRAFT_491956 [Arabidopsis lyrata subsp. lyrata]
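
Several of the definition lines above merge more than one database record, each introduced by a `gi|number|db|accession|` group. A rough Python sketch for pulling those groups apart, assuming the pipe-delimited layout seen in this listing. `gi_records` and the regular expression are illustrative, not part of any BLAST tooling:

```python
import re

# One group per merged record: gi|<number>|<database tag>|<accession>|
GI_ENTRY = re.compile(r"gi\|(\d+)\|([a-z]+)\|([^|\s]+)\|")

def gi_records(defline):
    """Return (gi, database tag, accession) for each gi|...| group in a definition line."""
    return GI_ENTRY.findall(defline)

header = (">gi|224127015|ref|XP_002329362.1| predicted protein [Populus trichocarpa] "
          "gi|222870412|gb|EEF07543.1| predicted protein [Populus trichocarpa]")
records = gi_records(header)
for gi, db, accession in records:
    print(gi, db, accession)
```

This only covers the tags that appear in the listing (`ref`, `gb`, `emb`); other identifier layouts would need their own pattern.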

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                  Length  Definition                      Q cover  H cover  Identity  E-value
Query               236
TAIR|locus:2827502  228     ULT2 "ULTRAPETALA 2" [Arabidop  0.919    0.951    0.695     7.3e-91
TAIR|locus:2827502 ULT2 "ULTRAPETALA 2" [Arabidopsis thaliana (taxid:3702)]
 Score = 906 (324.0 bits), Expect = 7.3e-91, P = 7.3e-91
 Identities = 155/223 (69%), Positives = 187/223 (83%)

Query:    13 MFSDEELREISGVKRGGDYIEVTCGCTSHRYGDAVGRLRVFSNGDLEITCECTPGCNEDK 72
             +FS EEL+EISGV  G DY+EV CGCTSHRYGDAV RL++FS+G+L+ITC+CTP C EDK
Sbjct:    10 LFSKEELQEISGVHVGDDYVEVMCGCTSHRYGDAVARLKIFSDGELQITCQCTPACLEDK 69

Query:    73 MTPGAFEKHSGRETARKWKNNVWVIANGEKVPLSKTVLLKYYNQASKHGNGSHRSHNGRV 132
             +TP AFEKHS RET+R W+NNVWV   G+KVPLSKTVLL+YYN+A K+ N S      +V
Sbjct:    70 LTPAAFEKHSERETSRNWRNNVWVFIEGDKVPLSKTVLLRYYNKALKNSNVS------KV 123

Query:   133 CHRDEFVRCARCNKERRFRLRTKEECLIHHNALADKNWKCSDLPYDKITCDDEEERASRR 192
              HRDEFV C+ C KERRFRLR++ EC +HH+A+A+ NWKC D PYDKITC++EEER SR+
Sbjct:   124 IHRDEFVGCSTCGKERRFRLRSRGECRMHHDAIAEPNWKCCDYPYDKITCEEEEERGSRK 183

Query:   193 VYRGCIRSPTCKGCTSCVCFGCDICRFSDCSCQTCIDFTRNAK 235
             V+RGC RSP+CKGCTSCVCFGC +CRFSDC+CQTC+DFT NAK
Sbjct:   184 VFRGCTRSPSCKGCTSCVCFGCKLCRFSDCNCQTCLDFTTNAK 226


Parameters:
  V=100
  filter=SEG
  E=0.001

  ctxfactor=1.00

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   +0      0   BLOSUM62        0.322   0.136   0.449    same    same    same
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S W   T  X   E2     S2
   +0      0      236       236   0.00089  113 3  11 22  0.39    33
                                                     32  0.43    36


Statistics:

  Database:  /share/blast/go-seqdb.fasta
   Title:  go_20130330-seqdb.fasta
   Posted:  5:47:42 AM PDT Apr 1, 2013
   Created:  5:47:42 AM PDT Apr 1, 2013
   Format:  XDF-1
   # of letters in database:  169,044,731
   # of sequences in database:  368,745
   # of database sequences satisfying E:  1
  No. of states in DFA:  613 (65 KB)
  Total size of DFA:  230 KB (2123 KB)
  Time to generate neighborhood:  0.00u 0.00s 0.00t   Elapsed:  00:00:00
  No. of threads or processors used:  24
  Search cpu time:  20.33u 0.43s 20.76t   Elapsed:  00:00:02
  Total cpu time:  20.33u 0.43s 20.76t   Elapsed:  00:00:02
  Start:  Fri May 10 21:28:28 2013   End:  Fri May 10 21:28:30 2013


GO:0003677 "DNA binding" evidence=IEA
GO:0005634 "nucleus" evidence=IEA;IDA
GO:0008150 "biological_process" evidence=ND
GO:0005829 "cytosol" evidence=IDA
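
The GO assignments above follow a regular `ID "name" evidence=CODE[;CODE]` layout, so they can be read into records with a short regular expression. A minimal Python sketch, assuming that layout holds for every line. `parse_go_line` is a hypothetical helper name:

```python
import re

# GO ID, quoted term name, then one or more evidence codes separated by ';'
GO_LINE = re.compile(r'^(GO:\d{7})\s+"([^"]*)"\s+evidence=(\S+)$')

def parse_go_line(line):
    """Parse one GO annotation line; return None if it does not match the layout."""
    match = GO_LINE.match(line.strip())
    if match is None:
        return None
    go_id, name, evidence = match.groups()
    return {"id": go_id, "name": name, "evidence": evidence.split(";")}

lines = [
    'GO:0003677 "DNA binding" evidence=IEA',
    'GO:0005634 "nucleus" evidence=IEA;IDA',
    'GO:0008150 "biological_process" evidence=ND',
    'GO:0005829 "cytosol" evidence=IDA',
]
terms = [parse_go_line(line) for line in lines]
# Terms that carry direct-assay (IDA) evidence
ida_terms = [t["name"] for t in terms if "IDA" in t["evidence"]]
print(ida_terms)
```

Splitting the evidence field makes it straightforward to separate electronically inferred terms (IEA) from those backed by a direct assay (IDA).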

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name        Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q8GZA8  ULT1_ARATH  No assigned EC number  0.7555    0.9533          0.9493        yes       no

EC Number Prediction by EzyPred Server

Failed to connect to EzyPred server

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins

Functionally Associated Proteins Detected by STRING

Failed to connect to STRING server


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       236
PF01342     82      SAND: SAND domain; InterPro: IPR000770 The SAND do  97.12
smart00258  73      SAND SAND domain.                                   95.93
PF07496     50      zf-CW: CW-type Zinc Finger; InterPro: IPR011124 Zi  91.91
>PF01342 SAND: SAND domain; InterPro: IPR000770 The SAND domain (named after Sp100, AIRE-1, NucP41/75, DEAF-1) is a conserved ~80 residue region found in a number of nuclear proteins, many of which function in chromatin-dependent transcriptional control
Probab=97.12  E-value=8.6e-05  Score=56.27  Aligned_cols=55  Identities=36%  Similarity=0.746  Sum_probs=36.9

Q ss_pred             CceEEEcccccceeecCcceeeEE--e-eCC--eeeEeeecCCCCCCCCCChhhhhhccccccccccccceEE
Q 026596           29 GDYIEVTCGCTSHRYGDAVGRLRV--F-SNG--DLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWV   96 (236)
Q Consensus        29 edyvEV~CGcTs~rYGD~VGRLri--~-~~G--~~~ItCeC~PgC~~dkltp~aFEkHs~reta~kWkn~vWV   96 (236)
                      +.-|.|+||       +..|.|-.  + ..|  +-=|.++      ..-+||.+||+|+|++++.+||..|-|
T Consensus         8 ~~~lpVtCG-------~~~G~L~~~k~~~~g~~~kCI~~~------g~~~TP~eFE~~~G~~~sK~WK~SIr~   67 (82)
T PF01342_consen    8 DPELPVTCG-------DVKGTLYKKKFVKQGICGKCIQCE------GRWFTPSEFERHGGKGSSKDWKRSIRC   67 (82)
T ss_dssp             CSEEEEEET-------TEEEEEEHHHH-TTGTTSS-EEET------TEEE-HHHHHHHHTTCTCS-HHHHSEE
T ss_pred             CCeEeeEeC-------CeEEEEEHHHhhcccccCceEeeC------CcEECHHHHHhhcCcccCCCCCccEEE
Confidence            567999996       44555542  2 222  1123222      567999999999999999999999998
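
Each HHsearch alignment block opens with a `key=value` summary line (Probab, E-value, Score and so on). A small Python sketch for turning that line into numbers, assuming whitespace-separated `key=value` tokens as printed above. `parse_hh_summary` is a hypothetical helper, not part of the HH-suite tools:

```python
def parse_hh_summary(line):
    """Parse 'key=value' tokens from an HHsearch-style summary line into floats."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            # Percentages such as 'Identities=36%' are stored as plain numbers
            fields[key] = float(value.rstrip("%"))
    return fields

# Summary line copied from the PF01342 (SAND domain) hit above
summary = ("Probab=97.12  E-value=8.6e-05  Score=56.27  Aligned_cols=55  "
           "Identities=36%  Similarity=0.746  Sum_probs=36.9")
hit = parse_hh_summary(summary)
print(hit["Probab"], hit["E-value"], hit["Aligned_cols"])
```

Parsed this way, hits can be filtered or sorted by probability or E-value without relying on the summary table.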



These include proteins linked to various human diseases, such as the Sp100 (Speckled protein 100 kDa), NUDR (Nuclear DEAF-1 related), GMEB (Glucocorticoid Modulatory Element Binding) proteins and AIRE-1 (Autoimmune regulator 1) proteins. Proteins containing the SAND domain have a modular structure; the SAND domain can be associated with a number of other modules, including the bromodomain, the PHD finger and the MYND finger. Because no SAND domain has been found in yeast, it is thought that the SAND domain could be restricted to animal phyla. Many SAND domain-containing proteins, including NUDR, DEAF-1 (Deformed epidermal autoregulatory factor-1) and GMEB, have been shown to bind DNA sequences specifically. The SAND domain has been proposed to mediate the DNA binding activity of these proteins. The resolution of the 3D structure of the SAND domain from Sp100b has revealed that it consists of a novel alpha/beta fold. The SAND domain adopts a compact fold consisting of a strongly twisted, five-stranded antiparallel beta-sheet with four alpha-helices packing against one side of the beta-sheet. The opposite side of the beta-sheet is solvent exposed. The beta-sheet and alpha-helical parts of the structure form two distinct regions. Multiple hydrophobic residues pack between these regions to form a structural core. A conserved KDWK sequence motif is found within the alpha-helical, positively charged surface patch. The DNA binding surface has been mapped to the alpha-helical region encompassing the KDWK motif. GO: 0003677 DNA binding, 0005634 nucleus; PDB: 1OQJ_B 1UFN_A 1H5P_A.

>smart00258 SAND SAND domain
>PF07496 zf-CW: CW-type Zinc Finger; InterPro: IPR011124 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   236
1oqj_A  97      Glucocorticoid modulatory element binding protein-  92.81
1h5p_A  95      Nuclear autoantigen SP100-B; transcription, DNA bi  91.4
1ufn_A  94      Putative nuclear protein homolog 5830484A20RIK; SA  90.75
2e61_A  69      Zinc finger CW-type PWWP domain protein 1; ZF-CW d  90.48
2l7p_A  100     Histone-lysine N-methyltransferase ASHH2; CW-domai  83.49
>1oqj_A Glucocorticoid modulatory element binding protein-1; SAND domain, alpha-beta fold, KDWK motif, zinc-binding motif, DNA binding protein; 1.55A {Homo sapiens} SCOP: d.217.1.1
Probab=92.81  E-value=0.019  Score=44.69  Aligned_cols=44  Identities=20%  Similarity=0.537  Sum_probs=31.8

Q ss_pred             CCChhhhhhccccccccccccceEEEeCCccccc---chhhHHHHhhhhhc
Q 026596           72 KMTPGAFEKHSGRETARKWKNNVWVIANGEKVPL---SKTVLLKYYNQASK  119 (236)
Q Consensus        72 kltp~aFEkHs~reta~kWkn~vWV~~~~~kvpL---~kT~LlkyY~~~~~  119 (236)
                      =+||.+||..+||++++.||..|=+  +|  .||   -+--+|.+|.|...
T Consensus        42 w~TP~EFe~~~gk~~sKdWK~sIR~--~G--~~L~~Lme~g~L~~~~h~~~   88 (97)
T 1oqj_A           42 LISPKHFVHLAGKSTLKDWKRAIRL--GG--IMLRKMMDSGQIDFYQHDKV   88 (97)
T ss_dssp             EECHHHHHHHTTCGGGSCHHHHSEE--TT--EEHHHHHHTTSSCCTTTTTC
T ss_pred             EEChHHHhhhcCcCCCCCcchheEE--CC--eEHHHHHHCCcccccCcCCc
Confidence            4799999999999999999999853  44  333   23455666666543



>1h5p_A Nuclear autoantigen SP100-B; transcription, DNA binding, SAND domain, KDWK, nuclear protein, alternative splicing; NMR {Homo sapiens} SCOP: d.217.1.1
>1ufn_A Putative nuclear protein homolog 5830484A20RIK; SAND domain, KDWK motif, structural genomics, riken structural genomics/proteomics initiative, RSGI; NMR {Mus musculus} SCOP: d.217.1.1
>2e61_A Zinc finger CW-type PWWP domain protein 1; ZF-CW domain, structural genomics, NPPSFA, national project protein structural and functional analyses; NMR {Homo sapiens} PDB: 2rr4_A*
>2l7p_A Histone-lysine N-methyltransferase ASHH2; CW-domain; NMR {Arabidopsis thaliana}

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                          E-value
Query    236
d1ufna_  94      d.217.1.1 (A:) Putative nuclear protein homolog 58  4e-07
d1h5pa_  95      d.217.1.1 (A:) Nuclear autoantigen Sp100b {Human (  0.003
>d1ufna_ d.217.1.1 (A:) Putative nuclear protein homolog 5830484a20rik {Mouse (Mus musculus) [TaxId: 10090]} Length = 94

class: Alpha and beta proteins (a+b)
fold: SAND domain-like
superfamily: SAND domain-like
family: SAND domain
domain: Putative nuclear protein homolog 5830484a20rik
species: Mouse (Mus musculus) [TaxId: 10090]
 Score = 44.8 bits (106), Expect = 4e-07
 Identities = 17/65 (26%), Positives = 24/65 (36%), Gaps = 12/65 (18%)

Query: 32 IEVTCGCTSHRYGDAVGRL--RVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARK 89
          + VTCG        A G L       G    + +C      D +T   F    GR T++ 
Sbjct: 17 LPVTCG-------KAKGTLFQEKLKQG---ASKKCIQNEAGDWLTVKEFLNEGGRATSKD 66

Query: 90 WKNNV 94
          WK  +
Sbjct: 67 WKGVI 71


>d1h5pa_ d.217.1.1 (A:) Nuclear autoantigen Sp100b {Human (Homo sapiens) [TaxId: 9606]} Length = 95

Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    236
d1ufna_  94      Putative nuclear protein homolog 5830484a20rik {Mo  97.96
d1oqja_  90      Glucocorticoid modulatory element binding protein-  94.12
d1h5pa_  95      Nuclear autoantigen Sp100b {Human (Homo sapiens) [  91.83
>d1ufna_ d.217.1.1 (A:) Putative nuclear protein homolog 5830484a20rik {Mouse (Mus musculus) [TaxId: 10090]}
class: Alpha and beta proteins (a+b)
fold: SAND domain-like
superfamily: SAND domain-like
family: SAND domain
domain: Putative nuclear protein homolog 5830484a20rik
species: Mouse (Mus musculus) [TaxId: 10090]
Probab=97.96  E-value=2.7e-07  Score=69.65  Aligned_cols=57  Identities=30%  Similarity=0.551  Sum_probs=46.7

Q ss_pred             ceEEEcccccceeecCcceee--EEeeCCeeeEeeecCCCCCCCCCChhhhhhccccccccccccceEE
Q 026596           30 DYIEVTCGCTSHRYGDAVGRL--RVFSNGDLEITCECTPGCNEDKMTPGAFEKHSGRETARKWKNNVWV   96 (236)
Q Consensus        30 dyvEV~CGcTs~rYGD~VGRL--ri~~~G~~~ItCeC~PgC~~dkltp~aFEkHs~reta~kWkn~vWV   96 (236)
                      .-|.|+|       |+..|.|  +.|..|   |.+.|+.-.....+||.+||+|+|++++.+||..|-+
T Consensus        15 ~~LpVtC-------G~~~G~L~~~kf~~G---~~~kCI~~~~g~w~TP~eFe~~~gk~~~K~WK~sIr~   73 (94)
T d1ufna_          15 PTLPVTC-------GKAKGTLFQEKLKQG---ASKKCIQNEAGDWLTVKEFLNEGGRATSKDWKGVIRC   73 (94)
T ss_dssp             SEEEEEE-------TTEEEEEEHHHHHSC---TTSCCEECTTCCEECHHHHHHHHTCTTCSCHHHHCEE
T ss_pred             CceeeEe-------CCcEEEEEHhHccCC---ceecceEeCCCcEECHHHHHHhcCccccCCCcccEEE
Confidence            4689999       6788888  444566   5677766555678999999999999999999999976



>d1oqja_ d.217.1.1 (A:) Glucocorticoid modulatory element binding protein-1 (Gmeb1) {Human (Homo sapiens) [TaxId: 9606]}
>d1h5pa_ d.217.1.1 (A:) Nuclear autoantigen Sp100b {Human (Homo sapiens) [TaxId: 9606]}