Psyllid ID: psy9365


Local Sequence Feature Prediction

Prediction (Method) and Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------10
MPGQFTGSTEVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRRGLVSKEELPPQRKNPYHVQQ
cccccccccccccccHHHHHcccHHHHHHHHHHccccccccHHHHHHHHcccccccccccccccccccEEEEEcccccEEEEcccccccccccccccc
cccccccccccEEccHHHHccccccEEEHHHHHccccccccHHHHHHHHccccccccccccccccEEEEEEEEEccccccccccccccHccccccccc
mpgqftgstevriltpqelAAGSQAWARKEQFVeaapvsggmgMHLLQKmgwqpgeglgknkegtvqplsldikfdrrglvskeelppqrknpyhvqq
mpgqftgstevriltpQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDrrglvskeelppqrknpyhvqq
MPGQFTGSTEVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRRGLVSKEELPPQRKNPYHVQQ
*************************WARKEQFVEAAPVSGGMGMHLLQKMGW**********************************************
*P**************************************GMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRR********************
*********EVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRRGLVSKEELPPQRKNPYHVQQ
**GQFTGSTEVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDR**L******************
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhhoooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MPGQFTGSTEVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRRGLVSKEELPPQRKNPYHVQQ
no confident homologs detected
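
The per-residue tracks above are easier to compare as residue ranges than as raw strings. The following is a minimal sketch, assuming each track is a plain string with one character per residue; the helper name `runs` and the toy tracks are illustrative and do not come from the report, and the meaning of each symbol is defined by the individual predictor.

```python
from itertools import groupby


def runs(track, symbols):
    """Return 1-based inclusive (start, end) spans where the track
    character is one of `symbols`."""
    spans = []
    pos = 1
    for is_hit, group in groupby(track, key=lambda ch: ch in symbols):
        n = len(list(group))
        if is_hit:
            spans.append((pos, pos + n - 1))
        pos += n
    return spans


# Toy tracks (hypothetical, much shorter than the 98-residue query).
toy_tm_track = "iiii" + "hhhhhh" + "ooooo"
toy_ss_track = "ccHHHccEEc"

print(runs(toy_tm_track, "h"))   # [(5, 10)]
print(runs(toy_ss_track, "HE"))  # [(3, 5), (8, 9)]
```

Applied to a real track, the same call would be made with the full 98-character string pasted from the report, for example `runs(track, "h")` for a transmembrane-helix track or `runs(track, "H")` for helices in a secondary-structure track.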

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   98      2.2.26 [Sep-21-2011]
P18583  2426    Protein SON OS=Homo sapie  yes       N/A       0.836    0.033    0.614     3e-24
Q9QX47  2444    Protein SON OS=Mus muscul  yes       N/A       0.836    0.033    0.614     3e-24
Q6DGZ0  262     Coiled-coil domain-contai  no        N/A       0.530    0.198    0.442     0.0001
Q28H71  514     Zinc finger CCCH-type wit  no        N/A       0.520    0.099    0.431     0.0007
Q6C233  812     Protein SQS1 OS=Yarrowia   yes       N/A       0.397    0.048    0.461     0.0009
>sp|P18583|SON_HUMAN Protein SON OS=Homo sapiens GN=SON PE=1 SV=4
 Score =  110 bits (274), Expect = 3e-24,   Method: Compositional matrix adjust.
 Identities = 51/83 (61%), Positives = 69/83 (83%), Gaps = 1/83 (1%)

Query: 1    MPGQFTGSTEVRILTPQELA-AGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLG 59
            +PGQFTGST V++LT ++LA  G+QAW +K+QF+ AAPV+GGMG  L++KMGW+ GEGLG
Sbjct: 2266 IPGQFTGSTGVQVLTQEQLANTGAQAWIKKDQFLRAAPVTGGMGAVLMRKMGWREGEGLG 2325

Query: 60   KNKEGTVQPLSLDIKFDRRGLVS 82
            KNKEG  +P+ +D K DR+GLV+
Sbjct: 2326 KNKEGNKEPILVDFKTDRKGLVA 2348




RNA-binding protein that acts as an mRNA splicing cofactor by promoting efficient splicing of transcripts that possess weak splice sites. Specifically promotes splicing of many cell-cycle and DNA-repair transcripts that possess weak splice sites, such as TUBG1, KATNB1, TUBGCP2, AURKB, PCNT, AKT1, RAD23A, and FANCG. Probably acts by facilitating the interaction between serine/arginine-rich proteins such as SRSF2 and RNA polymerase II. Also binds to DNA; binds to the consensus DNA sequence: 5'-GA[GT]AN[CG][AG]CC-3'. May indirectly repress hepatitis B virus (HBV) core promoter activity and transcription of HBV genes and production of HBV virions.
Homo sapiens (taxid: 9606)
>sp|Q9QX47|SON_MOUSE Protein SON OS=Mus musculus GN=Son PE=1 SV=2
>sp|Q6DGZ0|CCD75_DANRE Coiled-coil domain-containing protein 75 OS=Danio rerio GN=ccdc75 PE=2 SV=1
>sp|Q28H71|ZGPAT_XENTR Zinc finger CCCH-type with G patch domain-containing protein OS=Xenopus tropicalis GN=zgpat PE=2 SV=1
>sp|Q6C233|SQS1_YARLI Protein SQS1 OS=Yarrowia lipolytica (strain CLIB 122 / E 150) GN=SQS1 PE=3 SV=1
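
The report does not state how the Q cover, H cover and Identity columns are derived. The sketch below is an assumption that happens to reproduce the tabulated values for the P18583 alignment shown above: the aligned query span (residues 1 to 82) is divided by the query length and by the hit length, identity is identical positions over alignment length, and each value is truncated (not rounded) to three decimals. The function names `truncate` and `alignment_metrics` are illustrative.

```python
import math


def truncate(x, digits=3):
    """Drop digits beyond `digits` decimals without rounding up."""
    factor = 10 ** digits
    return math.floor(x * factor) / factor


def alignment_metrics(q_start, q_end, identities, align_len, query_len, hit_len):
    """Coverage and identity fractions under the assumptions stated above."""
    q_span = q_end - q_start + 1  # aligned query residues, inclusive
    return {
        "q_cover": truncate(q_span / query_len),
        "h_cover": truncate(q_span / hit_len),   # assumed: query span over hit length
        "identity": truncate(identities / align_len),
    }


# P18583 (SON_HUMAN): Query 1..82, Identities = 51/83, query 98 aa, hit 2426 aa
print(alignment_metrics(1, 82, 51, 83, 98, 2426))
# {'q_cover': 0.836, 'h_cover': 0.033, 'identity': 0.614}
```

The same arithmetic gives 0.877, 0.106 and 0.662 for a hit of length 808 aligned over query residues 1 to 86 with 57/86 identities, which matches the first non-redundant database hit reported further down. That agreement supports the assumed formulas but does not confirm them.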

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      98
332017864  808     Protein SON [Acromyrmex echinatior]       0.877    0.106    0.662     2e-29
345494516  930     PREDICTED: hypothetical protein LOC10011  0.877    0.092    0.697     2e-29
357623380  638     hypothetical protein KGM_22318 [Danaus p  0.897    0.137    0.704     3e-29
242014081  865     conserved hypothetical protein [Pediculu  0.908    0.102    0.617     5e-26
91093789   626     PREDICTED: similar to SON DNA-binding pr  0.857    0.134    0.642     5e-26
195330360  899     GM23818 [Drosophila sechellia] gi|194120  0.948    0.103    0.585     1e-25
195572230  899     GD18627 [Drosophila simulans] gi|1942000  0.948    0.103    0.585     1e-25
24645429   874     CG8273, isoform A [Drosophila melanogast  0.948    0.106    0.585     1e-25
25009833   886     AT18855p [Drosophila melanogaster]        0.948    0.104    0.585     1e-25
193627224  732     PREDICTED: hypothetical protein LOC10016  0.897    0.120    0.602     1e-25
>gi|332017864|gb|EGI58524.1| Protein SON [Acromyrmex echinatior]
 Score =  132 bits (333), Expect = 2e-29,   Method: Compositional matrix adjust.
 Identities = 57/86 (66%), Positives = 76/86 (88%)

Query: 1   MPGQFTGSTEVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGK 60
           +PGQFTGST V++LTP EL++G QAWARK+Q V A P+SGGMGM LLQKMGW+PGEGLGK
Sbjct: 628 IPGQFTGSTGVKVLTPAELSSGYQAWARKDQLVSAQPLSGGMGMALLQKMGWRPGEGLGK 687

Query: 61  NKEGTVQPLSLDIKFDRRGLVSKEEL 86
           NKEG ++PL L++K D++GL+S++++
Sbjct: 688 NKEGALEPLQLEVKLDKKGLISEQDI 713




Source: Acromyrmex echinatior

Species: Acromyrmex echinatior

Genus: Acromyrmex

Family: Formicidae

Order: Hymenoptera

Class: Insecta

Phylum: Arthropoda

Superkingdom: Eukaryota

>gi|345494516|ref|XP_001601978.2| PREDICTED: hypothetical protein LOC100117848 isoform 1 [Nasonia vitripennis]
>gi|357623380|gb|EHJ74558.1| hypothetical protein KGM_22318 [Danaus plexippus]
>gi|242014081|ref|XP_002427726.1| conserved hypothetical protein [Pediculus humanus corporis] gi|212512167|gb|EEB14988.1| conserved hypothetical protein [Pediculus humanus corporis]
>gi|91093789|ref|XP_967550.1| PREDICTED: similar to SON DNA-binding protein [Tribolium castaneum]
>gi|195330360|ref|XP_002031872.1| GM23818 [Drosophila sechellia] gi|194120815|gb|EDW42858.1| GM23818 [Drosophila sechellia]
>gi|195572230|ref|XP_002104099.1| GD18627 [Drosophila simulans] gi|194200026|gb|EDX13602.1| GD18627 [Drosophila simulans]
>gi|24645429|ref|NP_649914.1| CG8273, isoform A [Drosophila melanogaster] gi|442618276|ref|NP_001262426.1| CG8273, isoform B [Drosophila melanogaster] gi|7299212|gb|AAF54409.1| CG8273, isoform A [Drosophila melanogaster] gi|440217260|gb|AGB95808.1| CG8273, isoform B [Drosophila melanogaster]
>gi|25009833|gb|AAN71087.1| AT18855p [Drosophila melanogaster]
>gi|193627224|ref|XP_001952647.1| PREDICTED: hypothetical protein LOC100161769 [Acyrthosiphon pisum]
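
Each expanded hit in this report carries a statistics line of the form `Identities = 57/86 (66%), Positives = 76/86 (88%)`. The following is a minimal sketch for pulling the counts out of such a line with a regular expression; the function name `parse_stats` is illustrative, and the pattern only covers the three fields that appear in this report.

```python
import re

# Matches e.g. "Identities = 57/86", ignoring the trailing "(66%)".
STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_stats(line):
    """Return {field: (count, alignment_length)} for one statistics line."""
    return {name.lower(): (int(n), int(d)) for name, n, d in STAT.findall(line)}


line = " Identities = 57/86 (66%), Positives = 76/86 (88%)"
stats = parse_stats(line)
print(stats)  # {'identities': (57, 86), 'positives': (76, 86)}

count, length = stats["identities"]
print(f"{count / length:.3f}")  # 0.663
```

Note that rounding 57/86 gives 0.663, while the table above lists 0.662, which suggests the report truncates rather than rounds its fractions.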

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                Length  Definition                      Q cover  H cover  Identity  E-value
Query             98
FB|FBgn0037716    886     CG8273 [Drosophila melanogaste  0.948    0.104    0.585     5e-26
UNIPROTKB|E1BX00  484     E1BX00 "Uncharacterized protei  1.0      0.202    0.565     1.3e-25
UNIPROTKB|J3QSZ5  454     SON "Protein SON" [Homo sapien  0.959    0.207    0.557     9.3e-25
UNIPROTKB|H7C1M2  1454    SON "Protein SON" [Homo sapien  0.959    0.064    0.557     1.5e-23
UNIPROTKB|F1MYN8  2136    SON "Uncharacterized protein"   0.959    0.044    0.557     2.4e-23
UNIPROTKB|I3L9C9  2414    SON "Uncharacterized protein"   0.959    0.038    0.557     2.8e-23
UNIPROTKB|E2RRQ0  2415    SON "Uncharacterized protein"   0.959    0.038    0.557     2.8e-23
UNIPROTKB|P18583  2426    SON "Protein SON" [Homo sapien  0.959    0.038    0.557     2.8e-23
UNIPROTKB|E2RRP5  2431    SON "Uncharacterized protein"   0.959    0.038    0.557     2.8e-23
MGI|MGI:98353     2444    Son "Son DNA binding protein"   0.959    0.038    0.557     2.9e-23
FB|FBgn0037716 CG8273 [Drosophila melanogaster (taxid:7227)]
 Score = 305 (112.4 bits), Expect = 5.0e-26, P = 5.0e-26
 Identities = 55/94 (58%), Positives = 72/94 (76%)

Query:     1 MPGQFTGSTEVRILTPQELAAGSQAWARKEQFVEAAPVSGGMGMHLLQKMGWQPGEGLGK 60
             +PGQFTGST  +++   EL +G Q W RK+Q     PV+GGMGM LLQKMGW+PGEGLG+
Sbjct:   679 LPGQFTGSTGAQVMKAHELNSGPQLWVRKDQMTSTKPVTGGMGMALLQKMGWKPGEGLGR 738

Query:    61 NKEGTVQPLSLDIKFDRRGLVSKEEL-PPQRKNP 93
              K G++QPL LD+K D+RGLVS+++L PPQ + P
Sbjct:   739 CKTGSLQPLLLDVKLDKRGLVSRDDLRPPQMRAP 772




GO:0003725 "double-stranded RNA binding" evidence=IEA;ISS
UNIPROTKB|E1BX00 E1BX00 "Uncharacterized protein" [Gallus gallus (taxid:9031)]
UNIPROTKB|J3QSZ5 SON "Protein SON" [Homo sapiens (taxid:9606)]
UNIPROTKB|H7C1M2 SON "Protein SON" [Homo sapiens (taxid:9606)]
UNIPROTKB|F1MYN8 SON "Uncharacterized protein" [Bos taurus (taxid:9913)]
UNIPROTKB|I3L9C9 SON "Uncharacterized protein" [Sus scrofa (taxid:9823)]
UNIPROTKB|E2RRQ0 SON "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|P18583 SON "Protein SON" [Homo sapiens (taxid:9606)]
UNIPROTKB|E2RRP5 SON "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
MGI|MGI:98353 Son "Son DNA binding protein" [Mus musculus (taxid:10090)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name       Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q9QX47  SON_MOUSE  No assigned EC number  0.6144    0.8367          0.0335        yes       N/A
P18583  SON_HUMAN  No assigned EC number  0.6144    0.8367          0.0338        yes       N/A

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       98
pfam01585   45      pfam01585, G-patch, G-patch domain                  1e-12
smart00443  47      smart00443, G_patch, glycine rich nucleic binding   2e-11
pfam12656   79      pfam12656, G-patch_2, DExH-box splicing factor bin  2e-06
>gnl|CDD|144978 pfam01585, G-patch, G-patch domain
 Score = 56.4 bits (137), Expect = 1e-12
 Identities = 22/42 (52%), Positives = 31/42 (73%)

Query: 40 GGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRRGLV 81
            +G  LLQKMGW+PG+GLGKN++G  +P+   I+ DR+GL 
Sbjct: 2  SNIGFKLLQKMGWKPGQGLGKNEQGITEPIEAKIRPDRKGLG 43


This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines. Length = 45

>gnl|CDD|197727 smart00443, G_patch, glycine rich nucleic binding domain
>gnl|CDD|221692 pfam12656, G-patch_2, DExH-box splicing factor binding site

Conserved Domains Detected by HHsearch

ID                 Length  Definition                                          Probability
Query              98
PF01585            45      G-patch: G-patch domain; InterPro: IPR000467 The D  99.69
smart00443         47      G_patch glycine rich nucleic binding domain. A pre  99.58
KOG2809|consensus  326                                                         99.3
KOG0965|consensus  988                                                         99.16
KOG2184|consensus  767                                                         99.12
PF12656            77      G-patch_2: DExH-box splicing factor binding site    98.96
KOG2384|consensus  223                                                         98.89
KOG2185|consensus  486                                                         98.88
KOG3673|consensus  845                                                         98.64
KOG1996|consensus  378                                                         98.46
KOG1994|consensus  268                                                         98.28
KOG0154|consensus  573                                                         98.25
KOG4315|consensus  455                                                         97.74
KOG4368|consensus  757                                                         97.2
KOG2138|consensus  883                                                         96.95
KOG1994|consensus  268                                                         95.32
>PF01585 G-patch: G-patch domain; InterPro: IPR000467 The D111/G-patch domain [] is a short conserved region of about 40 amino acids which occurs in a number of putative RNA-binding proteins, including tumor suppressor and DNA-damage-repair proteins, suggesting that this domain may have an RNA binding function.
Probab=99.69  E-value=2.9e-17  Score=96.40  Aligned_cols=45  Identities=44%  Similarity=0.982  Sum_probs=43.0

Q ss_pred             CCcHHHHHHHhcCCCCCCCCCCCCCCccccceeeeecCCceeeec
Q psy9365          39 SGGMGMHLLQKMGWQPGEGLGKNKEGTVQPLSLDIKFDRRGLVSK   83 (98)
Q Consensus        39 ~~~~G~kmL~kmGw~~G~GLGk~~qGi~~PI~~~~k~~~~GLG~~   83 (98)
                      +++||++||++|||++|+|||++.+||++||++..+.++.|||++
T Consensus         1 t~~~g~~lm~kmGw~~G~GLGk~~~G~~~pi~~~~~~~~~GlG~~   45 (45)
T PF01585_consen    1 TSSIGFKLMKKMGWKPGQGLGKNGQGIAEPIEVKKKKDRKGLGAE   45 (45)
T ss_pred             CCcHHHHHHHHCCCCCCcCCCcCCccCCcceEEeeEcCCccccCC
Confidence            479999999999999999999999999999999999999999974



This domain has seven highly conserved glycines. A multiple alignment of a small subset of D111/G-patch domains is shown in Fig. 2b of [].; GO: 0003676 nucleic acid binding, 0005622 intracellular

>smart00443 G_patch glycine rich nucleic binding domain
>KOG2809|consensus
>KOG0965|consensus
>KOG2184|consensus
>PF12656 G-patch_2: DExH-box splicing factor binding site
>KOG2384|consensus
>KOG2185|consensus
>KOG3673|consensus
>KOG1996|consensus
>KOG1994|consensus
>KOG0154|consensus
>KOG4315|consensus
>KOG4368|consensus
>KOG2138|consensus
>KOG1994|consensus

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

No hit with probability above 80.00