Psyllid ID: psy17997


Local Sequence Feature Prediction

Prediction (Method), with result rows listed below in the same order
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120
MSENVTDLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHTNKRTSPAAVE
ccccccccccccccccccccccccccEEEcccccccccccccccccccccccccEEcccccccccHHHHHccccccccccccHHHHcccHHHHHHHHHHHHHHHHHHccccccccccccc
ccHHHHHHHHHHcccccccccEEcccEEEccccccccccccccccccEEEEEcEEEEccccccccHHHHHHcccHHHcccccHHHHccccHHHHHHHHHHHHHHHccccccccccccccc
msenvtdlpalfsqpsqdlpkhfssgirigsnlglsvpaivkpelgplykyeptikmidgpdypswddvrrgnylKHVRTetveekaggYVCAIATMRALNDAFAGLyhtnkrtspaave
msenvtdlpalfsqpsqdLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHvrtetveekagGYVCAIATMRALNDAFAGLyhtnkrtspaave
MSENVTDLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHTNKRTSPAAVE
*************************GIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHT**********
*************************GIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETV**KAGGYVCAIATMRALNDAFAGLYHTN*********
********PALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHTN*********
*****TDLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHTN*********
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhhhhiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooohhhhhhhhhhhhhhhhiiiiiiiiiiii
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MSENVTDLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHTNKRTSPAAVE
no confident homologs detected
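
Each prediction row above is a per-residue string aligned to the 120-residue sequence, so a contiguous run of one symbol marks one predicted segment. The following is a minimal sketch of collapsing such a row into 1-based residue ranges, using the PSIPRED row as input. The helper name `segments` is hypothetical and is not part of any of the tools listed above; reading c/E/H as coil/strand/helix is the usual convention and is assumed here.

```python
from itertools import groupby


def segments(row):
    """Collapse a per-residue annotation string into (symbol, start, end) runs.

    Positions are 1-based and inclusive, matching the residue number marker.
    """
    runs = []
    pos = 1
    for symbol, group in groupby(row):
        length = len(list(group))
        runs.append((symbol, pos, pos + length - 1))
        pos += length
    return runs


# PSIPRED row copied from the report (assumed: c = coil, E = strand, H = helix).
psipred = (
    "ccccccccccccccccccccccccccEEEcccccccccccccccccccccccccEEcccccccccHHHHHccccccccccccHHHHcccHHHHHHHHHHHHHHHHHHccccccccccccc"
)

# Print only the non-coil segments.
for symbol, start, end in segments(psipred):
    if symbol != "c":
        print(symbol, start, end)
```

The same helper applies to the other rows, for example to list the `*`-masked stretches in the disorder predictions.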

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   120     2.2.26 [Sep-21-2011]
P51949  309     CDK-activating kinase ass  yes       N/A       0.833    0.323    0.378     4e-11
P51948  309     CDK-activating kinase ass  yes       N/A       0.833    0.323    0.378     1e-10
P51951  309     CDK-activating kinase ass  N/A       N/A       0.691    0.268    0.418     3e-10
P51950  324     CDK-activating kinase ass  N/A       N/A       0.75     0.277    0.387     2e-06
>sp|P51949|MAT1_MOUSE CDK-activating kinase assembly factor MAT1 OS=Mus musculus GN=Mnat1 PE=2 SV=2
 Score = 66.6 bits (161), Expect = 4e-11,   Method: Compositional matrix adjust.
 Identities = 39/103 (37%), Positives = 56/103 (54%), Gaps = 3/103 (2%)

Query: 6   TDLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPS 65
           T L     +P    P  FS+GI++G  + L+   I K E   LY+Y+P      GP  P 
Sbjct: 206 TQLEMQLEKPRSMKPVTFSTGIKMGQQISLA--PIQKLE-EALYEYQPLQIETCGPQVPE 262

Query: 66  WDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLY 108
            + + R  YL HVR  + ++ AGGY  ++A  RAL DAF+GL+
Sbjct: 263 QELLGRLGYLNHVRAASPQDLAGGYTSSLACHRALQDAFSGLF 305




Stabilizes the cyclin H-CDK7 complex to form a functional CDK-activating kinase (CAK) enzymatic complex. CAK activates the cyclin-associated kinases CDK1, CDK2, CDK4 and CDK6 by threonine phosphorylation. CAK complexed to the core-TFIIH basal transcription factor activates RNA polymerase II by serine phosphorylation of the repetitive C-terminal domain (CTD) of its large subunit (POLR2A), allowing its escape from the promoter and elongation of the transcripts. Involved in cell cycle control and in RNA transcription by RNA polymerase II.
Mus musculus (taxid: 10090)
>sp|P51948|MAT1_HUMAN CDK-activating kinase assembly factor MAT1 OS=Homo sapiens GN=MNAT1 PE=1 SV=1
>sp|P51951|MAT1_XENLA CDK-activating kinase assembly factor MAT1 OS=Xenopus laevis GN=mnat1 PE=2 SV=1
>sp|P51950|MAT1_MARGL CDK-activating kinase assembly factor MAT1 OS=Marthasterias glacialis PE=1 SV=1
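
The Q cover, H cover, and Identity columns are not defined on this page. The tabulated values are consistent with dividing the gap-free aligned length by the query or hit length, and the identical positions by the alignment length, then truncating to three decimals. This is an inference from the numbers shown, not a documented formula, and the function name below is hypothetical. A minimal sketch that reproduces the MAT1_MOUSE row from its alignment summary:

```python
def coverage_and_identity(identities, align_len, gaps, query_len, hit_len):
    """Return (Q cover, H cover, Identity), each truncated to three decimals.

    Assumed formulas (inferred from the table, not documented):
      cover    = (alignment length - gaps) / sequence length
      identity = identical positions / alignment length
    """
    aligned = align_len - gaps

    def trunc3(numerator, denominator):
        # Integer arithmetic avoids floating-point surprises when truncating.
        return (numerator * 1000 // denominator) / 1000

    return (
        trunc3(aligned, query_len),
        trunc3(aligned, hit_len),
        trunc3(identities, align_len),
    )


# MAT1_MOUSE (P51949): Identities = 39/103, Gaps = 3/103, query 120, hit 309.
print(coverage_and_identity(39, 103, 3, 120, 309))  # (0.833, 0.323, 0.378)
```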

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      120
322798720  323     hypothetical protein SINV_10084 [Solenop  0.7      0.260    0.456     2e-14
307187513  317     CDK-activating kinase assembly factor MA  0.741    0.280    0.415     2e-14
350417960  320     PREDICTED: CDK-activating kinase assembl  0.783    0.293    0.425     3e-14
189239528  296     PREDICTED: similar to Mat1 CG7614-PA [Tr  0.725    0.293    0.446     3e-14
270010607  282     hypothetical protein TcasGA2_TC010030 [T  0.725    0.308    0.446     3e-14
340715615  320     PREDICTED: CDK-activating kinase assembl  0.733    0.275    0.443     4e-14
118791406  314     AGAP008991-PA [Anopheles gambiae str. PE  0.733    0.280    0.465     5e-14
48106220   320     PREDICTED: CDK-activating kinase assembl  0.733    0.275    0.420     7e-14
380019240  320     PREDICTED: CDK-activating kinase assembl  0.733    0.275    0.420     9e-14
20151823   320     RH31013p [Drosophila melanogaster]        0.741    0.278    0.415     1e-13
>gi|322798720|gb|EFZ20318.1| hypothetical protein SINV_10084 [Solenopsis invicta]
 Score = 83.2 bits (204), Expect = 2e-14,   Method: Compositional matrix adjust.
 Identities = 42/92 (45%), Positives = 57/92 (61%), Gaps = 8/92 (8%)

Query: 22  HFSSGIRIGSNLG----LSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKH 77
            FS+GI+ G N G    LSVP I   E GPLY Y P  + ++GP  PSW +++   Y+ H
Sbjct: 224 QFSTGIKFG-NQGNQNYLSVPKI---EEGPLYTYTPIRQQVEGPTPPSWRELQGRGYVTH 279

Query: 78  VRTETVEEKAGGYVCAIATMRALNDAFAGLYH 109
           +R E   E+AGG+   +A +RAL +A AGLYH
Sbjct: 280 IRNECKSERAGGFRAHVACLRALQEAMAGLYH 311




Source: Solenopsis invicta

Species: Solenopsis invicta

Genus: Solenopsis

Family: Formicidae

Order: Hymenoptera

Class: Insecta

Phylum: Arthropoda

Superkingdom: Eukaryota

>gi|307187513|gb|EFN72564.1| CDK-activating kinase assembly factor MAT1 [Camponotus floridanus]
>gi|350417960|ref|XP_003491665.1| PREDICTED: CDK-activating kinase assembly factor MAT1-like [Bombus impatiens]
>gi|189239528|ref|XP_001816137.1| PREDICTED: similar to Mat1 CG7614-PA [Tribolium castaneum]
>gi|270010607|gb|EFA07055.1| hypothetical protein TcasGA2_TC010030 [Tribolium castaneum]
>gi|340715615|ref|XP_003396306.1| PREDICTED: CDK-activating kinase assembly factor MAT1-like [Bombus terrestris]
>gi|118791406|ref|XP_319742.3| AGAP008991-PA [Anopheles gambiae str. PEST] gi|116117584|gb|EAA14858.3| AGAP008991-PA [Anopheles gambiae str. PEST]
>gi|48106220|ref|XP_396068.1| PREDICTED: CDK-activating kinase assembly factor MAT1-like isoform 2 [Apis mellifera] gi|328790414|ref|XP_003251416.1| PREDICTED: CDK-activating kinase assembly factor MAT1-like isoform 1 [Apis mellifera]
>gi|380019240|ref|XP_003693519.1| PREDICTED: CDK-activating kinase assembly factor MAT1-like [Apis florea]
>gi|20151823|gb|AAM11271.1| RH31013p [Drosophila melanogaster]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                        Length  Definition                      Q cover  H cover  Identity  E-value
Query                     120
FB|FBgn0024956            320     Mat1 "Mat1" [Drosophila melano  0.733    0.275    0.409     1e-14
UNIPROTKB|E1BVP0          309     MNAT1 "Uncharacterized protein  0.833    0.323    0.388     6.2e-12
UNIPROTKB|H0YJ92          169     MNAT1 "CDK-activating kinase a  0.833    0.591    0.388     8.2e-12
UNIPROTKB|Q3SX39          309     MNAT1 "MNAT1 protein" [Bos tau  0.833    0.323    0.388     1.3e-11
UNIPROTKB|E2QVF4          309     MNAT1 "Uncharacterized protein  0.833    0.323    0.388     1.3e-11
RGD|628874                309     Mnat1 "menage a trois homolog   0.833    0.323    0.388     1.7e-11
UNIPROTKB|P51948          309     MNAT1 "CDK-activating kinase a  0.833    0.323    0.388     2.3e-11
MGI|MGI:106207            309     Mnat1 "menage a trois 1" [Mus   0.833    0.323    0.388     2.3e-11
ZFIN|ZDB-GENE-041010-203  309     mnat1 "menage a trois homolog   0.691    0.268    0.383     3e-10
UNIPROTKB|G3V1U8          267     MNAT1 "CDK-activating kinase a  0.608    0.273    0.405     5.6e-08
FB|FBgn0024956 Mat1 "Mat1" [Drosophila melanogaster (taxid:7227)]
 Score = 190 (71.9 bits), Expect = 1.0e-14, P = 1.0e-14
 Identities = 36/88 (40%), Positives = 53/88 (60%)

Query:    23 FSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTET 82
             FS+GI+ G     S+  + K E GPL+ YEP +   +GP  P  +++    Y+ H+R ET
Sbjct:   224 FSTGIKFGQTADPSLLPVPKSEEGPLFVYEPLVPFSEGPAMPPTNEIVSRGYIAHIRAET 283

Query:    83 VEEKAGGYVCAIATMRALNDAFAGLYHT 110
              +E AGG+  A+A  RAL +A  GLY+T
Sbjct:   284 PQENAGGFTSALACERALQEALQGLYYT 311




GO:0005675 "holo TFIIH complex" evidence=ISS;IDA
GO:0006367 "transcription initiation from RNA polymerase II promoter" evidence=ISS
GO:0008353 "RNA polymerase II carboxy-terminal domain kinase activity" evidence=IDA
GO:0004693 "cyclin-dependent protein serine/threonine kinase activity" evidence=IDA
GO:0032806 "carboxy-terminal domain protein kinase complex" evidence=IDA
GO:0007049 "cell cycle" evidence=IEA
GO:0008270 "zinc ion binding" evidence=IEA
UNIPROTKB|E1BVP0 MNAT1 "Uncharacterized protein" [Gallus gallus (taxid:9031)]
UNIPROTKB|H0YJ92 MNAT1 "CDK-activating kinase assembly factor MAT1" [Homo sapiens (taxid:9606)]
UNIPROTKB|Q3SX39 MNAT1 "MNAT1 protein" [Bos taurus (taxid:9913)]
UNIPROTKB|E2QVF4 MNAT1 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
RGD|628874 Mnat1 "menage a trois homolog 1, cyclin H assembly factor (Xenopus laevis)" [Rattus norvegicus (taxid:10116)]
UNIPROTKB|P51948 MNAT1 "CDK-activating kinase assembly factor MAT1" [Homo sapiens (taxid:9606)]
MGI|MGI:106207 Mnat1 "menage a trois 1" [Mus musculus (taxid:10090)]
ZFIN|ZDB-GENE-041010-203 mnat1 "menage a trois homolog 1" [Danio rerio (taxid:7955)]
UNIPROTKB|G3V1U8 MNAT1 "CDK-activating kinase assembly factor MAT1" [Homo sapiens (taxid:9606)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer from SWISS-PROT detected by BLAST

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID         Length  Definition                                          E-value
Query      120
TIGR00570  309     TIGR00570, cdk7, CDK-activating kinase assembly fa  5e-15
>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1
 Score = 68.7 bits (168), Expect = 5e-15
 Identities = 39/108 (36%), Positives = 60/108 (55%), Gaps = 3/108 (2%)

Query: 3   ENVTDLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPD 62
           +N   L     +P  + P  FS+GI++G  + L VP + K E   LY Y+P     +GP 
Sbjct: 203 KNSVKLEMQVEKPKPEKPNTFSTGIKMGYQISL-VP-VQKSEE-ALYPYQPLNIETEGPP 259

Query: 63  YPSWDDVRRGNYLKHVRTETVEEKAGGYVCAIATMRALNDAFAGLYHT 110
            P+ +++ R  YL HVR  + ++ AGGY   +A  RAL +AF+GL+  
Sbjct: 260 VPTLEELVRQGYLNHVRAASPQDIAGGYTSNLACERALQEAFSGLFWQ 307


All proteins in this family for which functions are known are cyclin dependent protein kinases that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. Also known as MAT1 (menage a trois 1). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University) [DNA metabolism, DNA replication, recombination, and repair]. Length = 309

Conserved Domains Detected by HHsearch

ID                 Length  Definition                                          Probability
Query              120
TIGR00570          309     cdk7 CDK-activating kinase assembly factor MAT1. A  99.87
KOG3800|consensus  300                                                         99.87
PF06391            200     MAT1: CDK-activating kinase assembly factor MAT1;   96.51
>TIGR00570 cdk7 CDK-activating kinase assembly factor MAT1
Probab=99.87  E-value=2.5e-23  Score=173.17  Aligned_cols=102  Identities=35%  Similarity=0.689  Sum_probs=91.9

Q ss_pred             ccccccCCCCCCCCCccccceeeccccccccCCccCCccCCCceeeccccccCCCCCCCHHHhhhchhhhhcccccchhh
Q psy17997          7 DLPALFSQPSQDLPKHFSSGIRIGSNLGLSVPAIVKPELGPLYKYEPTIKMIDGPDYPSWDDVRRGNYLKHVRTETVEEK   86 (120)
Q Consensus         7 ~~~~~~~~~~~~k~~~FSTGIk~g~~~s~~~~pvpk~eeG~lY~Y~Pl~~~~~GP~~P~~e~L~~~GYl~HVR~~s~~e~   86 (120)
                      .+....+..+..|++.|+|||+++.+.+++  |+++.++ +.|.|+|+..+.+||..|+++++...||+.|||+++++.+
T Consensus       207 ~~k~~~~r~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~-~~~py~Pf~g~~~~p~~y~~~~~~~~~y~~~~r~~~~~~~  283 (309)
T TIGR00570       207 KLEMQVEKPKPEKPNTFSTGIKMGYQISLV--PVQKSEE-ALYPYQPLNIETEGPPVPTLEELVRQGYLNHVRAASPQDI  283 (309)
T ss_pred             HHHhhhhhccccchhhhhhccccccccccc--ccCCCCC-CCCCcCCCCCCCCCCCCCCccccccccHHHHHhccCcccc
Confidence            344555666666777899999998777888  9988776 9999999999999999999999999999999999999999


Q ss_pred             cCchhhHHhHHHHHHHHhhcccCCC
Q psy17997         87 AGGYVCAIATMRALNDAFAGLYHTN  111 (120)
Q Consensus        87 AGGfts~laC~RALQEAfsgL~~~~  111 (120)
                      ||||++.+.|.|||||||+||++..
T Consensus       284 aGGy~~~~~~~RaL~EAF~GL~~fi  308 (309)
T TIGR00570       284 AGGYTSNLACERALQEAFSGLFWQP  308 (309)
T ss_pred             cCCcCHHHHHHHHHHHHHccCCccC
Confidence            9999999999999999999999865



All proteins in this family for which functions are known are cyclin dependent protein kinases that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. Also known as MAT1 (menage a trois 1). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).

>KOG3800|consensus
>PF06391 MAT1: CDK-activating kinase assembly factor MAT1; InterPro: IPR015877 MAT1 (menage a trois 1) is a RING finger protein with a characteristic C3HC4 motif located in the N-terminal domain
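
The HHsearch hit above is summarized on a single line of `key=value` pairs (Probab, E-value, Score, and so on). As a small sketch, such a line can be turned into a dictionary of numbers as shown below. The function name `parse_hh_summary` is hypothetical, and the parsing assumes every value is numeric apart from an optional trailing percent sign, which holds for the line in this report.

```python
import re


def parse_hh_summary(line):
    """Parse a 'key=value' summary line into a dict of floats.

    Assumes each value is a number, optionally followed by '%'.
    """
    fields = {}
    for key, value in re.findall(r"([A-Za-z_][\w-]*)=(\S+)", line):
        fields[key] = float(value.rstrip("%"))
    return fields


# Summary line copied from the TIGR00570 hit in this report.
summary = (
    "Probab=99.87  E-value=2.5e-23  Score=173.17  Aligned_cols=102  "
    "Identities=35%  Similarity=0.689  Sum_probs=91.9"
)

stats = parse_hh_summary(summary)
print(stats["Probab"], stats["E-value"], stats["Aligned_cols"])
```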

Homologous Structure Templates

Structure Templates Detected by BLAST

No homologous structure with e-value below 0.005

Structure Templates Detected by RPS-BLAST

No hit with e-value below 0.005

Structure Templates Detected by HHsearch

No hit with probability above 80.00


Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

No hit with e-value below 0.005

Homologous Domains Detected by HHsearch

No hit with probability above 80.00