Psyllid ID: psy5060


Local Sequence Feature Prediction

Prediction (Method) and Result
(the result tracks follow this list of labels, one line per label, in the same order)
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-----
MVQRKWNSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPDHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWRLKPYTEDLSS
cccccccccccccccEEEEccccHHHHHHHHHHHHccccccccccEEcccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHccccccccccccccccccccccccccccccccccccccc
ccHHHHcccccccccEEccccccHHHHHHHHHHHHccccccccccEEcHHcccccHccccccccccccccccccEEEcccccccccccccccccEEEccHHHccEEEccccHHccccccEEEEEEcccccccccccccccccccc
mvqrkwnslfwtsstkfgrfpedkLRRKQWCIAmkrdkwkpskhskicsahftedsfetnawserkklsdtavpsiftfpdhlmktpcprkqpavrslSAQEYMFVCKTskawfenpgsqftFLYHFCRLVtqwrlkpytedlss
mvqrkwnslfwtsstkfgrfpedklrRKQWCIamkrdkwkpskhskicsahftedsfetnaWSERKKLSDTAVPSIFtfpdhlmktpcPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWrlkpytedlss
MVQRKWNSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPDHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWRLKPYTEDLSS
*****WNSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKP**HSKICSAHFTEDSFETNAW*******DTAVPSIFTFPDHLMKTPC******VRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWRLKPY******
****************FGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIF********************************************************************
MVQRKWNSLFWTSSTKFGRFPEDKLRRKQWCIAMKR*********KICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPDHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWRLKPYTEDLSS
*****WNSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPDHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQW***********
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhhhhhhoooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MVQRKWNSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPDHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWRLKPYTEDLSS
no confident homologs detected
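
The per-residue tracks above are easier to compare once they are reduced to residue ranges. The following is a minimal sketch, not part of the original report: the helper name `segments` is hypothetical, and the reading of `H` as helix, `E` as strand and `c` as coil follows the usual convention for PSIPRED-style output rather than anything stated on this page.

```python
from itertools import groupby


def segments(track, symbols):
    """Return 1-based inclusive (start, end, symbol) runs for the chosen symbols."""
    runs = []
    pos = 1
    for symbol, group in groupby(track):
        length = sum(1 for _ in group)
        if symbol in symbols:
            runs.append((pos, pos + length - 1, symbol))
        pos += length
    return runs


# Leading excerpt of the PSIPRED track shown above (assumed legend: H/E/c).
psipred_excerpt = "cccccccccccccccEEEEccccHHHHHHHHHHHH"
for start, end, symbol in segments(psipred_excerpt, "HE"):
    print(f"{symbol}: residues {start}-{end}")
```

The same helper can be applied to the disorder tracks by passing `"*"` as the symbol set, since those tracks mask predicted disordered residues with asterisks.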

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   145     2.2.26 [Sep-21-2011]
Q642B6  569     THAP domain-containing pr  yes       N/A       0.441    0.112    0.476     6e-09
Q6P3Z3  569     THAP domain-containing pr  yes       N/A       0.441    0.112    0.476     6e-09
Q8WY91  577     THAP domain-containing pr  yes       N/A       0.441    0.110    0.476     7e-09
Q2TBI2  584     THAP domain-containing pr  yes       N/A       0.441    0.109    0.476     7e-09
Q5ZHN5  413     THAP domain-containing pr  yes       N/A       0.434    0.152    0.515     8e-09
Q4R7M0  396     THAP domain-containing pr  N/A       N/A       0.503    0.184    0.467     6e-08
Q7Z6K1  395     THAP domain-containing pr  no        N/A       0.503    0.184    0.467     6e-08
Q9D305  217     THAP domain-containing pr  no        N/A       0.448    0.299    0.402     1e-07
Q9H0W7  228     THAP domain-containing pr  no        N/A       0.448    0.285    0.402     1e-07
Q1RMM0  394     THAP domain-containing pr  no        N/A       0.434    0.159    0.469     2e-07
>sp|Q642B6|THAP4_RAT THAP domain-containing protein 4 OS=Rattus norvegicus GN=Thap4 PE=2 SV=1
 Score = 59.7 bits (143), Expect = 6e-09,   Method: Compositional matrix adjust.
 Identities = 31/65 (47%), Positives = 41/65 (63%), Gaps = 1/65 (1%)

Query: 14 STKFGRFP-EDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTA 72
          +  F RFP +D  R  QW  A++RD W P+K+S +CS HFT+DSF      + + L  TA
Sbjct: 21 AVSFHRFPLKDSKRLIQWLKAVQRDNWTPTKYSFLCSEHFTKDSFSKRLEDQHRLLKPTA 80

Query: 73 VPSIF 77
          VPSIF
Sbjct: 81 VPSIF 85





Rattus norvegicus (taxid: 10116)
>sp|Q6P3Z3|THAP4_MOUSE THAP domain-containing protein 4 OS=Mus musculus GN=Thap4 PE=2 SV=1
>sp|Q8WY91|THAP4_HUMAN THAP domain-containing protein 4 OS=Homo sapiens GN=THAP4 PE=1 SV=2
>sp|Q2TBI2|THAP4_BOVIN THAP domain-containing protein 4 OS=Bos taurus GN=THAP4 PE=2 SV=2
>sp|Q5ZHN5|THAP5_CHICK THAP domain-containing protein 5 OS=Gallus gallus GN=THAP5 PE=2 SV=1
>sp|Q4R7M0|THAP5_MACFA THAP domain-containing protein 5 OS=Macaca fascicularis GN=THAP5 PE=2 SV=1
>sp|Q7Z6K1|THAP5_HUMAN THAP domain-containing protein 5 OS=Homo sapiens GN=THAP5 PE=1 SV=2
>sp|Q9D305|THAP2_MOUSE THAP domain-containing protein 2 OS=Mus musculus GN=Thap2 PE=2 SV=1
>sp|Q9H0W7|THAP2_HUMAN THAP domain-containing protein 2 OS=Homo sapiens GN=THAP2 PE=1 SV=1
>sp|Q1RMM0|THAP5_BOVIN THAP domain-containing protein 5 OS=Bos taurus GN=THAP5 PE=2 SV=2

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      145
449692890  276     PREDICTED: uncharacterized protein LOC10  0.606    0.318    0.443     1e-13
449682480  450     PREDICTED: THAP domain-containing protei  0.579    0.186    0.425     1e-09
270016627  673     hypothetical protein TcasGA2_TC006872 [T  0.489    0.105    0.435     2e-09
443684617  170     hypothetical protein CAPTEDRAFT_76654, p  0.475    0.405    0.422     9e-09
427795225  522     Putative tick transposon, partial [Rhipi  0.620    0.172    0.393     1e-08
427795057  512     Putative tick transposon, partial [Rhipi  0.620    0.175    0.393     1e-08
427797471  505     Putative tick transposon, partial [Rhipi  0.558    0.160    0.418     1e-08
427796927  574     Putative tick transposon, partial [Rhipi  0.551    0.139    0.436     2e-08
427792867  545     Putative tick transposon, partial [Rhipi  0.551    0.146    0.436     2e-08
328703142  1038    PREDICTED: hypothetical protein LOC10057  0.434    0.060    0.444     2e-08
>gi|449692890|ref|XP_004213213.1| PREDICTED: uncharacterized protein LOC101240618, partial [Hydra magnipapillata]
 Score = 81.3 bits (199), Expect = 1e-13,   Method: Compositional matrix adjust.
 Identities = 39/88 (44%), Positives = 52/88 (59%)

Query: 11  WTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSD 70
           +T    F  FP+D   R++W   M+RD + PSK SK+C  HFT D +E + WS +KKL  
Sbjct: 24  YTKGVSFHGFPKDLELRRKWIQVMRRDGFTPSKQSKLCGKHFTIDCYEGSPWSSQKKLKS 83

Query: 71  TAVPSIFTFPDHLMKTPCPRKQPAVRSL 98
            A+PSIF FP  L K+   RK P  R +
Sbjct: 84  DAIPSIFDFPTRLKKSTYSRKPPKNRQI 111




Source: Hydra magnipapillata

Species: Hydra magnipapillata

Genus: Hydra

Family: Hydridae

Order: Hydroida

Class: Hydrozoa

Phylum: Cnidaria

Superkingdom: Eukaryota

>gi|449682480|ref|XP_002169212.2| PREDICTED: THAP domain-containing protein 9-like [Hydra magnipapillata]
>gi|270016627|gb|EFA13073.1| hypothetical protein TcasGA2_TC006872 [Tribolium castaneum]
>gi|443684617|gb|ELT88502.1| hypothetical protein CAPTEDRAFT_76654, partial [Capitella teleta]
>gi|427795225|gb|JAA63064.1| Putative tick transposon, partial [Rhipicephalus pulchellus]
>gi|427795057|gb|JAA62980.1| Putative tick transposon, partial [Rhipicephalus pulchellus]
>gi|427797471|gb|JAA64187.1| Putative tick transposon, partial [Rhipicephalus pulchellus]
>gi|427796927|gb|JAA63915.1| Putative tick transposon, partial [Rhipicephalus pulchellus]
>gi|427792867|gb|JAA61885.1| Putative tick transposon, partial [Rhipicephalus pulchellus]
>gi|328703142|ref|XP_003242105.1| PREDICTED: hypothetical protein LOC100574034 [Acyrthosiphon pisum]
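
The Q cover, H cover and Identity columns for the top non-redundant hit can be reproduced from its alignment (Query 11-98, Sbjct 24-111, Identities 39/88, query length 145, hit length 276). The sketch below assumes coverage is the aligned span divided by the full sequence length and that values are truncated, not rounded, to three decimals. The page does not state its formula, and the top Swiss-Prot hit's H cover of 0.112 does not follow from its 65-residue subject span under this reading, so treat this as one plausible reading. The function names are hypothetical.

```python
import math


def truncate3(value):
    """Truncate (not round) to three decimals; an assumption about the table's style."""
    return math.floor(value * 1000) / 1000


def coverage_stats(q_start, q_end, s_start, s_end, identities, align_len, q_len, s_len):
    """Span-based coverage and identity for one pairwise alignment."""
    q_span = q_end - q_start + 1  # aligned residues on the query
    s_span = s_end - s_start + 1  # aligned residues on the hit
    return {
        "q_cover": truncate3(q_span / q_len),
        "h_cover": truncate3(s_span / s_len),
        "identity": truncate3(identities / align_len),
    }


# Values read from the Hydra magnipapillata alignment above.
stats = coverage_stats(11, 98, 24, 111, 39, 88, 145, 276)
print(stats)  # {'q_cover': 0.606, 'h_cover': 0.318, 'identity': 0.443}
```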

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                Length  Definition                      Q cover  H cover  Identity  E-value
Query             145
UNIPROTKB|F1SIM6  396     THAP5 "Uncharacterized protein  0.675    0.247    0.382     7.5e-12
UNIPROTKB|Q1RMM0  394     THAP5 "THAP domain-containing   0.675    0.248    0.382     9.5e-12
UNIPROTKB|F1MH63  395     THAP5 "THAP domain-containing   0.675    0.248    0.382     9.6e-12
UNIPROTKB|F1PXF3  395     THAP5 "Uncharacterized protein  0.675    0.248    0.372     5.6e-11
UNIPROTKB|F1NV78  413     THAP5 "THAP domain-containing   0.579    0.203    0.425     7.9e-11
UNIPROTKB|Q5ZHN5  413     THAP5 "THAP domain-containing   0.579    0.203    0.425     7.9e-11
UNIPROTKB|F1PC56  553     THAP4 "Uncharacterized protein  0.434    0.113    0.484     8.2e-11
UNIPROTKB|Q2TBI2  584     THAP4 "THAP domain-containing   0.441    0.109    0.476     9e-11
UNIPROTKB|Q7Z6K1  395     THAP5 "THAP domain-containing   0.627    0.230    0.408     9.3e-11
UNIPROTKB|Q4R7M0  396     THAP5 "THAP domain-containing   0.627    0.229    0.408     9.3e-11
UNIPROTKB|F1SIM6 THAP5 "Uncharacterized protein" [Sus scrofa (taxid:9823)]
 Score = 167 (63.8 bits), Expect = 7.5e-12, P = 7.5e-12
 Identities = 39/102 (38%), Positives = 55/102 (53%)

Query:    17 FGRFP-EDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPS 75
             F  FP  DK R ++W   MKRD W PSK+  +CS HFT DS +   W  R  L  TA+P+
Sbjct:    26 FYPFPLHDKERLEKWLKNMKRDSWVPSKYQFLCSDHFTPDSLDIR-WGIRY-LKQTAIPT 83

Query:    76 IFTFP-DHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFEN 116
             IF+ P D+  K P  +K    +S   +E     K+ +++  N
Sbjct:    84 IFSLPEDNQEKDPSKKKSQKKKSEDEKEVCLKSKSEESFASN 125




GO:0045786 "negative regulation of cell cycle" evidence=IEA
GO:0005634 "nucleus" evidence=IEA
GO:0002020 "protease binding" evidence=IEA
GO:0003676 "nucleic acid binding" evidence=IEA
UNIPROTKB|Q1RMM0 THAP5 "THAP domain-containing protein 5" [Bos taurus (taxid:9913)]
UNIPROTKB|F1MH63 THAP5 "THAP domain-containing protein 5" [Bos taurus (taxid:9913)]
UNIPROTKB|F1PXF3 THAP5 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|F1NV78 THAP5 "THAP domain-containing protein 5" [Gallus gallus (taxid:9031)]
UNIPROTKB|Q5ZHN5 THAP5 "THAP domain-containing protein 5" [Gallus gallus (taxid:9031)]
UNIPROTKB|F1PC56 THAP4 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|Q2TBI2 THAP4 "THAP domain-containing protein 4" [Bos taurus (taxid:9913)]
UNIPROTKB|Q7Z6K1 THAP5 "THAP domain-containing protein 5" [Homo sapiens (taxid:9606)]
UNIPROTKB|Q4R7M0 THAP5 "THAP domain-containing protein 5" [Macaca fascicularis (taxid:9541)]
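
The GO term lines listed for the top hit follow a regular `GO:id "name" evidence=code` layout, so they can be turned into records with a small parser. This is an illustrative sketch: the regular expression and the function name `parse_go_line` are assumptions based only on the four lines shown in this report. The evidence code IEA stands for "Inferred from Electronic Annotation", so none of these terms was assigned by a curator.

```python
import re

# Layout assumed from the lines in this report: GO:NNNNNNN "term name" evidence=CODE
GO_LINE = re.compile(r'^(GO:\d{7}) "([^"]+)" evidence=(\w+)$')


def parse_go_line(line):
    """Return a dict for a GO term line, or None if the line has another layout."""
    match = GO_LINE.match(line.strip())
    if match is None:
        return None
    go_id, name, evidence = match.groups()
    return {"id": go_id, "name": name, "evidence": evidence}


go_lines = [
    'GO:0045786 "negative regulation of cell cycle" evidence=IEA',
    'GO:0005634 "nucleus" evidence=IEA',
    'GO:0002020 "protease binding" evidence=IEA',
    'GO:0003676 "nucleic acid binding" evidence=IEA',
]
terms = [parse_go_line(line) for line in go_lines]
for term in terms:
    print(term["id"], term["name"], term["evidence"])
```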

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

ID      Name         Annotated EC number    Identity  Query coverage  Hit coverage  RBH(Q2H)  RBH(H2Q)
Q5ZHN5  THAP5_CHICK  No assigned EC number  0.5151    0.4344          0.1525        yes       N/A

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       145
pfam05485   84      pfam05485, THAP, THAP domain                        9e-15
smart00980  80      smart00980, THAP, The THAP domain is a putative DN  7e-13
smart00692  59      smart00692, DM3, Zinc finger domain in CG10631, C   1e-07
>gnl|CDD|218604 pfam05485, THAP, THAP domain
 Score = 64.7 bits (158), Expect = 9e-15
 Identities = 26/68 (38%), Positives = 39/68 (57%), Gaps = 3/68 (4%)

Query: 16 KFGRFPEDKLRRKQWCIAMKRD-KWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVP 74
           F RFP+D    ++W   + R+  WKP+K+S+ICS HF  D F+      R++L   AVP
Sbjct: 18 SFHRFPKDPELLRKWLKNLGREDDWKPTKNSRICSKHFEPDCFDNR--GGRRRLKPGAVP 75

Query: 75 SIFTFPDH 82
          ++F   D 
Sbjct: 76 TLFLGHDD 83


The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes. Length = 84
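
The consensus quoted above (Cys, 2-4 residues, Cys, 35-50 residues, Cys, 2 residues, His) maps directly onto a regular expression, which makes it easy to check where the query does and does not fit. The sketch below is an illustration and not the method used by the report: `AVP[ST]IF` is an assumed relaxation of the AVPTIF box to admit the query's AVPSIF, and `find_1based` is a hypothetical helper.

```python
import re

# Query sequence psy5060, copied from the Protein Sequence track of this report.
QUERY = (
    "MVQRKWNSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSD"
    "TAVPSIFTFPDHLMKTPCPRKQPAVRSLSAQEYMFVCKTSKAWFENPGSQFTFLYHFCRLVTQWRLKPYT"
    "EDLSS"
)

C2CH = re.compile(r"C.{2,4}C.{35,50}C.{2}H")  # full signature from the description
CXXH = re.compile(r"C.{2}H")                  # only the C-terminal half of the signature
AVP_BOX = re.compile(r"AVP[ST]IF")            # assumed relaxed form of the AVPTIF box


def find_1based(pattern, sequence):
    """Return (start, end, matched_text) with 1-based inclusive coordinates."""
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(sequence)]


print("C2CH:", find_1based(C2CH, QUERY))
print("CxxH:", find_1based(CXXH, QUERY))
print("AVP box:", find_1based(AVP_BOX, QUERY))
```

On this query the full-signature pattern finds no match, while the C-x(2)-H half is found at residues 48-51 and the AVPSIF box at residues 72-77. A plain regex scan cannot say why the first Cys pair is absent, so this is a quick consistency check and does not replace the profile-based hits reported in this section.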

>gnl|CDD|214951 smart00980, THAP, The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion
>gnl|CDD|128933 smart00692, DM3, Zinc finger domain in CG10631, C

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       145
PF05485     84      THAP: THAP domain; InterPro: IPR006612 Zinc finger  99.84
smart00692  59      DM3 Zinc finger domain in CG10631, C. elegans LIN-  99.78
>PF05485 THAP: THAP domain; InterPro: IPR006612 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule
Probab=99.84  E-value=5e-21  Score=131.40  Aligned_cols=73  Identities=45%  Similarity=0.814  Sum_probs=54.5

Q ss_pred             ccccCCCCcEEEeCCCChHHHHHHHHHhhcCCC-CcCCCceeccccCCcCccccccccCCcccCCCceecccCCCC
Q psy5060           7 NSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKW-KPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPD   81 (145)
Q Consensus         7 n~~~k~~~vsfhrFPkD~~rr~kWi~~~~r~~~-~p~k~~~VCS~HF~~~~f~~~~~~~r~~Lk~~AVPTIF~~~~   81 (145)
                      |+...+.+++||+||+|++++++|+++|++.++ .+...++|||+||++++|..  ...+.+|++|||||||+.++
T Consensus        10 ~~~~~~~~~~f~~fP~d~~~~~~W~~~~~~~~~~~~~~~~~ICs~HF~~~~~~~--~~~~~~L~~~AVPtl~~~~~   83 (84)
T PF05485_consen   10 NSSSRKPGVSFFRFPKDPERRKKWLKACGREDWWKPTKNSRICSRHFEPDDFRR--SSKRRRLKPDAVPTLFLPPE   83 (84)
T ss_dssp             TSTCCTTSS-EEE--SSHHHHHHHHHHHTSTCG-GTSTTSEEEGGGSTGGGBST--TTSSSSB-TT---CCC----
T ss_pred             CCCeeCCCeEEEECCCCHHHHHHHHHHhcccccccccCCccchhhhCchhhccc--ccCCCcCCCCCCCcCcCCCC
Confidence            356678899999999999999999999999887 77889999999999999944  34568999999999998654



Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.

The THAP domain is an ~90-residue domain restricted to animals, which is shared between the THAP family of cellular DNA-binding proteins, and transposases from mobile genomic parasites. The defined THAP domain includes: a C2CH signature (consensus: C-x(2,4)-C-x(35,50)-C-x(2)-H); three additional key residues that are strictly conserved in all THAP domains that have been found to date (THAP1 amino acids P26, W36, F58); a C-terminal AVPTIF box; and several other conserved amino acid positions with distinct physicochemical properties (e.g. hydrophobic and polar). The THAP domain can be found in one or more copies and can be associated with other domains, such as the C2H2-type zinc finger. The THAP domain is supposed to be a DNA-binding domain (DBD). More information about these proteins can be found at Protein of the Month: Zinc Fingers.

GO: 0003676 nucleic acid binding; PDB: 3KDE_C 2D8R_A 2JM3_A 2KO0_A 2JTG_A 2L1G_A.

>smart00692 DM3 Zinc finger domain in CG10631, C
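
Each HHsearch alignment in this report opens with a summary line of `key=value` pairs (Probab, E-value, Score, Aligned_cols, Identities, Similarity, Sum_probs). A minimal sketch of reading that line into numbers follows. The function name `parse_hh_header` is hypothetical, and the parser assumes every value is numeric apart from an optional trailing percent sign, which holds for the lines shown here.

```python
import re


def parse_hh_header(line):
    """Parse 'Probab=99.84  E-value=5e-21 ...' into a dict of floats."""
    fields = {}
    for key, value in re.findall(r"(\S+?)=(\S+)", line):
        fields[key] = float(value.rstrip("%"))  # 'Identities=45%' becomes 45.0
    return fields


# Summary line of the PF05485 alignment above.
header = (
    "Probab=99.84  E-value=5e-21  Score=131.40  Aligned_cols=73  "
    "Identities=45%  Similarity=0.814  Sum_probs=54.5"
)
summary = parse_hh_header(header)
print(summary["Probab"], summary["E-value"], summary["Aligned_cols"])
```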

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                           E-value
Query   145
2d8r_A  99      Solution Structure Of The Thap Domain Of The Human   2e-08
2ko0_A  87      Solution Structure Of The Thap Zinc Finger Of Thap1  8e-07
2jtg_A  87      Solution Structure Of The Thap-Zinc Finger Of Thap1  8e-06
>pdb|2D8R|A Chain A, Solution Structure Of The Thap Domain Of The Human Thap Domain-Containing Protein 2 Length = 99

Iteration: 1

 Score = 55.1 bits (131), Expect = 2e-08,   Method: Compositional matrix adjust.
 Identities = 27/67 (40%), Positives = 38/67 (56%), Gaps = 2/67 (2%)

Query: 17 FGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSI 76
          F RFP D  RRK+W   ++R  + P KH+ +CS HF    F+    + R K+   AVP+I
Sbjct: 29 FHRFPLDPKRRKEWVRLVRRKNFVPGKHTFLCSKHFEASCFDLTGQTRRLKMD--AVPTI 86

Query: 77 FTFPDHL 83
          F F  H+
Sbjct: 87 FDFCTHI 93

>pdb|2KO0|A Chain A, Solution Structure Of The Thap Zinc Finger Of Thap1 In Complex With Its Dna Target Length = 87
>pdb|2JTG|A Chain A, Solution Structure Of The Thap-Zinc Finger Of Thap1 Length = 87

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   145
2d8r_A  99      THAP domain-containing protein 2; NPPSFA, national  2e-15
2jtg_A  87      THAP domain-containing protein 1; zinc finger, CCC  1e-14
2jm3_A  91      Hypothetical protein; zinc finger, domain, metal b  6e-08
>2d8r_A THAP domain-containing protein 2; NPPSFA, national project on protein structural and functional analyses; NMR {Homo sapiens} SCOP: g.39.1.16 Length = 99
 Score = 66.6 bits (162), Expect = 2e-15
 Identities = 26/77 (33%), Positives = 40/77 (51%), Gaps = 2/77 (2%)

Query: 7  NSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERK 66
           +     +  F RFP D  RRK+W   ++R  + P KH+ +CS HF    F+     + +
Sbjct: 19 TTYNKHINISFHRFPLDPKRRKEWVRLVRRKNFVPGKHTFLCSKHFEASCFDLT--GQTR 76

Query: 67 KLSDTAVPSIFTFPDHL 83
          +L   AVP+IF F  H+
Sbjct: 77 RLKMDAVPTIFDFCTHI 93


>2jtg_A THAP domain-containing protein 1; zinc finger, CCCH, DNA-binding, metal-binding, zinc- finger, metal binding protein; NMR {Homo sapiens} PDB: 2ko0_A 2l1g_A* Length = 87
>2jm3_A Hypothetical protein; zinc finger, domain, metal binding protein; NMR {Caenorhabditis elegans} Length = 91

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   145
2jtg_A  87      THAP domain-containing protein 1; zinc finger, CCC  99.89
2d8r_A  99      THAP domain-containing protein 2; NPPSFA, national  99.89
2lau_A  81      THAP domain-containing protein 11; zinc finger, pr  99.83
3kde_C  77      Transposable element P transposase; THAP domain, D  99.68
2jm3_A  91      Hypothetical protein; zinc finger, domain, metal b  99.62
>2jtg_A THAP domain-containing protein 1; zinc finger, CCCH, DNA-binding, metal-binding, zinc- finger, metal binding protein; NMR {Homo sapiens} PDB: 2ko0_A 2l1g_A*
Probab=99.89  E-value=6.6e-24  Score=147.13  Aligned_cols=73  Identities=37%  Similarity=0.572  Sum_probs=65.0

Q ss_pred             ccccCCCCcEEEeCC-CChHHHHHHHHHhhcCCCCcCCCceeccccCCcCccccccccCCcccCCCceecccCCCC
Q psy5060           7 NSLFWTSSTKFGRFP-EDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPD   81 (145)
Q Consensus         7 n~~~k~~~vsfhrFP-kD~~rr~kWi~~~~r~~~~p~k~~~VCS~HF~~~~f~~~~~~~r~~Lk~~AVPTIF~~~~   81 (145)
                      |++.++.+++||+|| +|++++++|+++|++++|.+++.++|||+||+++||..  .+.+++|++|||||||+..+
T Consensus        12 n~~~~~~~~~f~~FP~kd~~~~~~W~~~~~~~~~~~~~~~~iCs~HF~~~~f~~--~~~~~~Lk~~AVPtif~~~~   85 (87)
T 2jtg_A           12 NRYDKDKPVSFHKFPLTRPSLCKEWEAAVRRKNFKPTKYSSICSEHFTPDCFKR--ECNNKLLKENAVPTIFLELV   85 (87)
T ss_dssp             CCCCSSSCCCEEECCSSCCTTHHHHHHHHCSSCCCTTTTSEEEGGGSCGGGGCC--CCSSCCCCTTCCCGGGCCCC
T ss_pred             CCCcCCCCeEEEECCCCChHHHHHHHHHhCcccCccCCCCEEccccCcHhHhhc--cCCcCeeCCCCCCcCcCCCC
Confidence            444556789999999 99999999999999999999999999999999999997  24568999999999999654



>2d8r_A THAP domain-containing protein 2; NPPSFA, national project on protein structural and functional analyses; NMR {Homo sapiens} SCOP: g.39.1.16
>2lau_A THAP domain-containing protein 11; zinc finger, protein-DNA complex, DNA binding domain, transc factor, CCCH, transcription-DNA complex; NMR {Homo sapiens}
>3kde_C Transposable element P transposase; THAP domain, DNA-binding domain, zinc-finger, beta-alpha-BET element transposase, DNA integration; HET: BRU; 1.74A {Drosophila melanogaster}
>2jm3_A Hypothetical protein; zinc finger, domain, metal binding protein; NMR {Caenorhabditis elegans}

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                         E-value
Query    145
d2d8ra1  86      g.39.1.16 (A:8-93) THAP domain-containing protein  1e-16
>d2d8ra1 g.39.1.16 (A:8-93) THAP domain-containing protein 2 {Human (Homo sapiens) [TaxId: 9606]} Length = 86

class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: THAP domain
domain: THAP domain-containing protein 2
species: Human (Homo sapiens) [TaxId: 9606]
 Score = 68.2 bits (166), Expect = 1e-16
 Identities = 26/77 (33%), Positives = 40/77 (51%), Gaps = 2/77 (2%)

Query: 7  NSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERK 66
           +     +  F RFP D  RRK+W   ++R  + P KH+ +CS HF    F+     + +
Sbjct: 12 TTYNKHINISFHRFPLDPKRRKEWVRLVRRKNFVPGKHTFLCSKHFEASCFDLT--GQTR 69

Query: 67 KLSDTAVPSIFTFPDHL 83
          +L   AVP+IF F  H+
Sbjct: 70 RLKMDAVPTIFDFCTHI 86


Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    145
d2d8ra1  86      THAP domain-containing protein 2 {Human (Homo sapi  99.89
>d2d8ra1 g.39.1.16 (A:8-93) THAP domain-containing protein 2 {Human (Homo sapiens) [TaxId: 9606]}
class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: THAP domain
domain: THAP domain-containing protein 2
species: Human (Homo sapiens) [TaxId: 9606]
Probab=99.89  E-value=1e-23  Score=144.51  Aligned_cols=75  Identities=35%  Similarity=0.675  Sum_probs=66.5

Q ss_pred             ccccCCCCcEEEeCCCChHHHHHHHHHhhcCCCCcCCCceeccccCCcCccccccccCCcccCCCceecccCCCCCC
Q psy5060           7 NSLFWTSSTKFGRFPEDKLRRKQWCIAMKRDKWKPSKHSKICSAHFTEDSFETNAWSERKKLSDTAVPSIFTFPDHL   83 (145)
Q Consensus         7 n~~~k~~~vsfhrFPkD~~rr~kWi~~~~r~~~~p~k~~~VCS~HF~~~~f~~~~~~~r~~Lk~~AVPTIF~~~~~~   83 (145)
                      |+...+.+++||+||+|++++++|+++|+++++.++.+++|||+||+++||..  .+.+++|++|||||||..+.++
T Consensus        12 ~~~~~~~~~~ff~fP~d~~~~~~W~~~~~~~~~~~~~~~~ICs~HF~~~~~~~--~~~~~~L~~~AVPtiF~~~~~~   86 (86)
T d2d8ra1          12 TTYNKHINISFHRFPLDPKRRKEWVRLVRRKNFVPGKHTFLCSKHFEASCFDL--TGQTRRLKMDAVPTIFDFCTHI   86 (86)
T ss_dssp             CSCCSSCCCCCEECCSSHHHHHHHHHHTTCTTCCCSSSCEECTTSSCSTTBSC--TTSSCCBCTTCCCSCCCCCCSS
T ss_pred             CCCCCCCCEEEEECCCCHHHHHHHHHHhCCcccccCCccEEeCCcCChhhhcc--cCCCCEeCCCCccceeCCCCCC
Confidence            45566788999999999999999999999999999999999999999999986  3345689999999999988764