Psyllid ID: psy5284


Local Sequence Feature Prediction

Prediction and (Method)    Result
Residue Number Marker
Protein Sequence
Secondary Structure (PSIPRED)
Secondary Structure Prediction (SSPRO)
Coil and Loop (DISEMBL)
Flexible Loop (DISEMBL)
Low Complexity Region (SEG)
Disordered Region (IsUnstruct)
Disordered Region (DISOPRED)
Disordered Region (DISEMBL)
Disordered Region (DISPRO)
Transmembrane Helix (TMHMM)
Transmembrane Helix (HMMTOP)
Transmembrane Helix (MEMSAT)
TM Helix, Signal Peptide (MEMSAT_SVM)
TM Helix, Signal Peptide (Phobius)
Signal Peptide (SignalP HMM Mode)
Signal Peptide (SignalP NN Mode)
Coiled Coils (COILS)
Positional Conservation
 
--------10--------20--------30--------40--------50--------60--------70--------80--------90-------100-------110-------120-------130-------140-------150-------160-------170-------180
MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGKPYFVPSRRRKYESPTTPNLFPNSQHHEIVNNEGNETRDIIDNSNVIENSKYDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYSLLE
ccccEEEcccccccccccccEEEEccccHHHHHHHHHHHcccccccccccccccEEEccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccHHHHHHHHHHHHHHHHHHHHHHHHHHHcccccccccccc
cccccEcccccccccccccEEEEEccccHHHHHHHHHHHcccccccccccccccEEEEcccccccHccccccccccccccccEEEEEccccccccccHcccccccccccccccccccccccccccccHHHEcccccccccEEEccccccHHHHHHHHHHHcccHHHHHHHHHHHHHHHcc
mppsccvptcklmrnnseklsyheipskeplrtnwIKQIGIltgnkfwqptsesavvcskhfiepdfvetplrrrlkptsvpsvfymmqgkpyfvpsrrrkyespttpnlfpnsqhheivnnegnetrdiidnsnvienskyDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYSLLE
mppsccvptcklmrnnseklsyheipskeplrtnWIKQIGILTGNKFWQPTSESAVVCSKHFiepdfvetplrrrlkptsvpsvfymmqGKPYFVPSRRRKYESpttpnlfpnsqhheivnnegnetrDIIDNSNVIENSKYDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYSLLE
MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGKPYFVPSRRRKYESPTTPNLFPNSQHHEIVNNEGNETRDIIDNSNVIENSKYDAFLQASFADKKFlemlqlnlemleVANHFLADYSLLE
*****CVPTCKL*****************PLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGKPYFV**********************************IIDNSNVIENSKYDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYS***
MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVF************************************************************************MLQLNLEMLEVANHFL*DYSLL*
MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGKPYFVPSRRRKYESPTTPNLFPNSQHHEIVNNEGNETRDIIDNSNVIENSKYDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYSLLE
***SCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGK***************TP******************TRDIIDNSNVIENSKYDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYSLLE
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhhhhooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiihhhhhhhhhhhhhhhhoooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGKPYFVPSRRRKYESPTTPNLFPNSQHHEIVNNEGNETRDIIDNSNVIENSKYDAFLQASFADKKFLEMLQLNLEMLEVANHFLADYSLLE
no confident homologs detected
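
Per-residue prediction strings such as the secondary structure rows above are easier to read once they are collapsed into residue ranges. The following is a minimal sketch, assuming that H marks helix, E marks strand and c marks coil as in the PSIPRED and SSPRO rows. The function name `segments` and the short example string are hypothetical and are not taken from this report.

```python
from itertools import groupby


def segments(states, wanted):
    """Return 1-based inclusive (start, end) runs of `wanted` in a per-residue string."""
    runs = []
    pos = 1  # 1-based position of the first residue in the current run
    for state, group in groupby(states):
        length = len(list(group))
        if state == wanted:
            runs.append((pos, pos + length - 1))
        pos += length
    return runs


# Hypothetical 16-residue prediction string, for illustration only.
example = "ccccEEEccHHHHHcc"
print(segments(example, "H"))  # [(10, 14)]
print(segments(example, "E"))  # [(5, 7)]
```

The same helper could be applied to any of the per-residue rows, provided the meaning of each symbol in that row is known.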

Close Homologs for Annotation Transfer

Close Homologs in SWISS-PROT Database Detected by BLAST

ID      Length  Definition                 RBH(Q2H)  RBH(H2Q)  Q cover  H cover  Identity  E-value
Query   180     2.2.26 [Sep-21-2011]
Q9H5L6  903     THAP domain-containing pr  yes       N/A       0.6      0.119    0.310     5e-08
Q8TBB0  222     THAP domain-containing pr  no        N/A       0.572    0.463    0.307     2e-06
Q8VCZ3  309     THAP domain-containing pr  yes       N/A       0.444    0.258    0.367     1e-05
Q9BT49  309     THAP domain-containing pr  no        N/A       0.444    0.258    0.356     2e-05
Q642B6  569     THAP domain-containing pr  no        N/A       0.505    0.159    0.317     0.0002
Q2TBI2  584     THAP domain-containing pr  no        N/A       0.455    0.140    0.326     0.0006
Q9H0W7  228     THAP domain-containing pr  no        N/A       0.444    0.350    0.317     0.0006
Q8WY91  577     THAP domain-containing pr  no        N/A       0.455    0.142    0.333     0.0007
Q9D305  217     THAP domain-containing pr  no        N/A       0.444    0.368    0.317     0.0007
Q6P3Z3  569     THAP domain-containing pr  no        N/A       0.455    0.144    0.333     0.0008
>sp|Q9H5L6|THAP9_HUMAN THAP domain-containing protein 9 OS=Homo sapiens GN=THAP9 PE=2 SV=2
 Score = 57.4 bits (137), Expect = 5e-08,   Method: Compositional matrix adjust.
 Identities = 36/116 (31%), Positives = 56/116 (48%), Gaps = 8/116 (6%)

Query: 1   MPPSCCVPTCKL---MRNNSEKLSYHEIPSKEPLRTNWIKQIGILT--GNKFWQPTSESA 55
           M  SC    C     + +    LS+H+ P+    R+ WI+ +  +     K W P    A
Sbjct: 1   MTRSCSAVGCSTRDTVLSRERGLSFHQFPTDTIQRSKWIRAVNRVDPRSKKIWIP-GPGA 59

Query: 56  VVCSKHFIEPDFVETPLRRRLKPTSVPSV--FYMMQGKPYFVPSRRRKYESPTTPN 109
           ++CSKHF E DF    +RR+LK  +VPSV  + + QG      +R++  + P   N
Sbjct: 60  ILCSKHFQESDFESYGIRRKLKKGAVPSVSLYKIPQGVHLKGKARQKILKQPLPDN 115





Homo sapiens (taxid: 9606)
>sp|Q8TBB0|THAP6_HUMAN THAP domain-containing protein 6 OS=Homo sapiens GN=THAP6 PE=2 SV=1
>sp|Q8VCZ3|THAP7_MOUSE THAP domain-containing protein 7 OS=Mus musculus GN=Thap7 PE=2 SV=1
>sp|Q9BT49|THAP7_HUMAN THAP domain-containing protein 7 OS=Homo sapiens GN=THAP7 PE=1 SV=2
>sp|Q642B6|THAP4_RAT THAP domain-containing protein 4 OS=Rattus norvegicus GN=Thap4 PE=2 SV=1
>sp|Q2TBI2|THAP4_BOVIN THAP domain-containing protein 4 OS=Bos taurus GN=THAP4 PE=2 SV=2
>sp|Q9H0W7|THAP2_HUMAN THAP domain-containing protein 2 OS=Homo sapiens GN=THAP2 PE=1 SV=1
>sp|Q8WY91|THAP4_HUMAN THAP domain-containing protein 4 OS=Homo sapiens GN=THAP4 PE=1 SV=2
>sp|Q9D305|THAP2_MOUSE THAP domain-containing protein 2 OS=Mus musculus GN=Thap2 PE=2 SV=1
>sp|Q6P3Z3|THAP4_MOUSE THAP domain-containing protein 4 OS=Mus musculus GN=Thap4 PE=2 SV=1

Close Homologs in the Non-Redundant Database Detected by BLAST

GI         Length  Definition                                Q cover  H cover  Identity  E-value
Query      180
241615798  231     conserved hypothetical protein [Ixodes s  0.761    0.593    0.333     1e-10
58379912   193     AGAP009530-PA [Anopheles gambiae str. PE  0.483    0.450    0.402     2e-09
346468253  332     hypothetical protein [Amblyomma maculatu  0.544    0.295    0.355     6e-09
241700819  314     conserved hypothetical protein [Ixodes s  0.566    0.324    0.371     1e-08
427786543  296     Hypothetical protein [Rhipicephalus pulc  0.811    0.493    0.319     6e-08
427787159  338     Hypothetical protein [Rhipicephalus pulc  0.438    0.233    0.373     9e-08
170066772  722     zinc finger protein 358 [Culex quinquefa  0.577    0.144    0.375     1e-07
241695299  301     hypothetical protein IscW_ISCW012574 [Ix  0.505    0.302    0.329     1e-07
241286592  219     conserved hypothetical protein [Ixodes s  0.472    0.388    0.414     1e-07
241802319  222     conserved hypothetical protein [Ixodes s  0.444    0.360    0.471     2e-07
>gi|241615798|ref|XP_002406808.1| conserved hypothetical protein [Ixodes scapularis] gi|215500875|gb|EEC10369.1| conserved hypothetical protein [Ixodes scapularis]
 Score = 71.6 bits (174), Expect = 1e-10,   Method: Compositional matrix adjust.
 Identities = 49/147 (33%), Positives = 73/147 (49%), Gaps = 10/147 (6%)

Query: 1   MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQI---GILTGNKFWQPTSESAVV 57
           +P +C VP C +   N+  +S+HEIPSK  LR  W+K+    G   G ++   +S  AV+
Sbjct: 1   VPKNCSVPLCTVTERNNAVVSFHEIPSKPELRAAWLKKSLRQGTSKGTEWV--SSSRAVI 58

Query: 58  CSKHFIEPDFVETPLRRRLKPTSVPSVFYMMQGKPYFVP---SRRRKYESPTTPNLFPNS 114
           CS HF + DF +   RR L   +VPSVF  ++   Y  P   S+RR+ E         ++
Sbjct: 59  CSLHFTKQDFKQGAKRRMLLSEAVPSVF--LEYPSYLQPAPASKRRQLEREFCETPSTST 116

Query: 115 QHHEIVNNEGNETRDIIDNSNVIENSK 141
             H     E  +T   + N N  E S+
Sbjct: 117 SAHPESPTETEDTFADVGNDNTTETSQ 143




Source: Ixodes scapularis

Species: Ixodes scapularis

Genus: Ixodes

Family: Ixodidae

Order: Ixodida

Class: Arachnida

Phylum: Arthropoda

Superkingdom: Eukaryota

>gi|58379912|ref|XP_310159.2| AGAP009530-PA [Anopheles gambiae str. PEST] gi|55244055|gb|EAA05877.2| AGAP009530-PA [Anopheles gambiae str. PEST]
>gi|346468253|gb|AEO33971.1| hypothetical protein [Amblyomma maculatum]
>gi|241700819|ref|XP_002411899.1| conserved hypothetical protein [Ixodes scapularis] gi|215504839|gb|EEC14333.1| conserved hypothetical protein [Ixodes scapularis]
>gi|427786543|gb|JAA58723.1| Hypothetical protein [Rhipicephalus pulchellus]
>gi|427787159|gb|JAA59031.1| Hypothetical protein [Rhipicephalus pulchellus]
>gi|170066772|ref|XP_001868218.1| zinc finger protein 358 [Culex quinquefasciatus] gi|167862961|gb|EDS26344.1| zinc finger protein 358 [Culex quinquefasciatus]
>gi|241695299|ref|XP_002413044.1| hypothetical protein IscW_ISCW012574 [Ixodes scapularis] gi|215506858|gb|EEC16352.1| hypothetical protein IscW_ISCW012574 [Ixodes scapularis]
>gi|241286592|ref|XP_002407001.1| conserved hypothetical protein [Ixodes scapularis] gi|215496977|gb|EEC06617.1| conserved hypothetical protein, partial [Ixodes scapularis]
>gi|241802319|ref|XP_002414529.1| conserved hypothetical protein [Ixodes scapularis] gi|215508740|gb|EEC18194.1| conserved hypothetical protein [Ixodes scapularis]

Prediction of Gene Ontology (GO) Terms

Close Homologs with Gene Ontology terms Detected by BLAST

ID                Length  Definition                      Q cover  H cover  Identity  E-value
Query             180
UNIPROTKB|F2Z371  102     THAP9 "THAP domain-containing   0.488    0.862    0.347     2.9e-09
UNIPROTKB|B4E146  160     THAP6 "cDNA FLJ52043, highly s  0.583    0.656    0.310     5.9e-09
UNIPROTKB|D6RCT5  102     THAP9 "THAP domain-containing   0.461    0.813    0.348     5.9e-09
UNIPROTKB|D6REM3  109     THAP9 "THAP domain-containing   0.461    0.761    0.348     5.9e-09
UNIPROTKB|Q8TBB0  222     THAP6 "THAP domain-containing   0.583    0.472    0.310     7.3e-09
UNIPROTKB|F1PQC2  887     THAP9 "Uncharacterized protein  0.655    0.133    0.318     1.4e-08
UNIPROTKB|J9P5J1  902     THAP9 "Uncharacterized protein  0.655    0.130    0.318     1.4e-08
UNIPROTKB|F1MY07  903     THAP9 "Uncharacterized protein  0.6      0.119    0.310     2.3e-08
UNIPROTKB|Q9H5L6  903     THAP9 "DNA transposase THAP9"   0.661    0.131    0.305     3.8e-08
UNIPROTKB|F1RYW3  222     THAP6 "Uncharacterized protein  0.711    0.576    0.277     3.8e-08
UNIPROTKB|F2Z371 THAP9 "THAP domain-containing protein 9" [Homo sapiens (taxid:9606)]
 Score = 136 (52.9 bits), Expect = 2.9e-09, P = 2.9e-09
 Identities = 33/95 (34%), Positives = 48/95 (50%)

Query:     1 MPPSCCVPTCKL---MRNNSEKLSYHEIPSKEPLRTNWIKQIGILT--GNKFWQPTSESA 55
             M  SC    C     + +    LS+H+ P+    R+ WI+ +  +     K W P    A
Sbjct:     1 MTRSCSAVGCSTRDTVLSRERGLSFHQFPTDTIQRSKWIRAVNRVDPRSKKIWIP-GPGA 59

Query:    56 VVCSKHFIEPDFVETPLRRRLKPTSVPSV-FYMMQ 89
             ++CSKHF E DF    +RR+LK  +VPSV  Y M+
Sbjct:    60 ILCSKHFQESDFESYGIRRKLKKGAVPSVSLYKMR 94




GO:0003676 "nucleic acid binding" evidence=IEA
UNIPROTKB|B4E146 THAP6 "cDNA FLJ52043, highly similar to THAP domain-containing protein 6" [Homo sapiens (taxid:9606)]
UNIPROTKB|D6RCT5 THAP9 "THAP domain-containing protein 9" [Homo sapiens (taxid:9606)]
UNIPROTKB|D6REM3 THAP9 "THAP domain-containing protein 9" [Homo sapiens (taxid:9606)]
UNIPROTKB|Q8TBB0 THAP6 "THAP domain-containing protein 6" [Homo sapiens (taxid:9606)]
UNIPROTKB|F1PQC2 THAP9 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|J9P5J1 THAP9 "Uncharacterized protein" [Canis lupus familiaris (taxid:9615)]
UNIPROTKB|F1MY07 THAP9 "Uncharacterized protein" [Bos taurus (taxid:9913)]
UNIPROTKB|Q9H5L6 THAP9 "DNA transposase THAP9" [Homo sapiens (taxid:9606)]
UNIPROTKB|F1RYW3 THAP6 "Uncharacterized protein" [Sus scrofa (taxid:9823)]

Prediction of Enzyme Commission (EC) Number

EC Number Prediction by Annotation Transfer from SWISS-PROT Entries

No confident hit for EC number transfer in SWISS-PROT detected by BLAST

EC Number Prediction by EFICAz Software

No EC number assignment, probably not an enzyme!


Prediction of Functionally Associated Proteins


Conserved Domains and Related Protein Families

Conserved Domains Detected by RPS-BLAST

ID          Length  Definition                                          E-value
Query       180
pfam05485   84      pfam05485, THAP, THAP domain                        6e-17
smart00980  80      smart00980, THAP, The THAP domain is a putative DN  2e-15
>gnl|CDD|218604 pfam05485, THAP, THAP domain
 Score = 71.3 bits (175), Expect = 6e-17
 Identities = 31/82 (37%), Positives = 40/82 (48%), Gaps = 5/82 (6%)

Query: 4  SCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFI 63
           CCVP C         +S+H  P    L   W+K +G       W+PT  S  +CSKHF 
Sbjct: 2  KCCVPGCN-RSKRGPGVSFHRFPKDPELLRKWLKNLGR---EDDWKPTKNS-RICSKHFE 56

Query: 64 EPDFVETPLRRRLKPTSVPSVF 85
             F     RRRLKP +VP++F
Sbjct: 57 PDCFDNRGGRRRLKPGAVPTLF 78


The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes. Length = 84
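
The C2CH consensus quoted above can be written as a regular expression and checked against the query. The sketch below is only an illustration: the pattern is a direct transcription of the stated spacing, `find_c2ch` is a hypothetical helper, and a plain regex match is not a substitute for the profile-based detection (RPS-BLAST, HHsearch) used in this report. The sequence is the first 90 residues of the psy5284 query.

```python
import re

# First 90 residues of the psy5284 query sequence shown in this report.
QUERY_N_TERM = (
    "MPPSCCVPTCKLMRNNSEKLSYHEIPSKEP"
    "LRTNWIKQIGILTGNKFWQPTSESAVVCSK"
    "HFIEPDFVETPLRRRLKPTSVPSVFYMMQG"
)

# Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His
C2CH = re.compile(r"C.{2,4}C.{35,50}C.{2}H")


def find_c2ch(sequence):
    """Return the 1-based inclusive (start, end) of the first C2CH match, or None."""
    match = C2CH.search(sequence)
    if match is None:
        return None
    return match.start() + 1, match.end()


print(find_c2ch(QUERY_N_TERM))  # (5, 61)
```

In this query the match runs from Cys5 through Cys10 and Cys58 to His61, which lies inside the region aligned to the THAP profile above.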

>gnl|CDD|214951 smart00980, THAP, The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion

Conserved Domains Detected by HHsearch

ID          Length  Definition                                          Probability
Query       180
PF05485     84      THAP: THAP domain; InterPro: IPR006612 Zinc finger  99.89
smart00692  59      DM3 Zinc finger domain in CG10631, C. elegans LIN-  99.72
>PF05485 THAP: THAP domain; InterPro: IPR006612 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule
Probab=99.89  E-value=1.2e-23  Score=148.63  Aligned_cols=82  Identities=35%  Similarity=0.791  Sum_probs=60.0

Q ss_pred             cceeeccCCCCccCCCCeEEEecCCChhHHHHHHHHhccccCCcccCCCCCceeeccCCccCCCcccCCCCcccCCCcee
Q psy5284           3 PSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTSVP   82 (180)
Q Consensus         3 ~~C~V~gC~n~~~k~~~isff~FPkd~~~r~~Wi~~~~~~~~~~~~~p~~~~~~VCs~HF~~~~f~~~~~~~~Lk~~AVP   82 (180)
                      ++|+|+||.+......+++||+||+|++++++|++++++..   .+.+... .+||+.||++++|.....+++|+++|||
T Consensus         1 r~C~v~~C~~~~~~~~~~~f~~fP~d~~~~~~W~~~~~~~~---~~~~~~~-~~ICs~HF~~~~~~~~~~~~~L~~~AVP   76 (84)
T PF05485_consen    1 RKCCVPGCSNSSSRKPGVSFFRFPKDPERRKKWLKACGRED---WWKPTKN-SRICSRHFEPDDFRRSSKRRRLKPDAVP   76 (84)
T ss_dssp             --ETSSSTTTSTCCTTSS-EEE--SSHHHHHHHHHHHTSTC---G-GTSTT-SEEEGGGSTGGGBSTTTSSSSB-TT---
T ss_pred             CEEEeccCcCCCeeCCCeEEEECCCCHHHHHHHHHHhcccc---cccccCC-ccchhhhCchhhcccccCCCcCCCCCCC
Confidence            57999999666666789999999999999999999999953   2345554 8999999999999766889999999999


Q ss_pred             ecccCC
Q psy5284          83 SVFYMM   88 (180)
Q Consensus        83 tlf~~~   88 (180)
                      |||++.
T Consensus        77 tl~~~~   82 (84)
T PF05485_consen   77 TLFLPP   82 (84)
T ss_dssp             CCC---
T ss_pred             cCcCCC
Confidence            999864



Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.

The THAP domain is an ~90-residue domain restricted to animals, which is shared between the THAP family of cellular DNA-binding proteins, and transposases from mobile genomic parasites. The defined THAP domain includes: a C2CH signature (consensus: C-x(2,4)-C-x(35,50)-C-x(2)-H); three additional key residues that are strictly conserved in all THAP domains that have been found to date (THAP1 amino acids P26, W36, F58); a C-terminal AVPTIF box; and several other conserved amino acid positions with distinct physicochemical properties (e.g. hydrophobic and polar). The THAP domain can be found in one or more copies and can be associated with other domains, such as the C2H2-type zinc finger. The THAP domain is supposed to be a DNA-binding domain (DBD). More information about these proteins can be found at Protein of the Month: Zinc Fingers.

GO: 0003676 nucleic acid binding; PDB: 3KDE_C 2D8R_A 2JM3_A 2KO0_A 2JTG_A 2L1G_A.

>smart00692 DM3 Zinc finger domain in CG10631, C

Homologous Structure Templates

Structure Templates Detected by BLAST

ID      Length  Definition                                          E-value
Query   180
2d8r_A  99      Solution Structure Of The Thap Domain Of The Human  5e-05
>pdb|2D8R|A Chain A, Solution Structure Of The Thap Domain Of The Human Thap Domain-Containing Protein 2 Length = 99

Iteration: 1

 Score = 43.9 bits (102), Expect = 5e-05,   Method: Compositional matrix adjust.
 Identities = 27/85 (31%), Positives = 38/85 (44%), Gaps = 5/85 (5%)

Query: 1  MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSK 60
          MP +C    C    N    +S+H  P     R  W++    L   K + P  +   +CSK
Sbjct: 8  MPTNCAAAGCATTYNKHINISFHRFPLDPKRRKEWVR----LVRRKNFVP-GKHTFLCSK 62

Query: 61 HFIEPDFVETPLRRRLKPTSVPSVF 85
          HF    F  T   RRLK  +VP++F
Sbjct: 63 HFEASCFDLTGQTRRLKMDAVPTIF 87
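
The identity and gap counts reported for this 2D8R alignment can be re-derived from the two aligned strings. The sketch below is illustrative only: `alignment_stats` is a hypothetical helper and not part of BLAST, and it does not compute Positives, which would require the substitution matrix. Truncating the percentages is an assumption that happens to reproduce the figures reported here.

```python
# Aligned query (psy5284, residues 1-85) and subject (2D8R chain A, residues 8-87).
QUERY = (
    "MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSK"
    "HFIEPDFVETPLRRRLKPTSVPSVF"
)
SBJCT = (
    "MPTNCAAAGCATTYNKHINISFHRFPLDPKRRKEWVR----LVRRKNFVP-GKHTFLCSK"
    "HFEASCFDLTGQTRRLKMDAVPTIF"
)


def alignment_stats(query, sbjct):
    """Return (identities, gaps, alignment_length) for two aligned strings."""
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have equal length")
    # An identity is a column where both residues match and neither is a gap.
    identities = sum(q == s and q != "-" for q, s in zip(query, sbjct))
    # A gap column has a '-' in either the query or the subject.
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, sbjct))
    return identities, gaps, len(query)


identities, gaps, length = alignment_stats(QUERY, SBJCT)
print(f"Identities = {identities}/{length} ({100 * identities // length}%)")  # 27/85 (31%)
print(f"Gaps = {gaps}/{length} ({100 * gaps // length}%)")  # 5/85 (5%)
```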

Structure Templates Detected by RPS-BLAST

ID      Length  Definition                                          E-value
Query   180
2jtg_A  87      THAP domain-containing protein 1; zinc finger, CCC  3e-21
2d8r_A  99      THAP domain-containing protein 2; NPPSFA, national  8e-21
2jm3_A  91      Hypothetical protein; zinc finger, domain, metal b  7e-10
3kde_C  77      Transposable element P transposase; THAP domain, D  5e-05
>2jtg_A THAP domain-containing protein 1; zinc finger, CCCH, DNA-binding, metal-binding, zinc-finger, metal binding protein; NMR {Homo sapiens} PDB: 2ko0_A 2l1g_A* Length = 87
 Score = 82.2 bits (203), Expect = 3e-21
 Identities = 25/86 (29%), Positives = 39/86 (45%), Gaps = 6/86 (6%)

Query: 1  MPPSCCVPTCKLMRNNSEKLSYHEIPSKEP-LRTNWIKQIGILTGNKFWQPTSESAVVCS 59
          M  SC    CK   +  + +S+H+ P   P L   W   +      K ++PT  S  +CS
Sbjct: 1  MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLCKEWEAAVR----RKNFKPTKYS-SICS 55

Query: 60 KHFIEPDFVETPLRRRLKPTSVPSVF 85
          +HF    F      + LK  +VP++F
Sbjct: 56 EHFTPDCFKRECNNKLLKENAVPTIF 81


>2d8r_A THAP domain-containing protein 2; NPPSFA, national project on protein structural and functional analyses; NMR {Homo sapiens} SCOP: g.39.1.16 Length = 99
>2jm3_A Hypothetical protein; zinc finger, domain, metal binding protein; NMR {Caenorhabditis elegans} Length = 91
>3kde_C Transposable element P transposase; THAP domain, DNA-binding domain, zinc-finger, beta-alpha-BET element transposase, DNA integration; HET: BRU; 1.74A {Drosophila melanogaster} Length = 77

Structure Templates Detected by HHsearch

ID      Length  Definition                                          Probability
Query   180
2jtg_A  87      THAP domain-containing protein 1; zinc finger, CCC  99.95
2d8r_A  99      THAP domain-containing protein 2; NPPSFA, national  99.94
2lau_A  81      THAP domain-containing protein 11; zinc finger, pr  99.91
2jm3_A  91      Hypothetical protein; zinc finger, domain, metal b  99.84
3kde_C  77      Transposable element P transposase; THAP domain, D  99.6
>2jtg_A THAP domain-containing protein 1; zinc finger, CCCH, DNA-binding, metal-binding, zinc-finger, metal binding protein; NMR {Homo sapiens} PDB: 2ko0_A 2l1g_A*
Probab=99.95  E-value=2.4e-28  Score=174.08  Aligned_cols=83  Identities=28%  Similarity=0.553  Sum_probs=75.5

Q ss_pred             CCcceeeccCCCCccCCCCeEEEecC-CChhHHHHHHHHhccccCCcccCCCCCceeeccCCccCCCcccCCCCcccCCC
Q psy5284           1 MPPSCCVPTCKLMRNNSEKLSYHEIP-SKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPT   79 (180)
Q Consensus         1 M~~~C~V~gC~n~~~k~~~isff~FP-kd~~~r~~Wi~~~~~~~~~~~~~p~~~~~~VCs~HF~~~~f~~~~~~~~Lk~~   79 (180)
                      ||+.|+|+||.|++.++.+++||+|| +|++++++|+++|++    +.|.|.+. ++|||+||+++||.....+++|+++
T Consensus         1 M~~~C~v~~C~n~~~~~~~~~f~~FP~kd~~~~~~W~~~~~~----~~~~~~~~-~~iCs~HF~~~~f~~~~~~~~Lk~~   75 (87)
T 2jtg_A            1 MVQSCSAYGCKNRYDKDKPVSFHKFPLTRPSLCKEWEAAVRR----KNFKPTKY-SSICSEHFTPDCFKRECNNKLLKEN   75 (87)
T ss_dssp             CCCCCSSTTCCCCCCSSSCCCEEECCSSCCTTHHHHHHHHCS----SCCCTTTT-SEEEGGGSCGGGGCCCCSSCCCCTT
T ss_pred             CCCccEeCCCCCCCcCCCCeEEEECCCCChHHHHHHHHHhCc----ccCccCCC-CEEccccCcHhHhhccCCcCeeCCC
Confidence            99999999999988767789999999 899999999999998    36777764 8999999999999987778999999


Q ss_pred             ceeecccCC
Q psy5284          80 SVPSVFYMM   88 (180)
Q Consensus        80 AVPtlf~~~   88 (180)
                      ||||||...
T Consensus        76 AVPtif~~~   84 (87)
T 2jtg_A           76 AVPTIFLEL   84 (87)
T ss_dssp             CCCGGGCCC
T ss_pred             CCCcCcCCC
Confidence            999999864



>2d8r_A THAP domain-containing protein 2; NPPSFA, national project on protein structural and functional analyses; NMR {Homo sapiens} SCOP: g.39.1.16
>2lau_A THAP domain-containing protein 11; zinc finger, protein-DNA complex, DNA binding domain, transc factor, CCCH, transcription-DNA complex; NMR {Homo sapiens}
>2jm3_A Hypothetical protein; zinc finger, domain, metal binding protein; NMR {Caenorhabditis elegans}
>3kde_C Transposable element P transposase; THAP domain, DNA-binding domain, zinc-finger, beta-alpha-BET element transposase, DNA integration; HET: BRU; 1.74A {Drosophila melanogaster}

Homologous Structure Domains

Structure Domains Detected by RPS-BLAST

ID       Length  Definition                                         E-value
Query    180
d2d8ra1  86      g.39.1.16 (A:8-93) THAP domain-containing protein  2e-20
>d2d8ra1 g.39.1.16 (A:8-93) THAP domain-containing protein 2 {Human (Homo sapiens) [TaxId: 9606]} Length = 86

class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: THAP domain
domain: THAP domain-containing protein 2
species: Human (Homo sapiens) [TaxId: 9606]
 Score = 79.0 bits (194), Expect = 2e-20
 Identities = 26/85 (30%), Positives = 37/85 (43%), Gaps = 5/85 (5%)

Query: 1  MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSK 60
          MP +C    C    N    +S+H  P     R  W++ +      K + P      +CSK
Sbjct: 1  MPTNCAAAGCATTYNKHINISFHRFPLDPKRRKEWVRLV----RRKNFVPGK-HTFLCSK 55

Query: 61 HFIEPDFVETPLRRRLKPTSVPSVF 85
          HF    F  T   RRLK  +VP++F
Sbjct: 56 HFEASCFDLTGQTRRLKMDAVPTIF 80


Homologous Domains Detected by HHsearch

ID       Length  Definition                                          Probability
Query    180
d2d8ra1  86      THAP domain-containing protein 2 {Human (Homo sapi  99.94
>d2d8ra1 g.39.1.16 (A:8-93) THAP domain-containing protein 2 {Human (Homo sapiens) [TaxId: 9606]}
class: Small proteins
fold: Glucocorticoid receptor-like (DNA-binding domain)
superfamily: Glucocorticoid receptor-like (DNA-binding domain)
family: THAP domain
domain: THAP domain-containing protein 2
species: Human (Homo sapiens) [TaxId: 9606]
Probab=99.94  E-value=5.9e-28  Score=170.19  Aligned_cols=82  Identities=30%  Similarity=0.594  Sum_probs=73.7

Q ss_pred             CCcceeeccCCCCccCCCCeEEEecCCChhHHHHHHHHhccccCCcccCCCCCceeeccCCccCCCcccCCCCcccCCCc
Q psy5284           1 MPPSCCVPTCKLMRNNSEKLSYHEIPSKEPLRTNWIKQIGILTGNKFWQPTSESAVVCSKHFIEPDFVETPLRRRLKPTS   80 (180)
Q Consensus         1 M~~~C~V~gC~n~~~k~~~isff~FPkd~~~r~~Wi~~~~~~~~~~~~~p~~~~~~VCs~HF~~~~f~~~~~~~~Lk~~A   80 (180)
                      ||++|+|+||.+++.++.+++||+||+|++++++|++++++.    .+.++.. .+|||+||+++||.....+++|+++|
T Consensus         1 M~~~C~v~~C~~~~~~~~~~~ff~fP~d~~~~~~W~~~~~~~----~~~~~~~-~~ICs~HF~~~~~~~~~~~~~L~~~A   75 (86)
T d2d8ra1           1 MPTNCAAAGCATTYNKHINISFHRFPLDPKRRKEWVRLVRRK----NFVPGKH-TFLCSKHFEASCFDLTGQTRRLKMDA   75 (86)
T ss_dssp             CCCCCCSSSCCCSCCSSCCCCCEECCSSHHHHHHHHHHTTCT----TCCCSSS-CEECTTSSCSTTBSCTTSSCCBCTTC
T ss_pred             CCCeeEcCCCCCCCCCCCCEEEEECCCCHHHHHHHHHHhCCc----ccccCCc-cEEeCCcCChhhhcccCCCCEeCCCC
Confidence            999999999999887778899999999999999999999983    4566654 89999999999998877888999999


Q ss_pred             eeecccC
Q psy5284          81 VPSVFYM   87 (180)
Q Consensus        81 VPtlf~~   87 (180)
                      |||||.-
T Consensus        76 VPtiF~~   82 (86)
T d2d8ra1          76 VPTIFDF   82 (86)
T ss_dssp             CCSCCCC
T ss_pred             ccceeCC
Confidence            9999963