TriCyp

Three-state cysteine classification across ECOD F70 representative domains — disulfide-bonded, metal-binding, or free thiol — combining ESM2 predictions with PDB structural evidence.


© 2026 Schaeffer & Cong Labs, UT Southwestern Medical Center

data · paper-v1 · refreshed 2026-05-06

Paper Fig 2 + Fig S1

Benchmark

Held-out benchmarking of ESM2-3state against the structure-aware baselines used in the manuscript. Disulfide prediction is compared against SSBONDPredict; metal-binding prediction against LMetalSite and GPSite. Operating thresholds were chosen on the held-out validation set (metal-binding ≥ 0.972, disulfide ≥ 0.742).
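As an illustration of how a fixed operating threshold can be derived from validation scores, the sketch below picks the lowest cutoff that still reaches a chosen precision. It assumes the thresholds were tuned to a precision target (as the Operating thresholds section states); the function name, the toy scores, and the 0.75 target are hypothetical, and this is not the authors' tuning code.

```python
from typing import Optional, Sequence


def lowest_threshold_for_precision(
    scores: Sequence[float],
    labels: Sequence[int],
    target_precision: float,
) -> Optional[float]:
    """Return the smallest cutoff t such that calling `score >= t` positive
    reaches at least `target_precision`; None if no cutoff does."""
    # Sort by descending score: each prefix of this list is the set of
    # predictions called positive at some cutoff.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best = None
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # A cutoff cannot separate tied scores, so only evaluate precision
        # at the last item of each run of equal scores.
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_tie and true_pos / (i + 1) >= target_precision:
            best = score  # lowest cutoff seen so far that meets the target
    return best


# Toy validation data (hypothetical): 1 = true positive class, 0 = other.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 1, 0, 1, 0]
cutoff = lowest_threshold_for_precision(toy_scores, toy_labels, 0.75)
```

Choosing the lowest qualifying cutoff maximizes recall subject to the precision constraint; other tie-breaking rules are equally defensible.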

Held-out evaluation · v2 (Apr 2026)

AUROC and average precision

Disulfide

Tool            AUROC   AP
ESM2-3state     0.987   0.966
SSBONDPredict   0.971   0.894

Metal-binding (all metals)

Tool            AUROC   AP
ESM2-3state     0.994   0.943
LMetalSite      0.979   0.892
GPSite          0.975   0.881

On the v2 (zinc-rebalanced) benchmark, all three metal-binding tools score in the same band (AUROC 0.975–0.994) because most held-out positives are zinc — a metal LMetalSite and GPSite were trained on. Stratifying by metal type isolates where ESM2-3state actually differs: on the shared metals (Zn / Ca / Mg / Mn) the AUROC values are essentially tied (0.994–0.996), and the residual difference in the all-metals number comes entirely from iron coordination — Fe-S clusters and especially heme — which the specialist tools were not designed to predict. The per-stratum table below makes the scope-vs-architecture read explicit.
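A minimal sketch of metal-type stratification is given here, using the rank (Mann-Whitney) form of AUROC in plain Python. It assumes each stratum keeps only the positives of the named metal types and scores them against the full set of negatives; the benchmark's actual stratification protocol is not described on this page, and the record layout and metal labels are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# (score, label, metal): metal is None for non-metal-binding cysteines.
Record = Tuple[float, int, Optional[str]]


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative,
    counting ties as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auroc_by_stratum(
    records: List[Record], strata: Dict[str, Set[str]]
) -> Dict[str, Optional[float]]:
    """AUROC per stratum: stratum positives vs. all negatives (assumption)."""
    negatives = [score for score, label, _ in records if label == 0]
    result: Dict[str, Optional[float]] = {}
    for name, metals in strata.items():
        positives = [s for s, label, m in records if label == 1 and m in metals]
        # AUROC is undefined without at least one positive and one negative.
        if positives and negatives:
            result[name] = auroc(positives, negatives)
        else:
            result[name] = None
    return result


# Toy data (hypothetical scores, not benchmark values).
toy_records: List[Record] = [
    (0.99, 1, "Zn"), (0.95, 1, "Zn"),
    (0.40, 1, "heme"), (0.90, 1, "heme"),
    (0.10, 0, None), (0.50, 0, None), (0.05, 0, None),
]
toy_strata = {"shared": {"Zn", "Ca", "Mg", "Mn"}, "heme": {"heme"}}
per_stratum = auroc_by_stratum(toy_records, toy_strata)
```

The pairwise loop is quadratic in the number of cysteines, which is fine for illustration; a sort-based rank computation would be the usual choice at benchmark scale.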

Fig 2

Per-task ROC, PR, and threshold tuning

Held-out benchmarking of three-state cysteine classification. Panels A–C compare ESM2-3state against SSBONDPredict for disulfide prediction (ROC, PR, and threshold-tuning curves). Panels D–F repeat the comparison for metal-binding against LMetalSite and GPSite.

  • Fig 2A: Disulfide ROC, ESM2-3state vs SSBONDPredict
  • Fig 2B: Disulfide precision–recall
  • Fig 2C: Disulfide threshold tuning
  • Fig 2D: Metal-binding ROC vs LMetalSite and GPSite
  • Fig 2E: Metal-binding precision–recall
  • Fig 2F: Metal-binding threshold tuning

Fig S1

Metal-type-stratified ROC strip

Metal-type-stratified ROC. On the metals all three tools were trained on (Zn / Ca / Mg / Mn), AUROC values are essentially tied (0.994–0.996). The differences in the all-metals comparison come entirely from the iron stratum (Fe-S clusters and heme), which the specialist tools were not designed to predict. Heme is the worst case for them (LMetalSite 0.838, GPSite 0.711 vs ESM2-3state 0.981). The iron stratum is roughly 15% of held-out metal positives in the v2 (zinc-rebalanced) benchmark.

Fig S1 — metal-type-stratified ROC strip

Iron stratum (Fe / Fe-S / heme)

The iron-stratum AUROC gap reflects training-set scope rather than algorithmic superiority. ESM2-3state was trained directly on cysteine 3-state labels covering Fe / Fe-S / heme coordination; LMetalSite and GPSite were trained on Zn / Ca / Mg / Mn binding. On the metals for which all three tools share training coverage (Zn / Ca / Mg / Mn), the AUROC values are essentially tied (0.994–0.996). The difference shows up specifically on iron coordination, and particularly on heme (ESM2-3state 0.981 vs LMetalSite 0.838 vs GPSite 0.711), a sub-domain the specialist tools were not designed for. Read the iron-stratum advantage as a coverage statement, not a head-to-head outperformance claim.

Tool          AUROC · Fe
ESM2-3state   0.993
LMetalSite    0.917
GPSite        0.877

AUROC / AP per tool per stratum

Tool           Task           Stratum                      AUROC  AP
ESM2-3state    Disulfide      All metals                   0.987  0.966
SSBONDPredict  Disulfide      All metals                   0.971  0.894
ESM2-3state    Metal-binding  All metals                   0.994  0.943
LMetalSite     Metal-binding  All metals                   0.979  0.892
GPSite         Metal-binding  All metals                   0.975  0.881
ESM2-3state    Metal-binding  Shared metals (Zn/Ca/Mg/Mn)  0.996  0.946
LMetalSite     Metal-binding  Shared metals (Zn/Ca/Mg/Mn)  0.994  0.932
GPSite         Metal-binding  Shared metals (Zn/Ca/Mg/Mn)  0.996  0.944
ESM2-3state    Metal-binding  Iron only                    0.993  0.594
LMetalSite     Metal-binding  Iron only                    0.917  0.209
GPSite         Metal-binding  Iron only                    0.877  0.114
ESM2-3state    Metal-binding  Iron · [4Fe-4S]              0.995  0.431
LMetalSite     Metal-binding  Iron · [4Fe-4S]              0.919  0.130
GPSite         Metal-binding  Iron · [4Fe-4S]              0.871  0.053
ESM2-3state    Metal-binding  Iron · heme                  0.991  0.152
LMetalSite     Metal-binding  Iron · heme                  0.838  0.003
GPSite         Metal-binding  Iron · heme                  0.706  0.001
ESM2-3state    Metal-binding  Iron · [2Fe-2S] / [3Fe-4S]   0.992  0.360
LMetalSite     Metal-binding  Iron · [2Fe-2S] / [3Fe-4S]   0.920  0.103
GPSite         Metal-binding  Iron · [2Fe-2S] / [3Fe-4S]   0.926  0.063

Tabular summary of the held-out benchmark (v2, Apr 2026). All-metals AUROC + AP, the shared-metal subset (Zn/Ca/Mg/Mn), and the per-iron-cofactor strata (4Fe-4S, heme, 2Fe-2S/3Fe-4S) are all transcribed from the v2 protocol; per-stratum AP is not reported in the source and renders as em-dashes.

Operating thresholds

The classification published on TriCyp uses fixed thresholds chosen on the held-out validation set: a cysteine is called metal-binding when P(Met) ≥ 0.972, disulfide when P(Dis) ≥ 0.742, and otherwise free thiol. The two thresholds were tuned independently to the same per-task precision target on held-out data; raw probabilities for every cysteine remain available via the per-cysteine TSV download (see Downloads) so users can re-threshold for their own use case.
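The calling rule above can be sketched in a few lines, together with a helper for re-thresholding downloaded probabilities. The two cutoffs are the ones quoted in the text; everything else is an assumption for illustration: the function names, the TSV column names `p_metal` and `p_disulfide`, and the choice to test metal-binding before disulfide (the page does not say which call wins if both probabilities clear their cutoffs).

```python
import csv
import io
from typing import List

METAL_THRESHOLD = 0.972      # P(Met) cutoff quoted in the text
DISULFIDE_THRESHOLD = 0.742  # P(Dis) cutoff quoted in the text


def call_state(
    p_met: float,
    p_dis: float,
    metal_threshold: float = METAL_THRESHOLD,
    disulfide_threshold: float = DISULFIDE_THRESHOLD,
) -> str:
    """Map per-cysteine probabilities to one of the three states."""
    # ASSUMPTION: metal-binding is checked first, following the order in
    # which the text lists the rules; the real precedence is not stated.
    if p_met >= metal_threshold:
        return "metal-binding"
    if p_dis >= disulfide_threshold:
        return "disulfide"
    return "free thiol"


def rethreshold_tsv(
    tsv_text: str, metal_threshold: float, disulfide_threshold: float
) -> List[str]:
    """Re-call every row of a per-cysteine TSV with custom cutoffs.

    The column names `p_metal` and `p_disulfide` are hypothetical; check
    the header of the actual download before using this.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        call_state(
            float(row["p_metal"]),
            float(row["p_disulfide"]),
            metal_threshold,
            disulfide_threshold,
        )
        for row in reader
    ]


# Toy TSV (hypothetical rows, not TriCyp data).
toy_tsv = "residue\tp_metal\tp_disulfide\nC12\t0.95\t0.10\nC40\t0.02\t0.80\n"
default_calls = rethreshold_tsv(toy_tsv, METAL_THRESHOLD, DISULFIDE_THRESHOLD)
relaxed_calls = rethreshold_tsv(toy_tsv, 0.90, DISULFIDE_THRESHOLD)
```

In the toy rows, relaxing the metal cutoff to 0.90 moves the first cysteine from free thiol to metal-binding, which is the kind of recall-for-precision trade a user might make for screening.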