The Standard for
AI Drug Discovery
The industry's first independent credibility layer. Audit datasets, validate models, and gate claims with verifiable evidence. Covers molecules, peptides, and protein–protein interactions, calibrated on official benchmarks (CASF-2016, DUDE-Z, LIT-PCBA) and certified with Ed25519-signed certificates bound to the evidence they audit.
No Claims Without Verification
LP-Gate provides a comprehensive suite of tools to ensure your AI models are learning real chemistry, not just memorizing dataset bias.
Advanced Leakage Detection
Identify duplicate compounds, scaffold overlaps, and near-neighbor analogs across train/test splits. We use ECFP4 fingerprints with Tanimoto similarity to detect subtle leakage that inflates benchmarks.
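As a rough illustration of the near-neighbor check described above, the sketch below flags train/test pairs whose Tanimoto similarity meets a cutoff. It is not LP-Gate's implementation: fingerprints are assumed to arrive as sets of on-bit indices (in practice these would come from an ECFP4 generator in a cheminformatics toolkit), and the names `tanimoto`, `find_leaks`, and the cutoff values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def find_leaks(
    train: Dict[str, Set[int]],
    test: Dict[str, Set[int]],
    threshold: float = 0.85,  # illustrative cutoff, not an LP-Gate default
) -> List[Tuple[str, str, float]]:
    """Return (test_id, train_id, similarity) for every pair at or above threshold."""
    leaks = []
    for test_id, test_fp in test.items():
        for train_id, train_fp in train.items():
            sim = tanimoto(test_fp, train_fp)
            if sim >= threshold:
                leaks.append((test_id, train_id, sim))
    # Most similar (most suspicious) pairs first.
    return sorted(leaks, key=lambda row: -row[2])


# Toy fingerprints standing in for the on-bits of real ECFP4 vectors.
train_fps = {"train-1": {1, 2, 3, 4, 5, 6, 7, 8}, "train-2": {20, 21, 22}}
test_fps = {
    "test-1": {1, 2, 3, 4, 5, 6, 7, 8},  # exact duplicate of train-1
    "test-2": {1, 2, 3, 4, 5, 6, 7, 9},  # close analog of train-1
    "test-3": {40, 41},                  # unrelated compound
}

leaks = find_leaks(train_fps, test_fps, threshold=0.7)
for test_id, train_id, sim in leaks:
    print(f"{test_id} ~ {train_id}: {sim:.3f}")
```

This train-versus-test comparison is pairwise by design, so for large splits a production check would likely need indexing or batching; the loop here only shows the logic.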
Cryptographic Certificates
Every validation run generates a cryptographically signed JSON certificate that binds the result to a hash of your dataset and to your code commit.
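To make the binding concrete, here is a minimal sketch of how a certificate payload could be assembled before signing. The field names (`dataset_sha256`, `code_commit`, `metrics`), the commit string, and the metric value are all placeholders, not LP-Gate's actual schema. The signing step itself is left out because Ed25519 needs a third-party library; the assumption is that the signature would be computed over exactly the bytes produced here.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def build_certificate_payload(dataset_bytes: bytes, code_commit: str, metrics: dict) -> bytes:
    """Serialize the fields to be signed as canonical (sorted-key, compact) JSON bytes.

    Canonical serialization matters: the same inputs must always yield the
    same bytes, otherwise a signature over them could not be re-verified.
    """
    payload = {
        "dataset_sha256": sha256_hex(dataset_bytes),  # hypothetical field name
        "code_commit": code_commit,                   # hypothetical field name
        "metrics": metrics,                           # hypothetical field name
    }
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Placeholder inputs for illustration only.
dataset = b"smiles,label\nCCO,1\nc1ccccc1,0\n"
payload = build_certificate_payload(dataset, "0123abc", {"example_metric": 0.5})
payload_digest = sha256_hex(payload)
print(payload.decode("utf-8"))
print(payload_digest)
```

Changing a single byte of the dataset changes `dataset_sha256`, and therefore the payload, so a signature issued for one dataset cannot be reused for another.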
High-Throughput Scoring
Linear-time compound scoring via the Target Balance engine. Each molecule is evaluated independently — no pairwise bottleneck — enabling rapid screening across large compound libraries.
Proprietary Scoring Engine
Integration with TRAIRC's Target Balance (TB) algorithm, a biophysics-informed scoring engine that combines molecular descriptors into an E×C product. It has been validated on five independent gold-standard benchmarks: CASF-2013 (ρ=0.434 baseline, ρ=0.531 enhanced transfer, n=195), CASF-2016 (ρ=0.515 OOF, n=285), DUDE-Z (mean ROC-AUC 0.714 across 43 targets, 2,775 actives vs 135,711 decoys), PDBbind general v2020R1 (ρ=0.322 baseline on 18,918 complexes; enhanced ρ=0.449 on a time-split 2015-2019 holdout of 7,201 complexes), and LIT-PCBA (throughput of 1,569 compounds/sec with a 168 MB memory peak on 150,000 compounds). The baseline scorer is the live production default. The enhanced overlay is benchmark-complete, and its production wiring currently runs in shadow mode.
Peptides, PPI & interaction datasets
Inside the API-key–protected validation workspace: optional extensions for peptide sequence stability heuristics, PPI interface analysis from uploaded structures, and train/test leakage screening for protein–protein pair tables. Same authentication and rate limits as molecular validation; outputs are research-grade and calibration-aware.
Calibrated Against Independent Standards
LP-Gate's Target Balance scorer has been evaluated head-to-head on four of the most-used community benchmarks, with a fifth scale-proof benchmark validated in-house. Every number below is reproducible from the signed evidence pack.
Core Set, 5-fold OOF
Enhanced TB E·C on the official CASF-2016 core set. Pearson r = 0.498, out-of-fold.
Core Set, Pure Transfer
Enhanced overlay trained only on CASF-2016 (n=285), applied unmodified to CASF-2013 (n=195). Temporally independent; the transfer score (ρ = 0.531) slightly beats the overlay's own CASF-2016 cross-validation score (0.515). Baseline ρ = 0.434.
Actives vs Decoys
Mean ROC-AUC of 0.714 across 43 DUDE-Z targets, 2,775 actives vs 135,711 decoys. Baseline TB-only: 0.697.
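For readers who want to check an actives-vs-decoys figure themselves, the sketch below computes ROC-AUC from raw scores using the rank-sum (Mann-Whitney U) identity, then averages per-target values. It assumes higher scores mean "more likely active" and that the reported mean is an unweighted average over targets; both are assumptions, and the function names and toy scores are illustrative only.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(active_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """Probability that a random active outscores a random decoy (ties count 0.5)."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    labelled = sorted(
        [(s, 1) for s in active_scores] + [(s, 0) for s in decoy_scores]
    )
    # Assign average 1-based ranks so that tied scores share the same rank.
    rank_sum_actives = 0.0
    i = 0
    while i < len(labelled):
        j = i
        while j < len(labelled) and labelled[j][0] == labelled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_actives += avg_rank * sum(label for _, label in labelled[i:j])
        i = j
    n_act, n_dec = len(active_scores), len(decoy_scores)
    u = rank_sum_actives - n_act * (n_act + 1) / 2.0  # Mann-Whitney U statistic
    return u / (n_act * n_dec)


def mean_auc(per_target: Dict[str, Tuple[Sequence[float], Sequence[float]]]) -> float:
    """Unweighted mean of per-target AUCs (assumed aggregation)."""
    aucs = [roc_auc(actives, decoys) for actives, decoys in per_target.values()]
    return sum(aucs) / len(aucs)


# Toy scores for two hypothetical targets.
toy = {
    "target-A": ([0.9, 0.8], [0.7, 0.1]),  # every active outranks every decoy
    "target-B": ([0.9, 0.4], [0.5, 0.1]),  # one active ranks below a decoy
}
print(mean_auc(toy))
```

An unweighted mean treats a target with a handful of actives the same as one with hundreds, which is worth keeping in mind when comparing aggregate numbers across tools.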
Prospective Screens
Pending public benchmark run. Numbers will be published only after the certified evidence pack is signed.
LP-Gate is a claim admissibility system for AI drug discovery. Not just a better model. Not just a benchmark. A cryptographic standard for what a computational result is allowed to claim.
From Upload to Certified Result
A rigorous, automated pipeline that leaves no room for ambiguity.
Ingest & Sanitize
SMILES canonicalization, salt stripping, and duplicate removal.
Split & Audit
Stratified scaffold splitting with rigorous leakage checks.
Baseline & Control
Compare against honest baselines and random-shuffle controls.
Gate & Certify
Issue a signed certificate only if every check meets its threshold.
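The split, audit, and gate steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the LP-Gate pipeline: scaffold keys are taken as precomputed strings (in practice they might be Bemis-Murcko scaffold SMILES), stratification is omitted, and the check names, scores, and margin in the gate are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def scaffold_split(scaffold_of: Dict[str, str], test_fraction: float = 0.2) -> Tuple[List[str], List[str]]:
    """Assign whole scaffold groups to train or test so no scaffold spans both."""
    groups = defaultdict(list)
    for mol_id, scaffold in scaffold_of.items():
        groups[scaffold].append(mol_id)
    # Largest groups are considered first; ties are broken by first molecule id.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))
    max_train = len(scaffold_of) - int(round(len(scaffold_of) * test_fraction))
    train, test = [], []
    for group in ordered:
        if len(train) + len(group) <= max_train:
            train.extend(group)
        else:
            test.extend(group)
    return train, test


def gate(checks: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Pass only if every named check passed; also report which ones failed."""
    failures = [name for name, ok in checks.items() if not ok]
    return (not failures, failures)


# Toy molecules with placeholder scaffold keys.
scaffold_of = {"m1": "A", "m2": "A", "m3": "A", "m4": "B", "m5": "C"}
train, test = scaffold_split(scaffold_of, test_fraction=0.4)

# Audit: no scaffold may appear on both sides of the split.
shared = {scaffold_of[m] for m in train} & {scaffold_of[m] for m in test}

# Control: hypothetical scores; require a margin over the shuffled-label run.
model_score, shuffled_score, required_margin = 0.50, 0.05, 0.10

passed, failures = gate({
    "no_shared_scaffolds": not shared,
    "beats_shuffle_control": model_score - shuffled_score >= required_margin,
})
print(train, test, passed, failures)
```

Returning the list of failed checks alongside the pass/fail flag means a rejected run can say why it was rejected instead of only that it was.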
Ready to Validate?
Get started with LP-Gate and receive cryptographically signed proof of your model's credibility.