Why This Matters
Probing the structure of proteins and complexes remains a bottleneck for biology and drug discovery. This project provides openly available models and tooling that aim to close the gap between research-grade folding systems and practical, application-ready predictions, especially for antibody–antigen complexes and ligand-aware plausibility checks.
What Sets It Apart
- Published multi-release model lineage with measurable gains: protenix-v2 (≈464M params, 2026-04-08) improves antibody–antigen success rates and ligand plausibility over earlier releases, while protenix-v1 matches or exceeds contemporaneous models on benchmarks under similar training-data cutoffs.
- Reproducible evaluation and datasets: PXMeter and curated benchmark sets are released alongside the code to standardize comparisons and remove experimental artifacts, so cross-model results are directly comparable and reproducible.
- Practical inference features: support for protein and RNA MSAs, templates, inference-time sampling scaling, atom-level contact/pocket constraints, and performance-focused optimizations (kernel fusion, TF32 acceleration), all aimed at better quality under realistic compute budgets.
- Design & downstream tools: companion projects (PXDesign, Protenix-Dock) enable de novo binder design and classical docking workflows, connecting structure prediction to experimental design and ligand docking pipelines.
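The inference-time sampling scaling mentioned in the list above can be pictured as a best-of-N selection: draw more candidates, then keep the one the model ranks highest. The snippet below is only a minimal sketch of that idea. `mock_predict`, `Sample`, and the `confidence` score are hypothetical stand-ins that return random numbers; they are not part of this project's API, and the project's actual ranking metric may differ.

```python
import random
from dataclasses import dataclass


@dataclass
class Sample:
    seed: int
    confidence: float  # hypothetical ranking score in [0, 1)


def mock_predict(seed: int) -> Sample:
    """Stand-in for one sampling run; NOT a real model call."""
    rng = random.Random(seed)  # seeded so each run is repeatable
    return Sample(seed=seed, confidence=rng.random())


def best_of_n(n_samples: int, base_seed: int = 0) -> Sample:
    """Draw n_samples candidates and keep the highest-confidence one."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    candidates = [mock_predict(base_seed + i) for i in range(n_samples)]
    return max(candidates, key=lambda s: s.confidence)


if __name__ == "__main__":
    # Larger budgets reuse the smaller budgets' seeds, so the best
    # score can only stay the same or improve as the budget grows.
    for budget in (1, 5, 25):
        best = best_of_n(budget)
        print(f"budget={budget:>2} best_seed={best.seed} score={best.confidence:.3f}")
```

Because each larger budget contains the candidates of the smaller ones, the selected score is non-decreasing in the budget. The cost grows linearly with the number of samples, which is the trade-off the performance optimizations are meant to soften.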
Who It's For and Trade-offs
Great fit if you need an open, reproducible structure-prediction stack for research or applied bioengineering (antibody design, binder discovery, ligand plausibility) and you want to run or reproduce published benchmarks. It’s also useful when inference-time sampling and constraint-based refinement matter for challenging targets.
Look elsewhere if you want minimal setup for simple single-chain folding, where a lightweight single-model runner is a better fit, or if you need commercial support or SLAs. This repository is research-focused and assumes familiarity with model inference pipelines and MSAs, as well as access to heavy compute for occasional large-scale sampling.
Where It Fits
The project sits between research reproductions of AlphaFold3-scale systems and application-oriented design tools: it reproduces and extends large-parameter folding models while providing evaluation and design toolkits that bridge prediction and experimental testing.
Method Snapshot
The codebase bundles model implementations, training/inference pipelines, and released checkpoints with clear training-data cutoff metadata (e.g., a 2021-09-30 cutoff and a later variant trained with a 2025-06-30 cutoff). The project emphasizes inference-time strategies (sampling budgets, constraint inputs) and engineering optimizations to make higher-accuracy sampling cost-effective for practical use.
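Cutoff metadata matters because a fair benchmark should only contain structures the checkpoint could not have seen during training. As a rough sketch of how such a filter might look, the example below keeps only entries released after a given cutoff. The two dates come from the text; the dictionary labels, entry IDs, and release dates are made up for illustration and do not correspond to real checkpoints or benchmark targets.

```python
from datetime import date

# Cutoff dates quoted in the text; the keys are hypothetical labels.
TRAINING_CUTOFFS = {
    "early_cutoff": date(2021, 9, 30),
    "late_cutoff": date(2025, 6, 30),
}

# Hypothetical benchmark entries: IDs and release dates are illustrative only.
BENCHMARK = [
    {"id": "target_a", "released": date(2020, 5, 1)},
    {"id": "target_b", "released": date(2023, 1, 15)},
    {"id": "target_c", "released": date(2025, 9, 1)},
]


def held_out(entries, cutoff):
    """Return IDs of entries released strictly after the training-data cutoff."""
    return [e["id"] for e in entries if e["released"] > cutoff]


if __name__ == "__main__":
    for label, cutoff in TRAINING_CUTOFFS.items():
        print(label, cutoff.isoformat(), held_out(BENCHMARK, cutoff))
```

A later cutoff shrinks the set of usable evaluation targets, which is why comparisons between models are only meaningful when their cutoffs are similar.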
