
Protenix

High-accuracy biomolecular structure prediction suite: open-source models (protenix-v2/v1), a benchmark/evaluation toolkit, and a web server for inference. Targets protein/antibody–antigen and ligand-aware predictions with inference-time sampling and constraint support.

Introduction

Why this matters

Probing the structure of proteins and complexes remains a bottleneck for biology and drug discovery. This project provides openly available models and tooling that aim to close the gap between research-grade folding systems and practical, application-ready predictions, especially for antibody–antigen complexes and ligand-aware plausibility checks.

What Sets It Apart
  • Published multi-release model lineage with measurable gains: protenix-v2 (≈464M params, 2026-04-08) improves antibody–antigen success rates and ligand plausibility versus earlier releases, while protenix-v1 matches or exceeds contemporaneous benchmarks under similar data cutoffs.
  • Reproducible evaluation and datasets: PXMeter and curated benchmark sets are released alongside the code to standardize comparisons and remove experimental artifacts, so that results from different models are evaluated the same way and can be reproduced.
  • Practical inference features: support for MSA/RNA MSA, templates, inference-time sampling scaling, atom-level contact/pocket constraints, and performance-focused optimizations (kernel fusion, TF32 acceleration), all aimed at better prediction quality under realistic compute budgets.
  • Design & downstream tools: companion projects (PXDesign, Protenix-Dock) enable de novo binder design and classical docking workflows, connecting structure prediction to experimental design and ligand docking pipelines.
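
To make the idea of atom-level contact constraints concrete, the following is a minimal sketch of how one might check, after the fact, whether a predicted structure satisfies a set of contact constraints. It is not Protenix's input schema or API: the `Atom` tuple layout, the function names, and the distance thresholds are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical types for illustration only; not Protenix's constraint format.
Atom = Tuple[str, int, str]  # (chain id, residue index, atom name)
Coords = Dict[Atom, Tuple[float, float, float]]  # atom -> (x, y, z) in angstroms
Constraint = Tuple[Atom, Atom, float]  # (atom a, atom b, maximum distance)


def contact_satisfied(coords: Coords, a: Atom, b: Atom, max_dist: float) -> bool:
    """Return True if atoms a and b lie within max_dist angstroms of each other."""
    return math.dist(coords[a], coords[b]) <= max_dist


def satisfied_fraction(coords: Coords, constraints: List[Constraint]) -> float:
    """Fraction of contact constraints met by a single predicted structure."""
    if not constraints:
        return 1.0  # nothing to violate
    met = sum(contact_satisfied(coords, a, b, d) for a, b, d in constraints)
    return met / len(constraints)


# Toy coordinates standing in for one predicted complex.
toy_coords: Coords = {
    ("A", 10, "CA"): (0.0, 0.0, 0.0),
    ("B", 5, "CA"): (3.0, 4.0, 0.0),   # 5 angstroms from A:10:CA
    ("B", 6, "CA"): (0.0, 0.0, 20.0),  # 20 angstroms from A:10:CA
}
toy_constraints: List[Constraint] = [
    (("A", 10, "CA"), ("B", 5, "CA"), 8.0),
    (("A", 10, "CA"), ("B", 6, "CA"), 8.0),
]
```

A check like this can be used to filter or rank sampled structures by how many user-specified contacts they respect; how the model itself consumes constraints at inference time is a separate matter documented in the repository.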

Who It's For and Trade-offs

Great fit if you need an open, reproducible structure-prediction stack for research or applied bioengineering (antibody design, binder discovery, ligand plausibility) and you want to run or reproduce published benchmarks. It is also useful when inference-time sampling and constraint-based refinement matter for challenging targets.

Look elsewhere if you want minimal setup for simple single-chain folding, where a lightweight single-model runner is a better match, or if you need commercial support or SLAs. This repository is research-focused and assumes familiarity with model inference pipelines and MSAs, plus occasional heavy compute for large-scale sampling.

Where It Fits

Protenix sits between research reproductions of AlphaFold3-scale systems and application-oriented design tools: it reproduces and extends large-parameter folding models while providing evaluation and design toolkits that bridge prediction to experimental testing.

Method snapshot

The codebase bundles model implementations, training/inference pipelines, and released checkpoints with clear training-data cutoff metadata (e.g., a 2021-09-30 cutoff and a later variant trained with a 2025-06-30 cutoff). The project emphasizes inference-time strategies (sampling budgets, constraint inputs) and engineering optimizations to make higher-accuracy sampling cost-effective for practical use.
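
The notion of a sampling budget can be illustrated with a generic best-of-N loop: draw several candidates with different seeds, score each, and keep the highest-scoring one. The sketch below is an assumption-laden illustration, not the project's inference code. `best_of_n` and `toy_sampler` are hypothetical names, and the random score merely stands in for whatever confidence or ranking score a real model would report.

```python
import random
from typing import Callable, Tuple

# A candidate is (label, score); in practice the label would be a structure.
Candidate = Tuple[str, float]


def best_of_n(sample_fn: Callable[[int], Candidate],
              n_samples: int,
              base_seed: int = 0) -> Candidate:
    """Draw n_samples candidates (one per seed) and return the top-scoring one."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    candidates = [sample_fn(base_seed + i) for i in range(n_samples)]
    return max(candidates, key=lambda c: c[1])


def toy_sampler(seed: int) -> Candidate:
    """Stand-in for a model call: returns a label and a fake confidence score."""
    rng = random.Random(seed)
    return f"sample_{seed}", rng.random()


# With a shared base seed, a larger budget covers a superset of the seeds,
# so the best score can only stay the same or improve.
small_budget = best_of_n(toy_sampler, n_samples=5)
large_budget = best_of_n(toy_sampler, n_samples=25)
```

The trade-off is linear cost in the number of samples for a best score that never decreases, which is why the engineering optimizations mentioned above matter: they reduce the cost of each individual sample.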

Information

  • Website: github.com
  • Authors: ByteDance AI4Science (AML) Team
  • Published date: 2024/11/08

Categories