Proto is not affiliated with Boltz, MIT Jameel Clinic, or Recursion. This toolkit is open source and builds on the implementations produced by these organizations. Product names, logos, and trademarks are the property of their respective owners.
Background
Boltz-2 (Passaro et al., 2025) predicts the joint 3D structure of a biomolecular assembly from the sequences and chemical components it contains. It builds on Boltz-1, one of the most widely used open-source alternatives to AlphaFold3, extending that co-folding model with a binding-affinity module, improved controllability, and additional training data. As in AlphaFold3, a single model folds complexes that mix proteins, DNA, RNA, and small-molecule ligands and predicts how those components are arranged relative to one another. Each protein chain can be paired with a multiple-sequence alignment (MSA) of evolutionarily related sequences, whose covariation patterns supply the evolutionary signal the model uses to place residues.
Architecturally, Boltz-2 follows AlphaFold3: it carries a single representation of the input tokens and a pairwise representation over token pairs, refines them through an AlphaFold3-style trunk, and generates all-atom coordinates with a diffusion module that starts from noise and iteratively denoises into a structure. Several structures can be sampled per complex and ranked by a confidence score. Confidence is reported as a complex predicted local distance difference test (pLDDT) for local reliability, a predicted aligned error (PAE) for the relative placement of any two tokens, and predicted template-modeling (pTM) and interface predicted template-modeling (ipTM) scores that summarize overall and interface accuracy.
Beyond structure, Boltz-2 adds a binding-affinity module that approaches the accuracy of physics-based free-energy perturbation while running more than 1000 times faster. The reference implementation is open-sourced at jwohlwend/boltz under the MIT license, covering the code, weights, and training pipeline for both academic and commercial use, with the released weights distributed as boltz-community/boltz-2. It was developed by the Boltz team at the MIT Jameel Clinic together with Recursion.
Learning Resources
- Boltz-2: democratizing biomolecular interaction modeling (MIT Jameel Clinic and Recursion) - an accessible overview of Boltz-2, including how it extends Boltz-1 and adds a binding-affinity capability.
Tools
Boltz-2 Structure Prediction (boltz2-prediction)
Predicts the 3D structure of a biomolecular complex. Each input complex can combine protein, DNA, RNA, and ligand chains; the assembly is folded by Boltz-2 and returned as a predicted Structure per complex with confidence metrics: a complex pLDDT, pTM, interface pTM, per-chain and pairwise-chain pTM/ipTM, and predicted aligned error.
API Reference
Input: Boltz2Input
- StructurePredictionInput. Each complex can contain multiple chains of proteins, DNA, RNA, and/or ligands.
- ComplexMSAs (per-chain MSAs keyed by chain index); paired=True marks rows taxonomy-aligned across chains. Populated by preprocess() or supplied directly.
Config: Boltz2Config
- min(cpu_count, 4).
- StructurePredictionConfig. Default: False.
- "cuda" (NVIDIA GPU), "cpu" (CPU execution), or specific GPU devices like "cuda:0". Structure prediction is computationally intensive and strongly benefits from GPU acceleration. Default: "cuda".
- None waits indefinitely. Default: 1200.
- BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.
- pae (avg_pae always emitted). Default: False.
- MSAStructurePredictionConfig. Default: True.
- use_msa=True. Inherited from MSAStructurePredictionConfig. Default: None.
- MSAStructurePredictionConfig. Default: True.
Output: Boltz2Output
Boltz2Metrics instance on .metrics.
structures item)
| Metric | Type | Range | Availability |
|---|---|---|---|
| confidence_score | float | 0.0 to 1.0 | always |
| ptm | float | 0.0 to 1.0 | always |
| iptm | float | 0.0 to 1.0 | always |
| chains_ptm | list[float] | 0.0 to 1.0 | always |
| pair_chains_iptm | list[list[float]] | 0.0 to 1.0 | always |
| avg_pae | float | 0.0 to 32.0 | always |
| pae | list[list[float]] | 0.0 to 32.0 | when include_pae_matrix=True |
| ligand_iptm | float | 0.0 to 1.0 | depends on complex composition |
| protein_iptm | float | 0.0 to 1.0 | depends on complex composition |
| complex_plddt | float | 0.0 to 1.0 | depends on complex composition |
| complex_iplddt | float | 0.0 to 1.0 | depends on complex composition |
| complex_pde | float | ≥ 0.0 | depends on complex composition |
| complex_ipde | float | ≥ 0.0 | depends on complex composition |
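When the full pae matrix is requested, one way to inspect an interface is to average the PAE over the block of token pairs that span two chains. The sketch below is illustrative only: it assumes the matrix rows and columns are ordered chain by chain and that the caller already knows each chain's token count. Neither assumption is confirmed by this page, and the helper names are hypothetical rather than part of the toolkit.

```python
from statistics import mean


def chain_slices(chain_lengths):
    """Turn per-chain token counts into (start, stop) index ranges."""
    slices, start = [], 0
    for n in chain_lengths:
        slices.append((start, start + n))
        start += n
    return slices


def mean_block_pae(pae, rows, cols):
    """Mean PAE over the rectangular block rows x cols of the matrix."""
    return mean(pae[i][j] for i in range(*rows) for j in range(*cols))


def interchain_pae(pae, chain_lengths):
    """Symmetrised mean PAE for every pair of distinct chains.

    Assumes tokens are laid out chain by chain (an assumption, see above).
    """
    slices = chain_slices(chain_lengths)
    out = {}
    for a in range(len(slices)):
        for b in range(a + 1, len(slices)):
            ab = mean_block_pae(pae, slices[a], slices[b])
            ba = mean_block_pae(pae, slices[b], slices[a])
            out[(a, b)] = (ab + ba) / 2
    return out


# Toy 3-token matrix (made-up numbers): chain 0 has 2 tokens, chain 1 has 1.
toy_pae = [
    [0.5, 1.0, 8.0],
    [1.0, 0.5, 6.0],
    [7.0, 9.0, 0.5],
]
result = interchain_pae(toy_pae, [2, 1])
print(result)
```

A low inter-chain value suggests the two chains are placed confidently relative to each other; it complements pair_chains_iptm rather than replacing it.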
Applications
This tool predicts the structure of multi-component assemblies such as protein-DNA and protein-RNA complexes or protein-ligand binding poses. Running it on a multi-chain complex also estimates how confidently the components are placed relative to each other through interface pTM and PAE, which is informative for assessing predicted interfaces.
Usage Tips
- use_msa defaults to True. A ColabFold search generates an MSA for each protein chain; set it to False for single-sequence prediction, or attach precomputed MSAs to the input. Protein chains with no detectable homologs fall back to an empty MSA.
- Structures come from a diffusion process. diffusion_samples (default 1) independent samples are drawn per complex and the best is kept by confidence_score; sampling_steps (default 200) sets the number of denoising steps and step_scale (default 1.5) trades accuracy for sample diversity, where lower values are more diverse.
- recycling_steps (default 3) trades accuracy for time. More recycling iterations refine the prediction but increase runtime.
- Confidence is reported as a complex pLDDT, pTM, ipTM, and PAE. confidence_score, the primary metric, is iptm for multi-chain complexes and ptm for a single chain; complex_plddt is on a 0 to 1 scale and PAE is in angstroms (0 to about 32). Set include_pae_matrix to attach the full per-token PAE matrix.
- Multi-modal inputs. Protein, DNA, RNA, and ligand entities are supported; chain modifications are not.
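The selection rule described in the tips above (keep the sample with the highest confidence_score, where the score is iptm for multi-chain complexes and ptm for a single chain) can be sketched as follows. SampleMetrics is a hypothetical stand-in used only for illustration; it is not the toolkit's Boltz2Metrics class, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class SampleMetrics:
    """Hypothetical per-sample record, not the toolkit's Boltz2Metrics."""
    ptm: float
    iptm: float
    n_chains: int


def confidence_score(m: SampleMetrics) -> float:
    # Rule as stated in the tips: iptm for multi-chain, ptm for a single chain.
    return m.iptm if m.n_chains > 1 else m.ptm


def pick_best(samples):
    """Return the sample with the highest confidence score."""
    return max(samples, key=confidence_score)


# Three made-up diffusion samples of the same two-chain complex.
samples = [
    SampleMetrics(ptm=0.81, iptm=0.62, n_chains=2),
    SampleMetrics(ptm=0.78, iptm=0.70, n_chains=2),
    SampleMetrics(ptm=0.85, iptm=0.55, n_chains=2),
]
best = pick_best(samples)
print(best)
```

Note that the third sample has the highest ptm but is not selected, because for a multi-chain complex only iptm enters the score under this rule.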
Boltz-2 Affinity (boltz2-affinity)
Predicts the binding affinity of a single small-molecule ligand against a protein target. Each input complex must contain at least one protein chain and at least one ligand chain; the binder is the complex’s sole ligand (auto-detected) or the chain named by binder_chain. Each complex returns a predicted Structure with the binding pose in the CIF and the affinity scores on structure.metrics: affinity_pred_value (log10 IC50 in μM; lower is stronger binding) and affinity_probability_binary (binder probability in [0, 1]).
API Reference
Input: Boltz2AffinityInput
Config: Boltz2AffinityConfig
- False.
- 200.
- 5.
- StructurePredictionConfig. Default: False.
- "cuda" (NVIDIA GPU), "cpu" (CPU execution), or specific GPU devices like "cuda:0". Structure prediction is computationally intensive and strongly benefits from GPU acceleration. Default: "cuda".
- None waits indefinitely. Default: 1200.
- BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.
- False.
- True.
- None.
- True.
- 3.
- 200.
- 1.
- 1.5.
- 8192.
- False.
- min(cpu_count, 4).
Output: Boltz2AffinityOutput
structures item)
| Metric | Type | Range | Availability |
|---|---|---|---|
| affinity_pred_value | float | unbounded | always |
| affinity_probability_binary | float | 0.0 to 1.0 | always |
| affinity_pred_value1 | float | unbounded | when ensemble emits per-model values |
| affinity_probability_binary1 | float | 0.0 to 1.0 | when ensemble emits per-model values |
| affinity_pred_value2 | float | unbounded | when ensemble emits per-model values |
| affinity_probability_binary2 | float | 0.0 to 1.0 | when ensemble emits per-model values |
Applications
This tool ranks candidate ligands against a chosen protein target, pairing a predicted affinity with a predicted binding pose. This supports hit discovery, structure-activity studies, and library-screening loops over a list of SMILES.
Usage Tips
- affinity_pred_value is on a log10-IC50 (μM) scale. Values below 0 (sub-μM IC50) indicate strong binders; positive values indicate weaker binding. affinity_probability_binary is an independent binder probability and can stay high even when the IC50 estimate is uncertain.
- One binder ligand per complex. The binder is auto-detected when a complex has exactly one ligand; set binder_chain (e.g. "B") to name it when a complex has several. The binder must be a ligand chain with at most 128 heavy atoms.
- Structure-side and affinity-side knobs are independent. recycling_steps, sampling_steps, diffusion_samples, and MSA settings control the structure pass that runs first; sampling_steps_affinity (default 200) and diffusion_samples_affinity (default 5) control the affinity pass. Set affinity_mw_correction to apply Boltz-2’s molecular-weight correction to the affinity value head.
- Stochastic predictions. The diffusion-based affinity head is stochastic; set seed for reproducibility.
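Because affinity_pred_value is a log10 of IC50 in μM, it converts to more familiar units with simple arithmetic: IC50 (μM) = 10^value, and pIC50 = 6 - value (since 1 μM = 10^-6 M). The sketch below shows the conversion and a basic ranking over a screening result. The function names, the ligand labels, and the numbers are all illustrative assumptions; only the two metric names come from this page.

```python
def ic50_micromolar(affinity_pred_value: float) -> float:
    """Convert a log10(IC50 in uM) prediction to IC50 in uM."""
    return 10.0 ** affinity_pred_value


def pic50(affinity_pred_value: float) -> float:
    """pIC50 = -log10(IC50 in M) = 6 - log10(IC50 in uM)."""
    return 6.0 - affinity_pred_value


def rank_ligands(predictions):
    """Sort ligands from strongest to weakest predicted binder.

    predictions maps a ligand label to a tuple of
    (affinity_pred_value, affinity_probability_binary).
    Lower affinity_pred_value means stronger predicted binding.
    """
    return sorted(predictions.items(), key=lambda item: item[1][0])


# Made-up predictions for three hypothetical ligands.
predictions = {
    "lig_a": (0.8, 0.35),
    "lig_b": (-1.0, 0.91),
    "lig_c": (-0.2, 0.64),
}

ranked = rank_ligands(predictions)
for name, (value, prob) in ranked:
    print(f"{name}: IC50 = {ic50_micromolar(value):.3g} uM, "
          f"pIC50 = {pic50(value):.2f}, P(binder) = {prob:.2f}")
```

In a screening loop it is usually worth looking at both columns: the binder probability to decide whether a ligand binds at all, and the IC50 estimate to order the likely binders.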
Toolkit Notes
These apply to every Boltz-2 tool in this toolkit (boltz2-prediction, boltz2-affinity).
- Requires a GPU. Boltz-2 runs through a PyTorch backend and needs an NVIDIA GPU; CPU execution is not practical.
- MSA-based and AlphaFold3-style. Boltz-2 uses optional MSAs and a diffusion process. subsample_msa and unseeded runs are intentionally non-deterministic.
- Shared model weights. Both tools run the same bundled Boltz-2 checkpoint; the affinity head ships with it, so boltz2-affinity needs no extra download or environment.
