License: ESMFold2 is open source and free for academic and commercial use under an MIT license. Please refer to the license for full terms.

Proto is not affiliated with Biohub. This toolkit is open source and builds on the implementation produced by this organization. Product names, logos, and trademarks are the property of their respective owners.


Biohub/esm
@misc{candido2026language,
  title={Language Modeling Materializes a World Model of Protein Biology},
  author={Candido, Salvatore and Hayes, Thomas and Derry, Alexander and Rao, Roshan and Lin, Zeming and Verkuil, Robert and Wu, Bryan and Lee, Jin Sub and Bruguera, Elise S. and Keval, Jehan A. and Kopylov, Mykhailo and Pak, John E. and Wu, Wesley and Thomas, Neil and Mataraso, Samson and Hsu, Alvin and Trotman-Grant, Ashton C. and Fatras, Kilian and dos Santos Costa, Allan and Badkundri, Rohil and Ak{\i}n, Halil and Oktay, Deniz and Deaton, Jonathan and Montabana, Elizabeth and Sitwala, Hrishita and Yu, Yue and Wiggert, Marius and Carlin, Dylan Alexander and Goering, Anthony W. and Blazejewski, Tomasz and Sandora, McCullen and Hla, Michael and Jia, Tina Z. and Kloker, Leon H. and Sofroniew, Nicholas J. and Uehara, Masatoshi and Pannu, Jassi and Bachas, Sharrol and Liu, Daniel S. and Sercu, Tom and Rives, Alexander},
  year={2026},
  url={https://biohub.ai/papers/esm_protein.pdf},
  note={Preprint}
}
proto-bio/proto-tools/proto_tools/tools/structure_prediction/esmfold2
Function | Description
run_esmfold2() | All-atom biomolecular complex structure prediction using ESMFold2 from Biohub. (GPU)

Background

ESMFold2 (Candido et al., 2026) extends the ESM family from protein-only single-sequence folding to all-atom prediction of biomolecular complexes. Where the original ESMFold (Lin et al., 2023) used the ESM-2 protein language model as a learned substitute for an MSA and folded a single protein chain into a backbone, ESMFold2 supports proteins, DNA, RNA, small-molecule ligands, modified residues, and covalent bonds in a single joint prediction, comparable in scope to AlphaFold3 and Boltz-2. The model can be run in single-sequence mode, or, when an MSA is available for a protein chain, conditioned on the alignment to recover the evolutionary signal that aids prediction of difficult or sparsely-engineered targets.

Architecturally, ESMFold2 conditions on representations from the frozen ESMC 6B language model, pools them into a two-dimensional pair representation refined through a stack of folding layers with a stabilized recurrent update, and concludes with a diffusion transformer that denoises directly into all-atom coordinates. Two inference-time parameters, the number of refinement loops through the folding stack and the number of diffusion sampling steps, trade computation time for accuracy and can materially improve predictions on difficult targets, especially antibody-antigen complexes.

Alongside the structure, ESMFold2 reports calibrated confidence: a per-residue predicted local distance difference test (pLDDT), a predicted aligned error (PAE) for the relative placement of any two tokens, and predicted template-modeling (pTM) and interface predicted template-modeling (ipTM) scores that summarize overall and interface accuracy.

Two checkpoints are available. esmfold2 is the larger, MSA-capable model recommended for difficult or long targets where alignment signal aids prediction; esmfold2-fast is an inference-optimized single-sequence variant intended for high-throughput applications. Both are distributed under the MIT license at Biohub/esm, the consolidated package that also distributes ESM3 and ESM C.

Learning Resources

  • ESMFold2 model card (Biohub) - architecture details, training data, benchmark results, and intended-use guidance for the MSA-capable checkpoint.

Tools

ESMFold2 Structure Prediction (esmfold2-prediction)

Predicts the all-atom 3D structure of a biomolecular complex. Each input complex can combine protein, DNA, RNA, and ligand chains (with optional chain-level modifications and covalent bonds); the assembly is folded by ESMFold2 and returned as a predicted Structure per complex with confidence metrics: pLDDT, pTM, interface pTM (for multi-chain complexes), and predicted aligned error.

API Reference

Input
  • complexes (List[Complex], required): List of biomolecular complexes to fold. Inherited from StructurePredictionInput.
  • msas (array): Pre-computed MSAs, one entry per complex. Each entry is a ComplexMSAs (per-chain MSAs keyed by chain index); paired=True marks rows taxonomy-aligned across chains. Populated by Config.preprocess() or supplied directly. Only consumed when Config.model_checkpoint == "esmfold2".

Config
  • model_checkpoint (enum, default: "esmfold2-fast"): Which ESMFold2 variant to load. Available options: esmfold2, esmfold2-fast.
  • num_loops (integer, default: 3): Iterative refinement loops through the model. Higher = more accurate but slower.
  • num_sampling_steps (integer, default: 50): Diffusion sampling steps for the structure module. Higher = more refined but slower.
  • diffusion_samples (integer, default: 1): Independent diffusion samples per complex; the highest-pLDDT sample is returned. Higher = better quality but slower.
  • step_scale (number, default: None): Diffusion step size override (typical range 1.0 to 2.0). Lower values produce more sample diversity. None uses the upstream sampler default.
  • noise_scale (number, default: None): Diffusion noise scale override. None uses the upstream sampler default.
  • max_inference_sigma (number, default: None): Maximum sigma value for the diffusion sampler. None uses the upstream default (256.0).
  • early_exit (boolean, default: False): Exit refinement loops early when convergence is detected.
  • verbose (integer, default: 0): Verbosity level (0=quiet, 1=info, 2=debug, 3=raw subprocess stderr). True is coerced to 1 and False to 0.
  • device (string, default: "cuda"): Device to run the model on. Inherited.
  • timeout (integer, default: 1200): Maximum execution time in seconds. None waits indefinitely.
  • seed (integer): Random seed. When set, tools run reproducibly up to small GPU float noise (see BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.
  • include_pae_matrix (boolean, default: False): Attach the full per-token PAE matrix to metrics (avg_pae is always emitted). Inherited.
  • use_msa (boolean, default: False): Whether to generate MSAs for protein chains via MMseqs2 homology search. Only valid with model_checkpoint='esmfold2'.
  • msa_search_config (Mmseqs2HomologySearchConfig, default: None): Configuration for MMseqs2 homology search (MSA generation). Only used when use_msa=True. Inherited.
  • pair_heterocomplex_msas (boolean, default: True): Whether heterocomplex protein chains should use taxonomy-paired MSA generation. Inherited.

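The configuration fields and their constraints can be sketched as a plain dataclass. This is a hypothetical stand-in (ESMFold2ConfigSketch is not the toolkit's class) that copies the documented defaults and the documented rule that use_msa=True is only valid with the esmfold2 checkpoint. The use of ValueError and the "at least 1" checks on the integer fields are assumptions made for illustration; msa_search_config is omitted because its type is not described here.

```python
from dataclasses import dataclass
from typing import Optional

CHECKPOINTS = ("esmfold2", "esmfold2-fast")


@dataclass
class ESMFold2ConfigSketch:
    """Hypothetical mirror of the documented config fields; not the toolkit class."""

    model_checkpoint: str = "esmfold2-fast"
    num_loops: int = 3
    num_sampling_steps: int = 50
    diffusion_samples: int = 1
    step_scale: Optional[float] = None           # None: upstream sampler default
    noise_scale: Optional[float] = None          # None: upstream sampler default
    max_inference_sigma: Optional[float] = None  # None: upstream default (256.0)
    early_exit: bool = False
    verbose: int = 0
    device: str = "cuda"
    timeout: Optional[int] = 1200                # seconds; None waits indefinitely
    seed: Optional[int] = None
    include_pae_matrix: bool = False
    use_msa: bool = False
    pair_heterocomplex_msas: bool = True

    def __post_init__(self) -> None:
        # Documented coercion: True becomes 1 and False becomes 0.
        self.verbose = int(self.verbose)
        if self.model_checkpoint not in CHECKPOINTS:
            raise ValueError(f"unknown checkpoint: {self.model_checkpoint!r}")
        # Documented rule: MSA generation requires the MSA-capable checkpoint.
        if self.use_msa and self.model_checkpoint != "esmfold2":
            raise ValueError("use_msa=True is only valid with model_checkpoint='esmfold2'")
        # Assumption: counts below 1 are rejected (not stated in the reference).
        for name in ("num_loops", "num_sampling_steps", "diffusion_samples"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")


fast = ESMFold2ConfigSketch()
with_msa = ESMFold2ConfigSketch(model_checkpoint="esmfold2", use_msa=True)
```

Validating at construction time means an invalid combination fails before any GPU work starts, which is the behaviour the reference describes for use_msa with esmfold2-fast.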
Output
  • structures (List[Structure], required): Predicted structures, each carrying an ESMFold2Metrics instance on .metrics.

Metrics (one set per structures item)

Metric | Type | Range | Availability
plddt | float | 0.0 to 1.0 | always
ptm | float | 0.0 to 1.0 | always
iptm | float | 0.0 to 1.0 | depends on complex composition
avg_pae | float | 0.0 to 32.0 | always
pae | list[list[float]] | 0.0 to 32.0 | when include_pae_matrix=True
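As a rough illustration of how these metrics might be consumed, the sketch below summarizes a PAE matrix and picks the highest-pLDDT sample from a list of metric dictionaries. It operates on plain Python lists and dicts, not on the toolkit's Structure or ESMFold2Metrics objects. Two assumptions are made: that an average PAE can be approximated by the plain mean of the matrix (the toolkit's own avg_pae computation is not described here), and that a per-token chain label list is available to define the inter-chain block.

```python
from typing import Dict, List, Sequence


def mean_pae(pae: Sequence[Sequence[float]]) -> float:
    """Plain mean over every entry of a PAE matrix (angstroms)."""
    values = [v for row in pae for v in row]
    return sum(values) / len(values)


def interchain_pae(
    pae: Sequence[Sequence[float]],
    chain_ids: Sequence[str],
    a: str,
    b: str,
) -> float:
    """Mean PAE over token pairs that span chains a and b, in both directions."""
    values = [
        pae[i][j]
        for i, ci in enumerate(chain_ids)
        for j, cj in enumerate(chain_ids)
        if (ci == a and cj == b) or (ci == b and cj == a)
    ]
    if not values:
        raise ValueError(f"no token pairs between chains {a!r} and {b!r}")
    return sum(values) / len(values)


def best_by_plddt(samples: List[Dict[str, float]]) -> Dict[str, float]:
    """Return the metrics dict with the highest mean pLDDT."""
    return max(samples, key=lambda m: m["plddt"])


# Toy 3-token example: two tokens in chain A, one in chain B.
toy_pae = [
    [0.5, 1.0, 8.0],
    [1.0, 0.5, 9.0],
    [7.0, 10.0, 0.5],
]
toy_chains = ["A", "A", "B"]
interface_error = interchain_pae(toy_pae, toy_chains, "A", "B")  # (8 + 9 + 7 + 10) / 4
```

A low within-chain PAE combined with a high inter-chain PAE, as in the toy matrix, suggests each chain is folded confidently while their relative placement is uncertain.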

Applications

This tool predicts the structure of multi-component assemblies such as protein-protein, protein-DNA, protein-RNA, and protein-ligand complexes, including antibody-antigen interfaces where ESMFold2 is reported to be competitive with AlphaFold3. Running it on a multi-chain complex also estimates how confidently the components are placed relative to each other through interface pTM and PAE, which is informative for assessing predicted interfaces.

Usage Tips

  • model_checkpoint selects the variant. esmfold2-fast (default) is the inference-optimized single-sequence model and is appropriate for most high-throughput applications; select esmfold2 (with use_msa=True, or by attaching precomputed msas on the input) for the larger MSA-capable model on difficult or long targets. Setting use_msa=True with esmfold2-fast raises a validation error, and msas supplied with esmfold2-fast are ignored with a logged warning.
  • num_loops (default 3) and num_sampling_steps (default 50) trade computation for accuracy. Both parameters materially affect prediction quality, with the largest gains on difficult targets such as antibody-antigen complexes. Increasing either improves accuracy but extends runtime; decreasing them accelerates high-throughput screens at some accuracy cost.
  • Multi-modal inputs. Protein, DNA, RNA, and small-molecule ligand chains are supported; ligands can be specified by CCD code or SMILES, and chain modifications and covalent bonds are accepted. SMILES-based ligand input is supported but currently has known accuracy issues; CCD codes are recommended.
  • Confidence is reported as pLDDT, pTM, ipTM, and PAE. Mean pLDDT (0 to 1) is the primary per-structure quality metric; iptm is emitted only for multi-chain complexes, and avg_pae is in angstroms (0 to about 32). Set include_pae_matrix=True to attach the full per-token PAE matrix.
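The trade-offs in the tips above can be captured as named presets. In this sketch the parameter names and the "default" values come from the API reference, but the "screening" and "difficult" values are illustrative assumptions, not tuned or recommended settings, and settings_for is a hypothetical helper rather than part of the toolkit.

```python
from typing import Any, Dict

# Values taken from the documented defaults.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "model_checkpoint": "esmfold2-fast",
    "num_loops": 3,
    "num_sampling_steps": 50,
    "diffusion_samples": 1,
    "use_msa": False,
}


def settings_for(profile: str) -> Dict[str, Any]:
    """Return keyword settings for a named profile (illustrative values only)."""
    settings = dict(DOCUMENTED_DEFAULTS)
    if profile == "default":
        return settings
    if profile == "screening":
        # Assumption: fewer loops and steps for throughput, at some accuracy cost.
        settings.update(num_loops=1, num_sampling_steps=25)
        return settings
    if profile == "difficult":
        # Assumption: MSA-capable checkpoint plus extra refinement and sampling.
        settings.update(
            model_checkpoint="esmfold2",
            use_msa=True,
            num_loops=6,
            num_sampling_steps=100,
            diffusion_samples=4,
        )
        return settings
    raise ValueError(f"unknown profile: {profile!r}")


screening = settings_for("screening")
difficult = settings_for("difficult")
```

Keeping use_msa=True tied to the esmfold2 checkpoint inside one preset avoids the invalid combination described above; the specific loop and step counts would need to be chosen per target.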

Toolkit Notes

These apply to every ESMFold2 tool in this toolkit (esmfold2-prediction).
  • Requires a GPU. ESMFold2 runs through a PyTorch backend and needs an NVIDIA GPU; CPU execution is not practical.
  • Shared biohub_esm environment. ESMFold2 is part of the consolidated Biohub/esm package and shares its standalone environment with the ESM3 and ESM C toolkits, so installing any one of them provisions the others.
  • AlphaFold3-style diffusion with optional MSAs. Predictions are stochastic, so set seed for reproducibility across runs. MSAs are only consumed by the esmfold2 checkpoint; the esmfold2-fast checkpoint is single-sequence by construction.
  • Structure prediction only. This toolkit provides ESMFold2’s structure prediction capability; the broader ESM family’s language-model, generation, and embedding capabilities are provided by the sibling ESM3 and ESM C toolkits.
Example notebook: See the full working example for a copy-paste-ready walkthrough.
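The seed behaviour described in this documentation (the seed participates in cache keys, and unseeded runs skip the cache) can be illustrated with a small sketch. The cache_key and metrics_close helpers are hypothetical, and the hashing scheme and tolerance are assumptions; they are not the toolkit's cache implementation or BaseToolOutput.approx_equal.

```python
import hashlib
import json
import math
from typing import Any, Dict, Optional


def cache_key(
    tool: str,
    inputs: Dict[str, Any],
    config: Dict[str, Any],
    seed: Optional[int],
) -> Optional[str]:
    """Return a deterministic key, or None when the run is unseeded."""
    if seed is None:
        # Unseeded stochastic runs are not cacheable in this sketch.
        return None
    payload = json.dumps(
        {"tool": tool, "inputs": inputs, "config": config, "seed": seed},
        sort_keys=True,          # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def metrics_close(a: Dict[str, float], b: Dict[str, float], tol: float = 1e-3) -> bool:
    """Compare two metric dicts up to a small absolute tolerance (assumed value)."""
    return a.keys() == b.keys() and all(
        math.isclose(a[k], b[k], abs_tol=tol) for k in a
    )


example_inputs = {"chains": [{"type": "protein", "sequence": "MKT"}]}
example_config = {"model_checkpoint": "esmfold2-fast", "num_loops": 3}
key_seeded = cache_key("esmfold2-prediction", example_inputs, example_config, seed=0)
key_unseeded = cache_key("esmfold2-prediction", example_inputs, example_config, seed=None)
```

Comparing metrics with a tolerance rather than exact equality reflects the note that seeded runs are reproducible only up to small GPU floating-point noise.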

Infrastructure Guides

The following guides cover how to run tools efficiently and at scale.

Tool Persistence

Keep a tool’s model warm across calls instead of reloading it every invocation.

Device Management

How GPUs are allocated to tools and how to target specific devices.

Parallel Execution

Fan a batch of inputs out across multiple GPUs.

Cloud Inference

Run tools on managed cloud infrastructure with no local setup.