ESMFold2 - Proto

License: ESMFold2 is open source and free for academic and commercial use under an MIT license. Please refer to the license for full terms.

Proto is not affiliated with Biohub. This toolkit is open source and builds on the implementation produced by this organization. Product names, logos, and trademarks are the property of their respective owners.

@misc{candido2026language,
  title={Language Modeling Materializes a World Model of Protein Biology},
  author={Candido, Salvatore and Hayes, Thomas and Derry, Alexander and Rao, Roshan and Lin, Zeming and Verkuil, Robert and Wu, Bryan and Lee, Jin Sub and Bruguera, Elise S. and Keval, Jehan A. and Kopylov, Mykhailo and Pak, John E. and Wu, Wesley and Thomas, Neil and Mataraso, Samson and Hsu, Alvin and Trotman-Grant, Ashton C. and Fatras, Kilian and dos Santos Costa, Allan and Badkundri, Rohil and Ak{\i}n, Halil and Oktay, Deniz and Deaton, Jonathan and Montabana, Elizabeth and Sitwala, Hrishita and Yu, Yue and Wiggert, Marius and Carlin, Dylan Alexander and Goering, Anthony W. and Blazejewski, Tomasz and Sandora, McCullen and Hla, Michael and Jia, Tina Z. and Kloker, Leon H. and Sofroniew, Nicholas J. and Uehara, Masatoshi and Pannu, Jassi and Bachas, Sharrol and Liu, Daniel S. and Sercu, Tom and Rives, Alexander},
  year={2026},
  url={https://biohub.ai/papers/esm_protein.pdf},
  note={Preprint}
}

proto-bio/proto-tools/proto_tools/tools/structure_prediction/esmfold2

Function	Description
`run_esmfold2()`	All-atom biomolecular complex structure prediction using ESMFold2 from Biohub. (GPU)

Background

ESMFold2 (Candido et al., 2026) extends the ESM family from protein-only single-sequence folding to all-atom prediction of biomolecular complexes. Where the original ESMFold (Lin et al., 2023) used the ESM-2 protein language model as a learned substitute for an MSA and folded a single protein chain into a backbone, ESMFold2 supports proteins, DNA, RNA, small-molecule ligands, modified residues, and covalent bonds in a single joint prediction, comparable in scope to AlphaFold3 and Boltz-2. The model can be run in single-sequence mode or, when an MSA is available for a protein chain, conditioned on the alignment to recover the evolutionary signal that aids prediction of difficult or sparsely-engineered targets.

Architecturally, ESMFold2 conditions on representations from the frozen ESM C 6B language model, pools them into a two-dimensional pair representation refined through a stack of folding layers with a stabilized recurrent update, and concludes with a diffusion transformer that denoises directly into all-atom coordinates. Two inference-time parameters, the number of refinement loops through the folding stack and the number of diffusion sampling steps, trade computation time for accuracy and can materially improve predictions on difficult targets, especially antibody-antigen complexes.

Alongside the structure, ESMFold2 reports calibrated confidence: a per-residue predicted local distance difference test (pLDDT), a predicted aligned error (PAE) for the relative placement of any two tokens, and predicted template-modeling (pTM) and interface predicted template-modeling (ipTM) scores that summarize overall and interface accuracy.

Two checkpoints are available. esmfold2 is the larger, MSA-capable model recommended for difficult or long targets where alignment signal aids prediction; esmfold2-fast is an inference-optimized single-sequence variant intended for high-throughput applications. Both are distributed under the MIT license at Biohub/esm, the consolidated package that also distributes ESM3 and ESM C.

Learning Resources

ESMFold2 model card (Biohub) - architecture details, training data, benchmark results, and intended-use guidance for the MSA-capable checkpoint.

Tools

ESMFold2 Structure Prediction (`esmfold2-prediction`)

Predicts the all-atom 3D structure of a biomolecular complex. Each input complex can combine protein, DNA, RNA, and ligand chains (with optional chain-level modifications and covalent bonds); the assembly is folded by ESMFold2 and returned as a predicted Structure per complex with confidence metrics: pLDDT, pTM, interface pTM (for multi-chain complexes), and predicted aligned error.

API Reference

Input: ESMFold2Input

`complexes` (List[Complex], required)
List of biomolecular complexes to fold. Inherited from StructurePredictionInput. Each Complex has a required `chains` field (List[Chain | Fragment]): the chains in the complex, in input order.

`msas` (array)
Pre-computed MSAs, one entry per complex. Each entry is a ComplexMSAs (per-chain MSAs keyed by chain index); paired=True marks rows taxonomy-aligned across chains. Populated by Config.preprocess() or supplied directly. Only consumed when Config.model_checkpoint == "esmfold2".
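
The "one entry per complex, keyed by chain index" layout can be sanity-checked before a run. The sketch below is only an illustration: it uses plain lists and dicts as stand-ins for the toolkit's Complex and ComplexMSAs types, the helper name `check_msa_layout` is hypothetical, and treating `None` as "no MSA for this complex" is an assumption.

```python
from typing import Dict, List, Optional, Sequence


def check_msa_layout(
    chains_per_complex: Sequence[int],
    msas: Sequence[Optional[Dict[int, List[str]]]],
) -> List[str]:
    """Return human-readable problems; an empty list means the layout is consistent.

    chains_per_complex[i] is the number of chains in complex i.
    msas[i] maps a chain index to that chain's alignment rows (None = no MSA).
    Both shapes are illustrative stand-ins, not the toolkit's own types.
    """
    problems: List[str] = []
    # One MSA entry is expected per complex.
    if len(msas) != len(chains_per_complex):
        problems.append(
            f"expected {len(chains_per_complex)} MSA entries, got {len(msas)}"
        )
        return problems
    for i, (n_chains, entry) in enumerate(zip(chains_per_complex, msas)):
        if entry is None:
            continue
        # Every key must point at a chain that exists in the same complex.
        for chain_idx in entry:
            if not 0 <= chain_idx < n_chains:
                problems.append(f"complex {i}: chain index {chain_idx} out of range")
    return problems
```

For a two-chain complex followed by a single-chain complex without an MSA, `check_msa_layout([2, 1], [{0: [...], 1: [...]}, None])` returns an empty list.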

Config: ESMFold2Config

`model_checkpoint` (enum, default: "esmfold2-fast")
Which ESMFold2 variant to load. Available options: esmfold2, esmfold2-fast.

`num_loops` (integer, default: 3)
Iterative refinement loops through the model. Higher = more accurate but slower.

`num_sampling_steps` (integer, default: 50)
Diffusion sampling steps for the structure module. Higher = more refined but slower.

`diffusion_samples` (integer, default: 1)
Independent diffusion samples per complex; the highest-pLDDT sample is returned. Higher = better quality but slower.

`step_scale` (number, default: None)
Diffusion step size override (typical range 1.0 to 2.0). Lower values produce more sample diversity. None uses the upstream sampler default.

`noise_scale` (number, default: None)
Diffusion noise scale override. None uses the upstream sampler default.

`max_inference_sigma` (number, default: None)
Maximum sigma value for the diffusion sampler. None uses the upstream default (256.0).

`early_exit` (boolean, default: False)
Exit refinement loops early when convergence is detected.

`verbose` (integer, default: 0)
Verbosity level (0=quiet, 1=info, 2=debug, 3=raw subprocess stderr). True is coerced to 1 and False to 0.

`device` (string, default: "cuda")
Device to run the model on. Inherited.

`timeout` (integer, default: 1200)
Maximum execution time in seconds. None waits indefinitely.

`seed` (integer)
Random seed. When set, tools run reproducibly up to small GPU float noise (see BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.

`include_pae_matrix` (boolean, default: False)
Attach the full per-token PAE matrix to metrics (avg_pae is always emitted). Inherited.

`use_msa` (boolean, default: False)
Whether to generate MSAs for protein chains via MMseqs2 homology search. Only valid with model_checkpoint='esmfold2'.

`msa_search_config` (Mmseqs2HomologySearchConfig, default: None)
Configuration for MMseqs2 homology search (MSA generation). Only used when use_msa=True. Inherited.

`pair_heterocomplex_msas` (boolean, default: True)
Whether heterocomplex protein chains should use taxonomy-paired MSA generation. Inherited.
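
Two of the rules above, the checkpoint restriction on use_msa and the coercion of verbose, can be expressed as a small validation sketch. `ConfigSketch` is a hypothetical stand-in that mirrors a handful of the documented fields and defaults; it is not ESMFold2Config, and the lower bound of 1 on the integer knobs is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional, Union

CHECKPOINTS = ("esmfold2", "esmfold2-fast")


@dataclass
class ConfigSketch:
    """Hypothetical stand-in mirroring a few documented ESMFold2Config fields."""

    model_checkpoint: str = "esmfold2-fast"
    num_loops: int = 3
    num_sampling_steps: int = 50
    diffusion_samples: int = 1
    use_msa: bool = False
    verbose: Union[int, bool] = 0
    seed: Optional[int] = None

    def __post_init__(self) -> None:
        if self.model_checkpoint not in CHECKPOINTS:
            raise ValueError(f"unknown checkpoint: {self.model_checkpoint!r}")
        # Documented rule: MSA generation is only valid with the esmfold2 checkpoint.
        if self.use_msa and self.model_checkpoint != "esmfold2":
            raise ValueError("use_msa=True requires model_checkpoint='esmfold2'")
        # Assumption (not documented): each integer knob must be at least 1.
        for name in ("num_loops", "num_sampling_steps", "diffusion_samples"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")
        # Documented coercion: True becomes 1 and False becomes 0.
        self.verbose = int(self.verbose)
```

With this stand-in, `ConfigSketch(use_msa=True)` raises a ValueError because the default checkpoint is esmfold2-fast, while `ConfigSketch(model_checkpoint="esmfold2", use_msa=True)` is accepted.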

Output: ESMFold2Output

`structures` (List[Structure], required)
Predicted structures, each carrying an ESMFold2Metrics instance on .metrics.

Structure fields:

`structure` (string, required)
Raw structure content in PDB or CIF format.

`structure_format` (string)
Format of the content string (auto-detected if omitted).

`b_factor_type` (BFactorType)
What the B-factor column represents.

`source` (string)
Optional source identifier (filepath or tool name).

`metrics` (Metrics)
Associated metrics (e.g., pLDDT, pTM scores, per-chain lists, pairwise matrices). None values are stripped at construction.

Metrics (one set per structures item)

Metric	Type	Range	Availability
`plddt`	float	0.0 to 1.0	always
`ptm`	float	0.0 to 1.0	always
`iptm`	float	0.0 to 1.0	depends on complex composition
`avg_pae`	float	0.0 to 32.0 Å	always
`pae`	list[list[float]]	0.0 to 32.0 Å	when include_pae_matrix=True
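
The table above translates into two small checks that are easy to run over a batch of results. The sketch below works on plain dicts of scalar metrics, which is an assumption about how a caller might unpack them; `out_of_range` and `best_by_plddt` are hypothetical helper names. The second function mimics the documented "highest-pLDDT sample is returned" rule for anyone ranking samples themselves.

```python
from typing import Dict, List, Sequence

# Ranges copied from the metrics table; avg_pae is in angstroms.
RANGES = {
    "plddt": (0.0, 1.0),
    "ptm": (0.0, 1.0),
    "iptm": (0.0, 1.0),
    "avg_pae": (0.0, 32.0),
}


def out_of_range(metrics: Dict[str, float]) -> List[str]:
    """Names of scalar metrics that are present but outside their documented range."""
    return [
        name
        for name, (lo, hi) in RANGES.items()
        if name in metrics and not lo <= metrics[name] <= hi
    ]


def best_by_plddt(samples: Sequence[Dict[str, float]]) -> int:
    """Index of the sample with the highest pLDDT (the first one wins ties)."""
    if not samples:
        raise ValueError("no samples to choose from")
    return max(range(len(samples)), key=lambda i: samples[i]["plddt"])
```

Because iptm depends on complex composition, `out_of_range` skips any metric that is absent instead of treating it as an error.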

Applications

This tool predicts the structure of multi-component assemblies such as protein-protein, protein-DNA, protein-RNA, and protein-ligand complexes, including antibody-antigen interfaces where ESMFold2 is reported to be competitive with AlphaFold3. Running it on a multi-chain complex also estimates how confidently the components are placed relative to each other through interface pTM and PAE, which is informative for assessing predicted interfaces.
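
When the full PAE matrix is attached (include_pae_matrix=True), one common way to look at a specific interface is to average the PAE over token pairs that span two chains. The sketch below is a generic illustration, not a toolkit function: `mean_interchain_pae` is a hypothetical name, and it assumes tokens are ordered chain by chain so that each chain occupies a contiguous block of rows and columns.

```python
from typing import List, Sequence


def mean_interchain_pae(
    pae: Sequence[Sequence[float]],
    chain_lengths: Sequence[int],
    a: int,
    b: int,
) -> float:
    """Mean PAE over token pairs with one token in chain a and one in chain b.

    Both directions (a aligned on b, and b aligned on a) are included, since
    a PAE matrix is generally not symmetric. Contiguous per-chain token blocks
    are an assumption about the matrix layout.
    """
    if a == b:
        raise ValueError("a and b must be different chains")
    if sum(chain_lengths) != len(pae):
        raise ValueError("chain lengths do not add up to the PAE matrix size")
    # Start offset of every chain's block of tokens.
    starts = [sum(chain_lengths[:k]) for k in range(len(chain_lengths))]
    tokens_a = range(starts[a], starts[a] + chain_lengths[a])
    tokens_b = range(starts[b], starts[b] + chain_lengths[b])
    values: List[float] = [pae[i][j] for i in tokens_a for j in tokens_b]
    values += [pae[j][i] for i in tokens_a for j in tokens_b]
    return sum(values) / len(values)
```

A low value suggests the two chains are confidently placed relative to each other; what counts as "low" depends on the target and is not specified here.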

Usage Tips

model_checkpoint selects the variant. esmfold2-fast (default) is the inference-optimized single-sequence model and is appropriate for most high-throughput applications; select esmfold2 (with use_msa=True, or by attaching precomputed msas on the input) for the larger MSA-capable model on difficult or long targets. Setting use_msa=True with esmfold2-fast raises a validation error, and msas supplied with esmfold2-fast are ignored with a logged warning.
num_loops (default 3) and num_sampling_steps (default 50) trade computation for accuracy. Both parameters materially affect prediction quality, with the largest gains on difficult targets such as antibody-antigen complexes. Increasing either improves accuracy but extends runtime; decreasing them accelerates high-throughput screens at some accuracy cost.
Multi-modal inputs. Protein, DNA, RNA, and small-molecule ligand chains are supported; ligands can be specified by CCD code or SMILES, and chain modifications and covalent bonds are accepted. SMILES-based ligand input is supported but currently has known accuracy issues; CCD codes are recommended.
Confidence is reported as pLDDT, pTM, ipTM, and PAE. Mean pLDDT (0 to 1) is the primary per-structure quality metric; iptm is emitted only for multi-chain complexes, and avg_pae is in angstroms (0 to about 32). Set include_pae_matrix=True to attach the full per-token PAE matrix.
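
The tips above boil down to two coarse decisions: whether MSAs are wanted, and whether the target is difficult enough to justify extra compute. A minimal sketch of turning those decisions into config keyword arguments follows. `config_kwargs` is a hypothetical helper, and the scaling factor of 2 is an arbitrary illustration, not a recommended or benchmarked setting; only the defaults (3 and 50) and the field names come from the reference above.

```python
from typing import Any, Dict

# Documented defaults for the two accuracy/runtime knobs.
DEFAULTS: Dict[str, Any] = {"num_loops": 3, "num_sampling_steps": 50}


def config_kwargs(difficult: bool, want_msa: bool, scale: int = 2) -> Dict[str, Any]:
    """Build keyword arguments for a config from two coarse decisions.

    `scale` multiplies both knobs for difficult targets; the value 2 is an
    arbitrary illustration, not a tuned setting.
    """
    kwargs: Dict[str, Any] = dict(DEFAULTS)
    if difficult:
        kwargs = {name: value * scale for name, value in kwargs.items()}
    # MSAs are only consumed by the larger checkpoint, so the two options move together.
    kwargs["model_checkpoint"] = "esmfold2" if want_msa else "esmfold2-fast"
    kwargs["use_msa"] = want_msa
    return kwargs
```

Keeping model_checkpoint and use_msa in one place avoids the invalid combination of use_msa=True with esmfold2-fast.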

Toolkit Notes

These apply to every ESMFold2 tool in this toolkit (esmfold2-prediction).

Requires a GPU. ESMFold2 runs through a PyTorch backend and needs an NVIDIA GPU; CPU execution is not practical.
Shared biohub_esm environment. ESMFold2 is part of the consolidated Biohub/esm package and shares its standalone environment with the ESM3 and ESM C toolkits, so installing any one of them provisions the others.
AlphaFold3-style diffusion with optional MSAs. Predictions are stochastic, so set seed for reproducibility across runs. MSAs are only consumed by the esmfold2 checkpoint; the esmfold2-fast checkpoint is single-sequence by construction.
Structure prediction only. This toolkit provides ESMFold2’s structure prediction capability; the broader ESM family’s language-model, generation, and embedding capabilities are provided by the sibling ESM3 and ESM C toolkits.
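
The seed's role in caching, as described for the seed field, can be pictured with a small sketch: a seeded run gets a deterministic key, an unseeded run gets none. The hashing scheme and the `cache_key` helper are assumptions made for illustration; the toolkit's actual cache key format is not documented here.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def cache_key(tool: str, inputs: Dict[str, Any], config: Dict[str, Any]) -> Optional[str]:
    """Return a deterministic key, or None when the run is unseeded.

    Loosely mirrors the documented behavior: the seed is part of the key, and
    a missing seed means the result is not cached. The SHA-256 over sorted
    JSON is an illustrative choice, not the toolkit's scheme.
    """
    if config.get("seed") is None:
        return None  # stochastic run: skip the cache
    payload = json.dumps(
        {"tool": tool, "inputs": inputs, "config": config}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting keys before hashing makes the key independent of the order in which config options were written, so two identical seeded runs map to the same entry.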

Example notebook: See the full working example for a copy-paste-ready walkthrough.

Infrastructure Guides

The following guides cover how to run tools efficiently and at scale.

Tool Persistence

Keep a tool’s model warm across calls instead of reloading it every invocation.

Device Management

How GPUs are allocated to tools and how to target specific devices.

Parallel Execution

Fan a batch of inputs out across multiple GPUs.

Cloud Inference

Run tools on managed cloud infrastructure with no local setup.