AlphaFold3 - Proto

License: AlphaFold3 code is released under CC-BY-NC-SA-4.0, and the model weights are covered by a custom license (AlphaFold 3 Model Parameters Terms of Use). These licenses restrict commercial use and may require explicit attribution. Model weights are not publicly distributed and must be requested from the provider. Please refer to the code license and model weights license for full terms.

Proto is not affiliated with Google DeepMind. This toolkit is open source and builds on the implementation produced by this organization. Product names, logos, and trademarks are the property of their respective owners.

google-deepmind/alphafold3

AlphaFold 3 inference pipeline.

Accurate structure prediction of biomolecular interactions with AlphaFold 3

Josh Abramson, Jonas Adler, … Joshua Bambrick

Nature (2024)

@article{abramson2024alphafold3,
  title={Accurate structure prediction of biomolecular interactions with AlphaFold 3},
  author={Abramson, Josh and Adler, Jonas and Dunger, Jack and Evans, Richard and Green, Tim and Pritzel, Alexander and Ronneberger, Olaf and Willmore, Lindsay and Ballard, Andrew J and Bambrick, Joshua and others},
  journal={Nature},
  volume={630},
  number={8016},
  pages={493--500},
  year={2024},
  publisher={Nature Publishing Group},
  doi={10.1038/s41586-024-07487-w}
}

proto-bio/proto-tools/proto_tools/tools/structure_prediction/alphafold3

Function	Description
`run_alphafold3()`	Protein structure prediction using AlphaFold3 (GPU)

Background

AlphaFold3 (Abramson et al., 2024) predicts the joint 3D structure of a biomolecular assembly from the sequences and chemical components it contains. It extends AlphaFold2 beyond single proteins: one model folds complexes that mix proteins, DNA, RNA, and small-molecule ligands, and predicts how those parts are arranged relative to one another. As in AlphaFold2, each protein chain is paired with a multiple-sequence alignment (MSA) of related sequences, whose covariation patterns give the model an evolutionary signal for placing residues.

Internally, AlphaFold3 represents the assembly as a set of tokens: one per amino-acid residue or nucleotide, and one per atom for ligands and modified residues. It then learns a representation of every token and of every token pair. Where AlphaFold2 leaned on the large MSA-centric Evoformer, AlphaFold3 de-emphasizes the MSA, handling it in a separate preliminary module rather than iterating it through the deep trunk, and does most of its work in the "Pairformer", which iteratively refines the token and pair representations through geometry-inspired "triangle attention" updates. The final representations are then fed into a diffusion module that iteratively denoises all-atom coordinates starting from random noise.

Run from several random seeds, the model produces multiple candidate structures, and the highest-confidence candidate is returned as the final prediction. In addition, AlphaFold3 reports calibrated confidence metrics: the per-atom predicted local distance difference test (pLDDT) for local reliability, a predicted aligned error (PAE) for how well any two tokens are placed relative to each other, and predicted template-modeling (pTM) and interface predicted template-modeling (ipTM) scores for overall and interface accuracy.
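To make the tokenization rule above concrete, here is a minimal sketch that counts tokens for an assembly: one token per residue or nucleotide for polymers, and one token per atom for ligands. The `token_count` helper and the tuple layout are hypothetical and exist only for illustration; they are not part of AlphaFold3 or this toolkit, and modified residues (which are also tokenized per atom) are ignored for simplicity.

```python
# Illustrative only: a hypothetical helper, not an AlphaFold3 or toolkit API.
POLYMER_KINDS = ("protein", "dna", "rna")


def token_count(chains):
    """Count tokens for a list of (kind, value) pairs.

    For polymers, value is the sequence string (one token per residue
    or nucleotide). For ligands, value is the number of atoms (one
    token per atom). Modified residues are not handled in this sketch.
    """
    total = 0
    for kind, value in chains:
        if kind in POLYMER_KINDS:
            total += len(value)
        elif kind == "ligand":
            total += int(value)
        else:
            raise ValueError(f"unknown chain kind: {kind!r}")
    return total


# A made-up assembly: a 10-residue peptide, a 4-nucleotide DNA strand,
# and a ligand assumed to have 31 atoms.
chains = [("protein", "MKTAYIAKQR"), ("dna", "ACGT"), ("ligand", 31)]
n_tokens = token_count(chains)

# The model keeps a representation for every token pair, so the pair
# representation grows quadratically with the token count.
n_pairs = n_tokens * n_tokens
print(n_tokens, n_pairs)
```

Because the pair representation scales with the square of the token count, a ligand with many atoms can add more to the problem size than its molecular weight might suggest.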

Learning Resources

The Illustrated AlphaFold (by Elana Simon and Jake Silberg) - a visual, diagram-driven walkthrough of the AlphaFold3 architecture, from input preparation through representation learning to structure prediction.
AlphaFold 3 predicts the structure and interactions of all of life’s molecules (Google DeepMind and Isomorphic Labs) - the official announcement, with an accessible overview of what AlphaFold3 predicts and how it extends earlier models.

Tools

AlphaFold3 Structure Prediction (`alphafold3-prediction`)

Predicts the 3D structure of a biomolecular complex. Each input complex can combine protein, DNA, RNA, and ligand chains; the assembly is folded by AlphaFold3 and returned as a predicted Structure per complex with confidence metrics: per-residue pLDDT, pTM, interface pTM for multi-chain complexes, and predicted aligned error.

API Reference

Input: AlphaFold3Input

complexes

List[Complex]

required

List of complexes to predict structures for. Inherited from StructurePredictionInput. Each complex can contain one or more sequences of proteins, DNA, RNA, or ligands.

chains

List[Chain | Fragment]

required

Chains in the complex, in input order.

msas

array

Pre-computed MSAs, one entry per complex. Each entry is a ComplexMSAs (per-chain MSAs keyed by chain index); paired=True marks rows taxonomy-aligned across chains. Populated by preprocess() or supplied directly.

Config: AlphaFold3Config

name

string

default:"af3_job"

Name of the folding job. Default: "af3_job".

seeds

List[integer]

default:"[0]"

Seeds to use for AlphaFold3 when the common BaseConfig.seed field is unset. Default: [0]. Note: AlphaFold3 draws five diffusion samples per seed by default (see num_diffusion_samples), so a single seed is often sufficient. Harder docking tasks, such as antibody-antigen docking, typically need more seeds.

output_dir

string

Path prefix for the AlphaFold3 output directory. Appends _af3_results to the provided string. If None (default), uses a temporary directory that is automatically cleaned up after inference. If specified, creates a persistent directory at the given path that will NOT be automatically deleted. Default: None.

model_dir

string

Local path to the directory containing AlphaFold3 model parameters (a single .bin or .bin.zst file per DeepMind’s release layout). If None (default), weights are resolved from PROTO_ALPHAFOLD3_WEIGHTS_DIR, then PROTO_MODEL_CACHE, then PROTO_HOME/proto_model_cache/alphafold3/ (see notes/storage.md).

sif_path

string

Optional path to a pre-built AlphaFold3 Apptainer image (.sif). When set, the tool runs apptainer run against this image (which dispatches via the sif’s %runscript) instead of the in-env Python install. When None (default), inference.py looks for $VENV_PATH/alphafold3.sif (provisioned by setup.sh) and falls back to the env-based install if absent.

num_recycles

integer

default:"10"

Recycling iterations.

num_diffusion_samples

integer

default:"5"

Diffusion samples per seed; total candidates = len(seeds) * num_diffusion_samples.

verbose

integer

default:"0"

Whether to print status messages during execution. Inherited from StructurePredictionConfig. Default: 0 (no status messages).

device

string

default:"cuda"

Device to run the model on ("cuda", "cpu"). Inherited from StructurePredictionConfig. Default: "cuda".

timeout

integer

default:"600"

Maximum execution time in seconds. None waits indefinitely.

seed

integer

Random seed. When set, tools run reproducibly up to small GPU float noise (see BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.

include_pae_matrix

boolean

default:"False"

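Whether to include the full PAE matrix in the output metrics; the pae metric is populated only when this is True. Inherited. Default: False.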

use_msa

boolean

default:"True"

Whether to generate and use Multiple Sequence Alignments (MSAs) for protein chains using MMseqs2 homology search. Inherited from MSAStructurePredictionConfig. Default: True.

msa_search_config

Mmseqs2HomologySearchConfig

Configuration for MMseqs2 homology search (MSA generation). Only used when use_msa=True. Inherited from MSAStructurePredictionConfig. Default: None.

pair_heterocomplex_msas

boolean

default:"True"

Whether heterocomplex protein chains should use taxonomy-paired MSA generation. Inherited from MSAStructurePredictionConfig. Default: True.
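The weight lookup order described under model_dir can be sketched as a small resolver. This is a standalone illustration, not the toolkit's implementation: the `resolve_weights_dir` function is hypothetical, and whether PROTO_MODEL_CACHE is used as-is or with a per-model subdirectory is an assumption here (the sketch uses it as-is).

```python
import os
from pathlib import Path


def resolve_weights_dir(model_dir=None, env=None):
    """Return the directory to search for AlphaFold3 weights, or None.

    Hypothetical helper mirroring the documented order:
    explicit model_dir, then PROTO_ALPHAFOLD3_WEIGHTS_DIR,
    then PROTO_MODEL_CACHE, then PROTO_HOME/proto_model_cache/alphafold3/.
    """
    env = os.environ if env is None else env

    if model_dir is not None:
        return Path(model_dir)
    if env.get("PROTO_ALPHAFOLD3_WEIGHTS_DIR"):
        return Path(env["PROTO_ALPHAFOLD3_WEIGHTS_DIR"])
    if env.get("PROTO_MODEL_CACHE"):
        # Assumption: the cache path is used directly, without a subdirectory.
        return Path(env["PROTO_MODEL_CACHE"])
    if env.get("PROTO_HOME"):
        return Path(env["PROTO_HOME"]) / "proto_model_cache" / "alphafold3"
    return None


# Example with an explicit environment mapping (paths are made up).
example_env = {"PROTO_MODEL_CACHE": "/data/cache", "PROTO_HOME": "/home/user/.proto"}
print(resolve_weights_dir(env=example_env))
```

Passing the environment as a plain mapping keeps the sketch easy to test without touching the real process environment.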

Output: AlphaFold3Output

structures

List[Structure]

required

Predicted structures, each carrying an AlphaFold3Metrics instance on .metrics.

structure

string

required

Raw structure content in PDB or CIF format.

structure_format

string

Format of the content string (auto-detected if omitted).

b_factor_type

BFactorType

What the B-factor column represents.

source

string

Optional source identifier (filepath or tool name).

metrics

Metrics

Associated metrics (e.g., pLDDT, pTM scores, per-chain lists, pairwise matrices). None values are stripped at construction.

Metrics (one set per structures item)

Metric	Type	Range	Availability
`avg_plddt`	float	0.0 to 100.0	always
`avg_pae`	float	≥ 0.0	always
`pae`	list[list[float]]	≥ 0.0	when include_pae_matrix=True
`ptm`	float	0.0 to 1.0	depends on model output
`iptm`	float	0.0 to 1.0	depends on model output
`ranking_score`	float	unbounded	depends on model output
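Since some metrics in the table are always present and others depend on the model output, downstream code should tolerate missing keys. The sketch below screens predictions using plain dictionaries as stand-ins for the metrics objects; the `is_confident` helper and both thresholds are hypothetical choices for illustration, not recommendations from the toolkit.

```python
def is_confident(metrics, min_avg_plddt=70.0, min_iptm=0.6):
    """Screen one prediction by its metrics (a plain dict in this sketch).

    avg_plddt is documented as always available (0 to 100). iptm may be
    absent, for example for single-chain inputs, so it is only checked
    when present. Both thresholds are illustrative assumptions.
    """
    if metrics["avg_plddt"] < min_avg_plddt:
        return False
    iptm = metrics.get("iptm")
    if iptm is not None and iptm < min_iptm:
        return False
    return True


# Made-up metric values for three predicted complexes.
predictions = [
    {"avg_plddt": 84.5, "avg_pae": 4.1, "ptm": 0.86, "iptm": 0.81},
    {"avg_plddt": 78.0, "avg_pae": 9.3, "ptm": 0.70, "iptm": 0.35},
    {"avg_plddt": 91.2, "avg_pae": 2.7},  # single chain: no iptm reported
]
kept = [i for i, m in enumerate(predictions) if is_confident(m)]
print(kept)
```

Treating a missing ipTM as "not applicable" rather than as a failure keeps single-chain predictions from being discarded by an interface filter.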

Applications

This tool predicts the structure of multi-component assemblies such as protein-DNA and protein-RNA complexes or protein-ligand binding poses. Running it on a multi-chain complex also estimates how confidently the components are placed relative to each other through interface pTM and PAE, which is informative for assessing predicted interfaces.
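One common way to inspect an interface is to average the PAE entries that link tokens in two different chains. The following is a minimal sketch under two assumptions: the full matrix was requested with include_pae_matrix=True, and the caller already has a per-token list of chain labels (`token_chain_ids`, a hypothetical input not described in this reference). The `mean_interchain_pae` function is likewise illustrative.

```python
def mean_interchain_pae(pae, token_chain_ids, chain_a, chain_b):
    """Average PAE over all token pairs linking chain_a and chain_b.

    pae is a square list-of-lists matrix. PAE is not symmetric in
    general, so both directions (a to b and b to a) are included.
    """
    idx_a = [i for i, c in enumerate(token_chain_ids) if c == chain_a]
    idx_b = [i for i, c in enumerate(token_chain_ids) if c == chain_b]
    if not idx_a or not idx_b:
        raise ValueError("both chains must contain at least one token")

    values = []
    for i in idx_a:
        for j in idx_b:
            values.append(pae[i][j])
            values.append(pae[j][i])
    return sum(values) / len(values)


# Toy 3-token example: two tokens in chain A, one in chain B.
pae = [
    [0.0, 1.0, 8.0],
    [1.0, 0.0, 6.0],
    [7.0, 5.0, 0.0],
]
token_chain_ids = ["A", "A", "B"]
print(mean_interchain_pae(pae, token_chain_ids, "A", "B"))
```

Lower inter-chain PAE suggests the model is more confident about how the two chains sit relative to each other; reading it alongside ipTM gives a fuller picture than either number alone.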

Usage Tips

use_msa defaults to True. An MSA is then generated for each protein chain by a ColabFold (MMseqs2) search; set it to False to skip the search, or attach precomputed MSAs to the input.
Diffusion sampling is controlled by seeds and num_diffusion_samples. AlphaFold3 draws num_diffusion_samples (default 5) structures per seed and keeps the best by ranking score, so a single seed is often enough; the total number of candidates is len(seeds) times num_diffusion_samples.
num_recycles (default 10) trades accuracy for time. More recycling iterations refine the prediction but increase runtime.
Confidence is reported as pLDDT, pTM, ipTM, and PAE. Average pLDDT (0 to 100) is the primary per-structure quality metric; ipTM is populated only for multi-chain complexes.
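The sampling arithmetic in the tips above can be written out in a few lines. Both helpers are hypothetical illustrations rather than toolkit functions, and `effective_seeds` encodes one reading of the seeds description (that a set seed takes the place of the seeds list), which is an assumption.

```python
def effective_seeds(seed, seeds):
    """Pick the seeds that would be used for a run.

    Assumption: when the common seed field is set it replaces the
    seeds list; when it is None the seeds list is used as given.
    """
    return [seed] if seed is not None else list(seeds)


def total_candidates(seeds, num_diffusion_samples=5):
    """Total candidates = len(seeds) * num_diffusion_samples."""
    return len(seeds) * num_diffusion_samples


# Default configuration: one seed, five diffusion samples.
print(total_candidates(effective_seeds(None, [0])))

# A harder docking case with three seeds yields fifteen candidates.
print(total_candidates(effective_seeds(None, [0, 1, 2]), num_diffusion_samples=5))
```

Runtime grows roughly with the number of candidates, so adding seeds is usually worth it only when the default five samples disagree or score poorly.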

Toolkit Notes

These apply to every AlphaFold3 tool in this toolkit (alphafold3-prediction).

Requires a GPU. AlphaFold3 needs an NVIDIA GPU; CPU execution is not practical.
Model weights are gated. AlphaFold3 weights are not publicly distributed; access is restricted to non-commercial research and must be requested from Google DeepMind through their form, then made available to the tool before it can run.

Example notebook: See the full working example for a copy-paste-ready walkthrough.

Infrastructure Guides

The following guides cover how to run tools efficiently and at scale.

Tool Persistence

Keep a tool’s model warm across calls instead of reloading it every invocation.

Device Management

How GPUs are allocated to tools and how to target specific devices.

Parallel Execution

Fan a batch of inputs out across multiple GPUs.

Cloud Inference

Run tools on managed cloud infrastructure with no local setup.