PyMOL RMSD
License: PyMOL RMSD is licensed under Custom (Open-Source PyMOL Copyright Notice) and may require explicit attribution when utilized. Please refer to the license for full terms.

This toolkit is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


schrodinger/pymol-open-source
Open-source foundation of the user-sponsored PyMOL molecular visualization system.
1.7k stars
@misc{delano2002pymol,
  title={The {PyMOL} Molecular Graphics System},
  author={DeLano, Warren L.},
  year={2002},
  publisher={DeLano Scientific},
  address={San Carlos, CA, USA},
  url={http://www.pymol.org}
}
proto-bio/proto-tools/proto_tools/tools/structure_alignment/pymol_rmsd
Function                      Description
run_pymol_rmsd_alignment()    Pairwise structure RMSD alignment using PyMOL cealign or align.

Background

Root mean square deviation (RMSD) is the square root of the mean squared distance between corresponding atoms of two structures after optimal rigid-body superposition. It is the standard summary statistic for assessing how closely a model recapitulates a reference structure, comparing alternative poses of the same complex, and quantifying conformational change between two states of the same protein. Computing a meaningful RMSD requires a residue correspondence between the two structures, which an alignment routine establishes before the superposition is performed.

Open-Source PyMOL exposes two distinct alignment routines that this toolkit invokes. The cealign command implements the Combinatorial Extension (CE) algorithm (Shindyalov and Bourne, 1998), which builds a structural alignment from aligned fragment pairs based on local geometry rather than from a global sequence alignment. CE was developed to detect remote structural similarity below the sequence-similarity twilight zone, where the underlying sequences share too little identity for a meaningful sequence-based alignment. The align command, in contrast, first computes a sequence alignment using the BLOSUM62 substitution matrix, superposes the structures over the aligned residues, and then iterates several cycles of outlier rejection to remove residues with poor structural agreement. This routine is appropriate when the two structures share substantial sequence identity and the goal is a residue-matched superposition that excludes locally divergent regions.
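To make the RMSD definition concrete, the following minimal sketch computes it for two coordinate sets that are assumed to be already paired and superposed. It does not perform the alignment or the rigid-body fit that PyMOL carries out, and the function name and sample coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the points are already paired atom-by-atom and superposed;
    no alignment or fitting is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    # Sum of squared Euclidean distances over all atom pairs.
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Illustrative coordinates: every atom is displaced by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(reference, model))  # prints 1.0
```

Because every pair is exactly one unit apart, the mean squared distance is 1 and so is its square root.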

Learning Resources

  • schrodinger/pymol-open-source (Schrödinger, LLC). The official Open-Source PyMOL repository and the source of the cealign and align commands invoked by this toolkit.
  • PyMOL Wiki (community-maintained). Reference documentation for the align and cealign commands and the broader PyMOL scripting interface.

Tools

PyMOL RMSD Alignment (pymol-rmsd-alignment)

Aligns two Structure inputs with Open-Source PyMOL and returns the post-alignment RMSD together with method-specific alignment statistics. The method configuration field selects between the CE-based cealign and the sequence-aware align routine.

API Reference

Inputs
  • target_structure (Structure, required): Target/reference structure.
  • mobile_structure (Structure, required): Mobile/query structure to align against the target.
Configuration
  • method (enum, default: "cealign"): PyMOL alignment routine to use. Available options: cealign, align.
  • target_selection (string, default: "target"): PyMOL selection for the target/reference structure.
  • mobile_selection (string, default: "mobile"): PyMOL selection for the mobile/query structure.
  • failure_rmsd (number, default: 999.0): RMSD returned when PyMOL cannot align the structures.
  • verbose (integer, default: 0): Verbosity level (0=quiet, 1=info, 2=debug, 3=raw subprocess stderr). True is coerced to 1 and False to 0.
  • device (string, default: "cpu"): Device to run the tool on.
  • timeout (integer, default: 600): Maximum execution time in seconds. None waits indefinitely.
  • seed (integer): Random seed. When set, tools run reproducibly up to small GPU float noise (see BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.
Outputs
  • method (enum, required): PyMOL alignment method used.
  • metrics (PyMOLRMSDMetrics): RMSD alignment metrics.
Metrics
Metric                        Type   Range      Availability
rmsd                          float  ≥ 0.0      always
aligned_length                int    ≥ 0.0      cealign only
aligned_atoms                 int    ≥ 0.0      align only
alignment_cycles              int    ≥ 0.0      align only
alignment_score               float  unbounded  align only
pre_refinement_rmsd           float  ≥ 0.0      align only
pre_refinement_aligned_atoms  int    ≥ 0.0      align only
aligned_residues              int    ≥ 0.0      align only
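
The method-specific fields in the metrics table line up with what PyMOL's own commands return. cmd.align returns a seven-item sequence (RMSD after refinement, aligned atoms after refinement, refinement cycles, RMSD before refinement, aligned atoms before refinement, raw alignment score, residues aligned), and cmd.cealign returns a dictionary that includes RMSD and alignment_length keys. The sketch below shows one way those raw values could be mapped onto the field names in the table. The helper names and the mapping are assumptions for illustration, not this toolkit's implementation, and the sample numbers are made up.

```python
from typing import Any, Dict, Mapping, Sequence

# Field names copied from the metrics table; fields that do not apply stay None.
ALL_FIELDS = (
    "rmsd",
    "aligned_length",
    "aligned_atoms",
    "alignment_cycles",
    "alignment_score",
    "pre_refinement_rmsd",
    "pre_refinement_aligned_atoms",
    "aligned_residues",
)

# Order of the seven values returned by PyMOL's cmd.align.
ALIGN_ORDER = (
    "rmsd",
    "aligned_atoms",
    "alignment_cycles",
    "pre_refinement_rmsd",
    "pre_refinement_aligned_atoms",
    "alignment_score",
    "aligned_residues",
)


def metrics_from_align(result: Sequence[Any]) -> Dict[str, Any]:
    """Map a cmd.align-style 7-item result onto the table's field names (hypothetical helper)."""
    if len(result) != len(ALIGN_ORDER):
        raise ValueError("expected 7 values from align")
    metrics = dict.fromkeys(ALL_FIELDS)  # every field starts as None
    metrics.update(zip(ALIGN_ORDER, result))
    return metrics


def metrics_from_cealign(result: Mapping[str, Any]) -> Dict[str, Any]:
    """Map a cmd.cealign-style dict onto the table's field names (hypothetical helper)."""
    metrics = dict.fromkeys(ALL_FIELDS)
    metrics["rmsd"] = result["RMSD"]
    metrics["aligned_length"] = result["alignment_length"]
    return metrics


# Made-up values standing in for real PyMOL output.
raw_align = (0.82, 512, 3, 1.47, 560, 310.5, 140)
raw_cealign = {"RMSD": 2.1, "alignment_length": 96}

align_metrics = metrics_from_align(raw_align)
cealign_metrics = metrics_from_cealign(raw_cealign)
print(align_metrics)
print(cealign_metrics)
```

Note that in PyMOL itself the two commands take their arguments in opposite order: cmd.align(mobile, target) versus cmd.cealign(target, mobile).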

Applications

This tool is appropriate for any analysis that needs a pairwise structural superposition of two proteins. Representative applications include scoring designed structures against a reference template, quantifying conformational drift across molecular dynamics snapshots, comparing predicted poses against experimental structures, and evaluating backbone or side-chain changes introduced by mutation.

Usage Tips

  • method selects the alignment routine and should match the expected sequence relationship between the two inputs. The default cealign runs the Combinatorial Extension algorithm and is appropriate for proteins with low sequence similarity. align performs a sequence alignment followed by structural superposition and iterative outlier rejection, and is more appropriate when the two inputs share substantial sequence identity.
  • cealign and align populate different metric fields. A cealign call returns rmsd and aligned_length (the length of the CE alignment in CA atoms, equivalent to aligned residues since cealign operates only on CA atoms). An align call returns rmsd (after refinement), aligned_atoms, alignment_cycles, alignment_score, pre_refinement_rmsd, pre_refinement_aligned_atoms, and aligned_residues. Metric fields that do not apply to the selected method are returned as None.
  • target_selection and mobile_selection accept arbitrary PyMOL selection strings. The defaults select the full target and mobile objects. Pass a refined selection such as "target and name CA" or "target and chain A" to restrict the alignment to a specific residue subset, chain, or atom set. Selection syntax follows the standard PyMOL grammar documented on the PyMOL Wiki.
  • A failed alignment returns failure_rmsd rather than raising an exception. When PyMOL cannot align the two structures (for example, when the structures are too dissimilar for CE to converge), the call returns the configured failure_rmsd value (default 999.0) and attaches the underlying error message to the result metadata as alignment_error. Because the default sentinel is far larger than any plausible RMSD, calling code can tell a failed alignment apart from a genuine result, including a near-zero RMSD between two essentially identical structures.
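
Since failures come back as a sentinel RMSD rather than an exception, downstream statistics need to filter them out explicitly. The following is a minimal sketch of that pattern on plain floats. The helper names are hypothetical, and it assumes the caller knows which failure_rmsd value was configured (999.0 by default).

```python
import math
from typing import Iterable, Optional

FAILURE_RMSD = 999.0  # documented default for the failure_rmsd config field


def is_failed(rmsd: float, failure_rmsd: float = FAILURE_RMSD) -> bool:
    """Return True when an RMSD equals the configured failure sentinel."""
    return math.isclose(rmsd, failure_rmsd)


def mean_successful_rmsd(
    values: Iterable[float], failure_rmsd: float = FAILURE_RMSD
) -> Optional[float]:
    """Average RMSD over successful alignments only; None if every alignment failed."""
    ok = [v for v in values if not is_failed(v, failure_rmsd)]
    return sum(ok) / len(ok) if ok else None


# Illustrative values: two successful alignments and one failure.
rmsds = [0.5, 1.5, 999.0]
print(mean_successful_rmsd(rmsds))  # prints 1.0
```

Without the filter, the single failed alignment would pull the naive mean of this batch above 300.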

Toolkit Notes

These notes apply to every PyMOL RMSD tool in this toolkit (pymol-rmsd-alignment).
  • Outputs are returned as typed metric objects. Each PyMOLRMSDMetrics result carries the post-alignment rmsd together with the method-specific metrics described under Usage Tips. The headline primary_metric is rmsd, and results can be exported to JSON through the standard export method.
  • Inputs accept a Structure object, a file path, or raw PDB or mmCIF content. Each input is normalised to a Structure before scoring, and the corresponding PDB text is passed to PyMOL through a temporary file.
Example notebook: See the full working example for a copy-paste-ready walkthrough.

Infrastructure Guides

The following guides cover how to run tools efficiently and at scale.

  • Tool Persistence: Keep a tool’s model warm across calls instead of reloading it every invocation.
  • Device Management: How GPUs are allocated to tools and how to target specific devices.
  • Parallel Execution: Fan a batch of inputs out across multiple GPUs.
  • Cloud Inference: Run tools on managed cloud infrastructure with no local setup.