MCMCOptimizer proposes mutations while four constraints score each candidate by folding it
with ESMFold and measuring confidence, symmetry, and globularity.
This walkthrough designs a three-protomer assembly over 5,000 optimization steps. It requires a GPU for ESMFold.
Runtime: this walkthrough runs real models on a GPU and takes several minutes to complete. The first run is slower because it builds the tool environment and downloads model weights.
Protomers
Each symmetric unit is one protein Segment, wrapped in its own Construct. Here N_UNITS = 3
protomers of MONOMER_LENGTH = 50 residues each form the assembly. Passing length=50 with no
starting sequence leaves every position open for the optimizer to fill, and sequence_type="protein"
restricts those positions to the standard amino acids. Each segment carries a distinct label
(protomer_1, protomer_2, protomer_3), the key its per-segment results are filed under, and
each Construct is labeled to match.
A symmetry-aware generator
The generator proposes the mutations the optimizer scores. RandomProteinGenerator substitutes
random amino acids at masked positions; MaskingStrategy(num_mutations=1) masks exactly one
position per call, so each step is a single point mutation. Because the segments start empty, the
first sample() call instead fills each protomer with a fully random sequence. Passing the whole
list of protomers to generator.assign(protomers) binds one generator across every unit at once,
so the same mutation is applied to all protomers and the design stays symmetric.
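The idea of one shared point mutation can be sketched in plain Python without the design library. This is a minimal illustration, not the library's implementation: symmetric_point_mutation and AMINO_ACIDS are hypothetical names, and excluding the current residue from the draw is an assumption made so that every proposal is a real change.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def symmetric_point_mutation(protomers, rng):
    """Apply one identical substitution to every protomer (hypothetical helper)."""
    position = rng.randrange(len(protomers[0]))
    current = protomers[0][position]
    # Assumption: draw a residue that differs from the current one.
    new_residue = rng.choice([aa for aa in AMINO_ACIDS if aa != current])
    return [seq[:position] + new_residue + seq[position + 1:] for seq in protomers]


rng = random.Random(0)
# The first sample fills the empty protomers with one shared random sequence.
start = "".join(rng.choice(AMINO_ACIDS) for _ in range(50))
protomers = [start] * 3
mutated = symmetric_point_mutation(protomers, rng)
```

Because the same position and residue are used for every unit, the three sequences in mutated remain identical to each other and differ from start at exactly one position.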
Structure and symmetry constraints
Each Constraint lists the protomers it reads in inputs; passing all three folds them together
as one complex. The assembly is scored on four objectives, all folded with ESMFold on the GPU
(structure_tool="esmfold"). structure_plddt_constraint reads per-residue confidence (pLDDT)
and structure_ptm_constraint reads the global fold confidence (pTM); both return 1 - metric,
so lower scores mean a more confident structure. protein_symmetry_ring_constraint scores how
evenly the protomers are spaced around the ring, and all_to_all_protomer_symmetry=True measures
distances between every pair of protomers rather than only adjacent ones. protein_globularity_constraint
rewards a compact, spherical fold. Each constraint’s weight multiplies its raw score before the
optimizer combines them; globularity carries weight=5 while the other three use weight=1.
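How the weights shape the total can be sketched as follows. The text says only that each weight multiplies its raw score before the scores are combined; summing the weighted scores is an assumption here, weighted_total is a hypothetical helper, and the metric values are made-up placeholders rather than results from a run.

```python
def weighted_total(raw_scores, weights):
    """Sum each raw score times its weight (assumed combination rule)."""
    return sum(weights[name] * score for name, score in raw_scores.items())


# Placeholder metrics for illustration only; lower raw scores are better.
raw_scores = {
    "plddt": 1 - 0.80,      # confidence constraints return 1 - metric
    "ptm": 1 - 0.60,
    "symmetry": 0.30,       # hypothetical raw symmetry score
    "globularity": 0.10,    # hypothetical raw globularity score
}
weights = {"plddt": 1, "ptm": 1, "symmetry": 1, "globularity": 5}

total = weighted_total(raw_scores, weights)
```

With weight=5, a given change in the globularity score moves the total five times as far as the same change in any of the other three scores.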
Run the optimization
The MCMCOptimizer ties the constructs, generator, and constraints together and runs
Metropolis-Hastings: at each step it proposes a mutation, scores it, and accepts or rejects,
always keeping improvements and accepting worse proposals with a probability that falls as the
temperature anneals from max_temperature=1.0 toward min_temperature=0.0001 over num_steps=5000
steps. The optional custom_logging callable receives the step number and current segments; here
track records each tracked protomer sequence and its pLDDT so the trajectory can be inspected
afterward. The Program runs the optimizer and collects the result; the score improves as the
protomers are mutated toward a confident, symmetric fold.
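The accept-or-reject rule can be sketched in a few lines. This is an illustration under stated assumptions, not the optimizer's code: the text gives only the two temperature endpoints, so the geometric schedule is a guess at its shape, and temperature and accept are hypothetical names. The acceptance test is the standard Metropolis criterion for a score that is being minimized.

```python
import math
import random


def temperature(step, num_steps, t_max=1.0, t_min=0.0001):
    """Anneal from t_max to t_min; the geometric shape is an assumption."""
    fraction = step / max(num_steps - 1, 1)
    return t_max * (t_min / t_max) ** fraction


def accept(current_score, proposed_score, temp, rng):
    """Metropolis criterion for minimization."""
    if proposed_score <= current_score:
        return True  # improvements are always kept
    # Worse proposals pass with probability exp(-increase / temperature).
    return rng.random() < math.exp(-(proposed_score - current_score) / temp)


rng = random.Random(0)
num_steps = 5000
hot = temperature(0, num_steps)               # start of the run
cold = temperature(num_steps - 1, num_steps)  # end of the run
```

Early in the run a proposal that raises the score by 0.5 is accepted with probability exp(-0.5), roughly 0.61. At the final temperature the same proposal has probability exp(-5000), which is effectively zero, so the search ends as a near-greedy descent.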
Inspect the result
The optimized design is read back from the first protomer’s result_sequences[0], where the
constraint diagnostics are stored under metadata["constraints"] keyed by each constraint’s
label. representative thins the recorded trajectory down to a few evenly spaced steps to show
how the fold confidence climbed over the run. The final block prints the optimized protomer
sequence alongside its avg_plddt (from the plddt constraint) and ptm (from the ptm
constraint), the two fold-confidence metrics for the designed assembly.
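Thinning a long trajectory to a handful of evenly spaced entries might look like the sketch below. The walkthrough's own representative helper is not shown, so thin_evenly is a hypothetical stand-in, and the record layout (a step number paired with a placeholder pLDDT of None) is assumed purely for illustration.

```python
def thin_evenly(records, count=5):
    """Return up to `count` evenly spaced records, keeping the first and last."""
    if count <= 0:
        return []
    if len(records) <= count:
        return list(records)
    if count == 1:
        return [records[0]]
    last = len(records) - 1
    # Integer arithmetic avoids rounding surprises at half-way indices.
    indices = [i * last // (count - 1) for i in range(count)]
    return [records[i] for i in indices]


# Assumed layout: one (step, plddt) record per optimizer step.
trajectory = [(step, None) for step in range(5000)]
sampled_steps = [step for step, _ in thin_evenly(trajectory, count=5)]
```

Keeping both endpoints means the thinned view always shows the starting and final states, which is what a confidence-over-time summary needs.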
Next Steps
Protein Hunter
Structure-based protein design by cycling.
Using Constraints
The structure constraints used here.