This program designs a cyclically symmetric protein assembly: several identical protomers whose sequences are optimized together so that the complex folds into a well-formed, symmetric ring. An MCMCOptimizer proposes mutations, and four constraints score each candidate by folding it with ESMFold and measuring confidence, symmetry, and globularity. The example designs a three-protomer assembly over 5,000 optimization steps and requires a GPU for ESMFold.
Runtime: this walkthrough runs real models on a GPU and takes several minutes to complete. The first run is slower because it builds the tool environment and downloads model weights.

Protomers

Each symmetric unit is one protein Segment, wrapped in its own Construct. Here the assembly consists of N_UNITS = 3 protomers of MONOMER_LENGTH = 50 residues each. Passing length=MONOMER_LENGTH with no starting sequence leaves every position open for the optimizer to fill, and sequence_type="protein" restricts those positions to the standard amino acids. Each segment carries a distinct label (protomer_1, protomer_2, protomer_3), which is the key its per-segment results are filed under, and each Construct is labeled to match.
python
from proto_language.core import Segment, Construct

MONOMER_LENGTH = 50
N_UNITS = 3

protomers = [
    Segment(length=MONOMER_LENGTH, sequence_type="protein", label=f"protomer_{i + 1}")
    for i in range(N_UNITS)
]
constructs = [Construct([seg], label=seg.label) for seg in protomers]

A symmetry-aware generator

The generator proposes the mutations the optimizer scores. RandomProteinGenerator substitutes random amino acids at masked positions; MaskingStrategy(num_mutations=1) masks exactly one position per call, so each step is a single point mutation. Because the segments start empty, the first sample() call instead fills each protomer with a fully random sequence. Passing the whole list of protomers to generator.assign(protomers) binds one generator across every unit at once, so the same mutation is applied to all protomers and the design stays symmetric.
python
from proto_language.generator import (
    MaskingStrategy,
    RandomProteinGenerator,
    RandomProteinGeneratorConfig,
)

generator = RandomProteinGenerator(
    RandomProteinGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=1))
)
generator.assign(protomers)
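
To make the symmetry argument concrete, here is a minimal standalone sketch of a single point mutation applied identically to every protomer. It does not use proto_language: `mutate_symmetric` and `AMINO_ACIDS` are hypothetical names introduced for illustration, and forcing the new residue to differ from the old one is an assumption of this sketch, not documented behavior of RandomProteinGenerator.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate_symmetric(protomers, rng):
    """Apply one identical point mutation to every protomer (hypothetical helper)."""
    # Pick a single position shared by all protomers.
    position = rng.randrange(len(protomers[0]))
    current = protomers[0][position]
    # Assumption of this sketch: the replacement always differs from the current residue.
    new_residue = rng.choice([aa for aa in AMINO_ACIDS if aa != current])
    return [seq[:position] + new_residue + seq[position + 1:] for seq in protomers]


rng = random.Random(0)
start = "".join(rng.choice(AMINO_ACIDS) for _ in range(50))
protomer_seqs = [start] * 3  # three identical 50-residue protomers

mutated = mutate_symmetric(protomer_seqs, rng)
n_changed = sum(a != b for a, b in zip(start, mutated[0]))
print(len(set(mutated)), n_changed)  # 1 distinct sequence, 1 changed position
```

Because every protomer receives the same edit, the set of sequences never grows beyond one member, which is the property the shared generator is meant to preserve.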

Structure and symmetry constraints

Each Constraint lists the protomers it reads in inputs; passing all three folds them together as one complex. The assembly is scored on four objectives, all computed from the structure that ESMFold predicts on the GPU; the two confidence constraints select the folding tool explicitly with structure_tool="esmfold". structure_plddt_constraint reads per-residue confidence (pLDDT) and structure_ptm_constraint reads the global fold confidence (pTM); both return 1 - metric, so lower scores mean a more confident structure. protein_symmetry_ring_constraint scores how evenly the protomers are spaced around the ring, and all_to_all_protomer_symmetry=True measures distances between every pair of protomers rather than only adjacent ones. protein_globularity_constraint rewards a compact, spherical fold. Each constraint’s weight multiplies its raw score before the optimizer combines them; globularity carries weight=5 while the other three use weight=1.
python
from proto_language.core import Constraint
from proto_language.constraint import (
    protein_globularity_constraint,
    protein_symmetry_ring_constraint,
    structure_plddt_constraint,
    structure_ptm_constraint,
)

complex_inputs = protomers

plddt = Constraint(inputs=complex_inputs, function=structure_plddt_constraint,
                   function_config={"structure_tool": "esmfold"}, weight=1, label="plddt")
ptm = Constraint(inputs=complex_inputs, function=structure_ptm_constraint,
                 function_config={"structure_tool": "esmfold"}, weight=1, label="ptm")
symmetry = Constraint(inputs=complex_inputs, function=protein_symmetry_ring_constraint,
                      function_config={"all_to_all_protomer_symmetry": True}, weight=1, label="symmetry")
globularity = Constraint(inputs=complex_inputs, function=protein_globularity_constraint,
                         function_config={}, weight=5, label="globularity")
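
The scoring ideas above can be sketched without the library. The block below is an illustration under stated assumptions, not the formulas proto_language uses: `symmetry_score` (spread of centroid-to-centroid distances) and `weighted_total` (a plain weighted sum) are hypothetical helpers, and the symmetry and globularity raw values are made-up numbers chosen only to show the arithmetic.

```python
import math


def pairwise_distances(centroids, all_to_all):
    """Distances between protomer centroids, for every pair or ring neighbours only."""
    n = len(centroids)
    if all_to_all:
        pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    else:
        pairs = [(i, (i + 1) % n) for i in range(n)]
    return [math.dist(centroids[i], centroids[j]) for i, j in pairs]


def symmetry_score(centroids, all_to_all=True):
    """Standard deviation of the distances; 0 means perfectly even spacing."""
    d = pairwise_distances(centroids, all_to_all)
    mean = sum(d) / len(d)
    return math.sqrt(sum((x - mean) ** 2 for x in d) / len(d))


def weighted_total(raw_scores, weights):
    """Assumed combination rule: sum of weight * raw score."""
    return sum(weights[name] * raw_scores[name] for name in raw_scores)


# Three centroids on an equilateral triangle: an ideal three-protomer ring.
h = math.sqrt(3) / 2
ring = [(1.0, 0.0, 0.0), (-0.5, h, 0.0), (-0.5, -h, 0.0)]
ring_score = symmetry_score(ring)

# Confidence constraints return 1 - metric; the other two values are illustrative.
raw = {"plddt": 1 - 0.68, "ptm": 1 - 0.67, "symmetry": 0.10, "globularity": 0.20}
weights = {"plddt": 1, "ptm": 1, "symmetry": 1, "globularity": 5}
total = weighted_total(raw, weights)
print(f"ring symmetry score: {ring_score:.3f}, weighted total: {total:.2f}")
```

In this sketch the all-to-all and adjacent-only modes coincide for three protomers, since every pair is already adjacent; they only diverge from four protomers upward.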

Run the optimization

The MCMCOptimizer ties the constructs, generator, and constraints together and runs Metropolis-Hastings. At each step it proposes a mutation, scores it, and accepts or rejects it: improvements are always kept, and worse proposals are accepted with a probability that falls as the temperature anneals from max_temperature=1.0 toward min_temperature=0.0001 over num_steps=5000 steps. The optional custom_logging callable receives the step number and the current segments; here track records the first protomer’s proposal sequence and its average pLDDT so the trajectory can be inspected afterward. The Program runs the optimizer and collects the result; the score should improve as the protomers are mutated toward a confident, symmetric fold.
python
from proto_language.core import Program
from proto_language.optimizer import MCMCOptimizer, MCMCOptimizerConfig

# Record the protomer sequence and its fold confidence at each tracked step.
trajectory = []


def track(step, segments):
    seq = segments[0].proposal_sequences[0]
    pl = seq.metadata.get("constraints", {}).get("plddt", {}).get("data", {}).get("avg_plddt")
    trajectory.append((step, str(seq.sequence), pl))


optimizer = MCMCOptimizer(
    constructs=constructs,
    generators=[generator],
    constraints=[plddt, ptm, symmetry, globularity],
    config=MCMCOptimizerConfig(num_steps=5000, max_temperature=1.0, min_temperature=0.0001),
    custom_logging=track,
)

program = Program(optimizers=[optimizer], num_results=1)
program.run()
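
The accept/reject rule described above can be sketched in a few lines of plain Python. This is an illustration, not MCMCOptimizer's implementation: the geometric annealing schedule in `temperature` is an assumption (the source gives only the endpoints), and `accept` is the standard Metropolis criterion for a score that is being minimized.

```python
import math
import random


def temperature(step, num_steps, t_max=1.0, t_min=0.0001):
    """Assumed schedule: geometric interpolation from t_max (first step) to t_min (last)."""
    frac = step / max(num_steps - 1, 1)
    return t_max * (t_min / t_max) ** frac


def accept(current_score, proposed_score, temp, rng):
    """Metropolis criterion: keep improvements, sometimes keep worse proposals."""
    delta = proposed_score - current_score
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temp)


rng = random.Random(0)
t_first = temperature(0, 5000)
t_last = temperature(4999, 5000)

# Count how often a proposal that is worse by 0.1 gets accepted at each extreme.
hot = sum(accept(1.0, 1.1, t_first, rng) for _ in range(1000))
cold = sum(accept(1.0, 1.1, t_last, rng) for _ in range(1000))
print(f"T first={t_first}, T last={t_last:.4f}, accepted hot={hot}, cold={cold}")
```

Early in the run the sampler accepts most slightly worse proposals, which lets it escape poor local optima; by the final steps it behaves almost like a greedy search.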

Inspect the result

The optimized design is read back from the first protomer’s result_sequences[0], where the constraint diagnostics are stored under metadata["constraints"] keyed by each constraint’s label. representative thins the recorded trajectory down to a few evenly spaced steps to show how the fold confidence changed over the run. Because track logs proposal sequences, the last trajectory row is the final proposal and can differ from the returned result in both sequence and pLDDT. The final block prints the optimized protomer sequence alongside its avg_plddt (from the plddt constraint) and ptm (from the ptm constraint), the two fold-confidence metrics for the designed assembly.
python
best = program.constructs[0].segments[0].result_sequences[0]
scores = best.metadata.get("constraints", {})

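def representative(traj, n=4):
    """Return up to n evenly spaced entries, always including the first and last."""
    if n < 2:
        return traj[:1]  # avoid dividing by zero in the spacing formula below
    if len(traj) <= n:
        return traj
    idx = sorted({round(i * (len(traj) - 1) / (n - 1)) for i in range(n)})
    return [traj[i] for i in idx]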

print("trajectory (fold confidence improves as the protomers are optimized):")
for step, seq, pl in representative(trajectory):
    pl_str = f"{pl:.2f}" if pl is not None else "  -  "
    print(f"  step {step:4d} | pLDDT {pl_str} | {seq}")

print(f"\nprotomer sequence: {best.sequence}")
print(f"pLDDT: {scores.get('plddt', {}).get('data', {}).get('avg_plddt')}")
print(f"pTM:   {scores.get('ptm', {}).get('data', {}).get('ptm')}")
trajectory (fold confidence improves as the protomers are optimized):
  step    1 | pLDDT 0.20 | QEQGSKCSNINSQDDHTKPWITFASAAQWRIDIHHHWWMKFYQVITTHYI
  step 1667 | pLDDT 0.25 | MSFHIFFQSEGLDLLIIWVAPLSPWIRALYLSNVKTEDWRCGNVWAVMHE
  step 3334 | pLDDT 0.60 | MYFELFFQDSDRDLVHHWNYNWSPWHPDVTLNNVSTDNWDCGPNWAFMHD
  step 5000 | pLDDT 0.56 | MYFELFFQDTDRDLVHHWMYNWSPWHPDVTLNNVSTDNWDGGPNWAFMHD

protomer sequence: MYFELFFQDTDRDLVHHWMYNWSPWHPDVTLNNVSTDNWDCGPNWAFMHD
pLDDT: 0.6816216111183167
pTM:   0.6676322817802429
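
The step numbers in the trajectory table follow directly from the index formula in representative. The sketch below reproduces that arithmetic on its own, assuming the trajectory holds one entry per step with steps numbered from 1 (so 5000 entries in total); `representative_indices` is a hypothetical helper that isolates the index calculation.

```python
def representative_indices(length, n=4):
    """Evenly spaced indices into a list of the given length, endpoints included."""
    return sorted({round(i * (length - 1) / (n - 1)) for i in range(n)})


indices = representative_indices(5000)
steps = [i + 1 for i in indices]  # assumed: entry i was logged at step i + 1
print(steps)  # [1, 1667, 3334, 5000]
```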

Next Steps

Protein Hunter

Structure-based protein design by cycling.

Using Constraints

The structure constraints used here.