Programs

While individual optimizers run a single search strategy, a Program chains multiple optimizers into a multi-stage pipeline: broad exploration followed by targeted refinement, cheap filters before expensive scoring, temperature annealing across stages. A Program runs its optimizers sequentially, automatically handling the handoff of results between stages.

Single vs Multi-Stage

Single Stage

For simple designs, wrap one optimizer in a Program:

python

from proto_language.core import Segment, Construct, Constraint, Program
from proto_language.optimizer import MCMCOptimizer, MCMCOptimizerConfig
from proto_language.generator import (
    RandomNucleotideGenerator, RandomNucleotideGeneratorConfig
)
from proto_language.constraint import gc_content_constraint

# Setup
segment = Segment(length=100, sequence_type="dna")
construct = Construct([segment])

generator = RandomNucleotideGenerator(
    RandomNucleotideGeneratorConfig()
)
generator.assign(segment)

constraint = Constraint(
    inputs=[segment],
    function=gc_content_constraint,
    function_config={"min_gc": 45, "max_gc": 55},
)

# Single optimizer
optimizer = MCMCOptimizer(
    constructs=[construct],
    generators=[generator],
    constraints=[constraint],
    config=MCMCOptimizerConfig(num_steps=500, num_results=5, proposals_per_result=10),
)

program = Program(optimizers=[optimizer], num_results=5)
program.run()

Multi-Stage

Chain optimizers for coarse-to-fine optimization:

python

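from proto_language.optimizer import (
    RejectionSamplingOptimizer, RejectionSamplingOptimizerConfig,
    MCMCOptimizer, MCMCOptimizerConfig,
)

# Reuses segment, construct, and the imports from the single-stage example.
# MaskingStrategy, ESM2Generator, ESM2GeneratorConfig, and the constraint objects
# (gc_constraint, gc_constraint_2, structure_constraint) are assumed to be
# imported or defined beforehand.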

# Stage 1: Broad exploration with cheap constraints
gen1 = RandomNucleotideGenerator(
    RandomNucleotideGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=10))
)
gen1.assign(segment)

opt1 = RejectionSamplingOptimizer(
    constructs=[construct],
    generators=[gen1],
    constraints=[gc_constraint],
    config=RejectionSamplingOptimizerConfig(num_samples=5000, num_results=20),
)

# Stage 2: Refinement with expensive structure prediction
gen2 = ESM2Generator(ESM2GeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=3)))
gen2.assign(segment)

opt2 = MCMCOptimizer(
    constructs=[construct],  # Same construct object!
    generators=[gen2],
    constraints=[gc_constraint_2, structure_constraint],
    config=MCMCOptimizerConfig(num_steps=200, num_results=5, proposals_per_result=5),
)

# num_results=5 sets the program-level default for all optimizers.
# opt1 overrides it with config.num_results=20; opt2 sets config.num_results=5,
# which matches the program default.
program = Program(optimizers=[opt1, opt2], num_results=5)
program.run()

The Handoff

When one optimizer finishes and the next begins, the Program performs a carefully orchestrated handoff:

After each optimizer completes:

Optimizers are responsible for their own ordering. Rejection Sampling keeps result_sequences sorted by energy (best first) throughout its run. Other optimizers preserve their natural ordering.

Before the next optimizer runs:

_initialize_sequence_pools() reads from the previous optimizer’s result_sequences
Both pools are filled by cycling through the source sequences (preserving diversity when sizes differ)
Stale constraint metadata is cleared so the new stage starts with a clean slate
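The cycling fill described above can be pictured with a small standalone sketch. This is not the library's _initialize_sequence_pools() implementation; fill_pool is a hypothetical helper, and plain strings stand in for sequence objects. It only shows how cycling through the source keeps every previous result represented when the next stage needs more (or fewer) slots than the previous stage produced.

```python
from itertools import cycle, islice


def fill_pool(source, pool_size):
    """Fill `pool_size` slots by cycling through `source` in order.

    Hypothetical helper for illustration; not part of proto_language.
    """
    if not source:
        raise ValueError("source must contain at least one sequence")
    return list(islice(cycle(source), pool_size))


previous_results = ["ACGT", "GGCC", "TTAA"]  # e.g. three results from stage 1

# Next stage needs more slots than there are results: wrap around.
print(fill_pool(previous_results, 5))  # ['ACGT', 'GGCC', 'TTAA', 'ACGT', 'GGCC']

# Next stage needs fewer slots: take the leading results.
print(fill_pool(previous_results, 2))  # ['ACGT', 'GGCC']
```

Because the source is traversed in order, a best-first list (as Rejection Sampling maintains) places its best results in the first slots.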

Optimizer-Specific Behavior

Not all optimizers use inherited state the same way:

| Optimizer | How It Uses Previous Results |
| --- | --- |
| Rejection Sampling | Uses as starting proposals, then generates more and keeps overall best |
| MCMC | Uses as parallel trajectories, generates proposals from each |
| Cycling | Uses as working proposals for conditioning cycles |
| BeamSearch | Ignores previous results. Always starts fresh from its `prompt` parameter |

BeamSearch ignores previous optimizer results by design. It always starts fresh from its configured prompt since it is built for autoregressive generation. Place it as the first stage in a pipeline, or use it standalone.

Pipeline Design Recipes

The snippets below are illustrative patterns. They assume the segment, construct, generators, and the named constraint objects (for example gc_constraint, structure_constraint, expression_constraint) have already been defined as shown in the earlier examples and the Constraints guide.

Exploration then Refinement

Rejection Sampling (broad) then MCMC (focused). Use Rejection Sampling to quickly sample thousands of proposals with cheap constraints, then hand the best ones to MCMC for detailed optimization with expensive constraints. Most common multi-stage pattern.

Progressive Constraints

MCMC (basic) then MCMC (+ structure) then MCMC (+ expression). Start with cheap sequence-level constraints, then progressively add expensive constraints. Each stage builds on the previous one’s results. Avoids wasting GPU time scoring bad sequences.

Temperature Annealing

MCMC (hot) then MCMC (warm) then MCMC (cold). Explicit temperature stages: high temperature for broad exploration, medium for narrowing, low for final polishing. More control than single-optimizer annealing. Better for rugged energy landscapes.

Generator Switching

Rejection Sampling + RandomNucleotide then MCMC + ESM2. Start with fast random mutations for initial screening, then switch to language-model-guided mutations for biologically informed refinement. Combines fast screening with language-model-guided refinement.

Exploration then Refinement

python

# Stage 1: Fast exploration with cheap constraints
gen1 = RandomNucleotideGenerator(
    RandomNucleotideGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=10))
)
gen1.assign(segment)

opt1 = RejectionSamplingOptimizer(
    constructs=[construct],
    generators=[gen1],
    constraints=[gc_filter, homopolymer_filter],
    config=RejectionSamplingOptimizerConfig(num_samples=5000, num_results=20),
)

# Stage 2: Structure-based refinement
gen2 = ESM2Generator(ESM2GeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=3)))
gen2.assign(segment)

opt2 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen2],
    constraints=[gc_constraint, plddt_constraint, rmsd_constraint],
    config=MCMCOptimizerConfig(
        num_steps=200,
        num_results=5,
        proposals_per_result=5,
        max_temperature=2.0,
    ),
)

Program(optimizers=[opt1, opt2], num_results=5).run()

Progressive Constraints

python

# Stage 1: Sequence composition only
gen1 = RandomNucleotideGenerator(RandomNucleotideGeneratorConfig())
gen1.assign(segment)
opt1 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen1],
    constraints=[gc_constraint_1],
    config=MCMCOptimizerConfig(num_steps=300, num_results=10, proposals_per_result=5),
)

# Stage 2: Add structure prediction
gen2 = RandomNucleotideGenerator(RandomNucleotideGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=9)))
gen2.assign(segment)
opt2 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen2],
    constraints=[gc_constraint_2, structure_constraint],
    config=MCMCOptimizerConfig(num_steps=200, num_results=5, proposals_per_result=5),
)

# Stage 3: Add expression constraint
gen3 = RandomNucleotideGenerator(RandomNucleotideGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=6)))
gen3.assign(segment)
opt3 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen3],
    constraints=[gc_constraint_3, structure_constraint_2, expression_constraint],
    config=MCMCOptimizerConfig(num_steps=100, num_results=3, proposals_per_result=5),
)

Program(optimizers=[opt1, opt2, opt3], num_results=10).run()

Temperature Annealing

python

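# constraints_1, constraints_2, and constraints_3 are assumed to be separate lists
# of constraint instances, one list per stage.

# High temperature: broad exploration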
gen1 = RandomNucleotideGenerator(RandomNucleotideGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=8)))
gen1.assign(segment)
opt1 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen1],
    constraints=constraints_1,
    config=MCMCOptimizerConfig(
        num_steps=500, num_results=10, proposals_per_result=10, max_temperature=5.0
    ),
)

# Medium temperature: narrowing
gen2 = RandomNucleotideGenerator(RandomNucleotideGeneratorConfig())
gen2.assign(segment)
opt2 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen2],
    constraints=constraints_2,
    config=MCMCOptimizerConfig(
        num_steps=300, num_results=5, proposals_per_result=5, max_temperature=2.0
    ),
)

# Low temperature: polishing
gen3 = RandomNucleotideGenerator(RandomNucleotideGeneratorConfig(masking_strategy=MaskingStrategy(num_mutations=6)))
gen3.assign(segment)
opt3 = MCMCOptimizer(
    constructs=[construct],
    generators=[gen3],
    constraints=constraints_3,
    config=MCMCOptimizerConfig(
        num_steps=200, num_results=3, proposals_per_result=3, max_temperature=0.5
    ),
)

Program(optimizers=[opt1, opt2, opt3], num_results=10).run()
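To see why a hot stage explores and a cold stage polishes, the sketch below evaluates the standard Metropolis acceptance rule at the three max_temperature values used above. It is an illustration under assumptions: whether MCMCOptimizer applies exactly this rule, and how max_temperature enters its schedule, is not stated here, and acceptance_probability is a hypothetical helper.

```python
import math


def acceptance_probability(delta_energy, temperature):
    """Metropolis rule: always accept improvements, sometimes accept worse moves.

    `delta_energy` is (proposed energy - current energy); lower energy is better.
    """
    if delta_energy <= 0:
        return 1.0
    return math.exp(-delta_energy / temperature)


# Probability of accepting a move that worsens the energy by 0.5
for temperature in (5.0, 2.0, 0.5):
    p = acceptance_probability(0.5, temperature)
    print(f"T={temperature}: p={p:.3f}")
# T=5.0: p=0.905
# T=2.0: p=0.779
# T=0.5: p=0.368
```

Under this rule the hot stage accepts most uphill moves and can escape local minima, while the cold stage rejects most of them and settles into the nearest minimum.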

Running Stages Individually

Use run_stage() for fine-grained control: inspect results between stages, conditionally skip stages, or re-run a stage with different parameters.

python

program = Program(optimizers=[opt1, opt2, opt3], num_results=5)

# Run first stage
program.run_stage(0)
results = program.get_stage_results(0)

# Inspect before continuing
best = results["results"][results["best_result_idx"]]
print(f"Stage 1 best energy: {best['energy_score']:.4f}")

# Conditionally run next stage
if best["energy_score"] < 0.5:
    program.run_stage(1)
else:
    print("Stage 1 didn't converge, skipping refinement")

A previous stage can also be re-run, which resets the pipeline to that point and invalidates subsequent stages:

python

# Re-run stage 0 (invalidates stages 1 and 2)
program.run_stage(0)
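The invalidation rule can be modeled with a toy tracker. This is only a sketch of the bookkeeping described above, not the Program implementation: StageTracker and its attributes are hypothetical, and the rule that stages cannot be skipped ahead is an added assumption.

```python
class StageTracker:
    """Toy model of stage bookkeeping; not the proto_language Program class."""

    def __init__(self, num_stages):
        self.num_stages = num_stages
        self.results = {}        # stage index -> results of that stage
        self.current_stage = 0   # index of the next stage expected to run

    def run_stage(self, index, run_fn):
        if not 0 <= index < self.num_stages:
            raise IndexError(f"no stage {index}")
        if index > self.current_stage:
            # Assumption: a stage needs the output of the stage before it.
            raise ValueError(f"stage {self.current_stage} must run first")
        # Re-running an earlier stage drops everything computed after it.
        for later in [i for i in self.results if i > index]:
            del self.results[later]
        self.results[index] = run_fn(index)
        self.current_stage = index + 1
        return self.results[index]


tracker = StageTracker(num_stages=3)
for stage in range(3):
    tracker.run_stage(stage, lambda i: f"results-{i}")
print(sorted(tracker.results))  # [0, 1, 2]

tracker.run_stage(0, lambda i: f"rerun-{i}")
print(sorted(tracker.results))  # [0]
print(tracker.current_stage)    # 1
```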

Results and Export

Accessing Results

python

program.run()

# Final energy scores (from last optimizer)
print(program.energy_scores)  # [0.05, 0.08, 0.12, ...]

# Final sequences (from shared constructs)
for construct in program.constructs:
    for sequence in construct.joined_sequences:
        print(sequence.sequence)

# Structured results
results = program.extract_results(program.energy_scores)
for result in results["results"]:
    print(f"Result {result['result_idx']}: energy={result['energy_score']:.4f}")
    for construct in result["constructs"]:
        for seg in construct["segments"]:
            print(f"  {seg['label']}: {seg['sequence'][:50]}...")

Export Formats

python

# Export all 4 tables at once (sequences, constraints, constructs, optimization)
program.export(path="./results/", format="csv")
# Creates: results/sequences.csv, results/constraints.csv,
#          results/constructs.csv, results/optimization.csv
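Once exported, the four CSV files can be read back with the standard library alone. The sketch below assumes only the file names listed above; it makes no assumption about column names, and load_exported_tables is a hypothetical helper rather than part of the library.

```python
import csv
from pathlib import Path

# File stems taken from the export example above.
TABLE_NAMES = ("sequences", "constraints", "constructs", "optimization")


def load_exported_tables(folder):
    """Read each exported CSV that exists in `folder` into a list of row dicts."""
    tables = {}
    for name in TABLE_NAMES:
        csv_path = Path(folder) / f"{name}.csv"
        if not csv_path.exists():
            continue  # skip tables that were not exported
        with csv_path.open(newline="") as handle:
            tables[name] = list(csv.DictReader(handle))
    return tables


tables = load_exported_tables("./results/")
for name, rows in tables.items():
    columns = list(rows[0].keys()) if rows else []
    print(f"{name}: {len(rows)} rows, columns={columns}")
```

Printing the column names first is a quick way to discover the schema before writing any analysis against it.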

Stage-Specific Results

Access results from any completed stage:

python

# Results from stage 0
stage_0_results = program.get_stage_results(0)

# Export a specific stage's results (writes the 4-table folder for that stage)
program.export(path="./stage0_results/", format="csv", stage=0)

Optimizer-Level Export

Individual Optimizer instances also provide the same export methods (without the stage parameter):

python

optimizer.export(path="./results/", format="csv")
df = optimizer.to_dataframe(table="sequences")
fasta = optimizer.to_fasta()

State Serialization

Save and restore program state for long-running optimization or checkpointing:

python

import json

# Save state (to a file here; a database or other store works the same way)
state = program.serialize_state()
with open("checkpoint.json", "w") as f:
    json.dump(state, f)

# Later: restore state and continue
with open("checkpoint.json") as f:
    state = json.load(f)
program.restore_state(state, stage_index=1)
program.run_stage(1)  # Resume from stage 1
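For long runs, a crash in the middle of json.dump can leave a truncated checkpoint. One common safeguard, sketched below with the standard library only, is to write to a temporary file and rename it into place. The helper names are hypothetical, and the sketch assumes the state is JSON-serializable, as the example above implies.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write `state` as JSON so that `path` is never left half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_path, path)     # atomic rename over the old checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Read a checkpoint written by save_checkpoint."""
    with open(path) as handle:
        return json.load(handle)


# Demo with a stand-in dict; in practice pass program.serialize_state().
with tempfile.TemporaryDirectory() as tmp:
    checkpoint_path = os.path.join(tmp, "checkpoint.json")
    save_checkpoint({"example_key": 1}, checkpoint_path)
    print(load_checkpoint(checkpoint_path))  # {'example_key': 1}
```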

Important Rules

All optimizers in a Program must share the same Construct objects (by identity, not just value). This is how state persists between stages. The construct is created once and the same object is passed to all optimizers.

python

# Correct: same construct object
construct = Construct([segment])
opt1 = MCMCOptimizer(constructs=[construct], ...)
opt2 = MCMCOptimizer(constructs=[construct], ...)  # Same object

# Wrong: different construct objects (raises ValueError)
opt1 = MCMCOptimizer(constructs=[Construct([segment])], ...)
opt2 = MCMCOptimizer(constructs=[Construct([segment])], ...)  # Different object!
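The difference between identity and value can be checked with Python's `is` operator. The sketch below is a standalone illustration of such a check, not the validation the library performs: FakeConstruct is a stand-in class and shares_constructs is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class FakeConstruct:
    """Stand-in for Construct; dataclasses compare by value with ==."""
    sequence: str


def shares_constructs(constructs_per_optimizer):
    """Return True if every optimizer holds the very same construct objects."""
    if not constructs_per_optimizer:
        return True
    first, *rest = constructs_per_optimizer
    return all(
        len(other) == len(first) and all(a is b for a, b in zip(first, other))
        for other in rest
    )


shared = FakeConstruct("ACGT")
print(shares_constructs([[shared], [shared]]))  # True

copy_a, copy_b = FakeConstruct("ACGT"), FakeConstruct("ACGT")
print(copy_a == copy_b)                         # True: equal by value
print(shares_constructs([[copy_a], [copy_b]]))  # False: different objects
```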

Each generator and constraint instance can only be used in one optimizer. This prevents shared mutable state bugs. Create new instances for each stage.

python

# Correct: separate generator instances per optimizer
gen1 = RandomNucleotideGenerator(config)
gen2 = RandomNucleotideGenerator(config)  # New instance, same config is fine
gen1.assign(segment)
gen2.assign(segment)

# Wrong: reusing the same generator instance (raises ValueError)
gen = RandomNucleotideGenerator(config)
gen.assign(segment)
opt1 = MCMCOptimizer(generators=[gen], ...)
opt2 = MCMCOptimizer(generators=[gen], ...)  # Same instance -- error!

Properties

| Property | Description |
| --- | --- |
| `constructs` | List of Construct objects being optimized (shared across all optimizers) |
| `optimizers` | List of Optimizer objects in sequence |
| `num_results` | Program-level default for the number of output sequences. Each optimizer resolves its result count as: config override > `num_results` > error. |
| `energy_scores` | Final energy scores from the last optimizer (after `run()`) |
| `current_stage` | Index of current/next stage to run |
| `verbose` | If True, forces verbose mode in all optimizers |
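The resolution order for the result count (config override, then the program default, then an error) can be written as a short function. This is a sketch of the rule as stated in the table, assuming an unset value is represented by None; resolve_num_results is a hypothetical name, not a library function.

```python
def resolve_num_results(config_num_results, program_num_results):
    """Pick the result count: config override > program default > error."""
    if config_num_results is not None:
        return config_num_results
    if program_num_results is not None:
        return program_num_results
    raise ValueError("num_results is set on neither the optimizer config nor the program")


print(resolve_num_results(20, 5))    # 20: the config override wins
print(resolve_num_results(None, 5))  # 5: falls back to the program default
```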

Next Steps

Quickstart

A complete program, from scratch

Optimizers

Deep dive into individual optimizer strategies

Constraints

Scoring functions for design objectives

Tools

The bioinformatics tools that constraints and generators call