Optimizers

Optimizers are the search algorithms of Proto. They coordinate generators and constraints in an iterative loop: generate proposal sequences, score them, select the best, and repeat. Different optimizers implement different search strategies, from rejection sampling to MCMC with simulated annealing.

The Optimization Loop

Every optimizer follows the same fundamental loop: generate proposals, score them, select the best, and repeat.

Energy scoring is where constraints come together. The optimizer calls score_energy(), which:

Evaluates all filter constraints first (hard pass/fail)
Rejects proposals that fail any filter (energy = inf)
Evaluates scoring constraints only on surviving proposals
Combines scores: energy = Σ_i (weight_i × score_i)
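The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not Proto's implementation: `score_energy_sketch`, the `Filter` and `Scorer` aliases, and the toy constraints are all hypothetical names, and the sketch works one proposal at a time where the real method presumably works on batches.

```python
import math
from typing import Callable, Sequence

# Hypothetical stand-ins: a filter returns pass/fail, a scorer returns a float.
Filter = Callable[[str], bool]
Scorer = Callable[[str], float]


def score_energy_sketch(
    proposals: Sequence[str],
    filters: Sequence[Filter],
    scorers: Sequence[tuple[float, Scorer]],
) -> list[float]:
    """Return one energy per proposal; math.inf marks a filter rejection."""
    energies = []
    for seq in proposals:
        # Pass 1: hard filters. Failing any single filter rejects the proposal.
        if not all(f(seq) for f in filters):
            energies.append(math.inf)
            continue
        # Pass 2: weighted sum of scores, computed only for survivors.
        energies.append(sum(weight * scorer(seq) for weight, scorer in scorers))
    return energies


def no_homopolymer(seq: str) -> bool:
    return "AAAA" not in seq


def gc_fraction(seq: str) -> float:
    return (seq.count("G") + seq.count("C")) / len(seq)


energies = score_energy_sketch(
    ["ATGC", "AAAAGC"],
    filters=[no_homopolymer],
    scorers=[(2.0, lambda s: abs(gc_fraction(s) - 0.5))],
)
print(energies)  # [0.0, inf]
```

The second proposal never reaches the scorer, which is the property that matters when the scorer is an expensive model call.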

Optimizer Comparison

	MCMC	Rejection Sampling	BeamSearch	Cycling	Gradient
Strategy	Random walk with acceptance criterion	Generate many, keep best	Beam search over token generation	Iterative conditioning cycles	Continuous relaxation with gradient descent
Exploration	High (temperature-controlled)	Broad (many samples)	Moderate (beam width)	Custom (user-defined)	Local (gradient-driven)
Exploitation	Good (annealing)	Low (no refinement)	Good (beam pruning)	Depends on pipeline	Strong (descent)
Best for	General optimization, refinement	Initial screening, exploration	Long DNA generation	Structure-conditioned design	Differentiable constraints
Inherits state?	Yes	Yes	No (starts from prompt)	Yes	Yes
Multi-segment?	Yes	Yes	No (single segment)	No (single segment)	No (single segment)

Available Optimizers

MCMCOptimizer: Metropolis-Hastings Search

The general-purpose optimizer. Uses Markov Chain Monte Carlo with Metropolis-Hastings acceptance to explore sequence space. Maintains one or more parallel trajectories, proposing mutations and accepting or rejecting them based on energy improvement and temperature.

How it works:

For each result sequence, generate proposals_per_result proposals
Score all proposals, pick the best one per result
Accept or reject based on Metropolis-Hastings criterion:
- Always accept if energy improves
- Sometimes accept worse moves (controlled by temperature)
Temperature anneals from max_temperature to min_temperature over the run
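As a rough illustration of one step of this procedure, the sketch below picks the best of several proposals per result and then applies the standard Metropolis rule. It is a toy under stated assumptions, not MCMCOptimizer's code: `mcmc_step`, `count_non_a`, and `point_mutation` are hypothetical helpers, the energy is a made-up count of non-"A" characters, and the temperature is passed in as a fixed number rather than annealed.

```python
import math
import random


def mcmc_step(results, energy_fn, propose_fn, proposals_per_result, temperature, rng):
    """One step: best-of-k proposal per result, then Metropolis acceptance."""
    updated = []
    for current in results:
        candidates = [propose_fn(current, rng) for _ in range(proposals_per_result)]
        best = min(candidates, key=energy_fn)
        delta = energy_fn(best) - energy_fn(current)
        # Improvements always pass; worse moves pass with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            updated.append(best)
        else:
            updated.append(current)
    return updated


def count_non_a(seq):
    # Toy energy: number of characters that are not "A" (lower is better).
    return sum(ch != "A" for ch in seq)


def point_mutation(seq, rng):
    # Toy generator: replace one random position with a random base.
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice("ACGT") + seq[i + 1:]


rng = random.Random(0)
results = ["CGTCGT", "TTTTTT"]
trace = [[count_non_a(s) for s in results]]
for _ in range(50):
    results = mcmc_step(results, count_non_a, point_mutation, 4, 1e-9, rng)
    trace.append([count_non_a(s) for s in results])
print(trace[0], "->", trace[-1])
```

With a temperature this close to zero the acceptance probability for any worse move underflows to 0, so each trajectory's energy can only stay flat or fall. Raising the temperature lets the chains climb out of local minima at the cost of occasionally accepting worse sequences.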

When to use: General-purpose optimization, protein refinement, any iterative design task. A reasonable default when the choice is unclear.

python

from proto_language.optimizer import MCMCOptimizer, MCMCOptimizerConfig

config = MCMCOptimizerConfig(
    num_results=5,           # Maintain 5 parallel trajectories
    num_steps=1000,          # Run for 1000 steps
    proposals_per_result=10,  # Generate 10 proposals per result per step
    max_temperature=2.0,     # Start exploratory
    min_temperature=0.001,   # End greedy
)

optimizer = MCMCOptimizer(
    constructs=[construct],
    generators=[generator],
    constraints=constraints,
    config=config,
)

num_results

int | None

default:"None"

Number of parallel trajectories to maintain. num_results=1 is standard single-chain MCMC. Higher values explore more of sequence space in parallel.

num_steps

int

required

Total number of MCMC steps. More steps allow better convergence but increase runtime.

proposals_per_result

int

default:"1"

Number of proposals to generate per result per step. Total proposals per step = num_results × proposals_per_result.

max_temperature

float

default:"1.0"

Starting temperature. Higher = more exploration (accepts worse moves more often).

min_temperature

float

default:"0.001"

Final temperature. Lower = more greedy (worse moves are almost never accepted).

RejectionSamplingOptimizer: Generate Many, Keep Best

The simplest optimizer. Generates a large number of proposals, scores them all, and keeps the top num_results by lowest energy. No iterative refinement; just sampling with selection.

How it works:

Generate proposals in batches of proposal_batch_size
Score them all
Track the top num_results proposals seen so far (maintained in sorted order)
Repeat until num_samples reached (or energy_threshold met)
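The loop above fits in a short function. The following is a sketch only: `rejection_sample`, `random_8mer`, and `gc_balance` are hypothetical names, the energy is a toy GC-balance score, and `proposal_batch_size` is treated as a required integer here even though the real config allows None.

```python
import heapq
import random


def rejection_sample(sample_fn, energy_fn, num_samples, num_results,
                     proposal_batch_size, energy_threshold=None):
    """Return (top, generated): the best (energy, seq) pairs and the sample count."""
    top = []  # kept sorted, lowest energy first
    generated = 0
    while generated < num_samples:
        # The last batch is capped so the total never exceeds num_samples.
        batch_size = min(proposal_batch_size, num_samples - generated)
        batch = [sample_fn() for _ in range(batch_size)]
        generated += batch_size
        scored = [(energy_fn(seq), seq) for seq in batch]
        # nsmallest returns a sorted list, so `top` stays ordered by energy.
        top = heapq.nsmallest(num_results, top + scored)
        # Early stopping: every retained result is below the threshold.
        if (energy_threshold is not None and len(top) == num_results
                and top[-1][0] < energy_threshold):
            break
    return top, generated


rng = random.Random(1)


def random_8mer():
    return "".join(rng.choice("ACGT") for _ in range(8))


def gc_balance(seq):
    # Toy energy: distance of the GC fraction from 0.5.
    return abs((seq.count("G") + seq.count("C")) / len(seq) - 0.5)


full_top, full_count = rejection_sample(random_8mer, gc_balance, 100, 5, 30)
early_top, early_count = rejection_sample(random_8mer, gc_balance, 10000, 5, 50,
                                          energy_threshold=0.1)
print(full_count, early_count)
```

Without a threshold the function always consumes the full sample budget. With one, it can return after the first batch in which the worst retained result already clears the bar.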

When to use: Initial exploration, quick screening, or as the first stage in a multi-stage Program pipeline.

python

from proto_language.optimizer import RejectionSamplingOptimizer, RejectionSamplingOptimizerConfig

# Standard mode: generate exactly num_samples
config = RejectionSamplingOptimizerConfig(
    num_samples=1000,          # Generate 1000 proposals total
    num_results=10,            # Keep the top 10
    proposal_batch_size=50,    # Evaluate 50 per batch
)

# Early stopping mode: stop when threshold is met
config = RejectionSamplingOptimizerConfig(
    num_samples=10000,
    num_results=10,
    energy_threshold=0.1,  # Stop early if top results all below 0.1
)

optimizer = RejectionSamplingOptimizer(
    constructs=[construct],
    generators=[generator],
    constraints=constraints,
    config=config,
)

num_samples

int

required

Maximum number of proposals to generate.

num_results

int | None

default:"None"

Number of top sequences to keep (by lowest energy). Overrides the program-level num_results when set.

proposal_batch_size

int | None

default:"None"

Number of proposal sequences to generate and evaluate per batch, capped at num_samples.

energy_threshold

float | None

default:"None"

If set, enables early stopping: halts when all top-k proposals have energy below this threshold.

BeamSearchOptimizer: Autoregressive Beam Search

Optimizer for generating long sequences with autoregressive generators such as Evo2. Splits a long segment into chunks of beam_length tokens and performs beam search at each boundary, pruning low-quality beams as the sequence grows.

How it works:

Start from prompt sequence
Generate beam_length tokens with proposals_per_result variations per beam
Score all beams with constraints
Keep top num_results beams
Repeat until segment length is reached
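A toy version of these steps, with a random extender standing in for an autoregressive model, might look like the following. Everything here is illustrative: `beam_search`, `random_extension`, and `gc_balance` are hypothetical names, and the sketch assumes the target length counts the prompt, which the description above does not specify.

```python
import random


def beam_search(prompt, extend_fn, energy_fn, target_length, beam_length,
                num_results, proposals_per_result):
    """Grow beams chunk by chunk, keeping the lowest-energy num_results each round."""
    beams = [prompt]
    while len(beams[0]) < target_length:
        # The final chunk may be shorter than beam_length.
        chunk = min(beam_length, target_length - len(beams[0]))
        candidates = [
            beam + extend_fn(beam, chunk)
            for beam in beams
            for _ in range(proposals_per_result)
        ]
        # Prune: score every candidate and keep the best num_results.
        beams = sorted(candidates, key=energy_fn)[:num_results]
    return beams


rng = random.Random(2)


def random_extension(beam, length):
    # Stand-in for an autoregressive generator; it ignores the beam's content.
    return "".join(rng.choice("ACGT") for _ in range(length))


def gc_balance(seq):
    # Toy energy: distance of the GC fraction from 0.5.
    return abs((seq.count("G") + seq.count("C")) / len(seq) - 0.5)


beams = beam_search("ATGCCTGAA", random_extension, gc_balance,
                    target_length=49, beam_length=10,
                    num_results=5, proposals_per_result=10)
print([round(gc_balance(b), 3) for b in beams])
```

Because every beam is scored after each chunk, a poor early chunk is discarded before any further generation is spent extending it.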

When to use: Generating a single long DNA sequence (e.g., 2000+ bp) with quality constraints applied during generation rather than after.

python

from proto_language.optimizer import (
    BeamSearchOptimizer, BeamSearchOptimizerConfig
)

config = BeamSearchOptimizerConfig(
    prompt="ATGCCTGAA",          # Starting sequence
    beam_length=500,             # Generate 500bp per beam step
    num_results=5,               # Keep top 5 beams
    proposals_per_result=10,    # Generate 10 variations per beam
    score_by="mean",             # Average scores across all beams
    use_kv_caching=True,         # Cache for speed
)

optimizer = BeamSearchOptimizer(
    target_segment=construct.segments[0],  # The single segment to generate over
    constructs=[construct],
    generators=[evo2_generator], # Must be autoregressive
    constraints=constraints,
    config=config,
)

BeamSearch requires a single-segment construct and an autoregressive generator. It ignores previous optimizer results in a Program; it always starts fresh from its configured prompt.

prompt

str

required

Initial sequence to begin generation from. All beams extend from this prompt.

beam_length

int

required

Number of tokens to generate per beam step.

num_results

int

Number of top beams to maintain at each step (K in beam search). Optional; defaults to the number of results requested by the program.

proposals_per_result

int

required

Number of proposal extensions per beam. Total proposals per step = num_results × proposals_per_result.

score_by

str

default:"mean"

Aggregation method: "mean" (average across beams, rewards consistency) or "last" (most recent beam only).

use_kv_caching

bool

default:"False"

Enable KV cache reuse for faster autoregressive generation across beam steps.

CyclingOptimizer: Iterative Conditioning Cycles

A generalized optimizer that alternates between a user-defined conditioning function and a generator. The conditioning function can modify generator config between iterations; for example, predicting a 3D structure from the current sequence, then using that structure to condition inverse folding for the next iteration.

How it works:

Run the conditioning function on current sequences
The conditioning function updates generator config (e.g., sets new PDB structures)
Generator produces new proposals conditioned on updated config
Optionally evaluate constraints and roll back rejected proposals
Repeat for num_steps
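The control flow of such a cycle can be shown with string manipulation standing in for structure prediction and inverse folding. This sketch is deliberately abstract: `cycling_loop` and its callback signatures are hypothetical and are not the interface CyclingOptimizer expects for a conditioning function.

```python
from collections import Counter


def cycling_loop(sequences, conditioning_fn, generate_fn, num_steps, accept_fn=None):
    """Alternate conditioning and generation; optionally roll back rejected proposals."""
    for _ in range(num_steps):
        config = conditioning_fn(sequences)         # e.g. predict structures
        proposals = generate_fn(sequences, config)  # generation conditioned on config
        if accept_fn is not None:
            # Rollback: a rejected proposal is replaced by the sequence it came from.
            proposals = [new if accept_fn(new) else old
                         for old, new in zip(sequences, proposals)]
        sequences = proposals
    return sequences


def most_common_letter(sequences):
    # Toy conditioning: derive a target letter from the current pool.
    counts = Counter("".join(sequences))
    return {"target": counts.most_common(1)[0][0]}


def fill_toward_target(sequences, config):
    # Toy generator: change the first non-target character to the target.
    out = []
    for seq in sequences:
        for i, ch in enumerate(seq):
            if ch != config["target"]:
                seq = seq[:i] + config["target"] + seq[i + 1:]
                break
        out.append(seq)
    return out


free = cycling_loop(["ACGT", "AAGT"], most_common_letter, fill_toward_target, 3)
guarded = cycling_loop(["ACGT", "AAGT"], most_common_letter, fill_toward_target, 3,
                       accept_fn=lambda s: "AAAA" not in s)
print(free)     # ['AAAA', 'AAAA']
print(guarded)  # ['AAAT', 'AAAT']
```

The guarded run shows the rollback behaviour: once a proposal would violate the constraint, the trajectory keeps its previous sequence instead.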

When to use: Structure prediction + inverse folding cycles (“protein hallucination”), or any iterative conditioning workflow.

Built-in pipeline: The protein-hunter pipeline automates the common pattern of structure prediction followed by inverse folding:

python

from proto_language.optimizer import (
    CyclingOptimizer, CyclingOptimizerConfig
)

config = CyclingOptimizerConfig(
    num_results=10,
    num_steps=50,
    pipeline="protein-hunter",
    protein_hunter={"structure_tool": "boltz2"},
)

optimizer = CyclingOptimizer(
    target_segment=construct.segments[0],  # The single segment to design
    constructs=[construct],
    generators=[proteinmpnn_gen],
    constraints=constraints,
    config=config,
)

num_results

int | None

default:"None"

Number of proposal trajectories to maintain across cycles.

num_steps

int

required

Number of conditioning-generation cycles to run.

pipeline

str

default:"None"

Named pipeline (e.g., "protein-hunter") for common conditioning patterns.

conditioning_fn

Callable

default:"None"

Custom conditioning function for advanced use cases. Passed as a constructor argument to CyclingOptimizer (not a CyclingOptimizerConfig field), and mutually exclusive with pipeline.

The Gradient optimizer (continuous relaxation with differentiable constraints, paired with PositionWeightGenerator) is summarized in the comparison table above; see its full reference at Gradient optimizer.

Optimizer Decision Tree

Pool Architecture

Understanding the dual-pool system is key to understanding how optimizers work:

`proposal_sequences`

The working pool. Generators write proposals here. Constraints evaluate sequences from here. Size = num_proposals.

It acts as an inbox: new proposals arrive, are evaluated, and the best ones graduate to the result pool.

`result_sequences`

The results pool. Contains the best sequences found so far. Size = num_results.

It acts as a hall of fame: only the best-scoring sequences are retained here.

Pool Initialization (Cycling)

When an optimizer starts, either fresh or after receiving results from a previous optimizer in a Program, both pools are initialized by cycling through the source sequences:

Source sequences: [A, B, C]  (3 from previous optimizer)
num_results = 5
num_proposals = 10

result_sequences:  [A, B, C, A, B]    # Cycles to fill 5 slots
proposal_sequences: [A, B, C, A, B, C, A, B, C, A]  # Cycles to fill 10 slots

This cycling preserves diversity when pool sizes differ between optimizers. Some optimizers (like Rejection Sampling) keep their results sorted by energy; others preserve their natural ordering.
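The fill pattern in the example above is ordinary round-robin repetition, which Python's itertools expresses directly. The helper name `fill_pool` is hypothetical; only the resulting order is taken from the example.

```python
from itertools import cycle, islice


def fill_pool(source, size):
    """Repeat the source sequences in order until `size` slots are filled."""
    return list(islice(cycle(source), size))


source = ["A", "B", "C"]
result_sequences = fill_pool(source, 5)
proposal_sequences = fill_pool(source, 10)
print(result_sequences)    # ['A', 'B', 'C', 'A', 'B']
print(proposal_sequences)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A']
```

When the target pool is smaller than the source, the same call simply truncates: `fill_pool(source, 2)` gives `['A', 'B']`.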

Constraint Evaluation & Performance

The score_energy() method implements a two-pass evaluation strategy that skips expensive GPU computations on already-rejected proposals:

Pass 1, Filters: All filter constraints (those with a threshold) are evaluated first. Proposals that fail any filter are immediately rejected with energy = inf and marked with the rejecting constraint’s label. This is an AND gate; a proposal must pass every filter to survive.

Pass 2, Scoring: Scoring constraints (those with a weight) are evaluated only on proposals that passed all filters. This means expensive GPU evaluations (structure prediction, binding strength) are never run on proposals that already failed a cheap filter (homopolymer check, GC content range).

50 proposals generated
  → 12 rejected by homopolymer filter     (CPU, <1ms)
  → 5 rejected by GC content filter       (CPU, <1ms)
  → 33 survive to scoring
  → ESMFold pLDDT evaluated on 33         (GPU, skipped 17)
  → Boltz2 binding evaluated on 33        (GPU, skipped 17)

GPU memory for constraint evaluation is managed at the tool level, not the framework level. Unlike generators (which have a framework-level batch_size), constraints receive all passing proposals in a single call. Each tool handles its own memory internally: ESMFold splits by residue count, Boltz2 processes complexes sequentially, and so on. Users control this through tool-specific config fields (e.g., max_batch_residues for ESMFold) rather than a constraint-level parameter.

Constraints are ordered with cheap filters first (sequence composition checks) and expensive scoring constraints last (structure prediction, binding). The two-pass strategy ensures rejected proposals never trigger GPU evaluations.

Temperature and Acceptance

Temperature controls the exploration-exploitation trade-off in MCMC. It determines how willing the optimizer is to accept a proposal that is worse than the current result, where delta_energy is the proposal's energy minus the current energy:

Acceptance probability = min(1, exp(-delta_energy / temperature))

MCMC uses exponential annealing: T(step) = T_max × (T_min / T_max)^((step - 1) / (num_steps - 1)), so step 1 is exactly T_max and the final step is exactly T_min.
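Both formulas are easy to evaluate by hand, and the sketch below does so for the temperatures used in the MCMC example earlier on this page. The function names are hypothetical, and the `num_steps == 1` guard is an assumption added here to avoid dividing by zero; the formula above does not cover that case.

```python
import math


def temperature_at(step, num_steps, t_max, t_min):
    """Exponential annealing; `step` is 1-indexed, as in the formula above."""
    if num_steps == 1:
        return t_max  # Assumption: a single-step run stays at t_max.
    return t_max * (t_min / t_max) ** ((step - 1) / (num_steps - 1))


def acceptance_probability(delta_energy, temperature):
    """min(1, exp(-delta_energy / temperature)), written to avoid overflow."""
    if delta_energy <= 0:
        return 1.0  # Improvements (and ties) are always accepted.
    return math.exp(-delta_energy / temperature)


temps = [temperature_at(s, 1000, 2.0, 0.001) for s in range(1, 1001)]
print(temps[0], temps[-1])

# The same worse-by-1.0 move at the start and at the end of the run.
p_start = acceptance_probability(1.0, temps[0])
p_end = acceptance_probability(1.0, temps[-1])
print(round(p_start, 4), p_end)
```

At T = 2.0 a move that raises the energy by 1.0 is accepted with probability exp(-0.5), roughly 0.61; at T = 0.001 the same move has probability exp(-1000), which is zero for all practical purposes.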

A common pattern is to start with high temperature for exploration, then anneal to low temperature for refinement. For multi-stage optimization, use a Program with explicit temperature stages instead of relying solely on annealing.

Tool Cache Management

Constraints that call expensive bioinformatics tools (structure prediction, sequence alignment) benefit from caching. The optimizer manages a shared tool cache:

python

MCMCOptimizer(
    ...,
    # Pass exactly one form of clear_tool_cache; the alternatives are commented out.

    # Prune cache when it exceeds 100 MB (default)
    clear_tool_cache=100 * 1024 * 1024,

    # Or: clear every step (aggressive, for memory-constrained runs)
    # clear_tool_cache=True,

    # Or: only clear specific tools
    # clear_tool_cache=["esmfold", "boltz2"],
)

History Tracking

Optimizers record snapshots of their state at configurable intervals for post-hoc analysis:

python

optimizer.run()

# Iterate over history snapshots
for snapshot in optimizer.history:
    step = snapshot["time_step"]
    # Keep only results that have an energy score
    energies = [r["energy_score"] for r in snapshot["results"] if r["energy_score"] is not None]
    if not energies:
        continue  # Nothing scored in this snapshot; min() would fail on an empty list
    print(f"Step {step}: best={min(energies):.4f}, mean={sum(energies)/len(energies):.4f}")

Next Steps

Programs

Chain multiple optimizers into multi-stage pipelines

Generators

The models that propose candidate sequences

Constraints

The quality checklist optimizers minimize

Optimizer Reference

Full API reference for each optimizer
