Optimizers

Optimizers are the search algorithms of Proto. They coordinate generators and constraints in an iterative loop: generate proposal sequences, score them, select the best, and repeat. Different optimizers implement different search strategies, from rejection sampling to MCMC with simulated annealing.

The Optimization Loop

Every optimizer follows the same fundamental loop:
Initialize Pools → Generate (generators propose proposals) → Filter (hard constraints reject bad ones) → Score (soft constraints compute energy) → Select (keep best by energy) → Done? If no, return to Generate; if yes, Return Results.
Energy aggregation: filter fails → E = ∞; otherwise E = Σ wᵢ·sᵢ (add) or E = Π wᵢ·sᵢ (multiply).
Energy scoring is where constraints come together. The optimizer calls score_energy() which:
  1. Evaluates all filter constraints first (hard pass/fail)
  2. Rejects proposals that fail any filter (energy = inf)
  3. Evaluates scoring constraints only on surviving proposals
  4. Combines scores: energy = Σ wᵢ·sᵢ (each scoring constraint's weight wᵢ times its score sᵢ)
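The aggregation step can be sketched in a few lines. This is a minimal illustration rather than Proto's implementation: the `aggregate_energy` helper, its `mode` argument, and the plain `(weight, score)` tuples are hypothetical names chosen for the example. It covers the additive and multiplicative combination rules and the infinite energy assigned when a filter fails.

```python
import math

def aggregate_energy(passed_filters, weighted_scores, mode="add"):
    """Combine (weight, score) pairs into a single energy value.

    Hypothetical helper for illustration only; not part of the Proto API.
    """
    if not passed_filters:
        return math.inf  # a failed filter rejects the proposal outright
    terms = [weight * score for weight, score in weighted_scores]
    if mode == "add":
        return sum(terms)        # E = sum of w_i * s_i
    if mode == "multiply":
        return math.prod(terms)  # E = product of w_i * s_i
    raise ValueError(f"unknown mode: {mode!r}")

pairs = [(2.0, 0.5), (1.0, 0.25)]
additive = aggregate_energy(True, pairs)                    # 2*0.5 + 1*0.25 = 1.25
multiplicative = aggregate_energy(True, pairs, "multiply")  # (2*0.5) * (1*0.25) = 0.25
rejected = aggregate_energy(False, pairs)                   # inf
```

Because lower energy is better, returning infinity for a filter failure guarantees that a rejected proposal can never outrank one that passed.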

Optimizer Comparison

| | MCMC | Rejection Sampling | BeamSearch | Cycling | Gradient |
| --- | --- | --- | --- | --- | --- |
| Strategy | Random walk with acceptance criterion | Generate many, keep best | Beam search over token generation | Iterative conditioning cycles | Continuous relaxation with gradient descent |
| Exploration | High (temperature-controlled) | Broad (many samples) | Moderate (beam width) | Custom (user-defined) | Local (gradient-driven) |
| Exploitation | Good (annealing) | Low (no refinement) | Good (beam pruning) | Depends on pipeline | Strong (descent) |
| Best for | General optimization, refinement | Initial screening, exploration | Long DNA generation | Structure-conditioned design | Differentiable constraints |
| Inherits state? | Yes | Yes | No (starts from prompt) | Yes | Yes |
| Multi-segment? | Yes | Yes | No (single segment) | No (single segment) | No (single segment) |

Available Optimizers

Rejection Sampling

The simplest optimizer. Generates a large number of proposals, scores them all, and keeps the top num_results by lowest energy. No iterative refinement; just sampling with selection.
How it works:
  1. Generate proposals in batches of proposal_batch_size
  2. Score them all
  3. Track the top num_results proposals seen so far (maintained in sorted order)
  4. Repeat until num_samples reached (or energy_threshold met)
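The four steps above can be sketched as a small loop. This is a toy version under stated assumptions, not the RejectionSamplingOptimizer source: `rejection_sample`, `propose`, and `score` are hypothetical stand-ins, and a "sequence" here is just a float whose energy is its square.

```python
import heapq
import random

def rejection_sample(propose, score, num_samples, num_results,
                     proposal_batch_size=50, energy_threshold=None):
    """Toy rejection-sampling loop; returns (top, generated).

    `propose` and `score` are hypothetical stand-ins for a generator and
    for energy scoring. `top` holds (energy, proposal) pairs sorted by
    ascending energy.
    """
    batch_size = min(proposal_batch_size, num_samples)  # batch is capped at num_samples
    top, generated = [], 0
    while generated < num_samples:
        n = min(batch_size, num_samples - generated)
        scored = [(score(p), p) for p in (propose() for _ in range(n))]
        generated += n
        # Merge the batch into the running top-k, kept sorted by energy
        top = heapq.nsmallest(num_results, top + scored, key=lambda pair: pair[0])
        # Early stopping: the worst retained energy is already below the threshold
        if (energy_threshold is not None and len(top) == num_results
                and top[-1][0] < energy_threshold):
            break
    return top, generated

rng = random.Random(0)
propose = lambda: rng.uniform(-1.0, 1.0)  # toy "sequence": a single float
score = lambda x: x * x                   # toy energy, lower is better

top, generated = rejection_sample(propose, score, num_samples=1000, num_results=10)
early_top, early_generated = rejection_sample(
    propose, score, num_samples=10000, num_results=10, energy_threshold=0.1)
```

Checking only the worst retained energy is enough for early stopping, because the retained list is sorted: if the last entry is below the threshold, every entry is.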
When to use: Initial exploration, quick screening, or as the first stage in a multi-stage Program pipeline.
python
from proto_language.optimizer import RejectionSamplingOptimizer, RejectionSamplingOptimizerConfig

# Standard mode: generate exactly num_samples
config = RejectionSamplingOptimizerConfig(
    num_samples=1000,          # Generate 1000 proposals total
    num_results=10,            # Keep the top 10
    proposal_batch_size=50,    # Evaluate 50 per batch
)

# Early stopping mode: stop when threshold is met
config = RejectionSamplingOptimizerConfig(
    num_samples=10000,
    num_results=10,
    energy_threshold=0.1,  # Stop early if top results all below 0.1
)

optimizer = RejectionSamplingOptimizer(
    constructs=[construct],
    generators=[generator],
    constraints=constraints,
    config=config,
)
Parameters:
num_samples (int, required): Maximum number of proposals to generate.
num_results (int | None, default: None): Number of top sequences to keep (by lowest energy). Overrides the program-level num_results when set.
proposal_batch_size (int | None, default: None): Number of proposal sequences to generate and evaluate per batch, capped at num_samples.
energy_threshold (float | None, default: None): If set, enables early stopping: the run halts once all num_results retained proposals have energy below this threshold.

Cycling

A generalized optimizer that alternates between a user-defined conditioning function and a generator. The conditioning function can modify generator config between iterations; for example, predicting a 3D structure from the current sequence, then using that structure to condition inverse folding for the next iteration.
How it works:
  1. Run the conditioning function on current sequences
  2. The conditioning function updates generator config (e.g., sets new PDB structures)
  3. Generator produces new proposals conditioned on updated config
  4. Optionally evaluate constraints and roll back rejected proposals
  5. Repeat for num_steps
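The cycle above can be sketched with plain callables. This is an illustrative toy, not the CyclingOptimizer implementation: `run_cycles`, the shape of `generator_config`, and the signatures of `conditioning_fn`, `generate`, and `accept` are all assumptions made for the example, and the real conditioning function signature may differ.

```python
def run_cycles(sequences, conditioning_fn, generate, num_steps, accept=None):
    """Toy conditioning/generation loop; every callable here is a stand-in."""
    generator_config = {}
    for _ in range(num_steps):
        # Steps 1-2: conditioning inspects current sequences and updates the config
        generator_config = conditioning_fn(sequences, generator_config)
        # Step 3: the generator proposes new sequences under the updated config
        proposals = generate(sequences, generator_config)
        # Step 4: optional rollback, keeping the old sequence when a proposal is rejected
        if accept is not None:
            proposals = [new if accept(new) else old
                         for old, new in zip(sequences, proposals)]
        sequences = proposals
    return sequences, generator_config

# Toy stand-ins: conditioning counts cycles, the generator appends one residue,
# and the constraint rejects anything longer than five residues.
count_cycles = lambda seqs, cfg: {"cycle": cfg.get("cycle", 0) + 1}
extend = lambda seqs, cfg: [s + "A" for s in seqs]
short_enough = lambda s: len(s) <= 5

final, config = run_cycles(["MK", "MKV"], count_cycles, extend,
                           num_steps=5, accept=short_enough)
# final == ["MKAAA", "MKVAA"], config == {"cycle": 5}
```

Rolling back per trajectory, rather than discarding the whole batch, lets accepted trajectories keep advancing while rejected ones retry from their last good sequence.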
When to use: Structure prediction + inverse folding cycles (“protein hallucination”), or any iterative conditioning workflow.
Built-in pipeline: The protein-hunter pipeline automates the common pattern of structure prediction followed by inverse folding:
python
from proto_language.optimizer import (
    CyclingOptimizer, CyclingOptimizerConfig
)

config = CyclingOptimizerConfig(
    num_results=10,
    num_steps=50,
    pipeline="protein-hunter",
    protein_hunter={"structure_tool": "boltz2"},
)

optimizer = CyclingOptimizer(
    target_segment=construct.segments[0],  # The single segment to design
    constructs=[construct],
    generators=[proteinmpnn_gen],
    constraints=constraints,
    config=config,
)
Parameters:
num_results (int | None, default: None): Number of proposal trajectories to maintain across cycles.
num_steps (int, required): Number of conditioning-generation cycles to run.
pipeline (str | None, default: None): Named pipeline (e.g., "protein-hunter") for common conditioning patterns.
conditioning_fn (Callable | None, default: None): Custom conditioning function for advanced use cases. Passed as a constructor argument to CyclingOptimizer (not a CyclingOptimizerConfig field), and mutually exclusive with pipeline.
The Gradient optimizer (continuous relaxation with differentiable constraints, paired with PositionWeightGenerator) is summarized in the comparison table above; see its full reference at Gradient optimizer.

Optimizer Decision Tree

What kind of optimization?
  Differentiable constraints → Gradient
  Otherwise, by sequence type:
    DNA / RNA → Long sequence (>500bp)?
      Yes → BeamSearch + Evo2
      No → Goal?
        Quick screen → Rejection Sampling
        Optimize → MCMC
    Protein → Have / want to design structure?
      Iterative hallucination → Cycling (protein-hunter)
      No → Goal?
        Quick screen → Rejection Sampling
        Detailed optimization → MCMC
        Multi-stage → Rejection Sampling then MCMC (use Program)

Pool Architecture

Understanding the dual-pool system is key to understanding how optimizers work:

proposal_sequences

The working pool. Generators write proposals here. Constraints evaluate sequences from here. Size = num_proposals. It acts as an inbox: new proposals arrive, are evaluated, and the best ones graduate to the result pool.

result_sequences

The results pool. Contains the best sequences found so far. Size = num_results. It acts as a hall of fame: only the best-scoring sequences are retained here.

Pool Initialization (Cycling)

When an optimizer starts, either fresh or after receiving results from a previous optimizer in a Program, both pools are initialized by cycling through the source sequences:
Source sequences: [A, B, C]  (3 from previous optimizer)
num_results = 5

result_sequences:  [A, B, C, A, B]    # Cycles to fill 5 slots
proposal_sequences: [A, B, C, A, B, C, A, B, C, A]  # Cycles to fill num_proposals
This cycling preserves diversity when pool sizes differ between optimizers. Some optimizers (like Rejection Sampling) keep their results sorted by energy; others preserve their natural ordering.
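The fill rule in the example above can be reproduced with the standard library. This is a minimal sketch: `fill_pool` is a hypothetical helper, not a Proto function, and it assumes a pool smaller than the source simply takes the first entries, which the text above does not specify.

```python
from itertools import cycle, islice

def fill_pool(source, size):
    """Fill a pool of `size` slots by cycling through `source` in order."""
    if not source:
        raise ValueError("source must contain at least one sequence")
    return list(islice(cycle(source), size))

source = ["A", "B", "C"]                    # 3 sequences from a previous optimizer
result_sequences = fill_pool(source, 5)     # ['A', 'B', 'C', 'A', 'B']
proposal_sequences = fill_pool(source, 10)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A']
```

Cycling in order, instead of repeating the first source sequence, means every source sequence appears in the pool before any one of them appears twice.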

Constraint Evaluation & Performance

The score_energy() method implements a two-pass evaluation strategy that skips expensive GPU computations on already-rejected proposals:
Pass 1, Filters: All filter constraints (those with a threshold) are evaluated first. Proposals that fail any filter are immediately rejected with energy = inf and marked with the rejecting constraint’s label. This is an AND gate; a proposal must pass every filter to survive.
Pass 2, Scoring: Scoring constraints (those with a weight) are only evaluated on proposals that passed all filters. This means expensive GPU evaluations (structure prediction, binding strength) are never run on proposals that already failed a cheap filter (homopolymer check, GC content range).
50 proposals generated
  → 12 rejected by homopolymer filter     (CPU, <1ms)
  → 5 rejected by GC content filter       (CPU, <1ms)
  → 33 survive to scoring
  → ESMFold pLDDT evaluated on 33         (GPU, skipped 17)
  → Boltz2 binding evaluated on 33        (GPU, skipped 17)
GPU memory for constraint evaluation is managed at the tool level, not the framework level. Unlike generators (which have a framework-level batch_size), constraints receive all passing proposals in a single call. Each tool handles its own memory internally: ESMFold splits by residue count, Boltz2 processes complexes sequentially, and so on. Users control this through tool-specific config fields (e.g., max_batch_residues for ESMFold) rather than a constraint-level parameter.
Constraints are ordered with cheap filters first (sequence composition checks) and expensive scoring constraints last (structure prediction, binding). The two-pass strategy ensures rejected proposals never trigger GPU evaluations.
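The two-pass strategy can be sketched as follows. This is a simplified illustration, not the real score_energy(): `score_two_pass`, the `(label, predicate)` filter pairs, and the `(weight, batch_fn)` scorer pairs are hypothetical shapes chosen for the example, and the "expensive" scorer is a mock that records its inputs so the skipped work is visible.

```python
import math

def score_two_pass(proposals, filters, scorers):
    """Filters first, then batched scoring on survivors only (illustrative)."""
    energies = [0.0] * len(proposals)
    rejected_by = [None] * len(proposals)
    # Pass 1: cheap filters form an AND gate; the first failure rejects the proposal
    for i, proposal in enumerate(proposals):
        for label, passes in filters:
            if not passes(proposal):
                energies[i], rejected_by[i] = math.inf, label
                break
    # Pass 2: each scorer receives all surviving proposals in a single call
    survivors = [i for i, label in enumerate(rejected_by) if label is None]
    batch = [proposals[i] for i in survivors]
    if batch:
        for weight, score_batch in scorers:
            for i, s in zip(survivors, score_batch(batch)):
                energies[i] += weight * s
    return energies, rejected_by

def gc_fraction(seq):
    return sum(base in "GC" for base in seq) / len(seq)

filters = [
    ("homopolymer", lambda s: len(set(s)) > 1),
    ("gc_content", lambda s: 0.25 <= gc_fraction(s) <= 0.75),
]

expensive_inputs = []  # records what the mock "GPU" scorer was asked to evaluate

def mock_structure_score(batch):
    expensive_inputs.extend(batch)
    return [len(s) / 10 for s in batch]

proposals = ["ATGC", "AAAA", "GGCC", "ATAT"]
energies, rejected_by = score_two_pass(proposals, filters, [(1.0, mock_structure_score)])
# Only "ATGC" reaches the mock scorer; the other three are rejected in pass 1.
```

Passing survivors to each scorer as one batch mirrors the point above that batching and memory limits are left to the tool itself.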

Temperature and Acceptance

Temperature controls the exploration-exploitation trade-off in MCMC. It determines how willing the optimizer is to accept a proposal that is worse than the current best:
Acceptance probability = min(1, exp(-delta_energy / temperature))
High Temperature (T=1.0): accepts most moves, explores broadly, good for escaping local minima.
Low Temperature (T=0.001): only accepts improvements, exploits locally, good for final polishing.
Annealing moves from high to low temperature over num_steps.
MCMC uses exponential annealing: T(step) = T_max × (T_min / T_max)^((step - 1) / (num_steps - 1)), so step 1 is exactly T_max and the final step (step = num_steps) is exactly T_min.
A common pattern is to start with high temperature for exploration, then anneal to low temperature for refinement. For multi-stage optimization, use a Program with explicit temperature stages instead of relying solely on annealing.
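The schedule and the acceptance rule can be checked numerically with a short sketch. The function names `temperature` and `accept` are hypothetical, the defaults T_max = 1.0 and T_min = 0.001 are only the illustrative values from the diagram, and returning T_max when num_steps is 1 is an assumption made to avoid dividing by zero.

```python
import math
import random

def temperature(step, num_steps, t_max=1.0, t_min=0.001):
    """Exponential annealing schedule; `step` is 1-indexed."""
    if num_steps == 1:
        return t_max  # assumption: a single-step run stays at t_max
    return t_max * (t_min / t_max) ** ((step - 1) / (num_steps - 1))

def accept(delta_energy, temp, rng=random):
    """Acceptance rule: min(1, exp(-delta_energy / temperature))."""
    if delta_energy <= 0:
        return True  # improvements are always accepted
    return rng.random() < math.exp(-delta_energy / temp)

num_steps = 100
first = temperature(1, num_steps)          # equals t_max
last = temperature(num_steps, num_steps)   # equals t_min
midpoint = temperature(50, num_steps)      # between the two, on a geometric scale

rng = random.Random(0)
improvement = accept(-0.5, last, rng)      # True: lower energy is always accepted
uphill_cold = accept(0.5, last, rng)       # exp(-500) is about 0, so this is rejected
```

At T = 1.0 a move that raises energy by 0.5 is accepted with probability exp(-0.5), roughly 0.61, while at T = 0.001 the same move has probability exp(-500), which is effectively zero.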

Tool Cache Management

Constraints that call expensive bioinformatics tools (structure prediction, sequence alignment) benefit from caching. The optimizer manages a shared tool cache:
python
# Pass exactly one value for clear_tool_cache; the alternatives are commented out.
MCMCOptimizer(
    ...,
    # Prune cache when it exceeds 100 MB (default)
    clear_tool_cache=100 * 1024 * 1024,

    # Alternative: clear every step (aggressive, for memory-constrained runs)
    # clear_tool_cache=True,

    # Alternative: only clear specific tools
    # clear_tool_cache=["esmfold", "boltz2"],
)

History Tracking

Optimizers record snapshots of their state at configurable intervals for post-hoc analysis:
python
optimizer.run()

# Iterate over history snapshots
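for snapshot in optimizer.history:
    step = snapshot["time_step"]
    # Skip results without a score (energy_score is None)
    energies = [r["energy_score"] for r in snapshot["results"] if r["energy_score"] is not None]
    if not energies:
        continue  # nothing scored in this snapshot; min() and the mean would fail
    print(f"Step {step}: best={min(energies):.4f}, mean={sum(energies)/len(energies):.4f}")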

Next Steps

Programs: Chain multiple optimizers into multi-stage pipelines
Generators: The models that propose candidate sequences
Constraints: The quality checklist optimizers minimize
Optimizer Reference: Full API reference for each optimizer
