Rejection Sampling Optimizer

This optimizer is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.

Source

proto-bio/proto-language/proto_language/optimizer/rejection_sampling_optimizer.py

Rejection Sampling optimizer for sequence optimization through extensive sampling. It generates many proposal sequences and keeps only the best num_results, ranked by lowest energy score. Unlike iterative optimizers (MCMC, beam search), each proposal batch starts fresh from the captured result state (the prior-stage results, or the original sequences on the first stage), so no state is carried between rounds. Each proposal batch:

Resets proposals to the captured result state
Applies all generators sequentially
Evaluates proposals with constraints
Updates the sorted results list if any proposals are better than the current worst
Reports each proposal as its own history iteration
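
The per-batch steps above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `run_batch`, the `Generator` signature, `point_mutation`, and `gc_energy` are hypothetical stand-ins, and sequences are modeled as plain strings.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical generator signature: takes a sequence and an RNG, returns a new sequence.
Generator = Callable[[str, random.Random], str]


def run_batch(
    start_state: List[str],
    generators: List[Generator],
    energy: Callable[[str], float],
    rng: random.Random,
) -> List[Tuple[float, str]]:
    """Score one batch of proposals, each derived from the same start state."""
    scored = []
    for seq in start_state:        # every proposal restarts from the captured state
        proposal = seq
        for gen in generators:     # generators are applied in order
            proposal = gen(proposal, rng)
        scored.append((energy(proposal), proposal))  # constraint evaluation
    return scored


def point_mutation(seq: str, rng: random.Random) -> str:
    """Toy generator: replace one position with a random nucleotide."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice("ACGT") + seq[i + 1:]


def gc_energy(seq: str) -> float:
    """Toy constraint: distance of the GC fraction from 0.5 (lower is better)."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    return abs(gc - 0.5)


rng = random.Random(0)
start = ["AAAAAAAA", "ACGTACGT"]
batch = run_batch(start, [point_mutation], gc_energy, rng)
```

Because `run_batch` only reads `start_state`, calling it again yields a fresh, independent batch from the same starting point.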

If energy_threshold is set, the optimizer stops early once all best proposals have energy below the threshold.

If filter constraints reject many proposals, the optimizer may return fewer than num_results valid results.

How It Works

Rejection sampling draws many independent proposals from the same starting point and keeps the num_results with the lowest energy, carrying no state between draws. Each batch resets the proposal pool to the captured starting state (the prior-stage results, or the original sequences on the first stage), generates B new proposals, scores them, and folds survivors into a buffer of the K = num_results lowest-energy sequences kept in sorted order:

keep S = the K proposals with smallest energy E
insert x  ⇔  |S| < K  or  E(x) < max { E(s) : s ∈ S }        (bisect into sorted S, drop the worst if |S| > K)
stop early  ⇔  every s ∈ S has E(s) < energy_threshold

Batches repeat until num_samples proposals have been drawn (or the threshold is met). Because each batch restarts from the same initial pool, the search keeps no per-step state and is embarrassingly parallel.
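
As a rough illustration of the buffer update described above, the following sketch keeps the K lowest energies in a sorted list using the standard `bisect` module. The names `TopK`, `offer`, and `should_stop` are hypothetical and not taken from the library. Requiring a full buffer before stopping early is also an assumption made here.

```python
import bisect
from typing import List, Optional, Tuple


class TopK:
    """Sorted buffer of the K lowest-energy (energy, sequence) pairs."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.items: List[Tuple[float, str]] = []  # ascending by energy

    def offer(self, energy: float, seq: str) -> bool:
        """Insert if there is room or the proposal beats the current worst."""
        if len(self.items) >= self.k and energy >= self.items[-1][0]:
            return False                  # no better than the worst retained energy
        bisect.insort(self.items, (energy, seq))
        if len(self.items) > self.k:
            self.items.pop()              # drop the worst
        return True

    def should_stop(self, threshold: Optional[float]) -> bool:
        """True when the buffer is full and every retained energy is below threshold."""
        if threshold is None or len(self.items) < self.k:
            return False
        return self.items[-1][0] < threshold


buf = TopK(k=2)
accepted = [buf.offer(e, s) for e, s in [(0.9, "a"), (0.4, "b"), (0.7, "c"), (0.8, "d")]]
```

Since the list is sorted, the worst retained energy is always the last element, so both the acceptance test and the early-stop test are a single comparison.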

API Reference

Config: RejectionSamplingOptimizerConfig

Configuration object for RejectionSamplingOptimizer. The Rejection Sampling optimizer generates or receives proposal sequences and keeps only the best num_results by lowest energy score. It processes generated proposals in internal batches and reports each proposal as its own history iteration.

If filter constraints reject many proposals (returning inf/nan energies), the optimizer may return fewer than num_results valid results.
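
To see why the result list can come up short, consider a sketch in which rejected proposals carry `inf` or `nan` energies. The helper `keep_valid` is hypothetical and only illustrates the filtering described above.

```python
import math
from typing import List, Tuple


def keep_valid(scored: List[Tuple[float, str]], num_results: int) -> List[Tuple[float, str]]:
    """Drop proposals with non-finite energy, then keep the num_results lowest."""
    valid = [(e, s) for e, s in scored if math.isfinite(e)]
    return sorted(valid)[:num_results]


scored = [
    (0.3, "ACGT"),
    (float("inf"), "AAAA"),   # rejected by a filter constraint
    (float("nan"), "CCCC"),   # rejected by a filter constraint
    (0.1, "GGCC"),
]
top = keep_valid(scored, num_results=3)
```

Here three results were requested, but only two proposals have finite energy, so `top` holds two entries.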

num_samples

integer

required

Number of proposals to generate. In existing_results mode, the maximum number of upstream candidates considered.

num_results

integer

Number of top-scoring candidate designs to retain (lowest energy first). Overrides program count.

proposal_source

enum

default:"generated"

Use generated proposals, or rank existing upstream result candidates. Options: generated, existing_results

proposal_batch_size

integer

Proposals scored per internal batch. Inferred from component batch sizes if omitted.
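
One plausible way to split num_samples into internal batches is sketched below. The function `batch_sizes` is hypothetical, and whether the library shortens the final batch in this way is an assumption.

```python
from typing import Iterator


def batch_sizes(num_samples: int, proposal_batch_size: int) -> Iterator[int]:
    """Yield batch sizes that sum to num_samples; the last batch may be smaller."""
    if proposal_batch_size <= 0:
        raise ValueError("proposal_batch_size must be positive")
    remaining = num_samples
    while remaining > 0:
        size = min(proposal_batch_size, remaining)
        yield size
        remaining -= size


sizes = list(batch_sizes(num_samples=100, proposal_batch_size=32))
```

With these values the sketch produces three full batches of 32 and a final batch of 4.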

energy_threshold

number

Optional early-stopping threshold (lower energy is better). The optimizer stops once every retained candidate has energy below this value.

seed

integer

Random seed for reproducible optimization, generator, and constraint tool streams.
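
A common pattern for seeding several random streams from one value is to derive a child generator per component. The sketch below uses Python's standard `random` module. `make_streams` and the stream names are hypothetical, and this is not necessarily how the library derives its streams.

```python
import random
from typing import Dict, Tuple


def make_streams(
    seed: int,
    names: Tuple[str, ...] = ("optimizer", "generator", "constraint"),
) -> Dict[str, random.Random]:
    """Derive one independent, reproducible RNG per named component."""
    root = random.Random(seed)
    # Each child is seeded from the root stream, so the same seed
    # always yields the same set of child generators.
    return {name: random.Random(root.getrandbits(64)) for name in names}


a = make_streams(42)
b = make_streams(42)
draws_a = [a["generator"].random() for _ in range(3)]
draws_b = [b["generator"].random() for _ in range(3)]
```

Two runs with the same seed produce identical draws per stream, while drawing from one stream does not advance the others.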

tracking_interval

integer

default:"1"

Save history and log progress every N steps. Step 0 and the final step are always saved.
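
The saving rule can be expressed as a small predicate. This is a sketch under the assumption that steps are numbered from 0 and that "every N steps" means steps divisible by N. `is_tracked` is a hypothetical helper.

```python
def is_tracked(step: int, last_step: int, tracking_interval: int = 1) -> bool:
    """Return True if history would be saved at this step."""
    return step == 0 or step == last_step or step % tracking_interval == 0


saved = [s for s in range(11) if is_tracked(s, last_step=10, tracking_interval=4)]
```

For steps 0 through 10 with an interval of 4, the saved steps are 0, 4, 8, and the final step 10.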

track_proposals

boolean

default:"False"

Save granular per-proposal results (accept/reject) in history snapshots.

verbose

boolean

default:"False"

Emit per-step debug information about proposals, scores, and acceptance through the logger.

Usage

```python
>>> config = RejectionSamplingOptimizerConfig(num_samples=100, num_results=10)
>>> optimizer = RejectionSamplingOptimizer(
...     constructs=constructs, generators=[mutation_gen], constraints=[gc_constraint], config=config
... )
>>> optimizer.run()
>>> best_sequences = optimizer.constructs[0].segments[0].result_sequences
```

With early stopping:

```python
>>> config = RejectionSamplingOptimizerConfig(
...     num_samples=1000,
...     num_results=10,
...     energy_threshold=0.5,  # Stop when all top-10 have energy < 0.5
... )
```

Metadata

Property	Value
Key	`rejection-sampling`
Class	`RejectionSamplingOptimizer`
Targets Single Segment	`False`
Uses GPU	`False`
