The shared construct
A Segment is the stretch of sequence being designed, and a Construct groups the segments that
make up one molecule. Here a single 20 bp DNA segment (length=20, sequence_type="dna",
label="insert") starts empty, leaving all positions open for the optimizers to fill. Both
stages reference these same insert and construct objects by identity, so the sequences a stage
accepts carry over as the starting point for the next.
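The sharing-by-identity idea can be sketched in plain Python. ToySegment and ToyConstruct below are hypothetical stand-ins written for this illustration, not the library's Segment and Construct classes; only the field names length, sequence_type, and label come from the text above.

```python
from dataclasses import dataclass


@dataclass
class ToySegment:
    """Hypothetical stand-in for a segment; not the library class."""
    length: int
    sequence_type: str
    label: str
    sequence: str = ""  # empty means every position is still open


@dataclass
class ToyConstruct:
    """Hypothetical stand-in that groups segments into one molecule."""
    segments: list

    def joined(self):
        # Concatenate the segment sequences in order.
        return "".join(segment.sequence for segment in self.segments)


insert = ToySegment(length=20, sequence_type="dna", label="insert")
construct = ToyConstruct(segments=[insert])

# Both stages hold a reference to the same object, not a copy.
stage_one = {"construct": construct}
stage_two = {"construct": construct}

# Whatever one stage writes into the segment is what the other stage sees.
insert.sequence = "GC" * 10
```

Because stage_one and stage_two point at one object, no explicit hand-off step is needed in this sketch: the second stage reads whatever the first left behind.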
Stage 1: rejection sampling
RejectionSamplingOptimizer draws independent proposals and keeps the best by lowest energy;
each proposal batch starts fresh with no state carried between draws. gen1 is a
RandomNucleotideGenerator whose MaskingStrategy(num_mutations=10) mutates exactly ten of the
twenty positions per call, and because the segment starts empty its first call fills the initial
random sequence. The single gc_enrich constraint (gc_content_constraint with min_gc=70,
max_gc=100) scores 0 when GC content falls inside that broad window and penalizes deviation
below it. The config draws num_samples=10 proposals total and retains the top num_results=3
by lowest energy, which hand off as stage two’s starting pool.
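The selection logic described above can be sketched without the library. Everything in this block is an assumption made for illustration: the helper names, the energy scale (percentage points outside the window), and the choice to resample mutated positions uniformly. It is not the RejectionSamplingOptimizer implementation.

```python
import random

BASES = "ACGT"


def gc_percent(seq):
    """Percentage of G/C bases in seq, on a 0-100 scale."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc_window_energy(seq, min_gc, max_gc):
    """Return 0 inside [min_gc, max_gc], else the distance to the nearest bound."""
    gc = gc_percent(seq)
    if gc < min_gc:
        return min_gc - gc
    if gc > max_gc:
        return gc - max_gc
    return 0.0


def mutate(seq, num_mutations, rng):
    """Resample num_mutations distinct positions (a resample may redraw the same base)."""
    chars = list(seq)
    for index in rng.sample(range(len(chars)), num_mutations):
        chars[index] = rng.choice(BASES)
    return "".join(chars)


def rejection_sample(start, num_samples, num_results, num_mutations, rng):
    """Draw independent proposals from the same start and keep the lowest-energy ones."""
    proposals = [mutate(start, num_mutations, rng) for _ in range(num_samples)]
    proposals.sort(key=lambda seq: gc_window_energy(seq, 70, 100))
    return proposals[:num_results]


rng = random.Random(0)
# The segment starts empty, so the first fill is a fully random 20-mer.
start = "".join(rng.choice(BASES) for _ in range(20))
pool = rejection_sample(start, num_samples=10, num_results=3, num_mutations=10, rng=rng)
```

Every proposal is generated from the same starting sequence, which is what makes the draws independent: nothing learned from one proposal influences the next.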
Stage 2: MCMC refinement
MCMCOptimizer runs Metropolis-Hastings with simulated annealing: at each step it mutates the
current sequence, scores the proposals, and accepts improvements outright while accepting worse
proposals with probability exp(-dE / T) as the temperature anneals down from max_temperature.
Here gen2 uses MaskingStrategy(num_mutations=1), so each move flips a single base, and the
gc_refine constraint (min_gc=80, max_gc=90) rewards the tighter window. The config runs one
trajectory (num_results=1) for num_steps=10 steps, drawing proposals_per_result=20 proposals
per step and keeping the best by energy before the accept/reject decision, with
max_temperature=2.0 as the starting temperature.
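A minimal sketch of the accept/reject rule, again in plain Python rather than the library's API. The linear cooling schedule, the helper names, the energy scale, and the starting sequence are assumptions; the text only states that the temperature anneals down from max_temperature.

```python
import math
import random

BASES = "ACGT"


def gc_window_energy(seq, min_gc=80.0, max_gc=90.0):
    """Return 0 inside the GC window, else percentage points outside it."""
    gc = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    return max(min_gc - gc, gc - max_gc, 0.0)


def point_mutation(seq, rng):
    """Resample one position (the draw may return the same base)."""
    index = rng.randrange(len(seq))
    return seq[:index] + rng.choice(BASES) + seq[index + 1:]


def metropolis_accept(d_energy, temperature, rng):
    """Accept improvements outright, worse moves with probability exp(-dE / T)."""
    if d_energy <= 0:
        return True
    return rng.random() < math.exp(-d_energy / temperature)


def anneal(seq, num_steps=10, proposals_per_step=20, max_temperature=2.0, seed=0):
    """Run one trajectory, taking the best of several proposals at each step."""
    rng = random.Random(seed)
    energy = gc_window_energy(seq)
    for step in range(num_steps):
        # Assumed schedule: linear decay that stays above zero on the last step.
        temperature = max_temperature * (1 - step / num_steps)
        candidates = [point_mutation(seq, rng) for _ in range(proposals_per_step)]
        best = min(candidates, key=gc_window_energy)
        best_energy = gc_window_energy(best)
        if metropolis_accept(best_energy - energy, temperature, rng):
            seq, energy = best, best_energy
    return seq, energy


# Illustrative 70% GC starting point, chosen by hand for this sketch.
refined, refined_energy = anneal("GCGCGCGCGCGCGCATATAT")
```

Early steps run at a higher temperature and tolerate uphill moves more often; as the temperature falls, the rule behaves more like greedy descent.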
Run both stages
The Program runs its optimizers in the order listed, stage1 then stage2. Because both share
the same construct, the three high-GC candidates rejection sampling retains seed the MCMC
trajectory, so the final design reflects both passes: enriched by the first, refined by the
second.
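The hand-off between stages can be pictured as a loop that feeds each stage's output into the next. This is a toy illustration with deterministic stand-in stages and hand-picked sequences, not how Program is implemented; run_stages, enrich, and refine are hypothetical names.

```python
def gc_percent(seq):
    """Percentage of G/C bases in seq, on a 0-100 scale."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def enrich(pool):
    """Stand-in for stage one: keep the three highest-GC candidates."""
    return sorted(pool, key=gc_percent, reverse=True)[:3]


def refine(pool):
    """Stand-in for stage two: keep the candidate nearest the middle of 80-90% GC."""
    return [min(pool, key=lambda seq: abs(gc_percent(seq) - 85.0))]


def run_stages(stages, pool):
    """Run the stages in listed order, passing each result to the next stage."""
    for stage in stages:
        pool = stage(pool)
    return pool


# Hand-picked 10-mers with 0%, 40%, 70%, 90%, and 100% GC.
candidates = ["ATATATATAT", "GCGCATATAT", "GCGCGCGATA", "GCGCGCGCGA", "GGGGGGGGGG"]
final = run_stages([enrich, refine], candidates)
```

Reversing the stage order would change the result, which is why the order in which the optimizers are listed matters.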
Inspect the result
The final design is the construct’s joined sequence. Per-segment results live under
metadata["segments"][<label>], and each constraint’s diagnostics sit under the constraints
entry keyed by the label set above. Reading gc_content back out from the stage-two
gc_refine entry confirms the design lands inside the 80-90% GC window.
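The lookup path described above can be sketched against a hand-built dictionary. The exact nesting, the percent scale for gc_content, and the example sequence are assumptions made for illustration; the real metadata comes from the run itself and may carry additional keys.

```python
def gc_percent(seq):
    """Percentage of G/C bases in seq, on a 0-100 scale."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Illustrative sequence with 17 of 20 G/C bases; not an actual program output.
design = "GCGGCCGCGGACGCCGGCTA"

# Assumed layout: segments -> segment label -> constraints -> constraint label.
metadata = {
    "segments": {
        "insert": {
            "constraints": {
                "gc_refine": {"gc_content": gc_percent(design)},
            },
        },
    },
}

gc = metadata["segments"]["insert"]["constraints"]["gc_refine"]["gc_content"]
in_window = 80.0 <= gc <= 90.0
```

Keying the diagnostics by constraint label is what lets the stage-two gc_refine reading be told apart from the stage-one gc_enrich reading.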
Next Steps
Using Optimizers
Run and chain optimizers.
DNA Sequence Optimization
The single-stage version of this program.