
This toolkit is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.
Background
The Promoter Calculator (LaFleur et al., 2022) predicts site-specific σ70 transcription initiation rates from DNA sequence alone. σ70 is the housekeeping sigma factor that recruits E. coli RNA polymerase to the majority of constitutive promoters, and its strength is set by the sequence and geometry of the surrounding promoter elements. The model decomposes a candidate promoter into the UP element, the −35 hexamer, the spacer, the extended and core −10 hexamers, the discriminator, and the initial transcribed region, then fits 346 Ridge-regression coefficients that link these elements to an overall binding free energy ΔG_total. The fitted ΔG_total is mapped to an absolute transcription initiation rate via log(TX / TX_ref) = -β (ΔG_total - ΔG_total_ref), with the fit grounded in massively parallel reporter assay measurements. The released coefficients were trained on 5,193 designed promoters with a single dominant transcription start site and validated against 22,132 diverse bacterial σ70 promoters drawn from multiple datasets.
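The free-energy-to-rate mapping above can be illustrated with a few lines of Python. This is a minimal sketch, not the calculator's own code: it assumes the logarithm is natural, and the function name `tx_rate_from_dg` as well as the numeric values for β, TX_ref, and ΔG_total_ref are placeholders chosen for illustration, not the fitted constants.

```python
import math


def tx_rate_from_dg(dg_total, dg_ref, tx_ref, beta):
    """Invert log(TX / TX_ref) = -beta * (dG_total - dG_total_ref) for TX.

    Assumes a natural logarithm; all arguments are caller-supplied.
    """
    return tx_ref * math.exp(-beta * (dg_total - dg_ref))


# Placeholder constants for illustration only (not the published fit).
BETA = 0.5      # hypothetical slope, in mol/kcal
DG_REF = -2.0   # hypothetical reference free energy, in kcal/mol
TX_REF = 1.0    # hypothetical reference rate, arbitrary units

for dg in (-4.0, -2.0, 0.0):
    rate = tx_rate_from_dg(dg, DG_REF, TX_REF, BETA)
    print(f"dG_total = {dg:+.1f} kcal/mol -> TX = {rate:.3f}")
```

The sketch only shows the direction of the relationship: a more negative ΔG_total (tighter binding) gives a higher predicted rate, and a promoter at the reference free energy reproduces the reference rate.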
The reference Python implementation used here is the barricklab/promoter-calculator fork from the Barrick Lab, which packages the original Salis Lab algorithm for streamlined installation and adds optional multi-threading for the internal transcription-start-site scan.
Learning Resources
- barricklab/promoter-calculator (Barrick Lab) - the Python fork used here, with installation instructions and the command-line surface that the wrapper drives.
- hsalis/SalisLabCode (Promoter_Calculator) (Salis Lab) - the original reference implementation distributed alongside the publication and the source of the trained coefficients.
- salislab.net (Salis Lab) - the lab home page and entry point to the hosted Promoter Calculator web service.
Tools
Salis Lab Promoter Calculator (promoter-calculator)
Scans one or more DNA sequences on both strands for every candidate σ70 transcription start site and returns, per sequence, a list of PromoterPrediction rows. Each row carries the predicted TSS position and strand, the binding free energy dG_total (kcal/mol), the transcription initiation rate Tx_rate (arbitrary units), the DNA spanning the predicted promoter, and the start and end positions of the UP, −35, spacer, −10, and discriminator elements.
API Reference
Input: PromoterCalculatorInput
seq_0, seq_1, …); results are returned in input order.
Config: PromoterCalculatorConfig
True is coerced to 1 and False to 0.
None waits indefinitely.
BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.
Output: PromoterCalculatorOutput
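As a rough illustration of how the per-sequence prediction rows described under Tools might be consumed, the sketch below ranks rows by Tx_rate and flags strong sites away from the intended start site. The rows are modelled as plain dictionaries: `Tx_rate` and `dG_total` are names from this page, while the `TSS` and `strand` keys, the helper names, and all numeric values are hypothetical stand-ins, not the wrapper's actual schema or real predictions.

```python
def rank_by_tx_rate(rows):
    """Return rows sorted from strongest to weakest predicted initiation."""
    return sorted(rows, key=lambda row: row["Tx_rate"], reverse=True)


def flag_cryptic(rows, intended, threshold):
    """Return rows at or above `threshold` that are not the intended site.

    `intended` is a (TSS, strand) tuple for the designed promoter.
    """
    return [
        row for row in rows
        if (row["TSS"], row["strand"]) != intended
        and row["Tx_rate"] >= threshold
    ]


# Made-up rows for one sequence (illustrative values only).
rows = [
    {"TSS": 77, "strand": "+", "dG_total": -0.2, "Tx_rate": 45.0},
    {"TSS": 52, "strand": "+", "dG_total": -3.1, "Tx_rate": 1200.0},
    {"TSS": 140, "strand": "-", "dG_total": -1.4, "Tx_rate": 310.0},
]

ranked = rank_by_tx_rate(rows)
cryptic = flag_cryptic(rows, intended=(52, "+"), threshold=100.0)
print([row["TSS"] for row in ranked])
print([(row["TSS"], row["strand"]) for row in cryptic])
```

The threshold is a design choice for the user: it sets how strong an unintended site must be before it is worth redesigning the construct.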
Applications
Use this to quantify σ70 promoter strength when designing or analysing E. coli expression cassettes. Common workflows include ranking a synthetic library such as the Anderson collection of J231xx variants by predicted Tx_rate to pick a target strength, sweeping a candidate construct to flag unintended cryptic promoters that could drive off-target transcription before ordering DNA, and annotating native intergenic regions across an E. coli genome to estimate baseline σ70 activity. Pair the predicted transcription rate with downstream ribosome binding site strength and plasmid copy number when projecting end-to-end protein expression.
Usage Tips
- Set circular=True for plasmids and bacterial chromosomes. Linear scanning of a circular sequence cannot see candidates that span the wraparound origin. When the input is circular, the calculator examines the junction explicitly and recovers any promoter that straddles it.
- Short inputs need flanking context to score. The element scan needs roughly 20 nucleotides on either side of the promoter region; sequences shorter than the scan window return no predictions. Pad with neutral flanking sequence when scoring a single promoter element in isolation.
- Predictions are calibrated for E. coli σ70 only. Applying the model to other organisms or to alternative sigma factors (σS, σ32, σ54, σ28) is not validated. Treat any cross-organism output as a relative ranking, not as a calibrated rate.
- Tx_rate is transcription initiation, not protein expression. The model captures RNA polymerase binding and open-complex formation. End-to-end expression also depends on the ribosome binding site, mRNA stability, copy number, and growth state, none of which the calculator sees.
- The model does not see transcription factors, attenuation, or anti-σ factors. Repression by a TF bound near the promoter, riboswitch attenuation, and anti-σ sequestration all reduce in vivo output without changing the predicted Tx_rate. Use the prediction as the unregulated upper bound.
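The circular and short-input tips above can be made concrete with a toy window scan. This is a sketch of the general idea only, assuming a fixed-width sliding window; the `windows` helper, the 6-nucleotide width, and the example sequence are invented for illustration and do not reflect how the calculator handles the junction internally.

```python
def windows(seq, width, circular=False):
    """Yield (start, window) for every window of `width` bases.

    In circular mode the sequence is extended by its first width - 1
    bases so that windows spanning the origin are also produced.
    """
    n = len(seq)
    if n < width:
        return  # shorter than the scan window: nothing to yield
    if circular:
        extended = seq + seq[:width - 1]
        starts = range(n)
    else:
        extended = seq
        starts = range(n - width + 1)
    for start in starts:
        yield start, extended[start:start + width]


# Toy plasmid: the motif TATAAT is split across the origin (TAT ... AAT).
plasmid = "AAT" + "C" * 8 + "TAT"
motif = "TATAAT"

linear_hits = [i for i, w in windows(plasmid, 6) if w == motif]
circular_hits = [i for i, w in windows(plasmid, 6, circular=True) if w == motif]
print("linear:", linear_hits)
print("circular:", circular_hits)
```

The linear scan never sees the split motif, while the circular scan reports it at the position where it begins before the origin. An input shorter than the window yields nothing in either mode, which mirrors why short promoter fragments need padding.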
Toolkit Notes
These apply to every Promoter Calculator tool in this toolkit (promoter-calculator).
- Runs on CPU only. The model is a Ridge-regression sum over 346 coefficients that ship with the Python source. No neural network is loaded at inference time, and there is no GPU acceleration to enable.
- Self-contained after install. The standalone setup builds a Python virtual environment and pulls dependencies once; subsequent runs need no further network access and no model-weight downloads.
- threads parallelises the internal TSS scan within a single sequence. Raising it shortens wall-clock time on long inputs but does not change the predictions. Across input sequences the wrapper itself runs sequentially.
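The claim that a threaded scan returns the same predictions as a serial one can be illustrated with a toy example. The sketch below is not the calculator's scan: `toy_score` is a made-up stand-in (it counts A/T bases in a window), and `scan` is a hypothetical helper that uses the standard-library `ThreadPoolExecutor`, whose `map` returns results in input order regardless of which worker finishes first.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(seq, pos, width=6):
    """Stand-in per-position score: number of A/T bases in the window."""
    return sum(base in "AT" for base in seq[pos:pos + width])


def scan(seq, threads=1, width=6):
    """Score every window start, serially or with a thread pool."""
    positions = range(len(seq) - width + 1)
    if threads <= 1:
        return [toy_score(seq, p, width) for p in positions]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves the order of `positions` in its results.
        return list(pool.map(lambda p: toy_score(seq, p, width), positions))


seq = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGC"
serial = scan(seq, threads=1)
threaded = scan(seq, threads=4)
print(serial == threaded)
```

The toy is meant to show result equality, not speed: each position is scored independently, so splitting the positions across workers changes only when each score is computed, not its value.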