Evo1 DNA Language Model
License: Evo1 is open source and free for academic and commercial use under an Apache-2.0 license. Please refer to the license for full terms.

This generator is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


proto-bio/proto-language/proto_language/generator/evo1_generator.py
@article{nguyen2024evo,
  title={Sequence modeling and design from molecular to genome scale with Evo},
  author={Nguyen, Eric and Poli, Michael and Durrant, Matthew G and Kang, Brian and Katrekar, Dhruva and Li, David B and Bartie, Liam J and Thomas, Armin W and King, Samuel H and Brixi, Garyk and Sullivan, Jeremy and Ng, Madelena Y and Lewis, Ashley and Lou, Aaron and Ermon, Stefano and Baccus, Stephen A and Hernandez-Boussard, Tina and R{\'e}, Christopher and Hsu, Patrick D and Hie, Brian L},
  journal={Science},
  volume={386},
  number={6723},
  pages={eado9336},
  year={2024},
  publisher={American Association for the Advancement of Science},
  doi={10.1126/science.ado9336}
}
Sequence generator using the Evo1 genomic language model. Supports multiple checkpoints including CRISPR and transposon fine-tuned variants. The number of tokens to generate is automatically calculated based on the assigned segment’s sequence_length.

API Reference

Config: Evo1GeneratorConfig
Configuration object for Evo1Generator.
prompts
List[string]
required
Prompt sequences for DNA generation (single prompt or multiple)
model_checkpoint
enum
default:"evo-1-8k-base"
Evo1 model variant to load (e.g. evo-1-8k-base). Options: evo-1.5-8k-base, evo-1-8k-base, evo-1-131k-base, evo-1-8k-crispr, evo-1-8k-transposon
top_k
integer
default:"4"
At each step, restrict sampling to the k most probable tokens.
temperature
number
default:"1.0"
Sharpness of the sampling distribution. Below 1 sharpens; above 1 increases diversity. Must be > 0.
prepend_prompt
boolean
default:"False"
Whether to prepend the prompt to the generated sequence
device
string
default:"cuda"
GPU device to run Evo1 on (e.g. ‘cuda’ or ‘cuda:0’).
batch_size
integer
default:"1"
Number of sequences to process simultaneously on GPU
verbose
boolean
default:"False"
Whether to print verbose output
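
The top_k and temperature parameters above describe a standard top-k sampling scheme. The following is a minimal sketch of that scheme, not the Evo1 implementation: the function name `sample_top_k`, the four-letter `DNA_VOCAB`, and the example logits are all hypothetical and exist only for illustration.

```python
import math
import random

# Toy vocabulary for illustration only; the real model's vocabulary is not shown on this page.
DNA_VOCAB = ["A", "C", "G", "T"]


def sample_top_k(logits, top_k=4, temperature=1.0, rng=None):
    """Sample one index from `logits` with top-k filtering and temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0")
    if top_k < 1:
        raise ValueError("top_k must be >= 1")
    rng = rng or random.Random()

    # Keep only the indices of the k largest logits.
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]

    # Divide by temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [logits[i] / temperature for i in kept]

    # Softmax weights, shifted by the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    example_logits = [0.1, 2.0, -1.0, 0.5]  # hypothetical scores for A, C, G, T
    index = sample_top_k(example_logits, top_k=2, temperature=0.8)
    print(DNA_VOCAB[index])
```

With top_k=1 the sketch reduces to greedy decoding, since only the most probable token survives the filter.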

Usage

python
>>> config = Evo1GeneratorConfig(
...     prompts=["ATG"],
...     model_checkpoint="evo-1-8k-crispr",
...     temperature=1.0,
... )
>>> gen = Evo1Generator(config)
>>> segment = Segment(length=1003, sequence_type="dna")
>>> gen.assign(segment)  # prepend_prompt defaults to False, so max_new_tokens = 1003
>>> gen.sample()
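
The comment in the example states that, with prepend_prompt left at False, the token budget equals the segment length. A rough sketch of that arithmetic follows. The helper `max_new_tokens` is hypothetical, and the prepend_prompt=True branch (subtracting the prompt length so the prompt plus the generated tokens fill the segment) is an assumption that this page does not confirm.

```python
def max_new_tokens(segment_length, prompt, prepend_prompt=False):
    """Hypothetical helper: how many tokens to generate for one segment."""
    if not prepend_prompt:
        # Stated in the usage example: the full segment length is generated.
        return segment_length

    # ASSUMPTION: when the prompt is kept in the output, it counts toward the segment length.
    remaining = segment_length - len(prompt)
    if remaining < 0:
        raise ValueError("prompt is longer than the segment")
    return remaining


if __name__ == "__main__":
    print(max_new_tokens(1003, "ATG"))                       # 1003
    print(max_new_tokens(1003, "ATG", prepend_prompt=True))  # 1000, under the assumption above
```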

Metadata

Property                    Value
Key                         evo1
Class                       Evo1Generator
Category                    autoregressive
Input Type                  prompt
Uses GPU                    True
Supported Sequence Types    dna
Allows Empty Start          False