Installation
Proto requires Python 3.10+ and runs on Linux and macOS. The package installs with pip and pulls in the proto-tools execution layer automatically, so there is no conda environment to create and no submodule to check out.
Setup
Install the package
bash
This pulls in the proto-tools execution layer automatically. The system build tools that standalone tool environments need (git, curl, gcc, make, cmake) are provisioned on first use through proto-tools’ shared foundation environment, so there is nothing else to install.
A direct PyPI install (pip install proto-language) is planned. Until proto-tools is published to PyPI, the GitHub installation above is the supported path.
Configure storage (optional)
All persistent data (model weights, tool environments, and the micromamba binary) lives under PROTO_HOME, which defaults to ~/.proto/ and is inherited from proto-tools. To move it elsewhere (recommended for lab and HPC environments), set it in your shell profile:
bash
To override only the model-weights location, set PROTO_MODEL_CACHE, for example export PROTO_MODEL_CACHE=/path/to/shared/weights.
Gated model access (optional)
Some generators and constraints load gated models (for example ESM3, AlphaGenome, and AlphaFold3) that require accepting a license and authenticating with Hugging Face. After accepting each model’s terms, export your token:
bash
See the proto-tools installation guide for the full procedure and the list of gated models.
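The optional settings from the two sections above can be sanity-checked from Python before running anything heavy. The sketch below only reads environment variables; it does not reproduce how proto-tools resolves paths internally. PROTO_HOME, its ~/.proto/ default, and PROTO_MODEL_CACHE come from this page. The token variable name HF_TOKEN is an assumption (it is the variable the huggingface_hub library reads), and storage_report is a hypothetical helper name.

```python
import os
from pathlib import Path


def storage_report(env=None):
    """Summarize the optional storage and token settings from an environment mapping."""
    env = os.environ if env is None else env
    # PROTO_HOME defaults to ~/.proto/ when unset, as described in this guide.
    proto_home = Path(env.get("PROTO_HOME") or "~/.proto").expanduser()
    # PROTO_MODEL_CACHE overrides only the model-weights location.
    cache = env.get("PROTO_MODEL_CACHE")
    return {
        "proto_home": proto_home,
        "model_cache_override": Path(cache).expanduser() if cache else None,
        # Assumption: the Hugging Face token is exported as HF_TOKEN.
        # Only its presence is reported, never its value.
        "hf_token_set": bool(env.get("HF_TOKEN")),
    }


if __name__ == "__main__":
    for key, value in storage_report().items():
        print(f"{key}: {value}")
```

A model_cache_override of None means the weights location has not been overridden and follows PROTO_HOME.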
Verify Installation
Run this script to confirm everything is working:
python
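A lighter check that does not import Proto at all can be sketched with the standard library. It asks pip's metadata whether the two distributions named on this page are installed. The name proto-tools matches the pip show command used under Troubleshooting; proto-language is taken from the planned PyPI command, and it is an assumption that the GitHub installation registers under the same name. installed_versions is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(names=("proto-language", "proto-tools")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this interpreter.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

This confirms only that the packages are present; it does not exercise any generator, constraint, or tool environment.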
Developers
Contributors install editable checkouts of both layers from the proto-tools submodule:
bash
This replaces the installed proto-tools with the local submodule, so edits within proto-tools/ take effect immediately. System build tools are still provisioned automatically through the foundation environment.
Install Options
| Extra | What’s Included | Use Case |
|---|---|---|
| (none) | Core framework only | Writing and running optimization programs |
| dev | pytest (+asyncio, cov, forked, randomly), ruff, mypy, docstring-parser | Testing and linting |
Prerequisites
- Python 3.10+
- pip
- Git (only for the editable developer install)
- NVIDIA GPU with CUDA 12.1+ (optional, for ML-based tools)
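The interpreter and platform requirements above can be checked programmatically. This is a minimal sketch using only the standard library; check_prerequisites is a hypothetical helper, and the git lookup matters only for the editable developer install.

```python
import platform
import shutil
import sys


def check_prerequisites(version_info=None, system=None):
    """Return a list of problems; an empty list means the checks passed."""
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system
    problems = []
    if tuple(version_info[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {version_info[0]}.{version_info[1]}"
        )
    # platform.system() reports "Linux" on Linux and "Darwin" on macOS.
    if system not in ("Linux", "Darwin"):
        problems.append(f"Unsupported platform: {system}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("PROBLEM:", problem)
    # git is needed only for the editable developer install.
    print("git on PATH:", shutil.which("git") is not None)
```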
GPU Requirements
ML-based generators and constraints run substantially faster on a GPU; the table below lists per-tool speedups and VRAM requirements.
| Tool | Purpose | GPU Benefit | VRAM Required |
|---|---|---|---|
| ESMFold | Protein structure prediction | ~10x faster | 8-16 GB |
| ESM2 / ESM3 | Protein language model generation | Required for large batches | 4-16 GB |
| Evo2 | DNA generation | Required | 16+ GB |
| ProteinMPNN | Structure-conditioned protein design | ~5x faster | 4-8 GB |
| Boltz2 / Chai1 | Multi-chain structure prediction | ~20x faster | 16-24 GB |
| AlphaFold3 | Structure prediction | ~20x faster | 16-24 GB |
| Enformer / Borzoi | Genomic expression prediction | ~50x faster | 8-16 GB |
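To compare a machine against the VRAM column, the total memory per GPU can be read from nvidia-smi. The sketch below assumes the NVIDIA driver utilities are installed and uses nvidia-smi's CSV query mode, which reports memory in MiB when units are suppressed. parse_gpu_memory and query_gpus are hypothetical helper names, not part of Proto.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'name, memory.total' rows (MiB, no header, no units) into (name, GiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if "," not in line:
            continue
        # Split on the last comma so the name is kept whole.
        name, total_mib = (field.strip() for field in line.rsplit(",", 1))
        try:
            gpus.append((name, int(total_mib) / 1024))
        except ValueError:
            # Some devices report a non-numeric value; skip those rows.
            continue
    return gpus


def query_gpus():
    """Return (name, GiB) pairs from nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_gpu_memory(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for name, gib in query_gpus() or [("no NVIDIA GPU detected", 0.0)]:
        print(f"{name}: {gib:.1f} GiB")
```

Total memory is an upper bound; other processes on a shared node reduce what is actually available.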
Troubleshooting
CUDA out of memory
Reduce the number of proposals maintained by the optimizer:
python
You can also use a smaller model checkpoint for ESM2:
python
Import errors for ML models
A ModuleNotFoundError for packages like esm, torch, or boltz in your main interpreter is expected. ML-based generators and constraints (ESM2, ESMFold, ProteinMPNN, and others) get their dependencies, including PyTorch, from proto-tools’ isolated tool environments, not from the main package.
proto-tools builds each tool’s environment on first use and provisions the required system build tools through the shared foundation environment. If a tool fails to find its environment, confirm that proto-tools is installed with pip show proto-tools; the GitHub installation above pulls it in automatically.
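To see this for yourself, the sketch below reports which of those modules the main interpreter can locate, without importing any of them. It uses only importlib from the standard library; importable_in_main_interpreter is a hypothetical helper name. A result of False for esm, torch, or boltz is the expected state, not a broken install.

```python
import importlib.util


def importable_in_main_interpreter(module_names=("esm", "torch", "boltz")):
    """Report which top-level modules this interpreter can find, without importing them."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    for name, found in importable_in_main_interpreter().items():
        status = "found" if found else "absent (expected; tools use isolated environments)"
        print(f"{name}: {status}")
```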
flash-attn installation fails
flash-attn requires CUDA toolkit headers. Ensure you have CUDA 12.1+ installed on your system, then retry:
bash
If it still fails, you can skip flash-attn; it is an optional performance optimization for attention-heavy models, not a hard requirement.
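Before retrying, it can help to confirm that the CUDA toolkit, and not just the driver, is visible. The sketch below assumes the toolkit's nvcc compiler is on PATH and parses the "release X.Y" string that nvcc --version prints. parse_nvcc_release and cuda_toolkit_ok are hypothetical helper names.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_output):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_toolkit_ok(minimum=(12, 1)):
    """Return True if nvcc is on PATH and reports at least the minimum release."""
    if shutil.which("nvcc") is None:
        return False
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    release = parse_nvcc_release(result.stdout)
    return release is not None and release >= minimum


if __name__ == "__main__":
    print("CUDA toolkit 12.1+ available:", cuda_toolkit_ok())
```

A False result on a machine where nvidia-smi works usually means only the driver is installed, or that the toolkit's bin directory is not on PATH.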
Next Steps
Quickstart
Design a first sequence
Core Concepts
Understand segments, constructs, generators, constraints, and optimizers