Overview
proto-tools provides standardized Python wrappers around 60+ bioinformatics tools: structure predictors, sequence scorers, gene annotators, alignment engines, and more. Every tool follows the same Input / Config / Output pattern, so learning one tool transfers to the rest.
The Input / Config / Output Pattern
Every tool follows a three-part pattern using Pydantic models:
Input: What to analyze. The primary data: sequences, structures, files.
Config: How to analyze it. Parameters and settings. Always optional; sensible defaults are built in.
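The three-part pattern can be sketched with plain Pydantic models. Everything named here (FoldInput, FoldConfig, FoldOutput, run_fold, and their fields) is hypothetical and chosen only for illustration; none of it is proto-tools' actual API, and the function body is a placeholder rather than a real predictor.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class FoldInput(BaseModel):
    """Input: what to analyze (hypothetical model)."""
    sequences: List[str]


class FoldConfig(BaseModel):
    """Config: how to analyze it. Every field has a default, so it is optional."""
    num_recycles: int = Field(default=3, ge=1)  # hypothetical parameter
    seed: int = 0


class FoldOutput(BaseModel):
    """Output: what the tool returns (hypothetical model)."""
    lengths: List[int]


def run_fold(inputs: FoldInput, config: Optional[FoldConfig] = None) -> FoldOutput:
    """Placeholder tool: reports sequence lengths instead of predicting structures."""
    config = config or FoldConfig()  # fall back to the built-in defaults
    return FoldOutput(lengths=[len(seq) for seq in inputs.sequences])


result = run_fold(FoldInput(sequences=["MKTAYIAK", "GSHM"]))
print(result.lengths)  # [8, 4]
```

Because the config argument can be omitted, the simplest call needs only an input model, while an invalid setting such as num_recycles=0 is rejected by Pydantic validation before any work starts.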
Tool Categories
Structure Prediction
Predict 3D structures from sequences.
AlphaFold2, AlphaFold3, Boltz2, Chai1, ESMFold, ESMFold2, Protenix, ViennaRNA
Structure Design
Generate novel protein backbone structures.
RFDiffusion3
Structure Dynamics
Sample conformational ensembles.
BioEmu
Inverse Folding
Design sequences for target structures.
ProteinMPNN, LigandMPNN, FAMPNN, ESM-IF1
Masked Models
Protein language models for scoring and sampling.
ESM2, ESM3, ESMC, AbLang
Causal Models
Autoregressive models for generation and scoring.
Evo1, Evo2, ProGen2, ProGen3
Sequence Scoring
Predict functional effects from genomic sequences.
Enformer, Borzoi, AlphaGenome, Malinois, Puffin, Segmasker
Gene Annotation
Annotate sequences and find functional elements.
PyHMMER, CRISPRtracrRNA, MinCED, Promoter Calculator
Sequence Alignment
Align sequences and search databases for homologs.
BLAST, MMseqs2, MAFFT, ColabFold Search
ORF Prediction
Find open reading frames in DNA.
Orfipy, Prodigal
RNA Splicing
Predict splice sites and specificity.
SpliceTransformer, Pangolin, SpliceAI
Structure Alignment
Align and compare 3D structures.
TMAlign, USAlign, Foldseek, FoldMason, PyMOL RMSD
Database Retrieval
Fetch sequences and structures from public databases.
UniProt, PDB, NCBI, SequenceFetch
Structure Scoring
Score and analyze 3D structure quality.
DSSP, IPSAE, pDockQ2, PyRosetta, Structure Metrics
Binder Design
De novo antibody and binder design pipelines.
BindCraft, Germinal
Mutagenesis
Random sequence mutagenesis.
Random Protein, Random Nucleotide
GPU vs CPU Tools
GPU Tools
Deep learning models that require NVIDIA GPUs. Faster, but they need specific hardware.
- Structure Prediction: AlphaFold3, Boltz2, Chai1, ESMFold, Protenix
- Inverse Folding: ProteinMPNN, LigandMPNN, FAMPNN
- Language Models: ESM2, ESM3, ESMC, AbLang, Evo1, Evo2, ProGen2, ProGen3
- Sequence Scoring: Enformer, Borzoi, AlphaGenome
- Structure Design: RFDiffusion3
- Structure Dynamics: BioEmu
- RNA Splicing: SpliceTransformer
CPU Tools
Classical bioinformatics algorithms and binary tools. Run anywhere.
- Gene Annotation: PyHMMER, CRISPRtracrRNA, MinCED, Promoter Calculator
- Sequence Alignment: BLAST, MMseqs2, MAFFT, ColabFold Search
- ORF Prediction: Orfipy, Prodigal
- Structure Prediction: ViennaRNA (RNA only)
- Structure Alignment: TMAlign, USAlign
- Database Retrieval: UniProt, PDB, NCBI
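Before choosing between the two lists, it can help to check whether the machine exposes an NVIDIA GPU at all. The helper below is a rough heuristic and not part of proto-tools: the name has_nvidia_gpu is hypothetical, and it only checks that the nvidia-smi utility is installed and lists at least one device.

```python
import shutil
import subprocess


def has_nvidia_gpu() -> bool:
    """Heuristic check: True if nvidia-smi is on PATH and lists at least one GPU."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return False  # driver utility not installed, so assume CPU only
    try:
        # "nvidia-smi -L" prints one line per device, e.g. "GPU 0: ..."
        result = subprocess.run(
            [smi, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU tools available" if has_nvidia_gpu() else "CPU tools only")
```

A positive result does not guarantee that a given model fits in GPU memory; it only indicates that GPU tools are worth attempting.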
Environment Isolation
Some tools have complex or conflicting dependencies. These tools use isolated virtual environments managed by ToolInstance:
- Each tool with isolated dependencies has a standalone/ directory with setup.sh and run.py
- Virtual environments are created automatically on first use
- Execution is handled transparently; you call the same run_tool() API
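The general idea of creating an environment on first use and then running a script inside it can be sketched with the standard library. This is an assumption-laden illustration, not how ToolInstance is implemented: ensure_env and run_isolated are hypothetical helpers, the .venv location is invented, and venv.create stands in for whatever a tool's setup.sh actually does.

```python
import subprocess
import sys
import venv
from pathlib import Path


def ensure_env(env_dir: Path) -> Path:
    """Create the virtual environment if it is missing; return its interpreter path."""
    if sys.platform == "win32":
        python = env_dir / "Scripts" / "python.exe"
    else:
        python = env_dir / "bin" / "python"
    if not python.exists():
        # First use: build the environment (stand-in for running setup.sh).
        venv.create(env_dir, with_pip=False)
    return python


def run_isolated(tool_dir: Path, payload: str) -> str:
    """Run tool_dir/standalone/run.py inside the tool's own environment.

    The payload is passed on stdin and the script's stdout is returned.
    """
    python = ensure_env(tool_dir / ".venv")  # hypothetical environment location
    result = subprocess.run(
        [str(python), str(tool_dir / "standalone" / "run.py")],
        input=payload,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script fails
    )
    return result.stdout
```

Calling run_isolated twice on the same directory builds the environment only once, which mirrors the create-on-first-use behaviour described in the list above.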
Tool Registry
All tools are registered via the @tool() decorator, enabling automatic discovery and schema generation:
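A decorator-based registry of this kind can be sketched in a few lines. The code below is a generic illustration and not the proto-tools implementation: the tool decorator's signature, TOOL_REGISTRY, and describe are all assumptions, and the "schema" here is just the function's type hints rather than a Pydantic schema.

```python
from typing import Callable, Dict, get_type_hints

# Hypothetical global registry mapping tool names to callables.
TOOL_REGISTRY: Dict[str, Callable] = {}


def tool(name: str) -> Callable[[Callable], Callable]:
    """Hypothetical decorator: record the function under `name` for later discovery."""
    def register(func: Callable) -> Callable:
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool {name!r} is already registered")
        TOOL_REGISTRY[name] = func
        return func  # return the function unchanged so it stays directly callable
    return register


@tool("reverse-complement")
def reverse_complement(sequence: str) -> str:
    """Toy tool: reverse-complement an uppercase DNA sequence."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sequence))


def describe(name: str) -> Dict[str, str]:
    """Derive a minimal schema (parameter name to type name) from type hints."""
    hints = get_type_hints(TOOL_REGISTRY[name])
    return {param: hint.__name__ for param, hint in hints.items()}


print(sorted(TOOL_REGISTRY))                         # ['reverse-complement']
print(TOOL_REGISTRY["reverse-complement"]("ATGC"))   # GCAT
print(describe("reverse-complement"))                # {'sequence': 'str', 'return': 'str'}
```

Registering at import time means discovery needs no separate manifest: importing a module that defines decorated functions is enough to populate the registry.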
Next Steps
Entities
Structure and Ligand data objects used by tools
Quickstart
Run your first tool in 5 minutes
Tool Persistence
Batch workloads with persistent tool instances
Device Management
GPU allocation and multi-device execution