ESM2 Perplexity
License: ESM2 is open source and free for academic and commercial use under an MIT license. Please refer to the license for full terms.

This constraint is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


Source: proto-bio/proto-language/proto_language/constraint/sequence_scoring/esm2_perplexity_constraint.py
@article{lin2023esm2,
  title={Evolutionary-scale prediction of atomic-level protein structure with a language model},
  author={Lin, Zeming and Akin, Halil and Rao, Roshan and Hie, Brian and Zhu, Zhongkai and Lu, Wenting and Smetanin, Nikita and Verkuil, Robert and Kabeli, Ori and Shmueli, Yaniv and others},
  journal={Science},
  volume={379},
  number={6637},
  pages={1123--1130},
  year={2023},
  publisher={American Association for the Advancement of Science},
  doi={10.1126/science.ade2574}
}
Forward ESM2 protein scoring via masked pseudo-log-likelihood.
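
As a rough illustration of masked pseudo-log-likelihood scoring, the sketch below masks one position at a time, asks a model for the log-probability of the true residue at that position, and averages the negative log-likelihoods. The `toy_log_prob` function is a hypothetical stand-in that returns a uniform distribution over the 20 canonical amino acids; it is not ESM2 and not part of proto_language. Perplexity is taken here to be the exponential of the mean NLL, which is the usual definition and is assumed rather than confirmed for this constraint.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # canonical 20-AA alphabet


def toy_log_prob(sequence, position):
    """Hypothetical stand-in for a masked language model.

    Returns log p(true residue | all other residues). A real model would
    condition on the unmasked context; this toy ignores it and is uniform.
    """
    assert sequence[position] in AMINO_ACIDS
    return math.log(1.0 / len(AMINO_ACIDS))


def mean_masked_nll(sequence, log_prob_fn=toy_log_prob):
    """Mask one position at a time and average the negative log-likelihoods."""
    total = 0.0
    for position in range(len(sequence)):
        total -= log_prob_fn(sequence, position)
    return total / len(sequence)


def score(sequence, score_mode="nll"):
    """Return mean NLL, or exp(mean NLL) when score_mode is 'ppl'."""
    nll = mean_masked_nll(sequence)
    if score_mode == "nll":
        return nll
    if score_mode == "ppl":
        return math.exp(nll)
    raise ValueError(f"unknown score_mode: {score_mode!r}")
```

With the uniform toy model, every sequence has a mean NLL of ln 20 and a perplexity of 20. A trained model would give lower values to sequences it finds more plausible.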

API Reference

Config: ESM2PerplexityConfig
Configuration for ESM2 perplexity scoring (forward + gradient).
model_checkpoint
enum
default:"esm2_t33_650M_UR50D"
ESM2 model checkpoint to use. Options: esm2_t6_8M_UR50D, esm2_t12_35M_UR50D, esm2_t30_150M_UR50D, esm2_t33_650M_UR50D, esm2_t36_3B_UR50D, esm2_t48_15B_UR50D
temperature
number
required
Softmax temperature for ESM2 (fixed for this constraint, not varied per optimizer step).
use_ste
boolean
default:"True"
Hard one-hot forward pass with soft-probability gradients.
device
string
default:"cuda"
Device for ESM2 execution, e.g. ‘cuda’ or ‘cuda:0’.
batch_size
integer
Number of amino-acid positions scored per ESM2 forward pass. Lower it if you run out of memory; raise it for throughput.
score_mode
enum
default:"nll"
Return raw mean NLL by default, or ESM2 perplexity when set to ‘ppl’. Options: nll, ppl
logit_scale
number
default:"1.0"
Pre-scale raw logits before ESM2; gradients are scaled back by the same factor.
sequence_bias
SequenceLogitBiasConfig
Declarative sequence-symbol bias (canonical 20-AA protein) added before ESM2.
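
To make two of the parameters above more concrete, here is a minimal sketch of what `temperature` and `batch_size` typically control. It assumes the standard definition of a softmax temperature (logits are divided by it before normalizing) and assumes that each forward pass handles at most `batch_size` positions. The helper names `softmax_with_temperature` and `position_batches` are hypothetical and are not part of proto_language. How the constraint applies these settings internally is not shown in this reference.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def position_batches(sequence_length, batch_size):
    """Yield lists of position indices, at most batch_size per list."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, sequence_length, batch_size):
        yield list(range(start, min(start + batch_size, sequence_length)))
```

A lower temperature concentrates probability on the highest logit, and a higher one flattens the distribution. A smaller `batch_size` means more, smaller passes over the same positions, which trades throughput for memory.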

Usage

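```python
from proto_language.core import Constraint
from proto_language.constraint import esm2_perplexity_constraint, ESM2PerplexityConfig

# `segment` is assumed to be a protein segment defined earlier in your program.
constraint = Constraint(
    inputs=[segment],
    function=esm2_perplexity_constraint,
    function_config=ESM2PerplexityConfig(
        temperature=1.0,  # required field; the value here is illustrative
        # Configure other parameters here
    ),
)

scores = constraint.evaluate()
```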
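
As a hedged example of filling in the configuration, the fragment below collects a set of field values in a plain dictionary. The field names and the enum options come from the API Reference section; the specific values (including `temperature` and `batch_size`) are illustrative assumptions, not recommended settings. Passing the fields as keyword arguments is also an assumption, so that line is left commented out.

```python
# Field names come from the API reference; the values are illustrative.
config_kwargs = {
    "model_checkpoint": "esm2_t12_35M_UR50D",  # smaller than the default checkpoint
    "temperature": 1.0,   # required field; 1.0 is an assumed starting value
    "device": "cuda:0",
    "batch_size": 32,     # assumed value; lower it on out-of-memory errors
    "score_mode": "ppl",  # perplexity instead of raw mean NLL
}

# Assumption: the config class accepts these fields as keyword arguments.
# function_config = ESM2PerplexityConfig(**config_kwargs)
```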

Metadata

Key: esm2-perplexity
Function: esm2_perplexity_constraint
Category: sequence_scoring
Mode: dual
Uses GPU: True
Supported Types: protein