
This constraint is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.
API Reference
Configuration for the protein diversity constraint.
This class defines configuration parameters for evaluating amino acid diversity
in protein sequences. The constraint measures how many different amino acid
types are present in the sequence and penalizes sequences with insufficient
diversity, which may indicate poor protein quality, repetitive sequences, or
non-functional proteins.
A diversity score of 1.0 means all 20 standard amino acids are present.
The minimum for a non-empty sequence is 0.05 (1/20), reached by a
homopolymer (only one amino acid type); a score of 0.0 is unreachable
since empty sequences are rejected.
min_diversity: Minimum acceptable amino acid diversity. Diversity is calculated as (unique amino acids) / 20.
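The diversity calculation described above can be sketched in a few lines of Python. This is a minimal standalone illustration, not the library's implementation: the helper name `aa_diversity`, the uppercase normalization, and the choice to reject non-standard residue characters are all assumptions made for the example.

```python
# Hypothetical helper for illustration; not part of the documented API.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def aa_diversity(sequence: str) -> float:
    """Return (unique amino acids) / 20 for a non-empty protein sequence."""
    if not sequence:
        # The reference states that empty sequences are rejected.
        raise ValueError("empty sequences are rejected")
    residues = set(sequence.upper())  # assumption: case-insensitive input
    unknown = residues - STANDARD_AMINO_ACIDS
    if unknown:
        # Assumption: non-standard characters are treated as an error here.
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return len(residues) / len(STANDARD_AMINO_ACIDS)


print(aa_diversity("AAAAAAAA"))              # homopolymer: 0.05
print(aa_diversity("ACDEFGHIKLMNPQRSTVWY"))  # all 20 types: 1.0
```

Because the count of unique residues is at least 1 for any accepted sequence, the smallest value this sketch can return is 1/20.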
Returns ConstraintOutput
One result per sequence. A score of 0.0 indicates
sufficient diversity (diversity at or above threshold) and higher
values indicate insufficient amino acid diversity. Scores scale
linearly with the deficit below the threshold (e.g., if min_diversity
is 0.5 and actual diversity is 0.25, the score is 0.5), capped at 1.0.
metadata carries:
- aa_diversity_score: Float diversity score (0.0-1.0) calculated as (unique amino acids) / 20
- unique_amino_acid_count: Integer count of unique amino acid types present in the sequence (0-20)
- unique_amino_acids: Sorted list of amino acid characters present in the sequence
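To make the scoring rule concrete, the sketch below reproduces the worked example (min_diversity of 0.5, actual diversity of 0.25, score of 0.5). It assumes the deficit is normalized by the threshold, which is one reading consistent with that example; the actual formula is not stated in this reference. The function name `diversity_penalty` and the plain tuple return value are hypothetical stand-ins for the real constraint and its ConstraintOutput.

```python
# Hypothetical stand-in for illustration; not the library's implementation.
def diversity_penalty(sequence: str, min_diversity: float) -> tuple[float, dict]:
    """Return (score, metadata) for one non-empty protein sequence."""
    if not sequence:
        raise ValueError("empty sequences are rejected")

    unique = sorted(set(sequence))
    diversity = len(unique) / 20

    if diversity >= min_diversity:
        score = 0.0  # at or above the threshold: no penalty
    else:
        # Assumption: deficit relative to the threshold, capped at 1.0.
        score = min(1.0, (min_diversity - diversity) / min_diversity)

    metadata = {
        "aa_diversity_score": diversity,
        "unique_amino_acid_count": len(unique),
        "unique_amino_acids": unique,
    }
    return score, metadata


# Five unique residues give a diversity of 5 / 20 = 0.25.
score, metadata = diversity_penalty("ACDEGACDEG", min_diversity=0.5)
print(score)     # 0.5
print(metadata)
```

Under this reading, a sequence that meets the threshold always scores 0.0, and the penalty grows as fewer residue types are present.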
Usage
Evaluating protein diversity:
Metadata
| Property | Value |
|---|---|
| Key | protein-diversity |
| Function | protein_diversity_constraint |
| Category | protein_quality |
| Mode | discrete |
| Uses GPU | False |
| Supported Types | protein |