
This constraint is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.
API Reference
Configuration for protein repetitiveness constraint.
This class defines configuration parameters for evaluating repetitive content
in protein sequences using k-mer frequency analysis. The constraint detects
and penalizes sequences with excessive tandem repeats or repetitive motifs,
which may indicate low-complexity regions or non-functional proteins. The
repetitiveness score is calculated as the maximum fraction of the sequence
covered by any repeated k-mer. For example, if “AAA” appears 10 times in a
100-amino-acid sequence, the repetitiveness for 3-mers is (10 * 3) / 100 = 0.3
(30% of sequence).
Maximum acceptable repetitiveness fraction (fraction of sequence covered by repeated k-mers)
Smallest k-mer length treated as a repeat; the scan continues up to this length plus 6.
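The scoring rule described above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation. The function name `repetitiveness`, the parameter names `min_k` and `scan_span`, the use of non-overlapping occurrence counts, the requirement that a k-mer occur at least twice to count as repeated, and the inclusive upper bound of the scan are all assumptions.

```python
def repetitiveness(sequence: str, min_k: int, scan_span: int = 6) -> float:
    """Largest fraction of `sequence` covered by any single repeated k-mer.

    Hypothetical helper: names and counting rules are assumptions,
    not the library's actual API.
    """
    n = len(sequence)
    if n == 0:
        return 0.0
    best = 0.0
    # Assumption: scan k from min_k up to and including min_k + scan_span.
    for k in range(min_k, min_k + scan_span + 1):
        if k > n:
            break
        # Collect the distinct k-mers with a sliding window.
        kmers = {sequence[i:i + k] for i in range(n - k + 1)}
        for kmer in kmers:
            # str.count counts non-overlapping occurrences (assumption:
            # this keeps the covered fraction at or below 1.0).
            count = sequence.count(kmer)
            if count >= 2:
                best = max(best, count * k / n)
    return best


# "GS" occurs 5 times in this 20-residue sequence: (5 * 2) / 20 = 0.5
linker_like = "GSGSGSGSGS" + "ACDEFHIKLM"
score = repetitiveness(linker_like, min_k=2)
```

Under these assumptions, a sequence with no repeated k-mer scores 0.0 and a homopolymer scores 1.0.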
Returns ConstraintOutput
One result per sequence. A score of 0.0 indicates
acceptable repetitiveness (at or below threshold) and higher values
indicate excessive repetitive content. Penalties scale linearly with
excess repetitiveness: if max is 0.4 and actual is 0.6, the excess
(0.2) is normalized by the remaining range (1.0 - 0.4 = 0.6), giving
a score of 0.33. metadata carries:
- repetitiveness_score: Float repetitiveness score (0.0-1.0) representing the maximum fraction of sequence covered by repeated k-mers
- max_repetitive_fraction: Float identical to repetitiveness_score
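The linear penalty described above can be written out as a short function. This is a sketch under stated assumptions: the name `repetitiveness_penalty` and its parameter names are hypothetical, and the library's own code may handle edge cases differently.

```python
def repetitiveness_penalty(repetitiveness: float, max_repetitiveness: float) -> float:
    """Score of 0.0 at or below the threshold, rising linearly to 1.0.

    Hypothetical helper illustrating the documented normalization.
    """
    if repetitiveness <= max_repetitiveness:
        return 0.0
    # Excess over the threshold, normalized by the remaining range.
    # The early return above also covers max_repetitiveness == 1.0,
    # so the denominator is never zero for inputs in [0.0, 1.0].
    return (repetitiveness - max_repetitiveness) / (1.0 - max_repetitiveness)


# Worked example from the text: (0.6 - 0.4) / (1.0 - 0.4), roughly 0.33
penalty = repetitiveness_penalty(0.6, 0.4)
```

Normalizing by the remaining range means a fully repetitive sequence scores 1.0 whatever threshold is chosen.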
Usage
Evaluating repetitiveness with default settings:
Metadata
| Property | Value |
|---|---|
| Key | protein-repetitiveness |
| Function | protein_repetitiveness_constraint |
| Category | protein_quality |
| Mode | discrete |
| Uses GPU | False |
| Supported Types | protein |