GC Content

This constraint is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


Source: proto-bio/proto-language/proto_language/constraint/sequence_composition/gc_content_constraint.py
Enforces GC content within a specified range. This constraint function calculates the percentage of guanine (G) and cytosine (C) nucleotides in a DNA or RNA sequence and evaluates whether it falls within the configured range. GC content is a fundamental sequence property that affects DNA stability, melting temperature, and gene expression patterns, as well as practical concerns such as PCR amplification efficiency.
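As a rough illustration of the quantity being measured, GC content can be computed from a plain string as below. This is a minimal sketch, not the library's implementation: `gc_percent` is a hypothetical helper name, and treating an empty sequence as 0% is an assumption made here for convenience.

```python
def gc_percent(sequence: str) -> float:
    """Return the percentage (0-100) of G and C bases in a DNA or RNA string.

    Hypothetical helper for illustration only; not part of the library.
    """
    seq = sequence.upper()
    if not seq:
        # Assumption: an empty sequence is reported as 0% GC.
        return 0.0
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


print(gc_percent("ATCGATCG"))  # 50.0 (4 of 8 bases are G or C)
```

The same counting works for RNA, since only G and C are counted and U is ignored just as T is.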

API Reference

Config: GCContentConfig
Configuration for the GC content constraint. This class defines the parameters used to evaluate GC content in DNA or RNA sequences. The penalty scales linearly with the deviation from the acceptable range.
min_gc (number, required): Minimum acceptable GC content percentage (0-100)
max_gc (number, required): Maximum acceptable GC content percentage (0-100)
Returns: ConstraintOutput
One result per sequence. A score of 0.0 indicates GC content is within the acceptable range [min_gc, max_gc]. Higher scores indicate greater deviation from the acceptable range, with penalties scaling linearly with the deviation distance. The metadata field carries gc_content.
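The scoring rule described above can be sketched as follows. The source states only that the score is 0.0 inside [min_gc, max_gc] and grows linearly with the deviation outside it; the exact scale factor is not given, so this sketch assumes a slope of one score unit per percentage point. The function name `linear_gc_penalty` is hypothetical and not part of the library.

```python
def linear_gc_penalty(gc_content: float, min_gc: float, max_gc: float) -> float:
    """Return 0.0 inside [min_gc, max_gc], else the distance to the nearest bound.

    Hypothetical illustration; the real constraint may scale or normalize
    the deviation differently.
    """
    if min_gc > max_gc:
        raise ValueError("min_gc must not exceed max_gc")
    if gc_content < min_gc:
        return min_gc - gc_content  # below range: distance to the lower bound
    if gc_content > max_gc:
        return gc_content - max_gc  # above range: distance to the upper bound
    return 0.0


print(linear_gc_penalty(50.0, 40.0, 60.0))  # 0.0, within range
print(linear_gc_penalty(30.0, 40.0, 60.0))  # 10.0, ten points below min_gc
print(linear_gc_penalty(75.0, 40.0, 60.0))  # 15.0, fifteen points above max_gc
```

Under this assumption, both bounds are inclusive, so a sequence sitting exactly at min_gc or max_gc scores 0.0.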

Usage

Evaluating GC content constraint:
python
>>> seq = Sequence("ATCGATCG", "dna")
>>> cfg = GCContentConfig(min_gc=40.0, max_gc=60.0)
>>> result = gc_content_constraint([(seq,)], config=cfg)
>>> print(result[0].score)  # 0.0 (50% GC content is within acceptable range)

Metadata

Property          Value
Key               gc-content
Function          gc_content_constraint
Category          sequence_composition
Mode              discrete
Uses GPU          False
Supported Types   dna, rna