Alignment Gap Gini
License: MAFFT is open source and free for academic and commercial use under a BSD-3-Clause license. Please refer to the license for full terms.

This constraint is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


proto-bio/proto-language/proto_language/constraint/sequence_alignment/gap_gini_constraint.py
@article{katoh2013mafft,
  title={MAFFT multiple sequence alignment software version 7: improvements in performance and usability},
  author={Katoh, Kazutaka and Standley, Daron M},
  journal={Molecular Biology and Evolution},
  volume={30},
  number={4},
  pages={772--780},
  year={2013},
  publisher={Oxford University Press},
  doi={10.1093/molbev/mst010}
}
Score pairwise protein alignments by gap-distribution Gini coefficient. For each (query, reference) pair the function:
  1. Aligns the two protein sequences with MAFFT.
  2. Optionally trims the alignment (center-crop to 80%, strip end gaps), controlled by trim_alignment.
  3. Computes gap run-length Gini for both sequences; takes the max.
  4. Returns 0.0 if gap_gini <= max_gap_gini, else scales linearly to 1.0.
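
Steps 3 and 4 can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names (gap_run_lengths, gini, pair_gap_gini, penalty) are hypothetical, the Gini formula is the standard sorted-rank form, and the linear ramp from max_gap_gini to 1.0 is an assumed reading of "scales linearly to 1.0". Note that the standard formula gives 0.0 for a single run, so the actual constraint may handle that case differently.

```python
from itertools import groupby


def gap_run_lengths(aligned: str, gap: str = "-") -> list[int]:
    """Lengths of consecutive gap runs in one aligned sequence."""
    return [sum(1 for _ in run) for ch, run in groupby(aligned) if ch == gap]


def gini(values: list[int]) -> float:
    """Standard Gini coefficient; 0.0 for empty or all-equal input."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    xs = sorted(values)
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i from 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def pair_gap_gini(aligned_query: str, aligned_ref: str) -> float:
    """Step 3: Gini of gap run-lengths for each sequence, then the max."""
    return max(gini(gap_run_lengths(aligned_query)),
               gini(gap_run_lengths(aligned_ref)))


def penalty(gap_gini: float, max_gap_gini: float = 0.1) -> float:
    """Step 4 (assumed form): 0.0 up to the threshold, then a linear ramp to 1.0."""
    if gap_gini <= max_gap_gini:
        return 0.0
    return min(1.0, (gap_gini - max_gap_gini) / (1.0 - max_gap_gini))
```

Under these assumptions, four gap runs of equal length give a Gini of 0.0 and no penalty, while one long run next to several single-column gaps pushes the value well above the default threshold of 0.1.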

API Reference

Config: GapGiniConfig
Configuration for the alignment gap Gini constraint. The Gini coefficient measures inequality in the distribution of gap run-lengths within a pairwise alignment. A value near 0 means gaps are evenly distributed; a value near 1 means they are concentrated in a single run (a truncation artifact).
max_gap_gini (number, default: 0.1)
Maximum acceptable gap Gini score (0-1). Alignments above this are penalized.
trim_alignment (boolean, default: True)
Center-crop to 80% and strip end gaps before computing the Gini coefficient.
Returns: ConstraintOutput
One result per pair. score is 0.0 if the gap distribution is acceptable, up to 1.0 for the worst violation. The metadata carries gap_gini (and gap_gini_error on failure).
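
The trimming controlled by trim_alignment can be pictured with a short sketch. Everything here is an assumption made for illustration: the function name trim_pair is hypothetical, the crop is taken to keep the middle 80% of alignment columns, cropping is applied before end-gap stripping, and "end gaps" is read as leading or trailing columns where either sequence has a gap. The library may differ on any of these points.

```python
def trim_pair(aligned_query: str, aligned_ref: str,
              keep: float = 0.8, gap: str = "-") -> tuple[str, str]:
    """Center-crop both aligned sequences, then strip gapped end columns."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    n = len(aligned_query)
    kept = round(n * keep)        # number of columns to keep
    start = (n - kept) // 2       # drop roughly equal margins on both sides
    q = aligned_query[start:start + kept]
    r = aligned_ref[start:start + kept]

    # Strip leading and trailing columns in which either sequence has a gap.
    lo, hi = 0, len(q)
    while lo < hi and (q[lo] == gap or r[lo] == gap):
        lo += 1
    while hi > lo and (q[hi - 1] == gap or r[hi - 1] == gap):
        hi -= 1
    return q[lo:hi], r[lo:hi]
```

The point of a step like this is that overhanging ends of a pairwise alignment produce long terminal gap runs that say little about the quality of the aligned core, so removing them first keeps them from dominating the Gini value.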

Usage

python
from proto_language.core import Constraint
from proto_language.constraint import gap_gini_constraint, GapGiniConfig

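# `segment` is assumed to be a protein segment defined earlier.
constraint = Constraint(
    inputs=[segment],
    function=gap_gini_constraint,
    function_config=GapGiniConfig(
        max_gap_gini=0.1,     # documented default; alignments above this are penalized
        trim_alignment=True,  # documented default; center-crop to 80% and strip end gaps
    ),
)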

scores = constraint.evaluate()

Metadata

Property         Value
Key              gap-gini
Function         gap_gini_constraint
Category         sequence_alignment
Mode             discrete
Uses GPU         False
Supported Types  protein