SpliceTransformer intron boundary score
License: SpliceTransformer is open source and free for academic and commercial use under an Apache-2.0 license. Please refer to the license for full terms.

This constraint is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


proto-bio/proto-language/proto_language/constraint/rna_splicing/splice_transformer_intron_boundary.py
@article{you2024splicetransformer,
  title={SpliceTransformer predicts tissue-specific splicing linked to human diseases},
  author={You, Ningyuan and Liu, Chang and Gu, Yuxin and Wang, Rong and Jia, Hanying and Zhang, Tianyun and Jiang, Song and Shi, Jinsong and Chen, Ming and Guan, Min-Xin and Sun, Siqi and Pei, Shanshan and Liu, Zhihong and Shen, Ning},
  journal={Nature Communications},
  volume={15},
  number={1},
  pages={9129},
  year={2024},
  doi={10.1038/s41467-024-53088-6}
}
Scores donor and acceptor splice sites at the boundaries of a three-segment intron. The constraint accepts three segments (left_flank, intron_core, right_flank), concatenates them into a single 1-kb target sequence, and scores the expected donor and acceptor positions within that sequence.

API Reference

Config: SpliceTransformerIntronBoundaryConfig
Configuration for SpliceTransformer intron boundary constraint. This class defines configuration parameters for evaluating splice site quality using SpliceTransformer, a deep learning model trained to predict splice sites in pre-mRNA sequences. The constraint assesses whether specified positions in a sequence are likely to function as authentic splice sites, which is critical for proper intron removal and mRNA processing.
SpliceTransformer requires sequences of specific lengths:
  • Concatenated target (left_flank + intron_core + right_flank): Must be exactly 1000 bp
  • Left context: Must be exactly 4000 bp
  • Right context: Must be exactly 4000 bp
  • Total sequence analyzed: 9000 bp (4000 + 1000 + 4000)
The model outputs splice site probabilities for each position. Higher scores indicate stronger predicted splice sites. Authentic splice sites typically have scores > 0.5, while non-splice positions have scores near 0.
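The length requirements above can be checked before the constraint is evaluated. The following is a minimal sketch, not part of proto_language: the helper name `check_splice_transformer_lengths` and its plain-string arguments are assumptions made for illustration, and the left-to-right concatenation order is inferred from the 4000 + 1000 + 4000 breakdown.

```python
TARGET_LEN = 1000   # left_flank + intron_core + right_flank
CONTEXT_LEN = 4000  # each of left_context and right_context


def check_splice_transformer_lengths(left_context, left_flank, intron_core,
                                     right_flank, right_context):
    """Hypothetical helper: return the full 9000 bp sequence or raise ValueError."""
    target = left_flank + intron_core + right_flank
    if len(target) != TARGET_LEN:
        raise ValueError(f"target is {len(target)} bp, expected {TARGET_LEN}")
    for name, ctx in (("left_context", left_context),
                      ("right_context", right_context)):
        if len(ctx) != CONTEXT_LEN:
            raise ValueError(f"{name} is {len(ctx)} bp, expected {CONTEXT_LEN}")
    return left_context + target + right_context


# Placeholder sequences with the required lengths (not biologically meaningful).
full = check_splice_transformer_lengths(
    left_context="A" * 4000,
    left_flank="C" * 100,
    intron_core="G" * 800,
    right_flank="T" * 100,
    right_context="A" * 4000,
)
print(len(full))  # 9000
```

Failing early with a named segment in the error message is usually easier to debug than a length error raised from inside the model call.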
left_context (string, required): Left context sequence for SpliceTransformer; must be exactly 4000 bp.
right_context (string, required): Right context sequence for SpliceTransformer; must be exactly 4000 bp.
donor_pos (List[integer], required): 0-indexed position(s) of the expected donor site(s) in the concatenated target sequence.
acceptor_pos (List[integer], required): 0-indexed position(s) of the expected acceptor site(s) in the concatenated target sequence.
splice_transformer_config (SpliceTransformerConfig): Advanced parameter configuration for SpliceTransformer.
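Because donor_pos and acceptor_pos index into the concatenated target rather than into any single segment, it can help to derive them from segment lengths. This is a minimal sketch assuming plain strings for the segments; the helpers `segment_offsets` and `check_positions` are hypothetical. Whether the expected donor or acceptor sits exactly at a segment boundary or at a neighboring base depends on the model's labeling convention, which this page does not state, so the indices chosen at the end are illustrative only.

```python
def segment_offsets(left_flank, intron_core, right_flank):
    """Hypothetical helper: 0-indexed start of each segment in the concatenated target."""
    core_start = len(left_flank)
    right_start = core_start + len(intron_core)
    return {"left_flank": 0, "intron_core": core_start, "right_flank": right_start}


def check_positions(positions, target_len=1000):
    """Raise ValueError if any position falls outside the concatenated target."""
    for pos in positions:
        if not 0 <= pos < target_len:
            raise ValueError(f"position {pos} is outside [0, {target_len})")
    return list(positions)


offsets = segment_offsets("C" * 100, "G" * 800, "T" * 100)
print(offsets)  # {'left_flank': 0, 'intron_core': 100, 'right_flank': 900}

# Illustrative choice only: boundary indices taken as candidate positions.
donor_pos = check_positions([offsets["intron_core"]])
acceptor_pos = check_positions([offsets["right_flank"] - 1])
```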
Returns: ConstraintOutput
One result per input. score is the combined boundary penalty in [0.0, 1.0]. metadata carries donor_pos, acceptor_pos, donor_score, acceptor_score, and total_splice_score.
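To make the returned metadata concrete, the sketch below reads two of the documented keys from a plain dictionary. The numeric values are made up for illustration, the 0.5 cutoff comes from the rule of thumb in the config description, and the helper `summarize_boundary` is hypothetical. How a ConstraintOutput exposes its metadata is not shown on this page, so a dict stands in for it, and treating each score as a single float is an assumption.

```python
def summarize_boundary(metadata, threshold=0.5):
    """Hypothetical helper: flag which predicted sites clear the threshold."""
    return {
        "donor_ok": metadata["donor_score"] > threshold,
        "acceptor_ok": metadata["acceptor_score"] > threshold,
    }


# Made-up values using a subset of the documented metadata keys.
example_metadata = {
    "donor_pos": [100],
    "acceptor_pos": [899],
    "donor_score": 0.91,
    "acceptor_score": 0.32,
}
print(summarize_boundary(example_metadata))  # {'donor_ok': True, 'acceptor_ok': False}
```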

Usage

python
from proto_language.core import Constraint
from proto_language.constraint import splice_transformer_intron_boundary, SpliceTransformerIntronBoundaryConfig

constraint = Constraint(
    inputs=[segment],
    function=splice_transformer_intron_boundary,
    function_config=SpliceTransformerIntronBoundaryConfig(
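        # Placeholder names below: supply your own sequences and indices.
        left_context=left_context,      # exactly 4000 bp
        right_context=right_context,    # exactly 4000 bp
        donor_pos=[donor_index],        # 0-indexed into the 1000 bp concatenated target
        acceptor_pos=[acceptor_index],  # 0-indexed into the 1000 bp concatenated target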
    ),
)

scores = constraint.evaluate()

Metadata

Property         Value
Key              splice-transformer-intron-boundary
Function         splice_transformer_intron_boundary
Category         rna_splicing
Mode             discrete
Uses GPU         True
Supported Types  dna