DSSP - Proto

License: DSSP is open source and free for academic and commercial use under a BSD-2-Clause license. Please refer to the license for full terms.

This toolkit is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


PDB-REDO/dssp

Application to assign secondary structure to proteins


DSSP 4: FAIR annotation of protein secondary structure

Maarten L. Hekkelman, Daniel Alvarez Salmoral, … Robbie P. Joosten

Protein Science (2025)


@article{hekkelman_2025_dssp4,
  title={{DSSP} 4: FAIR annotation of protein secondary structure},
  author={Hekkelman, Maarten L. and {\'A}lvarez Salmoral, Daniel and Perrakis, Anastassis and Joosten, Robbie P.},
  journal={Protein Science},
  volume={34},
  number={8},
  pages={e70208},
  year={2025},
  doi={10.1002/pro.70208},
  pmid={40671631},
  pmcid={PMC12268231},
}

@article{kabsch_1983_dssp,
  title={Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features},
  author={Kabsch, Wolfgang and Sander, Christian},
  journal={Biopolymers},
  volume={22},
  number={12},
  pages={2577--2637},
  year={1983},
  doi={10.1002/bip.360221211},
  pmid={6667333},
}


proto-bio/proto-tools/proto_tools/tools/structure_scoring/dssp


Function	Description
`run_dssp_secondary_structure()`	Assign helix/sheet/loop percentages using the DSSP binary

Background

DSSP (Kabsch and Sander, 1983) is an assignment program that classifies each residue of a protein into a secondary-structure state by inspecting the geometry and the hydrogen-bond pattern of the protein backbone. Recurring turns are assigned as helices (states H, G, and I for alpha, 3-10, and pi helices respectively), recurring bridges between residues form ladders that are assigned as strand (E), and isolated bridges, turns, bends, and unassigned residues form the remaining states. DSSP works only from atomic coordinates and does not predict secondary structure from sequence alone. The modern implementation (Hekkelman et al., 2025) is maintained by the PDB-REDO project at the Netherlands Cancer Institute and ships as the mkdssp command-line program with extended mmCIF support and FAIR annotation.

This toolkit collapses the per-residue DSSP states into a coarse three-class summary for the chain of interest. Helix percentage counts residues assigned H, G, or I. Sheet percentage counts residues assigned E. Loop percentage counts every other DSSP state (B, T, S, the unassigned state, and the P polyproline-II state introduced in DSSP 4). The three percentages sum to 100 for the counted residues of the selected chain.
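The three-class collapse can be sketched in a few lines of Python. This is a minimal illustration, not the toolkit's implementation: the helper name `collapse_dssp_states` is hypothetical, and it assumes the per-residue states arrive as a plain string with one character per counted residue, where any character other than H, G, I, or E is treated as loop.

```python
from typing import Dict

# State groupings as described in the Background text.
HELIX_STATES = frozenset("HGI")  # alpha, 3-10, and pi helices
SHEET_STATES = frozenset("E")    # strand residues that are part of a ladder


def collapse_dssp_states(states: str) -> Dict[str, float]:
    """Collapse per-residue DSSP states into helix/sheet/loop percentages.

    Hypothetical helper: `states` holds one character per counted residue.
    Every character that is not a helix or sheet state counts as loop.
    """
    if not states:
        raise ValueError("no residues to count")
    total = len(states)
    helix = sum(1 for s in states if s in HELIX_STATES)
    sheet = sum(1 for s in states if s in SHEET_STATES)
    loop = total - helix - sheet  # B, T, S, P, unassigned, and anything else
    return {
        "helix_pct": 100.0 * helix / total,
        "sheet_pct": 100.0 * sheet / total,
        "loop_pct": 100.0 * loop / total,
    }


# Four helix, four strand, and two loop residues out of ten.
summary = collapse_dssp_states("HHHHEEEETS")
```

Because loop is computed as the remainder, the three values add up to 100 for any non-empty input, which matches the guarantee stated for the selected chain.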

Learning Resources

PDB-REDO/dssp (PDB-REDO project, Netherlands Cancer Institute). Official repository and the source of the mkdssp command-line program that this toolkit invokes.
mkdssp command-line reference (PDB-REDO project). Reference documentation for the DSSP state alphabet and the command-line interface of the program that this toolkit invokes.

Tools

DSSP Secondary Structure (`dssp-secondary-structure`)

Assigns secondary structure with the mkdssp program for a selected chain in each input structure and returns the resulting helix, sheet, and loop percentages. Inputs are supplied as one or more DSSPStructureInput objects, each carrying a Structure (or a path / coordinate string accepted by Structure) plus the chain identifier to analyse. Multiple structures in one call are processed independently and the results are returned in input order.

API Reference

Input: DSSPSecondaryStructureInput

Field	Type	Required	Description
`inputs`	List[DSSPStructureInput]	yes	Structures and chains to analyze.

DSSPStructureInput

Field	Type	Required	Description
`chain`	SingleChainSelection	no	Chain to analyze. None analyzes the first chain in the structure. mkdssp always runs on the whole structure, so this only selects which chain’s percentages are reported.
`structure`	Structure	yes	Protein structure to analyze.

Config: DSSPSecondaryStructureConfig

Field	Type	Default	Description
`verbose`	integer	0	Verbosity level (0=quiet, 1=info, 2=debug, 3=raw subprocess stderr). True is coerced to 1 and False to 0.
`device`	string	"cpu"	Device to run the tool on.
`timeout`	integer	600	Maximum execution time in seconds. None waits indefinitely.
`seed`	integer	-	Random seed. When set, tools run reproducibly up to small GPU float noise (see BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.

Output: DSSPSecondaryStructureOutput

Field	Type	Required	Description
`results`	List[DSSPSecondaryStructureMetrics]	no	Per-input secondary-structure percentages.

DSSPSecondaryStructureMetrics

Field	Type	Required	Description
`chain_id`	string	yes	Analyzed chain label in the input structure namespace.
`primary_metric`	string	no	Name of the metric that best summarizes the result overall (e.g. "avg_plddt" for AlphaFold2). Used by downstream UI and reporting to pick a headline value.

Metrics (one set per results item)

Metric	Type	Range	Availability
`helix_pct`	float	0.0 to 100.0	always
`sheet_pct`	float	0.0 to 100.0	always
`loop_pct`	float	0.0 to 100.0	always

Applications

This tool is appropriate for filtering designed proteins by their secondary-structure composition, for summarising the helical or beta-sheet content of a predicted structure ensemble, and for any analysis that requires a DSSP-backed secondary-structure assignment as an upstream step in a larger pipeline. The collapsed three-class summary is the right shape for ranking or filtering large structure batches by composition.
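As a rough illustration of composition-based filtering, the sketch below keeps only mostly helical results and ranks them. It does not call the toolkit: the `Metrics` dataclass is a stand-in that mirrors only the documented `chain_id`, `helix_pct`, `sheet_pct`, and `loop_pct` fields, and the helper name `filter_by_helix` and the 50 percent threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Metrics:
    """Stand-in for one result row; not the toolkit's own class."""

    chain_id: str
    helix_pct: float
    sheet_pct: float
    loop_pct: float


def filter_by_helix(results: List[Metrics], min_helix_pct: float) -> List[Metrics]:
    """Keep rows at or above the helix threshold, most helical first."""
    kept = [m for m in results if m.helix_pct >= min_helix_pct]
    return sorted(kept, key=lambda m: m.helix_pct, reverse=True)


# Made-up values for three designs; each row sums to 100.
batch = [
    Metrics("A", 62.5, 10.0, 27.5),
    Metrics("A", 15.0, 55.0, 30.0),
    Metrics("B", 80.0, 0.0, 20.0),
]
mostly_helical = filter_by_helix(batch, min_helix_pct=50.0)
```

The same pattern applies to `sheet_pct` or `loop_pct`; since results come back in input order, an index kept alongside each row is enough to map a surviving row to its source structure.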

Usage Tips

Select the chain to analyze, or omit chain to analyze the first chain in the structure. chain is a SingleChainSelection (e.g. "A"); each input structure yields one result row for its selected chain. The input validator raises an error when the selected chain is not present and lists the available chains.
Structures with more than 62 chains are rejected. The DSSP standalone runs on PDB-format text, which represents a chain identifier as a single character from A-Z, a-z, or 0-9. Structures exceeding this limit cannot be dispatched through the wrapper.
Helix percentage counts DSSP states H, G, and I. Sheet percentage counts state E. Loop percentage counts every remaining state, including B, T, S, the unassigned state, and the P polyproline-II state introduced in DSSP 4. The three percentages sum to 100 for the counted residues of the selected chain.
The first model is used for multi-model structures. Only the first model parsed by Biopython contributes to the residue counts. To analyse a specific model in an NMR ensemble or a multi-state file, extract that model into its own Structure before passing it in.
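The chain-selection rules in these tips can be sketched as a small validator. This is an illustration under stated assumptions rather than the toolkit's validator: the helper name `resolve_chain`, the use of `ValueError`, and the error wording are hypothetical. Only the rules themselves (first chain by default, an error listing the available chains, and the 62-chain limit) come from the text.

```python
import string
from typing import Optional, Sequence

# Single-character chain identifiers available in PDB-format text.
PDB_CHAIN_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits
MAX_CHAINS = len(PDB_CHAIN_ALPHABET)  # 26 + 26 + 10 = 62


def resolve_chain(chain_ids: Sequence[str], chain: Optional[str] = None) -> str:
    """Return the chain to report on, or raise if the request is invalid.

    Hypothetical helper: `chain_ids` lists the structure's chains in file order.
    """
    if not chain_ids:
        raise ValueError("structure has no chains")
    if len(chain_ids) > MAX_CHAINS:
        raise ValueError(
            f"structure has {len(chain_ids)} chains; at most {MAX_CHAINS} are supported"
        )
    if chain is None:
        return chain_ids[0]  # default: the first chain in the structure
    if chain not in chain_ids:
        available = ", ".join(chain_ids)
        raise ValueError(f"chain {chain!r} not found; available chains: {available}")
    return chain


default_chain = resolve_chain(["A", "B"])        # no selection given
explicit_chain = resolve_chain(["A", "B"], "B")  # explicit selection
```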

Toolkit Notes

These apply to every DSSP tool in this toolkit (dssp-secondary-structure).

Structure inputs accept either typed Structure objects or a path / coordinate string. A field validator normalises raw paths and Structure-coercible values into Structure instances at input time. The wrapper writes each parsed structure to a temporary PDB file for the DSSP program and removes it after the call.
Outputs are returned as typed metric objects. Each DSSPSecondaryStructureMetrics result carries the analysed chain identifier and the three secondary-structure percentages, with helix_pct, sheet_pct, and loop_pct constrained to the range 0 to 100. Results serialise to CSV or JSON through the standard export interface.
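The temporary-file lifecycle described above, combined with the `timeout` config field, might look roughly like the following. This is a sketch and not the wrapper's code: the helper name `run_on_temp_pdb` is hypothetical, and the command is passed in as a prefix because the exact mkdssp invocation used by the wrapper is not specified here. The demonstration uses the Python interpreter as a stand-in command so the example runs without mkdssp installed.

```python
import os
import subprocess
import sys
import tempfile
from typing import Optional, Sequence


def run_on_temp_pdb(
    pdb_text: str,
    command_prefix: Sequence[str],
    timeout: Optional[float] = 600,
) -> str:
    """Write `pdb_text` to a temporary .pdb file, run a command on it, clean up.

    The file path is appended as the last argument after `command_prefix`.
    `timeout=None` waits indefinitely; otherwise subprocess.TimeoutExpired
    is raised once the limit is exceeded.
    """
    fd, path = tempfile.mkstemp(suffix=".pdb")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(pdb_text)
        completed = subprocess.run(
            [*command_prefix, path],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        return completed.stdout
    finally:
        os.remove(path)  # runs on success, failure, and timeout alike


# Stand-in command that simply echoes the path it receives.
echo_path = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
temp_path = run_on_temp_pdb("END\n", echo_path, timeout=30).strip()
```

Putting the removal in a `finally` block means the temporary file does not outlive the call even when the subprocess fails or times out.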

Example notebook: See the full working example for a copy-paste-ready walkthrough.

Infrastructure Guides

The following guides cover how to run tools efficiently and at scale.

Tool Persistence

Keep a tool’s model warm across calls instead of reloading it every invocation.

Device Management

How GPUs are allocated to tools and how to target specific devices.

Parallel Execution

Fan a batch of inputs out across multiple GPUs.

Cloud Inference

Run tools on managed cloud infrastructure with no local setup.
