DSSP
License: DSSP is open source and free for academic and commercial use under a BSD-2-Clause license. Please refer to the license for full terms.

This toolkit is open source. Any third-party models, product names, or trademarks referenced are the property of their respective owners, and Proto is not affiliated with them.


PDB-REDO/dssp
Application to assign secondary structure to proteins
DSSP 4: FAIR annotation of protein secondary structure
Maarten L. Hekkelman, Daniel Alvarez Salmoral, … Robbie P. Joosten
Protein Science (2025)
@article{hekkelman_2025_dssp4,
  title={{DSSP} 4: FAIR annotation of protein secondary structure},
  author={Hekkelman, Maarten L. and {\'A}lvarez Salmoral, Daniel and Perrakis, Anastassis and Joosten, Robbie P.},
  journal={Protein Science},
  volume={34},
  number={8},
  pages={e70208},
  year={2025},
  doi={10.1002/pro.70208},
  pmid={40671631},
  pmcid={PMC12268231},
}

@article{kabsch_1983_dssp,
  title={Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features},
  author={Kabsch, Wolfgang and Sander, Christian},
  journal={Biopolymers},
  volume={22},
  number={12},
  pages={2577--2637},
  year={1983},
  doi={10.1002/bip.360221211},
  pmid={6667333},
}
proto-bio/proto-tools/proto_tools/tools/structure_scoring/dssp
Function                          Description
run_dssp_secondary_structure()    Assign helix/sheet/loop percentages using the DSSP binary

Background

DSSP (Kabsch and Sander, 1983) is an assignment program that classifies each residue of a protein into a secondary-structure state by inspecting the geometry and the hydrogen-bond pattern of the protein backbone. Recurring turns are assigned as helices (states H, G, and I for alpha, 3-10, and pi helices respectively), recurring bridges between residues form ladders that are assigned as strand (E), and isolated bridges, turns, bends, and unassigned residues form the remaining states. DSSP works only from atomic coordinates and does not predict secondary structure from sequence alone.

The modern implementation (Hekkelman et al., 2025) is maintained by the PDB-REDO project at the Netherlands Cancer Institute and ships as the mkdssp command-line program with extended mmCIF support and FAIR annotation.

This toolkit collapses the per-residue DSSP states into a coarse three-class summary for the chain of interest:

  • Helix percentage counts residues assigned H, G, or I.
  • Sheet percentage counts residues assigned E.
  • Loop percentage counts every other DSSP state (B, T, S, the unassigned state, and the P polyproline-II state introduced in DSSP 4).

The three percentages sum to 100 for the counted residues of the selected chain.
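The three-class collapse described above can be sketched in a few lines of Python. This is an illustration of the counting rule only, not the toolkit's implementation: the function name `collapse_dssp_states` and the use of a plain string of one-letter states as input are assumptions made for the example.

```python
from collections import Counter

# State groupings as described in the text: H, G, I are helix; E is sheet;
# everything else (B, T, S, P, and the unassigned state) counts as loop.
HELIX_STATES = frozenset("HGI")
SHEET_STATES = frozenset("E")


def collapse_dssp_states(states: str) -> dict:
    """Collapse a per-residue DSSP state string into three percentages.

    Hypothetical helper for illustration; `states` holds one character per
    counted residue of the selected chain.
    """
    if not states:
        raise ValueError("no residues to count")
    counts = Counter(
        "helix" if s in HELIX_STATES else "sheet" if s in SHEET_STATES else "loop"
        for s in states
    )
    total = len(states)
    return {
        f"{name}_pct": 100.0 * counts[name] / total
        for name in ("helix", "sheet", "loop")
    }


# Ten residues: four helical (H, H, G, I), two strand (E, E), four loop (B, T, S, P).
summary = collapse_dssp_states("HHGIEEBTSP")
```

Because any character outside H, G, I, and E falls through to the loop class, the sketch does not depend on how the unassigned state is encoded in a particular output format.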

Learning Resources

  • PDB-REDO/dssp (PDB-REDO project, Netherlands Cancer Institute). Official repository and the source of the mkdssp command-line program that this toolkit invokes.
  • mkdssp command-line reference (PDB-REDO project). Reference documentation for the DSSP state alphabet and the command-line interface of the program that this toolkit invokes.

Tools

DSSP Secondary Structure (dssp-secondary-structure)

Assigns secondary structure with the mkdssp program for a selected chain in each input structure and returns the resulting helix, sheet, and loop percentages. Inputs are supplied as one or more DSSPStructureInput objects, each carrying a Structure (or a path / coordinate string accepted by Structure) plus the chain identifier to analyse. Multiple structures in one call are processed independently and the results are returned in input order.

API Reference

inputs (List[DSSPStructureInput], required)
    Structures and chains to analyze.
verbose (integer, default: 0)
    Verbosity level (0=quiet, 1=info, 2=debug, 3=raw subprocess stderr). True is coerced to 1 and False to 0.
device (string, default: "cpu")
    Device to run the tool on.
timeout (integer, default: 600)
    Maximum execution time in seconds. None waits indefinitely.
seed (integer)
    Random seed. When set, tools run reproducibly up to small GPU float noise (see BaseToolOutput.approx_equal), and the seed participates in cache keys. When None, cacheable seed-sensitive tools skip cache until seeded.
results (List[DSSPSecondaryStructureMetrics])
    Per-input secondary-structure percentages.

Metrics (one set per results item)

Metric       Type     Range           Availability
helix_pct    float    0.0 to 100.0    always
sheet_pct    float    0.0 to 100.0    always
loop_pct     float    0.0 to 100.0    always

Applications

This tool is appropriate for filtering designed proteins by their secondary-structure composition, for summarising the helical or beta-sheet content of a predicted structure ensemble, and for any analysis that requires a DSSP-backed secondary-structure assignment as an upstream step in a larger pipeline. The collapsed three-class summary is the right shape for ranking or filtering large structure batches by composition.
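As a rough sketch of the filtering and ranking use case, the snippet below works on plain dictionaries that mirror the three documented metric fields. The helper `filter_by_composition`, its threshold parameters, and the example values are hypothetical and are not part of the toolkit; real results would be DSSPSecondaryStructureMetrics objects rather than dicts.

```python
def filter_by_composition(results, min_helix=0.0, max_loop=100.0):
    """Keep results that meet the thresholds, ranked by helix content.

    Hypothetical helper: `results` is a list of dicts with helix_pct,
    sheet_pct, and loop_pct keys. Returns (input_index, result) pairs so
    that each hit can be traced back to its input structure.
    """
    kept = [
        (index, result)
        for index, result in enumerate(results)
        if result["helix_pct"] >= min_helix and result["loop_pct"] <= max_loop
    ]
    # Highest helix content first.
    return sorted(kept, key=lambda pair: pair[1]["helix_pct"], reverse=True)


# Made-up example values, one dict per input structure, in input order.
designs = [
    {"helix_pct": 62.5, "sheet_pct": 0.0, "loop_pct": 37.5},
    {"helix_pct": 10.0, "sheet_pct": 45.0, "loop_pct": 45.0},
    {"helix_pct": 80.0, "sheet_pct": 0.0, "loop_pct": 20.0},
]
ranked = filter_by_composition(designs, min_helix=50.0, max_loop=40.0)
ranked_indices = [index for index, _ in ranked]
```

Returning the original index alongside each result relies on the documented guarantee that results come back in input order.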

Usage Tips

  • Select the chain to analyze, or leave chain empty to analyze the first chain in the structure. chain is a SingleChainSelection (e.g. "A"); each input structure yields one result row for its selected chain. The input validator raises an error when the selected chain is not present and lists the available chains.
  • Structures with more than 62 chains are rejected. The wrapper passes each structure to the DSSP program as PDB-format text, which represents a chain identifier as a single character from A-Z, a-z, or 0-9 (26 + 26 + 10 = 62 possible identifiers). Structures exceeding this limit cannot be dispatched through the wrapper.
  • Helix percentage counts DSSP states H, G, and I. Sheet percentage counts state E. Loop percentage counts every remaining state, including B, T, S, the unassigned state, and the P polyproline-II state introduced in DSSP 4. The three percentages sum to 100 for the counted residues of the selected chain.
  • The first model is used for multi-model structures. Only the first model parsed by Biopython contributes to the residue counts. To analyse a specific model in an NMR ensemble or a multi-state file, extract that model into its own Structure before passing it in.
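The chain-related rules in the tips above (the 62-identifier limit, the default to the first chain, and the error that lists available chains) can be pictured with a small stand-alone check. This is a sketch under stated assumptions, not the toolkit's validator: the function name `check_chain_selection` and the error messages are invented for illustration.

```python
import string

# Single-character chain identifiers available in PDB-format text:
# A-Z, a-z, and 0-9, which gives 26 + 26 + 10 = 62 identifiers.
PDB_CHAIN_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits


def check_chain_selection(available_chains, chain=None):
    """Return the chain that would be analysed, or raise ValueError.

    Hypothetical stand-in for the documented behaviour; `available_chains`
    is the ordered list of chain identifiers found in the structure.
    """
    if not available_chains:
        raise ValueError("structure contains no chains")
    if len(available_chains) > len(PDB_CHAIN_ALPHABET):
        raise ValueError(
            f"{len(available_chains)} chains exceed the limit of "
            f"{len(PDB_CHAIN_ALPHABET)} single-character chain identifiers"
        )
    if not chain:
        # An omitted or empty selection falls back to the first chain.
        return available_chains[0]
    if chain not in available_chains:
        raise ValueError(
            f"chain {chain!r} not found; available chains: "
            + ", ".join(available_chains)
        )
    return chain


first = check_chain_selection(["A", "B"])
selected = check_chain_selection(["A", "B"], chain="B")
```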

Toolkit Notes

These apply to every DSSP tool in this toolkit (dssp-secondary-structure).
  • Structure inputs accept either typed Structure objects or a path / coordinate string. A field validator normalises raw paths and Structure-coercible values into Structure instances at input time. The wrapper writes each parsed structure to a temporary PDB file for the DSSP program and removes it after the call.
  • Outputs are returned as typed metric objects. Each DSSPSecondaryStructureMetrics result carries the analysed chain identifier and the three secondary-structure percentages, with helix_pct, sheet_pct, and loop_pct constrained to the range 0 to 100. Results serialise to CSV or JSON through the standard export interface.
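To give a sense of what serialised results might look like, the sketch below writes rows to JSON and CSV with the Python standard library. It does not use the toolkit's export interface, whose API is not shown on this page; the `chain` field name, the helper names, and the row values are assumptions for the example.

```python
import csv
import io
import json

# Made-up rows; "chain" is an assumed field name for the analysed chain identifier.
rows = [
    {"chain": "A", "helix_pct": 40.0, "sheet_pct": 20.0, "loop_pct": 40.0},
    {"chain": "B", "helix_pct": 0.0, "sheet_pct": 55.0, "loop_pct": 45.0},
]


def rows_to_json(records):
    """Serialise a list of result dicts to a JSON array string."""
    return json.dumps(records, indent=2)


def rows_to_csv(records):
    """Serialise a list of result dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


json_text = rows_to_json(rows)
csv_text = rows_to_csv(rows)
```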
Example notebook: See the full working example for a copy-paste-ready walkthrough.

Infrastructure Guides

The following guides cover how to run tools efficiently and at scale.

Tool Persistence

Keep a tool’s model warm across calls instead of reloading it every invocation.

Device Management

How GPUs are allocated to tools and how to target specific devices.

Parallel Execution

Fan a batch of inputs out across multiple GPUs.

Cloud Inference

Run tools on managed cloud infrastructure with no local setup.