
Baseprint Document Format (BpDF)

BpDF is the digital encoding format of a Baseprint document snapshot. These document snapshots can be identified with a SoftWare Hash IDentifier (SWHID). Baseprint document snapshots exemplify the concept of "baseprint" discussed in the document "What is a baseprint?". As of 2024, these document snapshots are only used within Baseprint document successions.
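As a rough illustration of how a SWHID is derived from content, the sketch below computes identifiers the same way Git hashes blobs and trees, which is the scheme SWHIDs use for content (`cnt`) and directory (`dir`) objects. It makes several assumptions: the helper names are invented for this example, the directory case is limited to a flat directory of regular non-executable files, and nothing here is taken from the BpDF draft specification or the epijats library.

```python
import hashlib


def git_object_id(kind: str, payload: bytes) -> str:
    """Hex SHA-1 of a Git-style object: b'<kind> <length>\\0' followed by the payload."""
    header = f"{kind} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def content_swhid(data: bytes) -> str:
    """SWHID of a single file's bytes (hypothetical helper name)."""
    return "swh:1:cnt:" + git_object_id("blob", data)


def flat_directory_swhid(files: dict[str, bytes]) -> str:
    """SWHID of a directory containing only regular, non-executable files.

    Assumption: no subdirectories, symlinks, or executables, so every entry
    uses the Git file mode 100644 and entries are ordered by raw name bytes.
    """
    payload = b""
    for name in sorted(files, key=lambda n: n.encode("utf-8")):
        blob_id = bytes.fromhex(git_object_id("blob", files[name]))
        payload += b"100644 " + name.encode("utf-8") + b"\0" + blob_id
    return "swh:1:dir:" + git_object_id("tree", payload)


if __name__ == "__main__":
    # Made-up file names and contents, for illustration only.
    snapshot = {"article.xml": b"<article/>\n", "figure.png": b"\x89PNG..."}
    print(content_swhid(snapshot["article.xml"]))
    print(flat_directory_swhid(snapshot))
```

Because the identifier depends only on the bytes and file names, anyone holding a copy of the same files can recompute it and confirm they hold the same snapshot.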

Technical details of the format are being documented in a draft specification on GitHub.

Objectives

The primary objective of BpDF is to minimize format rot and maximize "format shelf-life", meaning that future software remains backward compatible with the present encoding format. Unlike authoring formats such as LaTeX and Markdown, which serve the present, BpDF is designed for redistribution and archiving, which serve the future. The primary tactic for minimizing format rot is to encode articles like those archived in PubMed Central.

Supporting Software

The BpDF format is implemented in the open-source Python library epijats. This library is used in the authoring tool Baseprinter (for previews) and in BaseprintPress for generating websites, such as pilot.perm.pub.

JATS XML:
The XML file inside BpDF is a subformat of the JATS XML Article Authoring format. This Baseprint JATS XML subformat is more minimal than the JATS XML Article Authoring Tag Set.
PubMed Central JATS XML Flavor:
Software systems at different organizations process different "flavors" of JATS XML. Baseprint JATS XML does not aim to support all of these flavors. Instead, it targets a subset of the JATS XML found in archived articles of the PMC Open Access Subset, and it aims to match the rendering of the PMC Article Previewer.
JATS4R:
Baseprint JATS XML aims to be similarly restrictive to JATS4R (JATS for Reuse). However, Baseprint JATS XML targets reuse by Free Open-Source Software for author self-archiving/publishing and redistribution within Baseprint document successions.
Manuscript Exchange Common Approach (MECA):
Unlike JATS XML, BpDF is also a format for packaging the files included in a research document, such as images. In this respect, BpDF is similar to the Manuscript Exchange Common Approach (MECA). Both BpDF and MECA are interchange formats between authoring and archiving. However, unlike MECA, BpDF targets author self-archiving/publishing and redistribution, not traditional publisher workflows.
BpDF'23
BpDF'23 is the version of BpDF supported by the epijats Python library in 2023 and 2024.
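To make the idea of a restrictive JATS XML subformat more concrete, the following sketch reports which elements of an XML document fall outside an allow-list. It is only an illustration under stated assumptions: the `ALLOWED_TAGS` set and the `unsupported_tags` function are hypothetical, the list is not the actual Baseprint JATS XML subset, and the code does not use or reflect the epijats API.

```python
import xml.etree.ElementTree as ET

# Hypothetical allow-list for illustration only; NOT the real Baseprint subset.
ALLOWED_TAGS = {
    "article", "front", "article-meta", "title-group",
    "article-title", "body", "sec", "title", "p",
}


def unsupported_tags(xml_text: str, allowed: set[str] = ALLOWED_TAGS) -> set[str]:
    """Return the element names in xml_text that are not in the allow-list."""
    root = ET.fromstring(xml_text)
    # root.iter() visits the root element and every descendant element.
    return {el.tag for el in root.iter() if el.tag not in allowed}


# Made-up sample document using a few JATS element names.
SAMPLE = """<article>
  <front><article-meta><title-group>
    <article-title>Example</article-title>
  </title-group></article-meta></front>
  <body>
    <p>Text with a <bold>bold</bold> word.</p>
    <verse-group><p>A line of verse.</p></verse-group>
  </body>
</article>"""

if __name__ == "__main__":
    print(sorted(unsupported_tags(SAMPLE)))
```

A smaller allow-list means fewer elements that future software must keep supporting, which is the trade-off a minimal subformat makes in exchange for a longer format shelf-life.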