Gempub Format Description

Gempub is an ebook container format that is structurally similar to the epub ebook format, but uses gemtext documents (text/gemini) as the markup language instead of XHTML (as epub does). It also simplifies the structure of the container so that creating a gempub by hand is not only possible, but relatively easy with standard tooling.

Origins (Credit Where Credit Is Due)

As far as my research shows, the gempub specification was initially created by Gogledd-Orllewin. The original specification lives at the following URLs: @github (archived), @codeberg (untouched for years).

The specification had a few contributors, but seems to have lost steam and been abandoned (or is considered complete by the author). If anyone has any information regarding a more current version or a current maintainer, please reach out.

The Gempub Format

Gempub is a container format in the form of a zip archive. The proposed mime/media type of the archive is application/gpub+zip. The filetype suffix .gpub is used for gempub files.

Archive Directory Structure

The root directory of the final zip archive must contain either a valid index.gmi file or a metadata.txt file (which, in turn, points to a valid index.gmi file) or both.

The metadata.txt file is optional, so long as an index.gmi file exists in the root directory of the zip archive. A gempub may include image files, including a cover image, and as many text/gemini files as are desired.

A simple capsule-style gempub might look something like the following:

		my-book.gpub/
			index.gmi
			gemlog1.gmi
			gemlog2.gmi
			gemlog-other.gmi
			whatnot.gmi
			contact.gmi
		

Assuming all of the files are already in a directory named my-book, you could create this simple gempub as follows:


		cd my-book
		zip ../my-book.gpub *.gmi
		

A novel or other long work more traditionally thought of as "a book" might have a structure looking something like:

		my-novel.gpub/
			metadata.txt
			images/
				my-cover.jpg
				plate-1.png
				plate-2.png
			source/
				index.gmi
				titlepage.gmi
				chapter-1.gmi
				chapter-2.gmi
				chapter-3.gmi
				chapter-4.gmi
				chapter-5.gmi
				chapter-6.gmi
				about-the-author.gmi
				colophon.gmi
				copyright.gmi
		

In the above, the metadata.txt file will point to source/index.gmi, from which the contents and reading order are parsed. Similarly, the metadata file will point to images/my-cover.jpg for the cover field. More details on the metadata file can be found in the next section of this format description.
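The zip command shown earlier only matches .gmi files in the current directory, so a nested layout like this one needs a recursive approach. What follows is a minimal sketch using Python's standard zipfile module. The function name build_gpub and its parameters are my own and are not part of the specification.

```python
import zipfile
from pathlib import Path


def build_gpub(source_dir: str, output_path: str) -> list[str]:
    """Pack every file under source_dir into a zip archive at output_path.

    Returns the archive member names in the order they were written.
    """
    root = Path(source_dir)
    written = []
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Member names are relative to the book directory and use "/"
                # separators, so metadata.txt ends up in the archive root.
                name = path.relative_to(root).as_posix()
                archive.write(path, arcname=name)
                written.append(name)
    return written
```

Calling build_gpub("my-novel", "my-novel.gpub") from the parent directory would produce the archive. The output path should sit outside the source directory, otherwise the archive would try to include itself.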

Metadata

The file metadata.txt contains information about the book. This file, when present, must be in the root of the final gpub archive. The metadata might be used by readers and book catalogs for display to the user, or for other forms of digital analysis and organization.

The file contains key/value pairs separated by a newline. Keys start at the beginning of a line and end with a colon. Values start after the first colon in a given line and are associated with the key that comes before said colon. Whitespace may surround keys and values and should be trimmed by any program reading the keys/values.
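The parsing rules above are short enough to sketch directly. The following is an illustration in Python, not a reference implementation. The name parse_metadata is hypothetical, and two behaviours are my own assumptions because the text does not cover them: lines without a colon are skipped, and a repeated key keeps its last value.

```python
def parse_metadata(text: str) -> dict[str, str]:
    """Parse the contents of a metadata.txt file into a dict."""
    fields = {}
    for line in text.splitlines():
        # Assumption: a line without a colon carries no key/value pair.
        if ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(":", 1)
        # Whitespace around keys and values is trimmed.
        fields[key.strip()] = value.strip()
    return fields
```

Because only the first colon separates key from value, a line such as "description: A tale: part one" yields the value "A tale: part one".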

A valid metadata.txt file must contain title and gpubVersion fields, but may, and probably will, contain other keys and values. The order of the key/value pairs does not matter. If no index key/value is given, an index.gmi file must be present in the root directory of the archive.

For example, a simple metadata.txt where the index.gmi file is in the root directory (and therefore no index key is present in metadata.txt) might look something like this:

		title: My Book!
		gpubVersion  :   1.0.0
		

The following keys are officially a part of the specification:

title
The title of the book. This is a required field
gpubVersion
The version of the gpub spec being conformed to (this documentation mostly represents version 1.0.0, for example). This is a required field
index
A path to the index.gmi file for the book. The path must be relative to the root directory of the archive, must not start with a filepath separator, and must use UNIX-style filepath separators (/)
author
The author(s) of the book
language
A BCP 47 language code. For example: en-US, en, en-GB, en-CA are all common language codes for various kinds of English
charset
Since gemtext is UTF-8 by default, this is usually redundant and should be used to identify an alternate encoding. Note that the metadata file itself must be UTF-8 encoded
description
A single line description or summary of the book
published
Format YYYY. Use as a fallback when a precise date is unknown (otherwise, use publishDate). Ex. 1943
publishDate
Format YYYY-MM-DD. Ex. 1943-10-24
revisionDate
Format YYYY-MM-DD. Ex. 1955-11-02
copyright
A copyright notice
license
A license notice
version
A human readable version identifier for the book
cover
A path, relative to the root directory, to a JPG or PNG image to be used as the cover of the book

According to the specification, metadata is intended to communicate information to readers and must not be extended to provide rendering information or other flags to applications.
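Several of the field rules listed above can be checked mechanically. The sketch below takes already-parsed key/value pairs as a dict and reports problems. The name check_metadata is hypothetical, and two choices are assumptions of mine: an empty value counts as missing, and a backslash in the index path is flagged as a non-UNIX separator. The date checks only look at the shape of the value, not at whether the date exists.

```python
import re

REQUIRED_KEYS = ("title", "gpubVersion")
DATE_PATTERNS = {
    "published": re.compile(r"\d{4}"),
    "publishDate": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "revisionDate": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def check_metadata(fields: dict[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        # Assumption: a key with an empty value is treated as missing.
        if not fields.get(key):
            problems.append(f"missing required field: {key}")
    for key, pattern in DATE_PATTERNS.items():
        value = fields.get(key)
        if value is not None and not pattern.fullmatch(value):
            problems.append(f"{key} has an unexpected format: {value!r}")
    index = fields.get("index")
    # Assumption: a backslash indicates a non-UNIX path separator.
    if index is not None and (index.startswith("/") or "\\" in index):
        problems.append("index must be a relative path using / separators")
    return problems
```

Returning a list rather than raising on the first problem lets a tool report everything wrong with a file in one pass.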

Table of Contents (index.gmi)

The index.gmi file is a required file and must be present in all gempub documents, either in the root of the archive or in a location pointed to by the index field of the metadata.txt file (which itself must be in the root of the archive).

The file must be a valid text/gemini document consisting predominantly of link lines, in the order the files are meant to be read. A reader may or may not expose non-link lines from an index.gmi file as table of contents information.

If an application cannot find the index.gmi file in either location, or it cannot be read or contains invalid data, the application must display an error to the user and refuse to read the gempub any further.
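One way to read the lookup rule is: try the location named by the index field first, then fall back to index.gmi in the archive root, and fail if neither exists. The sketch below follows that reading, which is my interpretation rather than something the specification spells out. The names GempubError and find_index are hypothetical, and the function only checks that the file exists, not that it is valid gemtext.

```python
import zipfile


class GempubError(Exception):
    """Raised when an archive cannot be read as a gempub."""


def find_index(archive: zipfile.ZipFile) -> str:
    """Return the archive member name of the index file, or raise GempubError."""
    names = set(archive.namelist())
    candidates = []
    if "metadata.txt" in names:
        # The metadata file itself must be UTF-8 encoded.
        text = archive.read("metadata.txt").decode("utf-8")
        for line in text.splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() == "index":
                candidates.append(value.strip())
    # Fall back to index.gmi in the archive root.
    candidates.append("index.gmi")
    for name in candidates:
        if name in names:
            return name
    raise GempubError("no index.gmi found in the archive root or via metadata.txt")
```

A reading application would catch GempubError, show the message to the user, and stop processing the archive.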

Link lines in the index.gmi file should be relative to the file. Applications should ignore non-local links in an index.gmi file for purposes of showing a table of contents or other navigation through the gempub.
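Extracting the reading order then comes down to collecting the link lines and resolving each target against the directory of the index file. The following sketch assumes that "non-local" means any target with a URL scheme, a network location, or a leading slash; the specification does not define the term precisely. The name reading_order is hypothetical, and percent-encoded targets are not decoded.

```python
import posixpath
from urllib.parse import urlsplit

FENCE = "`" * 3  # gemtext toggle line for preformatted blocks


def reading_order(index_path: str, index_text: str) -> list[tuple[str, str]]:
    """Return (archive member name, label) pairs for local links, in file order."""
    base = posixpath.dirname(index_path)
    entries = []
    preformatted = False
    for line in index_text.splitlines():
        if line.startswith(FENCE):
            preformatted = not preformatted
            continue
        if preformatted or not line.startswith("=>"):
            continue  # headings, text, and blank lines carry no ordering
        parts = line[2:].split(maxsplit=1)
        if not parts:
            continue  # "=>" with no target
        target = parts[0]
        label = parts[1].strip() if len(parts) > 1 else target
        url = urlsplit(target)
        # Assumption: these three cases are what "non-local" means.
        if url.scheme or url.netloc or target.startswith("/"):
            continue
        member = posixpath.normpath(posixpath.join(base, url.path))
        entries.append((member, label))
    return entries
```

With an index stored at source/index.gmi, a line such as "=> chapter-1.gmi Chapter 1" resolves to the archive member source/chapter-1.gmi.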

Example index.gmi:

		# ToC

		=> text/title.gmi Titlepage
		=> text/chapter-1.gmi Chapter 1: In Which We Meet Our Hero
		=> text/chapter-2.gmi Chapter 2: In Which Our Hero Meets Their End
		=> text/about.gmi About the Author
		=> text/colophon.gmi Colophon
		=> text/copy.gmi Copyright
		

Content Files

All content linked to by the index.gmi file must be valid text/gemini files. Any images must be referenced inside a valid text/gemini file and the application viewing the file will decide how to handle them. Links within files should be relative to the location of the given file in the archive.

Images

Supported formats for images are JPG and PNG. Image links must include text describing the image and applications must use this text as alternative text for the image where the technology stack of the application allows such a use.
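An authoring tool could flag image links that lack the required descriptive text. The sketch below is one way to do that, under the assumption that an image link is any link line whose target ends in .jpg, .jpeg, or .png (the .jpeg suffix is my addition; the text above only names JPG and PNG). The function name is hypothetical, and preformatted blocks are not skipped.

```python
from urllib.parse import urlsplit

# Assumption: these suffixes identify JPG and PNG images.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def images_missing_alt_text(gemtext: str) -> list[str]:
    """Return the targets of image links that have no descriptive text."""
    missing = []
    for line in gemtext.splitlines():
        if not line.startswith("=>"):
            continue
        parts = line[2:].split(maxsplit=1)
        if not parts:
            continue  # "=>" with no target
        target = parts[0]
        is_image = urlsplit(target).path.lower().endswith(IMAGE_SUFFIXES)
        # A link line with only a target has no text to use as alt text.
        if is_image and len(parts) == 1:
            missing.append(target)
    return missing
```

Links to non-image files are ignored, since the description requirement applies only to images.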

Further Considerations