Epub Format Construction Guide
Harrison Ainsworth
http://www.hxa.name/
hxa7241+articles (ατ) googlemail (dοτ) com
2010-08-27
Summary
A guide for making Epub ebooks/publications, sufficient for most purposes. It requires understanding of XHTML, CSS, XML. (1900 words)
Download an example publication – this document as Epub: http://www.hxa.name/articles/content/EpubGuide-hxa7241.epub
Contents
- Introduction
- 1: XHTML Documents
- 2: Package And Container Files
- mimetype
- container.xml
- content.opf
- toc.ncx
- 3: ADE stylesheet
- 4: Container Structure
- Specifications List
Introduction
This is a guide for making IDPF Epub ebooks/publications. It is mostly an annotated example: this document itself in Epub form.
Not every detail or variation is mentioned, but there is enough to make the specifications unnecessary for normal use, and enough to make entirely conformant publications.
Included also is a description of optional extra styling for a particular reader (but still completely conformant).
You need an understanding of and ability to make XHTML/CSS and XML documents.
IDPF
‘Epub’ is a standard from the International Digital Publishing Forum. It is an arrangement of several other standards (mainly: XHTML, CSS, XML, NCX, DCMI). There are three parts, addressing: content, package metadata, and archive (OPS, OPF, and OCF). It is powerful, straightforward, and non-proprietary.
Adobe Digital Editions
‘ADE’ is one of the first readers for Epub publications. It is very conformant with the standard. It can use an optional proprietary publication component: an extra stylesheet to adjust text-column appearance. (That is allowed by the standard.)
This guide was written using ADE version 1.0.467.
1: XHTML Documents
Make the main content with XHTML, CSS, and images.
Relevant specifications: OPS, XHTML, CSS.
XHTML
Use XHTML 1.1, but without the following modules:
- Forms
- Server-side Image Map
- Intrinsic Events
- Scripting
(XHTML 1.1 differences from XHTML 1.0 Strict:
- lang attribute not allowed (use xml:lang instead)
- name attribute on a and map elements not allowed (use id instead)
- ruby annotations are allowed)
Include XML declaration and XHTML doctype, at the top:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
and the xmlns attribute in html:
<html xmlns="http://www.w3.org/1999/xhtml">
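The XHTML rules above can be checked mechanically. The following is a minimal sketch, not a validator: it only tests that the root is html in the XHTML namespace and flags the two disallowed attributes named above. The function name check_xhtml and the sample document are illustrative assumptions, not part of any Epub tool.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def check_xhtml(text):
    """Return a list of problems found in an XHTML document string."""
    problems = []
    root = ET.fromstring(text.encode("utf-8"))
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("root element is not html in the XHTML namespace")
    for element in root.iter():
        local = element.tag.rsplit("}", 1)[-1]
        # xml:lang is stored under a namespaced key, so a plain "lang" key
        # can only be the attribute that XHTML 1.1 disallows.
        if "lang" in element.attrib:
            problems.append("lang attribute on <%s>: use xml:lang" % local)
        if local in ("a", "map") and "name" in element.attrib:
            problems.append("name attribute on <%s>: use id" % local)
    return problems


# Hypothetical sample document with two deliberate mistakes on the a element.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head><title>Example</title></head>
<body><p><a name="top" lang="en">text</a></p></body>
</html>"""

for problem in check_xhtml(SAMPLE):
    print(problem)
```

The parser used here does not fetch the DTD, so documents that rely on named entities such as &nbsp; would need those replaced by numeric references before this check can read them.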
Any unicode character, in UTF-8 or UTF-16, is allowed. But readers may have limited rendering capabilities.
(ADE 1.0 doesn't support: ­       ‌ ‍ ‎ ‏ ‾ ⌈ ⌉ ⌊ ⌋)
CSS
A subset of CSS 2.1 is supported. A brief summary is awkward to make. For details, see the CSS part of the OPS specification.
Be simple, and use CSS 1 without the following properties:
- background image related:
  - background-image
  - background-repeat
  - background-attachment
  - background-position
  - background
- word-spacing
- letter-spacing
- text-transform
- list-style-image
(There are also a few other minor details unsupported.) And don't use absolute positioning.
The CSS can be linked from the XHTML head, or put in a style element in head.
(ADE 1.0 doesn't support:
- pseudo-classes/elements
- text-align: justify;
- font-variant: small-caps;
- OPS extras:
  - display: oeb-page-head;
  - display: oeb-page-foot;
  - oeb-column-number: [integer];)
Images
The XHTML can have images of the following types:
- image/jpeg
- image/png
- image/gif
- image/svg+xml
Fonts
Use OpenType fonts. Reference them in the CSS with @font-face, eg.:
@font-face { font-family: "Minion Pro"; src: url(MinionPro.otf); }
@font-face { font-family: "Minion Pro"; font-style: italic; src: url(MinionPro-It.otf); }
Other descriptors allowed are: font-variant, font-weight, font-size.
2: Package And Container Files
Make these four files, according to the following descriptions:
- mimetype
- container.xml
- content.opf
- toc.ncx
mimetype
application/epub+zip
It is ASCII, with no trailing end-of-line.
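Text editors often append an end-of-line silently, so it can be safer to write the file from a script. A minimal sketch follows; the helper names write_mimetype and mimetype_is_valid are illustrative assumptions.

```python
import os
import tempfile

MIMETYPE = b"application/epub+zip"


def write_mimetype(path):
    """Write the mimetype file as exact ASCII bytes, with no end-of-line."""
    with open(path, "wb") as f:  # binary mode: no newline is added or translated
        f.write(MIMETYPE)


def mimetype_is_valid(path):
    """True only if the file holds exactly the expected bytes."""
    with open(path, "rb") as f:
        return f.read() == MIMETYPE


# Demonstration in a temporary directory: one good file, one with a newline.
with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "mimetype")
    write_mimetype(good)
    good_ok = mimetype_is_valid(good)

    bad = os.path.join(folder, "mimetype-with-newline")
    with open(bad, "wb") as f:
        f.write(MIMETYPE + b"\n")
    bad_ok = mimetype_is_valid(bad)
```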
Specification: OCF
container.xml
<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
If you rename the content.opf file, or put it somewhere other than where this guide puts it, change the full-path attribute to match.
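To see how a reader might follow full-path to the package file, here is a small sketch that extracts it from container.xml. The function name find_package_paths and the OEBPS subdirectory in the sample are assumptions made for illustration; this guide itself keeps content.opf at the top level.

```python
import xml.etree.ElementTree as ET

OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_MEDIA_TYPE = "application/oebps-package+xml"


def find_package_paths(container_xml):
    """Return the full-path of every OPF rootfile listed in container.xml."""
    root = ET.fromstring(container_xml)
    return [
        rootfile.get("full-path")
        for rootfile in root.iter("{%s}rootfile" % OCF_NS)
        if rootfile.get("media-type") == OPF_MEDIA_TYPE
    ]


# Hypothetical container.xml where content.opf was moved into a subdirectory.
SAMPLE_CONTAINER = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
        media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

print(find_package_paths(SAMPLE_CONTAINER))
```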
Specification: OCF
content.opf
<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="dcidid" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Epub Format Construction Guide</dc:title>
    <dc:language xsi:type="dcterms:RFC3066">en</dc:language>
    <dc:identifier id="dcidid" opf:scheme="URI">http://www.hxa7241.org/articles/content/epup-guide_hxa7241_2007_2.epub</dc:identifier>
    <dc:subject>Non-fiction, technical article, tutorial, Epub, IDPF, ebook</dc:subject>
    <dc:description>A guide for making Epub ebooks/publications, sufficient for most purposes. It requires understanding of XHTML, CSS, XML.</dc:description>
    <dc:relation>http://www.hxa.name/</dc:relation>
    <dc:creator>Harrison Ainsworth / HXA7241</dc:creator>
    <dc:publisher>Harrison Ainsworth / HXA7241</dc:publisher>
    <dc:date xsi:type="dcterms:W3CDTF">2007-12-28</dc:date>
    <dc:date xsi:type="dcterms:W3CDTF">2010-08-27</dc:date>
    <dc:rights>Creative Commons BY-SA 3.0 License.</dc:rights>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml" />
    <item id="css" href="EpubGuide.css" media-type="text/css" />
    <item id="logo" href="hxa7241-logo.svg" media-type="image/svg+xml" />
    <item id="title" href="EpubGuide-title.html" media-type="application/xhtml+xml" />
    <item id="contents" href="EpubGuide-contents.html" media-type="application/xhtml+xml" />
    <item id="intro" href="EpubGuide-intro.html" media-type="application/xhtml+xml" />
    <item id="part1" href="EpubGuide-1.html" media-type="application/xhtml+xml" />
    <item id="part2" href="EpubGuide-2.html" media-type="application/xhtml+xml" />
    <item id="part3" href="EpubGuide-3.html" media-type="application/xhtml+xml" />
    <item id="part4" href="EpubGuide-4.html" media-type="application/xhtml+xml" />
    <item id="specs" href="EpubGuide-specs.html" media-type="application/xhtml+xml" />
  </manifest>
  <spine toc="ncx">
    <itemref idref="title" />
    <itemref idref="contents" />
    <itemref idref="intro" />
    <itemref idref="part1" />
    <itemref idref="part2" />
    <itemref idref="part3" />
    <itemref idref="part4" />
    <itemref idref="specs" />
  </spine>
  <guide>
    <reference type="title-page" title="Title Page" href="EpubGuide-title.html" />
    <reference type="toc" title="Table of Contents" href="EpubGuide-contents.html" />
    <reference type="text" title="Text" href="EpubGuide-intro.html" />
  </guide>
</package>
metadata (publication information)
Add publication information according to DCMI terms. Order is not significant, and duplicates are allowed.
Required terms:
- title
- language: use an RFC3066 language code
- identifier: use a probably unique string; a URI or ISBN would be good
Optional terms:
- creator
- contributor
- publisher
- subject
- description
- date
- type
- format
- source
- relation
- coverage
- rights
Some terms have optional attributes:
- creator, contributor
  - opf:role (see http://www.loc.gov/marc/relators/ for values)
- date
  - opf:event (unstandardised: use something reasonable)
- identifier
  - opf:scheme (unstandardised: use something reasonable)
- date, format, identifier, language, type
  - xsi:type (use an appropriate standard term, such as W3CDTF for date)
- contributor, coverage, creator, description, publisher, relation, rights, source, subject, title
  - xml:lang (use RFC-3066 format)
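The required terms can be checked with a short script. This is a minimal sketch under stated assumptions: the function name check_metadata and the sample package are illustrative, and the second check (that the package unique-identifier attribute names the id of a dc:identifier) reflects how the example content.opf above is wired rather than anything stated in the prose here.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
REQUIRED_TERMS = ("title", "language", "identifier")


def check_metadata(opf_text):
    """Return a list of problems with the required metadata in a content.opf."""
    root = ET.fromstring(opf_text)
    problems = []
    for term in REQUIRED_TERMS:
        values = [e.text for e in root.iter("{%s}%s" % (DC_NS, term))]
        if not any(value and value.strip() for value in values):
            problems.append("missing required term: " + term)
    # The package element points at one dc:identifier through its id.
    uid = root.get("unique-identifier")
    ids = [e.get("id") for e in root.iter("{%s}identifier" % DC_NS)]
    if uid is None or uid not in ids:
        problems.append("unique-identifier does not match any dc:identifier id")
    return problems


# Hypothetical package with the identifier left out.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf"
    unique-identifier="dcidid" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example</dc:title>
    <dc:language>en</dc:language>
  </metadata>
</package>"""

for problem in check_metadata(SAMPLE_OPF):
    print(problem)
```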
manifest (document file list)
List every file that is part of the publication. But not: mimetype, container.xml, content.opf. The order is not significant.
Give the correct mime-type in the media-type attribute. ids are required and must be unique in the content.opf file.
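For publications with many files, the manifest items can be generated rather than typed. The sketch below is one possible approach, not a standard tool: the function name manifest_items, the generated item1, item2 ids, and the extension table are all assumptions, and the table only covers file types that appear in this guide.

```python
import posixpath
from xml.sax.saxutils import quoteattr

# Extension to media-type table, limited to types mentioned in this guide.
MEDIA_TYPES = {
    ".html": "application/xhtml+xml",
    ".css": "text/css",
    ".ncx": "application/x-dtbncx+xml",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".svg": "image/svg+xml",
    ".xpgt": "application/vnd.adobe-page-template+xml",
}

# Files that must not be listed (assuming the layout used in this guide).
NOT_LISTED = {"mimetype", "META-INF/container.xml", "content.opf"}


def manifest_items(hrefs):
    """Return one <item> line per file, with generated unique ids."""
    lines = []
    listed = [href for href in hrefs if href not in NOT_LISTED]
    for number, href in enumerate(listed, start=1):
        extension = posixpath.splitext(href)[1].lower()
        media_type = MEDIA_TYPES[extension]  # KeyError for an unknown type
        lines.append(
            "<item id=%s href=%s media-type=%s />"
            % (quoteattr("item%d" % number), quoteattr(href), quoteattr(media_type))
        )
    return lines


for line in manifest_items(["mimetype", "toc.ncx", "EpubGuide.css", "EpubGuide-1.html"]):
    print(line)
```

Raising KeyError on an unknown extension is deliberate: guessing a media-type would hide the mistake until a reader rejects the file.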
spine (reading order definition)
List all XHTML documents in the manifest (using the idref), and not anything else, and with no duplicates. The order is significant. (XHTML documents can be omitted, but then they must not be linked, referenced or reachable from any part of the publication.)
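The manifest and spine rules lend themselves to an automated check. The following sketch applies the rules as this guide states them (unique manifest ids; spine entries that refer to XHTML manifest items, without duplicates). The function name check_manifest_and_spine and the deliberately faulty sample are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
XHTML_TYPE = "application/xhtml+xml"


def check_manifest_and_spine(opf_text):
    """Return a list of problems found in the manifest and spine."""
    root = ET.fromstring(opf_text)
    problems = []
    media_types = {}  # manifest id -> media-type
    for item in root.iter("{%s}item" % OPF_NS):
        item_id = item.get("id")
        if item_id in media_types:
            problems.append("duplicate manifest id: %s" % item_id)
        media_types[item_id] = item.get("media-type")
    seen = set()
    for itemref in root.iter("{%s}itemref" % OPF_NS):
        idref = itemref.get("idref")
        if idref not in media_types:
            problems.append("spine idref not in manifest: %s" % idref)
        elif media_types[idref] != XHTML_TYPE:
            problems.append("spine item is not XHTML: %s" % idref)
        if idref in seen:
            problems.append("duplicate spine idref: %s" % idref)
        seen.add(idref)
    return problems


# Hypothetical package with three deliberate spine mistakes.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml" />
    <item id="css" href="style.css" media-type="text/css" />
    <item id="part1" href="part1.html" media-type="application/xhtml+xml" />
  </manifest>
  <spine toc="ncx">
    <itemref idref="part1" />
    <itemref idref="css" />
    <itemref idref="part1" />
    <itemref idref="part9" />
  </spine>
</package>"""

for problem in check_manifest_and_spine(SAMPLE_OPF):
    print(problem)
```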
guide (main parts of document)
This section is optional.
Each item references a document file, and can have a fragment id. Allowed types are:
- cover
- title-page
- toc (table of contents)
- index
- glossary
- acknowledgements
- bibliography
- colophon
- copyright-page
- dedication
- epigraph
- foreword
- loi (list of illustrations)
- lot (list of tables)
- notes
- preface
- text
- other.[...]
Specifications: OPF, DCMI
toc.ncx
<?xml version="1.0"?>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN" "http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
    <meta name="dtb:uid" content="http://www.hxa7241.org/articles/content/epup-guide_hxa7241_2007_2.epub"/>
    <meta name="dtb:depth" content="2"/>
    <meta name="dtb:totalPageCount" content="0"/>
    <meta name="dtb:maxPageNumber" content="0"/>
  </head>
  <docTitle>
    <text>Epub Format Construction Guide</text>
  </docTitle>
  <navMap>
    <navPoint id="navPoint-1" playOrder="1">
      <navLabel><text>Title Page</text></navLabel>
      <content src="EpubGuide-title.html"/>
    </navPoint>
    <navPoint id="navPoint-2" playOrder="2">
      <navLabel><text>Table of Contents</text></navLabel>
      <content src="EpubGuide-contents.html"/>
    </navPoint>
    <navPoint id="navPoint-3" playOrder="3">
      <navLabel><text>Introduction</text></navLabel>
      <content src="EpubGuide-intro.html"/>
    </navPoint>
    <navPoint id="navPoint-4" playOrder="4">
      <navLabel><text>1: XHTML Documents</text></navLabel>
      <content src="EpubGuide-1.html"/>
    </navPoint>
    <navPoint id="navPoint-5" playOrder="5">
      <navLabel><text>2: Package And Container Files</text></navLabel>
      <content src="EpubGuide-2.html"/>
      <navPoint id="navPoint-6" playOrder="6">
        <navLabel><text>mimetype</text></navLabel>
        <content src="EpubGuide-2.html#mimetype"/>
      </navPoint>
      <navPoint id="navPoint-7" playOrder="7">
        <navLabel><text>container.xml</text></navLabel>
        <content src="EpubGuide-2.html#containerxml"/>
      </navPoint>
      <navPoint id="navPoint-8" playOrder="8">
        <navLabel><text>content.opf</text></navLabel>
        <content src="EpubGuide-2.html#contentopf"/>
      </navPoint>
      <navPoint id="navPoint-9" playOrder="9">
        <navLabel><text>toc.ncx</text></navLabel>
        <content src="EpubGuide-2.html#tocncx"/>
      </navPoint>
    </navPoint>
    <navPoint id="navPoint-10" playOrder="10">
      <navLabel><text>3: ADE stylesheet</text></navLabel>
      <content src="EpubGuide-3.html"/>
    </navPoint>
    <navPoint id="navPoint-11" playOrder="11">
      <navLabel><text>4: Container Structure</text></navLabel>
      <content src="EpubGuide-4.html"/>
    </navPoint>
    <navPoint id="navPoint-12" playOrder="12">
      <navLabel><text>Specifications List</text></navLabel>
      <content src="EpubGuide-specs.html"/>
    </navPoint>
  </navMap>
</ncx>
head
Set the following meta content attributes:
- uid: set to the unique identifier in content.opf
- depth: set to the depth of the contents tree (in navMap), integer, >=1
- totalPageCount: set to 0
- maxPageNumber: set to 0
navMap
Make a table of contents, optionally hierarchical. (navMap doesn't need to include all XHTML files, since the content.opf spine does.)
navPoint
Set both attributes:
- id: set to be unique in the file
- playOrder: set to an integer, ordered in navMap, starting at 1
Set sub-parts:
- the content of text in navLabel
- the src attribute in content: set to a URI of one of the XHTML files (fragment id allowed)
navPoints nested in navPoints are allowed.
(The Sony Reader, and perhaps others, have an extra restriction: fragment ids (in src attributes of contents) are not allowed in top-level (non-nested) navPoints.)
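The navMap rules can be verified with a short script, which helps when navPoints are renumbered by hand. This is a sketch under stated assumptions: the function name check_ncx is illustrative, playOrder is assumed to run 1, 2, 3, ... in document order as in the example file above, and the top-level fragment check implements the Sony Reader restriction, which other readers may not share.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def _q(name):
    """Qualify an element name with the NCX namespace."""
    return "{%s}%s" % (NCX_NS, name)


def check_ncx(ncx_text):
    """Return a list of problems found in a toc.ncx."""
    root = ET.fromstring(ncx_text)
    problems = []
    ids = []
    play_orders = []

    def walk(parent, level):
        """Visit navPoints in document order; return the deepest nesting seen."""
        deepest = level
        for point in parent.findall(_q("navPoint")):
            ids.append(point.get("id"))
            play_orders.append(int(point.get("playOrder")))
            src = point.find(_q("content")).get("src")
            if level == 0 and "#" in src:
                problems.append("fragment id in top-level navPoint: " + src)
            deepest = max(deepest, walk(point, level + 1))
        return deepest

    depth = walk(root.find(_q("navMap")), 0)
    if len(set(ids)) != len(ids):
        problems.append("navPoint ids are not unique")
    if play_orders != list(range(1, len(play_orders) + 1)):
        problems.append("playOrder values are not 1, 2, 3, ... in document order")
    declared = [m.get("content") for m in root.iter(_q("meta"))
                if m.get("name") == "dtb:depth"]
    if declared != [str(depth)]:
        problems.append("dtb:depth should be %d" % depth)
    return problems


# Hypothetical NCX: correct except for a fragment id in a top-level navPoint.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:depth" content="2"/></head>
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>One</text></navLabel><content src="one.html"/>
      <navPoint id="n2" playOrder="2">
        <navLabel><text>One A</text></navLabel><content src="one.html#a"/>
      </navPoint>
    </navPoint>
    <navPoint id="n3" playOrder="3">
      <navLabel><text>Two</text></navLabel><content src="two.html#start"/>
    </navPoint>
  </navMap>
</ncx>"""

for problem in check_ncx(SAMPLE_NCX):
    print(problem)
```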
Specification: NCX
3: ADE stylesheet
Optionally, make this file if you want extra control of column appearance with ADE 1.0:
page-template.xpgt
Add a link in the head of XHTML files to be styled:
<link rel="stylesheet" type="application/vnd.adobe-page-template+xml" href="page-template.xpgt"/>
Will the publication then be non-conformant? Non-standard files can be included (like fonts), but must have proper fallback handling. The standard implies that all fallback behaviour is explicitly standardised (in IDPF or component standards). For stylesheets, HTML rules say readers should ignore unrecognized types. And that would very likely happen. So it seems conformant, and safe.
page-template.xpgt
<ade:template xmlns="http://www.w3.org/1999/xhtml" xmlns:ade="http://ns.adobe.com/2006/ade" xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="single_column" margin-bottom="2em" margin-top="2em" margin-left="2em" margin-right="2em">
      <fo:region-body/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="single_column_head" margin-bottom="2em" margin-top="2em" margin-left="2em" margin-right="2em">
      <fo:region-before extent="8em"/>
      <fo:region-body margin-top="8em"/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="two_column" margin-bottom="2em" margin-top="2em" margin-left="2em" margin-right="2em">
      <fo:region-body column-count="2" column-gap="3em"/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="two_column_head" margin-bottom="2em" margin-top="2em" margin-left="2em" margin-right="2em">
      <fo:region-before extent="8em"/>
      <fo:region-body column-count="2" margin-top="8em" column-gap="3em"/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="three_column" margin-bottom="2em" margin-top="2em" margin-left="2em" margin-right="2em">
      <fo:region-body column-count="3" column-gap="3em"/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="three_column_head" margin-bottom="2em" margin-top="2em" margin-left="2em" margin-right="2em">
      <fo:region-before extent="8em"/>
      <fo:region-body column-count="3" margin-top="8em" column-gap="3em"/>
    </fo:simple-page-master>
    <fo:page-sequence-master>
      <fo:repeatable-page-master-alternatives>
        <fo:conditional-page-master-reference master-reference="three_column_head" page-position="first" ade:min-page-width="80em"/>
        <fo:conditional-page-master-reference master-reference="three_column" ade:min-page-width="80em"/>
        <fo:conditional-page-master-reference master-reference="two_column_head" page-position="first" ade:min-page-width="50em"/>
        <fo:conditional-page-master-reference master-reference="two_column" ade:min-page-width="50em"/>
        <fo:conditional-page-master-reference master-reference="single_column_head" page-position="first"/>
        <fo:conditional-page-master-reference master-reference="single_column"/>
      </fo:repeatable-page-master-alternatives>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <ade:style>
    <ade:styling-rule selector="#header" display="adobe-other-region" adobe-region="xsl-region-before"/>
  </ade:style>
</ade:template>
The selector attribute in ade:style/ade:styling-rule refers to a CSS selector. There is more detail at: http://blogs.adobe.com/digitaleditions/template.html
Specification: unknown
4: Container Structure
Arrange all files in the following directory structure:
EpubGuide
  META-INF
    container.xml
  mimetype
  content.opf
  toc.ncx
  EpubGuide.css
  hxa7241-logo.svg
  EpubGuide-title.html
  EpubGuide-contents.html
  EpubGuide-intro.html
  EpubGuide-1.html
  EpubGuide-2.html
  EpubGuide-3.html
  EpubGuide-4.html
  EpubGuide-specs.html
(META-INF and its contents are special, but all other files can be arranged into any subdirectory structure. All references to them, in the various files, may have to be adjusted though.)
Then zip them into an archive with Zip. The filename extension should be ‘epub’, and the mimetype file must be first (and uncompressed), and extra file attributes must be excluded:
zip -X0 EpubGuide-hxa7241.epub mimetype
zip -Xur9D EpubGuide-hxa7241.epub *
(Get Zip from: ftp://ftp.info-zip.org/pub/infozip/ or http://www.info-zip.org/Zip.html .)
Other zip programs can probably be used, if they can do the same things.
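As one such alternative, the archive can be built with Python's standard zipfile module. This is a sketch, not a replacement for the commands above: the function name build_epub and the placeholder file contents in the demonstration are assumptions. It stores mimetype first and uncompressed, then deflates everything else; to my understanding zipfile writes no extra file attributes for ordinary small files, but that is worth confirming for your Python version.

```python
import os
import tempfile
import zipfile


def build_epub(source_dir, epub_path):
    """Zip source_dir into epub_path, with mimetype first and uncompressed.

    epub_path should be outside source_dir, or the archive would include itself.
    """
    with zipfile.ZipFile(epub_path, "w") as archive:
        archive.write(os.path.join(source_dir, "mimetype"), "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for folder, subfolders, files in os.walk(source_dir):
            subfolders.sort()  # fixed traversal order, for repeatable archives
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                archive_name = os.path.relpath(full_path, source_dir)
                archive_name = archive_name.replace(os.sep, "/")
                if archive_name != "mimetype":  # already added above
                    archive.write(full_path, archive_name,
                                  compress_type=zipfile.ZIP_DEFLATED)


# Demonstration with placeholder files (not a complete, valid publication).
with tempfile.TemporaryDirectory() as work:
    source = os.path.join(work, "EpubGuide")
    os.makedirs(os.path.join(source, "META-INF"))
    placeholders = [
        ("mimetype", b"application/epub+zip"),
        ("META-INF/container.xml", b"<container/>"),
        ("content.opf", b"<package/>"),
    ]
    for name, data in placeholders:
        with open(os.path.join(source, *name.split("/")), "wb") as f:
            f.write(data)

    epub = os.path.join(work, "EpubGuide-hxa7241.epub")
    build_epub(source, epub)

    with zipfile.ZipFile(epub) as archive:
        entries = archive.namelist()
        first = archive.infolist()[0]
        first_is_stored = first.compress_type == zipfile.ZIP_STORED
        first_content = archive.read(first)
```

Only files are added, never directory entries, which matches the effect of the D option in the commands above.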
(The Sony Reader, and perhaps others, have an extra requirement: each HTML file must be < 300KB and < 100KB when zipped.)
Specification: OCF
Specifications List
- IDPF: http://www.idpf.org/specs.htm
  - Open Publication Structure (OPS) 2.0 v1.0: http://www.idpf.org/2007/ops/OPS_2.0_final_spec.html
  - Open Packaging Format (OPF) 2.0 v1.0: http://www.idpf.org/2007/opf/OPF_2.0_final_spec.html
  - OEBPS Container Format (OCF) v1.0: http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm
- ANSI/NISO Z39.86 - 2005 Specifications for the Digital Talking Book, NCX part (NCX): http://www.niso.org/standards/resources/Z39-86-2005.html#NCX
- DCMI Metadata Terms 2006-12-18 (DC): http://dublincore.org/documents/2006/12/18/dcmi-terms/
- XHTML 1.1: http://www.w3.org/TR/xhtml11/
- CSS 2.1: http://www.w3.org/TR/CSS21/
- XML 1.0: http://www.w3.org/TR/xml/
Metadata
(TXON)
DC:`title:`Epub Format Construction Guide`creator:`Harrison Ainsworth`date:`2007-12-28`date:`2010-08-27`description:`A guide for making Epub ebooks/publications, sufficient for most purposes. It requires understanding of XHTML, CSS, XML.`subject:`Epub, IDPF, ebook`language:`en-GB`type:`technical article`relation:`http://www.hxa.name/`identifier:`http://www.hxa.name/articles/content/epub-guide_hxa7241_2007.html`rights:`Creative Commons BY-SA 3.0 License` `