Bookworm will not reject valid ePub – but are you valid?

Liza Daly from threepress.org has just released an article outlining the problems she is having with users uploading invalid ePub documents to Bookworm, an online ePub book reader. It’s very important for anyone developing ePub eBooks to produce valid markup. Not only will Bookworm render your book as intended, but you’ll also be covered for any future rendering engines and any conversions you might need to do.

It’s actually quite surprising how many errors are showing up in files submitted to Bookworm. You should go over to the threepress blog for a full explanation, but here’s a list of the main errors:

  1. Missing required attributes in the metadata
  2. Metadata that hasn’t been proofread
  3. Improper nesting of the ePub zip file
  4. Items declared in the OPF file that are missing from the archive
  5. Invalid XHTML

Points 1 to 4 are really quite vital, although it is understandable for many documents to have invalid XHTML. Still, if it is within your means, I would try to control this as best you can.
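As a rough illustration of points 3 and 4, here is a minimal Python sketch that checks the archive layout and confirms that every item declared in the OPF manifest is actually present. It is not how Bookworm itself validates uploads, and the function name `check_epub_structure` is my own invention; it also does not check metadata or XHTML validity.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
OPF_NS = "{http://www.idpf.org/2007/opf}"


def check_epub_structure(path):
    """Return a list of problems found in the archive layout (hypothetical helper)."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        names = set(zf.namelist())

        # The mimetype file must be the first entry and must not be compressed.
        if not infos or infos[0].filename != "mimetype":
            problems.append("mimetype is not the first entry in the archive")
        else:
            if infos[0].compress_type != zipfile.ZIP_STORED:
                problems.append("mimetype entry is compressed")
            if zf.read("mimetype") != b"application/epub+zip":
                problems.append("mimetype has unexpected content")

        # META-INF/container.xml tells the reader where the OPF file lives.
        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
            return problems
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = next(container.iter(f"{CONTAINER_NS}rootfile"), None)
        opf_path = rootfile.get("full-path") if rootfile is not None else None
        if not opf_path or opf_path not in names:
            problems.append("OPF file named in container.xml is missing")
            return problems

        # Manifest hrefs are relative to the folder that holds the OPF file.
        opf = ET.fromstring(zf.read(opf_path))
        base = posixpath.dirname(opf_path)
        for item in opf.iter(f"{OPF_NS}item"):
            href = unquote(item.get("href", ""))
            target = posixpath.normpath(posixpath.join(base, href))
            if target not in names:
                problems.append(f"manifest item missing from archive: {target}")
    return problems
```

Calling `check_epub_structure("mybook.epub")` (the file name is only an example) returns an empty list when none of these structural problems are found.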

I have plans to write some detailed articles regarding the creation of both the NCX and OPF files found in an ePub document, so keep a lookout for those.

Creating an ePub document from XHTML

In my last post I talked about the epubBooks Project: my plan to convert Project Gutenberg .txt eBooks to the ePub format and make them available for download from ePubBooks.com.

I already have in place a converter to transform the PG .txt files to a TEI Master Format and also an XSLT script to convert these into XHTML. The final task now is to create a converter for TEI to the ePub format.

Before I attempt to write this converter I need a much better understanding of how a book is laid out inside the ePub OEBPS Container Format (OCF) .zip archive. So I set about taking my XHTML output file and breaking it up into the appropriate parts, ready to be packaged into an .epub file.
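To give an idea of the packaging step, here is a minimal Python sketch that zips up a folder of prepared parts. It is not my actual converter, and the function name `package_epub` and the folder layout are assumptions for illustration. The one firm requirement it demonstrates is that the `mimetype` file goes into the archive first and uncompressed.

```python
import os
import zipfile


def package_epub(source_dir, epub_path):
    """Zip a prepared folder into an .epub file (hypothetical helper)."""
    with zipfile.ZipFile(epub_path, "w") as zf:
        # mimetype must be the first entry and must be stored, not deflated.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)

        # Everything else (META-INF/container.xml, the OPF, NCX, XHTML, CSS...)
        # can be compressed as normal.
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                zf.write(full_path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

For example, `package_epub("mybook", "mybook.epub")` would package a folder called `mybook` (an assumed name) that already contains `META-INF/container.xml` and the content files.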

On the whole this went fairly smoothly, although I did encounter a couple of issues, which I’ll explain at the end of this article.
