A few days ago I had the pleasure of being invited up to Stockholm to sit with a bunch of like-minded people and talk about eBooks – specifically the ePub format. It was a very eye-opening experience.
I was invited to Sweden by Publit, a company who have set themselves the task of making all the Swedish out-of-print books available as PoD (Print on Demand) titles. Considering that 95% of all Swedish books ever in existence are now out of print, this is a very worthy project, if perhaps somewhat daunting. Although Publit’s main business is PoD, they are making use of this opportunity to also provide these titles as ePub eBooks.
During my time in Sweden we discussed the many different areas of the eBook world, including DRM (of course), the processes involved in going from scanned document (TIFF/PDF/DOC) to an eBook Master format and onto ePub creation itself.
Now, the people at Publit are a group of very talented individuals with plenty of technical knowledge, yet there were aspects of ePub which have left them somewhat perplexed. Two of their questions struck me as interesting, and I have heard both before around the web, so I thought I would share them here.
“What flavours of ePub exist?” There is only one flavour of ePub, although it does currently support two different core formats: XHTML and DTBook (Daisy Talking Book). I won’t go further into what makes an ePub here, as Jon Noring has already written an excellent article over at Teleread.org: ePub Demystified.
They were also asking if I thought “the next release of ePub would have more advanced features?” (meaning video and Flash media). The answer to this question is that video and Flash, along with audio, are already possible.
The ePub standard (OPS) can already use these types of media because the standard is built upon XHTML, a standard that already supports advanced media. The problem arises not from ePub but from the reading systems’ ability to render these advanced features.
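As a rough illustration of what this looks like in practice, here is a minimal Python sketch that builds an XHTML content document embedding a video through the standard XHTML object element, with fallback text nested inside for reading systems that cannot render it. The function name media_chapter, the file path and the wording of the fallback are my own assumptions for the example, and whether any particular reading system actually plays the video is exactly the open question.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def media_chapter(video_href, media_type, fallback_text):
    """Return an XHTML document (as a string) that embeds a media file.

    Hypothetical helper for illustration only: the media goes in an
    <object> element, and the nested <p> is what a reading system would
    show if it cannot render the media type.
    """
    # Serialize XHTML elements without a namespace prefix.
    ET.register_namespace("", XHTML_NS)

    def q(tag):
        # Qualify a tag name with the XHTML namespace.
        return "{%s}%s" % (XHTML_NS, tag)

    html = ET.Element(q("html"))
    head = ET.SubElement(html, q("head"))
    ET.SubElement(head, q("title")).text = "Chapter with video"
    body = ET.SubElement(html, q("body"))

    # The media itself, with its fallback content nested inside.
    obj = ET.SubElement(body, q("object"),
                        {"data": video_href, "type": media_type})
    ET.SubElement(obj, q("p")).text = fallback_text

    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    # Assumed example path; not taken from any real book.
    print(media_chapter("video/intro.mp4", "video/mp4",
                        "Your reading system cannot play this video."))
```

Nothing in the markup is exotic; it is ordinary XHTML, which is why the limitation sits with the reading system rather than with the format.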
ePub can do more than most people think; the main restriction is the reading system not the format.
I guess the question should be: when will the reading systems allow us to use more advanced media?
We also had a number of discussions on Master Formats (TEI, DTBook, DocBook, etc.) and which is the best to go for. That’s a difficult question but one thing that ties in with my recent thoughts is the question as to whether we can use the native DTBook format not only as the end user ePub format, but also as the eBook master. I will be looking into this further myself but if anyone has any thoughts on the use of DTBook then please share.
Admittedly, I’m an ePub newb, but after researching it a bit, I came to the understanding that one of the purposes of ePub is that it can be both a master format AND a format for distribution. (I read this in a presentation about ePub, which is available on the IDPF website.) As such, isn’t the question of “master format” mostly pointless?
Furthermore, my feelings on XHTML or DTBook are 99% in favor of XHTML. (I haven’t read much on DTBook, however.) First, XHTML is MUCH more widespread, well understood, and therefore simpler for developers (of reading systems) to implement in their software–either from scratch or from using pre-existing XHTML libraries. Second, for the same reasons, XHTML is easier for the novice ePub creator to use. Third, I’d question the long-term relevance of DTBook, while I have some faith that XHTML will be around for a long time.
A final consideration is this: the reading of ePub in the browser. Using XHTML dramatically lowers the barrier to reading ePub in browsers. Why would we want to read our ePub books in an internet browser? To have fantastically simple support for “advanced” features like video and Flash. As ePub is, basically, similar to a webpage, I believe it would be wise not to alienate one’s “master files” from the already widespread reading systems known as web browsers.
In terms of advanced features, although the ePub spec allows video and other media types, getting these media commonly supported may take making them a requirement for reading systems, as PNG etc. are required now. Of course, there’s also the question of hardware and file size. Hardware, particularly e-paper screens, is just not suitable for video content, and these e-paper devices are becoming more and more popular. Video and Flash would also impact battery life greatly. Adding video, sound, etc. to ebooks would also increase the file size, making them more troublesome for consumers to download, and making “free” over-the-air downloads, such as those that make the Kindle so popular, less likely. 1MB files–ok; 200MB files–might make wireless carriers cringe.
That said, I feel that there won’t be much demand for “enhanced” ebooks for the time being, except within the realm of how-to, technical books, school text books, and possibly magazines. Those who read novels for pleasure probably won’t be fans of obtrusive videos or sounds.
whew… =)
As master-format, I think ePub is ill-advised, and for a number of reasons:
1. The standard duplicates some information (meta-data) in various places. Keeping this consistent will be a headache.
2. Although it basically is a standard zip file with standard XHTML inside, direct editing still remains somewhat complex, due to the special requirements put on the zip format, which not all zip applications satisfy.
3. XHTML is really not rich enough to include various semantics you want to include in master-files. It tends to be used to reflect typographic instead of logical structure of the text.
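To illustrate point 2 above: the packaging rule that trips up generic zip tools is, as far as I know, that the file named mimetype must be the first entry in the archive, stored without compression and with no extra field, and must contain exactly application/epub+zip. A minimal Python sketch, assuming only the standard zipfile module; build_epub and mimetype_ok are hypothetical helper names, and the files passed in are placeholders rather than a complete, valid book.

```python
import io
import zipfile

MIMETYPE = b"application/epub+zip"


def build_epub(files):
    """Zip a dict of {path: bytes} into ePub-style container bytes.

    The mimetype entry is written first and uncompressed; everything
    else is written afterwards and may be deflated.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # First entry: stored (no compression), no extra field.
        zf.writestr(zipfile.ZipInfo("mimetype"), MIMETYPE,
                    compress_type=zipfile.ZIP_STORED)
        for path, data in files.items():
            zf.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def mimetype_ok(epub_bytes):
    """Check the mimetype entry the way a strict reader might.

    With an 8-character file name and no extra field, the stored
    content starts at byte offset 38 of the archive.
    """
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        entries = zf.infolist()
        if not entries:
            return False
        first = entries[0]
        if first.filename != "mimetype":
            return False
        if first.compress_type != zipfile.ZIP_STORED:
            return False
    # Raw byte check: file name at offset 30, content at offset 38.
    return (epub_bytes[30:38] == b"mimetype"
            and epub_bytes[38:58] == MIMETYPE)


if __name__ == "__main__":
    # Placeholder content, not a complete book.
    book = build_epub({"META-INF/container.xml": b"<container/>"})
    print(mimetype_ok(book))
```

A zip application that sorts entries alphabetically or compresses everything by default will produce an archive that fails this check, which is why re-zipping a hand-edited ePub so often breaks it.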
For existing books, I’ve been using the TEI format as master format for years, and I am quite happy with it. It allows me to encode almost everything I encounter in real-life books. For new books DocBook might be more appropriate, depending on the editor’s preferences.