Formatting Poetry for EPUB and Small Devices

This tutorial discusses how to format poetry in XHTML (which underlies EPUB) so that it looks nice on smartphone screens, that is, when many or even all of the lines do not fit the screen width. In other words, our concern is how to break poetry lines nicely.

We do not discuss poems that use non-standard formatting (Lewis Carroll’s “Fury said to a mouse”, shaped like a twisting tail, is a good example of what we are not talking about here); each poem of this sort is a separate formatting problem of an artistic rather than technical nature. What we consider here are poems that use some sort of conventional formatting. The examples used further in this tutorial are from Shakespeare, from Horace, and, for a more specific formatting convention, from Beowulf.

It is clearly unacceptable to use plain left-aligned text with no style modification: the result of line breaking will be rather ugly, as shown below (a few lines from Shakespeare’s Henry V are used):

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

Centering everything (a solution frequently met in actual e-books) is better, but still far from perfect:

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

Let us now proceed to another variant, which is also used widely enough and which I think to be a reasonable basic approach to the problem, and then see what we can add to it.

1. The Basic Way

The idea in this basic approach is simply to use different indentation for the first and the subsequent portions of every poetic line. It has already been proposed several times by different people online; I suppose it is usually the first thing better than just-center-it-all that comes to one’s mind (after one figures out that, alas, the text-align CSS property cannot be defined separately for the first-line selector).

In terms of CSS properties, it means that the text should have a non-zero left margin, and the first line of each paragraph should be negatively indented with respect to the rest of the text (each paragraph being a poetic line rendered, depending on the screen width, as one or more lines of text on the screen):

.poemLine {
font-size: 1em;
margin: 0;
margin-left: 2.5em;
text-indent: -2.5em;
}

If different lines of the poem have different indentation, it is possible either to define several style classes, or just to add some non-breaking spaces (&nbsp;) at the beginning of the indented lines.
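For longer poems, the non-breaking-space variant can be generated rather than typed. The following is a minimal Python sketch, not part of the original tutorial: the function name `poem_line_to_xhtml` and the one-entity-per-leading-space convention are assumptions made for illustration.

```python
from html import escape

NBSP = "&nbsp;"  # the entity used throughout this tutorial


def poem_line_to_xhtml(line: str, css_class: str = "poemLine") -> str:
    """Wrap one poetic line in a <p>, turning each leading space into an NBSP."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)          # number of leading spaces
    body = escape(stripped.rstrip(), quote=False)  # escape &, < and > only
    return f'<p class="{css_class}">{NBSP * indent}{body}</p>'
```

Each plain-text line of the poem is passed through the function once; the `css_class` parameter (an assumed name) lets the same helper serve other line styles.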

Now let’s format the same few lines from Henry V using this style.

<p class="poemLine">This day is call’d the feast of Crispian.</p>
<p class="poemLine">He that outlives this day, and comes safe home,</p>
<p class="poemLine">Will stand a tip-toe when this day is named,</p>
<p class="poemLine">And rouse him at the name of Crispian.</p>

Below you see how it will be rendered…

…on a wider page:

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

…on a narrower page:

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

…on an even narrower page:

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

The result looks better than the previous ones. However, to my taste it is still not as readable as desired. This solution is nearly perfect for the cases when only a small percentage of lines do not fit the screen width (see the second of the three screen widths in the example above). However, when smartphones are in question, splitting a line into two is no longer a rare emergency to be handled just ‘nicely enough’, but rather something that happens to nearly every line (see the last of the three screen widths in the example above). For such cases this formatting is, to my taste at least, good enough to consult the text but not really good enough to enjoy it with convenience. So what can be done about that?

2. Structuring the Text

The way to overcome the lack of structure is, certainly, to add some. In our case apparently a reasonable thing to do in this direction is to allow line breaks only at pre-selected positions which correspond to pauses, or to boundaries between phrases, or to something else of the same sort. An easy way to do it is to replace the white-spaces which should not become line breaks with non-breaking spaces (&nbsp;).

An alternative could be to use the white-space property in CSS, but support for this property in EPUB reading software is, in my experience, not guaranteed in practice; yet another relevant markup element, the nobr tag, is non-standard and obsolete and should be avoided.

So, let us format the same piece of poetry once more, now with non-breaking spaces enforcing some logic in line-breaking:

<p class="poemLine">This day is call’d the&nbsp;feast&nbsp;of&nbsp;Crispian.</p>
<p class="poemLine">He that outlives this day, and&nbsp;comes&nbsp;safe&nbsp;home,</p>
<p class="poemLine">Will stand a tip-toe when&nbsp;this&nbsp;day is&nbsp;named,</p>
<p class="poemLine">And rouse him at&nbsp;the&nbsp;name&nbsp;of&nbsp;Crispian.</p>

The difference created by structuring the text with non-breaking spaces is seen on narrow pages:

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

…and:

This day is call’d the feast of Crispian.

He that outlives this day, and comes safe home,

Will stand a tip-toe when this day is named,

And rouse him at the name of Crispian.

The choice of ‘permitted line-breaking places’ is surely rather subjective, and may be made differently. In any case, to my eye, the text is both easier to read and more pleasant to see when formatted this way.

A drawback of the approach is the amount of manual markup needed to format a long piece of poetry this way. However, such or similar ‘chunking’ of poetry lines into non-breakable fragments can be done automatically, or semi-automatically. Here is a sample set of rules for English (and a Perl script based on these rules – link below) which would produce tolerable results in many cases:

  1. If application of any of the following rules results in a non-breakable block longer than 25 characters, this rule is skipped. The choice of the number ‘25’ reflects my understanding of what should fit on any screen and is rather subjective; you may use whatever number seems reasonable to you in its place.
  2. With the above restriction in place, the following rules are attempted, in the given order:
    1. If there is a space before a dash, it is made non-breaking.
    2. If there are punctuation marks in the line before the end of line, everything after the last such punctuation mark is transformed into a single non-breakable block.
    3. If an article (‘a’, ‘an’, ‘the’) or a demonstrative (‘this’, ‘that’, etc.) is not followed by a punctuation mark, the following white space is replaced by a non-breaking space.
    4. Next, the same is applied to the word ‘no’ (the word ‘not’ cannot be processed that easily, for it may be connected, in different cases, both with the previous word and with the next one, e.g. ‘not knowing of something’ vs. ‘Somebody knows not of something’).
    5. Next, the same is applied to interrogative words (‘who’, ‘when’, etc.).
    6. Next, the same is applied to conjunctions (‘and’, ‘but’, etc.).
    7. Next, the same is applied to prepositions (‘of’, ‘from’, etc.).
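
As an illustration only (this is not the Perl script mentioned above, and it may differ from it in details), the rules can be sketched in Python roughly as follows. The word lists are short samples, and the sketch assumes that the 25-character limit is checked for each individual application of a rule; both are assumptions rather than part of the rules themselves.

```python
MAX_BLOCK = 25  # the (subjective) limit from general rule 1
NBSP = "&nbsp;"
DASHES = ("-", "\u2013", "\u2014")  # hyphen-minus, en dash, em dash
PUNCT = ".,;:!?"

# Sample word lists for sub-rules 3-7, in the order given; extend to taste.
WORD_RULES = [
    {"a", "an", "the", "this", "that", "these", "those"},
    {"no"},
    {"who", "whom", "whose", "what", "when", "where", "why", "how", "which"},
    {"and", "but", "or", "nor"},
    {"of", "from", "to", "in", "on", "at", "by", "with", "for"},
]


def block_len(tokens, glued, i):
    """Length in characters of the unbreakable block containing tokens[i]."""
    lo = hi = i
    while lo > 0 and glued[lo - 1]:
        lo -= 1
    while hi < len(tokens) - 1 and glued[hi]:
        hi += 1
    return len(" ".join(tokens[lo:hi + 1]))


def try_glue(tokens, glued, positions):
    """Glue the spaces after the given tokens; undo if a block gets too long."""
    if not positions:
        return
    saved = glued[:]
    for p in positions:
        glued[p] = True
    if any(block_len(tokens, glued, p) > MAX_BLOCK for p in positions):
        glued[:] = saved  # general rule 1: skip this application


def chunk_line(line):
    tokens = line.split()
    glued = [False] * max(len(tokens) - 1, 0)  # glued[i]: space after tokens[i]
    # Sub-rule 1: a space before a dash becomes non-breaking.
    for i in range(len(tokens) - 1):
        if tokens[i + 1].startswith(DASHES):
            try_glue(tokens, glued, [i])
    # Sub-rule 2: everything after the last line-internal punctuation mark.
    inner = [i for i in range(len(tokens) - 1) if tokens[i][-1] in PUNCT]
    if inner:
        try_glue(tokens, glued, list(range(inner[-1] + 1, len(tokens) - 1)))
    # Sub-rules 3-7: a listed word with no punctuation attached to it.
    for words in WORD_RULES:
        for i in range(len(tokens) - 1):
            if tokens[i].lower() in words:
                try_glue(tokens, glued, [i])
    out = tokens[0] if tokens else ""
    for i, is_glued in enumerate(glued):
        out += (NBSP if is_glued else " ") + tokens[i + 1]
    return out
```

For the second Henry V line this sketch yields `He that&nbsp;outlives this&nbsp;day, and&nbsp;comes&nbsp;safe&nbsp;home,`, which is close to but not the same as the hand-made markup shown earlier; that is the kind of difference manual post-correction is meant to catch.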

Download: poemnobr.pl (5 KB)

After such automatic processing, some manual post-correction may be applied, which, in turn, may eventually lead the editor to formulating additional automatic rules to her/his own liking. It may be potentially interesting also to consider more advanced text analysis for automatic chunking of poetry lines; e.g. syntactic parsing may be helpful in deciding which portions of text should be ‘kept together’.

3. Some special cases

This section is dedicated to special sorts of poems, and if you are not interested in them, you may just skip it.

3.1. Greek and Roman Classics: Handling Formal Cæsuræ

For the metrical forms with clear cæsuræ it may be good for readability to enforce those as the only allowed line-breaking positions. Like this:

<p class="poemLine">Seek&nbsp;not&nbsp;thou&nbsp;to&nbsp;enquire, (who&nbsp;can&nbsp;reveal?) when,&nbsp;my&nbsp;Leuconoe,</p>
<p class="poemLine">For&nbsp;us&nbsp;either&nbsp;an&nbsp;end Heaven&nbsp;has&nbsp;assigned; nor&nbsp;Babylonian</p>
<p class="poemLine">Numbers&nbsp;seek&nbsp;to&nbsp;essay! Far&nbsp;better&nbsp;is’t, what&nbsp;shall&nbsp;arrive,&nbsp;to&nbsp;bear!</p>

(The example is taken from the Ode 1.11 by Horace, as translated by Arthur Hugh Clough)

Below you see how it will be rendered…

…on a wider page:

Seek not thou to enquire, (who can reveal?) when, my Leuconoe,

For us either an end Heaven has assigned; nor Babylonian

Numbers seek to essay! Far better is’t, what shall arrive, to bear!

…on a narrower page:

Seek not thou to enquire, (who can reveal?) when, my Leuconoe,

For us either an end Heaven has assigned; nor Babylonian

Numbers seek to essay! Far better is’t, what shall arrive, to bear!

…on an even narrower page:

Seek not thou to enquire, (who can reveal?) when, my Leuconoe,

For us either an end Heaven has assigned; nor Babylonian

Numbers seek to essay! Far better is’t, what shall arrive, to bear!
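
Typing that many entities by hand is error-prone, so the cæsura markup can also be produced mechanically. The sketch below is only an illustration: it assumes the editor first marks each cæsura in the plain text with a `|` character (a convention invented here, as is the function name), and then turns every other space into a non-breaking one.

```python
NBSP = "&nbsp;"


def glue_except_caesurae(line: str, marker: str = "|") -> str:
    """Keep a breakable space only where the marker stood; glue all other spaces."""
    segments = [segment.split() for segment in line.split(marker)]
    # Words inside a segment are joined with NBSPs, segments with plain spaces.
    return " ".join(NBSP.join(words) for words in segments if words)
```

Applied to `For us either an end | Heaven has assigned; | nor Babylonian`, it reproduces the content of the second paragraph in the markup above.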

3.2. Old English Poems

One more special case: Old English poems. Here, obviously, the most appropriate line-breaking position is right before the long space which separates the half-lines. Note that in this case the long space itself will provide the necessary indentation for the second half-line, so no special provisions for indentation are needed, and the CSS style should look as follows:

.OEpoemLine {
font-size: 1em;
margin: 0;
text-indent: 0;
}

Then the text is formatted as follows:

<p class="OEpoemLine">Ðá&nbsp;him&nbsp;Hróþgár&nbsp;gewát &nbsp;&nbsp;&nbsp;&nbsp;mid&nbsp;his&nbsp;hæleþa&nbsp;gedryht</p>
<p class="OEpoemLine">eodur&nbsp;Scyldinga &nbsp;&nbsp;&nbsp;&nbsp;út&nbsp;of&nbsp;healle·</p>
<p class="OEpoemLine">wolde&nbsp;wígfruma &nbsp;&nbsp;&nbsp;&nbsp;Wealhþéo&nbsp;sécan</p>
<p class="OEpoemLine">cwén&nbsp;tó&nbsp;gebeddan· &nbsp;&nbsp;&nbsp;&nbsp;hæfde&nbsp;kyningwuldor</p>
<p class="OEpoemLine">Grendle&nbsp;tógéanes· &nbsp;&nbsp;&nbsp;&nbsp;swá&nbsp;guman&nbsp;gefrungon·</p>

(The example is taken from Beowulf)

Below you see how it will be rendered…

…on a wider page:

Ðá him Hróþgár gewát     mid his hæleþa gedryht

eodur Scyldinga     út of healle·

wolde wígfruma     Wealhþéo sécan

cwén tó gebeddan·     hæfde kyningwuldor

Grendle tógéanes·     swá guman gefrungon·

…on a narrower page:

Ðá him Hróþgár gewát     mid his hæleþa gedryht

eodur Scyldinga     út of healle·

wolde wígfruma     Wealhþéo sécan

cwén tó gebeddan·     hæfde kyningwuldor

Grendle tógéanes·     swá guman gefrungon·

…on an even narrower page:

Ðá him Hróþgár gewát     mid his hæleþa gedryht

eodur Scyldinga     út of healle·

wolde wígfruma     Wealhþéo sécan

cwén tó gebeddan·     hæfde kyningwuldor

Grendle tógéanes·     swá guman gefrungon·
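
The half-line markup follows a fixed pattern (glued first half, one ordinary space, a run of non-breaking spaces, glued second half), so it lends itself to generation. Here is a minimal Python sketch under that assumption; the function name `oe_line` and the `gap` parameter are hypothetical, and four non-breaking spaces are used simply because the example above uses four.

```python
NBSP = "&nbsp;"


def oe_line(first_half: str, second_half: str, gap: int = 4) -> str:
    """Build one Old English line that can break only before the long space."""
    a = NBSP.join(first_half.split())
    b = NBSP.join(second_half.split())
    # The single ordinary space is the only permitted break position; the NBSP
    # run after it stays with the second half-line and serves as its indent.
    return f'<p class="OEpoemLine">{a} {NBSP * gap}{b}</p>'
```

Calling `oe_line("eodur Scyldinga", "út of healle·")` gives the second paragraph of the markup above.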

That said and shown, it remains only to wish the readers good luck with formatting whatever they want.


Author: Anton Bryl

I am a computer scientist (Machine Learning, Natural Language) and software developer (C++/Java). My interest in EPUB is initially that of a reader/user. As a user, I naturally often have thoughts which start with, "It would be great if..."; and as a software developer, I naturally proceed to thinking how that can be implemented. So, the particular topics of my articles here are often determined by my own reading habits and interests. If I need to summarise the general direction of thought behind these articles in one phrase, the phrase will be: more dynamicity in text layout; where dynamicity includes reflowability but is by no means limited to it.

3 thoughts on “Formatting Poetry for EPUB and Small Devices”

  1. <p> is reserved for paragraphs, and lines of verse are obviously not paragraphs. Some User Agents display paragraphs differently than you might expect even when styled with CSS. Use <div> instead of <p> for any division of text which does not meet the semantic definition of a paragraph.

  2. To Lee Passey:
    Thanks for that comment.
    The “p” tag is quite often used for poetry lines (as mentioned, the “basic way” addressed in Section 1 is not my invention and you may find it described in a number of places online), but “div” may be semantically neater, yes.

    If some reader of this tutorial decides to use “div” – do not forget to explicitly set also padding and border to zero in the style, for the usual defaults are different for div. With that addition, the same formatting tricks seem to work ok with div as well.

  3. Thanks so much for this post! About 1800 people have been talking about the positive margin, negative text-indent, but I’ve never seen the &nbsp; used to control the line breaks.
