General discussion of encoding poetry, including the use of text, div, and lg to encode basic poetic structures
The WWP encodes poems using the wrapper lg, which contains lines encoded with l. A single poem may contain (and usually does contain) several lgs. Where appropriate, lg can also contain the element head, and it should always carry a type value: stanza, quatrain, or sestet, for instance. Where the element div is allowed, a div type="poem" should enclose the lg (or lgs). For a list of type values for this outermost element, see 188.
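As a minimal sketch of the basic structure described above, a short two-stanza poem might be encoded as follows. The heading and line text are placeholders, not a transcription of any real poem, and the type values on lg should be checked against the WWP's list rather than taken from this example.

```xml
<div type="poem">
  <head>Placeholder Title</head>
  <lg type="quatrain">
    <l>First line of the first stanza,</l>
    <l>Second line of the first stanza,</l>
    <l>Third line of the first stanza,</l>
    <l>Fourth line of the first stanza.</l>
  </lg>
  <lg type="quatrain">
    <l>First line of the second stanza,</l>
    <l>Second line of the second stanza,</l>
    <l>Third line of the second stanza,</l>
    <l>Fourth line of the second stanza.</l>
  </lg>
</div>
```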
We encode poetry with several overarching goals in mind: to enable poetry to be identified and distinguished from the other genres in the textbase (for analytical purposes); to enable us to extract entire poems together with their headings and associated apparatus; and to represent the internal structure of each poem with some precision. Although the TEI recommends that poems be encoded with text to acknowledge their literary wholeness, the WWP differs from this practice. Because of the documentary rather than literary nature of our encoding, we prefer to represent the poem as an inextricable part of the larger document in which it was published. As a result we treat individual poems as subdivisions of that document, using div as mentioned above. See 186 and 187 for more on this.
Poems which occur in a context where div is not allowed (for instance, within running prose) should be encoded either within quote (if the poem is being quoted) or simply within lg. In this context, a larger lg takes the place of the div, enclosing the poem's nested lgs. The WWP allows lg to appear both within and between p elements, and the encoder must decide which is more appropriate based on context.
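The two cases might look roughly like the fragments below. This is a sketch only: the prose and verse are placeholders, and the type value on the outer lg (poem) is an assumption made for illustration, not a value confirmed by the WWP's list.

```xml
<!-- Case 1: verse quoted inside running prose, so lg sits within p -->
<p>Prose leading up to the quotation:
  <quote>
    <lg type="stanza">
      <l>First quoted line,</l>
      <l>Second quoted line.</l>
    </lg>
  </quote>
  and the prose continues afterwards.</p>

<!-- Case 2: an unquoted poem between paragraphs, where div is not allowed.
     The outer lg stands in for the div; type="poem" is an assumed value. -->
<p>Prose introducing the poem.</p>
<lg type="poem">
  <lg type="stanza">
    <l>First line of the first stanza,</l>
    <l>Second line of the first stanza.</l>
  </lg>
  <lg type="stanza">
    <l>First line of the second stanza,</l>
    <l>Second line of the second stanza.</l>
  </lg>
</lg>
<p>Prose resuming after the poem.</p>
```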
Within a long poem, any major subdivisions are encoded with div type="part", or with a more specific value for type if the poem specifies one (e.g. canto, book, etc.).
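A long poem that names its own divisions as cantos might be sketched like this. The headings and lines are placeholders, and a poem with unnamed divisions would use type="part" in place of canto.

```xml
<div type="poem">
  <head>Placeholder Title of a Long Poem</head>
  <div type="canto">
    <head>Canto I.</head>
    <lg type="stanza">
      <l>First line of the first canto,</l>
      <l>Second line of the first canto.</l>
    </lg>
  </div>
  <div type="canto">
    <head>Canto II.</head>
    <lg type="stanza">
      <l>First line of the second canto,</l>
      <l>Second line of the second canto.</l>
    </lg>
  </div>
</div>
```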
Within short poems, the WWP uses lg with a type attribute to encode different groupings of metrical lines. Groupings are usually determined by intervening white space, indentation, or some other graphical indicator (such as an ornament). White space, should it appear, should be indicated in the renditional information of the lg over which it appears. This means that if there is white space between stanzas, for instance, the renditional information for the poem's lgs will look like <lg rend="space-above(yes)" type="stanza">. The type attribute on lg is used to indicate the formal structure of the line group. The WWP has created a fixed list of values for this attribute; see 148 and 186.
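For instance, two stanzas separated by white space on the page might be sketched as below. The lines are placeholders, and the example assumes that no white space appears above the first stanza, so only the second lg records it.

```xml
<div type="poem">
  <lg type="stanza">
    <l>First line of the first stanza,</l>
    <l>Second line of the first stanza.</l>
  </lg>
  <!-- White space appears above this stanza on the page -->
  <lg rend="space-above(yes)" type="stanza">
    <l>First line of the second stanza,</l>
    <l>Second line of the second stanza.</l>
  </lg>
</div>
```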
Headings should be encoded within the div to which they refer. A heading for the entire poem should be encoded within the outermost div, not within the first lg. Headings to individual stanzas (including stanza numbers) should be encoded within the lg surrounding the stanza. Similarly, speaker names (e.g. the speaker of a particular stanza) should be encoded within the line group they modify. In cases where a line group is divided between two speakers (e.g. where one speaker speaks the first line of a couplet and a second speaker speaks the second line), the line group should be fragmented and the part attribute should be used.
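The placement rules above might be sketched as follows. All text is placeholder content. The type value couplet is assumed for illustration, and the part values I (initial) and F (final) are the standard TEI values for the first and last fragments of a divided unit.

```xml
<div type="poem">
  <!-- Heading for the whole poem: inside the outermost div, not the first lg -->
  <head>Placeholder Title for the Whole Poem</head>
  <lg type="stanza">
    <!-- Heading for an individual stanza: inside that stanza's lg -->
    <head>I.</head>
    <l>First line of the stanza,</l>
    <l>Second line of the stanza.</l>
  </lg>
  <!-- One couplet divided between two speakers, fragmented with part -->
  <lg type="couplet" part="I">
    <speaker>First Speaker.</speaker>
    <l>First line of the couplet,</l>
  </lg>
  <lg type="couplet" part="F">
    <speaker>Second Speaker.</speaker>
    <l>Second line of the couplet.</l>
  </lg>
</div>
```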
Values for n should be Arabic numerals without punctuation, regardless of the format or delimiters of the actual stanza number in the text.
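As an illustration, the fragments below assume that the stanza number is recorded in n on the stanza's lg and that the printed number is transcribed as the stanza's head. The printed forms and the lines are placeholders.

```xml
<!-- Printed as a Roman numeral with a full stop -->
<lg type="stanza" n="2">
  <head>II.</head>
  <l>First line of the stanza,</l>
  <l>Second line of the stanza.</l>
</lg>

<!-- Printed in parentheses -->
<lg type="stanza" n="3">
  <head>(3)</head>
  <l>First line of the stanza,</l>
  <l>Second line of the stanza.</l>
</lg>
```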
Speaker names in poetry which is not part of a drama should be encoded with speaker placed directly inside lg, rather than with sp; this provides a way of encoding speakers without requiring a who value and a castList. Other information which is associated with the lg or with individual lines (such as line numbers) should be encoded with label.
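A non-dramatic stanza with a named speaker and a printed line number might be sketched as below. The speaker name and lines are placeholders, and the position of label immediately before the line it numbers is an assumption made for this example.

```xml
<lg type="stanza">
  <!-- speaker sits directly in lg; no sp, who, or castList is needed -->
  <speaker>Placeholder Speaker.</speaker>
  <l>First line spoken by the speaker,</l>
  <!-- A line number printed beside the following line -->
  <label>5</label>
  <l>Second line spoken by the speaker.</l>
</lg>
```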