Encoding Guide for Early Printed Books

Linking and Pointing

Parallel Structures

Most print documents represent a single textual stream, albeit one that may branch or include discontinuities (such as those introduced by footnotes, indexes, and cross-references). However, parallel structures do appear in print texts: the classic examples are works presented in parallel translation and editions that present two versions of the same work side by side, such as the parallel text of Wordsworth’s Prelude. In addition to representing the explicit parallel structures that are native to the print text, encoding also makes it possible to create new parallel structures as part of building a digital edition.

Representing parallel structures in XML markup fundamentally involves one of two mechanisms. The first is to establish a link between the two (or more) units that are being held in parallel, and to convey the semantics of that link: in other words, to express whether the link signifies a translation, a variant reading, a duplication, or some other relationship. Although in theory a link can have more than two endpoints, in practice the basic XML linking mechanism establishes a two-ended link, and more complex linking structures are simply aggregations of such links. For this reason, an explicit linking mechanism of this kind works most naturally and simply when there are only two parallel text objects. In these cases, one can establish the parallelism between the two at whatever level of detail is most useful. One can simply say "these two texts are parallel" by creating a link from one to the other, or one can align the two at various points with links further down in the structure: at the section level, at the paragraph level, conceivably at the sentence or word level, or at any other level of granularity that suits the project. If one only needs to provide the ability to move from a point in one text to the corresponding point in the other, aligning at the section or paragraph level could be enough to permit this kind of navigation. In a translation where precise alignment of the units of meaning is important, one might choose to align at the sentence level, since that might be the natural point of correspondence between the two, or at the sub-paragraph level if the translation is loose enough that sentence boundaries do not always correspond. This approach works well both for entire texts and for more local forms of parallelism: for instance, a single poem in a collection that is a translation of another poem elsewhere in the same collection.
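The linking approach described above might be sketched in TEI as follows. This is only an illustration, not a prescribed encoding: the xml:id values, the language codes, the type value "translation", and the bracketed placeholder text are all assumptions made for the example. It uses the TEI link and linkGrp elements to pair an original poem with its translation, first as wholes and then stanza by stanza.

```xml
<!-- Illustrative sketch only: all identifiers and placeholder text are invented. -->
<div type="poem" xml:id="orig" xml:lang="fr">
  <lg xml:id="orig.s1"><l>[first stanza of the original]</l></lg>
  <lg xml:id="orig.s2"><l>[second stanza of the original]</l></lg>
</div>

<div type="poem" xml:id="trans" xml:lang="en">
  <lg xml:id="trans.s1"><l>[first stanza of the translation]</l></lg>
  <lg xml:id="trans.s2"><l>[second stanza of the translation]</l></lg>
</div>

<!-- The type attribute on linkGrp states what the links mean;
     each link is two-ended, naming both of its endpoints. -->
<linkGrp type="translation">
  <!-- coarse alignment: the two poems as wholes -->
  <link target="#orig #trans"/>
  <!-- finer alignment: stanza by stanza -->
  <link target="#orig.s1 #trans.s1"/>
  <link target="#orig.s2 #trans.s2"/>
</linkGrp>
```

A project that needs only navigation between the two texts could stop at the first link; the stanza-level links matter only if that granularity is useful for display or analysis.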

The second method of representing parallel structures is to use a grouping mechanism to surround the multiple parallel streams of text and indicate their coordination. This approach makes it easy to align as many different pieces of text as necessary: for example, five different variant readings of a particular word. In effect, the text stream branches at a certain point into a set of parallel paths, and reconverges after the parallelism is over. The advantage of this approach lies in its ability to handle local, small-scale variation, and also to express the semantics of that variation much more precisely. For instance, the TEI provides methods of representing variant readings in a critical apparatus, and also methods of representing textual variations that result from the correction of typographical errors, or the expansion of abbreviations, or other kinds of editorial processes. These encoding structures permit quite a high degree of precision in representing the nature of the variation, the textual specifics, the order in which the variants occur, and similar information that is potentially quite powerful in supporting analysis and display.
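The grouping approach might look something like the following in TEI. Again this is a hedged sketch: the witness sigla (#A, #B, #C, assumed to be declared elsewhere in a witness list), the readings, and the sample words are invented for illustration. The first fragment uses the critical apparatus elements app and rdg; the second uses choice to pair an apparent error with its correction and an abbreviation with its expansion.

```xml
<!-- Illustrative sketch only: sigla and readings are invented. -->

<!-- Variant readings: the text branches inside app and reconverges after it. -->
<l>[shared text]
  <app>
    <rdg wit="#A">[reading in witness A]</rdg>
    <rdg wit="#B">[reading in witness B]</rdg>
    <rdg wit="#C">[reading in witness C]</rdg>
  </app>
  [shared text resumes]</l>

<!-- Editorial correction of a typographical error. -->
<p>[...] <choice><sic>teh</sic><corr>the</corr></choice> [...]</p>

<!-- Expansion of an abbreviation. -->
<p>[...] <choice><abbr>wch</abbr><expan>which</expan></choice> [...]</p>
```

Because the element names themselves (rdg, sic, corr, abbr, expan) say what kind of alternative each branch is, the semantics of the variation travel with the grouping, and no separate link type is needed.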

Although the first mechanism discussed above is less semantically rich, it does work better on a large scale. The markup necessary to indicate the connection between two or more parallel units can be expressed within the texts themselves, but it can also be expressed in a separate structure altogether, which exists solely to show the overall coordination of the texts in question. The individual texts carry the information necessary to accomplish the alignment (for instance, standard line numbering if one is aligning at the line level), and the external structure establishes the actual linkages between the relevant parallel segments of text. This process can be used to represent immensely complex textual universes. For example, the Canterbury Tales Project uses markup that aligns the witnesses at the level of the line and the individual word variant, to permit analysis of word-level variation between witnesses, as well as of cases where lines are added, omitted, or moved.
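An external alignment structure of this kind could be sketched as a stand-alone group of links that point into separately stored transcriptions. The file names, identifiers, and the type value "alignment" below are hypothetical, and the fragment is not a representation of the Canterbury Tales Project's actual markup; it only illustrates the general idea of keeping the linkages outside the texts they coordinate.

```xml
<!-- Illustrative sketch only: file names and identifiers are invented.
     Each transcription is assumed to carry an xml:id on every line,
     e.g. <l xml:id="A.l1"> in witnessA.xml. -->
<linkGrp type="alignment">
  <link target="witnessA.xml#A.l1 witnessB.xml#B.l1"/>
  <link target="witnessA.xml#A.l2 witnessB.xml#B.l2"/>
  <!-- A line that has moved in witness B can simply be linked
       to its new position. -->
  <link target="witnessA.xml#A.l3 witnessB.xml#B.l5"/>
</linkGrp>
```

The transcriptions themselves need only stable identifiers; all decisions about which segments correspond live in the external file, so the alignment can be revised or extended without touching the texts.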