Tables of contents

Abstract

Encoding of tables of contents with list inside div type="contents", with internal encoding to capture the functional parts of the table of contents information, such as page numbers and titles.

table of contents front matter divisions of the text list
div rs ref list type toc pageNum

Encoding Instructions (new P5 version)

The WWP uses div type="contents" to encode tables of contents, with list inside to mark the table’s structure. Each entry is encoded with item. We record the original pagination while also setting up a system for generating new pagination upon printout, so as to make the table of contents useful under all formatting conditions.

There are several basic components to the table of contents, not all of which are present in every case:

The label for the contents item (e.g. a chapter number or other number): encoded with label, as the first child of item, if it appears. Often there is no label at all and this element will be unnecessary.

The name of the text chunk (e.g. a chapter name or poem title): encoded with rs. No internal encoding except for renditional markup is included; since the text chunk is simply a duplication for reference purposes of the words that really appear later in the actual text, encoding things like persName is redundant and misleading.

The page number on which this text chunk is to be found: encoded with ref type="pageNum", and a target attribute pointing to the xml:id of the text chunk in question (not to the xml:id of the page on which it falls). If there is no page number given, then the pointer to the location in the main text is encoded with ptr and a target attribute just as with ref. No type attribute is given for ptr. Note that the pageNum value of the type attribute on ref describes the type of reference (a page number), not the type of information to which the reference points.

The dots, dashes, spaces, or other leader that lie between the text chunk and the page number: we ignore this entirely. The relative alignment of the elements (rs, ref, etc.) is indicated using the rend attribute.

The headings for the columns (usually something like Chapter and Page): The first instance is encoded with head; any subsequent repetitions of the column headings caused by page breaks are encoded with mw type="listHead".
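
As a minimal sketch of the case where the original gives neither a label nor a page number, an entry might look like the following. The xml:id value poem03 is hypothetical, and the #-prefixed form of the target value assumes standard TEI P5 pointer syntax.

    <item>
      <rs>The title of the poem goes here.</rs>
      <!-- no page number in the original, so ptr is used and carries no type attribute -->
      <ptr target="#poem03"/>
    </item>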

If there are internal subgroupings within the table of contents (for example, if the volume is a series of novels each of which has chapters), we encode these as nested lists: the outermost items are the novels, and each novel-level item contains a list of chapters.
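
The nesting described above might be sketched as follows for a volume containing two novels. All xml:id values are hypothetical. Encoding the novel title with rs and leaving the type attribute off the inner lists are assumptions made for illustration, not stated WWP rules.

    <list type="toc">
      <item>
        <rs>The title of the first novel goes here.</rs>
        <list>
          <item>
            <label>The chapter number goes here.</label>
            <rs>The chapter title goes here.</rs>
            <ref type="pageNum" target="#novel1.ch01">The page number
              goes here.</ref>
          </item>
          <!-- ... more chapter-level <item>s as needed ... -->
        </list>
      </item>
      <item>
        <rs>The title of the second novel goes here.</rs>
        <list>
          <item><!-- ... --></item>
        </list>
      </item>
    </list>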

The WWP distinguishes between tables of contents and indexes as follows: a table of contents is ordered by page number, while an index is ordered according to topic (usually alphabetically). Either may appear at the beginning or end of the book or section. For more information on indexes and a comparison with tables of contents, see 077.

Examples

    <div type="contents">
      <head>The heading for the table goes here.</head>
      <list type="toc">
        <mw type="listHead">A subhead, if there is one for the page
          numbers, etc., goes here.</mw>
        <!-- NB use multiple <mw> elements for the headings of multiple columns -->
        <item>
          <label>If there is an identifying number, such as a chapter
            number, it goes here.</label>
          <rs>The title of the item, such as a chapter title, goes
            here.</rs>
          <ref type="pageNum" target="[idref of element here]">The
            page number goes here.</ref>
        </item>
        <item><!-- ... --></item>
        <!-- ... more <item>s as needed ... -->
      </list>