Dashes

Abstract

Encoding of dashes, including em-dashes and en-dashes, using entity references

dash entity

Encoding Instructions (new P5 version)

The WWP encodes every punctuational dash longer than the em-dash (—) as a "superdash", using the entity reference &sdash;. The function of the superdash is simply to mark and distinguish dashes of extraordinary length, so a sequence of several punctuational em-dashes or hyphens is regularized to a single &sdash;. The entity reference can be displayed as an em-dash, and it can also be used to locate dashes of abnormal length if those are ever of interest to scholars. See example 1.
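As a rough illustration of those two uses, the following Python sketch treats an encoded transcription as a plain string that still contains unexpanded entity references. The function names and the sample string are hypothetical and are not part of any WWP tooling; real processing would normally go through an XML parser and the project's entity declarations.

```python
EM_DASH = "\u2014"  # the em-dash character used for display


def display_text(encoded: str) -> str:
    """Show every superdash (and ordinary em-dash entity) as a printed em-dash."""
    return encoded.replace("&sdash;", EM_DASH).replace("&mdash;", EM_DASH)


def locate_superdashes(encoded: str) -> list[int]:
    """Return the character offsets at which &sdash; occurs in the encoded text."""
    offsets = []
    start = encoded.find("&sdash;")
    while start != -1:
        offsets.append(start)
        start = encoded.find("&sdash;", start + 1)
    return offsets


# Hypothetical sample, based on the first line of example 1.
sample = "The flowret's path,&sdash;the Summer sunny way,&sdash;"
print(display_text(sample))
print(locate_superdashes(sample))
```

The display step discards the length distinction, while the search step relies on it, which is why the distinction is kept in the encoding rather than in the displayed character.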

The WWP encodes non-punctuational dashes slightly differently. A non-punctuational dash serves to mark blank space or to indicate omitted letters (as in forbidden words or covert references to actual people). Where dashes or hyphens stand for omitted letters within a word, we transcribe them exactly as they appear, since their number could be important in identifying the abbreviated word. Non-punctuational dashes that indicate only a general omission, without indicating a specific number of letters, are encoded with &sdash;. See example 2.

Examples

Example 1: various uses of punctuational dashes (represented here by hyphens, but they could be either hyphens or dashes of various lengths):

                1. The flowret's path,---the Summer sunny way,--- [Encode with &sdash;]
                2. He walked--I can scarcely believe it--out the window. [Encode with &mdash;]

Example 2: various uses of non-punctuational dashes:

                1. Then Mr. B----- spoke up. [Encode with 5 hyphens, as printed]
                2. D--n his eyes! [Encode with 2 hyphens, as printed]
                3. Mrs. ______ was a dreadful character. [Encode with &sdash;]
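The decisions illustrated in examples 1 and 2 can be summarized in a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the length of a punctuational dash is supplied by the encoder as an approximate number of ems, and the judgment of whether a non-punctuational dash marks a specific number of letters is likewise left to the encoder.

```python
MDASH = "&mdash;"  # ordinary em-dash
SDASH = "&sdash;"  # "superdash": any dash longer than an em-dash


def encode_punctuational(length_in_ems: float) -> str:
    """Encode a punctuational dash from its printed length.

    Assumption: the dash is at least one em long; the guidelines do not
    describe shorter punctuational dashes.
    """
    return SDASH if length_in_ems > 1 else MDASH


def encode_non_punctuational(printed: str, marks_letter_count: bool) -> str:
    """Encode a non-punctuational dash.

    If the printed marks stand for a specific number of omitted letters,
    keep them exactly as printed; otherwise use the superdash.
    """
    return printed if marks_letter_count else SDASH


# Example 1: a long dash, then an ordinary em-dash.
print(encode_punctuational(3))
print(encode_punctuational(1))

# Example 2: "B-----", "D--n", and the general omission in "Mrs. ______".
print(encode_non_punctuational("-----", True))
print(encode_non_punctuational("--", True))
print(encode_non_punctuational("______", False))
```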

Rationale & Reasoning

The WWP distinguishes between punctuational and non-punctuational dashes and treats them differently. We also distinguish both from the ruled line used as an ornament or separator mark between textual divisions, which is encoded with &rule; (see 034).

The WWP collection predates the invention, or at least the wide use, of the en-dash (a dash one en long), which is typically used for number ranges. This character is not yet attested in the WWP collection. If it did appear, we would encode it using the entity reference &ndash;.