Errors in the original
Abstract
Encoding of errors in the document source using sic; situations where corr is and is not used; distinguishing between error and old spelling
Encoding Instructions (new P5 version)
1. The WWP encodes errors in the original text using choice with sic and corr nested inside it. The corr element is required in WWP encoding, except in a special group of cases where the sic element is used but no correction is required. At present this group contains only the following:
--errors in catchwords (that is, discrepancies between the catchword and the corresponding word in the main flow of text): corr is not used.
--cases where the reading is anomalous enough to require sic as a way of indicating that the transcription is accurate, but where we do not know what correction to supply: corr is not used.
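As an illustration of the second case, a reading with no recoverable correction might be encoded as in the sketch below. The word "wbrtle" and the surrounding sentence are made up for this example. The sketch also assumes that sic stands alone here, without a choice wrapper; the instructions above say only that corr is not used.

```xml
<!-- Hypothetical example: the printed reading is transcribed accurately,
     but no correction is known, so no corr is supplied -->
She spoke of the <sic>wbrtle</sic> with some alarm.
```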
2. The sic tag should be applied to the whole word, rather than just the letter or letters in error. Some examples:
<choice> <sic>puckle</sic> <corr>pickle</corr> </choice>
He went to the <choice> <sic>the</sic> <corr></corr> </choice>
Notice in the second example that the corrected value is empty, not a space or an entity reference for ␣. This indicates that nothing, not even a blank, should be present.
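By way of contrast, the sketch below sets a letter-level encoding, which this entry rules out, beside the whole-word encoding from the first example. The letter-level form is shown only to illustrate what to avoid.

```xml
<!-- Not WWP practice: sic wraps only the erroneous letter -->
p<choice><sic>u</sic><corr>i</corr></choice>ckle

<!-- WWP practice: sic wraps the whole word -->
<choice><sic>puckle</sic><corr>pickle</corr></choice>
```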
3. Since many of our texts use archaic spellings that were correct at the time they were written, care must be taken not to encode these as typographical errors. In general, in old-spelling texts, if the spelling of a word plausibly represents its pronunciation, it should be regarded as correct for our purposes. Thus “queane” is simply a period spelling of “queen”, but “ibside” (for “inside”) would be treated as a typographical error. If in doubt, check the OED to see whether the form is listed as a possible spelling. However, absence from the OED does not mean that the spelling is incorrect; if it is phonetically plausible, we do not encode it as an error. In modern-spelling texts (typically those from the mid-18th century on), spelling irregularities are treated as errors unless they are well-attested alternate spellings.
4. The WWP also uses sic type="seq" for errors in sequencing (page numbers, signatures, etc.). For more information, see the entry on sequencing errors, 081.