Foreign words and phrases
Abstract
Encoding foreign-language words and phrases using the lang attribute on existing elements, and the foreign element when necessary
Encoding Instructions (new P5 version)
The WWP records the language of foreign-language words with the lang attribute. The attribute is placed on whatever element is most appropriate for encoding the words in question (e.g. quote). If the words are renditionally distinct solely because they are in a foreign language, so that no other element applies, we encode them with the foreign element and record the language with the lang attribute.
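The two cases above might be encoded along the following lines. This is only a sketch: the sample sentences are invented, rendition encoding is omitted, and the attribute is written as xml:lang on the assumption that this is how the lang attribute is spelled in P5 markup.

```xml
<!-- Hypothetical sample text, not taken from any WWP document. -->

<!-- Case 1: another element (here quote) already applies,
     so it carries the language attribute itself. -->
<p>She answered him with the old motto,
  <quote xml:lang="la">Festina lente</quote>.</p>

<!-- Case 2: the word is set apart only because it is foreign,
     so foreign is used and carries the language attribute. -->
<p>The whole company was seized with
  <foreign xml:lang="fr">ennui</foreign>.</p>
```

The language values ("la", "fr") are assumed to be standard language tags; the list of values actually in use at the WWP is not given in this section.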
The WWP provides a lang attribute for any foreign word that is renditionally distinct. In addition, in a text that foregrounds the foreign language as part of its content (in particular, travel narratives or works focusing on linguistics), all clearly foreign words are also encoded (with foreign if no other element applies) and given a lang attribute, whether or not they are renditionally distinct.
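As a rough illustration of the second rule, a passage from a travel narrative in which the foreign words are not renditionally distinct might look like this. The sentence is invented, and xml:lang is again assumed as the P5 spelling of the lang attribute.

```xml
<!-- Hypothetical travel-narrative sentence. The Italian words are
     printed like the surrounding text, but are still encoded because
     the foreign language is foregrounded in this kind of work. -->
<p>We reached the <foreign xml:lang="it">albergo</foreign> before dusk,
  where the <foreign xml:lang="it">padrone</foreign> received us
  with great civility.</p>
```

In a text that does not foreground the foreign language, the same unmarked words would, under the first rule, be left unencoded.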
In some cases, it may be difficult to determine whether a word or phrase is actually in a foreign language. Short phrases of foreign origin (such as “deja vu” or “et cetera”) may enter the English language gradually, and authors writing at the same time might disagree on whether they are fully naturalized. Where such a phrase occurs in a renditionally distinct context (such as quotation or linguistic emphasis), it may be very difficult to tell whether the text is treating the phrase as foreign or not. In these cases, encoders will need to judge foreignness on the basis of usage and context.
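The judgment described here amounts to choosing between two encodings of the same printed text. The sketch below shows both for an invented sentence. The use of emph for the emphasis reading is an assumption made for illustration, as is the xml:lang spelling of the lang attribute.

```xml
<!-- Hypothetical sentence in which "et cetera" is printed in italics. -->

<!-- Reading 1: the text treats the phrase as foreign, so it is
     encoded as Latin. -->
<p>She packed the gowns, the ribbons,
  <foreign xml:lang="la">et cetera</foreign>.</p>

<!-- Reading 2: the phrase is taken as naturalized English and the
     italics as emphasis, so no language is recorded. -->
<p>She packed the gowns, the ribbons,
  <emph>et cetera</emph>.</p>
```

Neither reading is correct in the abstract; usage elsewhere in the same text and the surrounding context decide between them.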