Women Writers Project

Background

The Brown University Women Writers Project is an electronic textbase of pre-Victorian women's writing in English. It is encoded in TEI-conformant Standard Generalized Markup Language (SGML), which has been specified and slightly modified to optimize it for use on early printed books. At present the textbase includes slightly over 200 texts from a range of genres; given sufficient funding it will continue to grow to include a significant proportion--perhaps nearly all--of the estimated 10,000 to 12,000 texts written by women in English during the target period.


Encoding Practice

The encoding of the WWP textbase emphasizes certain methodological commitments which have developed as the project has matured in its thinking about the relationships among electronic textuality, women's writing, scholarship, and pedagogy. First, the textbase preserves the source text in an unemended form: we create, at the very least, a diplomatic transcription of the source which preserves printers' errors, original spelling, original typography (including long s and old-style use of i/j, u/v, and vv/w), original lineation, font shifts, and capitalization. Second, the textbase takes advantage of the capabilities of SGML to record simultaneously some of the more useful variations on this information: for instance, we encode a correction of a printer's error in tandem with the error itself, allowing the user the option of viewing one or the other.
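The idea of recording an error and its correction together can be illustrated with a minimal sketch, written in Python rather than SGML. The fragment below is hypothetical: the <sic> element with a corr attribute follows a common TEI pattern, but the element names, the sample line, and the render function are assumptions made for illustration, not the WWP's documented markup or software. The fragment is written as well-formed XML so that a standard parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the printed error is the element content and the
# correction is carried in an attribute, so both readings are kept.
SAMPLE = '<l>The <sic corr="quick">qnick</sic> brown fox</l>'


def render(fragment: str, view: str = "original") -> str:
    """Return the fragment's text, showing either the printed error
    (view="original") or its encoded correction (view="corrected")."""
    root = ET.fromstring(fragment)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "sic" and view == "corrected":
            # Use the correction; fall back to the printed reading if absent.
            parts.append(child.get("corr") or "".join(child.itertext()))
        else:
            parts.append("".join(child.itertext()))
        # Text that follows the child element inside the parent.
        parts.append(child.tail or "")
    return "".join(parts)


print(render(SAMPLE, "original"))
print(render(SAMPLE, "corrected"))
```

Because both readings live in the same encoded source, choosing between them is a decision made at display time and never alters the transcription itself.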

Sample encoding

Similarly, we encode a normalized version of the old-style typography, and we encode the original use of capitalization and small capitals in such a way that alternative, less heavily capitalized views are possible. Some other features remain flexible simply because of the nature of SGML: original lineation of prose, for instance, can be suppressed in cases where the user does not care about it, as can shifts in font. Since information about the appearance of the text is encoded as a function of textual structure, rather than as an intrinsic part of the text, the presentation of the text for a given purpose (e.g. as a printout for classroom use, or as a World Wide Web document, or as a WWP proofreading copy) can be adjusted to the requirements of the moment.
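As a rough sketch of how one encoded source can yield several presentations, the Python example below derives a diplomatic view and a modernized reading view from the same fragment. Everything in it is assumed for illustration: the <orig> element with a reg attribute, the <lb/> line-break marker, the <hi> font-shift element, the sample sentence, and the option names are hypothetical stand-ins, not the WWP's actual markup or delivery software.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, written as well-formed XML: <orig reg="..."> pairs
# an old-style spelling with a regularized form, <lb/> marks an original
# line break, and <hi> marks a shift to italic type.
SAMPLE = (
    '<p>I <orig reg="have">haue</orig> <orig reg="seen">ſeen</orig> the<lb/>'
    '<orig reg="universe">vniuerſe</orig> of <hi rend="italic">Love</hi>.</p>'
)


def render(fragment, regularize=False, keep_lines=True, keep_fonts=True):
    """Flatten the fragment to plain text under the chosen display options."""
    out = []

    def walk(el):
        if el.tag == "lb":
            # Keep the original line break, or replace it with a space.
            out.append("\n" if keep_lines else " ")
        elif el.tag == "orig" and regularize:
            # Show the regularized form; fall back to the original reading.
            out.append(el.get("reg") or "".join(el.itertext()))
        else:
            italic = el.tag == "hi" and keep_fonts
            if italic:
                out.append("_")  # underscores stand in for italic type
            out.append(el.text or "")
            for child in el:
                walk(child)
                out.append(child.tail or "")
            if italic:
                out.append("_")

    walk(ET.fromstring(fragment))
    return "".join(out)


diplomatic = render(SAMPLE)
reading = render(SAMPLE, regularize=True, keep_lines=False, keep_fonts=False)
print(diplomatic)
print(reading)
```

The point of the design is that suppressing lineation or font shifts removes nothing from the source; each view is a different traversal of the same structural encoding.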

Our encoding practice also reflects a commitment to preserving the integrity of the text as an object which circulated in the culture of a particular historical moment. There are obviously practical limits to the representation of a physical object in an electronic medium, and we do not attempt to body forth the book itself with our description; our emphasis is rather on regarding the configuration of the text--its contextualization within an actual document--as crucial to an understanding of how it was originally produced and consumed. Thus instead of attempting to isolate part of the document according to some rationale (its "literary" qualities or its authorial origins, for instance) we transcribe the full contents of the source document, including title pages, all prefatory material, advertisements, tables of contents, and all of the contents including material by writers other than those mentioned on the title page. In addition, our transcription is derived from a single copy of the source text; we do not import emendations from other editions or otherwise create a synthetic or ideal text which never existed as an actual printed artifact.


WWP Methodological Issues

As an electronic resource, the WWP textbase stands at a definitional crux, one by which we can identify some important points at issue in our thinking about electronic texts. Methodologically, the textbase resembles at least three quite different approaches to the collection and presentation of textual material: the edition, the archive, and the corpus. At the same time, our differences from each of these mean that we need to derive a coherent account of our work from some principled synthesis of these intellectual ancestors, one which also takes account of the impact of electronic textuality on all three. Our research on these issues thus takes place in the context of developments in both text encoding and editorial theory.

In some sense the textbase is indeed an edition of sorts. The use of SGML to record textual variation such as the correction of printers' errors is intellectual work which effectively creates a new edition, within the usual meaning of the term. More interestingly, the identification of textual structure and content which text encoding achieves could be regarded as creating a new configuration of the textual information which also constitutes a new edition, if we extend the idea of the edition to include the kinds of presentation enabled by the electronic medium.

And yet, perversely, our desire is also not to edit, where that word is understood to mean the performance of certain interpretive and inferential work which cannot be reversed, and which may preclude other interpretive options. Especially where editing addresses itself to particular disciplinary concerns (such as the desire for a clear reading text, or a desire for a text which incorporates authorial corrections from other editions), we tend to regard it as premature, given the aims of the project. Inasmuch as most of the writers in the WWP textbase have been out of print for hundreds of years, the crucial goal at this stage seems to be to make the source material available in a form useful to as wide a range of users as possible, and to open up research and discussion rather than present a finished, edited product which would be in some sense an endpoint of research. The textual modality that most addresses these concerns is the archive, which traditionally implies a collection of source materials in their original state. The WWP thus could, and sometimes does, refer to the textbase as an electronic archive, seeking thereby to express the aim of preserving source information for further study, even though clearly this preservation is achieved by means of analysis and transmutation. To speak of the "originality" and "uneditedness" of the electronic archive is to attest to the goals of the analysis and interpretation which are a necessary part of text encoding, rather than to express a naive desire to avoid analysis and interpretation altogether--to speak of the functionality of the end product rather than the processes that created it.

Implicit in the idea of the archive are the limitations and conditions of collection which define its scope. As a practical matter, the WWP contains--and will contain for some time to come--a comparatively small subset of the extant women's writing in English before 1830, and the composition of this subset has been affected by a number of factors, some explicit and intentional and some more subtle and rooted in the unspoken assumptions of our intellectual and disciplinary inheritance. However, one of the long-term methodological goals for the textbase is to create a collection of texts that behaves like a corpus, offering a sufficiently large and representative range and quantity of text to provide a field for statistically meaningful analysis. This goal requires that the collection criteria be far more systematic, and the scope of the textbase be far more extensive, than the notion of an archive would require.
