NEH Final Report, 1997-2000

Project Activities: Publication of Women Writers Online

The publication of Women Writers Online is the central achievement of this grant period, and also marks the culmination of the WWP's work over more than a decade, since our first grant award from the NEH. It required extensive research on a number of different fronts, most significantly the interface and technical design, and the design of the legal and conceptual framework for the licensing arrangement itself. Although there were existing models in both of these areas, we feel the WWP improved on those models, and also managed to articulate a rationale for the kinds of improvements we were working towards.

Self-publication

We first explored the possibility of working with one of the publishers who are now undertaking electronic publication: Oxford University Press, Routledge, Cambridge University Press, Chadwyck-Healey, and a few others. Our assumption was that the advantages of commercial publication would be significant, offering access to established marketing venues and putting the burden of management and advertising onto the publishers' professional staff rather than on the WWP. Most importantly, we expected that the publisher would develop the delivery software and interface, thus solving our first challenge as well. At the time, both Routledge and CUP were in the process of launching ambitious new electronic ventures and developing delivery software which would allow the online publication of richly encoded SGML data over the web.

At the same time, our discussions with these publishers gradually revealed several differences of strategy and attitude which emerged as potential obstacles to collaboration. First, their approach to pricing was fundamentally different from ours, based on an extremely cautious assessment of the potential market. Assuming low sales for what they felt to be a niche product, they also assumed a need for correspondingly high prices in order to make back the investment, which in turn made higher sales still more implausible. Their approach would have recouped costs, but would have jeopardized the wide dissemination which was our fundamental aim. In addition, as we considered what we wanted to achieve by online publication, we found ourselves more and more concerned about retaining control over the data, setting our own schedules for upgrades, and allowing free research use. These were goals to which commercial publishers--given their business model and fiduciary responsibilities--quite reasonably felt cautious about committing themselves.

Our eventual decision to publish the collection ourselves, and to develop our own delivery system, turned out to be a welcome necessity. Two of the publishers' online delivery systems did not materialize as planned, owing to delays in programming, and began to look less likely to fit our criteria of functionality when completed. In particular, both publishers seemed more interested in CD-ROM publication than in online distribution over the internet, a choice which seemed to us to run directly counter to our conception of how the collection would be used and distributed. Working with Chadwyck-Healey would still have been possible, but for a number of reasons we decided that we did not want to be absorbed into Chadwyck-Healey's Literature Online collection. For one thing, we found that the level of functionality which Chadwyck-Healey had made standard for their collection was considerably below what we had envisioned for Women Writers Online, and would not be likely to exploit our markup fully. This problem was compounded by the fact that the interface to WWO would necessarily be made homogeneous with the rest of the Chadwyck-Healey collection, leaving us with little scope for the kinds of special features we felt users would want and that we could provide. In addition, we would lose control over how our work would be sold and represented. Self-publication, by contrast, would allow us a degree of autonomy and oversight which proved more valuable than we had first imagined. And although (as a number of publishers pointed out to us) it would put the burden of marketing on our shoulders, we felt that our unusually close relationship with a substantial existing audience would give us an advantage in that area.

Interface and technical design

As creators of richly encoded SGML data, the WWP is one of a number of projects currently facing the same problem: SGML publication software is still scarce, and what exists is designed for industrial production settings rather than academic projects in the humanities. Tools for publishing SGML content on the World Wide Web (such as INSO's DynaWeb) are even scarcer and are also not designed with scholarly uses in mind. The advent of XML is widely predicted to offer a possible solution to these problems, but at the time the WWP was planning our initial publication, we had the choice of customizing an existing application or of designing one ourselves from scratch. Although the latter option would theoretically have given us more flexibility and control over the resulting product, it raised a number of concerns. The expense of software development was first among these, particularly because the actual cost of creating a functional system from scratch was difficult to estimate with precision. We also knew that although we could probably develop an SGML-to-HTML transformation system fairly easily for our specific texts, we would not be able to make it general enough to allow for easy expansion, nor could we easily support the rapid content-based indexing provided by commercial software. Finally, creating a new application ourselves would necessarily be an all-or-nothing approach: we risked being caught with no delivery system at all if we encountered any serious problems. We had already experimented with INSO's DynaWeb software, and although its default interface and functionality were ill-suited to our purposes, we thought we could build a customized interface with most of the functionality we sought. The advantages of this approach were that we would be able to start using the system in its uncustomized state almost immediately, and add improvements as we developed them. Furthermore, if the project turned out to be a long-term success, we could design a custom application ourselves later on, possibly taking advantage of the arrival of XML-aware software and support systems.

Accordingly, we decided to build a custom interface and to base our delivery system on INSO's DynaWeb software. In DynaWeb the underlying infrastructure of indexing, searching, and processing the encoded data (which is performed by DynaText, an SGML search engine) is separated from the display of this data on the web. The latter works by a system of style sheets which dynamically translate SGML data into HTML for web display. From the user's point of view, the data is simply HTML which can be viewed with a standard web browser. However, searches and word- or structure-based functions are passed back to the DynaText engine and performed on a preprocessed form of the SGML data, allowing for the exploitation of specialized markup. Thus, for instance, the user can limit a word search to verse drama, even though HTML has no ability to represent or flag particular genres. The advantages of this general solution for us were considerable: the user would not need any specialized software or skills, the purchasing institution would not need to install anything locally, and the value of our SGML encoding would not be lost by down-translation to HTML (as it would be in a static, one-time translation system). In addition, unlike systems such as SoftQuad's Panorama, which downloads an SGML text to the user's computer and allows specialized processing to occur locally, DynaWeb can search and selectively display information from the entire corpus. Panorama requires custom software to be installed locally and can only really handle one document at a time; both disadvantages ruled it out for the kinds of uses we wanted to encourage.
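The separation described above, in which searching is performed on the encoded source while display is a translation to HTML on request, can be illustrated with a small sketch. This is not DynaWeb's or DynaText's actual interface: the two-text corpus, the genre attribute, the STYLE mapping, and the function names are all hypothetical, and XML parsed with Python's standard library stands in for the SGML data.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical two-text corpus; XML stands in for the SGML source.
CORPUS = """
<corpus>
  <text genre="verse-drama" title="Play A">
    <l>The queen commands the tempest cease</l>
    <l>And all the court is still</l>
  </text>
  <text genre="prose" title="Treatise B">
    <p>The queen of sciences is midwifery</p>
  </text>
</corpus>
""".strip()

# Hypothetical "style sheet": source element name -> HTML tag.
STYLE = {"l": "div", "p": "p"}


def search(root, word, genre=None):
    """Search the encoded source, optionally restricted to one genre.

    Returns (title, element) pairs for child elements containing the word.
    """
    hits = []
    for text in root.iter("text"):
        if genre is not None and text.get("genre") != genre:
            continue  # the genre lives in the markup, not in any HTML
        for el in text:
            if word.lower() in (el.text or "").lower().split():
                hits.append((text.get("title"), el))
    return hits


def to_html(el):
    """Translate one source element to HTML at display time."""
    tag = STYLE.get(el.tag, "span")
    return f"<{tag}>{escape(el.text or '')}</{tag}>"


root = ET.fromstring(CORPUS)
all_hits = search(root, "queen")                         # whole corpus
drama_hits = search(root, "queen", genre="verse-drama")  # verse drama only
drama_html = [to_html(el) for _, el in drama_hits]
```

The browser only ever receives the strings in drama_html, while the genre restriction is applied to the source markup. A static, one-time translation to HTML would discard exactly the information that makes such a restriction possible.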

On top of this basic system, we created a custom interface which provides several important features:

Licensing system

Our licensing system--the pricing and terms of access--was developed using existing online licensing systems as its point of departure, but with some significant adjustments that we felt would be essential to establish an equitable relationship between the WWP and our subscribers, and to address the concerns most frequently voiced by libraries.

Several points were clear to us from the start. We wanted to maximize the use of our collection, encourage exploration and experimentation, and minimize the burden on librarians and technical staff. We also wanted a system which would be easy for us to update and expand as often as necessary. For all of these reasons, we felt that offering site licenses with unlimited access, published over the Web and authenticated by IP address rather than password, was the optimal approach. IP address authentication offers transparent access to all on-campus users, and increasingly universities are using proxy servers to take care of off-campus access, so that getting into the collection is as effortless as possible. Web rather than CD-ROM publication was an easy choice, eliminating the burden of installation and ownership on the libraries' end, and the equal burden of production, mailing, and updating on our end. Offering unlimited use for an annual fee, rather than a per-use charge or a specified number of simultaneous users, seemed important to us as a way of removing obstacles to experimentation; with unlimited access paid in advance, teachers and students can feel free to browse and explore without feeling that they are incurring rising fees--an economy of scarcity which we had no interest in fostering. In addition, charging a single annual fee enabled libraries to budget more easily and eliminated bureaucracy.
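As an illustration of how authentication by IP address can work in principle, the following minimal sketch checks a client address against each subscriber's registered ranges. It does not describe the WWP's actual implementation: the subscriber names and the function are hypothetical, and the address ranges are reserved documentation networks.

```python
import ipaddress

# Hypothetical subscriber table; the ranges are reserved documentation networks.
SUBSCRIBERS = {
    "Example University": ["192.0.2.0/24", "198.51.100.0/25"],
    "Sample College": ["203.0.113.0/24"],
}


def licensed_institution(client_ip, subscribers=SUBSCRIBERS):
    """Return the subscriber whose registered ranges contain client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    for name, ranges in subscribers.items():
        if any(addr in ipaddress.ip_network(r) for r in ranges):
            return name
    return None


on_campus = licensed_institution("192.0.2.17")     # inside a registered range
outside = licensed_institution("198.51.100.200")   # outside the /25 range
```

Under this scheme an off-campus user who connects through the institution's proxy server appears to come from a registered address, so no password is needed in either case.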

One point raised early on was that of the libraries' equity in the collection: would the fees paid entitle them ultimately to some kind of ownership, as with a journal subscription? Although our goal of frequent updates made it difficult to envision selling the data directly to institutions, we concluded that they should have some assurance of a permanent stake in the collection. Our solution is to include a clause in our license agreement which guarantees that, in the event that the WWP becomes unwilling or unable to continue publishing Women Writers Online, every subscriber will receive a copy of the source data to do with as they please (consistent with the general terms of the original license). At present, given the paucity of cheap SGML publishing software, such a provision probably does not provide the same level of protection as, say, a run of back issues of a journal. Within a few years, however, we expect XML publishing software to be available which will make the WWO source data a fully adequate substitute for the published collection, should that become unavailable.

The final point on which our license is distinctive is in our treatment of activities like downloading and printing files and reusing downloaded materials. In general we do not restrict such activities at all, provided they are part of the personal, non-commercial research or teaching of members of the institution. We encourage faculty to make use of our materials in course packets, course web sites, and so forth, and we feel that such reuse does not threaten, but rather reinforces, the importance of the WWO collection. From our viewpoint, the collection as a whole--including its search functionality and other features--has an enormous value in addition to the separate value of the individual texts it contains, and it is this larger value on which the WWP's future prospects really rest.

Textbase Development

Aside from developing the interface and licensing system, the most important project activity was of course the continuing expansion of the WWP textbase and the preparation of the texts for online publication. Since we had also received a substantial grant from the Mellon Foundation to increase our coverage of the Renaissance period, our concern was to balance this emphasis with increased coverage of the 18th century. In addition, we were eager to broaden the generic range of the collection, giving greater attention to cultural materials which could provide a fuller image of the political, scientific, and religious context for the more literary texts already in the collection. The result was a highly diverse set of texts which offer an astonishing cross-section of the literate culture of the period, including discussions of midwifery, the universities, religious dissent, marriage, and the theater. A list of texts added during this grant period is included as Appendix A, and a full list of texts published in Women Writers Online is included as Appendix B.
