Brown
University
Women Writers Project
Research and Encoding
Training Materials
Document Analysis
This document last updated Thursday, 15-Mar-2007 14:04:17 EDT
Document Analysis Form
Julia Flanders

This document is not for actual use by encoders. Rather, it serves as an illustration of the issues involved. It may also not be the most up-to-date version of the document; the most up-to-date version is probably still the Microsoft Word document.

What this is for
To be sure that we encode our texts consistently and address their complexities in a well-thought-out way, we do a basic preliminary analysis of each text before we start encoding it. This process helps identify difficult textual issues early and allows us to discuss and research them. It also helps the encoder conceptualize the structure of the text and the relationships between its parts, so as to tag it more accurately and consistently.

This form is a guide to help you think through the preliminary document analysis. When you begin, you will probably want to work through many of these questions in consultation with others; as you become more familiar with encoding issues you will be able to take more individual responsibility for developing solutions. At any point, if you come across an interesting encoding problem or issue, or something you'd like to get feedback or clarification on before your presentation, please post to WWPTAG-L.

Basic Information
Document Analysis

Basic Structure
Sketch the basic structure of the document (as a tree or a chart or
in whatever way makes sense) on the back of the last sheet. Include at
least the first three levels of division inside
Issues to think about: What structural components form the basic
divisions of the text? What are first-level
In the case of poetry, analyse the line groupings carefully and
decide which attributes of

Genre
Identify the genre or genres to which your document belongs (circle all that apply):

Physical bibliographic issues
Does the document have pages missing from the source text, illegibility (either in the photocopy or in the source text), or damage? Is it a complete book, xeroxed from start to finish, or is it part of some larger work whose structure may need to be taken into account? Assess the extent and cause of any damage or illegibility and the appropriate treatment.

Check the pagination and collation (the sequence of signatures, as indicated by the signature marks at the bottom of the pages) for accuracy or missing sections. If the text was published before 1750 (roughly; check with John), you will need to encode a complete collation, including the title page. It may help to do the collation on paper here first (check with John if you have any questions). If page numbers are missing or out of order, check to see whether the flow of the text is continuous (indicating an error in the page numbering) or discontinuous (indicating an error in the printing, binding, or xeroxing).

Title Page
Think about how to encode the various parts of the title page, particularly for Renaissance texts. Issues to consider: how is the title itself divided? What other information is there, and how should it be encoded? Consider the content even more than the typography as an indication of how to encode it.

Textual Features
Linking and cross-referencing: does the document contain footnotes, endnotes, side notes (i.e. marginal notes), errata lists, subscriber lists, table of contents, index, internal cross-references, or other referencing mechanisms? Can these be accommodated using ordinary WWP methods, or do they pose any special challenges?

Characters
Does the document contain unfamiliar characters? Are there any characters or abbreviations that will require expansion (e.g. macrons)? (You may need to consult Jacque Russom about how these should be expanded.)
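
The pagination check described under Physical bibliographic issues can be roughed out before working through the photocopy in detail. The following is only a minimal sketch in Python, not part of WWP practice: the function name `check_pagination` and the sample page numbers are hypothetical, and it assumes you have already typed in the printed page numbers in the order the leaves appear.

```python
def check_pagination(printed_pages):
    """Compare each printed page number with the one before it.

    Returns a list of (position, previous, current, kind) tuples, where
    kind is "gap" (numbers skipped), "repeat" (same number twice), or
    "out of order" (the number goes backwards). Position is the
    zero-based index of the second page in each pair.
    """
    problems = []
    for position in range(1, len(printed_pages)):
        previous = printed_pages[position - 1]
        current = printed_pages[position]
        if current == previous + 1:
            continue  # normal sequence, nothing to report
        if current == previous:
            kind = "repeat"
        elif current > previous:
            kind = "gap"
        else:
            kind = "out of order"
        problems.append((position, previous, current, kind))
    return problems


# Hypothetical page numbers as printed, in the order of the photocopy.
pages = [1, 2, 3, 5, 6, 4, 7, 7, 8]
for position, previous, current, kind in check_pagination(pages):
    print(f"leaf {position + 1}: {previous} -> {current} ({kind})")
```

A list like this only shows where to look. Deciding whether the text itself runs continuously across a flagged break (a numbering error) or not (an error in printing, binding, or xeroxing) still requires reading the pages on either side.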

Handwritten additions
If the document contains any handwriting, you will need to assess whose handwriting it is, if possible, and whether it needs to be encoded. If it does, then you will also need to assess whether it poses any structural problems: does it span across several elements? How will its position be indicated? Is it legible? Does it contain cross-outs or other additional complexities? Does it use any characters which will need special treatment or scholarly interpretation (for instance, letters which might be either capitals or lower case; contractions; marks which are not letters)?

Other Idiosyncrasies
Note any other features of the text that may need special treatment or further research.