Brown University    Women Writers Project    Research and Encoding    Training Materials    Clean With Spam, ediff, macros

This document last updated Tuesday, 03-Apr-2012 12:03:50 EDT.

Clean up your encoding with Spam, ediff, macros

Syd's original directions are more detailed. If this is your first visit to this web page, please read Syd's original directions first, and use this web page only for quick reminders.

This document covers the following commands and macros: wwp-spam-and-ediff, wwp-fix-case-of-GIs, wwp-next-missing-TAGC, wwp-next-unquoted-attr.

This document does NOT cover the following commands (however, they ARE covered in Syd's original directions): wwp-sort-revisionDesc, find-tag.

Reminder: there are four things you can do to make sure your encoding is correct: (1) use PSGML properly; (2) validate frequently using nsgmls; (3) run supra-validation; (4) use Spam, ediff, and macros to clean up the last few errors.

Commands: wwp-spam-and-ediff; wwp-fix-case-of-GIs

NOTE: These commands have the potential to be dangerous in the same way a global search-and-replace can be dangerous: you may make changes to your file that you're not aware of. They execute major changes on your file. It's a good idea to SAVE the file, check it in, then check it back out of RCS before using any of these.

1. How to use "Show missing markup", aka wwp-spam-and-ediff

Pick Show missing markup from the WWP menu or do M-x wwp-spam-and-ediff from the keyboard. After the command completes, you will see your original document in buffer A (top buffer) and the "spammed" document in buffer B (bottom buffer).

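While looking at a difference, you can change one buffer to match the other. When in ediff, the primary commands you'll be using are the standard ediff keys: n (go to the next difference), p (go to the previous difference), a (copy the current difference region from buffer A into buffer B), b (copy it from buffer B into buffer A), and q (quit ediff).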

Reminder: this command runs spam on the current buffer ("Spam" is short for "SGML Parser add markup", and is freeware courtesy of James Clark) and then runs the ediff command to compare the output of spam (with its added markup) with the original buffer.
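To make the comparison step concrete, here is a minimal Python sketch of the same idea: take the original text and a version with markup added, and list the differences. It does not run spam itself. The function name show_missing_markup and the sample "spammed" string are made up for illustration and are not real spam output.

```python
import difflib

def show_missing_markup(original: str, spammed: str) -> list[str]:
    """Return unified-diff lines between the original and the text with added markup."""
    return list(difflib.unified_diff(
        original.splitlines(),
        spammed.splitlines(),
        fromfile="A (original)",
        tofile="B (spammed)",
        lineterm="",
    ))

# Hypothetical input: end-tags were omitted in the original buffer.
original = "<item>first\n<item>second"
# Hypothetical result of adding the omitted markup (not actual spam output).
spammed = "<item>first</item>\n<item>second</item>"

diff_lines = show_missing_markup(original, spammed)
for line in diff_lines:
    print(line)
```

Lines starting with "-" come from the original and lines starting with "+" come from the version with added markup, which is roughly what ediff highlights in buffers A and B.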

2. How to use "Fix case of GIs" aka wwp-fix-case-of-GIs

Pick Fix case of GIs from the WWP menu or do M-x wwp-fix-case-of-GIs from the keyboard.

Reminder: this fixes the case of all GIs, e.g.:

<castlist>
<castitem><actor><PERSNAME>Madonna</PERSNAME></actor></castitem>
</castlist>

becomes

<castList>
<castItem><actor><persName>Madonna</persName></actor></castItem>
</castList>

Reminder: this is a potentially dangerous command. Before using it, check for missing ">"s with a regexp search or the wwp-next-missing-TAGC macro (discussed below); then save, check in, and check back out (as with all major global change commands). After running it, scan through the file and quickly check that things look all right. You can compare the current file to the most recently checked-in version with the vc-diff command (C-x v =).
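As a rough illustration of what a case fix involves, the following Python sketch rewrites tag names to a canonical spelling. It is not the WWP command. The list CANONICAL_GIS and the function fix_case_of_gis are hypothetical, and how the real command obtains the correct spellings is not described here.

```python
import re

# Hypothetical list of correctly cased GIs, for illustration only.
CANONICAL_GIS = ["castItem", "actor", "persName"]
_BY_LOWER = {gi.lower(): gi for gi in CANONICAL_GIS}

# "<" or "</" followed by an element name.
_TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9.-]*)")

def fix_case_of_gis(text: str) -> str:
    """Replace each known GI with its canonical spelling; leave unknown names alone."""
    def repl(match: re.Match) -> str:
        name = match.group(2)
        return match.group(1) + _BY_LOWER.get(name.lower(), name)
    return _TAG_NAME.sub(repl, text)

fixed = fix_case_of_gis(
    "<castitem><actor><PERSNAME>Madonna</PERSNAME></actor></castitem>"
)
print(fixed)
```

Because the sketch changes every tag name it recognizes in one pass, it behaves like a global search-and-replace, which is why saving and checking in beforehand is worthwhile.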

Macros for Finding Potential Errors

Note: these two macros are temporarily broken in XEmacs, but DO work in Emacs. Syd has already been notified that they are broken.

3. How to use "Next missing TAGC", aka wwp-next-missing-TAGC

Pick Next missing TAGC from the WWP menu or do M-x wwp-next-missing-TAGC from the keyboard. If you have any errors, you will be scrolled to the first error, e.g.: <titlepart type="main". Fix this so it is correct (<titlepart type="main">). Hit C-x e to go to the next error of this kind.
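The kind of regexp search that finds a missing ">" can be sketched in Python as follows. This is an assumption about the general approach, not the macro's actual definition: it looks for a "<" that runs into another "<" (or the end of the text) without a ">" in between. The name next_missing_tagc is hypothetical, and the sketch does not treat comments or marked sections specially.

```python
import re

# A "<" followed by text containing no ">" before the next "<" or end of text.
_MISSING_TAGC = re.compile(r"<[^<>]*(?=<|\Z)")

def next_missing_tagc(text: str, start: int = 0):
    """Return (line number, first line of the suspect tag), or None if nothing is found."""
    match = _MISSING_TAGC.search(text, start)
    if match is None:
        return None
    line_number = text.count("\n", 0, match.start()) + 1
    return line_number, match.group(0).splitlines()[0]

broken = '<titleblock>\n<titlepart type="main"A Poem</titlepart>\n</titleblock>'
repaired = '<titleblock>\n<titlepart type="main">A Poem</titlepart>\n</titleblock>'

hit = next_missing_tagc(broken)
print(hit)
print(next_missing_tagc(repaired))
```

Passing a later start offset plays the role of repeating the search to reach the next error of this kind.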

4. How to use "Next unquoted attribute", aka wwp-next-unquoted-attr

Pick Next unquoted attribute from the WWP menu or do M-x wwp-next-unquoted-attr from the keyboard. If you have any errors, you will be scrolled to the first error, e.g.: type=main. Fix this so it is correct (type="main"). Hit C-x e to go to the next error of this kind.
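A comparable search for unquoted attribute values might look like the Python sketch below. It is only an approximation of what such a check does, under the assumption that each tag is closed with ">". The name unquoted_attrs is hypothetical and is not part of the WWP macros.

```python
import re

# A start-tag that is properly closed with ">".
_TAG = re.compile(r"<[A-Za-z][^<>]*>")
# name = value, where value is double-quoted, single-quoted, or bare.
_ATTR = re.compile(r"""([A-Za-z][\w.-]*)\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+)""")

def unquoted_attrs(text: str) -> list[tuple[int, str]]:
    """Return (line number, "name=value") for each attribute value lacking quotes."""
    found = []
    for tag in _TAG.finditer(text):
        for attr in _ATTR.finditer(tag.group(0)):
            name, value = attr.groups()
            if value[0] not in "\"'":
                line_number = text.count("\n", 0, tag.start()) + 1
                found.append((line_number, f"{name}={value}"))
    return found

sample = '<titlepart type=main>A Poem</titlepart>\n<div type="poem" n=2>'
problems = unquoted_attrs(sample)
print(problems)
```

Quoted values are consumed whole by the pattern, so an "=" that appears inside a quoted value is not reported.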

NOTE: These macros may NOT find any errors -- if you have been using PSGML key bindings correctly to enter elements and attributes, you will probably not have any of these errors. ALSO note: C-x e will not work properly in this context if you execute other macros in between executing the original macro and typing C-x e.
