Intensive Introduction to TEI

XML exercise

This brief exercise is intended to give you a quick initial exposure to the XML editor we’ll be using in the class (named “<oXygen/>” … hereafter “oXygen” in this document). If you’re already familiar with XML, or even if you’ve ever edited HTML files before, it will be very simple.

To download and install oXygen, go to the SyncRO Soft web site and download a free trial copy of the software. Follow the instructions to install it on your computer. (If you have access to the lab where we’ll be holding the class, you may also be able to practice there.)

A note on oXygen. This editor has a number of advantages: it’s fairly cheap, easy to learn, and powerful; it can help you edit not only XML files, but also XSLT stylesheets, schemas and DTDs, and other types of files. It comes bundled with the current version of TEI and also the previous major release. Its one real disadvantage is that it is slow and memory-hungry. If you type quickly, you may outrun it. If it behaves oddly—for instance, if you try inserting an element and it inserts the wrong one—the most likely explanation is that you are moving too quickly.

If you plan to purchase oXygen (which is not necessary), note that anyone at a TEI member institution, including all students, staff, and faculty, gets a 20% discount. Ask the TEI for the discount code if you would like to take advantage of it.

These instructions use vocabulary that is covered in the TEI’s Gentle Introduction to XML (e.g. “element”, “tag”, “attribute”, “valid”). If your XML is rusty, it will help to read at least section v.iii XML structures before you start this exercise. Again, if you find this unfamiliar you should attend the session on the 22nd. But to start you off, here is a quick basic glossary:

Now for the exercise:

  1. Launch oXygen. Be patient; it may take a little while. If your computer is low on RAM, try to keep other applications to a minimum.
  2. In the File menu, choose “New…”; in the resulting dialog box choose the “From templates” tab; and from the list choose “XHTML 1.0 Strict”. (For an extra challenge choose “TEI P5 Bare”.) You’ll be given a template which contains a skeleton XML document, using the XHTML (or TEI Guidelines) markup language. This skeleton is already valid (that is, it contains all the minimally essential elements required by the markup system you’ve chosen). You can check this by clicking on the little document icon with a red check mark in the top center of the window (mouseover says either “Reset Cache and Validate” or “Validate Document”).
  3. Take a look at the markup that’s already there. Some of the element names are fairly self-explanatory (you can readily understand what the <title> and <body> elements are for…). How much of the markup can you figure out without reading the documentation?
  4. Prove to yourself that this file is valid: click on the red check mark in the top center of the window. oXygen will think for a moment and then should give a “Document is valid” message and a green square icon in the bottom of the window.
  5. Now try adding an XML element to your skeleton. In the XHTML example, put your cursor in between the start-tag and end-tag for the <body> element (i.e. right <body>here</body>). (In the TEI example, there is a comment between the <body> start-tag and end-tag: delete it.) Type a left angle bracket (<) and see what happens. You should be given a menu listing all of the elements that are permitted at the location of your cursor. Some of these have obvious functions, others may be fairly obscure. Note: in XHTML it is often necessary to scroll to the bottom of the yellow pop-up box to find the description of the selected element.
  6. Use your arrow keys to move around in the list, and double-click on your choice or type RETURN to select it. oXygen will insert your chosen element for you.
  7. Validate your document again (click on the red check mark). Is it still valid?
  8. Type some text into the element you’ve just inserted. Now select a word (e.g. by double-clicking on it) and from the “Document” menu choose “XML Refactoring” (a bizarre term!). You’ll see a list of choices, ways to insert or alter the markup in your document. Choose “Surround with tag…” (and note the keystroke sequence—much handier to use). You’ll get the same menu as before. Select an element from the list and type RETURN to insert the markup. Validate your document again.
  9. Try inserting an attribute. Put your cursor inside the start-tag for any element, just before the closing > character. (That is, right <namehere>.) Now type a space between the element name and the closing >. oXygen should display a list of attribute names. Which ones you see will depend on the element you’re in; at a minimum, you should see “id” and “lang” (in TEI that will be “xml:id” and “xml:lang”). Again, choose the one you want and type RETURN to insert it. Now you have an attribute, but no attribute value. Type a value in between the quotation marks. Try validating again. (If the attribute you chose requires specific values, your document may fail validation. Can you figure out what’s wrong?)
  10. If your document is still valid, try creating an error. Choose any element and delete one of its tags (start-tag or end-tag). Now validate your document. Does the error message make sense? (Probably not… reading validation error messages is a black art. You can pick it up by deliberately producing errors and seeing what the error messages look like.)
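
For comparison, here is a rough sketch of what the XHTML file might look like after steps 5 through 9, together with a deliberately broken copy like the one step 10 produces. The file names sample.xhtml and broken.xhtml, the element and attribute choices, and the use of the xmllint command-line tool are all assumptions made for illustration; none of them is part of the exercise. Note that “xmllint --noout” only checks that the tags are properly paired (well-formedness); it does not check validity against the XHTML rules the way oXygen’s validate button does.

```shell
# Write a small XHTML file resembling the result of steps 5-9
# (a <p> element with "id" and "lang" attributes, and an <em> around one word).
cat > sample.xhtml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Practice document</title>
  </head>
  <body>
    <p id="first" lang="en">Some <em>text</em> typed into the new element.</p>
  </body>
</html>
EOF

# Step 10: make a copy with the </p> end-tag deleted.
sed 's#</p>##' sample.xhtml > broken.xhtml

# If xmllint happens to be installed, compare the two files.
if command -v xmllint >/dev/null 2>&1; then
  xmllint --noout sample.xhtml && echo "sample.xhtml: well-formed"
  xmllint --noout broken.xhtml 2>/dev/null || echo "broken.xhtml: not well-formed"
else
  echo "xmllint not found; open both files in oXygen instead"
fi
```

Opening broken.xhtml in oXygen and validating it is a quick way to see a typical error message for a missing end-tag.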

Java Stack Overflow Error

If you get a “stack overflow” error message, you may be able to fix this by giving Java a larger stack. Here are detailed instructions for doing so in Mac OS X. If you are using GNU/Linux or Windows you may wish to see SyncRO Soft’s documentation.

Mac OS X, for double-clicking the oXygen icon

  1. Find the oXygen application itself (not an alias or the dock version)
  2. Hold the control key down and click on it
  3. Select “Show Package Contents” from the pop-up menu
  4. In the new window that opens (titled “Oxygen”), open “Contents”
  5. optional: make a backup copy of Info.plist
  6. Edit Info.plist in your favorite editor. This may not be as easy as it sounds, as this is not a normally visible file and is generally not associated with an application. Note that it is an XML file, and you can edit it in oXygen, but because it does not end in “.xml” you cannot drag-and-drop it onto the oXygen icon. Personally I usually drag the icon onto the main oXygen window. You may need to select “XML document” from the list of possibilities oXygen presents to you on opening it.
  7. Find the string “-Xss”. It should look something like:
    <key>VMOptions</key>
    <string>-Xss650K -Xms32M -Xmx190M</string>
            
  8. Increase the value of the “-Xss” switch, e.g. to “-Xss2M”. You may wish to increase the others, too, especially if your machine has a lot of RAM. See below for a more detailed explanation and values.
  9. Save the Info.plist file
  10. If it is running, quit oXygen
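
If you would rather see the substitution from steps 7 and 8 spelled out, the following is a minimal sketch that applies it to a copy of the sample line shown above, not to the real Info.plist. The variable names are made up for illustration, and the pattern assumes the value is a number followed by K or M, as in the excerpt.

```shell
# Sample line copied from the Info.plist excerpt in step 7.
line='<string>-Xss650K -Xms32M -Xmx190M</string>'

# Replace whatever size follows -Xss (digits plus a K or M unit) with 2M,
# leaving the -Xms and -Xmx switches untouched.
updated=$(printf '%s\n' "$line" | sed -E 's/-Xss[0-9]+[KkMm]/-Xss2M/')

printf '%s\n' "$updated"
```

The same expression could be pointed at a backup copy of Info.plist, but editing by hand as described above is just as good for a one-off change.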

Mac OS X, for issuing /Applications/oxygen/oxygenMac.sh on the command line

  1. Edit the file /Applications/oxygen/oxygenMac.sh (you can use oXygen as the editor, if you like)
  2. Near the end, find the “-Xss” switch. It should look something like:
    java  "-Xdock:name=Oxygen"\
     -Dcom.oxygenxml.editor.plugins.dir="$OXYGEN_HOME/plugins"\
     -Xss650k\
     -Xmx256m\
  3. Increase the value of the “-Xss” switch, e.g. to “-Xss2M”. You may wish to increase the others, too, especially if your machine has a lot of RAM. See below for a more detailed explanation and values.
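
To check which memory switches a script currently sets, something along these lines may help. It is only a sketch: it runs against the excerpt quoted in step 2 rather than the real file, and the variable names are made up for illustration.

```shell
# The excerpt from step 2, held in a variable for illustration.
script_tail='java  "-Xdock:name=Oxygen"\
 -Dcom.oxygenxml.editor.plugins.dir="$OXYGEN_HOME/plugins"\
 -Xss650k\
 -Xmx256m\'

# List only the -Xss, -Xms and -Xmx switches, one per line.
found=$(printf '%s\n' "$script_tail" | grep -Eo -- '-X(ss|ms|mx)[0-9]+[KkMm]')

printf '%s\n' "$found"
```

Running the same grep against /Applications/oxygen/oxygenMac.sh before and after editing should show the old and new values.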

Explanation of the switches:

switch: -Xss
meaning: the stack size for each thread
comments: Each thread in the VM gets a stack. The stack size will limit the number of threads that you can have. If stack size is too large you will run out of memory as each thread is allocated more memory than it needs. If stack size is too small, eventually you will get a stack overflow error.
what I use: 2M or 4M

switch: -Xms
meaning: initial Java heap size
comments: Set to a multiple of 1M that is greater than 1M. Some suggest that, as a general rule, it should be set equal to the maximum heap size (-Xmx).
what I use: 16M

switch: -Xmx
meaning: maximum Java heap size
comments: If set very high, oXygen will only need to “clean out” its memory rarely, but each clean-out will take longer.
what I use: 512M

I don’t know, but my guess is that it is a bad idea to set any value to more memory than your machine has. To find out how much memory your machine has, select “About This Mac” from the Apple menu on the left end of your menu bar. I recommend, without any knowledge or basis, that maximum heap size (-Xmx) be well under 3/4 of your total real memory. (If anyone knows better, please let me know!)
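
The 3/4 figure can be worked out with a little arithmetic. The sketch below assumes you supply the total memory in bytes; the function name heap_cap_mb is made up for illustration, and the result is only the upper bound of the rule of thumb above (which is itself a guess), not a recommended setting.

```shell
# Print three quarters of the given memory size, in megabytes.
# $1 is the total memory in bytes.
heap_cap_mb() {
  echo $(( $1 / 1024 / 1024 * 3 / 4 ))
}

# Example: a machine with 4 GB (4294967296 bytes) of memory.
heap_cap_mb 4294967296

# On Mac OS X the byte count can be read with sysctl, for instance:
#   heap_cap_mb "$(sysctl -n hw.memsize)"
```

For the 4 GB example the cap comes to 3072 MB, so a value such as -Xmx512M is comfortably below it.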

Some info from http://www.caucho.com/resin-3.0/performance/jvm-tuning.xtp and some from http://edocs.beasys.com/wls/docs81/perform/JVMTuning.html both on 2008-03-19.